Fernando Alva-Manchego
BLESS: Benchmarking Large Language Models on Sentence Simplification
We benchmark 44 state-of-the-art large language models on the task of text simplification across datasets from three domains (Wiki, news and medical) under a few-shot setting. We analyse their performance using automatic metrics, a quantitative investigation of the edit operations performed, and a manual qualitative analysis.
Tannon Kew, Alison Chi, Laura Vásquez-Rodríguez, Sweta Agrawal, Dennis Aumiller, Fernando Alva-Manchego, Matthew Shardlow
PDF
Cite
Code
ACL Anthology
Simplifying Administrative Texts for Italian L2 Readers with Controllable Transformers Models: A Data-driven Approach
We introduce Admin-It-L2, a parallel corpus of original-simplified sentences in the Italian administrative language, in which simplifications are aimed at Italian L2 speakers. This dataset is used to assess Transformer-based controllable models for sentence simplification.
Martina Miliani, Fernando Alva-Manchego, Alessandro Lenci
PDF
Cite
Dataset
Comparing Generic and Expert Models for Genre-Specific Text Simplification
We compare the performance of genre-specific models trained via transfer learning with that of zero-shot GPT-like large language models on the task of sentence simplification, using datasets from Wikipedia and PubMed.
Zihao LI, Matthew Shardlow, Fernando Alva-Manchego
PDF
Cite
ACL Anthology
A Practical Toolkit for Multilingual Question and Answer Generation
We introduce AutoQG, an online service for multilingual question and answer generation (QAG), along with lmqg, an all-in-one Python package for model fine-tuning, generation, and evaluation. We also release QAG models in eight languages, fine-tuned on a few variants of pre-trained encoder-decoder language models, which can be used online via AutoQG or locally via lmqg.
Asahi Ushio, Fernando Alva-Manchego, Jose Camacho-Collados
PDF
Cite
Code
DOI
Demo
ACL Anthology
An Empirical Comparison of LM-based Question and Answer Generation Methods
We establish baselines with three different question and answer generation methodologies (pipeline, multitask, end-to-end) that leverage sequence-to-sequence language model fine-tuning.
Asahi Ushio, Fernando Alva-Manchego, Jose Camacho-Collados
PDF
Cite
Code
DOI
ACL Anthology
Generative Language Models for Paragraph-Level Question Generation
We introduce QG-Bench, a multilingual and multidomain benchmark for question generation, which we use to evaluate the performance of robust baselines based on fine-tuning generative language models, as well as to assess the reliability of automatic metrics commonly used for the task.
Asahi Ushio, Fernando Alva-Manchego, Jose Camacho-Collados
PDF
Cite
Code
DOI
Demo
ACL Anthology
Improving Embeddings Representations for Comparing Higher Education Curricula: A Use Case in Computing
We propose an approach for obtaining representations of courses in a curriculum based on a novel course-guided attention mechanism and metric learning, and test it on a new dataset with curricula of computing programs from the USA and LATAM.
Jeffri Murrugarra-Llerena, Fernando Alva-Manchego, Nils Murrugarra-Llerena
PDF
Cite
Code
DOI
ACL Anthology
A Benchmark for Neural Readability Assessment of Texts in Spanish
We compile a new benchmark for automated readability assessment of texts in Spanish, and fine-tune pre-trained language models to perform the task at both sentence and paragraph levels.
Laura Vásquez-Rodríguez, Pedro-Manuel Cuenca-Jiménez, Sergio Esteban Morales-Esquivel, Fernando Alva-Manchego
PDF
Cite
Code
ACL Anthology
Neural Readability Pairwise Ranking for Sentences in Italian Administrative Language
We introduce Admin-It, a new dataset for sentence-level readability assessment of Italian administrative texts, and evaluate the performance of Neural Pairwise Ranking models on this new data.
Martina Miliani, Serena Auriemma, Fernando Alva-Manchego, Alessandro Lenci
PDF
Cite
Dataset
ACL Anthology
PeruSIL: A Framework to Build a Continuous Peruvian Sign Language Interpretation Dataset
We present a framework for creating a multi-modal Peruvian sign language interpretation dataset based on videos.
Gissella Bejarano, Joe Huamani-Malca, Francisco Cerna-Herrera, Fernando Alva-Manchego, Pablo Rivas
PDF
Cite
Code
Dataset
ACL Anthology