BLESS: Benchmarking Large Language Models on Sentence Simplification

We benchmark 44 state-of-the-art large language models on the task of text simplification across datasets from three domains (Wiki, news and medical) under a few-shot setting. We analyse their performance using automatic metrics, a quantitative investigation of the edit operations performed, and a manual qualitative analysis.

Tannon Kew, Alison Chi, Laura Vásquez-Rodríguez, Sweta Agrawal, Dennis Aumiller, Fernando Alva-Manchego, Matthew Shardlow

Simplifying Administrative Texts for Italian L2 Readers with Controllable Transformers Models: A Data-driven Approach

We introduce Admin-It-L2, a parallel corpus of original-simplified sentences in the Italian administrative language, in which simplifications are aimed at Italian L2 speakers. This dataset is used to assess Transformer-based controllable models for sentence simplification.

Martina Miliani, Fernando Alva-Manchego, Alessandro Lenci

Comparing Generic and Expert Models for Genre-Specific Text Simplification

We compare the performance of genre-specific models trained via transfer learning with that of zero-shot GPT-like large language models on the task of sentence simplification, using datasets from Wikipedia and PubMed.

Zihao LI, Matthew Shardlow, Fernando Alva-Manchego

A Practical Toolkit for Multilingual Question and Answer Generation

We introduce AutoQG, an online service for multilingual question and answer generation (QAG), along with lmqg, an all-in-one Python package for model fine-tuning, generation, and evaluation. We also release QAG models in eight languages, fine-tuned on a few variants of pre-trained encoder-decoder language models, which can be used online via AutoQG or locally via lmqg.

Asahi Ushio, Fernando Alva-Manchego, Jose Camacho-Collados

An Empirical Comparison of LM-based Question and Answer Generation Methods

We establish baselines with three different question and answer generation methodologies (pipeline, multitask, end-to-end) that leverage sequence-to-sequence language model fine-tuning.

Asahi Ushio, Fernando Alva-Manchego, Jose Camacho-Collados

Generative Language Models for Paragraph-Level Question Generation

We introduce QG-Bench, a multilingual and multidomain benchmark for question generation, which we use to evaluate the performance of robust baselines based on fine-tuning generative language models, as well as to assess the reliability of automatic metrics commonly used for the task.

Asahi Ushio, Fernando Alva-Manchego, Jose Camacho-Collados

Improving Embeddings Representations for Comparing Higher Education Curricula: A Use Case in Computing

We propose an approach for obtaining representations of courses in a curriculum based on a novel course-guided attention mechanism and metric learning, and test it in a new dataset with curricula of computing programs from the USA and LATAM.

Jeffri Murrugarra-Llerena, Fernando Alva-Manchego, Nils Murrugarra-Llerena

A Benchmark for Neural Readability Assessment of Texts in Spanish

We compile a new benchmark for automated readability assessment of texts in Spanish, and fine-tune pre-trained language models to perform the task at both sentence and paragraph levels.

Laura Vásquez-Rodríguez, Pedro-Manuel Cuenca-Jiménez, Sergio Esteban Morales-Esquivel, Fernando Alva-Manchego

Neural Readability Pairwise Ranking for Sentences in Italian Administrative Language

We introduce Admin-It, a new dataset for sentence-level readability assessment of Italian administrative texts, and evaluate the performance of Neural Pairwise Ranking models on this new dataset.

Martina Miliani, Serena Auriemma, Fernando Alva-Manchego, Alessandro Lenci

PeruSIL: A Framework to Build a Continuous Peruvian Sign Language Interpretation Dataset

We present a framework for creating a multi-modal Peruvian sign language interpretation dataset based on videos.

Gissella Bejarano, Joe Huamani-Malca, Francisco Cerna-Herrera, Fernando Alva-Manchego, Pablo Rivas