Comparing Generic and Expert Models for Genre-Specific Text Simplification

Abstract

We investigate how text genre influences the performance of models for controlled text simplification. Treating datasets from Wikipedia and PubMed as two different genres, we compare genre-specific models trained by transfer learning with prompt-only GPT-like large language models. Our experiments show that: (1) the performance loss of genre-specific models on general tasks can be limited to 2%; (2) transfer learning can improve performance on genre-specific datasets by up to 10% in SARI score over the base model without transfer learning; and (3) simplifications generated by the smaller but more customized models are comparable in simplicity to those of the larger generic models and preserve meaning better, in both automatic and human evaluations.

Publication
TSAR 2023
Fernando Alva-Manchego
Lecturer

My research interests include text simplification, readability assessment, evaluation of natural language generation, and writing assistance.