Simplifying Administrative Texts for Italian L2 Readers with Controllable Transformers Models: A Data-driven Approach

Nov 30, 2023
Martina Miliani, Fernando Alva-Manchego, Alessandro Lenci
Abstract
This paper presents a data-driven study focused on the automatic simplification of in-domain texts for specific target readers, which is “controlled” through data collected from behavioral analysis. We used these data to create Admin-It-L2, a parallel corpus of original-simplified sentences in the Italian administrative language, in which simplifications are aimed at Italian L2 speakers. Then, we used this corpus to test controllable models for text simplification based on Transformers. Although we obtained a high SARI score of 39.24, we show that this datum alone is not fully reliable in evaluating text simplification.
Publication
CLiC-it 2023
Authors
Fernando Alva-Manchego
Lecturer
My research interests include text simplification, readability assessment, evaluation of natural language generation, and writing assistance.