Simplifying Administrative Texts for Italian L2 Readers with Controllable Transformers Models: A Data-driven Approach


This paper presents a data-driven study of automatic text simplification for in-domain texts and specific target readers, in which the simplification is “controlled” through data collected from behavioral analysis. We used these data to create Admin-It-L2, a parallel corpus of original-simplified sentence pairs in Italian administrative language, with simplifications aimed at Italian L2 speakers. We then used this corpus to test Transformer-based controllable models for text simplification. Although we obtained a high SARI score of 39.24, we show that this score alone is not fully reliable for evaluating text simplification.

CLiC-it 2023
Fernando Alva-Manchego

My research interests include text simplification, readability assessment, evaluation of natural language generation, and writing assistance.