Simplifying Administrative Texts for Italian L2 Readers with Controllable Transformers Models: A Data-driven Approach

Nov 30, 2023

Martina Miliani

Fernando Alva-Manchego

Alessandro Lenci

Abstract

This paper presents a data-driven study on the automatic simplification of in-domain texts for specific target readers, in which the simplification is “controlled” through data collected from behavioral analysis. We used these data to create Admin-It-L2, a parallel corpus of original-simplified sentence pairs in Italian administrative language, in which the simplifications are aimed at Italian L2 speakers. We then used this corpus to test controllable Transformer-based models for text simplification. Although we obtained a high SARI score of 39.24, we show that this score alone is not fully reliable for evaluating text simplification.

Type

Conference paper

Publication

CLiC-it 2023