Challenges in Automatic Evaluation of Text Simplification

Abstract

Text Simplification consists of rewriting sentences to make them easier to read and understand, while preserving as much of their original meaning as possible. Human editors simplify by performing several text transformations, such as replacing complex terms with simpler synonyms, reordering words or phrases, removing non-essential information, and splitting long sentences. Since this is how manual simplifications are produced, we should expect automatic simplifications to be produced in a similar fashion. Data-driven models for the task are trained on datasets whose instances (in theory) exhibit a variety of rewriting operations. However, it is unclear whether this implicit learning of multi-operation simplifications results in automatic outputs with such characteristics, since current automatic evaluation resources (i.e. metrics and test sets) focus on single-operation simplifications. In this talk, I will discuss how this limitation hinders research in Text Simplification, as well as some of our latest work on overcoming it. In particular, I will present (1) ASSET, a new dataset for tuning and testing with multi-operation reference simplifications; and (2) the first meta-evaluation of automatic metrics for Sentence Simplification focused on simplicity judgements.
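To give a concrete sense of what multi-reference evaluation looks like in practice, the minimal sketch below scores a single system output against several manual references with SARI, using the EASSE evaluation toolkit (https://github.com/feralvam/easse). The sentences are illustrative examples, not actual ASSET data, and the variable names are my own.

    # Minimal sketch: multi-reference SARI scoring with EASSE (pip install easse).
    # The sentences below are illustrative, not taken from ASSET.
    from easse.sari import corpus_sari

    orig_sents = ["About 95 species are currently accepted."]
    sys_sents = ["About 95 species are accepted."]

    # ASSET provides several manual references per original sentence,
    # each potentially produced with different rewriting operations.
    refs_sents = [
        ["About 95 species are currently known."],
        ["About 95 species are now accepted."],
        ["95 species are now accepted."],
    ]

    score = corpus_sari(orig_sents=orig_sents, sys_sents=sys_sents, refs_sents=refs_sents)
    print(f"SARI: {score:.2f}")

Having multiple references per sentence matters here: reference-based metrics such as SARI reward a system output only for edits that some reference also makes, so a test set whose references cover a variety of rewriting operations gives a fairer picture of multi-operation simplification systems.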

Date
Nov 25, 2020 14:00
Location
NLP with Friends (online)
Fernando Alva-Manchego
Lecturer

My research interests include text simplification, readability assessment, evaluation of natural language generation, and writing assistance.