Text Simplification consists of rewriting sentences to make them easier to read and understand while preserving their original meaning as much as possible. Human editors simplify by performing several text transformations, such as replacing complex terms with simpler synonyms, reordering words or phrases, removing non-essential information, and splitting long sentences. Since this is how manual simplifications are produced, we should expect automatic simplifications to be produced in a similar fashion. Data-driven models for the task are trained on datasets whose instances (in theory) exhibit a variety of rewriting operations. However, it is unclear whether this implicit learning of multi-operation simplifications results in automatic outputs with the same characteristics, since current automatic evaluation resources (i.e. metrics and test sets) focus on single-operation simplifications. In this talk, I will discuss how this limitation hinders research in Text Simplification, and describe some of our latest work on overcoming it. In particular, I will present (1) ASSET, a new dataset for tuning and testing with multi-operation reference simplifications; and (2) the first meta-evaluation of automatic metrics for Sentence Simplification focused on simplicity judgements.
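
To make the contrast between single-operation and multi-operation references concrete, here is a small illustrative sketch in Python. The sentences and operation labels are invented for this example and are not drawn from ASSET or any other dataset; they only show the kinds of rewrites the abstract mentions.

```python
# Illustrative (made-up) example of single- vs multi-operation
# reference simplifications for one original sentence.

original = (
    "The committee, which convened on Tuesday, elected to postpone "
    "the ratification of the amendment."
)

# Single-operation reference: only lexical substitution
# ("elected to postpone" -> "decided to delay",
#  "ratification" -> "approval").
single_op_reference = (
    "The committee, which convened on Tuesday, decided to delay "
    "the approval of the amendment."
)

# Multi-operation references: each combines several rewrites, e.g.
# lexical substitution, deletion of non-essential information
# ("which convened on Tuesday"), and sentence splitting.
multi_op_references = [
    "The committee met on Tuesday. It decided to delay approving the amendment.",
    "The committee decided to delay the amendment. It will be approved later.",
]

def looks_split(reference: str) -> bool:
    """Rough heuristic: a reference that contains more than one
    sentence-final period probably involves sentence splitting."""
    return reference.count(".") > 1

if __name__ == "__main__":
    print("Original:        ", original)
    print("Single-operation:", single_op_reference, "| split:", looks_split(single_op_reference))
    for ref in multi_op_references:
        print("Multi-operation: ", ref, "| split:", looks_split(ref))
```

A test set whose references look like the single-operation example would arguably reward systems that only substitute words, which is one way the evaluation gap described above can hinder progress on multi-operation simplification.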