ASSET: A Dataset for Tuning and Evaluation of Sentence Simplification Models with Multiple Rewriting Transformations

In order to simplify a sentence, human editors perform multiple rewriting transformations: they split it into several shorter sentences, paraphrase words (i.e. replacing complex words or phrases by simpler synonyms), reorder components, and/or delete …

EASSE: Easier Automatic Sentence Simplification Evaluation

We introduce EASSE, a Python package aiming to facilitate and standardise automatic evaluation and comparison of Sentence Simplification (SS) systems. EASSE provides a single access point to a broad range of evaluation resources: standard automatic …

Cross-Sentence Transformations in Text Simplification

Current approaches to Text Simplification focus on simplifying sentences individually. However, certain simplification transformations span beyond single sentences (e.g. joining and re-ordering sentences). In this paper, we motivate the need for …

Strong Baselines for Complex Word Identification across Multiple Languages

Complex Word Identification (CWI) is the task of identifying which words or phrases in a sentence are difficult to understand by a target audience. The latest CWI Shared Task released data for two settings: monolingual (i.e. train and test in the …

Learning How to Simplify From Explicit Labeling of Complex-Simplified Text Pairs

Current research in text simplification has been hampered by two central problems: (i) the small amount of high-quality parallel simplification data available, and (ii) the lack of explicit annotations of simplification operations, such as deletions …

MASSAlign: Alignment and Annotation of Comparable Documents

We introduce MASSAlign: a Python library for the alignment and annotation of monolingual comparable documents. MASSAlign offers easy-to-use access to state-of-the-art algorithms for paragraph and sentence-level alignment, as well as novel algorithms …

Coh-Metrix-Esp: A Complexity Analysis Tool for Documents Written in Spanish

Text Complexity Analysis is a useful task in Education. For example, it can help teachers select appropriate texts for their students according to their educational level. This task requires the analysis of several text features that people do …

Semantic Role Labeling for Brazilian Portuguese: A Benchmark

One of the main research challenges in Semantic Role Labeling (SRL) is the development of systems for languages other than English. For Brazilian Portuguese, a corpus with appropriate manually-annotated data, PropBank.Br, has recently become …

Towards Semi-supervised Brazilian Portuguese Semantic Role Labeling: Building a Benchmark

One of the main research challenges in semantic role labeling (SRL) is the development of applications for languages other than English. For Brazilian Portuguese, recent projects in lexical semantics are about to provide the necessary computational …

Aplicando Minería de Textos en el Análisis de Mallas Curriculares de Carreras de Computación en el Perú

Comparing curricula makes it possible to identify and evaluate the quality of degree programs at a university; however, this comparison is almost always carried out manually. In this work, we propose a methodology for mining …