I am a Lecturer (~Assistant Professor) at the School of Computer Science and Informatics at Cardiff University. My research focuses on technologies that apply Artificial Intelligence for education and information accessibility. In particular, my work employs Natural Language Processing approaches to facilitate reading and understanding. I am especially interested in studying the real capabilities of systems for several Natural Language Generation tasks, such as Text Simplification, Summarisation and Machine Translation. In order to do that, my collaborators and I create language resources, design evaluation methodologies or metrics, and implement models using machine learning techniques.
Previously, I was a Research Associate at SheffieldNLP (2020-2021), working with Prof. Lucia Specia on the APE-QUEST and Bergamot projects on Quality Estimation for Machine Translation. Before that, I worked as an Adjunct Professor at the Pontifical Catholic University of Peru (2013-2016), where I was a member of the Artificial Intelligence Group IA-PUCP. During my Master's, I was also a member of the Interinstitutional Center for Computational Linguistics at the University of São Paulo.
Looking for PhD Students! I am interested in supervising self-funded PhD students in projects involving Natural Language Processing for Text Adaptation. Please check the relevant page on FindAPhD for more information. Do not hesitate to contact me if you have any questions!
PhD in Computer Science, 2021
University of Sheffield
MSc in Computer Science, 2013
University of São Paulo
BSc in Informatics Engineering, 2009
Pontifical Catholic University of Peru
We introduce QG-Bench, a multilingual and multidomain benchmark for question generation, which we use to evaluate the performance of robust baselines based on fine-tuning generative language models, as well as to assess the reliability of automatic metrics commonly used for the task.
We propose an approach for obtaining representations of courses in a curriculum based on a novel course-guided attention mechanism and metric learning, and test it on a new dataset with curricula of computing programs from the USA and LATAM.
We compile a new benchmark for automated readability assessment of texts in Spanish, and fine-tune pre-trained language models to perform the task at both sentence and paragraph levels.
We introduce Admin-It, a new dataset for sentence-level readability assessment of Italian administrative texts, and evaluate the performance of Neural Pairwise Ranking models on this new data.
We present a framework for creating a multi-modal Peruvian sign language interpretation dataset based on videos.
Department of Computer Science
Department of Engineering - Section of Informatics Engineering