I am a Lecturer (~Assistant Professor) at the School of Computer Science and Informatics at Cardiff University. My research focuses on technologies that apply Artificial Intelligence to education and information accessibility. In particular, my work employs Natural Language Processing approaches to facilitate reading and understanding. I am especially interested in studying the real capabilities of systems for several Natural Language Generation tasks, such as Text Simplification, Summarisation and Machine Translation. To that end, my collaborators and I create language resources, design evaluation methodologies and metrics, and implement models using machine learning techniques.
Previously, I was a Research Associate at SheffieldNLP (2020-2021), working with Prof. Lucia Specia on the APE-QUEST and Bergamot projects on Quality Estimation for Machine Translation. Before that, I worked as an Adjunct Professor at the Pontifical Catholic University of Peru (2013-2016), where I was a member of the Artificial Intelligence Group IA-PUCP. During my Master's, I was also a member of the Interinstitutional Center for Computational Linguistics at the University of São Paulo.
Looking for PhD Students! I am interested in supervising self-funded PhD students in projects involving Natural Language Processing for Text Adaptation. Please check the relevant page on FindAPhD for more information. Do not hesitate to contact me if you have any questions!
PhD in Computer Science, 2021
University of Sheffield
MSc in Computer Science, 2013
University of São Paulo
BSc in Informatics Engineering, 2009
Pontifical Catholic University of Peru
We benchmark 44 state-of-the-art large language models on the task of text simplification across datasets from three domains (Wiki, news and medical) under a few-shot setting. We analyse their performance using automatic metrics, a quantitative investigation of the edit operations performed, and a manual qualitative analysis.
We introduce Admin-It-L2, a parallel corpus of original-simplified sentences in the Italian administrative language, in which simplifications are aimed at Italian L2 speakers. This dataset is used to assess Transformer-based controllable models for sentence simplification.
We compare the performance of genre-specific models trained via transfer learning with that of zero-shot GPT-like large language models on the task of sentence simplification, using datasets from Wikipedia and PubMed.