Welsh Language Technology & Multilingual NLP
January 1, 2023
I develop NLP resources and tools for Welsh and other low-resource languages, with a focus on practical utility for speakers, learners, and educators. This work addresses both foundational linguistic challenges and applied downstream tasks, grounded in close collaboration with language communities and validated with real users.
- 🏴 Welsh text complexity analysis and profiling (e.g. Proffiliadur)
- 📊 CEFR language proficiency assessment for Welsh (e.g. CEFR-Cymraeg) and multilingual benchmarks (e.g. UniversalCEFR, ComplexityMT)
- 🔡 Welsh morphology: initial consonant mutation trigger labelling
- 🌍 Multilingual lexical simplification pipelines (e.g. MultiLS)
- 📖 Readability datasets and models for Spanish, Italian, and more
Funded project: NLP Tools for Welsh Language Assessment and Learning
