<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Low-Resource Languages | Fernando Alva-Manchego</title><link>https://feralvam.github.io/tags/low-resource-languages/</link><atom:link href="https://feralvam.github.io/tags/low-resource-languages/index.xml" rel="self" type="application/rss+xml"/><description>Low-Resource Languages</description><generator>HugoBlox Kit (https://hugoblox.com)</generator><language>en-us</language><lastBuildDate>Sun, 01 Jan 2023 00:00:00 +0000</lastBuildDate><image><url>https://feralvam.github.io/media/icon_hu_ab250a83af8ff43c.png</url><title>Low-Resource Languages</title><link>https://feralvam.github.io/tags/low-resource-languages/</link></image><item><title>NLP Tools for Welsh Language Assessment and Learning</title><link>https://feralvam.github.io/projects/welsh-gov-nlp/</link><pubDate>Sun, 01 Jan 2023 00:00:00 +0000</pubDate><guid>https://feralvam.github.io/projects/welsh-gov-nlp/</guid><description>&lt;p&gt;Welsh Government-funded project developing computational tools for Welsh text complexity analysis, CEFR proficiency assessment, and morphological analysis to support Welsh-language education.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Funder:&lt;/strong&gt; Welsh Government&lt;br&gt;
&lt;strong&gt;Period:&lt;/strong&gt; 2025 – 2026&lt;br&gt;
&lt;strong&gt;Role:&lt;/strong&gt; Principal Investigator&lt;br&gt;
&lt;strong&gt;Research theme:&lt;/strong&gt;
&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;Welsh is spoken by approximately 900,000 people and has distinctive linguistic features, such as initial consonant mutation, that pose significant challenges for standard NLP pipelines. This project develops foundational NLP infrastructure for Welsh language assessment and learning, in partnership with Welsh-language educational institutions and the Welsh Government.&lt;/p&gt;
&lt;p&gt;Key outputs include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Proffiliadur&lt;/strong&gt;: an open-source toolkit that computes 141 linguistic complexity indices for Welsh texts, supporting CEFR-level classification and accessibility analysis&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;CEFR-Cymraeg&lt;/strong&gt;: the first CEFR-annotated proficiency dataset for Welsh (A1–B2), enabling automated language proficiency assessment for Welsh learners&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
&lt;h3 id="selected-publications"&gt;Selected Publications&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;/li&gt;
&lt;/ul&gt;</description></item></channel></rss>