Junior Researcher in Speech-to-Text Technologies for Minority Languages
Job description
Institute for Applied Linguistics
The Language Technologies (LT) research group at the Institute for Applied Linguistics is seeking a junior computational linguist to contribute to the DIGI-RLF project (Interreg Italia-Svizzera). DIGI-RLF addresses the challenge of preserving and enhancing the Rhaeto‑Romance minority languages (Ladin in South Tyrol and Romansh in Grisons), aiming to overcome linguistic barriers that hinder the efficiency of public administrations in cross‑border regions. The goal is to transform the currently fragmented landscape of digital linguistic resources into a coordinated ecosystem, enabling administrations to provide services in minority languages with greater efficiency and quality. Key outputs include a joint digitisation strategy, integration into international standards (Unicode CLDR) and optimised AI models for automatic speech‑to‑text (STT) transcription for Ladin and Romansh.
The position focusses on the acquisition, refinement and processing of Ladin and Romansh spoken data for the development and evaluation of STT models. The role offers opportunities to build expertise in NLP for low‑resource languages while working in a collaborative, interdisciplinary research environment. As the position involves working with other researchers and practitioners on locally collected language data, experience with linguistic research (especially speech data) and basic knowledge of Ladin and Romansh are considered an advantage.
We are looking for a cooperative, proactive colleague who thrives in an interdisciplinary and application‑oriented research environment.
Tasks
- Contribute to the creation, processing, documentation and maintenance of spoken corpora
- Support digitisation, data cleaning and annotation, quality control and metadata management in line with good research data management practices
- Analyse mono‑ and multilingual data using quantitative and computational methods
- Implement, adapt and evaluate language technology workflows (e.g. NLP pipelines, data processing, evaluation setups)
- Support research dissemination through scientific and transfer‑oriented publications, presentations and internal knowledge sharing
- Take part in the meetings and initiatives of the LT group and the wider Institute alongside the project work
Requirements
- Degree (MA/MSc or BSc) in a relevant field, such as Computational Linguistics, Data Science, Computer Science or similar (linguistics degrees will be considered if the candidate can also demonstrate technical skills)
- Awareness of (and/or interest in acquiring good practice in) research data management, including all steps required for the collection and creation of data and metadata that comply with the FAIR principles
- Awareness of reproducible research practices or strong interest in learning and applying them in practice
- Strong programming skills in Python, including PyTorch, Transformers and other relevant libraries
- Knowledge of typical text and data processing pipelines and current NLP toolkits (e.g. spaCy, Stanza, quanteda) or strong interest in learning and applying them in practice
- Solid knowledge of (large) language models and their application to common NLP tasks
- Familiarity with git, Jupyter notebooks and command‑line interfaces (CLI)
- Willingness to move to South Tyrol or to its vicinity in order to work on‑site
- Strong command of English and Italian
- Basic knowledge of German or willingness to acquire it as a working language
- Social, organisational and communication skills, including careful scheduling and task management
- Ability and willingness to collaborate with researchers from different disciplinary backgrounds and research paradigms
Additional advantageous skills
- Experience with DevOps, data integration workflows or high‑performance computing environments
- Experience with digitisation technologies such as digital audio recording and editing
- Experience with linguistic research, especially speech data, as well as basic knowledge of Ladin and Romansh
We offer
- A full‑time position for 18 months. If the selected candidate is interested, a part‑time contract of at least 70% could also be considered.
- A supportive, international, and interdisciplinary research environment
- Professional development opportunities
- Flexible working arrangements with regular on‑site presence to ensure exchange and collaboration
- Benefits (e.g. family‑friendly benefits, lunch bonus, supplementary health insurance)
- Access to numerous scientific and cultural facilities and events
Eurac Research actively supports equal opportunities and diversity and encourages applications from candidates of all backgrounds.
Interested candidates should submit their application (CV and cover letter) by .