Call for Participation: "Low-Resource Indic Language Translation" Shared Task at the Tenth Conference on Machine Translation (WMT25), EMNLP 2025, November 5-9, 2025
Suzhou, China
We are pleased to inform you that we are hosting the "Shared Task: Low-Resource Indic Language Translation" again this year as part of WMT 2025. Following the outstanding success and enthusiastic participation witnessed in the previous year's edition, we are excited to continue this important initiative. Despite recent advancements in machine translation (MT), such as multilingual translation and transfer learning techniques, the scarcity of parallel data remains a significant challenge, particularly for low-resource languages.
The WMT 2025 Indic Machine Translation Shared Task aims to address this challenge by focusing on low-resource Indic languages from diverse language families. Specifically, we are targeting North East Indian languages such as Assamese, Mizo, Khasi, Manipuri, Nyishi, Bodo, Mising, and Kokborok.
For inquiries and further information, please contact us at lrilt.wmt@gmail.com. Additionally, you can find more details and updates on the task through the following link:
Task Link: https://www2.statmt.org/wmt25/indic-mt-task.html
We strongly encourage participants to register in advance so that we can send periodic updates on data release dates and other relevant information.
To register for the event, please fill out the registration form available here:
Link: https://docs.google.com/forms/d/1EWz5obFNaUnzXLEW6MTf46e4V5MesjKWihL4NymEqg8/preview
This year’s task features two categories:
Category 1: (Moderate Training Data Available)
en-as: English ⇔ Assamese
en-lus: English ⇔ Mizo
en-kha: English ⇔ Khasi
en-mni: English ⇔ Manipuri
en-njz: English ⇔ Nyishi
Category 2: (Very Limited Training Data)
en-bodo: English ⇔ Bodo
GOAL
The central objective is to develop MT systems that produce high-quality translations despite the constraints of data availability. Participants are encouraged to explore:
Monolingual Data Utilization: Leveraging monolingual data effectively for improved translation.
Multilingual Approaches: Investigating whether cross-lingual transfer benefits low-resource pairs.
Transfer Learning: Adapting models trained on richer language pairs to the target languages.
Innovative Techniques: Experimenting with novel methods specifically tailored for low-resource settings.
DEADLINES:
March 18, 2025: Website released, and the task is announced!
March 20, 2025: Team Registration Open
April 20, 2025: Team Registration Close
April 25, 2025: Training Data Release (registered participants only)
May 25, 2025: Test Data Release (registered participants only)
June 01, 2025: Run Submission deadline (AoE)! Please don't forget to send a brief system description.
June 25, 2025: Results declared to individual teams
August 14, 2025: System Paper Submission
November 5-9, 2025: EMNLP 2025 Conference
ORGANIZERS:
Santanu Pal, Wipro AI Lab, Kolkata, India / London, UK
Partha Pakray, National Institute of Technology, Silchar, India
Sandeep Kumar Dash, National Institute of Technology, Mizoram, India
Lenin Laitonjam, National Institute of Technology, Mizoram, India
Arnab Maji, North-Eastern Hill University, India
Saralin A Lyngdoh, North-Eastern Hill University, India
Riyanka Manna, Amrita Vishwa Vidyapeetham, Andhra Pradesh, India
Ajit Das, Bodoland University, India
Anupam Jamatia, National Institute of Technology, Agartala, India
Koj Sambyo, National Institute of Technology, Arunachal Pradesh, India
CONTACT:
lrilt.wmt@gmail.com
https://www2.statmt.org/wmt25/indic-mt-task.html
Partha Pakray