Please use this identifier to cite or link to this item:
http://hdl.handle.net/2080/3763
Title: | NIT Rourkela Machine Translation (MT) System Submission to WAT 2022 for MultiIndicMT: An Indic Language Multilingual Shared Task |
Authors: | Das, Sudhansu Bala; Biradar, Atharv; Mishra, Tapas Kumar; Patra, Bidyut Kumar |
Keywords: | Multilingual Neural Machine Translation; fairseq modelling toolkit; BLEU and RIBES metric scores |
Issue Date: | Oct-2022 |
Citation: | 29th International Conference on Linguistics, October 12-17, 2022, Korea, Virtual |
Abstract: | Multilingual Neural Machine Translation (MNMT) achieves strong performance with a single translation model for many languages. Previous studies on multilingual translation show that multilingual training is effective for languages with limited corpora. This paper presents our submission (Team ID: NITR) to the WAT 2022 "MultiIndicMT" shared task, whose objective is translation between 5 Indic languages (newly added to the WAT 2022 corpus) and English, in both directions, using the corpus provided by the WAT organizers. Our system is a transformer-based NMT model built with the fairseq modelling toolkit and ensemble techniques. Heuristic preprocessing is applied before the model is trained. Our multilingual NMT systems are trained with shared encoder and decoder parameters, with a language embedding assigned to each token in both the encoder and the decoder. Our final multilingual system was evaluated using BLEU and RIBES metric scores. |
Description: | Copyright belongs to the proceedings publisher |
URI: | http://hdl.handle.net/2080/3763 |
Appears in Collections: | Conference Papers |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
DasS_COLING2022.pdf | | 78.28 kB | Adobe PDF | View/Open |