Title: NIT Rourkela Machine Translation (MT) System Submission to WAT 2022 for MultiIndicMT: An Indic Language Multilingual Shared Task
Authors: Das, Sudhansu Bala
Biradar, Atharv
Mishra, Tapas Kumar
Patra, Bidyut Kumar
Keywords: Multilingual Neural Machine Translation
fairseq modelling toolkit
BLEU and RIBES metric scores
Issue Date: Oct-2022
Citation: 29th International Conference on Computational Linguistics (COLING 2022), October 12-17, 2022, Korea, Virtual
Abstract: Multilingual Neural Machine Translation (MNMT) achieves strong performance with a single translation model for many languages. Previous studies on multilingual translation show that multilingual training is effective for languages with limited corpora. This paper presents our submission (Team ID: NITR) to the WAT 2022 "MultiIndicMT" shared task, whose objective is translation from 5 Indic languages (newly added to the WAT 2022 corpus) into English and vice versa, using the corpus provided by the WAT organizers. Our system is a transformer-based NMT model built with the fairseq modelling toolkit and ensemble techniques. Heuristic preprocessing is applied before the model is trained. Our multilingual NMT systems are trained with shared encoder and decoder parameters, and a language embedding is assigned to each token in both the encoder and the decoder. Our final multilingual system was evaluated using BLEU and RIBES metric scores.
Description: Copyright belongs to the proceedings publisher
Appears in Collections: Conference Papers

Files in This Item:
File                  Description   Size       Format
DasS_COLING2022.pdf                 78.28 kB   Adobe PDF
