Machine translation quality estimation system ranked first in WMT 2020
Prof Constantin Orasan’s machine translation quality estimation system has ranked first in the WMT 2020 sentence-level direct assessment quality estimation shared task on seven different language pairs. This is joint work with colleagues from the Research Group in Computational Linguistics at the University of Wolverhampton.
Quality estimation can play a very important role in the translation workflow, as it can be used to decide whether an automatic translation is good enough to be used in a particular context, whether it requires post-editing by a professional translator before it can be displayed, or whether it is so poor that it needs to be manually translated from scratch. The method developed by Prof Orasan relies on the latest advances in neural natural language processing techniques, which enable the development of highly accurate multilingual systems. In addition to performing better than the state of the art, the proposed approach is well suited to environments that have to deal with several language pairs, as is usually the case with Language Service Providers (LSPs): it requires fewer computational resources and therefore lowers costs for LSPs.
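The three-way decision described above can be illustrated with a minimal Python sketch. It is not part of TransQuest and does not use its API: the function name `route_translation`, the assumption that the quality score has been normalised to the range 0 to 1, and both threshold values are hypothetical and would need to be tuned for a real workflow.

```python
from enum import Enum


class Route(Enum):
    USE_AS_IS = "use as is"
    POST_EDIT = "send for post-editing"
    RETRANSLATE = "translate manually from scratch"


def route_translation(
    qe_score: float,
    publish_threshold: float = 0.8,    # hypothetical value, not from the paper
    post_edit_threshold: float = 0.4,  # hypothetical value, not from the paper
) -> Route:
    """Map a predicted quality score in [0, 1] to a workflow decision."""
    if not 0.0 <= qe_score <= 1.0:
        raise ValueError("qe_score is assumed to be normalised to [0, 1]")
    if post_edit_threshold > publish_threshold:
        raise ValueError("post_edit_threshold must not exceed publish_threshold")

    if qe_score >= publish_threshold:
        return Route.USE_AS_IS
    if qe_score >= post_edit_threshold:
        return Route.POST_EDIT
    return Route.RETRANSLATE
```

In practice the score would come from a quality estimation model, and an LSP would probably set different thresholds for each language pair and content type.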
A paper describing the method has been accepted at the Fifth Conference on Machine Translation (WMT20):
- Tharindu Ranasinghe, Constantin Orăsan and Ruslan Mitkov (2020) TransQuest at WMT2020: Sentence-Level Direct Assessment. In Proceedings of Fifth Conference on Machine Translation (WMT20). [ArXiv]
A second paper, which goes beyond describing the winning QE system and shows how the approach can be successfully employed for less-resourced languages such as Nepali and Sinhala, was accepted for presentation at the prestigious 28th International Conference on Computational Linguistics (COLING 2020):
- Tharindu Ranasinghe, Constantin Orăsan and Ruslan Mitkov (2020) TransQuest: Translation Quality Estimation with Cross-lingual Transformers. In Proceedings of COLING 2020. [ArXiv]