Massively Parallel Cross-Lingual Learning in Low-Resource Target Language Translation

Zhong Zhou, Matthias Sperber, Alexander Waibel


Abstract
We work on translation from rich-resource languages to low-resource languages. The main challenges we identify are the lack of low-resource language data, effective methods for cross-lingual transfer, and the variable-binding problem that is common in neural systems. We build a translation system that addresses these challenges using eight European language families as our test ground. Firstly, we add the source and the target family labels and study intra-family and inter-family influences for effective cross-lingual transfer. We achieve an improvement of +9.9 in BLEU score for English-Swedish translation using eight families compared to the single-family multi-source multi-target baseline. Moreover, we find that training on two neighboring families closest to the low-resource language is often enough. Secondly, we construct an ablation study and find that reasonably good results can be achieved even with considerably less target data. Thirdly, we address the variable-binding problem by building an order-preserving named entity translation model. We obtain 60.6% accuracy in qualitative evaluation where our translations are akin to human translations in a preliminary study.
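The family-labelling step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the labels are prepended to each source sentence as special tokens; the token format (`<src_fam:...>`, `<tgt_fam:...>`) and the function name `add_family_labels` are hypothetical and are not taken from the paper.

```python
def add_family_labels(sentence: str, src_family: str, tgt_family: str) -> str:
    """Prepend source- and target-family tokens to a source sentence.

    The token format is an assumption made for illustration only;
    the paper's actual labelling scheme may differ.
    """
    src_token = f"<src_fam:{src_family}>"
    tgt_token = f"<tgt_fam:{tgt_family}>"
    return f"{src_token} {tgt_token} {sentence}"


# Example: an English source sentence destined for a Swedish target,
# both of which belong to the Germanic family.
labelled = add_family_labels("Hello world", "germanic", "germanic")
print(labelled)
```

A multilingual model trained on sentences tagged this way could, in principle, share parameters within a family while still distinguishing families from one another, which is the kind of intra-family and inter-family influence the abstract describes studying.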
Anthology ID:
W18-6324
Volume:
Proceedings of the Third Conference on Machine Translation: Research Papers
Month:
October
Year:
2018
Address:
Brussels, Belgium
Editors:
Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, Christof Monz, Matteo Negri, Aurélie Névéol, Mariana Neves, Matt Post, Lucia Specia, Marco Turchi, Karin Verspoor
Venue:
WMT
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Pages:
232–243
URL:
https://aclanthology.org/W18-6324
DOI:
10.18653/v1/W18-6324
Cite (ACL):
Zhong Zhou, Matthias Sperber, and Alexander Waibel. 2018. Massively Parallel Cross-Lingual Learning in Low-Resource Target Language Translation. In Proceedings of the Third Conference on Machine Translation: Research Papers, pages 232–243, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal):
Massively Parallel Cross-Lingual Learning in Low-Resource Target Language Translation (Zhou et al., WMT 2018)
PDF:
https://aclanthology.org/W18-6324.pdf