A Stochastic Decoder for Neural Machine Translation

Philip Schulz, Wilker Aziz, Trevor Cohn


Abstract
The process of translation is ambiguous, in that there are typically many valid translations for a given sentence. This gives rise to significant variation in parallel corpora; however, most current models of machine translation do not account for this variation, instead treating the problem as a deterministic process. To this end, we present a deep generative model of machine translation which incorporates a chain of latent variables, in order to account for local lexical and syntactic variation in parallel corpora. We provide an in-depth analysis of the pitfalls encountered in variational inference for training deep generative models. Experiments on several different language pairs demonstrate that the model consistently improves over strong baselines.
Anthology ID:
P18-1115
Volume:
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2018
Address:
Melbourne, Australia
Editors:
Iryna Gurevych, Yusuke Miyao
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
1243–1252
URL:
https://aclanthology.org/P18-1115
DOI:
10.18653/v1/P18-1115
Cite (ACL):
Philip Schulz, Wilker Aziz, and Trevor Cohn. 2018. A Stochastic Decoder for Neural Machine Translation. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1243–1252, Melbourne, Australia. Association for Computational Linguistics.
Cite (Informal):
A Stochastic Decoder for Neural Machine Translation (Schulz et al., ACL 2018)
PDF:
https://aclanthology.org/P18-1115.pdf
Note:
 P18-1115.Notes.pdf
Video:
 https://aclanthology.org/P18-1115.mp4
Code:
 awslabs/sockeye