Towards Neural Machine Translation with Latent Tree Attention

James Bradbury, Richard Socher


Abstract
Building models that take advantage of the hierarchical structure of language without a priori annotation is a longstanding goal in natural language processing. We introduce such a model for the task of machine translation, pairing a recurrent neural network grammar encoder with a novel attentional RNNG decoder and applying policy gradient reinforcement learning to induce unsupervised tree structures on both the source and target. When trained on character-level datasets with no explicit segmentation or parse annotation, the model learns a plausible segmentation and shallow parse, obtaining performance close to an attentional baseline.
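The policy gradient tree induction described in the abstract can be illustrated with a toy REINFORCE update: binary structure decisions (shift vs. reduce) are sampled from a Bernoulli policy and reinforced by a scalar reward. This is a minimal sketch under stated assumptions, not the authors' model: the per-decision logits, the fixed "plausible parse" target, and the agreement-based reward are all illustrative stand-ins for the downstream translation log-likelihood the paper uses.

```python
import math
import random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sample_actions(theta):
    """Sample binary structure decisions (1 = REDUCE, 0 = SHIFT)."""
    return [1 if random.random() < sigmoid(t) else 0 for t in theta]

def reward(actions, target):
    """Stand-in reward: fraction of agreement with a fixed 'good' parse.
    (The paper instead rewards downstream translation likelihood.)"""
    return sum(1.0 for a, t in zip(actions, target) if a == t) / len(target)

def reinforce_step(theta, target, lr=0.5, baseline=0.5):
    actions = sample_actions(theta)
    r = reward(actions, target)
    # REINFORCE: grad of log Bernoulli(a; sigma(theta)) w.r.t. theta
    # is (a - sigma(theta)); scale by the baselined reward.
    for i, a in enumerate(actions):
        theta[i] += lr * (r - baseline) * (a - sigmoid(theta[i]))
    return r

theta = [0.0] * 6            # one logit per structure decision (hypothetical)
target = [1, 0, 1, 1, 0, 0]  # a fixed parse the policy should recover
for _ in range(2000):
    reinforce_step(theta, target)

learned = [1 if sigmoid(t) > 0.5 else 0 for t in theta]
print(learned)  # should drift toward the target decisions
```

The baselined reward keeps the gradient estimate low-variance, which is the standard trick when the structure decisions are discrete and non-differentiable, as in the paper's unsupervised segmentation and parsing.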
Anthology ID:
W17-4303
Volume:
Proceedings of the 2nd Workshop on Structured Prediction for Natural Language Processing
Month:
September
Year:
2017
Address:
Copenhagen, Denmark
Editors:
Kai-Wei Chang, Ming-Wei Chang, Vivek Srikumar, Alexander M. Rush
Venue:
WS
Publisher:
Association for Computational Linguistics
Pages:
12–16
URL:
https://aclanthology.org/W17-4303
DOI:
10.18653/v1/W17-4303
Cite (ACL):
James Bradbury and Richard Socher. 2017. Towards Neural Machine Translation with Latent Tree Attention. In Proceedings of the 2nd Workshop on Structured Prediction for Natural Language Processing, pages 12–16, Copenhagen, Denmark. Association for Computational Linguistics.
Cite (Informal):
Towards Neural Machine Translation with Latent Tree Attention (Bradbury & Socher, 2017)
PDF:
https://aclanthology.org/W17-4303.pdf
Attachment:
W17-4303.Attachment.zip