An Exploration of Neural Sequence-to-Sequence Architectures for Automatic Post-Editing

Marcin Junczys-Dowmunt, Roman Grundkiewicz


Abstract
In this work, we explore multiple neural architectures adapted for the task of automatic post-editing of machine translation output. We focus on neural end-to-end models that combine both inputs mt (raw MT output) and src (source-language input) in a single neural architecture, modeling {mt, src} → pe directly. In addition, we investigate the influence of hard-attention models, which seem well suited for monolingual tasks, as well as combinations of both ideas. We report results on the data sets provided during the WMT-2016 shared task on automatic post-editing and demonstrate that dual-attention models that incorporate all available data of the APE scenario in a single model improve on the best shared-task system and on all other results published after the shared task. Dual-attention models combined with hard attention remain competitive despite applying fewer changes to the input.
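The dual-attention idea described in the abstract can be illustrated with a minimal sketch: a decoder step that attends separately over the encoded mt and src sequences and conditions its next-word prediction on both context vectors. This is not the authors' implementation; module names, dimensions, and the choice of a GRU-based decoder in PyTorch are illustrative assumptions.

    # Minimal sketch of one dual-attention decoder step (illustrative only).
    # The decoder keeps separate attention modules over the encoded MT output
    # (enc_mt) and the encoded source sentence (enc_src) and feeds both
    # context vectors into the recurrent state update and output projection.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class DualAttentionDecoderStep(nn.Module):
        def __init__(self, hidden_dim, emb_dim, vocab_size):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb_dim)
            # one attention scorer per input sequence (mt and src)
            self.att_mt = nn.Linear(2 * hidden_dim, 1)
            self.att_src = nn.Linear(2 * hidden_dim, 1)
            self.rnn = nn.GRUCell(emb_dim + 2 * hidden_dim, hidden_dim)
            self.out = nn.Linear(hidden_dim, vocab_size)

        def attend(self, scorer, state, annotations):
            # state: (batch, hidden_dim); annotations: (batch, seq_len, hidden_dim)
            expanded = state.unsqueeze(1).expand(-1, annotations.size(1), -1)
            scores = scorer(torch.cat([expanded, annotations], dim=-1)).squeeze(-1)
            weights = F.softmax(scores, dim=-1)
            # weighted sum of annotations -> context vector (batch, hidden_dim)
            return torch.bmm(weights.unsqueeze(1), annotations).squeeze(1)

        def forward(self, prev_token, state, enc_mt, enc_src):
            ctx_mt = self.attend(self.att_mt, state, enc_mt)    # context over MT output
            ctx_src = self.attend(self.att_src, state, enc_src) # context over source input
            rnn_in = torch.cat([self.embed(prev_token), ctx_mt, ctx_src], dim=-1)
            state = self.rnn(rnn_in, state)
            return F.log_softmax(self.out(state), dim=-1), state

Keeping two independent attention modules in one model is what makes this a single {mt, src} → pe architecture rather than two separate encoder-decoder systems whose outputs are combined afterwards.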
Anthology ID:
I17-1013
Volume:
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Month:
November
Year:
2017
Address:
Taipei, Taiwan
Editors:
Greg Kondrak, Taro Watanabe
Venue:
IJCNLP
Publisher:
Asian Federation of Natural Language Processing
Pages:
120–129
URL:
https://aclanthology.org/I17-1013
Cite (ACL):
Marcin Junczys-Dowmunt and Roman Grundkiewicz. 2017. An Exploration of Neural Sequence-to-Sequence Architectures for Automatic Post-Editing. In Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 120–129, Taipei, Taiwan. Asian Federation of Natural Language Processing.
Cite (Informal):
An Exploration of Neural Sequence-to-Sequence Architectures for Automatic Post-Editing (Junczys-Dowmunt & Grundkiewicz, IJCNLP 2017)
PDF:
https://aclanthology.org/I17-1013.pdf
Data
WMT 2016