Incremental Syntactic Language Models for Phrase-based Translation

Lane Schwartz (1), Chris Callison-Burch (2), William Schuler (3), Stephen Wu (4)
(1) Air Force Research Lab, (2) Johns Hopkins University, (3) Ohio State University, (4) Mayo Clinic


Abstract

This paper describes a novel technique for incorporating syntactic knowledge into phrase-based machine translation through incremental syntactic parsing.

Bottom-up and top-down parsers typically require a completed string as input. This requirement makes it difficult to incorporate them into phrase-based translation, which generates partial hypothesized translations from left to right.

Incremental syntactic language models score sentences in a similar left-to-right fashion, and are therefore a good mechanism for incorporating syntax into phrase-based translation.
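
To make the left-to-right scoring idea concrete, the following is a minimal sketch, not the model described in the paper. It uses a toy bigram model as a stand-in for the syntactic language model; the names `LMState`, `extend`, and `BIGRAM_LOGPROB` and all probabilities are hypothetical. The point it illustrates is only the interface: each partial hypothesis carries a small state, and appending a phrase needs that state plus the new words, never the completed sentence. In a syntactic model, the state would presumably hold incremental parser information rather than just the last word.

```python
import math
from dataclasses import dataclass

# Toy bigram log-probabilities. Hypothetical numbers, for illustration only.
BIGRAM_LOGPROB = {
    ("<s>", "the"): math.log(0.5),
    ("the", "president"): math.log(0.2),
    ("president", "met"): math.log(0.1),
}
UNSEEN_LOGPROB = math.log(1e-4)  # crude floor for unseen bigrams (assumption)


@dataclass(frozen=True)
class LMState:
    """What the model must remember about the prefix scored so far."""
    last_word: str = "<s>"


def extend(state, phrase):
    """Score the words of `phrase` given `state`.

    Returns (log_prob_of_phrase, new_state). Only the stored state and the
    new words are consulted, so a decoder can call this each time it
    appends a phrase to a partial hypothesis.
    """
    log_prob = 0.0
    prev = state.last_word
    for word in phrase:
        log_prob += BIGRAM_LOGPROB.get((prev, word), UNSEEN_LOGPROB)
        prev = word
    return log_prob, LMState(last_word=prev)


# A partial hypothesis grown phrase by phrase, as a phrase-based decoder would.
state = LMState()
total = 0.0
for phrase in (["the"], ["president", "met"]):
    phrase_score, state = extend(state, phrase)
    total += phrase_score
```

Scoring the hypothesis phrase by phrase gives the same total as scoring the whole string in one pass, which is the property that lets such a model sit inside a left-to-right decoder.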

We give a formal definition of one such linear-time syntactic language model, detail its relation to phrase-based decoding, and integrate the model with the Moses phrase-based translation system.

We present empirical results on a constrained Urdu-English translation task that demonstrate a significant BLEU score improvement and a large decrease in perplexity.




Full paper: http://www.aclweb.org/anthology/P/P11/P11-1063.pdf