Dealing with Spurious Ambiguity in Learning ITG-based Word Alignment

Shujian Huang1,  Stephan Vogel2,  Jiajun Chen1
1State Key Laboratory for Novel Software Technology, Nanjing University, 2Language Technologies Institute, Carnegie Mellon University


Abstract

Word alignment has an exponentially large search space, which often makes exact inference infeasible. Recent studies have shown that inversion transduction grammars are reasonable constraints for word alignment, and that the constrained space can be searched efficiently using synchronous parsing algorithms. However, spurious ambiguity may occur in synchronous parsing and cause problems in both search efficiency and accuracy. In this paper, we conduct a detailed study of the causes of spurious ambiguity and how it affects parsing and discriminative learning. We also propose a variant of the grammar which eliminates those ambiguities. Our grammar shows advantages over previous grammars in both synthetic and real-world experiments.




Full paper: http://www.aclweb.org/anthology/P/P11/P11-2066.pdf