Once upon a time, synchronous grammars and tree transducers were
esoteric topics in formal language theory, far removed from the
practice of building real, large-scale natural language systems. But
these tools are now rapidly becoming essential for modeling machine
translation and other complex language transformations. It has
therefore become practical and important to understand the basic
properties of tree transformation systems.
Many advances in NLP in the 1990s exploited basic algorithms for
probabilistic finite-state string transducers, whose theory is
well understood and widely taught. The analogous theory for trees is
less widely known but well developed, with roots going back to the
1960s. In this tutorial, we aim to (1) cover the literature on
synchronous grammars and tree transducers, (2) describe how they
relate to current NLP applications such as machine translation, and
(3) discuss new theoretical and algorithmic problems raised by these
applications, along with some recent solutions.
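As a taste of the formalism, the sketch below shows a toy synchronous
context-free grammar deriving a source sentence and a target sentence in
lock-step: each rule pairs a source and a target right-hand side, and
nonterminals that share a link index are rewritten together. The grammar,
word pairs, and reordering here are invented for illustration and are not
taken from the tutorial material.

    # Minimal sketch of a toy synchronous CFG (hypothetical example grammar).
    from dataclasses import dataclass

    @dataclass
    class SyncRule:
        lhs: str     # left-hand-side nonterminal
        src: list    # source RHS: a str is a terminal, a (symbol, index) pair is a linked nonterminal
        tgt: list    # target RHS: same link indices, possibly in a different order

    # One rule per nonterminal keeps the toy derivation deterministic.
    grammar = {
        "S":   [SyncRule("S",   [("NP", 1), ("VP", 2)], [("NP", 1), ("VP", 2)])],
        "VP":  [SyncRule("VP",  [("V", 1), ("OBJ", 2)], [("OBJ", 2), ("V", 1)])],  # verb moves to the end
        "NP":  [SyncRule("NP",  ["maria"], ["maria"])],
        "OBJ": [SyncRule("OBJ", ["an", "apple"], ["einen", "Apfel"])],
        "V":   [SyncRule("V",   ["eats"], ["isst"])],
    }

    def derive(nt):
        """Expand one nonterminal, returning a (source words, target words) pair."""
        rule = grammar[nt][0]
        # Expand each linked nonterminal exactly once, then splice the results
        # into both sides according to each side's own ordering.
        subs = {i: derive(sym) for (sym, i) in [x for x in rule.src if isinstance(x, tuple)]}
        src = [w for x in rule.src for w in (subs[x[1]][0] if isinstance(x, tuple) else [x])]
        tgt = [w for x in rule.tgt for w in (subs[x[1]][1] if isinstance(x, tuple) else [x])]
        return src, tgt

    if __name__ == "__main__":
        source, target = derive("S")
        print(" ".join(source), "|||", " ".join(target))
        # prints: maria eats an apple ||| maria einen Apfel isst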
Prerequisites
Attendees should be familiar with the basic concepts of
finite-state automata and context-free grammars.
Dr. David Chiang is a researcher at USC/ISI working on statistical
machine translation using synchronous grammars. His interests span
both the theoretical and practical aspects of this area, from
investigating the formal power of synchronous grammar formalisms to
building a large-scale synchronous-CFG-based statistical machine
translation system.
Dr. Kevin Knight, also at USC/ISI, has worked in many areas,
including novel approaches to syntax-based machine translation,
natural language generation, and name transliteration. He is
currently interested in tree automata algorithms and applications,
and in building large-scale machine translation systems. Dr. Knight
has given invited talks and tutorials in recent years at IJCAI, UAI,
ACL, and other conferences.