All ‘A’ sessions are held in the Bayside Auditorium A; ‘B’ sessions are in Bayside 103; ‘C’ sessions are in Bayside 104; and ‘D’ sessions and the Student Research Workshop sessions are in Bayside 102.
Monday 17th July 9:30am–10:30am
1A: Machine Translation I
Session Chair: David Chiang
Combination of Arabic Preprocessing Schemes for Statistical Machine Translation
Fatiha Sadat and Nizar Habash
Statistical machine
translation is quite robust when it comes to the choice of input
representation. It only requires consistency between training and testing. As a
result, there is a wide range of possible preprocessing choices for data used
in statistical machine translation. This is even more so for morphologically
rich languages such as Arabic. In this paper, we study the effect of different
word-level preprocessing schemes for Arabic on the quality of phrase-based
statistical machine translation. We also present and evaluate different methods
for combining preprocessing schemes resulting in improved translation quality.
Going Beyond AER: An Extensive Analysis of Word Alignments and their Impact on MT
Necip Fazil Ayan and Bonnie J. Dorr
This paper presents an
extensive evaluation of five different alignments and investigates their impact
on the corresponding MT system output. We introduce new measures for intrinsic
evaluations and examine the distribution of phrases and untranslated words
during decoding to identify which characteristics of different alignments
affect translation. We show that precision-oriented alignments yield better MT
output (translating more words and using longer phrases) than recall-oriented
alignments.
1B: Topic Segmentation
Session Chair: Martha Palmer
Unsupervised Topic Modelling for Multi-Party Spoken Discourse
Matthew Purver, Konrad P. Körding, Thomas L. Griffiths and Joshua B. Tenenbaum
We present a method for
unsupervised topic modelling which adapts methods used in document
classification (Blei et al., 2003; Griffiths and Steyvers, 2004) to unsegmented
multi-party discourse transcripts. We show how Bayesian inference in this
generative model can be used to simultaneously address the problems of topic
segmentation and topic identification: automatically segmenting multi-party
meetings into topically coherent segments with performance which compares well
with previous unsupervised segmentation-only methods (Galley et al., 2003)
while simultaneously extracting topics which rate highly when assessed for
coherence by human judges. We also show that this method appears robust in the
face of off-topic dialogue and speech recognition errors.
Minimum Cut Model for Spoken Lecture Segmentation
Igor Malioutov and Regina Barzilay
We consider the task of
unsupervised lecture segmentation. We formalize segmentation as a
graph-partitioning task that optimizes the normalized cut criterion. Our
approach moves beyond localized comparisons and takes into account long-range
cohesion dependencies. Our results demonstrate that global analysis improves
the segmentation accuracy and is robust in the presence of speech recognition
errors.
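A minimal sketch of the normalized-cut criterion described above (an illustration only, not the authors' system; the example sentences and the bag-of-words cosine similarity are stand-ins for the paper's actual lexical-cohesion features):

```python
import numpy as np
from collections import Counter

def cosine(c1, c2):
    num = sum(c1[w] * c2[w] for w in set(c1) & set(c2))
    den = np.sqrt(sum(v * v for v in c1.values())) * np.sqrt(sum(v * v for v in c2.values()))
    return num / den if den else 0.0

def normalized_cut_boundary(sentences):
    """Place one topic boundary by minimizing the normalized-cut score over all splits."""
    counts = [Counter(s.lower().split()) for s in sentences]
    n = len(counts)
    W = np.array([[cosine(counts[i], counts[j]) for j in range(n)] for i in range(n)])
    best = None
    for k in range(1, n):                                  # candidate boundary before sentence k
        A, B = list(range(k)), list(range(k, n))
        cut = W[np.ix_(A, B)].sum()                        # similarity crossing the boundary
        score = cut / W[A, :].sum() + cut / W[B, :].sum()  # normalized-cut objective
        if best is None or score < best[1]:
            best = (k, score)
    return best

sentences = ["the heap stores elements in an array",
             "heap operations run in logarithmic time",
             "now we turn to graph search",
             "breadth first search visits the graph level by level"]
print(normalized_cut_boundary(sentences))                  # boundary index and its score
```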
1C: Coreference
Session Chair: Vincent Ng
Bootstrapping Path-Based Pronoun Resolution
Shane Bergsma and Dekang Lin
We
present an approach to pronoun resolution based on syntactic paths. Through a
simple bootstrapping procedure, we learn the likelihood of coreference between
a pronoun and a candidate noun based on the path in the parse tree between the
two entities. This path information enables us to handle previously challenging
resolution instances, and also robustly addresses traditional syntactic
coreference constraints. Highly coreferent paths also allow mining of precise
probabilistic gender/number information. We combine statistical knowledge with
well-known features in a Support Vector Machine pronoun resolution classifier.
Significant gains in performance are observed on several datasets.
Kernel-Based Pronoun Resolution with Structured Syntactic Knowledge
Xiaofeng Yang, Jian Su and Chew Lim Tan
Syntactic knowledge is important for pronoun resolution. Traditionally, the syntactic information for pronoun resolution is represented in terms of features that have to be selected and defined heuristically. In the paper, we propose a kernel-based method that can automatically mine the syntactic information from the parse trees for pronoun resolution. Specifically, we utilize the parse trees directly as a structured feature and apply kernel functions to this feature, as well as other normal features, to learn the resolution classifier. In this way, our approach avoids the efforts of decoding the parse trees into the set of flat syntactic features. The experimental results show that our approach can bring significant performance improvement and is reliably effective for the pronoun resolution task.
1D: Grammars I
Session Chair: Martin Kay
A Finite-State Model of Human Sentence Processing
It has previously been assumed in the psycholinguistic literature that
finite-state models of language are crucially limited in their explanatory
power by the locality of the probability distribution and the narrow scope of
information used by the model. We show that a simple computational model (a
bigram part-of-speech tagger based on the design used by Corley and Crocker
(2000)) makes correct predictions on processing difficulty observed in a wide
range of empirical sentence processing data. We use two modes of evaluation: one that
relies on comparison with a control sentence, paralleling practice in human studies;
another that measures probability drop in the disambiguating region of the
sentence. Both are surprisingly good indicators of the processing difficulty of
garden-path sentences. The sentences tested are drawn from published sources
and systematically explore five different types of ambiguity: previous studies
have been narrower in scope and smaller in scale. We do not deny the
limitations of finite-state models, but argue that our results show that their
usefulness has been underestimated.
Acceptability Prediction by Means of Grammaticality Quantification
Philippe Blache, Barbara Hemforth and Stéphane Rauzy
We propose in this paper a method for quantifying sentence grammaticality. The approach, based on Property Grammars, a constraint-based syntactic formalism, makes it possible to evaluate a grammaticality index for any kind of sentence, including ill-formed ones. We compare, on a sample of sentences, the grammaticality indices obtained from the PG formalism and the acceptability judgements measured by means of a psycholinguistic analysis. The results show that the derived grammaticality index is a fairly good tracer of acceptability scores.
Monday 17th July 11:00am–12:30pm
2A: Machine Translation II
Session Chair: David Chiang
Discriminative Word Alignment with Conditional Random Fields
Phil Blunsom and Trevor Cohn
In this
paper we present a novel approach for inducing word alignments from sentence-aligned
data. We use a Conditional Random Field (CRF), a discriminative model, which is
estimated on a small supervised training set. The CRF is conditioned on both
the source and target texts, and thus allows for the use of arbitrary and
overlapping features over these data. Moreover, the CRF has efficient training
and decoding processes which both find globally optimal solutions.
We apply
this alignment model to both French-English and Romanian-English language
pairs. We show how a large number of highly predictive features can be easily
incorporated into the CRF, and demonstrate that even with only a few hundred
word-aligned training sentences, our model improves over the current
state-of-the-art with alignment error rates of 5.29 and 25.8 for the two tasks
respectively.
Named Entity Transliteration with Comparable Corpora
Richard Sproat, Tao Tao and ChengXiang Zhai
In this paper we investigate Chinese-English name transliteration using comparable corpora, corpora where texts in the two languages deal in some of the same topics --- and therefore share references to named entities --- but are not translations of each other. We present two distinct methods for transliteration, one approach using phonetic transliteration, and the second using the temporal distribution of candidate pairs. Each of these approaches works quite well, but by combining the approaches one can achieve even better results. We then propose a novel score propagation method that utilizes the co-occurrence of transliteration pairs within document pairs. This propagation method achieves further improvement over the best results from the previous step.
Extracting Parallel Sub-Sentential Fragments from Non-Parallel Corpora
Dragos Stefan Munteanu and Daniel Marcu
We present a novel method for extracting parallel sub-sentential fragments from comparable, non-parallel bilingual corpora. By analyzing potentially similar sentence pairs using a signal processing-inspired approach, we detect which segments of the source sentence are translated into segments in the target sentence, and which are not. This method enables us to extract useful machine translation training data even from very non-parallel corpora, which contain no parallel sentence pairs. We evaluate the quality of the extracted data by showing that it improves the performance of a state-of-the-art statistical machine translation system.
2B: Word Sense Disambiguation I
Session Chair: Martha Palmer
Estimating Class Priors in Domain Adaptation for Word Sense Disambiguation
Yee Seng Chan and Hwee Tou Ng
Instances of a word drawn from different domains may have different sense priors (the proportions of the different senses of a word). This in turn affects the accuracy of word sense disambiguation (WSD) systems trained and applied on different domains. This paper presents a method to estimate the sense priors of words drawn from a new domain, and highlights the importance of using well-calibrated probabilities when performing these estimations. By using well-calibrated probabilities, we are able to estimate the sense priors effectively to achieve significant improvements in WSD accuracy.
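The abstract does not spell out the estimation procedure, so the following is only a generic sketch of EM-style prior re-estimation from a trained classifier's (assumed well-calibrated) posteriors, in the spirit of what is described; the toy numbers are invented:

```python
import numpy as np

def estimate_priors(posteriors, train_priors, n_iter=50):
    """Re-estimate class priors on a new domain from calibrated posteriors (EM-style sketch)."""
    priors = train_priors.copy()
    for _ in range(n_iter):
        adjusted = posteriors * (priors / train_priors)        # re-weight by the current prior guess
        adjusted /= adjusted.sum(axis=1, keepdims=True)
        priors = adjusted.mean(axis=0)                         # new priors = mean adjusted posterior
    return priors

train_priors = np.array([0.5, 0.5])                            # sense priors seen during training
posteriors = np.array([[0.9, 0.1]] * 70 + [[0.3, 0.7]] * 30)   # classifier output on the new domain
print(estimate_priors(posteriors, train_priors))               # estimated priors for the new domain
```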
Ensemble Methods for Unsupervised WSD
Samuel Brody, Roberto Navigli and Mirella Lapata
Combination methods are an effective way of improving system performance. This paper examines the benefits of system combination for unsupervised WSD. We investigate several voting- and arbiter-based combination strategies over a diverse pool of unsupervised WSD systems. Our combination methods rely on predominant senses which are derived automatically from raw text. Experiments using the SemCor and Senseval-3 data sets demonstrate that our ensembles yield significantly better results when compared with the state of the art.
Meaningful Clustering of Senses Helps Boost Word Sense Disambiguation Performance
Roberto Navigli
Fine-grained sense distinctions are one of the major obstacles to successful Word Sense Disambiguation. In this paper, we present a method for reducing the granularity of the WordNet sense inventory based on the mapping to a manually crafted dictionary encoding sense hierarchies, namely the Oxford Dictionary of English. We assess the quality of the mapping and the induced clustering, and evaluate the performance of coarse WSD systems in the Senseval-3 English all-words task.
2C: Information Extraction I
Session Chair: Vincent Ng
Espresso: Leveraging Generic Patterns for Automatically Harvesting Semantic Relations
Patrick Pantel and Marco Pennacchiotti
In this
paper, we present Espresso, a weakly-supervised, general-purpose, and accurate
algorithm for harvesting semantic relations. The main contributions are: i) a
method for exploiting generic patterns by filtering incorrect instances using
the Web; and ii) a principled measure of pattern and instance reliability
enabling the filtering algorithm. We present an empirical comparison of
Espresso with various state of the art systems, on different size and genre
corpora, on extracting various general and specific relations. Experimental
results show that our exploitation of generic patterns substantially increases
system recall with small effect on overall precision.
Modeling Commonality among Related Classes in Relation Extraction
Zhou GuoDong, Su Jian and Zhang Min
This paper proposes a novel hierarchical learning strategy
to deal with the data sparseness
problem in relation extraction by modeling the commonality among related classes. For each class
in the hierarchy either predefined manually
or automatically clustered, a linear discriminative function is determined in a top-down way using a
perceptron algorithm with the lower-level
weight vector derived from the upper-level weight vector. As the
upper-level class normally has
much more positive training examples than the lower-level class, the corresponding linear
discriminative function can be determined more reliably. The upper-level discriminative function then can
effectively guide the
discriminative function learning in the lower-level, which otherwise might suffer from limited training data.
Evaluation on the ACE
Relation Extraction Using Label Propagation Based Semi-supervised Learning
Jinxiu Chen, Donghong Ji, Chew Lim Tan and Zhengyu Niu
Shortage
of manually labeled data is an obstacle to supervised relation extraction
methods. In this paper we investigate a graph based semi-supervised learning
algorithm, a label propagation (LP) algorithm, for relation extraction. It
represents labeled and unlabeled examples and their distances as the nodes and
the weights of edges of a graph, and tries to obtain a labeling function to
satisfy two constraints: 1) it should be fixed on the labeled nodes, 2) it
should be smooth on the whole graph. Experiment results on the ACE corpus
showed that this LP algorithm achieves better performance than
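A minimal sketch of the label propagation idea itself (an illustration with a made-up one-dimensional feature space, not the paper's feature set or implementation):

```python
import numpy as np

def label_propagation(X, y, labeled_idx, n_classes, sigma=1.0, n_iter=100):
    """Propagate labels over a graph whose edge weights come from a Gaussian kernel."""
    dist2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-dist2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    T = W / W.sum(axis=1, keepdims=True)           # row-normalized transition matrix
    Y = np.zeros((len(X), n_classes))
    Y[labeled_idx, y[labeled_idx]] = 1.0           # fix the labeled nodes
    F = Y.copy()
    for _ in range(n_iter):
        F = T @ F                                  # smooth labels over the whole graph
        F[labeled_idx] = Y[labeled_idx]            # re-clamp the labeled nodes
    return F.argmax(axis=1)

X = np.array([[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]])
y = np.array([0, -1, -1, 1, -1, -1])               # -1 marks unlabeled examples
print(label_propagation(X, y, labeled_idx=np.array([0, 3]), n_classes=2))
```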
2D: Grammars II
Session Chair: Martin Kay
Polarized Unification Grammars
Sylvain Kahane
This paper proposes a
generic mathematical formalism for the combination of various structures:
strings, trees, dags, graphs and products of them. The polarization of the
objects of the elementary structures controls the saturation of the final
structure. This formalism is both elementary and powerful enough to strongly
simulate many grammar formalisms, such as rewriting systems, dependency
grammars,
Partially Specified Signatures: a Vehicle for Grammar Modularity
Yael Cohen-Sygal and Shuly Wintner
This work provides the
essential foundations for modular construction of
(typed) unification grammars for natural languages. Much of the information in such grammars is encoded in the signature, and hence the key is facilitating a modularized development of type signatures. We introduce a definition of signature modules and show how two modules combine. Our definitions are motivated by the actual needs of grammar developers obtained through a careful examination of large scale grammars. We show that our definitions meet these needs by conforming to a detailed set of desiderata.
Morphology-Syntax Interface for Turkish
Özlem Çetinoğlu and Kemal Oflazer
This paper investigates the use of sublexical units as a solution to
handling the complex morphology with productive derivational processes, in the
development of a lexical functional grammar for Turkish. Such sublexical units
make it possible to expose the internal structure of words with multiple
derivations to the grammar rules in a uniform manner. This in turn leads to
more succinct and manageable rules. Further, the semantics of the derivations
can also be systematically reflected in a compositional way by constructing
Monday 17th July 2:00pm–3:30pm
3A: Parsing I
Session Chair: Joakim Nivre
PCFGs with Syntactic and Prosodic Indicators of Speech Repairs
John Hale, Izhak Shafran, Lisa Yung, Bonnie Dorr, Mary Harper, Anna Krasnyanskaya, Matthew Lease, Yang Liu, Brian Roark, Matthew Snover and Robin Stewart
A grammatical method of
combining two kinds of speech repair cues is presented. One cue, prosodic
disjuncture, is detected by a decision tree-based ensemble classifier that uses
acoustic cues to identify where normal prosody seems to be interrupted
(Lickley, 1996). The other cue, syntactic parallelism, codifies the expectation
that repairs continue a syntactic category that was left unfinished in the
reparandum (Levelt, 1983). The two cues are combined in a Treebank PCFG whose
states are split using a few simple tree transformations. Parsing performance
on the Switchboard and Fisher corpora suggests that these two cues help to
locate speech repairs in a synergistic way.
Dependency Parsing of Japanese Spoken Monologue Based on Clause Boundaries
Tomohiro Ohno, Shigeki Matsubara, Hideki Kashioka, Takehiko Maruyama and Yasuyoshi Inagaki
Spoken monologues feature greater sentence length and structural complexity than do spoken dialogues. To achieve high parsing performance for spoken monologues, it could prove effective to simplify the structure by dividing a sentence into suitable language units. This paper proposes a method for dependency parsing of Japanese monologues based on sentence segmentation. In this method, the dependency parsing is executed in two stages: at the clause level and the sentence level. First, the dependencies within a clause are identified by dividing a sentence into clauses and executing stochastic dependency parsing for each clause. Next, the dependencies over clause boundaries are identified stochastically, and the dependency structure of the entire sentence is thus completed. An experiment using a spoken monologue corpus shows this method to be effective for efficient dependency parsing of Japanese monologue sentences.
Trace Prediction and Recovery With Unlexicalized PCFGs and Slash Features
Helmut Schmid
This paper describes a
parser which generates parse trees with empty
elements in which traces and fillers are co-indexed. The parser is an unlexicalized PCFG parser which is guaranteed to return the most probable parse. The grammar is extracted from a version of the
3B: Dialogue I
Session Chair: Stanley Peters
Learning More Effective Dialogue Strategies Using Limited Dialogue Move Features
Matthew Frampton and Oliver Lemon
We explore the use of
restricted dialogue contexts in reinforcement learning (RL) of effective
dialogue strategies for information seeking spoken dialogue systems (e.g.
COMMUNICATOR (Walker et al., 2001)). The contexts we use are richer than
previous research in this area, e.g. (Levin and Pieraccini, 1997; Schefer and
Young, 2001; Singh et al., 2002; Pietquin, 2004), which use only slot-based
information, but are much less complex than the full dialogue Information
States explored in (Henderson et al., 2005), for which tractable learning is an
issue. We explore how incrementally adding richer features allows learning of
more effective dialogue strategies. We use 2 user simulations learned from
COMMUNICATOR data (Walker et al., 2001; Georgila et al., 2005b) to explore the
effects of different features on learned dialogue strategies. Our results show
that adding the dialogue moves of the last system and user turns increases the
average reward of the automatically learned strategies by 65.9% over the
original (hand-coded) COMMUNICATOR systems, and by 7.8% over a baseline RL
policy that uses only slot-status features. We show that the learned strategies
exhibit an emergent focus switching strategy and effective use of the `give
help' action.
Dependencies between Student State and Speech Recognition Problems in Spoken Tutoring Dialogues
Mihai Rotaru and Diane J. Litman
Speech recognition
problems are a reality in current spoken dialogue systems. In order to better
understand these phenomena, we study dependencies between speech recognition
problems and several higher level dialogue factors that define our notion of
student state: frustration/anger, certainty and correctness. We apply Chi
Square (χ2) analysis to a corpus of speech-based computer tutoring dialogues to
discover these dependencies both within and across turns. Significant
dependencies are combined to produce interesting insights regarding speech
recognition problems and to propose new strategies for handling these problems.
We also find that tutoring, as a new domain for speech applications, exhibits
interesting tradeoffs and new factors to consider for spoken dialogue design.
Learning the Structure of Task-driven Human-Human Dialogs
Srinivas Bangalore, Giuseppe Di Fabbrizio and Amanda Stent
Data-driven techniques
have been used for many computational linguistics tasks. Models derived from
data are generally more robust than hand-crafted systems since they better
reflect the distribution of the phenomena being modeled. With the availability
of large corpora of spoken dialog, dialog management is now reaping the
benefits of data-driven techniques. In this paper, we compare two approaches to
modeling subtask structure in dialog: a chunk-based model of subdialog
sequences, and a parse-based, or hierarchical, model. We evaluate these models
using customer agent dialogs from a catalog service domain.
3C: Machine Learning Methods I
Session Chair: Hal Daumé
Semi-Supervised Conditional Random Fields for Improved Sequence Segmentation and Labeling
Feng Jiao, Shaojun Wang, Chi-Hoon Lee, Russell Greiner and Dale Schuurmans
We present a new semi-supervised training procedure for conditional random fields (CRFs) that can be used to train sequence segmentors and labelers from a combination of labeled and unlabeled training data. Our approach is based on extending the minimum entropy regularization framework to the structured prediction case, yielding a training objective that combines unlabeled conditional entropy with labeled conditional likelihood. Although the training objective is no longer concave, it can still be used to improve an initial model (e.g. obtained from supervised training) by iterative ascent. We apply our new training algorithm to the problem of identifying gene and protein mentions in biological texts, and show that incorporating unlabeled data improves the performance of the supervised CRF in this case.
Training Conditional Random Fields with Multivariate Evaluation Measures
Jun Suzuki, Erik McDermott and Hideki Isozaki
This
paper proposes a framework for training Conditional Random Fields (CRFs) to
optimize multivariate evaluation measures, including non-linear measures such
as F-score. Our proposed framework is derived from an error minimization
approach that provides a simple solution for directly optimizing any evaluation
measure. Specifically focusing on sequential segmentation tasks, i.e. text
chunking and named entity recognition, we introduce a loss function which
closely reflects the target evaluation measure for these tasks, namely, segmentation
F-score. Our experiments show that our method performs better than standard CRF
training.
Approximation Lasso Methods for Language Modeling
Jianfeng Gao, Hisami Suzuki and Bin Yu
Lasso is a regularization method for parameter estimation in linear models. It optimizes the model parameters with respect to a loss function subject to model complexities. This paper explores the use of lasso for statistical language modeling for text input. Owing to the very large number of parameters, directly optimizing the penalized lasso loss function is impossible. Therefore, we investigate two approximation methods, the boosted lasso (BLasso) and the forward stagewise linear regression (FSLR). Both methods, when used with the exponential loss function, bear strong resemblance to the boosting algorithm which has been used as a discriminative training method for language modeling. Evaluations on the task of Japanese text input show that BLasso is able to produce the best approximation to the lasso solution, and leads to a significant improvement, in terms of character error rate, over boosting and the traditional maximum likelihood estimation.
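The forward-stagewise idea underlying FSLR is easy to see on a toy squared-loss problem (the paper itself uses an exponential loss and n-gram features, so this sketch is only an analogy with invented data): each step nudges the single coefficient most correlated with the current residual by a small fixed amount.

```python
import numpy as np

def forward_stagewise(X, y, eps=0.01, n_steps=500):
    """Toy forward-stagewise linear regression under squared loss."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_steps):
        residual = y - X @ beta
        corr = X.T @ residual                      # correlation of each feature with the residual
        j = np.argmax(np.abs(corr))
        beta[j] += eps * np.sign(corr[j])          # tiny update to the single best feature
    return beta

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
true_beta = np.array([2.0, 0.0, -1.0, 0.0, 0.0])   # sparse ground truth
y = X @ true_beta + 0.1 * rng.normal(size=200)
print(np.round(forward_stagewise(X, y), 2))        # early stopping keeps the solution near-sparse
```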
3D: Applications I
Session Chair: John Prager
Automated Japanese Essay Scoring System based on Articles Written by Experts
Tsunenori Ishioka and Masayuki Kameda
We have
developed an automated Japanese essay scoring system called Jess. The system
needs expert writings rather than expert raters to build the evaluation model.
By detecting statistical outliers of predetermined aimed essay features
compared with many professional writings for each prompt, our system can
evaluate essays. The following three features are examined: (1) rhetoric –
syntactic variety, or the use of various structures in the arrangement of
phrases, clauses, and sentences, (2) organization – characteristics associated
with the orderly presentation of ideas, such as rhetorical features and
linguistic cues, and (3) content – vocabulary related to the topic, such as
relevant information and precise or specialized vocabulary. The final
evaluation score is calculated by deducting from a perfect score assigned by a
learning process using editorials and columns from the Mainichi Daily News
newspaper. A diagnosis for the essay is also given.
A Feedback-Augmented Method for Detecting Errors in the Writing of Learners of English
Ryo Nagata, Atsuo Kawai, Koichiro Morihiro and Naoki Isu
This paper proposes a method for detecting errors in article usage and singular plural usage based on the mass count distinction. First, it learns decision lists from training data generated automatically to distinguish mass and count nouns. Then, in order to improve its performance, it is augmented by feedback that is obtained from the writing of learners. Finally, it detects errors by applying rules to the mass count distinction. Experiments show that it achieves a recall of 0.71 and a precision of 0.72 and outperforms other methods used for comparison when augmented by feedback.
Correcting ESL Errors Using Phrasal SMT Techniques
Chris Brockett, William B. Dolan and Michael Gamon
This paper presents a
pilot study of the use of phrasal Statistical Machine Translation (
Monday 17th July 4:00pm–4:30pm
4A: Parsing II
Session Chair: Joakim Nivre
Graph Transformations in Data-Driven Dependency Parsing
Jens Nilsson, Joakim Nivre and Johan Hall
Transforming syntactic representations in order to improve parsing accuracy has been exploited successfully in statistical parsing systems using constituency-based representations. In this paper, we show that similar transformations can give substantial improvements also in data-driven dependency parsing. Experiments on the Prague Dependency Treebank show that systematic transformations of coordinate structures and verb groups result in a 10% error reduction for a deterministic data-driven dependency parser. Combining these transformations with previously proposed techniques for recovering non-projective dependencies leads to state-of-the-art accuracy for the given data set.
4B: Dialogue II
Session Chair: Stanley Peters
Learning to Generate Naturalistic Utterances Using Reviews in Spoken Dialogue Systems
Ryuichiro Higashinaka, Rashmi Prasad and Marilyn A. Walker
Spoken language
generation for dialogue systems requires a dictionary of mappings between
semantic representations of concepts the system wants to express and realizations
of those concepts. Dictionary creation is a costly process; it is currently
done by hand for each dialogue domain. We propose a novel unsupervised method
for learning such mappings from user reviews in the target domain, and test it
on restaurant reviews. We test the hypothesis that user reviews that provide
individual ratings for distinguished attributes of the domain entity make it
possible to map review sentences to their semantic representation with high
precision. Experimental analyses show that the mappings learned cover most of
the domain ontology, and provide good linguistic variation. A subjective user
evaluation shows that the consistency between the semantic representations and
the learned realizations is high and that the naturalness of the realizations
is higher than a hand-crafted baseline.
4C: Linguistic Kinships
Measuring Language Divergence by Intra-Lexical Comparison
T. Mark Ellison and Simon Kirby
This paper presents a method for building genetic language taxonomies based on a new approach to comparing lexical forms. Instead of comparing forms cross-linguistically, a matrix of language-internal similarities between forms is calculated. These matrices are then compared to give distances between languages. We argue that this coheres better with current thinking in linguistics and psycholinguistics. An implementation of this approach, called PHILOLOGICON, is described, along with its application to Dyen et al.'s (1992) ninety-five wordlists from Indo-European languages.
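A minimal sketch of the intra-lexical comparison idea (an illustration only: the tiny meaning-aligned wordlists and the edit-distance-style similarity are stand-ins for PHILOLOGICON's actual wordlists and measures):

```python
import numpy as np
from difflib import SequenceMatcher

def intra_matrix(wordlist):
    """Language-internal similarity between forms for the same ordered list of meanings."""
    n = len(wordlist)
    return np.array([[SequenceMatcher(None, wordlist[i], wordlist[j]).ratio()
                      for j in range(n)] for i in range(n)])

def language_distance(list_a, list_b):
    """Language distance = disagreement between the two internal similarity matrices."""
    A, B = intra_matrix(list_a), intra_matrix(list_b)
    iu = np.triu_indices(len(list_a), k=1)         # compare upper triangles only
    return 1.0 - np.corrcoef(A[iu], B[iu])[0, 1]

english = ["water", "fire", "hand", "two"]          # toy meaning-aligned wordlists
german  = ["wasser", "feuer", "hand", "zwei"]
spanish = ["agua", "fuego", "mano", "dos"]
print(language_distance(english, german), language_distance(english, spanish))
```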
4D: Applications II
Session Chair: John Prager
Enhancing electronic dictionaries with an index based on associations
Olivier Ferret and Michael Zock
A good dictionary
contains not only many entries and a lot of
information concerning each one of them, but also adequate means to reveal the stored information.
Information access depends crucially on the quality of
the index. We
will present here some ideas on how a dictionary could be enhanced to help a speaker/writer find the word s/he is looking for. To this end we suggest adding to an existing electronic resource an index based on the notion of association. We will also present preliminary work on how a subset of such associations, for example topical associations, can be acquired by filtering a network of lexical co-occurrences extracted from a corpus.
Tuesday 18th July 10:00am–10:30am
5A: Parsing III
Session Chair: Dan Klein
Guiding a Constraint Dependency Parser with Supertags
Kilian A. Foth, Tomas By and Wolfgang Menzel
We investigate the utility of supertag information for guiding an existing dependency parser of German. Using weighted constraints to integrate the additionally available information, the decision process of the parser is influenced by changing its preferences, without excluding alternative structural interpretations from being considered. The paper reports on a series of experiments using varying models of supertags that significantly increase the parsing accuracy. In addition, an upper bound on the accuracy that can be achieved with perfect supertags is estimated.
5B: Lexical Issues I
Session Chair: Chu Ren Huang
Efficient Unsupervised Discovery of Word Categories Using Symmetric Patterns and High Frequency Words
Dmitry Davidov and Ari Rappoport
We present
a novel approach for discovering word categories, sets of words sharing a
significant aspect of their meaning. We utilize meta-patterns of high-frequency
words and content words in order to discover pattern candidates. Symmetric
patterns are then identified using graph-based measures, and word categories
are created based on graph clique sets. Our method is the first pattern-based
method that requires no corpus annotation or manually provided seed patterns or
words. We evaluate our algorithm on very large corpora in two languages, using
both human judgments and WordNet-based evaluation. Our fully unsupervised
results are superior to previous work that used a
5C: Summarization I
Session Chair: Simone Teufel
Bayesian Query-Focused Summarization
Hal Daumé
We present BayeSum (for "Bayesian summarization"), a model for sentence extraction in query-focused summarization. BayeSum leverages the common case in which multiple documents are relevant to a single query. Using these documents as reinforcement for query terms, BayeSum is not afflicted by the paucity of information in short queries. We show that approximate inference in BayeSum is possible on large data sets and results in a state-of-the-art summarization system. Furthermore, we show how BayeSum can be understood as a justified query expansion technique in the language modeling for IR framework.
5D: Semantics I
Session Chair: Johan Bos
Expressing Implicit Semantic Relations without Supervision
Peter D. Turney
We present an unsupervised learning algorithm that mines
large text corpora for patterns that express implicit semantic relations. For a given
input word pair X:Y with some unspecified semantic relations, the corresponding
output list of patterns <P1,...,Pm> is ranked according to how well each
pattern Pi expresses the relations between X and Y. For example, given
X=ostrich and Y=bird, the two highest-ranking output patterns are "X is
the largest Y" and "Y such as the X". The output patterns are
intended to be useful for finding further pairs with the same relations, to
support the construction of lexicons, ontologies, and semantic networks. The
patterns are sorted by pertinence, where the pertinence of a pattern Pi for a
word pair X:Y is the expected relational similarity between the given pair and
typical pairs for Pi. The algorithm is empirically evaluated on two tasks,
solving multiple-choice
Tuesday 18th July 11:00am–12:30pm
6A: Parsing IV
Session Chair: Owen Rambow
Hybrid Parsing: Using Probabilistic Models as Predictors for a Symbolic Parser
Kilian A. Foth and Wolfgang Menzel
In this paper we investigate the benefit of stochastic predictor components for the parsing quality which can be obtained with a rule-based dependency grammar. By including a chunker, a supertagger, a PP attacher, and a fast probabilistic parser we were able to improve upon the baseline by 3.2%, bringing the overall labelled accuracy to 91.1% on the German NEGRA corpus. We attribute the successful integration to the ability of the underlying grammar model to combine uncertain evidence in a soft manner, thus avoiding the problem of error propagation.
Error mining in parsing results
Benoît Sagot and Éric de La Clergerie
We introduce an error mining technique for automatically detecting errors in resources that are used in parsing systems. We applied this technique on parsing results produced on several million words by two distinct parsing systems, which share the syntactic lexicon and the pre-parsing processing chain. We were thus able to identify missing and erroneous information in these resources.
Reranking and Self-Training for Parser Adaptation
David McClosky, Eugene Charniak and Mark Johnson
Statistical
parsers trained and tested on the Penn Wall Street Journal (WSJ) treebank have
shown vast improvements over the last 10 years. Much of this improvement,
however, is based upon an ever-increasing number of features to be trained on
(typically) the WSJ treebank data. This has led to concern that such parsers
may be too finely tuned to this corpus at the expense of portability to other
genres. Such worries have merit. The standard "Charniak parser"
checks in at a labeled precision-recall f-measure of 89.7% on the Penn WSJ test
set, but only 82.9% on the test set from the Brown treebank corpus.
This paper should allay these fears. In particular, we show that the reranking parser described in Charniak and Johnson (2005) improves performance of the parser on Brown to 85.2%. Furthermore, use of the self-training techniques described in (McClosky et al. 2006) raises this to 87.8% (an error reduction of 28%), again without any use of labeled Brown data. This is remarkable since training the parser and reranker on labeled Brown data achieves only 88.4%.
6B: Lexical Issues II
Session Chair: Chu Ren Huang
Automatic Classification of Verbs in Biomedical Texts
Anna Korhonen, Yuval Krymolowski and Nigel Collier
Lexical classes, when
tailored to the application and domain in question, can provide an effective
means to deal with a number of natural language processing (NLP) tasks. While
manual construction of such classes is difficult, recent research shows that it
is possible to automatically induce verb classes from cross-domain corpora with
promising accuracy. We report a novel experiment where similar technology is
applied to the important, challenging domain of biomedicine. We show that the
resulting classification, acquired from a corpus of biomedical journal
articles, is highly accurate and strongly domain specific. It can be used to
aid
Selection of Effective Contextual Information for Automatic Synonym Acquisition
Masato Hagiwara, Yasuhiro Ogawa and Katsuhiko Toyama
Various methods have been proposed for automatic synonym acquisition, as synonyms are among the most fundamental kinds of lexical knowledge. Whereas many methods are based on contextual clues of words, little attention has been paid to which kinds of contextual information are useful for the purpose. This study has experimentally investigated the impact of contextual information selection, by extracting three kinds of word relationships from corpora: dependency, sentence co-occurrence, and proximity. The evaluation result shows that while dependency and proximity perform relatively well by themselves, a combination of two or more kinds of contextual information gives more stable performance. We further investigated the selection of dependency relations and modification categories, and found that modification has the greatest contribution, even greater than the widely adopted subject-object combination.
Scaling Distributional Similarity to Large Corpora
James Gorman and James R. Curran
Accurately representing
synonymy using distributional similarity requires large volumes of data to
reliably represent infrequent words. However, the naive nearest-neighbour
approach to comparing context vectors extracted from large corpora scales
poorly (O(n²) in the vocabulary size).
In this paper, we compare several existing approaches to approximating the nearest-neighbour search for distributional similarity. We investigate the trade-off between efficiency and accuracy, and find that SASH (Houle and Sakuma, 2005) provides the best balance.
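For reference, the brute-force nearest-neighbour baseline that such approximations try to avoid looks like this (an illustration with random stand-in context vectors):

```python
import numpy as np

def nearest_neighbours(context_vectors, query_idx, k=5):
    """Brute-force distributional similarity: cosine of the query against every other word."""
    V = context_vectors / np.linalg.norm(context_vectors, axis=1, keepdims=True)
    sims = V @ V[query_idx]                        # one dot product per vocabulary item
    sims[query_idx] = -np.inf                      # exclude the query word itself
    return np.argsort(-sims)[:k]

rng = np.random.default_rng(0)
vectors = rng.random((1000, 200))                  # stand-in for real context counts
print(nearest_neighbours(vectors, query_idx=42))
# Ranking neighbours for *every* word repeats this for each row: O(n^2) comparisons overall.
```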
6C: Summarization II
Session Chair: Simone Teufel
Extractive Summarization using Inter- and Intra- Event Relevance
Wenjie Li, Mingli Wu, Qin Lu, Wei Xu and Chunfa Yuan
Event-based summarization attempts to select and organize the sentences of a summary with respect to the events or sub-events that the sentences describe. Each event has its own internal structure and meanwhile relates to the other events semantically, temporally, spatially, causally or conditionally. In this paper, we define an event as one or more event terms along with their associated named entities, and present a novel approach to deriving intra- and inter-event relevance using information about internal association, semantic relatedness, distributional similarity and named entity clustering. We then apply the PageRank ranking algorithm to estimate the significance of an event for inclusion in a summary from the derived event relevance. Experiments on the DUC 2001 test data show that the relevance of the named entities involved in events achieves better results when it is derived from the event terms they associate with. The experiments also reveal that topic-specific relevance derived from the documents themselves outperforms semantic relevance from a general-purpose knowledge base such as WordNet.
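The PageRank step can be sketched generically as power iteration over an event-relevance matrix (the relevance scores below are invented rather than derived as in the paper):

```python
import numpy as np

def pagerank(relevance, damping=0.85, n_iter=100):
    """Power-iteration PageRank over an event-relevance matrix."""
    n = relevance.shape[0]
    M = relevance / relevance.sum(axis=0, keepdims=True)   # column-stochastic transition matrix
    rank = np.full(n, 1.0 / n)
    for _ in range(n_iter):
        rank = (1 - damping) / n + damping * (M @ rank)
    return rank

R = np.array([[0.0, 0.8, 0.1, 0.3],                # toy relevance scores among four events
              [0.8, 0.0, 0.2, 0.1],
              [0.1, 0.2, 0.0, 0.9],
              [0.3, 0.1, 0.9, 0.0]])
print(pagerank(R))                                 # higher scores mark more central events
```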
Models for Sentence Compression: A Comparison across Domains, Training Requirements and Evaluation Measures
James Clarke and Mirella Lapata
Sentence compression is the task of producing a summary at the sentence level. This paper focuses on three aspects of this task which have not received detailed treatment in the literature: training requirements, scalability, and automatic evaluation. We provide a novel comparison between a supervised constituent-based and a weakly supervised word-based compression algorithm and examine how these models port to different domains (written vs. spoken text). To achieve this, a human-authored compression corpus has been created and our study highlights potential problems with the automatically gathered compression corpora currently used. Finally, we assess whether automatic evaluation measures can be used to determine compression quality.
A Bottom-up Approach to Sentence Ordering for Multi-document Summarization
Danushka Bollegala, Naoaki Okazaki and Mitsuru Ishizuka
Ordering information is a difficult but important task for applications generating natural-language text. We present a bottom-up approach to arranging sentences extracted for multi-document summarization. To capture the association and order of two textual segments (e.g., sentences), we define four criteria: chronology, topical-closeness, precedence, and succession. These criteria are integrated into a single criterion by a supervised learning approach. We repeatedly concatenate two textual segments into one segment based on the criterion until we obtain the overall segment with all sentences arranged. Our experimental results show a significant improvement over existing sentence ordering strategies.
6D: Semantics II
Session Chair: Johan Bos
Learning Event Durations from Event Descriptions
Feng Pan, Rutu Mulkar and Jerry R. Hobbs
We have constructed a corpus of news articles in which events are annotated for estimated bounds on their durations. Here we describe a method for measuring inter-annotator agreement for these event duration distributions. We then show that machine learning techniques applied to this data yield coarse-grained event duration information, considerably outperforming a baseline and approaching human performance.
Automatic learning of textual entailments with cross-pair similarities
Fabio Massimo Zanzotto and Alessandro Moschitti
In this paper we define a novel similarity measure between examples of textual entailments and we use it as a kernel function in Support Vector Machines (SVMs). This allows us to automatically learn the rewrite rules that describe a non-trivial set of entailment cases. The experiments with the data sets of the RTE 2005 challenge show an improvement of 4.4% over the state-of-the-art methods.
An Improved Redundancy Elimination Algorithm for Underspecified Representations
Alexander Koller and Stefan Thater
We present an efficient algorithm for the redundancy elimination problem: Given an underspecified semantic representation (USR) of a scope ambiguity, compute a USR with fewer mutually equivalent readings. The algorithm operates on underspecified chart representations which are derived from dominance graphs; it can be applied to the USRs computed by large-scale grammars. We evaluate the algorithm on a corpus, and show that it reduces the degree of ambiguity significantly while taking negligible runtime.
Tuesday 18th July 2:00pm–3:30pm
7A: Parsing V
Session Chair: Takashi Ninomiya
Integrating Syntactic Priming into an Incremental Probabilistic Parser, with an Application to Psycholinguistic Modeling
Amit Dubey, Frank Keller and Patrick Sturt
The psycholinguistic literature provides evidence for syntactic priming, i.e., the tendency to repeat structures. This paper describes a method for incorporating priming into an incremental probabilistic parser. Three models are compared, which involve priming of rules between sentences, within sentences, and within coordinate structures. These models simulate the reading time advantage for parallel structures found in human data, and also yield a small increase in overall parsing accuracy.
A Fast, Accurate Deterministic Parser for Chinese
Mengqiu Wang, Kenji Sagae and Teruko Mitamura
We present a novel classifier-based deterministic parser
for Chinese constituency parsing. Our parser computes parse trees from bottom
up in one pass, and uses classifiers to make shift-reduce decisions. Trained
and evaluated on the standard training and test sets, our best model (using
stacked classifiers) runs in linear time and has labeled precision and recall
above 88% using gold-standard part-of-speech tags, surpassing the best
published results. Our
Learning Accurate, Compact, and Interpretable Tree Annotation
Slav Petrov, Leon Barrett, Romain Thibaux and Dan Klein
We present an automatic approach to tree annotation in which basic nonterminal symbols are alternately split and merged to maximize the likelihood of a training treebank. Starting with a simple X-bar grammar, we learn a new grammar whose nonterminals are subsymbols of the original nonterminals. In contrast with previous work, we are able to split various terminals to different degrees, as appropriate to the actual complexity in the data. Our grammars automatically learn the kinds of linguistic distinctions exhibited in previous work on manual tree annotation. On the other hand, our grammars are much more compact and substantially more accurate than previous work on automatic annotation. Despite its simplicity, our best grammar achieves an F1 of 90.2% on the Penn Treebank, higher than fully lexicalized systems.
7B: Word Sense Disambiguation II
Session Chair: Hwee Tou Ng
Semi-Supervised Learning of Partial Cognates using Bilingual Bootstrapping
Oana Frunza and Diana Inkpen
Partial cognates are pairs of words in two languages that have the same meaning in some, but not all, contexts. Detecting the actual meaning of a partial cognate in context can be useful for Machine Translation tools and for Computer-Assisted Language Learning tools. In this paper we propose a supervised and a semi-supervised method to disambiguate partial cognates between two languages: French and English. The methods use only automatically-labeled data; therefore they can be applied to other pairs of languages as well. We also show that our methods perform well when using corpora from different domains.
Direct Word Sense Matching for Lexical Substitution
Ido Dagan, Oren Glickman, Alfio Gliozzo, Efrat Marmorshtein and Carlo Strapparava
This paper investigates
conceptually and empirically the novel sense matching task, which requires
recognizing whether the senses of two synonymous words match in context. We
suggest direct approaches to the problem, which avoid the intermediate step of
explicit word sense disambiguation, and demonstrate their appealing advantages
and stimulating potential for future research.
An Equivalent Pseudoword for Unsupervised Chinese Word Sense Disambiguation
Zhimao Lu, Haifeng Wang, Jianmin Yao, Ting Liu and Sheng Li
This paper presents a new approach based on Equivalent Pseudowords (EPs) to tackle Word Sense Disambiguation (WSD) in the Chinese language. EPs are particular artificial ambiguous words, which can be used to realize unsupervised WSD. A Bayesian classifier is implemented to test the efficacy of the EP solution on the Senseval-3 Chinese test set. The performance is better than state-of-the-art results with an average F-measure of 0.80. The experiment verifies the value of EP for unsupervised WSD.
7C: Information Extraction II
Session Chair: Ming Zhou
Improving the Scalability of Semi-Markov Conditional Random Fields for Named Entity Recognition
Daisuke Okanohara, Yusuke Miyao, Yoshimasa Tsuruoka and Jun'ichi Tsujii
This
paper presents techniques to apply semi-CRFs to Named Entity Recognition tasks
with a tractable computational cost. Our framework can handle an
Factorizing Complex Models: A Case Study in Mention Detection
Radu Florian, Hongyan Jing, Nanda Kambhatla and Imed Zitouni
As natural language processing moves towards natural language understanding, the tasks are becoming more and more subtle: we are interested in more nuanced word characteristics, more linguistic properties, more semantic and syntactic features. One such example, which we consider in this article, is the mention detection in the ACE project (NIST, 2004), where the goal is to identify named, nominal or pronominal references to real-world entities – mentions – and label them with three types of information: entity type, entity subtype and mention type. In this article, we investigate several methods to assign these related tags and compare them on several data sets. A system based on the methods presented in this article ranked very well in the ACE’04 evaluation.
Segment-based Hidden Markov Models for Information Extraction
Zhenmei Gu and Nick Cercone
Hidden Markov models
(HMMs) are powerful statistical models that have found successful applications in
Information Extraction (IE). In current approaches to applying HMMs to IE, an
HMM is used to model text at the document level. This modeling might cause
undesired redundancy in extraction in the sense that more than one filler is
identified and extracted. We propose to use HMMs to model text at the segment
level, in which the extraction process consists of two steps: a segment
retrieval step followed by an extraction step. In order to retrieve extraction
relevant segments from documents, we introduce a method to use HMMs to model
and retrieve segments. Our experimental results show that the resulting segment
HMM IE system not only achieves near zero extraction redundancy, but also has
better overall extraction performance than traditional document HMM IE systems.
7D: Resources I
Session Chair: Erhard Hinrichs
A DOM Tree Alignment Model for Mining Parallel Data from the Web
Lei Shi, Cheng Niu, Ming Zhou and Jianfeng Gao
This paper presents a
new web mining
scheme for parallel data acquisition. Based on the Document Object Model (
QuestionBank: Creating a Corpus of Parse-Annotated Questions
John Judge, Aoife Cahill and Josef van Genabith
This paper describes
the development of QuestionBank, a corpus of 4000 parse annotated questions for
(i) use in training parsers employed in QA, and (ii) evaluation of question
parsing. We present a series of experiments to investigate the effectiveness of
QuestionBank as both an exclusive and supplementary training resource for a
state-of-the-art parser in parsing both question and non-question test sets. We
introduce a new method for recovering empty nodes and their antecedents
(capturing long distance dependencies) from parser output in CFG trees using
Creating a CCGbank and a Wide-Coverage CCG Lexicon for German
Julia Hockenmaier
We present an algorithm
which creates a German CCGbank by translating the syntax graphs in the German
Tiger corpus into
Tuesday 18th July 4:00pm–5:30pm
8A: Machine Translation III
Session Chair: Kevin Knight
Improved Discriminative Bilingual Word Alignment
Robert C. Moore, Wen-tau Yih and Andreas Bode
For many years, statistical machine translation relied on generative models to provide bilingual word alignments. In 2005, several independent efforts showed that discriminative models could be used to enhance or replace the standard generative approach. Building on this work, we demonstrate substantial improvement in word-alignment accuracy, partly through improved training methods, but predominantly through selection of more and better features. Our best model produces the lowest alignment error rate yet reported on Canadian Hansards bilingual data.
Maximum Entropy Based Phrase Reordering Model for Statistical Machine Translation
Deyi Xiong, Qun Liu and Shouxun Lin
We propose a novel reordering model for phrase-based statistical machine translation
(
Distortion Models for Statistical Machine Translation
Yaser Al-Onaizan and Kishore Papineni
In this paper, we argue that n-gram language models are
not sufficient to address word reordering required for Machine Translation. We
propose a new distortion model that can be used with existing phrase-based
8B: Text Classification I
Session Chair: Janyce Wiebe
A Study on Automatically Extracted Keywords in Text Categorization
Anette Hulth and Beáta B. Megyesi
This paper presents a study on whether and how automatically extracted keywords can be used to improve text categorization. In summary, we show that a higher performance – as measured by micro-averaged F-measure on a standard text categorization collection – is achieved when the full-text representation is combined with the automatically extracted keywords. The combination is obtained by giving higher weights to words in the full-texts that are also extracted as keywords. We also present results for experiments in which the keywords are the only input to the categorizer, either represented as unigrams or intact. Of these two experiments, the unigrams have the best performance, although neither performs as well as headlines only.
A Comparison and Semi-Quantitative Analysis of Words and Character-Bigrams as Features in Chinese Text Categorization
Jingyang Li, Maosong Sun and Xian Zhang
Words and
character-bigrams are both used as features in Chinese
text processing
tasks, but no systematic comparison or analysis of their value as features for Chinese text categorization has been reported heretofore. We carry out here a full performance comparison between them through experiments on various document collections (including a manually word-segmented corpus as a gold standard), together with a semi-quantitative analysis to elucidate the characteristics of their behavior, and try to provide some preliminary clues for feature-term choice (in most cases, character-bigrams are better than words) and dimensionality setting in text categorization systems.
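A minimal sketch of the two feature types (the example sentence and its segmentation are hypothetical; real systems would of course extract features from whole corpora):

```python
def char_bigrams(text):
    """Character-bigram features need no word segmentation."""
    chars = [c for c in text if not c.isspace()]
    return [a + b for a, b in zip(chars, chars[1:])]

def word_features(segmented_text):
    """Word features presuppose a segmenter; here the text is pre-segmented with spaces."""
    return segmented_text.split()

raw = "自然语言处理很有趣"            # unsegmented Chinese text
segmented = "自然 语言 处理 很 有趣"   # hypothetical gold segmentation
print(char_bigrams(raw))
print(word_features(segmented))
```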
Exploiting Comparable Corpora and Bilingual Dictionaries for Cross-Language Text Categorization
Alfio Gliozzo and Carlo Strapparava
Cross-language
Text Categorization is the task of assigning semantic classes to documents written
in a target language (e.g. English) while the system is trained using labeled
documents in a source language (e.g. Italian).
In this
work we present many solutions according to the availability of bilingual
resources, and we show that it is possible to deal with the problem even when
no such resources are accessible. The core technique relies on the automatic
acquisition of Multilingual Domain Models from comparable corpora.
Experiments show the effectiveness of our approach, providing a low cost solution for the Cross Language Text Categorization task. In particular, when bilingual dictionaries are available the performance of the categorization gets close to that of monolingual text categorization.
8C: Machine Learning Methods II
Session Chair: Anoop Sarkar
A Progressive Feature Selection Algorithm for Ultra Large Feature Spaces
Qi Zhang, Fuliang Weng and Zhe Feng
Recent developments in statistical modeling of various
linguistic phenomena have shown that additional features give consistent
performance improvements. Quite often, improvements are limited by the number
of features a system is able to explore. This paper describes a novel
progressive training algorithm that selects features from virtually unlimited
feature spaces for conditional maximum entropy (
Annealing Structural Bias in Multilingual Weighted Grammar Induction
Noah A. Smith and Jason Eisner
We first show how a structural locality bias can improve the accuracy of state-of-the-art dependency grammar induction models trained by EM from unannotated examples (Klein and Manning, 2004). Next, by annealing the free parameter that controls this bias, we achieve further improvements. We then describe an alternative kind of structural bias, toward "broken" hypotheses consisting of partial structures over segmented sentences, and show a similar pattern of improvement. We relate this approach to contrastive estimation (Smith and Eisner, 2005), apply the latter to grammar induction in six languages, and show that our new approach improves accuracy by 1-17% (absolute) over CE (and 8-30% over EM), achieving, to our knowledge, the best results on this task to date. Our method, structural annealing, is a general technique with broad applicability to hidden-structure discovery problems.
Maximum Entropy Based Restoration of Arabic Diacritics
Imed Zitouni, Jeffrey S. Sorensen and Ruhi Sarikaya
Short vowels and other
diacritics are not part of written Arabic scripts. Exceptions are made for
important political and religious texts and in scripts for beginning students
of Arabic. Scripts without diacritics have considerable ambiguity because many
words with different diacritic patterns appear identical in a diacritic-less
setting. We propose in this paper a maximum entropy approach for restoring
diacritics in a document. The approach can easily integrate and make effective
use of diverse types of information; the model we propose integrates a wide
array of lexical, segment based and part-of-speech tag features. The
combination of these feature types leads to a state-of-the-art diacritization
model. Using a publicly available corpus (LDC's Arabic Treebank Part 3), we
achieve a diacritic error rate of 5.1%, a segment error rate of 8.5%, and a word
error rate of 17.3%. In the case-ending-less setting, we obtain a diacritic error
rate of 2.2%, a segment error rate of 4.0%, and a word error rate of 7.2%.
8D: Information Retrieval I
Session Chair: Jian-Yun Nie
An Iterative Implicit Feedback Approach to Personalized Search
Yuanhua Lv, Le Sun, Junlin Zhang, Jian-Yun Nie, Wan Chen and Wei Zhang
General information retrieval systems are designed to serve all users without considering individual needs. In this paper, we propose a novel approach to personalized search. It can, in a unified way, exploit and utilize implicit feedback information, such as query logs and immediately viewed documents. Moreover, our approach can implement result re-ranking and query expansion simultaneously and collaboratively. Based on this approach, we develop a client-side personalized web search agent PAIR (Personalized Assistant for Information Retrieval), which supports both English and Chinese. Our experiments on TREC and HTRDP collections clearly show that the new approach is both effective and efficient.
The Effect of Translation Quality in MT-Based Cross-Language Information Retrieval
Jiang Zhu and Haifeng Wang
This
paper explores the relationship between the translation quality and the
retrieval effectiveness in Machine Translation (MT) based Cross-Language
Information Retrieval (CLIR). To obtain MT systems of different translation
quality, we degrade a rule-based MT system by decreasing the size of the rule
base and the size of the dictionary. We use the degraded MT systems to
translate queries and submit the translated queries of varying quality to the
IR system. Retrieval effectiveness is found to correlate highly with the
translation quality of the queries. We further analyze the factors that affect
the retrieval effectiveness. Title queries are found to be preferred in MT-based
CLIR. In addition, dictionary-based degradation is shown to have stronger
impact than rule-based degradation in MT-based CLIR.
A Comparison of Document, Sentence, and Term Event Spaces
Catherine Blake
The trend in information retrieval systems is from document to sub-document retrieval, such as sentences in a summarization system and words or phrases in a question-answering system. Despite this trend, systems continue to model language at a document level using the inverse document frequency (IDF). In this paper, we compare and contrast IDF with inverse sentence frequency (ISF) and inverse term frequency (ITF). A direct comparison reveals that all language models are highly correlated; however, the average ISF and ITF values are 5.5 and 10.4 higher than IDF. All language models appeared to follow a power law distribution with a slope coefficient of 1.6 for documents and 1.7 for sentences and terms. We conclude with an analysis of IDF stability with respect to random, journal, and section partitions of the 100,830 full-text scientific articles in our experimental corpus.
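For reference, one common way to define the three inverse frequencies over their respective event spaces (the paper may use a slightly different formulation) is

```latex
\mathrm{IDF}(t) = \log\frac{N_d}{n_d(t)}, \qquad
\mathrm{ISF}(t) = \log\frac{N_s}{n_s(t)}, \qquad
\mathrm{ITF}(t) = \log\frac{N_t}{n_t(t)},
```

where N_d, N_s, and N_t are the numbers of documents, sentences, and term occurrences in the corpus, and n_d(t), n_s(t), and n_t(t) count how many of those contain (or equal) the term t.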
Thursday 20th July 900am–930am
Best Asian Language Paper Nominees
Tree-to-String Alignment Template for Statistical Machine Translation
Yang Liu, Qun Liu and Shouxun Lin
We present a novel translation model based on tree-to-string alignment template (TAT) which describes the alignment between a source parse tree and a target string. A TAT is capable of generating both terminals and non-terminals and performing reordering at both low and high levels. The model is linguistically syntax-based because TATs are extracted automatically from word-aligned, source side parsed parallel texts. To translate a source sentence, we first employ a parser to produce a source parse tree and then apply TATs to transform the tree into a target string. Our experiments show that the TAT-based model significantly outperforms Pharaoh, a state-of-the-art decoder for phrase-based models.
Incorporating speech recognition confidence into discriminative named entity recognition of speech data
Katsuhito Sudoh, Hajime Tsukada and Hideki Isozaki
This paper proposes a named entity recognition (NER) method for speech data that incorporates speech recognition confidence into a discriminative NER model.
Exploiting Syntactic Patterns as Clues in Zero-Anaphora Resolution
Ryu Iida, Kentaro Inui and Yuji Matsumoto
We approach the
zero-anaphora resolution problem by decomposing it into intra-sentential and
inter-sentential zero-anaphora resolution. For the former problem, syntactic patterns
of the appearance of zero-pronouns and their antecedents are useful clues.
Taking Japanese as a target language, we empirically demonstrate that
incorporating rich syntactic pattern features in a state-of-the-art
learning-based anaphora resolution model dramatically improves the accuracy of
intra-sentential zero-anaphora, which consequently improves the overall
performance of zero-anaphora resolution.
Self-Organizing n-gram Model for Automatic Word Spacing
Seong-Bae Park, Yoon-Shik Tae and Se-Young Park
Automatic
word spacing is one of the important tasks in Korean language processing and
information retrieval. Since Korean word spacing involves a number of confusing cases, spacing mistakes appear in many texts, including news articles. This paper presents a highly accurate method for automatic word spacing based on a self-organizing n-gram model. This method is basically a variant of the n-gram model, but achieves high accuracy by automatically adapting the context size.
In order
to find the optimal context size, the proposed method automatically increases
the context size when the contextual distribution after increasing it does not
agree with that of the current context. It also decreases the context size when
the distribution of reduced context is similar to that of the current context.
This approach achieves high accuracy by considering higher-dimensional data only when necessary, and the increased computational cost is compensated by the reduced context size. The experimental results show that the self-organizing structure of the n-gram model enhances the basic model.
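To make the grow-and-shrink criterion concrete, here is a minimal Python sketch of one way such context-size adaptation could work. The `model` mapping from context tuples to next-token frequency counts, the divergence measure, and the threshold are hypothetical illustrations, not taken from the paper.

```python
from collections import Counter
from math import log2

def js_divergence(p: Counter, q: Counter) -> float:
    """Jensen-Shannon divergence between two frequency distributions."""
    support = set(p) | set(q)
    tp, tq = sum(p.values()) or 1, sum(q.values()) or 1
    div = 0.0
    for x in support:
        px, qx = p[x] / tp, q[x] / tq
        m = (px + qx) / 2
        if px:
            div += 0.5 * px * log2(px / m)
        if qx:
            div += 0.5 * qx * log2(qx / m)
    return div

def choose_context_size(history, model, max_n=5, threshold=0.1):
    """Grow the context while the longer context's prediction distribution
    disagrees with the shorter one's; otherwise keep the smaller context."""
    n = 1
    while n < max_n:
        short = model.get(tuple(history[-n:]), Counter())
        longer = model.get(tuple(history[-(n + 1):]), Counter())
        if not longer or js_divergence(short, longer) <= threshold:
            break  # distributions agree (or no data): the smaller context suffices
        n += 1
    return n
```

The sketch only illustrates the idea of enlarging the context until the longer- and shorter-context distributions agree; the paper's actual agreement test may differ.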
Thursday 20th July 1030am–1230pm
10A: Asian Language Processing
Session Chair: Michael White
Concept Unification of Terms in Different Languages for IR
Qing Li, Sung-Hyon Myaeng, Yun Jin and Bo-yeong Kang
For historical and cultural reasons, English phrases, especially proper nouns and new words, frequently appear in Web pages written primarily in Asian languages such as Korean and Chinese. Although these English terms and their equivalents in the Asian language refer to the same concept, they are erroneously treated as independent index units in traditional Information Retrieval (IR). This paper describes the degree to which the problem arises in IR and suggests a novel technique to solve it. Our method first extracts English phrases from Asian-language Web pages and then unifies each extracted phrase and its equivalent(s) in the local language as one index unit. Experimental results show that the high precision of our conceptual unification approach greatly improves IR performance.
Word Alignment in English-Hindi Parallel Corpus Using Recency-Vector Approach: Some Studies
Niladri Chatterjee and Saumya Agrawal
Word alignment using recency-vector based approaches has recently become popular. One major advantage of these techniques is that, unlike other approaches, they perform well even if the size of the parallel corpora is small. This makes these algorithms worth studying for languages where resources are scarce. In this work we studied the performance of two very popular recency-vector based approaches, proposed in (Fung and McKeown, 1994) and (Somers, 1998), respectively, for word alignment on an English-Hindi parallel corpus. However, the performance of these algorithms was not found to be satisfactory; the subsequent addition of some new constraints improved the performance of the recency-vector based alignment technique significantly for this corpus. The present paper discusses the new version of the algorithm and its performance in detail.
Extracting loanwords from Mongolian corpora and producing a Japanese-Mongolian bilingual dictionary
Badam-Osor Khaltar, Atsushi Fujii and Tetsuya Ishikawa
This paper proposes methods for extracting loanwords from Cyrillic Mongolian corpora and producing a Japanese–Mongolian bilingual dictionary. We extract loanwords from Mongolian corpora using our own handcrafted rules. To complement the rule-based extraction, we also extract words in Mongolian corpora that are phonetically similar to Japanese Katakana words as loanwords. In addition, we link the extracted loanwords to Japanese words and produce a bilingual dictionary. We propose a stemming method for Mongolian to extract loanwords correctly. We verify the effectiveness of our methods experimentally.
10B: Morphology and Word Segmentation
Session Chair: Yuji Matsumoto
An Unsupervised Morpheme-Based HMM for Hebrew Morphological Disambiguation
Meni Adler and Michael Elhadad
Morphological disambiguation is the process of assigning one set of morphological features to each individual word in a text. When the word is ambiguous (there are several possible analyses for the word), a disambiguation procedure based on the word context must be applied. This paper deals with morphological disambiguation of the Hebrew language, which combines morphemes into a word in both agglutinative and fusional ways. We present an unsupervised stochastic model - the only resource we use is a morphological analyzer - which deals with the data sparseness problem caused by the affixational morphology of the Hebrew language.
We present a text encoding method for languages with affixational morphology in which the knowledge of word formation rules (which are quite restricted in Hebrew) helps in the disambiguation. We adapt HMM algorithms for learning and searching this text representation, in such a way that segmentation and tagging can be learned in parallel in one step. Results on a large-scale evaluation indicate that this learning improves disambiguation for complex tag sets. Our method is applicable to other languages with affix morphology.
Contextual Dependencies in Unsupervised Word Segmentation
Sharon Goldwater, Thomas L. Griffiths and Mark Johnson
Developing better methods for segmenting continuous text into words is important for improving the processing of Asian languages, and may shed light on how humans learn to segment speech. We propose two new Bayesian word segmentation methods that assume unigram and bigram models of word dependencies respectively. The bigram model greatly outperforms the unigram model (and previous probabilistic models), demonstrating the importance of such dependencies for word segmentation. We also show that previous probabilistic models rely crucially on suboptimal search procedures.
MAGEAD: A Morphological Analyzer and Generator for the Arabic Dialects
Nizar Habash and Owen Rambow
We present MAGEAD, a morphological analyzer and generator for the Arabic language family. Our work is novel in that it explicitly addresses the need for processing the morphology of the dialects. MAGEAD performs on-line analysis to, or generation from, a root+pattern+features representation; it has separate phonological and orthographic representations, and it allows for combining morphemes from different dialects. We present a detailed evaluation of MAGEAD.
10C: Tagging and Chunking
Session Chair: Jan Hajič
Noun Phrase Chunking in Hebrew – Influence of Lexical and Morphological Features
Yoav Goldberg, Meni Adler and Michael Elhadad
We present a method for Noun Phrase chunking in Hebrew. We
show that the traditional definition of base-NPs as non-recursive noun phrases
does not apply in Hebrew, and propose an alternative definition of Simple
NPs. We review syntactic properties
of Hebrew related to noun phrases, which indicate that the task of Hebrew
SimpleNP chunking is harder than base-NP chunking in English. As a
confirmation, we apply methods known to work well for English to Hebrew data.
These methods give low results (F from 76 to 86) in Hebrew. We then discuss our method, which exploits lexical and morphological features.
Multi-Tagging for Lexicalized-Grammar Parsing
James R. Curran, Stephen Clark and David Vadas
With performance above 97% accuracy for newspaper text, part-of-speech (POS) tagging might be considered a solved problem. We describe a multi-tagging approach which maintains a suitable level of lexical category ambiguity for accurate and efficient lexicalized-grammar parsing.
Guessing Parts-of-Speech of Unknown Words Using Global Information
Tetsuji Nakagawa and Yuji Matsumoto
In this paper, we present a method for guessing the parts-of-speech of unknown words using global information.
SRW 1: Multilinguality
Session Chair: Marine Carpuat
S1 Discursive Usage of Six Chinese Punctuation Marks
Ming Yue
Both rhetorical structure and punctuation have been helpful in discourse processing. Based on a corpus annotation project, this paper reports the discursive usage of 6 Chinese punctuation marks in news commentary texts: Colon, Dash, Ellipsis, Exclamation Mark, Question Mark, and Semicolon. The rhetorical patterns of these marks are compared against patterns around cue phrases in general. Results show that these Chinese punctuation marks, though fewer in number than cue phrases, are easy to identify, have strong correlation with certain relations, and can be used as distinctive indicators of nuclearity in Chinese texts.
S2 Integrated Morphological and Syntactic Disambiguation for Modern Hebrew
Reut Tsarfaty
Current parsing models are not immediately applicable for languages that exhibit strong interaction between morphology and syntax, e.g., Modern Hebrew (MH), Arabic and other Semitic languages. This work represents a first attempt at modeling morphological-syntactic interaction in a generative probabilistic framework to allow for MH parsing. We show that morphological information selected in tandem with syntactic categories is instrumental for parsing Semitic languages. We further show that redundant morphological information helps syntactic disambiguation.
S3 A Hybrid Relational Approach for WSD
Lucia Specia
We present a novel hybrid approach for Word Sense Disambiguation (WSD) which makes use of a relational formalism to represent instances and background knowledge. It is built using Inductive Logic Programming techniques to combine evidence coming from both sources during the learning process, producing a rule-based WSD model. We experimented with this approach to disambiguate 7 highly ambiguous verbs in English-Portuguese translation. Results showed that the approach is promising, achieving an average accuracy of 75%, which outperforms the other machine learning techniques investigated (66%).
Thursday 20th July 230pm–330pm
11A: Machine Translation IV
Session Chair: Alon Lavie
A Clustered Global Phrase Reordering Model for Statistical Machine Translation
Masaaki Nagata, Kuniko Saito, Kazuhide Yamamoto and Kazuteru Ohashi
In this paper, we present a novel global reordering model that can be incorporated into standard phrase-based statistical machine translation. Unlike previous local reordering models that emphasize the reordering of adjacent phrase pairs [Tillmann-Zhang05], our model explicitly models long-distance reordering by directly estimating the parameters from the phrase alignments of bilingual training sentences. In principle, the global phrase-reordering model is conditioned on the source and target phrases that are currently being translated, and the previously translated source and target phrases. To cope with sparseness, we use N-best phrase alignments and bilingual phrase clustering, and investigate a variety of combinations of conditioning factors. Through experiments, we show that the global reordering model significantly improves the translation accuracy of a standard Japanese-English translation task.
A Discriminative Global Training Algorithm for Statistical MT
Christoph Tillmann and Tong Zhang
This
paper presents a novel training algorithm for a linearly-scored block sequence
translation model. The key component is a new procedure to directly optimize
the global scoring function used by a statistical MT decoder.
11B: Speech
Session Chair: Roland Kuhn
Phoneme-to-Text Transcription System with an Infinite Vocabulary
Shinsuke Mori, Daisuke Takuma and Gakuto Kurata
The noisy channel model approach has been successfully applied to various natural language processing tasks. Currently, the main research focus of this approach is adaptation methods: how to capture characteristics of words and expressions in a target domain given example sentences in that domain. As a solution, we describe a method that enlarges the vocabulary of a language model to an almost infinite size and captures the context information of the new entries. The method is especially suitable for languages in which words are not delimited by whitespace. We applied our method to a phoneme-to-text transcription task in Japanese and reduced the errors of an existing method by about 10%.
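For orientation, the noisy channel formulation referred to above selects, in its standard form, the word sequence w for an input phoneme sequence p as

```latex
\hat{w} \;=\; \operatorname*{arg\,max}_{w} P(w \mid p)
        \;=\; \operatorname*{arg\,max}_{w} P(p \mid w)\, P(w),
```

where P(w) is the language model, here enlarged to an effectively unbounded vocabulary, and P(p | w) models how words are realized as phonemes.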
Automatic Generation of Domain Models for Call Centers from Noisy Transcriptions
Shourya Roy and L Venkata Subramaniam
Call centers handle
customer queries from various domains such as computer sales and support,
mobile phones, car rental, etc. Each such domain generally has a domain model
which is essential to handle customer complaints. These models contain common
problem categories, typical customer issues and their solutions, greeting styles, and so on. Currently these models are created manually over time. To automate this process, we propose an unsupervised technique to generate domain models automatically from call transcriptions. We use a state-of-the-art Automatic Speech Recognition system to transcribe the calls between agents and customers, which still results in high word error rates (40%), and show that even from these noisy transcriptions we can automatically build a domain model. The domain model primarily comprises a topic taxonomy in which every node is characterized by topic(s), typical Questions-Answers (Q&As), typical
actions and call statistics. We show how such a domain model can be used for
topic identification of unseen calls. We also propose applications for aiding
agents while handling calls and for agent monitoring based on the domain model.
11C: Discourse
Session Chair: Daniel Marcu
Proximity in Context: an empirically grounded computational model of proximity for processing topological spatial expressions
John D. Kelleher, Geert-Jan M. Kruijff and Fintan J. Costello
The paper presents a new model for context-dependent interpretation of linguistic expressions about spatial proximity between objects in a natural scene. The paper discusses novel psycholinguistic experimental data that tests and verifies the model. The model has been implemented, and enables a conversational robot to identify objects in a scene through topological spatial relations (e.g. “X near Y''). The model can help motivate the choice between topological and projective prepositions.
Machine Learning of Temporal Relations
Inderjeet Mani, Marc Verhagen, Ben Wellner, Chong Min Lee and James Pustejovsky
This paper investigates a machine learning approach for temporally ordering and anchoring events in natural language texts. To address data sparseness, we used temporal reasoning as an over-sampling method to dramatically expand the amount of training data, resulting in predictive accuracy on link labeling as high as 93% using a Maximum Entropy classifier on human annotated data. This method compared favorably against a series of increasingly sophisticated baselines involving expansion of rules derived from human intuitions.
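As an illustration of how temporal reasoning can act as an over-sampling device, the sketch below transitively closes BEFORE links to multiply the available training links. It is a simplified Python example with hypothetical event identifiers, not the paper's actual closure component.

```python
from itertools import product

def close_before_links(links: set[tuple[str, str]]) -> set[tuple[str, str]]:
    """Expand BEFORE links by transitivity: if (a, b) and (b, c) hold,
    add (a, c). Iterate until no new links appear (a simple closure)."""
    closed = set(links)
    changed = True
    while changed:
        changed = False
        new_links = {(a, d)
                     for (a, b), (c, d) in product(closed, repeat=2)
                     if b == c and (a, d) not in closed}
        if new_links:
            closed |= new_links
            changed = True
    return closed

# Example: three annotated links expand to six training instances.
print(close_before_links({("e1", "e2"), ("e2", "e3"), ("e3", "e4")}))
```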
SRW 2: Speech
Session Chair: Kevin Duh
S4 On2L - A Framework for Incremental Ontology Learning in Spoken Dialog Systems
Berenike Loos
An open-domain spoken dialog system has to deal with the challenge of lacking lexical as well as conceptual knowledge. As the real world is constantly changing, it is not possible to store all necessary knowledge beforehand. Therefore, this knowledge has to be acquired during the run time of the system, with the help of the out-of-vocabulary information of a speech recognizer. As every word can have various meanings depending on the context in which it is uttered, additional context information is taken into account, when searching for the meaning of such a word. In this paper, I will present the incremental ontology learning framework On2L. The defined tasks for the framework are: the hypernym extraction from Internet texts for unknown terms delivered by the speech recognizer; the mapping of those and their hypernyms into ontological concepts and instances; and the following integration of them into the system’s ontology.
S5 Focus to Emphasize Tone Structures for Prosodic Analysis in Spoken Language Generation
Lalita Narupiyakul
We analyze the concept of focus in speech and the relationship between focus and speech acts for prosodic generation. We determine how the speaker’s utterances are influenced by speaker’s intention. The relationship between speech acts and focus information is used to define which parts of the sentence serve as the focus parts. We propose the Focus to Emphasize Tones (FET) structure to analyze the focus components. We also design the FET grammar to analyze the intonation patterns and produce tone marks as a result of our analysis. We present a proof-of-the-concept working example to validate our proposal. More comprehensive evaluations are part of our current work.
Thursday 20th July 400pm–530pm
12A: Machine Translation V
Session Chair: Alon Lavie
An End-to-End Discriminative Approach to Machine Translation
Percy Liang, Alexandre Bouchard-Côté, Dan Klein and
Ben Taskar
We present a
perceptron-style discriminative approach to machine translation in which large
feature sets can be exploited. Unlike discriminative reranking approaches, our
system can take advantage of learned features in all stages of decoding. We
first discuss several challenges to error-driven discriminative approaches. In
particular, we explore different ways of updating parameters given a training
example. We find that making frequent but smaller updates is preferable to
making fewer but larger updates. Then, we discuss an array of features and show
both how they quantitatively increase BLEU score and how they qualitatively
interact on specific examples. One particular feature we investigate is a novel
way to introduce learning into the initial phrase extraction process, which has
previously been entirely heuristic.
Semi-Supervised Training for Statistical Word Alignment
Alexander Fraser and Daniel Marcu
We introduce a semi-supervised approach to training for statistical machine translation that alternates the traditional Expectation Maximization step that is applied on a large training corpus with a discriminative step aimed at increasing word-alignment quality on a small, manually word-aligned sub-corpus. We show that our algorithm leads not only to improved alignments but also to machine translation outputs of higher quality.
Left-to-Right Target Generation for Hierarchical Phrase-based Translation
Taro Watanabe, Hajime Tsukada and Hideki Isozaki
We present a hierarchical phrase-based statistical machine translation model in which a target sentence is efficiently generated in left-to-right order. The model is a class of synchronous-CFG with a Greibach Normal Form-like structure for the projected production rule: the paired target side of a production rule takes a phrase-prefixed form. The decoder for the target-normalized form is based on an Earley-style top-down parser on the source side. The target-normalized form coupled with our top-down parser implies a left-to-right generation of translations, which enables straightforward integration with n-gram language models. Our model was evaluated on a Japanese-to-English newswire translation task and showed statistically significant performance improvements over a phrase-based translation system.
12B: Lexical Issues III
Session Chair: Nicoletta Calzolari
You Can't Beat Frequency (Unless You Use Linguistic Knowledge) – A Qualitative Evaluation of Association Measures for Collocation and Term Extraction
Joachim Wermter and Udo Hahn
In the past years, a number of lexical association measures have been studied to help extract new scientific terminology or general-language collocations. The implicit assumption of this research was that newly designed term measures involving more sophisticated statistical criteria would outperform simple counts of co-occurrence frequencies. We here explicitly test this assumption. By way of four qualitative criteria, we show that purely statistics-based measures reveal virtually no difference compared with frequency of occurrence counts, while linguistically more informed metrics do reveal such a marked difference.
Ontologizing Semantic Relations
Marco Pennacchiotti and Patrick Pantel
Many algorithms have been developed to harvest lexical semantic resources, however few have linked the mined knowledge into formal knowledge repositories. In this paper, we propose two algorithms for automatically ontologizing (attaching) semantic relations into WordNet. We present an empirical evaluation on the task of attaching partof and causation relations, showing an improvement on F-score over a baseline model.
Semantic Taxonomy Induction from Heterogenous Evidence
Rion Snow, Daniel Jurafsky and Andrew Y. Ng
We propose a novel algorithm for inducing semantic taxonomies. Previous algorithms for taxonomy induction have typically focused on independent classifiers for discovering new single relationships based on hand-constructed or automatically discovered textual patterns. By contrast, our algorithm flexibly incorporates evidence from multiple classifiers over heterogenous relationships to optimize the entire structure of the taxonomy, using knowledge of a word’s coordinate terms to help in determining its hypernyms, and vice versa. We apply our algorithm on the problem of sense-disambiguated noun hyponym acquisition, where we combine the predictions of hypernym and coordinate term classifiers with the knowledge in a preexisting semantic taxonomy (WordNet 2.1). We add 10,000 novel synsets to WordNet 2.1 at 84% precision, a relative error reduction of 70% over a non-joint algorithm using the same component classifiers. Finally, we show that a taxonomy built using our algorithm shows a 23% relative F-score improvement over WordNet 2.1 on an independent testset of hypernym pairs.
12C: Information Extraction III
Session Chair: Yorick Wilks
Names and Similarities on the Web: Fact Extraction in the Fast Lane
Marius Paşca, Dekang Lin, Jeffrey Bigham, Andrei Lifchits and Alpa Jain
In a new approach to large-scale extraction of facts from unstructured text, distributional similarities become an integral part of both the iterative acquisition of high-coverage contextual extraction patterns, and the validation and ranking of candidate facts. The evaluation measures the quality and coverage of facts extracted from one hundred million Web documents, starting from ten seed facts and using no additional knowledge, lexicons or complex tools.
Weakly Supervised Named Entity Transliteration and Discovery from Multilingual Comparable Corpora
Alexandre Klementiev and Dan Roth
Named Entity recognition (NER) is an important task in natural language processing. This paper presents a weakly supervised method for discovering named entities and their transliterations from multilingual comparable corpora.
A Composite Kernel to Extract Relations between Entities with both Flat and Structured Features
Min Zhang, Jie Zhang, Jian Su and Guodong Zhou
This paper proposes a
novel composite kernel for relation extraction. The composite kernel consists
of two individual kernels: an entity kernel that allows for entity-related
features and a convolution parse tree kernel that models syntactic information
of relation examples. The motivation of our method is to fully utilize the nice
properties of kernel methods to explore diverse knowledge for relation
extraction. Our study illustrates that the composite kernel can effectively
capture both flat and structured features without the need for extensive
feature engineering, and can also easily scale to include more features.
Evaluation on the ACE corpus shows that our method outperforms the previous best-reported methods and significantly outperforms the two previous dependency tree kernels for relation extraction.
SRW 3: Parsing
Session Chair: Stephen Wan
S6 Extraction of Tree Adjoining Grammars from a Treebank for Korean
Jungyeul Park
We present the implementation of a system which extracts
not only lexicalized grammars but also feature-based lexicalized grammars from
the Korean Sejong Treebank. We report on some practical experiments in which we extract such grammars from the treebank.
S7 Parsing and Subcategorization Data
Jianguo Li
In this paper, we compare the performance of a state-of-the-art statistical parser (Bikel, 2004) in parsing written and spoken language and in generating subcategorization cues from written and spoken language. Although Bikel’s parser achieves a higher accuracy for parsing written language, it achieves a higher accuracy when extracting subcategorization cues from spoken language. Additionally, we explore the utility of punctuation in helping parsing and extraction of subcategorization cues. Our experiments show that punctuation is of little help in parsing spoken language and extracting subcategorization cues from spoken language. This indicates that there is no need to add punctuation in transcribing spoken corpora simply in order to help parsers.
S8 Clavius: Bi-Directional Parsing for Generic Multimodal Interaction
Frank Rudzicz
We introduce a new multi-threaded parsing algorithm on unification grammars designed specifically for multimodal interaction and noisy environments. By lifting some traditional constraints, namely those related to the ordering of constituents, we overcome several difficulties of other systems in this domain. We also present several criteria used in this model to constrain the search process using dynamically loadable scoring functions. Some early analyses of our implementation are discussed.
Friday 21st July 1000am–1030am
13A: Parsing VI
Session Chair: Srinivas Bangalore
Japanese Dependency Parsing Using Co-occurrence Information and a Combination of Case Elements
Takeshi Abekawa and Manabu Okumura
In this paper, we
present a method that improves Japanese dependency parsing by using large-scale
statistical information. It takes into account two kinds of information not
considered in previous statistical (machine learning based) parsing methods:
information about dependency relations among the case elements of a verb, and
information about co-occurrence relations between a verb and its case element.
This information can be collected from the results of automatic dependency
parsing of large-scale corpora. In our experiments, using this information to rerank the output of an existing machine-learning-based parsing method improved the accuracy of the existing method.
13B: Question Answering I
Session Chair: Dan Moldovan
Answer Extraction, Semantic Clustering, and Extractive Summarization for Clinical Question Answering
Dina Demner-Fushman and Jimmy Lin
This paper presents a hybrid approach to question answering in the clinical domain that combines techniques from summarization and information retrieval. We tackle a frequently-occurring class of questions that takes the form “What is the best drug treatment for X?” Starting from an initial set of MEDLINE citations, our system first identifies the drugs under study. Abstracts are then clustered using semantic classes from the UMLS ontology. Finally, a short extractive summary is generated for each abstract to populate the clusters. Two evaluations—a manual one focused on short answers and an automatic one focused on the supporting abstracts—demonstrate that our system compares favorably to PubMed, the search system most widely used by physicians today.
13C: Semantics III
Session Chair: Alexander Koller
Discovering asymmetric entailment relations between verbs using selectional preferences
Fabio Massimo Zanzotto, Marco Pennacchiotti and Maria Teresa Pazienza
In this
paper we investigate a novel method to detect asymmetric entailment relations
between verbs. Our starting point is the idea that some point-wise verb
selectional preferences carry relevant semantic information. Experiments using
WordNet as a gold standard show promising results. Where applicable, our
method, used in combination with other approaches, significantly increases the
performance of entailment detection. A combined approach including our model
improves the AROC by 5% with respect to standard models.
13D: Applications III
Session Chair: Eva Hajičová
Event Extraction in a Plot Advice Agent
Harry Halpin and Johanna D. Moore
In this paper we
present how the automatic extraction of events from text can be used to both
classify narrative texts according to plot quality and produce advice in an
interactive learning environment intended to help students with story writing.
We focus on the story-rewriting task, in which an exemplar story is read to the
students and the students rewrite the story in their own words. The system
automatically extracts events from the raw text, formalized as a sequence of
temporally ordered predicate-arguments. These events are given to a
machine-learner that produces a coarse-grained rating of the story. The results
of the machine-learner and the extracted events are then used to generate fine-grained
advice for the students.
Friday 21st July 1100am–1230pm
14A: Parsing VII
Session Chair: Srinivas Bangalore
An All-Subtrees Approach to Unsupervised Parsing
Rens Bod
We investigate generalizations of the all-subtrees "DOP"
approach to unsupervised parsing. Unsupervised DOP models assign all possible
binary trees to a set of sentences and
next use (a large random subset of) all subtrees from these binary trees to
compute the most probable parse trees. We will test both a relative frequency
estimator for unsupervised DOP and a maximum likelihood estimator which is
known to be statistically consistent. We report state-of-the-art results on
English (WSJ), German (NEGRA) and Chinese (CTB) corpora.
Advances in Discriminative Parsing
Joseph Turian and I. Dan Melamed
The present work
advances the accuracy and training speed of discriminative parsing. Our
discriminative parsing method has no generative component, yet surpasses a
generative baseline on constituent parsing, and does so with minimal linguistic
cleverness. Our model can incorporate arbitrary features of the input and parse
state, and performs feature selection incrementally over an exponential feature
space during training. We demonstrate the flexibility of our approach by
testing it with several parsing strategies and various feature sets. Our
implementation is freely available at: http://nlp.cs.nyu.edu/parser/.
Prototype-Driven Grammar Induction
Aria Haghighi and Dan Klein
We investigate
prototype-driven learning for primarily unsupervised grammar induction. Prior
knowledge is specified declaratively, by providing a few canonical examples of
each target phrase type. This sparse prototype information is then propagated
across a corpus using distributional similarity features, which augment an
otherwise standard PCFG model. We show that distributional features are
effective at distinguishing bracket labels, but not determining bracket
locations. To improve the quality of the induced trees, we combine our PCFG
induction with the CCM model of Klein and Manning (2002), which has
complementary strengths: it identifies brackets but does not label them. Using
only a handful of prototypes, we show substantial improvements over naive PCFG
induction for English and Chinese grammar induction.
14B: Question Answering II
Session Chair: Dan Moldovan
Exploring Correlation of Dependency Relation Paths for Answer Extraction
Dan Shen and Dietrich Klakow
In this paper, we explore correlation of dependency
relation paths to rank candidate answers in answer extraction. Using the
correlation measure, we compare dependency relations of a candidate answer and
mapped question phrases in the answer sentence with the corresponding relations in the question. Different from previous studies, we propose an approximate
phrase-mapping algorithm and incorporate the mapping score into the correlation
measure. The correlations are further incorporated into a Maximum Entropy-based
ranking model which estimates path weights from training. Experimental results
show that our method significantly outperforms state-of-the-art syntactic
relation-based methods by up to 20% in MRR.
Question Answering with Lexical Chains Propagating Verb Arguments
Adrian Novischi and Dan Moldovan
This paper describes an algorithm for propagating verb arguments along lexical chains consisting of WordNet relations. The algorithm creates verb argument structures using VerbNet syntactic patterns. In order to increase the coverage, a larger set of verb senses were automatically associated with the existing patterns from VerbNet. The algorithm is used in an in-house Question Answering system for re-ranking the set of candidate answers. Tests on factoid questions from TREC 2004 indicate that the algorithm improved the system performance by 2.4%.
Methods for Using Textual Entailment in Open-Domain Question Answering
Sanda Harabagiu and Andrew Hickl
Work on the semantics of questions has argued that the relation between a question and its answer(s) can be cast in terms of logical entailment. In this paper, we demonstrate how computational systems designed to recognize textual entailment can be used to enhance the accuracy of current open-domain automatic question answering (Q/A) systems. In our experiments, we show that when textual entailment information is used to either filter or rank answers returned by a Q/A system, accuracy can be increased by as much as 20% overall.
14C: Semantics IV
Session Chair: Alexander Koller
Using String-Kernels for Learning Semantic Parsers
Rohit J. Kate and Raymond J. Mooney
We present a new approach for mapping natural language sentences to their formal meaning representations using string-kernel-based classifiers. Our system learns these classifiers for every production in the formal language grammar. Meaning representations for novel natural language sentences are obtained by finding the most probable semantic parse using these string classifiers. Our experiments on two real-world data sets show that this approach compares favorably to other existing systems and is particularly robust to noise.
A Bootstrapping Approach to Unsupervised Detection of Cue Phrase Variants
Rashid M. Abdalla and Simone Teufel
We investigate the unsupervised
detection of semi-fixed cue phrases such as “This paper proposes a novel
approach …” from unseen text, on the basis of only a handful of seed cue
phrases with the desired semantics. The problem, in contrast to bootstrapping
approaches for Question Answering and Information Extraction, is that it is
hard to find a constraining context for occurrences of semi-fixed cue phrases.
Our method uses components of the cue phrase itself, rather than external
context, to bootstrap. It successfully excludes phrases which are different
from the target semantics, but which look superficially similar. The method
achieves 88% accuracy, outperforming standard bootstrapping approaches.
Semantic Role Labeling via FrameNet, VerbNet and PropBank
Ana-Maria Giuglea and Alessandro Moschitti
This article describes
a robust semantic parser that uses a broad knowledge base created by
interconnecting three major resources: FrameNet, VerbNet and PropBank. The
FrameNet corpus contains the examples annotated with semantic roles whereas the
VerbNet lexicon provides the knowledge about the syntactic behavior of the
verbs. We connect VerbNet and FrameNet by mapping the FrameNet frames to the
VerbNet Intersective Levin classes. The PropBank corpus, which is tightly
connected to the VerbNet lexicon, is used to increase the verb coverage and
also to test the effectiveness of our approach. The results indicate that our
model is an interesting step towards the design of more robust semantic parsers.
14D: Resources II
Session Chair: Eva Hajičová
Multilingual Legal Terminology on the Jibiki Platform: The LexALP Project
Gilles
Sérasset, Francis Brunet-Manquat and Elena Chiocchetti
This paper presents the particular use of “Jibiki” (Papillon’s web server development platform) for the LexALP project. LexALP’s goal is to harmonise the terminology on spatial planning and sustainable development used within the Alpine Convention, so that the member states are able to cooperate and communicate efficiently in the four official languages (French, German, Italian and Slovene). To this purpose, LexALP uses the Jibiki platform to build a term bank for the contrastive analysis of the specialised terminology used in six different national legal systems and four different languages. In this paper we present how a generic platform like Jibiki can cope with a new kind of dictionary.
Leveraging Reusability: Cost-effective Lexical Acquisition for Large-scale Ontology Translation
G. Craig Murray, Bonnie Dorr, Jimmy Lin, Jan Hajič and Pavel Pecina
Thesauri and ontologies provide important value in facilitating access to digital archives by representing underlying principles of organization. Translation of such resources into multiple languages is an important component for providing multilingual access. However, the specificity of vocabulary terms in most ontologies precludes fully-automated machine translation using general-domain lexical resources. In this paper, we present an efficient process for leveraging human translations when constructing domain-specific lexical resources. We evaluate the effectiveness of this process by producing a probabilistic phrase dictionary and translating a thesaurus of 56,000 concepts used to catalogue a large archive of oral histories. Our experiments demonstrate a cost-effective technique for accurate machine translation of large ontologies.
Accurate Collocation Extraction Using a Multilingual Parser
Violeta Seretan and Eric Wehrli
This paper focuses on the use of advanced techniques of text analysis as support for collocation extraction. A hybrid system is presented that combines statistical methods and multilingual parsing for detecting accurate collocational information from English, French, Spanish and Italian corpora. The advantage of relying on full parsing over using a traditional window method (which ignores the syntactic information) is first theoretically motivated, then empirically validated by a comparative evaluation experiment.
Friday 21st July 200pm–330pm
15A: Machine Translation VI
Session Chair: Dekai Wu
Scalable Inference and Training of Context-rich Syntactic Translation Models
Michel Galley, Jonathan Graehl, Kevin Knight, Daniel Marcu, Steve DeNeefe, Wei Wang and Ignacio Thayer
Statistical MT has made great progress in the last few years, but current translation models are weak on re-ordering and target language fluency. Syntactic approaches seek to remedy these problems. In this paper, we take the framework for acquiring multi-level syntactic translation rules of (Galley et al., 2004) from aligned tree-string pairs, and present two main extensions of their approach: first, instead of merely computing a single derivation that minimally explains a sentence pair, we construct a large number of derivations that include contextually richer rules, and account for multiple interpretations of unaligned words. Second, we propose probability estimates and a training procedure for weighting these rules. We contrast different approaches on real examples, show that our estimates based on multiple derivations favor phrasal re-orderings that are linguistically better motivated, and establish that our larger rules provide a 3.63 BLEU point increase over minimal rules.
Modelling lexical redundancy for machine translation
David Talbot and Miles Osborne
Certain distinctions made in the lexicon of one language may be redundant when translating into another language. We quantify redundancy among source types by the similarity of their distributions over target types. We propose a language-independent framework for minimising lexical redundancy that can be optimised directly from parallel text. Optimisation of the source lexicon for a given target language is viewed as model selection over a set of cluster-based translation models.
Redundant distinctions between types may exhibit
monolingual regularities, for example, inflexion patterns. We define a prior
over model structure using a Markov random field and learn features over sets
of monolingual types that are predictive of bilingual redundancy. The prior
makes model selection more robust without the need for language-specific
assumptions regarding redundancy. Using these models in a phrase-based SMT system, we demonstrate improvements in translation quality.
Empirical Lower Bounds on the Complexity of Translational Equivalence
Benjamin Wellington, Sonjia Waxmonsky and I. Dan Melamed
This paper describes a
study of the patterns of translational equivalence exhibited by a variety of
bitexts. The study found that the complexity of these patterns in every bitext
was higher than suggested in the literature. These findings shed new light on
why “syntactic” constraints have not helped to improve statistical translation
models, including finite state phrase-based models, tree-to-string models, and
tree-to-tree models. The paper also presents evidence that inversion
transduction grammars cannot generate some translational equivalence relations,
even in relatively simple real bitexts in syntactically similar languages with
rigid word order. Instructions for replicating our experiments are at
http://nlp.cs.nyu.edu/GenPar/
15B: Language Modelling
Session Chair: Jianfeng Gao
A Hierarchical Bayesian Language Model Based On Pitman-Yor Processes
Yee Whye Teh
We propose a new hierarchical Bayesian n-gram model of natural languages. Our model makes use of a generalization of the commonly used Dirichlet distributions called Pitman-Yor processes which produce power-law distributions more closely resembling those in natural languages. We show that an approximation to the hierarchical Pitman-Yor language model recovers the exact formulation of interpolated Kneser-Ney, one of the best smoothing methods for n-gram language models. Experiments verify that our model gives cross entropy results superior to interpolated Kneser-Ney and comparable to modified Kneser-Ney.
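For readers who want the formula behind this connection, the standard hierarchical Pitman-Yor predictive probability (notation may differ slightly from the paper) is

```latex
P(w \mid \mathbf{u}) \;=\;
  \frac{c_{\mathbf{u}w} - d\,t_{\mathbf{u}w}}{\theta + c_{\mathbf{u}\cdot}}
  \;+\;
  \frac{\theta + d\,t_{\mathbf{u}\cdot}}{\theta + c_{\mathbf{u}\cdot}}\,
  P\big(w \mid \pi(\mathbf{u})\big),
```

where c and t are the customer and table counts of the restaurant for context u, d and θ are the (context-length-specific) discount and strength parameters, and π(u) is the context with its earliest word dropped; restricting each observed word type to occupy a single table recovers the interpolated Kneser-Ney estimator.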
A Phonetic-Based Approach to Chinese Chat Text Normalization
Yunqing Xia, Kam-Fai Wong and Wenjie Li
Chatting is a popular communication medium on the Internet, via ICQ, chat rooms, etc. Chat language is different from natural language due to its anomalous and dynamic nature, which renders conventional NLP tools inapplicable. The dynamic problem is enormously troublesome because it makes a static chat language corpus outdated quickly in representing contemporary chat language. To address the dynamic problem, we propose phonetic mapping models to represent mappings between chat terms and standard words via phonetic transcription, i.e. Chinese Pinyin in our case. Different from character mappings, the phonetic mappings can be constructed from an available standard Chinese corpus. To perform the task of dynamic chat language term normalization, we extend the source channel model by incorporating the phonetic mapping models. Experimental results show that this method is effective and stable in normalizing dynamic chat language terms.
Discriminative Pruning of Language Models for Chinese Word Segmentation
Jianfeng Li, Haifeng Wang, Dengjun Ren and Guohua Li
This paper presents a discriminative pruning method of n-gram language model for Chinese word segmentation. To reduce the size of the language model that is used in a Chinese word segmentation system, importance of each bigram is computed in terms of discriminative pruning criterion that is related to the performance loss caused by pruning the bigram. Then we propose a step-by-step growing algorithm to build the language model of desired size. Experimental results show that the discriminative pruning method leads to a much smaller model compared with the model pruned using the state-of-the-art method. At the same Chinese word segmentation F-measure, the number of bigrams in the model can be reduced by up to 90%. Correlation between language model perplexity and word segmentation performance is also discussed.
15C: Information Retrieval II
Session Chair: Rosie Jones
Novel Association Measures Using Web Search with Double Checking
Hsin-Hsi Chen, Ming-Shun Lin and Yu-Chuan Wei
A web search with double checking model is proposed to explore the web as a live corpus. Five association measures including variants of Dice, Overlap Ratio, Jaccard, and Cosine, as well as Co-Occurrence Double Check (CODC), are presented. In the experiments on Rubenstein-Goodenough’s benchmark data set, the CODC measure achieves correlation coefficient 0.8492, which competes with the performance (0.8914) of the model using WordNet. The experiments on link detection of named entities using the strategies of direct association, association matrix and scalar association matrix verify that the double-check frequencies are reliable. Further study on named entity clustering shows that the five measures are quite useful. In particular, CODC measure is very stable on word-word and name-name experiments. The application of CODC measure to expand community chains for personal name disambiguation achieves 9.65% and 14.22% increase compared to the system without community expansion. All the experiments illustrate that the novel model of web search with double-checking is feasible for mining associations from the web.
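The first four measures have standard definitions; writing f(X) for the hit count of X and f(X, Y) for the joint count, the usual formulations are given below (CODC is defined in the paper itself):

```latex
\mathrm{Dice}(X,Y)    = \frac{2\,f(X,Y)}{f(X)+f(Y)}, \qquad
\mathrm{Overlap}(X,Y) = \frac{f(X,Y)}{\min\{f(X),\,f(Y)\}}, \\[4pt]
\mathrm{Jaccard}(X,Y) = \frac{f(X,Y)}{f(X)+f(Y)-f(X,Y)}, \qquad
\mathrm{Cosine}(X,Y)  = \frac{f(X,Y)}{\sqrt{f(X)\,f(Y)}}.
```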
Semantic Retrieval for the Accurate Identification of Relational Concepts in Massive Textbases
Yusuke Miyao, Tomoko Ohta, Katsuya Masuda, Yoshimasa Tsuruoka, Kazuhiro Yoshida, Takashi Ninomiya and Jun'ichi Tsujii
This paper introduces a novel framework for the accurate retrieval of relational concepts from huge texts. Prior to retrieval, all sentences are annotated with predicate argument structures and ontological identifiers by applying a deep parser and a term recognizer. During the run time, user requests are converted into queries of region algebra on these annotations. Structural matching with pre-computed semantic annotations establishes the accurate and efficient retrieval of relational concepts. This framework was applied to a text retrieval system for MEDLINE. Experiments on the retrieval of biomedical correlations revealed that the cost is sufficiently small for real-time applications and that the retrieval precision is significantly improved.
Exploring Distributional Similarity Based Models for Query Spelling Correction
Mu Li, Muhua Zhu, Yang Zhang and Ming Zhou
A query speller is crucial to search engine in improving web search relevance. This paper describes novel methods for use of distributional similarity estimated from query logs in learning improved query spelling correction models. The key to our methods is the property of distributional similarity between two terms: it is high between a frequently occurring misspelling and its correction, and low between two irrelevant terms only with similar spellings. We present two models that are able to take advantage of this property. Experimental results demonstrate that the distributional similarity based models can significantly outperform their baseline systems in the web query spelling correction task.
15D: Generation I
Session Chair: Donia Scott
Robust PCFG-Based Generation using Automatically Acquired LFG Approximations
Aoife Cahill and Josef van Genabith
We present a novel PCFG-based architecture for robust
probabilistic generation based on wide-coverage LFG approximations automatically acquired from treebanks.
Incremental generation of spatial referring expressions in situated dialog
John D. Kelleher and Geert-Jan M. Kruijff
This paper presents an approach to incrementally generating locative expressions. It addresses the issue of combinatorial explosion inherent in the construction of relational context models by: (a) contextually defining the set of objects in the context that may function as a landmark, and (b) sequencing the order in which spatial relations are considered using a cognitively motivated hierarchy of relations, and visual and discourse salience.
Learning to Predict Case Markers in Japanese
Hisami
Suzuki and
Kristina Toutanova
Japanese case markers, which indicate the grammatical relation of the complement NP to the predicate, often pose challenges to the generation of Japanese text, be it done by a foreign language learner, or by a machine translation (MT) system. In this paper, we describe the task of predicting Japanese case markers and propose machine learning methods for solving it in two settings: (i) monolingual, when given information only from the Japanese sentence; and (ii) bilingual, when also given information from a corresponding English source sentence in an MT context. We formulate the task after the well-studied task of English semantic role labelling, and explore features from a syntactic dependency structure of the sentence. For the monolingual task, we evaluated our models on the Kyoto Corpus and achieved over 84% accuracy in assigning correct case markers for each phrase. For the bilingual task, we achieved an accuracy of 92% per phrase using a bilingual dataset from a technical domain. We show that in both settings, features that exploit dependency information, whether derived from gold-standard annotations or automatically assigned, contribute significantly to the prediction of case markers.
Friday 21st July 400pm–500pm
16A: Text Classification II
Session Chair: Peter Turney
Are These Documents Written from Different Perspectives? A Test of Different Perspectives Based On Statistical Distribution Divergence
Wei-Hao Lin and Alexander Hauptmann
In this paper we investigate how to automatically determine if two document collections are written from different perspectives. By perspectives we mean a point of view, for example, from the perspective of Democrats or Republicans. We propose a test of different perspectives based on distribution divergence between the statistical models of two collections. Experimental results show that the test can successfully distinguish document collections of different perspectives from other types of collections.
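One widely used divergence for such a test between the word distributions P and Q estimated from the two collections (given here as background; the paper's exact statistic may differ) is the Kullback-Leibler divergence:

```latex
D_{\mathrm{KL}}(P \,\|\, Q) \;=\; \sum_{w} P(w)\,\log\frac{P(w)}{Q(w)}.
```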
Word Sense and Subjectivity
Janyce Wiebe and Rada Mihalcea
Subjectivity and
meaning are both important properties of language. This paper explores their
interaction, and brings empirical evidence in support of the hypotheses that
(1) subjectivity is a property that can be associated with word senses, and (2)
word sense disambiguation can directly benefit from subjectivity annotations.
16B: Question Answering III
Session Chair: John Prange
Improving QA Accuracy by Question Inversion
John Prager, Pablo Duboue and Jennifer Chu-Carroll
This paper demonstrates a conceptually simple but effective method of increasing the accuracy of QA systems on factoid-style questions. We define the notion of an inverted question, and show that by requiring that the answers to the original and inverted questions be mutually consistent, incorrect answers get demoted in confidence and correct ones promoted. Additionally, we show that lack of validation can be used to assert no-answer (nil) conditions. We demonstrate increases of performance on TREC and other question-sets, and discuss the kinds of future activities that can be particularly beneficial to approaches such as ours.
Reranking Answers for Definitional QA Using Language Modeling
Yi Chen, Ming Zhou and Shilong Wang
Statistical ranking methods based on a centroid vector (profile) extracted from external knowledge have become widely adopted in the top definitional QA systems in TREC 2003 and 2004. In these approaches, terms in the centroid vector are treated as a bag of words based on the independence assumption. To relax this assumption, this paper proposes a novel language model-based answer reranking method to improve the existing bag-of-words approach by considering the dependence of the words in the centroid vector. Experiments have been conducted to evaluate the different dependence models. The results on the TREC 2003 test set show that the reranking approach with a biterm language model significantly outperforms the ones with the bag-of-words model and the unigram language model by 14.9% and 12.5% respectively in F-Measure(5).
16C: Grammars III
Session Chair: Gerald Penn
Highly constrained unification grammars
Daniel Feinstein and Shuly Wintner
Unification grammars
are widely accepted as an expressive means for describing the structure of
natural languages. In general, the recognition problem is undecidable for
unification grammars. Even with restricted variants of the formalism, offline
parsable grammars, the problem is computationally hard. We present two natural
constraints on unification grammars which limit their expressivity. We first
show that non-reentrant unification grammars generate exactly the class of
context-free languages. We then relax the constraint and show that
one-reentrant unification grammars generate exactly the class of tree-adjoining
languages. We thus relate the commonly used and linguistically motivated
formalism of unification grammars to more restricted, computationally tractable
classes of languages.
A polynomial parsing algorithm for the topological model: Synchronizing Constituent and Dependency Grammars, Illustrated by German Word Order Phenomena
Kim Gerdes and Sylvain Kahane
This paper describes a minimal topology-driven parsing algorithm for topological grammars that synchronizes a rewriting grammar and a dependency grammar, obtaining two linguistically motivated syntactic structures. The use of non-local slash and visitor features can be restricted to obtain a CKY-type analysis in polynomial time. German long-distance phenomena illustrate the algorithm, bringing to the fore the procedural needs of the analyses of syntax-topology mismatches in constraint-based approaches such as HPSG.
16D: Generation II
Session Chair: Donia Scott
Stochastic Language Generation Using WIDL-expressions and its Application in Machine Translation and Summarization
Radu Soricut and Daniel Marcu
We propose WIDL-expressions as a flexible formalism that facilitates the integration of a generic sentence realization system within end-to-end language processing applications. WIDL-expressions represent compactly probability distributions over finite sets of candidate realizations, and have optimal algorithms for realization via interpolation with language model probability distributions. We show the effectiveness of a WIDL-based NLG system in two sentence realization tasks: automatic translation and headline generation.
Learning to Say It Well: Reranking Realizations by Predicted Synthesis Quality
Crystal Nakatsu and Michael White
This paper presents a
method for adapting a language generator to the strengths and weaknesses of a
synthetic voice, thereby improving the naturalness of synthetic speech in a
spoken language dialogue system. The method trains a discriminative reranker to
select paraphrases that are predicted to sound natural when synthesized. The
ranker is trained on realizer and synthesizer features in supervised fashion,
using human judgments of synthetic voice quality on a sample of the paraphrases
representative of the generator’s capability. Results from a cross-validation
study indicate that discriminative paraphrase reranking can achieve substantial
improvements in naturalness on average, ameliorating the problem of highly
variable synthesis quality typically encountered with today’s unit selection
synthesizers.