Noah A. Smith

Also published as: Noah Smith


2019

pdf bib
Linguistic Knowledge and Transferability of Contextual Representations
Nelson F. Liu | Matt Gardner | Yonatan Belinkov | Matthew E. Peters | Noah A. Smith

Contextual word representations derived from large-scale neural language models are successful across a diverse set of NLP tasks, suggesting that they encode useful and transferable features of language. To shed light on the linguistic knowledge they capture, we study the representations produced by several recent pretrained contextualizers (variants of ELMo, the OpenAI transformer language model, and BERT) with a suite of sixteen diverse probing tasks. We find that linear models trained on top of frozen contextual representations are competitive with state-of-the-art task-specific models in many cases, but fail on tasks requiring fine-grained linguistic knowledge (e.g., conjunct identification). To investigate the transferability of contextual word representations, we quantify differences in the transferability of individual layers within contextualizers, especially between recurrent neural networks (RNNs) and transformers. For instance, higher layers of RNNs are more task-specific, while transformer layers do not exhibit the same monotonic trend. In addition, to better understand what makes contextual word representations transferable, we compare language model pretraining with eleven supervised pretraining tasks. For any given task, pretraining on a closely related task yields better performance than language model pretraining (which is better on average) when the pretraining dataset is fixed. However, language model pretraining on more data gives the best results.
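
A minimal sketch of the probing setup, assuming token representations have already been extracted from a frozen contextualizer; the arrays, dimensionality, and label set below are invented stand-ins:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    dim, n_labels = 1024, 45                       # e.g., ELMo layer width, a tag inventory
    X_train = rng.normal(size=(5000, dim))         # frozen contextual token vectors
    y_train = rng.integers(0, n_labels, size=5000)
    X_test = rng.normal(size=(1000, dim))
    y_test = rng.integers(0, n_labels, size=1000)

    # The probe: a linear model trained on top of the frozen representations.
    probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    print("probing accuracy:", probe.score(X_test, y_test))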

pdf bib
Inoculation by Fine-Tuning: A Method for Analyzing Challenge Datasets
Nelson F. Liu | Roy Schwartz | Noah A. Smith

Several datasets have recently been constructed to expose brittleness in models trained on existing benchmarks. While model performance on these challenge datasets is significantly lower compared to the original benchmark, it is unclear what particular weaknesses they reveal. For example, a challenge dataset may be difficult because it targets phenomena that current models cannot capture, or because it simply exploits blind spots in a model’s specific training set. We introduce inoculation by fine-tuning, a new analysis method for studying challenge datasets by exposing models (the metaphorical patient) to a small amount of data from the challenge dataset (a metaphorical pathogen) and assessing how well they can adapt. We apply our method to analyze the NLI “stress tests” (Naik et al., 2018) and the Adversarial SQuAD dataset (Jia and Liang, 2017). We show that after slight exposure, some of these datasets are no longer challenging, while others remain difficult. Our results indicate that failures on challenge datasets may lead to very different conclusions about models, training datasets, and the challenge datasets themselves.
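
A toy sketch of the inoculation procedure, with a linear model and random features standing in for a neural NLI system; only the shape of the experiment matches the paper:

    import numpy as np
    from sklearn.linear_model import SGDClassifier

    rng = np.random.default_rng(0)
    def fake_data(n, dim=100):                 # placeholder featurized examples
        return rng.normal(size=(n, dim)), rng.integers(0, 3, size=n)

    X_orig, y_orig = fake_data(5000)           # original benchmark training set
    X_chal, y_chal = fake_data(500)            # the metaphorical pathogen
    orig_test, chal_test = fake_data(1000), fake_data(1000)

    for k in (10, 100, 500):                   # exposure dose
        model = SGDClassifier(loss="log_loss", random_state=0).fit(X_orig, y_orig)
        model.partial_fit(X_chal[:k], y_chal[:k])   # inoculate: brief fine-tuning
        print(k, model.score(*orig_test), model.score(*chal_test))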

pdf bib
Polyglot Contextual Representations Improve Crosslingual Transfer
Phoebe Mulcaire | Jungo Kasai | Noah A. Smith

We introduce Rosita, a method to produce multilingual contextual word representations by training a single language model on text from multiple languages. Our method combines the advantages of contextual word representations with those of multilingual representation learning. We produce language models from dissimilar language pairs (English/Arabic and English/Chinese) and use them in dependency parsing, semantic role labeling, and named entity recognition, with comparisons to monolingual and non-contextual variants. Our results provide further evidence for the benefits of polyglot learning, in which representations are shared across multiple languages.

2018

pdf bib
Neural Cross-Lingual Named Entity Recognition with Minimal Resources
Jiateng Xie | Zhilin Yang | Graham Neubig | Noah A. Smith | Jaime Carbonell

For languages with no annotated resources, unsupervised transfer of natural language processing models such as named-entity recognition (NER) from resource-rich languages would be an appealing capability. However, differences in words and word order across languages make it a challenging problem. To improve mapping of lexical items across languages, we propose a method that finds translations based on bilingual word embeddings. To improve robustness to word order differences, we propose to use self-attention, which allows for a degree of flexibility with respect to word order. We demonstrate that these methods achieve state-of-the-art or competitive NER performance on commonly tested languages under a cross-lingual setting, with much lower resource requirements than past approaches. We also evaluate the challenges of applying these methods to Uyghur, a low-resource language.

pdf bib
Rational Recurrences
Hao Peng | Roy Schwartz | Sam Thomson | Noah A. Smith

Despite the tremendous empirical success of neural models in natural language processing, many of them lack the strong intuitions that accompany classical machine learning approaches. Recently, connections have been shown between convolutional neural networks (CNNs) and weighted finite state automata (WFSAs), leading to new interpretations and insights. In this work, we show that some recurrent neural networks also share this connection to WFSAs. We characterize this connection formally, defining rational recurrences to be recurrent hidden state update functions that can be written as the Forward calculation of a finite set of WFSAs. We show that several recent neural models use rational recurrences. Our analysis provides a fresh view of these models and facilitates devising new neural architectures that draw inspiration from WFSAs. We present one such model, which performs better than two recent baselines on language modeling and text classification. Our results demonstrate that transferring intuitions from classical models like WFSAs can be an effective approach to designing and understanding neural models.
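
The recurrence in question fits in a few lines; in this toy sketch (transition weights invented), the vector of forward weights over WFSA states plays the role of an RNN's hidden state:

    import numpy as np

    n_states = 3
    start = np.array([1.0, 0.0, 0.0])          # start-state weights
    final = np.array([0.0, 0.0, 1.0])          # final-state weights

    def transition(symbol):                    # invented symbol-conditioned weights
        rng = np.random.default_rng(ord(symbol))
        return rng.uniform(size=(n_states, n_states))

    def forward(string):
        h = start                              # "hidden state" = forward weights
        for symbol in string:
            h = h @ transition(symbol)         # the rational recurrence update
        return h @ final                       # total weight of accepting paths

    print(forward("abba"))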

pdf bib
Syntactic Scaffolds for Semantic Structures
Swabha Swayamdipta | Sam Thomson | Kenton Lee | Luke Zettlemoyer | Chris Dyer | Noah A. Smith

We introduce the syntactic scaffold, an approach to incorporating syntactic information into semantic tasks. Syntactic scaffolds avoid expensive syntactic processing at runtime, only making use of a treebank during training, through a multitask objective. We improve over strong baselines on PropBank semantics, frame semantics, and coreference resolution, achieving competitive performance on all three tasks.
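
In training terms, a scaffold is an auxiliary loss; a minimal sketch (the interpolation weight is an assumption, not the paper's value):

    def scaffold_objective(semantic_loss, syntactic_loss, weight=0.5):
        # The syntactic term comes from treebank supervision and exists only
        # at training time; at runtime the semantic model runs on its own.
        return semantic_loss + weight * syntactic_loss

    print(scaffold_objective(1.2, 0.8))        # 1.6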

pdf bib
Discovering Phonesthemes with Sparse Regularization
Nelson F. Liu | Gina-Anne Levow | Noah A. Smith

We introduce a simple method for extracting non-arbitrary form-meaning representations from a collection of semantic vectors. We treat the problem as one of feature selection for a model trained to predict word vectors from subword features. We apply this model to the problem of automatically discovering phonesthemes, which are submorphemic sound clusters that appear in words with similar meaning. Many of our model-predicted phonesthemes overlap with those proposed in the linguistics literature, and we validate our approach with human judgments.
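
A sketch of the feature-selection view, assuming binary subword features and placeholder word vectors; a sparse regularizer leaves few surviving features, which become the phonestheme candidates:

    import numpy as np
    from sklearn.linear_model import MultiTaskLasso

    rng = np.random.default_rng(0)
    n_words, n_onsets, vec_dim = 2000, 300, 50
    X = rng.integers(0, 2, size=(n_words, n_onsets)).astype(float)  # e.g., starts with "gl-"?
    V = rng.normal(size=(n_words, vec_dim))        # placeholder semantic vectors

    model = MultiTaskLasso(alpha=0.1).fit(X, V)    # group-sparse across vector dimensions
    strength = np.linalg.norm(model.coef_, axis=0) # coef_ has shape (vec_dim, n_onsets)
    print("candidate phonestheme features:", np.argsort(-strength)[:10])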

pdf bib
LSTMs Exploit Linguistic Attributes of Data
Nelson F. Liu | Omer Levy | Roy Schwartz | Chenhao Tan | Noah A. Smith

While recurrent neural networks have found success in a variety of natural language processing applications, they are general models of sequential data. We investigate how the properties of natural language data affect an LSTM’s ability to learn a nonlinguistic task: recalling elements from its input. We find that models trained on natural language data are able to recall tokens from much longer sequences than models trained on non-language sequential data. Furthermore, we show that the LSTM learns to solve the memorization task by explicitly using a subset of its neurons to count timesteps in the input. We hypothesize that the patterns and structure in natural language data enable LSTMs to learn by providing approximate ways of reducing loss, but understanding the effect of different training data on the learnability of LSTMs remains an open question.

pdf bib
Parsing Tweets into Universal Dependencies
Yijia Liu | Yi Zhu | Wanxiang Che | Bing Qin | Nathan Schneider | Noah A. Smith

We study the problem of analyzing tweets with universal dependencies (UD). We extend the UD guidelines to cover special constructions in tweets that affect tokenization, part-of-speech tagging, and labeled dependencies. Using the extended guidelines, we create a new tweet treebank for English (Tweebank v2) that is four times larger than the (unlabeled) Tweebank v1 introduced by Kong et al. (2014). We characterize the disagreements between our annotators and show that it is challenging to deliver consistent annotation due to ambiguity in understanding and explaining tweets. Nonetheless, using the new treebank, we build a pipeline system to parse raw tweets into UD. To overcome the annotation noise without sacrificing computational efficiency, we propose a new method to distill an ensemble of 20 transition-based parsers into a single one. Our parser achieves an improvement of 2.2 in LAS over the un-ensembled baseline and outperforms parsers that are state-of-the-art on other treebanks in both accuracy and speed.
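
The distillation step can be sketched generically: average the ensemble members' distributions into soft targets and train the single parser toward them (the per-token head distributions and training loop below are invented stand-ins, not the paper's transition-based setup):

    import numpy as np

    rng = np.random.default_rng(0)
    n_parsers, n_tokens, n_heads = 20, 8, 8
    ensemble = rng.dirichlet(np.ones(n_heads), size=(n_parsers, n_tokens))
    soft = ensemble.mean(axis=0)               # averaged head distributions

    logits = np.zeros((n_tokens, n_heads))     # the student's arc scores
    for _ in range(500):                       # minimize cross-entropy to soft targets
        p = np.exp(logits)
        p /= p.sum(axis=1, keepdims=True)
        logits -= 0.5 * (p - soft)             # gradient of CE w.r.t. logits
    p = np.exp(logits); p /= p.sum(axis=1, keepdims=True)
    print(np.abs(p - soft).max())              # the student now matches the ensemble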

pdf bib
Learning Joint Semantic Parsers from Disjoint Data
Hao Peng | Sam Thomson | Swabha Swayamdipta | Noah A. Smith

We present a new approach to learning a semantic parser from multiple datasets, even when the target semantic formalisms are drastically different and the underlying corpora do not overlap. We handle such “disjoint” data by treating annotations for unobserved formalisms as latent structured variables. Building on state-of-the-art baselines, we show improvements both in frame-semantic parsing and semantic dependency parsing by modeling them jointly.

pdf bib
The Importance of Calibration for Estimating Proportions from Annotations
Dallas Card | Noah A. Smith

Estimating label proportions in a target corpus is a type of measurement that is useful for answering certain types of social-scientific questions. While past work has described a number of relevant approaches, nearly all are based on an assumption which we argue is invalid for many problems, particularly when dealing with human annotations. In this paper, we identify and differentiate between two relevant data generating scenarios (intrinsic vs. extrinsic labels), introduce a simple but novel method which emphasizes the importance of calibration, and then analyze and experimentally validate the appropriateness of various methods for each of the two scenarios.
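
A small synthetic contrast between two generic proportion estimators, hard "classify and count" versus averaging predicted probabilities; the latter is the one that benefits from calibration (illustrative baselines, not the paper's exact estimators):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    X = rng.normal(size=(5000, 20))
    w = rng.normal(size=20)
    y = (X @ w + rng.logistic(size=5000) > 0).astype(int)  # logistic ground truth

    clf = LogisticRegression(max_iter=1000).fit(X[:4000], y[:4000])
    probs = clf.predict_proba(X[4000:])[:, 1]              # the target "corpus"

    print("true proportion:   ", y[4000:].mean())
    print("classify and count:", (probs > 0.5).mean())
    print("mean probability:  ", probs.mean())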

pdf bib
Neural Text Generation in Stories Using Entity Representations as Context
Elizabeth Clark | Yangfeng Ji | Noah A. Smith

We introduce an approach to neural text generation that explicitly represents entities mentioned in the text. Entity representations are vectors that are updated as the text proceeds; they are designed specifically for narrative text like fiction or news stories. Our experiments demonstrate that modeling entities offers a benefit in two automatic evaluations: mention generation (in which a model chooses which entity to mention next and which words to use in the mention) and selection between a correct next sentence and a distractor from later in the same story. We also conduct a human evaluation on automatically generated text in story contexts; this study supports our emphasis on entities and suggests directions for further research.
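
A sketch of the core bookkeeping, an entity vector refreshed as mentions arrive; the gated interpolation and renormalization here are assumptions in the spirit of the model, not its exact update:

    import torch

    dim = 16
    entity = torch.nn.functional.normalize(torch.randn(dim), dim=0)

    def update(entity_vec, mention_vec, gate=0.5):
        mixed = (1 - gate) * entity_vec + gate * mention_vec
        return torch.nn.functional.normalize(mixed, dim=0)  # stay on the unit sphere

    for _ in range(3):                         # the entity evolves as the text proceeds
        entity = update(entity, torch.randn(dim))
    print(entity.norm())                       # ~1.0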

pdf bib
Annotation Artifacts in Natural Language Inference Data
Suchin Gururangan | Swabha Swayamdipta | Omer Levy | Roy Schwartz | Samuel Bowman | Noah A. Smith

Large-scale datasets for natural language inference are created by presenting crowd workers with a sentence (premise), and asking them to generate three new sentences (hypotheses) that it entails, contradicts, or is logically neutral with respect to. We show that, in a significant portion of such data, this protocol leaves clues that make it possible to identify the label by looking only at the hypothesis, without observing the premise. Specifically, we show that a simple text categorization model can correctly classify the hypothesis alone in about 67% of SNLI (Bowman et al., 2015) and 53% of MultiNLI (Williams et al., 2017). Our analysis reveals that specific linguistic phenomena such as negation and vagueness are highly correlated with certain inference classes. Our findings suggest that the success of natural language inference models to date has been overestimated, and that the task remains a hard open problem.
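
The artifact detector is an ordinary text classifier applied to hypotheses alone; a minimal sketch with invented examples (the paper uses SNLI and MultiNLI):

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    hypotheses = ["A man is sleeping.", "Nobody is outside.",
                  "A woman is playing a guitar.", "The dog is not running."]
    labels = ["neutral", "contradiction", "entailment", "contradiction"]

    clf = make_pipeline(CountVectorizer(), LogisticRegression())
    clf.fit(hypotheses, labels)                # the premise is never seen
    print(clf.predict(["The child is not smiling."]))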

pdf bib
Sounding Board: A User-Centric and Content-Driven Social Chatbot
Hao Fang | Hao Cheng | Maarten Sap | Elizabeth Clark | Ari Holtzman | Yejin Choi | Noah A. Smith | Mari Ostendorf

We present Sounding Board, a social chatbot that won the 2017 Amazon Alexa Prize. The system architecture consists of several components including spoken language processing, dialogue management, language generation, and content management, with emphasis on user-centric and content-driven design. We also share insights gained from large-scale online logs based on 160,000 conversations with real-world users.

pdf bib
Bridging CNNs, RNNs, and Weighted Finite-State Machines
Roy Schwartz | Sam Thomson | Noah A. Smith

Recurrent and convolutional neural networks comprise two distinct families of models that have proven to be useful for encoding natural language utterances. In this paper we present SoPa, a new model that aims to bridge these two approaches. SoPa combines neural representation learning with weighted finite-state automata (WFSAs) to learn a soft version of traditional surface patterns. We show that SoPa is an extension of a one-layer CNN, and that such CNNs are equivalent to a restricted version of SoPa, and accordingly, to a restricted form of WFSA. Empirically, on three text classification tasks, SoPa is comparable to or better than both a BiLSTM (RNN) baseline and a CNN baseline, and is particularly useful in small data settings.

pdf bib
Event2Mind: Commonsense Inference on Events, Intents, and Reactions
Hannah Rashkin | Maarten Sap | Emily Allaway | Noah A. Smith | Yejin Choi

We investigate a new commonsense inference task: given an event described in a short free-form text (“X drinks coffee in the morning”), a system reasons about the likely intents (“X wants to stay awake”) and reactions (“X feels alert”) of the event’s participants. To support this study, we construct a new crowdsourced corpus of 25,000 event phrases covering a diverse range of everyday events and situations. We report baseline performance on this task, demonstrating that neural encoder-decoder models can successfully compose embedding representations of previously unseen events and reason about the likely intents and reactions of the event participants. In addition, we demonstrate how commonsense inference on people’s intents and reactions can help unveil the implicit gender inequality prevalent in modern movie scripts.

pdf bib
Backpropagating through Structured Argmax using a SPIGOT
Hao Peng | Sam Thomson | Noah A. Smith

We introduce structured projection of intermediate gradients (SPIGOT), a new method for backpropagating through neural networks that include hard-decision structured predictions (e.g., parsing) in intermediate layers. SPIGOT requires no marginal inference, unlike structured attention networks and reinforcement learning-inspired solutions. Like so-called straight-through estimators, SPIGOT defines gradient-like quantities associated with intermediate nondifferentiable operations, allowing backpropagation before and after them; SPIGOT’s proxy aims to ensure that, after a parameter update, the intermediate structure will remain well-formed. We experiment on two structured NLP pipelines: syntactic-then-semantic dependency parsing, and semantic parsing followed by sentiment classification. We show that training with SPIGOT leads to a larger improvement on the downstream task than a modularly-trained pipeline, the straight-through estimator, and structured attention, reaching a new state of the art on semantic dependency parsing.
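
The straight-through estimator mentioned above fits in a few lines of PyTorch; SPIGOT differs by projecting so that the updated intermediate structure stays well-formed, which this sketch does not show:

    import torch

    scores = torch.randn(5, requires_grad=True)    # scores over candidate structures
    hard = torch.nn.functional.one_hot(scores.argmax(), 5).float()
    soft = torch.softmax(scores, dim=0)
    output = soft + (hard - soft).detach()         # forward: hard; backward: soft

    target = torch.tensor([0.0, 1.0, 0.0, 0.0, 0.0])
    loss = ((output - target) ** 2).sum()
    loss.backward()
    print(scores.grad)                             # gradient flows despite the argmax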

pdf bib
Neural Models for Documents with Metadata
Dallas Card | Chenhao Tan | Noah A. Smith

Most real-world document collections involve various types of metadata, such as author, source, and date, and yet the most commonly-used approaches to modeling text corpora ignore this information. While specialized models have been developed for particular applications, few are widely used in practice, as customization typically requires derivation of a custom inference algorithm. In this paper, we build on recent advances in variational inference methods and propose a general neural framework, based on topic models, to enable flexible incorporation of metadata and allow for rapid exploration of alternative models. Our approach achieves strong performance, with a manageable tradeoff between perplexity, coherence, and sparsity. Finally, we demonstrate the potential of our framework through an exploration of a corpus of articles about US immigration.

pdf bib
Polyglot Semantic Role Labeling
Phoebe Mulcaire | Swabha Swayamdipta | Noah A. Smith

Previous approaches to multilingual semantic dependency parsing treat languages independently, without exploiting the similarities between semantic structures across languages. We experiment with a new approach where we combine resources from different languages in the CoNLL 2009 shared task to build a single polyglot semantic dependency parser. Notwithstanding the absence of parallel data, and the dissimilarity in annotations between languages, our approach results in improvement in parsing performance on several languages over a monolingual baseline. Analysis of the polyglot models’ performance provides a new understanding of the similarities and differences between languages in the shared task.

2017

pdf bib
The Effect of Different Writing Tasks on Linguistic Style: A Case Study of the ROC Story Cloze Task
Roy Schwartz | Maarten Sap | Ioannis Konstas | Leila Zilles | Yejin Choi | Noah A. Smith

A writer’s style depends not just on personal traits but also on her intent and mental state. In this paper, we show how variants of the same writing task can lead to measurable differences in writing style. We present a case study based on the story cloze task (Mostafazadeh et al., 2016a), where annotators were assigned similar writing tasks with different constraints: (1) writing an entire story, (2) adding a story ending for a given story context, and (3) adding an incoherent ending to a story. We show that a simple linear classifier informed by stylistic features is able to successfully distinguish among the three cases, without even looking at the story context. In addition, combining our stylistic features with language model predictions reaches state-of-the-art performance on the story cloze challenge. Our results demonstrate that different task framings can dramatically affect the way people write.

pdf bib
Friendships, Rivalries, and Trysts: Characterizing Relations between Ideas in Texts
Chenhao Tan | Dallas Card | Noah A. Smith

Understanding how ideas relate to each other is a fundamental question in many domains, ranging from intellectual history to public communication. Because ideas are naturally embedded in texts, we propose the first framework to systematically characterize the relations between ideas based on their occurrence in a corpus of documents, independent of how these ideas are represented. Combining two statistics—cooccurrence within documents and prevalence correlation over time—our approach reveals a number of different ways in which ideas can cooperate and compete. For instance, two ideas can closely track each other’s prevalence over time, and yet rarely cooccur, almost like a “cold war” scenario. We observe that pairwise cooccurrence and prevalence correlation exhibit different distributions. We further demonstrate that our approach is able to uncover intriguing relations between ideas through in-depth case studies on news articles and research papers.
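
The two statistics are simple to compute; a sketch on invented document indicators (PMI is used here as a generic cooccurrence measure):

    import numpy as np

    rng = np.random.default_rng(0)
    in_doc_a = rng.integers(0, 2, size=1000)   # does idea A appear in each document?
    in_doc_b = rng.integers(0, 2, size=1000)
    year = rng.integers(2000, 2010, size=1000)

    p_a, p_b = in_doc_a.mean(), in_doc_b.mean()
    p_ab = (in_doc_a & in_doc_b).mean()
    pmi = np.log(p_ab / (p_a * p_b))           # within-document cooccurrence

    prev_a = [in_doc_a[year == y].mean() for y in range(2000, 2010)]
    prev_b = [in_doc_b[year == y].mean() for y in range(2000, 2010)]
    corr = np.corrcoef(prev_a, prev_b)[0, 1]   # prevalence correlation over time

    print(pmi, corr)   # low cooccurrence with high correlation ~ the "cold war" pattern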

pdf bib
Neural Discourse Structure for Text Categorization
Yangfeng Ji | Noah A. Smith

We show that discourse structure, as defined by Rhetorical Structure Theory and provided by an existing discourse parser, benefits text categorization. Our approach uses a recursive neural network and a newly proposed attention mechanism to compute a representation of the text that focuses on salient content, from the perspective of both RST and the task. Experiments consider variants of the approach and illustrate its strengths and weaknesses.

pdf bib
Deep Multitask Learning for Semantic Dependency Parsing
Hao Peng | Sam Thomson | Noah A. Smith

We present a deep neural architecture that parses sentences into three semantic dependency graph formalisms. By using efficient, nearly arc-factored inference and a bidirectional-LSTM composed with a multi-layer perceptron, our base system is able to significantly improve the state of the art for semantic dependency parsing, without using hand-engineered features or syntax. We then explore two multitask learning approaches—one that shares parameters across formalisms, and one that uses higher-order structures to predict the graphs jointly. We find that both approaches improve performance across formalisms on average, achieving a new state of the art. Our code is open-source and available at https://github.com/Noahs-ARK/NeurboParser.

pdf bib
Story Cloze Task: UW NLP System
Roy Schwartz | Maarten Sap | Ioannis Konstas | Leila Zilles | Yejin Choi | Noah A. Smith

This paper describes University of Washington NLP’s submission for the Linking Models of Lexical, Sentential and Discourse-level Semantics (LSDSem 2017) shared task—the Story Cloze Task. Our system is a linear classifier with a variety of features, including both the scores of a neural language model and style features. We report 75.2% accuracy on the task. A further discussion of our results can be found in Schwartz et al. (2017).

pdf bib
Greedy Transition-Based Dependency Parsing with Stack LSTMs
Miguel Ballesteros | Chris Dyer | Yoav Goldberg | Noah A. Smith

We introduce a greedy transition-based parser that learns to represent parser states using recurrent neural networks. Our primary innovation that enables us to do this efficiently is a new control structure for sequential neural networks—the stack long short-term memory unit (LSTM). Like the conventional stack data structures used in transition-based parsers, elements can be pushed to or popped from the top of the stack in constant time, but, in addition, an LSTM maintains a continuous space embedding of the stack contents. Our model captures three facets of the parser’s state: (i) unbounded look-ahead into the buffer of incoming words, (ii) the complete history of transition actions taken by the parser, and (iii) the complete contents of the stack of partially built tree fragments, including their internal structures. In addition, we compare two different word representations: (i) standard word vectors based on look-up tables and (ii) character-based models of words. Although standard word embedding models work well in all languages, the character-based models improve the handling of out-of-vocabulary words, particularly in morphologically rich languages. Finally, we discuss the use of dynamic oracles in training the parser. During training, dynamic oracles alternate between sampling parser states from the training data and from the model as it is being learned, making the model more robust to the kinds of errors that will be made at test time. Training our model with dynamic oracles yields a linear-time greedy parser with very competitive performance.
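
The stack LSTM's push and pop can be sketched by keeping LSTM states in a list, so popping reverts to an earlier summary in constant time (dimensions invented):

    import torch

    lstm = torch.nn.LSTMCell(input_size=8, hidden_size=16)
    empty = (torch.zeros(1, 16), torch.zeros(1, 16))
    stack = [empty]                            # saved (h, c) states

    def push(x):                               # x: a (1, 8) embedding of the new element
        stack.append(lstm(x, stack[-1]))

    def pop():                                 # constant time: revert to the prior state
        stack.pop()

    push(torch.randn(1, 8)); push(torch.randn(1, 8)); pop()
    summary, _ = stack[-1]                     # continuous embedding of stack contents
    print(summary.shape, len(stack) - 1)       # torch.Size([1, 16]) 1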

pdf bib
Dynamic Entity Representations in Neural Language Models
Yangfeng Ji | Chenhao Tan | Sebastian Martschat | Yejin Choi | Noah A. Smith

Understanding a long document requires tracking how entities are introduced and evolve over time. We present a new type of language model, EntityNLM, that can explicitly model entities, dynamically update their representations, and contextually generate their mentions. Our model is generative and flexible; it can model an arbitrary number of entities in context while generating each entity mention at an arbitrary length. In addition, it can be used for several different tasks such as language modeling, coreference resolution, and entity prediction. Experimental results with all these tasks demonstrate that our model consistently outperforms strong baselines and prior work.

pdf bib
What Do Recurrent Neural Network Grammars Learn About Syntax?
Adhiguna Kuncoro | Miguel Ballesteros | Lingpeng Kong | Chris Dyer | Graham Neubig | Noah A. Smith

Recurrent neural network grammars (RNNG) are a recently proposed probabilistic generative modeling family for natural language. They show state-of-the-art language modeling and parsing performance. We investigate what information they learn, from a linguistic perspective, through various ablations to the model and the data, and by augmenting the model with an attention mechanism (GA-RNNG) to enable closer inspection. We find that explicit modeling of composition is crucial for achieving the best performance. Through the attention mechanism, we find that headedness plays a central role in phrasal representation (with the model’s latent attention largely agreeing with predictions made by hand-crafted head rules, albeit with some important differences). By training grammars without nonterminal labels, we find that phrasal representations depend minimally on nonterminals, providing support for the endocentricity hypothesis.

2016

pdf bib
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Katrin Erk | Noah A. Smith

pdf bib
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Katrin Erk | Noah A. Smith

pdf bib
Greedy, Joint Syntactic-Semantic Parsing with Stack LSTMs
Swabha Swayamdipta | Miguel Ballesteros | Chris Dyer | Noah A. Smith

pdf bib
A Neural Model for Language Identification in Code-Switched Tweets
Aaron Jaech | George Mulcaire | Mari Ostendorf | Noah A. Smith

pdf bib
Hierarchical Character-Word Models for Language Identification
Aaron Jaech | George Mulcaire | Shobhit Hathi | Mari Ostendorf | Noah A. Smith

pdf bib
Recurrent Neural Network Grammars
Chris Dyer | Adhiguna Kuncoro | Miguel Ballesteros | Noah A. Smith

pdf bib
Generation from Abstract Meaning Representation using Tree Transducers
Jeffrey Flanigan | Chris Dyer | Noah A. Smith | Jaime Carbonell

pdf bib
Many Languages, One Parser
Waleed Ammar | George Mulcaire | Miguel Ballesteros | Chris Dyer | Noah A. Smith

We train one multilingual model for dependency parsing and use it to parse sentences in several languages. The parsing model uses (i) multilingual word clusters and embeddings; (ii) token-level language information; and (iii) language-specific features (fine-grained POS tags). This input representation enables the parser not only to parse effectively in multiple languages, but also to generalize across languages based on linguistic universals and typological similarities, making it more effective to learn from limited annotations. Our parser’s performance compares favorably to strong baselines in a range of data scenarios, including when the target language has a large treebank, a small treebank, or no treebank for training.
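
A sketch of the parser's input representation as described, concatenating the three ingredients per token; the vocabulary sizes and dimensions are invented:

    import torch

    word_emb = torch.nn.Embedding(10000, 100)  # multilingual word embeddings/clusters
    lang_emb = torch.nn.Embedding(7, 12)       # token-level language information
    pos_emb = torch.nn.Embedding(50, 20)       # fine-grained POS tags

    def token_input(word_id, lang_id, pos_id):
        parts = [word_emb(torch.tensor([word_id])),
                 lang_emb(torch.tensor([lang_id])),
                 pos_emb(torch.tensor([pos_id]))]
        return torch.cat(parts, dim=-1)        # one shared parser consumes this

    print(token_input(42, 3, 17).shape)        # torch.Size([1, 132])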

pdf bib
UW-CSE at SemEval-2016 Task 10: Detecting Multiword Expressions and Supersenses using Double-Chained Conditional Random Fields
Mohammad Javad Hosseini | Noah A. Smith | Su-In Lee

pdf bib
CMU at SemEval-2016 Task 8: Graph-based AMR Parsing with Infinite Ramp Loss
Jeffrey Flanigan | Chris Dyer | Noah A. Smith | Jaime Carbonell

pdf bib
Semi-Supervised Learning of Sequence Models with Method of Moments
Zita Marinho | André F. T. Martins | Shay B. Cohen | Noah A. Smith

pdf bib
Analyzing Framing through the Casts of Characters in the News
Dallas Card | Justin Gross | Amber Boydstun | Noah A. Smith

pdf bib
Friends with Motives: Using Text to Infer Influence on SCOTUS
Yanchuan Sim | Bryan Routledge | Noah A. Smith

pdf bib
Distilling an Ensemble of Greedy Dependency Parsers into One MST Parser
Adhiguna Kuncoro | Miguel Ballesteros | Lingpeng Kong | Chris Dyer | Noah A. Smith

pdf bib
Character Sequence Models for Colorful Words
Kazuya Kawakami | Chris Dyer | Bryan Routledge | Noah A. Smith

pdf bib
Training with Exploration Improves a Greedy Stack LSTM Parser
Miguel Ballesteros | Yoav Goldberg | Chris Dyer | Noah A. Smith

2015

pdf bib
Transition-Based Dependency Parsing with Stack Long Short-Term Memory
Chris Dyer | Miguel Ballesteros | Wang Ling | Austin Matthews | Noah A. Smith

pdf bib
Sparse Overcomplete Word Vector Representations
Manaal Faruqui | Yulia Tsvetkov | Dani Yogatama | Chris Dyer | Noah A. Smith

pdf bib
Frame-Semantic Role Labeling with Heterogeneous Annotations
Meghana Kshirsagar | Sam Thomson | Nathan Schneider | Jaime Carbonell | Noah A. Smith | Chris Dyer

pdf bib
The Media Frames Corpus: Annotations of Frames Across Issues
Dallas Card | Amber E. Boydstun | Justin H. Gross | Philip Resnik | Noah A. Smith

pdf bib
A Supertag-Context Model for Weakly-Supervised CCG Parser Learning
Dan Garrette | Chris Dyer | Jason Baldridge | Noah A. Smith

pdf bib
Transforming Dependencies into Phrase Structures
Lingpeng Kong | Alexander M. Rush | Noah A. Smith

pdf bib
Toward Abstractive Summarization Using Semantic Representations
Fei Liu | Jeffrey Flanigan | Sam Thomson | Norman Sadeh | Noah A. Smith

pdf bib
A Corpus and Model Integrating Multiword Expressions and Supersenses
Nathan Schneider | Noah A. Smith

pdf bib
Retrofitting Word Vectors to Semantic Lexicons
Manaal Faruqui | Jesse Dodge | Sujay Kumar Jauhar | Chris Dyer | Eduard Hovy | Noah A. Smith

pdf bib
Open Extraction of Fine-Grained Political Statements
David Bamman | Noah A. Smith

pdf bib
Improved Transition-based Parsing by Modeling Characters instead of Words with LSTMs
Miguel Ballesteros | Chris Dyer | Noah A. Smith

pdf bib
A Utility Model of Authors in the Scientific Community
Yanchuan Sim | Bryan Routledge | Noah A. Smith

pdf bib
Extractive Summarization by Maximizing Semantic Volume
Dani Yogatama | Fei Liu | Noah A. Smith

pdf bib
Bayesian Optimization of Text Representations
Dani Yogatama | Lingpeng Kong | Noah A. Smith

2014

pdf bib
A Bayesian Mixed Effects Model of Literary Character
David Bamman | Ted Underwood | Noah A. Smith

pdf bib
Linguistic Structured Sparsity in Text Categorization
Dani Yogatama | Noah A. Smith

pdf bib
A Discriminative Graph-Based Parser for the Abstract Meaning Representation
Jeffrey Flanigan | Sam Thomson | Jaime Carbonell | Chris Dyer | Noah A. Smith

pdf bib
Unsupervised Alignment of Privacy Policies using Hidden Markov Models
Rohan Ramanath | Fei Liu | Norman Sadeh | Noah A. Smith

pdf bib
Distributed Representations of Geographically Situated Language
David Bamman | Chris Dyer | Noah A. Smith

pdf bib
Simplified Dependency Annotations with GFL-Web
Michael T. Mordowanec | Nathan Schneider | Chris Dyer | Noah A. Smith

pdf bib
Comprehensive Annotation of Multiword Expressions in a Social Web Corpus
Nathan Schneider | Spencer Onuffer | Nora Kazour | Emily Danchik | Michael T. Mordowanec | Henrietta Conrad | Noah A. Smith

pdf bib
Weakly-Supervised Bayesian Learning of a CCG Supertagger
Dan Garrette | Chris Dyer | Jason Baldridge | Noah A. Smith

pdf bib
Proceedings of the ACL 2014 Workshop on Language Technologies and Computational Social Science
Cristian Danescu-Niculescu-Mizil | Jacob Eisenstein | Kathleen McKeown | Noah A. Smith

pdf bib
Overview of the 2014 NLP Unshared Task in PoliInformatics
Noah A. Smith | Claire Cardie | Anne Washington | John Wilkerson

pdf bib
CMU: Arc-Factored, Discriminative Semantic Dependency Parsing
Sam Thomson | Brendan O’Connor | Jeffrey Flanigan | David Bamman | Jesse Dodge | Swabha Swayamdipta | Nathan Schneider | Chris Dyer | Noah A. Smith

pdf bib
Dynamic Language Models for Streaming Text
Dani Yogatama | Chong Wang | Bryan R. Routledge | Noah A. Smith | Eric P. Xing

We present a probabilistic language model that captures temporal dynamics and conditions on arbitrary non-linguistic context features. These context features serve as important indicators of language changes that are otherwise difficult to capture using text data by itself. We learn our model in an efficient online fashion that is scalable for large, streaming data. With five streaming datasets from two different genres—economics news articles and social media—we evaluate our model on the task of sequential language modeling. Our model consistently outperforms competing models.

pdf bib
Discriminative Lexical Semantic Segmentation with Gaps: Running the MWE Gamut
Nathan Schneider | Emily Danchik | Chris Dyer | Noah A. Smith

We present a novel representation, evaluation measure, and supervised models for the task of identifying the multiword expressions (MWEs) in a sentence, resulting in a lexical semantic segmentation. Our approach generalizes a standard chunking representation to encode MWEs containing gaps, thereby enabling efficient sequence tagging algorithms for feature-rich discriminative models. Experiments on a new dataset of English web text offer the first linguistically-driven evaluation of MWE identification with truly heterogeneous expression types. Our statistical sequence model greatly outperforms a lookup-based segmentation procedure, achieving nearly 60% F1 for MWE identification.
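
A minimal reader for one such gappy segmentation; the tag inventory is simplified from the paper's scheme (which also marks expressions nested inside gaps), and the example is invented:

    sentence = ["He", "makes", "a", "mean", "cup", "of", "coffee"]
    tags     = ["O",  "B",     "o", "o",    "I",   "I",  "I"]   # gap: "a mean"

    def mwe_spans(tokens, tags):
        spans, current = [], None
        for i, tag in enumerate(tags):
            if tag == "B":                     # a new expression starts
                if current:
                    spans.append(current)
                current = [i]
            elif tag == "I" and current is not None:
                current.append(i)              # continue the expression across the gap
        if current:
            spans.append(current)
        return [[tokens[i] for i in span] for span in spans]

    print(mwe_spans(sentence, tags))           # [['makes', 'cup', 'of', 'coffee']]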

pdf bib
Unsupervised Discovery of Biographical Structure from Text
David Bamman | Noah A. Smith

We present a method for discovering abstract event classes in biographies, based on a probabilistic latent-variable model. Taking as input timestamped text, we exploit latent correlations among events to learn a set of event classes (such as Born, Graduates High School, and Becomes Citizen), along with the typical times in a person’s life when those events occur. In a quantitative evaluation at the task of predicting a person’s age for a given event, we find that our generative model outperforms a strong linear regression baseline, along with simpler variants of the model that ablate some features. The abstract event classes that we learn allow us to perform a large-scale analysis of 242,970 Wikipedia biographies. Though it is known that women are greatly underrepresented on Wikipedia—not only as editors (Wikipedia, 2011) but also as subjects of articles (Reagle and Rhue, 2011)—we find that there is a bias in their characterization as well, with biographies of women containing significantly more emphasis on events of marriage and divorce than biographies of men.

pdf bib
Frame-Semantic Parsing
Dipanjan Das | Desai Chen | André F. T. Martins | Nathan Schneider | Noah A. Smith

pdf bib
Phrase Dependency Machine Translation with Quasi-Synchronous Tree-to-Tree Features
Kevin Gimpel | Noah A. Smith

pdf bib
A Dependency Parser for Tweets
Lingpeng Kong | Nathan Schneider | Swabha Swayamdipta | Archna Bhatia | Chris Dyer | Noah A. Smith

pdf bib
A Step Towards Usable Privacy Policy: Automatic Alignment of Privacy Statements
Fei Liu | Rohan Ramanath | Norman Sadeh | Noah A. Smith

2013

pdf bib
Learning Latent Personas of Film Characters
David Bamman | Brendan O’Connor | Noah A. Smith

pdf bib
Learning to Extract International Relations from Political Context
Brendan O’Connor | Brandon M. Stewart | Noah A. Smith

pdf bib
Turning on the Turbo: Fast Third-Order Non-Projective Turbo Parsers
André Martins | Miguel Almeida | Noah A. Smith

pdf bib
Improved Part-of-Speech Tagging for Online Conversational Text with Word Clusters
Olutobi Owoputi | Brendan O’Connor | Chris Dyer | Kevin Gimpel | Nathan Schneider | Noah A. Smith

pdf bib
A Simple, Fast, and Effective Reparameterization of IBM Model 2
Chris Dyer | Victor Chahuneau | Noah A. Smith

pdf bib
Supersense Tagging for Arabic: the MT-in-the-Middle Attack
Nathan Schneider | Behrang Mohit | Chris Dyer | Kemal Oflazer | Noah A. Smith

pdf bib
Knowledge-Rich Morphological Priors for Bayesian Language Models
Victor Chahuneau | Noah A. Smith | Chris Dyer

pdf bib
A Framework for (Under)specifying Dependency Syntax without Overloading Annotators
Nathan Schneider | Brendan O’Connor | Naomi Saphra | David Bamman | Manaal Faruqui | Noah A. Smith | Chris Dyer | Jason Baldridge

pdf bib
Measuring Ideological Proportions in Political Speeches
Yanchuan Sim | Brice D. L. Acree | Justin H. Gross | Noah A. Smith

pdf bib
Translating into Morphologically Rich Languages with Synthetic Phrases
Victor Chahuneau | Eva Schlinger | Noah A. Smith | Chris Dyer

pdf bib
Learning Topics and Positions from Debatepedia
Swapna Gottipati | Minghui Qiu | Yanchuan Sim | Jing Jiang | Noah A. Smith

2012

pdf bib
A Probabilistic Model for Canonicalizing Named Entity Mentions
Dani Yogatama | Yanchuan Sim | Noah A. Smith

pdf bib
Coarse Lexical Semantic Annotation with Supersenses: An Arabic Case Study
Nathan Schneider | Behrang Mohit | Kemal Oflazer | Noah A. Smith

pdf bib
An Exact Dual Decomposition Algorithm for Shallow Semantic Parsing with Constraints
Dipanjan Das | André F. T. Martins | Noah A. Smith

pdf bib
Empirical Risk Minimization for Probabilistic Grammars: Sample Complexity and Hardness of Learning
Shay B. Cohen | Noah A. Smith

pdf bib
Structured Ramp Loss Minimization for Machine Translation
Kevin Gimpel | Noah A. Smith

pdf bib
Concavity and Initialization for Unsupervised Dependency Parsing
Kevin Gimpel | Noah A. Smith

pdf bib
Graph-Based Lexicon Expansion with Sparsity-Inducing Penalties
Dipanjan Das | Noah A. Smith

pdf bib
Textual Predictors of Bill Survival in Congressional Committees
Tae Yano | Noah A. Smith | John D. Wilkerson

pdf bib
Structured Sparsity in Natural Language Processing: Models, Algorithms and Applications
André F. T. Martins | Mário A. T. Figueiredo | Noah A. Smith

pdf bib
Discovering Factions in the Computational Linguistics Community
Yanchuan Sim | Noah A. Smith | David A. Smith

pdf bib
Transliteration by Sequence Labeling with Lattice Encodings and Reranking
Waleed Ammar | Chris Dyer | Noah Smith

pdf bib
Word Salad: Relating Food Prices and Descriptions
Victor Chahuneau | Kevin Gimpel | Bryan R. Routledge | Lily Scherlis | Noah A. Smith

pdf bib
Recall-Oriented Learning of Named Entities in Arabic Wikipedia
Behrang Mohit | Nathan Schneider | Rishav Bhowmick | Kemal Oflazer | Noah A. Smith

2011

pdf bib
Unsupervised Word Alignment with Arbitrary Features
Chris Dyer | Jonathan H. Clark | Alon Lavie | Noah A. Smith

pdf bib
Discovering Sociolinguistic Associations with Structured Sparsity
Jacob Eisenstein | Noah A. Smith | Eric P. Xing

pdf bib
Semi-Supervised Frame-Semantic Parsing for Unknown Predicates
Dipanjan Das | Noah A. Smith

pdf bib
Part-of-Speech Tagging for Twitter: Annotation, Features, and Experiments
Kevin Gimpel | Nathan Schneider | Brendan O’Connor | Dipanjan Das | Daniel Mills | Jacob Eisenstein | Michael Heilman | Dani Yogatama | Jeffrey Flanigan | Noah A. Smith

pdf bib
Better Hypothesis Testing for Statistical Machine Translation: Controlling for Optimizer Instability
Jonathan H. Clark | Chris Dyer | Alon Lavie | Noah A. Smith

pdf bib
Author Age Prediction from Text using Linear Regression
Dong Nguyen | Noah A. Smith | Carolyn P. Rosé

pdf bib
The CMU-ARK German-English Translation System
Chris Dyer | Kevin Gimpel | Jonathan H. Clark | Noah A. Smith

pdf bib
Generative Models of Monolingual and Bilingual Gappy Patterns
Kevin Gimpel | Noah A. Smith

pdf bib
Structured Databases of Named Entities from Bayesian Nonparametrics
Jacob Eisenstein | Tae Yano | William Cohen | Noah Smith | Eric Xing

pdf bib
Unsupervised Bilingual POS Tagging with Markov Random Fields
Desai Chen | Chris Dyer | Shay Cohen | Noah Smith

pdf bib
Unsupervised Structure Prediction with Non-Parallel Multilingual Guidance
Shay B. Cohen | Dipanjan Das | Noah A. Smith

pdf bib
Dual Decomposition with Many Overlapping Components
André Martins | Noah Smith | Mário Figueiredo | Pedro Aguiar

pdf bib
Quasi-Synchronous Phrase Dependency Grammars for Machine Translation
Kevin Gimpel | Noah A. Smith

pdf bib
Predicting a Scientific Community’s Response to an Article
Dani Yogatama | Michael Heilman | Brendan O’Connor | Chris Dyer | Bryan R. Routledge | Noah A. Smith

pdf bib
Structured Sparsity in Structured Prediction
André Martins | Noah Smith | Mário Figueiredo | Pedro Aguiar

2010

pdf bib
Viterbi Training for PCFGs: Hardness Results and Competitiveness of Uniform Initialization
Shay Cohen | Noah A. Smith

pdf bib
SEMAFOR: Frame Argument Resolution with Log-Linear Models
Desai Chen | Nathan Schneider | Dipanjan Das | Noah A. Smith

pdf bib
Rating Computer-Generated Questions with Mechanical Turk
Michael Heilman | Noah A. Smith

pdf bib
Shedding (a Thousand Points of) Light on Biased Language
Tae Yano | Philip Resnik | Noah A. Smith

pdf bib
Distributed Asynchronous Online Learning for Natural Language Processing
Kevin Gimpel | Dipanjan Das | Noah A. Smith

pdf bib
Movie Reviews and Revenues: An Experiment in Text Regression
Mahesh Joshi | Dipanjan Das | Kevin Gimpel | Noah A. Smith

pdf bib
Variational Inference for Adaptor Grammars
Shay B. Cohen | David M. Blei | Noah A. Smith

pdf bib
Good Question! Statistical Ranking for Question Generation
Michael Heilman | Noah A. Smith

pdf bib
Softmax-Margin CRFs: Training Log-Linear Models with Cost Functions
Kevin Gimpel | Noah A. Smith

pdf bib
Probabilistic Frame-Semantic Parsing
Dipanjan Das | Nathan Schneider | Desai Chen | Noah A. Smith

pdf bib
Tree Edit Models for Recognizing Textual Entailments, Paraphrases, and Answers to Questions
Michael Heilman | Noah A. Smith

pdf bib
Turbo Parsers: Dependency Parsing by Approximate Variational Inference
André Martins | Noah Smith | Eric Xing | Pedro Aguiar | Mário Figueiredo

pdf bib
A Latent Variable Model for Geographic Lexical Variation
Jacob Eisenstein | Brendan O’Connor | Noah A. Smith | Eric P. Xing

pdf bib
Nonparametric Word Segmentation for Machine Translation
ThuyLinh Nguyen | Stephan Vogel | Noah A. Smith

2009

pdf bib
Cube Summing, Approximate Inference with Non-Local Features, and Dynamic Programming without Semirings
Kevin Gimpel | Noah A. Smith

pdf bib
Feature-Rich Translation by Quasi-Synchronous Lattice Parsing
Kevin Gimpel | Noah A. Smith

pdf bib
Shared Logistic Normal Distributions for Soft Parameter Tying in Unsupervised Grammar Induction
Shay Cohen | Noah A. Smith

pdf bib
Preference Grammars: Softening Syntactic Constraints to Improve Statistical Machine Translation
Ashish Venugopal | Andreas Zollmann | Noah A. Smith | Stephan Vogel

pdf bib
Predicting Risk from Financial Reports with Regression
Shimon Kogan | Dimitry Levin | Bryan R. Routledge | Jacob S. Sagi | Noah A. Smith

pdf bib
Predicting Response to Political Blog Posts with Topic Models
Tae Yano | William W. Cohen | Noah A. Smith

pdf bib
Summarization with a Joint Model for Sentence Extraction and Compression
André Martins | Noah A. Smith

pdf bib
Concise Integer Linear Programming Formulations for Dependency Parsing
André Martins | Noah Smith | Eric Xing

pdf bib
Paraphrase Identification as Probabilistic Quasi-Synchronous Recognition
Dipanjan Das | Noah A. Smith

pdf bib
Variational Inference for Grammar Induction with Prior Knowledge
Shay Cohen | Noah A. Smith

pdf bib
Leveraging Structural Relations for Fluent Compressions at Multiple Compression Rates
Sourish Chaudhuri | Naman K. Gupta | Noah A. Smith | Carolyn P. Rosé

2008

pdf bib
Stacking Dependency Parsers
André Filipe Torres Martins | Dipanjan Das | Noah A. Smith | Eric P. Xing

pdf bib
Competitive Grammar Writing
Jason Eisner | Noah A. Smith

pdf bib
Rich Source-Side Context for Statistical Machine Translation
Kevin Gimpel | Noah A. Smith

pdf bib
Book Reviews: Computational Approaches to Morphology and Syntax by Brian Roark and Richard Sproat
Noah A. Smith

2007

pdf bib
Computationally Efficient M-Estimation of Log-Linear Structure Models
Noah A. Smith | Douglas L. Vail | John D. Lafferty

pdf bib
Weighted and Probabilistic Context-Free Grammars Are Equally Expressive
Noah A. Smith | Mark Johnson

pdf bib
What is the Jeopardy Model? A Quasi-Synchronous Grammar for QA
Mengqiu Wang | Noah A. Smith | Teruko Mitamura

pdf bib
Probabilistic Models of Nonprojective Dependency Trees
David A. Smith | Noah A. Smith

pdf bib
Joint Morphological and Syntactic Disambiguation
Shay B. Cohen | Noah A. Smith

2006

pdf bib
Annealing Structural Bias in Multilingual Weighted Grammar Induction
Noah A. Smith | Jason Eisner

pdf bib
Vine Parsing and Minimum Risk Reranking for Speed and Precision
Markus Dreyer | David A. Smith | Noah A. Smith

2005

pdf bib
Contrastive Estimation: Training Log-Linear Models on Unlabeled Data
Noah A. Smith | Jason Eisner

pdf bib
Compiling Comp Ling: Weighted Dynamic Programming and the Dyna Language
Jason Eisner | Eric Goldlust | Noah A. Smith

pdf bib
Context-Based Morphological Disambiguation with Random Fields
Noah A. Smith | David A. Smith | Roy W. Tromble

pdf bib
Parsing with Soft and Hard Constraints on Dependency Length
Jason Eisner | Noah A. Smith

2004

pdf bib
Annealing Techniques For Unsupervised Statistical Language Learning
Noah A. Smith | Jason Eisner

pdf bib
Dyna: A Language for Weighted Dynamic Programming
Jason Eisner | Eric Goldlust | Noah A. Smith

pdf bib
Bilingual Parsing with Factored Estimation: Using English to Parse Korean
David A. Smith | Noah A. Smith

2003

pdf bib
The Web as a Parallel Corpus
Philip Resnik | Noah A. Smith

2002

pdf bib
From Words to Corpora: Recognizing Translation
Noah A. Smith

2000

pdf bib
Cairo: An Alignment Visualization Tool
Noah A. Smith | Michael E. Jahr
