Alex Lưu

Also published as: Alex Luu


2022

Towards Human Evaluation of Mutual Understanding in Human-Computer Spontaneous Conversation: An Empirical Study of Word Sense Disambiguation for Naturalistic Social Dialogs in American English
Alex Lưu
Proceedings of the 2nd Workshop on Human Evaluation of NLP Systems (HumEval)

Current evaluation practices for social dialog systems, dedicated to human-computer spontaneous conversation, exclusively focus on the quality of system-generated surface text, but not human-verifiable aspects of mutual understanding between the systems and their interlocutors. This work proposes Word Sense Disambiguation (WSD) as an essential component of a valid and reliable human evaluation framework, whose long-term goal is to radically improve the usability of dialog systems in real-life human-computer collaboration. The practicality of this proposal is proved via experimentally investigating (1) the WordNet 3.0 sense inventory coverage of lexical meanings in spontaneous conversation between humans in American English, assumed as an upper bound of lexical diversity of human-computer communication, and (2) the effectiveness of state-of-the-art WSD models and pretrained transformer-based contextual embeddings on this type of data.

Sketching a Linguistically-Driven Reasoning Dialog Model for Social Talk
Alex Lưu
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop

The capability of holding social talk (or casual conversation) and making sense of conversational content requires context-sensitive natural language understanding and reasoning, which cannot be handled efficiently by the current popular open-domain dialog systems and chatbots. Heavily relying on corpus-based machine learning techniques to encode and decode context-sensitive meanings, these systems focus on fitting a particular training dataset, but not tracking what is actually happening in a conversation, and therefore easily derail in a new context. This work sketches out a more linguistically-informed architecture to handle social talk in English, in which corpus-based methods form the backbone of the relatively context-insensitive components (e.g. part-of-speech tagging, approximation of lexical meaning and constituent chunking), while symbolic modeling is used for reasoning out the context-sensitive components, which do not have any consistent mapping to linguistic forms. All components are fitted into a Bayesian game-theoretic model to address the interactive and rational aspects of conversation.

2020

Annotating Coherence Relations for Studying Topic Transitions in Social Talk
Alex Luu | Sophia A. Malamud
Proceedings of the 14th Linguistic Annotation Workshop

This study develops the strand of research on topic transitions in social talk which aims to gain a better understanding of interlocutors’ conversational goals. Lưu and Malamud (2020) proposed that one way to identify such transitions is to annotate coherence relations, and then to identify utterances potentially expressing new topics as those that fail to participate in these relations. This work validates and refines their suggested annotation methodology, focusing on annotating the most prominent coherence relations in face-to-face social dialogue. The result is a publicly accessible gold standard corpus with efficient and reliable annotation, whose broad coverage provides a foundation for future steps of identifying and classifying new topic utterances.

Non-Topical Coherence in Social Talk: A Call for Dialogue Model Enrichment
Alex Luu | Sophia A. Malamud
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop

Current models of dialogue mainly focus on utterances within a topically coherent discourse segment, rather than new-topic utterances (NTUs), which begin a new topic not correlating with the content of prior discourse. As a result, these models may sufficiently account for discourse context of task-oriented but not social conversations. We conduct a pilot annotation study of NTUs as a first step towards a model capable of rationalizing conversational coherence in social talk. We start with the naturally occurring social dialogues in the Disco-SPICE corpus, annotated with discourse relations in the Penn Discourse Treebank and Cognitive approach to Coherence Relations frameworks. We first annotate content-based coherence relations that are not available in Disco-SPICE, and then heuristically identify NTUs, which lack a coherence relation to prior discourse. Based on the interaction between NTUs and their discourse context, we construct a classification for NTUs that actually convey certain non-topical coherence in social talk. This classification introduces new sequence-based social intents that traditional taxonomies of speech acts do not capture. The new findings advocate the development of a Bayesian game-theoretic model for social talk.

2016

Converting SynTagRus Dependency Treebank into Penn Treebank Style
Alex Luu | Sophia A. Malamud | Nianwen Xue
Proceedings of the 10th Linguistic Annotation Workshop held in conjunction with ACL 2016 (LAW-X 2016)