Rui Dong


2023

Improving Cross-task Generalization of Unified Table-to-text Models with Compositional Task Configurations
Jifan Chen | Yuhao Zhang | Lan Liu | Rui Dong | Xinchi Chen | Patrick Ng | William Yang Wang | Zhiheng Huang
Findings of the Association for Computational Linguistics: ACL 2023

There has been great progress in unifying various table-to-text tasks using a single encoder-decoder model trained via multi-task learning (Xie et al., 2022). However, existing methods typically encode task information with a simple dataset name as a prefix to the encoder. This not only limits the effectiveness of multi-task learning, but also hinders the model’s ability to generalize to new domains or tasks that were not seen during training, which is crucial for real-world applications. In this paper, we propose compositional task configurations, a set of prompts prepended to the encoder to improve cross-task generalization of unified models. We design the task configurations to explicitly specify the task type, as well as its input and output types. We show that this not only allows the model to better learn shared knowledge across different tasks at training, but also allows us to control the model by composing new configurations that apply novel input-output combinations in a zero-shot manner. We demonstrate via experiments over ten table-to-text tasks that our method outperforms the UnifiedSKG baseline by noticeable margins in both in-domain and zero-shot settings, with average improvements of +0.5 and +12.6 from using a T5-large backbone, respectively.

2022

ASCM: An Answer Space Clustered Prompting Method without Answer Engineering
Zhen Wang | Yating Yang | Zhou Xi | Bo Ma | Lei Wang | Rui Dong | Azmat Anwar
Findings of the Association for Computational Linguistics: ACL 2022

Prompt-based learning, which exploits knowledge from pre-trained language models by providing textual prompts and designing appropriate answer-category mapping methods, has achieved impressive successes on few-shot text classification and natural language inference (NLI). Because linguistic expression is diverse, many answer tokens exist for the same category. However, both manual answer design and automatic answer search constrain the answer space and therefore rarely achieve ideal performance. To address this issue, we propose an answer space clustered prompting model (ASCM) together with a synonym initialization method (SI), which automatically categorizes all answer tokens in a semantic-clustered embedding space. We also propose a stable semi-supervised method named stair learning (SL) that distills knowledge in order from better models to weaker models. Extensive experiments demonstrate that our ASCM+SL significantly outperforms existing state-of-the-art techniques in few-shot settings.

Entailment Tree Explanations via Iterative Retrieval-Generation Reasoner
Danilo Neves Ribeiro | Shen Wang | Xiaofei Ma | Rui Dong | Xiaokai Wei | Henghui Zhu | Xinchi Chen | Peng Xu | Zhiheng Huang | Andrew Arnold | Dan Roth
Findings of the Association for Computational Linguistics: NAACL 2022

Large language models have achieved high performance on various question answering (QA) benchmarks, but the explainability of their output remains elusive. Structured explanations, called entailment trees, were recently suggested as a way to explain the reasoning behind a QA system’s answer. In order to better generate such entailment trees, we propose an architecture called Iterative Retrieval-Generation Reasoner (IRGR). Our model is able to explain a given hypothesis by systematically generating a step-by-step explanation from textual premises. The IRGR model iteratively searches for suitable premises, constructing a single entailment step at a time. Contrary to previous approaches, our method combines generation steps and retrieval of premises, allowing the model to leverage intermediate conclusions, and mitigating the input size limit of baseline encoder-decoder models. We conduct experiments using the EntailmentBank dataset, where we outperform existing benchmarks on premise retrieval and entailment tree generation, with around 300% gain in overall correctness.

2021

Uyghur Sentiment Classification Model Based on Temporal Attention Capsule Networks
Hantian Luo (罗涵天) | Yating Yang (杨雅婷) | Rui Dong (董瑞) | Bo Ma (马博)
Proceedings of the 20th Chinese National Conference on Computational Linguistics

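Uyghur is a low-resource language, and improving the performance of Uyghur sentiment classification models with limited resources remains an open problem. To address the poor classification results caused by the insufficient generalization ability of existing Uyghur sentiment analysis methods, this paper proposes a Uyghur sentiment classification model based on a temporal convolutional attention capsule network. We run experiments on a Uyghur sentiment classification dataset and evaluate the model with several metrics (accuracy, precision, recall, and F1). The results show that the proposed model effectively improves every metric of Uyghur sentiment classification compared with traditional deep learning models.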

Structural Encoding and Pre-training Matter: Adapting BERT for Table-Based Fact Verification
Rui Dong | David Smith
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

Growing concern with online misinformation has encouraged NLP research on fact verification. Since writers often base their assertions on structured data, we focus here on verifying textual statements given evidence in tables. Starting from the Table Parsing (TAPAS) model developed for question answering (Herzig et al., 2020), we find that modeling table structure improves a language model pre-trained on unstructured text. Pre-training language models on English Wikipedia table data further improves performance. Pre-training on a question answering task with column-level cell rank information achieves the best performance. With improved pre-training and cell embeddings, this approach outperforms the state-of-the-art Numerically-aware Graph Neural Network table fact verification model (GNN-TabFact), increasing statement classification accuracy from 72.2% to 73.9% even without modeling numerical information. Incorporating numerical information with cell rankings and pre-training on a question-answering task increases accuracy to 76%. We further analyze accuracy on statements implicating single rows or multiple rows and columns of tables, on different numerical reasoning subtasks, and on generalizing to detecting errors in statements derived from the ToTTo table-to-text generation dataset.

2020

Multi-Task Neural Model for Agglutinative Language Translation
Yirong Pan | Xiao Li | Yating Yang | Rui Dong
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop

Neural machine translation (NMT) has achieved impressive performance recently by using large-scale parallel corpora. However, it struggles in the low-resource and morphologically rich scenarios of agglutinative language translation. Inspired by the finding that monolingual data can greatly improve NMT performance, we propose a multi-task neural model that jointly learns to perform bi-directional translation and agglutinative language stemming. Our approach employs a shared encoder and decoder to train a single model without changing the standard NMT architecture, instead adding a token before each source-side sentence to specify the desired target output of the two different tasks. Experimental results on Turkish-English and Uyghur-Chinese show that our proposed approach can significantly improve translation performance on agglutinative languages by using a small amount of monolingual data.

2019

Noisy Neural Language Modeling for Typing Prediction in BCI Communication
Rui Dong | David Smith | Shiran Dudy | Steven Bedrick
Proceedings of the Eighth Workshop on Speech and Language Processing for Assistive Technologies

Language models have broad adoption in predictive typing tasks. When the typing history contains numerous errors, as in open-vocabulary predictive typing with brain-computer interface (BCI) systems, we observe significant performance degradation in both n-gram and recurrent neural network language models trained on clean text. In evaluations of ranking character predictions, training recurrent LMs on noisy text makes them much more robust to noisy histories, even when the error model is misspecified. We also propose an effective strategy for combining evidence from multiple ambiguous histories of BCI electroencephalogram measurements.

2018

Multi-Input Attention for Unsupervised OCR Correction
Rui Dong | David Smith
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We propose a novel approach to OCR post-correction that exploits repeated texts in large corpora both as a source of noisy target outputs for unsupervised training and as a source of evidence when decoding. A sequence-to-sequence model with attention is applied for single-input correction, and a new decoder with multi-input attention averaging is developed to search for consensus among multiple sequences. We design two ways of training the correction model without human annotation, either training to match noisily observed textual variants or bootstrapping from a uniform error model. On two corpora of historical newspapers and books, we show that these unsupervised techniques cut the character and word error rates nearly in half on single inputs and, with the addition of multi-input decoding, can rival supervised methods.

2017

Log-linear Models for Uyghur Segmentation in Spoken Language Translation
Chenggang Mi | Yating Yang | Rui Dong | Xi Zhou | Lei Wang | Xiao Li | Tonghai Jiang
Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017

To alleviate data sparsity in spoken Uyghur machine translation, we propose a log-linear morphological segmentation approach. Instead of learning a model only from a monolingual annotated corpus, this approach optimizes Uyghur segmentation for spoken translation based on both bilingual and monolingual corpora. Our approach relies on several features, such as a traditional conditional random field (CRF) feature, a bilingual word alignment feature, and a monolingual suffix-word co-occurrence feature. Experimental results show that our proposed segmentation model for Uyghur spoken translation achieves an improvement of 1.6 BLEU points over the state-of-the-art baseline.