Semi-Supervised Sequence Modeling with Cross-View Training

Kevin Clark; Minh-Thang Luong; Christopher D. Manning; Quoc Le

doi:10.18653/v1/D18-1217

Abstract

Unsupervised representation learning algorithms such as word2vec and ELMo improve the accuracy of many supervised NLP models, mainly because they can take advantage of large amounts of unlabeled text. However, the supervised models only learn from task-specific labeled data during the main training phase. We therefore propose Cross-View Training (CVT), a semi-supervised learning algorithm that improves the representations of a Bi-LSTM sentence encoder using a mix of labeled and unlabeled data. On labeled examples, standard supervised learning is used. On unlabeled examples, CVT teaches auxiliary prediction modules that see restricted views of the input (e.g., only part of a sentence) to match the predictions of the full model seeing the whole input. Since the auxiliary modules and the full model share intermediate representations, this in turn improves the full model. Moreover, we show that CVT is particularly effective when combined with multi-task learning. We evaluate CVT on five sequence tagging tasks, machine translation, and dependency parsing, achieving state-of-the-art results.
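The unlabeled-data objective described in the abstract can be illustrated with a minimal sketch. It assumes the auxiliary modules are trained to match the full model by minimizing a KL divergence, with the full model's prediction treated as a fixed target. The function names (`softmax`, `kl_divergence`, `cvt_unlabeled_loss`) and the random toy logits are hypothetical and stand in for the outputs of a real Bi-LSTM encoder; this is not the authors' implementation.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) over the last axis; eps guards against log(0)."""
    return np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)


def cvt_unlabeled_loss(full_logits, aux_logits_list):
    """Sum over auxiliary modules of the mean per-token KL to the full model.

    full_logits: array of shape (tokens, classes) from the full-view model.
    aux_logits_list: list of same-shaped arrays, one per restricted view.
    Assumption: the full model's distribution acts as a fixed teacher, so in
    a real framework no gradient would flow through it.
    """
    teacher = softmax(full_logits)
    per_module = [
        kl_divergence(teacher, softmax(aux_logits)).mean()
        for aux_logits in aux_logits_list
    ]
    return float(np.sum(per_module))


# Toy stand-ins: 5 tokens, 3 tag classes, 2 auxiliary views (hypothetical).
rng = np.random.default_rng(0)
full_logits = rng.normal(size=(5, 3))
aux_logits = [full_logits + rng.normal(scale=0.5, size=(5, 3)) for _ in range(2)]

loss = cvt_unlabeled_loss(full_logits, aux_logits)
perfect = cvt_unlabeled_loss(full_logits, [full_logits, full_logits])
```

In this sketch `perfect` is zero because the auxiliary predictions coincide with the full model's, while `loss` is positive for the perturbed views. In actual training, the gradient of this loss would update the auxiliary modules and the shared encoder, which is how the full model benefits from unlabeled text.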

Anthology ID: D18-1217
Volume: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Month: October-November
Year: 2018
Address: Brussels, Belgium
Editors: Ellen Riloff, David Chiang, Julia Hockenmaier, Jun’ichi Tsujii
Venue: EMNLP
SIG: SIGDAT
Publisher: Association for Computational Linguistics
Pages: 1914–1925
URL: https://aclanthology.org/D18-1217/
DOI: 10.18653/v1/D18-1217
Cite (ACL): Kevin Clark, Minh-Thang Luong, Christopher D. Manning, and Quoc Le. 2018. Semi-Supervised Sequence Modeling with Cross-View Training. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 1914–1925, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal): Semi-Supervised Sequence Modeling with Cross-View Training (Clark et al., EMNLP 2018)
PDF: https://aclanthology.org/D18-1217.pdf
Attachment: D18-1217.Attachment.zip
Video: https://aclanthology.org/D18-1217.mp4
Data: CCGbank, CoNLL, CoNLL 2003, OntoNotes 5.0, Penn Treebank
