Semi-supervised sequence tagging with bidirectional language models

Matthew E. Peters, Waleed Ammar, Chandra Bhagavatula, Russell Power


Abstract
Pre-trained word embeddings learned from unlabeled text have become a standard component of neural network architectures for NLP tasks. However, in most cases, the recurrent network that operates on word-level representations to produce context sensitive representations is trained on relatively little labeled data. In this paper, we demonstrate a general semi-supervised approach for adding pretrained context embeddings from bidirectional language models to NLP systems and apply it to sequence labeling tasks. We evaluate our model on two standard datasets for named entity recognition (NER) and chunking, and in both cases achieve state of the art results, surpassing previous systems that use other forms of transfer or joint learning with additional labeled data and task specific gazetteers.
Anthology ID:
P17-1161
Volume:
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2017
Address:
Vancouver, Canada
Editors:
Regina Barzilay, Min-Yen Kan
Venue:
ACL
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
1756–1765
Language:
URL:
https://aclanthology.org/P17-1161
DOI:
10.18653/v1/P17-1161
Bibkey:
Cite (ACL):
Matthew E. Peters, Waleed Ammar, Chandra Bhagavatula, and Russell Power. 2017. Semi-supervised sequence tagging with bidirectional language models. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1756–1765, Vancouver, Canada. Association for Computational Linguistics.
Cite (Informal):
Semi-supervised sequence tagging with bidirectional language models (Peters et al., ACL 2017)
Copy Citation:
PDF:
https://aclanthology.org/P17-1161.pdf
Code
 additional community code
Data
CoNLLCoNLL 2003