Connecting Supervised and Unsupervised Sentence Embeddings

Gil Levi


Abstract
Representing sentences as numerical vectors while capturing their semantic context is an important and useful intermediate step in natural language processing. Representations that are both general and discriminative can serve as a tool for tackling various NLP tasks. While common sentence representation methods are unsupervised in nature, an approach for learning universal sentence representations in a supervised setting was recently presented in (Conneau et al., 2017). We argue that although it obtained promising results, they can be further improved by adding unsupervised constraints motivated by auto-encoders and by language models. We show that adding such constraints yields superior sentence embeddings. We compare our method with the original implementation and show improvements on several tasks.
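To make the idea concrete, the following is a minimal PyTorch-style sketch of combining a supervised NLI objective with an unsupervised auto-encoder constraint, in the spirit the abstract describes. The encoder follows the InferSent recipe (BiLSTM with max pooling and the [u, v, |u-v|, u*v] feature vector); the decoder, the conditioning scheme, and the weight lambda_ae are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class SentenceEncoder(nn.Module):
    """BiLSTM encoder with max pooling, as in InferSent (Conneau et al., 2017)."""
    def __init__(self, vocab_size, emb_dim=300, hid_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hid_dim, bidirectional=True, batch_first=True)

    def forward(self, tokens):                 # tokens: (batch, seq_len)
        h, _ = self.lstm(self.embed(tokens))   # (batch, seq_len, 2*hid_dim)
        return h.max(dim=1).values             # max-pool over time

class Decoder(nn.Module):
    """Hypothetical LSTM decoder reconstructing the sentence from its embedding."""
    def __init__(self, vocab_size, sent_dim=1024, hid_dim=512):
        super().__init__()
        self.lstm = nn.LSTM(sent_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, sent_vec, seq_len):
        # Feed the sentence vector at every step (a simple conditioning choice).
        inp = sent_vec.unsqueeze(1).repeat(1, seq_len, 1)
        h, _ = self.lstm(inp)
        return self.out(h)                     # (batch, seq_len, vocab)

def joint_loss(encoder, decoder, clf, premise, hypothesis, label, lambda_ae=0.5):
    """Supervised NLI loss plus an auto-encoder reconstruction constraint."""
    u, v = encoder(premise), encoder(hypothesis)
    feats = torch.cat([u, v, torch.abs(u - v), u * v], dim=1)  # InferSent features
    nli_loss = nn.functional.cross_entropy(clf(feats), label)
    # Unsupervised constraint: reconstruct the premise from its embedding.
    logits = decoder(u, premise.size(1))
    ae_loss = nn.functional.cross_entropy(
        logits.reshape(-1, logits.size(-1)), premise.reshape(-1))
    return nli_loss + lambda_ae * ae_loss
```

A language-model constraint could be sketched analogously by replacing the reconstruction decoder with a next-word prediction head; both variants add an unsupervised term to the supervised objective rather than changing the encoder itself.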
Anthology ID:
W18-3010
Volume:
Proceedings of the Third Workshop on Representation Learning for NLP
Month:
July
Year:
2018
Address:
Melbourne, Australia
Editors:
Isabelle Augenstein, Kris Cao, He He, Felix Hill, Spandana Gella, Jamie Kiros, Hongyuan Mei, Dipendra Misra
Venue:
RepL4NLP
SIG:
SIGREP
Publisher:
Association for Computational Linguistics
Pages:
79–83
URL:
https://aclanthology.org/W18-3010
DOI:
10.18653/v1/W18-3010
Bibkey:
Cite (ACL):
Gil Levi. 2018. Connecting Supervised and Unsupervised Sentence Embeddings. In Proceedings of the Third Workshop on Representation Learning for NLP, pages 79–83, Melbourne, Australia. Association for Computational Linguistics.
Cite (Informal):
Connecting Supervised and Unsupervised Sentence Embeddings (Levi, RepL4NLP 2018)
PDF:
https://aclanthology.org/W18-3010.pdf
Data
MPQA Opinion Corpus, SICK, SNLI, SST