Extrapolation in NLP

Jeff Mitchell, Pontus Stenetorp, Pasquale Minervini, Sebastian Riedel


Abstract
We argue that extrapolation to unseen data will often be easier for models that capture global structures, rather than just maximise their local fit to the training data. We show that this is true for two popular models: the Decomposable Attention Model and word2vec.
Anthology ID: W18-1005
Volume: Proceedings of the Workshop on Generalization in the Age of Deep Learning
Month: June
Year: 2018
Address: New Orleans, Louisiana
Editors: Yonatan Bisk, Omer Levy, Mark Yatskar
Venue: Gen-Deep
Publisher: Association for Computational Linguistics
Pages: 28–33
URL: https://aclanthology.org/W18-1005
DOI: 10.18653/v1/W18-1005
Cite (ACL): Jeff Mitchell, Pontus Stenetorp, Pasquale Minervini, and Sebastian Riedel. 2018. Extrapolation in NLP. In Proceedings of the Workshop on Generalization in the Age of Deep Learning, pages 28–33, New Orleans, Louisiana. Association for Computational Linguistics.
Cite (Informal): Extrapolation in NLP (Mitchell et al., Gen-Deep 2018)
PDF: https://aclanthology.org/W18-1005.pdf
Data: SNLI