A Deep Dive into Word Sense Disambiguation with LSTM

Minh Le, Marten Postma, Jacopo Urbani, Piek Vossen


Abstract
LSTM-based language models have been shown to be effective in Word Sense Disambiguation (WSD). In particular, the technique proposed by Yuan et al. (2016) achieved state-of-the-art performance on several benchmarks, but neither the training data nor the source code was released. This paper presents the results of a reproduction study and analysis of this technique using only openly available datasets (GigaWord, SemCor, OMSTI) and software (TensorFlow). Our study shows that similar results can be obtained with much less data than hinted at by Yuan et al. (2016). Detailed analyses shed light on the strengths and weaknesses of this method. First, adding more unannotated training data is useful but subject to diminishing returns. Second, the model can correctly identify both popular and unpopular meanings. Finally, the limited sense coverage of the annotated datasets is a major limitation. All code and trained models are made freely available.
Anthology ID:
C18-1030
Volume:
Proceedings of the 27th International Conference on Computational Linguistics
Month:
August
Year:
2018
Address:
Santa Fe, New Mexico, USA
Editors:
Emily M. Bender, Leon Derczynski, Pierre Isabelle
Venue:
COLING
Publisher:
Association for Computational Linguistics
Pages:
354–365
URL:
https://aclanthology.org/C18-1030
Cite (ACL):
Minh Le, Marten Postma, Jacopo Urbani, and Piek Vossen. 2018. A Deep Dive into Word Sense Disambiguation with LSTM. In Proceedings of the 27th International Conference on Computational Linguistics, pages 354–365, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
Cite (Informal):
A Deep Dive into Word Sense Disambiguation with LSTM (Le et al., COLING 2018)
PDF:
https://aclanthology.org/C18-1030.pdf