Expanding wordnets to new languages with multilingual sense disambiguation

Mihael Arcan, John Philip McCrae, Paul Buitelaar


Abstract
Princeton WordNet is one of the most important resources for natural language processing, but it is available only for English. While it has been translated into many other languages using the expand approach, this is an expensive manual process. Therefore, it would be beneficial to have a high-quality automatic translation approach that supports NLP techniques that rely on WordNet in new languages. The translation of wordnets is fundamentally complex because of the need to translate all senses of a word, including low-frequency senses, which is very challenging for current machine translation approaches. For this reason, we leverage existing translations of WordNet in other languages to identify contextual information for wordnet senses from a large set of generic parallel corpora. We evaluate our approach using 10 translated wordnets for European languages. Our experiment shows a significant improvement over translation without any contextual information. Furthermore, we evaluate how the choice of pivot languages affects the performance of multilingual word sense disambiguation.
Anthology ID:
C16-1010
Volume:
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers
Month:
December
Year:
2016
Address:
Osaka, Japan
Editors:
Yuji Matsumoto, Rashmi Prasad
Venue:
COLING
Publisher:
The COLING 2016 Organizing Committee
Pages:
97–108
URL:
https://aclanthology.org/C16-1010
Cite (ACL):
Mihael Arcan, John Philip McCrae, and Paul Buitelaar. 2016. Expanding wordnets to new languages with multilingual sense disambiguation. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pages 97–108, Osaka, Japan. The COLING 2016 Organizing Committee.
Cite (Informal):
Expanding wordnets to new languages with multilingual sense disambiguation (Arcan et al., COLING 2016)
PDF:
https://aclanthology.org/C16-1010.pdf