Bigger analogy test set (State of the art)
This page lists published results on the Bigger Analogy Test Set (BATS).
Latest revision as of 15:14, 25 January 2017
- see also: Analogy (State of the art)
Dataset description
- New dataset proposed by Gladkova et al. (2016) [1]
- available at http://vsm.blackbird.pw/bats
- dataset balanced across 4 types of relations (inflectional morphology, derivational morphology, lexicographic semantics, encyclopedic semantics)
- 10 relations of each type, 50 unique pairs per category
- 99,200 questions in total
- more challenging than the Google analogy test set because its relations are more diverse
- where applicable, more than one correct answer is supplied (e.g. both canine and animal are hypernyms of dog).
- comes with a testing script that implements 5 methods of solving analogies (see Analogy (State of the art))
This page reports results obtained with the "vanilla" 3CosAdd method, or vector offset[2].
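As a rough illustration of the 3CosAdd (vector offset) method named above, the following minimal sketch answers a question a : a* :: b : ? by picking the word whose vector is closest, by cosine similarity, to b - a + a*. The toy 2-dimensional vectors, the function names, and the choice to exclude the three question words from the candidates are assumptions made for this example; it is not the BATS testing script.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def three_cos_add(vectors, a, a_star, b):
    """Answer a : a_star :: b : ? with the offset b - a + a_star."""
    target = [
        vb - va + vs
        for va, vs, vb in zip(vectors[a], vectors[a_star], vectors[b])
    ]
    # Assumption: the three question words are not allowed as answers.
    candidates = [w for w in vectors if w not in (a, a_star, b)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Hypothetical toy vectors, invented for illustration only.
toy_vectors = {
    "man": [1.0, 0.0],
    "woman": [1.0, 1.0],
    "king": [3.0, 0.0],
    "queen": [3.0, 1.0],
    "apple": [0.0, 3.0],
}

prediction = three_cos_add(toy_vectors, "man", "woman", "king")
# A question may have several accepted answers, so score against a set.
accepted_answers = {"queen"}
is_correct = prediction in accepted_answers
```

Scoring against a set of accepted answers mirrors the dataset property that more than one correct answer may be supplied for a question.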
Table of results
- Listed in chronological order.
Model | Reference | Inflectional morphology | Derivational morphology | Lexicographic semantics | Encyclopedic semantics | Corpus, window size, vector size |
---|---|---|---|---|---|---|
SVD | Drozd et al. (2016) [3] | 44.0 | 9.8 | 10.1 | 18.5 | 5B corpus (Araneum + Wikipedia + UkWac), window 3, 1000 dimensions |
GloVe | Drozd et al. (2016) [3] | 59.9 | 10.2 | 10.9 | 31.5 | 5B corpus (Araneum + Wikipedia + UkWac), window 8, 300 dimensions |
Skip-Gram | Drozd et al. (2016) [3] | 61.0 | 11.2 | 9.1 | 26.5 | 5B corpus (Araneum + Wikipedia + UkWac), window 8, 300 dimensions |
Methodological issues
- As with other analogy test sets, accuracy depends not only on the embedding and its parameters but also on the method used to solve the analogies [4] [5]. Set-based methods [6] considerably outperform pair-based methods, which shows that the embeddings do encode much of the information that pair-based methods miss.
- It is therefore more accurate to think of the analogy task as a way to describe and characterize an embedding, rather than to evaluate it.
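To make the pair-based versus set-based distinction concrete, here is a minimal sketch of one set-based idea: average the offset over a set of example pairs from the same relation and add that average to the query word (the averaging approach discussed in Drozd et al. (2016), as we understand it). The toy vectors, the function names, and the choice to keep the query pair out of the example set are assumptions made for this illustration, not the published implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def mean_vector(vecs):
    """Component-wise mean of a list of vectors."""
    n = len(vecs)
    return [sum(column) / n for column in zip(*vecs)]


def offset_from_set(vectors, example_pairs, b):
    """Answer b : ? using the average offset of the example pairs."""
    mean_source = mean_vector([vectors[a] for a, _ in example_pairs])
    mean_target = mean_vector([vectors[a_star] for _, a_star in example_pairs])
    target = [
        vb + t - s
        for vb, s, t in zip(vectors[b], mean_source, mean_target)
    ]
    # Assumption: only the query word itself is excluded from the candidates.
    candidates = [w for w in vectors if w != b]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Hypothetical toy vectors for a singular : plural relation.
toy_vectors = {
    "cat": [1.0, 0.0],
    "cats": [1.0, 1.2],
    "dog": [2.0, 0.0],
    "dogs": [2.0, 0.8],
    "car": [4.0, 0.0],
    "cars": [4.0, 1.0],
}

# The query pair (car, cars) is deliberately kept out of the example set.
example_pairs = [("cat", "cats"), ("dog", "dogs")]
set_based_prediction = offset_from_set(toy_vectors, example_pairs, "car")
```

In this toy data the individual offsets differ from pair to pair, while their average points in the direction shared by the relation, which is the intuition behind using a set of pairs rather than a single one.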
References
- [1] Gladkova, A., Drozd, A., & Matsuoka, S. (2016). Analogy-based detection of morphological and semantic relations with word embeddings: what works and what doesn’t. In Proceedings of the NAACL-HLT SRW (pp. 47–54). San Diego, California, June 12-17, 2016: ACL. Retrieved from https://www.aclweb.org/anthology/N/N16/N16-2002.pdf
- [2] Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient estimation of word representations in vector space. In Proceedings of International Conference on Learning Representations (ICLR).
- [3] Drozd, A., Gladkova, A., & Matsuoka, S. (2016). Word embeddings, analogies, and machine learning: beyond king - man + woman = queen. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers (pp. 3519–3530). Osaka, Japan, December 11-17: ACL. Retrieved from https://www.aclweb.org/anthology/C/C16/C16-1332.pdf
- [4] Linzen, T. (2016). Issues in evaluating semantic spaces using word analogies. In Proceedings of the First Workshop on Evaluating Vector Space Representations for NLP. Association for Computational Linguistics. Retrieved from http://anthology.aclweb.org/W16-2503
- [5] Levy, O., & Goldberg, Y. (2014). Linguistic Regularities in Sparse and Explicit Word Representations. In CoNLL (pp. 171–180). Retrieved from http://anthology.aclweb.org/W/W14/W14-1618.pdf
- [6] Drozd, A., Gladkova, A., & Matsuoka, S. (2016). Word embeddings, analogies, and machine learning: beyond king - man + woman = queen. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers (pp. 3519–3530). Osaka, Japan, December 11-17: ACL. Retrieved from https://www.aclweb.org/anthology/C/C16/C16-1332.pdf