MEN Test Collection (State of the art)
- State of the art on the MEN dataset (Bruni et al., 2014)
- 3000 word pairs: 2000 in the development part of the dataset and 1000 in the test part
- The similarity values in the dataset are the means of judgments made by 50 subjects
- See also: Similarity (State of the art)
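The tables on this page report Spearman's ρ and Pearson's r between an algorithm's similarity scores and the mean human ratings. The following is a minimal sketch of how such correlations can be computed, not the evaluation code used by any of the cited works. The function names and the five-pair data set are hypothetical and serve only as an illustration; tied values receive average ranks, which is one common convention.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson's r between two equal-length sequences of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho, computed as Pearson's r on the rank vectors."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical gold ratings and model scores for five word pairs
# (made-up numbers, not taken from the MEN dataset).
gold = [50.0, 49.0, 10.0, 3.0, 25.0]
model = [0.91, 0.80, 0.35, 0.05, 0.30]

print(round(spearman(gold, model), 3))  # 0.9
print(round(pearson(gold, model), 3))
```

Spearman's ρ depends only on the ordering of the scores, so it is unaffected by the scale on which an algorithm reports similarity, whereas Pearson's r measures how linear the relation to the human ratings is.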
Table of results for the test part of the dataset (1000 word pairs)
- Listed in order of decreasing Spearman's rho.
Algorithm | Reference for algorithm | Reference for reported results | Type | Spearman correlation (ρ) | Pearson correlation (r) |
---|---|---|---|---|---|
Do19-hybrid | Dobó (2019) | Dobó (2019) | Hybrid | 0.867 | 0.866 |
DC19a-hybrid | Dobó and Csirik (2019a) | Dobó and Csirik (2019a) | Hybrid | 0.866 | 0.869 |
Sp17 | Speer et al. (2017) | Dobó (2019) | Hybrid | 0.866 | 0.861 |
Ch18 | Christopoulou et al. (2018) | Christopoulou et al. (2018) | Corpus-based, predictive | 0.84 | - |
Sa18 | Salle et al. (2018) | Dobó (2019) | Corpus-based, distributional | 0.813 | 0.808 |
Pe14 | Pennington et al. (2014) | Dobó (2019) | Corpus-based, distributional | 0.798 | 0.798 |
DC19a-corpus | Dobó and Csirik (2019a) | Dobó and Csirik (2019a) | Corpus-based, distributional | 0.781 | 0.749 |
Br13 | Bruni et al. (2014) | Bruni et al. (2014) | Hybrid | 0.78 | - |
Do19-corpus | Dobó (2019), Dobó and Csirik (2019b) | Dobó (2019), Dobó and Csirik (2019b) | Corpus-based, distributional | 0.705 | 0.709 |
Table of results for the full dataset (3000 word pairs)
- Listed in order of decreasing Spearman's rho.
Algorithm | Reference for algorithm | Reference for reported results | Type | Spearman correlation (ρ) | Pearson correlation (r) |
---|---|---|---|---|---|
DC19a-hybrid | Dobó and Csirik (2019a) | Dobó and Csirik (2019a) | Hybrid | 0.862 | 0.865 |
Sp17 | Speer et al. (2017) | Dobó (2019) | Hybrid | 0.862 | 0.846 |
Do19-hybrid | Dobó (2019) | Dobó (2019) | Hybrid | 0.861 | 0.859 |
Sa18 | Salle et al. (2018) | Dobó (2019) | Corpus-based, distributional | 0.809 | 0.803 |
Pe14 | Pennington et al. (2014) | Dobó (2019) | Corpus-based, distributional | 0.802 | 0.801 |
DC19a-corpus | Dobó and Csirik (2019a) | Dobó and Csirik (2019a) | Corpus-based, distributional | 0.771 | 0.746 |
Do19-corpus | Dobó (2019) | Dobó (2019) | Corpus-based, distributional | 0.702 | 0.707 |
References
- Listed alphabetically.
Bruni, E., Tran, N. K., and Baroni, M. (2014). Multimodal distributional semantics. Journal of Artificial Intelligence Research, 49, pp. 1-47.
Christopoulou, F., Briakou, E., Iosif, E., and Potamianos, A. (2018). Mixture of topic-based distributional semantic and affective models. IEEE ICSC 2018, pp. 203-210. IEEE.
Dobó, A. (2019). A comprehensive analysis of the parameters in the creation and comparison of feature vectors in distributional semantic models for multiple languages. University of Szeged.
Dobó, A., and Csirik, J. (2019a). A comprehensive study of the parameters in the creation and comparison of feature vectors in distributional semantic models. Journal of Quantitative Linguistics, pp. 1-28.
Dobó, A., and Csirik, J. (2019b). Comparison of the best parameter settings in the creation and comparison of feature vectors in distributional semantic models across multiple languages. AIAI 2019: Artificial Intelligence Applications and Innovations, pp. 487-499.
Pennington, J., Socher, R., and Manning, C. (2014). GloVe: Global vectors for word representation. EMNLP 2014, pp. 1532-1543.
Salle, A., Idiart, M., and Villavicencio, A. (2018). LexVec.
Speer, R., Chin, J., and Havasi, C. (2017). ConceptNet 5.5: An open multilingual graph of general knowledge. AAAI-17, pp. 4444-4451.