MEN Test Collection (State of the art)
Latest revision as of 01:05, 6 September 2020
- State of the art on the MEN dataset (Bruni et al., 2014)
- 3000 word pairs: 2000 in the development part of the dataset and 1000 in the test part
- The similarity values in the dataset are the means of judgments made by 50 subjects
- see also: Similarity (State of the art)
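The result tables on this page report Spearman's ρ and Pearson's r between a system's similarity scores and the human judgments in the dataset. The following is a minimal sketch of how these two coefficients can be computed, not the evaluation code of any cited paper. The function names and the toy scores are hypothetical, and the handling of ties (average ranks) is an assumption, since the page does not state how the cited works treat them.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation r of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data (NOT taken from MEN): human scores for five word
# pairs and the similarity scores a system assigned to the same pairs.
human = [45.0, 38.0, 30.0, 12.0, 5.0]
model = [0.81, 0.80, 0.35, 0.40, 0.02]

rho = spearman(human, model)
r = pearson(human, model)
print(f"Spearman rho = {rho:.3f}, Pearson r = {r:.3f}")
```

Spearman's ρ depends only on the ordering of the scores, while Pearson's r also reflects how linearly the raw values relate. This is why a system can rank differently under the two measures.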
Table of results for the test part of the dataset (1000 word pairs)
- Listed in order of decreasing Spearman's rho.
Algorithm | Reference for algorithm | Reference for reported results | Type | Spearman correlation (ρ) | Pearson correlation (r) |
---|---|---|---|---|---|
Do19-hybrid | Dobó (2019) | Dobó (2019) | Hybrid | 0.867 | 0.866 |
DC20-hybrid | Dobó and Csirik (2020) | Dobó and Csirik (2020) | Hybrid | 0.866 | 0.869 |
Sp17 | Speer et al. (2017) | Dobó (2019) | Hybrid | 0.866 | 0.861 |
Ch18 | Christopoulou et al. (2018) | Christopoulou et al. (2018) | Corpus-based, predictive | 0.84 | - |
Sa18 | Salle et al. (2018) | Dobó (2019) | Corpus-based, distributional | 0.813 | 0.808 |
Pe14 | Pennington et al. (2014) | Dobó (2019) | Corpus-based, distributional | 0.798 | 0.798 |
DC20-corpus | Dobó and Csirik (2020) | Dobó and Csirik (2020) | Corpus-based, distributional | 0.781 | 0.749 |
Br13 | Bruni et al. (2014) | Bruni et al. (2014) | Hybrid | 0.78 | - |
Do19-corpus | Dobó (2019), Dobó and Csirik (2019) | Dobó (2019), Dobó and Csirik (2019) | Corpus-based, distributional | 0.705 | 0.709 |
Table of results for the full dataset (3000 word pairs)
- Listed in order of decreasing Spearman's rho.
Algorithm | Reference for algorithm | Reference for reported results | Type | Spearman correlation (ρ) | Pearson correlation (r) |
---|---|---|---|---|---|
DC20-hybrid | Dobó and Csirik (2020) | Dobó and Csirik (2020) | Hybrid | 0.862 | 0.865 |
Sp17 | Speer et al. (2017) | Dobó (2019) | Hybrid | 0.862 | 0.846 |
Do19-hybrid | Dobó (2019) | Dobó (2019) | Hybrid | 0.861 | 0.859 |
Sa18 | Salle et al. (2018) | Dobó (2019) | Corpus-based, distributional | 0.809 | 0.803 |
Pe14 | Pennington et al. (2014) | Dobó (2019) | Corpus-based, distributional | 0.802 | 0.801 |
DC20-corpus | Dobó and Csirik (2020) | Dobó and Csirik (2020) | Corpus-based, distributional | 0.771 | 0.746 |
Do19-corpus | Dobó (2019) | Dobó (2019) | Corpus-based, distributional | 0.702 | 0.707 |
References
- Listed alphabetically.
Bruni, E., Tran, N. K., and Baroni, M. (2014). Multimodal distributional semantics. Journal of Artificial Intelligence Research, 49, pp. 1-47. https://www.jair.org/index.php/jair/article/download/10857/25905/
Christopoulou, F., Briakou, E., Iosif, E., and Potamianos, A. (2018). Mixture of topic-based distributional semantic and affective models. IEEE ICSC 2018, pp. 203-210. IEEE. https://ieeexplore.ieee.org/abstract/document/8334459/
Dobó, A. (2019). A comprehensive analysis of the parameters in the creation and comparison of feature vectors in distributional semantic models for multiple languages. University of Szeged. http://doktori.bibl.u-szeged.hu/10120/1/AndrasDoboThesis2019.pdf GitHub repository: https://github.com/doboandras/dsm-parameter-analysis
Dobó, A., and Csirik, J. (2020). A comprehensive study of the parameters in the creation and comparison of feature vectors in distributional semantic models. Journal of Quantitative Linguistics, 27(3), pp. 244-271. https://doi.org/10.1080/09296174.2019.1570897
Dobó, A., and Csirik, J. (2019). Comparison of the best parameter settings in the creation and comparison of feature vectors in distributional semantic models across multiple languages. AIAI 2019: Artificial Intelligence Applications and Innovations, pp. 487-499. http://www.inf.u-szeged.hu/~dobo/Publications/Comparison%20of%20the%20best%20parameter%20settings%20of%20DSMs%20across%20languages.pdf
Pennington, J., Socher, R., and Manning, C. (2014). GloVe: Global vectors for word representation. EMNLP 2014, pp. 1532-1543. https://www.aclweb.org/anthology/D14-1162
Salle, A., Idiart, M., and Villavicencio, A. (2018). LexVec.
Speer, R., Chin, J., and Havasi, C. (2017). ConceptNet 5.5: An open multilingual graph of general knowledge. AAAI-17, pp. 4444-4451.