RG-65 Test Collection (State of the art)

From ACL Wiki
Revision as of 09:38, 16 October 2013 by Taher (talk | contribs)

State of the art on the Rubenstein & Goodenough (RG-65) dataset

  • 65 word pairs;
  • The similarity of each pair is scored on a scale from 0 to 4 (the higher the "similarity of meaning," the higher the number);
  • The similarity values in the dataset are the means of judgments made by 51 subjects [Rubenstein and Goodenough, 1965].

Table of results

| Algorithm | Reference for algorithm | Reference for reported results | Type | Spearman correlation (ρ) | Pearson correlation (r) |
|---|---|---|---|---|---|
| ADW | Pilehvar et al. (2013) | Pilehvar et al. (2013) | Knowledge-based | 0.868 | 0.810 |
| PPR | Hughes and Ramage (2007) | Hughes and Ramage (2007) | Knowledge-based | 0.838 | - |
| PPR | Agirre et al. (2009) | Agirre et al. (2009) | Knowledge-based | 0.830 | - |
| H&S | Hirst and St-Onge (1998) | Hassan and Mihalcea (2011) | Knowledge-based | 0.813 | 0.732 |
| Roget | Jarmasz (2003) | Hassan and Mihalcea (2011) | Knowledge-based | 0.804 | 0.818 |
| J&C | Jiang and Conrath (1997) | Hassan and Mihalcea (2011) | Knowledge-based | 0.804 | 0.731 |
| WNE | Jarmasz (2003) | Hassan and Mihalcea (2011) | Knowledge-based | 0.801 | 0.787 |
| L&C | Leacock and Chodorow (1998) | Hassan and Mihalcea (2011) | Knowledge-based | 0.797 | 0.852 |
| Lin | Lin (1998) | Hassan and Mihalcea (2011) | Corpus-based | 0.788 | 0.834 |
| ESA* | Gabrilovich and Markovitch (2007) | Hassan and Mihalcea (2011) | Corpus-based | 0.749 | 0.716 |
| SOCPMI* | Islam and Inkpen (2006) | Hassan and Mihalcea (2011) | Corpus-based | 0.741 | 0.729 |
| Resnik | Resnik (1995) | Hassan and Mihalcea (2011) | Knowledge-based | 0.731 | 0.800 |
| WLM | Milne and Witten (2008) | Milne and Witten (2008) | Knowledge-based | 0.640 | - |
| LSA* | Landauer et al. (1997) | Hassan and Mihalcea (2011) | Corpus-based | 0.609 | 0.644 |
| WikiRelate | Strube and Ponzetto (2006) | Strube and Ponzetto (2006) | Knowledge-based | - | 0.530 |

Note: values reported by Hassan and Mihalcea (2011) are "based on the collected raw data from the respective authors"; algorithms marked with an asterisk (*) are re-implementations.
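
The two correlation columns compare a system's similarity scores with the human judgments over the same word pairs. As a minimal sketch of how such figures could be computed, the Python below implements Pearson's r and Spearman's ρ (as Pearson's r over average ranks, so that ties are handled). The function names and the five score pairs are illustrative assumptions; the numbers are made up and are not taken from RG-65 or from any system in the table.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient r between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks in ascending order; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson's r applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up example: mean human ratings on the 0-4 scale and the scores
# of a hypothetical system for the same five word pairs.
human = [0.1, 1.2, 2.5, 3.1, 3.9]
system = [0.05, 0.40, 0.35, 0.70, 0.95]

print("Spearman:", round(spearman(human, system), 3))
print("Pearson: ", round(pearson(human, system), 3))
```

Spearman's ρ depends only on the ordering of the pairs, so it is unaffected by the scale of a system's scores, whereas Pearson's r measures how linearly the scores track the 0 to 4 human ratings. That difference is one reason the two columns can rank systems differently.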

References

  • Herbert Rubenstein and John B. Goodenough. Contextual correlates of synonymy. Communications of the ACM, 8(10):627–633, 1965.
  • Thad Hughes and Daniel Ramage. Lexical Semantic Relatedness with Random Graph Walks. In Proceedings of EMNLP-CoNLL 2007, pages 581–589.
  • Hirst, Graeme and David St-Onge. Lexical chains as representations of context for the detection and correction of malapropisms. In Christiane Fellbaum, editor, WordNet: An Electronic Lexical Database. The MIT Press, Cambridge, MA, pages 305–332, 1998.
  • Jiang, Jay J. and David W. Conrath. Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of International Conference on Research in Computational Linguistics (ROCLING X), Taiwan, pages 19–33, 1997.
  • Leacock, Claudia and Martin Chodorow. Combining local context and WordNet similarity for word sense identification. In Christiane Fellbaum, editor, WordNet: An Electronic Lexical Database. The MIT Press, Cambridge, MA, pages 265–283, 1998.
  • Lin, Dekang. An information-theoretic definition of similarity. In Proceedings of the 15th International Conference on Machine Learning, Madison, WI, pages 296–304, 1998.
  • Resnik, Philip. Using information content to evaluate semantic similarity. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, pages 448–453, Montreal, Canada, 1995.
  • David Milne and Ian H. Witten. An Effective, Low-Cost Measure of Semantic Relatedness Obtained from Wikipedia Links. In Proceedings of AAAI 2008.
  • Landauer, T. K.; Laham, D.; Rehder, B.; and Schreiner, M. E. 1997. How well can passage meaning be derived without using word order? A comparison of latent semantic analysis and humans.
  • Michael Strube and Simone Paolo Ponzetto. WikiRelate! Computing Semantic Relatedness Using Wikipedia. In Proceedings of AAAI 2006, pages 1419–1424.