TOEFL Synonym Questions (State of the art)
Revision as of 19:26, 12 May 2007
- TOEFL = Test of English as a Foreign Language
- 80 multiple-choice synonym questions; 4 choices per question
- TOEFL questions available from Thomas Landauer
- introduced in Landauer and Dumais (1997) as a way of evaluating algorithms for measuring similarity
- subsequently used by many other researchers
- Algorithm = name of algorithm
- Reference for algorithm = where to find out more about given algorithm for measuring similarity
- Reference for experiment = where to find out more about evaluation of given algorithm with TOEFL questions
- Type = general type of algorithm: corpus-based, lexicon-based, or hybrid (the human baseline row is marked "human")
- Correct = percent of 80 questions that given algorithm answered correctly
- 95% confidence = confidence interval calculated using Binomial Exact Test
- table rows sorted in order of increasing percent correct
- several WordNet-based similarity measures are implemented in Ted Pedersen's WordNet::Similarity package
- LSA = Latent Semantic Analysis
- PMI-IR = Pointwise Mutual Information - Information Retrieval
- PR = Product Rule
Algorithm | Reference for algorithm | Reference for experiment | Type | Correct | 95% confidence |
---|---|---|---|---|---|
RES | Resnik (1995) | Jarmasz and Szpakowicz (2003) | hybrid | 20.31% | 12.89–31.83% |
LC | Leacock and Chodorow (1998) | Jarmasz and Szpakowicz (2003) | lexicon-based | 21.88% | 13.91–33.21% |
LIN | Lin (1998) | Jarmasz and Szpakowicz (2003) | hybrid | 24.06% | 15.99–35.94% |
JC | Jiang and Conrath (1997) | Jarmasz and Szpakowicz (2003) | hybrid | 25.00% | 15.99–35.94% |
LSA | Landauer and Dumais (1997) | Landauer and Dumais (1997) | corpus-based | 64.38% | 52.90–74.80% |
Average non-English US college applicant | | Landauer and Dumais (1997) | human | 64.50% | 53.01–74.88% |
PMI-IR | Turney (2001) | Turney (2001) | corpus-based | 73.75% | 62.71–82.96% |
HSO | Hirst and St-Onge (1998) | Jarmasz and Szpakowicz (2003) | lexicon-based | 77.91% | 68.17–87.11% |
JS | Jarmasz and Szpakowicz (2003) | Jarmasz and Szpakowicz (2003) | lexicon-based | 78.75% | 68.17–87.11% |
PMI-IR | Terra and Clarke (2003) | Terra and Clarke (2003) | corpus-based | 81.25% | 70.97–89.11% |
LSA | Rapp (2003) | Rapp (2003) | corpus-based | 92.50% | 84.39–97.20% |
PR | Turney et al. (2003) | Turney et al. (2003) | hybrid | 97.50% | 91.26–99.70% |
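The "95% confidence" column can be approximately reproduced from the number of correct answers. The following is a minimal sketch, assuming that "Binomial Exact Test" refers to the Clopper-Pearson exact interval. The function names `binom_cdf` and `exact_binomial_ci` are illustrative and not taken from any of the cited papers. The sketch only covers rows whose percentage corresponds to a whole number of questions out of 80 (for example 59, 65, 74, or 78 correct).

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def exact_binomial_ci(k, n, conf=0.95):
    """Clopper-Pearson interval for k successes in n trials (assumed method)."""
    alpha = 1.0 - conf

    def solve(g, target):
        # g is increasing in p on [0, 1]; bisect for g(p) == target.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2.0
            if g(mid) < target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2.0

    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else solve(
        lambda p: 1.0 - binom_cdf(k - 1, n, p), alpha / 2.0
    )
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else solve(
        lambda p: 1.0 - binom_cdf(k, n, p), 1.0 - alpha / 2.0
    )
    return lower, upper


# Whole-number scores out of 80 that appear in the table.
for correct in (59, 65, 74, 78):
    low, high = exact_binomial_ci(correct, 80)
    print(f"{correct}/80 = {correct / 80:.2%}: {low:.2%} to {high:.2%}")
```

The bisection avoids any dependency beyond the standard library. Some rows report percentages that do not correspond to a whole number of questions (for example 20.31%), and this sketch does not attempt to cover them.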
Hirst, G., and St-Onge, D. (1998). Lexical chains as representation of context for the detection and correction of malapropisms. In C. Fellbaum (ed.), WordNet: An Electronic Lexical Database. Cambridge: MIT Press, pp. 305-332.
Jarmasz, M., and Szpakowicz, S. (2003). Roget’s thesaurus and semantic similarity. Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP-03), Borovets, Bulgaria, September, pp. 212-219.
Jiang, J.J., and Conrath, D.W. (1997). Semantic similarity based on corpus statistics and lexical taxonomy. Proceedings of the International Conference on Research in Computational Linguistics, Taiwan.
Landauer, T.K., and Dumais, S.T. (1997). A solution to Plato's problem: The latent semantic analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review, 104(2):211–240.
Leacock, C., and Chodorow, M. (1998). Combining local context and WordNet similarity for word sense identification. In C. Fellbaum (ed.), WordNet: An Electronic Lexical Database. Cambridge: MIT Press, pp. 265-283.
Lin, D. (1998). An information-theoretic definition of similarity. Proceedings of the 15th International Conference on Machine Learning (ICML-98), Madison, WI, pp. 296-304.
Rapp, R. (2003). Word sense discovery based on sense descriptor dissimilarity. Proceedings of the Ninth Machine Translation Summit, pp. 315-322.
Resnik, P. (1995). Using information content to evaluate semantic similarity. Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI-95), Montreal, pp. 448-453.
Terra, E., and Clarke, C.L.A. (2003). Frequency estimates for statistical word similarity measures. Proceedings of the Human Language Technology and North American Chapter of Association of Computational Linguistics Conference 2003 (HLT/NAACL 2003), pp. 244–251.
Turney, P.D. (2001). Mining the Web for synonyms: PMI-IR versus LSA on TOEFL. Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001), Freiburg, Germany, pp. 491-502.
Turney, P.D., Littman, M.L., Bigham, J., and Shnayder, V. (2003). Combining independent modules to solve multiple-choice synonym and analogy problems. Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP-03), Borovets, Bulgaria, pp. 482-489.