TOEFL Synonym Questions (State of the art)


TOEFL = Test of English as a Foreign Language
80 multiple-choice synonym questions; 4 choices per question
TOEFL questions available from Thomas Landauer
introduced in Landauer and Dumais (1997) as a way of evaluating algorithms for measuring similarity
subsequently used by many other researchers
Reference for algorithm = where to find out more about given algorithm for measuring similarity
Reference for experiment = where to find out more about evaluation of given algorithm with TOEFL questions
Algorithm = general type of algorithm: corpus-based, lexicon-based, hybrid
Correct = percent of 80 questions that given algorithm answered correctly
95% confidence = 95% confidence interval for the percent correct, calculated using the Binomial Exact Test
table rows sorted in order of increasing percent correct

Reference for algorithm	Reference for experiment	Algorithm	Correct	95% confidence
Resnik (1995)	Jarmasz and Szpakowicz (2003)	hybrid	20.31%	12.89–31.83%
Leacock and Chodrow (1998)	Jarmasz and Szpakowicz (2003)	lexicon-based	21.88%	13.91–33.21%
Lin (1998)	Jarmasz and Szpakowicz (2003)	hybrid	24.06%	15.99–35.94%
Jiang and Conrath (1997)	Jarmasz and Szpakowicz (2003)	hybrid	25.00%	15.99–35.94%
Landauer and Dumais (1997)	Landauer and Dumais (1997)	corpus-based	64.38%	52.90–74.80%
Average non-English US college applicant	Landauer and Dumais (1997)	human	64.50%	53.01–74.88%
Turney (2001)	Turney (2001)	corpus-based	73.75%	62.71–82.96%
Hirst and St.-Onge (1998)	Jarmasz and Szpakowicz (2003)	lexicon-based	77.91%	68.17–87.11%
Jarmasz and Szpakowicz (2003)	Jarmasz and Szpakowicz (2003)	lexicon-based	78.75%	68.17–87.11%
Terra and Clarke (2003)	Terra and Clarke (2003)	corpus-based	81.25%	70.97–89.11%
Rapp (2003)	Rapp (2003)	corpus-based	92.50%	84.39–97.20%
Turney et al. (2003)	Turney et al. (2003)	hybrid	97.50%	91.26–99.70%
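
The confidence column can be approximately reproduced from the number of correct answers. The sketch below assumes that "Binomial Exact Test" refers to the Clopper-Pearson exact interval, which the page does not state explicitly. The function names (binom_cdf, solve_decreasing, exact_interval) are illustrative and not taken from any of the cited papers. It uses only the Python standard library and finds each bound by bisection on the binomial distribution function.

```python
from math import comb


def binom_cdf(k, n, p):
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def solve_decreasing(f, target, tol=1e-10):
    """Find p in [0, 1] with f(p) = target, where f is decreasing in p."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid  # f is still too large, so the root lies to the right
        else:
            hi = mid
    return (lo + hi) / 2


def exact_interval(k, n, alpha=0.05):
    """Clopper-Pearson interval for k successes in n trials (assumed method)."""
    # Lower bound: the p at which P(X >= k) equals alpha / 2,
    # i.e. P(X <= k - 1) equals 1 - alpha / 2.
    if k == 0:
        lower = 0.0
    else:
        lower = solve_decreasing(lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    if k == n:
        upper = 1.0
    else:
        upper = solve_decreasing(lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


if __name__ == "__main__":
    # 97.50% correct corresponds to 78 of the 80 questions.
    low, high = exact_interval(78, 80)
    print(f"{100 * low:.2f}-{100 * high:.2f}%")
```

For 78 correct answers out of 80 (the 97.50% row), the result should land close to the 91.26–99.70% interval listed in the table. Rows whose percentage does not correspond to a whole number of questions cannot be checked this way without knowing how the fractional score was handled.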

Jarmasz, M., and Szpakowicz, S. (2003). Roget’s thesaurus and semantic similarity. Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP-03), Borovets, Bulgaria, September, pp. 212–219.
