Difference between revisions of "SAT Analogy Questions (State of the art)"

Revision as of 07:16, 2 July 2013

SAT = Scholastic Aptitude Test
374 multiple-choice analogy questions; 5 choices per question
SAT questions collected by Michael Littman, available on request from Peter Turney
introduced in Turney et al. (2003) as a way of evaluating algorithms for measuring relational similarity

Sample question

Stem:		mason:stone
Choices:	(a)	teacher:chalk
	(b)	carpenter:wood
	(c)	soldier:gun
	(d)	photograph:camera
	(e)	book:word
Solution:	(b)	carpenter:wood
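
A minimal sketch of how a relational-similarity measure could be scored on a question like the one above: compute the similarity between the stem pair and each choice pair, pick the highest-scoring choice, and count how often that matches the solution. The names AnalogyQuestion, answer, accuracy, TOY_LABELS and toy_rel_sim are hypothetical and do not come from any of the cited papers; the toy similarity function is a hand-labeled placeholder, not a real algorithm from the table.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Pair = Tuple[str, str]


@dataclass
class AnalogyQuestion:
    stem: Pair
    choices: List[Pair]
    solution: int  # index of the correct pair in `choices`


def answer(question: AnalogyQuestion,
           rel_sim: Callable[[Pair, Pair], float]) -> int:
    """Return the index of the choice most similar to the stem."""
    scores = [rel_sim(question.stem, choice) for choice in question.choices]
    return max(range(len(scores)), key=lambda i: scores[i])


def accuracy(questions: List[AnalogyQuestion],
             rel_sim: Callable[[Pair, Pair], float]) -> float:
    """Fraction of questions answered correctly."""
    correct = sum(answer(q, rel_sim) == q.solution for q in questions)
    return correct / len(questions)


# ASSUMPTION: hand-written relation labels, for illustration only.
TOY_LABELS: Dict[Pair, str] = {
    ("mason", "stone"): "worker:material",
    ("carpenter", "wood"): "worker:material",
}


def toy_rel_sim(a: Pair, b: Pair) -> float:
    """Placeholder similarity: 1.0 if both pairs share a label, else 0.0."""
    label_a, label_b = TOY_LABELS.get(a), TOY_LABELS.get(b)
    return 1.0 if label_a is not None and label_a == label_b else 0.0


sample = AnalogyQuestion(
    stem=("mason", "stone"),
    choices=[("teacher", "chalk"), ("carpenter", "wood"), ("soldier", "gun"),
             ("photograph", "camera"), ("book", "word")],
    solution=1,  # choice (b)
)

print("chosen:", "abcde"[answer(sample, toy_rel_sim)])
print("accuracy:", accuracy([sample], toy_rel_sim))
```

A real measure would replace toy_rel_sim with a corpus-based or lexicon-based score; the surrounding scoring loop would stay the same.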

Table of results

Algorithm	Reference for algorithm	Reference for experiment	Type	Correct	95% confidence
Random	Random guessing	1 / 5 = 20.0%	Random	20.0%	16.1-24.5%
JC	Jiang and Conrath (1997)	Turney (2006b)	Hybrid	27.3%	23.1-32.4%
LIN	Lin (1998)	Turney (2006b)	Hybrid	27.3%	23.1-32.4%
LC	Leacock and Chodorow (1998)	Turney (2006b)	Lexicon-based	31.3%	26.9-36.5%
HSO	Hirst and St-Onge (1998)	Turney (2006b)	Lexicon-based	32.1%	27.6-37.4%
RES	Resnik (1995)	Turney (2006b)	Hybrid	33.2%	28.7-38.5%
PMI-IR	Turney (2001)	Turney (2006b)	Corpus-based	35.0%	30.2-40.1%
LSA+Predication	Mangalath et al. (2004)	Mangalath et al. (2004)	Corpus-based	42.0%	37.2-47.4%
KNOW-BEST	Veale (2004)	Veale (2004)	Lexicon-based	43.0%	38.0-48.2%
k-means	Bicici and Yuret (2006)	Bicici and Yuret (2006)	Corpus-based	44.0%	39.0-49.3%
BagPack	Herdağdelen and Baroni (2009)	Herdağdelen and Baroni (2009)	Corpus-based	44.1%	39.0-49.3%
VSM	Turney and Littman (2005)	Turney and Littman (2005)	Corpus-based	47.1%	42.2-52.5%
Dual-Space	Turney (2012)	Turney (2012)	Corpus-based	51.1%	46.1-56.5%
BMI	Bollegala et al. (2009)	Bollegala et al. (2009)	Corpus-based	51.1%	46.1-56.5%
PairClass	Turney (2008)	Turney (2008)	Corpus-based	52.1%	46.9-57.3%
PERT	Turney (2006a)	Turney (2006a)	Corpus-based	53.5%	48.5-58.9%
LRA	Turney (2006b)	Turney (2006b)	Corpus-based	56.1%	51.0-61.2%
Human	Average US college applicant	Turney and Littman (2005)	Human	57.0%	52.0-62.3%
Human Voting	Lofi (2013)	Lofi (2013)	Human Voting	81.5%	77.2-85.4%

Explanation of table

Algorithm = name of algorithm
Reference for algorithm = where to find out more about given algorithm
Reference for experiment = where to find out more about evaluation of given algorithm with SAT questions
Type = general type of algorithm: corpus-based, lexicon-based, or hybrid (the Random, Human, and Human Voting rows are labeled with their own types)
Correct = percent of 374 questions that given algorithm answered correctly
95% confidence = confidence interval calculated using the Binomial Exact Test
table rows sorted in order of increasing percent correct
several WordNet-based similarity measures are implemented in Ted Pedersen's WordNet::Similarity package
KNOW-BEST = KNOWledge-Based Entertainment and Scholastic Testing
VSM = Vector Space Model
LRA = Latent Relational Analysis
PERT = Pertinence
PMI-IR = Pointwise Mutual Information - Information Retrieval
LSA+Predication = Latent Semantic Analysis + Predication
BagPack = Bag of words representation of Paired concept knowledge
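
A minimal sketch of how a 95% confidence interval like those in the table could be computed, assuming that "Binomial Exact Test" refers to the Clopper-Pearson exact interval (the page does not say which construction was used). The function names are hypothetical, and the counts of 210 and 75 correct answers are assumptions back-calculated from 56.1% and 20.0% of 374 questions.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))


def clopper_pearson(k: int, n: int, alpha: float = 0.05):
    """Exact two-sided (1 - alpha) interval for a binomial proportion."""

    def solve(f, target: float) -> float:
        # f is decreasing in p on [0, 1]; bisect for f(p) == target.
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if f(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else solve(
        lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    # Upper bound: p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else solve(
        lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


# ASSUMPTION: 210 of 374 correct, inferred from the 56.1% reported for LRA.
low, high = clopper_pearson(210, 374)
print(f"LRA: {100 * low:.1f}-{100 * high:.1f}%")

# ASSUMPTION: 75 of 374 correct, inferred from the 20.0% random baseline.
rand_low, rand_high = clopper_pearson(75, 374)
print(f"Random: {100 * rand_low:.1f}-{100 * rand_high:.1f}%")
```

If the assumption about the interval construction holds, the printed ranges should land close to the corresponding table rows; small differences would point to a different method or a different underlying count.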

References

Bicici, E., and Yuret, D. (2006). Clustering word pairs to answer analogy questions. Proceedings of the Fifteenth Turkish Symposium on Artificial Intelligence and Neural Networks (TAINN 2006).

Bollegala, D., Matsuo, Y., and Ishizuka, M. (2009). Measuring the similarity between implicit semantic relations from the web. Proceedings of the 18th International Conference on World Wide Web, ACM, pp. 651-660.

Herdağdelen, A., and Baroni, M. (2009). BagPack: A general framework to represent semantic relations. Proceedings of the EACL 2009 Geometrical Models for Natural Language Semantics (GEMS) Workshop, East Stroudsburg PA: ACL, pp. 33-40.

Hirst, G., and St-Onge, D. (1998). Lexical chains as representation of context for the detection and correction of malapropisms. In C. Fellbaum (ed.), WordNet: An Electronic Lexical Database. Cambridge: MIT Press, pp. 305-332.

Jiang, J.J., and Conrath, D.W. (1997). Semantic similarity based on corpus statistics and lexical taxonomy. Proceedings of the International Conference on Research in Computational Linguistics, Taiwan.

Leacock, C., and Chodorow, M. (1998). Combining local context and WordNet similarity for word sense identification. In C. Fellbaum (ed.), WordNet: An Electronic Lexical Database. Cambridge: MIT Press, pp. 265-283.

Lin, D. (1998). An information-theoretic definition of similarity. Proceedings of the 15th International Conference on Machine Learning (ICML-98), Madison, WI, pp. 296-304.

Lofi, C. (2013). Just ask a human?--Controlling Quality in Relational Similarity and Analogy Processing using the Crowd. Proceedings of the Workshop of the 15th BTW Conference on Database Systems for Business, Technology, and Web (BTW 2013), Magdeburg, Germany, pp. 197-210.

Mangalath, P., Quesada, J., and Kintsch, W. (2004). Analogy-making as predication using relational information and LSA vectors. In K.D. Forbus, D. Gentner & T. Regier (Eds.), Proceedings of the 26th Annual Meeting of the Cognitive Science Society. Chicago: Lawrence Erlbaum Associates.

Resnik, P. (1995). Using information content to evaluate semantic similarity. Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI-95), Montreal, pp. 448-453.

Turney, P.D., Littman, M.L., Bigham, J., and Shnayder, V. (2003). Combining independent modules to solve multiple-choice synonym and analogy problems. Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP-03), Borovets, Bulgaria, pp. 482-489.

Turney, P.D., and Littman, M.L. (2005). Corpus-based learning of analogies and semantic relations. Machine Learning, 60 (1-3), 251-278.

Turney, P.D. (2001). Mining the Web for synonyms: PMI-IR versus LSA on TOEFL. Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001), Freiburg, Germany, pp. 491-502.

Turney, P.D. (2006a). Expressing implicit semantic relations without supervision. Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics (Coling/ACL-06), Sydney, Australia, pp. 313-320.

Turney, P.D. (2006b). Similarity of semantic relations. Computational Linguistics, 32 (3), 379-416.

Turney, P.D. (2008). A uniform approach to analogies, synonyms, antonyms, and associations. Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008), Manchester, UK, pp. 905-912.

Turney, P.D. (2012). Domain and function: A dual-space model of semantic relations and compositions. Journal of Artificial Intelligence Research (JAIR), 44, 533-585.

Veale, T. (2004). WordNet sits the SAT: A knowledge-based approach to lexical analogy. Proceedings of the 16th European Conference on Artificial Intelligence (ECAI 2004), Valencia, Spain, pp. 606-612.

@@ Line 167: / Line 167: @@
 | Corpus-based
 | 56.1%
-| 51.0–61.2%
+| 51.0-61.2%
 |-
 | Human
@@ Line 176: / Line 176: @@
 | 52.0-62.3%
 |-
+| Human Voting
+| Lofi (2013)
+| Lofi (2013)
+| Human Voting
+| 81.5%
+| 77.2-85.4
 |}
@@ Line 211: / Line 217: @@
 Lin, D. (1998). [http://www.cs.ualberta.ca/~lindek/papers/sim.pdf An information-theoretic definition of similarity]. ''Proceedings of the 15th International Conference on Machine Learning (ICML-98)'', Madison, WI, pp. 296-304.
+Lofi, C. (2013). [http://www.btw-2013.de/proceedings/Just%20ask%20a%20human%20%20Controlling%20Quality%20in%20Relational%20Similarity%20and%20Analogy%20Processing%20using%20the%20Crowd.pdf Just ask a human?--Controlling Quality in Relational Similarity and Analogy Processing using the Crowd]. ''Proceedings of the Workshop of the 15th BTW Conference on Database Systems for Business, Technology, and Web (BTW 2013)'', Magdeburg, Germany, pp. 197-210.
 Mangalath, P., Quesada, J., and Kintsch, W. (2004). [http://www.josequesada.name/papers/Mangalath-Quesada-2004-analogyPredicationCogSciPoster1.pdf Analogy-making as predication using relational information and LSA vectors]. In K.D. Forbus, D. Gentner & T. Regier (Eds.), ''Proceedings of the 26th Annual Meeting of the Cognitive Science Society''. Chicago: Lawrence Erlbaum Associates.
