TOEFL Synonym Questions (State of the art)

From ACL Wiki
Revision as of 05:19, 25 June 2012

  • TOEFL = Test of English as a Foreign Language
  • 80 multiple-choice synonym questions; 4 choices per question
  • the TOEFL questions are available on request by contacting LSA Support at CU Boulder, the people who manage the LSA web site at Colorado
  • introduced in Landauer and Dumais (1997) as a way of evaluating algorithms for measuring degree of similarity between words
  • subsequently used by many other researchers


Sample question

Stem: levied
Choices: (a) imposed
(b) believed
(c) requested
(d) correlated
Solution: (a) imposed
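
As a rough illustration of how a similarity measure is scored on questions like the one above, the sketch below picks the choice with the highest similarity to the stem and reports the percentage answered correctly. The function names and the toy similarity scores are hypothetical, invented only for this example; they do not come from any of the systems listed on this page.

```python
# Minimal sketch: answer a multiple-choice synonym question by choosing
# the candidate most similar to the stem. All scores are made up.

def answer_question(stem, choices, similarity):
    """Return the choice with the highest similarity to the stem."""
    return max(choices, key=lambda choice: similarity(stem, choice))


def percent_correct(questions, similarity):
    """Percentage of (stem, choices, solution) triples answered correctly."""
    hits = sum(
        1
        for stem, choices, solution in questions
        if answer_question(stem, choices, similarity) == solution
    )
    return 100.0 * hits / len(questions)


# Hypothetical similarity scores for the sample question only.
TOY_SCORES = {
    ("levied", "imposed"): 0.81,
    ("levied", "believed"): 0.12,
    ("levied", "requested"): 0.35,
    ("levied", "correlated"): 0.05,
}


def toy_similarity(word_a, word_b):
    """Look up a made-up score; unknown pairs get 0.0."""
    return TOY_SCORES.get((word_a, word_b), 0.0)


sample = ("levied", ["imposed", "believed", "requested", "correlated"], "imposed")
chosen = answer_question(sample[0], sample[1], toy_similarity)
score = percent_correct([sample], toy_similarity)
```

Any real measure (corpus-based, lexicon-based, or hybrid) would take the place of `toy_similarity`; the selection rule itself is an assumption about the usual setup, and individual papers may handle ties or unknown words differently.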


Table of results

| Algorithm | Reference for algorithm | Reference for experiment | Type | Correct | 95% confidence |
|---|---|---|---|---|---|
| RES | Resnik (1995) | Jarmasz and Szpakowicz (2003) | Hybrid | 20.31% | 12.89–31.83% |
| LC | Leacock and Chodorow (1998) | Jarmasz and Szpakowicz (2003) | Lexicon-based | 21.88% | 13.91–33.21% |
| LIN | Lin (1998) | Jarmasz and Szpakowicz (2003) | Hybrid | 24.06% | 15.99–35.94% |
| Random | Random guessing | 1 / 4 = 25.00% | Random | 25.00% | 15.99–35.94% |
| JC | Jiang and Conrath (1997) | Jarmasz and Szpakowicz (2003) | Hybrid | 25.00% | 15.99–35.94% |
| LSA | Landauer and Dumais (1997) | Landauer and Dumais (1997) | Corpus-based | 64.38% | 52.90–74.80% |
| Human | Average non-English US college applicant | Landauer and Dumais (1997) | Human | 64.50% | 53.01–74.88% |
| DS | Pado and Lapata (2007) | Pado and Lapata (2007) | Corpus-based | 73.00% | 62.72–82.96% |
| PMI-IR | Turney (2001) | Turney (2001) | Corpus-based | 73.75% | 62.72–82.96% |
| PairClass | Turney (2008) | Turney (2008) | Corpus-based | 76.25% | 65.42–85.06% |
| HSO | Hirst and St-Onge (1998) | Jarmasz and Szpakowicz (2003) | Lexicon-based | 77.91% | 68.17–87.11% |
| JS | Jarmasz and Szpakowicz (2003) | Jarmasz and Szpakowicz (2003) | Lexicon-based | 78.75% | 68.17–87.11% |
| PMI-IR | Terra and Clarke (2003) | Terra and Clarke (2003) | Corpus-based | 81.25% | 70.97–89.11% |
| CWO | Ruiz-Casado et al. (2005) | Ruiz-Casado et al. (2005) | Web-based | 82.55% | 72.38–90.09% |
| PPMIC | Bullinaria and Levy (2006) | Bullinaria and Levy (2006) | Corpus-based | 85.00% | 75.26–92.00% |
| GLSA | Matveeva et al. (2005) | Matveeva et al. (2005) | Corpus-based | 86.25% | 76.73–92.93% |
| LSA | Rapp (2003) | Rapp (2003) | Corpus-based | 92.50% | 84.39–97.20% |
| PR | Turney et al. (2003) | Turney et al. (2003) | Hybrid | 97.50% | 91.26–99.70% |


Explanation of table

  • Algorithm = name of algorithm
  • Reference for algorithm = where to find out more about given algorithm
  • Reference for experiment = where to find out more about evaluation of given algorithm with TOEFL questions
  • Type = general type of algorithm: corpus-based, lexicon-based, hybrid (the table also uses Web-based, Random, and Human)
  • Correct = percent of 80 questions that given algorithm answered correctly
  • 95% confidence = confidence interval calculated using Binomial Exact Test
  • table rows sorted in order of increasing percent correct
  • several WordNet-based similarity measures are implemented in Ted Pedersen's WordNet::Similarity package
  • LSA = Latent Semantic Analysis
  • PMI-IR = Pointwise Mutual Information - Information Retrieval
  • PR = Product Rule
  • PPMIC = Positive Pointwise Mutual Information with Cosine
  • GLSA = Generalized Latent Semantic Analysis
  • CWO = Context Window Overlapping
  • DS = Dependency Space
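
The "95% confidence" column can be approximated with an exact (Clopper-Pearson) binomial interval. The sketch below is one way to compute it using only the standard library; it assumes that the Binomial Exact Test mentioned above corresponds to the Clopper-Pearson construction, and the function names are invented for this example.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_decreasing(f, target, iterations=100):
    """Bisection for f(p) = target on [0, 1], where f decreases in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid  # p is still too small
        else:
            hi = mid
    return (lo + hi) / 2


def exact_binomial_interval(k, n, alpha=0.05):
    """Clopper-Pearson interval for k successes out of n trials."""
    # Lower bound: P(X >= k | p) = alpha / 2, i.e. cdf(k - 1) = 1 - alpha / 2.
    lower = 0.0 if k == 0 else _solve_decreasing(
        lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2
    )
    # Upper bound: P(X <= k | p) = alpha / 2.
    upper = 1.0 if k == n else _solve_decreasing(
        lambda p: binom_cdf(k, n, p), alpha / 2
    )
    return lower, upper


# 97.50% correct corresponds to 78 of the 80 questions.
low, high = exact_binomial_interval(78, 80)
```

For 78 correct out of 80, the bounds come out close to the 91.26–99.70% shown for the PR row. Rows whose percentages are not multiples of 1/80 may have been computed from a different number of questions, so they may not be reproducible this way.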

Caveats

  • the performance of a corpus-based algorithm depends on the corpus, so the difference in performance between two corpus-based systems may be due to the different corpora, rather than the different algorithms
  • the TOEFL questions include nouns, verbs, and adjectives, but some of the WordNet-based algorithms were only designed to work with nouns
  • some of the algorithms may have been tuned on the TOEFL questions


References

Bullinaria, J.A., and Levy, J.P. (2006). Extracting semantic representations from word co-occurrence statistics: A computational study. To appear in Behavior Research Methods, 38.

Hirst, G., and St-Onge, D. (1998). Lexical chains as representation of context for the detection and correction of malapropisms. In C. Fellbaum (ed.), WordNet: An Electronic Lexical Database. Cambridge: MIT Press, pp. 305-332.

Jarmasz, M., and Szpakowicz, S. (2003). Roget’s thesaurus and semantic similarity. Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP-03), Borovets, Bulgaria, September, pp. 212-219.

Jiang, J.J., and Conrath, D.W. (1997). Semantic similarity based on corpus statistics and lexical taxonomy. Proceedings of the International Conference on Research in Computational Linguistics, Taiwan.

Landauer, T.K., and Dumais, S.T. (1997). A solution to Plato's problem: The latent semantic analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review, 104(2):211–240.

Leacock, C., and Chodorow, M. (1998). Combining local context and WordNet similarity for word sense identification. In C. Fellbaum (ed.), WordNet: An Electronic Lexical Database. Cambridge: MIT Press, pp. 265-283.

Lin, D. (1998). An information-theoretic definition of similarity. Proceedings of the 15th International Conference on Machine Learning (ICML-98), Madison, WI, pp. 296-304.

Matveeva, I., Levow, G., Farahat, A., and Royer, C. (2005). Generalized latent semantic analysis for term representation. Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP-05), Borovets, Bulgaria.

Pado, S., and Lapata, M. (2007). Dependency-based construction of semantic space models. Computational Linguistics, 33(2), 161-199.

Rapp, R. (2003). Word sense discovery based on sense descriptor dissimilarity. Proceedings of the Ninth Machine Translation Summit, pp. 315-322.

Resnik, P. (1995). Using information content to evaluate semantic similarity. Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI-95), Montreal, pp. 448-453.

Ruiz-Casado, M., Alfonseca, E., and Castells, P. (2005). Using context-window overlapping in synonym discovery and ontology extension. Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP-2005), Borovets, Bulgaria.

Terra, E., and Clarke, C.L.A. (2003). Frequency estimates for statistical word similarity measures. Proceedings of the Human Language Technology and North American Chapter of Association of Computational Linguistics Conference 2003 (HLT/NAACL 2003), pp. 244–251.

Turney, P.D. (2001). Mining the Web for synonyms: PMI-IR versus LSA on TOEFL. Proceedings of the Twelfth European Conference on Machine Learning (ECML-2001), Freiburg, Germany, pp. 491-502.

Turney, P.D., Littman, M.L., Bigham, J., and Shnayder, V. (2003). Combining independent modules to solve multiple-choice synonym and analogy problems. Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP-03), Borovets, Bulgaria, pp. 482-489.

Turney, P.D. (2008). A uniform approach to analogies, synonyms, antonyms, and associations. Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008), Manchester, UK, pp. 905-912.
