Annotated Bibliography on Statistical Semantics

From ACL Wiki

Delavenay, E. (1960). An Introduction to Machine Translation, New York, NY: Thames and Hudson.

  • This book contains one of the earliest definitions of the term statistical semantics, as "statistical study of meanings of words and their frequency and order of recurrence".

Firth, J.R. (1957). A synopsis of linguistic theory 1930–1955. In Studies in Linguistic Analysis, pp. 1–32. Oxford: Philological Society. Reprinted in F.R. Palmer (ed.), Selected Papers of J.R. Firth 1952–1959, London: Longman (1968).

  • Firth wrote, "You shall know a word by the company it keeps", the idea that a word is characterized by the words it co-occurs with.

Frank, E., Paynter, G.W., Witten, I.H., Gutwin, C., and Nevill-Manning, C.G. (1999). Domain-specific keyphrase extraction. In Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence (IJCAI-99), pp. 668–673. California: Morgan Kaufmann.

  • A Naive Bayes supervised learning algorithm is used to extract important words and phrases from documents. The features that characterize keyphrases include early occurrence and relatively high frequency in the given document.
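
The two features named in this annotation can be illustrated with a minimal sketch. It only computes the features for one candidate phrase; it is not the Naive Bayes classifier from the paper, and the function name, tokenizer, and normalization by document length are assumptions made for illustration.

```python
import re


def keyphrase_features(document, phrase):
    """Return (relative first position, relative frequency) of a phrase.

    Both values are normalized by the document length in tokens.
    Returns None when the phrase does not occur in the document.
    """
    # Crude tokenizer (an assumption): lowercase alphanumeric runs.
    tokens = re.findall(r"[a-z0-9]+", document.lower())
    target = re.findall(r"[a-z0-9]+", phrase.lower())
    n = len(target)
    if not tokens or n == 0:
        return None
    # Every token offset at which the whole phrase starts.
    positions = [
        i for i in range(len(tokens) - n + 1) if tokens[i:i + n] == target
    ]
    if not positions:
        return None
    first_occurrence = positions[0] / len(tokens)  # small value = early
    frequency = len(positions) / len(tokens)       # large value = frequent
    return first_occurrence, frequency
```

A supervised learner would be trained on feature pairs like these, computed for candidate phrases labelled as keyphrases or non-keyphrases.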

Furnas, G.W., Landauer, T.K., Gomez, L.M., and Dumais, S.T. (1983). Statistical semantics: Analysis of the potential performance of keyword information systems. Bell System Technical Journal, 62(6):1753–1806.

  • A foundational paper on statistical semantics.

Hearst, M.A. (1992). Automatic acquisition of hyponyms from large text corpora. In Proceedings of the Fourteenth International Conference on Computational Linguistics, pp. 539–545, Nantes, France.

  • A widely cited paper that shows how simple lexico-syntactic patterns (for example, "X such as Y") can be used to mine text for semantic relations, such as hyponymy.
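
As a rough illustration of pattern-based mining, the sketch below applies a single "such as" pattern with a regular expression. It is a simplification under stated assumptions: single words stand in for noun phrases (a real system would use a noun-phrase chunker), and the regular expression and function name are illustrative, not taken from the paper.

```python
import re

# "HYPERNYM such as A, B and C" with single words in place of noun phrases.
# The lookahead keeps "and"/"or" from being consumed as a list item.
SUCH_AS = re.compile(
    r"(\w+),? such as (\w+(?:, (?!and |or )\w+)*(?:,? (?:and|or) \w+)?)"
)


def extract_hyponyms(text):
    """Return (hyponym, hypernym) pairs found by the 'such as' pattern."""
    pairs = []
    for match in SUCH_AS.finditer(text):
        hypernym = match.group(1)
        # Split the captured list on ", ", " and ", " or ", ", and ", ", or ".
        for hyponym in re.split(r",? (?:and|or) |, ", match.group(2)):
            pairs.append((hyponym, hypernym))
    return pairs
```

For the sentence "He studies languages such as French, German and Dutch.", the function pairs each of the three language names with "languages".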

Landauer, T.K., and Dumais, S.T. (1997). A solution to Plato's problem: The latent semantic analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review, 104(2):211–240.

  • Latent Semantic Analysis (LSA) is evaluated as a measure of word similarity, using multiple-choice synonym questions from the Test of English as a Foreign Language (TOEFL). LSA achieves a score of 64.4%, almost the same as the 64.5% average of non-English-speaking applicants to US colleges.

Lund, K., Burgess, C., and Atchley, R.A. (1995). Semantic and associative priming in high-dimensional semantic space. In Proceedings of the 17th Annual Conference of the Cognitive Science Society, pp. 660–665.

  • A statistical semantics measure of word similarity, from the perspective of cognitive science.

Pantel, P., and Lin, D. (2002). Discovering word senses from text. In Proceedings of ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 613–619.

  • A corpus-based algorithm that clusters similar words together. A polysemous word may belong to several different clusters, one cluster for each sense of the word.

Terra, E., and Clarke, C.L.A. (2003). Frequency estimates for statistical word similarity measures. In Proceedings of the Human Language Technology and North American Chapter of Association of Computational Linguistics Conference 2003 (HLT/NAACL 2003), pp. 244–251.

  • An evaluation of many different corpus-based measures of word similarity, using multiple-choice synonym questions.

Turney, P.D. (2000). Learning algorithms for keyphrase extraction. Information Retrieval, 2(4), 303–336.

  • A supervised algorithm for extracting keyphrases from documents. The main features that characterize keyphrases are distributional.

Turney, P.D. (2001). Answering subcognitive Turing Test questions: A reply to French. Journal of Experimental and Theoretical Artificial Intelligence, 13(4), 409–419.

  • Subcognitive questions are designed to probe the network of cultural and perceptual associations that we naturally develop in the course of our daily lives. The paper presents a corpus-based approach to answering subcognitive questions.

Turney, P.D. (2003). Coherent keyphrase extraction via Web mining. In Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence (IJCAI-03), Acapulco, Mexico, pp. 434–439.

  • A corpus-based approach to measuring lexical cohesion.

Turney, P.D. (2004). Word sense disambiguation by Web mining for word co-occurrence probabilities. In Proceedings of the Third International Workshop on the Evaluation of Systems for the Semantic Analysis of Text (SENSEVAL-3), Barcelona, Spain, pp. 239–242.

  • Using co-occurrence statistics to disambiguate words, much as envisioned by Weaver (1955).

Turney, P.D. (2006). Similarity of semantic relations. Computational Linguistics, 32(3), 379–416.

  • The paper distinguishes relational similarity from attributional similarity and presents a statistical semantics approach to measuring relational similarity.

Turney, P.D., and Littman, M.L. (2003). Measuring praise and criticism: Inference of semantic orientation from association. ACM Transactions on Information Systems (TOIS), 21(4), 315–346.

  • Using statistical semantics to distinguish words of praise (e.g., "honest", "intrepid") from words of criticism (e.g., "disturbing", "superfluous").
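
One way to picture orientation from association is to score a word by how strongly it associates with positive seed words minus how strongly it associates with negative seed words. The sketch below uses pointwise mutual information (PMI) over a tiny in-memory corpus. The corpus, the seed words, the smoothing constant, and the function names are all assumptions made for illustration and should not be read as the paper's setup.

```python
import math


def pmi(word, seed, documents, smoothing=0.01):
    """Smoothed pointwise mutual information of two words.

    `documents` is a list of token sets; co-occurrence means that both
    words appear in the same document. `smoothing` (an assumption)
    avoids log(0) when the words never co-occur.
    """
    n = len(documents)
    count_word = sum(1 for d in documents if word in d)
    count_seed = sum(1 for d in documents if seed in d)
    if count_word == 0 or count_seed == 0:
        return 0.0  # no evidence either way
    count_both = sum(1 for d in documents if word in d and seed in d)
    p_both = (count_both + smoothing) / n
    return math.log2(p_both / ((count_word / n) * (count_seed / n)))


def semantic_orientation(word, documents, positive_seeds, negative_seeds):
    """Association with positive seeds minus association with negative seeds."""
    positive = sum(pmi(word, s, documents) for s in positive_seeds)
    negative = sum(pmi(word, s, documents) for s in negative_seeds)
    return positive - negative


# Toy corpus (hypothetical); each document is reduced to a set of tokens.
corpus = [
    set("an honest and good review".split()),
    set("a good and honest answer".split()),
    set("a disturbing and bad film".split()),
    set("a bad and disturbing ending".split()),
]
```

With this toy corpus and the seeds "good" and "bad", `semantic_orientation("honest", corpus, ["good"], ["bad"])` comes out positive and the same call for "disturbing" comes out negative.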

Turney, P.D., and Littman, M.L. (2005). Corpus-based learning of analogies and semantic relations. Machine Learning, 60(1–3):251–278.

  • An algorithm for classifying the semantic relations in noun-modifiers, based on distributional information. For example, in the noun-modifier pair "laser printer", the relation is that the laser is an instrument used by the printer.

Turney, P.D., Littman, M.L., Bigham, J., and Shnayder, V. (2003). Combining independent modules to solve multiple-choice synonym and analogy problems. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP-03), Borovets, Bulgaria, pp. 482–489.

  • A method for combining corpus-based (statistical semantics) approaches to semantics with lexicon-based approaches.

Weaver, W. (1955). Translation. In W.N. Locke and D.A. Booth (eds.), Machine Translation of Languages, Cambridge, MA: MIT Press.

  • The first use of the term statistical semantics. The paper argues that word sense disambiguation for machine translation should be based on the co-occurrence frequency of the context words near a given target word.
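
The idea of disambiguating a word from the words around it can be sketched as follows. Each candidate sense is scored by how often the words in a small window around the target have been seen with that sense, and the highest-scoring sense is chosen. The sense labels, the co-occurrence counts, the window size, and the function name are hypothetical; in practice the counts would be estimated from a corpus.

```python
def disambiguate(tokens, target_index, sense_counts, window=3):
    """Pick the sense whose co-occurrence counts best match the context.

    tokens        list of words in the sentence
    target_index  position of the ambiguous word in `tokens`
    sense_counts  dict mapping sense label -> {context word: count}
    window        number of words considered on each side of the target
    """
    start = max(0, target_index - window)
    context = (
        tokens[start:target_index]
        + tokens[target_index + 1:target_index + 1 + window]
    )
    # Score each sense by the summed counts of the context words.
    scores = {
        sense: sum(counts.get(word, 0) for word in context)
        for sense, counts in sense_counts.items()
    }
    return max(scores, key=scores.get)


# Hypothetical co-occurrence counts for two senses of "bank".
BANK_SENSES = {
    "financial": {"money": 12, "loan": 7, "account": 9},
    "river": {"river": 10, "water": 8, "shore": 4},
}
```

For "she sat on the bank of the river", the window around "bank" contains "river", so the sketch selects the "river" sense; for "he opened an account at the bank", it selects "financial".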