Difference between revisions of "Knowledge collections and datasets (English)"

Revision as of 15:42, 7 February 2009

Knowledge collections and datasets for Computational Linguistics and Natural Language Processing.

For languages other than English, see List of resources by language.

See also NLG:Data sets for a collection of data sets used for building natural language generation systems.

@@ Line 15: / Line 15: @@
 * [[Spam filtering datasets]]
 * [[TEASE]] - Acquisition of Entailment Relations from the Web
-* [[TOEFL Synonym Questions]] - a way of evaluating algorithms for measuring degree of similarity between two words
+* [[TOEFL Synonym Questions]] - a way of evaluating algorithms for measuring degree of similarity between 2 words
 * [http://w3.usf.edu/FreeAssociation/ University of South Florida Free Association Norms]
 * [[VerbOcean]] - verbs organized by semantic relation, including temporal precedence and strength
 * [[WordNet]]
 * [http://www.cs.technion.ac.il/~gabr/resources/data/wordsim353/wordsim353.html WordSimilarity-353 Test Collection]
+See also [[NLG:Data sets]] for a collection of data sets used for building natural language generation systems.
 == Additional Dataset Collections ==