Knowledge collections and datasets (English)
Revision as of 16:05, 16 October 2013
Knowledge collections and datasets for Computational Linguistics and Natural Language Processing.
For languages other than English, see List of resources by language.
- Clustering by Committee - terms clustered and organized using the Distributional Hypothesis
- DIRT Paraphrase Collection - Discovery of Inference Rules from Text
- Edinburgh Associative Thesaurus (EAT)
- FrameNet
- MRC Psycholinguistic Database
- Preposition Project
- Noun Compound Repository
- Reuters-21578 Text Categorization Collection
- SAT Analogy Questions - a way of evaluating algorithms for measuring relational similarity
- Spam filtering datasets
- TEASE - Acquisition of Entailment Relations from the Web
- TOEFL Synonym Questions - a way of evaluating algorithms for measuring the degree of similarity between two words
- RG-65 Test Collection - suitable for correlation-based evaluation of algorithms for measuring semantic similarity of word pairs
- University of South Florida Free Association Norms
- VerbOcean - verbs organized by semantic relation, including temporal precedence and strength
- WordNet
- WordSimilarity-353 Test Collection
See also NLG:Data sets for a collection of data sets used for building natural language generation systems.