Difference between revisions of "Knowledge collections and datasets (English)"
Revision as of 06:00, 13 May 2007
Datasets for Computational Linguistics and Natural Language Processing.
- Clustering by Committee - terms clustered and organized using the Distributional Hypothesis
- DIRT Paraphrase Collection - Discovery of Inference Rules from Text
- Edinburgh Associative Thesaurus (EAT)
- FrameNet
- MRC Psycholinguistic Database
- Noun Compound Repository
- Reuters-21578 Text Categorization Collection (http://kdd.ics.uci.edu/databases/reuters21578/reuters21578.html)
- SAT Analogy Questions - a way of evaluating algorithms for measuring relational similarity
- Spam filtering datasets
- TEASE - Acquisition of Entailment Relations from the Web
- TOEFL Synonym Questions - a way of evaluating algorithms for measuring degree of similarity between two words
- University of South Florida Free Association Norms (http://w3.usf.edu/FreeAssociation/)
- VerbOcean - verbs organized by semantic relation, including temporal precedence and strength
- WordNet
- WordSimilarity-353 Test Collection
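
The TOEFL Synonym Questions entry above describes evaluating algorithms that measure the degree of similarity between two words. A minimal sketch of such an evaluation follows, assuming each question consists of a stem word, a list of candidate choices, and one correct choice. The word vectors, the questions, and the function names are invented for illustration and are not taken from the actual collection; cosine similarity stands in for whatever similarity measure is under test.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def answer_question(stem, choices, vectors):
    """Pick the choice whose vector is most similar to the stem's vector."""
    return max(choices, key=lambda c: cosine(vectors[stem], vectors[c]))


def accuracy(questions, vectors):
    """Fraction of questions for which the most similar choice is correct."""
    correct = sum(
        1
        for stem, choices, gold in questions
        if answer_question(stem, choices, vectors) == gold
    )
    return correct / len(questions)


# Hypothetical toy data: these vectors and questions are made up for
# illustration and do not come from any of the datasets listed above.
TOY_VECTORS = {
    "big": [0.9, 0.1, 0.0],
    "large": [0.8, 0.2, 0.1],
    "green": [0.0, 0.9, 0.2],
    "quick": [0.1, 0.0, 0.9],
    "fast": [0.2, 0.1, 0.8],
}

TOY_QUESTIONS = [
    ("big", ["large", "green", "fast"], "large"),
    ("quick", ["green", "large", "fast"], "fast"),
]

if __name__ == "__main__":
    print(accuracy(TOY_QUESTIONS, TOY_VECTORS))
```

Any other word-similarity function can be substituted for `cosine` without changing the scoring loop, which makes it straightforward to compare measures on the same question set.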