Difference between revisions of "Knowledge collections and datasets (English)"
== Additional Dataset Collections ==
* [http://www.ldc.upenn.edu/ Linguistic Data Consortium (LDC)]

[[Category:Knowledge Collections and Datasets|*]]
Revision as of 15:03, 21 November 2006
Datasets for Computational Linguistics and Natural Language Processing.
- Clustering by Committee - terms clustered and organized using the Distributional Hypothesis
- DIRT Paraphrase Collection - Discovery of Inference Rules from Text
- Edinburgh Associative Thesaurus (EAT)
- FrameNet
- MRC Psycholinguistic Database
- Noun Compound Repository
- Reuters-21578 Text Categorization Collection
- Spam filtering datasets
- University of South Florida Free Association Norms
- VerbOcean - verbs organized by semantic relation, including temporal precedence and strength
- WordNet
- WordSimilarity-353 Test Collection