Difference between revisions of "Corpora, datasets, lexicons"
(added LCS and ThoughtTreasure lexicons)
Line 27:
== Lexicons ==
− * [http://clipdemos.umiacs.umd.edu/catvar/ Catvar 2.0
+ * [http://clipdemos.umiacs.umd.edu/catvar/ Catvar 2.0: The Categorial Variation Database]
* [http://xwn.hlt.utdallas.edu/ eXtended WordNet]
* [http://www.wjh.harvard.edu/%7Einquirer/spreadsheet_guide.htm General Inquirer]
− * [http://www.umiacs.umd.edu/~bonnie/LCS_Database_Documentation.html LCS]
+ * [http://www.umiacs.umd.edu/~bonnie/LCS_Database_Documentation.html LCS Database: Lexical Conceptual Structures]
* [http://www.dcs.shef.ac.uk/research/ilash/Moby/ Moby lexicon project]
* [http://patty.isti.cnr.it/~esuli/software/SentiWordNet/ SentiWordNet]
Revision as of 07:14, 31 October 2006
Miscellaneous
Corpora
- American National Corpus (ANC)
- Biomedical corpora
- British National Corpus (BNC)
- Brown Corpus
- Collins Wordbanks
- David Lee's Bookmarks for Corpus-based Linguists
- Gutenberg
- Oxford English Corpus
- WebCorp
Datasets
- Edinburgh Associative Thesaurus (EAT)
- Linguistic Data Consortium (LDC)
- MRC Psycholinguistic Database
- Noun Compound Repository
- Reuters-21578 Text Categorization Collection
- University of South Florida Free Association Norms
- WordSimilarity-353 Test Collection