Corpora, datasets, lexicons
Revision as of 17:16, 30 October 2006 by Kevin.Cohen (Corpora: added a link for the biomedical corpora site at UCHSC)
Miscellaneous
Corpora
- American National Corpus (ANC)
- Biomedical corpora
- British National Corpus (BNC)
- Brown Corpus
- Collins Wordbanks
- David Lee's Bookmarks for Corpus-based Linguists
- Project Gutenberg
- Oxford English Corpus
- WebCorp
Datasets
- Edinburgh Associative Thesaurus (EAT)
- Linguistic Data Consortium (LDC)
- MRC Psycholinguistic Database
- Noun Compound Repository
- Reuters-21578 Text Categorization Collection
- University of South Florida Free Association Norms
- WordSimilarity-353 Test Collection