Difference between revisions of "Corpora for English"

From ACL Wiki
m (Jonsafari moved page Corpora (English) to Corpora for English: align with other related articles)
(Added GUM corpus)
Line 8:
 
*[http://www-2.cs.cmu.edu/afs/cs.cmu.edu/project/theo-11/www/naive-bayes/bow-0.8/stopwords.c English stop words (from SMART)]
 
*[http://gmb.let.rug.nl Groningen Meaning Bank] semantically annotated corpus
 
+ *[https://corpling.uis.georgetown.edu/gum/ GUM - Georgetown University Multilayer corpus], multiple parses, coreference, entities, sentence types and RST
 
*[https://www.gutenberg.org Project Gutenberg]
 
*[http://ufal.mff.cuni.cz/hamledt HamleDT], harmonized dependency treebanks of many languages, common annotation style.
 

Revision as of 07:50, 10 June 2016

For languages other than English, see List of resources by language.

Free and Downloadable

Proprietary or Require Prior Permission


Link collections

Corpora tools