Difference between revisions of "Corpora for English"

From ACL Wiki
Jump to navigation Jump to search
(3 intermediate revisions by the same user not shown)
Line 25: Line 25:
 
*[http://usna.edu/LangStudy/BNC/ Exploring Words and Phrases from the British National Corpus]
 
*[http://usna.edu/LangStudy/BNC/ Exploring Words and Phrases from the British National Corpus]
 
*[http://ir.dcs.gla.ac.uk/test_collections/gov2-summary.htm GOV2 Corpus] - 426 gigabytes of text
 
*[http://ir.dcs.gla.ac.uk/test_collections/gov2-summary.htm GOV2 Corpus] - 426 gigabytes of text
 +
*[http://gmb.let.rug.nl Groningen Meaning Bank] semantically annotated corpus
 
*[http://www.gutenberg.org/wiki/Main_Page Gutenberg]
 
*[http://www.gutenberg.org/wiki/Main_Page Gutenberg]
 
*[http://prize.hutter1.net/ Hutter Prize for Lossless Compression of Human Knowledge 100M sample of Wikipedia]
 
*[http://prize.hutter1.net/ Hutter Prize for Lossless Compression of Human Knowledge 100M sample of Wikipedia]
Line 43: Line 44:
 
*[http://www.grsampson.net/LucyDoc.html The LUCY Corpus - Documentation]
 
*[http://www.grsampson.net/LucyDoc.html The LUCY Corpus - Documentation]
 
*[http://www.cs.rochester.edu/research/cisd/resources/trains.html TRAINS Dialogue Corpus]
 
*[http://www.cs.rochester.edu/research/cisd/resources/trains.html TRAINS Dialogue Corpus]
 +
*[http://www.let.rug.nl/~bos/vpe/ VP Ellipsis corpus]
 
*[http://wacky.sslmit.unibo.it/ WaCky]
 
*[http://wacky.sslmit.unibo.it/ WaCky]
 
*[http://www.webcorp.org.uk/guide/ WebCorp]
 
*[http://www.webcorp.org.uk/guide/ WebCorp]
Line 51: Line 53:
 
*[http://www.dcs.gla.ac.uk/idom/ir_resources/ Collections of texts and corpora]
 
*[http://www.dcs.gla.ac.uk/idom/ir_resources/ Collections of texts and corpora]
 
*[http://www.bmanuel.org/clr2_mp.html Manuel Barbera: General Corpora and Corpus Linguistics Resources]
 
*[http://www.bmanuel.org/clr2_mp.html Manuel Barbera: General Corpora and Corpus Linguistics Resources]
*[http://www.alphabit.net Isabella Chiari: Corpora, Software and Linguistic resources]
 
 
*[http://www.sultry.arts.usyd.edu.au/links/statnlp.html Annotated list of resources on statistical NLP and corpus-based CL]
 
*[http://www.sultry.arts.usyd.edu.au/links/statnlp.html Annotated list of resources on statistical NLP and corpus-based CL]
  

Revision as of 01:13, 26 October 2012

For languages other than English, see List of resources by language.

Link collections

Corpora tools