Difference between revisions of "Corpora for English"

From ACL Wiki
*[http://www.dcs.gla.ac.uk/idom/ir_resources/linguistic_utils/stop_words List of stop words]
*[http://korpus.pl/index.php?page=poliqarp Poliqarp] - open-source, XML-aware indexer, search engine, and concordancer
 
*[http://www.sketchengine.co.uk/ The Sketch Engine]
 
*[http://www.cis.upenn.edu/~treebank/tokenization.html Treebank tokenization scheme]

Revision as of 02:21, 11 November 2007

English

Galician

German

Multilingual

Russian

Slovak

Italian

Link collections

Corpora tools

Uncategorized

Arabic

Bosnian

Bulgarian

Czech

Danish

English

Finnish

French

German

Haitian Creole

Italian

Japanese

Polish

Romanian

Sanskrit

Slovenian

Spanish

Swahili