Difference between revisions of "Corpora for English"

From ACL Wiki
 
*[http://ece.ut.ac.ir/dbrg/hamshahri/ Hamshahri corpus]
 
*[http://www.elda.org/catalogue/en/speech/S0112.html Persian speech database Farsdat]
 
==Multilingual==
 
<!-- Please keep this list in alphabetical order -->
 
 
*[http://wt.jrc.it/lt/Acquis/ ACQUIS COMMUNAUTAIRE Multilingual Corpus]
 
*[http://spraakbanken.gu.se/ Bank of Swedish]
 
*[http://sli.uvigo.es/CLUVI/ CLUVI Corpus (Galician-English-Spanish-French parallel corpus)]
 
*[http://hnk.ffzg.hr/ Croatian National Corpus (HNK)]
 
*[http://ucnk.ff.cuni.cz/ Czech National Corpus (CNC)]
 
*[http://www.kun.nl/celex CELEX - The Dutch Center for Lexical Information]
 
*[http://www.cdc.gov/ncidod/sars/languages.htm Centers for Disease Control and Prevention - Chinese, French, Japanese, Spanish info on SARS]
 
*[http://www.linguateca.pt/COMPARA/ COMPARA corpus]
 
*[http://www.debian.org/international/ Debian free software community]
 
*[http://www.ling.lancs.ac.uk/corplang/emille EMILLE corpus]
 
*[http://www.statmt.org/europarl/ European Parliament Proceedings Parallel Corpus 1996-2003]
 
*[http://www.illc.uva.nl/EuroWordNet EuroWordNet]
 
*[http://www.france.diplomatie.fr/label_france/index.html French Foreign Ministry's magazine]
 
*[http://glossa.fltr.ucl.ac.be/ GlossaNet]
 
*[http://hometown.aol.com/mit2haiti/JA-HC-kr.htm Haitian Creole corpus - Teknoloji pou lang kreyol]
 
*[http://corpus.nytud.hu/mnsz/ Hungarian National Corpus]
 
*[http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC95T20 Hansard French-English parallel corpus]
 
*[http://www.ucl.ac.uk/english-usage/ice/avail.htm ICE corpora]
 
*[http://korpus.pl/ IPI PAN Corpus of Polish]
 
*[http://www.tu-chemnitz.de/phil/InternetGrammar/ Learner Behaviour on the Internet]
 
*[http://muchmore.dfki.de/resources1.htm MuchMore Springer Bilingual Corpus]
 
*[http://nl.ijs.si/ME/ MULTEXT-East: Multilingual Corpora for Eastern and Central European Languages]
 
*[http://tcc.itc.it/people/forner/multilingualcorpora.html Multilingual Corpora: Available Resources]
 
*[http://www.csse.monash.edu.au/~jwb/tanakacorpus.html Tanaka Corpus: Japanese-English sentence pairs]
 
*[http://multisemcor.itc.it MultiSemCor]
 
*[http://www.ims.uni-stuttgart.de/info/Newspapers.html Newspapers on the Internet]
 
*[http://logos.uio.no/opus/ OPUS - an open source parallel corpus]
 
*[http://www.tekstlab.uio.no/Bosnian/Corpus.html Oslo Corpus of Bosnian]
 
*[http://langbank.engl.polyu.edu.hk/indexl.html PolyU Language Bank]
 
*[http://www.corpusdoportugues.org/ Portuguese Corpus]
 
*[http://register.consilium.eu.int/ Public registry of the Council of the EU]
 
*[http://www.ruscorpora.ru/ Russian National Corpus (RNK)]
 
*[http://www.multilingual.com/allen51.htm The Bible as a Resource for Translation Software]
 
*[http://www.cogsci.ed.ac.uk/elsnet/eci.html The ECI Multilingual corpus]
 
*[http://www.fida.net/ Slovenian Corpus FIDA] and [http://www.fidaplus.net/ FIDA+]
 
*[http://www.corpusdelespanol.org/ Spanish Corpus]
 
*[http://www.unhchr.ch/udhr/index.htm UN Universal Declaration of Human Rights in multiple languages]
 
*[http://www-igm.univ-mlv.fr/~unitex/ UNITEX]
 
*[http://www.u-grenoble3.fr/kraif/liens.htm Useful links about parallel corpora, by Olivier Kraif]
 
*[http://wacky.sslmit.unibo.it/ WaCky Project]
 
*[http://www.wortschatz.uni-leipzig.de/html/wliste.html Wortlisten: spoken German, English, French, and Dutch]
 
  
 
==Russian==

Revision as of 13:58, 24 April 2008

For languages other than English, see List of resources by language.

English

Galician

German

Iranian

Russian

Slovak

Italian

Link collections

Corpora tools

Uncategorized

Arabic

Bosnian

Bulgarian

Croatian

Czech

Danish

English

Finnish

French

German

Haitian Creole

Italian

Japanese

Polish

Romanian

Sanskrit

Slovenian

Spanish

Swahili