Resources for Russian
From ACLWiki
Latest revision as of 06:38, 18 April 2011
Corpora
Free open source
- MultiUN (http://www.euromatrixplus.net/multi-un/): "A Multilingual corpus from United Nation Documents". The Russian portion is 876 MB; the other languages in the corpus are English, French, Spanish, Arabic, Chinese, and German.
Unknown license
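- HANCO: The Helsinki annotated corpus of Russian texts (http://www.helsinki.fi/venaja/english/e-material/hanco/index.htm); searchable, no visible download links
- Russian Corpora (uni-tuebingen.de) (http://www.sfb441.uni-tuebingen.de/b1/korpora.html); searchable, no visible download links
- Russian Internet Corpus (http://corpus.leeds.ac.uk/ruscorpora.html)
- Russian National Corpus (http://www.ruscorpora.ru/)
- Russian Newspaper Corpus (http://www.philol.msu.ru/~lex/corpus/)
- Various texts in Russian (lib.ru) (http://lib.ru/)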
POS taggers
- AOT, morphological analyser
- Mocky, statistical taggers and lemmatiser
- Mystem, morphological analyser