Difference between revisions of "Resources for Russian"
(→Grammars: Link Grammar Parser, includes Russian dictionaries.)
(2 intermediate revisions by one other user not shown)
Line 2: | Line 2:
  ===Free open source===
  * [http://www.euromatrixplus.net/multi-un/ MultiUN] "A Multilingual corpus from United Nation Documents", the Russian portion is 876 MB, the other languages in the multilingual corpus are: English/French/Spanish/Arabic/Chinese/German
+ * [http://www.statmt.org/wmt13/translation-task.html#download WMT corpora], including the Yandex 1M corpus, News Commentary, and News Crawl
  ===Unknown license===
Line 21: | Line 22:
  == Grammars ==
  * [[Generation grammars|KPML generation grammar]]
+ * [http://abisource.com/projects/link-grammar/ Link Grammar Parser], includes Russian dictionaries.
  ==Various resources==
Revision as of 20:42, 8 February 2014
Corpora
Free open source
- MultiUN, "A Multilingual corpus from United Nation Documents". The Russian portion is 876 MB; the other languages in the corpus are English, French, Spanish, Arabic, Chinese, and German.
- WMT corpora, including the Yandex 1M corpus, News Commentary, and News Crawl
Unknown license
- HANCO: The Helsinki annotated corpus of Russian texts (searchable, no visible download links)
- Russian Corpora (uni-tuebingen.de) (searchable, no visible download links)
- Russian Internet Corpus
- Russian National Corpus
- Russian Newspaper Corpus
- Various texts in Russian (lib.ru)
POS taggers
- AOT, morphological analyser
- Mocky, statistical taggers and lemmatiser
- Mystem, morphological analyser
Grammars
- KPML generation grammar
- Link Grammar Parser, includes Russian dictionaries.