Free and open source

  • MultiUN, "A Multilingual corpus from United Nation Documents": the Russian portion is 876 MB; the corpus also covers English, French, Spanish, Arabic, Chinese, and German
  • WMT corpora, including Europarl, News Commentary, and News Crawl

Unknown license

POS taggers


Various resources

