Difference between revisions of "Resources for Slovenian"

Latest revision as of 09:52, 26 May 2014

Europarl corpus, sentence aligned with English
IJS - ELAN Slovene-English Parallel Corpus
JRC Acquis parallel texts. Languages involved: Bulgarian, Czech, Danish, German, Greek, English, Spanish, Estonian, Finnish, French, Hungarian, Italian, Lithuanian, Latvian, Maltese, Dutch, Polish, Portuguese, Romanian, Slovak, Slovene and Swedish.

"Communication in Slovene" corpora, including written, spoken, web, learner, and tagged corpora of up to 1.2 billion words.
HamleDT, harmonized dependency treebanks for many languages in a common annotation style.
Multext EAST lexica, annotated "1984" corpus, parallel and comparable text and speech corpora. Languages involved: Bulgarian, Croatian, Czech, English, Estonian, Hungarian, Lithuanian, Macedonian, Persian, Polish, Resian, Romanian, Russian, Serbian, Slovak, Slovene, and Ukrainian.

@@ Line 1: / Line 1: @@
 ==Corpora==
-* [http://nl.ijs.si/elan/ Slovene-English Parallel Corpus]
+===Free license===
+* [http://www.statmt.org/europarl Europarl corpus], sentence aligned with English
+* [http://nl.ijs.si/elan/ IJS - ELAN] Slovene-English Parallel Corpus
+* [http://langtech.jrc.it/JRC-Acquis.html JRC Acquis] parallel texts.  Languages involved: Bulgarian, Czech, Danish, German, Greek, English, Spanish, Estonian, Finnish, French, Hungarian, Italian, Lithuanian, Latvian, Maltese, Dutch, Polish, Portuguese, Romanian, Slovak, Slovene and Swedish.
+===Non-free license===
+* [http://eng.slovenscina.eu/korpusi "Communication in Slovene" corpora], including written, spoken, web, learner, and tagged corpora of up to 1.2 billion words.
+* [http://ufal.mff.cuni.cz/hamledt HamleDT], harmonized dependency treebanks for many languages in a common annotation style.
+* [http://nl.ijs.si/ME/ Multext EAST] lexica, annotated "1984" corpus, parallel and comparable text and speech corpora.  Languages involved: Bulgarian, Croatian, Czech, English, Estonian, Hungarian, Lithuanian, Macedonian, Persian, Polish, Resian, Romanian, Russian, Serbian, Slovak, Slovene, and Ukrainian.
 [[Category:Resources by language|Slovenian]]