Difference between revisions of "Resources for Bulgarian"

Latest revision as of 08:36, 26 May 2014

Southeast European Times, sentence-aligned corpus covering Albanian, Bulgarian, English, Greek, Macedonian, Romanian, Serbo-Croatian, and Turkish; approximately 4.5 million words per language
Europarl corpus, sentence-aligned with English
HamleDT, harmonized dependency treebanks for many languages in a common annotation style

@@ Line 31: / Line 31: @@
 * [http://www.statmt.org/setimes/ Southeast European Times], sentence aligned corpus, Albanian, Bulgarian, English, Greek, Macedonian, Romanian, Serbo-Croatian, Turkish &mdash; approximately 4.5 million words per language
 * [http://www.statmt.org/europarl Europarl corpus], sentence aligned with English
+* [http://ufal.mff.cuni.cz/hamledt HamleDT], harmonized dependency treebanks of many languages, common annotation style.
 ===Proprietary===