Difference between revisions of "Resources for Bulgarian"

Latest revision as of 08:36, 26 May 2014


@@ Line 2: / Line 2: @@
 ===Free software===
+* [https://apertium.svn.sourceforge.net/svnroot/apertium/trunk/apertium-mk-bg apertium-mk-bg], a rule-based machine translation (RBMT) system between Macedonian and Bulgarian
 ===Proprietary===
+* [http://webtrance.skycode.com/?setlang=en WebTrance]
 ==Lexical resources==
+===Morphological analysis===
+====Free software====
+* [https://apertium.svn.sourceforge.net/svnroot/apertium/trunk/apertium-mk-bg/apertium-mk-bg.bg.dix Morphological analyser] (8,581 lemmata, ~88% coverage over the SETimes corpus)
+====Proprietary====
+==Grammars==
+===Proprietary===
 * [http://dcl.bas.bg/BulNet/general_en.html BulNet WordNet] (21,444 synonym sets)
+* [[Generation grammars|KPML generation grammar]]
+==Corpora==
+===Free===
+* [http://www.statmt.org/setimes/ Southeast European Times], a sentence-aligned corpus covering Albanian, Bulgarian, English, Greek, Macedonian, Romanian, Serbo-Croatian and Turkish (approximately 4.5 million words per language)
+* [http://www.statmt.org/europarl Europarl corpus], sentence-aligned with English
+* [http://ufal.mff.cuni.cz/hamledt HamleDT], harmonized dependency treebanks for many languages in a common annotation style
+===Proprietary===
+* [http://www.hf.uio.no/easteur-orient/bulg/mat/ Corpus of spoken Bulgarian]
 ==Bibliography==
-*
 ==External links==
-*
 [[Category:Resources by language|Bulgarian]]