Difference between revisions of "Resources for Hungarian"
(Added: Araneum) |
(+hungarian national corpus) |
(One intermediate revision by the same user not shown) | |||
Line 1: | Line 1: | ||
==Corpora== | ==Corpora== | ||
+ | ===Free=== | ||
+ | * [http://www.statmt.org/europarl Europarl corpus], sentence aligned with English | ||
+ | * [http://mokk.bme.hu/resources/webcorpus/ Hungarian Webcorpus] - 590 million tokens | ||
+ | |||
+ | ===Non-Free=== | ||
* [http://ucts.uniba.sk/aranea_about/ Araneum Hungaricum], Gigaword Hungarian web corpus | * [http://ucts.uniba.sk/aranea_about/ Araneum Hungaricum], Gigaword Hungarian web corpus | ||
− | |||
* Hunglish parallel corpus ([http://mokk.bme.hu/resources/hunglishcorpus download], [http://hunglish.hu/search search]) | * Hunglish parallel corpus ([http://mokk.bme.hu/resources/hunglishcorpus download], [http://hunglish.hu/search search]) | ||
− | |||
* [http://ufal.mff.cuni.cz/hamledt HamleDT], harmonized dependency treebanks of many languages, common annotation style. | * [http://ufal.mff.cuni.cz/hamledt HamleDT], harmonized dependency treebanks of many languages, common annotation style. | ||
+ | * [http://corpus.nytud.hu/mnsz/ Hungarian National Corpus] | ||
+ | |||
== Tools == | == Tools == | ||
− | * [http://code.google.com/p/hunpos/ hunpos] | + | * [http://code.google.com/p/hunpos/ hunpos] - open-source POS-tagger |
− | * [http://mokk.bme.hu/resources/hunmorph/ hunmorph] | + | * [http://mokk.bme.hu/resources/hunmorph/ hunmorph] - open-source morphological analyzer |
[[Category:Resources by language|Hungarian]] | [[Category:Resources by language|Hungarian]] |
Latest revision as of 07:44, 26 June 2016
Corpora
Free
- Europarl corpus, sentence aligned with English
- Hungarian Webcorpus - 590 million tokens
Non-Free
- Araneum Hungaricum, Gigaword Hungarian web corpus
- Hunglish parallel corpus (download, search)
- HamleDT, harmonized dependency treebanks of many languages, common annotation style.
- Hungarian National Corpus