Resources for Turkish

From ACL Wiki

Revision as of 08:56, 26 May 2014

Morphological analysis

Free software

TRMorph "is a relatively complete morphological analyzer for Turkish. It is implemented using SFST, and uses a lexicon based on (but heavily modified) the wordlist of Zemberek spell checker. The morphological analyzer is distributed under the GPL."

Proprietary

Lexical resources

Turkish Language Association

Corpora

Free

Southeast European Times (sentence-aligned corpus in Albanian, Bulgarian, English, Greek, Macedonian, Romanian, Serbo-Croatian, and Turkish; approximately 4.5 million words per language)
TS Corpus (a POS-tagged Turkish corpus that also provides morphological and lemma tags; consists of 491 million tokens)

Proprietary

HamleDT (harmonized dependency treebanks for many languages, converted to a common annotation style)
METU-Sabanci Turkish treebank
Turkish plain text and Co-occurrences at LCC

Bibliography

K. Oflazer, "Two-level Description of Turkish Morphology," Literary and Linguistic Computing, vol. 9, no. 2, pp. 137-148, 1994. Backwards PDF

See also

External links
