Difference between revisions of "Resources for Turkish"

Latest revision as of 07:40, 17 June 2015

Morphological analysis

Free software

TRMorph "is a relatively complete morphological analyzer for Turkish. It is implemented using SFST, and uses a lexicon based on (but heavily modified) the wordlist of Zemberek spell checker. The morphological analyzer is distributed under the GPL."

Proprietary

Lexical resources

Turkish Language Association

Corpora

Free

Southeast European Times (sentence aligned corpus, Albanian, Bulgarian, English, Greek, Macedonian, Romanian, Serbo-Croatian, Turkish — approximately 4.5 million words per language)

Proprietary

HamleDT, harmonized dependency treebanks of many languages, common annotation style.
TS Corpus (PoS-tagged Turkish corpus that also provides morphological and lemma tags for the data; consists of 491 million tokens)
METU-Sabanci Turkish treebank
Turkish plain text and Co-occurrences at LCC

Bibliography

K. Oflazer, "Two-level Description of Turkish Morphology," Literary and Linguistic Computing, vol. 9, no. 2, pp. 137-148, 1994. Backwards PDF

External links

@@ Line 1: / Line 1: @@
-==Machine translation systems==
+==Morphological analysis==
 ===Free software===
+* [http://www.let.rug.nl/~coltekin/trmorph/ TRMorph] "is a relatively complete morphological analyzer for Turkish. It is implemented using [[SFST]], and uses a lexicon based on (but heavily modified) the wordlist of [[Zemberek]] spell checker. The morphological analyzer is distributed under the [[GPL]]."
 ===Proprietary===
@@ Line 11: / Line 11: @@
 ==Corpora==
+===Free===
+* [http://www.statmt.org/setimes/ Southeast European Times] (sentence aligned corpus, Albanian, Bulgarian, English, Greek, Macedonian, Romanian, Serbo-Croatian, Turkish &mdash; approximately 4.5 million words per language)
+===Proprietary===
+* [http://ufal.mff.cuni.cz/hamledt HamleDT], harmonized dependency treebanks of many languages, common annotation style.
+* [http://tscorpus.com/ TS Corpus] (PoSTagged Turkish Corpus. The corpus also presents morphological and lemma tags of the data. Consists of 491 Million tokens)
+* [http://www.ii.metu.edu.tr/~corpus/treebank.html METU-Sabanci Turkish treebank]
+* [http://corpora.informatik.uni-leipzig.de/ Turkish plain text and Co-occurrences at LCC]
 ==Bibliography==
@@ Line 22: / Line 31: @@
 * [http://www.hlst.sabanciuniv.edu Sabancı University Natural Language Processing Tools (Turkish Morphological Analyzer, BalkaNET)]
 * [http://ddi.ce.itu.edu.tr Istanbul Technical University Natural Language Processing Research Group]
+* [http://nooj4nlp.net/pages/turkish.html NooJ_TR by Mersin University Turkish National Corpus Project Team]
+[[Category:Resources by language|Turkish]]
-[[Category:Resources by language|Tajik]]
