Resources for Arabic
Revision as of 11:22, 12 November 2009 by Kiwibird
Morphology
Free software
- AraMorph - Perl - An Arabic morphological analyzer and part-of-speech tagger (originally by Tim Buckwalter)
- AraMorph - Java - A Java rewrite of the same morphological analyzer and part-of-speech tagger, for use with Lucene
Proprietary
Corpora
Proprietary
Free/open licence
- Meedan-Memory, Arabic-English TMX (sentence-aligned), ~467,000 words on the English side, Open Database Licence
- Quranic Arabic Corpus, 77,430 words of Quranic Arabic, with manually verified contextual POS, inflection, derivation; dependency grammar annotation is planned.
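For a sentence-aligned TMX file such as the one listed above, the segment pairs can be read with a generic XML parser. The following is a minimal sketch, not part of any of the listed resources: it assumes a standard TMX layout (tu elements containing one tuv per language, each with a seg), and the function names read_tmx_pairs and count_target_words, the language codes, and the sample document are all made up for illustration.

```python
import io
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def read_tmx_pairs(source, src_lang="ar", tgt_lang="en"):
    """Yield (source, target) segment pairs from a TMX file or file-like object.

    Hypothetical helper: the language codes are assumptions and may need
    adjusting to whatever codes a given TMX file actually uses.
    """
    tree = ET.parse(source)
    for tu in tree.getroot().iter("tu"):
        segs = {}
        for tuv in tu.findall("tuv"):
            # Newer TMX files use xml:lang; some older ones use a plain lang attribute.
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also collects text nested inside inline markup.
                # Region suffixes are dropped, so "en-US" is treated as "en".
                segs[lang.split("-")[0]] = "".join(seg.itertext()).strip()
        if src_lang in segs and tgt_lang in segs:
            yield segs[src_lang], segs[tgt_lang]


def count_target_words(pairs):
    """Count whitespace-separated tokens on the target side of the pairs."""
    return sum(len(target.split()) for _, target in pairs)


# Made-up sample document, only to show the expected shape of the input.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<tmx version="1.4">
  <header srclang="ar"/>
  <body>
    <tu>
      <tuv xml:lang="ar"><seg>مرحبا</seg></tuv>
      <tuv xml:lang="en"><seg>Hello</seg></tuv>
    </tu>
  </body>
</tmx>
""".encode("utf-8")

if __name__ == "__main__":
    pairs = list(read_tmx_pairs(io.BytesIO(SAMPLE)))
    print(len(pairs), "pairs,", count_target_words(pairs), "target-side words")
```

A whitespace token count like this is only a rough way to check a figure such as the English-side word count; the exact number depends on how words are tokenized.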
Parser
Free software
- Arabic dictionaries for the Link-Grammar parser, by Jon Dehdari. These require the AraMorph stemming package, listed under Morphology above.