Difference between revisions of "Resources for Arabic"
m (add section header) |
Line 9:
  ==Corpora==
+ ===Proprietary===
  *[http://www.ldc.upenn.edu/Catalog/LDC2001T55.html Arabic Newswire Part 1]
+
+ ===Free/open licence===
+ * [http://github.com/anastaw/Meedan-Memory Meedan-Memory], Arabic-English TMX (sentence-aligned), ~467,000 words on the English side, [http://www.opendatacommons.org/licenses/odbl/ Open Database Licence]
  ==Parser==
Revision as of 13:15, 7 November 2009
Morphology
Free software
- AraMorph - Perl - An Arabic morphological analyzer and part-of-speech tagger written in Perl (originally by Tim Buckwalter)
- AraMorph - Java - An Arabic morphological analyzer and part-of-speech tagger rewritten in Java for Lucene
Proprietary
Corpora
Proprietary
- Arabic Newswire Part 1
Free/open licence
- Meedan-Memory, Arabic-English TMX (sentence-aligned), ~467,000 words on the English side, Open Database Licence
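
Since the Meedan-Memory corpus is distributed as sentence-aligned TMX, a short sketch of reading such a file may be useful. The following is a minimal example using only the Python standard library; the function name `read_tmx_pairs`, the inline `SAMPLE` document, and the language codes `en` and `ar` are assumptions made for illustration and are not taken from the corpus itself, so check how the actual files tag their segments before relying on it.

```python
import io
import xml.etree.ElementTree as ET

# ElementTree exposes the predefined xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Tiny hypothetical TMX document, used here in place of a real corpus file.
SAMPLE = """<tmx version="1.4">
  <header srclang="en"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Hello world</seg></tuv>
      <tuv xml:lang="ar"><seg>مرحبا بالعالم</seg></tuv>
    </tu>
  </body>
</tmx>"""


def read_tmx_pairs(source, src_lang="en", tgt_lang="ar"):
    """Yield (source_text, target_text) pairs from a TMX file or file-like object."""
    tree = ET.parse(source)
    for tu in tree.getroot().iter("tu"):
        segments = {}
        for tuv in tu.findall("tuv"):
            # TMX 1.4 uses xml:lang; some older files use a plain "lang" attribute.
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                # Keep only the primary language subtag, e.g. "en-US" -> "en".
                segments[lang.split("-")[0]] = "".join(seg.itertext()).strip()
        if src_lang in segments and tgt_lang in segments:
            yield segments[src_lang], segments[tgt_lang]


pairs = list(read_tmx_pairs(io.BytesIO(SAMPLE.encode("utf-8"))))
# Whitespace-delimited word count on the English side of the aligned pairs.
english_words = sum(len(en.split()) for en, _ in pairs)
```

Passing a file path instead of the `BytesIO` object works the same way, and the whitespace-based count gives a rough way to compare against the word figure quoted for the corpus.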
Parser
Free software
- Arabic dictionaries, by Jon Dehdari, for the Link-Grammar parser. These require the AraMorph stemming package listed above under Morphology.