Resources for Persian
Corpora
Free
- VOA Persian Corpus 2003-2008 (public domain)
Proprietary
- Bijankhan corpus (gratis for research/non-commercial purposes)
- CALLFRIEND Farsi (speech), LDC
- Hamshahri corpus (gratis for research/non-commercial purposes)
- Persian speech database Farsdat, ELRA
Lexical resources
Free
- Persian-English dictionary, derived from Wikipedia article names. Retains Wikipedia's CC-BY-SA 3.0 license.
Proprietary
Machine translation
Free
Proprietary
- The Shiraz project (Persian -> English)
- Tehran English-Persian Parallel Corpus by Mohammad Taher Pilevar, NLP Lab, University of Tehran. For research or non-commercial use.
Morphology tools
Free
- Perstem - Persian stemmer, light morphological analyzer, and character set converter.
- Morphological dictionary - compiled using lttoolbox.
Parsing
Free
- HamleDT, harmonized dependency treebanks of many languages, common annotation style.
- Persian dictionaries for the Link-Grammar parser, by Jon Dehdari. These require the Perstem stemming package listed under Morphology tools above.
- Uppsala Persian Dependency Treebank, Creative Commons Attribution 3.0 License
Proprietary
- Dadegan Dependency Treebank (for research purposes only)
- HPSG Persian Treebank (PerTreeBank) (for academic research purposes only)
Bibliography
- Dehdari, Jon, and Deryle Lonsdale. 2008. A link grammar parser for Persian. In Karimi, S., Samiian, V., and Stilo, D., editors, Aspects of Iranian Linguistics, volume 1. Cambridge Scholars Press. ISBN: 978-18-471-8639-3
- Feili, H. and G. Ghassem-Sani (2004) "An Application of Lexicalized Grammars in English-Persian Translation". Proceedings of the 16th European Conference on Artificial Intelligence (ECAI 2004), 24-27 Aug. 2004, Universidad Politecnica de Valencia, Valencia, Spain, pp. 596-600.
- Megerdoomian, K. (2000) "Unification-Based Persian Morphology". Proceedings of CICLing 2000, Alexander Gelbukh, Center of Investigation on Computation-IPN, Mexico, 2000.
- Megerdoomian, K. (2004) "Finite-State Morphological Analysis of Persian". COLING 2004 Computational Approaches to Arabic Script-based Languages. Ali Farghaly and Karine Megerdoomian, editors, Geneva, Switzerland, 2004, pp. 35-41.
- Mohammad Amin Farajian (2011). PEN: Parallel English-Persian News Corpus. Proceedings of 2011 International Conference on Artificial Intelligence (ICAI'11), Nevada, USA.
See also
External links
- Iranian Linguistics: NLP Resources for Persian
- Jon Dehdari (link parser, small lexicon, stemmer, morphological analysis tools)