Difference between revisions of "Resources for Persian"
(taher pilevar's corpus) |
Line 12: | Line 12: | ||
== Corpora == | == Corpora == | ||
+ | ===Proprietary=== | ||
+ | <!-- Please keep this list in alphabetical order --> | ||
+ | *[http://ece.ut.ac.ir/DBRG/Bijankhan/ Bijankhan corpus] (gratis for research/non-commercial purposes) | ||
+ | *[http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC96S50 CALLFRIEND Farsi (speech)], LDC | ||
+ | *[http://ece.ut.ac.ir/dbrg/hamshahri/ Hamshahri corpus] (gratis for research/non-commercial purposes) | ||
+ | *[http://www.elda.org/catalogue/en/speech/S0112.html Persian speech database Farsdat], ELRA | ||
+ | |||
===Unknown license=== | ===Unknown license=== | ||
− | |||
− | |||
− | |||
*[http://ece.ut.ac.ir/NLP/resources.htm First Free English-Persian Parallel Corpus] Mohammad Taher Pilevar, NLP Lab, University of Tehran, Iran | *[http://ece.ut.ac.ir/NLP/resources.htm First Free English-Persian Parallel Corpus] Mohammad Taher Pilevar, NLP Lab, University of Tehran, Iran | ||
− | |||
− | |||
==Parser== | ==Parser== |
Revision as of 00:36, 14 April 2010
Machine translation systems
Free software
Proprietary
- The Shiraz project (Persian -> English)
Morphology tools
Free software
- Perstem - Persian stemmer, light morphological analyzer, and character set converter.
- Morphological dictionary - compiled using lttoolbox.
Corpora
Proprietary
- Bijankhan corpus (gratis for research/non-commercial purposes)
- CALLFRIEND Farsi (speech), LDC
- Hamshahri corpus (gratis for research/non-commercial purposes)
- Persian speech database Farsdat, ELRA
Unknown license
- First Free English-Persian Parallel Corpus Mohammad Taher Pilevar, NLP Lab, University of Tehran, Iran
Parser
Free software
- Persian dictionaries, by Jon Dehdari, for the Link-Grammar parser. These require the Perstem stemming package listed above under Morphology tools.
Bibliography
- Dehdari, Jon, and Deryle Lonsdale. 2008. A link grammar parser for Persian. In Karimi, S., Samiian, V., and Stilo, D., editors, Aspects of Iranian Linguistics, volume 1. Cambridge Scholars Press. ISBN: 978-18-471-8639-3 (BIB)
- Feili, H. and G. Ghassem-Sani (2004) "An Application of Lexicalized Grammars in English-Persian Translation". Proceedings of the 16th European Conference on Artificial Intelligence (ECAI 2004), 24-27 Aug. 2004, Universidad Politecnica de Valencia, Valencia, Spain, pp. 596-600.
- Megerdoomian, K. (2000) "Unification-Based Persian Morphology". Proceedings of CICLing 2000, Alexander Gelbukh, Center of Investigation on Computation-IPN, Mexico, 2000.
- Megerdoomian, K. (2004) "Finite-State Morphological Analysis of Persian". COLING 2004 Computational Approaches to Arabic Script-based Languages, Ali Farghaly and Karine Megerdoomian, editors, Geneva, Switzerland, pp. 35-41.
See also
External links
- the Jon safari (link parser, small lexicon, stemmer, morphological analysis tools)