Difference between revisions of "Resources for Persian"

From ACL Wiki
(Morphology tools)
(See also: +link to Resources for Kurdish)
(12 intermediate revisions by 6 users not shown)
Line 1:
==Machine translation systems==
+
==Machine translation==
  
===Free software===
+
===Free resources===
  
===Proprietary===
+
===Proprietary resources===
 
*[http://crl.nmsu.edu/Research/Projects/shiraz/index.html The Shiraz project] (Persian -> English)
 
 +
*[http://ece.ut.ac.ir/NLP/resources.htm Tehran English-Persian Parallel Corpus] by Mohammad Taher Pilevar, NLP Lab, University of Tehran. For research or non-commercial use.
  
 
==Morphology tools==
 
 
===Free software===
 
 
*[http://sourceforge.net/projects/perstem Perstem] - Persian stemmer, light morphological analyzer, and character set converter.
 
*[http://apertium.svn.sourceforge.net/svnroot/apertium/trunk/apertium-tg-fa/apertium-tg-fa.fa.dix.xml Morphological dictionary] — compiled using [http://xixona.dlsi.ua.es/wiki/index.php/lttoolbox lttoolbox].
+
*[http://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-tg-fa/apertium-tg-fa.fa.dix Morphological dictionary] — compiled using [[lttoolbox]].
  
 
== Corpora ==
 
 +
===Free===
 +
*[http://www.ling.ohio-state.edu/~jonsafari/corpora VOA Persian Corpus 2003-2008] (public domain)
 +
 +
===Proprietary===
 
<!-- Please keep this list in alphabetical order -->
 
 +
*[http://ece.ut.ac.ir/DBRG/Bijankhan/ Bijankhan corpus] (gratis for research/non-commercial purposes)
 +
*[http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC96S50 CALLFRIEND Farsi (speech)], LDC
 +
*[http://ece.ut.ac.ir/dbrg/hamshahri/ Hamshahri corpus] (gratis for research/non-commercial purposes)
 +
*[http://www.elda.org/catalogue/en/speech/S0112.html Persian speech database Farsdat], ELRA
 +
 +
 +
==Parsing==
 +
===Free resources===
 +
* [http://www.ling.ohio-state.edu/~jonsafari/persianlg/ Persian dictionaries] for the [http://www.abisource.com/projects/link-grammar/ Link-Grammar parser]. By [http://www.ling.ohio-state.edu/~jonsafari/ Jon Dehdari]. These require the Perstem stemming package listed above.
 +
 +
===Proprietary===
 +
*[http://dadegan.ir/en/persiandependencytreebank Dadegan Dependency Treebank] for research purposes only.
 +
*[http://hpsg.fu-berlin.de/~ghayoomi/PTB.html HPSG Persian Treebank (PerTreeBank)] for academic research purposes only.
 +
*[http://stp.lingfil.uu.se/~mojgan/persian_dependency_treebank.pdf A soon-to-be-released Persian Dependency Treebank], license not yet specified.
  
*[http://ece.ut.ac.ir/DBRG/Bijankhan/ Bijankhan corpus]
 
*[http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC96S50 CALLFRIEND Farsi (speech)]
 
*[http://ece.ut.ac.ir/dbrg/hamshahri/ Hamshahri corpus]
 
*[http://www.elda.org/catalogue/en/speech/S0112.html Persian speech database Farsdat]
 
  
 
==Bibliography==
 
*  Feili, H. and G. Ghassem-Sani (2004) "An Application of Lexicalized Grammars in English-Persian Translation". ''Proceedings of the 16th European Conference on Artificial Intelligence (ECAI 2004)'', 24-27 Aug. 2004, Universidad Politecnica de Valencia, Valencia, Spain, pp. 596-600. [http://sharif.edu/~sani/papers/Feili_SaniE2.pdf PDF]
+
* Dehdari, Jon, and Deryle Lonsdale. 2008. [http://www.ling.ohio-state.edu/~jonsafari/papers/dehdari_lonsdale_2005.pdf A link grammar parser for Persian]. In Karimi, S., Samiian, V., and Stilo, D., editors, ''Aspects of Iranian Linguistics'', volume 1. Cambridge Scholars Press. ISBN: 978-18-471-8639-3 ([http://www.ling.ohio-state.edu/~jonsafari/bib/dehdarilonsdale2005.bib.txt BIB])
* Megerdoomian, K. (2000) "Unification-Based Persian Morphology". ''Proceedings of CICLing 2000'', Alexander Gelbukh, Center of Investigation on Computation-IPN, Mexico, 2000. [http://crl.nmsu.edu/Research/Projects/shiraz/publications/papers/Cicling.pdf PDF]
+
 
* Megerdoomian, K. (2004) "Finite-State Morphological Analysis of Persian". ''COLING 2004 Computational Approaches to Arabic Script-based Languages''. Ali Farghaly and Karine Megerdoomian editors, Geneva, Switzerland, 2004, pgs. 35-41. [http://acl.ldc.upenn.edu/coling2004/W5/pdf/W5-7.pdf PDF]
+
* Feili, H. and G. Ghassem-Sani (2004) "[http://sharif.edu/~sani/papers/Feili_SaniE2.pdf An Application of Lexicalized Grammars in English-Persian Translation]". ''Proceedings of the 16th European Conference on Artificial Intelligence (ECAI 2004)'', 24-27 Aug. 2004, Universidad Politecnica de Valencia, Valencia, Spain, pp. 596-600.
 +
* Megerdoomian, K. (2000) "[http://crl.nmsu.edu/Research/Projects/shiraz/publications/papers/Cicling.pdf Unification-Based Persian Morphology]". ''Proceedings of CICLing 2000'', Alexander Gelbukh, Center of Investigation on Computation-IPN, Mexico, 2000.
 +
* Megerdoomian, K. (2004) "[http://acl.ldc.upenn.edu/coling2004/W5/pdf/W5-7.pdf Finite-State Morphological Analysis of Persian]". ''COLING 2004 Computational Approaches to Arabic Script-based Languages''. Ali Farghaly and Karine Megerdoomian, editors, Geneva, Switzerland, 2004, pp. 35-41.
 +
* Mohammad Amin Farajian (2011). [http://world-comp.org/p2011/ICA4953.pdf PEN: Parallel English-Persian News Corpus]. ''Proceedings of the 2011 International Conference on Artificial Intelligence (ICAI'11)'', Nevada, USA.
  
 
==See also==
 
 +
*[[Resources for Kurdish]]
 
*[[Resources for Tajik]]
 
  
 
==External links==
 
*[http://home.byu.net/jmd56/index.html the Jon safari] (link parser, small lexicon, stemmer, morphological analysis tools)
+
*[http://www.iranianlinguistics.org/wiki/index.php?title=Persian Iranian Linguistics: NLP Resources for Persian]
 +
*[http://www.ling.ohio-state.edu/~jonsafari/persian_nlp.html the Jon safari] (link parser, small lexicon, stemmer, morphological analysis tools)
  
  
 
[[Category:Resources by language|Persian]]
 

Revision as of 19:09, 8 October 2013

Machine translation

Free resources

Proprietary resources

Morphology tools

Free software

Corpora

Free

Proprietary


Parsing

Free resources

Proprietary


Bibliography

  • Dehdari, Jon, and Deryle Lonsdale. 2008. A link grammar parser for Persian. In Karimi, S., Samiian, V., and Stilo, D., editors, Aspects of Iranian Linguistics, volume 1. Cambridge Scholars Press. ISBN: 978-18-471-8639-3 (BIB)

See also

External links