Difference between revisions of "Resources for Hindi"

From ACL Wiki
(Adding a new POS tagger, Shallow parser. Future: Article Structure needs to be edited. Confusing)
Line 1: Line 1:
 
Hindi computing is gaining momentum quickly. Thousands of Hindi sites, blogs, and portals have appeared as a result of the availability of computing tools and their ease of use. The following link lists important tools and software for Hindi and Devanaagarii:
  
*[http://hi.wikipedia.org/wiki/इंटरनेट_पर_हिन्दी_के_साधन Hindi Computing : Tools and Techniques]
+
*[http://bit.ly/ytAT95 Hindi Computing : Tools and Techniques]
 +
 
 +
==POS Tagger, Morphological Analyzer, Lemmatizer, Corpus==
 +
 
 +
* [http://sivareddy.in/downloads Download the tagger]
 +
 
 +
* [http://sivareddy.in/papers/files/hindi.sample.out.txt Sample output of the tagger]
 +
 
 +
The tagger and its related files are distributed under the GNU GPL. The corpus is licensed.
 +
 
 +
<b>Related Publication:</b>
 +
Siva Reddy, Serge Sharoff. [http://www.aclweb.org/anthology-new/W/W11/W11-3603.pdf Cross Language POS Taggers (and other Tools) for Indian Languages: An Experiment with Kannada using Telugu Resources.] In Proceedings of the IJCNLP Workshop on Cross Lingual Information Access: Computational Linguistics and the Information Need of Multilingual Societies (CLIA 2011 at IJCNLP 2011), Chiang Mai, Thailand. [http://sivareddy.in/papers/reddy2011crosslang.bib BibTeX]
  
 
==Morphological analysis==
 
Line 8: Line 19:
  
 
* [http://apertium.svn.sourceforge.net/svnroot/apertium/trunk/incubator/apertium-hi-ur.hi.dix Hindi analyser] for [[lttoolbox]] (~29,385 lemmata) -- GPL (by the University of Hyderabad &mdash; converted from the Anusaaraka analyser)
 
 
===Proprietary===
 
  
 
==Machine translation==
 
Line 17: Line 26:
 
* [http://ltrc.iiit.net/~anusaaraka/ Anusaaraka] Hindi&mdash;English and others.
 
  
 +
==Shallow Parser==
 +
 +
[http://ltrc.iiit.ac.in/showfile.php?filename=downloads/shallow_parser.php Hindi Shallow parser]
 +
 +
Keywords: Hindi, Part of Speech tagger, Lemmatizer, Morph Analyzer, Corpus
  
 
[[Category:Resources by language|Hindi]]
 
 +
[[Category: Part of Speech tagger]]
 +
[[Category: Lemmatizer]]
 +
[[Category: Morph Analyser]]
 +
[[Category: Corpus]]

Revision as of 11:39, 16 January 2012

Hindi computing is gaining momentum quickly. Thousands of Hindi sites, blogs, and portals have appeared as a result of the availability of computing tools and their ease of use. The following link lists important tools and software for Hindi and Devanaagarii:

  • Hindi Computing : Tools and Techniques (http://bit.ly/ytAT95)

POS Tagger, Morphological Analyzer, Lemmatizer, Corpus

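  • Download the tagger (http://sivareddy.in/downloads)
  • Sample output of the tagger (http://sivareddy.in/papers/files/hindi.sample.out.txt)

The tagger and its related files are distributed under the GNU GPL. The corpus is licensed.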

Related Publication: Siva Reddy, Serge Sharoff. Cross Language POS Taggers (and other Tools) for Indian Languages: An Experiment with Kannada using Telugu Resources. In Proceedings of the IJCNLP Workshop on Cross Lingual Information Access: Computational Linguistics and the Information Need of Multilingual Societies (CLIA 2011 at IJCNLP 2011), Chiang Mai, Thailand. BibTeX: http://sivareddy.in/papers/reddy2011crosslang.bib

Morphological analysis

Free software

  • Hindi analyser for lttoolbox (~29,385 lemmata) -- GPL (by the University of Hyderabad — converted from the Anusaaraka analyser)
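A lemma count like the one quoted above can be checked roughly against an lttoolbox dictionary file. The following is a minimal sketch, not the method used to produce the figure on this page. It assumes the usual lttoolbox convention that each entry is an `<e>` element inside a `<section>`, with its lemma in the `lm` attribute. The function name `count_lemmata` and the tiny `SAMPLE_DIX` dictionary are made up for illustration and are not taken from the real Hindi analyser.

```python
import xml.etree.ElementTree as ET

# Tiny stand-in dictionary for illustration only; NOT an excerpt of the
# real Hindi analyser. Assumes the usual lttoolbox layout: <e lm="...">
# entries inside <section> elements.
SAMPLE_DIX = """<dictionary>
  <pardefs>
    <pardef n="example__n">
      <e><p><l></l><r><s n="n"/></r></p></e>
    </pardef>
  </pardefs>
  <section id="main" type="standard">
    <e lm="घर"><i>घर</i><par n="example__n"/></e>
    <e lm="घर"><i>घरों</i><par n="example__n"/></e>
    <e lm="किताब"><i>किताब</i><par n="example__n"/></e>
  </section>
</dictionary>"""


def count_lemmata(root):
    """Count distinct lm values on <e> entries found inside <section> elements."""
    lemmata = set()
    for section in root.iter("section"):
        for entry in section.iter("e"):
            lemma = entry.get("lm")
            if lemma:  # entries without a lemma attribute are skipped
                lemmata.add(lemma)
    return len(lemmata)


sample_root = ET.fromstring(SAMPLE_DIX)
print(count_lemmata(sample_root))
```

For a downloaded dictionary, the same function could be called as `count_lemmata(ET.parse(path).getroot())`, where `path` is wherever the file was saved. Entries inside paradigm definitions are deliberately ignored, and a count obtained this way may differ from the quoted figure depending on how duplicates and multiword entries were treated.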

Machine translation

Free software

  • Anusaaraka (http://ltrc.iiit.net/~anusaaraka/): Hindi-English and others.

Shallow Parser

Hindi Shallow parser

Keywords: Hindi, Part of Speech tagger, Lemmatizer, Morph Analyzer, Corpus