Difference between revisions of "Resources for Hindi"
(Adding a new POS tagger, Shallow parser. Future: Article Structure needs to be edited. Confusing)
Line 2:
  *[http://bit.ly/ytAT95 Hindi Computing : Tools and Techniques]
+
+ ==Dependency Parser==
+
+ * [http://sivareddy.in/downloads Download the Parser]
+ * [http://sivareddy.in/papers/files/hindi.dependency.parser.out.txt Sample output of the parser]
  ==POS Tagger, Morphological Analyzer, Lemmatizer, Corpus==
Revision as of 12:11, 20 April 2013
Hindi computing is gaining momentum quickly. Thousands of Hindi sites, blogs, and portals have appeared as computing tools have become available and easier to use. The following link lists important tools and software for Hindi and Devanaagarii:
- Hindi Computing : Tools and Techniques (http://bit.ly/ytAT95)
Dependency Parser
- Download the Parser (http://sivareddy.in/downloads)
- Sample output of the parser (http://sivareddy.in/papers/files/hindi.dependency.parser.out.txt)
POS Tagger, Morphological Analyzer, Lemmatizer, Corpus
The tagger and its related files are distributed under the GNU GPL. The corpus is licensed.
Related publication: Siva Reddy and Serge Sharoff. Cross Language POS Taggers (and other Tools) for Indian Languages: An Experiment with Kannada using Telugu Resources. In Proceedings of the IJCNLP Workshop on Cross Lingual Information Access: Computational Linguistics and the Information Need of Multilingual Societies (CLIA 2011 at IJCNLP 2011), Chiang Mai, Thailand.
Morphological analysis
Free software
- Hindi analyser for lttoolbox (~29,385 lemmata) -- GPL (by the University of Hyderabad — converted from the Anusaaraka analyser)
Machine translation
Free software
- Anusaaraka Hindi—English and others.
Shallow Parser
Keywords: Hindi, Part of Speech tagger, Lemmatizer, Morph Analyzer, Corpus