Morphology software for English

From ACL Wiki

Tools and Software for English - Morphology and part of speech tagging

For languages other than English, see List of resources by language.

Morphology

Free software

  • Catvar 2.0 - The Categorial Variation Database for English (OSL)
  • HFST - Helsinki Finite-State Transducer Technology - FST library, command line tools, hfst-twolc (a rule compiler for two-level rules), and several spellers and morphological analyzers (GPL)
  • FOMA - finite-state toolkit (similar to Xerox XFST), created and maintained by Måns Huldén (GPL)
  • lttoolbox - lexical processing tools for building morphological analysers/generators with XML specification files. Includes data for English (both analysis and disambiguation). (GPL)
  • PC-KIMMO - a Two-level Processor for Morphological Analysis, including KGEN, KTEXT, and Englex
  • SFST - Stuttgart Finite State Transducer Tools (GPL)
    • Where is the data for English?
  • MULTEXT mmorph - (unmaintained) two-level morphology; the package includes some data for English and German (GPL2 or later)

Source(s): Software solutions

Unknown license

  • MAP - Cambridge/Edinburgh Morphological Analyzer and Dictionary System (gratis download, no license information)

Proprietary software

Part of speech tagging

Free software

  • ACOPOST - A Collection Of POS Taggers: Maximum Entropy Tagger, Trigram Tagger, Transformation-based Tagger, Example-based Tagger
  • Illinois LBJ POS Tagger - Uses an averaged-perceptron-based sequential model. Java API; free, open-source license.
  • GENIA - part-of-speech tagging, shallow parsing, and named entity recognition for biomedical text. C++, BSD license.
  • NLTK - Natural Language Toolkit: Regexp Tagger, N-Gram Tagger, Brill Tagger, HMM Tagger, plus a freely downloadable book with a chapter on tagging
  • RelEx - provides English-language part-of-speech tagging, entity tagging, and other types of tags (gender, date, money, ...) after performing a deep parse, so that the tags agree with the parse. Also provides the resulting stems. Apache 2.0 License.
  • Spejd - Shallow Parsing and Disambiguation Engine, a GPL tool for simultaneous rule-based morphosyntactic disambiguation and partial parsing
  • Tagger training on the Apertium Wiki (HMM + constraint-based)
  • VISL Constraint Grammar - rule-based disambiguation (GPL)
    • Is there a Free set of rules for English?
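
As a rough illustration of the regexp-tagger idea mentioned in the NLTK entry above, the following is a minimal sketch in plain Python. It does not use NLTK or any of the listed tools; the pattern list, the Penn-Treebank-style tag names, and the function name regexp_tag are all assumptions chosen for illustration. Each token is assigned the tag of the first pattern it matches, with a catch-all noun tag as the default.

```python
import re

# Hypothetical (regex, tag) pairs, tried in order; the first match wins.
# The tags are Penn-Treebank-style labels chosen only for illustration.
PATTERNS = [
    (r"^-?\d+(\.\d+)?$", "CD"),   # cardinal numbers
    (r"^(the|a|an)$", "DT"),      # articles
    (r".*ing$", "VBG"),           # gerunds
    (r".*ed$", "VBD"),            # simple past
    (r".*ly$", "RB"),             # adverbs
    (r".*s$", "NNS"),             # plural nouns
    (r".*", "NN"),                # default: singular noun
]


def regexp_tag(tokens, patterns=PATTERNS):
    """Tag each token with the tag of the first pattern that matches it."""
    compiled = [(re.compile(p, re.IGNORECASE), t) for p, t in patterns]
    tagged = []
    for tok in tokens:
        for rx, tag in compiled:
            if rx.match(tok):
                tagged.append((tok, tag))
                break
    return tagged


if __name__ == "__main__":
    print(regexp_tag("The dogs barked loudly at 3 passing cars".split()))
```

A tagger like this is usually only a fallback: suffix rules mislabel many words (for example, "at" falls through to the default noun tag), which is why the toolkits listed here combine such rules with statistical or constraint-based models.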

Proprietary software

Combined morphology and tagging

Free software

Proprietary software

See also