Language Identification Tools

From ACL Wiki
Revision as of 07:41, 19 December 2012 by Kiwibird


A listing of language identification tools. Language identification can mean identifying either the text type (e.g. news vs literature) or the language (e.g. English vs Frisian vs Dutch) of a text.

Most of these tools must be trained on a large corpus (see List of resources by language for corpora per language), but many ship with some prebuilt language models.
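To illustrate what training on a corpus can involve, here is a minimal sketch of one common approach: building character trigram profiles per language and picking the profile closest to the input by cosine similarity. It is not the method of any tool listed on this page. The function names (`trigram_profile`, `train`, `identify`) and the two tiny sample corpora are assumptions made up for this example; a real model would need far more training text.

```python
from collections import Counter
import math


def trigram_profile(text):
    """Count overlapping character trigrams in lowercased, space-padded text."""
    padded = " " + " ".join(text.lower().split()) + " "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    """Cosine similarity between two trigram count profiles."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train(corpora):
    """Build one trigram profile (the 'language model') per language."""
    return {lang: trigram_profile(text) for lang, text in corpora.items()}


def identify(text, models):
    """Return the language whose profile is most similar to the text."""
    profile = trigram_profile(text)
    return max(models, key=lambda lang: cosine(profile, models[lang]))


# Hypothetical toy corpora, far too small for real use.
TOY_CORPORA = {
    "en": "the quick brown fox jumps over the lazy dog and the cat sleeps in the house",
    "nl": "de snelle bruine vos springt over de luie hond en de kat slaapt in het huis",
}
MODELS = train(TOY_CORPORA)

print(identify("the dog sleeps", MODELS))
print(identify("de hond slaapt", MODELS))
```

Adding a new language in this sketch only means adding another entry to the corpora dictionary and retraining, which is the step that tools shipping only prebuilt models leave out.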

Free Software

  • Compact Language Detector for JavaScript (3-clause license)
    • Does not seem to include a method for adding new languages; the existing language models were presumably generated by Google.


See also