Resources for German
Corpora
Free software
- RIA Open Source Rule Induction Tool includes an LFG-parsed, phrase-aligned German-English parallel corpus, a subset of the EuroParl corpus (4000 sentences per language; the tool, at least, is LGPL)
Unknown license
- Bavarian Archive for Speech Signals Corpora
- COSMAS II
- Experimental Corpus Query System (University of Stuttgart, Germany)
- German plain text and Co-occurrences at LCC
- NEGRA Corpus
- TIGER treebank
- Tübingen Treebank of Written German (TüBa-D/Z)
- Tübingen Treebank of Spoken German (TüBa-D/S, aka Verbmobil treebank)
- Tübingen Partially Parsed Corpus of Written German (TüPP-D/Z)
- Le Monde Diplomatique-Die Tageszeitung Translation Corpus - French-German, aligned (parallel)
Evaluation datasets
Grammars
Morphological analysis
Free software
- Morphisto, based on SMOR, is an SFST-based analyser and generator for German. (The morphology is GPLv2, but the lexicon is proprietary/non-commercial: CC BY-NC-SA 3.0)
Lexicons
Free software
- DING - German-English Dictionary with approximately 253,000 entries (GPL 2 or later).
Proprietary/gratis
- Lexical information for German ("The data is freely available for education, research and other non-commercial purposes.")
- Canoo.net - German Dictionaries and Grammars
Unknown license
- IMSLex German Lexicon (no license information, but only "sample" download)
- mOlif morphological analyzer (broken link)