Resources for Japanese
There is a very good list at Kyoto University: Catalogue of Language Resources and Tools in Japan
Corpora
Proprietary
- Japanese plain text and Co-occurrences at LCC (downloadable and web-searchable, but only for non-commercial use)
- Balanced Corpus of Contemporary Written Japanese (BCCWJ) (a subset is web-searchable at Kotonoha)
Free/Open Licence
Multilingual
- Tanaka Corpus by Jim Breen, under a CC-BY-SA 3.0 licence
- Tatoeba Updated version of the Tanaka Corpus; ≈150,000 sentence pairs (CC-BY)
- Japanese-English Bilingual Corpus of Wikipedia's Kyoto Articles ≈500,000 pairs of manually-translated sentences (CC-BY 3.0)
- National Diet Library Subject Headers Japanese Subject Headers, with paraphrases including English translations (non-commercial attribution)
- English-Japanese Translation Alignment Data aligned by Masao Utiyama (GFDL, CC-BY-NC 1.0)
- WordNet Definitions and Glosses ≈180,000 sentence/phrase pairs (WordNet license, similar to BSD)
- The Kyoto Free Translation Task (KFTT) by Graham Neubig: 1,235 Japanese-English sentences, manually word-aligned
- JEC Basic Sentence Data by Kyoto University: 5,304 basic Japanese sentences based on Kyoto University Case Frame Data, translated into Chinese and English
Monolingual
Grammars
Free/Open Licence
- Jacy HPSG grammar MIT Licence
Unknown licence
- KPML generation grammar (downloadable)
Dictionaries
Free/Open Licence
- EDICT Japanese-English dictionary, by Jim Breen (CC-BY-SA 3.0 licence)
- ENAMDICT/JMnedict proper name dictionary, by Jim Breen (CC-BY-SA 3.0 licence)
- Japanese version of WordNet by NICT (WordNet license, similar to BSD)
Unknown licence
- List of Japanese transitive/intransitive verb pairs (dead link?)