Difference between revisions of "Resources for Japanese"
Older revision: (→Free/Open Licence: Added wordnet)
Newer revision: (→Free/Open Licence: added Kyoto University and NTT Blog Corpus)
Line 4 (older) / Line 4 (newer):
  ===Free/Open Licence===
+ ====Multilingual====
  * [http://www.edrdg.org/projects/tanaka/tanakacorpus.html Tanaka Corpus] by Jim Breen, under a CC-BY-SA 3.0 licence
  ** [http://tatoeba.org/eng/home Tatoeba] Updated version of the Tanaka Corpus; ≈150,000 sentence pairs (CC-BY)
Line 10 (older) / Line 11 (newer):
  * [http://mastarpj.nict.go.jp/~mutiyama/align/index.html English-Japanese Translation Alignment Data] aligned by [http://mastarpj.nict.go.jp/~mutiyama/ Masao Utiyama] (GFDL, CC-by-nc 1.0)
  * [http://nlpwww.nict.go.jp/wn-ja/index.en.html WordNet Definitions and Glosses] ≈180,000 sentence/phrase pairs (WordNet license, similar to BSD)
+ ====Monolingual====
+ * [http://www-lab25.kuee.kyoto-u.ac.jp/NLP_Portal/lr-cat-e.html#jp:knb_corpus Kyoto University and NTT Blog Corpus]
  == Grammars ==
Revision as of 17:54, 3 May 2011
Corpora
Proprietary
- Japanese plain text and Co-occurrences at LCC (downloadable and web-searchable, but only for non-commercial use)
Free/Open Licence
Multilingual
- Tanaka Corpus by Jim Breen, under a CC-BY-SA 3.0 licence
- Tatoeba: updated version of the Tanaka Corpus; ≈150,000 sentence pairs (CC-BY)
- Japanese-English Bilingual Corpus of Wikipedia's Kyoto Articles: ≈500,000 pairs of manually translated sentences (CC-BY 3.0)
- National Diet Library Subject Headers: Japanese subject headers, with paraphrases including English translations (non-commercial, attribution)
- English-Japanese Translation Alignment Data aligned by Masao Utiyama (GFDL, CC-by-nc 1.0)
- WordNet Definitions and Glosses ≈180,000 sentence/phrase pairs (WordNet license, similar to BSD)
Monolingual
- Kyoto University and NTT Blog Corpus
Grammars
Free/Open Licence
- Jacy HPSG grammar (MIT Licence)
Unknown licence
- KPML generation grammar (downloadable)
Dictionaries
Free/Open Licence
- EDICT Japanese-English dictionary, by Jim Breen (CC-BY-SA 3.0 licence)
- ENAMDICT/JMnedict proper name dictionary, by Jim Breen (CC-BY-SA 3.0 licence)
- Japanese version of WordNet, by NICT (WordNet license, similar to BSD)
Unknown licence
- List of Japanese transitive/intransitive verb pairs (dead link?)