Difference between revisions of "Resources for Spanish"
Line 6:
  * [http://www.statmt.org/wmt13/translation-task.html#download WMT corpora], including [http://en.wikipedia.org/wiki/Europarl_corpus Europarl], News Commentary, and News Crawl
  * [http://www.euromatrixplus.net/multi-un/ UN parallel corpora]
+ * [https://dev.termwatch.es/~fresa/CORPUS/MSF2/ The Portuguese/Spanish corpus of Multi-Sentence Fusion]
  == Grammars ==
Revision as of 04:09, 4 May 2020
Corpora
- Araneum Hispanicum, Gigaword Spanish web corpus
- Corpus del Español (website only)
- Corpus de referencia de la lengua española contemporánea: corpus oral peninsular
- HamleDT, harmonized dependency treebanks for many languages in a common annotation style
- WMT corpora, including Europarl, News Commentary, and News Crawl
- UN parallel corpora
- The Portuguese/Spanish corpus of Multi-Sentence Fusion
Grammars
Datasets
- Spanish word similarity dataset based on RG-65.