Difference between revisions of "Resources for Portuguese"
Latest revision as of 04:09, 4 May 2020
Corpora
- Colonia, corpus of historical Portuguese.
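- Europarl corpus (http://www.statmt.org/europarl), sentence-aligned with English
- HamleDT (http://ufal.mff.cuni.cz/hamledt), harmonized dependency treebanks of many languages in a common annotation style.
- o Corpus do Português (http://www.corpusdoportugues.org/), web-only interface
- The Portuguese/Spanish corpus of Multi-Sentence Fusion (https://dev.termwatch.es/~fresa/CORPUS/MSF2/)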
Software
- CEPRIL - Portuguese Segmenter
- Corpógrafo (http://www.linguateca.pt/corpografo) - a Web-based environment for corpus research
Word Lists
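- P-AWL (http://www.uni-koeln.de/~mzampier/resources/pawl.txt) - the Portuguese academic word list, compiled as described in Baptista et al. (2010) (http://link.springer.com/chapter/10.1007/978-3-642-12320-7_15#page-1)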