Difference between revisions of "Resources for Portugese"

From ACL Wiki
(added sections; +Europarl corpus)
 
(4 intermediate revisions by 3 users not shown)
Line 1: Line 1:
  
 
==Corpora==
 
+ * [http://corporavm.uni-koeln.de/colonia/ Colonia], corpus of historical Portuguese.
 
* [http://www.statmt.org/europarl Europarl corpus], sentence aligned with English
 
 
+ * [http://ufal.mff.cuni.cz/hamledt HamleDT], harmonized dependency treebanks of many languages, common annotation style.
+ * [http://www.corpusdoportugues.org/ o Corpus do Português] (web-only interface)
+ * [https://dev.termwatch.es/~fresa/CORPUS/MSF2/ The Portuguese/Spanish corpus of Multi-Sentence Fusion]
  
 
==Software==
 
Line 8: Line 11:
 
* [http://www.linguateca.pt/corpografo Corpógrafo] - a Web-based environment for corpora research
 
  
+ ==Word Lists==
+ * [http://www.uni-koeln.de/~mzampier/resources/pawl.txt P-AWL] - the Portuguese academic word list compiled as described in [http://link.springer.com/chapter/10.1007/978-3-642-12320-7_15#page-1 Baptista et al. (2010)]
  
 
[[Category:Resources by language|Portugese]]
 

Latest revision as of 04:09, 4 May 2020

Corpora

Software

  • CEPRIL - Portuguese Segmenter
  • Corpógrafo - a Web-based environment for corpora research

Word Lists