Difference between revisions of "Resources for Italian"
(updated and extended references) |
|||
(26 intermediate revisions by 11 users not shown) | |||
Line 1: | Line 1: | ||
− | == Tools == | + | == Tools for Italian == |
+ | |||
+ | === Tokenisers === | ||
+ | * [http://tcc.itc.it/projects/textpro/index.php TextPro] | ||
+ | |||
+ | === POS taggers === | ||
+ | * [http://tcc.itc.it/projects/textpro/index.php TextPro] | ||
+ | * [http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/DecisionTreeTagger.html TreeTagger] | ||
+ | |||
+ | ===Morphology=== | ||
+ | ====Free software==== | ||
+ | * [http://sslmitdev-online.sslmit.unibo.it/linguistics/morph-it.php Morph-It! version 0.47] - a free morphological resource for the Italian language, includes [[SFST]] sources. [[LGPL]] license. | ||
+ | |||
+ | ====Unknown license==== | ||
+ | * [https://archivium.biz/strumenti/Analisi-Verbi.html Archivium AV] - an analyzer for Italian verbs | ||
+ | * [https://archivium.biz/strumenti/Coniugazione-Verbi.html Archivium CV] - a conjugator for Italian verbs | ||
+ | * [https://archivium.biz/strumenti/Analisi-Grammaticale.html Archivium AG] - a morphological analyzer for Italian text | ||
+ | Contact m dot mensa at tecnologie-umanistiche dot it for API access | ||
+ | |||
+ | === Named Entity Recognisers === | ||
+ | * [http://tcc.itc.it/projects/ontotext/entitypro.html EntityPro] | ||
+ | |||
+ | === Temporal Expressions === | ||
+ | * [http://tcc.itc.it/projects/ontotext/ita-chronos.html ITA-Chronos] | ||
=== Parsers === | === Parsers === | ||
+ | * [http://ai-nlp.info.uniroma2.it/external/chaosproject/ Chaos] - Robust syntactic parser for Italian and for English | ||
+ | === Generators === | ||
+ | * [http://tcc.itc.it/projects/xig/index.html XIG] - Interchange to Italian Generator | ||
+ | == Resources for Italian == | ||
− | == | + | === Corpora === |
+ | <!-- Please keep this list in alphabetical order --> | ||
+ | |||
+ | * [http://ucts.uniba.sk/aranea_about/ Araneum Italicum], Gigaword Italian web corpus | ||
+ | * [http://www.istc.cnr.it/material/database/colfis/ ColFIS Corpus e Lessico di Frequenza dell'Italiano Scritto] | ||
+ | * [http://corpus.cilta.unibo.it:8080/coris_ita.html Corpus di Italiano Scritto contemporaneo (CORIS/CODIS)] | ||
+ | * [http://www.statmt.org/europarl Europarl corpus], sentence aligned with English | ||
+ | * [http://ufal.mff.cuni.cz/hamledt HamleDT], harmonized dependency treebanks for many languages in a common annotation style | ||
+ | * [http://corpora.informatik.uni-leipzig.de/ Italian plain text and Co-occurrences at LCC] | ||
+ | * [http://languageserver.uni-graz.at/badip/badip/20_corpusLip.php LIP - Lessico di frequenza dell'Italiano Parlato - Access via BADIP] | ||
+ | * [http://multisemcor.itc.it/ MultiSemCor] - English/Italian parallel corpus | ||
+ | * [http://www.uni-duisburg.de/Fak2/FremdPhil/Romanistik/Personal/Burr/humcomp/ Oxford Text Archive Corpus of Italian Newspapers] | ||
+ | * [http://tlio.ovi.cnr.it/TLIO/ Tesoro della lingua italiana delle origini (TLIO)] | ||
+ | |||
+ | === Tagsets === | ||
+ | * [http://tcc.itc.it/projects/textpro/index.php LemmaPro] - Italian POS tagset for LemmaPro | ||
=== Treebanks === | === Treebanks === | ||
Line 13: | Line 55: | ||
=== WordNets === | === WordNets === | ||
+ | * [http://www.elda.fr/ EuroWordNet] | ||
* [http://multiwordnet.itc.it/english/home.php MultiWordNet] - a multilingual lexical database in which the Italian WordNet is strictly aligned with Princeton WordNet 1.6 | * [http://multiwordnet.itc.it/english/home.php MultiWordNet] - a multilingual lexical database in which the Italian WordNet is strictly aligned with Princeton WordNet 1.6 | ||
+ | |||
+ | === Lexicons === | ||
+ | * [http://www.ilc.cnr.it/clips/PSC_decription.htm PAROLE-SIMPLE-CLIPS] - a four-layered, general purpose computational lexicon | ||
+ | |||
+ | == Links == | ||
+ | * [http://evalita.itc.it/ Evalita] - Evaluation of NLP tools for Italian | ||
+ | |||
+ | [[Category:Resources by language|Italian]] |
Latest revision as of 14:24, 15 March 2019
Tools for Italian
Tokenisers
POS taggers
Morphology
Free software
- Morph-It! version 0.47 - a free morphological resource for the Italian language, includes SFST sources. LGPL license.
Unknown license
- Archivium AV - an analyzer for Italian verbs
- Archivium CV - a conjugator for Italian verbs
- Archivium AG - a morphological analyzer for Italian text
Contact m dot mensa at tecnologie-umanistiche dot it for API access
Named Entity Recognisers
Temporal Expressions
Parsers
- Chaos - Robust syntactic parser for Italian and for English
Generators
- XIG - Interchange to Italian Generator
Resources for Italian
Corpora
- Araneum Italicum, Gigaword Italian web corpus
- ColFIS Corpus e Lessico di Frequenza dell'Italiano Scritto
- Corpus di Italiano Scritto contemporaneo (CORIS/CODIS)
- Europarl corpus, sentence aligned with English
- HamleDT, harmonized dependency treebanks for many languages in a common annotation style
- Italian plain text and Co-occurrences at LCC
- LIP - Lessico di frequenza dell'Italiano Parlato - Access via BADIP
- MultiSemCor - English/Italian parallel corpus
- Oxford Text Archive Corpus of Italian Newspapers
- Tesoro della lingua italiana delle origini (TLIO)
Tagsets
- LemmaPro - Italian POS tagset for LemmaPro
Treebanks
- ISST - Italian Syntactic-Semantic Treebank
- TUT - Turin University Treebank
- VIT - Venice Italian Treebank
WordNets
- EuroWordNet
- MultiWordNet - a multilingual lexical database in which the Italian WordNet is strictly aligned with Princeton WordNet 1.6
Lexicons
- PAROLE-SIMPLE-CLIPS - a four-layered, general purpose computational lexicon
Links
- Evalita - Evaluation of NLP tools for Italian