Cross-lingual dependency parsing for closely related languages - Helsinki’s submission to VarDial 2017

Jörg Tiedemann


Abstract
This paper describes the submission from the University of Helsinki to the shared task on cross-lingual dependency parsing at VarDial 2017. We present work on annotation projection and treebank translation that gave good results for all three target languages in the test set. In particular, Slovak seems to work well with information coming from the Czech treebank, which is in line with related work. The attachment scores of the cross-lingual models even surpass those of the fully supervised models trained on the target language treebank. Croatian is the most difficult language in the test set, and the improvements over the baseline are rather modest. Norwegian works best with information coming from Swedish, whereas Danish contributes surprisingly little.
Anthology ID:
W17-1216
Volume:
Proceedings of the Fourth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial)
Month:
April
Year:
2017
Address:
Valencia, Spain
Editors:
Preslav Nakov, Marcos Zampieri, Nikola Ljubešić, Jörg Tiedemann, Shevin Malmasi, Ahmed Ali
Venue:
VarDial
Publisher:
Association for Computational Linguistics
Pages:
131–136
URL:
https://aclanthology.org/W17-1216
DOI:
10.18653/v1/W17-1216
Cite (ACL):
Jörg Tiedemann. 2017. Cross-lingual dependency parsing for closely related languages - Helsinki’s submission to VarDial 2017. In Proceedings of the Fourth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial), pages 131–136, Valencia, Spain. Association for Computational Linguistics.
Cite (Informal):
Cross-lingual dependency parsing for closely related languages - Helsinki’s submission to VarDial 2017 (Tiedemann, VarDial 2017)
PDF:
https://aclanthology.org/W17-1216.pdf
Data
Universal Dependencies