Universal Dependencies for Arabic

Dima Taji, Nizar Habash, Daniel Zeman


Abstract
We describe the process of creating NUDAR, a Universal Dependency treebank for Arabic. We present the conversion from the Penn Arabic Treebank to the Universal Dependency syntactic representation through an intermediate dependency representation. We discuss the challenges faced in the conversion of the trees, the decisions we made to solve them, and the validation of our conversion. We also present initial parsing results on NUDAR.
Anthology ID:
W17-1320
Volume:
Proceedings of the Third Arabic Natural Language Processing Workshop
Month:
April
Year:
2017
Address:
Valencia, Spain
Editors:
Nizar Habash, Mona Diab, Kareem Darwish, Wassim El-Hajj, Hend Al-Khalifa, Houda Bouamor, Nadi Tomeh, Mahmoud El-Haj, Wajdi Zaghouani
Venue:
WANLP
SIG:
SEMITIC
Publisher:
Association for Computational Linguistics
Pages:
166–176
URL:
https://aclanthology.org/W17-1320
DOI:
10.18653/v1/W17-1320
Cite (ACL):
Dima Taji, Nizar Habash, and Daniel Zeman. 2017. Universal Dependencies for Arabic. In Proceedings of the Third Arabic Natural Language Processing Workshop, pages 166–176, Valencia, Spain. Association for Computational Linguistics.
Cite (Informal):
Universal Dependencies for Arabic (Taji et al., WANLP 2017)
PDF:
https://aclanthology.org/W17-1320.pdf
Data:
Universal Dependencies