Building the Basque PropBank

Izaskun Aldezabal, María Jesús Aranzabe, Arantza Díaz de Ilarraza, Ainara Estarrona


Abstract
This paper presents the work that has been carried out to annotate semantic roles in the Basque Dependency Treebank (BDT). We will describe the resources we have used and the way the annotation of 100 verbs has been done. We decide to follow the model proposed in the PropBank project that has been deployed in other languages, such as Chinese, Spanish, Catalan and Russian. The resources used are: an in-house database with syntactic/semantic subcategorization frames for Basque verbs, an English-Basque verb mapping based on Levin’s classification and the BDT itself. Detailed guidelines for human taggers have been established as a result of this annotation process. In addition, we have characterized the information associated to the semantic tag. Besides, and based on this study, we will define semi-automatic procedures that will facilitate the task of manual annotation for the rest of the verbs of the Treebank. We have also adapted AbarHitz, a tool used in the construction of the BDT, for the task of annotating semantic roles according to the proposed characterization.
Anthology ID:
L10-1149
Volume:
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
Month:
May
Year:
2010
Address:
Valletta, Malta
Editors:
Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias
Venue:
LREC
SIG:
Publisher:
European Language Resources Association (ELRA)
Note:
Pages:
Language:
URL:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/217_Paper.pdf
DOI:
Bibkey:
Cite (ACL):
Izaskun Aldezabal, María Jesús Aranzabe, Arantza Díaz de Ilarraza, and Ainara Estarrona. 2010. Building the Basque PropBank. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10), Valletta, Malta. European Language Resources Association (ELRA).
Cite (Informal):
Building the Basque PropBank (Aldezabal et al., LREC 2010)
Copy Citation:
PDF:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/217_Paper.pdf