Enriching a Treebank to Investigate Relative Clause Extraposition in German

Jan Strunk


Abstract
I describe the construction of a corpus for research on relative clause extraposition in German based on the treebank TüBa-D/Z. I also define an annotation scheme for the relations between relative clauses and their antecedents, which is added to the syntactic trees as a second annotation level. This additional level directly represents the relevant parts of the relative construction and also serves as a locus for further features, some derived automatically from the underlying treebank and some added manually. Finally, I report the results of two pilot studies using this enriched treebank. The first study tests claims made in the theoretical literature on relative clause extraposition with regard to syntactic locality, definiteness, and restrictiveness. It shows that although the theoretical claims often point in the right direction, they go too far by positing categorical constraints that are not supported by the corpus data, and thus underestimate the complexity of the data. The second pilot study takes a step toward accounting for this complexity by demonstrating the potential of the enriched treebank for building a multivariate model of relative clause extraposition as a syntactic alternation.
Anthology ID:
L10-1625
Volume:
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
Month:
May
Year:
2010
Address:
Valletta, Malta
Editors:
Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
URL:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/917_Paper.pdf
Cite (ACL):
Jan Strunk. 2010. Enriching a Treebank to Investigate Relative Clause Extraposition in German. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10), Valletta, Malta. European Language Resources Association (ELRA).
Cite (Informal):
Enriching a Treebank to Investigate Relative Clause Extraposition in German (Strunk, LREC 2010)