German Today: a really extensive Corpus of Spoken Standard German

Caren Brinckmann, Stefan Kleiner, Ralf Knöbl, Nina Berend


Abstract
The research project “German Today” aims to determine the amount of regional variation in (near-)standard German spoken by young and older educated adults and to identify and locate regional features. To this end, we compile an areally extensive corpus of read and spontaneous German speech. Secondary school students and 50-to-60-year-old locals are recorded in 160 cities throughout the German speaking area of Europe. All participants read a number of short texts and a word list, name pictures, translate words and sentences from English, answer questions in a sociobiographic interview, and take part in a map task experiment. The resulting corpus comprises over 1,000 hours of speech, which is transcribed orthographically. Automatically derived broad phonetic transcriptions, selective manual narrow phonetic transcriptions, and variationalist annotations are added. Focussing on phonetic variation we aim to show to what extent national or regional standards exist in spoken German. Furthermore, the linguistic variation due to different contextual styles (read vs. spontaneous speech) shall be analysed. Finally, the corpus enables us to investigate whether linguistic change has occurred in spoken (near-)standard German.
Anthology ID:
L08-1516
Volume:
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
Month:
May
Year:
2008
Address:
Marrakech, Morocco
Editors:
Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Daniel Tapias
Venue:
LREC
SIG:
Publisher:
European Language Resources Association (ELRA)
Note:
Pages:
Language:
URL:
http://www.lrec-conf.org/proceedings/lrec2008/pdf/806_paper.pdf
DOI:
Bibkey:
Cite (ACL):
Caren Brinckmann, Stefan Kleiner, Ralf Knöbl, and Nina Berend. 2008. German Today: a really extensive Corpus of Spoken Standard German. In Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08), Marrakech, Morocco. European Language Resources Association (ELRA).
Cite (Informal):
German Today: a really extensive Corpus of Spoken Standard German (Brinckmann et al., LREC 2008)
Copy Citation:
PDF:
http://www.lrec-conf.org/proceedings/lrec2008/pdf/806_paper.pdf