The Alborada-I3A Corpus of Disordered Speech

Oscar Saz, Eduardo Lleida, Carlos Vaquero, W.-Ricardo Rodríguez


Abstract
This paper describes the Alborada-I3A corpus of disordered speech, acquired during the recent years for the research in different speech technologies for the handicapped like Automatic Speech Recognition or pronunciation assessment. It contains more than 2 hours of speech from 14 young impaired speakers and nearly 9 hours from 232 unimpaired age-matched peers whose collaboration was possible by the joint work with different educational and assistive institutions. Furthermore, some extra resources are provided with the corpus, including the results of a perceptual human-based labeling of the lexical mispronunciations made by the impaired speakers. The corpus has been used to achieve results in different tasks like analyses on the speech production in impaired children, acoustic and lexical adaptation for ASR and studies on the speech proficiency of the impaired speakers. Finally, the full corpus is freely available for the research community with the only restrictions of maintaining all its data and resources for research purposes only and keeping the privacy of the speakers and their speech data
Anthology ID:
L10-1075
Volume:
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
Month:
May
Year:
2010
Address:
Valletta, Malta
Editors:
Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias
Venue:
LREC
SIG:
Publisher:
European Language Resources Association (ELRA)
Note:
Pages:
Language:
URL:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/119_Paper.pdf
DOI:
Bibkey:
Cite (ACL):
Oscar Saz, Eduardo Lleida, Carlos Vaquero, and W.-Ricardo Rodríguez. 2010. The Alborada-I3A Corpus of Disordered Speech. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10), Valletta, Malta. European Language Resources Association (ELRA).
Cite (Informal):
The Alborada-I3A Corpus of Disordered Speech (Saz et al., LREC 2010)
Copy Citation:
PDF:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/119_Paper.pdf