Massively Multilingual Neural Grapheme-to-Phoneme Conversion

Ben Peters, Jon Dehdari, Josef van Genabith


Abstract
Grapheme-to-phoneme conversion (g2p) is necessary for text-to-speech and automatic speech recognition systems. Most g2p systems are monolingual: they require language-specific data or handcrafted rules. Such systems are difficult to extend to low-resource languages, for which data and handcrafted rules are not available. As an alternative, we present a neural sequence-to-sequence approach to g2p that is trained on spelling–pronunciation pairs in hundreds of languages. The system shares a single encoder and decoder across all languages, allowing it to exploit the intrinsic similarities between different writing systems. We show an 11% improvement in phoneme error rate over an approach based on adapting high-resource monolingual g2p models to low-resource languages. Our model is also much more compact than previous approaches.
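The multilingual setup described above can be illustrated with a minimal sketch. A common way to steer a single shared encoder/decoder toward a target language is to mark the grapheme sequence with a language-ID token; the tag names and function below are illustrative assumptions, not taken from the paper's actual code (bpopeters/mg2p).

```python
def encode_input(word, lang):
    """Turn a spelling into a grapheme token sequence for a shared
    multilingual encoder, prefixed with a language-ID tag so one model
    can serve hundreds of languages. (Illustrative sketch only.)"""
    return [f"<{lang}>"] + list(word)

# The decoder would then emit a phoneme sequence for the tagged input.
print(encode_input("hello", "eng"))  # ['<eng>', 'h', 'e', 'l', 'l', 'o']
```

Because every language passes through the same encoder and decoder, languages with similar writing systems can share learned spelling–pronunciation regularities, which is what makes the approach viable for low-resource languages.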
Anthology ID: W17-5403
Volume: Proceedings of the First Workshop on Building Linguistically Generalizable NLP Systems
Month: September
Year: 2017
Address: Copenhagen, Denmark
Editors: Emily Bender, Hal Daumé III, Allyson Ettinger, Sudha Rao
Venue: WS
Publisher: Association for Computational Linguistics
Pages: 19–26
URL: https://aclanthology.org/W17-5403
DOI: 10.18653/v1/W17-5403
Cite (ACL):
Ben Peters, Jon Dehdari, and Josef van Genabith. 2017. Massively Multilingual Neural Grapheme-to-Phoneme Conversion. In Proceedings of the First Workshop on Building Linguistically Generalizable NLP Systems, pages 19–26, Copenhagen, Denmark. Association for Computational Linguistics.
Cite (Informal):
Massively Multilingual Neural Grapheme-to-Phoneme Conversion (Peters et al., 2017)
PDF: https://aclanthology.org/W17-5403.pdf
Code: bpopeters/mg2p