Generating flexible proper name references in text: Data, models and evaluation

Thiago Castro Ferreira, Emiel Krahmer, Sander Wubben


Abstract
This study introduces a statistical model able to generate variations of a proper name by taking into account the person to be mentioned, the discourse context and variation. The model relies on the REGnames corpus, a dataset with 53,102 proper name references to 1,000 people in different discourse contexts. We evaluate the versions of our model from the perspective of how human writers produce proper names, and also how human readers process them. The corpus and the model are publicly available.
Anthology ID: E17-1062
Volume: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers
Month: April
Year: 2017
Address: Valencia, Spain
Editors: Mirella Lapata, Phil Blunsom, Alexander Koller
Venue: EACL
Publisher: Association for Computational Linguistics
Pages: 655–664
URL: https://aclanthology.org/E17-1062
Cite (ACL): Thiago Castro Ferreira, Emiel Krahmer, and Sander Wubben. 2017. Generating flexible proper name references in text: Data, models and evaluation. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers, pages 655–664, Valencia, Spain. Association for Computational Linguistics.
Cite (Informal): Generating flexible proper name references in text: Data, models and evaluation (Castro Ferreira et al., EACL 2017)
PDF: https://aclanthology.org/E17-1062.pdf