Probabilistic Verb Selection for Data-to-Text Generation

Dell Zhang, Jiahao Yuan, Xiaoling Wang, Adam Foster


Abstract
In data-to-text Natural Language Generation (NLG) systems, computers need to find the right words to describe phenomena seen in the data. This paper focuses on the problem of choosing appropriate verbs to express the direction and magnitude of a percentage change (e.g., in stock prices). Rather than simply using the same verbs again and again, we present a principled data-driven approach to this problem based on Shannon’s noisy-channel model so as to bring variation and naturalness into the generated text. Our experiments on three large-scale real-world news corpora demonstrate that the proposed probabilistic model can be learned to accurately imitate human authors’ pattern of usage around verbs, outperforming the state-of-the-art method significantly.
Anthology ID:
Q18-1038
Volume:
Transactions of the Association for Computational Linguistics, Volume 6
Year:
2018
Address:
Cambridge, MA
Editors:
Lillian Lee, Mark Johnson, Kristina Toutanova, Brian Roark
Venue:
TACL
Publisher:
MIT Press
Pages:
511–527
URL:
https://aclanthology.org/Q18-1038
DOI:
10.1162/tacl_a_00038
Cite (ACL):
Dell Zhang, Jiahao Yuan, Xiaoling Wang, and Adam Foster. 2018. Probabilistic Verb Selection for Data-to-Text Generation. Transactions of the Association for Computational Linguistics, 6:511–527.
Cite (Informal):
Probabilistic Verb Selection for Data-to-Text Generation (Zhang et al., TACL 2018)
PDF:
https://aclanthology.org/Q18-1038.pdf