Underspecified Universal Dependency Structures as Inputs for Multilingual Surface Realisation

Simon Mille, Anja Belz, Bernd Bohnet, Leo Wanner


Abstract
In this paper, we present the datasets used in the Shallow and Deep Tracks of the First Multilingual Surface Realisation Shared Task (SR’18). For the Shallow Track, data in ten languages has been released: Arabic, Czech, Dutch, English, Finnish, French, Italian, Portuguese, Russian and Spanish. For the Deep Track, data in three languages is made available: English, French and Spanish. We describe in detail how the datasets were derived from the Universal Dependencies V2.0, and report on an evaluation of the Deep Track input quality. In addition, we examine the motivation for, and likely usefulness of, deriving NLG inputs from annotations in resources originally developed for Natural Language Understanding (NLU), and assess whether the resulting inputs supply enough information of the right kind for the final stage in the NLG process.
Anthology ID:
W18-6527
Volume:
Proceedings of the 11th International Conference on Natural Language Generation
Month:
November
Year:
2018
Address:
Tilburg University, The Netherlands
Editors:
Emiel Krahmer, Albert Gatt, Martijn Goudbeek
Venue:
INLG
SIG:
SIGGEN
Publisher:
Association for Computational Linguistics
Pages:
199–209
URL:
https://aclanthology.org/W18-6527
DOI:
10.18653/v1/W18-6527
Cite (ACL):
Simon Mille, Anja Belz, Bernd Bohnet, and Leo Wanner. 2018. Underspecified Universal Dependency Structures as Inputs for Multilingual Surface Realisation. In Proceedings of the 11th International Conference on Natural Language Generation, pages 199–209, Tilburg University, The Netherlands. Association for Computational Linguistics.
Cite (Informal):
Underspecified Universal Dependency Structures as Inputs for Multilingual Surface Realisation (Mille et al., INLG 2018)
PDF:
https://aclanthology.org/W18-6527.pdf
Data
NomBank