Does Size Matter – How Much Data is Required to Train a REG Algorithm?

Mariët Theune (1), Ruud Koolen (2), Emiel Krahmer (2), Sander Wubben (2)
(1) University of Twente, (2) Tilburg University


Abstract

In this paper we investigate how much data is required to train an algorithm for attribute selection, a subtask of Referring Expressions Generation (REG). To enable comparison between training sets of different sizes, a systematic training method was developed. The results show that, depending on the complexity of the domain, training on 10 to 20 items may already lead to good performance.




Full paper: http://www.aclweb.org/anthology/P/P11/.pdf