Psycholinguistic Models of Sentence Processing Improve Sentence Readability Ranking

David M. Howcroft, Vera Demberg


Abstract
While previous research on readability has typically focused on document-level measures, recent work in areas such as natural language generation has pointed out the need for sentence-level readability measures. Much of psycholinguistics has long focused on processing measures that provide difficulty estimates on a word-by-word basis. However, these psycholinguistic measures have not yet been tested on sentence readability ranking tasks. In this paper, we use four psycholinguistic measures (idea density, surprisal, integration cost, and embedding depth) to test whether these features are predictive of readability levels. We find that psycholinguistic features significantly improve performance by up to 3 percentage points over a standard document-level readability metric baseline.
Anthology ID:
E17-1090
Volume:
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers
Month:
April
Year:
2017
Address:
Valencia, Spain
Editors:
Mirella Lapata, Phil Blunsom, Alexander Koller
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
958–968
URL:
https://aclanthology.org/E17-1090
Cite (ACL):
David M. Howcroft and Vera Demberg. 2017. Psycholinguistic Models of Sentence Processing Improve Sentence Readability Ranking. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers, pages 958–968, Valencia, Spain. Association for Computational Linguistics.
Cite (Informal):
Psycholinguistic Models of Sentence Processing Improve Sentence Readability Ranking (Howcroft & Demberg, EACL 2017)
PDF:
https://aclanthology.org/E17-1090.pdf