A Supervised Approach to Extractive Summarisation of Scientific Papers

Ed Collins, Isabelle Augenstein, Sebastian Riedel


Abstract
Automatic summarisation is a popular approach to reducing a document to its main arguments. Recent research in the area has focused on neural approaches to summarisation, which can be very data-hungry. However, few large datasets exist, and none for the traditionally popular domain of scientific publications, which opens up challenging research avenues centered on encoding large, complex documents. In this paper, we introduce a new dataset for summarisation of computer science publications by exploiting a large resource of author-provided summaries, and show straightforward ways of extending it further. We develop models on the dataset that use both neural sentence encoding and traditionally used summarisation features, and show that models which encode sentences as well as their local and global context perform best, significantly outperforming well-established baseline methods.
Anthology ID:
K17-1021
Volume:
Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017)
Month:
August
Year:
2017
Address:
Vancouver, Canada
Editors:
Roger Levy, Lucia Specia
Venue:
CoNLL
SIG:
SIGNLL
Publisher:
Association for Computational Linguistics
Pages:
195–205
URL:
https://aclanthology.org/K17-1021
DOI:
10.18653/v1/K17-1021
Cite (ACL):
Ed Collins, Isabelle Augenstein, and Sebastian Riedel. 2017. A Supervised Approach to Extractive Summarisation of Scientific Papers. In Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017), pages 195–205, Vancouver, Canada. Association for Computational Linguistics.
Cite (Informal):
A Supervised Approach to Extractive Summarisation of Scientific Papers (Collins et al., CoNLL 2017)
PDF:
https://aclanthology.org/K17-1021.pdf
Code:
EdCo95/scientific-paper-summarisation
Data:
CSPubSum