Decoupling Encoder and Decoder Networks for Abstractive Document Summarization

Ying Xu, Jey Han Lau, Timothy Baldwin, Trevor Cohn


Abstract
Abstractive document summarization seeks to automatically generate a summary for a document, based on some abstract “understanding” of the original document. State-of-the-art techniques traditionally use attentive encoder–decoder architectures. However, due to the large number of parameters in these models, they require large training datasets and long training times. In this paper, we propose decoupling the encoder and decoder networks, and training them separately. We encode documents using an unsupervised document encoder, and then feed the document vector to a recurrent neural network decoder. With this decoupled architecture, we decrease the number of parameters in the decoder substantially, and shorten its training time. Experiments show that the decoupled model achieves performance comparable to state-of-the-art models on in-domain documents, but performs less well on out-of-domain documents.
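The decoupling described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the abstract does not specify the encoder or the decoder cell, so `encode_document` (a frozen stand-in that averages fixed pseudo-random word vectors), the Elman-style `DecoderRNN`, and the choice to condition the decoder by initialising its hidden state from the document vector are all assumptions made for illustration. The point it shows is structural: the encoder contributes no trainable parameters, and only the decoder's weights would be learned.

```python
import math
import random


def encode_document(tokens, dim=8):
    """Hypothetical stand-in for a separately trained, frozen document encoder.

    Averages deterministic pseudo-random word vectors. It has no trainable
    parameters, mirroring the idea that the encoder is trained on its own.
    """
    vec = [0.0] * dim
    for tok in tokens:
        rng = random.Random(tok)  # the same token always maps to the same vector
        for i in range(dim):
            vec[i] += rng.uniform(-1.0, 1.0)
    n = max(len(tokens), 1)
    return [v / n for v in vec]


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class DecoderRNN:
    """Tiny Elman-style RNN decoder; only these weights would be trained."""

    def __init__(self, vocab, doc_dim=8, hidden=6, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

        self.vocab = vocab
        self.w_init = init(hidden, doc_dim)    # document vector -> initial state (assumption)
        self.w_hh = init(hidden, hidden)       # recurrent weights
        self.w_xh = init(hidden, len(vocab))   # previous token -> hidden
        self.w_out = init(len(vocab), hidden)  # hidden -> vocabulary scores

    def num_parameters(self):
        mats = (self.w_init, self.w_hh, self.w_xh, self.w_out)
        return sum(len(m) * len(m[0]) for m in mats)

    def generate(self, doc_vec, max_len=5):
        """Greedy decoding conditioned only on the fixed document vector."""
        h = [math.tanh(x) for x in matvec(self.w_init, doc_vec)]
        prev = 0  # index 0 is the start symbol "<s>"
        out = []
        for _ in range(max_len):
            x_col = [row[prev] for row in self.w_xh]  # one-hot input selects a column
            rec = matvec(self.w_hh, h)
            h = [math.tanh(a + b) for a, b in zip(rec, x_col)]
            scores = matvec(self.w_out, h)
            prev = max(range(len(scores)), key=scores.__getitem__)
            if self.vocab[prev] == "</s>":
                break
            out.append(self.vocab[prev])
        return out


vocab = ["<s>", "</s>", "summary", "of", "document"]
decoder = DecoderRNN(vocab)
doc_vec = encode_document("a long input document about summarization".split())
summary = decoder.generate(doc_vec)
```

Because the weights here are random and untrained, the generated tokens are arbitrary. In a real setup the decoder would be trained on pairs of document vectors and reference summaries while the encoder stays fixed.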
Anthology ID: W17-1002
Volume: Proceedings of the MultiLing 2017 Workshop on Summarization and Summary Evaluation Across Source Types and Genres
Month: April
Year: 2017
Address: Valencia, Spain
Editors: George Giannakopoulos, Elena Lloret, John M. Conroy, Josef Steinberger, Marina Litvak, Peter Rankel, Benoit Favre
Venue: MultiLing
Publisher: Association for Computational Linguistics
Pages: 7–11
URL: https://aclanthology.org/W17-1002/
DOI: 10.18653/v1/W17-1002
Cite (ACL): Ying Xu, Jey Han Lau, Timothy Baldwin, and Trevor Cohn. 2017. Decoupling Encoder and Decoder Networks for Abstractive Document Summarization. In Proceedings of the MultiLing 2017 Workshop on Summarization and Summary Evaluation Across Source Types and Genres, pages 7–11, Valencia, Spain. Association for Computational Linguistics.
Cite (Informal): Decoupling Encoder and Decoder Networks for Abstractive Document Summarization (Xu et al., MultiLing 2017)
PDF: https://aclanthology.org/W17-1002.pdf