Dating Documents using Graph Convolution Networks

Shikhar Vashishth, Shib Sankar Dasgupta, Swayambhu Nath Ray, Partha Talukdar


Abstract
The date of a document is essential for many important tasks, such as document retrieval, summarization, and event detection. While existing approaches for these tasks assume accurate knowledge of the document date, this is not always available, especially for arbitrary documents from the Web. Document dating is a challenging problem which requires inference over the temporal structure of the document. Prior document dating systems have largely relied on handcrafted features while ignoring such document-internal structures. In this paper, we propose NeuralDater, a Graph Convolutional Network (GCN) based document dating approach which jointly exploits the syntactic and temporal graph structures of a document in a principled way. To the best of our knowledge, this is the first application of deep learning to the problem of document dating. Through extensive experiments on real-world datasets, we find that NeuralDater significantly outperforms the state-of-the-art baseline by 19% absolute (45% relative) in accuracy.
Anthology ID:
P18-1149
Volume:
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2018
Address:
Melbourne, Australia
Editors:
Iryna Gurevych, Yusuke Miyao
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
1605–1615
URL:
https://aclanthology.org/P18-1149
DOI:
10.18653/v1/P18-1149
Cite (ACL):
Shikhar Vashishth, Shib Sankar Dasgupta, Swayambhu Nath Ray, and Partha Talukdar. 2018. Dating Documents using Graph Convolution Networks. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1605–1615, Melbourne, Australia. Association for Computational Linguistics.
Cite (Informal):
Dating Documents using Graph Convolution Networks (Vashishth et al., ACL 2018)
PDF:
https://aclanthology.org/P18-1149.pdf
Poster:
P18-1149.Poster.pdf
Code:
malllabiisc/NeuralDater
Data:
New York Times Annotated Corpus