Modeling Online Discourse with Coupled Distributed Topics

Nikita Srivatsan, Zachary Wojtowicz, Taylor Berg-Kirkpatrick


Abstract
In this paper, we propose a deep, globally normalized topic model that incorporates structural relationships connecting documents in socially generated corpora, such as online forums. Our model (1) captures discursive interactions along observed reply links in addition to traditional topic information, and (2) incorporates latent distributed representations arranged in a deep architecture, which enables a GPU-based mean-field inference procedure that scales efficiently to large data. We apply our model to a new social media dataset consisting of 13M comments mined from the popular internet forum Reddit, a domain that poses significant challenges to models that do not account for relationships connecting user comments. We evaluate against existing methods across multiple metrics including perplexity and metadata prediction, and qualitatively analyze the learned interaction patterns.
Anthology ID:
D18-1496
Original:
D18-1496v1
Version 2:
D18-1496v2
Volume:
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Month:
October-November
Year:
2018
Address:
Brussels, Belgium
Editors:
Ellen Riloff, David Chiang, Julia Hockenmaier, Jun’ichi Tsujii
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
4673–4682
URL:
https://aclanthology.org/D18-1496
DOI:
10.18653/v1/D18-1496
Cite (ACL):
Nikita Srivatsan, Zachary Wojtowicz, and Taylor Berg-Kirkpatrick. 2018. Modeling Online Discourse with Coupled Distributed Topics. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 4673–4682, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal):
Modeling Online Discourse with Coupled Distributed Topics (Srivatsan et al., EMNLP 2018)
PDF:
https://aclanthology.org/D18-1496.pdf