EigenSent: Spectral sentence embeddings using higher-order Dynamic Mode Decomposition

Subhradeep Kayal, George Tsatsaronis


Abstract
Distributed representations of words, or word embeddings, have motivated methods for calculating semantic representations of word sequences such as phrases, sentences and paragraphs. Most existing methods either use algorithms to learn such representations or improve on weighted averages of the word vectors. In this work, we experiment with spectral methods of signal representation and summarization as mechanisms for constructing such word-sequence embeddings in an unsupervised fashion. In particular, we explore an algorithm rooted in fluid dynamics, known as higher-order Dynamic Mode Decomposition, which is designed to capture the eigenfrequencies, and hence the fundamental transition dynamics, of periodic and quasi-periodic systems. We observe empirically that this approach, which we call EigenSent, can summarize transitions in a sequence of words and generate an embedding that represents the sequence itself well. To the best of the authors’ knowledge, this is the first application of a spectral decomposition and signal summarization technique to text for creating sentence embeddings. We test the efficacy of this algorithm in creating sentence embeddings on three public datasets, where it performs appreciably well. Moreover, we show that, because the two representations have complementary properties, concatenating the embeddings generated by EigenSent with simple word vector averaging achieves state-of-the-art results.
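As a rough illustration of the idea described in the abstract, the Python sketch below treats a sentence as a multivariate signal (one word vector per time step), applies a simplified delay-embedding form of Dynamic Mode Decomposition, and concatenates the resulting modes with the plain word-vector average. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the delay order `q`, the truncation `rank`, and the choice to flatten mode magnitudes into a fixed-length vector are all hypothetical and are not taken from the paper or from the DeepK/hoDMD-experiments repository.

```python
import numpy as np

def delay_embed(X, q):
    """Stack q time-shifted copies of X (d x T) into a (q*d) x (T-q+1) matrix."""
    d, T = X.shape
    cols = T - q + 1
    return np.vstack([X[:, i:i + cols] for i in range(q)])

def dmd_modes(H, rank):
    """Exact DMD on snapshot matrix H: return eigenvalues and dynamic modes."""
    X1, X2 = H[:, :-1], H[:, 1:]              # consecutive snapshot pairs
    U, s, Vh = np.linalg.svd(X1, full_matrices=False)
    r = min(rank, int(np.sum(s > 1e-10 * s[0])))  # drop near-zero directions
    U, s, Vh = U[:, :r], s[:r], Vh[:r]
    B = X2 @ Vh.conj().T / s                  # X2 V S^-1 (columns scaled by 1/s)
    A_tilde = U.conj().T @ B                  # reduced one-step transition operator
    eigvals, W = np.linalg.eig(A_tilde)
    return eigvals, B @ W

def sentence_embedding(word_vectors, q=2, rank=2):
    """Assumed recipe: leading mode magnitudes concatenated with the mean vector."""
    X = np.asarray(word_vectors, dtype=float).T   # d x T, one column per word
    d, _ = X.shape
    eigvals, modes = dmd_modes(delay_embed(X, q), rank)
    order = np.argsort(-np.abs(eigvals))          # strongest dynamics first
    flat = np.abs(modes[:d, order]).T.ravel()     # keep the un-delayed block
    spectral = np.zeros(d * rank)                 # zero-pad if fewer modes survive
    spectral[:flat.size] = flat
    return np.concatenate([spectral, X.mean(axis=1)])

# Toy usage with random stand-ins for word vectors (6 words, 4 dimensions).
rng = np.random.default_rng(0)
words = rng.normal(size=(6, 4))
emb = sentence_embedding(words, q=2, rank=2)
```

With these toy settings the spectral part has `d * rank` entries and the averaged part has `d` entries, so the two halves can be inspected or weighted separately. The sentence needs at least `q + 1` words for the delay-embedded matrix to contain a snapshot pair.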
Anthology ID:
P19-1445
Volume:
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2019
Address:
Florence, Italy
Editors:
Anna Korhonen, David Traum, Lluís Màrquez
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
4536–4546
URL:
https://aclanthology.org/P19-1445
DOI:
10.18653/v1/P19-1445
Cite (ACL):
Subhradeep Kayal and George Tsatsaronis. 2019. EigenSent: Spectral sentence embeddings using higher-order Dynamic Mode Decomposition. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 4536–4546, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
EigenSent: Spectral sentence embeddings using higher-order Dynamic Mode Decomposition (Kayal & Tsatsaronis, ACL 2019)
PDF:
https://aclanthology.org/P19-1445.pdf
Code:
DeepK/hoDMD-experiments
Data:
SST, SST-5