MORSE: Semantic-ally Drive-n MORpheme SEgment-er

Tarek Sakakini, Suma Bhat, Pramod Viswanath


Abstract
In this paper, we present a novel framework for morpheme segmentation that uses the morpho-syntactic regularities preserved by word representations, in addition to orthographic features, to segment words into morphemes. This framework is the first to consider vocabulary-wide syntactico-semantic information for this task. We also analyze the deficiencies of available benchmarking datasets and introduce our own dataset, which was created on the basis of compositionality. We validate our algorithm across datasets and present state-of-the-art results.
Anthology ID:
P17-1051
Volume:
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2017
Address:
Vancouver, Canada
Editors:
Regina Barzilay, Min-Yen Kan
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
552–561
URL:
https://aclanthology.org/P17-1051
DOI:
10.18653/v1/P17-1051
Cite (ACL):
Tarek Sakakini, Suma Bhat, and Pramod Viswanath. 2017. MORSE: Semantic-ally Drive-n MORpheme SEgment-er. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 552–561, Vancouver, Canada. Association for Computational Linguistics.
Cite (Informal):
MORSE: Semantic-ally Drive-n MORpheme SEgment-er (Sakakini et al., ACL 2017)
PDF:
https://aclanthology.org/P17-1051.pdf
Presentation:
 P17-1051.Presentation.pdf
Software:
 P17-1051.Software.zip
Dataset:
 P17-1051.Datasets.zip
Video:
 https://aclanthology.org/P17-1051.mp4