Did you offend me? Classification of Offensive Tweets in Hinglish Language

Puneet Mathur, Ramit Sawhney, Meghna Ayyar, Rajiv Shah


Abstract
The use of code-switched languages (e.g., Hinglish, which blends Hindi with English) is becoming increasingly popular on Twitter because of the ease of communicating in native languages. However, spelling variations and the absence of grammar rules introduce ambiguity and make it difficult to understand the text automatically. This paper presents the Multi-Input Multi-Channel Transfer Learning based model (MIMCT) to detect offensive (hate speech or abusive) Hinglish tweets from the proposed Hinglish Offensive Tweet (HOT) dataset using transfer learning coupled with multiple feature inputs. Specifically, it takes multiple primary word embeddings along with secondary extracted features as inputs to train a multi-channel CNN-LSTM architecture that has been pre-trained on English tweets through transfer learning. The proposed MIMCT model outperforms baseline supervised classification models as well as transfer learning based CNN and LSTM models, establishing itself as the state of the art in the unexplored domain of Hinglish offensive text classification.
Anthology ID: W18-5118
Volume: Proceedings of the 2nd Workshop on Abusive Language Online (ALW2)
Month: October
Year: 2018
Address: Brussels, Belgium
Editors: Darja Fišer, Ruihong Huang, Vinodkumar Prabhakaran, Rob Voigt, Zeerak Waseem, Jacqueline Wernimont
Venue: ALW
Publisher: Association for Computational Linguistics
Pages: 138–148
URL: https://aclanthology.org/W18-5118
DOI: 10.18653/v1/W18-5118
Cite (ACL): Puneet Mathur, Ramit Sawhney, Meghna Ayyar, and Rajiv Shah. 2018. Did you offend me? Classification of Offensive Tweets in Hinglish Language. In Proceedings of the 2nd Workshop on Abusive Language Online (ALW2), pages 138–148, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal): Did you offend me? Classification of Offensive Tweets in Hinglish Language (Mathur et al., ALW 2018)
PDF: https://aclanthology.org/W18-5118.pdf
Code: pmathur5k10/Hinglish-Offensive-Text-Classification