Sanjana Sharma


2018

Degree based Classification of Harmful Speech using Twitter Data
Sanjana Sharma | Saksham Agrawal | Manish Shrivastava
Proceedings of the First Workshop on Trolling, Aggression and Cyberbullying (TRAC-2018)

Harmful speech takes various forms and has been plaguing social media in different ways. To crack down on the different degrees of hate speech and abusive behavior within it, classification needs to rest on more complex distinctions, which must be defined and accounted for, beyond whether the speech is racist, sexist, or directed against a particular group or community. This paper primarily describes how we created an ontological classification of harmful speech based on the degree of hateful intent and used it to annotate Twitter data accordingly. The key contribution of this paper is a new dataset of tweets, created on the basis of these ontological classes and the degrees of harmful speech found in the text. We also propose a supervised classification system for recognizing these harmful speech classes in text. This is preliminary work that lays a foundation for defining different classes of harmful speech; subsequent work will make its automatic detection more robust and efficient.