AI ML NIT Patna at HASOC 2019: Deep Learning Approach for Identification of Abusive Content ? Kirti Kumari1 and Jyoti Prakash Singh1 National Institute of Technology Patna, Patna, India {kirti.cse15, jps}@nitp.ac.in Abstract. Social media is a globally open place for online users to ex- press their thoughts and opinions. There are numerous advantages of social media but some severe challenges are also associated with it. Anti- social and abusive conduct has become more common due to the emer- gence of social media. Identification of Hate Speech, Cyber-aggression, and Offensive language is a very challenging task. The nature of struc- tures of the natural language makes this task even more tedious. Being a challenging task, we are fascinated to propose a deep learning system based on Convolutional Neural Networks to identify Hate Speech, Offen- sive language, and Profanity. We have done experiments with three differ- ent embeddings. These experiments have been associated with comments of code-mixed Hindi-English and multi-domain social media text. We have found that One-hot embedding performed better than pre-trained fastText embedding for the code-mixed Hindi dataset. Keywords: Hate Speech · Offensive Language · Convolutional Neural Network · GloVe · fastText 1 Introduction In social media, anyone is free to post their ideas and views without declaring his/her identity. Detection of Cyber-aggression [9], Hate Speech [14], Offensive language and Profanity used by social media users have become one of the major challenges of the current scenario. Social media users are being targeted by Hate Speech and Offensive language such as abusive, hurtful, derogatory or unlaw- ful user-generated content by some mischievous users. These online platforms provide an open place to discuss and comment on different matters but abusive comments and online violence on individuals have turned this into a very impor- tant social issue. As a result of the misuse of online interactions, a large number of people have fallen into depression, anxiety, and other mental health problems. A survey undertaken by Feminism in India has noted that online abuse has been ? Copyright c 2019 for this paper by its authors. Use permitted under Creative Com- mons License Attribution 4.0 International (CC BY 4.0). FIRE 2019, 12-15 Decem- ber 2019, Kolkata, India. K. Kumari et al. faced by more than 50% of females in major cities of India1 . During the study2 (July 2017), 66% of online abused people reported that they feel powerlessness in their capacity to react to Internet violence or harassment. These statistics emphasize the necessity of an automated system for the detection of abusive comments as well as the moderation of the system. As a result, several research efforts across the world have emerged over the past few years to identify abusive content [1, 2, 4, 12, 15] using machine learning and natural language processing. Hate Speech detection becomes a challenging task because it can not be ad- dressed simply by filtering words. In addition to the meaning of words, a lot of other factors such as context information, characteristics of the user, the gen- der of individual people have to be considered for the detection of Hate Speech. Abuse is a term that includes many varying forms of fine-grained adverse ex- pressions in the framework of natural language. For example, Nobata et al. [12] concentrated on Hate Speech, Derogatory language, and Profanity while Wassem and Hovy [15] focused on racism and sexism types of abuse. Definitions tend to be overlapping and ambiguous for different types of abuse. The Hate Speech and Offensive Content Identification in Indo-European Languages (HASOC) or- ganizer team defines Hate Speech as describing negative attributes or deficiencies toward groups because of race, political opinion, sexual orientation, gender, so- cial status, health condition or similar [10]. A large number of works [1, 5, 6, 15] are reported by the researchers on Hate Speech detection for English language only and very few works [2, 8] have been reported for mixed languages such as English-Hindi, English-Bengali, and other languages. In this paper, we have used the multi-lingual HASOC Corpus [10] and pro- posed a deep learning model based on Convolutional Neural Networks (CNN) to identify Hate Speech and Offensive content on multi-domain social media platforms collected from Facebook and Twitter. We have used three types of embeddings of text and transliteration tools to normalize Devanagari to Roman script for code-mixed Hindi corpus. The rest of the paper is structured as follows. Section 2 presents associated works for the detection of Hate Speech and Offensive Language while Section 3 presents our suggested framework for identification of Hate Speech, Offensive language, and Profanity. Section 4 presents the finding of the suggested scheme. Finally, in Section 5, we have concluded the paper and have discussed the future directions for these tasks. 2 Related Works As social media and online platforms have grown in terms of impact and accep- tance of users, various problems such as Hate Speech, Profanity and Offensive language on these platforms have increased drastically. Several systems have been proposed by researchers for automatic detection and classification of these problems. 1 https://blog.ipleaders.in/cyber-stalking 2 https://www.statista.com/statistics/784838/online-harassment-impact-on-women/ Deep Learning Approach for Identification of Abusive Content Burnap and Williams [3] identified Hate Speech on the Twitter network fo- cusing mainly on racism. Nobata et al. [12] used character n-gram features and reported that character n-gram features are the most predictive features for the detection of Hate Speech. Wassem and Hovy [15] have focused on Hate tweet detection related to racism and sexism. They have used Logistic Regression as classifier and character n-gram features to classify the tweets. Davidson et al. [5] have found that racist and homophobic tweets are generally Hate Speech and sexist tweets are in general offensive. They have used Logistic Regression, Naive Bayes, Decision Trees, Random Forests and Support Vector Machines (SVM) to classify the tweets. Mehdat and Tetreault [11] also found that character n-gram features are more predictive than token n-gram features for Hate Speech detec- tion. Badjatiya et al. [1] detected Hate Speech using deep learning models such as Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM). They have experimented with several embeddings named Random, GloVe and fastText embeddings and have found that combination of LSTM, Random em- bedding and Gradient Boosted Decision Trees (GBDTs) had performed the best for classifying the Hate tweets. Del Vigna et al. [6] have classified the Hate Speech of Facebook comments into fine-grained classes. They have used two different approaches with SVM and LSTM to identify the Hate comments. Bohra et al. [2] have identified code-mixed Hate tweets, especially for Hindi and English lan- guage on Twitter. Kamble and Joshi [8] have also focused on Hindi and English code-mixed tweets and have detected Hate Speech using various deep learning models such as CNN, LSTM, and Bi-directional LSTM. A lot of research works have been done for the English language but very few works have been done for the other languages and code-mixed languages. In this paper, we have focused on multi-lingual text, especially for Hindi and English languages and used a deep neural network model to detect Hate Speech, Offensive language, and Profanity. 3 Methodology This section describes the details of datasets and proposed approaches. The description of the datasets used in the experiments has been given in sub-section 3.1 and the details of the proposed approaches to identify Hate Speech, Offensive language and Profanity are presented in sub-section 3.2. 3.1 Description of Datasets In this paper, the multilingual datasets [10] provided by Hate Speech and Offen- sive Content Identification in Indo-European Languages (HASOC)3 have been used. The shared tasks of HASOC have been provided for three languages (En- glish, code-mixed Hindi, and German) and each language, there are three sub- tasks (Sub-task1, Sub-task2, and Sub-task3). The provided comments have been collected from Twitter and Facebook. The details of the sub-tasks are: Sub-task1 3 https://hasoc2019.github.io K. Kumari et al. is a coarse-grained binary classification that needed respondents to classify tweets into two groups: Hate and Offensive (HOF) and Not Hate-Offensive (NOT). (i) HOF: This post includes hateful, offensive or profane contents and (ii) NOT: This post contains neither Hate Speech nor offensive content. Sub-task2 is a fine-grained classification. Hate Speech and Offensive posts from the Sub-task1 are further classified into three categories: (i) Hate Speech (HATE): Posts un- der this class contain Hate Speech contents. (ii) Offensive (OFFN): Posts under this class contain offensive contents. (iii) Profane (PRFN): These posts contain profane words. In Sub-task3, the category of abuse is checked and includes only the posts marked as HOF in Sub-task1. Sub-task3 is further grouped into two classes: (i) Targeted Insult (TIN): Such posts contain humiliating/insulting or threatening content. (ii) Untargeted Insult (UNT): Posts that contain untar- geted swearing and profanity, those posts of particular profanity which are not targeted at anybody but contain language that is not acceptable. The sizes of the training datasets for English and Hindi corpus are 5852 and 4665 posts, respectively. Test data for English and Hindi corpus are 1153 and 1318 posts, respectively. The detailed description of the datasets used in this work is given in Table 1. Table 1. Description of datasets Corpus Sub-tasks Class #training samples #testing samples HOF 2261 288 Sub-task1 NOT 3591 865 HATE 1143 124 English Sub-task2 OFFN 451 71 PRFN 667 93 TIN 2041 245 Sub-task3 UNT 220 43 HOF 2469 605 Sub-task1 NOT 2196 713 HATE 556 190 Hindi Sub-task2 OFFN 676 197 PRFN 1237 218 TIN 1545 542 Sub-task3 UNT 924 63 3.2 Proposed Approach The proposed methodology is based on the Convolutional Neural Networks (CNN) model, a block diagram of which is shown in Figure 1. At first, we have removed the stopwords from the comments by using Natural Language Toolkit4 . The embedding layer is the representation of inputs in the deep neural 4 www.nltk.org Deep Learning Approach for Identification of Abusive Content network models. The embedding layer encodes the word used in the comments. We have done experiments with three different embeddings including One-hot embedding, GloVe embedding [13] and fastText embedding [7]. In the case of the Hindi dataset, we have used only One-hot and fastText embeddings. For One-hot embedding, we have transliterated Devanagari to Roman script by using translit- eration tools5 . These tools identify the Unicode patterns and transliterate the Devanagari script to Roman script. The dimensions of a word embeddings are kept 300 for pre-trained (GloVe and fastText) and 100 for One-hot embeddings. This embedded comment is fed to the CNN layer. In our case, we have used four layers of convolution and one layer of the max-pooling in between the 3rd and 4th convolution layers. At last, we have used the flatten layer followed by a dense layer. Within the layer, we have used sigmoid and softmax activation function at the dense layer for binary class and multi-class problems, respectively. In every hidden layer, we have used the Rectified Linear Unit (ReLU) activation function. Number of filters used in 1st , 2nd , 3rd and 4th convolutional layers are 8, 16, 64 and 64, respectively. The filter size and max-pooling size in both cases are used as 4. We have used 80% of the samples for training and the remaining are used for validation. The details of the hyper-parameters of our experiments are shown in Table 2. In all the experiments we have used Keras library6 . Embedding Comment Preprocessing CNN Predicted Class Layer Fig. 1. Overview of the proposed model Table 2. Hyper-parameters used in the proposed approach Parameter description Values Maximum length of comment 100 Size of filters 4 Number of filters 8, 16, 64, 64 Pooling size 4 Activation function ReLU, Sigmoid, Softmax Number of Convolutional layers 4 Learning rate 0.001 Batch size 32 Loss function Binary cross-entropy Optimizer Rms-prop Epoch 100 5 https://pandey.github.io/posts/transliterate-devanagari-to-latin.html 6 http://keras.io K. Kumari et al. 4 Results and Discussions This section describes the results obtained on the English and code-mixed Hindi languages for all three Sub-tasks. For both datasets, we have used several em- beddings followed by a Convolutional Neural Networks (CNN) model to classify the comments into their output classes. Using a macro-averaged F1-score and weighted F1-score, classification models have been evaluated for all the tasks. Table 3 shows the results obtained by proposed approaches to test samples with different combinations of embeddings, which have been used for training and testing. Our system achieved an approximate weighted F1-score of 69%, 55% and 60% for English Sub-task1, Sub-task2, and Sub-task3, respectively with fast- Text embedding. For the Hindi Sub-task1, Sub-task2 and Sub-task3, our system achieved an approximate weighted F1-score of 78%, 52%, and 66%, respectively with One-hot embedding. It is clear from the Table 3 that fastText embedding is performing better than GloVe and One-hot embeddings in case of the English dataset and One-hot embedding is performing better than fastText embedding for the Hindi dataset. Our results are ranked 17th , 20th , 12th among participants of shared tasks for Hindi Sub-task1, Sub-task2 and Sub-task3, respectively and the results on English dataset are positioned 56th , 35th , 30th among participants of shared tasks for Sub-task1, Sub-task2 and Sub-task3, respectively. The proposed model has performed better for Sub-task1 and Sub-task3 which can be seen in Table 3. Table 3 also shows that misclassified instances are more for Sub-task2 in both datasets. The main reason for the misclassification of Sub- task2 is that it is a fine-grained classification task. Even for the human being, it is very difficult to differentiate among the Hate Speech, Offensive language, and Profanity; not only due to very fine but also very fade differences among these classes. Just filtering the keywords will generally result in many false- positive cases because context plays a major role in the detection of the Hate Speech, Offensive language, and Profanity. Another important reason for the misclassification of classifiers is that the datasets are very unbalanced which can be seen in Table 1. 5 Conclusion and Future Work Hate Speech and Offensive language identification is a challenging task. Many numbers of research have been carried out in the domain of Hate Speech detec- tion for the English language but very few researches are reported for the other languages and multi-lingual text. This research work has been focused on multi- lingual text classification, especially for Hindi and English code-mixed text. In this paper, a deep learning model for the identification of Hate Speech, Offensive contents, and Profanity on multi-domain platforms have been proposed. Three types of embeddings: One-hot, pre-trained GloVe and fastText embeddings have been used in the experiments. It has been found that fastText embedding has performed better than the other two embeddings for the English dataset and One-hot has performed better for the Hindi dataset. Deep Learning Approach for Identification of Abusive Content Table 3. Results of classification with different Embeddings Test Dataset Sub-task Embedding Macro-F1 Weighted-F1 One-hot 0.4803 0.6477 Sub-task1 GloVe 0.5308 0.6485 fastText 0.5921 0.6854 One-hot 0.2425 0.5475 English Sub-task2 GloVe 0.2049 0.4636 fastText 0.3405 0.5548 One-hot 0.3335 0.5983 Sub-task3 GloVe 0.3585 0.5917 fastText 0.3607 0.5979 One-hot 0.7827 0.7834 Sub-task1 fastText 0.5907 0.5898 One-hot 0.3486 0.5208 Hindi Sub-task2 fastText 0.1103 0.0656 One-hot 0.4878 0.6588 Sub-task3 fastText 0.4464 0.6444 Hate Speech detection is an open challenge for the research community. The social media post contains not only text but also image followed by text and even in the case of text, code-mixed languages are used. Therefore, future works on Hate Speech detection might address multi-lingual cases of several languages and consideration of multi-modal forms of social media posts to make the system more robust. Acknowledgements The first author would want to acknowledge the Ministry of Electronics and Information Technology (MeitY), Government of India for the financial support during the research work through the Visvesvaraya Ph.D Scheme for Electronics and IT. References 1. Badjatiya, P., Gupta, S., Gupta, M., Varma, V.: Deep learning for Hate Speech detection in tweets. In: Proceedings of the 26th International Conference on World Wide Web Companion. pp. 759–760. International World Wide Web Conferences Steering Committee (2017) 2. Bohra, A., Vijay, D., Singh, V., Akhtar, S.S., Shrivastava, M.: A dataset of Hindi- English code-mixed social media text for Hate Speech detection. In: Proceedings of the Second Workshop on Computational Modeling of People’s Opinions, Per- sonality, and Emotions in Social Media. pp. 36–41 (2018) 3. Burnap, P., Williams, M.L.: Cyber Hate Speech on twitter: An application of machine classification and statistical modeling for policy and decision making. Policy & Internet 7(2), 223–242 (2015) K. Kumari et al. 4. Chen, J., Yan, S., Wong, K.C.: Verbal aggression detection on Twitter comments: convolutional neural network for short-text sentiment analysis. Neural Computing and Applications pp. 1–10 5. Davidson, T., Warmsley, D., Macy, M., Weber, I.: Automated Hate Speech de- tection and the problem of Offensive language. arXiv preprint arXiv:1703.04009 (2017) 6. Del Vigna12, F., Cimino23, A., Dell’Orletta, F., Petrocchi, M., Tesconi, M.: Hate me, hate me not: Hate Speech detection on facebook. In: Proceedings of the First Italian Conference on Cybersecurity (ITASEC17). pp. 86–95 (2017) 7. Joulin, A., Grave, E., Bojanowski, P., Douze, M., Jégou, H., Mikolov, T.: Fast- text.zip: Compressing text classification models. arXiv preprint arXiv:1612.03651 (2016) 8. Kamble, S., Joshi, A.: Hate Speech detection from code-mixed Hindi-English tweets using deep learning models. arXiv preprint arXiv:1811.05145 (2018) 9. Kumari, K., Singh, J.P., Dwivedi, Y.K., Rana, N.P.: Aggressive social media post detection system containing symbolic images. In: Conference on e-Business, e- Services and e-Society. pp. 415–424. Springer (2019) 10. Modha, S., Mandl, T., Majumder, P., Patel, D.: Overview of the HASOC track at FIRE 2019: Hate Speech and Offensive Content Identification in Indo-European Languages. In: Proceedings of the 11th annual meeting of the Forum for Informa- tion Retrieval Evaluation (2019) 11. Mehdad, Y., Tetreault, J.: Do characters abuse more than words? In: Proceedings of the 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue. pp. 299–303 (2016) 12. Nobata, C., Tetreault, J., Thomas, A., Mehdad, Y., Chang, Y.: Abusive language detection in online user content. In: Proceedings of the 25th international confer- ence on world wide web. pp. 145–153. International World Wide Web Conferences Steering Committee (2016) 13. Pennington, J., Socher, R., Manning, C.: Glove: Global vectors for word repre- sentation. In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP). pp. 1532–1543 (2014) 14. Schmidt, A., Wiegand, M.: A survey on Hate Speech detection using natural lan- guage processing. In: Proceedings of the Fifth International Workshop on Natural Language Processing for Social Media. pp. 1–10 (2017) 15. Waseem, Z., Hovy, D.: Hateful symbols or hateful people? predictive features for Hate Speech detection on Twitter. In: Proceedings of the NAACL student research workshop. pp. 88–93 (2016)