An Ensemble Approach for Hate and Offensive Language Identification in English and Indo-Aryan Languages

Abhinav Kumar¹, Pradeep Kumar Roy² and Sunil Saumya³

¹ Department of Computer Science & Engineering, Siksha ’O’ Anusandhan Deemed to be University, Bhubaneswar, India
² Department of Computer Science & Engineering, Indian Institute of Information Technology Surat, Gujarat, India
³ Department of Computer Science & Engineering, Indian Institute of Information Technology Dharwad, India

Abstract
The freedom to upload content and the lack of effective social media monitoring have resulted in a range of societal issues such as cyberbullying, offensive content, and hate speech. As a result, identifying hate and abusive language on social media has become an active research topic. This work proposes an ensemble-based model for detecting hate and offensive language in English and Hindi social media posts, which combines support vector machine, logistic regression, random forest, gradient boosting, and AdaBoost classifiers. Word-level n-gram features performed best on the English dataset, with macro F1-scores of 0.79 and 0.59 for the two tasks, while character-level n-gram features performed best on the Hindi dataset, with macro F1-scores of 0.75 and 0.47 for the two tasks.

Keywords
Hate speech, Offensive content, Deep learning, Machine learning, Ensemble learning

1. Introduction
The rise of mobility and the accessibility of the Internet have enticed people all over the world to use social media platforms for communication [1, 2]. Today, the majority of Internet users use at least one social media network, such as Facebook, Twitter, Instagram, or YouTube. Because communication on these platforms is inexpensive, people publish an enormous amount of content [3, 4].
In recent years, the freedom to upload content and the lack of effective monitoring have led to a range of societal issues, including cyberbullying, offensive content, and hate speech [5, 6, 7, 8, 9]. Because of the anonymity and mobility provided by social platforms, hate speech can be cultivated and spread easily, eventually leading to hate crime, in a virtual landscape beyond the reach of traditional law enforcement. Hate speech may be defined as “any communication that disparages a person or a group on the basis of their gender, sexual orientation, nationality, religion, or other characteristics” [10, 11, 12].

Forum for Information Retrieval Evaluation, December 13-17, 2021, India
abhinavanand05@gmail.com (A. Kumar); pradeep.roy@iiitsurat.ac.in (P. K. Roy); sunil.saumya@iiitdwd.ac.in (S. Saumya)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073

Hate speech is considered harmful by several online platforms, including Facebook¹, YouTube², and Twitter³, which have policies in place to delete hate speech content. Societal concern and the growing ubiquity of hate speech on the Internet provide significant motivation to explore automatic hate speech detection [5, 13, 14]. Automating its identification can help curb the spread of such content. Automatic detection of hate speech on social platforms is therefore needed; however, it faces several challenges, starting with the definition of hate speech itself. In addition, recent social media posts often contain code-mixed languages, such as English-Hindi or English-Malayalam. A model developed on a monolingual dataset, such as English, might not effectively detect hate speech in code-mixed posts.
Researchers have proposed several approaches [8, 10, 13, 14, 15, 16, 17] to identify hate speech on social media. Kumari and Singh [15] presented a model based on convolutional neural networks for detecting hate, obscenity, and abusive language in English and Hindi tweets. To recognize hate, offensive language, and profanity in English, Hindi, and German tweets, Mishra and Pal [16] developed an attention-based bidirectional long short-term memory network. Mujadia et al. [17] developed an ensemble-based model comprising support vector machine, random forest, and AdaBoost classifiers to identify hate content in tweets written in English, Hindi, and German. Roy et al. [10] proposed a convolutional neural network-based model for identifying hate content on social media. Kumar et al. [13] proposed a fine-tuned BERT model, whereas Kumar et al. [14] used conventional machine learning models for hate speech identification. Saumya et al. [8] experimented with several conventional machine learning and deep learning models for hate speech identification in Dravidian social media posts. They found that character n-gram features with conventional machine learning classifiers performed better than the more complex deep learning models.

In line with these works, the current paper proposes an ensemble-based machine learning model for identifying hate and offensive content in English and Hindi social media posts. The dataset published for the FIRE-2021 workshop [18, 19] is used to validate the proposed ensemble-based model.

The rest of the paper is organized as follows: Section 2 discusses the proposed methodology in detail, Section 3 presents the results, and Section 4 concludes the paper.

2. Methodology
The detailed diagram of the proposed ensemble-based model is shown in Figure 1.
The proposed ensemble-based model consists of five different classifiers: (i) Support Vector Machine (SVM), (ii) Logistic Regression (LR), (iii) Random Forest (RF), (iv) Gradient Boosting (GB), and (v) AdaBoost. The proposed model is validated with the dataset published in FIRE-2021 [18]. Two sub-tasks were given: (i) a coarse-grained binary classification of tweets into Hate and Offensive (HOF) and Non-Hate and Offensive (NOT) classes, and (ii) the further classification of Hate and Offensive (HOF) tweets into Hate (HATE), Profane (PRFN), and Offensive (OFFN) posts.

¹ https://www.facebook.com/communitystandards/objectionable_content
² https://support.google.com/youtube/answer/2801939
³ https://help.twitter.com/en/rules-and-policies/hateful-conduct-policy

[Figure 1: Proposed model for the hate and offensive language identification from social media. Pipeline shown: social media posts; TF-IDF (Term Frequency-Inverse Document Frequency) features; Support Vector Machine, Logistic Regression, Random Forest, Gradient Boosting, and AdaBoost classifiers; output classes HOF (HATE, OFFN, PRFN) and NOT.]

Table 1: Data statistics used to validate the proposed system

Task     Class    English            Hindi
                  Train    Test      Train    Test
Task-A   HOF      2,501    798       1,433    505
         NOT      1,342    483       3,161    1,027
         Total    3,843    1,281     4,594    1,532
Task-B   NONE     1,342    483       3,161    1,027
         PRFN     1,196    379       213      74
         HATE     683      224       566      215
         OFFN     622      195       654      216
         Total    3,843    1,281     4,594    1,532

The overall data statistics for both tasks are given in Table 1. In our experiments, the aforementioned classifiers performed well individually in identifying hate and offensive content; we therefore combined them into an ensemble-based model for identifying hate and offensive content on social media. To provide input to the proposed model, we experimented with different combinations of word and character n-gram TF-IDF (Term Frequency-Inverse Document Frequency) features for both the English and Hindi datasets.
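The word- and character-level n-gram TF-IDF features, and the way five classifier outputs might be merged, can be illustrated with a minimal standard-library sketch. It is not the authors' implementation. The paper does not state the TF-IDF variant, how the "first" N features are chosen, or how the classifiers are combined, so the smoothed idf formula, the selection of the N most frequent terms, and hard majority voting are all assumptions here. The function names and toy posts are hypothetical.

```python
import math
from collections import Counter


def word_ngrams(text, n_min=1, n_max=3):
    """Word-level n-grams; the 1-3 default mirrors the English setting."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n])
            for n in range(n_min, n_max + 1)
            for i in range(len(tokens) - n + 1)]


def char_ngrams(text, n_min=1, n_max=6):
    """Character-level n-grams; the 1-6 default mirrors the Hindi setting."""
    return [text[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(text) - n + 1)]


def tfidf(docs, analyzer, max_features=None):
    """Return (vocabulary, one {term: weight} dict per document)."""
    counts = [Counter(analyzer(doc)) for doc in docs]
    corpus_freq, doc_freq = Counter(), Counter()
    for c in counts:
        corpus_freq.update(c)        # total occurrences of each term
        doc_freq.update(c.keys())    # number of documents containing it
    # Assumption: "first N features" means the N most frequent terms.
    vocab = [term for term, _ in corpus_freq.most_common(max_features)]
    n_docs = len(docs)
    # Assumption: smoothed idf, ln((1 + N) / (1 + df)) + 1.
    idf = {t: math.log((1 + n_docs) / (1 + doc_freq[t])) + 1 for t in vocab}
    vectors = [{t: tf * idf[t] for t, tf in c.items() if t in idf}
               for c in counts]
    return vocab, vectors


def majority_vote(predictions):
    """Hard voting over one post's labels; ties go to the label seen first."""
    return Counter(predictions).most_common(1)[0][0]


# Toy, made-up posts (not taken from the FIRE-2021 data).
posts = ["you are awful", "have a nice day", "you are nice"]
vocab, vectors = tfidf(posts, word_ngrams, max_features=50000)

# Hypothetical outputs of the five classifiers for a single post.
label = majority_vote(["HOF", "NOT", "HOF", "HOF", "NOT"])
```

Passing `char_ngrams` instead of `word_ngrams` as the analyzer, with `max_features=70000`, gives the character-level variant. In practice a library implementation with sparse matrices would be the more likely choice; the dictionaries here only keep the arithmetic visible.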
• English Task-A and Task-B: TF-IDF features are extracted from the textual content of the social media posts and used as input to the proposed ensemble-based model. For English posts, we found that the first 50,000 uni-gram, bi-gram, and tri-gram word-level TF-IDF features performed better in classifying posts into the various hate classes than other n-gram combinations of word-level and character-level TF-IDF features.
• Hindi Task-A and Task-B: For Hindi social media posts, we found that the first 70,000 one-to-six-gram character-level TF-IDF features performed best compared with other word-level and character-level n-gram features.

Table 2: Results for hate and offensive language identification from English and Hindi social media posts

Task             Class           Precision   Recall   F1-score   Accuracy (%)
English Task-A   HOF             0.79        0.89     0.84       78.22
                 NOT             0.77        0.60     0.67
                 Macro Average   0.78        0.75     0.76
English Task-B   HATE            0.56        0.42     0.48       65.96
                 NONE            0.68        0.72     0.70
                 OFFN            0.59        0.32     0.42
                 PRFN            0.69        0.90     0.78
                 Macro Average   0.63        0.59     0.59
Hindi Task-A     HOF             0.79        0.55     0.65       80.22
                 NOT             0.81        0.93     0.86
                 Macro Average   0.80        0.74     0.75
Hindi Task-B     HATE            0.44        0.08     0.14       73.56
                 NONE            0.77        0.97     0.86
                 OFFN            0.57        0.44     0.49
                 PRFN            0.70        0.28     0.40
                 Macro Average   0.62        0.44     0.47

3. Results
The performance of the proposed model is measured in terms of macro precision, macro recall, macro F1-score, and accuracy. The results for both sub-tasks on the English and Hindi datasets are listed in Table 2. For English Task-A, the proposed ensemble-based model achieved a macro precision of 0.78, a macro recall of 0.75, a macro F1-score of 0.76, and an accuracy of 78.22%. The confusion matrix for English Task-A is shown in Figure 2. For English Task-B, the proposed ensemble-based model achieved a macro precision of 0.63, a macro recall of 0.59, a macro F1-score of 0.59, and an accuracy of 65.96%. The confusion matrix for English Task-B is shown in Figure 3.
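As an illustration of how the macro-averaged metrics relate to the per-class values in Table 2, the following sketch computes per-class precision, recall, and F1 from label lists and then takes their unweighted mean. The function names and the toy labels are hypothetical; only the English Task-A rows are copied from Table 2.

```python
def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label that occurs."""
    pairs = list(zip(y_true, y_pred))
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores[label] = (precision, recall, f1)
    return scores


def macro_average(scores):
    """Unweighted mean of each metric over classes, as (P, R, F1)."""
    n = len(scores)
    return tuple(sum(s[i] for s in scores.values()) / n for i in range(3))


# Toy labels, made up for illustration.
y_true = ["HOF", "HOF", "HOF", "NOT", "NOT"]
y_pred = ["HOF", "HOF", "NOT", "NOT", "HOF"]
toy_macro = macro_average(per_class_scores(y_true, y_pred))

# Per-class rows for English Task-A, copied from Table 2.
table2_english_a = {"HOF": (0.79, 0.89, 0.84), "NOT": (0.77, 0.60, 0.67)}
macro_p, macro_r, macro_f1 = macro_average(table2_english_a)
```

Averaging the rounded per-class values for English Task-A gives roughly 0.78, 0.745, and 0.755, in line with the reported macro averages of 0.78, 0.75, and 0.76. Small differences are expected because the table entries are themselves rounded.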
For Hindi Task-A, the proposed model achieved a macro precision of 0.80, a macro recall of 0.74, a macro F1-score of 0.75, and an accuracy of 80.22%. The confusion matrix for Hindi Task-A is shown in Figure 4. Similarly, for Hindi Task-B, the proposed model achieved a macro precision of 0.62, a macro recall of 0.44, a macro F1-score of 0.47, and an accuracy of 73.56%. The confusion matrix for Hindi Task-B is shown in Figure 5.

[Figure 2: Confusion matrix for English Task-A. Rows: true label; columns: predicted label (HOF, NOT); row-normalized. HOF: 0.89, 0.11. NOT: 0.40, 0.60.]

[Figure 3: Confusion matrix for English Task-B. Rows: true label; columns: predicted label (HATE, NONE, OFFN, PRFN); row-normalized. HATE: 0.42, 0.42, 0.10, 0.06. NONE: 0.08, 0.72, 0.04, 0.16. OFFN: 0.14, 0.20, 0.32, 0.34. PRFN: 0.02, 0.07, 0.01, 0.90.]

4. Conclusion
The detection of hate speech on social media poses significant challenges. This paper investigates the usefulness of TF-IDF features at the word and character levels using an ensemble-based machine learning approach. The proposed ensemble-based model achieved macro F1-scores of 0.79 and 0.59 for English Task-A and Task-B, respectively, and 0.75 and 0.47 for Hindi Task-A and Task-B, respectively. In future work, deep learning-based ensemble models could be explored for identifying hate and offensive content in social media posts.

References
[1] T. Mandl, S. Modha, P. Majumder, D. Patel, M. Dave, C. Mandlia, A. Patel, Overview of the HASOC track at FIRE 2019: Hate speech and offensive content identification in Indo-European languages, in: Proceedings of the 11th Forum for Information Retrieval Evaluation, 2019, pp. 14–17.
[2] A. Kumar, J. P. Singh, Disaster severity prediction from Twitter images, in: Intelligence Enabled Research, Springer, 2021, pp. 65–73.
[Figure 4: Confusion matrix for Hindi Task-A. Rows: true label; columns: predicted label (HOF, NOT); row-normalized. HOF: 0.55, 0.45. NOT: 0.07, 0.93.]

[Figure 5: Confusion matrix for Hindi Task-B. Rows: true label; columns: predicted label (HATE, NONE, OFFN, PRFN); row-normalized. HATE: 0.08, 0.85, 0.06, 0.00. NONE: 0.01, 0.97, 0.02, 0.00. OFFN: 0.04, 0.49, 0.44, 0.04. PRFN: 0.08, 0.19, 0.45, 0.28.]

[3] A. Priya, A. Kumar, Deep ensemble approach for COVID-19 fake news detection from social media, in: 2021 8th International Conference on Signal Processing and Integrated Networks (SPIN), IEEE, 2021, pp. 396–401.
[4] A. Kumar, J. P. Singh, S. Saumya, A comparative analysis of machine learning techniques for disaster-related tweet classification, in: 2019 IEEE R10 Humanitarian Technology Conference (R10-HTC)(47129), IEEE, 2019, pp. 222–227.
[5] M. Mondal, L. A. Silva, F. Benevenuto, A measurement study of hate speech in social media, in: Proceedings of the 28th ACM Conference on Hypertext and Social Media, 2017, pp. 85–94.
[6] G. Kumar, J. P. Singh, A. Kumar, A deep multi-modal neural network for the identification of hate speech from social media, in: Conference on e-Business, e-Services and e-Society, Springer, 2021, pp. 670–680.
[7] A. K. Mishra, S. Saumya, A. Kumar, IIIT_DWD@HASOC 2020: Identifying offensive content in Indo-European languages, in: FIRE (Working Notes), 2020.
[8] S. Saumya, A. Kumar, J. P. Singh, Offensive language identification in Dravidian code mixed social media text, in: Proceedings of the First Workshop on Speech and Language Technologies for Dravidian Languages, 2021, pp. 36–45.
[9] P. Badjatiya, S. Gupta, M. Gupta, V. Varma, Deep learning for hate speech detection in tweets, in: Proceedings of the 26th International Conference on World Wide Web Companion, 2017, pp. 759–760.
[10] P. K. Roy, A. K. Tripathy, T. K. Das, X.-Z. Gao, A framework for hate speech detection using deep convolutional neural network, IEEE Access 8 (2020) 204951–204962.
[11] P.
Burnap, M. L. Williams, Cyber hate speech on Twitter: An application of machine classification and statistical modeling for policy and decision making, Policy & Internet 7 (2015) 223–242.
[12] R. Jain, D. Goel, P. Sahu, A. Kumar, J. Singh, Profiling Hate Speech Spreaders on Twitter - Notebook for PAN at CLEF 2021, in: G. Faggioli, N. Ferro, A. Joly, M. Maistro, F. Piroi (Eds.), CLEF 2021 Labs and Workshops, Notebook Papers, CEUR-WS.org, 2021. URL: http://ceur-ws.org/Vol-2936/paper-175.pdf.
[13] A. Kumar, S. Saumya, J. P. Singh, NITP-AI-NLP@HASOC-FIRE2020: Fine tuned BERT for the hate speech and offensive content identification from social media, in: FIRE (Working Notes), 2020, pp. 266–273.
[14] A. Kumar, S. Saumya, J. P. Singh, NITP-AI-NLP@HASOC-Dravidian-CodeMix-FIRE2020: A machine learning approach to identify offensive languages from Dravidian code-mixed text, in: FIRE (Working Notes), 2020, pp. 384–390.
[15] K. Kumari, J. P. Singh, AI ML NIT Patna at HASOC 2019: Deep learning approach for identification of abusive content, in: FIRE (Working Notes), 2019, pp. 328–335.
[16] A. Mishra, S. Pal, IIT Varanasi at HASOC 2019: Hate speech and offensive content identification in Indo-European languages, in: FIRE (Working Notes), 2019, pp. 344–351.
[17] V. Mujadia, P. Mishra, D. M. Sharma, IIIT-Hyderabad at HASOC 2019: Hate speech detection, in: FIRE (Working Notes), 2019, pp. 271–278.
[18] S. Modha, T. Mandl, G. K. Shahi, H. Madhu, S. Satapara, T. Ranasinghe, M. Zampieri, Overview of the HASOC Subtrack at FIRE 2021: Hate Speech and Offensive Content Identification in English and Indo-Aryan Languages and Conversational Hate Speech, in: FIRE 2021: Forum for Information Retrieval Evaluation, Virtual Event, 13th-17th December 2021, ACM, 2021.
[19] T. Mandl, S. Modha, G. K. Shahi, H. Madhu, S. Satapara, P. Majumder, J. Schäfer, T. Ranasinghe, M. Zampieri, D. Nandini, A. K.
Jaiswal, Overview of the HASOC subtrack at FIRE 2021: Hate Speech and Offensive Content Identification in English and Indo-Aryan Languages, in: Working Notes of FIRE 2021 - Forum for Information Retrieval Evaluation, CEUR, 2021. URL: http://ceur-ws.org/.