An Analysis of Transformer-based Models for Code-mixed Conversational Hate-speech Identification

Neeraj Kumar Singh, Utpal Garain
Indian Statistical Institute, ISI, Kolkata, India
neeraj1909@gmail.com (N. K. Singh); utpal@isical.ac.in (U. Garain)

Forum for Information Retrieval Evaluation, December 9-13, 2022, India

Abstract
The current surge in social media usage has resulted in the widespread availability of harmful and hateful content, and identifying such inflammatory content on social media is a crucial NLP problem. Recent research has repeatedly demonstrated that context-level semantics matter more than word-level semantics for assessing the presence of hate content. This paper investigates several state-of-the-art transformer-based models for hate content detection on code-mixed datasets. We emphasize transformer-based models because they capture context-level semantics. In particular, we concentrate on Google-MuRIL, XLM-Roberta-base, and Indic-BERT, and we additionally experiment with an ensemble of the three. Based on substantial empirical evidence, we observe that Google-MuRIL emerges as the top model, with macro F1-scores of 0.708 and 0.445 for HASOC shared Tasks 1 and 2, placing us 1st and 6th on the respective leaderboards.

Keywords
Hinglish, BERT, Code-mixed Language, Hate Speech, Offensive Tweets

1. Introduction

Owing to the accessibility of the internet, many people interact on social media sites such as Facebook, Instagram, Twitter, ShareChat, etc. These platforms are free and very user-friendly. Problems arise when a user or group of users shares content to spread propaganda such as hate speech, fake news, or racial bias against a group of people. The platforms have developed rules: if they are broken, the offending post may be deleted or the user's account temporarily suspended. Manual moderation is not a solution given the volume of content being produced on these sites, so the platforms are looking towards automatic moderation systems. Automated hate-offensive speech identification is therefore a vital task.

The majority of research on hate speech identification is restricted to English. After Mandarin and English, Hindi is the third most widely spoken language in the world; despite this, it is consistently regarded as a low-resource language. There are rarely any efforts to identify hateful or offensive content in other Indian languages such as Marathi, Gujarati, Bangla, or the Dravidian languages.

The difficulty of typing local languages on standard keyboards has also led to a recent surge in code-mixed data. The many alternative interpretations of Hinglish words in different contextual settings make automated categorization challenging for code-mixed Indic languages like Hinglish, where Hindi is expressed in romanized English. Conversational text in Hinglish is even more challenging: when identifying hate speech, the conversation's context is absolutely essential.
Whether one agrees or disagrees with the previous comment or with the prevailing sentiment of the discourse, hatred can develop toward a certain target group, as shown in Figure 1. Although it is not always obvious from a single comment or reply whether a conversational thread contains hate or inflammatory material, it becomes possible to determine this once the context of the parent tweet is taken into consideration.

Figure 1: Regarding the debate occurring in Israel at the time, the parent tweet expresses anger and contempt toward Muslim nations. The two responses use the word "Amine", which in Persian means "honestly" and which, in the context of the parent tweet, encourages the hatred.

Earlier, HASOC 2020 [1] released a dataset for the identification of hate speech in Dravidian code-mixed Tanglish and Manglish (Tamil and Malayalam written using the Roman script), and HASOC 2021 [2] released a new dataset for detecting hate speech in conversational code-mixed Hinglish (Hindi written using the Roman script instead of the Devanagari script). This time, HASOC [3] has added three tasks as a continuation of the prior work. Task 1, offered for Hinglish and German, focuses on identifying hate speech and offensive language: it involves classifying conversational code-mixed tweets into two categories, hateful and offensive (HOF) and non hate-offensive (NOT). Task 2 is to categorise conversational hate speech in the code-mixed language Hinglish into multiple classes. Task 3 focuses on the identification of hate speech and abusive language in Marathi. We participated only in Task 1 and Task 2, Identification of Conversational Hate-Speech in Code-Mixed Languages (ICHCL). In this study, we present our transformer-based BERT system. The code for this shared task is available at https://github.com/neeraj1909/HASOC-2022.

2. Related Work

Conversational hate speech detection on code-mixed data is a challenging task. It was tackled with various transformer-based pre-trained models for the first time at HASOC 2021 [2, 4, 5, 6]. The transformer-based pre-trained BERT architecture [7], more specifically mBERT [7] and XLM-Roberta [9], is used by [8]. An ensemble of three transformer-based architectures, Indic-BERT [10], Multilingual BERT [11], and XLM-Roberta [9], is used by [12]. For feature extraction, [13] used TF-IDF, Word2Vec, Emo2Vec, and hashtag vectors, with an ensemble of Random Forest, Multilayer Perceptron, and Gradient Boosting for the classification task. The transformer-based XLM-Roberta [9] model is used for classification by [14]. Works like [15] used code-mixed data augmentation based on WordNet and tried several models, such as Logistic Regression (LR) with word-level TF-IDF, a Convolutional Neural Network (CNN) with word-level TF-IDF, and a fine-tuned pre-trained BERT model. The FastText library is used by [16] for the identification of hate speech. An ensemble of three transformer-based models, ERNIE 2.0 [17], Twitter-RoBERTa-base-offensive [18], and HateBERT [19], is used by [20].

3. Dataset

The datasets [21] have been sampled from Twitter. To lessen the impact of bias, the organisers selected contentious stories on several subjects that are likely to feature hateful, offensive, and profane posts. The controversial stories are as follows:

• Temple-Mosque Controversy
• Covid Controversy
• Common Civil Code
• Hinduphobia
• Namaz in public places
• Farmer Protest
• Historical Hindu-Muslim conflicts
• Islamophobia
• Russian-Ukrainian conflict, etc.

3.1. Task 1: Finding Hate-Offensive Content in Conversational Hinglish-German Code-Mixed Languages

Task 1's major goal is the coarse-grained binary classification of conversational hate speech and offensive language, primarily for Hinglish and German. HOF denotes a tweet, comment, or reply that promotes or involves hate speech or other rude or obscene language, as opposed to NOT, a tweet, comment, or reply that is free of any harmful, vulgar, or hateful language. The dataset statistics for binary classification are shown in Table 1.

Category                   Dataset Size
Hate and Offensive (HOF)   2612
Non Hate-Offensive (NOT)   2609
Total                      5221

Table 1: Data distribution for Task 1 (ICHCL Binary Classification)

Category                   Dataset Size
Contextual Hate (CHOF)     888
Standalone Hate (SHOF)     1636
Non-Hate (NONE)            2390
Total                      4914

Table 2: Data distribution for Task 2 (ICHCL Multi-Class Classification)

3.2. Task 2: Classifying Conversational Hate Speech in Hinglish Code-Mixed Languages

For Task 2 (multi-class classification), the HOF class is split into Standalone Hate (SHOF) and Contextual Hate (CHOF), alongside Non-Hate (NONE). It is not always possible to tell from a single comment or reply whether a conversational thread contains hate-offensive content, but it becomes easy to identify given the parent tweet's context. As seen in Figure 1, the reply appears positive, but only insofar as it endorses the hatred expressed toward the target of the original tweet; it validates the venom written in the comment and is therefore also hate speech. Table 2 contains the dataset statistics for the conversational code-mixed data for multi-class classification.

3.3. Data Preprocessing

We flatten each interaction into a separate parent-comment-reply chain before feeding the data into our model. We concatenate the tweets, appending a "[SEP]" token to each tweet, comment, and reply to separate one from the next. Each instance receives the label of the last comment in the discussion chain. Except for the comma (,), full stop (.), bar (|), and question mark (?), we removed all punctuation from the tweets. We further clean the data by eliminating URLs, mentions, and new-line characters, and we replace emojis with their CLDR short names. For multi-class classification, the class frequencies are highly imbalanced; to make our model account for this, we assign each class a weight equal to one minus its frequency divided by the size of the dataset.
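A minimal Python sketch of this preprocessing pipeline is given below. The thread field names (parent, comment, reply) are illustrative assumptions, since the exact HASOC export schema is not described here; the emoji package's demojize() is one off-the-shelf way to obtain CLDR short names.

```python
import re
from collections import Counter

import emoji  # pip install emoji; demojize() emits CLDR short names

KEEP = ",.|?"  # punctuation retained; everything else is stripped


def clean(text: str) -> str:
    text = re.sub(r"https?://\S+", " ", text)             # remove URLs
    text = re.sub(r"@\w+", " ", text)                     # remove mentions
    text = text.replace("\n", " ")                        # remove new-line characters
    text = emoji.demojize(text, delimiters=(" ", " "))    # emoji -> CLDR short name
    text = re.sub(rf"[^\w\s{re.escape(KEEP)}]", " ", text)  # drop other punctuation
    return re.sub(r"\s+", " ", text).strip()


def flatten_thread(thread: dict) -> str:
    """Flatten a parent-comment-reply chain, separating turns with [SEP].

    Cleaning happens before joining, so the [SEP] brackets survive.
    """
    parts = [thread.get(k) for k in ("parent", "comment", "reply")]
    return " [SEP] ".join(clean(p) for p in parts if p)


def class_weights(labels: list) -> dict:
    """Weight for class c: 1 - freq(c) / N, as described above."""
    n, counts = len(labels), Counter(labels)
    return {c: 1 - k / n for c, k in counts.items()}
```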
4. Methodology

Three different types of transformer models were used in our studies; they are described below:

• XLM-Roberta: a transformer-based masked language model pre-trained on 100 languages. It produced state-of-the-art performance on numerous cross-lingual benchmarks. Facebook AI published this model in 2019.
• Indic-BERT: a pre-trained ALBERT model trained on a sizable dataset covering 12 languages: Assamese, Bengali, English, Gujarati, Hindi, Kannada, Malayalam, Marathi, Oriya, Punjabi, Tamil, and Telugu. The multilingual corpus contains roughly 9 billion tokens.
• Google MuRIL [22]: released by Google in 2020, a multilingual BERT for Indic languages. MuRIL was pre-trained on 17 languages: English, Assamese, Bengali, Gujarati, Hindi, Kannada, Kashmiri, Malayalam, Marathi, Nepali, Oriya, Punjabi, Sanskrit, Sindhi, Tamil, Telugu, and Urdu.

4.1. Binary Classification

We fine-tuned the BERT transformers using binary cross-entropy loss and trained a linear classifier layer on top of them. Each BERT model was followed by a dropout layer, on top of which we added a fully connected linear layer. The representation of the [CLS] token served as the input to this fully connected layer.
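A minimal PyTorch sketch of this setup follows. The Hugging Face checkpoint id and the dropout rate are illustrative assumptions (the paper does not state the exact values); the other two encoders can be swapped in via their public hub ids.

```python
import torch.nn as nn
from transformers import AutoModel


class BertClassifier(nn.Module):
    """Encoder, dropout, and a linear head over the [CLS] representation."""

    def __init__(self, checkpoint: str = "google/muril-base-cased",
                 num_labels: int = 1, dropout: float = 0.1):
        super().__init__()
        # Alternatives: "xlm-roberta-base", "ai4bharat/indic-bert".
        self.encoder = AutoModel.from_pretrained(checkpoint)
        self.dropout = nn.Dropout(dropout)
        # Fully connected layer fed with the [CLS] token representation.
        self.fc = nn.Linear(self.encoder.config.hidden_size, num_labels)

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]    # [CLS] is the first token
        return self.fc(self.dropout(cls))    # logits


# One logit for the binary task (HOF vs. NOT), trained with
# binary cross-entropy as described above.
model = BertClassifier(num_labels=1)
criterion = nn.BCEWithLogitsLoss()
```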
4.2. Multi-class Classification

In this task the class frequencies are highly imbalanced. To deal with this class-imbalance issue, we weight each class, assigning it a weight of one minus its frequency divided by the size of the dataset. We have also put the ensemble of the three models to the test, combining their outputs by majority voting. There are two ways to perform majority voting:

• Hard Voting Ensemble: the ensemble chooses the prediction class with the most votes across the fine-tuned transformer models.
• Soft Voting Ensemble: the ensemble sums the class probabilities from all fine-tuned transformer models and selects the class with the highest total probability.

4.3. Tuning Parameters

We used a batch size of 64 and a maximum sequence length of 512 for all models. To obtain the optimised learnable parameters, we employed early stopping on the validation loss with a patience of 10 epochs. We used the Adam optimizer with an initial learning rate of 2e-5.
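The sketch below illustrates the class-weighted loss and the two voting schemes of Section 4.2, together with the hyperparameters of Section 4.3. It assumes three fine-tuned classifiers that return class logits (as in Section 4.1) and uses the class counts from Table 2; it is an illustrative sketch, not our exact training code.

```python
import torch
import torch.nn as nn

# Class weights w_c = 1 - freq_c / N, with counts from Table 2
# in the order (CHOF, SHOF, NONE).
counts = torch.tensor([888.0, 1636.0, 2390.0])
weights = 1.0 - counts / counts.sum()
criterion = nn.CrossEntropyLoss(weight=weights)

# Hyperparameters reported in Section 4.3.
BATCH_SIZE, MAX_SEQ_LEN, LR, PATIENCE = 64, 512, 2e-5, 10
# optimizer = torch.optim.Adam(model.parameters(), lr=LR)


@torch.no_grad()
def soft_vote(models, input_ids, attention_mask):
    """Sum class probabilities over all models, pick the largest total."""
    probs = sum(torch.softmax(m(input_ids, attention_mask), dim=-1)
                for m in models)
    return probs.argmax(dim=-1)


@torch.no_grad()
def hard_vote(models, input_ids, attention_mask):
    """Each model votes with its predicted class; the majority wins."""
    votes = torch.stack([m(input_ids, attention_mask).argmax(dim=-1)
                         for m in models])   # shape: (n_models, batch)
    return votes.mode(dim=0).values
```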
5. Results

For evaluation purposes, we divided the training data into a train set, a validation set, and a test set with a ratio of 80:10:10, and we track the best model using the validation loss at each epoch. Using the Google-MuRIL BERT, we achieve the best results on binary code-mixed categorization: in the binary setting, the Google-MuRIL model's macro F1-score is 5-6% higher than that of the XLM-Roberta-base model. We also ran our models with various random seeds and found that their performance was largely the same.

Figure 2: Ensemble Model for Task 1

5.1. Results for Training Data

For the binary classification problem, Table 3 compares the results of Google-MuRIL, XLM-Roberta-base, Indic-BERT, and an ensemble of these three models. Table 4 displays the corresponding outcomes for multi-class classification.

Model                  Accuracy   Precision   Recall   F1 Macro
Google-MuRIL           0.78       0.78        0.78     0.78
XLM-Roberta            0.72       0.72        0.72     0.72
Indic-Bert             0.71       0.71        0.70     0.70
Soft Voting Ensemble   0.67       0.67        0.66     0.66
Hard Voting Ensemble   0.67       0.67        0.66     0.66

Table 3: Performance on ICHCL Binary Classification Task (Task 1)

Model                  Accuracy   Precision   Recall   F1 Macro
Google-MuRIL           0.60       0.56        0.56     0.56
XLM-Roberta            0.54       0.54        0.53     0.53
Indic-Bert             0.55       0.55        0.54     0.54
Soft Voting Ensemble   0.55       0.55        0.55     0.55
Hard Voting Ensemble   0.55       0.55        0.55     0.55

Table 4: Performance on ICHCL Multi-Class Classification Task (Task 2)

5.2. Results for Test Data

In accordance with the findings on the validation data (Section 5.1), we tuned the model's hyperparameters on the entire training set and submitted five runs for Tasks 1 and 2. As the leaderboard results in Table 5 indicate, Google-MuRIL is the best model for both shared tasks.

Task     Submission Name          F1 Macro   Precision   Recall
Task 1   binary_muril_ichcl       0.7083     0.7121      0.7091
Task 2   multiclass_muril_ichcl   0.4448     0.5248      0.4575

Table 5: Final submitted results
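For reference, the macro-averaged scores reported in Tables 3-5 can be computed with scikit-learn as follows; the label lists here are placeholders, not our actual predictions.

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Placeholder labels; in practice these come from the held-out test split.
y_true = ["HOF", "NOT", "HOF", "NOT", "HOF"]
y_pred = ["HOF", "HOF", "HOF", "NOT", "NOT"]

prec, rec, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro", zero_division=0)
print(f"acc={accuracy_score(y_true, y_pred):.4f} "
      f"P={prec:.4f} R={rec:.4f} macro-F1={f1:.4f}")
```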
6. Conclusion

In this work we compared several transformer-based BERT architectures. In both tasks, Google-MuRIL outperforms all the alternative transformer-based systems, and changing the random seed does not change model performance. We also observed that each of the three individual BERT models performs better than the ensemble model; investigating the reasons behind these observations is left for future work.

7. Acknowledgement

This research is funded by the Science and Engineering Research Board (SERB), Dept. of Science and Technology (DST), Govt. of India through Grant File No. SPR/2020/000495.

References

[1] T. Mandl, S. Modha, A. Kumar M, B. R. Chakravarthi, Overview of the HASOC track at FIRE 2020: Hate speech and offensive language identification in Tamil, Malayalam, Hindi, English and German, in: Forum for Information Retrieval Evaluation, 2020, pp. 29-32.
[2] S. Modha, T. Mandl, G. K. Shahi, H. Madhu, S. Satapara, T. Ranasinghe, M. Zampieri, Overview of the HASOC subtrack at FIRE 2021: Hate speech and offensive content identification in English and Indo-Aryan languages and conversational hate speech, 2021, pp. 1-3.
[3] S. Satapara, P. Majumder, T. Mandl, S. Modha, H. Madhu, T. Ranasinghe, M. Zampieri, K. North, D. Premasiri, Overview of the HASOC subtrack at FIRE 2022: Hate speech and offensive content identification in English and Indo-Aryan languages, in: FIRE 2022: Forum for Information Retrieval Evaluation, Virtual Event, 9th-13th December 2022, ACM, 2022.
[4] F. M. P. del Arco, S. Halat, S. Padó, R. Klinger, Multi-task learning with sentiment, emotion, and target detection to recognize hate speech and offensive language, 2021.
[5] S. Chanda, S. Ujjwal, S. Das, S. Pal, Fine-tuning pre-trained transformer based model for hate speech and offensive content identification in English, Indo-Aryan and code-mixed (English-Hindi) languages, in: Forum for Information Retrieval Evaluation (Working Notes) (FIRE), CEUR-WS.org, 2021.
[6] R. Rajalakshmi, S. Srivarshan, F. Mattins, E. Kaarthik, P. Seshadri, Conversational hate-offensive detection in code-mixed Hindi-English tweets, 2021.
[7] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding, in: Proceedings of NAACL-HLT, 2019, pp. 4171-4186.
[8] S. Banerjee, M. Sarkar, N. Agrawal, P. Saha, M. Das, Exploring transformer based models to identify hate speech and offensive content in English and Indo-Aryan languages, arXiv preprint arXiv:2111.13974, 2021.
[9] A. Conneau, K. Khandelwal, N. Goyal, V. Chaudhary, G. Wenzek, F. Guzmán, E. Grave, M. Ott, L. Zettlemoyer, V. Stoyanov, Unsupervised cross-lingual representation learning at scale, arXiv preprint arXiv:1911.02116, 2019.
[10] D. Kakwani, A. Kunchukuttan, S. Golla, N. Gokul, A. Bhattacharyya, M. M. Khapra, P. Kumar, IndicNLPSuite: Monolingual corpora, evaluation benchmarks and pre-trained multilingual language models for Indian languages, in: Findings of the Association for Computational Linguistics: EMNLP 2020, 2020, pp. 4948-4961.
[11] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding, arXiv preprint arXiv:1810.04805, 2018.
[12] Z. M. Farooqi, S. Ghosh, R. R. Shah, Leveraging transformers for hate speech detection in conversational code-mixed tweets, arXiv preprint arXiv:2112.09986, 2021.
[13] A. Hegde, M. D. Anusha, H. L. Shashirekha, An ensemble model for hate speech and offensive content identification in Indo-European languages, in: Forum for Information Retrieval Evaluation (Working Notes) (FIRE), CEUR-WS.org, 2021.
[14] A. Kadam, A. Goel, J. Jain, J. S. Kalra, M. Subramanian, M. Reddy, P. Kodali, T. Arjun, M. Shrivastava, P. Kumaraguru, Battling hateful content in Indic languages HASOC '21, arXiv preprint arXiv:2110.12780, 2021.
[15] M. S. Jahan, M. Oussalah, J. Mim, M. Islam, Offensive language identification using Hindi-English code-mixed tweets, and code-mixed data augmentation, in: Forum for Information Retrieval Evaluation (Working Notes) (FIRE), CEUR-WS.org, 2021.
[16] N. P. Motlogelwa, E. Thuma, M. Mudongo, T. Leburu-Dingalo, G. Mosweunyane, Leveraging text generated from emojis for hate speech and offensive content identification, in: Forum for Information Retrieval Evaluation (Working Notes) (FIRE), CEUR-WS.org, 2021.
[17] Y. Sun, S. Wang, Y. Li, S. Feng, H. Tian, H. Wu, H. Wang, ERNIE 2.0: A continual pre-training framework for language understanding, in: Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, 2020, pp. 8968-8975.
[18] F. Barbieri, J. Camacho-Collados, L. Neves, L. Espinosa-Anke, TweetEval: Unified benchmark and comparative evaluation for tweet classification, arXiv preprint arXiv:2010.12421, 2020.
[19] T. Caselli, V. Basile, J. Mitrović, M. Granitzer, HateBERT: Retraining BERT for abusive language detection in English, arXiv preprint arXiv:2010.12472, 2020.
[20] B. Chinagundi, M. Singh, T. Ghosal, P. S. Rana, G. S. Kohli, Classification of hate, offensive and profane content from tweets using an ensemble of deep contextualized and domain specific representations, 2021.
[21] S. Modha, T. Mandl, P. Majumder, S. Satapara, T. Patel, H. Madhu, Overview of the HASOC subtrack at FIRE 2022: Identification of conversational hate-speech in Hindi-English code-mixed and German language, in: Working Notes of FIRE 2022 - Forum for Information Retrieval Evaluation, CEUR, 2022.
[22] S. Khanuja, D. Bansal, S. Mehtani, S. Khosla, A. Dey, B. Gopalan, D. K. Margam, P. Aggarwal, R. T. Nagipogu, S. Dave, et al., MuRIL: Multilingual representations for Indian languages, arXiv preprint arXiv:2103.10730, 2021.