Baseline BERT models for Conversational Hate Speech Detection in Code-mixed tweets utilizing Data Augmentation and Offensive Language Identification in Marathi

Koyel Ghosh¹, Apurbalal Senapati¹ and Utpal Garain²
¹ Central Institute of Technology, Kokrajhar, Assam, India
² Indian Statistical Institute, Kolkata, India

Abstract
In today’s world, social media plays a major role in spreading hate towards a person or group based on their color, caste, sex, sexual orientation, political differences, etc. Most existing work classifies a single tweet or comment, which ignores the context of the conversation. The tweet, its comments, and their replies often help us understand the context of the entire discussion. This paper describes the system and the performance of team CITK_ISI on the first available code-mixed dataset of Hindi-English and German conversations scraped from Twitter. Data augmentation is used with a baseline transformer-based BERT model, which achieved a macro F1 score of 0.6653 for the ICHCL Hinglish and German code-mixed binary classification. The system also identifies hate speech and offensive language in Marathi, a binary classification task on which it secures a macro F1 score of 0.9019.

Keywords
Hate Speech, Transformers, Binary classification, Multiclass-classification, Code-Mixed Languages, Hindi-English, German, Marathi

1. Introduction
Instead of being friendly or informative, social media platforms like Twitter, Facebook, YouTube, etc. are becoming platforms for cyberbullying and online harassment, leading people to depression or provoking them to engage in violence [1]. There are numerous instances around the globe in which the spread of such hate speech has disturbed social and communal integrity. As a result, many social media platforms monitor user posts. This creates an urgent need for methods that identify suspicious posts automatically. Most research on hate speech detection has been done on English and similar languages.
Low-resource languages suffer from a lack of annotated datasets. Although a few monolingual datasets in low-resource languages are available, code-mixed text such as Hinglish (Hindi words written in the Roman script rather than the Devanagari script) is often used on Twitter, Facebook, etc. This code-mixed language involves different grammatical uses, slang and hateful words, including phonetic variations, misspelled words, and contextual usage in sentences. In addition, the context of the conversation plays a vital role in understanding the hate towards someone or something. Sometimes a parent tweet doesn’t spread hate or fake news, but comments or replies associated with it directly attack the person who posted the tweet. Figure 1 shows an example reply supporting a hate comment towards a source tweet.

Forum for Information Retrieval Evaluation, December 9-13, 2022, India
∗ Corresponding author.
Email: ghosh.koyel8@gmail.com (K. Ghosh); a.senapati@cit.ac.in (A. Senapati); utpal.garain@gmail.com (U. Garain)
URL: https://github.com/BrainLearns (K. Ghosh)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073

Figure 1: The reply has a positive sentiment: “You totally nailed it, can’t stop laughing.” But it is positive in favour of the hate that the comment expresses towards the author of the source tweet. Hence, it supports the hate expressed in the comment and is also hate speech. The source tweet says, “Modi ji (PM of India) was asking for ideas to solve the covid situation of India. My idea to him is to resign.” The comment says, “They have asked Doctors and Scientists. Not fuckers.
Sit down.”

Keeping this scenario in mind, Hate Speech and Offensive Content Identification in English and Indo-Aryan Languages (HASOC) 2022¹ [2] proposes two tasks this year: 1) Task 1 ICHCL Binary Classification - Identification of Conversational Hate-Speech in Code-Mixed Languages like Hinglish and German; 2) Task 2 ICHCL Multiclass Classification - Identification of Conversational Hate-Speech in Code-Mixed Languages only in Hinglish. Along with these, the organizers proposed 3) Task 3A Marathi - Offensive Language Identification in Marathi; 4) Task 3B Marathi - Categorisation of Offensive Language in Marathi; 5) Task 3C Marathi - Offense Target Identification in Marathi [3]. All five tasks extend the previous year’s HASOC 2021 tasks². This paper attempts to identify hate speech content in all five tasks. Pre-trained BERT (Bidirectional Encoder Representations from Transformers) models such as mBERT [4] and MahaBERT [5] are used for this work.

The rest of the paper is structured as follows. Section 2 reviews work related to hate speech detection in Hindi and Marathi. Section 3 describes the experimental setup, including the dataset, preprocessing steps, and baseline pre-trained BERT models. Section 4 presents the results and findings from the experiments. Finally, Section 5 concludes the paper.

¹ https://hasocfire.github.io/hasoc/2022/index.html
² https://hasocfire.github.io/hasoc/2021/index.html

2. Related work
The primary challenge of hate speech detection is the absence of related resources such as language-specific datasets. Creating labeled hate speech datasets in Indian languages is tedious and challenging. It needs a lot of groundwork and preprocessing, such as cleaning and annotator agreement, to create valuable data from social media. This section briefly outlines the existing approaches and available datasets for Hindi, Hindi-English and Marathi.
• Hindi: HASOC (Hate Speech and Offensive Content Identification) is a shared task organized by FIRE (Forum for Information Retrieval Evaluation)³, which has published hate speech datasets in Indian languages such as Hindi, Marathi, etc. HASOC offers four subtracks, one of which is relevant to us: HASOC - English and Indo-Aryan Languages. The datasets are distributed in a tab-separated format. Like other collections, HASOC targets the identification of hate speech in online posts. In 2019, the HASOC-Hindi dataset offered three tasks [6]. Subtask A, the first task, is binary classification. Subtask B, the second task, is a multiclass task that identifies whether the hateful comment is profane or abusive. Subtask C determines whether the hate speech is targeted at a specific person or is more general (untargeted). In Hindi, 93 runs were submitted across the 3 subtasks. For Hindi subtask A, the winning team, QutNocturnal [7], used a CNN-based method with Word2vec embeddings, yielding Macro F1 (0.8149) and Weighted F1 (0.8202) scores. The second team, LGI2P [8], employed BERT for classification after training a fastText model for Hindi. Both the Macro-F1 and Weighted-F1 values for this system were 0.8111. On subtask B of the Hindi dataset, 3Idiots [9] scored 0.5812 in Macro-F1 and 0.7147 in Weighted-F1 using BERT. On subtask C of the Hindi dataset, team A3-108 [10] achieved the highest Macro-F1 score of 0.5754. According to them, Adaboost [11] was the best-performing classifier among the three classifiers, i.e., Adaboost or Adaptive Boosting (AB), Random Forest (RF), and Linear Support Vector Machine (SVM). Adaboost merges multiple weak classifiers to construct a robust prediction model, but an ensemble of SVM, Random Forest, and Adaboost with hard voting performed even better.
This classifier used TF-IDF features of word unigrams and character 2-, 3-, 4-, and 5-grams, with the length of every tweet as an additional feature. In HASOC 2020, two hate speech detection tasks [12], sub-task A (binary) and sub-task B (multiclass), were proposed with another Hindi dataset. NSIT_ML_Geeks [13] outperformed the other teams in the competition, scoring Macro-F1 0.5337 and 0.2667 in sub-task A and sub-task B, respectively, using CNN and BiLSTM. The Nohate [14] team achieved Macro-F1 0.3345 in sub-task B by fine-tuning the BERT model for classification. In 2021, HASOC published a Hindi dataset [15] with sub-tasks A and B again. In total, sixty-five teams submitted six thousand and fifty-two runs. The best submission achieved Macro F1 0.7825 in sub-task A with a fine-tuned Multilingual-BERT (20 epochs) with a classifier layer added at the final phase. The second team also fine-tuned Multilingual-BERT and scored Macro F1 0.7797. NeuralSpace [16] got Macro F1 0.5603 in sub-task B. They used an XLM-R transformer, vector representations for emojis from Emoji2Vec, and sentence embeddings for hashtags. The three resulting representations were then concatenated before classification. The authors of [17] used the pre-trained multilingual BERT (m-BERT) model to compute input embeddings on the Hostility Detection Dataset (Hindi); SVM, Random Forest (RF), Multilayer Perceptron (MLP), and Logistic Regression (LR) models were then used as classifiers. In the coarse-grained evaluation, SVM reported the best weighted-F1 score of 84%, whereas LR, MLP, and RF obtained 84%, 83%, and 80%, respectively. In the fine-grained evaluation, SVM had the best F1 score on three hostile dimensions, namely Hate (47%), Offensive (42%), and Defamation (43%). Logistic Regression beat the others in the Fake dimension with an F1 score of 68%.

³ http://fire.irsi.res.in/fire/2022/home
• Hindi-English: In 2021, HASOC’s main track had another subtrack, Identification of Conversational Hate-Speech in Code-Mixed Languages (ICHCL) [18], offered as subtask 2 of the HASOC-English and Indo-Aryan Languages subtrack. The ICHCL subtask aims to filter posts that are normal on a standalone basis but might be judged as hateful, profane or offensive if we consider the context. This subtask focused on the binary classification of such contextual posts. The dataset was sampled from Twitter. Around 7000 code-mixed posts in English and Hindi were downloaded and annotated with an annotation platform developed for this task. Team MIDAS [19] was the top team of the ICHCL task. The authors proposed a transformer-based approach that relied on a concatenation of the contextual representation. They used hard-voting ensembles of three transformer models: IndicBERT, Multilingual-BERT, and XLM-RoBERTa. The team added a dropout followed by a fully connected layer to the end of each transformer-based model. Finally, the model combines the probabilities of the three models for the two classes, passed through a Softmax layer. The scores were combined with an ensemble of classifiers using a hard voting scheme to obtain the final classification result. The authors of Super Mario [20] fine-tuned the XLM-RoBERTa-Large model with a classifier layer added at the end and trained it on the ICHCL dataset. A binary cross-entropy loss was applied to train the system.
• Marathi: In HASOC-Marathi [15], the best-performing team, WLV-RIT, fine-tuned the XLM-R Large model with a simple softmax layer. They then performed transfer learning from the English data released for OffensEval 2019 [21] and the Hindi data released for HASOC 2019 [6], and showed that transfer learning from Hindi works better than transfer learning from English. They scored an F1 of 0.9144 [22].
The second team applied a fine-tuned LaBSE transformer [23] to the Marathi and Hindi datasets and achieved an F1 score of 0.8808. Their experiments show that the LaBSE transformer [24] outperforms XLM-R in the monolingual setting, but XLM-R performs better when Hindi and Marathi data are merged. L3Cube-MahaHate [25] presents the first major Marathi hate speech dataset, with 25,000 distinct tweets from Twitter that were manually annotated and labeled with four major classes, i.e., hate, offensive, profane, and not. The authors experiment with CNN, LSTM, and Transformer models. They also explore monolingual and multilingual variants of BERT such as MahaBERT, IndicBERT, mBERT, and XLM-RoBERTa, and show that monolingual models perform better than their multilingual counterparts. Their MahaBERT [5] model provides the best results on the L3Cube-MahaHate corpus. The authors of [26] present results from several machine learning experiments on the MOLD⁴ dataset, including zero-shot and other transfer learning experiments with state-of-the-art cross-lingual transformers using Bengali, English, and Hindi data. The authors of [27] release a Marathi dataset and experiment with several machine learning models, including state-of-the-art transformer models, to predict the type and target of offensive tweets in Marathi. They then use cross-lingual embeddings and transfer learning to spot offensive language. Finally, they investigate semi-supervised data augmentation and build a larger semi-supervised dataset for Marathi called SeMOLD, which has about 8000 examples.

3. Experimental setup
3.1. Task description
A brief description of the tasks⁵ is given below.
• Task 1 ICHCL Binary Classification: This is the ICHCL Hinglish and German code-mixed binary classification. The task aims to identify Hinglish and German hate speech and offensive language. It is a coarse-grained binary classification of tweets into two classes: hate and offensive (HOF) and non-hate and offensive (NOT).
  – (NOT) Non-Hate-Offensive - This post does not contain hate speech or profane, offensive content.
  – (HOF) Hate and Offensive - This post contains hate, offensive, and profane content.
• Task 2 ICHCL Multiclass Classification: Identification of Conversational Hate-Speech in Code-Mixed Languages (ICHCL) - Multiclass Classification. This year, a multiclass task has been introduced for Hinglish that further divides the HOF tweets, resulting in 3 classes:
  – (SHOF) Standalone Hate - Offensive, profane content is in tweets, comments, or replies.
  – (CHOF) Contextual Hate - Comment or reply supporting the hate, offence and profanity expressed in its parent. This includes affirming the hate with positive sentiment and having apparent hate.
  – (NONE) Non-Hate - This tweet, comment, or reply does not contain hate, offensive, or profane content.
• Task 3A Marathi: Offensive Language Detection
  – OFF - Posts containing any form of non-acceptable language (profanity) or a targeted offence, which can be veiled or direct.
  – NOT - Posts that do not contain offence or profanity.
• Task 3B Marathi: Categorisation of Offensive Language
  – Targeted Insult (TIN) - Posts containing an insult/threat to an individual, group, or others.
  – Untargeted (UNT) - Posts containing non-targeted profanity and swearing.
• Task 3C Marathi: Offense Target Identification
  – Individual (IND) - Posts targeting an individual.
  – Group (GRP) - The target of these offensive posts is a group of people considered as a unity due to the same ethnicity, gender or sexual orientation, political affiliation, religious belief, or other common characteristics.
  – Other (OTH) - The target of these offensive posts does not belong to any of the previous two categories.

⁴ MOLD is available at: https://github.com/tharindudr/MOLD
⁵ https://hasocfire.github.io/hasoc/2022/call_for_participation.html

3.2.
Dataset
This year, HASOC 2022 provides code-mixed Hinglish-German datasets tagged as “NOT” and “HOF” for binary classification (Task 1), as well as “NONE”, “SHOF” and “CHOF” for multiclass classification (Task 2). Tables 1 and 2 show the dataset statistics for all five tasks. Here, we only include the total count of the test data, not the per-label counts of the test dataset, as they have not been provided yet.

Task    Class label  Training  Test
Task 1  NOT          2,609     -
        HOF          2,612     -
        TOTAL        5,221     1,077
Task 2  NONE         2,390     -
        SHOF         1,636     -
        CHOF         888       -
        TOTAL        4,833     996
Table 1: Class distribution analysis for the Task 1 and Task 2 datasets, which include Hinglish-German data

The Marathi dataset is tagged as “NOT” and “HOF” for binary classification (Task 3A); “NOT”, “TIN” and “UNT” for multiclass classification (Task 3B); and “NOT”, “IND”, “GRP” and “OTH” for another multiclass classification (Task 3C).

Task     Class label  Training  Test
Task 3A  NOT          2,034     -
         HOF          1,069     -
         TOTAL        3,103     508
Task 3B  NOT          2,035     -
         TIN          741       -
         UNT          327       -
         TOTAL        3,103     508
Task 3C  NOT          2,363     -
         IND          503       -
         GRP          157       -
         OTH          80        -
         TOTAL        3,103     508
Table 2: Class distribution analysis for the Task 3A, Task 3B and Task 3C datasets, which include only Marathi data

3.3. Preprocessing
• Data Augmentation: Here, we use the previous year’s HASOC-ICHCL2021 data for the binary classification along with the HASOC-ICHCL2022 dataset. We simply merged the two datasets.
• Data concatenation: In the preprocessing step, we concatenate tweets, comments, and replies using the provided code⁶. This step applies to Task 1 and Task 2.
• Convert all the words to lowercase: We convert all the words to lowercase.
• Converted emojis: We did not remove emojis entirely; rather, we converted emojis and emoticons to English text⁷, as it is a Hinglish code-mixed task.
• Stopwords removal: We remove English and Hindi stopwords from the dataset.
• Stemming: Stemming is used to convert a word to its root form by removing its inflections.
• Removing unnecessary symbols and URLs: We remove @, *, #, https?://, etc. from the dataset to make it noise-free. This step applies to the Marathi data as well.
• Label encoding: We encode each class as a unique number for each task.
  – Task 1 (HASOC-ICHCL-Hinglish-German2022 binary classification) - “HOF” to “0” and “NOT” to “1”.
  – Task 2 (HASOC-ICHCL-Hinglish2022 multiclass classification) - “NONE” to “0”, “SHOF” to “1” and “CHOF” to “2”.
  – Subtask-3A (HASOC-Marathi2022 binary classification) - “NOT” to “0” and “HOF” to “1”.
  – Subtask-3B (HASOC-Marathi2022 ternary classification) - “NOT” to “0”, “TIN” to “1” and “UNT” to “2”.
  – Subtask-3C (HASOC-Marathi2022 four-class classification) - “NOT” to “0”, “IND” to “1”, “GRP” to “2” and “OTH” to “3”.

⁶ https://github.com/hasocfire/ICHCLbaseline/tree/master/ICHCL_baseline2k22
⁷ https://studymachinelearning.com/text-preprocessing-handle-emoji-emoticon/

Table 3 shows which preprocessing steps are applied to each of the five tasks.

Preprocessing step                     Task 1  Task 2  Task 3A  Task 3B  Task 3C
Data Augmentation                      Yes     No      No       No       No
Data concatenation                     Yes     Yes     No       No       No
Convert all the words to lowercase     Yes     Yes     Yes      Yes      Yes
Converted emojis                       Yes     Yes     No       No       No
Stopwords removal                      Yes     Yes     No       No       No
Stemming                               Yes     Yes     No       No       No
Removing unnecessary symbols and URLs  Yes     Yes     Yes      Yes      Yes
Label encoding                         Yes     Yes     Yes      Yes      Yes
Table 3: Preprocessing steps for all five tasks

3.4. Pre-trained BERT models
BERT models are trained on a large raw text corpus (without human labeling) in a self-supervised way. Figure 2 shows the general proposed approach for all five tasks.

Figure 2: (a) General architecture to perform Task 1 and Task 2; (b) architecture to perform Subtask-3A, Subtask-3B and Subtask-3C

• mBERT⁸: It is pre-trained on the 104 languages with the largest Wikipedias, including Hindi, Bengali and Marathi, using a masked language modeling (MLM) objective.
For Task-1 and Task-2, we use the same mBERT architecture with a few changes (different preprocessing steps only).
• MahaBERT⁹: MahaBERT is a multilingual BERT (bert-base-multilingual-cased) model fine-tuned on L3Cube-MahaCorpus and other publicly available Marathi monolingual datasets. For Subtask-3A, Subtask-3B and Subtask-3C, we use the same MahaBERT architecture.

⁸ https://huggingface.co/bert-base-multilingual-uncased
⁹ https://huggingface.co/l3cube-pune/marathi-bert

Due to memory and GPU constraints, we ran several experiments, all with the same hyperparameter combination (Table 4), and we noticed that smaller batch sizes help fine-tuning.

Hyperparameter  Value
Learning rate   1e-5
Epochs          5
Max seq length  512
Batch size      5
Table 4: Combination of hyperparameters for fine-tuning pre-trained BERT variants

4. Result
Table 5 shows the results; performance is measured by macro F1 score, precision and recall. We report all the tasks’ results as shown on the leaderboard. We train on the whole training dataset and predict classes for the given test set. We also tested other pre-trained BERT models but submitted only one run, the one giving the best result (we did not submit the other runs as they did not perform well).

Task                                    F1 score  Precision  Recall
Task 1 ICHCL Binary Classification      0.6621    0.6732     0.6655
Task 2 ICHCL Multiclass Classification  0.3952    0.4699     0.4199
Task 3A Marathi                         0.9019    0.9021     0.9022
Task 3B Marathi                         0.3073    0.3405     0.2868
Task 3C Marathi                         0.2063    0.2322     0.1960
Table 5: Performance of all the tasks

5. Conclusion
In this paper, the performance on five tasks is presented. For Hinglish-German, the task is to classify whether a tweet, comment, and reply pair is HOF or NOT (Task 1), and whether the same pair conveys SHOF, CHOF or NONE (Task 2). In Marathi, texts are HOF or NOT (Subtask-3A). In multiclass classification, a text is NOT, TIN or UNT (Subtask-3B). The last task in Marathi is to classify the text as NOT, IND, GRP or OTH (Subtask-3C).
We tried several variants of pre-trained BERT models but submitted only one run. We notice that a smaller batch size gives a better result than a larger one. Converting emojis and emoticons to text helps to increase performance. More experiments on preprocessing are needed to improve the models’ performance. Here, data augmentation plays a useful role; otherwise, we use a common state-of-the-art baseline transformer-based pre-trained BERT model. We applied the same data augmentation approach to the Marathi dataset, i.e., we merged the previous year’s HASOC-Marathi data, but couldn’t submit that run on time, although it also performed well.

References
[1] M. L. Williams, P. Burnap, A. Javed, H. Liu, S. Ozalp, Hate in the Machine: Anti-Black and Anti-Muslim Social Media Posts as Predictors of Offline Racially and Religiously Aggravated Crime, The British Journal of Criminology 60 (2019) 93–117. URL: https://doi.org/10.1093/bjc/azz049. doi:10.1093/bjc/azz049. eprint: https://academic.oup.com/bjc/article-pdf/60/1/93/31634412/azz049.pdf.
[2] S. Satapara, P. Majumder, T. Mandl, S. Modha, H. Madhu, T. Ranasinghe, M. Zampieri, K. North, D. Premasiri, Overview of the HASOC Subtrack at FIRE 2022: Hate Speech and Offensive Content Identification in English and Indo-Aryan Languages, in: FIRE 2022: Forum for Information Retrieval Evaluation, Virtual Event, 9th-13th December 2022.
[3] T. Ranasinghe, K. North, D. Premasiri, M. Zampieri, Overview of the HASOC subtrack at FIRE 2022: Offensive Language Identification in Marathi, in: Working Notes of FIRE 2022 - Forum for Information Retrieval Evaluation, CEUR, 2022.
[4] J. Devlin, M. Chang, K. Lee, K. Toutanova, BERT: pre-training of deep bidirectional transformers for language understanding, CoRR abs/1810.04805 (2018). URL: http://arxiv.org/abs/1810.04805. arXiv:1810.04805.
[5] R.
Joshi, L3cube-mahacorpus and mahabert: Marathi monolingual corpus, marathi BERT language models, and resources, CoRR abs/2202.01159 (2022). URL: https://arxiv.org/abs/2202.01159. arXiv:2202.01159.
[6] T. Mandl, S. Modha, P. Majumder, D. Patel, M. Dave, C. Mandlia, A. Patel, Overview of the hasoc track at fire 2019: Hate speech and offensive content identification in indo-european languages, in: Proceedings of the 11th Forum for Information Retrieval Evaluation, FIRE ’19, Association for Computing Machinery, New York, NY, USA, 2019, p. 14–17. URL: https://doi.org/10.1145/3368567.3368584. doi:10.1145/3368567.3368584.
[7] M. A. Bashar, R. Nayak, Qutnocturnal@hasoc’19: CNN for hate speech and offensive content identification in hindi language, CoRR abs/2008.12448 (2020). URL: https://arxiv.org/abs/2008.12448. arXiv:2008.12448.
[8] J.-C. Mensonides, P.-A. Jean, A. Tchechmedjiev, S. Harispe, Imt mines ales at hasoc 2019: automatic hate speech detection, in: FIRE 2019-11th Forum for Information Retrieval Evaluation, volume 2517, 2019, pp. p–279.
[9] S. Mishra, S. Mishra, 3idiots at hasoc 2019: Fine-tuning transformer neural networks for hate speech identification in indo-european languages., in: FIRE (Working Notes), 2019, pp. 208–213.
[10] V. Mujadia, P. Mishra, D. M. Sharma, Iiit-hyderabad at hasoc 2019: Hate speech detection., in: FIRE (Working Notes), 2019, pp. 271–278.
[11] Y. Freund, R. E. Schapire, A decision-theoretic generalization of on-line learning and an application to boosting, Journal of computer and system sciences 55 (1997) 119–139.
[12] T. Mandl, S. Modha, A. Kumar M, B. R. Chakravarthi, Overview of the hasoc track at fire 2020: Hate speech and offensive language identification in tamil, malayalam, hindi, english and german, in: Forum for Information Retrieval Evaluation, FIRE 2020, Association for Computing Machinery, New York, NY, USA, 2020, p. 29–32. URL: https://doi.org/10.1145/3441501.3441517. doi:10.1145/3441501.3441517.
[13] R. Raj, S.
Srivastava, S. Saumya, Nsit & iiitdwd @ hasoc 2020: Deep learning model for hate-speech identification in indo-european languages, in: FIRE, 2020.
[14] S. Kumari, Nohate at hasoc2020: Multilingual hate speech detection, in: Forum for Information Retrieval Evaluation, FIRE, 2020.
[15] S. Modha, T. Mandl, G. K. Shahi, H. Madhu, S. Satapara, T. Ranasinghe, M. Zampieri, Overview of the hasoc subtrack at fire 2021: Hate speech and offensive content identification in english and indo-aryan languages and conversational hate speech, in: Forum for Information Retrieval Evaluation, FIRE 2021, Association for Computing Machinery, New York, NY, USA, 2021, p. 1–3. URL: https://doi.org/10.1145/3503162.3503176. doi:10.1145/3503162.3503176.
[16] M. Bhatia, T. S. Bhotia, A. Agarwal, P. Ramesh, S. Gupta, K. Shridhar, F. Laumann, A. Dash, One to rule them all: Towards joint indic language hate speech detection, CoRR abs/2109.13711 (2021). URL: https://arxiv.org/abs/2109.13711. arXiv:2109.13711.
[17] M. Bhardwaj, M. S. Akhtar, A. Ekbal, A. Das, T. Chakraborty, Hostility detection dataset in hindi, arXiv preprint arXiv:2011.03588 (2020).
[18] S. Satapara, S. Modha, T. Mandl, H. Madhu, P. Majumder, Overview of the hasoc subtrack at fire 2021: Conversational hate speech detection in code-mixed language, Working Notes of FIRE (2021).
[19] Z. M. Farooqi, S. Ghosh, R. R. Shah, Leveraging transformers for hate speech detection in conversational code-mixed tweets, arXiv preprint arXiv:2112.09986 (2021).
[20] S. Banerjee, M. Sarkar, N. Agrawal, P. Saha, M. Das, Exploring transformer based models to identify hate speech and offensive content in english and indo-aryan languages, arXiv preprint arXiv:2111.13974 (2021).
[21] M. Zampieri, S. Malmasi, P. Nakov, S. Rosenthal, N. Farra, R. Kumar, Semeval-2019 task 6: Identifying and categorizing offensive language in social media (offenseval), arXiv preprint arXiv:1903.08983 (2019).
[22] M. Nene, K. North, T. Ranasinghe, M.
Zampieri, Transformer models for offensive language identification in marathi, in: FIRE, 2021.
[23] F. Feng, Y. Yang, D. Cer, N. Arivazhagan, W. Wang, Language-agnostic bert sentence embedding, arXiv preprint arXiv:2007.01852 (2020).
[24] A. Glazkova, M. Kadantsev, M. Glazkov, Fine-tuning of pre-trained transformers for hate, offensive, and profane content detection in english and marathi, arXiv preprint arXiv:2110.12687 (2021).
[25] A. Velankar, H. Patil, A. Gore, S. Salunke, R. Joshi, L3cube-mahahate: A tweet-based marathi hate speech detection dataset and BERT models, CoRR abs/2203.13778 (2022). URL: https://doi.org/10.48550/arXiv.2203.13778. doi:10.48550/arXiv.2203.13778. arXiv:2203.13778.
[26] S. S. Gaikwad, T. Ranasinghe, M. Zampieri, C. Homan, Cross-lingual offensive language identification for low resource languages: The case of Marathi, in: Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021), INCOMA Ltd., Held Online, 2021, pp. 437–443. URL: https://aclanthology.org/2021.ranlp-1.50.
[27] M. Zampieri, T. Ranasinghe, M. Chaudhari, S. Gaikwad, P. Krishna, M. Nene, S. Paygude, Predicting the type and target of offensive social media posts in marathi, Social Network Analysis and Mining 12 (2022) 77. URL: https://doi.org/10.1007/s13278-022-00906-8. doi:10.1007/s13278-022-00906-8.
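
As a supplementary illustration of the preprocessing in Section 3.3 and the scores in Section 4, the following minimal Python sketch shows how the lowercasing, symbol and URL removal, and label encoding steps could look, together with a macro-averaged precision, recall and F1 computation. It is not the authors' code. The names `clean_text`, `encode_labels`, `LABEL_MAPS` and `macro_scores` are hypothetical, the regular expressions are an assumption about what the removed symbols cover, and macro F1 is assumed to be the unweighted mean of per-class F1 scores, a common convention that the paper does not spell out.

```python
import re

# Hypothetical label maps mirroring the encodings listed in Section 3.3.
LABEL_MAPS = {
    "task1": {"HOF": 0, "NOT": 1},
    "task2": {"NONE": 0, "SHOF": 1, "CHOF": 2},
    "task3a": {"NOT": 0, "HOF": 1},
    "task3b": {"NOT": 0, "TIN": 1, "UNT": 2},
    "task3c": {"NOT": 0, "IND": 1, "GRP": 2, "OTH": 3},
}

# Assumed patterns: URLs are dropped first, then the symbols @, * and #.
_URL = re.compile(r"https?://\S+")
_SYMBOLS = re.compile(r"[@*#]")
_SPACES = re.compile(r"\s+")


def clean_text(text):
    """Lowercase the text, strip URLs and symbols, collapse whitespace."""
    text = text.lower()
    text = _URL.sub(" ", text)
    text = _SYMBOLS.sub(" ", text)
    return _SPACES.sub(" ", text).strip()


def encode_labels(labels, task):
    """Map string class labels to integers for the given task key."""
    mapping = LABEL_MAPS[task]
    return [mapping[label] for label in labels]


def macro_scores(y_true, y_pred):
    """Return (precision, recall, F1), each averaged over classes."""
    classes = sorted(set(y_true) | set(y_pred))
    if not classes:
        raise ValueError("y_true and y_pred must not be empty")
    precisions, recalls, f1s = [], [], []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(classes)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n
```

Under this convention the macro F1 is not the harmonic mean of the macro precision and macro recall, so it can fall below both of them. Stopword removal, stemming, emoji conversion and the model itself are deliberately left out of the sketch.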