Harnessing Pre-Trained Sentence Transformers for Offensive Language Detection in Indian Languages

Ananya Joshi 1,3, Raviraj Joshi 2,3
1 MKSSS Cummins College of Engineering for Women, Pune, Maharashtra, India
2 Indian Institute of Technology Madras, Chennai, Tamil Nadu, India
3 L3Cube Pune, India

Forum for Information Retrieval Evaluation, December 15-18, 2023, Goa, India
joshiananya20@gmail.com (A. Joshi); ravirajoshi@gmail.com (R. Joshi)
© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org).

Abstract
In our increasingly interconnected digital world, social media platforms have emerged as powerful channels for the dissemination of hate speech and offensive content. This work delves into the domain of hate speech detection, placing specific emphasis on three low-resource Indian languages: Bengali, Assamese, and Gujarati. The challenge is framed as a text classification task, aimed at discerning whether a tweet contains offensive or non-offensive content. Leveraging the HASOC 2023 datasets, we fine-tuned pre-trained BERT and SBERT models to evaluate their effectiveness in identifying hate speech. Our findings underscore the superiority of monolingual sentence-BERT models, particularly for Bengali, where we achieved the highest ranking. The performance in Assamese and Gujarati, however, signals ongoing opportunities for improvement. The goal of our team, 'Sanvadita', is to foster inclusive online spaces by countering the proliferation of hate speech.

Keywords
Natural Language Processing, Sentence-BERT, Transformers, Hate-speech detection, Offensive language detection, Indian Regional Languages, Low Resource Languages, Text Classification, IndicNLP, BERT

1. Introduction
In today's interconnected world, social media platforms have gained significant influence and have become powerful means of spreading hate speech, often targeting individuals or groups based on factors like race, caste, gender, sexual orientation, or political beliefs. The negative effects of this trend, including cyberbullying and the presence of offensive content, are well documented and can harm the mental well-being of users. As the number of people using social media continues to grow, it is crucial to develop effective methods to identify and address offensive language to maintain a positive online environment.

Efficient tools for detecting offensive, vulgar, and hateful language on social media platforms are essential because such language can disrupt online discussions and have real-world consequences. This highlights the need for robust Natural Language Processing (NLP) systems capable of effectively recognizing and countering offensive language on these platforms [1].

Our research specifically focuses on the challenge of detecting offensive, profane, and hateful language in low-resource Indian languages, namely Assamese, Bengali, and Gujarati. These languages have received relatively less attention in the field of NLP, and each has unique linguistic characteristics that require specialized solutions for addressing offensive content effectively. Bengali, primarily spoken in West Bengal, India, and Bangladesh, is known for its rich literary tradition and cultural significance.
With over 230 million speakers, it ranks as the second most spoken language in India and the seventh in the world. Gujarati, predominantly spoken in the Indian state of Gujarat, contributes significantly to India's linguistic diversity with approximately 55 million speakers. Assamese, spoken primarily in the northeastern Indian state of Assam, is rooted in Sanskrit and includes various dialects, playing a vital role in the linguistic diversity of India's northeastern region.

The Hate Speech and Offensive Content Identification in English and Indo-Aryan Languages (HASOC) 2023¹ initiative includes four distinct tasks. We specifically concentrate on two of them [2]:

• Task 1B: Identifying Hate, Offensive, and Profane Content in Gujarati
• Task 4: Annihilate Hates - Detecting Hate Speech in Bengali and Assamese

Throughout this paper, we rigorously evaluate the performance of both monolingual and multilingual models on the datasets associated with these tasks. We primarily focus on sentence-BERT models for identifying offensive language in social media contexts, showcasing their superior performance. Notably, we present state-of-the-art results on the HASOC 2023 test set using specialized models such as BengaliSBERT, GujaratiSBERT [3], and assamese-bert [4], which have been developed by L3Cube-Pune².

¹ https://hasocfire.github.io/hasoc/2023/
² https://huggingface.co/l3cube-pune

2. Related Work
The task of hate speech detection in multilingual contexts has garnered significant attention in recent shared tasks and research endeavors. In this section, we provide an overview of related work, with a focus on studies relevant to our investigation of hate speech detection in low-resource Indian languages.

Several shared tasks have aimed to address hate speech detection challenges. For instance, the paper [5] analyses the systems submitted for the HASOC shared tasks and the DravidianLangTech workshop conducted in 2020, focusing on Malayalam, Tamil, and Kannada offensive posts on social media. [6] describes Subtrack 3 of HASOC 2022, focusing on Offensive Language Identification in Marathi. [7] describes the HASOC 2021 subtask of identifying conversational hate speech in code-mixed languages.

Hindi and Marathi, two prominent Indian languages, have received considerable attention in hate speech detection research. Notable studies include [8, 9, 10, 11, 12], which have contributed to the understanding of hate speech dynamics in these languages. [13] presents a comparative study between monolingual and multilingual BERT models for hate speech detection in the Marathi language, while [14] presents a similar comparative analysis with cross-language evaluation for Hindi and Marathi.

While Hindi and Marathi have been extensively studied, research efforts have expanded to include languages such as Bengali and Assamese. [15] offers insights into hate speech detection in Bengali, while [16] presents transformer-based hate speech detection in Assamese. Similar challenges have been explored in South Indian languages, adding to the linguistic diversity of hate speech research. [17] suggests a weighted ensemble framework to capture hate speech and offensive language on social platforms posted in code-mixed languages like Hindi–English, Tamil–English, Malayalam–English, Telugu–English, and others. The paper [18] proposes a novel technique of selective translation and transliteration for code-mixed and romanized offensive speech classification in Dravidian languages.
These prior studies provide valuable foundations for our investigation into hate speech detection in low-resource Indian languages, such as Assamese, Bengali, and Gujarati, underscoring the growing recognition of the need to address hate speech in diverse linguistic contexts.

3. Experimental Setup

3.1. Task description
Below, we provide an overview of the tasks:

• Task 1B: Identifying Hate, offensive and profane content in Gujarati³: This task focuses on hate speech and offensive language identification for Gujarati. It is a coarse-grained binary classification problem in a few-shot setting, in which participating systems are required to classify tweets into two classes: Hate and Offensive (HOF) and Non-Hate and Offensive (NOT).
  > (NOT) Non Hate-Offensive - The post does not contain any hate speech, profane, or offensive content.
  > (HOF) Hate and Offensive - The post contains hate, offensive, or profane content.
• Task 4: Annihilate Hates⁴ [19]: The objective of this task is to detect hate speech in the Bengali, Bodo, and Assamese languages. It is a binary classification task. Each dataset (one per language) consists of a list of sentences with their corresponding class: hate or offensive (HOF) or not hate (NOT). Data is primarily collected from Twitter, Facebook, or YouTube comments. Team rank is determined by the Macro F1 score.

³ https://hasocfire.github.io/hasoc/2023/task1.html
⁴ https://sites.google.com/view/hasoc-2023-annihilate-hates/home

3.2. Datasets
HASOC 2023 provides training datasets tagged as "NOT" and "HOF" for binary classification for both Task 1 and Task 4. The main sources of data are Twitter, Facebook, and YouTube comments. Table 1 shows the dataset statistics. The distribution of offensive and non-offensive tweets in the training dataset of each language is depicted in Figure 1.

Table 1
HASOC 2023 dataset statistics for Task 1 and Task 4

                                Training                   Test
  Task      Language      HOF      NOT      Total      Total
  Task 1    Gujarati      100      100      200        1196
  Task 4    Assamese      2347     1689     4036       1009
  Task 4    Bengali       515      766      1281       320

[Figure 1: Class distribution of tweets (HOF vs. NOT) in the HASOC 2023 training datasets for Task 1 and Task 4.]

3.3. Preprocessing
In order to enhance the accuracy of our classification task, we conducted data preprocessing to improve the data quality. We engaged in cleaning procedures to optimize the data conditions, which included eliminating punctuation marks, URLs, usernames, handles, hashtags, numbers, and Roman characters. Additionally, our preprocessing addressed issues such as newline characters, excessive spaces, and empty parentheses. Notably, we made a deliberate decision to retain emojis, as they contribute significantly to conveying the sentiment of the text and were observed to yield superior results.

Label encoding: we encode the class label as a unique number for each task: "HOF" is mapped to 1 and "NOT" to 0.
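As an illustration, the cleaning and label-encoding steps described above can be expressed with a few regular expressions. The following Python sketch is illustrative only: the function names (clean_text, encode_label) and the exact patterns are our assumptions for exposition, not the precise implementation used in the experiments.

```python
import re
import string

def clean_text(text: str) -> str:
    """Strip URLs, handles, hashtags, Roman characters, digits, punctuation,
    empty parentheses and extra whitespace, keeping emojis and the native
    Indic script intact."""
    text = re.sub(r"http\S+|www\.\S+", " ", text)       # URLs
    text = re.sub(r"[@#]\S+", " ", text)                # usernames, handles, hashtags
    text = re.sub(r"\(\s*\)", " ", text)                # empty parentheses
    text = re.sub(r"[A-Za-z0-9]+", " ", text)           # Roman characters and digits
    text = text.translate(str.maketrans("", "", string.punctuation))  # ASCII punctuation
    return re.sub(r"\s+", " ", text).strip()            # newlines and repeated spaces

def encode_label(label: str) -> int:
    """Label encoding used for both tasks: 'HOF' -> 1, 'NOT' -> 0."""
    return 1 if label == "HOF" else 0
```

Because only ASCII punctuation and Roman characters are targeted, emojis and the native Bengali, Assamese, and Gujarati scripts pass through unchanged.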
3.4. Models and Training Setup
BERT [20] models are pre-trained on a massive corpus of text, where they learn to predict masked words within sentences; they are then fine-tuned on specific downstream tasks using labeled data. Sentence-BERT (SBERT) [21] models learn fixed-size sentence embeddings using siamese or triplet network architectures that maximize the similarity between related sentences, pulling them closer together in the embedding space.

While BERT focuses on word-level representations, SBERT models are designed to capture the semantic meaning of entire sentences, including subtle nuances and context, by producing fixed-size sentence embeddings. Hate speech often depends on the overall context and phrasing of a sentence, making SBERT's sentence-level understanding particularly relevant. SBERT models leverage contextual information by considering the surrounding words in a sentence, making them more adept at recognizing the intended sentiment or tone, whereas traditional BERT models, while powerful, may struggle to capture the nuances of entire sentences and their emotional or hateful intent. The papers [22, 3] show that Sentence-BERT models outperform the corresponding BERT variants in understanding context-specific information. Hence, we primarily utilize monolingual and multilingual SBERT models for the Gujarati and Bengali languages. Assamese, however, lacks quality datasets and models such as Sentence-BERT, so we use the monolingual assamese-bert and the multilingual indic-bert models.

• For Task 1, we use the pre-trained monolingual model GujaratiSBERT⁵ and the multilingual IndicSBERT⁶ model.
• For Task 4, we use the pre-trained monolingual models BengaliSBERT⁷, bengali-bert⁸, and assamese-bert⁹, and the multilingual models IndicSBERT and indic-bert¹⁰.

⁵ https://huggingface.co/l3cube-pune/gujarati-sentence-bert-nli
⁶ https://huggingface.co/l3cube-pune/indic-sentence-bert-nli
⁷ https://huggingface.co/l3cube-pune/bengali-sentence-bert-nli
⁸ https://huggingface.co/l3cube-pune/bengali-bert
⁹ https://huggingface.co/l3cube-pune/assamese-bert
¹⁰ https://huggingface.co/ai4bharat/indic-bert

For both tasks, we initialize a classification model on top of the BERT architecture, freeze the first six layers of the model, and train it on the provided training data for 4 epochs with the default learning rate.
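To make this setup concrete, the sketch below fine-tunes one of the checkpoints listed above with a sequence-classification head, freezing the embeddings and the first six encoder layers and training for 4 epochs with the default learning rate of the Hugging Face Trainer. The data file name, column names, batch size, and maximum sequence length are illustrative assumptions; only the checkpoint names, the number of frozen layers, the epoch count, and the Macro F1 metric come from the setup described above.

```python
import pandas as pd
from datasets import Dataset
from sklearn.metrics import f1_score
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL_NAME = "l3cube-pune/bengali-sentence-bert-nli"  # swap in any checkpoint listed above

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

# Freeze the embeddings and the first six encoder layers (a BERT-style encoder is assumed).
for param in model.base_model.embeddings.parameters():
    param.requires_grad = False
for layer in model.base_model.encoder.layer[:6]:
    for param in layer.parameters():
        param.requires_grad = False

# Hypothetical file and column names; the released HASOC files may be organised differently.
df = pd.read_csv("hasoc2023_bengali_train.csv")
df["label"] = (df["task_1"] == "HOF").astype(int)  # HOF -> 1, NOT -> 0

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

train_ds = Dataset.from_pandas(df[["text", "label"]]).map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="hasoc2023-bengali",
    num_train_epochs=4,              # 4 epochs; learning rate left at the Trainer default
    per_device_train_batch_size=16,  # illustrative batch size
)

trainer = Trainer(model=model, args=args, train_dataset=train_ds)
trainer.train()

# Macro F1 (the leaderboard metric), computed here on a labelled split for illustration.
pred = trainer.predict(train_ds)
print(f1_score(pred.label_ids, pred.predictions.argmax(axis=-1), average="macro"))
```

The same recipe can be repeated for each language by changing MODEL_NAME to the corresponding checkpoint.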
4. Results
We trained a range of models on the complete training dataset and subsequently used these models to predict classes for the provided test dataset. In all tasks, the texts are classified into two categories: HOF, indicating the presence of hateful content, and NOT, indicating no offensive content. The outcomes are presented in Table 2, and the evaluation metric employed for determining the team's leaderboard ranking was the Macro F1 score. We have included all the task results in accordance with the leaderboard presentation. Additionally, we explored the efficacy of multiple pre-trained BERT and SBERT models but submitted only the most successful run for evaluation, omitting the other runs due to their subpar performance.

We achieved the top ranking (rank 1) among 21 participating teams for Task 4 (Bengali) with the highest Macro F1 score, obtained using the BengaliSBERT model. BengaliSBERT outperforms bengali-bert and multilingual models such as MuRIL, indic-bert, and IndicSBERT. For Task 4 (Assamese), we attained rank 6 among 20 teams using the assamese-bert model. For Task 1 (Gujarati), we stand at rank 10 among 17 participating teams; the best score was given by the GujaratiSBERT model, outperforming gujarati-bert and multilingual models such as MuRIL, indic-bert, and IndicSBERT.

Table 2
Macro F1 scores obtained from various models, along with the ranks achieved in Task 1 and Task 4 of HASOC 2023

  Task      Language      Model             Macro F1      Rank
  Task 1    Gujarati      GujaratiSBERT     0.7324        10
                          IndicSBERT        0.7291
  Task 4    Assamese      assamese-bert     0.7065        6
                          indic-bert        0.6788
  Task 4    Bengali       BengaliSBERT      0.7703        1
                          IndicSBERT        0.7409
                          indic-bert        0.7121

5. Conclusion
Through this paper, we describe our approach to hate and offensive speech detection in three Indian languages. We use the HASOC 2023 datasets to fine-tune pre-trained BERT and SBERT models and test their performance. Our findings reveal that monolingual Sentence-BERT models consistently outperform both monolingual BERT models and multilingual counterparts in the realm of hate speech identification. Notably, we secured the highest ranking for the Bengali language, while the lower rankings in Assamese and Gujarati underscore the ongoing need for enhancements in these domains. Looking ahead, we are committed to exploring various strategies to elevate the performance of the Assamese and Gujarati models. Our overarching goal is to contribute to the advancement of more inclusive and comprehensive tools for combating online hate speech, ultimately fostering online spaces characterized by tolerance and respect.

Acknowledgments
This work was done under the L3Cube Pune mentorship program. We would like to express our gratitude towards our mentors at L3Cube for their continuous support and encouragement.

References
[1] A. Velankar, H. Patil, R. Joshi, A review of challenges in machine learning based automated hate speech detection, arXiv preprint arXiv:2209.05294 (2022).
[2] T. Ranasinghe, K. Ghosh, A. S. Pal, A. Senapati, A. E. Dmonte, M. Zampieri, S. Modha, S. Satapara, Overview of the HASOC subtracks at FIRE 2023: Hate speech and offensive content identification in assamese, bengali, bodo, gujarati and sinhala, in: Proceedings of the 15th Annual Meeting of the Forum for Information Retrieval Evaluation, FIRE 2023, Goa, India, December 15-18, 2023, ACM, 2023.
[3] S. Deode, J. Gadre, A. Kajale, A. Joshi, R. Joshi, L3cube-indicsbert: A simple approach for learning cross-lingual sentence representations using multilingual bert, arXiv preprint arXiv:2304.11434 (2023).
[4] R. Joshi, L3cube-hindbert and devbert: Pre-trained bert transformer models for devanagari based hindi and marathi languages, arXiv preprint arXiv:2211.11418 (2022).
[5] B. R. Chakravarthi, D. Chinnappa, R. Priyadharshini, A. K. Madasamy, S. Sivanesan, S. C. Navaneethakrishnan, S. Thavareesan, D. Vadivel, R. Ponnusamy, P. K. Kumaresan, Developing successful shared tasks on offensive language identification for dravidian languages, 2021. arXiv:2111.03375.
[6] T. Ranasinghe, K. North, D. Premasiri, M. Zampieri, Overview of the hasoc subtrack at fire 2022: Offensive language identification in marathi, 2022. arXiv:2211.10163.
[7] S. Satapara, S. Modha, T. Mandl, H. Madhu, P. Majumder, Overview of the hasoc subtrack at fire 2021: Conversational hate speech detection in code-mixed language, Working Notes of FIRE (2021) 13–17.
[8] A. Velankar, H. Patil, A. Gore, S. Salunke, R. Joshi, Hate and offensive speech detection in hindi and marathi, arXiv preprint arXiv:2110.12200 (2021).
[9] T. Chavan, S. Patankar, A. Kane, O. Gokhale, R. Joshi, A twitter bert approach for offensive language detection in marathi, arXiv preprint arXiv:2212.10039 (2022).
[10] H. Patil, A. Velankar, R. Joshi, L3cube-mahahate: A tweet-based marathi hate speech detection dataset and bert models, in: Proceedings of the Third Workshop on Threat, Aggression and Cyberbullying (TRAC 2022), 2022, pp. 1–9.
[11] K. Ghosh, A. Senapati, U. Garain, Baseline bert models for conversational hate speech detection in code-mixed tweets utilizing data augmentation and offensive language identification in marathi, in: FIRE, 2022. URL: https://api.semanticscholar.org/CorpusID:259123570.
[12] S. Ghosal, A. Jain, Hatecircle and unsupervised hate speech detection incorporating emotion and contextual semantics, ACM Trans. Asian Low-Resour. Lang. Inf. Process. 22 (2023). URL: https://doi.org/10.1145/3576913. doi:10.1145/3576913.
[13] A. Velankar, H. Patil, R. Joshi, Mono vs multilingual bert for hate speech detection and text classification: A case study in marathi, in: IAPR Workshop on Artificial Neural Networks in Pattern Recognition, Springer, 2022, pp. 121–128.
[14] K. Ghosh, D. A. Senapati, Hate speech detection: a comparison of mono and multilingual transformer model with cross-language evaluation, in: Proceedings of the 36th Pacific Asia Conference on Language, Information and Computation, De La Salle University, Manila, Philippines, 2022, pp. 853–865. URL: https://aclanthology.org/2022.paclic-1.94.
[15] M. Das, S. Banerjee, P. Saha, A. Mukherjee, Hate speech and offensive language detection in bengali, arXiv preprint arXiv:2210.03479 (2022).
[16] K. Ghosh, D. Sonowal, A. Basumatary, B. Gogoi, A. Senapati, Transformer-based hate speech detection in assamese, in: 2023 IEEE Guwahati Subsection Conference (GCON), 2023, pp. 1–5. doi:10.1109/GCON58516.2023.10183497.
[17] P. K. Roy, S. Bhawal, C. N. Subalalitha, Hate speech and offensive language detection in dravidian languages using deep ensemble framework, Computer Speech & Language 75 (2022) 101386. URL: https://www.sciencedirect.com/science/article/pii/S0885230822000250. doi:10.1016/j.csl.2022.101386.
[18] S. Sai, Y. Sharma, Towards offensive language identification for dravidian languages, in: Proceedings of the First Workshop on Speech and Language Technologies for Dravidian Languages, 2021, pp. 18–27.
[19] K. Ghosh, A. Senapati, A. S. Pal, Annihilate Hates (Task 4, HASOC 2023): Hate speech detection in Assamese, Bengali, and Bodo languages, in: Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation, CEUR, 2023.
[20] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, Bert: Pre-training of deep bidirectional transformers for language understanding, 2019. arXiv:1810.04805.
[21] N. Reimers, I. Gurevych, Sentence-bert: Sentence embeddings using siamese bert-networks, arXiv preprint arXiv:1908.10084 (2019).
[22] A. Joshi, A. Kajale, J. Gadre, S. Deode, R. Joshi, L3cube-mahasbert and hindsbert: Sentence bert models and benchmarking bert sentence representations for hindi and marathi, in: Science and Information Conference, Springer, 2023, pp. 1184–1199.