-

Forum for Information Retrieval Evaluation, December

1613-0073

Detection in Assamese, Bengali, and Bodo languages

Koyel Ghosh

ghosh.koyel8@gmail.com 0 1

Apurbalal Senapati

a.senapati@cit.ac.in 0 1

Aditya Shankar Pal

adityashankarpal_r@isical.ac.in 1 2

Transformers, BERT

0 Central Institute of Technology , Kokrajhar, Assam , India 1 Hate Speech Detection , Binary Classification, Assamese, Bengali, Bodo, Machine Learning, Deep Learning 2 Indian Statistical Institute , Kolkata , India

2023

1 5 18

In today's world, social media can act as a tool for spreading hate towards a person or group based on their color, caste, sex, sexual orientation, political diferences, etc. As social media continues to expand, the proliferation of hate speech is also surging at an alarming rate. Recently, Research on identifying hate speech in social media has gained significant prominence, with a specific need for investigations focused on languages other than English. The HASOC (Hate Speech and Ofensive Content Identification) track intends to provide a platform for Hate Speech Detection since 2019 at FIRE (Forum for Information Retrieval Evaluation). HASOC 2023 is coordinating four tasks, with AH (Annihilate Hates, Task 4) being one of them. The AH task aims to develop and assess supervised machine learning systems on the three datasets. The three datasets presented for hate speech in three Indian languages (Assamese, Bengali, and Bodo) are collected from ™YouTube and ™Facebook comments. Each dataset is tagged with the binary classification (hate or non-hate) labels. In the Assamese language, 20 teams made 180 submissions, while 21 teams submitted 214 entries in the Bengali language, and for the Bodo language, 19 teams submitted a total of 175 submissions. The performance of the best classifiers for Assamese, Bengali, and Bodo are measured with the Macro F1 score of 0.73, 0.77, and 0.85, respectively. This article briefly summarizes the tasks, data development, and results. The variant of BERT architecture achieved the best performance in the task. However, other systems have also been successfully applied to the task.

CEUR ceur-ws.org

1. Introduction

In addition to fostering friendships and facilitating information sharing, popular social media platforms such as ™Twitter, ™Facebook, and ™YouTube have also become platforms for cyberbullying and online harassment. These negative aspects can have severe consequences, including causing depression and inciting individuals to engage in violent actions, as evidenced in studies like [ 1, 2 ]. Instances of hate speech on these platforms have disrupted social and communal harmony on a global scale. Consequently, many countries have introduced increasingly complex regulations to address ofensive online content, as discussed in [ 3 ] and [ 4 ]. This LGOBE (A. S. Pal) CEUR Workshop Proceedings situation has created a crucial need for automated methods to detect suspicious posts. It’s worth noting that most research in this area has primarily focused on English and similar languages. On the other hand, Low-resource languages need more annotated datasets. Linguists have examined and characterized diferent manifestations of hate speech [ 5 ], while political scholars and legal authorities explore methods to govern online platforms and address problematic content while preserving the principles of free expression [ 6 ]. Algorithms are always getting better, and people are making lots of diferent sets of data for lots of other things, and they’re studying them. Recently, Researchers made sets of data for many diferent languages [ 7 ] like English [ 8, 9, 10, 11 ], Greek [ 12 ], Portuguese [ 13 ], Danish [ 14 ], Mexican Spanish [ 15 ], and Turkish [ 16 ]. In Indian languages, hate speech dataset available is Hindi [ 17, 18, 19 ], Marathi [ 19 ], Bengali [ 20 ], Telegu [ 21 ], Tamil [ 21 ], Malayalam [ 21 ], and Kannada [ 21 ]. Having all these diferent data sets helps us understand how similar or diferent they are and how trustworthy they are.

In the HASOC 2023, four tasks (Task 1 - Task 4) in the research area of Hate Speech detection are proposed. Task 1 [ 22 ] focuses on identifying hate speech, ofensive language, and profanity in diferent languages using natural language processing techniques. Task 2 [ 23], known as the Identification of Conversational Hate-Speech in Code-Mixed Languages (ICHCL), addresses the challenge of identifying hate speech and ofensive content in code-mixed conversations on social media. Code-mixed text includes multiple languages within a single conversation. The task is divided into two subtasks. Task 3 [24] aims to detect the various hateful spans within a sentence already considered hateful. A hate span is a set of continuous tokens that, in tandem, communicate the explicit hatefulness in a sentence.

This paper will provide an overview of Task 4, i.e., Annihilate Hates (AH), which contributes task-specific (Hate speech detection) low-resourced datasets on three languages: Assamese, Bengali, and Bodo. This AH dataset is version 3 and well updated of HS (version 2) [25], and NEIHS (version 1) [26, 27] datasets.

2. Related Forum and Dataset

The main obstacle in hate speech detection is the requirement for language-specific datasets. Constructing labeled datasets for hate speech in Indian languages is a laborious and intricate endeavor. It needs extensive groundwork and preprocessing tasks such as data cleaning and ensuring agreement among annotators. This section provides a concise overview of Indian datasets in languages like Hindi, Marathi, Bengali, Telugu, Tamil, Malayalam, and Kannada.

The HASOC challenge, organized by FIRE (Forum for Information Retrieval Evaluation)1, has played a significant role in providing hate speech datasets in Indian languages like Hindi, Marathi, etc. HASOC comprises four subtracks, and the dataset distribution is in a tab-separated format. In 2019, the HASOC-Hindi dataset introduced three tasks as described in [ 17 ]. Subtask A is the initial task involving binary classification. Subtask B focuses on identifying the profanity or abuse within hate comments, a multiclass classification task. Subtask C is centered on determining whether the hate speech is targeted at a specific individual or if it’s more general and untargeted. In the HASOC 2020 edition, two hate speech detection tasks were presented, as mentioned in [ 18 ]. Subtask A involves binary classification, and Subtask B addresses multiclass classification. These tasks are accompanied by another Hindi dataset, expanding the research scope in this area. In 2021, HASOC published a Hindi dataset [ 19 ] with sub-tasks A and B again. Total Sixty-five teams submitted a total of six thousand and fity-two runs. In HASOCMarathi [ 19 ], the Marathi hate speech dataset with binary classification task. Authors [ 28, 29] experimented on HASOC datasets and analyzed the transformer-based model’s performance in detail. BD-SHS [ 20 ] is a Bengali hate speech dataset with three levels: hate speech identification (binary classification, i.e., hate and not hate), identification of the Target of hate speech (multilabel classification, i.e., individual, male, female, and group), and categorization of hate speech types (multi-label classification, i.e., slander, call to violence, gender, religion). The authors [ 21 ] created several Indian datasets, i.e., Hindi, Telegu, Tamil, Malayalam, and Kannada, later performed monolingual, unbalanced splits, zero-shot cross-lingual, Few-shot, Joint training, pretraining and cross-dataset experiments on those datasets.

3. Task Description

In HASOC 20232, Task 4 is Annihilate Hates (AH) with three languages proposed in the research area of hate speech detection. These tasks ofered all three languages: Assamese, Bengali, and Bodo. Figure 1 shows the Screenshot of Annihilate Hates (AH) Website3.

3.1. Sub-task A: Hate Speech Detection in Assamese, Bengali, and Bodo (Binary)

Task 4 aims to detect hate speech in Assamese, Bengali, and Bodo languages. Each dataset (for the three languages) consists of a list of comments with their corresponding class (hate or ofensive ( HOF ) or not hate (NOT )). Data is primarily collected from ™Facebook and ™YouTube comments. It is a binary classification task in which participating systems are required to classify the comments into two classes: HOF and NOT. Figure 2 shows the sample-tagged datasets of the AH-Assamese, AH-Bengali, and AH-Bodo.

4. Dataset Description 4.1. Dataset Collection

In this section, dataset collection, annotation, and analysis have been discussed for Task 4. Our primary aim in constructing this dataset is to ensure its diversity, so we intentionally selected a few political, entertainment, and more ™Facebook pages and ™YouTube channels. We initiated the process by identifying contentious posts, often related to recent events, prominent ifgures such as politicians and actors, which had a higher likelihood of containing hate speech. Subsequently, we scrutinized the comments on these posts, seeking those primarily written in a

2https://hasocfire.github.io/hasoc/2023/index.html (Access on 30.10.2023) 3https://sites.google.com/view/hasoc-2023-annihilate-hates/home (Access on 30.10.2023)

determine whether these comments contained hate speech and categorized them accordingly. All the comments are collected using open source scrapper tools4. Ultimately, native speakers tagged the sentences as either HOF or NOT. Sentences that fell under the HOF category usually contained hate-related words and were considered hate-ofensive statements. In contrast, sentences conveying formal information, suggestions, or questions were categorized as NOT sentences.

4https://github.com/kevinzg/facebook-scraper (Access on 30.10.2023) 4.2. Dataset Annotation

In dataset annotation, we share three separate CSV files with the three annotators. These files contain three columns: S. No. (serial number), text (comments), and task_1 (binary, i.e., HOF or NOT ). The data for each language was tagged manually by three native speakers, young adults between 19 and 24. These annotators are students at the Central Institute of Technology in Kokrajhar, Assam, India. Their task involved manually classifying comments into two categories: those containing hateful (HOF ) content and those that did not (NOT ), using binary labels. The ifnal decision was taken by consulting with a domain expert. Identifying hate speech is a subjective task, and it requires careful consideration. Consequently, we have established specific and rigorous guidelines to help define what qualifies as hate speech. These regulations are based on the community standards of ™Facebook5 and ™YouTube6. The authors [ 21 ] follow the below-mentioned rules for the comments to be marked as hate, and we follow the scheme with updation. (a) Profanity: Comments that include profane language, curses, or vulgar words are categorized as hate speech. (b) Sexual orientation: Sexual attraction can be directed toward individuals of the opposite gender, the same gender, both genders or multiple genders. (c) Personal: Comments regarding one’s fashion sense, choice of content, language selection, and related aspects. (d) Gender chauvinism: People are targeted in the comment because of their gender. (e) Religious: A person is criticized for their choice of religious beliefs and practices. For example, comments challenging the use of a turban or a burkha (the veil), (f) Political: Harassed a person based on political beliefs. For instance, bullying people for supporting a political party. (g) Violent intention: Containing a threat or call to violence in the comments.

Diferent annotators annotate the AH datasets, and the majority vote was considered; the annotation agreement calculated using (Kappa) coeficient is shown in Table 1. The problems and the level of disagreement need to be explored in the future.

Datasets AH-Assamese AH-Bengali

AH-Bodo statistics 0.67 0.54 0.81

4.3. Dataset Analysis

We summarize the key statistics of the AH dataset in Table 2. For the Assamese dataset, 2,955 comments are HOF out of 5,045. 641 comments are HOF out of 1,601 in the Bengali dataset which leads NOT is high. Out of 2,099, a total of 1,225 are HOF in the Bodo dataset. As a result, our Assamese and Bodo dataset is slightly skewed in favour of containing hate speech. Figure 3 shows the details of class distribution. In the training dataset, 4,036, 1,281, and 1,679 comments are present in the Assamese, Bengali, and Bodo datasets, respectively.

5https://web.facebook.com/communitystandards/ (Access on 30.10.2023) 6https://www.youtube.com/howyoutubeworks/policies/community-guidelines/ (Access on 30.10.2023)

Dataset AH-Assamese AH-Bengali AH-Bodo

Total

5. Result

The macro approach computes the F1 score individually for each class without considering the use of weights for aggregation. As a result, it imposes a more significant penalty when a system’s performance is poor for minority classes. The selection of a specific F1 variant depends on the task’s objectives and the label distribution in the dataset. Hate speech-related classification tasks often face class imbalance, making the macro F1 measure the suitable choice for evaluation.

For the system rum submission and evaluation of participants’ experiments, we depend on the Kaggle platform. Figure 4 shows the Screenshot of the Annihilate Hates (AH) Kaggle Website for run submission. We provide separate Kaggle platforms Like Assamese7, Bengali8, Bodo9 for participants to submit experimental runs.

Overall, 69 participants register for the task 4. In the Assamese task, 20 teams made 180 submissions, while 21 teams submitted 214 runs in the Bengali task, and for the Bodo task, 19 teams submitted a total of 175 runs.

The performance of the best classification algorithms for Assamese, Bengali, and Bodo are Macro F1 measures of 0.73, 0.77, and 0.85, respectively. The results for AH-Assamese, AHBengali, and AH-Bodo datasets are shown in Table 3, Table 4, and Table 5, respectively.

6. Methodology

This section discusses the systems utilized by the participants. 6.1. AH-Assamese • Chetona [30] propose ensembling IndicBERT and Naive Bayes, along with synthetic data upsampling techniques (up-sample the training examples of each language by translating the examples from the other two languages to the given language.). • FiRC-NLP [31] fine-tune the pre-trained XLM-RoBERTa-large model to get second position in the leaderboard. • TeamBD [32] experimented with xlm-roberta-large (multilingual) along with ChatGPT3 augmentation. 7https://www.kaggle.com/competitions/annihilate-hates-assamese (Access on 30.10.2023) 8https://www.kaggle.com/competitions/annihilate-hates-bengali (Access on 30.10.2023) 9https://www.kaggle.com/competitions/annihilate-hates-bodo (Access on 30.10.2023) (a) • SATLab [33] uses the LIBLinear L2-regularized logistic regression model (dual, -s 7) [42]. • AI Alchemists [34] fune-tuned XLM-RoBERTa model. • Sanvadita [35] uses monolingual assamese-bert10 and the multilingual indic-bert model11. • Z-AGI Labs [36] experiments with various multi-lingual transformer-based models for fine-tuning such as Bert-Base-Multilingual (Cased and Uncased), DistilBert-Base10https://huggingface.co/l3cube-pune/assamese-bert (Access on 30.10.2023) 11https://huggingface.co/ai4bharat/indic-bert (Access on 30.10.2023) 6.2. AH-Bengali • Sanvadita [35] uses pre-trained monolingual Bengali Sentence-BERT14, Bengali-BERT models15 and multilingual Indic Sentence-BERT 16. • FiRC-NLP [31] utilizes XLM-RoBERTa-large model. • Z-AGI Labs [36] utilizes pre-trained models for the experiments but achieves the highest score fine-tuning the csebuetnlp/banglabert pre-trained model. • TeamBD [32] experiments with xlm-roberta-large model (multiingual). • AI Alchemists [34] fune-tuned XLM-RoBERTa model. • Code Fellas [38] fine-tuned MuRIL for Bengali to get the best experiment results out of all experiments done by the team. • Chetona [30] applies the same for Bengali language as Assamese languages. • SATLab [33] utilizes the same system as Assamese. • MUCS [39], SVM trained with TF-IDF of syllable n-grams and TF-IDF of char n-grams both in the range (1, 3). • JCT/ Avigail Stekel [37], their best model for Bengali is an MNB model with 6-gram features out of all experiments they performed, mentioned in the Assamese section. 12https://huggingface.co/ibraheemmoosa/xlmindic-base-uniscript (Access on 30.10.2023) 13https://huggingface.co/ibraheemmoosa/xlmindic-base-multiscript (Access on 30.10.2023) 14https://huggingface.co/l3cube-pune/bengali-sentence-bert-nli (Access on 30.10.2023) 15https://huggingface.co/l3cube-pune/bengali-bert (Access on 30.10.2023) 16https://huggingface.co/l3cube-pune/indic-sentence-bert-nli (Access on 30.10.2023) • CNLP-NITS-PP [40], a CNN-based Binary Classification Model with FastText Embeddings outperforms the other systems.

• CHANDAN SENAPATI [41] implements the deep learning model LSTM. 6.3. AH-Bodo • SATLab [33] utilizes the same system applied to the Assamese dataset. • JCT/ Avigail Stekel [37], their best submission for Bodo is a LR with all word unigrams in the training set. • FiRC-NLP [31] utilizes the XLM-RoBERTa-large model, the best submission among all their experiments. • Chetona [30] applies the same system for the Bodo language as mentioned in the Assamese section. • AI Alchemists [34] fune-tuned XLM-RoBERTa model. • MUCS [39] trains SVM with TF-IDF of syllable n-grams and TF-IDF of char n-grams both in the range (1, 3) obtained the best macro F1 scores. • Code Fellas [38] uses a BiLSTM model enhanced with an additional Dense Layer attaining an impressive F1 score for Bodo. • Z-AGI Labs/ Nikhil Narayan [36] fine-tuned a pre-trained Bert Base Multilingual Cased model for Bodo. • TeamBD [32] applies xlm-roberta-large model (multiingual).

• CNLP-NITS-PP [40] gets best result for Bodo dataset with Logistic Regression.

7. Conclusion

The submissions in the AH task (Task 4, HASOC 2023) have shown transformer-based pretrained models to be the state-of-the-art approach for Hate Speech detection in the Assamese and Bengali datasets. However, the L2-regularized logistic regression model gives the best result for the Bodo dataset. Other deep learning models, like LSTM, CNNs, etc., also perform well on the given datasets. Upon reviewing the outcomes, the most suitable approach for hate speech detection depends on factors such as the language of the dataset, the level of classification detail, and the distribution of class labels. Balancing an imbalanced training dataset could impact the classification system’s efectiveness. In the long run, the AH task aims to provide more low-resourced data with binary and multi-label classification tasks.

8. Acknowledgement

We thank Mr. Debarshi Sonowal, Mr. Abhilash Basumatary, and Ms. Bidisha Gogoi for their help in collecting and tagging the Assamese hate dataset. Additionally, we extend our thanks to Mr. Maharaj Brahma and Mr. Mwnthai Narzary for their valuable contributions in collecting and labelling the Bodo hate dataset. We also thank the FIRE and HASOC organizers for their support in organizing the track. We thank all participants for their submissions and their valuable work. India. December 15-18, 2023, CEUR Workshop Proceedings, CEUR-WS.org, 2023. [23] S. Madhu, Hiren Satapara, P. Pandya, N. Shah, T. Mandl, S. Modha, Overview of the hasoc subtrack at fire 2023: Identification of conversational hate-speech, in: K. Ghosh, T. Mandl, P. Majumder, M. Mitra (Eds.), Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation, Goa, India. December 15-18, 2023, CEUR Workshop Proceedings, CEUR-WS.org, 2023. [24] S. Masud, M. A. Khan, M. S. Akhtar, T. Chakraborty, Overview of the HASOC Subtrack at FIRE 2023: Identification of Tokens Contributing to Explicit Hate in English by Span Detection, in: Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation, Goa, India. December 15-18, 2023, CEUR Workshop Proceedings, CEUR-WS.org, 2023. [25] K. Ghosh, A. Senapati, Hate speech detection: an analysis of mono and multilingual transformer models with cross-language evaluation on hindi, marathi, bangla, and bodo language, Natural Language Engineering Accepted on 26.10.2023 (2023). [26] K. Ghosh, A. Senapati, M. Narzary, M. Brahma, Hate speech detection in low-resource bodo and assamese texts with ml-dl and bert models, Scalable Computing: Practice and Experience 24 (2023) 941–955. [27] K. Ghosh, D. Sonowal, A. Basumatary, B. Gogoi, A. Senapati, Transformer-based hate speech detection in assamese, in: 2023 IEEE Guwahati Subsection Conference (GCON), 2023, pp. 1–5. doi:10.1109/GCON58516.2023.10183497. [28] K. Ghosh, D. A. Senapati, Hate speech detection: a comparison of mono and multilingual transformer model with cross-language evaluation, in: Proceedings of the 36th Pacific Asia Conference on Language, Information and Computation, Association for Computational Linguistics, Manila, Philippines, 2022, pp. 853–865. URL: https://aclanthology.org/2022. paclic-1.94. [29] K. Ghosh, A. Senapati, U. Garain, Baseline bert models for conversational hate speech detection in code-mixed tweets utilizing data augmentation and ofensive language identiifcation in marathi, in: Forum for Information Retrieval Evaluation (Working Notes)(FIRE).

CEUR-WS. org, 2022. [30] S. Saha, M. Sullivan, R. Srihari, Hate Speech Detection in Low Resource Indo-Aryan Languages, in: Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation, CEUR, 2023. [31] M. S. Jahan, F. Hassan, W. Aransa, A. Bouchekif, Multilingual Hate Speech Detection Using Ensemble of Transformer Models, in: Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation, CEUR, 2023. [32] K. M. Jhuma, M. Oussalah, A. Singhal, Cross-Linguistic Ofensive Language Detection: BERT-Based Analysis of Bengali, Assamese, & Bodo Conversational Hateful Content from Social Media, in: Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation, CEUR, 2023. [33] Y. Bestgen, Using Only Character Ngrams for Hate Speech and Ofensive Content Identiifcation in Five Low-Ressource Languages, in: Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation, CEUR, 2023. [34] C. Muhammad Awais, J. Raj, Breaking Barriers: Multilingual Toxicity Analysis for Hate Speech and Ofensive Language in Low-Resource Indo-Aryan Languages, in: Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation, CEUR, 2023. [35] A. Joshi, R. Joshi, Harnessing Pre-Trained Sentence Transformers for Ofensive Language Detection in Indian Languages, in: Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation, CEUR, 2023. [36] N. Narayan, M. Biswal, P. Goyal, A. Panigrahi, Hate Speech and Ofensive Content Detection in Indo-Aryan Languages: A Battle of LSTM and Transformers, in: Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation, CEUR, 2023. [37] A. Stekel, A. Prives, Y. HaCohen-Kerner, Detecting Ofensive Language in Bengali, Bodo, and Assamese using Word Unigrams, Char N-grams, Classical Machine Learning, and Deep Learning Methods, in: Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation, CEUR, 2023. [38] A. Reddy Gutha, N. Sai Adarsh, A. Alekar, D. Reddy, Multilingual Hate Speech and Ofensive Language Detection of Low Resource Languages, in: Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation, CEUR, 2023. [39] P. M, R. K, A. Hegde, K. G, S. Coelho, H. L. Shashirekha, Taming Toxicity: Learning Models for Hate Speech and Ofensive Language Detection in Social Media Text, in: Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation, CEUR, 2023. [40] G. Kalita, E. Halder, C. Taparia, A. Vetagiri, D. P. Pakray, Examining Hate Speech Detection Across Multiple Indo-Aryan Languages in Tasks 1 & 4, in: Working Notes of FIRE 2023 Forum for Information Retrieval Evaluation, CEUR, 2023. [41] C. Senapati, U. Roy, Bengali Hate Speech Detection Using Deep Learning Technique, in:

Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation, CEUR, 2023. [42] R.-E. Fan, K.-W. Chang, C.-J. Hsieh, X.-R. Wang, C.-J. Lin, Liblinear: A library for large linear classification, J. Mach. Learn. Res. 9 (2008) 1871–1874.

[1]

M. L.

Williams ,

Burnap ,

Javed , H. Liu,

Ozalp , Hate in the Machine: Anti-Black and Anti-Muslim Social Media Posts as Predictors of Ofline Racially and Religiously Aggravated Crime , The British Journal of Criminology 60 ( 2019 ) 93 - 117 . URL: https://doi.org/10.1093/bjc/azz049. doi: 10 .1093/bjc/azz049. arXiv:https://academic.oup.com/bjc/articlepdf/60/1/93/31634412/azz049.pdf.

[2]

Laub , Hate speech on social media: Global comparisons , Council on foreign relations 7 ( 2019 ).

[3]

Nicholas , C. Ezeibe, The state, hate speech regulation and sustainable democracy in africa: a study of nigeria and kenya , African Identities ( 2020 ). doi: 10 .1080/14725843. 2020 . 1813548 .

[4]

Quintel ,

Ullrich , Self-Regulation of Fundamental Rights? The EU Code of Conduct on Hate Speech, Related Initiatives and Beyond, Edward Elgar Publishing, 2019 . Available at SSRN: https://ssrn.com/abstract=3298719.

[5]

Jaki , T. De Smedt,

Gwóźdź ,

Panchal ,

Rossa , G. De Pauw, Online hatred of women in the incels.me forum: Linguistic analysis and automatic detection , Journal of Language Aggression and Conflict 7 ( 2019 ) 240 - 268 . URL: https://www.jbe-platform.com/content/ journals/10.1075/jlac.00026.jak. doi:https://doi.org/10.1075/jlac.00026.jak.

[6]

G. L.

Casey , Ending the incel rebellion: The tragic impacts of an online hate group , Loyola Journal of Public Interest Law 21 ( 2019 ) 71 . URL: https://heinonline.org/HOL/P?h=hein. journals/loyjpubil21&i=79.

[7]

Poletto ,

Basile ,

Sanguinetti ,

Bosco ,

Patti , Resources and benchmark corpora for hate speech detection: a systematic review , Language Resources and Evaluation 55 ( 2021 ) 1 - 47 . doi: 10 .1007/s10579-020-09502-8.

[8]

Zampieri ,

Malmasi ,

Nakov ,

Rosenthal ,

Farra ,

Kumar , Predicting the type and target of ofensive posts in social media, in: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Association for Computational Linguistics , Minneapolis, Minnesota, 2019 , pp. 1415 - 1420 . URL: https://aclanthology.org/ N19-1144. doi: 10 .18653/v1/ N19 -1144.

[9]

Davidson ,

Warmsley ,

Macy , I. Weber , Automated hate speech detection and the problem of ofensive language , Proceedings of the International AAAI Conference on Web and Social Media 11 ( 2017 ) 512 - 515 . URL: https://ojs.aaai.org/index.php/ICWSM/article/ view/14955. doi: 10 .1609/icwsm.v11i1. 14955 .

[10]

Kwok ,

Wang , Locate the hate: Detecting tweets against blacks , Proceedings of the AAAI Conference on Artificial Intelligence 27 ( 2013 ) 1621 - 1622 . URL: https://ojs.aaai.org/ index.php/AAAI/article/view/8539. doi: 10 .1609/aaai.v27i1. 8539 .

[11] Kaggle , Toxic comment classification challenge: Identify and classify toxic online comments ( 2017 ). URL: https://www.kaggle.com/c/ jigsaw-toxic -comment-classification-challenge.

[12]

Pitenis ,

Zampieri , T. Ranasinghe, Ofensive language identification in greek , CoRR abs/ 2003 .07459 ( 2020 ). URL: https://arxiv.org/abs/ 2003 .07459. arXiv: 2003 .07459.

[13]

Fortuna , J. Rocha da Silva, J. Soler-Company,

Wanner ,

Nunes , A hierarchicallylabeled Portuguese hate speech dataset , in: Proceedings of the Third Workshop on Abusive Language Online , Association for Computational Linguistics, Florence, Italy, 2019 , pp. 94 - 104 . URL: https://aclanthology.org/W19-3510. doi: 10 .18653/v1/ W19 - 3510.

[14]

G. I.

Sigurbergsson , L. Derczynski, Ofensive language and hate speech detection for danish , CoRR abs/ 1908 .04531 ( 2019 ). URL: http://arxiv.org/abs/ 1908 .04531. arXiv: 1908 .04531.

[15]

Aragon ,

M. A.

Carmona ,

Montes ,

H. J.

Escalante ,

Villaseñor-Pineda ,

Moctezuma , Overview of mex-a3t at iberlef 2019: Authorship and aggressiveness analysis in mexican spanish tweets , 2019 .

[16] Ç. Çöltekin , A corpus of Turkish ofensive language on social media , in: Proceedings of the Twelfth Language Resources and Evaluation Conference , European Language Resources Association, Marseille, France, 2020 , pp. 6174 - 6184 . URL: https://aclanthology.org/ 2020 . lrec- 1 . 758 .

[17]

Mandl ,

Modha ,

Majumder ,

Patel ,

Dave ,

Mandlia ,

Patel , Overview of the hasoc track at fire 2019: Hate speech and ofensive content identification in indo-european languages , in: Proceedings of the 11th Forum for Information Retrieval Evaluation , FIRE '19, Association for Computing Machinery, New York, NY, USA, 2019 , p. 14 - 17 . URL: https://doi.org/10.1145/3368567.3368584. doi: 10 .1145/3368567.3368584.

[18]

Mandl ,

Modha , A. Kumar

B. R.

Chakravarthi , Overview of the hasoc track at fire 2020: Hate speech and ofensive language identification in tamil, malayalam, hindi, english and german, in: Forum for Information Retrieval Evaluation , FIRE 2020 , Association for Computing Machinery , New York, NY, USA, 2020 , p. 29 - 32 . URL: https://doi.org/10.1145/ 3441501.3441517. doi: 10 .1145/3441501.3441517.

[19]

Modha ,

Mandl ,

G. K.

Shahi ,

Madhu ,

Satapara ,

Ranasinghe , M. Zampieri, Overview of the hasoc subtrack at fire 2021: Hate speech and ofensive content identification in english and indo-aryan languages and conversational hate speech, in: Forum for Information Retrieval Evaluation , FIRE 2021 , Association for Computing Machinery , New York, NY, USA, 2021 , p. 1 - 3 . URL: https://doi.org/10.1145/3503162.3503176. doi: 10 .1145/3503162.3503176.

[20]

Romim ,

Ahmed ,

M. S.

Islam ,

A. Sen

Sharma ,

Talukder ,

M. R.

Amin , BD-SHS: A benchmark dataset for learning to detect online bangla hate speech in diferent social contexts , in: Proceedings of the Thirteenth Language Resources and Evaluation Conference , European Language Resources Association, Marseille, France, 2022 , pp. 5153 - 5162 . URL: https://aclanthology.org/ 2022 .lrec- 1 . 552 .

[21]

Gupta ,

Roychowdhury , M. Das , S.

Banerjee , P.

Saha , B.

Mathew , h. p. vanchinathan, A. Mukherjee, Multilingual abusive comment detection at scale for indic languages , in: S. Koyejo,

Mohamed ,

Agarwal ,

Belgrave ,

Cho , A . Oh (Eds.), Advances in Neural Information Processing Systems , volume 35 , Curran

Associates

, Inc., 2022 , pp. 26176 - 26191 . URL: https://proceedings.neurips.cc/paper_files/paper/2022/file/ a7c4163b33286261b24c72fd3d1707c9-Paper-Datasets_and_Benchmarks.pdf.

[22]

Satapara ,

Madhu ,

Ranasinghe ,

A. E.

Dmonte ,

Zampieri ,

Pandya ,

Shah ,

Sandip ,

Majumder , T. Mandl, Overview of the hasoc subtrack at fire 2023: Hatespeech identification in sinhala and gujarati , in: K. Ghosh,

Mandl ,

Majumder , M. Mitra (Eds.), Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation, Goa,