<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Forum for Information Retrieval Evaluation, December</journal-title>
      </journal-title-group>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Hate Speech Detection in Assamese, Bengali, and Bodo languages</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Koyel Ghosh</string-name>
          <email>ghosh.koyel8@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Apurbalal Senapati</string-name>
          <email>a.senapati@cit.ac.in</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Aditya Shankar Pal</string-name>
          <email>adityashankarpal_r@isical.ac.in</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="editor">
          <string-name>Transformers, BERT</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Central Institute of Technology</institution>
          ,
          <addr-line>Kokrajhar, Assam</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Hate Speech Detection</institution>
          ,
          <addr-line>Binary Classification, Assamese, Bengali, Bodo, Machine Learning, Deep Learning</addr-line>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Indian Statistical Institute</institution>
          ,
          <addr-line>Kolkata</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>1</volume>
      <fpage>5</fpage>
      <lpage>18</lpage>
      <abstract>
        <p>In today's world, social media can act as a tool for spreading hate towards a person or group based on their color, caste, sex, sexual orientation, political differences, etc. As social media continues to expand, hate speech is proliferating at an alarming rate. Recently, research on identifying hate speech in social media has gained significant prominence, with a specific need for investigations focused on languages other than English. The HASOC (Hate Speech and Offensive Content Identification) track has provided a platform for hate speech detection at FIRE (Forum for Information Retrieval Evaluation) since 2019. HASOC 2023 coordinates four tasks, one of which is AH (Annihilate Hates, Task 4). The AH task aims to develop and assess supervised machine learning systems on three datasets for hate speech in three Indian languages (Assamese, Bengali, and Bodo), collected from YouTube™ and Facebook™ comments. Each dataset is tagged with binary classification labels (hate or non-hate). For Assamese, 20 teams made 180 submissions; for Bengali, 21 teams made 214 submissions; and for Bodo, 19 teams made 175 submissions. The best classifiers for Assamese, Bengali, and Bodo achieved macro F1 scores of 0.73, 0.77, and 0.85, respectively. This article briefly summarizes the tasks, data development, and results. A variant of the BERT architecture achieved the best performance in the task. However, other systems have also been successfully applied to the task.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>
        In addition to fostering friendships and facilitating information sharing, popular social
media platforms such as Twitter™, Facebook™, and YouTube™ have also become platforms for
cyberbullying and online harassment. These negative aspects can have severe consequences,
including causing depression and inciting individuals to engage in violent actions, as evidenced
in studies like [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ]. Instances of hate speech on these platforms have disrupted social and
communal harmony on a global scale. Consequently, many countries have introduced
increasingly complex regulations to address offensive online content, as discussed in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] and [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. This
situation has created a crucial need for automated methods to detect suspicious posts. It is worth
noting that most research in this area has focused on English and similar languages.
Low-resource languages, on the other hand, still lack annotated datasets. Linguists have
examined and characterized different manifestations of hate speech [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], while political scholars
and legal authorities explore methods to govern online platforms and address problematic
content while preserving the principles of free expression [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Algorithms keep
improving, and researchers continue to build and study new datasets for a wide range of
tasks. Recently, researchers have created datasets for many different languages [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], such as
English [
        <xref ref-type="bibr" rid="ref10 ref11 ref8 ref9">8, 9, 10, 11</xref>
        ], Greek [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], Portuguese [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], Danish [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], Mexican Spanish [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], and
Turkish [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. For Indian languages, hate speech datasets are available in Hindi [
        <xref ref-type="bibr" rid="ref17 ref18 ref19">17, 18, 19</xref>
        ], Marathi [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ],
Bengali [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ], Telugu [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ], Tamil [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ], Malayalam [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ], and Kannada [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]. Having these
different datasets makes it possible to compare how similar or different they are and how reliable
they are.
      </p>
      <p>
        In HASOC 2023, four tasks (Task 1 to Task 4) in the research area of hate speech detection
are proposed. Task 1 [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ] focuses on identifying hate speech, offensive language, and profanity
in different languages using natural language processing techniques. Task 2 [23], known as the
Identification of Conversational Hate-Speech in Code-Mixed Languages (ICHCL), addresses
the challenge of identifying hate speech and offensive content in code-mixed conversations on
social media. Code-mixed text includes multiple languages within a single conversation. The
task is divided into two subtasks. Task 3 [24] aims to detect the various hateful spans within a
sentence already considered hateful. A hate span is a set of contiguous tokens that together
communicate the explicit hatefulness of a sentence.
      </p>
      <p>This paper provides an overview of Task 4, i.e., Annihilate Hates (AH), which contributes
task-specific (hate speech detection) low-resource datasets in three languages: Assamese,
Bengali, and Bodo. The AH dataset is version 3, an updated release of the HS (version 2) [25] and
NEIHS (version 1) [26, 27] datasets.</p>
    </sec>
    <sec id="sec-3">
      <title>2. Related Forum and Dataset</title>
      <p>The main obstacle in hate speech detection is the requirement for language-specific datasets.
Constructing labeled datasets for hate speech in Indian languages is a laborious and intricate
endeavor. It needs extensive groundwork and preprocessing tasks such as data cleaning and
ensuring agreement among annotators. This section provides a concise overview of Indian
datasets in languages like Hindi, Marathi, Bengali, Telugu, Tamil, Malayalam, and Kannada.</p>
      <p>
        The HASOC challenge, organized by FIRE (Forum for Information Retrieval Evaluation)1,
has played a significant role in providing hate speech datasets in Indian languages like Hindi,
Marathi, etc. HASOC comprises four subtracks, and the dataset distribution is in a tab-separated
format. In 2019, the HASOC-Hindi dataset introduced three tasks as described in [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. Subtask A
is the initial task involving binary classification. Subtask B focuses on identifying the profanity
or abuse within hate comments, a multiclass classification task. Subtask C is centered on
determining whether the hate speech is targeted at a specific individual or if it’s more general
and untargeted. In the HASOC 2020 edition, two hate speech detection tasks were presented, as
mentioned in [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. Subtask A involves binary classification, and Subtask B addresses multiclass
classification. These tasks are accompanied by another Hindi dataset, expanding the research
scope in this area. In 2021, HASOC published a Hindi dataset [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] with sub-tasks A and B
again. In total, sixty-five teams submitted six thousand and fifty-two runs.
HASOC-Marathi [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] is a Marathi hate speech dataset with a binary classification task. The authors of [28, 29]
experimented on the HASOC datasets and analyzed the performance of transformer-based models in
detail. BD-SHS [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] is a Bengali hate speech dataset with three levels: hate speech identification
(binary classification, i.e., hate and not hate), identification of the target of hate speech
(multi-label classification, i.e., individual, male, female, and group), and categorization of hate speech
types (multi-label classification, i.e., slander, call to violence, gender, religion). The authors of [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]
created datasets for several Indian languages, i.e., Hindi, Telugu, Tamil, Malayalam, and Kannada, and later
performed monolingual, unbalanced-split, zero-shot cross-lingual, few-shot, joint-training,
pretraining, and cross-dataset experiments on those datasets.
      </p>
    </sec>
    <sec id="sec-4">
      <title>3. Task Description</title>
      <p>In HASOC 2023<sup>2</sup>, Task 4 is Annihilate Hates (AH), proposed in the research
area of hate speech detection. The task is offered in three languages: Assamese, Bengali, and
Bodo. Figure 1 shows a screenshot of the Annihilate Hates (AH) website<sup>3</sup>.</p>
      <sec id="sec-4-1">
        <title>3.1. Sub-task A: Hate Speech Detection in Assamese, Bengali, and Bodo (Binary)</title>
        <p>Task 4 aims to detect hate speech in the Assamese, Bengali, and Bodo languages. Each dataset
(one per language) consists of a list of comments with their corresponding class: hate or
offensive (HOF) or not hate (NOT). Data is primarily collected from Facebook™ and YouTube™
comments. It is a binary classification task in which participating systems are required to
classify the comments into two classes: HOF and NOT. Figure 2 shows tagged samples from
the AH-Assamese, AH-Bengali, and AH-Bodo datasets.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4. Dataset Description</title>
      <sec id="sec-5-1">
        <title>4.1. Dataset Collection</title>
        <p>This section discusses dataset collection, annotation, and analysis for Task 4.
Our primary aim in constructing this dataset is to ensure its diversity, so we intentionally
selected Facebook™ pages and YouTube™ channels covering politics, entertainment, and other topics. We
initiated the process by identifying contentious posts, often related to recent events or prominent
figures such as politicians and actors, which had a higher likelihood of containing hate speech.
Subsequently, we scrutinized the comments on these posts, seeking those primarily written in a</p>
        <sec id="sec-5-1-1">
          <title>2 https://hasocfire.github.io/hasoc/2023/index.html (Accessed on 30.10.2023) 3 https://sites.google.com/view/hasoc-2023-annihilate-hates/home (Accessed on 30.10.2023)</title>
          <p>determine whether these comments contained hate speech and categorized them accordingly.
All the comments were collected using open-source scraper tools<sup>4</sup>. Ultimately, native speakers
tagged the sentences as either HOF or NOT. Sentences that fell under the HOF category usually
contained hate-related words and were considered hate-offensive statements. In contrast,
sentences conveying formal information, suggestions, or questions were categorized as NOT
sentences.</p>
        </sec>
        <sec id="sec-5-1-2">
          <title>4 https://github.com/kevinzg/facebook-scraper (Accessed on 30.10.2023)</title>
        </sec>
      </sec>
      <sec id="sec-5-2">
        <title>4.2. Dataset Annotation</title>
        <p>
          For dataset annotation, we shared three separate CSV files with the three annotators. These files
contain three columns: S. No. (serial number), text (comments), and task_1 (binary, i.e., HOF or
NOT). The data for each language was tagged manually by three native speakers, young adults
aged between 19 and 24. These annotators are students at the Central Institute of Technology in
Kokrajhar, Assam, India. Their task involved manually classifying comments into two categories:
those containing hateful (HOF) content and those that did not (NOT), using binary labels. The
final decision was taken in consultation with a domain expert. Identifying hate speech is a
subjective task, and it requires careful consideration. Consequently, we have established specific
and rigorous guidelines to help define what qualifies as hate speech. These guidelines are
based on the community standards of Facebook™<sup>5</sup> and YouTube™<sup>6</sup>. The authors of [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ] follow
the rules listed below for marking comments as hate, and we follow the same scheme
with some updates. (a) Profanity: Comments that include profane language, curses, or vulgar words
are categorized as hate speech. (b) Sexual orientation: Comments that target a person’s sexual orientation,
i.e., sexual attraction directed toward individuals of the opposite gender, the same gender, both genders, or multiple genders. (c)
Personal: Comments regarding one’s fashion sense, choice of content, language selection, and
related aspects. (d) Gender chauvinism: People are targeted in the comment because of their
gender. (e) Religious: A person is criticized for their choice of religious beliefs and practices, for
example, comments challenging the use of a turban or a burkha (the veil). (f) Political: A person
is harassed based on political beliefs, for instance, bullying people for supporting a political party.
(g) Violent intention: Comments containing a threat or call to violence.
        </p>
        <p>Different annotators annotated the AH datasets, and the majority vote was taken as the label; the
annotation agreement, calculated using the κ (Kappa) coefficient, is shown in Table 1. The problems
and the level of disagreement need to be explored in the future.</p>
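        <p>The majority vote and the agreement statistic can be illustrated with a minimal sketch. The text does not state which κ variant was used; the sketch assumes Fleiss’ κ, a common choice when three raters label every item. The function names and the four toy annotations are hypothetical and are not taken from the AH data.</p>
        <preformat><![CDATA[
```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label among one comment's annotations."""
    return Counter(labels).most_common(1)[0][0]


def fleiss_kappa(annotations, categories=("HOF", "NOT")):
    """Fleiss' kappa for per-comment label lists (same number of raters each).

    Assumption: this is one plausible reading of the "Kappa coefficient";
    the paper does not name the exact variant.
    """
    n_items = len(annotations)
    n_raters = len(annotations[0])
    # counts[i][j]: number of raters who assigned comment i to category j
    counts = [[row.count(c) for c in categories] for row in annotations]
    # Mean observed agreement over all comments
    p_bar = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_items
    # Agreement expected by chance, from the overall category proportions
    totals = [sum(row[j] for row in counts) for j in range(len(categories))]
    p_e = sum((t / (n_items * n_raters)) ** 2 for t in totals)
    # Note: undefined (division by zero) if every rating falls in one category
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical toy annotations: three annotators, four comments
annotations = [
    ["HOF", "HOF", "HOF"],
    ["NOT", "NOT", "NOT"],
    ["HOF", "HOF", "NOT"],
    ["NOT", "NOT", "HOF"],
]
majority_labels = [majority_vote(row) for row in annotations]
kappa = fleiss_kappa(annotations)
print(majority_labels, round(kappa, 2))
```
]]></preformat>
        <p>With three annotators and two classes, a majority always exists, so no tie-breaking rule is needed in this sketch.</p>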
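        <table-wrap>
          <label>Table 1</label>
          <table>
            <thead>
              <tr><th>Datasets</th><th>κ statistics</th></tr>
            </thead>
            <tbody>
              <tr><td>AH-Assamese</td><td>0.67</td></tr>
              <tr><td>AH-Bengali</td><td>0.54</td></tr>
              <tr><td>AH-Bodo</td><td>0.81</td></tr>
            </tbody>
          </table>
        </table-wrap>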
      </sec>
      <sec id="sec-5-3">
        <title>4.3. Dataset Analysis</title>
        <p>We summarize the key statistics of the AH dataset in Table 2. In the Assamese dataset, 2,955
of 5,045 comments are HOF. In the Bengali dataset, 641 of 1,601 comments are HOF,
so the NOT class is the majority. In the Bodo dataset, 1,225 of 2,099 comments are HOF. As a result,
our Assamese and Bodo datasets are slightly skewed in favour of containing hate speech. Figure 3
shows the details of the class distribution. The training sets contain 4,036, 1,281, and 1,679 comments
for the Assamese, Bengali, and Bodo datasets, respectively.</p>
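        <p>The class balance implied by these counts can be worked out with a short sketch. The totals, HOF counts, and training-set sizes are the figures quoted in the text; the NOT counts, HOF shares, and remaining (non-training) counts are derived here by subtraction and division, and treating the remainder as held-out data is an assumption. The dictionary layout and function name are hypothetical.</p>
        <preformat><![CDATA[
```python
# Counts quoted in the text for each AH dataset
stats = {
    "AH-Assamese": {"total": 5045, "hof": 2955, "train": 4036},
    "AH-Bengali": {"total": 1601, "hof": 641, "train": 1281},
    "AH-Bodo": {"total": 2099, "hof": 1225, "train": 1679},
}


def summarize(entry):
    """Derive the NOT count, HOF share, and non-training remainder."""
    return {
        "not": entry["total"] - entry["hof"],
        "hof_share": entry["hof"] / entry["total"],
        # Assumption: comments outside the training set form the held-out part
        "remainder": entry["total"] - entry["train"],
    }


summary = {name: summarize(entry) for name, entry in stats.items()}
for name, row in summary.items():
    print(name, row["not"], round(row["hof_share"], 3), row["remainder"])
```
]]></preformat>
        <p>Under these figures the HOF share is roughly 59% for Assamese, 40% for Bengali, and 58% for Bodo, which matches the skew described above, and the training sets cover about 80% of each dataset.</p>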
        <sec id="sec-5-3-1">
          <title>5 https://web.facebook.com/communitystandards/ (Accessed on 30.10.2023) 6 https://www.youtube.com/howyoutubeworks/policies/community-guidelines/ (Accessed on 30.10.2023)</title>
          <p>Dataset
AH-Assamese
AH-Bengali
AH-Bodo</p>
          <p>Total</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>5. Result</title>
      <p>The macro approach computes the F1 score separately for each class and averages the
per-class scores without weighting them by class frequency. As a result, it imposes a heavier penalty when
a system performs poorly on minority classes. The choice of a specific F1 variant
depends on the task’s objectives and the label distribution in the dataset. Hate speech
classification tasks often face class imbalance, making macro F1 a suitable choice
for evaluation.</p>
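      <p>As a minimal sketch of the macro-averaged F1 described above, the following pure-Python functions compute a per-class F1 and take the unweighted mean over the HOF and NOT classes. The function names and the toy labels are hypothetical, and returning 0 for a class with no true positives is a common convention assumed here, not something stated in the task description.</p>
      <preformat><![CDATA[
```python
def f1_for_class(y_true, y_pred, label):
    """F1 score of one class, treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        # Assumed convention: F1 is 0 when the class is never predicted correctly
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred, labels=("HOF", "NOT")):
    """Unweighted mean of the per-class F1 scores."""
    return sum(f1_for_class(y_true, y_pred, lab) for lab in labels) / len(labels)


# Hypothetical imbalanced toy set: 8 HOF, 2 NOT; the system predicts HOF everywhere
y_true = ["HOF"] * 8 + ["NOT"] * 2
y_pred = ["HOF"] * 10
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = macro_f1(y_true, y_pred)
print(accuracy, round(score, 3))
```
]]></preformat>
      <p>In this toy case the accuracy is 0.8 while the macro F1 is about 0.44, because the minority NOT class contributes an F1 of 0 with equal weight.</p>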
      <p>For run submission and evaluation of the participants’ experiments, we rely on the
Kaggle platform. Figure 4 shows a screenshot of the Annihilate Hates (AH) Kaggle website
for run submission. We provide separate Kaggle competition pages for Assamese<sup>7</sup>, Bengali<sup>8</sup>, and Bodo<sup>9</sup> for
participants to submit experimental runs.</p>
      <p>Overall, 69 participants registered for Task 4. In the Assamese task, 20 teams made 180
submissions, while 21 teams submitted 214 runs in the Bengali task, and for the Bodo task, 19
teams submitted a total of 175 runs.</p>
      <p>The best classification systems for Assamese, Bengali, and Bodo achieved
macro F1 scores of 0.73, 0.77, and 0.85, respectively. The results for the AH-Assamese,
AH-Bengali, and AH-Bodo datasets are shown in Table 3, Table 4, and Table 5, respectively.</p>
    </sec>
    <sec id="sec-7">
      <title>6. Methodology</title>
      <p>This section discusses the systems utilized by the participants.</p>
      <sec id="sec-7-1">
        <title>6.1. AH-Assamese</title>
        <list list-type="bullet">
          <list-item><p>Chetona [30] proposes ensembling IndicBERT and Naive Bayes, along with synthetic data upsampling: the training examples of each language are up-sampled by translating the examples from the other two languages into the given language.</p></list-item>
          <list-item><p>FiRC-NLP [31] fine-tunes the pre-trained XLM-RoBERTa-large model, reaching second position on the leaderboard.</p></list-item>
          <list-item><p>TeamBD [32] experimented with xlm-roberta-large (multilingual) along with ChatGPT3 augmentation.</p></list-item>
          <list-item><p>SATLab [33] uses the LIBLINEAR L2-regularized logistic regression model (dual, -s 7) [42].</p></list-item>
          <list-item><p>AI Alchemists [34] fine-tuned the XLM-RoBERTa model.</p></list-item>
          <list-item><p>Sanvadita [35] uses the monolingual assamese-bert<sup>10</sup> and the multilingual indic-bert<sup>11</sup> models.</p></list-item>
          <list-item><p>Z-AGI Labs [36] experiments with fine-tuning various multilingual transformer-based models, such as Bert-Base-Multilingual (Cased and Uncased), DistilBert-Base</p></list-item>
        </list>
        <p>7 https://www.kaggle.com/competitions/annihilate-hates-assamese (Accessed on 30.10.2023)</p>
        <p>8 https://www.kaggle.com/competitions/annihilate-hates-bengali (Accessed on 30.10.2023)</p>
        <p>9 https://www.kaggle.com/competitions/annihilate-hates-bodo (Accessed on 30.10.2023)</p>
        <p>10 https://huggingface.co/l3cube-pune/assamese-bert (Accessed on 30.10.2023)</p>
        <p>11 https://huggingface.co/ai4bharat/indic-bert (Accessed on 30.10.2023)</p>
      </sec>
      <sec id="sec-7-2">
        <title>6.2. AH-Bengali</title>
        <list list-type="bullet">
          <list-item><p>Sanvadita [35] uses the pre-trained monolingual Bengali Sentence-BERT<sup>14</sup> and Bengali-BERT<sup>15</sup> models and the multilingual Indic Sentence-BERT<sup>16</sup>.</p></list-item>
          <list-item><p>FiRC-NLP [31] utilizes the XLM-RoBERTa-large model.</p></list-item>
          <list-item><p>Z-AGI Labs [36] utilizes several pre-trained models for the experiments but achieves the highest score by fine-tuning the csebuetnlp/banglabert pre-trained model.</p></list-item>
          <list-item><p>TeamBD [32] experiments with the xlm-roberta-large model (multilingual).</p></list-item>
          <list-item><p>AI Alchemists [34] fine-tuned the XLM-RoBERTa model.</p></list-item>
          <list-item><p>Code Fellas [38] fine-tuned MuRIL for Bengali, which gave the best results among all the experiments done by the team.</p></list-item>
          <list-item><p>Chetona [30] applies the same system to Bengali as to Assamese.</p></list-item>
          <list-item><p>SATLab [33] utilizes the same system as for Assamese.</p></list-item>
          <list-item><p>MUCS [39] trains an SVM with TF-IDF of syllable n-grams and TF-IDF of char n-grams, both in the range (1, 3).</p></list-item>
          <list-item><p>JCT/Avigail Stekel [37]: among all the experiments they performed (mentioned in the Assamese section), their best model for Bengali is an MNB model with 6-gram features.</p></list-item>
          <list-item><p>CNLP-NITS-PP [40]: a CNN-based binary classification model with FastText embeddings outperforms the other systems.</p></list-item>
          <list-item><p>CHANDAN SENAPATI [41] implements an LSTM deep learning model.</p></list-item>
        </list>
        <p>12 https://huggingface.co/ibraheemmoosa/xlmindic-base-uniscript (Accessed on 30.10.2023)</p>
        <p>13 https://huggingface.co/ibraheemmoosa/xlmindic-base-multiscript (Accessed on 30.10.2023)</p>
        <p>14 https://huggingface.co/l3cube-pune/bengali-sentence-bert-nli (Accessed on 30.10.2023)</p>
        <p>15 https://huggingface.co/l3cube-pune/bengali-bert (Accessed on 30.10.2023)</p>
        <p>16 https://huggingface.co/l3cube-pune/indic-sentence-bert-nli (Accessed on 30.10.2023)</p>
      </sec>
      <sec id="sec-7-3">
        <title>6.3. AH-Bodo</title>
        <list list-type="bullet">
          <list-item><p>SATLab [33] utilizes the same system applied to the Assamese dataset.</p></list-item>
          <list-item><p>JCT/Avigail Stekel [37]: their best submission for Bodo is an LR model with all word unigrams in the training set.</p></list-item>
          <list-item><p>FiRC-NLP [31] utilizes the XLM-RoBERTa-large model, the best submission among all their experiments.</p></list-item>
          <list-item><p>Chetona [30] applies the same system to the Bodo language as described in the Assamese section.</p></list-item>
          <list-item><p>AI Alchemists [34] fine-tuned the XLM-RoBERTa model.</p></list-item>
          <list-item><p>MUCS [39] trains an SVM with TF-IDF of syllable n-grams and TF-IDF of char n-grams, both in the range (1, 3), which obtained the best macro F1 scores.</p></list-item>
          <list-item><p>Code Fellas [38] uses a BiLSTM model enhanced with an additional dense layer, attaining an impressive F1 score for Bodo.</p></list-item>
          <list-item><p>Z-AGI Labs/Nikhil Narayan [36] fine-tuned a pre-trained Bert Base Multilingual Cased model for Bodo.</p></list-item>
          <list-item><p>TeamBD [32] applies the xlm-roberta-large model (multilingual).</p></list-item>
          <list-item><p>CNLP-NITS-PP [40] gets its best result for the Bodo dataset with logistic regression.</p></list-item>
        </list>
      </sec>
    </sec>
    <sec id="sec-8">
      <title>7. Conclusion</title>
      <p>The submissions in the AH task (Task 4, HASOC 2023) have shown transformer-based
pretrained models to be the state-of-the-art approach for Hate Speech detection in the Assamese
and Bengali datasets. However, the L2-regularized logistic regression model gives the best result
for the Bodo dataset. Other deep learning models, like LSTM, CNNs, etc., also perform well on
the given datasets. Upon reviewing the outcomes, the most suitable approach for hate speech
detection depends on factors such as the language of the dataset, the level of classification detail,
and the distribution of class labels. Balancing an imbalanced training dataset could impact
the classification system’s effectiveness. In the long run, the AH task aims to provide more
low-resourced data with binary and multi-label classification tasks.</p>
    </sec>
    <sec id="sec-9">
      <title>8. Acknowledgement</title>
      <p>We thank Mr. Debarshi Sonowal, Mr. Abhilash Basumatary, and Ms. Bidisha Gogoi for their
help in collecting and tagging the Assamese hate dataset. Additionally, we extend our thanks to
Mr. Maharaj Brahma and Mr. Mwnthai Narzary for their valuable contributions in collecting
and labelling the Bodo hate dataset. We also thank the FIRE and HASOC organizers for their
support in organizing the track. We thank all participants for their submissions and their
valuable work.</p>
      <p>India. December 15-18, 2023, CEUR Workshop Proceedings, CEUR-WS.org, 2023.</p>
      <p>[23] S. Madhu, Hiren Satapara, P. Pandya, N. Shah, T. Mandl, S. Modha, Overview of the hasoc subtrack at fire 2023: Identification of conversational hate-speech, in: K. Ghosh, T. Mandl, P. Majumder, M. Mitra (Eds.), Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation, Goa, India. December 15-18, 2023, CEUR Workshop Proceedings, CEUR-WS.org, 2023.</p>
      <p>[24] S. Masud, M. A. Khan, M. S. Akhtar, T. Chakraborty, Overview of the HASOC Subtrack at FIRE 2023: Identification of Tokens Contributing to Explicit Hate in English by Span Detection, in: Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation, Goa, India. December 15-18, 2023, CEUR Workshop Proceedings, CEUR-WS.org, 2023.</p>
      <p>[25] K. Ghosh, A. Senapati, Hate speech detection: an analysis of mono and multilingual transformer models with cross-language evaluation on hindi, marathi, bangla, and bodo language, Natural Language Engineering Accepted on 26.10.2023 (2023).</p>
      <p>[26] K. Ghosh, A. Senapati, M. Narzary, M. Brahma, Hate speech detection in low-resource bodo and assamese texts with ml-dl and bert models, Scalable Computing: Practice and Experience 24 (2023) 941–955.</p>
      <p>[27] K. Ghosh, D. Sonowal, A. Basumatary, B. Gogoi, A. Senapati, Transformer-based hate speech detection in assamese, in: 2023 IEEE Guwahati Subsection Conference (GCON), 2023, pp. 1–5. doi:10.1109/GCON58516.2023.10183497.</p>
      <p>[28] K. Ghosh, D. A. Senapati, Hate speech detection: a comparison of mono and multilingual transformer model with cross-language evaluation, in: Proceedings of the 36th Pacific Asia Conference on Language, Information and Computation, Association for Computational Linguistics, Manila, Philippines, 2022, pp. 853–865. URL: https://aclanthology.org/2022.paclic-1.94.</p>
      <p>[29] K. Ghosh, A. Senapati, U. Garain, Baseline bert models for conversational hate speech detection in code-mixed tweets utilizing data augmentation and offensive language identification in marathi, in: Forum for Information Retrieval Evaluation (Working Notes) (FIRE). CEUR-WS.org, 2022.</p>
      <p>[30] S. Saha, M. Sullivan, R. Srihari, Hate Speech Detection in Low Resource Indo-Aryan Languages, in: Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation, CEUR, 2023.</p>
      <p>[31] M. S. Jahan, F. Hassan, W. Aransa, A. Bouchekif, Multilingual Hate Speech Detection Using Ensemble of Transformer Models, in: Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation, CEUR, 2023.</p>
      <p>[32] K. M. Jhuma, M. Oussalah, A. Singhal, Cross-Linguistic Offensive Language Detection: BERT-Based Analysis of Bengali, Assamese, &amp; Bodo Conversational Hateful Content from Social Media, in: Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation, CEUR, 2023.</p>
      <p>[33] Y. Bestgen, Using Only Character Ngrams for Hate Speech and Offensive Content Identification in Five Low-Ressource Languages, in: Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation, CEUR, 2023.</p>
      <p>[34] C. Muhammad Awais, J. Raj, Breaking Barriers: Multilingual Toxicity Analysis for Hate Speech and Offensive Language in Low-Resource Indo-Aryan Languages, in: Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation, CEUR, 2023.</p>
      <p>[35] A. Joshi, R. Joshi, Harnessing Pre-Trained Sentence Transformers for Offensive Language Detection in Indian Languages, in: Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation, CEUR, 2023.</p>
      <p>[36] N. Narayan, M. Biswal, P. Goyal, A. Panigrahi, Hate Speech and Offensive Content Detection in Indo-Aryan Languages: A Battle of LSTM and Transformers, in: Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation, CEUR, 2023.</p>
      <p>[37] A. Stekel, A. Prives, Y. HaCohen-Kerner, Detecting Offensive Language in Bengali, Bodo, and Assamese using Word Unigrams, Char N-grams, Classical Machine Learning, and Deep Learning Methods, in: Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation, CEUR, 2023.</p>
      <p>[38] A. Reddy Gutha, N. Sai Adarsh, A. Alekar, D. Reddy, Multilingual Hate Speech and Offensive Language Detection of Low Resource Languages, in: Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation, CEUR, 2023.</p>
      <p>[39] P. M, R. K, A. Hegde, K. G, S. Coelho, H. L. Shashirekha, Taming Toxicity: Learning Models for Hate Speech and Offensive Language Detection in Social Media Text, in: Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation, CEUR, 2023.</p>
      <p>[40] G. Kalita, E. Halder, C. Taparia, A. Vetagiri, D. P. Pakray, Examining Hate Speech Detection Across Multiple Indo-Aryan Languages in Tasks 1 &amp; 4, in: Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation, CEUR, 2023.</p>
      <p>[41] C. Senapati, U. Roy, Bengali Hate Speech Detection Using Deep Learning Technique, in: Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation, CEUR, 2023.</p>
      <p>[42] R.-E. Fan, K.-W. Chang, C.-J. Hsieh, X.-R. Wang, C.-J. Lin, Liblinear: A library for large linear classification, J. Mach. Learn. Res. 9 (2008) 1871–1874.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M. L.</given-names>
            <surname>Williams</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Burnap</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Javed</surname>
          </string-name>
          , H. Liu,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ozalp</surname>
          </string-name>
          ,
          <article-title>Hate in the Machine: Anti-Black and Anti-Muslim Social Media Posts as Predictors of Offline Racially and Religiously Aggravated Crime</article-title>
          ,
          <source>The British Journal of Criminology</source>
          <volume>60</volume>
          (
          <year>2019</year>
          )
          <fpage>93</fpage>
          -
          <lpage>117</lpage>
          . URL: https://doi.org/10.1093/bjc/azz049. doi:10.1093/bjc/azz049. arXiv:https://academic.oup.com/bjc/articlepdf/60/1/93/31634412/azz049.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Laub</surname>
          </string-name>
          ,
          <article-title>Hate speech on social media: Global comparisons</article-title>
          ,
          <source>Council on Foreign Relations</source>
          <volume>7</volume>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Nicholas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Ezeibe</surname>
          </string-name>
          ,
          <article-title>The state, hate speech regulation and sustainable democracy in Africa: a study of Nigeria and Kenya</article-title>
          ,
          <source>African Identities</source>
          (
          <year>2020</year>
          ). doi:10.1080/14725843.2020.1813548.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>T.</given-names>
            <surname>Quintel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Ullrich</surname>
          </string-name>
          ,
          <article-title>Self-Regulation of Fundamental Rights? The EU Code of Conduct on Hate Speech, Related Initiatives and Beyond</article-title>
          , Edward Elgar Publishing,
          <year>2019</year>
          . Available at SSRN: https://ssrn.com/abstract=3298719.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>S.</given-names>
            <surname>Jaki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>De Smedt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gwóźdź</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Panchal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rossa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>De Pauw</surname>
          </string-name>
          ,
          <article-title>Online hatred of women in the incels.me forum: Linguistic analysis and automatic detection</article-title>
          ,
          <source>Journal of Language Aggression and Conflict</source>
          <volume>7</volume>
          (
          <year>2019</year>
          )
          <fpage>240</fpage>
          -
          <lpage>268</lpage>
          . URL: https://www.jbe-platform.com/content/journals/10.1075/jlac.00026.jak. doi:10.1075/jlac.00026.jak.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>G. L.</given-names>
            <surname>Casey</surname>
          </string-name>
          ,
          <article-title>Ending the incel rebellion: The tragic impacts of an online hate group</article-title>
          ,
          <source>Loyola Journal of Public Interest Law</source>
          <volume>21</volume>
          (
          <year>2019</year>
          )
          <fpage>71</fpage>
          . URL: https://heinonline.org/HOL/P?h=hein.journals/loyjpubil21&amp;i=79.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>F.</given-names>
            <surname>Poletto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Basile</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sanguinetti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bosco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Patti</surname>
          </string-name>
          ,
          <article-title>Resources and benchmark corpora for hate speech detection: a systematic review</article-title>
          ,
          <source>Language Resources and Evaluation</source>
          <volume>55</volume>
          (
          <year>2021</year>
          )
          <fpage>1</fpage>
          -
          <lpage>47</lpage>
          . doi:10.1007/s10579-020-09502-8.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M.</given-names>
            <surname>Zampieri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Malmasi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Rosenthal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Farra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <article-title>Predicting the type and target of offensive posts in social media</article-title>
          ,
          <source>in: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)</source>
          , Association for Computational Linguistics, Minneapolis, Minnesota,
          <year>2019</year>
          , pp.
          <fpage>1415</fpage>
          -
          <lpage>1420</lpage>
          . URL: https://aclanthology.org/N19-1144. doi:10.18653/v1/N19-1144.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>T.</given-names>
            <surname>Davidson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Warmsley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Macy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Weber</surname>
          </string-name>
          ,
          <article-title>Automated hate speech detection and the problem of offensive language</article-title>
          ,
          <source>Proceedings of the International AAAI Conference on Web and Social Media</source>
          <volume>11</volume>
          (
          <year>2017</year>
          )
          <fpage>512</fpage>
          -
          <lpage>515</lpage>
          . URL: https://ojs.aaai.org/index.php/ICWSM/article/view/14955. doi:10.1609/icwsm.v11i1.14955.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>I.</given-names>
            <surname>Kwok</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>Locate the hate: Detecting tweets against blacks</article-title>
          ,
          <source>Proceedings of the AAAI Conference on Artificial Intelligence</source>
          <volume>27</volume>
          (
          <year>2013</year>
          )
          <fpage>1621</fpage>
          -
          <lpage>1622</lpage>
          . URL: https://ojs.aaai.org/index.php/AAAI/article/view/8539. doi:10.1609/aaai.v27i1.8539.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Kaggle</surname>
          </string-name>
          ,
          <article-title>Toxic comment classification challenge: Identify and classify toxic online comments</article-title>
          (
          <year>2017</year>
          ). URL: https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Pitenis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zampieri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Ranasinghe</surname>
          </string-name>
          ,
          <article-title>Offensive language identification in Greek</article-title>
          ,
          <source>CoRR</source>
          abs/2003.07459 (
          <year>2020</year>
          ). URL: https://arxiv.org/abs/2003.07459. arXiv:2003.07459.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>P.</given-names>
            <surname>Fortuna</surname>
          </string-name>
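          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Rocha da Silva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Soler-Company</surname>
          </string-name>
          ,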
          <string-name>
            <given-names>L.</given-names>
            <surname>Wanner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Nunes</surname>
          </string-name>
          ,
          <article-title>A hierarchically-labeled Portuguese hate speech dataset</article-title>
          ,
          <source>in: Proceedings of the Third Workshop on Abusive Language Online</source>
          , Association for Computational Linguistics, Florence, Italy,
          <year>2019</year>
          , pp.
          <fpage>94</fpage>
          -
          <lpage>104</lpage>
          . URL: https://aclanthology.org/W19-3510. doi:10.18653/v1/W19-3510.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>G. I.</given-names>
            <surname>Sigurbergsson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Derczynski</surname>
          </string-name>
          ,
          <article-title>Offensive language and hate speech detection for Danish</article-title>
          ,
          <source>CoRR</source>
          abs/1908.04531 (
          <year>2019</year>
          ). URL: http://arxiv.org/abs/1908.04531. arXiv:1908.04531.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>M.</given-names>
            <surname>Aragon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Carmona</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Montes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. J.</given-names>
            <surname>Escalante</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Villaseñor-Pineda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Moctezuma</surname>
          </string-name>
          ,
          <article-title>Overview of MEX-A3T at IberLEF 2019: Authorship and aggressiveness analysis in Mexican Spanish tweets</article-title>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Ç.</given-names>
            <surname>Çöltekin</surname>
          </string-name>
          ,
          <article-title>A corpus of Turkish offensive language on social media</article-title>
          ,
          <source>in: Proceedings of the Twelfth Language Resources and Evaluation Conference</source>
          , European Language Resources Association, Marseille, France,
          <year>2020</year>
          , pp.
          <fpage>6174</fpage>
          -
          <lpage>6184</lpage>
          . URL: https://aclanthology.org/2020.lrec-1.758.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Majumder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Patel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dave</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Mandlia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Patel</surname>
          </string-name>
          ,
          <article-title>Overview of the HASOC track at FIRE 2019: Hate speech and offensive content identification in Indo-European languages</article-title>
          ,
          <source>in: Proceedings of the 11th Forum for Information Retrieval Evaluation</source>
          , FIRE '19, Association for Computing Machinery, New York, NY, USA,
          <year>2019</year>
          , pp.
          <fpage>14</fpage>
          -
          <lpage>17</lpage>
          . URL: https://doi.org/10.1145/3368567.3368584. doi:10.1145/3368567.3368584.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. Kumar</given-names>
            <surname>M</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. R.</given-names>
            <surname>Chakravarthi</surname>
          </string-name>
          ,
          <article-title>Overview of the HASOC track at FIRE 2020: Hate speech and offensive language identification in Tamil, Malayalam, Hindi, English and German</article-title>
          ,
          <source>in: Forum for Information Retrieval Evaluation</source>
          , FIRE 2020, Association for Computing Machinery, New York, NY, USA,
          <year>2020</year>
          , pp.
          <fpage>29</fpage>
          -
          <lpage>32</lpage>
          . URL: https://doi.org/10.1145/3441501.3441517. doi:10.1145/3441501.3441517.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. K.</given-names>
            <surname>Shahi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Madhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Satapara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Ranasinghe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zampieri</surname>
          </string-name>
          ,
          <article-title>Overview of the HASOC subtrack at FIRE 2021: Hate speech and offensive content identification in English and Indo-Aryan languages and conversational hate speech</article-title>
          ,
          <source>in: Forum for Information Retrieval Evaluation</source>
          , FIRE 2021, Association for Computing Machinery, New York, NY, USA,
          <year>2021</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>3</lpage>
          . URL: https://doi.org/10.1145/3503162.3503176. doi:10.1145/3503162.3503176.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>N.</given-names>
            <surname>Romim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ahmed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. S.</given-names>
            <surname>Islam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. Sen</given-names>
            <surname>Sharma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Talukder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. R.</given-names>
            <surname>Amin</surname>
          </string-name>
          ,
          <article-title>BD-SHS: A benchmark dataset for learning to detect online Bangla hate speech in different social contexts</article-title>
          ,
          <source>in: Proceedings of the Thirteenth Language Resources and Evaluation Conference</source>
          , European Language Resources Association, Marseille, France,
          <year>2022</year>
          , pp.
          <fpage>5153</fpage>
          -
          <lpage>5162</lpage>
          . URL: https://aclanthology.org/2022.lrec-1.552.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>V.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Roychowdhury</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Das</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Banerjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Saha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Mathew</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. P.</given-names>
            <surname>Vanchinathan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mukherjee</surname>
          </string-name>
          ,
          <article-title>Multilingual abusive comment detection at scale for Indic languages</article-title>
          , in:
          <string-name>
            <given-names>S.</given-names>
            <surname>Koyejo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Mohamed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Agarwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Belgrave</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Cho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Oh</surname>
          </string-name>
          (Eds.),
          <source>Advances in Neural Information Processing Systems</source>
          , volume
          <volume>35</volume>
          , Curran Associates, Inc.,
          <year>2022</year>
          , pp.
          <fpage>26176</fpage>
          -
          <lpage>26191</lpage>
          . URL: https://proceedings.neurips.cc/paper_files/paper/2022/file/a7c4163b33286261b24c72fd3d1707c9-Paper-Datasets_and_Benchmarks.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>S.</given-names>
            <surname>Satapara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Madhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Ranasinghe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. E.</given-names>
            <surname>Dmonte</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zampieri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Pandya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Shah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sandip</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Majumder</surname>
          </string-name>
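          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <article-title>Overview of the HASOC subtrack at FIRE 2023: Hatespeech identification in Sinhala and Gujarati</article-title>
          , in:
          <string-name>
            <given-names>K.</given-names>
            <surname>Ghosh</surname>
          </string-name>
          ,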
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Majumder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mitra</surname>
          </string-name>
          (Eds.),
          <source>Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation</source>
          , Goa,
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>