<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Forum for Information Retrieval Evaluation, December</journal-title>
      </journal-title-group>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Offensive Content Detection in Indo-Aryan Languages: A Battle of LSTM and Transformers</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Nikhil Narayan</string-name>
          <email>nikhilnarayan73@gmail.com</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mrutyunjay Biswal</string-name>
          <email>mrutyunjay.biswal.hmu@gmail.com</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pramod Goyal</string-name>
          <email>goyalpramod1729@gmail.com</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Abhranta Panigrahi</string-name>
          <email>abhranta.panigrahi@gmail.com</email>
        </contrib>
        <aff>Z-AGI Labs, India</aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>1</volume>
      <fpage>5</fpage>
      <lpage>18</lpage>
      <abstract>
        <p>Social media platforms serve as accessible outlets for individuals to express their thoughts and experiences, resulting in an influx of user-generated data spanning all age groups. While these platforms enable free expression, they also present significant challenges, including the proliferation of hate speech and offensive content. Such objectionable language disrupts objective discourse and can lead to radicalization of debates, ultimately threatening democratic values. Consequently, social media platforms have taken steps to monitor and curb abusive behavior, necessitating automated methods for identifying suspicious posts. This paper contributes to the Hate Speech and Offensive Content Identification in English and Indo-Aryan Languages (HASOC) 2023 shared tasks track for Hate Speech Detection in Low-Resource Languages. We, team Z-AGI Labs, conduct a comprehensive comparative analysis of hate speech classification across five distinct languages (Bengali, Assamese, Bodo, Sinhala, and Gujarati) within the context of the HASOC competition. Our study encompasses a wide range of pre-trained models, including BERT variants, XLM-R, and LSTM models, to assess their performance in identifying hate speech across these languages. Results reveal notable variations in model performance. Bert Base Multilingual Cased emerges as a strong performer across languages, achieving an F1 score of 0.67027 for Bengali and 0.70525 for Assamese, and it significantly outperforms other models on Bodo with an F1 score of 0.83009. In Sinhala, XLM-R stands out with an F1 score of 0.83493, whereas for Gujarati, a custom LSTM-based model performed best with an F1 score of 0.76601. This study offers valuable insights into the suitability of various pre-trained models for hate speech detection in multilingual settings. By considering the nuances of each language, our research contributes to informed model selection for building robust hate speech detection systems.</p>
      </abstract>
      <kwd-group>
        <kwd>Multilingual Models</kwd>
        <kwd>Low-Resource Languages</kwd>
        <kwd>Hate Speech</kwd>
        <kwd>Indic Languages</kwd>
        <kwd>HASOC-FIRE</kwd>
        <kwd>CEUR-WS</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>CEUR
ceur-ws.org</p>
    </sec>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>
        In the era of expanding global connectivity through social media, platforms such as Facebook,
X (formerly Twitter), YouTube, and Instagram have grappled with a disturbing surge in hate
speech perpetrated by individuals and organized groups. With the rise of influencers and
content creators, commonly known as the creator economy[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], there has been an alarming
rise in incidents concerning targeted attacks on individuals based on their opinions, appearance,
and ethnicity[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. The consequences of such pervasive abusive language are far-reaching, often
resulting in public humiliation and significant personal and professional consequences[
        <xref ref-type="bibr" rid="ref3 ref4 ref5">3, 4, 5</xref>
        ]
for victims. The escalating prevalence of online hate speech, often characterized by anonymity,
scale, and overwhelming volumes that challenge human moderators, underscores the pressing
need for social media platforms to strike a balance between safeguarding freedom of expression
and fostering an environment of inclusiveness and respect.
      </p>
      <p>
        The need for content moderation[
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] by detecting hate and offensive engagement has pushed
organizations and research groups to develop systems and solutions at scale. Significant work
has been done to identify toxic, profane, and offensive comments. However, the majority of
contributions focus predominantly on resource-rich languages such as English[
        <xref ref-type="bibr" rid="ref10 ref11 ref12 ref13 ref7 ref8 ref9">7, 8, 9, 10, 11,
12, 13</xref>
        ]. This constrains hate and offensive speech detection and moderation
in low-resource languages. The lack of large-scale corpora and pre-trained models makes it
extremely difficult to tackle Natural Language Understanding (NLU) downstream tasks.
      </p>
      <p>
        In response to these challenges, Hate Speech and Offensive Content Identification in English
and Indo-Aryan Languages (HASOC) presented four shared tasks as part of its 5th edition,
2023. Out of these, Task 1 and Task 4[
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] focus on detecting hate and offensive content in five
low-resource languages: Bodo[
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], Bengali[
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], and Assamese[
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] in Task 4[
        <xref ref-type="bibr" rid="ref18">18</xref>
        ],
Gujarati and Sinhala in Task 1[
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. Each language has its corresponding dataset, evaluation
metric, competition page, and leaderboard. The challenges share one problem statement:
classify a given piece of content into one of the following classes:
        <list list-type="bullet">
          <list-item><p>HOF (Hate and/or Offensive): the content is hate speech, offensive, and/or profane.</p></list-item>
          <list-item><p>NOT (Non-Hate and/or Offensive): the content is not hate speech, offensive, or profane.</p></list-item>
        </list>
      </p>
      <p>In this paper, we describe our approach to tackling these challenges. Our contributions are
summarized as follows:
        <list list-type="bullet">
          <list-item><p>Adequate preprocessing techniques for the datasets.</p></list-item>
          <list-item><p>A strong baseline model: an LSTM with an attention head.</p></list-item>
          <list-item><p>Comparative analysis of pre-trained models in zero-shot and few-shot settings.</p></list-item>
          <list-item><p>Fine-tuning of large multilingual models on the given datasets.</p></list-item>
        </list>
      </p>
      <p>The remainder of this report is organized as follows. In section 2, we highlight previous
approaches as related work. In section 3, we give an overview of the dataset for each language
and describe the challenge at hand. In section 4, we present our approach in detail, covering
the experimental set-up, cross-validation strategy, models used, and the intuition
behind them. In section 5, we summarize the results of the experiments. We conclude
in section 6 with the final takeaways, our standings, and the scope of future work. The
implementation details can be found in our GitHub repository
(https://github.com/The-Originalz/fire-hasoc-2023).</p>
    </sec>
    <sec id="sec-3">
      <title>2. Related Work</title>
      <p>
        Detecting hate speech poses a formidable challenge in the realm of research, with existing
literature encompassing diverse methodologies, such as dictionary-based approaches[
        <xref ref-type="bibr" rid="ref20">20</xref>
        ], the
utilization of distributional semantics[
        <xref ref-type="bibr" rid="ref21">21</xref>
        ], and the recent exploration of the efficacy of neural
network architectures[
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]. However, it is notable that a substantial portion of this research
predominantly focuses on hate speech detection within the English language. Conversely,
there is limited scholarly attention directed towards other foreign languages[
        <xref ref-type="bibr" rid="ref23 ref24 ref25 ref26">23, 24, 25, 26</xref>
        ]
and the intricacies of code-switched text[
        <xref ref-type="bibr" rid="ref27 ref28">27, 28</xref>
        ]. Despite the significant impact of regional
low-resource languages on online hate speech, this domain remains relatively uncharted, with
recent investigations exploring the utility of transformers[
        <xref ref-type="bibr" rid="ref29">29</xref>
        ] and author profiling through the
application of graph neural networks[
        <xref ref-type="bibr" rid="ref30">30</xref>
        ].
      </p>
      <p>
        Historically, numerous strategies have been employed to address the challenge of identifying
hate speech. Kwok and Wang[
        <xref ref-type="bibr" rid="ref31">31</xref>
        ] experimented with a straightforward bag of words (BOW)
methodology to detect hate speech, but these lightweight models yielded subpar results
characterized by elevated false positive rates. Enhancing these models with various fundamental
natural language processing (NLP) components, such as part-of-speech tags[
        <xref ref-type="bibr" rid="ref32">32</xref>
        ] and N-gram
graphs, contributed to improved performance. Lexical techniques employing TF-IDF in
conjunction with Support Vector Machines (SVM) as a classification model surprisingly achieved
commendable results[
        <xref ref-type="bibr" rid="ref33">33</xref>
        ].
      </p>
      <p>
        The advent of embedding words into distributed representations marked a pivotal shift, as
researchers harnessed word embeddings like Glove[
        <xref ref-type="bibr" rid="ref34">34</xref>
        ] and FastText[
        <xref ref-type="bibr" rid="ref35">35</xref>
        ] to project discrete text
into a latent space, surpassing the performance of conventional BOW and lexical approaches.
      </p>
      <p>
        Recurrent Neural Networks (RNNs) remained the go-to method for tackling various natural
language challenges over an extended period. For instance, the winning approach in the 2020
HASOC competition for Hindi[
        <xref ref-type="bibr" rid="ref36">36</xref>
        ] employed a one-layer Bidirectional Long Short-Term Memory
(BiLSTM) model with FastText embeddings to discern hate speech. Likewise, the most accurate
model for English[
        <xref ref-type="bibr" rid="ref37">37</xref>
        ] adopted an LSTM architecture with Glove embeddings to represent
textual inputs. Mohtaj et al.[
        <xref ref-type="bibr" rid="ref38">38</xref>
        ] also embraced a character-based LSTM, aligning with this
prevailing trend.
      </p>
      <p>
        In recent times, self-attention-based transformer models[
        <xref ref-type="bibr" rid="ref29">29</xref>
        ], and their derivatives such as
BERT[
        <xref ref-type="bibr" rid="ref39">39</xref>
        ], derived from extensive corpus-trained encoders, have exhibited greater potential than
traditional RNNs across a multitude of NLP tasks. BERT-like models have garnered substantial
attention due to their remarkable transfer learning capabilities, outperforming alternative
approaches consistently[
        <xref ref-type="bibr" rid="ref40">40</xref>
        ].
      </p>
      <p>
        Despite the substantial body of research on hate speech detection, experiments dedicated
to low-resource languages remain relatively scarce. Notably, simple logistic regression using
LASER embeddings demonstrated superior performance to BERT-based models[
        <xref ref-type="bibr" rid="ref41">41</xref>
        ],
underscoring the necessity for more precise multilingual base language models. Consequently, we have
witnessed the ascendancy of multilingual language models like XLM-Roberta[
        <xref ref-type="bibr" rid="ref42">42</xref>
        ]. Following
this trend, region-specific low-resource language models have been developed. Some of the notable
contributions are MuRIL[
        <xref ref-type="bibr" rid="ref43">43</xref>
        ], SinBERT[
        <xref ref-type="bibr" rid="ref44">44</xref>
        ] for Sinhala, BanglaBERT[
        <xref ref-type="bibr" rid="ref45">45</xref>
        ], Indic-BERT[
        <xref ref-type="bibr" rid="ref46">46</xref>
        ], and
XLM-Indic[
        <xref ref-type="bibr" rid="ref47">47</xref>
        ] variants. Authors in [
        <xref ref-type="bibr" rid="ref48">48</xref>
        ] provide a detailed study of the performance of
monolingual and multilingual models in the context of cross-lingual evaluation for hate speech identification.
Previous editions of HASOC[
        <xref ref-type="bibr" rid="ref49 ref50">49, 50</xref>
        ] have witnessed a significant effort towards improving
performance in low-resource languages such as Hindi[
        <xref ref-type="bibr" rid="ref51">51</xref>
        ], Marathi[
        <xref ref-type="bibr" rid="ref52">52</xref>
        ], etc. In the ensuing
sections, we will elucidate our approach, which leverages several multilingual models for hate
speech identification, accompanied by an exhaustive comparative analysis against alternative
      </p>
    </sec>
    <sec id="sec-4">
      <title>3. Dataset Description</title>
      <sec id="sec-4-1">
        <title>3.1. Sinhala</title>
        <p>
          Task 1 consists of two sub-tasks, one for Sinhala and the other for Gujarati. Sinhala, one of the
two official languages of Sri Lanka, is spoken by over 17 million people. This edition of HASOC
brings the first-ever shared task for this Indo-Aryan low-resource language.
The train and test sets for Sinhala are based on the SOLD: Sinhala Offensive Language Detection
dataset[
          <xref ref-type="bibr" rid="ref53">53</xref>
          ]. SOLD consists of 10,000 manually annotated tweets divided into two classes,
Offensive and Not offensive, annotated at both the token level and the sentence level. However, the dataset
provided for the task contains 7,500 samples in the train set, each labeled as HOF or NOT. The
test set contains 2,500 samples.
        </p>
      </sec>
      <sec id="sec-4-2">
        <title>3.2. Gujarati</title>
        <p>Gujarati is one of the 22 official languages of India, with over 50 million native speakers. The train
set for this task contains 200 tweets, whereas the test set contains 1196 tweets. This is also a
coarse-grained binary classification problem, but in a few-shot setting. The train data frame
is made up of 5 columns, named as follows: tweet_id, created_at, text, user_screen_time, and
label. The test set contains only the tweet_id and text columns.</p>
      </sec>
      <sec id="sec-4-3">
        <title>3.3. Assamese, Bodo, and Bengali</title>
        <p>Task 4 consists of three Kaggle competitions, each corresponding to one of the following languages:
Assamese, Bodo, and Bengali. The primary sources for data collection are X (formerly Twitter),
Facebook, and YouTube comments. Each train set contains tweet-label pairs, whereas each test set
contains only the tweets, for which the labels are to be predicted.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4. Experimental Set-up</title>
      <p>In this section, we discuss our approach and explain the experimental set-up in detail. We start
by creating a validation strategy for each dataset. As all the datasets are fairly balanced (refer
to figure 3.3), we opt for K-Fold cross-validation with 5 folds. While creating the splits, we
set the random seed to 2023.</p>
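      <p>The validation strategy described above can be sketched as follows. This is a minimal, standard-library illustration and not the authors' implementation: the helper name kfold_indices is hypothetical, the shuffling procedure is an assumption, and the sample count of 200 is borrowed from the Gujarati train set purely as an example.</p>
      <preformat>
```python
import random


def kfold_indices(n_samples, n_splits=5, seed=2023):
    """Yield (train_idx, val_idx) pairs for shuffled K-fold cross-validation.

    Hypothetical helper: the paper only states 5 folds and a seed of 2023.
    """
    indices = list(range(n_samples))
    # A fixed seed keeps the fold assignment reproducible across runs.
    random.Random(seed).shuffle(indices)
    # Spread any remainder over the first folds so sizes differ by at most one.
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        size = base + (1 if extra > fold else 0)
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size


# Example: 200 samples split into 5 folds of 40 validation samples each.
folds = list(kfold_indices(n_samples=200))
```
      </preformat>
      <p>Each sample appears in exactly one validation fold, so every model in the later fold-wise ensemble is validated on data it was not trained on.</p>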
      <sec id="sec-5-1">
        <title>4.1. Preprocessing</title>
        <p>
          The preprocessing step involves cleaning the contents for feature extraction. We notice that
the Bodo tweets do not contain any emojis or repetitive punctuation marks that need attention.
However, some of the content contains English words in a
code-mixed manner; such instances are seen across all the datasets. For the other two
datasets in Task 4, though the majority of the content is clean, some samples contain emojis
and a few repetitive punctuation characters. The repetition is removed, and the emojis are
converted into their respective textual descriptions using the emot<sup>2</sup> library. The Task 1 datasets
for Sinhala and Gujarati contain usernames as @USER, and the usernames are made available
in a separate column. In addition, these datasets contain code-mixed English words,
repetitive punctuation characters, emojis, and hashtags. Note that there are no URLs or
hyperlinks associated with any of the content. The usernames are removed along with repetitive
punctuation characters. The hashtags are further processed with the Ekphrasis<sup>3</sup> tokenizer to
segment them into meaningful tokens. The emoji2vec[
          <xref ref-type="bibr" rid="ref54">54</xref>
          ] embeddings are used to represent
emojis, as they have shown competitive results in previous works.
        </p>
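        <p>Two of the cleaning steps described above, removing @USER mentions and collapsing repeated punctuation, can be sketched with regular expressions. This is an illustrative sketch only: the function name clean_text and the exact punctuation set are assumptions, and hashtag segmentation and emoji handling (done with Ekphrasis, emot, and emoji2vec in the paper) are deliberately left out.</p>
        <preformat>
```python
import re

# Assumed patterns; the paper does not list the exact rules it applied.
USER_TOKEN = re.compile(r"@USER\b")             # anonymized user mentions
REPEATED_PUNCT = re.compile(r"([!?.,;:])\1+")   # runs of the same punctuation mark
EXTRA_SPACE = re.compile(r"\s+")


def clean_text(text):
    """Remove @USER mentions and collapse repeated punctuation and whitespace."""
    text = USER_TOKEN.sub(" ", text)
    text = REPEATED_PUNCT.sub(r"\1", text)  # keep a single copy of the mark
    return EXTRA_SPACE.sub(" ", text).strip()


cleaned = clean_text("@USER this is bad!!! really??  ok...")
# cleaned == "this is bad! really? ok."
```
        </preformat>
        <p>Because the patterns only touch ASCII punctuation and the @USER placeholder, Sinhala or Gujarati script and code-mixed English words pass through unchanged.</p>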
      </sec>
      <sec id="sec-5-2">
        <title>4.2. Modeling</title>
        <p>To start with, we create an LSTM-attention-based baseline with two bidirectional LSTM layers
followed by an attention block. The attention head is then connected to two dense layers,
with a sigmoid activation in the last layer. The model is trained with the Adam optimizer and Binary
Cross Entropy as the loss function. The hyperparameters involved, such as the batch size,
number of epochs, learning rate, vocabulary size, embedding dimension, and maximum length of the
input sequence, are varied and tuned on a case-by-case basis.</p>
        <p>As the available datasets contain few samples per language (refer to figure 3.1), we also
leverage Transformer-based language models for the downstream task at hand. The available
training data are used to fine-tune the encoder layers of the transformer-based models, leaving
the embedding layers frozen. Note that, to incorporate emoji semantics, we concatenate the
embedding layers with the emoji2vec-generated embeddings. We experiment with various
multilingual transformer-based models for fine-tuning, such as Bert-Base-Multilingual (Cased
and Uncased), DistilBert-Base-Multilingual-Cased, XLM-Roberta-Base, Muril-Base, and
XLMIndic-Base (UniScript<sup>4</sup> and Multi-Script<sup>5</sup>). In addition, we present the results from a
couple of language-specific models, namely Bangla-BERT and SinhalaBERTo<sup>6</sup>.</p>
        <p>Note that the Hugging Face implementation of the models<sup>7</sup> is used via
TFAutoModelForSequenceClassification, with corresponding hyperparameters for each. All training and
inference are done using the Kaggle runtime and a MacBook Air M2 with 24GB unified memory.</p>
        <p>For inference, we ensemble the models from each fold with equal weight on the logits
and then apply a threshold of 0.5 to classify each sample as HOF or NOT. The labels are mapped to numbers as
follows: HOF is mapped to 0, and NOT is mapped to 1.</p>
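        <p>The fold-wise ensembling described above can be illustrated with a short sketch. It assumes, beyond what the text states, that each fold model emits one score per sample in the range 0 to 1 (for example, a sigmoid output) and that a score at or above the threshold corresponds to label 1 (NOT). The function name ensemble_predict and the example scores are hypothetical.</p>
        <preformat>
```python
LABELS = {0: "HOF", 1: "NOT"}  # label mapping stated in the text


def ensemble_predict(fold_scores, threshold=0.5):
    """Average per-fold scores with equal weights, then threshold.

    fold_scores: one list per fold, each holding one score per sample.
    Assumption: a mean score at or above the threshold maps to label 1 (NOT).
    """
    n_folds = len(fold_scores)
    n_samples = len(fold_scores[0])
    predictions = []
    for i in range(n_samples):
        mean_score = sum(scores[i] for scores in fold_scores) / n_folds
        predictions.append(1 if mean_score >= threshold else 0)
    return predictions


# Made-up scores from 5 fold models for 3 samples.
fold_scores = [
    [0.9, 0.2, 0.6],
    [0.8, 0.1, 0.4],
    [0.7, 0.3, 0.4],
    [0.9, 0.2, 0.3],
    [0.6, 0.4, 0.7],
]
predicted = ensemble_predict(fold_scores)
predicted_names = [LABELS[p] for p in predicted]
```
        </preformat>
        <p>In this example the mean scores are 0.78, 0.24, and 0.48, so the samples are labeled NOT, HOF, and HOF respectively. Note that the third sample is labeled HOF even though two of the five folds score it above 0.5, because the average, not a majority vote, decides.</p>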
        <p><sup>2</sup> https://github.com/NeelShah18/emot
<sup>3</sup> https://github.com/cbaziotis/ekphrasis
<sup>4</sup> https://huggingface.co/ibraheemmoosa/xlmindic-base-uniscript
<sup>5</sup> https://huggingface.co/ibraheemmoosa/xlmindic-base-multiscript
<sup>6</sup> https://huggingface.co/keshan/SinhalaBERTo
<sup>7</sup> https://huggingface.co/models</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>5. Results</title>
      <p>The competitions for Task 4 are evaluated on the macro-F1 metric, whereas the Task 1 challenges
are evaluated on macro-F1, Precision, and Recall.</p>
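      <p>For reference, macro-averaged scores can be computed as in the sketch below: precision, recall, and F1 are calculated per class and then averaged with equal class weights. This is an illustration, not the organizers' evaluation script. The function name macro_scores and the toy labels are hypothetical, and treating Precision and Recall as macro-averaged is an assumption.</p>
      <preformat>
```python
def macro_scores(y_true, y_pred, labels=(0, 1)):
    """Return (macro precision, macro recall, macro F1) over the given labels."""
    precisions, recalls, f1s = [], [], []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # Guard against empty denominators by scoring the class as 0.0.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy example with HOF mapped to 0 and NOT mapped to 1.
y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]
macro_precision, macro_recall, macro_f1 = macro_scores(y_true, y_pred)
```
      </preformat>
      <p>In this toy example each class has two true positives, one false positive, and one false negative, so all three macro scores equal 2/3. Because both classes count equally, the metric does not reward a model for favoring the more frequent class.</p>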
      <p>It is evident from the results matrix (refer to table 3) that the LSTM baseline is strongly
competitive for all the languages. For Gujarati,
the LSTM-based model even outperforms XLM-Indic-Base-MultiScript and
Bert-Base-Multilingual, achieving the highest F1-score of 0.76601 and a recall of 0.79704, a
marginal gain of 0.001 in F1 and 0.03 in recall over the second-best performing model for
the task. For Sinhala, XLM-Roberta is the best model, beating our LSTM baseline and
Sinhala BERT by a considerable margin. Due to time constraints and run submission limits, we
experimented with only a handful of BERT-based and Roberta-based models for fine-tuning, along
with the LSTM-with-attention baseline model.</p>
      <p>For Task 4, we experiment with a varied set of candidate models, as mentioned in section
4 (refer to table 1). Our LSTM baseline is among the top performers for Bodo and
Assamese, yielding the third-highest F1-score and beating XLM-Roberta-based models. For
Bengali, however, BanglaBert beats the rest of the models with an F1-score of 0.75625.
The best-performing model for Bodo and Assamese is Bert-Base-Multilingual-Cased, with
F1-scores of 0.83009 and 0.70525, respectively. Note that all the scores mentioned above (refer
to table 2) are measured on the hidden test set and are taken directly from the system-run
report provided by the organizers after the leaderboard was finalized. A visual representation of
comparative performance for Task 4 is shown in figure 5.</p>
      <p>These results place us 3rd/20 for Bengali, 5th/16 for Sinhala, 7th/17 for Gujarati,
8th/20 for Assamese, and 12th/19 for Bodo.</p>
    </sec>
    <sec id="sec-7">
      <title>6. Conclusion</title>
      <p>
        This work has been submitted to the CEUR-2023 Workshop Proceedings for the Hate Speech and
Offensive Content Identification in English and Indo-Aryan Languages (HASOC) track. In this
report, we describe our approach to two tasks of the track, each of which classifies a given piece of content
as Hate and Offensive (HOF) or NOT. We experiment with various models, ranging from a simple
LSTM-based architecture to pretrained transformer-based multilingual models. Though the
transformers dominate in the majority of the tasks, our LSTM baseline emerges
as a strong competitor with promising results, even achieving the highest F1-score amongst
our candidate models for Gujarati. We plan to extend our work to other low-resource
Indic languages and to explore zero-shot, few-shot, and cross-lingual transfer
learning scenarios. We also aim to develop a unified language model that incorporates all the
languages, such as NLLB[
        <xref ref-type="bibr" rid="ref55">55</xref>
        ] for similar downstream tasks to strengthen content moderation.
J. Wang, No language left behind: Scaling human-centered machine translation, 2022.
arXiv:2207.04672.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Wikipedia</surname>
            <given-names>contributors</given-names>
          </string-name>
          ,
          <source>Creator economy - Wikipedia</source>
          , the free encyclopedia,
          <year>2023</year>
          . URL: https://en.wikipedia.org/w/index.php?title=Creator_economy&amp;oldid=1159137601.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Taniya</given-names>
            <surname>Roy</surname>
          </string-name>
          ,
          <article-title>Hindutva's circulation of anti-Muslim hate aided by digital platforms, finds report</article-title>
          ,
          <year>2022</year>
          . URL: https://thewire.in/communalism/india-anti-muslim-hate-twitter-facebook-whatsapp-hindutva-modi-bjp.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>K. E.</given-names>
            <surname>Riehm</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. A.</given-names>
            <surname>Feder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. N.</given-names>
            <surname>Tormohlen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. M.</given-names>
            <surname>Crum</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. S.</given-names>
            <surname>Young</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. M.</given-names>
            <surname>Green</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. R.</given-names>
            <surname>Pacek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. N.</given-names>
            <surname>La Flair</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Mojtabai</surname>
          </string-name>
          ,
          <article-title>Associations Between Time Spent Using Social Media and Internalizing and Externalizing Problems Among US Youth</article-title>
          ,
          <source>JAMA Psychiatry</source>
          <volume>76</volume>
          (
          <year>2019</year>
          )
          <fpage>1266</fpage>
          -
          <lpage>1273</lpage>
          . URL: https://doi.org/10.1001/jamapsychiatry.2019.2325. doi:10.1001/jamapsychiatry.2019.2325.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J.</given-names>
            <surname>Naslund</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bondre</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Torous</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Aschbrenner</surname>
          </string-name>
          ,
          <article-title>Social media and mental health: Benefits, risks, and opportunities for research and practice</article-title>
          ,
          <source>Journal of Technology in Behavioral Science</source>
          <volume>5</volume>
          (
          <year>2020</year>
          ). doi:10.1007/s41347-020-00134-x.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>R.</given-names>
            <surname>Bannink</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Broeren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Looij-Jansen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Waart</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Raat</surname>
          </string-name>
          ,
          <article-title>Cyber and traditional bullying victimization as a risk factor for mental health problems and suicidal ideation in adolescents</article-title>
          ,
          <source>PLoS ONE</source>
          <volume>9</volume>
          (
          <year>2014</year>
          ) e94026. doi:10.1371/journal.pone.0094026.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Dylan</given-names>
            <surname>Walsh</surname>
          </string-name>
          ,
          <article-title>As content booms, how can platforms protect kids from hateful speech?</article-title>
          ,
          <year>2022</year>
          . URL: https://mitsloan.mit.edu/ideas-made-to-matter/content-booms-how-can-platforms-protect-kids-hate-speech.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>H. H.</given-names>
            <surname>Saeed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Shahzad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Kamiran</surname>
          </string-name>
          ,
          <article-title>Overlapping toxic sentiment classification using deep neural architectures</article-title>
          ,
          <source>in: 2018 IEEE International Conference on Data Mining Workshops (ICDMW)</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>1361</fpage>
          -
          <lpage>1366</lpage>
          . doi:10.1109/ICDMW.2018.00193.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>A.</given-names>
            <surname>Vaidya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Mai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Ning</surname>
          </string-name>
          ,
          <article-title>Empirical analysis of multi-task learning for reducing identity bias in toxic comment detection</article-title>
          ,
          <source>Proceedings of the International AAAI Conference on Web and Social Media</source>
          <volume>14</volume>
          (
          <year>2020</year>
          )
          <fpage>683</fpage>
          -
          <lpage>693</lpage>
          . URL: https://ojs.aaai.org/index.php/ICWSM/article/view/7334. doi:10.1609/icwsm.v14i1.7334.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Carta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Corriga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Mulas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. R.</given-names>
            <surname>Recupero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Saia</surname>
          </string-name>
          ,
          <article-title>A supervised multi-class multi-label word embeddings approach for toxic comment classification</article-title>
          ,
          <source>in: International Conference on Knowledge Discovery and Information Retrieval</source>
          ,
          <year>2019</year>
          . URL: https://api.semanticscholar.org/CorpusID:204754719.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>T.</given-names>
            <surname>Tran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Yen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Tan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Park</surname>
          </string-name>
          ,
          <article-title>HABERTOR: An efficient and effective deep hatespeech detector</article-title>
          ,
          <year>2020</year>
          . arXiv:2010.08865.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>O.</given-names>
            <surname>Kamal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Vaidhya</surname>
          </string-name>
          ,
          <article-title>Hostility detection in Hindi leveraging pre-trained language models</article-title>
          ,
          <year>2021</year>
          . arXiv:2101.05494.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Hitkul</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Aggarwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Bamdev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Mahata</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. R.</given-names>
            <surname>Shah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Kumaraguru</surname>
          </string-name>
          ,
          <article-title>Trawling for trolling: A dataset</article-title>
          ,
          <year>2020</year>
          . arXiv:2008.00525.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>A.-M.</given-names>
            <surname>Founta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Djouvas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Chatzakou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Leontiadis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Blackburn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Stringhini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Vakali</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sirivianos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Kourtellis</surname>
          </string-name>
          ,
          <article-title>Large scale crowdsourcing and characterization of Twitter abusive behavior</article-title>
          ,
          <year>2018</year>
          . arXiv:1802.00393.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>T.</given-names>
            <surname>Ranasinghe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Ghosh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. S.</given-names>
            <surname>Pal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Senapati</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. E.</given-names>
            <surname>Dmonte</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zampieri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Satapara</surname>
          </string-name>
          ,
          <article-title>Overview of the HASOC subtracks at FIRE 2023: Hate speech and offensive content identification in Assamese, Bengali, Bodo, Gujarati and Sinhala</article-title>
          ,
          <source>in: Proceedings of the 15th Annual Meeting of the Forum for Information Retrieval Evaluation, FIRE 2023, Goa, India. December 15-18, 2023</source>
          , ACM,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <surname>K. G. Aditya Shankar</surname>
            <given-names>Pal</given-names>
          </string-name>
          ,
          <source>Annihilate hates (Bodo)</source>
          ,
          <year>2023</year>
          . URL: https://kaggle.com/competitions/annihilate-hates-bodo.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>K.</given-names>
            <surname>Ghosh</surname>
          </string-name>
          ,
          <source>Annihilate hates (Bengali)</source>
          ,
          <year>2023</year>
          . URL: https://kaggle.com/competitions/annihilate-hates-bengali.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <surname>K. G. Aditya Shankar</surname>
            <given-names>Pal</given-names>
          </string-name>
          ,
          <source>Annihilate hates (Assamese)</source>
          ,
          <year>2023</year>
          . URL: https://kaggle.com/competitions/annihilate-hates-assamese.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>K.</given-names>
            <surname>Ghosh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Senapati</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. S.</given-names>
            <surname>Pal</surname>
          </string-name>
          ,
          <article-title>Annihilate Hates (Task 4, HASOC 2023): Hate Speech Detection in Assamese, Bengali, and Bodo languages</article-title>
          ,
          <source>in: Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation</source>
          , CEUR,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>S.</given-names>
            <surname>Satapara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Madhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Ranasinghe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. E.</given-names>
            <surname>Dmonte</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zampieri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Pandya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Shah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sandip</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Majumder</surname>
          </string-name>
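          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <article-title>Overview of the HASOC subtrack at FIRE 2023: Hatespeech identification in Sinhala and Gujarati</article-title>
          , in:
          <string-name>
            <given-names>K.</given-names>
            <surname>Ghosh</surname>
          </string-name>
          ,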
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Majumder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mitra</surname>
          </string-name>
          (Eds.),
          <source>Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation, Goa, India. December 15-18, 2023</source>
          , CEUR Workshop Proceedings, CEUR-WS.org,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>R.</given-names>
            <surname>Guermazi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hammami</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. B.</given-names>
            <surname>Hamadou</surname>
          </string-name>
          ,
          <article-title>Using a semi-automatic keyword dictionary for improving violent web site filtering</article-title>
          ,
          <source>in: 2007 Third International IEEE Conference on Signal-Image Technologies and Internet-Based System</source>
          ,
          <year>2007</year>
          , pp.
          <fpage>337</fpage>
          -
          <lpage>344</lpage>
          . doi:10.1109/SITIS.2007.137
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>N.</given-names>
            <surname>Djuric</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Morris</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Grbovic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Radosavljevic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Bhamidipati</surname>
          </string-name>
          ,
          <article-title>Hate speech detection with comment embeddings</article-title>
          ,
          <source>in: Proceedings of the 24th International Conference on World Wide Web, WWW '15 Companion</source>
          , Association for Computing Machinery, New York, NY, USA,
          <year>2015</year>
          , pp.
          <fpage>29</fpage>
          -
          <lpage>30</lpage>
          . URL: https://doi.org/10.1145/2740908.2742760. doi:10.1145/2740908.2742760.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>P.</given-names>
            <surname>Badjatiya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Varma</surname>
          </string-name>
          ,
          <article-title>Deep learning for hate speech detection in tweets</article-title>
          ,
          <source>in: Proceedings of the 26th International Conference on World Wide Web Companion, WWW '17 Companion</source>
          , International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, CHE
          ,
          <year>2017</year>
          , pp.
          <fpage>759</fpage>
          -
          <lpage>760</lpage>
          . URL: https://doi.org/10.1145/3041021.3054223. doi:10.1145/3041021.3054223.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Leite</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Silva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Bontcheva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Scarton</surname>
          </string-name>
          ,
          <article-title>Toxic language detection in social media for Brazilian Portuguese: New dataset and multilingual analysis</article-title>
          ,
          <source>in: Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing</source>
          , Association for Computational Linguistics, Suzhou, China,
          <year>2020</year>
          , pp.
          <fpage>914</fpage>
          -
          <lpage>924</lpage>
          . URL: https://aclanthology.org/2020.aacl-main.91.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>A.</given-names>
            <surname>Saroj</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Pal</surname>
          </string-name>
          ,
          <article-title>An Indian language social media collection for hate and offensive speech</article-title>
          ,
          <source>in: Proceedings of the Workshop on Resources and Techniques for User and Author Profiling in Abusive Language</source>
          , European Language Resources Association (ELRA), Marseille, France,
          <year>2020</year>
          , pp.
          <fpage>2</fpage>
          -
          <lpage>8</lpage>
          . URL: https://aclanthology.org/2020.restup-1.2.
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>V.</given-names>
            <surname>Basile</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bosco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Fersini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Nozza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Patti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. M.</given-names>
            <surname>Rangel Pardo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sanguinetti</surname>
          </string-name>
          ,
          <article-title>SemEval-2019 task 5: Multilingual detection of hate speech against immigrants and women in Twitter</article-title>
          ,
          <source>in: Proceedings of the 13th International Workshop on Semantic Evaluation</source>
          , Association for Computational Linguistics, Minneapolis, Minnesota, USA,
          <year>2019</year>
          , pp.
          <fpage>54</fpage>
          -
          <lpage>63</lpage>
          . URL: https://aclanthology.org/S19-2007. doi:10.18653/v1/S19-2007.
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>A.</given-names>
            <surname>Ghosh Chowdhury</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Didolkar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Sawhney</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. R.</given-names>
            <surname>Shah</surname>
          </string-name>
          ,
          <article-title>ARHNet - leveraging community interaction for detection of religious hate speech in Arabic</article-title>
          ,
          <source>in: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop</source>
          , Association for Computational Linguistics, Florence, Italy,
          <year>2019</year>
          , pp.
          <fpage>273</fpage>
          -
          <lpage>280</lpage>
          . URL: https://aclanthology.org/P19-2038. doi:10.18653/v1/P19-2038.
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>S.</given-names>
            <surname>Chopra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Sawhney</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Mathur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. Ratn</given-names>
            <surname>Shah</surname>
          </string-name>
          ,
          <article-title>Hindi-English hate speech detection: Author profiling, debiasing, and practical perspectives</article-title>
          ,
          <source>Proceedings of the AAAI Conference on Artificial Intelligence</source>
          <volume>34</volume>
          (
          <year>2020</year>
          )
          <fpage>386</fpage>
          -
          <lpage>393</lpage>
          . URL: https://ojs.aaai.org/index.php/AAAI/article/view/5374. doi:10.1609/aaai.v34i01.5374
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>R.</given-names>
            <surname>Kapoor</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Rajput</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. R.</given-names>
            <surname>Shah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Kumaraguru</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Zimmermann</surname>
          </string-name>
          ,
          <article-title>Mind your language: Abuse and offense detection for code-switched languages</article-title>
          ,
          <source>Proceedings of the AAAI Conference on Artificial Intelligence</source>
          <volume>33</volume>
          (
          <year>2019</year>
          )
          <fpage>9951</fpage>
          -
          <lpage>9952</lpage>
          . URL: https://ojs.aaai.org/index.php/AAAI/article/view/5112. doi:10.1609/aaai.v33i01.33019951
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>A.</given-names>
            <surname>Vaswani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Shazeer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Parmar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Uszkoreit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Jones</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. N.</given-names>
            <surname>Gomez</surname>
          </string-name>
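          ,
          <string-name>
            <given-names>Ł.</given-names>
            <surname>Kaiser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Polosukhin</surname>
          </string-name>
          ,
          <article-title>Attention is all you need</article-title>
          , in:
          <string-name>
            <given-names>I.</given-names>
            <surname>Guyon</surname>
          </string-name>
          ,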
          <string-name>
            <given-names>U. V.</given-names>
            <surname>Luxburg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bengio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wallach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Fergus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Vishwanathan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Garnett</surname>
          </string-name>
          (Eds.),
          <source>Advances in Neural Information Processing Systems</source>
          , volume
          <volume>30</volume>
          , Curran Associates, Inc.,
          <year>2017</year>
          . URL: https://proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>T.</given-names>
            <surname>Ranasinghe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gupte</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zampieri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Nwogu</surname>
          </string-name>
          ,
          <article-title>WLV-RIT at HASOC-Dravidian-CodeMix-FIRE2020: Offensive language identification in code-switched YouTube comments</article-title>
          ,
          <year>2020</year>
          . arXiv:2011.00559.
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>I.</given-names>
            <surname>Kwok</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>Locate the hate: Detecting tweets against blacks</article-title>
          ,
          <source>in: Proceedings of the Twenty-Seventh AAAI Conference on Artificial Intelligence</source>
          , AAAI'13, AAAI Press,
          <year>2013</year>
          , pp.
          <fpage>1621</fpage>
          -
          <lpage>1622</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <article-title>Detecting offensive language in social media to protect adolescent online safety</article-title>
          ,
          <source>in: 2012 International Conference on Privacy, Security, Risk and Trust and 2012 International Conference on Social Computing</source>
          ,
          <year>2012</year>
          , pp.
          <fpage>71</fpage>
          -
          <lpage>80</lpage>
          . doi:10.1109/SocialCom-PASSAT.2012.55
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>R.</given-names>
            <surname>Rajalakshmi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Reddy</surname>
          </string-name>
          ,
          <article-title>DLRG@HASOC 2020: A hybrid approach for hate and offensive content identification in multilingual tweets</article-title>
          ,
          <source>in: FIRE</source>
          ,
          <year>2020</year>
          . URL: https://api.semanticscholar.org/CorpusID:232314467.
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>J.</given-names>
            <surname>Pennington</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Socher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Manning</surname>
          </string-name>
          ,
          <article-title>GloVe: Global vectors for word representation</article-title>
          ,
          <source>in: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)</source>
          , Association for Computational Linguistics, Doha, Qatar,
          <year>2014</year>
          , pp.
          <fpage>1532</fpage>
          -
          <lpage>1543</lpage>
          . URL: https://aclanthology.org/D14-1162. doi:10.3115/v1/D14-1162.
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>P.</given-names>
            <surname>Bojanowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Grave</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Joulin</surname>
          </string-name>
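          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mikolov</surname>
          </string-name>
          ,
          <article-title>Enriching word vectors with subword information</article-title>
          ,
          <source>Transactions of the Association for Computational Linguistics</source>
          <volume>5</volume>
          (
          <year>2017</year>
          )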
          <fpage>135</fpage>
          -
          <lpage>146</lpage>
          . URL: https://aclanthology.org/Q17-1010. doi:10.1162/tacl_a_00051
          .
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>R.</given-names>
            <surname>Raj</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Srivastava</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Saumya</surname>
          </string-name>
          ,
          <article-title>NSIT &amp; IIITDWD @ HASOC 2020: Deep learning model for hate-speech identification in Indo-European languages</article-title>
          ,
          <source>in: FIRE</source>
          ,
          <year>2020</year>
          . URL: https://api.semanticscholar.org/CorpusID:232313876.
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [37]
          <string-name>
            <given-names>A. K.</given-names>
            <surname>Mishra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Saumya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <article-title>IIIT_DWD@HASOC 2020: Identifying offensive content in Indo-European languages</article-title>
          , in: P. Mehta,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Majumder</surname>
          </string-name>
          , M. Mitra (Eds.),
          <source>Working Notes of FIRE 2020 - Forum for Information Retrieval Evaluation</source>
          , Hyderabad, India, December 16-20, 2020, volume
          <volume>2826</volume>
          of CEUR Workshop Proceedings, CEUR-WS.org,
          <year>2020</year>
          , pp.
          <fpage>139</fpage>
          -
          <lpage>144</lpage>
          . URL: https://ceur-ws.org/Vol-2826/T2-5.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          [38]
          <string-name>
            <given-names>S.</given-names>
            <surname>Mohtaj</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Woloszyn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Möller</surname>
          </string-name>
          ,
          <article-title>TUB at HASOC 2020: Character based LSTM for hate speech detection in Indo-European languages</article-title>
          ,
          <source>in: FIRE</source>
          ,
          <year>2020</year>
          . URL: https://api.semanticscholar.org/CorpusID:232314731.
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          [39]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          ,
          <article-title>BERT: pre-training of deep bidirectional transformers for language understanding</article-title>
          , in:
          <string-name>
            <given-names>J.</given-names>
            <surname>Burstein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Doran</surname>
          </string-name>
          , T. Solorio (Eds.),
          <source>Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019</source>
          , Minneapolis, MN, USA, June 2-7, 2019, Volume
          <volume>1</volume>
          (Long and Short Papers), Association for Computational Linguistics,
          <year>2019</year>
          , pp.
          <fpage>4171</fpage>
          -
          <lpage>4186</lpage>
          . URL: https://doi.org/10.18653/v1/n19-1423. doi:10.18653/v1/n19-1423.
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          [40]
          <string-name>
            <given-names>M.</given-names>
            <surname>Mozafari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Farahbakhsh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Crespi</surname>
          </string-name>
          ,
          <article-title>A BERT-Based Transfer Learning Approach for Hate Speech Detection in Online Social Media</article-title>
          ,
          <year>2019</year>
          , pp.
          <fpage>928</fpage>
          -
          <lpage>940</lpage>
          . doi:10.1007/978-3-030-36687-2_77.
        </mixed-citation>
      </ref>
      <ref id="ref41">
        <mixed-citation>
          [41]
          <string-name>
            <given-names>S. S.</given-names>
            <surname>Aluru</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Mathew</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Saha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mukherjee</surname>
          </string-name>
          ,
          <article-title>Deep learning models for multilingual hate speech detection</article-title>
          ,
          <year>2020</year>
          . arXiv:2004.06465.
        </mixed-citation>
      </ref>
      <ref id="ref42">
        <mixed-citation>
          [42]
          <string-name>
            <given-names>A.</given-names>
            <surname>Conneau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Khandelwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Chaudhary</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Wenzek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Guzmán</surname>
          </string-name>
          , E. Grave,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ott</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zettlemoyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Stoyanov</surname>
          </string-name>
          ,
          <article-title>Unsupervised cross-lingual representation learning at scale</article-title>
          ,
          <source>in: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics</source>
          , Association for Computational Linguistics, Online,
          <year>2020</year>
          , pp.
          <fpage>8440</fpage>
          -
          <lpage>8451</lpage>
          . URL: https://aclanthology.org/2020.acl-main.747. doi:10.18653/v1/2020.acl-main.747.
        </mixed-citation>
      </ref>
      <ref id="ref43">
        <mixed-citation>
          [43]
          <string-name>
            <given-names>S.</given-names>
            <surname>Khanuja</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Bansal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Mehtani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Khosla</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Dey</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Gopalan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. K.</given-names>
            <surname>Margam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Aggarwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. T.</given-names>
            <surname>Nagipogu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Dave</surname>
          </string-name>
          , et al.,
          <article-title>MuRIL: Multilingual representations for Indian languages</article-title>
          ,
          <source>arXiv preprint arXiv:2103.10730</source>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref44">
        <mixed-citation>
          [44]
          <string-name>
            <given-names>V.</given-names>
            <surname>Dhananjaya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Demotte</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ranathunga</surname>
          </string-name>
          , S. Jayasena,
          <article-title>BERTifying Sinhala - a comprehensive analysis of pre-trained language models for Sinhala text classification</article-title>
          ,
          <source>in: Proceedings of the Thirteenth Language Resources and Evaluation Conference</source>
          , European Language Resources Association, Marseille, France,
          <year>2022</year>
          , pp.
          <fpage>7377</fpage>
          -
          <lpage>7385</lpage>
          . URL: https://aclanthology.org/2022.lrec-1.803.
        </mixed-citation>
      </ref>
      <ref id="ref45">
        <mixed-citation>
          [45]
          <string-name>
            <given-names>T.</given-names>
            <surname>Hasan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bhattacharjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Samin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hasan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Basak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. S.</given-names>
            <surname>Rahman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Shahriyar</surname>
          </string-name>
          ,
          <article-title>Not low-resource anymore: Aligner ensembling, batch filtering, and new datasets for Bengali-English machine translation</article-title>
          ,
          <source>in: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)</source>
          , Association for Computational Linguistics, Online,
          <year>2020</year>
          , pp.
          <fpage>2612</fpage>
          -
          <lpage>2623</lpage>
          . URL: https://aclanthology.org/2020.emnlp-main.207. doi:10.18653/v1/2020.emnlp-main.207.
        </mixed-citation>
      </ref>
      <ref id="ref46">
        <mixed-citation>
          [46]
          <string-name>
            <given-names>S.</given-names>
            <surname>Doddapaneni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Aralikatte</surname>
          </string-name>
          , G. Ramesh,
          <string-name>
            <given-names>S.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. M.</given-names>
            <surname>Khapra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kunchukuttan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <article-title>Towards leaving no Indic language behind: Building monolingual corpora, benchmark and models for Indic languages</article-title>
          ,
          <year>2023</year>
          . arXiv:2212.05409.
        </mixed-citation>
      </ref>
      <ref id="ref47">
        <mixed-citation>
          [47]
          <string-name>
            <given-names>I. M.</given-names>
            <surname>Moosa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. E.</given-names>
            <surname>Akhter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. B.</given-names>
            <surname>Habib</surname>
          </string-name>
          ,
          <article-title>Does transliteration help multilingual language modeling?</article-title>
          ,
          <source>in: Findings of the Association for Computational Linguistics: EACL 2023</source>
          , Association for Computational Linguistics, Dubrovnik, Croatia,
          <year>2023</year>
          , pp.
          <fpage>670</fpage>
          -
          <lpage>685</lpage>
          . URL: https://aclanthology.org/2023.findings-eacl.50. doi:10.18653/v1/2023.findings-eacl.50.
        </mixed-citation>
      </ref>
      <ref id="ref48">
        <mixed-citation>
          [48]
          <string-name>
            <given-names>K.</given-names>
            <surname>Ghosh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. A.</given-names>
            <surname>Senapati</surname>
          </string-name>
          ,
          <article-title>Hate speech detection: a comparison of mono and multilingual transformer model with cross-language evaluation</article-title>
          ,
          <source>in: Proceedings of the 36th Pacific Asia Conference on Language, Information and Computation</source>
          , De La Salle University, Manila, Philippines,
          <year>2022</year>
          , pp.
          <fpage>853</fpage>
          -
          <lpage>865</lpage>
          . URL: https://aclanthology.org/2022.paclic-1.94.
        </mixed-citation>
      </ref>
      <ref id="ref49">
        <mixed-citation>
          [49]
          <string-name>
            <given-names>S.</given-names>
            <surname>Satapara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Majumder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Madhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Ranasinghe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zampieri</surname>
          </string-name>
          , K. North,
          <string-name>
            <given-names>D.</given-names>
            <surname>Premasiri</surname>
          </string-name>
          ,
          <article-title>Overview of the HASOC subtrack at FIRE 2022: Hate speech and offensive content identification in English and Indo-Aryan languages</article-title>
          ,
          <source>in: Proceedings of the 14th Annual Meeting of the Forum for Information Retrieval Evaluation</source>
          , FIRE '22, Association for Computing Machinery, New York, NY, USA,
          <year>2023</year>
          , pp.
          <fpage>4</fpage>
          -
          <lpage>7</lpage>
          . URL: https://doi.org/10.1145/3574318.3574326. doi:10.1145/3574318.3574326.
        </mixed-citation>
      </ref>
      <ref id="ref50">
        <mixed-citation>
          [50]
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. K.</given-names>
            <surname>Shahi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Madhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Satapara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Ranasinghe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zampieri</surname>
          </string-name>
          ,
          <article-title>Overview of the HASOC subtrack at FIRE 2021: Hate speech and offensive content identification in English and Indo-Aryan languages and conversational hate speech</article-title>
          ,
          <source>in: Proceedings of the 13th Annual Meeting of the Forum for Information Retrieval Evaluation</source>
          , FIRE '21, Association for Computing Machinery, New York, NY, USA,
          <year>2022</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>3</lpage>
          . URL: https://doi.org/10.1145/3503162.3503176. doi:10.1145/3503162.3503176.
        </mixed-citation>
      </ref>
      <ref id="ref51">
        <mixed-citation>
          [51]
          <string-name>
            <given-names>M.</given-names>
            <surname>Bhatia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. S.</given-names>
            <surname>Bhotia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Agarwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Ramesh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Shridhar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Laumann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Dash</surname>
          </string-name>
          ,
          <article-title>One to rule them all: Towards joint indic language hate speech detection</article-title>
          ,
          <year>2021</year>
          . arXiv:2109.13711
          .
        </mixed-citation>
      </ref>
      <ref id="ref52">
        <mixed-citation>
          [52]
          <string-name>
            <given-names>T.</given-names>
            <surname>Ranasinghe</surname>
          </string-name>
          , K. North,
          <string-name>
            <given-names>D.</given-names>
            <surname>Premasiri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zampieri</surname>
          </string-name>
          ,
          <article-title>Overview of the HASOC subtrack at FIRE 2022: Offensive language identification in Marathi</article-title>
          ,
          <year>2022</year>
          . arXiv:2211.10163
          .
        </mixed-citation>
      </ref>
      <ref id="ref53">
        <mixed-citation>
          [53]
          <string-name>
            <given-names>T.</given-names>
            <surname>Ranasinghe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Anuradha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Premasiri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Silva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Hettiarachchi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Uyangodage</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zampieri</surname>
          </string-name>
          ,
          <article-title>SOLD: Sinhala offensive language dataset</article-title>
          ,
          <year>2022</year>
          . arXiv:2212.00851
          .
        </mixed-citation>
      </ref>
      <ref id="ref54">
        <mixed-citation>
          [54]
          <string-name>
            <given-names>B.</given-names>
            <surname>Eisner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Rocktäschel</surname>
          </string-name>
          , I. Augenstein,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bošnjak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Riedel</surname>
          </string-name>
          ,
          <article-title>emoji2vec: Learning emoji representations from their description</article-title>
          ,
          <source>in: Proceedings of the Fourth International Workshop on Natural Language Processing for Social Media</source>
          , Association for Computational Linguistics, Austin, TX, USA,
          <year>2016</year>
          , pp.
          <fpage>48</fpage>
          -
          <lpage>54</lpage>
          . URL: https://aclanthology.org/W16-6208. doi:10.18653/v1/W16-6208.
        </mixed-citation>
      </ref>
      <ref id="ref55">
        <mixed-citation>
          [55]
          <string-name>
            <given-names>N.</given-names>
            <surname>Team</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. R.</given-names>
            <surname>Costa-jussà</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Cross</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Çelebi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Elbayad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Heafield</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Heffernan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Kalbassi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Licht</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Maillard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Wenzek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Youngblood</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Akula</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Barrault</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. M.</given-names>
            <surname>Gonzalez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Hansanti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hoffman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Jarrett</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. R.</given-names>
            <surname>Sadagopan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Rowe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Spruit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Tran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Andrews</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. F.</given-names>
            <surname>Ayan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bhosale</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Edunov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Fan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Goswami</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Guzmán</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Koehn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mourachko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Ropers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Saleem</surname>
          </string-name>
          , H. Schwenk,
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>