<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Hate Speech Detection in Marathi and Code-Mixed Languages using TF-IDF and Transformers-Based BERT-Variants</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sakshi Kalra</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kushank Maheshwari</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Saransh Goel</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yashvardhan Sharma</string-name>
          <email>yash@pilani.bits-pilani.ac.in</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of CSIS, BITS Pilani</institution>
          ,
          <addr-line>333031, Rajasthan</addr-line>
          ,
          <country country="IN">INDIA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>People now express their ideas on social media on a global scale. The increased sense of freedom provided by anonymity allows online attacks against others without fear of repercussions, which eventually leads to the spread of hate speech. Current attempts to filter online information and stop the propagation of hatred are insufficient. The popularity of regional languages on social media and the lack of hate speech detectors that work across multiple languages both contribute to this. This paper discusses two subtasks of hate and offensive speech detection: identification of conversational hate speech in code-mixed languages such as Hindi, English and German, and offensive language identification in Marathi. Our approach combines TF-IDF word features with machine learning models and transformer-based BERT models for the classification of hate speech in each of the two subtasks. The MuRIL-BERT model produces the best results, with an accuracy of 73.1% and a Macro-F1 score of 0.727 on the code-mixed data, and a Macro-F1 score of 0.8306 on the Marathi data, which is 6% higher than the previous year's result.</p>
      </abstract>
      <kwd-group>
        <kwd>Classification</kwd>
        <kwd>Tokenizer</kwd>
        <kwd>TF-IDF</kwd>
        <kwd>Multilingual BERT</kwd>
        <kwd>Machine Learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>In the past few years, academics have become more interested in the topic of hate speech, as
shown by the rapid growth in the number of Web of Science (WOS)-indexed publications on the subject.
The sheer volume of online content makes it necessary to detect hate
speech automatically, since doing so manually is impossible and it must instead be done computationally.</p>
      <p>
        By organizing shared tasks and seminars, online communities, social media businesses, and
technology firms are making significant investments and promoting research in the field of
hate speech detection. FIRE is one such forum, and it has been actively managing the HASOC
track since 2019 [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. HASOC 2022 is looking for technology that can detect
inflammatory language and hate speech without human intervention. The competition is divided into
two subtracks.
      </p>
      <p>
        For the first task, the dataset contains code-mixed tweets in more than one language (Hinglish
and German), along with comments and replies to those comments. When the text is code-mixed,
it is difficult to tell what is hate speech. Code-mixed text uses the vocabulary and grammar
of more than one language [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. For example, in the dataset used, the Hinglish data has Hindi
written in both Roman and Devanagari script, which makes it harder to find hate speech in this
data. The proposed model uses two methods for text classification: a machine learning
approach using TF-IDF feature extraction, and a deep learning approach using different
BERT variants, which are based on the transformer model; BERT has so far been shown to be
among the best at understanding the right and left context in a text.
      </p>
      <p>
        The second task, Hate Speech and Offensive Content Identification in the Marathi Language,
aims at binary classification: classifying a user's tweet as either offensive and hateful or not
offensive. The overview of the FIRE 2022 subtasks is presented in [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. We approached the
task using the Transformers-based models MuRIL, Distil-BERT and Multilingual-BERT,
which have displayed impressive results on NLP tasks such as text classification. The provided
Marathi dataset is used to fine-tune pre-trained transformer models from the HuggingFace
library1. We demonstrate that transfer learning with pre-trained BERT models is preferable
to conventional machine learning algorithms. The code is available from the github
repository2.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        For the code-mixed languages, various approaches have been used in the past. The authors
of [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] explain how features can be extracted from text data using TF-IDF. They examined the
performance of their TF-IDF implementation using 1400 papers from the United Nations Parallel
Text Corpus for LDCs and returned only the top 100 relevant texts. In a further study [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ],
researchers went into greater detail about TF-IDF feature extraction and compared character
n-grams to word n-grams, concluding that character n-grams were more useful for detecting
hate speech. Another paper [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] describes how the BERT model can be used for text
classification; it covers the architecture of the BERT model, which is trained on a large corpus
of data with tokenized text and an attention mask as input. They achieved a GLUE score of
80.5%, 86.7% MultiNLI accuracy, 93.2% on the SQuAD v1.1 question-answering test, and 83.1%
on the SQuAD v2.0 test. The approach of utilising BERT for classification in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] is further
explained by using the output corresponding to the [CLS] token and adding a Feed Forward
Network above it. Another study [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] used a soft voting technique on three transformer-based
architectures (urduhack, BERT, and XLM-RoBERTa) to achieve an accuracy of 93.6%. The authors
in [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] attempt to identify threatening posts using deep learning models based on
transformers; they employed a pretrained BERT model (RoBERTa) to classify
text as threatening or non-threatening and obtained an F1 score of 53.46% and a ROC AUC of
81.99%.
  Another paper [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] fine-tuned monolingual and multilingual transformers over Urdu text,
and used ensembling techniques to combine the results of RoBERTa-urdu-small, XLM-RoBERTa,
bert-base-multilingual-cased, and Alberta-urdu-large, yielding an accuracy of 0.596 and an
F1 score of 0.449. In another attempt, [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] achieved the highest F1 score of 0.7993 by using
pre-trained BERT models with a fine-tuning classification layer over them. They also used data
augmentation to make the models generalise better and used both machine learning and deep
learning techniques for the task of recognising hate and offensive speech. The effectiveness of
several pre-trained multilingual BERT models in the detection of threats and hate speech, which
are also types of emotions, was discussed in [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] and [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] used a variety of datasets, the
majority of which were based on data from Twitter, including TRAC, Hatebase Twitter, Kaggle,
etc., and suggested an SVM-based model called mSVM, which produced
state-of-the-art results on the TRAC dataset with 80% accuracy and a 53.68% macro F1 score. They also employed
the BERT model, which produced results that were 2 percent better but whose decisions lacked
interpretability.
1https://huggingface.co/
2https://github.com/Kushank24/Marathi
      </p>
      <p>
        For the Marathi language, automated offensive and hate speech detection has been tested
using a variety of machine learning and deep learning techniques [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. The bulk of conventional
machine learning techniques extract features from the text, such as lexical and linguistic
features, n-grams, and bags of words [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. Word embedding techniques have also recently been
presented for these tasks [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. However, these methods fall short of capturing the speech's
whole context. Deep learning methods [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] are currently becoming more and more popular in
a variety of fields, including machine translation, sentiment analysis, text classification, and
language modelling. Recurrent Neural Networks (RNNs) [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ], Convolutional neural networks
(CNNs) [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ], long short-term memories (LSTMs) [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ], and the newest approach, bidirectional
encoder representations (BERT) [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ], are a few of these methods. A combination of Machine
Learning models and transformers based models is presented in [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ].
      </p>
      <p>
        In [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] both ML models as well as Transformer based models have been applied for Urdu
Language. Additionally, BERT models for Hate Speech detection for Urdu Language has also
been applied in FIRE 2021 [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. Another study [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ] was conducted to identify hate speech phrases on
Twitter. To capture semantics, a deep convolutional neural network
model was combined with GloVe embedding vectors. With an F1-score of 0.92, the findings
show that their model performed better than the other models. In [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] techniques such as
TF-IDF weightings as well as word embeddings are used, which are then fed into machine learning
algorithms, namely random forest, logistic regression and a support vector classifier.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Dataset</title>
      <p>Data type: training data. HOF: 2612, NOT: 2609, total entries: 5221.</p>
      <sec id="sec-3-1">
        <title>Task A (Code-Mixed Language):</title>
        <p>
          The dataset used in this task is collected from HASOC (2022)3, one of the subtracks
of the Forum for Information Retrieval Evaluation (FIRE)4 2022. It is a collection of tweets;
each instance of the dataset [
          <xref ref-type="bibr" rid="ref25">25</xref>
          ], [
          <xref ref-type="bibr" rid="ref26">26</xref>
          ] includes a main tweet that is labelled as HOF or NOT.
Additionally, each tweet may receive multiple comments, each of which is also labelled "HOF"
or "NOT". Finally, each comment may receive multiple replies, each of which is also labelled
"HOF" or "NOT".
• Non Hate-Offensive (NOT) - Tweets, comments or replies with this label do not
include hate speech.
• Hate and Offensive (HOF) - Tweets, comments or replies with this label include hate
or offensive speech.
        </p>
        <p>What makes this dataset different is that whether a comment or a reply falls under
the category of hate speech depends on the main tweet (and, for a reply, on the comment as well).
For instance, a comment of "yes" is meaningless by itself, but if it is made in response to
a main tweet that is hate speech, then it is considered hate speech, while a comment of "no"
on the same tweet is not. Therefore, the modification we made to prepare the data for the model
(to capture the context of the tweet, comment, and reply) is that the text of the main tweet
remains the same, the main tweet is prepended to the comment, and the main tweet as well as
the comment are prepended to the reply text (separated by blank spaces). This way, the model
is able to capture the context of the comment and reply.</p>
        <p>• Main tweet: &lt;main tweet&gt;
• Comment: &lt;main tweet&gt; &lt;comment&gt;
• Reply: &lt;main tweet&gt; &lt;comment&gt; &lt;reply&gt;</p>
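        <p>A minimal sketch of this context-building step (illustrative function and variable names, not the authors' code):

```python
# Build model inputs that carry the conversational context, as described
# above: a comment is prefixed with its main tweet, and a reply with both
# the main tweet and its comment, separated by blank spaces.
def build_inputs(tweet, comments):
    """comments: list of (comment_text, [reply_texts]) pairs."""
    inputs = [tweet]  # the main tweet stays as-is
    for comment, replies in comments:
        inputs.append(f"{tweet} {comment}")  # main tweet + comment
        for reply in replies:
            inputs.append(f"{tweet} {comment} {reply}")  # + reply
    return inputs

texts = build_inputs("main tweet", [("yes", ["reply one"])])
print(texts)  # ['main tweet', 'main tweet yes', 'main tweet yes reply one']
```
</p>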
      </sec>
      <sec id="sec-3-2">
        <title>Task B (Marathi Language):</title>
        <p>The datasets for the tasks are provided by the organizers of HASOC'225. Subtask A of the
HASOC challenge for the Marathi language is a binary classification task. We need to categorize
the sentences in the Marathi language dataset into the following classes:
• Non-Offensive (NOT) - Tweets with this label do not contain hate speech, foul
language, or other offensive material.
• Offensive (OFF) - Hateful, offensive, and profane content can all be found in tweets with
this label.
3https://hasocfire.github.io/hasoc/2022/index.html
4http://fire.irsi.res.in/fire/2022/home
5https://hasocfire.github.io/hasoc/2022/index.html</p>
        <sec id="sec-3-2-1">
          <title>The data statistics are as follows:</title>
          <p>The graphical representation of the dataset statistics is shown in Figure 2. Twitter's
definition of the term "Offensive" refers to abusive remarks made to people or groups with the
intention of intimidating them or silencing their voice.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Handling the Class Imbalanced Issue</title>
      <p>
        For Task A the dataset was balanced, while for Task B it was imbalanced. Randomly
resampling the training dataset is one way to deal with the issue of
data imbalance. The dataset can be resampled using two different techniques: undersampling,
which involves removing examples from the majority class, and oversampling, which involves
repeating samples from the minority class [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. We oversampled the dataset using the imblearn
[
        <xref ref-type="bibr" rid="ref27">27</xref>
        ] library because the training instances are already rather few, and removing examples from
the majority class would reduce them further. We made the ratio of the minority to the majority
class 0.5 by using RandomOverSampler with a sampling strategy of 0.5.
      </p>
    </sec>
    <sec id="sec-5">
      <title>5. TF-IDF for Text Classification</title>
      <p>
        TF (term frequency) measures the importance of a word for a particular document [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]:

TF(t, d) = (number of occurrences of term t in document d) / (total number of terms in document d)

IDF (inverse document frequency) describes the relevance of a word for a corpus. For instance,
stopwords are included in every document, making them the least relevant for classifying the
whole corpus. As a result, their IDF value will be low. On the other hand, a word's IDF value
will be high if it appears in a small number of documents.
      </p>
      <p>IDF(t) = log(N / number of documents in which term t appears), where N is the total number of documents in the corpus.</p>
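      <p>A small worked example of these formulas in plain Python (note that scikit-learn's TfidfVectorizer uses a smoothed IDF variant, so its values differ slightly):

```python
import math

docs = [
    "the cat sat",            # document 0
    "the dog barked loudly",  # document 1
]
tokenized = [d.split() for d in docs]

def tf(term, doc_tokens):
    # term frequency: occurrences of the term / total terms in the document
    return doc_tokens.count(term) / len(doc_tokens)

def idf(term, corpus):
    # inverse document frequency: log(N / number of docs containing the term)
    n_docs_with_term = sum(1 for doc in corpus if term in doc)
    return math.log(len(corpus) / n_docs_with_term)

print(tf("cat", tokenized[0]))   # 1/3 ≈ 0.333
print(idf("the", tokenized))     # log(2/2) = 0.0  (stopword, low relevance)
print(idf("cat", tokenized))     # log(2/1) ≈ 0.693 (rare word, high relevance)
print(tf("cat", tokenized[0]) * idf("cat", tokenized))  # TF-IDF ≈ 0.231
```
</p>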
      <sec id="sec-5-1">
        <title>Then finally we combine both TF and IDF to form TF-IDF:</title>
        <p>
TF-IDF(t, d) = TF(t, d) * IDF(t)

For the classification of an input tweet, a voting method is used. For each word in the input
text, we calculate an &lt;HOF score&gt; and a &lt;NOT score&gt;. The code iterates over the entire training
set tweet by tweet. For each word in the input text, if the word is present in the tweet, the
tweet's label is checked. If the label is 1, the TF-IDF value of the word for that tweet is added
to its &lt;HOF score&gt;, otherwise to its &lt;NOT score&gt;. The &lt;HOF score&gt; and &lt;NOT score&gt; values
of all words are then summed over the full input text, and the label with the higher score
is the predicted label.
        </p>
      </sec>
    </sec>
    <sec id="sec-5a">
      <title>6. BERT Model and its Variants for Text Classification</title>
      <p>
        There are two main steps in the BERT architecture for classification: pre-training
and fine-tuning [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. Pre-training involves training the model on unlabeled data using several
pre-training tasks. Much as a language teacher uses "fill in the blanks" and
"question and answer" exercises, the BERT model is pre-trained by
giving it tokenized text and masking part of the text's tokens; the model's job is to recover the
missing word. Another method used for pre-training BERT is next sentence prediction. It starts
with choosing two sentences A and B; 50% of the time B is the actual sentence following A, and
50% of the time it is a random sentence from the corpus. This teaches the model to identify
the relationship between two sentences, which helps in question-answering tasks. The
next step is fine-tuning for various tasks, such as classification and question answering, for
which two sentences are appended with a [SEP] token between them, or only one sentence is
passed as input. The fine-tuning task requires some additional layers over the output of
the BERT model; for example, for classification, the output
corresponding to the [CLS] token is taken as input to a Feed Forward Network (FFN). This
Feed Forward Network is called the fine-tuning layer, and during fine-tuning, the weights of
this classification layer are trained without changing the weights inside the BERT model. So
we can say that the fine-tuning layer uses the knowledge of the BERT model to train for
classification; in our case, there are two nodes in the output layer for binary classification. The BERT
architecture is shown in Figure 3. The following BERT variants are used in the proposed task:
• MuRIL6 - MuRIL [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ] is a BERT-based model trained over 17 Indian languages using
Wikipedia data.
• Multilingual-BERT7 - M-BERT [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ] is pre-trained on large Wikipedia data covering 104 languages. WordPiece is
used to tokenize the texts, with a common vocabulary
of size 110,000. This model is case sensitive.
• Distil-BERT8 - The DistilBERT [30] model is a small, cheap and fast transformer that
used knowledge distillation during pre-training, reducing the size of BERT by 40%.
      </p>
    </sec>
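    <p>The fine-tuning layer described above can be illustrated with a toy sketch: a two-node feed-forward classifier trained with cross-entropy on frozen sentence representations. Random vectors stand in for the [CLS] outputs of a pre-trained encoder; this illustrates the idea only and is not the authors' setup:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for frozen [CLS] embeddings from a pre-trained encoder:
# 20 "sentences", 16-dimensional, labels 0 (NOT) / 1 (HOF).
# The two classes get different means so they are separable.
labels = np.array([0] * 10 + [1] * 10)
cls_embeddings = rng.normal(size=(20, 16)) + labels[:, None] * 2.0

# Fine-tuning layer: one feed-forward layer with two output nodes.
W = np.zeros((16, 2))
b = np.zeros(2)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

one_hot = np.eye(2)[labels]
for step in range(200):
    probs = softmax(cls_embeddings @ W + b)
    # Cross-entropy gradient; only the head's weights are updated,
    # while the "encoder" (the embeddings) stays frozen.
    grad = probs - one_hot
    W -= 0.1 * (cls_embeddings.T @ grad) / len(labels)
    b -= 0.1 * grad.mean(axis=0)

preds = (cls_embeddings @ W + b).argmax(axis=1)
print("training accuracy:", (preds == labels).mean())
```
</p>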
    <sec id="sec-6">
      <title>7. Proposed Techniques and Algorithms</title>
      <sec id="sec-6-1">
        <title>Task A (Code-Mixed Language):</title>
        <p>The suggested model, as shown in Figure 4, first takes the multilingual and code-mixed text as
input and preprocesses it by deleting stopwords (the sklearn library provides stopword lists
for English and German, and a Kaggle list is used for Hindi; a custom function
removes the stopwords from the dataset one by one using these lists) for the
dataset presented in Figure 1. Hyperlinks, emojis and hashtags are also removed. The text
is lowercased to handle names in the text. Following that, the preprocessed data is used to
train two different models, the TF-IDF feature extraction model and the BERT model (all models
are trained independently). The HOF and NOT scores for the test data are determined using the
TF-IDF feature extraction approach described in Section 5. The following are
the phases of text classification using TF-IDF:
• Data Pre-Processing
• Extracting TF-IDF features
• Calculating the TF-IDF score for classification
To determine whether a text input is HOF or NOT, a tokenizer is applied first, followed by a
fine-tuning layer over the four pre-trained BERT models (Distil-BERT, Multilingual-BERT,
RoBERTa, and MuRIL-BERT). Figure 4 shows the proposed architecture for hate speech
classification. The four key phases of the process are:
• Data Pre-Processing
• Tokenization
• Using a Pre-Trained BERT Model
• Fine-Tuning a Classifier for the Pre-Trained Model
Table 3 lists the various hyperparameters used while training the proposed models.
6https://huggingface.co/google/MuRIL-base-cased
7https://huggingface.co/bert-base-multilingual-cased
8https://huggingface.co/distilroberta-base</p>
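        <p>The TF-IDF scoring phases above can be sketched in plain Python: a simplified reconstruction of the described voting scheme over a toy training set, with illustrative names, not the authors' code:

```python
import math

# Toy labelled training tweets: 1 = HOF, 0 = NOT.
train = [
    ("i hate you idiot", 1),
    ("you are an idiot", 1),
    ("have a lovely day", 0),
    ("what a lovely game", 0),
]

def tfidf(term, tokens, corpus):
    tf = tokens.count(term) / len(tokens)
    df = sum(1 for doc, _ in corpus if term in doc.split())
    return tf * math.log(len(corpus) / df) if df else 0.0

def classify(text, corpus):
    # Sum each word's TF-IDF mass under HOF vs NOT training tweets,
    # then vote for the label with the higher total score.
    scores = {"HOF": 0.0, "NOT": 0.0}
    for word in text.split():
        for tweet, label in corpus:
            tokens = tweet.split()
            if word in tokens:
                key = "HOF" if label == 1 else "NOT"
                scores[key] += tfidf(word, tokens, corpus)
    return max(scores, key=scores.get)

print(classify("you idiot", train))   # HOF
print(classify("lovely day", train))  # NOT
```
</p>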
      </sec>
      <sec id="sec-6-2">
        <title>Task B (Marathi Language):</title>
        <p>Transformers-based models offer state-of-the-art performance on several NLP tasks,
such as fake news detection, question answering, machine translation, and rumour detection.
As a result of their bidirectional training and greater language comprehension, they
outperform other ML approaches. The first step in creating a Transformer-based model is
pre-training, which is then followed by fine-tuning. Large monolingual language datasets or
datasets in several languages (multilingual) are used to train the model in the initial stage.
To obtain the word embeddings, just the encoder component of the transformer design is
used. To calculate the probabilities for the binary classes, an additional output layer is added.
The different word embedding models used are mentioned above in the BERT
explanation.</p>
        <p>The flowchart in Figure 5 shows the complete approach. In brief, the four main steps of the
process are:
• Data Pre-Processing
• Tokenization
• Using a Pre-Trained BERT Model
• Fine-Tuning a Classifier for the Pre-Trained Model</p>
        <sec id="sec-6-2-1">
          <title>The hyperparameters for training the model are mentioned in Table 4.</title>
        </sec>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>8. Results and Evaluations</title>
      <sec id="sec-7-1">
        <title>Task A (Code-Mixed Language):</title>
        <p>The performance of each model is evaluated using various evaluation metrics. Table 5 lists
the accuracy, precision, recall, and F1-measure of the TF-IDF model. Table 6 lists the
accuracy, Micro-F1 and Macro-F1 of BERT and its variants. Of the three BERT versions,
MuRIL produced the best outcomes. Distil-BERT and Multilingual-BERT produced nearly
identical results, but Multilingual-BERT performed better. The code is available from the github
repository9.</p>
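        <p>The Macro-F1 and Micro-F1 scores reported in these tables can be computed with scikit-learn (toy labels for illustration):

```python
from sklearn.metrics import accuracy_score, f1_score

# Toy gold labels and predictions (1 = HOF, 0 = NOT).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]

# Macro-F1 averages the per-class F1 scores equally, so it is
# sensitive to performance on the minority class; for single-label
# binary data, Micro-F1 equals plain accuracy.
print("accuracy:", accuracy_score(y_true, y_pred))               # 0.75
print("macro F1:", f1_score(y_true, y_pred, average="macro"))    # ≈ 0.733
print("micro F1:", f1_score(y_true, y_pred, average="micro"))    # 0.75
```
</p>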
      </sec>
      <sec id="sec-7-2">
        <title>Task B (Marathi Language):</title>
        <p>Accuracy and Macro-F1 are used to evaluate each model's performance. MuRIL gave the best
results among all three BERT models. MuRIL and Multilingual-BERT gave almost similar
results, with MuRIL performing slightly better, while Distil-BERT performed the
worst. The results on the test data provided by HASOC are only for the following hyperparameters: number
of epochs = 4, batch size = 2 and learning rate = 1.00742e-05. The results are shown
in Table 7, Table 8, Table 9 and Table 10:
9https://github.com/saransh-goel/HASOC.git</p>
        <p>The results show that MuRIL gives the best results in all scenarios. When the learning
rate is decreased, the accuracy of all three models increases; when the learning rate is
increased, the accuracy of MuRIL and mBERT decreases while the accuracy of Distil-BERT increases.
The effect of changing the batch size is similar to that of changing the learning rate.</p>
      </sec>
    </sec>
    <sec id="sec-8">
      <title>9. Conclusion and Future Work</title>
      <sec id="sec-8-1">
        <title>Task A (Code-Mixed Language):</title>
        <p>The proposed results demonstrate that the BERT model performs better than the TF-IDF feature
extraction model. This is because the BERT model takes into account the right and left context
in the text, allowing it to detect hate speech more accurately by considering the context
of each sentence. Additionally, BERT takes subwords as tokens; for example, "playing"
is broken into "play" and "ing", and separate embeddings are calculated for each token;
this also helps the BERT model perform better. In this scenario, MuRIL-BERT
outperforms Multilingual-BERT and Distil-BERT. A natural next stage for detecting hate speech
is a multimodal approach. Some social context-based features can also be
investigated in future research. One could also go further in the TF-IDF feature extraction
process and employ character and word n-grams for hate speech detection. A BERT
model pre-trained over a large corpus of code-mixed language, particularly
Hindi written in Roman script, could perform even better.</p>
      </sec>
      <sec id="sec-8-2">
        <title>Task B (Marathi Language):</title>
        <p>The results presented above show that pre-trained BERT models perform better and are better
able to grasp the meaning of a given sentence, serving as better learned representations.
Therefore, compared to conventional feature extraction approaches, the transfer learning
strategy using pre-trained BERT models is better suited for identifying offensive and hate
speech. MuRIL performed the best among the three models. On the public leaderboard
rankings, we came in fourth place. Additionally, by focusing on both images and text and
extracting visual components for better feature extraction, we may approach this hate speech
detection issue from a multimodal perspective. With better word tokenization and specific
tokens for the Marathi language, the performance could be enhanced. In the future, models can be
trained on a larger corpus to improve accuracy even further. Further, future experiments with
deeper transformer architectures may be conducted.
[30] V. Sanh, L. Debut, J. Chaumond, T. Wolf, Distilbert, a distilled version of bert: smaller,
faster, cheaper and lighter, arXiv preprint arXiv:1910.01108 (2019).</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Paz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Montero-Díaz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Moreno-Delgado</surname>
          </string-name>
          ,
          <article-title>Hate speech: A systematized review</article-title>
          ,
          <source>Sage Open</source>
          <volume>10</volume>
          (
          <year>2020</year>
          )
          <fpage>2158244020973022</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J. T.</given-names>
            <surname>Nockleby</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. W.</given-names>
            <surname>Levy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. L.</given-names>
            <surname>Karst</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. J.</given-names>
            <surname>Mahoney</surname>
          </string-name>
          ,
          <article-title>Encyclopedia of the american constitution</article-title>
          , Detroit, MI:
          <source>Macmillan Reference</source>
          <volume>3</volume>
          (
          <year>2000</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>MacAvaney</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.-R.</given-names>
            <surname>Yao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Russell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goharian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Frieder</surname>
          </string-name>
          ,
          <article-title>Hate speech detection: Challenges and solutions</article-title>
          ,
          <source>PloS one 14</source>
          (
          <year>2019</year>
          )
          <article-title>e0221152</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>P.</given-names>
            <surname>Mehta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Majumder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gangopadhyay</surname>
          </string-name>
          ,
          <article-title>Report on the fire 2020 evaluation initiative</article-title>
          ,
          <source>in: ACM SIGIR Forum</source>
          , volume
          <volume>55</volume>
          , ACM New York, NY, USA,
          <year>2021</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>11</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>N.</given-names>
            <surname>Choudhary</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Bindlish</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Shrivastava</surname>
          </string-name>
          ,
          <article-title>Sentiment analysis of code-mixed languages leveraging resource rich languages</article-title>
          ,
          <source>arXiv preprint arXiv:1804.00806</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. K.</given-names>
            <surname>Shahi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Madhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Satapara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Ranasinghe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zampieri</surname>
          </string-name>
          ,
          <article-title>Overview of the HASOC subtrack at FIRE 2021: Hate speech and offensive content identification in English and Indo-Aryan languages and conversational hate speech</article-title>
          ,
          <source>in: Forum for Information Retrieval Evaluation</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>3</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. K.</given-names>
            <surname>Shahi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Madhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Satapara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Majumder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schäfer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Ranasinghe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zampieri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Nandini</surname>
          </string-name>
          , et al.,
          <article-title>Overview of the HASOC subtrack at FIRE 2021: Hate speech and offensive content identification in English and Indo-Aryan languages</article-title>
          ,
          <source>arXiv preprint arXiv:2112.09301</source>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>J.</given-names>
            <surname>Ramos</surname>
          </string-name>
          , et al.,
          <article-title>Using tf-idf to determine word relevance in document queries</article-title>
          ,
          <source>in: Proceedings of the first instructional conference on machine learning</source>
          , volume
          <volume>242</volume>
          ,
          Citeseer
          ,
          <year>2003</year>
          , pp.
          <fpage>29</fpage>
          -
          <lpage>48</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>A.</given-names>
            <surname>Schmidt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegand</surname>
          </string-name>
          ,
          <article-title>A survey on hate speech detection using natural language processing</article-title>
          ,
          <source>in: Proceedings of the fifth international workshop on natural language processing for social media</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>10</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          ,
          <article-title>BERT: Pre-training of deep bidirectional transformers for language understanding</article-title>
          ,
          <source>arXiv preprint arXiv:1810.04805</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>S.</given-names>
            <surname>Kalra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Bansal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Sharma</surname>
          </string-name>
          ,
          <article-title>Detection of abusive records by analyzing the tweets in urdu language exploring transformer based models</article-title>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>S.</given-names>
            <surname>Kalra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Agrawal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Sharma</surname>
          </string-name>
          ,
          <article-title>Detection of threat records by analyzing the tweets in urdu language exploring deep learning transformer-based models</article-title>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>S.</given-names>
            <surname>Kalra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Verma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Sharma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. S.</given-names>
            <surname>Chauhan</surname>
          </string-name>
          ,
          <article-title>Ensembling of various transformer based models for the fake news detection task in the urdu language</article-title>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>S.</given-names>
            <surname>Kalra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. N.</given-names>
            <surname>Inani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Sharma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. S.</given-names>
            <surname>Chauhan</surname>
          </string-name>
          ,
          <article-title>Applying transfer learning using bert-based models for hate speech detection</article-title>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>T.</given-names>
            <surname>Davidson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Warmsley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Macy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Weber</surname>
          </string-name>
          ,
          <article-title>Automated hate speech detection and the problem of offensive language</article-title>
          ,
          <source>in: Proceedings of the international AAAI conference on web and social media</source>
          , volume
          <volume>11</volume>
          ,
          <year>2017</year>
          , pp.
          <fpage>512</fpage>
          -
          <lpage>515</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>A.</given-names>
            <surname>Gaydhani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Doma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kendre</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Bhagwat</surname>
          </string-name>
          ,
          <article-title>Detecting hate speech and offensive language on twitter using machine learning: An n-gram and tfidf based approach</article-title>
          ,
          <source>arXiv preprint arXiv:1809.08651</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>R.</given-names>
            <surname>Kshirsagar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Cukuvac</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>McKeown</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>McGregor</surname>
          </string-name>
          ,
          <article-title>Predictive embeddings for hate speech detection on twitter</article-title>
          ,
          <source>arXiv preprint arXiv:1809.10644</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>P.</given-names>
            <surname>Badjatiya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Varma</surname>
          </string-name>
          ,
          <article-title>Deep learning for hate speech detection in tweets</article-title>
          ,
          <source>in: Proceedings of the 26th international conference on World Wide Web companion</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>759</fpage>
          -
          <lpage>760</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>G. K.</given-names>
            <surname>Pitsilis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Ramampiaro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Langseth</surname>
          </string-name>
          ,
          <article-title>Effective hate-speech detection in twitter data using recurrent neural networks</article-title>
          ,
          <source>Applied Intelligence</source>
          <volume>48</volume>
          (
          <year>2018</year>
          )
          <fpage>4730</fpage>
          -
          <lpage>4742</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Luo</surname>
          </string-name>
          ,
          <article-title>Hate speech detection: A solved problem? the challenging case of long tail on twitter</article-title>
          ,
          <source>Semantic Web</source>
          <volume>10</volume>
          (
          <year>2019</year>
          )
          <fpage>925</fpage>
          -
          <lpage>945</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>A.</given-names>
            <surname>Bisht</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Bhadauria</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Virmani</surname>
          </string-name>
          , et al.,
          <article-title>Detection of hate speech and offensive language in twitter data using lstm model</article-title>
          ,
          <source>in: Recent Trends in Image and Signal Processing in Computer Vision</source>
          , Springer,
          <year>2020</year>
          , pp.
          <fpage>243</fpage>
          -
          <lpage>264</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          ,
          <article-title>BERT: Pre-training of deep bidirectional transformers for language understanding</article-title>
          ,
          <source>arXiv preprint arXiv:1810.04805</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>M.</given-names>
            <surname>Zampieri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Ranasinghe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Chaudhari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gaikwad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Krishna</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Nene</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Paygude</surname>
          </string-name>
          ,
          <article-title>Predicting the type and target of offensive social media posts in marathi</article-title>
          ,
          <source>Social Network Analysis and Mining</source>
          <volume>12</volume>
          (
          <year>2022</year>
          )
          <fpage>77</fpage>
          . URL: https://doi.org/10.1007/s13278-022-00906-8. doi:10.1007/s13278-022-00906-8.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>P. K.</given-names>
            <surname>Roy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. K.</given-names>
            <surname>Tripathy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. K.</given-names>
            <surname>Das</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.-Z.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <article-title>A framework for hate speech detection using deep convolutional neural network</article-title>
          ,
          <source>IEEE Access</source>
          <volume>8</volume>
          (
          <year>2020</year>
          )
          <fpage>204951</fpage>
          -
          <lpage>204962</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>S.</given-names>
            <surname>Satapara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Majumder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Madhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Ranasinghe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zampieri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>North</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Premasiri</surname>
          </string-name>
          ,
          <article-title>Overview of the HASOC Subtrack at FIRE 2022: Hate Speech and Offensive Content Identification in English and Indo-Aryan Languages</article-title>
          ,
          <source>in: FIRE 2022: Forum for Information Retrieval Evaluation, Virtual Event, 9th-13th December 2022</source>
          , ACM,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Majumder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Satapara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Patel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Madhu</surname>
          </string-name>
          ,
          <article-title>Overview of the HASOC Subtrack at FIRE 2022: Identification of Conversational Hate-Speech in Hindi-English Code-Mixed and German Language</article-title>
          ,
          <source>in: Working Notes of FIRE 2022 - Forum for Information Retrieval Evaluation</source>
          , CEUR,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>S.</given-names>
            <surname>Prabhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mohamed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Misra</surname>
          </string-name>
          ,
          <article-title>Multi-class text classification using bert-based active learning</article-title>
          ,
          <source>arXiv preprint arXiv:2104.14289</source>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>S.</given-names>
            <surname>Khanuja</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Bansal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Mehtani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Khosla</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Dey</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Gopalan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. K.</given-names>
            <surname>Margam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Aggarwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. T.</given-names>
            <surname>Nagipogu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Dave</surname>
          </string-name>
          , et al.,
          <article-title>Muril: Multilingual representations for indian languages</article-title>
          ,
          <source>arXiv preprint arXiv:2103.10730</source>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>T.</given-names>
            <surname>Pires</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Schlinger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Garrette</surname>
          </string-name>
          ,
          <article-title>How multilingual is multilingual bert?</article-title>
          ,
          <source>arXiv preprint arXiv:1906.01502</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>