<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Probabilistic Impact Score Generation using Ktrain-BERT to Identify Hate Words from Twitter Discussions</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sourav Das</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Prasanta Mandal</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sanjay Chatterji</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science and Engineering, Indian Institute of Information Technology</institution>
          ,
          <addr-line>Kalyani West Bengal</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Social media has seen a worrying rise in hate speech in recent times. Branching to several distinct categories of cyberbullying, gender discrimination, or racism, the combined label for such derogatory content can be classified as toxic content in general. This paper presents experimentation with a Keras wrapped lightweight BERT model to successfully identify hate speech and predict probabilistic impact score for the same to extract the hateful words within sentences. The dataset used for this task is the Hate Speech and Offensive Content Detection (HASOC 2021) data from FIRE 2021 in English. Our system obtained a validation accuracy of 82.60%, with a maximum F1-Score of 82.68%. Subsequently, our predictive cases performed significantly well in generating impact scores for successful identification of the hate tweets as well as the hateful words from tweet pools.</p>
      </abstract>
      <kwd-group>
        <kwd>eol&gt;Hate Speech</kwd>
        <kwd>Hate Tweets</kwd>
        <kwd>Label Classification</kwd>
        <kwd>BERT</kwd>
        <kwd>Probabilistic Impact Score</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>With the ever-evolving paradigms of global socio-economic issues, there has also been an
increment in vast amounts of expressions and opinions on social networks in recent years.
Depending on the context or topic, often the participating agents (humans) in the conversation
can mutually agree or be polarized. Such contrast between the opinions and perceptions causes
the tendency in people to get under the nerves of the other people involved in the discussion,
even if it is by the help of hateful comments, or toxic content in general. The Cambridge Dictionary
defines hate speech as “public speech that expresses hate or encourages violence towards a
person based on something such as race, religion or sexual orientation”. However, it could be
unanimously agreed that political and financial viewpoints also play a significant role in modern
hate content on social media.</p>
      <p>
        In the need to counter the online hate content, the NLP research community is continuing
prominent contributions and findings for the same [
        <xref ref-type="bibr" rid="ref1 ref2 ref3">1, 2, 3</xref>
        ]. However, hate speech is not always
directly trackable, as it is dependent on topic and context [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Also, in a longer sequence of
discussions, hate comments can occur as well we dissolve.
      </p>
      <p>In this work, we focus on detecting hate speech and offensive content in English by exploratory
data analysis with an ensemble transformer model.
____________________________
Forum for Information Retrieval Evaluation, December 13-17, 2021, India</p>
      <p>souravphd21@iiitkalyani.ac.in (S. Das); prasantaphd2019@iiitkalyani.ac.in (P. Mandal);
sanjayc@iiitkalyani.ac.in (S. Chatterji)
0000-0003-4864-1635 (S. Das); 0000-0003-1367-957X (P. Mandal); 0000-0002-1086-9987 (S. Chatterji)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</p>
      <p>
        CPWrEooUrckResehdoinpgs hIStpN:/ce1ur6-1w3-s0.or7g3 CEUR Workshop Proceedings (CEUR-WS.org)
The hate data HASOC produced by the FIRE 20211 [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], which are multilingual datasets for the
Hate Speech and Offensive Content Identification shared task, but the datasets are available in
separate fractions as per different languages (English, Hindi, and Marathi).
      </p>
      <p>For the proposed research, we choose only the English dataset from HASOC Subtask 1. In order
to shape the data for our proposed work, we introduce a set of extensive data analysis steps and
reduce the data attributes only to the atomic columns. We split the train and test sets using linear
Logistic Regression, and save the model. Next, we incorporate the ktrain BERT model [6], which
is a lightweight BERT wrapped by the Tensorflow-Keras2 library for low resource systems and a
faster training phase. We extend the model further for several hyperparameters tuning, train and
test visualization, and most importantly for probabilistic impact score generation from the
trained model for any random sentences, to point out the hate speech factor (word) in those
sentences. Finally, we produce the qualitative analysis by classification report from the saved
model. We intend to make the code publicly available at Github3.</p>
      <p>
        The rest of the paper is organized as follows: Section 2 provides a detailed insight on the data
and processing of the same. Section 3 describes the structural details of the selected model, with
experimental and hyperparameter settings. Section 4 demonstrates the results with metrics and
visualizations, probabilistic impact score generation for hate or positive tweets, and error
analysis. Finally, Section 5 concludes the work with future directions.
2. Data Description
As mentioned earlier, our data for experimentation is the Hate Speech and Offensive Content
Identification in English and Indo-Aryan Languages or HASOC 20214, provided by the Forum for
Information Retrieval Evaluation (FIRE 2021) [
        <xref ref-type="bibr" rid="ref6">7</xref>
        ]. There are three separate sets of train and test
corpora, respectively in English, Hindi, and Marathi. The datasets are primarily available for
particular shared Subtask 1, which is further divided as subtask 1A for identifying hate and
offensive contents from the tweets, and subtask 1B for discriminating between hate, profane, and
offensive posts. We select the English corpus of Subtask 1 for our approach. The train set is a
binary classification dataset, where the tweets contain the user id for each tweet, the tweets
themselves in text format without the eradication of emojis, special characters, and external web
links, and the classification category for each tweet. The columns are named “id”, “text”, and
“task_1”. The tweets are already annotated, and either labeled as HOF, i.e., Hate and Offensive
Content, or NOT, i.e., Non-Hate-Offensive Content under the task_1 column. On the other hand,
the test set contains tweets with respective user ids only. These tweets are not labeled and could
either be used for hate content identification (in the form of HOF and NOT), or for system
validation and hate tweet prediction. We showcase the detailed dataset information in Table 1.
      </p>
      <p>Data
HASOC 2021 Train</p>
      <p>(English)
HASOC 2021 Test
(English)</p>
      <p>Total Tweets
3844
1282
1 http://fire.irsi.res.in/fire/2021/home
2 https://www.tensorflow.org/api_docs/python/tf/keras
3 https://github.com/SouravD-Me
4 https://hasocfire.github.io/hasoc/2021/index.html
HOF
2508</p>
      <p>Labels</p>
      <p>NA</p>
      <p>NOT
1976
2.1</p>
      <sec id="sec-1-1">
        <title>Preprocessing</title>
        <p>To normalize the data in plain English texts, at first, we aim to eliminate the web links,
abbreviations. We drop the additional columns not contributing to the determinant factor of the
dataset. Next, we process the emojis and translate the labels (HOF or NOT) of the tweets into
binary categories for better efficiency of the subsequent training phases. Here, hate speech can
obviously be interpreted as negative content, while a generic tweet is a positive content. For that,
we introduce two columns, neg and pos. the fundamental behind this is that we want to maintain
an exact bipolarity of each tweet from the corpus. We maintain two labels parallelly, while joining
the tweets with their respective HOF and NOT categories, and providing a score of 1 in a pos
column if the tweet is non-hateful, or 1 in the neg column if the tweet contains hate speech. We
show the head end of the preprocessed data shape in Figure 1.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>3. Model</title>
      <p>At first, we implement the already popularized Logistic Regression algorithm from scikit learn5
to make a manual imbalanced partition of the train set itself. Since this is a classification task, LR
has proved to be efficient in this genre of tasks [8]. LR maintains a uniform numerical
regularization to normalize the stability of the training data, otherwise is also termed as the C
value, i.e., the hyperparameter set. The downright advantage of deploying the LR model from
scikit learn library is that it is fairly tuned, so the split (and training, if needed) needs little to no
further optimization.</p>
      <p>In order, we use the pre-trained BERT model wrapped in a Keras-Tensorflow library for
efficiency in low code. This custom wrapper library is known as ktrain. BERT is already a widely
renowned application. It is a bi-directional transformer that learns a language representation for
pre-training. It uses the attention mechanism Transformer to learn the relationships between
words in a text. Instead of having the model sequentially read the text input, the transformer
encoder reads the entire text sequence at once. This characteristic makes it non-directional. But
one major downside of such an acclaimed model is that classic BERT performs better than
expected on many challenging tasks mainly due to its ability to handle large datasets, with
Google’s state-of-the-art computing resources, and the complexity of its pre-training tasks. This
is not always beneficial for researchers with low resources or access. Hence, diluted and more
5 https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html#
comprehensive versions of BERT have emerged as the more popular choices for
experimentations. A few of such models are DistilBERT [9], BERT-Base-Uncased [9], DeBERTa
[10], etc.</p>
      <p>A similar, but underutilized extension is the ktrain wrapper for BERT. It is not exclusively built
for BERT, but the original repo is influenced by fastai6 library, and also offers a wide range of
other implementations for NBSVM [11], BiLSTM [12], and LDA [13]. It is also a potential library
for image classification and graph neural networks. For dedicated NLP tasks, ktrain helps build
and train neural networks for text classification, question answering, document summarization,
named entity tagging, etc. It can also be used to estimate an optimal learning rate and schedule
learning rates.</p>
      <p>The first step involves loading and preprocessing data from different sources. Since ktrain is
developed to work seamlessly with Keras, this step is quite similar to data loading in Keras
models. It can be done in various forms, such as text, images, and graph data. At this stage, the
model expects a preprocessor instance, where the preprocessor is loaded with the encapsulated
feature set, pretraining length, and model range. In the second step, the model can be further
customized by hyperparameter tuning, learning schedules interchanging, or by creating a custom
one using tf.keras. Here we can employ the learning rate schedules such as one-cycle policy and
SDGR. For our approach, we use the turnwise one-cycle policy, i.e., repeating a cycle each with 25
epochs for three consecutive times. After that, the model is automatically configured after
inspecting the data. Next, we call the predictor method from the already learned model, and we
provide instances (tweets) from the test data, or even random sentences to predict the category
(HATE or NOT). Now, for the predicted set of sentences, the explain method can be called to
generate the probabilistic scores to showcase the scores for each word in that sentence. A word
with a higher score represents the higher negative impact and essentially turns out to be hate,
offensive, or discriminatory content. Here the predictor log along with the model implementation
could also be saved for later deployment. Our model with the proposed methodology for the
experiment is represented in Figure 2.
3.1</p>
      <sec id="sec-2-1">
        <title>Setup</title>
        <p>We use Google Colab as our platform for experimentation. It introduces a native feature named a
hardware accelerator for faster execution time, but with a limited resource threshold. Colab
provides the options to choose from the dedicated graphical processor environment execution
based on Tesla K80 GPUs or Google’s Tensor Processing Units, developed for parallel neural
computations simultaneously. Users can select any of the previously mentioned options, and for
the record, any BERT-based model requires demanding resources. Since we worked with the
ktrain wrapper, we select the notebook settings as GPU on a regular non-pro Colab account. We
intend to utilize the available CUDA cores in a GPU environment for enhanced training and
prediction time. We also connect to a hosted runtime with sufficient but abstract RAM and disk
availability, as Google does not share the exact figurative information with the users.
3.2</p>
      </sec>
      <sec id="sec-2-2">
        <title>Hyperparameter Settings</title>
        <p>For the hate words classification from ktrain-BERT, at first, the pretrained BERT loaded with
relatively fewer parameters to run on top of Keras. The ktrain-BERT has a pretraining length of
75, with a maximum feature set consideration, and set of words per batch for embedding. For
word token sequencing, it has a ngram range of 3. The model loading also includes determining
the column labels, so the model itself can pre-judge if the task is binary classification or
multilabel classification. Then we call the get_learner function to place the already split data in the
ktrain learner object. The batch size is kept as 16 for providing small training sets at a time. The
6 https://docs.fast.ai/
learner object can also be invoked by creating a custom fit function. We initiate the fit function
while keeping the learning rate 1e – 5 for 25 epochs. For the training process, we further tune the
learning rate by implementing the ktrain find function, which provides an interactive
customization scope into the learning schedule. We employ a Stochastic Gradient Restart
schedule (SGDR) to ensure to maintain checkpoints in training and resume training if the
notebook is paused, or the GPU access gets terminated. Now, the training metrics are fit to 8
epochs for 5 cycles, where only the best-performed batches are kept for entering into the next
cycle, and the rest are discarded. Here we keep the learning rate as 1e – 6. Our total trainable
parameters are 109,148,162.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>4. Results</title>
      <p>The results of ktrain-BERT for dedicated hate words identification for English data are shown in
Table 2. At next, we show the model shape for the first 5 layers out of a total of 12 layers with
layer-wise structure in Table 3 to provide a better investigation of the model itself. We run the
ktrain-BERT for 40 epochs in 5 consecutive cycles. The training vs. validation accuracy is plotted
in Figure 3. It shows the incremental growth of the model performance w.r.t time during this
phase. For English HASOC data, the Precision, F1-Score, and the highest training and validation
accuracy obtained subsequently are 87.16%, 83.63%, 97.28%, and 82.60%. The classification
report can be depicted in Figure 4.</p>
      <p>Language</p>
      <p>English</p>
      <p>Model
KtrainBERT</p>
      <p>Identification Labels</p>
      <p>HOF
(Hate Content)</p>
      <p>NOT
(Non-Hate Content)</p>
      <sec id="sec-3-1">
        <title>Probabilistic Impact Score</title>
        <p>After the validation phase, we choose random tweets both from the training and test data to
explain the model on a predictive probabilistic scale. It is obtained by extending the trained set
and validating the test data further by fusing the explain and predict function on the randomly
fed tweets. We choose three tweets with hate content, and the other three with non-hateful
content. The scores are normalized in a comparable scale, i.e., all the scores are in a non-negative
scale, while quite naturally the higher scores represent more positive or negative impacts, reliant
on the context of the concerning tweet itself. The results are shown here in Figures 5 to 10, with
the impact score for each word within the sentences.</p>
        <p>Tweet impact score breakdown: the - 0.767, razer - 0.073, blade - 1.143, 15 - 2.052, is - 2.336,
really - 3.190, good - 1.707, this - 0.517, year - 1.590.</p>
        <p>Tweet impact score breakdown: any - 0.539, daniel - 0.622, d - 0.922, lewis - 0.801, movie - 0.714,
is - 0.223, almost - 1.595, flawless - 2.604.</p>
        <p>Tweet impact score breakdown: real - 0.259, madid - 1.032, has - 0.881, an - 0.002, excellent
1.583, record - 0.383, in - 0.171, the - 0.444, champions - 0.411, league - 0.652.</p>
        <p>Tweet impact score breakdown: ryzen - 0.642, is - 0.977, a - 0.296, winner - 0.132, and - 0.036,
you’re - 0.808, a - 0.091, freak - 1.920, to - 0.353, think - 0.194, otherwise - 0.506.</p>
        <p>Tweet impact score breakdown: stop - 1.191, being - 1.630, a - 0.238, twat - 2.383, then - 0.818.</p>
        <p>Tweet impact score breakdown: you - 2.082, are - 1.220, hopeless - 0.026, retire - 1.816, wanker
- 3.773.</p>
        <p>It is evident from the above demonstrations that the system has identified the impactful hate
or non-hate words within the tweets in every instance. While it might be argued that the
probabilistic impact score for the words “hopeless” or “retire” is relatively low to identify them
as potential hate words, but the counter logic could be these words do not necessarily mean to
use for derogatory comments always. These are heavily used for the generic purpose (e.g., “It is
sad that Jim is finally going to retire”, or “I’m feeling hopeless since the beginning of the pandemic”).
Hence the impact on these words could not entirely determine the toxicity of a tweet or reply.
4.2</p>
      </sec>
      <sec id="sec-3-2">
        <title>Error Analysis</title>
        <p>It can be depicted from Figure 4 that even if the training classification accuracy constantly
maintains a stable increment range over 90%, in fare to that the validation rather performed less
significantly, with a bottleneck around 82%. The primary reason for this is the misclassification
of several tweets during the validation phase. We showcase such one instance, where a tweet is
like:
“@ na ##ren ##dra ##mo ##di @ ami ##ts ##ha ##h @ ra ##hul ##gan ##dhi @ r ##g ##way
##ana ##do ##ffi ##ce its been long time holding my words…”
Even after preprocessing and filtration, such tweets do not provide any particular insight for
classification, hence these tweets fail to add any value to the overall validation and thereafter
impact score generation process. The epoch performances are also relatively poor with higher
loss where these tweets are retrieved and fed by the BERT for validating the classification. For the
above-mentioned tweet, its ID is 139, and the epoch also suffers from a high loss value (loss: 7.77), in
which it is originally entered within the batch. It has affected the overall validation accuracy also.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>5. Conclusion and Future Work</title>
      <p>In this work, we used a finetuned BERT wrapped in a custom Keras module for hate speech
detection in English tweets. We obtained the highest training accuracy of 97.28% and a validation
accuracy of 82.60%. Extending that work, we also identified the existence of hate or non-hate
contents within concerned tweets using a probabilistic impact score generation for every word
in a tweet, as well as with the whole tweet itself. For the further scope of this work, we want to
develop a system capable of delivering the same for code-mixed tweets. Not only that, but we
would also like to relate the hate speech detection to the tweet topics so that a separate scaling
can be done for the topics; to determine which of the topics are more controversial than the
others, and likely to stir up the hate speech, cyberbullying, and other forms of demeaning contents
on the social networks.</p>
      <sec id="sec-4-1">
        <title>Declaration</title>
        <p>The hate speeches shown for the probabilistic impact score demonstrations are for experimental
purposes only. These are real tweets and are collected from the training and/or test data. The
authors DO NOT promote any form of hate content on social networks, and strongly condemn it.
[8] Shah, K., Patel, H., Sanghvi, D. and Shah, M. 2020. A Comparative Analysis of Logistic
Regression, Random Forest and KNN Models for the Text Classification. Augmented Human
Research. 5, 1 (Mar. 2020), 12. https://doi.org/10.1007/s41133-020-00032-0.
[9] Nayak, A., Timmapathini, H., Ponnalagu, K. and Gopalan Venkoparao, V. 2020. Domain
adaptation challenges of BERT in tokenization and sub-word representations of
Out-ofVocabulary words. Proceedings of the First Workshop on Insights from Negative Results in NLP
(Online, Nov. 2020), 1–5.
[10]DeBERTa: Decoding-enhanced BERT with Disentangled Attention:
https://arxiv.org/abs/2006.03654.
[11] Wang, S. and Manning, C.D. 2012. Baselines and bigrams: simple, good sentiment and topic
classification. Proceedings of the 50th Annual Meeting of the Association for Computational
Linguistics: Short Papers - Volume 2 (USA, Jul. 2012), 90–94.
[12] Das, S., Das, D. and Kolya, A.K. 2020. Sentiment classification with GST tweet data on LSTM
based on polarity-popularity model. Sādhanā. 45, 1 (May 2020), 140.
https://doi.org/10.1007/s12046-020-01372-8.
[13] Blei, D.M., Ng, A.Y. and Jordan, M.I. 2003. Latent dirichlet allocation. the Journal of machine
Learning research. 3, (2003), 993–1022.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Djuric</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhou</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Morris</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grbovic</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Radosavljevic</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Bhamidipati</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          <year>2015</year>
          .
          <article-title>Hate Speech Detection with Comment Embeddings</article-title>
          .
          <source>Proceedings of the 24th International Conference on World Wide Web</source>
          (New York, NY, USA, May
          <year>2015</year>
          ),
          <fpage>29</fpage>
          -
          <lpage>30</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Fortuna</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Nunes</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <year>2018</year>
          .
          <article-title>A Survey on Automatic Detection of Hate Speech in Text</article-title>
          .
          <source>ACM Computing Surveys</source>
          .
          <volume>51</volume>
          ,
          <issue>4</issue>
          (Jul.
          <year>2018</year>
          ),
          <volume>85</volume>
          :
          <fpage>1</fpage>
          -
          <lpage>85</lpage>
          :
          <fpage>30</fpage>
          . https://doi.org/10.1145/3232676.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Ayo</surname>
            ,
            <given-names>F.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Folorunso</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ibharalu</surname>
            ,
            <given-names>F.T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Osinuga</surname>
            ,
            <given-names>I.A.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Abayomi-Alli</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <year>2021</year>
          .
          <article-title>A probabilistic clustering model for hate speech classification in twitter</article-title>
          .
          <source>Expert Systems with Applications</source>
          .
          <volume>173</volume>
          ,
          <string-name>
            <surname>(Jul</surname>
          </string-name>
          .
          <year>2021</year>
          ),
          <volume>114762</volume>
          . https://doi.org/10.1016/j.eswa.
          <year>2021</year>
          .
          <volume>114762</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Gao</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Huang</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <year>2017</year>
          .
          <article-title>Detecting Online Hate Speech Using Context Aware Models</article-title>
          .
          <source>Proceedings of the International Conference Recent Advances in Natural Language Processing</source>
          ,
          <article-title>RANLP 2017 (Varna</article-title>
          , Bulgaria, Sep.
          <year>2017</year>
          ),
          <fpage>260</fpage>
          -
          <lpage>266</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Modha</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mandl</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shahi</surname>
            ,
            <given-names>G.K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Madhu</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Satapara</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ranasinghe</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Zampieri</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <year>2021</year>
          .
          <article-title>Overview of the hasoc subtrack at fire 2021: Hate speech and offensive content identification in english and indo-aryan languages and conversational hate speech</article-title>
          .
          <source>FIRE</source>
          <year>2021</year>
          :
          <article-title>Forum for Information Retrieval Evaluation, Virtual Event</article-title>
          ,
          <fpage>13th</fpage>
          -17th
          <source>December</source>
          <year>2021</year>
          (
          <year>2021</year>
          ).
          <article-title>[6]ktrain: A Low-Code Library for Augmented Machine Learning</article-title>
          : https://arxiv.org/abs/
          <year>2004</year>
          .10703.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Mandl</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Modha</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shahi</surname>
            ,
            <given-names>G.K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Madhu</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Satapara</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Majumder</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schäfer</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ranasinghe</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zampieri</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Nandini</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <year>2021</year>
          .
          <article-title>Overview of the HASOC subtrack at FIRE 2021: Hate speech and offensive content identification in English and Indo-Aryan languages</article-title>
          .
          <source>Working Notes of FIRE</source>
          . (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>