<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Forum for Information Retrieval Evaluation, December</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Hatespeech and Offensive Content Detection in Hindi Language using C-BiGRU</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sudharsana Kannan</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jelena Mitrović</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Faculty of Computer Science and Mathematics, CAROLL Research Group, University of Passau</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2021</year>
      </pub-date>
      <volume>1</volume>
      <fpage>3</fpage>
      <lpage>17</lpage>
      <abstract>
        <p>In this paper, we present our submission from the team CAROLL_Passau for subtask 1A of the HASOC 2021 workshop. Our model, C-BiGRU, is composed of a Convolutional Neural Network (CNN) together with a bidirectional Recurrent Neural Network (RNN). We utilized word embeddings to allow our model to capture the correlation between words in the text. The structure of our model enables it to capture contextual information along with long-term dependencies in the text in order to perform binary classification of offensive text. We evaluated our model on the test data provided by the HASOC organizers. Our model achieved a macro F1 score of 75.04%, an accuracy of 77.48%, and precision and recall of 74.63% and 75.60%, respectively.</p>
      </abstract>
      <kwd-group>
        <kwd>Hate speech</kwd>
        <kwd>Offensive language</kwd>
        <kwd>Hindi</kwd>
        <kwd>C-BiGRU</kwd>
        <kwd>Embeddings</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Social media continues to grow into one of the most powerful means of communication
around the world, spreading many forms of user-generated content that carry all sorts
of information [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ]. Although most of the time users of social media can post without any
restrictions, the posts often require content moderation [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] since they can contain offensive
language, abusive messages, or hate speech. The challenges of automatically identifying
and detecting offensiveness in these posts have drawn increasing attention from the scientific
community [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Various studies have been carried out in the Natural Language Processing (NLP)
community to ease the process of analyzing and identifying hate speech in text
and language used online [
        <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
        ]. In an attempt to improve and develop new methodologies
and approaches for hate speech and offensive language detection, various workshops,
conferences, and competitions in the form of shared tasks have been conducted in recent years [
        <xref ref-type="bibr" rid="ref7 ref8 ref9">7, 8, 9</xref>
        ].
In this paper, we present the system with which we participated in a subtask that requires binary classification
of a dataset, provided by the organizers of the challenge, containing hateful and non-hateful
tweets in the Hindi language. We make the following contributions in our paper:
        <list list-type="bullet">
          <list-item><p>A pre-processing strategy for extracting beneficial tweet content for detecting hate speech</p></list-item>
          <list-item><p>A model that is capable of detecting hate speech in multiple languages</p></list-item>
          <list-item><p>Implementation details of the approach utilized to secure a position in the HASOC 2021 challenge</p></list-item>
        </list>
      </p>
      <p>The remainder of the paper is organized as follows. In the second section, we present previous
work related to hate speech detection. We then provide an overview of the baseline model. In the
third section, the system description including the experimental setup of our model C-BiGRU is
presented. In the fourth section, we report our results and discussion for the challenge. Lastly,
we conclude the paper and provide changes and enhancements that can be applied to our
approach in the future.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Background</title>
      <p>
        A variety of ideas and algorithms have been proposed over the years to identify and categorize
offensive language, aggression, hate speech, and various abusive language phenomena on social
media [
        <xref ref-type="bibr" rid="ref10 ref11 ref12">10, 11, 12</xref>
        ]. Most of this work addresses these topics in different languages using different features,
predominantly those derived from the text [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. In this section, we look into different approaches and contributions
related to our work in the field of hate speech detection.
      </p>
      <p>
        A classification methodology based on transfer learning with a CNN was proposed by [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] utilizing
a dataset of Hindi-English code-switched language. The authors first trained
the model on English tweets and then reused the system by retraining the
model on the code-mixed dataset to detect hate speech. [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] presented two different
classifiers, an SVM (Support Vector Machine) and a Random Forest classifier, for hate speech detection
in code-mixed Hindi-English text. The system used various textual features such as
character, word, and lexicon-based features. The authors reported that, of the two classifiers,
the SVM performed better in classifying the tweets. The approach used by [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] involved
data augmentation by utilizing a translation strategy to increase the size of the dataset. They
presented an ensemble of bidirectional GRU (BiGRU) and TF-IDF approaches using fastText
embeddings as their best-performing model to detect aggressive tweets. [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] investigated two
models, namely a sub-word level LSTM model and a hierarchical LSTM model with attention, to
detect hate speech in Hindi-English code-mixed social media text. A comparative study
of aggressive and offensive language detection in three different languages, Hindi, Bangla,
and English, is presented in [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. The authors used SVM and BERT models to perform the
study and reported that the SVM classifier outperformed BERT on the non-English datasets due to
the lack of adequate pre-trained models for such languages. Other recent approaches [
        <xref ref-type="bibr" rid="ref18 ref19 ref20">18, 19, 20</xref>
        ]
to detect hate speech utilize various machine learning and deep learning techniques.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. System Overview</title>
      <p>In this section, we describe our model C-BiGRU along with the pre-processing steps and
embeddings utilized in the system. The architecture of the model is shown in Figure 1.</p>
      <sec>
        <title>3.1. Data</title>
        <p>
          We utilized the data provided by the organizers of the HASOC 2021 workshop [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ]. The dataset
for subtask 1A for the Hindi language contains tweets annotated with the labels 'NOT' (Non
Hate-Offensive) and 'HOF' (Hate and Offensive). The train data consists of 4,594 tweets, of which
3,161 are non-offensive and 1,433 are offensive. The test data consists of 1,532
tweets in total, of which 505 are offensive and 1,027 are non-offensive, labeled in the
same fashion as the train data.
        </p>
      </sec>
      <sec id="sec-3-1">
        <title>3.2. Pre-processing</title>
        <p>Pre-processing of tweets includes the removal of emojis, URLs, and any additional spaces. In
addition, the tweets are converted to lowercase, and HTML character encodings are replaced with
their respective token representation or literal. Apart from this, TweetTokenizer from NLTK is
used to split the tokens containing special characters (e.g. '/', '-').</p>
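        <p>The pre-processing steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used for the submission: the emoji ranges are a rough approximation, the order of the steps is assumed, and a simple regular expression stands in for NLTK's TweetTokenizer so that the example needs only the standard library. The names <monospace>preprocess</monospace>, <monospace>EMOJI_RE</monospace>, <monospace>URL_RE</monospace>, and <monospace>TOKEN_RE</monospace> are hypothetical.</p>
        <preformat preformat-type="code">
```python
import html
import re
from typing import List

# Rough emoji coverage (assumption): common pictograph and symbol blocks only.
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF\uFE0F]")
URL_RE = re.compile(r"https?://\S+|www\.\S+")
# Stand-in for NLTK's TweetTokenizer (assumption): a token is either a run of
# characters that are not whitespace, '/' or '-', or a single '/' or '-'.
TOKEN_RE = re.compile(r"[^\s/\-]+|[/\-]")


def preprocess(tweet: str) -> List[str]:
    """Clean one tweet and return its tokens."""
    text = html.unescape(tweet)       # HTML character entities become literals
    text = URL_RE.sub(" ", text)      # remove URLs
    text = EMOJI_RE.sub(" ", text)    # remove emojis
    text = text.lower()               # no effect on Devanagari, which has no case
    text = " ".join(text.split())     # collapse additional spaces
    return TOKEN_RE.findall(text)     # split off '/' and '-' as separate tokens
```
        </preformat>
        <p>Replacing the regular expression with the actual TweetTokenizer would change how punctuation and user mentions are split, so the token boundaries produced by this sketch should be read as illustrative only.</p>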
      </sec>
      <sec id="sec-3-2">
        <title>3.3. C-BiGRU Model</title>
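        <p>We begin our model construction with the embedding layer. The pre-processed tweets are fixed
to a sequence length of 150 tokens; longer sequences are clipped at the end and shorter sequences
are padded with a masking token. Following this, we create a dictionary containing all unique
tokens that appear more than once and map them to their number of occurrences in the
respective corpus. As a next step, we construct the weighting matrix <inline-formula><tex-math>W^{m \times dim}</tex-math></inline-formula> to form the embedding
layer, where <italic>dim</italic> is the dimension of the fastText [22] embedding model used in our setup and <italic>m</italic> is
the number of unique tokens <inline-formula><tex-math>t_i</tex-math></inline-formula>, <inline-formula><tex-math>i \in \{1, \ldots, m\}</tex-math></inline-formula>. The word vector of <inline-formula><tex-math>t_i</tex-math></inline-formula> is stored in <italic>W</italic> if the token
is present in the embedding model. If <inline-formula><tex-math>t_i</tex-math></inline-formula> has no pre-trained word vector, we generate a
random vector drawn from the uniform distribution within <inline-formula><tex-math>\left[-\sqrt{6/dim},\ \sqrt{6/dim}\right]</tex-math></inline-formula>, as suggested by [23].</p>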
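        <p>The construction of the embedding matrix can be illustrated with a short sketch. It is a simplified, standard-library illustration under assumptions: <monospace>pretrained</monospace> is a hypothetical dictionary standing in for lookups in the fastText model, and the function names and the fixed seed are introduced only for this example.</p>
        <preformat preformat-type="code">
```python
import math
import random
from collections import Counter
from typing import Dict, List


def build_vocab(tokenized_tweets: List[List[str]]) -> Dict[str, int]:
    """Map every token that occurs more than once to its number of occurrences."""
    counts = Counter(tok for tweet in tokenized_tweets for tok in tweet)
    return {tok: n for tok, n in counts.items() if n > 1}


def build_embedding_matrix(
    vocab: Dict[str, int],
    pretrained: Dict[str, List[float]],  # hypothetical stand-in for fastText vectors
    dim: int,
    seed: int = 0,
) -> List[List[float]]:
    """Return the matrix W with one row of length dim per vocabulary token."""
    rng = random.Random(seed)
    limit = math.sqrt(6.0 / dim)  # bound of the uniform distribution
    matrix = []
    for token in vocab:
        vector = pretrained.get(token)
        if vector is None:
            # No pre-trained vector: draw uniformly from [-limit, limit].
            vector = [rng.uniform(-limit, limit) for _ in range(dim)]
        matrix.append(list(vector))
    return matrix
```
        </preformat>
        <p>Tokens that occur only once are left out of the dictionary and therefore have no row in the matrix; how such tokens are mapped at inference time is not covered by this sketch.</p>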
        <p>Next in line is the convolutional layer, where n-gram features are extracted from the sequence
of tokens. The (k × 128) 1-dimensional filters in this layer perform the
feature extraction. The value of k varies from 2 to 5 and represents different window sizes.
The outputs produced by this layer all have the same sequence length, which is achieved
through padding. Furthermore, we make use of the ReLU activation function. The resulting
feature maps are concatenated and then forwarded to the recurrent layer.</p>
        <p>GRUs, an improved variant of standard recurrent neural networks, aim to solve
the vanishing gradient problem using a gating mechanism. Originally proposed by [24], they
are used to capture long-term dependencies in an input sequence. GRUs have been shown to
achieve results practically identical to those of LSTMs in sequence modeling tasks, and they have
been reported to outperform LSTMs when trained on smaller datasets [25]. In our model, we
use a bidirectional GRU as the recurrent layer. One GRU layer receives the
output of the previous layer as input, while the other GRU layer receives the same output in
reversed order. The GRU layers return hidden states for each processed feature
map, and the hidden states from both layers are concatenated. Setting the hidden size of both
GRU layers to 64 yields an output of size 150 × 128.</p>
        <p>The output from the previous layer is then passed through a global max-pooling layer, which
reduces the output space to (1 × 128) nodes. Finally, a fully connected layer with 32 neurons
connects to a single output neuron that uses the sigmoid activation function. In order
to prevent overfitting, a dropout layer with a rate of 0.2 is introduced before the single output
neuron. In addition, another dropout layer with a rate of 0.2 is included after the embedding
layer.</p>
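        <p>To make the flow of tensor shapes through the layers easier to follow, the sketch below traces the output shape of each layer for a single tweet. It is an illustration under assumptions rather than the model code: it assumes 128 filters for each of the four window sizes, and the embedding dimension is left as a parameter because it is not stated in the text. The dropout layers are omitted since they do not change shapes. All names are hypothetical.</p>
        <preformat preformat-type="code">
```python
from typing import List, Tuple

SEQ_LEN = 150               # tokens per tweet
FILTERS = 128               # filters per window size (assumed reading of "k x 128")
WINDOW_SIZES = (2, 3, 4, 5)
GRU_UNITS = 64              # hidden size per direction
DENSE_UNITS = 32


def padded_conv_length(seq_len: int, k: int) -> int:
    """Output length of a stride-1 convolution with k - 1 padded positions."""
    return seq_len + (k - 1) - k + 1


def trace_shapes(embedding_dim: int) -> List[Tuple[str, Tuple[int, ...]]]:
    """List (layer name, output shape) pairs for one input tweet."""
    shapes: List[Tuple[str, Tuple[int, ...]]] = [("embedding", (SEQ_LEN, embedding_dim))]
    for k in WINDOW_SIZES:
        # Padding keeps the sequence length identical for every window size.
        shapes.append((f"conv1d_k{k}", (padded_conv_length(SEQ_LEN, k), FILTERS)))
    # Feature maps are concatenated along the channel axis.
    shapes.append(("concatenate", (SEQ_LEN, FILTERS * len(WINDOW_SIZES))))
    # Forward and backward hidden states are concatenated at every time step.
    shapes.append(("bigru", (SEQ_LEN, 2 * GRU_UNITS)))
    # Global max-pooling keeps the maximum over the time axis per channel.
    shapes.append(("global_max_pool", (2 * GRU_UNITS,)))
    shapes.append(("dense", (DENSE_UNITS,)))
    shapes.append(("output_sigmoid", (1,)))
    return shapes
```
        </preformat>
        <p>Under these assumptions, the recurrent layer receives 512 channels per time step and returns the 150 × 128 output described above, which the pooling layer reduces to 128 values.</p>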
        <p>Moreover, we used cross-entropy as the error function for our model and the Adam optimizer to
update our network weights [26]. During the training phase, early stopping is applied, and
we split the data so that 90% is used for training and 10% serves as the validation dataset.
The batch size of the gradient update is set to 32, with a maximum of 5 epochs.</p>
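        <p>The data split and the early-stopping rule can be sketched as follows. The text does not state the stopping criterion, so the sketch assumes that training stops once the validation loss has not improved for <monospace>patience</monospace> consecutive epochs; this parameter, the shuffling seed, and both function names are hypothetical.</p>
        <preformat preformat-type="code">
```python
import random
from typing import List, Sequence, Tuple


def train_val_split(items: Sequence, val_fraction: float = 0.1, seed: int = 0) -> Tuple[list, list]:
    """Shuffle the items and return (train, validation) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_val = int(round(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]


def epochs_to_run(val_losses: List[float], max_epochs: int = 5, patience: int = 1) -> int:
    """Return the number of epochs executed before training ends.

    val_losses[i] is the validation loss measured after epoch i + 1.
    """
    best = float("inf")
    waited = 0
    epochs = 0
    for loss in val_losses[:max_epochs]:
        epochs += 1
        if best > loss:
            # Validation loss improved: remember it and reset the counter.
            best = loss
            waited = 0
        else:
            waited += 1
            if waited >= patience:
                break
    return epochs
```
        </preformat>
        <p>With the 4,594 training tweets, a 10% split of this kind sets aside 459 tweets for validation and leaves 4,135 for training.</p>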
      </sec>
      <sec id="sec-3-3">
        <title>3.4. Baseline</title>
        <p>To compare the performance of our model with a baseline setup, we used a Logistic Regression
classifier. The model was evaluated on the validation dataset after carrying out
the pre-processing steps described in section 3.2. On the validation data, the baseline
achieved a macro F1 score of 70.03%, an accuracy of 75.87%, and precision and recall
of 64.04% and 51.05%, respectively. Afterward, the system was evaluated on the test
data and achieved a macro F1 score of 73.46%, an accuracy of 78.13%, and precision and recall of
76.14% and 72.22%, respectively. The scores of the baseline model are compared with our
C-BiGRU system in the following section.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Results and Discussion</title>
      <p>To study the performance of our C-BiGRU model, we conducted experiments on
Hindi hate speech detection using the train and test data described in section 3.1. The
model was trained and then evaluated on the validation data, which is 10% of the train
data. During the validation phase, the system achieved an accuracy of 76.30% and recall, precision,
and macro F1 scores of 72.61%, 61.49%, and 63.64%, respectively. Following this, the model was
evaluated on the test data, where it achieved a macro F1 score of 75.04%, an accuracy of 77.48%,
and precision and recall of 74.63% and 75.60%, respectively. The results are displayed
in Table 1. Our C-BiGRU model performed the classification task successfully and surpassed
the baseline model in macro F1 (75.04% vs. 73.46%) and recall (75.60% vs. 72.22%) on the test data,
while the baseline obtained slightly higher accuracy and precision. The model also showed consistent F1
scores during the validation phase and test phase.</p>
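      <p>For reference, the evaluation measures reported above can be computed as in the following sketch. It assumes that precision and recall are macro-averaged over the two labels in the same way as the F1 score, which the text does not state explicitly; the function name and the layout of the returned dictionary are hypothetical.</p>
      <preformat preformat-type="code">
```python
from typing import Dict, List, Sequence


def binary_report(
    y_true: List[str],
    y_pred: List[str],
    labels: Sequence[str] = ("HOF", "NOT"),
) -> Dict[str, float]:
    """Compute accuracy and macro-averaged precision, recall, and F1."""
    pairs = list(zip(y_true, y_pred))
    per_class = []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_class.append((precision, recall, f1))
    n = len(labels)
    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
        "macro_precision": sum(c[0] for c in per_class) / n,
        "macro_recall": sum(c[1] for c in per_class) / n,
        # Macro F1 is the unweighted mean of the per-class F1 scores.
        "macro_f1": sum(c[2] for c in per_class) / n,
    }
```
      </preformat>
      <p>Because the macro average weights both classes equally, it is less dominated by the larger 'NOT' class than accuracy is, which is one reason the two measures can rank systems differently.</p>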
      <p>An overview of the results and findings of HASOC 2021 is presented in [27].
This experiment helped us evaluate the performance of the C-BiGRU model in another language,
as a follow-up to its previously reported strong performance in other languages, namely
English, German, Danish, and Turkish [28, 29, 30].</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion and Future work</title>
      <p>In this submission, we described our approach to detecting hate speech in social
media posts and presented the architecture of our model. We further reported the results
of our model on the Hindi language, where it achieved a macro F1 score of 75.04% on the test data. Although our
model performed well, it has a few limitations, such as the handling of unknown tokens and
distinguishing between explicit and implicit hate speech, which is an important task that is not
easy to solve, as explained and investigated by [31, 32].</p>
      <p>In the future, we plan to extend our approach to identify rhetorical figures and
multi-word expressions containing abusive language. We will also work on improving the
model to identify the fine-grained difference between implicit and explicit offensive posts. Apart
from this, domain-specific word embeddings can help in handling unknown tokens. Other
potential features such as POS (Part of Speech) tagging will also be explored.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>The project on which this report is based was funded by the German Federal Ministry of
Education and Research (BMBF) under the funding code 01|S20049. The author is responsible
for the content of this publication.</p>
      <p>inghe, M. Zampieri, D. Nandini, A. K. Jaiswal, Overview of the HASOC subtrack at FIRE
2021: Hate Speech and Offensive Content Identification in English and Indo-Aryan
Languages, in: Working Notes of FIRE 2021 - Forum for Information Retrieval Evaluation,
CEUR, 2021. URL: http://ceur-ws.org/.</p>
      <p>[22] A. Joulin, E. Grave, P. Bojanowski, T. Mikolov, Bag of tricks for efficient text classification,
in: Proceedings of the 15th Conference of the European Chapter of the Association
for Computational Linguistics: Volume 2, Short Papers, Association for Computational
Linguistics, Valencia, Spain, 2017, pp. 427–431. URL: https://aclanthology.org/E17-2068.</p>
      <p>[23] K. He, X. Zhang, S. Ren, J. Sun, Delving deep into rectifiers: Surpassing human-level
performance on ImageNet classification, in: Proceedings of the IEEE International
Conference on Computer Vision, 2015, pp. 1026–1034.</p>
      <p>[24] K. Cho, B. van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, Y. Bengio,
Learning phrase representations using RNN encoder–decoder for statistical machine
translation, in: Proceedings of the 2014 Conference on Empirical Methods in Natural Language
Processing (EMNLP), Association for Computational Linguistics, Doha, Qatar, 2014, pp.
1724–1734. URL: https://aclanthology.org/D14-1179. doi:10.3115/v1/D14-1179.</p>
      <p>[25] J. Chung, Ç. Gülçehre, K. Cho, Y. Bengio, Empirical evaluation of gated recurrent
neural networks on sequence modeling, arXiv preprint arXiv:1412.3555 (2014).</p>
      <p>[26] D. P. Kingma, J. Ba, Adam: A method for stochastic optimization, arXiv preprint
arXiv:1412.6980 (2014).</p>
      <p>[27] S. Modha, T. Mandl, G. K. Shahi, H. Madhu, S. Satapara, T. Ranasinghe, M. Zampieri,
Overview of the HASOC Subtrack at FIRE 2021: Hate Speech and Offensive Content
Identification in English and Indo-Aryan Languages and Conversational Hate Speech, in:
FIRE 2021: Forum for Information Retrieval Evaluation, Virtual Event, 13th-17th December
2021, ACM, 2021.</p>
      <p>[28] J. Mitrović, B. Birkeneder, M. Granitzer, nlpUP at SemEval-2019 task 6: A deep neural
language model for offensive language detection, in: Proceedings of the 13th
International Workshop on Semantic Evaluation, Association for Computational Linguistics,
Minneapolis, Minnesota, USA, 2019, pp. 722–726. URL: https://aclanthology.org/S19-2127.
doi:10.18653/v1/S19-2127.</p>
      <p>[29] B. Birkeneder, J. Mitrović, J. Niemeier, L. Teubert, S. Handschuh, upInf - Offensive
language detection in German tweets, in: 14th Conference on Natural Language Processing
KONVENS 2018, 2018.</p>
      <p>[30] O. Hussein, H. Sfar, J. Mitrović, M. Granitzer, NLP_Passau at SemEval-2020 task 12:
Multilingual neural network for offensive language detection in English, Danish and
Turkish, in: Proceedings of the Fourteenth Workshop on Semantic Evaluation, International
Committee for Computational Linguistics, Barcelona (online), 2020, pp. 2090–2097. URL:
https://aclanthology.org/2020.semeval-1.277.</p>
      <p>[31] T. Caselli, V. Basile, J. Mitrović, I. Kartoziya, M. Granitzer, I feel offended, don’t be abusive!
Implicit/explicit messages in offensive and abusive language, in: Proceedings of the 12th
Language Resources and Evaluation Conference, 2020, pp. 6193–6202.</p>
      <p>[32] T. Caselli, V. Basile, J. Mitrović, M. Granitzer, HateBERT: Retraining BERT for abusive language
detection in English, arXiv preprint arXiv:2010.12472 (2020).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>C.</given-names>
            <surname>Nobata</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Tetreault</surname>
          </string-name>
          , A. Thomas,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Mehdad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <article-title>Abusive language detection in online user content</article-title>
          ,
          <source>in: Proceedings of the 25th International Conference on World Wide Web, WWW '16, International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, CHE</source>
          ,
          <year>2016</year>
          , p.
          <fpage>145</fpage>
          -
          <lpage>153</lpage>
          . URL: https://doi.org/10.1145/2872427.2883062. doi:10.1145/2872427.2883062.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>P.</given-names>
            <surname>Badjatiya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Varma</surname>
          </string-name>
          ,
          <article-title>Deep learning for hate speech detection in tweets</article-title>
          ,
          <source>Proceedings of the 26th International Conference on World Wide Web Companion - WWW '17 Companion</source>
          (
          <year>2017</year>
          ). URL: http://dx.doi.org/10.1145/3041021.3054223. doi:10.1145/3041021.3054223.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>T.</given-names>
            <surname>Gillespie</surname>
          </string-name>
          ,
          <article-title>Content moderation, AI, and the question of scale</article-title>
          ,
          <source>Big Data &amp; Society</source>
          <volume>7</volume>
          (
          <year>2020</year>
          )
          <article-title>2053951720943234</article-title>
          . URL: https://doi.org/10.1177/2053951720943234. doi:10.1177/2053951720943234.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Mozafari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Farahbakhsh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Crespi</surname>
          </string-name>
          ,
          <article-title>Hate speech detection and racial bias mitigation in social media based on BERT model</article-title>
          ,
          <source>PLOS ONE 15</source>
          (
          <year>2020</year>
          )
          <article-title>e0237861</article-title>
          . URL: http://dx.doi.org/10.1371/journal.pone.0237861. doi:10.1371/journal.pone.0237861.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>M.</given-names>
            <surname>Corazza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Menini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Cabrio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Tonelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Villata</surname>
          </string-name>
          ,
          <article-title>A multilingual evaluation for online hate speech detection</article-title>
          ,
          <source>ACM Trans. Internet Technol</source>
          .
          <volume>20</volume>
          (
          <year>2020</year>
          ). URL: https://doi.org/10.1145/3377323. doi:10.1145/3377323.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>A.</given-names>
            <surname>Schmidt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegand</surname>
          </string-name>
          ,
          <article-title>A survey on hate speech detection using natural language processing</article-title>
          ,
          <source>in: Proceedings of the Fifth International Workshop on Natural Language Processing for Social Media</source>
          , Association for Computational Linguistics, Valencia, Spain,
          <year>2017</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>10</lpage>
          . URL: https://aclanthology.org/W17-1101. doi:10.18653/v1/W17-1101.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M.</given-names>
            <surname>Zampieri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Rosenthal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Atanasova</surname>
          </string-name>
          , G. Karadzhov,
          <string-name>
            <given-names>H.</given-names>
            <surname>Mubarak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Derczynski</surname>
          </string-name>
          ,
          Z. Pitenis, Ç. Çöltekin,
          <article-title>SemEval-2020 Task 12: Multilingual Offensive Language Identification in Social Media (OffensEval 2020)</article-title>
          , in: Proceedings of SemEval,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <surname>A. Kumar</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>B. R.</given-names>
            <surname>Chakravarthi</surname>
          </string-name>
          ,
          <article-title>Overview of the HASOC track at FIRE 2020: Hate speech and offensive language identification in Tamil, Malayalam, Hindi, English and German</article-title>
          ,
          <source>in: Forum for Information Retrieval Evaluation, FIRE 2020</source>
          , Association for Computing Machinery
          , New York, NY, USA,
          <year>2020</year>
          , p.
          <fpage>29</fpage>
          -
          <lpage>32</lpage>
          . URL: https://doi.org/10.1145/3441501.3441517. doi:10.1145/3441501.3441517.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>B.</given-names>
            <surname>Cristina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Felice</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Poletto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Manuela</surname>
          </string-name>
          , T. Maurizio,
          <article-title>Overview of the EVALITA Hate Speech Detection (HaSpeeDe) Task</article-title>
          , in: T. Caselli,
          <string-name>
            <given-names>N.</given-names>
            <surname>Novielli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Patti</surname>
          </string-name>
          , P. Rosso (Eds.),
          <article-title>Proceedings of the 6th evaluation campaign of Natural Language Processing and Speech tools for Italian (EVALITA'18), CEUR</article-title>
          .org, Turin, Italy,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>T.</given-names>
            <surname>Davidson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Warmsley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Macy</surname>
          </string-name>
          ,
          <string-name>
            <surname>I. Weber</surname>
          </string-name>
          ,
          <article-title>Automated hate speech detection and the problem of offensive language</article-title>
          ,
          <source>in: ICWSM</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M.</given-names>
            <surname>Mozafari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Farahbakhsh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Crespi</surname>
          </string-name>
          ,
          <article-title>A BERT-based transfer learning approach for hate speech detection in online social media</article-title>
          , in: H.
          <string-name>
            <surname>Cherifi</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Gaito</surname>
            ,
            <given-names>J. F.</given-names>
          </string-name>
          <string-name>
            <surname>Mendes</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          <string-name>
            <surname>Moro</surname>
            ,
            <given-names>L. M.</given-names>
          </string-name>
          <string-name>
            <surname>Rocha</surname>
          </string-name>
          (Eds.),
          <source>Complex Networks and Their Applications VIII</source>
          , Springer International Publishing, Cham,
          <year>2020</year>
          , pp.
          <fpage>928</fpage>
          -
          <lpage>940</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>J. H.</given-names>
            <surname>Park</surname>
          </string-name>
          , P. Fung,
          <article-title>One-step and two-step classification for abusive language detection on Twitter</article-title>
          ,
          <source>in: Proceedings of the First Workshop on Abusive Language Online</source>
          , Association for Computational Linguistics, Vancouver, BC, Canada,
          <year>2017</year>
          , pp.
          <fpage>41</fpage>
          -
          <lpage>45</lpage>
          . URL: https://aclanthology.org/W17-3006. doi:10.18653/v1/W17-3006.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>P.</given-names>
            <surname>Mathur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Shah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Sawhney</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Mahata</surname>
          </string-name>
          ,
          <article-title>Detecting offensive tweets in Hindi-English code-switched language</article-title>
          , in:
          <source>Proceedings of the Sixth International Workshop on Natural Language Processing for Social Media</source>
          , Association for Computational Linguistics, Melbourne, Australia,
          <year>2018</year>
          , pp.
          <fpage>18</fpage>
          -
          <lpage>26</lpage>
          . URL: https://aclanthology.org/W18-3504. doi:10.18653/v1/W18-3504.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>A.</given-names>
            <surname>Bohra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Vijay</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. S.</given-names>
            <surname>Akhtar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Shrivastava</surname>
          </string-name>
          ,
          <article-title>A dataset of Hindi-English code-mixed social media text for hate speech detection</article-title>
          , in:
          <source>Proceedings of the Second Workshop on Computational Modeling of People's Opinions, Personality, and Emotions in Social Media</source>
          , Association for Computational Linguistics, New Orleans, Louisiana, USA,
          <year>2018</year>
          , pp.
          <fpage>36</fpage>
          -
          <lpage>41</lpage>
          . URL: https://aclanthology.org/W18-1105. doi:10.18653/v1/W18-1105.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>J.</given-names>
            <surname>Risch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Krestel</surname>
          </string-name>
          ,
          <article-title>Aggression identification using deep learning and data augmentation</article-title>
          , in:
          <source>Proceedings of the First Workshop on Trolling, Aggression and Cyberbullying (TRAC2018)</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>150</fpage>
          -
          <lpage>158</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>T. Y.</given-names>
            <surname>Santosh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. V.</given-names>
            <surname>Aravind</surname>
          </string-name>
          ,
          <article-title>Hate speech detection in Hindi-English code-mixed social media text</article-title>
          , in:
          <source>Proceedings of the ACM India Joint International Conference on Data Science and Management of Data</source>
          , CoDS-COMAD '19, Association for Computing Machinery, New York, NY, USA,
          <year>2019</year>
          , pp.
          <fpage>310</fpage>
          -
          <lpage>313</lpage>
          . URL: https://doi.org/10.1145/3297001.3297048. doi:10.1145/3297001.3297048.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>R.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Lahiri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. K.</given-names>
            <surname>Ojha</surname>
          </string-name>
          ,
          <article-title>Aggressive and offensive language identification in Hindi, Bangla, and English: A comparative study</article-title>
          ,
          <source>SN Computer Science</source>
          <volume>2</volume>
          (
          <year>2021</year>
          )
          26
          . URL: https://doi.org/10.1007/s42979-020-00414-6. doi:10.1007/s42979-020-00414-6.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>A.</given-names>
            <surname>Baruah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. A.</given-names>
            <surname>Das</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. A.</given-names>
            <surname>Barbhuiya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Dey</surname>
          </string-name>
          ,
          <article-title>IIITG-ADBU@HASOC-Dravidian-CodeMix-FIRE2020: Offensive content detection in code-mixed Dravidian text</article-title>
          ,
          <source>CoRR</source>
          abs/2107.14336 (
          <year>2021</year>
          ). URL: https://arxiv.org/abs/2107.14336. arXiv:2107.14336.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>P. K.</given-names>
            <surname>Roy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. K.</given-names>
            <surname>Tripathy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. K.</given-names>
            <surname>Das</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.-Z.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <article-title>A framework for hate speech detection using deep convolutional neural network</article-title>
          ,
          <source>IEEE Access</source>
          <volume>8</volume>
          (
          <year>2020</year>
          )
          <fpage>204951</fpage>
          -
          <lpage>204962</lpage>
          . doi:10.1109/ACCESS.2020.3037073.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>R.</given-names>
            <surname>Raj</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Srivastava</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Saumya</surname>
          </string-name>
          ,
          <article-title>NSIT &amp; IIITDWD @ HASOC 2020: Deep learning model for hate-speech identification in Indo-European languages</article-title>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. K.</given-names>
            <surname>Shahi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Madhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Satapara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Majumder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schäfer</surname>
          </string-name>
          , T. Ranas-
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>