<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>HASOCOne@FIRE-HASOC2020: Using BERT and Multilingual BERT models for Hate Speech Detection</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Suman Dowlagar</string-name>
          <email>suman.dowlagar@research.iiit.ac.in</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Radhika Mamidi</string-name>
          <email>radhika.mamidi@iiit.ac.in</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>International Institute of Information Technology - Hyderabad (IIIT-Hyderabad)</institution>
          ,
          <addr-line>Gachibowli, Hyderabad, Telangana</addr-line>
          ,
          <country country="IN">India</country>
          ,
          <addr-line>500032</addr-line>
        </aff>
      </contrib-group>
      <abstract>
<p>Hateful and toxic content has become a significant concern in today's world due to the exponential rise of social media. The increase in hate speech and harmful content has motivated researchers to dedicate substantial efforts to the challenging task of hateful content identification. In this task, we propose an approach to automatically classify hate speech and offensive content. We used the datasets obtained from the FIRE 2019 and 2020 shared tasks and performed experiments that take advantage of transfer learning models. We observed that the pre-trained BERT model and the multilingual BERT model gave the best results. The code is made publicly available at https://github.com/suman101112/hasoc-fire-2020</p>
      </abstract>
      <kwd-group>
        <kwd>Hate speech</kwd>
        <kwd>offensive content</kwd>
        <kwd>label classification</kwd>
        <kwd>transfer learning</kwd>
        <kwd>BERT</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
<p>Nowadays, people frequently use social media platforms to communicate their opinions and share information. Although communication among users can lead to constructive conversations, these platforms' anonymity features have increasingly exposed people to hateful and offensive content, and the threat of abuse and harassment has made many people stop expressing themselves. According to the Cambridge Dictionary, hate speech and offensive content are defined as:
• To harass and cause lasting pain by attacking something uniquely dear to the target.
• To use words that are considered insulting by most people.</p>
      <p>The main obstacle with hate speech is that it is difficult to classify based on a single sentence: most hate speech has context attached to it and can morph into many different shapes depending on that context. Another obstacle is that humans cannot always agree on what should be classified as hate speech, so it is not easy to create a universal machine learning algorithm that detects it. Moreover, the datasets used to train models tend to "reflect the majority view of the people who collected or labeled the data".</p>
      <p>To deal with the above scenarios and to encourage research on hate speech and offensive content, the NLP community has organized several tasks and workshops, such as SemEval-2020 Task 12 (OffensEval 2: Multilingual Offensive Language Identification in Social Media) and the OSACT4 shared task on offensive content detection (http://edinburghnlp.inf.ed.ac.uk/workshops/OSACT4/). Similarly, the FIRE 2020 HASOC shared task was devoted to Hate Speech and Offensive Content Identification in Indo-European Languages; the task aims to classify the given annotated tweets. This paper presents state-of-the-art BERT transfer learning models for the automated detection of hate speech and offensive content.</p>
      <p>The paper is organized as follows. Section 2 provides related work on hate speech and offensive content detection. Section 3 describes the methodology used for this task. Section 4 presents the experimental setup. Section 5 reports the performance of the models, Section 6 presents an error analysis, and Section 7 concludes our work.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        Machine learning and natural language processing approaches have made a breakthrough in detecting hate speech on web platforms. Many scientific studies have been dedicated to using Machine Learning (ML) [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ] and Deep Learning (DL) [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ] methods for automated hate speech and offensive content detection. The features used in traditional machine learning approaches are word-level and character-level n-grams, among others. Although supervised machine learning approaches have used different text-mining features such as surface features, sentiment analysis, lexical resources, linguistic features, knowledge-based features, or user- and platform-based metadata, they necessitate a well-defined feature extraction approach. Nowadays, neural network models apply text representation and deep learning approaches such as Convolutional Neural Networks (CNNs) [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], Bi-directional Long Short-Term Memory networks (BiLSTMs) [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], and BERT [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] to improve the performance of hate speech and offensive content detection models.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <p>
        Here, we use the pre-trained BERT transformer model for hate speech and ofensive content
detection. Figure 1 depicts the abstract view of BERT model that is used for hate speech detection and
ofensive language identification. Bidirectional Encoder Representations from Transformers (BERT)
is a transformer Encoder stack trained on the large English corpus. It has 2 models,   and
  . These model sizes have a large number of transformer layers. The   version has
12 transformer layers and the   has 24. These also have larger feed-forward networks with
768 and 1024 hidden representations, and attention heads are 12 and 16 for the respective models.
Like the vanilla transformer model [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], BERT takes a sequence of words as input. Each layer applies
self-attention, passes its results through a feed-forward network, and then hands it of to the next
encoder. Embeddings from   have 768 hidden units. The BERT configuration model takes a
sequence of words/tokens at a maximum length of 512 and produces an encoded representation of
dimensionality 768.
      </p>
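The layer counts, hidden sizes, and attention heads above can be collected into a small lookup table. The values below follow the published BERT configurations; the dictionary layout itself is just an illustrative sketch.

```python
# Published configurations of the two BERT variants described above.
# Both accept input sequences of up to 512 tokens.
BERT_CONFIGS = {
    "bert-base":  {"layers": 12, "hidden_size": 768,  "attention_heads": 12, "max_tokens": 512},
    "bert-large": {"layers": 24, "hidden_size": 1024, "attention_heads": 16, "max_tokens": 512},
}
```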
      <p>The pre-trained BERT models have better word representations as they are trained on large Wikipedia and book corpora. Because the pre-trained BERT model is trained on generic corpora, we need to fine-tune it for downstream tasks. During fine-tuning, the pre-trained BERT model parameters are updated while training on the labeled hate speech and offensive content dataset. When fine-tuning on the downstream sentence classification task, very few changes are applied to the BERT-base configuration. In this architecture, only the [CLS] (classification) token output provided by BERT is used. The [CLS] output is the output of the 12th transformer encoder with a dimensionality of 768. It is given as input to a fully connected neural network, and the softmax activation function is applied to classify the given sentence. Thus, BERT learns to predict whether a tweet can be classified as hate speech or offensive content. Apart from the BERT-base model, we used the pre-trained multilingual BERT-base model, as our data included the German and Hindi languages. The multilingual BERT and vanilla BERT architectures are the same, but the pre-trained multilingual BERT model is trained on multilingual Wikipedia sources.</p>
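The classification head described above can be sketched in NumPy: the 768-dimensional [CLS] vector from the final encoder layer passes through a fully connected layer and a softmax over the two sub-task A classes. The weights here are random stand-ins for the fine-tuned parameters, purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the [CLS] output of the 12th BERT-base encoder layer (768 dims).
cls_output = rng.standard_normal(768)

# Fully connected layer mapping 768 dims -> 2 classes (NOT vs. HOF);
# in the real model these weights are learned during fine-tuning.
W = rng.standard_normal((2, 768)) * 0.02
b = np.zeros(2)

def softmax(x):
    e = np.exp(x - x.max())  # subtract max for numerical stability
    return e / e.sum()

probs = softmax(W @ cls_output + b)
label = ["NOT", "HOF"][int(np.argmax(probs))]
```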
    </sec>
    <sec id="sec-4">
      <title>4. Experiment</title>
      <p>We first introduce the datasets used and the task description, and then review the BERT model's performance on hate speech and offensive content detection. We also include our implementation details and error analysis in the subsequent sections.</p>
      <sec id="sec-4-1">
        <title>4.1. Dataset</title>
        <p>We used the datasets provided by the organizers of HASOC FIRE-2020 [9] and FIRE-2019 [10]. The HASOC dataset was sampled from Twitter and partially from Facebook for the English, German, and Hindi languages. The tweets were acquired using hashtags and keywords that contained offensive content. The statistics of the FIRE 2020 and 2019 datasets are given in Table 1.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Task description</title>
        <sec id="sec-4-2-1">
          <title>The following tasks are in HASOC 2020.</title>
          <p>Sub-task A focuses on coarse-grained hate speech detection in all three languages. The task is to classify tweets into two classes:
• (NOT) Non Hate-Offensive: the post does not contain any hate speech or profane, offensive content.
• (HOF) Hate and Offensive: the post contains hate, offensive, or profane content.</p>
          <p>Sub-task B represents a fine-grained classification. Hate speech and offensive posts from sub-task A are further classified into three classes:
• (HATE) Hate speech: the post contains hate speech content.
• (OFFN) Offensive: the post contains offensive content such as insulting, degrading, dehumanizing, or threatening language.
• (PRFN) Profane: the post contains profane words. This typically concerns the usage of swearwords and cursing.</p>
        </sec>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Implementation</title>
        <p>For the implementation, we used the Transformers library provided by HuggingFace [11]. The HuggingFace Transformers package is a Python library providing pre-trained and configurable transformer models useful for a variety of NLP tasks. It contains the pre-trained BERT and multilingual BERT models, among others suitable for downstream tasks. As the implementation environment, we used the PyTorch library, which supports GPU processing. The BERT models were run on an NVIDIA RTX 2070 graphics card with 8 GB of memory. We trained our classifier with a batch size of 64 for 5 to 10 epochs, depending on the experiment. The dropout is set to 0.1, and the Adam optimizer is used with a learning rate of 2e-5. We used the HuggingFace pre-trained BERT tokenizer for tokenization and the BertForSequenceClassification module provided by the library for fine-tuning and sequence classification.</p>
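A minimal sketch of the setup described above, using the HuggingFace Transformers API. The hyperparameters are the ones reported in this section; the checkpoint name and the helper function are illustrative assumptions, and the multilingual runs would swap in a multilingual BERT checkpoint instead.

```python
# Hyperparameters reported above.
HPARAMS = {"batch_size": 64, "dropout": 0.1, "learning_rate": 2e-5, "epochs": (5, 10)}

def build_classifier(model_name="bert-base-uncased", num_labels=2):
    """Load the tokenizer, a BertForSequenceClassification model, and an Adam
    optimizer configured with the paper's reported hyperparameters.

    Requires the `transformers` and `torch` packages (imported lazily so the
    sketch itself runs without them installed).
    """
    import torch
    from transformers import BertForSequenceClassification, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained(model_name)
    model = BertForSequenceClassification.from_pretrained(
        model_name,
        num_labels=num_labels,              # 2 for sub-task A, 3 for sub-task B
        hidden_dropout_prob=HPARAMS["dropout"],
    )
    optimizer = torch.optim.Adam(model.parameters(), lr=HPARAMS["learning_rate"])
    return tokenizer, model, optimizer
```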
      </sec>
      <sec id="sec-4-4">
        <title>4.4. Baseline models</title>
        <p>Here, we compare the BERT model with other machine learning algorithms.</p>
        <sec id="sec-4-4-1">
          <title>4.4.1. SVM with TF-IDF text representation</title>
          <p>We chose Support Vector Machines (SVM) with TF-IDF features for hate speech and offensive content detection. The tokenizer used is SentencePiece [12], a commonly used technique to segment words into subwords. The vocabulary is initialized with all the individual characters in the language, and then the most frequent or likely combinations of symbols are iteratively added to the vocabulary.</p>
        </sec>
        <sec id="sec-4-4-2">
          <title>4.4.2. ELMO embeddings with SVM model</title>
          <p>ELMO (Embeddings from Language Models) [13] provides contextual word embeddings, which capture a word's meaning in its context. Instead of using a fixed embedding for each word, ELMO looks at the word's context, i.e., the entire sentence, before assigning an embedding to the word. It uses a bi-LSTM trained on a specific task to create those embeddings. We used the ELMO model available on TensorFlow Hub (https://tfhub.dev/google/elmo/2) to obtain ELMO embeddings on the hate speech data for all the languages. After obtaining the embeddings, we take their mean and apply an SVM classifier to classify the given sentence as hate speech or offensive content. We again used the SentencePiece tokenizer.</p>
        </sec>
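The first baseline above (TF-IDF features feeding an SVM) can be sketched with scikit-learn. Note two assumptions: the default word-level analyzer stands in for the SentencePiece tokenizer the paper uses, and the four toy tweets and labels are invented for illustration, not drawn from the HASOC data.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy stand-ins for HASOC tweets and their sub-task A labels.
texts = [
    "have a lovely day everyone",
    "you are a complete idiot",
    "great match last night",
    "shut up you pathetic fool",
]
labels = ["NOT", "HOF", "NOT", "HOF"]

# TF-IDF features over word uni- and bi-grams, fed to a linear SVM.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
clf.fit(texts, labels)

prediction = clf.predict(["what an idiot"])[0]
```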
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Results</title>
      <p>The results are tabulated in Tables 2, 3, and 4. We evaluated the performance of each method using macro F1 and accuracy. The BERT model performed well compared to the SVM with TF-IDF and ELMO text representations. Across all the languages and both subtasks A and B, we observed an increase of 1-2% in classification metrics for the ELMO embeddings + SVM classifier compared to the baseline SVM classifier. However, BERT showed an increase of 5-7% in classification metrics compared to the ELMO and SVM models. This shows the pre-trained BERT model's capability, having learned better text representations from generic data. The state-of-the-art transformer architecture used in the BERT model helped it learn better parameter weights for hate speech and offensive content detection.</p>
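Macro F1 and accuracy, the two metrics used above, can be computed with scikit-learn. The gold and predicted labels below are invented toy values, not the paper's outputs.

```python
from sklearn.metrics import accuracy_score, f1_score

# Toy gold/predicted sub-task A labels (illustrative only).
gold = ["NOT", "HOF", "HOF", "NOT", "HOF", "NOT"]
pred = ["NOT", "HOF", "NOT", "NOT", "HOF", "HOF"]

# Macro F1 averages the per-class F1 scores, weighting both classes equally,
# which matters when the NOT/HOF classes are imbalanced.
macro_f1 = f1_score(gold, pred, average="macro")
accuracy = accuracy_score(gold, pred)
```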
    </sec>
    <sec id="sec-6">
      <title>6. Error Analysis</title>
      <p>The confusion matrices of the BERT model for subtasks A and B on the English, German, and Hindi datasets are given in Figure 2. For binary classification, the best-performing model was for English subtask A. The binary classification for the Hindi model is not helpful: the model misclassified most of the hate speech labels, as can be seen in subfigure 2(e). For offensive content evaluation, the model performed better on English subtask B. It correctly classified "NONE (not offensive)" and "PRFN (profane)" but was unable to classify "HATE (hate speech)" and "OFFN (offensive)", misclassifying most of them as "PRFN". The multilingual BERT model misclassified most of the hate speech and offensive content labels for the German and Hindi languages as "NONE" and did not perform well on those datasets.</p>
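Confusion matrices like those in Figure 2 can be produced with scikit-learn's confusion_matrix. The labels below are a toy sub-task B example, invented to mimic the HATE/OFFN-to-PRFN confusion pattern described above.

```python
from sklearn.metrics import confusion_matrix

# Toy sub-task B gold and predicted labels (invented for illustration).
gold = ["NONE", "HATE", "PRFN", "OFFN", "PRFN", "NONE"]
pred = ["NONE", "PRFN", "PRFN", "PRFN", "PRFN", "NONE"]
classes = ["NONE", "HATE", "OFFN", "PRFN"]

# Rows are gold labels, columns are predictions; off-diagonal mass in the
# PRFN column reflects HATE and OFFN posts misread as profanity.
cm = confusion_matrix(gold, pred, labels=classes)
```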
    </sec>
    <sec id="sec-7">
      <title>7. Conclusion and Future work</title>
      <p>We used pre-trained Bidirectional Encoder Representations from Transformers (BERT) and multilingual BERT for hate speech and offensive content detection in the English, German, and Hindi languages. We compared BERT with other machine learning and neural network classification methods. Our analysis showed that fine-tuning the pre-trained BERT and multilingual BERT models for the downstream hate speech text classification task yields an increase in macro F1 score and accuracy compared to traditional word-based machine learning approaches.</p>
      <p>The given data has both hate speech and offensive content labeled for the same sentence, which implies that the two tasks are related. In such a scenario, we can use joint learning models to capture the strong relationship between the two tasks, which, in turn, helps a deep joint classification model understand the given datasets better.</p>
      <p>[9] T. Mandl, S. Modha, G. K. Shahi, A. K. Jaiswal, D. Nandini, D. Patel, P. Majumder, J. Schäfer, Overview of the HASOC track at FIRE 2020: Hate Speech and Offensive Content Identification in Indo-European Languages, in: Working Notes of FIRE 2020 - Forum for Information Retrieval Evaluation, CEUR, 2020.
[10] T. Mandl, S. Modha, P. Majumder, D. Patel, M. Dave, C. Mandlia, A. Patel, Overview of the HASOC track at FIRE 2019: Hate Speech and Offensive Content Identification in Indo-European Languages, in: Proceedings of the 11th Forum for Information Retrieval Evaluation, 2019, pp. 14-17.
[11] T. Wolf, L. Debut, V. Sanh, J. Chaumond, C. Delangue, A. Moi, P. Cistac, T. Rault, R. Louf, M. Funtowicz, et al., HuggingFace's Transformers: State-of-the-art natural language processing, arXiv preprint (2019).
[12] T. Kudo, J. Richardson, SentencePiece: A simple and language independent subword tokenizer and detokenizer for neural text processing, arXiv preprint arXiv:1808.06226 (2018).
[13] M. E. Peters, M. Neumann, M. Iyyer, M. Gardner, C. Clark, K. Lee, L. Zettlemoyer, Deep contextualized word representations, arXiv preprint arXiv:1802.05365 (2018).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>T.</given-names>
            <surname>Davidson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Warmsley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Macy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Weber</surname>
          </string-name>
          ,
          <article-title>Automated hate speech detection and the problem of offensive language</article-title>
          ,
          <source>arXiv preprint arXiv:1703.04009</source>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Gaydhani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Doma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kendre</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Bhagwat</surname>
          </string-name>
          ,
          <article-title>Detecting hate speech and offensive language on twitter using machine learning: An n-gram and TFIDF based approach</article-title>
          ,
          <source>arXiv preprint arXiv:1809.08651</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>B.</given-names>
            <surname>Gambäck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U. K.</given-names>
            <surname>Sikdar</surname>
          </string-name>
          ,
          <article-title>Using convolutional neural networks to classify hate-speech</article-title>
          ,
          <source>in: Proceedings of the first workshop on abusive language online</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>85</fpage>
          -
          <lpage>90</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>P.</given-names>
            <surname>Badjatiya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Varma</surname>
          </string-name>
          ,
          <article-title>Deep learning for hate speech detection in tweets</article-title>
          ,
          <source>in: Proceedings of the 26th International Conference on World Wide Web Companion</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>759</fpage>
          -
          <lpage>760</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <article-title>Convolutional neural networks for sentence classification</article-title>
          ,
          <source>arXiv preprint arXiv:1408.5882</source>
          (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>S.</given-names>
            <surname>Hochreiter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schmidhuber</surname>
          </string-name>
          ,
          <article-title>Long short-term memory</article-title>
          ,
          <source>Neural computation 9</source>
          (
          <year>1997</year>
          )
          <fpage>1735</fpage>
          -
          <lpage>1780</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          ,
          <article-title>BERT: Pre-training of deep bidirectional transformers for language understanding</article-title>
          ,
          <source>arXiv preprint arXiv:1810.04805</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>A.</given-names>
            <surname>Vaswani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Shazeer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Parmar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Uszkoreit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Jones</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. N.</given-names>
            <surname>Gomez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Ł.</given-names>
            <surname>Kaiser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Polosukhin</surname>
          </string-name>
          ,
          <article-title>Attention is all you need</article-title>
          ,
          <source>in: Advances in neural information processing systems</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>5998</fpage>
          -
          <lpage>6008</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>