<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Contextual Hate Speech Detection in Code Mixed Text using Transformer Based Approaches</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ravindra Nayak</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Raviraj Joshi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Indian Institute of Technology Madras</institution>
          ,
          <addr-line>Chennai</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Sri Jayachamarajendra College of Engineering</institution>
          ,
          <addr-line>Mysore</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>In the recent past, social media platforms have helped people connect and communicate with a wider audience. But this has also led to a drastic increase in cyberbullying. It is essential to detect and curb hate speech to maintain a healthy environment on social media platforms. Also, code mixed text containing more than one language is frequently used on these platforms. We, therefore, propose automated techniques for hate speech detection in code mixed text scraped from Twitter. We specifically focus on code mixed English-Hindi text and transformer-based approaches. While regular approaches analyze the text independently, we also make use of context text in the form of parent tweets. We evaluate the performance of multilingual BERT and Indic-BERT in single-encoder and dual-encoder settings. The first approach is to concatenate the target text and context text using a separator token and get a single representation from the BERT model. The second approach encodes the two texts independently using a dual BERT encoder and the corresponding representations are averaged. We show that the dual-encoder approach using independent representations yields better performance. We also employ simple ensemble methods to further improve the performance. We describe the systems built by our team r1_2021 for HASOC 2021 Subtask 2 and the subsequent set of experiments.</p>
      </abstract>
      <kwd-group>
        <kwd>Hate Speech Detection</kwd>
        <kwd>Social Media</kwd>
        <kwd>Code Mixed</kwd>
        <kwd>Hinglish</kwd>
        <kwd>Multilingual</kwd>
        <kwd>Indic</kwd>
        <kwd>BERT</kwd>
        <kwd>Context-aware</kwd>
        <kwd>Deep Learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Social media has been a boon to many, helping budding businesses establish and promote
themselves on these platforms. Despite its many use cases, it comes with a caveat. People
with malicious intent have treated it as an opportunity to spread hate speech to a
wider audience [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ]. Multiple studies have directly linked social media to poor
mental health. Because young people make up a large share of the users on such platforms, their mental stability is
crucial in shaping their future careers. It is therefore necessary to act against such malevolent
content on a large scale.
      </p>
      <p>Offensive language, such as insulting, hurtful, derogatory or obscene content directed at
people, can suppress meaningful discussions. As there are no restrictions on expressing
opinions on such platforms, it can lead to the defamation of individuals. It is therefore the
platform’s responsibility to restrain such content. Hate speech mainly involves discriminating
against people based on religion, community, race, nationality, gender or any other identity
factors [3, 4].</p>
      <p>Although manual moderation of hate speech is precise, it is impractical
given the huge volume of data posted to social media. There is therefore a
constant need for automated techniques to suppress hateful content to which
all age groups are exposed [5, 6, 7, 8]. With advances in computing capabilities,
machine learning algorithms have gained importance in tasks that involve understanding
natural language.</p>
      <p>In this work, we are interested in detecting hate speech in tweets. This paper mainly
focuses on the HASOC 2021 Identification of Conversational Hate-Speech in
Code-Mixed Languages (ICHCL) subtask [9]. The task aims to detect hate speech in individual tweets
and in the comments and replies that support hate speech directly or indirectly.
The dataset contains text scraped from Twitter with binary labels. A conversational thread can
contain abusive or offensive content that is not apparent from a single comment or the
reply to a comment alone, but can be identified given the context of the parent content, as shown
in Table 1. Furthermore, content on social media is spread across many different
languages, including code-mixed languages such as Hinglish (a mix of Hindi and English in Roman
script) [10].</p>
      <p>The hate speech detection task can be treated as a binary text classification problem [11].
We depend solely on the tweet and its context to determine hateful content.
Although some tweets could be rejected based on the behaviour of the content creator,
this information is not guaranteed to be available. We evaluate various
deep learning techniques, specifically multilingual BERT-based models. We
experiment with several fine-tuning methods and examine how they help the model detect
malicious tweets.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>Hate speech detection is precise when moderated manually. The context of a tweet is also
important for identifying hatefulness. For code mixed data in particular, the moderator must
have a broad vocabulary across languages to curb malicious content. Given
enough data on a user’s behaviour and tweet content, such content could be mitigated
by blacklisting the offending users. Approaches such as graph convolutional networks
capture not only the structure of online communities but also the linguistic behaviour
of the users within them [12].</p>
      <p>Dictionary-based approaches are popular for text data: they maintain a list of
words or phrases that may be profane or constitute racial slurs. Various machine learning
approaches use extra-linguistic features in conjunction with character
n-grams to build binary logistic regression classifiers [13]. Studies have also shown that
including knowledge graph features helps in building better models [14].</p>
      <p>Word-level embeddings like GloVe capture the semantics of words better than
one-hot encoding [15]. A similar approach is to use sentence-level
embeddings like ELMo, which help in extracting rich features from the text. These embeddings
are then fed to bi-directional LSTMs or CNNs for classification [16]. As these embeddings are
trained on huge corpora, reusing them is a form of transfer learning: the
feature-rich vectors carry over to similar classification tasks. Other features such as LIWC features,
SentiWordNet and profanity vectors also aid the model [17].</p>
      <p>For code mixed Hinglish datasets, there have been studies on ensembling BERT-based
embeddings with a Bi-LSTM to improve the model [3]. As context plays an important role
in the detection of hate speech, context-aware models have been built that take the previous tweet’s
features as input along with the current tweet. Various ensembles of traditional machine
learning algorithms with deep learning techniques have also been explored [18].</p>
    </sec>
    <sec id="sec-3">
      <title>3. Architecture details</title>
      <p>In this section, we describe the different techniques along with their hyperparameters.
Figure 1 summarizes the model details and the two architectures explored in
this work.</p>
      <p>We use transformer-based neural networks as they have shown great progress in NLP tasks
[19]. Because these networks allow computations to be parallelised, they have a considerable
advantage over predecessor networks such as RNNs and LSTMs. Transformers reduce
inference latency because they can exploit contemporary parallel hardware.
We explore two multilingual BERT-based models, namely m-BERT and
Indic-BERT. Both include Hindi as one of the pre-training languages.</p>
      <sec id="sec-3-1">
        <title>3.1. Multilingual-BERT (m-BERT)</title>
        <p>This model’s architecture is based on BERT-base [20]. It contains 12 transformer
blocks, 12 self-attention heads and a hidden size of 768. BERT accepts an input sequence of at most
512 tokens and outputs a sequence of representations. The special tokens [CLS]
and [SEP] mark the start of the input and the separation between sentences, respectively.
For a classification task, the final encoder representation is taken and a softmax is applied
to classify it.</p>
        <p>As BERT-base is pre-trained only on English text, we use a Multilingual BERT-base model
that has been trained on 102 languages using a shared word-piece vocabulary of size 110k.
Low-resource languages are oversampled to overcome data imbalance. The model has shown
strong results in zero-shot transfer learning for various downstream tasks and has also helped in
tasks on code-switched data [21].</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Indic-BERT</title>
        <p>This model is based on ALBERT [22], a lighter version of BERT that shares
parameters across layers and therefore has fewer parameters. ALBERT also
modifies the pre-training mechanism by introducing new pre-training tasks that lead
to better sentence embeddings. ALBERT contains 12 transformer blocks, 12 self-attention heads,
a hidden size of 768 and an input embedding size of 128.</p>
        <p>Indic-BERT is a multilingual ALBERT based model that has been trained on 12 major Indian
languages with a shared vocabulary size of 200k. It has outperformed multilingual BERT in
some of the Indic tasks [23].</p>
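        <p>To make the effect of the smaller input embedding size concrete, the sketch below compares the number of token-embedding weights with and without ALBERT-style factorization, where a V x H lookup table is replaced by a V x E table followed by an E x H projection. The vocabulary size is the approximate 200k figure quoted above, and the function name is purely illustrative; the counts ignore position and segment embeddings.</p>
        <preformat>
```python
# Rough token-embedding parameter count, with and without factorization.
# V is the approximate 200k vocabulary quoted in the text (assumed exact here),
# H is the hidden size and E is the input embedding size.

def embedding_params(vocab_size: int, hidden_size: int, embedding_size=None) -> int:
    """Return the number of weights in the token-embedding layer."""
    if embedding_size is None:
        # BERT-style: a single V x H lookup table.
        return vocab_size * hidden_size
    # ALBERT-style: a V x E lookup table followed by an E x H projection.
    return vocab_size * embedding_size + embedding_size * hidden_size


V, H, E = 200_000, 768, 128
unfactorized = embedding_params(V, H)
factorized = embedding_params(V, H, E)
print(unfactorized, factorized)
```
        </preformat>
        <p>Under these assumptions the factorized layer needs roughly one sixth of the embedding weights, which matters most when the vocabulary is large.</p>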
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experimental Setup</title>
      <sec id="sec-4-1">
        <title>4.1. Dataset details</title>
        <p>The HASOC 2021 ICHCL dataset [9] consists of tweets and their context, if any, along with the
labels. The binary labels are NOT (Non Hate-Offensive) and HOF (Hate and Offensive).
The dataset has a two-level hierarchy in which an individual tweet can be followed by a comment,
and that comment can have a reply. For comments, we use the individual tweet
as the context; for replies, we use the concatenation of the first tweet and the
comment associated with it.</p>
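        <p>The context rule described above can be sketched as a small helper. This is an illustration only: the function name, the argument names and the single-space joiner are assumptions, not details taken from the dataset or from our code.</p>
        <preformat>
```python
from typing import Optional, Tuple


def make_example(tweet: str,
                 comment: Optional[str] = None,
                 reply: Optional[str] = None,
                 joiner: str = " ") -> Tuple[str, str]:
    """Return (target_text, context_text) for one node of a thread.

    - standalone tweet: the tweet is the target, with no context
    - comment: the comment is the target, the tweet is the context
    - reply: the reply is the target, tweet + comment is the context
    """
    if reply is not None:
        if comment is None:
            raise ValueError("a reply needs its parent comment")
        return reply, joiner.join([tweet, comment])
    if comment is not None:
        return comment, tweet
    return tweet, ""


print(make_example("parent tweet", "a comment", "a reply"))
```
        </preformat>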
        <p>The dataset consists of a train set and a test set. A total of 7088 tweets are provided,
of which 5740 are training samples and 1348 are test samples. We
hold out a random 10 per cent of the data for validation. The training data contains
2841 hate speech samples and 2899 non-hate speech samples, whereas the test data contains 695
hate speech samples and 653 non-hate speech samples. As the task mainly focuses
on context-based hate speech detection, there are only 82 individual tweets in the training set and 16
in the test set. The remaining 5658 data points in the training set and 1332 data points
in the test set use the individual tweets as context. More statistics on the data are provided in
Table 2.</p>
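        <p>A random hold-out of this kind could look like the sketch below. It assumes the 10 per cent is drawn from the 5740 training samples and uses an arbitrary seed; neither detail is specified above, and the function name is hypothetical.</p>
        <preformat>
```python
import random


def train_val_split(samples, val_fraction=0.1, seed=42):
    """Randomly hold out a fraction of the samples for validation."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)          # reproducible shuffle
    n_val = round(len(samples) * val_fraction)
    val_idx = set(indices[:n_val])
    train = [s for i, s in enumerate(samples) if i not in val_idx]
    val = [s for i, s in enumerate(samples) if i in val_idx]
    return train, val


# Stand-in for the 5740 training samples.
train, val = train_val_split(list(range(5740)))
print(len(train), len(val))
```
        </preformat>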
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Data preprocessing</title>
        <p>Various data preprocessing techniques are used to clean and normalize the tweets.
• Removal of URLs: People often include hyperlinks to different websites. As these do not
help the classifier, we remove them.
• Removal of User Mentions: User mentions are common in tweets. We remove them
because they are not helpful to the model.
• Removal of Non-Hindi and Non-English characters: As the
dataset contains only Roman and Devanagari text, we remove characters outside
the corresponding Unicode blocks.
• Retain Emojis and Hashtags: We retain emojis and hashtags, as they help in
determining whether a tweet supports a hateful tweet in the absence of text.</p>
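        <p>The preprocessing steps above can be approximated with the standard library alone. The sketch below is not our exact pipeline: the regular expressions, the emoji code-point ranges and the function names are assumptions chosen for illustration, and real emoji coverage is broader than the ranges listed.</p>
        <preformat>
```python
import re

# Patterns for the two removal steps; both are simple approximations.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")

# Zero-width joiner and variation selector, kept so emoji sequences survive.
EMOJI_GLUE = {0x200D, 0xFE0F}


def is_allowed(ch: str) -> bool:
    """Keep whitespace, printable ASCII, Devanagari and (roughly) emoji."""
    code = ord(ch)
    return (
        ch.isspace()
        or code in range(0x0020, 0x007F)     # printable ASCII (Roman text, '#')
        or code in range(0x0900, 0x0980)     # Devanagari block
        or code in range(0x2600, 0x27C0)     # miscellaneous symbols, dingbats
        or code in range(0x1F300, 0x1FB00)   # main emoji blocks (assumed range)
        or code in EMOJI_GLUE
    )


def clean_tweet(text: str) -> str:
    text = URL_RE.sub(" ", text)       # drop hyperlinks
    text = MENTION_RE.sub(" ", text)   # drop @user mentions
    text = "".join(ch for ch in text if is_allowed(ch))
    return " ".join(text.split())      # collapse leftover whitespace


print(clean_tweet("@user1 yeh dekho https://t.co/abc #Shame \U0001F621"))
```
        </preformat>
        <p>Hashtags survive because the hash sign is printable ASCII, and mentions are removed before the character filter so that user names do not leak into the text.</p>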
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Training details</title>
        <p>All models were trained using the PyTorch framework and the Hugging Face library [24]. The
models are fine-tuned for a maximum of 5 epochs, and the minimum validation loss
is the criterion used for picking the best epoch. As shown in Figure 1, we mainly work with two
approaches for m-BERT and Indic-BERT.</p>
        <p>• Single Encoder Approach (single sentence representation): This is the basic approach
to fine-tuning BERT-based models, where we add a dense layer after the BERT [CLS]
token embedding, followed by the softmax classifier. The context text and the target text
are concatenated using a separator token to get a single [CLS] representation from the
BERT model.
• Dual Encoder Approach (averaging the context and target representations): As
context plays a vital role in our dataset, we pass the context and the tweet separately
through BERT to get their [CLS] token embeddings. Each embedding acts as a sentence
representation for the corresponding context or tweet. The two embeddings are averaged and
passed to the dense layer for classification. If the context is absent, we
use only the tweet representation.</p>
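        <p>The difference between the two approaches can be illustrated without a model. The sketch below shows the input layout for the single encoder and the averaging step for the dual encoder, using plain Python lists in place of [CLS] embeddings. The function names and the target-before-context order are assumptions; in practice the tokenizer inserts the special tokens itself, so the string only shows the resulting layout.</p>
        <preformat>
```python
from typing import List, Optional, Sequence


def single_encoder_input(target: str, context: str,
                         cls: str = "[CLS]", sep: str = "[SEP]") -> str:
    """Layout of the one sequence seen by the single encoder."""
    if not context:
        return f"{cls} {target} {sep}"
    return f"{cls} {target} {sep} {context} {sep}"


def average_representations(target_vec: Sequence[float],
                            context_vec: Optional[Sequence[float]] = None
                            ) -> List[float]:
    """Dual-encoder fusion: element-wise mean of the two [CLS] vectors.

    Falls back to the target vector alone when there is no context.
    """
    if context_vec is None:
        return list(target_vec)
    if len(target_vec) != len(context_vec):
        raise ValueError("representations must have the same size")
    return [(t + c) / 2.0 for t, c in zip(target_vec, context_vec)]


print(single_encoder_input("reply text", "parent tweet"))
print(average_representations([1.0, 3.0], [3.0, 5.0]))
```
        </preformat>
        <p>In both cases the fused vector is what the dense classification layer receives; only the way context reaches that vector differs.</p>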
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Results and Discussions</title>
      <p>We evaluate different BERT-based approaches for the task of hate speech detection. The results
of the experiments are outlined in Table 3. Macro precision, recall and F1-score are the metrics
used to compare the models. As the target text is code-mixed Hindi and English,
we use m-BERT and Indic-BERT as our baselines. In the baseline approach, we concatenate the
target and the context text using a separator token. We perform a series of experiments on top
of the baseline model by freezing the embedding layer and incorporating a static dictionary of
offensive words. The frozen embeddings showed promising results, as the token embeddings
were not overfitted to the training data. The static dictionary is used as a deterministic classifier
that directly tags a text as hateful if it contains any offensive word. The dictionary
was created from various web sources, and neither the training data nor the test data was referenced
during the process. In the dual encoder approach, we average the [CLS] token embeddings
of the context and the tweet, which further improves the F1 scores. Integrating the
static dictionary with this method improves the F1 scores further. The Indic-BERT model with
frozen embeddings, static dictionary and dual representations outperformed all the
other techniques. We combine the best-performing models using simple ensemble techniques
to get the best results: the scores of the individual models are fused by averaging. The
confusion matrices for the best models are shown in Figure 2. Note that the best run, r1_2021_v5,
submitted to the shared task was based on m-BERT + FE + C-Avg and resulted in an F1-score of
67.42%. The other experiments were conducted after the shared task, and their results are reported on the
same test set.</p>
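      <p>The score fusion and the dictionary rule described in this section can be sketched as follows. This is a minimal illustration under several assumptions: each model is taken to output a probability for the HOF class, the decision threshold is 0.5, tokens are matched after whitespace splitting, and the lexicon entry is a placeholder. Applying the dictionary on top of the fused scores is also an assumption made for brevity.</p>
      <preformat>
```python
def contains_offensive_word(text, lexicon):
    """Deterministic rule: true if any token of the text is in the lexicon."""
    tokens = text.lower().split()
    return any(tok.strip(".,!?#") in lexicon for tok in tokens)


def fuse_scores(model_scores):
    """Average the HOF probabilities of several models, per example."""
    n_models = len(model_scores)
    return [sum(column) / n_models for column in zip(*model_scores)]


def predict(texts, model_scores, lexicon, threshold=0.5):
    """Label each text HOF if the dictionary fires or the fused score is high."""
    labels = []
    for text, score in zip(texts, fuse_scores(model_scores)):
        if contains_offensive_word(text, lexicon) or score >= threshold:
            labels.append("HOF")
        else:
            labels.append("NOT")
    return labels


lexicon = {"badword"}                        # placeholder entry
texts = ["hello there", "you badword!"]
scores = [[0.25, 0.25], [0.25, 0.25]]        # two hypothetical models
print(predict(texts, scores, lexicon))
```
      </preformat>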
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>Under the HASOC 2021 ICHCL task, we evaluate various fine-tuning techniques for code
mixed data that take the context of the tweets into account. We mainly focus on multilingual
BERT-based architectures. We observe that frozen embeddings give better results by retaining
rich token representations from the pre-trained model. Moreover, averaging over sentence
representations helps the model understand the context better when classifying
the current tweet. Using the ensemble of models, we report a best F1 score of 73.07%, compared with
the m-BERT baseline F1 score of 65.53%. Note that our leaderboard F1-score is 67.42%;
the other experiments were performed after the final submission. Above all, we emphasize the
importance of averaging representations using the dual BERT encoder setting in context-based
text classification problems.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>This research was conducted under the guidance of L3Cube, Pune. We would like to express our
gratitude towards our mentors at L3Cube for their continuous support and encouragement.</p>
      <p>[3] N. Vashistha, A. Zubiaga, Online multilingual hate speech detection: Experimenting
with hindi and english social media, Information 12 (2021). URL: https://www.mdpi.com/2078-2489/12/1/5.
doi:10.3390/info12010005.
[4] S. MacAvaney, H.-R. Yao, E. Yang, K. Russell, N. Goharian, O. Frieder, Hate speech detection:
Challenges and solutions, PloS one 14 (2019) e0221152.
[5] A. Schmidt, M. Wiegand, A survey on hate speech detection using natural language
processing, in: Proceedings of the fifth international workshop on natural language
processing for social media, 2017, pp. 1–10.
[6] S. Modha, T. Mandl, G. K. Shahi, H. Madhu, S. Satapara, T. Ranasinghe, M. Zampieri,
Overview of the HASOC Subtrack at FIRE 2021: Hate Speech and Offensive Content
Identification in English and Indo-Aryan Languages and Conversational Hate Speech, in:
FIRE 2021: Forum for Information Retrieval Evaluation, Virtual Event, 13th-17th December
2021, ACM, 2021.
[7] R. Joshi, R. Karnavat, K. Jirapure, R. Joshi, Evaluation of deep learning models for
hostility detection in hindi text, in: 2021 6th International Conference for Convergence in
Technology (I2CT), IEEE, 2021, pp. 1–5.
[8] A. Wani, I. Joshi, S. Khandve, V. Wagh, R. Joshi, Evaluating deep learning approaches for
covid19 fake news detection, in: International Workshop on Combating Online Hostile
Posts in Regional Languages during Emergency Situation, Springer, 2021, pp. 153–163.
[9] S. Satapara, S. Modha, T. Mandl, H. Madhu, P. Majumder, Overview of the HASOC
Subtrack at FIRE 2021: Conversational Hate Speech Detection in Code-mixed language,
in: Working Notes of FIRE 2021 - Forum for Information Retrieval Evaluation, CEUR, 2021.
[10] R. Joshi, R. Joshi, Evaluating input representation for language identification in
hindi-english code mixed text, arXiv preprint arXiv:2011.11263 (2020).
[11] R. Joshi, P. Goel, R. Joshi, Deep learning for hindi text classification: A comparison, in:
International Conference on Intelligent Human Computer Interaction, Springer, 2019, pp.
94–101.
[12] P. Mishra, M. D. Tredici, H. Yannakoudakis, E. Shutova, Abusive language detection with
graph convolutional networks, 2019. arXiv:1904.04073.
[13] Z. Waseem, D. Hovy, Hateful symbols or hateful people? predictive features for hate
speech detection on Twitter, in: Proceedings of the NAACL Student Research Workshop,
Association for Computational Linguistics, San Diego, California, 2016, pp. 88–93. URL:
https://aclanthology.org/N16-2013. doi:10.18653/v1/N16-2013.
[14] P. Maheshappa, B. Mathew, P. Saha, Using knowledge graphs to improve hate speech
detection, in: 8th ACM IKDD CODS and 26th COMAD, CODS COMAD 2021, Association
for Computing Machinery, New York, NY, USA, 2021, p. 430.
URL: https://doi.org/10.1145/3430984.3431072. doi:10.1145/3430984.3431072.
[15] Z. Zhang, L. Luo, Hate speech detection: A solved problem? the challenging case of long
tail on twitter, 2018. arXiv:1803.03662.
[16] M.-A. Rizoiu, T. Wang, G. Ferraro, H. Suominen, Transfer learning for hate speech detection
in social media, 2019. arXiv:1906.03829.
[17] P. Mathur, R. Sawhney, M. Ayyar, R. Shah, Did you offend me? classification of offensive
tweets in Hinglish language, in: Proceedings of the 2nd Workshop on Abusive Language
Online (ALW2), Association for Computational Linguistics, Brussels, Belgium, 2018, pp.
138–148. URL: https://aclanthology.org/W18-5118. doi:10.18653/v1/W18-5118.
[18] L. Gao, R. Huang, Detecting online hate speech using context aware models, in:
Proceedings of the International Conference Recent Advances in Natural Language
Processing, RANLP 2017, INCOMA Ltd., Varna, Bulgaria, 2017, pp. 260–266.
URL: https://doi.org/10.26615/978-954-452-049-6_036. doi:10.26615/978-954-452-049-6_036.
[19] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, I.
Polosukhin, Attention is all you need, 2017. arXiv:1706.03762.
[20] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, Bert: Pre-training of deep bidirectional
transformers for language understanding, 2019. arXiv:1810.04805.
[21] T. Pires, E. Schlinger, D. Garrette, How multilingual is multilingual bert?, 2019.
arXiv:1906.01502.
[22] Z. Lan, M. Chen, S. Goodman, K. Gimpel, P. Sharma, R. Soricut, Albert: A lite bert for
self-supervised learning of language representations, 2020. arXiv:1909.11942.
[23] D. Kakwani, A. Kunchukuttan, S. Golla, G. N.C., A. Bhattacharyya, M. M. Khapra, P. Kumar,
IndicNLPSuite: Monolingual Corpora, Evaluation Benchmarks and Pre-trained
Multilingual Language Models for Indian Languages, in: Findings of EMNLP, 2020.
[24] T. Wolf, J. Chaumond, L. Debut, V. Sanh, C. Delangue, A. Moi, P. Cistac, M. Funtowicz,
J. Davison, S. Shleifer, et al., Transformers: State-of-the-art natural language processing, in:
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing:
System Demonstrations, 2020, pp. 38–45.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>C.</given-names>
            <surname>Ezeibe</surname>
          </string-name>
          ,
          <article-title>Hate speech and election violence in nigeria</article-title>
          ,
          <source>Journal of Asian and African Studies</source>
          <volume>56</volume>
          (
          <year>2021</year>
          )
          <fpage>919</fpage>
          -
          <lpage>935</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Matamoros-Fernández</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Farkas</surname>
          </string-name>
          ,
          <article-title>Racism, hate speech, and social media: A systematic review and critique</article-title>
          ,
          <source>Television &amp; New Media</source>
          <volume>22</volume>
          (
          <year>2021</year>
          )
          <fpage>205</fpage>
          -
          <lpage>224</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>