Contextual Hate Speech Detection in Code Mixed Text using Transformer Based Approaches

Ravindra Nayak1, Raviraj Joshi2
1 Sri Jayachamarajendra College of Engineering, Mysore
2 Indian Institute of Technology Madras, Chennai

Abstract
In the recent past, social media platforms have helped people connect and communicate with a wider audience. But this has also led to a drastic increase in cyberbullying. It is essential to detect and curb hate speech to keep the sanity of social media platforms. Also, code mixed text containing more than one language is frequently used on these platforms. We therefore propose automated techniques for hate speech detection in code mixed text scraped from Twitter. We specifically focus on code mixed English-Hindi text and transformer-based approaches. While regular approaches analyze the text independently, we also make use of context text in the form of parent tweets. We evaluate the performance of multilingual BERT and Indic-BERT in single-encoder and dual-encoder settings. The first approach concatenates the target text and context text using a separator token and obtains a single representation from the BERT model. The second approach encodes the two texts independently using a dual BERT encoder and averages the corresponding representations. We show that the dual-encoder approach using independent representations yields better performance. We also employ simple ensemble methods to further improve the performance. We describe the systems built by our team r1_2021 for HASOC 2021 Subtask 2 and the subsequent set of experiments.

Keywords
Hate Speech Detection, Social Media, Code Mixed, Hinglish, Multilingual, Indic, BERT, Context-aware, Deep Learning

Forum for Information Retrieval Evaluation, December 13-17, 2021, India
ravindranyk707@gmail.com (R. Nayak); ravirajoshi@gmail.com (R. Joshi)

1. Introduction
Social media is a boon to many, as it has helped in creating and promoting budding businesses. Although it has vast use cases, it comes with a caveat too. People with malicious intent see it as an opportunity to promote hate speech among a wider audience [1, 2]. Multiple studies directly link social media usage to poor mental health. Because a large share of users on such platforms are youngsters, their mental stability is vital in shaping their future careers. So it is necessary to take action against such malevolent content on a large scale.

Offensive language such as insulting, hurtful, derogatory, or obscene content directed towards people can suppress meaningful discussions. As there are no restrictions on expressing one's opinions on such platforms, it can lead to the defaming of personalities. So it is the platform's responsibility to restrain such content. Hate speech mainly involves discrimination against people based on religion, community, race, nationality, gender, or any other identity factors [3, 4].
Table 1
Few training samples of context and tweets. HOF indicates Hate & Offensive content, whereas NOT indicates Non-Hate Offensive content.

Context | Tweet | Label
- | INDIA NEEDS VACCINES | NOT
INDIA NEEDS VACCINES | Yes ma'am India need vaccine Doctors and facilities | NOT
INDIA NEEDS VACCINES | Is there any Vaccine which can prevent India from you | HOF
INDIA NEEDS VACCINES | vaccine insano k liye hain reptiles k liye nhi | HOF
- | Look at this insensitive piece of shit | HOF
Look at this insensitive piece of shit | After all that is happening since last month but still | NOT
Look at this insensitive piece of shit | Yo wtf | HOF
Look at this insensitive piece of shit | Haha | HOF

Even though manual moderation of hate speech is always precise, it is not feasible considering the huge volumes of data being pumped into social media. So there is a constant need for automated techniques to suppress hateful content to which users of all age groups are exposed [5, 6, 7, 8]. With the advances in computing capabilities, machine learning algorithms have gained importance in tasks that involve understanding natural language. In this work, we are interested in hate speech detection in tweets.

This paper mainly focuses on the HASOC 2021 Identification of Conversational Hate-Speech in Code-Mixed Languages (ICHCL) subtask [9]. The task aims to detect hate speech in individual tweets and in their comments and replies, which may support hate speech directly or indirectly. The dataset contains text scraped from Twitter with binary labels. A conversational thread can contain abusive or offensive content that is not apparent from a single comment or reply but can be identified given the context of the parent content, as shown in Table 1. Furthermore, the content on such social media platforms is spread across many different languages, including code-mixed languages such as Hinglish (a mix of Hindi and English in Roman script) [10].

The hate speech detection task can be considered a binary text classification problem [11]. We rely solely on the tweet and its context to determine hateful content. Even though some tweets could be rejected based on the behaviour of the content creator, this information is not guaranteed to be available every time. We evaluate various deep learning techniques, specifically multilingual BERT based models, and experiment with various fine-tuning methods to see how they help the model detect malicious tweets.

2. Related Work
Hate speech detection is precise when manually moderated. The context of a tweet is also important for identifying hatefulness. For code mixed data in particular, the moderator must have a vast vocabulary across languages to curb malicious content. If enough data on a user's behaviour and tweet content is available, it can help mitigate such content by blacklisting abusive users. Approaches such as graph convolutional networks have been used to capture not only the structure of online communities but also the linguistic behaviour of the users within them [12]. Dictionary-based approaches are popular for text data, where a list of words or phrases that are profane or constitute racial slurs is maintained. Various machine learning approaches use extra-linguistic features in conjunction with character n-grams to build binary logistic regression classifiers [13], as sketched below.
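As an illustration of this family of lexical baselines, the following is a minimal sketch of a character n-gram logistic regression classifier. It is not part of our system; the scikit-learn pipeline, the example texts, and the label convention (1 for HOF, 0 for NOT) are assumptions made purely for illustration.

# Minimal sketch of a character n-gram + logistic regression baseline of the
# kind cited above (not our system); texts and labels are placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["you are an idiot", "have a nice day"]   # hypothetical tweets
labels = [1, 0]                                   # assumed mapping: 1 = HOF, 0 = NOT

clf = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),  # character n-gram features
    LogisticRegression(max_iter=1000),
)
clf.fit(texts, labels)
print(clf.predict(["what an idiot"]))

In practice, such a classifier would be combined with the extra-linguistic features mentioned above rather than used on character n-grams alone.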
There have been studies showing that including knowledge graph features helps in building better models [14]. Word-level embeddings like GloVe capture the semantics of words better than one-hot encodings [15]. A similar approach is to use sentence-level embeddings like ELMo, which help in extracting rich features from the text; these embeddings are then fed to bi-directional LSTMs or CNNs for classification [16]. As these embeddings are trained on huge corpora, their use is a form of transfer learning, reusing feature-rich vectors for related classification tasks. Various other features like LIWC features, SentiWordNet, and profanity vectors also aid the model [17]. For code mixed Hinglish datasets, there have been studies on ensembling BERT based embeddings with Bi-LSTMs to improve the model [3]. As context plays an important role in the detection of hate speech, context-aware models have been built that take the previous tweet's features as input along with the current tweet. Various ensembles of traditional machine learning algorithms with deep learning techniques have also been explored [18].

3. Architecture details
In this section, we describe the details of the different techniques along with their hyperparameters. Figure 1 summarizes the model details along with the two architectures explored in this work. We use transformer-based neural networks as they have shown great progress in NLP tasks [19]. As these networks allow parallelisation of computations, they have an immense advantage over their predecessor networks like RNNs and LSTMs. Transformers also reduce model inference latency as they can exploit contemporary hardware. We explore two multilingual variations of BERT-based models, viz. m-BERT and Indic-BERT. Both BERT variations include Hindi as one of the pre-training languages.

3.1. Multilingual-BERT (m-BERT)
This model's architecture is based on BERT-base [20]. It contains 12 transformer blocks, 12 self-attention heads, and a hidden size of 768. The input for BERT is limited to a maximum of 512 tokens, and it outputs a sequential representation. Special tokens [CLS] and [SEP] are used to mark the start of a sentence and the separation of sentences, respectively. For a classification task, the final encoder representation is taken and a softmax is applied to classify it.

Figure 1: Representation of single encoder approach (left) and dual encoder approach (right).

As BERT-base is pre-trained only on English text, we use the Multilingual BERT-base model that has been trained on 102 languages using a shared word-piece vocabulary of size 110k. Oversampling of low resource languages is done to overcome data imbalance. It has shown great results on zero-shot transfer learning for various downstream tasks and has also helped in code-switched data tasks [21].

3.2. Indic-BERT
This model is based on ALBERT [22], a lighter version of BERT that shares parameters across layers, which in turn leads to fewer parameters. ALBERT also modifies the pre-training setup by introducing new pre-training tasks that lead to better sentence embeddings. It contains 12 transformer blocks, 12 self-attention heads, a hidden size of 768, and an input embedding size of 128. Indic-BERT is a multilingual ALBERT based model that has been trained on 12 major Indian languages with a shared vocabulary size of 200k. It has outperformed multilingual BERT on some Indic tasks [23].
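For reference, the two encoders can be obtained from the Hugging Face hub as sketched below. We do not name the exact checkpoints in the paper, so the identifiers "bert-base-multilingual-cased" and "ai4bharat/indic-bert" are assumptions based on the publicly available versions of m-BERT and Indic-BERT.

# Minimal sketch of loading the two multilingual encoders (assumed checkpoints).
# The Indic-BERT (ALBERT) tokenizer additionally requires the sentencepiece package.
from transformers import AutoModel, AutoTokenizer

mbert_tok = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
mbert = AutoModel.from_pretrained("bert-base-multilingual-cased")

indic_tok = AutoTokenizer.from_pretrained("ai4bharat/indic-bert")
indic = AutoModel.from_pretrained("ai4bharat/indic-bert")

enc = mbert_tok("vaccine chahiye sabko", return_tensors="pt")   # hypothetical Hinglish input
cls_embedding = mbert(**enc).last_hidden_state[:, 0]            # [CLS] representation, size 768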
4. Experimental Setup

4.1. Dataset details
The HASOC 2021 ICHCL dataset [9] consists of tweets, their context (if any), and the corresponding labels. The binary labels are (NOT) Non-Hate Offensive and (HOF) Hate and Offensive. The dataset comprises a two-level hierarchy where an individual tweet can be followed by a comment, and that comment can have a reply. For comments, we consider the individual tweet as the context, and for replies, we consider the concatenation of the first tweet and the associated comment as the context. The dataset consists of a train set and a test set. A total of 7088 tweets are provided, of which 5740 are training samples and 1348 are test samples. We set aside a random 10 per cent of the training data for validation. The training data contains 2841 hate speech samples and 2899 non-hate speech samples, whereas the test data contains 695 hate speech samples and 653 non-hate speech samples. As the task mainly focuses on context-based hate speech detection, there are only 82 individual tweets in training and 16 in testing; the remaining 5658 training data points and 1332 test data points use individual tweets as the context. More statistics on the data are provided in Table 2.

Table 2
Statistics of the dataset.

Feature | Train | Test
Total words | 273627 | 68019
Max word length | 166 | 131
Avg word count | 47.67 | 50.45
Unique tokens | 5717 | 1343

4.2. Data preprocessing
Various data preprocessing techniques are used to clean and normalize the tweets.
• Removal of URLs: People often include hyperlinks to different websites. As these are unlikely to help the model, we remove them.
• Removal of user mentions: User mentions are common in tweets. We remove them as they are not helpful to the model.
• Removal of non-Hindi and non-English characters: As the dataset contains only Roman and Devanagari text, we remove characters outside these Unicode blocks.
• Retention of emojis and hashtags: We retain emojis and hashtags, as they can help determine whether a tweet supports a hateful tweet in the absence of text.

4.3. Training details
All models were trained using the PyTorch framework and the Hugging Face library [24]. The models were fine-tuned for a maximum of 5 epochs, and the minimum validation loss was the criterion for picking the best epoch. As shown in Figure 1, we mainly work with two approaches for m-BERT and Indic-BERT; a sketch of the dual-encoder variant follows this list.
• Single Encoder Approach (single sentence representation): This is the standard approach to fine-tuning BERT based models, where we add a dense layer on top of the BERT [CLS] token embedding, followed by a softmax classifier. The context text and the target text are concatenated using a separator token to get a single [CLS] representation from the BERT model.
• Dual Encoder Approach (averaging the context and target representations): As context plays a vital role in our dataset, we pass the context and the tweet separately through BERT to get their [CLS] token embeddings. Each embedding acts as a sentence representation for the context or the tweet. These embeddings are averaged and passed to the dense layer for classification. If the context is absent, we consider only the tweet representation.
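The sketch below illustrates the dual-encoder averaging idea under simplifying assumptions: a single shared BERT encoder (we leave open whether the two branches share weights), the hypothetical class name DualEncoderClassifier, an assumed m-BERT checkpoint, and no training loop.

# Minimal sketch of the dual-encoder approach: encode context and tweet
# separately, average their [CLS] embeddings, then classify.
import torch.nn as nn
from transformers import AutoModel

class DualEncoderClassifier(nn.Module):
    def __init__(self, model_name="bert-base-multilingual-cased", num_labels=2):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        self.classifier = nn.Linear(self.encoder.config.hidden_size, num_labels)

    def encode(self, **inputs):
        # [CLS] token embedding as the sentence representation
        return self.encoder(**inputs).last_hidden_state[:, 0]

    def forward(self, tweet, context=None):
        rep = self.encode(**tweet)
        if context is not None:
            rep = (rep + self.encode(**context)) / 2   # average tweet and context reps
        return self.classifier(rep)                    # logits for the softmax classifier

# usage (hypothetical):
# tok = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
# model = DualEncoderClassifier()
# logits = model(tok("reply text", return_tensors="pt"),
#                context=tok("parent tweet", return_tensors="pt"))

Here tweet and context are expected to be tokenizer outputs (dictionaries with input_ids and attention_mask); cross-entropy over the returned logits corresponds to the softmax classification described above.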
Table 3
Result metrics for different BERT configurations. FE indicates a BERT model with a frozen embedding layer. C-Avg indicates averaging over the [CLS] token embeddings of the context and the tweet. Dictionary indicates a static list of profane words.

Model | Precision | Recall | F1 score
m-BERT baseline | 66.07 | 65.63 | 65.53
Indic-BERT baseline | 67.18 | 67.17 | 67.17
m-BERT + frozen embeddings (FE) | 70.03 | 67.40 | 66.65
m-BERT + FE + C-Avg | 67.70 | 67.65 | 67.65
m-BERT + FE + C-Avg + Dictionary | 68.82 | 68.62 | 68.61
Indic-BERT + FE + Dictionary | 70.71 | 70.07 | 69.99
Indic-BERT + FE + C-Avg + Dictionary | 71.09 | 70.44 | 70.37
Ensemble 2 (Indic-BERT C-Avg + m-BERT C-Avg) | 71.65 | 71.59 | 71.60
Ensemble 4 (Indic-BERT + Indic-BERT C-Avg + m-BERT + m-BERT C-Avg) | 73.21 | 73.17 | 73.07

5. Results and Discussions
We evaluate different BERT-based approaches for the task of hate speech detection. The results of the experiments are outlined in Table 3. Macro precision, recall, and F1-score are the metrics used to compare the models. As the target text uses code-mixed Hindi and English, we use m-BERT and Indic-BERT as our baselines. In the baseline approach, we concatenate the target and context text using a separator token. We perform a series of experiments on top of the baseline model by freezing the embedding layer and incorporating a static dictionary of offensive words. The frozen embeddings showed promising results as the token embeddings were not overfitted to the training data. The static dictionary is used as a deterministic classifier by directly tagging a text as hateful if any offensive word is present in it. The dictionary was created using various web sources, and neither the train data nor the test data was referenced during the process. In the dual encoder approach, we average the [CLS] token embeddings of the context and the tweet, which further improves the F1 score. Integrating the static dictionary with this method improves the F1 numbers further. The Indic-BERT model with frozen embeddings, the static dictionary, and the dual representation approach outperformed all the other techniques.

We combine the best-performing models using simple ensemble techniques to get the best results. The scores of the individual models are fused by averaging. The confusion matrices for the best models are shown in Figure 2. Note that the best run r1_2021_v5 submitted to the shared task was based on m-BERT + FE + C-Avg and resulted in an F1-score of 67.42%. The other experiments were conducted after the shared task, and their results are reported on the same test set.

Figure 2: Confusion matrices for the Ensemble 4 approach (left) and the Indic-BERT dual encoder approach (right). The rows correspond to the true class and the columns to the predicted class. Each cell value is further segregated as the number of contextual examples + the number of non-contextual examples.
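The score fusion and the dictionary rule can be summarized by the sketch below. It assumes Hugging Face sequence-classification models paired with their tokenizers, a label mapping of 0 for NOT and 1 for HOF, and placeholder names (model_tokenizer_pairs, texts, profane_words); it illustrates the fusion logic rather than our exact implementation.

# Minimal sketch of score-averaging ensembling with a static-dictionary rule.
import torch

def ensemble_predict(model_tokenizer_pairs, texts, profane_words):
    probs = []
    for model, tokenizer in model_tokenizer_pairs:
        enc = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
        with torch.no_grad():
            logits = model(**enc).logits              # sequence-classification head
        probs.append(torch.softmax(logits, dim=-1))
    avg = torch.stack(probs).mean(dim=0)              # fuse scores by averaging
    preds = avg.argmax(dim=-1).tolist()               # assumed mapping: 0 = NOT, 1 = HOF
    # deterministic dictionary rule: tag as HOF if any profane word occurs
    for i, text in enumerate(texts):
        if any(word in text.lower() for word in profane_words):
            preds[i] = 1
    return preds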
6. Conclusion
Under the HASOC 2021 ICHCL task, we evaluate various fine-tuning techniques for code mixed data that take the context of the tweets into account. We mainly focus on multilingual BERT based architectures. We observe that frozen embeddings give better results by retaining rich token representations from the pre-trained model. Moreover, averaging over sentence representations has helped the model understand the context better while classifying the current tweet. Using the ensemble of models, we report a best F1 score of 73.07%, compared to the m-BERT baseline F1 score of 65.53%. Note that our leaderboard F1-score is 67.42%, and the other experiments were performed after the final submission. Primarily, we emphasize the importance of averaging representations in the dual BERT encoder setting for context-based text classification problems.

Acknowledgments
This research was conducted under the guidance of L3Cube, Pune. We would like to express our gratitude towards our mentors at L3Cube for their continuous support and encouragement.

References
[1] C. Ezeibe, Hate speech and election violence in Nigeria, Journal of Asian and African Studies 56 (2021) 919–935.
[2] A. Matamoros-Fernández, J. Farkas, Racism, hate speech, and social media: A systematic review and critique, Television & New Media 22 (2021) 205–224.
[3] N. Vashistha, A. Zubiaga, Online multilingual hate speech detection: Experimenting with Hindi and English social media, Information 12 (2021). URL: https://www.mdpi.com/2078-2489/12/1/5. doi:10.3390/info12010005.
[4] S. MacAvaney, H.-R. Yao, E. Yang, K. Russell, N. Goharian, O. Frieder, Hate speech detection: Challenges and solutions, PloS one 14 (2019) e0221152.
[5] A. Schmidt, M. Wiegand, A survey on hate speech detection using natural language processing, in: Proceedings of the Fifth International Workshop on Natural Language Processing for Social Media, 2017, pp. 1–10.
[6] S. Modha, T. Mandl, G. K. Shahi, H. Madhu, S. Satapara, T. Ranasinghe, M. Zampieri, Overview of the HASOC Subtrack at FIRE 2021: Hate Speech and Offensive Content Identification in English and Indo-Aryan Languages and Conversational Hate Speech, in: FIRE 2021: Forum for Information Retrieval Evaluation, Virtual Event, 13th-17th December 2021, ACM, 2021.
[7] R. Joshi, R. Karnavat, K. Jirapure, R. Joshi, Evaluation of deep learning models for hostility detection in Hindi text, in: 2021 6th International Conference for Convergence in Technology (I2CT), IEEE, 2021, pp. 1–5.
[8] A. Wani, I. Joshi, S. Khandve, V. Wagh, R. Joshi, Evaluating deep learning approaches for Covid19 fake news detection, in: International Workshop on Combating Online Hostile Posts in Regional Languages during Emergency Situation, Springer, 2021, pp. 153–163.
[9] S. Satapara, S. Modha, T. Mandl, H. Madhu, P. Majumder, Overview of the HASOC Subtrack at FIRE 2021: Conversational Hate Speech Detection in Code-mixed Language, in: Working Notes of FIRE 2021 - Forum for Information Retrieval Evaluation, CEUR, 2021.
[10] R. Joshi, R. Joshi, Evaluating input representation for language identification in Hindi-English code mixed text, arXiv preprint arXiv:2011.11263 (2020).
[11] R. Joshi, P. Goel, R. Joshi, Deep learning for Hindi text classification: A comparison, in: International Conference on Intelligent Human Computer Interaction, Springer, 2019, pp. 94–101.
[12] P. Mishra, M. D. Tredici, H. Yannakoudakis, E. Shutova, Abusive language detection with graph convolutional networks, 2019. arXiv:1904.04073.
[13] Z. Waseem, D. Hovy, Hateful symbols or hateful people? Predictive features for hate speech detection on Twitter, in: Proceedings of the NAACL Student Research Workshop, Association for Computational Linguistics, San Diego, California, 2016, pp. 88–93. URL: https://aclanthology.org/N16-2013. doi:10.18653/v1/N16-2013.
[14] P. Maheshappa, B. Mathew, P. Saha, Using knowledge graphs to improve hate speech detection, in: 8th ACM IKDD CODS and 26th COMAD, CODS COMAD 2021, Association for Computing Machinery, New York, NY, USA, 2021, p. 430. URL: https://doi.org/10.1145/3430984.3431072. doi:10.1145/3430984.3431072.
[15] Z. Zhang, L. Luo, Hate speech detection: A solved problem? The challenging case of long tail on Twitter, 2018. arXiv:1803.03662.
[16] M.-A. Rizoiu, T. Wang, G. Ferraro, H. Suominen, Transfer learning for hate speech detection in social media, 2019. arXiv:1906.03829.
[17] P. Mathur, R. Sawhney, M. Ayyar, R. Shah, Did you offend me? Classification of offensive tweets in Hinglish language, in: Proceedings of the 2nd Workshop on Abusive Language Online (ALW2), Association for Computational Linguistics, Brussels, Belgium, 2018, pp. 138–148. URL: https://aclanthology.org/W18-5118. doi:10.18653/v1/W18-5118.
[18] L. Gao, R. Huang, Detecting online hate speech using context aware models, in: Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017, INCOMA Ltd., Varna, Bulgaria, 2017, pp. 260–266. URL: https://doi.org/10.26615/978-954-452-049-6_036. doi:10.26615/978-954-452-049-6_036.
[19] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, I. Polosukhin, Attention is all you need, 2017. arXiv:1706.03762.
[20] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding, 2019. arXiv:1810.04805.
[21] T. Pires, E. Schlinger, D. Garrette, How multilingual is multilingual BERT?, 2019. arXiv:1906.01502.
[22] Z. Lan, M. Chen, S. Goodman, K. Gimpel, P. Sharma, R. Soricut, ALBERT: A lite BERT for self-supervised learning of language representations, 2020. arXiv:1909.11942.
[23] D. Kakwani, A. Kunchukuttan, S. Golla, G. N.C., A. Bhattacharyya, M. M. Khapra, P. Kumar, IndicNLPSuite: Monolingual Corpora, Evaluation Benchmarks and Pre-trained Multilingual Language Models for Indian Languages, in: Findings of EMNLP, 2020.
[24] T. Wolf, J. Chaumond, L. Debut, V. Sanh, C. Delangue, A. Moi, P. Cistac, M. Funtowicz, J. Davison, S. Shleifer, et al., Transformers: State-of-the-art natural language processing, in: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, 2020, pp. 38–45.