<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Forum for Information Retrieval Evaluation, December</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Battling Hateful Content in Indic Languages HASOC '21</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Aditya Kadam</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anmol Goel</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jivitesh Jain</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jushaan Singh Kalra</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mallika Subramanian</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Manvith Reddy</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Prashant Kodali</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>T.H. Arjun</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Manish Shrivastava</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ponnurangam Kumaraguru</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Delhi Technological University</institution>
          ,
          <addr-line>Delhi</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>International Institute of Information Technology</institution>
          ,
          <addr-line>Hyderabad</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2021</year>
      </pub-date>
      <volume>1</volume>
      <fpage>3</fpage>
      <lpage>17</lpage>
      <abstract>
        <p>The extensive rise in consumption of online social media (OSMs) by a large number of people poses a critical problem of curbing the spread of hateful content on these platforms. With the growing usage of OSMs in multiple languages, the task of detecting and characterizing hate becomes more complex. The subtle variations of code-mixed texts along with switching scripts only add to the complexity. This paper presents a solution for the HASOC 2021 Multilingual Twitter Hate-Speech Detection challenge by team PreCog IIIT Hyderabad. We adopt a multilingual transformer based approach and describe our architecture for all 6 subtasks as part of the challenge. Out of the 6 teams that participated in all the subtasks, our submissions rank 3rd overall.</p>
      </abstract>
      <kwd-group>
        <kwd>Hate Speech</kwd>
        <kwd>Social Media</kwd>
        <kwd>Code Mixed</kwd>
        <kwd>Indic Languages</kwd>
        <kwd>Transformer Architecture</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Dissemination of hateful content on nearly all social media is increasingly becoming an alarming
concern. In the research community as well, this is a heavily studied research problem [
        <xref ref-type="bibr" rid="ref1 ref2 ref3 ref4 ref5 ref6">1, 2,
3, 4, 5</xref>
Misconduct such as bullying, derogatory comments based on gender, race or religion,
threatening remarks etc. is more prevalent today than ever before. The repercussions of
such content are profound: it can result in increased mental stress, emotional outbursts
and negative psychological impacts [
        <xref ref-type="bibr" rid="ref7">6</xref>
        ]. Hence, curbing the proliferation of this hate speech is
imperative. Furthermore, the massive scale at which online social media platforms function
makes it an even more pressing issue, which needs to be addressed in a robust manner. Most
online social media platforms have imposed strict guidelines 1 2 3 to help prevent the spread
of hate. In spite of these platform regulations, the dynamics of user interaction influence the
diffusion of (and hence increase in) hate to a large extent [
        <xref ref-type="bibr" rid="ref1 ref5">1</xref>
        ].
      </p>
      <p>
        The problem of hate speech has been addressed by several researchers, but the rise in
multilingual content has added to the complexity of identifying hateful content. The majority of these
studies deal with high-resource languages such as English, and only recently have low-resource
languages – such as several Indic languages – been more deeply explored [
        <xref ref-type="bibr" rid="ref8">7</xref>
        ]. In a country
like India, with a multitude of regional languages, the phenomenon of Code Mixing/Switching
(wherein linguistic units such as phrases/words of two languages occur in a single utterance) is
also pervasive.
      </p>
      <p>In this paper we elucidate our approach to solving the six downstream tasks of hate speech
identification and characterization in Indian languages as part of the ‘HASOC ’21 Hate Speech
and Offensive Content Identification in English and Indo-Aryan Languages’ challenge [8].
Motivated by existing architectures, we curate our own pipeline by fusing fine-tuned transformer
based models with additional features, and highlight the different
methodologies adopted for the three languages – English, Hindi and Marathi – as well as Code Mixed
Hindi-English. We also make our code, methodology and approach public to the research
community. 4</p>
    </sec>
    <sec id="sec-2">
      <title>2. Literature Review</title>
      <p>Discerning hateful content on social media is an already tricky problem given the challenges
associated with it: for instance, disrespectful/abusive words may be censored in text, and some
expressions may not be inherently offensive yet can be so in the right context [ 9].
Owing to the conversational design of social media, wherein users can reply to a given comment
(either supporting, refuting or irrelevant to the original message), the build-up of threads in response
to a hateful message can also intensify hate even if the reply is not hateful on its own. The
evolution of such hate intensity has shown diverse patterns and no direct correlation to the
parent tweet, which makes the task of hate speech detection more difficult [ 10].</p>
      <p>
        A significant amount of research has been conducted to evaluate traditional NLP approaches
such as character-level CNNs, word embedding based approaches and the myriad of
variations with LSTMs (sub-word level, hierarchical, BiLSTMs) [11]. Likewise, machine learning
algorithms including SVMs, K-Nearest Neighbours, Multinomial Naive Bayes (MNB) and their
respective performances in multilingual text settings have also been explored [
        <xref ref-type="bibr" rid="ref8">12, 7, 13</xref>
        ].
Investigating the categories of profane words commonly used in hate speech is another non-trivial
subtask under the hate detection umbrella, primarily because of the different interpretations of
words in different cultures/demographics, the adoption of slang by newer generations etc. [ 14].
      </p>
      <p>
        In recent times however, with the introduction of Transformer based models and their
performance in Natural Language Understanding (NLU) tasks, significant work has been done
to adapt them to multilingual texts as well, and to leverage transfer between languages.
Models such as XLMR, mBERT, MuRIL, RemBERT have gained much popularity and have shown
promising results [
        <xref ref-type="bibr" rid="ref9">15, 16, 17</xref>
        ]. Transfer learning based approaches that leverage performance of
high resource languages accompanied with CNN classification heads have also shown significant
improvements in capturing hateful content on social media platforms [
        <xref ref-type="bibr" rid="ref10 ref11">18, 19</xref>
        ]. Sharing and
re-utilizing the model weights learnt whilst training on a corpus for a high resource language
can aid the process of training for languages that are still under explored [
        <xref ref-type="bibr" rid="ref12">20</xref>
        ].
      </p>
      <p>4 https://github.com/Adi2K/Precog-HASOC-2021</p>
    </sec>
    <sec id="sec-3">
      <title>3. Dataset</title>
      <sec id="sec-3-1">
        <title>3.1. Dataset &amp; Task Description</title>
        <p>
          Subtask 1 consisted of data for 3 languages, namely – English, Hindi and Marathi [
          <xref ref-type="bibr" rid="ref13 ref14">21, 22</xref>
          ].
For English and Hindi, the task was further subdivided into 2 sub-parts: a) identification of
hateful v/s non-hateful content and b) characterizing the kind of hate present in a tweet –
either Profane, Hateful, Offensive or None. The distribution of the different data classes for
each of the three languages is shown in Table 1.
        </p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Preprocessing Data</title>
        <p>As a precursor to applying any NLP models to text data, we pre-processed the dataset with
standard techniques. Given that data from Twitter is bound to contain a certain amount of noise
and unwanted elements such as URLs, mentions etc., these were removed from the tweet texts.
Hashtags make a slightly different contribution to the analysis of a tweet, since they may or may
not contribute positively to the classification task. Through the results of our experiments,
we observed that omitting the hashtags worked better, and hence they were cleaned
from the tweets as well.</p>
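        <p>The cleaning steps above (removing URLs, mentions and hashtags, then normalizing whitespace) can be sketched with simple regular expressions; the exact patterns used by the team are not reported, so this is an illustrative approximation:</p>

```python
import re

def clean_tweet(text: str) -> str:
    """Remove URLs, @mentions and #hashtags from a tweet, then collapse whitespace."""
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # URLs
    text = re.sub(r"@\w+", " ", text)                   # user mentions
    text = re.sub(r"#\w+", " ", text)                   # hashtags (dropped entirely)
    return re.sub(r"\s+", " ", text).strip()            # normalise whitespace

print(clean_tweet("@user This is #hateful content http://t.co/abc"))  # -> "This is content"
```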
        <p>Since the data is code mixed, not only in terms of the combination of languages but also
with respect to scripts (some English text is written in Roman script, whereas some Hindi text
is written in Devanagari apart from Roman), we also normalize the Indic language scripts for
(a) Using BerTweet model with a MLP classifier
head.</p>
        <p>(b) Combining CNN features over XLM-R output
and manually generated feature vectors.</p>
        <p>Marathi and Hindi. In addition to that, we removed stop words for the Marathi dataset using
this list. 5 Finally, punctuations were also removed from the dataset texts.</p>
        <p>An interesting observation was that, for the task of hate detection, converting emojis
to text in the tweets did not improve the performance of our models significantly
(rather, it reduced the scores by some margin). However, including emojis along with text while
characterizing hate did have a positive impact, since the emoji-to-text conversion was able to capture
hints of sentiment and indirect offensive/profane content.</p>
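        <p>Emoji-to-text conversion of this kind is typically done with a library lookup (e.g. the `emoji` package's `demojize`); the tiny hand-rolled mapping below is a hypothetical stand-in used purely to show the transformation:</p>

```python
# Minimal emoji-to-text mapping; a real pipeline would use a full library mapping.
EMOJI_TEXT = {
    "\U0001F620": " angry_face ",               # 😠
    "\U0001F602": " face_with_tears_of_joy ",   # 😂
}

def demojize(text: str) -> str:
    """Replace known emojis with their textual names, then tidy whitespace."""
    for emo, name in EMOJI_TEXT.items():
        text = text.replace(emo, name)
    return " ".join(text.split())

print(demojize("go away \U0001F620"))  # -> "go away angry_face"
```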
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Methodology</title>
      <sec id="sec-4-1">
          <title>4.1. Subtask 1: Identifying Hate, Offensive and Profane content from the post</title>
        <sec id="sec-4-1-1">
          <title>4.1.1. English Classifiers</title>
          <p>
            For the English Subtask 1, the architecture that resulted in the best performance is an ensemble
of the following models:
• Fine-tuned BERTweet model [
            <xref ref-type="bibr" rid="ref16">24</xref>
            ]
• Fine-tuned XLM-Roberta [16] with CNN Head
          </p>
          <p>We use XLM-R, a multilingual model, along with the monolingual model in the ensemble
as we found that some of the text in the training set has transliterated Hindi along with some
Devanagari text. We extracted textual features such as distribution of ‘?’, ‘!’, capital letters etc.</p>
          <p>5 https://github.com/stopwords-iso/stopwords-mr</p>
          <p>
            We also use the percentage of profane words and the sentiment of the text as features. We use
a profane words list curated from various sources such as words/cuss 6, zacanger/profane-words 7
and t-davidson/lexicons. 8 For sentiment analysis we use the TweetEval [
            <xref ref-type="bibr" rid="ref17">25</xref>
            ] model and use its
softmax output as a feature to our models.
          </p>
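          <p>As an illustration, the hand-crafted features described above (densities of ‘?’, ‘!’ and capital letters, plus the fraction of profane tokens) might be computed as follows; the exact feature set and normalisation the team used are assumptions here:</p>

```python
def text_features(text: str, profane_words: set[str]) -> list[float]:
    """Compute simple surface features of a tweet as a fixed-length vector."""
    tokens = text.split()
    n_tokens = max(len(tokens), 1)
    n_chars = max(len(text), 1)
    return [
        text.count("?") / n_chars,                 # question-mark density
        text.count("!") / n_chars,                 # exclamation-mark density
        sum(c.isupper() for c in text) / n_chars,  # capital-letter density
        sum(t.lower().strip(".,!?") in profane_words for t in tokens) / n_tokens,
    ]                                              # fraction of profane tokens

feats = text_features("You IDIOT!!", {"idiot"})
```

Such a vector can then be concatenated with model-derived features, as the ensemble described above does.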
          <p>
            Inspired by Kim [
            <xref ref-type="bibr" rid="ref18">26</xref>
            ] we pass the embedding (a concatenation of the last 4 hidden layers) through convolution
layers of various widths with max-pooling, into a fully connected layer of size 128 with
dropout. We concatenate this 128-dimensional vector with our feature vector. We pass this
output to a dense output layer with softmax activation and cross-entropy loss, as shown in
Figure 1.
          </p>
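          <p>A minimal sketch of such a Kim-style CNN head in PyTorch follows; the filter counts and kernel widths are illustrative assumptions, as the paper does not list the exact values:</p>

```python
import torch
import torch.nn as nn

class KimCNNHead(nn.Module):
    """Kim (2014)-style CNN classification head over transformer token embeddings.
    Input embeddings are assumed to be the concatenated last 4 hidden layers (4 x 768)."""
    def __init__(self, emb_dim=4 * 768, n_filters=64, widths=(2, 3, 4), n_classes=2):
        super().__init__()
        self.convs = nn.ModuleList(nn.Conv1d(emb_dim, n_filters, w) for w in widths)
        self.fc = nn.Linear(n_filters * len(widths), 128)
        self.drop = nn.Dropout(0.5)
        self.out = nn.Linear(128, n_classes)

    def forward(self, x):            # x: (batch, seq_len, emb_dim)
        x = x.transpose(1, 2)        # Conv1d expects (batch, channels, seq_len)
        pooled = [c(x).relu().max(dim=2).values for c in self.convs]  # max over time
        h = self.drop(torch.relu(self.fc(torch.cat(pooled, dim=1))))
        return self.out(h)           # logits; train with nn.CrossEntropyLoss

head = KimCNNHead()
logits = head(torch.randn(2, 32, 4 * 768))
print(logits.shape)  # torch.Size([2, 2])
```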
          <p>
            Along with the previous models, we fine-tune BERTweet, a pre-trained language model for
English tweets. BERTweet has the same architecture as BERT and follows the pre-training
procedure of RoBERTa, but is trained solely on tweets, thus making it a viable alternative
suitable for our task. This model has shown state-of-the-art results on tasks based on
tweets [
            <xref ref-type="bibr" rid="ref16">24</xref>
            ]. We use the encoder architecture and pass the pooled output through a linear layer
for the classification which uses softmax activation and cross-entropy loss as shown in the
Figure 1.
          </p>
          <p>We also trained the models on the previous years’ datasets but noticed that this did not increase
the performance of the models; it actually degraded performance on Task 1-B due to the skewed
distribution of classes. Transliteration of emojis did not improve performance either. Since the class
imbalance in Subtask 1-B degraded the performance of our models, we tried to counter
it with a weighted loss function, but found that this decreases performance and
that the domain-specific distribution actually helps the models. We also perform K-Fold
validation and use early stopping to avoid over-fitting. We average the probabilities of each
class across folds and across the two models in our ensemble.</p>
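          <p>The final prediction scheme, averaging per-class probabilities over folds and over the two ensemble members, reduces to a simple mean followed by an argmax; a minimal sketch:</p>

```python
def average_probs(runs: list[list[float]]) -> list[float]:
    """Average per-class probability vectors over K folds x M models."""
    n = len(runs)
    return [sum(r[c] for r in runs) / n for c in range(len(runs[0]))]

# e.g. 2 models x 2 folds of [P(NOT), P(HOF)] probabilities
probs = average_probs([[0.9, 0.1], [0.6, 0.4], [0.7, 0.3], [0.8, 0.2]])
label = max(range(len(probs)), key=probs.__getitem__)
print(probs, label)  # probs ~ [0.75, 0.25], label 0
```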
        </sec>
        <sec id="sec-4-1-2">
          <title>4.1.2. Hindi &amp; Marathi Classifier</title>
          <p>For both the Hindi and Marathi languages, the architecture that performed best utilized the
XLM-R transformer model. This model was able to capture the code-mixed and multilingual
nature of the tweet dataset. To improve the results, we leveraged intermediary representations of
the language model as well as textual features extracted from the tweets. In particular,
we utilized the Multilingual MiniLM language model for fine-tuning on Hindi Subtask 1-B.
We observed that MiniLM with Focal Loss instead of Cross Entropy Loss performed better
than other baselines in the imbalanced multi-class setting of Hindi Subtask 1-B. Focal Loss
compensates for class imbalance with a factor that increases the network’s sensitivity towards
mis-classified samples.</p>
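          <p>Focal Loss scales cross entropy by (1 - p_t)^γ, so confidently correct samples contribute little while mis-classified ones dominate the gradient. A scalar sketch with γ = 2 (the γ used by the team is not reported):</p>

```python
import math

def focal_loss(probs: list[float], target: int, gamma: float = 2.0) -> float:
    """Focal loss for one sample: -(1 - p_t)^gamma * log(p_t),
    where p_t is the predicted probability of the true class."""
    p_t = probs[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)

# A confident correct prediction contributes almost nothing...
easy = focal_loss([0.05, 0.90, 0.05], target=1)
# ...while a mis-classified sample keeps a large loss (and gradient) signal.
hard = focal_loss([0.80, 0.10, 0.10], target=1)
print(easy < hard)  # True
```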
          <p>
            Inspired by Mozafari et al. [
            <xref ref-type="bibr" rid="ref10">18</xref>
            ] we use the pre-trained representations of the text from the 12
hidden layers of the XLM-R model (each of 768 dimensions) and then apply a CNN layer with a
kernel size of 3. The output is then passed through a softmax, following which the cross-entropy
loss is computed during training. This model architecture is represented in Figure 2. Tuning
hyperparameters such as optimizers, loss functions and dropout layers, we experiment with
different options. For the optimizers we try Adadelta and Adam, with Adam working
out better. Amongst all loss functions, Cross Entropy Loss performed the best. As for the
dropout layers, we explore dropouts in the range 0.1-0.5 and use 0.5 as the final dropout for the
model architecture.
          </p>
          <p>6 https://github.com/words/cuss
7 https://github.com/zacanger/profane-words
8 https://github.com/t-davidson/hate-speech-and-offensive-language/tree/master/lexicons</p>
          <p>Figure 2: (a) The base architecture for the Hindi &amp; Marathi languages for Subtask 1, using XLM-R with CNN augmented with a textual features vector, followed by a softmax layer. (b) Multilingual MiniLM architecture adopted to overcome class imbalance while characterizing hate for the Hindi Subtask 1-B.</p>
          <p>We further augment the model features with two kinds of textual features – the fraction of
profane words and the sentiment of the tweet. Due to the lack of resources for Marathi, we catalogue 9
a list of profane words in Marathi and use it to find the fraction of profane words in a tweet.
For Hindi, we curate a list of profane words by collating and appending to existing lists 10,
and use this to score each tweet. As for the sentiment of the tweet, we incorporated off-the-shelf
HuggingFace models to obtain the positive, negative and neutral scores for a tweet 11 12.
Although the textual features improved the performance for Hindi only by a small margin, for
Marathi, manually extracted textual features helped in achieving a significant boost.</p>
          <p>For the Marathi Subtask 1, we experimented with a voting ensemble of the XLM-Roberta
with CNN Head using the following features:
• Word Embedding + Fraction of Profane Words + Sentiment Polarity
• Word Embedding + Sentiment Polarity
However, we noticed that the base model with the embedding and the textual features performed
better on the leaderboard.</p>
          <p>9 https://github.com/Adi2K/MarathiSwear
10 https://github.com/neerajvashistha/online-hate-speech-recog/blob/master/data/hi/Hinglish-Offensive-Text-Classification/Hinglish_Profanity_List.csv
11 https://huggingface.co/l3cube-pune/MarathiSentiment
12 https://huggingface.co/cardiffnlp/twitter-xlm-roberta-base-sentiment</p>
          <p>(a) Model pipeline for hate detection in conversational threads for Subtask 2. (b) Hierarchy of a conversation thread and its associated comments.</p>
        </sec>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Subtask 2: Identification of Conversational Hate-Speech in Code-Mixed Languages (ICHCL)</title>
        <p>
          The tweets for Subtask 2 are code mixed. While Transformer based encoder models have
performed well on various monolingual NLU tasks, their performance does not reach the same
level on code mixed sentences. Multilingual transformer based models have been applied
to various code mixed NLU tasks, and have performed better than monolingual transformer
based models [
          <xref ref-type="bibr" rid="ref19">27</xref>
          ]. For this task, we use XLM-RoBERTa [16]. To capture both the context and the
tweet itself, we modify the input in the following manner, where [CLS] and [SEP] are part of the
vocabulary of the model, and are used to classify and to separate multiple input sentences, respectively:
[CLS] &lt;Tweet text to be classified&gt; [SEP] &lt;context of parent tweet&gt; [SEP]
Here, &lt;Tweet text to be classified&gt; is the text of the tweet/comment/reply that is being classified,
while &lt;context of parent tweet&gt; is either just the parent tweet or the concatenation of the parent tweet
and comment, depending on whether the text to be classified is a tweet, a comment or a reply.
While classifying a standalone tweet, the context is left empty. The Hindi corpus used to train
XLM-Roberta is in Devanagari script, with only a small portion of the corpus
in Romanised form. With the hypothesis that the performance of the model would improve if the
Hindi tokens were in Devanagari script, we used the CSNLI tool 13 to convert Romanised tokens
to Devanagari script. However, this normalisation had only a marginal impact on the final
performance of the model. We used Huggingface’s Trainer API to train the XLM-R model, and
the hyperparameters were chosen using the hyperparameter search functionality offered by
the Trainer API.
        </p>
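        <p>The input construction described above can be sketched as a simple string-packing step (in practice a tokenizer inserts the special tokens itself via sentence-pair encoding; the explicit string form here is for illustration only):</p>

```python
def build_input(text: str, parent: str = "", grandparent: str = "") -> str:
    """Pack the unit to classify and its thread context into one sequence.
    For a reply, the context is parent tweet + comment; for a comment, just
    the parent tweet; for a standalone tweet, the context is empty."""
    context = " ".join(s for s in (grandparent, parent) if s)
    return f"[CLS] {text} [SEP] {context} [SEP]"

print(build_input("reply text", parent="comment", grandparent="original tweet"))
# [CLS] reply text [SEP] original tweet comment [SEP]
```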
      </sec>
      <sec id="sec-4-4">
        <title>4.3. Experiments</title>
        <p>Table 2 lists the models evaluated for each subtask, including: Ensemble, XLM-R + CNN, XLM-R + CNN + Sentiment Scores, XLM-R + CNN + Weighted loss, MuRIL, XLM-R Base, MiniLM with Focal Loss, and XLM-R with and without script normalisation.</p>
        <p>
          We used the Huggingface Transformers [
          <xref ref-type="bibr" rid="ref20">28</xref>
          ] library to implement the classifiers. For hyperparameter
tuning we use the Optuna framework 14. Exploring multiple architectures
simultaneously, we also tried ensembling an odd number of models following majority-rule based
selection. For the English Subtask 1 we also performed ensembling with averaged softmax probabilities.
However, the increase in complexity of the classification pipeline did not necessarily improve
performance scores, considering the size and distribution of the datasets for Hindi and
Marathi, though it helped for English. Table 2 captures the accuracies and F1 scores (corresponding to
submissions made on the leaderboard) of all our models for each of the subtasks.
13 https://github.com/irshadbhat/csnli
14 https://optuna.org/
        </p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>In this paper, we presented our approaches for hate speech detection in Indian languages
and in code-mixed Hindi-English using multilingual transformer based encoder models.
Although in this work we have employed different models to address the individual language-specific
subtasks, a multi-task single-model approach that performs well across all
the language pairs would be an interesting challenge, which we wish to explore as future
work. In addition, as part of future work, we would like to improve performance by
carrying out an additional step of domain-adaptive pre-training of the encoder models, and an
efficient ensemble of multilingual encoder models.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>We would like to thank the organisers of the HASOC ’21 shared task for addressing the crucial problem
of hate speech in Indian languages by releasing data resources, and for the smooth conduct of
the competition. We would also like to specially thank all members of our research lab, PreCog,
for the constructive suggestions during the whole process.</p>
      <p>[7] … methods for the languages of India, Information 12 (2021). URL: https://www.mdpi.com/2078-2489/12/8/306. doi:10.3390/info12080306.
[8] S. Modha, T. Mandl, G. K. Shahi, H. Madhu, S. Satapara, T. Ranasinghe, M. Zampieri, Overview of the HASOC Subtrack at FIRE 2021: Hate Speech and Offensive Content Identification in English and Indo-Aryan Languages and Conversational Hate Speech, in: FIRE 2021: Forum for Information Retrieval Evaluation, Virtual Event, 13th-17th December 2021, ACM, 2021.
[9] G. Kovács, P. Alonso, R. Saini, Challenges of hate speech detection in social media, SN Computer Science 2 (2021) 95. URL: https://doi.org/10.1007/s42979-021-00457-3. doi:10.1007/s42979-021-00457-3.
[10] S. Dahiya, S. Sharma, D. Sahnan, V. Goel, E. Chouzenoux, V. Elvira, A. Majumdar, A. Bandhakavi, T. Chakraborty, Would your tweet invoke hate on the fly? Forecasting hate intensity of reply threads on Twitter, in: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery &amp; Data Mining, KDD ’21, Association for Computing Machinery, New York, NY, USA, 2021, pp. 2732-2742. URL: https://doi.org/10.1145/3447548.3467150. doi:10.1145/3447548.3467150.
[11] T. Y. Santosh, K. V. Aravind, Hate speech detection in Hindi-English code-mixed social media text, in: Proceedings of the ACM India Joint International Conference on Data Science and Management of Data, CoDS-COMAD ’19, Association for Computing Machinery, New York, NY, USA, 2019, pp. 310-313. URL: https://doi.org/10.1145/3297001.3297048. doi:10.1145/3297001.3297048.
[12] P. Rani, S. Suryawanshi, K. Goswami, B. R. Chakravarthi, T. Fransen, J. P. McCrae, A comparative study of different state-of-the-art hate speech detection methods in Hindi-English code-mixed data, in: Proceedings of the Second Workshop on Trolling, Aggression and Cyberbullying, European Language Resources Association (ELRA), Marseille, France, 2020, pp. 42-48. URL: https://aclanthology.org/2020.trac-1.7.
[13] F. E. Ayo, O. Folorunso, F. T. Ibharalu, I. A. Osinuga, Machine learning techniques for hate speech classification of twitter data: State-of-the-art, future challenges and research directions, Computer Science Review 38 (2020) 100311. URL: https://www.sciencedirect.com/science/article/pii/S1574013720304111. doi:10.1016/j.cosrev.2020.100311.
[14] P. L. Teh, C.-B. Cheng, W. M. Chee, Identifying and categorising profane words in hate speech, in: Proceedings of the 2nd International Conference on Compute and Data Analysis, ICCDA 2018, Association for Computing Machinery, New York, NY, USA, 2018, pp. 65-69. URL: https://doi.org/10.1145/3193077.3193078. doi:10.1145/3193077.3193078.
[15] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding, in: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Association for Computational Linguistics, Minneapolis, Minnesota, 2019, pp. 4171-4186. URL: https://aclanthology.org/N19-1423. doi:10.18653/v1/N19-1423.
[16] A. Conneau, K. Khandelwal, N. Goyal, V. Chaudhary, G. Wenzek, F. Guzmán, E. Grave, M. Ott, L. Zettlemoyer, V. Stoyanov, Unsupervised cross-lingual representation learning at scale, in: Proceedings of the 58th Annual Meeting of the Association for Computational …
[28] … S. Gugger, M. Drame, Q. Lhoest, A. Rush, Transformers: State-of-the-art natural language processing, in: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, Association for Computational Linguistics, Online, 2020, pp. 38-45. URL: https://aclanthology.org/2020.emnlp-demos.6. doi:10.18653/v1/2020.emnlp-demos.6.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>B.</given-names>
            <surname>Mathew</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Dutt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mukherjee</surname>
          </string-name>
          ,
          <article-title>Spread of hate speech in online social media</article-title>
          ,
          <source>in: Proceedings of the 10th ACM Conference on Web Science</source>
          , WebSci '19, Association for Computing Machinery, New York, NY, USA,
          <year>2019</year>
          , p.
          <fpage>173</fpage>
          -
          <lpage>182</lpage>
          . URL: https://doi.org/10.1145/3292522.3326034. doi:10.1145/3292522.3326034.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>L.</given-names>
            <surname>Silva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mondal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Correa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Benevenuto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Weber</surname>
          </string-name>
          ,
          <article-title>Analyzing the targets of hate in online social media</article-title>
          ,
          <source>in: Tenth international AAAI conference on web and social media</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>Mozafari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Farahbakhsh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Crespi</surname>
          </string-name>
          ,
          <article-title>Hate speech detection and racial bias mitigation in social media based on bert model</article-title>
          ,
          <source>PLOS ONE 15</source>
          (
          <year>2020</year>
          )
          <fpage>1</fpage>
          -
          <lpage>26</lpage>
          . URL: https://doi.org/10.1371/journal.pone.0237861. doi:10.1371/journal.pone.0237861.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Mossie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-H.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>Vulnerable community identification using hate speech detection on social media</article-title>
          ,
          <source>Information Processing &amp; Management</source>
          <volume>57</volume>
          (
          <year>2020</year>
          )
          102087
          . URL: https://www.sciencedirect.com/science/article/pii/S0306457318310902. doi:10.1016/j.ipm.2019.102087.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <source>doi:10.1016/j.ipm.2019.102087</source>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>K. A.</given-names>
            <surname>Qureshi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sabih</surname>
          </string-name>
          ,
          <article-title>Un-compromised credibility: Social media based multi-class hate speech classification for text</article-title>
          ,
          <source>IEEE Access 9</source>
          (
          <year>2021</year>
          )
          <fpage>109465</fpage>
          -
          <lpage>109477</lpage>
          . doi:10.1109/ACCESS.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>K.</given-names>
            <surname>Saha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Chandrasekharan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>De Choudhury</surname>
          </string-name>
          ,
          <article-title>Prevalence and psychological effects of hateful speech in online college communities</article-title>
          ,
          <source>in: Proceedings of the 10th ACM Conference on Web Science</source>
          , WebSci '19,
          Association for Computing Machinery, New York, NY, USA,
          <year>2019</year>
          , pp.
          <fpage>255</fpage>
          -
          <lpage>264</lpage>
          . URL: https://doi.org/10.1145/3292522.3326032. doi:10.1145/3292522.3326032.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>T.</given-names>
            <surname>Ranasinghe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zampieri</surname>
          </string-name>
          ,
          <article-title>An evaluation of multilingual offensive language identification</article-title>
          , Association for Computational Linguistics, Online,
          <year>2020</year>
          , pp.
          <fpage>8440</fpage>
          -
          <lpage>8451</lpage>
          . URL: https://aclanthology.org/2020.acl-main.747. doi:10.18653/v1/2020.acl-main.747.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>S.</given-names>
            <surname>Khanuja</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Bansal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Mehtani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Khosla</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Dey</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Gopalan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. K.</given-names>
            <surname>Margam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Aggarwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. T.</given-names>
            <surname>Nagipogu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Dave</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. C. B.</given-names>
            <surname>Gali</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Subramanian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Talukdar</surname>
          </string-name>
          ,
          <article-title>MuRIL: Multilingual representations for Indian languages</article-title>
          ,
          <year>2021</year>
          . arXiv:2103.10730.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>M.</given-names>
            <surname>Mozafari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Farahbakhsh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Crespi</surname>
          </string-name>
          ,
          <article-title>A bert-based transfer learning approach for hate speech detection in online social media</article-title>
          , in:
          <string-name>
            <given-names>H.</given-names>
            <surname>Cherifi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gaito</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. F.</given-names>
            <surname>Mendes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Moro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. M.</given-names>
            <surname>Rocha</surname>
          </string-name>
          (Eds.),
          <source>Complex Networks and Their Applications VIII</source>
          , Springer International Publishing, Cham,
          <year>2020</year>
          , pp.
          <fpage>928</fpage>
          -
          <lpage>940</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>I.</given-names>
            <surname>Bigoulaeva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Hangya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Fraser</surname>
          </string-name>
          ,
          <article-title>Cross-lingual transfer learning for hate speech detection</article-title>
          ,
          <source>in: Proceedings of the First Workshop on Language Technology for Equality, Diversity and Inclusion</source>
          , Association for Computational Linguistics, Kyiv,
          <year>2021</year>
          , pp.
          <fpage>15</fpage>
          -
          <lpage>25</lpage>
          . URL: https://aclanthology.org/2021.ltedi-1.3.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>T.</given-names>
            <surname>Ranasinghe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zampieri</surname>
          </string-name>
          ,
          <article-title>Multilingual offensive language identification for low-resource languages</article-title>
          ,
          <source>CoRR abs/2105.05996</source>
          (
          <year>2021</year>
          ). URL: https://arxiv.org/abs/2105.05996. arXiv:2105.05996
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. K.</given-names>
            <surname>Shahi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Madhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Satapara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Majumder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schäfer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Ranasinghe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zampieri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Nandini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. K.</given-names>
            <surname>Jaiswal</surname>
          </string-name>
          ,
          <article-title>Overview of the HASOC subtrack at FIRE 2021: Hate Speech and Offensive Content Identification in English and Indo-Aryan Languages</article-title>
          , in: Working Notes of FIRE 2021 -
          <article-title>Forum for Information Retrieval Evaluation</article-title>
          ,
          CEUR
          ,
          <year>2021</year>
          . URL: http://ceur-ws.org/.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>S.</given-names>
            <surname>Gaikwad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Ranasinghe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zampieri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. M.</given-names>
            <surname>Homan</surname>
          </string-name>
          ,
          <article-title>Cross-lingual offensive language identification for low resource languages: The case of Marathi</article-title>
          ,
          <source>in: Proceedings of RANLP</source>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>S.</given-names>
            <surname>Satapara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Madhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Majumder</surname>
          </string-name>
          ,
          <article-title>Overview of the HASOC Subtrack at FIRE 2021: Conversational Hate Speech Detection in Code-mixed language</article-title>
          , in: Working Notes of FIRE 2021 -
          <article-title>Forum for Information Retrieval Evaluation</article-title>
          ,
          CEUR
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>D. Q.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Vu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. T.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          ,
          <article-title>BERTweet: A pre-trained language model for English Tweets</article-title>
          ,
          <source>in: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>9</fpage>
          -
          <lpage>14</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>F.</given-names>
            <surname>Barbieri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Camacho-Collados</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Espinosa-Anke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Neves</surname>
          </string-name>
          ,
          <article-title>TweetEval: Unified Benchmark and Comparative Evaluation for Tweet Classification</article-title>
          ,
          <source>in: Proceedings of Findings of EMNLP</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <article-title>Convolutional neural networks for sentence classification</article-title>
          ,
          <source>in: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)</source>
          ,
          <source>Association for Computational Linguistics</source>
          ,
          <year>2014</year>
          , pp.
          <fpage>1746</fpage>
          -
          <lpage>1751</lpage>
          . URL: https://aclanthology.org/D14-1181.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>S.</given-names>
            <surname>Khanuja</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Dandapat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Srinivasan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sitaram</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Choudhury</surname>
          </string-name>
          ,
          <article-title>GLUECoS: An evaluation benchmark for code-switched NLP</article-title>
          ,
          <source>in: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics</source>
          , Association for Computational Linguistics
          , Online,
          <year>2020</year>
          , pp.
          <fpage>3575</fpage>
          -
          <lpage>3585</lpage>
          . URL: https://aclanthology.org/2020.acl-main.329. doi:10.18653/v1/2020.acl-main.329.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>T.</given-names>
            <surname>Wolf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Debut</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Sanh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chaumond</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Delangue</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Moi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Cistac</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Rault</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Louf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Funtowicz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Davison</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Shleifer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>von Platen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Jernite</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Plu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Le Scao</surname>
          </string-name>
          ,
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>