<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Feature Extraction based Model for Hate Speech Identification</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Salar Mohtaj</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vera Schmitt</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sebastian Möller</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>German Research Centre for Artificial Intelligence (DFKI)</institution>
          ,
          <addr-line>Projektbüro Berlin, Berlin</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Quality and Usability Lab, Technische Universität Berlin</institution>
          ,
          <addr-line>Berlin</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The detection of hate speech online has become an important task, as offensive language such as hurtful, obscene and insulting content can harm marginalized people or groups. This paper presents the TU Berlin team's experiments and results on subtasks 1A and 1B of the shared task on hate speech and offensive content identification in Indo-European languages 2021. The performance of different Natural Language Processing models is evaluated for the respective subtasks. We tested models based on recurrent neural networks at the word and character levels, as well as transfer learning approaches based on BERT, on the dataset provided by the competition. Among the tested models, the transfer learning-based models achieved the best results in both subtasks.</p>
      </abstract>
      <kwd-group>
        <kwd>Hate speech detection</kwd>
        <kwd>Offensive Content Identification</kwd>
        <kwd>BERT</kwd>
        <kwd>LSTM</kwd>
        <kwd>English</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Hindi and Marathi. The second subtask focuses on the identification of conversational
hate speech in code-mixed languages. The TU Berlin team focuses on subtasks 1A and 1B.
Subtask 1A is a coarse-grained binary classification task in which tweets are classified into
two classes:
• (NOT) Non Hate-Offensive: these posts do not contain any hate speech, profane or offensive
content
• (HOF) Hate and Offensive: these posts contain hate, offensive and profane content
Subtask 1B is a three-class classification task offered for English and Hindi, in which
hate speech, profane and offensive posts from subtask 1A are further classified into the following
categories:
• (HATE) Hate speech: posts in this class contain hate speech content
• (OFFN) Offensive: posts in this class contain offensive content
• (PRFN) Profane: posts in this class contain profane content</p>
      <p>
        In this paper, we present the proposed models for classifying tweets into the classes of the
respective subtask. For this purpose, state-of-the-art NLP methods are applied
to classify the posts. The TU Berlin team focuses
on the English dataset. We used transfer learning models based on the BERT language model [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ],
and also Recurrent Neural Networks (RNNs), at both the word and character levels, to categorize
tweets into the relevant classes.
      </p>
      <p>Section 2 describes some of the state-of-the-art models for the task of hate
speech detection in English. Section 3 describes the provided training and test datasets, whereas
section 4 contains details about data processing and the experiments and models applied.
In section 5 the achieved results are analyzed, and section 6 summarizes and
concludes the approaches and results.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        In this section, we review some of the recent approaches for automatic hate speech detection
in English text. Although automated approaches for hate speech detection can be
categorized into keyword-based, source-metadata-based and machine learning based approaches [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ],
in this section we focus on some of the state-of-the-art machine learning based models.
      </p>
      <p>
        Among the models proposed for the HASOC shared task in 2020, Mishra et al. used
a Long Short-Term Memory (LSTM) based model with GloVe vectors [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] for the embedding
[
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. They fed the outputs of the embedding layer to a single-layer LSTM network and put a
fully connected layer on top. In this year’s competition we used a similar architecture
in one of our experiments. In contrast, the YNU_OXZ team [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] in the HASOC 2020
competition proposed a model based on XLM-RoBERTa [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] and LSTMs. In their model, they
concatenated the output of the last-layer hidden state of XLM-RoBERTa with the hidden states
of the last four layers of XLM-RoBERTa that are fed into an Ordered Neurons LSTM (ON-LSTM)
[
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. Finally, they fed these vectors into a fully connected network for the final classification.
      </p>
      <p>
        Badjatiya et al. ran experiments with three different neural network architectures
to detect hate speech in tweets [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. They used Convolutional Neural Networks (CNNs),
LSTM, and FastText [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], with either random embeddings or GloVe embeddings. The proposed
models categorize tweets as racist, sexist or neither. Their experiments show that the model
based on LSTM, random embeddings and Gradient Boosted Decision Trees outperforms the
other models in terms of precision, recall, and F1 score.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Data</title>
      <p>The English dataset of HASOC 2021 for subtasks 1A and 1B contains the text of
tweets in English, their IDs, and the labels for subtasks 1A and 1B. The statistics of the
training dataset are presented in Table 1. The test dataset contains 1281 tweets, which
should be categorized into one of the classes of the respective subtask.</p>
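      <p>The relation between the two label sets can be sketched in a few lines of Python. The counts are the training-set figures reported in Table 1; the dictionary and function names are illustrative assumptions, and the mapping simply encodes the statement that HATE, OFFN and PRFN posts are the HOF posts of subtask 1A.</p>
      <preformat><![CDATA[
```python
# Training-set class counts as reported in Table 1 (English, HASOC 2021).
SUBTASK_1A_COUNTS = {"HOF": 2501, "NOT": 1342}
SUBTASK_1B_COUNTS = {"HATE": 683, "OFFN": 622, "PRFN": 1196, "NONE": 1342}

# Fine-grained (1B) to coarse (1A) label mapping, as described in the text.
FINE_TO_COARSE = {"HATE": "HOF", "OFFN": "HOF", "PRFN": "HOF", "NONE": "NOT"}


def coarse_counts(fine_counts):
    """Aggregate subtask 1B counts into the two subtask 1A classes."""
    totals = {}
    for label, count in fine_counts.items():
        coarse = FINE_TO_COARSE[label]
        totals[coarse] = totals.get(coarse, 0) + count
    return totals


def class_shares(counts):
    """Return each class's share of the total, rounded to three decimals."""
    total = sum(counts.values())
    return {label: round(count / total, 3) for label, count in counts.items()}


if __name__ == "__main__":
    print(coarse_counts(SUBTASK_1B_COUNTS))
    print(class_shares(SUBTASK_1A_COUNTS))
```
]]></preformat>
      <p>Aggregating the subtask 1B counts reproduces the subtask 1A counts, and the shares show that roughly two thirds of the training tweets are labelled HOF.</p>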
      <p>
        The content contains hashtags, emojis, links and usernames that refer to users on
Twitter. A sample of the dataset in different categories is presented in Table 2. More details
about the datasets are provided in [
        <xref ref-type="bibr" rid="ref15 ref16">15, 16</xref>
        ].
      </p>
    </sec>
    <sec id="sec-4">
      <title>4. Experiments</title>
      <p>This section briefly describes the pre-processing steps, the developed
models and the experiments for the task of hate speech detection.</p>
      <sec id="sec-4-1">
        <title>4.1. Data Processing</title>
        <p>
          For pre-processing of the raw data, we followed the same procedure as in our experiments for
last year’s competition [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]. The pre-processing mainly includes replacing
mentions with the phrase ’username’, replacing emojis with short textual descriptions,
replacing links with the phrase ’link’, and replacing multiple white spaces with
a single white space. These steps are applied to both the training and test datasets in order to
facilitate the training process.
        </p>
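        <p>The four replacement steps above could look roughly like the following sketch. The regular expressions, the two-entry emoji table and the function name are assumptions made for illustration; the original implementation and its emoji lexicon are not given here.</p>
        <preformat><![CDATA[
```python
import re

# Hypothetical emoji-to-description table; a real system would need a full lexicon.
EMOJI_DESCRIPTIONS = {
    "\U0001F600": "grinning face",
    "\U0001F621": "pouting face",
}

LINK_RE = re.compile(r"https?://\S+")   # assumed definition of a link
MENTION_RE = re.compile(r"@\w+")        # assumed definition of a mention
WHITESPACE_RE = re.compile(r"\s+")


def preprocess(tweet: str) -> str:
    """Apply the replacement steps described in the text to one tweet."""
    # Links are replaced first so that an '@' inside a URL is not treated as a mention.
    text = LINK_RE.sub("link", tweet)
    text = MENTION_RE.sub("username", text)
    for emoji_char, description in EMOJI_DESCRIPTIONS.items():
        text = text.replace(emoji_char, f" {description} ")
    # Collapse runs of white space; trimming the ends is an extra assumption.
    return WHITESPACE_RE.sub(" ", text).strip()


if __name__ == "__main__":
    print(preprocess("@ndtv Nothing gonna help you   please https://t.co/m9FZyU4Lfg"))
```
]]></preformat>
        <p>Hashtags are left untouched in this sketch, since the text does not list them among the replaced items.</p>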
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Models</title>
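        <p>
          The best performance in last year’s HASOC competition on the English dataset was
achieved by [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ] with an LSTM using GloVe embeddings [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] as input. Furthermore, transformer-based
language models such as BERT [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ], DistilBERT and RoBERTa [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ], and also ELMo [20]
showed promising results for similar tasks. Therefore, the TU Berlin team focuses on BERT-based
transfer learning approaches for the proposed subtasks. We also ran experiments
on character-level LSTM models, which achieved our best results in last year’s competition
[
          <xref ref-type="bibr" rid="ref17">17</xref>
          ].
        </p>
        <table-wrap id="tab-1">
          <label>Table 1</label>
          <caption>
            <p>Statistics of the training dataset.</p>
          </caption>
          <table>
            <thead>
              <tr>
                <th rowspan="2">Language</th>
                <th rowspan="2">Total # of Instances</th>
                <th colspan="2">Subtask 1A</th>
                <th colspan="4">Subtask 1B</th>
              </tr>
              <tr>
                <th>HOF</th>
                <th>NOT</th>
                <th>HATE</th>
                <th>OFFN</th>
                <th>PRFN</th>
                <th>NONE</th>
              </tr>
            </thead>
            <tbody>
              <tr>
                <td>English</td>
                <td>3843</td>
                <td>2501</td>
                <td>1342</td>
                <td>683</td>
                <td>622</td>
                <td>1196</td>
                <td>1342</td>
              </tr>
            </tbody>
          </table>
        </table-wrap>
        <table-wrap id="tab-2">
          <label>Table 2</label>
          <caption>
            <p>A sample of the dataset in different categories.</p>
          </caption>
          <table>
            <thead>
              <tr>
                <th>Tweet</th>
                <th>Label</th>
              </tr>
            </thead>
            <tbody>
              <tr>
                <td>This is enough of yours Modi This is not skill India it is kill India @narendramodi #ExitModi #Resign_PM_Modi https://t.co/m9FZyU4Lfg</td>
                <td>HOF</td>
              </tr>
              <tr>
                <td>Please, abdicate! You failed us. You failed everyone. Everyone is suffering. EVERYONE! #ModiKaVaccineJumla</td>
                <td>HOF</td>
              </tr>
              <tr>
                <td>@Feisty_Waters Ok. What did you do to piss off the universe?</td>
                <td>HOF</td>
              </tr>
              <tr>
                <td>@ndtv Nothing gonna help you please #Resign_PM_Modi</td>
                <td>NOT</td>
              </tr>
            </tbody>
          </table>
        </table-wrap>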
      </sec>
      <sec id="sec-4-3">
        <title>4.3. LSTM based models</title>
        <p>We developed two different models based on LSTM networks: a smaller, character-based
architecture, Char_LSTM hereinafter, and a deeper and more complex network based on
words, Word_LSTM hereinafter. Since people sometimes make minor changes to words (e.g.,
by repeating some characters) when they express hate speech, a word-based model may not
recognize those terms properly. We therefore also developed a character-based model to compare
the outcomes of the two models.</p>
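        <p>The motivation for the character-based model can be illustrated with a toy example. The sentences, function names and the out-of-vocabulary measure below are assumptions chosen for illustration and are not taken from the dataset or from the original implementation.</p>
        <preformat><![CDATA[
```python
def word_tokens(text):
    """Lowercase and split on white space; each word is one unit."""
    return text.lower().split()


def char_tokens(text):
    """Lowercase and treat every character as one unit."""
    return list(text.lower())


def oov_rate(tokens, vocabulary):
    """Fraction of tokens that do not occur in the given vocabulary."""
    if not tokens:
        return 0.0
    unknown = sum(1 for token in tokens if token not in vocabulary)
    return unknown / len(tokens)


# Toy "training" text with the regular spelling (illustrative only).
train_text = "you are stupid"
word_vocab = set(word_tokens(train_text))
char_vocab = set(char_tokens(train_text))

# The same message with a repeated character.
variant = "you are stuuupid"

if __name__ == "__main__":
    print(oov_rate(word_tokens(variant), word_vocab))
    print(oov_rate(char_tokens(variant), char_vocab))
```
]]></preformat>
        <p>At the word level the altered term is unknown to the vocabulary, whereas at the character level every unit of the altered message has been seen before.</p>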
        <p>For the Char_LSTM, we tried out different hyper-parameters, including:
• Embedding dimension [50, 100, 200]
• Hidden dimension [16, 32, 64, 128]
• Dropout [0.25, 0.5, 0.75]
The ranges of the above-mentioned hyper-parameters for the Word_LSTM model are as follows:
• Embedding dimension [100, 300]
• Hidden dimension [32, 64, 128, 256, 512]
• Dropout [0.25, 0.5, 0.75]</p>
        <p>In our experiments, a batch size of 32, the Adam optimizer [21] and the Binary Cross
Entropy (BCE) loss function were used in both models. In the Word_LSTM model, we
tested both using GloVe pre-trained vectors and training the embedding layer from scratch.
The detailed results of the proposed models are presented in section 5.</p>
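        <p>The search space implied by these ranges can be enumerated as follows. This is a sketch under two assumptions: that the first listed range set (embedding dimensions 50, 100, 200) belongs to Char_LSTM and the second to Word_LSTM, and that all combinations are candidates. Whether every combination was actually trained is not stated; the names are illustrative.</p>
        <preformat><![CDATA[
```python
from itertools import product

# Hyper-parameter ranges as listed in the text (assignment to models is assumed).
CHAR_LSTM_GRID = {
    "embedding_dim": [50, 100, 200],
    "hidden_dim": [16, 32, 64, 128],
    "dropout": [0.25, 0.5, 0.75],
}
WORD_LSTM_GRID = {
    "embedding_dim": [100, 300],
    "hidden_dim": [32, 64, 128, 256, 512],
    "dropout": [0.25, 0.5, 0.75],
}


def expand_grid(grid):
    """Return one dict per combination of the listed hyper-parameter values."""
    keys = list(grid)
    value_lists = [grid[key] for key in keys]
    return [dict(zip(keys, values)) for values in product(*value_lists)]


if __name__ == "__main__":
    print(len(expand_grid(CHAR_LSTM_GRID)), "Char_LSTM configurations")
    print(len(expand_grid(WORD_LSTM_GRID)), "Word_LSTM configurations")
```
]]></preformat>
        <p>Under these assumptions the Char_LSTM grid has 36 combinations and the Word_LSTM grid has 30, before the choice between GloVe and randomly initialized embeddings is taken into account.</p>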
      </sec>
      <sec id="sec-4-4">
        <title>4.4. Bert based models</title>
        <p>In addition to the models based on Recurrent Neural Networks (RNNs), we tested two transfer
learning based models using the BERT language model [22]. In one of the experiments, we
fine-tuned English BERT for the task of hate speech identification. For this purpose, we followed the
hyper-parameters recommended by the authors [22].</p>
        <p>In the other transfer learning based model, we used BERT to extract features from the textual
data. In other words, in this approach the BERT language model was used to convert text data into
vectors. The resulting vectors were fed into a Gated Recurrent Unit (GRU) network. Different
hyper-parameter values were tested on the data to choose the best configuration. The ranges of the
hyper-parameters used in the feature extraction approach are as follows:
• Hidden dimension [32, 64, 128, 256, 512]
• Dropout [0.25, 0.5, 0.75]
As with the LSTM-based models, a batch size of 32, the Adam optimizer [21] and the Binary
Cross Entropy (BCE) loss function were used in this experiment. We present the detailed
results of the different architectures in section 5.</p>
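        <p>For reference, the Binary Cross Entropy loss mentioned above can be written in plain Python as below. This is a generic textbook formulation, not the deep learning framework call used in the experiments; the clipping constant and the function name are assumptions.</p>
        <preformat><![CDATA[
```python
import math


def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross entropy over a batch.

    y_true holds 0/1 labels (for example 1 = HOF, 0 = NOT) and y_prob holds the
    predicted probability of the positive class for each example.
    """
    if len(y_true) != len(y_prob):
        raise ValueError("y_true and y_prob must have the same length")
    if not y_true:
        raise ValueError("at least one example is required")
    total = 0.0
    for label, prob in zip(y_true, y_prob):
        # Clip probabilities so that log(0) is never evaluated.
        prob = min(max(prob, eps), 1.0 - eps)
        total += -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))
    return total / len(y_true)


if __name__ == "__main__":
    print(binary_cross_entropy([1, 0], [0.9, 0.1]))
    print(binary_cross_entropy([1, 0], [0.1, 0.9]))
```
]]></preformat>
        <p>Confident correct predictions yield a small loss, while confident wrong predictions are penalized heavily.</p>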
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Results</title>
      <p>In this section the results achieved on the training data are presented. For the experiments,
the training dataset was divided into train, validation and test parts. The train part
contains 70% of the data, the validation part consists of 10% of the data, and the test part
contains the remaining 20% of the provided dataset.</p>
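      <p>A 70/10/20 split of this kind can be sketched as follows. Random shuffling with a fixed seed is an assumption; the text does not say whether the split was random or stratified, and the function name and seed value are illustrative.</p>
      <preformat><![CDATA[
```python
import random


def split_indices(n_items, train_frac=0.7, val_frac=0.1, seed=13):
    """Shuffle item indices and cut them into train, validation and test parts.

    The test part receives whatever remains after the train and validation parts.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    n_train = int(n_items * train_frac)
    n_val = int(n_items * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


if __name__ == "__main__":
    # 3843 is the number of training instances reported for the English dataset.
    train, val, test = split_indices(3843)
    print(len(train), len(val), len(test))
```
]]></preformat>
      <p>With the 3843 provided training instances, this yields 2690 train, 384 validation and 769 test items.</p>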
      <p>
        We tested all of the mentioned models with different hyper-parameters. The best
results are shown in Tables 3-5. In order to determine the impact of the pre-processing steps
on the final results, we repeated the experiments with the same hyper-parameters without
applying the pre-processing steps. Although the runs without pre-processing
achieved competitive results in some cases, the experiments based on the pre-processed data
outperform the others in most cases. The performance of the submitted models for
both sub-tasks is reported in detail in [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ].
      </p>
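      <p>The F1 scores reported in the result tables can be computed from predictions along the following lines. The averaging scheme is not stated here; the sketch assumes macro-averaging over classes, and the function names and the small example are illustrative.</p>
      <preformat><![CDATA[
```python
def f1_for_label(y_true, y_pred, label):
    """F1 score of a single class, computed from true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (assumed averaging scheme)."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_label(y_true, y_pred, label) for label in labels) / len(labels)


if __name__ == "__main__":
    gold = ["HOF", "HOF", "HOF", "NOT"]
    predicted = ["HOF", "HOF", "NOT", "NOT"]
    print(round(macro_f1(gold, predicted), 3))
```
]]></preformat>
      <p>Macro-averaging weights the HOF and NOT classes equally, which matters because the two classes are not equally frequent in the training data.</p>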
      <p>The same architectures were trained on the data for sub-task 1B. The best-performing
models on this second task were applied to the sub-task 1B test dataset and submitted to the
shared task.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion and Future Work</title>
      <p>In this paper, we presented the proposed models for tasks 1A and 1B of the shared task on
hate speech and offensive content identification in English. We used a BERT-based architecture
and word- and character-based LSTM models to classify tweets into offensive
and not offensive categories. Our experiments show that the BERT-based models outperform the other
approaches.</p>
      <p>[Results table. Columns: Model name, Pre-processed, Embedding dimension, Hidden dimension, Dropout, F1. Models listed: Char_LSTM, Word_LSTM. Pre-processed column, in row order: yes, yes, yes, yes, no, no, yes, yes, yes, no. F1 column, in row order: 0.75, 0.78, 0.76, 0.79, 0.75, 0.77, 0.81, 0.83, 0.80, 0.79.]</p>
      <p>Since over-fitting was one of the main issues when training the different models during the
competition, enriching the training data with samples from other resources could be
a possible way to improve the results. Moreover, the proposed transfer learning based
results could be compared with results from other state-of-the-art language models
such as GPT-3 to check whether there is a significant difference in performance.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>We would like to thank the organizers of the HASOC 2021 shared task for organizing the competition
and for taking the time to respond to our inquiries.</p>
      <p>M. Mitra (Eds.), Working Notes of FIRE 2020 - Forum for Information Retrieval Evaluation,
Hyderabad, India, December 16-20, 2020, volume 2826 of CEUR Workshop Proceedings,
CEUR-WS.org, 2020, pp. 823–828. URL: http://ceur-ws.org/Vol-2826/T10-3.pdf.</p>
      <p>[20] M. E. Peters, M. Neumann, M. Iyyer, M. Gardner, C. Clark, K. Lee, L. Zettlemoyer, Deep
contextualized word representations, in: M. A. Walker, H. Ji, A. Stent (Eds.), Proceedings of the
2018 Conference of the North American Chapter of the Association for Computational
Linguistics: Human Language Technologies, NAACL-HLT 2018, New Orleans, Louisiana, USA,
June 1-6, 2018, Volume 1 (Long Papers), Association for Computational Linguistics, 2018,
pp. 2227–2237. URL: https://doi.org/10.18653/v1/n18-1202. doi:10.18653/v1/n18-1202.</p>
      <p>[21] D. P. Kingma, J. Ba, Adam: A method for stochastic optimization, in: Y. Bengio, Y. LeCun
(Eds.), 3rd International Conference on Learning Representations, ICLR 2015, San Diego,
CA, USA, May 7-9, 2015, Conference Track Proceedings, 2015. URL: http://arxiv.org/abs/1412.6980.</p>
      <p>[22] J. Devlin, M. Chang, K. Lee, K. Toutanova, BERT: pre-training of deep bidirectional
transformers for language understanding, CoRR abs/1810.04805 (2018). URL:
http://arxiv.org/abs/1810.04805. arXiv:1810.04805.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Ortiz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Nam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Santiago</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Datta</surname>
          </string-name>
          ,
          <article-title>Intersectional bias in hate speech and abusive language datasets</article-title>
          , CoRR abs/2005.05921 (
          <year>2020</year>
          ). URL: https://arxiv.org/abs/2005.05921. arXiv:2005.05921.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>I.</given-names>
            <surname>Gagliardone</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Pohjonen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Beyene</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Zerai</surname>
          </string-name>
          , G. Aynekulu,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bekalu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bright</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Moges</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Seifu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Stremlau</surname>
          </string-name>
          , et al.,
          <article-title>Mechachal: Online debates and elections in ethiopia-from hate speech to engagement in social media</article-title>
          ,
          <source>Available at SSRN</source>
          <volume>2831369</volume>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>H. S.</given-names>
            <surname>Alatawi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Alhothali</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. M.</given-names>
            <surname>Moria</surname>
          </string-name>
          ,
          <article-title>Detecting white supremacist hate speech using domain specific word embedding with deep learning and bert</article-title>
          ,
          <source>IEEE Access 9</source>
          (
          <year>2021</year>
          )
          <fpage>106363</fpage>
          -
          <lpage>106374</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>K.</given-names>
            <surname>Florio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Basile</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Polignano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Basile</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Patti</surname>
          </string-name>
          ,
          <article-title>Time of your hate: The challenge of time in hate speech detection on social media</article-title>
          ,
          <source>Applied Sciences</source>
          <volume>10</volume>
          (
          <year>2020</year>
          ). URL: https://www.mdpi.com/2076-3417/10/12/4180. doi:10.3390/app10124180.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>A. K. M</string-name>
          ,
          <string-name>
            <given-names>B. R.</given-names>
            <surname>Chakravarthi</surname>
          </string-name>
          ,
          <article-title>Overview of the HASOC track at FIRE 2020: Hate speech and offensive language identification in Tamil, Malayalam, Hindi, English and German</article-title>
          , in: P. Majumder, M. Mitra, S. Gangopadhyay, P. Mehta (Eds.),
          <source>FIRE 2020: Forum for Information Retrieval Evaluation, Hyderabad, India, December 16-20, 2020</source>
          , ACM,
          <year>2020</year>
          , pp.
          <fpage>29</fpage>
          -
          <lpage>32</lpage>
          . URL: https://doi.org/10.1145/3441501.3441517. doi:10.1145/3441501.3441517.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          ,
          <article-title>BERT: Pre-training of deep bidirectional transformers for language understanding</article-title>
          , in:
          <source>Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019, Volume 1 (Long and Short Papers)</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>4171</fpage>
          -
          <lpage>4186</lpage>
          . URL: https://doi.org/10.18653/v1/n19-1423. doi:10.18653/v1/n19-1423.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S.</given-names>
            <surname>MacAvaney</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.-R.</given-names>
            <surname>Yao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Russell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goharian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Frieder</surname>
          </string-name>
          ,
          <article-title>Hate speech detection: Challenges and solutions</article-title>
          ,
          <source>PloS one</source>
          <volume>14</volume>
          (
          <year>2019</year>
          )
          <elocation-id>e0221152</elocation-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>A. K.</given-names>
            <surname>Mishra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Saumya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <article-title>IIIT_DWD@HASOC 2020: Identifying offensive content in Indo-European languages</article-title>
          , in: P. Mehta, T. Mandl, P. Majumder, M. Mitra (Eds.),
          <source>Working Notes of FIRE 2020 - Forum for Information Retrieval Evaluation, Hyderabad, India, December 16-20, 2020</source>
          , volume
          <volume>2826</volume>
          of CEUR Workshop Proceedings, CEUR-WS.org,
          <year>2020</year>
          , pp.
          <fpage>139</fpage>
          -
          <lpage>144</lpage>
          . URL: http://ceur-ws.org/Vol-2826/T2-5.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>J.</given-names>
            <surname>Pennington</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Socher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Manning</surname>
          </string-name>
          ,
          <article-title>GloVe: Global vectors for word representation</article-title>
          , in: A. Moschitti, B. Pang, W. Daelemans (Eds.),
          <source>Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP 2014, October 25-29, 2014, Doha, Qatar, A meeting of SIGDAT, a Special Interest Group of the ACL</source>
          , ACL,
          <year>2014</year>
          , pp.
          <fpage>1532</fpage>
          -
          <lpage>1543</lpage>
          . URL: https://doi.org/10.3115/v1/d14-1162. doi:10.3115/v1/d14-1162.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>X.</given-names>
            <surname>Ou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <article-title>YNU_OXZ at HASOC 2020: Multilingual hate speech and offensive content identification based on XLM-RoBERTa</article-title>
          , in: P. Mehta, T. Mandl, P. Majumder, M. Mitra (Eds.),
          <source>Working Notes of FIRE 2020 - Forum for Information Retrieval Evaluation, Hyderabad, India, December 16-20, 2020</source>
          , volume
          <volume>2826</volume>
          of CEUR Workshop Proceedings, CEUR-WS.org,
          <year>2020</year>
          , pp.
          <fpage>121</fpage>
          -
          <lpage>127</lpage>
          . URL: http://ceur-ws.org/Vol-2826/T2-3.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>A.</given-names>
            <surname>Conneau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Khandelwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Chaudhary</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Wenzek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Guzmán</surname>
          </string-name>
          , E. Grave,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ott</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zettlemoyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Stoyanov</surname>
          </string-name>
          ,
          <article-title>Unsupervised cross-lingual representation learning at scale</article-title>
          , in:
          <string-name>
            <given-names>D.</given-names>
            <surname>Jurafsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Schluter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. R.</given-names>
            <surname>Tetreault</surname>
          </string-name>
          (Eds.),
          <source>Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020</source>
          , Online, July 5-10,
          <year>2020</year>
          , Association for Computational Linguistics,
          <year>2020</year>
          , pp.
          <fpage>8440</fpage>
          -
          <lpage>8451</lpage>
          . URL: https://doi.org/10.18653/v1/2020.acl-main.747. doi:10.18653/v1/2020.acl-main.747.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Tan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sordoni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. C.</given-names>
            <surname>Courville</surname>
          </string-name>
          ,
          <article-title>Ordered neurons: Integrating tree structures into recurrent neural networks</article-title>
          ,
          in:
          <source>7th International Conference on Learning Representations, ICLR 2019</source>
          , New Orleans, LA, USA, May 6-9,
          <year>2019</year>
          , OpenReview.net,
          <year>2019</year>
          . URL: https://openreview.net/forum?id=B1l6qiR5F7.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>P.</given-names>
            <surname>Badjatiya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Varma</surname>
          </string-name>
          ,
          <article-title>Deep learning for hate speech detection in tweets</article-title>
          , in:
          <string-name>
            <given-names>R.</given-names>
            <surname>Barrett</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Cummings</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Agichtein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Gabrilovich</surname>
          </string-name>
          (Eds.),
          <source>Proceedings of the 26th International Conference on World Wide Web Companion</source>
          , Perth, Australia, April 3-7,
          <year>2017</year>
          , ACM,
          <year>2017</year>
          , pp.
          <fpage>759</fpage>
          -
          <lpage>760</lpage>
          . URL: https://doi.org/10.1145/3041021.3054223. doi:10.1145/3041021.3054223.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>A.</given-names>
            <surname>Joulin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Grave</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Bojanowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mikolov</surname>
          </string-name>
          ,
          <article-title>Bag of tricks for efficient text classification</article-title>
          , in:
          <string-name>
            <given-names>M.</given-names>
            <surname>Lapata</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Blunsom</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Koller</surname>
          </string-name>
          (Eds.),
          <source>Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2017</source>
          , Valencia, Spain, April 3-7,
          <year>2017</year>
          , Volume
          <volume>2</volume>
          : Short Papers, Association for Computational Linguistics,
          <year>2017</year>
          , pp.
          <fpage>427</fpage>
          -
          <lpage>431</lpage>
          . URL: https://doi.org/10.18653/v1/e17-2068. doi:10.18653/v1/e17-2068.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. K.</given-names>
            <surname>Shahi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Madhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Satapara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Ranasinghe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zampieri</surname>
          </string-name>
          ,
          <article-title>Overview of the HASOC Subtrack at FIRE 2021: Hate Speech and Offensive Content Identification in English and Indo-Aryan Languages and Conversational Hate Speech</article-title>
          , in:
          <source>FIRE 2021: Forum for Information Retrieval Evaluation</source>
          , Virtual Event, 13th-17th December
          <year>2021</year>
          , ACM,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Modha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. K.</given-names>
            <surname>Shahi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Madhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Satapara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Majumder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schäfer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Ranasinghe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zampieri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Nandini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. K.</given-names>
            <surname>Jaiswal</surname>
          </string-name>
          ,
          <article-title>Overview of the HASOC subtrack at FIRE 2021: Hate Speech and Offensive Content Identification in English and Indo-Aryan Languages</article-title>
          , in:
          <source>Working Notes of FIRE 2021 - Forum for Information Retrieval Evaluation</source>
          , CEUR,
          <year>2021</year>
          . URL: http://ceur-ws.org/.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>S.</given-names>
            <surname>Mohtaj</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Woloszyn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Möller</surname>
          </string-name>
          ,
          <article-title>TUB at HASOC 2020: Character based LSTM for hate speech detection in indo-european languages</article-title>
          , in:
          <string-name>
            <given-names>P.</given-names>
            <surname>Mehta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Majumder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mitra</surname>
          </string-name>
          (Eds.),
          <source>Working Notes of FIRE 2020 - Forum for Information Retrieval Evaluation</source>
          , Hyderabad, India, December 16-20,
          <year>2020</year>
          , volume
          <volume>2826</volume>
          of
          <series>CEUR Workshop Proceedings</series>
          , CEUR-WS.org
          ,
          <year>2020</year>
          , pp.
          <fpage>298</fpage>
          -
          <lpage>303</lpage>
          . URL: http://ceur-ws.org/Vol-2826/T2-26.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>A. K.</given-names>
            <surname>Mishra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Saumya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <article-title>Iiit_dwd@hasoc 2020: Identifying offensive content in indo-european languages</article-title>
          , in:
          <string-name>
            <given-names>P.</given-names>
            <surname>Mehta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Majumder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mitra</surname>
          </string-name>
          (Eds.),
          <source>Working Notes of FIRE 2020 - Forum for Information Retrieval Evaluation</source>
          , Hyderabad, India, December 16-20,
          <year>2020</year>
          , volume
          <volume>2826</volume>
          of
          <series>CEUR Workshop Proceedings</series>
          , CEUR-WS.org
          ,
          <year>2020</year>
          , pp.
          <fpage>139</fpage>
          -
          <lpage>144</lpage>
          . URL: http://ceur-ws.org/Vol-2826/T2-5.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>R.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Lahiri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. K.</given-names>
            <surname>Ojha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bansal</surname>
          </string-name>
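          ,
          <article-title>Comma@fire 2020: Exploring multilingual joint training across different classification tasks</article-title>
          , in:
          <string-name>
            <given-names>P.</given-names>
            <surname>Mehta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Majumder</surname>
          </string-name>
          ,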
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>