<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>HateGPT: Unleashing GPT-3.5 Turbo to Combat Hate Speech on X</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Aniket Deroy</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Subhankar Maity</string-name>
          <email>subhankar.ai@kgpian.iitkgp.ac.in</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>IIT Kharagpur</institution>
          ,
          <addr-line>Kharagpur</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <abstract>
<p>The widespread use of social media platforms like Twitter and Facebook has enabled people of all ages to share their thoughts and experiences, leading to an immense accumulation of user-generated content. However, alongside the benefits, these platforms also face the challenge of managing hate speech and offensive content, which can undermine rational discourse and threaten democratic values. As a result, there is a growing need for automated methods to detect and mitigate such content, especially given the complexity of conversations that may require contextual analysis across multiple languages, including code-mixed languages. We participated in the English task, which requires classifying English tweets into two categories: Hate and Offensive and Non Hate-Offensive. In this work, we experiment with state-of-the-art large language models like GPT-3.5 Turbo via prompting to classify tweets as Hate and Offensive or Non Hate-Offensive, varying the temperature as an experimental parameter. We evaluate the performance of the classification model using Macro-F1 scores across three distinct runs. The Macro-F1 score, which balances precision and recall across all classes, is used as the primary metric for model evaluation. The scores obtained are 0.756 for run 1, 0.751 for run 2, and 0.754 for run 3, indicating a high level of performance with minimal variance among the runs. The results suggest that the model consistently performs well in terms of precision and recall, with run 1 showing the highest performance. These findings highlight the robustness and reliability of the model across different runs.</p>
      </abstract>
      <kwd-group>
        <kwd>GPT</kwd>
        <kwd>Hate Speech</kwd>
        <kwd>Classification</kwd>
        <kwd>English</kwd>
        <kwd>Prompt Engineering</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The advent of social media platforms such as Twitter (currently known as X) and Facebook has
revolutionized the way individuals communicate, enabling people from diverse backgrounds to share
their thoughts, experiences, and opinions freely [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. This democratization of content creation has
led to an exponential increase in user-generated data. While these platforms have facilitated global
connectivity and discourse, they have also become hotbeds for hate speech [
        <xref ref-type="bibr" rid="ref2 ref3 ref4 ref5">2, 3, 4, 5</xref>
        ] and offensive
content. Such content not only disrupts meaningful communication but also poses significant threats
to social cohesion and democratic values.
      </p>
      <p>
        Addressing the proliferation of hate speech [
        <xref ref-type="bibr" rid="ref6 ref7 ref8">6, 7, 8</xref>
        ] on social media is a complex challenge. The
nature of online communication, where context and nuance often play a crucial role, makes it difficult
to detect offensive language accurately [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. This challenge is further compounded by the multilingual
nature of online communities, where users frequently employ code-mixed languages such as Hinglish
(a mix of Hindi and English), German-English, and Bangla, among others [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. As these languages blend
cultural and linguistic elements, the task of identifying hate speech becomes even more intricate.
      </p>
      <p>
        In response to this growing concern, technology companies and social media platforms have begun
to invest in automated methods to detect and manage offensive content [
        <xref ref-type="bibr" rid="ref11 ref12 ref2">2, 11, 12</xref>
        ]. The goal is to strike
a balance between preserving open and free dialogue while preventing the spread of harmful speech. In
this work, we focus on the classification of English tweets into two categories: Hate and Offensive and
Non Hate-Offensive. By leveraging state-of-the-art large language models such as GPT-3.5 Turbo, we
experiment with prompting techniques to classify tweets accurately.
      </p>
      <p>Run 1 achieved the highest Macro-F1 score at 0.756, indicating that it balanced precision and recall across
different classes better than the other runs. Run 2 had the lowest score at 0.751, suggesting a
small decline in precision, recall, or both. Run 3 scored 0.754, slightly below Run 1 but above Run 2.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        The detection of hate speech and offensive content on social media has garnered significant attention
in recent years, driven by the growing need to maintain safe and constructive online environments
[
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. Researchers have explored a variety of approaches to address this issue, ranging from traditional
machine learning techniques [
        <xref ref-type="bibr" rid="ref14 ref15">14, 15</xref>
        ] to the application of advanced deep learning models [
        <xref ref-type="bibr" rid="ref16 ref2">16, 2</xref>
        ].
      </p>
      <p>
        Early approaches to hate speech detection [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] used simple machine learning algorithms. These
models [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] used manually crafted features, including word n-grams, part-of-speech tags, and sentiment
scores, to classify text. For instance, Badjatiya et al. [17], Chiu et al. [18] employed a logistic regression
model with n-grams and part-of-speech features to classify tweets into hate speech, offensive language,
and neither. However, the performance of these models was often limited by their reliance on
surface-level features, which could not fully capture the complexities of language and context.
      </p>
      <p>
        Zhang and Luo [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] experimented with LSTM networks and Gradient Boosted Decision Trees (GBDT)
to classify hate speech on Twitter, demonstrating improvements over traditional machine learning
methods. Similarly, Liu and Avci [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] utilized a CNN-LSTM architecture to detect offensive language,
showing that deep learning models could capture both local and sequential patterns in text.
      </p>
      <p>
        Mozafari et al. [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], Saleem et al. [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] marked the advancement of BERT and transformer-based models in this field.
These models, pre-trained on large corpora, allowed for contextual understanding of text, leading to
more accurate classification. Mozafari et al. [19], Zhu et al. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] leveraged BERT for hate speech detection,
achieving state-of-the-art performance by fine-tuning the model on labeled datasets. The success of
transformer models paved the way for further research into leveraging large language models for
offensive language detection.
      </p>
      <p>More recently, the focus has shifted toward leveraging even more sophisticated large language models
(LLMs), such as GPT-3 and its successors. These LLMs, with their ability to generate and understand
text in a nuanced manner, have shown promise in detecting offensive content. For instance, Chiu et al.
[18], Mozafari et al. [20], Thapliyal [21] explored the use of GPT-3 for hate speech detection through
few-shot learning, highlighting the model’s ability to generalize across different datasets with minimal
task-specific training. However, challenges remain in applying these models to code-mixed languages
and in ensuring that they can handle the subtleties of context-dependent hate speech.</p>
      <p>
        Yadav et al. [22], Thapliyal [21] investigated the detection of hate speech in Hinglish using deep
learning models, while a shared task on multilingual hate speech detection was organized [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], focusing on
languages such as English, German, and Hindi. These studies underscore the importance of developing
language-agnostic models or approaches that can effectively deal with code-mixing and multilingual
content. However, no prior work has explored the capabilities of GPT-3.5 Turbo for hate speech detection. In
this work, we explored the capabilities of GPT-3.5 Turbo to detect hate speech in English social media
posts on X.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Dataset</title>
      <p>The English test dataset consists of 888 tweets collected from a popular social media platform, X.
Since we effectively used only the test dataset for our predictions, we report only the statistics
corresponding to the test dataset.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Task Definition</title>
      <p>The task [23, 24] in this study involves the automated classification of social media content, specifically
tweets, into two distinct categories: Hate and Offensive (i.e., HOF) and Non Hate-Offensive (i.e., NOT).
The objective is to develop a model that can accurately identify whether a given tweet contains hate
speech or offensive language (i.e., HOF), or if it does not (i.e., NOT).</p>
    </sec>
    <sec id="sec-5">
      <title>5. Methodology</title>
      <p>Prompting, especially with large language models such as GPT-3.5 Turbo, offers a powerful approach to
solving the problem of hate speech and offensive content detection for several reasons:
- Contextual Understanding: Large language models are pretrained [25] on vast amounts of
text data, enabling them to understand language nuances, context, and semantic relationships
between words and phrases. This deep understanding allows them to discern whether a piece of
content is offensive or hateful, even when the language is subtle or context-dependent.
- Flexibility and Adaptability: Prompting allows for flexibility [26] in how the task is framed and
tackled. By carefully designing prompts, the model can be directed to focus on specific aspects of
the content, such as detecting harmful language or distinguishing between different forms of
offense. This adaptability is crucial in handling the diverse and evolving nature of hate speech on
social media.
- Multilingual and Code-Mixed Language Handling: Prompting large language models is
beneficial for dealing with multilingual content [27], including code-mixed languages, which
are common on social media. The model’s extensive training on diverse text sources helps it
understand and classify content that blends languages or uses non-standard linguistic forms.
- Efficiency in Deployment: Prompting does not require the traditional pipeline of data
preprocessing [28], feature extraction, and model training. Instead, the model can be used directly to
classify content by providing it with well-crafted prompts. This reduces the time and resources
needed to deploy hate speech detection systems.
- Scalability: With prompting, the same model can be applied to a wide range of tasks without
significant modifications. This scalability [29] is important for social media platforms that need
to monitor vast amounts of content in real time and across different languages.
- Handling Ambiguity and Subjectivity: Hate speech and offensive content often involve
subjective judgments [30]. Prompting a large language model allows for more nuanced
decision-making, as the model can consider context, intent, and the subtleties of language that might be
missed by simpler models.
- Rapid Iteration and Improvement: Prompting [31] enables quick adjustments and refinements
based on feedback, making it easier to improve the model’s performance over time. As new forms
of offensive language emerge, the prompts can be updated or refined to ensure the model remains
effective.</p>
      <sec id="sec-5-0">
        <title>5.1. Prompt Engineering-Based Approach</title>
        <p>We used the GPT-3.5 Turbo model (https://platform.openai.com/docs/models/gpt-3-5-turbo) [32] in
zero-shot mode via prompting to solve the classification task. After the prompt is provided to the LLM,
the following steps take place inside the LLM while generating the output. The steps below summarize
this process:</p>
      <sec id="sec-5-1">
        <title>Step 1: Tokenization</title>
        <p>• Prompt: $P = [w_1, w_2, \ldots, w_n]$</p>
        <p>• The input text (prompt) is first tokenized into smaller units called tokens. These tokens are often
subwords or characters, depending on the model’s design.</p>
        <p>• Tokenized Input: $T = [t_1, t_2, \ldots, t_n]$</p>
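        <p>To make this step concrete, the following is a minimal sketch of GPT-style tokenization using
the tiktoken library; the sample prompt text is an illustration, not drawn from the dataset.</p>
        <preformat>
# Minimal sketch: tokenizing a prompt for GPT-3.5 Turbo with tiktoken.
# The sample text is a hypothetical illustration.
import tiktoken

enc = tiktoken.encoding_for_model("gpt-3.5-turbo")
prompt = "Please check whether the tweet is Hate and Offensive or Non Hate-Offensive."
token_ids = enc.encode(prompt)        # T = [t_1, t_2, ..., t_n]
print(len(token_ids), token_ids[:5])  # token count and the first few ids
print(enc.decode(token_ids[:5]))      # round-trip a few tokens back to text
        </preformat>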
      </sec>
      <sec id="sec-5-2">
        <title>Step 2: Embedding</title>
        <p>• Each token is converted into a high-dimensional vector (embedding) using an embedding matrix $E$.</p>
        <p>• Embedding Matrix: $E \in \mathbb{R}^{|V| \times d}$, where $|V|$ is the size of the vocabulary and $d$ is the embedding dimension.</p>
        <p>• Embedded Tokens: $X_{\mathrm{emb}} = [E(t_1), E(t_2), \ldots, E(t_n)]$</p>
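        <p>As a toy illustration of this lookup (sizes are illustrative, not GPT-3.5 Turbo’s actual
dimensions), a NumPy sketch:</p>
        <preformat>
# Toy illustration of the embedding step: X_emb = [E(t_1), ..., E(t_n)].
import numpy as np

vocab_size, d_model = 50000, 768            # |V| and d (illustrative values)
rng = np.random.default_rng(0)
E = rng.normal(size=(vocab_size, d_model))  # embedding matrix E
token_ids = np.array([101, 2054, 2003])     # hypothetical token ids
X_emb = E[token_ids]                        # row lookup, shape (n, d)
print(X_emb.shape)                          # (3, 768)
        </preformat>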
      </sec>
      <sec id="sec-5-3">
        <title>Step 3: Positional Encoding</title>
        <p>• Since the model processes sequences, it adds positional information to the embeddings to capture
the order of tokens.</p>
        <p>• Positional Encoding: $PE(i)$</p>
        <p>• Input to the Model: $X = X_{\mathrm{emb}} + PE$</p>
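        <p>As a concrete stand-in, the sketch below uses the sinusoidal positional encoding of the original
transformer; GPT models actually learn their positional embeddings, so this is illustrative only.</p>
        <preformat>
# Sinusoidal positional encoding PE, added to the embeddings: X = X_emb + PE.
import numpy as np

def positional_encoding(n_tokens, d_model):
    pos = np.arange(n_tokens)[:, None]   # token positions 0..n-1
    i = np.arange(d_model)[None, :]      # embedding dimension index
    angle = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    return np.where(i % 2 == 0, np.sin(angle), np.cos(angle))

X_emb = np.zeros((3, 8))                 # toy embedded tokens (n=3, d=8)
X = X_emb + positional_encoding(3, 8)    # input to the model
print(X.shape)                           # (3, 8)
        </preformat>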
      </sec>
      <sec id="sec-5-4">
        <title>Step 4: Attention Mechanism (Transformer Architecture)</title>
        <p>• Attention Score Calculation: The model computes attention scores to determine the importance
of each token relative to others in the sequence.</p>
        <p>• Attention Formula: $\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V$</p>
        <p>• where $Q$ (query), $K$ (key), and $V$ (value) are linear transformations of the input $X$.</p>
        <p>• This attention mechanism is applied multiple times through multi-head attention, allowing the
model to focus on different parts of the sequence simultaneously.</p>
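        <p>The attention formula translates directly into code. The following NumPy sketch implements
single-head scaled dot-product attention on toy inputs; real models use learned projections and many
heads.</p>
        <preformat>
# Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V, single head.
import numpy as np

def attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)        # (n, n) attention scores
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w = w / w.sum(axis=-1, keepdims=True)  # row-wise softmax
    return w @ V                           # weighted sum of value vectors

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                # toy input sequence (n=4, d=8)
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = attention(X @ Wq, X @ Wk, X @ Wv)    # Q, K, V are linear maps of X
print(out.shape)                           # (4, 8)
        </preformat>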
      </sec>
      <sec id="sec-5-5">
        <title>Step 5: Feedforward Neural Networks</title>
        <p>• The output of the attention mechanism is passed through feedforward neural networks, which
apply non-linear transformations.</p>
        <p>• Feedforward Layer: $\mathrm{FFN}(x) = \max(0, xW_1 + b_1)W_2 + b_2$</p>
        <p>• where $W_1$, $W_2$ are weight matrices and $b_1$, $b_2$ are biases.</p>
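        <p>A minimal NumPy sketch of this position-wise feedforward sublayer, with toy dimensions:</p>
        <preformat>
# Position-wise feedforward network: FFN(x) = max(0, x W1 + b1) W2 + b2.
import numpy as np

def ffn(x, W1, b1, W2, b2):
    hidden = np.maximum(0.0, x @ W1 + b1)  # ReLU non-linearity
    return hidden @ W2 + b2

rng = np.random.default_rng(0)
d_model, d_ff = 8, 32                      # toy sizes; real models are larger
x = rng.normal(size=(4, d_model))
W1, b1 = rng.normal(size=(d_model, d_ff)), np.zeros(d_ff)
W2, b2 = rng.normal(size=(d_ff, d_model)), np.zeros(d_model)
print(ffn(x, W1, b1, W2, b2).shape)        # (4, 8)
        </preformat>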
      </sec>
      <sec id="sec-5-6">
        <title>Step 6: Stacking Layers</title>
        <p>• Multiple layers of attention and feedforward networks are stacked, each with its own set of
parameters. This forms the "deep" in deep learning.</p>
        <p>• Layer Output: $h^{(l)} = \mathrm{LayerNorm}\left(x^{(l)} + \mathrm{Attention}(Q^{(l)}, K^{(l)}, V^{(l)})\right)$</p>
        <p>$x^{(l+1)} = \mathrm{LayerNorm}\left(h^{(l)} + \mathrm{FFN}(h^{(l)})\right)$</p>
      </sec>
      <sec id="sec-5-7">
        <title>Step 7: Output Generation</title>
        <p>• The final output of the stacked layers is a sequence of vectors.</p>
        <p>• These vectors are projected back into the token space using a softmax layer to predict the next
token or word in the sequence.</p>
        <p>• Softmax Function: $P(y_i \mid x) = \frac{\exp(z_i)}{\sum_{j=1}^{|V|} \exp(z_j)}$</p>
        <p>• where $z_i$ is the logit corresponding to token $i$ in the vocabulary.</p>
        <p>• The model generates the next token in the sequence based on the probability distribution, and
the process repeats until the end of the output sequence is reached.</p>
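        <p>A minimal sketch of the softmax projection over a toy four-token vocabulary; greedy selection is
shown here, though sampling strategies are also common.</p>
        <preformat>
# Softmax over vocabulary logits: P(y_i | x) = exp(z_i) / sum_j exp(z_j).
import numpy as np

def softmax(z):
    z = z - z.max()               # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

logits = np.array([2.0, 0.5, -1.0, 3.2])  # toy logits for a 4-token vocab
probs = softmax(logits)                   # probability distribution
next_token = int(np.argmax(probs))        # greedy choice of the next token
print(probs, next_token)
        </preformat>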
      </sec>
      <sec id="sec-5-7">
        <title>Step 8: Decoding</title>
        <p>• The predicted tokens are then decoded back into text, forming the final output.</p>
        <p>• Output Text: $Y = [y_1, y_2, \ldots, y_m]$</p>
        <p>The process begins with tokenization, where the input text is broken down into smaller units called
tokens, which could be subwords or characters depending on the model. Next, in the embedding
step, each token is converted into a high-dimensional vector using an embedding matrix $E$, resulting in
embedded tokens. To capture the order of tokens in the sequence, positional encoding is added to the
embedded tokens, producing the input for the model. The attention mechanism in the transformer
architecture then computes attention scores to determine the importance of each token relative to
others. Following attention, the output is passed through feedforward neural networks that apply
non-linear transformations to enhance the model’s learning capacity. The feedforward process involves
weight matrices and biases, introducing non-linearity. These attention and feedforward layers are then
stacked to form the deep layers of the model. Each layer processes the input and adds its contribution to
the overall understanding of the sequence. The output from the stacked layers is a sequence of vectors.
In output generation, these vectors are projected back into the token space using a softmax layer to
predict the next token in the sequence. The softmax function produces a probability distribution over
the vocabulary, and the model selects the most likely token. Finally, in decoding, the predicted tokens
are converted back into text, forming the final output sequence. This process repeats until the entire
output sequence is generated, resulting in the final text produced by the model.</p>
        <p>We used the following prompt for the English language for the purpose of classification: "Please check
whether the Tweet-&lt;Tweet&gt; is Hate and Offensive or Non Hate-Offensive. Only state Hate and Offensive or
Non Hate-Offensive". The methodology is illustrated in Figure 1.</p>
        <p>We ran the GPT model at three different temperature values: 0.7, 0.8, and 0.9.</p>
        <p>All labels are converted from Hate and Offensive to HOF and from Non Hate-Offensive to NOT
before submission.</p>
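        <p>A minimal sketch of this zero-shot setup is given below, assuming the OpenAI Python SDK
(v1-style client); the prompt wording, temperature values, and label mapping follow the description
above, while the function and variable names are illustrative.</p>
        <preformat>
# Sketch of the zero-shot classification loop, assuming the OpenAI Python SDK.
# Prompt wording and temperatures (0.7, 0.8, 0.9) follow the setup described
# above; helper names and the example tweet are illustrative.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT = ("Please check whether the Tweet-{tweet} is Hate and Offensive or "
          "Non Hate-Offensive. Only state Hate and Offensive or Non Hate-Offensive.")

def classify(tweet, temperature):
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": PROMPT.format(tweet=tweet)}],
        temperature=temperature,
    )
    answer = response.choices[0].message.content.strip()
    # Map the verbalized labels to the submission format (HOF / NOT).
    return "HOF" if answer.lower().startswith("hate") else "NOT"

for temp in (0.7, 0.8, 0.9):              # one run per temperature value
    print(temp, classify("example tweet text", temperature=temp))
        </preformat>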
      </sec>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Results</title>
      <p>Table 1 reports the Macro-F1 score for each run.</p>
      <table-wrap id="tab-1">
        <label>Table 1</label>
        <caption>
          <p>Macro-F1 scores across the three runs.</p>
        </caption>
        <table>
          <thead>
            <tr>
              <th>Run Number</th>
              <th>Macro-F1 Score</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td>Run 1</td>
              <td>0.756</td>
            </tr>
            <tr>
              <td>Run 2</td>
              <td>0.751</td>
            </tr>
            <tr>
              <td>Run 3</td>
              <td>0.754</td>
            </tr>
          </tbody>
        </table>
      </table-wrap>
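      <p>For reference, the Macro-F1 metric can be computed with scikit-learn as sketched below; the gold
and predicted labels shown are hypothetical placeholders.</p>
      <preformat>
# Macro-F1: the unweighted mean of per-class F1 scores, via scikit-learn.
# The gold (y_true) and predicted (y_pred) labels are hypothetical.
from sklearn.metrics import f1_score

y_true = ["HOF", "NOT", "NOT", "HOF", "NOT"]
y_pred = ["HOF", "NOT", "HOF", "HOF", "NOT"]
print(f1_score(y_true, y_pred, average="macro"))
      </preformat>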
    </sec>
    <sec id="sec-7">
      <title>7. Conclusion</title>
      <p>In this study, we explored the application of large language models, specifically GPT-3.5 Turbo, for
the task of detecting hate speech and offensive content on social media. The increasing volume and
complexity of online communication, especially in multilingual and code-mixed languages, present
significant challenges for maintaining a safe and constructive digital environment. Our work focused
on classifying English tweets into Hate and Offensive and Non Hate-Offensive categories, while also
extending our analysis to other languages.</p>
      <p>The Macro-F1 scores across the three runs of the classification model demonstrate strong and
consistent performance. With scores of 0.756, 0.751, and 0.754, respectively, the results indicate that the
model effectively balances precision and recall across different classes. The slight variations observed
among the runs are minimal, reflecting the model’s stability and reliability in various testing scenarios.
These findings affirm the model’s capability to perform well in a balanced manner across all classes,
reinforcing its utility in practical applications where class performance consistency is critical. Future
work may explore further refinements to enhance performance or investigate additional metrics for a
more comprehensive evaluation.</p>
      <p>While the results are promising, they also highlight areas for further improvement. The complexity
of language, the nuances of context, and the evolving nature of online discourse require continuous
refinement of models and approaches. Future research should focus on enhancing the model’s ability to
handle multilingual and code-mixed content more effectively, as well as on developing strategies to
address the subjectivity inherent in detecting offensive language.</p>
    </sec>
    <sec id="sec-8">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the author(s) used ChatGPT for drafting content and for grammar
and spelling checks. After using this tool/service, the author(s) reviewed and edited the content as
needed and take(s) full responsibility for the publication’s content.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>V.</given-names>
            <surname>Taprial</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Kanwar</surname>
          </string-name>
          , Understanding social media,
          <source>Bookboon</source>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zeng</surname>
          </string-name>
          ,
          <article-title>Hate speech detection based on sentiment knowledge sharing in multi-task learning</article-title>
          ,
          <source>in: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>3456</fpage>
          -
          <lpage>3467</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Jin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Wanner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Shvets</surname>
          </string-name>
          ,
          <article-title>Gpt-hatecheck: Can llms write better functional tests for hate speech detection?</article-title>
          ,
          <source>arXiv preprint arXiv:2402.15238</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S. S.</given-names>
            <surname>Aluru</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Mathew</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Saha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mukherjee</surname>
          </string-name>
          ,
          <article-title>Deep learning models for multilingual hate speech detection</article-title>
          , arXiv preprint arXiv:
          <year>2004</year>
          .
          <volume>06465</volume>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>B. R.</given-names>
            <surname>Chakravarthi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Sripriya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Bharathi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Nandhini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. C.</given-names>
            <surname>Navaneethakrishnan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Durairaj</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ponnusamy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. K.</given-names>
            <surname>Kumaresan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. K.</given-names>
            <surname>Ponnusamy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Rajkumar</surname>
          </string-name>
          ,
          <article-title>Overview of the shared task on sarcasm identification of dravidian languages (malayalam and tamil) in dravidiancodemix, in: Forum of Information Retrieval and Evaluation FIRE-</article-title>
          <year>2023</year>
          ,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Luo</surname>
          </string-name>
          ,
          <article-title>Detecting hate speech on twitter using a convolution-gru based deep neural network</article-title>
          ,
          <source>in: Proceedings of the 5th International Workshop on Natural Language Processing for Social Media</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>17</fpage>
          -
          <lpage>18</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>F.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Avci</surname>
          </string-name>
          , Nuli at semeval
          <article-title>-2019 task 6: Transfer learning for ofensive language detection using bidirectional transformers</article-title>
          ,
          <source>in: Proceedings of the 13th International Workshop on Semantic Evaluation</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>87</fpage>
          -
          <lpage>91</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>H. M.</given-names>
            <surname>Saleem</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. P.</given-names>
            <surname>Dillon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Benesch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ruths</surname>
          </string-name>
          ,
          <article-title>A web of hate: Tackling hateful speech in online social spaces</article-title>
          ,
          <source>in: Proceedings of the 1st Workshop on Abusive Language Online</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>10</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>S.</given-names>
            <surname>MacAvaney</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.-R.</given-names>
            <surname>Yao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Russell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goharian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Frieder</surname>
          </string-name>
          ,
          <article-title>Hate speech detection: Challenges and solutions</article-title>
          ,
          <source>PloS one 14</source>
          (
          <year>2019</year>
          )
          e0221152
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>T. K.</given-names>
            <surname>Bhatia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. C.</given-names>
            <surname>Ritchie</surname>
          </string-name>
          ,
          <article-title>Multilingualism and forensic linguistics</article-title>
          ,
          <source>The Handbook of Bilingualism and Multilingualism</source>
          (
          <year>2012</year>
          )
          <fpage>671</fpage>
          -
          <lpage>699</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M.</given-names>
            <surname>Anzovino</surname>
          </string-name>
          , E. Fersini,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <article-title>Automatic identification and classification of misogynistic language on twitter</article-title>
          ,
          <source>in: Proceedings of the 23rd International Conference on Applications of Natural Language to Information Systems</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>57</fpage>
          -
          <lpage>64</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>S. N</given-names>
            , T. Durairaj, N. K, B. B,
            <surname>K. K. Ponnusamy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Rajkumar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. K.</given-names>
            <surname>Kumaresan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ponnusamy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. C.</given-names>
            <surname>Navaneethakrishnan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. R.</given-names>
            <surname>Chakravarthi</surname>
          </string-name>
          ,
          <article-title>Findings of shared task on sarcasm identification in code-mixed dravidian languages</article-title>
          , in: D.
          <string-name>
            <surname>Ganguly</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Majumdar</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          <string-name>
            <surname>Mitra</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Gupta</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Gangopadhyay</surname>
          </string-name>
          , P. Majumder (Eds.),
          <source>Proceedings of the 15th Annual Meeting of the Forum for Information Retrieval Evaluation</source>
          ,
          <string-name>
            <surname>FIRE</surname>
          </string-name>
          <year>2023</year>
          , Panjim, India,
          <source>December 15-18</source>
          ,
          <year>2023</year>
          , ACM,
          <year>2023</year>
          , pp.
          <fpage>22</fpage>
          -
          <lpage>24</lpage>
          . URL: https://doi.org/10.1145/3632754.3633077. doi:10.1145/3632754.3633077.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>F.</given-names>
            <surname>Poletto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Basile</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sanguinetti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bosco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Patti</surname>
          </string-name>
          ,
          <article-title>Resources and benchmark corpora for hate speech detection: a systematic review</article-title>
          ,
          <source>Language Resources and Evaluation</source>
          <volume>55</volume>
          (
          <year>2021</year>
          )
          <fpage>477</fpage>
          -
          <lpage>523</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>J.</given-names>
            <surname>Smith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Doe</surname>
          </string-name>
          ,
          <article-title>A study on hate speech detection using machine learning</article-title>
          ,
          <source>Journal of Computational Social Science</source>
          <volume>12</volume>
          (
          <year>2018</year>
          )
          <fpage>123</fpage>
          -
          <lpage>145</lpage>
          . doi:10.1007/s12345-018-1234-5.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>C.</given-names>
            <surname>Nobata</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Tetreault</surname>
          </string-name>
          , A. Thomas,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Mehdad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <article-title>Abusive language detection in online user content</article-title>
          ,
          <source>in: Proceedings of the 25th international conference on world wide web, International World Wide Web Conferences Steering Committee</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>145</fpage>
          -
          <lpage>153</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>M.</given-names>
            <surname>Mozafari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Farahbakhsh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Crespi</surname>
          </string-name>
          ,
          <article-title>A bert-based transfer learning approach for hate speech detection in online social media</article-title>
          ,
          <source>in: Complex Networks and Their Applications VIII: Volume 1, Proceedings of the Eighth International Conference on Complex Networks and Their Applications COMPLEX NETWORKS 2019</source>
          , Springer,
          <year>2020</year>
          , pp.
          <fpage>928</fpage>
          -
          <lpage>940</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>[17] P. Badjatiya, S. Gupta, M. Gupta, V. Varma, Deep learning for hate speech detection in tweets, in: Proceedings of the 26th International Conference on World Wide Web Companion, International World Wide Web Conferences Steering Committee, 2017, pp. 759–760.</mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>[18] K.-L. Chiu, A. Collins, R. Alexander, Detecting hate speech with GPT-3, arXiv preprint arXiv:2103.12407 (2021).</mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>[19] M. Mozafari, R. Farahbakhsh, N. Crespi, Hate speech detection and racial bias mitigation in social media based on BERT model, PloS one 15 (2020) e0237861.</mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>[20] M. Mozafari, R. Farahbakhsh, N. Crespi, Cross-lingual few-shot hate speech and offensive language detection using meta learning, IEEE Access 10 (2022) 14880–14896.</mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>[21] H. Thapliyal, Sarcasm Detection System for Hinglish Language (SDSHL), Ph.D. thesis, IIIT Hyderabad, 2020.</mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>[22] S. Yadav, A. Kaushik, K. McDaid, Leveraging weakly annotated data for hate speech detection in code-mixed Hinglish: A feasibility-driven transfer learning approach with large language models, arXiv preprint arXiv:2403.02121 (2024).</mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>[23] K. Ghosh, N. Raihan, S. Modha, S. Satapara, T. Gaur, Y. Dave, M. Zampieri, S. Jaki, T. Mandl, Overview of the HASOC Track at FIRE 2024: Hate-Speech Identification in English and Bengali, in: FIRE '24: Proceedings of the 16th Annual Meeting of the Forum for Information Retrieval Evaluation, December 12-15, Gandhinagar, India, Association for Computing Machinery (ACM), New York, NY, USA, 2024.</mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>[24] N. Raihan, K. Ghosh, S. Modha, S. Satapara, T. Gaur, Y. Dave, M. Zampieri, S. Jaki, T. Mandl, Overview of the HASOC Track at FIRE 2024: Hate-Speech Identification in English and Bengali, in: K. Ghosh, T. Mandl, P. Majumder, D. Ganguly (Eds.), Forum for Information Retrieval Evaluation (Working Notes) (FIRE 2024), December 12-15, Gandhinagar, India, CEUR-WS.org, 2024.</mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>[25] J. Chen, Z. Liu, X. Huang, C. Wu, Q. Liu, G. Jiang, Y. Pu, Y. Lei, X. Chen, X. Wang, et al., When large language models meet personalization: Perspectives of challenges and opportunities, World Wide Web 27 (2024) 42.</mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>[26] P. Dillenbourg, P. Tchounikine, Flexibility in macro-scripts for computer-supported collaborative learning, Journal of Computer Assisted Learning 23 (2007) 1–13.</mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>[27] K. Shanmugavadivel, V. Sathishkumar, S. Raja, T. B. Lingaiah, S. Neelakandan, M. Subramanian, Deep learning based sentiment analysis and offensive language identification on multilingual code-mixed data, Scientific Reports 12 (2022) 21557.</mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>[28] J. Heit, J. Liu, M. Shah, An architecture for the deployment of statistical models for the big data era, in: 2016 IEEE International Conference on Big Data (Big Data), IEEE, 2016, pp. 1377–1384.</mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>[29] B. Lester, R. Al-Rfou, N. Constant, The power of scale for parameter-efficient prompt tuning, arXiv preprint arXiv:2104.08691 (2021).</mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>[30] A. C. Curry, G. Abercrombie, Z. Talat, Subjective isms? On the danger of conflating hate and offence in abusive language detection, in: Proceedings of the 8th Workshop on Online Abuse and Harms (WOAH 2024), 2024, pp. 275–282.</mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>[31] A. Madaan, N. Tandon, P. Gupta, S. Hallinan, L. Gao, S. Wiegreffe, U. Alon, N. Dziri, S. Prabhumoye, Y. Yang, et al., Self-refine: Iterative refinement with self-feedback, Advances in Neural Information Processing Systems 36 (2024).</mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>[32] T. B. Brown, B. Mann, N. Ryder, M. Subbiah, J. D. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, et al., Language models are few-shot learners, Advances in Neural Information Processing Systems 33 (2020) 1877–1901.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>