<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Forum for Information Retrieval Evaluation</journal-title>
      </journal-title-group>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Overview of the HASOC Subtrack at FIRE 2023: Identification of Conversational Hate-Speech</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Hiren Madhu</string-name>
          <email>hirenmadhu16@gmail.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Shrey Satapara</string-name>
          <email>shreysatapara@gmail.com</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pavan Pandya</string-name>
          <email>pavanpandya1311@gmail.com</email>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nisarg Shah</string-name>
          <email>nisarg0606@gmail.com</email>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Thomas Mandl</string-name>
          <email>mandl@uni-hildesheim.de</email>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sandip Modha</string-name>
          <email>sjmodha@gmail.com</email>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <aff id="aff1">
          <label>1</label>
          <institution>Indian Institute of Science</institution>
          ,
          <addr-line>Bangalore</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Indian Institute of Technology</institution>
          ,
          <addr-line>Hyderabad</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Indiana University Bloomington</institution>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>LDRP-ITR</institution>
          ,
          <addr-line>Gandhinagar</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
        <aff id="aff5">
          <label>5</label>
          <institution>University of Hildesheim</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <kwd-group>
        <kwd>Hate Speech</kwd>
        <kwd>NLP</kwd>
        <kwd>Social Media</kwd>
        <kwd>Benchmark</kwd>
        <kwd>Context</kwd>
        <kwd>Language Resource</kwd>
        <kwd>Deep Learning</kwd>
        <kwd>Text Classification</kwd>
        <kwd>Evaluation</kwd>
      </kwd-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>1</volume>
      <fpage>5</fpage>
      <lpage>18</lpage>
      <abstract>
        <p>Identifying hate speech based on context is a requirement for real-world content moderation systems. However, in research, the definition and use of context for hate speech recognition has seen a variety of approaches. The task ”Identification of Conversational Hate Speech” (ICHCL) at FIRE 2023 has provided a further dataset for hate speech detection, including context. The data was collected from Twitter (called X since 2023) and includes tweets as well as comments and responses to such tweets or comments. This paper reports on the dataset, experiments, and results. Six teams submitted results for the binary classification task, and the best submission reached an F1 measure of 0.8. For the second task, five runs were submitted. We also present a baseline that uses unlabelled data to obtain its predictions.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>
        Hate speech and offensive language, which include hurtful, insulting, or derogatory remarks
exchanged between individuals, are commonly observed on social media platforms like Facebook,
Twitter, and Reddit. The abundant presence of such content on these platforms fosters offline
hate crimes and fuels disorderly actions against various communities or political groups, driven
by agendas such as racism, misogyny, anti-LGBTQI+, anti-Muslim, anti-government, and
other extremist ideologies [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. To combat these hate crimes, the European Union (EU) and
other European nations have implemented laws that classify online hate speech as a criminal
offense, leading to the conviction of many individuals involved in such online activities. In
contrast, the United States (US) primarily focuses on addressing hate speech through non-legal
means to safeguard the principles of free speech. While freedom of speech is crucial, a recent
study [
study [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] reveals that Elon Musk’s influence on Twitter and his alterations to content moderation
policies have increased hate speech on the platform. Consequently, this has caused numerous
environmentally-conscious users to become inactive on the platform, resulting in a decline in
the quality of discourse. In scenarios like this, freedom of speech acts as a double-edged sword.
      </p>
      <p>
        Due to this, open societies need to figure out how to maintain civil discourse without
resorting to totalitarian control. The new Digital Services Act (DSA), in contrast, operates based
on a ”delete first, think later” approach, which removes user-generated content excessively and
undermines freedom of expression [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Content moderation aims to strike a balance between
these objectives. Nonetheless, moderating content necessitates numerous human annotators’
involvement, making scalability impractical. This situation has driven research efforts toward
the advancement of automatic systems for identifying harmful online content. Text classification
represents just one component essential for meeting legal and practical demands [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], although
it is crucial.
      </p>
      <p>Most existing research in the field of identifying hate speech or offensive content primarily
concentrates on analyzing the text of individual posts. Frequently, offensive or hateful content
can be concealed within a conversation thread and may not be immediately evident in isolated
comments or responses. However, it is feasible to uncover such hate speech by examining the
original content and the context in which it was posted. Additionally, social media content often
spans multiple languages, including code-mixed languages like Hinglish1. This makes it crucial
for social media platforms to detect and remove such content before it reaches a wider audience.
In the two previous editions of identification of conversational hate speech in code-mixed languages
(ICHCL) [5, 6], datasets that handle such conversational hate speech have been released. The
first edition featured binary labels to distinguish between hateful and regular tweets, while the
second edition introduced a multiclass task, further categorizing hateful tweets into
two subtypes: standalone and contextual hate speech.</p>
      <p>In this paper, we provide an overview of the third edition of ICHCL, which is centered on
promoting the development of semi-supervised algorithms for classifying hateful text. Detailed
information regarding the task and dataset is elaborated upon in Section 3.</p>
    </sec>
    <sec id="sec-3">
      <title>2. Related Work</title>
      <p>Because a tweet is typically part of a larger discourse and a conversation among certain
people, it is frequently difficult to understand it on its own. So far, only a few text classification
experiments and datasets have taken context into account. Context has been modeled in different
ways. An early approach used LDA and RNNs.</p>
      <p>Recurrent neural networks were used to capture context within sentences [7], but less so for
capturing relations between subsequent messages in social media. LSTMs were used in an
approach by Gao and Huang [8]. Their dataset is based on comments in discussion threads on
news articles and contains 1500 comments. The context is given by the content of the news
articles [8].</p>
      <p>1 Hindi written in Latin script instead of the Devanagari script.</p>
      <p>The shared task RumourEval reacts to the need to consider evolving conversations and
news updates for rumours and to check their veracity [9]. The organizers provided a dataset of
misinformation posts and conversations about those posts. The best-performing system [10]
used word2vec combined with several other dimensions, such as source content analysis, source
account credibility, reply account credibility, and stance of the source message, among others.</p>
      <p>One dataset was labeled twice by crowd workers. One group was provided context and the
other was not [11]. The data is extracted from Wikipedia talk pages. Context was given by
the parent message and the title of the discussion thread [11]. It needs to be pointed out that the
parent message might not be the last message preceding the message to be annotated.</p>
      <p>A further dataset was extended with context information for the concept of abusiveness
[12]. This data was collected based on an existing dataset without contextual information. For
all tweets, the text was used to search for them, and if they were found, the authors tried to
extract the preceding messages. For all tweets for which this was successful, the preceding
messages were downloaded as context. Such a process leads to greatly varying context sizes
between items. Around 45% of the hateful tweets had one preceding tweet as context and
another 45% had between 2 and 5 preceding tweets. Applying this methodology, almost half of
the tweets which were annotated as abusive were labelled as non-abusive once context was
available [12].</p>
      <p>In a study with 10,000 YouTube comments, the quality of annotations with regard to inter-rater
agreement was measured. Context improved the metrics by less than 5% in absolute terms [13].</p>
      <p>In a study with Reddit posts, 27,000 posts were annotated [14]. Context was given by providing
the entire thread to the annotators. However, diverse uses of context for annotation were
reported. Quite low inter-rater agreement was reported; however, experiments showed an
overall trend of improvement for context modeling [14].</p>
      <p>Another dataset based on 6,800 Reddit posts including the context of one preceding comment
was created [15]. A crowd-worker annotation process showed low agreement, and low-quality
annotations were disregarded. Showing the previous post changed the judgment in over 30% of
the items of the Hate class. The best classification results reach F1 scores of 0.7 [15].</p>
      <p>Within HASOC, datasets were collected in two previous editions of ICHCL and experiments
were carried out [16, 6]. The diversity of the approaches, data sources, and context definitions
shows that further experiments are required.</p>
    </sec>
    <sec id="sec-4">
      <title>3. ICHCL Task Overview and Dataset</title>
      <p>A conversational thread might contain hateful, offensive, or profane language. This kind of
content may not be immediately noticeable within individual tweets, comments, or responses to
such tweets or comments. Nevertheless, it is possible to detect such hate speech by examining
the original content and the context in which it was posted. For two editions of ICHCL, we
have been focusing on detecting such hateful content in conversations. This year, we introduce
a variation of the last two editions. We provide details about this in the following section. In
the subsequent section, we discuss the details of the ICHCL 2023 dataset.</p>
      <sec id="sec-4-1">
        <title>3.1. Task Overview</title>
        <p>Training supervised models for classifying code-mixed text presents substantial challenges
due to the limited availability of labeled data and the associated high cost of annotating large
datasets. However, employing semi-supervised learning methods can alleviate these challenges
by leveraging unlabeled data to improve model accuracy and reduce the need for extensive
labeled data.</p>
        <p>As a result, the ICHCL task was developed further. Participants received an unlabeled training
dataset and a labeled test dataset containing around 1,000 code-mixed Hindi samples. A crucial
requirement was that participants must utilize the new unlabeled data to make predictions on
the test dataset.</p>
        <p>The classification task was divided into two subtasks:
• Task 2a: This subtask focuses on the binary classification of conversational tweets with
tree-structured data into:
– (NOT) Non-Hate-Offensive - This tweet, comment, or reply does not contain any
hate speech or offensive content.
– (HOF) Hate and Offensive - This tweet, comment, or reply contains hate speech,
offensive, or profane content either on its own or in support of hate expressed in
the parent tweet.
• Task 2b: This subtask is centered on classifying conversational tweets with tree-structured
data into specific forms of hate, as follows:
– (SHOF) Standalone Hate - This tweet, comment, or reply contains hate speech,
offensive, or profane content on its own.
– (CHOF) Contextual Hate - A comment or reply supporting the hate, offense, and
profanity expressed in its parent. This includes affirming the hate with positive
sentiment and having apparent hate.
– (NONE) Non-Hate - This tweet, comment, or reply does not contain any hate speech
or offensive or profane content.</p>
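        <p>As an illustration of the tree-structured data described above, the following sketch shows how a comment or reply might be flattened together with its parent context into a single classification input. The dictionary format, the [SEP] marker, and the example texts are hypothetical and do not represent the official dataset format.</p>

```python
# Sketch: flattening a tree-structured conversation item by prepending parent
# context. Label names follow the task definition (NOT/HOF for Task 2a,
# NONE/SHOF/CHOF for Task 2b); the data structure itself is illustrative.

def flatten_with_context(thread, node_id):
    """Concatenate the texts on the path from the root tweet down to the
    given comment/reply, separated by a marker token."""
    path = []
    current = node_id
    while current is not None:
        node = thread[current]
        path.append(node["text"])
        current = node["parent"]
    return " [SEP] ".join(reversed(path))

# A hypothetical thread: a tweet, one comment, one reply to that comment.
thread = {
    "t1": {"text": "Original tweet", "parent": None, "label": "NONE"},
    "c1": {"text": "Hateful comment", "parent": "t1", "label": "SHOF"},
    "r1": {"text": "Reply agreeing with the hate", "parent": "c1", "label": "CHOF"},
}

# The reply is classified together with its full parent context.
print(flatten_with_context(thread, "r1"))
# -> Original tweet [SEP] Hateful comment [SEP] Reply agreeing with the hate
```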
        <p>This edition addresses the scarcity of labeled data and reduces annotation costs by providing
only unlabeled data to participants. Participants received an unlabeled training dataset and a
labeled test dataset comprising approximately 1,000 code-mixed Hindi samples. Additionally,
a crucial requirement in this edition of ICHCL was that participants needed to leverage the
provided unlabeled data to make predictions on the test dataset. Furthermore, a link
to their GitHub repository to demonstrate compliance with the requirement was mandatory.
To ensure fairness and equal opportunities for all participants, we imposed a condition that
restricted participants to transformers with fewer than 200M parameters, preventing groups with
extensive computational resources from gaining an unfair advantage.</p>
      </sec>
      <sec id="sec-4-2">
        <title>3.2. Dataset</title>
        <p>This section will provide an overview of how we gathered the dataset and present its statistics.
To ensure a fair and unbiased sample of tweets, we selected controversial news stories covering
a wide range of topics. We specifically handpicked contentious stories that were highly likely to
contain hateful, offensive, or profane comments. These stories were drawn from the following
categories:
• Brahmin Controversy in JNU
• Corruption
• Hinduphobia
• Kali smoking controversy
• Karnataka Election
• Kerala stories
• Modi clean chit
• Nupur Sharma
• Pakistan World Cup loss
• Udaipur murder
• Uddhav Thackeray government
• Zubair arrest</p>
        <p>The participants were encouraged to use the ICHCL 2021 and 2022 datasets as labeled data. In
Table 1, we present the dataset statistics of the 2021 and 2022 datasets. We also present the
statistics for the 2023 test data and the unlabeled training data. Table 2 presents the
inter-annotator agreement for each level.</p>
        <table-wrap id="tab1">
          <label>Table 1</label>
          <caption>
            <p>Dataset statistics. For the 2021 data, hateful content carries the binary label HOF; from 2022 onwards, it is split into standalone (SHOF) and contextual (CHOF) hate.</p>
          </caption>
          <table>
            <thead>
              <tr>
                <th/>
                <th colspan="3">#Twitter Posts</th>
                <th colspan="3">#Comments on Posts</th>
                <th colspan="3">#Replies on Comments</th>
              </tr>
              <tr>
                <th/>
                <th>HOF/SHOF</th><th>CHOF</th><th>NONE</th>
                <th>HOF/SHOF</th><th>CHOF</th><th>NONE</th>
                <th>HOF/SHOF</th><th>CHOF</th><th>NONE</th>
              </tr>
            </thead>
            <tbody>
              <tr><td>Train (2021) [5]</td><td>49</td><td>-</td><td>33</td><td>1820</td><td>-</td><td>1958</td><td>972</td><td>-</td><td>908</td></tr>
              <tr><td>Train (2022) [6]</td><td>75</td><td>-</td><td>97</td><td>588</td><td>171</td><td>1166</td><td>973</td><td>717</td><td>1127</td></tr>
              <tr><td>Test (2023)</td><td>1</td><td>-</td><td>5</td><td>141</td><td>68</td><td>523</td><td>112</td><td>79</td><td>69</td></tr>
              <tr><td>Total</td><td>125</td><td>-</td><td>135</td><td>2549</td><td>239</td><td>3647</td><td>2057</td><td>736</td><td>2104</td></tr>
              <tr><td>Unlabeled (2023)</td><td colspan="3">26</td><td colspan="3">3928</td><td colspan="3">4571</td></tr>
            </tbody>
          </table>
        </table-wrap>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4. Results</title>
      <p>In this edition of ICHCL, an initiative was put in place to encourage young researchers to
develop innovative solutions. We introduced a semi-supervised version of the task to broaden
the approaches used. Unfortunately, no submissions utilized the semi-supervised methods,
highlighting a lack of interest in this part of the task. However, we still present the results and
approaches by the participants.</p>
      <p>As we can see in Table 3, for Task 2A, which focuses on binary classification, FiRC-NLP
secured the top position with their submission ”parfirst2_all_folds,” achieving an impressive F1
score of 0.8079. They were closely followed by IRLab@IITBHU, Chetona, and AiAlchemists,
demonstrating competitive results in precision and recall. In Task 2B, a multi-class classification
task, FiRC-NLP continued to lead with their submission ”parfirst_top3_top7_task2b,” achieving a
significant F1 macro score of 0.6541, presented in Table 4. IRLab@IITBHU and AiAlchemists also
demonstrated notable performance. However, it is worth noting that the baseline submissions
by HASOC in both tasks ranked lower, underlining the competitiveness of the shared task.</p>
      <table-wrap id="tab2">
        <label>Table 2</label>
        <caption>
          <p>Inter-annotator agreement (IAA) per conversation level.</p>
        </caption>
        <table>
          <thead>
            <tr><th>Type</th><th>Main</th><th>Comment</th><th>Replies</th></tr>
          </thead>
          <tbody>
            <tr><td>IAA after two annotation rounds</td><td>0.800</td><td>0.85381</td><td>0.93961</td></tr>
            <tr><td>IAA after three annotation rounds</td><td>1.000</td><td>0.80276</td><td>0.90243</td></tr>
          </tbody>
        </table>
      </table-wrap>
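      <p>The F1 macro scores used for ranking average the per-class F1 values without weighting, so rare classes such as CHOF count as much as the majority class. A minimal sketch of this computation, with illustrative labels only:</p>

```python
# Sketch of macro-averaged F1: compute F1 per class, then take the
# unweighted mean over classes. Labels below are illustrative.

def macro_f1(gold, pred):
    classes = sorted(set(gold) | set(pred))
    scores = []
    for c in classes:
        tp = sum(1 for g, p in zip(gold, pred) if g == c and p == c)
        fp = sum(1 for g, p in zip(gold, pred) if g != c and p == c)
        fn = sum(1 for g, p in zip(gold, pred) if g == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        scores.append(f1)
    return sum(scores) / len(scores)

gold = ["NONE", "SHOF", "CHOF", "NONE", "SHOF"]
pred = ["NONE", "SHOF", "NONE", "NONE", "CHOF"]
print(round(macro_f1(gold, pred), 4))  # -> 0.4889
```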
    </sec>
    <sec id="sec-6">
      <title>5. Methodology</title>
      <p>In this section, we first explain the baseline we provided to the participants, and then we discuss
the methodology of the top two teams.</p>
      <sec id="sec-6-1">
        <title>5.1. Baseline Model</title>
        <p>In order to support a low threshold for entry to the shared task, a baseline model was provided
for participants. It included a template for steps like importing data, preprocessing,
feature extraction, and classification. The participating teams could make changes in the code and
experiment with various settings.</p>
        <p>This year, we use a semi-supervised baseline; specifically, we use pseudo-labeling. First, we
fine-tune a bert-base-multilingual model on the labeled part of the dataset (the 2021 and 2022
datasets). We then predict labels for the unlabeled training set (the 2023 training data) and
fine-tune the model again on the entire dataset (the 2021 and 2022 datasets, and the 2023
dataset with predicted labels).</p>
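        <p>The pseudo-labeling loop can be sketched as follows. A toy word-count classifier stands in for the fine-tuned bert-base-multilingual model so that the three steps stay self-contained; the data and the model are illustrative only, not the actual baseline code.</p>

```python
# Minimal sketch of the pseudo-labeling baseline: train on labeled data,
# predict pseudo-labels for unlabeled data, then re-train on the union.
# A toy per-label word-count "model" stands in for the transformer.
from collections import Counter

def train(texts, labels):
    """'Fit' the toy model: accumulate word counts per label."""
    counts = {}
    for text, label in zip(texts, labels):
        counts.setdefault(label, Counter()).update(text.lower().split())
    return counts

def predict(model, text):
    """Return the label whose word counts best match the text."""
    words = text.lower().split()
    return max(model, key=lambda lab: sum(model[lab][w] for w in words))

# Step 1: train on the labeled data (toy stand-in for the 2021/2022 sets).
labeled_texts = ["you are awful and stupid", "what a lovely day"]
labeled_labels = ["HOF", "NOT"]
model = train(labeled_texts, labeled_labels)

# Step 2: predict pseudo-labels for the unlabeled data (2023 training set).
unlabeled_texts = ["such a stupid awful take", "lovely weather today"]
pseudo = [predict(model, t) for t in unlabeled_texts]

# Step 3: re-train on labeled plus pseudo-labeled data combined.
model = train(labeled_texts + unlabeled_texts, labeled_labels + pseudo)
print(pseudo)  # -> ['HOF', 'NOT']
```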
      </sec>
      <sec id="sec-6-2">
        <title>5.2. Participant approaches</title>
        <p>In this section, we explain and summarise the most successful participant approaches:
• FiRC-NLP: The system uses concatenation to incorporate context and fine-tunes
XLM-RoBERTa-large for binary classification. For the multiclass task, the team first applies the
same binary classifier to separate hate from non-hate, and then fine-tunes another LLM to
classify hate into standalone or contextual hate [17].
• IRLab@IITBHU: The submission implements a contrastive loss function to fine-tune
the vanilla mBERT model, which is then used to obtain features for each individual level.
After this step, they pass the features through a two-layer LSTM model to incorporate
the context together with features from Sentence-BERT.
• Chetona: The submission concatenates the different levels of the conversational thread
given. It then applies IndicBERT to encode the text and classifies based on the training
data [19].</p>
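        <p>The two-stage scheme described for the multiclass task can be sketched as follows. Both classifier functions are keyword stubs standing in for the fine-tuned transformer models actually used; the labels follow the task definition, but the logic is illustrative only.</p>

```python
# Sketch of a two-stage classifier for Task 2b: a binary hate/non-hate
# stage is applied first, and only items flagged as hate are passed to a
# second stage that separates standalone (SHOF) from contextual (CHOF)
# hate. The two stage functions are stubs, not real fine-tuned models.

def is_hate(text, context):
    # Stage 1 stub: binary HOF vs. NOT, judged on text plus parent context.
    return "hate" in (context + " " + text).lower()

def hate_subtype(text, context):
    # Stage 2 stub: SHOF if the text itself is hateful, else CHOF
    # (hate only by agreement with a hateful parent).
    return "SHOF" if "hate" in text.lower() else "CHOF"

def classify(text, context=""):
    if not is_hate(text, context):
        return "NONE"
    return hate_subtype(text, context)

print(classify("a friendly reply"))                     # -> NONE
print(classify("pure hate"))                            # -> SHOF
print(classify("totally agree!", context="pure hate"))  # -> CHOF
```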
      </sec>
    </sec>
    <sec id="sec-7">
      <title>6. Conclusion</title>
      <p>We reported on experiments with conversational and contextual hate speech detection. The
new ICHCL dataset was created with a higher inter-rater agreement. The use of unlabelled data
was set as the challenge for the 2023 task. However, participants did not use the data in that way.
Overall, the submissions reached a good level of performance, with up to a 0.8 F1 score, applying
deep learning models.</p>
      <p>In future evaluations, data augmentation by large language models might be a valuable
direction. First experiments report positive outcomes [22].</p>
      <p>[5] S. Satapara, S. Modha, T. Mandl, H. Madhu, P. Majumder, Overview of the HASOC subtrack at FIRE 2021: Conversational hate speech detection in code-mixed language, in: P. Mehta, T. Mandl, P. Majumder, M. Mitra (Eds.), Working Notes of FIRE 2021 - Forum for Information Retrieval Evaluation, Gandhinagar, India, December 13-17, 2021, volume 3159 of CEUR Workshop Proceedings, CEUR-WS.org, 2021, pp. 20–31. URL: https://ceur-ws.org/Vol-3159/T1-2.pdf.</p>
      <p>[6] S. Modha, T. Mandl, P. Majumder, S. Satapara, T. Patel, H. Madhu, Overview of the HASOC subtrack at FIRE 2022: Identification of conversational hate-speech in Hindi-English code-mixed and German language, in: Working Notes of FIRE 2022 - Forum for Information Retrieval Evaluation, Kolkata, India, December 9-13, 2022, 2022, pp. 475–488. URL: https://ceur-ws.org/Vol-3395/T7-1.pdf.</p>
      <p>[7] H. Park, S. Cho, J. Park, Word RNN as a baseline for sentence completion, in: 5th IEEE International Congress on Information Science and Technology, CiSt 2018, Marrakech, Morocco, October 21-27, 2018, IEEE, 2018, pp. 183–187. doi:10.1109/CIST.2018.8596572.</p>
      <p>[8] L. Gao, R. Huang, Detecting online hate speech using context aware models, in: Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017, 2017, pp. 260–266. doi:10.26615/978-954-452-049-6_036.</p>
      <p>[9] G. Gorrell, E. Kochkina, M. Liakata, A. Aker, A. Zubiaga, K. Bontcheva, L. Derczynski, SemEval-2019 task 7: RumourEval, determining rumour veracity and support for rumours, in: Proceedings of the 13th International Workshop on Semantic Evaluation, Association for Computational Linguistics, Minneapolis, Minnesota, USA, 2019, pp. 845–854. doi:10.18653/v1/S19-2147.</p>
      <p>[10] Q. Li, Q. Zhang, L. Si, eventAI at SemEval-2019 task 7: Rumor detection on social media by exploiting content, user credibility and propagation information, in: Proceedings of the 13th International Workshop on Semantic Evaluation, Association for Computational Linguistics, Minneapolis, Minnesota, USA, 2019, pp. 855–859. doi:10.18653/v1/S19-2148.</p>
      <p>[11] J. Pavlopoulos, J. Sorensen, L. Dixon, N. Thain, I. Androutsopoulos, Toxicity detection: Does context really matter?, in: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020, Online, July 5-10, 2020, Association for Computational Linguistics, 2020, pp. 4296–4305. doi:10.18653/v1/2020.acl-main.396.</p>
      <p>[12] S. Menini, A. P. Aprosio, S. Tonelli, Abuse is contextual, what about NLP? The role of context in abusive language annotation and detection, CoRR abs/2103.14916 (2021). URL: https://arxiv.org/abs/2103.14916.</p>
      <p>[13] N. Ljubešić, I. Mozetič, P. K. Novak, Quantifying the impact of context on the quality of manual hate speech annotation, Natural Language Engineering (2022) 1–14. doi:10.1017/S1351324922000353.</p>
      <p>[14] B. Vidgen, D. Nguyen, H. Margetts, P. Rossini, R. Tromble, Introducing CAD: the contextual abuse dataset, in: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021, pp. 2289–2303. doi:10.18653/v1/2021.naacl-main.182.</p>
      <p>[15] X. Yu, E. Blanco, L. Hong, Hate speech and counter speech detection: Conversational context does matter, arXiv preprint (2022). URL: https://arxiv.org/abs/2206.06423.</p>
      <p>[16] S. Satapara, S. Modha, T. Mandl, H. Madhu, P. Majumder, Overview of the HASOC subtrack at FIRE 2021: Conversational hate speech detection in code-mixed language, in: P. Mehta, T. Mandl, P. Majumder, M. Mitra (Eds.), Working Notes of FIRE 2021 - Forum for Information Retrieval Evaluation, Gandhinagar, India, December 13-17, 2021, volume 3159 of CEUR Workshop Proceedings, CEUR-WS.org, 2021, pp. 20–31. URL: http://ceur-ws.org/Vol-3159/T1-2.pdf.</p>
      <p>[17] M. S. Jahan, F. Hassan, W. Mohamed, A. M. Bouchekif, Multilingual hate speech detection using ensemble of transformer models, in: Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation, Goa, India, December 15-18, 2023, CEUR-WS.org, 2023.</p>
      <p>[18] S. Chandal, A. Dhaka, S. Pal, Crossing borders: Multilingual hate speech detection, in: K. Ghosh, T. Mandl, P. Majumder, M. Mitra (Eds.), Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation, Goa, India, December 15-18, 2023, CEUR-WS.org, 2023.</p>
      <p>[19] N. Madani, S. Saha, M. Sullivan, R. Srihari, Hate speech detection in low resource Indo-Aryan languages, in: K. Ghosh, T. Mandl, P. Majumder, M. Mitra (Eds.), Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation, Goa, India, December 15-18, 2023, CEUR-WS.org, 2023.</p>
      <p>[20] C. Muhammad Awais, J. Raj, Breaking barriers: Multilingual toxicity analysis for hate speech and offensive language in low-resource Indo-Aryan languages, in: Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation, CEUR, 2023.</p>
      <p>[21] P. M, R. K, A. Hegde, K. G, S. Coelho, H. L. Shashirekha, Taming toxicity: Learning models for hate speech and offensive language detection in social media text, in: K. Ghosh, T. Mandl, P. Majumder, M. Mitra (Eds.), Working Notes of FIRE 2023 - Forum for Information Retrieval Evaluation, Goa, India, December 15-18, 2023, CEUR-WS.org, 2023.</p>
      <p>[22] A. Anuchitanukul, J. Ive, L. Specia, Revisiting contextual toxicity detection in conversations, ACM J. Data Inf. Qual. 15 (2023) 6:1–6:22. doi:10.1145/3561390.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] I. Kamenova, A. Perliger, Online hate crimes, Handbook on Crime and Technology (2023) 278. doi:10.4337/9781800886643.00026.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[2] C. H. Chang, N. R. Deshmukh, P. R. Armsworth, Y. J. Masuda, Environmental users abandoned Twitter after Musk takeover, Trends in Ecology &amp; Evolution (2023). doi:10.1016/j.tree.2023.07.002.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[3] A. Turillazzi, M. Taddeo, L. Floridi, F. Casolari, The digital services act: an analysis of its ethical, legal, and social implications, Law, Innovation and Technology 15 (2023) 83–106. doi:10.1080/17579961.202.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[4] A. Arora, P. Nakov, M. Hardalov, S. M. Sarwar, V. Nayak, Y. Dinkov, D. Zlatkova, K. Dent, A. Bhatawdekar, G. Bouchard, I. Augenstein, Detecting harmful content on online platforms: What platforms need vs. where research efforts go, ACM Computing Surveys (2023). doi:10.1145/3603399. Just Accepted.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>