<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Forum for Information Retrieval Evaluation, December</journal-title>
      </journal-title-group>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Subtrack at FIRE 2023: Identification of Tokens Contributing to Explicit Hate in English by Span Detection</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sarah Masud</string-name>
          <email>sarahm@iiitd.ac.in</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mohammad Aflah</string-name>
          <email>aflah20082@iiitd.ac.in</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Md. Shad Akhtar</string-name>
          <email>shad.akhtar@iiitd.ac.in</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tanmoy Chakraborty</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Indian Institute of Technology</institution>
          ,
          <addr-line>Delhi</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Indraprastha Institute of Information Technology</institution>
          ,
          <addr-line>Delhi</addr-line>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>1</volume>
      <fpage>5</fpage>
      <lpage>18</lpage>
      <abstract>
        <p>As hate speech continues to proliferate on the web, it is becoming increasingly important to develop computational methods to mitigate it. Reactively, using black-box models to identify hateful content can perplex users as to why their posts were automatically flagged as hateful. On the other hand, proactive mitigation can be achieved by suggesting rephrasing before a post is made public. However, both mitigation techniques require information about which part of a post contains the hateful aspect, i.e., what spans within a text are responsible for conveying hate. Better detection of such spans can significantly reduce explicitly hateful content on the web. To further contribute to this research area, we organized HateNorm at HASOC-FIRE 2023, focusing on explicit span detection in English Tweets. A total of 12 teams participated in the competition, with the highest macro-F1 observed at 0.58.</p>
      </abstract>
      <kwd-group>
        <kwd>Hate Span</kwd>
        <kwd>Explicit Hate</kwd>
        <kwd>English Tweet</kwd>
        <kwd>HASOC'</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>
content, one can look into developing systems that can capture and attend to the hateful spans
[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] within a sentence. Span detection can help develop a sense of rationale, act as a tool for
post hoc analysis, and improve the retrieval of critical facts in claim verification [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        Shared Task Objective. A hate span is a set of contiguous tokens that, in tandem,
communicate the explicit hatefulness in a sentence. Table 1 provides some examples of harmful
social media posts marked for hateful fragments. For instance, in the first sentence of Table 1,
“Women ... Can’t live with them ... {Can’t shoot them}”, the portion in braces (highlighted in red
in the original table) is considered a hateful span. Formally, given a hate sample tokenized as
        <inline-formula><tex-math>t = \langle x_1, x_2, \ldots, x_n \rangle</tex-math></inline-formula>, the
hate span identification task looks for a sequence of hateful tokens,
        <inline-formula><tex-math>\langle x_i, \ldots, x_{i+k} \rangle</tex-math></inline-formula> [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>Problem Definition: Given a hateful text, identify those specific fragments within the
sentence that are hateful. This is a sequence tagging task where the aim is to label each
word as either belonging to the hate span or not.</p>
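      <p>The tagging formulation above can be sketched in a few lines of Python. This is a minimal illustration and not the shared-task codebook: the helper name spans_to_bio and the representation of a span as an inclusive (start, end) pair of token indices are assumptions made for this example.</p>
      <preformat><![CDATA[
```python
from typing import List, Tuple


def spans_to_bio(num_tokens: int, spans: List[Tuple[int, int]]) -> List[str]:
    """Turn inclusive (start, end) token-index spans into B/I/O tags.

    Hypothetical helper: the span representation is an assumption.
    """
    tags = ["O"] * num_tokens              # every token starts outside a span
    for start, end in spans:
        tags[start] = "B"                  # first token of the span
        for i in range(start + 1, end + 1):
            tags[i] = "I"                  # remaining tokens of the span
    return tags


# A five-token sentence whose last two tokens form the only hate span.
print(spans_to_bio(5, [(3, 4)]))  # ['O', 'O', 'O', 'B', 'I']
```
]]></preformat>
      <p>A single-token span yields a lone ‘B’ tag, for example spans_to_bio(4, [(3, 3)]) gives O, O, O, B.</p>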
      <p>
        Shared Task Details. Through the HateNorm shared task, part of HASOC-FIRE 2023<sup>2</sup>, we
aimed at engaging the broader research community in understanding span detection techniques
and contributing towards the extraction of spans inside a hateful text. In this task, we repurpose
a part of the publicly available Hate Normalization dataset [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], with each data point containing
at least one hate span. The competition ran for about a month, from July 13, 2023, to August 16, 2023,
PST. Hosted on Kaggle, the task received 72 submissions from 12 teams.
      </p>
      <p>
        Observations. As opposed to a single label per input sentence in a general NLP classification
setup, for HateNorm, we had a label per token of the sentence [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Given the sequential nature
of the output, we observed initial hesitation among participants in working with the dataset.
However, their engagement improved once a starter kit/codebook was shared. We also observed
that among the teams that submitted a demo paper, the base architecture was more than
just a large language model (LLM)-based classifier. There was a mixed usage of both LLMs and
bidirectional LSTMs. Further, we noticed that half of the teams did not apply a CRF layer to
capture the sequential dependencies among the target labels but instead relied on the LLM’s ability to capture
context while making predictions for individual tags. The winning team, ‘FiRC-NLP’, with 8
submissions, obtained macro-F1 scores of 0.53 and 0.58 on the public and private leaderboards,
respectively. While this beats the starter-kit score of 0.34, it is comparable to the
SpanBERT-BiLSTM-CRF model from Masud et al. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], which also reported a macro-F1 of 0.58. More work
is needed to move mainstream attention beyond sentence-level classification of hatefulness toward detecting
spans. Shared task venues like HASOC and SemEval are steps in the right direction.
      </p>
      <p><sup>2</sup> https://hasocfire.github.io/hasoc/2023/</p>
    </sec>
    <sec id="sec-3">
      <title>2. Related Work</title>
      <p>
        Owing to the relevance and need for computational methods to tackle hate speech, we now have
a plethora of datasets [
        <xref ref-type="bibr" rid="ref5 ref6 ref7">5, 6, 7</xref>
        ] and techniques [
        <xref ref-type="bibr" rid="ref8 ref9">8, 9</xref>
        ] exploring the same. Regarding explicit hate
speech, hate lexicons [10, 11] have been explored. Auxiliary tasks such as hate normalization
[
        <xref ref-type="bibr" rid="ref3">3</xref>
        , 12, 13] and rationale prediction [14], underpinned by the presence or absence of hateful
phrases in a sentence, paved the way for hate span detection. In English, the task has been
explored from the point of view of detecting toxic and offensive spans [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ,
        <xref ref-type="bibr" rid="ref3">3</xref>
        , 12, 14]. In
low-resource settings, span detection has been explored for Vietnamese [15]. Via the MUDES
model, Ranasinghe and Zampieri [16] explored the cross-lingual applicability of hateful span
detection when trained on English span datasets. In the multimodal setting, video frames that
conveyed hatefulness were employed as hate spans [17]. Another work detected phrases and
sentences in long articles that contribute to hate [18]. In other areas of social computing, span
detection has been explored under the English [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and multilingual [19] factual claim detection
settings. Meanwhile, detecting the tokens a model pays attention to while labeling a sample as
hateful has been employed in post hoc explanations [20].
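      </p>
      <table-wrap id="tab-2">
        <label>Table 2</label>
        <table>
          <thead>
            <tr><th>Sentence</th><th>Span</th></tr>
          </thead>
          <tbody>
            <tr><td>lol what a stupid k*k*</td><td>{O, O, O, B, I}</td></tr>
            <tr><td>@user text me fa**ot.</td><td>{O, O, O, B}</td></tr>
            <tr><td>sad to say but I do not trust shit I know how bi****s operate</td><td>{O, O, O, O, O, B, I, I, I, O, O, O, B, O}</td></tr>
          </tbody>
        </table>
      </table-wrap>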
    </sec>
    <sec id="sec-4">
      <title>3. Dataset</title>
      <p>
        This task employed the existing dataset from Masud et al. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], curated initially for three tasks: hate intensity prediction, hate span prediction, and hate
normalization generation. We employ only the subset of samples labeled for hate span prediction
for hosting HateNorm. This led to a dataset of 3027 explicitly hateful sentences marked with hate
spans. As outlined in Table 2, the spans are tagged via the BIO notation: ‘B’ marks the beginning
of a span, ‘I’ marks a token inside a span, and ‘O’ marks a token outside any span. Note that a
single token can be a span with a corresponding ‘B’ tag. Meanwhile, an ‘I’ tag is always preceded
by a ‘B’ tag or by another ‘I’ tag of the same span. The 2421 train samples contained
4695 unique spans with an average of 1.939 spans per training instance. Figure 1 outlines the
distribution of the number of spans of a given length; the majority of spans are ≤ 5 tokens in
length. In the train set, each row contained an ‘id | space-separated tokens | list of span indices |
space-separated gold span labels’. Meanwhile, the 606 test instances were divided into 182 public
leaderboard instances and 424 privately held instances.
[Figure 1: Distribution of the number of spans by span length (x-axis: Length of Span).]
      </p>
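      <p>The BIO constraints described in this section can be checked mechanically. The following Python sketch recovers spans from a tag sequence and rejects an ‘I’ tag that does not continue a span. The function name bio_to_spans and the inclusive (start, end) output format are assumptions made for illustration and are not part of the released dataset tooling.</p>
      <preformat><![CDATA[
```python
from collections import Counter
from typing import List, Tuple


def bio_to_spans(tags: List[str]) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) index pairs for each span in a B/I/O list."""
    spans = []
    start = None                           # index of the currently open span
    for i, tag in enumerate(tags):
        if tag == "B":
            if start is not None:          # a new 'B' closes the previous span
                spans.append((start, i - 1))
            start = i
        elif tag == "I":
            if start is None:              # 'I' must continue an open span
                raise ValueError(f"'I' at position {i} does not follow 'B' or 'I'")
        else:                              # 'O' closes any open span
            if start is not None:
                spans.append((start, i - 1))
            start = None
    if start is not None:                  # span running to the end of the text
        spans.append((start, len(tags) - 1))
    return spans


tags = "O O O O O B I I I O O O B O".split()
spans = bio_to_spans(tags)
print(spans)  # [(5, 8), (12, 12)]

# Span-length histogram, the quantity summarized per dataset in Figure 1.
print(Counter(end - start + 1 for start, end in spans))
```
]]></preformat>
      <p>The example tag sequence is the 14-token row of Table 2; it contains one four-token span and one single-token span.</p>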
    </sec>
    <sec id="sec-5">
      <title>4. Task Details</title>
      <p>Hosting. HateNorm was hosted as a Kaggle<sup>3</sup> competition from 13th July 2023 to 16th August
2023 PST. It received participation from 12 teams, leading to 72 submissions (an average of
6 submissions per team) throughout the competition. As a part of the Kaggle competition,
participants were given a sample codebook and a sample ‘submission.csv’, as outlined in Table
3. We required the ‘id | space-separated predicted labels’ format for the submission file.</p>
      <p>Evaluation Metric. Unlike the classification of a single instance, which can be adjudged via
accuracy or macro-F1, span detection requires evaluating the correct ordering of tags within spans, with ‘I’
following a ‘B’ and ‘O’ being the default. To capture this sequential nature of label prediction for
individual tokens, we employ the seqeval macro-F1 metric [21]. We hosted the custom seqeval metric
as a script and loaded it while setting up the competition so that each incoming submission
is, by default, evaluated via seqeval macro-F1. Further, 70% of the test samples were held
private, and the final rankings, revealed after the contest, were based on them. During the contest,
the participants saw their rank on the public leaderboard, computed on the remaining 30% of the test samples. Note that a public
leaderboard does not mean that the test cases are public.</p>
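      <p>The span-level scoring idea behind the metric can be illustrated without the library itself. The following is a simplified, pure-Python sketch of exact-match span F1 and not the seqeval implementation used in the competition. The function names extract_spans and span_f1 are hypothetical, stray ‘I’ tags that do not follow a ‘B’ are simply ignored here, and because the task has a single span type, the macro average is assumed to reduce to this one score.</p>
      <preformat><![CDATA[
```python
from typing import List, Set, Tuple


def extract_spans(tags: List[str]) -> Set[Tuple[int, int]]:
    """Collect inclusive (start, end) spans: a span opens at 'B' and extends over 'I'."""
    spans, start = set(), None
    for i, tag in enumerate(tags + ["O"]):   # sentinel 'O' closes a trailing span
        if start is not None and tag != "I":
            spans.add((start, i - 1))
            start = None
        if tag == "B":
            start = i
    return spans


def span_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Exact-match span F1 accumulated over a corpus of tag sequences."""
    tp = n_gold = n_pred = 0
    for g, p in zip(gold, pred):
        g_spans, p_spans = extract_spans(g), extract_spans(p)
        tp += len(g_spans.intersection(p_spans))   # spans with identical boundaries
        n_gold += len(g_spans)
        n_pred += len(p_spans)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


gold = [["O", "O", "O", "B", "I"], ["O", "O", "O", "B"]]
pred = [["O", "O", "O", "B", "O"], ["O", "O", "O", "B"]]
print(span_f1(gold, pred))  # 0.5
```
]]></preformat>
      <p>In the example, the first prediction finds the right start but the wrong end, so it earns no credit; only the second sentence contributes a true positive. This is why span-level scores are typically much lower than token-level accuracy.</p>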
      <p>Baselines. The codebook provided to the participants finetuned a DistilBERT+FNN setup,
which reported a low macro-F1 of 0.36. Meanwhile, the baselines provided by Masud et al. [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] consisted of BiLSTM+CRF with a macro-F1 of 0.44, and the best method<sup>4</sup>, a
SpanBERT [22]+BiLSTM+CRF system, reported a macro-F1 of 0.58.</p>
      <p><sup>3</sup> https://www.kaggle.com/competitions/hatenorm23</p>
      <table-wrap id="tab-4">
        <label>Table 4</label>
        <caption>
          <p>Top 6 submissions.</p>
        </caption>
        <table>
          <thead>
            <tr><th>Rank (Change in Rank)</th><th>Team Name</th></tr>
          </thead>
          <tbody>
            <tr><td>1 (-)</td><td>FiRC-NLP</td></tr>
            <tr><td>2 (↑3)</td><td>Mohammadmostafa78</td></tr>
            <tr><td>3 (↓1)</td><td>CNLP-NITS-PP</td></tr>
            <tr><td>4 (↓1)</td><td>IRLab@IITBHU</td></tr>
            <tr><td>5 (↓1)</td><td>Niranjan Rao</td></tr>
            <tr><td>6 (↓1)</td><td>TextShield</td></tr>
          </tbody>
        </table>
      </table-wrap>
    </sec>
    <sec id="sec-6">
      <title>5. Submitted System</title>
      <p>
        Table 4 lists the top 6 submissions. Among the participating teams that shared their overview
notes, we observed that ‘FiRC-NLP’ employed an ensemble of SpanBERT + CRF with teacher
enforcing. When run under lowercase preprocessing, the setup led to the highest macro-F1 of
0.58. The SpanBERT-based method of ‘FiRC-NLP’ is thus on par with the SpanBERT system of
the baseline solution [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Note that owing to the one-to-one mapping of tokens to span tags, we
discouraged the users from performing additional preprocessing. Meanwhile, the second-best
team, ‘Mohammadmostafa78’, with a macro-F1 of 0.52, overcame the skewness in BIO tags
by converting the label space to only BO and employing an XLM-RoBERTa [23]+FNN setup.
The third-highest-scoring teams, ‘CNLP-NITS-PP’ and ‘IRLab@IITBHU’, have a macro-F1 of
0.51, differing only in the fourth decimal place. However, both employ distinct methods. While the
former employs a BERT+BiLSTM+FNN setup, the latter employs a pretrained word embedding (GloVe)
based BiLSTM+CRF setup akin to the existing baseline. Similar to the observations in our
baseline solutions, we observe that BiLSTM and embedding-based solutions perform
considerably well. Overall, while Transformer systems, either in the form of BERT or SpanBERT,
help improve the performance, a BiLSTM system trained via CRF is equally viable. We also
observe that the submitted systems more or less follow the performance trends of
the existing baseline solutions, further corroborating that combining Transformer-based systems
with CRF and BiLSTM layers is an effective way to detect hateful spans.
      </p>
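      <p>One plausible reading of the BO conversion mentioned above is to relabel every ‘I’ as ‘B’ during training and to restore ‘I’ tags before submission. The Python sketch below assumes this reading; the participants' actual procedure may differ, and the names bio_to_bo and bo_to_bio are hypothetical.</p>
      <preformat><![CDATA[
```python
from typing import List


def bio_to_bo(tags: List[str]) -> List[str]:
    """Collapse 'I' into 'B' so that every span token shares one positive label."""
    return ["B" if tag in ("B", "I") else "O" for tag in tags]


def bo_to_bio(tags: List[str]) -> List[str]:
    """Restore BIO: the first 'B' of each run stays 'B', the rest become 'I'."""
    restored, previous = [], "O"
    for tag in tags:
        restored.append("I" if tag == "B" and previous == "B" else tag)
        previous = tag                     # track the BO tag, not the restored one
    return restored


bio = "O O O O O B I I I O O O B O".split()
bo = bio_to_bo(bio)
print(bo)
print(bo_to_bio(bo) == bio)  # True
```
]]></preformat>
      <p>Under this assumption, the two-label space removes the rare ‘I’ class, at the cost that two spans touching each other with no ‘O’ between them would be merged when the BIO tags are restored.</p>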
    </sec>
    <sec id="sec-7">
      <title>6. Conclusion</title>
      <p>Despite engaging with malicious content, some online users are adaptable and can be persuaded
to change their beliefs through empathy and corrective conduct. Through this task, we aimed
to help these users, whose social interactions can eventually be nudged toward becoming non-hateful.
We believe that the proposed systems can be effectively utilized to assist the moderators.</p>
      <p><sup>4</sup> Note: We excluded the ELMo-based system from the baselines due to reproducibility issues with ELMo on both TensorFlow
and PyTorch.</p>
      <ref-list>
        <p>R. Reichart, M. Roberts, U. Shalit, B. Stewart, V. Veitch, D. Yang (Eds.), Proceedings of the First Workshop on Causal Inference and NLP, Association for Computational Linguistics, Punta Cana, Dominican Republic, 2021, pp. 74–82. URL: https://aclanthology.org/2021.cinlp-1.6. doi:10.18653/v1/2021.cinlp-1.6.</p>
        <ref id="ref10"><mixed-citation>[10] M. Polignano, G. Colavito, C. Musto, M. de Gemmis, G. Semeraro, Lexicon enriched hybrid hate speech detection with human-centered explanations, in: Adjunct Proceedings of the 30th ACM Conference on User Modeling, Adaptation and Personalization, UMAP ’22 Adjunct, Association for Computing Machinery, New York, NY, USA, 2022, p. 184–191. URL: https://doi.org/10.1145/3511047.3537688. doi:10.1145/3511047.3537688.</mixed-citation></ref>
        <ref id="ref11"><mixed-citation>[11] V. Stamou, I. Alexiou, A. Klimi, E. Molou, A. Saivanidou, S. Markantonatou, Cleansing &amp; expanding the HURTLEX(el) with a multidimensional categorization of offensive words, in: K. Narang, A. Mostafazadeh Davani, L. Mathias, B. Vidgen, Z. Talat (Eds.), Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH), Association for Computational Linguistics, Seattle, Washington (Hybrid), 2022, pp. 102–108. URL: https://aclanthology.org/2022.woah-1.10. doi:10.18653/v1/2022.woah-1.10.</mixed-citation></ref>
        <ref id="ref12"><mixed-citation>[12] J. Pavlopoulos, L. Laugier, A. Xenos, J. Sorensen, I. Androutsopoulos, From the detection of toxic spans in online discussions to the analysis of toxic-to-civil transfer, in: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Association for Computational Linguistics, Dublin, Ireland, 2022, pp. 3721–3734. URL: https://aclanthology.org/2022.acl-long.259. doi:10.18653/v1/2022.acl-long.259.</mixed-citation></ref>
        <ref id="ref13"><mixed-citation>[13] V. Agarwal, Y. Chen, N. Sastry, Haterephrase: Zero- and few-shot reduction of hate intensity in online posts using large language models, 2023. arXiv:2310.13985.</mixed-citation></ref>
        <ref id="ref14"><mixed-citation>[14] B. Mathew, P. Saha, S. M. Yimam, C. Biemann, P. Goyal, A. Mukherjee, Hatexplain: A benchmark dataset for explainable hate speech detection, in: Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, 2021, pp. 14867–14875.</mixed-citation></ref>
        <ref id="ref15"><mixed-citation>[15] P. G. Hoang, C. D. Luu, K. Q. Tran, K. V. Nguyen, N. L.-T. Nguyen, ViHOS: Hate speech spans detection for Vietnamese, in: Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, Association for Computational Linguistics, Dubrovnik, Croatia, 2023, pp. 652–669. URL: https://aclanthology.org/2023.eacl-main.47. doi:10.18653/v1/2023.eacl-main.47.</mixed-citation></ref>
        <ref id="ref16"><mixed-citation>[16] T. Ranasinghe, M. Zampieri, MUDES: Multilingual detection of offensive spans, in: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Demonstrations, Association for Computational Linguistics, Online, 2021, pp. 144–152. URL: https://aclanthology.org/2021.naacl-demos.17. doi:10.18653/v1/2021.naacl-demos.17.</mixed-citation></ref>
        <ref id="ref17"><mixed-citation>[17] M. Das, R. Raj, P. Saha, B. Mathew, M. Gupta, A. Mukherjee, Hatemm: A multi-modal dataset for hate video classification, Proceedings of the International AAAI Conference on Web and Social Media 17 (2023) 1014–1023. URL: https://ojs.aaai.org/index.php/ICWSM/article/view/22209. doi:10.1609/icwsm.v17i1.22209.</mixed-citation></ref>
        <ref id="ref18"><mixed-citation>[18] L. Zhou, A. Caines, I. Pete, A. Hutchings, Automated hate speech detection and span extraction in underground hacking and extremist forums, Natural Language Engineering 29 (2023) 1247–1274. doi:10.1017/S1351324922000262.</mixed-citation></ref>
        <ref id="ref19"><mixed-citation>[19] S. Mittal, M. Sundriyal, P. Nakov, Lost in translation, found in spans: Identifying claims in multilingual social media, arXiv:2310.18205 (2023).</mixed-citation></ref>
        <ref id="ref20"><mixed-citation>[20] B. Kennedy, X. Jin, A. Mostafazadeh Davani, M. Dehghani, X. Ren, Contextualizing hate speech classifiers with post-hoc explanation, in: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, Online, 2020, pp. 5435–5442. URL: https://aclanthology.org/2020.acl-main.483. doi:10.18653/v1/2020.acl-main.483.</mixed-citation></ref>
        <ref id="ref21"><mixed-citation>[21] H. Nakayama, seqeval: A python framework for sequence labeling evaluation, 2018. URL: https://github.com/chakki-works/seqeval.</mixed-citation></ref>
        <ref id="ref22"><mixed-citation>[22] M. Joshi, D. Chen, Y. Liu, D. S. Weld, L. Zettlemoyer, O. Levy, SpanBERT: Improving pre-training by representing and predicting spans, Transactions of the Association for Computational Linguistics 8 (2020) 64–77. URL: https://aclanthology.org/2020.tacl-1.5. doi:10.1162/tacl_a_00300.</mixed-citation></ref>
        <ref id="ref23"><mixed-citation>[23] A. Conneau, K. Khandelwal, N. Goyal, V. Chaudhary, G. Wenzek, F. Guzmán, E. Grave, M. Ott, L. Zettlemoyer, V. Stoyanov, Unsupervised cross-lingual representation learning at scale, in: D. Jurafsky, J. Chai, N. Schluter, J. Tetreault (Eds.), Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, Online, 2020, pp. 8440–8451. URL: https://aclanthology.org/2020.acl-main.747. doi:10.18653/v1/2020.acl-main.747.</mixed-citation></ref>
      </ref-list>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name><given-names>J.</given-names> <surname>Pavlopoulos</surname></string-name>,
          <string-name><given-names>J.</given-names> <surname>Sorensen</surname></string-name>,
          <string-name><given-names>L.</given-names> <surname>Laugier</surname></string-name>,
          <string-name><given-names>I.</given-names> <surname>Androutsopoulos</surname></string-name>,
          <article-title>SemEval-2021 task 5: Toxic spans detection</article-title>,
          <source>in: Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)</source>,
          Association for Computational Linguistics, Online,
          <year>2021</year>, pp. <fpage>59</fpage>-<lpage>69</lpage>.
          URL: https://aclanthology.org/2021.semeval-1.6. doi:10.18653/v1/2021.semeval-1.6.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name><given-names>M.</given-names> <surname>Sundriyal</surname></string-name>,
          <string-name><given-names>A.</given-names> <surname>Kulkarni</surname></string-name>,
          <string-name><given-names>V.</given-names> <surname>Pulastya</surname></string-name>,
          <string-name><given-names>M. S.</given-names> <surname>Akhtar</surname></string-name>,
          <string-name><given-names>T.</given-names> <surname>Chakraborty</surname></string-name>,
          <article-title>Empowering the fact-checkers! automatic identification of claim spans on Twitter</article-title>,
          <source>in: Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing</source>,
          Association for Computational Linguistics, Abu Dhabi, United Arab Emirates,
          <year>2022</year>, pp. <fpage>7701</fpage>-<lpage>7715</lpage>.
          URL: https://aclanthology.org/2022.emnlp-main.525. doi:10.18653/v1/2022.emnlp-main.525.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name><given-names>S.</given-names> <surname>Masud</surname></string-name>,
          <string-name><given-names>M.</given-names> <surname>Bedi</surname></string-name>,
          <string-name><given-names>M. A.</given-names> <surname>Khan</surname></string-name>,
          <string-name><given-names>M. S.</given-names> <surname>Akhtar</surname></string-name>,
          <string-name><given-names>T.</given-names> <surname>Chakraborty</surname></string-name>,
          <article-title>Proactively reducing the hate intensity of online posts via hate speech normalization</article-title>,
          <source>in: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining</source>,
          KDD '22, Association for Computing Machinery, New York, NY, USA,
          <year>2022</year>, p. <fpage>3524</fpage>-<lpage>3534</lpage>.
          URL: https://doi.org/10.1145/3534678.3539161. doi:10.1145/3534678.3539161.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name><given-names>A.</given-names> <surname>Vaswani</surname></string-name>,
          <string-name><given-names>N.</given-names> <surname>Shazeer</surname></string-name>,
          <string-name><given-names>N.</given-names> <surname>Parmar</surname></string-name>,
          <string-name><given-names>J.</given-names> <surname>Uszkoreit</surname></string-name>,
          <string-name><given-names>L.</given-names> <surname>Jones</surname></string-name>,
          <string-name><given-names>A. N.</given-names> <surname>Gomez</surname></string-name>,
          <string-name><given-names>Ł.</given-names> <surname>Kaiser</surname></string-name>,
          <string-name><given-names>I.</given-names> <surname>Polosukhin</surname></string-name>,
          <article-title>Attention is all you need</article-title>,
          in: I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, R. Garnett (Eds.),
          <source>Advances in Neural Information Processing Systems</source>,
          volume <volume>30</volume>, Curran Associates, Inc.,
          <year>2017</year>.
          URL: https://proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name><given-names>Z.</given-names> <surname>Waseem</surname></string-name>,
          <string-name><given-names>D.</given-names> <surname>Hovy</surname></string-name>,
          <article-title>Hateful symbols or hateful people? predictive features for hate speech detection on Twitter</article-title>,
          <source>in: Proceedings of the NAACL Student Research Workshop</source>,
          Association for Computational Linguistics, San Diego, California,
          <year>2016</year>, pp. <fpage>88</fpage>-<lpage>93</lpage>.
          URL: https://aclanthology.org/N16-2013. doi:10.18653/v1/N16-2013.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name><given-names>T.</given-names> <surname>Davidson</surname></string-name>,
          <string-name><given-names>D.</given-names> <surname>Warmsley</surname></string-name>,
          <string-name><given-names>M.</given-names> <surname>Macy</surname></string-name>,
          <string-name><given-names>I.</given-names> <surname>Weber</surname></string-name>,
          <article-title>Automated hate speech detection and the problem of offensive language</article-title>,
          <source>Proceedings of the International AAAI Conference on Web and Social Media</source>
          <volume>11</volume> (<year>2017</year>) <fpage>512</fpage>-<lpage>515</lpage>.
          URL: https://ojs.aaai.org/index.php/ICWSM/article/view/14955.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name><given-names>A.</given-names> <surname>Kulkarni</surname></string-name>,
          <string-name><given-names>S.</given-names> <surname>Masud</surname></string-name>,
          <string-name><given-names>V.</given-names> <surname>Goyal</surname></string-name>,
          <string-name><given-names>T.</given-names> <surname>Chakraborty</surname></string-name>,
          <article-title>Revisiting hate speech benchmarks: From data curation to system deployment</article-title>,
          <source>in: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining</source>,
          KDD '23, Association for Computing Machinery, New York, NY, USA,
          <year>2023</year>, p. <fpage>4333</fpage>-<lpage>4345</lpage>.
          URL: https://doi.org/10.1145/3580305.3599896. doi:10.1145/3580305.3599896.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name><given-names>A.</given-names> <surname>Schmidt</surname></string-name>,
          <string-name><given-names>M.</given-names> <surname>Wiegand</surname></string-name>,
          <article-title>A survey on hate speech detection using natural language processing</article-title>,
          in: L.-W. Ku, C.-T. Li (Eds.),
          <source>Proc. of the Fifth International Workshop on Natural Language Processing for Social Media</source>,
          ACL, Valencia, Spain,
          <year>2017</year>, pp. <fpage>1</fpage>-<lpage>10</lpage>.
          URL: https://aclanthology.org/W17-1101. doi:10.18653/v1/W17-1101.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name><given-names>A.</given-names> <surname>Founta</surname></string-name>,
          <string-name><given-names>L.</given-names> <surname>Specia</surname></string-name>,
          <article-title>A survey of online hate speech through the causal lens</article-title>,
          in: A. Feder, K. Keith, E. Manzoor, R. Pryzant, D. Sridhar, Z. Wood-Doughty, J. Eisenstein, J. Grimmer,
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>