<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Project: Exploring Italian Slurs Reappropriation with Large Language Models</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Marco Cuccarini</string-name>
          <email>marco.cuccarini@unina.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lia Draetta</string-name>
          <email>lia.draetta@unito.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Chiara Ferrando</string-name>
          <email>chiara.ferrando@unito.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Liam James</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Viviana Patti</string-name>
          <email>viviana.patti@unito.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="editor">
          <string-name>Semantic requalification process, Homostransphobia detection, Slurs, Natural Language Processing, Large Language Models</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>CLiC-it 2024: Tenth Italian Conference on Computational Linguistics</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>DISI, University of Bologna</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Department of Biology, University of Naples Federico II</institution>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Department of Computer Science, University of Turin</institution>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>Department of Mathematics and Computer Science, University of Perugia</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>Recently, social networks have become the primary means of communication for many people, leading computational linguistics researchers to focus on the language used on these platforms. As online interactions grow, recognizing and preventing offensive messages targeting various groups has become urgent. However, finding a balance between detecting hate speech and preserving free expression while promoting inclusive language is challenging. Previous studies have highlighted the risks of automated analysis misinterpreting context, which can lead to the censorship of marginalized groups. Our study is the first to explore the reappropriative use of slurs in Italian by leveraging Large Language Models (LLMs) with a zero-shot approach. We revised annotations of an existing Italian homotransphobic dataset, developed new guidelines, and designed various prompts to address the LLMs task. Our findings illustrate the difficulty of this challenge and provide preliminary results on using LLMs for such a language-specific task. Warning: This paper contains examples of explicitly offensive content. Our positionality: This paper is situated in Italy in 2024 and is authored by researchers specializing in Natural Language Processing (NLP). Beyond our academic work, we are sensitive to anti-hate speech issues. Our backgrounds are in theoretical linguistics, computer science, and NLP.</p>
      </abstract>
      <kwd-group>
        <kwd>Semantic requalification process</kwd>
        <kwd>Homotransphobia detection</kwd>
        <kwd>Slurs</kwd>
        <kwd>Natural Language Processing</kwd>
        <kwd>Large Language Models</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>CEUR
ceur-ws.org</p>
    </sec>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>In recent years, social networks have become the primary means of communication for most people. With the daily growth of online interactions, it has become urgent to recognize and prevent the spread of offensive messages against different target groups based on gender, sex, sexual orientation, race, religion, language, or political orientation. Moreover, categorizing hate speech with clear-cut boundaries is overly simplistic, as it includes various forms of abusive language that imply disrespect and hostility. A recent challenge is finding a balance between detecting hate speech and preserving the free spread of ideas and opinions on the web, while promoting inclusive and fair language. Thiago et al. (2021) [1] highlighted how automated analysis can misinterpret context, risking the censorship of marginalized groups; another study by Pamungkas and colleagues (2020) [2, 3] emphasized the importance of considering context in Natural Language Processing, especially when dealing with in-group languages, such as those of the LGBT+ community. An example is the reappropriative use of slurs, through which derogatory terms are reclaimed to express pride and solidarity within the group members [4]. Although community visibility and the use of specific slang have been approached for years, to our knowledge only some hate speech studies specifically addressed slurs, and few focused on slurs semantic reappropriation [5]. Nowadays, recognizing this kind of semantic shift through NLP tools is crucial to avoid the risk of removing non-abusive speech in online contents, which could paradoxically harm marginalized users [6, 7].</p>
      <sec id="sec-2-2">
        <title>Our study is the first with the aim of investigating</title>
        <p>need to take a step ahead from the existing abusive
language detection models. Having in mind the capability
of LLMs in classification task, we leveraged a LLM with
a zero-shot approach in order to recognize the presence
of reappropriative uses in our dataset.</p>
        <p>This study makes the following contributions:
• We partially revised the original annotation
previously conducted on the HODI dataset
(Homolanguages, such as those of the LGBT+ community. An- reappropriative use of slurs in Italian, highlighting the
transphobic Dataset in Italian)1[8], by developing
new annotation guidelines.
• We used a LLM specifically fine-tuned on Italian</p>
        <p>language by leveraging prompt engineering.
• From a linguistic point of view, we showed why
certain features of the Italian language make this
task particularly challenging.</p>
      </sec>
      <sec id="sec-2-3">
        <title>This paper is structured as follows: in the Section 2 we</title>
        <p>review the most significant related work on hate speech
detection and zero-shot approaches leveraging LLMs. In
the Section 3 we describe our methodology for the dataset
creation and the implementation of zero-shot tasks. In
Sections 3 and 5 we respectively report results, analysis
and main limitations of this work. Finally, in the last
Section 6 we draw conclusions of the current research.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>2. Related work</title>
      <p>(RHSD), the first hate speech dataset dedicated at
investigating the use of reclaimed slur terms, and by fine-tuning
a baseline model which resulted in the Reclaimed Hate
Speech (RHS) model.</p>
      <p>As far as the Italian language is concerned, slurs
recently became a significant topic from a linguistic and
philosophical point of view, but there are not studies
focusing on slurs reappropriation detection task.
Philosophy of language studies highlighted that a key area
of interest is slurs echoic uses, where target
communities reappropriated derogatory terms to express pride,
solidarity, or use them as tools for political and social
activism [13, 4]. Nossem (2019) [14] observed a productive
role in creating localized versions of queer by
reappropriating and redefining existing local alternative terms,
specifically frocio and frocia, femminiell@, and ricchione.
At this point, it should be noted that Italian, diferently
from English language, lacks terms like queer, which
bring with them such a long socio-cultural and
historical background. The semantic requalification process of
homotransphobic slurs is at its first steps and consists of
a challenging task that has not yet been investigated in
computational domains with LLMs.</p>
      <sec id="sec-3-1">
        <title>As presented above, hate speech is a challenging task, due</title>
        <p>to magnitude of the phenomenon and the dificulties of
defining clear boundaries. Some recent developments in
AI underlined the challenge of building corpora and
models to automatically detect the abusive (or not abusive)
nature of slurs in social media texts. Pamungkas’ et al. 3. Methodology
(2020) [2] research focused on the use of swear words in
English and aimed at diferentiate between ofensive and 3.1. Dataset creation
non-ofensive occurrences of slurs. A Twitter English
corpus, SWAD (Swear Words Abusiveness Dataset), was To our knowledge, there are no available annotated
developed by manually annotating the abusive charge at datasets in the Italian language focusing on the
phethe word level and models were trained to automatically nomenon of slurs semantic reappropriation. To address
predict abusiveness. the issue of limited data, in this preliminary research we</p>
        <p>Over the last decade, most studies approached the hate utilized the HODI dataset [8], which contains 6000 Italian
speech detection in terms of binary classification [ 9]. Twitter messages collected by using a set of 21 keywords
For instance, Plaza et al. (2023) [10] examines this task (i.e., gay, pride, lesbica, frocio). The dataset is a collection
by comparing the performances of an encoder-decoder of sentences directed against LGBT+ community who
model with several BERT-based models in both zero-shot are target of homotransphobia. Our argument is that
learning and fine-tuning scenarios. The findings show in such a corpus it is possible to find slurs used in both
that BERT-based models perform poorly in zero-shot abusive and reappropriative contexts. With the aim of
learning, while the others, even without additional train- collecting messages suitable for our study, we filtered the
ing, achieves results comparable to fine-tuned models. HODI dataset by selecting tweets that contain at least</p>
        <p>Nowadays, research indicates that hate speech changes one denigratory term, by adopting a two-fold strategy.
depending on the target groups [9]. Detecting homo- To select the homotransphobic swear words, we used the
transphobic hate speech (i.e. a specific abusive language HurtLex lexicon2 [15], a multilingual lexicon containing
addressed to LGBT+ community) has emerged as a critical an organized list of denigratory terms divided into 17
research area, with various scholars proposing solutions categories (i.e. negative stereotypes, ethnic slurs, moral
in diferent languages such as English [ 11] and Italian and behavioral defects, words related to homosexuality).
[8, 12]. From HurtLex, we selected only the words categorized as</p>
        <p>However, only few studies focused on the detection homotransphobic, then we further narrowed the list to
of slurs that have undergone a semantic reappropriation those that satisfy the slur definition 3 provided by Bianchi
process. Zsisku and colleagues (2024) [5] approached the
task by collecting the Reclaimed Hate Speech Dataset
2https://github.com/valeriobasile/hurtlex
3Bianchi (2014) [4] defines slurs ”derogatory terms -such as ‘nigger’
and ‘faggot’-targeting individuals and groups of individuals on the
basis of race, nationality, religion, gender or sexual orientation.
According to most scholars, slurs generally have a neutral counterpart,</p>
      </sec>
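        <p>The keyword-based filtering described above can be summarized with a short sketch. This is a minimal illustration, not the authors' released code: the file name "hodi.csv" and the "text" column are assumptions introduced here for the example.</p>
        <preformat>
# Minimal sketch of the keyword-based filtering described in Section 3.1.
# "hodi.csv" and the "text" column are placeholders, not the official format.
import re
import pandas as pd

TARGET_WORDS = [
    "anomalo", "chiappa", "frocio", "invertito", "travestiti", "checca",
    "deviato", "culattone", "finocchio", "finocchi", "finocchietto",
    "sesso anale", "frocia", "ricchione", "trans", "troia", "stesso sesso",
]

def contains_target_word(text):
    # Case-insensitive whole-word match for any of the 17 target words.
    lowered = text.lower()
    return any(re.search(r"\b" + re.escape(word) + r"\b", lowered)
               for word in TARGET_WORDS)

hodi = pd.read_csv("hodi.csv")  # assumed file with one tweet per row
subset = hodi[hodi["text"].astype(str).apply(contains_target_word)]
subset.to_csv("hodi_slur_subset.csv", index=False)  # 1742 tweets are reported in the paper
        </preformat>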
      <sec id="sec-3-2">
        <title>1The HODI dataset was created for a shared task focused on identi</title>
        <p>fying homotransphobia in Italian tweets.</p>
        <sec id="sec-3-2-1">
          <title>Io ero 6/7enne ed ero il ricchione alle elementari,</title>
          <p>all’oratorio, alle medie, al liceo e tutta la vita. E mi . seclehmooeln,tinarhyigshchsochool,oal,tatnhde aylolumtyh lcifeen. tAern,dinI’mmiodkdaley
va bene così, c’è più colore in questo mondo</p>
        </sec>
        <sec id="sec-3-2-2">
          <title>When I was 6/7 years old, I was the gay one in</title>
          <p>with that, it adds more color to this world.
i.e. a non-derogatory correlate: ‘Boche’ and ‘German’, ‘nigger’ and
‘African-American’ or ‘black’, ‘faggot’ and ‘homosexual’”.</p>
          <p>After obtaining the described subset, we utilized
zeroshot Learning (ZSL) with prompting to assess the model’s
• Mamma mia raga come mi ha messa di buon umore ability to determine whether the target words are used
il #LiguriaPride non mi sentivo così da un sacco in abusive or non-abusive context. Specifically, we
emgrazie energia frocia ployed the Qwen model [17], a multilingual decoded-only
[English translation: Mamma mia guys how LLM pre-trained on Italian.
the #LiguaPride has put me in such a good mood We define the temperature of the model to be 1, a fair
I haven’t felt this way in a long time thanks FRO- trade-of between randomness and determinism in the
CIA energy] results, and a maximum sequence length of 2024. For
inference, an A100 GPU provided by Google Colab was
(2014,2015) [4, 16]. We chose to exclude words such as
gay, omosessuale, omofilo, pederasta , and diverso because
they are not strictly derogatory terms, hypothesizing
that if words are not perceived as abusive, they cannot
undergo a process of semantic reappropriation. After
obtaining a list of 17 words, we filtered the HODI dataset
by selecting only the tweets that contained at least one
of the following target words: anomalo, chiappa, frocio,
invertito, travestiti, checca, deviato, culattone, finocchio,
ifnocchi, finocchietto, sesso anale, frocia, ricchione, trans,
troia, stesso sesso. The resulting subset is a collection of
1742 tweets (see two examples in table 1).
3.2. Annotation guidelines
Establishing guidelines for such a subjective and
previously unexplored topic has been challenging. Since
the phenomenon lacks clear boundaries, we aimed to
describe the task as clearly as possible. With this in
mind, we based our guidelines on previous works in
the field of the philosophy of language [ 4, 16, 13]. We
asked three expert annotators to decide whether the
target words in each tweet are used in a reappropriative
context or not. Building on previously cited works, we
defined reappropriation as the use of derogatory epithets
by members of the target groups in a manner that is
generally considered non-ofensive. To better define the
phenomenon we highlighted diferent contexts in which
this linguistic behaviour could occur:</p>
        </sec>
      </sec>
      <sec id="sec-3-3">
        <title>Friendly contexts – members of the target group use</title>
        <p>the derogatory terms in a non-ofensive way in informal
contexts.</p>
        <p>Political reappropriation contexts – target groups
reclaim the use of derogatory epithets as a tool to
emphasize a conscious and common political struggle.</p>
        <p>• Happy #PrideMonth e ricordatevi che l’orgoglio si
celebra non solo quando andate a ballare nelle
discoteche gay, ma anche quando si tratta di metterci
la faccia e combattere per la causa perché altrimenti
il ricchione lo state facendo solo col culo degli altri
e non è carino
[English translation: Happy #PrideMonth and
remember that pride is celebrated non only when
you go dancing in gay discos, but also when it
comes to put your face out there and to fight for
the cause because otherwise you are just being
RICCHIONE on other people’s ass and it is not
nice]</p>
        <p>Artistic contexts – artists reclaim derogatory epithets
to subvert the dominant socio-cultural norms.</p>
        <p>• Poca gente che li guarda, c’è una checca che fa il
tifo Se #LucioDalla avesse scritto #AnnaEMarco nel
2022 sarebbe stato accusato di omofobia, lui. Invece
ha scritto una canzone immensa
[English translation: Few people look at them,
there is a CHECCA cheering if #LucioDalla had
written #AnnaEMarco in 2022 he would have
been accused of homophobia. Instead he wrote a
great song]
3.3. Zero-shot learning approach
Table 2 on tweets labeled diferently (some examples in Appendix
Inter-annotator agreement metrics B). We observed that out of a total of 217 tweets with
annotation disagreement, 67 (30.88%) contained the word
Fleiss’ Kappa 0.57 ”frocia”. This word likely caused confusion due to its
Annotators Cohen’s kappa unique history: unlike the other target words ”frocia”,
feminine form of ”frocio”, originated in an already
reapAnnotator 1 vs Annotator 2 0.559 propriative context 5 [14]. In some cases, due to a lack of
AAnnnnoottaattoorr 12 vvss AAnnnnoottaattoorr 33 00..562187 context, it was very dificult to understand the real
communicative intent of tweets (i.e., Sono ricchione. (senso
andiamo) - ”I’m gay. (like, let’s go)”). In other instances,
used. The code is available on the following GitHub it was challenging to determine whether the person who
page4. wrote the message is part of the LGBT+ community or</p>
        <p>As previously discussed, collecting a large-scale corpus not (Oggi il mondo mi sta urlando contro che sono un
ricfor reappropriated language detection is challenging. To chione colossale senza speranza ed io gli sto dando ragione
address the lack of data, we used a ZSL approach, prompt- - ”Today the world is shouting at me that I’m a colossal
ing the model to recognize the presence of semantic re- hopeless queer, and I’m agreeing with it”), assuming that
qualification without providing additional information. only members of target community can use slurs in
reapThis method evaluates the model’s ability to generalize propriative sense. Finally, we also identified some noisy
efectively with no training data, taking into account only data in which target words have diferent meanings. For
information acquired during the LLM training phase. example, in the sentence Il 4 è l’onomastico di checca
fren</p>
        <p>Diferent studies [ 18, 19] showed that ZSL results are zis ci ubriachiamo (”On the 4th, it’s Checca Frenzis’ name
significantly influenced by the appropriateness and pre- day, so we’re getting drunk”) the term ”checca”6 is likely
cision of the prompts used. Additionally, multiple re- used as a diminutive of the Italian name ”Francesca”.
searchers [19] proposed diferent methods to improve We also noticed that in some cases tweets labelled as
performances. Plaza-del-Arco et al. (2022) [18] demon- reappropriative were also labelled as homotransphobic
strated that one of the most critical factors is ensuring in the original annotation of HODI dataset. Due to this
that the prompt fits well with the utilized corpus. Taking apparent contradiction, we conducted a qualitative
linthis into account, we designed four diferent prompts guistic analysis on this data. We realized that in four
using the HODI sub-corpus with the reappropriation an- examples (Oggi avrò di che parlare coi colleghi..un etero
notation as the gold standard, each including specific analfabeta che conquista l’attenzione di una checca
alfadetails about the task and the corpora. The first one is betizzata , mi raccomando vai a fare la quarta dose
the most general - explaining only the task in few words - che forse ti aiuta a dimenticarmi. Ciao - ”Today I’ll have
while the fourth is as precise as possible providing full list something to talk about with my colleagues... an
illiterof target words (full prompts are provided in Appendix ate straight guy who captures the attention of a literate
A). queer. Make sure to get your fourth dose, maybe it’ll
help you forget about me. Bye”), it is unclear whether
the writer is part of the LGBT+ community or not. In
4. Results other words, it is uncertain if the users were using slurs
to refer to themselves with reappropriative intent or to
4.1. Annotation statistics other persons in abusive term. In addition, in some of
these examples, target words were used as part of figures
We calculated the annotator agreement firstly by using of speech, mostly similes (Fare come una checca - ”Behave
Fleiss’ Kappa, obtaining 0.57, secondly through Cohen’s as a faggot”). These expressions, highly lexicalized in
Kappa between pairs of annotators (all metrics are dis- Italian and often used as abusive idiomatic phrases, likely
played in table 2). The moderate agreement and metrics increased the dificulty in recognizing the correct usages.
variability highlighted the task’s dificulty and
subjectivity. Despite the three annotators being experts on the
topic, they encountered challenges in distinguishing the
use of slurs.</p>
        <p>The majority annotation indicates that out of a total
of 1742 examples, only 168 were annotated as
reappropriated.</p>
        <p>To better understand annotators disagreements and
collect challenging examples, we conducted an analysis
5Nossem (2019) considers ”frocia” as a calque of the English “queer”
or ”Alternatively, we could see it as a new concept which is
specific to the Italian linguistic and cultural context, rather than an
adaption or appropriation of the English “queer”, i.e. some sort of a
territorialised post-queer” [14].
6”Checca” as well as being a diminutive form of the Italian name
”Francesca” is a colloquial and somewhat derogatory term in Italian
used to refer to a gay man</p>
      </sec>
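        <p>As a rough illustration of the setup above, the sketch below prompts a Qwen chat checkpoint through the Hugging Face transformers library with temperature 1. It is only a sketch under stated assumptions: the specific checkpoint name, the prompt wrapping, and the answer parsing are assumptions introduced here, not the authors' released implementation.</p>
        <preformat>
# Illustrative zero-shot classification with a Qwen chat model via
# Hugging Face transformers; checkpoint name and wrapping are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "Qwen/Qwen-7B-Chat"  # placeholder: any Italian-capable Qwen checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME, trust_remote_code=True, device_map="auto"
)

# First prompt from Appendix A, followed by the tweet to classify (assumed format).
PROMPT = ("Stabilisci se la frase contiene riappropriazione semantica, "
          "rispondi 'Vero' se lo contiene e 'Falso' altrimenti.\n\n"
          "Frase: {tweet}\nRisposta:")

def classify(tweet):
    # Generate a short answer with temperature 1, as described in Section 3.3.
    inputs = tokenizer(PROMPT.format(tweet=tweet), return_tensors="pt").to(model.device)
    with torch.no_grad():
        output = model.generate(**inputs, max_new_tokens=5,
                                do_sample=True, temperature=1.0)
    answer = tokenizer.decode(output[0][inputs["input_ids"].shape[1]:],
                              skip_special_tokens=True)
    return "Vero" if "vero" in answer.lower() else "Falso"
        </preformat>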
      <sec id="sec-3-4">
        <title>4https://github.com/marcocuccarini/ReCLAIMProject</title>
        <p>process; therefore, a new scalar annotation scheme is
probably required. Furthermore, the fact that only
experienced young researches sensitive to LGBT+ issues were
involved in the annotation task may have led to bias in
the results.</p>
        <p>As future work we plan to:
• create a new dataset and annotating it by
following a perspectivist approach 7[20], i.e. by
collecting diferent points of view from various social
media, involving annotators with diferent
backgrounds, in terms of age, origin, education, in/out
target groups, and providing more context
information during the annotation phase in order to
better understand slurs’ meanings and intents.
• through diferent LLMs, investigate which
approach has better performances in recognising
diferent uses of slurs, for instance by using ZSL
approach between pairs of examples or defining
few-shot with new suitable data.
• regarding ethical considerations, it is crucial to
directly and actively involve the LGBT+ community.
Gathering viewpoints and suggestions from those
who experience daily oppression and denigration
is essential not only to strengthen the research
methodology but also to ensure its relevance and
sensitivity to their lived experiences.</p>
      </sec>
        <table-wrap id="tab2">
          <label>Table 2</label>
          <caption><p>Inter-annotator agreement metrics</p></caption>
          <table>
            <thead>
              <tr><th>Metric</th><th>Value</th></tr>
            </thead>
            <tbody>
              <tr><td>Fleiss' Kappa (all annotators)</td><td>0.57</td></tr>
              <tr><td>Cohen's kappa, Annotator 1 vs Annotator 2</td><td>0.559</td></tr>
              <tr><td>Cohen's kappa, Annotator 1 vs Annotator 3</td><td>0.528</td></tr>
              <tr><td>Cohen's kappa, Annotator 2 vs Annotator 3</td><td>0.617</td></tr>
            </tbody>
          </table>
        </table-wrap>
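        <p>For reference, figures of this kind can be computed with standard library routines once the three annotators' binary labels are available. The sketch below uses scikit-learn and statsmodels; the label lists are dummy placeholders, not the real annotations.</p>
        <preformat>
# Sketch of the agreement computation behind Table 2 (placeholder labels).
from itertools import combinations
from sklearn.metrics import cohen_kappa_score
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# 0 = not reappropriative, 1 = reappropriative (dummy labels for illustration)
labels = {
    "Annotator 1": [0, 1, 0, 0, 1, 0],
    "Annotator 2": [0, 1, 0, 1, 1, 0],
    "Annotator 3": [0, 0, 0, 0, 1, 0],
}

# Pairwise Cohen's kappa between annotators
for (name_a, a), (name_b, b) in combinations(labels.items(), 2):
    print(f"{name_a} vs {name_b}: {cohen_kappa_score(a, b):.3f}")

# Fleiss' kappa over all three annotators
table, _ = aggregate_raters(list(zip(*labels.values())))
print(f"Fleiss' kappa: {fleiss_kappa(table):.3f}")
        </preformat>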
    <sec id="sec-4">
      <title>6. Conclusion</title>
      <p>4.2. LLM classification results</p>
      <sec id="sec-4-1">
        <title>The results of the ZSL approach are detailed in Table 3.</title>
        <p>Notably, performances change among the prompts. The
fourth prompt, which is the most specific, achieves the
highest performance as it specifies all the target words
considered during dataset construction. In contrast, the
third one, focusing specifically on detecting
homotransphobia by asking if the text intends to ofend on the
basis of sexual orientation or gender identity, has low
performances. Among the four prompts, the first one
(”Determine if the sentence contains semantic
reappropriation; respond ’True’ if it does and ’False’ otherwise.”)
has the worst performances, likely due to the
ambiguity of the expression ”semantic reappropriation” for the
model. Additionally, the model struggled to recognize
the minority class (semantic requalification) because it
is very complex for the model to recognize the context
of the use of a slur, whether it is used to ofend or not.</p>
        <p>This requires a deep understanding of the context and
social dynamics, and it can also be a challenging task for
humans.</p>
        <p>To address this issue, balancing the information in the
prompt by providing more details about semantic
requalification could improve the model’s overall performances.</p>
        <p>Therefore, we did not achieve very good performances,
highlighting the importance of collecting new data and
reviewing the computational approach.</p>
        <p>This paper presents the first attempt to specifically
address the detection of slur reappropriation in the Italian
language. One of the reasons that motivated us to
undertake this task is the need to ensure a safe linguistic
environment on social networks without risking the
censorship of individual freedom of expression. Since there
was no existing dataset to explore homophobic slurs in
the Italian language, we filtered a pre-existing
homotransphobic dataset to build a subset containing only tweets
with slurs occurrences, used both abusively and
non5. Limitations and future works abusively. We then designed precise new guidelines and
annotated the filtered subset, focusing on the presence of
The semantic requalification of slurs turned out to be a slur semantic reappropriation. With the newly annotated
complex and time-consuming process in several aspects. dataset, we approached a classification task using LLMs
Although the study has taken its first steps, some limi- with zero-shot techniques. Leveraging the Qwen model
tations must be acknowledged. Firstly, we realized that [17], we proposed four diferent prompts. As suggested
the HODI dataset [8] was not completely suitable for our by previous literature, more specific prompts and those
purposes. Tweets had been collected for the homotrans- better suited to the dataset yielded better performance. In
phobia detection aim and the diference of research goals this work, we proposed an important and under-explored
did not provide us the right data to investigate the seman- task through a two-fold contribution. On one hand, we
tic requalification process of slurs. Secondly, a binary highlighted the lack of data in the Italian language
dealannotation proved to be limiting due to the dificulty of ing with this phenomenon and the necessity of building
the task. The subjective evaluation of the annotators
does not allow the problem to be simplified in terms
of the presence or absence of semantic requalification 7https://pdai.info/
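        <p>A per-class breakdown makes the difficulty on the minority class visible. The snippet below shows one way to score the parsed 'Vero'/'Falso' answers against the gold annotation; the labels are dummy placeholders and the metric layout is an assumption, not the exact evaluation script behind Table 3.</p>
        <preformat>
# Illustrative scoring of the parsed model answers against the gold labels.
from sklearn.metrics import classification_report

gold = ["Falso", "Falso", "Vero", "Falso", "Vero", "Falso"]       # placeholder
predicted = ["Falso", "Vero", "Falso", "Falso", "Vero", "Falso"]  # placeholder

# Per-class precision, recall and F1 expose how the minority class
# ('Vero', i.e. semantic requalification) is handled by each prompt.
print(classification_report(gold, predicted, digits=3))
        </preformat>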
      </sec>
    </sec>
    <sec id="sec-8">
      <title>5. Limitations and future works</title>
      <p>The semantic requalification of slurs turned out to be a complex and time-consuming process in several aspects. Although the study has taken its first steps, some limitations must be acknowledged. Firstly, we realized that the HODI dataset [8] was not completely suitable for our purposes. The tweets had been collected for the homotransphobia detection aim, and the difference in research goals did not provide us with the right data to investigate the semantic requalification process of slurs. Secondly, a binary annotation proved to be limiting due to the difficulty of the task. The subjective evaluation of the annotators does not allow the problem to be simplified in terms of the presence or absence of semantic requalification; therefore, a new scalar annotation scheme is probably required. Furthermore, the fact that only experienced young researchers sensitive to LGBT+ issues were involved in the annotation task may have led to bias in the results.</p>
      <p>As future work we plan to:
• create a new dataset and annotate it by following a perspectivist approach7 [20], i.e. by collecting different points of view from various social media, involving annotators with different backgrounds, in terms of age, origin, education, in/out target groups, and providing more context information during the annotation phase in order to better understand slurs' meanings and intents.
• investigate, through different LLMs, which approach has better performances in recognising different uses of slurs, for instance by using a ZSL approach between pairs of examples or defining few-shot settings with new suitable data.
• regarding ethical considerations, directly and actively involve the LGBT+ community. Gathering viewpoints and suggestions from those who experience daily oppression and denigration is essential not only to strengthen the research methodology but also to ensure its relevance and sensitivity to their lived experiences.</p>
      <p>7 https://pdai.info/</p>
    </sec>
    <sec id="sec-9">
      <title>6. Conclusion</title>
      <p>This paper presents the first attempt to specifically address the detection of slur reappropriation in the Italian language. One of the reasons that motivated us to undertake this task is the need to ensure a safe linguistic environment on social networks without risking the censorship of individual freedom of expression. Since there was no existing dataset to explore homophobic slurs in the Italian language, we filtered a pre-existing homotransphobic dataset to build a subset containing only tweets with slur occurrences, used both abusively and non-abusively. We then designed precise new guidelines and annotated the filtered subset, focusing on the presence of slur semantic reappropriation. With the newly annotated dataset, we approached a classification task using LLMs with zero-shot techniques. Leveraging the Qwen model [17], we proposed four different prompts. As suggested by previous literature, more specific prompts and those better suited to the dataset yielded better performance. In this work, we proposed an important and under-explored task through a two-fold contribution. On the one hand, we highlighted the lack of data in the Italian language dealing with this phenomenon and the necessity of building an up-to-date corpus that comprehensively includes multiple sources and semantic contexts. On the other hand, we demonstrated a possible approach by leveraging new state-of-the-art LLMs. Finally, it is important to keep in mind that, compared to English, Italian has a different history and cultural background, resulting in a much slower linguistic evolution. This makes establishing precise characteristics of this topic a challenging task due to the lack of solid foundational knowledge. In conclusion, we believe that bringing attention to the issue will lead to anti-discrimination activities, the creation of safer spaces in online communication, and the inclusion and acceptance of LGBT+ communities.</p>
    </sec>
    <sec id="sec-5">
      <title>A. Zero-shot learning prompts</title>
      <p>Prompt 1 (Italian): Stabilisci se la frase contiene riappropriazione semantica, rispondi 'Vero' se lo contiene e 'Falso' altrimenti.</p>
      <p>Prompt 1 (English translation): Determine if the sentence contains semantic reappropriation; respond 'True' if it contains it and 'False' otherwise.</p>
      <p>Prompt 2 (Italian): Stabilisci se la frase contiene un linguaggio che non ha intenzione di offendere. Rispondi 'Vero' se lo contiene e 'Falso' altrimenti.</p>
      <p>Prompt 2 (English translation): Determine if the sentence contains language that has no abusive intent. Respond 'True' if it does and 'False' otherwise.</p>
      <p>Prompt 3 (Italian): Stabilisci se la frase contiene un linguaggio che intende offendere delle persone per il loro orientamento sessuale e le loro identità di genere, rispondi 'Vero' se lo contiene e 'Falso' altrimenti.</p>
      <p>Prompt 3 (English translation): Determine if the sentence contains language intended to offend people based on their sexual orientation or gender identity. Respond 'True' if it does and 'False' otherwise.</p>
      <p>Prompt 4 (Italian): Stabilisci se nelle frasi proposte le seguenti parole "frocio, invertito, travestit*, checca, deviato, culattone, finocchio, finocchi, omosex, finocchietto, omosessuali, frocia, ricchione, trans, troia" sono utilizzate per offendere le persone per il loro orientamento sessuale e/o identità di genere. Rispondi "Vero" se c'è un intento offensivo, altrimenti "Falso".</p>
      <p>Prompt 4 (English translation): Determine if the following words in the proposed sentences - "frocio, invertito, travestit*, checca, deviato, culattone, finocchio, finocchi, omosex, finocchietto, omosessuali, frocia, ricchione, trans, troia" - are used to offend people based on their sexual orientation and/or gender identity. Respond 'True' if there is an offensive intent, otherwise respond 'False'.</p>
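      <p>For convenience, the four Italian prompts above can be stored as templates and combined with the tweet to classify. The concatenation format below (prompt followed by "Frase:"/"Risposta:") is only an assumed wrapping, analogous to the sketch in Section 3.3, not the exact format used in the experiments.</p>
      <preformat>
# The four Italian prompts from Appendix A as reusable templates.
# The concatenation format is an assumption for illustration.
PROMPTS = {
    1: ("Stabilisci se la frase contiene riappropriazione semantica, "
        "rispondi 'Vero' se lo contiene e 'Falso' altrimenti."),
    2: ("Stabilisci se la frase contiene un linguaggio che non ha intenzione "
        "di offendere. Rispondi 'Vero' se lo contiene e 'Falso' altrimenti."),
    3: ("Stabilisci se la frase contiene un linguaggio che intende offendere "
        "delle persone per il loro orientamento sessuale e le loro identità "
        "di genere, rispondi 'Vero' se lo contiene e 'Falso' altrimenti."),
    4: ("Stabilisci se nelle frasi proposte le seguenti parole 'frocio, "
        "invertito, travestit*, checca, deviato, culattone, finocchio, "
        "finocchi, omosex, finocchietto, omosessuali, frocia, ricchione, "
        "trans, troia' sono utilizzate per offendere le persone per il loro "
        "orientamento sessuale e/o identità di genere. Rispondi 'Vero' se "
        "c'è un intento offensivo, altrimenti 'Falso'."),
}

def build_input(prompt_id, tweet):
    # Prepend the chosen prompt to the tweet that has to be classified.
    return f"{PROMPTS[prompt_id]}\n\nFrase: {tweet}\nRisposta:"
      </preformat>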
    </sec>
    <sec id="sec-6">
      <title>B. Annotation disagreement examples</title>
      <p>English translations of tweets with annotation disagreement:
• My mouth is burning hot... I want a fag for myself.
• I'm at university and I just can't stop being so gay today, help!
• How gay is she, I love her, she wants a paper map to explore the gardens.
• User_* I'm gay. (like, let's go).
• Man, husband, father, and faggot.
• Physics is a straight thing, and in fact, I'm half gay.
• Today the world is screaming at me that I am a colossal hopeless fag, and I'm agreeing with it.
• I'm about to tweet something very gay.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <title>References</title>
      <ref id="ref1"><label>1</label><mixed-citation>D. O. Thiago, A. D. Marcelo, A. Gomes, Fighting hate speech, silencing drag queens? Artificial intelligence in content moderation and risks to LGBTQ voices online, Sexuality &amp; Culture 25 (2021) 700–732.</mixed-citation></ref>
      <ref id="ref2"><label>2</label><mixed-citation>E. W. Pamungkas, V. Basile, V. Patti, Do you really want to hurt me? Predicting abusive swearing in social media, in: Proceedings of the Twelfth Language Resources and Evaluation Conference, European Language Resources Association, Marseille, France, 2020, pp. 6237–6246. URL: https://aclanthology.org/2020.lrec-1.765.</mixed-citation></ref>
      <ref id="ref3"><label>3</label><mixed-citation>E. W. Pamungkas, V. Basile, V. Patti, Investigating the role of swear words in abusive language detection tasks, Language Resources and Evaluation 57 (2023) 155–188. doi:10.1007/s10579-022-09582-8.</mixed-citation></ref>
      <ref id="ref4"><label>4</label><mixed-citation>C. Bianchi, Slurs and appropriation: An echoic account, Journal of Pragmatics 66 (2014) 35–44.</mixed-citation></ref>
      <ref id="ref5"><label>5</label><mixed-citation>E. Zsisku, A. Zubiaga, H. Dubossarsky, Hate speech detection and reclaimed language: Mitigating false positives and compounded discrimination, in: Proceedings of the 16th ACM Web Science Conference, 2024, pp. 241–249.</mixed-citation></ref>
      <ref id="ref6"><label>6</label><mixed-citation>T. Gillespie, Custodians of the Internet: Platforms, content moderation, and the hidden decisions that shape social media, Yale University Press, 2018.</mixed-citation></ref>
      <ref id="ref7"><label>7</label><mixed-citation>N. Strossen, Hate: Why we should resist it with free speech, not censorship, Oxford University Press, 2018.</mixed-citation></ref>
      <ref id="ref8"><label>8</label><mixed-citation>D. Nozza, A. T. Cignarella, G. Damo, T. Caselli, V. Patti, HODI at EVALITA 2023: Overview of the first shared task on homotransphobia detection in Italian, in: Proceedings of the Eighth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2023), Parma, Italy, September 7th–8th, 2023, volume 3473 of CEUR Workshop Proceedings, CEUR-WS.org, 2023. URL: https://ceur-ws.org/Vol-3473/paper26.pdf.</mixed-citation></ref>
      <ref id="ref9"><label>9</label><mixed-citation>A. Ollagnier, E. Cabrio, S. Villata, Unsupervised fine-grained hate speech target community detection and characterisation on social media, Social Network Analysis and Mining 13 (2023) 58.</mixed-citation></ref>
      <ref id="ref10"><label>10</label><mixed-citation>F. M. Plaza-del-Arco, D. Nozza, D. Hovy, Respectful or toxic? Using zero-shot learning with language models to detect hate speech, in: The 7th Workshop on Online Abuse and Harms (WOAH), Association for Computational Linguistics, Toronto, Canada, 2023, pp. 60–68. URL: https://aclanthology.org/2023.woah-1.6. doi:10.18653/v1/2023.woah-1.6.</mixed-citation></ref>
      <ref id="ref11"><label>11</label><mixed-citation>S. Kumar, A. Nagar, A. Kumar, A. Singh, Hate speech detection: A survey, in: 2022 4th International Conference on Advances in Computing, Communication Control and Networking (ICAC3N), IEEE, 2022, pp. 171–176.</mixed-citation></ref>
      <ref id="ref12"><label>12</label><mixed-citation>D. Locatelli, G. Damo, D. Nozza, A cross-lingual study of homotransphobia on Twitter, in: Proceedings of the First Workshop on Cross-Cultural Considerations in NLP (C3NLP), 2023, pp. 16–24.</mixed-citation></ref>
      <ref id="ref13"><label>13</label><mixed-citation>B. Cepollaro, et al., Linguaggio d'odio, in: Pragmatica Sperimentale, Società Editrice il Mulino, 2022, pp. 145–156.</mixed-citation></ref>
      <ref id="ref14"><label>14</label><mixed-citation>E. Nossem, Queer, frocia, femminielle, ricchione et al. – Localizing 'queer' in the Italian context, GSI: Gender, Sexuality, Italy 6 (2019) 1–27.</mixed-citation></ref>
      <ref id="ref15"><label>15</label><mixed-citation>E. Bassignana, V. Basile, V. Patti, Hurtlex: A multilingual lexicon of words to hurt, in: Proceedings of the Fifth Italian Conference on Computational Linguistics (CLiC-it 2018), Torino, Italy, December 10–12, 2018, volume 2253 of CEUR Workshop Proceedings, CEUR-WS.org, 2018. URL: https://ceur-ws.org/Vol-2253/paper49.pdf.</mixed-citation></ref>
      <ref id="ref16"><label>16</label><mixed-citation>C. Bianchi, Il lato oscuro delle parole: epiteti denigratori e riappropriazione, Sistemi intelligenti 27 (2015) 285–302.</mixed-citation></ref>
      <ref id="ref17"><label>17</label><mixed-citation>J. Bai, S. Bai, Y. Chu, Z. Cui, K. Dang, X. Deng, Y. Fan, W. Ge, Y. Han, F. Huang, B. Hui, L. Ji, M. Li, J. Lin, R. Lin, D. Liu, G. Liu, C. Lu, K. Lu, J. Ma, R. Men, X. Ren, X. Ren, C. Tan, S. Tan, J. Tu, P. Wang, S. Wang, W. Wang, S. Wu, B. Xu, J. Xu, A. Yang, H. Yang, J. Yang, S. Yang, Y. Yao, B. Yu, H. Yuan, Z. Yuan, J. Zhang, X. Zhang, Y. Zhang, Z. Zhang, C. Zhou, J. Zhou, X. Zhou, T. Zhu, Qwen technical report, CoRR abs/2309.16609 (2023). doi:10.48550/arXiv.2309.16609.</mixed-citation></ref>
      <ref id="ref18"><label>18</label><mixed-citation>F. M. Plaza-del-Arco, M.-T. Martín-Valdivia, R. Klinger, Natural language inference prompts for zero-shot emotion classification in text across corpora, in: Proceedings of the 29th International Conference on Computational Linguistics, International Committee on Computational Linguistics, Gyeongju, Republic of Korea, 2022, pp. 6805–6817. URL: https://aclanthology.org/2022.coling-1.592.</mixed-citation></ref>
      <ref id="ref19"><label>19</label><mixed-citation>P. Liu, W. Yuan, J. Fu, Z. Jiang, H. Hayashi, G. Neubig, Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing, ACM Computing Surveys 55 (2023) 1–35.</mixed-citation></ref>
      <ref id="ref20"><label>20</label><mixed-citation>S. Frenda, G. Abercrombie, V. Basile, A. Pedrani, R. Panizzon, A. T. Cignarella, C. Marco, D. Bernardi, Perspectivist approaches to natural language processing: A survey, Language Resources and Evaluation (2024). doi:10.1007/s10579-024-09766-4.</mixed-citation></ref>
    </ref-list>
  </back>
</article>