=Paper=
{{Paper
|id=Vol-3878/108_main_long
|storemode=property
|title=Implicit Stereotypes: A Corpus-Based Study for Italian
|pdfUrl=https://ceur-ws.org/Vol-3878/108_main_long.pdf
|volume=Vol-3878
|authors=Wolfgang Schmeisser-Nieto,Giacomo Ricci,Simona Frenda,Mariona Taule,Cristina Bosco
|dblpUrl=https://dblp.org/rec/conf/clic-it/Schmeisser-Nieto24
}}
==Implicit Stereotypes: A Corpus-Based Study for Italian==
Wolfgang S. Schmeisser-Nieto (1,2,*), Giacomo Ricci (2), Simona Frenda (3,4), Mariona Taulé (1) and Cristina Bosco (2)
(1) Universitat de Barcelona, Gran Via de les Corts Catalanes, 585, Barcelona, Spain
(2) University of Turin, Dipartimento di Informatica, Corso Svizzera 185, 10149 Torino, Italy
(3) Interaction Lab, Heriot-Watt University, The Avenue, Edinburgh, EH14 4AS, Scotland
(4) aequa-tech, Torino, Italy
Abstract
Detecting stereotypes is a challenging task, particularly when they are not expressed explicitly. In this study, we applied
an annotation schema from the literature designed to formalize implicit stereotypes. We analyzed implicit stereotypes
about immigrants in two datasets created from different sources: StereoHoax-IT and SterheoSchool. StereoHoax-IT
consists of reactions on Twitter to specific hoaxes aimed at discriminating against immigrants, while SterheoSchool
includes comments from teenagers on fake news generated in psychological experiments. We describe the annotation
process and annotator disagreements, and provide both quantitative and qualitative analyses to shed light on how implicitness
characterizes stereotypes in different texts. Our findings suggest that implicit stereotypes are often conveyed through logical
linguistic relations, such as entailment and behavioral evaluations of immigrants.
Keywords
Implicit stereotype, Corpora annotation, Corpora analysis, Italian language
CLiC-it 2024 - Tenth Italian Conference on Computational Linguistics, Dec 04-06, 2024, Pisa, Italy
* Corresponding author.
Email: wolfgang.schmeisser@ub.edu (W. S. Schmeisser-Nieto); giacomo.ricci@edu.unito.it (G. Ricci); s.frenda@hw.ac.uk (S. Frenda); mtaule@ub.edu (M. Taulé); cristina.bosco@unito.it (C. Bosco)
ORCID: 0000-0001-5663-6276 (W. S. Schmeisser-Nieto); 0000-0002-6215-3374 (S. Frenda); 0000-0003-0089-940X (M. Taulé); 0000-0002-8857-4484 (C. Bosco)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings, ceur-ws.org, ISSN 1613-0073

1. Introduction and Background
Various recent NLP studies have focused on detecting stereotypes online, often in conjunction with forms of abusive language [1, 2, 3, 4, 5]. The importance of tackling this phenomenon is due to its impact on social structures and the power of individuals. Therefore, detecting stereotypes can prevent their emergence and spread, and thereby have a positive impact on our society.
In social psychology, a stereotype has been defined as a set of beliefs about others perceived as belonging to a different social group [6]. It oversimplifies the features of the group and generalizes a particular feature, applying it to all its members [6]. In contrast to the emotional component of prejudice and the behavioral component of discrimination, a stereotype is associated with the cognitive component of the triad [7]. In language, stereotypes can be expressed explicitly or implicitly [8]. Explicit stereotypes deliver a straightforward message, clearly revealing the associated traits, often using derogatory adjectives [9, 10]. In contrast, implicit stereotypes are more nuanced and indirect, requiring the reader to infer their meaning [11]. These implicit stereotypes can be communicated through linguistic devices such as metaphor and irony [9], negation [12], or entailments [13]. Recently, efforts have been made to formalize the strategies for expressing implicit stereotypes, with the goal of establishing standardized criteria for annotators [14]. An example of an explicit stereotype is "[Gli immigrati] buttano via il cibo che gli danno per poi andare a mangiare i poveri cani, dove finiremo!" 1 (extracted from the StereoHoax-IT corpus), in which the generalization of the target group and the association with an action are expressed in the present tense with a habitual aspect. On the other hand, in the example "Come noi rispettiamo loro e il colore della loro pelle, così loro che abitano nei nostri paesi dovrebbero portare rispetto nei nostri confronti." 2 (SterheoSchool corpus), the stereotype is not overtly manifested, but must be inferred through the evaluation of the in-group and an exhortative sentence. From a computational linguistics perspective, concerns have been raised about how to detect and process stereotypes, a task often considered closely related to the detection of abusive language or hate speech [15].
1 Transl. "They throw away the food they are given only to go eat the poor dogs. Where will we end up!"
2 Transl. "Just as we respect them and the color of their skin, they, who live in our countries, should show respect toward us."
Alongside research on hate speech, the study of stereotype detection has increased, particularly within evaluation tasks [16, 4, 17, 18, 19]. However, the detection of implicit stereotypes remains a significant challenge [20]. There are several works that deal with stereotypes in more complex narratives, such as microportraits [21] and political debates [22]. The detection of implicitness has also been studied with reference to several other
phenomena, in particular those characterized by subjectivity, such as irony [23]. In this paper, we analyze the implicit manifestation of stereotypes targeting immigrants, using a well-defined annotation schema proposed by Schmeisser-Nieto et al. [14] and tested on a subset of comments from Spanish newspapers (DETESTS [5]). This schema represents different criteria for determining the implicitness of stereotypes in an attempt to formalize the concept. Disentangling strategies of implicitness presents a significant challenge, often resulting in the identification of multiple categories within the same text.
Our main contributions consist of expanding the annotation with topics of stereotypes about immigrants [5] and the strategies of implicitness [14], as well as testing this schema on two existing Italian datasets. These datasets share the same domain as those used for Spanish, stereotypes about immigrants, and include data extracted from Twitter (now X) as reactions to specific hoaxes (StereoHoax-IT) and comments written by high school students on two examples of fake news artificially created within psychological experiments (SterheoSchool) as described in [24, 25]. Analyzing the annotated texts, we noted that implicit stereotypes appear to be conveyed especially through logical linguistic relations like entailment and the behavioral evaluation of immigrants in both datasets. Moreover, in most cases, the annotators needed to use contextual information to determine the presence of stereotypes. For example, in this case "Che centra lui e Italiano!, può essere massacrato!" 3 (StereoHoax-IT) the author of the message expresses a stereotype complaining that foreigners enjoy better treatment than Italians, who can indeed be "macellati" (slaughtered).
3 Transl. "That’s not the point, he is Italian! He can be slaughtered!"
The rest of the paper is organized as follows: Sections 2 and 3 describe the datasets and the annotation applied; Sections 4 and 5 present quantitative and qualitative analyses of the annotated data; and Section 6 summarizes the results and provides guidance regarding future work.

2. Datasets
In this work, we focus on two annotated corpora containing implicit stereotypes developed within the STERHEOTYPES project 4 and the StereotypHate project 5. Their content is related to attitudes regarding immigrants and they share similar conversational structures and the same annotation scheme. Each message in these datasets is contextualized, i.e. collocated within a discourse thread or presented as a comment on a given news item. Under the annotation scheme, each message is annotated for the presence or absence of anti-migrant stereotypes, and, if present, for other related categories such as whether the stereotype was expressed implicitly or explicitly and the forms of discredit into which the stereotype could be classified. This category is inspired by the Stereotype Content Model (SCM) [7] and allowed us to observe the stereotype from a perspective that encompasses psychology and computational linguistics [26]. In Section 3, we show how we extended this annotation to describe the dimension of implicitness 6.
4 STERHEOTYPES (Studying European Racial Hoaxes and sterEOTYPES) is an international project funded by Compagnia di San Paolo and VolksWagen Stiftung.
5 StereotypHate is a project funded by Compagnia di San Paolo.
6 The datasets will be made available for research purposes after the acceptance of the paper in anonymized form.
StereoHoax-IT [27] is a contextualized multilingual dataset of tweets annotated primarily for the presence of anti-migrant stereotypes. The dataset consists of replies to tweets identified as containing racial hoaxes specifically targeting migrants and collected from debunking websites from French, Italian and Spanish Twitter, collected from 2019 to 2021. Each message is provided with its “conversation head” (the message containing the source racial hoax), and its direct parent message (if applicable). In this paper, we only use the Italian subset, which includes 3,123 instances. Due to the rarity of the phenomenon, there is a significant class imbalance: 472 instances (15%) contain a stereotype, 332 of which (70%) are implicit and 140 (30%) are explicit.
SterheoSchool [28] consists of a selection of data collected in Italian schools during experiments conducted by social psychologists [24, 25]. More precisely, it includes the reactions of teenagers, who read two hoaxes artificially created and presented as news articles, recorded via a cell phone interface. The hoaxes were designed to elicit reactions to stereotypes in readers. For each news item, readers were asked to comment on the news and on the main character of the articles. These comments are also associated with metadata, such as the age and declared gender of the author. By collecting data generated by teenagers, this corpus aims to fill a gap in the literature in which teenagers are an underrepresented category in data annotated for text classification tasks. We applied the annotation scheme mentioned above to the news and comments. This corpus consists of 1,147 comments, of which 337 (33.8%) are annotated as containing stereotypes, of which 152 (45%) are expressed in an implicit form.

3. Annotation
The annotation scheme we applied to the two corpora is based on two different layers, topics of stereotypes and implicitness strategies, as well as the need for context.
The topics of stereotypes were first introduced within an evaluation task, DETESTS [5], in which the participants had to train models to decide whether a text
contained stereotypes, and when they did, classify the stereotype into ten different categories:
• Xenophobia victims: Immigrants are perceived as victims of xenophobia and discrimination. They enrich culture and diversity and should have the same rights as citizens.
• Suffering victims: Immigrants are portrayed as victims of poverty and violence in their places of origin and as having to face difficult situations in their host countries.
• Economic resources: Immigrants are seen as an economic resource. They do the jobs that locals do not want to do, pay taxes and solve the problems arising from low population growth.
• Migration control: Immigrants present a threat due to massive influxes and a lack of control at the borders. Immigrants are illegal and should be expelled. It is seen as an invasion.
• Culture and religion differences: Immigrants entail a loss of the in-group’s values and traditions and their replacement with the target group’s customs and religions. They are also seen as uneducated and should adapt to their host country.
• Benefits: Immigrants compete with the in-group for resources such as public subsidies, school places, jobs, health care and pensions. They are privileged over the in-group.
• Public health: Immigrants are thought to be carriers of infections and diseases such as COVID-19, Ebola and HIV.
• Security: Immigration brings security issues. Due to immigration, there is an increase in crime, domestic violence, robbery, drug use, sexual assault, murder, terrorist attacks and public disorders.
• Dehumanization: Immigrants are seen as inferior beings and are compared with animals, parasites or scum. Their lives have less value than those of the in-group.
• Other topics: Any other immigration stereotypes not covered in the previous categories.
Context and implicitness strategies were initially proposed as criteria that could help annotators to annotate implicitness, since its vagueness may decrease Inter-Annotator Agreement (IAA) [14]. By context, we refer to information contained in previous messages, which is considered necessary to understand the meaning of the message to be annotated, as in the following example: "Sempre assolti...sempre misure e pesi differenti". Context: "Uccide anziana ebrea al grido di Allah Akbar. Assolto perché drogato." 7 (StereoHoax-IT). Regarding the strategies and linguistic devices used to convey implicit stereotypes, we have revised the criteria proposed in [14] as follows:
• World knowledge: World knowledge refers to the shared cultural, social and historical knowledge needed to interpret messages, e.g., "La scuola si inchina all’islam: l’aceto è bandito dalle mense." 8 (StereoHoax-IT)
• Figures of speech: Every figure of speech except for irony and sarcasm, and humor and jokes. For instance, metaphor, rhetorical questions, euphemisms or reported speech, e.g., "Chi è quel pazzo che si mette in casa uno di questi? Un suicidio" 9 (StereoHoax-IT)
• Irony/Sarcasm: The message expresses a meaning that is the opposite of what is said, e.g. in "Che bella gente fanno arrivare.....che bello avere un paese pieno di risorse pronte a tutto.....ma proprio a tutto." 10 (StereoHoax-IT)
• Humor/Jokes: Jokes about a target group often use stereotypes and may or may not include irony, e.g. in "Chissà se ha detto:"Cibo no buono"." 11 (StereoHoax-IT)
• Extrapolation: The target refers to an individual or specific members of a social group, not the group as a whole, e.g. in "Classico del sud-italia Maleducata" 12 (SterheoSchool)
• Imperative/Exhortative: Calls to take certain actions related to the target group, e.g. "Come in Cina FUCILATELO" 13 (StereoHoax-IT)
• Entailment/Evaluation: Logical relation between two sentences in which the condition of truth of sentence A implies the truth of sentence B. The implicit stereotype is implied in sentence A. An evaluation of the author’s or in-group’s thoughts, emotions and behaviors, rather than content about the out-group or target group, can be considered as a type of entailment, e.g. "Saranno fuori o liberi presto" 14 (StereoHoax-IT) is the answer to a racial hoax in which a group of immigrants rape and murder a teenage girl. With the author’s evaluation of the situation, it is entailed that immigrants are immune from punishment.
• Other implicitness: Other types of implicitness not considered in the previous categories, e.g. "al giorno d’oggi non ci si può fidare di nessuno una persona ripugnante" 15 (SterheoSchool)
7 Transl. "Always acquitted...always different measures and weights." Context: "Kills elderly Jewish woman while shouting ‘Allah Akbar.’ Acquitted because he was on drugs."
8 Transl. "The school bows to Islam: vinegar is banned from canteens."
9 Transl. "Who’s that fool who takes one of these into his house? a suicide"
10 Transl. "Such nice people they bring in... how nice it is to have a country full of resources ready for anything... anything at all"
11 Transl. "I wonder if he said: «Food no good»"
12 Transl. "Typical of Southern Italy"
13 Transl. "SHOOT HIM like in China"
14 Transl. "They will be out or free soon"
15 Transl. "nowadays you can’t trust anyone a repulsive person"
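Because the vagueness of implicitness may lower Inter-Annotator Agreement, agreement on each label of the scheme can be quantified with Fleiss' kappa, the coefficient reported in Table 1. The following is only a minimal sketch under stated assumptions, not the authors' code: each label is treated as binary (present or absent), every message is rated by the same number of annotators, and the function name `fleiss_kappa_binary` and the toy votes are hypothetical.

```python
def fleiss_kappa_binary(positive_votes, n_raters):
    """Fleiss' kappa for a single binary label.

    positive_votes: for each message, how many raters marked the label as present.
    n_raters: number of raters per message (assumed constant).
    """
    n_items = len(positive_votes)
    if n_items == 0 or n_raters < 2:
        raise ValueError("need at least one item and two raters")

    # Observed agreement: share of agreeing rater pairs, averaged over items.
    pairs = n_raters * (n_raters - 1)
    p_obs = sum(
        (k * (k - 1) + (n_raters - k) * (n_raters - k - 1)) / pairs
        for k in positive_votes
    ) / n_items

    # Chance agreement from the overall shares of positive and negative votes.
    p_pos = sum(positive_votes) / (n_items * n_raters)
    p_exp = p_pos ** 2 + (1 - p_pos) ** 2

    if p_exp == 1.0:
        return float("nan")  # all votes identical: kappa is undefined
    return (p_obs - p_exp) / (1 - p_exp)


# Hypothetical votes of three annotators on eight messages.
toy_votes = [3, 3, 0, 0, 2, 1, 0, 3]
kappa = fleiss_kappa_binary(toy_votes, n_raters=3)
```

Note that the coefficient is undefined when a label never receives a positive vote, so such a label has to be reported separately rather than with a kappa value.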
Table 1
Inter-annotator agreement test using Fleiss’ kappa (𝜅) coefficient on the categories of implicitness and stereotype topics of the StereoHoax-IT and the SterheoSchool corpora.
Label                    StereoHoax-IT   SterheoSchool
Xenophobia victims       0.57            0.50
Suffering victims        0.49            0.50
Economic resource        0.48            0.50
Migration control        0.77            0.55
Culture & religion       0.75            0.71
Benefits                 0.75            0.62
Public health            0.86            0.50
Security                 0.81            0.64
Dehumanization           0.71            0.71
Other topics             0.52            0.43
Context                  0.72            0.50
World knowledge          0.52            0.51
Figures of speech        0.68            0.70
Irony/Sarcasm            0.70            0.50
Humor/Jokes              0.52            No cases
Extrapolation            0.51            0.53
Imperative/Exhortative   0.73            0.53
Entailment/Evaluation    0.45            0.49
Other implicitness       0.51            0.52

The annotation was carried out on the Label Studio platform by three native Italian speakers with a background in linguistics, some of whom specialized in NLP. They achieved an acceptable to good IAA in the majority of cases, as reported in Table 1, which varies across categories and corpora. By observing Table 2, we can see that only a few topics have been marked by the majority of annotators, while not all the implicit criteria have been identified in the texts (i.e., ‘humor/jokes’).

4. Quantitative Analysis
Table 2 shows the distribution of the disaggregated annotations across both datasets. Columns 0%, 33%, 67% and 100%, respectively, indicate the number of instances per label that were annotated by no annotator (0%), by one annotator (33%), by two annotators (67%) and by all three annotators (100%). Column % positive class shows the percentage of the label voted by the majority of annotators, and its total number of cases in parentheses.
First, the distribution of labels differs between the two corpora, since SterheoSchool has a representation of more than 10% on only four labels. This disparity is due to the extraction methods of each dataset: the topics of the racial hoaxes used to extract the dataset were more balanced in StereoHoax-IT than in SterheoSchool, with the latter focusing generally on security and cultural differences, which are discussed in the only two contexts provided to the students for their comments. However, while in the former there is a representation of all the stereotypical topics that portray immigrants as threats, the security issue is highly prevalent in both datasets.
A common trend shows that the most frequent implicitness strategy in both datasets is ‘entailment/evaluation’, accounting for 64% in StereoHoax-IT and 80% in SterheoSchool. To a lesser degree, ‘extrapolation’ appears in both datasets, with 13% in the former and 19% in the latter. Other represented strategies that exceed 10% of instances are only found in StereoHoax-IT.
The label ‘context’ has a high prevalence in both datasets, accounting for 38% in StereoHoax-IT and 81% in SterheoSchool. This is expected, as it depends on the methodology used to produce the comments (spontaneous versus controlled) and the variety of contexts: two fake news items for SterheoSchool and 50 racial hoaxes for StereoHoax-IT. The limited amount of data unfortunately does not allow us to reliably evaluate a correlation between ‘context’ and certain implicitness strategies, as shown in Table 3, except for the association between ‘entailment/evaluation’ and ‘context’ across both datasets. The correlation between ‘implicitness’ and ‘context’ is also shown in Bourgeade et al. [27], with significant associations of the aforementioned labels in three languages: French, Italian and Spanish. In StereoHoax-IT, the correlations between ‘context’ and ‘irony/sarcasm’, ‘extrapolation’ and ‘imperative/exhortative’ are also significant, whereas the category of other implicitness strategies is also significantly correlated in SterheoSchool, which can be analyzed qualitatively to determine if there is a pattern among them. The other strategies do not have representative instances that allow for analyzing them comparatively, except for ‘extrapolation’, which is significantly correlated in StereoHoax-IT but not in SterheoSchool.
In terms of co-occurrences between topics and implicit strategies, we can observe from Table 4 that there is also a great disparity between the two datasets. Focusing on the two topics with the highest representation in SterheoSchool (‘culture & religion’, 51%, and ‘security’, 36%), which account for the majority of the corpus, we can analyze some differences with StereoHoax-IT. Firstly, ‘culture & religion’ is expressed primarily through entailments or evaluations (65 co-occurrences) and secondarily through extrapolations in SterheoSchool. In contrast, the distribution of strategies used to represent ‘culture & religion’ stereotypes is more evenly spread in StereoHoax-IT. A similar pattern is observed with the topic of ‘security’, which, while concentrating strategies in ‘entailment/evaluation’, also utilizes a range of other strategies, particularly ‘extrapolation’ and ‘imperative/exhortative’. With these co-occurrences, we can reaffirm that the different methods used to extract the data have an impact on its characteristics and, therefore, on its distribution of labels. For instance, the StereoHoax-IT messages were written in a non-controlled environment, which gives the authors the freedom to express themselves without constraints. Moreover, the
Table 2
Distribution of labels and percentages of positive class.
Corpus                   StereoHoax-IT                                SterheoSchool
Labels                   0%    33%   67%   100%  % positive class     0%    33%   67%   100%  % positive class
Xenophobia victims       265   54    12    1     4% (13)              149   3     0     0     0% (0)
Suffering victims        313   19    0     0     0% (0)               148   4     0     0     0% (0)
Economic resource        299   33    0     0     0% (0)               151   1     0     0     0% (0)
Migration control        203   48    45    36    24% (81)             140   8     2     2     3% (4)
Culture & religion       254   43    15    20    11% (35)             37    38    49    28    51% (77)
Benefits                 235   30    41    26    20% (67)             139   11    2     0     1% (2)
Public health            257   16    23    36    18% (59)             151   1     0     0     0% (0)
Security                 128   42    48    114   49% (162)            48    50    29    25    36% (54)
Dehumanization           258   40    21    13    10% (34)             126   17    4     5     6% (9)
Other topics             316   15    1     0     0% (1)               66    76    10    0     7% (10)
Context                  116   90    45    81    38% (126)            1     28    61    62    81% (123)
World knowledge          187   111   31    3     10% (34)             136   15    1     0     1% (1)
Figures of speech        257   40    27    8     11% (35)             142   8     0     2     1% (2)
Irony/Sarcasm            247   42    30    13    13% (43)             151   1     0     0     0% (0)
Humor/Jokes              300   29    3     0     1% (3)               152   0     0     0     0% (0)
Extrapolation            157   133   36    6     13% (42)             69    54    26    3     19% (29)
Entailment/Evaluation    20    100   167   46    64% (212)            1     30    63    58    80% (121)
Imperative/Exhortative   238   49    24    21    14% (45)             106   38    7     1     5% (8)
Other implicitness       301   29    2     0     1% (2)               100   41    11    0     7% (11)
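The "% positive class" column of Table 2 can be recomputed from the four vote columns. The sketch below assumes, following the description of the table, that a label counts as positive when at least two of the three annotators chose it; the helper name `positive_class` is hypothetical.

```python
def positive_class(votes_0, votes_1, votes_2, votes_3):
    """Return (count, rounded percentage) of instances on which a majority
    (two or three of the three annotators) assigned the label."""
    total = votes_0 + votes_1 + votes_2 + votes_3
    if total == 0:
        raise ValueError("no instances")
    count = votes_2 + votes_3
    return count, round(100 * count / total)


# Security row, StereoHoax-IT columns of Table 2: 128, 42, 48, 114.
security_hoax = positive_class(128, 42, 48, 114)
# Culture & religion row, SterheoSchool columns of Table 2: 37, 38, 49, 28.
culture_school = positive_class(37, 38, 49, 28)
print(security_hoax, culture_school)
```

For these two rows the result agrees with the values reported in the table (162 instances, 49%, and 77 instances, 51%).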
Table 3
Association between contextuality and implicitness. The values where p is significant are marked with an asterisk (*).
Corpus                   StereoHoax-IT                    SterheoSchool
Label                    Cramer’s V   X² / p-value        Cramer’s V   X² / p-value
World knowledge          0.074        1.8 / 0.18          0.064        0.623 / 0.43
Figures of speech        0.105        3.691 / 0.055       0.0          0.0 / 1.0
Irony/Sarcasm            0.188        11.759 / 0.001 *    –            0.0 / 1.0
Humor/Jokes              0.089        2.648 / 0.104       –            0.0 / 1.0
Extrapolation            0.176        10.315 / 0.001 *    0.041        0.258 / 0.611
Entailment/Evaluation    0.232        17.872 / 0.0 *      0.232        8.189 / 0.004 *
Imperative/Exhortative   0.116        4.502 / 0.034 *     0.077        0.9 / 0.343
Other implicitness       0.059        1.173 / 0.279       0.22         7.344 / 0.007 *
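For a 2×2 table (label present or absent against context needed or not needed), Cramér's V reduces to the square root of X² divided by the number of instances. The sketch below illustrates this relation; it is not the authors' procedure. It assumes a Pearson chi-square without continuity correction and sample sizes of 332 (StereoHoax-IT) and 152 (SterheoSchool), taken from the row totals of Table 2. The function names and the toy contingency table are hypothetical.

```python
import math


def chi_square(table):
    """Pearson chi-square statistic (no continuity correction) for a
    contingency table given as a list of rows of observed counts.
    Assumes every row and column total is non-zero."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def cramers_v(chi2, n, n_rows=2, n_cols=2):
    """Cramér's V from a chi-square statistic and the sample size."""
    return math.sqrt(chi2 / (n * min(n_rows - 1, n_cols - 1)))


# Reported X² for Entailment/Evaluation; the sample sizes are assumptions.
v_hoax = cramers_v(17.872, 332)
v_school = cramers_v(8.189, 152)

# Hypothetical counts: rows = label present/absent, columns = context needed/not needed.
toy_table = [[30, 10], [20, 40]]
toy_chi2 = chi_square(toy_table)
toy_v = cramers_v(toy_chi2, sum(map(sum, toy_table)))
```

Under these assumptions both Entailment/Evaluation values round to 0.232, matching the Cramér's V column of Table 3.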
topics in StereoHoax-IT are more balanced, as seen in the distribution of ‘entailment/evaluation’, which is also used in ‘migration control’, ‘benefits’, ‘public health’ and ‘dehumanization’. On the other hand, in SterheoSchool, both initial fake news items have the same narrative features, such as describing an aggression and highlighting the origin of the aggressor, thus eliciting a reaction in the readers related to these topics. The example "Siamo alla follia: ad Agrigento autobus gratis agli immigrati per evitare violenze e aggressioni." 16 (StereoHoax-IT) is related to security expressed through extrapolation. The example "Un cristiano che entrasse in una moschea in un paese arabo e sputasse per terra sopravviverebbe pochi secondi." 17 (StereoHoax-IT) highlights cultural and religious differences through the evaluation of a hypothetical situation.
16 Transl. "It’s crazy: in Agrigento, free buses for immigrants to prevent violence and aggressions."
17 Transl. "A Christian entering a Mosque in an Arab country and spitting on the ground would survive a few seconds."

5. Qualitative Analysis
To deepen the analysis of implicitness strategies and their interaction with different topics, we explore some messages to uncover the linguistic structures that are characteristic of implicit communication.
Example 1 has been annotated with the topic ‘public health’ and with ‘figures of speech’ and ‘irony/sarcasm’ for the strategy of implicitness; all labels achieved a 67% IAA.
1) Governo di involtini primavera!!! 18 (StereoHoax-IT)
18 Transl. "Spring rolls government."
In the context given for this message, the author complains that the government did not use more restrictive measures against Chinese children during the early stages of COVID-19. First, an ironic reading, i.e., as stating A to mean not-A, is triggered by the metonymy “spring rolls” [29], identifying Chinese citizens through a traditional Chinese dish. Second, disapproval is conveyed by showing a kind of favorable attitude of the Italian
Table 4
Co-occurrence of implicitness strategies and topics of stereotypes. In each cell (StereoHoax-IT / SterheoSchool), the number on the left corresponds to StereoHoax-IT, whereas the number on the right corresponds to SterheoSchool.
Topic                World       Figures     Irony/    Humor/   Extrapolation   Imperative/   Entailment/   Other
                     knowledge   of speech   Sarcasm   Jokes                    Exhortative   Evaluation    implicitness
Xenophobia victims   4 / 0       3 / 0       2 / 0     1 / 0    0 / 0           2 / 0         5 / 0         0 / 0
Suffering victims    0 / 0       0 / 0       0 / 0     0 / 0    0 / 0           0 / 0         0 / 0         0 / 0
Economic resource    0 / 0       0 / 0       0 / 0     0 / 0    0 / 0           0 / 0         0 / 0         0 / 0
Migration control    7 / 0       13 / 0      10 / 0    0 / 0    4 / 1           13 / 0        55 / 4        1 / 0
Culture & religion   11 / 0      0 / 1       6 / 0     2 / 0    5 / 17          3 / 7         22 / 65       0 / 1
Benefits             12 / 0      8 / 0       11 / 0    0 / 0    1 / 1           7 / 0         51 / 2        0 / 0
Public health        2 / 0       17 / 0      8 / 0     1 / 0    3 / 0           4 / 0         43 / 0        0 / 0
Security             7 / 0       12 / 1      17 / 0    0 / 0    35 / 6          29 / 2        103 / 45      0 / 4
Dehumanization       3 / 0       5 / 0       3 / 0     2 / 0    7 / 1           13 / 1        14 / 8        1 / 0
Other topics         0 / 0       0 / 0       0 / 0     0 / 0    0 / 4           0 / 0         0 / 5         1 / 4
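A co-occurrence table of this kind can be built by counting, for each message, every pair formed by one of its topic labels and one of its strategy labels. The sketch below is illustrative only: the record layout (sets stored under the hypothetical keys `topics` and `strategies`, assumed to hold the majority-voted labels) and the toy messages are assumptions, not the format of the released corpora.

```python
from collections import Counter
from itertools import product


def cooccurrences(messages):
    """Count (topic, strategy) pairs over messages; a message carrying several
    topics or strategies contributes one count per pair."""
    counts = Counter()
    for message in messages:
        counts.update(product(sorted(message["topics"]), sorted(message["strategies"])))
    return counts


# Hypothetical annotated messages.
toy_messages = [
    {"topics": {"Security"}, "strategies": {"Entailment/Evaluation"}},
    {"topics": {"Security", "Culture & religion"}, "strategies": {"Extrapolation"}},
    {"topics": {"Security"}, "strategies": {"Entailment/Evaluation", "Extrapolation"}},
]
table = cooccurrences(toy_messages)
```

Because one message may carry several labels, the cell totals of such a table can exceed the number of messages.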
government toward Chinese children.
Example 2 was annotated as ‘culture & religion’ by all three annotators. In terms of the implicitness strategies, it was labeled as both ‘extrapolation’ and ‘entailment/evaluation’ by two out of the three annotators.
2) Venezia, donne velate sputano al crocifisso. 19 (StereoHoax-IT)
19 Transl. "Venice, veiled women spit on the crucifix."
In this case, the noun phrase “veiled women” is a case of lexical narrowing, i.e., a lexical item conveys a meaning that is more specific than the item’s encoded meaning. The reader selects a more specific meaning on the basis of stereotypes and world knowledge [30] of the meaning of “veiled women”, which denotes a set of women who wear a veil, narrowed to mean Muslim women. This equation arises from the stereotype that posits that if a woman wears a veil, she is a Muslim. Furthermore, the absence of the determiner in the noun phrase, which usually indicates a generic reference, combined with the imperfective aspect and present tense of the verb, may suggest a habitual interpretation of the predicate "spit on the crucifix" [31]. The ‘extrapolation’ strategy here refers to the attribution of this action to the entire category.
Among the implicitness strategies on which annotators agreed more frequently are ‘imperative/exhortative’ and ‘figures of speech’, which have linguistic and punctuation features closer to explicitness: the former is associated with a specific grammatical mood and the exclamation mark, while the latter is associated with a question mark (considering that rhetorical questions are frequently annotated as a figure of speech), see e.g.:
3) Se non fate niente Fra 10 anni l’italia sarà tutta musulmana! 20 (StereoHoax-IT)
4) Come ci si può sentir sicuri in una società che permette questo? meschina 21 (SterheoSchool)
20 Transl. "If you do nothing In 10 years Italy will be completely Muslim"
21 Transl. "How can one feel secure in a society that allows this? mean"
The high IAA for the category of ‘irony/sarcasm’ is also interesting, and has been studied especially in social media [32, 33], as a means to lower the negative social cost of what has been said. The two categories that most frequently co-occur with ‘irony/sarcasm’ in StereoHoax-IT are ‘figures of speech’ (out of 35 instances, six are also ironic) and ‘humor/jokes’ (out of three cases, two are ironic), as in the next example:
5) @Belle facce intelligenti! Viva Lombroso! 22 (67% Humor/Jokes, 67% Irony/Sarcasm, StereoHoax-IT)
22 Transl. "Nice smart faces! Long live Lombroso!"
We found messages in which ‘entailment/evaluation’ co-occurs with ‘irony/sarcasm’, but this correlation should be analyzed in depth to be considered relevant, as 64% of instances were annotated as ‘entailment/evaluation’.

6. Conclusions
In this paper, we applied an annotation scheme for analyzing the implicitness of stereotypes against immigrants according to two main dimensions (i.e., topics and strategies for making the content implicit) to the Italian StereoHoax-IT and SterheoSchool corpora. Adding these two layers of annotation allowed us to observe that annotators need to use contextual information to determine the presence of stereotypes, especially when specific strategies have been used by the author of the message (irony/sarcasm, extrapolation, entailment/evaluation, and imperative/exhortative). Moreover, implicit stereotypes appear to be conveyed mainly through logical linguistic relations such as entailment and the behavioral evaluation of immigrants and, in fewer cases, via ‘imperative/exhortative’, ‘irony/sarcasm’ and ‘extrapolation’.
As future work, we plan to perform a comparative analysis with the datasets in Spanish, which have already been annotated with this schema, in order to understand cultural analogies and differences in portraying immigrants as threats, enemies or victims.
Acknowledgments
The work of Wolfgang Schmeisser-Nieto is funded by the project StereotypHate (Compagnia di San Paolo for the call ‘Progetti di Ateneo - Compagnia di San Paolo 2019/2021 - Mission 1.1 - Finanziamento ex-post’). The work of Cristina Bosco is partially funded by the same project.

References
[1] M. Anzovino, E. Fersini, P. Rosso, Automatic identification and classification of misogynistic language on Twitter, in: M. Silberztein, F. Atigui, E. Kornyshova, E. Métais, F. Meziane (Eds.), Natural Language Processing and Information Systems - 23rd International Conference on Applications of Natural Language to Information Systems, NLDB 2018, Paris, France, June 13-15, 2018, Proceedings, volume 10859 of Lecture Notes in Computer Science, Springer, 2018, pp. 57–64. URL: https://doi.org/10.1007/978-3-319-91947-8_6.
[2] E. Lavergne, R. Saini, G. Kovács, K. Murphy, TheNorth @ HaSpeeDe 2: BERT-based language model fine-tuning for Italian hate speech detection, in: Proceedings of the Seventh Evaluation Campaign of Natural Language Processing and Speech Tools for Italian Final Workshop, EVALITA - December 17th, 2020, volume 2765, CEUR-WS, 2020, pp. 142–147. URL: http://ceur-ws.org/Vol-2765/paper135.pdf.
[3] M. Sanguinetti, G. Comandini, E. D. Nuovo, S. Frenda, M. Stranisci, C. Bosco, T. Caselli, V. Patti, I. Russo, Haspeede 2 @ EVALITA2020: Overview of the EVALITA 2020 hate speech detection task, in: V. Basile, D. Croce, M. D. Maro, L. C. Passaro (Eds.), Proceedings of the Seventh Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2020), Online event, December 17th, 2020, volume 2765 of CEUR Workshop Proceedings, CEUR-WS, 2020. URL: http://ceur-ws.org/Vol-2765/paper162.pdf.
[6] G. W. Allport, K. Clark, T. Pettigrew, The nature of prejudice, Addison-Wesley Reading, MA, 1954.
[7] S. T. Fiske, Stereotyping, prejudice, and discrimination, in: The Handbook of Social Psychology, Vols. 1-2, 4th Ed, McGraw-Hill, New York, NY, US, 1998, pp. 357–411.
[8] A. G. Greenwald, M. R. Banaji, Implicit social cognition: Attitudes, self-esteem, and stereotypes, Psychological review 102 (1995) 4–27. URL: http://faculty.washington.edu/agg/pdf/Greenwald_Banaji_PsychRev_1995.OCR.pdf. doi:10.1037/0033-295x.102.1.4.
[9] K. A. Collins, R. Clément, Language and prejudice: direct and moderated effects, Journal of Language and Social Psychology 31 (2012) 376–396. URL: http://journals.sagepub.com/doi/10.1177/0261927X12446611. doi:10.1177/0261927X12446611.
[10] F. D’Errico, M. Paciello, Online moral disengagement and hostile emotions in discussions on hosting immigrants, Internet Research 28 (2018) 1313–1335. URL: https://www.emerald.com/insight/content/doi/10.1108/IntR-03-2017-0119/full/html. doi:10.1108/IntR-03-2017-0119.
[11] U. Quasthoff, The uses of stereotype in everyday argument, Journal of pragmatics 2 (1978) 1–48.
[12] C. J. Beukeboom, C. Finkenauer, D. H. J. Wigboldus, The negation bias: When negations signal stereotypic expectancies, Journal of Personality and Social Psychology 99 (2010) 978–992. URL: http://doi.apa.org/getdoi.cfm?doi=10.1037/a0020861. doi:10.1037/a0020861.
[13] T. F. Pettigrew, R. W. Meertens, Subtle and blatant prejudice in western Europe, European Journal of Social Psychology 25 (1995) 57–75. URL: https://onlinelibrary.wiley.com/doi/10.1002/ejsp.2420250106. doi:10.1002/ejsp.2420250106.
[14] W. S. Schmeisser-Nieto, M. Nofre, M. Taulé, Criteria for the annotation of implicit stereotypes, in: Proceedings of the Thirteenth Language Resources and Evaluation Conference (LREC 2022), 2022, pp. 753–762.
[4] M. Taulé, A. Ariza, M. Nofre, E. Amigó, P. Rosso, [15] C. Bosco, F. Dell’Orletta, F. Poletto, M. Sanguinetti,
Overview of DETOXIS at IberLEF 2021: DEtec- M. Tesconi, Overview of the evalita 2018 hate
tion of TOXicity in comments In Spanish, Proce- speech detection task, in: EVALITA 2018-Sixth Eval-
samiento del Lenguaje Natural 67 (2021) 209–221. uation Campaign of Natural Language Processing
URL: http://journal.sepln.org/sepln/ojs/ojs/index. and Speech Tools for Italian, volume 2263, CEUR,
php/pln/article/view/6390. 2018, pp. 1–9.
[5] A. Ariza-Casabona, W. S. Schmeisser-Nieto, [16] M. Sanguinetti, G. Comandini, E. di Nuovo,
M. Nofre, M. Taulé, E. Amigó, B. Chulvi, P. Rosso, S. Frenda, M. Stranisci, C. Bosco, T. Caselli, V. Patti,
Overview of DETESTS at IberLEF 2022: DETEction I. Russo, Haspeede 2 @ EVALITA2020: Overview
and classification of racial STereotypes in Spanish, of the EVALITA 2020 hate speech detection task, in:
Procesamiento del Lenguaje Natural 69 (2022) V. Basile, D. Croce, M. Di Maro, L. Passaro (Eds.),
217–228. Proceedings of the Seventh Evaluation Campaign
of Natural Language Processing and Speech Tools
for Italian. Final Workshop (EVALITA 2020), vol- https://aclanthology.org/E17-1025.
ume 2765, CEUR Workshop Proceedings (CEUR- [24] G. Corbelli, P. G. Cicirelli, F. D’Errico, M. Paciello,
WS.org), 2020. Conference date: 17-12-2020. Preventing prejudice emerging from misleading
[17] F. Rodríguez-Sánchez, J. C. de Albornoz, L. Plaza, news among adolescents: The role of implicit acti-
J. Gonzalo, P. Rosso, M. Comet, T. Donoso, vation and regulatory self-efficacy in dealing with
Overview of EXIST 2021: sexism identification in online misinformation, Social Sciences 12 (2023).
social networks, Procesamiento del Lenguaje Nat- [25] F. D’Errico, P. G. Cicirelli, G. Corbelli, M. Paciello,
ural 67 (2021) 195–207. URL: http://journal.sepln. Addressing racial misinformation at school: A
org/sepln/ojs/ojs/index.php/pln/article/view/6389. psycho-social intervention aimed at reducing eth-
[18] F. Rodríguez-Sánchez, J. C. de Albornoz, nic moral disengagement in adolescents, Social
L. Plaza, A. Mendieta-Aragón, G. Marco-Remón, Psychology of Education (2023).
M. Makeienko, M. Plaza, J. Gonzalo, D. Spina, [26] C. Bosco, V. Patti, S. Frenda, A. T. Cignarella,
P. Rosso, Overview of EXIST 2022: sexism M. Paciello, F. D’Errico, Detecting racial
identification in social networks, Procesamiento stereotypes: An italian social media corpus
del Lenguaje Natural 69 (2022) 229–240. URL: where psychology meets nlp, Information
http://journal.sepln.org/sepln/ojs/ojs/index.php/ Processing & Management 60 (2023) 103118.
pln/article/view/6443. URL: https://linkinghub.elsevier.com/retrieve/pii/
[19] L. Plaza, J. Carrillo-de Albornoz, R. Morante, S0306457322002199. doi:10.1016/j.ipm.2022.
E. Amigó, J. Gonzalo, D. Spina, P. Rosso, Overview 103118.
of exist 2023–learning with disagreement for sex- [27] T. Bourgeade, A. T. Cignarella, S. Frenda, M. Lau-
ism identification and characterization, in: Inter- rent, W. Schmeisser-Nieto, F. Benamara, C. Bosco,
national Conference of the Cross-Language Eval- V. Moriceau, V. Patti, M. Taulé, A Multilingual
uation Forum for European Languages, Springer, Dataset of Racial Stereotypes in Social Media Con-
2023, pp. 316–342. versational Threads, in: Findings of the Association
[20] W. S. Schmeisser-Nieto, P. Pastells, S. Frenda, for Computational Linguistics: EACL 2023, Asso-
M. Taule, Human vs. machine perceptions on ciation for Computational Linguistics, Dubrovnik,
immigration stereotypes, in: N. Calzolari, M.-Y. Croatia, 2023, pp. 686–696.
Kan, V. Hoste, A. Lenci, S. Sakti, N. Xue (Eds.), [28] E. Chierchiello, T. Bourgeade, G. Ricci, C. Bosco,
Proceedings of the 2024 Joint International Con- F. D’Errico, Studying reactions to stereotypes in
ference on Computational Linguistics, Language teenagers: an annotated italian dataset, in: Proceed-
Resources and Evaluation (LREC-COLING 2024), ings of the Fourth Workshop on Threat, Aggression
ELRA and ICCL, Torino, Italia, 2024, pp. 8453–8463. and Cyberbullying (TRAC-2024), 2014.
URL: https://aclanthology.org/2024.lrec-main.741. [29] G. Lakoff, Women, fire, and dangerous things: What
[21] A. Fokkens, N. Ruigrok, C. Beukeboom, G. Sarah, categories reveal about the mind, University of
W. Van Atteveldt, Studying muslim stereotyping Chicago Press, Chicago, 1987.
through microportrait extraction, in: Proceedings [30] Y. Huang, Implicitness in the lexis, in: P. Cap,
of the Eleventh International Conference on Lan- M. Dynel (Eds.), Implicitness: From lexis to dis-
guage Resources and Evaluation (LREC 2018), 2018, course, John Benjamins, Amsterdam/ Philadelphia,
pp. 3734–3741. 2017, pp. 67–94.
[22] J. J. Sánchez-Junquera, B. Chulvi, P. Rosso, [31] C. Lyons, Definiteness, Cambridge University Press,
S. P. Ponzetto, How do you speak about im- Cambridge, 1999.
migrants? taxonomy and stereoimmigrants [32] S. Frenda, V. Patti, P. Rosso, Killing me softly:
dataset for identifying stereotypes about im- Creative and cognitive aspects of implicitness
migrants, Applied Sciences 11 (2021). URL: in abusive language online, Natural Language
https://www.mdpi.com/2076-3417/11/8/3610. Engineering 29 (2023) 1516–1537. doi:10.1017/
doi:10.3390/app11083610. S1351324922000316.
[23] J. Karoui, F. Benamara, V. Moriceau, V. Patti, [33] S. Frenda, V. Patti, P. Rosso, When sarcasm hurts:
C. Bosco, N. Aussenac-Gilles, Exploring the im- Irony-aware models for abusive language detec-
pact of pragmatic phenomena on irony detection tion, in: A. Arampatzis, E. Kanoulas, T. Tsikrika,
in tweets: A multilingual corpus study, in: M. Lap- S. Vrochidis, A. Giachanou, D. Li, M. Aliannejadi,
ata, P. Blunsom, A. Koller (Eds.), Proceedings of the M. Vlachos, G. Faggioli, N. Ferro (Eds.), Experimen-
15th Conference of the European Chapter of the tal IR Meets Multilinguality, Multimodality, and
Association for Computational Linguistics: Volume Interaction, Springer Nature Switzerland, Cham,
1, Long Papers, Association for Computational Lin- 2023, pp. 34–47.
guistics, Valencia, Spain, 2017, pp. 262–272. URL: