Implicit Stereotypes: A Corpus-Based Study for Italian

Wolfgang S. Schmeisser-Nieto¹,²,*, Giacomo Ricci², Simona Frenda³,⁴, Mariona Taulé¹ and Cristina Bosco²

¹ Universitat de Barcelona, Gran Via de les Corts Catalanes, 585, Barcelona, Spain
² University of Turin, Dipartimento di Informatica, Corso Svizzera 185, 10149 Torino, Italy
³ Interaction Lab, Heriot-Watt University, The Avenue, Edinburgh, EH14 4AS, Scotland
⁴ aequa-tech, Torino, Italy

CLiC-it 2024 - Tenth Italian Conference on Computational Linguistics, Dec 04-06, 2024, Pisa, Italy
* Corresponding author.
Email: wolfgang.schmeisser@ub.edu (W. S. Schmeisser-Nieto); giacomo.ricci@edu.unito.it (G. Ricci); s.frenda@hw.ac.uk (S. Frenda); mtaule@ub.edu (M. Taulé); cristina.bosco@unito.it (C. Bosco)
ORCID: 0000-0001-5663-6276 (W. S. Schmeisser-Nieto); 0000-0002-6215-3374 (S. Frenda); 0000-0003-0089-940X (M. Taulé); 0000-0002-8857-4484 (C. Bosco)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073

Abstract

Detecting stereotypes is a challenging task, particularly when they are not expressed explicitly. In this study, we applied an annotation schema from the literature designed to formalize implicit stereotypes. We analyzed implicit stereotypes about immigrants in two datasets created from different sources: StereoHoax-IT and SterheoSchool. StereoHoax-IT consists of reactions on Twitter to specific hoaxes aimed at discriminating against immigrants, while SterheoSchool includes comments from teenagers on fake news generated in psychological experiments. We describe the annotation process and annotator disagreements, and provide both quantitative and qualitative analyses to shed light on how implicitness characterizes stereotypes in different texts. Our findings suggest that implicit stereotypes are often conveyed through logical linguistic relations, such as entailment and behavioral evaluations of immigrants.

Keywords

Implicit stereotype, Corpora annotation, Corpora analysis, Italian language

1. Introduction and Background

Various recent NLP studies have focused on detecting stereotypes online, often in conjunction with forms of abusive language [1, 2, 3, 4, 5]. The importance of tackling this phenomenon is due to its impact on social structures and the power of individuals. Detecting stereotypes can therefore help prevent their emergence and spread, and thereby have a positive impact on our society.

In social psychology, a stereotype has been defined as a set of beliefs about others perceived as belonging to a different social group [6]. It oversimplifies the features of the group and generalizes a particular feature, applying it to all its members [6]. In contrast to the emotional component of prejudice and the behavioral component of discrimination, a stereotype is associated with the cognitive component of the triad [7]. In language, stereotypes can be expressed explicitly or implicitly [8]. Explicit stereotypes deliver a straightforward message, clearly revealing the associated traits, often using derogatory adjectives [9, 10]. In contrast, implicit stereotypes are more nuanced and indirect, requiring the reader to infer their meaning [11]. These implicit stereotypes can be communicated through linguistic devices such as metaphor and irony [9], negation [12], or entailments [13]. Recently, efforts have been made to formalize the strategies for expressing implicit stereotypes, with the goal of establishing standardized criteria for annotators [14]. An example of an explicit stereotype is "[Gli immigrati] buttano via il cibo che gli danno per poi andare a mangiare i poveri cani, dove finiremo!"¹ (extracted from the StereoHoax-IT corpus), in which the generalization of the target group and the association with an action are expressed in the present tense with a habitual aspect. On the other hand, in the example "Come noi rispettiamo loro e il colore della loro pelle, così loro che abitano nei nostri paesi dovrebbero portare rispetto nei nostri confronti."² (SterheoSchool corpus), the stereotype is not overtly manifested, but must be inferred through the evaluation of the in-group and an exhortative sentence.

From a computational linguistics perspective, concerns have been raised about how to detect and process stereotypes, a task often considered closely related to the detection of abusive language or hate speech [15]. Alongside research on hate speech, the study of stereotype detection has increased, particularly within evaluation tasks [16, 4, 17, 18, 19]. However, the detection of implicit stereotypes remains a significant challenge [20]. Several works deal with stereotypes in more complex narratives, such as microportraits [21] and political debates [22]. The detection of implicitness has also been studied with reference to several other phenomena, in particular those characterized by subjectivity, such as irony [23]. In this paper, we analyze the implicit manifestation of stereotypes targeting immigrants, using a well-defined annotation schema proposed by Schmeisser-Nieto et al. [14] and tested on a subset of comments from Spanish newspapers (DETESTS [5]). This schema represents different criteria for determining the implicitness of stereotypes in an attempt to formalize the concept. Disentangling strategies of implicitness presents a significant challenge, often resulting in the identification of multiple categories within the same text.

Our main contributions consist of expanding the annotation with topics of stereotypes about immigrants [5] and the implicitness strategies [14], as well as testing this schema on two existing Italian datasets. These datasets share the same domain as those used for Spanish, stereotypes about immigrants, and include data extracted from Twitter (now X) as reactions to specific hoaxes (StereoHoax-IT) and comments written by high school students in response to two examples of fake news artificially created within psychological experiments (SterheoSchool), as described in [24, 25]. Analyzing the annotated texts, we noted that implicit stereotypes appear to be conveyed especially through logical linguistic relations like entailment and the behavioral evaluation of immigrants in both datasets. Moreover, in most cases, the annotators needed to use contextual information to determine the presence of stereotypes. For example, in the message "Che centra lui e Italiano!, può essere massacrato!"³ (StereoHoax-IT) the author expresses a stereotype by complaining that foreigners enjoy better treatment than Italians, who can indeed be "macellati" (slaughtered).

The rest of the paper is organized as follows: Sections 2 and 3 describe the datasets and the annotation applied; Sections 4 and 5 present quantitative and qualitative analyses of the annotated data; and Section 6 summarizes the results and provides guidance regarding future work.

¹ Transl. "They throw away the food they are given only to go eat the poor dogs. Where will we end up!"
² Transl. "Just as we respect them and the color of their skin, they, who live in our countries, should show respect toward us."
³ Transl. "That’s not the point, he is Italian! He can be slaughtered!"

2. Datasets

In this work, we focus on two annotated corpora containing implicit stereotypes developed within the STERHEOTYPES project⁴ and the StereotypHate project⁵. Their content is related to attitudes regarding immigrants, and they share similar conversational structures and the same annotation scheme. Each message in these datasets is contextualized, i.e. collocated within a discourse thread or presented as a comment on a given news item. Under this annotation scheme, each message is annotated for the presence or absence of anti-migrant stereotypes and, if present, for other related categories, such as whether the stereotype was expressed implicitly or explicitly and the forms of discredit into which the stereotype could be classified. The latter category is inspired by the Stereotype Content Model (SCM) [7] and allowed us to observe the stereotype from a perspective that encompasses psychology and computational linguistics [26]. In Section 3, we show how we extended this annotation to describe the dimension of implicitness⁶.

StereoHoax-IT [27] is a contextualized multilingual dataset of tweets annotated primarily for the presence of anti-migrant stereotypes. The dataset consists of replies to tweets identified as containing racial hoaxes that specifically target migrants. The hoaxes were gathered from debunking websites, and the tweets were collected from French, Italian and Spanish Twitter between 2019 and 2021. Each message is provided with its “conversation head” (the message containing the source racial hoax) and its direct parent message (if applicable). In this paper, we only use the Italian subset, which includes 3,123 instances. Due to the rarity of the phenomenon, there is a significant class imbalance: 472 instances (15%) contain a stereotype, 332 of which (70%) are implicit and 140 (30%) are explicit.

SterheoSchool [28] consists of a selection of data collected in Italian schools during experiments conducted by social psychologists [24, 25]. More precisely, it includes the reactions, recorded via a cell phone interface, of teenagers who read two hoaxes artificially created and presented as news articles. The hoaxes were designed to elicit reactions to stereotypes in readers. For each news item, readers were asked to comment on the news and on the main character of the articles. These comments are also associated with metadata, such as the age and declared gender of the author. By collecting data generated by teenagers, this corpus aims to fill a gap in the literature, in which teenagers are an underrepresented category in data annotated for text classification tasks. We applied the annotation scheme mentioned above to the news and comments. This corpus consists of 1,147 comments, 337 (33.8%) of which are annotated as containing stereotypes; of these, 152 (45%) are expressed in an implicit form.

⁴ STERHEOTYPES (Studying European Racial Hoaxes and sterEOTYPES) is an international project funded by Compagnia di San Paolo and VolksWagen Stiftung.
⁵ StereotypHate is a project funded by Compagnia di San Paolo.
⁶ The datasets will be made available for research purposes after the acceptance of the paper in anonymized form.

3. Annotation

The annotation scheme we applied to the two corpora is based on two different layers, topics of stereotypes and implicitness strategies, as well as the need for context.

The topics of stereotypes were first introduced within an evaluation task, DETESTS [5], in which the participants had to train models to decide whether a text contained stereotypes and, when it did, classify the stereotype into ten different categories:

• Xenophobia victims: Immigrants are perceived as victims of xenophobia and discrimination. They enrich culture and diversity and should have the same rights as citizens.
• Suffering victims: Immigrants are portrayed as victims of poverty and violence in their places of origin and as having to face difficult situations in their host countries.
• Economic resources: Immigrants are seen as an economic resource. They do the jobs that locals do not want to do, pay taxes and solve the problems arising from low population growth.
• Migration control: Immigrants present a threat due to massive influxes and a lack of control at the borders. Immigrants are illegal and should be expelled. It is seen as an invasion.
• Culture and religion differences: Immigrants imply a loss of the in-group’s values and traditions and their replacement by the target group’s customs and religions. They are also seen as uneducated and should adapt to their host country.
• Benefits: Immigrants compete with the in-group for resources such as public subsidies, school places, jobs, health care and pensions. They are privileged over the in-group.
• Public health: Immigrants are thought to be carriers of infections and diseases such as COVID-19, Ebola and HIV.
• Security: Immigration brings security issues. Due to immigration, there is an increase in crime, domestic violence, robbery, drug use, sexual assault, murder, terrorist attacks and public disorders.
• Dehumanization: Immigrants are seen as inferior beings and are compared with animals, parasites or scum. Their lives have less value than those of the in-group.
• Other topics: Any other immigration stereotypes not covered in the previous categories.

Context and implicitness strategies were initially proposed as criteria that could help annotators to annotate implicitness, since its vagueness may decrease Inter-Annotator Agreement (IAA) [14]. By context, we refer to information contained in previous messages that is considered necessary to understand the meaning of the message to be annotated, as in the following example: "Sempre assolti...sempre misure e pesi differenti". Context: "Uccide anziana ebrea al grido di Allah Akbar. Assolto perché drogato."⁷ (StereoHoax-IT). Regarding the strategies and linguistic devices used to convey implicit stereotypes, we have revised the criteria proposed in [14] as follows:

• World knowledge: World knowledge refers to the shared cultural, social and historical knowledge needed to interpret messages, e.g., "La scuola si inchina all’islam: l’aceto è bandito dalle mense."⁸ (StereoHoax-IT)
• Figures of speech: Every figure of speech except for irony and sarcasm, and humor and jokes. For instance, metaphor, rhetorical questions, euphemisms or reported speech, e.g., "Chi è quel pazzo che si mette in casa uno di questi? Un suicidio"⁹ (StereoHoax-IT)
• Irony/Sarcasm: The message expresses a meaning that is the opposite of what is said, e.g. in "Che bella gente fanno arrivare.....che bello avere un paese pieno di risorse pronte a tutto.....ma proprio a tutto."¹⁰ (StereoHoax-IT)
• Humor/Jokes: Jokes about a target group often use stereotypes and may or may not include irony, e.g. in "Chissà se ha detto:"Cibo no buono"."¹¹ (StereoHoax-IT)
• Extrapolation: The target refers to an individual or specific members of a social group, not the group as a whole, e.g. in "Classico del sud-italia Maleducata"¹² (SterheoSchool)
• Imperative/Exhortative: Calls to take certain actions related to the target group, e.g. "Come in Cina FUCILATELO"¹³ (StereoHoax-IT)
• Entailment/Evaluation: Logical relation between two sentences in which the condition of truth of sentence A implies the truth of sentence B. The implicit stereotype is implied in sentence A. An evaluation of the author’s or in-group’s thoughts, emotions and behaviors, rather than content about the out-group or target group, can be considered as a type of entailment, e.g. "Saranno fuori o liberi presto"¹⁴ (StereoHoax-IT) is the answer to a racial hoax in which a group of immigrants rape and murder a teenage girl. With the author’s evaluation of the situation, it is entailed that immigrants are immune from punishment.
• Other implicitness: Other types of implicitness not considered in the previous categories, e.g. "al giorno d’oggi non ci si può fidare di nessuno una persona ripugnante"¹⁵ (SterheoSchool)

⁷ Transl. "Always acquitted...always different measures and weights." Context: "Kills elderly Jewish woman while shouting ‘Allah Akbar.’ Acquitted because he was on drugs."
⁸ Transl. "The school bows to Islam: vinegar is banned from canteens."
⁹ Transl. "Who’s that fool who takes one of these into his house? a suicide"
¹⁰ Transl. "Such nice people they bring in... how nice it is to have a country full of resources ready for anything... anything at all"
¹¹ Transl. "I wonder if he said: «Food no good»"
¹² Transl. "Typical of Southern Italy"
¹³ Transl. "SHOOT HIM like in China"
¹⁴ Transl. "They will be out or free soon"
¹⁵ Transl. "nowadays you can’t trust anyone a repulsive person"

The annotation was carried out on the Label Studio platform by three native Italian speakers with a background in linguistics, some of whom specialized in NLP. They achieved an acceptable to good IAA in the majority of cases, as reported in Table 1, although agreement varies across categories and corpora. By observing Table 2, we can see that only a few topics have been marked by the majority of annotators, while not all the implicit criteria have been identified in the texts (i.e., ‘humor/jokes’).

Table 1
Inter-annotator agreement test using Fleiss’ kappa (κ) coefficient on the categories of implicitness and stereotype topics of the StereoHoax-IT and the SterheoSchool corpora.

Label                     StereoHoax-IT   SterheoSchool
Xenophobia victims        0.57            0.50
Suffering victims         0.49            0.50
Economic resource         0.48            0.50
Migration control         0.77            0.55
Culture & religion        0.75            0.71
Benefits                  0.75            0.62
Public health             0.86            0.50
Security                  0.81            0.64
Dehumanization            0.71            0.71
Other topics              0.52            0.43

Context                   0.72            0.50

World knowledge           0.52            0.51
Figures of speech         0.68            0.70
Irony/Sarcasm             0.70            0.50
Humor/Jokes               0.52            No cases
Extrapolation             0.51            0.53
Imperative/Exhortative    0.73            0.53
Entailment/Evaluation     0.45            0.49
Other implicitness        0.51            0.52
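Table 1 reports agreement as Fleiss' kappa, computed per label over the votes of the three annotators. As a reminder of how that coefficient is obtained, here is a minimal sketch in plain Python. The function name `fleiss_kappa` and the toy vote table are illustrative assumptions, not the authors' code or data, and the sketch is not meant to reproduce the values in Table 1. Each row stands for one message and holds how many annotators chose "absent" and "present" for a single binary label.

```python
import math
from typing import Sequence


def fleiss_kappa(table: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for an items x categories table of rating counts.

    Every row must sum to the same number of raters (three in this paper).
    """
    if not table:
        raise ValueError("need at least one item")
    n_items = len(table)
    n_raters = sum(table[0])
    if n_raters < 2:
        raise ValueError("need at least two raters per item")
    if any(sum(row) != n_raters for row in table):
        raise ValueError("every item must have the same number of ratings")

    n_categories = len(table[0])
    # Share of all ratings that fall into each category.
    p_category = [
        sum(row[j] for row in table) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    # Per-item agreement: agreeing rater pairs over all rater pairs.
    p_items = [
        sum(count * (count - 1) for count in row) / (n_raters * (n_raters - 1))
        for row in table
    ]
    p_observed = sum(p_items) / n_items
    p_expected = sum(p * p for p in p_category)
    if math.isclose(p_expected, 1.0):
        # All ratings fall into one category: kappa is undefined.
        return float("nan")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical votes for one binary label: [absent, present] per message.
toy_votes = [
    [3, 0],  # unanimous "absent"
    [0, 3],  # unanimous "present"
    [1, 2],  # two of three annotators say "present"
    [2, 1],  # one of three annotators says "present"
]
print(round(fleiss_kappa(toy_votes), 3))  # 0.333
```

With half of the toy messages rated unanimously and half split two to one, the observed agreement is 2/3 against a chance agreement of 1/2, which gives a kappa of 1/3.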
4. Quantitative Analysis

Table 2 shows the distribution of the disaggregated annotations across both datasets. Columns 0%, 33%, 67% and 100%, respectively, indicate the number of instances per label that were annotated by no annotator (0%), by one annotator (33%), by two annotators (67%) and by all three annotators (100%). Column % positive class shows the percentage of instances for which the label was voted by the majority of annotators, with the total number of such cases in parentheses.

Table 2
Distribution of labels and percentages of positive class.

                        StereoHoax-IT                          SterheoSchool
Labels                  0%   33%  67%  100%  % positive class  0%   33%  67%  100%  % positive class
Xenophobia victims      265  54   12   1     4% (13)           149  3    0    0     0% (0)
Suffering victims       313  19   0    0     0% (0)            148  4    0    0     0% (0)
Economic resource       299  33   0    0     0% (0)            151  1    0    0     0% (0)
Migration control       203  48   45   36    24% (81)          140  8    2    2     3% (4)
Culture & religion      254  43   15   20    11% (35)          37   38   49   28    51% (77)
Benefits                235  30   41   26    20% (67)          139  11   2    0     1% (2)
Public health           257  16   23   36    18% (59)          151  1    0    0     0% (0)
Security                128  42   48   114   49% (162)         48   50   29   25    36% (54)
Dehumanization          258  40   21   13    10% (34)          126  17   4    5     6% (9)
Other topics            316  15   1    0     0% (1)            66   76   10   0     7% (10)
Context                 116  90   45   81    38% (126)         1    28   61   62    81% (123)
World knowledge         187  111  31   3     10% (34)          136  15   1    0     1% (1)
Figures of speech       257  40   27   8     11% (35)          142  8    0    2     1% (2)
Irony/Sarcasm           247  42   30   13    13% (43)          151  1    0    0     0% (0)
Humor/Jokes             300  29   3    0     1% (3)            152  0    0    0     0% (0)
Extrapolation           157  133  36   6     13% (42)          69   54   26   3     19% (29)
Entailment/Evaluation   20   100  167  46    64% (212)         1    30   63   58    80% (121)
Imperative/Exhortative  238  49   24   21    14% (45)          106  38   7    1     5% (8)
Other implicitness      301  29   2    0     1% (2)            100  41   11   0     7% (11)

Firstly, an inconsistency in the distribution of labels can be observed, since in SterheoSchool only four labels have a representation of more than 10%. This disparity is due to the extraction methods of each dataset: the topics of the racial hoaxes used to extract the dataset were more balanced in StereoHoax-IT than in SterheoSchool, with the latter focusing generally on security and cultural differences, which are discussed in the only two contexts provided to the students for their comments. However, while in the former there is a representation of all the stereotypical topics that portray immigrants as threats, the security issue is highly prevalent in both datasets.

A common trend shows that the most frequent implicitness strategy in both datasets is ‘entailment/evaluation’, accounting for 64% in StereoHoax-IT and 80% in SterheoSchool. To a lesser degree, ‘extrapolation’ appears in both datasets, with 13% in the former and 19% in the latter. Other represented strategies that exceed 10% of instances are only found in StereoHoax-IT.

The label ‘context’ has a high prevalence in both datasets, accounting for 38% in StereoHoax-IT and 80% in SterheoSchool. This is expected, as it depends on the methodology used to produce the comments (spontaneous versus controlled) and the variety of contexts: two fake news items for SterheoSchool and 50 racial hoaxes for StereoHoax-IT. The limited amount of data unfortunately does not allow us to reliably evaluate a correlation between ‘context’ and certain implicitness strategies, as shown in Table 3, except for the association between ‘entailment/evaluation’ and ‘context’ across both datasets. The correlation between ‘implicitness’ and ‘context’ is also shown in Bourgeade et al. [27], with significant associations of the aforementioned labels in three languages: French, Italian and Spanish. In StereoHoax-IT, the correlations between ‘context’ and ‘irony/sarcasm’, ‘extrapolation’ and ‘imperative/exhortative’ are also significant, whereas the category of other implicitness strategies is also significantly correlated in SterheoSchool, which can be analyzed qualitatively to determine if there is a pattern among them. The other strategies do not have representative instances that allow for analyzing them comparatively, except for ‘extrapolation’, which is significantly correlated in StereoHoax-IT but not in SterheoSchool.

Table 3
Association between contextuality and implicitness. Values where p is significant are marked with an asterisk.

                        StereoHoax-IT                   SterheoSchool
                        Cramér’s V  χ² / p-value        Cramér’s V  χ² / p-value
World knowledge         0.074       1.8 / 0.18          0.064       0.623 / 0.43
Figures of speech       0.105       3.691 / 0.055       0.0         0.0 / 1.0
Irony/Sarcasm           0.188       11.759 / 0.001 *    –           0.0 / 1.0
Humor/Jokes             0.089       2.648 / 0.104       –           0.0 / 1.0
Extrapolation           0.176       10.315 / 0.001 *    0.041       0.258 / 0.611
Entailment/Evaluation   0.232       17.872 / 0.0 *      0.232       8.189 / 0.004 *
Imperative/Exhortative  0.116       4.502 / 0.034 *     0.077       0.9 / 0.343
Other implicitness      0.059       1.173 / 0.279       0.22        7.344 / 0.007 *

In terms of co-occurrences between topics and implicit strategies, we can observe from Table 4 that there is also a great disparity between the two datasets. Focusing on the two topics with the highest representation in SterheoSchool (‘culture & religion’, 51%, and ‘security’, 35%), which account for the majority of the corpus, we can analyze some differences with StereoHoax-IT. Firstly, ‘culture & religion’ is expressed primarily through entailments or evaluations (65 co-occurrences) and secondarily through extrapolations in SterheoSchool. In contrast, the distribution of strategies used to represent ‘culture & religion’ stereotypes is more evenly spread in StereoHoax-IT. A similar pattern is observed with the topic of ‘security’, which, while concentrating strategies in ‘entailment/evaluation’, also utilizes a range of other strategies, particularly ‘extrapolation’ and ‘imperative/exhortative’. With these co-occurrences, we can reaffirm that the different methods used to extract the data have an impact on its characteristics and, therefore, on its distribution of labels. For instance, the StereoHoax-IT messages were written in a non-controlled environment, which gives the authors the freedom to express themselves without constraints. Moreover, the topics in StereoHoax-IT are more balanced, as seen in the distribution of ‘entailment/evaluation’, which is also used in ‘migration control’, ‘benefits’, ‘public health’ and ‘dehumanization’. On the other hand, in SterheoSchool, both initial fake news items have the same narrative features, such as describing an aggression and highlighting the origin of the aggressor, thus eliciting a reaction in the readers related to these topics. The example "Siamo alla follia: ad Agrigento autobus gratis agli immigrati per evitare violenze e aggressioni."¹⁶ (StereoHoax-IT) is related to security expressed through extrapolation. The example "Un cristiano che entrasse in una moschea in un paese arabo e sputasse per terra sopravviverebbe pochi secondi."¹⁷ (StereoHoax-IT) highlights cultural and religious differences through the evaluation of a hypothetical situation.

¹⁶ Transl. "It’s crazy: in Agrigento, free buses for immigrants to prevent violence and aggressions."
¹⁷ Transl. "A Christian entering a Mosque in an Arab country and spitting on the ground would survive a few seconds."
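Two derived quantities in Tables 2 and 3 can be recomputed from the published numbers, which is a useful sanity check when reusing them. The sketch below is illustrative only. The function names are hypothetical, rounding to whole percentages is an assumption about how Table 2 was produced, and using the row total of Table 2 (332 implicit StereoHoax-IT instances) as the sample size behind Table 3 is an inference rather than something the text states. The p-value helper covers only the one-degree-of-freedom case of a 2×2 table.

```python
import math
from typing import Sequence, Tuple


def positive_class(vote_histogram: Sequence[int], n_annotators: int = 3) -> Tuple[int, int]:
    """Return (count, rounded percentage) of instances with a majority of positive votes.

    vote_histogram[k] is the number of instances that received exactly k
    positive votes, i.e. the 0%, 33%, 67% and 100% columns of Table 2.
    """
    if len(vote_histogram) != n_annotators + 1:
        raise ValueError("expected one bucket per possible number of positive votes")
    total = sum(vote_histogram)
    if total == 0:
        raise ValueError("empty histogram")
    # Strict majority: more than half of the annotators assigned the label.
    majority = n_annotators // 2 + 1
    positives = sum(vote_histogram[majority:])
    return positives, round(100 * positives / total)


def cramers_v(chi2: float, n: int, n_rows: int = 2, n_cols: int = 2) -> float:
    """Cramér's V from a chi-square statistic on an n_rows x n_cols table."""
    return math.sqrt(chi2 / (n * min(n_rows - 1, n_cols - 1)))


def p_value_1dof(chi2: float) -> float:
    """Upper-tail probability of a chi-square variable with one degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# 'Security' row of Table 2, StereoHoax-IT: instances with 0, 1, 2 and 3 positive votes.
security = [128, 42, 48, 114]
print(positive_class(security))  # (162, 49), matching "49% (162)"

# 'Irony/Sarcasm' row of Table 3, StereoHoax-IT: chi-square = 11.759.
n_instances = sum(security)  # 332, the assumed sample size for Table 3
print(round(cramers_v(11.759, n_instances), 3))  # 0.188
print(round(p_value_1dof(11.759), 3))  # 0.001
```

The recomputed values agree with the published rows, which supports the reading that the association tests were run on the implicit instances only. Whether a continuity correction was applied when the chi-square statistics themselves were computed is not stated, so the sketch starts from the reported statistic instead of a contingency table.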
Table 4
Co-occurrence of implicitness strategies and topics of stereotypes. The numbers on the left correspond to StereoHoax-IT, whereas the numbers on the right correspond to SterheoSchool.

StereoHoax-IT / SterheoSchool
Topic                 World          Figures        Irony/         Humor/         Extrapolation  Imperative/    Entailment/    Other
                      knowledge      of speech      Sarcasm        Jokes                         Exhortative    Evaluation     implicitness
Xenophobia victims    4 / 0          3 / 0          2 / 0          1 / 0          0 / 0          2 / 0          5 / 0          0 / 0
Suffering victims     0 / 0          0 / 0          0 / 0          0 / 0          0 / 0          0 / 0          0 / 0          0 / 0
Economic resource     0 / 0          0 / 0          0 / 0          0 / 0          0 / 0          0 / 0          0 / 0          0 / 0
Migration control     7 / 0          13 / 0         10 / 0         0 / 0          4 / 1          13 / 0         55 / 4         1 / 0
Culture & religion    11 / 0         0 / 1          6 / 0          2 / 0          5 / 17         3 / 7          22 / 65        0 / 1
Benefits              12 / 0         8 / 0          11 / 0         0 / 0          1 / 1          7 / 0          51 / 2         0 / 0
Public health         2 / 0          17 / 0         8 / 0          1 / 0          3 / 0          4 / 0          43 / 0         0 / 0
Security              7 / 0          12 / 1         17 / 0         0 / 0          35 / 6         29 / 2         103 / 45       0 / 4
Dehumanization        3 / 0          5 / 0          3 / 0          2 / 0          7 / 1          13 / 1         14 / 8         1 / 0
Other topics          0 / 0          0 / 0          0 / 0          0 / 0          0 / 4          0 / 0          0 / 5          1 / 4

5. Qualitative analysis

To deepen the analysis of implicitness strategies and their interaction with different topics, we explore some messages to uncover the linguistic structures that are characteristic of implicit communication.

Example 1 has been annotated with the topic ‘public health’ and with ‘figures of speech’ and ‘irony/sarcasm’ as implicitness strategies; all labels were assigned by two of the three annotators (67% agreement).

1) Governo di involtini primavera!!!¹⁸ (StereoHoax-IT)

In the context given for this message, the author complains that the government did not use more restrictive measures against Chinese children during the early stages of COVID-19. First, an ironic reading, i.e., stating A to mean not-A, is triggered by the metonymy “spring rolls” [29], which identifies Chinese citizens through a traditional Chinese dish. Second, disapproval is conveyed by portraying the Italian government as having a favorable attitude toward Chinese children.

Example 2 was annotated as ‘culture & religion’ by all three annotators. In terms of the implicitness strategies, it was labeled as both ‘extrapolation’ and ‘entailment/evaluation’ by two out of the three annotators.

2) Venezia, donne velate sputano al crocifisso.¹⁹ (StereoHoax-IT)

In this case, the noun phrase “veiled women” is a case of lexical narrowing, i.e., a lexical item conveys a meaning that is more specific than the item’s encoded meaning. The reader selects a more specific meaning on the basis of stereotypes and world knowledge [30]: “veiled women”, which denotes a set of women who wear a veil, is narrowed to mean Muslim women. This equation arises from the stereotype that posits that if a woman wears a veil, she is a Muslim. Furthermore, the absence of the determiner in the noun phrase, which usually indicates a generic reference, combined with the imperfective aspect and present tense of the verb, may suggest a habitual interpretation of the predicate "spit on the crucifix" [31]. The ‘extrapolation’ strategy here refers to the attribution of this action to the entire category.

Among the implicitness strategies on which annotators agreed more frequently are ‘imperative/exhortative’ and ‘figures of speech’, which have linguistic and punctuation features closer to explicitness: the former is associated with a specific grammatical mood and the exclamation mark, while the latter is associated with a question mark (considering that rhetorical questions are frequently annotated as a figure of speech), see e.g.:

3) Se non fate niente Fra 10 anni l’italia sarà tutta musulmana!²⁰ (StereoHoax-IT)
4) Come ci si può sentir sicuri in una società che permette questo? meschina²¹ (SterheoSchool)

The high IAA for the category of ‘irony/sarcasm’ is also interesting; irony has been studied especially in social media [32, 33] as a means to lower the negative social cost of what has been said. The two categories that most frequently co-occur with ‘irony/sarcasm’ in StereoHoax-IT are ‘figures of speech’ (out of 35 instances, six are also ironic) and ‘humor/jokes’ (out of three cases, two are ironic), as in the next example:

5) @Belle facce intelligenti! Viva Lombroso!²² (67% Humor/Jokes, 67% Irony/Sarcasm, StereoHoax-IT)

We found messages in which ‘entailment/evaluation’ co-occurs with ‘irony/sarcasm’, but this correlation should be analyzed in depth before it can be considered relevant, as 64% of instances were annotated as ‘entailment/evaluation’.

¹⁸ Transl. "Spring rolls government."
¹⁹ Transl. "Venice, veiled women spit on the crucifix."
²⁰ Transl. "If you do nothing In 10 years Italy will be completely Muslim"
²¹ Transl. "How can one feel secure in a society that allows this? mean"
²² Transl. "Nice smart faces! Long live Lombroso!"

6. Conclusions

In this paper, we applied an annotation scheme for analyzing the implicitness of stereotypes against immigrants according to two main dimensions (i.e., topics and strategies for making the content implicit) to the Italian StereoHoax-IT and SterheoSchool corpora. Adding these two layers of annotation allowed us to observe that annotators need to use contextual information to determine the presence of stereotypes, especially when specific strategies have been used by the author of the message (irony/sarcasm, extrapolation, entailment/evaluation, and imperative/exhortative). Moreover, implicit stereotypes appear to be conveyed mainly through logical linguistic relations such as entailment and the behavioral evaluation of immigrants and, in fewer cases, via ‘imperative/exhortative’, ‘irony/sarcasm’ and ‘extrapolation’.

As future work, we plan to perform a comparative analysis with the datasets in Spanish, which have already been annotated with this schema, in order to understand cultural analogies and differences in portraying immigrants as threats, enemies or victims.

Acknowledgments

The work of Wolfgang Schmeisser-Nieto is funded by the project StereotypHate (Compagnia di San Paolo for the call ‘Progetti di Ateneo - Compagnia di San Paolo 2019/2021 - Mission 1.1 - Finanziamento ex-post’). The work of Cristina Bosco is partially funded by the same project.

References

[1] M. Anzovino, E. Fersini, P. Rosso, Automatic identification and classification of misogynistic language on Twitter, in: M. Silberztein, F. Atigui, E. Kornyshova, E. Métais, F. Meziane (Eds.), Natural Language Processing and Information Systems - 23rd International Conference on Applications of Natural Language to Information Systems, NLDB 2018, Paris, France, June 13-15, 2018, Proceedings, volume 10859 of Lecture Notes in Computer Science, Springer, 2018, pp. 57–64. URL: https://doi.org/10.1007/978-3-319-91947-8_6.
[2] E. Lavergne, R. Saini, G. Kovács, K. Murphy, TheNorth @ HaSpeeDe 2: BERT-based language model fine-tuning for Italian hate speech detection, in: Proceedings of the Seventh Evaluation Campaign of Natural Language Processing and Speech Tools for Italian Final Workshop, EVALITA - December 17th, 2020, volume 2765, CEUR-WS, 2020, pp. 142–147. URL: http://ceur-ws.org/Vol-2765/paper135.pdf.
[3] M. Sanguinetti, G. Comandini, E. D. Nuovo, S. Frenda, M. Stranisci, C. Bosco, T. Caselli, V. Patti, I. Russo, Haspeede 2 @ EVALITA2020: Overview of the EVALITA 2020 hate speech detection task, in: V. Basile, D. Croce, M. D. Maro, L. C. Passaro (Eds.), Proceedings of the Seventh Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2020), Online event, December 17th, 2020, volume 2765 of CEUR Workshop Proceedings, CEUR-WS, 2020. URL: http://ceur-ws.org/Vol-2765/paper162.pdf.
[4] M. Taulé, A. Ariza, M. Nofre, E. Amigó, P. Rosso, Overview of DETOXIS at IberLEF 2021: DEtection of TOXicity in comments In Spanish, Procesamiento del Lenguaje Natural 67 (2021) 209–221. URL: http://journal.sepln.org/sepln/ojs/ojs/index.php/pln/article/view/6390.
[5] A. Ariza-Casabona, W. S. Schmeisser-Nieto, M. Nofre, M. Taulé, E. Amigó, B. Chulvi, P. Rosso, Overview of DETESTS at IberLEF 2022: DETEction and classification of racial STereotypes in Spanish, Procesamiento del Lenguaje Natural 69 (2022) 217–228.
[6] G. W. Allport, K. Clark, T. Pettigrew, The nature of prejudice, Addison-Wesley, Reading, MA, 1954.
[7] S. T. Fiske, Stereotyping, prejudice, and discrimination, in: The Handbook of Social Psychology, Vols. 1-2, 4th Ed, McGraw-Hill, New York, NY, US, 1998, pp. 357–411.
[8] A. G. Greenwald, M. R. Banaji, Implicit social cognition: Attitudes, self-esteem, and stereotypes, Psychological Review 102 (1995) 4–27. URL: http://faculty.washington.edu/agg/pdf/Greenwald_Banaji_PsychRev_1995.OCR.pdf. doi:10.1037/0033-295x.102.1.4.
[9] K. A. Collins, R. Clément, Language and prejudice: direct and moderated effects, Journal of Language and Social Psychology 31 (2012) 376–396. URL: http://journals.sagepub.com/doi/10.1177/0261927X12446611. doi:10.1177/0261927X12446611.
[10] F. D’Errico, M. Paciello, Online moral disengagement and hostile emotions in discussions on hosting immigrants, Internet Research 28 (2018) 1313–1335. URL: https://www.emerald.com/insight/content/doi/10.1108/IntR-03-2017-0119/full/html. doi:10.1108/IntR-03-2017-0119.
[11] U. Quasthoff, The uses of stereotype in everyday argument, Journal of Pragmatics 2 (1978) 1–48.
[12] C. J. Beukeboom, C. Finkenauer, D. H. J. Wigboldus, The negation bias: When negations signal stereotypic expectancies, Journal of Personality and Social Psychology 99 (2010) 978–992. URL: http://doi.apa.org/getdoi.cfm?doi=10.1037/a0020861. doi:10.1037/a0020861.
[13] T. F. Pettigrew, R. W. Meertens, Subtle and blatant prejudice in western Europe, European Journal of Social Psychology 25 (1995) 57–75. URL: https://onlinelibrary.wiley.com/doi/10.1002/ejsp.2420250106. doi:10.1002/ejsp.2420250106.
[14] W. S. Schmeisser-Nieto, M. Nofre, M. Taulé, Criteria for the annotation of implicit stereotypes, in: Proceedings of the Thirteenth Language Resources and Evaluation Conference (LREC 2022), 2022, pp. 753–762.
[15] C. Bosco, F. Dell’Orletta, F. Poletto, M. Sanguinetti, M. Tesconi, Overview of the evalita 2018 hate speech detection task, in: EVALITA 2018-Sixth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian, volume 2263, CEUR, 2018, pp. 1–9.
[16] M. Sanguinetti, G. Comandini, E. di Nuovo, S. Frenda, M. Stranisci, C. Bosco, T. Caselli, V. Patti, I. Russo, Haspeede 2 @ EVALITA2020: Overview of the EVALITA 2020 hate speech detection task, in: V. Basile, D. Croce, M. Di Maro, L. Passaro (Eds.), Proceedings of the Seventh Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2020), volume 2765, CEUR Workshop Proceedings (CEUR-WS.org), 2020. Conference date: 17-12-2020.
[17] F. Rodríguez-Sánchez, J. C. de Albornoz, L. Plaza, J. Gonzalo, P. Rosso, M. Comet, T. Donoso, Overview of EXIST 2021: sexism identification in social networks, Procesamiento del Lenguaje Natural 67 (2021) 195–207. URL: http://journal.sepln.org/sepln/ojs/ojs/index.php/pln/article/view/6389.
[18] F. Rodríguez-Sánchez, J. C. de Albornoz, L. Plaza, A. Mendieta-Aragón, G. Marco-Remón, M. Makeienko, M. Plaza, J. Gonzalo, D. Spina, P. Rosso, Overview of EXIST 2022: sexism
https://aclanthology.org/E17-1025.
[24] G. Corbelli, P. G. Cicirelli, F. D’Errico, M. Paciello, Preventing prejudice emerging from misleading news among adolescents: The role of implicit activation and regulatory self-efficacy in dealing with online misinformation, Social Sciences 12 (2023).
[25] F. D’Errico, P. G. Cicirelli, G. Corbelli, M. Paciello, Addressing racial misinformation at school: A psycho-social intervention aimed at reducing ethnic moral disengagement in adolescents, Social Psychology of Education (2023).
[26] C. Bosco, V. Patti, S. Frenda, A. T. Cignarella, M.
Paciello, F. D’Errico, Detecting racial identification in social networks, Procesamiento stereotypes: An italian social media corpus del Lenguaje Natural 69 (2022) 229–240. URL: where psychology meets nlp, Information http://journal.sepln.org/sepln/ojs/ojs/index.php/ Processing & Management 60 (2023) 103118. pln/article/view/6443. URL: https://linkinghub.elsevier.com/retrieve/pii/ [19] L. Plaza, J. Carrillo-de Albornoz, R. Morante, S0306457322002199. doi:10.1016/j.ipm.2022. E. Amigó, J. Gonzalo, D. Spina, P. Rosso, Overview 103118. of exist 2023–learning with disagreement for sex- [27] T. Bourgeade, A. T. Cignarella, S. Frenda, M. Lau- ism identification and characterization, in: Inter- rent, W. Schmeisser-Nieto, F. Benamara, C. Bosco, national Conference of the Cross-Language Eval- V. Moriceau, V. Patti, M. Taulé, A Multilingual uation Forum for European Languages, Springer, Dataset of Racial Stereotypes in Social Media Con- 2023, pp. 316–342. versational Threads, in: Findings of the Association [20] W. S. Schmeisser-Nieto, P. Pastells, S. Frenda, for Computational Linguistics: EACL 2023, Asso- M. Taule, Human vs. machine perceptions on ciation for Computational Linguistics, Dubrovnik, immigration stereotypes, in: N. Calzolari, M.-Y. Croatia, 2023, pp. 686–696. Kan, V. Hoste, A. Lenci, S. Sakti, N. Xue (Eds.), [28] E. Chierchiello, T. Bourgeade, G. Ricci, C. Bosco, Proceedings of the 2024 Joint International Con- F. D’Errico, Studying reactions to stereotypes in ference on Computational Linguistics, Language teenagers: an annotated italian dataset, in: Proceed- Resources and Evaluation (LREC-COLING 2024), ings of the Fourth Workshop on Threat, Aggression ELRA and ICCL, Torino, Italia, 2024, pp. 8453–8463. and Cyberbullying (TRAC-2024), 2014. URL: https://aclanthology.org/2024.lrec-main.741. [29] G. Lakoff, Women, fire, and dangerous things: What [21] A. Fokkens, N. Ruigrok, C. Beukeboom, G. Sarah, categories reveal about the mind, University of W. 
Van Atteveldt, Studying muslim stereotyping Chicago Press, Chicago, 1987. through microportrait extraction, in: Proceedings [30] Y. Huang, Implicitness in the lexis, in: P. Cap, of the Eleventh International Conference on Lan- M. Dynel (Eds.), Implicitness: From lexis to dis- guage Resources and Evaluation (LREC 2018), 2018, course, John Benjamins, Amsterdam/ Philadelphia, pp. 3734–3741. 2017, pp. 67–94. [22] J. J. Sánchez-Junquera, B. Chulvi, P. Rosso, [31] C. Lyons, Definiteness, Cambridge University Press, S. P. Ponzetto, How do you speak about im- Cambridge, 1999. migrants? taxonomy and stereoimmigrants [32] S. Frenda, V. Patti, P. Rosso, Killing me softly: dataset for identifying stereotypes about im- Creative and cognitive aspects of implicitness migrants, Applied Sciences 11 (2021). URL: in abusive language online, Natural Language https://www.mdpi.com/2076-3417/11/8/3610. Engineering 29 (2023) 1516–1537. doi:10.1017/ doi:10.3390/app11083610. S1351324922000316. [23] J. Karoui, F. Benamara, V. Moriceau, V. Patti, [33] S. Frenda, V. Patti, P. Rosso, When sarcasm hurts: C. Bosco, N. Aussenac-Gilles, Exploring the im- Irony-aware models for abusive language detec- pact of pragmatic phenomena on irony detection tion, in: A. Arampatzis, E. Kanoulas, T. Tsikrika, in tweets: A multilingual corpus study, in: M. Lap- S. Vrochidis, A. Giachanou, D. Li, M. Aliannejadi, ata, P. Blunsom, A. Koller (Eds.), Proceedings of the M. Vlachos, G. Faggioli, N. Ferro (Eds.), Experimen- 15th Conference of the European Chapter of the tal IR Meets Multilinguality, Multimodality, and Association for Computational Linguistics: Volume Interaction, Springer Nature Switzerland, Cham, 1, Long Papers, Association for Computational Lin- 2023, pp. 34–47. guistics, Valencia, Spain, 2017, pp. 262–272. URL: