=Paper=
{{Paper
|id=Vol-3878/49_main_long
|storemode=property
|title=The Vulnerable Identities Recognition Corpus (VIRC) for Hate Speech Analysis
|pdfUrl=https://ceur-ws.org/Vol-3878/49_main_long.pdf
|volume=Vol-3878
|authors=Ibai Guillén-Pacho,Arianna Longo,Marco Antonio Stranisci,Viviana Patti,Carlos Badenes-Olmedo
|dblpUrl=https://dblp.org/rec/conf/clic-it/Guillen-PachoLS24
}}
==The Vulnerable Identities Recognition Corpus (VIRC) for Hate Speech Analysis==
Ibai Guillén-Pacho1,*,†, Arianna Longo2,3,†, Marco Antonio Stranisci2,3, Viviana Patti2 and Carlos Badenes-Olmedo1,4

1 Ontology Engineering Group, Universidad Politécnica de Madrid, Spain
2 University of Turin, Italy
3 Aequa-tech, Torino, Italy (aequa-tech.com)
4 Computer Science Department, Universidad Politécnica de Madrid, Spain
Abstract
This paper presents the Vulnerable Identities Recognition Corpus (VIRC), a novel resource designed to enhance hate speech analysis in
Italian and Spanish news headlines. VIRC comprises 880 headlines, manually annotated for vulnerable identities, dangerous discourse,
derogatory expressions, and entities. Our experiments reveal that recent large language models (LLMs) struggle with the fine-grained
identification of these elements, underscoring the complexity of detecting hate speech. VIRC stands out as the first resource of its kind
in these languages, offering a richer annotation scheme compared to existing corpora. The insights derived from VIRC can inform
the development of sophisticated detection tools and the creation of policies and regulations to combat hate speech on social media,
promoting a safer online environment. Future work will focus on expanding the corpus and refining annotation guidelines to further
enhance its comprehensiveness and reliability.
Keywords
hate speech, vulnerable identities, annotated corpora
CLiC-it 2024: Tenth Italian Conference on Computational Linguistics, Dec 04–06, 2024, Pisa, Italy
* Corresponding author.
† These authors contributed equally.
ibai.guillen@upm.es (I. Guillén-Pacho); arianna.longo401@edu.unito.it (A. Longo); marcoantonio.stranisci@unito.it (M. A. Stranisci); viviana.patti@unito.it (V. Patti); carlos.badenes@upm.es (C. Badenes-Olmedo)
https://iguillenp.github.io/ (I. Guillén-Pacho); https://marcostranisci.github.io/ (M. A. Stranisci); https://www.unito.it/persone/vpatti (V. Patti); https://about.me/cbadenes (C. Badenes-Olmedo)
ORCID: 0000-0001-7801-8815 (I. Guillén-Pacho); 0009-0005-8500-1946 (A. Longo); 0000-0001-9337-7250 (M. A. Stranisci); 0000-0001-5991-370X (V. Patti); 0000-0002-2753-9917 (C. Badenes-Olmedo)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1. Introduction

Hate Speech (HS) detection is a task with a high social impact. Developing technologies that are able to recognize these forms of discrimination is not only crucial to enforce existing laws, but also supports important tasks such as the moderation of social media content. However, recognizing HS is challenging: verbal discrimination takes different forms and involves a number of correlated phenomena that make it difficult to reduce HS to a binary classification.

Analyzing the recent history of corpora annotated for HS, it is possible to observe a shift from very broad categorizations of hateful content to increasingly detailed annotation schemes aimed at understanding the complexity of this phenomenon. High-level schemes including dimensions like "hateful/offensiveness" [1] or "sexism/racism" [2] paved the way for more sophisticated attempts to formalize such concepts in different directions: exploring the interaction between HS and vulnerable targets [3, 4, 5]; studying the impact of subjectivity [6, 7]; identifying the triggers of HS in texts [8, 9].

Despite this trend, the complex semantics of HS in texts is far from fully explored. Information Extraction (IE) approaches to HS annotation have rarely been implemented yet; therefore, corpora that include a fine-grained, structured semantic representation of HS incidents are not available. The only notable exception is the recent work of Büyükdemirci et al. [10], which treats the identification of HS targets as a span-based task.

In order to fill this gap, we present the Vulnerable Identities Recognition Corpus (VIRC): a dataset of 880 Italian and Spanish headlines against migrants aimed at providing an event-centric representation of HS against vulnerable groups. The annotation scheme is built on four elements:

• Named Entity Recognition (NER). All the named entities that are involved in a HS expression: 'location', 'organization', and 'person'.
• Vulnerable Identity mentions. Generic mentions related to identities targeted by HS as defined by international regulatory frameworks¹: 'women', 'LGBTQI', 'ethnic minority', and 'migrant'.
• Derogatory mentions. All mentions that negatively portray people belonging to vulnerable groups.
• Dangerous speech. The part of the message that is perceived as hateful against named entities or vulnerable identities.

In this paper we present a preliminary annotation experiment intended to validate the scheme and to assess the impact of disagreement in such a fine-grained task. The paper is structured as follows. In Section 2, we discuss related work; in Section 3, we describe the methodology used; in Section 4, we introduce the VIRC corpus; and in Section 5, we present the conclusions and discuss possible future work.

¹ https://www.coe.int/en/web/combating-hate-speech/recommendation-on-combating-hate-speech

2. Related Work

Literature on automatic HS detection is vast and follows different research directions [11]: from the analysis of subjectivity in the perception of this phenomenon [12] to the definition of ever more refined categorizations of hateful contents [13]. In this section we focus on the approaches to HS detection that are aimed at studying the target of HS, inspired by Information Extraction (IE) approaches. In Section 2.1 we review HS resources inspired by this approach with a specific focus on span-based annotated corpora. In Section 2.2 we discuss the implementation of NER-based techniques in the creation of HS corpora.

2.1. Hate Speech Detection

A large amount of work on HS detection focuses on classification, both binary (presence or absence of HS) and multi-label (misogyny, racism, xenophobia, etc.). This has led to the existence of large collections of datasets such as those grouped by [14]. One of the main problems is that most resources are in English, and for mid-to-low resource languages (e.g., Italian), some HS categories are not covered. This constraint is mitigated by cross-lingual transfer learning, which exploits resources in other languages [15]; although good results are achieved, the creation of resources for these languages is still necessary.

The main resources for the identification of HS focus on a particular target and identify the presence or absence of HS towards it. An example is the work of [16], where 1,100 tweets in Italian, with a special focus on immigrants, were annotated according to the presence of HS, irony, and the stance of the message's author on immigration matters. Recently, however, there has been an increasing focus on identifying hateful expressions and their intended targets. This change in paradigm suggests that resources should be wider in scope and not focus on a particular discourse target. The main resources in this field have high linguistic diversity, although they do not all follow the same annotation scheme, with English being the most common language. We have found works in English [17]; Vietnamese [18]; Korean [19]; English and Turkish [10]; and English, French, and Arabic [20]. However, we have not found any in Italian or Spanish, which we believe makes this work the first to cover these languages for this task.

Two main annotation approaches can be drawn from these studies: those that annotate at the span level [17, 18, 19, 10] and those that annotate over the full text [20]. On the one hand, the work that follows the latter approach presents a corpus of 13,000 tweets (5,647 English, 4,014 French, and 3,353 Arabic) and notes the sentiment of the annotator (shock, sadness, disgust, etc.), hostility type (abusive, hateful, offensive, etc.), directness (direct or indirect), target attribute (gender, religion, disabled, etc.), and target group (individual, women, African, etc.).

On the other hand, works that follow the span-annotation approach adopt different annotation criteria. The simplest, [17, 18], annotate only one dimension. The first, [17], annotates the parts that make a comment toxic in 30,000 English comments from the Civil Comments platform. The second, [18], annotates only the parts that make a comment offensive or hateful in 11,000 Vietnamese comments from Facebook and YouTube. The other papers, [19, 10], extend this approach and also label the span in which the target of the attack is mentioned. Moreover, [19] is not limited to that; they also annotate the target type (individual, group, other), the target attribute (gender, race, ethnicity, etc.), and the target group (LGBTQ+, Muslims, feminists, etc.). Their final corpus has 20,130 annotated offensive Korean-language news and video comments.

However, the guidelines used by the different works sometimes present incompatibilities. Although some works use offensive and hateful labels in the same way [19, 18], others distinguish between these two types of expression [10]. This last resource has separately annotated hateful and offensive expressions, totaling 765 tweets in English and 765 tweets in Turkish.

2.2. Named Entity Recognition

Developed as a branch of Information Extraction (IE), Named Entity Recognition (NER) is a field of research aimed at detecting named entities in documents according to different schemes. Following the review of Jehangir et al. [21], it is possible to observe general-purpose schemes, which usually include entities of the types 'person', 'location', 'organization', and 'time', and schemes defined for specific applications. OntoNotes [22] is an example of the first type of approach: a broad collection of documents gathered from different sources (e.g., newspapers, television news) annotated with a tagset that includes general categories of named entities. On the other hand, more specific applications include biomedical NER, which focuses on identifying entities relevant to the biomedical field, such as diseases, genes, and chemicals. An example in this field is the JNLPBA dataset [23], which is derived from the GENIA corpus. This dataset consists of 2,000 biomedical abstracts from the MEDLINE database, annotated with detailed entity types such as proteins, DNA, RNA, cell lines, and cell types.

NER-based approaches for HS detection and analysis are still few. ElSherief et al. [24] exploited Twitter users' mentions to distinguish between directed and generalized forms of HS. Rodríguez-Sánchez et al. [25] used derogatory expressions about women as seeds to collect misogynist messages according to a fine-grained classification of this phenomenon. [26] adopted a similar methodology to collect tweets about three groups vulnerable to discrimination: ethnic minorities, religious minorities, and Roma communities. Piot et al. [14] analyzed the correlation between the presence of HS and named entities in 60 existing datasets. Despite these previous works, there are no attempts to define a NER-based scheme specifically intended for HS detection. Our work represents an attempt to fill this gap by combining categories from general-purpose NER and a taxonomy of groups vulnerable to discrimination in a common annotation scheme aimed at providing deeper insights about the targets of HS.

3. Methodology

3.1. Data Collection

We collect news from public Telegram channels with the telegram-dataset-builder [27]. The selected channels, shown in Table 1, are in Spanish and Italian and aligned with the left and right wings of the political spectrum. The subset of Italian headlines was integrated with titles published on newspapers' Facebook pages, collected in collaboration with the Italian Amnesty Task Force on HS, a group of activists that produces counter-narratives against discriminatory content spread by online newspapers and user comments². We collected all the news headlines detected by activists in March 2020, 2021, 2022, and 2023, and added them to our corpus.

² https://www.amnesty.it/entra-in-azione/task-force-attivismo/

Given the large amount of news collected, we applied filters to the dataset to reduce it to its final size. We focus on news about racism; for this purpose, we applied the classifier piuba-bigdata/beto-contextualized-hate-speech to keep only news items labeled as racism. Since this classifier is trained on Spanish
texts, prior to this step we automatically translated the Italian news with the model facebook/nllb-200-distilled-600M. This translation step is used only for the filtering process; once the news is selected, the translated text is no longer used. In the end, this process generated 532 news headlines classified as racist for Italian and 348 for Spanish, which were selected for the annotation task.

Table 1: Telegram channels from which the news have been extracted.

           Left-wing                                 Right-wing
Spanish    elpais_esp, smolny7                       MediterraneoDGT, elmundoes
Italian    ByobluOfficial, terzaroma, marsadefenza   cellopamio, ilprimatonazionaleIPN, VoxNewsInfo

3.2. Data Annotation

A comprehensive, span-based annotation scheme was developed to label vulnerable identities and entities present in the dataset. Annotators were provided with instructions and had to choose a label and highlight the word, phrase, or portion of text that best embodied the qualities of the chosen label. It was possible to choose more than one label for the same portion of text. The instructions also provided annotators with some examples of annotated headlines.

The initial layer of annotation focuses on identifying vulnerable targets within the text and categorizing them into one of six predefined labels: ethnic minority, migrant, religious minority, women, LGBTQ+ community, and other. These labels represent vulnerable groups, as the vulnerability of the targets can often be traced back to their belonging to certain categories of people that are particularly exposed to discrimination, marginalisation, or prejudice in society. In cases where the targeted group did not fit into one of the predefined labels, annotators were required to use the 'other' category. For instances labeled as 'other', annotators were then instructed to provide specific details regarding the group in a free-text field.

After categorizing vulnerable targets, the second layer involves annotating named entities. Annotators identify entities within the text and label them with one of five possible types: person, group, organization, location, and other. As in the first layer, instances labelled 'other' require annotators to provide details about the entity in a free-text field.

The final layers of the annotation scheme address the context in which these entities are mentioned, specifically focusing on identifying derogatory mentions and dangerous speech.

A derogatory mention is characterized by negative or disparaging remarks about the target. In these instances, explicit hate speech is absent, but the mention itself is discriminatory or offensive, often employing a tone intended to belittle or discredit the target. The label derogatory is used to mark these mentions.

Moreover, the annotation includes identifying dangerous elements: portions of text that, intentionally or unintentionally, could incite hate speech or increase the vulnerability of the target identity. Dangerous speech, which can be either explicit or implicit, promotes or perpetuates negative prejudices and stereotypes, potentially triggering harmful responses against the group. The label dangerous [28] is used to tag these segments. Annotators were encouraged to use free-text fields to provide details on implicit dangerous speech or recurring dangerous concepts.

The annotation guidelines provided annotators with specific criteria and with the following list of potential markers of dangerous speech to help their identification:

• Incitement to violence: the text explicitly encourages violence against the target group;
• Open discrimination: the text openly states or supports discrimination against the target group;
• Ridicule: the text ridicules the target in the eyes of the readers by belittling or mocking it;
• Stereotyping: the text perpetuates negative stereotypes about the target group, contributing to a distorted view of it;
• Disinformation: the text spreads false or misleading information that can harm the target group;
• Dehumanization: the text dehumanizes the target group, using language that equates it with objects or animals;
• Criminalization: the text portrays the target group as inherently criminal or associates it with illegal activities, contributing to the perception that the group as a whole is dangerous.

However, a text may still be considered dangerous even if it does not explicitly include these markers, as they are intended as examples rather than strict requirements.

Figure 1: Examples of annotated headlines.
"Migranti, un esercito di scrocconi: 120mila mantenuti con l'8 per mille degli italiani."³
"Hordas de gitanos arrasan Mercadona después de que les ingresen 3000 euros en sus 'tarjetas solidarias'."⁴
"Questa è Villa Aldini, la residenza di lusso che ospita i migranti stupratori a Bologna."⁵
Legend: Vulnerable identity - Migrants; Vulnerable identity - Ethnic minority; Derogatory; Dangerous speech; Entity - Location; Entity - Organization.

³ "Migrants, an army of scroungers: 120,000 supported by the Italians' 8x1000 tax allocation".
⁴ "Hordes of gypsies devastate Mercadona after 3000 euros were deposited in their 'solidarity cards'".
⁵ "This is Villa Aldini, the luxury residence that hosts rapist migrants in Bologna".

Figure 1 provides three examples of annotated headlines, two in Italian and one in Spanish, showing the application of the annotation scheme as described. In the figure, different colours highlight the various types of labels used. A vulnerable identity was detected in each headline: 'Migranti' in the first and in the third one and 'gitanos' in the second one, respectively labelled as 'vulnerable group - migrant' and
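The filtering pipeline of Section 3.1 (translate Italian headlines to Spanish, classify every headline, keep only those labeled as racism, discard the translations) can be sketched as follows. This is a minimal illustration: `translate_it_to_es` and `classify_racism` are hypothetical stand-ins for the facebook/nllb-200-distilled-600M and piuba-bigdata/beto-contextualized-hate-speech models, not the authors' actual code.

```python
def filter_racism_headlines(headlines, translate_it_to_es, classify_racism):
    """Keep only headlines classified as racism.

    Italian headlines are translated to Spanish first, because the
    classifier is trained on Spanish text; the translation is used
    only for filtering and then discarded.
    """
    selected = []
    for text, lang in headlines:
        to_classify = translate_it_to_es(text) if lang == "it" else text
        if classify_racism(to_classify):
            selected.append((text, lang))  # keep the original text, not the translation
    return selected

# Toy stand-ins for the real models (hypothetical):
fake_translate = lambda text: f"[es] {text}"
fake_classifier = lambda text: "migrant" in text or "migrante" in text

corpus = [("migrante headline", "it"), ("weather report", "es"), ("migrant news", "es")]
print(filter_racism_headlines(corpus, fake_translate, fake_classifier))
```

With the real models, the two stand-ins would be replaced by calls to the translation and text-classification checkpoints named above.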
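One way to represent the layered, possibly multi-label span annotations of the scheme in Section 3.2 is as simple records over character offsets. The dataclass below is an illustrative sketch, not the project's actual data model.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SpanAnnotation:
    """One annotator's label over a character span of a headline."""
    label: str                     # e.g. 'vulnerable group - migrant', 'dangerous'
    start: int                     # character offsets into the headline text
    end: int
    comment: Optional[str] = None  # free-text detail, e.g. for 'other' labels

headline = "Più di 200mila case popolari agli immigrati"
# The same headline can carry several labels, possibly on overlapping spans:
annotations = [
    SpanAnnotation("vulnerable group - migrant", 34, 43),  # 'immigrati'
    SpanAnnotation("dangerous", 0, 14),                    # 'Più di 200mila'
]
print(annotations[0].label, headline[annotations[0].start:annotations[0].end])
```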
'vulnerable group - ethnic minority'. The three examples all contain multiple elements of dangerous speech, highlighted in red, and the second text also contains an element which was marked with the derogatory label. Additionally, the second and the third headlines include examples of annotation for named entities, with 'Mercadona' labelled as 'entity - organization', and 'Villa Aldini' and 'Bologna' labelled as 'entity - location'.

4. The VIRC Corpus

The VIRC corpus is a collection of 532 Italian and 348 Spanish news headlines annotated by two independent annotators for each language. Following the perspectivist paradigm [29], we release both the disaggregated annotations and the gold-standard corpus. The code used to generate the gold-standard corpus, carry out experiments, and compile statistics can be accessed through the following GitHub repository⁶. In this section we present an analysis of disagreement (Section 4.1) and relevant statistics about the corpus (Section 4.2).

⁶ https://github.com/oeg-upm/virc

4.1. Inter-Annotator Agreement

Since the span-based annotation task does not provide a fixed number of annotated items, we adopted the F-score metric to evaluate the agreement between annotators [30]. For each subset of the corpus we randomly chose one annotator as the gold-standard set of labels and the other as the set of predictions. We then computed the F-score between the two distributions of labels in order to measure the agreement between the annotators. Table 2 shows the results of our analysis. In general, annotations always showed a fair or higher agreement, except for some entity-related labels and the 'derogatory' one. There is also a low agreement in the Italian set on the labels 'religious minority' and 'women'.

Table 2: The annotators' agreement measured through the F-score and broken down by label.

                                         IAA (F-score)
                                         Spanish   Italian
dangerous                                0.49      0.57
derogatory                               0.08      0.28
entity - group                           0.00      0.00
entity - location                        0.66      0.60
entity - organization                    0.41      0.12
entity - other                           0.00      0.10
entity - person                          0.47      0.63
vulnerable entity                        0.15      0.00
vulnerable group - ethnic minority       0.83      0.63
vulnerable group - lgbtq+ community      -         0.80
vulnerable group - migrant               0.96      0.86
vulnerable group - other                 0.46      0.41
vulnerable group - religious minority    1.00      0.00
vulnerable group - women                 0.60      0.22

Although the overall results are positive, they show significant variations that can be analyzed both quantitatively and qualitatively. The inclusion of overlapping spans was handled as follows: if one span fully included another, this was considered an agreement. In cases where the spans only partially overlapped, meaning there was some shared text but not full inclusion, this was treated as a partial agreement. For example, if one annotator labeled "All women" and another selected only "women", this would be a full agreement (1 true positive); however, if the latter selected "women of Italy", it would be a partial agreement (0.5 true positive).

Quantitative Analysis. The agreement on the annotation of entities is always moderate but differs between the Spanish and the Italian subsets. Annotators of Spanish headlines scored a higher agreement on 'location' (0.66 vs 0.60), 'vulnerable' (0.15 vs 0), and 'organization' (0.41 vs 0.12), while entities of the types 'person' (0.63 vs 0.47) and 'other' (0.1 vs 0) are better recognized in Italian headlines.

On average, the annotation of vulnerable identities resulted in a higher agreement between annotators in both subsets, and at the same time confirmed a higher agreement for Spanish annotations, which consistently outperform the Italian ones. The highest agreement emerges for the label 'migrant', on which annotators obtained an F-score of 0.86 for Italian and 0.96 for Spanish. The agreement on 'ethnic minority' is a bit lower but still significant; while Spanish headlines reached an F-score of 0.83, Italian ones reached only 0.63. An equally high agreement is observed on the 'lgbtq+' label, which is only present in Italian headlines, with an F-score of 0.8. Among vulnerable groups, women scored the lowest F-score: 0.6 for Spanish, 0.22 for Italian. The largest observed discrepancy is with religious minorities: in Spanish an F-score of 1 is achieved, while in Italian it is 0.

While the annotation of 'dangerous' spans achieves an acceptable agreement, the 'derogatory' annotation achieves the lowest agreement between annotators. Additionally, annotations of Italian headlines resulted in higher disagreement than Spanish ones, contrary to what we observed for 'entities' and 'vulnerable identities'. Text spans expressing dangerous speech are recognized with an agreement of 0.57 for Italian and 0.49 for Spanish headlines. Agreement on 'derogatory' is low for Italian headlines (0.28), while Spanish ones show almost no agreement (0.08).

Qualitative Analysis. In summary, while the overall results of the annotation are positive, some categories show significant disagreement between annotators. These disagreements highlight the need to review and refine the annotation guidelines for problematic categories, and to provide more detailed instructions. The importance of reassessing the guidelines in order to make them clearer and more consistent is further underscored by the fact that, for Spanish headlines, the annotators agreed on both labels and intervals in only 67 cases, and for Italian headlines, agreement was reached in just 88 cases.

Since the annotation task was span-based, we opted not to use a confusion matrix to analyze the disagreement. A confusion matrix is not appropriate for span detection, as it assumes discrete labels applied to predefined items, whereas our task involved labeling spans of text that varied in length and context. Instead, we performed a qualitative analysis, examining specific cases of disagreement to understand their nature. This approach allowed us to explore not only how annotators differed in labeling spans but also why these differences emerged, providing a deeper insight into the underlying issues of interpretation and guidelines.

Looking more closely at the headlines where the annotations present inconsistencies, a variety of motivations behind the discrepancies can be identified.

For instance, in the Italian title "Orrore nella casa occupata dagli immigrati: donna lanciata giù dal secondo piano"⁷, 'donna' was marked as a vulnerable identity by only one of the annotators, suggesting a possibly erroneous focus on a single target at a time ('immigrati') by the other annotator.

⁷ "Atrocity in a house occupied by migrants: woman thrown from second floor".

Another type of disagreement relates to the interpretation of derogatory mentions. An example can be found in "Un terzo dei reati sono commessi da stranieri (e gli africani hanno il record). Tutti i numeri"⁸, where one annotator identified the term 'stranieri' as a derogatory mention as well as representative of a vulnerable identity, while the other annotator simply stuck to the second label, perhaps highlighting a divergence in the interpretation of the guidelines. Furthermore, it is interesting to observe the disagreement created by headlines that use the generic term 'stranieri' ('foreigners'), which was often labelled as 'vulnerable identity - ethnic minority' by one annotator and as 'vulnerable identity - migrant' by the other. This inconsistency between annotators can be identified in two headlines: "Ius soli e cittadinanza facile agli stranieri? Il sangue non è acqua"⁹ and "Un terzo dei reati sono commessi da stranieri (e gli africani hanno il record). Tutti i numeri"⁸. In the first case, we can resolve the disagreement by looking at the context: the explicit reference to the issue of granting citizenship suggests that the term 'foreigners' more appropriately refers to the specific category of migrants. In the second headline, on the other hand, there is no direct reference to specifically migration-related issues, and thus both interpretations in terms of the vulnerable category of belonging are acceptable.

⁸ "One third of all crimes are committed by foreigners (and Africans hold the record). All the numbers".
⁹ "Ius soli and easy citizenship for foreigners? Blood is not water".

Finally, some texts present a slight difference in the chosen annotation spans, as observed in "Più di 200mila case popolari agli immigrati"¹⁰, where the annotators identified dangerous speech in the same section of text but with differences in the number of highlighted words (the first annotator labelled 'Più di 200mila'; the second annotator labelled '200mila case popolari'), reflecting variations in the identification of relevant content for the analysis of dangerous speech.

¹⁰ "More than 200,000 public housing units for immigrants".

In addition to the predefined labels, we also collected free-text fields as part of the annotation process. These comments offered an additional layer of granularity, allowing annotators to describe nuances not covered by the fixed categories. For example, in the Spanish headline "Dos menas marroquíes apuñalan a dos turistas para robarles en Salou"¹¹, both annotators used the two labels 'vulnerable identity - ethnic minority' and 'vulnerable identity - other' to annotate the span 'menas marroquíes'. Alongside the 'other' label, one annotator provided the comment 'Under 18', while the other one used 'young people' to describe the vulnerable group. Although stated differently, both comments highlight the specific vulnerability related to the age of the group, complementing the existing labels. As this example shows, the flexibility provided by free-text fields in the annotation process is useful to capture multi-categorical terms and to identify potential new categories that may not have been initially considered among the predefined labels.

¹¹ "Two Moroccan unaccompanied migrant minors stab two tourists to rob them in Salou".

4.2. Dataset Analysis

In this section we provide an analysis of the four label types that occur in the gold-standard version of the VIRC corpus: 'derogatory', 'dangerous', 'named entities', and 'vulnerable groups'. The analysis is twofold: first, we describe the distribution of these label types; then we present a zero-shot and a few-shot experiment aimed at understanding whether existing LLMs (T5 [31] and BART [32]) are able to recognize these labeled spans in news headlines by comparing their outputs to the gold-standard annotations.

Corpus statistics. Table 3 shows the distribution of label types in the corpus.

Table 3: The distribution of labels in the gold-standard corpus.

                    Spanish   Italian
dangerous           136       166
derogatory          3         16
entities            140       146
vulnerable groups   270       253

As can be observed, mentions of vulnerable groups are the most frequent, with 270 occurrences in the Spanish subset and 253 in the Italian subset. This confirms the relevance of annotating vulnerable identities in the identification of discriminatory contents, which is tied to their high recognizability by annotators (Section 4.1). The role of named entities differs in the two subsets. Annotators labeled them with agreement 130 times in Spanish headlines and 67 times in Italian ones. This might be caused by their composition: since Italian headlines were partly collected from the Facebook pages of mainstream newspapers, there was a higher number of named entities that were not relevant for the analysis of the headlines' dangerousness. The number of text spans labeled as dangerous is almost equivalent in the two subsets (136 for Spanish, 166 for Italian), showing a good presence of this label type despite the high disagreement between annotators. Finally, it is worth mentioning the almost total absence of text spans labeled as 'derogatory' with agreement (3 for Spanish, 16 for Italian), which suggests the high subjectivity of this phenomenon and also the need to better define its characteristics in the annotation guidelines.

Corpus analysis with LLMs. We completed our analysis of the VIRC corpus through zero-shot experiments aimed at exploring the ability of existing LLMs to identify the four types of labelled spans in messages. We considered the detection of spans as an extractive Question Answering (QA) problem. For the task we adopted the T5 [31] and BART [32] LLM architectures for both languages. For Italian we employed the models of [33] and [34], and for Spanish those of [35] and [36], respectively. The translations of the prompts used are the following (see Appendix A for the original ones):

• What part of the text is dangerous (criminalizes, ridicules, incites violence, ...) against vulnerable identities (women, migrants, ethnic minorities, ...)?
• What part of the text is derogatory (negative or pejorative comments about the victim without explicit hate speech, but the mention itself is discriminatory or offensive, and often uses a tone intended to denigrate or discredit the victim)?
• What named entity is mentioned in the sentence?
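The span-level agreement measure of Section 4.1 (one annotator taken as gold, the other as predictions; full inclusion of one span in the other counted as 1 true positive, partial overlap as 0.5) can be sketched as follows. This is a simplified reconstruction that matches spans greedily by label and overlap; it is not the exact scoring script from the VIRC repository.

```python
def span_tp(gold, pred):
    """Score predicted spans against gold spans of the same label.

    A span is (label, start, end). Full inclusion of one span in the
    other counts as 1 true positive, partial overlap as 0.5.
    """
    tp, used = 0.0, set()
    for pl, ps, pe in pred:
        for i, (gl, gs, ge) in enumerate(gold):
            if i in used or pl != gl:
                continue
            overlap = min(pe, ge) - max(ps, gs)
            if overlap <= 0:
                continue
            contained = (ps <= gs and ge <= pe) or (gs <= ps and pe <= ge)
            tp += 1.0 if contained else 0.5
            used.add(i)
            break
    return tp

def agreement_f1(gold, pred):
    """F-score between the two annotators' span sets."""
    if not gold or not pred:
        return 0.0
    tp = span_tp(gold, pred)
    precision, recall = tp / len(pred), tp / len(gold)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

gold = [("women", 0, 9)]        # annotator A labelled "All women"
pred_full = [("women", 4, 9)]   # annotator B labelled "women": inclusion
pred_part = [("women", 4, 18)]  # annotator B labelled "women of Italy": partial
print(agreement_f1(gold, pred_full), agreement_f1(gold, pred_part))  # 1.0 0.5
```

The two toy cases reproduce the paper's "All women" example: inclusion yields a full agreement of 1.0, a partial overlap yields 0.5.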
                     Non-Restrictive Zero-Shot           Restrictive Zero-Shot
                     T5                BART              T5                BART
                     Spanish  Italian  Spanish  Italian  Spanish  Italian  Spanish  Italian
dangerous            0.39     0.28     0.43     0.39     0.49     0.47     0.51     0.43
derogatory           0.02     0.05     0.03     0.04     0.67     0.43     0.50     0.33
entity               0.28     0.11     0.23     0.23     0.40     0.30     0.30     0.27
vulnerable identity  0.63     0.19     0.41     0.48     0.56     0.18     0.35     0.37

Table 4
F-score results of zero-shot experiments on the VIRC corpus with T5 and BART models for each label.
• Which hate speech vulnerable identity is mentioned in the sentence?

We designed two approaches for the zero-shot experiments: restrictive and non-restrictive. On the one hand, in the non-restrictive zero-shot experiments, for each sentence in the dataset we queried the model with the prompt of each label and extracted the three most confident results. We then filtered out the responses whose model confidence fell below 0.02 to limit noise. Finally, all these annotations go through a majority vote (identical to the one used to build the aggregated dataset) to normalize the model response.

On the other hand, in the restrictive zero-shot experiments, we queried the model with the prompts for each annotation present in the aggregated dataset. Since some sentences carry the same label on two different spans, we request five different annotations from the model, ordered from most to least confident. If an annotation was already included, the next one in the ranking is taken, so that the model does not produce duplicate annotations.

Table 4 presents the F-scores for each label type, experiment, and model. In general, T5 and BART tend to perform more effectively in Spanish than in Italian. The models face noticeable challenges in identifying the labels ‘dangerous’, ‘derogatory’, and ‘entity’. Nevertheless, when they are aware that the label exists within the sentence (restrictive), they manage to recognize it with fairly good agreement. During annotation, the label ‘derogatory’ proved the most challenging to identify. In the non-restrictive scenario it scarcely receives any agreement, yet in the restrictive scenario it achieves a reasonable level, particularly in Spanish. This indicates that the model struggles to discern its presence initially but, once acknowledged, can recognise the expression.

The restrictive method improves performance over the non-restrictive method for all labels except ‘vulnerable identity’. This shows that models generally comprehend and identify vulnerable identities in sentences better without restrictions than when they are restricted to specific mentions. It should also be noted that, in the Spanish context, T5 is more effective than BART in identifying ‘vulnerable identity’ labels for both approaches, while BART performs better in Italian.

These results show that a NER-based annotation scheme for HS detection is difficult not only to annotate but also to detect automatically. Larger resources are necessary to develop models able to capture the complex semantics of HS.

5. Conclusions and Future Work

The Vulnerable Identities Recognition Corpus (VIRC), created in this work, reveals the challenge of identifying vulnerable identities due to the rapid evolution of language on social media. Our experiments indicate that large language models (LLMs) struggle significantly with this task.

VIRC provides a detailed and structured resource that enhances understanding of the extensive use of hate speech in Italian and Spanish news headlines. The corpus is particularly valuable as it includes more annotation dimensions than related studies in other languages, namely vulnerable identities, dangerous discourse, derogatory expressions, and entities. This differentiation between vulnerable identities and entities, as well as between dangerous and derogatory elements, enables the development of sophisticated detection tools that can facilitate large-scale actions to mitigate the impact of hate speech (e.g., moderation of messages and generation of counter-narratives that reduce the damage to the mental health of victims).

Future work will focus on expanding this resource by doubling the size of annotations for both languages and including non-racism-related phrases to ensure the resource is comprehensive. Additionally, we plan to refine the annotation guidelines to avoid low agreement on the derogatory label, enhancing the overall reliability and utility of the corpus. These efforts will further improve the effectiveness of hate speech detection and contribute to the development of policies and tools for a safer online environment.

Acknowledgments

This work is supported by the Predoctoral Grant (PIPF-2022/COM-25947) of the Consejería de Educación, Ciencia y Universidades de la Comunidad de Madrid, Spain. Arianna Longo’s work has been supported by aequa-tech. The authors gratefully acknowledge the Universidad Politécnica de Madrid (www.upm.es) for providing computing resources on the IPTC-AI Innovation Space AI Supercomputing Cluster.

References

[1] T. Davidson, D. Warmsley, M. Macy, I. Weber, Automated hate speech detection and the problem of offensive language, in: Proceedings of the International AAAI Conference on Web and Social Media, volume 11, 2017, pp. 512–515.
[2] Z. Waseem, Are you a racist or am I seeing things? Annotator influence on hate speech detection on Twitter, in: Proceedings of the First Workshop on NLP and Computational Social Science, 2016, pp. 138–142.
[3] M. ElSherief, C. Ziems, D. Muchlinski, V. Anupindi, J. Seybolt, M. De Choudhury, D. Yang, Latent hatred: A benchmark for understanding implicit hate speech, in: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021, pp. 345–363.
[4] B. Vidgen, D. Nguyen, H. Margetts, P. Rossini, R. Tromble, Introducing CAD: the contextual abuse dataset, in: Proceedings of the 2021 Conference of the
North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021, pp. 2289–2303.
[5] P. Chiril, E. W. Pamungkas, F. Benamara, V. Moriceau, V. Patti, Emotionally informed hate speech detection: A multi-target perspective, Cogn. Comput. 14 (2022) 322–352. URL: https://doi.org/10.1007/s12559-021-09862-5. doi:10.1007/S12559-021-09862-5.
[6] M. Sap, D. Card, S. Gabriel, Y. Choi, N. A. Smith, The risk of racial bias in hate speech detection, in: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 2019, pp. 1668–1678.
[7] P. Sachdeva, R. Barreto, G. Bacon, A. Sahn, C. Von Vacano, C. Kennedy, The measuring hate speech corpus: Leveraging Rasch measurement theory for data perspectivism, in: Proceedings of the 1st Workshop on Perspectivist Approaches to NLP @ LREC2022, 2022, pp. 83–94.
[8] B. Mathew, P. Saha, S. M. Yimam, C. Biemann, P. Goyal, A. Mukherjee, HateXplain: A benchmark dataset for explainable hate speech detection, in: Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, 2021, pp. 14867–14875.
[9] J. Pavlopoulos, J. Sorensen, L. Laugier, I. Androutsopoulos, SemEval-2021 task 5: Toxic spans detection, in: Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021), 2021, pp. 59–69.
[10] K. Büyükdemirci, I. E. Kucukkaya, E. Ölmez, C. Toraman, JL-Hate: An annotated dataset for joint learning of hate speech and target detection, in: N. Calzolari, M.-Y. Kan, V. Hoste, A. Lenci, S. Sakti, N. Xue (Eds.), Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), ELRA and ICCL, Torino, Italia, 2024, pp. 9543–9553.
[11] F. Poletto, V. Basile, M. Sanguinetti, C. Bosco, V. Patti, Resources and benchmark corpora for hate speech detection: a systematic review, Lang. Resour. Evaluation 55 (2021) 477–523. URL: https://doi.org/10.1007/s10579-020-09502-8. doi:10.1007/S10579-020-09502-8.
[12] E. Leonardelli, S. Menini, A. P. Aprosio, M. Guerini, S. Tonelli, Agreeing to disagree: Annotating offensive language datasets with annotators’ disagreement, in: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021, pp. 10528–10539.
[13] H. Kirk, W. Yin, B. Vidgen, P. Röttger, SemEval-2023 task 10: Explainable detection of online sexism, in: Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023), 2023, pp. 2193–2210.
[14] P. Piot, P. Martín-Rodilla, J. Parapar, MetaHate: A dataset for unifying efforts on hate speech detection, Proceedings of the International AAAI Conference on Web and Social Media 18 (2024) 2025–2039. URL: https://ojs.aaai.org/index.php/ICWSM/article/view/31445. doi:10.1609/icwsm.v18i1.31445.
[15] D. Nozza, F. Bianchi, G. Attanasio, HATE-ITA: Hate speech detection in Italian social media text, in: K. Narang, A. Mostafazadeh Davani, L. Mathias, B. Vidgen, Z. Talat (Eds.), Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH), Association for Computational Linguistics, Seattle, Washington (Hybrid), 2022, pp. 252–260. doi:10.18653/v1/2022.woah-1.24.
[16] M. Madeddu, S. Frenda, M. Lai, V. Patti, V. Basile, DisaggregHate It corpus: A disaggregated Italian dataset of hate speech, in: F. Boschetti, G. E. Lebani, B. Magnini, N. Novielli (Eds.), Proceedings of the Ninth Italian Conference on Computational Linguistics (CLiC-it 2023), volume 3596, 2023.
[17] J. Pavlopoulos, J. Sorensen, L. Laugier, I. Androutsopoulos, SemEval-2021 task 5: Toxic spans detection, in: A. Palmer, N. Schneider, N. Schluter, G. Emerson, A. Herbelot, X. Zhu (Eds.), Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021), Association for Computational Linguistics, Online, 2021, pp. 59–69. URL: https://aclanthology.org/2021.semeval-1.6. doi:10.18653/v1/2021.semeval-1.6.
[18] P. G. Hoang, C. D. Luu, K. Q. Tran, K. V. Nguyen, N. L.-T. Nguyen, ViHOS: Hate speech spans detection for Vietnamese, in: A. Vlachos, I. Augenstein (Eds.), Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, Association for Computational Linguistics, Dubrovnik, Croatia, 2023, pp. 652–669. URL: https://aclanthology.org/2023.eacl-main.47. doi:10.18653/v1/2023.eacl-main.47.
[19] Y. Jeong, J. Oh, J. Lee, J. Ahn, J. Moon, S. Park, A. Oh, KOLD: Korean offensive language dataset, in: Y. Goldberg, Z. Kozareva, Y. Zhang (Eds.), Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, Abu Dhabi, United Arab Emirates, 2022, pp. 10818–10833. URL: https://aclanthology.org/2022.emnlp-main.744. doi:10.18653/v1/2022.emnlp-main.744.
[20] N. Ousidhoum, Z. Lin, H. Zhang, Y. Song, D.-Y. Yeung, Multilingual and multi-aspect hate speech analysis, in: K. Inui, J. Jiang, V. Ng, X. Wan (Eds.), Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Association for Computational Linguistics, Hong Kong, China, 2019, pp. 4675–4684. URL: https://aclanthology.org/D19-1474. doi:10.18653/v1/D19-1474.
[21] B. Jehangir, S. Radhakrishnan, R. Agarwal, A survey on named entity recognition - datasets, tools, and methodologies, Natural Language Processing Journal 3 (2023).
[22] E. Hovy, M. Marcus, M. Palmer, L. Ramshaw, R. Weischedel, OntoNotes: the 90% solution, in: Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Short Papers, 2006, pp. 57–60.
[23] N. Collier, T. Ohta, Y. Tsuruoka, Y. Tateisi, J.-D. Kim, Introduction to the bio-entity recognition task at JNLPBA, in: N. Collier, P. Ruch, A. Nazarenko (Eds.), Proceedings of the International Joint Workshop on Natural Language Processing in Biomedicine and its Applications (NLPBA/BioNLP), COLING, 2004, pp. 73–78.
[24] M. ElSherief, V. Kulkarni, D. Nguyen, W. Y. Wang, E. Belding, Hate lingo: A target-based linguistic analysis of hate speech in social media, in: Proceedings of the International AAAI Conference on Web and Social Media, volume 12, 2018.
[25] F. Rodríguez-Sánchez, J. Carrillo-de Albornoz, L. Plaza, Automatic classification of sexism in social networks: An empirical study on Twitter data, IEEE Access 8 (2020) 219563–219576.
[26] M. Sanguinetti, F. Poletto, C. Bosco, V. Patti, M. Stranisci, An Italian Twitter corpus of hate speech against immigrants, in: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), 2018.
[27] I. Guillén-Pacho, oeg-upm/telegram-dataset-builder: version 1.0.0, 2024. URL: https://doi.org/10.5281/zenodo.12773159. doi:10.5281/zenodo.12773159.
[28] S. Benesch, Dangerous speech, 86272 12 (2023) 185–197.
[29] F. Cabitza, A. Campagner, V. Basile, Toward a perspectivist turn in ground truthing for predictive computing, in: Proceedings of the AAAI Conference on Artificial Intelligence, volume 37, 2023, pp. 6860–6868.
[30] T. Brants, Inter-annotator agreement for a German newspaper corpus, in: LREC, Citeseer, 2000.
[31] C. Raffel, N. Shazeer, A. Roberts, K. Lee, S. Narang, M. Matena, Y. Zhou, W. Li, P. J. Liu, Exploring the limits of transfer learning with a unified text-to-text transformer, Journal of Machine Learning Research 21 (2020) 1–67.
[32] M. Lewis, Y. Liu, N. Goyal, M. Ghazvininejad, A. Mohamed, O. Levy, V. Stoyanov, L. Zettlemoyer, BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension, 2019. URL: https://arxiv.org/abs/1910.13461. arXiv:1910.13461.
[33] G. Sarti, M. Nissim, IT5: Text-to-text pretraining for Italian language understanding and generation, in: N. Calzolari, M.-Y. Kan, V. Hoste, A. Lenci, S. Sakti, N. Xue (Eds.), Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), ELRA and ICCL, Torino, Italia, 2024, pp. 9422–9433. URL: https://aclanthology.org/2024.lrec-main.823.
[34] M. La Quatra, L. Cagliero, BART-IT: An efficient sequence-to-sequence model for Italian text summarization, Future Internet 15 (2023). URL: https://www.mdpi.com/1999-5903/15/1/15. doi:10.3390/fi15010015.
[35] V. Araujo, M. M. Trusca, R. Tufiño, M.-F. Moens, Sequence-to-sequence Spanish pre-trained language models, 2023. arXiv:2309.11259.
[36] V. Araujo, M. M. Trusca, R. Tufiño, M.-F. Moens, Sequence-to-sequence Spanish pre-trained language models, 2023. arXiv:2309.11259.

A. LLMs Prompts

The prompts used are the same for each model but differ for each language. For Spanish, the prompts used for each label are:

• Dangerous: “¿Qué parte del texto es peligroso (criminaliza, ridiculiza, incita a la violencia, ...) contra identidades vulnerables (mujeres, migrantes, minorías étnicas, ...)?”
• Derogatory: “¿Qué parte del texto es derogativo (comentarios negativos o despectivos sobre la víctima sin incitación explícita al odio, pero la mención en sí es discriminatoria u ofensiva, y a menudo emplea un tono destinado a menospreciar o desacreditar a la víctima)?”
• Entity: “¿Qué entidad nombrada se menciona en la frase?”
• Vulnerable Identity: “¿Qué identidad vulnerable al discurso de odio se menciona en la frase?”

For Italian:

• Dangerous: “Quale parte del testo è pericolosa (criminalizza, ridicolizza, incita alla violenza, ...) nei confronti di identità vulnerabili (donne, migranti, minoranze etniche, ...)?”
• Derogatory: “Quale parte del testo è dispregiativa (commenti negativi o denigratori sulla vittima senza un esplicito discorso d’odio, ma in cui la menzione stessa è discriminatoria o offensiva e spesso usa un tono volto a sminuire o screditare la vittima)?”
• Entity: “Quale entità nominata è menzionata nella frase?”
• Vulnerable Identity: “Quale identità vulnerabile ai discorsi d’odio è menzionata nella frase?”
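
The answer-selection logic of the two zero-shot settings described in Section 4 (non-restrictive: three most confident answers, a 0.02 confidence filter, then majority vote; restrictive: up to five ranked answers with duplicates skipped) can be sketched as follows. This is a minimal illustration, not the exact experimental code: it assumes the QA model exposes a list of (span, confidence) candidates per prompt, and the function names are ours.

```python
from collections import Counter

CONF_THRESHOLD = 0.02  # responses below this model confidence are discarded


def non_restrictive_answer(candidates):
    """Non-restrictive zero-shot: keep the three most confident answers,
    drop those under the confidence threshold, then normalise the model
    response by majority vote over the surviving spans.
    `candidates` is a list of (span, confidence) pairs (assumed interface)."""
    top3 = sorted(candidates, key=lambda c: c[1], reverse=True)[:3]
    kept = [span for span, conf in top3 if conf >= CONF_THRESHOLD]
    if not kept:
        return None  # no sufficiently confident answer for this label
    return Counter(kept).most_common(1)[0][0]


def restrictive_answers(candidates, n_gold):
    """Restrictive zero-shot: the model is queried once per gold annotation;
    up to five ranked answers are considered, and already-seen spans are
    skipped so that each gold span receives a distinct answer."""
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)[:5]
    answers, seen = [], set()
    for span, _conf in ranked:
        if span not in seen:
            seen.add(span)
            answers.append(span)
        if len(answers) == n_gold:
            break
    return answers
```

In the actual experiments the (span, confidence) pairs would come from the T5 and BART checkpoints cited above; here they are stand-ins so the selection logic can be read in isolation.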