<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Beyond Headlines: A Corpus of Femicides News Coverage in Italian Newspapers</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Eleonora Cappuccio</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Benedetta Muscato</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Laura Pollacci</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marta Marchiori Manerba</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Clara Punzi</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Chandana Sree Mala</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Margherita Lalli</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gizem Gezici</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michela Natilli</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Fosca Giannotti</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>ISTI-CNR</institution>
          ,
          <addr-line>Pisa</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Scuola Normale Superiore</institution>
          ,
          <addr-line>Pisa</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Università degli Studi di Bari Aldo Moro</institution>
          ,
          <addr-line>Bari</addr-line>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Università di Pisa</institution>
          ,
          <addr-line>Pisa</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>How newspapers cover news significantly impacts how facts are understood, perceived, and processed by the public. This is especially crucial when serious crimes are reported, e.g., in the case of femicides, where the description of the perpetrator and the victim builds a strong, often polarized opinion of this severe societal issue. This paper presents FMNews, a new dataset of articles reporting femicides extracted from Italian newspapers. Our core contribution aims to promote the development of a deeper framing and awareness of the phenomenon through an original resource available and accessible to the research community, facilitating further analyses on the topic. The paper also provides a preliminary study of the resulting collection through several example use cases and scenarios.</p>
      </abstract>
      <kwd-group>
        <kwd>Italian Dataset</kwd>
        <kwd>Newspapers</kwd>
        <kwd>Information Extraction</kwd>
        <kwd>Information Retrieval</kwd>
        <kwd>AI for Social Good</kwd>
        <kwd>Femicides</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>How newspapers and journalists present news plays a crucial role in shaping public understanding and perception of information. This is especially important when reporting serious crimes, such as femicides, where descriptions of the perpetrator and victim can create polarized opinions that influence readers' perceptions and interpretations of the event. According to Bouzerdan and Whitten-Woodring [<xref ref-type="bibr" rid="ref1">1</xref>], news media often report incidents of women's homicides in a sensationalised manner, treating these crimes as isolated events rather than situating them within the bigger framework of violence against women. This narrative defies the global demands of human rights organisations to acknowledge and address this phenomenon as demanded by its intricate dynamics. Numerous countries have followed such recommendations only partially, through the formal adoption of specific terminology such as femicide and feminicide in legal frameworks and public discourse. The two terms have related but distinct nuances of meaning. Femicide, a criminological concept initially coined in English by the feminist criminologist Diana H. Russell [<xref ref-type="bibr" rid="ref2">2</xref>], denotes the murder of women by males due to their gender. Subsequently, the term femicide, translated into Castilian as femicidio or feminicide by the anthropologist Marcela Lagarde to attract political attention to the dire situation faced by women in Mexico [<xref ref-type="bibr" rid="ref3">3</xref>], has gained global traction with varying interpretations, yet consistently denotes a patriarchal impetus behind homicides and other forms of male violence against women, primarily emphasising the sociological dimensions of abuse and the socio-political ramifications of the phenomenon. In the Italian language, the term femminicidio has been almost exclusively adopted, as evidenced by a Google Trends analysis comparing the search terms "femicidio" and "femminicidio" to queries regarding "femicide"<sup>1</sup>.</p>
      <p>An analysis of the phenomenon of femicide in the Italian context and, in particular, a linguistic investigation of it, are particularly relevant. Feminicide, a term used by the feminist movement in Italy since 2005, gained prominence in the media in 2011, especially thanks to the works of Barbara Spinelli [<xref ref-type="bibr" rid="ref4">4</xref>]. The CEDAW Committee<sup>2</sup>, based on data from the Shadow Report on the Implementation of CEDAW in Italy, addressed recommendations to the Italian government on feminicide in its Concluding Observations. This was the first time the committee addressed a European state on feminicide, a category previously reserved for warnings to Central American countries. The challenges in accurately contextualising feminicide in Italy also stem from a prolonged absence of official data, resulting in sensationalism and the perception of a dramatic rise in the crime. This may induce an emergency narrative that obscures the inherent structural dimensions of the phenomenon, thereby undermining the very essence of the term [<xref ref-type="bibr" rid="ref5">5</xref>]. Media interpretations are essential for shaping a shared understanding across a vast audience, such as a whole country; hence, the examination of media discourse emerges as a significant analytical instrument on top of statistical evaluation of femicide data to understand the achievements and directions of state intervention towards the substantial granting of women's right to life [<xref ref-type="bibr" rid="ref6">6</xref>].</p>
      <p>In this regard, Aldrete and Fernández-Ardèvol [<xref ref-type="bibr" rid="ref7">7</xref>] showed that there is a large body of empirical studies on femicide discourse across different socio-cultural contexts, which often justify the perpetrator's actions. Given the complexity of the phenomenon, a comprehensive investigation could be achieved by integrating media analysis with external data, such as demographics and current events, bringing together researchers from different fields like computer science, social sciences, and complex systems science. The lack of accessible and relevant data specific to socio-cultural contexts where femicide is notably prevalent, such as Italy, makes the task particularly challenging [<xref ref-type="bibr" rid="ref8">8</xref>].</p>
      <p>This paper presents FMNews, a new dataset of articles reporting femicides extracted from Italian newspapers<sup>3</sup>. We conduct a preliminary analysis of the resulting collection through several example use cases and scenarios. The primary contribution is to deepen understanding and awareness of femicide from a socio-technical perspective. We seek to examine how prominent Italian news sources report on the issue in connection to the shaping of public perception, while also offering an innovative and accessible resource to facilitate future investigation within the research community. Furthermore, this study was designed to enable a multifaceted investigation covering the following three dimensions:</p>
      <list list-type="bullet">
        <list-item>
          <p>Geographical, with the aim to explore potential variations in framing between local and national media outlets. Indeed, previous research has shown that Italian local daily newspapers often suppress the agency of the perpetrator, portraying the events as mere occurrences [<xref ref-type="bibr" rid="ref9">9</xref>]. We selected newspapers reporting news at both the national<sup>4</sup> and local<sup>5</sup> level, with local editions spanning the whole Italian territory.</p>
        </list-item>
        <list-item>
          <p>Political, which was ensured by choosing national newspapers with varying political leanings.</p>
        </list-item>
        <list-item>
          <p>Temporal, where the time frame of national newspapers extends from November 2009 to February 2024, whilst that of the local ones ranges from November 2010 to February 2024<sup>6</sup>.</p>
        </list-item>
      </list>
      <p><sup>1</sup> The conducted analysis included news web searches in Italy since 2022, i.e., since the service implemented an enhanced data collection methodology.</p>
      <p><sup>2</sup> Committee on the Elimination of Discrimination Against Women.</p>
      <p><sup>3</sup> The choice of newspapers was dictated by the circulation volume released by Audipress, a company that collects data on the reading habits of daily and periodical press in Italy: https://audipress.it/quotidiani/.</p>
      <p><sup>4</sup> The selected national newspapers are the following: Corriere della Sera, La Repubblica, La Stampa, Il Fatto Quotidiano, Il Giornale and Il Post.</p>
      <p><sup>5</sup> The selected local newspapers are the local editions of the CityNews group, which cover the following cities: Agrigento, Ancona, Arezzo, Avellino, Bari, Bologna, Brescia, Brindisi, Caserta, Catania, Cesena, Chieti, Como, Ferrara, Firenze, Foggia, Forlì, Frosinone, Genova, Pescara, Piacenza, Latina, Lecce, Lecco, Livorno, Messina, Milano, Modena, Monza, Napoli, Novara, Padova, Palermo, Parma, Perugia, Pisa, Pordenone, Ravenna, Reggio, Rimini, Roma, Salerno, Sondrio, Terni, Torino, Trento, Treviso, Trieste, Udine, Venezia, Verona, Vicenza, Viterbo.</p>
      <p><sup>6</sup> In Fig. 3 in the Appendix, we report the distribution of articles across time.</p>
      <p>CLiC-it 2024: Tenth Italian Conference on Computational Linguistics, Dec 04–06, 2024, Pisa, Italy.</p>
      <p>* Corresponding author. † These authors contributed equally. Email: eleonora.cappuccio@phd.unipi.it (E. Cappuccio); benedetta.muscato@sns.it (B. Muscato).</p>
      <p>© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</p>
    </sec>
    <sec id="sec-related-work">
      <title>2. Related Work</title>
      <p>According to frame analysis, the ways in which newspapers cover news significantly impact how facts are understood, perceived, and processed by the public [<xref ref-type="bibr" rid="ref10 ref11">10, 11</xref>]. Framing narratives means strategically including or omitting elements (such as problem definitions, explanations and evaluations) of a given situation in a communicative text [<xref ref-type="bibr" rid="ref12 ref13 ref14">12, 13, 14</xref>]. This process aims to advocate for specific interpretations, assess the moral responsibilities of the individuals involved and propose solutions, while also eliciting nuanced emotional responses from the audience, thereby affecting their perceptions and attitudes. It is worth noting that in the case of news articles, media framing can be seen as a demonstration of political power [<xref ref-type="bibr" rid="ref10">10</xref>], influencing which of the actors or interests involved shape narratives, often unnoticed by the audience [<xref ref-type="bibr" rid="ref11">11</xref>].</p>
      <p>The process of news framing becomes especially crucial when reporting serious crimes, such as femicides, as understanding femicide requires analyzing its evolution from both statistical and social perspectives, as discussed in the Manifesto delle Giornaliste e dei Giornalisti per il Rispetto e la Parità di Genere nell'Informazione<sup>7</sup> (Manifesto of Journalists for Respect and Gender Equality in News Reporting, our translation).</p>
      <p>The acknowledged impact of language on how readers perceive information has prompted researchers to explore how the language surrounding femicide has changed and how this influences individuals' perception of responsibility [<xref ref-type="bibr" rid="ref15">15</xref>], which can vary based on the way femicides are reported [<xref ref-type="bibr" rid="ref1 ref16 ref17 ref9">1, 16, 9, 17</xref>]. Moreover, an initiative by the University of Bologna seeks to identify the main discursive features employed in discussions about femicide in public spaces, including media and legal speech<sup>8</sup>.</p>
      <p>Recognizing the significant role of linguistic expression in depicting incidents of gender-based violence, previous research has explored various NLP techniques. These studies aim to discern how NLP models can effectively predict and analyze human perception judgments concerning the sensitive issue of gender-based violence events. Following previous works on the impact of specific grammatical constructions and semantic frames [<xref ref-type="bibr" rid="ref18">18</xref>] in describing the same event but with various nuances, Minnema et al. [<xref ref-type="bibr" rid="ref19">19</xref>] introduced the first multilingual tool, based on Frame Semantics and Cognitive Linguistics, for detecting the focus or perspective depicted in an event, called SocioFillmore. Furthermore, building on the linguistic analysis provided by SocioFillmore, Minnema et al. [<xref ref-type="bibr" rid="ref20">20</xref>] demonstrated that various linguistic choices trigger different perceptions of responsibility, which can be modeled automatically. Their series of regression models revealed that these distinct linguistic choices significantly influence human perceptions of responsibility. Additionally, to promote awareness of perspective-based writing, Minnema et al. [<xref ref-type="bibr" rid="ref21">21</xref>] introduced the novel task of responsibility perspective transfer. The task involves the automatic rewriting of descriptions of gender-based violence to alter the perceived level of blame attributed to the perpetrator. Both works leveraged one of the limited resources available for the Italian community, the RAI Femicide Corpus, a collection of 2,734 news articles covering 937 confirmed femicide cases that happened in Italy between 2015 and 2017 [22]. Additional online resources, both official and unofficial, containing further statistics on the phenomenon of femicide in Italy are listed in Appendix A.</p>
      <p><sup>7</sup> https://www.sindacatogiornalistiveneto.it/wp-content/uploads/2020/12/MANIFESTO-DI-VENEZIA.pdf.</p>
      <p><sup>8</sup> https://site.unibo.it/osservatorio-femminicidio/it.</p>
    </sec>
    <sec id="sec-fmnews-corpus">
      <title>3. FMNews Corpus</title>
      <p>The main contribution of this paper is the production of two datasets derived from Italian newspapers: the FMNews<sup>9</sup> corpus. The corpus consists of the following components: FMNews-Nat, reporting data from national newspapers, and FMNews-Loc, which gathers articles from local newspapers in 53 Italian cities.</p>
      <p><sup>9</sup> The collection can be accessed for research purposes by requesting it by email from the authors.</p>
      <sec id="sec-data-extraction">
        <title>3.1. Data Extraction</title>
        <p>Despite the heterogeneous HTML structures of the newspapers involved, it was feasible to generalise the data extraction process via the open source Python libraries Selenium<sup>10</sup> and Beautiful Soup<sup>11</sup>. Data scraping was performed in two subsequent phases. Firstly, a comprehensive list of article links was extracted by querying the internal search engine of the newspaper websites with the keywords femminicidio, femminicidi, femminicida: the first word stands for the Italian term "femicide", the second is its plural form, and the third indicates the "person who commits a femicide". The keywords were selected to concentrate our analysis on the media's representation and discourse surrounding this phenomenon. This choice intentionally excludes articles that discuss such crimes in general terms, allowing for a more focused examination of the femicide narratives. In the second phase, the web pages corresponding to such links were scraped to extract the text of the articles and other metadata to build the raw version of the dataset.</p>
        <p><sup>10</sup> https://selenium-python.readthedocs.io/.</p>
        <p><sup>11</sup> https://www.crummy.com/software/BeautifulSoup/bs4/doc/.</p>
      </sec>
      <sec id="sec-data-cleaning">
        <title>3.2. Data Cleaning</title>
        <p>We implemented a supervised and semi-supervised data cleaning process, consisting of two phases, to prepare the data. In the first step, the same pipeline was applied to both FMNews-Nat and FMNews-Loc. We initially removed all duplicate articles from the collected data, i.e., those with identical texts (title and body), metadata (e.g., date), and source publication. Additionally, we converted the dates into the format yyyy-mm-dd and removed articles where at least one of the following elements was missing: publication date, title, or body. Despite the removal of duplicates, certain articles had identical text bodies, albeit with minor variations primarily due to special character encoding (e.g., accents and apostrophes) or differences in web crawling (e.g., one article included the website menu or footer while the other did not). To address this issue, we implemented a method to identify and handle articles with identical or highly similar text bodies sharing the same title. In detail, we first employed a TF-IDF<sup>12</sup> vectorizer to convert the raw text data into numerical vectors and then used them to compute the cosine similarities between all pairs of texts in the dataset. For more details on the parameters and thresholds employed, we refer to Appendix B. Finally, we utilized Beautiful Soup to remove any HTML tags that could have been mistakenly included in the article body during the collection phase.</p>
        <p>The second step of the data cleaning process entailed supervised cleaning of the article texts and headlines. The article texts from national newspapers in FMNews-Nat displayed various noise patterns specific to each news media outlet. To address this issue, we manually created a list of replacements for each outlet, employing regular expressions for targeted removal of articles or specific sub-strings from article titles or bodies (we refer to Appendix B for additional details). In this stage, we also excluded articles whose text bodies did not contain information directly related to femicides, such as television programme listings or podcast episode agendas.</p>
        <p>On the other hand, the articles from local newspapers in FMNews-Loc exhibited minimal noise within their text. Therefore, the data preparation phase focused on poorly encoded symbols and domain-specific substrings such as copyright indications and external contributions, e.g., government press releases. Unlike for the national newspapers, this ad-hoc cleaning did not result in data loss for these publications.</p>
        <p><sup>12</sup> Term Frequency-Inverse Document Frequency, in short TF-IDF, is a measure of the importance of a word to a document in a collection or corpus [23].</p>
      </sec>
      <sec id="sec-final-dataset">
        <title>3.3. Final Dataset</title>
        <table-wrap>
          <table>
            <thead>
              <tr>
                <th>Column</th>
                <th>Description</th>
              </tr>
            </thead>
            <tbody>
              <tr>
                <td>Url</td>
                <td>URL of the original newspaper article</td>
              </tr>
              <tr>
                <td>Title</td>
                <td>Title of the article</td>
              </tr>
              <tr>
                <td>Text</td>
                <td>Main section of the newspaper article</td>
              </tr>
              <tr>
                <td>Newspaper</td>
                <td>Name of the media outlet where the article was published. In FMNews-Loc, it reports the name of the city to which the local edition refers.</td>
              </tr>
              <tr>
                <td>Keyword</td>
                <td>Keyword used to collect the article</td>
              </tr>
              <tr>
                <td>Date</td>
                <td>Publication date of the article in the format yyyy-mm-dd</td>
              </tr>
            </tbody>
          </table>
        </table-wrap>
        <p>Il Fatto Quotidiano has the largest number of articles, with a total of 2,861, followed by La Repubblica with 2,837 articles. Corriere della Sera is next, with a total of 968 articles. La Stampa has a more limited presence, with 292 articles. Il Post contributes 244 articles, and Il Giornale has the fewest entries in this set, with 241 articles. For FMNews-Loc, the time span after data cleaning ranges from November 2010 to February 2024.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>4. Use Cases and Scenarios</title>
      <p>Since the two datasets share the same structure and we are interested in studying the phenomenon of femicide from both a national and local perspective, the analyses exemplified in the following were conducted on both datasets without distinction. After a textual analysis based on tokenization, removal of stopwords, extraction of lemmas and a straightforward assessment of lexical diversity (as detailed in Appendix C), we adopted a keyword extraction method to uncover relevant patterns in the documents.</p>
      <p><bold>Keyword Extraction.</bold> According to Firoozeh et al. [24], specific criteria must be met for keywords to meet eligibility standards. In our case study, we emphasize the importance of keywords that show representativity and exhaustivity, aiming for terms that capture significant rather than marginal aspects of the subject matter. To assess the significance of words within our collection of documents, a standard approach involves the Term Frequency - Inverse Document Frequency (TF-IDF). For a deeper analysis, we calculate TF-IDF for each news outlet. We utilize spaCy's Italian pipeline to preprocess texts by tokenizing, lemmatizing, and selecting only lemmas that are full words from specific part-of-speech classes (nouns, adjectives, verbs). By focusing only on content lemmas and excluding function words (like articles and prepositions), we eliminate noise and improve accuracy in analyzing relationships between documents and word relevance. The lists of lemmas do not include words containing numbers or Italian stopwords obtained from NLTK and spaCy, with additional crawling-dependent stopwords such as "it," "https," "min," and the names of months. Also, we preserve multi-word expressions identified by the lemmatizer by concatenating them to treat them as unique words during TF-IDF calculation. Articles are then grouped by news outlet, each acting as a single document for the TF-IDF computation.</p>
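      <p>The preprocessing just described can be illustrated with a minimal Python sketch. It is not the pipeline used to build FMNews: it assumes each article has already been lemmatized and tagged (for instance by spaCy's Italian pipeline) into (lemma, part-of-speech) pairs, and the names CONTENT_POS, STOPWORDS, content_lemmas and group_by_outlet, the tiny stopword set, and the underscore used to join multi-word expressions are all hypothetical choices made for the example.</p>
      <preformat><![CDATA[
```python
from collections import defaultdict

# Assumed input format: one list of (lemma, part_of_speech) pairs per article,
# standing in for the output of a lemmatizer. All names below are illustrative.
CONTENT_POS = {"NOUN", "ADJ", "VERB"}
STOPWORDS = {"essere", "avere", "it", "https", "min", "gennaio"}  # toy subset


def content_lemmas(tagged_tokens, stopwords=STOPWORDS):
    """Keep lowercase content lemmas; drop stopwords and tokens with digits."""
    kept = []
    for lemma, pos in tagged_tokens:
        lemma = lemma.lower().strip()
        if pos not in CONTENT_POS:
            continue  # function words such as articles and prepositions
        if not lemma or lemma in stopwords:
            continue
        if any(ch.isdigit() for ch in lemma):
            continue
        # Join multi-word expressions so they count as a single term.
        kept.append("_".join(lemma.split()))
    return kept


def group_by_outlet(articles):
    """Concatenate the filtered lemmas of all articles from the same outlet."""
    grouped = defaultdict(list)
    for outlet, tagged_tokens in articles:
        grouped[outlet].extend(content_lemmas(tagged_tokens))
    return dict(grouped)


articles = [
    ("Il Post", [("il", "DET"), ("violenza", "NOUN"), ("di", "ADP"), ("genere", "NOUN")]),
    ("Il Post", [("donna", "NOUN"), ("essere", "VERB"), ("2023", "NOUN")]),
    ("La Stampa", [("forza dell'ordine", "NOUN"), ("arrestare", "VERB"), ("https", "NOUN")]),
]
documents = group_by_outlet(articles)
```
]]></preformat>
      <p>Each value in documents is the token list of one outlet-level document, which is the unit over which TF-IDF is then computed.</p>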
      <p>We use the TF-IDF Vectorizer from the scikit-learn<sup>13</sup> library to transform the lemmatized tokens into numerical features that reflect their importance within the text.</p>
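      <p>A minimal sketch of a per-outlet TF-IDF computation follows. Rather than calling scikit-learn, it spells out the weighting in plain Python so that the arithmetic is visible. The smoothed inverse document frequency and the L2 normalisation mirror what scikit-learn documents as its defaults, but whether those defaults were kept for FMNews is not stated here, so the formula, the function names and the toy token lists should all be read as assumptions.</p>
      <preformat><![CDATA[
```python
import math
from collections import Counter


def tfidf_by_outlet(documents):
    """Return {outlet: {term: weight}} with smoothed IDF and L2-normalised rows.

    `documents` maps an outlet name to the list of lemmas of all its articles,
    so every outlet acts as a single document.
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents.values():
        doc_freq.update(set(tokens))  # count each term once per outlet

    # Smoothed IDF (assumed): ln((1 + n) / (1 + df)) + 1
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}

    weights = {}
    for outlet, tokens in documents.items():
        counts = Counter(tokens)
        row = {t: c * idf[t] for t, c in counts.items()}
        norm = math.sqrt(sum(w * w for w in row.values())) or 1.0
        weights[outlet] = {t: w / norm for t, w in row.items()}
    return weights


def top_keywords(weights, outlet, k=3):
    """Rank the terms of one outlet by weight, breaking ties alphabetically."""
    ranked = sorted(weights[outlet].items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


# Toy outlet-level documents (illustrative lemmas only).
documents = {
    "outlet_a": ["donna", "violenza", "genere", "genere", "diritto"],
    "outlet_b": ["donna", "violenza", "famiglia", "famiglia", "figlio"],
}
weights = tfidf_by_outlet(documents)
```
]]></preformat>
      <p>In this toy input, terms shared by both outlets receive a lower weight than terms specific to one outlet, which is the property that makes outlet-specific keywords stand out.</p>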
      <p>Figure 1: Most relevant keywords extracted from FMNews-Nat by news outlet. Panels: (a) Il Post, (b) Corriere della Sera, (c) Il Giornale, (d) Fatto Quotidiano, (e) La Repubblica, (f) La Stampa.</p>
      <p>Thus, TF-IDF measures the significance of terms
concerning the news outlets. Fig. 1 illustrates the most relevant
keywords extracted from FMNews-Nat by news outlet.</p>
      <p>As expected, terms like "woman," "violence," and "kill"
(along with "femicide") are central to the narrative of
femicide and are common across all outlets. Other keywords
vary in relevance among multiple outlets; for example,
"son" appears in all outlets except Il Post. Specific
keywords are unique to one or two outlets: "gender," "right,"
and "sexual" appear only in Il Post; "family" is relevant
in Corriere della Sera and La Stampa; and "man" is found
in Il Post and Il Giornale. Due to the number of local
news outlets in FMNews-Loc (50), Fig. 2 shows the top
20 keywords with the highest average TF-IDF, calculated
as the mean of the TF-IDF values of the terms with
respect to the news outlets. As expected, the highest ranks
are occupied by the same relevant keywords found in
national news outlets, such as "woman," "violence," "victim,"
and "femicide". Additionally, some keywords relevant
to specific national news outlets show high relevance
for local media, although with lower average TF-IDFs,
such as "gender". Conversely, the distribution reveals
previously unseen keywords, such as "young," "school,"
and "association".</p>
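      <p>One way to read the average TF-IDF used for the local outlets is sketched below. It assumes, which the text does not state, that a term absent from an outlet contributes a weight of zero; here O denotes the set of local outlets and tfidf(t, o) the weight of term t in the document built from outlet o.</p>
      <disp-formula><tex-math><![CDATA[
\overline{\mathrm{tfidf}}(t) \;=\; \frac{1}{|O|} \sum_{o \in O} \mathrm{tfidf}(t, o)
]]></tex-math></disp-formula>
      <p>Under this reading, the top 20 keywords are the terms with the largest values of this mean.</p>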
      <sec id="sec-2-1">
        <title>Semantic Vector Extraction</title>
        <p>For an additional layer of analysis, we chose to train a word embedding model to explore semantic relationships among words. This model represents words as continuous space vectors, where the proximity of vectors indicates the semantic similarity between the words they represent: closer vectors correspond to words with more similar meanings. We employed Word2Vec (W2V) [25], which operates by mapping the words of a given vocabulary to high-dimensional vectors. This mapping is designed to represent semantic relationships between words in the vector space. W2V has been implemented through Gensim<sup>14</sup>, a powerful tool set for NLP tasks. A key parameter in W2V is the "window", i.e., the number of context words to be considered, which we set to 10 to obtain a contextual window that extends neither too far from nor too close to the current word, thereby striking a balance between contextual relevance and computational efficiency. To discover the semantic associations within our dataset, we leveraged the most_similar method from Gensim, which computes the cosine similarity between word vectors to identify words with the closest semantic proximity. For both datasets, the size of the embeddings for the W2V model is fixed to 100, while the vocabulary size changes according to the dataset: it is 6809 in FMNews-Nat and 6064 in FMNews-Loc.</p>
        <p>In FMNews-Nat, the word "donna" (woman) yielded semantically related terms such as "vittima" (victim) and "prostituta" (prostitute). The term "femminicidio" (femicide) elicited associations like "violenza" (violence), "impressionante" (impressive), and "dramma" (drama). In Table 3a, the analysis of "uccidere" (to kill) encompasses related terms such as "ammazzare" (to murder), "ucciderla" (to kill her), "ammazzato" (murdered, masculine form), "ucciso" (killed, masculine form), "suicidarsi" (to commit suicide), and "strangolato" (strangled, masculine form). These terms may collectively pertain to the perpetrator's actions against the victim. Fig. 5 in the Appendix provides a comprehensive overview of word vectors closely associated with the previously extracted keywords, which were identified as the most significant in FMNews-Nat.</p>
        <p>In Table 3b, the words correlated in meaning to "vittima" (victim) in FMNews-Loc are presented. As we would expect, nearly all terms are associated with, and highlight, the fact that the victim is a woman. In this regard, a drawback to consider is that the specific selection of the terms used for the data collection query may have hindered our analysis from uncovering insights about homicides committed against individuals who do not identify as women or fit into the traditional gender binary. Indeed, the discussion around gender-based violence in Italy is still predominantly centred on women, while other genders remain significantly neglected<sup>15</sup>.</p>
        <p><sup>14</sup> https://pypi.org/project/gensim/.</p>
        <p><sup>15</sup> As a matter of fact, there is no official collection of statistics regarding this specific kind of event. The only organisation that records the gender of the victims in its database is the Observatory Femicides Lesbicides Transcides managed by Non una di meno, the Italian section of the movement Ni una menos (https://osservatorionazionale.nonunadimeno.net/).</p>
      </sec>
    </sec>
    <sec id="sec-conclusion">
      <title>5. Conclusion</title>
      <p>In this contribution, we provided a novel dataset concerning the critical issue of femicide in Italy. Considering the absence of resources for conducting in-depth analyses on the subject, our intent was to bridge this gap and provide an original perspective for understanding and raising awareness about this severe phenomenon.</p>
      <p>As suggested by Dobbe et al. [26], proposing a contribution within the Machine Learning domain responsibly and consciously means first and foremost acknowledging our own biases. In particular, we are referring to both the newspaper selection and the choice of the terms used to extract the data, which certainly shaped the results (all design choices are justified in detail in Section 3). A future outlook concerns the investigation of how both victims and perpetrators are framed from a linguistic perspective. Further analyses could identify temporal and geographical patterns arising from media attention, as manifested through the coverage of femicides, and compare the framing of these events with the political leaning of the respective newspapers.</p>
    </sec>
    <sec id="sec-3">
      <title>Acknowledgments</title>
      <p>This work has been supported by the European Union
under ERC-2018-ADG GA 834756 (XAI), by
HumanE-AI-Net GA 952026, by the Partnership Extended PE00000013
- “FAIR - Future Artificial Intelligence Research” - Spoke 1
“Human-centered AI”, and by SoBigData.it that receives
funding from European Union – NextGenerationEU –
National Recovery and Resilience Plan (Piano Nazionale
di Ripresa e Resilienza, PNRR) – Project: “SoBigData.it
– Strengthening the Italian RI for Social Mining and Big
Data Analytics” – Prot. IR0000013 – Avviso n. 3264 del
28/12/2021.</p>
    </sec>
    <sec id="sec-4">
      <title>A. Additional Resources</title>
      <p>Computational Linguistics, Toronto, Canada, 2023,
pp. 7907–7918. URL: https://aclanthology.org/2023.findings-acl.501.
[22] M. Belluati, Femminicidio, Una lettura tra realtà e
interpretazione. Biblioteca di testi e studi. Carocci (2021).
[23] A. Rajaraman, J. D. Ullman, Data mining, in:
Mining of Massive Datasets, Cambridge University
Press, Cambridge, 2011, pp. 1–17. doi:10.1017/CBO9781139058452.002.
[24] N. Firoozeh, A. Nazarenko, F. Alizon, B. Daille, Keyword
extraction: Issues and methods, Natural Language
Engineering 26 (2020) 259–291.
[25] T. Mikolov, K. Chen, G. Corrado, J. Dean, Efficient
estimation of word representations in vector space,
arXiv preprint arXiv:1301.3781 (2013).
[26] R. Dobbe, S. Dean, T. K. Gilbert, N. Kohli, A broader
view on bias in automated decision-making: Reflecting
on epistemology and dynamics, CoRR
abs/1807.00553 (2018). URL: http://arxiv.org/abs/1807.00553.
arXiv:1807.00553.</p>
      <p>Official Resources
Official statistics on femicide cases in Italy can be
accessed through ISTAT16 and the Ministry of the Interior
through the Department of Public Security website17. In
particular, ISTAT provides data on victims of voluntary
homicide, divided by gender, from 1992 to 2020,
without additional information. In contrast, the Department
of Public Security offers more detailed data covering a
limited time range, i.e., from 2002 to 2022: victims are
categorized by their relationship to the murderer. These
categories include: Partner (husband/wife, domestic
partner, boyfriend/girlfriend), Former partner (former
husband/wife, former domestic partner, former
boyfriend/girlfriend), Other relative, Other acquaintance, Perpetrator
unknown to the victim, and Perpetrator unidentified.</p>
      <p>Unofficial Resources
Unofficial data and statistics regarding femicides in
Italy are also available, typically compiled by
non-governmental or grassroots organisations. One notable
example is the open database18 managed by the Italian
activists of Ni una menos19, an international feminist
movement that campaigns against gender-based violence.</p>
      <p>Although it covers a shorter time frame, this database
offers disaggregated and more detailed information than
the official statistics. For example, in addition to the
names of the victims, the collection also includes
important characteristics such as the age and nationality of the
individuals involved, the geographical dimension, and
the gender of the victim, including non-binary framings.</p>
      <p>Although these data are not readily accessible, a combined
examination of both official and unofficial data is essential
for a more thorough and comprehensive analysis of the issue of
femicide in Italy.</p>
    </sec>
    <sec id="sec-5">
      <title>B. Data Preparation</title>
      <p>To prepare the data, we applied a supervised and
semi-supervised cleaning phase divided into two steps. In the
first step, the same pipeline was applied to both datasets,
primarily aimed at removing duplicate articles,
formatting metadata, and reducing data and metadata sparsity.</p>
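      <p>As an illustration only, the first-step operations named above (removing exact duplicates, normalizing dates, and dropping incomplete records) could be sketched as follows. The field names ("title", "body", "date", "source"), the helper names, and the list of input date formats are assumptions of this sketch and are not taken from the released pipeline.</p>

```python
from datetime import datetime

# Hypothetical input date formats; the formats found in the raw data are not reported.
CANDIDATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d-%m-%Y")


def normalize_date(raw):
    """Return the date as yyyy-mm-dd, or None if it cannot be parsed."""
    if not raw:
        return None
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return None


def basic_clean(articles):
    """Drop incomplete records and exact duplicates, keeping first occurrences."""
    seen = set()
    cleaned = []
    for art in articles:
        date = normalize_date(art.get("date"))
        title = (art.get("title") or "").strip()
        body = (art.get("body") or "").strip()
        if not (date and title and body):
            continue  # publication date, title, or body is missing
        key = (title, body, date, art.get("source"))
        if key in seen:
            continue  # identical text, metadata, and source publication
        seen.add(key)
        cleaned.append({**art, "date": date, "title": title, "body": body})
    return cleaned
```

      <p>In this sketch the date is normalized before the duplicate check, so two copies of the same article that differ only in date formatting are treated as duplicates; this ordering is a design choice of the example.</p>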
      <p>The second step entailed supervised cleaning of the
article texts and headlines. We observed different types of
noise in the texts of the national newspapers compared
to the local ones. Hence, given that the two datasets are
released and usable separately, we implemented a similar
pipeline for both datasets, albeit customized for each.</p>
      <p>16 https://www.istat.it/it/violenza-sulle-donne/il-fenomeno/omicidi-di-donne.
17 https://www.interno.gov.it/it/stampa-e-comunicazione/dati-e-statistiche/omicidi-volontari-e-violenza-genere.
18 https://osservatorionazionale.nonunadimeno.net/anno/.
19 https://nonunadimeno.wordpress.com/.</p>
      <p>Data Preparation - Step I: Cleaning
We first removed all duplicate articles from the collected
data (just under 12,800 articles from national newspapers
and approximately 8,400 articles from local ones), i.e.,
those with identical texts (title and body), metadata (e.g.,
date), and source publication. Additionally, we converted
the dates into the format yyyy-mm-dd and removed
articles where at least one of the following elements was
missing: publication date, title, or body. Despite the
removal of duplicates, some articles had identical text
bodies, albeit with minor variations primarily due to special
character encoding (e.g., accents and apostrophes)
or differences in web crawling (e.g., one article included
the website menu or footer while the other did not). To
address this issue, we implemented a method to identify
and handle articles with identical or highly similar text
bodies, but only if they share the same title. The method
relies on cosine similarity to determine whether two texts
are the same. In particular, we first employed a TF-IDF
vectorizer to convert the raw text data into numerical vectors.
These vectors were then used to compute the cosine
similarities between all pairs of texts in the dataset. Cosine
similarity produces a value between 0 and 1, where
1 indicates identical texts and values closer to 0 indicate
less similar texts. Since text preprocessing had not been
performed yet and differences between text bodies could
solely arise from symbols, we set a tolerance threshold
of 0.89 to determine text equality. If two text bodies had
a cosine similarity greater than 0.89, we considered them
duplicates and retained only the first occurrence,
removing the second found in the dataset. Finally, we utilized
Beautiful Soup to remove any HTML tags that could
have been mistakenly included in the article body during
the collection phase. This step ensured that our text data
was free from any undesired HTML tags before further
processing or analysis.</p>
      <p>Data Preparation - Step II: FMNews-Nat
The article texts from national newspapers displayed
various noise patterns specific to each news media outlet. To
address this issue, we manually created a list of
replacements for each outlet, employing regular expressions
for targeted removal of articles or specific sub-strings
from article titles or bodies. In particular, the body of
articles from Il Post, La Repubblica and Il Fatto Quotidiano
included parts of webpage menus and footers, as well as
various types of news media outlet sponsorship, such as
subscriptions, newsletter sign-ups, and agendas/lists of
podcast episodes. On the other hand, articles from
Corriere della sera included text substrings associated with
the journalistic domain, such as headings containing the
name of the correspondent, reporter, or photographer.
We observed that the texts of the articles published by
Corriere della sera often, but not always, follow a
particular structure: "by Author_name Author_surname"
(where &lt;Author_name Author_surname&gt; can be a
natural person or abbreviations with one dot) or "Editorial
team", followed by a city or "online", in either uppercase
or lowercase. Occasionally, this structure is followed
by another city, for instance, "Bologna Online Editorial
Staff". Additionally, this "basic" structure may or may not
be followed by "inviato a &lt;City&gt; &lt;(Province)&gt;", or
"inviata", "foto di &lt;Author_name Author_surname&gt;". We
generally excluded articles whose text bodies did not
contain information directly related to femicides, such as
television programme listings or podcast episode
agendas. We retained the article whenever feasible, removing
irrelevant substrings from the text bodies, such as menus
and footers. The resulting FMNews-Nat dataset includes
7,443 articles: in Fig. 4 we report the distribution of
articles by media outlet.</p>
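      <p>The near-duplicate check described in Step I (TF-IDF vectors, pairwise cosine similarity, same title, threshold of 0.89, first occurrence retained) can be sketched as below. This is a minimal pure-Python stand-in and not the vectorizer actually used: the tokenization, the lowercasing, the smoothed IDF formula, and the function and field names are all assumptions of this example.</p>

```python
import math
import re
from collections import Counter

THRESHOLD = 0.89  # tolerance threshold reported in the text


def tfidf_vectors(texts):
    """Build sparse TF-IDF vectors (dicts) with smoothed IDF; a simplified stand-in."""
    # Lowercasing and the \w+ tokenization are assumptions of this sketch.
    docs = [Counter(re.findall(r"\w+", t.lower())) for t in texts]
    n_docs = len(docs)
    # Iterating over a Counter yields each distinct term once per document.
    doc_freq = Counter(term for doc in docs for term in doc)
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return [{t: tf * idf[t] for t, tf in doc.items()} for doc in docs]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def drop_near_duplicates(articles, threshold=THRESHOLD):
    """Keep the first article among those sharing a title and a near-identical body."""
    vectors = tfidf_vectors([a["body"] for a in articles])
    kept = []  # indices of retained articles
    for i, art in enumerate(articles):
        is_dup = any(
            articles[j]["title"] == art["title"]
            and cosine(vectors[i], vectors[j]) > threshold
            for j in kept
        )
        if not is_dup:
            kept.append(i)
    return [articles[i] for i in kept]
```

      <p>Restricting the comparison to articles that share a title, as the text specifies, keeps distinct articles with similar bodies from being merged; the sketch compares each article only against those already retained, so the first occurrence is the one kept.</p>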
      <p>Data Preparation - Step II: FMNews-Loc
The articles from local newspapers exhibited minimal
noise within their text. Therefore, the data preparation
phase focused on poorly encoded symbols and
domain-specific substrings such as copyright indications and
external contributions, e.g., government press releases.
Unlike for the national newspapers, for these journalistic
publications this ad hoc cleaning did not result in data loss.
Therefore, the resulting FMNews-Loc dataset includes 7,728
articles.</p>
    </sec>
    <sec id="sec-6">
      <title>C. Textual Analysis</title>
      <p>Although applying NLP models typically requires
standardized and structured text, it is important to
acknowledge that such preprocessing may result in the loss of
some information. We believe it is important to keep
track of the elements we manipulate in the texts.
• Emails and URLs. Emails and URLs found
within the body of the articles are replaced with
a placeholder tag, such as "[[URL]]".
• Uppercase words. Words entirely in uppercase
are not replaced or modified, as the text will be
normalized, i.e., converted to lowercase, in
subsequent stages of the work. Uppercase words are
extracted and saved for further analysis.
• Punctuation, symbols, numbers. Punctuation,
symbols, and numbers are removed from
the texts.
• Stopwords. We remove the stopwords included
in the lists provided by the NLTK20 and spaCy21
libraries, along with a brief, manually compiled
list of stopwords. This latter list includes
domain-specific and context-related keywords, such as
"Link Embed", "FOTO", "FOTOGRAMMA". It is
important to note that the "ad hoc" stopwords
were removed from the non-normalized text to
mitigate the impact of stopword removal. Indeed,
during the analysis, we observed that some
articles from national newspapers contained certain
keywords entirely in uppercase to indicate
elements attached to the article. Thus, we chose to
make the list of stopwords case-sensitive,
aiming to avoid removing words within the body
of the article.</p>
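      <p>A hedged sketch of three of the operations listed above (placeholder replacement, extraction of uppercase words, and case-sensitive removal of the ad hoc stopwords) follows. The "[[URL]]" tag and the three stopwords come from the text; the "[[EMAIL]]" tag, the regular expressions, and the function names are assumptions made for illustration.</p>

```python
import re

# Illustrative patterns; the expressions actually used are not reported.
URL_RE = re.compile(r"(?:https?://|www\.)\S+")
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

# "[[URL]]" appears in the text; "[[EMAIL]]" is an assumed analogous tag.
URL_TAG = "[[URL]]"
EMAIL_TAG = "[[EMAIL]]"

# Ad hoc stopwords quoted in the text, matched case-sensitively.
AD_HOC_STOPWORDS = ("Link Embed", "FOTOGRAMMA", "FOTO")


def replace_contacts(text):
    """Replace e-mail addresses and URLs with placeholder tags."""
    text = EMAIL_RE.sub(EMAIL_TAG, text)
    return URL_RE.sub(URL_TAG, text)


def uppercase_words(text):
    """Collect words written entirely in uppercase (two or more letters)."""
    return [w for w in re.findall(r"\b[^\W\d_]{2,}\b", text) if w.isupper()]


def remove_ad_hoc_stopwords(text):
    """Remove ad hoc stopwords from the non-normalized text, respecting case."""
    for phrase in AD_HOC_STOPWORDS:
        text = re.sub(r"\b" + re.escape(phrase) + r"\b", " ", text)
    return re.sub(r"\s+", " ", text).strip()
```

      <p>Because the match is case-sensitive and runs before lowercasing, an attachment marker such as "FOTO" is removed while the ordinary word "foto" in the article body is left untouched, which is the behaviour motivated in the last bullet.</p>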
      <p>After extracting the features from the raw texts, we
proceeded with the following steps. First, we tokenized
the body of articles using the spaCy library with the
Italian module, selecting only words. Next, we extracted
tokens that are not included in the stopwords. Then, we
extracted the lemmas, again excluding stopwords. Finally,
we further refined our selection by retaining from the
tokens only words belonging to what are commonly referred
to as "full" parts of speech, such as nouns, verbs,
adjectives, and adverbs. Extracting these "full" words
focuses our analysis on the linguistically significant
elements of the text, i.e., the units that carry most of
its semantic content.</p>
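      <p>The final filtering step can be illustrated on pre-tagged tokens. The sketch below assumes that each token is available as a (text, lemma, part-of-speech) triple with Universal POS labels, as exposed for instance by the text, lemma_ and pos_ attributes of spaCy tokens; the function name and the exact tag set (in particular, leaving out proper nouns and auxiliaries) are assumptions of this example.</p>

```python
# Universal POS tags for the "full" classes named in the text.
FULL_POS = {"NOUN", "VERB", "ADJ", "ADV"}


def content_lemmas(tagged_tokens, stopwords):
    """Return lemmas of non-stopword tokens whose POS is one of the full classes.

    tagged_tokens: iterable of (text, lemma, pos) triples.
    stopwords: set of strings, compared as given (no case folding here).
    """
    return [
        lemma
        for text, lemma, pos in tagged_tokens
        if pos in FULL_POS and text not in stopwords and lemma not in stopwords
    ]
```

      <p>For example, given the triples for "La donna è stata uccisa ieri" and a stopword set containing "ieri", the function returns the lemmas "donna" and "uccidere".</p>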
      <p>After tokenization, removal of stopwords, and
extraction of lemmas, we computed the Type-Token Ratio (TTR)
for the articles, a measure of the lexical diversity in a text.
This is given by the ratio of unique words in a text,
or "types", to the total number of words, or "tokens", and
reads:
<disp-formula><label>(1)</label><tex-math>\mathrm{TTR} = \frac{\mathrm{types}}{\mathrm{tokens}}</tex-math></disp-formula>
20 https://www.nltk.org/.
21 https://spacy.io/.</p>
      <p>Here, types is the number of unique words and tokens
is the total number of words in the text. TTR values range
from 0 to 1, where a higher value indicates greater lexical
variety, whereas a lower value implies more repetition
of words in the text. This is a straightforward measure
which nevertheless allows us to form an initial
assessment of the lexical richness in the narrative surrounding
femicides. The newspaper Il Post, along with Il Fatto
Quotidiano and La Repubblica, exhibited a notable variation
in terms of TTR. While FMNews-Nat shows variation
in lexicon usage, FMNews-Loc exhibits a uniformity in
language.</p>
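      <p>For concreteness, Eq. (1) can be computed on a list of tokens as in the following minimal sketch. The function name and the convention of returning 0.0 for an empty text are assumptions of this example.</p>

```python
def type_token_ratio(tokens):
    """Type-Token Ratio: number of unique tokens divided by total tokens."""
    tokens = list(tokens)
    if not tokens:
        return 0.0  # convention of this sketch for empty input
    return len(set(tokens)) / len(tokens)
```

      <p>For the token list ["a", "b", "a", "c"] there are 3 types and 4 tokens, giving a TTR of 0.75. Since TTR tends to decrease as texts get longer, comparisons are most informative between articles of similar length.</p>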
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>C.</given-names>
            <surname>Bouzerdan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Whitten-Woodring</surname>
          </string-name>
          ,
          <article-title>Killings in context: An analysis of the news framing of femicide</article-title>
          ,
          <source>Human Rights Review</source>
          <volume>19</volume>
          (
          <year>2018</year>
          )
          <fpage>211</fpage>
          -
          <lpage>228</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J.</given-names>
            <surname>Radford</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Russell</surname>
          </string-name>
          ,
          <source>Femicide: The Politics of Woman Killing</source>
          , Post-Contemporary Interventions, Twayne,
          <year>1992</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>M. M. L</surname>
          </string-name>
          . y de los Ríos,
          <article-title>Por la vida y la libertad de las mujeres: fin al feminicidio</article-title>
          ,
          <source>Cámara de Diputados del Congreso de la Unión</source>
          , LIX Legislatura,
          <article-title>Comisión Especial para Conocer y Dar Seguimiento a las Investigaciones Relacionadas con los Feminicidios en la República Mexicana y a la</article-title>
          <string-name>
            <surname>Procuración de Justicia Vinculada</surname>
          </string-name>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>B.</given-names>
            <surname>Spinelli</surname>
          </string-name>
          ,
          <article-title>Femminicidio: dalla denuncia sociale al riconoscimento giuridico internazionale</article-title>
          ,
          <source>Franco Angeli</source>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>B.</given-names>
            <surname>Spinelli</surname>
          </string-name>
          ,
          <string-name>
            <surname>L'</surname>
          </string-name>
          <article-title>italia rispetta la CEDAW? il femminicidio in italia alla luce delle raccomandazioni delle nazioni unite</article-title>
          , in: I. Corti (Ed.),
          <article-title>Universo femminile</article-title>
          .
          <source>La CEDAW tra diritto e politiche, eum edizioni università di Macerata</source>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>S.</given-names>
            <surname>Abis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Orrù</surname>
          </string-name>
          , et al.,
          <article-title>Il femminicidio nella stampa italiana: un'indagine linguistica</article-title>
          , gender/sexuality/italy 3 (
          <year>2016</year>
          )
          <fpage>18</fpage>
          -
          <lpage>33</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M.</given-names>
            <surname>Aldrete</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Fernández-Ardèvol</surname>
          </string-name>
          ,
          <article-title>Framing femicide in the news, a paradoxical story: A comprehensive analysis of thematic and episodic frames</article-title>
          , Crime, Media, Culture (
          <year>2023</year>
          )
          <fpage>17416590231199771</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>A.</given-names>
            <surname>Forciniti</surname>
          </string-name>
          , E. Zavarrone,
          <article-title>Data quality and violence against women: The causes and actors of femicide</article-title>
          ,
          <source>Social Indicators Research</source>
          (
          <year>2023</year>
          )
          <fpage>1</fpage>
          -
          <lpage>25</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>C.</given-names>
            <surname>Meluzzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Pinelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Valvason</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Zanchi</surname>
          </string-name>
          ,
          <article-title>Responsibility attribution in gender-based domestic violence: A study bridging corpus-assisted discourse analysis and readers' perception</article-title>
          ,
          <source>Journal of pragmatics 185</source>
          (
          <year>2021</year>
          )
          <fpage>73</fpage>
          -
          <lpage>92</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>R. M. Entman</surname>
          </string-name>
          , Framing:
          <article-title>Toward clarification of a fractured paradigm</article-title>
          ,
          <source>Journal of Communication</source>
          <volume>43</volume>
          (
          <year>1993</year>
          )
          <fpage>51</fpage>
          -
          <lpage>58</lpage>
          . doi:10.1111/j.1460-2466.1993.tb01304.x.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>J. James W.</given-names>
            <surname>Tankard</surname>
          </string-name>
          ,
          <article-title>The empirical approach to the study of media framing</article-title>
          , in: S. D.
          <string-name>
            <surname>Reese</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Gandy</surname>
            ,
            <given-names>A. E.</given-names>
          </string-name>
          <string-name>
            <surname>Grant</surname>
          </string-name>
          (Eds.),
          <article-title>Framing public life</article-title>
          , Taylor &amp; Francis, Philadelphia, PA,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>M.</given-names>
            <surname>Edelman</surname>
          </string-name>
          ,
          <article-title>Contestable categories and public opinion</article-title>
          ,
          <source>Political Communication</source>
          <volume>10</volume>
          (
          <year>1993</year>
          )
          <fpage>231</fpage>
          -
          <lpage>242</lpage>
          . doi:10.1080/10584609.1993.9962981.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>D.</given-names>
            <surname>Kahneman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Tversky</surname>
          </string-name>
          , Choices, values, and frames.,
          <source>American Psychologist</source>
          <volume>39</volume>
          (
          <year>1984</year>
          )
          <fpage>341</fpage>
          -
          <lpage>350</lpage>
          . doi:10.1037/0003-066x.39.4.341.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>P. M. Sniderman</surname>
            ,
            <given-names>R. A.</given-names>
          </string-name>
          <string-name>
            <surname>Brody</surname>
            ,
            <given-names>P. E.</given-names>
          </string-name>
          <string-name>
            <surname>Tetlock</surname>
          </string-name>
          ,
          <article-title>Cambridge studies in public opinion and political psychology: Reasoning and choice: Explorations in political psychology</article-title>
          , Cambridge University Press, Cambridge, England,
          <year>1993</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>C.</given-names>
            <surname>Corradi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Marcuello-Servós</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Boira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Weil</surname>
          </string-name>
          ,
          <article-title>Theories of femicide and their significance for social research</article-title>
          ,
          <source>Current sociology 64</source>
          (
          <year>2016</year>
          )
          <fpage>975</fpage>
          -
          <lpage>995</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>J.</given-names>
            <surname>Fairbairn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Boyd</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Jiwani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dawson</surname>
          </string-name>
          ,
          <article-title>Changing media representations of femicide as primary prevention</article-title>
          ,
          <source>in: The Routledge International Handbook on Femicide and Feminicide</source>
          , Routledge,
          <year>2023</year>
          , pp.
          <fpage>554</fpage>
          -
          <lpage>564</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>E.</given-names>
            <surname>Pinelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Zanchi</surname>
          </string-name>
          ,
          <article-title>Gender-based violence in italian local newspapers: How argument structure constructions can diminish a perpetrator's responsibility, in: Discourse Processes between Reason and Emotion: A Post-</article-title>
          disciplinary Perspective, Springer,
          <year>2021</year>
          , pp.
          <fpage>117</fpage>
          -
          <lpage>143</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>G.</given-names>
            <surname>Minnema</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gemelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Zanchi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Patti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Caselli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Nissim</surname>
          </string-name>
          , et al.,
          <article-title>Frame semantics for social nlp in italian: Analyzing responsibility framing in femicide news reports</article-title>
          ,
          <source>in: CEUR WORKSHOP PROCEEDINGS</source>
          , volume
          <volume>3033</volume>
          ,
          <string-name>
            <surname>CEUR-WS</surname>
          </string-name>
          ,
          <year>2021</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>G.</given-names>
            <surname>Minnema</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gemelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Zanchi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Caselli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Nissim</surname>
          </string-name>
          ,
          <article-title>Sociofillmore: a tool for discovering perspectives</article-title>
          ,
          <source>arXiv preprint arXiv:2203.03438</source>
          (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>G.</given-names>
            <surname>Minnema</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gemelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Zanchi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Caselli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Nissim</surname>
          </string-name>
          ,
          <article-title>Dead or murdered? predicting responsibility perception in femicide news reports</article-title>
          , in:
          <source>Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)</source>
          ,
          <source>Association for Computational Linguistics</source>
          , Online only,
          <year>2022</year>
          , pp.
          <fpage>1078</fpage>
          -
          <lpage>1090</lpage>
          . URL: https://aclanthology.org/2022.aacl-main.79.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>G.</given-names>
            <surname>Minnema</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Lai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Muscato</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Nissim</surname>
          </string-name>
          ,
          <article-title>Responsibility perspective transfer for Italian femicide news</article-title>
          ,
          <source>in: Findings of the Association for Computational Linguistics: ACL</source>
          <year>2023</year>
          ,
          <article-title>Association for</article-title>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>