Women's Professions and Targeted Misogyny Online. Published in CEUR Workshop Proceedings, Vol-3878: https://ceur-ws.org/Vol-3878/21_main_long.pdf (dblp: https://dblp.org/rec/conf/clic-it/CascioneCMP24)
                                Women’s Professions and Targeted Misogyny Online
                                Alessio Cascione1,* , Aldo Cerulli2,* , Marta Marchiori Manerba1 and Lucia C. Passaro1
                                1
                                    Dipartimento di Informatica, Università di Pisa, Largo B. Pontecorvo 3, Pisa, 56127, Italy
                                2
                                    Dipartimento di Filologia, Letteratura e Linguistica, Università di Pisa, Via Santa Maria 36, Pisa, 56126, Italy


                                                   Abstract
                                                   With the increasing popularity of social media platforms, the dissemination of misogynistic content has become more prevalent
                                                   and challenging to address. In this paper, we investigate the phenomenon of online misogyny on Twitter through the lens of
                                                   hurtfulness, qualifying its different manifestations in English tweets considering the profession of the targets of misogynistic
                                                   attacks. By leveraging manual annotation and a BERTweet model trained for fine-grained misogyny identification, we find
                                                   that specific types of misogynistic speech are more intensely directed towards particular professions. For example, derailing
                                                   discourse predominantly targets authors and cultural figures, while dominance-oriented speech and sexual harassment are
                                                   mainly directed at politicians and athletes. Additionally, we use the HurtLex lexicon and ItEM to assign hurtfulness scores
                                                   to tweets based on different hate speech categories. Our analysis reveals that these scores align with the profession-based
                                                   distribution of misogynistic speech, highlighting the targeted nature of such attacks.

                                                   Keywords
                                                   Abusive Language, Online Misogyny, Hurtfulness



1. Introduction

Misogyny is a radical manifestation of sexism directed toward the female gender, which becomes the subject of hatred. Its effects are widespread and systematic, bearing severe social and individual consequences, such as verbal and physical violence, rape, and femicide. Indeed, misogyny, prejudice, and contempt towards women continue to persist in various forms in our society. While overt acts of discrimination and sexism have received attention, it is crucial to acknowledge that misogyny often manifests in subtle and nuanced ways [1, 2]. Moreover, with the increasing popularity of social media platforms, the dissemination of misogynistic content has become more prevalent and challenging to address [3, 4].

From a socio-historical perspective, women have faced numerous barriers that limited their access to certain professions, hindered their career progression, and subjected them to belittlement and offense related to their work [5]. These gendered biases not only perpetuate inequality but also serve as breeding grounds for misogyny.

In this paper, we focus on automated misogyny detection, specifically investigating whether different professional roles trigger varying degrees of hurtfulness across social media posts. By examining the correlation between the profession of offended women and the prevalence of misogynistic attitudes, we aim to shed light on the extent to which misogyny is perpetuated within specific professional domains.

Fontanella et al. [6] highlight how research focusing on the automatic detection of misogyny tends to show weak connections with other conceptual areas addressing different aspects of the phenomenon. This finding suggests that current research has not yet adequately addressed the fine-grained manifestations of online misogynistic attacks. Our contribution conducts novel analyses to uncover and measure misogynistic attitudes within different professional fields. Specifically, we examine how different types of misogyny are distributed across various women's professions and how the language used in misogynistic posts varies across them. To explore this relationship, we expand the English misogyny identification dataset introduced by Fersini et al. [7], known as AMI, by incorporating the professions of the women targeted. By adding professional categories to AMI, we enable novel analyses of how misogynistic attacks against women differ based on their profession. Our research is driven by the following research questions:

RQ1 How does misogyny distribute across professions? We analyze women's professions according to the type of misogyny directed towards them.

RQ2 How does the language used in misogynistic tweets vary across different professions? We investigate how specific hurtful expressions are directed at specific professions more frequently than others.

To address our RQs, we proceed following the work-

CLiC-it 2024: Tenth Italian Conference on Computational Linguistics, Dec 04 — 06, 2024, Pisa, Italy
* Corresponding authors. These authors contributed equally.
Email: a.cascione@studenti.unipi.it (A. Cascione); a.cerulli1@studenti.unipi.it (A. Cerulli); marta.marchiori@phd.unipi.it (M. Marchiori Manerba); lucia.passaro@unipi.it (L. C. Passaro)
Web: https://martamarchiori.github.io/ (M. Marchiori Manerba); https://luciacpassaro.github.io/ (L. C. Passaro)
ORCID: 0009-0003-5043-5942 (A. Cascione); 0000-0002-0877-7063 (M. Marchiori Manerba); 0000-0003-4934-534 (L. C. Passaro)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).




[Figure 1: workflow diagram. Label distributions shown in the figure — AMI dataset: not mis 56.10%, mis 43.90% (within mis: der 4.61%, dis 51.35%, dom 12.30%, sex 17.28%, ste 14.44%). PRF dataset: politician 21.84%, artist 28.69%, athlete 31.05%, author 18.42%. AMI-PRF dataset: der 4.91%, dis 54.30%, dom 9.74%, sex 11.84%, ste 19.21%; politician 30.10%, artist 29.38%, athlete 21.49%, author 19.03%. Arrows: tweets filtering and manual annotation of profession (AMI → AMI-PRF); automatic annotation of misogyny type (PRF → AMI-PRF).]
Figure 1: A subset of the AMI dataset, containing ground-truth misogyny annotations, is manually labeled with the professions of the victims of misogynistic attacks, as detailed in Section 3. The PRF dataset, featuring professions by design, is extracted and automatically annotated with misogyny types using a BERTweet model trained on the AMI dataset. The manually annotated AMI subset and the automatically annotated PRF dataset are then combined to form the AMI-PRF dataset. The label distributions of each dataset are displayed in the workflow.
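The combination step of this workflow can be sketched as a small merge routine. This is a schematic illustration with hypothetical record fields (`text`, `misogyny`, `profession`), not the authors' code:

```python
from collections import Counter

def build_ami_prf(ami_subset, prf):
    """Merge the manually annotated AMI subset with the automatically
    annotated PRF collection, tagging each record with its source."""
    merged = [dict(r, source="AMI") for r in ami_subset]
    merged += [dict(r, source="PRF") for r in prf]
    return merged

def label_distribution(records, field):
    """Relative frequency (in %) of each label value, as reported in Figure 1."""
    counts = Counter(r[field] for r in records)
    total = sum(counts.values())
    return {label: round(100 * n / total, 2) for label, n in counts.items()}

# Toy records standing in for annotated tweets.
ami_subset = [{"text": "...", "misogyny": "dis", "profession": "artist"},
              {"text": "...", "misogyny": "ste", "profession": "politician"}]
prf = [{"text": "...", "misogyny": "dis", "profession": "athlete"}]
ami_prf = build_ami_prf(ami_subset, prf)
```

`label_distribution(ami_prf, "misogyny")` then yields the per-type percentages displayed along the arrows of the diagram.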



flow depicted in Figure 1. We begin by utilizing a subset of the AMI dataset, which contains ground-truth annotations for misogyny. This subset is manually labeled with the professions of the victims of misogynistic attacks, as detailed in Section 3.2. We then employ a misogyny classifier to automatically annotate a novel collection, the Profession (PRF) dataset, with various types of misogyny; this collection comprises 760 tweets labeled with professions. The final step involves combining the manually annotated AMI subset with the automatically annotated PRF dataset, resulting in the AMI-PRF dataset1. This enriched dataset provides a resource that enables a thorough investigation of the phenomenon.

The remainder of this paper is organized as follows. Section 2 discusses previous works closely related to ours, while Section 3 details the enrichment of the AMI dataset with professional categories. Section 4 reports the experiments conducted to answer our RQs, whereas Section 5 outlines conclusions, limitations, and future directions of the work.

2. Related Work

In recent years, the field of NLP has witnessed a growing interest in detecting misogyny and sexist content on social media platforms. Various works have significantly contributed to this area by publicly introducing diverse datasets and evaluation tasks tailored for misogyny detection [7, 8, 9]. Indeed, there is a pressing need to develop emotive [10, 11] and offensive word lexicons [12] for harassment research, as highlighted by Rezvan et al. [13]. Contributing to the field of sexism categorization, Parikh et al. [14] provide a large dataset for multi-label classification of sexism. Chiril et al. [15] explore the detection of sexist hate speech, examining the relationship between gender stereotype detection and sexism classification. Similarly, Felmlee et al. [16] investigate online aggression towards women on social media platforms, focusing on the strategic nature of sexist tweets and the reinforcement of stereotypes.

Emphasizing the interaction and co-influence of social dimensions, like gender and profession, can assist in capturing complex social dynamics and informing the development of norms that promote equity and justice, as outlined by Hancock [17] and Dhamoon [18]. Specifically, previous social science research has examined hate discourse directed at specific groups of women, such as politicians and celebrities. For example, Silva-Paredes and Ibarra Herrera [19] offer a corpus-based analysis of gender-based aggression towards a Chilean right-wing female politician, while Phipps and Montgomery [20] and Ritchie [21] focus on forms of hate speech in media campaigns against Nancy Pelosi and Hillary Clinton, respectively. Specifically for tweets, Saluja and Thilaka [22] employ Feminist Critical Discourse Theory to perform gender-specific inferences w.r.t. Twitter discourse concerning Indian political leaders. On the other hand, Ghaffari [23] analyzes 2000 user-generated posts focusing on American celebrity Lena Dunham, examining manifestations of hate and stereotypes. To the best of our knowledge, this is the first data-driven work that

1 The dataset is accessible for research purposes by requesting it by email from the authors. To protect the identities of the affected women, we chose to omit explicit references to profiles and original tweet IDs from the dataset.
examines the relationship between women's professional categories and types of misogynistic attacks on online platforms.

3. Data Exploration and Enrichment

In this section, we detail the construction of our novel AMI-PRF dataset.

3.1. AMI Dataset

We address the lack of misogynous data annotated w.r.t. victims' professions by enriching the AMI dataset2 [7]. The dataset includes a coarse-grained distinction between misogynistic and not-misogynistic tweets, as well as a fine-grained labeling for misogynistic tweets, categorizing them into five different types of misogynistic hate speech: derailing (justifying the abuse of women), discredit (general slurring), dominance (asserting male superiority), sexual harassment (sexual advances and violence), and stereotype (oversimplification and objectification).

We enrich AMI by adding information about the professions of the victims. This enrichment is performed by retrieving from Wikidata3 professional figures that are subclasses of the person class.

Our annotation of professions includes four categories, namely 'artist', 'author', 'athlete', and 'politician (and activist)'. We focus on these professions as they are represented in the AMI dataset, based on the popular women referenced. Although the first two are both subclasses of creator, which is an immediate subclass of person, we keep them separate due to their different natures: the former encompasses visual and performing arts, the latter intellectual activities. On the other hand, we choose to group politicians and activists together to highlight their shared involvement in public social activities, even though they are not directly related according to the Wikidata taxonomy.

As shown by Fig. 4 (Appendix A), each macro-profession initiates a potentially large set of nested sub-professions based on the Wikidata subclass of relationship.

We leverage these professions to manually label AMI misogynistic tweets that actually refer to women. In order to produce a consistent labeling, we establish the following conventions: if the tweet refers to a famous woman, we choose the first (or unique) occupation among those appearing on her Wikidata page, tracing it back to the appropriate macro-category. This approach mitigates annotation inconsistencies by leveraging an established external resource for labeling. When such information is unavailable, we determine the professional category by examining relevant job details in the tweet content or on the profile page of the victim, if mentioned. For such cases, a collaborative approach was taken during group meetings to share general insights, ensuring that any disagreements were addressed through discussions and ultimately resolved through consensus. In the absence of clues regarding the profession, the tweet is simply labeled as 'generic'.

Finally, we point out that not all tweets in the AMI dataset have women as victims. In several cases, misogynist language is used to insult men, companies, or political parties. Out of 5000 AMI tweets, we initially filtered out those that were not directed at women. Among the remaining tweets, 2187 were labelled as misogynistic. However, we were able to obtain professional categories for only a subset of 380 of these tweets, highlighting the need for additional data collection.

Table 1
BERTweet multi-classification results on the AMI test set.

               support%    Precision    Recall    F1-score
  der           2.391%       0.250      0.273      0.261
  dis           30.65%       0.626      0.794      0.700
  dom           26.95%       0.811      0.484      0.606
  sex           9.565%       0.500      0.773      0.607
  ste           30.43%       0.906      0.821      0.861
  Macro Avg.       -         0.618      0.629      0.607
  Wtd. Avg.        -         0.740      0.704      0.704
  Accuracy         -           -          -        0.704

3.2. PRF Dataset

To address the issue of having only a small number of tweets annotated for both misogyny and profession, we crawl additional tweets. From the most common expressions in the misogynistic tweets of AMI, we derive a list of misogynistic keywords. For each of our target professions, we choose five representative popular women, collecting tweets containing a reference to them in the form of a hashtag, mention, and/or explicit name and surname. As a result, we extract 760 tweets labeled with professions, all posted before the beginning of February 2023: we refer to this collection as the Profession (PRF) dataset. Since these tweets are filtered using specific keywords and are directed at popular women, we consider them inherently misogynistic, as a woman is the primary target of the hate speech.

To identify the type of misogyny in PRF, we leverage BERTweet4, a transformer-based [24] model trained on the AMI multi-classification dataset. We opt for this model since it is pre-trained on Twitter, and it achieves

2 https://live.european-language-grid.eu/catalogue/corpus/7272
3 https://www.wikidata.org/wiki/Wikidata:Main_Page
4 https://github.com/VinAIResearch/BERTweet
state-of-the-art performance in Twitter sentiment analy-
sis tasks [25]. Before training, the AMI tweets are prepro-
cessed with a TweetNormalizer function5 which maps
emojis into text strings and substitutes user mentions and
web/url links with @USER and HTTPURL placeholders. For
model selection, we perform a stratified cross-validation
with k = 5. We search for the best weight decay and
learning rate in [1e-2,1e-5] and [1e-5,3e-5], respectively.
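The stratified cross-validation used for model selection keeps the misogyny-type proportions stable across folds. A minimal illustration of stratified fold assignment follows; this is a toy stand-in, not the authors' code (in practice one would use, e.g., scikit-learn's StratifiedKFold):

```python
from collections import defaultdict

def stratified_folds(labels, k=5):
    """Assign sample indices to k folds, round-robin within each label,
    so every fold keeps roughly the overall label distribution."""
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_label.values():
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds

# Toy label set: 40% of one class, 60% of the other.
labels = ["mis"] * 40 + ["not"] * 60
folds = stratified_folds(labels, k=5)
# Each of the 5 folds holds 8 'mis' and 12 'not' indices, mirroring 40%/60%.
```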
For each configuration, we set 10 epochs, 500 warm-up
steps, and train/validation batch sizes of 16/8. The optimal
performance is achieved with a learning rate of 3e-5 and
a weight decay of 1e-2. Tab. 1 shows BERTweet performances for the multi-class misogyny detection task on the AMI test set, comprising 1000 tweets (460 misogynistic). For the multi-classification task, we focus only on misogynistic tweets. The evaluation metrics include Accuracy, as well as weighted and unweighted average Precision, Recall, and F1-score. We adopt this model to label our PRF dataset with types of misogyny.

AMI-PRF Dataset  By combining the 380 tweets from AMI, which carry ground-truth information regarding the type of misogyny, with the PRF dataset, labeled with our trained model, we obtain 1140 tweets featuring both misogyny type and profession. This dataset, named AMI-PRF, is leveraged to investigate the relation between misogyny and professions.

4. Experiments and Data Analyses

4.1. Misogyny Type by Profession (RQ1)

To address RQ1, we examine how different types of misogynistic speech are distributed across various professions in AMI-PRF. For each type of misogyny, we count how many tweets belonging to that class are directed towards a specific profession and qualitatively compare the results in Fig. 2.

Figure 2: Alluvial plot depicting the relationship between misogyny types and professions. Thicker streams indicate a higher number of tweets of the misogyny type originating from the respective block.

Discussion  We observe distinct patterns in the usage of misogynistic speech across professions. Derailing discourse, which focuses on justifying the abuse of women and rejecting male responsibility, tends to primarily target authors compared to the other professions. This aligns with the nature of derailing speech, which seeks to rationalize the mistreatment of women and deflect male accountability; this kind of discourse can therefore be expected to be commonly directed at public intellectuals or cultural figures. In contrast, dominance-oriented misogynistic discourse, aimed at asserting male superiority, along with stereotypical negative speech, is predominantly directed at powerful figures such as politicians. This prevalence could be explained as an attempt to undermine the legitimacy and value of women holding relevant public roles. Sexual harassment is notably prevalent towards politicians and athletes, as an expression of intent to assert power over women through threats of violence.

4.2. Hurtfulness by Profession (RQ2)

To address RQ2 – whether specific hurtful expressions target women in certain professions – we define a quantitative lexicon-based measure for assessing the hurtfulness of tweets.

Hurtfulness Evaluation  To define a hurtfulness measure for tweets, we leverage the HurtLex lexicon, which compiles offensive words and stereotyped expressions aimed at insulting and degrading marginalized individuals and groups [26]. HurtLex organizes words into 17 fine-grained categories, each identifying a specific target or form of offense.

Inspired by the work of Nozza et al. [12], where a harmful sentence completion indicator is defined for generative language models, we employ a subset of 9 HurtLex categories for our purposes: animals, prostitution, professions, negative connotations, homosexuality, male genitalia, female genitalia, derogatory terms, and crime6. The hurtfulness score of a tweet w.r.t. one of the 9 categories could be computed as the ratio of HurtLex lemmas7 from that category to the total HurtLex lemmas from any category present in the tweet. However, an approach relying solely on the HurtLex lexicon would not provide a sufficiently comprehensive analysis, as HurtLex has low coverage of the vocabulary in the AMI-PRF dataset, with only 15.42% of the lemmas in a tweet occurring in HurtLex on average.

5 https://github.com/VinAIResearch/BERTweet/blob/master/TweetNormalizer.py
6 For detailed descriptions of each category, we refer to Bassignana et al. [26].
7 We retain only conservative-level lemmas.
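The ratio-based lexicon score described in Section 4.2 — HurtLex lemmas of one category over HurtLex lemmas of any category found in the tweet — can be sketched as follows. The mini-lexicon is a hypothetical stand-in; the real HurtLex is far larger and organized by category files:

```python
# Hypothetical mini-lexicon mapping lemmas to HurtLex categories.
HURTLEX = {
    "snake": "animals",
    "crook": "crime",
    "thief": "crime",
}

def hurtlex_score(lemmas, category):
    """Share of the tweet's HurtLex lemmas that belong to `category`;
    0.0 when no lemma of the tweet occurs in the lexicon."""
    hits = [HURTLEX[w] for w in lemmas if w in HURTLEX]
    if not hits:
        return 0.0
    return hits.count(category) / len(hits)

# A lemmatized toy tweet: 3 of its lemmas are in the lexicon,
# 2 of which fall into the 'crime' category, so the crime score is 2/3.
tweet = ["that", "crook", "is", "a", "snake", "thief"]
```

The low-coverage problem mentioned above shows up here directly: any lemma outside the lexicon contributes nothing to either numerator or denominator.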
Table 2
Average cosine similarity between HurtLex lemmas and ItEM
centroids using Word2vec Twitter embeddings.

       HurtLex Category          Centroid similarity
       animals                                  0.57
       prostitution                             0.60
       professions                              0.60
       negative connotations                    0.55
       homosexuality                            0.59
       male genitalia                           0.52
       female genitalia                         0.56
       derogatory                               0.56
       crime                                    0.57
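The centroid construction and the thresholded Emotive score defined below (Equations (1) and (2)) can be sketched end-to-end with toy 2-d embeddings. The vectors, vocabulary, and category are hypothetical, not the actual Word2vec Twitter embeddings:

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def centroid(vectors):
    """Average the lemma vectors of a HurtLex category (the ItEM centroid)."""
    return [sum(dims) / len(vectors) for dims in zip(*vectors)]

def item_score(w, vocab, cat_centroid, thr):
    """Eq. (1): cosine similarity to the category centroid, floored to 0
    below the threshold or for out-of-vocabulary lemmas."""
    if w not in vocab:
        return 0.0
    sim = cosine(vocab[w], cat_centroid)
    return sim if sim >= thr else 0.0

def emotive(tweet, vocab, cat_centroid, thr):
    """Eq. (2): sum of ItEM scores divided by q, the number of the
    tweet's lemmas that occur in the vocabulary."""
    in_vocab = [w for w in tweet if w in vocab]
    if not in_vocab:
        return 0.0
    return sum(item_score(w, vocab, cat_centroid, thr) for w in in_vocab) / len(in_vocab)

# Toy 2-d embedding space (hypothetical values).
vocab = {"snake": [1.0, 0.0], "viper": [0.9, 0.1], "book": [0.0, 1.0]}
animals_centroid = centroid([vocab["snake"], vocab["viper"]])
score = emotive(["snake", "book", "unknown"], vocab, animals_centroid, thr=0.2)
```

Here "snake" passes the threshold, "book" falls below it and contributes 0, and "unknown" is out of vocabulary, so the score is roughly 0.5 for the toy animals category.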

To enhance our reference vocabulary, we leverage ItEM8, a methodology proposed by Passaro and Lenci [10]. For each lemma in the HurtLex subset, we obtain its vectorial representation using ItEM and the Word2vec Twitter embeddings9, following Godin [27]. For each category, we compute a centroid embedding by averaging the vectors associated with each lemma in that category. This allows us to represent each category through a unique embedding. Tab. 2 reports the average cosine similarity between the lemmas of a specific category and the respective centroid. Finally, we compute the cosine similarity between each word embedding in the Word2vec Twitter vocabulary and each centroid, thus creating a new lexicon featuring a coverage of 76.51% w.r.t. the AMI-PRF dataset.

We leverage the similarity scores to define a hurtful emotive score for each tweet as follows: let t be a lemmatized tweet, w a lemma in t, k one of the 9 HurtLex categories, k̃ the centroid of category k, s the cosine similarity function, and V the set of vocabulary items, i.e. the words for which we have a Twitter embedding. For each w ∈ V, we define the ItEM function as:

\[
\mathit{ItEM}(w, \tilde{k}, \mathit{thr}) =
\begin{cases}
s(w, \tilde{k}) & \text{if } s(w, \tilde{k}) \ge \mathit{thr} \\
0 & \text{if } s(w, \tilde{k}) < \mathit{thr}
\end{cases}
\tag{1}
\]

where thr designates a threshold in the [0, 1] range. In other words, the ItEM function outputs the cosine similarity value between w and k's centroid if such value is greater than or equal to thr, while it outputs 0 if it is lower than thr. Additionally, if w is not found in the vocabulary, its ItEM value is also considered 0.

The Emotive score for a tweet t w.r.t. a category k and a threshold thr is then computed as:

\[
\mathit{Emotive}(\mathbf{t}, k) = \frac{\sum_{w \in \mathbf{t}} \mathit{ItEM}(w, \tilde{k}, \mathit{thr})}{q}
\tag{2}
\]

where q is the number of lemmas in t which occur in V. This allows us to obtain, for each tweet–category pair, a score in [0, 1] indicating the hurtfulness tendency of the tweet.

Figure 3: Emotive z-scores for HurtLex categories with respect to professions.

Discussion  Fig. 3 provides a visual analysis of the results. The Emotive score is computed category-wise as the average of the scores of each tweet, after standardizing the values with a z-score approach. We keep a thr of 0.2 in terms of cosine similarity to filter out excessively noisy category associations, while still allowing low values to contribute to the average score. This provides a general overview of the hurtful language across different professions. According to the Emotive analysis, politicians are mainly targeted with insults related to crime, homosexuality, and male genitalia. This is consistent with what has been observed in Fig. 2, where forms of sexual harassment discourse were mainly directed toward political figures. For artists, we notice a peak w.r.t. female genitalia, while for athletes we register a more balanced trend, except for a peak in negative connotations. On the other hand, authors seem to be mainly targeted with crime- and profession-related topics, consistent with the fact that the types of misogyny most frequently inflicted on this profession are derailing and stereotypes.

5. Conclusion

In this paper, we investigated the phenomenon of misogyny on Twitter through the lens of hurtfulness, qualifying its different manifestations considering the professions of the targets of the misogynistic attacks. Specifically, we examined how different types of misogyny are distributed across various professions, unveiling

8
  https://github.com/Unipisa/ItEM/                           how derailing discourse is mostly used to attack authors,
9
https://github.com/FredericGodin/TwitterEmbeddings
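The scoring pipeline of Eqs. (1)-(2), centroid construction, thresholded cosine similarity, and per-tweet averaging, can be sketched as follows. This is a toy illustration, not the authors' released code: the vocabulary, vectors, and category lexicon below are random stand-ins for the Word2vec Twitter embeddings and the HurtLex lemmas.

```python
import numpy as np

# Toy embedding vocabulary (lemma -> vector): illustrative stand-ins
# for the Word2vec Twitter embeddings used in the paper.
rng = np.random.default_rng(0)
vocab = {w: rng.normal(size=8)
         for w in ["criminal", "thief", "idiot", "sun", "love", "awful"]}

# One HurtLex-style category and its lexicon lemmas (hypothetical).
category_lemmas = {"crime": ["criminal", "thief"]}

def centroid(lemmas):
    """Average the embeddings of a category's in-vocabulary lemmas."""
    return np.mean([vocab[w] for w in lemmas if w in vocab], axis=0)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def item_score(w, k_centroid, thr):
    """Eq. (1): cosine similarity if >= thr, else 0; OOV lemmas score 0."""
    if w not in vocab:
        return 0.0
    s = cosine(vocab[w], k_centroid)
    return s if s >= thr else 0.0

def emotive(tweet_lemmas, k_centroid, thr=0.2):
    """Eq. (2): sum of ItEM scores over q, the tweet's in-vocabulary lemma count."""
    q = sum(1 for w in tweet_lemmas if w in vocab)
    if q == 0:
        return 0.0
    return sum(item_score(w, k_centroid, thr) for w in tweet_lemmas) / q

crime_centroid = centroid(category_lemmas["crime"])
tweets = [["criminal", "idiot"], ["sun", "love"], ["thief", "awful", "oov"]]
scores = np.array([emotive(t, crime_centroid) for t in tweets])

# Category-wise aggregation as in the Discussion: z-score the per-tweet
# scores before averaging over a group of tweets (guarding zero variance).
z = (scores - scores.mean()) / scores.std() if scores.std() > 0 else scores
print(scores.round(3), z.round(3))
```

With thr = 0.2, negative and weakly similar lemmas contribute 0, so each per-tweet score stays in [0, 1] as stated after Eq. (2).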
Dominance and sexual harassment speech, on the other hand, especially targets politicians.

Additionally, through a hurtfulness score measure, we studied how the language used in misogynistic tweets varies across professions: politicians tend to be targeted with hate speech revolving around sexuality (female/male genitalia, homosexuality) and crime, while artists seem to be insulted mainly through general derogatory terms. Less heterogeneous results, instead, were obtained for athletes and authors, except for peaks in hurtful topics regarding crimes and professions.

We acknowledge two potential limitations of our contribution: the incomplete coverage of our dataset's vocabulary by the HurtLex-based ItEM lexicon, and our decision to focus on just four professions, which, as motivated, was guided by the representation of those professions in the AMI dataset. We therefore plan to extend the approach by adopting a vocabulary with richer coverage of the datasets, as well as by expanding the set of professions. As a further future investigation, it could be assessed how the hurtfulness dimensions change when using different lexicons or automatic approaches. We also intend to investigate the distribution of misogynistic language in both textual and multi-modal form, as well as the broader expression of emotions in posts associated with different professions.

Acknowledgments

Research partially funded by PNRR-PE00000013 "FAIR - Future Artificial Intelligence Research" - Spoke 1 "Human-centered AI" under NextGeneration EU, ERC-2018-ADG G.A. 834756 "XAI: Science and technology for the eXplanation of AI decision making" under Horizon 2020, and the PRIN 2022 PIANO (Personalized Interventions Against Online Toxicity) project, CUP B53D23013290006.

References

[1] M. E. David, Reclaiming feminism: Challenging everyday misogyny, Policy Press, 2016.
[2] C. Tileagă, Communicating misogyny: An interdisciplinary research agenda for social psychology, Social and Personality Psychology Compass 13 (2019) e12491.
[3] E. A. Jane, 'Back to the kitchen, cunt': Speaking the unspeakable about online misogyny, Continuum 28 (2014) 558–570.
[4] D. Ging, E. Siapera, Special issue on online misogyny, Feminist Media Studies 18 (2018) 515–524.
[5] J. Marques, Exploring gender at work, Springer, 2021.
[6] L. Fontanella, B. Chulvi, E. Ignazzi, A. Sarra, A. Tontodimamma, How do we study misogyny in the digital age? A systematic literature review using a computational linguistic approach, Humanities and Social Sciences Communications 11 (2024) 1–15.
[7] E. Fersini, D. Nozza, P. Rosso, Overview of the EVALITA 2018 task on automatic misogyny identification (AMI), in: T. Caselli, N. Novielli, V. Patti, P. Rosso (Eds.), Proceedings of the Sixth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2018) co-located with the Fifth Italian Conference on Computational Linguistics (CLiC-it 2018), Turin, Italy, December 12-13, 2018, volume 2263 of CEUR Workshop Proceedings, CEUR-WS.org, 2018. URL: http://ceur-ws.org/Vol-2263/paper009.pdf.
[8] V. Basile, C. Bosco, E. Fersini, D. Nozza, V. Patti, F. M. Rangel Pardo, P. Rosso, M. Sanguinetti, SemEval-2019 task 5: Multilingual detection of hate speech against immigrants and women in Twitter, in: Proceedings of the 13th International Workshop on Semantic Evaluation, Association for Computational Linguistics, Minneapolis, Minnesota, USA, 2019, pp. 54–63. URL: https://aclanthology.org/S19-2007. doi:10.18653/v1/S19-2007.
[9] M. Zampieri, P. Nakov, S. Rosenthal, P. Atanasova, G. Karadzhov, H. Mubarak, L. Derczynski, Z. Pitenis, Ç. Çöltekin, SemEval-2020 task 12: Multilingual offensive language identification in social media (OffensEval 2020), in: Proceedings of the Fourteenth Workshop on Semantic Evaluation, International Committee for Computational Linguistics, Barcelona (online), 2020, pp. 1425–1447. URL: https://aclanthology.org/2020.semeval-1.188. doi:10.18653/v1/2020.semeval-1.188.
[10] L. C. Passaro, A. Lenci, Evaluating context selection strategies to build emotive vector space models, in: N. Calzolari, K. Choukri, T. Declerck, S. Goggi, M. Grobelnik, B. Maegaard, J. Mariani, H. Mazo, A. Moreno, J. Odijk, S. Piperidis (Eds.), Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), Portorož, Slovenia, May 23-28, 2016, European Language Resources Association (ELRA), 2016. URL: http://www.lrec-conf.org/proceedings/lrec2016/summaries/637.html.
[11] A. Bondielli, L. C. Passaro, Leveraging CLIP for image emotion recognition, in: E. Cabrio, D. Croce, L. C. Passaro, R. Sprugnoli (Eds.), Proceedings of the Fifth Workshop on Natural Language for Artificial Intelligence (NL4AI 2021) co-located with the 20th International Conference of the Italian Association for Artificial Intelligence (AI*IA 2021), Online event, November 29, 2021, volume 3015 of CEUR Workshop Proceedings, CEUR-WS.org, 2021. URL: https://ceur-ws.org/Vol-3015/paper172.pdf.
[12] D. Nozza, F. Bianchi, D. Hovy, HONEST: Measuring hurtful sentence completion in language models, in: K. Toutanova, A. Rumshisky, L. Zettlemoyer, D. Hakkani-Tür, I. Beltagy, S. Bethard, R. Cotterell, T. Chakraborty, Y. Zhou (Eds.), Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2021, Online, June 6-11, 2021, Association for Computational Linguistics, 2021, pp. 2398–2406.
[13] M. Rezvan, S. Shekarpour, L. Balasuriya, K. Thirunarayan, V. L. Shalin, A. P. Sheth, A quality type-aware annotated corpus and lexicon for harassment research, in: H. Akkermans, K. Fontaine, I. E. Vermeulen, G. Houben, M. S. Weber (Eds.), Proceedings of the 10th ACM Conference on Web Science, WebSci 2018, Amsterdam, The Netherlands, May 27-30, 2018, ACM, 2018, pp. 33–36. URL: https://doi.org/10.1145/3201064.3201103. doi:10.1145/3201064.3201103.
[14] P. Parikh, H. Abburi, P. Badjatiya, R. Krishnan, N. Chhaya, M. Gupta, V. Varma, Multi-label categorization of accounts of sexism using a neural framework, in: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Association for Computational Linguistics, Hong Kong, China, 2019, pp. 1642–1652. URL: https://aclanthology.org/D19-1174. doi:10.18653/v1/D19-1174.
[15] P. Chiril, F. Benamara, V. Moriceau, "Be nice to your wife! The restaurants are closed": Can gender stereotype detection improve sexism classification?, in: Findings of the Association for Computational Linguistics: EMNLP 2021, Association for Computational Linguistics, Punta Cana, Dominican Republic, 2021, pp. 2833–2844. URL: https://aclanthology.org/2021.findings-emnlp.242. doi:10.18653/v1/2021.findings-emnlp.242.
[16] D. Felmlee, P. Inara Rodis, A. Zhang, Sexist slurs: Reinforcing feminine stereotypes online, Sex Roles 83 (2020) 16–28.
[17] A.-M. Hancock, When multiplication doesn't equal quick addition: Examining intersectionality as a research paradigm, Perspectives on Politics 5 (2007) 63–79.
[18] R. K. Dhamoon, Considerations on mainstreaming intersectionality, Political Research Quarterly 64 (2011) 230–243.
[19] D. Silva-Paredes, D. Ibarra Herrera, Resisting anti-democratic values with misogynistic abuse against a Chilean right-wing politician on Twitter: The #camilapeluche incident, Discourse & Communication 16 (2022) 426–444.
[20] E. B. Phipps, F. Montgomery, "Only YOU Can Prevent This Nightmare, America": Nancy Pelosi as the Monstrous-Feminine in Donald Trump's YouTube attacks, Women's Studies in Communication 45 (2022) 316–337.
[21] J. Ritchie, Creating a monster: Online media constructions of Hillary Clinton during the Democratic primary campaign, 2007–8, Feminist Media Studies 13 (2013) 102–119.
[22] N. Saluja, N. Thilaka, Women leaders and digital communication: Gender stereotyping of female politicians on Twitter, Journal of Content, Community & Communication 7 (2021) 227–241.
[23] S. Ghaffari, Discourses of celebrities on Instagram: Digital femininity, self-representation and hate speech, in: Social Media Critical Discourse Studies, Routledge, 2023, pp. 43–60.
[24] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, I. Polosukhin, Attention is all you need, in: I. Guyon, U. von Luxburg, S. Bengio, H. M. Wallach, R. Fergus, S. V. N. Vishwanathan, R. Garnett (Eds.), Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4-9, 2017, Long Beach, CA, USA, 2017, pp. 5998–6008. URL: https://proceedings.neurips.cc/paper/2017/hash/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html.
[25] S. Barreto, R. Moura, J. Carvalho, A. Paes, A. Plastino, Sentiment analysis in tweets: An assessment study from classical to modern word representation models, Data Mining and Knowledge Discovery 37 (2023) 318–380. URL: https://doi.org/10.1007/s10618-022-00853-0. doi:10.1007/S10618-022-00853-0.
[26] E. Bassignana, V. Basile, V. Patti, HurtLex: A multilingual lexicon of words to hurt, in: E. Cabrio, A. Mazzei, F. Tamburini (Eds.), Proceedings of the Fifth Italian Conference on Computational Linguistics (CLiC-it 2018), Torino, Italy, December 10-12, 2018, volume 2253 of CEUR Workshop Proceedings, CEUR-WS.org, 2018. URL: https://ceur-ws.org/Vol-2253/paper49.pdf.
[27] F. Godin, Improving and interpreting neural networks for word-level prediction tasks in natural language processing, Ph.D. thesis, Ghent University, Belgium, 2019.
A. Supplementary Material
In Figure 4, we display the tree of nested professions based on the Wikidata taxonomy concerning the popular women selected to collect the PRF dataset (§3.2). Branches identify Wikidata subclass of relationships, while dashed lines mark the connections between each woman and the first (or unique) occupation appearing on her Wikidata page. We avoid reporting women's names to maintain anonymity.
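As a sketch of how such a taxonomy can be assembled, the snippet below nests occupations by inverting a handful of hand-transcribed subclass of (P279-style) edges into a parent-to-children map and rendering it depth-first. The edge list is an illustrative excerpt mirroring part of Figure 4, not a live Wikidata query.

```python
from collections import defaultdict

# Hand-transcribed (child, parent) "subclass of" edges; illustrative
# excerpt of the Figure 4 taxonomy, not a live Wikidata query.
SUBCLASS_OF = [
    ("Sportsperson", "Person"), ("Worker", "Person"), ("Creator", "Person"),
    ("Athlete", "Sportsperson"), ("Runner", "Athlete"), ("Sprinter", "Runner"),
    ("Professional", "Worker"), ("Politician", "Professional"),
    ("Author", "Creator"), ("Writer", "Author"),
    ("Artist", "Creator"), ("Musician", "Artist"),
]

def build_children(edges):
    """Invert (child, parent) edges into a parent -> children mapping."""
    children = defaultdict(list)
    for child, parent in edges:
        children[parent].append(child)
    return children

def render_tree(children, node="Person", depth=0, out=None):
    """Depth-first rendering of the nested profession tree as indented lines."""
    if out is None:
        out = []
    out.append("  " * depth + node)
    for child in sorted(children.get(node, [])):
        render_tree(children, child, depth + 1, out)
    return out

children = build_children(SUBCLASS_OF)
print("\n".join(render_tree(children)))
```

In the real pipeline the edges would come from the subclass of property of each woman's first listed occupation; here they are hard-coded for self-containment.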

[Figure 4 (tree diagram): rooted at Person, with branches Sportsperson > ATHLETE (tennis player, runner, sprinter, swimmer, football player, association football player, volleyball player); Worker > Professional, with ACTIVIST (environmentalist, political activist, human rights activist) and POLITICIAN; and Creator, with AUTHOR (writer, non-fiction writer, researcher, scientist, astronaut, biologist, astronomer, microbiologist, astrophysicist, virologist) and ARTIST (director, film director, art director, musician, vocalist, singer, singer-songwriter, performing artist, actor, film actor, visual artist, designer, painter, photographer, sculptor, performance artist, architect, fashion designer).]

Figure 4: Tree of professions held by the group of popular women selected to collect the PRF dataset.