<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Beyond Obscuration and Visibility: Thoughts on the Different Strategies of Gender-Fair Language in Italian</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Martina Rosola</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Simona Frenda</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alessandra Teresa Cignarella</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Matteo Pellegrini</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andrea Marra</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mara Floris</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>CIRCSE Research Centre, Università Cattolica del Sacro Cuore</institution>
          ,
          <addr-line>Milan</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Computer Science, University of Turin</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Università Vita-Salute San Raffaele</institution>
          ,
          <addr-line>Milan</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Università degli Studi di Brescia</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>aequa-tech</institution>
          ,
          <addr-line>Turin</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This study focuses on the growing importance of gender-fair language and explores innovative strategies, also proposed in other languages, to avoid gender-specific endings. We present a set of guidelines for the annotation and reformulation of gender-(un)fair texts and their application to a corpus of 1,024 portions of university administrative documents in Italian. Overall, the guidelines presented in this study prove to be valuable both practically and theoretically. They help identify and address non-inclusive expressions while highlighting the complexities of obscuration and visibility in gender-fair language reformulation. In addition, the statistical analysis of the resulting corpus shows that administrative texts tend to contain gender-unfair language, especially overextended masculine expressions, which points to the need for specific and complete guidelines that lead to the use of more gender-fair language and support staff training.</p>
      </abstract>
      <kwd-group>
        <kwd>Annotation Schema</kwd>
        <kwd>Italian</kwd>
        <kwd>Gender-fair language</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Recognizing the importance of rectifying such language discrepancies, various guidelines have been published over the years [e.g., 5]. The annotation scheme we propose draws upon the recommendations presented in the available guidelines to develop a comprehensive framework for addressing gender-unfair expressions in Italian language usage. To the best of our knowledge, our annotation scheme represents a novel approach. While another project (i.e., E-MIMIC) focuses on inclusive language, it simply distinguishes between inclusive and non-inclusive texts [<xref ref-type="bibr" rid="ref6">6</xref>]. Our annotation scheme appears to be the first one distinguishing between different types of gender-unfair language, and it comprehensively considers all the gender-fair options when it comes to offering alternative wordings. Moreover, by applying this annotation scheme to various administrative texts, we showed that, despite the existence of various guidelines, such texts remain permeated with gender-unfair expressions.</p>
      <p>In this work, we first review previous studies on this topic, both in theoretical linguistics (subsection 2.1) and in NLP (subsection 2.2). We then describe in detail the annotation scheme (section 3) and the creation of the annotated corpus (section 4), also providing a preliminary analysis of the data gathered so far.</p>
      <p>2. Related Work</p>
      <p>2.1. Linguistics</p>
      <p>(in)visibilization of women in language, more recent scholarly and activist debates also concern how to address and talk about non-binary people, i.e., those who do not exclusively identify as men or women, aiming at making them visible too.</p>
      <p>The Italian grammatical gender system, indeed, is binary and does not provide a straightforward way to refer to non-binary people. Various gender-neutral suffixes are in use, such as ‘-@’ or ‘-u’ (see [11] for a comprehensive list). As González Vázquez et al. [12] observe, such innovative proposals can be employed to make gender visible, as in “tutti, tutte e tuttu” (everyone:m.pl, everyone:f.pl, and everyone:inn.pl)<sup>2</sup>, or to neutralize it, as in “tuttu” (everyone:inn.pl) used for a mixed-gender group.</p>
      <p>The implementation of innovative strategies also depends on the features of the language. Marcato and Thüne [13] provide an analysis of the Italian grammatical gender system, distinguishing between nouns whose referential gender is expressed by different lexical roots (e.g., “madre”, mother:f.sg, and “padre”, father:m.sg); nouns with mobile gender, whose referential gender is specified through the addition of different suffixes to the same lexical root (e.g., “figlia”, daughter:f.sg, and “figlio”, son:m.sg); and the so-called epicene nouns, whose gender is not overtly marked, but only revealed by satellite elements – i.e., the noun’s determiners and modifiers (e.g., “la nipote”, the.f.sg niece:f.sg; “il nipote”, the.m.sg nephew:m.sg). As Formato [14] observes, some nouns (i.e., ‘semi-epicene’) work in the latter way only in the singular and have different gendered suffixes in the plural (e.g., “giornalista”, journalist:f/m.sg; “giornaliste”, journalist:f.pl; “giornalisti”, journalist:m.pl). Finally, a few nouns refer to individuals of any gender irrespective of their grammatical gender (e.g., “persona”, person:f.sg).</p>
      <p>Due to this peculiar characteristic, these nouns can be straightforwardly used to refer to non-binary people as well. Moreover, gender-neutral suffixes are not required for epicene (and, in the singular, semi-epicene) nouns, as they are not overtly marked for gender. In this case, the only precaution needed to get a gender-neutral form concerns the choice of gender-neutral satellite elements or their gender-neutralization. Gender-neutral suffixes are also ineffective for nouns like madre and padre, where it is the root that is overtly marked for gender. These word endings, thus, should only be used with nouns with mobile gender, which, however, constitute the vast majority of Italian animate nouns (see [15], p. 106). Innovative strategies should also be used for the many gendered pronouns, determiners, past participles, and adjectives in order to make them gender-neutral and suitable to refer to non-binary people and to mixed-gender groups.</p>
      <p>Formato [14] also provides a taxonomy of linguistic
us</p>
      <sec id="sec-1-1">
        <title>Sexism in language and how to make Italian gender-fair</title>
        <p>
          are increasingly studied and debated topics (see [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] for
an overview). The classic reference point in the literature
is Sabatini [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ], which comprises an analysis of sexism
in the Italian language and recommendations on how to
overcome it.
        </p>
        <p>
          Sabatini [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] identifies grammatical and semantic
asymmetries, namely gender-unfair grammatical and
discursive or lexical linguistic conventions. The use of
masculine terms for mixed-gender groups belongs to the former,
while the exclusive use of adjectives for one gender (e.g.,
grazioso, ‘pretty’, is hardly used for men) instantiates
the latter. On top of avoiding semantically sexist
expressions, Sabatini [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] advises using feminine job titles for
women and conjoining masculine and feminine forms
for mixed-gender groups.
        </p>
        <p>Her recommendations have been expanded and adapted by several private and public bodies, which have issued gender-fair language guidelines [e.g., 9, 5, 10]. These works, among others, distinguish between strategies aimed at symmetrizing language by giving women the same visibility that men have, and strategies aimed at getting rid of sexism through the avoidance of gendered forms altogether. While Sabatini and the subsequent guidelines focus on the discrimination and</p>
      </sec>
      <sec id="sec-1-2">
        <title><sup>2</sup> We label such forms as inn ‘innovative’.</title>
        <p>ages influenced by gendered assumptions. Like Sabatini,
Formato focuses on both sexist expressions and
linguistic conventions. Among the latter, Formato originally
distinguishes the case in which masculine terms are used
for mixed-gender groups from those in which they are
used for unknown or generic individuals.</p>
        <p>In our framework, we elaborate on the categories identified in these works to develop our own taxonomy, which we implement in the annotation scheme described in section 3.</p>
        <p>2.2. Natural Language Processing</p>
        <p>In recent years, sexism and gender-(un)fair practices have been addressed in Computational Linguistics, mostly focusing on the presence of gender bias in automatic systems. As highlighted in Costa-jussà [16], studies on gender bias in NLP serve a dual role. On the one hand, NLP can function as a tool to identify gender bias in various social domains such as online news or advertisements. On the other hand, it frequently generates gender-biased systems, thus contributing to the perpetuation and reinforcement of gender bias within society. This bias in NLP is predominantly attributed to the training of models on datasets that exhibit inherent biases. Consequently, the amplification of bias occurs through the learning algorithms employed in NLP systems.</p>
        <p>Some specific studies have been conducted in the field of Machine Translation. One of the most recent was carried out by Rescigno et al. [17], who explored how three of the most popular translation systems (Google Translate, Bing Microsoft Translator, and DeepL) handle gender phenomena in natural languages, such as pronouns, job titles, and occupation names. The authors compared the translations generated from English into Italian, French, and Spanish, revealing that all three systems exhibit some level of gender bias, with Google Translate producing more biased translations, Bing Microsoft Translator displaying a lesser degree of bias, and DeepL generally being more gender-neutral.</p>
        <p>Similarly to Costa-jussà [16], Sun et al. [18] conducted a comprehensive literature review, exploring various strategies proposed in existing research to address gender bias, including dataset preprocessing, algorithmic modifications, and post-processing techniques. The paper emphasizes the significance of mitigating gender bias in NLP systems and highlights the challenges associated with bias detection and mitigation.</p>
        <p>More recently, Stanczak and Augenstein [19] identify four key limitations in current research on gender bias in NLP. Firstly, social gender is often treated as a binary variable, not paying attention to its fluidity and continuity. Secondly, studies usually give more importance to high-resource languages - primarily English - neglecting the diversity of languages spoken globally. Thirdly, despite the abundance of papers on gender bias, many newly developed algorithms lack sufficient bias testing and fail to address ethical considerations. Lastly, the methodologies employed in this area often lack comprehensive definitions of gender bias and robust evaluation baselines and pipelines.</p>
        <p>The present work helps address many of these issues: we explicitly take into account the linguistic representation of non-binary individuals, we create an annotation scheme for Italian – for which far fewer NLP studies are available compared to English – and finally, we present an annotated corpus that could be exploited for the training of automatic NLP tools.</p>
        <p>3. Annotation Scheme</p>
        <p>The annotation task is divided into two parts. A first annotation layer concerns the identification of portions of text(s) where gender-unfair language is used, and the assignment of each of them to a specific type among the following ones:</p>
        <p>• ‘incongruous’ (It. “incongruo”), when the grammatical gender of the noun (and, possibly, of its modifiers) does not match the gender of the referent identified in discourse (e.g., “il ministro del turismo, Daniela Santanché”, the.m.sg minister:m.sg of.the tourism, Daniela Santanché);</p>
        <p>• ‘overextended’ (It. “sovraesteso”), when the masculine (or, in rare cases, feminine) grammatical gender is used to refer to a group composed of people with different genders (e.g., “il rapporto con i professori è buono”, the relationship with the.m.pl professor:m.pl is good, with reference to a group of teachers possibly comprising men, women, and non-binary individuals);</p>
        <p>• ‘generic’ (It. “generico”), when the masculine (or, in rare cases, feminine) grammatical gender is used to refer to a generic or specific, but unknown, person, whose actual gender cannot be guessed (e.g., “il vincitore riceverà un premio”, the.m.sg winner:m.sg will.receive a bonus, where the identity of the winner is unknown, and so is their gender).</p>
        <p>A second annotation layer concerns the proposal of gender-fair reformulations of the portions of texts identified as unfair,<sup>3</sup> and the assignment of each reformulation to a specific type.</p>
        <p>As for cases of incongruous gender, the only possible type of gender-fair solution is providing a ‘congruous’ (It. “congrua”) alternative option, where the grammatical gender matches the gender of the referent (e.g., “la</p>
        <p><sup>3</sup> At least one reformulation for each gender-unfair portion is required, but more than one reformulation is often provided.</p>
        <table-wrap>
          <label>Table 1</label>
          <table>
            <thead>
              <tr><th></th><th>visibility</th><th>obscuration</th></tr>
            </thead>
            <tbody>
              <tr><th>conservative</th><td>il vincitore o la vincitrice</td><td>chi vincerà</td></tr>
              <tr><th>innovative</th><td>il vincitore o la vincitrice o l* vincitor*</td><td>l* vincitor*</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <p>Data Collection. The data made available by the University of Brescia include a range of administrative materials such as the department’s strategic plan, reports from the departmental council and parity commission, as well as various forms. Most of them are already publicly available on the University’s website; others, such as the forms, were requested from the administrative offices. For this pioneering study, we collected 13 documents.</p>
      </sec>
      <sec id="sec-1-3">
        <title>These distinctions generate a four-way contrast, that is</title>
        <p>illustrated in Table 1, where one example reformulation
per type is provided for the phrase il vincitore (see the
Appendix for other examples).</p>
        <p>Lastly, ‘mixed’ (It. “ibride”) reformulations use different strategies for different elements in the gender-unfair portion of text, e.g. l* vincitore o vincitrice (the.inn.sg winner:m.sg or winner:f.sg), where an innovative obscuration strategy is used for the article and a conservative visibility strategy is used for the noun.</p>
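        <p>The two annotation layers can be pictured as a small data model. The sketch below is only an illustration under our own assumptions: the class and field names (UnfairType, ReformulationType, UnfairSpan, is_complete) are hypothetical and are not taken from the authors' tooling. It encodes the three types of gender-unfair expression, the reformulation types discussed in the text (congruous, the four visibility/obscuration combinations, and mixed), and the requirement that each unfair span receives at least one reformulation.</p>
        <preformat>
```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class UnfairType(Enum):
    """First layer: type of gender-unfair expression (Italian label as value)."""
    INCONGRUOUS = "incongruo"
    OVEREXTENDED = "sovraesteso"
    GENERIC = "generico"


class ReformulationType(Enum):
    """Second layer: type of gender-fair reformulation."""
    CONGRUOUS = "congruous"
    CONSERVATIVE_VISIBILITY = "conservative visibility"
    INNOVATIVE_VISIBILITY = "innovative visibility"
    CONSERVATIVE_OBSCURATION = "conservative obscuration"
    INNOVATIVE_OBSCURATION = "innovative obscuration"
    MIXED = "mixed"


@dataclass
class Reformulation:
    text: str
    kind: ReformulationType


@dataclass
class UnfairSpan:
    text: str
    kind: UnfairType
    reformulations: List[Reformulation] = field(default_factory=list)

    def is_complete(self) -> bool:
        # At least one reformulation per gender-unfair span is required.
        return len(self.reformulations) >= 1


# The running example of the paper: "il vincitore" used generically.
span = UnfairSpan("il vincitore", UnfairType.GENERIC)
examples = [
    ("il vincitore o la vincitrice", ReformulationType.CONSERVATIVE_VISIBILITY),
    ("il vincitore o la vincitrice o l* vincitor*", ReformulationType.INNOVATIVE_VISIBILITY),
    ("chi vincerà", ReformulationType.CONSERVATIVE_OBSCURATION),
    ("l* vincitor*", ReformulationType.INNOVATIVE_OBSCURATION),
    ("l* vincitore o vincitrice", ReformulationType.MIXED),
]
for text, kind in examples:
    span.reformulations.append(Reformulation(text, kind))

for r in span.reformulations:
    print(r.kind.value, "|", r.text)
```
        </preformat>
        <p>Keeping the reformulation type separate from the reformulated text mirrors the scheme's design, in which one unfair span can receive several alternatives of different types.</p>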
        <p>• ‘visibility’ (It. “visibilità”) strategies, that make the possible reference to persons with different genders explicit by means of the use of different grammatical genders; vs. ‘obscuration’ (“oscuramento”) strategies, that try to avoid the use of expressions that reveal the (assumed) gender of referents;</p>
        <p>• ‘conservative’ (It. “conservative”) strategies, that only use expressions that are part of the grammatical system of the standard variety of Italian; vs. ‘innovative’ (It. “innovative”) strategies, that introduce new means of expression into the system.</p>
        <p>Data Preprocessing. All the documents were converted into plain text to be processed automatically. To deal with the special format of forms or the table layout of some reports, we designed various regular expressions to clean and prepare the texts for segmentation into sentences. To support the annotation phase, we flagged, for each sentence, the possible presence of discrepancies: any words from the enriched lexicon of profession names based on Sabatini [<xref ref-type="bibr" rid="ref8">8</xref>] that occurred in the sentence were displayed below each task. Out of a total of 1,024 sentences, 409 contained such words. However, the annotators detected at least one unfair expression in 422 sentences. The lexicon has been updated to include all the words pointed out by the annotators.</p>
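        <p>The preprocessing described above can be sketched roughly as follows. This is a minimal illustration and not the authors' pipeline: the cleaning patterns, the naive sentence splitter, the function names (clean, split_sentences, flag) and the five-word lexicon are all assumptions made for the example, whereas the actual regular expressions and the enriched lexicon are not given in the paper.</p>
        <preformat>
```python
import re

# Hypothetical excerpt of a lexicon of potentially gender-unfair terms.
LEXICON = {"docenti", "ricercatori", "professori", "studenti", "essi"}


def clean(text):
    """Remove form filler (runs of underscores or dots) and collapse whitespace."""
    text = re.sub(r"[_.]{3,}", " ", text)
    text = re.sub(r"\s+", " ", text)
    return text.strip()


def split_sentences(text):
    """Naive segmentation: a sentence runs up to the next '.', '!' or '?'."""
    parts = re.findall(r"[^.!?]+[.!?]*", text)
    return [p.strip() for p in parts if p.strip()]


def flag(sentence, lexicon=LEXICON):
    """Return the lexicon words occurring in the sentence, sorted."""
    tokens = re.findall(r"\w+", sentence.lower())
    return sorted(set(tokens).intersection(lexicon))


raw = (
    "Nome: ______   I docenti devono partecipare.\n"
    "Il rapporto con i professori è buono. La riunione è alle 10."
)
for sentence in split_sentences(clean(raw)):
    print(sentence, "|", flag(sentence))
```
        </preformat>
        <p>A lexicon match is only a hint for the annotator. As the counts above show, more sentences were judged unfair (422) than contained a lexicon word (409), so a filter of this kind cannot replace manual annotation.</p>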
      </sec>
      <sec id="sec-1-4">
        <title>4. Corpus</title>
        <p>The annotation scheme was applied by 5 expert annotators of gender-fair language to a small corpus of administrative texts from the University of Brescia. Unlike other textual genres, administrative texts, because of their format and technical language, required a specific preprocessing step so that the annotators could focus on the spans of text that could contain discrepancies.</p>
        <p>To this purpose, we employed the original lexicon of professional names taken into account in Sabatini [<xref ref-type="bibr" rid="ref8">8</xref>], enriching it with terms especially pertaining to Academia or terms that could be used in an overextended way (e.g., “essi”, they:m.pl). Below, we describe the steps of document collection, preprocessing, and annotation. Finally,</p>
        <p>Annotation Process. All annotators were trained on the annotation scheme, which was analysed during an initial meeting. Doubtful cases were discussed in regular bi-weekly meetings. In addition, we kept a file of notes in which we reviewed and discussed uncertain cases as they arose. The annotation scheme was subsequently updated in response to the insights gathered both in the file and in the meetings. The annotation process was carried out on the Label Studio platform<sup>4</sup>, creating a specific interface that facilitates the two layers of annotation: the identification of the gender-unfair expression, and the reformulation with one or more alternatives. The interface provided a section for comments in order to encourage reflection on the annotation scheme and collect insights from the annotators.</p>
        <p><sup>4</sup> https://labelstud.io/</p>
        <p>Even though the amount of analyzed data may seem small, the annotation task was carried out from October 2022 to January 2023 by 5 experts in gender-(un)fair language (philosophers of language, linguists, and computational linguists), who provided alternatives for each textual span identified as unfair in the sentence.</p>
        <p>Preliminary Analysis. Thanks to this two-layer annotation process, we created a corpus of 422 sentences where at least one gender-unfair expression has been identified, and 602 sentences where no gender-unfair expression has been identified.</p>
        <p>In the 422 sentences containing gender-unfair expressions, the annotators detected on average 3 textual spans per sentence containing gender-unfair language (for a total of 3,195 portions) and proposed from 1 to about 11 alternatives.</p>
        <p>Moreover, looking at the frequencies of the types of unfair expressions identified in the corpus, we can see from Table 2 that the most common case of gender-unfair language in administrative documents is represented by the use of overextended forms, and in particular of the overextended masculine (e.g., “i ricercatori”, the.m.pl researchers:m.pl; and “i docenti”, the.m.pl teachers, for mixed-gender groups of, respectively, researchers and teachers).</p>
        <table-wrap>
          <label>Table 2</label>
          <table>
            <thead>
              <tr><th># portions</th><th>type</th></tr>
            </thead>
            <tbody>
              <tr><td>2,709</td><td>overextended</td></tr>
              <tr><td>452</td><td>generic</td></tr>
              <tr><td>34</td><td>incongruous</td></tr>
              <tr><td>3,195</td><td>total</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <p>the main purpose of this paper is to present the process of our work. As we mentioned before, we created a novel annotation scheme for the Italian language, which allows a fine-grained distinction between different cases of gender discrepancies. Moreover, the scheme has continuously been discussed between authors and annotators, mostly concerning the interpretation of labels, such as the sensitive distinction between “overextended” and “generic” gender-unfair expressions. Last but not least, the identified span did not just contain gender-unfair expressions, but any element that needs to be changed in order to get a gender-fair text. For example, if an annotator decided to propose "il corpo docenti" (the teaching staff) instead of "i docenti" (the.m.pl teachers) in "i docenti devono partecipare" (the teachers have to participate), they also have to select the verb "devono" for it to agree in number with "il corpo docenti". Crucially, the verb only has to be selected if the proposed alternative to "i docenti" is singular and, thus, the verb needs to be singular too. Hence, if another annotator doesn’t propose a singular alternative to "i docenti", they won’t need to select the verb. As a result, the two annotators would select different spans even when agreeing on which expressions within the text are gender-unfair. For this reason, even comparing just the textual spans between annotators would not be a good indicator of the annotators’ agreement.</p>
        <p>However, in Table 3 below we provide an example of a sentence with the reformulations proposed by the five annotators for each gender-unfair span of text, to give an idea of the kind of variation that can be found.</p>
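        <p>The type frequencies reported in Table 2 can be turned into relative shares with a few lines of code. The snippet below is a small sketch that only uses the counts given in the table. The variable names are ours and the rounding to one decimal place is an arbitrary choice.</p>
        <preformat>
```python
# Counts of gender-unfair portions per type, as reported in Table 2.
counts = {"overextended": 2709, "generic": 452, "incongruous": 34}

total = sum(counts.values())

# Share of each type as a percentage of all portions, rounded to one decimal.
shares = {kind: round(100 * n / total, 1) for kind, n in counts.items()}

print("total portions:", total)
for kind, share in shares.items():
    print(kind, share)
```
        </preformat>
        <p>The three counts add up to the reported total of 3,195 portions, and overextended forms account for roughly 85% of them, which is consistent with the observation that they are by far the most common case.</p>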
      </sec>
    </sec>
    <sec id="sec-2">
      <title>5. Conclusions and Future Work</title>
      <p>In recent years, gender-fair language has gained significant attention, leading to the proposal of new strategies in various languages to avoid using masculine or feminine endings. Motivated by these theories, we conducted a study to examine the usage of different solutions in practical situations. We developed guidelines for gender-fair annotation and reformulation of texts, which we applied to a corpus of 1,024 portions of university administrative documents in Italian.</p>
      <p>The corpus was annotated by 5 experts, and in 422 cases the annotators identified at least one gender-unfair expression. The preliminary analysis of this corpus highlighted the need to adopt specific guidelines (as well as a list of words to pay particular attention to) to support administrative staff in writing gender-fair texts.</p>
      <p>Applying our annotation and reformulation guidelines
to real data has led to theoretical advancements: we
discovered that ‘obscuration’ and ‘visibility’ strategies can
coexist within the same reformulation, and we
consequently updated the annotation scheme to include ‘mixed’</p>
      <sec id="sec-2-1">
        <title>Agreement A quantitative measure of inter-annotator</title>
        <p>reliability has not been calculated for several reasons.</p>
        <p>First, the scheme provides a variety of gender-fair options and the choice of a specific alternative depends on several factors, including individual preference: one annotator might agree on "l@ Professor@" (the Professor:INN.SG) being gender-fair, but choose a different innovative option, like "lu Professoru" (the Professor:INN.SG), instead.</p>
        <p>Therefore, comparing the alternatives provided by each annotator is not a good measure of whether the annotators consider a specific option an appropriate gender-fair alternative to a certain gender-unfair expression. Relatedly, we do not plan to release an aggregated dataset, inclusive of a set of “gold-standard” preferred labels. Indeed, the aim of the present work is not to consolidate a ‘ground truth’ among rephrasing strategies but rather to explore as many solutions as possible while using gender-fair language. This is tied to our focus on methodology:</p>
        <p>Text: A partire dal 2013 il DiGi ha organizzato ogni anno International Summer schools, allo scopo di attrarre studenti stranieri e di offrire agli studenti bresciani l’opportunità di entrare in contatto con studenti e docenti di altri Paesi.
ann.</p>
        <p>span</p>
        <sec id="sec-2-1-1">
          <title>Innovative visibility</title>
        </sec>
        <sec id="sec-2-1-2">
          <title>Conservative visibility</title>
        </sec>
        <sec id="sec-2-1-3">
          <title>Conservative obscuration</title>
        </sec>
        <sec id="sec-2-1-4">
          <title>Innovative obscuration</title>
          <p>A
B
C
D
E
bres–
‘di origine straniera’ or ‘di
nazionalità estera’
a ‘studenti provenienti
dalla provincia di Brescia’
‘studenti di università
estere’
‘a coloro che studiano
all’Università di Brescia’
‘dall’estero’
–
‘dalla provincia di Brescia’
–
–
–
–
–
–
–
‘stranieri’
‘agli studenti
bresciani’
‘studenti
stranieri’
‘agli studenti
bresciani’
–
–
–
–
‘studenti
stranieri’
‘studenti
ciani’
‘studenti’
‘studenti/esse
stranieri/e’
bres- ‘studenti/esse
bresciani/e’
‘studenti/esse’
‘stranieri’
‘agli studenti’
‘bresciani’
‘stranieri’
‘agli studenti’
‘bresciani’
‘straniere/i’
‘alle/agli
denti’
‘bresciane/i’
‘straniere/i’
‘alle/agli
denti’
‘bresciane/i’
stu–
–
–
–
–
–
–
‘straniere/i/3’ or ‘di nazionalità straniera’ –
‘straniere/i/u’
stu- ‘alle/agli/all3 studenti’ or ‘alle persone che studi- ‘all3 studenti’ or
‘alle/agli/allu studenti’ ano’ ‘all* studenti’
‘bresciane/i/3’ or ‘bres- ‘del bresciano’, ‘di area –
ciane/i/u’ bresciana’, or ‘di Brescia’</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Ethics Statement</title>
      <p>The annotators have been paid in the context of the actions provided by the Gender Equality Plan of the University of Brescia. The annotation time was monitored to ensure that the actual time spent annotating matched the agreed-upon paid hours.</p>
      <p>Limitations</p>
      <p>Our work presents some limitations. Firstly, the sample of analyzed texts is small and related to a specific domain. To test the robustness of the proposed guidelines, we plan to expand this corpus and its analysis. Secondly, in this work we presented an annotation schema to recognize gender-unfair language and to reformulate it, specifically for Italian, limiting its adaptation to other languages.</p>
      <p>alternatives.</p>
      <p>To summarize, the annotation scheme has proven valuable both practically and theoretically. It facilitated the identification of gender-unfair expressions and the formulation of alternatives. Moreover, it revealed the inadequacy of an exclusive distinction between obscuration and visibility, emphasizing the need to incorporate a new type of strategy (i.e., ‘mixed’ alternatives) into the classification.</p>
      <p>Although the annotation scheme has been applied only to administrative texts so far, the guidelines are formulated in such a way that they can be easily applied to data pertaining to different domains. Indeed, we plan to extend the annotation to other data, like the web pages of a University that describe its organization and its events. Finally, the resulting corpus, composed of 3,195 portions of texts identified as gender-unfair and reformulated with at least one alternative, could be used in the context of training models to recognize gender-unfair expressions and suggest their alternatives.</p>
      <p>Figure 1: Label Studio set up with a text containing an overextended and a generic gender-unfair text span.</p>
      <p>Figure 2: Label Studio set up with a text containing an incongruous gender-unfair text span.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S.</given-names>
            <surname>Sczesny</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Formanowicz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Moser</surname>
          </string-name>
          ,
          <article-title>Can gender-fair language reduce gender stereotyping and discrimination?</article-title>
          ,
          <source>Frontiers in Psychology</source>
          <volume>7</volume>
          (
          <year>2016</year>
          )
          <fpage>25</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J.</given-names>
            <surname>Saul</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Diaz-Leon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hesni</surname>
          </string-name>
          , Feminist Philosophy of Language, in: E. N.
          <string-name>
            <surname>Zalta</surname>
          </string-name>
          , U. Nodelman (Eds.),
          <source>The Stanford Encyclopedia of Philosophy</source>
          , Fall 2022 ed., Metaphysics Research Lab, Stanford University,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>P.</given-names>
            <surname>Gygax</surname>
          </string-name>
          , U. Gabriel,
          <string-name>
            <given-names>O.</given-names>
            <surname>Sarrasin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Oakhill</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Garnham</surname>
          </string-name>
          ,
          <article-title>Generically intended, but specifically interpreted: When beauticians, musicians, and mechanics are all men</article-title>
          ,
          <source>Language and Cognitive Processes</source>
          <volume>23</volume>
          (
          <year>2008</year>
          )
          <fpage>464</fpage>
          -
          <lpage>485</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>P.</given-names>
            <surname>Gygax</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sato</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Öttl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Gabriel</surname>
          </string-name>
          ,
          <article-title>The masculine form in grammatically gendered languages and its multiple interpretations: a challenge for our cognitive system</article-title>
          ,
          <source>Language Sciences</source>
          <volume>83</volume>
          (
          <year>2021</year>
          )
          <fpage>101328</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <collab>MIUR</collab>
          ,
          <article-title>Linee guida per l'uso del genere nel linguaggio amministrativo del MIUR</article-title>
          , MIUR,
          <year>2018</year>
          . URL:
          <uri>https://www.miur.gov.it/documents/20182/0/Linee_Guida_per_l_uso_del_genere_nel_linguaggio_amministrativo_del_MIUR_2018.pdf/3c8dfbef-4dfd-475a-8a29-5adc0d7376d8?version=1.0&amp;t=1520428640228</uri>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>G.</given-names>
            <surname>Attanasio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Greco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>La Quatra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Cagliero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Raus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Tonti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Cerquitelli</surname>
          </string-name>
          ,
          <article-title>E-MIMIC: Empowering multilingual inclusive communication</article-title>
          ,
          <source>IEEE Big Data Workshops</source>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>G.</given-names>
            <surname>Sulis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Gheno</surname>
          </string-name>
          ,
          <article-title>The debate on language and gender in Italy, from the visibility of women to inclusive language (1980s-2020s)</article-title>
          ,
          <source>The Italianist</source>
          <volume>42</volume>
          (
          <year>2022</year>
          )
          <fpage>153</fpage>
          -
          <lpage>183</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1080/02614340.2022.2125707</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>A.</given-names>
            <surname>Sabatini</surname>
          </string-name>
          ,
          <article-title>Il sessismo nella lingua italiana. Raccomandazioni per un uso non sessista della lingua italiana</article-title>
          , Rome: Istituto Poligrafico e Zecca dello Stato (
          <year>1987</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>C.</given-names>
            <surname>Robustelli</surname>
          </string-name>
          ,
          <article-title>Donne, grammatica e media. Con prefazione di N. Maraschio, presidente Accademia della Crusca</article-title>
          , in: Donne Grammatica e Media, ITA,
          <year>2014</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>79</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Thornton</surname>
          </string-name>
          ,
          <article-title>Per un uso della lingua italiana rispettoso dei generi</article-title>
          ,
          <source>Università degli Studi</source>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>