<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Building Foundations for Inclusiveness through Expert-Annotated Data</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Moreno La Quatra</string-name>
          <email>moreno.laquatra@unikore.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Salvatore Greco</string-name>
          <email>salvatore_greco@polito.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Luca Cagliero</string-name>
          <email>luca.cagliero@polito.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michela Tonti</string-name>
          <email>michela.tonti@unibg.it</email>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Francesca Dragotto</string-name>
          <email>francesca.dragotto@gmail.com</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Rachele Raus</string-name>
          <email>rachele.raus@unibo.it</email>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Stefania Cavagnoli</string-name>
          <email>stefania.cavagnoli@uniroma2.it</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tania Cerquitelli</string-name>
          <email>tania.cerquitelli@polito.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Kore University of Enna</institution>
          ,
          <addr-line>Enna</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Politecnico di Torino</institution>
          ,
          <addr-line>Turin</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Università degli Studi di Roma Tor Vergata</institution>
          ,
          <addr-line>Rome</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Università degli studi di Bergamo</institution>
          ,
          <addr-line>Bergamo</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>Università di Bologna</institution>
          ,
          <addr-line>Bologna</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Natural Language Understanding and Generation models have a limited grasp of the nuances of inclusive communication because they are trained on massive data that often includes significant portions of non-inclusive content. Even when the models are specifically designed to address non-inclusive language detection or reformulation, they disregard, to a large extent, inclusiveness-related features that are likely correlated with the inclusive language nuances, such as the discourse type, level of inclusiveness, and intended context of use. To assess the importance of additional inclusiveness-related features, we collect a new corpus of Italian administrative documents manually annotated by linguistic experts. The experts not only highlight non-inclusive text snippets and propose possible reformulations, but also annotate multi-aspect labels related to different inclusive language nuances. We empirically show that a multi-task learning approach that leverages the multi-aspect annotations can improve the non-inclusive text reformulation performance, thereby confirming the potential of expert-annotated data in inclusive language processing.</p>
      </abstract>
      <kwd-group>
        <kwd>inclusive language</kwd>
        <kwd>natural language processing</kwd>
        <kwd>text generation</kwd>
        <kwd>deep learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Non-inclusive expressions are widespread in
human-written documents [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Training Natural Language
Understanding and Generation models on massive data exposes them
to bias issues related to language inclusiveness. Addressing
this issue is particularly relevant because Artificial
Intelligence (AI)-based solutions must be used responsibly to
correctly model inclusive language practices and not
unintentionally marginalize or disadvantage certain groups.
      </p>
      <p>
        To mitigate the presence of bias in data, applications based
on AI rely on human supervision for model training and
post-processing evaluation. This is quite common in the
areas of Natural Language Understanding and Generative AI,
in which applications like Large Language Models (LLMs)
provide end-users with conversational and language editing
services [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        The computational linguistic community has agreed
on the need to leverage human expert annotations in
experience-based learning for bias detection and
mitigation [
        <xref ref-type="bibr" rid="ref3 ref4 ref5 ref6">3, 4, 5, 6</xref>
        ]. However, the linguistics literature often
underestimates the importance of linguistic annotators
because of the widespread tendency to value the figures of
pre- and post-editors [
        <xref ref-type="bibr" rid="ref7 ref8">7, 8</xref>
        ]. Editing and annotation are
substantially different: while language editing tools rewrite parts of
the source text based on predefined expert-provided rules,
Natural Language Understanding and Generation models
can leverage annotations to capture the nuances of
annotated text in a self-supervised manner. The use of textual
annotations also relieves annotators of the task of explicitly
formulating or adhering to ad hoc linguistic rules.
      </p>
      <p>
        In the context of inclusive language understanding and
generation, most previous work relies on rule-based approaches
or round-trip translations to annotate texts for inclusivity
issues [
        <xref ref-type="bibr" rid="ref10 ref11 ref12 ref9">9, 10, 11, 12</xref>
        ]. However, these works often overlook the
significance of human expert annotations, opting instead
for rule-based approaches or artificially created datasets
generated through round-trip translations. The role of
linguistic annotators in providing specific understanding and
annotations of language data is crucial for developing more
inclusive AI models [
        <xref ref-type="bibr" rid="ref13 ref14">13, 14</xref>
        ].
      </p>
      <p>
        A limited body of work has been devoted to generating
and exploiting multi-faceted expert human annotations to
drive AI models for inclusive language, e.g., [
        <xref ref-type="bibr" rid="ref15 ref16 ref17">15, 16, 17</xref>
        ].
However, existing benchmarks of annotated text for
inclusive language processing neglect potentially relevant
aspects such as the level of inclusiveness, the intended context
of use, and the text genre. These aspects have the
potential to improve the inclusive language understanding and
generation capabilities of AI models.
      </p>
      <p>This paper proposes an expert-annotated dataset
covering these new aspects and investigates their usefulness in
enhancing the performance of the task of non-inclusive text
reformulation in the absence of rule-based editing models.</p>
      <p>To this end, we enrich a corpus of Italian administrative
documents with multi-aspect annotations, providing more
insights into the inclusive language nuances. The purpose is
to enable the study of new features describing inclusiveness
aspects neglected by existing approaches, such as the level of
inclusiveness, register, and genre. By enriching the language
descriptions with new inclusiveness-related features, we
provide the research community with new resources to
enhance the understanding and writing capabilities of
AI-based solutions.</p>
      <p>We also collect preliminary results on the use of
multi-aspect annotations in a multi-task learning approach to
enhance non-inclusive language reformulations. The results
confirm the potential of the inclusiveness-related expert
annotations.</p>
    </sec>
    <sec id="sec-2">
      <title>2. The annotation process</title>
      <p>
        The term annotation is often used to indicate the process
by which textual data are subjected to a tightly interrelated
two-phase activity [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]: a) Identification, selection, and
localisation of specific documents, and b) Interpretation and
labeling of those documents. The first phase entails
identifying and detailing the text segments that exhibit the linguistic
phenomenon under investigation. Subsequently, in the
interpretation phase, the selected occurrences are manually
labeled. These annotations may encompass various forms
ranging from a selection of pre-established alternatives to
free-text comments or possible reformulations.
      </p>
      <p>Unlike human annotators, AI models often lack cognitive
abilities such as common sense reasoning and
generalization capabilities due to the relatively limited numbers of
linguistic examples used for model training compared to
the impressive variety of natural language forms.</p>
      <p>Human annotators need sufficient expertise to interpret
nuanced linguistic phenomena and assign appropriate labels.
Their annotations form the basis of a supervised
learning process. The trained models can progressively
learn from annotated data much as humans do, but automatically and
at a scale not possible through manual work alone.</p>
      <p>Annotation of Italian administrative documents. We
have designed and used a novel benchmark dataset for
inclusive language writing in Italian. This dataset comprises
administrative communications sourced from the Italian
public administration at both the national and
regional levels. We annotate the corpus at the sentence
level. To this end, we set up a heterogeneous team of 13
linguistic experts with diverse experiences and expertise
in inclusive language. The team is predominantly
female, and all members are native Italian speakers. All the
annotators are educated: 57% have at least 10 years of experience
in linguistics, and 50% have at least 3 years of experience
in inclusive language. In addition, the annotators received,
on average, about 30 hours of training specific to inclusive
language annotations.</p>
      <p>Each human annotator independently assigns
inclusiveness-related metadata to the document
sentences. Each sentence can be enriched with multiple
annotations. The annotations consist of (a) the
reformulation of any non-inclusive piece of text, i.e., an
alternative inclusive form; (b) the level of inclusiveness
of the input sentence, indicating whether the sentence is
non-inclusive, inclusive, or not pertinent; (c) the register
or intended context of use, i.e., Standard, Specialized, or
Informative/Educational; and (d) the discourse type or genre, i.e.,
Legal, Administrative, Technical, or Informative/Educational.</p>
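      <p>The four annotation types listed above can be pictured as a single per-sentence record. The following is a minimal sketch, not the authors' actual data format: the class and field names are hypothetical, the placeholder strings stand in for real corpus text, and the check that a non-inclusive sentence carries a reformulation is an assumption made for illustration.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class InclusivenessLevel(Enum):
    NON_INCLUSIVE = "non-inclusive"
    INCLUSIVE = "inclusive"
    NOT_PERTINENT = "not pertinent"


class Register(Enum):
    STANDARD = "standard"
    SPECIALIZED = "specialized"
    INFORMATIVE_EDUCATIONAL = "informative/educational"


class Genre(Enum):
    LEGAL = "legal"
    ADMINISTRATIVE = "administrative"
    TECHNICAL = "technical"
    INFORMATIVE_EDUCATIONAL = "informative/educational"


@dataclass(frozen=True)
class SentenceAnnotation:
    """One expert annotation of one sentence (hypothetical schema)."""
    sentence: str
    level: InclusivenessLevel
    register: Register
    genre: Genre
    reformulation: Optional[str] = None

    def __post_init__(self):
        # Assumption: every non-inclusive sentence comes with an
        # alternative inclusive form proposed by the annotator.
        if self.level is InclusivenessLevel.NON_INCLUSIVE and not self.reformulation:
            raise ValueError("a non-inclusive sentence needs a reformulation")


# Labels of the worked example in the paper; the texts are placeholders.
example = SentenceAnnotation(
    sentence="(original sentence)",
    level=InclusivenessLevel.NON_INCLUSIVE,
    register=Register.SPECIALIZED,
    genre=Genre.ADMINISTRATIVE,
    reformulation="(expert reformulation)",
)
```
]]></preformat>
      <p>Keeping the three categorical aspects as closed enumerations mirrors the fixed label sets described above, while the reformulation stays free text.</p>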
      <p>Additional contextual aspects could be included in future
annotations to further enhance models’ understanding of inclusive
language usage. By jointly providing those
annotations, the experts aimed to capture the nuanced, multi-faceted
nature of inclusive language.</p>
      <p>By learning language inclusiveness patterns from a
diversified, context-dependent set of expert annotations, AI
models gain exposure to subtle interpretive differences. The
consistency across annotations is ensured through detailed
guidelines and instructions provided to experts. Before full
annotation, a collaborative analysis of a sample set identifies
any divergent interpretations to refine guidance.</p>
      <sec id="sec-2-1">
        <title>Statistics on annotated data</title>
        <p>Table 1 reports the number of annotated sentences for each aspect, separately for
the training, validation, and test sets.</p>
        <p>Table 1. Number of annotated sentences per task (NILR, ILC, RC, GC) in the training, validation, and test sets.</p>
        <p>Example of annotations. Table 2 shows an example of
an annotated Italian sentence (together with the
corresponding English translation for non-Italian readers). Linguistic
experts assign different annotations to each sentence. In
this example, they have assigned three labels to the
sentence. Regarding inclusiveness, the sentence has been
categorized as non-inclusive because it contains “Il Presidente”
(i.e., Chair/President) and “Rettore” (i.e., Rector), which are
masculine forms of professional roles. In addition, the
sentence contains “suo decreto”, which refers to a
decree issued by a male person. The discourse type is
administrative, as the content refers to an administrative topic, and
the register is specialized, as the content describes
specific and technical aspects.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>3. Case study: Leveraging Aspects for Italian Inclusive Language Reformulation</title>
      <p>We conduct an empirical analysis to examine the impact
of utilizing expert annotations in inclusive language
generation. Specifically, we investigate the advantages of
simultaneously addressing two key objectives: reformulating
non-inclusive language and predicting various aspects of
inclusiveness.</p>
      <p>Tasks. Given a non-inclusive piece of text, the
Non-Inclusive Language Reformulation (NILR) task aims at
generating an equivalent inclusive natural language form. The
NILR task is a sequence-to-sequence problem, where the
input is a non-inclusive sentence and the output is the
corresponding inclusive sentence.</p>
      <p>Given an input sentence and an aspect, the goal is to predict the
value of that aspect for the sentence. The aspect can be the level of inclusiveness, the register
or intended context of use, or the discourse type or genre.
According to the aspect under analysis, the corresponding
sub-tasks are denoted by Inclusiveness Level Classification
(ILC), Register Classification (RC), and Genre Classification
(GC). The ILC, RC, and GC tasks are treated as separate
classification problems, where the input is a sentence and
the output is the corresponding aspect value.</p>
      <p>Table 2. Labels assigned to the example sentence, in Italian (English): level of inclusiveness: Non-inclusivo (Non-inclusive); genre: Amministrativo (Administrative); register: Specialistico (Specialized).</p>
      <sec id="sec-4-1">
        <title>Single- vs. Multi-Task Learning</title>
        <p>To compare the performance of models trained using different learning
approaches, we conducted experiments in both single-task
and multi-task learning settings.</p>
        <p>
          In Single-Task Learning, we exclusively focus on the task
of Non-Inclusive Language Reformulation (NILR),
disregarding all aspect-related annotations. We leverage an
encoder-decoder architecture, specifically BART-IT [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ], which is a
BART architecture [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ] pre-trained on a clean Italian corpus
[
          <xref ref-type="bibr" rid="ref20">20</xref>
          ]. The model is fine-tuned on the NILR task with the
twofold objective of modifying the input sentence to make
it inclusive while maintaining the original meaning.
        </p>
        <p>Conversely, in Multi-Task Learning, we integrate the NILR
task with Aspect Classification tasks during training (i.e.,
ILC, RC, and GC). For the additional tasks, we specifically
leverage the encoder component of the model, which
extracts representations of the input text. The encoder
component is additionally trained with a classification objective.
Each task is associated with a separate classification head,
trained to predict the corresponding aspect value for the
input sentence. By interleaving these tasks during training,
the model learns to simultaneously address NILR and create
encoder representations that capture various aspects related
to inclusiveness.</p>
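        <p>The interleaving of tasks during training can be sketched as a simple batch scheduler. The source does not specify the schedule, so the round-robin order below is an assumption, and the function names and batch identifiers are hypothetical. The sketch only shows which component each batch would exercise: the full encoder-decoder for NILR, and the shared encoder plus a task-specific head for ILC, RC, and GC.</p>
        <preformat><![CDATA[
```python
TASKS = ("NILR", "ILC", "RC", "GC")


def interleave(batches_per_task, order=TASKS):
    """Yield (task, batch) pairs round-robin until every task runs out."""
    iterators = {task: iter(batches_per_task.get(task, [])) for task in order}
    active = list(order)
    while active:
        # Iterate over a copy so exhausted tasks can be dropped safely.
        for task in list(active):
            try:
                yield task, next(iterators[task])
            except StopIteration:
                active.remove(task)


def route(task):
    """Name the components a batch of the given task would update."""
    if task == "NILR":
        return "encoder+decoder"      # sequence-to-sequence objective
    return f"encoder+{task}_head"     # classification objective


# Hypothetical batch identifiers, for illustration only.
batches = {
    "NILR": ["n0", "n1", "n2"],
    "ILC": ["i0"],
    "RC": ["r0", "r1"],
    "GC": ["g0"],
}
schedule = [(task, batch, route(task)) for task, batch in interleave(batches)]
```
]]></preformat>
        <p>Because every task shares the encoder, each classification batch in such a schedule also shapes the representations later consumed by the decoder.</p>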
        <p>
          Evaluation Metrics. We evaluate the quality of the text
reformulation using a standard train-validation-test split
on our expert-annotated data. To compare the
automatically generated and expected reformulations, we use the
established ROUGE F1-scores [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ]. They measure the overlap
between the two pieces of text in terms of shared units
(n-grams or subsequences). The larger the score, the
higher the surface-level similarity. R-1, R-2, and R-L count the
unit overlap in terms of unigrams, bigrams, and longest
common subsequences, respectively.
        </p>
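        <p>For readers unfamiliar with the metric, the three scores can be sketched as follows. This is a simplified illustration that assumes whitespace tokenization and no stemming; it is not the evaluation script used in the experiments, and the example sentences are made up.</p>
        <preformat><![CDATA[
```python
from collections import Counter


def ngrams(tokens, n):
    """Multiset of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, cand_total, ref_total):
    """Harmonic mean of precision and recall, 0.0 when undefined."""
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision, recall = overlap / cand_total, overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n):
    """ROUGE-N F1: clipped n-gram overlap between candidate and reference."""
    cand, ref = ngrams(candidate.split(), n), ngrams(reference.split(), n)
    overlap = sum((cand & ref).values())
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence (dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1 based on the longest common subsequence."""
    c, r = candidate.split(), reference.split()
    return f1(lcs_length(c, r), len(c), len(r))


generated = "the cat sat on the mat"
expected = "the cat is on the mat"
r1 = rouge_n(generated, expected, 1)   # 5 of 6 unigrams shared
r2 = rouge_n(generated, expected, 2)   # 3 of 5 bigrams shared
rl = rouge_l(generated, expected)      # common subsequence of 5 tokens
```
]]></preformat>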
        <p>To complement the quantitative evaluation, we also
perform a qualitative evaluation of the achieved results. We
involved six human evaluators who were asked to label
each model-generated sentence as: correct if it accurately
maintained the original meaning while using inclusive
language appropriately for the context; partially correct if some
aspects were reformulated correctly, but others were missed
or inaccurate; or not correct if the rewriting fundamentally
failed to capture the original meaning or usage intention.
This multi-level feedback aims at capturing the models’
ability to perform the rewriting task sensitively across different
scenarios, beyond string-matching metrics alone.</p>
        <p>We assign a score to each
annotation as follows: 1 for correct, 0.5 for partially correct, and
0 for not correct. The final score for each reformulation is
computed as the average over the six human annotations.
Finally, we average the scores of all 30
reformulations to obtain a single score for each model.</p>
        <p>Results overview. Columns 2, 3, and 4 in Table 3 show
the ROUGE scores for both models. The multi-task
model achieves the best performance on all the quantitative
metrics. Regarding the human evaluation, we obtained 6
annotations for each of the 30 reformulations of each model (180 annotations per model). For the
model trained in the single-task configuration, 93 were
correct, 55 were partially correct, and 32 were not correct.
For the multi-task model, 101 were correct, 49 were
partially correct, and 30 were not correct. Column 5 reports
the average human evaluation scores for both models. The
human scores are consistent with the quantitative ones,
showing that the model trained in the multi-task setting
benefits from the additional labels. Based on these preliminary
results, we conclude that nuanced,
multidimensional annotations of inclusive language have the potential
to support a more comprehensive approach to modeling
inclusive language.</p>
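        <p>The human evaluation score can be recomputed from the label counts reported above. Since every reformulation receives the same number of annotations, the average of the per-reformulation averages equals the pooled average over all 180 annotations. The short sketch below applies the stated scoring scheme to the reported counts; the variable and function names are illustrative only.</p>
        <preformat><![CDATA[
```python
# Scoring scheme stated in the text.
SCORE = {"correct": 1.0, "partially correct": 0.5, "not correct": 0.0}


def human_score(counts):
    """Pooled average score over all annotations of one model."""
    total = sum(counts.values())
    return sum(SCORE[label] * n for label, n in counts.items()) / total


# Label counts reported for each model (6 annotations x 30 reformulations).
single_task = {"correct": 93, "partially correct": 55, "not correct": 32}
multi_task = {"correct": 101, "partially correct": 49, "not correct": 30}

single_score = human_score(single_task)  # 120.5 / 180, about 0.669
multi_score = human_score(multi_task)    # 125.5 / 180, about 0.697
```
]]></preformat>
        <p>Under this scheme the reported counts correspond to roughly 0.669 for the single-task model and 0.697 for the multi-task model.</p>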
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4. Conclusions</title>
      <p>This paper argued, with preliminary experimental support,
that the contribution of human annotators is
of paramount importance in improving the quality of NLP
results and the writing capability of generative approaches
in inclusive communication. Starting from a new Italian
administrative corpus, we enriched it with a variety of
annotations with the help of a team of language experts. This
included (i) reformulating gendered language and acronyms,
(ii) rewriting to enhance readability for the visually impaired,
and (iii) defining the intended context of use (register) and
text genre. The preliminary experimental results on the
annotated corpus are promising and highlight the potential
of the newly proposed annotations to support a
richer approach that improves the ability
of generative models to propose comprehensive and
integrative reformulations.</p>
      <p>Limitations. i) The annotation is language-specific:
it is limited to the Italian language, which constrains its utility
in multilingual scenarios; and ii) it is formal
communication-specific: tailored to tackle the challenge of inclusive
language in administrative and academic settings, the
models are trained exclusively on administrative
documents and may be less suitable for other contexts
such as legal and web communications.</p>
      <p>Future work. As part of the E-MIMIC<sup>1</sup> (Empowering
Multilingual Inclusive Communication) project, we are currently
working on a multilingual annotation process to overcome
these issues and foster inclusive communication across
different domains and languages. A team of experts is
annotating a large corpus of documents according to linguistic
criteria to label linguistic resources in a multilingual setting.</p>
      <p>
        Finally, we want to exploit text-based explainability
techniques [
        <xref ref-type="bibr" rid="ref22 ref23">22, 23</xref>
        ] to perform further human validation of the
models produced.
      </p>
      <p>Ethical Considerations. All the gathered documents are
public and therefore freely accessible on the internet. All
references to proper names of people and institutions have
been anonymized and replaced with random names for
privacy reasons.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>This study was carried out within the project “E-MIMIC:
Empowering Multilingual Inclusive Communication”, funded
by the Ministero dell’Università e della Ricerca under the
PRIN 2022 (D.D. 104 - 02/02/2022) program.</p>
      <p><sup>1</sup> https://dbdmg.polito.it/e-mimic/</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S. J.</given-names>
            <surname>Ashwell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. K.</given-names>
            <surname>Baskin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. L.</given-names>
            <surname>Christiansen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. A.</given-names>
            <surname>DiBari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Flanagin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Frey</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Jemison</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ricci</surname>
          </string-name>
          ,
          <article-title>Three recommended inclusive language guidelines for scholarly publishing: Words matter</article-title>
          ,
          <source>Learn. Publ.</source>
          <volume>36</volume>
          (
          <year>2023</year>
          )
          <fpage>94</fpage>
          -
          <lpage>99</lpage>
          . URL: https://doi.org/10.1002/leap.1527. doi:10.1002/LEAP.1527.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Balayn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Szlávik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bozzon</surname>
          </string-name>
          ,
          <article-title>Automatic identification of harmful, aggressive, abusive, and offensive language on the web: A survey of technical biases informed by psychology literature</article-title>
          ,
          <source>ACM Trans. Soc. Comput</source>
          .
          <volume>4</volume>
          (
          <year>2021</year>
          )
          <fpage>11:1</fpage>
          -
          <lpage>11:56</lpage>
          . URL: https://doi.org/10.1145/3479158. doi:10.1145/3479158.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>R.</given-names>
            <surname>Artstein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Poesio</surname>
          </string-name>
          ,
          <article-title>Bias decreases in proportion to the number of annotators</article-title>
          , in: Proceedings of FG-MoL
          <year>2005</year>
          :
          <article-title>The 10th conference on Formal Grammar and The 9th Meeting on</article-title>
          , volume
          <volume>139</volume>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>R.</given-names>
            <surname>Artstein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Poesio</surname>
          </string-name>
          ,
          <article-title>Inter-coder agreement for computational linguistics</article-title>
          ,
          <source>Computational Linguistics</source>
          <volume>34</volume>
          (
          <year>2008</year>
          )
          <fpage>555</fpage>
          -
          <lpage>596</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J.</given-names>
            <surname>Carletta</surname>
          </string-name>
          ,
          <article-title>Assessing agreement on classification tasks: The kappa statistic</article-title>
          ,
          <source>Computational Linguistics</source>
          <volume>22</volume>
          (
          <year>1996</year>
          )
          <fpage>249</fpage>
          -
          <lpage>254</lpage>
          . URL: https://aclanthology.org/J96-2004.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>P. S.</given-names>
            <surname>Bayerl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. I.</given-names>
            <surname>Paul</surname>
          </string-name>
          ,
          <article-title>What determines inter-coder agreement in manual annotations? a meta-analytic investigation</article-title>
          ,
          <source>Computational Linguistics</source>
          <volume>37</volume>
          (
          <year>2011</year>
          )
          <fpage>699</fpage>
          -
          <lpage>725</lpage>
          . URL: https://aclanthology.org/J11-4004. doi:10.1162/COLI_a_00074.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.</given-names>
            <surname>Monti</surname>
          </string-name>
          ,
          <article-title>Dalla zairja alla traduzione automatica: riflessioni sulla traduzione nell'era digitale</article-title>
          ,
          <source>Loffredo</source>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>P.</given-names>
            <surname>Sánchez-Gijón</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Kenny</surname>
          </string-name>
          ,
          <article-title>Selecting and preparing texts for machine translation: Pre-editing and writing for a global audience</article-title>
          ,
          <source>Machine translation for everyone: Empowering users in the age of artificial intelligence</source>
          <volume>18</volume>
          (
          <year>2022</year>
          )
          <fpage>81</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>B.</given-names>
            <surname>Alhafni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Habash</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Bouamor</surname>
          </string-name>
          ,
          <article-title>User-centric gender rewriting</article-title>
          , in:
          <source>Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>
          , Association for Computational Linguistics, Seattle, United States,
          <year>2022</year>
          , pp.
          <fpage>618</fpage>
          -
          <lpage>631</lpage>
          . URL: https://aclanthology.org/2022.naacl-main.46. doi:10.18653/v1/2022.naacl-main.46.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>C.</given-names>
            <surname>Amrhein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Schottmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Sennrich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Läubli</surname>
          </string-name>
          ,
          <article-title>Exploiting biased models to de-bias text: A gender-fair rewriting model</article-title>
          ,
          in:
          <source>Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)</source>
          , Association for Computational Linguistics, Toronto, Canada,
          <year>2023</year>
          , pp.
          <fpage>4486</fpage>
          -
          <lpage>4506</lpage>
          . URL: https://aclanthology.org/2023.acl-long.246. doi:10.18653/v1/2023.acl-long.246.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>T.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Webster</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Shah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Johnson</surname>
          </string-name>
          ,
          <article-title>They, them, theirs: Rewriting with gender-neutral English</article-title>
          ,
          <source>CoRR abs/2102.06788</source>
          (
          <year>2021</year>
          ). URL: https://arxiv.org/abs/2102.06788. arXiv:2102.06788.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>E.</given-names>
            <surname>Vanmassenhove</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Emmery</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Shterionov</surname>
          </string-name>
          ,
          <article-title>NeuTral Rewriter: A rule-based and neural approach to automatic rewriting into gender neutral alternatives</article-title>
          ,
          <source>in: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing</source>
          , Association for Computational Linguistics, Online and Punta Cana, Dominican Republic,
          <year>2021</year>
          , pp.
          <fpage>8940</fpage>
          -
          <lpage>8948</lpage>
          . URL: https://aclanthology.org/2021.emnlp-main.704. doi:10.18653/v1/2021.emnlp-main.704.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>A.</given-names>
            <surname>Piergentili</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Fucci</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Savoldi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Bentivogli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Negri</surname>
          </string-name>
          ,
          <article-title>Gender neutralization for an inclusive machine translation: from theoretical foundations to open challenges</article-title>
          ,
          <source>in: Proceedings of the First Workshop on Gender-Inclusive Translation Technologies</source>
          , European Association for Machine Translation, Tampere, Finland,
          <year>2023</year>
          , pp.
          <fpage>71</fpage>
          -
          <lpage>83</lpage>
          . URL: https://aclanthology.org/2023.gitt-1.7.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>M.</given-names>
            <surname>Rosola</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Frenda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. T.</given-names>
            <surname>Cignarella</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Pellegrini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Marra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Floris</surname>
          </string-name>
          , et al.,
          <article-title>Beyond obscuration and visibility: Thoughts on the different strategies of gender-fair language in Italian</article-title>
          ,
          <source>in: CLiC-it 2023: Proceedings of the 9th Italian Conference on Computational Linguistics, Venice, Italy, November 30-December 2, 2023</source>
          , volume
          <volume>3596</volume>
          , CEUR-WS,
          <year>2023</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>10</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>G.</given-names>
            <surname>Attanasio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Greco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>La Quatra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Cagliero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Tonti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Cerquitelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Raus</surname>
          </string-name>
          ,
          <article-title>E-MIMIC: Empowering multilingual inclusive communication</article-title>
          ,
          <source>in: 2021 IEEE International Conference on Big Data (Big Data)</source>
          , IEEE,
          <year>2021</year>
          , pp.
          <fpage>4227</fpage>
          -
          <lpage>4234</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>M.</given-names>
            <surname>La Quatra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Greco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Cagliero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Cerquitelli</surname>
          </string-name>
          ,
          <article-title>Inclusively: An AI-based assistant for inclusive writing</article-title>
          ,
          <source>in: Joint European Conference on Machine Learning and Knowledge Discovery in Databases</source>
          , Springer,
          <year>2023</year>
          , pp.
          <fpage>361</fpage>
          -
          <lpage>365</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
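          <string-name>
            <given-names>R.</given-names>
            <surname>Raus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Tonti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Cerquitelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Cagliero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Attanasio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>La Quatra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Greco</surname>
          </string-name>
          ,
          <article-title>L'analyse du discours et l'intelligence artificielle pour réaliser une écriture inclusive : le projet EMIMIC</article-title>
          ,
          <source>SHS Web Conf</source>
          .
          <volume>138</volume>
          (
          <year>2022</year>
          )
          <elocation-id>01007</elocation-id>
          . URL: https://doi.org/10.1051/shsconf/202213801007. doi:10.1051/shsconf/202213801007.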
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>M.</given-names>
            <surname>La Quatra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Cagliero</surname>
          </string-name>
          ,
          <article-title>BART-IT: An efficient sequence-to-sequence model for Italian text summarization</article-title>
          ,
          <source>Future Internet</source>
          <volume>15</volume>
          (
          <year>2022</year>
          )
          <elocation-id>15</elocation-id>
          . URL: http://dx.doi.org/10.3390/fi15010015. doi:10.3390/fi15010015.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>M.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ghazvininejad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mohamed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Levy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Stoyanov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zettlemoyer</surname>
          </string-name>
          ,
          <article-title>BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension</article-title>
          ,
          <source>in: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics</source>
          , Association for Computational Linguistics, Online,
          <year>2020</year>
          , pp.
          <fpage>7871</fpage>
          -
          <lpage>7880</lpage>
          . URL: https://aclanthology.org/2020.acl-main.703. doi:10.18653/v1/2020.acl-main.703.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>G.</given-names>
            <surname>Sarti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Nissim</surname>
          </string-name>
          ,
          <article-title>IT5: Large-scale text-to-text pretraining for Italian language understanding and generation</article-title>
          ,
          <source>arXiv preprint arXiv:2203.03759</source>
          (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>C.-Y.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <article-title>ROUGE: A package for automatic evaluation of summaries</article-title>
          ,
          <source>in: Text Summarization Branches Out</source>
          , Association for Computational Linguistics, Barcelona, Spain,
          <year>2004</year>
          , pp.
          <fpage>74</fpage>
          -
          <lpage>81</lpage>
          . URL: https://aclanthology.org/W04-1013.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>G.</given-names>
            <surname>Sarti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Feldhus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Sickert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>van der Wal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Nissim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bisazza</surname>
          </string-name>
          ,
          <article-title>Inseq: An interpretability toolkit for sequence generation models</article-title>
          ,
          <source>in: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)</source>
          , Association for Computational Linguistics, Toronto, Canada,
          <year>2023</year>
          , pp.
          <fpage>421</fpage>
          -
          <lpage>435</lpage>
          . URL: https://aclanthology.org/2023.acl-demo.40. doi:10.18653/v1/2023.acl-demo.40.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>F.</given-names>
            <surname>Ventura</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Greco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Apiletti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Cerquitelli</surname>
          </string-name>
          ,
          <article-title>Trusting deep learning natural-language models via local and global explanations</article-title>
          ,
          <source>Knowl. Inf. Syst</source>
          .
          <volume>64</volume>
          (
          <year>2022</year>
          )
          <fpage>1863</fpage>
          -
          <lpage>1907</lpage>
          . URL: https://doi.org/10.1007/s10115-022-01690-9. doi:10.1007/s10115-022-01690-9.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>