<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Biographies through Text Rewriting using GenWriter</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Shweta Soundararajan</string-name>
          <email>shweta.x.soundararajan@mytudublin.ie</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff1">
          <label>1</label>
          <institution>Technological University Dublin</institution>
          ,
          <addr-line>Dublin</addr-line>
          ,
          <country country="IE">Ireland</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <abstract>
        <p>Gendered language is defined as words or phrases that signal a particular gender. This can be explicit (e.g., “mother,” “she”) or implicit, where roles or traits (e.g., “gentle,” “ambitious”) suggest an individual's gender. Although useful in certain contexts, it can reinforce harmful stereotypes and contribute to societal bias. While significant research has explored mitigating gender bias in natural language processing (NLP) models by scrubbing or swapping gendered terms in text, these efforts have primarily focused on explicit gendered language and often overlook implicit gender bias embedded in language use. Therefore, I focus on mitigating implicit gendered language in texts and propose a novel approach, GenWriter, a Case-Based Reasoning and Large Language Model (CBR-LLM) Fusion Approach, to generate revised content that obscures gender while preserving semantic content. The method involves constructing a case base of generalized sentence representations categorized by gender and content type, retrieving semantically similar cases, and adapting the retrieved solution through an LLM to produce revised versions in which the gender is not so evident. I evaluate the performance of my method by measuring gender bias in an occupation classification task. Results show that GenWriter effectively reduces gender bias, outperforming both the original and LLM-only baselines, while maintaining classification accuracy.</p>
      </abstract>
      <kwd-group>
        <kwd>Case-Based Reasoning</kwd>
        <kwd>Large Language Models</kwd>
        <kwd>Gender Bias</kwd>
        <kwd>Gender Stereotypes</kwd>
        <kwd>Gendered Language</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Gendered language refers to the use of language that indicates the gender of a person, animal, or object,
either explicitly (e.g., mother, she) or implicitly (e.g., societal expectations of gendered traits) [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ].
While it can be useful, it can also perpetuate harmful gender stereotypes [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ]. Gender stereotypes are
generalized views about the roles or traits men and women should have, leading to gender bias [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>
        Gender bias in text can result in unfair treatment based on the perceived gender of the author, as seen
in Amazon’s abandoned AI recruitment model, which favored male applicants [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Similarly, gendered
language in biographies led to women being misclassified in job applications [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. The language used to
describe individuals can reinforce gender stereotypes and lead to unconscious bias, discrimination, and
psychological harm [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Gendered language in job ads, for example, can deter women from applying
for positions [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], while gender stereotypes in children’s stories can influence young minds [
        <xref ref-type="bibr" rid="ref10 ref11">10, 11</xref>
        ].
Therefore, it is important to help people write content in which the gender of the person is
not evident, as this can reduce the harm caused to individuals.
      </p>
      <p>
        Prior work [
        <xref ref-type="bibr" rid="ref12 ref13 ref7">7, 12, 13</xref>
        ] has focused on mitigating gender bias in text. These approaches typically
remove, replace, or swap gendered terms—particularly explicit gendered language. While the resulting
debiased texts are useful for training NLP models to promote gender fairness in downstream tasks, they
are not suitable for use as human-facing suggestions. This is because explicit gendered terms such
as pronouns and gendered references are often essential in real-world contexts such as biographies,
articles, and resumes.
      </p>
      <p>To this end, my aim is to rewrite textual content that describes people in such a way that the gender of
the person described in the text may not be so evident in the revised version, as an alternative to text
including content that implies gender identity. The approach involves rewriting text about a person
as if it were written by someone of a different gender. Unlike prior work, my approach focuses on
mitigating implicit gendered language rather than only removing or replacing explicit gender terms.</p>
      <p>CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073</p>
      <sec id="sec-1-1">
        <title>1.1. Background</title>
        <p>This section reviews gendered language; prior research on detecting and mitigating gender bias in text
and in models; various text generation approaches; and studies on bias in state-of-the-art text
generation.</p>
        <sec id="sec-1-1-1">
          <title>1.1.1. Gendered Language</title>
          <p>
            Gendered language can be categorized into linguistic and social aspects [
            <xref ref-type="bibr" rid="ref14 ref15 ref16">14, 15, 16</xref>
            ]. Linguistic gender
includes grammatical, referential, and lexical gender. Grammatical gender involves noun classification
based on sentence agreement, common in many languages but not English. Referential gender relates
to the gender of individuals or objects (e.g., pronouns and titles), while lexical gender assigns gender
based on meaning (e.g., “man” vs. “woman”). Social gender covers cultural aspects like gender identity,
expression, and roles [
            <xref ref-type="bibr" rid="ref14">14</xref>
            ], where gender identity refers to one’s internal sense of gender [
            <xref ref-type="bibr" rid="ref17">17</xref>
            ], gender
expression is how it is presented [
            <xref ref-type="bibr" rid="ref18">18</xref>
            ], and gender roles are societal expectations [
            <xref ref-type="bibr" rid="ref19">19</xref>
            ]. Gendered language
can reinforce or challenge social norms and stereotypes [
            <xref ref-type="bibr" rid="ref20">20</xref>
            ], with lexical and referential gender being
explicit, while social gender is more implicit.
          </p>
        </sec>
        <sec id="sec-1-1-2">
          <title>1.1.2. Detecting and Mitigating Gender Bias</title>
          <p>
            Existing literature on identifying and mitigating gender bias [
            <xref ref-type="bibr" rid="ref21 ref22 ref23">21, 22, 23</xref>
            ] mainly focuses on linguistic
gender, such as pronouns and gendered terms. Some studies [
            <xref ref-type="bibr" rid="ref24 ref25 ref9">24, 25, 9</xref>
            ] have created gender lexicons
listing words that distinguish men and women. Research [
            <xref ref-type="bibr" rid="ref26 ref27">26, 27</xref>
            ] indicates that women are increasingly
identifying less with traditionally feminine traits, highlighting the need to update societal gender norms
related to stereotypical characteristics of masculinity and femininity. Responding to this, Cryan et al.
proposed a new gender lexicon created by scraping Wikipedia for lists of candidate words and using
crowdsourcing for annotation. Datasets for stereotype detection [
            <xref ref-type="bibr" rid="ref28 ref29 ref30">28, 29, 30</xref>
            ] have been created using
various sources or crowdsourcing. Efforts to mitigate gender bias in NLP models include techniques
like data scrubbing [
            <xref ref-type="bibr" rid="ref7">7</xref>
            ], gender-swapping [
            <xref ref-type="bibr" rid="ref12">12</xref>
            ], and counterfactual data augmentation (CDA) [
            <xref ref-type="bibr" rid="ref13">13</xref>
            ], and
counterfactual data substitution (CDS) [
            <xref ref-type="bibr" rid="ref31">31</xref>
            ], which remove, replace, or swap explicit gendered terms.
Other approaches [
            <xref ref-type="bibr" rid="ref21 ref32 ref33">21, 32, 33</xref>
            ] focus on debiasing word embeddings rather than the text itself.
          </p>
        </sec>
        <sec id="sec-1-1-3">
          <title>1.1.3. Text Generation Approaches</title>
          <p>
            Text generation involves automatically creating coherent and meaningful text, ranging from sentences
to full documents. Approaches include traditional rule-based or data-driven methods, statistical
techniques, and modern neural-based approaches [
            <xref ref-type="bibr" rid="ref34 ref35 ref36">34, 35, 36</xref>
            ]. Traditional systems used predefined rules or
language patterns from datasets, while statistical models like N-grams [
            <xref ref-type="bibr" rid="ref37">37</xref>
            ] and CRFs [
            <xref ref-type="bibr" rid="ref38">38</xref>
            ] modeled word
relationships. Neural networks, specifically transformer-based models like GPT [
            <xref ref-type="bibr" rid="ref39">39</xref>
            ] and BERT [
            <xref ref-type="bibr" rid="ref40">40</xref>
            ],
excel at generating diverse, coherent text. Recently, transformer-based Large Language Models (LLMs)
have revolutionized text generation across various fields, including medical report generation [
            <xref ref-type="bibr" rid="ref41">41</xref>
            ],
academic writing [
            <xref ref-type="bibr" rid="ref42">42</xref>
            ], and children’s education [
            <xref ref-type="bibr" rid="ref43">43</xref>
            ]. However, despite their capabilities, LLMs raise ethical
concerns, particularly around gender bias in generated text, which recent studies [
            <xref ref-type="bibr" rid="ref44 ref45 ref46 ref47 ref48">44, 45, 46, 47, 48</xref>
            ]
show can perpetuate societal harm.
          </p>
          <p>
            Case-based reasoning (CBR) is a problem-solving method that supports text generation by reusing
solutions from similar past cases [
            <xref ref-type="bibr" rid="ref49">49</xref>
            ]. It involves four steps: (1) retrieving relevant cases, (2) reusing and
adapting solutions, (3) revising as needed, and (4) retaining useful outcomes. CBR has been applied to
tasks such as anomaly reporting [
            <xref ref-type="bibr" rid="ref50">50</xref>
            ], obituary writing [
            <xref ref-type="bibr" rid="ref51">51</xref>
            ], sports summaries [
            <xref ref-type="bibr" rid="ref52">52</xref>
            ], product reviews [
            <xref ref-type="bibr" rid="ref53">53</xref>
            ],
and product descriptions [
            <xref ref-type="bibr" rid="ref54">54</xref>
            ]. However, adapting prior solutions in natural language is challenging.
Integrating CBR with LLMs helps address this by (1) reducing hallucinations, bias, and stereotypes, and
(2) enabling application in knowledge-rich domains without formal encodings [
            <xref ref-type="bibr" rid="ref55">55</xref>
            ].
          </p>
          <p>While prior work has focused on detecting and mitigating gender bias in texts and NLP models,
respectively, no studies have addressed rewriting text to make the person’s gender less evident. This
presents a challenge for the research, as there is a scarcity of texts free from gender
cues that could be used to offer alternatives or suggestions for rewriting content implying gender
identity. My proposed research question, discussed in Section 2.1, aims to address this issue.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2. Research Plan</title>
      <sec id="sec-2-1">
        <title>2.1. Research Objectives</title>
        <p>This section outlines the research objectives and describes the approach taken to achieve them.
My primary goal is to generate textual content in which an individual’s gender is not evident, which
can be used as an alternative to content signaling a particular gender. While searching for datasets with
no gender cues, I found them to be scarce. To address this challenge, I formulated the following research
question, focusing on a viable approach for generating text that obscures the gender identity of
an individual described in the text. Such an approach could be used to revise content that implies a
specific gender across various applications, including biographies, job advertisements, resumes, and
more.</p>
        <p>• How can I effectively transform texts that signal a specific gender into versions where the
individual’s gender is less apparent?</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Approach</title>
        <p>The goal is to rewrite text that implies the gender identity of the person, ensuring the gender of the
person is not so evident in the text. The approach used is to rewrite text content about a person as if it
were written by a person of a different gender. To achieve this, I use GenWriter, a CBR-LLM Fusion
Approach, which combines Case-Based Reasoning (CBR) and Large Language Models (LLMs). This
approach creates a case base that serves as a repository of past experiences. When a new problem
arises, such as transforming a text to one where the gender of the person is not so evident, the solutions
from similar cases are used. The LLM plays a key role in both constructing cases and adapting existing
solutions, effectively integrating CBR with LLM capabilities. Section 2.2.1, Section 2.2.2, and Section 2.2.3
will describe how cases in the case base are represented, how solutions to new problems from the
case base are retrieved, and how the retrieved solutions are adapted to ensure their suitability for new
problems.</p>
        <sec id="sec-2-2-1">
          <title>2.2.1. Case Representation</title>
          <p>A case base typically covers a specific application area, such as biographies. Each case in the base
represents a sentence describing an aspect of a person. For example, biographies usually begin with
basic details like name, birthplace, age, and occupation, followed by education and work experience,
and concluding with personal aspects such as family, hobbies, and interests. In total, a biography covers
four main components: Demographics, Education, Work details, and Non-Professional details. A case
base for biographies contains cases, each representing a sentence from a biography related to one of
these components. The case representation will include the following:
• Gender, indicating the gender of the person being discussed in the biography.
• Category, specifying which aspect of the person is being discussed in the biography. The four
components of the biography - Demographics, Education, Work details, and Non-Professional
details are the Category.
• Generalized Sentence, a sentence from the biography related to the Category, with pronouns and
entities used in a biography, such as the name of an individual, location, organization, educational
institution, dates &amp; time, numbers, award, field of study, occupation, specialization/area of
expertise, replaced with context-based placeholders, to ensure entity generalization. This is used
both in the retrieval phase of CBR to find the most similar sentence for a sentence that has to be
rewritten, and in the reuse and adaptation phase of CBR as the rewritten sentence.</p>
          <p>
            Generalized sentences for both the query case and cases in the case base are generated using few-shot
prompting [
            <xref ref-type="bibr" rid="ref56">56</xref>
            ] with OpenAI’s GPT-4o (temperature set to 0.7; other hyperparameters at default). The
LLM receives a few-shot prompt (see Table 2 in Appendix A) along with the target sentence to produce
its generalized form. Examples of generated cases and their representations are shown in Table 3 in
Appendix B.
          </p>
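          <p>As a concrete illustration, the case representation above can be sketched in Python. The entity map and the generalize helper below are hypothetical stand-ins for the GPT-4o few-shot generalization step; they only demonstrate the shape of a case, not the actual implementation.</p>
          <p>
```python
from dataclasses import dataclass

@dataclass
class Case:
    gender: str                # gender of the person in the biography
    category: str              # Demographics, Education, Work details, or Non-Professional details
    generalized_sentence: str  # sentence with context-based placeholders

# Hypothetical stand-in for the GPT-4o few-shot generalization step:
# replace known entity mentions with context-based placeholders.
def generalize(sentence: str, entity_map: dict) -> str:
    for surface, placeholder in entity_map.items():
        sentence = sentence.replace(surface, placeholder)
    return sentence

case = Case(
    gender="female",
    category="Demographics",
    generalized_sentence=generalize(
        "Dr. Justine Lee is a pediatric plastic surgeon in Los Angeles, CA.",
        {
            "Justine Lee": "[Name of the Person]",
            "pediatric plastic surgeon": "[Occupation]",
            "Los Angeles, CA": "[Location]",
        },
    ),
)
print(case.generalized_sentence)
# Dr. [Name of the Person] is a [Occupation] in [Location].
```
          </p>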
        </sec>
        <sec id="sec-2-2-2">
          <title>2.2.2. Case Retrieval</title>
          <p>
            CBR is based on the principle that similar problems have similar solutions. To solve a new problem,
the most similar case in the case base is retrieved. This is done by measuring the semantic similarity
between the sentence embeddings of the new problem and those of cases with the opposite gender
attribute but the same category. For example, if a Demographics sentence with a female gender attribute
needs revision, the most similar male case in the same category is retrieved. Semantic similarity is
measured using cosine similarity between embeddings generated by the Sentence-BERT [
            <xref ref-type="bibr" rid="ref57">57</xref>
            ].
          </p>
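            <p>A minimal sketch of this retrieval step follows, assuming toy three-dimensional vectors in place of actual Sentence-BERT embeddings of the generalized sentences; the case-base entries are illustrative.</p>
            <p>
```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

# Toy vectors stand in for Sentence-BERT embeddings of generalized sentences.
case_base = [
    {"gender": "male", "category": "Demographics", "emb": [0.9, 0.1, 0.0],
     "generalized_sentence": "[Name of the Person] is a [Occupation] in [Location]."},
    {"gender": "male", "category": "Education", "emb": [0.1, 0.9, 0.0],
     "generalized_sentence": "[He/She] graduated from [University] in [Year]."},
    {"gender": "female", "category": "Demographics", "emb": [0.8, 0.2, 0.0],
     "generalized_sentence": "[Name of the Person] is a [Occupation] based in [Location]."},
]

def retrieve(query_emb, query_gender, query_category, cases):
    # Candidates must carry the opposite gender attribute and the same category.
    opposite = "male" if query_gender == "female" else "female"
    candidates = [c for c in cases
                  if c["gender"] == opposite and c["category"] == query_category]
    return max(candidates, key=lambda c: cosine(query_emb, c["emb"]))

best = retrieve([1.0, 0.0, 0.0], "female", "Demographics", case_base)
print(best["generalized_sentence"])
# [Name of the Person] is a [Occupation] in [Location].
```
            </p>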
        </sec>
        <sec id="sec-2-2-3">
          <title>2.2.3. Case Reuse and Solution Adaptation</title>
          <p>Once the most similar case is retrieved for each sentence in the new problem, its generalized sentence
with context-based placeholders, is reused. These are then concatenated. To adapt these generalized
sentences to the new problem, OpenAI’s GPT-4 (with a temperature of 0.7; other hyperparameters at
default) fills in the placeholders with information like entities and pronouns from the new problem.
The LLM is prompted with the instruction shown in Table 4 in Appendix C and the set of generalized
sentences. Example sentences transformed using GenWriter are shown in Table 5 in Appendix D.</p>
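          <p>In the paper this adaptation is performed by GPT-4; the sketch below imitates only its fallback behavior (fill what is known, retain unknown placeholders) using a simple regex substitution over a hypothetical fact dictionary, purely for illustration.</p>
          <p>
```python
import re

def fill_placeholders(template: str, facts: dict) -> str:
    # Replace each [Placeholder] with a value drawn from the new problem's
    # biography; placeholders with no known value are retained as-is.
    return re.sub(r"\[([^\]]+)\]",
                  lambda m: facts.get(m.group(1), m.group(0)),
                  template)

adapted = fill_placeholders(
    "[Name of the Person] is a [Occupation] in [Location].",
    {"Name of the Person": "Brian Gengler", "Occupation": "surgeon"},
)
print(adapted)
# Brian Gengler is a surgeon in [Location].
```
          </p>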
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Progress Summary</title>
      <p>Most progress to date has focused on implementing and evaluating GenWriter. The goal of evaluating
GenWriter is to assess its effectiveness in transforming texts that imply gender identity into texts
that make the described person’s gender less evident. I focus on biographies for this application
and evaluate the transformations by measuring gender bias in the occupation classification task. A
reduction in gender bias indicates successful transformation, suggesting the revised biographies are
less influenced by gender-specific cues. I compare the gender bias in the occupation classifier trained
on biographies transformed using GenWriter, a CBR-LLM Fusion approach, with two baselines: the
original BiasBios biographies (without transformation) and the LLM-only approach, where only LLM is
used for revision.</p>
      <sec id="sec-3-1">
        <title>3.1. Building a Casebase</title>
        <p>In my approach, I build a case base by extracting 500 biographies from the training set of the BiasBios
dataset (more details in Appendix E), with an equal number of male and female surgeons and nurses.
Each biography, covering Demographics, Education, Work details, and Non-Professional details, is split
into sentences, each of which forms a case. To gather the necessary attributes, I assign
gender labels to each sentence based on the BiasBios dataset. For category labels, I manually annotate
the sentences of the first 200 biographies and train a BERT classifier on these labeled sentences. The
trained classifier, achieving 94% average class accuracy on the test set, is then used to predict category
labels for the sentences of the remaining 300 biographies. The generalized sentence for each
sentence in the biography is also generated.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Rewriting Biographies and Measuring Gender Bias in Occupation Classification</title>
        <p>I used 300 biographies from the BiasBios training set (independent of my case base) and the full test set
(9,764 biographies). All 300 training biographies were rewritten using my approach. Each was split into
sentences, labeled for gender (from BiasBios) and category (via a pretrained BERT classifier). I retrieved
the most semantically similar case for each sentence using cosine similarity between generalized
sentences, with a threshold of 0.68; sentences below this threshold were left unchanged. Matched cases
and the original biography were then combined with a prompt and passed to GPT-4o to generate the
revised version. For comparison, I also applied an LLM-only approach, prompting GPT-4o to revise the
original biography (see Table 7 in Appendix G for the prompt). Example sentences transformed using the
LLM-only approach are shown in Table 5 in Appendix D.</p>
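        <p>The per-sentence decision in this pipeline can be sketched as follows. The 0.68 cutoff is the threshold stated above; the data structures for matched cases are illustrative assumptions.</p>
        <p>
```python
THRESHOLD = 0.68  # cosine-similarity cutoff for reusing a retrieved case

def select_templates(sentences, matches):
    """Pair each biography sentence with its best-matching case and similarity;
    reuse the case's generalized sentence only when the similarity meets the
    threshold, otherwise keep the original sentence unchanged."""
    chosen = []
    for sentence, (case, similarity) in zip(sentences, matches):
        if similarity >= THRESHOLD:
            chosen.append(case["generalized_sentence"])
        else:
            chosen.append(sentence)
    return chosen

sentences = ["She is a nurse in Dublin.", "She enjoys hiking."]
matches = [
    ({"generalized_sentence": "[Name of the Person] is a [Occupation] in [Location]."}, 0.81),
    ({"generalized_sentence": "[He/She] volunteers at [Organization]."}, 0.41),
]
print(select_templates(sentences, matches))
# ['[Name of the Person] is a [Occupation] in [Location].', 'She enjoys hiking.']
```
        </p>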
        <p>
          I evaluate biography transformation performance by measuring gender bias in occupation
classification. A reduction in gender bias indicates successful transformation. I train a BERT classifier on three
datasets: original biographies, biographies transformed via GenWriter, and those using the LLM-only
approach. To avoid direct occupation clues, occupation names, professional titles, and academic
qualifications are removed from the first sentence of each biography. Gender bias is quantified using the True
Positive Rate Gap (TPR<sub>gap</sub>) [
          <xref ref-type="bibr" rid="ref58">58</xref>
          ] (see eqn. 1), which compares the gender-specific true positive rates
(TPR) for each occupation. A positive TPR<sub>gap</sub> indicates bias toward males, while a negative TPR<sub>gap</sub>
suggests bias toward females. A TPR<sub>gap</sub> of zero indicates no bias. I also compute average class accuracy,
accounting for the imbalanced distribution of occupations in the test set.
        </p>
        <p>TPR<sub>gap</sub>(occupation) = TPR<sub>occupation, male</sub> − TPR<sub>occupation, female</sub> (1)</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusion and Future Work</title>
      <p>Table 1 shows the average class accuracy and the TPR<sub>gap</sub> in the occupation classification. From
the results, we can observe that the classification system tends to associate nurse with females and
surgeon with males. This is reflected in the TPR<sub>gap</sub> values: negative for nurse and positive for surgeon,
suggesting a bias towards females in nurse biographies and towards males in surgeon biographies,
respectively. The results also reveal notable gender bias in the original biographies for both nurse (0.09)
and surgeon (0.08). GenWriter, a CBR-LLM Fusion approach, significantly reduces this bias by 88.9%
in nurse biographies (from 0.09 to 0.01) and 62.5% in surgeon biographies (from 0.08 to 0.03), while
preserving classification accuracy. In contrast, the LLM-only method achieves smaller reductions (44.4%
and 12.5%, respectively) and compromises accuracy. These results demonstrate the strength of
GenWriter: not only does it effectively mitigate gender bias in biographies, but it also does so without
sacrificing classification performance, outperforming baseline methods on both fairness and accuracy.</p>
      <table-wrap id="tbl1">
        <label>Table 1</label>
        <caption><p>Average class accuracy and TPR<sub>gap</sub> for nurse (N) and surgeon (S) in the occupation classification.</p></caption>
        <table>
          <thead>
            <tr><th>Training data</th><th>Average Class Accuracy (in %)</th><th>TPR<sub>gap</sub>(N)</th><th>TPR<sub>gap</sub>(S)</th></tr>
          </thead>
          <tbody>
            <tr><td>Biography<sub>Original</sub></td><td>89.55</td><td>-0.09</td><td>0.08</td></tr>
            <tr><td>Biography<sub>LLM-only</sub></td><td>85.11</td><td>-0.05</td><td>0.07</td></tr>
            <tr><td>Biography<sub>GenWriter</sub></td><td>89.15</td><td>-0.01</td><td>0.03</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>I intend to extend my approach to include additional occupations beyond nurse and surgeon and
other areas beyond the current scope, such as job advertisements. The job ads will be extracted from
job postings on various online platforms using web scrapers.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>This work was funded by Technological University Dublin through the TU Dublin Scholarship –
Presidents Award.</p>
    </sec>
    <sec id="sec-6">
      <title>Declaration on Generative AI</title>
      <p>The author has not employed any Generative AI tools for tasks listed in the GenAI Usage Taxonomy.
However, GPT-4o was used as part of the experiments presented in this work.</p>
    </sec>
    <sec id="sec-7">
      <title>A. Instruction prompt and the few-shot examples provided to GPT-4o to generate a generalized sentence</title>
      <p>The instruction prompt and the few-shot examples provided to GPT-4o to generate a generalized sentence are
described in Table 2.</p>
      <p>Transform a given sentence into a general template by identifying and replacing all entities and pronouns with
placeholders that describe the type of entity, as demonstrated in the examples below. Use consistent placeholders
throughout, while maintaining the grammatical structure of the sentence.</p>
      <p>Examples:
Input Sentence:
Dr. Dilip Nadkarni is an Orthopedic surgeon specialized in Arthroscopic or Key-hole surgery for the Knee Joint.
Output:
Dr. [Name of the Person] is an [Occupation] specialized in [Specialisation].</p>
      <p>Input Sentence:
Dr. Crow graduated from University of Arkansas for Medical Sciences College of Medicine in 1966 and has been in
practice for 51 years.</p>
      <p>Output:
Dr. [Name of the Person] graduated from [University] in [Year] and has been in practice for [Duration].
Input Sentence:
He practices at Apollo Medical Centre with his assistants in Kotturpuram, Chennai, Chennai Speciality Clinic in
Besant Nagar, Chennai and Apollo Spectra Hospitals in MRC Nagar, Chennai.</p>
      <p>Output:
[He/She] practices at [Hospital] with [his/her] assistants in [Location], [Hospital] in [Location], [Hospital] in
[Location].</p>
      <p>Your Turn:</p>
      <p>Input Sentence: &lt;input_sentence&gt;</p>
    </sec>
    <sec id="sec-8">
      <title>B. Examples of cases</title>
      <p>Examples of cases from my case base are shown in Table 3.</p>
      <table-wrap id="tbl3">
        <label>Table 3</label>
        <caption><p>Examples of cases from the case base.</p></caption>
        <table>
          <thead>
            <tr><th>Gender</th><th>Category</th><th>Generalized Sentence</th></tr>
          </thead>
          <tbody>
            <tr><td>Female</td><td>Demographics</td><td>[Name of the Person] is a [Occupation] in [Location].</td></tr>
            <tr><td>Female</td><td>Education</td><td>[He/She] graduated with honours in [Year].</td></tr>
            <tr><td>Female</td><td>Work Details</td><td>Having more than [Duration] of diverse experiences, especially in [Occupation], [Name of the Person] affiliates with [Hospital].</td></tr>
          </tbody>
        </table>
      </table-wrap>
    </sec>
    <sec id="sec-9">
      <title>C. Instruction prompt provided to GPT-4o to fill in context-based placeholders in a generalized sentence</title>
      <p>The instruction prompt provided to GPT-4o to fill in context-based placeholders in a generalized sentence is
described in Table 4.
Given the following biography and template, perform the following steps:
1. Understand the Biography and Template:
Read and analyze the biography and the template carefully to understand the context, placeholders, and the
information available.
2. Replace Placeholders:
Replace each placeholder in the template with suitable values derived from the biography. Use the following rules
while replacing placeholders:
- Keep the format and structure of the template unchanged.
- If a placeholder cannot be replaced due to insufficient information in the biography, retain the placeholder as is.
3. Output:
Provide only the final filled-in template with placeholders replaced wherever possible.</p>
      <p>Input:
Biography: &lt;biography&gt;</p>
      <p>Template: &lt;template&gt;</p>
    </sec>
    <sec id="sec-10">
      <title>D. Example Sentences transformed using GenWriter and LLM-only approach</title>
      <p>Example sentences transformed using GenWriter and the LLM-only approach are shown in Table 5.</p>
      <table-wrap id="tbl5">
        <label>Table 5</label>
        <caption><p>Example sentences transformed using GenWriter and the LLM-only approach (F = Female, M = Male; N = Nurse, S = Surgeon).</p></caption>
        <table>
          <thead>
            <tr><th>No.</th><th>Label</th><th>Original Sentence</th><th>Sentence transformed by GenWriter</th><th>Sentence transformed by LLM</th></tr>
          </thead>
          <tbody>
            <tr><td>1</td><td>FN</td><td>Rayelle acquired her Master of Science in Nursing from the University of South Alabama.</td><td>After completing her undergraduate studies at [University], Rayelle Jiles earned her Masters of Science in Nursing specializing in [Specialisation] at the University of South Alabama.</td><td>Her advanced expertise is backed by a Master of Science in Nursing from the University of South Alabama.</td></tr>
            <tr><td>2</td><td>FS</td><td>She is rated highly by her patients.</td><td>Patients rated her highly, giving her an average of [Rating] stars out of [Total].</td><td>Her patients consistently rate her services highly, a testament to her proficiency and dedication.</td></tr>
            <tr><td>3</td><td>FS</td><td>Dr. Justine Lee is a pediatric plastic surgeon in Los Angeles, CA. These areas are among her clinical interests: cleft lip and palate, facelift, and blepharoplasty.</td><td>Dr. Justine Lee is a pediatric plastic surgeon in Los Angeles, CA. Her clinical interests include cleft lip and palate, facelift, and blepharoplasty.</td><td>Dr. Justine Lee, a distinguished pediatric plastic surgeon based in Los Angeles, CA, specializes in cleft lip and palate, facelift, and blepharoplasty.</td></tr>
            <tr><td>4</td><td>MN</td><td>Brian holds a B.S. in nursing and is completing a master’s degree in health policy and law.</td><td>Brian R. Jones received a B.S. in nursing from [University] and is completing a master’s degree in health policy and law from [University].</td><td>With a B.S. in nursing, he is furthering his education by completing a master’s degree in health policy and law.</td></tr>
            <tr><td>5</td><td>MS</td><td>Dr. Brian Gengler is an orthopedic surgeon with advanced training in spinal surgery.</td><td>Dr. Brian Gengler is an orthopedic surgeon with expertise in spinal surgery.</td><td>Dr. Brian Gengler is a highly skilled orthopedic surgeon specializing in spinal surgery.</td></tr>
          </tbody>
        </table>
      </table-wrap>
    </sec>
    <sec id="sec-11">
      <title>E. Data used for Evaluation</title>
      <p>
        I use the BiasBios dataset [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], which contains 397,340 biographies across 28 occupations, each annotated
with a binary gender label (male or female). For evaluation, I focus on the biographies of surgeons and
nurses, with 22,784 surgeon and nurse biographies in the train set and 9,764 in the test set. For my
train set, I select a subset of 300 biographies from the BiasBios train set, with an equal number of male
and female surgeons and nurses. For my test set, I use the entire BiasBios test set of 9,764 biographies,
which is imbalanced across occupations and gender. The data distribution for my train and test sets is
shown in Table 6 in Appendix F.
      </p>
    </sec>
    <sec id="sec-12">
      <title>F. Data distribution of my train and test set</title>
      <p>The data distribution of my train and test set is shown in Table 6.</p>
    </sec>
    <sec id="sec-13">
      <title>G. Instruction prompt provided to GPT-4o to generate a revised version of the original biography</title>
      <p>Instruction prompt provided to GPT-4o to generate a revised version of the original biography is
described in Table 7.</p>
      <p>Given an original biography that describes a &lt;GENDER_1&gt;, produce a revised version of the original biography in a
way that a &lt;GENDER_2&gt; would write it, without changing the person's name and gendered pronouns. After revising
the biography, provide a brief two-line explanation specifying what was modified in the revised version and why.
Original biography: &lt;original_biography&gt;
Provide the output in the following JSON format:
{
"revised_version": "&lt;your_revised_version_of_the_provided_biography&gt;",
"explanation": "&lt;your_explanation_for_the_changes_made_in_the_revised_biography&gt;"
}</p>
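      <p>A minimal Python sketch of how this template can be filled per biography and the model's JSON reply parsed is shown below. The placeholder and field names mirror the prompt above, but the harness itself is an assumption (no API call is made, and the example reply is illustrative rather than real model output).</p>
      <preformat>
```python
import json
import string

# Template mirroring the instruction prompt above.
PROMPT_TEMPLATE = string.Template(
    "Given an original biography that describes a $gender_1, produce a revised "
    "version of the original biography in a way that a $gender_2 would write it, "
    "without changing the person's name and gendered pronouns. After revising "
    "the biography, provide a brief two-line explanation specifying what was "
    "modified in the revised version and why.\n"
    "Original biography: $original_biography\n"
    "Provide the output in the following JSON format:\n"
    '{"revised_version": "...", "explanation": "..."}'
)

def build_prompt(original_biography, gender_1, gender_2):
    return PROMPT_TEMPLATE.substitute(
        original_biography=original_biography,
        gender_1=gender_1,
        gender_2=gender_2,
    )

def parse_response(raw):
    """Parse the model's JSON reply; raises on malformed output."""
    obj = json.loads(raw)
    return obj["revised_version"], obj["explanation"]

prompt = build_prompt("She is a skilled surgeon.", "female", "male")

# Illustrative well-formed reply, not real GPT-4o output.
reply = ('{"revised_version": "She is a surgeon.", '
         '"explanation": "Removed the subjective trait adjective."}')
revised, explanation = parse_response(reply)
```
      </preformat>
      <p>Requesting a fixed JSON schema lets malformed replies be detected and retried, since <monospace>json.loads</monospace> fails loudly on anything that does not parse.</p>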
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>F.</given-names>
            <surname>Hamidi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. K.</given-names>
            <surname>Scheuerman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Branham</surname>
          </string-name>
          ,
          <article-title>Gender recognition or gender reductionism? the social implications of embedded gender recognition systems</article-title>
          ,
          <source>in: Proceedings of the 2018 chi conference on human factors in computing systems</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>13</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>R. S.</given-names>
            <surname>Bigler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Leaper</surname>
          </string-name>
          ,
          <article-title>Gendered language: Psychological principles, evolving practices, and inclusive policies</article-title>
          ,
          <source>Policy Insights from the Behavioral and Brain Sciences</source>
          <volume>2</volume>
          (
          <year>2015</year>
          )
          <fpage>187</fpage>
          -
          <lpage>194</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>Bucholtz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Hall</surname>
          </string-name>
          ,
          <article-title>Language and identity</article-title>
          ,
          <source>A companion to linguistic anthropology</source>
          <volume>1</volume>
          (
          <year>2004</year>
          )
          <fpage>369</fpage>
          -
          <lpage>394</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>C.</given-names>
            <surname>Leaper</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. S.</given-names>
            <surname>Bigler</surname>
          </string-name>
          ,
          <article-title>Gendered language and sexist thought</article-title>
          ,
          <source>Monographs of the Society for Research in Child Development</source>
          <volume>69</volume>
          (
          <year>2004</year>
          )
          <fpage>128</fpage>
          -
          <lpage>142</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>UN OHCHR</surname>
          </string-name>
          ,
          <article-title>Gender stereotypes and stereotyping and women's rights</article-title>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>V.</given-names>
            <surname>Simaki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Aravantinou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Mporas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kondyli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Megalooikonomou</surname>
          </string-name>
          ,
          <article-title>Sociolinguistic features for author gender identification: From qualitative evidence to quantitative analysis</article-title>
          ,
          <source>Journal of Quantitative Linguistics</source>
          <volume>24</volume>
          (
          <year>2017</year>
          )
          <fpage>65</fpage>
          -
          <lpage>84</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M.</given-names>
            <surname>De-Arteaga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Romanov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wallach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chayes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Borgs</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Chouldechova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Geyik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kenthapadi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. T.</given-names>
            <surname>Kalai</surname>
          </string-name>
          ,
          <article-title>Bias in bios: A case study of semantic representation bias in a high-stakes setting</article-title>
          ,
          <source>in: proceedings of the Conference on Fairness, Accountability, and Transparency</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>120</fpage>
          -
          <lpage>128</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S.</given-names>
            <surname>Barocas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. D.</given-names>
            <surname>Selbst</surname>
          </string-name>
          ,
          <article-title>Big data's disparate impact</article-title>
          ,
          <source>Calif. L. Rev.</source>
          <volume>104</volume>
          (
          <year>2016</year>
          )
          <fpage>671</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>D.</given-names>
            <surname>Gaucher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Friesen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. C.</given-names>
            <surname>Kay</surname>
          </string-name>
          ,
          <article-title>Evidence that gendered wording in job advertisements exists and sustains gender inequality</article-title>
          ,
          <source>Journal of Personality and Social Psychology</source>
          <volume>101</volume>
          (
          <year>2011</year>
          )
          <fpage>109</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>A. E.</given-names>
            <surname>Arthur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. S.</given-names>
            <surname>Bigler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. S.</given-names>
            <surname>Liben</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. A.</given-names>
            <surname>Gelman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. N.</given-names>
            <surname>Ruble</surname>
          </string-name>
          ,
          <article-title>Gender stereotyping and prejudice in young children</article-title>
          ,
          <source>Intergroup attitudes and relations in childhood through adulthood</source>
          (
          <year>2008</year>
          )
          <fpage>66</fpage>
          -
          <lpage>86</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>E. M.</given-names>
            <surname>Bender</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Gebru</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>McMillan-Major</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Shmitchell</surname>
          </string-name>
          ,
          <article-title>On the dangers of stochastic parrots: Can language models be too big?</article-title>
          ,
          <source>in: Proceedings of the 2021 ACM conference on fairness, accountability, and transparency</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>610</fpage>
          -
          <lpage>623</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Yatskar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Ordonez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.-W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <article-title>Gender bias in coreference resolution: Evaluation and debiasing methods</article-title>
          ,
          <source>arXiv preprint arXiv:1804.06876</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>K.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Mardziel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Amancharla</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Datta</surname>
          </string-name>
          ,
          <article-title>Gender bias in neural natural language processing</article-title>
          ,
          <source>Logic, language, and security: essays dedicated to Andre Scedrov on the occasion of his 65th birthday</source>
          (
          <year>2020</year>
          )
          <fpage>189</fpage>
          -
          <lpage>202</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>L.</given-names>
            <surname>Ackerman</surname>
          </string-name>
          ,
          <article-title>Syntactic and cognitive issues in investigating gendered coreference</article-title>
          ,
          <source>Glossa: a journal of general linguistics</source>
          <volume>4</volume>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Y. T.</given-names>
            <surname>Cao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Daumé III</surname>
          </string-name>
          ,
          <article-title>Toward gender-inclusive coreference resolution</article-title>
          ,
          <source>arXiv preprint arXiv:1910.13913</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>M.</given-names>
            <surname>Bartl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Leavy</surname>
          </string-name>
          ,
          <article-title>Inferring gender: A scalable methodology for gender detection with online lexical databases</article-title>
          ,
          <source>in: Proceedings of the Second Workshop on Language Technology for Equality, Diversity and Inclusion</source>
          ,
          <year>2022</year>
          , pp.
          <fpage>47</fpage>
          -
          <lpage>58</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>L.</given-names>
            <surname>Litosseliti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Sunderland</surname>
          </string-name>
          ,
          <article-title>Gender identity and discourse analysis</article-title>
          , volume
          <volume>2</volume>
          , John Benjamins Publishing,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>D. L.</given-names>
            <surname>Rubin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. L.</given-names>
            <surname>Greene</surname>
          </string-name>
          ,
          <article-title>Effects of biological and psychological gender, age cohort, and interviewer gender on attitudes toward gender-inclusive/exclusive language</article-title>
          ,
          <source>Sex Roles</source>
          <volume>24</volume>
          (
          <year>1991</year>
          )
          <fpage>391</fpage>
          -
          <lpage>412</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>U.</given-names>
            <surname>Gabriel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Gygax</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Sarrasin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Garnham</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Oakhill</surname>
          </string-name>
          ,
          <article-title>Au pairs are rarely male: Norms on the gender perception of role names across English, French, and German</article-title>
          ,
          <source>Behavior research methods</source>
          <volume>40</volume>
          (
          <year>2008</year>
          )
          <fpage>206</fpage>
          -
          <lpage>212</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>M.</given-names>
            <surname>Hellinger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Bußmann</surname>
          </string-name>
          ,
          <article-title>Gender across languages: The linguistic representation of women and men, in: Gender across languages</article-title>
          ,
          <source>John Benjamins</source>
          ,
          <year>2015</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>25</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>T.</given-names>
            <surname>Bolukbasi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.-W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. Y.</given-names>
            <surname>Zou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Saligrama</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. T.</given-names>
            <surname>Kalai</surname>
          </string-name>
          ,
          <article-title>Man is to computer programmer as woman is to homemaker? debiasing word embeddings</article-title>
          ,
          <source>Advances in neural information processing systems</source>
          <volume>29</volume>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Yatskar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Ordonez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.-W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <article-title>Men also like shopping: Reducing gender bias amplification using corpus-level constraints</article-title>
          ,
          <source>arXiv preprint arXiv:1707.09457</source>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Yatskar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Cotterell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Ordonez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.-W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <article-title>Gender bias in contextualized word embeddings</article-title>
          ,
          <source>arXiv preprint arXiv:1904.03310</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>S. L.</given-names>
            <surname>Bem</surname>
          </string-name>
          ,
          <article-title>The measurement of psychological androgyny</article-title>
          ,
          <source>Journal of consulting and clinical psychology</source>
          <volume>42</volume>
          (
          <year>1974</year>
          )
          <fpage>155</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>J. T.</given-names>
            <surname>Spence</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Helmreich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Stapp</surname>
          </string-name>
          ,
          <article-title>Personal attributes questionnaire</article-title>
          ,
          <source>Developmental Psychology</source>
          (
          <year>1974</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Twenge</surname>
          </string-name>
          ,
          <article-title>Changes in masculine and feminine traits over time: A meta-analysis</article-title>
          ,
          <source>Sex roles</source>
          <volume>36</volume>
          (
          <year>1997</year>
          )
          <fpage>305</fpage>
          -
          <lpage>325</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>K.</given-names>
            <surname>Donnelly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Twenge</surname>
          </string-name>
          ,
          <article-title>Masculine and feminine traits on the Bem Sex-Role Inventory, 1993-2012: A cross-temporal meta-analysis</article-title>
          ,
          <source>Sex roles</source>
          <volume>76</volume>
          (
          <year>2017</year>
          )
          <fpage>556</fpage>
          -
          <lpage>565</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>J.</given-names>
            <surname>Cryan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Metzger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. Y.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <article-title>Detecting gender stereotypes: Lexicon vs. supervised learning methods</article-title>
          ,
          <source>in: Proceedings of the 2020 CHI conference on human factors in computing systems</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>11</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>M.</given-names>
            <surname>Nadeem</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bethke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Reddy</surname>
          </string-name>
          ,
          <article-title>StereoSet: Measuring stereotypical bias in pretrained language models</article-title>
          ,
          <source>arXiv preprint arXiv:2004.09456</source>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>N.</given-names>
            <surname>Nangia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Vania</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Bhalerao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. R.</given-names>
            <surname>Bowman</surname>
          </string-name>
          ,
          <article-title>Crows-pairs: A challenge dataset for measuring social biases in masked language models</article-title>
          ,
          <source>arXiv preprint arXiv:2010.00133</source>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>R. H.</given-names>
            <surname>Maudslay</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Gonen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Cotterell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Teufel</surname>
          </string-name>
          ,
          <article-title>It's all in the name: Mitigating gender bias with name-based counterfactual data substitution</article-title>
          ,
          <source>arXiv preprint arXiv:1909.00871</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>H.</given-names>
            <surname>Gonen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Goldberg</surname>
          </string-name>
          ,
          <article-title>Lipstick on a pig: Debiasing methods cover up systematic gender biases in word embeddings but do not remove them</article-title>
          ,
          <source>arXiv preprint arXiv:1903.03862</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>A.</given-names>
            <surname>Romanov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>De-Arteaga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wallach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chayes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Borgs</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Chouldechova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Geyik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kenthapadi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rumshisky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. T.</given-names>
            <surname>Kalai</surname>
          </string-name>
          ,
          <article-title>What's in a name? reducing bias in bios without access to protected attributes</article-title>
          ,
          <source>arXiv preprint arXiv:1904.05233</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>J.</given-names>
            <surname>Becker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Wahle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Gipp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Ruas</surname>
          </string-name>
          ,
          <article-title>Text generation: A systematic literature review of tasks, evaluation, and challenges</article-title>
          ,
          <source>arXiv preprint arXiv:2405.15604</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>A.</given-names>
            <surname>Celikyilmaz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Clark</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <article-title>Evaluation of text generation: A survey</article-title>
          ,
          <source>arXiv preprint arXiv:2006.14799</source>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>C. C.</given-names>
            <surname>Osuji</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. C.</given-names>
            <surname>Ferreira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Davis</surname>
          </string-name>
          ,
          <article-title>A systematic review of data-to-text nlg</article-title>
          ,
          <source>arXiv preprint arXiv:2402.08496</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [37]
          <string-name>
            <given-names>M.</given-names>
            <surname>Suzuki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Itoh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Nagano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Kurata</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Thomas</surname>
          </string-name>
          ,
          <article-title>Improvements to n-gram language model using text generated from neural language model</article-title>
          ,
          <source>in: ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)</source>
          , IEEE,
          <year>2019</year>
          , pp.
          <fpage>7245</fpage>
          -
          <lpage>7249</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          [38]
          <string-name>
            <given-names>D.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Tao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. A.</given-names>
            <surname>Meyer</surname>
          </string-name>
          ,
          <article-title>Efficient robust conditional random fields</article-title>
          ,
          <source>IEEE Transactions on Image Processing</source>
          <volume>24</volume>
          (
          <year>2015</year>
          )
          <fpage>3124</fpage>
          -
          <lpage>3136</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          [39]
          <string-name>
            <given-names>F.</given-names>
            <surname>Khalil</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Pipa</surname>
          </string-name>
          ,
          <article-title>Transforming the generative pretrained transformer into augmented business text writer</article-title>
          ,
          <source>Journal of Big Data</source>
          <volume>9</volume>
          (
          <year>2022</year>
          )
          <fpage>112</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          [40]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <article-title>BERT: Pre-training of deep bidirectional transformers for language understanding</article-title>
          ,
          <source>arXiv preprint arXiv:1810.04805</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref41">
        <mixed-citation>
          [41]
          <string-name>
            <given-names>S.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Ouyang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <article-title>ChatCAD: Interactive computer-aided diagnosis on medical image using large language models</article-title>
          ,
          <source>arXiv preprint arXiv:2302.07257</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref42">
        <mixed-citation>
          [42]
          <string-name>
            <given-names>M.</given-names>
            <surname>Sallam</surname>
          </string-name>
          ,
          <article-title>Chatgpt utility in healthcare education, research, and practice: systematic review on the promising perspectives and valid concerns</article-title>
          ,
          <source>in: Healthcare</source>
          , volume
          <volume>11</volume>
          ,
          MDPI
          ,
          <year>2023</year>
          , p.
          <fpage>887</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref43">
        <mixed-citation>
          [43]
          <string-name>
            <given-names>M.</given-names>
            <surname>Valentini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Weber</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Salcido</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Wright</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Colunga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>von der Wense</surname>
          </string-name>
          ,
          <article-title>On the automatic generation and simplification of children's stories</article-title>
          , in:
          <string-name>
            <given-names>H.</given-names>
            <surname>Bouamor</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Pino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Bali</surname>
          </string-name>
          (Eds.),
          <source>Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing</source>
          , Association for Computational Linguistics, Singapore,
          <year>2023</year>
          , pp.
          <fpage>3588</fpage>
          -
          <lpage>3598</lpage>
          . URL: https://aclanthology.org/2023.emnlp-main.218. doi:10.18653/v1/2023.emnlp-main.218.
        </mixed-citation>
      </ref>
      <ref id="ref44">
        <mixed-citation>
          [44]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Pu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Garimella</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.-W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Peng</surname>
          </string-name>
          , “
          <article-title>Kelly is a warm person, Joseph is a role model”: Gender biases in LLM-generated reference letters</article-title>
          , in:
          <string-name>
            <given-names>H.</given-names>
            <surname>Bouamor</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Pino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Bali</surname>
          </string-name>
          (Eds.),
          <source>Findings of the Association for Computational Linguistics: EMNLP</source>
          <year>2023</year>
          ,
          Association for Computational Linguistics
          , Singapore,
          <year>2023</year>
          , pp.
          <fpage>3730</fpage>
          -
          <lpage>3748</lpage>
          . URL: https://aclanthology.org/2023.findings-emnlp.243. doi:10.18653/v1/2023.findings-emnlp.243.
        </mixed-citation>
      </ref>
      <ref id="ref45">
        <mixed-citation>
          [45]
          <string-name>
            <given-names>H.</given-names>
            <surname>Kotek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Dockum</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <article-title>Gender bias and stereotypes in large language models</article-title>
          ,
          <source>in: Proceedings of the ACM collective intelligence conference</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>12</fpage>
          -
          <lpage>24</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref46">
        <mixed-citation>
          [46]
          <string-name>
            <given-names>X.</given-names>
            <surname>Dong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. S.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Caverlee</surname>
          </string-name>
          ,
          <article-title>Disclosure and mitigation of gender bias in LLMs</article-title>
          ,
          <source>arXiv preprint arXiv:2402.11190</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref47">
        <mixed-citation>
          [47]
          <string-name>
            <given-names>X.</given-names>
            <surname>Fang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Che</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <article-title>Bias of AI-generated content: an examination of news produced by large language models</article-title>
          ,
          <source>Scientific Reports</source>
          <volume>14</volume>
          (
          <year>2024</year>
          )
          <fpage>1</fpage>
          -
          <lpage>20</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref48">
        <mixed-citation>
          [48]
          <string-name>
            <given-names>A.</given-names>
            <surname>Ovalle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Dhamala</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Jaggers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.-W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Galstyan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Zemel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Gupta</surname>
          </string-name>
          , “
          <article-title>I'm fully who I am”: Towards centering transgender and non-binary voices to measure biases in open language generation</article-title>
          ,
          <source>in: Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>1246</fpage>
          -
          <lpage>1266</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref49">
        <mixed-citation>
          [49]
          <string-name>
            <given-names>A.</given-names>
            <surname>Yan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Cheng</surname>
          </string-name>
          ,
          <article-title>A review of the development and future challenges of case-based reasoning</article-title>
          ,
          <source>Applied Sciences</source>
          <volume>14</volume>
          (
          <year>2024</year>
          )
          <fpage>7130</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref50">
        <mixed-citation>
          [50]
          <string-name>
            <given-names>S.</given-names>
            <surname>Massie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Wiratunga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Craw</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Donati</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Vicari</surname>
          </string-name>
          ,
          <article-title>From anomaly reports to cases</article-title>
          ,
          <source>in: Case-Based Reasoning Research and Development: 7th International Conference on Case-Based Reasoning</source>
          , ICCBR 2007, Belfast, Northern Ireland, UK, August 13-16,
          <year>2007</year>
          , Proceedings 7, Springer,
          <year>2007</year>
          , pp.
          <fpage>359</fpage>
          -
          <lpage>373</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref51">
        <mixed-citation>
          [51]
          <string-name>
            <given-names>A.</given-names>
            <surname>Upadhyay</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Massie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Clogher</surname>
          </string-name>
          ,
          <article-title>Case-based approach to automated natural language generation for obituaries</article-title>
          ,
          <source>in: Case-Based Reasoning Research and Development: 28th International Conference, ICCBR</source>
          <year>2020</year>
          , Salamanca, Spain, June 8-12,
          <year>2020</year>
          , Proceedings 28, Springer,
          <year>2020</year>
          , pp.
          <fpage>279</fpage>
          -
          <lpage>294</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref52">
        <mixed-citation>
          [52]
          <string-name>
            <given-names>A.</given-names>
            <surname>Upadhyay</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Massie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. K.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ojha</surname>
          </string-name>
          ,
          <article-title>A case-based approach to data-to-text generation</article-title>
          ,
          <source>in: Case-Based Reasoning Research and Development: 29th International Conference, ICCBR</source>
          <year>2021</year>
          , Salamanca, Spain,
          September 13-16
          ,
          <year>2021</year>
          , Proceedings 29, Springer,
          <year>2021</year>
          , pp.
          <fpage>232</fpage>
          -
          <lpage>247</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref53">
        <mixed-citation>
          [53]
          <string-name>
            <given-names>D.</given-names>
            <surname>Bridge</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Healy</surname>
          </string-name>
          ,
          <article-title>Ghostwriter-2.0: Product reviews with case-based support</article-title>
          ,
          <source>in: International Conference on Innovative Techniques and Applications of Artificial Intelligence</source>
          , Springer,
          <year>2010</year>
          , pp.
          <fpage>467</fpage>
          -
          <lpage>480</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref54">
        <mixed-citation>
          [54]
          <string-name>
            <given-names>A.</given-names>
            <surname>Waugh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Bridge</surname>
          </string-name>
          ,
          <article-title>An evaluation of the ghostwriter system for case-based content suggestions</article-title>
          ,
          <source>in: Artificial Intelligence and Cognitive Science: 20th Irish Conference, AICS 2009</source>
          , Dublin, Ireland,
          August 19-21
          ,
          <year>2009</year>
          ,
          <source>Revised Selected Papers 20</source>
          , Springer,
          <year>2010</year>
          , pp.
          <fpage>262</fpage>
          -
          <lpage>272</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref55">
        <mixed-citation>
          [55]
          <string-name>
            <given-names>K.</given-names>
            <surname>Wilkerson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Leake</surname>
          </string-name>
          ,
          <article-title>On implementing case-based reasoning with large language models</article-title>
          ,
          <source>in: International Conference on Case-Based Reasoning</source>
          , Springer,
          <year>2024</year>
          , pp.
          <fpage>404</fpage>
          -
          <lpage>417</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref56">
        <mixed-citation>
          [56]
          <string-name>
            <given-names>T.</given-names>
            <surname>Brown</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Mann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ryder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Subbiah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. D.</given-names>
            <surname>Kaplan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Dhariwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Neelakantan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Shyam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Sastry</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Askell</surname>
          </string-name>
          , et al.,
          <article-title>Language models are few-shot learners</article-title>
          ,
          <source>Advances in neural information processing systems</source>
          <volume>33</volume>
          (
          <year>2020</year>
          )
          <fpage>1877</fpage>
          -
          <lpage>1901</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref57">
        <mixed-citation>
          [57]
          <string-name>
            <given-names>N.</given-names>
            <surname>Reimers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Gurevych</surname>
          </string-name>
          ,
          <article-title>Sentence-BERT: Sentence embeddings using siamese BERT-networks</article-title>
          ,
          <source>arXiv preprint arXiv:1908.10084</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref58">
        <mixed-citation>
          [58]
          <string-name>
            <given-names>F.</given-names>
            <surname>Prost</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Thain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Bolukbasi</surname>
          </string-name>
          ,
          <article-title>Debiasing embeddings for reduced gender bias in text classification</article-title>
          ,
          <source>GeBNLP</source>
          <year>2019</year>
          9573 (
          <year>2019</year>
          )
          <fpage>69</fpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>