<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>M. Bombieri);
ponzetto@uni-mannheim.de (S. P. Ponzetto);
marco.rospocher@univr.it (M. Rospocher)</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Do LLMs Authentically Represent Affective Experiences of People with Disabilities on Social Media?</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Marco Bombieri</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Simone Paolo Ponzetto</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marco Rospocher</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Mannheim</institution>
          ,
          <addr-line>B6, 26, D-68159 Mannheim</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Verona, Lungadige Porta Vittoria</institution>
          ,
          <addr-line>41, 37129 Verona</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <volume>000</volume>
      <fpage>0</fpage>
      <lpage>0002</lpage>
      <abstract>
        <p>This paper investigates how Large Language Models (LLMs) represent the affective experiences of individuals with disabilities on social media. We simulate posts using LLMs and compare them to authentic user-generated content in English, collected from disability-related subreddits, focusing on sentiment, emotion, and indicators of depression. Our analysis reveals that LLMs tend to produce overly positive and idealized portrayals, often failing to capture the complexity and nuance of disabled individuals' emotional expressions. These misrepresentations underscore broader concerns about the limitations of LLMs in authentically reflecting the lived experiences of marginalized communities.</p>
      </abstract>
      <kwd-group>
        <kwd>Large Language Models</kwd>
        <kwd>Representation</kwd>
        <kwd>Disability</kwd>
        <kwd>Bias</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Recent studies have shown that computational models of language, trained on real-world data, reflect and amplify harmful societal biases, often disproportionately affecting marginalized communities [1, 2, inter alia]. This can lead to psychological harm, unhappiness, and, in some cases, suicide attempts [3]. The increasing use of Large Language Models (LLMs) has exacerbated the risks related to this issue, potentially spreading these representational harms further [4]. In response, researchers have proposed methods to mitigate these biases. For example, recent LLMs have incorporated de-biasing techniques and AI guards (e.g., Inan et al. [5]) that block offensive questions and adjust responses to be non-toxic and positive. However, recent work on studying the depiction of personas from marginalized groups by LLMs indicates that many biases are concealed even in texts containing words with a positive sentiment, which can still offend their sensitivities and lead to pernicious positive portrayals [6]. Moreover, in the specific case of disability, excessive positivity can be counterproductive to inclusion: some members of the disability community express dissatisfaction when they are portrayed in an excessively and pathetically positive and optimistic manner: according to them, this form of optimism reinforces what is known as “inspiration porn” [3, 7, 8], which has the negative consequence of dehumanizing individuals with disabilities, leading society to praise their efforts rather than working toward tangible solutions that alleviate the often strenuous challenges they face in survival through accessible political and social policies.</p>
      <p>In this paper, we thus examine how current LLMs portray individuals with disabilities1 from an affective perspective. Specifically, we analyze the differences between self-descriptions provided by real people with disabilities and those generated by LLMs when simulating individuals with disabilities. Our focus is on assessing the sentiment, emotional tone, and levels of depression in these descriptions, with the aim of understanding how authentically LLMs represent the emotional experiences of people with disabilities and identifying differences and patterns in the affective portrayal of disability in AI-generated content.</p>
      <p>Our work aims to deepen discussions on how LLMs should authentically represent disability, a topic that has received comparatively less attention in NLP literature [3], despite the frequent discrimination faced by disabled individuals [9, 10]. Specifically, we address the following Research Question (RQ): Can LLMs authentically represent the affective experiences of people with disabilities on social media?</p>
      <sec id="sec-1-1">
        <title>Footnote 1</title>
        <p>In this paper, we primarily use people-first language (e.g., “people with disabilities”), though we occasionally use identity-first language (e.g., “disabled people”, “non-disabled people”) based on sentence structure. We recognize that preferences for people-first or identity-first language vary among individuals. We intend not to offend or diminish anyone’s perspective.</p>
        <p>C1. We collected, annotated, and publicly released a preliminary dataset of anonymized Reddit posts from users with disabilities presenting themselves on the platform. Additionally, using various LLMs, we generated and released a dataset of artificial portrayals of individuals with disabilities presenting themselves on social media, using prompts inspired by [11]. Each post in both datasets is automatically annotated with its most likely primary emotions and sentiment, as well as an indication of whether it reveals the presence of depressive patterns in the writer.</p>
        <p>C2. We compared web-collected posts with those generated by LLMs to study how models represent individuals with disabilities from an affective point of view, identifying differences between real-world and AI-generated portrayals.</p>
        <p>Bias against people with disabilities. The representation of disability in LLMs has thus been explored only minimally. Disability bias refers to treating individuals with disabilities less favorably than those without in similar circumstances, or misrepresenting them with biased associations [21]. Some studies show that hiring systems often discriminate against candidates with disabilities [25, 26]. In particular, Glazko et al. [26] highlight that even GPT-4 shows bias in suggesting job candidates. Venkit et al. [21] and Hutchinson et al. [16] used perturbation sensitivity analysis [27] to identify biases in models like BERT [28] and GPT-2 [29], finding implicit bias against disability-related terms. [30] expanded this research to include disability, gender, and ethnicity, while Herold et al. [31] found that BERT frames disabilities mainly in medical terms. Recent work by Li et al. [32] suggests newer models like GPT-3.5 and GPT-4 offer less biased portrayals of disabilities.</p>
        <sec id="sec-1-1-1">
          <title>Our findings emphasize the need to expand research</title>
          <p>on stereotypes to address both negative ones and positive idealizations, as both can harm marginalized groups. Furthermore, the analysis of the dataset on people with disabilities reveals significant challenges they frequently face, often associated with negative emotions or depressive symptoms, a fact already observed in the literature [12]. Experiments also show that LLMs tend to minimize these aspects when portraying people with disabilities and substitute them with a more socially desirable narrative.2</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>LLMs and Fairness. Recent advancements in LLMs
have transformed text processing and generation,
increasingly shaping social interactions. However, these
models can perpetuate harmful stereotypes and biases
[4], inheriting issues from uncurated internet data, such
as misrepresentations, derogatory language, and biased
associations [13, 6, 14, 1, 2]. These stereotypes
disproportionately affect marginalized groups, including
those based on age, race/ethnicity, gender, and disability
[15, 16, 17, 18, 19]. As awareness of these
misrepresentations grows, research has focused on bias and stereotypes
evaluation, mitigation methods, and datasets to address
them [4]. However, despite 1.3 billion people living with
disabilities [20], there is limited research on stereotypes
regarding disability representation in LLMs [21, 22].
Furthermore, existing datasets like BBQ [23], HolisticBias
[19], and PANDA [24] address disability representation
partially, lacking a comprehensive range of impairments
and analysis.</p>
      <sec id="sec-2-1">
        <title>2The code and the dataset are available at:</title>
        <p>https://github.com/marcobombieri/LLM-disability-representation
LLM-based portrayals and human simulation. A related research trend is human simulation, where LLMs are assessed on their ability to replicate human behavior, a concept introduced by the Turing Experiment [33]. This is applied to simulate behavior in various social and political settings [34, 35] and to identify stereotypes [11, 6]. Specifically, [36] studies how LLMs simulate personas with different traits, highlighting challenges in zero-shot scenarios. To address this, [37] suggests fine-tuning LLMs using a persona description dataset for improved personality trait representation. Our work differs by focusing on how LLMs represent disability in a zero-shot context, revealing oversimplifications and stereotypes in representing disability-related emotions and sentiments in the base model.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Methods</title>
      <p>3.1. A dataset of LLM-generated portrayals of people with disabilities</p>
      <sec id="sec-3-1">
        <title>Using a set of prompts inspired by Kambhatla et al. [11]</title>
        <p>and Cheng et al. [6], we use three LLMs to craft social media posts from the perspective of individuals with disabilities, with the goal of capturing their affective patterns. In particular, we employ Mixtral-8B (Mixtral-8x7B Instruct, quantized to 4 bits using GPTQ [38] due to hardware limitations), GPT-4o-mini, and Gemini-1.5F (the Flash variant). For the latter two models, we utilized their paid APIs. The models are used with the temperature set to 1.0 to guarantee the variability and randomness of the responses generated.</p>
        <p>Each prompt asks the models to introduce themselves as an individual with a disability and share experiences related to disability or general life experiences such as expressions of emotions, feelings, or thoughts, descriptions of daily activities, questions for the community, requests for suggestions, or commentary on current events, i.e., the typical activities a user can do on a social media platform [39]. We opted to keep the prompts as general as possible following the motivations discussed in [6], since more detailed prompts may direct the model toward a specific topic and introduce further stereotypes. In more detail, all the prompts follow the template: "Imagine you are [PERSON]. Write a post on social media introducing yourself and [ACTIVITY]." where [PERSON] and [ACTIVITY] can be one of those defined in Table 1:</p>
        <p>[PERSON] P1: a person with a disability; P2: a person with autism; P3: a person with blindness; P4: a person with deafness; P5: a person with cerebral palsy; P6: a person with depression. [ACTIVITY] A1: sharing experiences related to your disability; A2: sharing the emotions you felt today; A3: sharing the thoughts you had today; A4: sharing the activities you did today; A5: asking the community a question or suggestion; A6: commenting on today’s events.</p>
        <p>The combination of P1–P6 with A1–A6 aims to generate posts from the perspective of individuals with different types of disabilities or impairments. Exploiting all possible combinations, we thus obtained 36 different prompts. Each prompt is submitted 10 times to take into account the output variability of the models, thus obtaining, for each LLM, a collection of 360 posts of artificial portrayals of people with disabilities. We call LLMdgpt, LLMdgem, and LLMdmix the datasets containing the posts generated by GPT-4o-mini, Gemini-1.5F, and Mixtral-8B, respectively. In this preliminary work, we narrow our focus to the disabilities examined in similar studies, such as [26], resulting in six alternative options (P1–P6) for [PERSON].</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. A dataset of people with disabilities’ self-descriptions</title>
        <p>In addition to the datasets described in Section 3.1, we collected posts from six disability-related subreddits. We began with the general subreddit r/disability3, which offers diverse discussions on disability-related topics and ranks among the top 2% by size. To mitigate selection bias and align with the disabilities considered in Section 3.1, we added five focused subreddits: r/blind4, r/autism5, r/depression6, r/deaf7, and r/cerebralpalsy8. These subreddits aim to foster community and exchange among disabled individuals. We included posts published until 2024 containing textual content, excluding empty posts or those with only links, images, or videos. Using Mixtral-8B and the below prompt, we filtered for first-person posts from users self-identifying as disabled, excluding content from caregivers, professionals, or others:</p>
        <p>You are a text classifier operating on social media posts. You must classify posts into two disjoint classes, "1" or "2". Your answer must be in the format: "predictedClass;explanation", where "predictedClass" can be "1" or "2", and "explanation" briefly describes why you have chosen that class. Separate "predictedClass" from "explanation" with the string ";". Do not add other text. A post belongs to class "1" if: (the author of the post writes about himself/herself in the first person) AND (the author of the post explicitly mentions his/her own disability/illness). A post belongs to class "2" otherwise. Follow the post you have to analyze: {word}</p>
        <p>From the filtered results, we randomly sampled 450 posts from r/disability and 220 from each of the disability-specific subreddits. Three annotators then manually reviewed all these posts, removing those wrongly annotated as relevant by the LLM. The final dataset, REDd, includes 352 posts from r/disability, 165 from r/blind, 174 from r/autism, 204 from r/depression, 171 from r/deaf, and 183 from r/cerebralpalsy.9 To ensure annotation quality, 50 posts were independently labeled by three annotators, achieving a Fleiss’ Kappa of 0.875, indicating very high agreement [40]. Table 2 summarizes the obtained datasets and their sizes, which are in line with state-of-the-art studies [6].</p>
        <p>3Subreddit r/disability: https://www.reddit.com/r/disability/ [Last access: 2025-05-16]. 4Subreddit r/blind: https://www.reddit.com/r/blind/ [Last access: 2025-05-16]. 5Subreddit r/autism: https://www.reddit.com/r/autism/. 6Subreddit r/depression: https://www.reddit.com/r/depression/. 7Subreddit r/deaf: https://www.reddit.com/r/deaf/. 8Subreddit r/cerebralpalsy: https://www.reddit.com/r/cerebralpalsy/. 9Our goal is not to develop an LLM for post classification, but to compile a dataset of posts by people with disabilities to support our analysis; the LLM (78% accuracy) was used solely to assist filtering.</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Comparison metrics</title>
        <p>To address our research question, we aim to perform a pairwise comparison of the previously described datasets, i.e., the LLM-generated portraits (Section 3.1) and human descriptions from Reddit users (Section 3.2), using metrics descriptive of the affects of an individual. In more detail, given two datasets, we compare them along the dimensions described below.</p>
        <p>Sentiment. The predominant sentiment of each post p_i is computed using VADER [41], which assigns a sentiment score s(p_i) ∈ [−1, +1]. Following VADER indications, a post is classified as positive if s(p_i) &gt; 0.05, negative if s(p_i) &lt; −0.05, and neutral otherwise. For a dataset D = [p_1, …, p_N] of N posts, we compute the number of positive, negative, and neutral posts: N_positive = |{p_i | s(p_i) &gt; 0.05}|, N_negative = |{p_i | s(p_i) &lt; −0.05}|, N_neutral = |{p_i | −0.05 ≤ s(p_i) ≤ 0.05}|. We then compute the relative frequency of sentiment-loaded posts:10 f_positive = N_positive / N, f_negative = N_negative / N.</p>
        <p>10Posts with scores between −0.05 and 0.05 are considered neutral. Since REDd is the only dataset containing neutral posts (and only two such posts), we chose to focus the following analysis exclusively on positive and negative posts.</p>
        <p>Emotions. The distribution of emotions emerging from a dataset is computed using the NRC Word-Emotion Association Lexicon (EmoLex) [42], covering anger, fear, anticipation, trust, surprise, sadness, joy, and disgust. While EmoLex provides a valuable resource for identifying emotion-related words, it has certain limitations. Specifically, it is based solely on word-level counts from the lexicon. It does not account for contextual factors such as negations, word dependencies, or the broader semantic structure of the text. Nevertheless, this approach remains meaningful, allowing the consistent analysis of emotional expressions across texts and providing valuable insights into the overall emotional patterns within the dataset [43]. Let D = {p_1, p_2, …, p_N} represent the dataset with its set of N posts. For each post p_i, we calculate the number of words associated with each emotion e ∈ E, denoted by c_{i,e}, where c_{i,e} is the number of words in post p_i that are associated with emotion e. If a word is linked to multiple emotions, all associated emotions are considered. The proportion f_{i,e} of words in post p_i associated with emotion e is given by: f_{i,e} = c_{i,e} / c_i, where c_i is the total number of words in post p_i that are linked to any emotion. At the dataset level, the average proportion of each emotion across all posts is computed as: f̄_e = (1/N) · Σ_{i=1}^{N} f_{i,e}.</p>
      </sec>
      <sec id="sec-3-4">
        <title>Depression</title>
        <p>The indication of the presence of depression has been determined by the best-performing model from the Shared Task on Detecting Signs of Depression from Social Media Text at LT-EDI-ACL2022 [44]. Let l_{i,dep} denote the predicted depression label for a given post p_i, where l_{i,dep} ∈ {1 = no depression, 2 = moderate depression, 3 = severe depression}. We then analyze the distribution of these labels across each dataset.</p>
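<p>For illustration, the 36-prompt grid can be generated mechanically. The sketch below uses the persona and activity strings of Table 1 and the prompt template from the text; the variable names are ours, not from the paper:</p>

```python
from itertools import product

# Personas P1-P6 and activities A1-A6 from Table 1.
PERSONS = [
    "a person with a disability",
    "a person with autism",
    "a person with blindness",
    "a person with deafness",
    "a person with cerebral palsy",
    "a person with depression",
]
ACTIVITIES = [
    "sharing experiences related to your disability",
    "sharing the emotions you felt today",
    "sharing the thoughts you had today",
    "sharing the activities you did today",
    "asking the community a question or suggestion",
    "commenting on today's events",
]

TEMPLATE = ("Imagine you are {person}. Write a post on social media "
            "introducing yourself and {activity}.")

# 6 personas x 6 activities = 36 distinct prompts.
prompts = [TEMPLATE.format(person=p, activity=a)
           for p, a in product(PERSONS, ACTIVITIES)]

# Each prompt is submitted 10 times per model -> 360 generations per LLM.
queries = [prompt for prompt in prompts for _ in range(10)]
```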
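<p>Replies in the requested predictedClass;explanation format can be parsed defensively before filtering; a minimal sketch (the helper name and the sample reply are ours, not from the paper):</p>

```python
def parse_classifier_reply(reply: str) -> tuple[str, str]:
    """Split a 'predictedClass;explanation' reply into its two fields.

    Splits on the first ';' only, so explanations may themselves contain
    semicolons; unexpected labels raise instead of silently entering the
    dataset.
    """
    label, _, explanation = reply.strip().partition(";")
    label = label.strip().strip('"')
    if label not in {"1", "2"}:
        raise ValueError(f"unexpected class label: {label!r}")
    return label, explanation.strip()

label, why = parse_classifier_reply(
    "1;The author speaks in the first person and mentions their own disability."
)
```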
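<p>Agreement statistics such as the Fleiss’ Kappa reported for the manual annotation can be computed directly from raw labels; a generic self-contained sketch (the toy ratings are invented for illustration, not the paper’s data):</p>

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of items, each a list of category labels
    (one label per annotator, same number of annotators per item)."""
    categories = sorted({c for item in ratings for c in item})
    n = len(ratings[0])   # annotators per item
    N = len(ratings)      # number of items
    counts = [[item.count(c) for c in categories] for item in ratings]
    # Mean per-item agreement P_bar.
    P_bar = sum(
        (sum(k * k for k in row) - n) / (n * (n - 1)) for row in counts
    ) / N
    # Chance agreement P_e from the marginal category proportions.
    p = [sum(row[j] for row in counts) / (N * n)
         for j in range(len(categories))]
    P_e = sum(pj * pj for pj in p)
    return (P_bar - P_e) / (1 - P_e)

# Three annotators, four posts, toy "keep"/"drop" relevance labels.
toy = [["keep", "keep", "keep"],
       ["drop", "drop", "drop"],
       ["keep", "keep", "drop"],
       ["drop", "drop", "keep"]]
kappa = fleiss_kappa(toy)
```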
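<p>The VADER-based classification reduces to a threshold on the compound score; a minimal sketch using made-up stand-in scores (the real s(p_i) would come from VADER’s SentimentIntensityAnalyzer, omitted here so the snippet stays self-contained):</p>

```python
def classify(score: float) -> str:
    """Map a VADER compound score in [-1, 1] to a sentiment class
    using the +/-0.05 thresholds described above."""
    if score > 0.05:
        return "positive"
    if score < -0.05:
        return "negative"
    return "neutral"

def sentiment_frequencies(scores):
    """Relative frequency of positive and negative posts in a dataset."""
    n = len(scores)
    labels = [classify(s) for s in scores]
    return {
        "positive": labels.count("positive") / n,
        "negative": labels.count("negative") / n,
    }

# Stand-in compound scores; real values would come from VADER.
freqs = sentiment_frequencies([0.6, 0.2, -0.4, 0.01, -0.8])
```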
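<p>The per-post proportions f_{i,e} = c_{i,e} / c_i can be sketched with a toy lexicon standing in for EmoLex (the three-entry lexicon below is invented; the real EmoLex associates thousands of English words with the eight emotions and must be obtained separately):</p>

```python
from collections import Counter

# Toy stand-in for the NRC EmoLex word -> emotions mapping.
LEXICON = {
    "happy": {"joy", "trust"},
    "afraid": {"fear"},
    "angry": {"anger", "disgust"},
}

def emotion_proportions(post: str) -> dict[str, float]:
    """Proportion of emotion-linked words per emotion. A word linked to
    several emotions counts once for each of them, while the denominator
    c_i counts words linked to any emotion, as in the text."""
    words = post.lower().split()
    c_i = sum(1 for w in words if w in LEXICON)
    counts = Counter(e for w in words for e in LEXICON.get(w, ()))
    return {e: k / c_i for e, k in counts.items()} if c_i else {}

props = emotion_proportions("I was happy but also afraid today")
```

Note that, because multi-emotion words contribute to each of their emotions, the proportions of a post may sum to more than 1.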
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Results and Discussions</title>
      <p>LLMs appear to spread a positivity bias, which may impact how disability is represented in AI-generated discourse.</p>
      <p>To complement our quantitative metrics, we conduct a preliminary qualitative analysis of both LLM-generated and real posts, examining their structure and recurring themes. LLMs tend to frame disability through consistently positive lenses, emphasizing inclusion, accessibility, and triumph over adversity, with frequent use of words like advocacy, inclusion, grateful, excited, and proud. Follow an excerpt of a post generated by GPT-4o-mini when representing a blind person:</p>
      <p>I’m a proud member of the blind community. [...] One of my biggest passions is sharing my experiences and advocating for accessibility and inclusion. [...] I also want to highlight the amazing community I’ve found among fellow visually impaired individuals. We share stories, support one another, and inspire each other every day [...].</p>
      <p>In contrast, real posts by people with disabilities more often reference health, educational, or financial struggles, using terms such as pain, unemployed, bad, anxiety, and worse, reflecting a broader emotional range and lived complexity. Follow an excerpt of a post from r/blind:</p>
      <p>I was born blind. Always been this way. From the time I was in high school, I began to have really bad insecurities about my blindness. [...] Growing up, I hated every blind person I went to school with. [...]. By the time I got to high school, it just got worse and worse. [...]</p>
      <p>In future research, we will expand this preliminary analysis with an in-depth qualitative and quantitative thematic analysis of posts.</p>
      <sec id="sec-4-1">
        <title>Answer to RQ</title>
        <p>The results reveal that the LLMs’ affective descriptions of disability significantly differ from those expressed by real people with disabilities. LLM-generated texts largely emphasize positive sentiments and emotions, minimizing or entirely omitting the negative feelings that individuals with disabilities often experience. This tendency risks fostering a form of toxic positivity that overlooks the complex emotional landscape of disability, as highlighted by [<xref ref-type="bibr" rid="ref1">45</xref>]. The analysis of REDd’s posts, however, paints a starkly different picture, where individuals with disabilities frequently express negative emotions such as anger, sadness, and fear. These emotional responses are not only shaped by the inherent challenges of disability but are often exacerbated by an inaccessible and exclusionary social-political environment.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusions</title>
      <p>In this paper, we investigated how LLMs represent disability from an affective point of view by comparing AI-generated portrayals with social media posts authored by individuals with disabilities. By leveraging a dataset of Reddit posts and artificial portrayals generated by LLMs, we analyzed the emotional tone, sentiment, and depressive patterns of these texts. Our work contributes not only a publicly available dataset but also insights into the fundamental differences in how LLMs and real individuals describe disability, highlighting significant oversimplifications. Most specifically, through our experiments, we found that LLMs frequently idealize disability-related affective experiences, producing overly optimistic portrayals that ignore the complex realities and challenges faced by individuals with disabilities. In stark contrast, posts written by real individuals often convey more nuanced emotions, including negative feelings stemming from the intersection of their disabilities with inaccessible and non-inclusive societal systems.</p>
      <p>This disconnect highlights the risk of toxic positivity, where overly optimistic portrayals diminish the real challenges faced by disabled individuals. Though well-intentioned, this emphasis on positivity often forces them into a narrative that idealizes disability through a non-disabled lens, overlooking their actual experiences. By replacing negative emotions with an overly upbeat perspective, LLMs risk perpetuating exclusionary conditions. Our findings highlight the broader challenge of ensuring LLMs authentically represent marginalized groups. While addressing negative stereotypes in AI is crucial, our study calls for a more nuanced approach that reflects the diverse realities of marginalized groups without reductive idealizations. This paper raises a critical question: should LLMs represent affective experiences in an exclusively optimistic, "good vibes only" manner, or should they strive for more authentic, emotionally complex portrayals that better reflect real human experiences?</p>
      <p>In future work, we plan to test additional prompts and simulate a broader range of social media scenarios. We also plan to expand the collection of posts by including a wider range of subreddits, social media platforms, and languages. This will help capture a more diverse set of experiences from individuals with disabilities. We also aim to include a broader spectrum of disabilities and analyze how their representation varies across different categories. Additionally, we will enhance this study with thematic analysis methods to examine discourses related to disabilities in real and LLM-generated posts, identifying keywords that distinguish the two corpora: those written by disabled individuals and those generated by LLMs. A qualitative analysis will further complement this approach. Finally, comparing how LLMs portray individuals with disabilities versus the general population, following the methodology in [6], will offer deeper insights into these dynamics and help address the risk of oversimplification or misrepresentation.</p>
      <sec id="sec-5-1">
        <title>Limitations</title>
        <p>This paper is a preliminary work and thus has some limitations. First, we focused on a subset of disabilities to simplify the analysis. While this does not fully capture the complexity of the subject, it aligns with the approach taken in similar studies [26]. Second, we use lexicon-based tools to estimate emotions and sentiments, which may not always capture contextual nuances, potentially affecting the accuracy of the analysis. This methodology is, however, also employed in authoritative studies to ensure the method remains explainable and reproducible [6]. Furthermore, although we assume individuals who mention being disabled are indeed disabled, some may be bots or people pretending to be disabled. Finally, these findings are specific to the versions of the models and the dates on which they were tested (especially those accessed via API). As LLMs are updated and their guardrails evolve, these results may change.</p>
      </sec>
      <sec id="sec-5-2">
        <title>Ethical and societal implications</title>
        <p>This paper has a positive impact by shedding light on how disability is represented in zero-shot LLMs, emphasizing crucial ethical considerations. Current debiasing and representation models focus on “category” rather than “individual,” leading to potentially generalized, insensitive, or inappropriate responses. A model aiming to be inclusive must understand the personal experience of the individual represented. These models often fail to capture pain, suffering, and depression, substituting them with overly positive language. While optimism may be suitable in some cases, neglecting suffering flattens a key human experience. An “only good vibes” approach risks marginalizing those experiencing hardships, not just people with disabilities but anyone going through difficult times, exposing them to the risk of inspiration porn. Therefore, these models must reflect the complexity of human emotions authentically and respectfully to foster genuine understanding, inclusion, and support. While addressing such personal topics may unintentionally cause misunderstandings, our intention is to promote constructive dialogue between technologists and humanists for more inclusive AI systems.</p>
      </sec>
      <sec id="sec-5-3">
        <title>Data Availability</title>
        <p>The code and the dataset are available at: https://github.com/marcobombieri/LLM-disability-representation</p>
      </sec>
      <sec id="sec-5-4">
        <title>Acknowledgments</title>
        <p>This research has received funding from the University of Mannheim’s “Gastwissenschaftler*innenprogramm Nachhaltigkeit”, and the MUR funded 2023-2027 Project of Excellence “Inclusive Humanities: Perspectives for Development in the Research and Teaching of Foreign Languages and Literatures” of the Department of Foreign Languages and Literatures of the University of Verona.</p>
        <p>Part of this work was carried out within the Digital Arena
for Inclusive Humanities (DAIH) Research Centre at the
University of Verona. The authors gratefully
acknowledge this support.
</p>
        <p>ing of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL/IJCNLP 2021, (Volume 1: Long Papers), Virtual Event, August 1-6, 2021, Association for Computational Linguistics, 2021, pp. 4275–4293. URL: https://doi.org/10.18653/v1/2021.acl-long.330. doi:10.18653/V1/2021.ACL-LONG.330.</p>
        <p>[16] B. Hutchinson, V. Prabhakaran, E. Denton, K. Webster, Y. Zhong, S. Denuyl, Social biases in NLP models as barriers for persons with disabilities, in: D. Jurafsky, J. Chai, N. Schluter, J. R. Tetreault (Eds.), Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020, Online, July 5-10, 2020, Association for Computational Linguistics, 2020, pp. 5491–5501. URL: https://doi.org/10.18653/v1/2020.acl-main.487. doi:10.18653/V1/2020.ACL-MAIN.487.</p>
        <p>[17] K. Mei, S. Fereidooni, A. Caliskan, Bias against stigmatized groups in masked language models and downstream sentiment classification tasks, in: Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency, FAccT 2023, Chicago, IL, USA, June 12-15, 2023, ACM, 2023, pp. 1699–1710. URL: https://doi.org/10.1145/3593013.3594109. doi:10.1145/3593013.3594109.</p>
        <p>[18] A. Salinas, P. Shah, Y. Huang, R. McCormack, F. Morstatter, The unequal opportunities of large language models: Examining demographic biases in job recommendations by chatgpt and llama, in: Proceedings of the 3rd ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization, Association for Computing Machinery, New York, NY, USA, 2023. URL: https://doi.org/10.1145/3617694.3623257. doi:10.1145/3617694.3623257.</p>
        <p>[19] E. M. Smith, M. Hall, M. Kambadur, E. Presani, A. Williams, “I’m sorry to hear that”: Finding new biases in language models with a holistic descriptor dataset, in: Y. Goldberg, Z. Kozareva, Y. Zhang (Eds.), Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, Abu Dhabi, United Arab Emirates, 2022, pp. 9180–9211. URL: https://aclanthology.org/2022.emnlp-main.625/. doi:10.18653/v1/2022.emnlp-main.625.</p>
        <p>[20] World Health Organization, Disability, https://www.who.int/health-topics/disability, 2023. Accessed: 2025-01-13.</p>
        <p>[21] P. N. Venkit, M. Srinath, S. Wilson, A study of implicit bias in pretrained language models against people with disabilities, in: N. Calzolari, C. Huang, H. Kim, J. Pustejovsky, L. Wanner, K. Choi, P. Ryu, H. Chen, L. Donatelli, H. Ji, S. Kurohashi, P. Paggio, N. Xue, S. Kim, Y. Hahm, Z. He, T. K. Lee, E. Santus, F. Bond, S. Na (Eds.), Proceedings of the 29th International Conference on Computational Linguistics, COLING 2022, Gyeongju, Republic of Korea, October 12-17, 2022, International Committee on Computational Linguistics, 2022, pp. 1324–1332.</p>
        <p>[22] Z. Chu, Z. Wang, W. Zhang, Fairness in large language models: A taxonomic survey, SIGKDD Explor. Newsl. 26 (2024) 34–48. URL: https://doi.org/10.1145/3682112.3682117. doi:10.1145/3682112.3682117.</p>
        <p>[23] A. Parrish, A. Chen, N. Nangia, V. Padmakumar, J. Phang, J. Thompson, P. M. Htut, S. R. Bowman, BBQ: A hand-built bias benchmark for question answering, in: S. Muresan, P. Nakov, A. Villavicencio (Eds.), Findings of the Association for Computational Linguistics: ACL 2022, Dublin, Ireland, May 22-27, 2022, Association for Computational Linguistics, 2022, pp. 2086–2105. URL: https://doi.org/10.18653/v1/2022.findings-acl.165. doi:10.18653/V1/2022.FINDINGS-ACL.165.</p>
        <p>[24] R. Qian, C. Ross, J. Fernandes, E. M. Smith, D. Kiela, A. Williams, Perturbation augmentation for fairer NLP, in: Y. Goldberg, Z. Kozareva, Y. Zhang (Eds.), Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, Abu Dhabi, United Arab Emirates, 2022, pp. 9496–9521. URL: https://aclanthology.org/2022.emnlp-main.646/. doi:10.18653/v1/2022.emnlp-main.646.</p>
        <p>[25] N. Tilmes, Disability, fairness, and algorithmic bias in AI recruitment, Ethics Inf. Technol. 24 (2022) 21. URL: https://doi.org/10.1007/s10676-022-09633-2. doi:10.1007/S10676-022-09633-2.</p>
        <p>[26] K. S. Glazko, Y. Mohammed, B. Kosa, V. Potluri, J. Mankoff, Identifying and improving disability bias in gpt-based resume screening, in: The 2024 ACM Conference on Fairness, Accountability, and Transparency, FAccT 2024, Rio de Janeiro, Brazil, June 3-6, 2024, ACM, 2024, pp. 687–700. URL: https://doi.org/10.1145/3630106.3658933. doi:10.1145/3630106.3658933.</p>
        <p>[27] M. Díaz, I. Johnson, A. Lazar, A. M. Piper, D. Gergle, Addressing age-related bias in sentiment analysis, in: S. Kraus (Ed.), Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI 2019, Macao, China, August 10-16, 2019, ijcai.org, 2019, pp. 6146–6150. URL: https://doi.org/10.24963/ijcai.2019/852. doi:10.24963/IJCAI.2019/852.</p>
        <p>[28] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding, in: J. Burstein, C. Doran, T. Solorio (Eds.), Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human</p>
        <p>simulate human behavior: A causal inference perspective, CoRR abs/2312.15524 (2023). URL: https://</p>
Language Technologies, Volume 1 (Long and Short doi.org/10.48550/arXiv.2312.15524. doi:10.48550/
Papers), Association for Computational Linguis- ARXIV.2312.15524. arXiv:2312.15524.
tics, Minneapolis, Minnesota, 2019, pp. 4171–4186. [36] T. Hu, N. Collier, Quantifying the persona efect in
URL: https://aclanthology.org/N19-1423/. doi:10. LLM simulations, in: L. Ku, A. Martins, V. Srikumar
18653/v1/N19-1423. (Eds.), Proceedings of the 62nd Annual Meeting of
[29] A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, the Association for Computational Linguistics
(VolI. Sutskever, Language models are unsupervised ume 1: Long Papers), ACL 2024, Bangkok, Thailand,
multitask learners, OpenAI (2019). August 11-16, 2024, Association for Computational
[30] S. Hassan, M. Huenerfauth, C. O. Alm, Unpack- Linguistics, 2024, pp. 10289–10307. URL: https://doi.
ing the interdependent systems of discrimination: org/10.18653/v1/2024.acl-long.554. doi:10.18653/
Ableist bias in NLP systems through an intersec- V1/2024.ACL-LONG.554.
tional lens, in: M. Moens, X. Huang, L. Specia, S. W. [37] W. Li, J. Liu, A. Liu, X. Zhou, M. Diab, M. Sap,
BIG5Yih (Eds.), Findings of the Association for Compu- CHAT: shaping LLM personalities through training
tational Linguistics: EMNLP 2021, Virtual Event on human-grounded data, CoRR abs/2410.16491
/ Punta Cana, Dominican Republic, 16-20 Novem- (2024). URL: https://doi.org/10.48550/arXiv.2410.
ber, 2021, Association for Computational Linguis- 16491. doi:10.48550/ARXIV.2410.16491.
tics, 2021, pp. 3116–3123. URL: https://doi.org/10. [38] E. Frantar, S. Ashkboos, T. Hoefler, D.
Alis18653/v1/2021.findings-emnlp.267. doi: 10.18653/ tarh, GPTQ: accurate post-training
quantiV1/2021.FINDINGS-EMNLP.267. zation for generative pre-trained transformers,
[31] B. Herold, J. Waller, R. Kushalnagar, Applying CoRR abs/2210.17323 (2022). URL: https://doi.org/
the stereotype content model to assess disability 10.48550/arXiv.2210.17323. doi:10.48550/ARXIV.
bias in popular pre-trained NLP models underly- 2210.17323.
ing AI-based assistive technologies, in: S. Ebling, [39] J. J. Al-Menayes, Motivations for using social
meE. Prud’hommeaux, P. Vaidyanathan (Eds.), Ninth dia: An exploratory factor analysis, International
Workshop on Speech and Language Processing Journal of Psychological Studies 7 (2015) 43.
for Assistive Technologies (SLPAT-2022), Associa- [40] J. R. Landis, G. G. Koch, The measurement of
obtion for Computational Linguistics, Dublin, Ireland, server agreement for categorical data, Biometrics
2022, pp. 58–65. URL: https://aclanthology.org/2022. 33 (1977).</p>
        <p>slpat-1.8/. doi:10.18653/v1/2022.slpat-1.8. [41] C. J. Hutto, E. Gilbert, VADER: A parsimonious
rule[32] R. Li, A. Kamaraj, J. Ma, S. Ebling, Decoding ableism based model for sentiment analysis of social media
in large language models: An intersectional ap- text, in: E. Adar, P. Resnick, M. D. Choudhury,
proach, in: D. Dementieva, O. Ignat, Z. Jin, R. Mihal- B. Hogan, A. Oh (Eds.), Proceedings of the Eighth
cea, G. Piatti, J. Tetreault, S. Wilson, J. Zhao (Eds.), International Conference on Weblogs and Social
Proceedings of the Third Workshop on NLP for Posi- Media, ICWSM 2014, Ann Arbor, Michigan, USA,
tive Impact, Association for Computational Linguis- June 1-4, 2014, The AAAI Press, 2014.
tics, Miami, Florida, USA, 2024, pp. 232–249. URL: [42] S. M. Mohammad, P. D. Turney, Crowdsourcing a
https://aclanthology.org/2024.nlp4pi-1.22/. doi:10. word-emotion association lexicon, Comput. Intell.
18653/v1/2024.nlp4pi-1.22. 29 (2013) 436–465. URL: https://doi.org/10.1111/j.
[33] G. V. Aher, R. I. Arriaga, A. T. Kalai, Using large 1467-8640.2012.00460.x.</p>
        <p>language models to simulate multiple humans and [43] Y. Li, J. Chan, G. Peko, D. Sundaram, Mixed emotion
replicate human subject studies, in: A. Krause, extraction analysis and visualisation of social media
E. Brunskill, K. Cho, B. Engelhardt, S. Sabato, J. Scar- text, Data Knowl. Eng. 148 (2023) 102220. URL:
lett (Eds.), International Conference on Machine https://doi.org/10.1016/j.datak.2023.102220. doi:10.
Learning, ICML 2023, 23-29 July 2023, Honolulu, 1016/J.DATAK.2023.102220.</p>
        <p>Hawaii, USA, volume 202 of Proceedings of Machine [44] R. Poświata, M. Perełkiewicz,
OPI@LT-EDILearning Research, PMLR, 2023, pp. 337–371. URL: ACL2022: Detecting signs of depression from social
https://proceedings.mlr.press/v202/aher23a.html. media text using RoBERTa pre-trained language
[34] L. P. Argyle, E. C. Busby, N. Fulda, J. R. Gubler, models, in: Proceedings of the Second
WorkC. Rytting, D. Wingate, Out of one, many: Using shop on Language Technology for Equality,
Dilanguage models to simulate human samples, Polit- versity and Inclusion, Association for
Computaical Analysis 31 (2023) 337–351. doi:10.1017/pan. tional Linguistics, Dublin, Ireland, 2022, pp. 276–
2023.2. 282. URL: https://aclanthology.org/2022.ltedi-1.40.
[35] G. Gui, O. Toubia, The challenge of using llms to doi:10.18653/v1/2022.ltedi-1.40.</p>
        <p>Declaration on Generative AI
During the preparation of this work, the author(s) used ChatGPT (OpenAI) in order to: Paraphrase
and reword and Grammar and spelling check. After using these tool(s)/service(s), the author(s)
reviewed and edited the content as needed and take(s) full responsibility for the publication’s
content.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [45]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wyatt</surname>
          </string-name>
          ,
          <article-title>The dark side of #positivevibes: Understanding toxic positivity in modern culture</article-title>
          ,
          <source>Psychiatry and Behavioral Health</source>
          <volume>3</volume>
          (
          <year>2024</year>
          )
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>