<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <article-meta>
      <article-id pub-id-type="doi">10.1609/aies.v7i1.31617</article-id>
      <title-group>
        <article-title>Decoding Bias in Generative AI. Framing Socio-Technical Data Literacy as a Collective Critical Practice</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Antonella Autuori</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Institute of Design, SUPSI University of Applied Sciences and Arts of Southern Switzerland</institution>
          ,
          <addr-line>Mendrisio</addr-line>
          ,
          <country country="CH">Switzerland</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>RMIT University, School of Design</institution>
          ,
          <addr-line>Melbourne</addr-line>
          ,
          <country country="AU">Australia</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <volume>4</volume>
      <issue>2020</issue>
      <abstract>
        <p>This practice-based PhD develops a socio-technical data literacy framework for generative AI, emphasizing participatory engagement with the socio-cultural dimensions of data. Through methods such as participatory workshops and critical making, the research demonstrates how non-technical stakeholders can decode and intervene in algorithmic bias through engagement with classification data practices. The resulting toolkit and evaluative framework offer practical strategies for inclusive, culturally aware participation in educational and civic contexts. By conceptualizing data literacy as a critical, situated, and collective practice, the research contributes to HCI by advancing more equitable and relational human–AI interaction.</p>
      </abstract>
      <kwd-group>
        <kwd>generative artificial intelligence</kwd>
        <kwd>bias</kwd>
        <kwd>human-AI interaction</kwd>
        <kwd>data literacy</kwd>
        <kwd>feminist epistemologies</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Generative AI technologies are increasingly central to the infrastructures that mediate perception,
representation, and decision-making; however, these systems also replicate and intensify existing social
hierarchies through historically embedded biases in training data and classification architectures [
        <xref ref-type="bibr" rid="ref1 ref2 ref3">1, 2, 3</xref>
        ].
This practice-based doctoral research introduces socio-technical data literacy as a critical and reflexive
framework for engaging with these systems, one that equips non-technical users to decode, interrogate,
and intervene in the classificatory logics that shape generative visual outputs. Socio-technical data
literacy is defined here as a situated capacity to critically engage with the interconnected technical and
socio-cultural dimensions of generative AI.
      </p>
      <p>The term socio-technical underscores that these systems are not purely computational but are shaped
by cultural assumptions, institutional structures, and power relations. Data refers not only to training
corpora but also to the classification schemas that organize and give meaning to information. These
classifications—often hidden behind polished outputs—play a crucial role in determining what is made
visible, normative, or excluded. Literacy, in this context, is not merely a technical skill but a relational,
critical ability to interpret, question, and reshape algorithmic representations in a situated context. It
enables users to surface bias, negotiate meaning, and collectively reimagine the epistemic structures
embedded in generative systems.</p>
      <p>Rather than treating users as passive recipients of generative technologies, this work emphasizes
their role as epistemic agents with ethical responsibility in shaping model behavior through interactions
such as prompt design, content selection, and sense-making. Within this perspective, the research
investigates how non-technical stakeholders can actively challenge dominant representations and
co-construct alternative classificatory logics within generative AI systems.</p>
      <p>The work is guided by two central questions: How can socio-technical data literacy function as
a creative-critical method for decoding and intervening in bias with generative AI systems, while
supporting user agency in the interpretation and manipulation of classification processes? How can
participatory approaches to data and AI move beyond technical performativity to foster more relational,
care-oriented engagements with generative technologies and data?</p>
      <p>
        Informed by feminist epistemologies [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ], critical pedagogy [
        <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
        ], and critical data studies [
        <xref ref-type="bibr" rid="ref8 ref9">8, 9</xref>
        ],
this research approaches knowledge as situated, relational, and materially embedded. Feminist theory
rejects the notion of disembodied objectivity, emphasizing instead partial perspectives grounded in lived
experience and conditioned by specific socio-cultural contexts. This lens supports an understanding
of human–AI interaction as mediated by both personal and structural factors, such as identity, affect,
memory, language, and access. Critical pedagogy reinforces this perspective by framing learning as
a dialogic and collective process, oriented toward reflection, agency, and the disruption of dominant
epistemic hierarchies. Critical data studies further extend this approach by interrogating how data
infrastructures embed social assumptions and reproduce asymmetries of representation. In addition
to this theoretical foundation, speculative design [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] can offer methodological strategies that enable
participants to challenge normative data logics and imagine alternative engagements with generative
technologies.
      </p>
      <p>Situated within HCI, this research advances a set of tools designed to evaluate and inform
non-technical users’ engagement with generative AI systems, with a focus on fostering agency and critical
awareness. As these systems become increasingly embedded in everyday contexts, understanding how
users interpret, challenge, and influence generative outputs becomes essential.</p>
      <p>Designed for educational and civic settings, these tools support reflective engagement and promote
individual and collective responsibility. By foregrounding user agency in shaping model behavior, the
framework invites critical attention to accountability, transparency, and the classificatory systems that
underpin algorithmic outputs. It thus expands the scope of data literacy toward more equitable, situated,
and ethically responsive human–AI relations.</p>
      <p>This research is currently at the beginning of its second year, with the methodology defined and
under implementation. Ethics approval has been obtained from the RMIT University Human Research
Ethics Committee in Melbourne, and expert interviews and participatory workshops are currently
underway. The following sections outline the theoretical foundations, participatory methodology, and
practical contributions of this research.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Research Background</title>
      <p>
        The ontological foundation of this practice-based PhD engages with interdisciplinary discourses from
philosophy, history, and science and technology studies to examine how technological systems, bias,
and discrimination are co-produced. Understanding AI as a material and discursive infrastructure—
shaped by practices of classification, representation, and decision-making—requires a historically and
conceptually grounded perspective [
        <xref ref-type="bibr" rid="ref2">2, 11</xref>
        ].
      </p>
      <p>Recent research demonstrates that both large language models (LLMs) and text-to-image (TTI)
systems systematically reproduce and amplify stereotypes related to gender, sexuality, and ethnicity
[12, 13, 14], marginalize non-Western epistemologies [15], and constrain identity representations [16],
while ultimately excluding disability and neurodivergent identities [17]. These patterns of exclusion are
deeply embedded in the construction of training datasets, where selection, annotation, and classification
practices are shaped by assumptions about what and who should be made visible. Such processes confer
epistemic legitimacy on particular worldviews, embedding them within algorithmic systems under the
guise of technical objectivity.</p>
      <p>
        Data annotation has been shown to be a situated and power-laden process that mediates subjectivity
and institutional authority [18, 19]. Visual taxonomies, such as those found in ImageNet, have been
shown to rely on normative assumptions about what identities, roles, and expressions should look like,
reducing individuals to predefined labels that claim universality but reflect narrow, culturally specific
worldviews [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. This reductive logic presumes a direct, stable correspondence between appearance and
meaning, reinforcing cultural stereotypes under the guise of computational legibility. ImageNet, in
particular, exemplifies the risks of large-scale annotation when applied to human subjects, where labels
such as “loser,” “kleptomaniac,” or “slattern” were attached to images of real people without consent or
contextual nuance [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>As argued by Bowker and Star, classification systems embody institutional and cultural logics that,
although often obscured, carry significant epistemic and political consequences [20]. In this sense,
datasets do not merely reflect reality, but actively construct it through the worldviews embedded in
their structures and labeling practices [21]. The act of selecting, labeling, and categorizing images is not
neutral or technical—it is a political intervention with lasting impact on how people are seen, sorted,
and acted upon by AI systems. These classificatory regimes not only reproduce harm but have become
increasingly opaque as commercial AI systems scale, limiting public scrutiny of how representations
are produced and deployed.</p>
      <p>This research contributes to ongoing debates by framing data literacy as a critical, collective practice
specific to the context of generative AI. It addresses the epistemic and political dimensions of
classification systems, proposing tools and methods that make these structures accessible and open to
contestation by non-technical audiences.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Towards a Bottom-Up Participatory Imperative</title>
      <p>A central ambition of this research is to involve non-technical experts, community members, educators,
activists, and other stakeholders who have been historically marginalized or excluded from the design
and classification processes underpinning AI systems.</p>
      <p>These individuals are not merely end-users; they are co-constructors of knowledge [22, 23] whose
lived experiences, values, and perspectives are essential for surfacing biases, contesting dominant
narratives, and envisioning alternative futures for AI [24].</p>
      <p>Conventional participatory practices in AI often confine stakeholder involvement to consultative or
tokenistic roles, where input is solicited only at discrete stages—typically after key design decisions have
already been made—or restricted to superficial aspects such as user interface feedback [25]. Moreover,
current practice frequently relies on proxies—such as UX professionals or algorithmic models—to
represent stakeholder voices, rather than enabling direct and sustained engagement [26].</p>
      <p>This results in constrained agency and limited influence over foundational classificatory structures.
Such forms of engagement are not merely desirable but constitute necessary conditions for ensuring
accountability, transparency, and contextual relevance in AI systems. Within this framework, the
redistribution of epistemic authority and the collective shaping of classification processes are understood
as central to advancing more just, reflexive, and socially responsive technological futures [27, 28, 26, 29].</p>
      <p>This requires politically informed understandings of how technology and citizenship are entangled,
making visible the power relations embedded in digital systems and supporting emancipatory practices
aimed at social justice [30]. As D’Ignazio and Klein emphasize in their framework of data feminism,
expanding who participates in the design and interpretation of data is not simply a matter of broadening
access, but a deliberate epistemic choice—one that challenges dominant knowledge systems and affirms
the value of situated, plural forms of understanding that are often excluded from mainstream data
practices [31].</p>
    </sec>
    <sec id="sec-4">
      <title>4. Methodology</title>
      <p>This PhD research adopts a practice-based methodology that weaves together critical theoretical
inquiry and participatory experimentation. The approach is structured around four interconnected
methodological pillars, each designed to foreground the epistemic and ethical complexities of generative
AI while centering the agency of non-technical stakeholders.</p>
      <sec id="sec-4-1">
        <title>4.1. Literature Review as Critical Infrastructure Mapping</title>
        <p>The first pillar consists of a literature review conceived as a form of critical infrastructure mapping.
Rather than summarizing existing work, it delineates the conceptual landscape of data classification,
representational bias, and user agency within generative AI. These systems are approached as
sociotechnical infrastructures shaped by historical, cultural, and political conditions.</p>
        <p>As part of this phase, a cross-disciplinary bias cartography is assembled—drawing from media
studies, data science, informatics, and psychology—not to isolate technological failures, but to examine
how algorithmic and human biases intersect. This mapping supports a relational understanding of
classification processes and provides a shared foundation for reflection and discussion with experts and
participants in following methods.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Expert Interviews</title>
        <p>The second pillar involves semi-structured interviews with designers, educators, AI practitioners, and
activists engaged in critical work on algorithmic systems. Conceived as dialogical rather than extractive,
these interviews support knowledge co-production through reflective prompts and in-depth discussion.</p>
        <p>Participants are invited to critically engage with and contribute to the cross-disciplinary bias
cartography developed during the literature review, bringing insights from their respective domains of
practice. This process surfaces tensions between ethical commitments and technical constraints, and
informs the iterative co-design of workshop formats and the development of evaluation criteria.</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Participatory Workshops</title>
        <p>The third methodological pillar centers on participatory workshops, which are conceived as epistemic
interventions and critical making spaces. These workshops invite participants to engage with generative
AI through a combination of reflective inquiry and speculative experimentation.</p>
        <p>Activities include prompt hacking, in which participants iteratively test and annotate generative
models to reveal hidden biases, and zine-making, which draws on feminist traditions of storytelling to
document personal encounters with algorithmic classification. Further, dataset remixing enables the
co-creation of alternative taxonomies through collaborative data curation, while image annotation and
reclassification exercises encourage participants to challenge normative visual grammars and disrupt
established hierarchies.</p>
        <p>
          These workshops are guided by principles of constructionism [32], care ethics [33], and pedagogical
co-production [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ], emphasizing hands-on, embodied critique over abstract deliberation. Data generated
from observations, participant-created artifacts, and reflective discussions are analyzed using reflexive
thematic analysis.
        </p>
      </sec>
      <sec id="sec-4-4">
        <title>4.4. Toolkit and Evaluative Framework for Socio-Technical Data Literacy in Participatory and Educational Environments</title>
        <p>The final phase of the research involves the development and assessment of a modular toolkit designed
to support the practice of socio-technical data literacy in educational and civic contexts. The toolkit
includes adaptable facilitation formats, design probes, and guiding materials, and is examined through
an evaluative lens that attends to multiple dimensions of participant engagement.</p>
        <p>These include epistemic understanding—the ability to articulate and identify bias in generative AI
systems; social engagement—reflected in the willingness to discuss and intervene in classification
processes; and critical awareness—the recognition of AI systems as situated, value-laden infrastructures.</p>
        <p>Evaluation is not treated as a conclusive step, but rather as an iterative and generative moment within
the research, one that reflects on how theoretical commitments are translated into situated practice, and
how collective inquiry can inform more inclusive and contextually grounded ways of engaging with AI.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion and Research Contribution</title>
      <p>This practice-based doctoral research contributes to HCI by reframing AI literacy through a
sociotechnical lens that centers interpretive agency, ethical responsibility, and participatory engagement.
Generative AI is approached not as a neutral infrastructure but as a site where bias is embedded, enacted,
and contested through interaction.</p>
      <p>The development of a socio-technical data literacy framework enables non-technical stakeholders—
often excluded from processes of design, governance, and interpretation—to critically engage with the
classificatory logics shaping generative outputs. The project investigates how engagement with
generative AI can support a shift from bias awareness to the cultivation of ethical agency. It focuses on how
individuals recognize and respond to algorithmic representations, and how moments of interpretation
can become sites of negotiation, reflexivity, and shared accountability. Bias is understood not only as
a property of data but as something reproduced through interaction, sense-making, and institutional
context.</p>
      <p>Through participatory workshops, dialogical interviews, and critical making practices, the research
explores how collective, situated interventions can foster more inclusive and pluralistic approaches
to knowledge production. The resulting toolkit and evaluative framework offer practical resources
for educational and civic contexts, while proposing new ways of assessing agency, awareness, and
engagement in AI-mediated environments. Ultimately, this work expands the scope of data literacy and
participatory design toward more equitable, reflexive, and socially responsive human–AI interactions.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>The author would like to thank their supervisors, Brad Haylock and Laurene Vaughan, from the RMIT
Design School for their unwavering support and for guiding the development of this research.</p>
    </sec>
    <sec id="sec-7">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the author used Grammarly for grammar and spelling checks. Further, the author used DeepL for syntax review.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M.</given-names>
            <surname>Broussard</surname>
          </string-name>
          ,
          <article-title>More than a Glitch: Confronting Race, Gender, and Ability Bias in Tech</article-title>
          , MIT Press,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>K.</given-names>
            <surname>Crawford</surname>
          </string-name>
          ,
          <article-title>Atlas of AI: Power, politics, and the planetary costs of artificial intelligence</article-title>
          , Yale University Press,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>K.</given-names>
            <surname>Crawford</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Paglen</surname>
          </string-name>
          ,
          <article-title>Excavating AI: The politics of images in machine learning training sets</article-title>
          , https://excavating.ai/,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>D.</given-names>
            <surname>Haraway</surname>
          </string-name>
          ,
          <article-title>Situated knowledges: The science question in feminism and the privilege of partial perspective</article-title>
          ,
          <source>Feminist Studies</source>
          <volume>14</volume>
          (
          <year>1988</year>
          ). doi:10.2307/3178066.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>S.</given-names>
            <surname>Harding</surname>
          </string-name>
          ,
          <article-title>Subjectivity, experience and knowledge: An epistemology from/for rainbow coalition politics</article-title>
          ,
          <source>Development and Change</source>
          <volume>23</volume>
          (
          <year>1992</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>P.</given-names>
            <surname>Freire</surname>
          </string-name>
          ,
          <article-title>Pedagogy of the oppressed</article-title>
          , Seabury Press,
          <year>1970</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>B.</given-names>
            <surname>Hooks</surname>
          </string-name>
          ,
          <article-title>Teaching to transgress: Education as the practice of freedom</article-title>
          , Routledge,
          <year>1994</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>J.</given-names>
            <surname>Gray</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Gerlitz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Bounegru</surname>
          </string-name>
          ,
          <article-title>Data infrastructure literacy</article-title>
          ,
          <source>Big Data &amp; Society</source>
          <volume>5</volume>
          (
          <year>2018</year>
          ). doi:10.1177/2053951718786316.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>L.</given-names>
            <surname>Pangrazio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Selwyn</surname>
          </string-name>
          ,
          <article-title>Critical Data Literacies: Rethinking Data and Everyday Life</article-title>
          , MIT Press,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>N. Sánchez</given-names>
            <surname>Querubín</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Niederer</surname>
          </string-name>
          ,
          <article-title>Climate futures: machine learning from cli-fi</article-title>
          ,
          <source>Convergence</source>
          <volume>30</volume>
          (
          <year>2024</year>
          ). doi:10.1177/13548565221135715.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>