<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Frictional Design Approach: Towards Judicial AI and its Possible Applications</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Caterina Fregosi</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Federico Cabitza</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>IRCCS Ospedale Galeazzi-Sant'Ambrogio</institution>
          ,
          <addr-line>Milan</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Università degli Studi di Milano-Bicocca</institution>
          ,
          <addr-line>Milan</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Decision support systems (DSS) are increasingly being integrated into high-stakes domains like healthcare, law, and finance, where critical decisions have significant consequences. Traditional DSS often provide a single, clear-cut recommendation, which can lead to automation bias and diminish the user's sense of agency. There is also growing concern about over-reliance on these systems and the potential for deskilling among users. The knowledge gap we aim to address is the development of decision support systems that effectively encourage critical reflection and maintain user engagement and responsibility in decision-making processes. In this workshop contribution, we report on the development of Judicial AI, a novel approach inspired by Frictional AI. Judicial AI diverges from traditional DSS by offering multiple, contrasting explanations to support different potential outcomes. This design encourages users to engage in deeper cognitive processing, thereby promoting critical reflection, reducing automation bias, and preserving the user's sense of agency. This ongoing study employs a two-arm experiment to investigate the effects of this approach in the context of content classification tasks, comparing it with the traditional protocol. We expect that the Judicial protocol could not only mitigate automation bias but also safeguard users' sense of agency and promote long-term skill retention.</p>
      </abstract>
      <kwd-group>
        <kwd>Frictional AI</kwd>
        <kwd>Judicial AI</kwd>
        <kwd>Human-AI Decision making process</kwd>
        <kwd>eXplainable AI (XAI)</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Decision Support Systems (DSS) are increasingly deployed in domains where decision-makers
face high-stakes scenarios with significant consequences. It is essential to support users
not only in identifying the optimal decision but also in effectively managing the decision-making
process. This approach aims to mitigate the detrimental effects of interaction, such as
overconfidence or underconfidence [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], while fostering appropriate reliance on the decision support
system [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. This involves providing users with support to critically assess both their own
reasoning processes and the AI system’s recommendations, a feature often absent in oracular
decision support systems [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Such systems tend to offer clear-cut answers, thereby fostering
an uncritical reliance on the system. Cooper (1999) introduced the concept of cognitive friction,
defined as “the resistance encountered by a human intellect when it engages with a complex
system of rules” [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. In the technology domain, a friction-inspired design intentionally
incorporates what Cox et al. (2016) describe as design frictions, “points of difficulty encountered
during interaction with technology” [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], or what Cabitza et al. (2019) term programmed
inefficiencies [6]. Contrary to trends aiming to create seamless interactions that promote speed and
efficiency, a “positive friction” strategy deliberately integrates these elements to improve user
engagement and reflection [7]. The term Frictional AI was introduced by Cabitza et al. (2024)
as an umbrella term for a variety of methods aimed at encouraging reflection in human-AI
decision-making processes by introducing cognitive friction [8].
      </p>
      <p>In the domain of Human-Computer Interaction (HCI), designing decision support systems
(DSS) that offer multiple, well-argued explanations for different hypotheses, rather than
simply presenting the (allegedly) correct answer, has two significant advantages that
address cognitive and ethical concerns. Firstly, such a system is designed to mitigate the risk
of automation bias, a well-documented phenomenon where users over-rely on automated
systems [9], often accepting their outputs uncritically even when they are wrong [10]. By
presenting multiple plausible explanations, each backing up one option, the DSS compels the user
to engage in deeper cognitive processing, comparing and contrasting the arguments put forth
for each hypothesis. This engagement limits automation bias, as users are less likely
to defer uncritically to a single system-provided solution. Even when one explanation appears
more convincing, the presence of alternative perspectives serves as a safeguard that helps
the user remain critical, reducing (but not eradicating) the chances of endorsing a false or
irrelevant conclusion.</p>
      <p>Secondly, offering multiple explanations helps address a less explored but equally important
issue in HCI: the potential loss of agency in human-AI interaction [11, 12], especially when
the system is renowned for its accuracy and reliability. When users are presented with only
one “right” answer, they may gradually lose their sense of control and responsibility over
decision-making processes [13]. This phenomenon, which can be likened to the concept
of deresponsibilization [14], reflects the risk that users may start to perceive themselves as
mere executors of the system’s decisions rather than as active, responsible agents who are
accountable for the final decision (as they still are). Over time, this can lead to long-term
consequences, including loss of motivation, loss of skill and hampered learning [8]. By fostering
an environment where the user must evaluate and decide between multiple, well-supported
hypotheses, the DSS preserves and even enhances the user’s sense of agency. The user remains
an active participant in the decision-making process, fully responsible for the final choice,
which in turn helps maintain and develop their cognitive skills.</p>
      <p>To this end, we have designed an experiment introducing a Judicial system, one of the protocols
associated with Frictional AI, in which an AI system provides contrasting plausible
explanations that each support a different decision outcome [15, 8].</p>
      <p>
        This resonates with the “agonistic machine learning” models [16] and with the Evaluative
AI paradigm introduced by Miller (2023) [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] for explainable decision support. The novelty
introduced by the Judicial AI system is that, inspired by the judicial domain, it proposes distinct
explanations to support each of the two possible outcomes.
      </p>
      <p>In this project we investigate the textual generative setting of the Judicial protocol for sentence
classification and its effects in terms of accuracy, confidence, reliance, perceived responsibility
and sense of agency for the decisions made.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Methods</title>
      <p>As illustrated in Figure 1, we will conduct a two-arm experiment to examine how different interaction
designs influence decision-making in content moderation tasks. Participants in each arm will
be presented with the same set of 30 sentences, previously identified as complex by
state-of-the-art hate speech detection systems and sourced from a social media platform. The two arms
of the study will employ distinct interaction protocols, to which participants will be
randomly assigned.</p>
      <list list-type="bullet">
        <list-item>
          <p>Judicial: Participants will be presented with a sentence and asked to classify it as either
hate speech or not hate speech. They will also be required to rate their
confidence in this decision on a four-level ordinal scale ranging from “not at all
confident” to “completely confident”. Following this initial judgment, participants will
be shown the sentence again, accompanied by two arguments generated by the Judicial
AI system and presented in colored boxes: one in a pastel red box on the left, arguing
for the classification of the sentence as hate speech, and another in a pastel blue box on
the right, arguing for the opposite classification. Participants will then be asked to provide
their final decision and confidence rating using the same scale as before. This process
will be repeated for all 30 cases. To minimize potential order bias in the
decision-making process, half of the participants assigned to this protocol will see
the opposing viewpoint first: for them, the pastel blue boxes, presenting arguments
for classifying the content as not hate speech, will be positioned on the left, while the pastel
red boxes, supporting the classification of the content as hate speech, will be on the right.</p>
        </list-item>
        <list-item>
          <p>Traditional: The initial screen for each case will be the same as in the Judicial arm.
Participants will first be asked to decide whether the sentence
constitutes hate speech or not, and to rate their confidence in this decision. After this, the
system will present its classification of the sentence (hate speech or not) and participants
will be asked to either confirm or reject this classification and to report their confidence
in this final decision.</p>
        </list-item>
      </list>
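      <p>As a minimal sketch of the allocation described above, the following Python snippet randomly
assigns participants to the two arms and counterbalances the left/right placement of the argument
boxes within the Judicial arm. The function name, the participant identifiers, the balanced
shuffle-and-split allocation and the fixed seed are illustrative assumptions, not details of the
actual study software.</p>
      <preformat preformat-type="code">
```python
import random


def assign_participants(participant_ids, seed=0):
    """Randomly split participants into two arms and counterbalance box order.

    Hypothetical helper: the shuffle-then-split scheme is an assumption made
    for illustration; the study only states that assignment is random.
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    ids = list(participant_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    judicial, traditional = ids[:half], ids[half:]

    assignment = {}
    for i, pid in enumerate(judicial):
        # Alternate which argument is shown on the left, so that each layout
        # is seen by (about) half of the Judicial participants.
        left = "hate speech" if i % 2 == 0 else "not hate speech"
        assignment[pid] = {"arm": "Judicial", "left_box": left}
    for pid in traditional:
        # The Traditional arm shows a single classification, so no box order.
        assignment[pid] = {"arm": "Traditional", "left_box": None}
    return assignment


if __name__ == "__main__":
    plan = assign_participants([f"P{n:02d}" for n in range(1, 41)])
    arms = [entry["arm"] for entry in plan.values()]
    print(arms.count("Judicial"), arms.count("Traditional"))
```
      </preformat>
      <p>Alternating the layout over an already shuffled list keeps the two box orders balanced while
leaving the order each participant receives effectively random.</p>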
      <p>Pre-test and post-test questionnaires will be administered to assess participants’ trust in the
AI system. Additionally, the post-test questionnaire will evaluate the sense of agency and
responsibility participants perceive regarding the decisions they made during the study.</p>
      <sec id="sec-2-1">
        <title>We expect to address the following research questions:</title>
        <list list-type="simple">
          <list-item>
            <p>R1: Is the Traditional protocol associated with higher accuracy compared to the Judicial
protocol?</p>
          </list-item>
          <list-item>
            <p>R2: Are respondents of the Judicial protocol more confident than Traditional ones in their own
final decision?</p>
          </list-item>
          <list-item>
            <p>R3: Is there a significant difference in reliance behavior between the Judicial and Traditional
protocols?</p>
          </list-item>
          <list-item>
            <p>R4: Do Judicial respondents feel a higher sense of agency and responsibility regarding their
decisions compared to Traditional respondents?</p>
          </list-item>
        </list>
        <p>To address the proposed research questions, a series of analyses will be conducted<sup>1</sup> on the
groups of users assigned to the Traditional and Judicial interaction protocols, as outlined in
Table 1.</p>
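        <p>The comparison of initial and final decisions planned in Table 1 could be sketched as follows.
This is only an illustration under stated assumptions: the record layout (initial label, final label,
ground-truth label), the function name and the toy data are hypothetical, and the snippet is not the
analysis tool used by the authors nor a report of any result.</p>
        <preformat preformat-type="code">
```python
from collections import Counter


def summarize_decisions(records):
    """Summarize accuracy and decision changes for one group of participants.

    Each record is a hypothetical (initial, final, truth) triple of labels;
    this layout is an assumption, not the study's data format.
    """
    n = len(records)
    if n == 0:
        raise ValueError("at least one record is required")
    changes = Counter()
    initial_correct = 0
    final_correct = 0
    for initial, final, truth in records:
        initial_correct += initial == truth
        final_correct += final == truth
        if initial != final:
            # With two classes, a changed decision either fixes an initial
            # error (beneficial) or introduces one (detrimental).
            changes["beneficial" if final == truth else "detrimental"] += 1
    return {
        "initial_accuracy": initial_correct / n,
        "final_accuracy": final_correct / n,
        "beneficial_changes": changes["beneficial"],
        "detrimental_changes": changes["detrimental"],
    }


if __name__ == "__main__":
    # Toy data for illustration only.
    toy = [
        ("hate", "hate", "hate"),
        ("not", "hate", "hate"),
        ("hate", "not", "hate"),
        ("not", "not", "not"),
    ]
    print(summarize_decisions(toy))
```
        </preformat>
        <p>Computing the same summary separately for the Judicial and Traditional groups would give the
quantities to be compared between arms.</p>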
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Expected results</title>
      <p>A DSS that offers multiple plausible explanations not only aligns with the principles of
user-centered design but also plays a crucial role in maintaining critical engagement, preserving
user agency, and ensuring the retention of decision-making skills, thereby addressing both
automation bias and the risk of deskilling in human-AI interactions. By encouraging deeper and
more critical reflection, this design reduces the risk of fostering undue user trust, which can
contribute to the White Box Paradox [17]. However, it is important to note that the protocol
could still inadvertently introduce bias if one explanation seems more convincing, even if it is
incorrect. The adoption of the Judicial protocol in human-AI interaction is expected to have a
significant impact on the quality of decisions made by users, in particular on perceived agency
and control over their choices. Therefore, we believe Judicial AI could represent a promising
direction for improving decision support processes, potentially increasing both
the effectiveness of these systems and user satisfaction. Further research focused on refining
Judicial protocols and examining their long-term effects could have significant implications for
the design and implementation of future decision support systems.</p>
      <sec id="sec-3-1">
        <table-wrap id="tab1">
          <label>Table 1</label>
          <table>
            <thead>
              <tr>
                <th>Research Question</th>
                <th>Planned analysis</th>
              </tr>
            </thead>
            <tbody>
              <tr>
                <td>Is the Traditional protocol associated with higher accuracy compared to the Judicial protocol?</td>
                <td>Compare initial and final accuracy levels of users in both the Traditional and Judicial protocol groups to assess if the absence of direct advice in the Judicial protocol affects final accuracy.</td>
              </tr>
              <tr>
                <td>Are respondents of the Judicial protocol more confident than Traditional ones in their own final decision?</td>
                <td>Analyze and compare the final confidence levels and the differences between initial and final confidence for both groups to determine if the Judicial protocol increases confidence in the final decision.</td>
              </tr>
              <tr>
                <td>Is there a significant difference in reliance between the Judicial and Traditional protocols?</td>
                <td>Compare reliance by analyzing how users in both groups rely on correct or incorrect advice/explanations. Examine cases where the initial decision differs from the final decision to identify reliance patterns.</td>
              </tr>
              <tr>
                <td>Do Judicial respondents feel a higher sense of agency and responsibility regarding their decisions compared to Traditional respondents?</td>
                <td>Analyze the final questionnaire responses to compare the perceived levels of AI influence, responsibility, and sense of agency between the two groups.</td>
              </tr>
            </tbody>
          </table>
        </table-wrap>
        <fn-group>
          <fn id="fn1">
            <label>1</label>
            <p>With the tool available at <ext-link ext-link-type="uri" xlink:href="https://mudilab.github.io/dss-quality-assessment/">https://mudilab.github.io/dss-quality-assessment/</ext-link></p>
          </fn>
        </fn-group>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Acknowledgments</title>
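      <p>C. Fregosi and F. Cabitza acknowledge funding support provided by the Italian project PRIN
PNRR 2022 InXAID - Interaction with eXplainable Artificial Intelligence in (medical) Decision
making. CUP: H53D23008090001 funded by the European Union - Next Generation EU.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] <string-name><given-names>T.</given-names> <surname>Kliegr</surname></string-name>, <string-name><given-names>Š.</given-names> <surname>Bahník</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Fürnkranz</surname></string-name>, <article-title>A review of possible effects of cognitive biases on interpretation of rule-based machine learning models</article-title>, <source>Artificial Intelligence</source> <volume>295</volume> (<year>2021</year>) <fpage>103458</fpage>.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[2] <string-name><given-names>F.</given-names> <surname>Cabitza</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Campagner</surname></string-name>, <string-name><given-names>R.</given-names> <surname>Angius</surname></string-name>, <string-name><given-names>C.</given-names> <surname>Natali</surname></string-name>, <string-name><given-names>C.</given-names> <surname>Reverberi</surname></string-name>, <article-title>Ai shall have no dominion: on how to measure technology dominance in ai-supported human decision-making</article-title>, <source>in: Proceedings of the 2023 CHI conference on human factors in computing systems</source>, <year>2023</year>, pp. <fpage>1</fpage>-<lpage>20</lpage>.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[3] <string-name><given-names>T.</given-names> <surname>Miller</surname></string-name>, <article-title>Explainable ai is dead, long live explainable ai! hypothesis-driven decision support using evaluative ai</article-title>, <source>in: Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency</source>, <year>2023</year>, pp. <fpage>333</fpage>-<lpage>342</lpage>.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[4] <string-name><given-names>A.</given-names> <surname>Cooper</surname></string-name>, <article-title>The inmates are running the asylum</article-title>, Springer, <year>1999</year>.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[5] <string-name><given-names>A. L.</given-names> <surname>Cox</surname></string-name>, <string-name><given-names>S. J.</given-names> <surname>Gould</surname></string-name>, <string-name><given-names>M. E.</given-names> <surname>Cecchinato</surname></string-name>, <string-name><given-names>I.</given-names> <surname>Iacovides</surname></string-name>, <string-name><given-names>I.</given-names> <surname>Renfree</surname></string-name>, <article-title>Design frictions for mindful interactions: The case for microboundaries</article-title>, <source>in: Proceedings of the 2016 CHI conference extended abstracts on human factors in computing systems</source>, <year>2016</year>, pp. <fpage>1389</fpage>-<lpage>1397</lpage>.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[6] <string-name><given-names>F.</given-names> <surname>Cabitza</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Campagner</surname></string-name>, <string-name><given-names>D.</given-names> <surname>Ciucci</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Seveso</surname></string-name>, <article-title>Programmed inefficiencies in dss-supported human decision making</article-title>, <source>in: Modeling Decisions for Artificial Intelligence: 16th International Conference, MDAI 2019, Milan, Italy, September 4–6, 2019, Proceedings 16</source>, Springer, <year>2019</year>, pp. <fpage>201</fpage>-<lpage>212</lpage>.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>[7] <string-name><given-names>Z.</given-names> <surname>Chen</surname></string-name>, <string-name><given-names>R.</given-names> <surname>Schmidt</surname></string-name>, <article-title>Exploring a behavioral model of “positive friction” in human-ai interaction</article-title>, <source>arXiv preprint arXiv:2402.09683</source> (<year>2024</year>).</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>[8] <string-name><given-names>F.</given-names> <surname>Cabitza</surname></string-name>, <string-name><given-names>C.</given-names> <surname>Natali</surname></string-name>, <string-name><given-names>L.</given-names> <surname>Famiglini</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Campagner</surname></string-name>, <string-name><given-names>V.</given-names> <surname>Caccavella</surname></string-name>, <string-name><given-names>E.</given-names> <surname>Gallazzi</surname></string-name>, <article-title>Never tell me the odds: Investigating pro-hoc explanations in medical decision making</article-title>, <source>Artificial Intelligence in Medicine</source> (<year>2024</year>) <fpage>102819</fpage>.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>[9] <string-name><given-names>Z.</given-names> <surname>Buçinca</surname></string-name>, <string-name><given-names>M. B.</given-names> <surname>Malaya</surname></string-name>, <string-name><given-names>K. Z.</given-names> <surname>Gajos</surname></string-name>, <article-title>To trust or to think: cognitive forcing functions can reduce overreliance on ai in ai-assisted decision-making</article-title>, <source>Proceedings of the ACM on Human-Computer Interaction</source> <volume>5</volume> (<year>2021</year>) <fpage>1</fpage>-<lpage>21</lpage>.</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>[10] <string-name><given-names>M.</given-names> <surname>Vered</surname></string-name>, <string-name><given-names>T.</given-names> <surname>Livni</surname></string-name>, <string-name><given-names>P. D. L.</given-names> <surname>Howe</surname></string-name>, <string-name><given-names>T.</given-names> <surname>Miller</surname></string-name>, <string-name><given-names>L.</given-names> <surname>Sonenberg</surname></string-name>, <article-title>The effects of explanations on automation bias</article-title>, <source>Artificial Intelligence</source> <volume>322</volume> (<year>2023</year>) <fpage>103952</fpage>.</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>[11] <string-name><given-names>H.</given-names> <surname>Limerick</surname></string-name>, <string-name><given-names>J. W.</given-names> <surname>Moore</surname></string-name>, <string-name><given-names>D.</given-names> <surname>Coyle</surname></string-name>, <article-title>Empirical evidence for a diminished sense of agency in speech interfaces</article-title>, <source>in: Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems</source>, <year>2015</year>, pp. <fpage>3967</fpage>-<lpage>3970</lpage>.</mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>[12] <string-name><given-names>A.</given-names> <surname>Galsgaard</surname></string-name>, <string-name><given-names>T.</given-names> <surname>Doorschodt</surname></string-name>, <string-name><given-names>A.-L.</given-names> <surname>Holten</surname></string-name>, <string-name><given-names>F. C.</given-names> <surname>Müller</surname></string-name>, <string-name><given-names>M. P.</given-names> <surname>Boesen</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Maas</surname></string-name>, <article-title>Artificial intelligence and multidisciplinary team meetings; a communication challenge for radiologists’ sense of agency and position as spider in a web?</article-title>, <source>European Journal of Radiology</source> <volume>155</volume> (<year>2022</year>) <fpage>110231</fpage>.</mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>[13] <string-name><given-names>R.</given-names> <surname>Legaspi</surname></string-name>, <string-name><given-names>W.</given-names> <surname>Xu</surname></string-name>, <string-name><given-names>T.</given-names> <surname>Konishi</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Wada</surname></string-name>, <string-name><given-names>N.</given-names> <surname>Kobayashi</surname></string-name>, <string-name><given-names>Y.</given-names> <surname>Naruse</surname></string-name>, <string-name><given-names>Y.</given-names> <surname>Ishikawa</surname></string-name>, <article-title>The sense of agency in human–ai interactions</article-title>, <source>Knowledge-Based Systems</source> <volume>286</volume> (<year>2024</year>) <fpage>111298</fpage>.</mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>[14] <string-name><given-names>C.</given-names> <surname>Sureau</surname></string-name>, <article-title>Medical deresponsibilization</article-title>, <source>Journal of assisted reproduction and genetics</source> <volume>12</volume> (<year>1995</year>) <fpage>552</fpage>-<lpage>558</lpage>.</mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>[15] <string-name><given-names>C.</given-names> <surname>Natali</surname></string-name>, et al., <article-title>Per aspera ad astra, or flourishing via friction: Stimulating cognitive activation by design through frictional decision support systems</article-title>, <source>in: CEUR workshop proceedings</source>, volume <volume>3481</volume>, <year>2023</year>, pp. <fpage>15</fpage>-<lpage>19</lpage>.</mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>[16] <string-name><given-names>M.</given-names> <surname>Hildebrandt</surname></string-name>, <article-title>Privacy as protection of the incomputable self: From agnostic to agonistic machine learning</article-title>, <source>Theoretical Inquiries in Law</source> <volume>20</volume> (<year>2019</year>) <fpage>83</fpage>-<lpage>121</lpage>.</mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>[17] <string-name><given-names>F.</given-names> <surname>Cabitza</surname></string-name>, <string-name><given-names>A.</given-names> <surname>Campagner</surname></string-name>, <string-name><given-names>L.</given-names> <surname>Ronzio</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Cameli</surname></string-name>, <string-name><given-names>G. E.</given-names> <surname>Mandoli</surname></string-name>, <string-name><given-names>M. C.</given-names> <surname>Pastore</surname></string-name>, <string-name><given-names>L. M.</given-names> <surname>Sconfienza</surname></string-name>, <string-name><given-names>D.</given-names> <surname>Folgado</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Barandas</surname></string-name>, <string-name><given-names>H.</given-names> <surname>Gamboa</surname></string-name>, <article-title>Rams, hounds and white boxes: Investigating human–ai collaboration protocols in medical diagnosis</article-title>, <source>Artificial Intelligence in Medicine</source> <volume>138</volume> (<year>2023</year>) <fpage>102506</fpage>.</mixed-citation>
      </ref>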
    </ref-list>
  </back>
</article>