<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>CEUR Workshop Proceedings</journal-title>
      </journal-title-group>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.1038/s41746-024-01161-1</article-id>
      <title-group>
        <article-title>XAI Framework for Trust Calibration in Skin Lesion Analysis</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Tim Katzke</string-name>
          <email>tim.katzke@tu-dortmund.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mustafa Yalçıner</string-name>
          <email>mustafa.yalciner@tu-dortmund.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jan Corazza</string-name>
          <email>jan.corazza@tu-dortmund.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alfio Ventura</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tim-Moritz Bündert</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Emmanuel Müller</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="editor">
          <string-name>XAI for Medical Diagnosis, Skin Lesion Analysis, Trust Calibration, Human-AI Interaction, Computer Vision</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <addr-line>44227</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Research Center Trustworthy Data Science and Security, University Alliance Ruhr</institution>
          ,
          <addr-line>Joseph-von-Fraunhofer-Str. 25, Dortmund</addr-line>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Scientific Computing Center, Karlsruhe Institute of Technology</institution>
          ,
          <addr-line>Zirkel 2, Karlsruhe, 76131</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>TU Dortmund University</institution>
          ,
          <addr-line>August-Schmidt-Straße 1, Dortmund, 44227</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>University of Duisburg-Essen</institution>
          ,
          <addr-line>Bismarckstraße 120, Duisburg, 47057</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <volume>70</volume>
      <fpage>09</fpage>
      <lpage>11</lpage>
      <abstract>
        <p>Explainable artificial intelligence (XAI) methods provide insights into machine learning models by making their decision processes more transparent to humans. Ideally, such transparency enables users to trust the AI to an appropriate extent, understanding both its capabilities and limitations for the given task. However, evaluations of XAI methods rarely assess their impact on how well users’ perceived trust aligns with actual model capabilities. In fact, a recent survey reveals that 80% of published work introducing an XAI method does not include user studies. To bridge this gap, we introduce SkinSplain, a web-based framework designed to measure users’ perceived trust in AI systems when interacting with both numerical and visual interpretability cues. SkinSplain allows users to provide inputs to a machine learning model and observe its explanations for predictions. Crucially, users then self-report their level of trust in the model’s predictions. These trust scores facilitate further analysis in user studies. Given the increased popularity of AI-based skin lesion analyzers, we employ SkinSplain in a user study to examine how explanation methods influence trust in AI-driven medical diagnostics. The source code is available at https://github.com/Ti-Kat/SkinSplain.</p>
      </abstract>
      <kwd-group>
        <kwd>XAI for Medical Diagnosis</kwd>
        <kwd>Skin Lesion Analysis</kwd>
        <kwd>Trust Calibration</kwd>
        <kwd>Human-AI Interaction</kwd>
        <kwd>Computer Vision</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
reliability [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Proper trust calibration ensures that users’ confidence matches the AI’s performance, leading
to more effective decision-making and collaboration [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Accordingly, we understand trust calibration as
an alignment problem between the subjective trust perception of the human and the objective trustworthiness
of the technical system [
        <xref ref-type="bibr" rid="ref2 ref6 ref7">6, 2, 7</xref>
        ].
      </p>
      <p>
        Demonstrating to AI system users that they can control and evaluate the quality of input data on
which AI predictions are performed can significantly enhance their ability to calibrate trust towards
such systems [
        <xref ref-type="bibr" rid="ref6 ref8">6, 8</xref>
        ]. For example, allowing users to interact with the AI system and explore its behaviour
on various inputs can help users build a nuanced understanding of when AI predictions are trustworthy
and when human oversight is required. This, in turn, allows them to discern how manipulations to
input data influence and may enhance the performance and trustworthiness of AI predictions.
      </p>
      <p>
        However, this type of controllability is underexplored in the literature focusing on Explainable
AI and trust in AI. In fact, a recent survey highlights that only one in five papers proposing a new
XAI method conducts any form of user study [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. This highlights two problems in current XAI
research. First, new explainability methods are usually not evaluated in real user studies. Therefore,
it remains unclear which XAI methods actually help calibrate users’ trust in the AI model. Second,
it remains underexplored how a user’s ability to interact with the AI and select inputs for which the
system performs well or poorly impacts trust calibration. More specifically, recent reviews on trust
calibration [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], trust in AI [
        <xref ref-type="bibr" rid="ref10 ref11 ref12">10, 11, 12</xref>
        ] as well as the “unified and practical user-centric framework for
explainable artificial intelligence” [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], and experimental studies [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] do not address how explaining the
role of human-controlled inputs in enhancing the performance of technical systems may help develop
calibrated trust. Therefore, current literature falls short in guiding users of technical systems on how to
achieve a collaborative performance that exceeds the capabilities of either humans or technical systems
operating independently.
      </p>
      <p>To close this gap, we design the web application SkinSplain. SkinSplain is a framework designed to
aid the explainability of an AI system by allowing the user to explore the system’s behaviour on
various inputs and to understand model behaviour through visual explainers. This interactive approach
enables the user to investigate the nuances in the model’s predictive performance, while being supported
by explainers that facilitate model understanding. Crucially, the user can then report a trust score for
each of the inputs, allowing for subsequent analysis in a broader user study.</p>
      <p>We demonstrate the use of SkinSplain practically and employ our framework in a preregistered
study on skin cancer detection.<sup>1</sup> More specifically, we investigate whether (1) explaining how the AI
model generates predictions and (2) showing inputs for which the AI system’s predictive performance
deteriorates lead to more calibrated trust in the AI system among laypeople.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>Whether layperson or expert, anyone who wants to evaluate new information with an established AI
system must provide input data. Such highly interactive AI systems, which depend on high-quality
user-controlled input, are relatively new. For example, large language models like ChatGPT and image-based
applications like Foodvisor<sup>2</sup> are used in everyday life and generate outputs based on whatever input is
given. We were specifically inspired by skin cancer detection and prevention due to its high practical
relevance [15, 16], current developments [17, 18, 19] and strong data availability [20]. SkinVision<sup>3</sup> and
FotoFinder<sup>4</sup> are commercially available, clinically validated, regulated and certified applications that
provide an AI prediction based on a photo of a skin lesion. This is done, for example, by automating the
quantification of the ABCD rule [21] (Asymmetry, Border, Color, Diameter), by displaying the image
areas that are particularly important for the prediction, or by assigning a score that indicates the overall
risk.
<sup>1</sup>https://aspredicted.org/2ryf-7y88.pdf
<sup>2</sup>https://www.foodvisor.io/en/
<sup>3</sup>https://www.skinvision.com/
<sup>4</sup>https://www.fotofinder.de/en/</p>
      <p>
        Currently, research on human controllability in such systems remains limited compared to more rigid,
less interactive AI systems with few human-controllable aspects, as seen, for example, in AI-assisted
decision-making [22, 23, 24]. A user’s perceived control is essential in determining the user’s intention
to use a technical system [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], a stepping stone to actual usage experience [25, 26]. Thus, perceived
control is also essential for the optimal trust calibration that results from extensive usage experience.
Theoretical considerations on trust in automation [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] emphasize that understanding the functionalities,
strengths, weaknesses, and limitations of technical systems is crucial for trust calibration. This leads
to an understanding of when the technical system should and should not be used [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. However, an
understanding of these limitations, in particular of the human-controllable elements of prediction
quality, could foster behavior that improves performance and trustworthiness in all situations
and goes beyond the general decision of usage. In fact, a series of studies [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] indicate that participants
were more willing to use an imperfect algorithm if they could modify it, even slightly. To summarize,
emphasizing user controllability ensures that users obtain the necessary learning experiences for long-term
trust calibration [
        <xref ref-type="bibr" rid="ref8">8, 25, 26</xref>
        ], ultimately fostering an appropriate level of trust in technical systems over
time [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. The SkinSplain Framework</title>
      <p>SkinSplain is a web-based framework that delivers real-time, interactive explanations of a classifier’s
decisions, enabling users to select and manipulate inputs while observing the immediate impact on both
model output and explanation quality. SkinSplain integrates mechanisms for participants to provide
self-reported trust measures directly within the interface. These perceived trust values may serve
as valuable ground truth for assessing user confidence, allowing researchers to compare subjective
evaluations with objective trust metrics, such as model predictive performance. As an application
domain, SkinSplain focuses on skin lesion classification, where images are categorized as benign or
malignant. This application not only underscores the practical relevance of the framework but also
highlights the importance of aligning explainability with both user trust and empirical performance
measures to ensure reliable and interpretable AI systems.</p>
      <sec id="sec-3-1">
        <title>3.1. User Interface</title>
        <p>The SkinSplain user interface, as shown in Figure 1, is divided into three sections.</p>
        <p><bold>Left Side (User-Controlled Input Selection)</bold> The left side of the interface provides users with direct
control over the input selection, while also displaying the current input image along with additional
metadata below. Users may load a new image from the ISIC skin lesion dataset [20] by clicking the
“New Image” button. Prior to selection, they can apply filters based on demographic attributes. Multiple
available drop-down menus correspond to categorical filters, such as age, diagnosis, sex, or lesion
location. For instance, selecting the “Body location” filter reveals options like “torso” or “hand”. To
emulate realistic variations in image quality, users can also adjust brightness, blur, and rotation via
interactive sliders, with changes immediately reflected on the screen. This functionality not only
ensures that the input data covers varying real-world conditions, but also allows the users to have more
influence over the input characteristics — a crucial factor in calibrating trust.</p>
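        <p>As an illustration of these quality manipulations, the following sketch shows how brightness, blur, and rotation adjustments could be applied to a lesion image with Pillow; the helper name and parameter values are hypothetical and need not match the framework’s actual implementation.</p>
        <preformat>
# Hypothetical sketch of the brightness/blur/rotation controls; the actual
# SkinSplain implementation may differ.
from PIL import Image, ImageEnhance, ImageFilter

def perturb_lesion_image(path, brightness=1.0, blur_radius=0.0, rotation_deg=0.0):
    """Apply user-selected quality manipulations to an ISIC lesion image."""
    img = Image.open(path).convert("RGB")
    img = ImageEnhance.Brightness(img).enhance(brightness)  # 1.0 = unchanged
    if blur_radius > 0:
        img = img.filter(ImageFilter.GaussianBlur(blur_radius))
    if rotation_deg:
        img = img.rotate(rotation_deg, expand=False)
    return img

# Example: darken, blur slightly, and rotate the displayed image.
perturbed = perturb_lesion_image("ISIC_0000000.jpg",
                                 brightness=0.7, blur_radius=2.0, rotation_deg=15)
        </preformat>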
        <p>Still, to ensure ethical and responsible use, SkinSplain is limited to publicly available ISIC data
and does not support user-uploaded images. This restriction helps mitigate privacy risks and ethical
concerns associated with applying an uncertified AI system to real user-provided medical images [27].
<bold>Right Side (XAI Analysis)</bold> Once the user clicks “Analyse Image”, the results of one or more
XAI methods are displayed on the right side of the interface within seconds, providing transparent
communication of the model’s internal decision-making processes. This transparency is essential for
users to assess how self-controlled input adjustments affect model predictions. We briefly outline the
currently integrated XAI methods below, and give a motivation and detailed explanations in Section 3.2.
• Melanoma Score – Prediction for the current input, ranging from benign (0) to malignant (10).
• Reliability Score – Assesses the reliability of the model’s prediction for the given input.
• Visual Explanation – Highlights the image regions most significant for the prediction.
• Similar Images – Displays the most similar training images from both classes, based on the learned representation.
<bold>Bottom (Measuring Users’ Perceived Trust)</bold> After reviewing the XAI analysis results, users can
submit their perceived trust in the model’s prediction via a slider at the bottom. The compilation and
evaluation of these perceived trust scores provide the self-reported trust data needed for systematic
trust calibration analyses. For the demo, we opted for a simple, one-dimensional measurement of
trust perception. The basic SkinSplain framework is designed for the repeated presentation of inputs. We
recommend limiting self-report measures within the framework to avoid overburdening participants
and to keep them motivated; the repeated-measures design is what makes these brief measurements valid.
We recommend incorporating the SkinSplain framework into a larger survey, akin to our preregistered
study, if complex state and trait self-report measures are pursued.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. AI Model and XAI Technologies</title>
        <p>[Figure 2: (a) Original image, (b) Baseline, (c) Blur + Brightened, (d) Blur + Darkened.]</p>
      </sec>
      <sec id="sec-3-3">
        <title>AI Model Foundation: Skin Lesion Classification</title>
        <p>Given that our image data is based on the ISIC Challenge datasets, we referenced the winning solution from the 2020 ISIC Challenge [28] in
developing our skin lesion classifier AI model. The solution employed an ensemble of convolutional
neural networks (CNNs), based predominantly on the EfficientNet architecture. Since individual models
in the ensemble performed nearly as well as the full ensemble, we opted for a simpler, single-model
approach, based on a more recent variant of that architecture, to facilitate the application of standard
XAI methods. Specifically, we fine-tuned an EfficientNetV2-S model [29], pre-trained on ImageNet,
for binary classification using images from the “Nevus” and “Melanoma” classes from a subset of the
ISIC datasets, while deliberately excluding metadata such as demographic attributes. Our skin lesion
classifier achieves an AUC-ROC of 0.9548 on an unseen test set drawn from the 2018 ISIC data.</p>
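        <p>To make this setup concrete, the following minimal sketch shows how such a classifier head could be configured with PyTorch and torchvision; the training loop, data pipeline, and hyperparameters shown here are illustrative assumptions rather than the exact configuration used for SkinSplain.</p>
        <preformat>
# Minimal sketch of the described classifier: EfficientNetV2-S pre-trained on
# ImageNet with a single-logit head for benign vs. malignant classification.
# Hyperparameters and the training loop are assumptions, not the authors' setup.
import torch
import torch.nn as nn
from torchvision import models

model = models.efficientnet_v2_s(
    weights=models.EfficientNet_V2_S_Weights.IMAGENET1K_V1
)

# Replace the 1000-class ImageNet head with one output neuron.
in_features = model.classifier[1].in_features
model.classifier[1] = nn.Linear(in_features, 1)

criterion = nn.BCEWithLogitsLoss()                        # binary objective
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

def training_step(images, labels):
    """One fine-tuning step; labels are 0 (nevus) or 1 (melanoma)."""
    optimizer.zero_grad()
    logits = model(images).squeeze(1)
    loss = criterion(logits, labels.float())
    loss.backward()
    optimizer.step()
    return loss.item()
        </preformat>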
      </sec>
      <sec id="sec-3-4">
        <title>Numerical AI-Trustworthiness Metrics: Melanoma and Reliability Scores</title>
        <p>The Melanoma Score offers an intuitive measure of the classifier’s confidence, where a value of 0 indicates high confidence in
benign classification and 10 indicates high confidence in malignancy. This is based on the classifier’s
single-neuron output, passed through a sigmoid activation function; the displayed Melanoma Score is
obtained by linearly mapping this output to a scale ranging from 0 to 10. Quantified on the same scale,
the Reliability Score indicates how closely an input image aligns with the training data distribution.
In essence, the further an input deviates from what the model is accustomed to, the less reliable its
predictions become. Providing users with a quantitative measure of this reliability is crucial for informed
decision-making. To that end, we calculate these scores for each input using a layer-wise variant of the
Deep k-Nearest Neighbors [30] algorithm. This identifies the k-nearest neighbors from the training set
within each layer of the skin lesion classifier, and computes a score that quantifies the consistency of
the latent behavior of the input as it is processed by the model. By reducing complex model outputs to
simple numerical indicators of objective trustworthiness, consistent in scale with the selectable
levels of subjective perceived trust, these scores facilitate trust calibration and enable users to quickly
gauge the certainty and reliability of the current prediction. In the long term, calibrated trust in the given AI
model and domain (here, skin cancer detection) emerges and may transfer to similar technology.</p>
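        <p>For illustration, the sketch below shows the 0 to 10 mapping of the Melanoma Score and a simplified, layer-wise nearest-neighbour consistency score standing in for the Deep k-Nearest Neighbors variant we use; the aggregation rule and function signatures are illustrative assumptions, not the exact scoring of [30].</p>
        <preformat>
# Illustrative sketch: Melanoma Score from the sigmoid output, and a simplified
# layer-wise k-NN consistency score standing in for Deep k-Nearest Neighbors [30].
import numpy as np
from sklearn.neighbors import NearestNeighbors

def melanoma_score(sigmoid_output: float) -> float:
    """Map the classifier's sigmoid output in [0, 1] linearly to the 0-10 scale."""
    return 10.0 * sigmoid_output

def reliability_score(train_activations, train_labels, input_activations, k=10):
    """Average neighbourhood consistency over layers, rescaled to 0-10.

    train_activations / input_activations: one feature matrix / vector per layer.
    train_labels: array of 0/1 class labels for the training images.
    """
    agreements = []
    for layer_train, layer_input in zip(train_activations, input_activations):
        index = NearestNeighbors(n_neighbors=k).fit(layer_train)
        _, idx = index.kneighbors(layer_input.reshape(1, -1))
        neighbour_labels = train_labels[idx[0]]
        # Share of the majority class among the k neighbours in this layer.
        p = float(np.mean(neighbour_labels))
        agreements.append(max(p, 1.0 - p))
    return 10.0 * float(np.mean(agreements))
        </preformat>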
      </sec>
      <sec id="sec-3-5">
        <title>Visual Interpretability: Saliency Maps and Similar Images</title>
        <p>Saliency Maps serve as a visual tool to highlight the most influential features that affect the classifier’s decision. These influential features are
often determined by analyzing how changes in the input impact the model’s output [31]. We employ the
Integrated Gradients method [32] on a transparent version of the base image for its robust estimation
of feature importance, while also applying Gaussian noise to the output image to further smooth the
results. This method is included to enhance interpretability by transparently communicating which
input regions drive the classifier’s output, and how this may change under diverse image manipulations.
An example of this is visualized in Figure 2. Here it can be observed that, by increasing or decreasing
the brightness, the model focuses more or less on irrelevant image artifacts outside the actual
skin lesion area. To communicate the behavior of the skin lesion classifier based on another visual
cue, we also optionally display Similar Images from the training dataset with respect to the internally
learned representation of the currently analysed image. This is performed for both a true melanoma
and a true benign image. More precisely, these most similar images are determined by identifying the
single nearest neighbor of either class in the representation space of the classifier’s penultimate layer
based on Euclidean distance. This is motivated by the goal of highlighting similarities and distinctions
between supposedly similar images with drastically different implications in a high-stakes environment.</p>
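        <p>A compact sketch of both visual cues is given below, using Captum’s Integrated Gradients implementation and a penultimate-layer nearest-neighbour lookup; the all-zero baseline, the helper names, and the omission of the noise-based smoothing step are simplifying assumptions.</p>
        <preformat>
# Sketch of the two visual cues: an Integrated Gradients attribution map (via
# Captum) and per-class nearest-neighbour retrieval in the penultimate-layer
# embedding space. Baseline choice and smoothing details are assumptions.
import torch
from captum.attr import IntegratedGradients

def attribution_map(model, input_tensor):
    """Integrated Gradients attributions w.r.t. the single melanoma logit."""
    ig = IntegratedGradients(lambda x: model(x).squeeze(1))  # one logit per image
    baseline = torch.zeros_like(input_tensor)                # all-black reference
    return ig.attribute(input_tensor, baselines=baseline)

def most_similar_per_class(embedding, train_embeddings, train_labels):
    """Index of the closest training image per class (Euclidean distance in the
    penultimate-layer representation space)."""
    distances = torch.cdist(embedding.unsqueeze(0), train_embeddings).squeeze(0)
    closest = {}
    for cls in (0, 1):                                       # 0 = benign, 1 = melanoma
        cls_idx = (train_labels == cls).nonzero(as_tuple=True)[0]
        closest[cls] = cls_idx[distances[cls_idx].argmin()].item()
    return closest
        </preformat>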
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Trust Calibration with SkinSplain</title>
      <p>
        We now briefly outline potential research applications of SkinSplain for user trust calibration.
<bold>Evaluating XAI through User Interaction</bold> SkinSplain enables real-time, interactive exploration of
model predictions alongside their corresponding explainability outputs, allowing study participants
to investigate how input quality influences outcomes and to understand their role in a collaborative
prediction process. Adding survey questions to the XAI setup allows for repeated assessment of
participant perceptions, facilitating interdisciplinary research, especially on trust calibration with its
deeply intertwined perceptual and technical nature [
        <xref ref-type="bibr" rid="ref2 ref6 ref7">6, 2, 7</xref>
        ]. Moreover, SkinSplain allows for the evaluation of various
XAI methods, whether using a subset of the provided methods or substituting alternative approaches,
to systematically assess their impact on perceived user trust. When interactivity is not essential, the
interface can be configured for fixed, survey-based studies (as shown in Figure 3 and specified in our
preregistration), offering a controlled environment for online experiments with XAI.
      </p>
      <sec id="sec-4-1">
        <title>Balancing Perceived Trust and Objective Trustworthiness</title>
        <p>
          In our framework, user trust is shaped both by interactive, controllable inputs and by the interpretability cues presented. Trust
calibration involves balancing this trust, that is, how much perceived trust users place in the
AI, against objective measures of the system’s performance (e.g., accuracy or reliability) that indicate the
trustworthiness of the system and how much trust should be placed in it. SkinSplain supports assessing
both measures; self-reported trust measures can be obtained in flexible user studies that investigate the
role of human-controllable inputs and the influence of XAI methods on trust in conjunction with objective
model performance. This provides the necessary data to systematically analyse trust calibration [
          <xref ref-type="bibr" rid="ref2 ref6 ref7">6, 2, 7</xref>
          ].
        </p>
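        <p>As one example of such an analysis, the sketch below relates self-reported trust scores exported from a SkinSplain study to the model’s per-image correctness; the file layout, column names, and the choice of Spearman correlation are illustrative assumptions rather than a prescribed analysis pipeline.</p>
        <preformat>
# One possible calibration analysis on exported study data; column names and
# the Spearman correlation are illustrative choices, not part of the framework.
import pandas as pd
from scipy.stats import spearmanr

ratings = pd.read_csv("trust_ratings.csv")   # hypothetical export, one row per rated image

# Objective trustworthiness proxy: was the model correct for this image?
ratings["model_correct"] = (
    ratings["predicted_label"] == ratings["true_label"]
).astype(int)

# Calibration proxy: association between self-reported trust and correctness.
rho, p_value = spearmanr(ratings["trust_score"], ratings["model_correct"])
print(f"Spearman rho = {rho:.2f} (p = {p_value:.3f})")
        </preformat>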
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Discussion and Outlook</title>
      <p>We introduced SkinSplain, a web application framework designed to investigate how XAI methods
affect user behavior and trust perceptions. Our work demonstrates its use in user studies examining
trust in a skin cancer classifier, where participants evaluated visual explainers and reliability measures.
Although our current implementation targets the skin cancer domain, our framework is inherently
domain-agnostic. Moreover, its components can be easily replaced to support user studies across diverse
subsets of XAI methods. Looking ahead, we plan to extend our research with non-static user studies
that exploit SkinSplain’s interactive capabilities to further explore the influence of XAI on user behavior,
ultimately contributing to the development of more trustworthy AI systems.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>This research is funded by the Research Center Trustworthy Data Science and Security
(https://rctrust.ai), one of the Research Alliance centres within the University Alliance Ruhr (https://uaruhr.de).</p>
    </sec>
    <sec id="sec-7">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this proposal, we used the ChatGPT 4o model from OpenAI for minor
language edits, aiming to enhance readability. After using this tool/service, the authors reviewed and
edited the content as needed and take full responsibility for the manuscript’s content.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J. K.</given-names>
            <surname>Winkler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Sies</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Fink</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Toberer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Enk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. S.</given-names>
            <surname>Abassi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Fuchs</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. A.</given-names>
            <surname>Haenssle</surname>
          </string-name>
          ,
          <article-title>Association between different scale bars in dermoscopic images and diagnostic performance of a market-approved deep learning convolutional neural network for melanoma recognition</article-title>
          ,
          <source>European Journal of Cancer</source>
          <volume>145</volume>
          (
          <year>2021</year>
          )
          <fpage>146</fpage>
          -
          <lpage>154</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J. D.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. A.</given-names>
            <surname>See</surname>
          </string-name>
          ,
          <article-title>Trust in automation: Designing for appropriate reliance</article-title>
          ,
          <source>Human Factors: The Journal of the Human Factors and Ergonomics Society</source>
          <volume>46</volume>
          (
          <year>2004</year>
          )
          <fpage>50</fpage>
          -
          <lpage>80</lpage>
          . doi:10.1518/hfes.46.1.50_30392.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A. P.</given-names>
            <surname>Association</surname>
          </string-name>
          , trust,
          <year>2018</year>
          . URL: https://dictionary.apa.org/trust.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>D. J.</given-names>
            <surname>McAllister</surname>
          </string-name>
          ,
          <article-title>Affect- and cognition-based trust as foundations for interpersonal cooperation in organizations</article-title>
          ,
          <source>Academy of Management Journal</source>
          <volume>38</volume>
          (
          <year>1995</year>
          )
          <fpage>24</fpage>
          -
          <lpage>59</lpage>
          . doi:10.5465/256727.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>C.</given-names>
            <surname>Starke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ventura</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bersch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Cha</surname>
          </string-name>
          , C. de Vreese, P. Doebler,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Krämer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Leib</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Peter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Schäfer</surname>
          </string-name>
          ,
          <string-name>
            <surname>I. Soraperra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Szczuka</surname>
          </string-name>
          , E. Tuchtfeld,
          <string-name>
            <given-names>R.</given-names>
            <surname>Wald</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Köbis</surname>
          </string-name>
          ,
          <article-title>Risks and protective measures for synthetic relationships</article-title>
          ,
          <source>Nature Human Behaviour</source>
          <volume>8</volume>
          (
          <year>2024</year>
          )
          <fpage>1834</fpage>
          -
          <lpage>1836</lpage>
          . URL: https://www.nature.com/articles/s41562-024-02005-4. doi:10.1038/s41562-024-02005-4.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Wischnewski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Krämer</surname>
          </string-name>
          , E. Müller,
          <article-title>Measuring and understanding trust calibrations for automated systems: A survey of the state-of-the-art and future directions</article-title>
          ,
          <source>in: Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, ACM</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>16</lpage>
          . doi:10.1145/3544548.3581197.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>R.</given-names>
            <surname>Parasuraman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Riley</surname>
          </string-name>
          ,
          <article-title>Humans and automation: Use, misuse, disuse, abuse</article-title>
          ,
          <source>Human Factors: The Journal of the Human Factors and Ergonomics Society</source>
          <volume>39</volume>
          (
          <year>1997</year>
          )
          <fpage>230</fpage>
          -
          <lpage>253</lpage>
          . doi:10.1518/001872097778543886.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>B. J.</given-names>
            <surname>Dietvorst</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Simmons</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Massey</surname>
          </string-name>
          ,
          <article-title>Overcoming algorithm aversion: People will use imperfect algorithms if they can (even slightly) modify them</article-title>
          ,
          <source>Management Science</source>
          <volume>64</volume>
          (
          <year>2018</year>
          )
          <fpage>1155</fpage>
          -
          <lpage>1170</lpage>
          . doi:10.1287/mnsc.2016.2643.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M.</given-names>
            <surname>Nauta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Trienes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Pathak</surname>
          </string-name>
          , E. Nguyen,
          <string-name>
            <given-names>M.</given-names>
            <surname>Peters</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Schmitt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schlötterer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>van Keulen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Seifert</surname>
          </string-name>
          ,
          <article-title>From anecdotal evidence to quantitative evaluation methods: A systematic review on evaluating explainable ai</article-title>
          ,
          <source>ACM Comput. Surv</source>
          .
          <volume>55</volume>
          (
          <year>2023</year>
          ). URL: https://doi.org/10.1145/3583558. doi:10.1145/3583558.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>A.</given-names>
            <surname>Jacovi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Marasović</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Miller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Goldberg</surname>
          </string-name>
          ,
          <article-title>Formalizing trust in artificial intelligence</article-title>
          ,
          <source>in: Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency</source>
          , ACM,
          <year>2021</year>
          , pp.
          <fpage>624</fpage>
          -
          <lpage>635</lpage>
          . doi:10.1145/3442188.3445923.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>A. D.</given-names>
            <surname>Kaplan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. T.</given-names>
            <surname>Kessler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. C.</given-names>
            <surname>Brill</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. A.</given-names>
            <surname>Hancock</surname>
          </string-name>
          ,
          <article-title>Trust in artificial intelligence: Meta-analytic findings</article-title>
          ,
          <source>Human Factors: The Journal of the Human Factors and Ergonomics Society</source>
          <volume>65</volume>
          (
          <year>2023</year>
          )
          <fpage>337</fpage>
          -
          <lpage>359</lpage>
          . doi:10.1177/00187208211013988.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>E.</given-names>
            <surname>Glikson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. W.</given-names>
            <surname>Woolley</surname>
          </string-name>
          ,
          <article-title>Human trust in artificial intelligence: Review of empirical research</article-title>
          ,
          <source>Academy of Management Annals</source>
          <volume>14</volume>
          (
          <year>2020</year>
          )
          <fpage>627</fpage>
          -
          <lpage>660</lpage>
          . doi:10.5465/annals.2018.0057.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>S.</given-names>
            <surname>Kaplan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Uusitalo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Lensu</surname>
          </string-name>
          ,
          <article-title>A unified and practical user-centric framework for explainable artificial intelligence</article-title>
          ,
          <source>Knowledge-Based Systems</source>
          <volume>283</volume>
          (
          <year>2024</year>
          )
          <fpage>111107</fpage>
          . URL: https://linkinghub.elsevier.com/retrieve/pii/S0950705123008572. doi:10.1016/j.knosys.2023.111107.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>B.</given-names>
            <surname>Leichtmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Humer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hinterreiter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Streit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mara</surname>
          </string-name>
          ,
          <article-title>Effects of explainable artificial intelligence on trust and human behavior in a high-risk decision task</article-title>
          ,
          <source>Computers in Human Behavior</source>
          <volume>139</volume>
          (
          <year>2023</year>
          ). doi:10.1016/j.chb.2022.107539.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>