<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Beyond Model Trust: Dual XAI for Adaptive and User-Centric Explainability</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Maria Luigia Natalia De Bonis</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Electrical and Information Engineering Department, Polytechnic University of Bari</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>Current approaches to Explainable AI (XAI) often rely on static, one-size-fits-all methods that fail to meet the diverse needs of users. While AI experts require detailed insights to debug, refine, and optimize models, AI non-experts need intuitive, domain-relevant explanations to support decision-making. Consequently, a single explanation format is frequently either too simplistic for expert analysis or, in most cases, too technical to be meaningful for non-experts. This research introduces Dual XAI, a multimodal, human-centered framework designed to address these limitations. By integrating complementary explanation techniques with user modeling strategies, Dual XAI aims to provide explanations whose content, format, and level of detail are adapted to the expertise and specific needs of different user types. Grounded in human-centered design principles, Dual XAI enhances both interpretability and usability by providing personalized, context-aware insights. This enables AI experts to perform in-depth analysis of model behavior, while giving AI non-experts accessible and actionable explanations.</p>
      </abstract>
      <kwd-group>
        <kwd>Explainable Artificial Intelligence (XAI)</kwd>
        <kwd>Human-Centered AI</kwd>
        <kwd>Quantitative Evaluation of XAI</kwd>
        <kwd>AI Debugging and Optimization</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Context and Motivation</title>
      <p>Artificial Intelligence (AI) has become a central technology in various domains, including medicine,
finance, and law, due to its ability to learn, reason, and adapt [1]. In recent years, advancements in
AI models have led to increasingly autonomous and sophisticated systems, enabling them to tackle
complex problems with unprecedented performance levels.</p>
      <p>However, as AI models grow in complexity, particularly with the rise of Deep Learning (DL),
understanding their decision-making processes has become significantly more challenging. Although early
AI systems were relatively interpretable, modern architectures, composed of millions of parameters, are
often regarded as black-boxes, making their internal reasoning opaque and limiting their widespread
adoption [2].</p>
      <p>This lack of transparency has fueled a growing demand for explainability and interpretability, with
increasing pressure from stakeholders, regulators, and end users who require better insights into
the reliability of AI model outputs. The absence of detailed explanations for a model’s behavior can
discourage its use, undermining trust and impeding AI deployment in real-world applications.</p>
      <p>A notable example arises in the healthcare sector, where AI is used in high-stakes diagnostic
procedures, such as radiological image classification. In these cases, it is a priority for clinicians to understand
the reasoning behind a model’s recommendations before considering its outputs in clinical practice.</p>
      <p>Trust in AI depends not only on explainability but also on the overall quality of the model, including
its accuracy, robustness, and fairness. A clear explanation alone is insufficient if the system suffers
from systematic bias or produces inconsistent outputs. Encouraging responsible adoption, therefore,
requires an approach that couples explainability with rigorous assessments of performance, stability of
predictions, and absence of systematic errors.</p>
      <p>The growing need for transparency has led to the emergence of Explainable Artificial Intelligence
(XAI) [3], a collection of methodologies and techniques designed to enhance the interpretability of AI
models without compromising their predictive performance.</p>
      <p>Challenges in current XAI approaches. Despite recent advances, current XAI methods face
substantial limitations. Many rely on a static one-size-fits-all approach, assuming that a single explanation
can suit all users, regardless of expertise or context. However, explainability is not a monolithic concept:
different users have distinct goals and cognitive needs. A common misconception is that applying an
XAI method like SHAP [4] automatically makes a model interpretable for everyone. In reality, uniform
explanations often prove too simplistic for experts or too complex for AI non-experts, limiting their
practical value.</p>
      <p>Overcoming these issues requires a paradigm shift toward flexible, human-centered approaches [5]
that adapt explanations based on expertise and interaction context. It is essential to distinguish between
two main categories of users: AI non-experts and AI experts, each with specific needs.</p>
      <p>AI non-experts (e.g., clinicians, business decision-makers) need intuitive, interactive, and
domain-relevant explanations to extract useful insights. AI experts (e.g., researchers, developers, data scientists),
on the other hand, require detailed, technical explanations for debugging, bias detection, and model
optimization, needs that are often overlooked in the literature.</p>
      <p>Finally, explainability should go beyond trust-building and evolve into a tool for discovery and
iterative improvement. To support this shift, it is crucial to develop quantitative, reproducible metrics
that assess the stability, fidelity, and usefulness of explanations. These metrics can help AI experts
select and optimize XAI techniques while enhancing knowledge extraction for AI non-experts.</p>
      <p>Towards a human-centered and multimodal XAI framework. In light of these considerations,
this research introduces Dual XAI, an innovative framework that embraces a human-centered and
multimodal approach to overcome the existing limitations of explainability techniques. Dual XAI aims to
provide flexible and adaptive solutions that cater to the specific needs of both AI experts and non-experts,
ensuring that explainability becomes a practical and effective tool for real-world AI applications. Given
the increasing role of AI in healthcare, this research will focus on improving the comprehensibility and
clinical relevance of explanations for medical professionals, particularly neurologists, with whom we
are collaborating in this research.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>Definitions and conceptual foundations of XAI. The definitions of interpretability and
explainability remain widely debated in the literature [6]. To date, there is no unanimous consensus on what makes
a model “comprehensible” in cognitive terms: while some authors view explainability as the ability
of a system to provide high-level reasons supporting its decisions [7], others favor approaches more
closely tied to the concept of fidelity, meaning the extent to which explanations accurately reflect the
model’s true behavior [8]. This lack of agreement has led to fragmented methodologies and inconsistent
evaluation criteria, making it difficult to determine whether an explainability technique genuinely
improves interpretability or merely provides a superficial justification for the model’s outputs. Without a
common conceptual foundation, XAI risks evolving in directions that fail to address the practical needs
of users. Further complicating the landscape are ethical and legal considerations. In many domains,
explainability is not just a matter of clarity; it is also an essential requirement for ensuring responsibility
and verifiability of algorithmic decisions.</p>
      <p>Regulatory aspects of explainability. The regulatory debate has highlighted the fundamental
role of explainability in facilitating responsible AI adoption. The AI Act has placed legal constraints
on the use of AI in high-risk applications, mandating transparency, fairness, and safety in
decision-making processes [9]. Similarly, regulations such as the GDPR [10] have reinforced the “right to
explanation”, underscoring the obligation of organizations to ensure transparent AI-driven decisions.
These considerations demonstrate that explainability is not merely a technical issue but also an essential
factor for regulatory compliance and societal acceptance of AI.</p>
      <p>XAI Methodologies. In recent years, a variety of XAI methodologies have been proposed, broadly
classified into two categories: ante-hoc methods, where the model itself is designed to be interpretable
from the development stage, and post-hoc methods, aimed at generating explanations for pre-trained
models [3]. Within the latter category, there are local approaches such as LIME [11] and SHAP, which
approximate a model’s behavior around individual input points without relying on its internal structure,
making them model-agnostic. In contrast, gradient-based methods like Grad-CAM [12] directly depend
on the model’s architecture, leveraging its gradients and activation maps to identify prominent regions
in neural networks. Other techniques include global explanations based on feature importance and
counterfactual strategies.</p>
      <sec id="sec-2-1">
        <title>Need for hybrid and multimodal XAI approaches</title>
        <p>Despite this range of approaches, XAI methods are often applied in isolation, without any real
integration among them. Each technique highlights specific aspects of the model, so the absence of a
multimodal strategy restricts the ability to gain a comprehensive view of the model’s behavior [13].
A multimodal or hybrid strategy would therefore be more effective in meeting the varied needs of
users [13].</p>
        <p>A further distinction in the literature differentiates explanations aimed at building trust for end users,
often called “BLUE XAI”, from those intended for debugging and model optimization by developers,
referred to as “RED XAI” [13]. While the former approach prioritizes usability and transparency, the
latter provides more technical, detailed analyses valuable for researchers and data scientists. However,
most XAI methods fall at one extreme or the other, being either overly simplified and “user-friendly”
yet unsuitable for advanced analysis, or too technical and thus difficult to interpret for non-specialists.
Moreover, using standalone XAI techniques does not guarantee that AI-expert users will achieve a deep
understanding of model behavior; rather, this level of insight calls for the complementary, multimodal
integration of different methods that offer multiple perspectives on the model.</p>
        <p>Challenges in evaluating explainability. Another unresolved issue is how to evaluate XAI
explanations. Currently, such assessments predominantly rely on user studies and subjective questionnaires,
which, while useful for gauging human perception, lack standardization. Efforts to introduce
quantitative metrics, such as stability and robustness, have not yet led to a broad consensus. Longo et al. propose
“XAI 2.0” [14], which integrates metrics addressing precision, consistency, and utility, differentiated
by user profile. Meanwhile, Biecek and Samek [13] underscore the need for shared benchmarks to compare
XAI techniques. The lack of widely accepted quantitative tools hinders standardization and practical
adoption of these methodologies.</p>
        <p>Concrete examples of the need for more comprehensive explanations are provided by Anders
et al. [15], who demonstrate how interpretive methods can identify and mitigate spurious
correlations (often referred to as the “Clever Hans effect”), thereby improving model reliability and
robustness. Arya et al. [16] introduce AI Explainability 360, an open-source toolkit that brings
together a variety of interpretive approaches (feature-based, instance-based, and global). Similar
toolkits, such as Alibi [17] and Captum [18], already combine local and global explanations.
While these solutions represent an important step toward multimodal strategies, they still lack
adaptive and dynamic management of explanations, which would automatically tailor both content
and communication style to different user roles and levels of expertise, i.e., AI experts vs. AI non-experts.</p>
        <p>Taken together, these considerations highlight the need for a more flexible approach that can provide
integrated, customizable explanations. As will be detailed in the subsequent sections, the Dual XAI
framework proposed in this study aims to move beyond the traditional one-size-fits-all model, offering
dynamic, multimodal solutions that simultaneously address requirements of transparency, trust, detailed
analysis, and model optimization.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Research Questions and Methodology</title>
      <p>In light of the gaps outlined in the previous sections, this research addresses three main questions,
designed to fill some of the current voids in the XAI literature:
• RQ1: How can AI model outputs be made genuinely usable for AI non-expert users, moving
beyond mere trust-building to provide domain-relevant, practical tools?
• RQ2: How can different Explainable AI techniques be combined to create multimodal and
complementary explanations that meet the analytical and optimization needs of AI expert users?
• RQ3: Which quantitative metrics can be defined to objectively, comparatively, and
comprehensively evaluate the effectiveness, robustness, and reliability of XAI techniques, taking into account
the diverse requirements of different user groups?</p>
      <p>To address these three areas of investigation, this work proposes the development of a framework
called Dual XAI (Figure 1), structured in three phases, one for each research question.</p>
      <p>Phase 1: requirements analysis and Human-Centered Design. With the aim of addressing RQ1,
this phase adopts a Human-Centered Design (HCD) approach [19], integrating participatory design
principles through co-design sessions where clinicians actively contribute to defining the most relevant
aspects of explainability. These sessions aim to inform the development of interactive prototypes,
ranging from conceptual sketches to functional versions of XAI interfaces. By varying key parameters
such as the type of visualization (e.g., bar charts, heatmaps), the level of textual detail, and the possibility
of executing what-if simulations, the study explores how different configurations impact usability and
support clinicians not only in assessing the reliability of model outputs but also in actively engaging with
the explanations. Through interactivity and navigable insights, the system would facilitate knowledge
extraction, enabling neurologists to explore patterns, investigate alternative scenarios, and refine their
understanding of the behavior of the underlying model, as well as their clinical questions.</p>
      <p>Interactive Visualization Techniques and Adaptive Learning mechanisms are used to ensure
adaptability to different levels of expertise. The system dynamically adjusts the level of explanation
detail based on user expertise, offering more intuitive, high-level representations for residents while
allowing specialists to access in-depth analyses when needed. Visual interfaces are designed to enhance
interpretability, providing clinicians with the flexibility to navigate between summary-level insights
and fine-grained feature contributions, ensuring that the explanations remain contextually relevant and
aligned with their diagnostic reasoning processes.</p>
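      <p>One way to picture the expertise-dependent level of detail described above is the following minimal sketch. It assumes that per-feature attributions are already available as a dictionary; the role names, the <monospace>DETAIL_PROFILES</monospace> table, its cut-offs, and the function <monospace>build_view</monospace> are hypothetical and serve only to illustrate the idea, not to describe the actual Dual XAI implementation.</p>
      <preformat>
```python
from typing import Dict, List

# Hypothetical detail profiles: role names and cut-offs are illustrative
# assumptions, not values taken from the Dual XAI framework.
DETAIL_PROFILES = {
    "resident": {"top_k": 3, "show_values": False},
    "specialist": {"top_k": 10, "show_values": True},
}


def build_view(attributions: Dict[str, float], role: str) -> List[str]:
    """Return explanation lines whose granularity depends on the user role."""
    profile = DETAIL_PROFILES[role]
    # Rank features by absolute contribution, largest first.
    ranked = sorted(attributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    lines = []
    for name, value in ranked[: profile["top_k"]]:
        if profile["show_values"]:
            # Fine-grained view: signed numeric contribution.
            lines.append(f"{name}: {value:+.2f}")
        else:
            # Summary view: direction only, no numbers.
            direction = "raises" if value > 0 else "lowers"
            lines.append(f"{name} {direction} the prediction")
    return lines


if __name__ == "__main__":
    example = {"a": 0.5, "b": -2.0, "c": 0.1, "d": 1.0}
    print(build_view(example, "resident"))
    print(build_view(example, "specialist"))
```
      </preformat>
      <p>A lookup table keeps the mapping from user profile to presentation explicit, so that further profiles could be added without touching the ranking logic.</p>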
      <p>The usability evaluation uses think-aloud protocols and task analysis, measuring both objective
performance indicators (such as error rates, help requests, and time to completion) and subjective
assessments via standardized questionnaires, including the User Engagement Scale [20], NASA-TLX
[21], and AttrakDiff [22]. Beyond usability, the study would assess the impact of explanations on
clinical decision-making by examining whether they improve diagnostic accuracy, reduce errors, and
enhance the clinician’s ability to interact with and interpret the provided insights. Qualitative insights
from post-task interviews and focus groups further elucidate how the provided explanations influence
reasoning, with a particular focus on how interactivity contributes to deeper exploration and hypothesis
generation.</p>
      <p>Findings from these evaluations inform an iterative refinement process, updating interface designs
based on real-world user interactions and testing subsequent iterations to ensure optimal adaptation to
clinical workflows and cognitive demands.</p>
      <p>Phase 2: integration and development of a multimodal framework. RQ2 is addressed through
the design and implementation of the Dual XAI framework, which encompasses a platform capable of
integrating multiple Explainable AI techniques (feature-based, instance-based, counterfactual,
gradient-based) in a complementary and dynamic fashion. In this phase:
• An initial selection of XAI techniques (e.g., SHAP, LIME, Grad-CAM, counterfactual approaches)
is carried out by means of a thorough literature review and a preliminary comparative analysis,
evaluating their explanatory capacity for different model types (e.g., neural networks for image
analysis or tabular models based on clinical features). Once identified, these methods are combined
according to how they are most effectively presented, whether through a unified interface with
multiple interactive panels, or a mechanism that allows switching between explanatory modes.
• Adaptive algorithms enable a robust Integration of Multiple Explanation Techniques,
centered on the needs of AI expert users and supporting Debugging and Optimization activities
(e.g., highlighting potential biases in MRI classification or in demographic clinical variables). The
resulting integrated framework is tested with expert users in a controlled experiment, comparing
(a) the multimodal approach, (b) a single-method XAI baseline, and (c) no explanation at all.
This study assesses the speed of AI expert users in debugging, detecting model overfitting, and
correcting hyperparameters, complemented by a qualitative investigation of whether multimodal
visualization genuinely adds clarity or instead imposes excessive cognitive load.</p>
      <sec id="sec-3-1">
        <title>Phase 3: development and validation of quantitative metrics</title>
        <p>In response to RQ3, this phase focuses on defining quantitative metrics to objectively assess the effectiveness, robustness, and reliability
of XAI methods. The evaluation framework incorporates both existing techniques and, where necessary,
new metrics tailored to specific explainability challenges.</p>
        <p>Among the established approaches, the Model Parameter Randomization Check tests whether
explanations change when model parameters are randomly perturbed (an explanation that stays unchanged
cannot be faithful to the trained model), while Target Sensitivity
assesses whether explanations for different outcomes are truly contrastive. Additionally, Stability for
Slight Variations measures robustness against minor input modifications. This study also explores novel
metrics to address aspects of explainability that remain insufficiently captured.</p>
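        <p>As a rough illustration of two of the checks named above, the following sketch scores a toy explainer for a linear model. <monospace>stability_for_slight_variations</monospace> reports the largest change in the attribution vector under small input noise, and <monospace>parameter_randomization_check</monospace> reports how far the attribution moves when the weights are replaced by random ones. The toy explainer, the Euclidean distance, the noise scale, and all function names are assumptions made for illustration; they are not the metric definitions adopted in this work.</p>
        <preformat>
```python
import math
import random
from typing import Callable, List, Sequence

Vector = List[float]


def linear_attribution(weights: Sequence[float], x: Sequence[float]) -> Vector:
    """Toy explainer: contribution of each feature to a linear model's output."""
    return [w * xi for w, xi in zip(weights, x)]


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two attribution vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def stability_for_slight_variations(
    explain: Callable[[Sequence[float], Sequence[float]], Vector],
    weights: Sequence[float],
    x: Sequence[float],
    noise: float = 0.01,
    trials: int = 50,
    seed: int = 0,
) -> float:
    """Largest change in the explanation under small random input
    perturbations (lower means more stable)."""
    rng = random.Random(seed)
    reference = explain(weights, x)
    worst = 0.0
    for _ in range(trials):
        perturbed = [xi + rng.uniform(-noise, noise) for xi in x]
        worst = max(worst, l2_distance(reference, explain(weights, perturbed)))
    return worst


def parameter_randomization_check(
    explain: Callable[[Sequence[float], Sequence[float]], Vector],
    weights: Sequence[float],
    x: Sequence[float],
    seed: int = 0,
) -> float:
    """Distance between the explanations obtained before and after
    replacing the model parameters with random values."""
    rng = random.Random(seed)
    randomized = [rng.gauss(0.0, 1.0) for _ in weights]
    return l2_distance(explain(weights, x), explain(randomized, x))


if __name__ == "__main__":
    w, x = [2.0, -1.0, 0.5], [1.0, 2.0, 3.0]
    print(stability_for_slight_variations(linear_attribution, w, x))
    print(parameter_randomization_check(linear_attribution, w, x))
```
        </preformat>
        <p>Both functions take the explainer as an argument, so the same scoring code could in principle be reused with other attribution methods that share this call signature.</p>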
        <p>The development of these evaluation metrics is accompanied by the creation of Benchmarks, Tools,
and Standards to ensure a structured and reproducible assessment of XAI techniques. Empirical
validation involves AI non-expert users (i.e., the neurologists) to align these metrics with real-world
analytical needs, ensuring their practical relevance. By integrating well-established evaluation criteria
with new tailored metrics, this phase aims to contribute to a more comprehensive and standardized
framework for explainability assessment, supporting both model optimization and the development
of more reliable interpretability methods. The Dual XAI framework operates through a continuous
feedback loop between the two user categories:
• Needs and interactions of AI non-experts provide valuable insights to refine XAI techniques used
by AI experts.
• AI experts, in turn, enhance the quality and usability of explanations, ensuring that AI non-expert
users receive explanations that are relevant, accurate, and tailored to their needs.</p>
        <p>This iterative cycle fosters continuous improvement, refining explanations dynamically based on
user-specific interactions and feedback.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Preliminary Results and Contributions</title>
      <p>The research carried out in the early period of this doctoral program lays the foundation for the proposed
Dual XAI framework by targeting two key challenges in XAI: (1) assessing the stability and reliability
of model explanations (related to RQ3), and (2) integrating Human-Centered design principles to make
AI tools truly usable for AI non-expert end users (supporting RQ1). In parallel, these efforts contribute
to the formulation of a multimodal strategy, central to RQ2.</p>
      <p>The progress thus far can be grouped into two interconnected lines of inquiry. First, a comparative
analysis of multiple XAI methods for brain age prediction pipelines highlighted the variability in
explanations, both within individual methods and between different approaches. Second, the development
and pilot testing of an interactive tool for neurologists underscored the importance of a human-centered
approach to ensure that AI-based systems are interpretable and usable, as well as tailored to the specific
user, so that they can be practically relevant in clinical settings.</p>
      <p>In the first study, titled “Explainable brain age prediction: a comparative evaluation of morphometric
and deep learning pipelines” [23], we focused on assessing the stability and coherence of different
post-hoc interpretability methods for brain age prediction. Specifically, we examined how changes in the
reference background affected SHAP-based explanations and contrasted DeepSHAP with Grad-CAM for
CNN-based models. The results showed that varying the background in SHAP led to significant shifts
in feature importance, reducing the overall consistency of the explanations. Likewise, comparing
Grad-CAM activation maps and feature-attribution methods (e.g., DeepSHAP) revealed key discrepancies,
indicating that each approach emphasizes different aspects of the prediction pipeline. We also found
that models based on morphometric features (e.g., cortical thickness, gray matter volume) often produce
explanations more aligned with known neuro-anatomical markers than purely image-based CNNs.
These findings collectively validated the need for a multimodal perspective on XAI, since relying on
any single method risks overlooking critical aspects of model behavior.</p>
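      <p>The dependence of SHAP attributions on the reference background can be illustrated with a deliberately simple case. For a linear model with independent features, the SHAP value of feature i equals w_i times the difference between x_i and the background mean of that feature, so changing the background changes the attributions directly. The sketch below uses this closed form and compares the resulting importance rankings with a Spearman rank correlation. The linear model, the example numbers, and the function names are assumptions chosen for illustration; the study itself used DeepSHAP on trained pipelines, which this sketch does not reproduce.</p>
      <preformat>
```python
from statistics import mean
from typing import List, Sequence


def linear_shap(
    weights: Sequence[float],
    x: Sequence[float],
    background: Sequence[Sequence[float]],
) -> List[float]:
    """Closed-form SHAP values of a linear model with independent features:
    phi_i = w_i * (x_i - mean of feature i over the background set)."""
    baseline = [mean(row[i] for row in background) for i in range(len(weights))]
    return [w * (xi - b) for w, xi, b in zip(weights, x, baseline)]


def importance_ranks(values: Sequence[float]) -> List[int]:
    """Rank features by absolute attribution (0 = most important)."""
    order = sorted(range(len(values)), key=lambda i: abs(values[i]), reverse=True)
    position = [0] * len(values)
    for rank, idx in enumerate(order):
        position[idx] = rank
    return position


def spearman(a: Sequence[float], b: Sequence[float]) -> float:
    """Spearman rank correlation of two importance rankings.
    Assumes at least two features and no tied attributions."""
    n = len(a)
    if 2 > n:
        raise ValueError("need at least two features")
    ra, rb = importance_ranks(a), importance_ranks(b)
    d2 = sum((p - q) ** 2 for p, q in zip(ra, rb))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


if __name__ == "__main__":
    weights = [1.0, 1.0, 1.0]
    x = [3.0, 2.0, 1.0]
    phi_a = linear_shap(weights, x, background=[[0.0, 0.0, 0.0]])
    phi_b = linear_shap(weights, x, background=[[3.0, 0.0, -2.0]])
    # Same model and same input, but the two backgrounds reverse the ranking.
    print(phi_a, phi_b, spearman(phi_a, phi_b))
```
      </preformat>
      <p>A rank correlation close to 1 across backgrounds would indicate consistent feature importance, whereas low or negative values signal the kind of instability discussed above.</p>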
      <p>In parallel, we sought to improve the usability and interpretability of AI models for clinical settings by
developing and evaluating a web-based application called Brain Age Predictor. This tool integrates the
deep learning model that, in our first study, demonstrated the highest stability and consistency of
SHAP-based explanations under varying background configurations. Designed primarily for neurologists, the
system provides a SHAP-based explanation module along with an interactive interface that enables
users to visualize, edit, and simulate the effect of various morphometric features on predicted brain age.</p>
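      <p>The edit-and-simulate interaction can be sketched as a small what-if helper that re-runs a predictor on an edited copy of the patient's features and reports the change. The predictor <monospace>toy_brain_age</monospace>, its coefficients, and the feature names are invented placeholders, not the model deployed in Brain Age Predictor; only the before/after comparison pattern is the point here.</p>
      <preformat>
```python
from typing import Callable, Dict

Features = Dict[str, float]


def toy_brain_age(features: Features) -> float:
    """Hypothetical stand-in predictor; the coefficients are made up
    for illustration and carry no clinical meaning."""
    return (
        90.0
        - 15.0 * features["cortical_thickness_mm"]
        - 0.02 * features["gray_matter_volume_ml"]
    )


def what_if(
    predict: Callable[[Features], float],
    features: Features,
    edits: Features,
) -> Dict[str, float]:
    """Compare the prediction before and after editing some feature values."""
    unknown = set(edits) - set(features)
    if unknown:
        raise KeyError(f"unknown features: {sorted(unknown)}")
    # Work on a copy so the original patient record is left untouched.
    edited = {**features, **edits}
    before, after = predict(features), predict(edited)
    return {"before": before, "after": after, "delta": after - before}


if __name__ == "__main__":
    patient = {"cortical_thickness_mm": 2.5, "gray_matter_volume_ml": 600.0}
    # Simulate a thinner cortex and inspect the change in predicted brain age.
    print(what_if(toy_brain_age, patient, {"cortical_thickness_mm": 2.3}))
```
      </preformat>
      <p>Rejecting edits to unknown features keeps a mistyped feature name from silently producing an unchanged prediction.</p>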
      <p>Explanations are presented through two complementary visualizations, both focused on the individual
patient level: a Tornado Plot, which displays the most influential features contributing to the predicted
brain age for a specific subject, and a custom-designed Glass Brain, an interactive 3D visualization
tailored for clinical users. The Glass Brain allows users to navigate and decompose SHAP values across
anatomical regions, enabling spatial reasoning and in-depth neuroanatomical analysis. The design,
implementation, and formative evaluation of this tool are presented in the paper “Explainable AI for
Brain Age Prediction: Design, Implementation, and Formative Evaluation of an Interactive Tool”, which
was recently accepted at the Hybrid Human Artificial Intelligence (HHAI) 2025 conference.</p>
      <p>In a formative study, neurology residents found the interface generally intuitive
and particularly appreciated the “what-if” scenario feature (e.g., adjusting cortical
thickness values to see how estimated brain age changes). Nonetheless, the evaluation also revealed the need
for clearer SHAP representations and enhanced support for longitudinal patient monitoring. These
findings underscore the importance of adopting interactive dashboards and adaptive explanation
strategies, to ensure that even AI non-expert users can derive meaningful insights from complex ML models.</p>
      <p>Together, these preliminary results provide critical evidence that model explanations must be both
methodologically robust (e.g., stable across varying reference backgrounds or architectural changes)
and contextually tailored to the domain expertise of end users. From a methodological standpoint, the
comparative evaluation emphasizes the value of complementary XAI strategies to capture different
facets of a model’s decision process. Meanwhile, the usability study demonstrates that delivering
intuitive, interactive interfaces is indispensable for facilitating trust and adoption among AI non-expert
users, such as medical trainees or clinicians who require clinically actionable insights rather than purely
algorithmic details.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Future Work and Expected Contributions</title>
      <p>This research will continue to refine and validate the proposed Dual XAI framework, establishing
it as a robust, adaptive, and user-centered approach to explainability. A primary focus will be the
development of quantitative evaluation metrics that go beyond subjective assessments, providing
reliable and reproducible measures of stability, fidelity, and contrastiveness to support AI experts in
model inspection and optimization. In parallel, user profiling strategies will be further developed to
ensure that explanations dynamically adapt to varying levels of expertise, task requirements, and usage
contexts. Rather than relying on a binary distinction between AI experts and non-experts, Dual XAI will
incorporate more nuanced user models to personalize content, format, and interactivity. The framework
will also integrate complementary and multimodal explanation techniques, combining
feature-based, instance-based, and counterfactual methods into cohesive, interactive interfaces. These will
be tailored to clinical reasoning processes, with neurology serving as the initial application domain.
By grounding explainability in measurable quality, contextual relevance, and personalized delivery,
Dual XAI aims to transform explanations from abstract model outputs into actionable tools for decision
support, making them effective and meaningful in high-stakes, real-world environments.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>I extend my gratitude to Prof. Tommaso Di Noia, my doctoral advisor, and Prof. Angela Lombardi,
my co-supervisor, for their guidance and support. I am also thankful to Prof. Carmelo Ardito for his
valuable insights. This work was partially supported by the project DEMETRA (CUP D99J22001970006)
Missione 6/componente 2/Investimento: 2.1 "Rafforzamento e potenziamento della ricerca biomedica
del SSN", funded by European Commission – NextGenerationEU.</p>
    </sec>
    <sec id="sec-7">
      <title>Declaration on Generative AI</title>
      <p>The author has not employed any Generative AI tools.</p>
      <ref-list>
        <title>References</title>
        <ref id="ref-1"><label>[1]</label><mixed-citation>S. Makridakis, The forthcoming artificial intelligence (AI) revolution: Its impact on society and firms, Futures 90 (2017) 46–60.</mixed-citation></ref>
        <ref id="ref-2"><label>[2]</label><mixed-citation>C. Molnar, Interpretable machine learning: A guide for making black box models explainable (2020).</mixed-citation></ref>
        <ref id="ref-3"><label>[3]</label><mixed-citation>A. B. Arrieta, N. Díaz-Rodríguez, J. Del Ser, A. Bennetot, S. Tabik, A. Barbado, S. García, S. Gil-López, D. Molina, R. Benjamins, et al., Explainable artificial intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI, Information Fusion 58 (2020) 82–115.</mixed-citation></ref>
        <ref id="ref-4"><label>[4]</label><mixed-citation>S. M. Lundberg, G. Erion, H. Chen, A. DeGrave, J. M. Prutkin, B. Nair, R. Katz, J. Himmelfarb, N. Bansal, S.-I. Lee, From local explanations to global understanding with explainable AI for trees, Nature Machine Intelligence 2 (2020) 56–67.</mixed-citation></ref>
        <ref id="ref-5"><label>[5]</label><mixed-citation>B. Shneiderman, Human-centered AI, Oxford University Press, 2022.</mixed-citation></ref>
        <ref id="ref-6"><label>[6]</label><mixed-citation>Z. C. Lipton, The mythos of model interpretability: In machine learning, the concept of interpretability is both important and slippery, Queue 16 (2018) 31–57.</mixed-citation></ref>
        <ref id="ref-7"><label>[7]</label><mixed-citation>T. Miller, Explanation in artificial intelligence: Insights from the social sciences, Artificial Intelligence 267 (2019) 1–38.</mixed-citation></ref>
        <ref id="ref-8"><label>[8]</label><mixed-citation>R. Guidotti, A. Monreale, S. Ruggieri, F. Turini, F. Giannotti, D. Pedreschi, A survey of methods for explaining black box models, ACM Computing Surveys (CSUR) 51 (2018) 1–42.</mixed-citation></ref>
        <ref id="ref-9"><label>[9]</label><mixed-citation>European Parliament, Council of the European Union, Regulation (EU) 2024/xxxx of the European Parliament and of the Council laying down harmonised rules on artificial intelligence (Artificial Intelligence Act), https://eur-lex.europa.eu/, 2024. Accessed: March 27, 2025.</mixed-citation></ref>
        <ref id="ref-10"><label>[10]</label><mixed-citation>E. GDPR, General data protection regulation (GDPR), 2018.</mixed-citation></ref>
        <ref id="ref-11"><label>[11]</label><mixed-citation>M. T. Ribeiro, S. Singh, C. Guestrin, "Why should I trust you?" Explaining the predictions of any classifier, in: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 1135–1144.</mixed-citation></ref>
        <ref id="ref-12"><label>[12]</label><mixed-citation>R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, D. Batra, Grad-CAM: Visual explanations from deep networks via gradient-based localization, in: Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 618–626.</mixed-citation></ref>
        <ref id="ref-13"><label>[13]</label><mixed-citation>P. Biecek, W. Samek, Position: Explain to question not to justify, in: Proceedings of the International Conference on Machine Learning, ICML’24, JMLR.org, 2024.</mixed-citation></ref>
        <ref id="ref-14"><label>[14]</label><mixed-citation>L. Longo, M. Brcic, F. Cabitza, J. Choi, R. Confalonieri, J. Del Ser, R. Guidotti, Y. Hayashi, F. Herrera, A. Holzinger, et al., Explainable artificial intelligence (XAI) 2.0: A manifesto of open challenges and interdisciplinary research directions, Information Fusion 106 (2024) 102301.</mixed-citation></ref>
        <ref id="ref-15"><label>[15]</label><mixed-citation>C. J. Anders, L. Weber, D. Neumann, W. Samek, K.-R. Müller, S. Lapuschkin, Finding and removing Clever Hans: Using explanation methods to debug and improve deep models, Information Fusion 77 (2022) 261–295.</mixed-citation></ref>
        <ref id="ref-16"><label>[16]</label><mixed-citation>V. Arya, R. K. Bellamy, P.-Y. Chen, A. Dhurandhar, M. Hind, S. C. Hoffman, S. Houde, Q. V. Liao, R. Luss, A. Mojsilović, et al., One explanation does not fit all: A toolkit and taxonomy of AI explainability techniques, arXiv preprint arXiv:1909.03012 (2019).</mixed-citation></ref>
        <ref id="ref-17"><label>[17]</label><mixed-citation>J. Klaise, A. Van Looveren, G. Vacanti, A. Coca, Alibi Explain: Algorithms for explaining machine learning models, Journal of Machine Learning Research 22 (2021) 1–7.</mixed-citation></ref>
        <ref id="ref-18"><label>[18]</label><mixed-citation>N. Kokhlikyan, V. Miglani, M. Martin, E. Wang, B. Alsallakh, J. Reynolds, A. Melnikov, N. Kliushkina, C. Araya, S. Yan, et al., Captum: A unified and generic model interpretability library for PyTorch, arXiv preprint arXiv:2009.07896 (2020).</mixed-citation></ref>
        <ref id="ref-19"><label>[19]</label><mixed-citation>International Organization for Standardization, ISO 9241-210:2019, Ergonomics of human-system interaction, Part 210: Human-centred design for interactive systems, https://www.iso.org/standard/77520.html, 2019.</mixed-citation></ref>
        <ref id="ref-20"><label>[20]</label><mixed-citation>H. O’Brien, Theoretical perspectives on user engagement, Why engagement matters: Cross-disciplinary perspectives of user engagement in digital media (2016) 1–26.</mixed-citation></ref>
        <ref id="ref-21"><label>[21]</label><mixed-citation>S. Hart, Development of NASA-TLX (Task Load Index): Results of empirical and theoretical research, Human mental workload/Elsevier (1988).</mixed-citation></ref>
        <ref id="ref-22"><label>[22]</label><mixed-citation>M. Hassenzahl, M. Burmester, F. Koller, AttrakDiff: Ein Fragebogen zur Messung wahrgenommener hedonischer und pragmatischer Qualität, Mensch &amp; Computer 2003: Interaktion in Bewegung (2003) 187–196.</mixed-citation></ref>
        <ref id="ref-23"><label>[23]</label><mixed-citation>M. L. N. De Bonis, G. Fasano, A. Lombardi, C. Ardito, A. Ferrara, E. Di Sciascio, T. Di Noia, Explainable brain age prediction: a comparative evaluation of morphometric and deep learning pipelines, Brain Informatics 11 (2024) 33.</mixed-citation></ref>
      </ref-list>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>