<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Joint Workshop on Interfaces and Human Decision Making for Recommender Systems, September</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Designing and Personalising Hybrid Multi-Modal Health Explanations for Lay Users</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Maxwell Szymanski</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Cristina Conati</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vero Vanden Abeele</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Katrien Verbert</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, KU Leuven</institution>
          ,
          <addr-line>Leuven</addr-line>
          ,
          <country country="BE">Belgium</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Computer Science, University of British Columbia</institution>
          ,
          <addr-line>Vancouver</addr-line>
          ,
          <country country="CA">Canada</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>18</volume>
      <issue>2023</issue>
      <fpage>0000</fpage>
      <lpage>0002</lpage>
      <abstract>
        <p>Recommender systems are increasingly used in mobile health applications. While researchers have highlighted the importance of explaining these recommendations to lay users, with benefits such as increased trust and a higher tendency to follow up on these recommendations, how to design explanations for lay users in critical contexts such as health remains largely unexplored. This paper explores and evaluates unimodal visual and textual explanations, as well as a combined hybrid explanation modality, for chronic-pain-related recommendations through a qualitative and quantitative lens via a repeated measures study (N = 262), and links users’ preference towards these modalities to differences in user perception such as trust and acceptance. Additionally, we explore how personal characteristics such as need for cognition and ease-of-satisfaction affect users’ preference and reception of the different explanations. Results indicate a strong preference towards the combined hybrid explanations. We also found interaction effects of ease-of-satisfaction and need for cognition on the perception of different explanation designs, indicating that users with a higher need for cognition tend to trust unimodal explanations more compared to hybrid explanations.</p>
      </abstract>
      <kwd-group>
        <kwd>health recommender systems</kwd>
        <kwd>health RS</kwd>
        <kwd>explainable AI</kwd>
        <kwd>lay users</kwd>
        <kwd>XAI</kwd>
        <kwd>hybrid explanations</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        In recent years, recommender systems (RS) have gained popularity across a multitude of domains,
including domains with high stakes such as health. In such domains, explanations to make
systems more scrutable have been identified as a core requirement [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ]. Various factors need to
be taken into account when designing such explanations, such as the goals of the explanations
[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and the type of end user [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Ribera et al. indicate that users can be differentiated with respect
to their expertise - AI experts, domain experts and lay users - and that the goals of explanations
differ across these groups [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Personality traits are another characteristic amongst which users
can be distinguished. One such trait that has often been studied in the context of explanations
is need for cognition (NFC - the tendency to engage in and enjoy activities that require thinking).
It has been previously linked to the effectiveness of explanations, with studies indicating that
users with a low NFC tend to benefit more from explanations and hints as they help them make
more confident decisions [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ]. Additionally, Kouki et al. investigated the effect of a user’s
ease-of-satisfaction (EOS - the user’s natural propensity to be satisfied) on their reception of
explanations [
        <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
        ], and found that it has an influence on the subjective persuasiveness of certain
explanation types.
      </p>
      <p>
        Despite the many interesting studies on the effectiveness of explanations in relation to user
characteristics, very little is known on how explanations need to be designed for lay users
in high stakes domains such as health: only 10% of explanations in the health domain are
designed with lay users in mind [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. There is a strong need to design and evaluate effective
explanation designs specifically for these lay users, as it has been shown that lay users are more
susceptible to a plethora of biases when interacting with explanations [
        <xref ref-type="bibr" rid="ref10 ref9">9, 10</xref>
        ]. Recent work
focusing on lay users suggests that hybrid explanations, that combine two or more modalities,
can be beneficial for lay users in terms of understanding and bias mitigation [
        <xref ref-type="bibr" rid="ref6 ref9">6, 9</xref>
        ]. However,
these studies have only contextualised their findings and benefits in low-stakes contexts, such
as music and news recommender systems. To the best of our knowledge, lay user evaluations
of hybrid explanations in high-stakes domains such as health remain largely unexplored. It is
therefore important to evaluate whether findings from studies in other domains generalise to
domains with higher stakes.
      </p>
      <p>
        In this paper, we explore whether combining a unimodal textual and visual explanation into
a hybrid explanation produces benefits for lay users in the higher-stakes health domain, or
introduces potential information overload or negative perceptions. As previous findings also
highlight the role of NFC and EOS in differences of perception, we also investigate potential
interaction effects of these traits on the user’s perception of explanations [
        <xref ref-type="bibr" rid="ref4 ref6">4, 6</xref>
        ]. This leads us to
answer the following research questions:
RQ1 Which health explanation modality (a unimodal textual or visual, or a hybrid modality)
do lay users prefer and why?
RQ2 In which way do personal characteristics (need for cognition, ease-of-satisfaction)
influence lay user perception of unimodal and hybrid explanations?
      </p>
      <p>To address these research questions, we developed a mobile health application that is able
to coach and inform users experiencing chronic musculoskeletal pain. We developed this
application in collaboration with IDEWE, the largest occupational health service provider in
Belgium. The app contains a conversational RS that provides knowledge-based
recommendations on how to effectively manage pain flare-ups. These recommendations were
developed based on input from six ergonomists and prevention advisors, and refined based on
data collected through an initial longitudinal study (N = 249). In a follow-up mixed-methods
study (N = 262), we compared unimodal textual and visual explanations
to a hybridised combination of the two. Results indicate that although subjective perception
(apart from usefulness) remains the same across the unimodal and hybrid explanations, lay
users strongly tend to prefer hybrid health explanations. Through a thematic analysis, we distil
themes as to why hybrid explanations are preferred, and additionally find that extending visual
explanations with a textual modality is more beneficial for users than extending existing
textual explanations with visuals. Secondly, we see an interaction effect of NFC on user
perception of different explanation modalities: users with a higher NFC have a more positive
reception of unimodal explanations compared to hybrid explanations, nuancing previous
findings in favour of hybrid explanations. Ease-of-satisfaction seems to be a strong predictor of
attitudes towards explanation modalities in general, with high-EOS users having a more positive
perception compared to users with low EOS, indicating a need for further research to explore
effective explanations for users with low ease-of-satisfaction.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related work</title>
      <sec id="sec-2-1">
        <title>2.1. RecSys in Health</title>
        <p>In this section, we will discuss the use of recommender systems in the health domain, as well as
recent advances in incorporating explanations into recommender systems.</p>
        <p>
          Recommender systems (RS) have become prominent in health applications, where they help
retrieve relevant information or recommend possible next actions tailored to the needs of the
end user. These health recommender systems (HRS) are used both in clinical settings as well as
in personal contexts where health applications aid users in their daily lives. A recent systematic
review of HRS for lay users shows that the majority of HRS that used a graphical user interface
focus on mobile applications [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ].
        </p>
        <p>
          However, the increased use of HRS is also paralleled by certain barriers. One such issue is a
mismatch between recommendations and the user’s expectations. Such a mismatch can lead to a decrease
in system effectiveness [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] and a decrease in trust towards the system [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ], potentially steering
the user away from future use. Early research mainly focused on increasing the accuracy of RS
in order to mitigate this issue. However, recent research increasingly explores the effects of
human factors, including research on explanations to increase transparency, human-in-the-loop
feedback to correct misunderstandings, and using conversational RS to increase familiarity
towards the system’s interface [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]. This broader approach in reasoning about RS should allow
researchers to improve RS effectiveness beyond quantitative algorithmic capability.
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Explaining health recommendations</title>
        <p>
          As highlighted earlier, adding explanations to recommendations can improve their overall
effectiveness. These explanations make the system interpretable and transparent, which in turn
can improve trust towards the system [
          <xref ref-type="bibr" rid="ref14">14, 15</xref>
          ]. There exist HRS that explain their rationale to
the end user, such as the food recommender system of Wayman et al. that explains why certain
recipes are recommended based on the user’s nutritional intake [16], or a visualisation for
medical experts that is able to explain breast cancer similarities [17]. However, the systematic
review of De Croon et al. states that only 10% of HRS that focus on lay users provide explanations
[
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. Additionally, a study by Bussone et al. points out that providing overly detailed explanations
for health recommenders can create unforeseen effects, such as over-reliance on
explanations [18], indicating that health recommendation explanations should be designed with
sufficient care. As such, designing explanations with lay users in mind, and evaluating them
with these users, is paramount.
        </p>
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Personalising explanations for end users</title>
        <p>
          Research has explored the influence of different user characteristics on the effect of the type, amount
and content of explanations shown. Naveed et al. explored the effects of a person’s thinking
style and found that participants with a fast and intuitive thinking style depended more on
explanations, whereas participants with a logical and rational thinking style acted more independently
of the given explanations [19]. Millecamp et al. repeatedly studied the effects of this thinking
style (by measuring need for cognition) in music RS explanations, and additionally investigated
the effects of a person’s musical sophistication and openness on attitudes towards explanations
[
          <xref ref-type="bibr" rid="ref4">20, 4</xref>
          ].
        </p>
        <p>
          Additionally, an increasing amount of research has indicated that the expertise of end users
should be taken into account when designing explanations. Ribera et al. highlight differences
in the needs, goals and limitations of different user groups, including AI experts, domain
experts and lay users. AI expert users, for example, use explanations to verify or improve the
underlying AI system, whereas domain experts can leverage explanations to gain additional
insights and learn from the system. Lay users have their own set of goals, but also their own
array of limitations. Wang et al. have highlighted several shortcomings of lay users that relate
to cognitive biases, such as confirmation and anchoring biases, due to a backward-oriented,
hypothesis-driven reasoning process [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. Tsai et al. also noticed a reinforcing effect, where users
avoid interacting with content they are not familiar with [21]. Szymanski et al. additionally
pointed out that lay users, despite having these biases and incorrectly interpreting certain
complex explanations, can still have a preference for said complex explanations over other,
simpler explanation modalities due to these cognitive biases [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ].
        </p>
        <p>
          Thus we see that interpretability through explanations has multiple benefits and can result
in an increased trust towards the system. However, as previously mentioned, the adoption of
explanations in HRS is still low. Furthermore, most health-related AI explanations are being
researched with AI and domain expert users in mind [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ], which leaves a large gap for explanations
aimed at lay users. Keeping in mind the aforementioned biases that lay users are prone to, it is
therefore important to assess whether explanations are indeed interpretable, to make sure no
misalignment in trust is created.
        </p>
      </sec>
      <sec id="sec-2-4">
        <title>2.4. Hybrid explanations</title>
        <p>
          There has been some previous work looking into hybrid explanations, which essentially combine
multiple explanations to enhance understanding and transparency [
          <xref ref-type="bibr" rid="ref9">22, 23, 9, 24</xref>
          ]. Hybrid
explanations can serve a twofold purpose: combining two or more explanation styles in the
same representation, to provide more information and context to end users, or combining two
or more explanation representations that offer the same information in different representations,
to help alleviate shortcomings of using only one representation.
        </p>
        <p>
          However, when designing hybrid explanations, the trade-off between information overload
and completeness needs to be considered. Kouki et al. investigated the optimal number of
explanation styles that users want to see, and found that this number partly depends on one’s
personal characteristics [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. The authors of TeleGam additionally focused their research on
the concept of user-specified resolutions of textual explanations, where users could choose the
level of detail of verbalisations that accompanied the visual explanations [23]. Considering
the benefits of hybrid explanations, as well as the pitfalls due to limited understanding and
potential biases related to lay users, more research needs to be done with regards to hybrid
explanations [25]. Given that previous work only situated their benefits in low-stakes domains,
future research, including this work, should explore if any benefits generalise to higher-stakes
domains.
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Explanation design</title>
      <p>In this section, we present the health recommender system that was developed, as well as the
different explanation modalities used to explain these recommendations.</p>
      <sec id="sec-3-1">
        <title>3.1. Designing a health recommender system</title>
        <p>For the purpose of this research, we have developed a mobile health application that is able to
coach and inform users experiencing chronic musculoskeletal pain. The main function of the
application is to monitor and educate users through a variety of information, questionnaires and
interactive exercises, by guiding them through content modules regarding various topics, such
as activity management, pain education, mindfulness, etc. The content of the app, including
the modules, was developed in collaboration with researchers from IDEWE and a team of six
ergonomists and prevention advisors [26].</p>
        <p>To help users with specific episodes of a pain flare-up, we have developed a so-called pain
logbook that users can fill in in order to receive specific and tailored recommendations. The
logbook includes a conversational RS to increase engagement and ease-of-use [27]. It asks the
user about situational data (how intense the pain is and in which situation it occurred), as
well as the user’s reaction and thoughts they had at the time (e.g. stopping all activities, being
frustrated or scared, feeling helpless). Based on these inputs, recommendations are generated
using a knowledge-based recommender system on how to better cope with the pain flare-up
next time, or how to better handle the given reactions and emotions. Domain knowledge from a
team of six ergonomists and prevention advisors, together with a physiotherapy researcher, was
translated into a set of rules that guide users towards one of the 39 in-app submodules related
to mindfulness, resilience, activity, etc. Note that more than one recommendation
can be given at once if relevant. In that case, recommendations are sorted based on their
feature importances (which themes were most prevalent in their input) as well as past pain
logbook entries. For example, if a user indicates being really frustrated when the flare-up
occurs, and ceases all activities, two recommendations will be given. The first will be related
to the frustrated emotions, as negative emotions sensitise our neurons causing a higher pain
perception, and will recommend a specific mindfulness exercise within the thoughts and emotions
module. A second recommendation will be related to the user stopping all physical activities,
as frequently abstaining from activities will cause the muscles to become even weaker over
time, which in turn will increase the amount of pain episodes. Therefore, a recommendation
regarding activity management will also be shown, which will tell the users to take a short
break, and adapt or lighten their current activities so they can continue. A high-level overview
of how the pain logbook, as well as the knowledge-based RS work, is given in Figure 1.</p>
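        <p>As a concrete illustration, the rule-based matching and importance-based sorting described above could be sketched as follows. This is a minimal sketch: the rule set, theme names and submodule paths are hypothetical stand-ins, not the actual rules developed with the IDEWE ergonomists and prevention advisors.</p>

```python
# Hypothetical rule base: detected theme -> (submodule path, rationale).
# Theme names and submodule paths are illustrative assumptions.
RULES = {
    "frustration": ("thoughts-and-emotions/mindfulness-exercise",
                    "Negative emotions sensitise neurons and heighten pain perception."),
    "stopped_activities": ("activity-management/pacing",
                           "Abstaining from activity weakens muscles over time."),
    "fear": ("pain-education/understanding-pain",
             "Understanding pain mechanisms reduces fear-driven avoidance."),
}

def recommend(logbook_entry):
    """Match detected themes against the rule base, then sort the resulting
    recommendations by how prevalent each theme was in the user's input."""
    matches = []
    for theme, weight in logbook_entry.items():
        if theme in RULES and weight > 0:
            submodule, rationale = RULES[theme]
            matches.append({"submodule": submodule,
                            "rationale": rationale,
                            "importance": weight})
    # More than one recommendation may apply; order by feature importance.
    return sorted(matches, key=lambda m: m["importance"], reverse=True)

# A user who is very frustrated and stopped all activities gets two
# recommendations, the most prevalent theme first.
entry = {"frustration": 0.9, "stopped_activities": 0.6, "fear": 0.0}
for rec in recommend(entry):
    print(rec["submodule"], rec["importance"])
```

        <p>In practice the weights would come from the conversational logbook inputs (and past entries); here they are fixed for illustration.</p>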
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Designing the explanations</title>
        <p>To explain the pain logbook recommendations, we designed different explanation modalities,
including the textual and visual explanations presented in Figure 2. Based on data from a
think-aloud study with these different explanation modalities, we found that feature importance (FI)
explanations were favoured by most users for their ability to give an overview as a
comprehensive and complete explanation, despite a possible information overload. Additionally, textual
explanations were favoured by users for whom the recommendation strongly aligned with their
expectations, as the explanation was able to give more in-depth information into the topic that
the users already agree with. Users that preferred less (overwhelming) amounts of information
were also more in favour of textual explanations, and less in favour of FI and other visually
more elaborate designs. To illustrate how these explanations are presented, we give the
following example. When a user has indicated being scared during a pain flare-up, and having
thoughts such as “Why is this happening to me?”, a recommendation towards the mindfulness
submodule is given. The visual explanation gives an overview of all themes that were captured
from the user’s input, and indicates negative emotions (e.g. frustrated, scared) with high feature
attributions, as they contribute most towards said recommendation. The textual explanation,
on the other hand, highlights that negative emotions were detected in the input, and explains
that negative emotions increase pain perception and can be minimised through mindfulness.
When the user clicks through to the next recommendation, the topics relevant to
the new recommendation are highlighted in the visual explanation, and a different text
is displayed in the textual explanation.</p>
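        <p>A minimal sketch of how the two modalities could be assembled into the hybrid explanation: the visual part is approximated here as a text-mode feature-importance overview, and the textual part as a templated sentence for the dominant theme. The template wording and theme names are illustrative assumptions, not the app’s actual content.</p>

```python
# Hypothetical in-depth texts per theme (the 'textual' modality).
TEMPLATES = {
    "negative_emotions": ("Negative emotions were detected in your input. They "
                          "increase pain perception and can be reduced through "
                          "mindfulness."),
    "stopped_activities": ("You stopped all activities. Short breaks and lighter "
                           "activities keep muscles active and reduce flare-ups."),
}

def visual_explanation(importances, width=20):
    """Feature-importance overview of all captured themes (the 'visual' modality),
    rendered as a simple text bar chart, highest attribution first."""
    lines = []
    for theme, value in sorted(importances.items(), key=lambda kv: -kv[1]):
        bar = "#" * round(value * width)
        lines.append(f"{theme:<20}{bar} {value:.2f}")
    return "\n".join(lines)

def hybrid_explanation(importances):
    """Combine the visual overview with the in-depth text for the top theme."""
    top_theme = max(importances, key=importances.get)
    return visual_explanation(importances) + "\n\n" + TEMPLATES[top_theme]

print(hybrid_explanation({"negative_emotions": 0.8, "stopped_activities": 0.4}))
```

        <p>When the user moves to the next recommendation, the same overview would be re-rendered with that recommendation’s themes highlighted, and a different template shown.</p>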
        <p>We evaluated these unimodal textual and visual explanations, as well as our
knowledge-based recommendations, through a longitudinal study with 249 participants who used the
application for 4 months. Participants could interact with the coaching application and pain
logbook that either had the visual or textual explanation (randomly assigned), and were able
to give suggestions through an in-app feedback module. The data of this iteration was then
used to refine the knowledge-based recommender system, as well as the explanations, for the
following iteration. Participants mentioned that some recommendations and their respective
textual explanations were too general in some cases. For instance, some participants found that
explanations related to activity management did not take their personal context into account,
such as which activity they were doing. To address these limitations, our collaborative team of six
ergonomists further diversified the textual explanations to take specific nuances in inputs into
account.</p>
        <p>
          In addition to the two unimodal explanations, we also introduce a new hybrid explanation
design (Figure 2, right) that combines both the textual and the visual explanation.
As discussed in Section 2.4, some previous work has shown positive effects of using hybrid
explanations in terms of increased understanding and preference [
          <xref ref-type="bibr" rid="ref6 ref9">9, 23, 6</xref>
          ]. However, previous
work regarding hybrid health explanations focuses more on multiple complex visuals that
suit domain experts, or states that a combination of complex feature attributions and class
attributions should be incorporated, which in practice might be overwhelming for lay users
[
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. Our hybrid design therefore draws inspiration from simpler combinations of textual
and visual explanations that have been proven to be beneficial for lay users, albeit not yet
tested in high-stakes domains such as health [
          <xref ref-type="bibr" rid="ref9">9, 23</xref>
          ].
        </p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Methodology</title>
      <sec id="sec-4-1">
        <title>4.1. Study design</title>
        <p>
          Through a within-subject study with N = 262 participants (ethically approved by the Ethics
Committee Research of the UZ Leuven university hospital (EC Research) under application
number S-65610), we explore if, and how, extending a unimodal textual explanation with
a visual explanation might benefit the user, and vice versa. We also investigate whether
extending textual explanations with visual explanations has different benefits for lay users
than doing so the other way around. During the within-subject study, participants are first
presented with contextual information regarding the user study, and are required to consent to
their (questionnaire) data being anonymously recorded in order to continue. After consenting,
the participants receive a questionnaire regarding their personal characteristics (PC). This
questionnaire relates to their ease-of-satisfaction (EOS, from Kouki et al., slightly adapted to
fit the health recommendation setting [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]), and the abbreviated 6-item version of the need
for cognition scale (NFC) [28]. Next, participants interact with the pain logbook with the
unimodal explanation (either textual or visual, depending on the randomly assigned
group). The study asks them to fill in the logbook according to a (chronic) pain scenario they
recently experienced, after which they receive several relevant recommendations, accompanied by
either a textual or visual explanation. Participants are free to go through the recommendations
and their respective explanations, and choose one that suits them, or select the ‘none’ option if no
recommendation is relevant. This step is followed by a questionnaire regarding the perception
of explanations, inquiring about the perceived trust, transparency, persuasiveness,
usefulness and satisfaction, taken from [29, 30]. After filling in the questionnaire, participants are
presented with the same pain logbook, but with an explanation that extends the first unimodal
explanation they were presented with (a visual explanation if textual was shown first, or
vice versa). Participants are asked to fill in the logbook according to a different pain episode they
recently experienced. This step is followed by the same questionnaire relating to the perceived
trust, transparency, persuasiveness, usefulness and satisfaction of the hybrid explanations. In
the final step, participants are asked to indicate their preference for either of the explanation
designs on a 5-point Likert scale (1 = strong preference for unimodal, 3 = both, 5 = strong
preference for hybrid), and to give a reason why they like their preferred explanation design
and dislike the other.
        </p>
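        <p>Scoring the personal-characteristic questionnaires can be sketched as below. Which items are reverse-coded, and the 5-point response format, are assumptions made for illustration; the actual instruments follow the cited EOS scale of Kouki et al. [6] and the 6-item NFC scale [28].</p>

```python
def score_scale(responses, reverse_items=(), scale_max=5):
    """Average Likert score with selected items reverse-coded.
    `reverse_items` holds the 0-based indices of reverse-keyed items
    (an illustrative assumption, not the published item keying)."""
    scored = [scale_max + 1 - r if i in reverse_items else r
              for i, r in enumerate(responses)]
    return sum(scored) / len(scored)

# Example: 6-item NFC responses, with items 2 and 6 assumed reverse-coded.
nfc = score_scale([4, 2, 5, 3, 4, 2], reverse_items={1, 5})
print(f"NFC score: {nfc:.2f}")
```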
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Recruitment</title>
        <p>
          Figure 3 shows an overview of the within-subject study design, in which users first interacted
with either the textual or visual explanations as the unimodal design, followed by interacting
with the hybrid explanations. We recruited a total of 291 English-speaking participants through
the online Prolific platform, where users receive $7.29 for participating in the study (approx.
25 minutes). The inclusion criterion was to have experienced chronic pain for a period of
at least three months in the past three years. After filtering out participants who did not
complete all questionnaires or incorrectly answered the ‘alertness’ question, the final
number of participants is 262 (143 participants in the textual↔hybrid setting, 119 in
the visual↔hybrid setting). The demographics (age, gender and scores on NFC and EOS) of the
participants can be seen in Figure 4. We also find that the medians of the personal characteristics
are in line with other studies [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ].
        </p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Results RQ1 - Which explanation do users prefer and why?</title>
      <p>To answer RQ1 regarding user preferences, we need to analyse the preference data. As it is
non-normal (Shapiro-Wilk test, textual↔hybrid: W = 0.65, p &lt; 0.001; visual↔hybrid:
W = 0.62, p &lt; 0.001), we cannot perform a (M)ANOVA analysis and instead opt for a
non-parametric univariate analysis using a one-sided Wilcoxon signed-rank test. Additionally, we
perform a two-coder thematic analysis to gain insights into the reasoning behind participants’ preferences.</p>
      <sec id="sec-5-1">
        <title>5.1. Main effect for preference</title>
        <p>We asked users to give a preference towards either the unimodal explanation (either visual
or textual) or the hybrid explanation on a 5-point Likert scale, with 1 being a strong
preference towards the unimodal explanation, 5 a strong preference towards hybrid, and 3
being a preference for both. To test whether there is a significant preference for the novel hybrid
explanations, we test the sample median against a hypothesised median of 3. Since preference
is non-normally distributed (Shapiro-Wilk: W = 0.668, p &lt; .001), we perform a
one-sided one-sample Wilcoxon signed-rank test. Results indicate strong evidence for the alternative
hypothesis (i.e. the preference median being greater than 3) (V = 420, p &lt; .001, r = .543,
a large effect), indicating that hybrid explanations were largely preferred over
unimodal explanations (Figure 5).</p>
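        <p>The one-sided one-sample Wilcoxon signed-rank test above can be sketched as follows with SciPy, using synthetic 5-point Likert responses in place of the study data: the sample is shifted by the hypothesised median of 3 and tested against zero.</p>

```python
from scipy import stats

# Synthetic 5-point Likert preference responses (NOT the study data).
preferences = [5, 4, 5, 3, 4, 5, 2, 5, 4, 4, 5, 3, 4, 5, 5]

# Shift by the hypothesised median; scipy tests the shifted sample against 0.
# Responses of exactly 3 become zeros and are dropped by the default
# zero_method="wilcox".
shifted = [p - 3 for p in preferences]
res = stats.wilcoxon(shifted, alternative="greater")
print(f"statistic={res.statistic}, p={res.pvalue:.4f}")
```

        <p>With `alternative="greater"` the statistic is the sum of ranks of positive differences, matching a directional test of whether the preference median exceeds the neutral midpoint.</p>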
      </sec>
      <sec id="sec-5-2">
        <title>5.2. How does extending a unimodal explanation affect user preference?</title>
        <p>To delve deeper into the reasons behind user preferences, we extend the previous results with
results from the thematic analysis to gain more insight into why users prefer certain modalities
over others. We do this through a two-coder iterative thematic analysis, with an agreement
percentage of 89.2% and Cohen’s kappa κ = 0.68, indicating substantial inter-coder agreement
[31]. For brevity, we only report the overarching themes rather than individual codes.</p>
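        <p>The two agreement statistics reported above (raw agreement percentage and Cohen’s kappa) can be computed as follows; the two coders’ label sequences and the code labels below are synthetic, for illustration only.</p>

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Return (observed agreement, Cohen's kappa) for two coders' labels."""
    n = len(coder_a)
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Expected chance agreement from each coder's marginal label frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / n**2
    return observed, (observed - expected) / (1 - expected)

# Synthetic code assignments from two coders (hypothetical code labels).
a = ["detail", "overview", "detail", "trust", "overview", "detail", "trust", "detail"]
b = ["detail", "overview", "detail", "trust", "detail", "detail", "trust", "detail"]
agreement, kappa = cohens_kappa(a, b)
print(f"agreement={agreement:.1%}, kappa={kappa:.2f}")
```

        <p>Kappa corrects the raw agreement for the agreement expected by chance given each coder’s label frequencies, which is why it is reported alongside the agreement percentage.</p>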
        <sec id="sec-5-2-1">
          <title>5.2.1. Discussion</title>
          <p>The thematic analysis points to an interesting synergy between the textual and visual
explanations. When used in a standalone setting, textual explanations were found to be
good at describing specific factors for recommendation in a detailed way, helping end users
to understand both the explanation, as well as the recommendation. However, most users
also found them to be less engaging and harder to scan through. On the other hand, visual
explanations gave most users a concise and global overview of all their inputs, making it easy to
scan for information at first glance. Yet we also notice a dislike for the lack of depth and detail,
with some users not even perceiving the visuals and highlights as an explanation, but instead
as a mere summary of their inputs. Combining both modalities has proven to integrate
the strengths of both, whilst at the same time alleviating the shortcomings
of using them separately (i.e. textual being less engaging and hard to scan, and visual
being too general or not offering a ‘real’ explanation).</p>
          <p>Whilst most users did express a positive sentiment towards the hybrid explanations, we
have to keep in mind that the preferred explanation modality also has its shortcomings.
Some users found the combination of both explanations either too overwhelming at
first glance due to the addition of text, or had no interest in seeing visuals due to a general
dislike or a lack of perceived added value. This relates closely to the well-known
completeness-conciseness (or simplicity-power) trade-off, for which several solutions
have been proposed. One solution is progressive disclosure, where information is offered
on demand by providing users with the essential information first and deferring
detailed or more advanced information to a pop-up or secondary screen [32]. A second method
consists of making textual explanations more readable by applying methods such as chunking,
highlighting or brevity [33]. These methods can be combined with progressive
disclosure to gradually expose users to the information in the hybrid explanation when needed, i.e.
when the recommendation doesn’t align with their expectations or when the user shows high
interest in the topic and wants more information. Further research could investigate whether these
proposed solutions can further reduce the remaining shortcomings of hybrid
explanations.</p>
          <p>
            We also investigate whether extending textual explanations with visuals has different benefits
compared to extending visual explanations with text. We see a stronger preference for the
textual modality (26.3% preferred standalone textual compared to 16.3% standalone visual; equivalently, 83.7% preferred the
hybrid when text was added versus 73.7% when visuals were added). When looking
at the overarching themes that emerge when unimodal explanations are extended, insightfulness is a major
theme when textual explanations are added (n = 82 unique users), the reasons being a more
detailed, easier to understand and more informative design. When visual explanations are added,
insightfulness still emerges, albeit to a lesser extent (n = 50 unique users). In addition
to insightfulness, users mention visual engagement as a reason to prefer the addition of visual
explanations (n = 35 unique users), finding that the visuals make the explanation more
engaging, appealing and less boring. The stronger preference for the textual modality contradicts
previous findings indicating that lay users tend to strongly prefer visual
explanations over textual ones due to their visual engagement [
            <xref ref-type="bibr" rid="ref9">9</xref>
            ]. However, those findings
were contextualised in non-critical settings. In high-stakes contexts such as health, users saw
a higher need for more informative explanations (a need-to-have), rather than “visual
engagement” (which is a nice-to-have).
          </p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Results RQ2 - How does the user’s perception of explanations differ?</title>
      <p>For RQ2 regarding perception, we find our dependent variables to be normally distributed
and are thus able to perform a (M)ANOVA analysis.</p>
      <sec id="sec-6-1">
        <title>6.1. Main effects of perception</title>
        <p>We can compare the hybrid explanation with both the visual and textual explanations to see
whether it performs better or worse in any of these categories. We perform a non-parametric
Wilcoxon signed-rank test in each setting to see whether the results for the hybrid explanation differ from
those for the other explanation. Using a Bonferroni correction (adjusting for n = 5 tests in each setting,
accounting for the 5 dimensions we test on), we find that only usefulness differs for the hybrid
explanation, compared to both the textual and the visual explanation.
Using a (Bonferroni-corrected) one-sided Wilcoxon signed-rank test, there is strong statistical
significance (p &lt; 0.001) indicating that participants found the hybrid explanations to
be more useful than both the textual and visual explanations (Table 1). Research
by Tsai et al. [30] mentions that when the explanations being combined are complementary,
they are perceived as more useful by users. This leads us to conclude that our designs of the
textual explanation and the simplified feature importances are complementary, and in turn are
perceived to be more useful.</p>
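The Bonferroni-corrected pairwise comparison described above can be sketched as follows, with hypothetical scores standing in for the study data:

```python
# Paired Wilcoxon signed-rank test (hybrid vs. unimodal) for one
# perception dimension, with a Bonferroni-adjusted alpha (hypothetical data).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_tests = 5                                # five perception dimensions
alpha = 0.05 / n_tests                     # Bonferroni correction

# Hypothetical per-participant scores for one dimension (usefulness)
hybrid = rng.integers(3, 6, size=40)       # hybrid rated 3-5
unimodal = rng.integers(1, 5, size=40)     # unimodal rated 1-4

res = stats.wilcoxon(hybrid, unimodal, alternative="greater")
significant = res.pvalue < alpha
print(f"p = {res.pvalue:.4g}, significant at corrected alpha: {significant}")
```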
      </sec>
      <sec id="sec-6-2">
        <title>6.2. Interaction effects of personal characteristics on perception</title>
        <p>We now explore the effects of the user’s personal characteristics (PC), consisting of NFC and
EOS, on the perception of explanations. We divide this research question into two parts: how
do PC affect a user’s perception of explanations in general, and how do PC affect the perception
of different unimodal and hybrid explanations. For each PC (NFC and EOS), we perform a median
split (leave-median-out) to create a group with a low score (below the median) and a group with a high
score (above the median). This allows us to perform a MANOVA with either NFC or EOS as the
predictor and trust, transparency, persuasion, acceptance, usefulness and preference as the
outcomes, to answer the aforementioned research questions. The scores for trust,
transparency, persuasion, satisfaction and usefulness have been rescaled to between 0 and 100 for
easier interpretation and comparison.</p>
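A minimal sketch of the leave-median-out split and the 0-100 rescaling, on hypothetical data (the subsequent MANOVA could then be run, e.g., with statsmodels’ MANOVA.from_formula; the scale bounds below are assumptions):

```python
# Leave-median-out split into low/high groups, plus 0-100 rescaling
# of Likert-style scores (hypothetical data).
import numpy as np

rng = np.random.default_rng(1)
nfc = rng.normal(0, 1, 120)                # hypothetical NFC scores

med = np.median(nfc)
low = nfc[nfc < med]                       # below-median group
high = nfc[nfc > med]                      # above-median group
# participants scoring exactly the median are left out

def rescale(x, lo=1, hi=5):
    """Rescale scores from [lo, hi] to a 0-100 range."""
    return (x - lo) / (hi - lo) * 100

print(len(low), len(high), rescale(np.array([1, 3, 5])))
```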
        <sec id="sec-6-2-1">
          <title>6.2.1. Effects of NFC and EOS on general explanation perception</title>
          <p>Table 2 shows the mean scores of each subgroup of participants (low and high NFC and
EOS), as well as the MANOVA p-value that indicates whether the difference between the low and
high subgroups is significant. We find that users with a higher ease-of-satisfaction have a
more positive perception of explanations in general, with their average scores for not
only perceived satisfaction, but also trust, transparency and persuasiveness, being 10 to 15%
higher compared to users with a lower ease-of-satisfaction. For NFC, we do not find any
significance w.r.t. explanation perceptions in general.</p>
        </sec>
        <sec id="sec-6-2-2">
          <title>6.2.2. Effects of NFC and EOS on differences in perception between unimodal and hybrid explanations</title>
          <p>
            We now use NFC and EOS as predictors of the differences in perception between the hybrid
explanation and the unimodal explanation (e.g. ΔTrust represents how much more or less
participants trust the hybrid explanation compared to the unimodal explanation, with a
positive score of x meaning that they trust hybrid explanations x% more than the
unimodal design). The results are shown in Table 3. We find that users with a higher
NFC tend to score the hybrid explanations lower in terms of trust, transparency and
usefulness compared to the unimodal explanation. To contextualise this finding, we look
at related work on NFC and music recommendation explanations by Millecamp et al.
[
            <xref ref-type="bibr" rid="ref4">4</xref>
            ]. They state that users with a lower NFC tend to benefit more from additional explanations
in general, whereas users with a high NFC only do so when the recommendations are not in
line with their expectations and they have an explicit need for explanations. When looking at the
metadata in our study on whether or not users agreed with the recommendations, we see that
the majority of users did agree with the recommendations, which explains why the need for
more information is lower, and consequently why the need for and perceived usefulness
of the hybrid, more informative explanations is slightly lower. However, the slightly lower trust, transparency,
persuasion and usefulness do not translate to their preference, as most users (low and high
NFC alike) still opt for the hybrid explanations. Thus we can conclude that it is still
feasible to present hybrid explanations by default, albeit with the modification of showing either
the textual or the visual part on demand when users request more information.
          </p>
          <p>For EOS, we see a similar trend where participants with a high EOS score the hybrid
explanations slightly lower compared to participants with a low EOS, but only in
terms of transparency. Similar results have been reported by Kouki et al., who found that
users with a high conscientiousness score also score high on EOS and prefer to see fewer
explanations [34]. While we didn’t measure conscientiousness directly, we can speculate
that the high conscientiousness - high EOS finding is generalisable (as it relates to personal
characteristics, not the study setup), which might explain why our participants with a higher
EOS scored the hybrid modality, with its additional explanations, slightly lower in terms of perception.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>7. Conclusion and future work</title>
      <p>
        In this work, we have designed and compared a textual, a visual and a hybrid explanation
modality that show non-expert users why they receive certain recommendations regarding
their chronic pain. Through an initial longitudinal study with 249 participants, we found certain
shortcomings when using previously proposed feature importances as visual explanations
for lay users. Using their feedback, we adapted the feature importance designs to fit their
mental model whilst still conveying largely the same information. Afterwards, we performed an
online within-subject study with 262 participants to compare the unimodal explanations (either
textual or the newly adapted visual) to a hybrid explanation in terms of preference and user
perception. We find that most users prefer the hybrid explanation, i.e. a combination of different
explanation modalities that gives them more insight into why the recommendation is
given. However, we found this preference to be higher among users who were first presented with
visual explanations, indicating a higher need for textual explanations to accompany existing
visual explanations. Through a thematic analysis, we explored themes to gain insight into why
users liked said explanation modalities, and extracted general guidelines for designing health
recommendations for non-expert users. We found that the adapted visual explanations
proved to be a good fit for conveying a summary of users’ input, as they allowed users to
see the general picture at a glance. However, visual explanations alone often lacked depth and
specific information, and could be confusing for non-expert users to interpret. Adding textual
explanations allowed users to gain more insight into both the explanation and the recommendation,
and text was often regarded as easy to understand by most non-expert users. However, using text
alone also had its shortcomings, such as being perceived as less engaging. Interestingly, both
the thematic analysis and the quantitative data have shown that both modalities in tandem, as a
hybrid explanation, alleviated most of the downsides of using visual and textual explanations
separately, and were chosen by the majority of participants. Additionally,
we found that the combination of modalities didn’t introduce any information overload effect,
since the negative themes regarding hybrid explanations mainly related to a lack of need for
more information, not to an overwhelming amount of information. These results add to
the limited corpus of previous findings within other domains, which state that complementarily
designed hybrid explanations aid end users rather than overwhelm them [
        <xref ref-type="bibr" rid="ref9">9, 30</xref>
        ], especially in
the underexplored area of designing hybrid explanations for lay users in high-stakes domains
such as health [
        <xref ref-type="bibr" rid="ref11 ref8">8, 11</xref>
        ].
      </p>
      <p>
        We also explored the effects of a user’s personal characteristics, to see whether their
ease-of-satisfaction and need for cognition influence the reception of the explanations. We
find that ease-of-satisfaction seems to be a good predictor of the user’s general perception of
explanations, as users with a high EOS tend to find explanations not only more satisfactory,
but also more trustworthy, transparent, persuasive and useful compared to users with a low EOS.
Need for cognition did not seem to affect a user’s general attitude towards explanations, but it
did highlight differences between explanation modalities (hybrid vs. unimodal). We
found that users with a low NFC tend to find the hybrid explanations more useful, transparent
and trustworthy compared to the unimodal explanations, and we find the opposite effect in users
with a high NFC. This nuances the overall benefits of hybrid explanations: although
users with a high NFC still tend to prefer hybrid explanations, they find them
slightly less useful, transparent and trustworthy compared to unimodal designs. This partly
relates to previous work by Szymanski et al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], where lay users preferred explanation modalities
that they unknowingly performed worse with, and calls for more in-depth research into where
these discrepancies come from and how they might potentially affect lay users.
      </p>
      <sec id="sec-7-1">
        <title>Limitations and future work</title>
        <p>
          While careful consideration has been put into the study design process, certain decisions
inherently lead to excluding other trajectories to explore. While our within-study design allowed
us to gain insightful qualitative data comparing standalone explanations to underexplored hybrid
ones (by seeing how extending either a unimodal textual or visual explanation with the other
modality afects user perception and preference and if extending textual first difers form visual
ifrst), it limits us from directly comparing all three explanation modalities with each other. A
follow-up between-subject study could prove to be useful for evaluating and comparing the
visual, textual and hybrid explanation modalities through a fully quantitative lens. Additionally,
the choice of representation of the visual and textual explanations naturally have their impact
on how users perceive those explanations. We based our explanations on the previous guidelines
for designing health explanations for lay users, but exploring the efects of diferent visual
representations on how they influence user perception, and potentially exploring multiple types
of visual explanations side by side, could also prove be interesting [35]. Lastly, many surveys
and papers note that besides measuring aspects such as trust, transparency, usefulness and so
on, user understanding should also be measured [
          <xref ref-type="bibr" rid="ref1">36, 1</xref>
          ]. While there are tools and measures
to inquire about AI and explanation understanding in an ofline way, previous research has
shown that due to biases present with non-expert users, there is a possibility that these users
unknowingly have an incorrect understanding of the system [
          <xref ref-type="bibr" rid="ref10 ref9">10, 9</xref>
          ].
        </p>
        <p>
          Future work could also focus on exploring the effects of other personal characteristics on
how non-expert users perceive health explanations. Similar research by Kouki et al. has shown
that other characteristics, such as a user’s dependability, neuroticism and agreeableness, can
also impact the way users perceive explanations [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. Capturing the user’s previous knowledge
of or experience with health RS could also prove useful. Other aspects of user perception,
such as the quality of the recommendation (e.g. perceived accuracy and novelty), also seem
to be influenced by both the explanation modality and user characteristics, and could
prove to be a useful path to explore in the context of health recommendations [34]. Lastly,
as mentioned in Section 7, performing an in-person study to assess user understanding of
explanations has a lot of potential, as non-expert users often suffer from cognitive biases that
can lead to incorrect understanding unbeknownst to them.
        </p>
      </sec>
    </sec>
    <sec id="sec-8">
      <title>Acknowledgments</title>
      <p>We would like to thank all participants for their time and valuable insights. This research
is part of the research projects Personal Health Empowerment (PHE) with project
number HBC.2018.2012, financed by Research Foundation Flanders (FWO) with project numbers
G0A4923N and G0A3319N as well as the C1 project with project number C14/21/072, and has
been ethically approved by the Ethics Committee Research UZ / KU Leuven (EC Research) with
application number S-65610.
[15] S. Stumpf, V. Rajaram, L. Li, W.-K. Wong, M. Burnett, T. G. Dietterich, E. Sullivan,
J. Herlocker, Interacting meaningfully with machine learning systems: Three
experiments, International Journal of Human Computer Studies 67 (2009) 639–662. URL:
https://openaccess.city.ac.uk/id/eprint/12417/. doi:10.1016/j.ijhcs.2009.03.004.
[16] E. Wayman, S. Madhvanath, Nudging Grocery Shoppers to Make Healthier Choices, in:
Proceedings of the Ninth Conference on Recommender Systems, ACM, 2015, pp. 289–292.
doi:10.1145/2792838.2799669.
[17] J.-B. Lamy, B. Sekar, G. Guezennec, J. Bouaud, B. Séroussi, Explainable artificial
intelligence for breast cancer: A visual case-based reasoning approach, Artificial
Intelligence in Medicine 94 (2019) 42–53. URL: https://www.sciencedirect.com/science/article/pii/S0933365718304846. doi:10.1016/j.artmed.2019.01.001.
[18] A. Bussone, S. Stumpf, D. M. O’Sullivan, The role of explanations on trust and reliance in
clinical decision support systems, 2015 International Conference on Healthcare Informatics
(2015) 160–169.
[19] S. Naveed, T. Donkers, J. Ziegler, Argumentation-based explanations in recommender
systems: Conceptual framework and empirical results, in: Adjunct Publication of the 26th
Conference on User Modeling, Adaptation and Personalization, UMAP ’18, Association for
Computing Machinery, New York, NY, USA, 2018, pp. 293–298. URL: https://doi.org/10.1145/3213586.3225240. doi:10.1145/3213586.3225240.
[20] M. Millecamp, R. Haveneers, K. Verbert, Cogito Ergo Quid? The Effect of Cognitive Style
in a Transparent Mobile Music Recommender System, UMAP ’20 (2020) 323–327. URL:
https://doi.org/10.1145/3340631.3394871.
[21] C.-H. Tsai, P. Brusilovsky, Beyond the ranked list: User-driven exploration and
diversification of social recommendation, in: 23rd International Conference on Intelligent User
Interfaces, IUI ’18, Association for Computing Machinery, New York, NY, USA, 2018, pp. 239–250.</p>
      <p>URL: https://doi.org/10.1145/3172944.3172959. doi:10.1145/3172944.3172959.
[22] D. Parra, P. Brusilovsky, User-controllable personalization: A case study with
setfusion, International Journal of Human-Computer Studies 78 (2015) 43–67. URL: https://www.sciencedirect.com/science/article/pii/S1071581915000208. doi:10.1016/j.ijhcs.2015.01.007.
[23] F. Hohman, A. Srinivasan, S. M. Drucker, Telegam: Combining visualization and
verbalization for interpretable machine learning, in: 2019 IEEE Visualization Conference (VIS),
2019, pp. 151–155. doi:10.1109/VISUAL.2019.8933695.
[24] K. Verbert, D. Parra, P. Brusilovsky, E. Duval, Visualizing recommendations to support
exploration, transparency and controllability, in: Proceedings of the 2013 International
Conference on Intelligent User Interfaces, IUI ’13, Association for Computing Machinery,
New York, NY, USA, 2013, pp. 351–362. URL: https://doi.org/10.1145/2449396.2449442.
doi:10.1145/2449396.2449442.
[25] M. Szymanski, K. Verbert, V. Vanden Abeele, Designing and evaluating explainable ai
for non-ai experts: Challenges and opportunities, in: Proceedings of the 16th ACM
Conference on Recommender Systems, RecSys ’22, Association for Computing Machinery,
New York, NY, USA, 2022, pp. 735–736. URL: https://doi.org/10.1145/3523227.3547427.
doi:10.1145/3523227.3547427.
[26] C. Puri, S. Keyaerts, M. Szymanski, L. Godderis, K. Verbert, S. Luca, B. Vanrumste, Daily pain
prediction in workplace using gaussian processes, Proceedings of the 16th International
Joint Conference on Biomedical Engineering Systems and Technologies (2023). doi:10.5220/0011611200003414.
[27] D. Jannach, A. Manzoor, W. Cai, L. Chen, A survey on conversational recommender
systems, 2020. arXiv:2004.00646.
[28] G. L. de Holanda Coelho, P. H. P. Hanel, L. J. Wolf, The very efficient assessment
of need for cognition: Developing a six-item version, Assessment 27 (2020) 1870–
1885. URL: https://doi.org/10.1177/1073191118793208. doi:10.1177/1073191118793208.
arXiv:https://doi.org/10.1177/1073191118793208, pMID: 30095000.
[29] J. D. Van Der Laan, A. Heino, D. De Waard, A simple procedure for the assessment of
acceptance of advanced transport telematics, Transportation Research Part C:
Emerging Technologies 5 (1997) 1–10. URL: https://www.sciencedirect.com/science/article/pii/S0968090X96000253. doi:10.1016/S0968-090X(96)00025-3.
[30] C.-H. Tsai, P. Brusilovsky, Evaluating Visual Explanations for Similarity-Based
Recommendations: User Perception and Performance, in: Proceedings of the 27th ACM Conference on
User Modeling, Adaptation and Personalization, UMAP ’19, Association for Computing
Machinery, New York, NY, USA, 2019, pp. 22–30. URL: https://doi.org/10.1145/3320435.3320465.
doi:10.1145/3320435.3320465.
[31] N. J.-M. Blackman, J. J. Koval, Interval estimation for cohen’s kappa as a measure of
agreement, Statistics in Medicine 19 (2000) 723–741. doi:10.1002/(SICI)1097-0258(20000315)19:5&lt;723::AID-SIM379&gt;3.0.CO;2-A.
[32] A. Springer, S. Whittaker, Progressive disclosure: When, why, and how do users want
algorithmic transparency information?, ACM Trans. Interact. Intell. Syst. 10 (2020). URL:
https://doi.org/10.1145/3374218. doi:10.1145/3374218.
[33] N. Wichman, Speaking of sentences: Chunking, Teaching English in the Two
Year College 36 (2009) 281–290. URL: https://www.proquest.com/scholarly-journals/
speaking-sentences-chunking/docview/220969438/se-2.
[34] P. Kouki, J. Schafer, J. Pujara, J. O’Donovan, L. Getoor, Personalized Explanations for
Hybrid Recommender Systems, in: Proceedings of the 24th International Conference
on Intelligent User Interfaces, IUI ’19, Association for Computing Machinery, New York,
NY, USA, 2019, pp. 379–390. URL: https://doi.org/10.1145/3301275.3302306. doi:10.1145/
3301275.3302306.
[35] M. Szymanski, V. V. Abeele, K. Verbert, Explaining health recommendations to lay users:</p>
      <p>The dos and don’ts, Technical Report, 2022. URL: http://ceur-ws.org.
[36] V. Lai, C. Chen, Q. V. Liao, A. Smith-Renner, C. Tan, Towards a science of
humanai decision making: A survey of empirical studies, CoRR abs/2112.11471 (2021). URL:
https://arxiv.org/abs/2112.11471. arXiv:2112.11471.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S.</given-names>
            <surname>Mohseni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Zarei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. D.</given-names>
            <surname>Ragan</surname>
          </string-name>
          ,
          <article-title>A multidisciplinary survey and framework for design and evaluation of explainable ai systems</article-title>
          ,
          <source>ACM Trans. Interact. Intell. Syst</source>
          .
          <volume>11</volume>
          (
          <year>2021</year>
          ). URL: https://doi.org/10.1145/3387166. doi:10.1145/3387166.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>N.</given-names>
            <surname>Tintarev</surname>
          </string-name>
          , Explanations of recommendations,
          <source>in: Proceedings of the 2007 ACM Conference on Recommender Systems</source>
          , RecSys ’07, Association for Computing Machinery, New York, NY, USA,
          <year>2007</year>
          , pp.
          <fpage>203</fpage>
          -
          <lpage>206</lpage>
          . URL: https://doi.org/10.1145/1297231.1297275. doi:10.1145/1297231.1297275.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>Ribera</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Lapedriza</surname>
          </string-name>
          ,
          <article-title>Can we do better explanations? a proposal of user-centered explainable ai</article-title>
          ,
          <source>CEUR Workshop Proceedings</source>
          <volume>2327</volume>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Millecamp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. N.</given-names>
            <surname>Htun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Conati</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Verbert</surname>
          </string-name>
          ,
          <article-title>To explain or not to explain: The effects of personal characteristics when explaining music recommendations</article-title>
          ,
          <source>in: Proceedings of the 24th International Conference on Intelligent User Interfaces</source>
          ,
          <source>IUI '19</source>
          , Association for Computing Machinery, New York, NY, USA,
          <year>2019</year>
          , pp.
          <fpage>397</fpage>
          -
          <lpage>407</lpage>
          . URL: https://doi.org/10.1145/3301275.3302313. doi:10.1145/3301275.3302313.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>C.</given-names>
            <surname>Conati</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Barral</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Putnam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Rieger</surname>
          </string-name>
          ,
          <article-title>Toward personalized xai: A case study in intelligent tutoring systems</article-title>
          ,
          <source>Artif. Intell</source>
          .
          <volume>298</volume>
          (
          <year>2021</year>
          ). URL: https://doi.org/10.1016/j.artint.2021.103503. doi:10.1016/j.artint.2021.103503.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>P.</given-names>
            <surname>Kouki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schafer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Pujara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>O'Donovan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Getoor</surname>
          </string-name>
          ,
          <article-title>Generating and Understanding Personalized Explanations in Hybrid Recommender Systems</article-title>
          ,
          <source>ACM Trans. Interact. Intell. Syst</source>
          .
          <volume>10</volume>
          (
          <year>2020</year>
          )
          <fpage>1</fpage>
          -
          <lpage>40</lpage>
          . URL: https://doi.org/10.1145/3365843. doi:10.1145/3365843.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.</given-names>
            <surname>Schafer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>O'Donovan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Höllerer</surname>
          </string-name>
          ,
          <article-title>Easy to please: Separating user experience from choice satisfaction</article-title>
          ,
          <source>in: Proceedings of the 26th Conference on User Modeling, Adaptation and Personalization</source>
          , UMAP ’18, Association for Computing Machinery, New York, NY, USA,
          <year>2018</year>
          , pp.
          <fpage>177</fpage>
          -
          <lpage>185</lpage>
          . URL: https://doi.org/10.1145/3209219.3209222. doi:10.1145/3209219.3209222.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>J.</given-names>
            <surname>Ooge</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Stiglic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Verbert</surname>
          </string-name>
          ,
          <article-title>Explaining artificial intelligence with visual analytics in healthcare</article-title>
          ,
          <source>WIREs Data Mining and Knowledge Discovery</source>
          <volume>12</volume>
          (
          <year>2021</year>
          ). URL: https://wires.onlinelibrary.wiley.com/doi/abs/10.1002/widm.1427. doi:10.1002/widm.1427.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M.</given-names>
            <surname>Szymanski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Millecamp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Verbert</surname>
          </string-name>
          ,
          <article-title>Visual, textual or hybrid: The effect of user expertise on different explanations</article-title>
          ,
          <source>in: 26th International Conference on Intelligent User Interfaces</source>
          ,
          <source>IUI '21</source>
          , Association for Computing Machinery, New York, NY, USA,
          <year>2021</year>
          , pp.
          <fpage>109</fpage>
          -
          <lpage>119</lpage>
          . URL: https://doi.org/10.1145/3397481.3450662. doi:10.1145/3397481.3450662.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>D.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Abdul</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. Y.</given-names>
            <surname>Lim</surname>
          </string-name>
          ,
          <article-title>Designing Theory-Driven User-Centric Explainable AI</article-title>
          , Association for Computing Machinery, New York, NY, USA,
          <year>2019</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>15</lpage>
          . URL: https://doi.org/10.1145/3290605.3300831.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>R.</given-names>
            <surname>De Croon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Van Houdt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. N.</given-names>
            <surname>Htun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Štiglic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Vanden Abeele</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Verbert</surname>
          </string-name>
          ,
          <article-title>Health recommender systems: Systematic review</article-title>
          ,
          <source>Journal of Medical Internet Research</source>
          <volume>23</volume>
          (
          <year>2021</year>
          ). URL: https://pubmed.ncbi.nlm.nih.gov/34185014/. doi:10.2196/18035.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>K.</given-names>
            <surname>Balog</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Radlinski</surname>
          </string-name>
          ,
          <article-title>Measuring Recommendation Explanation Quality: The Conflicting Goals of Explanations</article-title>
          ,
          <source>in: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval</source>
          , SIGIR '20, Association for Computing Machinery, New York, NY, USA,
          <year>2020</year>
          , pp.
          <fpage>329</fpage>
          -
          <lpage>338</lpage>
          . URL: https://doi.org/10.1145/3397271.3401032. doi:10.1145/3397271.3401032.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>A.</given-names>
            <surname>Calero Valdez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ziefle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Verbert</surname>
          </string-name>
          ,
          <article-title>HCI for recommender systems: The past, the present and the future</article-title>
          ,
          <source>in: Proceedings of the 10th ACM Conference on Recommender Systems</source>
          , RecSys '16, Association for Computing Machinery, New York, NY, USA,
          <year>2016</year>
          , pp.
          <fpage>123</fpage>
          -
          <lpage>126</lpage>
          . URL: https://doi.org/10.1145/2959100.2959158. doi:10.1145/2959100.2959158.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>D. V.</given-names>
            <surname>Carvalho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. M.</given-names>
            <surname>Pereira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. S.</given-names>
            <surname>Cardoso</surname>
          </string-name>
          ,
          <article-title>Machine learning interpretability: A survey on methods and metrics</article-title>
          ,
          <source>Electronics</source>
          <volume>8</volume>
          (
          <year>2019</year>
          ). URL: https://www.mdpi.com/2079-9292/8/8/832.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>