<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Explaining health recommendations to lay users: The dos and don'ts</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Maxwell Szymanski</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vero Vanden Abeele</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Katrien Verbert</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, KU Leuven</institution>
          ,
          <addr-line>Leuven</addr-line>
          ,
          <country country="BE">Belgium</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In recent years, mobile health recommendations have been used in an increasing number of applications. Researchers have highlighted the importance of explaining these recommendations to lay users, with benefits such as increased trust and a higher tendency to follow up on these recommendations. However, a different explanation modality can impact the way users perceive the recommendation, either positively or negatively. This paper explores and evaluates six different explanation designs through a qualitative user study, and gives general design guidelines and considerations for explaining pain-related health recommendations to lay users.</p>
      </abstract>
      <kwd-group>
        <kwd>explainable AI</kwd>
        <kwd>explainable recommender systems</kwd>
        <kwd>explanation interpretation</kwd>
        <kwd>lay users</kwd>
        <kwd>health recommendations</kwd>
        <kwd>HRS</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1.2. Explaining health recommendations</title>
      <p>
        As highlighted earlier, adding explanations to
recommendations can improve their overall effectiveness. Explanations
make the system interpretable, which in turn can
improve trust towards the system [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. There exist HRS that
explain their rationale to the end user, such as the food
recommender system of Wayman et al. that explains why
certain recipes are recommended based on the user’s
nutritional intake [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], or a visualisation for medical experts
that is able to explain breast cancer similarities [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
However, the systematic review of De Croon et al. states that
only 10% of HRS that focus on lay users make use of
explanations. This makes HRS explanations for lay users
a novel but under-explored topic. Additionally, a study
by Bussone et al. points out that providing overly detailed
explanations for health recommenders can create
unforeseen effects, such as over-reliance on
explanations [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], which indicates that health recommender
explanations should be designed with sufficient care. This
makes designing explanations with non-expert users in
mind, and evaluating them with end users, paramount.
      </p>
    </sec>
    <sec id="sec-2">
      <title>1.3. End user expertise</title>
      <sec id="sec-2-1">
        <p>
          A growing body of research has pointed out that
the expertise of end users should be taken into account
when designing explanations. Ribera et al. [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] have
proposed three main categories of end users: non-experts
(lay users), domain experts (in our context medical
professionals or health coaches) and software and AI experts.
Each category of users comes with its own needs, goals
and limitations. AI expert users, for example, use XAI
to verify or improve the underlying AI system, whereas
domain experts can leverage explanations to gain
additional insights and learn from the system. Lay users have
their own set of goals, but more interestingly their own
array of limitations as well. Wang et al. have pointed out
several shortcomings in non-expert users related to
cognitive biases, such as confirmation and anchoring bias,
due to a backward-oriented, hypothesis-driven
reasoning process [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]. Tsai et al. also noticed a reinforcing
effect, where users avoid interacting with content they
are not familiar with [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]. Szymanski et al. additionally
pointed out that non-expert users, despite having these
biases and incorrectly interpreting certain complex
explanations, can still prefer them over other,
simpler explanation modalities [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ].
        </p>
        <p>
          Thus we see that interpretability through explanations
has multiple benefits and can result in an increased trust
towards the system. However, as previously mentioned,
the adoption of explanations in HRS is still low.
Furthermore, most health-related AI explanations are being
researched with AI and domain expert users in mind [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ],
which leaves a big gap for explanations w.r.t. lay users.
        </p>
      </sec>
      <sec id="sec-2-2">
        <p>Keeping in mind the aforementioned biases that lay users are prone to, it is therefore paramount to assess whether explanations are indeed interpretable, to make sure no misalignment in trust is created.</p>
        <p>With these considerations in mind, we investigate the
following research questions:</p>
      </sec>
      <sec id="sec-2-3">
        <p>RQ1 What explanation design do lay users prefer when
explaining health recommendations and why?</p>
        <p>RQ2 What design considerations are important when
explaining health recommendations to lay users?</p>
        <sec id="sec-2-3-1">
          <title>2. Explanation designs</title>
          <p>As mentioned in section 1.1, we will focus on designing
different explanations that explain why users are
receiving specific recommendations for their pain
flare-ups. Keeping the context and type of end users in mind,
the following design guidelines apply
to all variants of explanations:
• Mobile-friendly: as the explanations will be
offered within the context of the mobile health
app, the explanations have to be well-suited for
display on a small mobile screen.
• Summative: the explanations should possess the
ability to summarise categorical data, as input
consists of (semi-)unstructured user input.
• Suited for non-experts: as the end users are
non-experts, the explanations should not use any
advanced or statistical concepts to explain why
the recommendation is suggested.</p>
          <p>
            Keeping these criteria in mind, we came up with the
following designs in Figure 1 based on well-known and
widely used explanation types:
• Text-based: briefly explains why the
recommendation is related to the most prevalent input. The
wording is based on the "communicating
health-related news to patients" guidelines described by
[
            <xref ref-type="bibr" rid="ref16">16</xref>
            ], and these explanations were collaboratively
designed for the purpose of this study by six
ergo- and physiotherapists.
• Text-based + inline reply: an addition to the
textual explanation, where the inline reply shows
which specific user message contributed most to
the recommendation.
• Tags: tags are a common method of
communicating all topics that are relevant to a
recommendation (e.g. Bidargaddi et al. [
            <xref ref-type="bibr" rid="ref17">17</xref>
            ]).
• Word clouds: in addition to showing all relevant
topics, word clouds are also able to
communicate the relative importance/relevance of these
topics (e.g. [18, 19]).
• Feature importances (FI): feature importance
bars communicate contributing themes of the
user input, as well as their input relevance, albeit
in a more specific way compared to word clouds.
• Feature importances (FI) + percentages: adds
percentages to the FI bars to communicate exact
topic importances.
Figure 1 panels: (a) Purely textual, (b) Inline reply, (c) Tags, (d) Word cloud, (e) Feature importance, (f) Feature importance + %.
          </p>
        </sec>
      </sec>
      <sec id="sec-2-4">
        <p>These explanation designs are sorted from least to most
by the amount of information they convey regarding the
inputs relevant to the recommendation. The textual explanation
only focuses on one input, with the inline reply
being able to also show which specific input triggered
the recommendation, whereas the tags are able to display
all relevant input categories that are related to the
recommendation. The word cloud further builds on this
by also displaying the relative importance of each input
related to the recommendation, and the FI shows the exact
sorting of input according to importance. The added
percentages give the most transparency regarding the
inputs, by also displaying the exact values used by the
underlying RS.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>2.1. Participants</title>
      <p>For the user study, we recruited 11 participants out of a
pool of 286 people who were already using the mobile
health coaching application without the pain logbook
and its recommender system, as mentioned in section 1.1,
and thus knew and had interacted with the content and
different modules. The group consisted of nine women
and two men, of whom four finished graduate school, six
college, and one high school. Age-wise, 2 participants
were between 21-30, 5 between 31-40, 3 between 41-50
and 1 between 51-60. All 11 users reported using the
internet on a regular basis, with 6 participants stating
to be average computer and IT users, and 5 participants
stating to be advanced computer and IT users.</p>
    </sec>
    <sec id="sec-4">
      <title>2.2. Protocol of the evaluation study</title>
      <p>At the start of the study, users were briefed on the purpose
and context of the think-aloud study and gave their
consent to having the audio recorded, after which they
filled in the ResQue demographics questionnaire [20].
Afterwards, they were guided through the pain logbook,
which they had to fill in with a recent pain episode they
had experienced in mind. Having done so, they received
some information regarding the recommendations that
were going to be given, along with the explanations. We
briefly went over the six explanation designs in a fixed
order, after which we asked the participants to “explain
what they like or dislike about the explanation” separately
for each design once they had seen them all. To
conclude this preference elicitation, the users had to sort
the explanations by preference, with 1 being their most
preferred one and 6 their least preferred. They also had
to give (or repeat) a key reason as to why they gave
each explanation a certain ranking. The audio recordings
of both the preference elicitation and the ranking were used
afterwards for a thematic analysis.</p>
    </sec>
    <sec id="sec-6">
      <title>2.3. Data analysis</title>
      <p>The thematic analysis was done in two phases, with the
first phase consisting of deriving granular themes from
the thematic analysis with two researchers, and the second
phase focusing on merging them into higher-level
themes with a third researcher. The resulting higher-level
themes are displayed in Figure 3, along with the
frequencies with which they occur per explanation design.
The agreement percentage of the first-phase two-coder
thematic analysis is 88.1%, with Cohen’s kappa being
κ = 0.66, resulting in a substantial inter-coder agreement [21].</p>
    </sec>
    <sec id="sec-6-1">
      <title>3. Results</title>
      <p>Taking the average ranking scores of all explanation designs,
we are now able to rank the six explanation modalities
from best to worst ranked, along with the results
from the thematic analysis to explain why each explanation
type scored poorly or adequately. Figure 2 shows
the frequencies of the rankings given to each explanation
design.</p>
    </sec>
    <sec id="sec-7">
      <title>3.1. Feature importance + percentage</title>
      <p>Rank: 1 (best) · This explanation type was favored by
most users, mainly due to the fact that it provided the
most insight and transparency (n = 10). Only three out
of 11 people found the addition of the percentages to
feature importance bars to be inefficacious.</p>
      <p>Insights through XAI (+)</p>
      <p>Six users liked the fact that they were able to gain more
insight through this explanation modality. Four users
also stated that the percentages were a “nice-to-know”,
making the explanation more useful and informative.</p>
      <p>Negative sentiment towards XAI (-)</p>
      <p>On the flip side, two users disliked the addition of
displaying percentages, stating that when it comes to emotions
and feelings, certain aspects are not quantifiable. U4
stated: “Personally I think feelings are not quantifiable.
The bars are good, but don’t put an exact number on it. It’s
okay if you’re communicating frequencies, like how often
an emotion occurred for example.”</p>
      <p>Visual/information overload (-)</p>
      <p>Two users also stated that the addition of percentages is
unnecessary, mentioning that only using bars to
communicate importances is sufficient.</p>
    </sec>
    <sec id="sec-5">
      <title>3.2. Feature importance</title>
      <p>Rank: 2 · The feature importance explanation was
among the most preferred explanations, liked for the
fact that it was able to give a summary of the user input
(n = 11), as well as being able to give additional insights
(n = 2).</p>
      <p>Provides summary (+)</p>
      <p>Six users found the feature importance bars to be a clear
way of communicating input topics and their importance.
Four users stated that it gives them a nice overview of
their input.</p>
      <p>Insights through XAI (+)</p>
      <p>Two users specifically liked the additional insights that
they were able to get from the feature importances. U4
mentioned: “There are of course no numbers given, but
I can assume that I am really frustrated, and a bit less
angry. I find it interesting to reflect on results that come
out of a questionnaire.”</p>
      <p>Negative sentiment towards XAI (-)</p>
      <p>Three users were unsure of the ranking of some topics,
stating that they agreed with the general content, but not
as to why one topic was deemed more important than
others. This caused these users to slightly dislike and
distrust the system, and give it a lower ranking.</p>
      <p>Two users found the bars to be unnecessary, giving
them information as to what contributed towards the
recommendation, but not why, like the textual explanation
did. U6 stated: “There is not a lot of background given. It
shows that these inputs contributed to my recommendation,
but not why.”</p>
    </sec>
    <sec>
      <title>3.3. Tags</title>
      <p>Rank: 3 · Tags scored relatively better than the previous
three explanations in terms of average ranking, and were
liked for their summative ability (n = 8). Only people
who disliked having a lot of information were less in
favor of the tag explanation (n = 2).</p>
      <p>Provides summary (+)</p>
      <p>Four users found using tags to be a nice way of providing
a summary of their input. Four users also stated that
doing so in such a way is a clear and concise method of
explaining why the recommendation is given.</p>
      <p>Three users were fond of the additional insights they
got from the tags and the general themes that were
present in their input. U3 stated: “When inputting my
feelings I did not necessarily perceive them as negative or
angry. But based on these tags, I’m able to see: okay, this
is how the app interprets my feelings.”</p>
      <p>Visual/information overload (-)</p>
      <p>Only two users stated that tags were unnecessary or
provided too much information. U6 stated: “Yes it’s clear,
but less practical. I tend to focus on one thing at a time.”</p>
    </sec>
    <sec id="sec-8">
      <title>3.4. Purely textual</title>
      <p>Rank: 4 · Purely textual explanations received mixed
reactions during the think-aloud study. When users liked
or agreed with the recommendation, the textual explanation
was a welcome addition helping them understand the
recommendation process and the recommendation itself,
and gave users a nice summary of why the
recommendation matched their inputs (n = 8). However, when the
recommendation wasn’t in line with the user’s
expectations, the textual explanation highlighted the mismatch
even more and caused a poor reception of the recommender
system in general (n = 5). Here is an overview
of these topics:</p>
      <p>Provides summary (+)</p>
      <p>Six users found that the textual explanation was able to
summarize their input quite well, albeit only focusing
on one topic (the most relevant one) surrounding the
recommendation.</p>
      <p>Positive sentiment towards explanation (+)</p>
      <p>Two users stated that the written explanation was
confirming and comforting. One user also stated that the
wording of the textual explanation felt less confronting
regarding their negative input.</p>
      <p>Negative sentiment towards explanation (-)</p>
      <p>On the other hand, three users mentioned that they
cannot relate to the recommendation, and that the textual
explanation highlighted this fact. U4 also found the
explanation to be provoking, stating the following: “I
know that I’m frustrated and that it does not help. However,
explaining that acts like waving a red flag in front of a
bull.”</p>
    </sec>
    <sec>
      <title>3.5. Inline-reply</title>
      <p>Rank: 5 · During the think-aloud study, the inline reply
received relatively positive feedback and comments
regarding the succinct summary it gave of the users’ input
(n = 7), with only some minor remarks regarding the
presentation of the explanation (n = 3). However, it
scored quite low during the preference ranking itself due
to other explanation modalities simply being preferred
over the inline-reply.</p>
      <p>Provides summary (+)</p>
      <p>Six users found the explanation modality to be clear and
more concrete, and one user additionally stated that
showing which message triggered the recommendation
requires less analysis from the user.</p>
      <p>Insights through explanation (+)</p>
      <p>Three users liked the fact that the inline-reply raises
awareness of the fact that the recommendation is related
to one of their own inputs. U3 stated: “I find it better than
the textual explanation. There, they state ’You seem to be
frustrated’, and here you really are made aware of the fact
that it’s your own input.”</p>
      <p>Problem with representation (-)</p>
      <p>Only some minor and infrequent negative remarks were
given surrounding inline replies. Three users disliked
the fact that by highlighting or repeating their negative
input, they are more confronted with it. One user
additionally mentioned that this explanation feels like the
recommendation is only tuned to one input instead of
multiple user inputs, making it feel too specific.</p>
    </sec>
    <sec id="sec-9">
      <title>3.6. Word cloud</title>
      <p>Rank: 6 (last) · The word cloud received the lowest
average score. In general, users like the addition of
displaying keyword or topic importance; however, using a word
cloud to do so proves to be an inferior solution. The
thematic analysis points out two main negative themes as to
why this explanation is disliked: problems with
representation and content (n = 9) and visual/information
overload (n = 4), and one positive theme, insights through
explanation (n = 4).</p>
      <p>Problems with representation (-)</p>
      <p>Three users pointed out that having keyword size
communicate importance was unclear, and would rather have
something concrete like bars indicating exact relevance.
Three users also pointed out that the inconsistent sizes
inherent to the design of word clouds were visually
displeasing. Two users additionally stated that highlighting
important keywords might be too confronting with
respect to their own input, e.g. if a user inputs that they
are feeling sad, having it displayed as a large word might
confront the user too much with their state of mind.</p>
      <p>Visual/information overload (-)</p>
      <p>Three users found the addition of displaying relevance
in such a way unnecessary, one of whom additionally
stated that adding the information in such a way is too
distracting.</p>
      <p>Insights through explanation (+)</p>
      <p>Four users stated, however, that adding this information
of keyword relevance gives more insight due to not
only showing the relevant topics, but their importance
as well.</p>
    </sec>
    <sec id="sec-9-4-1">
      <title>4. Discussion</title>
      <p>We will now discuss some of the most prevalent observations that were present in several explanation designs, as well as suggest guidelines on how to design health explanations for lay users experiencing (chronic) pain.</p>
    </sec>
    <sec id="sec-10">
      <title>4.1. Beware of confronting people with negative sentiments</title>
      <p>People experiencing (chronic) pain or illness can feel
distress when receiving negative information surrounding
their state. In our study, we noticed that highlighting
keywords that are potentially negative (e.g. negative
emotions, reactions, etc.), can cause distress with users
and therefore make them dislike the explanation. This
was apparent with the inline reply and word cloud
explanations, where visually highlighting negative sentiments
that relate to the recommendation caused users to dislike
the explanation.</p>
    </sec>
    <sec id="sec-11">
      <title>4.2. Use tags or feature importance when control is needed</title>
      <p>Because tags and FI/FI+% are able to
display multiple input categories, users expressed
that these designs would give them more control over the
recommendation process, if the design or implementation
allows for it. One user suggested that tapping certain
topics could be useful to request recommendations in a
more user-controlled way. Other users made related
suggestions. U9: “It’s nice if you can individually remove certain
topics”, and U7: “... especially if you notice something that
wasn’t interpreted the way you intended it”.</p>
    </sec>
    <sec>
      <title>4.3. Design FI through a lay user’s perspective</title>
      <p>The FI and FI+% designs were favored by most users,
giving most users the insight and summary they needed.
However, as mentioned in section 3.2, U4 interpreted the
FI bars as “... I can assume that I am really frustrated,
and a bit less angry”, indicating that they saw it as an
overview of their input, and not how strongly their input
relates to the recommendation. In total, 10 out of 11 lay
users interpreted FI differently than intended. Only U4
was able to correctly interpret the bars (after reading the
text above the FI bars - “This is how your inputs relate
to the recommendation”), saying “The frustrated bar is
the biggest, okay, so that contributes most to my
recommendation”. Having a wrong interpretation could lead to
confusion towards the system when, for example, a next
recommendation is shown, and the input keywords and
their relevance change with respect to this new
recommendation. However, overcoming biases and changing
mental models of lay users often proves to be difficult.
A possible adaptation of the FI and FI+% designs
may show a general overview/summary of the user
input, in line with what users were interpreting, and
then highlight the keywords that are relevant to the
recommendation that is being shown. This can be seen in</p>
    </sec>
    <sec id="sec-12">
      <title>4.4. Insight vs. information overload</title>
      <p>
        Users generally liked the holistic approach of the feature
importances, and were more inclined to look into the
recommendation itself. When asked why they liked the
recommendations more when explained using FI
compared to the purely textual explanation, they stated that
the FI were able to show them a general overview of them
as a person.
On the other hand, there were also some users who
disagreed with the ordering of keyword importances that
the feature importance bars were displaying, causing
a slight increase in distrust towards the recommender
system and a lower ranking of the explanation. This is to be
expected, as increasing transparency of explanations can
cause a higher drop in trust towards the system if the
content of the explanation or recommendation does not
align with the user’s expectation. However, the effect
of a misaligned textual explanation is still stronger, as
users who did not agree with either the recommendation
or the explanation expressed a more negative sentiment
towards the recommendation, and gave the textual
recommendation a lower ranking. This is in line with similar
research by Balog et al. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], in which they state that
misaligned recommendations that focus on a single topic or
item are more susceptible to a lower perceived quality of
explanation compared to multi-item recommendations.
      </p>
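      <p>The design adaptation suggested in section 4.3 (first summarise all user input, then mark the topics that relate to the recommendation being shown) could be modelled roughly as follows. This is only a minimal sketch under our own assumptions: the names OverviewRow and build_overview, the use of input counts as the summary measure, and the example values are all hypothetical and are not taken from the app or the study.</p>
      <preformat>
```python
from dataclasses import dataclass


@dataclass
class OverviewRow:
    """One bar in the hypothetical adapted FI view."""
    topic: str
    share: float        # share of all logged inputs, between 0 and 1
    highlighted: bool   # True if the topic relates to the shown recommendation


def build_overview(input_counts, relevant_topics):
    """Summarise all logged inputs, then flag the recommendation-relevant ones.

    input_counts: dict mapping a topic to how often the user logged it.
    relevant_topics: set of topics tied to the current recommendation.
    """
    total = sum(input_counts.values())
    if total == 0:
        return []
    # Largest share first, ties broken alphabetically for a stable order.
    ordered = sorted(input_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [
        OverviewRow(topic, count / total, topic in relevant_topics)
        for topic, count in ordered
    ]


# Made-up example values, not data from the study.
rows = build_overview({"frustrated": 3, "angry": 1, "tired": 1}, {"frustrated"})
for row in rows:
    marker = "*" if row.highlighted else " "
    print(f"{marker} {row.topic}: {row.share:.0%}")
```
      </preformat>
      <p>Keeping the overview (share of input) separate from the highlight flag mirrors the two readings observed in the study: the bars can be read as a summary of the input, while the highlight alone carries the link to the recommendation.</p>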
      <sec id="sec-12-1">
        <title>5. Conclusion</title>
        <p>This paper introduced several explanation designs for
mobile pain-related health recommendations, and
compared them among lay users. Most users preferred the
added transparency that was provided by the tags and FI
/ FI+% designs, stating that it gave them a brief and clear
overview of their input which helped them understand
why they received certain recommendations. Another
interesting aspect is the fact that designs should be careful
with visually highlighting negative sentiments of users.
Designs that did so, i.e. the inline-reply and word cloud,
were received poorly by users. Lastly, we confirmed that
lay users might interpret certain visual explanations
differently than intended, yet still prefer them over others.
Given their feedback, we presented an adapted design
of the favoured FI / FI+% explanation to be in line with
what lay users expect.</p>
      </sec>
      <sec id="sec-12-2">
        <title>6. Limitations &amp; Future work</title>
        <p>The qualitative aspect of this study was already able
to point out several key aspects related to designing
health explanations for patients experiencing chronic
pain. However, a larger scale quantitative user study is
needed to further investigate these results. One such
aspect is the fact that some users preferred textual
explanations over explanations that offered more information.
Investigating whether this correlates to the user’s need
for cognition (NFC), and what its implications are, can
prove to be an interesting research direction similar to the
research of Millecamp et al. [22]. Another aspect is the
fact that while most users disliked being confronted with
their negative input, some did not mind. This could be
related to the "warriors vs. worriers" research, in which
some users experiencing chronic pain actually prefer
being exposed to negative feedback so they could address it,
and could prove useful for further research [23]. Future
research should also consider other designs to explain
health recommendations and elaborate design guidelines
that can be used by researchers and practitioners in this
exciting domain. In addition, an interesting further line
of research is to personalise these explanations
on-the-fly, based on interaction data of end users. As in the work
of [24], clicks and hover interactions as well as eye gaze
data can be considered for such personalisation.</p>
      </sec>
      <sec id="sec-12-3">
        <title>Acknowledgments</title>
        <p>This work is part of the research projects Personal Health Empowerment (PHE) with project number HBC.2018.2012, financed by Flanders Innovation &amp; Entrepreneurship, and IMPERIUM with project number G0A3319N, financed by Research Foundation Flanders (FWO).</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>R.</given-names>
            <surname>De Croon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Van Houdt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. N.</given-names>
            <surname>Htun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Štiglic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Vanden Abeele</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Verbert</surname>
          </string-name>
          ,
          <article-title>Health recommender systems: Systematic review</article-title>
          ,
          <source>J Med Internet Res</source>
          <volume>23</volume>
          (
          <year>2021</year>
          )
          <elocation-id>e18035</elocation-id>
          . URL: https://www.jmir.org/2021/6/e18035. doi:10.2196/18035.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>F.</given-names>
            <surname>Torrent-Fontbona</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Lopez</surname>
          </string-name>
          ,
          <article-title>Personalized adaptive cbr bolus recommender system for type 1 diabetes</article-title>
          ,
          <source>IEEE Journal of Biomedical and Health Informatics</source>
          <volume>23</volume>
          (
          <year>2019</year>
          )
          <fpage>387</fpage>
          -
          <lpage>394</lpage>
          . doi:10.1109/JBHI.2018.2813424.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>R.</given-names>
            <surname>Gouveia</surname>
          </string-name>
          , E. Karapanos,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hassenzahl</surname>
          </string-name>
          ,
          <article-title>How do we engage with activity trackers? A longitudinal study of Habito</article-title>
          ,
          <source>UbiComp 2015 - Proceedings of the 2015 ACM International Joint Conference on Pervasive and Ubiquitous Computing</source>
          (
          <year>2015</year>
          )
          <fpage>1305</fpage>
          -
          <lpage>1316</lpage>
          . doi:10.1145/2750858.2804290.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>K.</given-names>
            <surname>Cheung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Ling</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. J.</given-names>
            <surname>Karr</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Weingardt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Schueller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. C.</given-names>
            <surname>Mohr</surname>
          </string-name>
          ,
          <article-title>Evaluation of a recommender app for apps for the treatment of depression and anxiety: An analysis of longitudinal user engagement</article-title>
          ,
          <source>Journal of the American Medical Informatics Association</source>
          <volume>25</volume>
          (
          <year>2018</year>
          )
          <fpage>955</fpage>
          -
          <lpage>962</lpage>
          . doi:10.1093/jamia/ocy023.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>K.</given-names>
            <surname>Balog</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Radlinski</surname>
          </string-name>
          ,
          <article-title>Measuring Recommendation Explanation Quality: The Conflicting Goals of Explanations</article-title>
          ,
          in:
          <source>Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval</source>
          , SIGIR '20, Association for Computing Machinery, New York, NY, USA,
          <year>2020</year>
          , pp.
          <fpage>329</fpage>
          -
          <lpage>338</lpage>
          . URL: https://doi.org/10.1145/3397271.3401032. doi:10.1145/3397271.3401032.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>A.</given-names>
            <surname>Calero Valdez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ziefle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Verbert</surname>
          </string-name>
          ,
          <article-title>HCI for recommender systems: The past, the present and the future</article-title>
          , in:
          <source>Proceedings of the 10th ACM Conference on Recommender Systems</source>
          , RecSys '16, Association for Computing Machinery, New York, NY, USA,
          <year>2016</year>
          , pp.
          <fpage>123</fpage>
          -
          <lpage>126</lpage>
          . URL: https://doi.org/10.1145/2959100.2959158. doi:10.1145/2959100.2959158.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>D. V.</given-names>
            <surname>Carvalho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. M.</given-names>
            <surname>Pereira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. S.</given-names>
            <surname>Cardoso</surname>
          </string-name>
          ,
          <article-title>Machine learning interpretability: A survey on methods and metrics</article-title>
          ,
          <source>Electronics</source>
          <volume>8</volume>
          (
          <year>2019</year>
          ). URL: https://www.mdpi.com/2079-9292/8/8/832. doi:10.3390/electronics8080832.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>E.</given-names>
            <surname>Wayman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Madhvanath</surname>
          </string-name>
          ,
          <article-title>Nudging Grocery Shoppers to Make Healthier Choices</article-title>
          , in:
          <source>Proceedings of the Ninth Conference on Recommender Systems</source>
          , ACM,
          <year>2015</year>
          , pp.
          <fpage>289</fpage>
          -
          <lpage>292</lpage>
          . doi:10.1145/2792838.2799669.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>J.-B.</given-names>
            <surname>Lamy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Sekar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Guezennec</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bouaud</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Séroussi</surname>
          </string-name>
          ,
          <article-title>Explainable artificial intelligence for breast cancer: A visual case-based reasoning approach</article-title>
          ,
          <source>Artificial Intelligence in Medicine</source>
          <volume>94</volume>
          (
          <year>2019</year>
          )
          <fpage>42</fpage>
          -
          <lpage>53</lpage>
          . URL: https://www.sciencedirect.com/science/article/pii/S0933365718304846. doi:10.1016/j.artmed.2019.01.001.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>A.</given-names>
            <surname>Bussone</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Stumpf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. M.</given-names>
            <surname>O'Sullivan</surname>
          </string-name>
          ,
          <article-title>The role of explanations on trust and reliance in clinical decision support systems</article-title>
          ,
          <source>2015 International Conference on Healthcare Informatics</source>
          (
          <year>2015</year>
          )
          <fpage>160</fpage>
          -
          <lpage>169</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M.</given-names>
            <surname>Ribera</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Lapedriza</surname>
          </string-name>
          ,
          <article-title>Can we do better explanations? A proposal of user-centered explainable AI</article-title>
          ,
          <source>CEUR Workshop Proceedings</source>
          <volume>2327</volume>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>D.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Abdul</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. Y.</given-names>
            <surname>Lim</surname>
          </string-name>
          ,
          <article-title>Designing Theory-Driven User-Centric Explainable AI</article-title>
          , Association for Computing Machinery, New York, NY, USA,
          <year>2019</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>15</lpage>
          . URL: https://doi.org/10.1145/3290605.3300831.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>C.-H.</given-names>
            <surname>Tsai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Brusilovsky</surname>
          </string-name>
          ,
          <article-title>Beyond the ranked list: User-driven exploration and diversification of social recommendation</article-title>
          , in:
          <source>23rd International Conference on Intelligent User Interfaces</source>
          , IUI '18, Association for Computing Machinery, New York, NY, USA,
          <year>2018</year>
          , pp.
          <fpage>239</fpage>
          -
          <lpage>250</lpage>
          . URL: https://doi.org/10.1145/3172944.3172959. doi:10.1145/3172944.3172959.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>M.</given-names>
            <surname>Szymanski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Millecamp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Verbert</surname>
          </string-name>
          ,
          <article-title>Visual, textual or hybrid: The effect of user expertise on different explanations</article-title>
          , in:
          <source>26th International Conference on Intelligent User Interfaces</source>
          , IUI '21, Association for Computing Machinery, New York, NY, USA,
          <year>2021</year>
          , pp.
          <fpage>109</fpage>
          -
          <lpage>119</lpage>
          . URL: https://doi.org/10.1145/3397481.3450662. doi:10.1145/3397481.3450662.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>J.</given-names>
            <surname>Ooge</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Stiglic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Verbert</surname>
          </string-name>
          ,
          <article-title>Explaining artificial intelligence with visual analytics in healthcare</article-title>
          ,
          <source>WIREs Data Mining and Knowledge Discovery</source>
          <volume>12</volume>
          (
          <year>2021</year>
          ). URL: https://wires.onlinelibrary.wiley.com/doi/abs/10.1002/widm.1427. doi:10.1002/widm.1427.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>M.</given-names>
            <surname>Schmid Mast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kindlimann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Langewitz</surname>
          </string-name>
          ,
          <article-title>Recipients' perspective on breaking bad news: How you put it really makes a difference</article-title>
          ,
          <source>Patient Education and Counseling</source>
          <volume>58</volume>
          (
          <year>2005</year>
          )
          <fpage>244</fpage>
          -
          <lpage>251</lpage>
          . URL: https://www.sciencedirect.com/science/article/pii/S0738399105001473. doi:10.1016/j.pec.2005.05.005. Medical Education and Training in Communication.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>N.</given-names>
            <surname>Bidargaddi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Musiat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Winsall</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Vogl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Blake</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Quinn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Orlowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Antezana</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Schrader</surname>
          </string-name>
          ,
          <article-title>Efficacy of a web-based guided recommendation service for a curated list of readily available mental health and well-being mobile apps for young people: Randomized controlled trial</article-title>
          ,
          <source>Journal of Medical Internet Research</source>
          <volume>19</volume>
          (
          <year>2017</year>
          ). doi:10.2196/jmir.6775.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ester</surname>
          </string-name>
          ,
          <article-title>Flame: A probabilistic model combining aspect based opinion mining and collaborative filtering</article-title>
          , in:
          <source>Proceedings of the Eighth ACM International Conference on Web Search and Data Mining</source>
          , WSDM '15, Association for Computing Machinery, New York, NY, USA,
          <year>2015</year>
          , pp.
          <fpage>199</fpage>
          -
          <lpage>208</lpage>
          . URL: https://doi.org/10.1145/2684822.2685291. doi:10.1145/2684822.2685291.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>C.-H.</given-names>
            <surname>Tsai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Brusilovsky</surname>
          </string-name>
          ,
          <article-title>Evaluating Visual Explanations for Similarity-Based Recommendations: User Perception and Performance</article-title>
          , in:
          <source>Proceedings of the 27th ACM Conference on User Modeling, Adaptation and Personalization</source>
          , UMAP '19, Association for Computing Machinery, New York, NY, USA,
          <year>2019</year>
          , pp.
          <fpage>22</fpage>
          -
          <lpage>30</lpage>
          . URL: https://doi.org/10.1145/3320435.3320465. doi:10.1145/3320435.3320465.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>P.</given-names>
            <surname>Pu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <article-title>A user-centric evaluation framework for recommender systems</article-title>
          , in:
          <source>Proceedings of the Fifth ACM Conference on Recommender Systems</source>
          , RecSys '11, Association for Computing Machinery, New York, NY, USA,
          <year>2011</year>
          , pp.
          <fpage>157</fpage>
          -
          <lpage>164</lpage>
          . URL: https://doi.org/10.1145/2043932.2043962. doi:10.1145/2043932.2043962.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>N. J.-M.</given-names>
            <surname>Blackman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. J.</given-names>
            <surname>Koval</surname>
          </string-name>
          ,
          <article-title>Interval estimation for Cohen's kappa as a measure of agreement</article-title>
          ,
          <source>Statistics in Medicine</source>
          <volume>19</volume>
          (
          <year>2000</year>
          )
          <fpage>723</fpage>
          -
          <lpage>741</lpage>
          . doi:10.1002/(SICI)1097-0258(20000315)19:5&lt;723::AID-SIM379&gt;3.0.CO;2-A.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>M.</given-names>
            <surname>Millecamp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. N.</given-names>
            <surname>Htun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Conati</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Verbert</surname>
          </string-name>
          ,
          <article-title>To explain or not to explain: The effects of personal characteristics when explaining music recommendations</article-title>
          , in:
          <source>Proceedings of the 24th International Conference on Intelligent User Interfaces</source>
          , IUI '19, Association for Computing Machinery, New York, NY, USA,
          <year>2019</year>
          , pp.
          <fpage>397</fpage>
          -
          <lpage>407</lpage>
          . URL: https://doi.org/10.1145/3301275.3302313. doi:10.1145/3301275.3302313.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>J.</given-names>
            <surname>Geuens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Swinnen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Geurts</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Westhovens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>De Croon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Vanden Abeele</surname>
          </string-name>
          ,
          <article-title>Worriers versus warriors: Tailoring mHealth to address differences in patients with chronic arthritis</article-title>
          , in:
          <source>2020 IEEE International Conference on Healthcare Informatics (ICHI)</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>12</lpage>
          . doi:10.1109/ICHI48887.2020.9374322.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>M.</given-names>
            <surname>Millecamp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Willemot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Verbert</surname>
          </string-name>
          ,
          <article-title>Your eyes explain everything: exploring the use of eye tracking to provide explanations on-the-fly</article-title>
          , in:
          <source>Proceedings of the 8th Joint Workshop on Interfaces and Human Decision Making for Recommender Systems co-located with 15th ACM Conference on Recommender Systems (RecSys 2021)</source>
          , volume
          <volume>2948</volume>
          , CEUR Workshop Proceedings,
          <year>2021</year>
          , pp.
          <fpage>89</fpage>
          -
          <lpage>100</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>