
Explaining health recommendations to lay users: The dos and don'ts

Maxwell Szymanski

Vero Vanden Abeele

Katrien Verbert

1. Introduction

Related Work

Department of Computer Science, KU Leuven, Leuven, Belgium

In recent years, mobile health recommendations have been used in an increasing number of applications. Researchers have highlighted the importance of explaining these recommendations to lay users, with benefits such as increased trust and a higher tendency to follow up on these recommendations. However, the choice of explanation modality can affect the way users perceive the recommendation, either positively or negatively. This paper explores and evaluates six different explanation designs through a qualitative user study, and gives general design guidelines and considerations for explaining pain-related health recommendations to lay users.

Keywords: explainable AI, explainable recommender systems, explanation interpretation, lay users, health recommendations, HRS

1.2. Explaining health recommendations

As highlighted earlier, adding explanations to recommendations can improve their overall effectiveness. Explanations make the system interpretable, which in turn can improve trust towards the system [7]. There exist HRS that explain their rationale to the end user, such as the food recommender system of Wayman et al., which explains why certain recipes are recommended based on the user’s nutritional intake [8], or a visualisation for medical experts that is able to explain breast cancer similarities [9]. However, the systematic review of De Croon et al. states that only 10% of HRS that focus on lay users make use of explanations. This makes HRS explanations for lay users a novel but under-explored topic. Additionally, a study by Bussone et al. points out that providing overly detailed explanations for health recommenders can create unforeseen effects, such as over-reliance on explanations [10], which indicates that health recommender explanations should be designed with sufficient care. This makes designing explanations with non-expert users in mind, and evaluating them with end users, paramount.

1.3. End user expertise

An increasing amount of research has pointed out that the expertise of end users should be taken into account when designing explanations. Ribera et al. [11] have proposed three main categories of end users: non-experts (lay users), domain experts (in our context, medical professionals or health coaches) and software and AI experts. Each category of users comes with its own needs, goals and limitations. AI expert users, for example, use XAI to verify or improve the underlying AI system, whereas domain experts can leverage explanations to gain additional insights and learn from the system. Lay users have their own set of goals but, more interestingly, their own array of limitations as well. Wang et al. have pointed out several shortcomings in non-expert users related to cognitive biases, such as confirmation and anchoring bias, due to a backward-oriented, hypothesis-driven reasoning process [12]. Tsai et al. also noticed a reinforcing effect, where users avoid interacting with content they are not familiar with [13]. Szymanski et al. additionally pointed out that non-expert users, despite having these biases and incorrectly interpreting certain complex explanations, can still prefer them over other, simpler explanation modalities [14].

Thus we see that interpretability through explanations has multiple benefits and can result in increased trust towards the system. However, as previously mentioned, the adoption of explanations in HRS is still low. Furthermore, most health-related AI explanations are being researched with AI and domain expert users in mind [15], which leaves a big gap for explanations aimed at lay users.

Given the aforementioned biases that lay users are prone to, it is paramount to assess whether explanations are indeed interpretable, to make sure no misalignment in trust is created.

With these considerations in mind, we investigate the following research questions:

RQ1 What explanation design do lay users prefer when explaining health recommendations and why?

RQ2 What design considerations are substantial when explaining health recommendations to lay users?

2. Explanation designs

As mentioned in section 1.1, we will focus on designing different explanations that explain why users are receiving specific recommendations for their pain flare-ups. Given the context and type of end users, the following design guidelines have to be kept in mind for all variants of explanations:

• Mobile-friendly: as the explanations will be offered within the context of the mobile health app, the explanations have to be well-suited for display on a small mobile screen.

• Summative: the explanations should possess the ability to summarise categorical data, as input consists of (semi-)unstructured user input.

• Suited for non-experts: as the end users are non-experts, the explanations should not use any advanced or statistical concepts to explain why the recommendation is suggested.

Keeping these criteria in mind, we came up with the following designs, shown in Figure 1, based on well-known and widely used explanation types:

• Text-based: briefly explain why the recommendation is related to the most prevalent input. The wording is based on the "communicating health-related news to patients" guidelines described by [16], and these explanations were collaboratively designed for the purpose of this study by six ergo- and physiotherapists.

• Text-based + inline reply: an addition to the textual explanation, where the inline reply shows which specific user message most contributed to the recommendation.

• Tags: tags are a common method of communicating all topics that are relevant to a recommendation (e.g. Bidargaddi et al. [17]).

• Word clouds: in addition to showing all relevant topics, word clouds are able to communicate the relative importance/relevance of these topics (e.g. [18, 19]).

• Feature importances (FI): feature importance bars communicate contributing themes of the user input, as well as their input relevance, albeit in a more specific way compared to word clouds.

• Feature importances (FI) + percentages: adds percentages to the FI bars to communicate exact topic importances.

Figure 1 panels: (a) Purely textual, (b) Inline reply, (c) Tags, (d) Word cloud, (e) Feature importance, (f) Feature importance + %.
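To make the difference between the last two designs concrete, the following is a minimal sketch of how topic relevance scores could be turned into FI bars with optional percentages. The topic names, the scores and the helpers `to_percentages` and `render_fi` are hypothetical illustrations. The paper does not describe how the underlying RS computes or normalises these values, so normalising to a sum of 100 is an assumption.

```python
def to_percentages(relevance):
    """Normalise non-negative relevance scores so that they sum to 100."""
    total = sum(relevance.values())
    if total <= 0:
        raise ValueError("relevance scores must sum to a positive value")
    return {topic: 100.0 * score / total for topic, score in relevance.items()}


def render_fi(relevance, show_percentages=False, width=20):
    """Return one text bar per topic, most relevant topic first."""
    shares = to_percentages(relevance)
    ordered = sorted(shares.items(), key=lambda item: item[1], reverse=True)
    top_share = ordered[0][1]
    lines = []
    for topic, share in ordered:
        # Bar length is relative to the most relevant topic.
        bar = "#" * max(1, round(width * share / top_share))
        label = f" {share:.0f}%" if show_percentages else ""
        lines.append(f"{topic:<12}{bar}{label}")
    return lines


# Hypothetical relevance scores for one recommendation (assumed values).
scores = {"frustrated": 5, "angry": 3, "tired": 2}

for line in render_fi(scores, show_percentages=True):
    print(line)
```

Calling `render_fi(scores)` without `show_percentages` gives the plain FI variant: the ordering and bar lengths stay the same, and only the exact values are withheld.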

These explanation designs are sorted from least to most by the amount of information they convey regarding the inputs relevant to the recommendation. The textual explanation only focuses on one input, with the inline reply being able to also show which specific input triggered the recommendation, whereas the tags are able to display all relevant input categories that are related to the recommendation. The word cloud further builds on this by also displaying the relative importance of each input related to the recommendation, and the FI shows the exact sorting of input according to importance. The added percentages give the most transparency regarding the inputs, by also displaying the exact values used by the underlying RS.

2.1. Participants

For the user study, we recruited 11 participants out of a pool of 286 people who were already using the mobile health coaching application without the pain logbook and its recommender system, as mentioned in section 1.1, and who thus knew and had interacted with the content and different modules. The group consisted of nine women and two men, of which four finished graduate school, six college, and one high school. Age-wise, 2 participants were between 21-30, 5 between 31-40, 3 between 41-50 and 1 between 51-60. All 11 users reported using the internet on a regular basis, with 6 participants stating to be average computer and IT users, and 5 participants stating to be advanced computer and IT users.

2.2. Protocol of the evaluation study

At the start of the study, users were briefed on the purpose and context of the think-aloud study, and gave their consent to having the audio recorded, after which they filled in the ResQue demographics questionnaire [20]. Afterwards, they were guided through the pain logbook, which they had to fill in with a recent pain episode they experienced in mind. Having done so, they received some information regarding the recommendations that were going to be given, along with the explanations. We briefly went over the six explanation designs in a fixed order, after which we asked the participant to “explain what they like or dislike about the explanation” separately for each design once they had seen them all. To conclude this preference elicitation, the users had to sort the explanations by preference, with 1 being their most preferred one, and 6 their least preferred. They also had to give (or repeat) a key reason as to why they were giving each explanation a certain ranking. The audio recordings of both the preference elicitation and ranking were used afterwards for a thematic analysis.

2.3. Data analysis

The thematic analysis was done in two phases, with the first phase consisting of deriving granular themes from the thematic analysis with two researchers, and the second phase focusing on merging them to higher level themes with a third researcher. The resulting higher level themes are displayed in Figure 3, along with the frequencies in which they occur per explanation design. The agreement percentage of the first phase two-coder thematic analysis is 88.1%, with Cohen’s kappa being κ = 0.66, resulting in a substantial inter-coder agreement [21].

3. Results

Taking the average ranking scores of all explanation designs, we are now able to rank the 6 explanation modalities from best to worst ranked, along with the results from the thematic analysis to explain why each explanation type scored poorly or adequately. Figure 2 shows the frequencies of the rankings given to each explanation design.

3.1. Feature importance + percentage

Rank: 1 (best) · This explanation type was favored by most users, mainly due to the fact that it provided the most insight and transparency (n = 10). Only three out of 11 people found the addition of the percentages to feature importance bars to be inefficacious.

Insights through XAI (+)

Six users liked the fact that they were able to gain more insight through this explanation modality. Four users also stated that the percentages were a “nice-to-know”, making the explanation more useful and informative.

Negative sentiment towards XAI (-)

On the flip side, two users disliked the addition of displaying percentages, stating that when it comes to emotions and feelings, certain aspects are not quantifiable. U4 stated: “Personally I think feelings are not quantifiable. The bars are good, but don’t put an exact number on it. It’s okay if you’re communicating frequencies, like how often an emotion occurred for example.”

Visual/information overload (-)

Two users also stated that the addition of percentages is unnecessary, mentioning that only using bars to communicate importances is sufficient.

3.2. Feature importance

Rank: 2 · The feature importance explanation was among the most preferred explanations, liked for the fact that it was able to give a summary of the user input (n = 11), as well as being able to give additional insights (n = 2).

Provides summary (+)

Six users found the feature importance bars to be a clear way of communicating input topics and their importance. Four users stated that it gives them a nice overview of their input.

Insights through XAI (+)

Two users specifically liked the additional insights that they were able to get from the feature importances. U4 mentioned: “There are of course no numbers given, but I can assume that I am really frustrated, and a bit less angry. I find it interesting to reflect on results that come out of a questionnaire.”

Negative sentiment towards XAI (-)

Three users were unsure of the ranking of some topics, stating that they agreed with the general content, but not as to why one topic was deemed more important over others. This caused these users to slightly dislike and distrust the system, and give it a lower ranking.

Two users found the bars to be unnecessary, giving them information as to what contributed towards the recommendation, but not why, like the textual explanation did. U6 stated: “There is not a lot of background given. It shows that these inputs contributed to my recommendation, but not why.”

3.3. Tags

Rank: 3 · Tags scored relatively better than the previous three explanations in terms of average ranking, and were liked for their summative ability (n = 8). Only people who disliked having a lot of information were less in favor of the tag explanation (n = 2).

Provides summary (+)

Four users found using tags to be a nice way of providing a summary of their input. Four users also stated that doing so in such a way is a clear and concise method of explaining why the recommendation is given.

Insights through XAI (+)

Three users were fond of the additional insights they got from the tags and the general themes that were present in their input. U3 stated: “When inputting my feelings I did not necessarily perceive them as negative or angry. But based on these tags, I’m able to see: okay, this is how the app interprets my feelings.”

Visual/information overload (-)

Only two users stated that tags were unnecessary or provided too much information. U6 stated: “Yes it’s clear, but less practical. I tend to focus on one thing at a time.”

3.4. Purely textual

Rank: 4 · Purely textual explanations received mixed reactions during the think-aloud study. When users liked or agreed with the recommendation, the textual explanation was a welcome addition helping them understand the recommendation process and the recommendation itself, and gave users a nice summary of why the recommendation matched their inputs (n = 8). However, when the recommendation wasn’t in line with the user’s expectations, the textual explanation highlighted the mismatch even more and caused a poor reception of the recommender system in general (n = 5). Here is an overview of these topics:

Provides summary (+)

Six users found that the textual explanation was able to summarize their input quite well, albeit only focusing on one topic (the most relevant one) surrounding the recommendation.

Positive sentiment towards explanation (+)

Two users stated that the written explanation was confirming and comforting. One user also stated that the wording of the textual explanation felt less confronting regarding their negative input.

Negative sentiment towards explanation (-)

On the other hand, three users mentioned that they cannot relate to the recommendation, and that the textual explanation highlighted this fact. U4 also found the explanation to be provoking, stating the following: “I know that I’m frustrated and that it does not help. However, explaining that acts like waving a red flag in front of a bull.”

3.5. Inline-reply

Rank: 5 · During the think-aloud study, the inline reply received relatively positive feedback and comments regarding the succinct summary it gave of the user’s input (n = 7), with only some minor remarks regarding the presentation of the explanation (n = 3). However, it scored quite low during the preference ranking itself due to other explanation modalities simply being preferred over the inline reply.

Provides summary (+)

Six users found the explanation modality to be clear and more concrete, and one user additionally stated that showing which message triggered the recommendation requires less analysis from the user.

Insights through explanation (+)

Three users liked the fact that the inline reply raises awareness of the fact that the recommendation is related to one of their own inputs. U3 stated: “I find it better than the textual explanation. There, they state ’You seem to be frustrated’, and here you really are made aware of the fact that it’s your own input.”

Problem with representation (-)

Only some minor and infrequent negative remarks were given surrounding inline replies. Three users disliked the fact that by highlighting or repeating their negative input, they are more confronted with it. One user additionally mentioned that this explanation feels like the recommendation is only tuned to one input instead of multiple user inputs, making it feel too specific.

3.6. Word cloud

Rank: 6 (last) · The word cloud received the lowest average score. In general, users like the addition of displaying keyword or topic importance; however, using a word cloud to do so proves to be an inferior solution. The thematic analysis points out two main negative themes as to why this explanation is disliked: problems with representation and content (n = 9) and visual/information overload (n = 4), and one positive theme, insights through explanation (n = 4).

Problems with representation (-)

Three users pointed out that having keyword size communicate importance was unclear, and would rather have something concrete like bars indicating exact relevance. Three users also pointed out that the inconsistent sizes inherent to the design of word clouds were visually displeasing. Two users additionally stated that highlighting important keywords might be too confronting with respect to their own input, e.g. if a user inputs that they are feeling sad, having it displayed as a large word might confront the user too much with their state of mind.

Visual/information overload (-)

Three users found the addition of displaying relevance in such a way unnecessary, one of whom additionally stated that adding the information in such a way is too distracting.

Insights through explanation (+)

Four users stated however that adding this information of keyword relevance gives more insight due to not only showing the relevant topics, but their importance as well.
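The ordering of the designs in this section follows from averaging each design's rank across participants. A minimal sketch of that aggregation is given below. The three participant rankings are invented placeholders, since the per-participant rankings are not listed in the text, and `average_ranks` is a hypothetical helper name.

```python
from statistics import mean

DESIGNS = ["text", "inline reply", "tags", "word cloud", "FI", "FI+%"]


def average_ranks(rankings):
    """Return (design, average rank) pairs, best (lowest) average first."""
    expected = list(range(1, len(DESIGNS) + 1))
    for ranking in rankings:
        # Each participant must assign the ranks 1..6 exactly once, one per design.
        if set(ranking) != set(DESIGNS) or sorted(ranking.values()) != expected:
            raise ValueError("each ranking must rank all six designs exactly once")
    averages = {d: mean(r[d] for r in rankings) for d in DESIGNS}
    return sorted(averages.items(), key=lambda item: item[1])


# Invented placeholder rankings for three hypothetical participants
# (1 = most preferred, 6 = least preferred).
rankings = [
    {"FI+%": 1, "FI": 2, "tags": 3, "text": 4, "inline reply": 5, "word cloud": 6},
    {"FI": 1, "FI+%": 2, "tags": 3, "inline reply": 4, "text": 5, "word cloud": 6},
    {"FI+%": 1, "tags": 2, "FI": 3, "text": 4, "word cloud": 5, "inline reply": 6},
]

for position, (design, avg) in enumerate(average_ranks(rankings), start=1):
    print(f"{position}. {design}: average rank {avg:.2f}")
```

Averaging ranks treats the distance between adjacent ranks as equal, which is a simplification. The rank frequencies in Figure 2 give a fuller picture than the averages alone.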

4. Discussion

We will now discuss some of the most prevalent observations that were present in several explanation designs, as well as suggest guidelines on how to design health explanations for lay users experiencing (chronic) pain.

4.1. Beware of confronting people with negative sentiments

People experiencing (chronic) pain or illness can feel distress when receiving negative information surrounding their state. In our study, we noticed that highlighting keywords that are potentially negative (e.g. negative emotions, reactions, etc.) can cause distress in users and therefore make them dislike the explanation. This was apparent with the inline reply and word cloud explanations, where visually highlighting negative sentiments that relate to the recommendation caused users to dislike the explanation.

4.2. Use tags or feature importance when control is needed

Because tags and FI/FI+% are able to display multiple input categories, users expressed that these designs could give them more control over the recommendation process, if the design or implementation allows for it. One user suggested that tapping certain topics could be useful to request recommendations in a more user-controlled way. Other users made similar suggestions, with U9 stating: “It’s nice if you can individually remove certain topics”, and U7 adding: “... especially if you notice something that wasn’t interpreted the way you intended it”.

4.3. Design FI through a lay user’s perspective

The FI and FI+% designs were favored by most users, giving most users the insight and summary they needed. However, as mentioned in section 3.2, U4 interpreted the FI bars as “... I can assume that I am really frustrated, and a bit less angry”, indicating that they saw it as an overview of their input, and not how strongly their input relates to the recommendation. In total, 10 out of 11 lay users interpreted FI differently than intended. Only U4 was able to correctly interpret the bars (after reading the text above the FI bars - “This is how your inputs relate to the recommendation”), saying “The frustrated bar is the biggest, okay, so that contributes most to my recommendation”. Having a wrong interpretation could lead to confusion towards the system when, for example, a next recommendation is shown, and the input keywords and their relevance change with respect to this new recommendation. However, overcoming biases and changing mental models of lay users often proves to be difficult. A possible design adaptation to the FI and FI+% design may show a general overview/summary of the user input to be in line with what users were interpreting, and then highlight the keywords that are relevant to the recommendation that is being shown. This can be seen in

4.4. Insight vs. information overload

Users generally liked the holistic approach of the feature importances, and were more inclined to look into the recommendation itself. When asked why they liked the recommendations more when explained using FI compared to the purely textual explanation, they stated that the FI were able to show them a general overview of them as a person.

On the other hand, there were also some users who disagreed with the ordering of keyword importances that the feature importance bars were displaying, causing a slight increase in distrust towards the recommender system and leading them to rank the explanation lower. This is to be expected, as increasing transparency of explanations can cause a higher drop in trust towards the system if the content of the explanation or recommendation does not align with the user’s expectation. However, the effect of a misaligned textual explanation is still stronger, as users who did not agree with either the recommendation or the explanation expressed a more negative sentiment towards the recommendation, and gave the textual recommendation a lower ranking. This is in line with similar research by Balog et al. [5], in which they state that misaligned recommendations that focus on a single topic or item are more susceptible to a lower perceived quality of explanation compared to multi-item recommendations.

5. Conclusion

This paper introduced several explanation designs for mobile pain-related health recommendations, and compared them among lay users. Most users preferred the added transparency that was provided by the tags and FI / FI+% designs, stating that it gave them a brief and clear overview of their input which helped them understand why they received certain recommendations. Another interesting finding is that designs should be careful with visually highlighting negative sentiments of users. Designs that did so, i.e. the inline reply and word cloud, were received poorly by users. Lastly, we confirmed that lay users might interpret certain visual explanations differently than intended, yet still prefer them over others. Given their feedback, we presented an adapted design of the favoured FI / FI+% explanation to be in line with what lay users expect.

6. Limitations & Future work

The qualitative aspect of this study was already able to point out several key aspects related to designing health explanations for patients experiencing chronic pain. However, a larger-scale quantitative user study is needed to further investigate these results. One such aspect is the fact that some users preferred textual explanations over explanations that offered more information. Investigating whether this correlates with the user’s need for cognition (NFC), and what its implications are, can prove to be an interesting research direction similar to the research of Millecamp et al. [22]. Another aspect is the fact that while most users disliked being confronted with their negative input, some did not mind. This could be related to the "warriors vs. worriers" research, in which some users experiencing chronic pain actually prefer being exposed to negative feedback so they could address it, and could prove useful for further research [23]. Future research should also consider other designs to explain health recommendations and elaborate design guidelines that can be used by researchers and practitioners in this exciting domain. In addition, an interesting further line of research is to personalise these explanations on-the-fly, based on interaction data of end users. As in the work of [24], clicks and hover interactions as well as eye gaze data can be considered for such personalisation.

Acknowledgments

This work is part of the research projects Personal Health Empowerment (PHE) with project number HBC.2018.2012, financed by Flanders Innovation & Entrepreneurship, and IMPERIUM with project number G0A3319N, financed by Research Foundation Flanders (FWO).

References

[1] R. De Croon, L. Van Houdt, N. N. Htun, G. Štiglic, V. Vanden Abeele, K. Verbert, Health recommender systems: Systematic review, J Med Internet Res 23 (2021) e18035. URL: https://www.jmir.org/2021/6/e18035. doi:10.2196/18035.

[2] Torrent-Fontbona, Lopez, Personalized adaptive cbr bolus recommender system for type 1 diabetes, IEEE Journal of Biomedical and Health Informatics 23 (2019) 387-394. doi:10.1109/JBHI.2018.2813424.

[3] Gouveia, E. Karapanos, Hassenzahl, How do we engage with activity trackers? a longitudinal study of habito, UbiComp 2015 - Proceedings of the 2015 ACM International Joint Conference on Pervasive and Ubiquitous Computing (2015) 1305-1316. doi:10.1145/2750858.2804290.

[4] Cheung, Ling, C. J. Karr, Weingardt, S. M. Schueller, D. C. Mohr, Evaluation of a recommender app for apps for the treatment of depression and anxiety: An analysis of longitudinal user engagement, Journal of the American Medical Informatics Association 25 (2018) 955-962. doi:10.1093/jamia/ocy023.

[5] Balog, Radlinski, Measuring Recommendation Explanation Quality: The Conflicting Goals of Explanations, in: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '20, Association for Computing Machinery, New York, NY, USA, 2020, p. 329-338. URL: https://doi.org/10.1145/3397271.3401032. doi:10.1145/3397271.3401032.

[6] Calero Valdez, Ziefle, K. Verbert, Hci for recommender systems: The past, the present and the future, in: Proceedings of the 10th ACM Conference on Recommender Systems, RecSys '16, Association for Computing Machinery, New York, NY, USA, 2016, p. 123-126. URL: https://doi.org/10.1145/2959100.2959158. doi:10.1145/2959100.2959158.

[7] D. V. Carvalho, E. M. Pereira, J. S. Cardoso, Machine learning interpretability: A survey on methods and metrics, Electronics 8 (2019). URL: https://www.mdpi.com/2079-9292/8/8/832. doi:10.3390/electronics8080832.

[8] Wayman, Madhvanath, Nudging Grocery Shoppers to Make Healthier Choices, in: Proceedings of the Ninth Conference on Recommender Systems, ACM, 2015, pp. 289-292. doi:10.1145/2792838.2799669.

[9] J.-B. Lamy, B. Sekar, G. Guezennec, J. Bouaud, B. Séroussi, Explainable artificial intelligence for breast cancer: A visual case-based reasoning approach, Artificial Intelligence in Medicine 94 (2019) 42-53. URL: https://www.sciencedirect.com/science/article/pii/S0933365718304846. doi:https://doi.org/10.1016/j.artmed.2019.01.001.

[10] Bussone, Stumpf, D. M. O'Sullivan, The role of explanations on trust and reliance in clinical decision support systems, 2015 International Conference on Healthcare Informatics (2015) 160-169.

[11] Ribera, Lapedriza, Can we do better explanations? a proposal of user-centered explainable ai, CEUR Workshop Proceedings 2327 (2019).

[12] Wang, Yang, Abdul, B. Y. Lim, Designing Theory-Driven User-Centric Explainable AI, Association for Computing Machinery, New York, NY, USA, 2019, p. 1-15. URL: https://doi.org/10.1145/3290605.3300831.

[13] C.-H. Tsai, P. Brusilovsky, Beyond the ranked list: User-driven exploration and diversification of social recommendation, in: 23rd International Conference on Intelligent User Interfaces, IUI '18, Association for Computing Machinery, New York, NY, USA, 2018, p. 239-250. URL: https://doi.org/10.1145/3172944.3172959. doi:10.1145/3172944.3172959.

[14] M. Szymanski, Millecamp, K. Verbert, Visual, textual or hybrid: The effect of user expertise on different explanations, in: 26th International Conference on Intelligent User Interfaces, IUI '21, Association for Computing Machinery, New York, NY, USA, 2021, p. 109-119. URL: https://doi.org/10.1145/3397481.3450662. doi:10.1145/3397481.3450662.

[15] Ooge, G. Stiglic, K. Verbert, Explaining artificial intelligence with visual analytics in healthcare, WIREs Data Mining and Knowledge Discovery 12 (2021). URL: https://wires.onlinelibrary.wiley.com/doi/abs/10.1002/widm.1427. doi:https://doi.org/10.1002/widm.1427.

[16] Schmid Mast, Kindlimann, W. Langewitz, Recipients' perspective on breaking bad news: How you put it really makes a difference, Patient Education and Counseling 58 (2005) 244-251. URL: https://www.sciencedirect.com/science/article/pii/S0738399105001473. doi:https://doi.org/10.1016/j.pec.2005.05.005, medical Education and Training in Communication.

[17] Bidargaddi, Musiat, Winsall, G. Vogl, V. Blake, Quinn, Orlowski, G. Antezana, G. Schrader, Efficacy of a web-based guided recommendation service for a curated list of readily available mental health and well-being mobile apps for young people: Randomized controlled trial, Journal of Medical Internet Research 19 (2017). doi:10.2196/jmir.6775.

[18] Wu, Ester, Flame: A probabilistic model combining aspect based opinion mining and collaborative filtering, in: Proceedings of the Eighth ACM International Conference on Web Search and Data Mining, WSDM '15, Association for Computing Machinery, New York, NY, USA, 2015, p. 199-208. URL: https://doi.org/10.1145/2684822.2685291. doi:10.1145/2684822.2685291.

[19] C.-H. Tsai, P. Brusilovsky, Evaluating Visual Explanations for Similarity-Based Recommendations: User Perception and Performance, in: Proceedings of the 27th ACM Conference on User Modeling, Adaptation and Personalization, UMAP '19, Association for Computing Machinery, New York, NY, USA, 2019, p. 22-30. URL: https://doi.org/10.1145/3320435.3320465. doi:10.1145/3320435.3320465.

[20] P. Pu, L. Chen, R. Hu, A user-centric evaluation framework for recommender systems, in: Proceedings of the Fifth ACM Conference on Recommender Systems, RecSys '11, Association for Computing Machinery, New York, NY, USA, 2011, p. 157-164. URL: https://doi.org/10.1145/2043932.2043962. doi:10.1145/2043932.2043962.

[21] N. J.-M. Blackman, J. J. Koval, Interval estimation for cohen's kappa as a measure of agreement, Statistics in Medicine 19 (2000) 723-741. doi:https://doi.org/10.1002/(SICI)1097-0258(20000315)19:5<723::AID-SIM379>3.0.CO;2-A.

[22] Millecamp, N. N. Htun, Conati, K. Verbert, To explain or not to explain: The effects of personal characteristics when explaining music recommendations, in: Proceedings of the 24th International Conference on Intelligent User Interfaces, IUI '19, Association for Computing Machinery, New York, NY, USA, 2019, p. 397-407. URL: https://doi.org/10.1145/3301275.3302313. doi:10.1145/3301275.3302313.

[23] Geuens, Swinnen, Geurts, R. Westhovens, R. De Croon, V. Vanden Abeele, Worriers versus warriors: Tailoring mhealth to address differences in patients with chronic arthritis, in: 2020 IEEE International Conference on Healthcare Informatics (ICHI), 2020, pp. 1-12. doi:10.1109/ICHI48887.2020.9374322.

[24] Millecamp, Willemot, K. Verbert, Your eyes explain everything: exploring the use of eye tracking to provide explanations on-the-fly, in: Proceedings of the 8th Joint Workshop on Interfaces and Human Decision Making for Recommender Systems co-located with 15th ACM Conference on Recommender Systems (RecSys 2021), volume 2948, CEUR Workshop Proceedings, 2021, pp. 89-100.