Investigating Mechanisms for User Integration in the Activity Goal Recommendation Process by Interface Design

Katja Herrmanny, University of Duisburg-Essen, Duisburg, Germany, katja.herrmanny@uni-due.de
Simone Löppenberg, University of Duisburg-Essen, Duisburg, Germany
Michael Schwarz, University of Duisburg-Essen, Duisburg, Germany, michael.schwarz@uni-due.de

Copyright ©2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
IntRS '19: Joint Workshop on Interfaces and Human Decision Making for Recommender Systems, 19 Sept 2019, Copenhagen, DK
IntRS Workshop, September 2019, Copenhagen, DK. Herrmanny, Löppenberg, and Schwarz

ABSTRACT

In the field of physical activity recommendation, we have to deal with many confounding variables that lead to high result uncertainty. Assuming that users' competence is an essential factor for reducing the problem of inaccurate recommendations, we present and evaluate an approach on how to integrate users in the recommendation process. We investigate if and how interface element design can contribute to understanding, reflection and modification of the recommendation result. In the work described here, we use interface elements that allow for planning of physical activity goal striving. Results show that such interface elements can in principle empower users, support recommendation reflection and stimulate user interaction with the recommendation.

KEYWORDS

user integration, recommendation, goal setting, user empowerment, user interface, activity tracking

1 INTRODUCTION AND SCOPE OF THE PAPER

Besides algorithm accuracy, the design of user interfaces is an important component of recommender systems that has gained more and more interest in recent years [10, 13, 21]. Especially health-related personalised recommendations have to deal with many confounding variables that are unknown to the algorithm or not quantifiable and thus difficult or even impossible to consider in the recommender's reasoning [8]. Due to this and other aspects like autonomy issues, integrating the user in the process is an essential part of health-related recommender systems [8]. However, in this field little research has been done to investigate how the user could be integrated. In this paper, we investigate how to design a user interface to integrate the user in the recommendation process of physical activity goals by:

• empowering users to understand the recommendation and its implications,
• reflecting and evaluating the recommendation, and
• providing the opportunity to actively manipulate it.

In our approach, empowerment and reflection are mainly achieved by the planning of recommendation realisation, which helps the user to assess whether the recommended goal is realistic or not. It also supports the user in appropriate modifications. We present two interface alternatives and an evaluation regarding their potential to integrate the user in the process in the way described above. We further investigated understanding, usability, and aesthetics, which are relevant factors for user engagement.

2 RELATED WORK

Recommender systems intend to support users in the decision making process based on their preferences and needs [5]. There has been a lot of research focusing on the prediction accuracy [5]. However, recently it has been shown that accuracy of the algorithm influences the user experience only partially [13] and that the key to success are the functions provided by the user interface of recommender systems [10]. User interface design and dialogs affect usability, acceptance, item rating behaviour, selection behaviour, trust, and willingness to buy and reuse the system [5]. In order to improve recommender systems and user satisfaction, it is beneficial to provide users with the opportunity to interact with recommendations and to make adjustments if needed [12]. However, it is often not possible to provide feedback to the system, which is important to adapt its assumptions about the user [12]. In the context of rating based recommender systems (e.g. movies, music), it has also been shown that interactive recommender systems are advantageous since they can factor in changed user interests over time or corrections to previously made mistaken ratings [11]. Yet most recommender systems consider user ratings as always correct [11]. The authors [11] therefore suggest supporting user interaction with the recommendation by allowing an adjustment of previous ratings. This would provide explicit feedback to the system, instead of implicit feedback, which is typically obtained by monitoring users' behaviour [3].

In some contexts, giving explicit user feedback requires an understanding of the recommendation. This can be supported by explanation. Explanation interfaces are used in different fields - such as expert systems, medical decision support systems, intelligent tutoring systems, data exploration systems, and recommender systems [18]. By explaining the recommendation result, they aim at providing transparency and, in consequence, trust and user acceptance [18]. Such explanation is also termed user empowerment. Empowerment can also be found in the health sector, especially under the concept of patient empowerment [1, 7, 14]. In this domain, empowerment includes knowledge transfer and persuasion. Following Kondylakis et al. [14], patient empowerment is achieved through the accessibility of information (e.g. the opportunity to get information on the internet). This is in line with Alpay et al. [1] who state that the term empowerment "is frequently used to describe a situation where patients are encouraged to be active in their own health management". Regarding the effect of empowerment, it has been shown that empowerment is provided by e-health applications and can positively influence patients' long-term health status [7]. Specifically for goal recommendations in activity tracking applications, it has been shown that users wish to get explaining or illustrating information [9]. One approach to providing such information was to illustrate the impact of the activity goal by reference routes [9]. Doing so, users should get a better sense of the amount of activity of the selected goal. In summary, in the health sector, empowerment includes knowledge transfer and persuasion. Schäfer et al. [19] distinguish between empowerment and persuasion. They use the term empowerment to describe explanation interfaces in health recommender systems. In a literature review they found that "empowering the user by interactively guiding his decision, and creating trust, using [...] explanatory interfaces" are relevant concepts in health recommender research. Following Schäfer et al. [19], in health recommender systems, empowerment is achieved through explanation of the internal logic of the recommender system, which they claim to be one of the key challenges for health-related recommender systems. Our understanding of empowerment goes even further. Besides explanation of the recommender's reasoning, empowerment as we define it also includes transferring domain knowledge and explaining the recommendation's implications.

3 CONTRIBUTION

To overcome the shortcomings of algorithmic recommendations, we (as well as other researchers mentioned in the previous section) propose to integrate the user in the recommendation process to achieve better results.

In contrast to previous approaches, in this work we do not focus on textual explaining elements. Instead, we investigate if and how interface design can be used to integrate users in the recommendation process.

Our idea is to

• support user empowerment by interface elements that explain the impact of the recommendation,
• support recommendation reflection by interface elements that indicate inaccuracy as well as elements that explain the impact of the recommendation, and
• support user engagement by interface elements that allow for manipulation of the recommendation and its implementation planning.

In order to investigate the general potential of our idea to foster user integration through interface design, we designed exemplary interface elements, which are described in detail in the following section. This work does not focus on the evaluation of the interface elements themselves or the app they are framed in. They are just used as tools to investigate our research questions.

In summary, this paper contributes by investigating the potential of interface design to support user integration in the recommendation process for the purpose of improving the recommendation.

4 APPROACH

The scenario of our work is a physical activity tracking app, providing recommendations for an appropriate (i.e. challenging, but not overburdening), numerical weekly activity goal based on user and context data of the specific user. As described above, recommendations in this field are very error-prone because of large variations over time and a large number of confounding variables indeterminable for the algorithm. To meet this challenge we propose to integrate the user in the recommendation process. To this end, it is necessary (a) to empower users to understand the recommendation and its implications, (b) to reflect on and evaluate it and (c) to provide adequate opportunities to actively manipulate it. To reach this aim, we used two main strategies. The first one is transparency. By showing the uncertainty of the algorithm, we want to make the user aware of the necessity of his/her influence. The second strategy is implementation planning to illustrate the impact of the recommendation. The specific (exemplary) user interface elements we designed are described in the following.

Firstly, we presented the recommendation - which is a numerical value on a continuous scale, in contrast to discrete items like in conventional recommender systems - as a range on a modifiable slider. The recommendation with the highest probability of being the best fitting one is used as default value. We further added a colour gradient to the slider indicating uncertainty regarding the system's recommendation (for different values of the range). Presenting a range instead of a single value and indicating uncertainty of the algorithm should support users in recommendation interpretation by understanding that the recommendation is a non-exact one that needs to be reflected on and probably adjusted. The slider is intended to encourage and enable users to adjust it.

Kilocalories (kcal) were used to present the recommendation, which is, besides the number of steps, one of the most common units for measuring physical activity. The advantage of that approach is that all kinds of physical activity can be subsumed in this unit. However, it is problematic that the recommendation may be very abstract for the user, especially as kcal are more common in the field of nutrition. So secondly, we converted the recommendation to a unit that is more intuitive and easier for the user to interpret. It was converted to the time needed to be active in three different intensity levels (low, moderate, high) in order to achieve the weekly goal. The reasons for dividing the goal into different activity levels are explained in section 5. The relation between the three activity levels could be modified by the user (which leads to more time needed to achieve the goal if the user reduces the more intensive activity and increases the low-intensive one and vice versa). By giving the user a sense of the amount of time needed for goal striving and thus for the difficulty of the goal, the user should be empowered to understand the implications of the recommendation and be encouraged to reflect on it.

In addition to this, we thirdly allowed for a more detailed planning to give an even better sense of how difficult it is to achieve the chosen goal in daily life. The interface therefore provides the user with the opportunity to plan how to distribute the required time to different activity units. This also aims at fostering reflection on the goal and, as a consequence, encouraging users to modify it.

5 MOCK-UPS

We designed mock-ups with three main interface elements which we implemented for an Android application. The targeted application should be an activity tracking app, which recommends a physical activity goal considering user and context parameters. The goal unit should be consumption of kilocalories per week. The three interface elements are:

(1) Goal selection element (see figure 1 (a)): The first element is a slider. The slider range represents the recommendation range and the default slider position represents the most recommended value, i.e. the value with the highest probability to be the most suitable one. For the study, the goal is initially equipped with a default value of 1400 kcal. As this study focuses on the interface and not the algorithm, this recommendation is the same for all participants to exclude this as a confounding variable. (The recommendation algorithm is not part of this paper and thus described elsewhere.) A colour gradient at the slider range indicates the probabilities resulting from the recommendation algorithm. The user has the opportunity to modify the goal within the given range.

(2) Conversion into time element (see figure 1 (b)): The second element converts the kcal goal into amount of time, which is less abstract. As time needed to consume a certain amount of kilocalories strongly depends on the intensity of the performed activity, it is necessary to select which ratio of different activity levels will be performed. Otherwise the conversion into time would be far too inexact and thus not helpful. To get an approximation without cognitively overburdening the user, we provided three intensity categories used in the health sector, which can be seen in table 1. For calculation we used an average MET value of each category, which is also presented in table 1. MET (= metabolic equivalent of task) is a unit to indicate the intensity of an activity and can be converted to calories and vice versa. Depending on the activity category, more or less active time is needed to achieve the goal. Differently exhausting activities or sports fall into different categories (see table 1). We enable users to specify which amount of each intensity level they plan to do to achieve the goal. This is done by an adjustable circular seekbar (circle), which allows for modification of the intensity types' portions within the circle. The default state in our study is 1/3 for each part of the circle, i.e. for each intensity level. Within the circle, the selected goal is shown and how much of the goal has already been distributed among the intensities. If the circle is filled, all kilocalories of the goal are planned. Below the circle, the derived duration in minutes of each intensity level is shown. Three different icons represent the different intensity levels (walking for low intensity; cycling for moderate intensity; running for high intensity).

(3) Activity unit planning element (see figure 1 (c)): The third interface element consists of input elements for planning units of activity for each intensity level. For example, if the user has chosen to do 60 minutes of high-intensity activity (in interface element 2), those could be distributed to one exercise or activity unit of 40 minutes and one of 20 minutes. 10 minutes is the minimum duration. By interacting with the circular seekbar or by tapping on the icons below, it is possible to switch between the activity units of low, moderate and high intensity. Below the heading it is presented how much time remains to be distributed to activity units for the respective intensity level. Below there is a table in which each row represents an activity unit (for each intensity level) and each column represents 10 minutes. By tapping on the buttons (+ -) on the left and on the right, the duration of the activity unit can be modified, whereby the respective cell is coloured. By using the buttons (+ -) below the table, a new activity unit (row) can be added.

Table 1: Intensity categories with corresponding MET values

Intensity    MET    Example Activity
low          3      slow walking, walking downstairs, golf
moderate     5      walking, weight lifting, dancing
high         7      running, playing soccer

The separation of the activity into low, moderate and high intensity is done in order to convert the goal into time and thus better estimate the calorie consumption and check if the selected goal is realistic.

The distribution into activity units should help the users to get a feeling for their own activity behaviour and to integrate activities into their everyday life through precise planning. This also serves as a supporting element to check whether the selected goal is realistic or not.

The three elements of the page (weekly goal, conversion into time, activity units) are interdependent. When the weekly goal is increased or decreased, the amount of time to be distributed among the intensity levels also increases or decreases. This has an immediate effect on the minutes of activity units to be planned, which are adapted to the intensity distribution. The application can provide helping information for each area if required.

Especially for reasons of reflection, we assumed that it might be helpful to show all interface elements on one single page. When users modify their goal, they immediately see the corresponding implications on the other planning steps, and thus can better evaluate which amount of adjustment is adequate. On the other hand, space is limited on smartphone displays. Layouts are quickly overloaded, which could result in information overload for the users. This is presumably not helpful in the reflection process. Thus, we decided to design two interfaces with mainly the same elements. The first one is scrollable and presents all described interface elements on a single page. The second one presents them on successive pages with the opportunity to navigate between the pages.

5.1 Interface 1: Single Page

Figure 1 shows the single page interface. At the top of the page, the goal selection slider is presented. The conversion into time is presented below the goal selection element with an adjustable circle as described above. Below the circle, the resulting duration in minutes of each intensity unit is shown. The third interface element, which is the planning of specific activity units, is presented on the bottom of the page.

Figure 1: Single Page Interface (translated from German) with (a) goal selection, (b) conversion into time, (c) activity unit planning

5.2 Interface 2: Multiple Pages

In addition to the single page interface, a second one was designed and implemented. This interface presents the three areas on successive pages (Figure 2). The first page contains the weekly goal with the slider for goal adjustment. The heading of this section has been changed and formulated as a request ("Choose your goal for the next week"). In addition, a visualisation below the slider converts the calorie goal to minutes of activity for each intensity level. Above the button "Intensity Distribution", which announces the next page, there is also a short explanation for the next page's content and why it is needed. The content of that second page is identical to the content of the intensity distribution area in the single page interface. The circle can be used to increase and decrease the intensities. The kcal unit is converted into a temporal unit accordingly. Here, too, the next step of the display is briefly explained before the button "Activity Units", which leads to the following page. The last page contains the input elements for the planning of specific activity units. Due to the increased amount of space compared to the single page interface, all intensity types can be listed at the same time and do not have to be switched. Besides that, the interaction opportunities with the elements remain the same.

Figure 2: Multiple Pages Interface (translated from German)

6 EVALUATION

The main aim of the evaluation was to investigate if the above described mechanisms and interface elements help to:

• empower users to understand the recommendation and its implications,
• reflect and evaluate the recommended goal, and
• provide the opportunity to actively manipulate it.

Moreover, we investigated if these supportive functions depend on the app's presentation mode (single or multiple pages). Further aspects of interest concerned user experience, usability, and aesthetics. All of these aspects can influence whether the user is willing to interact with the interface or not.

We evaluated both interfaces with respect to our research questions in a laboratory setting with a mixed-method design. A between-subject design was chosen, in which the participants were randomly assigned to one of two groups. Depending on the group, one of the two interfaces was presented to the participants and they were given the following tasks:

First, we asked the participants to imagine using an activity tracking app and presented the interface to them so that they could explore it. They were asked to select an activity goal for the next week based on the (simulated) system recommendation. Meanwhile, they should describe their thoughts and impressions with the think aloud method [6], which was recorded. Interaction behaviour was observed and documented. Afterwards, an online questionnaire was presented to the participants to collect quantitative data regarding the perception of the interface. In the online questionnaire we assessed usability, user experience and aesthetics. For this purpose, the User Experience Questionnaire (UEQ) [15], the System Usability Scale (SUS) [4], the Visual Aesthetics of Websites Inventory (VisAWI) [20] and the After-Scenario Questionnaire (ASQ) [16] were used. Finally, a semi-structured interview was conducted to get the participants' opinion about the interface and its supportive potential. The study took approximately 30 - 45 minutes.

7 RESULTS AND IMPLICATIONS

In total, 27 persons (group 1: n=13; group 2: n=14) participated in the study. They were aged 19 to 67 years (M=30.32, SD=17.19). 23 of them stated that they were interested in apps for physical activity support and 12 already had experience in using them.

7.1 Behaviour Observation and Think Aloud

We used behaviour observation and the think aloud method to objectively assess empowerment, reflection and appropriateness of manipulation elements. For this purpose, interaction behaviour and the comments made during the interaction were documented, transcribed and analysed.

The think aloud results indicate empowerment and reflection processes. When initially interacting with the goal selection element, as expected, some participants (n = 9) mentioned problems in estimating which number of kcal would be an adequate goal. For example, one participant said:

"I have no idea. 1000 kcal sounds good." (participant 22; multiple pages interface)

Later, when seeing the conversion into a temporal unit, she commented:

"It's quite convenient, that it is given how much [time] that would be." (participant 22; multiple pages interface)

Another participant first commented:

"No idea. I don't know what's average." (participant 24; multiple pages interface)

Later on when interacting with the activity units, she reflected:

"I guess I could have chosen more in total. As it is, I have to do very little per day." (participant 24; multiple pages interface)

Transferring kcal into a temporal unit (intensity distribution element) helped to understand the recommended goal and reflect whether it is realistic or not. The aspect of interpretation of the kcal unit and the supportive potential of the conversion into time is also explicitly addressed in the interview (see below).

The order of interaction with the interface elements was also analysed to objectively assess reflection. A non-linear order of interaction and revision of the choices previously made indicates that the subsequent elements fostered reflection on the selected goal planning. A linear order does not provide information about whether the recommendation was reflected on (but intentionally not modified) or not. The observed order of interaction with the interface elements was the same with both interfaces. The majority of the participants completed the task in linear order (n=21). Seven persons (single page: n = 4; multiple pages: n = 3) operated with the elements in non-linear order and made adjustments to elements already used during processing. The order of interaction with the elements 1 (goal selection element), 2 (conversion into time element) and 3 (activity unit planning element) can be seen in table 2.

Table 2: Order of Interaction with Interface Elements

Participant    Interface    Interaction Order
04             multiple     1→2→3→1→2→3→2→3
07             single       1→2→3→2→3→2→3
08             multiple     1→2→3→2→1→2→3
11             single       1→2→3→2→3
12             multiple     1→2→3→2→3
24             multiple     1→2→3→2→3
25             single       1→2→1→2→3

To evaluate if the provided opportunities to manipulate the recommendation are appropriate, we analysed correctness of handling as well as reported and observed difficulties: During the interaction with both interfaces, the first interface element (goal selection through slider) was clear for most participants. Two participants had general questions of understanding and stated that it could not be set precisely. The time unit was also clear. The only interpretation problem that occurred was that one participant asked for a conversion from minutes into hours. Some participants were confused by the different activity levels. However, the majority of the participants (n = 19) seemed to understand the intensity distribution without help of the supervisor; eight asked for help. Regarding the activity units (interface element 3), 15 participants asked for help. After hinting at the help texts integrated in the interface, help was provided, if still necessary. The goal selection element was explained to two participants, the conversion into time with the intensity distribution to seven participants and the activity units to 14 participants. Despite problems with usability and understanding, the elements were commended in their function and meaning.

7.2 Interview

For the interviews, we used the qualitative content analysis [17]. The interviews were transcribed, afterwards the statements were clustered and analysed. Due to recording problems, one data set is missing, so that the interview section is based on 26 data sets. In the following, the results are presented and enriched by a few exemplary translated statements. Since the two interface versions primarily differ in the distribution of the elements on different pages, the results are mainly presented for both interfaces simultaneously.

7.2.1 Overall Satisfaction. The overall evaluation of the participants was positive for both interfaces. In general, the interfaces were rated positively.

"It's quite good. I usually use a pedometer. There it [goal planning] is just implicit. Here I can exactly plan how much I can make a day, [...] do I just want low intensity or modify and increase it." (participant 7; single page interface)

7.2.2 Understanding and Supportive Potential. As an important factor for the supportive potential of the interface elements, we asked participants if they understood the dependencies of the three interface elements. In order to objectively evaluate the understanding, we further asked them to explain these dependencies. In most cases (n = 25), the dependencies between weekly goal, conversion into time and activity units were explained correctly. However, some participants reported initial comprehension problems.

"At the beginning you are thrown in a bit, but that is actually okay, because it is so clearly structured." (participant 13; single page interface)

Although we did not explicitly ask for it, the main idea that the goal could be adjusted was mentioned positively.

Regarding the first element, we asked the participants if they found the unit kilocalories meaningful. As stated above, we expected that kilocalories might be too abstract for the users. But, since it is the common unit for activity tracking, it was known by all participants. However, eight of them (single page interface: n = 5; multiple pages interface: n = 3) reported problems in interpreting it as they were not used to the unit. For instance, they asked for reference values that would have been helpful. Although there were also many participants who did not report problems with the kilocalories unit, all except one participant agreed that the conversion into time (interface element 2) helped to interpret and estimate the goal. The separation of activities into different intensities was not easy to understand for 6 persons. Others liked this aspect:

"I liked that you could divide it into intensive, moderate and low activity, because you could split it up even if you were not an athletic kind of person. You can see that you can achieve a lot even by just climbing stairs or going for a walk." (participant 20; multiple pages interface)

With regard to the third interface element - the planning of activity units - most participants (n = 20) found it helpful. For example one participant said:

"[...] It allows a bit of back and forth planning." (participant 11; single page interface)

Three of them even wished to have further functionalities: One participant wished to assign the selected units to concrete activities, two would like to assign them to specific days. A calendar function was also desired by further participants when directly asked for additional functionality in another interview question. Three of the participants who found the planning of activity units helpful reported (initial) problems in understanding. Four participants tended not to find the activity units helpful. Three of them reported problems in understanding. Two participants did not make a precise statement whether the unit planning element was supportive for them or not.

7.2.3 Usability. Despite positive usability ratings (see section 7.3), some usability problems that occurred when operating with the interface elements could be identified in the interview. In general, difficulties in operation were often (n = 13) mentioned.

"What bothered me was that sometimes it was hard to select the things". (participant 11; single page interface)

Although there were many positive responses to the circular seekbar that was used for intensity distribution, four participants had operating problems when moving the pointers. Regarding the planning of activity units (third interface element), it was still desired that the addition and removal of activity units should not only be possible by using the buttons, but also by touching the respective units.

7.2.4 Aesthetics. The aesthetics were also rated predominantly positively for both interfaces. The colour and graphical presentation of the page was particularly rated as outstanding.

"The colours fit well together. It is not obtrusive or boring". (participant 19; single page interface)

However, other participants found the colour design to be too uniform (n = 3). Especially the circular seekbar was criticised for being too similar or boring.

"The colours were too similar that you had to look exactly what you had just selected." (participant 12; multiple pages interface)

7.3 Online Questionnaire

With an online questionnaire we assessed user experience, aesthetics and usability. Due to extreme values, identified by box plot diagrams, some participants were excluded from the analysis of the questionnaires. 22 participants (group 1: n=11; group 2: n=11) remained. 91% of them stated that they were interested in apps for physical activity support and 50% also had experience in using them.

Table 3: Descriptive Results of the Single Page Interface

                         Single Page        Multi Page
                         M        SD        M        SD
UEQ attractiveness       1.58     .66       1.80     .60
UEQ perspicuity          1.91     .94       2.08     .49
UEQ efficiency           1.92     .86       1.36     .56
UEQ dependability        1.34     .71       1.52     .75
UEQ stimulation          1.11     .62       1.30     .44
UEQ novelty              1.11     .73       1.17     .59
VisAWI                   5.85     .38       5.78     .63
VisAWI simplicity        6.00     .51       5.69     .63
VisAWI diversity         5.07     .71       5.45     .77
VisAWI colourfulness     6.41     .45       6.05     .99
VisAWI craftsmanship     6.09     .38       6.05     .66
ASQ                      3.00     1.38      2.45     1.66
SUS                      77.27    13.80     80.69    10.43

7.3.1 UEQ. Both interfaces have a good UEQ score (see table 3). A value higher than .80 is an indicator for a positive rating, while a value lower than -.80 is an indicator for a negative rating. Results of the multiple pages interface are descriptively higher in all subscales, except for the result of efficiency. We tested for significance with a t-test for the UEQ perspicuity scale and - due to missing prerequisites (normal distribution or variance homogeneity) - a Mann-Whitney-U-test for the remaining subscales. There were no significant results.

7.3.2 VisAWI. For the VisAWI there are no reference values for interpretation given by the authors. They state that lower values imply a negative rating and higher values a positive one. In this case, ratings from 1 to 3.5 are interpreted as negative and ratings from 3.5 to 7 as positive, since the scale is from 1 to 7. In general, the single page interface (M=5.85, SD=.38) as well as the multiple pages interface (M=5.78, SD=.63) were rated positively (table 3) and do not differ significantly from each other (Mann-Whitney-U-Test: U = 57.700, p = .847). The same applies for the subscales.

7.3.3 ASQ. For the ASQ questionnaire, there were also no reference values given for interpretation. The scale ranges from 1 (very positive) to 7 (very negative). As before, for interpretation we split the scale. Values from 1 to 3.5 are interpreted positively and values from 3.5 to 7 negatively. As presented in table 3, the single page interface has an average of 3.0 (SD=1.38) and the multiple pages interface of 2.45 (SD=1.66). Both values are therefore evaluated as positive, whereby the multiple pages interface performed better. However, the differences are not significant (Mann-Whitney-U-Test: U = 41.000, p = .217).

7.3.4 SUS. The SUS score can have a range from 0 to 100. According to Bangor et al. [2] the single page interface was rated "good" and the multiple pages interface "excellent" (see table 3). Therefore, the multiple pages interface had a better result, which, however, was not significant (Mann-Whitney-U-Test: U = 54.500, p=.699).

8 DISCUSSION

Three main interface elements were designed and evaluated in two slightly different interfaces. The three elements are 1) an adjustable slider showing a recommendation range together with an indication of uncertainty, 2) conversion into time depending on activity level, and 3) activity unit planning. They were designed to empower users to understand the recommendation and its implications, to reflect and evaluate it, and to provide the opportunity to actively manipulate it. In the following, we discuss if and how the interface elements reached those aims.

8.1 Comparison between the Interfaces

Some understanding problems occurred. As explanation was given during the study, this should not confound the results. Probably, these problems would partially be solved through extended, unobserved exploration in real-world settings. As already suggested above, functionality and exploration of the interface could be enhanced week by week. Additionally, we addressed understanding problems in the re-design (see section 9) as far as possible.

The questionnaires UEQ and VisAWI addressed the aesthetics of the interfaces and showed positive ratings. Regarding usability, ASQ ratings tended to be positive and very positive values were achieved in the SUS. In the interview, the colours were also predominantly commended, whereas a few people felt that the colour design was too uniform.
However, the interview and interaction Results do not show any significant differences between the two showed that some participants had problems with the operation, e.g. interfaces in interaction behaviour, feedback regarding the per- with the exact setting of the goal selection element or with the selec- ceived support of the interface elements, usability, user experience tion of elements, which can be corrected by technical adjustments. or aesthetics. Returning to previous interface elements to make In total, overall ratings for usability, user experience and aesthetics adjustments, was assumed to be more difficult for navigation on are good. This is an important precondition for whether the user is different pages in the multiple pages interface. However, since the willing to interact with the interface or not. This precondition can order of interaction does not vary between the interfaces, both be seen as fulfilled for the study and the interfaces. arrangements of elements (single page interface: on one page, mul- tiple pages interface: on more consecutive pages) seem possible. Consequently, whether those interrelated elements are arranged on 8.3 Interface Element 1: Goal Selection single or on multiple pages doesn’t seem to be an influencing factor All participants made use of the slider, which indicates that a modi- for recommendation reflection and modification. As all results are fiable value within a range seems more appropriate than one single independent from presentation mode, we will subsequently discuss recommended value. Think aloud comments show, that they re- them without separation between the two interface variants. flected, what would be an appropriate goal. However, as expected and in line with the interview results, the limited range and the 8.2 Overall Interface default value alone do not seem to sufficiently support users in evaluating, what an appropriate value would be. 
Additional em- All participants interacted with all interface elements. This indi- powerment is needed here (see 8.4). One participant suggested a cates that they were perceived as helpful and have the potential to reference value to better interpret the recommended goal. This is integrate users in the process. As a limitation, it can not surely be surprising as we intended the goal range and the default value (and said, which amount of interaction was fostered by the study situa- the colour gradient) to serve as such a reference. At least this one tion and which by the interface design itself. However, participants participant does not seem to interpret it in the intended way. As were not explicitly told to interact with each interface element. On participants did not seem to pay attention to the colour gradient the one hand, the study situation could have fostered interaction indicating the certainty of the result, this could not help in terms with the elements. On the other hand, it is likely, that users of an of empowerment and reflection, which contradicts former research activity tracking app in a real-world setting (compared to a study [8]. One reason might be that the interface contained more ele- setting) are more intrinsically motivated to interact with the system ments than the interfaces used in the cited literature and therefore and choose the most adequate goal. Moreover, in real-world settings the focus of participants was different. Another reason might be, there is much more time to get used to the system and its interface that in the presented work the colour gradient was from light blue elements. Exploration of and interaction with the interface could to dark blue. In the cited study, the gradient had a colour coding be enhanced week by week. with the colours red, yellow and green. It has been shown that most people interacted with the elements in a sequential order. 
For the single page interface this means that the elements were operated from top to bottom and for the multiple 8.4 Interface Element 2: Conversion into Time pages interface that the elements were operated page by page. Pos- Think aloud, behaviour observation and interview results all in- sible reason are (1) that they were satisfied with the initially chosen dicate, that conversion into a temporal unit strongly empowered value, or (2) that they considered their study task to be completed users to better estimate and reflect what is an appropriate goal when testing and understanding all elements and did not see the value on the recommended range. For some participants this re- need to actually find an appropriate goal in the study setting, or (3) flection lead to revision of the initially chosen goal. Conversion that for those people reflection was not stimulated sufficiently to into time goes along with separation into different activity levels, lead to interaction. However, more than 20 percent of the partici- which was difficult to understand for some participants and thus pants returned to previous elements during the interaction process lowered the intended empowerment. Unfortunately, this separation to make adjustments. Stepping back from one interface element is unavoidable, as the conversion otherwise would have been far too to another, indicates that exploration and interaction has not only inexact and not meaningful anymore. However, results show that been done for the study, but actually stimulated reflection. nevertheless the interface element can have the intended supportive IntRS Workshop, September 2019, Copenhagen, DK Herrmanny, Löppenberg, and Schwarz potential. Although, there were some problems in handling, which There are some limitations and potential for improvement. Al- need to be resolved, all participants interacted with the interface though the work described focuses on investigating the research element. 
questions, it is part of the user centered development of a broader behavior change application. Therefore, we re-designed the inter- face (Figure 3). We changed the slider’s colour gradient and added 8.5 Interface Element 3: Activity Unit Planning colour coding from green to orange. As some few participants Also the activity unit was used by all participants and the majority encountered problems in distinguishing the intensities in the cir- found it helpful. Some wished for modified elements or opportuni- cular display, we increased the contrast of the colours so that they ties that allow for an even more detailed planning. Reconsidering are easier to distinguish. Addressing the understanding problems, the initially selected goal and stepping back to the first interface we revised the help texts for all elements. Regarding the activity element demonstrates that such a planning element can in princi- unit planning element, we redesigned the element to make it more pal empower users to reflect the recommendation. The usability understandable and meet the user demand for a more detailed plan- and understanding problems reported by some participants are ning, such as a calendar function. Instead of the fields of 10 minute addressed in the re-design of the interface presented below. intervals per activity unit, we now provide one field per day of the week. Duration of the daily activity time can be modified via plus or minus buttons. We further modified the label. 9 SUMMARY, RE-DESIGN AND CONCLUSION We investigated if user interface design can in principal support users of an activity tracking system in understanding and reflect- ing the system’s goal recommendation as a basis for appropriately exerting influence on the recommendation result. This kind of user integration is important as recommendations in this field are very error-prone because of large variations over time and a large num- ber of confounding variables that are indeterminable for the system. 
In our presented approach, we pursue three main aims, which are (a) to empower users to understand the recommendation and its implications, (b) to reflect and evaluate it and (c) to provide the opportunity to actively manipulate it. In our approach, these aims are pursued through transparency of the algorithm uncertainty and by providing activity planning elements, which are intended to have explanatory function regarding the impact of the recom- mendation and to stimulate reflection of the recommended goal. Further, these elements should enable and support users in ap- propriate modifications of the recommended goal. We designed two different interfaces with three elements: A modifiable slider for goal selection showing a recommendation range, default value and colour indicator for probability of suitability of the recommen- dation; a conversion of the goal unit (kcal) into a temporal unit (minutes) in conjunction with different activity levels; an element Figure 3: Interface Re-Design (translated from German) to plan concrete activity units, i.e. how often and how long users plan to be active to achieve their goal. The two interface variants differed in the arrangement of the elements (single page or multiple pages). We evaluated the interface with regard to the three main Those improvements refer to the specific design of the specific aims presented above. Results were the same for both interface vari- interface elements used in this study. As they are exemplary imple- ants. They show that there is a need of user empowerment and that mentations for interface elements, the revealed limitations do not empowerment can be reached by interface elements that explain affect the gain of knowledge regarding the research question. It can the impact of the recommendation. 
In this case, the second interface be concluded that implementation planning elements in particular element achieved this by converting the goal from an abstract unit and interface elements in general have the potential to empower to a unit, participants are more used to and which is more conceiv- users, support recommendation reflection and foster user interac- able. The third interface element works by further illustrating what tion with the recommendation. is necessary in the daily life to achieve the goal. The study shows, that both of these interface elements can support reflection of the recommendation. Exerting active influence on the recommendation ACKNOWLEDGMENTS was initially stimulated by just providing the opportunity to do so, This work is part of the research project Personal Analytics, funded with the first interface element. Additionally, results show that as a by the Federal Ministry of Education and Research (Bundesmin- consequence of reflection, stimulated by interface elements 2 and isterium für Bildung und Forschung, BMBF), reference number: 3, further manipulation of the recommended goal was fostered. 16SV7110, aquired and headed by Aysegül Dogangün. IntRS Workshop, September 2019, Copenhagen, DK Herrmanny, Löppenberg, and Schwarz REFERENCES Carlos Castillo, Yelena Mejova, and Arnold Bosman (Eds.). ACM Press, New York, [1] Laurence L. Alpay, Olivier Blanson Henkemans, Wilma Otten, Ton A. J. M. New York, USA, 157–161. https://doi.org/10.1145/3079452.3079499 Rövekamp, and Adrie C. M. Dumay. 2010. E-health applications and services [20] Meinald T Thielsch and Morten Moshagen. 2011. Erfassung visueller Ästhetik for patient empowerment: directions for best practices in The Netherlands. mit dem VisAWI. Tagungsband UP11 (2011). Telemedicine journal and e-health : the official journal of the American Telemedicine [21] Bo Xiao and Izak Benbasat. 2007. E-commerce product recommendation agents: Association 16, 7 (2010), 787–791. 
https://doi.org/10.1089/tmj.2009.0156 use, characteristics, and impact. MIS quarterly 31, 1 (2007), 137–209. [2] Aaron Bangor, Philip Kortum, and James Miller. 2009. Determining what indi- vidual SUS scores mean: Adding an adjective rating scale. Journal of usability studies 4, 3 (2009), 114–123. [3] Jesús Bobadilla, Fernando Ortega, Antonio Hernando, and A. Gutiérrez. 2013. Recommender systems survey. Knowledge-Based Systems 46 (2013), 109–132. https://doi.org/10.1016/j.knosys.2013.03.012 [4] John Brooke. 1996. SUS-A quick and dirty usability scale. Usability evaluation in industry 189, 194 (1996), 4–7. [5] Alexander Felfernig, Robin Burke, and Pearl Pu. 2012. Preface to the special issue on user interfaces for recommender systems. User Modeling and User-Adapted Interaction 22, 4-5 (2012), 313–316. https://doi.org/10.1007/s11257-012-9120-5 [6] Marsha E Fonteyn, Benjamin Kuipers, and Susan J Grobe. 1993. A description of think aloud method and protocol analysis. Qualitative health research 3, 4 (1993), 430–441. [7] Juan M. García-Gómez, Isabel de La Torre-Díez, Javier Vicente, Montserrat Robles, Miguel López-Coronado, and Joel J. Rodrigues. 2014. Analysis of mobile health ap- plications for a broad spectrum of consumers: a user experience approach. Health informatics journal 20, 1 (2014), 74–84. https://doi.org/10.1177/1460458213479598 [8] Katja Herrmanny and Aysegül Dogangün. 2017. The Impact of Prediction Uncer- tainty in Recommendations for Health-Related Behavior. In Proceedings of the 2nd International Workshop on Health Recommender Systems co-located with the 11th International Conference on Recommender Systems (RecSys 2017), David El- sweiler, Santiago Hors-Fraile, Bernd Ludwig, Alan Said, Hanna Schäfer, Christoph Trattner, Helma Torkamaan, and André Calero Valdez (Eds.). RWTH, 14–17. http://ceur-ws.org/Vol-1953/healthRecSys17_paper_8.pdf [9] Katja Herrmanny, Jürgen Ziegler, and Aysegül Dogangün. 2016. 
Supporting Users in Setting Effective Goals in Activity Tracking. In Persuasive Technology, Alexander Meschtscherjakov, Boris de Ruyter, Verena Fuchsberger, Martin Murer, and Manfred Tscheligi (Eds.). Springer International Publishing, Cham, 15–26. [10] Jamil Hussain, Wajahat Ali Khan, Muhammad Afzal, Maqbool Hussain, Byeong Ho Kang, and Sungyoung Lee. 2014. Adaptive User Interface and User Experience Based Authoring Tool for Recommendation Systems. In Ubiquitous Computing and Ambient Intelligence. Personalisation and User Adapted Services, Ramón Hervás, Sungyoung Lee, Chris Nugent, and José Bravo (Eds.). Lecture Notes in Computer Science, Vol. 8867. Springer International Publishing, Cham, 136–142. https://doi.org/10.1007/978-3-319-13102-3{_}24 [11] Dosam Hwang, Xuan Hau Pham, and Jason J. Jung. 2011. Preference-based user rate correction process for interactive recommendation systems. In iiWAS 2011, David Taniar, Eric Pardede, Hong-Quang Nguyen, Wenny Rhayu, and Ismail Khall (Eds.). ACM Press, New York, New York, USA, 412. https://doi.org/10.1145/ 2095536.2095619 [12] Dietmar Jannach, Ingrid Nunes, and Michael Jugovac. 2017. Interacting with Recommender Systems. In Proceedings of the 22nd International Conference on Intelligent User Interfaces Companion - IUI ’17 Companion, George A. Papadopou- los, Tsvi Kuflik, Fang Chen, Carlos Duarte, and Wai-Tat Fu (Eds.). ACM Press, New York, New York, USA, 25–27. https://doi.org/10.1145/3030024.3030027 [13] Bart P. Knijnenburg, Martijn C. Willemsen, Zeno Gantner, Hakan Soncu, and Chris Newell. 2012. Explaining the user experience of recommender systems. User Modeling and User-Adapted Interaction 22, 4-5 (2012), 441–504. https://doi. org/10.1007/s11257-011-9118-4 [14] Haridimos Kondylakis, Lefteris Koumakis, Eleni Kazantzaki, Maria Chatzimina, Maria Psaraki, Kostas Marias, and Manolis Tsiknakis. 2015. Patient Empowerment through Personal Medical Recommendations. 
In MEDINFO 2015: eHealth-enabled health, Indra Neil Sarkar, Andrew Georgiou, and Paulo Mazzoncini de Azevedo Marques (Eds.). IOS Press, Amsterdam and Berlin and Tokyo and Waschington, DC, 1117. https://doi.org/10.3233/978-1-61499-564-7-1117 [15] Bettina Laugwitz, Theo Held, and Martin Schrepp. 2008. Construction and Evaluation of a User Experience Questionnaire. In HCI and Usability for Education and Work, Andreas Holzinger (Ed.). Lecture Notes in Computer Science, Vol. 5298. Springer Berlin Heidelberg, Berlin, Heidelberg, 63–76. https://doi.org/10.1007/ 978-3-540-89350-9{_}6 [16] James R. Lewis. 1991. Psychometric evaluation of an after-scenario questionnaire for computer usability studies: the ASQ. ACM Sigchi Bulletin 23, 1 (1991), 78–81. [17] Philipp Mayring. 2010. Qualitative Inhaltsanalyse: Grundlagen und Techniken. Beltz. https://books.google.de/books?id=BJlxSQAACAAJ [18] Pearl Pu and Li Chen. 2007. Trust-inspiring explanation interfaces for recom- mender systems. Knowledge-Based Systems 20, 6 (2007), 542–556. [19] Hanna Schäfer, Santiago Hors-Fraile, Raghav Pavan Karumur, André Calero Valdez, Alan Said, Helma Torkamaan, Tom Ulmer, and Christoph Trattner. 2017. Towards Health (Aware) Recommender Systems. In Proceedings of the 2017 Inter- national Conference on Digital Health - DH ’17, Patty Kostkova, Floriana Grasso,
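
Illustrative note on the questionnaire interpretation. The interpretation rules described in Section 7.3 (UEQ means above .80 count as positive and below -.80 as negative; VisAWI and ASQ means are split at 3.5 on their 1 to 7 scales) can be written down compactly. The following Python sketch is only an illustration and not part of the study's analysis. The function names, the label "neutral" for UEQ values between the two thresholds, and the handling of a mean of exactly 3.5 are assumptions made here; the example values are means reported in Table 3.

```python
def interpret_ueq(mean):
    """Classify a UEQ scale mean using the +/-0.80 thresholds from Section 7.3.1."""
    if mean > 0.80:
        return "positive"
    if mean < -0.80:
        return "negative"
    return "neutral"  # assumed label for the range between the thresholds


def interpret_split(mean, cutoff=3.5, higher_is_better=True):
    """Classify a mean on a 1-to-7 scale by splitting the scale at `cutoff`.

    The cutoff of 3.5 is taken from Sections 7.3.2 and 7.3.3 as stated.
    A mean exactly at the cutoff is reported as "undetermined"; this is an
    assumption, since the text assigns 3.5 to both halves of the scale.
    """
    if mean == cutoff:
        return "undetermined"
    above = mean > cutoff
    return "positive" if above == higher_is_better else "negative"


# Means from Table 3 as (single page, multiple pages).
ueq_attractiveness = (1.58, 1.80)
visawi_total = (5.85, 5.78)  # VisAWI: higher values are better
asq = (3.00, 2.45)           # ASQ: 1 = very positive, 7 = very negative

print([interpret_ueq(m) for m in ueq_attractiveness])
print([interpret_split(m) for m in visawi_total])
print([interpret_split(m, higher_is_better=False) for m in asq])
```

Under these rules, all of the listed means fall on the positive side for both interface variants, in line with the interpretation given in Section 7.3. The `higher_is_better` flag is needed because the ASQ scale runs in the opposite direction to the VisAWI scale.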