Context-Aware Recommendations for Mobile Shopping

Béatrice Lamche, TU München, Boltzmannstr. 3, 85748 Garching, Germany, lamche@in.tum.de
Yannick Rödl, TU München, Boltzmannstr. 3, 85748 Garching, Germany, yannick.roedl@tum.de
Claudius Hauptmann, TU München, Boltzmannstr. 3, 85748 Garching, Germany, hauptmac@in.tum.de
Wolfgang Wörndl, TU München, Boltzmannstr. 3, 85748 Garching, Germany, woerndl@in.tum.de

Copyright held by the author(s). LocalRec'15, September 19, 2015, Vienna, Austria.

ABSTRACT
This paper presents a context-aware mobile shopping recommender system. A critique-based baseline recommender system is enhanced by the integration of context conditions like weather, time, temperature and the user's company. These context conditions are embedded into the recommendation algorithm via pre- and post-filtering. A nearest neighbor algorithm, using the concept of an average selection context, calculates how contextually relevant a recommendation is. Out of 20 clothing items from the hybrid recommendation algorithm, context-aware post-filtering searches for the nine best-fitting items. The resulting context-aware recommender system is evaluated in a user study with 100 test participants. The answers of the user study show that the recommendations were perceived as being better than the recommendations of a non-context-aware recommender system.

Categories and Subject Descriptors
H.4.2 [Information Systems Applications]: Types of Systems - Decision support

General Terms
Design, Experimentation, Human Factors.

Keywords
context-awareness, mobile recommender systems, location-based services, user interaction, critiquing, mobile shopping

1. INTRODUCTION
Context-aware recommender systems (CARS) are systems utilizing the user's context such as the user's position, weather or social environment to recommend items. A context-aware recommender system could for example recommend the “Albertina” museum rather than visiting the “Prater” amusement park if the user spends a rainy day in Vienna. This paper evaluates which kind of context information is relevant in a mobile shopping recommender system and how this information could be utilized to improve recommendations of clothing items in a context-aware recommender system. By integrating contextual mobile information into the recommendations it is expected that the recommended items better fit the customer's needs and therefore customers are more satisfied with the recommender system.

The paper is organized as follows. We first start off with some definitions relevant for context-aware recommender systems and summarize related work. The next section defines the context factors and describes the system's overall design. The user study evaluating the developed system is discussed in Section 4. The paper concludes by summarizing its results and giving an outlook on future research topics.

2. BACKGROUND AND RELATED WORK
A widely used definition in the area of context-aware applications is the definition by Dey:

“Context is any information that can be used to characterize the situation of an entity. An entity is a person, place, or object that is considered relevant to the interaction between a user and an application, including the user and applications themselves” [6, p. 5].

Dey defines context as relevant information for an interaction between a user and an application. Therefore, if the context of an entity shall be defined, it is necessary to ask which information is relevant to the situation.

Context-aware recommender systems (CARS) integrate context into the recommendation process. This process can be described by the following three-dimensional recommendation function [2]:

$$R : \mathit{User} \times \mathit{Item} \times \mathit{Context} \rightarrow \mathit{Rating} \tag{1}$$

The rating function (R) considers the Context (which is defined by all the different Context Factors) and recommends
items of the item set (Item) to a user by predicting the rating that this user would give to an item. Context complicates the recommendation process as items can be rated in different contexts. An umbrella for example can be rated very highly in good weather conditions, due to the fact that it looks nice or is small. However, if it was raining the same umbrella could get a bad rating, due to the fact that it breaks at the slightest wind. So the context in the rating function brings additional complexity as the recommendation algorithm does not only have to match users with items, but also with the context.

Adomavicius and Tuzhilin [2] identified three different points in the recommendation process where context might be incorporated into the process:

1. Contextual Modeling - the recommendation algorithm is altered such that it includes the context and already considers it when calculating recommendations

2. Contextual Pre-Filtering - the current context is used to select only the most relevant data from the dataset

3. Contextual Post-Filtering - the context information is ignored during the recommendation process, only the resulting set is contextualized

All of these approaches have their specific strengths and weaknesses. However, it is also possible to combine multiple context-based algorithms.

Since the consideration of context can enhance the usefulness of a recommendation for a user, CARS have recently been receiving a lot of attention [1]. For instance, Anand and Mobasher [3] define a recommendation process that integrates context. They distinguish between a user's short term (STM) and long term memories (LTM). Contextual cues are used to retrieve relevant preference models from LTM that belong to the same context as the current interaction. This information is merged with the current preference model stored in STM for generating context-aware recommendations [3]. However, the proposed framework is very general and does not emphasize how it can be applied in a mobile scenario, where the context is different.

Baltrunas et al. [4] investigated the relationship between contextual factors and item ratings in a tourist scenario. The authors developed a web tool for acquiring subjective ratings regarding points of interest in a mobile scenario within a specific context. First, users were asked if a specific context factor (e.g. winter season) has a positive or negative influence on the rating of a particular item. Second, users were asked to rate example contexts and recommendations. The more influential a context factor seemed to be (according to the results of the first step), the more contextual conditions specifying this factor were generated. These imagined ratings could be used as initial ratings in the database, such that the cold start problem is minimized. Based on these results, a predictive model that can be trained offline was developed. Results show that influencing context factors for points of interest are, inter alia, distance, season, weather, time, mood and companion [4]. This methodology seems to be a very promising approach to acquire contextual ratings. However, ratings were only acquired for a travel planning recommender system and the generated ratings of this work cannot be directly applied to a mobile shopping scenario.

Research has also been done on automatically predicting the user's context. For example in [5], a mobile leisure recommender system was developed. It uses time, location, as well as personal data, such as calendar appointments, viewed documents and messages, to infer the user's current activity so that the user is not required to explicitly define her profile or preferences. The recommendations include stores, restaurants, parks and movies. However, up till now, the techniques for automatic context detection are often unreliable and immature and require further research [9]. We therefore decided to come up with a solution that takes the users' explicitly stated preferences into account.

I'm feeling LoCo [10] is a ubiquitous mobile recommender system that recommends places near the user's current location, e.g. restaurants and museums. Physical context such as the user's current transportation mode and location is automatically detected. This physical information is used for a first filtering step: the user's mode of transportation and location influence the radius within which place recommendations are considered. Moreover, the user's mood influences the recommendations: foursquare (a social network app to save and share visited places with friends [footnote 1]) assigns each place to a category, which is mapped by the authors to a particular feeling (e.g. the system recommends events related to Arts & Entertainment when the user feels “artsy”). As soon as the user states a mood, places assigned to the category to which the feeling is mapped are recommended. The system is based on text classification. It considers the tags and categories associated with a place the user has visited. The user model is therefore a document, which holds all the names, categories and tags associated with a visited place. A conducted user study shows that I'm feeling LoCo enhances the user experience and that the recommended places were overall satisfying [10]. This mood-based approach is in particular reasonable if a recommender system is aimed to suggest different types of leisure activities since the user's mood might highly influence the current preferences. However, we consider the relevance of the user's mood as low in our mobile shopping scenario.

So far, no research exists that analyzed all the contextual factors that might be useful when recommending clothing items from different stores for mobile shoppers and investigated how such a recommender system can be constructed and is perceived. Such an application could help the user detect new (formerly unknown) brands or stores and find clothes matching the user's fashion style. Compared to the items in existing mobile recommender systems, clothing items differ in that they change frequently. Such a recommender system has to be frequently trained or be able to provide good recommendations on a sparse dataset. We therefore first acquire the relevant context factors in a mobile shopping scenario and then come up with a promising approach to integrating this context into the recommendation process.

[Footnote 1] https://foursquare.com

3. DESIGNING THE PROTOTYPE
We imagine a system that uses the user's mobile context to recommend clothing items available in shops close to the user's position. However, as in our previously developed baseline system (see Section 3.1), the new approach should still allow critiquing of items. As described in Section 2, context can be integrated into the recommender system in three different ways: contextual pre-filtering, contextual post-filtering and contextual modeling. In this work, two approaches (contextual pre-filtering and contextual post-filtering) are combined to improve the recommendations (see Figure 1). Pre-filtering (Section 3.2) is used to determine which items of the case base are relevant to the user. Relevance for example depends on the distance the user accepts to travel, or the opening hours of a shop. Post-filtering (Section 3.4) is used to filter the items that shall be recommended according to their adequacy to the current context by using a nearest neighbor algorithm. In order to build a database of contextually tagged items, a pre-study was executed asking users to classify items according to contexts (Section 3.3). This data ensures that some items already are contextually tagged, which is needed for the post-filtering of the recommendations. The user interface and interaction design of our CARS is described in Section 3.5.

Figure 1: The context-aware shopping recommender process

Figure 2: User Interaction Design. (a) Recommendations in CARS (b) Critiquing View in CARS

Figure 3: Detailed Information View. (a) Item view in CARS (b) Map view in both systems

3.1 The Baseline
The system presented in [7] forms the baseline for our CARS. It was developed for the Android platform and incorporates an active learning algorithm. The user interfaces of the baseline system are very similar to our developed CARS and can be seen in Figures 2 and 3. The active learning algorithm, called adaptive selection, is a critique-based recommendation algorithm. The algorithm uses a two-fold strategy: on a positive critique of an item (touching the thumbs up symbol) it shows items that more closely match the critiqued item. On a negative critique (thumbs down symbol) more diverse items are shown. In both cases, the recommendation algorithm uses a k-nearest neighbor algorithm to find the k items that best fit the current requirements. In this case k is set to nine, meaning that in each cycle, the user is shown nine different recommendations. In the following screen, the user selects which of the properties (color, brand, price, type) of the item shall be critiqued. The recommender system then shows more or fewer items (depending on the critique) of the selected feature(s). By touching an item's picture in the recommendations view, the system displays a result screen, where the user can select the item. The application also shows the immediate surroundings of the user in a map (see Figure 3). The system described in this section without context-awareness is used as a baseline for testing the context-aware recommender system. Furthermore we have made some adjustments to this content-based recommender system due to the changed dataset and performance problems.

3.2 Contextual Pre-Filtering
In the contextual pre-filtering step, we make sure that only relevant data is loaded into the recommender system. Therefore, the context factors distance to shop, shop crowdedness, shop opening hours and item in stock are used to restrict the case base and avoid unnecessary search in items the user does not want to see. The user may state preferences for each of these context factors. The user might state a different distance to the shops or that she wants to see crowded places as well.

The case base is filtered in four steps. First, all shops that are not within the specified distance are excluded, then shops that are not open at the specified time and shops that do not match the crowdedness criterion. Finally, it is verified that the item is in stock. After pre-filtering the items based on these conditions, it is verified that at least 300 items are available in the case base, as our tests showed that this is the minimum amount of data to adequately react to the user's preferences. However, if there are not enough items available in the case base, these conditions are relaxed and the user is notified about this step.

3.3 Acquisition of Context Relevance
Before being able to recommend items based on context, the relevant context has to be defined. A promising approach to assess the context relevance for a tourism scenario is presented by Baltrunas et al. [4] and is therefore adapted for our shopping scenario (see also Section 2). Using this methodology we assess the following context factors as relevant for our context-aware mobile shopping recommender system: time of the day, day of the week, temperature, weather, company, distance to shop, crowdedness, shop opening hours and item is in stock. In order to acquire contextual ratings, a convenience sample of the target population was asked to specify which items they are likely to buy in a specified context. We developed a simple Java tool (Figure 4) which shows nine pictures and descriptions of clothing items. The testers could specify if they would consider buying the product depending on a randomly selected company, temperature or weather, which is specified on the right side of the tool. Overall 747 contextual ratings for 674 different items were created by six users. This data forms the basis for the decision generation in the contextual post-filtering algorithm.

Figure 4: Tool for elicitation of item preferences in contexts

3.4 Contextual Post-Filtering
Based on the pre-filtered item set the critique-based recommender selects 20 items. Out of these 20 items only nine are actually displayed. Therefore, the contextual post-filtering algorithm (illustrated in Algorithm 1) has to eliminate eleven items in each cycle. The context factors time of the day, day of the week, company, temperature and weather are used to post-filter the recommendations. For this purpose, we use a k-nearest neighbor method because this technique has proven to be adequate in different CARS. The most important component in nearest neighbor algorithms is the distance metric used. In our approach, the user is not able to rate an item within a given context, but only to select it (and therefore implicitly rate it as good). Based on this consideration, we came up with a distance metric that defines an average context in which an item is selected.

The average context specifies in which context an item is selected. If an item was not selected in any context, it can be assumed that this item is neither liked by a lot of users nor liked in a specific context, and it can therefore receive a higher distance to the current context. Popular items, which are selected in many different contexts, will receive a distance which is close to 0.5. However, as they are very popular, they should not receive a high distance and therefore their distance is reduced by a defined percentage of their distance.

$$\mathrm{avgContextDist}(c, i) = \frac{\sum_{b \in i_a} w_{i,b} \cdot \mathrm{dist}(c_f, b)}{N(i_a)} \cdot \frac{N(f)}{\sum_{j \in i_j} w_j} \tag{2}$$

Equation 2 defines the distance metric. It calculates the distance between an item's ($i$) average context (in which the item is selected) and the current context ($c$). The first quotient calculates the average distance to the current context whereas the second quotient normalizes the distances.

The set of all context conditions in which an item has been chosen is defined by $i_a$. An individual context condition in which an item has been chosen is defined by $b$. For each clothing type, the context factors are of different importance. Hence, different weights ($w_{i,b}$) can be assigned to context conditions. We assigned the weights for each clothing type based on a previously conducted experiment [11]. The distance function $\mathrm{dist}(c_f, b)$ (Equation 3) calculates the distance between the current context condition $c_f$ ($f$ stands for the context factor) and a context condition $b$ in which the item was chosen. For improved readability the variables were renamed to $x$ and $y$ in Equation 3. The number of context conditions in which an item has been chosen is defined by $N(i_a)$. In this work $N(i_a)$ always is a multiple of five (the number of context factors) as we assume all context conditions to be set in our artificial environment.

In order to make different items (with different overall weights) comparable, we normalize the distance between zero and one by multiplying with the second quotient of the function. Here $N(f)$ defines the number of context factors (five) we use for post-filtering. The number of context factors is divided by the sum of weights of all context factors ($w_j$) for the specific item ($j \in i_j$).

$$\mathrm{dist}(x, y) = \begin{cases} \mathrm{graphDistance}(x, y) & \text{if } y \text{ is nominal} \\ \dfrac{|x - y|}{\mathrm{range}_y} & \text{otherwise} \end{cases} \tag{3}$$

If the context factor is ordinal, interval or ratio-scaled, the distances are calculated based on the Euclidean distance. Otherwise the graphDistance, a pre-defined distance for nominal attributes, is used. This graphDistance is similar to the distance used by Lee and Lee [8]. The context factors weather and company use this graphDistance and define an undirected graph with distances between all context conditions (e.g. the weather conditions Sunny and Rainy have a higher distance than Sunny and Cloudy). The assigned distances are used as an input for the distance method. For the context factor time of the day we use a cycle, as the afternoon ends with the night, whereas the night is the first part of the day. For all other conditions it is expected that the Euclidean distance provides good results.

Although we want to achieve a low item frequency, we consider very popular items as being interesting for the user, especially in a shopping scenario. Therefore, we alter the resulting distance (avgContextDist(c, i)) if the item was selected in more than 30 % of all contexts: the item's distance is reduced by 20 % so that it is more likely to be displayed to the user. Every item that is not selected in any context receives a distance of 0.51. We came up with this value because it is the average distance at the second tertile when considering all distances of items rated in a specific context to a randomly selected context. This ensures that items which have not been rated within a specific context in our pre-study (see Section 3.3) are more likely to be presented to the user than items that were considered as being uninteresting in that specific context. The whole algorithm for contextual post-filtering is presented as Algorithm 1.
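The distance metric and the adjustments described in this section can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions, not the implementation used in our prototype: the graph distances, the temperature range, the data layout (one dictionary per selection context) and all function names are hypothetical, the cyclic treatment of time of the day is omitted, and "more than 30 % of all contexts" is read here as the item's number of selection contexts relative to a given total. Only the constants 0.51, 30 % and 20 % are taken from the text.

```python
# Constants taken from the text of this section.
DEFAULT_DISTANCE = 0.51     # items never selected in any context
POPULARITY_SHARE = 0.30     # "more than 30 % of all contexts"
POPULARITY_DISCOUNT = 0.20  # distance reduced by 20 %

# Hypothetical graph distances for nominal conditions (invented values).
GRAPH_DISTANCE = {
    frozenset(("sunny", "cloudy")): 0.3,
    frozenset(("sunny", "rainy")): 1.0,
    frozenset(("cloudy", "rainy")): 0.6,
    frozenset(("friend", "family")): 0.4,
}
# Hypothetical value ranges for numeric factors (range_y in Equation 3).
RANGES = {"temperature": 40.0}


def dist(factor, x, y):
    """Equation 3: graph distance if y is nominal, |x - y| / range otherwise."""
    if isinstance(y, str):
        if x == y:
            return 0.0
        # Assumption: unknown pairs are treated as maximally distant.
        return GRAPH_DISTANCE.get(frozenset((x, y)), 1.0)
    return abs(x - y) / RANGES[factor]


def avg_context_dist(current, selections, weights):
    """Equation 2 for a single item.

    current    -- dict: factor -> current condition c_f
    selections -- list of dicts (factor -> condition b); all conditions
                  together form the set i_a
    weights    -- dict: factor -> weight for the item's clothing type
    Returns None if the item was never selected.
    """
    if not selections:
        return None
    n_conditions = sum(len(s) for s in selections)              # N(i_a)
    weighted_sum = sum(weights[f] * dist(f, current[f], b)
                       for s in selections for f, b in s.items())
    n_factors = len(weights)                                     # N(f)
    return (weighted_sum / n_conditions) * (n_factors / sum(weights.values()))


def context_post_filter(items, current, weights, total_contexts, k=9):
    """Keep the k items whose average selection context is closest."""
    distances = {}
    for item_id, selections in items.items():
        d = avg_context_dist(current, selections, weights)
        if d is None:
            d = DEFAULT_DISTANCE
        elif len(selections) > POPULARITY_SHARE * total_contexts:
            d *= 1.0 - POPULARITY_DISCOUNT
        distances[item_id] = d
    # sorted() is stable, so ties keep their input order in this sketch;
    # the text instead breaks ties with the baseline similarity measure.
    return sorted(distances, key=distances.get)[:k]
```

With all weights equal, the second quotient of Equation 2 becomes one and the function returns the plain mean of the per-condition distances, which is why the result stays between zero and one.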
Algorithm 1 Post-filtering by current and item context

    procedure ContextPostFilter(items, context, k)
        for all item in items do
            avgContextDistance(context, item)
            if itemDistance == null then
                setDefaultDistance(item)
            end if
        end for
        decreaseDistanceForPopularItems(items)
        return kNearestNeighbors(items, k)
    end procedure

The algorithm's disadvantage is that it weights each factor independently without taking into consideration possible connections between the individual context factors. For example the connection of rain and being with a friend might be more different from rain and being with the family, than the individual distances between being with the family and being with a friend. This detection of dependencies could be done by decision trees or other machine learning techniques. Nevertheless, we expect that the algorithm provides reasonable recommendations for the user's current context without these dependencies. The algorithm calculates the context distances in less than 100 ms on a Samsung Galaxy S3 mini for 20 items with the items being set in (overall) 200 different contexts. It allows weighting of context factors for each clothing type separately and distances for nominal attributes. The method kNearestNeighbors(items, k) sorts the items by their distance to the current context. In case of any ties it uses the similarity measure that has already been applied in our baseline system (Section 3.1).

3.5 Navigation and Interface Design
When starting the application, the user is asked to set the following context conditions manually: preferred distance to the shop, opening hours, temperature, weather and company. Moreover the user can specify if she wants to exclude items that are not in stock and shops that are too crowded. The conditions for time of the day and day of the week are not captured, as it is expected that the users are aware of these conditions subconsciously. The context determination interfaces can be seen in Figure 5.

Figure 5: Explicit context determination via questionnaire

Figure 2a shows an example of a calculated set of recommendations. With the thumbs up or thumbs down button the user is able to critique the item's attributes such as price, brand, clothing type and color (see Figure 2b). Besides this critiquing possibility, the user is able to see some explanations such as why the particular item is recommended. By clicking on an item's picture, the user gets to another screen with more detailed information about the item and the store. Here, the user can finally select the item (see Figure 3a). This information should enhance the trust the user has in the recommendations as she can check whether the initial preferences (about distances to shop, crowdedness, etc.) were incorporated. Moreover, we implemented a map showing all available shops. On click of a shop we show the shop's opening hours, the crowdedness, the name, the distance to the current position and how many items (out of the current recommendations) are available at this shop (see Figure 3b).

4. USER STUDY
The user study was designed in order to test the differences in user perceptions between the baseline application and the context-aware recommender system. We want to find out whether the users perceive a difference in the accuracy of recommendations. A second goal of the study is to find out whether users are more satisfied with a recommender system that takes the mobile context into account. Therefore, the goal of the user study is to evaluate if the following hypotheses are true:

Hypothesis 1: The integration of context-awareness leads to better perceived accuracy compared to non-context-aware recommendations.

Hypothesis 2: The integration of context-awareness improves the overall user satisfaction.

Hypothesis 1 is tested by comparing the ratings of recommendations in a context-aware system and a baseline system. The users should rate how they perceived the recommendations on a seven-point Likert scale. Hypothesis 2 shall test whether the users are more likely to use, reuse or recommend the application. This is an indication of how well the system adapts to the users and how satisfied they are.

4.1 Setup
The user study is designed as a supervised within-subjects user survey to minimize the number of survey participants and improve the comparability between the applications. Each user tests both applications (the baseline system and the CARS) and answers a questionnaire afterwards. Which system is tested first is flipped in between subjects so that a bias because of learning effects could be reduced. The participants are asked to imagine being in the scenario the tool generated for them, whereby the location is always Munich. The participant's task is to find one item only, which they would like to try on. As soon as the users have found a suitable item, they are asked to select it, such that they can finish the test and answer the corresponding questionnaire.

The target population of this application are young smartphone users that like to go shopping. In the user survey qualitative and quantitative data are collected. Qualitative data is measured via a questionnaire. It mainly consists of statements the user should assess on a seven-point Likert scale (from 1 - strongly agree, to 7 - strongly disagree), e.g. how satisfied the user is with the recommendations and the application in general. The quantitative data is directly measured within the application and includes the number of critiquing cycles, the time between viewing the first recommendations and selecting an item, and the item diversity. Before the user starts using the application, a scenario describing the user's location, weather and company is generated for her (see Figure 6).

Figure 6: Tool to generate a user's scenario

The participants are asked to actively select their context in the application and imagine it. This scenario is visually displayed to the users throughout the whole survey on a computer screen directly in front of them. The context conditions not mentioned in the scenario description, such as the crowdedness, can be selected by the user based on her own preferences.

The dataset used to test the application includes 5157 randomly selected fashion items that were extracted from the Zalando API [footnote 2] of their UK store in February and March 2015. Since our dataset is artificial, we distributed the items equally across all 129 shops and made realistic assumptions for our shops. The shops' opening hours were set to realistic values with moderate modifications to have some differences in the dataset. The crowdedness was set randomly with a probability of 20 % and an item is in stock with a probability of 90 %.

[Footnote 2] https://www.zalando.co.uk

4.2 Results
All in all 100 participants (48 females, 52 males), between the ages of 17 and 30, took part in the user study. The answers to the Likert statements (from 1 - strongly agree, to 7 - strongly disagree) in this work either followed a positively or negatively skewed distribution and are ordinal scaled instead of interval scaled. Therefore, a two-tailed paired Wilcoxon signed rank test is executed, rather than a paired t-test, to detect whether there are any significant differences between the distributions. The results of the two-sided tests are reported by stating a V and a p value. The V is the sum of ranks assigned to differences with a positive sign. Therefore, a higher V stands for higher differences in the users' decisions. The p value defines how significant the results are. In general we evaluate whether the null hypothesis is likely to be true. The means, as well as the V and p values of the most important metrics of the two systems are shown in Table 1.

Table 1: The means of some important measured values comparing both variations of the system.

    Measure                       BASE mean   CARS mean   p value   V value
    Perceived accuracy            2.71        2.34        <.01      1807
    Perceived context-awareness   2.82        2.66        .54       1346
    Intention to return           3.06        2.64        <.01      1563
    Time                          179 s       182 s       .45       2302.5
    Cycles                        7.34        6.1         .11       2393.5
    Item frequency                4.39        3.62        <.01      285253.5

In order to test the user's perceived prediction accuracy, we asked if the recommended products fitted the individual preferences. The baseline application's mean is 2.71 whereas the CARS mean is 2.34 (Median = 2 for both systems). The Wilcoxon signed rank test reveals that the recommendations of the CARS fitted the user's preferences significantly better than the baseline's recommendations (V = 1807, p < .01).

The context-awareness of the applications is evaluated by asking whether the products were in line with the provided scenario. The baseline application's mean is 2.82 whereas the CARS mean is 2.66 (Median = 2 for both applications). The Wilcoxon signed rank test shows V = 1346, p = .54. This means that the users did not perceive any of the systems as being more context-aware than the other.

When asking the users whether they are likely to use the application again, the users stated that they are significantly more likely to use the CARS (Median = 2, Mean = 2.64) again than the baseline (Median = 3, Mean = 3.06) application (V = 1563, p < .01).

The maximum time needed to find an item in the baseline application was 867 seconds (Median = 142 s, Mean = 179 s) and in the CARS application 697 seconds (Median = 149 s, Mean = 182 s). The time needed to select an item is not significantly different between the applications (V = 2302.5, p = .45).

Another measure for the effectiveness of the recommendation algorithm is the number of critiquing cycles until an item was selected. Participants completed their task in on average 1.24 fewer cycles using CARS (Median = 5, Mean = 6.1 with CARS; Median = 5, Mean = 7.34 with the baseline system). Again a Wilcoxon signed rank test was executed (V = 2393.5, p = .11). However, the result is not significant, meaning that the null hypothesis cannot be rejected.

One of the goals of the CARS was to reduce the number of times an individual item is shown (item frequency) and thus increase the number of different items (item coverage). All in all the baseline application showed 7506 (1690 different; 22.5 % unique) and the CARS 6390 (1754 different; 27.4 % unique) items. We measured every time that an item was displayed to any user. The maximum number of times an item was shown was 115 (Median = 3, Mean = 4.392) for the baseline application and 53 (Median = 2, Mean = 3.622) for the CARS. A Wilcoxon signed rank test reveals that there is a significant difference between the samples (V = 285253.5, p < .01), meaning that the CARS showed items significantly less frequently than the baseline. Although the CARS showed fewer items overall, more different items have been shown. This indicates that the recommended items have been more diverse.

Overall, 59 participants reported that they prefer the context-aware application (CARS). This is significantly more compared to a random distribution of answers, as a chi-squared test reveals ($\chi^2 = 30.38$, with 2 df [degrees of freedom], p < .001).

The test participants found that the CARS recommendations fitted their preferences significantly better. Therefore, hypothesis 1, that the recommendations by a context-aware system are perceived as better, is retained. Hypothesis 2, that the overall user satisfaction is improved, can also be retained to a certain degree as users were more satisfied with the CARS. The results might be less significant than expected as only six users rated items in context as an initial dataset. However, we wanted the dataset to be sparse as there are frequent changes to fashion collections.

5. CONCLUSION AND FUTURE WORK
In this work, a context-aware recommender system was developed and evaluated in a mobile shopping scenario. Our CARS is based on an active learning algorithm and uses a nearest neighbor algorithm. Compared to a system without context-awareness, the recommendations were perceived as significantly better in the CARS. Interestingly, the users did not attribute the better recommendation quality to the more context-aware recommendations but to better adaptability to their preferences and their clothing style, although the only difference from an algorithmic perspective is the context-awareness. It should be investigated in more detail whether context-awareness is only perceived subconsciously.

The next step for this application would be to test it in an online experiment where real context-aware information is elicited. In a first approach the clothing data of some selected retailers would be enough to test this application online. In the future, we plan to conduct a user study where real context-aware information is elicited. Still a major challenge for context-aware applications is to acquire context- [...] survey also imagined these contexts, we expect no significant differences between the classification of the items and the imagined scenario in the user study. This approach might help in narrowing down the problem of acquiring relevant context data as a quick start for a context-aware application. However, it has to be evaluated how closely real contextual ratings can be estimated with this method. In order to adapt the existing approaches of estimating a rating to a yes or no decision we had to develop the concept of an average context in which an item is selected. We believe that every context-aware recommender system relying on yes or no decisions might benefit from adapting its context incorporation by using our approach. We also aim to find out whether the results of this work can be transferred to other application scenarios, such as grocery shopping or leisure activity recommendation systems.

6. REFERENCES
[1] G. Adomavicius, L. Baltrunas, E. W. De Luca, T. Hussein, and A. Tuzhilin. 4th workshop on context-aware recommender systems (CARS 2012). In RecSys, pages 349–350, 2012.
[2] G. Adomavicius and A. Tuzhilin. Context-aware recommender systems. In F. Ricci, L. Rokach, B. Shapira, and P. B. Kantor, editors, Recommender Systems Handbook, pages 217–253. Springer, 2011.
[3] S. S. Anand and B. Mobasher. Contextual recommendation. Lecture Notes in Artificial Intelligence, 4737:142–160, 2007.
[4] L. Baltrunas, B. Ludwig, S. Peer, and F. Ricci. Context Relevance Assessment and Exploitation in Mobile Recommender Systems. Personal Ubiquitous Comput., 16(5):507–526, June 2012.
[5] V. Bellotti et al. Activity-Based Serendipitous Recommendations with the Magitti Mobile Leisure Guide. Proceeding of the twenty-sixth annual CHI conference on Human factors in computing systems - CHI '08, pages 1157–1166, 2008.
[6] A. K. Dey. Understanding and using context. Personal and Ubiquitous Computing, 5:4–7, 2001.
[7] B. Lamche, U. Trottmann, and W. Wörndl. Active learning strategies for exploratory mobile recommender systems. ACM International Conference Proceeding Series, pages 10–17, 2014.
[8] J. S. Lee and J. C. Lee. Context awareness by case-based reasoning in a music recommendation system. Lecture Notes in Computer Science, 4836:45–58, 2007.
[9] F. Ricci. Mobile recommender systems. Information Technology & Tourism, 12(3):205–231, 2010.
[10] S. Savage, M. Baranski, N. E. Chavez, and T. Hollerer. I'm feeling loco: A location based context aware recommendation system. In Advances in Location-Based Services: 8th International Symposium on Location-Based Services, Lecture Notes in Geoinformation and Cartography. Springer, Vienna, Austria, 2011.
[11] W. Wörndl and B. Lamche. User interaction with context-aware recommender systems on smartphones.
aware data to train or tweak a context-aware algorithm. For In icom, volume 14, pages 19–28, 2015. this user study, selected users classified the contexts in which they would try the clothes on. As the users in the user sur-