A Hybrid Strategy for Privacy-Preserving Recommendations for Mobile Shopping

Toon De Pessemier, iMinds-WiCa-Ghent University, G. Crommenlaan 8 box 201, B-9050 Ghent, Belgium, toon.depessemier@ugent.be
Kris Vanhecke, iMinds-WiCa-Ghent University, G. Crommenlaan 8 box 201, B-9050 Ghent, Belgium, kris.vanhecke@ugent.be
Luc Martens, iMinds-WiCa-Ghent University, G. Crommenlaan 8 box 201, B-9050 Ghent, Belgium, luc1.martens@ugent.be

Copyright 2014 for the individual papers by the paper's authors. Copying permitted for private and academic purposes. This volume is published and copyrighted by its editors. CBRecSys 2014, October 6, 2014, Silicon Valley, CA, USA.

ABSTRACT

To calculate recommendations, recommender systems collect and store huge amounts of users' personal data such as preferences, interaction behavior, or demographic information. If these data are used for other purposes or get into the wrong hands, the privacy of the users can be compromised. Thus, service providers are confronted with the challenge of offering accurate recommendations without the risk of dissemination of sensitive information. This paper presents a hybrid strategy combining collaborative filtering and content-based techniques for mobile shopping with the primary aim of preserving the customer's privacy. Detailed information about the customer, such as the shopping history, is securely stored on the customer's smartphone and locally processed by a content-based recommender. Data of individual shopping sessions, which are sent to the store backend for product association and comparison with similar customers, are unlinkable and anonymous. No uniquely identifying information of the customer is revealed, making it impossible to associate successive shopping sessions at the store backend. Optionally, the customer can disclose demographic data and a rudimentary explicit profile for further personalization.

Categories and Subject Descriptors

H.3.3 [Information Search and Retrieval]: Information Filtering; K.4.1 [Computers and Society]: Public Policy Issues - Privacy

Keywords

Recommender System, Shopping Assistant, Privacy, Mobile

1. INTRODUCTION

Data gathering and analysis, one of the fundamentals of traditional recommender systems, is a serious concern for many increasingly privacy-aware users. A data collector may disclose personal information to untrusted parties. This could occur either on purpose, i.e., by selling the personal information to a third party, or involuntarily through a security breach. As a result, users are becoming apprehensive about using applications or services that collect personal data. For shopping applications, for example, many customers are already concerned with the data collection practices related to loyalty programs [8, 11]. This can be exacerbated when the loyalty program is an application running on the customer's own smartphone. These devices contain a large amount of personal data such as the customer's phone number, e-mail address, and social networking account details.

Despite these privacy concerns, Mobile Shopping Assistants (MSAs) are becoming increasingly popular due to the benefits they offer to both customers and retailers. An MSA can enhance the shopping experience by incorporating features such as loyalty programs, discount vouchers, easy checkout, and various personalized services. The applications are easy and inexpensive to roll out because they can run on the customer's own smartphone. The retailer does not have to invest in specialized hardware, and many customers are already familiar with smartphones and the concept of mobile apps. To address these privacy concerns, Put et al. [9] have created inShopnito, a transparent, privacy-preserving MSA that still offers all the features that customers and retailers have come to expect, including a rudimentary recommender system. In this paper, we have extended the MSA with an advanced, hybrid recommendation strategy. Section 3 describes this contribution in detail. As the security and privacy-enhancing technologies used for (anonymous) authentication and transactions have already been described [9], Section 2 of this paper provides only a brief overview of the functionality and the implications of privacy-preserving measures on recommendations.

2. PRIVACY-PRESERVING MOBILE SHOPPING

Preserving the customer's privacy during the usage of inShopnito is of primary importance, which has significant implications for the recommender. At registration time, the customer is issued an Idemix [5] anonymous credential containing attributes with personal information such as name, zip code, or gender. When the customer enters the store, the inShopnito MSA uses the credential to initiate a new shopping session on the store backend system. The customer chooses which attributes (name, zip code, gender, explicit profile) to disclose during this authentication phase. These different levels of privacy give the customer the necessary flexibility in the trade-off between privacy and personalization. The backend system knows that the customer has a valid credential, and it knows the content of the attributes that the customer opted to disclose. However, it does not know which particular customer it is dealing with, because no uniquely identifying information is disclosed during authentication. This also means that a customer's successive shopping sessions cannot be tied together.

The customer can now proceed to scan products using the camera of her smartphone and add them to the inShopnito shopping cart. During checkout, inShopnito can be used to redeem loyalty points and vouchers to get a discount. To provide the customer with a complete overview of her shopping history, the MSA stores this information securely on the smartphone, where it can be used for recommendation purposes. In Section 3, we expand on the recommender components of the inShopnito MSA and backend, and propose a practical solution that preserves the privacy of the customer while still offering advanced personalized services.

Recommender systems initially face the cold start problem, because nothing is known about the user [7]. Usually, a user's actions can be tracked over time. As more information about the user becomes available, the quality of the recommendations increases. With inShopnito, each shopping session is associated with a different, anonymous user identifier. Thus, a server-side recommender system will always have to address the cold start problem, whether it is the customer's first store visit or her hundredth. Client-side recommenders pose their own set of challenges [3].

Related research [4] into privacy-preserving, personalized ad delivery has proposed a coarse-grained filtering of ads based on the personal information that customers choose to disclose. Subsequently, a further filtering can be performed at client side based on the purchase details stored on the customer's smartphone. Compared to existing solutions, the hybrid recommender strategy of inShopnito goes further than a filtering of information, by analyzing individual shopping carts and comparing them with the purchases of similar customers at the backend.

3. HYBRID RECOMMENDATIONS

The hybrid strategy combines five recommendation approaches. For each approach, preserving the customer's privacy is of crucial importance. Figure 1 provides a schematic overview of these approaches and the data they use.

[Figure 1 labels: Customer; Explicit Profile; Shopping Cart; Detailed Profile; Shopping History; Product Catalog; Subset of Product Data; Personalized Vouchers; Associated Products; Serendipitous Products; Explicit Profile Recommendation; Shopping Cart Recommendation; Similar Customer Recommendation; Content-Based Recommendation; Purchase Pattern Recommendation.]

Figure 1: Schematic overview of the different recommendation techniques.

3.1 Explicit Profile Recommendations

Through an explicit profile on her smartphone, the customer can specify her preferred product categories, as shown in Figure 2(a). These categories, grouping individual products that are typically located in the same section of the store, allow customers to quickly express their interests and filter out irrelevant product groups. Product categories such as pet supplies, garden tools, car/motorcycle supplies, toys, or baby products are not relevant for every customer. This explicit profile is created automatically based on the customer's purchases, but can be altered at her own discretion.

Although the explicit profile contains only category preferences and no details regarding individual products, disclosing the explicit profile is an optional feature for privacy reasons. In addition, customers can opt to disclose some demographic data such as age, municipality, and gender, in order to further filter product categories such as shaving products, make-up, personal care products, etc.

If the customer opts to disclose (parts of) her explicit profile and demographic data, this information is sent to the store backend and used to personalize the coupons and vouchers she receives. These targeted vouchers have benefits for the retailers (the vouchers are more effective) as well as for the customers (more relevant vouchers are offered). For privacy reasons, these explicit profiles are only used during the current session and removed from the store backend after issuing the vouchers.

3.2 Shopping Cart Recommendations

During every shopping session, the content of the shopping cart is sent to the store backend for analysis. For privacy reasons, no uniquely identifying reference to the customer is stored at server side. Only the content of the individual shopping carts, together with the date of the purchase, is stored. The date provides useful information regarding trends in purchasing behavior, or seasonal products. A detailed timestamp with the exact moment of the purchase (hours/minutes) would have no extra value for the recommender and is omitted because it might induce a privacy risk. If a customer's time of purchase were known (e.g., by observation) and the exact timestamp were stored, linking the content of the shopping cart to the customer's identity would be possible.

Analysis of the content of the shopping carts of different customers provides insight into the shopping habits of the customers and reveals which products are often bought together. For instance, customers will buy both pasta and bolognese sauce if they intend to prepare spaghetti or lasagna. Product association rules are used to discover which products belong together. Interesting recommendations are products that are not yet added to the shopping cart, but are often bought in combination with the products that are in the shopping cart. So, if bolognese sauce is in the shopping cart, pasta is a good recommendation. More generally, the best recommendation is the product, Y, with the highest probability to be bought, given the current content of the shopping cart, X. Here, X can be a single product or a set of products that the customer wants to buy.

However, highly popular products will always be bought in combination with a large variety of other products, even though no direct link (e.g., a recipe) exists between them. The probability that the customer will buy these popular products is always high, regardless of the content of the shopping cart. In order to take into account the general popularity of products, this probability, P(X, Y|X), is normalized by dividing it by the probability of buying Y if the content of the shopping cart is different from X. The products with the highest normalized probability are recommended to the customer, as illustrated in Figure 2(b).

\[
\max_{X \subset \mathrm{Cart}} \frac{P(X, Y \mid X)}{P(!X, Y \mid !X)}
= \max_{X \subset \mathrm{Cart}} \frac{P(X, Y) / P(X)}{P(!X, Y) / P(!X)}
\qquad (1)
\]

In addition to these automatically derived combinations of products using product association rules, domain knowledge helps to recommend the best matching products. For the shopping cart recommendations, the domain knowledge consists of a set of recipes. If the customer's shopping cart already contains several products that match the ingredients of a certain recipe, the missing ingredients are recommended and the recipe is suggested to the customer. Since these recommendations do not require a user profile with an extensive purchase history, they can help to overcome the cold start problem.

3.3 Similar Customer Recommendations

Storing the customers' individual consumption behavior on a central server induces a privacy risk and is therefore undesirable. With inShopnito, each shopping session has a different, anonymous user identifier, and successive shopping sessions cannot be linked (Section 2). Because collaborative filtering is based on calculating the similarity between the historical consumption behavior of individual users (or products), a traditional collaborative filtering approach is not possible in this situation.

As an alternative, customers are compared based on their explicit profile, which is voluntarily disclosed and contains only data about product categories, not about individual product purchases. Calculating the similarity between customers based on their explicit profile might be less accurate than based on their complete consumption behavior, but this approach induces no privacy risk. Based on this explicit shopping profile, customers are partitioned into groups of similar customers, just like the neighborhoods of similar users in the traditional collaborative filtering approach. Each group is represented by a bucket that contains all products that have been bought by the customers of that group.

After every visit to the store, the content of the customer's shopping cart is added to the bucket of the customer's group. If two customers have a similar explicit profile, their shopping carts will end up in the same bucket. Since the bucket contains only product information and no link to the identity of the customer, the purchasing history of an individual customer cannot be deduced if the bucket groups purchases of many customers. Analysis of the products in the customer's bucket makes it possible to generate recommendations based on what people who like similar products have bought in the past. The products that are most popular with other customers of the group are recommended, with the exception of products that are already in the shopping cart of the customer. The popularity of products within a group is normalized with respect to the general popularity of a product. These recommendations, based on the purchases of similar customers, aim to offer more serendipitous recommendations to the customers, just as collaborative filtering algorithms do.

This recommendation technique can also be combined with the approach that compares shopping carts (Section 3.2). Individual shopping carts (without a reference to the customer's identity) can be stored per group of similar customers, as defined by the explicit profile. Subsequently, product association rules can be applied to the groups of similar customers, instead of to the complete population of customers. This partitioning of customers according to their preferences can help to refine the product association rules.

3.4 Content-based Recommendations

For privacy reasons, detailed historical information about purchases cannot leave the secured environment of the customer's smartphone. As a result, these detailed purchase data can only be exploited if the recommendation algorithm runs on the customer's smartphone. In this customer-centric personalization approach [1], each user has her own mobile recommendation engine. Storing this detailed user profile securely on the smartphone also has advantages. For instance, the user profile can be shared amongst different shops without the privacy risk that one retailer abuses these purchase data for commercial profit.

Since only purchase data of the target user (i.e., the user for whom recommendations are calculated) are available on the smartphone, a content-based recommendation algorithm is the most worthwhile solution to process this detailed profile.

Content-based recommendation algorithms determine the products that best match the user's profile, based on a description of the product characteristics [6]. Since the detailed profile cannot leave the customer's smartphone, the product descriptions have to be transferred to the smartphone for comparison with the detailed profile. However, the complete product catalog of the store and the corresponding descriptions can be too extensive for processing on a smartphone. Therefore, only products and descriptions of categories that are relevant for the customer are sent to the smartphone to reduce the data traffic. Recommendations for pet supplies may be irrelevant for customers who have never bought any pet supplies in the past. They may not have pets, or may buy their supplies through other channels. The explicit profile is used to determine which categories are relevant and have to be considered. The resulting subset of the product catalog has to be downloaded only once, the first time that the customer visits the store. From then on, updates of the catalog are sufficient to keep track of new products, changed descriptions, and products that are not available anymore.

Figure 2: Screenshots of the mobile application: (a) the explicit user profile (b) the personalized suggestions.

Different types of content-based recommendation algorithms can be used, but with the limitation that the computational requirements must fit within the available resources of the smartphone. Our approach uses the InterestLMS algorithm of the Duine recommender framework [10]. Content-based algorithms often suffer from over-specialization [7], since they recommend only products similar to those already bought by the customers. In certain application domains, items should not be recommended if they are too similar to something the user has already seen, such as a different news article describing the same event. For shops, however, various situations exist in which customers are interested in similar products: cheaper or discounted products of a different brand, new or similar food products to replenish their house stock, or alternatives for products that are out of stock.

3.5 Purchase Pattern Recommendations

The last type of recommendations focuses on the repetitive purchase behavior of customers [2]. Specific products, such as toothpaste or coffee, are used on a regular basis and, as a result, need to be replenished regularly. Patterns in the purchase behavior can be detected and used to predict the next purchase of a certain product. For example, if one tube of toothpaste is bought every month, predicting the next purchase of toothpaste is straightforward.

Based on the shopping history (i.e., the time and amount of the last purchase), the recommender estimates whether the customer needs to buy a certain product. If this is the case, and the customer has not yet added the product to the shopping cart, it will be recommended. So, the aim of the purchase pattern recommendations is to remind customers to buy products that they might forget but probably need because of their repetitive consumption behavior.

4. CONCLUSIONS

The growing importance of privacy in online services emphasizes the need for privacy-preserving recommender systems, not least in the domain of shopping. Traditional collaborative filtering algorithms, which rely on a central storage and comparison of detailed user profiles, may induce a privacy risk. But limiting the disclosed customer data introduces a trade-off between the accuracy of the recommendations and the privacy of the customer. Therefore, we present a privacy-preserving, hybrid strategy that combines client-side and server-side recommendation techniques. At server side, the recommender is based on information that customers opt to disclose, and performs an analysis of the shopping cart using product association rules, and a comparison with the shopping carts of similar customers. At client side, detailed customer information is used for content-based recommendations and suggestions based on purchase patterns.

5. ACKNOWLEDGMENTS

This research was funded by the IWT-SBO Project MobCom: A Mobile Companion (https://www.mobcom.org).

6. REFERENCES

[1] G. Adomavicius, Z. Huang, and A. Tuzhilin. Personalization and recommender systems. Tutorials in Operations Research, Informs, pages 55-107, 2008.
[2] H. Baumgartner. Repetitive purchase behavior. In A. Diamantopoulos, W. Fritz, and L. Hildebrandt, editors, Quantitative Marketing and Marketing Management, pages 269-286. Gabler Verlag, 2012.
[3] L. N. Cassel and U. Wolz. Client side personalization. In DELOS Workshop: Personalisation and Recommender Systems in Digital Libraries, pages 8-12, 2001.
[4] M. Hardt and S. Nath. Privacy-aware personalization for mobile advertising. In Proceedings of the 2012 ACM Conference on Computer and Communications Security, CCS '12, pages 662-673, New York, NY, USA, 2012. ACM.
[5] IBM Research Security Team. Specification of the Identity Mixer Cryptographic Library v. 2.3.4. Technical report, 2012.
[6] D. Jannach, M. Zanker, A. Felfernig, and G. Friedrich. Recommender Systems: An Introduction. Cambridge University Press, New York, NY, USA, 1st edition, 2010.
[7] P. Lops, M. Gemmis, and G. Semeraro. Content-based recommender systems: State of the art and trends. In F. Ricci, L. Rokach, B. Shapira, and P. B. Kantor, editors, Recommender Systems Handbook, pages 73-105. Springer US, 2011.
[8] V. Pez. Negative effects of loyalty programs: An empirical investigation on the French mobile phone sector. Technical report, Université Paris-Dauphine, 2007.
[9] A. Put, I. Dacosta, M. Milutinovic, B. De Decker, S. Seys, F. Boukayoua, V. Naessens, K. Vanhecke, T. De Pessemier, and L. Martens. inshopnito: An advanced yet privacy-friendly mobile shopping application. In Proceedings of the IEEE 10th World Congress on Services (SERVICES 2014). IEEE, 2014.
[10] Telematica Instituut / Novay. Duine Framework, 2009. Online available at http://duineframework.org/.
[11] S. Worthington and J. Fear. The hidden side of loyalty card programs. The Australian centre for retail studies, 2009.
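As an illustration of the normalized probability of Equation (1) in Section 3.2, the following minimal sketch scores candidate products from a list of anonymous shopping carts. The function names, the limit on subset size, and the handling of undefined ratios are assumptions made for this example; the paper does not specify them.

```python
from itertools import combinations


def normalized_score(carts, x, y):
    """Return P(Y | X) / P(Y | !X), or None when a probability is undefined.

    "!X" is read here as "the cart does not contain all products of X"
    (an assumption about the notation of Equation (1)).
    """
    x = frozenset(x)
    with_x = [c for c in carts if x <= c]
    without_x = [c for c in carts if not x <= c]
    if not with_x or not without_x:
        return None
    p_y_given_x = sum(y in c for c in with_x) / len(with_x)
    p_y_given_not_x = sum(y in c for c in without_x) / len(without_x)
    if p_y_given_not_x == 0:
        # Y is only ever bought together with X.
        return float("inf") if p_y_given_x > 0 else None
    return p_y_given_x / p_y_given_not_x


def recommend(carts, cart, max_subset_size=2, top_n=3):
    """Rank products outside the cart by their best score over subsets X."""
    cart = frozenset(cart)
    candidates = set().union(*carts) - cart
    best = {}
    # Enumerating all subsets is exponential, so the size is capped
    # (hypothetical parameter, not from the paper).
    for size in range(1, min(max_subset_size, len(cart)) + 1):
        for x in combinations(sorted(cart), size):
            for y in candidates:
                score = normalized_score(carts, x, y)
                if score is not None and score > best.get(y, 0.0):
                    best[y] = score
    return sorted(best, key=lambda y: (-best[y], y))[:top_n]


past_carts = [
    {"pasta", "bolognese", "bread"},
    {"pasta", "bolognese"},
    {"bread", "milk"},
    {"bread", "milk", "pasta"},
    {"bread", "cheese"},
    {"bolognese", "pasta", "bread"},
]
print(recommend(past_carts, {"bolognese"}))
```

In this toy data set, bread appears in most carts, so its normalized score for a cart containing bolognese sauce stays below 1, while pasta is ranked first. This mirrors the motivation for the normalization given in Section 3.2.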
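The bucket mechanism of Section 3.3 can be sketched in a similar way. This example assumes that two customers belong to the same group only when their disclosed category sets are identical, which is a simplification of the "similar explicit profile" criterion in the text. The class `BucketStore` and its methods are hypothetical names, not part of inShopnito.

```python
from collections import Counter, defaultdict


class BucketStore:
    """Server-side store that keeps products per profile group, never per customer."""

    def __init__(self):
        self.buckets = defaultdict(Counter)  # group key -> product counts
        self.overall = Counter()             # product counts over all groups

    def add_cart(self, profile, cart):
        """Add an anonymous cart to the bucket of the matching profile group."""
        key = frozenset(profile)  # assumption: exact profile match defines a group
        products = set(cart)
        self.buckets[key].update(products)
        self.overall.update(products)

    def recommend(self, profile, cart, top_n=3):
        """Rank bucket products by group popularity relative to overall popularity."""
        bucket = self.buckets.get(frozenset(profile))
        if not bucket:
            return []
        group_total = sum(bucket.values())
        overall_total = sum(self.overall.values())
        scores = {}
        for product, count in bucket.items():
            if product in cart:
                continue  # already in the customer's shopping cart
            group_share = count / group_total
            overall_share = self.overall[product] / overall_total
            scores[product] = group_share / overall_share
        return sorted(scores, key=lambda p: (-scores[p], p))[:top_n]


store = BucketStore()
store.add_cart({"pets", "garden"}, ["dog food", "seeds", "milk"])
store.add_cart({"pets", "garden"}, ["dog food", "milk"])
store.add_cart({"baby"}, ["diapers", "milk"])
store.add_cart({"baby"}, ["diapers", "milk", "bread"])
print(store.recommend({"pets", "garden"}, ["milk"]))
```

Dividing the share of a product within the group by its share over all carts is one possible reading of "normalized with respect to the general popularity of a product"; other normalizations would fit the description equally well.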
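For the purchase pattern recommendations of Section 3.5, a client-side reminder could look like the sketch below. The paper states only that the time and amount of the last purchase are used. The estimate of how long one unit lasts, derived from the intervals between earlier purchases, is an assumption of this example, as are the function names and the data layout.

```python
from datetime import date, timedelta


def days_per_unit(purchases):
    """Average number of days one unit lasted, from consecutive purchases.

    `purchases` is a date-sorted list of (purchase_date, quantity) tuples.
    """
    rates = [
        (nxt[0] - cur[0]).days / cur[1]
        for cur, nxt in zip(purchases, purchases[1:])
        if cur[1] > 0
    ]
    return sum(rates) / len(rates) if rates else None


def reminders(history, cart, today):
    """Return products that are probably used up and not yet in the cart."""
    due = []
    for product, purchases in history.items():
        if product in cart or len(purchases) < 2:
            continue  # need at least two purchases to see a pattern
        purchases = sorted(purchases)
        rate = days_per_unit(purchases)
        if rate is None:
            continue
        last_day, last_qty = purchases[-1]
        # Expected depletion: last purchase date plus quantity times days per unit.
        if today >= last_day + timedelta(days=rate * last_qty):
            due.append(product)
    return sorted(due)


history = {
    "toothpaste": [(date(2014, 1, 1), 1), (date(2014, 2, 1), 1), (date(2014, 3, 1), 1)],
    "coffee": [(date(2014, 3, 1), 2), (date(2014, 3, 15), 2)],
}
print(reminders(history, cart={"coffee"}, today=date(2014, 4, 2)))
```

Because the history never leaves the device in the described architecture, a function like this would run on the smartphone and only the resulting reminders would be shown to the customer.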