The Continuous Cold Start Problem in e-Commerce Recommender Systems

Lucas Bernardi¹, Jaap Kamps², Julia Kiseleva³, Melanie J.I. Mueller¹

¹ Booking.com, Amsterdam, Netherlands. Email: {lucas.bernardi, melanie.mueller}@booking.com
² University of Amsterdam, Amsterdam, Netherlands. Email: kamps@uva.nl
³ Eindhoven University of Technology, Eindhoven, Netherlands. Email: j.kiseleva@tue.nl

ABSTRACT
Many e-commerce websites use recommender systems to recommend items to users. When a user or item is new, the system may fail because not enough information is available on this user or item. Various solutions to this ‘cold-start problem’ have been proposed in the literature. However, many real-life e-commerce applications suffer from an aggravated, recurring version of cold-start even for known users or items, since many users visit the website rarely, change their interests over time, or exhibit different personas. This paper exposes the Continuous Cold Start (CoCoS) problem and its consequences for content- and context-based recommendation from the viewpoint of typical e-commerce applications, illustrated with examples from a major travel recommendation website, Booking.com.

General Terms
CoCoS: continuous cold start

Keywords
Recommender systems, continuous cold-start problem, industrial applications

CBRecSys 2015, September 20, 2015, Vienna, Austria. Copyright remains with the authors and/or original copyright holders.

1. INTRODUCTION
Many e-commerce websites are built around serving personalized recommendations to users. Amazon.com recommends books, Booking.com recommends accommodations, Netflix recommends movies, Reddit recommends news stories, etc. Two examples of recommendations of accommodations and destinations at Booking.com are shown in Figure 1. This wide-scale adoption of recommender systems online, and the challenges faced by industrial applications, have been a driving force in the development of recommender systems. The research area has been expanding since the first papers on collaborative filtering in the 1990s [12, 16]. Many different recommendation approaches have been developed since then; in particular, content-based and hybrid approaches have supplemented the original collaborative filtering techniques [1].

[Figure 1: left panel titled "Customers who viewed Hotel Sacher Wien also viewed:", right panel titled "Destinations related to Vienna:".]
Figure 1: Examples of recommender systems on Booking.com. User-to-user collaborative filtering (left): recommend accommodations viewed by similar users to a user who just looked at ‘Hotel Sacher Wien’. Item-to-item content-based recommendations (right): recommend destinations similar to a particular destination, Vienna.

In the most basic formulation, the task of a recommender system is to predict ratings for items that have not been seen by the user. Using these predicted ratings, the system decides which new items to recommend to the user. Recommender systems base the prediction of unknown ratings on past or current information about the users and items, such as past user ratings, user profiles, item descriptions etc. If this information is not available for new users or items, the recommender system runs into the so-called cold-start problem: it does not know what to recommend until the new, ‘cold’, user or item has ‘warmed up’, i.e. until enough information has been generated to produce recommendations. For example, which accommodations should be recommended to someone who visits Booking.com for the first time? If the recommender system is based on which accommodations users have clicked on in the past, the first recommendations can only be made after the user has clicked on a couple of accommodations on the website.

Several approaches have been proposed and successfully applied to deal with the cold-start problem, such as utilizing baselines for cold users [8], combining collaborative filtering with content-based recommenders in hybrid systems [14], eliciting ratings from new users [11], or, more recently, exploiting the social network of users [6, 15]. In particular, content-based approaches have been very successful in dealing with cold-start problems in collaborative filtering [3, 4, 13, 14].

These approaches deal explicitly with cold users or items, and provide a ‘fix’ until enough information has been gathered to apply the core recommender system. Thus, rather than providing unified recommendations for cold and warm users, they temporarily bridge the period during which the user or item is ‘cold’ until it is ‘warm’. This can be very successful in situations in which there are no warm users [3], or in situations when the warm-up period is short and warmed-up users or items stay warm.

However, in many practical e-commerce applications, users or items remain cold for a long time, and can even ‘cool down’ again, leading to a continuous cold-start (CoCoS). In the example of Booking.com, many users visit and book infrequently since they go on holiday only once or twice a year, leading to a prolonged cold-start and extreme sparsity of collaborative filtering matrices, see Fig. 2 (top). In addition, even warm long-term users can cool down as they change their needs over time, e.g. going from booking youth hostels for road trips to resorts for family vacations. Such cool-downs can happen more frequently and rapidly for users who book accommodations for different travel purposes, e.g. for leisure holidays and business trips as shown in Fig. 2 (bottom). These continuous cold-start problems are rarely addressed in the literature despite their relevance in industrial applications. Classical approaches to the cold-start problem fail in the case of CoCoS, since they assume that users warm up in a reasonable time and stay warm after that.

In the remainder of the paper, we will elaborate on how CoCoS appears in e-commerce websites (Sec. 2), outline some approaches to the CoCoS problem (Sec. 3), and end with a discussion about possible future directions (Sec. 4).

2. CONTINUOUS COLD-START
Cold-start problems can in principle arise on both the user side and the item side.

2.1 User Continuous Cold-Start
We first focus on the user side of CoCoS, which can arise in the following cases:

Classical cold-start / sparsity: new or rare users
Volatility: user interest changes over time
Personas: user has different interests at different, possibly close-by points in time
Identity: failure to match data from the same user

All cases arise commonly in e-commerce websites. New users arrive frequently (classical cold-start), or may appear new when they don’t log in or use a different device (failed identity match). Some websites are prone to very low levels of user activity when items are purchased only rarely, such as travel, cars etc., leading to sparsity problems for recommender systems. Most users change their interests over time (volatility), e.g. movie preferences evolve, or travel needs change. On even shorter timescales, users have different personas. Depending on their mood or their social context, they might be interested in watching different movies. Depending on the weather or their travel purpose, they may want to book different types of trips, see Figure 2 for examples from Booking.com.

These issues arise for collaborative filtering as well as content-based or hybrid approaches, since both user ratings or activities as well as user profiles might be missing, become outdated over time, or not be relevant to the current user persona.

[Figure 2: two panels plotting Activity Level against Day, covering days 1 to 351 (top) and days 1 to 91 (bottom); the bottom panel marks a leisure booking and a business booking.]
Figure 2: Continuously cold users at Booking.com. Activity levels of two randomly chosen users of Booking.com over time. The top user exhibits only rare activity throughout a year, and the bottom user has two different personas, making a leisure and a business booking, without much activity in between.

2.2 Item Continuous Cold-Start
In a symmetric way, these CoCoS problems also arise for items:

Classical cold-start / sparsity: new or rare items
Volatility: item properties or value change over time
Personas: item appeals to different types of users
Identity: failure to match data from the same item

New items appear frequently in e-commerce catalogues, as shown in Figure 3 for accommodations at Booking.com. Some items are interesting only to niche audiences, or sold only rarely, for example books or movies on specialized topics. Items can be volatile if their properties change over time, such as a phone that becomes outdated once a newer model is released, or a hotel that undergoes a renovation. In the context of news or conversions, item volatility is also known as topic drift [9]. Figure 3 on the right shows fluctuations of the review score of a hotel at Booking.com. Some items have different ‘personas’ in that they target several user groups, such as a hotel that caters to business as well as leisure travellers. When several sellers can add items to an e-commerce catalogue, or when several catalogues are combined, correctly matching items can be problematic (identity problem).

[Figure 3: left panel, Available Properties at Booking.com by Month (Jun to Dec), on a scale from 250000 to 410000; right panel, Avg. User Rating (2013) by Day, days 1 to 351, on a scale from 8.2 to 9.1.]
Figure 3: Continuously cold items at Booking.com. Thousands of new accommodations are added to Booking.com every month (left). The user ratings of a hotel can change continuously (right).

3. ADDRESSING COLD-START
Many approaches have been proposed to deal with the classical cold-start problem of new or rare users or items [11]. However, they mostly fail to address the more difficult CoCoS.

The most popular strategy to address the classical cold-start problem is the hybrid approach where collaborative filtering and content-based models are combined, see [14] as an example. If one of the two methods fails due to a new user or item, the other method is used to ‘fill in’. The most basic assumption is that similar users will like similar items. Similarity of users is measured by their purchase history when warm, and by their user profile when cold. Conversely, similarity between items is computed from the set of users that purchased them when warm, and from their content when cold. In CoCoS, users change their interests, so both collaborative filtering and user-profile-based approaches can fail, since looking at the past and similarities can be misleading. Items also suffer from volatility, although to a lesser degree, which makes the standard hybrid approach also problematic for items. Hybrid approaches also ignore the issue of multiple personas.

Although, to our knowledge, the continuous cold-start problem as defined in this work has not been directly addressed in the literature, several approaches are promising.

Tang et al. [19] propose a context-aware recommender system, implemented as a contextual multi-armed bandit problem. Although the authors report extensive offline evaluation (log based and simulation based) with acceptable CTR, no comparison is made from a cold-start problem standpoint.

Sun et al. [18] explicitly attack the user volatility problem. They propose a dynamic extension of matrix factorization where the user latent space is modeled by a state space model fitted by a Kalman filter. Generative data presenting user preference transitions is used for evaluation. Improvements of RMSE when compared to timeSVD [10] are reported. Consistent results are reported in [5], after offline evaluation using real data.

Tavakol and Brefeld [20] propose a topic driven recommender system. At the user session level, the user intent is modeled as a topic distribution over all the possible item attributes. As the user interacts with the system, the user intent is predicted and recommendations are computed using the corresponding topic distribution. The topic prediction is solved by factored Markov decision processes. Evaluation on an e-commerce data set shows improvements when compared to collaborative filtering methods in terms of average rank.

4. DISCUSSION
In this manuscript, we have described how CoCoS, the continuous cold-start problem, is a common issue for e-commerce applications. Industrial recommender systems do not only have to deal with ‘cold’ (new or rare) users and items, but also with known users or items that repeatedly ‘cool down’. Reasons for the recurring cool-downs include the volatility in user interests or item values, different personas depending on user context or item target audience, or identification problems due to logged-out users or items from different catalogues. Despite the practical relevance of CoCoS, common literature approaches do not deal well with this issue.

We consider several directions as particularly promising to deal with CoCoS. Traditional approaches to solve cold-start problems try to employ collaborative filtering based on pseudo or inferred clicks. Recommendations based on social networks are an interesting new development that can supplement missing information based on the social graph. For example, recommendations based on Facebook likes are proposed in [15]. Beyond the difficulty of getting access to social data, the application to user volatility or multiple personas remains challenging. Online user intent prediction can be used to estimate a user’s current profile on the fly. When a user visits the website, his browsing behavior is used to estimate his intent after a few clicks, which is then used to compute recommendations accordingly. However, this still delays recommendations until enough clicks have occurred, which can be problematic if quick recommendations are needed. For example, in last-minute bookings, users may be pressed to book an accommodation quickly, leading to very short sessions.

More promising approaches employ content-based or contextual recommendation. Content-based recommendations can be very effective based on very little signal: just an initial query or single interaction can be exploited to find an initial item or set of items and exploit relations between items to make effective recommendations. In particular, context-aware recommendations are one of the most promising strategies when it comes to solving CoCoS. In this setup, recommendations are computed based on the current context of the current visitor and the behaviour of other users in similar contexts (see [2, 7, 17] for examples). Context is defined as a set of features such as location, time, weather, device, etc. This data is readily available in most commercial implementations of recommender systems. This approach naturally addresses sparsity by clustering users into contexts. Since context is determined on a per-action basis, user volatility and multiple personas can be addressed robustly. On the other hand, context-aware recommenders cannot address the item side of the problem and they might also suffer from cold-start problems in the case of a cold context that has never been seen before by the system.

References

[1] G. Adomavicius and A. Tuzhilin. Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions. IEEE Transactions on Knowledge and Data Engineering, 17:734–749, 2005.
[2] G. Adomavicius and A. Tuzhilin. Context-aware recommender systems. In Recommender Systems Handbook, pages 217–253, 2011.
[3] M. Aharon, N. Aizenberg, E. Bortnikov, R. Lempel, R. Adadi, T. Benyamini, L. Levin, R. Roth, and O. Serfaty. OFF-set: One-pass factorization of feature sets for online recommendation in persistent cold start settings. In Proceedings of the 7th ACM Conference on Recommender Systems, pages 375–378, 2013.
[4] S. Bykau, F. Koutrika, and Y. Velegrakis. Coping with the persistent cold-start problem. In Personalized Access, Profile Management, and Context Awareness in Databases, 2013.
[5] F. C. T. Chua, R. J. Oentaryo, and E.-P. Lim. Modeling temporal adoptions using dynamic matrix factorization. In IEEE 13th International Conference on Data Mining (ICDM), pages 91–100, 2013.
[6] I. Guy, N. Zwerdling, D. Carmel, I. Ronen, E. Uziel, S. Yogev, and S. Ofek-Koifman. Personalized recommendation of social software items based on social relations. In Proceedings of the Third ACM Conference on Recommender Systems, pages 53–60, 2009.
[7] A. Hawalah and M. Fasli. Utilizing contextual ontological user profiles for personalized recommendations. Expert Syst. Appl. (ESWA), 41:4777–4797, 2014.
[8] D. Kluver and J. A. Konstan. Evaluating recommender behavior for new users. In Proceedings of the 8th ACM Conference on Recommender Systems, pages 121–128, 2014.
[9] D. Knights, M. Mozer, and N. Nicolov. Detecting topic drift with compound topic models. In Proceedings of the Third International ICWSM Conference, pages 242–245, 2009.
[10] Y. Koren. Collaborative filtering with temporal dynamics. Communications of the ACM, 53:89–97, 2010.
[11] A. M. Rashid, I. Albert, D. Cosley, S. K. Lam, S. M. McNee, J. A. Konstan, and J. Riedl. Getting to know you: Learning new user preferences in recommender systems. In Proceedings of the 7th International Conference on Intelligent User Interfaces, pages 127–134, 2002.
[12] P. Resnick, N. Iacovou, M. Suchak, P. Bergstrom, and J. Riedl. Grouplens: An open architecture for collaborative filtering of netnews. In Proceedings of the 1994 ACM Conference on Computer Supported Cooperative Work, pages 175–186, 1994.
[13] M. Saveski and A. Mantrach. Item cold-start recommendations: Learning local collective embeddings. In Proceedings of the 8th ACM Conference on Recommender Systems, pages 89–96, 2014.
[14] A. I. Schein, A. Popescul, L. H. Ungar, and D. M. Pennock. Methods and metrics for cold-start recommendations. In Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 253–260, 2002.
[15] S. Sedhain, S. Sanner, D. Braziunas, L. Xie, and J. Christensen. Social collaborative filtering for cold-start recommendations. In Proceedings of the 8th ACM Conference on Recommender Systems, pages 345–348, 2014.
[16] U. Shardanand and P. Maes. Social information filtering: Algorithms for automating ‘word of mouth’. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pages 210–217, 1995.
[17] Y. Shi, A. Karatzoglou, L. Baltrunas, M. Larson, and A. Hanjalic. Cars2: Learning context-aware representations for context-aware recommendations. In Proceedings of CIKM, pages 291–300, 2014.
[18] J. Z. Sun, K. R. Varshney, and K. Subbian. Dynamic matrix factorization: A state space approach. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 1897–1900, 2012.
[19] L. Tang, Y. Jiang, L. Li, and T. Li. Ensemble contextual bandits for personalized recommendation. In Proceedings of the 8th ACM Conference on Recommender Systems, pages 73–80, 2014.
[20] M. Tavakol and U. Brefeld. Factored MDPs for detecting topics of user sessions. In Proceedings of the 8th ACM Conference on Recommender Systems, pages 33–40, 2014.
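
As an illustration of the context-aware setup discussed in Section 4, where recommendations follow the behaviour of other users in similar contexts, the following is a minimal Python sketch under stated assumptions. It is not the method of any cited system. The class name, the two context features (device and travel purpose), the use of raw popularity counts, and the fallback to overall popularity for a cold context are all hypothetical choices made for this example.

```python
from collections import Counter, defaultdict
from typing import Hashable, List, Tuple

# A context is a tuple of feature values, e.g. (device, travel_purpose).
# The choice of features here is an assumption made for illustration.
Context = Tuple[Hashable, ...]


class ContextPopularityRecommender:
    """Hypothetical sketch: recommend the items chosen most often in a context."""

    def __init__(self) -> None:
        # Per-context item counts, plus overall counts used as a fallback.
        self.by_context: defaultdict = defaultdict(Counter)
        self.overall: Counter = Counter()

    def observe(self, context: Context, item: str) -> None:
        # Context is attached to each action, not to the user, so no
        # long-term user history is needed.
        self.by_context[context][item] += 1
        self.overall[item] += 1

    def recommend(self, context: Context, k: int = 3) -> List[str]:
        counts = self.by_context.get(context)
        if not counts:
            # Cold context (never seen before): assumed fallback to
            # overall popularity across all contexts.
            counts = self.overall
        return [item for item, _ in counts.most_common(k)]
```

Because the context is attached to each action rather than to the user, the same visitor can receive different recommendations in a business session and in a leisure session. A context that was never observed only gets the overall fallback, which mirrors the cold-context limitation noted in the Discussion.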