Tackling Cold-Start Users in Recommender Systems with Indoor Positioning Systems

Emanuel Lacic, Graz University of Technology, Graz, Austria, elacic@know-center.at
Dominik Kowald, Know-Center, Graz, Austria, dkowald@know-center.at
Matthias Traub, Know-Center, Graz, Austria, mtraub@know-center.at
Granit Luzhnica, Know-Center, Graz, Austria, gluzhnica@know-center.at
Joerg Simon, Know-Center, Graz, Austria, jsimon@know-center.at
Elisabeth Lex, Graz University of Technology, Graz, Austria, elisabeth.lex@tugraz.at

ABSTRACT
In this paper, we present work-in-progress on a recommender system based on Collaborative Filtering that exploits location information gathered by indoor positioning systems. This approach allows us to provide recommendations for “extreme” cold-start users with absolutely no item interaction data available, where methods based on Matrix Factorization would not work. We simulate and evaluate our proposed system using data from the location-based FourSquare system and show that we can provide substantially better recommender accuracy results than a simple MostPopular baseline that is typically used when no interaction data is available.

Categories and Subject Descriptors
H.2.8 [Database Management]: Database Applications - Data mining; H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval - Information filtering

Keywords
cold-start; IPS; beacon; collaborative filtering; FourSquare

Copyright is held by the author/owner(s).
RecSys 2015 Poster Proceedings, September 16–20, 2015, Vienna, Austria.

1. INTRODUCTION
One of the main challenges in recommender systems is the cold-start problem, which concerns so-called cold-start users for whom no or only very few item interactions (e.g., ratings) are available. In order to tackle this problem, systems like MovieLens typically provide interaction surveys in which a new user has to complete a predefined number of interactions before recommendations can be calculated. However, users are often annoyed by such surveys or find it hard to immediately come up with a representative list of item ratings to fill them out.

Another way to address cold-start users is to utilize algorithms based on Matrix Factorization. Although these methods are able to provide reasonable results when a minimum number of user-item interactions is available (e.g., three ratings, see [2]), they fail in “extreme” cold-start settings where there are no item interactions. In such cases, recommender systems typically make use of unpersonalized methods such as providing the overall most popular items in a system. Since recommendations should be personalized in order to support users in the most efficient way, we investigate the usefulness of an additional data source in order to tackle such “extreme” cold-start users with no item interactions at all. One opportunity in this respect is to make use of the ever-increasing trend of providing mobile applications that help users navigate through different kinds of public areas, such as shopping centers or scientific conferences. These applications can easily acquire a user’s location information using indoor positioning systems (IPS) [1] to automatically collect location-based item interaction data with no need for any explicit user action (e.g., a click).

We make use of a user’s location data gathered via IPS technology by proposing a novel recommender system that utilizes the user-based Collaborative Filtering approach. We compute the similarity between two users based on (i) raw location data and (ii) a user-location network that connects users who visited the same location during the same day and hour. The preliminary results of our evaluation based on FourSquare data show that our proposed approach provides substantially better recommender accuracy results than a simple MostPopular baseline that is typically used when no user-item interaction data is available.

Figure 1: Example of a public area (e.g., shopping center or academic conference) with five zones, where a beacon is attached to each zone. When a device (e.g., a smartphone) enters a zone, the data is stored in our recommender system and can then be utilized for location-based recommendations.

2. PROPOSED APPROACH
Tracking User Locations. A number of easily attainable technologies, or indoor positioning systems (IPS), exist to track indoor locations. Among them, BLE (Bluetooth Low Energy) beacons have gained importance and popularity, especially after Apple introduced the iBeacon protocol¹. A beacon is basically a small piece of hardware that can be easily attached to, e.g., a wall and that transmits a broadcast to every smartphone or tablet within its reach. Beacons are especially applicable for recommendation tasks since they provide both indoor localization and proximity sensing at low cost and low energy. In our case, we have a public area such as a shopping center or an academic conference which is divided into several zones. A zone is an abstract location represented by a beacon with a given radius (see Figure 1), containing a certain set of co-located items (e.g., products or venues), preferably related to each other. The transmission power of the broadcast signal should be tuned to match the respective physical area of the corresponding zone. However, it should be considered that errors in approximating the distance increase with the range of the signal [5].

Recommender System. Our IPS-based recommender system relies on user-based Collaborative Filtering. We calculate the similarity between users u and v either by applying Jaccard’s Coefficient to their raw location data (denoted by ∆(u) and ∆(v), respectively):

$sim(u, v) = \frac{|\Delta(u) \cap \Delta(v)|}{|\Delta(u) \cup \Delta(v)|}$

or by constructing a location network in which a tie between two users exists if they visited the same location within the same day and hour. On the constructed location network, in which Γ(u) denotes the location-based neighbourhood of user u, we apply two related similarity metrics. The first is Neighbourhood Overlap:

$sim(u, v) = \frac{|\Gamma(u) \cap \Gamma(v)|}{|\Gamma(u)| + |\Gamma(v)|}$

The second is a refinement proposed as Adamic Adar, which adds weights to the links (since not all neighbours in a network have the same tie strength):

$sim(u, v) = \sum_{z \in \Gamma(u) \cap \Gamma(v)} \frac{1}{\log(|\Gamma(z)|)}$

(see [3] for the complete formalism).

From a technical perspective, we utilized the recommender framework presented in [4] to implement and evaluate our approach.

3. EVALUATION
Experimental Setup. We evaluated our IPS-based recommender approach with respect to nDCG (see e.g., [2]) using the FourSquare dataset provided by [6]. We chose this dataset since FourSquare best simulates our setting of a public area (e.g., shopping center or academic conference) that can be tracked with IPS technology. Our primary focus lies on users with no item interaction data in the training set, and our approach recommends up to 10 items (i.e., venues in the FourSquare setting). Thus, we extracted all users that interacted with 10 items (= 2,783 out of 2,153,471 users) and put these interactions into the test set to be predicted. This ensures that each of these users is an “extreme” cold-start user. In order to finally evaluate the effectiveness of our approach, we compared it to a standard MostPopular baseline, which is the most intuitive way to provide recommendations when no item interaction data is available.

Preliminary Results. The preliminary results of our evaluation are shown in Figure 2 in the form of an nDCG plot. The results indicate that all three location-based CF approaches outperform the MostPopular baseline, which is the standard method for handling users with no item interaction data available. Regarding the location-based algorithms, the two methods based on a user-location network, which connects users who visited the same location during a defined period of time, provide higher nDCG estimates than the method solely based on the raw location data. The overall best results are reached by the location network-based approach using the Adamic Adar metric, with an nDCG@10 value of nearly 15%.

[Figure 2 plot. x-axis: Number of recommended items (1 to 10); y-axis: nDCG; series: MP, Loc. Network N.O., Loc. Data Jaccard, Loc. Network A.A.]
Figure 2: nDCG plot for “extreme” cold-start users in the FourSquare dataset showing that all three location-based CF algorithms outperform the MostPopular baseline.

4. CONCLUSION AND FUTURE WORK
In this paper, we have presented work-in-progress on a novel recommender system that tackles “extreme” cold-start users with indoor positioning systems (i.e., beacon technology). Furthermore, we have shown that our approach outperforms the MostPopular baseline in an experiment on FourSquare data. One limitation of our experiment is that it only simulates our approach, but it clearly shows its potential. Thus, as a next step, we will conduct a large-scale user study to evaluate our approach in a real setting by including it into the i-KNOW Conference Assistant² during the next i-KNOW conference in October 2015. This system will not only recommend talks and events but also papers and people according to a user’s interests and visited indoor locations.

Additionally, we plan to use the accelerometer and gyroscope sensors to detect the direction of a user in relation to the location of items and try to exploit this for recommendations. We aim to differentiate between cases where a user randomly (i.e., without a specific intention) passes through a zone and cases where a user visits a zone and looks at an item for a longer time or at a closer distance. Hence, we can avoid spamming users with recommendations while they are merely hurrying through a public area.

Acknowledgments: The authors would like to thank Matthias Heise for helpful comments on this work. This work is supported by the Know-Center and the EU-IP Learning Layers (Grant Agreement: 318209).

5. REFERENCES
[1] J. D. Cai. Business intelligence by connecting real-time indoor location to sales records. In WAIM ’14. Springer.
[2] D. Kluver and J. A. Konstan. Evaluating recommender behavior for new users. In Proc. of RecSys ’14.
[3] E. Lacic, D. Kowald, L. Eberhard, C. Trattner, D. Parra, and L. B. Marinho. Utilizing online social network and location-based data to recommend products and categories in online marketplaces. In Mining, Modeling, and Recommending ’Things’ in Social Media. 2015.
[4] E. Lacic, D. Kowald, and C. Trattner. SocRecM: A scalable social recommender engine for online marketplaces. In Proc. of HT ’14.
[5] P. Martin, B.-J. Ho, N. Grupen, S. Muñoz, and M. Srivastava. An iBeacon primer for indoor localization: Demo abstract. In Proc. of BuildSys ’14.
[6] M. Sarwat, J. Levandoski, A. Eldawy, and M. Mokbel. LARS*: An efficient and scalable location-aware recommender system. Knowledge and Data Engineering, IEEE Transactions on.

¹ https://developer.apple.com/ibeacon/
² http://is.gd/EdMYCN
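
To make the three user similarity metrics of the proposed approach more concrete, the following is a minimal sketch and not the implementation used in the paper. It makes several assumptions that the paper does not state: check-ins are given as hypothetical (user, location, day, hour) tuples, the logarithm in Adamic Adar is the natural logarithm, and all function names are illustrative.

```python
import math
from collections import defaultdict


def jaccard(delta_u, delta_v):
    """Jaccard's Coefficient on two users' raw location sets."""
    union = delta_u | delta_v
    return len(delta_u & delta_v) / len(union) if union else 0.0


def build_location_network(checkins):
    """Connect users who visited the same location in the same day and hour.

    `checkins` is an iterable of (user, location, day, hour) tuples; this
    input format is an assumption made for illustration.
    Returns a dict mapping each user to the set of neighbouring users.
    """
    visitors = defaultdict(set)
    for user, location, day, hour in checkins:
        visitors[(location, day, hour)].add(user)

    gamma = defaultdict(set)
    for users in visitors.values():
        for u in users:
            gamma[u] |= users - {u}
    return dict(gamma)


def neighbourhood_overlap(gamma, u, v):
    """Common neighbours divided by the sum of both neighbourhood sizes."""
    gu, gv = gamma.get(u, set()), gamma.get(v, set())
    denom = len(gu) + len(gv)
    return len(gu & gv) / denom if denom else 0.0


def adamic_adar(gamma, u, v):
    """Sum of 1 / log(|neighbourhood of z|) over common neighbours z."""
    gu, gv = gamma.get(u, set()), gamma.get(v, set())
    score = 0.0
    for z in gu & gv:
        degree = len(gamma.get(z, set()))
        if degree > 1:  # skip degree 1, since log(1) = 0 would divide by zero
            score += 1.0 / math.log(degree)  # natural log is an assumption
    return score


# Toy data (entirely made up): three users share zone1 in the same hour.
checkins = [
    ("alice", "zone1", "d1", 10),
    ("bob", "zone1", "d1", 10),
    ("carol", "zone1", "d1", 10),
    ("alice", "zone2", "d1", 11),
    ("dave", "zone2", "d1", 11),
    ("bob", "zone2", "d1", 15),  # same zone as alice, but a different hour
]
gamma = build_location_network(checkins)
```

In this toy example, bob and dave never meet, yet both metrics on the location network give them a non-zero similarity because they share alice as a neighbour. This illustrates how the network can relate users beyond direct co-visits.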
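
The evaluation compares the location-based approaches against a MostPopular baseline using nDCG. The sketch below shows one common way to compute both. The paper does not spell out its exact nDCG variant, so binary relevance and a log2 rank discount are assumptions here, as are the function names and the (user, item) input format.

```python
import math
from collections import Counter


def most_popular(train_interactions, k=10):
    """Return the k items with the most interactions in the training data.

    `train_interactions` is an iterable of (user, item) pairs; this format
    is assumed for illustration.
    """
    counts = Counter(item for _, item in train_interactions)
    return [item for item, _ in counts.most_common(k)]


def ndcg_at_k(recommended, relevant, k=10):
    """nDCG@k with binary relevance and a log2 discount (assumed variant)."""
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(recommended[:k])
        if item in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg else 0.0


# Toy training data (made up): item "a" is the most popular, then "b".
train = [("u1", "a"), ("u2", "a"), ("u3", "a"),
         ("u1", "b"), ("u2", "b"), ("u3", "c")]
baseline = most_popular(train, k=2)
```

Because the baseline list is the same for every user, its nDCG depends only on how well globally popular items match each cold-start user's held-out interactions.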