Timely Tip Selection for Foursquare Recommendations

Max Sklar
Foursquare Labs
568 Broadway, 10th Floor
New York, NY
max@foursquare.com

Kristian J. Concepcion
Foursquare Labs
568 Broadway, 10th Floor
New York, NY
kjc@foursquare.com

ABSTRACT
This poster summarizes the techniques we use to serve Foursquare tips for a given venue and more specifically the strategies employed for choosing timely and seasonal tips.

Categories and Subject Descriptors
H.3.3 [Information Search and Retrieval]: Information filtering; G.3 [Probability and Statistics]: Time series analysis; I.2.7 [Natural Language Processing]: Text analysis; I.5.1 [Pattern Recognition]: Models–Statistical

Keywords
bhattacharyya coefficient, context-aware recommenders, foursquare, machine learning, natural language processing, text classification

Copyright is held by the author/owner(s). RecSys 2014 Poster Proceedings, October 6-10, 2014, Foster City, Silicon Valley, USA.

1. INTRODUCTION
Foursquare is a location-based recommendation engine. A primary action for users is to write a tip, which is a short public note attached to a venue, often a review or suggestion. Any given venue is likely to have many tips attached to it, which vary in quality and relevance. With the recent focus on search and discovery as well as passive location awareness, we have developed a number of heuristics in order to serve the right tips to the right people at the right time.

2. TIP SELECTION COMPONENTS
Language Identification: In order to avoid serving languages that a user does not understand, a language classifier on Foursquare tips was built using an ensemble of open source and home-grown solutions.

Global quality: We created a hand-labelled training set of high and low quality tips based on a strict set of quality guidelines. Raw scores from various statistical classifiers that were trained to identify specific traits such as sentiment or spam were used as features to train a quality model.

Personalization: We developed a number of signals which take into account the user's tastes and social connections.

Timeliness and Seasonality: For any given date and time, a tip is analyzed in order to determine whether it is appropriate for a particular time of week or time of year. In this poster, we go into more detail on the system for analyzing this component.

3. TIMELINESS OF PHRASES AND TIPS
Through our Swarm app, users check in to share their location and leave a short update for their friends called a shout. In order to find phrases which are time-sensitive, we looked at shouts instead of tips because they were more specific to what users were doing at any particular time.

Our model for phrase popularity over the course of the week mirrors our model for venue popularity[4]. For each supported language, we divided the week into 168 hour buckets. We then counted the number of times each phrase was used in a given bucket. We also counted the total number of shouts in each bucket to produce a baseline distribution.

The Bhattacharyya coefficient[1] is a metric for comparing the similarity between two probability distributions. Given two phrase distributions $P$ and $Q$, we define the similarity to be

\[
S(P, Q) = \sum_{w \in W} \sqrt{P(w)\,Q(w)}
\]

where $W$ is the set of all 168 weekhour buckets.

For example, the Bhattacharyya coefficient between any phrase and the word "lunch" provides a measure of how appropriate that phrase is for lunch time. The food items which rank most highly in this metric for English and Thai give interesting insights into the lunch habits of different language groups (Table 1).

Table 1: Comparing Food Items with the Highest Bhattacharyya Similarity to Lunch

English Phrase     Score    Thai         Translated    Score
salad sandwich     0.943    ก๋วยเตี๋ยว     noodles       0.884
turkey sandwich    0.929    ชา เขียว      green tea     0.845
cuban sandwich     0.918    กาแฟ         coffee        0.832
panini             0.913    ขา หมู        pig's feet    0.827

Furthermore, the Bhattacharyya coefficient between any phrase and the baseline distribution measures the time sensitivity of that phrase. We extracted all the phrases that meet a certain threshold for time sensitivity. Then, each phrase-bucket was assigned a timeliness score which is the log-ratio of the phrase probability and the baseline probability.

We defined $C(p)$ to be the total number of times phrase $p$ appears in the corpus, and $C(p_w)$ to be the total number of times $p$ appears in weekhour $w$. Finally, $\alpha$ is a 168-dimensional Dirichlet smoothing constant on phrase count data[5] and $b$ is defined as a phrase to correspond with the baseline counts. The timeliness score for a phrase at weekhour $w$ is computed as follows.

\[
T(p, w) = \ln\left( \frac{C(p_w) + \alpha_w}{C(p) + \sum_i \alpha_i} \div \frac{C(b_w) + \alpha_w}{C(b) + \sum_i \alpha_i} \right)
\]

The timeliness score for a tip is the sum of the scores of its phrases. For example, at Veselka (a popular Ukrainian restaurant in New York's East Village), a user wrote "They're open 24/7 - turn up after your night out and partake of the pierogis with applesauce." The terms "24/7", "turn up", "night out", and "pierogi" all meet the Bhattacharyya threshold. Their respective scores for Sunday night at midnight are 1.2, 0.6, 0.3, and -0.6. These sum to 1.5, which is positive and indicates that this tip is timely on Sunday night.

We supplemented our shout counts with the English WordNet[2] food corpus and our English menu database. This allowed us to associate entries in the WordNet corpus with specific meals (breakfast, lunch, dinner, dessert, and late night). For phrases in the WordNet food corpus with insufficient shout data, we replaced the distribution with that of the matching abstract mealtimes.

One problem we encountered was with non-compositional compound phrases. The timeliness of "burrito" is very different from that of "breakfast burrito", but because the burrito data included all mentions of breakfast burrito as well, its timeliness score was dampened. To counteract this problem, we merged phrases in our training data that appeared more frequently together so that they would be considered as completely separate entities from their constituent tokens. In terms of burritos, this meant that all mentions of breakfast burrito were counted as one term, and all mentions of burrito not following breakfast were considered as an entirely separate term.

[Figure: phrase timeliness score by weekhour for "Burrito", "Breakfast Burrito", and "Non-Breakfast Burrito". x-axis: Hour (96 to 120 is midnight to midnight on Friday); y-axis: Phrase Timeliness Score.]

3.1 Evaluation
We evaluated the timeliness score on a hand-labelled set of 825 tips, each with four abstract meal times: breakfast, lunch, dinner, and late night. For each tip and time period, we applied the label of timely, neutral, or untimely. We then compared those labels to our timeliness scores, and the results satisfied us enough to use the feature in the product.

The timeliness score serves two purposes: detecting specifically timely tips, and disqualifying untimely tips. With our chosen threshold, we achieved 71.3% precision and 74.5% recall for timely tips against our hand-labelled set. Untimely scoring used a different threshold and achieved 74.7% precision and 67.0% recall.

4. EXTENSION TO SEASONALITY
The ability to detect and exploit seasonality is an important feature for search and recommendation systems[3]. There was not enough data to create 365 day-buckets, so instead we chose to create buckets based on weeks. Unfortunately, in the Unix calendar utility, many popular holidays crucial to seasonality fall in different buckets each year. To ameliorate this, we forced every month into a 4-week model, with the last week of the month subsuming all extra days beyond the 28th. The last week of each month was then normalized to account for the extra days before the Bhattacharyya coefficients were calculated.

Another issue was caused by phrases that were seasonal in only one year. Very popular movies caused us to associate "James Bond" with mid-November and "Star Trek" with June. We solved this problem by looking at data for each year individually and flagging outliers. Once flagged, we smoothed the counts to bring the offending year more in line with the rest of the data.

4.1 Future Work
Some terms follow a different seasonal pattern depending on geographic region, and performance would be improved by geo-fencing phrase distributions by region. For example, the term "fireworks" was found to be incredibly timely during the first week of July for American Independence Day, but there is also a smaller spike in the first week of November for Guy Fawkes Day in Great Britain. Another example was the term "Rangers" being timely in the summer and the winter. The Texas Rangers (a baseball team that plays during the summer) were being conflated with the New York Rangers (a hockey team that plays during the winter).

Geo-fencing by climate zone as opposed to national borders or metropolitan areas would improve results for weather-related phrases such as "outdoor seating", "hot soup", and "air conditioning".

5. REFERENCES
[1] Bhattacharyya, A. (1946). On a measure of divergence between two multinomial populations. Sankhyā: The Indian Journal of Statistics, 401-406.
[2] Miller, G. A. (1995). WordNet: A Lexical Database for English. Communications of the ACM, Vol. 38, No. 11, 39-41.
[3] Shokouhi, M. (2011, July). Detecting seasonal queries by time-series analysis. In Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval (pp. 1171-1172). ACM.
[4] Sklar, M., Shaw, B., & Hogue, A. (2012, September). Recommending interesting events in real-time with foursquare check-ins. In Proceedings of the sixth ACM conference on Recommender systems (pp. 311-312). ACM.
[5] Sklar, M. (2014). Fast MLE Computation for the Dirichlet Multinomial. arXiv preprint arXiv:1405.0099.
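
APPENDIX: ILLUSTRATIVE SKETCH
The Bhattacharyya similarity and the timeliness score from Section 3 can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the production system: the function names, the toy counts, and the uniform smoothing vector are hypothetical, whereas the paper fits the Dirichlet smoothing constant from data[5].

```python
import math

HOURS_PER_WEEK = 168  # one bucket per hour of the week, as in Section 3


def normalize(counts):
    """Turn raw per-bucket counts into a probability distribution."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return [c / total for c in counts]


def bhattacharyya(p, q):
    """S(P, Q): sum over weekhour buckets of sqrt(P(w) * Q(w))."""
    return sum(math.sqrt(pw * qw) for pw, qw in zip(p, q))


def timeliness(phrase_counts, baseline_counts, alpha, w):
    """T(p, w): log-ratio of smoothed phrase and baseline probabilities."""
    alpha_total = sum(alpha)
    phrase_prob = (phrase_counts[w] + alpha[w]) / (sum(phrase_counts) + alpha_total)
    baseline_prob = (baseline_counts[w] + alpha[w]) / (sum(baseline_counts) + alpha_total)
    return math.log(phrase_prob / baseline_prob)


def tip_score(phrase_scores):
    """A tip's timeliness is the sum of the scores of its phrases."""
    return sum(phrase_scores)


# Hypothetical toy data: a flat baseline and a phrase that peaks in bucket 12.
baseline_counts = [100] * HOURS_PER_WEEK
phrase_counts = [1] * HOURS_PER_WEEK
phrase_counts[12] = 50
alpha = [1.0] * HOURS_PER_WEEK  # assumed uniform smoothing, for illustration only

similarity_to_baseline = bhattacharyya(normalize(phrase_counts),
                                       normalize(baseline_counts))
peak_score = timeliness(phrase_counts, baseline_counts, alpha, 12)
off_peak_score = timeliness(phrase_counts, baseline_counts, alpha, 0)

# The Veselka example: per-phrase scores for Sunday night at midnight.
veselka_score = tip_score([1.2, 0.6, 0.3, -0.6])
```

With this toy data the phrase scores positive in its peak bucket and negative elsewhere, and its similarity to the baseline falls below 1, the value the coefficient takes for two identical distributions. The Veselka scores sum to 1.5 up to floating-point rounding.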