Discovering Contextual Information from User Reviews for Recommendation Purposes

Konstantin Bauman                          Alexander Tuzhilin
Stern School of Business                   Stern School of Business
New York University                        New York University
kbauman@stern.nyu.edu                      atuzhili@stern.nyu.edu

ABSTRACT
The paper presents a new method of discovering relevant contextual information from user-generated reviews in order to provide better recommendations to the users when such reviews complement the traditional ratings used in recommender systems. In particular, we classify all the user reviews into the "context rich" specific and "context poor" generic reviews and present a word-based and an LDA-based method of extracting contextual information from the specific reviews. We also show empirically on the Yelp data that, collectively, these two methods extract almost all the relevant contextual information across three different applications and that they are complementary to each other: when one method misses certain contextual information, the other one extracts it from the reviews.

Keywords
Recommender systems; Contextual information; Online reviews; User-generated content

1. INTRODUCTION
The field of Context-Aware Recommender Systems (CARS) has experienced extensive growth since the first papers on this topic appeared in the mid-2000's [3], when it was shown that the knowledge of contextual information helps to provide better recommendations in various settings and applications, including Music [8, 9, 12, 13], Movies [5], E-commerce [17], Hotels [10], and Restaurants [14].

One of the fundamental issues in the CARS field is the question of what context is and how it should be specified. According to [2, 7], context-aware approaches are divided into representational and interactional. In the representational approach, adopted in most of the CARS papers, context can be described using a set of observable contextual variables that are known a priori and the structure of which does not change over time. In the interactional approach [4, 11], the contextual information is not known a priori and either needs to be learned or modeled using latent approaches, such as the ones described in [11]. Although most of the CARS literature has focused on the representational approach, an argument has been made that the context is not known in advance in many CARS applications and, therefore, needs to be discovered [3].

In this paper, we focus on the interactional approach to CARS and assume that the contextual information is not known in advance and is latent. Furthermore, we focus on those applications where the ratings of items provided by the users are supplemented with user-generated reviews containing the contextual information, among other things. For example, in the case of Yelp, user reviews contain valuable contextual information about user experiences of interacting with Yelp businesses, such as restaurants, bars, hotels, and beauty & spas. By analyzing these reviews, we can discover various types of rich and important contextual information that can subsequently be used for providing better recommendations.

One way to discover this latent contextual information would be to provide a rigorous formal definition of context and discover it in the texts of the user-generated reviews using some formal text mining-based context identification methods. This direct approach is difficult, however, because of the complex multidimensional task of defining the unknown contextual information in a rigorous way, identifying what constitutes context and what does not in the user-generated reviews, and dealing with the complexities of extracting it from the reviews using text mining methods.

Therefore, in this paper we propose the following indirect method for discovering relevant contextual information from the user-generated reviews. First, we observe that the contextual information is contained mainly in the specific reviews (those that describe a specific visit of a user to an establishment, such as a restaurant) and hardly appears in the generic reviews (the reviews describing overall impressions about a particular establishment). Second, words or topics describing the contextual information should appear much more frequently in the specific than in the generic reviews because the latter should mostly miss such words or topics. Therefore, if we can separate the specific from the generic reviews, compare the frequencies of words or topics appearing in the specific vs. the generic reviews, and select those words or topics having high frequency ratios, then they should contain most of the contextual information among them. This background work of applying the frequency-based method to identifying the important context-related words and topics paves the way to the final stage of inspecting these lists of words and topics.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyright 2014 for the individual papers by the papers' authors. Copying permitted for private and academic purposes. This volume is published and copyrighted by its editors.
CBRecSys 2014, October 6, 2014, Silicon Valley, CA, USA.
In this paper, we followed this indirect approach and developed an algorithm for classifying the reviews into the "context rich" specific and "context poor" generic reviews. In addition, we present a word-based and an LDA-based method of extracting contextual information from the specific reviews. We also show that, together, these two methods extract almost all the relevant contextual information across three different applications (restaurants, hotels, and beauty & spas) and that they are complementary to each other: when one method misses certain contextual information, the other one extracts it from the reviews, and vice versa. Furthermore, in those few cases when these two methods fail to extract the relevant contextual information, these types of contexts turned out to be rare (they appear infrequently in the reviews) and more subtle (i.e., it is hard to describe such contexts in crisp linguistic terms).

[1, 10, 14] present some prior work on extracting contextual information from user-generated reviews. Although presenting different approaches, these three references have one point in common: in all three papers the types of contextual information are a priori known. Therefore, the key issue in these papers is the determination of the specific values of the known contextual types based on the reviews. Although significant progress has been made on learning context from user-generated reviews, nobody has proposed a method of separating the reviews into specific and generic or presented the particular methods of extracting the contextual information from the reviews that are described in this paper.

This paper makes the following contributions. First, we propose two novel methods, a word-based and an LDA-based one, for extracting the contextual information from user-generated reviews in those CARS applications where contexts are not known in advance. Second, we validate them on three real-life applications (Restaurants, Hotels, and Beauty & Spas) and experimentally show that these two methods are (a) complementary to each other (whenever one misses certain contexts, the other one identifies them, and vice versa) and (b) collectively, they discover almost all the contexts across the three different applications. Third, we show that most of this contextual information can be discovered quickly and effectively.

2. METHOD OF CONTEXT DISCOVERY
The key idea of the proposed method is to extract the contextual information from the user-generated reviews. However, not all the reviews contain rich contextual information. For example, generic reviews, describing overall impressions about a particular restaurant or a hotel, such as the one presented in Figure 1, contain only limited contextual information, if any. In contrast, reviews of specific visits to a restaurant or stays in a hotel may contain rich contextual information. For example, the review presented in Figure 2, describing a specific dining experience in a restaurant, contains such contextual information as "lunch time," with whom the person went to the restaurant, and the date of the visit.

Figure 1: An example of a generic review

Figure 2: An example of a specific review

Therefore, the first step in the proposed approach is to separate such generic from the specific reviews, and we present a particular separation method in Section 2.1. After that, we use the specific/generic dichotomy to extract the contextual information using the two methods proposed in this paper, the first one based on the identification of the most important context-related words and the second one on the popular LDA method [6]. These two approaches are described in Sections 2.2 and 2.3, respectively.

2.1 Separating Reviews into Specific and Generic
The main idea in separating specific from generic reviews lies in the identification of certain characteristics that are prevalent in one type but not in the other type of review. For example, users who describe particular restaurant experiences tend to write long reviews and extensively use past tenses (e.g., "I came with some friends for lunch today"), while generic reviews tend to use the present tense more frequently (e.g., "they make wonderful pastas").

In this work, we identified several such features for separating the generic from the specific reviews, including (a) the length of the review, (b) the total number of verbs used in the review, and (c) the number of verbs used in past tenses. More specifically, we used the following measures in our study:

• LogSentences: logarithm of the number of sentences in the review plus one (we add one to avoid the logarithm becoming −∞ for empty reviews).
• LogWords: logarithm of the number of words used in the review plus one.
• VBDsum: logarithm of the number of verbs in the past tenses in the review plus one.
• Vsum: logarithm of the number of verbs in the review plus one.
• VRatio: the ratio of VBDsum and Vsum (VBDsum/Vsum).

Given these characteristics, we used the classical K-means clustering method to separate all the reviews into the "specific" vs. "generic" clusters. We describe the specifics of this separation method, as applied to our data, in Section 3.2. Once the two types of reviews are separated into two different classes, we next apply the word-based and LDA-based methods described in the next two sections.
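To make the separation step concrete, the five features above and the 2-means clustering can be sketched roughly as follows. This is a simplified illustration, not the paper's implementation: we assume the reviews arrive already POS-tagged (e.g., with Penn Treebank tags) and we treat the VBD/VBN tags as past-tense forms; the helper names are our own.

```python
import math

# Penn Treebank verb tags; treating VBD/VBN as "past" forms is an
# assumption made for this illustration only.
VERB_TAGS = {"VB", "VBD", "VBG", "VBN", "VBP", "VBZ"}
PAST_TAGS = {"VBD", "VBN"}

def review_features(sentences):
    """sentences: list of sentences, each a list of (token, POS tag) pairs.
    Returns [LogSentences, LogWords, VBDsum, Vsum, VRatio]."""
    tags = [tag for sent in sentences for _, tag in sent]
    vbd = math.log(sum(t in PAST_TAGS for t in tags) + 1)   # VBDsum
    vsum = math.log(sum(t in VERB_TAGS for t in tags) + 1)  # Vsum
    return [math.log(len(sentences) + 1),                   # LogSentences
            math.log(len(tags) + 1),                        # LogWords
            vbd, vsum,
            vbd / vsum if vsum else 0.0]                    # VRatio

def kmeans_2(points, iters=20):
    """Minimal Lloyd's K-means with k=2: returns one 0/1 label per point."""
    centers = [points[0], points[-1]]           # naive initialization
    labels = [0] * len(points)
    for _ in range(iters):
        dist = lambda p, c: sum((a - b) ** 2 for a, b in zip(p, c))
        labels = [0 if dist(p, centers[0]) <= dist(p, centers[1]) else 1
                  for p in points]
        for c in (0, 1):
            cluster = [p for p, l in zip(points, labels) if l == c]
            if cluster:
                centers[c] = [sum(col) / len(cluster) for col in zip(*cluster)]
    return labels
```

A past-tense-heavy review then lands in one cluster and a present-tense one in the other; which cluster is "specific" can be decided afterwards, e.g., by the average VRatio of its members.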
2.2 Discovering Context Using Word-based Method
The key idea of this method is to identify those words (more specifically, nouns) that occur with a significantly higher frequency in the specific than in the generic reviews. As explained earlier, many words describing the contextual information fit into this category. We can capture them by analyzing the dichotomy between the patterns of words in the two categories of reviews, as explained below, and identify them as follows:

1. For each review R_i, identify the set of nouns N_i appearing in it.

2. For each noun n_k, determine its weighted frequencies w_s(n_k) and w_g(n_k) corresponding to the specific (s) and generic (g) reviews, as follows:

   w_s(n_k) = |{R_i : R_i ∈ specific and n_k ∈ N_i}| / |{R_i : R_i ∈ specific}|

   and

   w_g(n_k) = |{R_i : R_i ∈ generic and n_k ∈ N_i}| / |{R_i : R_i ∈ generic}|.

3. Filter out the words n_k that have low overall frequency, i.e.,

   w(n_k) = |{R_i : n_k ∈ N_i}| / |{R_i : R_i ∈ generic or R_i ∈ specific}| < α,

   where α is a threshold value for the application (e.g., α = 0.005).

4. For each noun n_k, determine the ratio of its specific and generic weighted frequencies: ratio(n_k) = w_s(n_k) / w_g(n_k).

5. Filter out nouns with ratio(n_k) < β (e.g., β = 1.0).

6. For each remaining noun n_k left after the filtering in step 5, find the set of senses synset(n_k) using WordNet [16]. (WordNet is a large lexical database of English. Words are grouped into sets of cognitive synonyms, each expressing a distinct concept. The function synset(word) returns a list of lemmas of this word that represent distinct concepts.)

7. Combine senses into groups g_t having close meanings using the WordNet taxonomy distance. Words with several distinct meanings can be represented in several distinct groups.

8. For each group g_t, determine its weighted frequencies w_s(g_t) and w_g(g_t) through the frequencies of its members as:

   w_s(g_t) = |{R_i : R_i ∈ specific and g_t ∩ N_i ≠ ∅}| / |{R_i : R_i ∈ specific}|.

9. For each group g_t, determine the ratio of its specific and generic weighted frequencies as ratio(g_t) = w_s(g_t) / w_g(g_t).

10. Sort the groups by ratio(g_t) in descending order.

As a result of running this procedure, we obtain a list of groups of words that is sorted based on the ratio metric defined in Step 9 above. Furthermore, the contextual words are expected to be located high in the list (and we empirically show it in Section 4).

2.3 Discovering Context Using LDA-based Method
The key idea of this method is to generate a list of topics about an application using the well-known LDA approach [6] and identify among them those topics corresponding to the contextual information for that application. In particular, we proceed as follows:

1. Build an LDA model on the set of the specific reviews.

2. Apply this LDA model to all the user-generated reviews in order to obtain the set of topics T_i for each review R_i with probability higher than a certain threshold level.

3. For each topic t_k from the generated LDA model, determine its weighted frequencies w_s(t_k) and w_g(t_k) corresponding to the specific (s) and generic (g) reviews, as follows:

   w_s(t_k) = |{R_i : R_i ∈ specific and t_k ∈ T_i}| / |{R_i : R_i ∈ specific}|

   and

   w_g(t_k) = |{R_i : R_i ∈ generic and t_k ∈ T_i}| / |{R_i : R_i ∈ generic}|.

4. Filter out the topics t_k that have low overall frequency, i.e.,

   w(t_k) = |{R_i : t_k ∈ T_i}| / |{R_i : R_i ∈ generic or R_i ∈ specific}| < α,

   where α is a threshold value for the application (e.g., α = 0.005).

5. For each topic t_k, determine the ratio of its specific and generic weighted frequencies: ratio(t_k) = w_s(t_k) / w_g(t_k).

6. Filter out topics with ratio(t_k) < β (e.g., β = 1.0).

7. Sort the topics by ratio(t_k) in descending order.

As a result of running this procedure, we obtain a list of LDA topics that is sorted using the ratio metric defined in Step 5 above. Since the contextual information is usually related to the specific user experiences, we expect that these contextual LDA topics will appear high in the generated list, as in the case of the word-based method described in Section 2.2.

We next go through the lists of words and topics generated in Sections 2.2 and 2.3 and select the contextual information out of them. As shown in Section 4, this contextual information is usually located high on these two lists and therefore can be easily identified and extracted from them. The specifics are further presented in Section 4. As we can see, the list generation methods described in Sections 2.2 and 2.3 lie at the core of our context extraction methodology and make the final context selection process easy.

In summary, we proposed a method of separating the reviews pertaining to the specific user experiences from the generic reviews. We also proposed two methods of generating contextual information, one based on the LDA topics and another on generating a list of words relevant to the contextual information. In Section 3, we empirically validate our methods, and we show their usefulness and complementarity in Section 4.
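The weighted-frequency and ratio computations shared by the two procedures (Steps 2–5 of the word-based method and Steps 3–6 of the LDA-based one) can be sketched as follows; `reviews` pairs each review's class label with its feature set — the nouns N_i in the word-based case or the topics T_i in the LDA-based case. This is a sketch with illustrative names, using the example thresholds α = 0.005 and β = 1.0 from the text.

```python
def ranked_by_ratio(reviews, alpha=0.005, beta=1.0):
    """reviews: list of (label, features), label in {"specific", "generic"},
    features: set of nouns N_i (word-based) or topics T_i (LDA-based).
    Returns [(feature, ratio)] sorted by ratio(f) = w_s(f) / w_g(f), descending."""
    spec = [f for label, f in reviews if label == "specific"]
    gen = [f for label, f in reviews if label == "generic"]
    ranked = []
    for feat in set().union(*(f for _, f in reviews)):
        w = sum(feat in f for f in spec + gen) / len(reviews)
        if w < alpha:        # low overall frequency: Step 3 (words) / Step 4 (topics)
            continue
        w_s = sum(feat in f for f in spec) / len(spec)   # w_s(feat)
        w_g = sum(feat in f for f in gen) / len(gen)     # w_g(feat)
        ratio = w_s / w_g if w_g > 0 else float("inf")
        if ratio < beta:     # low ratio: Step 5 (words) / Step 6 (topics)
            continue
        ranked.append((feat, ratio))
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked
```

The word-based method then additionally groups the surviving nouns by WordNet senses (Steps 6–10) and reapplies the same ratio computation to the groups.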
Category               Restaurants          Hotels            Beauty & Spas
Cluster              specific  generic   specific generic   specific generic
Number of reviews      168       132        195     105        173     127
Number of reviews
  with context         146        25        127      13        103       9
% of reviews
  with context         87%       19%        65%     12%        59%      7%

Table 1: Specific vs. Generic Statistics

3. EXPERIMENTAL SETTINGS
To demonstrate how well our methods work in practice, we tested them on the Yelp data (www.yelp.com) that was provided for the RecSys 2013 competition. In particular, we extracted the contextual information from the reviews pertaining to the restaurants, hotels, and beauty & spas applications using the word-based and the LDA-based approaches. We describe the Yelp data in Section 3.1 and the specifics of our experiments in Section 3.2.
3.1 Dataset Descriptions
The Yelp dataset contains reviews of various businesses, such as restaurants, bars, hotels, shopping, real estate, beauty & spas, etc., provided by various users of Yelp and describing their experiences visiting these businesses, in addition to the user-specified ratings of these businesses. These reviews were collected in the Phoenix metropolitan area (including the towns of Scottsdale, Tempe and Chandler) in Arizona over a period of 6 years. For the purposes of this study, we used all the reviews in the dataset for all the 4503 restaurants (158430 reviews by 36473 users), 284 hotels (5034 reviews by 4148 users) and 764 beauty & spas (5579 reviews by 4272 users). We selected these three categories of businesses (out of 22 in total) because they contained some of the largest numbers of reviews and also differed significantly from each other.

The data about these businesses is specified with the following attributes: business ID, name, address, category of business, geolocation (longitude/latitude), number of reviews, the average rating of the reviews, and whether the business is open or not. The data about the users is specified with the following attributes: user ID, first name, number of reviews, and the average rating given by the user. Finally, the reviews are specified with the following attributes: review ID, business ID, user ID, the rating of the review, the review (textual description), and the date of the review. For instance, Figures 1 and 2 provide examples of restaurant reviews.

3.2 Applying the proposed methods
We applied our context discovery method to the three Yelp applications from Section 3.1 (Restaurants, Hotels and Beauty & Spas). As a first step, we separated all the user-generated reviews into the specific and generic classes, as explained in Section 2.1. In order to determine how well this method works on the Yelp data, we manually labeled 300 reviews as specific vs. generic for each of the three applications used in this study (i.e., restaurants, hotels and beauty & spas — 900 reviews in total). This labeled data was used for computing the performance metrics of our separation algorithm. The results of this performance evaluation are reported in Section 4. We also counted the number of occurrences of contextual information in the generic and specific reviews. The results presented in Table 1 support our claim that specific reviews contain richer contextual information than generic reviews across all three applications.

Second, we applied the word-based method described in Section 2.2 to the Yelp data. Initially, we generated the sets of nouns for the restaurants, hotels and beauty & spas applications, respectively. After we computed the weighted frequencies of the nouns and filtered out infrequent and low-ratio words (with the threshold values α = 0.005 and β = 1.0), only 1495, 1292 and 1150 nouns were left in the word lists for the restaurants, hotels and beauty & spas cases, respectively. Finally, we combined the remaining words into groups, as described in Step 7, using the Wu & Palmer similarity measure [19] with a threshold level of 0.9. As a result, we obtained 835, 755 and 512 groups of similar nouns for the restaurants, hotels and beauty & spas categories.

Third, we applied the LDA-based method described in Section 2.3 to the Yelp data. Initially, we pre-processed the reviews using standard text analysis techniques: removing punctuation marks, stop words, high-frequency words, etc. [15]. Then we ran LDA on the three preprocessed sets of reviews with m = 150 topics for each of the three applications using the standard Python module gensim [18]. After generating these topics, we removed the most infrequent ones, as described in Step 4 of the LDA-based approach (setting the parameter α = 0.005), and the low-ratio topics (Step 6, with the parameter β = 1.0). As a result, we were left with 135, 121 and 110 topics for the three applications, respectively. We describe the obtained results in the next section.

4. RESULTS
First, the results of the separation of the user-generated reviews into the specific and generic classes are presented in Table 2, which has the following entries:

• AvgSentences: the average number of sentences in reviews from the generic or specific cluster.
• AvgWords: the average number of words in reviews from the cluster.
• AvgVBDsum: the average number of verbs in past tense in reviews from the cluster.
• AvgVsum: the average number of verbs in reviews from the cluster.
• AvgVRatio: the average ratio of VBDsum and Vsum for reviews from the cluster.
• Size: the size of the cluster as a percentage of the number of all reviews in the category (restaurants, hotels and beauty & spas).
• AvgRating: the average rating for reviews from the cluster.
• Silhouette: the silhouette measure of the clusterization quality (showing how separable the clusters are).
• Precision: the precision measure for the cluster.
• Recall: the recall measure for the cluster.
• Accuracy: the overall accuracy of the clusterization with respect to the manual labeling.
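These per-cluster entries are simple aggregates over the cluster assignments and the manual labels; they might be computed along these lines (a sketch with illustrative helper names, not the paper's code):

```python
def cluster_average(values, clusters, cluster):
    """Average of a per-review quantity (e.g. sentence count or rating)
    over the reviews assigned to one cluster."""
    selected = [v for v, c in zip(values, clusters) if c == cluster]
    return sum(selected) / len(selected)

def precision_recall_accuracy(predicted, truth, positive="specific"):
    """Compare predicted cluster labels against the manual labels,
    treating `positive` as the positive class."""
    tp = sum(p == positive and t == positive for p, t in zip(predicted, truth))
    fp = sum(p == positive and t != positive for p, t in zip(predicted, truth))
    fn = sum(p != positive and t == positive for p, t in zip(predicted, truth))
    correct = sum(p == t for p, t in zip(predicted, truth))
    return tp / (tp + fp), tp / (tp + fn), correct / len(predicted)
```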
Category            Restaurants          Hotels            Beauty & Spas
Cluster           specific  generic   specific generic   specific generic
AvgSentences        9.59      5.04     10.38     5.58      9.36     4.54
AvgWords          129.42     55.97    147.81    65.48    134.5     50.88
AvgVBDsum          27.07      1.09     28.87     1.58     25.8      1.03
AvgVsum            91.54     23.93    107.43    28.88    107.22    25.65
AvgVRatio           0.43      0.02      0.40     0.06      0.38     0.03
Size               59.3%     40.7%     67.8%    32.2%     59.2%    40.8%
AvgRating           3.53      4.03      3.57     3.81      3.76     4.35
Silhouette            0.446                0.424              0.461
Precision           0.87      0.89      0.83     0.92      0.83     0.94
Recall              0.83      0.91      0.83     0.92      0.88     0.90
Accuracy              0.89                 0.88               0.90

Table 2: Clusterization quality

As we can see from Table 2, the separation process gives us two groups of reviews that are significantly different in all the presented parameters. Further, this difference is observed not only in terms of the five parameters used in the k-means clustering method to separate the generic from the specific reviews (the first five rows in Table 2), but also in terms of the average rating (AvgRating) measure, which is significantly higher for the generic than for the specific reviews across all three categories. Also, the silhouette measure is more than 0.4 for all three categories and is as high as 0.46 for one of them, demonstrating significant separation of the two clusters. Finally, note that the Accuracy measure is around 0.9 across the three categories of reviews (with respect to the labeled reviews — see Section 3.2), which is a good performance result for separating the reviews.

We next extracted the contextual information from the specific reviews (produced in the previous step) using the word- and the LDA-based methods. As explained in Section 3.2, we obtained the sorted lists of 835, 755 and 512 groups of words for the restaurants, hotels and beauty & spas categories, respectively, using the word-based approach. We went through these three lists and identified the contextual variables among them — they are marked with the check marks in Column 4 (Word) in Tables 3, 4 and 5 (the numbers in parentheses next to them identify the first occurrences of the groups of words in the sorted lists produced by the word-based method). Similarly, as explained in Section 3.2, we obtained the sorted lists of 135, 121 and 110 topics for the restaurants, hotels and beauty & spas categories, respectively, using the LDA-based approach. We also went through these three lists and identified the contextual variables among them — they are marked with the check marks in Column 5 (LDA) in Tables 3, 4 and 5 (the numbers in parentheses next to them also identify the first occurrences of the topics in the sorted lists of topics produced by the LDA-based method).

     Context variable    Frequency    Word      LDA
1    Company               56.3%     ✓(1)      ✓(6)
2    Time of the day       34.8%     ✓(77)     ✓(21)
3    Day of the week       22.5%     ✓(2)      ✓(15)
4    Advice                10.7%     ✓(13)     ✓(16)
5    Prior Visits          10.2%     ✗         ✓(26)
6    Came by car            7.8%     ✓(267)    ✓(78)
7    Compliments            4.9%     ✓(500)    ✓(74)
8    Occasion               3.9%     ✓(39)     ✓(19)
9    Reservation            3.0%     ✓(29)     ✗
10   Discount               2.9%     ✓(4)      ✗
11   Sitting outside        2.4%     ✗         ✓(64)
12   Traveling              2.4%     ✗         ✗
13   Takeout                1.9%     ✓(690)    ✗

Table 3: Restaurants (✓ = discovered, with the rank of the first occurrence in parentheses; ✗ = not discovered)

As Table 3 demonstrates, we identified the following types of contexts for the Restaurants category:

• Company: specifying with whom the user went to the restaurant (e.g., with a spouse, children, friends, co-workers, etc.).
• Time of the day: this context variable contains information about the time of the day, such as morning, evening and mid-day.
• Day of the week: specifying the day of the week (Monday, Tuesday, etc.).
• Advice: specifying the type of advice given to the user, such as a recommendation from a friend or a review on Yelp. This context indicates that the user knows the opinions of other parties about the restaurant before going there.
• Prior Visits: specifying whether the user is a first-time visitor or a regular in the restaurant.
• Came by car: specifying whether the user came to the restaurant by car or not.
• Compliments: specifying any types of discounts or special offers that the user received during the visit, such as happy hour, a free appetizer, a special offer, etc.
• Occasion: specifying the special occasion for going to the restaurant, such as a birthday, date, wedding, anniversary, business meeting, etc.
• Reservation: specifying whether the user made a prior reservation in the restaurant or not.
• Discount: specifying whether the user used any type of discount deal obtained before coming to the restaurant, such as a groupon/coupon, a voucher or a gift certificate.
1 Company 30.1% X(47) X(22) 2 Day of the week 18.9% X(8) X • Sitting outside: specifying if the user was sitting out- 3 Prior Visits 15.2% X X(25) side (vs. inside) the restaurant during his visit. 4 Time of the day 13.2% X(3) X(4) • Takeout: specifying if the user did not stay in the 5 Occasion 9.6% X(15) X(29) restaurant but ordered a takeout. 6 Reservation 9.4% X(167) X(1) 7 Discount 9.2% X(46) X(39) Note that some of this contextual information was found 8 Advice 4.1% X(2) X(8) using either the word-based (Company, TimeOfTheDay, Day- 9 Stay vs Visit 3.1% X X(19) OfTheWeek, Advice, CameByCar, Compliments, Occasion, 10 Came by car 1.8% X(113) X(75) Reservation, Discount and Takeout) or the LDA-based method (Company, TimeOfTheDay, DayOfTheWeek, Advice, Pri- Table 5: Beauty & Spas orVisits, CameByCar, Compliments, Occasion and SitOut- side). To validate the context extraction process, we went through did not capture them because these words (“reservation,” the 400 restaurant reviews (produced as described in Section “groupon” and “takeout”) got lost among some other irrele- 3.2) and identified by inspection the contextual information vant topics. in these reviews. This allowed us to identify the contextual Finally, nether method has discovered the Traveling con- information that served as the ”ground truth”. Table 3 con- text because it (a) is very infrequent and (b) is described in tains all the contextual information that we have found in more subtle ways, making it difficult to capture it. these 400 reviews (13 di↵erent types). Note that the word- In addition to Restaurants, we have also examined the and the LDA-based methods collectively found all this con- Hotels and the Beauty & Spas categories. 
The results are textual information, except for the Traveling context (that presented in Tables 4 and 5 with 10 types of contexts being determines if the user visited the restaurant while on a travel discovered for the Hotels case and 10 types for the Beauty & trip in the city or that he/she lives in that city) - 12 di↵erent Spas categories. Also, both methods missed the CityEvent types of context (out of 13). context (an event happening in the city which is the cause Furthermore, column 3 in Table 3 presents the frequencies of traveling to that city and staying in the hotel) for the with which particular types of contextual variables appear in Hotels and captured all the contextual information for the the specific reviews of restaurants. Note that the most fre- Beaty & Spas application. quently occurring popular contexts are discovered by both As these tables demonstrate, the word- and the LDA- the word- and the LDA-based methods. The di↵erences be- based methods are complementary to each other: some con- tween the two methods come in discoveries of less frequent texts were discovered by one but not by the other method. contexts. It is interesting to observe that the PriorVisits Further, collectively, these two methods discover most of the context was discovered by the LDA but not by the word- contextual information across the three applications exam- based method. This is the case because this context is usu- ined in this paper. ally represented by such expressions as “first time,”“second Figure 3 presents the performance of the word-based dis- time,” “twice” and so on, which are hard to capture by the covery method across the three applications (restaurants, word-based method because none of these expressions con- hotels and beauty& spas). On X-axis are the ordinal num- tain a clearly defined “strong” noun capturing this context. 
bers of the groups of words in the word-based list produced In contrast, the LDA-based approach captured this context as described in Section 3.2. On the Y -axis are the cumu- because LDA managed to combine the aforementioned ex- lative number of contexts y(x) discovered by examining the pressions into one topic. first x groups of words on the list. Each line in Figure 3 On the other hand, such contexts as Reservation, Discount corresponds to the appropriate application. The jumps on and Takeout were captured well by the word-based method the curves correspond to the number of the first occurrence since all the three contexts have clearly defined nouns char- of the next contextual variable in the list of groups of words. acterizing these contexts (e.g., “reservation,”“groupon” and As we can see from Figure 3, word-based method identified “takeout” respectively). In contrast, the LDA-based method eight contextual variables for each application within the 6 generated reviews. The first word-based method identifies the most important nouns that appear more frequently in the specific than in the generic reviews, and many important contextual variables appear high in this sorted list of nouns. The second LDA-based approach constructs a sorted list of topics generated by the popular LDA method [6]. We also show in the paper that many important types of context ap- pear high in the list of the constructed topics. Therefore, these contexts can easily be identified by examining these two lists, as Figures 3 and 4 demonstrate. We validated these two methods on three real-life appli- cations (Yelp reviews of Restaurants, Hotels, and Beauty& Spas) and empirically showed that the word- and the LDA- based methods (a) are complementary to each other (when- ever one misses certain contexts, the other one identifies Figure 3: Word-based method them and vice versa) and (b) collectively, they discover al- most all the contexts across the three di↵erent applications. 
Moreover, the first four contextual variables were identified within only the first 30 groups of words on the list. This supports our earlier observation that many contextual variables appear relatively high on the list of word groups and therefore can be identified easily.

Figure 4: LDA-based method

Figure 4 presents similar curves for the LDA-based method. This method managed to identify 9 contextual variables for the restaurants and hotels applications, and 8 contextual variables for the beauty & spas application, within the first 78 topics on the list of all the topics. Moreover, the first 6 contextual variables were identified within just the first 41 topics. This further supports the earlier observation that many contextual variables appear high on the topics list and therefore can be identified easily.

As discussed before, the word- and the LDA-based methods are complementary to each other. In our three applications, all the identified contextual variables could be found within the first 78 LDA topics and 29 groups of words in the case of restaurants, within 65 topics and 23 groups of words in the case of hotels, and within 75 topics and 8 groups of words in the case of beauty & spas. Therefore, the combination of the word- and the LDA-based methods identifies almost all the frequent contextual variables by examining only the top several items on the two lists.
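Cut-offs of this kind (how deep into each annotated ranking one must scan before a given set of contexts is covered) reduce to simple bookkeeping over the two lists. A sketch of that computation, using hypothetical labels rather than the paper's actual rankings:

```python
# Sketch: smallest prefix of a ranked, context-annotated list that
# covers a set of target contexts. Labels are hypothetical; None marks
# an item (word group or topic) revealing no context.

def depth_to_cover(ranked_labels, wanted):
    """Smallest prefix length whose labels cover all contexts in
    `wanted`; None if the full list never covers them."""
    remaining = set(wanted)
    if not remaining:
        return 0
    for depth, label in enumerate(ranked_labels, start=1):
        remaining.discard(label)
        if not remaining:
            return depth
    return None

words  = ["Occasion", None, "Companion", None, "Discount"]
topics = [None, "Time", "Occasion", None, None, "PriorVisits"]

targets   = {"Occasion", "Companion", "Discount", "Time", "PriorVisits"}
by_words  = targets & set(words)   # contexts reachable via the word list
by_topics = targets - by_words     # the rest must come from the topics
print(depth_to_cover(words, by_words), depth_to_cover(topics, by_topics))
# 5 6
```

Here the two lists jointly cover all five target contexts by rank 5 of the word list and rank 6 of the topic list, mirroring the per-application cut-offs reported above.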
5. CONCLUSION AND FUTURE WORK

In this paper, we presented two novel methods for systematically discovering contextual information from user-generated reviews. The first, word-based method identifies the most important nouns that appear more frequently in the specific than in the generic reviews; many important contextual variables appear high in this sorted list of nouns. The second, LDA-based approach constructs a sorted list of topics generated by the popular LDA method [6]; we also show in the paper that many important types of context appear high in the list of the constructed topics. Therefore, these contexts can easily be identified by examining these two lists, as Figures 3 and 4 demonstrate.

We validated these two methods on three real-life applications (Yelp reviews of Restaurants, Hotels, and Beauty & Spas) and empirically showed that the word- and the LDA-based methods (a) are complementary to each other (whenever one misses certain contexts, the other one identifies them, and vice versa) and (b) collectively discover almost all the contexts across the three different applications. Furthermore, in those few cases when the two methods failed to extract the relevant contextual information, the missed contexts turned out to be rare (they appear infrequently in the reviews) and more subtle (i.e., hard to describe in crisp terms). Finally, we showed that most of the contextual information was discovered quickly and effectively across the three applications.

As future research, we plan to use other text mining methods in addition to the word-based and the LDA-based approaches and to compare their effectiveness with the two methods presented in the paper. Hopefully, these improvements will help us discover even more subtle and low-frequency contexts. Since the proposed word-based and LDA-based methods constitute general-purpose approaches, they can be applied to a wide range of applications, and we plan to test them on various other (non-Yelp) cases to demonstrate their broad usefulness.

6. REFERENCES

[1] S. Aciar. Mining context information from consumer reviews. In Proceedings of the Workshop on Context-Aware Recommender Systems. ACM, 2010.
[2] G. Adomavicius, B. Mobasher, F. Ricci, and A. Tuzhilin. Context-aware recommender systems. AI Magazine, 32(3):67–80, 2011.
[3] G. Adomavicius and A. Tuzhilin. Context-aware recommender systems. In F. Ricci, L. Rokach, B. Shapira, and P. B. Kantor, editors, Recommender Systems Handbook, pages 217–253. Springer US, 2011.
[4] S. Anand and B. Mobasher. Contextual recommendation. In B. Berendt, A. Hotho, D. Mladenic, and G. Semeraro, editors, From Web to Social Web: Discovering and Deploying User and Content Profiles, volume 4737 of Lecture Notes in Computer Science, pages 142–160. Springer Berlin Heidelberg, 2007.
[5] A. Odic, M. Tkalcic, J. F. Tasic, and A. Kosir. Predicting and detecting the relevant contextual information in a movie-recommender system. Interacting with Computers, 25(1):74–90. Oxford University Press, 2013.
[6] D. M. Blei, A. Y. Ng, and M. I. Jordan. Latent Dirichlet allocation. J. Mach. Learn. Res., 3:993–1022, Mar. 2003.
[7] P. Dourish. What we talk about when we talk about context. Personal Ubiquitous Comput., 8(1):19–30, Feb. 2004.
[8] N. Hariri, B. Mobasher, and R. Burke. Context-aware music recommendation based on latent topic sequential patterns.
In Proceedings of the Sixth ACM Conference on Recommender Systems, RecSys '12, pages 131–138, New York, NY, USA, 2012. ACM.
[9] N. Hariri, B. Mobasher, and R. Burke. Query-driven context aware recommendation. In Proceedings of the 7th ACM Conference on Recommender Systems, RecSys '13, pages 9–16, New York, NY, USA, 2013. ACM.
[10] N. Hariri, B. Mobasher, R. Burke, and Y. Zheng. Context-aware recommendation based on review mining. In IJCAI '11, Proceedings of the 9th Workshop on Intelligent Techniques for Web Personalization and Recommender Systems (ITWP 2011), pages 30–36, 2011.
[11] X. Jin, Y. Zhou, and B. Mobasher. Task-oriented web user modeling for recommendation. In Proceedings of the 10th International Conference on User Modeling, UM '05, pages 109–118, Berlin, Heidelberg, 2005. Springer-Verlag.
[12] M. Kaminskas and F. Ricci. Location-adapted music recommendation using tags. In J. Konstan, R. Conejo, J. Marzo, and N. Oliver, editors, User Modeling, Adaption and Personalization, volume 6787 of Lecture Notes in Computer Science, pages 183–194. Springer Berlin Heidelberg, 2011.
[13] J. Lee and J. Lee. Context awareness by case-based reasoning in a music recommendation system. In H. Ichikawa, W.-D. Cho, I. Satoh, and H. Youn, editors, Ubiquitous Computing Systems, volume 4836 of Lecture Notes in Computer Science, pages 45–58. Springer Berlin Heidelberg, 2007.
[14] Y. Li, J. Nie, Y. Zhang, B. Wang, B. Yan, and F. Weng. Contextual recommendation based on text mining. In Proceedings of the 23rd International Conference on Computational Linguistics: Posters, COLING '10, pages 692–700, Stroudsburg, PA, USA, 2010. Association for Computational Linguistics.
[15] C. D. Manning, P. Raghavan, and H. Schütze. Introduction to Information Retrieval. Cambridge University Press, New York, NY, USA, 2008.
[16] G. A. Miller. WordNet: A lexical database for English. Communications of the ACM, 38:39–41, 1995.
[17] C. Palmisano, A. Tuzhilin, and M. Gorgoglione. Using context to improve predictive modeling of customers in personalization applications. IEEE Trans. on Knowl. and Data Eng., 20(11):1535–1549, Nov. 2008.
[18] R. Rehurek and P. Sojka. Software framework for topic modelling with large corpora. In Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks, pages 46–50, Valletta, Malta, 2010. University of Malta.
[19] Z. Wu and M. Palmer. Verb semantics and lexical selection. In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, ACL '94, pages 133–138, Stroudsburg, PA, USA, 1994. Association for Computational Linguistics.