Discovering Contextual Information from User Reviews for Recommendation Purposes

Konstantin Bauman                          Alexander Tuzhilin
Stern School of Business                   Stern School of Business
New York University                        New York University
kbauman@stern.nyu.edu                      atuzhili@stern.nyu.edu

ABSTRACT
The paper presents a new method of discovering relevant contextual information from user-generated reviews in order to provide better recommendations to the users when such reviews complement the traditional ratings used in recommender systems. In particular, we classify all the user reviews into the "context rich" specific and "context poor" generic reviews and present a word-based and an LDA-based method of extracting contextual information from the specific reviews. We also show empirically on the Yelp data that, collectively, these two methods extract almost all the relevant contextual information across three different applications and that they are complementary to each other: when one method misses certain contextual information, the other one extracts it from the reviews.

Keywords
Recommender systems; Contextual information; Online reviews; User-generated content

1. INTRODUCTION
The field of Context-Aware Recommender Systems (CARS) has experienced extensive growth since the first papers on this topic appeared in the mid-2000's [3], when it was shown that the knowledge of contextual information helps to provide better recommendations in various settings and applications, including Music [8, 9, 12, 13], Movies [5], E-commerce [17], Hotels [10], and Restaurants [14].

One of the fundamental issues in the CARS field is the question of what context is and how it should be specified. According to [2, 7], context-aware approaches are divided into representational and interactional. In the representational approach, adopted in most of the CARS papers, context can be described using a set of observable contextual variables that are known a priori and the structure of which does not change over time. In the interactional approach [4, 11], the contextual information is not known a priori and either needs to be learned or modeled using latent approaches, such as the ones described in [11]. Although most of the CARS literature has focused on the representational approach, an argument has been made that the context is not known in advance in many CARS applications and, therefore, needs to be discovered [3].

In this paper, we focus on the interactional approach to CARS and assume that the contextual information is not known in advance and is latent. Furthermore, we focus on those applications where the ratings of items provided by the users are supplemented with user-generated reviews containing the contextual information, among other things. For example, in the case of Yelp, user reviews contain valuable contextual information about user experiences of interacting with Yelp businesses, such as restaurants, bars, hotels, and beauty & spas. By analyzing these reviews, we can discover various types of rich and important contextual information that can subsequently be used for providing better recommendations.

One way to discover this latent contextual information would be to provide a rigorous formal definition of context and discover it in the texts of the user-generated reviews using some formal text mining-based context identification methods. This direct approach is difficult, however, because of the complex multidimensional task of defining the unknown contextual information in a rigorous way, identifying what constitutes context and what does not in the user-generated reviews, and dealing with the complexities of extracting it from the reviews using text mining methods.

Therefore, in this paper we propose the following indirect method for discovering relevant contextual information from the user-generated reviews. First, we observe that the contextual information is contained mainly in the specific reviews (those that describe a specific visit of a user to an establishment, such as a restaurant) and hardly appears in the generic reviews (the reviews describing overall impressions about a particular establishment). Second, words or topics describing the contextual information should appear much more frequently in the specific than in the generic reviews because the latter should mostly miss such words or topics. Therefore, if we can separate the specific from the generic reviews, compare the frequencies of words or topics appearing in the specific vs. the generic reviews, and select those words or topics having high frequency ratios, then they should contain most of the contextual information among them. This background work of applying the frequency-based method to identifying the important context-related words and topics paves the way to the final stage of inspecting these lists of words and topics.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyright 2014 for the individual papers by the papers' authors. Copying permitted for private and academic purposes. This volume is published and copyrighted by its editors.
CBRecSys 2014, October 6, 2014, Silicon Valley, CA, USA.
In this paper, we followed this indirect approach and developed an algorithm for classifying the reviews into the "context rich" specific and "context poor" generic reviews. In addition, we present a word-based and an LDA-based method of extracting contextual information from the specific reviews. We also show that, together, these two methods extract almost all the relevant contextual information across three different applications (restaurants, hotels, and beauty & spas) and that they are complementary to each other: when one method misses certain contextual information, the other one extracts it from the reviews, and vice versa. Furthermore, in those few cases when these two methods fail to extract the relevant contextual information, these types of contexts turned out to be rare (they appear infrequently in the reviews) and more subtle (i.e., it is hard to describe such contexts in crisp linguistic terms).

[1, 10, 14] present some prior work on extracting contextual information from user-generated reviews. Although presenting different approaches, these three references have one point in common: in all three papers the types of contextual information are a priori known. Therefore, the key issue in these papers is the determination of the specific values of the known contextual types based on the reviews. Although significant progress has been made on learning context from user-generated reviews, nobody has proposed a method of separating the reviews into specific and generic or presented the particular methods of extracting the contextual information from the reviews that are described in this paper.

This paper makes the following contributions. First, we propose two novel methods, a word-based and an LDA-based one, for extracting the contextual information from user-generated reviews in those CARS applications where contexts are not known in advance. Second, we validate them on three real-life applications (Restaurants, Hotels, and Beauty & Spas) and experimentally show that these two methods are (a) complementary to each other (whenever one misses certain contexts, the other one identifies them, and vice versa) and (b) collectively, they discover almost all the contexts across the three different applications. Third, we show that most of this contextual information can be discovered quickly and effectively.

2. METHOD OF CONTEXT DISCOVERY
The key idea of the proposed method is to extract the contextual information from the user-generated reviews. However, not all the reviews contain rich contextual information. For example, generic reviews, describing overall impressions about a particular restaurant or a hotel, such as the one presented in Figure 1, contain only limited contextual information, if any. In contrast, reviews of specific visits to a restaurant or stays in a hotel may contain rich contextual information. For example, the review presented in Figure 2, describing a specific dining experience in a restaurant, contains such contextual information as "lunch time," with whom the person went to the restaurant, and the date of the visit.

Figure 1: An example of a generic review

Figure 2: An example of a specific review

Therefore, the first step in the proposed approach is to separate such generic from the specific reviews, and we present a particular separation method in Section 2.1. After that, we use the specific/generic dichotomy to extract the contextual information using the two methods proposed in this paper, the first one based on the identification of the most important context-related words and the second one on the popular LDA method [6]. These two approaches are described in Sections 2.2 and 2.3, respectively.

2.1 Separating Reviews into Specific and Generic
The main idea in separating specific from generic reviews lies in the identification of certain characteristics that are prevalent in one type but not in the other type of review. For example, users who describe particular restaurant experiences tend to write long reviews and extensively use past tenses (e.g., "I came with some friends for lunch today"), while generic reviews tend to use the present tense more frequently (e.g., "they make wonderful pastas").

In this work, we identified several such features for separating the generic from the specific reviews, including (a) the length of the review, (b) the total number of verbs used in the review, and (c) the number of verbs used in past tenses. More specifically, we used the following measures in our study:

• LogSentences: logarithm of the number of sentences in the review plus one (we add one to avoid the logarithm becoming −∞ for empty reviews).
• LogWords: logarithm of the number of words used in the review plus one.
• VBDsum: logarithm of the number of verbs in the past tenses in the review plus one.
• Vsum: logarithm of the number of verbs in the review plus one.
• VRatio: the ratio of VBDsum and Vsum (VBDsum/Vsum).

Given these characteristics, we used the classical K-means clustering method to separate all the reviews into the "specific" vs. "generic" clusters. We describe the specifics of this separation method, as applied to our data, in Section 3.2. Once the two types of reviews are separated into two different classes, we next apply the word-based and LDA-based methods described in the next two sections.
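To make the separation step concrete, the five features above and the 2-means clustering can be sketched roughly as follows. This is a simplified illustration, not the paper's implementation: we assume the reviews arrive already POS-tagged (e.g., with Penn Treebank tags) and we treat the VBD/VBN tags as past-tense forms; the helper names are our own.

```python
import math

# Penn Treebank verb tags; treating VBD/VBN as "past" forms is an
# assumption made for this illustration only.
VERB_TAGS = {"VB", "VBD", "VBG", "VBN", "VBP", "VBZ"}
PAST_TAGS = {"VBD", "VBN"}

def review_features(sentences):
    """sentences: list of sentences, each a list of (token, POS tag) pairs.
    Returns [LogSentences, LogWords, VBDsum, Vsum, VRatio]."""
    tags = [tag for sent in sentences for _, tag in sent]
    vbd = math.log(sum(t in PAST_TAGS for t in tags) + 1)   # VBDsum
    vsum = math.log(sum(t in VERB_TAGS for t in tags) + 1)  # Vsum
    return [math.log(len(sentences) + 1),                   # LogSentences
            math.log(len(tags) + 1),                        # LogWords
            vbd, vsum,
            vbd / vsum if vsum else 0.0]                    # VRatio

def kmeans_2(points, iters=20):
    """Minimal Lloyd's K-means with k=2: returns one 0/1 label per point."""
    centers = [points[0], points[-1]]           # naive initialization
    labels = [0] * len(points)
    for _ in range(iters):
        dist = lambda p, c: sum((a - b) ** 2 for a, b in zip(p, c))
        labels = [0 if dist(p, centers[0]) <= dist(p, centers[1]) else 1
                  for p in points]
        for c in (0, 1):
            cluster = [p for p, l in zip(points, labels) if l == c]
            if cluster:
                centers[c] = [sum(col) / len(cluster) for col in zip(*cluster)]
    return labels
```

A past-tense-heavy review then lands in one cluster and a present-tense one in the other; which cluster is "specific" can be decided afterwards, e.g., by the average VRatio of its members.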
2.2 Discovering Context Using Word-based Method
The key idea of this method is to identify those words (more specifically, nouns) that occur with a significantly higher frequency in the specific than in the generic reviews. As explained earlier, many words describing the contextual information fit into this category. We can capture them by analyzing the dichotomy between the patterns of words in the two categories of reviews, as explained below, and identify them as follows:

1. For each review R_i, identify the set of nouns N_i appearing in it.

2. For each noun n_k, determine its weighted frequencies w_s(n_k) and w_g(n_k) corresponding to the specific (s) and generic (g) reviews, as follows:

   w_s(n_k) = |{R_i : R_i ∈ specific and n_k ∈ N_i}| / |{R_i : R_i ∈ specific}|

   and

   w_g(n_k) = |{R_i : R_i ∈ generic and n_k ∈ N_i}| / |{R_i : R_i ∈ generic}|.

3. Filter out the words n_k that have low overall frequency, i.e.,

   w(n_k) = |{R_i : n_k ∈ N_i}| / |{R_i : R_i ∈ generic or R_i ∈ specific}| < α,

   where α is a threshold value for the application (e.g., α = 0.005).

4. For each noun n_k, determine the ratio of its specific and generic weighted frequencies: ratio(n_k) = w_s(n_k) / w_g(n_k).

5. Filter out nouns with ratio(n_k) < β (e.g., β = 1.0).

6. For each remaining noun n_k left after the filtering in step 5, find the set of senses synset(n_k) using WordNet [16]. (WordNet is a large lexical database of English. Words are grouped into sets of cognitive synonyms, each expressing a distinct concept. The function synset(word) returns a list of lemmas of this word that represent distinct concepts.)

7. Combine senses into groups g_t having close meanings using the WordNet taxonomy distance. Words with several distinct meanings can be represented in several distinct groups.

8. For each group g_t, determine its weighted frequencies w_s(g_t) and w_g(g_t) through the frequencies of its members as:

   w_s(g_t) = |{R_i : R_i ∈ specific and g_t ∩ N_i ≠ ∅}| / |{R_i : R_i ∈ specific}|.

9. For each group g_t, determine the ratio of its specific and generic weighted frequencies as ratio(g_t) = w_s(g_t) / w_g(g_t).

10. Sort the groups by ratio(g_t) in descending order.

As a result of running this procedure, we obtain a list of groups of words that is sorted based on the ratio metric defined in Step 9 above. Furthermore, the contextual words are expected to be located high in the list (and we empirically show it in Section 4).

2.3 Discovering Context Using LDA-based Method
The key idea of this method is to generate a list of topics about an application using the well-known LDA approach [6] and identify among them those topics corresponding to the contextual information for that application. In particular, we proceed as follows:

1. Build an LDA model on the set of the specific reviews.

2. Apply this LDA model to all the user-generated reviews in order to obtain the set of topics T_i for each review R_i with probability higher than a certain threshold level.

3. For each topic t_k from the generated LDA model, determine its weighted frequencies w_s(t_k) and w_g(t_k) corresponding to the specific (s) and generic (g) reviews, as follows:

   w_s(t_k) = |{R_i : R_i ∈ specific and t_k ∈ T_i}| / |{R_i : R_i ∈ specific}|

   and

   w_g(t_k) = |{R_i : R_i ∈ generic and t_k ∈ T_i}| / |{R_i : R_i ∈ generic}|.

4. Filter out the topics t_k that have low overall frequency, i.e.,

   w(t_k) = |{R_i : t_k ∈ T_i}| / |{R_i : R_i ∈ generic or R_i ∈ specific}| < α,

   where α is a threshold value for the application (e.g., α = 0.005).

5. For each topic t_k, determine the ratio of its specific and generic weighted frequencies: ratio(t_k) = w_s(t_k) / w_g(t_k).

6. Filter out topics with ratio(t_k) < β (e.g., β = 1.0).

7. Sort the topics by ratio(t_k) in descending order.

As a result of running this procedure, we obtain a list of LDA topics that is sorted using the ratio metric defined in Step 5 above. Since the contextual information is usually related to the specific user experiences, we expect that these contextual LDA topics will appear high in the generated list, as in the case of the word-based method described in Section 2.2.

We next go through the lists of words and topics generated in Sections 2.2 and 2.3 and select the contextual information out of them. As shown in Section 4, this contextual information is usually located high on these two lists and therefore can be easily identified and extracted from them. The specifics are further presented in Section 4. As we can see, the list generation methods described in Sections 2.2 and 2.3 lie at the core of our context extraction methodology and make the final context selection process easy.

In summary, we proposed a method of separating the reviews pertaining to the specific user experiences from the generic reviews. We also proposed two methods of generating contextual information, one based on the LDA topics and another on generating a list of words relevant to the contextual information. In Section 3, we empirically validate our methods, and we show their usefulness and complementarity in Section 4.
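The weighted-frequency and ratio computations shared by the two procedures (Steps 2–5 of the word-based method and Steps 3–6 of the LDA-based one) can be sketched as follows; `reviews` pairs each review's class label with its feature set — the nouns N_i in the word-based case or the topics T_i in the LDA-based case. This is a sketch with illustrative names, using the example thresholds α = 0.005 and β = 1.0 from the text.

```python
def ranked_by_ratio(reviews, alpha=0.005, beta=1.0):
    """reviews: list of (label, features), label in {"specific", "generic"},
    features: set of nouns N_i (word-based) or topics T_i (LDA-based).
    Returns [(feature, ratio)] sorted by ratio(f) = w_s(f) / w_g(f), descending."""
    spec = [f for label, f in reviews if label == "specific"]
    gen = [f for label, f in reviews if label == "generic"]
    ranked = []
    for feat in set().union(*(f for _, f in reviews)):
        w = sum(feat in f for f in spec + gen) / len(reviews)
        if w < alpha:        # low overall frequency: Step 3 (words) / Step 4 (topics)
            continue
        w_s = sum(feat in f for f in spec) / len(spec)   # w_s(feat)
        w_g = sum(feat in f for f in gen) / len(gen)     # w_g(feat)
        ratio = w_s / w_g if w_g > 0 else float("inf")
        if ratio < beta:     # low ratio: Step 5 (words) / Step 6 (topics)
            continue
        ranked.append((feat, ratio))
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked
```

The word-based method then additionally groups the surviving nouns by WordNet senses (Steps 6–10) and reapplies the same ratio computation to the groups.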
Category               Restaurants          Hotels            Beauty & Spas
Cluster              specific  generic   specific generic   specific generic
Number of reviews      168       132        195     105        173     127
Number of reviews
  with context         146        25        127      13        103       9
% of reviews
  with context         87%       19%        65%     12%        59%      7%

Table 1: Specific vs. Generic Statistics

3. EXPERIMENTAL SETTINGS
To demonstrate how well our methods work in practice, we tested them on the Yelp data (www.yelp.com) that was provided for the RecSys 2013 competition. In particular, we extracted the contextual information from the reviews pertaining to the restaurants, hotels, and beauty & spas applications using the word-based and the LDA-based approaches. We describe the Yelp data in Section 3.1 and the specifics of our experiments in Section 3.2.
3.1 Dataset Descriptions
The Yelp dataset contains reviews of various businesses, such as restaurants, bars, hotels, shopping, real estate, beauty & spas, etc., provided by various users of Yelp and describing their experiences visiting these businesses, in addition to the user-specified ratings of these businesses. These reviews were collected in the Phoenix metropolitan area (including the towns of Scottsdale, Tempe and Chandler) in Arizona over a period of 6 years. For the purposes of this study, we used all the reviews in the dataset for all the 4503 restaurants (158430 reviews by 36473 users), 284 hotels (5034 reviews by 4148 users) and 764 beauty & spas (5579 reviews by 4272 users). We selected these three categories of businesses (out of 22 in total) because they contained some of the largest numbers of reviews and also differed significantly from each other.

The data about these businesses is specified with the following attributes: business ID, name, address, category of business, geolocation (longitude/latitude), number of reviews, the average rating of the reviews, and whether the business is open or not. The data about the users is specified with the following attributes: user ID, first name, number of reviews, and the average rating given by the user. Finally, the reviews are specified with the following attributes: review ID, business ID, user ID, the rating of the review, the review (textual description), and the date of the review. For instance, Figures 1 and 2 provide examples of restaurant reviews.

3.2 Applying the proposed methods
We applied our context discovery method to the three Yelp applications from Section 3.1 (Restaurants, Hotels and Beauty & Spas). As a first step, we separated all the user-generated reviews into the specific and generic classes, as explained in Section 2.1. In order to determine how well this method works on the Yelp data, we manually labeled 300 reviews as specific vs. generic for each of the three applications used in this study (i.e., restaurants, hotels and beauty & spas — 900 reviews in total). This labeled data was used for computing the performance metrics of our separation algorithm. The results of this performance evaluation are reported in Section 4. We also counted the number of occurrences of contextual information in the generic and specific reviews. The results presented in Table 1 support our claim that specific reviews contain richer contextual information than generic reviews across all three applications.

Second, we applied the word-based method described in Section 2.2 to the Yelp data. Initially, we generated the sets of nouns for the restaurants, hotels and beauty & spas applications, respectively. After we computed the weighted frequencies of the nouns and filtered out infrequent and low-ratio words (with the threshold values α = 0.005 and β = 1.0), only 1495, 1292 and 1150 nouns were left in the word lists for the restaurants, hotels and beauty & spas cases, respectively. Finally, we combined the remaining words into groups, as described in Step 7, using the Wu & Palmer similarity measure [19] with a threshold level of 0.9. As a result, we obtained 835, 755 and 512 groups of similar nouns for the restaurants, hotels and beauty & spas categories.

Third, we applied the LDA-based method described in Section 2.3 to the Yelp data. Initially, we pre-processed the reviews using standard text analysis techniques: removing punctuation marks, stop words, high-frequency words, etc. [15]. Then we ran LDA on the three preprocessed sets of reviews with m = 150 topics for each of the three applications using the standard Python module gensim [18]. After generating these topics, we removed the most infrequent ones, as described in Step 4 of the LDA-based approach (setting the parameter α = 0.005), and the low-ratio topics (Step 6, with the parameter β = 1.0). As a result, we were left with 135, 121 and 110 topics for the three applications, respectively. We describe the obtained results in the next section.

4. RESULTS
First, the results of the separation of the user-generated reviews into the specific and generic classes are presented in Table 2, which has the following entries:

• AvgSentences: the average number of sentences in reviews from the generic or specific cluster.
• AvgWords: the average number of words in reviews from the cluster.
• AvgVBDsum: the average number of verbs in past tense in reviews from the cluster.
• AvgVsum: the average number of verbs in reviews from the cluster.
• AvgVRatio: the average ratio of VBDsum and Vsum for reviews from the cluster.
• Size: the size of the cluster as a percentage of the number of all reviews in the category (restaurants, hotels and beauty & spas).
• AvgRating: the average rating for reviews from the cluster.
• Silhouette: the silhouette measure of the clusterization quality (showing how separable the clusters are).
• Precision: the precision measure for the cluster.
• Recall: the recall measure for the cluster.
• Accuracy: the overall accuracy of the clusterization with respect to the manual labeling.
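These per-cluster entries are simple aggregates over the cluster assignments and the manual labels; they might be computed along these lines (a sketch with illustrative helper names, not the paper's code):

```python
def cluster_average(values, clusters, cluster):
    """Average of a per-review quantity (e.g. sentence count or rating)
    over the reviews assigned to one cluster."""
    selected = [v for v, c in zip(values, clusters) if c == cluster]
    return sum(selected) / len(selected)

def precision_recall_accuracy(predicted, truth, positive="specific"):
    """Compare predicted cluster labels against the manual labels,
    treating `positive` as the positive class."""
    tp = sum(p == positive and t == positive for p, t in zip(predicted, truth))
    fp = sum(p == positive and t != positive for p, t in zip(predicted, truth))
    fn = sum(p != positive and t == positive for p, t in zip(predicted, truth))
    correct = sum(p == t for p, t in zip(predicted, truth))
    return tp / (tp + fp), tp / (tp + fn), correct / len(predicted)
```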
Category            Restaurants          Hotels            Beauty & Spas
Cluster           specific  generic   specific generic   specific generic
AvgSentences        9.59      5.04     10.38     5.58      9.36     4.54
AvgWords          129.42     55.97    147.81    65.48    134.5     50.88
AvgVBDsum          27.07      1.09     28.87     1.58     25.8      1.03
AvgVsum            91.54     23.93    107.43    28.88    107.22    25.65
AvgVRatio           0.43      0.02      0.40     0.06      0.38     0.03
Size               59.3%     40.7%     67.8%    32.2%     59.2%    40.8%
AvgRating           3.53      4.03      3.57     3.81      3.76     4.35
Silhouette            0.446                0.424              0.461
Precision           0.87      0.89      0.83     0.92      0.83     0.94
Recall              0.83      0.91      0.83     0.92      0.88     0.90
Accuracy              0.89                 0.88               0.90

Table 2: Clusterization quality

As we can see from Table 2, the separation process gives us two groups of reviews that are significantly different in all the presented parameters. Further, this difference is observed not only in terms of the five parameters used in the k-means clustering method to separate the generic from the specific reviews (the first five rows in Table 2), but also in terms of the average rating (AvgRating) measure, which is significantly higher for the generic than for the specific reviews across all three categories. Also, the silhouette measure is more than 0.4 for all three categories and is as high as 0.46 for one of them, demonstrating significant separation of the two clusters. Finally, note that the Accuracy measure is around 0.9 across the three categories of reviews (with respect to the labeled reviews — see Section 3.2), which is a good performance result for separating the reviews.

We next extracted the contextual information from the specific reviews (produced in the previous step) using the word- and the LDA-based methods. As explained in Section 3.2, we obtained the sorted lists of 835, 755 and 512 groups of words for the restaurants, hotels and beauty & spas categories, respectively, using the word-based approach. We went through these three lists and identified the contextual variables among them — they are marked with the check marks in Column 4 (Word) in Tables 3, 4 and 5 (the numbers in parentheses next to them identify the first occurrences of the groups of words in the sorted lists produced by the word-based method). Similarly, as explained in Section 3.2, we obtained the sorted lists of 135, 121 and 110 topics for the restaurants, hotels and beauty & spas categories, respectively, using the LDA-based approach. We also went through these three lists and identified the contextual variables among them — they are marked with the check marks in Column 5 (LDA) in Tables 3, 4 and 5 (the numbers in parentheses next to them also identify the first occurrences of the topics in the sorted lists of topics produced by the LDA-based method).

     Context variable    Frequency    Word      LDA
1    Company               56.3%     ✓(1)      ✓(6)
2    Time of the day       34.8%     ✓(77)     ✓(21)
3    Day of the week       22.5%     ✓(2)      ✓(15)
4    Advice                10.7%     ✓(13)     ✓(16)
5    Prior Visits          10.2%     ✗         ✓(26)
6    Came by car            7.8%     ✓(267)    ✓(78)
7    Compliments            4.9%     ✓(500)    ✓(74)
8    Occasion               3.9%     ✓(39)     ✓(19)
9    Reservation            3.0%     ✓(29)     ✗
10   Discount               2.9%     ✓(4)      ✗
11   Sitting outside        2.4%     ✗         ✓(64)
12   Traveling              2.4%     ✗         ✗
13   Takeout                1.9%     ✓(690)    ✗

Table 3: Restaurants (✓ = discovered, with the rank of the first occurrence in parentheses; ✗ = not discovered)

As Table 3 demonstrates, we identified the following types of contexts for the Restaurants category:

• Company: specifying with whom the user went to the restaurant (e.g., with a spouse, children, friends, co-workers, etc.).
• Time of the day: this context variable contains information about the time of the day, such as morning, evening and mid-day.
• Day of the week: specifying the day of the week (Monday, Tuesday, etc.).
• Advice: specifying the type of advice given to the user, such as a recommendation from a friend or a review on Yelp. This context indicates that the user knows the opinions of other parties about the restaurant before going there.
• Prior Visits: specifying whether the user is a first-time visitor or a regular in the restaurant.
• Came by car: specifying whether the user came to the restaurant by car or not.
• Compliments: specifying any types of discounts or special offers that the user received during the visit, such as happy hour, a free appetizer, a special offer, etc.
• Occasion: specifying the special occasion for going to the restaurant, such as a birthday, date, wedding, anniversary, business meeting, etc.
• Reservation: specifying whether the user made a prior reservation in the restaurant or not.
• Discount: specifying whether the user used any type of discount deal obtained before coming to the restaurant, such as a groupon/coupon, a voucher or a gift certificate.
1 Company 30.1% X(47) X(22) 2 Day of the week 18.9% X(8) X • Sitting outside: specifying if the user was sitting out- 3 Prior Visits 15.2% X X(25) side (vs. inside) the restaurant during his visit. 4 Time of the day 13.2% X(3) X(4) • Takeout: specifying if the user did not stay in the 5 Occasion 9.6% X(15) X(29) restaurant but ordered a takeout. 6 Reservation 9.4% X(167) X(1) 7 Discount 9.2% X(46) X(39) Note that some of this contextual information was found 8 Advice 4.1% X(2) X(8) using either the word-based (Company, TimeOfTheDay, Day- 9 Stay vs Visit 3.1% X X(19) OfTheWeek, Advice, CameByCar, Compliments, Occasion, 10 Came by car 1.8% X(113) X(75) Reservation, Discount and Takeout) or the LDA-based method (Company, TimeOfTheDay, DayOfTheWeek, Advice, Pri- Table 5: Beauty & Spas orVisits, CameByCar, Compliments, Occasion and SitOut- side). To validate the context extraction process, we went through did not capture them because these words (“reservation,” the 400 restaurant reviews (produced as described in Section “groupon” and “takeout”) got lost among some other irrele- 3.2) and identified by inspection the contextual information vant topics. in these reviews. This allowed us to identify the contextual Finally, nether method has discovered the Traveling con- information that served as the ”ground truth”. Table 3 con- text because it (a) is very infrequent and (b) is described in tains all the contextual information that we have found in more subtle ways, making it difficult to capture it. these 400 reviews (13 di↵erent types). Note that the word- In addition to Restaurants, we have also examined the and the LDA-based methods collectively found all this con- Hotels and the Beauty & Spas categories. 
The results are textual information, except for the Traveling context (that presented in Tables 4 and 5 with 10 types of contexts being determines if the user visited the restaurant while on a travel discovered for the Hotels case and 10 types for the Beauty & trip in the city or that he/she lives in that city) - 12 di↵erent Spas categories. Also, both methods missed the CityEvent types of context (out of 13). context (an event happening in the city which is the cause Furthermore, column 3 in Table 3 presents the frequencies of traveling to that city and staying in the hotel) for the with which particular types of contextual variables appear in Hotels and captured all the contextual information for the the specific reviews of restaurants. Note that the most fre- Beaty & Spas application. quently occurring popular contexts are discovered by both As these tables demonstrate, the word- and the LDA- the word- and the LDA-based methods. The di↵erences be- based methods are complementary to each other: some con- tween the two methods come in discoveries of less frequent texts were discovered by one but not by the other method. contexts. It is interesting to observe that the PriorVisits Further, collectively, these two methods discover most of the context was discovered by the LDA but not by the word- contextual information across the three applications exam- based method. This is the case because this context is usu- ined in this paper. ally represented by such expressions as “first time,”“second Figure 3 presents the performance of the word-based dis- time,” “twice” and so on, which are hard to capture by the covery method across the three applications (restaurants, word-based method because none of these expressions con- hotels and beauty& spas). On X-axis are the ordinal num- tain a clearly defined “strong” noun capturing this context. 
bers of the groups of words in the word-based list produced In contrast, the LDA-based approach captured this context as described in Section 3.2. On the Y -axis are the cumu- because LDA managed to combine the aforementioned ex- lative number of contexts y(x) discovered by examining the pressions into one topic. first x groups of words on the list. Each line in Figure 3 On the other hand, such contexts as Reservation, Discount corresponds to the appropriate application. The jumps on and Takeout were captured well by the word-based method the curves correspond to the number of the first occurrence since all the three contexts have clearly defined nouns char- of the next contextual variable in the list of groups of words. acterizing these contexts (e.g., “reservation,”“groupon” and As we can see from Figure 3, word-based method identified “takeout” respectively). In contrast, the LDA-based method eight contextual variables for each application within the 6 generated reviews. The first word-based method identifies the most important nouns that appear more frequently in the specific than in the generic reviews, and many important contextual variables appear high in this sorted list of nouns. The second LDA-based approach constructs a sorted list of topics generated by the popular LDA method [6]. We also show in the paper that many important types of context ap- pear high in the list of the constructed topics. Therefore, these contexts can easily be identified by examining these two lists, as Figures 3 and 4 demonstrate. We validated these two methods on three real-life appli- cations (Yelp reviews of Restaurants, Hotels, and Beauty& Spas) and empirically showed that the word- and the LDA- based methods (a) are complementary to each other (when- ever one misses certain contexts, the other one identifies Figure 3: Word-based method them and vice versa) and (b) collectively, they discover al- most all the contexts across the three di↵erent applications. 
Moreover, the first four contextual variables were identified within only the first 30 groups of words on the list. This supports our earlier observation that many contextual variables appear relatively high on the list of word groups and therefore can be identified easily.

Figure 4: LDA-based method

Figure 4 presents similar curves for the LDA-based method. This method managed to identify 9 contextual variables for the restaurants and hotels applications, and 8 contextual variables for the beauty & spas application, within the first 78 topics on the list of all the topics. Moreover, the first 6 contextual variables were identified within just the first 41 topics. This further supports the earlier observation that many contextual variables appear high on the topics list and therefore can be identified easily.

As discussed before, the word- and the LDA-based methods are complementary to each other. In our three applications, all the identified contextual variables could be found within the first 78 LDA topics and 29 groups of words in the case of restaurants, within 65 topics and 23 groups of words in the case of hotels, and within 75 topics and 8 groups of words in the case of beauty & spas. Therefore, the combination of the word- and the LDA-based methods identifies almost all the frequent contextual variables by examining only the top several items on the two lists.
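Cut-offs of this kind (how deep into each annotated ranking one must scan before a given set of contexts is covered) reduce to simple bookkeeping over the two lists. A sketch of that computation, using hypothetical labels rather than the paper's actual rankings:

```python
# Sketch: smallest prefix of a ranked, context-annotated list that
# covers a set of target contexts. Labels are hypothetical; None marks
# an item (word group or topic) revealing no context.

def depth_to_cover(ranked_labels, wanted):
    """Smallest prefix length whose labels cover all contexts in
    `wanted`; None if the full list never covers them."""
    remaining = set(wanted)
    if not remaining:
        return 0
    for depth, label in enumerate(ranked_labels, start=1):
        remaining.discard(label)
        if not remaining:
            return depth
    return None

words  = ["Occasion", None, "Companion", None, "Discount"]
topics = [None, "Time", "Occasion", None, None, "PriorVisits"]

targets   = {"Occasion", "Companion", "Discount", "Time", "PriorVisits"}
by_words  = targets & set(words)   # contexts reachable via the word list
by_topics = targets - by_words     # the rest must come from the topics
print(depth_to_cover(words, by_words), depth_to_cover(topics, by_topics))
# 5 6
```

Here the two lists jointly cover all five target contexts by rank 5 of the word list and rank 6 of the topic list, mirroring the per-application cut-offs reported above.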
5. CONCLUSION AND FUTURE WORK

In this paper, we presented two novel methods for systematically discovering contextual information from user-generated reviews. The first, word-based method identifies the most important nouns that appear more frequently in the specific than in the generic reviews; many important contextual variables appear high in this sorted list of nouns. The second, LDA-based approach constructs a sorted list of topics generated by the popular LDA method [6]; we also show in the paper that many important types of context appear high in the list of the constructed topics. Therefore, these contexts can easily be identified by examining these two lists, as Figures 3 and 4 demonstrate.

We validated these two methods on three real-life applications (Yelp reviews of Restaurants, Hotels, and Beauty & Spas) and empirically showed that the word- and the LDA-based methods (a) are complementary to each other (whenever one misses certain contexts, the other one identifies them, and vice versa) and (b) collectively discover almost all the contexts across the three different applications. Furthermore, in those few cases when the two methods failed to extract the relevant contextual information, the missed contexts turned out to be rare (they appear infrequently in the reviews) and more subtle (i.e., hard to describe in crisp terms). Finally, we showed that most of the contextual information was discovered quickly and effectively across the three applications.

As future research, we plan to use other text mining methods in addition to the word-based and the LDA-based approaches and to compare their effectiveness with the two methods presented in the paper. Hopefully, these improvements will help us discover even more subtle and low-frequency contexts. Since the proposed word-based and LDA-based methods constitute general-purpose approaches, they can be applied to a wide range of applications, and we plan to test them on various other (non-Yelp) cases to demonstrate their broad usefulness.

6. REFERENCES

[1] S. Aciar. Mining context information from consumer reviews. In Proceedings of the Workshop on Context-Aware Recommender Systems. ACM, 2010.
[2] G. Adomavicius, B. Mobasher, F. Ricci, and A. Tuzhilin. Context-aware recommender systems. AI Magazine, 32(3):67–80, 2011.
[3] G. Adomavicius and A. Tuzhilin. Context-aware recommender systems. In F. Ricci, L. Rokach, B. Shapira, and P. B. Kantor, editors, Recommender Systems Handbook, pages 217–253. Springer US, 2011.
[4] S. Anand and B. Mobasher. Contextual recommendation. In B. Berendt, A. Hotho, D. Mladenic, and G. Semeraro, editors, From Web to Social Web: Discovering and Deploying User and Content Profiles, volume 4737 of Lecture Notes in Computer Science, pages 142–160. Springer Berlin Heidelberg, 2007.
[5] A. Odic, M. Tkalcic, J. F. Tasic, and A. Kosir. Predicting and detecting the relevant contextual information in a movie-recommender system. Interacting with Computers, 25(1):74–90. Oxford University Press, 2013.
[6] D. M. Blei, A. Y. Ng, and M. I. Jordan. Latent Dirichlet allocation. J. Mach. Learn. Res., 3:993–1022, Mar. 2003.
[7] P. Dourish. What we talk about when we talk about context. Personal Ubiquitous Comput., 8(1):19–30, Feb. 2004.
[8] N. Hariri, B. Mobasher, and R. Burke. Context-aware music recommendation based on latent topic sequential patterns.
In Proceedings of the Sixth ACM Conference on Recommender Systems, RecSys '12, pages 131–138, New York, NY, USA, 2012. ACM.
[9] N. Hariri, B. Mobasher, and R. Burke. Query-driven context aware recommendation. In Proceedings of the 7th ACM Conference on Recommender Systems, RecSys '13, pages 9–16, New York, NY, USA, 2013. ACM.
[10] N. Hariri, B. Mobasher, R. Burke, and Y. Zheng. Context-aware recommendation based on review mining. In IJCAI '11, Proceedings of the 9th Workshop on Intelligent Techniques for Web Personalization and Recommender Systems (ITWP 2011), pages 30–36, 2011.
[11] X. Jin, Y. Zhou, and B. Mobasher. Task-oriented web user modeling for recommendation. In Proceedings of the 10th International Conference on User Modeling, UM '05, pages 109–118, Berlin, Heidelberg, 2005. Springer-Verlag.
[12] M. Kaminskas and F. Ricci. Location-adapted music recommendation using tags. In J. Konstan, R. Conejo, J. Marzo, and N. Oliver, editors, User Modeling, Adaption and Personalization, volume 6787 of Lecture Notes in Computer Science, pages 183–194. Springer Berlin Heidelberg, 2011.
[13] J. Lee and J. Lee. Context awareness by case-based reasoning in a music recommendation system. In H. Ichikawa, W.-D. Cho, I. Satoh, and H. Youn, editors, Ubiquitous Computing Systems, volume 4836 of Lecture Notes in Computer Science, pages 45–58. Springer Berlin Heidelberg, 2007.
[14] Y. Li, J. Nie, Y. Zhang, B. Wang, B. Yan, and F. Weng. Contextual recommendation based on text mining. In Proceedings of the 23rd International Conference on Computational Linguistics: Posters, COLING '10, pages 692–700, Stroudsburg, PA, USA, 2010. Association for Computational Linguistics.
[15] C. D. Manning, P. Raghavan, and H. Schütze. Introduction to Information Retrieval. Cambridge University Press, New York, NY, USA, 2008.
[16] G. A. Miller. WordNet: A lexical database for English. Communications of the ACM, 38:39–41, 1995.
[17] C. Palmisano, A. Tuzhilin, and M. Gorgoglione. Using context to improve predictive modeling of customers in personalization applications. IEEE Trans. on Knowl. and Data Eng., 20(11):1535–1549, Nov. 2008.
[18] R. Rehurek and P. Sojka. Software framework for topic modelling with large corpora. In Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks, pages 46–50, Valletta, Malta, 2010. University of Malta.
[19] Z. Wu and M. Palmer. Verb semantics and lexical selection. In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, ACL '94, pages 133–138, Stroudsburg, PA, USA, 1994. Association for Computational Linguistics.