<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Discovering Contextual Information from User Reviews for Recommendation Purposes</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Konstantin Bauman</string-name>
          <email>kbauman@stern.nyu.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alexander Tuzhilin</string-name>
          <email>atuzhili@stern.nyu.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Stern School of Business, New York University</institution>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The paper presents a new method of discovering relevant contextual information from user-generated reviews in order to provide better recommendations to users when such reviews complement the traditional ratings used in recommender systems. In particular, we classify all the user reviews into “context rich” specific and “context poor” generic reviews and present a word-based and an LDA-based method of extracting contextual information from the specific reviews. We also show empirically on the Yelp data that, collectively, these two methods extract almost all the relevant contextual information across three different applications and that they are complementary to each other: when one method misses certain contextual information, the other one extracts it from the reviews.</p>
      </abstract>
      <kwd-group>
        <kwd>Recommender systems</kwd>
        <kwd>Contextual information</kwd>
        <kwd>Online reviews</kwd>
        <kwd>User-generated content</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>
        The field of Context-Aware Recommender Systems (CARS)
has experienced extensive growth since the first papers on
this topic appeared in the mid-2000s [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] when it was shown
that the knowledge of contextual information helps to
provide better recommendations in various settings and
applications, including Music [
        <xref ref-type="bibr" rid="ref12 ref13 ref8 ref9">8, 9, 12, 13</xref>
        ], Movies [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], E-commerce
[
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], Hotels [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], and Restaurants [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
      </p>
      <p>
        One of the fundamental issues in the CARS field is the
question of what context is and how it should be specified.
According to [
        <xref ref-type="bibr" rid="ref2 ref7">2, 7</xref>
        ], context-aware approaches are divided
into representational and interactional. In the
representational approach, adopted in most of the CARS papers,
context can be described using a set of observable
contextual variables that are known a priori and the structure of
which does not change over time. In the interactional
approach [
        <xref ref-type="bibr" rid="ref11 ref4">4, 11</xref>
        ], the contextual information is not known a
priori and either needs to be learned or modeled using latent
approaches, such as the ones described in [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. Although
most of the CARS literature has focused on the
representational approach, an argument has been made that the
context is not known in advance in many CARS applications
and, therefore, needs to be discovered [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>Copyright 2014 for the individual papers by the paper’s authors. Copying permitted for private and academic purposes. This volume is published and copyrighted by its editors. CBRecSys 2014, October 6, 2014, Silicon Valley, CA, USA.</p>
      <p>In this paper, we focus on the interactional approach to
CARS and assume that the contextual information is not
known in advance and is latent. Furthermore, we focus on
those applications where the ratings of items provided by the
users are supplemented with user-generated reviews
containing contextual information, among other things.
For example, in the case of Yelp, user reviews contain valuable
contextual information about user experiences of interacting
with Yelp businesses, such as restaurants, bars, hotels, and
beauty &amp; spas. By analyzing these reviews, we can discover
various types of rich and important contextual information
that can subsequently be used for providing better
recommendations.</p>
      <p>One way to discover this latent contextual information
would be to provide a rigorous formal definition of context
and discover it in the texts of the user-generated reviews
using some formal text mining-based context identification
methods. This direct approach is difficult, however, because
of the complex multidimensional task of defining the
unknown contextual information in a rigorous way, identifying
what constitutes context and what does not in the
user-generated reviews, and dealing with the complexities of
extracting it from the reviews using text mining methods.</p>
      <p>Therefore, in this paper we propose the following
indirect method for discovering relevant contextual information
from the user-generated reviews. First, we observe that the
contextual information is contained mainly in the specific
reviews (those that describe a specific visit of a user to an
establishment, such as a restaurant) and hardly appears in the
generic reviews (the reviews describing overall impressions
about a particular establishment). Second, words or topics
describing the contextual information should appear much
more frequently in the specific than in the generic reviews
because the latter should mostly miss such words or topics.
Therefore, if we can separate the specific from the generic
reviews, compare the frequencies of words or topics appearing
in the specific vs. the generic reviews and select those words
or topics having high frequency ratios, then they should
contain most of the contextual information among them. This
background work of applying the frequency-based method
to identify the important context-related words and
topics paves the way to the final stage of inspecting these lists
of words and topics.</p>
      <p>In this paper, we followed this indirect approach and
developed an algorithm for classifying the reviews into the
“context rich” specific and “context poor” generic reviews.
In addition, we present a word-based and an LDA-based
method of extracting contextual information from the
specific reviews. We also show that, together, these two
methods extract almost all the relevant contextual information
across three different applications (restaurants, hotels, and
beauty &amp; spas) and that they are complementary to each
other: when one method misses certain contextual
information, the other one extracts it from the reviews and vice
versa. Furthermore, in those few cases when these two
methods fail to extract the relevant contextual information, these
types of contexts turned out to be rare (they appear infrequently
in the reviews) and more subtle (i.e., it is hard to
describe such contexts in crisp linguistic terms).</p>
      <p>
        [
        <xref ref-type="bibr" rid="ref1 ref10 ref14">1, 10, 14</xref>
        ] present some prior work on extracting
contextual information from the user-generated reviews. Although
presenting different approaches, these three references have
one point in common: in all three papers the types of
contextual information are known a priori. Therefore, the
key issue in these papers is the determination of the specific
values of the known contextual types based on the reviews.
      </p>
      <p>Although significant progress has been made on learning
context from user-generated reviews, no prior work has proposed a
method of separating the reviews into specific and generic ones
or presented the particular methods of extracting the
contextual information from the reviews that are described in
this paper.</p>
      <p>This paper makes the following contributions. First, we
proposed two novel methods, a word-based and an
LDA-based one, of extracting the contextual information from the
user-generated reviews in those CARS applications where
contexts are not known in advance. Second, we validated
them on three real-life applications (Restaurants, Hotels,
and Beauty &amp; Spas) and experimentally showed that these
two methods (a) are complementary to each other
(whenever one misses certain contexts, the other one identifies
them and vice versa) and (b) collectively discover
almost all the contexts across the three different applications.
Third, we show that most of this contextual information can
be discovered quickly and effectively.</p>
    </sec>
    <sec id="sec-2">
      <title>METHOD OF CONTEXT DISCOVERY</title>
      <p>The key idea of the proposed method is to extract the
contextual information from the user-generated reviews.
However, not all the reviews contain rich contextual information.
For example, generic reviews, describing overall impressions
about a particular restaurant or a hotel, such as the one
presented in Figure 1, contain only limited contextual
information, if any. In contrast, reviews of specific visits to a restaurant or
stays in a hotel may contain rich contextual information.
For example, the review presented in Figure 2, which
describes a specific dining experience in a restaurant, contains
such contextual information as “lunch time,” with whom the
person went to the restaurant, and the date of the visit.
Therefore, the first step in the proposed approach is to
separate the generic from the specific reviews, and we present
a particular separation method in Section 2.1.</p>
      <p>After that, we use the specific/generic dichotomy to
extract the contextual information using the two methods
proposed in this paper, the first one based on the identification
of the most important context-related words and the second
based on LDA topics.</p>
      <p>The main idea in separating specific from generic reviews
lies in identification of certain characteristics that are
prevalent in one type but not in the other type of review. For
example, users who describe particular restaurant experiences
tend to write long reviews and extensively use past tenses
(e.g., “I came with some friends for lunch today”), while
generic reviews tend to use present tense more frequently
(e.g., “they make wonderful pastas”).</p>
      <p>In this work, we identified several such features for
separating the generic from the specific reviews, including (a)
the length of the review, (b) the total number of verbs used
in the review and (c) the number of verbs used in past
tenses. More specifically, we used the following measures
in our study:
• LogSentences: logarithm of the number of sentences in
the review plus one (see footnote 1).
• LogWords: logarithm of the number of words used in
the review plus one.
• VBDsum: logarithm of the number of verbs in the past
tenses in the review plus one.
• Vsum: logarithm of the number of verbs in the review
plus one.
• VRatio: the ratio of VBDsum and Vsum, $\mathit{VRatio} = \mathit{VBDsum} / \mathit{Vsum}$.</p>
      <p>Given these characteristics, we used the classical K-means
clustering method to separate all the reviews into the
“specific” vs. “generic” clusters. We describe the specifics of this
separation method, as applied to our data, in Section 3.2.</p>
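      <p>As an illustration, the five measures and the two-cluster separation can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation used in the study: the sentence, word and verb counts are assumed to be available already (e.g., from a part-of-speech tagger), the names review_features and two_means are hypothetical, and both the centroid initialization and the treatment of reviews without verbs are illustrative choices.</p>
      <preformat><![CDATA[
```python
import math


def review_features(n_sentences, n_words, n_verbs, n_past_verbs):
    """Return (LogSentences, LogWords, VBDsum, Vsum, VRatio) for one review."""
    log_sentences = math.log(n_sentences + 1)
    log_words = math.log(n_words + 1)
    vbd_sum = math.log(n_past_verbs + 1)
    v_sum = math.log(n_verbs + 1)
    # Assumption: a review with no verbs gets VRatio = 0 to avoid dividing by zero.
    v_ratio = vbd_sum / v_sum if v_sum > 0 else 0.0
    return (log_sentences, log_words, vbd_sum, v_sum, v_ratio)


def two_means(points, iterations=20):
    """Plain 2-means clustering.

    Assumption: centroids start at the first and the last point, which keeps
    the sketch deterministic; a real run would use a library implementation.
    """
    centroids = [points[0], points[-1]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid (Euclidean distance).
        labels = [
            min((0, 1), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return labels


# Hypothetical counts per review: (sentences, words, verbs, past-tense verbs).
counts = [(2, 25, 4, 0), (3, 30, 5, 1), (12, 220, 35, 28), (10, 180, 30, 22)]
features = [review_features(*c) for c in counts]
labels = two_means(features)
```
]]></preformat>
      <p>In this toy input, the two short reviews with few past-tense verbs end up in one cluster and the two long, past-tense-heavy reviews in the other, which mirrors the generic vs. specific intuition described above.</p>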
      <p>Once the two types of reviews are separated into two
different classes, we next apply the word-based and LDA-based
methods described in the next two sections.</p>
      <p>Footnote 1: We added one to avoid the problem of empty reviews, for which the logarithm would otherwise become $-\infty$.</p>
      <p>The key idea of this method is to identify those words
(more specifically, nouns) that occur with a significantly
higher frequency in the specific than in the generic reviews.
As explained earlier, many words describing the
contextual information fit into this category. We can
capture them by analyzing the dichotomy between the patterns
of words in the two categories of reviews, as follows:
1. For each review $R_i$, identify the set of nouns $N_i$
appearing in it.
2. For each noun $n_k$, determine its weighted frequencies
$w_s(n_k)$ and $w_g(n_k)$ corresponding to the specific ($s$)
and generic ($g$) reviews, as follows:
$$w_s(n_k) = \frac{|\{R_i : R_i \in \mathit{specific} \text{ and } n_k \in N_i\}|}{|\{R_i : R_i \in \mathit{specific}\}|}$$
and
$$w_g(n_k) = \frac{|\{R_i : R_i \in \mathit{generic} \text{ and } n_k \in N_i\}|}{|\{R_i : R_i \in \mathit{generic}\}|}.$$
3. Filter out the words $n_k$ that have low overall frequency, i.e.,
$$w(n_k) = \frac{|\{R_i : n_k \in N_i\}|}{|\{R_i : R_i \in \mathit{generic} \text{ or } R_i \in \mathit{specific}\}|} &lt; \alpha,$$
where $\alpha$ is a threshold value for the application (e.g., $\alpha = 0.005$).
4. For each noun $n_k$, determine the ratio of its specific and
generic weighted frequencies: $\mathit{ratio}(n_k) = \frac{w_s(n_k)}{w_g(n_k)}$.
5. Filter out nouns with $\mathit{ratio}(n_k) &lt; \beta$ (e.g., $\beta = 1.0$).
6. For each noun $n_k$ remaining after the filtering in Step 5,
find the set of senses $\mathit{synset}(n_k)$ using WordNet (see footnote 2) [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ].
7. Combine senses into groups $g_t$ having close meanings
using the WordNet taxonomy distance. Words with
several distinct meanings can be represented in several
distinct groups.
8. For each group $g_t$, determine its weighted frequencies
$w_s(g_t)$ and $w_g(g_t)$ through the frequencies of its members as
$$w_s(g_t) = \frac{|\{R_i : R_i \in \mathit{specific} \text{ and } g_t \cap N_i \neq \emptyset\}|}{|\{R_i : R_i \in \mathit{specific}\}|},$$
with $w_g(g_t)$ defined analogously over the generic reviews.
9. For each group $g_t$, determine the ratio of its specific and
generic weighted frequencies as $\mathit{ratio}(g_t) = \frac{w_s(g_t)}{w_g(g_t)}$.
10. Sort the groups by $\mathit{ratio}(g_t)$ in descending order.
      </p>
      <p>As a result of running this procedure, we obtain a list of
groups of words that are sorted based on the ratio metric
defined in Step 9 above. Furthermore, the contextual words
are expected to be located high in the list (and we
empirically show it in Section 4).</p>
      <p>Footnote 2: WordNet is a large lexical database of English. Words are
grouped into sets of cognitive synonyms, each expressing a
distinct concept. The function synset(word) returns a list of
lemmas of this word that represent distinct concepts.</p>
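      <p>Steps 2 to 5 of the procedure above, followed by the final sort, can be sketched in Python as follows. This is a minimal sketch under stated assumptions: each review is assumed to be already reduced to its set of nouns, the WordNet grouping of Steps 6 to 8 is omitted, the function names and the toy data are hypothetical, and assigning an infinite ratio to a noun that never occurs in generic reviews is an illustrative choice that the procedure itself does not prescribe.</p>
      <preformat><![CDATA[
```python
def weighted_frequency(item, reviews):
    """Share of reviews whose item set contains `item`."""
    if not reviews:
        return 0.0
    return sum(1 for items in reviews if item in items) / len(reviews)


def rank_by_ratio(specific, generic, alpha=0.005, beta=1.0):
    """Apply Steps 2-5 to per-review noun sets and sort by descending ratio."""
    all_reviews = specific + generic
    vocabulary = set().union(*all_reviews)
    ranked = []
    for noun in vocabulary:
        # Step 3: drop nouns that are too rare overall.
        if weighted_frequency(noun, all_reviews) < alpha:
            continue
        # Step 2: weighted frequencies in specific and generic reviews.
        w_s = weighted_frequency(noun, specific)
        w_g = weighted_frequency(noun, generic)
        # Step 4. Assumption: a noun absent from generic reviews gets ratio = inf.
        ratio = w_s / w_g if w_g > 0 else float("inf")
        # Step 5: drop nouns that are not typical of specific reviews.
        if ratio < beta:
            continue
        ranked.append((noun, ratio))
    # Descending ratio; ties are broken alphabetically to stay deterministic.
    return sorted(ranked, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical toy data: one set of nouns per review.
specific_reviews = [
    {"lunch", "friend", "pasta"},
    {"birthday", "friend"},
    {"lunch", "service"},
]
generic_reviews = [{"pasta", "service"}, {"service", "friend"}, {"pasta"}]
ranking = rank_by_ratio(specific_reviews, generic_reviews)
```
]]></preformat>
      <p>On this toy input, “birthday” and “lunch” occur only in specific reviews and therefore lead the list, “friend” follows with a ratio of 2, and “pasta” and “service” are filtered out because they are more typical of generic reviews.</p>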
    </sec>
    <sec id="sec-4">
      <title>Discovering Context Using LDA-based Method</title>
      <p>
        The key idea of this method is to generate a list of topics
about an application using the well-known LDA approach [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]
and identify among them those topics corresponding to the
contextual information for that application. In particular,
we proceed as follows:
1. Build an LDA model on the set of the specific reviews.
2. Apply this LDA model to all the user-generated
reviews in order to obtain, for each review $R_i$, the set of topics $T_i$
whose probability is higher than a certain
threshold level.
3. For each topic $t_k$ from the generated LDA model,
determine its weighted frequencies $w_s(t_k)$ and $w_g(t_k)$
corresponding to the specific ($s$) and generic ($g$) reviews,
as follows:
$$w_s(t_k) = \frac{|\{R_i : R_i \in \mathit{specific} \text{ and } t_k \in T_i\}|}{|\{R_i : R_i \in \mathit{specific}\}|}$$
and
$$w_g(t_k) = \frac{|\{R_i : R_i \in \mathit{generic} \text{ and } t_k \in T_i\}|}{|\{R_i : R_i \in \mathit{generic}\}|}.$$
4. Filter out the topics $t_k$ that have low overall frequency, i.e.,
$$w(t_k) = \frac{|\{R_i : t_k \in T_i\}|}{|\{R_i : R_i \in \mathit{generic} \text{ or } R_i \in \mathit{specific}\}|} &lt; \alpha,$$
where $\alpha$ is a threshold value for the application (e.g., $\alpha = 0.005$).
5. For each topic $t_k$, determine the ratio of its specific and
generic weighted frequencies: $\mathit{ratio}(t_k) = \frac{w_s(t_k)}{w_g(t_k)}$.
6. Filter out topics with $\mathit{ratio}(t_k) &lt; \beta$ (e.g., $\beta = 1.0$).
7. Sort the topics by $\mathit{ratio}(t_k)$ in descending order.
      </p>
      <sec id="sec-4-1">
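        <p>Steps 2 to 7 of the LDA-based procedure can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the LDA model itself is not trained here, each review is assumed to come with a list of (topic id, probability) pairs such as a topic-modeling library would produce, the probability threshold of 0.2 and the function names are hypothetical, and giving an infinite ratio to a topic that never occurs in generic reviews is an illustrative choice.</p>
        <preformat><![CDATA[
```python
from collections import Counter


def topic_sets(distributions, threshold):
    """Step 2: per review, keep the topics whose probability exceeds the threshold."""
    return [{t for t, p in dist if p > threshold} for dist in distributions]


def topic_ratios(specific, generic, threshold=0.2, alpha=0.005, beta=1.0):
    """Steps 3-7 on per-review topic distributions (non-empty inputs assumed)."""
    s_sets = topic_sets(specific, threshold)
    g_sets = topic_sets(generic, threshold)
    # Number of reviews in which each topic passes the threshold.
    s_counts = Counter(t for ts in s_sets for t in ts)
    g_counts = Counter(t for ts in g_sets for t in ts)
    total = len(s_sets) + len(g_sets)
    result = []
    for topic in sorted(set(s_counts) | set(g_counts)):
        # Step 4: drop topics with a low overall frequency.
        if (s_counts[topic] + g_counts[topic]) / total < alpha:
            continue
        # Step 3: weighted frequencies in specific and generic reviews.
        w_s = s_counts[topic] / len(s_sets)
        w_g = g_counts[topic] / len(g_sets)
        # Step 5. Assumption: a topic absent from generic reviews gets ratio = inf.
        ratio = w_s / w_g if w_g > 0 else float("inf")
        # Step 6: keep only topics that are typical of specific reviews.
        if ratio >= beta:
            result.append((topic, ratio))
    # Step 7: descending ratio; the sort is stable, so ties keep topic-id order.
    result.sort(key=lambda pair: -pair[1])
    return result


# Hypothetical (topic id, probability) pairs, one list per review.
specific = [
    [(0, 0.7), (1, 0.25)],
    [(0, 0.5), (2, 0.45)],
    [(1, 0.9)],
    [(0, 0.6), (1, 0.1)],
]
generic = [[(1, 0.8)], [(2, 0.6), (0, 0.3)], [(2, 0.95)], [(1, 0.5), (2, 0.4)]]
ranked = topic_ratios(specific, generic)
```
]]></preformat>
        <p>In this toy input, topic 0 passes the threshold in three of four specific reviews but in only one of four generic reviews, so it heads the list with a ratio of 3; topic 2 is dropped because it is more frequent in the generic reviews.</p>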
        <p>As a result of running this procedure, we obtain a list
of LDA topics that is sorted using the ratio metric defined
in Step 5 above. Since the contextual information is usually
related to the specific user experiences, we expect that these
contextual LDA topics will appear high in the generated list,
as in the case of the word-based method described in Section
2.2.</p>
        <p>We next go through the lists of words and topics generated
in Sections 2.2 and 2.3 and select the contextual information
out of them. As is shown in Section 4, this contextual
information is usually located high on these two lists and
therefore can be easily identified and extracted from them. The
specifics are further presented in Section 4. As we can see,
the list generation methods described in Sections 2.2 and
2.3 lie at the core of our context extraction methodology
and make the final context selection process easy.</p>
        <p>In summary, we proposed a method of separating the
reviews pertaining to the specific user experiences from the
generic reviews. We also proposed two methods of
generating contextual information, one based on LDA topics
and the other on generating a list of words relevant to the
contextual information.</p>
        <p>In Section 3, we empirically validate our methods, and in
Section 4 we show their usefulness and complementarity.</p>
        <p>Table 1: Percentage of reviews with context, by category and cluster (specific / generic): Restaurants 87% / 19%; Hotels 65% / 12%; Beauty &amp; Spas 59% / 7%.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>EXPERIMENTAL SETTINGS</title>
      <p>To demonstrate how well our methods work in practice,
we tested them on the Yelp data (www.yelp.com) that was
provided for the RecSys 2013 competition. In particular,
we extracted the contextual information from the reviews
pertaining to restaurants, hotels and beauty &amp; spas
applications using the word-based and the LDA-based approaches.
We describe the Yelp data in Section 3.1 and the specifics
of our experiments in Section 3.2.</p>
    </sec>
    <sec id="sec-6">
      <title>Dataset Descriptions</title>
      <p>The Yelp dataset contains reviews of various businesses,
such as restaurants, bars, hotels, shopping, real estate, beauty
&amp; spas, etc., provided by various users of Yelp describing
their experiences visiting these businesses, in addition to
the user-specified ratings of these businesses. These reviews
were collected in the Phoenix metropolitan area (including
towns of Scottsdale, Tempe and Chandler) in Arizona over
the period of 6 years. For the purposes of this study, we used
all the reviews in the dataset for all the 4503 restaurants
(158430 reviews by 36473 users), 284 hotels (5034 reviews
by 4148 users) and 764 beauty &amp; spas (5579 reviews by 4272
users). We selected these three categories of businesses (out
of 22 in total) because they contained some of the largest
numbers of reviews and also differed significantly from each
other.</p>
      <p>The data about these businesses is specified with the
following attributes: business ID, name, address, category of
business, geolocation (longitude/latitude), number of reviews,
the average rating of the reviews, and whether the business
is open or not. The data about the users is specified with
the following attributes: user ID, first name, number of
reviews, and the average rating given by the user. Finally, the
reviews are specified with the following attributes: review
ID, business ID, user ID, the rating of the review, the
review (textual description), and the date of the review. For
instance, Figures 1 and 2 provide examples of restaurant
reviews.</p>
    </sec>
    <sec id="sec-7">
      <title>Applying the proposed methods</title>
      <p>We applied our context discovery method to the three
Yelp applications from Section 3.1 (Restaurants, Hotels and
Beauty &amp; Spas). As a first step, we have separated all the
user-generated reviews into the specific and generic classes,
as explained in Section 2.1. In order to determine how well
this method works on the Yelp data, we manually labeled
300 reviews into specific vs. generic for each of the three
applications used in this study (i.e., restaurants, hotels and
beauty &amp; spas - 900 reviews in total). This labeled data was
used for computing performance metrics of our separation
algorithm. The results of this performance evaluation are
reported in Section 4.</p>
      <p>We have also counted the number of occurrences of
contextual information in generic and specific reviews. The results
presented in Table 1 support our claim that specific reviews
contain richer contextual information than generic reviews
across all the three applications.</p>
      <p>
        Second, we have applied the word-based method described
in Section 2.2 to the Yelp data. Initially, we generated sets of
nouns for restaurants, hotels and beauty &amp; spas applications
respectively. After we computed the weighted frequencies of
nouns and filtered out infrequent and low-ratio words
(using the threshold values α = 0.005 and β = 1.0), only 1495,
1292 and 1150 nouns were left in the word lists for the
restaurants, hotels and beauty &amp; spas cases respectively. Finally,
we combined the remaining words into groups, as described
in Step 7, using the Wu&amp;Palmer Similarity measure [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] with
the threshold level of 0.9. As a result, we obtained 835, 755,
512 groups of similar nouns for the restaurants, hotels and
beauty &amp; spas categories.
      </p>
      <p>
        Third, we have applied the LDA-based method described
in Section 2.3 to the Yelp data. Initially, we pre-processed
the reviews using the standard text analysis techniques by
removing punctuation marks, stop words, high-frequency
words, etc. [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. Then we ran LDA on the three
preprocessed sets of reviews with m = 150 topics for each of the
three applications using the standard Python module
gensim[
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. After generating these topics, we removed the most
infrequent ones, as described in Step 4 of the LDA-based
approach (setting the parameter α = 0.005), and the low-ratio
topics (Step 6), setting the parameter β = 1.0. As a result,
we were left with 135, 121 and 110 topics for the
three applications, respectively.
      </p>
      <p>We describe the obtained results in the next section.</p>
    </sec>
    <sec id="sec-8">
      <title>RESULTS</title>
      <p>First, the results of separating the user-generated
reviews into the specific and generic classes are presented in
Table 2, which has the following entries:
• AvgSentences: the average number of sentences in
reviews from the generic or specific cluster.
• AvgWords: the average number of words in reviews
from the cluster.
• AvgVBDsum: the average number of verbs in past
tense in reviews from the cluster.
• AvgVsum: the average number of verbs in reviews from
the cluster.
• AvgVRatio: the average ratio of VBDsum and Vsum
for reviews from the cluster.
• Size: the size of the cluster as a percentage of the number
of all reviews in the category (restaurants, hotels and
beauty &amp; spas).
• AvgRating: the average rating for reviews from the
cluster.
• Silhouette: the silhouette measure of the clustering
quality (showing how separable the clusters are).
• Precision: the precision measure for the cluster.
• Recall: the recall measure for the cluster.
• Accuracy: the overall accuracy of the clustering with
respect to the manual labeling.</p>
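      <p>The Precision, Recall and Accuracy entries compare the cluster assignments with the manual labels. A minimal Python sketch of how such measures can be computed is given below; the function name cluster_metrics, the label strings and the toy labels are hypothetical, and the sketch assumes that each cluster has already been mapped to the “specific” or “generic” class.</p>
      <preformat><![CDATA[
```python
def cluster_metrics(predicted, actual, positive="specific"):
    """Precision and recall for the `positive` class, plus overall accuracy."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p == positive and a == positive)
    fp = sum(1 for p, a in pairs if p == positive and a != positive)
    fn = sum(1 for p, a in pairs if p != positive and a == positive)
    correct = sum(1 for p, a in pairs if p == a)
    # Guard against empty denominators (no predicted or no actual positives).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = correct / len(pairs) if pairs else 0.0
    return precision, recall, accuracy


# Hypothetical cluster assignments and manual labels for eight reviews.
predicted = ["specific", "specific", "generic", "generic",
             "specific", "generic", "generic", "specific"]
actual = ["specific", "generic", "generic", "generic",
          "specific", "specific", "generic", "specific"]
precision, recall, accuracy = cluster_metrics(predicted, actual)
```
]]></preformat>
      <p>Calling the function with positive="generic" gives the precision and recall of the generic cluster, while the accuracy value stays the same for both classes.</p>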
      <p>As we can see from Table 2, the separation process gives
us two groups of reviews that are significantly different in
all the presented parameters. Further, this difference is
observed not only in terms of the five parameters used in the
K-means clustering method to separate the generic from
the specific reviews (first five rows in Table 2), but also in
terms of the average rating (AvgRating) measure (which is
significantly higher for the generic than for the specific
reviews across all the three categories). Also, the silhouette
measure is more than 0.4 for all the three categories and is
as high as 0.46 for one of them, demonstrating significant
separation of the two clusters. Finally, note that the
Accuracy measure is around 0.9 across the three categories of
reviews (with respect to the labeled reviews - see Section
3.2), which is a good performance result for separating the
reviews.</p>
      <p>We next extracted the contextual information from the
specific reviews (produced in the previous step) using the
word- and the LDA-based methods. As explained in
Section 3.2, we obtained the sorted lists of 835, 755, 512 groups
of words for restaurants, hotels and beauty &amp; spas
categories respectively using the word-based approach. We went
through these three lists and identified the contextual
variables among them - they are marked with the check marks
in Column 4 (Word) in Tables 3, 4 and 5 (the numbers in
parentheses next to them identify the first occurrences of
the group of words in the sorted lists of the groups of words
produced by the word-based method).</p>
      <p>Similarly, as explained in Section 3.2, we obtained the
sorted lists of 135, 121 and 110 topics for restaurants, hotels
and beauty &amp; spas categories respectively using the
LDA-based approach. We also went through these three lists and
identified the contextual variables among them - they are
marked with the check marks in Column 5 (LDA) in Tables
3, 4 and 5 (the numbers in parentheses next to them also
identify the first occurrences of the topics in the sorted lists
of the topics produced by the LDA-based method).</p>
      <p>As Table 3 demonstrates, we identified the following types
of contexts for the Restaurants category:
• Company: specifying with whom the user went to the
restaurant (e.g., with a spouse, children, friends,
co-workers, etc.).
• Time of the day: this context variable contains
information about the time of the day, such as morning,
evening and mid-day.
• Day of the week: specifying the day of the week
(Monday, Tuesday, etc.).
• Advice: specifying the type of advice given to the
user, such as a recommendation from a friend or a
review on Yelp. This context indicates that the user
knows the opinions of other parties about the
restaurant before going there.
• Prior Visits: specifying if the user is a first-time
visitor or a regular in the restaurant.
• Came by car: specifying if the user came to the
restaurant by car or not.
• Compliments: specifying any types of discounts or
special offers that the user received during the visit, such as
happy hour, free appetizer, special offer, etc.
• Occasion: specifying the special occasion for going to
the restaurant, such as birthday, date, wedding,
anniversary, business meeting, etc.
• Reservation: specifying if the user made a prior
reservation in the restaurant or not.
• Discount: specifying if the user used any types of
discount deals that he or she obtained before coming to
the restaurant, such as a groupon/coupon, a voucher or
a gift certificate.
• Sitting outside: specifying if the user was sitting
outside (vs. inside) the restaurant during the visit.
• Takeout: specifying if the user did not stay in the
restaurant but ordered a takeout.</p>
      <p>Note that some of this contextual information was found
using either the word-based (Company, TimeOfTheDay,
DayOfTheWeek, Advice, CameByCar, Compliments, Occasion,
Reservation, Discount and Takeout) or the LDA-based method
(Company, TimeOfTheDay, DayOfTheWeek, Advice,
PriorVisits, CameByCar, Compliments, Occasion and
SitOutside).</p>
      <p>To validate the context extraction process, we went through
the 400 restaurant reviews (produced as described in Section
3.2) and identified by inspection the contextual information
in these reviews. This allowed us to identify the contextual
information that served as the “ground truth”. Table 3
contains all the contextual information that we have found in
these 400 reviews (13 different types). Note that the
word- and the LDA-based methods collectively found all this
contextual information, i.e., 12 different types of context (out of 13),
except for the Traveling context (which
determines whether the user visited the restaurant while traveling
to the city or lives in that city).</p>
      <p>Furthermore, column 3 in Table 3 presents the frequencies
with which particular types of contextual variables appear in
the specific reviews of restaurants. Note that the most
frequently occurring contexts are discovered by both
the word- and the LDA-based methods. The differences
between the two methods come in discoveries of less frequent
contexts. It is interesting to observe that the PriorVisits
context was discovered by the LDA-based but not by the
word-based method. This is the case because this context is
usually represented by such expressions as “first time,” “second
time,” “twice” and so on, which are hard to capture by the
word-based method because none of these expressions
contain a clearly defined “strong” noun capturing this context.
In contrast, the LDA-based approach captured this context
because LDA managed to combine the aforementioned
expressions into one topic.</p>
      <p>On the other hand, such contexts as Reservation, Discount
and Takeout were captured well by the word-based method
since all the three contexts have clearly defined nouns
characterizing them (e.g., “reservation,” “groupon” and
“takeout” respectively). In contrast, the LDA-based method
did not capture them because these words (“reservation,”
“groupon” and “takeout”) got lost among some other
irrelevant topics.</p>
      <sec id="sec-8-1">
        <p>Table 3: Contextual information discovered for the Restaurants category. The Word and LDA columns mark the contexts found by each method, with the position of the first occurrence in the corresponding sorted list given in parentheses.</p>
        <p>Finally, neither method has discovered the Traveling
context because it (a) is very infrequent and (b) is described in
more subtle ways, making it difficult to capture.</p>
        <p>In addition to Restaurants, we have also examined the
Hotels and the Beauty &amp; Spas categories. The results are
presented in Tables 4 and 5, with 10 types of contexts being
discovered for the Hotels category and 10 types for the Beauty &amp;
Spas category. Also, both methods missed the CityEvent
context (an event happening in the city which is the cause
of traveling to that city and staying in the hotel) for the
Hotels category and captured all the contextual information for the
Beauty &amp; Spas application.</p>
        <p>As these tables demonstrate, the word- and the
LDA-based methods are complementary to each other: some
contexts were discovered by one but not by the other method.
Further, collectively, these two methods discover most of the
contextual information across the three applications
examined in this paper.</p>
        <p>Figure 3 presents the performance of the word-based
discovery method across the three applications (restaurants,
hotels and beauty &amp; spas). The X-axis shows the ordinal
numbers of the groups of words in the word-based list produced
as described in Section 3.2. The Y-axis shows the
cumulative number of contexts y(x) discovered by examining the
first x groups of words on the list. Each line in Figure 3
corresponds to one application. The jumps on
the curves mark the positions at which the next contextual
variable first occurs in the list of groups of words.
As we can see from Figure 3, the word-based method identified
eight contextual variables for each application within the
first 300 groups of words on the list. Moreover, the first four
contextual variables were identified from only the first 30 groups
of words on the list. This supports our earlier observation
that many contextual variables appear relatively high on the
list of word groups and therefore can be easily identified.</p>
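        <p>The cumulative curve y(x) described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used to produce Figure 3: it assumes that each item on a ranked list has already been labelled by hand with the contextual variable it reveals, or with no label. The function names and the toy list are hypothetical.</p>
        <preformat>
```python
from typing import Dict, List, Optional, Sequence


def first_occurrences(ranked_labels: Sequence[Optional[str]]) -> Dict[str, int]:
    """Map each context to the 1-based rank of the first item labelled with it."""
    first: Dict[str, int] = {}
    for rank, label in enumerate(ranked_labels, start=1):
        if label is not None and label not in first:
            first[label] = rank
    return first


def cumulative_curve(ranked_labels: Sequence[Optional[str]]) -> List[int]:
    """Return y(1), ..., y(n): distinct contexts seen among the first x items."""
    seen = set()
    curve = []
    for label in ranked_labels:
        if label is not None:
            seen.add(label)
        curve.append(len(seen))
    return curve


# Toy ranked list with hypothetical labels (not data from the paper).
# None marks an item that reveals no contextual variable.
toy_list = ["Reservation", None, "Discount", "Reservation", None, "Takeout"]

print(cumulative_curve(toy_list))   # [1, 1, 2, 2, 2, 3]
print(first_occurrences(toy_list))  # {'Reservation': 1, 'Discount': 3, 'Takeout': 6}
```
        </preformat>
        <p>The ranks returned by the first function are the positions of the jumps on such a curve; the same functions apply to a ranked list of topics.</p>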
        <p>Figure 4 presents similar curves for the LDA-based method.
This method identified 9 contextual variables for the
restaurants and hotels applications, and 8 contextual
variables for the beauty &amp; spas application, from the first 78
topics on the list of all the topics. Moreover, the first 6
contextual variables were identified within just the first 41 topics. This
further supports the earlier observation that many
contextual variables appear high on the topics list and therefore
can be easily identified.</p>
        <p>As discussed before, the word-based and the LDA-based
methods are complementary to each other. In our three
applications, all the identified contextual variables could be
found within the first 78 LDA topics and 29 groups of words
in the case of restaurants, 65 topics and 23 groups of words in
the case of hotels, and 75 topics and 8 groups of words in the case of
beauty &amp; spas. Therefore, the combination of the word-based and the
LDA-based methods identifies almost all the frequent
contextual variables by examining only the top several items on
the two lists.</p>
      </sec>
    </sec>
    <sec id="sec-9">
      <title>CONCLUSION AND FUTURE WORK</title>
      <p>
        In this paper, we presented two novel methods for
systematically discovering contextual information from
user-generated reviews. The first, word-based method identifies
the most important nouns that appear more frequently in
the specific than in the generic reviews, and many important
contextual variables appear high in this sorted list of nouns.
The second, LDA-based approach constructs a sorted list of
topics generated by the popular LDA method [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. We also
show in the paper that many important types of context
appear high in the list of the constructed topics. Therefore,
these contexts can easily be identified by examining these
two lists, as Figures 3 and 4 demonstrate.
      </p>
      <p>We validated these two methods on three real-life
applications (Yelp reviews of Restaurants, Hotels, and Beauty &amp;
Spas) and empirically showed that the word-based and the
LDA-based methods (a) are complementary to each other
(whenever one misses certain contexts, the other one identifies
them, and vice versa) and (b) collectively discover
almost all the contexts across the three different applications.
Furthermore, in those few cases when these two methods failed
to extract the relevant contextual information, the missed
contexts turned out to be rare (appearing infrequently in the
reviews) and more subtle (i.e., hard to describe
in crisp terms). Finally, we showed that most of
the contextual information was discovered quickly and
effectively across the three applications.</p>
      <p>As future research, we plan to use other text mining
methods in addition to the word-based and the LDA-based
approaches and compare their effectiveness with the two
methods presented in the paper. Hopefully, these
additional methods will help us discover even more subtle and
low-frequency contexts. Since the proposed word-based and
LDA-based methods constitute general-purpose approaches,
they can be applied to a wide range of applications, and we
plan to test them on various other (non-Yelp) cases
to demonstrate the broad usefulness of these methods.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S.</given-names>
            <surname>Aciar</surname>
          </string-name>
          .
          <article-title>Mining context information from consumers reviews</article-title>
          .
          <source>In Proceedings of Workshop on Context-Aware Recommender System. ACM</source>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>G.</given-names>
            <surname>Adomavicius</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Mobasher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Ricci</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Tuzhilin</surname>
          </string-name>
          .
          <article-title>Context-aware recommender systems</article-title>
          .
          <source>AI Magazine</source>
          ,
          <volume>32</volume>
          (
          <issue>3</issue>
          ):
          <fpage>67</fpage>
          -
          <lpage>80</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>G.</given-names>
            <surname>Adomavicius</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Tuzhilin</surname>
          </string-name>
          .
          <article-title>Context-aware recommender systems</article-title>
          . In F. Ricci,
          <string-name>
            <given-names>L.</given-names>
            <surname>Rokach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Shapira</surname>
          </string-name>
          , and P. B. Kantor, editors,
          <source>Recommender Systems Handbook</source>
          , pages
          <fpage>217</fpage>
          -
          <lpage>253</lpage>
          . Springer US,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S.</given-names>
            <surname>Anand</surname>
          </string-name>
          and
          <string-name>
            <given-names>B.</given-names>
            <surname>Mobasher</surname>
          </string-name>
          .
          <article-title>Contextual recommendation</article-title>
          . In
          <string-name>
            <given-names>B.</given-names>
            <surname>Berendt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hotho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Mladenic</surname>
          </string-name>
          , and G. Semeraro, editors,
          <source>From Web to Social Web: Discovering and Deploying User and Content Profiles</source>
          , volume
          <volume>4737</volume>
          of Lecture Notes in Computer Science, pages
          <fpage>142</fpage>
          -
          <lpage>160</lpage>
          . Springer Berlin Heidelberg,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Odic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Tkalcic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. F.</given-names>
            <surname>Tasic</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Kosir</surname>
          </string-name>
          .
          <article-title>Predicting and detecting the relevant contextual information in a movie-recommender system</article-title>
          .
          <source>In Interacting with Computers</source>
          ,
          <volume>25</volume>
          (
          <issue>1</issue>
          ), pages
          <fpage>74</fpage>
          -
          <lpage>90</lpage>
          . Oxford University Press,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>D. M.</given-names>
            <surname>Blei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. Y.</given-names>
            <surname>Ng</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M. I.</given-names>
            <surname>Jordan</surname>
          </string-name>
          .
          <article-title>Latent Dirichlet allocation</article-title>
          .
          <source>J. Mach. Learn. Res.</source>
          ,
          <volume>3</volume>
          :
          <fpage>993</fpage>
          -
          <lpage>1022</lpage>
          , Mar.
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>P.</given-names>
            <surname>Dourish</surname>
          </string-name>
          .
          <article-title>What we talk about when we talk about context</article-title>
          .
          <source>Personal Ubiquitous Comput.</source>
          ,
          <volume>8</volume>
          (
          <issue>1</issue>
          ):
          <fpage>19</fpage>
          -
          <lpage>30</lpage>
          , Feb.
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>N.</given-names>
            <surname>Hariri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Mobasher</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Burke</surname>
          </string-name>
          .
          <article-title>Context-aware music recommendation based on latent topic sequential patterns</article-title>
          .
          <source>In Proceedings of the Sixth ACM Conference on Recommender Systems, RecSys '12</source>
          , pages
          <fpage>131</fpage>
          -
          <lpage>138</lpage>
          , New York, NY, USA,
          <year>2012</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>N.</given-names>
            <surname>Hariri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Mobasher</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Burke</surname>
          </string-name>
          .
          <article-title>Query-driven context aware recommendation</article-title>
          .
          <source>In Proceedings of the 7th ACM Conference on Recommender Systems, RecSys '13</source>
          , pages
          <fpage>9</fpage>
          -
          <lpage>16</lpage>
          , New York, NY, USA,
          <year>2013</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>N.</given-names>
            <surname>Hariri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Mobasher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Burke</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zheng</surname>
          </string-name>
          .
          <article-title>Context-aware recommendation based on review mining</article-title>
          .
          <source>In IJCAI '11, Proceedings of the 9th Workshop on Intelligent Techniques for Web Personalization and Recommender Systems (ITWP 2011)</source>
          , pages
          <fpage>30</fpage>
          -
          <lpage>36</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>X.</given-names>
            <surname>Jin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhou</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B.</given-names>
            <surname>Mobasher</surname>
          </string-name>
          .
          <article-title>Task-oriented web user modeling for recommendation</article-title>
          .
          <source>In Proceedings of the 10th International Conference on User Modeling, UM'05</source>
          , pages
          <fpage>109</fpage>
          -
          <lpage>118</lpage>
          , Berlin, Heidelberg,
          <year>2005</year>
          . Springer-Verlag.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>M.</given-names>
            <surname>Kaminskas</surname>
          </string-name>
          and
          <string-name>
            <given-names>F.</given-names>
            <surname>Ricci</surname>
          </string-name>
          .
          <article-title>Location-adapted music recommendation using tags</article-title>
          . In J. Konstan,
          <string-name>
            <given-names>R.</given-names>
            <surname>Conejo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Marzo</surname>
          </string-name>
          , and N. Oliver, editors,
          <source>User Modeling, Adaption and Personalization</source>
          , volume
          <volume>6787</volume>
          of Lecture Notes in Computer Science, pages
          <fpage>183</fpage>
          -
          <lpage>194</lpage>
          . Springer Berlin Heidelberg,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>J.</given-names>
            <surname>Lee</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Lee</surname>
          </string-name>
          .
          <article-title>Context awareness by case-based reasoning in a music recommendation system</article-title>
          . In H. Ichikawa, W.-D. Cho,
          <string-name>
            <given-names>I.</given-names>
            <surname>Satoh</surname>
          </string-name>
          , and H. Youn, editors,
          <source>Ubiquitous Computing Systems</source>
          , volume
          <volume>4836</volume>
          of Lecture Notes in Computer Science, pages
          <fpage>45</fpage>
          -
          <lpage>58</lpage>
          . Springer Berlin Heidelberg,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Nie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Yan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Weng</surname>
          </string-name>
          .
          <article-title>Contextual recommendation based on text mining</article-title>
          .
          <source>In Proceedings of the 23rd International Conference on Computational Linguistics: Posters, COLING '10</source>
          , pages
          <fpage>692</fpage>
          -
          <lpage>700</lpage>
          , Stroudsburg, PA, USA,
          <year>2010</year>
          .
          Association for Computational Linguistics
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Manning</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Raghavan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>H.</given-names>
            <surname>Schütze</surname>
          </string-name>
          .
          <source>Introduction to Information Retrieval</source>
          . Cambridge University Press, New York, NY, USA,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>G. A.</given-names>
            <surname>Miller</surname>
          </string-name>
          .
          <article-title>WordNet: A lexical database for English</article-title>
          .
          <source>Communications of the ACM</source>
          ,
          <volume>38</volume>
          :
          <fpage>39</fpage>
          -
          <lpage>41</lpage>
          ,
          <year>1995</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>C.</given-names>
            <surname>Palmisano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Tuzhilin</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Gorgoglione</surname>
          </string-name>
          .
          <article-title>Using context to improve predictive modeling of customers in personalization applications</article-title>
          .
          <source>IEEE Trans. on Knowl. and Data Eng.</source>
          ,
          <volume>20</volume>
          (
          <issue>11</issue>
          ):
          <fpage>1535</fpage>
          -
          <lpage>1549</lpage>
          , Nov.
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>R.</given-names>
            <surname>Rehurek</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Sojka</surname>
          </string-name>
          .
          <article-title>Software framework for topic modelling with large corpora</article-title>
          .
          <source>In Proceedings of LREC 2010 workshop New Challenges for NLP Frameworks</source>
          , pages
          <fpage>46</fpage>
          -
          <lpage>50</lpage>
          , Valletta, Malta,
          <year>2010</year>
          . University of Malta.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wu</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Palmer</surname>
          </string-name>
          .
          <article-title>Verbs semantics and lexical selection</article-title>
          .
          <source>In Proceedings of the 32nd Annual Meeting on Association for Computational Linguistics, ACL '94</source>
          , pages
          <fpage>133</fpage>
          -
          <lpage>138</lpage>
          , Stroudsburg, PA, USA,
          <year>1994</year>
          .
          Association for Computational Linguistics
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>