Context-Aware Recommendation Based On Review Mining

Negar Hariri, Bamshad Mobasher, Robin Burke and Yong Zheng
DePaul University, College of Computing and Digital Media
243 S. Wabash Ave, Chicago, IL 60604, USA
{nhariri, mobasher, burke, yzheng8}@cs.depaul.edu

Abstract

Recommender systems are important building blocks in many of today's e-commerce applications, including targeted advertising, personalized marketing and information retrieval. In recent years, the importance of contextual information has motivated many researchers to focus on designing systems that produce personalized recommendations in accordance with the available contextual information of users. Compared to traditional systems that mainly utilize users' preference history, context-aware recommender systems provide more relevant results to users. We introduce a context-aware recommender system that obtains contextual information by mining user reviews and combines it with user rating history to compute a utility function over a set of items. An item's utility is a measure of how much the item is preferred according to the user's current context. In our system, context inference is modeled as a supervised topic-modeling problem in which the set of categories for a contextual attribute constitutes the topic set. As an example application, we used our method to mine hidden contextual data from customers' reviews of hotels and used it to produce context-aware recommendations. Our evaluations suggest that our system can help produce better recommendations in comparison to a standard kNN recommender system.

1 Introduction

In recent years, recommender systems (RS) have been extensively used in various domains to recommend items of interest to users based on their profiles. A user's profile is a reflection of the user's previous selections and preferences, and can be captured as rating scores given to different items in the system. Using preference data, different systems have been developed to produce personalized recommendations based on collaborative filtering, content-based filtering or a hybrid approach.

Despite the broad usage of such recommender systems, failure to consider users' current situations may result in considerable performance degradation in recommendations. For example, a customer who has once bought a toy for his friend's child may repeatedly receive suggestions to buy items related to kids, as the recommendation algorithm decides based on the whole history in the user's profile without prioritizing his current interests. To address this issue, the notion of context and context-aware recommender systems (CARS) has been introduced.

Contextual information can be explicit or implicit and can be inferred in different ways, such as using GPS sensor data, clickstream analysis or monitoring user rating behavior. In this paper, we concentrate on deriving context from a textual description of a user's current state and the item features in which he/she is interested. This data can come in different forms, such as tweets, blog posts or review texts, or it can be given directly to the system as part of a query.

As an example application of our approach, we have used our method to mine hidden contextual data from customers' reviews of hotels. The reason behind the selection of this dataset is that users usually provide contextual cues in their comments. For example, they may mention that they are with family or on a business trip, or they may express their opinions about the hotel services that are important to them, such as having wireless internet, conference rooms, etc. In order to evaluate our method, we have used the "Trip Advisor" hotel reviews dataset, where each review contains an overall rating, an optional review comment and a "trip type" attribute that shows the types of trips the user suggests for the hotel. For this attribute, the user can select a subset of five possible values: Family, Couples, Solo travel, Business, and Friends' getaway. The "trip type" attribute is not a feature of the user or the hotel (as different users may assign different values); rather, it is related to the interaction, and it is assumed to be an indication of context in our system.

Our approach to inferring context is based on a classifier that is trained on samples of descriptions and their corresponding contexts. Usually the trip type that a customer picks for a hotel is related to his review. Under this assumption, a set of review texts and their associated trip types are selected as the training set for the context classifier. After training, for a given description (as the user context) the classifier computes the probability of each trip category. This probability distribution is used to infer context. Since we are dealing with a multi-class supervised classification problem, we chose Labeled-LDA [1] as our categorization method, as our experiments showed it performs better on our dataset in comparison to other similar methods.

We propose a method that uses this inferred context to produce context-aware recommendations. While most existing approaches assume that a user's rating behavior depends on the current context and predict a rating function, we differentiate between the "rating" that a user gives to an item and the "utility" he gains from choosing it. The inferred context is used to define a utility function over the items, reflecting how much each item is preferred by a user given his current context. More specifically, the utility value depends on two factors: the predicted rating and the "context score", where the context score represents the suitability of an item for a user in a given context. The rating can be predicted by any conventional recommendation algorithm such as kNN.

Through the rest of this paper, we first review some of the related work. Section 3 describes our proposed context-aware recommendation process. Finally, section 4 presents the evaluation of the proposed method and its comparison with a traditional recommender.

2 Related Work

Several researchers have previously investigated the use of contextual information in various applications of recommender systems. Although there is no clear-cut definition of context, one of the most commonly used definitions was suggested by Abowd et al. [2] as follows: "Context is any information that can be used to characterize the situation of an entity. An entity is a person, place, or object that is considered relevant to the interaction between a user and an application, including the user and applications themselves." This is a general definition that limits context only to the information that could be used to characterize the situation or the circumstances. Another similar definition, by Lieberman et al. [3], is: "context can be considered to be everything that affects computation except the explicit input and output". In addition to these general definitions, a number of more specific definitions of context have recently been provided. For example, "Context can be described by a vector of context attributes, e.g. time, location or currently available network bandwidth in a mobile scenario" [4].

Capturing and representing context in a system depends on the way context is defined in that system. Dourish [5] presented two different views of modeling context: the representational view and the interactional view. In the representational view, context is defined as a form of information that is stable, delineable and separate from activity. Under this view, context can be defined and represented as a specific set of attributes of the environment within which the user's interaction with the system has taken place. For example, time and location can be considered contextual attributes. In the interactional view, it is assumed that contextuality is a relational property that holds between objects and activities, rather than being information (as in the representational view). Also, the contextual features are not definable and static; rather, their scope is defined dynamically. Furthermore, rather than assuming that context defines the situation within which an activity occurs, it is assumed that context arises from activity and activity is induced by context. Therefore, even though context is not observable itself, the activity that arises from the context can be observed.

Adomavicius et al. [6] suggest three different architectures for context-aware recommender systems. In the contextual pre-filtering approach, the dataset is first filtered; recommendations are then produced from the contextualized dataset. The contextual post-filtering approach, on the other hand, generates recommendations as a traditional recommender system would and then filters and re-ranks these recommendations to provide contextual recommendations. In contextual modeling, context is added to the problem as an additional dimension, meaning that in contrast to traditional recommender systems that estimate the rating function in the two-dimensional space of User × Item, the context-aware recommender system is defined over the space of User × Item × Context. The representation of context and the way it should be captured and integrated into the recommendation algorithm depend on the available contextual information as well as the definition of context in the system.

An interesting application of context-aware recommender systems is in mobile devices that are equipped with GPS or have internet access. In this case, different contextual information can be captured in real time to be used in the recommendation process. For example, PioApp Recommender [4] produces recommendations based on points of interest (such as restaurants, museums and train stations) in the neighborhood of the mobile user. The social camera introduced in [7] assists users in picking photo compositions given their current location and scene context. Many mobile travel applications such as [8–10] have also taken advantage of context in order to make better suggestions. Numerous algorithms have also been suggested for music and movie recommendation (as well as many other domains). Micro-profiling, introduced in [11], splits each single user profile into several possibly overlapping sub-profiles, each of which represents the user's preference in a particular context. A context random walk algorithm was proposed in [12] to model the user's movie browsing behavior and then use it to make context-aware recommendations.

Some of the above-mentioned approaches, such as [8, 9], use a simple representational view of context where context is shown as a set of attributes (such as time, location, weather conditions) that is given to the system as input, while some other systems try to infer the contextual attributes from the user's behavior. Instead of using a representational model, the context-aware recommender in [13] uses an interactional model. The proposed system was inspired by the human memory model in psychology, where short term and long term memories are separately modeled. The short term memory contains the user preferences derived from his active interaction with the system, while the long term memory stores the preference models related to his previous interactions with the system. They introduced three types of contextual cues, including collaborative, semantic and behavioral cues, in order to retrieve relevant preference models from the long term memory. The retrieved memory objects are then combined with the user's current preference model to generate and aggregate a final preference model that is used to produce recommendations.
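To make the contrast between the pre- and post-filtering architectures of Adomavicius et al. [6] concrete, the following minimal sketch applies a context-free top-k recommender before and after a context filter. This is our illustration, not code from any of the surveyed systems; the function names and toy rating store are hypothetical.

```python
# Illustrative contrast of contextual pre- vs post-filtering (our sketch; all
# names and the toy data are hypothetical, not from any surveyed system).
from collections import defaultdict

def recommend(ratings, k=2):
    """Context-free baseline: rank items by their average rating."""
    by_item = defaultdict(list)
    for (item, _context), r in ratings.items():
        by_item[item].append(r)
    return sorted(by_item, key=lambda i: -sum(by_item[i]) / len(by_item[i]))[:k]

def pre_filter(ratings, context, k=2):
    """Contextual pre-filtering: contextualize the dataset first, then recommend."""
    subset = {key: r for key, r in ratings.items() if key[1] == context}
    return recommend(subset, k)

def post_filter(ratings, context, fits, k=2):
    """Contextual post-filtering: recommend as usual, then filter/re-rank."""
    return [i for i in recommend(ratings, k=len(ratings)) if fits(i, context)][:k]

ratings = {("hotel_a", "business"): 5, ("hotel_a", "family"): 2,
           ("hotel_b", "business"): 3, ("hotel_b", "family"): 4}
print(pre_filter(ratings, "business"))  # ranks using business-trip ratings only
```

Note how the same baseline ranks the hotels differently once the dataset is restricted to a single trip context; contextual modeling, the third architecture, would instead make `context` an input of the rating function itself.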
In this paper we propose a method for mining contextual data from textual reviews. The importance of the hidden data in review comments has been the subject of much research in the areas of opinion mining and sentiment analysis. In opinion analysis, various natural language processing and text analysis methods are applied to a set of reviews to extract the attributes of the object that are referred to in the review text and to discover the polarity (positive, negative or neutral) of the expressed opinions.

The problem of extracting contextual information from unstructured text is fairly new and has not been extensively addressed in prior research. Aciar [14] introduces a method to identify review sentences which contain contextual information. In that approach, rule sets were created to classify sentences into contextual and preference categories, where the preference category groups sentences that include the user's evaluation of the features. The approach presented in [14] does not discuss the use of the retrieved information in the recommendation process, while we provide a way of incorporating the contextual knowledge into producing the recommendations.

3 Context-Aware Recommendation Process

Our context-aware recommender system (CARS) includes several components. The first component is the context miner, which is responsible for determining a user's current context. Context is represented as a distribution function over the set of trip types and can be mined from a textual description of a user's current situation and the features that are important to him. The main part of the context inference module consists of a multi-class supervised classifier. After training the classifier, context can be inferred for a given query. An example query is shown in Table 1. Based on the underlined words, it seems that the user is most probably looking for a "couples" or "family" type trip rather than a "business" one.

    I'm planning a romantic trip for my anniversary. I'm
    looking for an all inclusive resort near a beach. I ex-
    pect the hotel room to be spacious, have a nice view
    over the sea and to be nicely decorated.

Table 1: A sample query

The second component of our system is the rating predictor, a simple collaborative filtering recommender which predicts ratings of items. This component can be replaced with other types of rating prediction algorithms. The third component calculates the utility function based on the user's current context and the predicted rating, and presents a set of suggestions ordered by their utility values.

3.1 Context Representation

Contextual recommender systems can have either an interactional or a representational view of the context. In this paper, we assume there are explicit labels representing context, and the contextual information is obtained for each textual review by mapping it to this label set.

In our experiments, a dataset containing a set of hotel reviews from the Trip Advisor website has been used. In this dataset, the "trip type" attribute assigned to a hotel review shows the types of trips that the user suggests for the hotel. The attribute can be selected by the user from a set of five possible choices: Family, Couples, Solo travel, Business, and Friends' getaway. We assume that this element is the representation of context in our system. A sample review from this dataset is depicted in Table 2. This sample shows the relationship between the trip type attribute and the review comment. For example, "budget accommodation", "twin bedroom", "small" and "shared bathroom" can be more related to a Friends' getaway trip than to a business trip or a family travel.

    Trip type   Friends' getaway
    Review      This is an excellent option for budget accommodation
    Comment     in a hostel type establishment in a top class location,
                very close to central station and quick bus journey to
                circular quay. Stayed in twin bedroom which was very
                small but did the trick. If all you want is a clean bed
                in a clean room then this is grand. Shared bathroom and
                showering facilities were kept clean too.
    Summary     Excellent hostel accommodation in great location
    Quote

Table 2: A sample review comment and the associated trip type

Producing context-aware recommendations requires mining the user's current context. If the user explicitly specifies his context, it can easily be used in the recommendation algorithm. On the other hand, if he implies his context in a set of sentences describing his current state or his desired features for the hotel, then an inference method is required to determine the probability of each trip type. In this way, the context is shown as a distribution over the set of trip categories. In both cases, let Context_u^i denote the context of user u when using item i. For example, if the reviewer u indicates the trip type for hotel i as business and solo travel, then the context representation is Context_u^i = {P(family) = 0, P(couples) = 0, P(solo travel) = 0.5, P(business) = 0.5, P(friends' getaway) = 0}.

The context inference problem just described is similar to a multi-labeled text classification problem in which documents can be classified into one or more categories. The general solution is to provide a training set, build a model and use the model to categorize new documents. If the trip type categories assigned to each review are assumed to be related to the review comment (and we will show they are related in our dataset), then we can use a set of review comments and their corresponding trip type values as the training set for the classifier.
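The explicit-label case above (the Context_u^i example) can be sketched in a few lines; the helper name and data layout are our own illustration, not the paper's code.

```python
# Sketch of the explicit-label case of Section 3.1 (our helper, not the paper's
# code): the ticked trip types become a uniform distribution Context_u^i.
TRIP_TYPES = ["family", "couples", "solo travel", "business", "friends' getaway"]

def context_from_labels(selected):
    """Uniform probability over the trip types the reviewer selected, zero elsewhere."""
    weight = 1.0 / len(selected)
    return {t: (weight if t in selected else 0.0) for t in TRIP_TYPES}

# The paper's example: reviewer u marks hotel i as "business" and "solo travel".
ctx = context_from_labels({"business", "solo travel"})
# ctx == {'family': 0.0, 'couples': 0.0, 'solo travel': 0.5,
#         'business': 0.5, "friends' getaway": 0.0}
```

In the implicit case, the same dictionary would instead hold the per-category probabilities produced by the trained classifier.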
3.2 Inferring the Context

Different techniques have been used in text categorization, such as probabilistic methods, regression modeling and SVM classification. In this article, we have used Labeled Latent Dirichlet Allocation [1] (L-LDA), as it has been shown to perform relatively well on our dataset. This method is a supervised classification algorithm for multi-labeled text corpora and is based on topic modeling.

Topic modeling and Labeled-LDA

Topic modeling deals with the statistical modeling of documents in order to discover the latent topics behind them. Probabilistic latent semantic analysis (PLSA) [15] is one of the early approaches in this area, modeling a document as a probability distribution over the set of topics.

Later, Latent Dirichlet Allocation [16], known as LDA, was proposed as an extension of PLSA. LDA specifies a generative process for creating documents, based on the idea that documents are mixtures of topics, where a topic is a probability distribution over words. To generate a new document d, first the distribution over topics, denoted by θ(d), is specified. For each word in the document a topic t is selected based on θ(d). Let φ(t) denote the multinomial distribution over words for topic t; according to this distribution a word is picked and added to the document. It should be noted that this is similar to the general procedure followed by most existing topic models, except that the statistical assumptions differ from model to model. The LDA model assumes that the topic mixture θ is a k-dimensional random variable distributed as follows [16]:

    P(θ|α) = [Γ(Σ_{i=1}^k α_i) / Π_{i=1}^k Γ(α_i)] · θ_1^{α_1 − 1} · · · θ_k^{α_k − 1}    (1)

where α is a k-vector with elements α_i > 0 and Γ(x) is the gamma function. Figure 1 shows the graphical representation of LDA, where the rectangles denote replicates. The outer rectangle represents the M documents, while the inner rectangle illustrates the process of sampling words for a document of size N. In the LDA model, the document size follows a Poisson distribution. In corpora with a large vocabulary, it is likely that some words do not appear in the training examples. In order to cope with this problem, a smoothing strategy is used by placing a Dirichlet prior with parameter β on φ, as shown in the figure.

Figure 1: Graphical Representation of LDA [16]

In our problem, the user reviews are taken to be the documents, and the topics behind these documents are the set of possible values for the trip type. As the topics are predefined, we need to adopt a supervised topic modeling approach. Several variations of LDA have been proposed to support supervised learning, such as [1, 17, 18], among which we chose Labeled-LDA [1], as the other methods limit each document to be associated with only one topic while in our case reviews can have multiple labels. Similar to LDA, in Labeled-LDA each word in the document is assigned a single topic. However, in order to incorporate supervision, the topic must belong to the label set of the document. In other words, there is a one-to-one relationship between the set of labels assigned to the documents and the topics, and the topic mixture of each document is formed according to its label set. Figure 2 shows the graphical representation of Labeled-LDA. With k unique labels across all documents, the parameter Λ for each document is a k-dimensional binary vector that indicates the presence or absence of each topic in the document's label set. For each document, Λ is generated by a Bernoulli coin toss with a prior probability vector η.

Figure 2: Graphical Representation of Labeled-LDA [1]

As in [1], we used Gibbs sampling [19] for training. Let C^WT and C^DT represent two matrices containing word-topic counts and document-topic counts, respectively. Gibbs sampling begins by randomly assigning words to topics and filling the two matrices accordingly, then iteratively updates them to finally converge to estimates of θ and φ. At each iteration, a word token is selected, its current topic assignment is removed, and C^WT and C^DT are updated by decrementing the entries corresponding to the removed topic assignment. Then a new topic is sampled based on the topic assignments of all other words, and the count matrices are incremented accordingly. After convergence, estimates of θ and φ can be obtained using equations 2 and 3, respectively:

    θ_j^(d) = (C_dj^DT + α) / (Σ_{k=1}^T C_dk^DT + Tα)    (2)

    φ_i^(j) = (C_ij^WT + β) / (Σ_{k=1}^W C_kj^WT + Wβ)    (3)
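As an illustration of this final estimation step (equations 2 and 3), the following pure-Python sketch recovers θ and φ from toy count matrices after the sampler has converged. The variable names and toy counts are our own assumptions.

```python
# Sketch of the closing step of the Gibbs sampler: equations (2) and (3),
# recovering theta and phi from the count matrices C_DT (documents x topics)
# and C_WT (words x topics). Our illustration; names and toy counts are assumed.

def estimate_theta(C_DT, alpha):
    """Eq. (2): smoothed per-document topic mixtures."""
    T = len(C_DT[0])                    # number of topics
    theta = []
    for row in C_DT:                    # one row per document d
        denom = sum(row) + T * alpha
        theta.append([(c + alpha) / denom for c in row])
    return theta

def estimate_phi(C_WT, beta):
    """Eq. (3): smoothed per-topic word distributions."""
    W = len(C_WT)                       # vocabulary size
    phi = []
    for col in zip(*C_WT):              # one column per topic j
        denom = sum(col) + W * beta
        phi.append([(c + beta) / denom for c in col])
    return phi

# Toy counts: 2 documents x 3 topics, 4 vocabulary words x 3 topics.
theta = estimate_theta([[3, 1, 0], [0, 2, 2]], alpha=0.5)
phi = estimate_phi([[2, 0, 1], [1, 1, 0], [0, 2, 1], [1, 0, 0]], beta=0.01)
```

The α and β terms implement the Dirichlet smoothing described above: even a topic never observed in a document (or a word never observed under a topic) retains a small nonzero probability.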
3.3 Predicting Item Utility

As noted earlier, we make a distinction between predicting rating and predicting utility. We assume that the utility of an item for a user may differ among different contexts, even if the user has rated the item equally in those contexts. For instance, in the hotel review dataset it is possible that the rating given by a customer to a hotel on a business trip would not change if he visited the same hotel again with his family, while the utility of selecting that hotel changes from one trip type to the other. When he is on a business trip, the business services of the hotel are more important, while on a family trip other characteristics of the hotel (such as having a pool, distance to the beach, etc.) gain more priority.

We define context score as a measure of the suitability of an item for a user in a given context. To calculate the context score for user u and item i, we need to predict the context that u would assign to i, denoted by predictedContext(u, i). The predicted context is then compared to the current context of u (which can be inferred). We use a collaborative approach for calculating the context of a (user, item) pair. The similarity between two items i and j is computed using the cosine similarity as follows:

    contextualSimilarity(i, j) = commonLabels(i, j) / sqrt(|labels(i)| × |labels(j)|)    (4)

where commonLabels(i, j) is the number of times users assign the same trip type category to both i and j, and labels(i) counts the number of trip type labels given to i by all users. This similarity is used to obtain a neighborhood for item i by selecting the top N most similar items. Then, the predicted context can be computed as in equation 5: the probability of each trip category in the predicted context is calculated by taking the weighted average of its probabilities in the neighbors' contexts.

    predictedContext(u, i) = Σ_{k ∈ Neighbors(i)} context_u^k · contextualSimilarity(k, i) / Σ_{k ∈ Neighbors(i)} |contextualSimilarity(k, i)|    (5)

where context_u^k stands for the context of neighbor k given by user u.

Our notion of predicted context for a (user, item) pair is somewhat similar to the idea of "best context" introduced in [20] for music recommendation. The authors define this concept as the contextual information most suited for a particular item. They use a vector representation of context where each dimension corresponds to a contextual attribute; if the user believes that context is suitable for the specific item, the value of the corresponding dimension is set to one. They propose four different approaches for the prediction of the best context. The first method is based on averaging the context vectors of the item across all users. Another technique is to find the K-nearest neighbors of the user (based on rating history) and compute the predicted context as the weighted average of the contexts assigned to that item by his neighbors. The other two methods follow the same approach except that the similarity of users is computed based on the context vectors, independent of their rating history. Our method differs from these approaches in various respects: the above methods focus on predicting the suitable context for a (user, item) pair, while we address the whole process of context-aware recommendation; in other words, predicting the best context is just one part of our context-aware algorithm. Moreover, our method for the calculation of contextual similarity, and for the prediction of the best context, is different from the previous techniques.

The context score of item i for user u can be estimated by comparing the distribution of the inferred context of u with the predicted context for this item. We tried three different methods, namely Chebyshev similarity [21], Kullback-Leibler similarity [22] and simple cosine similarity, and chose cosine similarity for our evaluations as it performs better on our dataset. Let IC_u denote the inferred context for user u and PC_u^i the predicted context (calculated based on equation 5). The context score for item i and user u is computed as follows:

    contextScore(u, i) = (IC_u · PC_u^i) / (||IC_u|| ||PC_u^i||)    (6)

The utility score of item i for user u is calculated as a function of both the context score of i and the predicted rating of the item. In our experiments, standard item-based kNN was used to calculate the predicted ratings.

    utility(u, i) = α · predictedRating(u, i) + (1 − α) · contextScore(u, i)    (7)

In equation 7, α is a constant representing the weight of the predicted rating in the utility function. The items are sorted by utility value and the top N items are suggested to the user.

4 Evaluation

The evaluations presented in this paper were performed on the Trip Advisor dataset, which contains 12558 reviews for 8941 hotels made by 1071 reviewers. About 9500 of the reviews have the "trip type" label, which has been used as an indication of context.

Our system consists of two main parts, and the experiments have been designed accordingly. The first experiment focuses on assessing the accuracy of the context inference module on our dataset. In the second experiment, the performance of the recommender system is compared with a standard kNN recommender.

4.1 Context Inference Evaluation

The accuracy of the context inference algorithm plays a significant role in the performance of the system. As previously explained, we used Labeled-LDA as it has been shown to perform relatively better than other multi-labeled text classification methods. In this experiment we assess its performance on our dataset. The experiment was set up as a five-fold cross validation. In each of the five runs, one of the folds was used for testing while the topic model was built on the remaining four folds. For every test case (i.e., review text), the probability distribution over the trip type categories was predicted. A category is assigned to a test case if the predicted probability for that category exceeds a certain threshold.

The results are evaluated by measuring both precision and recall, where precision is computed as the fraction of predicted categorical labels that are correct, and recall as the ratio of correct labels to the total number of labels. Figures 3 and 4 depict recall and precision values for the different categories. As shown, precision tends to be higher as the threshold increases. Also, as expected, increasing the confidence threshold causes recall to decrease.

Figure 3: Recall values for different categories

Figure 4: Precision values for different categories
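Before turning to the recommendation experiments, the scoring pipeline of Section 3.3 (equations 4–7) can be sketched end-to-end. The function names mirror the paper's notation, while the toy contexts, similarity weights and α value are our own illustration.

```python
# End-to-end sketch of equations (4)-(7); toy data and alpha are our assumptions.
import math

def contextual_similarity(common, n_labels_i, n_labels_j):
    """Eq. (4): cosine-style co-assignment similarity between items i and j."""
    return common / math.sqrt(n_labels_i * n_labels_j)

def predicted_context(neighbor_contexts, sims):
    """Eq. (5): similarity-weighted average of the neighbors' context distributions."""
    denom = sum(abs(s) for s in sims)
    return {k: sum(c[k] * s for c, s in zip(neighbor_contexts, sims)) / denom
            for k in neighbor_contexts[0]}

def context_score(inferred, predicted):
    """Eq. (6): cosine similarity between inferred and predicted context."""
    dot = sum(inferred[k] * predicted[k] for k in inferred)
    norm = (math.sqrt(sum(v * v for v in inferred.values()))
            * math.sqrt(sum(v * v for v in predicted.values())))
    return dot / norm if norm else 0.0

def utility(pred_rating, c_score, alpha=0.5):
    """Eq. (7): blend of predicted rating and context score."""
    return alpha * pred_rating + (1 - alpha) * c_score

ic = {"business": 1.0, "family": 0.0}                       # inferred context IC_u
pc = predicted_context([{"business": 0.8, "family": 0.2},   # two neighbors of item i
                        {"business": 0.4, "family": 0.6}], sims=[0.9, 0.3])
u = utility(pred_rating=4.0, c_score=context_score(ic, pc), alpha=0.7)
```

Note that the two terms of equation 7 live on different scales (a 1–5 rating versus a 0–1 cosine), so the choice of α also absorbs that scale difference.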
The in- ferred context is used to define a utility function for the items Figure 4: Precision values for different categories reflecting how much each item is preferred by a user given his current context. The utility value for each item depends on two factors: the predicted rating and the “context score” it is shown, the precision tends to be higher as the threshold where context score represents the suitability of the item for increases. Also, as expected, by increasing the confidence a user in a given context. Rating can be predicted based on threshold, recall is likely to decrease. any conventional recommendation algorithms such as kNN. 4.2 Evaluation of Recommendations As an example application, we have used our method to mine hidden contextual data from customers’ reviews of ho- As we are working with a sparse dataset, a preprocessing tels in “Trip Advisor” dataset and used it to produce context- phase has been added to the procedure in order to prune the aware recommendations. Our evaluations indicate that using matrix by removing all those items that have less than 5 rat- the contextual information can improve the performance of ings. the recommender system in terms of hit ratio. In previous sections, we introduced a context-aware rec- ommender that produce recommendations for a user based on a utility function that depends both the user’s current context References and also the predicted rating for that item. As recommenda- [1] D. Ramge, D. Hall, R. Nallapati, and C. Manning, “La- tions are based on utility function (and not ratings alone), it is beled lda: a supervised topic model for credit attribution not logical to use metrics such as MAE and other metrics that in multi-labeled corpora,” in Proceedings of the 2009 compare the predicted rating with the actual ones. Instead, Conference on Empirical Methods in Natural Language hit ratio was chosen as our performance measure and we per- Processing, 2009. 
formed a leave-one-out cross validation experiment on those [2] G. Abowd, A. Dey, N. D. P.J. Brown, M. Smith, and reviews that have ratings greater than the reviewer’s average P. Steggles, “Towards a better understanding of con- rating. Having the recommendation size of k, the hit ratio is text and context-awareness,” Handheld and Ubiquitous calculated as the probability that the left-out item is included Computing, vol. 1707, no. 2, pp. 304–307, 1999. in the list of N recommendations. The standard item-based kNN algorithm has also been run on the same dataset and un- [3] H. Lieberman and T.Selker, “Out of context: Computer der the same condition as our recommender method. Figure systems that adapt to, and lean from, context,” IBM Sys- 5 shows the hit ratio having different sizes of recommenda- tems Journal, vol. 39, no. 3, pp. 617–632, 2000. [4] W. Woerndl and J. Schlichter, “Introducing context into classification,” Neural Information Processing Systems, recommender systems,” in Proceedings of AAAI Work- vol. 22, 2008. shop on Recommender Systems in E-Commerce, 2007, [19] D. j. S. W. R. Gilks, S. Richardson, Markov chain Monte pp. 138–140. Carlo in practice. London: Chapman & Hall, 1996. [5] P. Dourish, “What do we talk about when we talk about [20] L. Baltrunas, M. Kaminskas, F. Ricci, L. Rokach, context,” Personal and Ubiquitous Computing, vol. 8, B. Shapira, and K. Luke, “Best usage context predic- no. 1, pp. 19–30, 2004. tion for music tracks,” in Proceedings of the 2nd Work- [6] G. Adomavicius and A. Tuzhilin, “Context-aware rec- shop on Context Aware Recommender Systems, Septem- ommender systems,” in Proceedings of the 2008 ACM ber 2010. conference on Recommender Systems. ACM, 2008. [21] C. Cantrell, Modern Mathematical Methods for Physi- [7] S. Bourke, K. McCarthy, and B. Smyth, “The so- cists and Engineers. Cambridge University Press, cial camera: Recommending photo composition using 2000. contextual features,” in Proceedings of Workshop on [22] R. L. S. 
Kullback, “On information and sufficiency,” Context-Aware Recommender System. ACM, 2010. Annals of Mathematical Statistics, vol. 22, pp. 79–86, [8] K. Cheverst, N. Davies, K. Mitchell, A. Friday, and 1951. C. Efstratiou, “Developing a context-aware electronic tourist guide: some isues and experiences,” in Proceed- ings of the SIGCHI conference on Human Factors in Computing Systems, p. 17. [9] L. Ardissono, A. Goy, G. Petrone, M. Segnan, and P. Torasso, “Intrigue: personalized recommendation of tourist attractions for desktop and hand held devices,” Applied Artificial Intelligence, vol. 17, no. 8, pp. 678– 714, 2003. [10] M. V. Setten, S. Pokraev, and J. Koolwaaij, “Context- aware recommendations in the mobile tourist applica- tion compass,” in Proceedings of Third International Conference In Adaptive Hypermedia and Adaptive Web- Based Systems. Springer, August 2004. [11] L. Baltrunas and X. Amatriain, “Towards time- dependant recommendation based on implicit feed- back,” in Proceedings of Workshop on Context-Aware Recommender System. ACM, 2009. [12] T. Bogers, “Movie recommendation using random walks over the contextual graph,” in Proceedings of Workshop on Context-Aware Recommender System. ACM, 2010. [13] S. Anand and B. Mobasher, “Contextual recommenda- tion,” From Web to Social Web: Discovering and De- ploying User and Content Profiles, 2007. [14] S. Aciar, “Mining context information from consumers reviews,” in Proceedings of Workshop on Context- Aware Recommender System. ACM, 2010. [15] T. Hoffman, “Probabilistic latent semantic indexing,” in Proceedings of the 22nd annual international ACM SI- GIR conference on Research and development in infor- mation retrieval (SIGIR99), 1999. [16] D. Blei, A. Ng, and M. Jordan, “Latent dirichlet al- location,” The Journal of Machine Learning Research, vol. 3, 2003. [17] D. Blei and J. McAuliffe, “Supervised topic models,” Neural Information Processing Systems, vol. 21, 2007. [18] L. Julien, F. Sha, and M. I. 
Jordan, “Disclda: Dis- criminative learning for dimensionality reduction and