=Paper=
{{Paper
|id=None
|storemode=property
|title=Optimal Feature Selection for Context-Aware Recommendation Using Differential Relaxation
|pdfUrl=https://ceur-ws.org/Vol-889/paper1.pdf
|volume=Vol-889
}}
==Optimal Feature Selection for Context-Aware Recommendation Using Differential Relaxation==
Yong Zheng, Robin Burke, Bamshad Mobasher
Center for Web Intelligence, School of Computing, DePaul University, Chicago, Illinois, USA
{yzheng8,rburke,mobasher}@cs.depaul.edu

ABSTRACT

Research in context-aware recommender systems (CARS) usually requires the identification of the influential contextual variables in advance. In collaborative recommendation, there is a substantial trade-off between applying context very strictly and achieving good coverage and accuracy. Our prior work showed that this trade-off can be managed by applying the contexts differentially in different components of the recommendation algorithm. In this paper, we extend our previous model and show that our differential context relaxation (DCR) model can also be used to identify demographic and item features that are linked to the contexts. We also demonstrate the application of binary particle swarm optimization as a scalable optimization technique for deriving the optimal relaxation.

Categories and Subject Descriptors: H.3.3 [Information Systems]: Information Search and Retrieval

General Terms: Algorithms, Modeling, Optimization

Keywords: Context-aware recommender systems, collaborative filtering, differential context relaxation, particle swarm optimization

CARS-2012, September 9, 2012, Dublin, Ireland. Copyright is held by the author/owner(s).

1. INTRODUCTION

Context-aware recommender systems (CARS) extend the focus of recommender systems research beyond users and items to the contexts in which items are experienced by users, attempting to model the interactions that are often influential in making good recommendations. For example, Alice may prefer casual restaurants with spicy Thai food when dining with friends, but would rather choose a quiet but upscale Italian place when entertaining work colleagues. Clearly, the occasion (work vs. personal life) and the companions (friends vs. co-workers) play a key role in what a system should recommend. The recognition of the importance of such contextual factors is a starting point for much CARS research.

What we see in this example, however, is that these contextual factors (occasion, companions) are linked to specific features associated with the restaurant itself: the type of food, the atmosphere, perhaps the price. To the extent that such linkages are shared among many users, we call these context-linked features. When contextual information is limited, we may be able to observe context-linked features more readily, and so discovering influential context-linked features is an important goal for CARS research.

In our previous research [16], we developed a context-aware collaborative recommendation approach based on differential context relaxation (DCR). The differential aspect of the algorithm is that we segment the recommendation algorithm into different components and apply different aspects of the context to each part. Each component of the algorithm operates on the same collaborative user-item-rating data, but only on those parts that match its particular context. The context relaxation aspect of the algorithm arises because we treat the context as a set of constraints controlling what data is available to a given component. Finding the optimum set of contextual features then becomes a matter of finding a relaxation of these contextual constraints so that a balance between accuracy and coverage is achieved.

In this paper, we show that context-linked features are also good candidates for our relaxation technique. Previously, we used a data set with only a handful of contextual variables, so it was possible to do an exhaustive search of all possible relaxations in order to find the optimum. In this work, we show that binary particle swarm optimization (BPSO) [10] can be applied to find optimal relaxations with a larger set of context and context-linked variables.

2. RELATED WORK

CARS researchers have sought different means of discovering influential contextual variables. Baltrunas et al. [2] conducted a survey asking subjects to evaluate the importance of contextual factors in order to acquire contextual relevances. Others [15, 7] applied feature selection or reduction techniques to extract influential contexts from a set of candidate variables. However, each of these techniques has its drawbacks: the technique in [2] requires a lot of user effort, and the feature reduction techniques are not reliable unless the data set is dense across contexts, i.e., items have been rated multiple times in different contexts.

Compared with the explicit contextual approaches above, there are also approaches incorporating latent factors: previously we applied labeled-LDA to infer contexts based on review mining [6]. Karatzoglou et al. [8] introduced multiverse recommendation based on an N-dimensional tensor factorization model, and Baltrunas et al. [3] present a novel context-aware recommender based on matrix factorization. In this paper, we mainly focus on the explicit contextual features we already know, and we will incorporate latent factor models in the future.

Rather than seeking to define a set of contextual variables to be applied to the algorithm as a whole, our previous work [16] sought to decompose collaborative recommendation into components and apply different aspects of the context to each.
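To make the decomposition concrete, the following is a minimal, hypothetical sketch of per-component context filtering; the function and variable names are ours (not from the paper), similarity is assumed to be precomputed and passed in, and ratings are simple tuples rather than a real data set. Each of the three constraint sets filters the data seen by one component of the prediction.

```python
# Illustrative sketch of differential context filtering (names are ours).
# Ratings are (user, item, rating, context) tuples; each context is a dict
# of contextual / context-linked feature values. C1, C2, C3 are sets of
# feature names whose values must match the target context, one set per
# algorithm component (neighbor filtering, peer baseline, user baseline).

def matches(ctx, target_ctx, constraint):
    """A rating satisfies a constraint if its context agrees with the
    target context on every constrained feature."""
    return all(ctx.get(f) == target_ctx.get(f) for f in constraint)

def mean(xs):
    return sum(xs) / len(xs) if xs else None

def dcr_predict(a, i, target_ctx, ratings, sim, C1, C2, C3):
    # Component 3: target-user baseline over ratings matching C3.
    a_base = mean([r for (u, _, r, c) in ratings
                   if u == a and matches(c, target_ctx, C3)])
    # Component 1: neighbors who rated item i in a context matching C1.
    peers = {u for (u, it, r, c) in ratings
             if it == i and u != a and matches(c, target_ctx, C1)}
    num = den = 0.0
    for u in peers:
        # Component 2: peer baseline over ratings matching C2.
        u_base = mean([r for (v, _, r, c) in ratings
                       if v == u and matches(c, target_ctx, C2)])
        if u_base is None:
            continue
        u_rating = next(r for (v, it, r, c) in ratings
                        if v == u and it == i and matches(c, target_ctx, C1))
        num += (u_rating - u_base) * sim(a, u)
        den += sim(a, u)
    if a_base is None or den == 0:
        return None  # no coverage under these constraints
    return a_base + num / den
```

With all three constraint sets empty, this reduces to an unconstrained neighborhood prediction; tightening any set restricts the data available to that component only, which is the trade-off between accuracy and coverage discussed in the text.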
We think of the context as a constraint governing what data each component considers, and we seek the strongest constraint consistent with low error and acceptable coverage. Note that this concept of constraining the data used by different parts of the recommendation algorithm is quite different from constraint-based recommendation, in which constraints are applied to the recommended items themselves [5].

3. METHODOLOGY

We take as our starting point the well-known Resnick's algorithm for collaborative recommendation:

  P_{a,i} = \bar{r}_a + \frac{\sum_{u \in N} (r_{u,i} - \bar{r}_u) \cdot sim(a,u)}{\sum_{u \in N} sim(a,u)}   (1)

where a is a user, i is an item, and N is a neighborhood of other users u similar to a. The algorithm calculates P_{a,i}, the predicted rating that user a is expected to assign to item i.

The algorithm can be understood as operating over the rating data in three different ways. \bar{r}_a averages over all of the ratings supplied by user a to establish an overall baseline for the target user. The formation of the neighborhood N requires selecting users for their similarity to a. And, finally, \bar{r}_u is computed for each user u to establish their baseline rating, to be subtracted from their individual rating of item i.

Contextual effects may enter into the algorithm in various ways. Perhaps a user has a different baseline rating in different contexts. For example, Alice, our hypothetical diner, may have relatively low expectations for her casual Thai restaurants and give a lot of high scores, but give out relatively few five-star ratings for her work-oriented experiences. Perhaps good collaborative peer groups are context-specific: recommendations from expense-account-wielding business types might be good for her work dinners but not for anything else.

To account for this kind of variance, we start with the target context C, which is the context for which the recommendation is sought. Our prediction now is for target user a, item i, and context C: how will the user respond to this item in this context? For example, how will Alice like a fancy regional American restaurant in the "work" context with Betsy, her boss, and Carl, an out-of-town client? It is unlikely that other users will have dined with exactly the same individuals before, which is why we want to think of controlling recommendation based on constraints derived from C as opposed to C itself. For example, we may want to generalize Alice's dining occasion as a small work event, which would exclude peer experiences where the meal was a large banquet.

From the target context, we derive three sets of contextual constraints, C_1, C_2, and C_3, and apply them to the three components separately, yielding the following context-sensitive formulation:

  P_{a,i,C} = \bar{r}_{a,C_3} + \frac{\sum_{u \in N_{C_1}} (r_{u,i,C_2} - \bar{r}_{u,C_2}) \cdot sim(a,u)}{\sum_{u \in N_{C_1}} sim(a,u)}   (2)

This version of the equation replaces N with N_{C_1}, meaning that we only consider neighbors who have rated item i in a context matching C_1. Instead of averaging over all ratings when normalizing peer ratings, we select only those ratings in contexts matching C_2, and we do a similar thing for the target user by matching against C_3.

In our previous work, we were able to show a significant improvement in accuracy for hotel recommendation in a travel data set (http://www.tripadvisor.com/) with the following constraints: C_1 (neighbor filtering) using "state of origin", C_2 (peer baseline) using "trip type" (e.g. business, family, etc.), and no constraints on the target user baseline, C_3 = {}. Trip type is known as a typical influential context [6], and we found that state of origin was another one. For more details, see [16]. What is significant here is that each of the three components of the algorithm worked best with its input data filtered in a different context-related way. We derived this result by evaluating all possible sets of constraints on all of the data and finding the optimum combination.

3.1 Optimization of Constraint Search

We model the relaxation as a process of binary selection: we represent C_1, C_2, and C_3 as binary vectors where a bit value of one denotes that we filter the data using that feature. In extending this work to other data sets and adding context-linked features, it became clear that an optimization approach based on exhaustive search was not scalable.

To some extent, regularities in the data and dependencies between the variables can shrink the combinatorial space. For example, in our travel data set, origin city was fixed for each user, meaning that whether we restricted on it or ignored it, the computation was the same when computing a user's baseline; that variable was only relevant to the neighborhood calculation. There were other similar simplifications for the other components, and we imagine that this would be true of most recommendation domains.

Even with such simplifications, we require a more scalable optimization technique than exhaustive search. Furthermore, the optimization space is highly non-linear, and standard approaches such as gradient descent cannot be used. However, we have had success with the particle swarm optimization technique.

3.2 Particle Swarm Optimization

Particle swarm optimization (PSO) [9] is a kind of swarm intelligence originally introduced by Eberhart and Kennedy in 1995. It is a population-based optimization approach inspired by social behaviors in swarming and flocking creatures like bees, birds, or fish. It was recently introduced to the domains of information retrieval [4] and recommender systems [1, 14] as a method for feature selection and feature weighting. Binary particle swarm optimization (BPSO) is a discrete binary version of the technique introduced in 1997 [10], which we use in this paper to fit our binary selections of contexts and features for relaxation.

The basic idea behind PSO is that optimization is performed by a number of particles at different points in an optimization space. Each particle searches independently for the optimum, guided by its local value and by communication with the other particles. In our case, the value to be minimized is RMSE, and a position in the space corresponds to a set of constraints. Each particle records its personal best performance (iBest) and the corresponding best position (P_{ibest}). The algorithm also keeps track of the global best performance (gBest) and the corresponding position (P_{gbest}).

In each iteration of the algorithm, each particle moves through the optimization space as a function of its velocity. The velocity is a function of an inertia term and of terms derived from iBest and gBest. The inertia is gradually decreased over time so that the particles become more influenced by the previously-located minima. The update formulas for V_{ij} (the velocity of the j-th bit in particle i) and the global inertia w are as follows:

  V_{ij,t} = w_t \cdot V_{ij,t-1} + \alpha_1 \varphi_1 \cdot (P^j_{ibest} - X_{ij,t-1}) + \alpha_2 \varphi_2 \cdot (P^j_{gbest} - X_{ij,t-1})   (3)

where \varphi_1 and \varphi_2 are positive random numbers drawn from a uniform distribution between 0.0 and 1.0, and \alpha_1 and \alpha_2 are constant learning factors (set to 2.0 following empirical suggestions [13]) which control the weight given to the local and the global minima, respectively. X_{ij,t} is the particle's position (the value of the j-th bit in particle i) at iteration t.

Usually the velocity is restricted to the range [-V_{max}, V_{max}]. It cannot be too large or too small: a small velocity results in only minor changes to the positions, whereas a large velocity drives the particle to move too far in the space. V_{max} is suggested to be the maximum value of each bit in the position vector, which in our case would be 1 because each bit switches between 0 and 1; however, further experiments showed that 2.0 is a better configuration.

The inertia at a given iteration t is calculated by:

  w_t = w_{end} + \frac{T_{max} - t}{T_{max}} \cdot (w_{start} - w_{end})   (4)

where T_{max} denotes the maximum number of iterations. The designer specifies a starting and ending weight with w_{start} > w_{end}. A large inertia weight facilitates a global search, while a small value facilitates a local search. The linearly decreasing value means that the algorithm begins with the particles being more influenced by their current direction and ends with them being more influenced by their neighbors. We followed the empirical parameter configuration w_{start} = 0.9 and w_{end} = 0.4 suggested in PSO research [13].

The velocity is used to update the value (position) X_{ij} of each bit in the binary vector. Basically, the higher the velocity, the greater the probability of switching to 1:

  X_{ij,t} = 1 if rand_t < S(V_{ij,t}), and X_{ij,t} = 0 otherwise   (5)

where S(V_{ij,t}) = 1 / (1 + exp(-V_{ij,t})) is a sigmoid function that squashes the velocity into the range [0, 1], and rand_t is a uniform random number in the range [0, 1].

In our application, we use from 1 to 5 particles, initialized with random velocities and random bit vectors representing the possible constraints. In each iteration, every particle runs our algorithm across the training set, compares the RMSE it acquires in the current iteration with the best values so far (iBest and gBest), and updates them as well as the corresponding positions. Then the new position for the next iteration is generated based on the equations above. The ability of the particles to communicate within the swarm reduces the total number of iterations required to reach an optimum.

3.2.1 Improved BPSO

Whether each bit value in the binary vector for each particle is 1 or 0 depends on the value of the velocity: the larger the velocity, the more likely the bit is to switch to 1. Thus the weakness of classical BPSO is that it only considers the possibility of a change to 1, and does not take the possibility of a change to 0 into account. Khanesar et al. [11] improved BPSO by taking the possibility of a change to 0 into consideration. Suppose the best position visited so far by a particle is P_{ibest} and the global best position is P_{gbest}, and assume the j-th bit of the particle's best position is 1. To guide the j-th bit of particle i toward its best position, the velocity of change to 1 for that bit should be increased and the velocity of change to 0 should be decreased. The improvement can be described as follows:

  If P^j_{ibest} = 1 then d^1_{ij,1} = \alpha_1 r_1 and d^0_{ij,1} = -\alpha_1 r_1
  If P^j_{ibest} = 0 then d^0_{ij,1} = \alpha_1 r_1 and d^1_{ij,1} = -\alpha_1 r_1
  If P^j_{gbest} = 1 then d^1_{ij,2} = \alpha_2 r_2 and d^0_{ij,2} = -\alpha_2 r_2
  If P^j_{gbest} = 0 then d^0_{ij,2} = \alpha_2 r_2 and d^1_{ij,2} = -\alpha_2 r_2

where d^1_{ij} and d^0_{ij} are temporary values and r_1 and r_2 are random variables in the range (0, 1). \alpha_1 and \alpha_2 are the same learning factors as before. With this in mind, we define two separate velocities, one in the "1" direction, V^1, and one in the "0" direction, V^0:

  V^1_{ij} = w V^1_{ij} + d^1_{ij,1} + d^1_{ij,2}   (6)
  V^0_{ij} = w V^0_{ij} + d^0_{ij,1} + d^0_{ij,2}   (7)

We choose V^1_{ij} or V^0_{ij} as the velocity, depending on the current position of the particle for that bit:

  V_{ij,t} = V^1_{ij} if X_{ij,t-1} = 0, and V_{ij,t} = V^0_{ij} if X_{ij,t-1} = 1   (8)

Finally, the bit value is updated as follows:

  X_{ij,t} = \bar{X}_{ij,t-1} if rand_t < S(V_{ij,t}), and X_{ij,t} = X_{ij,t-1} otherwise   (9)

where \bar{X}_{ij,t-1} denotes the complement of X_{ij,t-1}; that is, if X_{ij,t-1} = 0 then \bar{X}_{ij,t-1} = 1, and otherwise \bar{X}_{ij,t-1} = 0. Our experiments show that this improved version can find the optimum more efficiently than classical BPSO.

4. EXPERIMENTAL SETUP

For this paper, we use a rating data set in the area of food, the "AIST context-aware food preference dataset", created and distributed by Hideki Asoh and colleagues [12]. The data set was generated using a survey of 212 users asking them to rate items on a menu, under two different conditions. In the "real hunger" condition, users were asked to rate the items based on their current degree of hunger. In the "virtual hunger" condition, they were asked to imagine a degree of hunger different from their current state and to report what their preferences would be in that state. Asoh and his colleagues collected 6,360 ratings for 20 food menus, with each user supplying 30 ratings, and also included demographic and item features in the data.

For our exploration of contextual recommendation, we use 6 variables. The key contextual variable is degree of hunger (real or virtual), but we also wanted to explore context-linked features, including gender and various item features related to the food that were available in the data. Food genre indicates whether the food is Chinese, Japanese, or Western; food stuff denotes whether the food category is vegetable, pork, beef, or fish; and food style contains values that represent the style of food preparation.

The data set is split into five folds such that each user has at least one rating in each fold. For each fold, we used DCR to find the optimal combination of context and context-linked features. We compare this technique to collaborative filtering without contexts and to contextual pre-filtering, which is similar to Equation 2 with C_2 and C_3 null, so that only the neighbor selection is constrained. In order to determine the value of context-linked features, we created two additional test conditions: one using the contextual feature alone (degree of hunger) and one using the contextual feature together with the context-linked features (the demographic and item features discussed above).

In order to assess the effectiveness of our particle swarm technique, we performed an exhaustive search of the constraint space (8,192 candidate combinations) and compared this result with that found by the optimization technique.
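As a concrete illustration of the search procedure described in Section 3.2.1, the following is a self-contained sketch of improved BPSO under stated assumptions: `evaluate` stands in for running the recommender over a training fold and returning RMSE, both learning factors share a single `alpha` (the paper sets both to 2.0), and all names are ours rather than the paper's.

```python
import math
import random

def improved_bpso(evaluate, n_bits, n_particles=3, t_max=100,
                  alpha=2.0, w_start=0.9, w_end=0.4, v_max=2.0, seed=0):
    """Sketch of improved BPSO (after Khanesar et al. [11]).
    evaluate(bits) returns the value to minimize (RMSE in the paper)."""
    rng = random.Random(seed)
    S = lambda v: 1.0 / (1.0 + math.exp(-v))          # sigmoid, Eq. (5)
    clamp = lambda v: max(-v_max, min(v_max, v))       # velocity limit
    X = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(n_particles)]
    V1 = [[0.0] * n_bits for _ in range(n_particles)]  # "toward 1" velocity
    V0 = [[0.0] * n_bits for _ in range(n_particles)]  # "toward 0" velocity
    pbest = [row[:] for row in X]
    pbest_val = [evaluate(row) for row in X]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for t in range(1, t_max + 1):
        w = w_end + (t_max - t) / t_max * (w_start - w_end)  # Eq. (4)
        for i in range(n_particles):
            for j in range(n_bits):
                r1, r2 = rng.random(), rng.random()
                # d-terms from the four cases in Section 3.2.1
                d1_1, d0_1 = ((alpha * r1, -alpha * r1) if pbest[i][j] == 1
                              else (-alpha * r1, alpha * r1))
                d1_2, d0_2 = ((alpha * r2, -alpha * r2) if gbest[j] == 1
                              else (-alpha * r2, alpha * r2))
                V1[i][j] = clamp(w * V1[i][j] + d1_1 + d1_2)  # Eq. (6)
                V0[i][j] = clamp(w * V0[i][j] + d0_1 + d0_2)  # Eq. (7)
                v = V1[i][j] if X[i][j] == 0 else V0[i][j]    # Eq. (8)
                if rng.random() < S(v):                       # Eq. (9)
                    X[i][j] = 1 - X[i][j]                     # flip the bit
            val = evaluate(X[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = X[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = X[i][:], val
    return gbest, gbest_val
```

Each bit of a particle's position corresponds to one context or context-linked feature in one of the constraint sets C_1, C_2, C_3, so a position is exactly one candidate relaxation; only the evaluation function needs to change to plug in a different recommendation algorithm.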
5. EXPERIMENTAL RESULTS

Recall that we want to discover influential contexts and context-linked features, and our method for doing so is to find the optimal set of constraints and see what features are contained therein. Our results show that, on this data set, appropriate relaxation of contexts and features can help improve predictive performance, and that incorporating context-linked features outperforms using contexts only.

5.1 Predictive Performance

The experimental results are shown in Figure 1 and Table 1. We use CO to denote contextual features (degree of hunger) and CL to denote the other demographic and item features that we suspect may be context-linked. These are the results achieved by exhaustive search; the optimization results are discussed in Section 5.3.

[Figure 1: Comparison of Predictive Performance. RMSE (bars) and coverage (line) for Standard CF, Pre-filtering (CL), Pre-filtering (CO), Pre-filtering (CO+CL), DCR (CL), DCR (CO), and DCR (CO+CL).]

As the figure shows, contextual pre-filtering and our DCR model both improve on standard CF without contexts, and the DCR model achieved the best RMSE among these algorithms. Moreover, DCR shows a further accuracy improvement when context-linked features are added, but the same does not hold for contextual pre-filtering. We believe the effects of adding context-linked features are domain-specific, and that those features must be applied to the appropriate components of the model: DCR using CL only and DCR using CO and CL together provide significant improvements in terms of RMSE, but pre-filtering restricts the application of context-linked features to neighborhood filtering only, so it may not provide accuracy improvements; instead, stricter constraints may lower coverage, as shown in the figure.

We also examined coverage. When the data set is sparse, introducing contextual constraints can greatly limit coverage, an effect that we found in our previous research. With this data set, there was little cost in coverage: the best coverage is 99.7% for the standard CF model, while the best DCR model shows 99.3% coverage. Notice that when contexts are introduced into the first algorithm component (neighbor filtering), the coverage may depend on which variables are selected for relaxation in that component, as well as on the density of contextual information in the data set. From Figure 1 and Table 1, we can see that the coverage of pre-filtering using both context and context-linked features is the lowest, because real hunger and gender together form a stricter constraint applied in this component.

Table 1: Optimal Relaxations (Dataset: Food Menu)

| Algorithm | RMSE | Coverage | Optimal Relaxation |
|---|---|---|---|
| Standard CF | 1.203 | 99.7% | N/A |
| Pre-filtering (CL) | 1.202 | 99.3% | C1 = {gender} |
| Pre-filtering (CO) | 1.198 | 99.6% | C1 = {real hunger} |
| Pre-filtering (CO+CL) | 1.204 | 98.3% | C1 = {real hunger, gender} |
| DCR (CL) | 1.163 | 99.7% | C1 = {}; C2 = {food genre, food stuff, food style}; C3 = {food genre, food stuff, food style} |
| DCR (CO) | 1.119 | 99.6% | C1 = {real hunger}; C2 = {}; C3 = {real hunger} |
| DCR (CO+CL) | 1.114 | 99.3% | C1 = {gender}; C2 = {real hunger, virtual hunger, food genre, food stuff, food style}; C3 = {real hunger} |

5.2 Analysis of the Optimal Constraints

Table 1 shows how the different conditions yield different optimal constraints. For pre-filtering and for DCR using contextual features only, real hunger is influential. When context-linked features are added, real hunger is still influential, but it is applied to different components in the DCR model. Context-linked features can provide accuracy improvements, but they have to be applied to the appropriate components: the best DCR model uses all of the algorithm components, the neighbor-filtering part of the algorithm retains only "gender", and the hunger-related features are applied elsewhere.

The context-linked features gender, food genre, food stuff, and food style are all selected and applied in the optimal model. The position of the contextual constraints in the optimal relaxation is also telling. "Gender" turns out to be the best feature for C1, indicating that same-sex peer neighborhoods work best in this domain. We also see a fully restricted constraint (all contexts and food features) in C2, which is where we choose the ratings used in constructing the peer user's baseline rating. This confirms the core assumption of CARS, that users' preferences differ from context to context, so neighbors' exactly matched contextual ratings contribute significantly to predicting a user's rating on the same item under the same contexts. The optimal model took real hunger as the only constraint in the third component, which implies that a user's average rating under the same real-hunger condition works best as the baseline for this component.

5.3 Optimization Performance

It is clear that differential context relaxation is useful both for improving algorithm performance and for discovering crucial features. However, exhaustive search is not a practical or scalable way to realize this goal, although it was possible on this data set. We therefore used this data set as an opportunity to experiment with BPSO as an alternative method of finding the optimal context relaxation.

We examined particle counts from 1 to 5, using a maximum of 100 iterations. Classical BPSO became trapped in a local minimum: it converged to an RMSE (1.116) not quite as good as the global minimum found by exhaustive search (1.114). The improved BPSO algorithm does find the global minimum RMSE and the same set of relaxations as the exhaustive search. Figure 2 shows the learning curve of the improved version, where N-BPSO denotes a swarm of N particles.

[Figure 2: Improved BPSO. RMSE learning curves over 100 iterations for 1-BPSO, 3-BPSO, and 5-BPSO.]

Improved BPSO finds the optimal solution on average around the 18th iteration using one particle, and around the 12th iteration using five particles. 18 iterations with 1 particle corresponds to 18 evaluations over the training set, a more than 455× reduction in computational cost relative to exhaustive search. Increasing the number of particles expands the power of the social contributions by other particles and can reduce the number of iterations, but it also increases the amount of computation per iteration. The selection of the number of particles and the number of iterations depends on the specific cost and computational complexity of the algorithm.

6. CONCLUSIONS

Differential context relaxation is a technique for context-aware recommendation that decomposes the recommendation algorithm into components and filters the data used in each component. The context is treated as a set of constraints, which may be relaxed to achieve the best recommendation performance.

In this work, we examined an optimization technique, binary particle swarm optimization, to efficiently locate the highest-performing set of contextual constraints. We show that BPSO can find the globally optimal constraint set with a fraction of the iterations required by exhaustive search. We also show that both contextual features and context-linked features can be useful in our context relaxation model. Furthermore, our algorithm highlights the different roles that contextual features play, producing some interesting results, such as the fact that same-sex neighbor populations work best in our food data set.

Future research will focus on three parts: 1) exploring real-valued (as opposed to binary) definitions of differential context relaxation; 2) applying the same technique to additional algorithms beyond user-based collaborative filtering; and 3) combining DCR with latent factor models, such as matrix factorization, in order to alleviate the sparsity problem of contextual ratings and examine our differential model in another way.

7. REFERENCES

[1] A. Abdelwahab, H. Sekiya, I. Matsuba, Y. Horiuchi, and S. Kuroiwa. Feature optimization approach for improving the collaborative filtering performance using particle swarm optimization. Journal of Computational Information Systems, 8(1):435–450, 2012.

[2] L. Baltrunas, B. Ludwig, S. Peer, and F. Ricci. Context relevance assessment and exploitation in mobile recommender systems. Personal and Ubiquitous Computing, pages 1–20, 2011.

[3] L. Baltrunas, B. Ludwig, and F. Ricci. Matrix factorization techniques for context aware recommendation. In Proceedings of the fifth ACM conference on Recommender systems, pages 301–304. ACM, 2011.

[4] H. Drias. Web information retrieval using particle swarm optimization based approaches. In Web Intelligence and Intelligent Agent Technology (WI-IAT), 2011 IEEE/WIC/ACM International Conference on, volume 1, pages 36–39. IEEE, 2011.

[5] A. Felfernig and R. Burke. Constraint-based recommender systems: technologies and research issues. In Proceedings of the 10th international conference on Electronic commerce, ICEC '08, pages 3:1–3:10, New York, USA, 2008. ACM.

[6] N. Hariri, B. Mobasher, R. Burke, and Y. Zheng. Context-aware recommendation based on review mining. In Proceedings of the 9th Workshop on Intelligent Techniques for Web Personalization and Recommender Systems (ITWP 2011), page 30, 2011.

[7] Z. Huang, X. Lu, and H. Duan. Context-aware recommendation using rough set model and collaborative filtering. Artificial Intelligence Review, pages 1–15, 2011.

[8] A. Karatzoglou, X. Amatriain, L. Baltrunas, and N. Oliver. Multiverse recommendation: n-dimensional tensor factorization for context-aware collaborative filtering. In Proceedings of the fourth ACM conference on Recommender systems, pages 79–86. ACM, 2010.

[9] J. Kennedy and R. Eberhart. Particle swarm optimization. In Neural Networks, 1995. Proceedings., IEEE International Conference on, volume 4, pages 1942–1948. IEEE, 1995.

[10] J. Kennedy and R. Eberhart. A discrete binary version of the particle swarm algorithm. In Systems, Man, and Cybernetics, 1997. Computational Cybernetics and Simulation., 1997 IEEE International Conference on, volume 5, pages 4104–4108. IEEE, 1997.

[11] M. Khanesar, M. Teshnehlab, and M. Shoorehdeli. A novel binary particle swarm optimization. In Control & Automation, 2007. MED'07. Mediterranean Conference on, pages 1–6. IEEE, 2007.

[12] C. Ono, Y. Takishima, Y. Motomura, and H. Asoh. Context-aware preference model based on a study of difference between real and supposed situation data. User Modeling, Adaptation, and Personalization, pages 102–113, 2009.

[13] Y. Shi and R. Eberhart. Empirical study of particle swarm optimization. In Evolutionary Computation, 1999. CEC 99. Proceedings of the 1999 Congress on, volume 3. IEEE, 1999.

[14] S. Ujjin and P. Bentley. Particle swarm optimization recommender system. In IEEE Swarm Intelligence Symposium, pages 124–131. IEEE, 2003.

[15] B. Vargas-Govea, G. González-Serna, and R. Ponce-Medellín. Effects of relevant contextual features in the performance of a restaurant recommender system. In ACM RecSys '11, the 3rd Workshop on Context-Aware Recommender Systems (CARS-2011), 2011.

[16] Y. Zheng, R. Burke, and B. Mobasher. Differential context relaxation for context-aware travel recommendation. In 13th International Conference on Electronic Commerce and Web Technologies (EC-WEB 2012), volume 123 of Lecture Notes in Business Information Processing, pages 88–99. Springer Berlin Heidelberg, 2012.