<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Estimate features relevance for groups of users</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Stefano Cereda</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Leonardo Cella</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Paolo Cremonesi</string-name>
        </contrib>
      </contrib-group>
      <abstract>
<p>In item cold-start, collaborative filtering techniques cannot be used directly since newly added items have no interactions with users. Hence, content-based filtering is usually the only viable option left. In this paper we propose a feature-based machine learning model that addresses the item cold-start problem by jointly exploiting item content features, past user preferences and interactions of similar users. The proposed solution learns the relevance of each content feature with respect to a community of similar users. In our experiments, the proposed approach outperforms classical content-based filtering on an enriched version of the Netflix dataset.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
<title>Introduction</title>
<p>Traditional Content-Based recommender systems (CBF) need to represent user
and item profiles in order to recommend items similar to those previously liked
by users. Their main advantage is the capability of recommending previously
unseen items; thus they solve the item cold-start issue. On the contrary,
Collaborative Filtering (CF) algorithms usually reach better prediction
performance, especially with many interactions between users and items [1]. Their
downside consists of the inability to recommend items with no previous
interactions. Even if CBF approaches are able to solve the item cold-start problem,
they are affected by at least two relevant limitations: recommended items tend
to be too similar to previously rated items (over-specialization problem), and
recommendations do not depend on the preferences of similar users.</p>
<p>Some attempts to improve the recommendation quality of CBF consist of
filtering methods and embedded approaches. The main drawback of the former is
that they do not take into account the ratings of users, therefore ignoring
whether the feature-based similarity between items is aligned with the user
perception of similarity ([2], [3]). Embedded approaches perform feature
weighting during the learning process and use the objective function to guide
the search for relevant features. Instances of this methodology are SSLIM [5],
UFSM [4] and Factorization Machines. The main drawback of embedded methods is
the coupling between the collaborative and content components of the model:
when used on datasets with unstructured user-generated features (e.g., tags),
the noise from the features propagates to the collaborative part, affecting the
overall prediction quality.</p>
<p>As a first solution to this problem, we previously developed a machine learning
algorithm whose aim is to compute global feature weights (global in the sense
that the relevance scores are shared by all the different users) based on a pure
item collaborative filtering approach. Its main objective was to embed in item
features also information regarding user interests. In this research we propose
an extension to this approach; the main contribution brought by this work is a
general, straightforward wrapper that makes content-based methods rate-aware
and based on communities of similar users. Our experiments are conducted on the
Netflix dataset in a version enriched with IMDB attributes. The experiments show
that the proposed solution outperforms classical pure content-based approaches.</p>
    </sec>
    <sec id="sec-2">
      <title>Clustered Feature Weighting</title>
<p>Our objective is to recommend items from a set $I$ to users in a set $U$. Items
are described by the set of features $F$. User interactions are collected in the
feedback matrix $R \in \mathbb{R}^{|U| \times |I|}$. Item features are described by the
item-feature matrix $A \in \{0,1\}^{|I| \times |F|}$, with $a_{ij} = 1$ iff item $i$ has feature $j$.
In general, user-cluster based recommender systems rely on a cluster-dependent
similarity matrix $S^{p} \in \mathbb{R}^{|I| \times |I|}$, where $p$ denotes the considered subset of users.</p>
<p>The predicted rating of user $u$, who belongs to group $p_u$, for item $i$ is computed
as follows:
$$\hat{r}_{ui} = \frac{\sum_{j \in N_k^{p_u}(i)} r_{uj}\, s_{ij}^{p_u}}{\sum_{j \in N_k^{p_u}(i)} s_{ij}^{p_u}} \qquad (1)$$
where $s_{ij}^{p_u}$ is a local item-item similarity derived from the user subset $p_u$ to
which the target user $u$ belongs, and $N_k^{p_u}(i)$ is the set of $k$ nearest neighbors
of item $i$ according to the similarity model of cluster $p_u$. Starting from this
model, we would recommend the items whose predicted ratings are the largest.
Feature weighting aims to derive a feature vector $w^{p_u} \in \mathbb{R}^{|F|}$ such that each
entry $w_l^{p_u} \in w^{p_u}$ reflects the relevance of the $l$-th feature for the subset of
users $p_u$. We define the weighted similarity $s_{ij}^{p_u}$ between items $i$ and $j$ for the
cluster $p_u$ as:
$$s_{ij}^{p_u} = \sum_{f \in F} w_f^{p_u}\, a_{if}\, a_{jf} = \langle w^{p_u},\, a_i \odot a_j \rangle \qquad (2)$$
where $a_i, a_j \in \{0,1\}^{|F|}$ are the feature vectors of items $i$ and $j$ respectively and
$\odot$ is the element-wise product. We propose to compute the feature weights by
solving the following least-squares (LSQ) problem for each cluster of users $p_u$:
$$\operatorname*{argmin}_{w^{p_u}} \sum_{i \in I} \sum_{j \in I \setminus \{i\}} \left\| s_{ij}^{CF} - s_{ij}^{p_u} \right\|^2 \qquad (3)$$
More specifically, in our experiments we have adopted LSLIMr0 [6] as the local
similarity matrix $S^{CF}$ and CLUTO [7] to derive the user subsets $p_u$ (this choice
is based on the methodology followed in [6]). Since our goal is to learn a set of
feature weights so that CBF similarities mimic CF similarities as closely as
possible, there is no need to add a regularization term, thus greatly simplifying
the optimization. Experimental results confirmed this hypothesis.</p>
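<p>Because the weighted similarity of Equation 2 is linear in the weights, the per-cluster problem of Equation 3 reduces to an ordinary least-squares fit. The following is a minimal NumPy sketch of that idea (the function names are ours, not the authors' implementation):</p>

```python
import numpy as np

def fit_feature_weights(A, S_cf):
    """Solve Eq. 3 for one cluster: find the weight vector w minimizing
    sum over i != j of (s_ij^CF - w . (a_i * a_j))^2, with no regularizer."""
    n_items, _ = A.shape
    rows, targets = [], []
    for i in range(n_items):
        for j in range(n_items):
            if i != j:
                rows.append(A[i] * A[j])        # element-wise product a_i * a_j
                targets.append(S_cf[i, j])
    X = np.asarray(rows, dtype=float)
    y = np.asarray(targets, dtype=float)
    w, *_ = np.linalg.lstsq(X, y, rcond=None)   # plain unregularized LSQ
    return w

def weighted_similarity(A, w):
    """Eq. 2 in matrix form: S = A diag(w) A^T."""
    return (A * w) @ A.T
```

Enumerating all item pairs is quadratic in the catalogue size; in practice the pair set would be restricted (e.g. to the nonzero entries of the CF similarity), but the sketch shows the structure of the fit.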
<p>When a new item is added to the catalog, we use $w^{p_u}$ to compute its
weighted similarity w.r.t. the previously existing items. Then, it can be
recommended to users belonging to subset $p_u$ by using Equation 1. We call the
proposed approach CLFW (Clustered Least-square Features Weighting).</p>
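<p>Once a cold item's weighted similarities to the existing catalogue are available, the prediction of Equation 1 is a similarity-weighted average over the user's rated neighbours. A small sketch, with our own naming (not the reference code):</p>

```python
import numpy as np

def predict_rating(user_ratings, sims_to_i, k=10):
    """Eq. 1: average the user's ratings over the k rated items most
    similar to the target item i, weighted by those similarities."""
    rated = np.flatnonzero(user_ratings)                  # items the user rated
    top = rated[np.argsort(sims_to_i[rated])[::-1][:k]]   # k nearest rated neighbours
    denom = sims_to_i[top].sum()
    if denom == 0.0:
        return 0.0                                        # no similar rated item
    return float(np.dot(user_ratings[top], sims_to_i[top]) / denom)
```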
    </sec>
    <sec id="sec-3">
      <title>Experimental Evaluation</title>
<p>Dataset. For our experiments, we used a version of the Netflix dataset
enriched with structured and unstructured attributes extracted from IMDB.
This dataset has 186K users, 6.5K movies and 6.7M ratings on a 1-5 scale. The
rating data is enriched with 16803 binary attributes representing various
kinds of meta-information on movies, such as director, actors, genres and
user-generated tags (the set of content features was significantly augmented
with respect to our previous unclustered work). To investigate the new-item
scenario, we performed a 70/30 random hold-out split over items, as shown in
Figure 1: the 4866 warm items form the sub-matrix A and the 1623 cold items form
the sub-matrix B. The sub-matrix A has then been divided by moving
30% of positive (&gt;3) ratings into A2 and everything else into A1. A1 has then
been used to compute LSLIMr0 and therefore to fit CLFW. When evaluating
the warm-start scenario we used A1 as user profiles and A2 as ground truth,
whereas for the cold-item scenario we used the positive ratings of B as ground
truth and A1 as user profiles.</p>
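<p>The partitioning just described can be sketched as follows (a hypothetical reconstruction with our own function names and a fixed seed; the paper does not publish its splitting code):</p>

```python
import numpy as np

def split_items(n_items, warm_frac=0.7, seed=0):
    """70/30 random hold-out over items: warm item ids form A, cold ids form B."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_items)
    n_warm = int(warm_frac * n_items)
    return perm[:n_warm], perm[n_warm:]

def split_positive(A, pos_thr=3, test_frac=0.3, seed=0):
    """Move test_frac of the positive entries (rating above pos_thr) of A
    into A2; everything else stays in A1."""
    rng = np.random.default_rng(seed)
    A1, A2 = A.copy(), np.zeros_like(A)
    pos = np.argwhere(A > pos_thr)                    # candidate positive ratings
    moved = pos[test_frac > rng.random(len(pos))]     # ~30% of them
    for u, i in moved:
        A2[u, i] = A[u, i]
        A1[u, i] = 0
    return A1, A2
```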
<p>Baselines. As in the previous work, we have used simple unweighted cosine
similarity (Cos) and TF-IDF-weighted cosine similarity (CosIDF) as CBF baselines
to evaluate the performance of CLFW in both scenarios.</p>
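<p>Both baselines can be sketched in a few lines (a minimal illustration; the IDF variant uses the usual log(N/df) weighting, which may differ in detail from the authors' exact setup):</p>

```python
import numpy as np

def cosine_sim(A, weights=None):
    """Item-item cosine similarity over the binary feature matrix A.
    Pass per-feature weights (e.g. IDF) to obtain the CosIDF baseline."""
    W = A if weights is None else A * weights
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    norms[norms == 0] = 1.0                 # guard items with no features
    Wn = W / norms
    return Wn @ Wn.T

def idf(A):
    """Inverse document frequency log(N / df_f) of each feature f."""
    df = np.maximum(A.sum(axis=0), 1)
    return np.log(A.shape[0] / df)
```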
<p>Performance Analysis. In Table 1, we report the RMSE computed over predicted
ratings for different neighborhood sizes $k$ in the new-item scenario. The
warm-start scenario is instead represented by Table 2.</p>
      <p>In both scenarios CLFW consistently outperforms both baselines on RMSE
at every value of $k$. Moreover, in the warm-start scenario, it is nearly as good
as LSLIMr0. We also want to highlight that CLFW differs from the other CBF
baselines solely in the feature weighting scheme. Therefore, the improvement in
performance must be due to a better feature weighting discovered by our
approach. By comparing the CLFW column with the regCLFW one (which contains the
results of our algorithm when the feature weights are computed by adding an l2
regularization term to Equation 3), we can observe that the regularization does
not bring a performance improvement. This is reasonable and in agreement with
our prediction: the data from which we are learning do not contain noise and,
furthermore, the number of weights that we learn does not allow the model
complexity to be overestimated.</p>
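<p>For completeness, the RMSE figures in Tables 1 and 2 follow the standard definition over the held-out (user, item) pairs (a sketch; the argument names are ours):</p>

```python
import numpy as np

def rmse(preds, truth):
    """Root mean squared error over the (user, item) pairs of the test set."""
    preds = np.asarray(preds, dtype=float)
    truth = np.asarray(truth, dtype=float)
    return float(np.sqrt(np.mean((preds - truth) ** 2)))
```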
    </sec>
    <sec id="sec-4">
      <title>Conclusions and Future Work</title>
<p>With this research we investigated the possibility of deriving a user-based
feature weighting. We have presented ongoing results of an extended approach
that addresses the item cold-start issue by defining personalized feature
relevance. The ongoing development is focused on the usage of different
personalization methodologies and the extension to other datasets. Moreover, we
are interested in combining this clustered approach with the already developed
global one.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>István Pilászy, Domonkos Tikk: Recommending new movies: even a few ratings are more valuable than metadata. RecSys '09: Proceedings of the Third ACM Conference on Recommender Systems, 93-100 (2009)</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>Pasquale Lops, Marco de Gemmis, Giovanni Semeraro: Content-based recommender systems: State of the art and trends. Recommender Systems Handbook, 73-105 (2011)</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>Panagiotis Symeonidis, Alexandros Nanopoulos, Yannis Manolopoulos: Feature-Weighted User Model for Recommender Systems. UM '07: Proceedings of the 11th International Conference on User Modeling, Springer, 97-106 (2007). doi:10.1007/978-3-540-73078-1-13</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>Asmaa Elbadrawy, George Karypis: User-Specific Feature-Based Similarity Models for Top-n Recommendation of New Items. ACM Trans. Intell. Syst. Technol. 6(33), 33:1-33:20 (May 2015). doi:10.1145/2700495</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>Xia Ning, George Karypis: Sparse Linear Methods with Side Information for Top-N Recommendations. WWW '12 Companion: Proceedings of the 21st International Conference on World Wide Web, ACM, 581-582 (2012)</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>Evangelia Christakopoulou, George Karypis: Local Item-Item Models for Top-N Recommendation. RecSys '16: 10th ACM Conference on Recommender Systems, 67-74 (2016)</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>CLUTO: http://glaros.dtc.umn.edu/gkhome/cluto/cluto/overview</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>