Reconstructing Trust Matrix to Improve Prediction Accuracy and Solve Cold User Problem in Recommender Systems Shunpan Liang*, Lin Ma, Fuyong Yuan College of Information Science and Engineering, Yanshan University, Qinhuangdao, Hebei 066004, China liangshunpan@ysu.edu.cn *: corresponding author. ABSTRACT Collaborative filtering(CF) has become the most well-known Recommender systems(RS) are a type of solution to the and commonly used techniques to generate recommenda- information overload problem suffered by users of websites tions in RS[2]. The intuition of CF is making predictions that allow the rating of certain items. Collaborative filter- about a useraŕs, preferences or tastes based on the prefer- ing(CF) is one of the most widely used methods in person- ences of a group of users that are considered similar to this alized RS. The most critical part of collaborative filtering is user. However, CF suffers from several inherent issues, for to compute similarities among users using a user-item rating example, data sparsity and cold start. Data sparsity refers matrix based on which recommendations can be generated. to the difficulty in finding sufficient and reliable similar users However, CF suffers from several inherent issues, such as due to the fact that users in general only rate a small por- data sparsity and cold start, which affect the quality of rec- tion of items, while cold start includes two main problems: ommendation seriously. To address these problems, we pro- (1) How to recommend to the new users who have not rated pose a reconstructing trust matrix measure in this paper, any items; (2) How to deal with the items never rated or which combines user similarity and weighted trust propa- purchased.To resolve the issue, additional information such gation. Specifically, we first remove the trust relationship as trust[3] is studied and incorporated into CF. But as cold of those users whose similarity falls below a certain thresh- users do not have a large number of trusted neighbors, we old. We then add the users that are not in the trust matrix cannot use the trust information directly. Fortunately, trust into it when the similarity between them exceeds a certain can be propagated along with the web-of-trust. That is, if threshold. Finally, weighted trust propagation is considered, user A trusts B and B trusts C, it can be inferred that A aiming to distinguish trusted neighbors in a shorter distance trusts C in some degree. Therefore, it is necessary to prop- with those in a longer distance and incorporate more trust- agate trust in order to find more trusted neighbors for cold ed neighbors, especially useful for cold users. Experimental users. results on two real-world data sets show that our method The majority of earlier approaches for prediction in trust- achieves superior accuracy and it can solve cold user prob- based systems make predictions utilizing all the trust state- lem as well. ments present in the data. But user A trusting user B dose not signify that the similarity between A and B will also be high and the low similarity in trust relationship will impact CCS Concepts prediction quality adversely. Thus in this paper we first re- •Information systems → World Wide Web; •Web move the trust relationship of those users whose similarity searching and information discovery → Social rec- falls below a certain threshold. We then add the users that ommendation; are not in the trust matrix into it when the similarity be- tween them exceeds a certain threshold. Besides, weighted trust propagation is considered, aiming to distinguish trust- Keywords ed neighbors in a shorter distance with those in a longer Recommender systems; Collaborative filtering; Reconstruct- distance and to incorporate more trusted neighbors, which ing trust matrix; Accuracy; Cold user is especially useful for cold users. The rest of the paper is organized as follows. In Section 2, we give a brief overview of related research on trust-based 1. INTRODUCTION CF. The proposed approach is then elaborated in Section 3. The emergence of Web 2.0 applications has greatly changed Experiments on two real-world data sets are conducted in usersaŕ , styles of online activities from searching and brows- Section 4. Finally, Section 5 concludes our work. ing to interacting and sharing[1]. However, it is up against the challenge of information overload meanwhile, which is well-known as requiring to spend a lot of time to find useful 2. RELATED WORK information. Recommender systems (RS) are designed to To better model user preferences for the cold users who on- cope with the problem and heavily used in e-commerce ap- ly rated a few items, additional information is often adopted plications such as Amazon.com, Ebay.com, and Netflix.com and trust is of less ambiguity and more relevant to similarity. etc. Till now many trust-based approaches have been proposed. Jamali et al. designed the TrustWalker approach[4] to users a and b. ra,i and rb,i denote the rating of item by randomly select trusted neighbors in the trust networks, users a and b respectively. ra and rb are the average rating where trust information of the selected neighbors is com- value of users a and b . bined with an item-based technique to predict item ratings. In a recommender system, T rust presents the traditional In contrast, our work focuses on generating predictions by trust matrix, and ta,b is the trust value between a and b. combining trust information with a user-based technique. Generally, the trust value is binary, namely 0 and 1, where Massa et al. proposed the MoleTrust algorithm[5], which 0 stands for distrust and 1 indicates absolutely trust. We performs depth-first search, to propagate and infer trust in set two similarity threshold α and β when reconstructing the trust networks. Empirical results show that the coverage trust matrix. First, the low similarity in trust relationship is significantly enlarged but the accuracy remains compara- will reduce the quality of rating prediction, thus, for the ble when propagating trust. two trust users a and b, namely ta,b = 1, we will remove Ray et al. presented another trusted method[6]. The trust their trust statement if Sa,b < α and reserve their trust links between two users will be removed if their similarity is relationship if Sa,b > α. It is defined as (2): lower than a threshold. But empirical results show that good performance is achieved at the cost of poor coverage, and it ( 0, ta,b = 1 and Sa,b < α fails to function in cold conditions where user similarity may ta,b = (2) not be computable. 1, ta,b = 1 and Sa,b > α Recently, Deng et al. proposed a social network based ser- The high similarity in trust matrix will improve recom- vice recommendation method with trust enhancement known mendation results, accordingly, for the two users a and b as RelevantTrustWalker[7]. First, a matrix factorization which are not in the trust matrix, namely ta,b = 0, if Sa,b > method is utilized to assess the degree of trust between users β we will add the trust relationship between a and b. How- in social network. Next, an extended random walk algorith- ever, if Sa,b < β, the trust relationship cannot be added for m is proposed to obtain recommendation results. a and b. It is defined as (3): Guo et al. presented a merged method called Mergex[8], which merged the ratings of trusted neighbors in order to ( form a new and more complete rating pro?le for the active 1, ta,b = 0 and Sa,b > β ta,b = (3) users based on which recommendations can be generated by 0, ta,b = 0 and Sa,b < β integrating similarity and trust into CF. Through the above steps, the trust matrix is reconstruct- Our work focuses on generating predictions by combining ed, which is defined as U pT rust. weighted trust propagation with a user-based technique. In order to achieve a better result, we first remove the trust s- 3.2 Weighted trust propagation algorithm(WTPA) tatements between two users if their similarity is lower than The cold users are generally defined as the users who have a threshold considering low similarity in trust will affect the rated less than five items[10]. Since cold users are usually prediction accuracy. Since high similarity in trust state- less active in the systems, they may not have a large number ments can improve the recommendation results, the trust of trusted neighbors. Fortunately, trust can be propagated links between two users will be added into the original trust along with the web-of-trust. That is, if user A trusts B and matrix if their similarity overtops a certain threshold. Fi- B trusts C, it can be inferred that A trusts C in some degree. nally, weighted trust propagation will be considered aiming Therefore, it is necessary to propagate trust in order to find to distinguish trusted neighbors in a shorter distance with more trusted neighbors for cold user problem. MoleTrust[11] those in a longer distance and find more trusted neighbors, is the method using the above trust propagation to infer the which is especially useful for cold users. trust value of indirectly connected users. Note that the trust value in the reconstructed trust matrix U pT rust is binary, 3. OUR METHOD i.e., 0 or 1. As a result, the inferred trust value by the Mo- In this section, we will present the specific method. We leTrust will be also binary, and thus we cannot distinguish will introduce how to incorporate similarity and trust to re- trusted neighbors in a shorter distance with those in a longer construct trust matrix. Then the weighted trust propagation distance. Hence, we adopt a weighting factor to devalue the will be explained. inferred trust in a long distance: 3.1 Reconstruct trust matrix algorithm(RTMA) 0 1 ta,b = × ta,b (4) The majority of earlier approaches for prediction in trust- d based systems make predictions utilizing all the trust state- Where ta,b denotes the trust value in the trust matrix, 0 ments present in the data. But as we all know, user A trust- ta,b is the weighted trust value and d is the shortest distance ing user B does not mean that the correlation between A and between users a and b determined by a breath first search B will be also high. So we present the method of combining algorithm. Note that the greater d is, the more trusted similarity and trust to reconstruct trust matrix. Pearson neighbors will be inferred. However, the more cost will be Correlation Coefficient(PCC) is a preferable method[9] and taken and more noise is likely to be incorporated. In this we adopt PCC to compute similarity in recommender sys- work, we restrict d ≤ 3 to prevent meaningless searching and tems, it is defined as follow: save computational cost for large-scale data sets. In fact, as P we will show later, our method works well enough when d is i (ra,i − ra )(rb,i − rb ) small. The trust matrix is U pT rust d after using weighted Sa,b = pP (1) trust propagation. pP i (ra,i − ra ) 2 i (ra,i − ra ) 2 Where i represents the set of commonly rated items by 3.3 The description of our method 4.3 Results and analysis According to the above description, the algorithm of RT- In this section, we will verify our method of reconstructing MA and WTPA are expressed as follows: trust matrix. Step1: According to user-item rating matrix and PCC Fig. 1(a) and (b) are the performances of these approach- algorithm, compute similarities between every two users. es on FilmTrust in terms of MAE and RMSE respectively. Step2: Predefine threshold α and β. The threshold α and β are set as 0.2 and 0.3 respectively. Step3: For two users u and v in user-item rating matrix, The results show that our method UpTrust d is the best reset tu,v according to definition (2) and (3). method. PCC is better than Trust d and it illustrates that Step4: Get the U pT rust. the low similarity in trust statement can deteriorate the pre- 0 Step5: For two users u and v in UpTrust d, reset tu,v diction results. From the figure, we can observe that with according to definition (4). the lengthening of trust propagation, Trust 3 is better than Step6: Get the U pT rust d. Trust 2 and Trust 2 is better than Trust 1, we may conclude that trust propagation is helpful to improve recommenda- 4. EXPERIMENTS tion performance. However, in the method UpTrust d, Up- To verify our method, we conduct experiments on two Trust 1 is the best method and it better than UpTrust 2 and real-world data sets using the 5-fold cross validation method. UpTrust 3. This is because although more trusted neighbors The data set is split into five disjoint sets; for each iteration, can be identified via trust propagation, it does not guarantee four folds are used as training set and one as testing set. We that the rating profile will cover a lot more items and hence apply the K-Nearest Neighbor(KNN) approach to select a increase accuracy greatly. Rather, it is possibly that adding group of similar users whose ranking is in the top K accord- few trusted neighbors may result in some noisy ratings, and ing to similarity; we vary K from 5 to 50 with step 5. The hence harm the predictive performance. ratings of selected similar users are aggregated to predict itemsaŕ , ratings by a mean-centering approach[12]. 4.1 Data sets Two real-world data sets are used in our experiments, namely FilmTrust and Epinions. They are available data sets that contain both the trust matrix and user-item rating matrix. FilmTrust is a trust-based social site in which users can (a) The results of MAE (b) The results of RMSE rate and review movies. It includes 1986 users, 2071 movies and 35497 ratings. The ratings take values from 0.5 to Figure 1: The performances on FilmTrust 4.0with step 0.5. In addition, 1853 trust ratings that are issued by 609 users are gathered. The sparsity is 98.86%. Fig. 2(a) and (b) are the results of those methods on Epinions is a website in which users can express their opin- Epinions in terms of MAE and RMSE respectively and the ions about items (such as movies, books, and software) by threshold α and β are set as 0.1 and 0.3. Likewise, Up- assigning numerical ratings and writing text reviews. The Trust d is the best approach of all and PCC is better than data set consists of 49K users who issued 664K ratings over Trust d. However, with the lengthening of trust propaga- 139K different items and 478K trust statements. The rat- tion, Trust 1 is the best method in Trust d while the Up- ings are integers rated from 1 to 5 and the sparsity is 99.95%. Trust 3 is the greatest method in UpTrust d, which is just The trust values of both data sets are binary (either 1 or 0). opposite with the results on FilmTrust. This also explains that trust propagation can not guarantee better results will 4.2 Evaluation metrics be received, although more trusted neighbors can be found, The evaluation metrics are mean absolute error (MAE), it is likely to add noisy information and decrease the accu- root mean square error (RMSE) and rating coverage (RC) racy of recommendation.. respectively. They are defined as follows: X M AE = |rui − rbui |/|T | (5) (u,i)∈T s X RM SE = |rui − rbui |/|T | (6) (u,i)∈T (a) The results of MAE (b) The results of RMSE Where T represents the set of prediction results and |T | is the number of the set, and rbui is the prediction rating of user u to item i. Figure 2: The performances on Epinions In addition, to verify whether our method can solve cold s- M tart problem, we conduct experiments on FilmTrust in terms RC = (7) N of MAE, RMSE and RC on cold users. Similarly, we com- Where M and N are the number of predictable and all pare UpTrust 1 with PCC and the performances are shown the testing ratings, respectively. in Fig. 3. We can get that UpTrust 1 is much better than PCC in terms of cold users which declares that our method Proceedings of the 43rd Hawaii International can solve cold start problem to some extent. Table 1 is the Conference on System Sciences, 2010, 1-9. performances of the two methods in terms of RC, which can [7] S. Deng, L. Huang, and G. Xu, Social network-based further verify that our method can solve cold start problem service recommendation with trust enhancement[J], effectively. Expert Systems with Applications, 2014(41), 8075-8084. [8] G. Guo, J. Zhang, and D. Thalmann, Merging trust in collaborative fltering to alleviate data sparsity and cold start[J], Knowledge-based Systems, 2014(57), 57-68. [9] J. S. Breese, D. Heckerman, and C. Kadie et al, Empirical analysis of predictive algorithms for collaborative filtering[C], in Proceedings of the 14th (a) The results of MAE (b) The results of RMSE Conference on University in Artificial Intelligence(UAI98), 1998, 43-52. [10] P. Massa and P. Avesani, Trust-aware recommender Figure 3: The performances on FilmTrust in terms systems[C], in: Proceedings of the 2007 ACM of Cold Users Conference on Recommender Systems, 2007, 17-24. [11] I. Guy, I. Ronen, and E. Wilcox, Do you know recommending people to invite into your social Table 1: The RC in the view of Cold Users on network[C], in: Proceedings of the 14th International FilmTrust Conference on Intelligent User Interfaces, 2009, 77-86. Method RC [12] C. Desrosiers and G. Karypis, A comprehensive survey PCC 43.57% of neighborhood-based recommendation methods[J], UpTrust 1 58.80% Recommender Systems Handbook, 2011, 107-144. 5. CONCLUSIONS This paper presents a reconstructing trust matrix method to improve the prediction accuracy and solve cold user prob- lem of collaborative filtering recommender systems. Consid- ering the similarity in trust matrix will affect recommenda- tion results, we reconstruct traditional trust matrix. Be- sides, to recommend better for cold users and distinguish trusted neighbors in a shorter distance with those in a longer distance, weighted trust propagation is considered. The ex- perimental results on two real data sets demonstrate the effectiveness of our methods in improving the prediction ac- curacy and solving cold user problem of recommender sys- tems. 6. REFERENCES [1] X. Zhou, Y. Xu, and Y. Li et al, The state-of-the-art in personalized recommender systems for social networking[J], Artif. Intell. Rev., 2012, 1-14. [2] D. Jannach, M. Zanker, and M. Ge et al, Recommender systems in computer science and information systemsĺCa landscape of research[J], E-Commerce Web Technol, 2012,76-87. [3] J. Audun and D.K. Walter Quattrochicchi, Taste and trust[J], Trust Management V, 2011, 312-322. [4] M. Jamali, and M. Ester, Trustwalker: a random walk model for combining trustbased and item-based recommendation[C], in: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2009, 397-406. [5] P. Massa and P. Avesani, Trust-aware recommender systems[C], in: Proceedings of the 2007 ACM Conference on Recommender Systems, 2007, 17-24. [6] S. Ray and A. Mahanti, Improving prediction accuracy in trust-aware recommender systems[C], in: