Reconstructing Trust Matrix to Improve Prediction
  Accuracy and Solve Cold User Problem in Recommender
                        Systems

                                       Shunpan Liang*, Lin Ma, Fuyong Yuan
        College of Information Science and Engineering, Yanshan University, Qinhuangdao, Hebei 066004,
                                                     China
                                                    liangshunpan@ysu.edu.cn
                                                    *: corresponding author.

ABSTRACT                                                             Collaborative filtering(CF) has become the most well-known
Recommender systems(RS) are a type of solution to the             and commonly used techniques to generate recommenda-
information overload problem suffered by users of websites        tions in RS[2]. The intuition of CF is making predictions
that allow the rating of certain items. Collaborative filter-     about a useraŕs,  preferences or tastes based on the prefer-
ing(CF) is one of the most widely used methods in person-         ences of a group of users that are considered similar to this
alized RS. The most critical part of collaborative filtering is   user. However, CF suffers from several inherent issues, for
to compute similarities among users using a user-item rating      example, data sparsity and cold start. Data sparsity refers
matrix based on which recommendations can be generated.           to the difficulty in finding sufficient and reliable similar users
However, CF suffers from several inherent issues, such as         due to the fact that users in general only rate a small por-
data sparsity and cold start, which affect the quality of rec-    tion of items, while cold start includes two main problems:
ommendation seriously. To address these problems, we pro-         (1) How to recommend to the new users who have not rated
pose a reconstructing trust matrix measure in this paper,         any items; (2) How to deal with the items never rated or
which combines user similarity and weighted trust propa-          purchased.To resolve the issue, additional information such
gation. Specifically, we first remove the trust relationship      as trust[3] is studied and incorporated into CF. But as cold
of those users whose similarity falls below a certain thresh-     users do not have a large number of trusted neighbors, we
old. We then add the users that are not in the trust matrix       cannot use the trust information directly. Fortunately, trust
into it when the similarity between them exceeds a certain        can be propagated along with the web-of-trust. That is, if
threshold. Finally, weighted trust propagation is considered,     user A trusts B and B trusts C, it can be inferred that A
aiming to distinguish trusted neighbors in a shorter distance     trusts C in some degree. Therefore, it is necessary to prop-
with those in a longer distance and incorporate more trust-       agate trust in order to find more trusted neighbors for cold
ed neighbors, especially useful for cold users. Experimental      users.
results on two real-world data sets show that our method             The majority of earlier approaches for prediction in trust-
achieves superior accuracy and it can solve cold user prob-       based systems make predictions utilizing all the trust state-
lem as well.                                                      ments present in the data. But user A trusting user B dose
                                                                  not signify that the similarity between A and B will also be
                                                                  high and the low similarity in trust relationship will impact
CCS Concepts                                                      prediction quality adversely. Thus in this paper we first re-
•Information systems → World Wide Web; •Web                       move the trust relationship of those users whose similarity
searching and information discovery → Social rec-                 falls below a certain threshold. We then add the users that
ommendation;                                                      are not in the trust matrix into it when the similarity be-
                                                                  tween them exceeds a certain threshold. Besides, weighted
                                                                  trust propagation is considered, aiming to distinguish trust-
Keywords                                                          ed neighbors in a shorter distance with those in a longer
Recommender systems; Collaborative filtering; Reconstruct-        distance and to incorporate more trusted neighbors, which
ing trust matrix; Accuracy; Cold user                             is especially useful for cold users.
                                                                     The rest of the paper is organized as follows. In Section
                                                                  2, we give a brief overview of related research on trust-based
1.   INTRODUCTION                                                 CF. The proposed approach is then elaborated in Section 3.
   The emergence of Web 2.0 applications has greatly changed      Experiments on two real-world data sets are conducted in
usersaŕ
      , styles of online activities from searching and brows-
                                                                  Section 4. Finally, Section 5 concludes our work.
ing to interacting and sharing[1]. However, it is up against
the challenge of information overload meanwhile, which is
well-known as requiring to spend a lot of time to find useful     2.   RELATED WORK
information. Recommender systems (RS) are designed to                To better model user preferences for the cold users who on-
cope with the problem and heavily used in e-commerce ap-          ly rated a few items, additional information is often adopted
plications such as Amazon.com, Ebay.com, and Netflix.com          and trust is of less ambiguity and more relevant to similarity.
etc.                                                              Till now many trust-based approaches have been proposed.
   Jamali et al. designed the TrustWalker approach[4] to                 users a and b. ra,i and rb,i denote the rating of item by
randomly select trusted neighbors in the trust networks,                 users a and b respectively. ra and rb are the average rating
where trust information of the selected neighbors is com-                value of users a and b .
bined with an item-based technique to predict item ratings.                In a recommender system, T rust presents the traditional
In contrast, our work focuses on generating predictions by               trust matrix, and ta,b is the trust value between a and b.
combining trust information with a user-based technique.                 Generally, the trust value is binary, namely 0 and 1, where
   Massa et al. proposed the MoleTrust algorithm[5], which               0 stands for distrust and 1 indicates absolutely trust. We
performs depth-first search, to propagate and infer trust in             set two similarity threshold α and β when reconstructing
the trust networks. Empirical results show that the coverage             trust matrix. First, the low similarity in trust relationship
is significantly enlarged but the accuracy remains compara-              will reduce the quality of rating prediction, thus, for the
ble when propagating trust.                                              two trust users a and b, namely ta,b = 1, we will remove
   Ray et al. presented another trusted method[6]. The trust             their trust statement if Sa,b < α and reserve their trust
links between two users will be removed if their similarity is           relationship if Sa,b > α. It is defined as (2):
lower than a threshold. But empirical results show that good
performance is achieved at the cost of poor coverage, and it
                                                                                               (
                                                                                               0,       ta,b = 1 and Sa,b < α
fails to function in cold conditions where user similarity may                        ta,b =                                        (2)
not be computable.                                                                             1,       ta,b = 1 and Sa,b > α
   Recently, Deng et al. proposed a social network based ser-              The high similarity in trust matrix will improve recom-
vice recommendation method with trust enhancement known                  mendation results, accordingly, for the two users a and b
as RelevantTrustWalker[7]. First, a matrix factorization                 which are not in the trust matrix, namely ta,b = 0, if Sa,b >
method is utilized to assess the degree of trust between users           β we will add the trust relationship between a and b. How-
in social network. Next, an extended random walk algorith-               ever, if Sa,b < β, the trust relationship cannot be added for
m is proposed to obtain recommendation results.                          a and b. It is defined as (3):
   Guo et al. presented a merged method called Mergex[8],
which merged the ratings of trusted neighbors in order to                                    (
form a new and more complete rating pro?le for the active                                     1,        ta,b = 0 and Sa,b > β
                                                                                      ta,b =                                        (3)
users based on which recommendations can be generated by                                      0,        ta,b = 0 and Sa,b < β
integrating similarity and trust into CF.                                  Through the above steps, the trust matrix is reconstruct-
   Our work focuses on generating predictions by combining               ed, which is defined as U pT rust.
weighted trust propagation with a user-based technique. In
order to achieve a better result, we first remove the trust s-           3.2    Weighted trust propagation algorithm(WTPA)
tatements between two users if their similarity is lower than               The cold users are generally defined as the users who have
a threshold considering low similarity in trust will affect the          rated less than five items[10]. Since cold users are usually
prediction accuracy. Since high similarity in trust state-               less active in the systems, they may not have a large number
ments can improve the recommendation results, the trust                  of trusted neighbors. Fortunately, trust can be propagated
links between two users will be added into the original trust            along with the web-of-trust. That is, if user A trusts B and
matrix if their similarity overtops a certain threshold. Fi-             B trusts C, it can be inferred that A trusts C in some degree.
nally, weighted trust propagation will be considered aiming              Therefore, it is necessary to propagate trust in order to find
to distinguish trusted neighbors in a shorter distance with              more trusted neighbors for cold user problem. MoleTrust[11]
those in a longer distance and find more trusted neighbors,              is the method using the above trust propagation to infer the
which is especially useful for cold users.                               trust value of indirectly connected users. Note that the trust
                                                                         value in the reconstructed trust matrix U pT rust is binary,
3.     OUR METHOD                                                        i.e., 0 or 1. As a result, the inferred trust value by the Mo-
  In this section, we will present the specific method. We               leTrust will be also binary, and thus we cannot distinguish
will introduce how to incorporate similarity and trust to re-            trusted neighbors in a shorter distance with those in a longer
construct trust matrix. Then the weighted trust propagation              distance. Hence, we adopt a weighting factor to devalue the
will be explained.                                                       inferred trust in a long distance:

3.1      Reconstruct trust matrix algorithm(RTMA)                                                   0   1
                                                                                                   ta,b =  × ta,b                    (4)
  The majority of earlier approaches for prediction in trust-                                           d
based systems make predictions utilizing all the trust state-               Where ta,b denotes the trust value in the trust matrix,
                                                                          0
ments present in the data. But as we all know, user A trust-             ta,b is the weighted trust value and d is the shortest distance
ing user B does not mean that the correlation between A and              between users a and b determined by a breath first search
B will be also high. So we present the method of combining               algorithm. Note that the greater d is, the more trusted
similarity and trust to reconstruct trust matrix. Pearson                neighbors will be inferred. However, the more cost will be
Correlation Coefficient(PCC) is a preferable method[9] and               taken and more noise is likely to be incorporated. In this
we adopt PCC to compute similarity in recommender sys-                   work, we restrict d ≤ 3 to prevent meaningless searching and
tems, it is defined as follow:                                           save computational cost for large-scale data sets. In fact, as
                       P                                                 we will show later, our method works well enough when d is
                         i (ra,i − ra )(rb,i − rb )                      small. The trust matrix is U pT rust d after using weighted
           Sa,b = pP                                               (1)
                                                                         trust propagation.
                                         pP
                      i (ra,i − ra )
                                     2
                                              i (ra,i − ra )
                                                               2


     Where i represents the set of commonly rated items by               3.3    The description of our method
                                                                     4.3    Results and analysis
  According to the above description, the algorithm of RT-              In this section, we will verify our method of reconstructing
MA and WTPA are expressed as follows:                                trust matrix.
  Step1: According to user-item rating matrix and PCC                   Fig. 1(a) and (b) are the performances of these approach-
algorithm, compute similarities between every two users.             es on FilmTrust in terms of MAE and RMSE respectively.
  Step2: Predefine threshold α and β.                                The threshold α and β are set as 0.2 and 0.3 respectively.
  Step3: For two users u and v in user-item rating matrix,           The results show that our method UpTrust d is the best
reset tu,v according to definition (2) and (3).                      method. PCC is better than Trust d and it illustrates that
  Step4: Get the U pT rust.                                          the low similarity in trust statement can deteriorate the pre-
                                                        0
  Step5: For two users u and v in UpTrust d, reset tu,v              diction results. From the figure, we can observe that with
according to definition (4).                                         the lengthening of trust propagation, Trust 3 is better than
  Step6: Get the U pT rust d.                                        Trust 2 and Trust 2 is better than Trust 1, we may conclude
                                                                     that trust propagation is helpful to improve recommenda-
4.    EXPERIMENTS                                                    tion performance. However, in the method UpTrust d, Up-
   To verify our method, we conduct experiments on two               Trust 1 is the best method and it better than UpTrust 2 and
real-world data sets using the 5-fold cross validation method.       UpTrust 3. This is because although more trusted neighbors
The data set is split into five disjoint sets; for each iteration,   can be identified via trust propagation, it does not guarantee
four folds are used as training set and one as testing set. We       that the rating profile will cover a lot more items and hence
apply the K-Nearest Neighbor(KNN) approach to select a               increase accuracy greatly. Rather, it is possibly that adding
group of similar users whose ranking is in the top K accord-         few trusted neighbors may result in some noisy ratings, and
ing to similarity; we vary K from 5 to 50 with step 5. The           hence harm the predictive performance.
ratings of selected similar users are aggregated to predict
itemsaŕ
       , ratings by a mean-centering approach[12].

4.1    Data sets
   Two real-world data sets are used in our experiments,
namely FilmTrust and Epinions. They are available data
sets that contain both the trust matrix and user-item rating
matrix.
   FilmTrust is a trust-based social site in which users can            (a) The results of MAE        (b) The results of RMSE
rate and review movies. It includes 1986 users, 2071 movies
and 35497 ratings. The ratings take values from 0.5 to                     Figure 1: The performances on FilmTrust
4.0with step 0.5. In addition, 1853 trust ratings that are
issued by 609 users are gathered. The sparsity is 98.86%.               Fig. 2(a) and (b) are the results of those methods on
   Epinions is a website in which users can express their opin-      Epinions in terms of MAE and RMSE respectively and the
ions about items (such as movies, books, and software) by            threshold α and β are set as 0.1 and 0.3. Likewise, Up-
assigning numerical ratings and writing text reviews. The            Trust d is the best approach of all and PCC is better than
data set consists of 49K users who issued 664K ratings over          Trust d. However, with the lengthening of trust propaga-
139K different items and 478K trust statements. The rat-             tion, Trust 1 is the best method in Trust d while the Up-
ings are integers rated from 1 to 5 and the sparsity is 99.95%.      Trust 3 is the greatest method in UpTrust d, which is just
The trust values of both data sets are binary (either 1 or 0).       opposite with the results on FilmTrust. This also explains
                                                                     that trust propagation can not guarantee better results will
4.2    Evaluation metrics                                            be received, although more trusted neighbors can be found,
  The evaluation metrics are mean absolute error (MAE),              it is likely to add noisy information and decrease the accu-
root mean square error (RMSE) and rating coverage (RC)               racy of recommendation..
respectively. They are defined as follows:

                            X
                M AE =              |rui − rbui |/|T |        (5)
                          (u,i)∈T

                          s X
              RM SE =                 |rui − rbui |/|T |      (6)
                            (u,i)∈T
                                                                        (a) The results of MAE        (b) The results of RMSE
   Where T represents the set of prediction results and |T |
is the number of the set, and rbui is the prediction rating of
user u to item i.                                                          Figure 2: The performances on Epinions

                                                                       In addition, to verify whether our method can solve cold s-
                                  M                                  tart problem, we conduct experiments on FilmTrust in terms
                           RC =                  (7)
                                   N                                 of MAE, RMSE and RC on cold users. Similarly, we com-
  Where M and N are the number of predictable and all                pare UpTrust 1 with PCC and the performances are shown
the testing ratings, respectively.                                   in Fig. 3. We can get that UpTrust 1 is much better than
PCC in terms of cold users which declares that our method             Proceedings of the 43rd Hawaii International
can solve cold start problem to some extent. Table 1 is the           Conference on System Sciences, 2010, 1-9.
performances of the two methods in terms of RC, which can         [7] S. Deng, L. Huang, and G. Xu, Social network-based
further verify that our method can solve cold start problem           service recommendation with trust enhancement[J],
effectively.                                                          Expert Systems with Applications, 2014(41),
                                                                      8075-8084.
                                                                  [8] G. Guo, J. Zhang, and D. Thalmann, Merging trust in
                                                                      collaborative fltering to alleviate data sparsity and
                                                                      cold start[J], Knowledge-based Systems, 2014(57),
                                                                      57-68.
                                                                  [9] J. S. Breese, D. Heckerman, and C. Kadie et al,
                                                                      Empirical analysis of predictive algorithms for
                                                                      collaborative filtering[C], in Proceedings of the 14th
     (a) The results of MAE      (b) The results of RMSE              Conference on University in Artificial
                                                                      Intelligence(UAI98), 1998, 43-52.
                                                                 [10] P. Massa and P. Avesani, Trust-aware recommender
Figure 3: The performances on FilmTrust in terms
                                                                      systems[C], in: Proceedings of the 2007 ACM
of Cold Users
                                                                      Conference on Recommender Systems, 2007, 17-24.
                                                                 [11] I. Guy, I. Ronen, and E. Wilcox, Do you know
                                                                      recommending people to invite into your social
Table 1: The RC in the view of Cold Users on                          network[C], in: Proceedings of the 14th International
FilmTrust                                                             Conference on Intelligent User Interfaces, 2009, 77-86.
               Method     RC                                     [12] C. Desrosiers and G. Karypis, A comprehensive survey
                PCC     43.57%                                        of neighborhood-based recommendation methods[J],
              UpTrust 1 58.80%                                        Recommender Systems Handbook, 2011, 107-144.


5.    CONCLUSIONS
   This paper presents a reconstructing trust matrix method
to improve the prediction accuracy and solve cold user prob-
lem of collaborative filtering recommender systems. Consid-
ering the similarity in trust matrix will affect recommenda-
tion results, we reconstruct traditional trust matrix. Be-
sides, to recommend better for cold users and distinguish
trusted neighbors in a shorter distance with those in a longer
distance, weighted trust propagation is considered. The ex-
perimental results on two real data sets demonstrate the
effectiveness of our methods in improving the prediction ac-
curacy and solving cold user problem of recommender sys-
tems.

6.    REFERENCES
 [1] X. Zhou, Y. Xu, and Y. Li et al, The state-of-the-art
     in personalized recommender systems for social
     networking[J], Artif. Intell. Rev., 2012, 1-14.
 [2] D. Jannach, M. Zanker, and M. Ge et al,
     Recommender systems in computer science and
     information systemsĺCa landscape of research[J],
     E-Commerce Web Technol, 2012,76-87.
 [3] J. Audun and D.K. Walter Quattrochicchi, Taste and
     trust[J], Trust Management V, 2011, 312-322.
 [4] M. Jamali, and M. Ester, Trustwalker: a random walk
     model for combining trustbased and item-based
     recommendation[C], in: Proceedings of the 15th ACM
     SIGKDD International Conference on Knowledge
     Discovery and Data Mining, 2009, 397-406.
 [5] P. Massa and P. Avesani, Trust-aware recommender
     systems[C], in: Proceedings of the 2007 ACM
     Conference on Recommender Systems, 2007, 17-24.
 [6] S. Ray and A. Mahanti, Improving prediction
     accuracy in trust-aware recommender systems[C], in: