<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
<article-title>Improved Listwise Collaborative Filtering with High-Rating-Based Similarity and Temporary Ratings</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Yoshiki Tsuchiya</string-name>
          <email>tsuchiya@cmu.iit.tsukuba.ac.jp</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hajime Nobuhara</string-name>
          <email>nobuhara@iit.tsukuba.ac.jp</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Tsukuba</institution>
          ,
          <addr-line>Tsukuba</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
      </contrib-group>
      <abstract>
<p>In this paper, we make two proposals to speed up listwise collaborative filtering and improve its accuracy. The first is to speed up computation by using only a subset of the rating information (the high ratings). The second is to improve accuracy using temporary ratings that estimate the rating scores of items that neighboring users have not rated. Experiments using the MovieLens datasets (1M and 10M) demonstrate that these proposals reduce the computation time to about 1/50 and improve the accuracy by 2.22% compared with ListCF, a well-known listwise collaborative filtering algorithm.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <sec id="sec-1-1">
        <p>In recent years, due to the development of the Web, recommender systems have become increasingly important in various situations; many researchers are now focusing on recommendation technologies and systems [<xref ref-type="bibr" rid="ref2 ref8">2,6,7,9,10</xref>]. Collaborative filtering (CF) is a widely used recommendation algorithm that is based on the similarity between users or items, as calculated from a user and rating matrix. Various CF algorithms have been proposed, and they can be divided into two types: rating-oriented [<xref ref-type="bibr" rid="ref8">6,9</xref>] and ranking-oriented [<xref ref-type="bibr" rid="ref2">2,7,10</xref>], as shown in Fig. 1. Rating-oriented CF algorithms, such as item-based CF [9], predict the ratings of items that have not been evaluated by users and make recommendations by calculating the similarity between users or items. In contrast, ranking-oriented</p>
      </sec>
      <sec id="sec-1-2">
        <p>CF uses user similarity to predict the item ranking and recommends items based on this. We will focus on this method due to its performance. Ranking-oriented CF can be further divided into two types: pairwise ranking-oriented [7,10] and listwise ranking-oriented [<xref ref-type="bibr" rid="ref2">2</xref>]. Pairwise ranking-oriented CF predicts the order of pairs of items but requires a large computation time. In contrast, listwise ranking-oriented CF predicts the order of the complete list of items. Although this produces better accuracy than a typical pairwise CF algorithm, calculating the required similarities is time-consuming and there is room to improve the ranking accuracy. In this paper, we propose an efficient listwise ranking-oriented CF algorithm that is both faster and has higher ranking accuracy.</p>
        <p>The proposed method implements two improvements. First, when calculating the similarity between users, it only considers the highest-rated items, greatly speeding up the calculation. Second, it introduces temporary ratings when making ranking predictions. Experimental comparisons using MovieLens 1M (6,040 users, 3,952 movies, 1,000,209 ratings) and 10M (71,567 users, 10,681 movies, 10,000,054 ratings) confirm that the proposed method reduces the similarity computation time to about 1/50 of that of a conventional CF algorithm and improves the ranking accuracy by 2.22%.</p>
        <p>© 2018. Copyright for the individual papers remains with the authors. Copying permitted for private and academic purposes. WII'18, March 11, Tokyo, Japan.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>Overview of ListCF</title>
      <p>
        In this section, we give an overview of ListCF [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], a
well-known ranking-oriented listwise CF algorithm. ListCF
proceeds in two phases, first calculating the similarities
between users and then predicting ranks for the target user’s
unrated items. The first phase is based on a probability
distribution of item permutations, calculated by combining
the Plackett–Luce [8] and top-k probability [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] models and
finding each user’s neighboring users.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Similarity calculation</title>
      <sec id="sec-3-1">
        <p>In ListCF, the similarity of a pair of users u and v is calculated based on a probability distribution of item permutations for each user, calculated using the Plackett–Luce model [8], which is a representative permutation probability model. The flow of similarity calculation is shown on the left of Fig. 2 and in Fig. 3. Let I = {i_1, i_2, ⋯, i_n} be a set of items, let π = (π_1, π_2, ⋯, π_n) (∈ Ω_I) be an ordered list where π_t ∈ I and π_t ≠ π_l if t ≠ l, and let Ω_I be the set of all possible permutations of I. Given the item ratings (s_{i_1}, s_{i_2}, ⋯, s_{i_n}) (e.g., on a real interval in the case of MovieLens [<xref ref-type="bibr" rid="ref1 ref7">1,5</xref>]), where s_{i_t} is the rating score of i_t, the probability of π, P_s(π), is defined using an increasing and strictly positive function φ(∙) ≥ 0 as</p>
        <p>P_s(π) = ∏_{t=1}^{n} [ φ(s_{π_t}) / ∑_{l=t}^{n} φ(s_{π_l}) ], (1)</p>
        <p>where the function is defined as φ(s) = e^s. However, this requires us to consider n! different permutations of the n items, which would take a long time to compute. To speed up the computation, the top-k probability model [<xref ref-type="bibr" rid="ref1">1</xref>] is introduced as follows:</p>
        <p>G(i_1, i_2, ⋯, i_k) = { π | π ∈ Ω_I, π_t = i_t, t = 1, ⋯, k }, (2)</p>
        <p>a set containing n!/(n − k)! permutations, and the probability of the top-k permutation is calculated as</p>
        <p>P(G(i_1, i_2, ⋯, i_k)) = ∏_{t=1}^{k} [ φ(s_{i_t}) / ∑_{l=t}^{n} φ(s_{i_l}) ]. (3)</p>
        <p>Let G_u (⊂ Ω_{I_u}) be the set of top-k permutations of I_u, and let the probabilities of these permutations form the probability distribution. Then, define I_{u,v} (⊂ I) as the set of items rated by both users u and v, and P_u and P_v (∈ [0,1]) as the probability distributions over the top-k permutations G_{u,v} (⊂ Ω_{I_{u,v}}) calculated by Eq. (3) using the users' rating scores. The similarity score is now obtained from the Kullback–Leibler (KL) divergence [<xref ref-type="bibr" rid="ref7">5</xref>] calculated from P_u and P_v. Given a pair of users u and v, the KL divergence of P_u and P_v is defined as</p>
        <p>D_KL(P_u ∥ P_v) = ∑_{g ∈ G_{u,v}} P_u(g) log_2 ( P_u(g) / P_v(g) ). (4)</p>
        <p>If the set I_{u,v} includes only a few items, the similarity will be high, so this is relaxed by multiplying the similarity function by min{ |I_{u,v}| / γ, 1 } (Eq. (5)), where γ is a threshold. Each user's neighboring users can then be found from the similarities calculated using Eqs. (3)–(5). Fig. 2 shows the similarity calculation flow of ListCF (left) and the proposed method (right).</p>
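        <p>As a concrete illustration, the top-k probability of Eq. (3) and the KL divergence of Eq. (4) can be sketched in Python. This is a minimal sketch: the function names and the dict-based data layout are ours, not from ListCF itself.</p>

```python
import math

def topk_probability(scores, topk, items):
    """Plackett-Luce top-k probability (Eq. (3)) with phi(s) = e^s.

    scores: dict mapping item -> rating score
    topk:   the ordered top-k items of the permutation
    items:  all items the user has rated
    """
    remaining = list(items)
    prob = 1.0
    for it in topk:
        # probability that `it` is chosen next among the remaining items
        prob *= math.exp(scores[it]) / sum(math.exp(scores[j]) for j in remaining)
        remaining.remove(it)
    return prob

def kl_divergence(p_u, p_v):
    """KL divergence of Eq. (4), in bits, between two probability
    distributions over the same set of top-k permutations
    (dicts mapping permutation -> probability)."""
    return sum(p * math.log2(p / p_v[g]) for g, p in p_u.items() if p > 0)
```

        <p>With equal rating scores, every top-1 choice among three items receives probability 1/3, and the KL divergence of a distribution with itself is zero, as expected.</p>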
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Ranking prediction</title>
      <sec id="sec-4-1">
        <p>The flow of ranking prediction is shown on the left of Fig. 4. Let U be the set of users, N_u (⊂ U) be the set of user u's neighbors, and I_u' (⊂ I ∖ I_u) be the set of items whose ranks are to be predicted. Let P̂_u (∈ [0,1]) be the probability distribution over the top-k permutations G_u' (⊂ Ω_{I_u'}) of I_u', written as</p>
        <p>P̂_u(g) = exp(θ_{u,g}) / ∑_{g' ∈ G_u'} exp(θ_{u,g'}), (6)</p>
        <p>where { θ_{u,g} | ∀g ∈ G_u' } are unknown variables assigned to the top-k permutations. In ListCF, the cross entropy is used as a loss function for prediction. Consider a target user u, for whom we want to rank a set of items I_u', and a neighboring user v ∈ N_u. Let the set of items rated by v be I_v, with I_{u,v}' = I_u' ∩ I_v, and let G_{u,v}' (⊂ Ω_{I_{u,v}'}) be the set of top-k permutations of I_{u,v}'. The cross entropy is calculated using probability distributions P̂_u' and P_v' over G_{u,v}' as</p>
        <p>L(P̂_u', P_v') = − ∑_{g ∈ G_{u,v}'} P_v'(g) log_2 P̂_u'(g). (7)</p>
        <p>We propose to calculate similarities based only on high ratings. The flow of similarity calculation is shown on the right of Fig. 2 and in Fig. 5. Let H_u (⊂ I_u) be the set of items that were rated highly by the target user u, and H_{u,v} (⊂ I_{u,v}) be the set of items that both u and v rated highly. Given the user and rating matrix, we define the similarity between u and v as</p>
        <p>s_high(u, v) = |H_{u,v}| / |H_u|. (12)</p>
        <p>Because this similarity considers only the high ratings, its computational complexity is never worse than that of conventional ListCF, and often better. Changing how the high ratings are determined changes the ranking accuracy, as explained in the experimental section.</p>
      </sec>
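      <p>The prediction machinery of Eqs. (6) and (7) amounts to a softmax over the permutation parameters and a cross-entropy loss. The following Python sketch is ours (the function names and dict layout are assumptions, not part of ListCF):</p>

```python
import math

def predicted_distribution(theta):
    """P-hat of Eq. (6): a softmax over the unknown variables theta
    (dict mapping permutation -> parameter)."""
    z = sum(math.exp(t) for t in theta.values())
    return {g: math.exp(t) / z for g, t in theta.items()}

def cross_entropy(p_hat, p_v):
    """Loss of Eq. (7): -sum_g P'_v(g) * log2(P-hat'_u(g))."""
    return -sum(p_v[g] * math.log2(p_hat[g]) for g in p_v if p_v[g] > 0)
```

      <p>For example, with equal parameters over two permutations the predicted distribution is uniform, and the cross entropy against a uniform target is exactly 1 bit.</p>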
    </sec>
    <sec id="sec-5">
      <title>Improving ranking accuracy using temporary ratings</title>
      <sec id="sec-5-1">
        <p>We also propose to improve ListCF's ranking accuracy by introducing temporary ratings. In ListCF, the cross entropy in Eq. (7) is calculated from the probability distribution over G_{u,v}', the set of top-k permutations of I_{u,v}'. However, optimizing the objective function may fail if I_{u,v}' has too few elements, e.g., if it includes only one element and that item received a low rating from the neighboring user v. In this case, despite the item's low rating, optimization generates a high item rank, which is unhelpful for user u. Fig. 6 shows how the probability distribution for target user u's unrated items before optimization (upper) and neighboring user v's distribution (middle) give a graph like the one on the lower. Item 1 was given a low rating by v but, since no other items were rated, it is guaranteed to top the rankings. As a result, u's probability distribution is updated to place item 1 at a higher position. The proposed method therefore makes ranking predictions after giving temporary estimated ratings to items that were not rated by the neighboring user v. The flow of ranking prediction is shown on the right of Fig. 4 and in Fig. 7. Let r_{v,i} be the rating given by user v to item i, and let unrated items have a score of zero. The unknown variables are updated by gradient descent,</p>
        <p>θ_{u,g} ← θ_{u,g} − η ∂L/∂θ_{u,g}, (11)</p>
        <p>where η is the learning rate.</p>
      </sec>
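      <p>The temporary-rating idea can be sketched in Python, assuming (as Eq. (13) describes) that an unrated item's temporary score is the mean of the ratings given to it by the neighbor's own neighbors; the function and variable names here are ours:</p>

```python
def temporary_rating(item, neighbor, ratings, neighbors_of):
    """Temporary rating in the spirit of Eq. (13): if `neighbor` has not
    rated `item` (score zero), estimate it as the mean rating given to
    the item by the neighbor's own neighbors who did rate it.

    ratings:      dict user -> (dict item -> rating)
    neighbors_of: dict user -> list of neighboring users
    """
    if ratings[neighbor].get(item, 0) != 0:
        return ratings[neighbor][item]  # already rated: keep the real score
    raters = [v2 for v2 in neighbors_of[neighbor] if ratings[v2].get(item, 0) != 0]
    if not raters:
        return 0.0  # none of the neighbors rated the item either
    return sum(ratings[v2][item] for v2 in raters) / len(raters)
```

      <p>For instance, if two of v's neighbors rated the item 4 and 2, the temporary rating is their mean, 3.</p>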
    </sec>
    <sec id="sec-6">
      <title>PROPOSED METHOD</title>
    </sec>
    <sec id="sec-7">
      <title>Rapid similarity calculation using only high ratings</title>
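      <p>The high-rating similarity can be sketched in Python. This is a minimal sketch: we assume "high" means a rating at or above a criterion threshold (matching the experiments, where the criterion is varied from 1 to 5), and the names are ours:</p>

```python
def high_rating_similarity(ratings_u, ratings_v, threshold=4):
    """Proposed similarity using only high ratings: |H_uv| / |H_u|,
    where H_u is the set of u's items rated at or above `threshold`.
    ratings_u, ratings_v: dicts mapping item -> rating."""
    h_u = {i for i, r in ratings_u.items() if r >= threshold}
    h_v = {i for i, r in ratings_v.items() if r >= threshold}
    if not h_u:
        return 0.0  # u has no highly rated items
    return len(h_u & h_v) / len(h_u)
```

      <p>Note the asymmetry: the denominator counts only the target user's highly rated items, so only those items need to be examined per pair.</p>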
      <sec id="sec-7-1">
        <p>Conventional ListCF is slow because it has to calculate similarities using all rating scores. To speed up computation, we calculate similarities based only on high ratings. Failure cases such as the one in Fig. 6 occur when users have only one low-rated item in common; the horizontal axis of Fig. 6 represents the item number.</p>
        <p>For a given neighboring user v ∈ N_u, the temporary rating of an unrated item i_t is calculated as</p>
        <p>r_{v,i_t} = (1 / |N'_{v,i_t}|) ∑_{v' ∈ N'_{v,i_t}} r_{v',i_t} (if r_{v,i_t} is zero), (t = 1, ⋯, n), (13)</p>
        <p>where N'_{v,i_t} is the set of v's neighbors v' ∈ N_v who have rated item i_t. If none of v's neighbors have rated i_t, the temporary rating will still be zero. In that case, we can obtain a nonzero temporary rating by calculating the temporary ratings r_{v',i_t} for each neighbor v' of v. By calculating the cross entropy of Eq. (7) using these temporary ratings (Eq. (13)) and replacing s(u, v) in Eq. (8) with s_high(u, v) (Eq. (12)), ranking predictions can then be made in the same way as for ListCF by using Eqs. (8)–(11).</p>
      </sec>
      <sec id="sec-7-2">
        <p>Calculating temporary ratings allows the neighbors' probability distributions in situations such as Fig. 6 to be more reasonable. As a result, since the parameters are less frequently updated in an undesirable way, this should improve the ranking accuracy.</p>
        <p>The ranking accuracy is evaluated using NDCG. For a user u, the NDCG of the top k predicted items is</p>
        <p>NDCG_u@k = Z_u ∑_{p=1}^{k} (2^{r_p} − 1) / log_2(1 + p), (14)</p>
        <p>where Z_u is a normalization term that ensures the NDCG value of the correct ranking is 1 and r_p is the rating of the pth-ranked item for user u. For a set of users U, the overall accuracy is the average (1/|U|) ∑_{u=1}^{|U|} NDCG_u@k.</p>
      </sec>
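      <p>Eq. (14) can be checked with a short Python sketch (function name ours; the ideal ranking is obtained by sorting the ratings in descending order):</p>

```python
import math

def ndcg_at_k(ranked_ratings, k):
    """NDCG@k of Eq. (14): Z_u * sum_p (2^{r_p} - 1) / log2(1 + p),
    where Z_u normalizes so the ideal (sorted) ranking scores 1.
    ranked_ratings: true ratings listed in predicted rank order."""
    def dcg(rs):
        return sum((2 ** r - 1) / math.log2(1 + p)
                   for p, r in enumerate(rs[:k], start=1))
    ideal = dcg(sorted(ranked_ratings, reverse=True))
    return dcg(ranked_ratings) / ideal if ideal > 0 else 0.0
```

      <p>A perfect ranking scores exactly 1; placing a rating-3 item ahead of a rating-5 item at k = 1 scores (2^3 − 1)/(2^5 − 1) = 7/31.</p>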
    </sec>
    <sec id="sec-8">
      <title>Comparison of similarity computation time</title>
      <sec id="sec-8-1">
        <p>In this experiment, we measured the similarity computation time. We compared the time taken to calculate the user similarities using both the proposed and ListCF methods, and then compared the ranking accuracy of ListCF predictions made using the calculated similarities. The criterion used to determine high ratings was varied from 1 to 5 in increments of 1, and similarities were obtained for each criterion. A criterion of 1 means all ratings are used, while a criterion of 5 means that only items rated 5 are used.</p>
      </sec>
      <sec id="sec-8-3">
        <p>The datasets used were MovieLens 1M (6,040 users, 3,952 items, 1,000,209 ratings) and MovieLens 10M (71,567 users, 10,681 items, 10,000,054 ratings).</p>
      </sec>
      <sec id="sec-8-7">
        <p>Figs. 8(a) and 9(a) show the computation time results for MovieLens 1M and 10M, respectively, while Figs. 8(b)-(d) and 9(b)-(d) show the corresponding ranking accuracy results. The horizontal axes show the high-rating criterion in all graphs, while the vertical axes show calculation time in Figs. 8(a) and 9(a), NDCG@1 in Figs. 8(b) and 9(b), NDCG@3 in Figs. 8(c) and 9(c), and NDCG@5 in Figs. 8(d) and 9(d), respectively. As can be seen from Figs. 8(a) and 9(a), similarity computation is considerably more rapid for the proposed method than for conventional ListCF. In addition, Figs. 8(b)-(d) and 9(b)-(d) show that when scores of 4 and 5 are considered to be high ratings, the rankings are as accurate as those of the conventional method.</p>
      </sec>
    </sec>
    <sec id="sec-9">
      <title>Comparison of ranking accuracy</title>
      <p>Next, we conducted experiments to examine the effect of using temporary ratings on ranking prediction accuracy. We compared the accuracy of ranking predictions made using both the proposed and ListCF methods using the similarities calculated in the previous subsection. We used an initial θ_{u,g} value of 10 for both datasets, with learning rates of 0.025 and 0.01 for MovieLens 1M and 10M, respectively.</p>
      <sec id="sec-9-1">
        <title>The gradient descent method was repeated 50 times. The computation time results are shown in Fig. 10, while ranking accuracy results are shown in Figs. 11 and 12.</title>
        <p>(Figs. 10-12 compare "ListCF prediction" with "proposed prediction"; the horizontal axes show conventional ListCF and the high-rating criteria 1-5.)</p>
        <p>
Fig. 10 shows that the computation times for the proposed ranking prediction method were shorter in all cases. In addition, Figs. 11 and 12 show that its NDCG@1, NDCG@3, and NDCG@5 values were better in all cases.</p>
      </sec>
      <sec id="sec-9-3">
        <p>Since MovieLens 1M has less data than MovieLens 10M, more temporary ratings are generated, and the difference between the proposed method and conventional ListCF becomes larger than for MovieLens 10M.</p>
      </sec>
    </sec>
    <sec id="sec-10">
      <title>CONCLUSION</title>
      <sec id="sec-10-1">
        <p>In this paper, we have made two proposals. The first is to calculate user similarity scores using only high ratings, instead of all ratings, to speed up computation. The second is to introduce temporary estimated ratings for items that have not been rated by neighboring users to improve ranking accuracy.</p>
        <p>In experiments using the MovieLens 1M and 10M datasets,
we have compared the computation time and accuracy of
both proposals with those of conventional ListCF. The
results demonstrated that the first proposal provided a
considerable reduction in computation time, compared with
conventional ListCF, while maintaining equal or greater
ranking accuracy. In addition, the second proposal
shortened the prediction time in most cases while always
improving the ranking accuracy.</p>
      </sec>
      <sec id="sec-10-2">
        <p>In the future, we plan to improve the way neighboring users are selected and to search for a better objective function.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>1. Zhe Cao, Tao Qin, Tie-Yan Liu, Ming-Feng Tsai, and Hang Li. 2007. Learning to rank: from pairwise approach to listwise approach. In Proceedings of the 24th International Conference on Machine Learning (ICML '07), 129-136. http://dx.doi.org/10.1145/1273496.1273513</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>2. Shanshan Huang, Shuaiqiang Wang, Tie-Yan Liu, Jun Ma, Zhumin Chen, and Jari Veijalainen. 2015. Listwise Collaborative Filtering. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '15), 343-352. http://dx.doi.org/10.1145/2766462.2767693</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>3. Kalervo Järvelin and Jaana Kekäläinen. 2000. IR evaluation methods for retrieving highly relevant documents. In Proceedings of the 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '00), 41-48. http://dx.doi.org/10.1145/345508.345545</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>4. Kalervo Järvelin and Jaana Kekäläinen. 2002. ACM Trans. Inf. Syst. 20, 4: 422-446.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>5. S. Kullback. 1997. Information Theory and Statistics. Courier Corporation.</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>6. G. Linden, B. Smith, and J. York. 2003. Amazon.com recommendations: Item-to-item collaborative filtering. IEEE Internet Computing 7, 1: 76-80.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>7. Nathan N. Liu and Qiang Yang. 2008. EigenRank: a ranking-oriented approach to collaborative filtering. In Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '08), 83-90. http://dx.doi.org/10.1145/1390334.1390351</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>8. J. I. Marden. 1996. Analyzing and Modeling Rank Data. CRC Press.</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>9. Badrul Sarwar, George Karypis, Joseph Konstan, and John Riedl. 2001. Item-based collaborative filtering recommendation algorithms. In Proceedings of the 10th International Conference on World Wide Web (WWW '01), 285-295. http://dx.doi.org/10.1145/371920.372071</mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>10. Shuaiqiang Wang, Jiankai Sun, Byron J. Gao, and Jun Ma. 2014. VSRank: A Novel Framework for Ranking-Based Collaborative Filtering. ACM Trans. Intell. Syst. Technol. 5, 3: Article 24.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>