<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
<article-title>Improved Listwise Collaborative Filtering with High-Rating-Based Similarity and Temporary Ratings</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Yoshiki Tsuchiya</string-name>
          <email>tsuchiya@cmu.iit.tsukuba.ac.jp</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hajime Nobuhara</string-name>
          <email>nobuhara@iit.tsukuba.ac.jp</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Tsukuba</institution>
          ,
          <addr-line>Tsukuba</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
      </contrib-group>
      <abstract>
<p>In this paper, we make two proposals to speed up listwise collaborative filtering and improve its accuracy. The first is to speed up computation by using only a subset of the rating information (the high ratings). The second is to improve accuracy using temporary ratings that estimate the rating scores of items that neighboring users have not rated. Experiments using the MovieLens datasets (1M and 10M) demonstrate that these proposals reduce the computation time to about 1/50 and improve the accuracy by 2.22% compared with ListCF, a well-known listwise collaborative filtering algorithm.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <sec id="sec-1-1">
        <p>In recent years, due to the development of the Web, recommender systems have become increasingly important in various situations; many researchers are now focusing on recommendation technologies and systems [<xref ref-type="bibr" rid="ref2 ref8">2,6,7,9,10</xref>]. Collaborative filtering (CF) is a widely used recommendation algorithm that is based on the similarity between users or items, as calculated from a user and rating matrix. Various CF algorithms have been proposed, and they can be divided into two types: rating-oriented [<xref ref-type="bibr" rid="ref8">6,9</xref>] and ranking-oriented [<xref ref-type="bibr" rid="ref2">2,7,10</xref>], as shown in Fig. 1. Rating-oriented CF algorithms, such as item-based CF [9], predict the ratings of items that have not been evaluated by users and make recommendations by calculating the similarity between users or items. In contrast, ranking-oriented</p>
      </sec>
      <sec id="sec-1-2">
        <p>CF uses user similarity to predict the item ranking and recommends items based on this. We will focus on this method due to its performance. Ranking-oriented CF can be further divided into two types: pairwise ranking-oriented [7,10] and listwise ranking-oriented [<xref ref-type="bibr" rid="ref2">2</xref>]. Pairwise ranking-oriented CF predicts the order of pairs of items but requires a large computation time. In contrast, listwise ranking-oriented CF predicts the order of the complete list of items. Although this produces better accuracy than a typical pairwise CF algorithm, calculating the required similarities is time-consuming and there is room to improve the ranking accuracy. In this paper, we propose an efficient listwise ranking-oriented CF algorithm that is both faster and has higher ranking accuracy.</p>
        <p>The proposed method implements two improvements. First, when calculating the similarity between users, it only considers the highest-rated items, greatly speeding up the calculation. Second, it introduces temporary ratings when making ranking predictions. Experimental comparisons using MovieLens 1M (6,040 users, 3,952 movies, 1,000,209 ratings) and 10M (71,567 users, 10,681 movies, 10,000,054 ratings) confirm that the proposed method reduces the similarity computation time to about 1/50 of that of a conventional CF algorithm and improves the ranking accuracy by 2.22%.</p>
        <p>© 2018. Copyright for the individual papers remains with the authors. Copying permitted for private and academic purposes. WII'18, March 11, Tokyo, Japan.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>Overview of ListCF</title>
      <p>
        In this section, we give an overview of ListCF [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], a
well-known ranking-oriented listwise CF algorithm. ListCF
proceeds in two phases, first calculating the similarities
between users and then predicting ranks for the target user’s
unrated items. The first phase is based on a probability
distribution of item permutations, calculated by combining
the Plackett–Luce [8] and top-k probability [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] models and
finding each user’s neighboring users.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Similarity calculation</title>
      <sec id="sec-3-1">
        <p>In ListCF, the similarity of a pair of users u and v is calculated based on a probability distribution of item permutations for each user, calculated using the Plackett–Luce model [8], which is a representative permutation probability model. The flow of similarity calculation is shown on the left of Fig. 2 and in Fig. 3. Let I = {i_1, i_2, ⋯, i_n} be a set of items, let π = (π_1, π_2, ⋯, π_n) (∈ Ω_I) be an ordered list where π_t ∈ I and π_t ≠ π_l if t ≠ l, and let Ω_I be the set of all possible permutations of I. Given the item ratings (s_{i_1}, s_{i_2}, ⋯, s_{i_n}) (e.g., on a real interval in the case of MovieLens [<xref ref-type="bibr" rid="ref1 ref7">1,5</xref>]), where s_{i_t} is the rating score of i_t, the probability of π, P_s(π), is defined using an increasing and strictly positive function φ(∙) ≥ 0 as</p>
        <p>P_s(π) = ∏_{t=1}^{n} [ φ(s_{π_t}) / ∑_{l=t}^{n} φ(s_{π_l}) ], (1)</p>
        <p>where the function is defined as φ(s) = e^s. However, this requires us to consider n! different permutations of the n items, which would take a long time to compute. To speed up the computation, the top-k probability model [<xref ref-type="bibr" rid="ref1">1</xref>] is introduced as follows:</p>
        <p>G(i_1, i_2, ⋯, i_k) = { π | π ∈ Ω_I, π_t = i_t, t = 1, ⋯, k }, (2)</p>
        <p>a set containing n!/(n − k)! permutations, and the probability of the top-k permutation is calculated as</p>
        <p>P(G(i_1, i_2, ⋯, i_k)) = ∏_{t=1}^{k} [ φ(s_{i_t}) / ∑_{l=t}^{n} φ(s_{i_l}) ]. (3)</p>
        <p>Let G_u (⊂ Ω_{I_u}) be the set of top-k permutations of I_u, and let the probabilities of these permutations form the probability distribution. Then, define I_{u,v} (⊂ I) as the set of items rated by both users u and v, and P_u and P_v (∈ [0,1]) as the probability distributions over the top-k permutations G_{u,v} (⊂ Ω_{I_{u,v}}) calculated by Eq. (3) using the users' rating scores. The similarity score is now obtained from the Kullback–Leibler (KL) divergence [<xref ref-type="bibr" rid="ref7">5</xref>] calculated from P_u and P_v. Given a pair of users u and v, the KL divergence of P_u and P_v is defined as</p>
        <p>D_KL(P_u ∥ P_v) = ∑_{g ∈ G_{u,v}} P_u(g) log_2 ( P_u(g) / P_v(g) ). (4)</p>
        <p>If the set I_{u,v} includes only a few items, the similarity will be high, so this is relaxed by multiplying the similarity function by min{ |I_{u,v}| / γ, 1 } (Eq. (5)), where γ is a threshold. Each user's neighboring users can then be found from the similarities calculated using Eqs. (3)–(5). Fig. 2 shows the similarity calculation flow of ListCF (left) and the proposed method (right).</p>
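        <p>As a concrete illustration, the top-k probability of Eq. (3) and the KL divergence of Eq. (4) can be sketched in Python. This is a minimal sketch: the function names and the dict-based data layout are ours, not from ListCF itself.</p>

```python
import math

def topk_probability(scores, topk, items):
    """Plackett-Luce top-k probability (Eq. (3)) with phi(s) = e^s.

    scores: dict mapping item -> rating score
    topk:   the ordered top-k items of the permutation
    items:  all items the user has rated
    """
    remaining = list(items)
    prob = 1.0
    for it in topk:
        # probability that `it` is chosen next among the remaining items
        prob *= math.exp(scores[it]) / sum(math.exp(scores[j]) for j in remaining)
        remaining.remove(it)
    return prob

def kl_divergence(p_u, p_v):
    """KL divergence of Eq. (4), in bits, between two probability
    distributions over the same set of top-k permutations
    (dicts mapping permutation -> probability)."""
    return sum(p * math.log2(p / p_v[g]) for g, p in p_u.items() if p > 0)
```

        <p>With equal rating scores, every top-1 choice among three items receives probability 1/3, and the KL divergence of a distribution with itself is zero, as expected.</p>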
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Ranking prediction</title>
      <sec id="sec-4-1">
        <p>The flow of ranking prediction is shown on the left of Fig. 4. Let U be the set of users, N_u (⊂ U) be the set of user u's neighbors, and I_u' (⊂ I ∖ I_u) be the set of items whose ranks are to be predicted. Let P̂_u (∈ [0,1]) be the probability distribution over the top-k permutations G_u' (⊂ Ω_{I_u'}) of I_u', written as</p>
        <p>P̂_u(g) = exp(θ_{u,g}) / ∑_{g' ∈ G_u'} exp(θ_{u,g'}), (6)</p>
        <p>where { θ_{u,g} | ∀g ∈ G_u' } are unknown variables assigned to the top-k permutations. In ListCF, the cross entropy is used as a loss function for prediction. Consider a target user u, for whom we want to rank a set of items I_u', and a neighboring user v ∈ N_u. Let the set of items rated by v be I_v, with I_{u,v}' = I_u' ∩ I_v, and let G_{u,v}' (⊂ Ω_{I_{u,v}'}) be the set of top-k permutations of I_{u,v}'. The cross entropy is calculated using probability distributions P̂_u' and P_v' over G_{u,v}' as</p>
        <p>L(P̂_u', P_v') = − ∑_{g ∈ G_{u,v}'} P_v'(g) log_2 P̂_u'(g). (7)</p>
        <p>We propose to calculate similarities based only on high ratings. The flow of similarity calculation is shown on the right of Fig. 2 and in Fig. 5. Let H_u (⊂ I_u) be the set of items that were rated highly by the target user u, and H_{u,v} (⊂ I_{u,v}) be the set of items that both u and v rated highly. Given the user and rating matrix, we define the similarity between u and v as</p>
        <p>s_high(u, v) = |H_{u,v}| / |H_u|. (12)</p>
        <p>Because this similarity considers only the high ratings, its computational complexity is never worse than that of conventional ListCF, and often better. Changing how the high ratings are determined changes the ranking accuracy, as explained in the experimental section.</p>
      </sec>
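      <p>The prediction machinery of Eqs. (6) and (7) amounts to a softmax over the permutation parameters and a cross-entropy loss. The following Python sketch is ours (the function names and dict layout are assumptions, not part of ListCF):</p>

```python
import math

def predicted_distribution(theta):
    """P-hat of Eq. (6): a softmax over the unknown variables theta
    (dict mapping permutation -> parameter)."""
    z = sum(math.exp(t) for t in theta.values())
    return {g: math.exp(t) / z for g, t in theta.items()}

def cross_entropy(p_hat, p_v):
    """Loss of Eq. (7): -sum_g P'_v(g) * log2(P-hat'_u(g))."""
    return -sum(p_v[g] * math.log2(p_hat[g]) for g in p_v if p_v[g] > 0)
```

      <p>For example, with equal parameters over two permutations the predicted distribution is uniform, and the cross entropy against a uniform target is exactly 1 bit.</p>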
    </sec>
    <sec id="sec-5">
      <title>Improving ranking accuracy using temporary ratings</title>
      <sec id="sec-5-1">
        <p>We also propose to improve ListCF's ranking accuracy by introducing temporary ratings. In ListCF, the cross entropy in Eq. (7) is calculated from the probability distribution over G_{u,v}', the set of top-k permutations of I_{u,v}'. However, optimizing the objective function may fail if I_{u,v}' has too few elements, e.g., if it includes only one element and that item received a low rating from the neighboring user v. In this case, despite the item's low rating, optimization generates a high item rank, which is unhelpful for user u. Fig. 6 shows how the probability distribution for target user u's unrated items before optimization (upper) and neighboring user v's distribution (middle) give a graph like the one on the lower. Item 1 was given a low rating by v but, since no other items were rated, it is guaranteed to top the rankings. As a result, u's probability distribution is updated to place item 1 at a higher position. The proposed method therefore makes ranking predictions after giving temporary estimated ratings to items that were not rated by the neighboring user v. The flow of ranking prediction is shown on the right of Fig. 4 and in Fig. 7. Let r_{v,i} be the rating given by user v to item i, and let unrated items have a score of zero. The unknown variables are updated by gradient descent,</p>
        <p>θ_{u,g} ← θ_{u,g} − η ∂L/∂θ_{u,g}, (11)</p>
        <p>where η is the learning rate.</p>
      </sec>
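      <p>The temporary-rating idea can be sketched in Python, assuming (as Eq. (13) describes) that an unrated item's temporary score is the mean of the ratings given to it by the neighbor's own neighbors; the function and variable names here are ours:</p>

```python
def temporary_rating(item, neighbor, ratings, neighbors_of):
    """Temporary rating in the spirit of Eq. (13): if `neighbor` has not
    rated `item` (score zero), estimate it as the mean rating given to
    the item by the neighbor's own neighbors who did rate it.

    ratings:      dict user -> (dict item -> rating)
    neighbors_of: dict user -> list of neighboring users
    """
    if ratings[neighbor].get(item, 0) != 0:
        return ratings[neighbor][item]  # already rated: keep the real score
    raters = [v2 for v2 in neighbors_of[neighbor] if ratings[v2].get(item, 0) != 0]
    if not raters:
        return 0.0  # none of the neighbors rated the item either
    return sum(ratings[v2][item] for v2 in raters) / len(raters)
```

      <p>For instance, if two of v's neighbors rated the item 4 and 2, the temporary rating is their mean, 3.</p>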
    </sec>
    <sec id="sec-6">
      <title>PROPOSED METHOD</title>
    </sec>
    <sec id="sec-7">
      <title>Rapid similarity calculation using only high ratings</title>
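      <p>The high-rating similarity can be sketched in Python. This is a minimal sketch: we assume "high" means a rating at or above a criterion threshold (matching the experiments, where the criterion is varied from 1 to 5), and the names are ours:</p>

```python
def high_rating_similarity(ratings_u, ratings_v, threshold=4):
    """Proposed similarity using only high ratings: |H_uv| / |H_u|,
    where H_u is the set of u's items rated at or above `threshold`.
    ratings_u, ratings_v: dicts mapping item -> rating."""
    h_u = {i for i, r in ratings_u.items() if r >= threshold}
    h_v = {i for i, r in ratings_v.items() if r >= threshold}
    if not h_u:
        return 0.0  # u has no highly rated items
    return len(h_u & h_v) / len(h_u)
```

      <p>Note the asymmetry: the denominator counts only the target user's highly rated items, so only those items need to be examined per pair.</p>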
      <sec id="sec-7-1">
        <p>Conventional ListCF is slow because it has to calculate similarities using all rating scores. To speed up computation, we calculate similarities based only on high ratings. Failure cases such as the one in Fig. 6 occur when users have only one low-rated item in common; the horizontal axis of Fig. 6 represents the item number.</p>
        <p>For a given neighboring user v ∈ N_u, the temporary rating of an unrated item i_t is calculated as</p>
        <p>r_{v,i_t} = (1 / |N'_{v,i_t}|) ∑_{v' ∈ N'_{v,i_t}} r_{v',i_t} (if r_{v,i_t} is zero), (t = 1, ⋯, n), (13)</p>
        <p>where N'_{v,i_t} is the set of v's neighbors v' ∈ N_v who have rated item i_t. If none of v's neighbors have rated i_t, the temporary rating will still be zero. In that case, we can obtain a nonzero temporary rating by calculating the temporary ratings r_{v',i_t} for each neighbor v' of v. By calculating the cross entropy of Eq. (7) using these temporary ratings (Eq. (13)) and replacing s(u, v) in Eq. (8) with s_high(u, v) (Eq. (12)), ranking predictions can then be made in the same way as for ListCF by using Eqs. (8)–(11).</p>
      </sec>
      <sec id="sec-7-2">
        <p>Calculating temporary ratings allows the neighbors' probability distributions in situations such as Fig. 6 to be more reasonable. As a result, since the parameters are less frequently updated in an undesirable way, this should improve the ranking accuracy.</p>
        <p>The ranking accuracy is evaluated using NDCG. For a user u, the NDCG of the top k predicted items is</p>
        <p>NDCG_u@k = Z_u ∑_{p=1}^{k} (2^{r_p} − 1) / log_2(1 + p), (14)</p>
        <p>where Z_u is a normalization term that ensures the NDCG value of the correct ranking is 1 and r_p is the rating of the pth-ranked item for user u. For a set of users U, the overall accuracy is the average (1/|U|) ∑_{u=1}^{|U|} NDCG_u@k.</p>
      </sec>
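      <p>Eq. (14) can be checked with a short Python sketch (function name ours; the ideal ranking is obtained by sorting the ratings in descending order):</p>

```python
import math

def ndcg_at_k(ranked_ratings, k):
    """NDCG@k of Eq. (14): Z_u * sum_p (2^{r_p} - 1) / log2(1 + p),
    where Z_u normalizes so the ideal (sorted) ranking scores 1.
    ranked_ratings: true ratings listed in predicted rank order."""
    def dcg(rs):
        return sum((2 ** r - 1) / math.log2(1 + p)
                   for p, r in enumerate(rs[:k], start=1))
    ideal = dcg(sorted(ranked_ratings, reverse=True))
    return dcg(ranked_ratings) / ideal if ideal > 0 else 0.0
```

      <p>A perfect ranking scores exactly 1; placing a rating-3 item ahead of a rating-5 item at k = 1 scores (2^3 − 1)/(2^5 − 1) = 7/31.</p>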
    </sec>
    <sec id="sec-8">
      <title>Comparison of similarity computation time</title>
      <sec id="sec-8-1">
        <p>In this experiment, we measured the similarity computation time. We compared the time taken to calculate the user similarities using both the proposed and ListCF methods, and then compared the ranking accuracy of ListCF predictions made using the calculated similarities. The criterion used to determine high ratings was varied from 1 to 5 in increments of 1, and similarities were obtained for each criterion. A criterion of 1 means all ratings are used, while a criterion of 5 means that only items rated 5 are used.</p>
      </sec>
      <sec id="sec-8-3">
        <p>The datasets used were MovieLens 1M (6,040 users, 3,952 items, 1,000,209 ratings) and MovieLens 10M (71,567 users, 10,681 items, 10,000,054 ratings).</p>
      </sec>
      <sec id="sec-8-7">
        <p>Figs. 8(a) and 9(a) show the computation time results for MovieLens 1M and 10M, respectively, while Figs. 8(b)-(d) and 9(b)-(d) show the corresponding ranking accuracy results. The horizontal axes show the high-rating criterion in all graphs, while the vertical axes show calculation time in Figs. 8(a) and 9(a), NDCG@1 in Figs. 8(b) and 9(b), NDCG@3 in Figs. 8(c) and 9(c), and NDCG@5 in Figs. 8(d) and 9(d), respectively. As can be seen from Figs. 8(a) and 9(a), similarity computation is considerably more rapid for the proposed method than for conventional ListCF. In addition, Figs. 8(b)-(d) and 9(b)-(d) show that when scores of 4 and 5 are considered to be high ratings, the rankings are as accurate as those of the conventional method.</p>
      </sec>
    </sec>
    <sec id="sec-9">
      <title>Comparison of ranking accuracy</title>
      <p>Next, we conducted experiments to examine the effect of using temporary ratings on ranking prediction accuracy. We compared the accuracy of ranking predictions made using both the proposed and ListCF methods using the similarities calculated in the previous subsection. We used an initial θ_{u,g} value of 10 for both datasets, with learning rates of 0.025 and 0.01 for MovieLens 1M and 10M, respectively.</p>
      <sec id="sec-9-1">
        <title>The gradient descent method was repeated 50 times. The computation time results are shown in Fig. 10, while ranking accuracy results are shown in Figs. 11 and 12.</title>
        <p>(Figs. 10-12 compare "ListCF prediction" with "proposed prediction"; the horizontal axes show conventional ListCF and the high-rating criteria 1-5.)</p>
        <p>
Fig. 10 shows that the computation times for the proposed ranking prediction method were shorter in all cases. In addition, Figs. 11 and 12 show that its NDCG@1, NDCG@3, and NDCG@5 values were better in all cases.</p>
      </sec>
      <sec id="sec-9-3">
        <p>Since MovieLens 1M has less data than MovieLens 10M, more temporary ratings are generated, and the difference between the proposed method and conventional ListCF becomes larger than for MovieLens 10M.</p>
      </sec>
    </sec>
    <sec id="sec-10">
      <title>CONCLUSION</title>
      <sec id="sec-10-1">
        <p>In this paper, we have made two proposals. The first is to calculate user similarity scores using only high ratings, instead of all ratings, to speed up computation. The second is to introduce temporary estimated ratings for items that have not been rated by neighboring users to improve ranking accuracy.</p>
        <p>In experiments using the MovieLens 1M and 10M datasets,
we have compared the computation time and accuracy of
both proposals with those of conventional ListCF. The
results demonstrated that the first proposal provided a
considerable reduction in computation time, compared with
conventional ListCF, while maintaining equal or greater
ranking accuracy. In addition, the second proposal
shortened the prediction time in most cases while always
improving the ranking accuracy.</p>
      </sec>
      <sec id="sec-10-2">
        <p>In the future, we plan to improve the way neighboring users are selected and to search for a better objective function.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>1. Zhe Cao, Tao Qin, Tie-Yan Liu, Ming-Feng Tsai, and Hang Li. 2007. Learning to rank: from pairwise approach to listwise approach. In Proceedings of the 24th International Conference on Machine Learning (ICML '07), 129-136. http://dx.doi.org/10.1145/1273496.1273513</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>2. Shanshan Huang, Shuaiqiang Wang, Tie-Yan Liu, Jun Ma, Zhumin Chen, and Jari Veijalainen. 2015. Listwise Collaborative Filtering. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '15), 343-352. http://dx.doi.org/10.1145/2766462.2767693</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>3. Kalervo Järvelin and Jaana Kekäläinen. 2000. IR evaluation methods for retrieving highly relevant documents. In Proceedings of the 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '00), 41-48. http://dx.doi.org/10.1145/345508.345545</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>4. Kalervo Järvelin and Jaana Kekäläinen. 2002. ACM Trans. Inf. Syst. 20, 4: 422-446.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>5. S. Kullback. 1997. Information Theory and Statistics. Courier Corporation.</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>6. G. Linden, B. Smith, and J. York. 2003. Amazon.com recommendations: Item-to-item collaborative filtering. IEEE Internet Computing 7, 1: 76-80.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>7. Nathan N. Liu and Qiang Yang. 2008. EigenRank: a ranking-oriented approach to collaborative filtering. In Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '08), 83-90. http://dx.doi.org/10.1145/1390334.1390351</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>8. J. I. Marden. 1996. Analyzing and Modeling Rank Data. CRC Press.</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>9. Badrul Sarwar, George Karypis, Joseph Konstan, and John Riedl. 2001. Item-based collaborative filtering recommendation algorithms. In Proceedings of the 10th International Conference on World Wide Web (WWW '01), 285-295. http://dx.doi.org/10.1145/371920.372071</mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>10. Shuaiqiang Wang, Jiankai Sun, Byron J. Gao, and Jun Ma. 2014. VSRank: A Novel Framework for Ranking-Based Collaborative Filtering. ACM Trans. Intell. Syst. Technol. 5, 3: Article 24.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>