<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>TOCCF: Time-Aware One-Class Collaborative Filtering</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Xinchao Chen</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Weike Pan</string-name>
        </contrib>
      </contrib-group>
      <abstract>
        <p>One-class collaborative filtering (OCCF), or recommendation with one-class feedback such as shopping records, has recently gained more attention from researchers and practitioners in the community. The main reason is that one-class feedback in the form of (user, item) pairs is often more abundant than numerical ratings in the form of (user, item, rating) triples as exploited by traditional collaborative filtering algorithms. However, most previous work on OCCF does not consider the temporal context, which is known to be of great importance to users' preferences and behaviors. In this paper, we first formally define a new problem called time-aware OCCF (TOCCF), and then design a novel time-aware similarity learning (TSL) model accordingly. Our TSL is based on a novel time-aware weighting scheme and a seminal work on similarity learning, and is able to learn the item similarities more accurately. Empirical studies on two large real-world datasets show that our TSL model can integrate the temporal information effectively, and performs significantly better than several state-of-the-art recommendation algorithms.</p>
      </abstract>
      <kwd-group>
        <kwd>Time-Aware</kwd>
        <kwd>One-Class Collaborative Filtering</kwd>
        <kwd>Factored Item Similarity Model</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>
        One-class collaborative filtering (OCCF) [
        <xref ref-type="bibr" rid="ref2 ref5">2, 5</xref>
        ] is a recent
research focus in the recommender systems community.
In OCCF, the data we can exploit for recommendation
are so-called one-class feedback, such as “transactions”
in e-commerce, instead of the multi-class feedback or numerical
ratings used in traditional collaborative filtering problems.
Modeling one-class feedback is considered more
important simply because users are often
reluctant to assign a multi-class score to a product after
purchasing it.
      </p>
      <p>
        In order to model one-class feedback, two main lines of
techniques are usually adopted, parallel to those of
collaborative filtering: memory-based OCCF and
model-based OCCF. For memory-based OCCF, the only
difference from memory-based CF is that the
similarity between two users or two items is estimated based
on the one-class feedback instead of the ratings. For
model-based OCCF, the techniques often differ from those of
model-based CF, in particular in the underlying assumption
for learning from positive feedback only and in the
prediction rule based on similarity learning. The most well-known
preference assumption for one-class feedback is probably the
pairwise preference assumption of Bayesian personalized
ranking, defined on the difference between a purchased
product and an un-purchased one [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. The most recent work
on similarity learning is the factored item
similarity model (FISM) [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], which learns the latent representation
of items with the assumption that the inner product of two
items’ latent factors is their similarity.
      </p>
      <p>
        The aforementioned advances in modeling one-class
feedback in OCCF have indeed achieved great success in various
recommendation applications. However, we find that very
few works have explicitly studied the temporal effect in
OCCF, though it has been shown to be very helpful in user behavior
modeling in CF [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ]. A recent work on microblog
recommendation [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] shows that temporal information is helpful.
However, the time-aware weighting scheme [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] is designed
for that specific application, where the items (or tweets) are
recorded with the time when they arrive at the users instead of
when they are retweeted by the users. In the general
time-aware OCCF problem studied here, we only have temporal information
for the moments when users act on items.
      </p>
      <p>
        In this paper, we first design a time-aware weighting scheme
for the reliability of the positive feedback, and then propose
a time-aware similarity learning (TSL) model by integrating
the weight as a confidence score into the similarity learning
model [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. The time complexity of TSL is the same as
that of FISM [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. We conduct extensive empirical studies
on two large public datasets with state-of-the-art
memory-based and model-based baselines.
The empirical results show that our new similarity learning
model is simple but very effective in exploiting the time
context, and is significantly better than the algorithms that do not
model the temporal effect.
      </p>
    </sec>
    <sec id="sec-2">
      <title>TIME-AWARE SIMILARITY LEARNING</title>
    </sec>
    <sec id="sec-3">
      <title>Problem Definition</title>
      <p>In time-aware one-class collaborative filtering (TOCCF),
we have n users, m items and their positive feedback in the
form of (user, item, time) triples, e.g., (u, i, t<sub>ui</sub>), denoting
that user u gave positive feedback on item i at time t<sub>ui</sub>. In
TOCCF, our goal is to learn users’ preferences from the
positive feedback and the associated temporal information, and
to provide each user u with a personalized ranked list of items
that he or she may like in the future.</p>
      <p>
        Notice that in OCCF [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], the temporal information is not
exploited, i.e., the data consist of (user, item) pairs only. We illustrate
the studied problem in Figure 1, where OCCF is a special
case of TOCCF and is represented as a mixed (user, item)
feedback matrix ignoring the time context.
      </p>
      <p>
        It is well known that similarity measurement is critical
in collaborative filtering, because it determines the
neighborhood of a certain user u and thus affects the preference
prediction of user u on other items. The state-of-the-art
approach [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] does not adopt a traditional similarity
measure such as cosine similarity or the Jaccard index, but instead
learns the similarity from the preference data, which is
empirically more adaptive to different datasets.
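Mathematically, the learned similarity in FISM [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] is represented as
follows,
        <disp-formula id="eq-1">
          <label>(1)</label>
          <tex-math><![CDATA[s_{i'i} = \frac{1}{\sqrt{|\mathcal{N}_u \setminus \{i\}|}} V_{i\cdot} W_{i'\cdot}^{T},]]></tex-math>
        </disp-formula>
where <inline-formula><tex-math><![CDATA[\mathcal{N}_u = \{i' \mid (u, i', t_{ui'}) \in \mathcal{T}_u\}]]></tex-math></inline-formula>
is the set of items with positive feedback from user u, and
<inline-formula><tex-math><![CDATA[V_{i\cdot}, W_{i'\cdot} \in \mathbb{R}^{1 \times d}]]></tex-math></inline-formula> are
latent feature vectors to be learned for item i and item i′,
respectively. The similarity <inline-formula><tex-math><![CDATA[s_{i'i}]]></tex-math></inline-formula>, or the latent vectors
<inline-formula><tex-math><![CDATA[V_{i\cdot}]]></tex-math></inline-formula> and <inline-formula><tex-math><![CDATA[W_{i'\cdot}]]></tex-math></inline-formula>, can be learned via some pointwise or pairwise loss
function in an optimization problem.
      </p>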
      <p>
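        With the learned similarity <inline-formula><tex-math><![CDATA[s_{i'i}]]></tex-math></inline-formula>, the preference of user u
on item i can then be predicted as follows [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ],
        <disp-formula id="eq-2">
          <label>(2)</label>
          <tex-math><![CDATA[\hat{r}_{ui} = \sum_{i' \in \mathcal{N}_u \setminus \{i\}} s_{i'i} + b_u + p_i,]]></tex-math>
        </disp-formula>
where b<sub>u</sub> and p<sub>i</sub> are the preference bias of user u and the popularity
bias of item i, respectively.
      </p>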
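      <p>The prediction rule in Eqs. (1) and (2) can be illustrated with a minimal sketch. It assumes that V and W are NumPy arrays with one row per item and that the items of user u are given as a list of item indices; the function name <monospace>predict_score</monospace> and its arguments are hypothetical and are not taken from the FISM or TSL implementations.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def predict_score(u_items, i, V, W, b_u, p_i):
    """Predicted preference of a user on item i, following Eqs. (1)-(2).

    u_items: item indices with positive feedback from the user (N_u).
    V, W:    arrays of shape (m, d) holding the item latent vectors.
    b_u:     user bias; p_i: popularity bias of item i.
    """
    neighbors = [j for j in u_items if j != i]   # N_u minus {i}
    if not neighbors:
        # No other item to compare with: only the bias terms remain.
        return float(b_u + p_i)
    norm = 1.0 / np.sqrt(len(neighbors))         # 1 / sqrt(|N_u minus {i}|)
    sims = norm * (W[neighbors] @ V[i])          # s_{i'i} for every neighbor i'
    return float(sims.sum() + b_u + p_i)
```
]]></preformat>
      <p>Excluding item i from its own neighborhood means that an item never contributes to its own predicted score.</p>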
    </sec>
    <sec id="sec-4">
      <title>Time-Aware Similarity Learning</title>
      <p>We introduce a confidence measurement for an observed
positive feedback (u, i, t<sub>ui</sub>),
        <disp-formula id="eq-3">
          <label>(3)</label>
          <tex-math><![CDATA[c_{ui} = \frac{1}{(t_{\tau} + 1) - t_{ui}},]]></tex-math>
        </disp-formula>
where t<sub>τ</sub> is the largest time stamp (in days) in the training
data, and thus (t<sub>τ</sub> + 1) denotes the current day.
We use the inverse of the difference between the
current time (t<sub>τ</sub> + 1) and the time t<sub>ui</sub> at which the positive feedback was
issued because more recent feedback is more reliable
and thus deserves higher confidence.</p>
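      <p>As a small worked example of the confidence measurement in Eq. (3), the sketch below assumes integer time stamps measured in days; the function name <monospace>confidence</monospace> is hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def confidence(t_ui, t_max):
    """Confidence of a feedback issued on day t_ui, following Eq. (3).

    t_max is the largest time stamp (in days) in the training data,
    so t_max + 1 plays the role of the current day.
    """
    if t_ui > t_max:
        raise ValueError("t_ui must not exceed the largest training time stamp")
    return 1.0 / ((t_max + 1) - t_ui)


# With t_max = 100: feedback from the last training day gets weight 1,
# feedback from ten days before the current day gets weight 0.1.
print(confidence(100, 100))  # 1.0
print(confidence(91, 100))   # 0.1
```
]]></preformat>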
      <p>
        Our proposed confidence measurement shown in Eq.(3)
looks similar to but is very different from the pairwise
confidence weight in BPRC [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], which is defined on two (user,
item) pairs, i.e., cuij for (u, i) and (u, j), instead of on one
single (user, item) pair in Eq.(3). Notice that in BPRC [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ],
(u, i) denotes a retweet feedback and (u, j) denotes a
non-retweet feedback, while both are associated with temporal
information of when the tweet is received by user u. In our
TOCCF, we only have the temporal information of the
observed positive feedback, and thus the approach in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] is
not applicable to our studied problem.
      </p>
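      <p>Finally, we have a general time-aware weighting scheme,
        <disp-formula id="eq-4">
          <label>(4)</label>
          <tex-math><![CDATA[\omega_{ui} = \begin{cases} c_{ui}, & \text{if } (u, i, t_{ui}) \in \mathcal{T}_u, \\ 1, & \text{otherwise}, \end{cases}]]></tex-math>
        </disp-formula>
which means that we weight the known (i.e., observed)
feedback only. It is thus a pointwise confidence weight.</p>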
      <p>
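        With the time-aware weighting scheme and the loss
function of FISM [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], we propose to solve the following
optimization problem,
        <disp-formula id="eq-5">
          <label>(5)</label>
          <tex-math><![CDATA[\min_{V, W, b, p} \sum_{(u, i, t_{ui}) \in \mathcal{T} \cup \mathcal{T}'} \frac{1}{2} \omega_{ui} (r_{ui} - \hat{r}_{ui})^2 + R(V, W, b, p),]]></tex-math>
        </disp-formula>
where <inline-formula><tex-math><![CDATA[\mathcal{T}']]></tex-math></inline-formula> is a set of randomly sampled unobserved
feedback with <inline-formula><tex-math><![CDATA[|\mathcal{T}'| = 3|\mathcal{T}|]]></tex-math></inline-formula>, and
        <disp-formula>
          <tex-math><![CDATA[R(V, W, b, p) = \frac{\alpha}{2} \sum_{i=1}^{m} \|V_{i\cdot}\|^2 + \frac{\alpha}{2} \sum_{i'=1}^{m} \|W_{i'\cdot}\|^2 + \frac{\alpha}{2} \sum_{u=1}^{n} b_u^2 + \frac{\alpha}{2} \sum_{i=1}^{m} p_i^2]]></tex-math>
        </disp-formula>
is the regularization term commonly used to avoid overfitting. The
optimization problem can be solved by a commonly used gradient
descent algorithm [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].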
      </p>
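      <p>To make the weighted objective concrete, the following sketch performs one stochastic gradient step on a single (u, i) sample. It is only an illustration under stated assumptions: the target r<sub>ui</sub> is taken to be 1 for observed feedback and 0 for sampled unobserved feedback, the regularization is applied only to the parameters touched by the sample, and the function name <monospace>sgd_step</monospace> and its arguments are hypothetical rather than the authors' implementation.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def sgd_step(u, i, r_ui, w_ui, u_items, V, W, b, p, alpha=0.01, gamma=0.01):
    """One gradient step on 0.5 * w_ui * (r_ui - r_hat)^2 plus L2 terms.

    V, W: (m, d) item factors; b: (n,) user biases; p: (m,) item biases.
    All four arrays are updated in place. Returns the error before the update.
    """
    neighbors = [j for j in u_items if j != i]        # N_u minus {i}
    if neighbors:
        norm = 1.0 / np.sqrt(len(neighbors))
        agg = norm * W[neighbors].sum(axis=0)         # normalized sum of W rows
    else:
        norm, agg = 0.0, np.zeros(V.shape[1])
    err = r_ui - (agg @ V[i] + b[u] + p[i])           # r_ui - r_hat_ui
    v_old = V[i].copy()                               # use the pre-update V_i for W
    V[i] += gamma * (w_ui * err * agg - alpha * V[i])
    if neighbors:
        W[neighbors] += gamma * (w_ui * err * norm * v_old - alpha * W[neighbors])
    b[u] += gamma * (w_ui * err - alpha * b[u])
    p[i] += gamma * (w_ui * err - alpha * p[i])
    return float(err)
```
]]></preformat>
      <p>In this sketch the weight w_ui only rescales the error term, so a sample with a small confidence moves the parameters less than a recent one with the same error.</p>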
    </sec>
    <sec id="sec-5">
      <title>EXPERIMENTAL RESULTS</title>
    </sec>
    <sec id="sec-6">
      <title>Datasets and Evaluation Metrics</title>
      <p>In order to verify the effectiveness of our proposed
pointwise weighting scheme and time-aware similarity learning
(TSL) model, we use two large public datasets, i.e., MovieLens
10M (ML10M for short) and Netflix, in our
empirical studies. ML10M contains about 10 million numerical
ratings by 71567 users on 10681 items, and Netflix
contains about 100 million numerical ratings by 480189 users
on 17770 items. In order to simulate the TOCCF problem
setting with time-aware positive feedback, for each dataset,
we first remove the (u, i, r<sub>ui</sub>, t<sub>ui</sub>) quadruples with r<sub>ui</sub> ≤ 4,
and then take the (u, i, t<sub>ui</sub>) triples from the remaining data.</p>
      <p>For the resulting time-aware one-class feedback of each
dataset, we further split it according to the time stamps in
order to generate a copy of training data, validation data
and test data. We illustrate the data generation procedure
in Figure 3. Specifically, we first use the 60% of the feedback with the
smallest time stamps for training; then, from the remaining
40%, we randomly sample half (20% of all feedback) for
validation and use the other half (the remaining 20%) for test. We put the statistics
of the resulting datasets in Table 3.</p>
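      <p>The splitting procedure can be sketched as follows, assuming the feedback is held in memory as a list of (user, item, time) tuples. The function name <monospace>temporal_split</monospace> and the <monospace>seed</monospace> argument are hypothetical; how ties in time stamps are broken is not specified in the text and here simply follows the stable sort order.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import random


def temporal_split(triples, seed=0):
    """Split (user, item, time) triples into train, validation and test.

    The earliest 60% of the feedback goes to training; the remaining 40%
    is shuffled and divided evenly between validation and test.
    """
    ordered = sorted(triples, key=lambda x: x[2])   # oldest feedback first
    n_train = (6 * len(ordered)) // 10              # integer arithmetic for 60%
    train, rest = ordered[:n_train], ordered[n_train:]
    rng = random.Random(seed)                       # local generator, reproducible
    rng.shuffle(rest)
    half = len(rest) // 2
    return train, rest[:half], rest[half:]
```
]]></preformat>
      <p>Because only the training part is cut by time, every validation and test record is at least as recent as every training record, which matches the goal of predicting what a user may like in the future.</p>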
      <p>For one-class feedback in TOCCF, we adopt several
commonly used evaluation metrics in ranking-oriented item
recommendation or information retrieval scenarios. In
particular, we check the top-K performance using Precision@K,
Recall@K, F1@K, NDCG@K and 1-call@K.</p>
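      <p>For reference, three of these metrics can be computed for a single user as in the sketch below. It uses the usual binary-relevance definitions with a log<sub>2</sub> discount for NDCG, which is an assumption since the exact formulas are not given here; the function name <monospace>precision_recall_ndcg_at_k</monospace> is hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math


def precision_recall_ndcg_at_k(ranked, relevant, k):
    """Precision@K, Recall@K and NDCG@K for one user.

    ranked:   recommended item ids, best first.
    relevant: set of item ids the user has in the test data.
    """
    hits = [1 if item in relevant else 0 for item in ranked[:k]]
    precision = sum(hits) / k
    recall = sum(hits) / len(relevant) if relevant else 0.0
    # Discounted gain of the actual ranking (positions are 0-based here).
    dcg = sum(h / math.log2(pos + 2) for pos, h in enumerate(hits))
    # Best achievable DCG: all relevant items placed at the top.
    ideal = sum(1.0 / math.log2(pos + 2) for pos in range(min(k, len(relevant))))
    ndcg = dcg / ideal if ideal > 0 else 0.0
    return precision, recall, ndcg
```
]]></preformat>
      <p>Dataset-level values would then be averages of these per-user scores over all test users.</p>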
      <p>For the neighborhood-based methods ICF(JI) and ICF(CS),
we fix the number of nearest neighbors as 20. For the
factorization-based methods BPR, FISM and our TSL, we fix the
dimension of the latent space as d = 20 and the learning rate as γ = 0.01.
The iteration number in BPR, FISM and our TSL is chosen
from T ∈ {100, 500, 1000} and the value of the tradeoff
parameter is chosen from α ∈ {0.001, 0.01, 0.1}, both according to
NDCG@5 on the validation data, i.e., there are nine
combinations of the values of the two types of parameters.</p>
      <p>[Figure 2: NDCG@K of ICF(JI), ICF(CS), BPR, FISM and TSL on ML10M and Netflix, for different values of K.]</p>
    </sec>
    <sec id="sec-7">
      <title>Main Results</title>
      <p>We report the main results in Table 2, from which we can
make the following observations:</p>
      <list list-type="bullet">
        <list-item>
          <p>The two neighborhood-based methods, i.e., ICF(JI) and
ICF(CS), perform poorly, which is caused by the intransitivity of the
similarity measurements on the scarce positive
feedback. Notice that the density of the training data of both
ML10M and Netflix is smaller than 0.2%.</p>
        </list-item>
        <list-item>
          <p>The two factorization-based methods, i.e., BPR and
FISM, perform much better than the neighborhood-based
methods, which is expected because of the merit of
transitivity via learned latent factors.</p>
        </list-item>
        <list-item>
          <p>Our proposed time-aware similarity learning method,
i.e., TSL, further improves significantly over FISM and BPR,
from which we can clearly see the value of the
temporal information and the effectiveness of our
weighting scheme in integrating the time context.</p>
        </list-item>
      </list>
      <p>For real-world deployment of a recommendation model,
we usually pay more attention to its top-K performance,
because that affects users’ behaviors most. For this
reason, we also check the recommendation performance with
different values of K ∈ {5, 10, 15}. We show the results of
NDCG@K in Figure 2. The results on the other
top-K metrics are similar, and are thus not included due to
space limitations. From Figure 2, we can see that the
performance ordering for different values of K on the two datasets is
ICF(JI), ICF(CS) &lt; BPR, FISM &lt; TSL, which is consistent
with that of Table 2. The results on NDCG@K again show the
usefulness of the temporal context and the effectiveness of our
time-aware weighting scheme in similarity learning.</p>
    </sec>
    <sec id="sec-8">
      <title>CONCLUSIONS AND FUTURE WORK</title>
      <p>
        In this paper, we study an important recommendation
problem termed time-aware one-class collaborative filtering
(TOCCF), and propose a novel time-aware similarity
learning (TSL) model based on the seminal factored item
similarity model [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Empirical results show that our TSL
can incorporate the time information in a simple but
effective way, and is able to recommend significantly more
accurate ranked lists of items than several state-of-the-art
methods that do not model the time information.
      </p>
      <p>
        For future work, we are interested in generalizing our
time-aware similarity learning model to more advanced
similarity learning approaches for recommendation with social and
other auxiliary data [
        <xref ref-type="bibr" rid="ref1 ref6">1, 6</xref>
        ].
      </p>
    </sec>
    <sec id="sec-9">
      <title>ACKNOWLEDGMENT</title>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>H.</given-names>
            <surname>Fang</surname>
          </string-name>
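          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Guo</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhang</surname>
          </string-name>
          .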
          <article-title>Multi-faceted trust and distrust prediction for recommender systems</article-title>
          .
          <source>Decision Support Systems</source>
          ,
          <volume>71</volume>
          :
          <fpage>37</fpage>
          -
          <lpage>47</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>S.</given-names>
            <surname>Kabbur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Ning</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G.</given-names>
            <surname>Karypis</surname>
          </string-name>
          .
          <article-title>FISM: Factored item similarity models for top-N recommender systems</article-title>
          .
          <source>In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '13</source>
          , pages
          <fpage>659</fpage>
          -
          <lpage>667</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Koren</surname>
          </string-name>
          .
          <article-title>Collaborative filtering with temporal dynamics</article-title>
          .
          <source>In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</source>
          , pages
          <fpage>447</fpage>
          -
          <lpage>456</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Wang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Zhu</surname>
          </string-name>
          .
          <article-title>Structured learning from heterogeneous behavior for social identity linkage</article-title>
          .
          <source>IEEE Transactions on Knowledge and Data Engineering</source>
          ,
          <volume>27</volume>
          (
          <issue>7</issue>
          ):
          <fpage>2005</fpage>
          -
          <lpage>2019</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>R.</given-names>
            <surname>Pan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Cao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. N.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Lukose</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Scholz</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Q.</given-names>
            <surname>Yang</surname>
          </string-name>
          .
          <article-title>One-class collaborative filtering</article-title>
          .
          <source>In Proceedings of the 8th IEEE International Conference on Data Mining, ICDM '08</source>
          , pages
          <fpage>502</fpage>
          -
          <lpage>511</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>W.</given-names>
            <surname>Pan</surname>
          </string-name>
          .
          <article-title>A survey of transfer learning for collaborative recommendation with auxiliary data</article-title>
          .
          <source>Neurocomputing</source>
          ,
          <volume>177</volume>
          :
          <fpage>447</fpage>
          -
          <lpage>453</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S.</given-names>
            <surname>Rendle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Freudenthaler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Gantner</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L.</given-names>
            <surname>Schmidt-Thieme</surname>
          </string-name>
          .
          <article-title>BPR: Bayesian personalized ranking from implicit feedback</article-title>
          .
          <source>In Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence, UAI '09</source>
          , pages
          <fpage>452</fpage>
          -
          <lpage>461</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Zhang</surname>
          </string-name>
          .
          <article-title>Please spread: Recommending tweets for retweeting with implicit feedback</article-title>
          .
          <source>In Proceedings of the 2012 Workshop on Data-driven User Behavioral Modelling and Mining from Social Media, DUBMMSM '12</source>
          , pages
          <fpage>19</fpage>
          -
          <lpage>22</lpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>