Implicit Feedback Recommendation via Implicit-to-Explicit
           Ordinal Logistic Regression Mapping

                   Denis Parra                         Alexandros Karatzoglou                    Xavier Amatriain
             University of Pittsburgh                     Telefonica Research                    Telefonica Research
                                                                Idil Yavuz
                                                          University of Pittsburgh


ABSTRACT                                                                obtained from direct usage [4]. However, it is not clear that
One common dichotomy faced in recommender systems is                    we can trust a simple one-to-one mapping between usage
that explicit user feedback -in the form of ratings, tags, or           and preference [5]. On the other hand, explicit feedback is
user-provided personal information- is scarce, yet the most             obtained by directly querying the user, who is usually pre-
popular source of information in most state-of-the-art rec-             sented with an integer scale where to quantify how much
ommendation algorithms, and on the other side, implicit                 she likes the items. In principle, explicit feedback is a more
user feedback - such as numbers of clicks, playcounts, or               robust way to extract preference, since the user is reporting
web pages visited in a session- is more frequently available,           directly on this variable, removing the need of an indirect
but there are fewer methods well studied to provide recom-              inference. However, it is also known that this kind of feed-
mendations based on this kind of information.                           back is affected by user inconsistencies known as natural
   Given the current scenario, and under a situation where              noise [6]. Besides, the fact that we are introducing a user
just implicit user feedback is available, it would be more ap-          overhead, makes it difficult to have a complete view on the
propriate either to provide recommendations using the im-               user preferences [7].
plicit data and implicit-fedback-based methods, or to map                  None of the two existing strategies for capturing user feed-
implicit user feedback to explicit feedback and then use an             back clearly outperforms the other. Ideally, we would like
explicit-based algorithm? On this paper, we analyze this                to use implicit feedback, minimizing the impact on the user,
problem in the context of music recommendation by means                 but having a robust and proven way to map this informa-
of a well-known implicit feedback recommendation method                 tion to the actual user preference. In a previous work [8], we
described in Hu et al. [1] by comparing the use of raw play-            tested several regression models and we were able to map im-
counts with the use of explicit data - user ratings - obtained          plicit user feedback to explicit ratings. Our results were sat-
by mapping implicit to explicit feedback with a novel mixed-            isfactory, but we did not compare to state-of-the-art meth-
effects logistic regression model.                                      ods that make use of raw implicit information to provide rec-
                                                                        ommendations. In this paper we propose an ordinal logistic
                                                                        regression model that by using a few ratings is able to infer
1. INTRODUCTION                                                         a generic parametric mapping from implicit to explicit data.
   Recommender Systems (RS) [2] have proved their busi-                 Our mapping model integrates usual implicit user feedback
ness value and impact on many application scenarios that                (playcounts) with contextual information (how recently the
go from recommending movie rentals to new contacts on a                 user listened to an album). We compare our approach to a
social network. One of the main features of these systems               state-of-the art algorithm for implicit feedback recommen-
is that they rely on understanding user preferences in or-              dations and discuss possible extensions.
der to estimate the utility of items and decide whether they
should be recommended. These user preferences are infered
by taking into account direct feedback from the user, either
                                                                        2.   PRELIMINARIES AND RELATED WORK
in explicit or implicit form.                                              Implicit feedback is much more readily available in prac-
   We obtain implicit feedback [3] by measuring the interac-            tical scenarios for recommender systems. However, most of
tion of the user with the different items. We can use signals           the research literature focuses on the use of explicit feedback
such as the number of playcounts in a song, or the clicks on            input since this is considered the ground truth on the user
webpages as implicit feedback. This kind of data is obtained            preferences and allows to reduce the recommender problem
without incurring into any overhead on the user, since it is            to one of predicting ratings.
                                                                           In one of the few papers addressing the implicit feedback
                                                                        recommendation problem [1], Hu et al. deal with the implicit
                                                                        feedback recommendation problem by binarizing it and in-
                                                                        troducing the idea of confidence. In our previous work [8],
                                                                        however, we presented an analysis of implicit and explicit
                                                                        feedback that challenged most of the assumptions stated
                                                                        in [1]. In particular: (1) There is no negative feed-
                                                                        back. While it is true that you cannot interpret “no implicit
CARS-2011, October 23, 2011, Chicago, Illinois, USA.                    feedback“ as “negative feedback“ – and this is true also for
Copyright is held by the author/owner(s).                               explicit feedback–, implicit data can include negative feed-
back. You can assume that low feedback is negative feed-          post-filtering [16]. Once the implicit-to-explicit mapping is
back as long as the granularity of the items is comparable,       performed, we can use the inferred ratings in methods for
and there is enough variability. (2) Implicit feedback is         explicit or implicit data. We can then compare the perfor-
noisy. Implicit feedback is noisy but, as we showed in pre-       mance of these models to the one by Hu et al. in several
vious work [6], so is explicit feedback. (3) Preference vs.       experiments.
Confidence. As we showed in our work [8], the numerical
value of implicit feedback can indeed be directly mapped          3.    REGRESSION MODELS
to preference, given the appropriate mapping. (4) Evalu-
ation of implicit feedback. On the other hand, we do              3.1 Linear Regression
agree that there is no appropriate evaluation approaches for
                                                                     In [8] we introduce a linear regression model to predict
implicit feedback and this is in fact one of the motivations
                                                                  explicit preference of users on music albums in the form of
of our work: if we find an appropriate way to map implicit
                                                                  ratings based on implicit user behavior variables - (1) Im-
to explicit feedback we can ensure an evaluation that is as
                                                                  plicit Feedback (if ): playcount for a user on a given item;
good as the one we have in the explicit case.
                                                                  (2) Global Popularity (gp): global playcount for all users
   Our hypothesis that there is some observable correlation
                                                                  on a given item; (3) Recentness (re) : time elapsed since
between implicit and explicit feedback can be tracked in the
                                                                  user played a given item. In that article, we compare dif-
literature. Already in 1994, Morita and Shinoda [9] proved
                                                                  ferent linear regression models based on the aforementioned
that there was a correlation between reading time on on-
                                                                  variables and we find that the variables implicit feedback
line news and self-reported preference. Konstan et al. [10]
                                                                  and recentness explain the largest part the variability of the
did a similar experiment with the larger user base of the
                                                                  ratings, while global popularity explained a very small por-
Grouplens project and again found this to be true. Oard
                                                                  tion. This result suggested us that the two former variables
and Kim [11] performed experiments using not only reading
                                                                  would be better predictors of the user preference, and we
time but also other actions like printing an article to find a
                                                                  supported these assumption by performing a 10-fold cross
positive correlation between implicit feedback and ratings.
                                                                  validation experiment using the data of our online survey on
Koh et al. did a thorough study of rating behavior in two
                                                                  music preference as a ground truth. The RMSE values were
popular websites [12]. They hypothesize that the overall
                                                                  consistent with the previously described regression analysis.
popularity or average rating of an item will influence raters
and they conclude that while there is an effect, this depends     3.1.1    Limitations and shortcomings of Linear Regres-
on the cultural background of the raters.                                  sion
   Lee et al. [13] implement a recommender system based
                                                                     Although the linear regression gives good results, there are
on implicit feedback by constructing “pseudo-ratings” us-
                                                                  some considerations that must be observed to generalize this
ing temporal information. In this work, the authors intro-
                                                                  model to other domains and to make it able to be compared
duce the idea that recent implicit feedback should contribute
                                                                  with other approaches. First, depending on the application
more positively towards inferring the rating. The authors
                                                                  we may want the predicted values to fall in the range from
also use the idea of distinguishing three temporal bins: old,
                                                                  1 to 5, but using linear regression we cannot ensure it. Sec-
middle, and recent.
                                                                  ond, as in most of recommender systems research, our main
   Two recent works approach the issue of implicit feedback
                                                                  evaluation metric is RMSE. When using this metric, we are
in the music domain. Jawasher et. al analyze the charac-
                                                                  assuming that ratings form an interval scale, i.e. the dis-
teristics of user implicit and explicit feedback in the context
                                                                  tance between any two consecutive values in the rating scale
of last.fm music service [14]. However, their results are not
                                                                  is the same. However, in a previous study [6], we have shown
conclusive due to limitations in the dataset since they only
                                                                  that users have a larger probabilty to be more inconsistent
used explicit feedback available in the last.fm profiles, which
                                                                  with some ratings numbers than with others, what give us
is limitted to the love/ban binary categories. This data is
                                                                  the clue that users do not see the rating scale as equally
very sparse and, as the authors report, almost non-existant
                                                                  spaced. Hence, we should consider the ratings as an ordi-
for some users or artists. On the other hand, Kurdomova
                                                                  nal variable rather than an linear or interval one. This also
et. al use a Bayesian approach to learn a classifier on mul-
                                                                  implies that RMSE is not a good measure alone to predict
tiple implicit feedback variables [15]. Using these features,
                                                                  user preference, it should be combined, and in some cases
the authors are able to classify liked and disliked items with
                                                                  replaced, with other measures coming from Information Re-
an accuracy of 0.75, uncovering the potential of mapping
                                                                  trieval such as precision, recall, or nDCG.
implicit feedback directly to preferences.
                                                                     Given that users present individual variability in their rat-
   In our previous work [8], we showed that it was possible
                                                                  ings, a good extension of our model should include the user
to create a simple parametric model for implicit feedback
                                                                  as a random factor. Additionally, given that ratings are
by using linear regression on some available explicit ratings.
                                                                  actually an ordinal variable, as explained in the previous
However, as we will explain, in the context of user ratings,
                                                                  paragraph, and the fact that are not normally distributed,
it may be more appropriate to use a mixed-effects ordinal
                                                                  logistic regression is a proper alternative to our linear re-
logistic regression model. In this context, the main contribu-
                                                                  gression model. Combining both considerations, our next
tion of our present work is to present an ordinal logistic re-
                                                                  model for implicit-to-explicit behavior mapping model will
gression model that allows to map implicit data into explicit
                                                                  be a mixed-effects logistic regression.
ratings for the task of recommendation. We make our model
context-aware with respect to how recently a user listened        3.2     Mixed-effects Ordinal Logistic Regression
to an album by contextual modeling, i.e., using the contex-
                                                                    The multinomial logistic regression is the natural model
tual information directly in the modelling technique, unlike
                                                                  for an ordinal scale variable (rating, that ranges from 1 to
data-driven approaches such as contextual pre-filtering or
                                                                  5) and a mixed-effects model will help us to reduce the vari-
                                        Effect          Estimate      SE        DF        t     Pr > |t|
                                        intercept 1     −1.2740     0.2808      112     −4.54   <.0001
                                        intercept 2      0.3791     0.2784      112     1.36     0.1759
                                        intercept 3      2.0898     0.2792      112     7.49    <.0001
                                        intercept 4      3.7355     0.2808      112     13.30   <.0001
                                        gp              −0.01589   0.05598     10000    −0.28    0.7766
                                        if              −0.5894    0.08094     10000    −7.28   <.0001
                                        re              −0.04137   0.05395     10000    −0.77    0.4432
                                        gp*if           −0.06955   0.02956     10000    −2.35    0.0187
                                        if*re           −0.1331    0.02782     10000    −4.78   <.0001
                                        concerts        −0.1912    0.07825     10000    −2.44    0.0145

             Table 1: Details of the mixed-effects multinomial regression model with 4 fixed effects


ability due to differences in rating among the users. Our                    list of albums in the user’s playlist so that users responded
multinomial logistic regression, that uses cummulative logit                 to a personalized survey. Details of this study, such as the
as link function, can be represented as:                                     strategy to sample the items that were rated by users and
                                                                             the results of user demographics and user consumption, can
             logit(P (rui ≤ k)) = αk + Xβ + gu                     (1)       be found in our previous article [8].
where k = {1, 2, 3, 4}, rui is the rating that user u gives to
item i, P (rui ≤ k) is the probability that the rating rui is                4.1.2     Implicit Music Consumption Feedback
less or equal than k, αk is the intercept for the cumulative                    We call Implicit Music Consumption Feedback to our Dataset2
probability that rating is less than or equal to k, X is a vector            since, unlike Dataset1 that has demographic data of each
with the actual values of the fixed factors (if, re and gp), β is            user, it only has information about implicit behavior of the
                                                          iid
the vector of coefficients of the fixed factors, gu ∼ N (0, σg2 )            users: playcount of albums per each user, how recently each
is the random effect of the users, and                                       album was listened to for the last time, and the total num-
                                                                             ber of listeners of each album in the whole last.fm website.
                                       p                                     The statistics of this dataset are described in Table 2.
                     logit(p) = log(      )                 (2)
                                     1−p
  To obtain the predicted rating of a user u on an item i,                   4.2 Regression Model Selection
we calculate the expected value of the rating as                                To select the fixed effects that would be part of our model
                                                                             we conducted a forward selection on the set of all the main
                                X
                                5
                                                                             effects and their two-way interactions. The main effects con-
                    E[rui ] =         k · P (rui = k)              (3)
                                                                             sidered were if , re, gp (as described in section 3.1) plus ten
                                k=1
                                                                             demographic and consumption variables: gender, age, hours
where                                                                        of music per week, hours of internet per week, buying phys-
                
                                   P (rui ≤ k) , k = 1                      ical records, buying online records, interaction style (prefer-
P (rui = k) =     P (rui ≤ k) − P (rui ≤ k − 1) , 1 < k < 5                  ence on listening to tracks or albums), number of concerts
                           1 − P (rui ≤ k − 1) , k = 5                      per year, interest on reading specialized music blogs or mag-
                                                         (4)                 azines, and familiarity rating music online. We have to pick
                                                                             two models finally because of the nature of our two datasets.
                                                                             In the smallest one (dataset1) we have all the variables ob-
4. EXPERIMENTAL SETUP                                                        tained by a user study, but in the second dataset (dataset2)
                                                                             we just have implicit information (playcounts per user, how
4.1 Data sets                                                                recently the user listened to each album, and the total num-
   We use two datasets in this study. The first one was col-                 ber of listeners of an album in the whole dataset) that can
lected by an online user study among users of the last.fm mu-                be reduced to if , re and gp.
sic service between September and October of 2010, contain-                     After conducting the process of forward selection, the model
ing implicit and explicit information, and also demographic                  obtained for dataset1 considers four fixed effects (if, re, gp
and consumption data. The second one was collected using                     and concerts per year) and the random effect of the user.
the last.fm API during May of 2011, and contains only im-                    The details of the model are described in Table 1. Although
plicit information. The characteristics of both datasets are                 the main effects of global popularity (gp) and recentness (re)
described in Table 2 .                                                       are not significant, we keep them in the model because their
                                                                             interaction with implicit feedback (if ) is significant [17].
4.1.1 Generating Explicit Fedback                                               For dataset2, we consider in the model if , re, and gp as
   We conducted an online user study among users of the                      fixed effects plus the random effect of the user. For the sake
last.fm music service. The goal of the study was to gather                   of space we do not show the details of this model, but the
explicit feedback on music albums to compare to the user im-                 coefficient and significant values are similar to those shown
plicit feedback we obtained by directly crawling the last.fm                 in Table 1 excepting that the factor number of concerts is
page related to the user taking the survey. Explicit feed-                   not considered in the model. As in the previous model, we
back was obtained by asking users to rate albums on a 1                      keep in the model gp and re although they are not significant
to 5 star scale. The items to rate were obtained from the                    due to their interaction with if . Under this model, is also
                                                         Dataset1 (Implicit Explicit)   Dataset2 (Implicit)
                                users                                114                       2549
                                albums                              6037                       6037
                                entries                             10122                    111815
                                density                            1.47%                      0.73%
                                avg albums/user                     88.79                     43.87
                                avg user/album                       1.71                     18.52

                                                      Table 2: Details of the datasets

                                                  MAP (D1)      nDCG(D1)       MAP(D2)     nDCG(D2)
                                    HK             0.02315       0.14831        0.1014       0.2718
                                    HKlog          0.02742       0.15447        0.1234       0.2954
                                    logit3         0.02636       0.15319        0.1223       0.2944
                                    logit4         0.02601       0.15268         N/A          N/A
                                    popularity     0.48331       0.54378        0.0178       0.1367


  Table 3: Results of MAP and nDCG after 5-fold Cross validation on dataset 1 (D1) and dataset 2 (D2)


not significant the intercept for rating equal to 2, which tell              where the Frobenius norm of the factor matrices is used
us that this intercept is not significantly different than 0,             for regularization. This minimization problem is then solved
and we may dismiss it from the model.                                     in linear time using Alternate Least Squares and utilizing a
                                                                          trick to avoid direct optimization over the 0 entries of the
4.3 Comparing the different approaches                                    matrix.
   After we have done the implicit-to-explicit mapping, we
are in condition to compare the use of impplicit data with                4.3.1    Error Measures
inferred explicit data. In this article, we compare four ap-
proaches using dataset 1 and three aproaches using dataset                   RMSE [18] is probably the most common measure to eval-
2. The methods we compare, as identified in the first column              uate the performance of recommender systems and we used
of Table 3, are:                                                          it to evaluate and compare our linear regression approaches
   • HK : the implicit feedback method introduced in Hu et al.            in [8]. However, when there are no ratings to assess the
     [1] which uses raw playcounts,                                       performance of the algorithms we can not use metrics like
   • HKlog: a variation of the HK method, also introduced in              RMSE or MAE. Hence, we opt for using Mean Average Pre-
     [1], that makes a log-transformation of the playcounts,              cision (MAP) [19] and normalized Discounted Cummulative
   • logit3 : the HK method, where the input values are the
                                                                          Gain (nDCG) [20]. The former gives us an overall sense of
     ratings inferred by logistic regression using 3 fixed factors        how well we identify relevant items to recommend from a
     (if, gp, and re)                                                     set of retrieved recommendations, and the latter how well
   • logit4 : similar to logit3 but adding the factor number of           we rank them in a list.
     concerts in the logistic regression model to infer the ratings.
     We have this information available just for dataset1.
                                                                          5.   RESULTS
  Description of the HK method. For the implicit                             In order to evaluate and compare the methods, we split
feedback modeling we use the Matrix Factorization method                  each dataset into 5 groups in order to perform a 5-fold cross
developed in [1]. In this Matrix Factorization method a                   validation. The result of each run is a list of recommended
weighted least squares error loss function is minimized. To               items (albums) for each user in the test set, sorted by the
this end user-item interactions pij are signaled with a 1 and             preference that the user would have for that item. We cal-
missing interactions are marked with a 0. The counts of                   culate MAP and nDCG for each list recommended to a
user-item interactions (e.g. playcounts Yij ) are translated              user, judging an item as relevant whether it was consumed
into a confidence measure wij , which in the case of the HK               (played) at least once by the user. Results can be seen in
method correspond to pij + αYij , and in the case of the                  Table 3.
HKlog method a simple log transform is used where:                           In the case of dataset 1, the best results of MAP and
                                                                         nDCG are obtained by recommending the most popular
                          αlog(1 + Yijk ) Yijk > 0                        items. This result is somewhat expected due to the spar-
             wij =                                               (5)      sity of the dataset that affects the methods based on matrix
                          1               Yijk = 0
                                                                          factorization. As shown in Table 2, each album was rated in
   This ”confidence” is then used as a weight in the loss func-           average by just 1.71 users. This situation is not repeated in
tion and the objective function then becomes                              dataset 2, where the average number of users per album is
                                                                          18.52, and then the popularity method performs the worst.
                     X
                     n X
                       m
                                                                             We highlight two results on these initial experiments. The
              min        [wij (pij − hUi∗ Mj∗ i)2                (6)      first one is that the log transformation of raw playcounts
             U,M,C
                      i     j
                                                                          makes HKlog improve clearly over HK on both MAP and
                          λ            λ                                  nDCG measures. The second result we higlight is that logit3
                     +      ||Ui∗ ||2 + ||Mj∗ ||2 ]
                          n            m                                  and logit4 perform better than HK and there is not a big
difference in performance with HKlog, leading us investigate             Conference on User Modeling, Adaptation, and
further to confirm this difference.                                      Personalization, 2009.
                                                                     [7] G. Jawaheer, M. Szomszor, and P. Kostkova.
6. CONCLUSIONS AND FUTURE WORK                                           Characterisation of explicit feedback in an online
   In this paper, we continue the work that we started in [8]            music recommendation service. In Proceedings of the
to create a model that allows us to map implicit to explicit             fourth ACM conference on Recommender systems,
user behavior. Using MAP and nDCG metrics, we show                       RecSys ’10, pages 317–320, 2010.
that our method is comparable to state of the art methods            [8] D. Parra and X. Amatriain. Walk the talk: Analyzing
that provides recommendations making use of implicit user                the relation between implicit and explicit feedback for
feedback.                                                                preference elicitation. In Proc. of the 2011 Conference
   The results that we have obtained, part of which we show              on User Modeling, Adaptation, and Personalization,
on this paper, give us some insights but they mainly open                2011.
research questions that we need to analyze further. We have          [9] M. Morita and Y. Shinoda. Information filtering based
confirmed in our dataset the benefits of applying a log trans-           on user behavior analysis and best match text
formation to the raw user feedback in the Hu et al. model,               retrieval. In SIGIR ’94: Proceedings of the 17th annual
showing consistently better results than the unmodified ver-             international ACM SIGIR conference, pages 272–281,
sion.                                                                    New York, NY, USA, 1994. Springer-Verlag New
   In terms of the questions we need to further analyze, up              York, Inc.
to this point, we have considered the factors implicit feed-        [10] Joseph A. Konstan, Bradley N. Miller, David Maltz,
back and global popularity in our logistic regression models             Jonathan L. Herlocker, Lee R. Gordon, and John
as ordinal variables. We coded these variables on this way               Riedl. Grouplens: applying collaborative filtering to
to make sure that we were doing an appropiate diverse sam-               usenet news. Commun. ACM, 40(3):77–87, 1997.
pling when creating the user survey described in [8]. How-          [11] D. Oard and J. Kim. Modeling information content
ever, there is no constraint to rather use the raw playcounts            using observable behavior. In Proc. of the ASIST
for both factors aforementioned, and we think that this mod-             Annual Meeting, pages p481–88, 2001.
ification can benefit the results of our implicit-to-explicit lo-   [12] N.S. Koh, N. Hu, and E. K. Clemons. Do online
gistic regression model.                                                 reviews reflect a product’s true perceived quality? - an
   On the experiments run on this study, since we are not                investigation of online movie reviews across cultures.
predicting user ratings but rather user preference, metrics              Electronic Commerce Research and Applications, 2010.
such as RMSE or MAE can not be used to compare the                  [13] T. Lee, Y. Park, and Y. Park. A time-based approach
methods so we opt for IR metrics such as MAP and nDCG,                   to effective recommender systems using implicit
which rely on how we define relevancy. We wonder if our def-             feedback. Expert Syst. Appl., 34(4):3055–3062, 2008.
inition of relevance might bias our results and conclusions.        [14] Gawesh Jawaheer, Martin Szomszor, and Patty
As we have stated it before, we think that low feedback                  Kostkova. Comparison of implicit and explicit
might be, in fact, negative feedback. For this reason, we are            feedback from an online music recommendation
currently testing different user activity (implicit feedback)            service. In Proceedings of the 1st International
thresholds to define relevancy in order to analyze how that              Workshop on Information Heterogeneity and Fusion in
influences the evaluation of the different recommendation                Recommender Systems, 2010.
approaches.                                                         [15] S. Kordumova, I. Kostadinovska, M. Barbieri,
                                                                         V. Pronk, and J. Korst. Personalized implicit learning
7. REFERENCES                                                            in a music recommender system. In UMAP 2010, 2010.
 [1] Y. Hu, Y. Koren, and C. Volinsky. Collaborative                [16] Gediminas Adomavicius and Alexander Tuzhilin.
     filtering for implicit feedback datasets. In Proceedings            Context-aware recommender systems. In Francesco
     of ICDM 2008, 2008.                                                 Ricci, Lior Rokach, Bracha Shapira, and Paul B.
 [2] F. Ricci, L. Rokach, B. Shapira, and P. B. Kantor,                  Kantor, editors, Recommender Systems Handbook,
     editors. Recommender Systems Handbook. Springer,                    pages 217–253. Springer US, 2011.
     2011.                                                          [17] J. Neter, M. H. Kutner, C. J. Nachtsheim, and
 [3] Douglas Oard and Jinmook Kim. Implicit feedback for                 W. Wasserman. Applied Linear Statistical Models.
     recommender systems. In in Proceedings of the AAAI                  Irwin, Chicago, 1996.
     Workshop on Recommender Systems, pages 81–83,                  [18] Jonathan L. Herlocker, Joseph A. Konstan, Loren G.
     1998.                                                               Terveen, and John T. Riedl. Evaluating collaborative
 [4] G. Potter. Putting the collaborator back into                       filtering recommender systems. ACM Trans. Inf. Syst.,
     collaborative filtering. In 2nd KDD Workshop on                     22(1):5–53, 2004.
     Large-Scale Recommender Systems and the Netflix                [19] Christopher D. Manning, Prabhakar Raghavan, and
     Prize Competition, 2008.                                            Hinrich Schtze. Introduction to Information Retrieval.
 [5] D. M. Nichols. Implicit rating and filtering. In In                 Cambridge University Press, New York, NY, USA,
     Proceedings of the Fifth DELOS Workshop on                          2008.
     Filtering and Collaborative Filtering, pages 31–36,            [20] Kalervo Järvelin and Jaana Kekäläinen. Cumulated
     1997.                                                               gain-based evaluation of ir techniques. ACM Trans.
 [6] X. Amatriain, J.M. Pujol, and N. Oliver. I like it... i             Inf. Syst., 20:422–446, October 2002.
     like it not: Evaluating user ratings noise in
     recommender systems. In Proc. of the 2009