Negative-Aware Collaborative Filtering

Sheng-Chieh Lin, Academia Sinica, Taiwan, jacklin_64@citi.sinica.edu.tw
Yu-Neng Chuang, National Chengchi University, Taiwan, 107753011@nccu.edu.tw
Sheng-Fang Yang, National Chengchi University, Taiwan, 106753011@nccu.edu.tw
Ming-Feng Tsai, National Chengchi University, Taiwan, mftsai@nccu.edu.tw
Chuan-Ju Wang, Academia Sinica, Taiwan, cjwang@citi.sinica.edu.tw

ABSTRACT
Most traditional recommender systems regard unseen user-item associations as negative user preferences and optimize recommendation models mainly based on observed associations and some negative instances sampled from unseen associations. However, such unseen user-item associations may contain potential positive user preferences on items and are not uniformly distributed in terms of the possibility of being negative (or positive) user preferences; therefore, it is essential to quantify such associations for model training. In contrast to existing recommendation models, which treat all unseen associations equally as negative samples, we present a negative-aware recommendation approach that explicitly models the likelihood of each unseen association being a potentially positive preference. Empirical results on real-world datasets in different fields show that our approach consistently improves recommendation performance.

KEYWORDS
recommendation, collaborative filtering, unseen associations, asymmetric user similarity

ACM RecSys 2019 Late-breaking Results, 16th-20th September 2019, Copenhagen, Denmark
Copyright ©2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

INTRODUCTION
With the rapid development of online services over the last decade, recommender systems have gained much importance, finding use in areas such as music, news, movies, books, and products in general. For such an important problem, collaborative filtering (CF) is a common yet powerful approach that generates user recommendations using only user-item interaction data [3]. Some CF-based algorithms have been shown to yield reasonable performance across diverse situations and have been used in many real-world applications [7].

One major challenge for CF-based recommendation algorithms is the sparsity of interaction data; that is, most users provide feedback on only a few items. This challenge arises because most recommendation scenarios involve an extremely large pool of items, so it is infeasible to expect user feedback for most items. As shown in Figure 1(a), traditional model-based collaborative filtering takes this into account by treating observed interactions as positive associations and treating the majority of unseen interactions as negative ones. However, this approach introduces noise into the modeling process, as unseen interactions are not necessarily negative instances.

Figure 1: An illustrative example for negative-aware collaborative filtering. Panels: (a) Traditional CF; (b) Negative-aware CF. Legend: observed association, unseen association.

In the literature, a few studies attempt to implicitly address this problem. For example, weighted regularized matrix factorization (WRMF) [2] treats unseen associations as a kind of uncertainty instead of negative feedback and uses case weights to reduce the impact of negative examples. In addition, Bayesian personalized ranking (BPR) [6] deals with this uncertainty by modeling relative user preference on items.
Despite that, these studies assume that unseen associations are uniformly distributed in terms of the possibility of being negative user preferences and do not differentiate among these associations by quantifying their degree of uncertainty. In this paper, in contrast to existing recommendation models, which treat all unseen associations equally as negative user preferences, we propose quantifying the degree of uncertainty for unseen associations by leveraging user preference similarity, and we explicitly model the likelihood of each unseen association being a potentially positive user preference (illustrated in Figure 1(b)). Note that the proposed quantification of unseen associations is applicable to other recommendation algorithms that use negative sampling. Empirical results on two real-world datasets show that our approach improves recommendation performance.

METHODOLOGY
In collaborative filtering (CF), an interaction matrix, denoted as A = (a_ui) ∈ R^{|U|×|I|}, represents the user-item associations, where U and I denote the sets of users and items, respectively; a_ui = 1 if there exists an observed association between user u ∈ U and item i ∈ I, and a_ui = 0 otherwise. We first introduce the negative-aware matrix N ∈ R^{|U|×|I|} to quantify the uncertainty of unseen user-item associations for later recommendation model training, each element n_ui ∈ N of which is calculated as

\[
n_{ui} =
\begin{cases}
0 & \text{if } \sum_{k=1}^{|U|} \hat{s}_{uk}\, a_{ki} = 0, \\[4pt]
\dfrac{\sum_{k=1}^{|U|} \hat{s}_{uk}\, a_{ki}}{\sum_{k=1}^{|U|} \mathbb{1}_{\{\hat{s}_{uk} > 0\}}\, a_{ki}} & \text{otherwise.}
\end{cases}
\]
[...] preference than user k. Furthermore, the denominator of each element n_ui in N serves as normalization and denotes the number of users who have positive feedback for item i and at the same time have [...] on average very similar to user u, item i is likely to match user u's preference. Thus, n_ui (or 1 − n_ui) can be interpreted as the likelihood of the association between user u and item i being a positive (or negative, respectively) preference.

Figure 2: Asymmetric user preference similarity. Panels: (c) Asymmetric relations; (d) User similarity matrix (asymmetric).

We then tailor the negative-aware matrix using pointwise and pairwise approaches to account for implicit user feedback for recommendation. Both approaches attempt to estimate the latent factors of the following two sets: θ_U, θ_I ⊆ Θ, where θ_U ∈ R^{|U|×d} for users, θ_I ∈ R^{|I|×d} for items, d is the dimension of the low-rank latent factor space, and Θ is a superset of θ_I and θ_U consisting of all the parameters in the model. Let θ_u (θ_i) denote the row vector for user u (item i, respectively) from θ_U (θ_I, respectively). Of the pointwise approaches, the most representative method is matrix factorization (MF).
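
As a rough illustration of how the negative-aware matrix N defined above could be computed from an interaction matrix A, the following Python sketch applies the piecewise formula for n_ui entry by entry. The exact definition of the asymmetric similarity ŝ_uk is not available in this text, so `asymmetric_similarity` below is a hypothetical stand-in (the fraction of user u's items that user k has also interacted with, with a zero diagonal) and is not necessarily the measure used in the paper. All function names are illustrative.

```python
def asymmetric_similarity(A):
    """Hypothetical stand-in for s_hat: S[u][k] = |I_u & I_k| / |I_u|.

    This is an assumed measure chosen only because it is asymmetric;
    the diagonal is set to 0 so a user does not vote for their own items.
    """
    n_users = len(A)
    items = [{i for i, a in enumerate(row) if a} for row in A]
    S = [[0.0] * n_users for _ in range(n_users)]
    for u in range(n_users):
        for k in range(n_users):
            if u != k and items[u]:
                S[u][k] = len(items[u] & items[k]) / len(items[u])
    return S


def negative_aware_matrix(A, S):
    """Compute n_ui following the piecewise definition in the text.

    Numerator: sum_k S[u][k] * A[k][i].
    Denominator: number of users k with S[u][k] > 0 and A[k][i] = 1.
    Assumes S is non-negative, so a positive numerator implies a
    positive denominator.
    """
    n_users, n_items = len(A), len(A[0])
    N = [[0.0] * n_items for _ in range(n_users)]
    for u in range(n_users):
        for i in range(n_items):
            num = sum(S[u][k] * A[k][i] for k in range(n_users))
            if num == 0:
                continue  # first case of the definition: n_ui = 0
            den = sum(A[k][i] for k in range(n_users) if S[u][k] > 0)
            N[u][i] = num / den
    return N


if __name__ == "__main__":
    # Toy interaction matrix: 3 users x 3 items (illustrative data only).
    A = [[1, 1, 0],
         [1, 0, 1],
         [0, 0, 1]]
    S = asymmetric_similarity(A)
    N = negative_aware_matrix(A, S)
    for row in N:
        print(row)
```

On this toy matrix the stand-in similarity is asymmetric (user 2's single item is also held by user 1, but only half of user 1's items are held by user 2), and the unseen pair (user 2, item 0) receives n = 1.0 because the only user with non-zero similarity to user 2 has interacted with item 0.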
To incorporate the designed negative-aware matrix into the optimization, we modify the objective function of MF for implicit feedback proposed by [2] as

\[
\mathcal{L}^{N}_{\mathrm{MF}} = \sum_{u,i} \Big[ a_{ui}\,\big(1 - \theta_u^{\top}\theta_i\big)^2 + (1 - a_{ui})\,\big(n_{ui} - \theta_u^{\top}\theta_i\big)^2 \Big] + \lambda \lVert \Theta \rVert_2^2, \tag{1}
\]

where u ∈ U, i ∈ I, and λ is a regularization hyperparameter that prevents overfitting to the observations. Note that Eq. (1) can be seen as a variant of WRMF. In WRMF, however, the case weight is a hyperparameter that must be determined exogenously; in contrast, n_ui plays a role similar to that of the case weight and is shaped by the observed user-item associations.

Table 1: Datasets

Dataset        Movielens   CiteULike
Users (|U|)    938         3,527
Items (|I|)    950         6,339
Feedback       54,413      77,546
Density        6.100%      0.347%

Table 2: Top-N recommendation

                          Movielens                             CiteULike
Method         P@5     MAP@5     P@10     MAP@10     P@5     MAP@5    P@10    MAP@10
MF             0.237   0.169     0.199    0.123      0.060   0.058    0.045   0.048
MF_n-aware     *0.241  *0.173    *0.202   *0.125     0.062   *0.061   *0.048  *0.052
BPR            0.257   0.189     0.211    0.136      0.064   0.064    0.049   0.054
BPR_n-aware    *0.262  ***0.195  **0.214  **0.140    *0.066  0.066    *0.050  0.055

For the pairwise approaches, we integrate the negative awareness of unseen associations into Bayesian Personalized Ranking (BPR) [6], a popular ranking-based model:

\[
\mathcal{L}^{N}_{\mathrm{BPR}} = -\sum_{u,(i,i')} \log\!\left(\frac{1}{n_{ui'}}\right) \log \sigma\!\big(\theta_u^{\top}\theta_i - \theta_u^{\top}\theta_{i'}\big) + \lambda \lVert \Theta \rVert_2^2. \tag{2}
\]

Note that because the proposed negative-aware matrix is independent of the recommendation model, it can be seen as a generic device applicable to other recommendation algorithms that use negative sampling.

EXPERIMENTS
We conduct experiments on two publicly available real-world datasets in different fields: MovieLens-100K¹ and CiteULike². To demonstrate the effectiveness of our negative-aware approach, we implement both pointwise and pairwise recommendation models based on the loss functions in Eqs. (1) and (2), and compare the models with and without the proposed negative-aware matrix. For all approaches, we set the dimension of latent factors d to 64 and the number of negative samples to 5. The dot product is used as the scoring function for an association given the latent factors of the corresponding user and item. To evaluate model capability on the task of top-N item recommendation, we use two commonly adopted evaluation metrics: precision@N and MAP@N [1]. For each dataset, we randomly divide the observed user-item associations into 80% and 20% as training and testing sets, respectively, and obtain the averaged results by randomly dividing the data 5 times in this manner.

¹ https://grouplens.org/datasets/movielens/
² http://www.wanghao.in/data/ctrsr_datasets.rar

Table 2 tabulates the performance of our negative-aware approach on pointwise and pairwise recommendation models. In the table, *, **, and *** indicate significance levels of p < 0.05, p < 0.005, and p < 0.0005 based on the paired t-test of each negative-aware model with respect to its counterpart; the reported numbers are averaged over the five test results.

We first evaluate our models, denoted as MF_n-aware and BPR_n-aware, on the original datasets with observed positive feedback, the results of which are listed in the top panel of Table 2. Observe that the proposed models in most cases outperform or yield performance comparable to their counterparts. Figure 3 shows the model performance at different training epochs, where each point in the figure denotes the performance in terms of MAP@10. As shown in the figure, our negative-aware models (blue curves) are generally capable of maintaining better performance than the traditional models at each training epoch.
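
For readers who want to reproduce the evaluation, the two metrics reported in Table 2 and plotted in Figure 3 could be computed along the following lines. This is a minimal sketch under stated assumptions: the paper does not spell out its MAP@N normalization, so average precision is divided here by min(N, number of held-out items), which is one common convention, and the function names and the dictionary-based inputs are illustrative rather than taken from the authors' code.

```python
def precision_at_n(ranked, relevant, n):
    """Fraction of the top-n ranked items that appear in the held-out set."""
    hits = sum(1 for item in ranked[:n] if item in relevant)
    return hits / n


def average_precision_at_n(ranked, relevant, n):
    """Average precision over the top-n list.

    Assumption: normalised by min(n, |relevant|); other conventions
    (e.g. dividing by the number of hits) give different values.
    """
    if not relevant:
        return 0.0
    hits, score = 0, 0.0
    for rank, item in enumerate(ranked[:n], start=1):
        if item in relevant:
            hits += 1
            score += hits / rank  # precision at this cut-off
    return score / min(n, len(relevant))


def mean_over_users(metric, rankings, held_out, n):
    """Average a per-user metric over users that have held-out items."""
    users = [u for u in rankings if held_out.get(u)]
    if not users:
        return 0.0
    return sum(metric(rankings[u], held_out[u], n) for u in users) / len(users)


if __name__ == "__main__":
    # Illustrative data: model rankings and held-out test items per user.
    rankings = {"u1": ["a", "b", "c", "d", "e"], "u2": ["c", "a", "b", "e", "d"]}
    held_out = {"u1": {"a", "c", "z"}, "u2": {"d"}}
    print(mean_over_users(precision_at_n, rankings, held_out, 5))
    print(mean_over_users(average_precision_at_n, rankings, held_out, 5))
```

Here `rankings` would hold each user's items sorted by the dot-product score, with training items removed, and `held_out` the 20% test associations; averaging the resulting values over the five random splits gives numbers comparable in kind to those in Table 2.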
These results (Table 2 and Figure 3) indicate that our approach improves the recommendation quality of both pointwise and pairwise CF models.

Figure 3: Performance (MAP@10) at each training stage

CONCLUSIONS
In this work, we present a negative-aware recommendation approach that explicitly addresses the uncertainty of unseen user-item associations. This approach is shown to be applicable to recommendation algorithms that use negative sampling. Empirical results on two real-world datasets show that our approach improves the performance of pointwise and pairwise recommendation models.

REFERENCES
[1] Jonathan L. Herlocker, Joseph A. Konstan, Loren G. Terveen, and John T. Riedl. 2004. Evaluating Collaborative Filtering Recommender Systems. ACM Transactions on Information Systems 22, 1 (2004), 5–53.
[2] Yifan Hu, Yehuda Koren, and Chris Volinsky. 2008. Collaborative Filtering for Implicit Feedback Datasets. In Proceedings of IEEE International Conference on Data Mining (ICDM '08), Vol. 8. 263–272.
[3] Dietmar Jannach, Paul Resnick, Alexander Tuzhilin, and Markus Zanker. 2016. Recommender Systems - Beyond Matrix Completion. Communications of the ACM 59, 11 (2016), 94–102.
[4] Marta Millan, Maria Trujillo, and Edward Ortiz. 2007. A Collaborative Recommender System Based on Asymmetric User Similarity. In Intelligent Data Engineering and Automated Learning - IDEAL 2007, Hujun Yin, Peter Tino, Emilio Corchado, Will Byrne, and Xin Yao (Eds.). 663–672.
[5] Parivash Pirasteh, Dosam Hwang, and Jason J. Jung. 2015. Exploiting Matrix Factorization to Asymmetric User Similarities in Recommendation Systems. Knowledge-Based Systems 83 (2015), 51–57.
[6] Steffen Rendle, Christoph Freudenthaler, Zeno Gantner, and Lars Schmidt-Thieme. 2009. BPR: Bayesian Personalized Ranking from Implicit Feedback. In Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence (UAI '09). 452–461.
[7] Xiaoyuan Su and Taghi M. Khoshgoftaar. 2009. A Survey of Collaborative Filtering Techniques. Advances in Artificial Intelligence 2009 (2009), Article ID 421425.