Assessing the Impact of a User-Item Collaborative Attack on Class of Users∗

Yashar Deldjoo, Tommaso Di Noia, Felice Antonio Merra†
Polytechnic University of Bari, Bari, Italy
yashar.deldjoo@poliba.it, tommaso.dinoia@poliba.it, felice.merra@poliba.it

Abstract

Collaborative Filtering (CF) models lie at the core of most recommendation systems due to their state-of-the-art accuracy. They are commonly adopted in e-commerce and online services for their impact on sales volume and/or diversity, and consequently on companies' outcomes. However, CF models are only as good as the interaction data they work with. As these models rely on outside sources of information, counterfeit data such as user ratings or reviews can be injected by attackers to manipulate the underlying data and alter the resulting recommendations, thus implementing a so-called shilling attack. While previous works have focused on evaluating shilling attack strategies from a global perspective, paying particular attention to the effect of the size of attacks and of the attacker's knowledge, in this work we explore the effectiveness of shilling attacks under novel aspects. First, we investigate the effect of attack strategies crafted on a target user in order to push the recommendation of a low-ranking item to a higher position, referred to as a user-item attack. Second, we evaluate the effectiveness of attacks in altering the output of different CF models by contemplating the class of the target user, from the perspective of the richness of her profile (i.e., slightly-active vs. highly-active user). Finally, similar to previous work, we contemplate the size of the attack (i.e., the amount of fake profiles injected) in examining its success. The results of experiments on two widely used datasets from the business and movie domains, namely Yelp and MovieLens, suggest that highly-active and slightly-active users exhibit contrasting behaviors in datasets with different characteristics.

1 Introduction and Related Work

Collaborative filtering (CF) models are a crucial component in many real-world recommendation services due to their state-of-the-art accuracy. Considering their widespread popularity and adoption in industry, the output of these models can impact many decisions in different application scenarios [3, 16, 28]. The open nature of CF models, which rely on user-specified judgments (e.g., ratings or reviews) to build user profiles and compute recommendations, can be exploited by adversaries to manipulate the underlying data and affect the resulting recommendations, a phenomenon commonly referred to as shilling attacks [11, 19]. The attacker may manipulate the recommender for positive motivations, like improving outcomes, or malicious ones, like reducing users' loyalty to a competitor.

In this direction, the first works [15, 19, 24] focused on different profile injection strategies, analyzing and classifying them by the required effort and the amount of attacker's knowledge needed to craft successful attacks. These works have been followed by multiple studies on the evaluation of the robustness of different CF models [4, 7, 22] and on detection strategies [8, 18, 29]. The robustness analyses in surveys [11, 21] show that Item-kNN is more robust than User-kNN, and that model-based CF is generally more resistant to shilling attacks than conventional nearest-neighbor-based algorithms.

One common characteristic of the previous literature on shilling attacks on CF-RS is its focus on assessing the global impact of shilling attacks on different CF models, examining the success of attacks from the perspective of the attacker's knowledge and the size of the attack (i.e., the number of shilling profiles) [11]. In the present work, instead, we investigate the effectiveness of an attack on a target item of a target user, namely a user-item attack, with a novel point of attention focused on the influence of the attack on the classes of attacked users, in particular highly-active (HA) and slightly-active (SA) users.

The application scenarios for a class-based study of attacks on RS span different domains. As an example, a restaurant owner may wish to diminish the trust of a target user in a competitor by pushing a low-ranked product for that specific user. The same argument can be made for new users: an attacker may be interested in pushing or nuking particular products with the objective of modifying the output of a recommender system in order to affect the future interactions of the new user. The leading research questions of this work are then:

• RQ1: From a global perspective, what is the impact of a user-item attack on classes of users such as slightly-active and highly-active users?
  – Could attacks be tailored to have a higher impact on a particular class of users?
  – Which factors play a role in the impact of such an attack?
• RQ2: From a local perspective, how do CF recommendation models behave differently under user-item attacks when looking at user classes?

The remainder of the paper is structured as follows. Section 2 presents the evaluation protocol and the description of the datasets used in our experimental evaluation. Section 3 reports on the results and their discussion.
Section 4 concludes the paper and introduces future perspectives.

∗ Copyright 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). Presented at the ImpactRS workshop held in conjunction with the 13th ACM Conference on Recommender Systems (RecSys), 2019, in Copenhagen, Denmark.
† Authors are listed in alphabetical order. Corresponding author: Felice Antonio Merra (felice.merra@poliba.it).

2 User-Item attack modeling and evaluation

In this section, we discuss our evaluation protocol for user-item attack modeling and the corresponding evaluation setup.

2.1 Evaluation Protocol

In order to test the effects of a user-item attack on the attacked user classes, an extensive set of experiments has been carried out with respect to three dimensions: (i) the attack strategy (type and quantity of injected profiles), (ii) the core CF recommendation model, and (iii) the user class. The experimental evaluation has been executed on two well-known datasets, MovieLens-1M (ML-1M) and Yelp (described in Section 2.2).

2.1.1 Attack Strategies. We have implemented two attack strategies to craft shilling profiles (SP), modeling different levels of attacker capability. Given a user profile P(u) = {r_{i_1}, ..., r_{i_n}} (consisting of the set of items rated by user u), we consider the items in P(u) in the form of selected items (I_S), filler items (I_F), and the target item (I_T), as previously identified in [6], with |I_S| + |I_F| + |I_T| = |P(u)|. The items in the set I_F are selected randomly in order to obstruct detection of an SP, while the only element in I_T is the item that the attacker wants to push or nuke. Here we focus on two strategies to build I_S, which lies at the core of shilling profile generation. The number of items in a shilling profile is close to the mean number of ratings per profile in the dataset. We execute two types of attacks:

• User-and-Model aware attack (UMA): assumes partial knowledge of some of the victim's preferences. The attacker creates a new profile on the system with these preferences, called the seed profile, and uses the recommendation system to receive recommendations. The recommendations are then used to fill I_S with high ratings. This type of attack is inspired by the probe attack [2, 6, 11], in which the seed profile is created by the adversary and the recommendations generated by the recommender system are used to learn related items and their ratings, in order to build shilling profiles very similar to those of existing users in the system. These items constitute 50% of each shilling profile.

• User-Neighbor aware attack (UNA): assumes that the attacker knows some users similar to the victim. We implement this attack by computing the k-nearest-neighbor users of each victim (experimental setting: k = 50, similarity metric = cosine similarity) and selecting the most rated items in the neighborhood to fill I_S. This attack is a modified version of the bandwagon or popular attack [25]: while the bandwagon attack sets high ratings on the popular items of the system, the proposed attack sets high ratings on the popular items inside the victim's neighborhood, in order to inject profiles capable of influencing the victim's recommendations more strongly.

We executed experiments with different sizes of injected profiles, classified into small-size attacks, for which we average the results of attacks with 2, 10, 20, and 50 shilling profiles, and large-size attacks, for which we average the results of attacks with 200 and 500 injected profiles.

2.1.2 CF Models. In our evaluation, we compared the vulnerability/robustness of the following CF models:

User-kNN [5]: user-based k-nearest-neighbor (kNN) method. In our experiments, we set the number of neighbors k to 20 [19].

Item-kNN [27]: item-based kNN method. Also in this case, the number of neighbors k has been set to 20.

BPR-SLIM [23]: the Sparse LInear Method (SLIM) is an item-item model that casts the estimation of unknown user-item ratings as a regression problem. It learns a sparse aggregation coefficient matrix from the aggregated users' preferences, allowing the system to capture correlations between items. BPR-SLIM uses the BPR optimization criterion.

BPR-MF [26]: this method uses matrix factorization (MF) as its underlying core predictor and optimizes it with the Bayesian Personalized Ranking (BPR) objective function.

These CF models stand for state-of-the-art models for the item recommendation task, each using a different prediction concept, allowing us to study the impact of different attack strategies from multiple viewpoints. The CF comparative models have been computed with the publicly available software library MyMediaLite (http://www.mymedialite.net/); we used default parameters for both BPR-MF and BPR-SLIM.

2.1.3 User Classes. Given that CF models only rely on user preference scores (i.e., ratings) to compute recommendations, we hypothesize that it is relevant to investigate the impact of different attack strategies with respect to the victim user's level of activity, i.e., the richness of her profile, calculated on the basis of the number of ratings available in her profile. To this aim, we define two classes of users:

• Highly-active (HA) users: users whose number of ratings is greater than the second quartile (i.e., the median) of the number of ratings per user in the dataset.
• Slightly-active (SA) users: users whose number of ratings is lower than the second quartile.

2.1.4 Evaluation Metric. Several metrics have already been proposed to evaluate malicious attacks. For example, [24] proposes the prediction shift (PS), which estimates the success of an attack by measuring the prediction difference before and after the attack [30]. However, it has been shown that a strong PS does not necessarily imply an effective attack [20]. From the perspective of the attacker, the ideal goal of a push attack is to increase the chance of a desired item being recommended after the attack with respect to before it. We therefore use a modified version of the Hit-Ratio [17] to measure the fraction of successful attacks on a set of user-item pairs.

Definition 1. Let u be the user under attack and i the target item that the attacker wants to push into the top-k recommendations of u. Let top_u^k be the top-k recommendation list of u, and let ϕ(i, top_u^k) be the function that evaluates the effectiveness of the attack on (u, i): ϕ(i, top_u^k) = 1 if i is pushed into the top-k (successful attack), and ϕ(i, top_u^k) = 0 otherwise (unsuccessful attack). Let S be the set of (u, i) user-item pairs under attack. HR@k is defined as the fraction of successful attacks over the pairs in S:

    HR@k = ( Σ_{(u,i) ∈ S} ϕ(i, top_u^k) ) / |S|        (1)

where |S| is the number of (u, i) pairs over which HR@k is measured.

2.2 Data Descriptions

We conducted experiments on two well-known datasets, MovieLens 1M [12] and Yelp [13, 14]. The datasets represent different item recommendation scenarios, for the movie and business domains, and have data densities that differ by a factor of approximately 40. Table 1 summarizes the statistics of the two datasets (after pre-processing).

Table 1: Characteristics of the datasets used in the offline experiments: |U| is the number of users, |I| the number of items, |R| the number of ratings.

Dataset   |U|    |I|    |R|        |R| / (|I| · |U|) × 100
ML-1M     6040   3706   1000209    4.468%
Yelp      5135   5163   24809      0.093%

MovieLens-1M: We used the million-rating version of the dataset (ML-1M), which contains 1M ratings of users for items (movies). We used the original ML-1M dataset for the experiments without any filtering.

Yelp: This dataset contains ratings of users for businesses. We used the pre-processed version of the dataset provided by [13, 14], with 731K ratings of 25K users for 25K businesses. Given the large number of users and items from which item-item or user-user similarities have to be computed, similar to [1] we extracted a random sample of 5K users and 5K items in order to speed up the experiments. The resulting dataset contains 24.8K ratings, with a data density (0.093%) comparable to the one before filtering (0.110%).

3 Results and Discussion

In order to validate the empirical impact of the attack types under study on different classes of users, an extensive set of experiments has been carried out with respect to the dimensions introduced in Section 2.1. The final results are presented in Table 2 and discussed from the following viewpoints:

• a global analysis of the impact of attacks on user classes (cf. Section 3.1);
• a fine-grained analysis of the impact of attacks on user classes, looking into the CF models and attack types (cf. Section 3.2).

We present each of these analysis viewpoints in the following subsections.

3.1 Global impact of attacks on user classes

The goal of this analysis is to answer the first research question, related to the global assessment of the effectiveness of the user-item attack with respect to the identified user classes. We use the term global here since, in this analysis, we abstract away from the impact of attacks on individual CF models and from the attack quality (type) and/or quantity, as these have been largely addressed in previous works [11, 21, 22]. Instead, we examine the impact of attacks along the dimension of user classes by looking at the aggregate mean values computed across CF models on the two datasets adopted in our experimental evaluation.

A general observation for the results in Table 2 is that larger-size attacks reach a higher level of effectiveness on both classes of users (highly-active and slightly-active) in comparison with smaller-size attacks. For example, on the ML-1M dataset, the average HR@10 for the UNA attack on highly-active users (across CF models) is 0.256 for small-size attacks, while it is 0.800 for large-size attacks, a difference of approximately three times. The same pattern of results is obtained in the other experimental cases. These results are in line with those presented in previous works [21, 22].

Our objective here is to study the impact of different attack strategies on user classes. For this purpose, we define the variable r = HR_HA / HR_SA and refer to it as the user-class attack impact, i.e., the impact of an attack on highly-active users in comparison with slightly-active users. Different values of r are interpreted as follows:

• r = 1: the attack has an equal impact on highly-active and slightly-active users.
• r > 1: the attack has an unequal impact; the impact on highly-active users is relatively higher than on slightly-active users (equivalently, slightly-active users are relatively more immune to the attack than highly-active users).
• r < 1: the attack has an unequal impact; the impact on slightly-active users is relatively higher than on highly-active users.

Obviously, the more r deviates from the center point 1, the larger is the attack's success in differentiating highly-active from slightly-active users in one of the above-mentioned directions (r < 1 or r > 1). Before starting a deeper analysis of the results, we highlight that the most interesting values are in the left portion of Table 2 (small-size attacks), because when the size of the attack is larger, the attack reaches the maximum effectiveness, HR = 1, independently of the user class.

By looking at the results for each attack size in Table 2, we can see that the average user-class impact is higher than 1 for the Yelp dataset and lower than 1 for the ML-1M dataset. These results show that both attack types have an unequal impact on slightly-active vs. highly-active users, as r ≠ 1. However, the class of users on which they have the larger impact differs and contrasts between the two datasets. As an example, in Yelp and for UMA, one can note that the average r is 2.393 for small-size attacks and 1.832 for large-size attacks, while the corresponding values on ML-1M are 0.658 and 0.909, respectively. This means that the impact of attacks is higher on highly-active users on the Yelp dataset (average r > 1), differently from ML-1M (average r < 1).

We conjecture that the above contrasting behaviors are directly linked to characteristics of the datasets such as their sparsity. As shown in Table 1, the Yelp dataset is approximately 40 times sparser than ML-1M, and we consider this difference the main possible cause of the contrasting outcomes on the tested datasets. We try to provide a possible explanation here. In the sparser dataset (i.e., Yelp), users with a small number of ratings (slightly-active users) are more immune to attacks because they have a smaller support size of the user profile (i.e., the user profile is not rich enough for the attacker to be able to mimic it in a crafted way). In contrast, highly-active users are more immune to attacks in ML-1M, with its higher density, because their recommendations rely on neighbors with (very) rich user profiles. Put simply, the crafted attacks need to use a large number of profiles to be able to alter the recommendations of the target user.
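To make the quantities used in this analysis concrete, the following minimal sketch (our own illustration, not code from the paper; the data and function names are hypothetical) computes HR@k from Definition 1 and the user-class attack impact r, splitting users into HA and SA classes by the median profile size, as in Section 2.1.3:

```python
from statistics import median

def hit_ratio(attacked_pairs, topk):
    """HR@k (Eq. 1): fraction of attacked (user, item) pairs whose
    target item appears in the user's post-attack top-k list."""
    hits = sum(1 for u, i in attacked_pairs if i in topk[u])  # phi(i, top_u^k)
    return hits / len(attacked_pairs)

def user_class_impact(attacked_pairs, topk, ratings_per_user):
    """r = HR_HA / HR_SA: r > 1 means highly-active users are hit harder."""
    med = median(ratings_per_user.values())  # second quartile of profile sizes
    ha = [(u, i) for u, i in attacked_pairs if ratings_per_user[u] > med]
    sa = [(u, i) for u, i in attacked_pairs if ratings_per_user[u] < med]
    return hit_ratio(ha, topk) / hit_ratio(sa, topk)

# Toy illustration with made-up post-attack top-10 lists for four users.
ratings_per_user = {"u1": 250, "u2": 300, "u3": 12, "u4": 8}
topk = {"u1": {"t"}, "u2": {"t"}, "u3": {"t"}, "u4": set()}
pairs = [("u1", "t"), ("u2", "t"), ("u3", "t"), ("u4", "t")]
print(hit_ratio(pairs, topk))                            # 0.75
print(user_class_impact(pairs, topk, ratings_per_user))  # 2.0
```

Here r = 2.0 would read as the attack hitting highly-active users twice as hard as slightly-active ones, i.e., the r > 1 case described above.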
The insight on sparsity is an important indication that data characteristics play a role in the effectiveness of attacks, and it motivates further research in this direction.

Table 2: HR@10 for small-size and large-size attacks with respect to the class of user, slightly-active (SA) and highly-active (HA), and the CF model. The user-class impact r is the ratio of the HR_HA value to HR_SA.

                           Small-size attacks                          Large-size attacks                          Overall
Dataset  Attack      U-kNN  I-kNN  BPR-SLIM  BPR-MF  mean       U-kNN  I-kNN  BPR-SLIM  BPR-MF  mean       mean
Yelp     UMA    SA   0.750  0.067  0.225     0.108   0.288      0.967  0.184  0.500     0.533   0.546      0.417
                HA   0.800  0.350  0.492     0.117   0.440      1.000  0.667  0.784     0.584   0.758      0.599
                r    1.067  5.243  2.184     1.079   2.393      1.034  3.632  1.567     1.095   1.832      2.113
         UNA    SA   0.850  0.625  0.792     0.400   0.667      1.000  0.834  1.000     1.000   0.958      0.813
                HA   0.875  0.742  0.850     0.433   0.725      1.000  0.850  1.000     1.000   0.963      0.844
                r    1.029  1.186  1.074     1.082   1.093      1.000  1.020  1.000     1.000   1.005      1.049
ML-1M    UMA    SA   0.302  0.155  0.267     0.121   0.211      0.897  0.086  0.586     0.328   0.474      0.343
                HA   0.092  0.108  0.159     0.125   0.121      0.383  0.150  0.350     0.284   0.292      0.206
                r    0.303  0.698  0.593     1.037   0.658      0.427  1.744  0.597     0.866   0.909      0.783
         UNA    SA   0.621  0.302  0.595     0.164   0.420      1.000  0.897  1.000     0.811   0.927      0.673
                HA   0.459  0.133  0.250     0.183   0.256      1.000  0.800  0.800     0.600   0.800      0.528
                r    0.739  0.442  0.421     1.121   0.680      1.000  0.892  0.800     0.740   0.858      0.769

3.2 Fine-grained analysis of the impact of attacks on user classes

The goal of this analysis is to study how different CF models behave against the attacks: which ones perform similarly and which ones differently. This study resembles previous work on shilling attacks against CF models; however, here we also take into account the impact of the attack on user classes. Instead of looking at individual CF model performances and attack types, we compute the pairwise Pearson correlation between each pair of analyzed CF models.

Figure 1: Heat-map of the correlation coefficient (ρ) of different measures between CF models for small-size attacks: (a) HR@10 on slightly-active users, (b) HR@10 on highly-active users.

Figure 1 indicates a strong correlation on HR@10 between BPR-SLIM and Item-kNN (ρ = 0.960 in Figure 1a and ρ = 0.993 in Figure 1b). We justify this value by the fact that both CF models exploit the item-item similarity computation. Looking at the correlation values for User-kNN in Figure 1, one can observe a slightly lower correlation in the case of slightly-active users with respect to the other models. We think that this phenomenon comes from the fact that the tested attacks are based on user preferences, which achieve a good effect even with small-size attacks. For instance, HR@10 for Yelp on slightly-active users (0.750 and 0.850) is higher than the mean values of the other models for both attacks (mean = 0.288 and 0.667). We can also observe an interesting behavior when we compare ρ of BPR-MF with BPR-SLIM and Item-kNN: Figures 1(a) and 1(b) show that HR@10 on both classes of attacked users is highly correlated (ρ ≥ 0.840). Finally, the results in Table 2 show that BPR-MF is the model least influenced by user classes, because the user-class impact factor is close to 1 for each class of users and attacks.

4 Conclusion and Future Work

This work investigates the effect of user-item attacks on classes of users. In particular, we investigated the effectiveness of attacks from a global and a local perspective by varying the quality and quantity of attacks, the target user class, and the collaborative filtering recommendation model.

Experimental results on the Yelp and MovieLens datasets indicate that on Yelp slightly-active users are more immune to shilling attacks than highly-active users, a characteristic that is in contrast with the results on MovieLens, where highly-active users are more immune than slightly-active users. As the datasets have very different sparsity (Yelp is approximately 40 times sparser than MovieLens), we will direct our future work towards analyzing the effect of dataset properties under different attack scenarios.

From a local perspective, we show that BPR-MF is less influenced than the other models when varying the user class and attack type, while BPR-SLIM and Item-kNN have shown similar behavior with respect to the effect of attacks on user classes. In the future, we also plan to extend our study by considering more datasets from different domains, exploring in an extensive way the influence of dataset properties, such as sparsity, user and item skewness, and rating variance, on the effectiveness of different types of attacks. It is also of interest to us to consider the impact of various shilling attack types on CF models that use item content as side information [9, 10]. These studies give important insights into the impact of shilling attacks on recommender systems and provide clues on how to reduce their effectiveness by working on dataset characteristics.

References

[1] Gediminas Adomavicius and Jingjing Zhang. 2015. Improving Stability of Recommender Systems: A Meta-Algorithmic Approach. IEEE Trans. Knowl. Data Eng. 27, 6 (2015), 1573–1587. https://doi.org/10.1109/TKDE.2014.2384502
[2] Charu C. Aggarwal. 2016. Recommender Systems - The Textbook. Springer. https://doi.org/10.1007/978-3-319-29659-3
[3] Xavier Amatriain and Justin Basilico. 2015. Recommender Systems in Industry: A Netflix Case Study. In Recommender Systems Handbook, Francesco Ricci, Lior Rokach, and Bracha Shapira (Eds.). Springer, 385–419. https://doi.org/10.1007/978-1-4899-7637-6_11
[4] Runa Bhaumik, Chad Williams, Bamshad Mobasher, and Robin Burke. 2006. Securing collaborative filtering against malicious attacks through anomaly detection. In Proceedings of the 4th Workshop on Intelligent Techniques for Web Personalization (ITWP'06), Boston, Vol. 6. 10.
[5] John S. Breese, David Heckerman, and Carl Myers Kadie. 1998. Empirical Analysis of Predictive Algorithms for Collaborative Filtering. In UAI '98, Madison, Wisconsin, USA. 43–52.
[6] Robin Burke, Bamshad Mobasher, Roman Zabicki, and Runa Bhaumik. 2005. Identifying attack models for secure recommendation. In Beyond Personalization 2005.
[7] Zunping Cheng and Neil Hurley. 2010. Robust Collaborative Recommendation by Least Trimmed Squares Matrix Factorization. In ICTAI 2010, Arras, France, Vol. 2. 105–112. https://doi.org/10.1109/ICTAI.2010.90
[8] Paul-Alexandru Chirita, Wolfgang Nejdl, and Cristian Zamfir. 2005. Preventing shilling attacks in online recommender systems. In WIDM 2005, Bremen, Germany. 67–74. https://doi.org/10.1145/1097047.1097061
[9] Yashar Deldjoo, Maurizio Ferrari Dacrema, Mihai Gabriel Constantin, Hamid Eghbal-zadeh, Stefano Cereda, Markus Schedl, Bogdan Ionescu, and Paolo Cremonesi. 2019. Movie genome: alleviating new item cold start in movie recommendation. User Model. User-Adapt. Interact. 29, 2 (2019), 291–343. https://doi.org/10.1007/s11257-019-09221-y
[10] Yashar Deldjoo, Markus Schedl, Paolo Cremonesi, and Gabriella Pasi. 2018. Content-Based Multimedia Recommendation Systems: Definition and Application Domains. In IIR 2018, Rome, Italy. http://ceur-ws.org/Vol-2140/paper15.pdf
[11] Ihsan Gunes, Cihan Kaleli, Alper Bilge, and Huseyin Polat. 2014. Shilling attacks against recommender systems: a comprehensive survey. Artif. Intell. Rev. 42, 4 (2014), 767–799. https://doi.org/10.1007/s10462-012-9364-9
[12] F. Maxwell Harper and Joseph A. Konstan. 2016. The MovieLens Datasets: History and Context. TiiS 5, 4 (2016), 19:1–19:19. https://doi.org/10.1145/2827872
[13] Xiangnan He, Zhankui He, Xiaoyu Du, and Tat-Seng Chua. 2018. Adversarial Personalized Ranking for Recommendation. In SIGIR 2018, Ann Arbor, MI, USA. 355–364. https://doi.org/10.1145/3209978.3209981
[14] Xiangnan He, Hanwang Zhang, Min-Yen Kan, and Tat-Seng Chua. 2016. Fast Matrix Factorization for Online Recommendation with Implicit Feedback. In SIGIR 2016, Pisa, Italy. 549–558. https://doi.org/10.1145/2911451.2911489
[15] Shyong K. Lam and John Riedl. 2004. Shilling recommender systems for fun and profit. In WWW 2004, New York, NY, USA. 393–402. https://doi.org/10.1145/988672.988726
[16] Greg Linden, Brent Smith, and Jeremy York. 2003. Industry Report: Amazon.com Recommendations: Item-to-Item Collaborative Filtering. IEEE Distributed Systems Online 4, 1 (2003).
[17] Bhaskar Mehta and Wolfgang Nejdl. 2008. Attack resistant collaborative filtering. In SIGIR 2008, Singapore. 75–82. https://doi.org/10.1145/1390334.1390350
[18] Bhaskar Mehta and Wolfgang Nejdl. 2009. Unsupervised strategies for shilling detection and robust collaborative filtering. User Model. User-Adapt. Interact. 19, 1-2 (2009), 65–97. https://doi.org/10.1007/s11257-008-9050-4
[19] Bamshad Mobasher, Robin Burke, Runa Bhaumik, and Chad Williams. 2005. Effective attack models for shilling item-based collaborative filtering systems. Citeseer.
[20] Bamshad Mobasher, Robin Burke, and Jeff J. Sandvig. 2006. Model-based collaborative filtering as a defense against profile injection attacks. In AAAI, Vol. 6. 1388.
[21] Bamshad Mobasher, Robin D. Burke, Runa Bhaumik, and Jeff J. Sandvig. 2007. Attacks and Remedies in Collaborative Recommendation. IEEE Intelligent Systems 22, 3 (2007), 56–63. https://doi.org/10.1109/MIS.2007.45
[22] Bamshad Mobasher, Robin D. Burke, Runa Bhaumik, and Chad Williams. 2007. Toward trustworthy recommender systems: An analysis of attack models and algorithm robustness. ACM Trans. Internet Techn. 7, 4 (2007), 23. https://doi.org/10.1145/1278366.1278372
[23] Xia Ning and George Karypis. 2011. SLIM: Sparse Linear Methods for Top-N Recommender Systems. In ICDM 2011, Vancouver, BC, Canada. 497–506. https://doi.org/10.1109/ICDM.2011.134
[24] Michael P. O'Mahony, Neil J. Hurley, Nicholas Kushmerick, and Guenole C. M. Silvestre. 2004. Collaborative recommendation: A robustness analysis. ACM Trans. Internet Techn. 4, 4 (2004), 344–377. https://doi.org/10.1145/1031114.1031116
[25] Michael P. O'Mahony, Neil J. Hurley, and Guenole C. M. Silvestre. 2005. Recommender Systems: Attack Types and Strategies. In AAAI 2005, Pittsburgh, Pennsylvania, USA. 334–339. http://www.aaai.org/Library/AAAI/2005/aaai05-053.php
[26] Steffen Rendle, Christoph Freudenthaler, Zeno Gantner, and Lars Schmidt-Thieme. 2009. BPR: Bayesian Personalized Ranking from Implicit Feedback. In UAI 2009, Montreal, QC, Canada. 452–461.
[27] Badrul Munir Sarwar, George Karypis, Joseph A. Konstan, and John Riedl. 2001. Item-based collaborative filtering recommendation algorithms. In WWW 10, Hong Kong, China. 285–295. https://doi.org/10.1145/371920.372071
[28] Yue Shi, Martha Larson, and Alan Hanjalic. 2014. Collaborative filtering beyond the user-item matrix: A survey of the state of the art and future challenges. ACM Computing Surveys 47, 1 (2014), 3.
[29] Zhihai Yang and Zhongmin Cai. 2017. Detecting abnormal profiles in collaborative filtering recommender systems. J. Intell. Inf. Syst. 48, 3 (2017), 499–518. https://doi.org/10.1007/s10844-016-0424-5
[30] Mi Zhang, Jie Tang, Xuchen Zhang, and Xiangyang Xue. 2014. Addressing cold start in recommender systems: A semi-supervised co-training algorithm. In SIGIR 2014. ACM, 73–82.