Assessing the Impact of a User-Item Collaborative Attack on Class of Users∗

Yashar Deldjoo, Tommaso Di Noia, Felice Antonio Merra†
Polytechnic University of Bari, Bari, Italy
yashar.deldjoo@poliba.it, tommaso.dinoia@poliba.it, felice.merra@poliba.it

Abstract

Collaborative Filtering (CF) models lie at the core of most recommendation systems due to their state-of-the-art accuracy. They are commonly adopted in e-commerce and online services for their impact on sales volume and/or diversity, and consequently on companies' outcomes. However, CF models are only as good as the interaction data they work with. As these models rely on outside sources of information, counterfeit data such as user ratings or reviews can be injected by attackers to manipulate the underlying data and alter the resulting recommendations, thus implementing a so-called shilling attack. While previous works have focused on evaluating shilling attack strategies from a global perspective, paying particular attention to the effect of the size of attacks and of the attacker's knowledge, in this work we explore the effectiveness of shilling attacks under novel aspects. First, we investigate the effect of attack strategies crafted on a target user in order to push the recommendation of a low-ranking item to a higher position, referred to as a user-item attack. Second, we evaluate the effectiveness of attacks in altering the output of different CF models by contemplating the class of the target user, from the perspective of the richness of her profile (i.e., slightly-active vs. highly-active user). Finally, similar to previous work, we contemplate the size of the attack (i.e., the amount of fake profiles injected) in examining its success. The results of experiments on two widely used datasets from the business and movie domains, namely Yelp and MovieLens, suggest that highly-active and slightly-active users exhibit contrasting behaviors in datasets with different characteristics.

1 Introduction and Related Work

Collaborative filtering (CF) models are a crucial component in many real-world recommendation services due to their state-of-the-art accuracy. Considering their widespread popularity and adoption in industry, the output of these models can impact many decisions in different application scenarios [3, 16, 28]. The open nature of CF models, which rely on user-specified judgments (e.g., ratings or reviews) to build user profiles and compute recommendations, can be exploited by adversaries to manipulate the underlying data and affect the resulting recommendations, a phenomenon commonly referred to as shilling attacks [11, 19]. The attacker may manipulate the recommender for positive motivations, like improving outcomes, or malicious ones, like reducing users' loyalty to a competitor.

In this direction, the first works [15, 19, 24] focused on different profile injection strategies, analyzing and classifying them by the required effort and the amount of attacker's knowledge needed to craft successful attacks. These works have been followed by multiple studies on the evaluation of the robustness of different CF models [4, 7, 22] and on detection strategies [8, 18, 29]. The robustness analyses in surveys [11, 21] show that Item-kNN is more robust than User-kNN, and that model-based CF is generally more resistant to shilling attacks than conventional nearest-neighbor-based algorithms.

One common characteristic of the previous literature on shilling attacks on CF-RS is its focus on assessing the global impact of shilling attacks on different CF models, examining the success of attacks from the perspective of the attacker's knowledge and the size of the attack (i.e., the number of shilling profiles) [11]. In the present work, instead, we investigate the effectiveness of an attack on a target item of a target user, namely a user-item attack, with a novel point of attention focused on the influence of the attack on the classes of attacked users, in particular highly-active (HA) and slightly-active (SA) users.

The application scenarios for a class-based study of attacks on RS span different domains. As an example, a restaurant owner may wish to diminish the trust of a target user in a competitor by pushing a low-ranked product for that specific user. The same argument can be made for new users: an attacker may be interested in pushing or nuking particular products with the objective of modifying the output of a recommender system in order to affect the future interactions of the new user. The leading research questions of this work are then:

• RQ1: From a global perspective, what is the impact of a user-item attack on classes of users such as slightly-active and highly-active users?
  – Could attacks be tailored to have a higher impact on a particular class of users?
  – Which factors play a role in the impact of such an attack?
• RQ2: From a local perspective, how do CF recommendation models behave differently under user-item attacks when looking at user classes?

The remainder of the paper is structured as follows. Section 2 presents the evaluation protocol and the description of the datasets used in our experimental evaluation. Section 3 reports on the results and their discussion.
Section 4 concludes the paper and introduces future perspectives.

∗ Copyright 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). Presented at the ImpactRS workshop held in conjunction with the 13th ACM Conference on Recommender Systems (RecSys), 2019, in Copenhagen, Denmark.
† Authors are listed in alphabetical order. Corresponding author: Felice Antonio Merra (felice.merra@poliba.it).

2 User-Item attack modeling and evaluation

In this section, we discuss our evaluation protocol for user-item attack modeling and the corresponding evaluation setup.

2.1 Evaluation Protocol

In order to test the effects of a user-item attack on the attacked user classes, an extensive set of experiments has been carried out with respect to three dimensions: (i) the attack strategy (type and quantity of injected profiles), (ii) the core CF recommendation model, and (iii) the user class. The experimental evaluation has been executed on two well-known datasets, MovieLens-1M (ML-1M) and Yelp (described in Section 2.2).

2.1.1 Attack Strategies. We have implemented two attack strategies to craft shilling profiles (SP), modeling different levels of attacker capability. Given a user profile P(u) = {r_{i_1}, ..., r_{i_n}} (consisting of the set of items rated by user u), we consider the items in P(u) in the form of selected items (I_S), filler items (I_F), and the target item (I_T), as previously identified in [6], with |I_S| + |I_F| + |I_T| = |P(u)|. The items in the set I_F are selected randomly in order to obstruct detection of an SP, while the only element in I_T is the item that the attacker wants to push or nuke. Here we focus on two strategies to build I_S, which lies at the core of shilling profile generation. The number of items in a shilling profile is close to the mean number of ratings per profile in the dataset. We execute two types of attacks:

• User-and-Model aware attack (UMA): assumes partial knowledge of some of the victim's preferences. The attacker creates a new profile on the system with these preferences, called the seed profile, and uses the recommendation system to receive recommendations. The recommendations are then used to fill I_S with high ratings. This type of attack is inspired by the probe attack [2, 6, 11], in which the seed profile is created by the adversary and the recommendations generated by the recommender system are used to learn related items and their ratings, in order to build shilling profiles very similar to those of existing users in the system. These items constitute 50% of each shilling profile.

• User-Neighbor aware attack (UNA): assumes that the attacker knows some users similar to the victim. We implement this attack by computing the k-nearest-neighbor users of each victim (experimental setting: k = 50, similarity metric = cosine similarity) and selecting the most rated items in the neighborhood to fill I_S. This attack is a modified version of the bandwagon or popular attack [25]: while the bandwagon attack sets high ratings on the popular items of the system, the proposed attack sets high ratings on the popular items inside the victim's neighborhood, in order to inject profiles capable of influencing the victim's recommendations more strongly.

We executed experiments with different sizes of injected profiles, classified into small-size attacks, for which we average the results of attacks with 2, 10, 20, and 50 shilling profiles, and large-size attacks, for which we average the results of attacks with 200 and 500 injected profiles.

2.1.2 CF Models. In our evaluation, we compared the vulnerability/robustness of the following CF models:

User-kNN [5]: user-based k-nearest-neighbor (kNN) method. In our experiments, we set the number of neighbors k to 20 [19].

Item-kNN [27]: item-based kNN method. Also in this case, the number of neighbors k has been set to 20.

BPR-SLIM [23]: the Sparse LInear Method (SLIM) is an item-item model that casts the estimation of unknown user-item ratings as a regression problem. It learns a sparse aggregation coefficient matrix from the aggregated users' preferences, allowing the system to capture correlations between items. BPR-SLIM uses the BPR optimization criterion.

BPR-MF [26]: this method uses matrix factorization (MF) as its underlying core predictor and optimizes it with the Bayesian Personalized Ranking (BPR) objective function.

These CF models stand for state-of-the-art models for the item recommendation task, each using a different prediction concept, allowing us to study the impact of different attack strategies from multiple viewpoints. The CF comparative models have been computed with the publicly available software library MyMediaLite (http://www.mymedialite.net/); we used default parameters for both BPR-MF and BPR-SLIM.

2.1.3 User Classes. Given that CF models only rely on user preference scores (i.e., ratings) to compute recommendations, we hypothesize that it is relevant to investigate the impact of different attack strategies with respect to the victim user's level of activity, i.e., the richness of her profile, calculated on the basis of the number of ratings available in her profile. To this aim, we define two classes of users:

• Highly-active (HA) users: users whose number of ratings is greater than the second quartile (i.e., the median) of the number of ratings per user in the dataset.
• Slightly-active (SA) users: users whose number of ratings is lower than the second quartile.

2.1.4 Evaluation Metric. Several metrics have already been proposed to evaluate malicious attacks. For example, [24] proposes the prediction shift (PS), which estimates the success of an attack by measuring the prediction difference before and after the attack [30]. However, it has been shown that a strong PS does not necessarily imply an effective attack [20]. From the perspective of the attacker, the ideal goal of a push attack is to increase the chance of a desired item being recommended after the attack with respect to before it. We therefore use a modified version of the Hit-Ratio [17] to measure the fraction of successful attacks on a set of user-item pairs.

Definition 1. Let u be the user under attack and i the target item that the attacker wants to push into the top-k recommendations of u. Let top_u^k be the top-k recommendation list of u, and let ϕ(i, top_u^k) be the function that evaluates the effectiveness of the attack on (u, i): ϕ(i, top_u^k) = 1 if i is pushed into the top-k (successful attack), and ϕ(i, top_u^k) = 0 otherwise (unsuccessful attack). Let S be the set of (u, i) user-item pairs under attack. HR@k is defined as the fraction of successful attacks over the pairs in S:

    HR@k = ( Σ_{(u,i) ∈ S} ϕ(i, top_u^k) ) / |S|        (1)

where |S| is the number of (u, i) pairs over which HR@k is measured.

2.2 Data Descriptions

We conducted experiments on two well-known datasets, MovieLens 1M [12] and Yelp [13, 14]. The datasets represent different item recommendation scenarios, for the movie and business domains, and have data densities that differ by a factor of approximately 40. Table 1 summarizes the statistics of the two datasets (after pre-processing).

Table 1: Characteristics of the datasets used in the offline experiments: |U| is the number of users, |I| the number of items, |R| the number of ratings.

Dataset   |U|    |I|    |R|        |R| / (|I| · |U|) × 100
ML-1M     6040   3706   1000209    4.468%
Yelp      5135   5163   24809      0.093%

MovieLens-1M: We used the million-rating version of the dataset (ML-1M), which contains 1M ratings of users for items (movies). We used the original ML-1M dataset for the experiments without any filtering.

Yelp: This dataset contains ratings of users for businesses. We used the pre-processed version of the dataset provided by [13, 14], with 731K ratings of 25K users for 25K businesses. Given the large number of users and items from which item-item or user-user similarities have to be computed, similar to [1] we extracted a random sample of 5K users and 5K items in order to speed up the experiments. The resulting dataset contains 24.8K ratings, with a data density (0.093%) comparable to the one before filtering (0.110%).

3 Results and Discussion

In order to validate the empirical impact of the attack types under study on different classes of users, an extensive set of experiments has been carried out with respect to the dimensions introduced in Section 2.1. The final results are presented in Table 2 and discussed from the following viewpoints:

• a global analysis of the impact of attacks on user classes (cf. Section 3.1);
• a fine-grained analysis of the impact of attacks on user classes, looking into the CF models and attack types (cf. Section 3.2).

We present each of these analysis viewpoints in the following subsections.

3.1 Global impact of attacks on user classes

The goal of this analysis is to answer the first research question, related to the global assessment of the effectiveness of the user-item attack with respect to the identified user classes. We use the term global here since, in this analysis, we abstract away from the impact of attacks on individual CF models and from the attack quality (type) and/or quantity, as these have been largely addressed in previous works [11, 21, 22]. Instead, we examine the impact of attacks along the dimension of user classes by looking at the aggregate mean values computed across CF models on the two datasets adopted in our experimental evaluation.

A general observation for the results in Table 2 is that larger-size attacks reach a higher level of effectiveness on both classes of users (highly-active and slightly-active) in comparison with smaller-size attacks. For example, on the ML-1M dataset, the average HR@10 for the UNA attack on highly-active users (across CF models) is 0.256 for small-size attacks, while it is 0.800 for large-size attacks, a difference of approximately three times. The same pattern of results is obtained in the other experimental cases. These results are in line with those presented in previous works [21, 22].

Our objective here is to study the impact of different attack strategies on user classes. For this purpose, we define the variable r = HR_HA / HR_SA and refer to it as the user-class attack impact, i.e., the impact of an attack on highly-active users in comparison with slightly-active users. Different values of r are interpreted as follows:

• r = 1: the attack has an equal impact on highly-active and slightly-active users.
• r > 1: the attack has an unequal impact; the impact on highly-active users is relatively higher than on slightly-active users (equivalently, slightly-active users are relatively more immune to the attack than highly-active users).
• r < 1: the attack has an unequal impact; the impact on slightly-active users is relatively higher than on highly-active users.

Obviously, the more r deviates from the center point 1, the larger is the attack's success in differentiating highly-active from slightly-active users in one of the above-mentioned directions (r < 1 or r > 1). Before starting a deeper analysis of the results, we highlight that the most interesting values are in the left portion of Table 2 (small-size attacks), because when the size of the attack is larger, the attack reaches the maximum effectiveness, HR = 1, independently of the user class.

By looking at the results for each attack size in Table 2, we can see that the average user-class impact is higher than 1 for the Yelp dataset and lower than 1 for the ML-1M dataset. These results show that both attack types have an unequal impact on slightly-active vs. highly-active users, as r ≠ 1. However, the class of users on which they have the larger impact differs and contrasts between the two datasets. As an example, in Yelp and for UMA, one can note that the average r is 2.393 for small-size attacks and 1.832 for large-size attacks, while the corresponding values on ML-1M are 0.658 and 0.909, respectively. This means that the impact of attacks is higher on highly-active users on the Yelp dataset (average r > 1), differently from ML-1M (average r < 1).

We conjecture that the above contrasting behaviors are directly linked to characteristics of the datasets such as their sparsity. As shown in Table 1, the Yelp dataset is approximately 40 times sparser than ML-1M, and we consider this difference the main possible cause of the contrasting outcomes on the tested datasets. We try to provide a possible explanation here. In the sparser dataset (i.e., Yelp), users with a small number of ratings (slightly-active users) are more immune to attacks because they have a smaller support size of the user profile (i.e., the user profile is not rich enough for the attacker to be able to mimic it in a crafted way). In contrast, highly-active users are more immune to attacks in ML-1M, with its higher density, because their recommendations rely on neighbors with (very) rich user profiles. Put simply, the crafted attacks need to use a large number of profiles to be able to alter the recommendations of the target user.
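To make the quantities used in this analysis concrete, the following minimal sketch (our own illustration, not code from the paper; the data and function names are hypothetical) computes HR@k from Definition 1 and the user-class attack impact r, splitting users into HA and SA classes by the median profile size, as in Section 2.1.3:

```python
from statistics import median

def hit_ratio(attacked_pairs, topk):
    """HR@k (Eq. 1): fraction of attacked (user, item) pairs whose
    target item appears in the user's post-attack top-k list."""
    hits = sum(1 for u, i in attacked_pairs if i in topk[u])  # phi(i, top_u^k)
    return hits / len(attacked_pairs)

def user_class_impact(attacked_pairs, topk, ratings_per_user):
    """r = HR_HA / HR_SA: r > 1 means highly-active users are hit harder."""
    med = median(ratings_per_user.values())  # second quartile of profile sizes
    ha = [(u, i) for u, i in attacked_pairs if ratings_per_user[u] > med]
    sa = [(u, i) for u, i in attacked_pairs if ratings_per_user[u] < med]
    return hit_ratio(ha, topk) / hit_ratio(sa, topk)

# Toy illustration with made-up post-attack top-10 lists for four users.
ratings_per_user = {"u1": 250, "u2": 300, "u3": 12, "u4": 8}
topk = {"u1": {"t"}, "u2": {"t"}, "u3": {"t"}, "u4": set()}
pairs = [("u1", "t"), ("u2", "t"), ("u3", "t"), ("u4", "t")]
print(hit_ratio(pairs, topk))                            # 0.75
print(user_class_impact(pairs, topk, ratings_per_user))  # 2.0
```

Here r = 2.0 would read as the attack hitting highly-active users twice as hard as slightly-active ones, i.e., the r > 1 case described above.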
The insight on sparsity is an important indication that data characteristics play a role in the effectiveness of attacks, and it motivates further research in this direction.

Table 2: HR@10 for small-size and large-size attacks with respect to the class of user, slightly-active (SA) and highly-active (HA), and the CF model. The user-class impact r is the ratio of the HR_HA value to HR_SA.

                           Small-size attacks                          Large-size attacks                          Overall
Dataset  Attack      U-kNN  I-kNN  BPR-SLIM  BPR-MF  mean       U-kNN  I-kNN  BPR-SLIM  BPR-MF  mean       mean
Yelp     UMA    SA   0.750  0.067  0.225     0.108   0.288      0.967  0.184  0.500     0.533   0.546      0.417
                HA   0.800  0.350  0.492     0.117   0.440      1.000  0.667  0.784     0.584   0.758      0.599
                r    1.067  5.243  2.184     1.079   2.393      1.034  3.632  1.567     1.095   1.832      2.113
         UNA    SA   0.850  0.625  0.792     0.400   0.667      1.000  0.834  1.000     1.000   0.958      0.813
                HA   0.875  0.742  0.850     0.433   0.725      1.000  0.850  1.000     1.000   0.963      0.844
                r    1.029  1.186  1.074     1.082   1.093      1.000  1.020  1.000     1.000   1.005      1.049
ML-1M    UMA    SA   0.302  0.155  0.267     0.121   0.211      0.897  0.086  0.586     0.328   0.474      0.343
                HA   0.092  0.108  0.159     0.125   0.121      0.383  0.150  0.350     0.284   0.292      0.206
                r    0.303  0.698  0.593     1.037   0.658      0.427  1.744  0.597     0.866   0.909      0.783
         UNA    SA   0.621  0.302  0.595     0.164   0.420      1.000  0.897  1.000     0.811   0.927      0.673
                HA   0.459  0.133  0.250     0.183   0.256      1.000  0.800  0.800     0.600   0.800      0.528
                r    0.739  0.442  0.421     1.121   0.680      1.000  0.892  0.800     0.740   0.858      0.769

3.2 Fine-grained analysis of the impact of attacks on user classes

The goal of this analysis is to study how different CF models behave against the attacks: which ones perform similarly and which ones differently. This study resembles previous work on shilling attacks against CF models; however, here we also take into account the impact of the attack on user classes. Instead of looking at individual CF model performances and attack types, we compute the pairwise Pearson correlation between each pair of analyzed CF models.

Figure 1: Heat-map of the correlation coefficient (ρ) of different measures between CF models for small-size attacks: (a) HR@10 on slightly-active users, (b) HR@10 on highly-active users.

Figure 1 indicates a strong correlation on HR@10 between BPR-SLIM and Item-kNN (ρ = 0.960 in Figure 1a and ρ = 0.993 in Figure 1b). We justify this value by the fact that both CF models exploit the item-item similarity computation. Looking at the correlation values for User-kNN in Figure 1, one can observe a slightly lower correlation in the case of slightly-active users with respect to the other models. We think that this phenomenon comes from the fact that the tested attacks are based on user preferences, which achieve a good effect even with small-size attacks. For instance, HR@10 for Yelp on slightly-active users (0.750 and 0.850) is higher than the mean values of the other models for both attacks (mean = 0.288 and 0.667). We can also observe an interesting behavior when we compare ρ of BPR-MF with BPR-SLIM and Item-kNN: Figures 1(a) and 1(b) show that HR@10 on both classes of attacked users is highly correlated (ρ ≥ 0.840). Finally, the results in Table 2 show that BPR-MF is the model least influenced by user classes, because the user-class impact factor is close to 1 for each class of users and attacks.

4 Conclusion and Future Work

This work investigates the effect of user-item attacks on classes of users. In particular, we investigated the effectiveness of attacks from a global and a local perspective by varying the quality and quantity of attacks, the target user class, and the collaborative filtering recommendation model.

Experimental results on the Yelp and MovieLens datasets indicate that on Yelp slightly-active users are more immune to shilling attacks than highly-active users, a characteristic that is in contrast with the results on MovieLens, where highly-active users are more immune than slightly-active users. As the datasets have very different sparsity (Yelp is approximately 40 times sparser than MovieLens), we will direct our future work towards analyzing the effect of dataset properties under different attack scenarios.

From a local perspective, we show that BPR-MF is less influenced than the other models when varying the user class and attack type, while BPR-SLIM and Item-kNN have shown similar behavior with respect to the effect of attacks on user classes. In the future, we also plan to extend our study by considering more datasets from different domains, exploring in an extensive way the influence of dataset properties, such as sparsity, user and item skewness, and rating variance, on the effectiveness of different types of attacks. It is also of interest to us to consider the impact of various shilling attack types on CF models that use item content as side information [9, 10]. These studies give important insights into the impact of shilling attacks on recommender systems and provide clues on how to reduce their effectiveness by working on dataset characteristics.

References

[1] Gediminas Adomavicius and Jingjing Zhang. 2015. Improving Stability of Recommender Systems: A Meta-Algorithmic Approach. IEEE Trans. Knowl. Data Eng. 27, 6 (2015), 1573–1587. https://doi.org/10.1109/TKDE.2014.2384502
[2] Charu C. Aggarwal. 2016. Recommender Systems - The Textbook. Springer. https://doi.org/10.1007/978-3-319-29659-3
[3] Xavier Amatriain and Justin Basilico. 2015. Recommender Systems in Industry: A Netflix Case Study. In Recommender Systems Handbook, Francesco Ricci, Lior Rokach, and Bracha Shapira (Eds.). Springer, 385–419. https://doi.org/10.1007/978-1-4899-7637-6_11
[4] Runa Bhaumik, Chad Williams, Bamshad Mobasher, and Robin Burke. 2006. Securing collaborative filtering against malicious attacks through anomaly detection. In Proceedings of the 4th Workshop on Intelligent Techniques for Web Personalization (ITWP'06), Boston, Vol. 6. 10.
[5] John S. Breese, David Heckerman, and Carl Myers Kadie. 1998. Empirical Analysis of Predictive Algorithms for Collaborative Filtering. In UAI '98, Madison, Wisconsin, USA. 43–52.
[6] Robin Burke, Bamshad Mobasher, Roman Zabicki, and Runa Bhaumik. 2005. Identifying attack models for secure recommendation. In Beyond Personalization 2005.
[7] Zunping Cheng and Neil Hurley. 2010. Robust Collaborative Recommendation by Least Trimmed Squares Matrix Factorization. In ICTAI 2010, Arras, France, Vol. 2. 105–112. https://doi.org/10.1109/ICTAI.2010.90
[8] Paul-Alexandru Chirita, Wolfgang Nejdl, and Cristian Zamfir. 2005. Preventing shilling attacks in online recommender systems. In WIDM 2005, Bremen, Germany. 67–74. https://doi.org/10.1145/1097047.1097061
[9] Yashar Deldjoo, Maurizio Ferrari Dacrema, Mihai Gabriel Constantin, Hamid Eghbal-zadeh, Stefano Cereda, Markus Schedl, Bogdan Ionescu, and Paolo Cremonesi. 2019. Movie genome: alleviating new item cold start in movie recommendation. User Model. User-Adapt. Interact. 29, 2 (2019), 291–343. https://doi.org/10.1007/s11257-019-09221-y
[10] Yashar Deldjoo, Markus Schedl, Paolo Cremonesi, and Gabriella Pasi. 2018. Content-Based Multimedia Recommendation Systems: Definition and Application Domains. In IIR 2018, Rome, Italy. http://ceur-ws.org/Vol-2140/paper15.pdf
[11] Ihsan Gunes, Cihan Kaleli, Alper Bilge, and Huseyin Polat. 2014. Shilling attacks against recommender systems: a comprehensive survey. Artif. Intell. Rev. 42, 4 (2014), 767–799. https://doi.org/10.1007/s10462-012-9364-9
[12] F. Maxwell Harper and Joseph A. Konstan. 2016. The MovieLens Datasets: History and Context. TiiS 5, 4 (2016), 19:1–19:19. https://doi.org/10.1145/2827872
[13] Xiangnan He, Zhankui He, Xiaoyu Du, and Tat-Seng Chua. 2018. Adversarial Personalized Ranking for Recommendation. In SIGIR 2018, Ann Arbor, MI, USA. 355–364. https://doi.org/10.1145/3209978.3209981
[14] Xiangnan He, Hanwang Zhang, Min-Yen Kan, and Tat-Seng Chua. 2016. Fast Matrix Factorization for Online Recommendation with Implicit Feedback. In SIGIR 2016, Pisa, Italy. 549–558. https://doi.org/10.1145/2911451.2911489
[15] Shyong K. Lam and John Riedl. 2004. Shilling recommender systems for fun and profit. In WWW 2004, New York, NY, USA. 393–402. https://doi.org/10.1145/988672.988726
[16] Greg Linden, Brent Smith, and Jeremy York. 2003. Industry Report: Amazon.com Recommendations: Item-to-Item Collaborative Filtering. IEEE Distributed Systems Online 4, 1 (2003).
[17] Bhaskar Mehta and Wolfgang Nejdl. 2008. Attack resistant collaborative filtering. In SIGIR 2008, Singapore. 75–82. https://doi.org/10.1145/1390334.1390350
[18] Bhaskar Mehta and Wolfgang Nejdl. 2009. Unsupervised strategies for shilling detection and robust collaborative filtering. User Model. User-Adapt. Interact. 19, 1-2 (2009), 65–97. https://doi.org/10.1007/s11257-008-9050-4
[19] Bamshad Mobasher, Robin Burke, Runa Bhaumik, and Chad Williams. 2005. Effective attack models for shilling item-based collaborative filtering systems. Citeseer.
[20] Bamshad Mobasher, Robin Burke, and Jeff J. Sandvig. 2006. Model-based collaborative filtering as a defense against profile injection attacks. In AAAI, Vol. 6. 1388.
[21] Bamshad Mobasher, Robin D. Burke, Runa Bhaumik, and Jeff J. Sandvig. 2007. Attacks and Remedies in Collaborative Recommendation. IEEE Intelligent Systems 22, 3 (2007), 56–63. https://doi.org/10.1109/MIS.2007.45
[22] Bamshad Mobasher, Robin D. Burke, Runa Bhaumik, and Chad Williams. 2007. Toward trustworthy recommender systems: An analysis of attack models and algorithm robustness. ACM Trans. Internet Techn. 7, 4 (2007), 23. https://doi.org/10.1145/1278366.1278372
[23] Xia Ning and George Karypis. 2011. SLIM: Sparse Linear Methods for Top-N Recommender Systems. In ICDM 2011, Vancouver, BC, Canada. 497–506. https://doi.org/10.1109/ICDM.2011.134
[24] Michael P. O'Mahony, Neil J. Hurley, Nicholas Kushmerick, and Guenole C. M. Silvestre. 2004. Collaborative recommendation: A robustness analysis. ACM Trans. Internet Techn. 4, 4 (2004), 344–377. https://doi.org/10.1145/1031114.1031116
[25] Michael P. O'Mahony, Neil J. Hurley, and Guenole C. M. Silvestre. 2005. Recommender Systems: Attack Types and Strategies. In AAAI 2005, Pittsburgh, Pennsylvania, USA. 334–339. http://www.aaai.org/Library/AAAI/2005/aaai05-053.php
[26] Steffen Rendle, Christoph Freudenthaler, Zeno Gantner, and Lars Schmidt-Thieme. 2009. BPR: Bayesian Personalized Ranking from Implicit Feedback. In UAI 2009, Montreal, QC, Canada. 452–461.
[27] Badrul Munir Sarwar, George Karypis, Joseph A. Konstan, and John Riedl. 2001. Item-based collaborative filtering recommendation algorithms. In WWW 10, Hong Kong, China. 285–295. https://doi.org/10.1145/371920.372071
[28] Yue Shi, Martha Larson, and Alan Hanjalic. 2014. Collaborative filtering beyond the user-item matrix: A survey of the state of the art and future challenges. ACM Computing Surveys 47, 1 (2014), 3.
[29] Zhihai Yang and Zhongmin Cai. 2017. Detecting abnormal profiles in collaborative filtering recommender systems. J. Intell. Inf. Syst. 48, 3 (2017), 499–518. https://doi.org/10.1007/s10844-016-0424-5
[30] Mi Zhang, Jie Tang, Xuchen Zhang, and Xiangyang Xue. 2014. Addressing cold start in recommender systems: A semi-supervised co-training algorithm. In SIGIR 2014. ACM, 73–82.