=Paper= {{Paper |id=Vol-2440/short4 |storemode=property |title= Bias Disparity in Recommendation Systems |pdfUrl=https://ceur-ws.org/Vol-2440/short4.pdf |volume=Vol-2440 |authors=Virginia Tsintzou,Evaggelia Pitoura,Panayiotis Tsaparas |dblpUrl=https://dblp.org/rec/conf/recsys/TsintzouPT19 }} == Bias Disparity in Recommendation Systems== https://ceur-ws.org/Vol-2440/short4.pdf
                           Bias Disparity in Recommendation Systems∗
                Virginia Tsintzou                                        Evaggelia Pitoura                                Panayiotis Tsaparas
            Department of Computer                                    Department of Computer                             Department of Computer
            Science and Engineering                                   Science and Engineering                            Science and Engineering
             University of Ioannina                                    University of Ioannina                             University of Ioannina
              vtsintzou@cs.uoi.gr                                        pitoura@cs.uoi.gr                                    tsap@cs.uoi.gr

ABSTRACT                                                                               systems; (b) Using synthetic data we study different conditions
Recommender systems have been applied successfully in a number                         under which bias disparity may appear. We consider the effect of
of different domains, such as, entertainment, commerce, and em-                        the iterative application of recommendation algorithms on the bias
ployment. Their success lies in their ability to exploit the collective                of the data; (c) We present some observations on bias disparity on
behavior of users in order to deliver highly targeted, personalized                    real data, using the MovieLens1 dataset; (d) We consider a simple
recommendations. Given that recommenders learn from user pre-                          re-ranking algorithm for correcting bias disparity, and we study it
ferences, they incorporate different biases that users exhibit in the                  experimentally.
input data. More importantly, there are cases where recommenders
may amplify such biases, leading to the phenomenon of bias dis-                        2    RELATED WORK
parity. In this short paper, we present a preliminary experimental                     The problem of algorithmic bias, and its flip side, fairness in algo-
study on synthetic data, where we investigate different conditions                     rithms, has attracted considerable attention in the recent years [4, 5].
under which a recommender exhibits bias disparity, and the long-                       Most existing work focuses on classification systems, while there
term effect of recommendations on data bias. We also consider a                        is limited work on recommendation systems.
simple re-ranking algorithm for reducing bias disparity, and present                       One type of recommendation bias that has been considered in the
some observations for data disparity on real data.                                     literature is popularity bias [3, 6]. It has been observed that under
                                                                                       some conditions popular items are more likely to be recommended
                                                                                       leading to a rich get richer effect, and there are some attempts to
1    INTRODUCTION                                                                      correct this bias [6, 7]. Related to this is also the quest for diver-
Recommender systems have found applications in a wide range of                         sity [8], where the goal is to include different types of items in
domains, including e-commerce, entertainment and social media,                         the recommendations, or provide additional exposure for specific
news portals, and employment sites [12]. They have been proven                         classes of items [1].
to be extremely effective in predicting the preferences of the users,                      These notions of fairness do not take into account the presence of
filtering the available content to provide a highly personalized and                   different (protected) groups of users, and different item categories
targeted experience.                                                                   that we consider in this work. This setting is considered in [2],
    One of the most popular classes of recommendation systems                          where they define two types of bias, and they propose a modification
is collaborative filtering. Collaborative Filtering (CF) uses the col-                 of the recommendation algorithm in [9] to ensure a fair output.
lective behavior of all users over all items to infer the preferences of               Their work focuses on fairness, rather than bias disparity, and
individual users for specific items [12]. However, given the reliance                  works with a specific algorithm. The notion of bias disparity is
of CF algorithms on the user preferences, they are susceptible to                      examined in [14] but in a classification setting. Closely related to
biases that may appear in the input data. In this work we consider                     our work is the work in [11], where they consider a similar notion
biases with respect to the preferences of specific groups of users                     of bias disparity, and they propose calibrated recommendations for
(e.g., men and women) towards specific categories of items (e.g.,                      mitigating its effect. Their work assumes a single class of users, and
different movie genres).                                                               they treat users individually, rather than as a group.
    Bias in recommendations is not necessarily always problematic.                         Fairness in terms of correcting rating errors for specific groups
For example, it is natural to expect gender bias when recommending                     of users was studied in [13] for a matrix factorization CF recommen-
clothes. However, gender bias is undesirable when recommending                         der. A similar setting is considered in [10], where they provide a
job postings, or information content. Furthermore, we want to avoid                    general framework for defining fairness (either individual or group
the case where the recommender system introduces bias in the data,                     fairness), and a methodology for enforcing fairness by inserting
by amplifying existing biases and reinforcing stereotypes. We refer                    “antidote data” in the dataset. A notion of fairness for tensor-based
to this phenomenon, where there is a difference between input and                      recommendations that relies on statistical parity is explored in [15].
recommendation bias, as bias disparity.
    In this paper, we consider the problem of bias disparity in re-                    3 MODEL
commendation systems, and we make the following contributions:
(a) We define notions of bias and bias disparity for recommender                       3.1 Definitions
                                                                                       We consider a set of n users U and a set of m items I. We are given
∗ Copyright 2019 for this paper by its authors. Use permitted under Creative Commons
                                                                                       a binary n × m matrix S, where S(u, i) = 1 if user u has selected
License Attribution 4.0 International (CC BY 4.0).
Presented at the RMSE workshop held in conjunction with the 13th ACM Conference
                                                                                       1 MovieLens 1M: https://grouplens.org/datasets/movielens/1m/
on Recommender Systems (RecSys), 2019, in Copenhagen, Denmark.
item i, and zero otherwise. Selection may mean that user u liked                recommend r items to a user, the r items with the highest utility
post i, or that u purchased product i, or that u watched video i.               values are selected.
   We assume that users are associated with an attribute AU , that
partitions them into groups depending on their value for the at-                4     BIAS DISPARITY ON SYNTHETIC DATA
tribute. For example, the attribute AU may be the gender of the                 In this section, we present experiments with synthetic data. Our
user, partitioning the users into men and women. We will typically              goal is to study the conditions under which the UserKNN exhibits
assume that we have two groups, and one of the groups is the pro-               bias disparity.
tected group. Similarly, we assume that items are associated with
an attribute AI , e.g., the genre of a movie, which partitions them             4.1    Synthetic data generation
into categories, e.g., action and romance movies.                               Users are split into two groups G 1 and G 2 of size n 1 and n 2 respecti-
   Given the association matrix S, we define the input preference ra-           vely, and items are partitioned into two categories C 1 and C 2 of size
tio PR S (G, C) of group G for category C as the fraction of selections         m 1 and m 2 respectively. We assume that users in G 1 tend to favor
from group G that are in category C. Formally:                                  items in category C 1 , while users in group G 2 tend to favor items
                                              S(u, i)                           in category C 2 . To quantify this preference, we give as input to the
                                  Í     Í
                    PR S (G, C) = Íu ∈G Íi ∈C                        (1)
                                   u ∈G i ∈I S(u, i)                            data generator two parameters ρ 1 , ρ 2 , where parameter ρ i deter-
This is essentially the conditional probability that a selection is in          mines the preference ratio PR S (G i , Ci ) of group G i for category Ci .
category C given that it comes from a user in group G.                          For example, ρ 1 = 0.7 means that 70% of the ratings of group G 1
   To assess the importance of this probability we compare it against           are in category C 1 . The datasets we create consist of 1,000 users
the probability P(C) = |C |/m of selecting from category C when                 and 1,000 items. We assume that each user selects 5% of the items
selecting uniformly at random. We define the bias B S (G, C) of group           in expectation, and we recommend r = 10 items per user.
G for category C as:                                                               We perform two different sets of experiments. In the first set,
                                                                                we examine the effect of the preference ratios, and in the second
                                      PR S (G, C)
                        B S (G, C) =                               (2)          set, the effect of group and category sizes. All reported values are
                                        P(C)                                    averages over 10 experiments.
Bias values less than 1 denote negative bias, that is, the group G
on average tends to select less often from category C, while bias               4.2    Varying the preference ratios
values greater than 1 denote positive bias, that is, that group G               In these experiments, we create datasets with equal-size groups G 1
favors category C disproportionately to its size.                               and G 2 , and equal-size item categories C 1 and C 2 , and we vary the
   We assume that the recommendation algorithm outputs for each                 preference ratios of the groups.
user u a ranked list of r items Ru . The collection of all recommenda-
tions can be represented as a binary matrix R, where R(u, i) = 1 if                4.2.1 Symmetric Preferences: In the first experiment, we assume
item i is recommended for user u and zero otherwise. Given matrix               that the two groups G 1 and G 2 have the same preference ratios
R, we can compute the output preference ratio of the recommenda-                by setting ρ 1 = ρ 2 = ρ, where ρ takes values from 0.5 to 1, in
tion algorithm, PR R (G, C), of group G for category C using Eq. (1),           increments of 0.05. In Figure 1(a), we plot the output preference
and the output bias B R (G, C) of group G for category C.                       ratio PR R (G 1 , C 1 ) (eq. PR R (G 2 , C 2 )) as a function of ρ. Note that
   To compare the bias of a group G for a category C in the input               in this experiment, bias is the preference ratio scaled by a factor
data S and the recommendations R, we define the bias disparity,                 of two. We report preference ratios to be more interpretable. The
that is, the relative change of the bias value.                                 dashed line shows when the output ratio is equal to the input ratio
                                                                                and thus there is no bias disparity. We consider different values
                             B R (G, C) − B S (G, C)
                  BD(G, C) =                                   (3)              for K, the number of neighbors. A first observation is that when
                                   B S (G, C)                                   the input bias is small (PR S ≤ 0.6), the output bias decreases or
   Our definitions of preference ratios and bias are motivated by               stays the same. In this case, users have neighbors from both groups.
concepts of group proportionality, and group fairness considered                For higher input bias (PR S > 0.6), we have a sharp increase of the
in the literature [4, 5].                                                       output bias, which reaches its peak for PR S = 0.8. In these cases,
                                                                                the recommender polarizes the two groups, recommending items
3.2    The Recommendation Algorithm                                             only from their favored category.
For the recommendations, in our experiments, we use a user-based                   In Figure 1(b), we report the preference ratio for all candidate
K-Nearest-Neighbors (UserKNN) algorithm. The UserKNN algo-                      items for recommendation for each user (i.e., if the system recom-
rithm first computes for each user, u, the set N K (u) of the K most            mended all items with non zero utility). Surprisingly, the candidate
similar users to u. For similarity, it uses the Jaccard similarity, JSim,       items are less biased even for high values of the input bias. This
computed using the matrix S. For user u and item i not selected by              shows that: (a) Utility proportional to user-similarity increases bias,
u, the algorithm computes a utility value                                       since the top-r recommendations with the highest utility are signi-
                             n ∈N K (u) JSim(u, n)S(n, i)
                           Í                                                    ficantly more biased, (b) It is possible to reduce bias by re-ranking
                V (u, i) =                                            (4)       the candidate items.
                                n ∈N K (u) JSim(u, n)
                              Í
                                                                                   Increasing the value of K increases the output bias. Adding neig-
The utility value V (u, i) is the fraction of the similarity scores of          hbors increases the strength of the signal, and the algorithm discri-
the top-K most similar users to u that have selected item i. To                 minates better between the items in the different categories, causing
                                                                            2
                                                                                                                                                                                                                                                                                                       0.5
                            1                                                                       1                                                                              1
 output preference ratio




                                                                                                                                                        output preference ratio




                                                                                                                                                                                                                                                                          output preference ratio
                                                                           candidate items ratio
                           0.9                                                                     0.9                                                                            0.9                                                                                                                  0.4

                           0.8                                      K=10                           0.8                                                                            0.8                                                                 K=10                                             0.3                                                                K=10
                                                                    K=20                                                                                                                                                                              K=20                                                                                                                K=20
                                                                    K=50                           0.7                                                                                                                                                K=50                                                                                                                K=50
                           0.7                                                                                                                                                    0.7                                                                                                                  0.2

                                                                                                   0.6                                       K=10
                           0.6                                                                                                                                                    0.6
                                                                                                                                             K=20                                                                                                                                                      0.1
                                                                                                                                             K=50
                           0.5                                                                     0.5                                                                            0.5
                                 0.5   0.6     0.7    0.8     0.9    1                                    0.5   0.6     0.7    0.8     0.9    1                                                               0.5       0.6     0.7    0.8     0.9         1                                                                   0.5          0.6     0.7    0.8     0.9     1
                                         input preference ratio                                                   input preference ratio                                                                                  input preference ratio                                                                                              input preference ratio

                                  (a) P R R , symmetric case                                              (b) Ratio of candidate items                                             (c) P R R (G 1, C 1 ), asymmetric case                                                                                            (d) P R R (G 2, C 2 ), asymmetric case

                                                                                                         Figure 1: Experiment with different preference ratios.

it to favor the preferred category. Understanding the role of K is a
                                                                                                                                                                                                               1                                                                                                      1
subject for future study.




                                                                                                                                                                                    output preference ratio




                                                                                                                                                                                                                                                                                           output preference ratio
                                                                                                                                                                                                              0.8                                                                                                    0.8

   4.2.2 Asymmetric Preferences: In this experiment, group G 1 has                                                                                                                                                                                                                                                   0.6

preference ratio ρ 1 ranging from 0.5 to 1, while G 2 is unbiased
                                                                                                                                                                                                              0.6
                                                                                                                                                                                                                                                                                                                     0.4

with fixed preference ratio ρ 2 = 0.5. In Figure 1, we show the                                                                                                                                               0.4
                                                                                                                                                                                                                                                               K=10                                                  0.2

recommendation preference ratio for groups G 1 (Figure 1(c)) and                                                                                                                                                                                               K=20                                                                  K=10
                                                                                                                                                                                                              0.2
                                                                                                                                                                                                                                                                                                                                     K=20
                                                                                                                                                                                                                                                               K=50                                                   0              K=50
G 2 (Figure 1(d)) as a function of ρ 1 .                                                                                                                                                                            0     0.2      0.4     0.6       0.8              1                                                    0           0.2        0.4       0.6     0.8        1

   We observe that the output bias of group G 1 is amplified at a rate
                                                                                                                                                                                                                                  group ratio ?                                                                                                  category ratio 3



much higher than in Figure 1(a), while group G 2 becomes biased                                                                                                                                                                 (a) Group Size                                                                                          (b) Category Size

towards category C 1 . Surprisingly, the presence of the unbiased                                                                                                                 Figure 2: (a) Unbalanced group sizes, (b) Unbalanced cate-
group G 2 has an amplifying effect on the bias of G 1 , rather than a                                                                                                             gory sizes; input preference ratio PR S (G i , Ci ) = 0.7.
moderating one, more so than an oppositely-biased group. This is
due to the fact that the users in the unbiased group G 2 provide a                                                                                                                    Note that as long as θ ≤ 0.7, group G 1 has positive bias (greater
stronger signal in favor of category C 1 , compared to the symmetric                                                                                                              than 1) for category C 1 since bias is equal to ρ 1 /θ . However, it de-
case where group G 2 is biased over C 2 . This reinforces the bias in                                                                                                             creases as the size of the category increases. When the category size
favor of category C 1 . As expected, the unbiased group adopts the                                                                                                                is not very large (θ ≤ 0.5), the output bias is amplified regardless
biases of the biased group                                                                                                                                                        of the category size. For θ > 0.7, G 1 is actually biased in favor of
                                                                                                                                                                                  C 2 , and this is reflected in the output. There is an interesting range
4.3                              Varying group and category sizes                                                                                                                 [0.6, 0.7] where G 1 is positively biased towards C 1 but its bias is
In this experiment we examine bias disparity with unbalanced                                                                                                                      weak, and thus the recommendation output is drawn to category
groups and categories.                                                                                                                                                            C 2 by the more biased group.

    4.3.1 Varying Group Sizes: We first consider groups of uneven                                                                                                                 4.4                                   Iterative Application of Recommendations
size. We set the size n 1 of G 1 to be a fraction ϕ of the number of all
                                                                                                                                                                                  We observed bias disparity in the output of the recommendation
users n, ranging from 5% to 95%. Both groups have fixed preference
                                                                                                                                                                                  algorithm. However, how does this affect the bias in the data? To
ratio ρ 1 = ρ 2 = 0.7. Figure 2(a) shows the output recommenda-
                                                                                                                                                                                  study this we consider a scenario where the users accept (some
tion preference ratio PR R (G 1 , C 1 ) as a function of ϕ. The plot of
                                                                                                                                                                                  of) the recommendations of the algorithm, and we study the long-
PR R (G 2 , C 2 ) is the mirror image of this one, so we do not report it.
                                                                                                                                                                                  term effect of the iterative application of the algorithm on the
    We observe that for ϕ ≤ 0.3 group G 1 has negative bias disparity
                                                                                                                                                                                  bias of the data. More precisely, at each iteration, we consider
(PR R (G 1 , C 1 ) < 0.7). That is, the small group is drawn by the larger
                                                                                                                                                                                  the top-r recommendations of the algorithm (r = 10) to a user
group. For medium values of ϕ in [0.35, 0.5], the bias of both groups
                                                                                                                                                                                  u, and we normalize their utility values, by the utility value of
is amplified, despite the fact that G 1 is smaller than G 2 . The increase
                                                                                                                                                                                  the top recommendation. We then assume that the user accepts a
is larger for the larger group, but there is increase for the smaller
                                                                                                                                                                                  recommendation with probability equal to the normalized score.
group as well.
                                                                                                                                                                                  The accepted recommendations are added to the data, and they are
    We also experimented with the case where G 2 is unbiased. In
                                                                                                                                                                                  fed as input to the next iteration of the recommendation algorithm.
this case G 2 becomes biased towards C 1 even for ϕ = 0.05, while
                                                                                                                                                                                     We apply this iterative algorithm on a dataset with two equally
the point at which the bias disparity for G 1 becomes positive is
                                                                                                                                                                                  but oppositely biased groups, as described in Section 4.2.1. The
much earlier (ϕ ≈ 0.2). This indicates that a small biased group can
                                                                                                                                                                                  results of this iterative experiment are shown in Figure 3(a), where
have a stronger impact than a large unbiased one.
                                                                                                                                                                                  we plot the average preference ratio for each iteration. Iteration 0
   4.3.2 Varying Category Sizes: We now consider categories of                                                                                                                    corresponds to the input data. In our experiment a user accepts on
uneven size. We set the size m 1 of C 1 to be a fraction θ of the number                                                                                                          average 7 recommendations. For this experiment we set the number
items m, ranging from 10% to 90%. We assume that both groups                                                                                                                      of neighbors K to 50.
have fixed preference ratio ρ 1 = ρ 2 = 0.7. Figure 2(b) shows the                                                                                                                   We observe that even with the probabilistic acceptance of recom-
recommendation preference ratio PR R (G 1 , C 1 ) as a function of θ .                                                                                                            mendations, there is a clear long-term effect of the recommendation
The plot of PR R (G 2 , C 2 ) is again the mirror image of this one.                                                                                                              bias. For small values of input bias, we observe a decrease, in line
                                                                                                                                                    3
                                 PRS:0.5       PRS:0.55    PRS:0.6   PRS:0.65   PRS:0.7                     PRS:0.75      PRS:0.8    PRS:0.85      PRS:0.9       PRS:0.95       denotes that u was recommended item i. We will refer to the pair
                                                                                                                                                                                (u, i) as a recommendation. The set R ∗ contains r recommendati-
                        1                                                                                        1                                                              ons for each user, thus, rn recommendations in total. Let V (R ∗ ) =
                                                                                                                                                                                  (u,i)∈R ∗ V (u, i) denote the total utility of the recommendations
                                                                                                                                                                                Í
              S




                                                                                                       S
                       0.9                                                                                      0.9
 preference ratio PR




                                                                                          preference ratio PR
                       0.8                                                                                      0.8                                                             in set R ∗ . Since R ∗ contains for each user u the top-r items with
                       0.7                                                                                      0.7
                                                                                                                                                                                the highest utility, R ∗ has maximum possible utility among all sets
                       0.6                                                                                      0.6
                                                                                                                                                                                with r recommendations per user.
                       0.5                                                                                      0.5
                                                                                                                                                                                   However, as we have seen in our experiments, the set R ∗ may
                             0             1         2         3       4        5                                     0     1        2         3             4         5
                                                      iteration                                                                       iteration                                 have high bias disparity. We will adjust the recommendations in
                                                                                                                                                                                the set R ∗ to produce a new set of recommendations R, with r
                                               (b) UserKNN                                                                          (c) GULM

Figure 3: The evolution of the preference ratio in the data                                                                                                                     recommendations per user, with zero bias disparity. Clearly this
for different input preference ratios (PR S ), after 5 iterations                                                                                                               will come at the expense of utility. Our goal is to find the set R with
of (a) UserKNN and (b) GULM. Iteration 0 shows the original                                                                                                                     the minimum utility loss.
preference ratio of each experiment.                                                                                                                                               Since we have two categories, to achieve zero bias disparity, it
                                                                                                                                                                                suffices to have B R (G i , Ci ) = B S (G i , Ci ). Without loss of generality
with the observations in Figure 1(a). For these values of bias, the
                                                                                                                                                                                assume that B R ∗ (G i , Ci ) > B S (G i , Ci ). Let Ci denote the category
recommender will result in reducing bias and smoothing out dif-
                                                                                                                                                                                other than Ci . We decrease the output bias B R by swapping recom-
ferences. The value of preference ratio 0.6 remains more or less
                                                                                                                                                                                mendations (u, i) of category Ci with recommendations (u, j) of
constant, while for larger values the bias in the data increases. The-
                                                                                                                                                                                category Ci . Given a target bias value, we can compute the number
refore, for large values of bias the recommender has a reinforcing
                                                                                                                                                                                of swaps for achieving zero bias disparity. The utility loss incurred
effect, which in the long term will lead to polarized groups of users.
                                                                                                                                                                                by swapping (u, i) with (u, j) is V (u, i) − V (u, j). The goal is to find
                                                                                                                                                                                the swaps with the minimum utility loss.
5                            BIAS DISPARITY ON REAL DATA
                                                                                                                                                                                   We present a simple and efficient greedy algorithm for this task.
In this experiment, we use the Movielens 1M dataset2 . We consider                                                                                                              Let N S denote the desired number of swaps. The algorithm starts
as categories the genres Action and Romance, with 468 and 463                                                                                                                   with the set R = R ∗ , and performs N S steps. At each step it computes
movies. We extract a subset of users U that have at least 90 ratings                                                                                                            a set of candidate swaps by pairing for each user u the lowest-ranked
in these categories, resulting in 1,259 users. Users in U consist of                                                                                                            recommendation (u, i) in R from category Ci , with the highest ran-
981 males and 278 females.                                                                                                                                                      ked recommendation (u, j) not in R from category Ci , and performs
   In Table 1, we show the input/output bias and in parentheses                                                                                                                 the swap with the minimum utility loss. It is easy to show that
the bias disparity for each group-category combination. The right                                                                                                               the algorithm is optimal, that is, it achieves the minimum utility
part of the table reports these numbers when the user groups are                                                                                                                loss. We refer to this algorithm as the GULM (Group Utility Loss
balanced, by selecting a random sample of 278 males. We observe                                                                                                                 Minimization) algorithm.
that males are biased in favor of Action movies while females prefer                                                                                                               By design, when we apply the GULM algorithm on the output
Romance movies. The application of UserKNN increases the strong                                                                                                                 of the recommendation algorithm, we eliminate bias disparity (mo-
input bias for males in the output. Females are moderately biased                                                                                                               dulo rounding errors) in the recommendations. We consider the
in favor of Romance movies. Hence, their output bias is drawn                                                                                                                   iterative application of the recommendation algorithm, in the set-
to Action items. We observe a very similar picture for balanced                                                                                                                 ting described in Section 4.4, assuming again that the probability
data, indicating that the changes in bias are not due to the group                                                                                                              of a recommendation being accepted depends on its utility. The
imbalance.                                                                                                                                                                      results are shown in Figure 3(b). For values of preference ratio up
                                                                                                                                                                                to 0.65, we observe that bias remains more or less constant after
                                       Table 1: Gender bias on action and romance
                                                                                                                                                                                re-ranking. For larger values, there is some noticeable increase in
                               Unbalanced Groups                   Balanced Groups                                                                                              the bias, albeit significantly smaller than before re-ranking. The
                             Action          Romance            Action          Romance
                                                                                                                                                                                increase is due to the fact that the recommendations introduced by
                       M 1.39/1.67 (0.2) 0.58/0.28 (-0.51) 1.40/1.66 (0.18) 0.57/0.29 (-0.49)
                                                                                                                                                                                GULM have low probability to be accepted.
                       F 0.97/1.14 (0.17) 1.03/0.85 (-0.17) 0.97/1.08 (0.11) 1.03/0.92 (-0.10)

                                                                                                                                                                                7    CONCLUSIONS
                                                                                                                                                                                In this short paper, we performed a preliminary study of bias dis-
6                            CORRECTING BIAS DISPARITY
                                                                                                                                                                                parity in recommender systems, and the conditions under which it
To address the problem of bias disparity, we consider an algorithm                                                                                                              may appear. Using synthetic data, we observed that recommenda-
that performs post-processing of the recommendations. Our goal is                                                                                                               tion algorithms can introduce bias disparity even for moderately
to adjust the set of recommended items, so as to ensure that there                                                                                                              biased groups. We view this analysis as a first step towards a syste-
is no bias disparity. In addition, we would like the new recommen-                                                                                                              matic analysis of the factors that cause bias disparity. We intend to
dation set to have the maximum possible utility.                                                                                                                                investigate more recommendation algorithms, and more datasets,
   Abusing the notation, let R ∗ denote the set of user-item pairs                                                                                                              including the case of numerical, rather than binary, ratings. It is also
produced by our recommendation algorithm, where (u, i) ∈ R ∗                                                                                                                    interesting to examine this work in the context of other definitions
2 MovieLens 1M: https://grouplens.org/datasets/movielens/1m/                                                                                                                    for bias and fairness.
                                                                                                                                                                            4
REFERENCES
 [1] Alex Beutel, Ed H. Chi, Zhiyuan Cheng, Hubert Pham, and John Anderson. 2017.
     Beyond Globally Optimal: Focused Learning for Improved Recommendations. In
     Proceedings of the 26th International Conference on World Wide Web.
 [2] Robin Burke, Nasim Sonboli, and Aldo Ordonez-Gauger. 2018. Balanced Neig-
     hborhoods for Multi-sided Fairness in Recommendation. In Conference on Fairness,
     Accountability and Transparency, FAT.
 [3] Òscar Celma and Pedro Cano. 2008. From Hits to Niches?: Or How Popular
     Artists Can Bias Music Recommendation and Discovery. In Proceedings of the
     2nd KDD Workshop on Large-Scale Recommender Systems and the Netflix Prize
     Competition (NETFLIX). ACM.
 [4] Cynthia Dwork, Moritz Hardt, Toniann Pitassi, Omer Reingold, and Richard S.
     Zemel. 2012. Fairness through awareness. In Innovations in Theoretical Computer
     Science.
 [5] Sara Hajian, Francesco Bonchi, and Carlos Castillo. 2016. Algorithmic Bias: From
     Discrimination Discovery to Fairness-aware Data Mining. In ACM SIGKDD.
 [6] Dietmar Jannach, Lukas Lerche, Iman Kamehkhosh, and Michael Jugovac. 2015.
     What Recommenders Recommend: An Analysis of Recommendation Biases and
     Possible Countermeasures. User Modeling and User-Adapted Interaction 25, 5 (Dec.
     2015), 427–491.
 [7] Toshihiro Kamishima, Shotaro Akaho, Hideki Asoh, and Jun Sakuma. 2014. Cor-
     recting Popularity Bias by Enhancing Recommendation Neutrality. In Poster
     Proceedings of the 8th ACM Conference on Recommender Systems, RecSys 2014.
 [8] Matev Kunaver and Toma Porl. 2017. Diversity in Recommender Systems A
     Survey. Know.-Based Syst. (2017).
 [9] X. Ning and G. Karypis. 2011. SLIM: Sparse Linear Methods for Top-N Recom-
     mender Systems. In 2011 IEEE 11th International Conference on Data Mining.
[10] Bashir Rastegarpanah, Krishna P. Gummadi, and Mark Crovella. 2019. Fighting
     Fire with Fire: Using Antidote Data to Improve Polarization and Fairness of Re-
     commender Systems. In Proceedings of the Twelfth ACM International Conference
     on Web Search and Data Mining, WSDM 2019, Melbourne, VIC, Australia, February
     11-15, 2019. 231–239.
[11] Harald Steck. 2018. Calibrated Recommendations. In Proceedings of the 12th ACM
     Conference on Recommender Systems (RecSys ’18). ACM, New York, NY, USA,
     154–162.
[12] Xiaoyuan Su and Taghi M. Khoshgoftaar. 2009. A Survey of Collaborative Filtering
     Techniques. Adv. in Artif. Intell. (2009).
[13] Sirui Yao and Bert Huang. 2017. Beyond Parity: Fairness Objectives for Collabo-
     rative Filtering. In NIPS.
[14] Jieyu Zhao, Tianlu Wang, Mark Yatskar, Vicente Ordonez, and Kai-Wei Chang.
     2017. Men Also Like Shopping: Reducing Gender Bias Amplification using
     Corpus-level Constraints. In EMNLP.
[15] Ziwei Zhu, Xia Hu, and James Caverlee. 2018. Fairness-Aware Tensor-Based
     Recommendation. In Proceedings of the 27th ACM International Conference on
     Information and Knowledge Management (CIKM ’18). ACM, New York, NY, USA,
     1153–1162.




                                                                                        5