<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Bias Disparity in Collaborative Recommendation: Algorithmic Evaluation and Comparison∗</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Masoud Mansoury†</string-name>
          <email>m.mansoury@tue.nl</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Robin Burke</string-name>
          <email>robin.burke@colorado.edu</email>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bamshad Mobasher</string-name>
          <email>mobasher@cs.depaul.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mykola Pechenizkiy</string-name>
          <email>m.pechenizkiy@tue.nl</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>DePaul University</institution>
          ,
          <addr-line>Chicago</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Eindhoven University of Technology</institution>
          ,
          <addr-line>Eindhoven</addr-line>
          ,
          <country country="NL">the Netherlands</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>School of Computing, DePaul University</institution>
          ,
          <addr-line>Chicago</addr-line>
          ,
          <country country="US">USA</country>
          . †This author also has an affiliation in the School of Computing, DePaul University (mmansou4@depaul.edu). Copyright 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). Presented at the RMSE workshop held in conjunction with the 13th ACM Conference on Recommender Systems (RecSys), 2019, in Copenhagen, Denmark.
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>University of Colorado Boulder</institution>
          ,
          <addr-line>Boulder</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2019</year>
      </pub-date>
      <abstract>
        <p>Research on fairness in machine learning has recently been extended to recommender systems. One of the factors that may impact fairness is bias disparity, the degree to which a group's preferences on various item categories fail to be reflected in the recommendations they receive. In some cases, biases in the original data may be amplified or reversed by the underlying recommendation algorithm. In this paper, we explore how different recommendation algorithms reflect the tradeoff between ranking quality and bias disparity. Our experiments include neighborhood-based, model-based, and trust-aware recommendation algorithms.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>
        Recommender systems are powerful tools for extracting users'
preferences and suggesting desired items. These systems, while accurate,
may suffer from a lack of fairness to specific groups of users.
Research in fairness-aware recommender systems has shown that the
outputs of recommendation algorithms are, in some cases, biased
against protected groups [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. This discrimination among
users can degrade users' satisfaction, loyalty, and the effectiveness of
recommender systems, and at worst, it can lead to or perpetuate
undesirable social dynamics.
      </p>
      <p>
        Discrimination in recommendation output can originate from
different sources. It may stem from underlying biases in the
input data [
        <xref ref-type="bibr" rid="ref25 ref4">4, 25</xref>
        ] used for training. Alternatively, the
discriminative behavior may be the result of the recommendation algorithms
themselves [
        <xref ref-type="bibr" rid="ref13 ref27 ref28">13, 27, 28</xref>
        ].
      </p>
      <p>In this paper, we examine the effectiveness of recommendation
algorithms in capturing different groups' interests across item
categories. We compare different recommendation algorithms in terms
of how well they capture the categorical preferences of users and reflect
them in the recommendations delivered.</p>
      <p>It is important to note that although we do not
directly measure the fairness of recommendation algorithms in this paper, we
study their bias disparity as an important
factor that affects fairness. The benefit of studying bias disparity in
recommender systems is that, depending on the domain, knowing
which algorithms produce more or less disparity from users' stated
preferences can allow system designers to better control the
recommendation output. In our analysis of bias disparity, we also take
into account item coverage in recommended lists. A
recommendation algorithm with higher item coverage gives the majority of
item providers in the system a more equal chance to be shown
to users.</p>
      <p>
        Our analysis includes a variety of recommendation algorithms:
neighborhood models, factorization models, and trust-aware
recommendation algorithms. In particular, we investigate the performance
of trust-aware recommendation algorithms. In these algorithms,
besides item ratings, explicit trust ratings are used as side
information to enhance the quality of the input for recommender
systems. It has been shown that using explicit trust ratings
provides advantages for recommender systems [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]. First, since trust
ratings can be propagated, they can help overcome the cold-start issue
in recommender systems. Second, trust-aware methods are robust
against shilling attacks on recommender systems [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. In this paper,
we also analyze the performance of these algorithms in addressing
bias disparity.
      </p>
      <p>The motivation behind this research is to analyze how much
recommendation algorithms deviate from the preferences of specific groups of
users (e.g., male vs. female) across item categories.
Given protected and unprotected groups, we aim to compare the
ability of recommendation algorithms to generate
recommendations equally well for each group based on their preferences in the
training data. Therefore, no matter what the context of the dataset
is, given protected/unprotected groups and item categories, we
are interested in comparing recommendation algorithms on their
ability to recommend preferred item categories to these groups of
users.</p>
      <p>For our experiments, we prepared a sample of the publicly available
Yelp dataset for research on fairness-aware recommender systems.
Our experiments are performed on multiple recommendation
algorithms, and the results are evaluated in terms of bias disparity and
average disparity, along with ranking quality and item coverage.</p>
    </sec>
    <sec id="sec-2">
      <title>BACKGROUND</title>
      <p>
        The problem of unfair outputs in machine learning applications is
well studied [
        <xref ref-type="bibr" rid="ref12 ref3 ref6">3, 6, 12</xref>
        ], and it has also been extended to recommender
systems. Various studies have considered fairness in
recommendation results [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
      <p>
        One research direction in fairness-aware recommender systems
is providing fair recommendations for consumers. Burke et al. [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]
have shown that adding a balanced neighborhood
regularization term to the SLIM algorithm can improve the equity of recommendations
for protected and unprotected groups. Based on their definition
of protected and unprotected groups, their solution takes into
account the group fairness of recommendation outputs. Analogously,
Yao and Huang [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ] improved the equity of recommendation
results by adding fairness terms to the objective function of model-based
recommendation algorithms. They proposed four fairness metrics
that capture the degree of unfairness in recommendation outputs
and added these metrics to the learning objective to
optimize it for fair results.
      </p>
      <p>
        Zhu et al. [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ] proposed a fairness-aware tensor-based
recommender system to improve the equity of recommendations
while maintaining recommendation quality. The idea in their
paper is to isolate sensitive information from the latent factor matrices
of the tensor model and then use this information to generate
fairness-aware recommendations.
      </p>
      <p>Besides consumer fairness, provider fairness is another research
direction in fairness-aware recommender systems. Provider fairness
refers to items belonging to each provider having an equal
chance to be shown in the recommended lists. The lack of it is known as
popularity bias and is usually measured by item coverage.</p>
      <p>
        Abdollahpouri et al. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] addressed popularity bias in
learning-to-rank algorithms by including a fairness-aware regularization
term in the objective function. They showed that the fairness-aware
regularization term controls the tendency of recommendations toward
popular items.
      </p>
      <p>
        Jannach et al. [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] conducted a comprehensive set of analyses
of the popularity bias of several recommendation algorithms. They
analyzed the items recommended by different recommendation
algorithms in terms of their average ratings and their popularity. While
the results depend heavily on the characteristics of the data sets, they
found that some algorithms (e.g., SlopeOne, KNN techniques, and
the ALS variant of factorization models) focus mostly on high-rated
items, which biases them toward a small set of items (low coverage).
Also, they found that some algorithms (e.g., ALS variants of
factorization models) tend to recommend popular items, while some
other algorithms (e.g., UserKNN and SlopeOne) tend to recommend
less popular items.
      </p>
      <p>
        Multi-stakeholder recommender systems simultaneously take
into account the fairness of all stakeholders or entities in a
multi-sided platform. The main goal of multi-stakeholder
recommendation is maximizing the fairness of all stakeholders. Consumers and
providers are the major stakeholders in most multi-sided platforms
[
        <xref ref-type="bibr" rid="ref1 ref5">1, 5</xref>
        ].
      </p>
      <p>
        Surer et al. [
        <xref ref-type="bibr" rid="ref30">30</xref>
        ] proposed a multi-stakeholder optimization
model that works as a post-processing approach for standard
recommendation algorithms. In this model, a set of constraints for
providers is considered when generating recommendation lists
for end users. Also, Liu and Burke [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] proposed a fairness-aware
re-ranking approach that iteratively balances ranking quality
and provider fairness. In this post-processing approach, users'
tolerance for list diversity is also considered to find a trade-off between
accuracy and provider fairness.
      </p>
    </sec>
    <sec id="sec-3">
      <title>FAIRNESS METRICS</title>
      <p>In this paper, we compare the performance of state-of-the-art
recommendation algorithms in terms of bias disparity in recommended
lists. We also consider ranking quality and item coverage of
recommendation algorithms as two important additional metrics.</p>
      <p>We use two metrics to measure changes in bias for groups of
users given item categories: bias disparity and average disparity.</p>
      <p>
        Bias disparity measures how much a group's
recommendations deviate from the group's original preferences in the training
set [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ]. Given a group of users, G, and an item category, C, bias
disparity is defined as follows:
      </p>
      <p>BD(G, C) = (BR (G, C) − BT (G, C)) / BT (G, C) (1)
where BT (BR ) is the bias value of group G on category C in the training
data (recommendation list). BT is defined by:</p>
      <p>BT (G, C) = PRT (G, C) / P (C) (2)
where P (C) is the fraction of items in category C in the dataset,
defined as |C|/|I|. PRT is the preference ratio of group G on category
C, calculated as:</p>
      <p>PRT (G, C) = (Σu ∈G Σi ∈C T (u, i)) / (Σu ∈G Σi ∈I T (u, i)) (3)
where T is the binarized user-item matrix: if user u has rated
item i, then T (u, i) = 1; otherwise, T (u, i) = 0.</p>
      <p>The bias value of group G on category C in the recommendation
list, BR , is defined similarly.</p>
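      <p>To make Eqs. (1)–(3) concrete, the following sketch computes the preference ratio, bias, and bias disparity with NumPy. This is a minimal illustration, not the paper's implementation: T and R are binarized user-item matrices for the training data and the recommendation lists, and the boolean-mask argument names are hypothetical.</p>

```python
import numpy as np

def preference_ratio(T, group_users, category_items):
    """PR(G, C): fraction of group G's ratings that fall in category C (Eq. 3)."""
    group_ratings = T[group_users]                    # rows for users in G
    return group_ratings[:, category_items].sum() / group_ratings.sum()

def bias(T, group_users, category_items):
    """B(G, C) = PR(G, C) / P(C), where P(C) = |C| / |I| (Eq. 2)."""
    p_c = category_items.sum() / category_items.size  # fraction of items in C
    return preference_ratio(T, group_users, category_items) / p_c

def bias_disparity(T_train, R, group_users, category_items):
    """BD(G, C) = (B_R − B_T) / B_T (Eq. 1)."""
    b_t = bias(T_train, group_users, category_items)
    b_r = bias(R, group_users, category_items)
    return (b_r - b_t) / b_t
```

A positive BD means the recommender amplifies the group's bias toward the category relative to the training data; a negative BD means it suppresses it.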
      <p>
        On the other hand, average disparity measures how much the
preference disparity between training data and recommendation lists for
one group of users (e.g., the unprotected group) differs from that
for another group of users (e.g., the protected group). Inspired by the value
unfairness metric proposed by Yao and Huang [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ], we introduce
the average disparity as:
      </p>
      <p>disparity = (1/|C|) Σi=1..|C| |(NR (GU , Ci ) − NT (GU , Ci )) − (NR (GP , Ci ) − NT (GP , Ci ))| (4)
where GU and GP are the unprotected and protected groups,
respectively. NR (G, C) and NT (G, C) return the number of items from
category C in the recommendation lists and training data, respectively,
that are rated by users in group G.</p>
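      <p>Eq. (4) can be sketched in a few lines, assuming the per-category counts NR and NT have already been computed as length-|C| arrays; the argument names are hypothetical.</p>

```python
import numpy as np

def average_disparity(NR_U, NT_U, NR_P, NT_P):
    """Eq. (4): mean absolute difference between the unprotected group's
    and the protected group's per-category count deviations.
    Each argument is a length-|C| array of item counts per category."""
    unprotected_dev = np.asarray(NR_U) - np.asarray(NT_U)
    protected_dev = np.asarray(NR_P) - np.asarray(NT_P)
    return np.abs(unprotected_dev - protected_dev).mean()
```

A value of 0 means both groups' recommendations deviate from their training preferences by the same amount in every category.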
      <p>As part of our analysis, we also measure the item coverage of
recommended lists, which is an important consideration in provider-side
fairness. Given the whole set of items in the system, I, and the
recommendation lists for all users, Rall, item coverage measures
what percentage of the items in the system appear in the
recommendation lists and can be calculated as:
coverage = |{i ∈ I : i appears in Rall}| / |I| × 100% (5)</p>
      <p>To compare the effects of recommendation algorithms on bias
and on item coverage, we performed extensive experiments
on state-of-the-art recommendation algorithms. Experiments were
performed on model-based, neighborhood-based, and trust-aware
recommendation algorithms.</p>
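      <p>The item coverage metric described above can be sketched as follows; rec_lists is a hypothetical iterable of per-user top-n recommendation lists.</p>

```python
def item_coverage(all_items, rec_lists):
    """Percentage of catalog items that appear in at least one
    recommendation list (a provider-side fairness indicator)."""
    recommended = set()
    for rec in rec_lists:
        recommended.update(rec)   # union of all recommended items
    return 100.0 * len(recommended) / len(all_items)
```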
      <p>
        Our experiments on neighborhood-based recommendation
algorithms include user-based collaborative filtering (UserKNN) [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ] and
item-based collaborative filtering (ItemKNN) [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ]. Our
experiments on model-based recommendation algorithms include biased
matrix factorization (BiasedMF) [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], the combined explicit and
implicit model (SVD++) [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], list-wise matrix factorization (ListRankMF)
[
        <xref ref-type="bibr" rid="ref24">24</xref>
        ], and the sparse linear method (SLIM) [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]. Finally, our
experiments on trust-aware recommendation algorithms include the
trust-aware neighborhood model (TrustKNN) [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ], trust-based singular
value decomposition (TrustSVD) [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], the social regularization-based
method (SoReg) [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ], trust-based matrix factorization (TrustMF)
[
        <xref ref-type="bibr" rid="ref26">26</xref>
        ], and social matrix factorization (SocialMF) [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. Besides these
well-known recommendation algorithms, we also performed
experiments on two naive algorithms: random and most popular.
      </p>
      <p>For sensitivity analysis, we performed extensive experiments
with different parameter configurations for each algorithm. Table 1
shows the parameter configurations we used for our experiments.</p>
      <p>
        We performed 5-fold cross-validation and, in the test condition,
generated recommendation lists of size 10 for each user. Then,
we evaluated nDCG, item coverage, bias disparity, and average
disparity at list size 10. Results were averaged over all users and then
over all folds. We used librec-auto and LibRec 2.0 for all experiments
[
        <xref ref-type="bibr" rid="ref19 ref8">8, 19</xref>
        ].
      </p>
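      <p>LibRec computes nDCG internally; for illustration only, a minimal binary-relevance nDCG@k consistent with the evaluation at list size 10 described above can be sketched as follows (the function and argument names are hypothetical, not LibRec's API).</p>

```python
import math

def ndcg_at_k(ranked_items, relevant_items, k=10):
    """nDCG@k with binary relevance: DCG of the top-k list divided by
    the DCG of an ideal ranking of the user's relevant items."""
    relevant = set(relevant_items)
    dcg = sum(1.0 / math.log2(rank + 2)          # rank 0 -> log2(2)
              for rank, item in enumerate(ranked_items[:k])
              if item in relevant)
    ideal = sum(1.0 / math.log2(rank + 2)
                for rank in range(min(len(relevant), k)))
    return dcg / ideal if ideal > 0 else 0.0
```

Per-user values would then be averaged over all users and folds, as in the protocol above.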
    </sec>
    <sec id="sec-4">
      <title>Yelp dataset</title>
      <p>For our experiments, we use a subset of the Yelp dataset from round 12
of the Yelp Challenge1. In this sample, each user has rated at least 40
businesses and each business is rated by at least 40 users. Thus, there</p>
      <sec id="sec-4-1">
        <title>1https://www.yelp.com/dataset</title>
        <p>are 1,355 users who provided 100,409 ratings on 1,272 businesses.
The range of ratings is 1 (not preferred) to 5 (preferred). The density
of the rating matrix is 5.826%.</p>
        <p>This Yelp dataset also has information about users' friendships.
Each user has selected a set of other users as friends. We
interpret these relationships as a trust network. When user A selects user
B as a friend, it means that user A trusts user B with respect to the
corresponding domain or category. In this dataset, 919 users have
expressed trust toward 1,172 users, and there are 26,453
trust relationships between users. With regard to the number of
users, the density of the trust matrix is 2.456%.</p>
        <p>In order to evaluate the recommendation outputs in terms of bias
disparity and average disparity, specific information about users
and items is needed. First, we need to define user groups based on
users' demographic information and item categories based on item
content. The Yelp dataset contains no information suitable for
defining user groups. To overcome this issue, we prepared the
dataset by extracting users' gender from users' names using
an existing online tool2. Given a user name as input, the tool
returns the predicted gender, the
number of samples used for the prediction, and the prediction accuracy.
Hence, it enables us to increase the reliability of the extracted genders
by keeping only outputs with high accuracy and a fair number of samples.</p>
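        <p>The reliability filtering described above can be sketched as follows; the response keys and the threshold values are hypothetical stand-ins for the tool's actual output fields.</p>

```python
def reliable_genders(predictions, min_accuracy=90, min_samples=100):
    """Keep only gender predictions that meet reliability thresholds.
    `predictions` maps user name -> dict with (hypothetical) keys
    'gender', 'accuracy', and 'samples', mirroring the tool's output."""
    return {name: p["gender"]
            for name, p in predictions.items()
            if p["accuracy"] >= min_accuracy and p["samples"] >= min_samples}
```

Users whose predictions fall below either threshold are simply dropped from the group analysis.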
        <p>Moreover, information about item categories is provided in the
dataset. Each business in the Yelp dataset is assigned multiple relevant
categories.</p>
        <p>Overall, the prepared dataset has four separate sets:
1. The rating data that each user provided for businesses.
2. Explicit trust data recording the users each user has selected as trusted (friends).
3. User information consisting of users' gender.
4. Item categories consisting of several categories for each
business.</p>
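        <p>Given the four sets above, the group and category structures used in the analysis (user ids per gender, business ids per category) can be built with a few lines of plain Python; the row formats here are hypothetical, as the released files' exact layout may differ.</p>

```python
from collections import defaultdict

def group_users(gender_rows):
    """gender_rows: iterable of (user_id, gender) pairs -> {gender: {user ids}}."""
    groups = defaultdict(set)
    for user_id, gender in gender_rows:
        groups[gender].add(user_id)
    return dict(groups)

def category_items(category_rows):
    """category_rows: iterable of (business_id, category) pairs; a business
    may appear under several categories."""
    cats = defaultdict(set)
    for business_id, category in category_rows:
        cats[category].add(business_id)
    return dict(cats)
```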
        <p>Using this dataset, we define the set G = {male, female}
and the set C as the categories assigned to each business. The dataset is
available at https://github.com/masoudmansoury/yelp_core40.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Experimental results</title>
      <p>In this section, we compare the performance of recommendation
algorithms across the different metrics discussed earlier. First, we
show the bias disparity of recommendation results on the top 10 most
preferred item categories. Second, we show the average disparity of
each algorithm on all categories. For a meaningful comparison, we also
take into account ranking quality and item coverage.</p>
      <p>4.3.1 Bias disparity. Results of model-based recommendation
algorithms on the top 10 most preferred item categories for the male and
female groups are shown in Figure 1. Figure 1a shows the bias disparity for
male individuals and Figure 1b shows the bias disparity for female
individuals. Since there is always a trade-off between accuracy and
non-accuracy metrics (e.g., nDCG vs. fairness), for comparison, the
fairness analysis is conducted on recommendation outputs that
give the same nDCG (the highest possible) for all recommendation
algorithms. For model-based recommendation algorithms, the nDCG
value is set to 0.023 ± 0.001. This setting guarantees that the fairness</p>
      <sec id="sec-5-1">
        <title>2https://gender-api.com</title>
        <p>[Figure 1: Bias disparity of model-based algorithms; (a) Male, (b) Female]</p>
        <p>of the recommendation algorithms is compared under the same conditions for
all algorithms.</p>
        <p>As shown in Figure 1, in most cases, SoReg provides lower
bias disparity on the top 10 most preferred categories for the male and
female groups. For males, in Figure 1a, SoReg and SLIM generated
more stable outputs compared to the other algorithms, with the lowest
bias disparity in 40% of cases. On the other hand, for females, SoReg
and ListRankMF generated recommendations with the lowest bias
disparity in 50% and 40% of cases, respectively, when compared to
the other recommendation algorithms.</p>
        <p>In Figure 1, we did not report results for BiasedMF, SVD++,
SocialMF, TrustMF, or the random and most-popular-item
recommenders because these algorithms either did not recommend
any items from the top 10 most preferred categories, or their ranking
quality was lower than the value specified for the other algorithms.</p>
        <p>Results of neighborhood-based recommendation algorithms for
the male and female groups are shown in Figure 2. The nDCG values for
the neighborhood algorithms are all set to 0.074 ± 0.01. Figure 2a shows
the bias disparity of neighborhood models for males. TrustKNN
generated more stable recommendations than the other algorithms
on 50% of the top 10 most preferred categories. Also, for the other categories, its
output is very close to the best one. An even better result in terms
of bias disparity can be observed in Figure 2b for females: on 60%
of the top 10 most preferred categories, TrustKNN worked better than the
other neighborhood algorithms.</p>
        <p>4.3.2 Average disparity. Figure 3 compares the performance of
recommendation algorithms with respect to two criteria: 1) how
accurately recommendation algorithms generate stable (i.e., low
disparity) recommendations for unprotected and protected groups, and
2) how well recommendation algorithms are able to equally
recommend the items belonging to all providers when generating
recommendations (provider-side fairness).</p>
        <p>[Figure 2: Bias disparity of neighborhood-based algorithms; (a) Male, (b) Female]</p>
        <p>For all the experiments that we performed with different
hyperparameters, the best and worst nDCG for each algorithm are reported
in Figure 3.</p>
        <p>The random guess algorithm is a naive approach that randomly
recommends a list of items to each user. Although this algorithm has
low accuracy, it has the highest item coverage and lower average
disparity than the other recommendation algorithms. This
algorithm does not take any preferences into account and is unlikely to
provide good results for any user. Most-popular-item
recommendation is another naive, non-personalized algorithm that only
recommends the items with the highest number of ratings to each user.
Although it has high ranking quality and average disparity similar
to the model-based recommendation algorithms, it has the lowest item
coverage. These algorithms provide baselines that other algorithms
should be expected to beat.</p>
        <p>Among the neighborhood models, TrustKNN showed better performance.
Although it has lower ranking quality than UserKNN and ItemKNN,
it has significantly better item coverage and average disparity. One
possible reason for the low nDCG of TrustKNN is the high sparsity
of the trust matrix. Using a propagation model to reduce the
sparsity of the trust matrix may increase the ranking quality of TrustKNN.
Overall, the neighborhood algorithms worked better than the model-based
algorithms in terms of all metrics. This is due to the fact that the
rating data for these experiments is very dense and all users are
heavy raters.</p>
        <p>Among the model-based algorithms, SLIM shows better performance
than the others. From Figure 3a, while showing high
nDCG, it has the lowest average disparity, and its item coverage
is comparable to the other model-based algorithms.
This result is also consistent with the definition of the SLIM algorithm,
which is an extension of ItemKNN; analogous to the neighborhood
algorithms, it performed notably well.</p>
        <p>[Figure 3: (a) nDCG vs. average disparity, (b) nDCG vs. item coverage]</p>
        <p>In addition, ListRankMF is another model-based algorithm that,
although having high accuracy and item coverage, has average
disparity as high as the other algorithms. Also, among model-based
trust-aware recommendation algorithms, although SoReg showed a
significant reduction in bias disparity on the top 10 most preferred
categories, it did not improve the average disparity on all categories.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>CONCLUSION</title>
      <p>In this paper, we examined the effectiveness of recommendation
algorithms in generating outputs with lower bias disparity for
different groups of users across item categories. We measured the
performance of recommendation algorithms in terms of bias
disparity on the top 10 most preferred item categories, average disparity,
ranking quality, and item coverage. A comprehensive set of
experiments showed that the neighborhood models work significantly
better than the other algorithms, particularly the trust-aware
neighborhood model, which outperformed the other algorithms. Also, we observed
that in most cases, having additional information along with rating
data can enhance the performance of recommender systems.</p>
      <p>For future work, we would like to investigate individual fairness
by considering the performance of recommendation algorithms in
capturing individual users' interests across different item categories.
Also, we are interested in repeating the experiments of this paper
on another sample of the Yelp dataset with sparser rating data and
denser trust data to see how well recommendation algorithms are able
to control bias disparity.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Himan</given-names>
            <surname>Abdollahpouri</surname>
          </string-name>
          , Gediminas Adomavicius, Robin Burke, Ido Guy, Dietmar Jannach, Toshihiro Kamishima, Jan Krasnodebski, and Luiz Augusto Pizzato.
          <year>2019</year>
          . Beyond Personalization: Research Directions in Multistakeholder Recommendation. CoRR abs/1905.01986 (2019). arXiv:1905.01986 http://arxiv.org/abs/1905.01986
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Himan</given-names>
            <surname>Abdollahpouri</surname>
          </string-name>
          , Robin Burke, and
          <string-name>
            <given-names>Bamshad</given-names>
            <surname>Mobasher</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Controlling Popularity Bias in Learning-to-Rank Recommendation</article-title>
          .
          <source>In RecSys '17 Proceedings of the Eleventh ACM Conference on Recommender Systems</source>
          .
          <fpage>42</fpage>
          -
          <lpage>46</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Engin</given-names>
            <surname>Bozdag</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Bias in algorithmic filtering and personalization</article-title>
          .
          <source>Ethics and information technology 15</source>
          ,
          <issue>3</issue>
          (
          <year>2013</year>
          ),
          <fpage>209</fpage>
          -
          <lpage>227</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Robin</given-names>
            <surname>Burke</surname>
          </string-name>
          , Nasim Sonboli, Masoud Mansoury, and
          <string-name>
            <given-names>Aldo</given-names>
            <surname>Ordoñez-Gauger</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Balanced neighborhoods for fairness-aware collaborative recommendation</article-title>
          .
          <source>In RecSys workshop on Fairness, Accountability and Transparency in Recommender Systems.</source>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Robin D.</given-names>
            <surname>Burke</surname>
          </string-name>
          , Himan Abdollahpouri, Bamshad Mobasher, and
          <string-name>
            <given-names>Trinadh</given-names>
            <surname>Gupta</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Towards Multi-Stakeholder Utility Evaluation of Recommender Systems</article-title>
          .
          <source>In UMAP (Extended Proceedings).</source>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Cynthia</given-names>
            <surname>Dwork</surname>
          </string-name>
          , Moritz Hardt, Toniann Pitassi, Omer Reingold, and Richard Zemel.
          <year>2012</year>
          .
          <article-title>Fairness through awareness</article-title>
          .
          <source>In Proceedings of the 3rd Innovations in Theoretical Computer Science Conference</source>
          .
          <fpage>214</fpage>
          -
          <lpage>226</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Michael D.</given-names>
            <surname>Ekstrand</surname>
          </string-name>
          , Mucun Tian, Ion Madrazo Azpiazu,
          <string-name>
            <given-names>Jennifer D.</given-names>
            <surname>Ekstrand</surname>
          </string-name>
          , Oghenemaro Anuyah, David McNeill, and
          <string-name>
            <given-names>Maria Soledad</given-names>
            <surname>Pera</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>All The Cool Kids, How Do They Fit In?: Popularity and Demographic Biases in Recommender Evaluation and Effectiveness</article-title>
          . In
          <source>Conference on Fairness, Accountability and Transparency</source>
          .
          <fpage>172</fpage>
          -
          <lpage>186</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Guibing</given-names>
            <surname>Guo</surname>
          </string-name>
          , Jie Zhang, Zhu Sun, and
          <string-name>
            <given-names>Neil</given-names>
            <surname>Yorke-Smith</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>LibRec: A Java Library for Recommender Systems</article-title>
          .
          <source>In UMAP Workshops</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Guibing</given-names>
            <surname>Guo</surname>
          </string-name>
          , Jie Zhang, and
          <string-name>
            <given-names>Neil</given-names>
            <surname>Yorke-Smith</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>TrustSVD: collaborative filtering with both the explicit and implicit influence of user trust and of item ratings</article-title>
          .
          <source>In Twenty-Ninth AAAI Conference on Artificial Intelligence</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Mohsen</given-names>
            <surname>Jamali</surname>
          </string-name>
          and
          <string-name>
            <given-names>Martin</given-names>
            <surname>Ester</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>A matrix factorization technique with trust propagation for recommendation in social networks</article-title>
          .
          <source>In Proceedings of the fourth ACM conference on Recommender systems</source>
          .
          <fpage>135</fpage>
          -
          <lpage>142</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Dietmar</given-names>
            <surname>Jannach</surname>
          </string-name>
          , Lukas Lerche, Iman Kamehkhosh, and
          <string-name>
            <given-names>Michael</given-names>
            <surname>Jugovac</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>What recommenders recommend: an analysis of recommendation biases and possible countermeasures</article-title>
          .
          <source>User Modeling and User-Adapted Interaction 25</source>
          ,
          <issue>5</issue>
          (
          <year>2015</year>
          ),
          <fpage>427</fpage>
          -
          <lpage>491</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Faisal</given-names>
            <surname>Kamiran</surname>
          </string-name>
          , Toon Calders, and
          <string-name>
            <given-names>Mykola</given-names>
            <surname>Pechenizkiy</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>Discrimination aware decision tree learning</article-title>
          .
          <source>In 2010 IEEE International Conference on Data Mining</source>
          .
          <fpage>869</fpage>
          -
          <lpage>874</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Toshihiro</given-names>
            <surname>Kamishima</surname>
          </string-name>
          , Shotaro Akaho, and
          <string-name>
            <given-names>Jun</given-names>
            <surname>Sakuma</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>Fairness-aware learning through regularization approach</article-title>
          .
          <source>In 11th International Conference on Data Mining Workshops</source>
          .
          <fpage>643</fpage>
          -
          <lpage>650</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Yehuda</given-names>
            <surname>Koren</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <article-title>Factorization meets the neighborhood: a multifaceted collaborative filtering model</article-title>
          .
          <source>In Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM</source>
          ,
          <fpage>426</fpage>
          -
          <lpage>434</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Yehuda</given-names>
            <surname>Koren</surname>
          </string-name>
          , Robert Bell, and
          <string-name>
            <given-names>Chris</given-names>
            <surname>Volinsky</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>Matrix factorization techniques for recommender systems</article-title>
          .
          <source>Computer 42</source>
          ,
          <issue>8</issue>
          (
          <year>2009</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Shyong K.</given-names>
            <surname>Lam</surname>
          </string-name>
          and
          <string-name>
            <given-names>John</given-names>
            <surname>Riedl</surname>
          </string-name>
          .
          <year>2004</year>
          .
          <article-title>Shilling recommender systems for fun and profit</article-title>
          .
          <source>In Proceedings of the 13th international conference on World Wide Web. ACM</source>
          ,
          <fpage>393</fpage>
          -
          <lpage>402</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>Weiwen</given-names>
            <surname>Liu</surname>
          </string-name>
          and
          <string-name>
            <given-names>Robin</given-names>
            <surname>Burke</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Personalizing Fairness-aware Re-ranking</article-title>
          . CoRR abs/1809.02921 (
          <year>2018</year>
          ). arXiv:1809.02921 http://arxiv.org/abs/1809.02921
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Hao</given-names>
            <surname>Ma</surname>
          </string-name>
          , Dengyong Zhou, Chao Liu,
          <string-name>
            <given-names>Michael R.</given-names>
            <surname>Lyu</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Irwin</given-names>
            <surname>King</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>Recommender systems with social regularization</article-title>
          .
          <source>In Proceedings of the fourth ACM international conference on Web search and data mining</source>
          .
          <fpage>287</fpage>
          -
          <lpage>296</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Masoud</given-names>
            <surname>Mansoury</surname>
          </string-name>
          , Robin Burke, Aldo Ordonez-Gauger, and
          <string-name>
            <given-names>Xavier</given-names>
            <surname>Sepulveda</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Automating recommender systems experimentation with librec-auto</article-title>
          .
          <source>In Proceedings of the 12th ACM Conference on Recommender Systems. ACM</source>
          ,
          <fpage>500</fpage>
          -
          <lpage>501</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Paolo</given-names>
            <surname>Massa</surname>
          </string-name>
          and
          <string-name>
            <given-names>Paolo</given-names>
            <surname>Avesani</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>Trust-aware recommender systems</article-title>
          .
          <source>In Proceedings of the 2007 ACM conference on Recommender systems. ACM</source>
          ,
          <fpage>17</fpage>
          -
          <lpage>24</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>Xia</given-names>
            <surname>Ning</surname>
          </string-name>
          and
          <string-name>
            <given-names>George</given-names>
            <surname>Karypis</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>SLIM: Sparse Linear Methods for Top-N Recommender Systems</article-title>
          .
          <source>In Data Mining (ICDM), 2011 IEEE 11th International Conference on. IEEE</source>
          ,
          <fpage>497</fpage>
          -
          <lpage>506</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>Paul</given-names>
            <surname>Resnick</surname>
          </string-name>
          , Neophytos Iacovou, Mitesh Suchak,
          <string-name>
            <given-names>Peter</given-names>
            <surname>Bergstrom</surname>
          </string-name>
          , and
          <string-name>
            <given-names>John</given-names>
            <surname>Riedl</surname>
          </string-name>
          .
          <year>1994</year>
          .
          <article-title>GroupLens: an open architecture for collaborative filtering of netnews</article-title>
          .
          <source>In Proceedings of the 1994 ACM conference on Computer supported cooperative work. ACM</source>
          ,
          <fpage>175</fpage>
          -
          <lpage>186</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>Badrul</given-names>
            <surname>Sarwar</surname>
          </string-name>
          , George Karypis, Joseph Konstan, and
          <string-name>
            <given-names>John</given-names>
            <surname>Riedl</surname>
          </string-name>
          .
          <year>2001</year>
          .
          <article-title>Item-based collaborative filtering recommendation algorithms</article-title>
          .
          <source>In WWW'01 Proceedings of the 10th international conference on World Wide Web</source>
          .
          <fpage>285</fpage>
          -
          <lpage>295</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>Yue</given-names>
            <surname>Shi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Martha</given-names>
            <surname>Larson</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Alan</given-names>
            <surname>Hanjalic</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>List-wise learning to rank with matrix factorization for collaborative filtering</article-title>
          .
          <source>In Proceedings of the fourth ACM conference on Recommender systems. ACM</source>
          ,
          <fpage>269</fpage>
          -
          <lpage>272</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>Virginia</given-names>
            <surname>Tsintzou</surname>
          </string-name>
          , Evaggelia Pitoura, and
          <string-name>
            <given-names>Panayiotis</given-names>
            <surname>Tsaparas</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Bias Disparity in Recommendation Systems</article-title>
          . CoRR abs/1811.01461 (
          <year>2018</year>
          ). arXiv:1811.01461 http://arxiv.org/abs/1811.01461
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>Bo</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Yu</given-names>
            <surname>Lei</surname>
          </string-name>
          , Jiming Liu, and
          <string-name>
            <given-names>Wenjie</given-names>
            <surname>Li</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Social collaborative filtering by trust</article-title>
          .
          <source>IEEE Transactions on Pattern Analysis and Machine Intelligence</source>
          <volume>39</volume>
          ,
          <issue>8</issue>
          (
          <year>2017</year>
          ),
          <fpage>1633</fpage>
          -
          <lpage>1647</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>Sirui</given-names>
            <surname>Yao</surname>
          </string-name>
          and
          <string-name>
            <given-names>Bert</given-names>
            <surname>Huang</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Beyond parity: Fairness objectives for collaborative filtering</article-title>
          .
          <source>In Advances in Neural Information Processing Systems</source>
          .
          <fpage>2921</fpage>
          -
          <lpage>2930</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>Rich</given-names>
            <surname>Zemel</surname>
          </string-name>
          , Yu Wu, Kevin Swersky, Toni Pitassi, and
          <string-name>
            <given-names>Cynthia</given-names>
            <surname>Dwork</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Learning fair representations</article-title>
          .
          <source>In International Conference on Machine Learning</source>
          .
          <fpage>325</fpage>
          -
          <lpage>333</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>Ziwei</given-names>
            <surname>Zhu</surname>
          </string-name>
          , Xia Hu, and
          <string-name>
            <given-names>James</given-names>
            <surname>Caverlee</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Fairness-aware tensor-based recommendation</article-title>
          .
          <source>In Proceedings of the 27th ACM International Conference on Information and Knowledge Management</source>
          .
          <fpage>1153</fpage>
          -
          <lpage>1162</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>Özge</given-names>
            <surname>Sürer</surname>
          </string-name>
          , Robin Burke, and
          <string-name>
            <given-names>Edward C.</given-names>
            <surname>Malthouse</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Multistakeholder recommendation with provider constraints</article-title>
          .
          <source>In Proceedings of the 12th ACM Conference on Recommender Systems</source>
          .
          <fpage>54</fpage>
          -
          <lpage>62</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>