Fair Exposure of Documents in Information Retrieval: a Community Detection Approach

Adrian-Gabriel Chifu, Aix Marseille Univ, Université de Toulon, CNRS, LIS, Marseille, France, adrian.chifu@univ-amu.fr
Josiane Mothe, INSPE, Univ. de Toulouse, IRIT UMR5505 CNRS, Toulouse, France, Josiane.Mothe@irit.fr
Md Zia Ullah, IRIT UMR5505 CNRS, Toulouse, France, mdzia.ullah@irit.fr

ABSTRACT
While (mainly) designed to answer users' needs, search engines and recommendation systems do not necessarily guarantee the exposure of the data they store and index, although it can be essential for information providers. A recent research direction, the so-called "fair" exposure of documents, tackles this problem in information retrieval. It has mainly been cast into a re-ranking problem with constraints and optimization functions. This paper presents the first steps toward a new framework for fair document exposure. This framework is based on document linking and document community detection; communities are used to rank the documents to be retrieved according to an information need. In addition to the first step of this new framework, we present its potential through both a toy example and a few illustrative examples from the 2019 TREC Fair Ranking Track data set.

KEYWORDS
Information systems, Information retrieval, Fair document exposure, Document network, Document communities, Document re-ranking

"Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0)."

1 INTRODUCTION
While (mainly) designed to answer users' needs, search engines and recommendation systems (RS) do not necessarily guarantee the exposure of the data they store and index. For example, many algorithms are founded on both content and collaborative selection of items. They are likely to retrieve or recommend the items that other users have liked or seen on similar need topics. In the recommendation domain, this problem is known as the cold start problem, where new items are unlikely to be recommended since no user clicked on them when they were just added [24, 26, 27]. In search engines, ranking algorithms also include information on past searches and users' preferences [15, 34], which favors the most popular documents to the detriment of others, new or less known, although potentially equally relevant. To solve this problem, algorithms both in Information Retrieval (IR) and RS have been developed that aim at diversifying the results. In IR, diversity is mainly considered in relation to ambiguity [9], where the problem is to provide the user documents that answer the various meanings of a word (e.g., Orange as a telecom company, a French city, a color, or a fruit). In RS, many algorithms also diversify the results by adding items that would not be recommended otherwise [22, 26].

Another recent research direction is the so-called "fair" exposure of documents. It has mainly been cast into a re-ranking problem with constraints and optimization functions [2, 29, 38]. In 2019, TREC also decided to tackle this problem in one of its tracks: the Fair Ranking Track (https://fair-trec.github.io/). The rationale is that not only users are concerned by which information is retrieved, but also information producers, who are certainly inclined to store their data on platforms that give them the best exposure to users. As defined by the TREC organizers, the Fair Ranking Track considers scientific documents, and the central goal is to provide fair exposure to different groups of authors. Fair exposure is certainly a much more complex problem than currently defined since, when platforms disclose the algorithms they use, producers try to fit them to be better exposed. However, this track as it is corresponds to a first interesting step and is likely to result in important milestones related to IR transparency.

This paper presents the first steps toward a new framework for fair document exposure in information retrieval. This framework is based on information clustering and document linking that is then used to select the documents to be retrieved according to an information need. The rest of the paper is organized as follows: Section 2 includes some related work. Section 3 presents the first step of our method along with a toy example. Section 4 describes the preliminary results on some examples from the TREC Fair Ranking Track. Section 5 concludes this paper and presents our future work.

2 RELATED WORK
Fair algorithms. Designing fair algorithms has recently emerged as a way to tackle the underlying bias in the training data or in the algorithm itself. Chierichetti et al. [7] study fair clustering algorithms (a protected class must have fair representation in the different clusters) under disparate impact [14]. The authors introduce fairlets that satisfy fair representation while maintaining the clustering objective. They then reduce the fair clustering problem to fairlet decomposition and use a classical clustering algorithm to obtain a balanced representation of the protected class in the different clusters. In a further study, Chierichetti et al. [8] focus on the large class of fair algorithmic problems for optimization subject to matroid constraints such as cardinality (sets), connectivity (graphs), or matching (subgraphs). They found that matroid constraints are general enough to encode many different types of problems while being relatively easy to optimize.

Document diversity. While not initially formulated as a fair exposure of data, diversity is nonetheless a way to provide document exposure.
Diversification is the process of re-ranking the documents initially retrieved for the query or selected for recommendation, taking into account the coverage, importance, and novelty of the documents in an implicit or explicit approach to reduce the search bias [37, 40]. Result diversification is generally formulated as an optimization problem, and diversification methods differ by how they implement the objective function.

Some methods consider the document content only, such as Maximal Marginal Relevance (MMR) [5], which selects the best document incrementally by balancing the relevance to the query and the topic novelty. Fusion diversification is another example of content-based methods, where topic modeling is used to infer latent subtopics; the results of sub-topic rankers are then fused to obtain the final ranking [23].

Other methods use external cues which represent users' intention, like Xia et al., who proposed a diverse ranking approach based on a continuous state Markov decision process to model the user perceived utility at different ranks [41]. Also considering the users' perspective, Santos et al. proposed xQuAD, a greedy diversification algorithm that maximizes the coverage of explicitly mined query subtopics [36]. Wasilewski et al. proposed an intent-aware collaborative filtering based recommendation system where the system diversifies the recommended items by integrating the relevance, user intentions, and item-aspect associations [40]. Abdollahpouri et al. introduced a method to diversify the recommended items by reducing the popularity bias and enhancing the novelty and coverage [1]. Küçüktunç et al. [21] proposed an enhanced random-walk based diversification for citation recommendation using the citation graph to allow users to reach the old, fundamental and recent papers.

Fair document exposure. None of the above mentioned methods provides fair item exposure in its re-ranking step. Recently, different frameworks have been proposed for the formulation of fairness constraints on rankings in terms of exposure allocation [2, 38]. In [38], the authors aim to find rankings that would maximize the fairness with respect to various facets, by proposing a general framework that employs probabilistic rankings and linear programming. In IR, the concept of fair exposure is seen as the average attention received by ranked items [13], as an optimization problem with fairness constraints [17], or as a diversity ranking problem [18]. Diaz et al. [13] argued that measuring and optimizing expected exposure metrics using randomization opens a new area for retrieval algorithms, while Gao and Shah [17] estimated the solution space and answered questions about the impact of fairness considerations in the system, or about the trade-off between various optimization strategies. Gao and Shah [18] also studied the top-k diversity fairness ranking in terms of statistical parity fairness (equal number of documents from different groups) and disparate impact fairness (unintentional discrimination against a group).

Some approaches consider fair exposure as a re-ranking problem [25, 30]. Liu and Burke [25] generated the ranking list based on a linear combination trade-off between accuracy and provider coverage, and argued that the algorithm can significantly promote fairness, however with a slight loss in terms of accuracy. Modani et al. [30] introduced the concept of representative diversity, which means the level of interest of the consumer in different categories. They noted that higher diversity and fairness would lead to increased user acceptance regarding the results.

From the evaluation point of view, the authors of [12] proposed an interpolation strategy to integrate relevance and fairness into a unique evaluation metric. They investigated how existing TREC corpora could be adapted for a fairness task. Castillo [6] reviewed the recent works on fairness and transparency in ranking.

In RS, Khenissi and Nasraoui modeled the user exposure in an iterative closed feedback loop and developed a de-biasing strategy to reduce the bias inherent in the recommendation system [20]. Mehrotra et al. proposed a framework to evaluate the trade-off between the relevance of the recommendation to the customers and the fairness of representation of the suppliers in a two-sided marketplace such as Airbnb or Spotify [29].

In the TREC 2019 Fair Ranking Track, McDonald et al. [28] cast fair ranking as a diversification problem and employed two diversification approaches based on xQuAD [36], optimizing the authors' influence in one run, and the authors' and venues' influence in another run; the results show that both runs are effective in minimizing unfairness. Wang et al. [39] used a convolutional kernel-based neural ranking model (Conv-KNRM) to obtain a relevance oriented ranking, and then applied document swapping for author exposure. Bonart, another participant, used a basic learning-to-rank framework for solving the track, stressing that the main challenge was that multiple objectives (searchers' relevance and authors' group fairness) have to be optimized under the constraint of an unknown author-group partition.

3 PROPOSED METHODOLOGY
While in this paper we are not providing a completed framework for fair information exposure, we are describing the first steps toward it. In this section, we present the main intuition that drove our proposal and the main principles, as well as a toy example to illustrate the model.

Rationale of our model. Rather than considering document ranking constraints, as in previous work [17, 38], our model relies on graph structure and community finding in order to solve the issue of fair document exposure.

Fair document exposure can be considered from different views, or considering different facets [38]. One of them is what we call fair item exposure. In that case, the retrieved documents should expose as many items as possible within as small a document set as possible. For example, if items are authors, then the objective is to expose as many authors as possible in the (relevant) retrieved documents. The second view is what we call fair community exposure. In that case, we want to provide a fair exposure of the different communities of documents. A document community is based on the links between documents as defined by their shared items. The main idea is that the top ranked documents should not belong to the same document community, so as to ensure a fair exposure of the variety of documents in terms of their items (e.g., their authors, sources, affiliations, opinions).

Document communities are extracted from the document network, which is composed of the initially retrieved documents as nodes. As such, the ground of the model is similar to the popular PageRank algorithm [4]. However, rather than considering the in and out hyperlinks or explicit references between documents, document links are extracted from implicit links. The implicit links are based on meta-data that documents share (e.g., common authors). Such links can be weighted and possibly directed, depending on what they represent. Indeed, there is a large variety of types of links that can be considered between documents. For example, considering scientific publications, documents can be linked according to the authors they have in common.
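To make this linking concrete, the author-based network can be sketched in a few lines of Python; this is a minimal illustration (all document and author names are invented), with the edge weight computed as the Jaccard overlap of Eq. (1):

```python
from itertools import combinations

# Toy corpus: each document carries the set of items of one type that
# it exposes (here, authors); all names are invented for illustration.
docs = {
    "d1": {"alice", "bob"},
    "d2": {"bob", "carol"},
    "d3": {"dave"},
}

def build_document_network(doc_items):
    """Return the weighted edges of the implicit document network.

    Following Eq. (1), the weight of the edge between two documents is
    the Jaccard overlap of their item sets; two documents sharing no
    item are simply not linked.
    """
    edges = {}
    for (di, items_i), (dj, items_j) in combinations(doc_items.items(), 2):
        shared = items_i & items_j
        if shared:
            edges[(di, dj)] = len(shared) / len(items_i | items_j)
    return edges

network = build_document_network(docs)
print(network)  # single edge d1-d2 (shared author "bob"), weight 1/3
```

Replacing the author sets by venue or publication-year sets yields the other link types discussed in the paper without changing the construction.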
The more common authors two documents have, the higher the weight of the edge between the two corresponding nodes in the network. Once the document network is built, the main document communities can easily be extracted using state of the art community detection algorithms such as Fast Greedy [10], Leading Eigenvector [32], Walktrap [33] or Infomap [35], for example.

The principle is then to define a document ranking that both merges the communities (top documents should belong to different communities) and ranks the documents within a given community while considering the topic-document relevance.

Using contextual networks & community detection. In general, the goal of community detection is to partition any network into communities in order to extract the subgroups of densely connected nodes [16, 19]. The elements from a given community are then considered as very close with regard to the cue the edges represent. The problem of fair exposure of documents is then cast into the problem of a fair exposure of each "community" of documents. Our model thus relies on a network where the nodes correspond to documents, while the edges between nodes are a means to contextualize the type of exposure the model should give to documents. The edges, and thus the document network, are contextual and depend on the targeted type of document exposure. Indeed, documents can be linked in different ways, depending on the types of documents and the metadata which are associated with them. The principle is also very close to the linked data principle: "The term Linked Data refers to a set of best practices for publishing and connecting structured data on the Web" [3]. As an example, linking documents according to the shared countries their authors belong to is a way to structure the document space and extract communities of documents that are authored by people from the same region of the world. As another example, linking the documents according to the venue where they have been published is another way to structure the document space. This document space representation has several advantages:
• First, it is very adaptive: the definition of the edges can depend on the type of documents or the type of exposure one wants to reveal;
• Second, different types of linking can be easily combined: it is possible to link documents both regarding the countries the authors belong to and, at the same time, the venue where they have published;
• Third, well-known community detection algorithms can be easily applied; some are known to result in smaller but denser communities, while others yield larger communities [11, 31];
• Finally, the communities can be adaptive and can be defined in a document type-based, user-based or query-based manner to better fit the objectives or the definition of the fairness of the exposure (e.g., changing the type(s) of links to consider between the retrieved documents or changing the community detection algorithm).

Problem formulation.
• Document communities:
D_k is the set of k documents {d_1, d_2, ..., d_k}. Each d_j becomes a node of the document network. An edge between two document nodes is typed and depends on the meta-data or items used. Let I^T_k be the set of items of type T extracted from D_k. The weight of the edge between d_i and d_j is defined as:

Weight(d_i, d_j) = |I^T_{d_i} ∩ I^T_{d_j}| / |I^T_{d_i} ∪ I^T_{d_j}|    (1)

• Fair exposure of items:
Let I be a set of items and d_j be a document. I_{d_j} is the set of items associated with a document d_j and |I_{d_j}| is the number of those unique items. In the same way, |I_{D_k}| is the number of unique items associated with D_k, where D_k is the set of k documents {d_1, d_2, ..., d_k}.
R is the initial (assumed non-empty) list of N retrieved documents {d_1, d_2, ..., d_N}, ordered by document score, expressed as the probability of the document d_k at rank k to be relevant to the query q (P(d_k | q)).
S_{k-1} is the set of documents already selected with fair exposure to items at the (k-1)-th iteration, initially empty (S_0 = ∅, i.e., for k = 1).
We want to optimize the document selection in an iterative way. d*_k is the best fair document to be selected at the k-th iteration and is calculated as follows:

d*_k = argmax_{d_i ∈ R \ S_{k-1}} [ γ P(d_i | q) + (1 − γ) E(d_i, S_{k-1}) ]    (2)

where γ ∈ [0, 1] is a combining parameter and R \ S_{k-1} is the list of documents from R that have not been selected yet in S_{k-1} at the (k-1)-th iteration. Then,

S_k = S_{k-1} ∪ {d*_k}    (3)

The function E(d_i, S_{k-1}) measures the exposure of the items of the document d_i given the document set S_{k-1} and is defined as follows:

E(d_i, S_{k-1}) = |I_{d_i} ∪ I_{S_{k-1}}| / |I_R|    (4)

• Fair exposure of communities:
Let C be a set of communities detected from the graph of documents. Each community c ∈ C contains a set of documents D_c. Like previously, I_{D_c} is the set of items in D_c and |I_{D_c}| is the number of those items.
For fair exposure of communities, we want to optimize the document selection over the communities C. We define this selection in an iterative way. d*_k is the document to be selected for fair exposure of communities at the k-th iteration and is calculated as a trade-off between document relevance and exposure as follows:

d*_k = argmax_{d_i ∈ R \ S_{k-1}} [ γ P(d_i | q) + (1 − γ) E(d_i, S_{k-1} | C) ]    (5)

Like in Equation 2, P(d_i | q) is the probability of the document d_i to be relevant for q. The function E(d_i, S_{k-1} | C) measures the exposure of the items of the document d_i in the document set S_{k-1} considering the set of communities C and is defined as follows:

E(d_i, S_{k-1} | C) = Σ_{c ∈ C} P(d_i, S_{k-1} | c) P(c)    (6)

where P(c) denotes the probability of the community c, which is considered uniform (1/|C|) for any community c ∈ C and can be ignored for the ranking objective.
In Equation 6, P(d_i, S_{k-1} | c) estimates the exposure of the items of document d_i given the exposure of the items of the already selected documents S_{k-1} that belong to the community c, and can be defined as follows:

P(d_i, S_{k-1} | c) = P(d_i | c) Π_{d_j ∈ S_{k-1}} (1 − P(d_j | c))    (7)

where P(d_i | c) = (|I_{d_i}| / |I_{D_c}|) × 1_{d_i} denotes the exposure of the items of document d_i (|I_{d_i}|) over the items of community c (|I_{D_c}|) if document d_i belongs to community c. The indicator 1_{d_i} makes the probability 0 if the document does not belong to the community and is defined as follows:

1_{d_i} = 1 if d_i ∈ c, 0 otherwise.
4 PRELIMINARY RESULTS
TREC Fair Ranking Track and data. In order to provide a preliminary evaluation and a proof of concept, we used the Fair Ranking data as released by the track organizers. In this shared task, participants are asked to re-rank query-based retrieved document sets in order to optimize both the relevance to the consumers and the fairness to the producers. The data consists of:
• the Semantic Scholar (S2) Open Corpus from the Allen Institute for Artificial Intelligence (http://api.semanticscholar.org/corpus/, the 2019-01-31 version), which consists of 47 data files of 1GB each. Most of the papers consist of the identifier, its DOI, title, abstract, and authors along with their IDs, inbound, and outbound citations;
• the Query log provided by Semantic Scholar. For each query, the organizers provide the query ID, the query string, the query normalized frequency, as well as a list of documents to rerank. There are 652 training and 635 evaluation queries, although not all contain a non-empty document list.

Table 1: Examples of Fair Ranking track queries along with some statistics. "#" stands for "number of different", "AVG" stands for "average number ... per document". Docs corresponds to documents, while JN is for journals.

Query  | Docs # | Authors # | Authors AVG | Entities # | Entities AVG | JN #
Q38944 | 12     | 64        | 5.58        | 122        | 14           | 7
Q5842  | 13     | 56        | 4.3         | 98         | 10.5         | 9

[Figure 1: Document communities extracted from the retrieved documents for two of the 2019 Fair Ranking Track queries. Panels: (a) Q46933 author-based, (b) Q46933 venue-based, (c) Q46933 year-based; (i) Query Q5842 venue-based, (ii) Query Q5842 year-based. Document IDs have been shortened for readability (using the first two and the last characters of the real IDs).]

Preliminary results. In this first experiment, we do not provide exhaustive results on the entire collection but rather some illustrative examples. To start with, we considered a few queries for which the set of associated documents contains at least 10 documents. We present two of them to illustrate the potential power of our model.

Query Q38944 is one of them (detailed in the first row of Table 1). For this query, the first two rows in Figure 1 display the document network and the document communities based on (a) co-authorship, (b) the venues where the papers were published, and (c) the year of publication. In (a), two documents are linked if they have at least one author in common, while in (b) (resp. (c)), two documents are linked if they have been published in the same venue (resp. in the same year). Other types of linking could have been made using the "In" or "Out" citation links, or the entities (key-words), for example. For a fair exposure of the documents associated with Query Q38944, considering venues, the group "dbf", "c1f", "e7a" and "dde", for example, should have only one of its documents top ranked, as the others are redundant regarding the venue. A fair exposure regarding the time line of the field could also be considered.

In a similar way, for Query Q5842 (second row in Table 1), we obtained the document communities as presented in Figure 1 when considering (i) the venues where the papers are published and (ii) the years of publication.

While this method does not yet include the final ordering of the documents, it opens new ways of considering diversity and/or fair exposure according to various criteria that can be either considered individually or combined. An interesting point of the model is that the criteria can evolve according to the needs.
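The community-exposure trade-off of Equations (5)-(7), which underlies the grouping illustrated in these examples, can be sketched as follows (the communities and item sets are invented; the uniform prior P(c) is dropped since it does not affect the ranking):

```python
from math import prod

def p_doc_given_community(doc, community, item_sets):
    # |I_d| / |I_Dc| when the document belongs to the community,
    # 0 otherwise (the indicator 1_d of Section 3).
    if doc not in community:
        return 0.0
    community_items = set().union(*(item_sets[d] for d in community))
    return len(item_sets[doc]) / len(community_items)

def community_exposure(doc, selected, communities, item_sets):
    """Eqs. (6)-(7), with the uniform community prior P(c) dropped
    since it does not change the ranking."""
    return sum(
        p_doc_given_community(doc, c, item_sets)
        * prod(1 - p_doc_given_community(d, c, item_sets) for d in selected)
        for c in communities
    )

# Two invented communities over four documents, one item each.
communities = [{"d1", "d2"}, {"d3", "d4"}]
items = {"d1": {"a"}, "d2": {"b"}, "d3": {"c"}, "d4": {"d"}}

# Once d1 is selected, a document from the other community is favoured:
print(community_exposure("d2", ["d1"], communities, items))  # → 0.25
print(community_exposure("d3", ["d1"], communities, items))  # → 0.5
```

Plugged into the argmax of Equation (5), this score pushes the next selected document toward a community that is not yet represented in the top ranks.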
5 CONCLUSION
In this paper, we paved the way for a new framework for fair document exposure. We presented the model. We also presented some illustrative examples when considering the TREC Fair Ranking Track 2019 data.

In future work, first, we will develop the model in a formal way and complete the framework so that it includes a total document ordering. We will also evaluate it in a more general way by considering the entire 2019 TREC Fair Ranking Track data the organizers provided, as well as the track evaluation measures. As mentioned by the organizers, in the 2019 collection a lot of queries have too few retrieved documents; a method such as ours makes more sense when the list of documents is large enough. As future work, we will also participate in the track in 2020 with our model.

REFERENCES
[1] Himan Abdollahpouri, Robin Burke, and Bamshad Mobasher. 2019. Managing popularity bias in recommender systems with personalized re-ranking. In The Thirty-Second International Flairs Conference.
[2] Asia J. Biega, Krishna P. Gummadi, and Gerhard Weikum. 2018. Equity of attention: Amortizing individual fairness in rankings. In The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval. 405–414.
[3] Christian Bizer, Tom Heath, and Tim Berners-Lee. 2011. Linked data: The story so far. In Semantic services, interoperability and web applications: emerging concepts. 205–227.
[4] Sergey Brin and Lawrence Page. 1998. The anatomy of a large-scale hypertextual web search engine. (1998).
[5] Jaime Carbonell and Jade Goldstein. 1998. The use of MMR, diversity-based reranking for reordering documents and producing summaries. In Proc. of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. 335–336.
[6] Carlos Castillo. 2019. Fairness and transparency in ranking. In ACM SIGIR Forum, Vol. 52. 64–71.
[7] Flavio Chierichetti, Ravi Kumar, Silvio Lattanzi, and Sergei Vassilvitskii. 2017. Fair clustering through fairlets. In Advances in Neural Information Processing Systems. 5029–5037.
[8] Flavio Chierichetti, Ravi Kumar, Silvio Lattanzi, and Sergei Vassilvitskii. 2019. Matroids, matchings, and fairness. In The 22nd International Conference on Artificial Intelligence and Statistics. 2212–2220.
[9] Charles L. A. Clarke, Maheedhar Kolla, Gordon V. Cormack, Olga Vechtomova, Azin Ashkan, Stefan Büttcher, and Ian MacKinnon. 2008. Novelty and diversity in information retrieval evaluation. In Proc. of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. 659–666.
[10] A. Clauset, M. Newman, and C. Moore. 2004. Finding community structure in very large networks. Phys. Rev. E 70, 066111 (December 2004). https://doi.org/10.1103/PhysRevE.70.066111
[11] Leon Danon, Albert Diaz-Guilera, Jordi Duch, and Alex Arenas. 2005. Comparing community structure identification. Journal of Statistical Mechanics: Theory and Experiment 2005, 09 (2005), P09008.
[12] Anubrata Das and Matthew Lease. 2019. A Conceptual Framework for Evaluating Fairness in Search. CoRR abs/1907.09328 (2019). arXiv:1907.09328 http://arxiv.org/abs/1907.09328
[13] Fernando Diaz, Bhaskar Mitra, Michael D. Ekstrand, Asia J. Biega, and Ben Carterette. 2020. Evaluating Stochastic Rankings with Expected Exposure. arXiv:cs.IR/2004.13157
[14] Michael Feldman, Sorelle A. Friedler, John Moeller, Carlos Scheidegger, and Suresh Venkatasubramanian. 2015. Certifying and removing disparate impact. In Proc. of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 259–268.
[15] Larry Fitzpatrick and Mei Dent. 1997. Automatic Feedback Using Past Queries: Social Searching?. In The 20th ACM SIGIR Conference. ACM, 306–313.
[16] S. Fortunato. 2010. Community detection in graphs. Physics Reports 486 (2010), 75–174.
[17] Ruoyuan Gao and Chirag Shah. 2019. How Fair Can We Go: Detecting the Boundaries of Fairness Optimization in Information Retrieval. In Proceedings of the 2019 ACM SIGIR International Conference on Theory of Information Retrieval (Santa Clara, CA, USA) (ICTIR '19). Association for Computing Machinery, New York, NY, USA, 229–236. https://doi.org/10.1145/3341981.3344215
[18] Ruoyuan Gao and Chirag Shah. 2020. Toward creating a fairer ranking in search engine results. Information Processing and Management 57, 1 (2020), 102138. https://doi.org/10.1016/j.ipm.2019.102138
[19] Mariam Haroutunian, Karen Mkhitaryan, and Josiane Mothe. 2018. f-Divergence measures for evaluation in community detection. In W. on Collaborative Technologies and Data Science in Smart City Applications (CODASSCA). 137–144.
[20] Sami Khenissi and Olfa Nasraoui. 2020. Modeling and Counteracting Exposure Bias in Recommender Systems.
[21] Onur Küçüktunç, Erik Saule, Kamer Kaya, and Ümit V. Çatalyürek. 2014. Diversifying citation recommendations. ACM Transactions on Intelligent Systems and Technology (TIST) 5, 4 (2014), 1–21.
[22] Matevž Kunaver and Tomaž Požrl. 2017. Diversity in recommender systems – A survey. Knowledge-Based Systems 123 (2017), 154–162.
[23] Shangsong Liang, Zhaochun Ren, and Maarten de Rijke. 2014. Fusion helps diversification. In Proc. of the 37th International ACM SIGIR Conference on Research & Development in Information Retrieval. 303–312.
[24] Blerina Lika, Kostas Kolomvatsos, and Stathes Hadjiefthymiades. 2014. Facing the cold start problem in recommender systems. Expert Systems with Applications 41, 4 (2014), 2065–2073.
[25] Weiwen Liu and Robin Burke. 2018. Personalizing Fairness-aware Re-ranking. CoRR abs/1809.02921 (2018). arXiv:1809.02921 http://arxiv.org/abs/1809.02921
[26] Jonathan Louëdec, Max Chevalier, Josiane Mothe, Aurélien Garivier, and Sébastien Gerchinovitz. 2015. A multiple-play bandit algorithm applied to recommender systems. In The Twenty-Eighth International Flairs Conference.
[27] Jie Lu, Dianshuang Wu, Mingsong Mao, Wei Wang, and Guangquan Zhang. 2015. Recommender system application developments: a survey. Decision Support Systems 74 (2015), 12–32.
[28] Graham McDonald, Thibaut Thonet, Iadh Ounis, Jean-Michel Renders, and Craig Macdonald. [n.d.]. University of Glasgow Terrier Team and Naver Labs Europe at TREC 2019 Fair Ranking Track. ([n.d.]).
[29] Rishabh Mehrotra, James McInerney, Hugues Bouchard, Mounia Lalmas, and Fernando Diaz. 2018. Towards a fair marketplace: Counterfactual evaluation of the trade-off between relevance, fairness & satisfaction in recommendation systems. In Proc. of the 27th ACM International Conference on Information and Knowledge Management. 2243–2251.
[30] Natwar Modani, Deepali Jain, Ujjawal Soni, Gaurav Kumar Gupta, and Palak Agarwal. 2017. Fairness Aware Recommendations on Behance. In Advances in Knowledge Discovery and Data Mining, Jinho Kim, Kyuseok Shim, Longbing Cao, Jae-Gil Lee, Xuemin Lin, and Yang-Sae Moon (Eds.). Springer International Publishing, Cham, 144–155.
[31] Josiane Mothe, Karen Mkhitaryan, and Mariam Haroutunian. 2017. Community detection: Comparison of state of the art algorithms. In 2017 Computer Science and Information Technologies (CSIT). 125–129.
[32] M. Newman. 2006. Finding community structure in networks using the eigenvectors of matrices. Phys. Rev. E 74, 036104 (September 2006). https://doi.org/10.1103/PhysRevE.74.036104
[33] P. Pons and M. Latapy. 2005. Computing Communities in Large Networks Using Random Walks. Lecture Notes in Computer Science, Springer 3733 (2005). https://doi.org/10.1007/11569596_31
[34] Antonio J. Roa-Valverde and Miguel-Angel Sicilia. 2014. A survey of approaches for ranking on the web of data. Information Retrieval 17, 4 (2014), 295–325.
[35] M. Rosvall and C. Bergstrom. 2008. Maps of random walks on complex networks reveal community structure. PNAS 105, 4 (2008), 1118–1123. https://doi.org/10.1073/pnas.0706851105
[36] Rodrygo L. T. Santos, Craig Macdonald, and Iadh Ounis. 2010. Exploiting query reformulations for web search result diversification. In Proc. of the 19th International Conference on World Wide Web. 881–890.
[37] Rodrygo L. T. Santos, Craig Macdonald, Iadh Ounis, et al. 2015. Search result diversification. Foundations and Trends in Information Retrieval 9, 1 (2015), 1–90.
[38] Ashudeep Singh and Thorsten Joachims. 2018. Fairness of exposure in rankings. In Proc. of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2219–2228.
[39] Meng Wang, Haopeng Zhang, Fuhuai Liang, Bin Feng, and Di Zhao. [n.d.]. ICT at TREC 2019: Fair Ranking Track. ([n.d.]).
[40] Jacek Wasilewski and Neil Hurley. 2018. Intent-Aware Item-Based Collaborative Filtering for Personalised Diversification. In Proc. of the 26th Conference on User Modeling, Adaptation and Personalization. Association for Computing Machinery, 81–89. https://doi.org/10.1145/3209219.3209234
[41] Long Xia, Jun Xu, Yanyan Lan, Jiafeng Guo, Wei Zeng, and Xueqi Cheng. 2017. Adapting Markov Decision Process for Search Result Diversification. In Proc. of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval. Association for Computing Machinery, 535–544. https://doi.org/10.1145/3077136.3080775