<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>June</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>TransFAT: Translating Fairness, Accountability and Transparency into Data Science Practice</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Julia Stoyanovich</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>New York University</institution>
          , New York, NY,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2019</year>
      </pub-date>
      <volume>4</volume>
      <issue>2019</issue>
      <abstract>
        <p>Data science holds incredible promise for improving people's lives, accelerating scientific discovery and innovation, and bringing about positive societal change. Yet, if not used responsibly, in accordance with legal and ethical norms, the same technology can reinforce economic and political inequities, destabilize global markets, and reaffirm systemic bias. In this paper I discuss an ongoing regulatory effort in New York City, where the goal is to develop a methodology for enabling responsible use of algorithms and data in city agencies. I then highlight some ongoing work that is part of the Data, Responsibly project, aiming to operationalize fairness, diversity, accountability, transparency, and data protection at all stages of the data science lifecycle. Additional information about the project, including technical papers, teaching materials, and open-source tools, is available at dataresponsibly.github.io.</p>
      </abstract>
      <kwd-group>
        <kwd>responsible data science</kwd>
        <kwd>fairness</kwd>
        <kwd>diversity</kwd>
        <kwd>transparency</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Data science holds incredible promise for improving people's lives, accelerating
scientific discovery and innovation, and bringing about positive societal change.
Yet, if not used responsibly, in accordance with legal and ethical norms,
the same technology can reinforce economic and political inequities, destabilize
global markets, and reaffirm systemic bias [
        <xref ref-type="bibr" rid="ref1 ref14 ref17 ref4 ref6 ref7">1,4,6,7,14,17</xref>
        ].
      </p>
      <p>
        The public sector is under particular pressure to fulfill the mandate for
responsibility: All decisions made by algorithms will be scrutinized by the affected
individuals and groups, and by the taxpayers who are entitled to verify equitable
resource distribution. Yet, recent reports on data-driven decision making,
specifically in the public sector, underscore that fairness and equitable treatment of
individuals and groups is difficult to achieve [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], and that transparency and
accountability of algorithmic processes are indispensable but rarely enacted [
        <xref ref-type="bibr" rid="ref1 ref5">1,5</xref>
        ].
As a society, we cannot afford the status quo: Algorithmic bias in administrative
processes limits access to resources for those who need these resources most, and
amplifies the effects of systemic historical discrimination. Lack of transparency
and accountability threatens the democratic process itself.
      </p>
      <fig id="fig-1">
        <caption><p>Fig. 1. The data science lifecycle: acquisition, annotation, curation, sharing, querying, ranking, analysis, validation.</p></caption>
      </fig>
      <p>
        How can the technical community support responsible data science practices
in complex administrative processes? Researchers are actively working on
methods for enabling fairness, accountability and transparency (FAT) of specific
algorithms and their outputs [
        <xref ref-type="bibr" rid="ref10 ref11 ref13 ref18 ref28 ref9">9,10,11,13,18,28</xref>
        ]. While important, these approaches
focus solely on the analysis and validation step of the data science lifecycle
(depicted in Figure 1), and operate under the assumption that input datasets are
clean and reliable.
      </p>
      <p>
        To appreciate the limitations of this assumption, observe that additional
information and intervention methods are available if we consider the upstream process
that generated the input data [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]. Appropriately annotating datasets when they
are shared, and maintaining information about how datasets are acquired and
manipulated, allows us to provide data transparency: to explain statistical
properties of the datasets, uncover any sources of bias, and make statements about
data quality and fitness for use. Put another way: if we have no information
about how a dataset was generated and acquired, we cannot convincingly argue
that it is appropriate for use by an automated decision system.
      </p>
      <p>In the remainder of this paper, I will further motivate technical work on
responsible data science in the context of an ongoing regulatory effort (Section 2).
I will then highlight some work that is part of the Data, Responsibly project
(Section 3). For additional information about the project, including technical
papers, teaching materials, and open-source tools, see dataresponsibly.github.io.</p>
    </sec>
    <sec id="sec-2">
      <title>Towards a Data Transparency Framework</title>
      <p>
        New York City is the first municipality in the United States to attempt to
regulate the use of data-driven algorithmic decision making in government. The City
passed Local Law 49 of 2018 [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ], requiring that a task force be put in place
to survey the current use of "automated decision systems," defined as
"computerized implementations of algorithms, including those derived from machine
learning or other data processing or artificial intelligence techniques, which are
used to make or assist in making decisions," in City agencies. The task force will
develop a set of recommendations for enacting algorithmic transparency by the
agencies, and will propose procedures for:
- requesting and receiving an explanation of an algorithmic decision affecting
an individual (Section 3(b));
- interrogating automated decision systems for bias and discrimination against
members of legally protected groups, and addressing instances in which a
person is harmed based on membership in such groups (Sections 3(c), 3(d));
- assessing how automated decision systems function and are used, and
archiving the systems together with the data they use (Sections 3(e), 3(f)).
      </p>
      <p>Local Law 49 of 2018 in effect mandates the development of an algorithmic
transparency framework. In the remainder of this section, I argue that
meaningful transparency of algorithmic processes cannot be achieved without
transparency of data.</p>
      <p>What is data transparency? In applications involving predictive analytics, data is
used to customize generic algorithms for specific situations: we say algorithms
are trained using data. The same algorithm may exhibit radically different
behavior (make different predictions; make a different number of mistakes, and
even different kinds of mistakes) when trained on two different datasets. In
other words, without access to the training data, it is impossible to know how
an algorithm would actually behave.</p>
      <p>Algorithms and corresponding training data are used, for example, in
predictive policing applications to target areas or people that are deemed to be
high-risk. But as has been shown extensively, when the data used to train these
algorithms reflects the systemic historical bias towards poor and predominately
African American neighborhoods, the predictions will simply reinforce the status
quo rather than provide any new insight into crime patterns. The transparency
of the algorithm is neither necessary nor sufficient to understand and
counteract these particular errors. Rather, the conditions under which the data was
collected must be retained and made available to make the decision-making
process transparent.</p>
      <p>
        Even those decision-making applications that do not explicitly attempt to
predict future behavior based on past behavior are still heavily influenced by
the properties of the underlying data. For example, the VI-SPDAT [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] risk
assessment tool, used to prioritize homeless individuals for receiving services,
does not involve machine learning, but still assigns a risk score based on
survey responses, a score that cannot be interpreted without understanding the
conditions under which the data was collected. As another example:
Matchmaking methods such as those used by the Department of Education to assign
children to spots in public schools are designed and validated using datasets; if
these datasets are not made available, the matchmaking method itself cannot be
considered transparent.
      </p>
      <p>What is data transparency, and how can we achieve it? One immediate
interpretation of this term is "making the training and validation datasets publicly
available." However, while data should be made open whenever possible, much
of it is sensitive and cannot be shared directly. That is, data transparency is in
tension with the privacy of individuals who are included in the dataset. In light
of this, an alternative interpretation of data transparency is as follows:
- In addition to releasing training and validation datasets whenever possible,
agencies shall make publicly available summaries of relevant statistical
properties of the datasets that can aid in interpreting the decisions made using
the data, while applying state-of-the-art methods to preserve the privacy of
individuals (a minimal sketch of one such method appears below).
- When appropriate, privacy-preserving synthetic datasets can be released in
lieu of real datasets to expose certain features of the data, if real datasets
are sensitive and cannot be released to the public.</p>
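      <p>To make the first bullet concrete, here is a minimal sketch (in Python) of releasing one privacy-preserving summary statistic via the standard Laplace mechanism for differential privacy. This is one example of the "state-of-the-art methods" referenced above, not a method prescribed by Local Law 49; the epsilon value and the counting query are illustrative assumptions, and tools such as DataSynthesizer [<xref ref-type="bibr" rid="ref16">16</xref>] go further and release entire synthetic datasets.</p>
      <preformat>
# Illustrative sketch: a differentially private count via the Laplace mechanism.
import math, random

def private_count(records, predicate, epsilon=0.5, seed=0):
    """A counting query has sensitivity 1, so Laplace noise with
    scale 1/epsilon yields epsilon-differential privacy."""
    rng = random.Random(seed)
    u = rng.random() - 0.5                      # uniform on [-0.5, 0.5)
    noise = -(1.0 / epsilon) * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return sum(1 for r in records if predicate(r)) + noise

# Example: how many individuals in a sensitive dataset are over 30?
records = [{"age": a} for a in (23, 37, 41, 29, 52)]
print(private_count(records, lambda r: r["age"] > 30))
      </preformat>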
      <p>An important aspect of data transparency is interpretability | surfacing the
statistical properties of a dataset, the methodology that was used to produce it,
and, ultimately, substantiating its \ tness for use" in the context of a speci c
automated decision system or task. This consideration of a speci c use is
particularly important because datasets are increasingly used outside the original
context for which they were intended. This compels us to augment our
interpretation of data transparency in the public sector to include:
{ Agencies shall make publicly available information about the data collection
and pre-processing methodology, in terms of assumptions, inclusion criteria,
known sources of bias, and data quality.</p>
      <p>Data transparency is important both when an automated decision system
is interrogated for systematic bias and discrimination, and when it is asked to
explain an algorithmic decision that affects an individual. For example, suppose
that a system scores and ranks individuals for access to a service. If an individual
enters her data and receives the result (say, a score of 42), this number alone
provides no information about why she was scored in this way, how she compares
to others, and what she can do to potentially improve her outcome.</p>
      <p>To facilitate transparency, the explanation given to an individual should be
interpretable, insightful and actionable. As part of the result, data that pertains
to other individuals, or a summary of such data, may need to be released, for
example, to explain which other individuals, or groups of individuals, receive
a higher score, or a more favorable outcome. This functionality requires the data
transparency mechanisms discussed above.</p>
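      <p>As a rough illustration of what such an explanation might contain, the sketch below (in Python) turns a raw score into a percentile, a comparison to group-level medians, and the gap to the service threshold. The field names and the threshold are illustrative assumptions, not the interface of any deployed system.</p>
      <preformat>
# Illustrative sketch: an interpretable, actionable explanation of a score.
from statistics import median

def explain(score, all_scores, group_scores, threshold):
    """all_scores: scores of everyone evaluated; group_scores: dict mapping
    group name to that group's scores; threshold: score needed for service."""
    below = sum(1 for s in all_scores if score > s)
    return {
        "your_score": score,
        "percentile": round(100 * below / len(all_scores)),
        "group_medians": {g: median(ss) for g, ss in group_scores.items()},
        "gap_to_service": max(0, threshold - score),  # actionable: points needed
    }

print(explain(42, all_scores=[30, 42, 55, 61, 48],
              group_scores={"A": [30, 42], "B": [55, 61, 48]}, threshold=50))
      </preformat>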
    </sec>
    <sec id="sec-3">
      <title>Highlights of the Data, Responsibly Project</title>
      <p>
        The goal of the Data, Responsibly project is to develop a foundational
understanding of responsible data science at all stages of the data lifecycle [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ], and to
translate that understanding into tools and platforms [
        <xref ref-type="bibr" rid="ref16 ref27">16,27</xref>
        ]. Such tools should
be placed in the hands of data practitioners in the public sector. Importantly, the
requirement of responsibility cannot be handled as an afterthought, but must
be provisioned for at design time. In the remainder of this section, I highlight
several recent technical results. To keep the discussion focused, I will discuss
results that pertain to ranking and set selection tasks.
      </p>
      <p>
        Algorithmic decisions often result in scoring and ranking individuals to
determine creditworthiness, qualifications for college admissions and employment,
and compatibility as dating partners. While automatic and seemingly objective,
ranking algorithms can discriminate against individuals and protected groups,
and exhibit low diversity. Furthermore, ranked results are often unstable: small
changes in the input data or in the ranking methodology may lead to drastic
changes in the output, making the result uninformative and easy to
manipulate. Similar concerns apply in cases where items other than individuals are
ranked, including colleges, academic departments, or products. Finally, even in
cases where both the data and the ranking method are publicly available, ranked
results may still be difficult to interpret [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ].
      </p>
      <p>In addition to being commonly used in the analysis and validation stage of the
data science lifecycle, set selection and ranking are also very common upstream
from data analysis, in data sharing, acquisition, integration, and querying (see
Figure 1), making this family of methods particularly important to study.
</p>
      <sec id="sec-3-1">
        <title>Fairness and diversity in ranking and set selection</title>
        <p>
          In [
          <xref ref-type="bibr" rid="ref25">25</xref>
          ] we started an inquiry into fairness in ranked outputs. We considered the
setting in which an institution, called a ranker, evaluates a set of individuals
based on demographic, behavioral or other characteristics. The final output is a
ranking that represents the relative quality of the individuals. While automatic
and therefore seemingly objective, rankers can, and often do, discriminate against
individuals and systematically disadvantage members of protected groups.
        </p>
        <p>In this work we focused on datasets in which items have a single binary
sensitive attribute, such as male or female gender, and minority or majority ethnic
group, with one of the groups designated as the protected group (the group
that experienced a historical disadvantage). We proposed a family of fairness
measures, quantifying the relative representation of protected group members at
discrete points in the ranking (e.g., top-10, top-20, etc.), and compounding these
proportions with a logarithmic discount, in the style of information retrieval.</p>
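        <p>As a rough illustration (in Python), the sketch below compares the proportion of protected-group members in each top-k prefix to their proportion overall, with a logarithmic discount on deeper cut-offs. The function name and normalization are illustrative assumptions, not the exact measures defined in [<xref ref-type="bibr" rid="ref25">25</xref>].</p>
        <preformat>
# Illustrative sketch of a discounted, set-wise fairness measure for a ranking.
import math

def discounted_representation_difference(ranking, is_protected, cutoffs=(10, 20, 30)):
    """ranking: items ordered best-first; is_protected: item -> bool.
    Returns 0 when every prefix mirrors the overall proportion."""
    overall = sum(1 for item in ranking if is_protected(item)) / len(ranking)
    total = 0.0
    for k in cutoffs:
        frac_at_k = sum(1 for item in ranking[:k] if is_protected(item)) / k
        total += abs(frac_at_k - overall) / math.log2(k)  # discount deep cut-offs
    return total

# Toy ranked list of (id, protected?) pairs.
items = [("a", True), ("b", False), ("c", False)] * 10
print(discounted_representation_difference(items, is_protected=lambda it: it[1]))
        </preformat>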
        <p>
          Score-based set selection is a mechanism that is closely related to ranking.
Selection algorithms usually score individual items in isolation, and then select
the top-scoring items. However, often there is an additional diversity
objective: selecting high-quality items that have different attributes (as in product
recommendation systems), or high-scoring individuals who belong to different
demographic, geographic or socio-economic groups (as in college admissions and
hiring). In a recent work [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ] we proposed methods for enforcing diversity in
online set selection, where a decision must be made on each item as it is presented.
We showed through experiments with real and synthetic data that diversity can
be achieved, usually with modest costs in terms of quality.
        </p>
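        <p>A minimal sketch of this setting (in Python) appears below: items stream past, a decision is made on each item immediately, and each group has its own quota and its own warm-up threshold, so that groups are treated separately. The warm-up heuristic is an illustrative assumption, not the algorithm of [<xref ref-type="bibr" rid="ref22">22</xref>].</p>
        <preformat>
# Illustrative sketch: online set selection with per-group diversity quotas.
def online_select(stream, quota, warmup=20):
    """stream yields (item, group, score); each group g ends up with at most
    quota[g] items. Groups are handled separately: a shared threshold would
    select lower-scoring members of disadvantaged groups whenever score
    distributions differ across groups."""
    best_warmup = {g: float("-inf") for g in quota}
    n_seen = {g: 0 for g in quota}
    selected = {g: [] for g in quota}
    for item, group, score in stream:
        n_seen[group] += 1
        if n_seen[group] > warmup:
            # past the warm-up: greedily take items beating the warm-up best
            if score > best_warmup[group] and len(selected[group]) != quota[group]:
                selected[group].append((item, score))
        else:
            # warm-up phase: observe only, remember the best score per group
            best_warmup[group] = max(best_warmup[group], score)
    return selected
        </preformat>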
        <p>Our experimental evaluation led to several important insights in online set
selection. Most importantly, we showed that if a difference in scores is expected
between groups (e.g., due to historical disadvantage), then these groups must be
treated separately during processing. Otherwise, a solution may be derived that
meets diversity constraints, but that selects lower-scoring members of
disadvantaged groups. This insight supports the argument of responsibility by design.</p>
        <p>
          In a recent follow-up work [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ], we studied an unintended consequence of
applying diversity constraints to set selection and ranking, in datasets with
multiple sensitive attributes (e.g., gender and race). We observed that maximizing
utility (sum of item scores) subject to diversity constraints leads to reduced
in-group fairness: the selected candidates from a given group may not be the best
ones, and this unfairness may not be well-balanced across groups.
        </p>
        <p>We studied this phenomenon using datasets that comprise multiple sensitive
attributes. We then introduced additional constraints, aimed at balancing
in-group fairness across groups, and formalized the induced optimization problems
as integer linear programs. Using these programs, we conducted an
experimental evaluation with real datasets, and quantified the feasible trade-offs between
balance and overall performance in the presence of diversity constraints.</p>
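        <p>To make the integer-programming formulation concrete, here is a toy sketch using the PuLP modeling library: select k items to maximize utility subject to per-group floors. The data and constraint set are illustrative assumptions; the programs in [<xref ref-type="bibr" rid="ref24">24</xref>] additionally encode the balance constraints described above.</p>
        <preformat>
# Illustrative sketch: diverse set selection as an integer linear program.
# Requires the PuLP package (pip install pulp).
from pulp import LpProblem, LpMaximize, LpVariable, lpSum

scores = [9, 8, 7, 6, 5, 4]              # item utilities
groups = ["f", "f", "m", "m", "f", "m"]  # one sensitive attribute
k, floor = 3, {"f": 1, "m": 1}           # select 3, at least 1 per group

prob = LpProblem("diverse_selection", LpMaximize)
x = [LpVariable(f"x{i}", cat="Binary") for i in range(len(scores))]
prob += lpSum(s * xi for s, xi in zip(scores, x))   # maximize total utility
prob += lpSum(x) == k                               # select exactly k items
for g, lo in floor.items():                         # diversity floor per group
    prob += lpSum(x[i] for i in range(len(x)) if groups[i] == g) >= lo
prob.solve()
print("selected:", [i for i, xi in enumerate(x) if xi.value() == 1])
        </preformat>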
        <p>
          Finally, we considered the design of fair score-based ranking functions in [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ].
Items from a database are often ranked based on a combination of criteria. The
weight given to each criterion in the combination can greatly affect the fairness
of the produced ranking, for example, systematically preferring men over women.
A user may have the flexibility to choose combinations that weigh these criteria
differently, within limits. In this work, we developed a system that helps users
choose criterion weights that lead to greater fairness.
        </p>
        <p>We considered ranking functions that compute the score of each item as a
weighted sum of (numeric) attribute values, and then sort items on their score.
Each ranking function can be expressed as a point in a multi-dimensional space.
For a broad range of fairness criteria, including proportionality, we showed how
to efficiently identify regions in this space that satisfy these criteria. Using this
identification method, our system is able to tell users whether their proposed
ranking function satisfies the desired fairness criteria and, if it does not, to
suggest the smallest modification that does. Our extensive experiments on real
datasets demonstrated that our methods are able to find solutions that satisfy
fairness criteria effectively (usually with only small changes to proposed weight
vectors) and efficiently (in interactive time, after some initial pre-processing).</p>
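        <p>The sketch below (in Python) illustrates the idea at toy scale: a linear ranking function is a weight vector, a fairness criterion (here, proportional representation in the top-k) partitions weight space, and a small grid probe finds a nearby weight vector that satisfies the criterion. The grid probe is an illustrative stand-in for the geometric algorithms of [<xref ref-type="bibr" rid="ref3">3</xref>].</p>
        <preformat>
# Illustrative sketch: testing and repairing the fairness of a linear ranker.
import itertools

def top_k(items, w, k):
    """items: (attribute_vector, group) pairs; score is the weighted sum."""
    score = lambda it: sum(wi * a for wi, a in zip(w, it[0]))
    return sorted(items, key=score, reverse=True)[:k]

def is_fair(items, w, k, group, share, tol=0.1):
    """Proportionality: the group's share of the top-k is within tol of share."""
    got = sum(1 for it in top_k(items, w, k) if it[1] == group) / k
    return tol >= abs(got - share)

def nearest_fair_weights(items, w, k, group, share, step=0.05, radius=4):
    """Probe a small grid around w, nearest candidates (in L1 distance) first."""
    deltas = [i * step for i in range(-radius, radius + 1)]
    for d in sorted(itertools.product(deltas, repeat=len(w)),
                    key=lambda d: sum(abs(x) for x in d)):
        cand = [wi + di for wi, di in zip(w, d)]
        if is_fair(items, cand, k, group, share):
            return cand
    return None
        </preformat>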
      </sec>
      <sec id="sec-3-2">
        <title>Stability in ranking</title>
        <p>Decision making is challenging when there is more than one criterion to
consider. In such cases, it is common to assign a goodness score to each item as
a weighted sum of its attribute values and rank them accordingly. Clearly, the
ranking depends on the weights used for this summation. Ideally, one would
want the ranked order not to change if the weights are changed slightly. We call
this property stability of the ranking. A consumer of a ranked list may trust
the ranking more if it has high stability. A producer of a ranked list prefers to
choose weights that result in a stable ranking, both to earn the trust of
potential consumers and because a stable ranking is intrinsically likely to be more
meaningful.</p>
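        <p>A back-of-the-envelope sketch of this notion (in Python): perturb the weight vector slightly, many times, and report the fraction of perturbations that leave the ranked order unchanged. The perturbation scheme and the resulting score are illustrative assumptions, not the framework of [<xref ref-type="bibr" rid="ref2">2</xref>] described next.</p>
        <preformat>
# Illustrative sketch: empirical stability of a weighted-sum ranking.
import random

def rank(items, w):
    """Order item indices by weighted-sum score, best first."""
    return sorted(range(len(items)), reverse=True,
                  key=lambda i: sum(wi * a for wi, a in zip(w, items[i])))

def stability(items, w, eps=0.05, trials=200, seed=0):
    """Fraction of small multiplicative weight perturbations (up to eps)
    that leave the ranked order unchanged; 1.0 means highly stable."""
    rng = random.Random(seed)
    base = rank(items, w)
    same = sum(1 for _ in range(trials)
               if rank(items, [wi * (1 + rng.uniform(-eps, eps)) for wi in w]) == base)
    return same / trials

items = [(0.9, 0.2), (0.7, 0.6), (0.4, 0.9)]   # attribute vectors
print(stability(items, w=[0.5, 0.5]))
        </preformat>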
        <p>
          In a recent paper [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ], we developed a framework that can be used to
assess the stability of a provided ranking and to obtain a stable ranking within
an acceptable range of weight values (called "the region of interest"). Using a
geometric interpretation, we proposed algorithms that produce stable rankings,
and experimentally validated our methods on real datasets. In our ongoing work
we are developing methods to quantify and improve stability of rankings under
slight changes to the data.
        </p>
      </sec>
      <sec id="sec-3-3">
        <title>Interpretability with Nutritional Labels</title>
        <p>
          In a recent paper we presented Ranking Facts, a Web-based application that
generates a "nutritional label" for rankings [
          <xref ref-type="bibr" rid="ref26">26</xref>
          ]. Ranking Facts is made up of a
collection of visual widgets that implement our latest research results on fairness,
diversity, stability, and transparency for rankings, and that communicate details
of the ranking methodology, or of the output, to the end user. Figure 2 presents
Ranking Facts for CS department rankings. The nutritional label consists of six
widgets, each with an overview and a detailed view.
        </p>
        <p>
          The Recipe widget succinctly describes the ranking algorithm. For example,
for a linear scoring formula, each attribute would be listed together with its
weight. The Ingredients widget lists attributes most material to the ranked
outcome, in order of importance. For example, for a linear model, this list could
present the attributes with the highest learned weights. Put another way, the
explicit intentions of the designer of the scoring function about which attributes
matter, and to what extent, are stated in the Recipe, while Ingredients may show
additional attributes associated with high rank. Such associations can be derived
with linear models or with other methods, such as rank-aware similarity in our
prior work [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ].
        </p>
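        <p>A toy sketch (in Python) of how these two widgets might be populated for a linear scoring formula: the Recipe comes straight from the designer's declared weights, while the Ingredients are derived from the data, here via a crude correlation of each attribute with the computed score. The structure is illustrative, not the actual Ranking Facts implementation [<xref ref-type="bibr" rid="ref26">26</xref>].</p>
        <preformat>
# Illustrative sketch: Recipe and Ingredients for a linear scoring formula.
def recipe(weights):
    """The designer's explicit intent: attributes with their stated weights."""
    return sorted(weights.items(), key=lambda kv: -abs(kv[1]))

def ingredients(rows, scores, attrs, top=3):
    """Attributes most associated with the outcome, by |correlation| with score."""
    def corr(xs, ys):
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sx = sum((x - mx) ** 2 for x in xs) ** 0.5
        sy = sum((y - my) ** 2 for y in ys) ** 0.5
        return cov / (sx * sy) if sx * sy else 0.0
    return sorted(attrs, key=lambda a: -abs(corr([r[a] for r in rows], scores)))[:top]

rows = [{"pubs": 9, "grants": 2}, {"pubs": 4, "grants": 8}, {"pubs": 7, "grants": 5}]
w = {"pubs": 0.7, "grants": 0.3}
scores = [sum(w[a] * r[a] for a in w) for r in rows]
print(recipe(w), ingredients(rows, scores, list(w)))
        </preformat>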
        <p>The Stability widget explains whether the ranking methodology is robust on
the given dataset. An unstable ranking is one where slight changes to the data
(e.g., due to uncertainty and noise), or to the methodology (e.g., by slightly
adjusting the weights in a score-based ranker) could lead to a significant change
in the output.</p>
        <p>
          The Fairness widget quantifies whether the ranked output exhibits
statistical parity (one interpretation of fairness) with respect to one or more sensitive
attributes, such as gender or race. The Diversity widget shows diversity with
respect to a set of demographic categories of individuals, or a set of
categorical attributes of other kinds of items [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. The widget displays the proportion of
each category in the top-10 ranked list and overall, and, like other widgets, is
updated as the user selects different ranking methods or sets different weights.
        </p>
        <p>Responsible data science, incorporating legal norms and ethical considerations
into data-driven algorithmic decision making, presents significant challenges
and exciting opportunities for both basic and applied research. Importantly,
lasting impact in this area cannot be achieved by technology alone, but must
combine technological advances with social science methodologies, regulatory
efforts, and education and engagement of the stakeholders. Responsible data
science is our new frontier.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Acknowledgements</title>
      <p>This work was supported in part by NSF Grants No. 1926250 and 1916647.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Angwin</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Larson</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mattu</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kirchner</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Machine bias: Risk assessments in criminal sentencing</article-title>
          .
          <source>ProPublica</source>
          (May 23,
          <year>2016</year>
          ), https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Asudeh</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jagadish</surname>
            ,
            <given-names>H.V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Miklau</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stoyanovich</surname>
          </string-name>
          , J.:
          <article-title>On obtaining stable rankings</article-title>
          .
          <source>PVLDB</source>
          <volume>12</volume>
          (
          <issue>3</issue>
          ),
          <fpage>237</fpage>
          -
          <lpage>250</lpage>
          (
          <year>2018</year>
          ), http://www.vldb.org/pvldb/vol12/p237-asudeh.pdf
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Asudeh</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jagadish</surname>
            ,
            <given-names>H.V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stoyanovich</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Das</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Designing fair ranking schemes</article-title>
          .
          <source>In: Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD</source>
          <year>2019</year>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Barocas</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Selbst</surname>
            ,
            <given-names>A.D.</given-names>
          </string-name>
          :
          <article-title>Big Data's Disparate Impact</article-title>
          . SSRN eLibrary (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Brauneis</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goodman</surname>
            ,
            <given-names>E.P.</given-names>
          </string-name>
          :
          <article-title>Algorithmic transparency for the smart city</article-title>
          .
          <source>Yale Journal of Law &amp; Technology (forthcoming)</source>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Citron</surname>
            ,
            <given-names>D.K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pasquale</surname>
            ,
            <given-names>F.A.</given-names>
          </string-name>
          :
          <article-title>The scored society: Due process for automated predictions</article-title>
          .
          <source>Washington Law Review</source>
          <volume>89</volume>
          (
          <year>2014</year>
          ), http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2376209
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Crawford</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Artificial intelligence's white guy problem</article-title>
          . New York Times (June 25,
          <year>2016</year>
          ), https://www.nytimes.com/2016/06/26/opinion/sunday/artificial-intelligences-white-guy-problem.html
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Drosou</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jagadish</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pitoura</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stoyanovich</surname>
            ,
            <given-names>J.:</given-names>
          </string-name>
          <article-title>Diversity in Big Data: A review</article-title>
          .
          <source>Big Data</source>
          <volume>5</volume>
          (
          <issue>2</issue>
          ) (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Dwork</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hardt</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pitassi</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Reingold</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zemel</surname>
            ,
            <given-names>R.S.:</given-names>
          </string-name>
          <article-title>Fairness through awareness</article-title>
          .
          <source>In: Innovations in Theoretical Computer Science 2012</source>
          , Cambridge, MA, USA, January 8-10,
          <year>2012</year>
          . pp.
          <fpage>214</fpage>
          -
          <lpage>226</lpage>
          (
          <year>2012</year>
          ). https://doi.org/10.1145/2090236.2090255
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Feldman</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Friedler</surname>
            ,
            <given-names>S.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moeller</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Scheidegger</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Venkatasubramanian</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Certifying and removing disparate impact</article-title>
          .
          <source>In: Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</source>
          , Sydney, NSW, Australia, August 10-13,
          <year>2015</year>
          . pp.
          <fpage>259</fpage>
          -
          <lpage>268</lpage>
          (
          <year>2015</year>
          ). https://doi.org/10.1145/2783258.2783311
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Hajian</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Domingo-Ferrer</surname>
            ,
            <given-names>J.:</given-names>
          </string-name>
          <article-title>A methodology for direct and indirect discrimination prevention in data mining</article-title>
          .
          <source>IEEE Trans. Knowl. Data Eng.</source>
          <volume>25</volume>
          (
          <issue>7</issue>
          ),
          <fpage>1445</fpage>
          -
          <lpage>1459</lpage>
          (
          <year>2013</year>
          ). https://doi.org/10.1109/TKDE.2012.72
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Homelessness</surname>
            ,
            <given-names>P.E.</given-names>
          </string-name>
          :
          <article-title>Vulnerability Index - Service Prioritization Decision Assistance Tool (VI-SPDAT)</article-title>
          . http://pehgc.org/wp-content/uploads/2016/09/VI-SPDAT-v2.01-Single-US-Fillable.pdf [Online; accessed 14-September-2017]
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Kamiran</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zliobaite</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Calders</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Quantifying explainable discrimination and removing illegal discrimination in automated decision making</article-title>
          .
          <source>Knowl. Inf. Syst.</source>
          <volume>35</volume>
          (
          <issue>3</issue>
          ),
          <fpage>613</fpage>
          -
          <lpage>644</lpage>
          (
          <year>2013</year>
          ). https://doi.org/10.1007/s10115-012-0584-8
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Muñoz</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Smith</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Patil</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Big data: A report on algorithmic systems, opportunity, and civil rights</article-title>
          .
          <source>The White House</source>
          (May
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <collab>MetroLab Network</collab>
          :
          <article-title>First, do no harm: Ethical guidelines for applying predictive tools within human services</article-title>
          . http://www.alleghenycountyanalytics.us/ (
          <year>2017</year>
          ), [forthcoming]
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Ping</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stoyanovich</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Howe</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Datasynthesizer: Privacy-preserving synthetic datasets</article-title>
          .
          <source>In: Proceedings of the 29th International Conference on Scientific and Statistical Database Management</source>
          , Chicago, IL, USA, June 27-29,
          <year>2017</year>
          . pp. 42:1-42:5 (
          <year>2017</year>
          ). https://doi.org/10.1145/3085504.3091117
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Podesta</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pritzker</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moniz</surname>
            ,
            <given-names>E.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Holdern</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zients</surname>
          </string-name>
          , J.:
          <article-title>Big data: seizing opportunities, preserving values</article-title>
          . Executive Office of the President, The White House (May
          <year>2014</year>
          ), https://www.whitehouse.gov/sites/default/files/docs/big_data_privacy_report_may_1_2014.pdf
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Romei</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ruggieri</surname>
            ,
            <given-names>S.:</given-names>
          </string-name>
          <article-title>A multidisciplinary survey on discrimination analysis</article-title>
          .
          <source>Knowledge Eng. Review</source>
          <volume>29</volume>
          (
          <issue>5</issue>
          ),
          <fpage>582</fpage>
          -
          <lpage>638</lpage>
          (
          <year>2014</year>
          ). https://doi.org/10.1017/S0269888913000039
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Stoyanovich</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Amer-Yahia</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Milo</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Making interval-based clustering rankaware</article-title>
          .
          <source>In: EDBT</source>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Stoyanovich</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goodman</surname>
            ,
            <given-names>E.P.</given-names>
          </string-name>
          :
          <article-title>Revealing algorithmic rankers</article-title>
          .
          <source>Freedom to Tinker</source>
          (August 5,
          <year>2016</year>
          ), http://freedom-to-tinker.com/2016/08/05/revealing-algorithmic-rankers/
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Stoyanovich</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Howe</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Abiteboul</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Miklau</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sahuguet</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weikum</surname>
          </string-name>
          , G.:
          <article-title>Fides: Towards a platform for responsible data science</article-title>
          .
          <source>In: Proceedings of the 29th International Conference on Scientific and Statistical Database Management</source>
          , Chicago, IL, USA, June 27-29,
          <year>2017</year>
          . pp. 26:1-26:6 (
          <year>2017</year>
          ). https://doi.org/10.1145/3085504.3085530
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Stoyanovich</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yang</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jagadish</surname>
            ,
            <given-names>H.V.</given-names>
          </string-name>
          :
          <article-title>Online set selection with fairness and diversity constraints</article-title>
          .
          <source>In: Proceedings of the 21st International Conference on Extending Database Technology, EDBT 2018</source>
          , Vienna, Austria, March 26-29,
          <year>2018</year>
          . pp.
          <fpage>241</fpage>
          -
          <lpage>252</lpage>
          (
          <year>2018</year>
          ). https://doi.org/10.5441/002/edbt.2018.22
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23. The New York City Council: Int. No. 1696-A: A Local Law in relation to automated decision systems used by agencies (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>Yang</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gkatzelis</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stoyanovich</surname>
          </string-name>
          , J.:
          <article-title>Balanced ranking with diversity constraints</article-title>
          .
          <source>In: Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence</source>
          ,
          <source>IJCAI</source>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <surname>Yang</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stoyanovich</surname>
          </string-name>
          , J.:
          <article-title>Measuring fairness in ranked outputs</article-title>
          .
          <source>FATML</source>
          abs/1610.08559 (
          <year>2016</year>
          ), http://arxiv.org/abs/1610.08559
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <surname>Yang</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stoyanovich</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Asudeh</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Howe</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jagadish</surname>
            ,
            <given-names>H.V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Miklau</surname>
          </string-name>
          , G.:
          <article-title>A nutritional label for rankings</article-title>
          .
          <source>In: Proceedings of the 2018 International Conference on Management of Data, SIGMOD Conference 2018</source>
          , Houston, TX, USA, June 10-15,
          <year>2018</year>
          . pp.
          <fpage>1773</fpage>
          -
          <lpage>1776</lpage>
          (
          <year>2018</year>
          ). https://doi.org/10.1145/3183713.3193568
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          27.
          <string-name>
            <surname>Yang</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stoyanovich</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Asudeh</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Howe</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jagadish</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Miklau</surname>
          </string-name>
          , G.:
          <article-title>A nutritional label for rankings</article-title>
          .
          <source>In: ACM SIGMOD</source>
          <year>2018</year>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          28.
          <string-name>
            <surname>Zemel</surname>
            ,
            <given-names>R.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Swersky</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pitassi</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dwork</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Learning fair representations</article-title>
          .
          <source>In: ICML</source>
          . pp.
          <fpage>325</fpage>
          -
          <lpage>333</lpage>
          (
          <year>2013</year>
          ), http://jmlr.org/proceedings/papers/v28/zemel13.html
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>