<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Detecting Bias: Does an Algorithm Have to Be Transparent in Order to Be Fair?</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>William Seymour</string-name>
          <email>william.seymour@cs.ox.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Oxford</institution>
          ,
          <addr-line>Oxford</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The most commonly cited solution to problems surrounding algorithmic fairness is increased transparency. But how do we reconcile this point of view with the state of the art? Many of the most effective modern machine learning methods (such as neural networks) can have millions of variables, defying human understanding. This paper decomposes the quest for transparency and examines two of the options available using technical examples. By considering some of the current uses of machine learning and using human decision making as a null hypothesis, I suggest that pursuing transparent outcomes is the way forward, with the quest for transparent algorithms being a lost cause.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Recent investigations into the fairness of algorithms have intensified the call
for machine learning methods that are transparent. Unless an algorithm is
transparent, so the argument goes, how are we to know whether it is fair? But
this approach comes with a problem: many machine learning methods are useful
precisely because they work in a way which is alien to conscious human reasoning.
Thus, we place ourselves in the position of having to choose between a more
limited (and potentially less effective) set of algorithms that work in ways that
we can understand, and those which are better suited to the task at hand but
cannot easily be explained. To clarify, this paper is concerned with the use of
transparency as a tool for auditing and communicating decisions, rather than
debate over the higher level 'transparency ideal', or harmful/obstructive uses of
transparency as described by [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ].
      </p>
      <p>This paper will discuss the arguments for and against transparency as a
design requirement of machine learning algorithms. Firstly, I will break down
what we mean when we talk about fairness and transparency, before considering
arguments and examples from both sides of the discussion. I will cover two
different black box techniques that provide interpretable explanations of
algorithmic decisions (local explanations and statistical analysis), as well as
some of the problems associated with each of them. The techniques listed
are by no means exhaustive and are meant to represent different styles that can
be used to generate explanations. To conclude, there will be a discussion of the
role that transparency might play in the future of machine learning.</p>
    </sec>
    <sec id="sec-2">
      <title>What Do We Mean by Transparency?</title>
      <p>Since transparency in this context is rooted in fairness, perhaps a better
starting point would be to ask what we mean by fairness. Although this is a
dauntingly complex question in itself, most people would consider approaches that
'treat similar people in similar ways' to be fair. Concerns about fairness often
coalesce along lines of protected characteristics (such as race and gender), as
these are where the most glaring problems tend to be found. These characteristics
are often expected to be excluded from the decision making process even if they
are statistically related to its outcome.</p>
      <p>
        But problems arise when a philosophical definition of fairness is translated into
a set of statistical rules against which an algorithm is to be compared. There are
multiple orthogonal axes along which one can judge an algorithm, and the
best fit will vary based on the context in which the algorithm is used. Examples
include predictive parity, error rate balance, and statistical parity, to name a few
[
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. To further muddy the waters, it is possible to draw a distinction between
process fairness (the actual process of making a decision) and outcome fairness
(the perceived fairness of a decision itself) [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. It is possible for an algorithm with
low process fairness (e.g. including race as a factor in decision making) to exhibit
high outcome fairness (e.g. ensuring similar levels of false positives across racial
groups).
      </p>
      <p>As for the term transparency, I use it to refer to information available
about an algorithm that details part of its decision making process, or information
about the decisions it makes, and which can be interpreted by a human being.
Depending on the context, this could be a data scientist, policy maker, or even a
member of the public. Interpretability is a key requirement here, ensuring that
published data do actually aid our understanding of algorithmic processes.</p>
      <p>As we are concerned about investigating fairness, it makes sense to think of two
types of transparency corresponding to those for fairness: process transparency
(how much we understand about the internal state of an algorithm) and outcome
transparency (how much we understand about the decisions, and patterns in
decisions, made by an algorithm). This distinction is important, as while there
exist tools that can achieve some level of outcome transparency for all algorithms,
only certain types of algorithm exhibit process transparency.</p>
    </sec>
    <sec id="sec-3">
      <title>Method I: Local Explanations</title>
      <p>The first method we consider is a black box method of explaining individual
decisions. Local explanations work by sampling decisions from the problem
domain, weighted by proximity to the instance being explained. These samples
are then used to construct a new model that accurately reflects the local decision
boundary of the algorithm. For non-trivial algorithms, the local model will be a
bad fit for other inputs, as global decision boundaries will be of a higher dimension
than the local one (see Figure 1).</p>
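      <p>The procedure above can be sketched in a few lines, in the spirit of LIME [5].
The black box model, kernel width, and sampling scheme here are invented for
illustration; real implementations sample in an interpretable representation and
use the model's predicted probabilities rather than hard labels.</p>

```python
import numpy as np

def black_box(X):
    # Stand-in for an opaque classifier with a nonlinear global boundary.
    return (X[:, 0] ** 2 + X[:, 1] > 1.0).astype(float)

def local_surrogate(instance, n_samples=500, kernel_width=0.5, seed=0):
    rng = np.random.default_rng(seed)
    # 1. Sample inputs from the neighbourhood of the instance being explained.
    X = instance + rng.normal(scale=kernel_width, size=(n_samples, instance.size))
    y = black_box(X)
    # 2. Weight each sample by its proximity to the instance (Gaussian kernel).
    weights = np.exp(-((X - instance) ** 2).sum(axis=1) / kernel_width ** 2)
    # 3. Fit a weighted linear model; its coefficients are the explanation.
    Xb = np.hstack([X, np.ones((n_samples, 1))])      # add an intercept column
    sw = np.sqrt(weights)[:, None]
    coef, *_ = np.linalg.lstsq(Xb * sw, y[:, None] * sw, rcond=None)
    return coef[:-1, 0]                               # per-feature local weights

print(local_surrogate(np.array([0.9, 0.2])))
```

      <p>The returned coefficients play the role of per-feature weights in a local
explanation; in this invented setup the first feature dominates because the global
boundary is steepest in that direction near the chosen instance, even though the
fitted linear model would be a bad fit elsewhere.</p>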
      <p>An example of this would be online content moderation. If a user has submitted
a post which is deemed by an algorithm to be too toxic, we might want to explain
to them which parts of their message caused the algorithm to reject it. For the
input sentence "idiots. backward thinking people. nationalists. not accepting
facts. susceptible to lies", a local explanation might reveal that the words
"idiots" and "nationalists" are the greatest factors contributing to the message
being flagged as toxic. This is not to say that all messages containing the word
"nationalists" are toxic, but that the word is considered problematic in this
context.</p>
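      <p>One simple way to generate this kind of word-level explanation is to re-score
the message with each word removed and attribute to each word the resulting drop in
toxicity. The scorer below is an invented stand-in (a word list with made-up
weights), not the real moderation model:</p>

```python
# Invented word weights standing in for a black box toxicity model.
TOXIC_WORDS = {"idiots": 0.6, "nationalists": 0.3, "lies": 0.1}

def toxicity(text):
    # Hypothetical scorer; in practice this would be a call to the model.
    return min(1.0, sum(TOXIC_WORDS.get(w.strip(".,"), 0.0) for w in text.split()))

def word_attributions(text):
    words = text.split()
    base = toxicity(text)
    # Attribute to each word the drop in score caused by removing it.
    return {w: base - toxicity(" ".join(words[:i] + words[i + 1:]))
            for i, w in enumerate(words)}

message = "idiots. backward thinking people. nationalists. not accepting facts."
attributions = word_attributions(message)
print(max(attributions, key=attributions.get))   # the word contributing most
```

      <p>Like the explanation described above, this attributes the decision chiefly
to "idiots" and "nationalists" without ever inspecting the scorer's internals.</p>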
      <p>Here we have produced an interpretable explanation without knowing anything
about how the algorithm operates: we can say that local explanations provide
evidence for outcome fairness. By looking at these explanations for decisions a
system makes, we have enough information to conclude that a decision was unfair
because it violates our definition of fairness as described above. This is a good
start to our goal of auditing for fairness.</p>
    </sec>
    <sec id="sec-4">
      <title>Moving From Local to Global</title>
      <p>Local explanations do a good job of informing users of the main factors behind
the decisions they are subject to, but they fall short of providing assurance that
the system as a whole operates fairly. In order for this to happen, one needs to be
able to create a mental model of the system which is functionally close enough
to the original that one can predict what it will do (or at least believe that
its reasoning will be of sufficient quality). Because local explanations consider
only facets of the current decision, they do not reveal much about the wider
reasoning that pervades an algorithm. While of great use to an individual who
is concerned about a decision affecting them, they are much less useful
to an auditor who is seeking assurance that the algorithm as a whole is fair. A
handful of randomly chosen samples being satisfactory does not give sufficient
assurance that all answers will satisfy a set of fairness criteria. This highlights
the distinction drawn earlier between local and global fairness guarantees.
(The example message in the previous section is taken from the list of examples
on the Google Perspective API home page at https://www.perspectiveapi.com/.)</p>
      <p>Perhaps then, explanations for audits need to operate at a higher level
than local explanations. But then we encounter the problem that the high
dimensionality of non-trivial models means that global explanations must be
simplified to the point of absurdity in order to be intelligible. If explanations can
be thought of as "a three way trade off between the quality of the approximation
vs. the ease of understanding the function and the size of the domain for which
the approximation is valid" [6], then do we risk going so far towards the scale
end of the spectrum that we must abandon our hopes of arriving at an answer
which is also understandable and accurate?</p>
    </sec>
    <sec id="sec-5">
      <title>Method II: Statistical Analysis</title>
      <p>Given these problems it is perhaps questionable whether any scheme which
only considers individual decisions can ever be sufficient to determine if an
algorithm is fair or not. When considering higher level explanations of algorithms,
we find that statistical analysis can offer us the reassurance (or otherwise) that
we desire about an algorithm, taking into account trends across entire groups of
users rather than being limited to individual circumstances.</p>
      <p>Statistical analysis is another black box method, and often takes the form
of calculating information about particular groups of users and how they are
dealt with by the algorithm. By comparing accuracies and error rates between
groups it is possible to identify systemic mistreatment. Explaining these findings
is often simple, given most people's intuitive understanding of accuracy and false
positives/negatives (see Figure 2).</p>
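      <p>Such an audit can be sketched as a short black box routine that sees only
group membership, predictions, and outcomes. The record format and the example
data below are invented for illustration:</p>

```python
def audit(records):
    """records: (group, prediction, label) triples; returns per-group statistics."""
    stats = {}
    for g, p, y in records:
        s = stats.setdefault(g, {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
        # Classify each decision as a true/false positive/negative.
        key = ("t" if p == y else "f") + ("p" if p == 1 else "n")
        s[key] += 1
    report = {}
    for g, s in stats.items():
        n = sum(s.values())
        report[g] = {
            "accuracy": (s["tp"] + s["tn"]) / n,
            # False positive rate: wrongly flagged among true negatives.
            "fpr": s["fp"] / (s["fp"] + s["tn"]) if s["fp"] + s["tn"] else 0.0,
            # False negative rate: wrongly cleared among true positives.
            "fnr": s["fn"] / (s["fn"] + s["tp"]) if s["fn"] + s["tp"] else 0.0,
        }
    return report

data = [("A", 1, 1), ("A", 1, 0), ("A", 0, 0), ("A", 0, 1),
        ("B", 1, 0), ("B", 1, 0), ("B", 0, 0), ("B", 1, 1)]
print(audit(data))
```

      <p>In the hypothetical data, both groups see the same accuracy while the second
group's false positive rate is substantially higher: exactly the kind of systemic
pattern that individual explanations would miss.</p>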
    </sec>
    <sec id="sec-6">
      <title>Lies, Damned Lies, and Statistics</title>
      <p>One trap that exists when performing statistical analysis is that, due to the
aforementioned multitude of ways one can express statistical fairness, it is almost
always possible to present evidence of both compliance and noncompliance. This is
because many types of statistical fairness are inherently incompatible with each
other: altering the classifier to increase fairness along one axis will always decrease
it along another.</p>
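      <p>This incompatibility can be made concrete with the identity shown in [3],
which ties a group's false positive rate to its base rate, the classifier's positive
predictive value, and its true positive rate. The numbers below are invented: they
show that when base rates differ, a classifier with equal predictive parity and
equal true positive rates across groups cannot also balance false positive rates.</p>

```python
def implied_fpr(base_rate, ppv, tpr):
    # Identity from [3]: FPR = p/(1-p) * (1-PPV)/PPV * TPR, with p the base rate.
    return base_rate / (1 - base_rate) * (1 - ppv) / ppv * tpr

# Hypothetical classifier achieving the same PPV and TPR for both groups...
ppv, tpr = 0.8, 0.6
# ...applied to groups whose true outcome occurs at different rates.
fpr_high = implied_fpr(0.3, ppv, tpr)   # base rate 30%
fpr_low = implied_fpr(0.1, ppv, tpr)    # base rate 10%
print(fpr_high, fpr_low)
```

      <p>With these invented numbers the higher-prevalence group is forced to bear a
false positive rate nearly four times that of the other group, even though the
classifier treats both groups identically on the other two measures.</p>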
      <p>
        In the wake of Machine Bias [7], ProPublica and Northpointe argued that
the COMPAS algorithm was unfair and fair, respectively. Both parties were
technically correct. These explanations are thus only valid when paired with
background knowledge in data science and ethics, and may not be suitable for
presentation to the general public: doing so could lead to a reduction in trust of
machine learning techniques, especially if the presented facts are used to support
previously held beliefs which are incorrect [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>Another issue is that all of the methods that provide interpretable decisions
inevitably present reasoning that correlates with a decision making algorithm
but is not causally related to its output. In these cases, if the algorithm's internals
are indeed intractable, then it will remain impossible to ever prove a causal
link between the explanation system and the algorithm itself. This is not an
insurmountable problem (by its nature, all machine learning deals in correlations),
but it needs to be understood that using black box analysis techniques is not
enough to guarantee that a system is fair unless the entire problem domain is
exhaustively searched. For any model big enough to require auditing this will be
impossible.</p>
    </sec>
    <sec id="sec-7">
      <title>Discussion</title>
      <p>The point that becomes clear as we look at the realities surrounding transparency
in machine learning is that exclusively pursuing understandable and/or open
source algorithms is infeasible. When reviewing even a moderately-sized code
base, it quickly becomes apparent that issues of transparency and interpretability
cannot be resolved simply by making computer code available [8]. With a caveat
for certain contexts, we need to be able to deal with algorithms that are not
inherently transparent.</p>
      <p>Put another way, industry players are incentivised to use the machine learning
techniques that are best for profits, a decision which almost always favours
efficacy over interpretability. Given this, we need to consider techniques that can
be applied to methods where the raw form of the model defies our understanding,
such as neural networks.</p>
      <p>The position I advocate for here is not that we should give up completely on
pursuing transparency, but that we need to be clearer about what we are seeking.
By failing to differentiate between process and outcome transparency we run the
risk of intractable algorithms being used as an excuse for opaque and potentially
unfair decision making.</p>
      <p>At the same time, it is important to understand the epistemological
implications that come from using correlation-based methods to provide transparency.
However, this is already something that is being dealt with when it comes to
algorithmic decisions themselves. If the rest of the community can tackle the
misguided use of algorithmic 'evidence', then it is surely also possible to do the
same with transparency.</p>
      <p>Ultimately it is up to us to decide in each case whether the correlation-focused
evidence we can generate about an algorithm is sufficient to draw conclusions
about its fairness or unfairness. It is helpful to frame the question in the context of
the alternative, which is human-led decision making. It is no secret that decisions
made by people can occasionally be opaque and prone to bias [8], and using this
human baseline as a null hypothesis reminds us that the goal of our quest for
transparency should be for machines to exceed our own capabilities, not to attain
perfection.</p>
      <p>A realistic approach would be to use both types of technique (white and
black box) in tandem, analysing the inner workings of simpler components where
possible and utilising second-hand explanations and analysis otherwise. We should
remember that transparency can appear as a panacea for ethical issues arising
from new technologies, and that the case of machine learning is unlikely to be
any different [9]. That it is difficult to analyse the inner workings of particular
techniques will not slow or prevent their uptake, and it is increasingly clear that
there is a public and regulatory appetite for more accountable machine learning
systems. Therefore, going forward we need to be focussed on the attainable if we
are to effectively hold algorithm developers and their algorithms to account.
</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name><surname>Ananny</surname>, <given-names>M.</given-names></string-name>,
          <string-name><surname>Crawford</surname>, <given-names>K.</given-names></string-name>:
          <article-title>Seeing without knowing: Limitations of the transparency ideal and its application to algorithmic accountability</article-title>.
          <source>New Media &amp; Society</source>
          (<year>2016</year>)
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name><surname>Flyverbom</surname>, <given-names>M.</given-names></string-name>:
          <article-title>Transparency: Mediation and the management of visibilities</article-title>.
          <source>International Journal of Communication</source>
          <volume>10</volume>(<issue>1</issue>)
          (<year>2016</year>)
          <fpage>110</fpage>-<lpage>122</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name><surname>Chouldechova</surname>, <given-names>A.</given-names></string-name>:
          <article-title>Fair prediction with disparate impact: A study of bias in recidivism prediction instruments</article-title>.
          <source>Big Data</source>
          <volume>5</volume>(<issue>2</issue>)
          (<year>2017</year>)
          <fpage>153</fpage>-<lpage>163</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name><surname>Grgic-Hlaca</surname>, <given-names>N.</given-names></string-name>,
          <string-name><surname>Zafar</surname>, <given-names>M.B.</given-names></string-name>,
          <string-name><surname>Gummadi</surname>, <given-names>K.P.</given-names></string-name>,
          <string-name><surname>Weller</surname>, <given-names>A.</given-names></string-name>:
          <article-title>The case for process fairness in learning: Feature selection for fair decision making</article-title>.
          In: <source>NIPS Symposium on Machine Learning and the Law</source>
          (<year>2016</year>)
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name><surname>Ribeiro</surname>, <given-names>M.T.</given-names></string-name>,
          <string-name><surname>Singh</surname>, <given-names>S.</given-names></string-name>,
          <string-name><surname>Guestrin</surname>, <given-names>C.</given-names></string-name>:
          <article-title>Why should I trust you?: Explaining the predictions of any classifier</article-title>.
          In: <source>Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</source>, ACM
          (<year>2016</year>)
          <fpage>1135</fpage>-<lpage>1144</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name><surname>Wachter</surname>, <given-names>S.</given-names></string-name>,
          <string-name><surname>Mittelstadt</surname>, <given-names>B.</given-names></string-name>,
          <string-name><surname>Russell</surname>, <given-names>C.</given-names></string-name>:
          <article-title>Counterfactual explanations without opening the black box: Automated decisions and the GDPR</article-title>
          (<year>2017</year>)
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name><surname>Larson</surname>, <given-names>J.</given-names></string-name>,
          <string-name><surname>Mattu</surname>, <given-names>S.</given-names></string-name>,
          <string-name><surname>Kirchner</surname>, <given-names>L.</given-names></string-name>,
          <string-name><surname>Angwin</surname>, <given-names>J.</given-names></string-name>:
          <article-title>How we analyzed the COMPAS recidivism algorithm</article-title>.
          <source>ProPublica</source>
          (<year>2016</year>)
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name><surname>The Royal Society</surname></string-name>:
          <article-title>Machine learning</article-title>.
          Technical report, The Royal Society
          (<year>2017</year>)
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name><surname>Mittelstadt</surname>, <given-names>B.D.</given-names></string-name>,
          <string-name><surname>Allo</surname>, <given-names>P.</given-names></string-name>,
          <string-name><surname>Taddeo</surname>, <given-names>M.</given-names></string-name>,
          <string-name><surname>Wachter</surname>, <given-names>S.</given-names></string-name>,
          <string-name><surname>Floridi</surname>, <given-names>L.</given-names></string-name>:
          <article-title>The ethics of algorithms: Mapping the debate</article-title>.
          <source>Big Data &amp; Society</source>
          <volume>3</volume>(<issue>2</issue>)
          (<year>2016</year>)
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>