<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>European Workshop on Algorithmic Fairness, July</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>How to be Fair? A Discussion and Future Perspectives</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Marco Favier</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Toon Calders</string-name>
        </contrib>
      </contrib-group>
      <pub-date>
        <year>2022</year>
      </pub-date>
      <volume>0</volume>
      <fpage>1</fpage>
      <lpage>03</lpage>
      <abstract>
        <p>In our previous work titled “How to be fair? A study of label and selection bias,” we discussed how the interaction between bias and fairness can yield fruitful insights for fairness research. We considered the scenario where an initially unobservable fair distribution becomes corrupted by bias, leading to an observable unfair distribution. By employing various fairness definitions and types of bias, we derived valuable mathematical conditions and properties that the observable distribution must adhere to. This would allow practitioners to better understand bias and mitigate its effect in their models. In this paper, we delve into the significance of these findings, address their limitations, and explore potential future research directions.</p>
      </abstract>
      <kwd-group>
        <kwd>Algorithmic fairness</kwd>
        <kwd>Ethical AI</kwd>
        <kwd>Classification</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The fairness-accuracy trade-off [
        <xref ref-type="bibr" rid="ref1 ref2 ref3">1, 2, 3</xref>
        ] is frequently misunderstood in the fairness literature. Rather
than being viewed as a simple emergent phenomenon, as it should be, it is often portrayed as
an insurmountable conflict between two ethical principles. While it is reasonable to expect a
decrease in a classifier’s accuracy when adding a fairness constraint, it is incorrect to assume that
sacrificing accuracy is necessary to meet fairness requirements. Often, the trade-off is perceived
as an unavoidable compromise, where the fairness of the classifier is exchanged for its accuracy,
creating a false dichotomy between morality and performance quality. This line of thinking
fosters the notion that fairness is a luxury reserved for situations where accuracy is not paramount.
This dangerous perspective can lead to justifications for the absence of fairness in a classifier.
      </p>
      <p>
        Furthermore, the ethical interpretation of the fairness-accuracy trade-off is inherently flawed,
since it is self-contradictory. If we acknowledge the necessity of implementing fairness
interventions on our data, we imply that, because of bias, we have reason to believe that the data
may contain artifacts that could hurt the performance of a classifier trained on them. We have
already deemed our data flawed. It is then naive to expect that the accuracy of the classifier will
not decrease, since the effort to make the data suitable for classification will necessarily shift
the data distribution to a fairer one. If we pursue fairness, it is because we believe our data to
be biased, and if we believe our data to be biased, we should also believe they cannot provide a
good measure of the quality of the classifier. In reality, the trade-off should be considered an
empty signifier: it conveys no meaning and should only be regarded as a
numerical fact. In recent years, an increasing number of fairness researchers have
critically addressed the misconception surrounding the fairness-accuracy trade-off by highlighting
that, when bias is introduced into the data, a fair classifier achieves higher accuracy on the unbiased data
than a fairness-agnostic classifier [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ]. In this regard, studying the
effect of different biases on the probability distribution of the data is of primary importance, in
order to develop effective fairness techniques capable of removing bias from the data.
      </p>
      <p>Despite this, the literature still lacks a clear understanding of the relationship between bias
and fairness. Current fairness measures only acknowledge the presence of bias, but they are
unable to comprehend its nature or provide a clear path to remove it. This could pose potential
challenges when attempting to enforce fairness in real-world applications.</p>
      <p>For instance, when considering the fairness measure “Demographic Parity”, it’s clear that
the objective of the metric is to ensure that the probability of being classified as positive is the
same for all sensitive groups.</p>
      <p>However, it remains unclear whether the distribution of positive labels achieved through a
fairness intervention enforcing demographic parity is indicative of the successful elimination
of societal bias or merely an instance of affirmative action. In other words, it is unclear if the
new distribution is unbiased or if it is just a different kind of bias. We believe that studying the
interaction between bias and fairness should be a priority for the fairness community, and our
previous work is a step in this direction.</p>
    </sec>
    <sec id="sec-2">
      <title>2. How to be fair? A study of label and selection bias</title>
      <p>
        In our previous work “How to be fair? A study of label and selection bias” [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], we investigated
how bias interacts with fair distributions and whether fairness intervention techniques can
effectively remove bias from data. Our investigation starts from the notion of an ideal world
in which the collected data are inherently fair. We assume that the
available data are a distorted version of the data sampled from this ideal world.
The distortion is caused by bias, which obfuscates the original data and results in two distinct
distributions: an observable unfair distribution and a no-longer-observable fair distribution.
Moreover, we assume that every classifier is able to learn the observable biased
distribution from the available data, so that the score assigned by the classifier corresponds
to the unfair conditional probability of the label for a given data point. In technical terms, we
assume no epistemic uncertainty.
      </p>
      <p>Under these assumptions, asking whether fairness interventions are able to remove bias can be
rephrased as asking whether fairness interventions manage to correct a classifier and align its score with
the fair unobservable conditional probability. To formalize this mathematically, we consider a
binary random variable <inline-formula><tex-math>B</tex-math></inline-formula> representing bias, independent of the non-sensitive features <inline-formula><tex-math>X</tex-math></inline-formula> given
the binary label <inline-formula><tex-math>Y</tex-math></inline-formula> and the sensitive attribute <inline-formula><tex-math>A</tex-math></inline-formula>. In layman’s terms, for each data point in our
initially fair dataset, we flip a weighted coin to decide whether to introduce bias, where the weight of
the coin, i.e. the probability of bias, depends only on the sensitive attribute and the original label.</p>
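      <p>In this paper we analyze two types of bias: label bias and selection bias. Label bias arises
when the label assigned to an individual does not accurately represent the label they should
have been assigned, whereas selection bias occurs when certain individuals are not represented
in the dataset, resulting in a non-representative sample of the population.</p>
      <table-wrap id="tab-1">
        <label>Table 1</label>
        <table>
          <thead>
            <tr>
              <th></th>
              <th>Label Bias</th>
              <th>Selection Bias</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td>Demographic Parity</td>
              <td>Necessary conditions on the observable <inline-formula><tex-math>P(Y \mid X, A)</tex-math></inline-formula>; the fair <inline-formula><tex-math>P(Y \mid X, A)</tex-math></inline-formula> satisfies fairness</td>
              <td>Not detectable from the observable <inline-formula><tex-math>P(Y \mid X, A)</tex-math></inline-formula>; the fair <inline-formula><tex-math>P(Y \mid X, A)</tex-math></inline-formula> doesn’t satisfy fairness</td>
            </tr>
            <tr>
              <td>We’re all equal</td>
              <td>Necessary conditions on the observable <inline-formula><tex-math>P(Y \mid X, A)</tex-math></inline-formula>; the fair <inline-formula><tex-math>P(Y \mid X, A)</tex-math></inline-formula> satisfies fairness</td>
              <td>Necessary conditions on the observable <inline-formula><tex-math>P(Y \mid X, A)</tex-math></inline-formula>; the fair <inline-formula><tex-math>P(Y \mid X, A)</tex-math></inline-formula> satisfies fairness</td>
            </tr>
          </tbody>
        </table>
      </table-wrap>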
      <p>To study these biases, a mathematical model is needed. We adopted the following: for label
bias, when a data point needs to be biased, i.e. when it loses the coin flip, we change the assigned
binary label; for selection bias, we instead remove the data point entirely.</p>
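      <p>The coin-flip model described above can be illustrated with a short simulation. The following is a minimal sketch under stated assumptions, not the implementation used in our experiments: the bias probabilities in <monospace>BIAS_PROB</monospace>, the toy dataset, and all function names are hypothetical and chosen only for illustration. Each data point is a triple of a non-sensitive feature, a binary sensitive attribute, and a binary label, and the coin weight depends only on the last two.</p>
      <preformat><![CDATA[
```python
import random

# Hypothetical probability that a point is biased, keyed by (sensitive attribute, label).
# Illustrative values only: here only positives of group 1 can be affected.
BIAS_PROB = {(0, 0): 0.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 0.4}

def loses_coin_flip(a, y, rng):
    """Weighted coin whose weight depends only on the sensitive attribute and the label."""
    return rng.random() < BIAS_PROB[(a, y)]

def apply_label_bias(data, rng):
    """Label bias: flip the binary label of every data point that loses the coin flip."""
    return [(x, a, 1 - y if loses_coin_flip(a, y, rng) else y) for x, a, y in data]

def apply_selection_bias(data, rng):
    """Selection bias: remove every data point that loses the coin flip."""
    return [(x, a, y) for x, a, y in data if not loses_coin_flip(a, y, rng)]

def positive_rate(data, group):
    """Fraction of positive labels within one sensitive group."""
    labels = [y for _, a, y in data if a == group]
    return sum(labels) / len(labels)

rng = random.Random(0)
# Fair toy data: 5000 points per (group, label) cell, so both groups have positive rate 0.5.
fair_data = [(rng.random(), a, y) for a in (0, 1) for y in (0, 1) for _ in range(5000)]
label_biased = apply_label_bias(fair_data, rng)
selection_biased = apply_selection_bias(fair_data, rng)

for name, data in [("fair", fair_data),
                   ("label bias", label_biased),
                   ("selection bias", selection_biased)]:
    print(name, len(data), round(positive_rate(data, 0), 3), round(positive_rate(data, 1), 3))
```
]]></preformat>
      <p>In this sketch the positive rate of the unaffected group stays at 0.5 under both kinds of bias, while that of the affected group drops to roughly 0.3 under label bias and to roughly 0.375 under selection bias. Label bias leaves the dataset size unchanged, whereas selection bias shrinks it.</p>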
      <p>We then examined the impact of these biases on two specific fairness definitions:
“Demographic Parity” and “We’re all equal”. Demographic parity, as previously mentioned, means that
the distribution of the initial label is the same for both sensitive groups, formally <inline-formula><tex-math>Y \perp\!\!\!\perp A</tex-math></inline-formula>. “We’re
all equal”, on the other hand, requires that the distribution of the initial label depends solely on
the non-sensitive features, formally <inline-formula><tex-math>Y \perp\!\!\!\perp A \mid X</tex-math></inline-formula>. Based on these notions, our research question
now is:</p>
      <disp-quote>
        <p>Consider a distribution that satisfies either demographic parity or “we’re all equal,”
which is transformed by either label or selection bias. Is it possible to detect or
discern the bias based on a sample of the altered distribution? And if so, is there a
fairness intervention that can counteract it?</p>
      </disp-quote>
      <p>Altogether, there are four possible combinations of bias and fairness that we
can study, as shown in Table 1.</p>
      <p>Having a clear mathematical framework proved to be a fruitful choice for our research, as we
were able to prove multiple statements about each of the four combinations. In particular, our
research made evident that in three out of four combinations, when bias has influenced
the distribution, it is possible to confirm this purely by looking at the conditional distribution of
the label, since there are strict conditions that the observable distribution must satisfy. Moreover,
since different conditions are satisfied for different combinations, it is possible to discern the
nature of the bias.</p>
      <p>This result has positive implications for the field of fairness, as it suggests that, at least
theoretically, the data itself could assist in determining the best course of action. Certain
choices might already be excluded because they do not align with the data distribution,
relieving practitioners of the burden of blindly applying fairness techniques and helping
them make a more informed decision.</p>
      <p>On the other hand, the remaining combination, selection bias combined with demographic
parity, yields an even more interesting and somewhat counter-intuitive result. In this case, we
demonstrated that the original fair distribution never satisfies demographic parity when
computed on the available biased data. In other words, the correct fair distribution does not appear
completely fair according to the measure used. This simple condition has a significant impact
on the application of many classifiers in terms of fairness. Indeed, many fairness techniques
are designed to enforce demographic parity on the available data; they therefore automatically fail to
properly remove selection bias from the data, as they cannot retrieve the original fair distribution.
When bias influences the data, even the fairness measures themselves can become biased.</p>
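      <p>A small numerical illustration may make this result more concrete. The sketch below is not taken from our proofs: it assumes a single binary non-sensitive feature, hypothetical probabilities throughout, and one particular reading of “computed on the biased data”, namely averaging the fair score over the data points of each group that survive selection. All names in the code are assumptions introduced for the example.</p>
      <preformat><![CDATA[
```python
from itertools import product

# Hypothetical fair world: the positive rate is 0.5 in both groups,
# so demographic parity holds before any bias is applied.
P_Y1 = 0.5
# Probability that the binary feature equals 1 given the label (same in both groups).
P_X1_GIVEN_Y = {0: 0.2, 1: 0.8}
# Selection bias: probability that a point is dropped, keyed by (sensitive attribute, label).
P_DROP = {(0, 0): 0.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 0.5}

def fair_joint(x, y):
    """Joint probability of feature x and label y within a group, in the fair world."""
    p_y = P_Y1 if y == 1 else 1 - P_Y1
    p_x = P_X1_GIVEN_Y[y] if x == 1 else 1 - P_X1_GIVEN_Y[y]
    return p_y * p_x

def fair_score(x):
    """Fair conditional probability of a positive label given feature x."""
    return fair_joint(x, 1) / (fair_joint(x, 0) + fair_joint(x, 1))

def fair_rate_on_biased_data(a):
    """Average fair score over the points of group a that survive selection."""
    kept = {(x, y): fair_joint(x, y) * (1 - P_DROP[(a, y)])
            for x, y in product((0, 1), repeat=2)}
    total = sum(kept.values())
    return sum(mass / total * fair_score(x) for (x, _), mass in kept.items())

rates = {a: fair_rate_on_biased_data(a) for a in (0, 1)}
print(rates)
```
]]></preformat>
      <p>With these assumed numbers, the fair score averages 0.5 in the group untouched by selection but about 0.44 in the group whose positives are under-sampled, so the fair distribution itself violates demographic parity when evaluated on the biased data.</p>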
      <p>This also formally supports one of the most common criticisms of fairness, namely that we
need a better understanding of the effect of fairness techniques on data.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Limitations, Future work and Conclusions</title>
      <p>
        Despite the merits of our work, we acknowledge that it also has its limitations. Firstly, even
though we utilized our theoretical framework to explain certain experimental results from the
literature [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] and expanded upon them with our own, our work remains primarily theoretical in
nature. Further exploration into practical applications is still necessary. Secondly, all our proofs
rely on multiple, and often strong, assumptions, which would be interesting to generalize. In
particular, the assumption that the bias is independent of the features is unrealistic.
      </p>
      <p>In reality, it is often the case that bias may be influenced by both the features and the sensitive
attributes. For instance, non-sensitive features that correlate with being part of a minority might
inadvertently trigger confirmation bias in individuals with prejudices. This could subsequently
exacerbate discrimination against certain groups of people, thereby perpetuating bias. A person
with a foreign-sounding name might experience more rejections than a compatriot with a
more common name when seeking employment. Police over-patrolling black neighborhoods
gathers more data on residents of those neighborhoods than on black individuals residing in
predominantly-white areas. In both cases, the bias is influenced by the features; label bias for
the first example and selection bias for the second.</p>
      <p>We believe this to be the most important point of improvement in our work. To overcome this
limitation, in our future work we will explore what happens when the features are allowed to
influence bias in our definitions. However, we also need to limit the extent to which the features
influence the bias. This enables us to relax the independence assumption without altering the
definitions of bias established thus far. We believe this to be a reasonable approach, as it allows
us to still apply mathematical rigor when defining different kinds of biases.</p>
      <p>Here we show three possible approaches: the first is based on differential privacy, the second on
the Hirschfeld–Gebelein–Rényi coefficient, and the third on a direct chance constraint on the
probability distribution.</p>
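      <list list-type="order">
        <list-item>
          <p><inline-formula><tex-math>P(B \mid X, Y, A) \leq e^{\varepsilon}\, P(B \mid Y, A)</tex-math></inline-formula></p>
        </list-item>
        <list-item>
          <p><inline-formula><tex-math>\mathrm{HGR}(B, X \mid Y, A) := \sup_{f, g} \operatorname{corr}(f(B), g(X) \mid Y, A) &lt; \varepsilon</tex-math></inline-formula> for functions <inline-formula><tex-math>f</tex-math></inline-formula> and <inline-formula><tex-math>g</tex-math></inline-formula>.</p>
        </list-item>
        <list-item>
          <p><inline-formula><tex-math>P\big(\lvert P(B = 1 \mid X, Y, A) - P(B = 1 \mid Y, A) \rvert \geq \varepsilon\big) \leq \delta</tex-math></inline-formula></p>
        </list-item>
      </list>
      <p>Here <inline-formula><tex-math>B</tex-math></inline-formula> is the bias variable, <inline-formula><tex-math>X</tex-math></inline-formula> the non-sensitive features, <inline-formula><tex-math>Y</tex-math></inline-formula> the label, <inline-formula><tex-math>A</tex-math></inline-formula> the sensitive attribute, and <inline-formula><tex-math>\varepsilon</tex-math></inline-formula> and <inline-formula><tex-math>\delta</tex-math></inline-formula> are tolerance parameters.
All these definitions are suitable, albeit with different levels of generality and ease of
implementation. They can further improve our understanding of the relationship between bias and
fairness and help us develop more effective fairness interventions. We believe our work is a
step in the right direction: a new research line to connect bias types and data assumptions to
measurable properties in observed data. The implications are both theoretical and practical,
leading to a better understanding of fairness measures and better tools to guarantee fair AI.</p>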
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M.</given-names>
            <surname>Favier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Calders</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Pinxteren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Meyer</surname>
          </string-name>
          ,
          <article-title>How to be fair? A study of label and selection bias</article-title>
          ,
          <source>Machine Learning</source>
          <volume>112</volume>
          (
          <year>2023</year>
          )
          <fpage>5081</fpage>
          -
          <lpage>5104</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>S.</given-names>
            <surname>Corbett-Davies</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Pierson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Feller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Goel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Huq</surname>
          </string-name>
          ,
          <article-title>Algorithmic Decision Making and the Cost of Fairness</article-title>
          ,
          <source>in: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</source>
          , KDD '17,
          <publisher-name>Association for Computing Machinery</publisher-name>,
          <publisher-loc>New York, NY, USA</publisher-loc>,
          <year>2017</year>
          , pp.
          <fpage>797</fpage>
          -
          <lpage>806</lpage>
          . URL: https://doi.org/10.1145/3097983.3098095. doi:<pub-id pub-id-type="doi">10.1145/3097983.3098095</pub-id>.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A. K.</given-names>
            <surname>Menon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. C.</given-names>
            <surname>Williamson</surname>
          </string-name>
          ,
          <article-title>The cost of fairness in binary classification</article-title>
          ,
          <source>in: Proceedings of the 1st Conference on Fairness, Accountability and Transparency</source>
          ,
          <publisher-name>PMLR</publisher-name>
          ,
          <year>2018</year>
          , pp.
          <fpage>107</fpage>
          -
          <lpage>118</lpage>
          . URL: https://proceedings.mlr.press/v81/menon18a.html, ISSN: 2640-3498
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Wick</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-B.</given-names>
            <surname>Tristan</surname>
          </string-name>
          , et al.,
          <article-title>Unlocking fairness: a trade-off revisited</article-title>
          ,
          <source>Advances in neural information processing systems</source>
          <volume>32</volume>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>D.</given-names>
            <surname>Lenders</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Calders</surname>
          </string-name>
          ,
          <article-title>Real-life performance of fairness interventions - introducing a new benchmarking dataset for fair ML</article-title>
          ,
          <source>in: Proceedings of the 38th ACM/SIGAPP symposium on applied computing</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>350</fpage>
          -
          <lpage>357</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>