<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>J. Baumann);</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Machine Learning Through Post-processing: The Case of Predictive Parity</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Joachim Baumann</string-name>
          <email>baumann@ifi.uzh.ch</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anikó Hannák</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Christoph Heitz</string-name>
          <email>christoph.heitz@zhaw.ch</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>EWAF'23: European Workshop on Algorithmic Fairness</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Zurich</institution>
          ,
          <addr-line>Zurich</addr-line>
          ,
          <country country="CH">Switzerland</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Zurich University of Applied Sciences</institution>
          ,
          <addr-line>Zurich</addr-line>
          ,
          <country country="CH">Switzerland</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>000</volume>
      <fpage>0</fpage>
      <lpage>0002</lpage>
      <abstract>
        <p>Post-processing is a bias mitigation technique proposed by the algorithmic fairness community to ensure the fairness of decision-making systems that rely on machine learning (ML). Several works have provided solutions to optimally post-process ML-based systems so that their decisions are fair w.r.t. specific group fairness criteria such as statistical parity (SP) or equality of opportunity (EOP) [1, 2]: here, optimal decision rules always take the form of lower-bound threshold rules. We investigate the case of another important fairness criterion called predictive parity. We show that for this notion of fairness, the optimal decision rules are different: in some cases, the optimal decision rule consists in applying an upper-bound threshold rule for (at least) one group. This result is counter-intuitive: for a decision maker, it may be optimal to leave out the most promising individuals of a group in order to achieve predictive parity in a globally optimal way. This is in contrast to the analogous solutions for SP and EOP. Furthermore, even if between-group fairness is achieved, within-group unfairness may be created. We encourage readers to consult the complete manuscript [3], which was published at FAccT 2022.</p>
      </abstract>
      <kwd-group>
        <kwd>Fairness</kwd>
        <kwd>predictive parity</kwd>
        <kwd>post-processing</kwd>
        <kwd>optimal decision rules</kwd>
        <kwd>group fairness</kwd>
        <kwd>sufficiency</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>Background</p>
      <p>
        Prediction-based binary decision systems are not fair by default. In order
to measure and eventually correct for discrimination against certain social groups, different
mathematical notions of so-called group fairness criteria have been proposed [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ]. One line
of research is concerned with optimal post-processing of ML models, deriving decision rules
that satisfy some group fairness constraint while still leading to efficient decisions [
        <xref ref-type="bibr" rid="ref1 ref2 ref6 ref7">1, 2, 6, 7</xref>
        ].
Following this approach, we formulate the goal of fairness as a constrained optimization problem
for a decision maker, assuming that the goal is to maximize the decision maker’s utility function while
satisfying some fairness constraint [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Such optimal decision rules have been derived for the
group fairness criteria (conditional) statistical parity, equality of opportunity (also called True
Positive Rate (TPR) parity), False Positive Rate (FPR) parity, and Equalized Odds (EO) [
        <xref ref-type="bibr" rid="ref1 ref2 ref6">1, 2, 6</xref>
        ].
It has been shown that lower-bound threshold rules characterize optimal decision rules that
satisfy these fairness constraints.<sup>1</sup>
CEUR
Workshop
Proceedings
<sup>1</sup> … decisions [
        <xref ref-type="bibr" rid="ref10 ref9">9, 10</xref>
        ]. Lower-bound threshold rules are decision rules given by d = 1 if p &gt; τ, and d = 0 otherwise.
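      </p>
      <table-wrap>
        <label>Table 1</label>
        <table>
          <tbody>
            <tr><td>Predictive parity</td><td>P(Y = 1 | D = 1, A = 0) = P(Y = 1 | D = 1, A = 1)</td></tr>
            <tr><td>FOR parity</td><td>P(Y = 1 | D = 0, A = 0) = P(Y = 1 | D = 0, A = 1)</td></tr>
            <tr><td>Sufficiency</td><td>P(Y = 1 | D = d, A = 0) = P(Y = 1 | D = d, A = 1), d ∈ {0, 1}</td></tr>
          </tbody>
        </table>
      </table-wrap>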
      <p>
        In computer science and in philosophy literature, predictive parity (also known as parity of
positive predictive values (PPV) or precision across groups) is often mentioned as one of the main
fairness criteria [
        <xref ref-type="bibr" rid="ref10 ref4 ref5">4, 5, 10–22</xref>
        ]. Related fairness criteria are false omission rate (FOR) parity and
sufficiency. The most prominent example is probably the 2016 debate surrounding the recidivism
risk prediction tool COMPAS [11]. In response to [23] suggesting that the tool systematically
disadvantages black defendants, Northpointe (the developers of COMPAS) claimed that their
tool is fair because it satisfies predictive parity and FOR parity [24].<sup>2</sup>
      </p>
      <p>Research gap. Optimal post-processing solutions are unknown for fairness criteria that
condition on the decision, namely predictive parity, FOR parity, and sufficiency (which combines
the former two); see Table 1 for the definitions w.r.t. the decision D, the label Y, and the binary groups
A = {0, 1}. We close this gap by deriving optimal decision rules that satisfy these group fairness
criteria through post-processing.
      </p>
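      <p>As an illustration of the definitions in Table 1, the following minimal sketch estimates the positive predictive value, P(Y = 1 | D = 1), and the false omission rate, P(Y = 1 | D = 0), per group from observed labels and decisions, and checks whether they coincide across groups. The function names, the tolerance, and the toy data are assumptions made for this example; they are not taken from the paper.</p>
      <preformat><![CDATA[
```python
from typing import Dict, Sequence


def group_rates(y: Sequence[int], d: Sequence[int], a: Sequence[int]) -> Dict[int, Dict[str, float]]:
    """Estimate PPV = P(Y=1 | D=1, A=g) and FOR = P(Y=1 | D=0, A=g) for each group g."""
    rates = {}
    for g in sorted(set(a)):
        # Labels of the individuals in group g that received a positive / negative decision.
        pos = [yi for yi, di, ai in zip(y, d, a) if ai == g and di == 1]
        neg = [yi for yi, di, ai in zip(y, d, a) if ai == g and di == 0]
        rates[g] = {
            "ppv": sum(pos) / len(pos) if pos else float("nan"),
            "for": sum(neg) / len(neg) if neg else float("nan"),
        }
    return rates


def has_parity(rates: Dict[int, Dict[str, float]], key: str, tol: float = 1e-9) -> bool:
    """True if the chosen rate differs by at most tol across groups (hypothetical tolerance)."""
    values = [r[key] for r in rates.values()]
    return max(values) - min(values) <= tol


# Toy data (invented for illustration): four individuals per group.
y = [1, 1, 0, 0, 1, 1, 0, 1]  # observed labels Y
d = [1, 1, 1, 0, 1, 1, 1, 0]  # decisions D
a = [0, 0, 0, 0, 1, 1, 1, 1]  # group membership A

rates = group_rates(y, d, a)
predictive_parity = has_parity(rates, "ppv")
for_parity = has_parity(rates, "for")
sufficiency = predictive_parity and for_parity  # sufficiency requires both
print(rates, predictive_parity, for_parity, sufficiency)
```
]]></preformat>
      <p>In this toy data set both groups have the same PPV but different FORs, so predictive parity holds while FOR parity and sufficiency do not.</p>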
      <p>Findings. We provide a formal proof showing that optimal decision rules satisfying predictive
parity or FOR parity take the form of group-specific threshold rules, as has been found for other
fairness criteria. Surprisingly, however, under some conditions (depending on the populations
and the applied utility function), upper-bound thresholds are optimal: a decision maker would
assign a positive decision (D = 1) to individuals with a low probability of belonging to the
positive class (Y = 1). This is visualized in Figure 1, which shows the density functions of the probability p
for two groups, 0 and 1; the colored parts represent the individuals who receive
a positive decision. Without any fairness constraint, a single uniform lower-bound threshold
would be optimal (i.e., d = 1 if p &gt; τ<sub>0</sub>), resulting in different PPVs for the two groups (denoted
by PPV<sub>τ0</sub> in Figure 1). To ensure predictive parity, it is optimal to apply a lower-bound threshold
to Group 0 (i.e., d = 1 if p &gt; τ<sub>1</sub>) and an upper-bound threshold to Group 1 (i.e., d = 1 if p &lt; τ<sub>2</sub>),
resulting in a PPV of PPV<sub>τ1,2</sub> for both groups. In this situation, a rational decision maker
is willing to omit the most promising individuals from Group 1 in order to achieve predictive
parity, which is highly counter-intuitive.</p>
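      <p>The effect described above can be reproduced on a toy population with a brute-force search. The sketch below is not the derivation from the paper: it simply enumerates lower-bound rules (d = 1 if p &gt; t) and upper-bound rules (d = 1 if p &lt; t) for each group and keeps the pair with the highest total expected utility among those with equal expected PPV. It assumes calibrated probabilities p, a utility of +1 for a positive decision on Y = 1 and -1 on Y = 0, and at least one positive decision per group; all names and numbers are hypothetical.</p>
      <preformat><![CDATA[
```python
from itertools import product
from typing import List, Optional, Tuple

# ("lower", t) means d = 1 if p > t; ("upper", t) means d = 1 if p < t.
Rule = Tuple[str, float]


def select(scores: List[float], rule: Rule) -> List[float]:
    """Return the probabilities of the individuals that receive d = 1 under the rule."""
    kind, t = rule
    return [p for p in scores if (p > t if kind == "lower" else p < t)]


def ppv(selected: List[float]) -> float:
    """Expected PPV of a selection, assuming the probabilities are calibrated."""
    return sum(selected) / len(selected)


def utility(selected: List[float], gain: float = 1.0, cost: float = 1.0) -> float:
    """Expected utility: +gain for each selected Y = 1, -cost for each selected Y = 0."""
    return sum(gain * p - cost * (1.0 - p) for p in selected)


def candidate_rules(scores: List[float]) -> List[Rule]:
    """All threshold rules on a data-driven grid that select at least one individual."""
    thresholds = sorted(set(scores) | {0.0, 1.0})
    rules = [(kind, t) for kind in ("lower", "upper") for t in thresholds]
    return [r for r in rules if select(scores, r)]


def best_rules(scores0: List[float], scores1: List[float],
               tol: float = 1e-9) -> Optional[Tuple[float, Rule, Rule]]:
    """Utility-maximizing pair of rules whose expected PPVs differ by at most tol."""
    best = None
    for r0, r1 in product(candidate_rules(scores0), candidate_rules(scores1)):
        s0, s1 = select(scores0, r0), select(scores1, r1)
        if abs(ppv(s0) - ppv(s1)) > tol:
            continue  # violates predictive parity
        total = utility(s0) + utility(s1)
        if best is None or total > best[0]:
            best = (total, r0, r1)
    return best


# Toy populations (invented): predicted probabilities p for each individual.
group0 = [0.2, 0.6, 0.6, 0.6, 0.6]
group1 = [0.6, 0.9]

fair = best_rules(group0, group1)                    # with predictive parity
free = best_rules(group0, group1, tol=float("inf"))  # no fairness constraint
print("constrained:", fair)
print("unconstrained utility:", free[0])
```
]]></preformat>
      <p>For these toy numbers, the constrained optimum applies a lower-bound threshold to Group 0 and an upper-bound threshold to Group 1, so the individual with p = 0.9 in Group 1 is left out even though this individual would be selected without the fairness constraint.</p>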
      <p>
        Furthermore, we provide a solution for the optimal decision rules that satisfy sufficiency. We
find that this definition of fairness requires randomization (similar to the EO criterion [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]).
      </p>
      <p><sup>1</sup> (continued) For the EO criterion, randomization involving two such thresholds is needed to satisfy EOP and FPR parity simultaneously [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].</p>
      <p><sup>2</sup> In addition to recidivism prediction, predictive parity is also prevalent in predictive policing [25] (where the metric
is usually called hit rate or outcome test) and in personalized online ads (where the notion of click-through rates [26],
which is an equivalent metric, is omnipresent).</p>
      <p>
Recently, we have conducted additional experiments showing that the solution provided in this
paper is effective in mitigating many different types of bias that can be present in ML-based
decision-making systems [27]. These experiments show that post-processing techniques [
        <xref ref-type="bibr" rid="ref1 ref2 ref3">1–3</xref>
        ]
can cope with historical biases on the features or labels and even with measurement bias on
the features. However, measurement bias on the label is particularly difficult to mitigate, and
existing (post-processing) solutions are limited since they rely on the biased proxy of the label.
      </p>
      <p>Ethical implications. In many cases, individuals with Y = 1 have, morally speaking, a higher
claim to a positive decision D = 1 than individuals with Y = 0, and vice versa. For example, in
the case of COMPAS, this means that individuals with a lower probability of recidivism (i.e., a
low p = P[Y = 1]) should preferably be released (D = 0). However, requiring a rational decision
maker to fulfill predictive parity can result in releasing individuals with higher recidivism
probabilities instead. This represents a case of within-group unfairness: achieving
between-group fairness at the expense of within-group fairness may be problematic from an ethical
perspective.
      </p>
      <p>Society increasingly calls for fairer algorithms. At least for the group fairness criteria
predictive parity, FOR parity, and sufficiency, our work shows that imposing such fairness criteria on
utility-maximizing decision makers may lead to ethically problematic outcomes.</p>
      <p>Acknowledgments. We thank the other members of our project and colleagues (Corinna Hertweck, Eleonora Viganò,
Ulrich Leicht-Deobald, Serhiy Kandul, Markus Christen, Nicolò Pagan, Stefania Ionescu,
Aleksandra Urman, and Leonore Röseler) for their helpful comments and suggestions. We also thank
the anonymous reviewers for their feedback. This work was supported by Innosuisse – grant
number 44692.1 IP-SBM – and by the National Research Programme “Digital Transformation”
(NRP 77) of the Swiss National Science Foundation (SNSF) – grant number 187473.</p>
      <p>
[11] A. Chouldechova, Fair Prediction with Disparate Impact: A Study of Bias in Recidivism Prediction Instruments, Big Data 5 (2017) 153–163. doi:10.1089/big.2016.0047.</p>
      <p>[12] M. Kearns, A. Roth, The Ethical Algorithm: The Science of Socially Aware Algorithm Design, Oxford University Press, Inc., USA, 2019.</p>
      <p>[13] S. Barocas, M. Hardt, A. Narayanan, Fairness and Machine Learning, fairmlbook.org, 2019.</p>
      <p>[14] D. Pessach, E. Shmueli, A Review on Fairness in Machine Learning, ACM Comput. Surv. 55 (2022). doi:10.1145/3494672.</p>
      <p>[15] D. Leben, Normative Principles for Evaluating Fairness in Machine Learning, Association for Computing Machinery, New York, NY, USA, 2020, pp. 86–92. URL: https://doi.org/10.1145/3375627.3375808.</p>
      <p>[16] R. Berk, H. Heidari, S. Jabbari, M. Kearns, A. Roth, Fairness in Criminal Justice Risk Assessments: The State of the Art, Sociological Methods &amp; Research 50 (2021) 3–44. doi:10.1177/0049124118782533.</p>
      <p>[17] K. Makhlouf, S. Zhioua, C. Palamidessi, On the Applicability of Machine Learning Fairness Notions, SIGKDD Explor. Newsl. 23 (2021) 14–23. doi:10.1145/3468507.3468511.</p>
      <p>[18] J. Baumann, C. Heitz, Group Fairness in Prediction-Based Decision Making: From Moral Assessment to Implementation, in: 2022 9th Swiss Conference on Data Science (SDS), 2022, pp. 19–25. doi:10.1109/SDS54800.2022.00011.</p>
      <p>[19] M. Loi, A. Herlitz, H. Heidari, A Philosophical Theory of Fairness for Prediction-Based Decisions, SSRN Electronic Journal (2019). doi:10.2139/ssrn.3450300.</p>
      <p>[20] J. Kleinberg, S. Mullainathan, M. Raghavan, Inherent Trade-Offs in the Fair Determination of Risk Scores, 2016. arXiv:1609.05807v2.</p>
      <p>[21] S. A. Friedler, C. Scheidegger, S. Venkatasubramanian, On the (im)possibility of fairness, 2016. URL: https://arxiv.org/abs/1609.07236. arXiv:1609.07236.</p>
      <p>[22] G. Pleiss, M. Raghavan, F. Wu, J. Kleinberg, K. Q. Weinberger, On Fairness and Calibration, in: Proceedings of the 31st International Conference on Neural Information Processing Systems, Curran Associates Inc., 2017, pp. 5684–5693.</p>
      <p>[23] J. Angwin, J. Larson, S. Mattu, L. Kirchner, Machine bias, ProPublica, May 23 (2016) 139–159. URL: https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing.</p>
      <p>[24] W. Dieterich, C. Mendoza, T. Brennan, COMPAS Risk Scales: Demonstrating Accuracy Equity and Predictive Parity, Technical Report, Northpointe Inc., 2016. URL: https://www.equivant.com/response-to-propublica-demonstrating-accuracy-equity-and-predictive-parity/.</p>
      <p>[25] C. Simoiu, S. Corbett-Davies, S. Goel, The problem of infra-marginality in outcome tests for discrimination, The Annals of Applied Statistics 11 (2017) 1193–1216. doi:10.1214/17-AOAS1058.</p>
      <p>[26] X. Wang, W. Li, Y. Cui, R. Zhang, J. Mao, Click-through rate estimation for rare events in online advertising, in: Online multimedia advertising: Techniques and technologies, IGI Global, 2011, pp. 1–12.</p>
      <p>[27] J. Baumann, A. Castelnovo, R. Crupi, N. Inverardi, D. Regoli, Bias on Demand: A Modelling Framework That Generates Synthetic Data With Bias, in: 2023 ACM Conference on Fairness, Accountability, and Transparency, FAccT ’23, Association for Computing Machinery, New York, NY, USA, 2023. doi:10.1145/3593013.3594058.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S.</given-names>
            <surname>Corbett-Davies</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Pierson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Feller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Goel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Huq</surname>
          </string-name>
          ,
          <article-title>Algorithmic Decision Making and the Cost of Fairness</article-title>
          , in:
          <source>Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</source>
          , KDD '17, Association for Computing Machinery, New York, NY, USA,
          <year>2017</year>
          , pp.
          <fpage>797</fpage>
          -
          <lpage>806</lpage>
          . doi:10.1145/3097983.3098095.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Hardt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Price</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Srebro</surname>
          </string-name>
          ,
          <article-title>Equality of opportunity in supervised learning</article-title>
          , in:
          <source>Advances in Neural Information Processing Systems</source>
          , NIPS'16, Curran Associates Inc., Red Hook, NY, USA,
          <year>2016</year>
          , pp.
          <fpage>3323</fpage>
          -
          <lpage>3331</lpage>
          . arXiv:1610.02413.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.</given-names>
            <surname>Baumann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hannák</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Heitz</surname>
          </string-name>
          ,
          <article-title>Enforcing Group Fairness in Algorithmic Decision Making: Utility Maximization Under Sufficiency</article-title>
          , in:
          <source>Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency</source>
          , FAccT '22, Association for Computing Machinery, New York, NY, USA,
          <year>2022</year>
          , pp.
          <fpage>2315</fpage>
          -
          <lpage>2326</lpage>
          . doi:10.1145/3531146.3534645.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Narayanan</surname>
          </string-name>
          ,
          <article-title>Translation tutorial: 21 fairness definitions and their politics</article-title>
          , in:
          <source>Proc. Conf. Fairness Accountability Transp.</source>
          , New York, USA,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>S.</given-names>
            <surname>Verma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Rubin</surname>
          </string-name>
          ,
          <article-title>Fairness Definitions Explained</article-title>
          , in:
          <source>Proceedings of the International Workshop on Software Fairness</source>
          , FairWare '18, Association for Computing Machinery, New York, NY, USA,
          <year>2018</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>7</lpage>
          . doi:10.1145/3194770.3194776.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Z. C.</given-names>
            <surname>Lipton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Chouldechova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>McAuley</surname>
          </string-name>
          ,
          <article-title>Does mitigating ML's impact disparity require treatment disparity?</article-title>
          , in:
          <source>Proceedings of the 32nd International Conference on Neural Information Processing Systems</source>
          , Curran Associates, Inc.,
          <year>2018</year>
          , pp.
          <fpage>8136</fpage>
          -
          <lpage>8146</lpage>
          . URL: https://proceedings.neurips.cc/paper/2018/file/8e0384779e58ce2af40eb365b318cc32-Paper.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>A. K.</given-names>
            <surname>Menon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. C.</given-names>
            <surname>Williamson</surname>
          </string-name>
          ,
          <article-title>The cost of fairness in binary classification</article-title>
          , in: S. A. Friedler, C. Wilson (Eds.),
          <source>Proceedings of the 1st Conference on Fairness, Accountability and Transparency</source>
          , volume
          <volume>81</volume>
          of Proceedings of Machine Learning Research
          , PMLR, New York, NY, USA,
          <year>2018</year>
          , pp.
          <fpage>107</fpage>
          -
          <lpage>118</lpage>
          . URL: http://proceedings.mlr.press/v81/menon18a.html.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S.</given-names>
            <surname>Mitchell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Potash</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Barocas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>D'Amour</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lum</surname>
          </string-name>
          ,
          <article-title>Algorithmic Fairness: Choices, Assumptions, and Definitions</article-title>
          ,
          <source>Annual Review of Statistics and Its Application</source>
          <volume>8</volume>
          (
          <year>2021</year>
          )
          <fpage>141</fpage>
          -
          <lpage>163</lpage>
          . doi:10.1146/annurev-statistics-042720-125902.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>J.</given-names>
            <surname>Kleinberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Lakkaraju</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Leskovec</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ludwig</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Mullainathan</surname>
          </string-name>
          ,
          <article-title>Human Decisions and Machine Predictions</article-title>
          ,
          <source>The Quarterly Journal of Economics</source>
          <volume>133</volume>
          (
          <year>2017</year>
          )
          <fpage>237</fpage>
          -
          <lpage>293</lpage>
          . doi:10.1093/qje/qjx032.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>S.</given-names>
            <surname>Caton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Haas</surname>
          </string-name>
          ,
          <source>Fairness in Machine Learning: A Survey</source>
          ,
          <year>2020</year>
          . arXiv:2010.04053.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>