<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>European Workshop on Algorithmic Fairness, June</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Model-Agnostic Auditing: A Lost Cause?</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sakina Hansen</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Joshua Loftus</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>London School of Economics</institution>
          ,
          <addr-line>Houghton Street, London</addr-line>
          ,
          <country country="UK">United Kingdom</country>
          ,
          <addr-line>WC2A 2AE</addr-line>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>0</volume>
      <fpage>7</fpage>
      <lpage>09</lpage>
      <abstract>
        <p>Tools for interpretable machine learning (IML) or explainable artificial intelligence (xAI) can be used to audit algorithms for fairness or other desiderata. In a black-box setting without access to the algorithm's internal structure, an auditor may be limited to methods that are model-agnostic. These methods have severe limitations with important consequences for outcomes such as fairness. Among model-agnostic IML methods, visualizations such as the partial dependence plot (PDP) or individual conditional expectation (ICE) plots are popular and useful for displaying qualitative relationships. Although we focus on fairness auditing with PDP/ICE plots, the consequences we highlight generalize to other auditing or IML/xAI applications. This paper questions the validity of auditing in high-stakes settings with contested values or conflicting interests if the audit methods are model-agnostic.</p>
      </abstract>
      <kwd-group>
        <kwd>machine learning</kwd>
        <kwd>artificial intelligence</kwd>
        <kwd>supervised learning</kwd>
        <kwd>black-box auditing</kwd>
        <kwd>visualization</kwd>
        <kwd>partial dependence plots</kwd>
        <kwd>individual conditional expectation</kwd>
        <kwd>causal models</kwd>
        <kwd>counterfactual fairness</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Algorithm auditing is a rapidly growing field with little consensus about what makes an audit
trustworthy [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Understanding the limitations of auditing methods is necessary to judge
whether a particular audit is rigorous. To study these methods, we simulate the role of an
external auditor who can only interact with the model by providing input data and recording
the predicted outcome. This case is relevant to regulatory, oversight, or other competitive
settings when an auditor can only use auditing methods that are model-agnostic [
        <xref ref-type="bibr" rid="ref2 ref3 ref4">2, 3, 4</xref>
        ]. We
focus on the partial dependence plot (PDP) [
        <xref ref-type="bibr" rid="ref4 ref5">5, 4</xref>
        ], a popular tool for visualizing relationships
between black-box input and output, and its close variants: individual conditional expectation
(ICE) plots [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and conditional PDPs. We demonstrate their limitations for fairness auditing
through examples.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Theoretical Limitations of Black-Box Auditing for Fairness</title>
      <p>
        Data Dependence. Model-agnostic explanations like PDPs depend on the joint distribution
of the data used to compute them. If part of the motivation of an audit is to understand the world
by explaining the black box, then unrepresentative data could lead to inaccurate conclusions
about the world. Likewise, if an auditor is uninformed about the data selection process and
uses a biased sample to produce a PDP or other model explanation, the audit may
fail to detect unfairness in the pipeline. Even if the audit uses data from the same distribution as
the training data, it remains an open question whether discrimination occurs when the model is
deployed under distribution shift [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
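      <p>The dependence on the audit sample can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than part of the study: the black box (black_box), its coefficient, the two toy audit samples, and the helper partial_dependence, which implements the usual PDP estimate by overwriting one input column and averaging the predictions.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np

def partial_dependence(predict, X, feature, grid):
    """Estimate a PDP: for each grid value, overwrite column `feature`
    in every row of X and average the black-box predictions."""
    curve = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature] = value
        curve.append(predict(X_mod).mean())
    return np.array(curve)

def black_box(X):
    # Hypothetical black box: experience (column 0) only matters
    # for the group coded 1 (column 1).
    experience, group = X[:, 0], X[:, 1]
    return 0.1 * experience * group

grid = np.array([0.0, 5.0, 10.0])

# Two hypothetical audit samples that differ only in group composition.
balanced = np.array([[1.0, 0.0], [2.0, 1.0], [3.0, 0.0], [4.0, 1.0]])
skewed = np.array([[1.0, 0.0], [2.0, 0.0], [3.0, 0.0], [4.0, 1.0]])

pdp_balanced = partial_dependence(black_box, balanced, 0, grid)
pdp_skewed = partial_dependence(black_box, skewed, 0, grid)
print(pdp_balanced, pdp_skewed)
```
]]></preformat>
      <p>In this toy case the same black box yields a PDP slope of 0.05 per unit of experience on the balanced sample and 0.025 on the skewed one, so the plot an auditor sees reflects the audit data as well as the model.</p>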
      <p>
        Unfairness via Mediators and Proxies. By definition, model-agnostic explanations like PDPs
show relationships only with variables that are explicit inputs to the algorithm. A
black box that does not take a sensitive attribute as an input can still perform proxy discrimination
[
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], but a PDP may not uncover this. Additionally, because PDPs average over the other
predictors, they may hide indirect discrimination through mediating variables.
      </p>
      <p>
        Interaction. PDPs are most effective at showing model dependence on each predictor if the
model is additive, but can hide dependence if there are interactions [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. This strong dependence
on model structure complicates the interpretation of PDPs, especially in an auditing setting
where we do not know the assumptions of the model fitting algorithm. ICE plots can help
somewhat with this issue [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
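      <p>How an interaction can cancel out in a PDP while remaining visible in ICE curves can be sketched as follows. The black box, its coefficients, and the four audit rows are hypothetical and chosen only so that the effect of age has opposite signs in the two groups; ice_curves is an illustrative helper, not a library function.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np

def ice_curves(predict, X, feature, grid):
    """Return one curve per row of X: the prediction for that row as
    column `feature` sweeps over the grid, other inputs held fixed."""
    curves = np.empty((X.shape[0], len(grid)))
    for j, value in enumerate(grid):
        X_mod = X.copy()
        X_mod[:, feature] = value
        curves[:, j] = predict(X_mod)
    return curves

def black_box(X):
    # Hypothetical black box: the score rises with age for the group
    # coded +1 and falls with age for the group coded -1.
    age, group = X[:, 0], X[:, 1]
    return 0.5 + 0.01 * group * (age - 40.0)

# Hypothetical audit sample with both groups equally represented.
X = np.array([[30.0, 1.0], [50.0, 1.0], [30.0, -1.0], [50.0, -1.0]])
grid = np.array([20.0, 40.0, 60.0])

ice = ice_curves(black_box, X, 0, grid)
pdp = ice.mean(axis=0)  # the PDP is the pointwise average of the ICE curves
print(pdp)
print(ice)
```
]]></preformat>
      <p>Here the PDP for age is flat at 0.5 although every individual curve slopes by 0.01 per year in one direction or the other, which is the kind of dependence an ICE plot can surface.</p>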
      <p>
        Attrition and Causality. In some real-world examples, multiple sensitive attributes can
interact, resulting in intersectional discrimination [
        <xref ref-type="bibr" rid="ref10 ref11 ref12 ref13">10, 11, 12, 13, 14</xref>
        ]. One example of this is
age-related attrition, which has been studied in relation to unfairness toward black defendants in the
COMPAS case [15]. Attrition is also relevant in health applications, where age and health
interact with a number of socioeconomic factors [16]. In examples like these, attrition can
violate the backdoor criterion, a requirement for causal interpretations of PDPs [17]. Hence, the
relationship uncovered by a model-agnostic explanation may be a non-causal association that is not
relevant to the purpose of the audit. Finally, causality raises issues about the interpretation
of social categories as causal variables [18, 19] but can also help reveal differences between
predictive algorithms and interventional policies [20, 21].
      </p>
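      <p>The role of the backdoor criterion can be made concrete with a short worked statement. The notation is assumed here for illustration and does not come from the text above: the black box is written as f-hat, the audited input as X_S, the remaining inputs as X_C, and the outcome as Y. The first line is the partial dependence function with its sample estimate; the second is the backdoor adjustment formula, which holds when X_C satisfies the backdoor criterion relative to X_S and Y. The two lines agree when the black box approximates the conditional mean of Y given its inputs, which is the sense in which a PDP can be read causally [17]. When attrition breaks the criterion, the first line can still be computed but need not equal the second.</p>
      <disp-formula><tex-math><![CDATA[
\begin{aligned}
\mathrm{PD}_S(x_S) &= \mathbb{E}_{X_C}\big[\hat{f}(x_S, X_C)\big] \approx \frac{1}{n}\sum_{i=1}^{n} \hat{f}\big(x_S, x_C^{(i)}\big), \\
\mathbb{E}\big[Y \mid \mathrm{do}(X_S = x_S)\big] &= \mathbb{E}_{X_C}\big[\mathbb{E}[Y \mid X_S = x_S, X_C]\big].
\end{aligned}
]]></tex-math></disp-formula>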
    </sec>
    <sec id="sec-3">
      <title>3. Hiring Simulation</title>
      <p>Algorithmic recruitment systems are emerging in the EU market [22, 23] and in many other
places in the world [24], with the aims of accelerating hiring processes and reducing errors and
costs. These algorithms could exclude people from the job market with little human involvement
or checking, and so carry extensive risks of discrimination and unfairness
[25, 26]. Our main simulation uses a synthetic causal model to generate data, consistent with
the model ℳ in Figure 1, with age and gender as variables that affect experience, which
in turn affects the chances of a job interview. The application rate decreases according to an
interaction between age and gender, so that one gender group’s application rate is 23% and
the other group’s is 63%, for an overall application rate of 42% from an initial population of
2000 job seekers. Hence, the training data for the black-box models is not representative
of the overall population. Experience increases with age but with different slopes depending on
gender, potentially reflecting effects of unfairness at previous time points. Finally, interview
probability increases with experience, and increases with age for one gender group
but decreases with age for the other, again potentially reflecting unfairness
(direct discrimination in this case) in the training data.</p>
      <p>Through a series of experiments, we generate conditional PDPs with predictive models that
assume different relationships between the variables and the outcome:
1. The first model includes only experience as a predictor.
2. The interaction model (“int”) includes experience, age, and gender as predictors, with interaction effects
(correctly specified).</p>
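      <p>The data-generating process described in this section can be sketched roughly as below. This is not the authors' simulation code. Only the population size of 2000, the approximate group application rates of 23% and 63%, and the qualitative directions of the effects come from the text; the age range, every coefficient, the logistic form of the interview probability, and the function name simulate_job_seekers are assumptions made for illustration, and the sketch is not tuned to reproduce the reported 42% overall rate.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np

def simulate_job_seekers(n=2000, seed=0):
    """Draw a synthetic population following the qualitative structure
    in the text. All numeric coefficients are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    gender = rng.integers(0, 2, size=n)      # two groups coded 0 and 1
    age = rng.uniform(20.0, 60.0, size=n)    # assumed age range

    # Experience increases with age, with a group-specific slope.
    slope = np.where(gender == 1, 0.6, 0.4)
    noise = rng.normal(0.0, 2.0, size=n)
    experience = np.clip(slope * (age - 20.0) + noise, 0.0, None)

    # Application probability decreases with age at a group-specific rate
    # (an age-gender interaction), centred near 63% and 23%.
    p_apply = np.where(
        gender == 1,
        0.63 - 0.002 * (age - 40.0),
        0.23 - 0.005 * (age - 40.0),
    )
    applied = rng.random(n) < p_apply

    # Interview probability rises with experience; age helps one group
    # and hurts the other (direct discrimination in the training data).
    sign = np.where(gender == 1, 1.0, -1.0)
    logit = -1.0 + 0.1 * experience + 0.03 * sign * (age - 40.0)
    interview = rng.random(n) < 1.0 / (1.0 + np.exp(-logit))

    return {"gender": gender, "age": age, "experience": experience,
            "applied": applied, "interview": interview}

data = simulate_job_seekers()
train_mask = data["applied"]  # only applicants reach the training data
rate_by_group = [data["applied"][data["gender"] == g].mean() for g in (0, 1)]
print(rate_by_group, train_mask.mean())
```
]]></preformat>
      <p>Restricting to train_mask gives a training set whose age and gender composition differs from that of the full population, which is the selection effect the experiments rely on.</p>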
    </sec>
    <sec id="sec-4">
      <title>4. Conclusion</title>
      <p>We used fairness as an example objective for black-box audits and PDPs and related plots as
example model-agnostic explanation methods. We show with examples several important ways
these can fail to detect unfairness. Visual explanation methods may be convincing because
“seeing is believing,” so they have potential to be particularly deceptive if they are interpreted
without understanding their limitations. Our broader message calls into question the use of
any model-agnostic explanation methods in the black-box audit setting. To make any valid
conclusions from the explanations output by these tools, we must think beyond the input-output
interface and consider causal structure in the real world, the sources of data used to train the
model and generate the explanation, and choices of variables used to elaborate any univariate
explanations. Important future work would extend the analysis presented here to
other model-agnostic explanation methods such as SHAP [27] and LIME [28]. We hope this
work encourages critical engagement with these tools in fairness contexts, where explanations
that are used incorrectly can obscure rather than reveal unfair discrimination.
</p>
      <ref-list>
        <ref id="ref13-cont">
          <mixed-citation>[13] (continued) volume 192 of Leibniz International Proceedings in Informatics (LIPIcs), Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Dagstuhl, Germany, 2021, pp. 7:1–7:20. URL: https://drops.dagstuhl.de/opus/volltexte/2021/13875. doi:10.4230/LIPIcs.FORC.2021.7.</mixed-citation>
        </ref>
        <ref id="ref14">
          <mixed-citation>[14] A. Wang, V. V. Ramaswamy, O. Russakovsky, Towards intersectionality in machine learning: Including more identities, handling underrepresentation, and performing evaluation, in: 2022 ACM Conference on Fairness, Accountability, and Transparency, 2022, pp. 336–349.</mixed-citation>
        </ref>
        <ref id="ref15">
          <mixed-citation>[15] C. Rudin, C. Wang, B. Coker, The age of secrecy and unfairness in recidivism prediction, Harvard Data Science Review 2 (2020) 1.</mixed-citation>
        </ref>
        <ref id="ref16">
          <mixed-citation>[16] D. M. Cutler, A. Lleras-Muney, T. Vogl, Socioeconomic status and health: dimensions and mechanisms (2008).</mixed-citation>
        </ref>
        <ref id="ref17">
          <mixed-citation>[17] Q. Zhao, T. Hastie, Causal interpretations of black-box models, Journal of Business &amp; Economic Statistics 39 (2021) 272–281. URL: https://doi.org/10.1080/07350015.2019.1624293.</mixed-citation>
        </ref>
        <ref id="ref18">
          <mixed-citation>[18] I. Kohler-Hausmann, Eddie Murphy and the dangers of counterfactual causal thinking about detecting racial discrimination, Nw. UL Rev. 113 (2018) 1163.</mixed-citation>
        </ref>
        <ref id="ref19">
          <mixed-citation>[19] L. Hu, I. Kohler-Hausmann, What’s sex got to do with machine learning?, in: Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, 2020, pp. 513–513.</mixed-citation>
        </ref>
        <ref id="ref20">
          <mixed-citation>[20] L. Bynum, J. Loftus, J. Stoyanovich, Disaggregated Interventions to Reduce Inequality, in: Equity and Access in Algorithms, Mechanisms, and Optimization, Association for Computing Machinery, New York, NY, USA, 2021, pp. 1–13. URL: https://doi.org/10.1145/3465416.3483286.</mixed-citation>
        </ref>
        <ref id="ref21">
          <mixed-citation>[21] L. Bynum, J. Loftus, J. Stoyanovich, Counterfactuals for the future, in: Proceedings of the AAAI Conference on Artificial Intelligence, 2023.</mixed-citation>
        </ref>
        <ref id="ref22">
          <mixed-citation>[22] R. Xenidis, L. Senden, EU non-discrimination law in the era of artificial intelligence: Mapping the challenges of algorithmic discrimination, in General Principles of EU law and the EU Digital Order (Kluwer Law International, 2020) (2019) 151–182.</mixed-citation>
        </ref>
        <ref id="ref23">
          <mixed-citation>[23] H. Parviainen, Can algorithmic recruitment systems lawfully utilise automated decision-making in the EU?, European Labour Law Journal 13 (2022) 225–248.</mixed-citation>
        </ref>
        <ref id="ref24">
          <mixed-citation>[24] L. Li, T. Lassiter, J. Oh, M. K. Lee, Algorithmic hiring in practice: Recruiter and HR professional’s perspectives on AI use in hiring, in: Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society, 2021, pp. 166–176.</mixed-citation>
        </ref>
        <ref id="ref25">
          <mixed-citation>[25] A. Kelly-Lyth, Challenging biased hiring algorithms, Oxford Journal of Legal Studies 41 (2021) 899–928.</mixed-citation>
        </ref>
        <ref id="ref26">
          <mixed-citation>[26] M. Buyl, C. Cociancig, C. Frattone, N. Roekens, Tackling algorithmic disability discrimination in the hiring process: An ethical, legal and technical analysis, in: 2022 ACM Conference on Fairness, Accountability, and Transparency, 2022, pp. 1071–1082.</mixed-citation>
        </ref>
        <ref id="ref27">
          <mixed-citation>[27] E. Štrumbelj, I. Kononenko, Explaining prediction models and individual predictions with feature contributions, Knowledge and Information Systems 41 (2014) 647–665.</mixed-citation>
        </ref>
        <ref id="ref28">
          <mixed-citation>[28] M. T. Ribeiro, S. Singh, C. Guestrin, "Why should I trust you?" Explaining the predictions of any classifier, in: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 1135–1144.</mixed-citation>
        </ref>
      </ref-list>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S.</given-names>
            <surname>Costanza-Chock</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. D.</given-names>
            <surname>Raji</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Buolamwini</surname>
          </string-name>
          ,
          <article-title>Who audits the auditors? recommendations from a field scan of the algorithmic auditing ecosystem</article-title>
          ,
          <source>in: 2022 ACM Conference on Fairness, Accountability, and Transparency</source>
          ,
          <year>2022</year>
          , pp.
          <fpage>1571</fpage>
          -
          <lpage>1583</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M. T.</given-names>
            <surname>Ribeiro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Guestrin</surname>
          </string-name>
          ,
          <article-title>Model-agnostic interpretability of machine learning</article-title>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.-H.</given-names>
            <surname>Karimi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Barthe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Balle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Valera</surname>
          </string-name>
          ,
          <article-title>Model-Agnostic Counterfactual Explanations for Consequential Decisions</article-title>
          ,
          <source>in: Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics</source>
          , PMLR,
          <year>2020</year>
          , pp.
          <fpage>895</fpage>
          -
          <lpage>905</lpage>
          . URL: https://proceedings.mlr.press/v108/karimi20a.html. ISSN: 2640-3498.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>C.</given-names>
            <surname>Molnar</surname>
          </string-name>
          ,
          <source>Interpretable Machine Learning: A Guide for Making Black Box Models Explainable</source>
          ,
          <year>2022</year>
          . URL: https://christophm.github.io/interpretable-ml-book/.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J. H.</given-names>
            <surname>Friedman</surname>
          </string-name>
          ,
          <article-title>Greedy function approximation: a gradient boosting machine</article-title>
          ,
          <source>Annals of Statistics</source>
          (
          <year>2001</year>
          )
          <fpage>1189</fpage>
          -
          <lpage>1232</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>A.</given-names>
            <surname>Goldstein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kapelner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bleich</surname>
          </string-name>
          , E. Pitkin,
          <article-title>Peeking inside the black box: Visualizing statistical learning with plots of individual conditional expectation</article-title>
          ,
          <source>Journal of Computational and Graphical Statistics</source>
          <volume>24</volume>
          (
          <year>2015</year>
          )
          <fpage>44</fpage>
          -
          <lpage>65</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>N.</given-names>
            <surname>Kallus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <article-title>Residual unfairness in fair machine learning from prejudiced data</article-title>
          ,
          <source>in: International Conference on Machine Learning, PMLR</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>2439</fpage>
          -
          <lpage>2448</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S.</given-names>
            <surname>Barocas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hardt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Narayanan</surname>
          </string-name>
          ,
          <article-title>Fairness and Machine Learning: Limitations and Opportunities</article-title>
          , fairmlbook.org,
          <year>2019</year>
          . URL: http://www.fairmlbook.org.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>C.</given-names>
            <surname>Molnar</surname>
          </string-name>
          , G. König,
          <string-name>
            <given-names>J.</given-names>
            <surname>Herbinger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Freiesleben</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Dandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. A.</given-names>
            <surname>Scholbeck</surname>
          </string-name>
          , G. Casalicchio,
          <string-name>
            <given-names>M.</given-names>
            <surname>Grosse-Wentrup</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Bischl</surname>
          </string-name>
          ,
          <article-title>Pitfalls to avoid when interpreting machine learning models</article-title>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>K. W.</given-names>
            <surname>Crenshaw</surname>
          </string-name>
          , On intersectionality: Essential writings, The New Press,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>L. K.</given-names>
            <surname>Bright</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Malinsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Thompson</surname>
          </string-name>
          ,
          <article-title>Causally interpreting intersectionality theory</article-title>
          ,
          <source>Philosophy of Science</source>
          <volume>83</volume>
          (
          <year>2016</year>
          )
          <fpage>60</fpage>
          -
          <lpage>81</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>J. R.</given-names>
            <surname>Foulds</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Islam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. N.</given-names>
            <surname>Keya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Pan</surname>
          </string-name>
          ,
          <article-title>An intersectional definition of fairness</article-title>
          ,
          <source>in: 2020 IEEE 36th International Conference on Data Engineering (ICDE)</source>
          , IEEE,
          <year>2020</year>
          , pp.
          <fpage>1918</fpage>
          -
          <lpage>1921</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>K.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. R.</given-names>
            <surname>Loftus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Stoyanovich</surname>
          </string-name>
          , Causal Intersectionality and Fair Ranking, in: K. Ligett, S. Gupta (Eds.),
          <source>2nd Symposium on Foundations of Responsible Computing (FORC</source>
          <year>2021</year>
          ),
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>