<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Explainable AI methods and their interplay with privacy protection</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Francesca Naretto</string-name>
          <email>francesca.naretto@unipi.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Pisa</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2026</year>
      </pub-date>
      <abstract>
        <p>In recent years, Machine Learning (ML) models have achieved remarkable predictive accuracy, leading to their adoption in domains such as health care, banking services, and autonomous driving. However, these models are often referred to as black boxes because their decision-making processes lack transparency. This opacity raises concerns about trustworthiness, a fundamental requirement emphasized in regulations such as the Artificial Intelligence (AI) Act in Europe, as well as in policies in China, Japan, and the United States [1, 2, 3]. To develop trustworthy AI systems, it is crucial to interpret and understand how ML models reach their decisions. This challenge is addressed by Explainable Artificial Intelligence (XAI), a field that has gained increasing attention but still faces numerous open challenges, particularly in ensuring stable and reliable explanations. At the same time, there is another ethical concern: privacy. Since the introduction of the GDPR [4], data privacy has been approached from multiple perspectives. In particular, researchers have investigated how to provide access to data for pattern extraction and knowledge discovery while safeguarding individuals' privacy. On the technical side, various privacy attacks have been proposed, which in turn led to privacy protection methodologies designed to defend against them. Privacy attacks now also target ML models: even when the training data remains private, adversaries can exploit model queries to infer sensitive information, for example through membership inference or attribute inference, posing significant risks [5]. Given the importance of both XAI and privacy in ML, my Ph.D. thesis explores the intersection of explainability and privacy, providing the following two key contributions:
          <list list-type="bullet">
            <list-item><p>A novel variant of a local rule-based explanation method that enhances the stability and actionability of explanations.</p></list-item>
            <list-item><p>A study on the synergies and tensions between data privacy and explainability, analyzing how explanations can both enhance privacy awareness and, conversely, expose sensitive information.</p></list-item>
          </list>
        </p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>high fidelity to the original model. Experimental results confirm that this approach provides more
stable and reliable explanations compared to existing methods.</p>
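      <p>To make the notion of fidelity concrete, the following minimal sketch computes it as the share of records on which a surrogate explainer agrees with the black box, which is one common way to measure it. The functions black_box and surrogate and the records are hypothetical and serve only as illustration; they are not the models or data used in the thesis.</p>
      <preformatted>
```python
def fidelity(black_box, surrogate, records):
    """Share of records on which the surrogate predicts the same label as the black box."""
    agreements = sum(black_box(r) == surrogate(r) for r in records)
    return agreements / len(records)


# Hypothetical black box: an opaque scoring function over (income, age).
def black_box(record):
    income, age = record
    return 3 * income + 2 * age > 250


# Hypothetical surrogate: a single human-readable rule.
def surrogate(record):
    income, _age = record
    return income > 50


records = [(60, 40), (40, 30), (55, 30), (30, 20), (80, 20)]
score = fidelity(black_box, surrogate, records)
print(f"fidelity = {score:.2f}")  # the rule disagrees with the black box only on (55, 30)
```
      </preformatted>
      <p>A fidelity close to 1 indicates that the explanation mirrors the behaviour of the model on the records considered; it says nothing about whether the model itself is accurate.</p>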
    </sec>
    <sec id="sec-2">
      <title>Privacy and Explainability: A Dual Perspective</title>
      <p>The second part of the thesis examines the relationship between privacy and explainability, considering
both privacy-aware explanations and privacy risks introduced by explanations.</p>
      <p>Enhancing Privacy Awareness Through Explainability. This thesis presents Expert [8, 9, 10], a
framework that predicts privacy risks and provides explanations for individuals’ data exposure. Existing
privacy risk evaluation methods are computationally expensive and must be re-run whenever
new data arrives. Instead, Expert uses ML models to classify individuals as high-risk or low-risk in
terms of privacy, enabling real-time assessments. Although complex black-box models are used for
classification, explainability techniques are applied post-hoc to interpret privacy risk factors. The
framework is validated on human mobility data, providing visual explanations that highlight risk areas
on a map, assisting both end-users in understanding their exposure and data providers in evaluating
privacy risks at a population level.</p>
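      <p>The idea of replacing a costly risk computation with a learned classifier can be sketched as follows. This is an illustrative toy under stated assumptions, not the Expert framework: the risk definition (one over the number of users sharing the same k most-visited locations), the single feature, the 0.5 cut-off, and all names are hypothetical, and a transparent decision stump stands in for the black-box classifier and its post-hoc explainer.</p>
      <preformatted>
```python
from collections import Counter


def reidentification_risk(trajectories, k=2):
    """Assumed risk: 1 / (number of users sharing the same k most-visited locations)."""
    def top_k(visits):
        counts = Counter(visits)
        # Sort by descending frequency, then by name so that ties are deterministic.
        ranked = sorted(counts, key=lambda loc: (-counts[loc], loc))
        return frozenset(ranked[:k])

    keys = {user: top_k(visits) for user, visits in trajectories.items()}
    group_sizes = Counter(keys.values())
    return {user: 1.0 / group_sizes[key] for user, key in keys.items()}


def fit_stump(values, labels):
    """Find the threshold t minimising the errors of the rule 'value > t means high risk'."""
    best_t, best_err = None, None
    for t in sorted(set(values)):
        err = sum((v > t) != y for v, y in zip(values, labels))
        if best_err is None or best_err > err:
            best_t, best_err = t, err
    return best_t


# Hypothetical visit sequences (location identifiers) for five users.
trajectories = {
    "u1": ["A", "A", "B", "B", "A"],
    "u2": ["A", "B", "A", "B"],
    "u3": ["A", "B", "B", "A", "A"],
    "u4": ["A", "C", "D", "E", "A"],
    "u5": ["F", "G", "H", "F", "B"],
}

# Expensive step, done once offline: compute the risk of every user.
risks = reidentification_risk(trajectories)
users = sorted(trajectories)
n_distinct = [len(set(trajectories[u])) for u in users]
high_risk = [risks[u] >= 0.5 for u in users]

# Cheap step: learn a rule that predicts the risk class from a simple feature.
threshold = fit_stump(n_distinct, high_risk)


def predict_with_reason(visits):
    """Classify a new user without re-running the risk computation, and say why."""
    n = len(set(visits))
    label = "high" if n > threshold else "low"
    return label, f"{n} distinct locations visited (rule: high if more than {threshold})"


print(predict_with_reason(["A", "X", "Y", "Z"]))
```
      </preformatted>
      <p>Once the rule is learned, a new individual is assessed in constant time, and the returned reason plays the role that a post-hoc explanation plays for a more complex classifier.</p>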
      <p>Privacy Risks of Explainability Methods. Explainability methods can also compromise privacy
by revealing patterns learned by ML models. A key example is the Membership Inference Attack (MIA)
[5], which determines whether a given record was part of the model’s training set. While existing MIA
techniques require access to probability vectors or dataset statistics, this thesis introduces Aloa [11], a
membership attack that is agnostic to both, making it more practical and effective in real-world settings.
Experimental results demonstrate that Aloa achieves privacy exposure levels comparable to or higher
than state-of-the-art attacks, even with fewer assumptions.</p>
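      <p>The text above does not describe how Aloa works internally, so the following sketch only illustrates the general idea of a membership attack that uses predicted labels alone. It assumes, as a generic heuristic, that the label predicted for a training record is more stable under small perturbations than the label of other records. The toy 1-nearest-neighbour target model, the perturbation offsets, and the 0.75 threshold are all hypothetical choices.</p>
      <preformatted>
```python
def make_one_nn(training_set):
    """Toy target model: 1-nearest-neighbour over (value, label) pairs; exposes labels only."""
    def predict(x):
        _, label = min(training_set, key=lambda pair: abs(pair[0] - x))
        return label
    return predict


def label_robustness(predict, x, offsets=(-0.2, -0.1, 0.1, 0.2)):
    """Fraction of perturbed copies of x that keep the label predicted for x."""
    original = predict(x)
    kept = sum(predict(x + d) == original for d in offsets)
    return kept / len(offsets)


def infer_membership(predict, x, threshold=0.75):
    """Guess 'member' when the predicted label is stable under perturbation."""
    return label_robustness(predict, x) >= threshold


# The attacker never sees probabilities or training statistics, only predicted labels.
target = make_one_nn([(0.0, 0), (1.0, 1)])

print(label_robustness(target, 0.0), infer_membership(target, 0.0))    # a training record
print(label_robustness(target, 0.45), infer_membership(target, 0.45))  # a record near the decision boundary
```
      </preformatted>
      <p>The heuristic is deliberately crude: a record that was not used for training but lies far from the decision boundary would also be flagged as a member, which is why practical attacks calibrate the threshold on data whose membership is known.</p>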
      <p>This thesis also examines how explainability itself can be exploited in privacy attacks. Many
explanations rely on surrogate models, which may inadvertently expose sensitive information. To assess this
risk, this work introduces REVEAL [12], a framework for evaluating privacy exposure in global and
local explainers. Results show that global explainers pose a significantly higher privacy risk than local
ones, emphasizing the need for privacy-aware XAI approaches.</p>
    </sec>
    <sec id="sec-3">
      <title>Conclusions and Future Directions</title>
      <p>This work highlights the complex interplay between explainability and privacy, demonstrating that
explanations can be both a tool for privacy awareness and a potential privacy threat. The findings
contribute to the ongoing development of trustworthy AI by proposing solutions that improve explanation
stability while also addressing privacy risks.</p>
      <p>Future work will explore privacy-preserving explainability methods to ensure that AI systems remain
both interpretable and secure. This includes investigating techniques such as differentially private
explanations and adversarial robustness against attacks targeting explainability models. Expanding
the evaluation to additional domains, beyond mobility and tabular data, would further validate the
generalizability of the proposed solutions.</p>
      <p>Declaration on Generative AI. The author(s) have not employed any Generative AI tools.</p>
      <p>[3] China, New generation artificial intelligence development plan, https://digichina.stanford.edu/work/full-translation-chinas-new-generation-artificial-intelligence-development-plan-2017/, ????</p>
      <p>[4] E. Union, The general data protection regulation, https://www.garanteprivacy.it/, ????</p>
      <p>[5] R. Shokri, M. Stronati, C. Song, V. Shmatikov, Membership inference attacks against machine learning models, in: 2017 IEEE Symposium on Security and Privacy (SP), 2017.</p>
      <p>[6] F. Bodria, F. Giannotti, R. Guidotti, F. Naretto, D. Pedreschi, S. Rinzivillo, Benchmarking and survey of explanation methods for black box models, Data Mining and Knowledge Discovery Journal (2023). doi:10.1007/s10618-023-00933-9.</p>
      <p>[7] R. Guidotti, A. Monreale, S. Ruggieri, F. Naretto, F. Turini, D. Pedreschi, F. Giannotti, Stable and actionable explanations of black-box models through factual and counterfactual rules, Data Mining and Knowledge Discovery (2024). doi:10.1007/s10618-022-00878-5.</p>
      <p>[8] F. Naretto, R. Pellungrini, D. Fadda, S. Rinzivillo, Exphlot: Explainable privacy assessment for human location trajectories, in: Discovery Science, 2023. doi:10.1007/978-3-031-45275-8_22.</p>
      <p>[9] F. Naretto, R. Pellungrini, F. M. Nardini, F. Giannotti, Prediction and explanation of privacy risk on mobility data with neural networks, in: ECML PKDD 2020 Workshops, Springer International Publishing, 2020. doi:10.1007/978-3-030-65965-3_34.</p>
      <p>[10] F. Naretto, R. Pellungrini, A. Monreale, F. M. Nardini, M. Musolesi, Predicting and explaining privacy risk exposure in mobility data, in: Discovery Science - 23rd International Conference, DS 2020, Thessaloniki, Greece, October 19-21, 2020, Proceedings, Lecture Notes in Computer Science, Springer, 2020.</p>
      <p>[11] A. Monreale, F. Naretto, S. Rizzo, Agnostic label-only membership inference attack, in: 17th International Conference on Network and System Security, Springer, 2023.</p>
      <p>[12] F. Naretto, A. Monreale, F. Giannotti, Evaluating the privacy exposure of interpretable global and local explainers, in: Transactions on Data Privacy, volume 18, 2025.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>E.</given-names>
            <surname>Union</surname>
          </string-name>
          ,
          <source>The artificial intelligence act</source>
          , https://artificialintelligenceact.eu/the-act/, ????
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>S. C.</surname>
          </string-name>
          <article-title>on Artificial Intelligence, The national artificial intelligence research and development strategic plan: 2019 update, in: Executive Office of the President of the United States</article-title>
          , Curran Associates, Inc.,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>