<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Privacy Implications of Explainable AI in Data-Driven Systems</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author role="corresp">
							<persName><forename type="first">Fatima</forename><surname>Ezzeddine</surname></persName>
							<email>fatima.ezzeddine@usi.ch</email>
							<affiliation key="aff0">
								<orgName type="institution">Università della Svizzera italiana</orgName>
								<address>
									<settlement>Lugano</settlement>
									<country key="CH">Switzerland</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="institution">Scuola universitaria professionale della Svizzera italiana</orgName>
								<address>
									<settlement>Lugano</settlement>
									<country key="CH">Switzerland</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Privacy Implications of Explainable AI in Data-Driven Systems</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">F993A7CBB15353E27D8BF734A1367C5A</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T19:36+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Explainable Artificial Intelligence</term>
					<term>Privacy-Preserving Machine Learning</term>
					<term>Privacy Attacks</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Machine learning (ML) models, demonstrably powerful, suffer from a lack of interpretability. The absence of transparency, often referred to as the black box nature of ML models, undermines trust and urges the need for efforts to enhance their explainability. Explainable AI (XAI) techniques address this challenge by providing frameworks and methods to explain the internal decision-making processes of these complex models. Techniques like Counterfactual Explanations (CF) and Feature Importance play a crucial role in achieving this goal. Furthermore, high-quality and diverse data remains the foundational element for robust and trustworthy ML applications. In many applications, the data used to train ML and XAI explainers contain sensitive information. In this context, numerous privacy-preserving techniques can be employed to safeguard sensitive information in the data, such as differential privacy. Subsequently, a conflict between XAI and privacy solutions emerges due to their opposing goals. Since XAI techniques provide reasoning for the model behavior, they reveal information relative to ML models, such as their decision boundaries, the values of features, or the gradients of deep learning models when explanations are exposed to a third entity. Attackers can initiate privacy breaching attacks using these explanations, to perform model extraction, inference, and membership attacks. This dilemma underscores the challenge of finding the right equilibrium between understanding ML decision-making and safeguarding privacy.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Context and Motivation</head><p>In recent years, advancements in Artificial Intelligence (AI) have expanded beyond the primary objective of predictive capabilities. Although accurate predictions are crucial, an equally important goal has emerged: ensuring explainability. Explainability in Machine Learning (ML) models has become a critical objective for making clear and justifiable predictions, especially in high-stakes social decisions. It is essential for these models to offer clear and comprehensible reasons for their predictions and decisions <ref type="bibr" target="#b0">[1]</ref>. In this context, Explainable AI (XAI) has emerged as a crucial field of investigation. XAI methodologies are specifically designed to unveil the decision-making processes of complex, opaque models, often referred to as black boxes. With the use of XAI techniques, researchers can gain valuable insights into the reasoning behind model decisions, after they have already been made <ref type="bibr" target="#b1">[2]</ref>. XAI techniques employ various methods to interpret the inner workings of complex ML models. These methods generate different types of explanations, e.g., feature importance, counterfactual explanations, etc. To generate tailored explanations, XAI requires a combination of data, interpretable models, and explanatory techniques and often incorporates user interaction. Therefore, XAI starts with the foundational element of data, which needs to be diverse and of high quality. This data is not only used to train AI models but also to create explainers. This combination of data, interpretable models, explanatory techniques, and user interaction builds the XAI.</p><p>In many applications, the data used to train AI and XAI models contain sensitive information about individuals, such as medical records, or financial transactions, which the GDPR <ref type="bibr" target="#b2">[3]</ref> seeks to safeguard. 
Different approaches have been proposed to safeguard sensitive information in data, such as differential privacy (DP) and federated learning (FL). These approaches reduce predictive performance to some extent, yet they typically maintain it at an acceptable level. A conflict subsequently emerges between ensuring transparency through XAI and ensuring privacy, due to their opposing goals: XAI aims to provide insights into model behavior for transparency, while privacy-preserving solutions obscure data to prevent leakage. Moreover, the output of XAI can unintentionally expose model decision boundaries, enabling attacks on privacy <ref type="bibr" target="#b3">[4,</ref><ref type="bibr" target="#b4">5]</ref>. For instance, attackers can exploit counterfactual explanations (CFs), which describe the minimal feature-value change that alters the model decision and therefore return instances close to the decision boundary. Feature importance (FI), which scores the contribution and impact of each feature on the model output, exposes information about the gradients of Deep Neural Networks (DNNs) or about the feature values themselves. In this context, attackers can initiate attacks from these explanations to perform model extraction, inference, and membership attacks <ref type="bibr" target="#b5">[6]</ref>, especially when the model is shared or deployed publicly on the cloud as ML as a Service (MLaaS). This dilemma underscores the challenge of finding the right equilibrium between explainability and safeguarding private information <ref type="bibr" target="#b3">[4]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Background on Explainable Artificial Intelligence</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.">Motivation and Definition</head><p>In order to enhance transparency, XAI techniques provide the necessary tools to open up complex black boxes and shed light on how AI decisions are made <ref type="bibr" target="#b6">[7]</ref>, promoting fairness, transparency, and accountability within real-world organizations. Moreover, XAI has proven to play a pivotal role in ensuring that AI is trusted and used responsibly. By answering essential "How?" and "Why?" inquiries regarding AI systems, XAI serves as a valuable tool for tackling the increasing ethical and legal issues associated with them. XAI targets diverse entities and includes various stakeholders, such as researchers, model developers like engineers and data scientists, as well as practitioners.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.">Post-hoc Explainability</head><p>Post-hoc explainability is a technique used to gain insight into the decision-making process of a trained ML model. In this context, post-hoc means that the model's interpretability is addressed after its training, regardless of its complexity or the algorithms used. The approach primarily revolves around the act of querying the model with diverse sets of input data to observe how it reacts to different scenarios. Through these interactions, we can effectively map out the decision boundaries the model uses, shedding light on what factors influence its predictions.</p><p>Visualizations and explanations can then be applied to make these insights more accessible and human-friendly, ultimately enabling a better understanding of the model's predictions. These visual aids are essential in making the insights gained more accessible to data scientists, end users, and domain experts who are willing to understand why the model is making specific predictions. By going through this process, post-hoc explainability serves a vital role in improving model transparency and building trust in its performance.</p><p>Understanding an AI system with XAI relies on its training data, process, and model. Therefore, XAI can be applied throughout the entire AI development pipeline. Specifically, it can be applied in different stages of modeling, such as before, during, and after (post-modeling explainability). In this work, the primary emphasis will be on post-modeling XAI (Post-hoc), since ML models are often developed with only predictive performance in mind.</p><p>Feature Importance Feature Importance (FI) explanations involve assigning a quantitative measure in the form of a numerical score to each input feature within a given model. The primary goal of calculating FI is to discern which features have influential effects on the model's predictions and which ones have a relatively lesser impact. 
These importance scores help practitioners and data scientists gain insight into which factors are most critical in influencing the model's decisions. Features that, when modified, cause more substantial shifts in the model's output are considered more important because they have a greater influence on the final prediction. For deep learning models, many feature-based explanation functions are gradient-based techniques that analyze the gradient flow through a model; prominent approaches include Layer-wise Relevance Propagation (LRP) <ref type="bibr" target="#b7">[8]</ref> and Deep Learning Important FeaTures (DeepLIFT) <ref type="bibr" target="#b8">[9]</ref>. Counterfactual explanations (CFs) leverage the concept of potential outcomes to assess causal relationships within a data-model framework. CFs empower informed decision-making and the implementation of explainable, accountable, and ultimately more ethically responsible AI <ref type="bibr" target="#b9">[10]</ref>. They achieve this by constructing a hypothetical scenario, distinct from the observed data, and evaluating the corresponding model output under this scenario. Generating informative and interpretable CFs necessitates the optimization of well-defined metrics <ref type="bibr" target="#b10">[11]</ref> such as diversity, validity, proximity, and user constraints. CF generation methods are either model-specific or model-agnostic. Model-specific methods tailor the cost-function optimization to the inherent characteristics of the employed model; for instance, in differentiable models, gradients play a critical role in guiding the optimization towards CFs <ref type="bibr" target="#b11">[12]</ref>. Model-agnostic methods, in contrast, generalize across diverse model architectures <ref type="bibr" target="#b12">[13,</ref><ref type="bibr" target="#b13">14]</ref>.</p></div>
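To make the two explanation families concrete, the following sketch computes gradient-style FI scores by finite differences and runs a Wachter-style CF search on a hypothetical logistic scorer. The model, its weights, and all hyperparameters below are illustrative assumptions, not taken from this paper or its references:

```python
import numpy as np

def logistic_model(x, w=np.array([2.0, -1.0, 0.5]), b=0.1):
    # Hypothetical black-box scorer: a fixed logistic-regression model.
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))

def feature_importance(model, x, eps=1e-4):
    """Gradient-style FI via central finite differences: the score of
    feature i is the sensitivity of the model output to that feature."""
    scores = np.zeros_like(x)
    for i in range(len(x)):
        hi, lo = x.copy(), x.copy()
        hi[i] += eps
        lo[i] -= eps
        scores[i] = (model(hi) - model(lo)) / (2 * eps)
    return scores

def counterfactual(model, x, target=0.5, lr=0.5, steps=200):
    """Wachter-style CF search: gradient descent that pushes the model
    output toward the decision boundary (target) while an L2 proximity
    term (weight 0.01, illustrative) keeps the CF close to the original."""
    cf = x.copy()
    for _ in range(steps):
        grad = feature_importance(model, cf)
        cf -= lr * ((model(cf) - target) * grad + 0.01 * (cf - x))
    return cf
```

Note how both outputs leak model internals: the FI vector is proportional to the model's weights, and the CF is, by construction, a point near the decision boundary — exactly the information the attacks in Section 3 exploit.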
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Related Work: Interplay between XAI and Privacy</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Context and Problem Formulation</head><p>Data protection and privacy is one of the primary dimensions in ML and AI. It involves ensuring that the data used to train and test ML models does not expose sensitive information about individuals or entities. This is particularly critical when dealing with datasets that contain personally identifiable information or confidential details. Techniques like anonymization and DP have emerged as valuable tools in the data privacy field. They allow us to protect the privacy of individuals represented in the data, even as we leverage it to train models. Beyond data privacy, model privacy is also a pressing concern. The architecture of ML models can be  <ref type="figure" target="#fig_0">1</ref>). However, privacy is not included in the default behavior of most ML algorithms. They tend to learn not just the general trends but also the specifics of the data, potentially revealing sensitive information when the model is made public. In an ideal scenario, we want these algorithms to focus on extracting general trends and patterns from the data while deliberately avoiding the inclusion of specific details about the data. This emphasis on distilling general patterns means that the algorithms should primarily capture the fundamental, common insights that are valuable for decision-making, aligning with privacy concerns, as they identify important details without risking individual privacy. XAI can inadvertently compromise privacy by revealing sensitive information about the model's decision boundaries. Moreover, the process of returning real data points with CFs can inadvertently expose specific instances from the training set or behaviors. Also, the process of assigning FI scores exposes the values of gradients and the feature values themselves. 
This conflict makes striking the right balance between model explainability and data privacy crucial to ensuring that XAI enhances our understanding of AI systems without compromising individual privacy.</p></div>
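The tension described above is commonly managed with mechanisms such as DP. As one minimal, standard sketch (not a mechanism proposed in this paper), the Laplace mechanism perturbs a released statistic — or an explanation score — with noise calibrated to the query's sensitivity and the privacy budget epsilon; the example values below are illustrative:

```python
import numpy as np

def laplace_mechanism(value, sensitivity, epsilon, rng=None):
    """Epsilon-DP Laplace mechanism: add noise with scale
    sensitivity / epsilon before releasing a query answer."""
    if rng is None:
        rng = np.random.default_rng(0)
    return value + rng.laplace(0.0, sensitivity / epsilon, size=np.shape(value))

# Releasing the same count many times to visualize the noise scale:
# a small epsilon (strong privacy) yields far noisier answers than a
# large epsilon (weak privacy).
rng = np.random.default_rng(0)
true_count = 120.0
strong = laplace_mechanism(np.full(10000, true_count), 1.0, 0.1, rng)
weak = laplace_mechanism(np.full(10000, true_count), 1.0, 10.0, rng)
```

The same trade-off applies when the perturbed quantity is an FI score or a CF feature value: more noise means stronger privacy but a less faithful explanation.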
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Attacks on Machine Learning Models</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.1.">Membership inference Attacks</head><p>A membership inference attack (MIA) is a privacy-related threat in ML where an adversary attempts to determine whether a specific data point was part of the training dataset of a deployed model <ref type="bibr" target="#b14">[15,</ref><ref type="bibr" target="#b15">16]</ref>. MIA are particularly concerning because they can compromise the privacy of individuals whose data was part of the training dataset. If an attacker can determine that a specific data point was included in the training data, it may reveal sensitive information about that individual, even if the model's output does not directly disclose such information. To perform membership attacks, <ref type="bibr" target="#b14">[15]</ref> proposes a shadow training process that mimics the target model with shadow models, and trains the attack model using data that is extracted using data synthesis. Also, <ref type="bibr" target="#b16">[17]</ref> discusses and proves that points with a very high loss tend to be far from the decision boundary and are more likely to be non-members. Regarding how explanation can facilitate performing MIA, <ref type="bibr" target="#b3">[4]</ref> quantifies information leakage in model predictions when explanations are provided. The authors evaluate feature-based explanations, highlighting how back-propagation-based explanations reveal decision boundaries.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.2.">Model Extraction Attack</head><p>Model extraction (MEA) is a class of attacks where an adversary tries to reverse-engineer a target model by observing its behavior and querying it. MEA can potentially lead to the theft of intellectual property compromising proprietary models <ref type="bibr" target="#b17">[18,</ref><ref type="bibr" target="#b18">19]</ref>. Authors in <ref type="bibr" target="#b18">[19]</ref> discuss the weakness in ML services that take incomplete data with confidence levels and show successful attacks on different ML models like decision trees, SVMs, and DNNs by using equation-solving, path-finding algorithms. Regarding how explanations can facilitate MEA, FIs, and CFs have proved their ability to reveal the decision boundary of a target model <ref type="bibr" target="#b19">[20]</ref>.</p><p>[21] perform the attack by minimizing task-classification loss and task-explanation loss. Authors in <ref type="bibr" target="#b21">[22]</ref> show how gradient-based explanations quickly reveal the model itself and highlight the power of gradients. Regarding CFs, <ref type="bibr" target="#b22">[23]</ref> proposes a strategy to target the decision boundary shift by taking not only the CF but also the CF of the CF as pairs of training samples.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.3.">Model Inversion Attack</head><p>A model inversion attack (MINA) is a privacy-related threat in ML where an adversary attempts to reconstruct sensitive or private information about individual data points from trained model predictions. In other words, the MINA task is to predict the input data, that is, the original dataset for the target model. In <ref type="bibr" target="#b23">[24]</ref> discusses how providing explanations harms privacy and studies this risk for image-based MINA on private image data from model explanations. The authors developed several CNN architectures that achieve significantly higher inversion performance than using only the target model prediction. To minimize the risk of MINA, <ref type="bibr" target="#b24">[25]</ref> presents a generative noise injector for model FI explanations by perturbing model explanations.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Research Questions and Objectives</head><p>We pose the following research questions (RQs):</p><p>1. To what extent does the utilization of known privacy-preserving techniques, such as DP, effectively safeguard privacy and prevent information leakage when combined with explanations provided by XAI?</p><p>2. Can we produce high-quality XAI explanations while safeguarding privacy to mitigate potential vulnerabilities to attacks? 3. Which approach, privacy-preserving XAI or privacy-preserving ML, offers a more effective solution for safeguarding sensitive information in XAI systems?</p><p>To address RQ1, we aim to evaluate the trade-off and assess the effectiveness of existing privacy-preserving techniques (e.g., DP) in mitigating information leakage when combined with XAI explanations for CFs and FI. This will involve investigating the extent to which explanations can be exploited for privacy attacks like MIA, MEA, or MENA.</p><p>To address RQ2, we aim to explore the possibility of generating high-fidelity XAI explanations while simultaneously safeguarding privacy.</p><p>Such approaches aim to develop an XAI framework that concurrently optimizes two objectives: i) generating high-quality CFS, and ii) adhering to pre-defined privacy constraints. Furthermore, the integration of DP during the backpropagation of gradients for FI computation is another promising avenue for investigation.</p><p>To address RQ3, we will conduct a comparative analysis of privacy-preserving XAI and privacy-preserving ML techniques. This analysis will evaluate their strengths and weaknesses in safeguarding sensitive information within XAI systems. By comprehensively assessing these aspects, we aim to identify the approach that offers a more robust and enduring mechanism for privacy protection within XAI applications, covering different types of data.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Results and contributions to date</head><p>In the initial research, I explored CF generation through RL, with the specific goal of constructing an explainer that operates independently of input data. The investigation then progressed to a more in-depth examination of CFs, focusing on their potential for information leakage and their ability to reveal the decision boundaries of ML models. To reach this aim, a new methodology is proposed to carry out MEA through a concept known as knowledge distillation (KD). I also delved into the domain of explainable deep learning methods within distributed systems, such as Vertical Split Learning (VSL), aiming to evaluate the potential information disclosure resulting from FI across various entities. In addition, I analyzed the impact of DP on the explainability of anomaly detection (AD) models. More specifically:</p><p>1. Explored how RL can be leveraged to generate CF explanations without relying on the dataset as input to the explainer. The main aim is to let the CF generator learn generalizable patterns from the training data without exposing it. The explainer determines which features to modify and by how much, by maximizing a custom reward function designed to jointly optimize various metrics.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Designed a new attack approach to evaluate the use of KD for an MEA in scenarios</head><p>where CFs are given to an attacker. I benefit from the property of KD and the process of transferring knowledge from a large model to a smaller one. The findings reveal that employing KD with the presence of CFs can indeed yield successful MEA. 3. Proposed an approach to generate private CFs I introduce the concept of DP within the GANs CF generation pipeline to generate CFs that deviate from the statistical properties of the confidential dataset, offering a layer of protection against potential privacy breaches.</p><p>4. Explored VSL strategies and performed experiments to explore the risk of information leakage regarding the original features using gradient-based explanations (IG and DeepLIFT). My application of VSL focused on a use case related to Network Function Virtualization. My findings highlight how an attacker on the server side can exploit XAI techniques to achieve additional tasks, without access to the original features. 5. Explored DP with AD Analyzed the trade-off between privacy achieved by DP and explainability achieved using SHAP.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Expected next steps and final contribution to knowledge</head><p>This PhD research aims to achieve significant advancements in bridging the critical gap between XAI and data privacy. We will address the inherent conflict between providing users with clear explanations of AI models and protecting their sensitive data (privacy). We aim to develop a defense mechanism in the form of high-quality explanations while simultaneously ensuring privacy.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Scenario of privacy attacks where MLaaS provides explanations alongside the prediction</figDesc><graphic coords="4,97.18,84.19,408.80,212.45" type="bitmap" /></figure>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">The mythos of model interpretability: In machine learning, the concept of interpretability is both important and slippery</title>
		<author>
			<persName><forename type="first">Z</forename><forename type="middle">C</forename><surname>Lipton</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Queue</title>
		<imprint>
			<biblScope unit="volume">16</biblScope>
			<biblScope unit="page" from="31" to="57" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Causability and explainability of artificial intelligence in medicine</title>
		<author>
			<persName><forename type="first">A</forename><surname>Holzinger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Langs</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Denk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Zatloukal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Müller</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery</title>
		<imprint>
			<biblScope unit="volume">9</biblScope>
			<biblScope unit="page">e1312</biblScope>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">General data protection regulation</title>
		<author>
			<persName><forename type="first">P</forename><surname>Regulation</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Intouch</title>
		<imprint>
			<biblScope unit="volume">25</biblScope>
			<biblScope unit="page" from="1" to="5" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">On the privacy risks of model explanations</title>
		<author>
			<persName><forename type="first">R</forename><surname>Shokri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Strobel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Zick</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society</title>
				<meeting>the 2021 AAAI/ACM Conference on AI, Ethics, and Society</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="231" to="241" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">The privacy issue of counterfactual explanations: explanation linkage attacks</title>
		<author>
			<persName><forename type="first">S</forename><surname>Goethals</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Sörensen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Martens</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Transactions on Intelligent Systems and Technology</title>
		<imprint>
			<biblScope unit="volume">14</biblScope>
			<biblScope unit="page" from="1" to="24" />
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">A survey of privacy attacks in machine learning</title>
		<author>
			<persName><forename type="first">M</forename><surname>Rigaki</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Garcia</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Computing Surveys</title>
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">A review of taxonomies of explainable artificial intelligence (xai) methods</title>
		<author>
			<persName><forename type="first">T</forename><surname>Speith</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency</title>
				<meeting>the 2022 ACM Conference on Fairness, Accountability, and Transparency</meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="2239" to="2250" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation</title>
		<author>
			<persName><forename type="first">S</forename><surname>Bach</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Binder</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Montavon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Klauschen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K.-R</forename><surname>Müller</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Samek</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">PloS one</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="page">e0130140</biblScope>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Learning important features through propagating activation differences</title>
		<author>
			<persName><forename type="first">A</forename><surname>Shrikumar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Greenside</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Kundaje</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International conference on machine learning</title>
				<meeting><address><addrLine>PMLR</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="3145" to="3153" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Counterfactual explanations without opening the black box: Automated decisions and the gdpr</title>
		<author>
			<persName><forename type="first">S</forename><surname>Wachter</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Mittelstadt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Russell</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Harv. JL &amp; Tech</title>
		<imprint>
			<biblScope unit="volume">31</biblScope>
			<biblScope unit="page">841</biblScope>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<author>
			<persName><forename type="first">S</forename><surname>Verma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Boonsanong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Hoang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">E</forename><surname>Hines</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">P</forename><surname>Dickerson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Shah</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2010.10596</idno>
		<title level="m">Counterfactual explanations and algorithmic recourses for machine learning: A review</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">SCOUT: Self-aware discriminant counterfactual explanations</title>
		<author>
			<persName><forename type="first">P</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Vasconcelos</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition</title>
				<meeting>the IEEE/CVF Conference on Computer Vision and Pattern Recognition</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="8981" to="8990" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Ordered counterfactual explanation by mixed-integer linear optimization</title>
		<author>
			<persName><forename type="first">K</forename><surname>Kanamori</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Takagi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Kobayashi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Ike</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Uemura</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Arimura</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the AAAI Conference on Artificial Intelligence</title>
				<meeting>the AAAI Conference on Artificial Intelligence</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="volume">35</biblScope>
			<biblScope unit="page" from="11564" to="11574" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<monogr>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">M</forename><surname>Nguyen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">P</forename><surname>Quinn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Nguyen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Tran</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2103.12983</idno>
		<title level="m">Counterfactual explanation with multi-agent reinforcement learning for drug target prediction</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Membership inference attacks against machine learning models</title>
		<author>
			<persName><forename type="first">R</forename><surname>Shokri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Stronati</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Song</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Shmatikov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE Symposium on Security and Privacy (SP)</title>
				<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="3" to="18" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Model explanations with differential privacy</title>
		<author>
			<persName><forename type="first">N</forename><surname>Patel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Shokri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Zick</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency</title>
				<meeting>the 2022 ACM Conference on Fairness, Accountability, and Transparency</meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="1895" to="1904" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">White-box vs black-box: Bayes optimal strategies for membership inference</title>
		<author>
			<persName><forename type="first">A</forename><surname>Sablayrolles</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Douze</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Schmid</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Ollivier</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Jégou</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Machine Learning</title>
				<meeting><address><addrLine>PMLR</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="5558" to="5567" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">ActiveThief: Model extraction using active learning and unannotated public data</title>
		<author>
			<persName><forename type="first">S</forename><surname>Pal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Gupta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Shukla</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Kanade</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Shevade</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Ganapathy</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the AAAI Conference on Artificial Intelligence</title>
				<meeting>the AAAI Conference on Artificial Intelligence</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="volume">34</biblScope>
			<biblScope unit="page" from="865" to="872" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Stealing machine learning models via prediction APIs</title>
		<author>
			<persName><forename type="first">F</forename><surname>Tramèr</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Juels</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">K</forename><surname>Reiter</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Ristenpart</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">25th USENIX Security Symposium (USENIX Security 16)</title>
				<imprint>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="601" to="618" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<monogr>
		<author>
			<persName><forename type="first">T</forename><surname>Miura</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Hasegawa</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Shibahara</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2107.08909</idno>
		<title level="m">MEGEX: Data-free model extraction attack against gradient-based explainable AI</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Towards explainable model extraction attacks</title>
		<author>
			<persName><forename type="first">A</forename><surname>Yan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Hou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Yan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Wang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">International Journal of Intelligent Systems</title>
		<imprint>
			<biblScope unit="volume">37</biblScope>
			<biblScope unit="page" from="9936" to="9956" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Model reconstruction from model explanations</title>
		<author>
			<persName><forename type="first">S</forename><surname>Milli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Schmidt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">D</forename><surname>Dragan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Hardt</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Conference on Fairness, Accountability, and Transparency</title>
				<meeting>the Conference on Fairness, Accountability, and Transparency</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="1" to="9" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">DualCF: Efficient model extraction attack from counterfactual explanations</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Qian</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Miao</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency</title>
				<meeting>the 2022 ACM Conference on Fairness, Accountability, and Transparency</meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="1318" to="1329" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Exploiting explanations for model inversion attacks</title>
		<author>
			<persName><forename type="first">X</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Xiao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Lim</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the IEEE/CVF International Conference on Computer Vision</title>
				<meeting>the IEEE/CVF international conference on computer vision</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="682" to="692" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Learning to generate inversion-resistant model explanations</title>
		<author>
			<persName><forename type="first">H</forename><surname>Jeong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">J</forename><surname>Hwang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Son</surname></persName>
		</author>
		<ptr target="https://proceedings.neurips.cc/paper_files/paper/2022/file/70d638f3177d2f0bbdd9f400b43f0683-Paper-Conference.pdf" />
	</analytic>
	<monogr>
		<title level="m">Advances in Neural Information Processing Systems</title>
				<editor>
			<persName><forename type="first">S</forename><surname>Koyejo</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Mohamed</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Agarwal</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">D</forename><surname>Belgrave</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">K</forename><surname>Cho</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Oh</surname></persName>
		</editor>
		<imprint>
			<publisher>Curran Associates, Inc</publisher>
			<date type="published" when="2022">2022</date>
			<biblScope unit="volume">35</biblScope>
			<biblScope unit="page" from="17717" to="17729" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
