<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>CEUR Workshop Proceedings</journal-title>
      </journal-title-group>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Counterfactual Explanation for Time-series Anomaly Detection using Graph Neural Network</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Xiangyu Shi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Abhishek Srinivasan</string-name>
          <email>srini@kth.se</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sepideh Pashami</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, KTH University</institution>
          ,
          <addr-line>Stockholm</addr-line>
          ,
          <country country="SE">Sweden</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Halmstad University</institution>
          ,
          <addr-line>Halmstad</addr-line>
          ,
          <country country="SE">Sweden</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>RISE AB</institution>
          ,
          <addr-line>Isafjordsgatan 28 A, Kista</addr-line>
          ,
          <country country="SE">Sweden</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>Scania CV AB</institution>
          ,
          <addr-line>Vagnmakarvägen 1, Södertälje</addr-line>
          ,
          <country country="SE">Sweden</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In industrial settings, anomalies often indicate critical events such as equipment failures or system faults. These events are rare but highly impactful, demand urgent attention, and often have financial or safety consequences. Deep learning models, especially Graph Neural Networks (GNNs), have gained prominence due to their ability to capture intricate dependencies between sensor signals as graphs. Understanding the reasons behind predicted anomalies is essential for an effective response; however, the black-box nature of GNNs poses a significant challenge.</p>
      </abstract>
      <kwd-group>
        <kwd>Graph Neural Network</kwd>
        <kwd>Time-series Anomaly Detection</kwd>
        <kwd>Counterfactual Explanation</kwd>
        <kwd>Graph Node selection</kwd>
        <kwd>GNN</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>In the age of cyber-physical systems (CPS), where physical processes are tightly integrated with
computation and communication infrastructure, ensuring reliable and safe operation of these systems
is of paramount importance. As these systems become increasingly complex, continuous monitoring of
their health has emerged as a vital component of operational safety and performance optimization. One
of the key techniques employed in this context is anomaly detection (AD), which involves identifying
patterns in system behavior that deviate from expected norms. Accurate anomaly detection enables
early fault diagnosis, minimizes downtime, and helps prevent catastrophic failures.</p>
      <p>Traditional anomaly detection methods encompass a wide range of statistical and machine learning
techniques. These include clustering approaches such as k-means, and density estimation methods
like One-Class SVM and Isolation Forests [1]. More recently, deep learning-based methods such as
autoencoders, recurrent neural networks (RNNs), and variational autoencoders (VAEs) have been
employed to model the normal behavior of time-series data and identify deviations [2]. While effective
in many cases, these approaches generally operate under the assumption that sensor observations
are independent or sequentially dependent, and they often fail to account for the structural
interdependencies among sensors in a system.</p>
      <p>Traditional anomaly detection methods often treat sensor observations independently or assume
simplistic temporal dependencies, ignoring the inherent structural relationships between different
sensing components. In many CPS applications, such as industrial automation, energy distribution
networks, and autonomous vehicles, the behavior of a sensor is often influenced by the states of its
neighboring sensors due to underlying physical or logical connections. Capturing these interactions is
essential for robust modeling of system behavior. Graph-based representations provide a natural and
expressive framework to encode these inter-sensor relationships. Recent advances in Graph Neural
Networks (GNNs) have made it possible to effectively leverage graph-structured data for tasks like
classification, prediction, and anomaly detection in multivariate time series data [3].</p>
      <p>Several studies have demonstrated that modeling sensor dependencies through graph structures can
significantly enhance the performance of anomaly detection systems in CPS settings [4, 5, 3]. Despite
these promising results, a major limitation persists: the lack of explainability. GNN-based anomaly
detection models are often treated as black boxes, offering little insight into why a particular anomaly
was detected. This is particularly problematic in safety-critical domains, where human operators must
understand and trust the decisions made by automated systems.</p>
      <p>To bridge this gap, the machine learning community has increasingly focused on explainability, with
methods generally categorized into local explanations, which target individual predictions, and global
explanations, which describe overall model behavior [6]. While several explanation techniques have been
proposed for standard deep learning models, the explainability of GNNs, especially in time-series
contexts, remains an underexplored area. Moreover, existing explanation methods often rely on
feature attribution or saliency maps, which may lack causal grounding and are limited in the types of
counterfactual insights they can provide. Our primary focus is to explore whether the graph structure in
GNNs can be harnessed to better explain the model's decisions.</p>
      <p>In this work, we propose a novel framework for counterfactual explanation tailored to GNN-based
anomaly detection models operating on time-series sensor data. Counterfactual explanations aim to
answer the question: “What minimal change to the input would alter the model’s prediction?”—thus
providing actionable and intuitive insights into model decisions. Counterfactual explanations can give
clues about the root causes of the anomalies.</p>
      <p>Our approach comprises a two-stage process. In the first stage, we identify the most influential
sensors that contribute to an anomaly, along with their local graph neighborhoods. This localization
step leverages node-level deviations and GNN attention mechanisms to pinpoint regions of the graph
that are most responsible for the prediction. In the second stage, we generate counterfactual instances
by perturbing sensor readings in a minimal and plausible manner, aiming to flip the model’s prediction
from anomalous to normal (or vice versa). These counterfactuals serve as transparent, case-specific
explanations that can assist operators in understanding failure modes and potential corrective actions.
Integrating such support systems reduces the cognitive load on human decision-makers while
allowing them to effectively validate model outputs.</p>
      <p>By combining the structural strengths of GNNs with the intuitive clarity of counterfactual reasoning,
our method advances the state of the art in explainable anomaly detection for cyber-physical systems.
On the SWaT and WADI benchmarks, our two-stage approach alters fewer than 6% of sensors, yet
still delivers a strong sparsity-versus-proximity balance that makes the counterfactuals concise
and actionable. This not only enhances trust and accountability but also opens new avenues for
troubleshooting and diagnostics.</p>
    </sec>
    <sec id="sec-3">
      <title>2. Related Work</title>
      <sec id="sec-3-1">
        <title>2.1. Counterfactual Explanation for Time Series</title>
        <p>Several recent studies have explored counterfactual explanation techniques for time series data, with
the aim of explaining model decisions by identifying minimal changes in input features that would alter
the model output.</p>
        <p>For instance, Karlsson et al. propose a technique for generating counterfactuals using models like
k-nearest neighbors and random shapelet forests. In another approach, Wang et al. focus on univariate
time series by mapping data to a latent space, identifying counterfactuals there, and decoding them
back to the input space. Native-Guide [9] identifies the nearest contrasting instance, extracts its most
influential subsequence, and substitutes it into the original time series. CoMTE [10] selects alternative
series from the training set to replace parts of the input in order to induce prediction changes. More
recently, CFWoT [11] introduces a model-agnostic framework for both static and multivariate time
series, capable of handling continuous and categorical features without needing access to training data
or similar samples.</p>
        <p>These approaches often do not focus on relational structures present in multivariate time series data,
which is the focus of this work.</p>
      </sec>
      <sec id="sec-3-2">
        <title>2.2. Counterfactual Explanation of Graph Neural Networks</title>
        <p>Counterfactual explanation methods for graph neural networks aim to identify the smallest possible
modifications to the input that would lead to a different model output. By pinpointing which features
must be altered to change a prediction, these methods offer valuable insights into the model's decision
boundaries and causal reasoning.</p>
        <p>A representative method in this category is CF-GNNExplainer [12], which introduces a learnable
binary mask over the model's computational graph to indicate edge presence or removal. The mask is
optimized to (1) alter predictions (prediction loss) and (2) minimize structural changes (distance loss).
The final explanation highlights edges with the highest importance scores from the learned mask.</p>
        <p>Another thread of counterfactual explanation methods is to generate counterfactual instances that are
close to the original instance but lead to a different prediction. CLEAR [13] employs a graph variational
autoencoder (GVAE) to learn a latent representation of the input graph and generate counterfactual
graphs by making minimal changes to the original structure or features. The GVAE is trained to
reconstruct the original graph while ensuring that the generated counterfactual samples result in a
different model prediction, maintaining both proximity (closeness to the original instance) and validity
(changing the prediction). RCExplainer [14] uses a neural network that predicts the existence of an edge
between two nodes based on their embeddings. To generate counterfactual explanations, RCExplainer
modifies these pairwise node embeddings, effectively simulating the addition or removal of edges that
lead to a change in the model's prediction. This approach allows for a structured and interpretable way
of understanding which edges influence the decision of the GNN model.</p>
        <p>However, these methods primarily focus on structural changes to the graph, such as edge addition or
removal, rather than utilizing graph structures for time series data, which is the focus of our work. Our
approach leverages the inherent relationships between sensors in a time series context, enabling us to
generate counterfactual explanations that are both interpretable and relevant to the specific anomalies
detected by GNN-based models.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>3. Method</title>
      <sec id="sec-4-1">
        <title>3.1. Problem Statement</title>
        <p>This paper addresses the task of explaining anomalies in multivariate time series data through
counterfactual explanation generation. To support this, we incorporate an initial anomaly detection component
as a foundation.</p>
        <p>We begin by employing an unsupervised time series anomaly detection model that learns the normal
behavior of a system from historical data and detects deviations in unseen data. The input consists of
multivariate sensor data from a set of N sensors. The training data is denoted as
s_train = [s(1), s(2), …, s(T_train)], where each s(t) ∈ ℝ^N represents the sensor readings at time t. The model
assumes the training data to be free of anomalies and captures normal system patterns to flag abnormal
points in the test data.</p>
        <p>The core focus of this work lies in generating counterfactual explanations for the data points identified
as anomalous. The counterfactual explanation provides human-interpretable insights into the model's
decision-making process by answering the question: What minimal change would make an anomalous
instance be considered normal? Formally, given a test data sequence s_test = [s(1), s(2), …, s(T_test)] and a
set of anomaly predictions, the goal is to generate, for each detected anomaly s_test(t), a modified version
s′_test(t) such that the model classifies s′_test(t) as normal, and s′_test(t) remains as close as possible to
s_test(t) under a suitable distance metric.</p>
      </sec>
      <sec id="sec-4-2">
        <title>3.2. Overview</title>
        <p>[Figure 1: Overview of the proposed pipeline. 1) A GNN-based anomaly detection model produces
graph information and anomaly samples; these feed the two-stage approach, consisting of 2.a) node
extraction of an informative subgraph and 2.b) counterfactual explanation generation.]</p>
      </sec>
      <sec id="sec-4-3">
        <title>3.3. GNN-based Model for Time-Series Anomaly Detection</title>
        <p>This section presents the GNN-based model for time-series anomaly detection, which follows the
methodology proposed by Deng and Hooi [4]. The model produces an anomaly score for time-series data, labeling
it as anomalous if the score exceeds a specified threshold. Following the GDN architecture [4], the
implementation integrates structural learning techniques with graph neural networks, comprising four
interconnected modules: sensor embedding, graph structure learning, graph attention-based forecasting,
and graph deviation scoring.</p>
        <p>Each sensor i is represented by a trainable embedding vector e_i ∈ ℝ^d, learned jointly with the
forecasting objective. These embeddings capture the behavior patterns of the sensors and can be used
to identify which sensors are similar to each other. Sensors that are highly correlated will have similar
embedding vectors.</p>
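        <p>Since highly correlated sensors obtain similar embeddings, a candidate graph can be built from embedding similarity, as described next. A minimal NumPy sketch of this idea follows; the function name and array layout are illustrative assumptions, not the authors' code.</p>
        <preformat>
```python
import numpy as np

def topk_adjacency(E, k):
    # Pairwise cosine similarity between sensor embeddings (rows of E).
    norm = np.linalg.norm(E, axis=1, keepdims=True)
    sim = (E @ E.T) / (norm @ norm.T)
    np.fill_diagonal(sim, -np.inf)      # exclude self-loops
    # Keep each node's top-k most similar sensors as directed edges.
    A = np.zeros(sim.shape)
    for i in range(sim.shape[0]):
        A[i, np.argsort(-sim[i])[:k]] = 1.0
    return A
```
        </preformat>
        <p>Each row of the returned matrix marks that sensor's k most similar peers, giving a sparse directed graph.</p>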
        <p>To explicitly represent inter-sensor relationships, we build a data-driven directed graph. For every
pair of sensor embeddings e_i and e_j, we compute the cosine similarity as:
A′_ij = (e_i ⋅ e_j) / (‖e_i‖ ‖e_j‖),  (1)
and retain the top-k neighbors of each node to obtain the adjacency matrix A. The resulting directed
graph explicitly encodes dominant inter-sensor relationships, informing the subsequent forecasting and
anomaly scoring steps.</p>
        <p>With the learned adjacency matrix A, graph-attention layers process each time window x(t) ∈ ℝ^{N×w}.
For node i at time t, the hidden state is
h_i(t) = ReLU( α_{i,i} W x_i(t) + Σ_{j ∈ N(i)} α_{i,j} W x_j(t) ),  (2)
where the attention weights α_{i,j} are softmax-normalized cosine similarities of concatenated node features.</p>
        <p>A fully connected layer then maps the sensor representations into predicted sensor values:
ŝ(t) = f_θ([e_1 ⋅ h_1(t), e_2 ⋅ h_2(t), …, e_N ⋅ h_N(t)]),  (3)
where f_θ is a fully connected layer. The output ŝ(t) contains the predicted values of the sensors at time t. The
model is trained using a mean squared error (MSE) loss function:
ℒ_MSE = (1 / (T_train − w)) Σ_{t=w+1}^{T_train} ‖ŝ(t) − s(t)‖²_2,  (4)
where T_train is the total number of training samples and w is the window length.</p>
        <p>Deviations between predicted and actual values are calculated as the deviation score for each sensor i:
Err_i(t) = |ŝ_i(t) − s_i(t)|, where ŝ_i(t) is the predicted value and s_i(t) is the actual value of sensor i at time t.</p>
        <p>To ensure that all deviation scores are on the same scale, we normalize the deviation score as follows:
a_i(t) = (Err_i(t) − μ̃_i) / σ̃_i,  (5)
where μ̃_i and σ̃_i are the median and inter-quartile range (IQR) of the deviation scores of sensor i over
the training set, following [4].</p>
        <p>The final anomaly score at time t is given by taking the maximum across all sensors:
AS(t) = max_{i=1,…,N} a_i(t),  (6)
where N is the number of sensors. The system is flagged as anomalous if the score exceeds a predefined
threshold.</p>
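        <p>The deviation scoring described above (per-sensor absolute forecast error, robust normalization by the training-set median and IQR, then a maximum over sensors) reduces to a few array operations. The following is a minimal NumPy sketch; names and array shapes are illustrative assumptions.</p>
        <preformat>
```python
import numpy as np

def anomaly_score(pred, actual, train_err):
    # Err_i(t): absolute deviation between forecast and observation.
    err = np.abs(pred - actual)
    # Robust per-sensor normalization by training-set median and IQR.
    med = np.median(train_err, axis=0)
    q75, q25 = np.percentile(train_err, [75, 25], axis=0)
    a = (err - med) / (q75 - q25)
    # System-level score: maximum normalized deviation across sensors.
    return a.max(axis=-1)
```
        </preformat>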
      </sec>
      <sec id="sec-4-4">
        <title>3.4. Two-stage Approach</title>
        <p>Anomaly samples detected by the GNN-based anomaly detection model are fed into the two-stage
approach for generating an explanation. The first stage involves node extraction, which identifies
the most relevant sensors to guide the counterfactual explanation method. The second stage uses a
counterfactual explanation method that generates counterfactual instances by altering only the sensors
identified in the extracted node set from the first stage.</p>
        <sec id="sec-4-4-1">
          <title>3.4.1. Node Extraction</title>
          <p>To generate counterfactual explanations focused on the most relevant sensors, we need to extract a
node set containing only the most important sensors from the original graph. The node extraction
module uses the anomaly score for each sensor to identify the most important sensors in the graph.
The extraction process consists of three steps:
1. Select the top k₁ sensors with the highest anomaly scores, where k₁ is a hyperparameter that
controls the size of the initial node set.
2. Select k₂ additional sensors that are connected to the selected k₁ sensors in the graph, where k₂ is
a hyperparameter that controls the size of the extended node set.
3. Combine both sets of sensors to form the final set of selected sensors S(t).</p>
          <p>In the first step, we select the sensor set
S₁(t) = {i ∣ rank(a_i(t)) ≤ k₁},
where rank(⋅) ranks sensors by anomaly score in descending order. This step selects the sensors with the
highest anomaly scores, as they are most likely to contribute to the detected anomaly and are therefore
most relevant for generating counterfactual explanations.</p>
          <p>In the second step, we select k₂ sensors S₂(t) that are connected to the selected k₁ sensors:
N(S₁(t)) = {j ∈ V ∖ S₁(t) ∣ ∃i ∈ S₁(t) : A_ij = 1},
S₂(t) ⊆ N(S₁(t)), |S₂(t)| = k₂,
where N(S₁(t)) represents the neighboring sensors of the selected k₁ sensors. Several strategies exist for
selecting S₂(t) from N(S₁(t)), including choosing sensors with the highest anomaly scores, those most
connected to S₁(t), or random selection. We choose the top k₂ sensors with the highest anomaly scores,
as this provides a simple and effective way to select the most relevant sensors.</p>
          <p>Finally, we combine both sets to form the final selected sensor set: S(t) = S₁(t) ∪ S₂(t).</p>
          <p>The selected node set S(t) is then used as input to the counterfactual explanation method, which
generates counterfactual instances by altering only the sensors in the extracted set.</p>
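          <p>The three-step extraction above can be sketched in a few lines of NumPy, assuming a binary adjacency matrix where A[i, j] = 1 denotes an edge from sensor i to sensor j; the function name and interface are illustrative, not the authors' implementation.</p>
          <preformat>
```python
import numpy as np

def extract_nodes(anomaly_scores, adjacency, k1, k2):
    # Step 1: top-k1 sensors by anomaly score (descending).
    order = np.argsort(-anomaly_scores)
    s1 = set(order[:k1].tolist())
    # Step 2: neighbors of S1 (A[i, j] = 1), excluding already-selected sensors.
    neighbors = set()
    for i in s1:
        for j in np.nonzero(adjacency[i])[0].tolist():
            if j not in s1:
                neighbors.add(j)
    # Rank candidate neighbors by anomaly score and keep the top k2.
    ranked = sorted(neighbors, key=lambda j: -anomaly_scores[j])
    s2 = set(ranked[:k2])
    # Step 3: the union forms the final selected node set S(t).
    return s1 | s2
```
          </preformat>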
        </sec>
        <sec id="sec-4-4-2">
          <title>3.4.2. Counterfactual Explanation Generation</title>
          <p>The counterfactual explanation generation module creates counterfactual instances by altering the
signals of the sensors in the extracted node set. We use a perturbation-based approach that generates
counterfactual instances by adding small changes to the original signal.</p>
          <p>We employ gradient optimization, a technique commonly used in adversarial attacks, to compute these
perturbations effectively. The perturbation δ is found by minimizing the objective function ℒ(x, x + δ),
where x is the original signal and δ is the perturbation. The objective function is defined as:
ℒ(x, x + δ) = ℒ_CE(f(x + δ), y_target) + λ ⋅ ‖δ‖,
where λ controls the trade-off between the two terms, f(⋅) is the model, y_target is the target class, and
ℒ_CE is the cross-entropy loss. The first term pushes the model to produce a specific output (the target
class), while the second term keeps the perturbation small.</p>
          <p>The perturbation is computed using gradient descent:
δ^(k+1) = δ^(k) − η ∇_δ ℒ(x, x + δ^(k)),
where η is the learning rate and k is the iteration number. We initialize the perturbation to zero: δ^(0) = 0.</p>
          <p>To focus only on the extracted sensors, we apply a mask to the gradient. The mask m is defined as:
m_i = 1 if i ∈ S(t), and m_i = 0 otherwise.
This mask zeros out the gradients for sensors not in the extracted node set. The masked gradient is
computed as:
∇′_δ ℒ(x, x + δ) = ∇_δ ℒ(x, x + δ) ⊙ m,
where ⊙ denotes element-wise multiplication. The perturbation is then updated using the masked
gradient:
δ^(k+1) = δ^(k) − η ∇′_δ ℒ(x, x + δ^(k)).</p>
          <p>This process continues until we reach the maximum number of iterations or obtain a valid
counterfactual instance. The final step adds the perturbation to the original signal: x_cf = x + δ.</p>
          <p>The generated counterfactual instance x_cf is a modified version of the original signal that produces a
different model output. This counterfactual instance explains the model's decision by showing how the
prediction changes when influential sensors are altered. By only perturbing sensors in the extracted
node set, we focus on the most relevant sensors, which helps minimize the perturbation size and
improve the quality of explanations.</p>
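          <p>The masked-perturbation search can be illustrated end to end. The sketch below is framework-free: it uses finite-difference gradients on a black-box anomaly score in place of backpropagating a cross-entropy objective through the GNN, so the scalar score interface and all names are assumptions for illustration only.</p>
          <preformat>
```python
import numpy as np

def masked_counterfactual(score_fn, x, node_set, lam=0.1, lr=0.05,
                          max_iter=200, tau=0.5, eps=1e-4):
    # score_fn maps a signal to an anomaly score; the paper instead takes the
    # gradient of a cross-entropy objective through the GNN (illustrative only).
    mask = np.zeros(x.shape)
    mask[list(node_set)] = 1.0          # m_i = 1 only for extracted sensors
    delta = np.zeros_like(x)            # delta^(0) = 0
    obj = lambda d: score_fn(x + d) + lam * np.linalg.norm(d)
    for _ in range(max_iter):
        base = obj(delta)
        grad = np.zeros_like(delta)
        for i in np.nonzero(mask)[0]:   # gradient only where the mask is 1
            step = np.zeros_like(delta)
            step[i] = eps
            grad[i] = (obj(delta + step) - base) / eps
        delta = delta - lr * grad * mask    # masked perturbation update
        if bool(np.less(score_fn(x + delta), tau)):
            break                       # valid counterfactual: prediction flipped
    return x + delta                    # x_cf = x + delta
```
          </preformat>
          <p>Sensors outside the extracted node set are provably untouched, since both the gradient and the update are zeroed there.</p>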
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4. Experiments</title>
      <sec id="sec-5-1">
        <title>4.1. Experiment Setup</title>
        <sec id="sec-5-1-1">
          <title>4.1.1. Datasets</title>
          <p>We evaluate our approach on two multivariate time series datasets from industrial control systems,
both of which are public benchmarks. Dataset statistics are summarized in Table 1.</p>
          <p>We use two widely-adopted water treatment testbed datasets: SWaT [15] and WADI [16]. The Secure
Water Treatment (SWaT) dataset contains data from a scaled water treatment plant with 51 sensors
monitoring various physical processes. The Water Distribution (WADI) dataset extends SWaT with a
more comprehensive 128-sensor water distribution system. Both datasets include two weeks of normal
operations followed by controlled attack scenarios that simulate real-world anomalies through physical
system manipulations.</p>
          <p>We apply consistent preprocessing across all datasets following [4]: (1) median downsampling
to 0.1 Hz (one sample per 10 seconds) to reduce noise and computational overhead, (2) sensor-wise
min-max normalization to the [0, 1] range, and (3) sliding-window segmentation into 50-second chunks (5
downsampled measurements) for model input.</p>
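          <p>The three preprocessing steps can be sketched as follows, assuming the raw data arrives as a (time, sensors) array sampled at 1 Hz; the function and parameter names are illustrative assumptions.</p>
          <preformat>
```python
import numpy as np

def preprocess(raw, factor=10, win=5):
    # (1) Median downsampling by `factor` (1 Hz to 0.1 Hz for factor=10).
    t = (raw.shape[0] // factor) * factor
    ds = np.median(raw[:t].reshape(-1, factor, raw.shape[1]), axis=1)
    # (2) Per-sensor min-max scaling to the [0, 1] range.
    lo, hi = ds.min(axis=0), ds.max(axis=0)
    scaled = (ds - lo) / np.where(hi - lo == 0, 1.0, hi - lo)
    # (3) Sliding windows of `win` downsampled steps (50 s for win=5).
    return np.stack([scaled[i:i + win]
                     for i in range(scaled.shape[0] - win + 1)])
```
          </preformat>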
        </sec>
        <sec id="sec-5-1-2">
          <title>4.1.2. Baseline Methods</title>
          <p>We compare the GNN anomaly detection approach against several baseline models, including six
traditional machine learning models, and one GNN-based model. The compared models are listed as
follows:
• KNN: K Nearest Neighbors utilizes the distance of each point to its k nearest neighbors as the
anomaly score and classifies the point as anomalous if the score is greater than a specified
threshold.
• IForest: Isolation Forest is an ensemble-based anomaly detection model that isolates anomalies
by randomly partitioning the data into smaller subsets. It builds an ensemble of isolation trees
and uses the average path length of the trees to compute the anomaly score.
• OCSVM: One-Class SVM is a support vector machine-based anomaly detection model that learns
a decision boundary around the normal data points and classifies points outside the boundary as
anomalous.
• AutoEncoder: AutoEncoder consists of an encoder and a decoder which reconstruct data samples
from the input data. The reconstruction error is used as the anomaly score.
• VAE: Variational AutoEncoder is an improved version of the AutoEncoder, which learns a probabilistic
model of the data.
• PCA: Principal Component Analysis looks for a low-dimensional projection of the data that
captures most of the variance of the data. The reconstruction error is used as the anomaly score.
• FuSAGNet [17]: FuSAGNet introduces Fused Sparse Autoencoder and Graph Net, which jointly
optimizes reconstruction and forecasting while explicitly modeling the relationships within
multivariate time series.</p>
          <p>For counterfactual explanation generation, we compare against two additional baselines: (1)
Reconstruction, which directly uses autoencoder reconstructions as counterfactual explanations under the
assumption that reconstructions project onto the normal space, and (2) Without Node Extraction, which
represents our method without the node extraction component.</p>
        </sec>
        <sec id="sec-5-1-3">
          <title>4.1.3. Evaluation Metrics</title>
          <p>We evaluate our approach using two sets of metrics: anomaly detection performance and counterfactual
explanation quality.</p>
          <p>Anomaly Detection Performance. We assess the anomaly detection model using standard
classification metrics: precision, recall, F1-score, AUC-ROC, and PRC-AUC. AUC-ROC and PRC-AUC provide
a comprehensive assessment of the model's performance across different threshold values and are
widely used metrics for evaluating classification models.</p>
        </sec>
        <sec id="sec-5-1-4">
          <title>Counterfactual Explanation Quality.</title>
          <p>We evaluate generated counterfactuals using three quantitative metrics alongside qualitative visual
inspection. Validity measures the fraction of counterfactuals that successfully flip the model's prediction:
Validity = (1 / N_cf) Σ_{j=1}^{N_cf} 1( f(x_cf^j) &lt; τ ),  (16)
where N_cf is the number of counterfactuals, f(⋅) is the model, τ is the classification threshold, and 1(⋅) is
the indicator function.</p>
          <p>Sparsity quantifies the average fraction of sensors modified per counterfactual:
Sparsity = (1 / N_cf) Σ_{j=1}^{N_cf} (1 / N) Σ_{i=1}^{N} 1( |δ_i^j| &gt; ε ),  (17)
where N is the number of sensors, δ_i^j is the perturbation for sensor i in counterfactual j, and ε is a
minimal-change threshold.</p>
          <p>Proximity measures the average magnitude of the perturbations:
Proximity = (1 / N_cf) Σ_{j=1}^{N_cf} ‖δ^j‖,  (18)
where δ^j is the perturbation vector for counterfactual j.</p>
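          <p>The three metrics reduce to simple array reductions over the perturbations and counterfactual scores. A minimal NumPy sketch, with illustrative names and a scalar anomaly-score interface assumed, is:</p>
          <preformat>
```python
import numpy as np

def cf_metrics(deltas, scores_cf, tau=0.5, eps=1e-6):
    # deltas: (n_cf, n_sensors) perturbations; scores_cf: model scores of the
    # counterfactuals. Names are illustrative, not the authors' code.
    validity = np.mean(np.less(scores_cf, tau))          # fraction flipped
    sparsity = np.mean(np.greater(np.abs(deltas), eps))  # fraction of sensors changed
    proximity = np.mean(np.linalg.norm(deltas, axis=1))  # avg perturbation size
    return validity, sparsity, proximity
```
          </preformat>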
          <p>Higher validity indicates more effective counterfactuals, while lower sparsity and proximity reflect
better explainability through minimal, localized changes.</p>
        </sec>
        <sec id="sec-5-1-5">
          <title>4.1.4. Implementation Details</title>
          <p>We implement the proposed approach using PyTorch and PyTorch Geometric. The model is trained
with the Adam optimizer with learning rate 1 × 10⁻³ and (β₁, β₂) = (0.9, 0.99) for 50 epochs. We include
early stopping with a patience of 10 epochs. The embedding dimension for the sensors is 128 for the WADI
dataset and 64 for the SWaT dataset. Training is performed on a single Tesla T4 GPU with 16 GB memory.
For the node extraction module, we set k₁ = 2 and k₂ = 1. The perturbation is computed using gradient
descent with a learning rate of 0.001 and a maximum of 100 iterations, with the Adam optimizer. λ in the
objective function is 0.1 for the SWaT dataset and 0.001 for the WADI dataset.</p>
        </sec>
      </sec>
      <sec id="sec-5-2">
        <title>4.2. Benchmark Comparison</title>
        <p>In this section, we conduct two benchmark comparisons. The first compares the anomaly detection
performance of the proposed GNN-based model with the other baseline models; this benchmarking
acts as a sanity check for anomaly detection. The second compares the generated counterfactual
explanations with those of the baseline models.</p>
        <sec id="sec-5-2-1">
          <title>Anomaly Detection Performance</title>
          <p>As a sanity check for the GNN model, we compare the anomaly detection performance
of the proposed GNN-based model and the other baseline models on the two
datasets. The results are shown in Table 2.</p>
          <p>On the WADI dataset, GDN achieves the highest F1, precision and PRC-AUC, while FuSAGNet leads
in ROC-AUC. VAE achieves the best recall. These results suggest that GNN-based models offer more
balanced performance.</p>
          <p>On the SWaT dataset, GDN consistently outperforms others across nearly all metrics. PCA achieves
the highest precision but with lower recall, indicating a stricter anomaly boundary that may misclassify
normal instances.</p>
          <p>Explanation Performance. We compare the performance of counterfactual explanations across
different models. In addition to our proposed method, we apply the two-stage approach using
FuSAGNet. For baseline models without graph structures, we skip the node extraction step and apply the
counterfactual method directly. We also evaluate a reconstruction-based counterfactual approach on
both GDN and FuSAGNet. Results are shown in Table 3. Note that KNN and IForest are excluded, as
their non-differentiable nature prevents gradient-based counterfactual generation.</p>
          <p>On the WADI dataset, the proposed approach achieves a validity score of 0.5718, which is not
significantly higher than that of the other models, but still acceptable. Notably, it outperforms the others in sparsity
and proximity, indicating that the generated counterfactuals are both sparse and close to the original
instances. In contrast, baseline models show poor performance, with a sparsity score of 1.0000 and much
higher proximity values. While FuSAGNet with node extraction achieves a higher validity score, its
sparsity and proximity do not improve significantly. This suggests that generating valid counterfactuals
is easier for these models, but requires larger adjustments to the original signal. We also find that the node extraction
step is not effective for FuSAGNet, as the validity score is the same as for the model without the node
extraction step. Reconstruction-based methods perform poorly on WADI, with low validity and sparsity
fixed at 1.0000, indicating difficulty in generating meaningful and interpretable counterfactuals.</p>
          <p>On the SWaT dataset, a similar trend emerges. The proposed approach achieves a high validity score
alongside low sparsity and proximity, indicating effective and interpretable counterfactuals. Although
baseline models and reconstruction-based methods reach perfect validity, they suffer from high sparsity
and proximity, reducing explainability. When the node extraction step is removed from the proposed
approach, validity drops and sparsity increases significantly, which highlights the step’s effectiveness.
We attribute this to the gradient-based method distributing perturbations across all sensors, leading to
less valid and less sparse counterfactuals. Interestingly, FuSAGNet performs worse with node extraction
on SWaT, dropping to a validity score of 0.1380. This may stem from its architectural constraints
enforcing sparsity in the latent space [17], which limits its adaptability in counterfactual generation.</p>
        </sec>
      </sec>
      <sec id="sec-5-3">
        <title>4.3. Ablation Studies</title>
        <sec id="sec-5-3-1">
          <title>Effect of Node Extraction Hyperparameters for Counterfactual Explanations</title>
          <p>We investigate the impact of the hyperparameters k₁ and k₂, which control the number of selected sensors and their
neighbors, on the quality of counterfactual explanations using GDN on the SWaT and WADI datasets.
Results are shown in Table 4.</p>
          <p>As k₁ and k₂ increase, the validity score generally improves, indicating that more valid counterfactuals
can be generated when more features are available to perturb. However, this trend plateaus once the sum k₁ + k₂
exceeds 2, suggesting that only a small number of informative sensors and their immediate neighborhood
are sufficient for effective explanation. Meanwhile, both sparsity and proximity scores increase with k₁
and k₂, reflecting reduced explainability due to more widespread perturbations.</p>
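<p>For concreteness, the node extraction step can be read as the following sketch (one plausible interpretation: select by per-sensor anomaly score, then add the top-scoring graph neighbors; the function and variable names are ours):</p>

```python
import numpy as np

def extract_nodes(scores, adj, k1=2, k2=1):
    """Pick the k1 highest-scoring sensors, then up to k2 neighbors of each.

    scores : (n_sensors,) per-sensor anomaly scores at detection time
    adj    : (n_sensors, n_sensors) boolean adjacency of the learned graph
    """
    top = np.argsort(scores)[::-1][:k1]  # k1 most anomalous sensors
    selected = set(int(i) for i in top)
    for i in top:
        nbrs = np.where(adj[i])[0]
        # keep the k2 highest-scoring neighbors of each selected sensor
        for j in nbrs[np.argsort(scores[nbrs])[::-1][:k2]]:
            selected.add(int(j))
    return sorted(selected)
```

<p>With k₂ &gt; 0, sensors that are not themselves the most anomalous but are graph-adjacent to anomalous ones enter the perturbation set, which is what lets the counterfactual exploit inter-sensor structure.</p>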
          <p>On the SWaT dataset, the best configuration is k₁ = 2, k₂ = 1, achieving the highest validity of 0.9740
while maintaining relatively low sparsity and proximity. Notably, this configuration outperforms
the one with k₁ = 3 and k₂ = 0, despite both involving three total nodes. This indicates that leveraging
the graph structure to incorporate neighbors provides more targeted and efficient perturbations than
selecting more sensors independently, which highlights the benefit of graph-based relational modeling
in counterfactual generation. In contrast, too few sensors (e.g., k₁ = 1, k₂ = 0) result in poor validity
(0.4160), while too many (k₁ = 5) can dilute the perturbation effect, lowering validity to 0.9430. A
similar pattern is observed on WADI, where the best validity (0.5900) occurs at k₁ = 3, k₂ = 0, though
the overall scores are lower, likely due to WADI’s higher dimensionality and complexity.</p>
          <p>Overall, the number of selected sensors should be large enough to ensure the generation of valid
counterfactuals, but small enough to maintain explainability. Validity gains plateau after a certain point,
suggesting a trade-off between completeness and sparsity.</p>
        </sec>
      </sec>
      <sec id="sec-5-4">
        <title>4.4. Visual Analysis Experiments</title>
        <p>We show one illustrative example of a generated counterfactual explanation for a detected
anomaly. The example is taken from the SWaT dataset and corresponds to the detected anomaly labeled
9. The original instance and the generated counterfactual instance are shown in Figure 3.</p>
        <p>This anomaly is caused by an attack on sensor FIT-401, a flow transmitter. The attack manually
sets the sensor value to 0, which causes actuator P-501 to turn off (its value changing from 2 to 1).
The original instance is shown in solid lines, with both sensors in the off state.
Our node extraction module selects the most important sensors, i.e., FIT-401 and P-501. The generated
counterfactual instance is shown in dashed lines, where both sensor FIT-401 and actuator P-501 are set
to higher values. The correlation between the two signals indicates that they are related and can
influence each other’s behavior. This aligns with
the physical setting of the system: FIT-401 is the upstream sensor of P-501, and its value
has a direct influence on the value of P-501. The generated counterfactual instance remains close
to the original instance and can be interpreted as a valid counterfactual explanation.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>5. Discussion</title>
      <p>Our experimental results confirm that the proposed two-stage counterfactual framework provides
concise, actionable explanations that improve trust and troubleshooting efficiency for system operators.
In this section we discuss two main insights.</p>
      <p>Effectiveness of graph-aware counterfactuals: Across both datasets, validity increases sharply
once the explanation can perturb at most three sensors, i.e. the k₁ + k₂ = 3 setting with k₁ &gt; 0 and
k₂ &gt; 0, where neighbors of the selected features are utilised. This shows that usually
only a few, closely linked variables drive each anomaly. When we choose some of those sensors using
the graph of how they connect (i.e. increasing k₂), the resulting counterfactuals are more valid than if
we just picked the sensors with the highest anomaly scores. This supports our claim that knowing the
system’s structure is crucial for clear counterfactual reasoning in highly coupled systems.</p>
      <p>Trade-off between validity, sparsity and proximity: Letting the algorithm perturb more sensors
(higher k₁ or k₂) makes its explanations more often valid, but it also means bigger changes to the data,
making the results harder to read and trust. Looking at Table 4, the sweet spot seems to be
k₁ = 2 and k₂ = 1: we still get over 97% validity on the SWaT dataset while the typical change stays
under 0.015 (in normalized units). In practice, engineers can pick these two knobs to suit their goals:
smaller values if they want to pinpoint the root cause with minimal edits, larger values if ensuring
validity matters more than keeping the edits tiny.</p>
    </sec>
    <sec id="sec-7">
      <title>6. Conclusion</title>
      <p>In this work, we introduced a novel framework to generate counterfactual explanations tailored to
graph neural network-based models. Our approach leverages the representational power of GNNs to
model complex inter-sensor relationships in a two-stage explanation mechanism that enables
interpretable counterfactual reasoning. Extensive experiments on the SWaT and WADI benchmarks
show that our two-stage framework cuts the number of perturbed sensors to less than 6% on average,
while generating highly valid counterfactual explanations. This superior sparsity–proximity trade-off
means the counterfactuals are both concise and easier for practitioners to act upon.</p>
      <p>Our framework contributes to more transparent and trustworthy machine learning solutions for
safety-critical domains by bridging the gap between black-box anomaly detection using GNNs and
explainable AI. Future work may explore weighted similarity-based relationships in graphs, the integration of
domain constraints, real-time explanation generation, and multi-criteria optimization.</p>
    </sec>
    <sec id="sec-8">
      <title>Acknowledgments</title>
      <p>The work was carried out with support from Vinnova (Sweden’s innovation agency) through the
Advanced Digitalisation Program as part of the future AI-based maintenance project (project number:
2023-01917).</p>
    </sec>
    <sec id="sec-9">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors used ChatGPT-4 to improve the writing style and
to check grammar and spelling. After using this service, the authors reviewed and edited the content as
needed and take full responsibility for the publication’s content.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>V.</given-names>
            <surname>Chandola</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Banerjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <article-title>Anomaly detection: A survey</article-title>
          ,
          <source>ACM Computing Surveys (CSUR)</source>
          <volume>41</volume>
          (
          <year>2009</year>
          )
          <fpage>1</fpage>
          -
          <lpage>58</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zamanzadeh Darban</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. I.</given-names>
            <surname>Webb</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Pan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Aggarwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Salehi</surname>
          </string-name>
          ,
          <article-title>Deep learning for time series anomaly detection: A survey</article-title>
          ,
          <source>ACM Computing Surveys</source>
          <volume>57</volume>
          (
          <year>2024</year>
          )
          <fpage>1</fpage>
          -
          <lpage>42</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>Jin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. Y.</given-names>
            <surname>Koh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Wen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Zambon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Alippi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. I.</given-names>
            <surname>Webb</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>King</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Pan</surname>
          </string-name>
          ,
          <article-title>A survey on graph neural networks for time series: Forecasting, classification, imputation, and anomaly detection</article-title>
          ,
          <source>IEEE Transactions on Pattern Analysis and Machine Intelligence</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <article-title>Graph neural network-based anomaly detection in multivariate time series</article-title>
          , volume
          <volume>35</volume>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Jain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wright</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mirhoseini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. E.</given-names>
            <surname>Gonzalez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Stoica</surname>
          </string-name>
          ,
          <article-title>Representing long-range context for graph neural networks with global attention</article-title>
          ,
          <source>Advances in neural information processing systems</source>
          <volume>34</volume>
          (
          <year>2021</year>
          )
          <fpage>13266</fpage>
          -
          <lpage>13279</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>C.</given-names>
            <surname>Molnar</surname>
          </string-name>
          ,
          <source>Interpretable Machine Learning</source>
          ,
          <volume>3</volume>
          ed.,
          <year>2025</year>
          . URL: https://christophm.github.io/interpretable-ml-book.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>I.</given-names>
            <surname>Karlsson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Rebane</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Papapetrou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gionis</surname>
          </string-name>
          ,
          <article-title>Locally and globally explainable time series tweaking</article-title>
          ,
          <source>Knowledge and Information Systems</source>
          <volume>62</volume>
          (
          <year>2020</year>
          )
          <fpage>1671</fpage>
          -
          <lpage>1700</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <article-title>Learning time series counterfactuals via latent space representations</article-title>
          , Springer,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <article-title>Instance-based counterfactual explanations for time series classification</article-title>
          , Springer,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <article-title>Counterfactual explanations for multivariate time series</article-title>
          , IEEE,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>X.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Aoki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. H.</given-names>
            <surname>Wilson</surname>
          </string-name>
          ,
          <article-title>Counterfactual explanations for multivariate time-series without training datasets</article-title>
          ,
          <source>arXiv preprint arXiv:2405.18563</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <article-title>Cf-gnnexplainer: Counterfactual explanations for graph neural networks</article-title>
          ,
          <source>PMLR</source>
          ,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>J.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Mishra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <article-title>Clear: Generative counterfactual explanations on graphs</article-title>
          ,
          <source>Advances in neural information processing systems</source>
          <volume>35</volume>
          (
          <year>2022</year>
          )
          <fpage>25895</fpage>
          -
          <lpage>25907</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>M.</given-names>
            <surname>Bajaj</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Chu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z. Y.</given-names>
            <surname>Xue</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Pei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. C.-H.</given-names>
            <surname>Lam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <article-title>Robust counterfactual explanations on graph neural networks</article-title>
          ,
          <source>Advances in Neural Information Processing Systems</source>
          <volume>34</volume>
          (
          <year>2021</year>
          )
          <fpage>5644</fpage>
          -
          <lpage>5655</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <article-title>SWaT: A water treatment testbed for research and training on ICS security</article-title>
          , IEEE,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <article-title>WADI: a water distribution testbed for research in the design of secure cyber physical systems</article-title>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <source>Learning Sparse Latent Graph Representations for Anomaly Detection in Multivariate Time Series</source>
          ,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>