<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>CEUR Workshop Proceedings</journal-title>
      </journal-title-group>
      <issn>1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Mining and Interpretability in Graph Neural Networks</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Kislay Raj</string-name>
          <email>kislay.raj2@mail.dcu.ie</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alessandra Mileo</string-name>
          <email>alessandra.mileo@insight-centre.org</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
      </contrib-group>
      <kwd-group>
        <kwd>Neurosymbolic AI</kwd>
        <kwd>Explainable AI</kwd>
        <kwd>Graph Neural Networks</kwd>
        <kwd>Rule Mining</kwd>
      </kwd-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Insight SFI Research Centre for Data Analytics, School of Computing, Dublin City University</institution>
          ,
          <addr-line>Dublin</addr-line>
          ,
          <country country="IE">Ireland</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Research Ireland Centre for Research Training in Artificial Intelligence, DCU</institution>
          ,
          <addr-line>Dublin</addr-line>
          ,
          <country country="IE">Ireland</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <fpage>25</fpage>
      <lpage>30</lpage>
      <abstract>
        <p>In this paper we introduce ActiMine-GNN, a novel framework that derives compact sets of activation rules from the hidden layers of GNNs and trains shallow surrogate models for explanation. Unlike prior work that seeks individually discriminative patterns, ActiMine-GNN converts layer embeddings into binary activation matrices, mines non-redundant coactivation patterns with FP-growth, and trains a depth-limited decision tree over pattern features. ActiMine-GNN achieves higher fidelity at substantially lower sparsity than strong baselines, while maintaining task accuracy. A quantitative analysis reports the coverage of the rule set, its conciseness, and the global surrogate’s alignment with the frozen GNN. We also study robustness to activation thresholding and pattern caps, and report compute and scaling behaviour. These results indicate that compact activation rules can provide transparent and faithful rationales for GNN decisions across synthetic and molecular benchmarks.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>The approach in [9] mines hidden-layer activations to reveal internal GNN reasoning; however, it requires
iterative and potentially computationally expensive subgroup discovery, limiting its scalability to larger
graphs.</p>
      <p>Our previous research introduced Functional Semantic Activation Mapping (FSAM) [10], providing
global interpretability by tracking neuron activations across layers. FSAM identifies when and where
neurons lose their class-specific activations, providing insight into the effects of network depth and
helping to detect scenarios where GNNs achieve correct predictions for incorrect reasons. However,
FSAM does not directly translate these activations into human-understandable decision rules or
instance-level masks, and it does not tackle the scalability issue of activation analysis for GNN explainability.
To address this gap, we propose ActiMine-GNN, an explainability framework that builds on FSAM and
combines binary activation extraction, frequent pattern mining, and shallow decision trees. ActiMine-GNN
first converts hidden-layer activations into binary activation matrices, capturing significant neuron
activations across ego-neighbourhoods of fixed radius. Then we extract frequent coactivation patterns, which
we use to construct interpretable activation-pattern features. These features are then passed to a shallow
decision tree to produce clear, dataset-level rules with per-instance explanations.</p>
      <p>A common problem
when generalising local GNN explainability approaches based on surrogate models to the entire GNN is
the need to generate exponentially many possible surrogates, depending on the size of the input graph. In
contrast, ActiMine-GNN trains a fixed, small number of shallow global surrogates (e.g., one tree per
layer or a single tree on concatenated features) and caps the number of mined patterns per layer,
thus avoiding surrogate proliferation while keeping the rule set compact.</p>
      <p>The method is applicable to both the node and graph classification settings; in this paper, we focus on
graph classification and evaluate the approach on three standard benchmarks used in GNN
explainability: BA-2Motifs [6] (synthetic motif detection), AIDS [11] (antiviral activity classification
on molecular graphs), and BBBP [12] (blood–brain barrier permeability). These datasets are widely
adopted to assess the fidelity and sparsity of subgraph-based explanations and provide a mix of synthetic
ground truth (BA-2Motifs) and chemically meaningful structure (AIDS/BBBP). Across these benchmarks,
ActiMine-GNN achieves higher fidelity at matched sparsity while maintaining task accuracy, providing
transparent and compact global rules.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Proposed Methodology: ActiMine-GNN</title>
      <p>
        In this paper, we consider ActiMine-GNN on Graph Convolutional Networks (GCNs) [13] for binary
graph classification because the benchmarks we evaluate (BA-2Motifs, AIDS, BBBP) are binary tasks,
and this facilitates comparative fidelity analysis. This choice is not a limitation of the approach, as
the pipeline is model-agnostic and can be extended to multiclass classification by training a multiclass
surrogate (or one-vs-rest surrogate) on the same pattern features and ranking mined patterns by their
association with each class; evaluation then uses class-conditional fidelity at matched sparsity. The
approach is built upon the activation-extraction approach of FSAM [10], which provides a global
understanding of coactivation of a trained GNN by (i) extracting layer-wise activations and binarising
them to form activation matrices, and (ii) building a co-activation graph over hidden units to analyse how
class-specific activations emerge or dissipate across layers. FSAM produces model-level insights (e.g.,
loss of class specificity with depth), but it does not produce decision rules or instance-level masks. We
keep the FSAM activation extraction step unchanged: namely, we derive binary activation matrices from
ReLU embeddings for each hidden layer. We then replace FSAM’s coactivation graph construction with
a rule-mining pipeline as follows: (1) for each layer, form transactions from ego-neighbourhoods of fixed
radius; (2) mine frequent coactivation patterns under support and pattern-cap constraints to obtain
activation-pattern features; (3) train a depth-limited decision tree on these features to extract interpretable
activation rules; and (4) induce per-instance masks from the selected rules for evaluation. Figure 1 illustrates
this process: the left block corresponds to the result of the FSAM activation-extraction step, which
represents the input; the middle and right blocks implement our pattern-mining and rule-induction
pipeline, respectively.
      </p>
      <sec id="sec-2-1">
        <title>2.1. Binary Activation Matrix</title>
        <p>We begin by converting each continuous GCN embedding into a discrete activation pattern, a step
motivated by the focus of FSAM on neuron activations [10], but here specialised to rule mining in graph
classification. Let 𝒟 = {Gᵢ} be the dataset with Gᵢ = (Vᵢ, Eᵢ, Xᵢ, yᵢ), nᵢ = |Vᵢ|, and yᵢ ∈ {0, 1}. Here i is the
graph index (GID) and v ∈ Vᵢ the node index (NID); let n_tot = Σᵢ nᵢ.</p>
        <fig id="fig1">
          <caption>
            <p>Figure 1: The ActiMine-GNN pipeline. (1) A trained GNN produces ReLU activations for each
input graph (node colours indicate input categories only and are not explanations); (2) activations are
binarised to obtain A^(ℓ) = 1{H^(ℓ) &gt; τ}; (3) frequent coactivation patterns are mined from A^(ℓ) using
FP-growth, optionally filtered by a maximum-entropy background model; (4) a depth-limited decision tree
is fitted on pattern features to obtain activation rules; (5) instance-level node masks are instantiated from
the selected rules; (6) rules and masks are evaluated at the same sparsity budget (fidelity, 1−fidelity, and
sparsity); or (7) used to provide insights on the model. Dotted stacks indicate repetition across layers
(ℓ = 1, …, L), producing per-layer rule sets and masks. In A^(ℓ), rows are indexed by (graph ID, node ID)
and ones mark active embedding components.</p>
          </caption>
        </fig>
        <p>A trained L-layer GCN produces continuous activations H^(ℓ) = [h_{v,c}^(ℓ)] ∈ ℝ^{n_tot×d}, with
h_v^(ℓ) = ReLU( W^(ℓ) Σ_{u ∈ N(v)∪{v}} h_u^(ℓ−1) / √(d_u d_v) ),
and h_v^(0) = x_v. FSAM continues with continuous correlations on H^(ℓ) to build a global coactivation
graph; instead, we discretise H^(ℓ) channel-wise to obtain binary indicators for pattern mining:
A^(ℓ)(v, c) = 1{ h_{v,c}^(ℓ) &gt; τ_c } ∈ {0, 1},  A^(ℓ) ∈ {0, 1}^{n_tot×d},
where the thresholds {τ_c} are chosen on validation data (ReLU’s τ = 0 is a special case; layer-specific
τ^(ℓ) are also admissible). Binarisation (i) produces transactions for scalable itemset mining, (ii) normalises
away arbitrary activation scales across layers/graphs, and (iii) produces crisp rule conditions for shallow
surrogates. This choice is orthogonal to the number of classes and works unchanged for multiclass
(one-vs-rest or a multiclass surrogate).</p>
      </sec>
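<p>As an illustration, the binarisation step above can be sketched in a few lines of NumPy (a minimal sketch: the embedding values and the zero thresholds are toy assumptions, not the paper’s data):</p>

```python
import numpy as np

# Toy stand-in for one layer's ReLU embeddings H^(l):
# rows are (graph ID, node ID) pairs, columns are hidden channels.
H = np.array([[0.0, 1.2, 0.3],
              [0.9, 0.0, 0.0],
              [0.4, 2.1, 0.8]])

# Per-channel thresholds tau_c; ReLU's tau = 0 is the special case used here.
tau = np.zeros(H.shape[1])

# Binary activation matrix A^(l)(v, c) = 1{ h_{v,c} > tau_c }.
A = (H > tau).astype(np.uint8)
print(A.tolist())  # [[0, 1, 1], [1, 0, 0], [1, 1, 1]]
```

<p>Each row of A then becomes one transaction for the itemset-mining stage described next.</p>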
      <sec id="sec-2-2">
        <title>2.2. Frequent Coactivation Mining</title>
        <p><bold>Transactions and background.</bold> For layer ℓ, define the transaction of each node v as
T_v^(ℓ) = { c : A^(ℓ)(v, c) = 1 }. The marginal activation probability p_c^(ℓ) defines an independent
Bernoulli (maximum-entropy under fixed marginals) background model.</p>
        <sec id="sec-2-2-1">
          <title>Itemset Mining Algorithm and Support</title>
          <p>We extract frequent itemsets P ⊆ {1, …, d} with FP-growth, which compresses transactions into an
FP-tree and avoids the candidate explosion of Apriori; it is robust on sparse binary activations and fast at
moderate d. We considered Eclat as an alternative, but empirically determined that FP-growth was
consistently faster in our setting.</p>
          <p>Global support is the fraction of nodes (across all graphs) whose transaction contains P:
supp(P) = (1 / n_tot) Σᵢ |{ v ∈ Vᵢ : P ⊆ T_v^(ℓ) }|.
Under the independence background, the expected support of P is Π_{c ∈ P} p_c^(ℓ); deviations from this
product indicate association beyond chance.</p>
        </sec>
        <sec id="sec-2-2-2">
          <title>Threshold Selection and Redundancy Control</title>
          <p>We set a minimum support per layer by validation on a small grid (e.g. {0.01, 0.02, …, 0.10}),
choosing the largest value that still produces at least the required number of patterns after pruning. We
prune to closed itemsets (removing subsumed patterns) to avoid near-duplicates.</p>
        </sec>
        <sec id="sec-2-2-3">
          <title>Ranking: Support, Confidence, and Lift</title>
          <p>For graph-level ranking, define a presence indicator x_P(Gᵢ) = 1{Φ_P(Gᵢ) &gt; 0} with
Φ_P(Gᵢ) = (1 / nᵢ) Σ_{v ∈ Vᵢ} 1{ P ⊆ T_v^(ℓ) }.
Given class y, the confidence and lift are
conf(P → y) = Pr(yᵢ = y | x_P(Gᵢ) = 1),  lift(P → y) = Pr(yᵢ = y | x_P(Gᵢ) = 1) / Pr(yᵢ = y).
We first filter by minimum support (unsupervised frequency), then rank by confidence (class association)
and report lift &gt; 1 as evidence over background prevalence. We keep the top-k non-redundant itemsets
per layer as salient patterns, that is, frequent and class-associated (lift &gt; 1). Unless otherwise stated,
transactions use radius-1 ego-neighbourhoods, k = 100 patterns are kept per layer, and the minimum
support is selected as above; multiple-testing control is applied through Holm adjustment when screening
many patterns.</p>
        </sec>
      </sec>
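<p>The support computation and frequency filter of Section 2.2 can be sketched as follows (a brute-force enumeration over toy transactions for clarity; the paper uses FP-growth, for which off-the-shelf implementations exist, and the transactions below are illustrative assumptions):</p>

```python
from itertools import combinations

# Toy transactions: the set of active channels per node, pooled across graphs.
transactions = [
    {1, 2, 3},
    {1, 2},
    {2, 3},
    {1, 2, 3},
    {1, 3},
]

def frequent_itemsets(transactions, min_support=0.4, max_size=2):
    """Brute-force frequent-itemset mining (a stand-in for FP-growth):
    supp(P) = fraction of transactions containing P; keep P with supp >= min_support."""
    items = sorted(set().union(*transactions))
    n = len(transactions)
    out = {}
    for size in range(1, max_size + 1):
        for P in combinations(items, size):
            supp = sum(1 for t in transactions if set(P) <= t) / n
            if supp >= min_support:
                out[P] = supp
    return out

patterns = frequent_itemsets(transactions)
print(patterns)  # e.g. the pair (1, 2) appears in 3 of 5 transactions
```

<p>Confidence and lift are then computed over graph-level presence indicators exactly as defined above; FP-growth replaces the enumeration with a compressed FP-tree traversal at scale.</p>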
      <sec id="sec-2-3">
        <title>2.3. Activation Rule Induction via Decision Trees</title>
        <sec id="sec-2-3-1">
          <title>Pattern Features and Surrogate Training</title>
          <p>For each selected itemset P, define node- and graph-level features
φ_P(v, ℓ) = 1{ P ⊆ T_v^(ℓ) },  Φ_P(Gᵢ) = (1 / nᵢ) Σ_{v ∈ Vᵢ} φ_P(v, ℓ).
We train a shallow decision-tree surrogate on the graph features ([Φ_P(Gᵢ)]_{P ∈ ℐ}, yᵢ). We limit the
depth to 4 (min leaf = 3), bounding rule length (at most four feature tests per rule) to reduce overfitting
and keep the rule set compact; in validation, deeper trees yielded only marginal gains in fidelity at the
cost of substantially longer rules.</p>
          <p>Each root-to-leaf path produces a human-readable activation rule
⋀_j (Φ_{P_j}(G) ⋈_j θ_j) ⇒ ŷ = y,  with ⋈_j ∈ {≥, &lt;},
which can be visualised as a per-instance mask by highlighting the nodes with φ_P(v, ℓ) = 1 for the
selected rule conditions.</p>
        </sec>
      </sec>
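<p>A minimal sketch of this surrogate-training step, using scikit-learn’s decision tree with the depth and leaf settings stated above (the feature values, labels, and the names Phi_P1–Phi_P3 are fabricated placeholders for mined patterns, not the paper’s data):</p>

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical pattern-frequency features Phi_P(G) for 8 graphs and 3 mined patterns.
X = np.array([[0.6, 0.0, 0.1],
              [0.5, 0.1, 0.0],
              [0.7, 0.0, 0.2],
              [0.4, 0.0, 0.1],
              [0.0, 0.5, 0.6],
              [0.1, 0.4, 0.5],
              [0.0, 0.6, 0.4],
              [0.2, 0.5, 0.7]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Shallow surrogate: depth <= 4, at least 3 samples per leaf, as in Section 2.3.
tree = DecisionTreeClassifier(max_depth=4, min_samples_leaf=3, random_state=0)
tree.fit(X, y)

# Each root-to-leaf path is a human-readable activation rule.
print(export_text(tree, feature_names=["Phi_P1", "Phi_P2", "Phi_P3"]))
```

<p>On real data the inputs are the mined pattern features per graph and the frozen GNN’s labels; the printed paths read directly as conjunctive rules.</p>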
      <sec id="sec-2-4">
        <title>2.4. What Do the Mined Rules Look Like?</title>
        <p><bold>Rule form.</bold> From each layer ℓ, ActiMine-GNN mines frequent coactivation patterns P (sets of activated
embedding components) and trains a depth-limited decision tree over pattern-count features. A rule
therefore has the form</p>
        <p>Rule (Layer ℓ, class y):  [count(P_1) ≥ 1] ∧ ⋯ ∧ [count(P_m) ≥ t_m] ⇒ y,</p>
        <p>where count(P) is the number of nodes in the graph whose layer-ℓ binary activations include pattern
P, and the t_j are small integer thresholds (typically 1–3). We annotate each rule with support (coverage on
test graphs), precision (class-conditional accuracy on covered graphs), and lift (increase over the class
prior).</p>
        <p><bold>BA-2Motifs example (Layer 2, class 0).</bold> Consider a 5-cycle graph from BA-2Motifs (class 0). One
high-ranking Layer 2 rule for class 0 is</p>
        <p>Rule 4 (L2, class 0):  [count(P_13) ≥ 1] ∧ [count(P_27) ≥ 1] ⇒ class 0.</p>
        <p>Here P_13 and P_27 are Layer 2 coactivation patterns mined by FP-growth; intuitively, they correspond to
embedding components that tend to co-occur on cycle nodes. On a graph where Rule 4 fires, the
instance-level mask is formed by the nodes that support the antecedent patterns (optionally expanded by the
chosen mask variant, e.g., ego-neighbourhood or top-k edges). Under the same sparsity budget, retaining the
masked subgraph reproduces the original class on most covered graphs (Layer 2 fidelity = 0.94 on
BA-2Motifs; Table 3); deleting it substantially reduces confidence in class 0 (infidelity = 1 − 0.94 = 0.06),
illustrating necessity.</p>
        <sec id="sec-2-4-1">
          <title>How the Rules Produce a Mask</title>
          <p>Given a firing rule, we score nodes by their antecedent support (the number of satisfied patterns)
and select the highest-scoring nodes until the sparsity budget is met; the chosen mask variant
(node-removal, ego, distance-weighted, or top-k edges) is then applied. This produces a compact,
human-interpretable subgraph that links the global rule to a local rationale. Figure 2 shows an
example mask induced by Rule 4 on a 5-cycle: removing the masked node breaks the cycle into a path
and often changes the prediction, while retaining the mask alone preserves the class.</p>
          <p><bold>Molecular example.</bold> In BBBP, Layer 2 rules typically combine a small number of patterns that occur
on substructures linked to permeability (e.g., ring and heteroatom contexts). The local mask highlights
the few atoms whose Layer 2 activations satisfy the antecedent, providing concise explanations at
sparsity ≈ 0.17 with fidelity 0.88 and infidelity 0.12 (Table 3).</p>
        </sec>
      </sec>
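<p>The rule-to-mask step can be sketched as follows (a minimal sketch under toy assumptions: the node-to-pattern map, the antecedent pattern IDs, and the budget are illustrative, and only the basic node-removal variant is shown):</p>

```python
def induce_mask(node_patterns, antecedent, budget):
    """Score nodes by antecedent support (number of satisfied rule patterns)
    and keep the highest scorers until the sparsity budget is met.

    node_patterns: {node_id: set of pattern ids satisfied at that node}
    antecedent:    pattern ids required by the firing rule
    budget:        max fraction of nodes to retain
    """
    scores = {v: len(p & antecedent) for v, p in node_patterns.items()}
    k = max(1, int(budget * len(node_patterns)))
    # Rank by descending score, breaking ties by node id for determinism.
    ranked = sorted(scores, key=lambda v: (-scores[v], v))
    return sorted(ranked[:k])

# Toy graph: nodes 0 and 4 satisfy both antecedent patterns of "Rule 4".
node_patterns = {0: {13, 27}, 1: {13}, 2: set(), 3: {27}, 4: {13, 27}, 5: set()}
mask = induce_mask(node_patterns, antecedent={13, 27}, budget=0.34)
print(mask)  # [0, 4]
```

<p>The ego, distance-weighted, and top-k edge variants would post-process this node set rather than change the scoring.</p>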
    </sec>
    <sec id="sec-3">
      <title>3. Experiments and Results</title>
      <sec id="sec-3-1">
        <title>3.1. Dataset Summary</title>
        <sec id="sec-3-1-1">
          <title>Datasets and Metrics</title>
          <p>To evaluate the effectiveness of ActiMine-GNN, we performed experiments on benchmark graph
classification datasets.</p>
          <p>We evaluated GNN interpretability using a pipeline comprising activation mining, decision-tree rule
extraction, and XGBoost-based rule validation. Experiments were performed on three different binary
graph classification benchmark datasets:
• BA-2Motifs [6]: a synthetic dataset containing 1,000 graphs with balanced classes, each featuring either
a 5-cycle or a house motif as ground truth.
• AIDS [11]: 2,000 molecular graphs (atoms as nodes, bonds as edges) with a binary label for
anti-HIV activity; node and edge types provided; classes are imbalanced.
• BBBP [12]: blood–brain barrier permeability (permeable vs non-permeable); 1,640 molecular
graphs retained after standard sanitisation and deduplication. Class counts refer to the full dataset
after preprocessing (before splitting).</p>
          <p>We quantify interpretability with (i) fidelity (sufficiency of the masked subgraph to reproduce the
frozen GNN’s prediction) and (ii) sparsity, defined as the budget of the instance-level mask, i.e. the
fraction of nodes (or edges) retained (lower is better). We also report rule conciseness as the average
number of rule conditions per leaf (equivalently, the average path length in the decision tree). XGBoost’s
feature importance is used only to validate that the mined pattern features carry a predictive signal;
it is not our sparsity metric. These metrics are especially relevant for molecular property prediction,
where concise and interpretable explanations are critical for domain experts.</p>
          <p>Let f(G) ∈ ℝ^C be the frozen GNN’s softmax scores for graph G, and y⋆ = argmax_c f_c(G) its
predicted class. Given a rule set at layer ℓ, let V^(ℓ)(G) ⊆ V(G) be the node mask it selects (the union of
all applicable rule bodies), and define the node sparsity (lower is better) as
Spar^(ℓ)(G) = |V^(ℓ)(G)| / |V(G)|.
Write f_keep = f[V^(ℓ)(G)] and f_del = f[V(G) ∖ V^(ℓ)(G)]. We evaluate on the subset of covered test
graphs 𝒢^(ℓ) = { G ∈ 𝒢_test : |V^(ℓ)(G)| &gt; 0 } (coverage = |𝒢^(ℓ)| / |𝒢_test|).</p>
          <p>Fidelity (sufficiency, hard):
Fid_hard^(ℓ) = (1 / |𝒢^(ℓ)|) Σ_{G ∈ 𝒢^(ℓ)} 1[ argmax_c f_c(G_keep^(ℓ)) = y⋆ ].
Sparsity (mask budget):
Spar^(ℓ) = (1 / |𝒢^(ℓ)|) Σ_{G ∈ 𝒢^(ℓ)} Spar^(ℓ)(G).
Infidelity (lower is better): we report infidelity as the complement of fidelity at the same sparsity budget
and on the same covered set, Infid^(ℓ) = 1 − Fid^(ℓ).</p>
          <p>Unless stated otherwise, we report means over the covered test graphs 𝒢^(ℓ); fidelity is calculated at
the (matched) sparsity reported in Table 2, and infidelity is the complement of fidelity at the same sparsity
budget. Soft variants of fidelity can be obtained by replacing the indicator with f_{y⋆}(G_keep^(ℓ)).</p>
        </sec>
      </sec>
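<p>The metric bookkeeping above can be sketched as follows (a minimal sketch: the graphs, masks, and the stub <monospace>predict</monospace> function standing in for the frozen GNN are illustrative assumptions):</p>

```python
def evaluate(graphs, masks, predict):
    """Coverage, hard fidelity, infidelity, and sparsity over covered test graphs.

    graphs:  list of {"id", "n_nodes", "y_star"} records (y_star = frozen GNN's class)
    masks:   {graph id: list of retained node ids} (empty list = not covered)
    predict: callable(graph, keep=nodes) -> predicted class on the masked subgraph
    """
    covered = [g for g in graphs if masks[g["id"]]]
    fid = sum(predict(g, keep=masks[g["id"]]) == g["y_star"] for g in covered) / len(covered)
    spar = sum(len(masks[g["id"]]) / g["n_nodes"] for g in covered) / len(covered)
    return {"coverage": len(covered) / len(graphs),
            "fidelity": fid, "infidelity": 1 - fid, "sparsity": spar}

graphs = [{"id": 0, "n_nodes": 10, "y_star": 1},
          {"id": 1, "n_nodes": 20, "y_star": 0},
          {"id": 2, "n_nodes": 10, "y_star": 1}]
masks = {0: [1, 2], 1: [3, 4, 5, 6], 2: []}  # graph 2 is not covered

# Stub GNN: predicts class 1 iff the retained subgraph has at most 3 nodes.
predict = lambda g, keep: 1 if len(keep) <= 3 else 0

metrics = evaluate(graphs, masks, predict)
print(metrics)
```

<p>In the paper’s setting, <monospace>predict</monospace> is the frozen GNN applied to the masked subgraph, and the means are taken over the covered test split.</p>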
      <sec id="sec-3-2">
        <title>3.2. Rule Extraction Robustness</title>
        <p>Across all datasets, ActiMine-GNN extracted between 8 and 15 high-fidelity activation rules per layer.
Although we extracted and evaluated the rules at each layer, we focus on the results of Layer 2 (L2) in
Table 2. Across Layers 1–3, fidelity fluctuates by at most ±0.08, and the fraction of nodes
retained by the mask (sparsity) by at most 0.13 relative to the L2 value, confirming the stability of the
experiment. The highest fidelity is achieved on the BA-2Motifs dataset, where Layer 2
produced 12 rules that apply to more than 95% of the test graphs, with an average fidelity of 0.94.
Notes: variability across layers is modest, |Fid_ℓ − Fid_2| ≤ 0.08 and |Spar_ℓ − Spar_2| ≤ 0.13 across
datasets; infidelity is reported as 1 − fidelity at the same budget as fidelity.</p>
        <p>Key observations:</p>
        <p>• Fidelity ≥ 0.88 across BA-2Motifs (0.94), AIDS (0.91), and BBBP (0.88) indicates that, at the stated
budgets, the masked subgraph alone reproduces the original prediction on most test graphs.
• Infidelity ≤ 0.12 at the same budget (0.06, 0.09, and 0.12, respectively) shows that removal of
the highlighted subgraph markedly reduces the model’s confidence in the original class,
signalling the necessity of the selected nodes.
• Sparsity of 0.15–0.18 (mean ≈ 0.17) produces concise explanations, with about 17% of nodes retained.</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Instance-Level Explanations</title>
        <p>ActiMine-GNN discovers twelve high-fidelity activation rules on the BA-2Motifs dataset at Layer 2.
BA-2Motifs comprises 1,000 graphs: 500 containing a 5-cycle (class 0) and 500 containing a 5-node ‘house’
motif (class 1). ActiMine-GNN exactly recovers the correct explanatory substructure for each graph,
outperforming all baseline methods. In Figure 2, we display a single test graph (a 5-cycle) and highlight
its Rule 4 support set V_mask = {v_1} in red. Removing v_1 (and its incident edges) breaks the cycle into a
simple path, causing the GNN’s prediction to change from ‘5-cycle’ to ‘house’. This flip occurs in 94% of
the graphs where Rule 4 applies, producing an instance-level fidelity of 0.94. For each class and layer ℓ,
we pre-rank rules by confidence, breaking ties by lift (see definitions in Section 2.2), and require a
minimum test-coverage threshold. The example in Figure 2 shows the Layer 2 rule ranked top for class 0
among those that meet a predefined coverage threshold. We then rank rules by validation-set fidelity,
breaking ties by higher support and greater conciseness (shorter path length). This fixed procedure,
applied uniformly across datasets, avoids selective reporting and ensures comparability. Although the
node-removal mask is the simplest, ActiMine-GNN also supports more expressive strategies. The
node-removal (keep) mask retains only the nodes in V_mask and the edges they induce, removing
everything else. The ego-graph mask retains the full radius-ℓ neighbourhood around each v ∈ V_mask,
capturing the multi-hop context. The distance-weighted mask assigns continuous weights that decay with
the shortest-path distance from V_mask (measured in hops for unweighted graphs and by path length for
weighted graphs), producing a soft subgraph. Finally, the top-k edge mask selects the k highest-weight
edges under the distance-weighted scheme to produce a compact, human-readable subgraph. Unless
stated otherwise, for each instance we choose the mask variant that maximises fidelity at the same
sparsity budget, and we report the results so obtained.</p>
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Comparison with State-of-the-Art Methods</title>
        <p>Table 3 reports a comparison of ActiMine-GNN with three widely used instance-level GNN explanation
methods: GNNExplainer, PGExplainer, and PGM-Explainer. Fidelity measures the degree to which
the masked subgraph is sufficient for the model to reproduce its original prediction: for each method
and each test graph, we apply the method’s mask, retain only the highlighted subgraph, and
check whether the predicted class is unchanged; fidelity is the average proportion over the covered test
graphs. Infidelity is reported as the complement of fidelity at the same sparsity budget and on the same
covered set. Sparsity quantifies conciseness and is the fraction of nodes retained by the mask (lower is
better). As shown in Table 3, across BA-2Motifs, AIDS, and BBBP, ActiMine-GNN achieves the highest
fidelity and the lowest infidelity while producing the most concise masks. These results indicate that
our rule-driven approach aligns closely with the model’s decision process and yields human-interpretable,
focused rationales. For completeness, the table reports each baseline at its achieved sparsity, while
ActiMine-GNN uses the Layer 2 configuration.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusion and Future Work</title>
      <p>In this paper we presented ActiMine-GNN, a rule-driven framework for understanding GNN decision
processes. The method mines compact activation rules from binarised hidden-layer activations using
decision-tree induction and applies these rules to generate instance-level masks for individual graphs.
We quantitatively validated ActiMine-GNN on benchmark datasets, focussing on both the intrinsic
quality of the rules discovered and their ability to faithfully explain the model’s decisions. To assess
whether the extracted rules characterise the overall behaviour of the GNN, we used rule-derived
characteristics to train XGBoost surrogates and measured their agreement with the frozen GNN. For
each input graph, the feature vector encodes the number of nodes that support each activation rule.
The high alignment between surrogate predictions and the original GNN output indicates that the
rules capture the core predictive logic of the model. We chose decision trees for rule extraction due to
their clarity and interpretability, and XGBoost for quantitative validation because it reliably assesses
the predictive strength of the rule-derived features. Our pipeline binarises activations and relies on
support thresholds and top-k pattern caps; while this confers scalability and interpretability, performance
can depend on these hyperparameters. Experiments focus on graph-level binary classification with a GCN
backbone; although the approach is model-agnostic and extends to multiclass settings (for example,
one-vs-rest or multiclass surrogates), we have not exhaustively evaluated alternative architectures and
leave this for future work. Although this work emphasised local masks, examining the rule sets
globally would help to assess how well the method characterises the GNN as a whole. This includes: (a)
measuring the coverage of the rule set and the class-conditional accuracy on the highlighted
graphs; (b) quantifying the agreement between a global surrogate trained on the rule-derived features
and the frozen GNN; (c) performing rule ablations (dropping or adding rules) to estimate
marginal contributions; and (d) checking the consistency between global rules and local masks, that
is, that the masks instantiated from a given rule are predominantly produced for that rule’s class
and tend to increase local fidelity. Finally, we anticipate that models capable of generating prototype
subgraphs linked to concise global rules (with per-class exemplars and counterfactual contrasts) will
offer case-level transparency and auditable rule sets, supporting adoption in regulated domains such as
healthcare, finance, and autonomous systems.</p>
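<p>The global-surrogate agreement check described above can be sketched as follows (a minimal sketch: the paper uses XGBoost, for which scikit-learn’s gradient boosting serves here as a stand-in, and the rule counts and stub GNN predictions are synthetic assumptions):</p>

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

# Hypothetical global-surrogate check: features are per-rule support counts
# per graph; the target is the frozen GNN's prediction, not the ground truth.
rng = np.random.default_rng(0)
rule_counts = rng.integers(0, 5, size=(200, 12))  # 200 graphs x 12 rules
gnn_pred = (rule_counts[:, 0] + rule_counts[:, 3] >= 4).astype(int)  # stub GNN output

# The paper uses XGBoost; sklearn's gradient boosting is a drop-in stand-in here.
surrogate = GradientBoostingClassifier(random_state=0).fit(rule_counts, gnn_pred)
agreement = (surrogate.predict(rule_counts) == gnn_pred).mean()
print(f"surrogate/GNN agreement: {agreement:.2f}")
```

<p>High agreement indicates that the rule-derived features capture the model’s predictive logic; on real data this would of course be measured on a held-out split rather than the training graphs.</p>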
    </sec>
    <sec id="sec-5">
      <title>Acknowledgment</title>
      <p>This work was conducted with the financial support of the Science Foundation Ireland Centre for
Research Training in Artificial Intelligence under Grant No. 18/CRT/6223 and the Research Ireland
INSIGHT Centre for Data Analytics, Grant No. 12/RC/2289_P2.</p>
    </sec>
    <sec id="sec-6">
      <title>Declaration on Generative AI</title>
      <p>The author(s) have not employed any Generative AI tools.</p>
    </sec>
    <sec id="sec-7">
      <title>References</title>
      <p>[3] E. Şahin, N. N. Arslan, D. Özdemir, Unlocking the black box: an in-depth review on interpretability,
explainability, and reliability in deep learning, Neural Computing and Applications (2024).
doi:10.1007/s00521-024-10437-2.</p>
      <p>[4] M. Saarela, V. Podgorelec, Recent applications of explainable AI (XAI): a systematic literature
review, Applied Sciences 14 (2024) 8884. doi:10.3390/app14198884.</p>
      <p>[5] A. Longa, S. Azzolin, G. Santin, G. Cencetti, P. Lio, B. Lepri, A. Passerini, Explaining the explainers
in graph neural networks: a comparative study, ACM Computing Surveys (2024). doi:10.1145/3696444.</p>
      <p>[6] Z. Ying, D. Bourgeois, J. You, M. Zitnik, J. Leskovec, GNNExplainer: generating explanations for
graph neural networks, in: NeurIPS, 2019. URL: https://proceedings.neurips.cc/paper/2019/hash/
d80b7040b773199015de6d3b4293c8ff-Abstract.html.</p>
      <p>[7] D. Luo, W. Cheng, D. Xu, W. Yu, B. Zong, H. Chen, X. Zhang, Parameterized explainer for graph
neural network, arXiv preprint arXiv:2011.04573, 2020. doi:10.48550/arXiv.2011.04573.</p>
      <p>[8] P. Müller, L. Faber, K. Martinkus, R. Wattenhofer, DT+GNN: a fully explainable graph neural network
using decision trees, arXiv preprint arXiv:2205.13234, 2022. doi:10.48550/arXiv.2205.13234.</p>
      <p>[9] L. Veyrin-Forrer, A. Kamal, S. Dufner, M. Plantevit, C. Robardet, On GNN explainability with
activation rules, Data Mining and Knowledge Discovery 38 (2022) 3227–3261.
doi:10.1007/s10618-022-00870-z.</p>
      <p>[10] K. Raj, A. Mileo, Towards understanding graph neural networks: functional-semantic activation
mapping, in: International Conference on Neural-Symbolic Learning and Reasoning, Springer,
2024, pp. 98–106.</p>
      <p>[11] C. Morris, N. M. Kriege, F. Bause, K. Kersting, P. Mutzel, M. Neumann, TUDataset: a collection of
benchmark datasets for learning with graphs, arXiv preprint arXiv:2007.08663 (2020).</p>
      <p>[12] Z. Wu, B. Ramsundar, E. N. Feinberg, J. Gomes, C. Geniesse, A. S. Pappu, K. Leswing, V. Pande,
MoleculeNet: a benchmark for molecular machine learning, Chemical Science 9 (2017) 513–530.
doi:10.1039/c7sc02664a.</p>
      <p>[13] T. N. Kipf, M. Welling, Semi-supervised classification with graph convolutional networks, arXiv
preprint arXiv:1609.02907 (2016).</p>
      <p>[14] M. N. Vu, M. T. Thai, PGM-Explainer: probabilistic graphical model explanations for graph neural
networks, in: Advances in Neural Information Processing Systems (NeurIPS), volume 33, 2020,
pp. 12225–12235. URL: https://proceedings.neurips.cc/paper/2020/file/
8fb134f258b1f7865a6ab2d935a897c9-Paper.pdf.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>L.</given-names>
            <surname>Waikhom</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Patgiri</surname>
          </string-name>
          ,
          <article-title>A survey of graph neural networks in various learning paradigms: methods, applications, and challenges</article-title>
          ,
          <source>Artificial Intelligence Review</source>
          <volume>56</volume>
          (
          <year>2022</year>
          )
          <fpage>6295</fpage>
          -
          <lpage>6364</lpage>
          . doi:10.1007/s10462-022-10321-2.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Ye</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y. J.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. O.</given-names>
            <surname>Sing</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>A comprehensive survey of graph neural networks for knowledge graphs</article-title>
          ,
          <source>IEEE Access</source>
          <volume>10</volume>
          (
          <year>2022</year>
          )
          <fpage>75729</fpage>
          -
          <lpage>75741</lpage>
          . doi:10.1109/access.2022.3191784.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>