=Paper=
{{Paper
|id=Vol-3910/aics2024_p16
|storemode=property
|title=Tangentially Aligned Integrated Gradients for User-Friendly Explanations
|pdfUrl=https://ceur-ws.org/Vol-3910/aics2024_p16.pdf
|volume=Vol-3910
|authors=Lachlan Simpson,Federico Costanza,Kyle Millar,Adriel Cheng,Cheng-Chew Lim,Hong Gunn Chew
|dblpUrl=https://dblp.org/rec/conf/aics/SimpsonCMCLC24
}}
==Tangentially Aligned Integrated Gradients for User-Friendly Explanations==
Lachlan Simpson¹,∗, Federico Costanza², Kyle Millar³, Adriel Cheng¹,³, Cheng-Chew Lim¹ and Hong Gunn Chew¹
¹ School of Electrical and Mechanical Engineering, The University of Adelaide, Australia
² School of Computer and Mathematical Sciences, The University of Adelaide, Australia
³ Information Sciences Division, Defence Science and Technology Group, Australia
Abstract
Integrated gradients is prevalent within machine learning to address the black-box problem of neural networks.
The explanations given by integrated gradients depend on a choice of base-point. The choice of base-point is
not a priori obvious and can lead to drastically different explanations. There is a longstanding hypothesis that
data lies on a low dimensional Riemannian manifold. The quality of explanations on a manifold can be measured
by the extent to which an explanation for a point lies in its tangent space. In this work, we propose that the
base-point should be chosen such that it maximises the tangential alignment of the explanation. We formalise the
notion of tangential alignment and provide theoretical conditions under which a base-point choice will provide
explanations lying in the tangent space. We demonstrate how to approximate the optimal base-point on several
well-known image classification datasets. Furthermore, we compare the optimal base-point choice with common
base-points and three gradient explainability models.
Keywords
Explainable AI, XAI, Integrated Gradients, Manifold Hypothesis.
1. Introduction
Deep learning provides state-of-the-art solutions to a wide array of computer vision tasks [1]. The
accuracy of deep learning comes with the trade-off of interpretability [2]. A fundamental problem
of deep learning is how a model reached a prediction [3]. Post hoc gradient explainability models
address the black-box problem by providing an attribution of the input features to the prediction of the neural network under analysis [4]. Several gradient explainability methods exist with the underlying
assumption that analysis of the model’s gradient highlights features with greatest impact on a prediction
[5, 6].
Several metrics have been proposed to measure the quality of explainability models. In [4, 7], the
authors propose the Lipschitz constant of an explainability model as a measure of explainability quality.
Other works consider the extent to which an explainability model approximates the underlying neural
network as a measure of quality. These metrics do not consider the user’s perception of the explanations.
Following from Ganz et al.’s [8] notion of perceptually aligned gradients of a neural network, Brodt et al.
[9] introduce perceptually aligned explanations. Brodt et al. [9] measure how perceptually aligned an explanation is by the extent to which an explanation lies in the tangent space of the manifold. Brodt et al.'s [9] measure of tangential explanations relies on the manifold hypothesis. The manifold hypothesis
is the notion that data lies on a low dimensional Riemannian manifold [10, 11, 12, 8, 13, 14].
The tangent space captures the features of an image that can be changed whilst remaining in the
distribution of images. The intuition is if an explanation lies in the tangent space of the image, the
explanation will contain meaningful components of the image [9]. Brodt et al. [9] demonstrate their
hypothesis on several gradient explainability models on well-known computer vision datasets. Brodt et al. [9] further demonstrate that tangentially aligned explanations are robust to adversarial attacks.
AICS’24: 32nd Irish Conference on Artificial Intelligence and Cognitive Science, December 09–10, 2024, Dublin, Ireland
∗ Corresponding author.
lachlan.simpson@adelaide.edu.au (L. Simpson)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
Integrated gradients (IG) [6] is a popular explainability method employed in a wide array of computer
vision tasks [15]. IG relies on a hyper-parameter known as the base-point. The choice of base-point
fundamentally alters the explanation provided [16]. Base-point selection is domain dependent and
chosen heuristically. The zero vector, however, is a prevalent choice in computer vision, NLP and graph
machine learning [6, 17, 18]. Several works have investigated different choices of base-point; however, none are able to determine a correct choice [19]. In this work we investigate the conditions under which
a choice of base-point will provide perceptually aligned explanations.
The contributions of this work are twofold:
1. We provide sufficient conditions for when integrated gradient explanations are tangentially
aligned. We extend these results to any base-point attribution method.
2. We provide a framework to choose a base-point which provides meaningful explanations to
the user. We compare our method with three gradient explainability models and IG with common
base-points. We demonstrate that our base-point choice provides better tangential alignment
and consequently more meaningful explanations. We validate our approach on four well-known
computer vision datasets.
The remainder of this work is structured as follows: Section 2 provides related work and background.
Section 3 investigates theoretical conditions for tangential alignment of base-point attribution methods.
Section 4 calculates base-points for tangential alignment of IG on four well-known datasets. We
compare tangential IG with four common base-point choices and three gradient explainability models.
We conclude in Section 5 with a discussion for future works.
2. Related Work and Background
2.1. Tangentially Aligned Integrated Gradients Explanations
Post hoc explainability models are methods for providing an attribution for the features that influence
the output of a neural network. Post hoc explainability is a step towards addressing the black-box
problem [6].
Base-point attribution methods (BAM) [20] are a specific class of post hoc explainability models. A
BAM is a function
A : M × M × F(M) → R^d    (1)
(x, x′, F) ↦ A(x, x′, F)    (2)
where, M ⊂ Rd is a manifold, F(M ) denotes the set of neural networks on M and x, x′ ∈ M are an
input and a base-point, respectively.
We will further restrict the space of BAM functions to path methods, and we will generalise the
definition of path methods to be independent of coordinates. Given a closed interval I := [a, b] ⊂ R, a path γ : I → M and a unit vector v ∈ R^d, the component of a path method A^γ : M × M × F(M) → R^d in the direction of v is defined as

A^γ_v(x, x′, F) = ∫_a^b ⟨∇F(γ(t)), v⟩ ⟨γ′(t), v⟩ dt.    (3)
In this way, for a given orthonormal basis {v_1, . . . , v_d} of R^d, A^γ is expressed as

A^γ(x, x′, F) = Σ_{i=1}^{d} A^γ_{v_i}(x, x′, F) v_i.    (4)
Particularly, for the standard orthonormal basis {e_1, . . . , e_d} of R^d, we obtain the usual definition

A^γ_{e_i}(x, x′, F) = ∫_a^b (∂F/∂x_i)(γ(t)) (∂γ_i/∂t)(t) dt.    (5)
The prominent path method, integrated gradients [6], is a path method where γ is taken to be the straight line between the points x, x′ ∈ M. For any pair of points x, x′ ∈ M, a neural network F ∈ F(M), and a unit vector v, the v component of integrated gradients is defined to be:

IG_v(x, x′, F) := ⟨x − x′, v⟩ ∫_0^1 ⟨∇F(x′ + t(x − x′)), v⟩ dt.    (6)
Letting I : M × M × F(M) → R^d be the map defined by

I(x, x′, F) := ∫_0^1 (∇F)(x′ + t(x − x′)) dt,    (7)
integrated gradients can be expressed succinctly in the standard orthonormal basis of Rd as
IG(x, x′ , F ) = (x − x′ ) ⊙ I(x, x′ , F ), (8)
where ⊙ denotes the Hadamard product.
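As a rough, non-authoritative sketch of Equation (8) (not the authors' implementation), the straight-line integral can be approximated by a Riemann sum; the network `F` is assumed to map a batch of inputs to scalar scores (e.g. the logit of the predicted class), and `n_steps` is an arbitrary discretisation choice.

```python
import torch

def integrated_gradients(F, x, x_prime, n_steps=64):
    """Riemann-sum approximation of IG(x, x', F) = (x - x') ⊙ ∫_0^1 ∇F(x' + t(x - x')) dt."""
    # Points along the straight line from the base-point x' to the input x.
    ts = torch.linspace(0.0, 1.0, n_steps).view(-1, *([1] * x.dim()))
    path = (x_prime + ts * (x - x_prime)).detach().requires_grad_(True)
    # Each scalar output depends only on its own path point, so one backward pass
    # fills path.grad with ∇F evaluated at every point of the path.
    F(path).sum().backward()
    avg_grad = path.grad.mean(dim=0)     # approximates the integral in Equation (7)
    return (x - x_prime) * avg_grad      # Hadamard product with (x - x'), Equation (8)
```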
Several metrics have been proposed to measure the quality of explainability models. In [7, 4],
Lipschitzness is proposed as a measure of explainability quality. Other works consider the extent
an explainability model approximates the neural network as a measure of quality. Brodt et al. [9]
propose the extent to which an explanation lies in the tangent space of the manifold as a measure
of explanation quality. Attributions which lie in tangent space were demonstrated to constitute the
meaningful features that contribute to a prediction [8, 9]. Orthogonal attributions were closer to random
noise. The hypothesis that tangential explanations provide meaningful explanations is validated on
several image classification datasets and a user study [9]. Here, tangentially aligned explanations are formalised.
For the remainder of this work we will consider Rd, equipped with its standard inner product ⟨·, ·⟩,
and we will let M ⊂ Rd be a manifold of dimension n < d. We will also write ⟨·, ·⟩ for the restriction
of the inner product of Rd to M , such that (M, ⟨·, ·⟩) is an embedded Riemannian submanifold of
(Rd , ⟨·, ·⟩). We will denote the tangent space of M at a point x by Tx M which, in the context of data
manifolds, will consist of all v ∈ Rd such that x + v is "close" to M, with ∥v∥₂ small [9]. Lastly,
making use of the inner product of Rd, for each x ∈ M we have the orthogonal direct sum decomposition Tx Rd = Tx M ⊕ Tx M⊥, where
Tx M ⊥ := {u ∈ Tx Rd : ⟨u, v⟩ = 0, ∀v ∈ Tx M }. (9)
We will let πx : Tx Rd → Tx M denote the natural projection from Tx Rd to Tx M defined by

πx(v) = Σ_{ℓ=1}^{n} ⟨v, τℓ⟩ τℓ,    (10)
where {τ1, . . . , τn} is an orthonormal basis for Tx M. We define the map µx : Tx Rd → [0, 1], given by

µx(v) := ∥πx v∥₂² / ∥v∥₂²,    v ∈ Tx Rd.    (11)
The map defined in Equation (11) provides us with a measure of "how much" of a vector lies in the tangent
space of M at x, i.e. a vector v is in Tx M if and only if µx (v) = 1 and, on the other hand, v will be in
Tx M ⊥ if and only if µx (v) = 0, which can be observed directly from its definition. Moreover, letting
πx⊥ : Tx Rd → Tx M ⊥ denote the natural projection and, noting that,
v = πx v + πx⊥ v,    (12)
∥v∥₂² = ∥πx v∥₂² + ∥πx⊥ v∥₂²,    (13)
we can express µx as

µx(v) = ∥πx v∥₂² / (∥πx v∥₂² + ∥πx⊥ v∥₂²),    v ∈ Tx Rd.    (14)
Minimising the norm of the projection into Tx M ⊥ provides a framework to ensure tangential alignment.
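As a small illustration of Equations (10), (11) and (14), a sketch under the assumption that an orthonormal tangent basis is available as the rows of a matrix `T`; the alignment score µx can then be computed directly:

```python
import torch

def tangential_alignment(v, T):
    """µ_x(v) = ||π_x v||₂² / ||v||₂², where the rows of T are an orthonormal basis of T_x M."""
    coeffs = T @ v                          # ⟨v, τ_ℓ⟩ for each tangent basis vector τ_ℓ
    tangential_norm_sq = (coeffs ** 2).sum()
    return float(tangential_norm_sq / (v ** 2).sum())
```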
2.2. Base-point Selection for Integrated Gradients
The attribution of IG depends on the base-point chosen. Base-point selection is domain dependent and
chosen heuristically. Here we review common base-point choices as provided by [21].
1. Zero. Here the base-point for all points is a constant zero vector,

α^zero = 0.    (15)

In general, the zero base-point can be any constant vector.

2. Maximum Distance. For a given input x ∈ M, α is defined as the point in M of maximum distance from x, i.e.

α_x^max = argmax_{y∈M} ∥x − y∥_p.    (16)

Usually p = 1 or 2.

3. Uniform. We sample each coordinate uniformly over a valid range of M,

α_i^uniform ∼ U(min_i, max_i).    (17)

4. Gaussian. A Gaussian filter is applied to the input x,

α^Gaussian = σ · v + x,    (18)

where v_i ∼ N(0, 1) and σ ∈ R. We require that α^Gaussian is still within the data distribution; note that α^Gaussian → α^uniform as σ → ∞ [21].
The zero base-point (Equation 15) will not highlight the aspects of the image which may be important
if the object of interest contains black pixels [21, 22]. To address the issue of a constant base-point missing important features, the maximum distance base-point (Equation 16) was proposed in [21]. Maximum distance takes the
furthest point (in ℓp distance) from the input image such that the base-point does not contain important
information of the input. Another alternative is to sample a base-point from a distribution such as
uniform (Equation 17) or Gaussian (Equation 18) [21, 23]. Despite the various choices of base-point, we demonstrate that none of the aforementioned base-points provides perceptually aligned explanations.
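For concreteness, a minimal sketch of the four baselines above for image tensors follows; the value range, the choice of p = 2 in Equation (16) and the helper names are illustrative assumptions rather than prescriptions from the paper.

```python
import torch

def zero_baseline(x):
    return torch.zeros_like(x)                               # Equation (15)

def max_distance_baseline(x, dataset):
    # Equation (16): the dataset point farthest from x, here in l2 distance.
    dists = torch.stack([(x - y).norm(p=2) for y in dataset])
    return dataset[int(dists.argmax())]

def uniform_baseline(x, lo=0.0, hi=1.0):
    return torch.empty_like(x).uniform_(lo, hi)              # Equation (17)

def gaussian_baseline(x, sigma=0.1):
    return x + sigma * torch.randn_like(x)                   # Equation (18)
```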
Zaher et al. [24] propose Manifold Integrated Gradients (MIG). MIG replaces the straight line in IG
with a geodesic such that the attribution lies in the Riemannian manifold. Whilst MIG addresses the problem of IG not conforming to the geometry of the data, it does not resolve the issue of base-point choice, nor does it ensure that the attribution lies in the tangent space of the manifold.
3. Optimising the Base-point for Tangentially Aligned Explanations
Throughout this section, we will study the map defined in Equation 11, to identify possible choices of
base-points for the attribution given by a BAM to be tangent to M at a point. To be precise, for a given
BAM, we want to find α ∈ M such that the map
x′ 7→ µx (A(x, x′ , F )) (19)
attains its maximum and, particularly, when this maximum value is equal to 1. We note that α = x is
always a solution, however, we will always require α ̸= x for non-trivial solutions.
Definition 1. Let A : M × M × F(M ) → Rd be a BAM and x, α ∈ M , F ∈ F(M ). A is tangentially
aligned at x, with base-point α, if µx (A(x, α, F )) = 1.
In the remainder of this section x ∈ M and F ∈ F(M ) will be fixed, unless otherwise stated. Letting
πx⊥ : Tx Rd → Tx M ⊥ denote the natural projection and defining the maps
Hx : M → Tx M ⊥ , Hx (x′ ) := πx⊥ A(x, x′ , F ) (20)
and

Ex : M → R,    Ex(x′) := (1/2) ∥Hx(x′)∥₂²,    (21)
we can characterise tangentially aligned BAM explanations with the following theorem.
Theorem 1. Let A : M × M × F(M ) → Rd be a BAM and x, α ∈ M , F ∈ F(M ). Then A is
tangentially aligned at x, with base-point α, if and only if Hx (α) = 0 or, equivalently, if Ex (α) = 0.
Proof. It is immediate from the definitions of Hx and Ex , since they are the projection to Tx M ⊥ of A
and a multiple of its norm, respectively.
Choosing an orthonormal basis
{τ1 , . . . , τn , νn+1 , . . . , νd } (22)
of Tx Rd such that {τ_i}_{i=1}^n and {ν_i}_{i=n+1}^d are orthonormal bases of Tx M and Tx M⊥, respectively, we observe that

Hx(x′) = A(x, x′, F) − Σ_{i=1}^{n} ⟨A(x, x′, F), τi⟩ τi = Σ_{i=n+1}^{d} ⟨A(x, x′, F), νi⟩ νi    (23)
and

Ex(x′) = (1/2) Σ_{i=n+1}^{d} ⟨A(x, x′, F), νi⟩².    (24)
Therefore, any choice of a basis for Tx Rd, adapted to the splitting of Tx Rd into the tangent and normal spaces of M at x, will provide us with a system of equations to test for tangentially aligned explanations.
Theorem 1 provides us with a necessary condition that a base-point must satisfy to obtain a tangen-
tially aligned explanation. To observe this, suppose that there exists α ∈ M such that A(x, α, F ) is
tangentially aligned. Then, by Theorem 1, Ex (α) = 0 and since Ex (x′ ) ≥ 0 for all x′ ∈ M , it is in fact
a global minimum of Ex and, consequently, (∇Ex )(α) = 0. Moreover, its Hessian matrix HessEx is
positive definite at α.
To simplify notation, in what follows we will denote the partial derivatives with respect to xi and x′i
by ∂i and ∂i′ , respectively.
Corollary 2. It is a necessary condition for A(x, α, F ) to be tangentially aligned, that
⟨Hx (α), (∂i′ Hx )(α)⟩ = 0, (25)
for all i = 1, . . . , d.
Proof. If A(x, α, F ) is tangentially aligned, then (∇Ex )(α) = 0, which is equivalent to (∂i′ Ex )(α) = 0
for all i = 1, . . . , d. It follows from the definition of Ex that:
⟨Hx(α), (∂i′ Hx)(α)⟩ = (1/2) ∂i′⟨Hx, Hx⟩|_α = (∂i′ Ex)(α) = 0,    (26)
for all i = 1, . . . , d, as claimed.
In order to find conditions for the Hessian matrix of Ex to be positive definite, we will make use of the Geršgorin circle theorem [25] to find bounds for the eigenvalues of Hess Ex. For a given complex n × n matrix A, its i-th Geršgorin disk is the closed disk Gi(A) := D(Aii, Ri) ⊂ C, where the radius is given by

Ri = Σ_{j∈Ji} |Aij|,    Ji = {1, . . . , i − 1, i + 1, . . . , n}.    (27)
Lemma 3. Let A be a real symmetric matrix such that Aii > Ri for all i, then A is positive definite.
Lemma 3 follows immediately from [25]. The following theorem is an immediate consequence of
Corollary 2 and of Lemma 3 applied to HessEx .
Theorem 4. It is a sufficient condition for A(x, α, F ) to be tangentially aligned, that for all i
⟨Hx (α), (∂i′ Hx )(α)⟩ = 0 (28)
and that
(Hess Ex )(α)ii > Ri (α), (29)
where Ri (α) denotes the radius of the i-th Geršgorin disk of (HessEx )(α).
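Theorems 1 and 4 suggest a simple numerical recipe: rank candidate base-points by the normal-space energy Ex and, where needed, check Lemma 3's diagonal-dominance condition on a (numerically estimated) Hessian. The sketch below is an illustrative assumption rather than the authors' procedure; `attribution` stands for any BAM returning a flat attribution vector, and the rows of `N` are an orthonormal basis of Tx M⊥.

```python
import torch

def normal_energy(x, x_prime, attribution, N):
    """E_x(x') = 0.5 * ||π_x^⊥ A(x, x', F)||₂², Equation (21)."""
    a = attribution(x, x_prime)
    return 0.5 * float(((N @ a) ** 2).sum())

def best_baseline(x, candidates, attribution, N):
    """Theorem 1: smaller E_x means better tangential alignment; E_x = 0 means perfectly aligned."""
    energies = torch.tensor([normal_energy(x, c, attribution, N) for c in candidates])
    return candidates[int(energies.argmin())]

def gershgorin_positive_definite(H):
    """Lemma 3: a real symmetric matrix is positive definite if each H_ii exceeds its Geršgorin radius."""
    diag = H.diagonal()
    radii = H.abs().sum(dim=1) - diag.abs()
    return bool((diag > radii).all())
```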
4. Numerical Analysis
In this section we approximate tangential base-point choices on four well-known datasets in computer
vision: MNIST [26], Fashion-MNIST [27], CIFAR10 and FER2013 [28]. We demonstrate that the four
common base-point choices defined in Section 2.2 consistently provide explanations that are not
well aligned with the tangent space. We further demonstrate that tangentially aligned IG provides explanations with higher tangential alignment than three gradient explainability models: Gradient [29], SmoothGrad (SG) [5] and Input*Gradient (I*G) [30].
4.1. Approximating the Tangent and Normal Space
Following [9], the tangent space is approximated via a convolutional autoencoder. As discussed in [31], if we consider the decoder, dec : L → M, as a map from the latent space L to the manifold M, then the Jacobian of the decoder is a linear map between the tangent spaces of L and M:
Jdec (x) : Tx L → Tdec(x) M. (30)
The Jacobian of the decoder can be computed via back-propagation [31]. The tangent space of M is spanned by the columns of the Jacobian of dec [9]. For our work we require the normal space Tx M⊥. Given a basis for the tangent space {τ1, . . . , τn}, one can compute a basis for the normal space by

Null(τ1, . . . , τn),    (31)

where the tangent basis vectors are arranged as the rows of a matrix whose null space is computed.
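A minimal sketch of this construction, assuming a trained PyTorch decoder `dec` and a latent code `z`, is given below; computing the Jacobian with `torch.autograd.functional.jacobian` and taking the null space via an SVD are our illustrative choices, not necessarily the implementation of [9].

```python
import torch
from torch.autograd.functional import jacobian

def tangent_and_normal_bases(dec, z, tol=1e-6):
    """Approximate orthonormal bases of T_x M and T_x M^⊥ at x = dec(z)."""
    x = dec(z)
    # Jacobian of the decoder, Equation (30), flattened to a (d, n) matrix.
    J = jacobian(dec, z).reshape(x.numel(), z.numel())
    # Left singular vectors with non-negligible singular values span the tangent space;
    # the remaining columns span its orthogonal complement, i.e. the null space in Equation (31).
    U, S, _ = torch.linalg.svd(J, full_matrices=True)
    rank = int((S > tol).sum())
    T = U[:, :rank].T      # rows: orthonormal basis of T_x M
    N = U[:, rank:].T      # rows: orthonormal basis of T_x M^⊥
    return T, N
```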
4.2. Experimental Setup
We utilise the implementation of [9] to generate the tangent space with a convolutional autoencoder
and train a CNN for classification. The convolutional autoencoder has two convolutional layers with
pooling followed by a fully connected layer with ReLU activation. A two layer CNN of kernel size 3
with dropout and ReLU activation is used to perform image classification. Using the parameters of [9], n = dim(Tx M) = 144 for CIFAR10 and FER2013, and n = 10 for MNIST and Fashion-MNIST. Explainability models are produced with the PyTorch library Captum.ai [32].
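As a usage illustration (placeholder model, input and label; not the authors' exact pipeline), Captum exposes the base-point of IG through the `baselines` argument, so a tangentially aligned base-point can be passed directly alongside the comparison explainers:

```python
from captum.attr import IntegratedGradients, Saliency, InputXGradient, NoiseTunnel

def explain(model, x, baseline, label):
    """Attributions for one input x: IG with an explicit base-point, plus comparison explainers."""
    model.eval()
    attr_ig = IntegratedGradients(model).attribute(x, baselines=baseline, target=label)
    attr_grad = Saliency(model).attribute(x, target=label)
    attr_sg = NoiseTunnel(Saliency(model)).attribute(x, nt_type="smoothgrad", target=label)
    attr_ixg = InputXGradient(model).attribute(x, target=label)
    return attr_ig, attr_grad, attr_sg, attr_ixg
```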