Tangentially Aligned Integrated Gradients for User-Friendly Explanations

Lachlan Simpson1,∗, Federico Costanza2, Kyle Millar3, Adriel Cheng1,3, Cheng-Chew Lim1 and Hong Gunn Chew1

1 School of Electrical and Mechanical Engineering, The University of Adelaide, Australia
2 School of Computer and Mathematical Sciences, The University of Adelaide, Australia
3 Information Sciences Division, Defence Science and Technology Group, Australia

AICS'24: 32nd Irish Conference on Artificial Intelligence and Cognitive Science, December 09–10, 2024, Dublin, Ireland
∗ Corresponding author: lachlan.simpson@adelaide.edu.au (L. Simpson)

Abstract
Integrated gradients is prevalent within machine learning to address the black-box problem of neural networks. The explanations given by integrated gradients depend on a choice of base-point. The choice of base-point is not a priori obvious and can lead to drastically different explanations. There is a longstanding hypothesis that data lies on a low-dimensional Riemannian manifold. The quality of explanations on a manifold can be measured by the extent to which an explanation for a point lies in its tangent space. In this work, we propose that the base-point should be chosen such that it maximises the tangential alignment of the explanation. We formalise the notion of tangential alignment and provide theoretical conditions under which a base-point choice will provide explanations lying in the tangent space. We demonstrate how to approximate the optimal base-point on several well-known image classification datasets. Furthermore, we compare the optimal base-point choice with common base-points and three gradient explainability models.

Keywords
Explainable AI, XAI, Integrated Gradients, Manifold Hypothesis.

1. Introduction

Deep learning provides state-of-the-art solutions to a wide array of computer vision tasks [1]. The accuracy of deep learning comes with a trade-off in interpretability [2]. A fundamental problem of deep learning is understanding how a model reached a prediction [3]. Post hoc gradient explainability models address the black-box problem by providing an attribution of the input features to the prediction of the neural network under analysis [4]. Several gradient explainability methods exist, with the underlying assumption that analysis of the model's gradient highlights the features with the greatest impact on a prediction [5, 6].

Several metrics have been proposed to measure the quality of explainability models. In [4, 7], the authors propose the Lipschitz constant of an explainability model as a measure of explanation quality. Other works consider the extent to which an explainability model approximates the underlying neural network as a measure of quality. These metrics do not consider the user's perception of the explanations. Following Ganz et al.'s [8] notion of perceptually aligned gradients of a neural network, Bordt et al. [9] introduce perceptually aligned explanations. Bordt et al. [9] measure how perceptually aligned an explanation is by the extent to which the explanation lies in the tangent space of the data manifold.

Bordt et al.'s [9] measure of tangential explanations relies on the manifold hypothesis: the notion that data lies on a low-dimensional Riemannian manifold [10, 11, 12, 8, 13, 14]. The tangent space captures the features of an image that can be changed whilst remaining within the distribution of images. The intuition is that if an explanation lies in the tangent space at the image, the explanation will contain meaningful components of the image [9]. Bordt et al. [9] demonstrate their hypothesis with several gradient explainability models on well-known computer vision datasets.
Bordt et al. [9] further demonstrate that tangentially aligned explanations are robust to adversarial attacks.

Integrated gradients (IG) [6] is a popular explainability method employed in a wide array of computer vision tasks [15]. IG relies on a hyper-parameter known as the base-point. The choice of base-point fundamentally alters the explanation provided [16]. Base-point selection is domain dependent and chosen heuristically. The zero vector, however, is a prevalent choice in computer vision, NLP and graph machine learning [6, 17, 18]. Several works have investigated different choices of base-point; however, none are able to determine a correct choice [19]. In this work we investigate the conditions under which a choice of base-point will provide perceptually aligned explanations.

The contributions of this work are twofold:

1. We provide sufficient conditions for when integrated gradient explanations are tangentially aligned. We extend these results to any base-point attribution method.
2. We provide a framework to choose a base-point which provides meaningful explanations to the user. We compare our method with three gradient explainability models and with IG under common base-points. We demonstrate that our base-point choice provides better tangential alignment and consequently more meaningful explanations. We validate our approach on four well-known computer vision datasets.

The remainder of this work is structured as follows: Section 2 provides related work and background. Section 3 investigates theoretical conditions for tangential alignment of base-point attribution methods. Section 4 calculates base-points for tangential alignment of IG on four well-known datasets, and compares tangential IG with four common base-point choices and three gradient explainability models. We conclude in Section 5 with a discussion of future work.

2. Related Work and Background

2.1. Tangentially Aligned Integrated Gradients Explanations

Post hoc explainability models are methods for providing an attribution for the features that influence the output of a neural network. Post hoc explainability is a step towards addressing the black-box problem [6]. Base-point attribution methods (BAM) [20] are a specific class of post hoc explainability models. A BAM is a function

A : M × M × F(M) → R^d,   (1)
(x, x′, F) ↦ A(x, x′, F),   (2)

where M ⊂ R^d is a manifold, F(M) denotes the set of neural networks on M, and x, x′ ∈ M are an input and a base-point, respectively.

We will further restrict the space of BAM functions to path methods, and we will generalise the definition of path methods to be independent of coordinates. Given a closed interval I := [a, b] ⊂ R, a path γ : I → M and a unit vector v ∈ R^d, the component of a path method A^γ : M × M × F(M) → R^d in the direction of v is defined as

A^γ_v(x, x′, F) = ∫_a^b ⟨∇F(γ(t)), v⟩ ⟨γ′(t), v⟩ dt.   (3)

In this way, for a given orthonormal basis {v_1, …, v_d} of R^d, A^γ is expressed as

A^γ(x, x′, F) = Σ_{i=1}^{d} A^γ_{v_i}(x, x′, F) v_i.   (4)

In particular, for the standard orthonormal basis {e_1, …, e_d} of R^d, we obtain the usual definition

A^γ_{e_i}(x, x′, F) = ∫_a^b (∂F/∂x_i)(γ(t)) (∂γ_i/∂t)(t) dt.   (5)
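To make the coordinate expression in Equation (5) concrete, the following is a minimal PyTorch sketch of a path-method attribution for an arbitrary path. It assumes the network F is given as a callable `model` returning a scalar score for a single flattened input, and that `path(t)` returns γ(t) for t ∈ [0, 1]; the number of quadrature steps and the finite-difference estimate of γ′ are illustrative choices, not part of the definition.

```python
import torch

def path_attribution(model, path, n_steps=64):
    """Riemann-sum approximation of the path method in Equation (5).

    `model` maps a flat input tensor to a scalar score F(x); `path(t)`
    returns the point gamma(t) for t in [0, 1] as a 1-D tensor.
    """
    ts = torch.linspace(0.0, 1.0, n_steps + 1)
    attribution = torch.zeros_like(path(ts[0]))
    for t0, t1 in zip(ts[:-1], ts[1:]):
        point = path(0.5 * (t0 + t1)).detach().requires_grad_(True)
        grad = torch.autograd.grad(model(point), point)[0]   # gradient of F at the midpoint of the segment
        velocity = (path(t1) - path(t0)) / (t1 - t0)          # finite-difference estimate of the path velocity
        attribution += grad * velocity * (t1 - t0)            # per-coordinate integrand times the step size
    return attribution
```

For example, `path_attribution(model, lambda t: x_base + t * (x - x_base))`, with hypothetical tensors `x_base` and `x`, uses the straight-line path that defines integrated gradients, introduced next.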
The prominent path method, integrated gradients [6], takes γ to be the straight line between the points x, x′ ∈ M. For any pair of points x, x′ ∈ M, a neural network F ∈ F(M), and a unit vector v, the v component of integrated gradients at x is defined as

IG_v(x, x′, F) := ⟨x − x′, v⟩ ∫_0^1 ⟨∇F(x′ + t(x − x′)), v⟩ dt.   (6)

Letting I : M × M × F(M) → R^d be the map defined by

I(x, x′, F) := ∫_0^1 (∇F)(x′ + t(x − x′)) dt,   (7)

integrated gradients can be expressed succinctly in the standard orthonormal basis of R^d as

IG(x, x′, F) = (x − x′) ⊙ I(x, x′, F),   (8)

where ⊙ denotes the Hadamard product.

Several metrics have been proposed to measure the quality of explainability models. In [7, 4], Lipschitzness is proposed as a measure of explanation quality. Other works consider the extent to which an explainability model approximates the neural network as a measure of quality. Bordt et al. [9] propose the extent to which an explanation lies in the tangent space of the manifold as a measure of explanation quality. Attributions which lie in the tangent space were demonstrated to constitute the meaningful features that contribute to a prediction [8, 9], whereas orthogonal attributions were closer to random noise. The hypothesis that tangential explanations provide meaningful explanations is validated on several image classification datasets and in a user study [9].

Here, tangentially aligned explanations are formalised. For the remainder of this work we consider R^d, equipped with its standard inner product ⟨·, ·⟩, and we let M ⊂ R^d be a manifold of dimension n < d. We also write ⟨·, ·⟩ for the restriction of the inner product of R^d to M, such that (M, ⟨·, ·⟩) is an embedded Riemannian submanifold of (R^d, ⟨·, ·⟩). We denote the tangent space of M at a point x by T_x M which, in the context of data manifolds, consists of all v ∈ R^d such that x + v is "close" to M, with ∥v∥_2 small [9]. Lastly, making use of the inner product of R^d, for each x ∈ M we have the orthogonal direct sum decomposition

T_x R^d = T_x M ⊕ T_x M^⊥,  where  T_x M^⊥ := {u ∈ T_x R^d : ⟨u, v⟩ = 0, ∀v ∈ T_x M}.   (9)

We let π_x : T_x R^d → T_x M denote the natural projection from T_x R^d to T_x M, defined by

π_x(v) = Σ_{ℓ=1}^{n} ⟨v, τ_ℓ⟩ τ_ℓ,   (10)

where {τ_1, …, τ_n} is an orthonormal basis for T_x M. We define the map µ_x : T_x R^d → [0, 1] given by

µ_x(v) := ∥π_x v∥_2^2 / ∥v∥_2^2,  v ∈ T_x R^d.   (11)

The map defined in Equation (11) provides a measure of "how much" of a vector lies in the tangent space of M at x: a vector v is in T_x M if and only if µ_x(v) = 1 and, on the other hand, v is in T_x M^⊥ if and only if µ_x(v) = 0, which can be observed directly from the definition. Moreover, letting π_x^⊥ : T_x R^d → T_x M^⊥ denote the natural projection and noting that

v = π_x v + π_x^⊥ v,   (12)
∥v∥_2^2 = ∥π_x v∥_2^2 + ∥π_x^⊥ v∥_2^2,   (13)

we can express µ_x as

µ_x(v) = ∥π_x v∥_2^2 / (∥π_x v∥_2^2 + ∥π_x^⊥ v∥_2^2),  v ∈ T_x R^d.   (14)

Minimising the norm of the projection onto T_x M^⊥ therefore provides a framework to ensure tangential alignment.
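As a concrete illustration of Equations (8) and (11), the following PyTorch sketch computes straight-line integrated gradients by a Riemann sum and then measures the fraction of an attribution lying in the tangent space. It assumes a flattened input, a model returning a scalar class score, and a matrix `tangent_basis` whose rows form an orthonormal basis {τ_1, …, τ_n} of T_xM; these names and the number of integration steps are illustrative rather than taken from the paper.

```python
import torch

def integrated_gradients(model, x, baseline, n_steps=64):
    """Equation (8): IG(x, x', F) = (x - x') Hadamard-times the mean gradient along the segment."""
    grads = torch.zeros_like(x)
    for t in torch.linspace(0.0, 1.0, n_steps):
        point = (baseline + t * (x - baseline)).detach().requires_grad_(True)
        grads += torch.autograd.grad(model(point), point)[0]
    return (x - baseline) * grads / n_steps

def tangential_fraction(v, tangent_basis):
    """Equation (11): squared norm of the tangential projection of v over the squared norm of v,
    for an orthonormal tangent basis stacked as the rows of `tangent_basis`."""
    coefficients = tangent_basis @ v           # inner products of v with each basis vector
    return (coefficients @ coefficients) / (v @ v)
```

A base-point `alpha` can then be judged by `tangential_fraction(integrated_gradients(model, x, alpha), tangent_basis)`, which is exactly the quantity the rest of the paper seeks to drive towards 1.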
2.2. Base-point Selection for Integrated Gradients

The attribution of IG depends on the base-point chosen. Base-point selection is domain dependent and chosen heuristically. Here we review common base-point choices as provided by [21]; a sketch of all four choices follows the list.

1. Zero. The base-point for all points is the constant zero vector,

   α^zero = 0.   (15)

   In general, the zero base-point can be replaced by any constant vector.

2. Maximum distance. For a given input x ∈ M, the base-point is the point in M of maximum distance from x, i.e.

   α_x^max = argmax_{y ∈ M} ∥x − y∥_p.   (16)

   Usually p = 1 or 2.

3. Uniform. Each coordinate is sampled uniformly over a valid range of M,

   α_i^uniform ∼ U(min_i, max_i).   (17)

4. Gaussian. A Gaussian filter is applied to the input x,

   α^Gaussian = σ · v + x,   (18)

   where v_i ∼ N(0, 1) and σ ∈ R. We require that α^Gaussian remains within the data distribution, noting that α^Gaussian → α^uniform as σ → ∞ [21].

The zero base-point (Equation 15) will not highlight aspects of the image which may be important if the object of interest contains black pixels [21, 22]. To address the issue of a constant base-point missing important features, the maximum distance base-point (Equation 16) was proposed in [21]. Maximum distance takes the furthest point (in ℓ_p distance) from the input image, so that the base-point does not contain important information of the input. Another alternative is to sample a base-point from a distribution such as the uniform (Equation 17) or Gaussian (Equation 18) distributions [21, 23]. Despite the various choices of base-point, we demonstrate that none of the aforementioned base-points provide perceptually aligned explanations.

Zaher et al. [24] propose Manifold Integrated Gradients (MIG). MIG replaces the straight line in IG with a geodesic, so that the attribution path lies on the Riemannian manifold. Whilst MIG addresses the problem of IG not conforming to the geometry of the data, MIG does not resolve the issue of base-point choice, nor does MIG ensure that the attribution lies in the tangent space of the manifold.
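The four baselines above can be sketched as follows in PyTorch, assuming a flattened input tensor `x` and, for the maximum-distance baseline, an iterable `dataset` of candidate points; the default range, σ and p are illustrative rather than prescribed by the paper. In the experiments of Section 4 the base-points are computed with the implementation of [21]; the sketch only fixes the definitions.

```python
import torch

def zero_baseline(x):
    """Equation (15): the constant zero vector."""
    return torch.zeros_like(x)

def max_distance_baseline(x, dataset, p=2):
    """Equation (16): the candidate point furthest from x in l_p distance."""
    distances = torch.stack([torch.dist(x, y, p=p) for y in dataset])
    return dataset[int(distances.argmax())]

def uniform_baseline(x, low=0.0, high=1.0):
    """Equation (17): each coordinate drawn uniformly over the valid pixel range."""
    return torch.empty_like(x).uniform_(low, high)

def gaussian_baseline(x, sigma=0.5):
    """Equation (18): sigma * v + x with v ~ N(0, I); sigma controls how far the
    baseline drifts from the input (and from the data distribution)."""
    return x + sigma * torch.randn_like(x)
```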
3. Optimising the Base-point for Tangentially Aligned Explanations

Throughout this section, we study the map defined in Equation (11) to identify possible choices of base-point for which the attribution given by a BAM is tangent to M at a point. To be precise, for a given BAM, we want to find α ∈ M such that the map

x′ ↦ µ_x(A(x, x′, F))   (19)

attains its maximum and, particularly, when this maximum value is equal to 1. We note that α = x is always a solution; however, we will always require α ≠ x for non-trivial solutions.

Definition 1. Let A : M × M × F(M) → R^d be a BAM and x, α ∈ M, F ∈ F(M). A is tangentially aligned at x, with base-point α, if µ_x(A(x, α, F)) = 1.

In the remainder of this section x ∈ M and F ∈ F(M) will be fixed, unless otherwise stated. Letting π_x^⊥ : T_x R^d → T_x M^⊥ denote the natural projection and defining the maps

H_x : M → T_x M^⊥,  H_x(x′) := π_x^⊥ A(x, x′, F)   (20)

and

E_x : M → R,  E_x(x′) := (1/2) ∥H_x(x′)∥_2^2,   (21)

we can characterise tangentially aligned BAM explanations with the following theorem.

Theorem 1. Let A : M × M × F(M) → R^d be a BAM and x, α ∈ M, F ∈ F(M). Then A is tangentially aligned at x, with base-point α, if and only if H_x(α) = 0 or, equivalently, if E_x(α) = 0.

Proof. It is immediate from the definitions of H_x and E_x, since they are, respectively, the projection of A onto T_x M^⊥ and a multiple of its squared norm.

Choosing an orthonormal basis

{τ_1, …, τ_n, ν_{n+1}, …, ν_d}   (22)

of T_x R^d such that {τ_i}_{i=1}^{n} and {ν_i}_{i=n+1}^{d} are orthonormal bases of T_x M and T_x M^⊥, respectively, we observe that

H_x(x′) = A(x, x′, F) − Σ_{i=1}^{n} ⟨A(x, x′, F), τ_i⟩ τ_i = Σ_{i=n+1}^{d} ⟨A(x, x′, F), ν_i⟩ ν_i   (23)

and

E_x(x′) = (1/2) Σ_{i=n+1}^{d} ⟨A(x, x′, F), ν_i⟩^2.   (24)

Therefore, any choice of basis for T_x R^d adapted to the splitting of T_x R^d into the tangent and normal spaces of M at x provides a system of equations to test for tangentially aligned explanations.

Theorem 1 provides a necessary condition that a base-point must satisfy to obtain a tangentially aligned explanation. To observe this, suppose that there exists α ∈ M such that A(x, α, F) is tangentially aligned. Then, by Theorem 1, E_x(α) = 0 and, since E_x(x′) ≥ 0 for all x′ ∈ M, α is in fact a global minimum of E_x; consequently, (∇E_x)(α) = 0. Moreover, its Hessian matrix Hess E_x is positive semi-definite at α. To simplify notation, in what follows we denote the partial derivatives with respect to x_i and x′_i by ∂_i and ∂′_i, respectively.

Corollary 2. It is a necessary condition for A(x, α, F) to be tangentially aligned that

⟨H_x(α), (∂′_i H_x)(α)⟩ = 0   (25)

for all i = 1, …, d.

Proof. If A(x, α, F) is tangentially aligned, then (∇E_x)(α) = 0, which is equivalent to (∂′_i E_x)(α) = 0 for all i = 1, …, d. It follows from the definition of E_x that

⟨H_x(α), (∂′_i H_x)(α)⟩ = (1/2) ∂′_i ⟨H_x, H_x⟩|_α = (∂′_i E_x)(α) = 0   (26)

for all i = 1, …, d, as claimed.

In order to find conditions for the Hessian matrix of E_x to be positive definite, we make use of the Geršgorin circle theorem [25] to bound the eigenvalues of Hess E_x. For a given complex n × n matrix A, its i-th Geršgorin disk is the closed disk G_i(A) := D(A_ii, R_i) ⊂ C, where the radius is given by

R_i = Σ_{j ∈ J_i} |A_ij|,  J_i = {1, …, i − 1, i + 1, …, n}.   (27)

Lemma 3. Let A be a real symmetric matrix such that A_ii > R_i for all i. Then A is positive definite.

Lemma 3 follows immediately from [25]. The following theorem is an immediate consequence of Corollary 2 and of Lemma 3 applied to Hess E_x.

Theorem 4. It is a sufficient condition for A(x, α, F) to be tangentially aligned that, for all i,

⟨H_x(α), (∂′_i H_x)(α)⟩ = 0   (28)

and

(Hess E_x)(α)_ii > R_i(α),   (29)

where R_i(α) denotes the radius of the i-th Geršgorin disk of (Hess E_x)(α).

4. Numerical Analysis

In this section we approximate tangential base-point choices on four well-known computer vision datasets: MNIST [26], Fashion-MNIST [27], CIFAR10 and FER2013 [28]. We demonstrate that the four common base-point choices defined in Section 2.2 consistently provide explanations that are not well aligned with the tangent space. We further demonstrate that tangentially aligned IG provides explanations with higher tangential alignment than three gradient explainability models: Gradient [29], Smooth Grad (SG) [5] and Input*Gradient (I*G) [30].

4.1. Approximating the Tangent and Normal Space

Following [9], the tangent space is approximated via a convolutional autoencoder. As discussed in [31], if we consider the decoder, dec : L → M, as a map from the latent space L to the manifold M, then the Jacobian of the decoder is a linear map between the tangent spaces of L and M,

J_dec(x) : T_x L → T_dec(x) M.   (30)

The Jacobian of the decoder can be computed via back-propagation [31]. The tangent space of M is spanned by the gradient of dec [9]. For our work we also require the normal space T_x M^⊥. Given a basis {τ_1, …, τ_n} for the tangent space, one can compute a basis for the normal space as

Null(τ_1, …, τ_n),   (31)

where the basis of the tangent space is regarded as a matrix.

4.2. Experimental Setup

We use the implementation of [9] to generate the tangent space with a convolutional autoencoder and to train a CNN for classification. The convolutional autoencoder has two convolutional layers with pooling, followed by a fully connected layer with ReLU activation. A two-layer CNN with kernel size 3, dropout and ReLU activation is used to perform image classification. Using the parameters of [9], n = dim(T_x M) = 144 for CIFAR10 and FER2013, and n = 10 for MNIST32 and Fashion-MNIST. Explainability models are produced with the PyTorch library Captum.ai [32].
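The following sketch illustrates Equations (30) and (31): the tangent space is read off the decoder Jacobian, and the normal space is its orthogonal complement. Here the null-space computation is carried out with an SVD of the Jacobian, which returns orthonormal bases of both spaces at once; `decoder` and the latent code `z` are assumed to describe a single (unbatched) point, and this is a sketch rather than the exact routine used in the experiments, which rely on the implementation of [9].

```python
import torch

def tangent_and_normal_bases(decoder, z):
    """Equations (30)-(31): the columns of the decoder Jacobian span T_xM;
    the orthogonal complement of their span is the normal space."""
    # J has shape (d, n): column i is the tangent direction d dec(z) / d z_i.
    jacobian = torch.autograd.functional.jacobian(lambda u: decoder(u).flatten(), z)
    # Left-singular vectors give an orthonormal basis of R^d adapted to the splitting
    # (assuming the Jacobian has full column rank).
    U, _, _ = torch.linalg.svd(jacobian, full_matrices=True)
    n = jacobian.shape[1]
    tangent_basis = U[:, :n].T    # orthonormal basis of the tangent space, shape (n, d)
    normal_basis = U[:, n:].T     # orthonormal basis of the normal space,  shape (d - n, d)
    return tangent_basis, normal_basis
```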
[Figure 1 panels: FER2013, CIFAR10, MNIST, Fashion-MNIST; x-axis: fraction in the tangent space, y-axis: density. (a) Base-point choices: Tangential, Uniform, Maxdist ℓ2, Gaussian, Zero. (b) Gradient models: Gradient, Smooth Grad, Input*Gradient, Tangential IG.]

Figure 1: Kernel density estimate plot of the fraction of the explanation in the tangent space with (a) different base-point choices and (b) different gradient explainability models. The fraction of the explanation in the tangent space is measured with µ_x (Equation 11). The vertical line represents the expected fraction a random vector lies in the tangent space, ≈ n/d, where n = dim(T_x M) and d is the dimension of the ambient space. On CIFAR10 and FER2013, n = 144. On MNIST32 and Fashion-MNIST, n = 10.

4.3. Complexity Analysis

The problem of finding a base-point that gives tangentially aligned explanations can be phrased as

α_x^∗ = argmin_{α ≠ x} E_x(α).   (32)

If we suppose M ⊆ R^d is compact, then by Weierstrass's theorem such an α^∗ exists [33]. The continuity of E_x follows from the continuity of IG and of norms. The condition α ≠ x is required to ensure non-trivial solutions. A solution to the optimisation problem in Equation 32 can be approximated via gradient descent; a sketch is given below. We note that the zero, Gaussian and uniform base-points have constant time O(1) complexity. Maximum ℓ2 distance is O(|D|), where |D| is the number of points in the dataset. Calculating a tangential base-point has complexity O(ε), where ε is the number of iterations of gradient descent used to solve Equation 32. IG with base-point α_x^∗ from Equation 32 will be referred to as tangentially aligned IG.
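A minimal PyTorch sketch of the gradient-descent approximation of Equation (32) is given below. It assumes a model returning a scalar class score for a flattened input, an orthonormal basis of the normal space stacked as the rows of `normal_basis`, and an initial guess `alpha0`; the learning rate, iteration counts, and the omission of a projection back onto the data manifold (and of the constraint α ≠ x) are simplifications rather than the exact procedure used in the experiments.

```python
import torch

def tangential_baseline(model, x, normal_basis, alpha0,
                        lr=0.05, n_iters=200, ig_steps=32):
    """Approximate the minimiser of E_x(alpha) = 0.5 * ||projection of IG(x, alpha, F)
    onto the normal space||^2 by gradient descent (Equation 32)."""
    alpha = alpha0.detach().clone().requires_grad_(True)
    optimiser = torch.optim.SGD([alpha], lr=lr)
    for _ in range(n_iters):
        optimiser.zero_grad()
        # Differentiable Riemann-sum IG with base-point alpha (Equation 8).
        grads = torch.zeros_like(x)
        for t in torch.linspace(0.0, 1.0, ig_steps):
            point = alpha + t * (x - alpha)
            grads = grads + torch.autograd.grad(model(point), point, create_graph=True)[0]
        ig = (x - alpha) * grads / ig_steps
        # Energy E_x: squared norm of the projection onto the normal space (Equation 21).
        loss = 0.5 * (normal_basis @ ig).pow(2).sum()
        loss.backward()
        optimiser.step()
    return alpha.detach()
```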
[Figure 2 panels, per row: Ground Truth, Tangential, Zero, Maxdist, Gaussian, Uniform. Row 1: µ_x = 0.9446, 0.2432, 0.1070, 0.1280, 0.1806. Row 2: µ_x = 0.9536, 0.0567, 0.1035, 0.0492, 0.0764. Row 3: µ_x = 0.9184, 0.1616, 0.1142, 0.1298, 0.1212.]

Figure 2: Attributions of IG with differing base-point choices on example points from MNIST, FER2013, and Fashion-MNIST. The fraction of the explanation in the tangent space is denoted by µ_x (Equation 11).

4.4. Comparison of Different Base-points with Tangential Integrated Gradients

For each dataset, IG is applied to the CNN with the base-points defined in Section 2.2, and the fraction of each explanation in the tangent space is calculated via Equation 11. To calculate each base-point in Section 2.2 we use the implementation provided by [21]. For each point we approximate the solution to Equation 32 to provide tangential explanations on each dataset. We use the same learning rate and number of iterations for all points; some points may require a different learning rate and number of iterations to achieve higher tangential alignment, which we leave to future work.

In Figure 1a we show the distributions of the fraction in the tangent space on FER2013, CIFAR10, MNIST32 and Fashion-MNIST. We see in Figure 1a that approximating solutions to Equation 32 consistently provides explanations with high tangential alignment. We also see in Figure 1a that the uniform base-point provides explanations consistently close to the normal space, followed by maximum ℓ2 distance and Gaussian. We note that on FER2013 and CIFAR10 the Gaussian base-point performs better than the zero, uniform, and maximum ℓ2 distance base-points. The better performance of the Gaussian base-point is likely due to the smoothing parameter σ defined in Section 2.2. It is the goal of future work to determine the impact of σ on tangential alignment.

The vertical lines in Figure 1a indicate the expected fraction a random vector will lie in the tangent space. The expectation is approximately n/d, where n and d are the dimensions of the tangent space approximation and of the ambient space, respectively [9]. An explanation is therefore sufficiently aligned with the tangent space when its fraction in the tangent space is greater than n/d. We see in Figure 1a that the standard base-point choices on CIFAR10 are significantly below the vertical line. It is the goal of future work to determine whether the dimension of the tangent space approximation for CIFAR10 (n = 144) or the parameter of the Gaussian base-point impacts the tangential alignment of IG on CIFAR10.

We provide in Figure 2 example integrated gradient explanations for a point in each of MNIST32, FER2013, and Fashion-MNIST with differing base-point choices. We see that our method provides tangentially aligned explanations with µ_x > 0.91 for all datasets. The tangentially aligned integrated gradient attributions are clear and perceptually aligned with the object to be classified in the image. We see in Figure 2 that the uniform, maximum ℓ2 distance, and Gaussian attributions are consistently close to random noise.

4.5. Comparison of Gradient Explainability Models with Tangential Integrated Gradients

In this section we compare tangentially aligned integrated gradients with three common gradient explainability models: Gradient, Smooth Grad and Input*Gradient. These gradient explainability models do not require a base-point choice. We demonstrate that tangentially aligned integrated gradients significantly improves upon integrated gradients. The gradient explainability models for a given model f are defined as follows; a minimal sketch of all three follows the list.

1. Gradient. The gradient of a model f at x ∈ R^d for class i is defined as

   grad(x)_i := ∂f(x)_i / ∂x.   (33)

2. Smooth Grad. We define Smooth Grad with n samples and standard deviation σ as

   SmoothGrad(x) = (1/n) Σ_{i=1}^{n} ∇f(x + a_i),   (34)

   where a_i ∼ N(0, σ^2). Following [9] we take σ = 0.02 and n = 25.

3. Input*Gradient. Input*Gradient is defined as

   Input∗Gradient(x)_i := x ⊙ ∂f(x)_i / ∂x.   (35)
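The three baselines above can be written directly with PyTorch autograd. The sketch below assumes an unbatched, flattened input and a model returning a vector of class scores, with σ = 0.02 and 25 samples for Smooth Grad as in the setup above; in the experiments these attributions are produced with Captum [32], and the sketch only fixes the definitions in Equations (33)–(35).

```python
import torch

def gradient_explanation(model, x, target):
    """Equation (33): gradient of the class-`target` score with respect to x."""
    x = x.detach().clone().requires_grad_(True)
    return torch.autograd.grad(model(x)[target], x)[0]

def smooth_grad(model, x, target, sigma=0.02, n_samples=25):
    """Equation (34): average gradient over Gaussian perturbations of x."""
    total = torch.zeros_like(x)
    for _ in range(n_samples):
        noisy = (x.detach() + sigma * torch.randn_like(x)).requires_grad_(True)
        total += torch.autograd.grad(model(noisy)[target], noisy)[0]
    return total / n_samples

def input_times_gradient(model, x, target):
    """Equation (35): elementwise product of the input with its gradient."""
    return x * gradient_explanation(model, x, target)
```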
In Figure 1b we provide density plots of the fraction of an attribution in the tangent space for Gradient, Smooth Grad, Input*Gradient and tangentially aligned integrated gradients. We see in Figure 1b that tangentially aligned integrated gradients provides attributions consistently in the tangent space, out-performing the aforementioned gradient explainability models. In Figures 1a and 1b, Gradient, Smooth Grad and Input*Gradient provide better tangential alignment than the common base-point choices of Section 2.2 on MNIST and CIFAR10. On Fashion-MNIST we see that the zero base-point choice provides performance comparable to Gradient, Smooth Grad and Input*Gradient. In Figures 1a and 1b we see that on FER2013, Gradient, Smooth Grad and Input*Gradient perform similarly to the Gaussian, maximum ℓ2 distance and zero base-point choices for integrated gradients, and all gradient models on FER2013 outperform the uniform base-point choice. Overall, Gradient, Smooth Grad and Input*Gradient tend to out-perform integrated gradients with the standard base-point choices, and tangential integrated gradients out-performs both the aforementioned gradient explainability models and the standard base-point choices.

5. Conclusions and Future Work

In this work we investigated how to choose base-points for IG that provide tangentially aligned explanations. We provided theoretical conditions for a base-point to provide tangentially aligned explanations for any BAM. We demonstrated how to numerically approximate the base-point which provides tangentially aligned explanations and validated this approach on several well-known image classification datasets. In future work we seek to further investigate the theoretical conditions a base-point must satisfy to provide tangential explanations.

Acknowledgments

The Commonwealth of Australia (represented by the Defence Science and Technology Group) supports this research through a Defence Science Partnerships agreement. Lachlan Simpson is supported by a scholarship from the University of Adelaide.

References

[1] J. Redmon, S. Divvala, R. Girshick, A. Farhadi, You Only Look Once: Unified, Real-Time Object Detection, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016).
[2] C. Zednik, Solving the Black Box Problem: A Normative Framework for Explainable Artificial Intelligence, Philosophy & Technology 34 (2021) 265–288.
[3] T. J. Sejnowski, The Unreasonable Effectiveness of Deep Learning in Artificial Intelligence, Proceedings of the National Academy of Sciences 117 (2020) 30033–30038.
[4] L. Simpson, K. Millar, A. Cheng, C.-C. Lim, H. G. Chew, Probabilistic Lipschitzness and the Stable Rank for Comparing Explanation Models, arXiv preprint arXiv:2402.18863 (2024).
[5] D. Smilkov, N. Thorat, B. Kim, F. Viégas, M. Wattenberg, SmoothGrad: Removing Noise by Adding Noise, arXiv preprint arXiv:1706.03825 (2017).
[6] M. Sundararajan, A. Taly, Q. Yan, Axiomatic Attribution for Deep Networks, Proceedings of the 34th International Conference on Machine Learning (ICML) 70 (2017) 3319–3328.
[7] Z. Khan, D. Hill, A. Masoomi, J. Bone, J. Dy, Analyzing Explainer Robustness via Lipschitzness of Prediction Functions, arXiv preprint arXiv:2206.12481 (2023).
[8] R. Ganz, B. Kawar, M. Elad, Do Perceptually Aligned Gradients Imply Robustness?, in: Proceedings of the 40th International Conference on Machine Learning, volume 202 of Proceedings of Machine Learning Research, PMLR, 2023, pp. 10628–10648.
[9] S. Bordt, U. Upadhyay, Z. Akata, U. von Luxburg, The Manifold Hypothesis for Gradient-Based Explanations, in: 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2023, pp. 3697–3702.
[10] N. Whiteley, A. Gray, P. Rubin-Delanchy, Statistical Exploration of the Manifold Hypothesis, arXiv preprint arXiv:2208.11665 (2024).
[11] C. Fefferman, S. Mitter, H. Narayanan, Testing the Manifold Hypothesis, Journal of the American Mathematical Society 29 (2016) 983–1049.
[12] I. J. Goodfellow, Y. Bengio, A. Courville, Deep Learning, MIT Press, Cambridge, MA, USA, 2016.
[13] D. Tsipras, S. Santurkar, L. Engstrom, A. Turner, A. Madry, Robustness May Be at Odds with Accuracy, International Conference on Learning Representations (2019).
[14] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, A. Vladu, Towards Deep Learning Models Resistant to Adversarial Attacks, International Conference on Learning Representations (2018).
[15] A. Das, P. Rad, Opportunities and Challenges in Explainable Artificial Intelligence (XAI): A Survey, arXiv preprint arXiv:2006.11371 (2020).
[16] P.-J. Kindermans, S. Hooker, J. Adebayo, M. Alber, K. T. Schütt, S. Dähne, D. Erhan, B. Kim, The (Un)reliability of Saliency Methods, arXiv preprint arXiv:1711.00867 (2017).
[17] P. Xenopoulos, G. Chan, H. Doraiswamy, L. G. Nonato, B. Barr, C. Silva, GALE: Globally Assessing Local Explanations, in: Proceedings of Topological, Algebraic, and Geometric Learning Workshops 2022, volume 196 of Proceedings of Machine Learning Research (PMLR), 2022, pp. 322–331.
[18] B. Sanchez-Lengeling, J. Wei, B. Lee, E. Reif, P. Wang, W. Qian, K. McCloskey, L. Colwell, A. Wiltschko, Evaluating Attribution for Graph Neural Networks, in: Advances in Neural Information Processing Systems, volume 33, 2020, pp. 5898–5910.
[19] D. Drakard, R. Liu, J. Yosinski, Exploring Unfairness in Integrated Gradients Based Attribution Methods, OpenReview (2022).
[20] D. Lundstrom, T. Huang, M. Razaviyayn, A Rigorous Study of Integrated Gradients Method and Extensions to Internal Neuron Attributions, Proceedings of the 39th International Conference on Machine Learning 162 (2022) 14485–14508.
[21] P. Sturmfels, S. Lundberg, S.-I. Lee, Visualizing the Impact of Feature Attribution Baselines, Distill (2020). https://distill.pub/2020/attribution-baselines.
[22] M. Sundararajan, A. Taly, A Note About: Local Explanation Methods for Deep Neural Networks Lack Sensitivity to Parameter Values, arXiv preprint arXiv:1806.04205 (2018).
[23] R. C. Fong, A. Vedaldi, Interpretable Explanations of Black Boxes by Meaningful Perturbation, in: 2017 IEEE International Conference on Computer Vision (ICCV), IEEE, 2017, pp. 3449–3457.
[24] E. Zaher, M. Trzaskowski, Q. Nguyen, F. Roosta, Manifold Integrated Gradients: Riemannian Geometry for Feature Attribution, arXiv preprint arXiv:2405.09800 (2024).
[25] S. Geršgorin, Über die Abgrenzung der Eigenwerte einer Matrix, Bulletin de l'Académie des Sciences de l'URSS (1931) 749–754.
[26] L. Deng, The MNIST Database of Handwritten Digit Images for Machine Learning Research [Best of the Web], IEEE Signal Processing Magazine 29 (2012) 141–142.
[27] H. Xiao, K. Rasul, R. Vollgraf, Fashion-MNIST: A Novel Image Dataset for Benchmarking Machine Learning Algorithms, arXiv preprint arXiv:1708.07747 (2017).
[28] I. J. Goodfellow, D. Erhan, P. L. Carrier, A. Courville, M. Mirza, B. Hamner, W. Cukierski, Y. Tang, D. Thaler, D.-H. Lee, Y. Zhou, C. Ramaiah, F. Feng, R. Li, X. Wang, D. Athanasakis, J. Shawe-Taylor, M. Milakov, J. Park, R. Ionescu, M. Popescu, C. Grozea, J. Bergstra, J. Xie, L. Romaszko, B. Xu, Z. Chuang, Y. Bengio, Challenges in Representation Learning: A Report on Three Machine Learning Contests, arXiv preprint arXiv:1307.0414 (2013).
[29] K. Simonyan, A. Vedaldi, A. Zisserman, Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps, arXiv preprint arXiv:1312.6034 (2014).
[30] A. Shrikumar, P. Greenside, A. Shcherbina, A. Kundaje, Not Just a Black Box: Learning Important Features Through Propagating Activation Differences, arXiv preprint arXiv:1605.01713 (2017).
[31] H. Shao, A. Kumar, P. T. Fletcher, The Riemannian Geometry of Deep Generative Models, in: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2018, pp. 428–4288.
[32] N. Kokhlikyan, V. Miglani, M. Martin, E. Wang, B. Alsallakh, J. Reynolds, A. Melnikov, N. Kliushkina, C. Araya, S. Yan, O. Reblitz-Richardson, Captum: A Unified and Generic Model Interpretability Library for PyTorch, arXiv preprint arXiv:2009.07896 (2020).
[33] W. Rudin, Principles of Mathematical Analysis, McGraw Hill, 1976.