<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Concepts Guide and Explain Diffusion Visual Counterfactuals</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Franz Motzkus</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ute Schmid</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Continental Automotive GmbH</institution>
          ,
          <addr-line>Max-Urich-Straße 3, 13355 Berlin</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Universität Bamberg</institution>
          ,
          <addr-line>An der Weberei 5, 96047 Bamberg</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Diffusion models enable the generation of diverse yet realistic image features, which is crucial in counterfactual generation for answering “what if” questions of what needs to change to make an image classifier change its prediction. Current methods generate authentic counterfactuals, but lack transparency regarding the feature changes. To address this limitation, we introduce Concept-guided Latent Diffusion Counterfactual Explanations (CoLa-DCE), a concept-guided approach for any classifier that provides a high degree of control over concept selection and spatial conditioning. The counterfactuals comprise an increased granularity through minimal feature changes, improved comprehensibility through feature visualization, and increased transparency by localizing feature changes. We extensively demonstrate the advantages of our approach in minimality and interpretability across multiple datasets, classification models, and diffusion models, and show how our CoLa-DCE explanations make model errors like misclassification cases comprehensible.</p>
      </abstract>
      <kwd-group>
        <kwd>explainable AI (xAI)</kwd>
        <kwd>Counterfactuals</kwd>
        <kwd>Image-to-Image Diffusion</kwd>
        <kwd>Concept Encodings</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Recent advancements in generative models have sparked new interest in counterfactual explanations for
computer vision tasks [
        <xref ref-type="bibr" rid="ref1 ref2 ref3">1, 2, 3</xref>
        ]. By answering what would need to change to induce a different outcome,
counterfactual explanations are well-aligned with human reasoning [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ] and are deemed plausible if
they are consistent with the user’s beliefs, realistic, and require minimal effort to change towards the
counterfactual [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Diffusion models generate realistic high-resolution images with diverse features
within the data distribution [
        <xref ref-type="bibr" rid="ref6 ref7 ref8">6, 7, 8</xref>
        ], making them well suited for generating counterfactual images [
        <xref ref-type="bibr" rid="ref1 ref2 ref3">1, 3, 2</xref>
        ].
Previous diffusion-based counterfactuals optimize all image features but lack clarity on which specific features
change and how these relate to the model prediction, making it challenging to detect and track feature
changes. As humans perceive minimal counterfactual differences semantically rather than pixel-wise,
defining minimality by the number of feature changes is better suited.
      </p>
      <p>CoLa-DCE solves both problems: We guide the counterfactual generation with a restricted number of
semantic concepts, further enabling a high level of control through concept selection. We additionally include
feature visualization capabilities, allowing for direct comprehensibility of the features that represent the
difference between the original and the counterfactual class. Hereby, CoLa-DCE provides semantic as
well as spatial guidance and visualization, simultaneously enabling control and better transparency.
1. We introduce CoLa-DCE for the diffusion-based generation of counterfactuals using semantic
concept guidance. We show how local counterfactual targets and concept-guided feature changes
derived from the classifier’s perception increase the quality of counterfactuals.
2. We extend our concept guidance with spatial conditioning, guiding and revealing localized feature
changes that are made comprehensible via concept visualization and localization maps.
3. We show how CoLa-DCE assists in model debugging by making cases of misclassification more
understandable by exposing feature-level information.
</p>
      <p>(Figure 1: concept gradients for the counterfactual explanation from “Hen” to “Cock”.)</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <sec id="sec-2-1">
        <title>2.1. Counterfactual Image Generation</title>
        <p>
          Diverse approaches for computing image counterfactuals exist. CVE [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] replaces feature regions in an
image with matching image patches from a distractor image of the counterfactual class. Other methods
directly optimize the image using specific loss functions [
          <xref ref-type="bibr" rid="ref10 ref11">10, 11</xref>
          ] or use an autoencoder to control the
optimization or modification in a disentangled latent space [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] or simplified interpretable space [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ].
        </p>
        <p>
          DiME [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] introduces diffusion models for generating counterfactuals, using a classifier to guide
the diffusion process. However, it is limited to small perturbations, as required in the CelebA dataset.
ACE [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] is a two-step process consisting of computing pre-explanations and refining them. A
localization mask for the most probable feature change is computed before repainting the image by combining
the generated counterfactual within the mask with the original image outside.
        </p>
        <p>
          DVCE [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] includes an additional robust classifier to relax the robustness constraint for the tested
classifier and aligns the gradients of both models. However, generated features might be induced by
the robust classifier, decreasing the faithfulness towards the original classifier. LDCE [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] overcomes
the robustness requirement for the classifier by constructing a consensus mechanism that directly aligns the
gradients of the external classifier and the diffusion model’s implicit classifier. However, feature changes
are hard to track because all features are optimized jointly, and thus lack transparency. A concept-based
approach can improve transparency and comprehensibility for the user by modifying features on a
semantic concept level while enforcing semantic minimality in the number of feature changes.
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Local Concept Attribution</title>
        <p>
          Layer-wise Relevance Propagation (LRP) [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ] describes a local attribution method that backpropagates
a modified gradient to assign pixel-wise importance scores for a target class. Concept Relevance
Propagation (CRP) [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ] extends LRP to concept space by defining every neuron or channel in the latent
space as a concept. A concept mask filters attributions during the LRP backward pass, retaining the
concept attribution in input space. Relevance Maximization [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ] assigns semantic meaning to channels
by analyzing the constrained explanations across samples. Our approach generalizes the concept
masking to a gradient manipulation and applies Relevance Maximization to visualize key concepts.
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Concept-guided Latent Diffusion Counterfactual Explanations</title>
      <p>As depicted in Figure 2, CoLa-DCE introduces three major improvements: selecting sample-specific
targets (yellow), conditioning on a set of concepts (orange), and further spatial conditioning (purple).</p>
      <sec id="sec-3-1">
        <title>3.1. Local Counterfactual Targets</title>
        <p>
          To select the counterfactual target class, we use the model’s perception of the respective data sample
and compare it to its perception of a reference dataset X′. The model perception can hereby be derived
by either computing the activation of the model for each sample in a selected layer or by computing the
intermediate attribution using a local eXplainable Artificial Intelligence (xAI) method like LRP [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ].
Since the model’s perception of the data shall be represented, the class predictions are used to determine
class affiliation. For a new sample x̂ ∈ X̂ for which we want to generate a counterfactual, we derive the
model prediction and the feature-space encoding e(x̂) and compare it to the encodings of the reference
dataset. Based on the feature-space encodings, the closest reference point x′ with a differing class
prediction is extracted, resembling the near-miss approach [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]. The counterfactual target y is then
defined as the predicted class f(x′) of the reference point x′.
        </p>
        <p>y = f( argmin<sub>x′ ∈ X′</sub> d(e(x′), e(x̂)) s.t. f(x′) ≠ f(x̂) )
(1)</p>
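        <p>The near-miss target selection of Equation 1 can be sketched as follows (a minimal NumPy illustration; the function names, toy encodings, and the Euclidean distance choice are assumptions, not part of the original implementation):</p>

```python
import numpy as np

def near_miss_target(x_enc, x_pred, ref_encs, ref_preds):
    """Pick the counterfactual target: the predicted class of the
    closest reference point (in feature space) whose class prediction
    differs from the sample's prediction."""
    # Euclidean distance between the sample encoding and all reference encodings
    dists = np.linalg.norm(ref_encs - x_enc, axis=1)
    # exclude reference points that share the sample's predicted class
    dists[ref_preds == x_pred] = np.inf
    nearest = int(np.argmin(dists))
    return ref_preds[nearest]

# toy example: four reference encodings with their predicted classes
ref_encs = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [0.9, 0.1]])
ref_preds = np.array([0, 1, 2, 0])
target = near_miss_target(np.array([0.8, 0.0]), 0, ref_encs, ref_preds)
print(target)  # nearest differing-class neighbor has class 1
```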
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Concept Selection</title>
        <p>
          For the counterfactual target y, the gradient ∇f(x|y) of a sample x can be extracted at each network
layer. For a selected layer l, the intermediate gradient is summed over the spatial dimensions to obtain
a one-dimensional representation over the channels, each channel encoding a particular concept [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]. Taking
the absolute value of the summed gradients, the top-k concepts, with k ∈ [1, N] and N denoting
the overall number of channels, are selected as those most likely to induce a change towards the
counterfactual class. The concepts are visualized with feature visualization methods like CRP [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ].
        </p>
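        <p>A minimal NumPy sketch of this selection step (the gradient tensor shape (C, H, W) and the toy values are illustrative assumptions):</p>

```python
import numpy as np

def select_top_k_concepts(layer_grad, k):
    """Select the top-k concept channels from an intermediate gradient.

    layer_grad: gradient at layer l with shape (C, H, W).
    Returns the indices of the k channels with the largest absolute
    spatially-summed gradient, i.e. those most likely to push the
    prediction towards the counterfactual class."""
    channel_scores = np.abs(layer_grad.sum(axis=(1, 2)))  # one score per channel
    return np.argsort(channel_scores)[::-1][:k]

grad = np.zeros((4, 2, 2))
grad[1] = 3.0   # strongly positive channel
grad[3] = -2.0  # strongly negative channel (absolute value counts)
print(select_top_k_concepts(grad, 2))  # channels 1 and 3
```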
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Concept Conditioning</title>
        <p>
          Classifier-free diffusion guidance [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] separates the conditioning into an unconditional part and a
conditional part, where the difference between both parts can be used as an implicit classifier score:
∇ log p(x|y) = ∇ log p(x) + s ∇ log p(y|x).
(2)
This gradient-based score can be further modified by gradient manipulation to control the counterfactual
generation. Motivated by the LDCE [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] algorithm, we condition the diffusion process solely on the
selected concepts. The conditions require precomputation and remain fixed during the counterfactual
generation. Instead of the original gradient ∇f(x|y) of the external classifier for target y, the conditioned
gradient with regard to the selected concepts c_1, …, c_k with binary constraints b_1, …, b_k is used. With
layer l splitting the model into two parts f(x|y) = h(g(x)|y), the conditioned gradient is computed
as:
        </p>
        <p>∇f(x|y, c_1, …, c_k) = ∇h(g(x)|y, c_1, …, c_k) = M(∇_{g(x)} h, c_1, …, c_k) · ∇g(x),
with M(∇_{g(x)} h, c_1, …, c_k)_i = ∇_{g(x)} h_i if i ∈ {c_1, …, c_k}, and 0 otherwise,
(3)
where M indicates binary masking of the latent-space gradient in the selected layer. The masked latent
gradient can be backpropagated to the input without further constraints.</p>
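        <p>The channel masking and subsequent backpropagation through the first model part can be sketched for a toy linear split f = h(g(x)) (NumPy; the identity Jacobian and all names are illustrative assumptions, not the actual classifier):</p>

```python
import numpy as np

def concept_masked_input_grad(latent_grad, jac_g, selected):
    """Equation-3-style conditioning for a toy two-part model f = h(g(x)).

    latent_grad: dh/dz at the latent z = g(x), shape (C,)
    jac_g:       Jacobian dz/dx of the first model part g, shape (C, D)
    selected:    indices of the selected concept channels
    """
    mask = np.zeros_like(latent_grad)
    mask[selected] = 1.0
    # zero the latent gradient outside the selected concepts, then
    # backpropagate through g via the chain rule
    return (latent_grad * mask) @ jac_g

latent_grad = np.array([1.0, 2.0, 3.0])
jac_g = np.eye(3)  # identity g for illustration
print(concept_masked_input_grad(latent_grad, jac_g, [0, 2]))
# only concepts 0 and 2 contribute: [1. 0. 3.]
```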
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Spatial Conditioning</title>
        <p>
          While the concept conditioning focuses on semantic feature changes, the spatial dimensions of the
intermediate activations provide local information. We assume that each feature should be only changed
at a single location or that the feature gradient is approximately identical in equivalent locations. We
add binary masking to the spatial dimensions similar to Equation 3 based on the gradient for the selected
features, zeroing gradients below a threshold τ. For visualization, the binary mask can additionally be
upscaled to the input scale like in Net2Vec [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ], yielding additional information about where a specific
concept is expected to change towards the counterfactual. The spatial conditioning minimizes the
feature change by restricting it locally, while the feature localization improves the comprehensibility.
        </p>
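        <p>The spatial thresholding can be sketched as follows (NumPy; the threshold value and toy gradient are illustrative assumptions): gradient entries of a concept channel whose magnitude falls below τ are zeroed, restricting the change to strong locations.</p>

```python
import numpy as np

def spatial_mask(channel_grad, tau):
    """Zero the gradient of a concept channel wherever its magnitude
    falls below the threshold tau (binary spatial conditioning)."""
    mask = np.abs(channel_grad) >= tau
    return channel_grad * mask, mask

grad = np.array([[0.1, 0.9],
                 [0.05, -0.8]])
masked, mask = spatial_mask(grad, tau=0.5)
print(masked)  # weak locations are zeroed, strong ones kept
```

Upscaling the boolean mask to the input resolution then localizes where the concept is expected to change, as described above.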
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Results</title>
      <p>
        We test our approach on the ImageNet [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] validation dataset using pre-trained Torchvision models: a VGG16 (with/without batch
normalization), a ResNet18, and a ViT model. To derive appropriate targets, 90% of the validation data is used
as reference data. Counterfactuals are generated and evaluated on the remaining 1000 samples, with all
ImageNet classes included. We inherit the parametrization from LDCE [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Similar to [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], we evaluate the minimality via the FID score [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] as well as
the L1 norm between the original and counterfactual image to measure their semantic and pixel-based
distance. The flip ratio (FR) measures how often the counterfactual is classified as the target class.
      </p>
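      <p>As a sketch (NumPy; the function name and toy predictions are illustrative assumptions), the flip ratio is simply the fraction of generated counterfactuals that the classifier assigns to their target class:</p>

```python
import numpy as np

def flip_ratio(cf_preds, targets):
    """Fraction of counterfactuals classified as their target class."""
    cf_preds = np.asarray(cf_preds)
    targets = np.asarray(targets)
    return float((cf_preds == targets).mean())

print(flip_ratio([3, 7, 7, 1], [3, 7, 2, 1]))  # 0.75
```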
      <sec id="sec-4-1">
        <title>4.1. Selecting a local target results in improved counterfactuals</title>
        <p>
          While LDCE [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] uses WordNet [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ] to derive counterfactual targets based on the semantic similarity
between labels, we suggest using the classifier’s perception of the local input. Table 1 shows the influence
of the target selection on the generated samples’ quantitative performance metrics. Choosing a local
(sample-based) counterfactual target on a near-miss basis leads to an improved flip ratio and confidence
in all settings, demonstrating a closer decision boundary and a more superficial change between the
original and target class. However, retrieving the target via the intermediate activation may lead to a
slightly increased FID compared to the baseline, and some counterfactual targets are not semantically
connected to the original class, requiring a more substantial semantic change.
        </p>
        <p>
          Using the intermediate LRP [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ] attribution yields substantial improvements in the minimal change
needed while simultaneously achieving high flip ratios. This indicates semantically similar
counterfactuals close to the original images. Including the model’s classification in the intermediate attribution
rather than only considering the activation up to the selected layer may better represent how the
features in the layer are connected toward the output, comprising top-level semantics between classes.
Thus, fewer feature changes are necessary. Including the results of our CoLa-DCE method, even closer
counterfactuals are generated with flip ratios on par with the LDCE baseline. Considering the hard
constraint on the number of concepts, which damps the gradient signal, CoLa-DCE yields much more
transparent counterfactuals while remaining competitive with the baseline.
        </p>
        <p>(Figure 3: (a) Tradeoff between FID and Flip Ratio; (b) CoLa-DCE Model Comparison.)</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. The number of concepts is a tradeoff between accuracy and comprehensibility</title>
        <p>
          For the minimality constraint, we express the minimal semantic change in the number of feature/concept
changes. While mostly a handful of concepts is used [
          <xref ref-type="bibr" rid="ref16 ref22">16, 22</xref>
          ], restricting the latent space gradient from
hundreds to few channels significantly reduces the gradient for difusion guidance. Our quantitative
study (Figure 3) assesses how the concept number influences the quality of counterfactuals regarding
accuracy (flip ratio) and minimality (FID). Restricting the concept number improves the FID (minor
change) while the flip ratio decreases. Masking the gradient causes fewer feature changes, but also
attenuates the shift towards the counterfactual class. Only ten concepts can already achieve a good
performance &gt; 75% regarding the flip ratio, while the FID score outperforms the baseline. Thus,
CoLa-DCE ofers concept-based transparency and control without losing much detail or accuracy.
Figure 3b depicts the tradeof between minimality and accuracy for multiple model architectures and
settings. Adding spatial constraints per concept slightly lowers flip ratios, but improves the FID. Figure 4
illustrates how the number of concepts afects the counterfactual generation. Restricting the concepts
causes minor changes that alter the target object semantically, while too many concepts (like LDCE)
induce an alteration of the image composition.
        </p>
        <p>(Figure 4: image rows for the Original image and counterfactuals from LDCE, CoLa-DCE, and CoLa-DCE (spatial).)</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Spatial constraints per concept improve the focus</title>
        <p>Assuming each feature is locally restricted, we add spatial constraints per concept by thresholding the
gradient. In Figure 1, changes towards the cockscomb are only reasonable near the hen’s head, with the
gradient set to zero elsewhere. In Figure 5, CoLa-DCE yields much sparser explanations than LDCE,
highlighting fewer and more concentrated feature changes. With added spatial constraints, a stronger
focus in the explanation becomes apparent, yielding either sparser explanations or a stronger emphasis
on single semantic features. Performance-wise, the spatial conditioning further improves (decreases)
the FID, with only slight drawbacks in the flip ratio.</p>
      </sec>
      <sec id="sec-4-4">
        <title>4.4. How can concept-based counterfactuals help in explaining model failures?</title>
        <p>Counterfactuals are especially useful when explaining samples at the classifier’s decision boundary
between two classes. When misclassified samples and their correctly classified counterfactuals are
inspected using our CoLa-DCE approach, the root cause of the misclassification in terms of identified or
missing features becomes apparent. Figure 6 describes a misclassification case where the original image
lacks specific evidence of belonging to the label “brambling”. The sample seems to represent a rare case
of the class where the classifier is missing essential concepts shown in the CoLa-DCE explanation for a
correct classification. Hence, a dataset or model adaptation is required, e.g., by including more samples of
the class that show the necessary concepts in a model fine-tuning.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Limitations</title>
      <p>
        As ground truth information of an optimal counterfactual image does not exist, only heuristics containing
desired properties for counterfactuals can be optimized. However, the right balance between minimally
altering the image and maximizing the flip ratio depends on a rough estimate of the user’s preferences.
Parameter optimization is also required to balance the influence of the external gradient and the
reconstruction accuracy, as in LDCE [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. We acknowledge that the diffusion model’s ability to accurately
reconstruct an image and to generate concept information similar to the external classifier’s highly influences
the counterfactual quality. Poor results are expected for out-of-distribution data, as the needed features
are naturally not captured by the diffusion model and cannot be generated.
      </p>
      <p>(Figure 6: misclassification case with wrong prediction “Junco, Snowbird”; original labeled “Brambling”, counterfactual classified as “Brambling”, annotated with the involved concept IDs.)</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>Our CoLa-DCE method successfully tackles the lack of transparency and fine-grained control in
diffusion-based counterfactual generation methods. Starting from an improved target selection, we show how our
concept-based approach yields semantically fewer image changes, enforcing the minimality requirement.
By restricting concepts and applying spatial constraints, the counterfactual generation is more focused
on small, localized feature perturbations, which are additionally more comprehensible due to the
concept grounding. From our CoLa-DCE explanations, it is directly deducible which feature changes
at which location cause the prediction change of the classifier, strongly improving the transparency
and understandability to a human user. With the high degree of control in generating images with
CoLa-DCE, we are confident that it will spark further work using fine-grained concept guidance for image
alteration tasks.</p>
    </sec>
    <sec id="sec-7">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors used Chat-GPT-4 and Grammarly for grammar and
spelling checks. After using these tools, the authors reviewed and edited the content as needed and
take full responsibility for the publication’s content.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M.</given-names>
            <surname>Augustin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Boreiko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Croce</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hein</surname>
          </string-name>
          ,
          <article-title>Diffusion visual counterfactual explanations</article-title>
          ,
          <source>in: Advances in Neural Information Processing Systems</source>
          , volume
          <volume>35</volume>
          ,
          <year>2022</year>
          , pp.
          <fpage>364</fpage>
          -
          <lpage>377</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>G.</given-names>
            <surname>Jeanneret</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Simon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Jurie</surname>
          </string-name>
          ,
          <article-title>Diffusion models for counterfactual explanations</article-title>
          , in: Computer Vision - ACCV 2022, Springer Nature Switzerland, Cham,
          <year>2023</year>
          , pp.
          <fpage>219</fpage>
          -
          <lpage>237</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>K.</given-names>
            <surname>Farid</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Schrodi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Argus</surname>
          </string-name>
          , T. Brox, Latent diffusion counterfactual explanations,
          <year>2023</year>
          . arXiv:2310.06668.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>D.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <article-title>Counterfactuals and comparative possibility</article-title>
          ,
          <source>Journal of Philosophical Logic</source>
          <volume>2</volume>
          (
          <year>1973</year>
          )
          <fpage>418</fpage>
          -
          <lpage>446</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>R. M. J.</given-names>
            <surname>Byrne</surname>
          </string-name>
          ,
          <article-title>Précis of the rational imagination: How people create alternatives to reality</article-title>
          ,
          <source>Behavioral and Brain Sciences</source>
          <volume>30</volume>
          (
          <year>2007</year>
          )
          <fpage>439</fpage>
          -
          <lpage>453</lpage>
          . doi:10.1017/S0140525X07002579.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J.</given-names>
            <surname>Ho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Jain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Abbeel</surname>
          </string-name>
          ,
          <article-title>Denoising diffusion probabilistic models</article-title>
          ,
          <source>in: Advances in Neural Information Processing Systems</source>
          , volume
          <volume>33</volume>
          ,
          , Curran Associates, Inc.,
          <year>2020</year>
          , pp.
          <fpage>6840</fpage>
          -
          <lpage>6851</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>R.</given-names>
            <surname>Rombach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Blattmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lorenz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Esser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Ommer</surname>
          </string-name>
          ,
          <article-title>High-resolution image synthesis with latent diffusion models</article-title>
          ,
          <source>in: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)</source>
          ,
          <year>2022</year>
          , pp.
          <fpage>10674</fpage>
          -
          <lpage>10685</lpage>
          . doi:10.1109/CVPR52688.2022.01042.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>J.</given-names>
            <surname>Ho</surname>
          </string-name>
          , T. Salimans, Classifier-free
          <source>diffusion guidance</source>
          ,
          <year>2022</year>
          . arXiv:2207.12598.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ernst</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Batra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Parikh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <article-title>Counterfactual visual explanations</article-title>
          ,
          <source>in: Proceedings of the 36th International Conference on Machine Learning</source>
          , volume
          <volume>97</volume>
          <source>of Proceedings of Machine Learning Research, PMLR</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>2376</fpage>
          -
          <lpage>2384</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>M.</given-names>
            <surname>Augustin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Meinke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hein</surname>
          </string-name>
          ,
          <article-title>Adversarial robustness on in- and out-distribution improves explainability</article-title>
          , in: A.
          <string-name>
            <surname>Vedaldi</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          <string-name>
            <surname>Bischof</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          <string-name>
            <surname>Brox</surname>
          </string-name>
          ,
          <string-name>
            <surname>J.-M. Frahm</surname>
          </string-name>
          (Eds.),
          <source>Computer Vision - ECCV 2020</source>
          , Springer International Publishing, Cham,
          <year>2020</year>
          , pp.
          <fpage>228</fpage>
          -
          <lpage>245</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>V.</given-names>
            <surname>Boreiko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Augustin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Croce</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Berens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hein</surname>
          </string-name>
          ,
          <article-title>Sparse visual counterfactual explanations in image space</article-title>
          ,
          <source>in: Pattern Recognition</source>
          , Springer International Publishing, Cham,
          <year>2022</year>
          , pp.
          <fpage>133</fpage>
          -
          <lpage>148</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>P.</given-names>
            <surname>Rodríguez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Caccia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Lacoste</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zamparo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Laradji</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Charlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Vazquez</surname>
          </string-name>
          ,
          <article-title>Beyond trivial counterfactual explanations with diverse valuable explanations</article-title>
          ,
          <source>in: Proceedings of the IEEE/CVF International Conference on Computer Vision</source>
          (ICCV),
          <year>2021</year>
          , pp.
          <fpage>1056</fpage>
          -
          <lpage>1065</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>M.</given-names>
            <surname>Zemni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Zablocki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Ben-Younes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Pérez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Cord</surname>
          </string-name>
          ,
          <article-title>OCTET: Object-aware counterfactual explanations</article-title>
          ,
          <source>in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>15062</fpage>
          -
          <lpage>15071</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>G.</given-names>
            <surname>Jeanneret</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Simon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Jurie</surname>
          </string-name>
          ,
          <article-title>Adversarial counterfactual visual explanations</article-title>
          ,
          <source>in: 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>16425</fpage>
          -
          <lpage>16435</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>S.</given-names>
            <surname>Bach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Binder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Montavon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Klauschen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.-R.</given-names>
            <surname>Müller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Samek</surname>
          </string-name>
          ,
          <article-title>On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation</article-title>
          ,
          <source>PLOS ONE</source>
          <volume>10</volume>
          (
          <year>2015</year>
          )
          <fpage>1</fpage>
          -
          <lpage>46</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>R.</given-names>
            <surname>Achtibat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dreyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Eisenbraun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bosse</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Wiegand</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Samek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Lapuschkin</surname>
          </string-name>
          ,
          <article-title>From attribution maps to human-understandable explanations through concept relevance propagation</article-title>
          ,
          <source>Nat. Mach. Intell.</source>
          <volume>5</volume>
          (
          <year>2023</year>
          )
          <fpage>1006</fpage>
          -
          <lpage>1019</lpage>
          . doi:10.1038/s42256-023-00711-8.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>J.</given-names>
            <surname>Rabold</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Siebers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Schmid</surname>
          </string-name>
          ,
          <article-title>Generating contrastive explanations for inductive logic programming based on a near miss approach</article-title>
          ,
          <source>Machine Learning</source>
          <volume>111</volume>
          (
          <year>2022</year>
          )
          <fpage>1799</fpage>
          -
          <lpage>1820</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>R.</given-names>
            <surname>Fong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Vedaldi</surname>
          </string-name>
          ,
          <article-title>Net2vec: Quantifying and explaining how concepts are encoded by filters in deep neural networks</article-title>
          ,
          <source>in: Proc. IEEE Conf. on Computer Vision and Pattern Recognition</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>J.</given-names>
            <surname>Deng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Dong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Socher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.-J.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Fei-Fei</surname>
          </string-name>
          ,
          <article-title>ImageNet: A Large-Scale Hierarchical Image Database</article-title>
          ,
          <source>in: CVPR09</source>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>M.</given-names>
            <surname>Heusel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Ramsauer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Unterthiner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Nessler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hochreiter</surname>
          </string-name>
          ,
          <article-title>GANs trained by a two time-scale update rule converge to a local Nash equilibrium</article-title>
          ,
          <source>in: Advances in Neural Information Processing Systems</source>
          , volume
          <volume>30</volume>
          , Curran Associates, Inc.,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>G. A.</given-names>
            <surname>Miller</surname>
          </string-name>
          ,
          <article-title>WordNet: a lexical database for English</article-title>
          ,
          <source>Commun. ACM</source>
          <volume>38</volume>
          (
          <year>1995</year>
          )
          <fpage>39</fpage>
          -
          <lpage>41</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>M.</given-names>
            <surname>Dreyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Achtibat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Wiegand</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Samek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Lapuschkin</surname>
          </string-name>
          ,
          <article-title>Revealing hidden context bias in segmentation and object detection through concept-specific explanations</article-title>
          ,
          <source>in: Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>3829</fpage>
          -
          <lpage>3839</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>