<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>XAI-driven Model Improvements in Interpretable Image Segmentation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Rokas Gipiškis</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Vilnius University, Institute of Data Science and Digital Technologies</institution>
          ,
          <addr-line>4 Akademijos St, Vilnius</addr-line>
          ,
          <country country="LT">Lithuania</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Semantic image segmentation is the most fine-grained task in computer vision. Its applications range from autonomous vehicles to medical imaging. Despite its deployments in critical areas, interpretable image segmentation remains an underexplored field, especially when compared to explainable AI (XAI) solutions in classification and object detection. Even less attention has been paid to the use of XAI in non-explainability-related scenarios, where XAI methods are applied not for interpretability per se, but rather for other instrumental reasons, such as improving a model's performance. Such use cases can potentially extend to AI safety, specifically in the case of adversarial attacks, self-supervised learning, neural architecture search (NAS), and continual learning (CL). Most of these areas have never been investigated in the context of interpretable segmentation. This work outlines key developments in the field of interpretable image segmentation, with a particular focus on XAI-driven model improvements. We also consider potential uses of interpretable image segmentation for model compression in the case of NAS, and instance-based memory compression in the case of CL.</p>
      </abstract>
      <kwd-group>
        <kwd>Explainable AI</kwd>
        <kwd>Interpretable AI</kwd>
        <kwd>Image segmentation</kwd>
        <kwd>XAI</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Image segmentation is a predominant task in computer vision, with applications ranging from
medicine to industry. However, evaluation metrics for deep learning (DL) models do not provide
a complete view of their performance. Although a model’s performance might be good, it could
still focus on undesirable spurious correlations. Furthermore, even if the metric is accurate, it
does not offer insights into the internal mechanisms of the model. The need for explainable and
trustworthy systems is ever-increasing. Yet, there appears to be a clear gap in the explainable
AI (XAI) literature between classification and segmentation.</p>
      <p>One could argue that explainable segmentation could be considered a subset, or rather an extension, of
explainable classification. However, this does not resolve the fact that explainable segmentation
has its own unique challenges. Firstly, there is the question of how to efficiently generate
explanations for segmentation when multiple pixels are involved. Secondly, there is the challenge
of how to interpret those explanations when multiple pixels are involved. Another important
question, not directly related to enhancements in interpretability, is whether current XAI
techniques in image segmentation can be used to improve model performance, specifically in
compression-based approaches in neural architecture search (NAS) and continual learning (CL).</p>
      <p>Based on this reasoning, the current doctoral research is framed by the following questions:
1. How do we improve explainable segmentation techniques for better explainability? This
question is directly related to the enhancement of explainable segmentation techniques
for a better understanding of segmentation models.
2. How do we use explainable segmentation to enhance performance in tasks not directly
related to explainability? This question is related to model improvements by indirectly
using XAI-based techniques.</p>
      <p>By combining the results of these two application areas, we can not only identify the most
suitable XAI candidates in image segmentation, but also evaluate their potential use in creating
more efficient models, both in terms of their weights and memory utilization. Such efficient
solutions could be particularly useful in edge computing and in Internet of Things devices.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>The research into explainable image segmentation began in the late 2010s. Since then, it has
seen incremental growth in new methods and applications. For the purposes of this paper,
we can divide related works into two groups, based on the end-purpose of XAI application.
The first group uses XAI methods to better understand a model’s performance or enhance its
interpretability. The second group employs XAI techniques not primarily for explainability, but
as tools to enhance a model’s performance, whether by compressing its weights, improving
its memory utilization, or, in the case of self-supervised segmentation, limiting the need for
manual annotations.</p>
      <sec id="sec-2-1">
        <title>2.1. Explanations for Image Segmentation</title>
        <p>
          Two influential early applications of XAI in segmentation can be found in [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] and [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. A
perturbation-based explainability approach is introduced in [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] as a method to detect contextual
bias. In [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ], on the other hand, a gradient-based Seg-Grad-CAM solution is proposed and
evaluated using U-Net [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. Perturbation-based XAI techniques are further investigated in [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]
and [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. In [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ], the focus is on occlusions in the input space, while [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] explores
gradient-free perturbations in the activation space. Simple gradient and SmoothGrad [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] extensions for
segmentation are presented in [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ], where they are investigated for applications in cyber-physical
systems. These methods are further discussed in Section 5. Currently, gradient-based post-hoc
approaches are more prevalent in explainable segmentation due to their lower computational
costs.
        </p>
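        <p>As an illustration of the perturbation-based family described above, the occlusion-sensitivity idea can be sketched in a few lines. The snippet below is a minimal sketch, assuming a single-channel image and a toy stand-in for a segmentation model's summed class score; the cited methods operate on full segmentation networks.</p>
        <preformat>
```python
import numpy as np

def occlusion_sensitivity(model, image, patch=4, fill=0.0):
    """Slide an occlusion patch over the image and record how much the
    model's summed class score drops; a larger drop marks a more
    important region."""
    h, w = image.shape
    base = model(image)
    heat = np.zeros((h, w))
    for y in range(0, h, patch):
        for x in range(0, w, patch):
            occluded = image.copy()
            occluded[y:y + patch, x:x + patch] = fill
            heat[y:y + patch, x:x + patch] = base - model(occluded)
    return heat

# Toy stand-in for a segmentation model's summed class score:
# it only "looks at" the upper-left quadrant of a 16x16 input.
toy_model = lambda img: float(img[:8, :8].sum())

heat = occlusion_sensitivity(toy_model, np.ones((16, 16)), patch=8)
```
        </preformat>
        <p>Note that every occlusion costs one additional inference, which is the computational drawback mentioned above for perturbation-based techniques.</p>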
      </sec>
      <sec id="sec-2-2">
        <title>2.2. XAI-driven Model Improvements</title>
        <p>
          To our knowledge, there are no XAI-driven model improvements specifically for NAS or CL in
segmentation. In classification, [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] proposes a NAS model based on class activation mapping
(CAM). The teacher and student models are incorporated into the evolutionary search. The less
complex student model has to generate an explanation map that closely approximates the one
generated by the teacher model, as measured by the inverse of the Euclidean distance. In [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ], an
input saliency-based NAS is introduced as a way of reweighing different data points. However,
the proposed solution only focuses on the features in the input space, leaving investigation of
the activation space features for further research. This approach is suitable for differentiable
NAS methods, but further investigation is needed for non-differentiable methods, such as
evolutionary-algorithm-based NAS, where additional modifications or a non-gradient-based
optimization algorithm would be required.
        </p>
        <p>
          Explainable segmentation techniques have also been investigated for safety and robustness
evaluations in segmentation models [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. XAI methods can be potentially used in detecting
adversarial attacks targeting segmentation. However, that study does not investigate
architecture-based changes that would incorporate safety-critical measures into the model itself.
Another non-explainability-related area that uses XAI in model training is self-supervised
learning, as well as weakly-supervised segmentation. Although several applications have been
proposed since the initial use of classification saliency maps for weakly-supervised object localization [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ], these solutions focus on XAI techniques in classification and are not directly related to
interpretable segmentation. Typically, explanatory heatmaps for a selected class are used as
imprecise segmentation masks that could be employed to increase dataset size, since manual
annotations are expensive and time-consuming.
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Research</title>
      <sec id="sec-3-1">
        <title>3.1. Research Questions</title>
        <p>Current research questions encompass XAI use for both explainability-related (the first question)
and non-explainability-related tasks (the latter two questions):
1. How can explainability techniques for image segmentation be improved, either in terms
of evaluative XAI metrics or computational costs?
2. Can more efficient segmentation models be found by incorporating explainable
segmentation techniques in NAS, specifically in the case of teacher-student architectures?
3. Can explainable segmentation techniques be implemented in continual learning for
compression in experience replay?</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Hypothesis</title>
        <p>The underlying assumption is that efficient explainable segmentation techniques can identify
those regions in the input space or those feature maps in the activation space that are most
important for the decision-making of a model and, by extension, its accuracy. Since XAI
techniques primarily focus on these areas, their results could be used for model compression in
NAS, or memory compression in CL.</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Objectives</title>
        <p>To better investigate XAI-driven segmentation model enhancements, we define the following
objectives:
1. Identify the most suitable XAI techniques in segmentation based on computational
requirements and quantitative XAI metrics.
2. Investigate whether the CAM-NAS application in classification can be successfully
extended to segmentation.
3. Evaluate the performance of various explainable segmentation techniques, focusing on
their potential uses in NAS.
4. Explore the use of explainable segmentation techniques for memory compression in
experience replay by storing only the image areas centered around the most important
input features, as identified by selected XAI techniques.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Approach</title>
      <p>
        Firstly, suitable XAI techniques have to be selected for the experiments. Based on previous
research [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], gradient-based methods are preferable for the proposed use-cases due to
their lower computational costs. Speed is an important factor when extracting saliency maps,
especially when multiple iterations are required. This is further supported by the CAM-NAS [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]
experiments in classification, where gradient-based methods achieve the best results. A simple
gradient-based saliency map technique can be used as a baseline. Different implementations of
Seg-Grad-CAM [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] can also be investigated. Since gradient-based techniques can generate a lot
of noise, noise-reduction techniques, such as thresholding a certain percentage of pixels based
on their importance, might also be considered, especially when manual human-in-the-loop
supervision is involved.
      </p>
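      <p>The percentile-based noise-reduction step mentioned above can be sketched in a few lines. This is a minimal illustration; the function name and the <monospace>keep_percent</monospace> parameter are placeholders, not taken from the cited works.</p>
      <preformat>
```python
import numpy as np

def threshold_saliency(saliency, keep_percent=20.0):
    """Zero out all but the top `keep_percent` most important pixels,
    a simple noise-reduction step for gradient-based saliency maps."""
    cutoff = np.percentile(saliency, 100.0 - keep_percent)
    return np.where(saliency >= cutoff, saliency, 0.0)

sal = np.arange(100, dtype=float).reshape(10, 10)  # synthetic saliency map
top = threshold_saliency(sal, keep_percent=10.0)   # keeps the 10 largest values
```
      </preformat>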
      <p>
        NAS focuses on automating the design of neural network architectures. Following [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ],
the initial teacher-student model will be extended to semantic segmentation models. The
explanations will be generated based on the summed-up pre-Softmax prediction scores for
the selected class of interest (Fig. 1). A well-trained segmentation model (the teacher) will
be paired up with a less complex model (the student). Then, explainable segmentation maps
will be generated for the same input images and compared in terms of a similarity score. If
the teacher model has truly learned the most important representations in an unbiased way,
and if the selected XAI technique can capture the most important features for the model’s
decision-making, then we would like to have a student model that is also sensitive to the same
features. This could be viewed as a knowledge transfer from the teacher model to the student.
The original CAM-NAS implementation [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] uses evolutionary algorithms for the generation of
search submodels, and it could serve as an initial starting point for the experiments.
      </p>
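      <p>The teacher-student comparison described above can be sketched as follows, assuming min-max-normalized explanation maps and the inverse Euclidean distance used by CAM-NAS; the function and variable names are illustrative placeholders.</p>
      <preformat>
```python
import numpy as np

def explanation_similarity(teacher_map, student_map, eps=1e-8):
    """Inverse Euclidean distance between two min-max-normalized
    explanation maps; a higher score means the student attends to
    the same regions as the teacher."""
    def norm(m):
        m = m - m.min()
        return m / (m.max() - m.min() + eps)
    dist = np.linalg.norm(norm(teacher_map) - norm(student_map))
    return 1.0 / (dist + eps)

teacher = np.array([[0.0, 1.0], [0.5, 0.2]])
identical = explanation_similarity(teacher, teacher)
different = explanation_similarity(teacher, 1.0 - teacher)  # inverted map
```
      </preformat>
      <p>In the evolutionary search, such a score could reward candidate student architectures whose explanations stay close to the teacher's.</p>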
      <p>
        It is less clear whether XAI-driven model enhancements can be implemented in the case of CL,
specifically for memory compression in experience replay. CL focuses on how an already trained
model can learn new tasks without forgetting the previous ones. Experience replay is an efficient
CL strategy that allows storing the most important examples from old tasks inside the memory
so that the model can still be exposed to them in the future. In classification, it is possible to
reduce memory utilization by storing just the most salient regions of the data samples [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. By
cropping the image so that it is centered around the most important regions, memory can be
utilized more eficiently. However, it is unclear if the cropping strategy could work in the case
of segmentation, as it is a dense prediction task that, unlike image classification, could not be
completed if part of the image was missing. In this particular context, compared to compression
in classification, segmentation appears to be more sensitive to partial data. Enough critical
contextual information would have to be stored for the segmentation to be successful. Perhaps
less salient regions could be downsampled, as described in [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] in the case of classification. Then,
enough contextual information could still be preserved to complete segmentation, especially
if the right contextual information was identified by the explainable segmentation technique.
Following [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], once the most important image regions are identified, they can be occluded
by a bounding box. The resulting image with unoccluded non-discriminative pixels is then
downsampled. Then, the previously occluded salient region is summed up to the downsampled
image. The final image occupies significantly less space in memory. To our knowledge, similar
experiments have not yet been conducted for CL in image segmentation.
      </p>
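      <p>The compression idea described above can be sketched as below. This is a simplified illustration, assuming stride subsampling in place of proper downsampling and a single most-salient crop; all names are hypothetical.</p>
      <preformat>
```python
import numpy as np

def compress_exemplar(image, saliency, crop=8, factor=4):
    """Store the most salient crop at full resolution and the rest of
    the image subsampled by `factor` as coarse context."""
    y, x = np.unravel_index(np.argmax(saliency), saliency.shape)
    y = min(max(y - crop // 2, 0), image.shape[0] - crop)
    x = min(max(x - crop // 2, 0), image.shape[1] - crop)
    return {
        "patch": image[y:y + crop, x:x + crop].copy(),   # full-resolution crop
        "patch_yx": (y, x),                              # where to paste it back
        "background": image[::factor, ::factor].copy(),  # downsampled context
    }

img = np.random.rand(32, 32)
sal = np.zeros((32, 32))
sal[10, 10] = 1.0                                        # most salient pixel
packed = compress_exemplar(img, sal)
stored = packed["patch"].size + packed["background"].size  # 128 vs. 1024 values
```
      </preformat>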
    </sec>
    <sec id="sec-5">
      <title>5. Results</title>
      <p>
        So far, the investigation has focused on explainable segmentation for a better understanding of
the model, with an additional focus on XAI safety and robustness [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. The first comprehensive
survey on XAI in image segmentation has been prepared [13], including the initial taxonomy,
graphical representations of XAI pipelines for different XAI method categories, and a detailed
literature analysis based on evaluation metrics, application domains, and used datasets. Surveyed
XAI methods in segmentation have been grouped into gradient-based, perturbation-based,
prototype-based, architecture-based, and counterfactual techniques.
      </p>
      <p>
        Input perturbation experiments, also known as occlusion sensitivity, have been performed on
different segmentation models in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Both the size and color of the occlusion filter have been
investigated using deletion curves, a quantitative XAI evaluation metric. The results indicate
that segmentation models are sensitive to varying occlusion colors and sizes, and that this
is related to the original colors in the unperturbed input image as well as the ratio between
the foreground object of interest and its background. It has also been observed that different
occlusion colors can greatly affect the segmentation outputs, even when the same filter size is
used and when applied to the same input region. More neutral Gaussian-based filters appear to
cause less unnatural distortions to the model’s output. In contrast, depending on the background
and foreground colors, the black occlusion filter can be mistaken as part of the segmented
object. Another interesting finding is that compared to perturbation-based explanations in
classification, there is significantly less variance generated in the evaluation scores. Therefore,
normalization techniques, such as min-max scaling, can be used to generate clearer explanations
with higher color intensities. It is also worth noting that input perturbation methods require
significant computational costs, as each occlusion requires an additional inference.
      </p>
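      <p>The deletion-curve metric used in these experiments can be sketched as follows; the toy model that simply sums pixel intensities is an illustrative stand-in for a real summed class score.</p>
      <preformat>
```python
import numpy as np

def deletion_curve(model, image, saliency, steps=4, fill=0.0):
    """Progressively delete the most salient pixels and record the
    model score; a faster decay suggests a more faithful explanation."""
    order = np.argsort(saliency.ravel())[::-1]   # most salient first
    flat = image.copy().ravel()
    scores = [model(flat.reshape(image.shape))]
    chunk = order.size // steps
    for i in range(steps):
        flat[order[i * chunk:(i + 1) * chunk]] = fill
        scores.append(model(flat.reshape(image.shape)))
    return np.array(scores)

toy_model = lambda im: float(im.sum())   # stand-in for a summed class score
curve = deletion_curve(toy_model, np.ones((4, 4)), np.ones((4, 4)))
```
      </preformat>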
      <p>
        Perturbations in the activation space have been investigated in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. A new gradient-free
Seg-Ablation-CAM technique has been proposed as an extension of [14]. The method is based
on partial or full deactivations of activation maps from the selected neural network layer. The
results of foreground and background occlusions indicate that foreground occlusions are more
important for the model’s output. Based on the qualitative results, the proposed approach
provides less noisy and more concentrated saliency visualizations compared to gradient-based
XAI methods in segmentation. However, just like other perturbation-based methods, it is
computationally expensive, as multiple inferences are required.
      </p>
      <p>
        Gradient-based explainable segmentation methods have been described in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], where vanilla
gradient and its SmoothGrad [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] extension have been investigated for industrial applications.
This is the first study to focus on adversarial attacks targeting explainable segmentation. It has
been demonstrated that segmentation models can be attacked so that the model’s output does
not change, but its corresponding explanation is arbitrarily manipulated. This is achieved by
introducing a three-term loss function which ensures that the perceptible noise is not introduced
in the input, that the output is not significantly changed, and that the generated explanation is
close to the targeted one, as specified before the attack.
      </p>
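      <p>The three-term loss can be sketched as a weighted sum, assuming Euclidean norms for all three terms and hypothetical weighting coefficients; the exact formulation in the cited study may differ.</p>
      <preformat>
```python
import numpy as np

def attack_loss(x_adv, x, out_adv, out, expl_adv, expl_target,
                a=1.0, b=1.0, c=1.0):
    """Three-term objective: keep the input perturbation imperceptible,
    keep the segmentation output unchanged, and pull the explanation
    toward an attacker-chosen target."""
    perceptibility = np.linalg.norm(x_adv - x)                # term 1
    output_drift = np.linalg.norm(out_adv - out)              # term 2
    explanation_gap = np.linalg.norm(expl_adv - expl_target)  # term 3
    return a * perceptibility + b * output_drift + c * explanation_gap

z = np.zeros((2, 2))
zero_loss = attack_loss(z, z, z, z, z, z)         # perfect attack: loss 0
drift_loss = attack_loss(z + 0.1, z, z, z, z, z)  # perturbation is penalized
```
      </preformat>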
    </sec>
    <sec id="sec-6">
      <title>6. Further Research</title>
      <p>
        The next research steps will focus on the implementation of CAM-NAS for semantic
segmentation. It needs to be evaluated whether the proposed strategy is feasible for higher-resolution
images. The final contribution could lead to more efficient image segmentation models. Further
research directions could explore how to retain both interpretability and NAS efficiency while
investigating the potential trade-offs between the two. This might be related to the fact that
some computationally expensive XAI techniques, such as Seg-Ablation-CAM [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], are less noisy
compared to less computationally demanding techniques, like simple gradients. Other studies
in interpretable image segmentation could also investigate whether safety-critical components
can be automatically identified by XAI techniques and then incorporated into the proposed
architecture.
      </p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>I extend my gratitude to Prof. Olga Kurasova, my doctoral advisor, and Prof. Chun-Wei Tsai for
their support and counsel.</p>
      <p>[13] R. Gipiškis, C.-W. Tsai, O. Kurasova, Explainable AI (XAI) in image segmentation in
medicine, industry, and beyond: A survey, arXiv preprint arXiv:2405.01636 (2024).</p>
      <p>[14] S. Desai, H. G. Ramaswamy, Ablation-CAM: Visual explanations for deep convolutional
network via gradient-free localization, in: Proceedings of the IEEE/CVF Winter Conference
on Applications of Computer Vision, 2020, pp. 983–991.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>L.</given-names>
            <surname>Hoyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Munoz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Katiyar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Khoreva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Fischer</surname>
          </string-name>
          ,
          <article-title>Grid saliency for context explanations of semantic segmentation</article-title>
          ,
          <source>Proceedings of the Advances in Neural Information Processing Systems</source>
          <volume>32</volume>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>K.</given-names>
            <surname>Vinogradova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Dibrov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Myers</surname>
          </string-name>
          ,
          <article-title>Towards interpretable semantic segmentation via gradient-weighted class activation mapping (student abstract)</article-title>
          ,
          <source>in: Proceedings of the AAAI Conference on Artificial Intelligence</source>
          , volume
          <volume>34</volume>
          ,
          <year>2020</year>
          , pp.
          <fpage>13943</fpage>
          -
          <lpage>13944</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>O.</given-names>
            <surname>Ronneberger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Fischer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Brox</surname>
          </string-name>
          ,
          <article-title>U-Net: Convolutional networks for biomedical image segmentation</article-title>
          , in:
          <source>Proceedings of Medical Image Computing and Computer-Assisted Intervention (MICCAI 2015)</source>
          , Part III,
          <year>2015</year>
          , pp.
          <fpage>234</fpage>
          -
          <lpage>241</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>R.</given-names>
            <surname>Gipiškis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Kurasova</surname>
          </string-name>
          ,
          <article-title>Occlusion-based approach for interpretable semantic segmentation</article-title>
          ,
          <source>in: Proceedings of the Iberian Conference on Information Systems and Technologies</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>R.</given-names>
            <surname>Gipiškis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Chiaro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Annunziata</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Piccialli</surname>
          </string-name>
          ,
          <article-title>Ablation studies in activation maps for explainable semantic segmentation in Industry 4.0</article-title>
          , in:
          <source>Proceedings of the IEEE EUROCON 2023-20th International Conference on Smart Technologies</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>36</fpage>
          -
          <lpage>41</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>D.</given-names>
            <surname>Smilkov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Thorat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Viégas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wattenberg</surname>
          </string-name>
          ,
          <article-title>SmoothGrad: Removing noise by adding noise</article-title>
          ,
          <source>arXiv preprint arXiv:1706.03825</source>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>R.</given-names>
            <surname>Gipiškis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Chiaro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Preziosi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Prezioso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Piccialli</surname>
          </string-name>
          ,
          <article-title>The impact of adversarial attacks on interpretable semantic segmentation in cyber-physical systems</article-title>
          ,
          <source>IEEE Systems Journal</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Joe</surname>
          </string-name>
          ,
          <article-title>CAM-NAS: An efficient and interpretable neural architecture search model based on class activation mapping</article-title>
          ,
          <source>Applied Sciences</source>
          <volume>13</volume>
          (
          <year>2023</year>
          )
          <fpage>9686</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>R.</given-names>
            <surname>Hosseini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Xie</surname>
          </string-name>
          ,
          <article-title>Saliency-aware neural architecture search</article-title>
          ,
          <source>Proceedings of the Advances in Neural Information Processing Systems</source>
          <volume>35</volume>
          (
          <year>2022</year>
          )
          <fpage>14743</fpage>
          -
          <lpage>14757</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>B.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Khosla</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Lapedriza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Oliva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Torralba</surname>
          </string-name>
          ,
          <article-title>Learning deep features for discriminative localization</article-title>
          ,
          <source>in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>2921</fpage>
          -
          <lpage>2929</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>G.</given-names>
            <surname>Saha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Roy</surname>
          </string-name>
          ,
          <article-title>Online continual learning with saliency-guided experience replay using tiny episodic memory</article-title>
          ,
          <source>Machine Vision and Applications</source>
          <volume>34</volume>
          (
          <year>2023</year>
          )
          <fpage>65</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Luo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Schiele</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <article-title>Class-incremental exemplar compression for class-incremental learning</article-title>
          ,
          <source>in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>11371</fpage>
          -
          <lpage>11380</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>