Interactive xAI-dashboard for Semantic Segmentation
                                Finn Schürmann1 and Sibylle D. Sager-Müller1,∗

                                1 Lucerne University of Applied Sciences and Arts, Suurstoffi 1, CH-6343 Rotkreuz, Switzerland


                                                Abstract
                                                This article proposes an interactive dashboard for analyzing semantic image segmentation
                                                models using eXplainable AI (xAI) methods. It integrates open-source xAI packages with
                                                segmentation models from PyTorch and TensorFlow Keras, focusing on road traffic images.
                                                Through model-based and post hoc explanation methods, users gain insights into model
                                                perceptions. The dashboard facilitates user interaction by allowing selection of model, label, and
                                                xAI method, with visualizations displaying segmented images and explanations. The
                                                implementation uses Python's Dash library, complemented by PyTorch and external xAI
                                                libraries. A demo app showcases model comparisons and xAI method outputs, enhancing
                                                transparency and trust in AI systems for safety-critical applications like autonomous driving.

                                                Keywords
                                                Computer vision, Semantic segmentation, explainable AI, Human-Machine interaction


                                1. Introduction
                                xAI is a set of methods and procedures that help humans understand and trust the results
                                produced by artificial intelligence (AI) algorithms. Especially in safety-critical applications,
                                xAI is a key requirement [1]. Such understanding can be achieved through various validation
                                algorithms. To obtain an unbiased and comprehensive understanding of the outcomes produced by
                                existing algorithms for neural networks (NNs) at different depths, this article proposes a
                                novel interactive dashboard containing a subset of common xAI methods from the most
                                prominent open-source xAI packages. These packages can be used to analyze data and
                                segmentation models, e.g., from road traffic. The dashboard also allows data and trained
                                segmentation models to be selected from common libraries like PyTorch [2].

   In autonomous driving, the perception of the environment is an important aspect. Image
segmentation, a frequently used technique in this application, assigns a label to every pixel
in an image so that pixels with the same label share certain characteristics. It is important
that these segmentation models have high accuracy and efficiency, as they are included in
driver assistance systems. xAI enables the user to analyze the performance of the
segmentation models. Thus, it helps the AI developer, in the example of machine vision for
autonomous vehicles, to check whether the model evaluates the situation shown in the image
correctly. However, the segmentation models and xAI methods are not yet optimised for
autonomous driving. While they can assist in the resolution of issues, their development in
this field is not yet sufficiently advanced.

Late-breaking work, Demos and Doctoral Consortium, colocated with The 2nd World Conference on eXplainable
Artificial Intelligence: July 17–19, 2024, Valletta, Malta
*Corresponding author
finn.schuermann@kasel.ch (F. Schürmann), sibylle.sager@hslu.ch (S. Sager-Müller)
https://orcid.org/0000-0003-2375-5601 (F. Schürmann), https://orcid.org/0009-0000-4857-5514 (S. Sager-Müller)
© 2024 Copyright for this paper by its authors.
Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073

   From 2015 to 2021, more than 150 xAI-related tools were published [3]. In general,
the tools are implemented for image classification, because the xAI-methods were initially
developed for that task [4]. For both machine vision tasks, image classification
and semantic segmentation, heatmaps can be employed to help the user find out whether the
model learned what it was expected to learn. During our analysis of the widely used
xAI-tools, we did not come across a tool dedicated solely to image segmentation. The
Neuroscope framework [4] is one approach towards xAI analysis for image
segmentation (plus image classification). However, its implementation is
platform-dependent and there are currently no plans for further
development [5]. Therefore, we intend to fill this gap by implementing a novel
platform-independent dashboard specifically for image segmentation. Our goal is to integrate
the most common segmentation models and xAI-methods into a dashboard that offers
user-friendly interaction.


2. Models and xAI-Methods
2.1. Models
The right choice of model for computer vision tasks is crucial. The PyTorch model library [2]
indicates which models are suited to which task. Table 1 shows an overview of the models
for image segmentation used in the dashboard. The models can be categorised into fully
convolutional networks (FCN) and DeepLabV3 networks.


Table 1
Overview of the models used for semantic image segmentation in PyTorch

Task                    Model       Specification
Semantic segmentation   FCN         FCN_ResNet_50
Semantic segmentation   FCN         FCN_ResNet_101
Semantic segmentation   DeepLabV3   DeepLabV3_MobileNet_V3_Large
Semantic segmentation   DeepLabV3   DeepLabV3_ResNet_50
Semantic segmentation   DeepLabV3   DeepLabV3_ResNet_101


  The first model class comprises FCN ResNet models, which accept inputs of arbitrary size
and produce outputs of corresponding size. These segmentation networks are derived from
classification networks with little adaptation. ResNet is a deep convolutional neural network
proposed by Microsoft. Its residual blocks, each of which learns a residual function that is
added to the block input via a skip connection, allow accuracy to be increased by increasing
the number of layers [6]. The number “50” in FCN ResNet 50, for example, denotes the number
of layers in the backbone network. According to the torchvision documentation [2], the
pretrained weights of all models in Table 1 were trained on a subset of COCO restricted to
the 20 categories of the PASCAL VOC dataset.
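
  To illustrate how the per-pixel output of such a network becomes a segmented image, the
following minimal sketch converts class scores into a label map. It is not taken from the
dashboard: the random array stands in for a real network output, and the 21 channels (20
PASCAL VOC categories plus background) and class index 7 for "car" are assumptions based on
the usual VOC ordering.

```python
import numpy as np

# Assumption: 21 output channels (20 PASCAL VOC categories plus background)
# and class index 7 for "car", following the usual VOC ordering.
NUM_CLASSES = 21
CAR = 7


def scores_to_label_map(scores: np.ndarray) -> np.ndarray:
    """Turn per-pixel class scores of shape (C, H, W) into a label map (H, W)."""
    return scores.argmax(axis=0)


# Toy stand-in for a network output on a 4 x 6 image.
rng = np.random.default_rng(0)
scores = rng.normal(size=(NUM_CLASSES, 4, 6))
scores[CAR, 1:3, 2:5] += 10.0  # let the "car" class dominate a small rectangle

label_map = scores_to_label_map(scores)
car_mask = label_map == CAR  # binary mask for one selected label
```

The binary mask for a single label is the kind of array that can then be overlaid on the
original image or handed to an xAI method as the region of interest.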

   The second model class is DeepLabV3, which is built on a ResNet-50, ResNet-101 or
MobileNetV3-Large backbone. The difference to FCN is that DeepLabV3 uses atrous convolution,
also called dilated convolution. Models based on atrous convolution are actively
researched for semantic segmentation [7]. The MobileNetV3-Large variant
is 34% faster than its predecessor at a similar accuracy level for Cityscapes
segmentation. The reason for the speed increase of mobile models is two-fold: First,
MobileNetV3 uses the hard sigmoid function instead of the standard sigmoid function
because the hard sigmoid function has much lower latency costs [8]. Second, mobile models
employ atrous convolution, meaning that the kernel laid over the input has holes. The
size of the holes is controlled by the hyperparameter rate; a standard convolution
corresponds to a rate of 1. The larger the rate, the larger the field of view of the kernel,
which makes it possible to encode objects with multi-scale context [7].
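
   The effect of the rate can be made concrete with a one-dimensional sketch. The function
below is an illustrative NumPy implementation, not code from DeepLabV3 or the dashboard; the
names atrous_conv1d, signal and kernel are chosen here for the example. A kernel with k taps
and rate r covers (k - 1) * r + 1 input samples, so the field of view grows with the rate
while the number of weights stays the same.

```python
import numpy as np


def atrous_conv1d(signal: np.ndarray, kernel: np.ndarray, rate: int = 1) -> np.ndarray:
    """'Valid' 1-D atrous convolution (cross-correlation, as used in deep learning)."""
    k = len(kernel)
    span = (k - 1) * rate + 1  # effective kernel size including the holes
    out_len = len(signal) - span + 1
    if out_len <= 0:
        raise ValueError("signal is shorter than the dilated kernel")
    out = np.empty(out_len)
    for i in range(out_len):
        taps = signal[i : i + span : rate]  # pick every rate-th sample
        out[i] = np.dot(taps, kernel)
    return out


signal = np.arange(10, dtype=float)
kernel = np.array([1.0, 0.0, -1.0])

dense = atrous_conv1d(signal, kernel, rate=1)    # each output sees 3 samples
dilated = atrous_conv1d(signal, kernel, rate=2)  # each output sees 5 samples
```

With rate 1 the function reduces to an ordinary convolution; with rate 2 the same three
weights compare samples that are four positions apart instead of two.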

2.2. xAI-Methods
Our dashboard in its current implementation uses Layer GradCAM, LIME, Feature Ablation,
and Saliency as xAI-methods. xAI-methods can be categorized into model-specific and
model-agnostic methods. Model-specific methods calculate the effect of changes in the input
features on the output using the model itself, while model-agnostic methods work by
manipulating input data and analyzing the respective model predictions without knowledge
of the model. Within the subclasses of specific and agnostic, one can further distinguish
between local and global methods. Local methods explain the individual predictions of
models, while global methods explain the behavior of the model averaged over all samples
[9],[10],[11].

   Gradient-weighted Class Activation Mapping (GradCAM) [12] is a technique that uses the
gradients of a target score with respect to the feature maps of any convolutional layer of a
model to generate a heatmap highlighting the image regions that are important for that score.
The method needs one forward pass and one backward pass down to the chosen layer; the
gradients do not have to be propagated further to the input image.
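
   The core computation of GradCAM can be sketched in a few lines of NumPy. The sketch
assumes that the activations of the chosen layer and the gradients of the target score with
respect to them are already available, for instance from an autograd framework; for
segmentation, the target score is often taken as the sum of the selected class scores over a
pixel region, which is an assumption here and not a statement about the dashboard. The final
scaling to [0, 1] is a display convenience and not part of the method itself.

```python
import numpy as np


def grad_cam(activations: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """GradCAM heatmap from layer activations and gradients, both of shape (K, H, W)."""
    weights = gradients.mean(axis=(1, 2))             # one weight per channel
    cam = np.tensordot(weights, activations, axes=1)  # weighted channel sum -> (H, W)
    cam = np.maximum(cam, 0.0)                        # ReLU keeps positive evidence only
    peak = cam.max()
    return cam / peak if peak > 0 else cam            # scale to [0, 1] for display


# Toy example with two 3 x 3 feature maps.
activations = np.zeros((2, 3, 3))
activations[0, 0, 0] = 1.0   # channel 0 fires in the top-left corner
activations[1, 2, 2] = 1.0   # channel 1 fires in the bottom-right corner
gradients = np.stack([np.ones((3, 3)), -np.ones((3, 3))])  # channel 1 counts against the class

heatmap = grad_cam(activations, gradients)
```

In the toy example only the top-left corner remains highlighted, because the channel with
negative gradients is suppressed by the ReLU.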

    Local Interpretable Model-agnostic Explanations (LIME) [13] explains an individual
prediction with an interpretable surrogate model. The original model is evaluated at sampling
points around a given input example, and a simple surrogate model is fitted to these
evaluations. It is a model-agnostic, perturbation-based approach.
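
    The surrogate idea can be sketched as follows. This is a simplified illustration, not the
LIME library: it uses plain weighted least squares instead of the sparse linear model of the
original method, and the names lime_weights, toy_model, kernel_width and the binary
on/off encoding of features (e.g., superpixels) are assumptions made for the example.

```python
import numpy as np


def lime_weights(black_box, num_features, num_samples=200, kernel_width=1.0, seed=0):
    """Fit a locally weighted linear surrogate and return one weight per feature."""
    rng = np.random.default_rng(seed)
    # Binary perturbations: 1 keeps a feature (e.g. a superpixel), 0 switches it off.
    z = rng.integers(0, 2, size=(num_samples, num_features)).astype(float)
    z[0] = 1.0  # include the unperturbed example itself
    y = np.array([black_box(row) for row in z])

    # Proximity kernel: perturbations closer to the original example count more.
    distance = np.sqrt((1.0 - z).sum(axis=1))
    proximity = np.exp(-(distance ** 2) / kernel_width ** 2)

    # Weighted least squares for a linear surrogate with intercept.
    design = np.hstack([np.ones((num_samples, 1)), z])
    root_w = np.sqrt(proximity)[:, None]
    coef, *_ = np.linalg.lstsq(design * root_w, y * root_w[:, 0], rcond=None)
    return coef[1:]  # drop the intercept


# Hypothetical black box: feature 0 raises the score, feature 2 lowers it.
def toy_model(mask):
    return 2.0 * mask[0] - 1.0 * mask[2]


importance = lime_weights(toy_model, num_features=4)
```

Because the toy black box is itself linear, the surrogate recovers its coefficients; for a
real segmentation model the weights are only a local approximation around the chosen image.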

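   Feature Ablation [9] is perturbation-based and calculates attributions by replacing each
input feature with a reference value and measuring the resulting difference in the output. A
set of features, for example all pixels of one image region, can also be ablated together
instead of one at a time.

   Saliency [14] calculates the gradients of the output with respect to the input; the
magnitude of a gradient indicates how strongly the corresponding input value influences the
output.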
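
   The Feature Ablation procedure described above can be sketched without any deep learning
framework. The function below is an illustrative re-implementation and not the Captum API;
model stands for any callable that returns a scalar score, and feature_mask, baseline and
toy_model are names introduced for this example.

```python
import numpy as np


def feature_ablation(model, x, feature_mask, baseline=0.0):
    """Attribution per feature group: drop in output when the group is replaced."""
    reference_output = model(x)
    attribution = np.zeros_like(x, dtype=float)
    for group in np.unique(feature_mask):
        ablated = x.copy()
        ablated[feature_mask == group] = baseline  # switch the whole group off
        attribution[feature_mask == group] = reference_output - model(ablated)
    return attribution


# Hypothetical scalar "model": only the left half of the image contributes.
def toy_model(image):
    return float(image[:, :2].sum())


image = np.ones((2, 4))
feature_mask = np.array([[0, 0, 1, 1],
                         [0, 0, 1, 1]])  # two groups: left half and right half
attribution = feature_ablation(toy_model, image, feature_mask)
```

Ablating the left half removes the entire score, so that group receives the full
attribution, while the right half receives none.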

3. Implementation
The goal is to deliver a dashboard that allows the user to interactively check an AI model
for semantic segmentation using a variety of xAI methods. This involves finding a trade-off
between technical depth on the one hand and comprehensibility for the ordinary user on the
other. The dashboard is implemented in Python using Dash [15]. Dash is a library
from Plotly for creating web apps without the need to write JavaScript or HTML code. Dash in
combination with PyTorch offers good options for visualizing segmentation results. The xAI
methods were implemented using the Captum library [9] and the pytorch-grad-cam library
by Jacob Gildenblat [16].

4. Function of dashboard
The dashboard can be used to compare xAI methods for different segmentation models. As
a first step, a demo app has been implemented to show the user the main
working principles of the dashboard and to help the user get accustomed to the tool.
The demo app can be opened with the tab “show demo”. The segmentation models to be compared
can be chosen in the dropdown menus shown in Figure 1. If a model is selected for both
images, the difference between the segmented images of the two models becomes
visible at the bottom left. This image is obtained by taking the pixelwise difference of
the arrays of the two segmented images.
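
A pixelwise difference of two segmented images can be sketched as below. This is a minimal
illustration under the assumption that both segmentations are stored as unsigned 8-bit arrays
of equal shape; the function name and the toy label values are hypothetical, and the
dashboard's actual implementation may differ.

```python
import numpy as np


def segmentation_difference(seg_a: np.ndarray, seg_b: np.ndarray) -> np.ndarray:
    """Absolute pixelwise difference of two segmented images of equal shape."""
    if seg_a.shape != seg_b.shape:
        raise ValueError("both segmented images must have the same shape")
    # Cast first: subtracting uint8 arrays directly would wrap around below zero.
    diff = seg_a.astype(np.int16) - seg_b.astype(np.int16)
    return np.abs(diff).astype(np.uint8)


# Toy label maps produced by two hypothetical models.
seg_model_1 = np.array([[0, 7, 7], [0, 0, 15]], dtype=np.uint8)
seg_model_2 = np.array([[0, 7, 0], [0, 15, 15]], dtype=np.uint8)

difference = segmentation_difference(seg_model_1, seg_model_2)
disagreement = float((difference != 0).mean())  # share of pixels where the models differ
```

Pixels on which both models agree are zero in the difference image, so any non-zero region
points directly at the parts of the scene where the models disagree.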


Figure 1: Model selection to compare the segmentation models.

  The segmented image is overlaid on the original one so that the user can easily see the
quality of the segmentation. By showing the differences, the quality of the models can be
compared visually. This helps the user to decide on an appropriate model. After selecting
a model, the user chooses a label from the dropdown menu “Label Selection”, as shown in
Figure 2. In the preliminary version, the labels are predefined since the dashboard only
allows the selection of pre-trained models from PyTorch.
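
  The overlay of a segmentation on the original image can be sketched as a simple alpha
blend. The function below is an illustration only; the colour, the opacity alpha and the
function name are assumptions and not settings taken from the dashboard.

```python
import numpy as np


def overlay_mask(image, mask, color=(255, 0, 0), alpha=0.4):
    """Alpha-blend a colour over the pixels where the boolean mask is True."""
    blended = image.astype(float)  # astype returns a copy, the input stays untouched
    tint = np.asarray(color, dtype=float)
    blended[mask] = (1.0 - alpha) * blended[mask] + alpha * tint
    return blended.round().astype(np.uint8)


image = np.full((2, 2, 3), 100, dtype=np.uint8)   # uniform grey RGB image
mask = np.array([[True, False], [False, False]])  # one segmented pixel
result = overlay_mask(image, mask)
```

A moderate opacity keeps the underlying image visible, which is what allows the user to
judge whether the segment boundaries follow the actual objects.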


Figure 2: Label Selection.

    After the label is selected, up to two xAI methods can be chosen from the dropdown
menus, one on the left-hand side and one on the right-hand side. The upper row then shows
the original image overlaid with the heatmap of the corresponding xAI method. If two
xAI methods are selected, the difference of their heatmaps is shown in the
lower right section to allow for direct visual comparison. On a technical level, the user has
to make sure to select the same label for both xAI methods to make the comparison
meaningful. The difference of the heatmaps is calculated as soon as a method is selected
in both filter bars. If no method is selected in one of the filters, the element indicates
which filter still needs a method, as shown in Figure 3.
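
    The comparison logic described above can be sketched as follows. The sketch makes two
assumptions that are not stated for the dashboard: each heatmap is scaled to [0, 1] before
subtraction so that methods with different value ranges become comparable, and the message
texts are hypothetical placeholders.

```python
import numpy as np


def normalize(heatmap: np.ndarray) -> np.ndarray:
    """Scale a heatmap to [0, 1]; a constant map becomes all zeros."""
    heatmap = heatmap.astype(float)
    value_range = heatmap.max() - heatmap.min()
    if value_range == 0:
        return np.zeros_like(heatmap)
    return (heatmap - heatmap.min()) / value_range


def heatmap_difference(left, right):
    """Return (difference, message); exactly one of the two is None."""
    if left is None and right is None:
        return None, "Select a method in both filters."
    if left is None:
        return None, "Select a method in the left filter."
    if right is None:
        return None, "Select a method in the right filter."
    return normalize(left) - normalize(right), None


left_map = np.array([[0.0, 2.0], [4.0, 8.0]])
right_map = np.array([[0.0, 1.0], [1.0, 1.0]])

difference, message = heatmap_difference(left_map, right_map)
missing, hint = heatmap_difference(None, right_map)
```

Positive values in the difference mark regions that the left method weights more strongly,
negative values regions that the right method emphasises.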
    This is not the final state of the dashboard: In the lower right section, it is planned to
include metrics to evaluate the xAI methods, e.g., from the library Quantus [17]. Currently,
Quantus is only applicable to image classification, but with a few modifications, it should be
possible to apply it to image segmentation as well. In addition, the metrics have to be
selected beforehand. The implementation is documented on GitHub:
https://github.com/fschurma/xAI_dashboard


Figure 3: Method selection to compare the xAI methods.


5. Conclusion
The demo app with the functions described above serves as the foundation of the final app,
which is currently under construction. The goal of the final version is to allow users to
import their own images and segmentation models and to test their performance. This enables
the user to adjust the models. In the current state, the comparison of two models and two
methods is only possible visually. It is planned to additionally display evaluation metrics,
which would be a benefit compared to similar implementations like Neuroscope [4]. Metrics
could be based on those from Quantus [17]. However, this first requires studying which
of the metrics can be transferred from classification to segmentation tasks. The xAI
methods employed in this demo app are just a small selection, which will be extended to a
larger subset as, e.g., in Neuroscope. For better visualization, it is planned to include a
color scale to illustrate the magnitude of the image differences. As a last step, the app will
be thoroughly tested to make sure it meets the requirements for user-friendly
human-machine interaction.

References
[1] F. Xu, H. Uszkoreit, Y. Du, W. Fan, D. Zhao, and J. Zhu, Explainable AI: A Brief Survey on
    History, Research Areas, Approaches and Challenges, in Natural Language Processing
    and Chinese Computing, vol. 11839, J. Tang, M.-Y. Kan, D. Zhao, S. Li, and H. Zan, Eds., in
    Lecture Notes in Computer Science, vol. 11839, Cham: Springer International
    Publishing, 2019, pp. 563–574. doi: 10.1007/978-3-030-32236-6_51.
[2] Models and pre-trained weights — Torchvision 0.16 documentation’. Accessed: Oct. 24,
    2023. [Online]. Available:
    https://pytorch.org/vision/stable/models.html#semantic-segmentation
[3] N. Uhl, Making It Easier to Compare the Tools for Explainable AI, Partnership on AI.
    Accessed: Nov. 26, 2023. [Online]. Available:
    https://partnershiponai.org/making-it-easier-to-compare-the-tools-for-explainable-ai/
[4] C. Schorr, P. Goodarzi, F. Chen, and T. Dahmen, Neuroscope: An Explainable AI Toolbox
    for Semantic Segmentation and Image Classification of Convolutional Neural Nets, Appl.
    Sci., vol. 11, no. 5, p. 2199, Mar. 2021, doi: 10.3390/app11052199.
[5] C. Schorr, Personal Communication, 2023.
[6] K. Le, ‘A quick overview of ResNet models’, MLearning.ai. Accessed: Nov. 07, 2023.
    [Online]. Available:
    https://medium.com/mlearning-ai/a-quick-overview-of-resnet-models-f8ed277ae81e
[7] L.-C. Chen, G. Papandreou, F. Schroff, and H. Adam, Rethinking Atrous Convolution for
    Semantic Image Segmentation. arXiv, Dec. 05, 2017. Accessed: Nov. 12, 2023. [Online].
    Available: http://arxiv.org/abs/1706.05587
[8] A. Howard et al., ‘Searching for MobileNetV3’. arXiv, Nov. 20, 2019. Accessed: Nov. 07,
    2023. [Online]. Available: http://arxiv.org/abs/1905.02244
[9] ‘Introduction · Captum’. Accessed: Dec. 02, 2023. [Online]. Available: https://captum.ai/
[10] C. Molnar, G. Casalicchio, and B. Bischl, Interpretable Machine Learning – A Brief
    History, State-of-the-Art and Challenges, in ECML PKDD 2020 Workshops, I. Koprinska
    et al., Eds., in Communications in Computer and Information Science, vol. 1323.
    Cham: Springer, 2020. doi: 10.1007/978-3-030-65965-3_28.
[11] M. Munn and D. Pitman, Explainable AI for practitioners: designing and implementing
    explainable ML solutions. Beijing, Sebastopol, CA: O’Reilly, 2022.
[12] R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, and D. Batra, ‘Grad-CAM: Visual
     Explanations from Deep Networks via Gradient-based Localization’, International
     Journal of Computer Vision, vol. 128, no. 2, pp. 336–359, Feb. 2020,
     doi: 10.1007/s11263-019-01228-7.
[13] M. T. Ribeiro, S. Singh, and C. Guestrin, "Why Should I Trust You?": Explaining the
    Predictions of Any Classifier. arXiv, Feb. 26, 2016. Accessed: May 13, 2024. [Online].
    Available: http://arxiv.org/abs/1602.04938.
[14] K. Simonyan, A. Vedaldi, and A. Zisserman, Deep Inside Convolutional Networks:
    Visualising Image Classification Models and Saliency Maps. arXiv, Apr. 19, 2014.
    Accessed: Mar. 20, 2024. [Online]. Available: http://arxiv.org/abs/1312.6034
[15] ‘Dash Documentation & User Guide | Plotly’. Accessed: Apr. 05, 2024. [Online].
    Available: https://dash.plotly.com/
[16] J. Gildenblat, ‘jacobgil/pytorch-grad-cam’. Apr. 05, 2024. Accessed: Apr. 05, 2024.
    [Online]. Available: https://github.com/jacobgil/pytorch-grad-cam
[17] A. Hedström et al., ‘Quantus: An Explainable AI Toolkit for Responsible Evaluation of
    Neural Network Explanations and Beyond’. arXiv, Feb. 14, 2022. Accessed: May 13,
    2024. [Online]. Available: http://arxiv.org/abs/2202.06861.