<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Exploring Brain Tumor Segmentation and Patient Survival: An Interpretable Model Approach</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Valerio Ponzi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giorgio De Magistris</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer, Control and Management Engineering, Sapienza University of Rome</institution>
          ,
          <addr-line>Via Ariosto 25, Roma, 00185</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Institute for Systems Analysis and Computer Science, Italian National Research Council</institution>
          ,
          <addr-line>Via dei Taurini 19, Roma, 00185</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
<p>Detecting and delineating brain tumors from MRI images is a complex challenge in medical AI. Recent progress has seen a variety of techniques employed to assist medical professionals in this task. Despite the effectiveness of machine learning algorithms in segmenting tumors, their lack of transparency in decision-making hinders trust and validation. In our project, we constructed an interpretable U-Net Model specifically tailored for brain tumor segmentation, leveraging both the Gradient-weighted Class Activation Mapping (Grad-CAM) Algorithm and the SHapley Additive exPlanations (SHAP) library. We relied on the BraTS2020 benchmark dataset for training and evaluation purposes. The U-Net model we employed yielded promising results. We then utilized Grad-CAM to visualize the crucial features attended to by the model within an image. Additionally, we enhanced interpretability by utilizing the SHAP library to elucidate the predictions made by various models (including Random Forest, KNN, SVC, and MLP) utilized for predicting patient survival days.</p>
      </abstract>
      <kwd-group>
<kwd>Brain Tumor</kwd>
        <kwd>U-Net</kwd>
        <kwd>Segmentation</kwd>
        <kwd>Explainable Artificial Intelligence</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
<p>Brain tumors represent a significant challenge in healthcare, affecting millions of individuals worldwide with their life-threatening implications. Accurate delineation of these tumors is paramount for effective treatment strategies and ongoing monitoring of disease progression. Over the past few years, deep learning techniques have emerged as promising tools for brain tumor segmentation, with the U-Net architecture gaining popularity for its ability to capture intricate details within medical images. However, the inherent opacity of deep learning models presents a hurdle, as it limits their interpretability and makes it difficult for clinicians to comprehend the rationale behind their decisions. Explainable Artificial Intelligence (XAI) has therefore garnered increasing importance, particularly in the medical domain, where precise tumor segmentation plays a crucial role. Tumor segmentation involves the identification and localization of tumors within medical imaging data, such as MRI scans, CT scans, or X-rays, and this process is indispensable in cancer diagnosis, treatment planning, and progress tracking.</p>
      <p>XAI holds significance in tumor segmentation for several reasons. Firstly, AI models often operate as "black boxes," meaning their decision-making processes are not readily transparent [<xref ref-type="bibr" rid="ref1">1</xref>]. In the context of medical imaging, this lack of transparency poses a significant challenge, as doctors need to understand how the model arrives at its conclusions to make informed decisions about patient care [<xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>]. Additionally, XAI plays a crucial role in mitigating biases within AI models: biases may arise if the model is trained on data that does not adequately represent the population it will serve, leading to incorrect or skewed predictions, and XAI can help identify and rectify these biases, thereby enhancing the model's reliability. Moreover, XAI fosters trust in AI systems by elucidating the decision-making process [<xref ref-type="bibr" rid="ref4">4</xref>], thereby increasing the willingness of doctors and patients to rely on these models [<xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>].</p>
      <p>In our project, we utilize the Gradient-weighted Class Activation Mapping (Grad-CAM) technique to imbue our segmentation UNET model with explainability. Grad-CAM generates heatmaps highlighting the crucial regions of input images that the model focuses on when making predictions. By visualizing these heatmaps, we gain insights into the features guiding the model's decisions, facilitating a better understanding of its behavior. Furthermore, our project incorporates the SHAP (SHapley Additive exPlanations) approach, particularly relevant for tasks like predicting patient survival based on medical imaging data in datasets like BraTS. SHAP values elucidate the contributions of individual features to the model's output, shedding light on the mechanisms underlying its predictions. This transparency is vital in the medical context, where accurate predictions must also be trustworthy.</p>
      <p>In summary, we present an interpretable U-Net model for brain tumor segmentation, augmented by Grad-CAM for heatmap visualization and SHAP for survival prediction analysis. The segmentation accuracy stands at an impressive 99 percent, with specific dice scores for the necrotic, edema, and enhancing regions. This comprehensive approach not only yields accurate predictions but also enhances interpretability, trust, and confidence in AI-assisted medical decision-making.</p>
    </sec>
    <sec id="sec-related">
      <title>2. Related Works</title>
      <p>Brain tumors are among the most perilous types of tumors globally, with gliomas emerging as the predominant primary brain tumors. Gliomas stem from the aberrant proliferation of glial cells in the brain and spinal cord, exhibiting varying degrees of malignancy and histological classifications. Individuals diagnosed with glioblastoma, the most aggressive form of glioma, typically face a survival prognosis of fewer than 14 months on average. Medical professionals frequently utilize Magnetic Resonance Imaging (MRI), a non-invasive technique, to diagnose brain tumors because of its capability to generate a wide variety of tissue contrasts in each imaging mode [<xref ref-type="bibr" rid="ref7">7</xref>]. However, analyzing and segmenting structural MRI images of brain tumors is a challenging and time-consuming task that typically requires the expertise of professional neuroradiologists. Therefore, an automated and dependable brain tumor segmentation method would greatly facilitate the diagnosis and treatment of brain tumors.</p>
      <p>An alternative approach [<xref ref-type="bibr" rid="ref8">8</xref>] is suggested to focus solely on a small region of the image rather than processing the entire image, reducing computational time and addressing overfitting issues in a Cascade Deep Learning model. Additionally, a Cascade Convolutional Neural Network (C-ConvNet/C-CNN) is introduced, which extracts both local and global features through separate pathways. Moreover, to enhance the accuracy of brain tumor segmentation beyond existing models, a new Distance-Wise Attention (DWA) mechanism is employed.</p>
      <p>In another work [<xref ref-type="bibr" rid="ref9">9</xref>], a novel design relying on a 3D U-Net model was developed, incorporating numerous skip connections alongside cost-effective pre-trained 3D MobileNetV2 blocks and attention modules. These pre-trained MobileNetV2 blocks aid the architecture by offering fewer parameters, ensuring a manageable model size within the available computational capacity, and facilitating faster convergence. Furthermore, additional skip connections were introduced between the encoder and decoder blocks to facilitate the transfer of extracted features, while attention modules were employed to filter out irrelevant features transmitted through the skip connections.</p>
      <p>Further existing works on interpretable CNNs were examined during the execution of our project. One of the most interesting ones was [<xref ref-type="bibr" rid="ref10">10</xref>]. The authors implemented a prototypical part network (ProtoPNet), which dissects images by identifying prototypical parts and amalgamating evidence from these prototypes to derive a final classification. The operational principle of this ProtoPNet involves comparing the latent features f(x) with the learned prototypes. Specifically, for each class k, the network seeks evidence for x belonging to class k by assessing its latent patch representations against every learned prototype p(j) associated with class k.</p>
      <p>In another study focusing on interpretable machine learning, researchers introduced a method for "Classification of Mass Lesions in Digital Mammography" [<xref ref-type="bibr" rid="ref11">11</xref>]. They employed a pixel-wise annotation technique to precisely segment affected lesions, and the outcomes were subsequently depicted using GradCam and GradCam++ heatmaps. The findings demonstrated that pixel-wise annotation improved the segmentation and localization of the affected area, with the generated heatmaps maintaining focus on the affected region rather than encompassing all image pixels.</p>
      <p>The concept of utilizing GradCam for visual interpretation and explanation of model results originated from a related study, which utilized GradCam for visual explanations across a wide range of CNN-based models. This approach combines Grad-CAM with fine-grained visualizations to produce high-resolution, class-discriminative visualizations. It was applied to various off-the-shelf image classification, captioning, and visual question-answering (VQA) models, including those based on ResNet architectures [<xref ref-type="bibr" rid="ref12">12</xref>].</p>
      <p>The domain of explainable Artificial Intelligence (XAI) is relatively new but evolving rapidly, with the introduction of numerous libraries designed to elucidate the outputs of opaque deep learning models. One such notable library is SHAP (SHapley Additive exPlanations). SHAP assigns an importance value to each feature for a specific prediction. In their work [<xref ref-type="bibr" rid="ref13">13</xref>], SHAP was applied to the BraTS dataset. For each input feature, SHAP calculates the importance value, offering various calculation methods, including two model-agnostic methods applicable regardless of the trained network type, and four model-specific methods, one of which is DeepExplainer.</p>
      <p>In this study, DeepExplainer was utilized to determine the importance values for a given combination of 3D MRI voxel and age values. DeepExplainer efficiently approximates SHAP values for a deep neural network model by recursively propagating DeepLIFT multipliers, thereby deriving an effective linearization technique from the SHAP values. By inputting an example data point into DeepExplainer, importance values for every pixel in the 3D voxel, as well as for the age value, are determined. These importance values can then be visually represented by integrating them into a background image.</p>
    </sec>
    <sec id="sec-2">
      <title>3. Dataset</title>
      <p>In this project, we utilized the BRATS2020 dataset, a commonly employed medical imaging dataset used for both brain tumor segmentation and classification tasks. It represents an enhanced iteration of the BRATS2015 dataset and is made available by the Multimodal Brain Tumor Segmentation Challenge (BraTS).</p>
      <p>The dataset comprises MRI scans of the brain obtained from patients diagnosed with diverse types of brain tumors, including gliomas, meningiomas, and pituitary adenomas. These scans encompass four distinct modalities: T1-weighted (T1), T1-weighted contrast-enhanced (T1ce), T2-weighted (T2), and fluid-attenuated inversion recovery (FLAIR) images. Accompanying each MRI scan is a ground-truth segmentation map delineating the tumor’s location and extent. Containing a total of 369 MRI scans, the BRATS2020 dataset designates 335 scans for training and 34 for testing. These scans were sourced from various medical institutions and meticulously annotated by multiple experts. Furthermore, the dataset provides additional patient-related information such as age, gender, and tumor subtype. Widely recognized as a benchmark dataset, BRATS2020 serves as a standard for assessing the efficacy of algorithms in brain tumor segmentation and classification. Researchers leverage this dataset to innovate and validate new approaches for automating these processes, aiming to enhance the accuracy and efficiency of diagnosis and treatment for individuals afflicted with brain tumors.</p>
      <sec id="sec-2-1">
        <title>3.1. Data Preprocessing</title>
        <p>The dataset utilized in our project comprises MRI scans of patients with various types of brain tumors, encompassing four modalities: T1-weighted (T1), T1-weighted contrast-enhanced (T1ce), T2-weighted (T2), and FLAIR images, alongside corresponding ground-truth segmentation masks. Each MRI volume contains 155 slices, of which we selected slices ranging from 22 to 100, capturing the most pertinent tumor data after rigorous experimentation.</p>
        <p>During data preprocessing, we identified irregular patterns in file 355, prompting its removal from the dataset to ensure the integrity of our results and model training. Additionally, we standardized the image size to 128x128 for training purposes. Our training data comprises stacked FLAIR and T1 images, while the model receives segmentation masks as labels, which are subsequently one-hot encoded for compatibility.</p>
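        <p>As a concrete illustration, the following minimal sketch shows how such a preprocessing pipeline can be written. It assumes the standard BraTS2020 NIfTI file layout together with the nibabel and OpenCV packages; the helper names and the remapping of BraTS label 4 to 3 are ours rather than the verbatim project code.</p>
        <preformat>
import numpy as np
import nibabel as nib
import cv2

IMG_SIZE = 128                      # target in-plane resolution
SLICE_START, SLICE_END = 22, 100    # slices retained after experimentation

def preprocess_case(case_dir, case_id, n_classes=4):
    """Build stacked FLAIR+T1 inputs and one-hot masks for one case."""
    flair = nib.load(f"{case_dir}/{case_id}_flair.nii").get_fdata()
    t1 = nib.load(f"{case_dir}/{case_id}_t1.nii").get_fdata()
    seg = nib.load(f"{case_dir}/{case_id}_seg.nii").get_fdata()
    X, y = [], []
    for i in range(SLICE_START, SLICE_END):
        f = cv2.resize(flair[:, :, i], (IMG_SIZE, IMG_SIZE))
        t = cv2.resize(t1[:, :, i], (IMG_SIZE, IMG_SIZE))
        X.append(np.stack([f, t], axis=-1))        # two-channel input
        m = cv2.resize(seg[:, :, i], (IMG_SIZE, IMG_SIZE),
                       interpolation=cv2.INTER_NEAREST)
        m[m == 4] = 3              # BraTS labels {0,1,2,4} mapped to {0,1,2,3}
        y.append(np.eye(n_classes)[m.astype(int)])  # one-hot masks
    return np.array(X), np.array(y)
        </preformat>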
      </sec>
      <sec id="sec-2-2">
        <title>The dataset utilized in our project comprises MRI scans</title>
        <p>of patients with various types of brain tumors,
encompassing four modalities: T1-weighted (T1), T1-weighted
contrast-enhanced (T1ce), T2-weighted (T2), and Flair
Images, alongside corresponding ground truth
segmentation masks. Each MRI image contains 155 slices, of which
we selected slices ranging from 22 to 100, capturing the
the final segmentation map. It employs convolutional analysis.
and up-sampling layers to enhance spatial resolution Once the UNET Model is trained, the next phase
inwhile reducing channel depth. Up-sampling methods volves the practical application of GradCam for
explalike bilinear interpolation or transposed convolution are nation generation. We start by selecting an input MRI
commonly used. image containing a brain tumor, which serves as the</p>
        <p>Moreover, the U-Net incorporates skip connections between corresponding layers of the contracting and expanding paths. These connections enable the network to circumvent the spatial information loss caused by pooling operations and to merge local and global image features effectively. Skip connections concatenate feature maps from corresponding layers, followed by a 1x1 convolutional layer to decrease channel depth. The concatenated feature maps then feed the subsequent convolutional and up-sampling layers in the expanding path. Training the U-Net model involves end-to-end optimization using a pixel-wise cross-entropy loss. This loss function compares predicted segmentation maps with ground-truth maps, guiding parameter adjustments of the convolutional filters to minimize the loss and generate accurate segmentation maps [<xref ref-type="bibr" rid="ref14">14</xref>].</p>
        <p>Our UNET architecture operates on input images with dimensions of (128, 128, 2). Initially, a convolutional layer with 64 filters, a 3x3 kernel size, and "same" padding is employed, followed by batch normalization and ReLU activation. Subsequently, the encoder phase comprises multiple down-sampling blocks, each featuring two 3x3 convolutional layers (beginning with 64 filters), followed by batch normalization and ReLU activation. After each block, the filter count doubles and the spatial resolution is halved via max-pooling. The bottleneck layer is characterized by four 3x3 convolutional layers with 1024 filters, alongside batch normalization and ReLU activation. Conversely, the decoder phase involves up-sampling blocks, consisting of 2x2 transpose convolutional layers starting with 512 filters. These layers are concatenated with the corresponding feature maps from the encoder part, followed by two 3x3 convolutional layers, batch normalization, and ReLU activation. Finally, a 1x1 convolutional layer with 4 filters is employed in the final layer, succeeded by a softmax activation function to yield a probability distribution across the four segmentation classes.</p>
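        <p>For reference, a condensed Keras sketch of this architecture is shown below. It mirrors the description above (two-channel 128x128 input, Conv-BN-ReLU blocks with doubling filters, a 1024-filter bottleneck, a transpose-convolution decoder with skip connections, and a four-class softmax head), though the exact layer counts of our trained model may differ slightly.</p>
        <preformat>
from tensorflow.keras import layers, Model

def conv_block(x, filters):
    # Two 3x3 convolutions, each followed by batch norm and ReLU.
    for _ in range(2):
        x = layers.Conv2D(filters, 3, padding="same")(x)
        x = layers.BatchNormalization()(x)
        x = layers.Activation("relu")(x)
    return x

def build_unet(input_shape=(128, 128, 2), n_classes=4):
    inputs = layers.Input(input_shape)
    skips, x = [], inputs
    # Encoder: filter count doubles while resolution halves.
    for filters in (64, 128, 256, 512):
        x = conv_block(x, filters)
        skips.append(x)
        x = layers.MaxPooling2D(2)(x)
    x = conv_block(x, 1024)  # bottleneck
    # Decoder: upsample, concatenate the matching skip, convolve.
    for filters, skip in zip((512, 256, 128, 64), reversed(skips)):
        x = layers.Conv2DTranspose(filters, 2, strides=2, padding="same")(x)
        x = layers.Concatenate()([x, skip])
        x = conv_block(x, filters)
    # 1x1 convolution + softmax over the four segmentation classes.
    outputs = layers.Conv2D(n_classes, 1, activation="softmax")(x)
    return Model(inputs, outputs)

model = build_unet()
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
        </preformat>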
      </sec>
      <sec id="sec-2-3">
        <title>SHAP (SHapley Additive exPlanations) is a model</title>
        <p>
          4.2. GradCam Algorithm agnostic method for interpreting the predictions of
machine learning models. It can help to identify which
GradCam, short for Gradient-weighted Class Activation features in the input data contributed the most to a
parMapping, serves as a valuable tool in enhancing the in- ticular prediction [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ].
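        <p>A minimal sketch of this computation using TensorFlow's GradientTape is given below; the layer name in the example and the choice of summing the class scores over all pixels are illustrative assumptions rather than our exact implementation.</p>
        <preformat>
import numpy as np
import tensorflow as tf

def grad_cam(model, image, class_idx, conv_layer_name):
    """Grad-CAM heatmap for one class of a segmentation model."""
    grad_model = tf.keras.Model(
        model.inputs,
        [model.get_layer(conv_layer_name).output, model.output])
    with tf.GradientTape() as tape:
        conv_maps, preds = grad_model(image[np.newaxis, ...])
        # Aggregate the predicted score of the target class over all pixels.
        class_score = tf.reduce_sum(preds[..., class_idx])
    grads = tape.gradient(class_score, conv_maps)
    # Channel weights: global average of the gradients.
    weights = tf.reduce_mean(grads, axis=(0, 1, 2))
    cam = tf.nn.relu(tf.reduce_sum(conv_maps[0] * weights, axis=-1))
    return (cam / (tf.reduce_max(cam) + 1e-8)).numpy()

# Example (layer name is hypothetical; inspect model.summary() for yours):
# heatmap = grad_cam(model, test_slice, class_idx=3,
#                    conv_layer_name="conv2d_18")
        </preformat>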
        <p>Through this approach, we gain valuable insights into the inner workings of the UNET Model for brain tumor segmentation. By visualizing the regions of the input image that contribute most significantly to the model’s predictions, we enhance the interpretability of our model, thereby fostering greater trust and understanding among stakeholders in the medical domain.</p>
        <p>In summary, by integrating GradCam into our workflow for brain tumor segmentation, we not only improve the transparency and interpretability of our model but also empower clinicians and researchers with actionable insights into the diagnostic process, ultimately leading to more informed decision-making and better patient outcomes [<xref ref-type="bibr" rid="ref12">12</xref>].</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. SHAP Explanations</title>
        <p>SHAP (SHapley Additive exPlanations) is a model-agnostic method for interpreting the predictions of machine learning models. It can help to identify which features in the input data contributed the most to a particular prediction [<xref ref-type="bibr" rid="ref15">15</xref>].</p>
        <p>In this project, we use SHAP to explain patient survival predictions. The patient survival data is provided in the survival-info CSV file, which contains the following columns: Brats20ID, Age, Survival days, and Extent of Resection. We preprocess the data and categorize the extent of survival as short, medium, or long. The data is then used to train and test various classification algorithms, including KNN, Random Forest, SVC, and MLP. The results of these models are explained through the SHAP library: we used the SHAP Kernel Explainer and Tree Explainer to obtain the SHAP values and visualized them using the SHAP summary plot and the SHAP force plot.</p>
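        <p>The sketch below outlines this pipeline. The survival-day thresholds, the engineered feature set, and the exact CSV column spellings are illustrative assumptions, and SHAP return types vary slightly across library versions.</p>
        <preformat>
import pandas as pd
import shap
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Assumed survival CSV columns: Brats20ID, Age, Survival_days,
# Extent_of_Resection (header spellings may differ per release).
df = pd.read_csv("survival_info.csv")
df["Survival_days"] = pd.to_numeric(df["Survival_days"], errors="coerce")
df = df.dropna(subset=["Survival_days", "Extent_of_Resection"])
df["resection"] = df["Extent_of_Resection"].astype("category").cat.codes
# Discretize survival into three classes (thresholds are illustrative).
df["target"] = pd.cut(df["Survival_days"], bins=[0, 250, 450, 10_000],
                      labels=[0, 1, 2]).astype(int)
X = df[["Age", "resection"]]
y = df["target"]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

rf = RandomForestClassifier(n_estimators=3).fit(X_train, y_train)

# TreeExplainer for tree-based models; KernelExplainer plays the same
# role for KNN, SVC and MLP via their predict_proba functions.
explainer = shap.TreeExplainer(rf)
shap_values = explainer.shap_values(X_test)
shap.summary_plot(shap_values, X_test)                   # global view
shap.force_plot(explainer.expected_value[0], shap_values[0][0],
                X_test.iloc[0], matplotlib=True)         # one prediction
        </preformat>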
      </sec>
    </sec>
    <sec id="sec-3">
      <title>5. Results</title>
      <sec id="sec-3-1">
        <title>The results section includes the results obtained on the UNET Model train and test data, Visualizations using the GradCam Algorithm, and Results obtained on survival predictions data, and its explanations using SHAP.</title>
        <sec id="sec-3-1-1">
          <title>5.1. Results on UNET Model</title>
          <p>To interpret the model predictions, we use the GradCam
technique. Our GradCam visualization function builds
the gradient model using Unet model inputs, the last
convolution layer of the model, and model outputs. The
gradient model is then provided with a test image for which
it computes the gradients of the output segmentation
relating to the last convolution layer. These gradients are
then used to compute the Heatmap, By visualizing the
heatmap generated by Grad-CAM, we can gain insights
into which parts of the input image are most important
for the interpretable model to make its segmentation.
Figure 5 shows the original and the GradCam heatmaps
generated on that MRI Image by the model, we can see
that the model focuses more on the Tumour area to
predict the correct segmentation mask.</p>
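        <p>For completeness, the per-class dice scores reported for the necrotic, edema, and enhancing regions can be computed with a sketch like the following; this is a standard dice formulation rather than our verbatim metric code.</p>
        <preformat>
import numpy as np

def dice_coefficient(y_true, y_pred, class_idx, eps=1e-6):
    """Dice score for one class, given a one-hot ground-truth mask and
    softmax predictions, both of shape (H, W, n_classes)."""
    t = y_true[..., class_idx].astype(bool)
    p = y_pred[..., class_idx] > 0.5
    inter = np.logical_and(t, p).sum()
    return (2 * inter + eps) / (t.sum() + p.sum() + eps)

# Example: scores per tumour class (1=necrotic, 2=edema, 3=enhancing).
# scores = {c: dice_coefficient(mask, pred, c) for c in (1, 2, 3)}
        </preformat>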
        </sec>
        <sec id="sec-3-1-2">
          <title>5.3. Patient Survival Prediction</title>
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>For predicting patient survival various ML algorithms</title>
        <p>were used. Initially, we used a Random Forest Classifier
with 3 trees to predict the extent of survival. The
survival extent was categorized into three categories, small,
medium, or long. For further experiments, we used the
KNN classifier. Next, we used a Support Vector classifier
on the same data in the context of getting better accuracy
scores. Lastly, we experimented by training and testing
the model using a Multi-Layer Perceptron (MLP)
classiifer. The results of all those algorithms are included in
the Table 1.</p>
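        <p>For reproducibility, a compact sketch of the classifier comparison behind Table 1 is given below. It assumes the features X and labels y prepared as in the sketch of Section 4.3; hyperparameters beyond the 3-tree forest are scikit-learn defaults.</p>
        <preformat>
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# X, y: survival features and small/medium/long labels (see Section 4.3).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

models = {
    "Random Forest": RandomForestClassifier(n_estimators=3),  # 3 trees
    "KNN": KNeighborsClassifier(),
    "SVC": SVC(),
    "MLP": MLPClassifier(max_iter=1000),
}
for name, clf in models.items():
    clf.fit(X_train, y_train)
    print(f"{name}: {accuracy_score(y_test, clf.predict(X_test)):.3f}")
        </preformat>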
      </sec>
      <sec id="sec-3-3">
        <title>Our model was then validated on the test images to</title>
        <p>visualize the output segmentations made by the model.</p>
        <p>
          Figure 3 and figure 4 shows the results of the model. It
perfectly segments all three classes namely
"Neurotic/core", "Edema", and "Enhancing". The area comprising
the tumour was perfectly identified by the model hence
giving us perfect segmentations.
understand feature importance and model behavior. The
SHAP Tree Explainer is a method designed for
interpreting tree-based machine learning models, such as decision
trees or random forests. It computes the Shapley val- 6. Conclusion and Future Works
ues by approximating the model with a set of additive
tree-based models, enabling the attribution of feature con- Brain tumor segmentation through machine learning has
tributions to individual predictions made by tree models. significantly assisted medical professionals in eficiently
Figure 6 and Figure 7 display the results of our SHAP locating and resecting tumors. Various techniques have
explanations. been explored to accurately segment tumors from MRI
imAs we can notice from the above table, the results are
not quite promising, with KNN and Random Forest being
slightly better than our other experimented models. In
future work, we would try to achieve better scores by
trying various ensembles of models on the data.
ages, including the utilization of the UNET Model in this
project. Recent advancements in semantic segmentation
have introduced several notable models: Mask R-CNN:
This CNN architecture extends the Faster R-CNN object
detection model to include a mask prediction branch,
allowing it to perform object detection and instance
segmentation simultaneously. DeepLab V3+: Designed for
semantic segmentation of images, DeepLab V3+ employs
dilated convolution to capture multi-scale context
without increasing the number of parameters. PSPNet:
Utilizing a pyramid pooling module, PSPNet captures global
context at multiple scales, facilitating accurate
predictions for objects of various sizes [
          <xref ref-type="bibr" rid="ref18 ref19">18, 19</xref>
          ]. FCN: Fully
Convolutional Networks perform dense pixel-wise
prediction of image labels, accommodating input images of
arbitrary size and producing output images of the same
size with predicted labels for each pixel [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ]. Segment
Anything Model (SAM): Facebook’s SAM, an open-source
state-of-the-art computer vision model, is designed for
image segmentation tasks [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ]. These models, alongside
UNET, have showcased state-of-the-art results in
segmentation tasks. Our objective is to collaborate with these
models on our data and visualize their outcomes on the
BRATS2020 dataset.
        </p>
        <p>In addition to tumor segmentation, we aimed to
enhance the interpretability of the UNET model by
employing GradCam. However, with advancements in the
eXplainable Artificial Intelligence (XAI) field, several
other visual interpretation techniques have emerged. We
plan to explore these techniques, including GradCam++,
SmoothGradCam++, Guided GradCam, and Score-CAM,
to provide more precise and insightful model
interpretations.</p>
        <p>Moreover, in the realm of patient survival prediction,
current models exhibit low accuracy and struggle with
generalization. To address this, our future work will
involve experimenting with sequential neural network
models to achieve better results. Additionally, we will
focus on tuning hyperparameters and exploring
diferent parameter sets to improve model performance on
both training and test data. These eforts aim to enhance
the accuracy and reliability of patient survival
predictions, thus advancing the impact of medical AI in clinical
settings.</p>
        <p>1–8</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>H.</given-names>
            <surname>Saleem</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. R.</given-names>
            <surname>Shahid</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Raza</surname>
          </string-name>
          ,
          <article-title>Visual interpretability in 3d brain tumor segmentation network</article-title>
          ,
          <source>Computers in Biology and Medicine</source>
          <volume>133</volume>
          (
          <year>2021</year>
          )
          <article-title>104410</article-title>
          . URL: https://www.sciencedirect.com/science/article/pii/ S0010482521002043. doi:https://doi.org/10. 1016/j.compbiomed.
          <year>2021</year>
          .
          <volume>104410</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>F.</given-names>
            <surname>Yan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Xia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Xiao</surname>
          </string-name>
          ,
          <article-title>An explainable brain tumor detection framework for mri analysis</article-title>
          ,
          <source>Applied Sciences</source>
          <volume>13</volume>
          (
          <year>2023</year>
          ). URL: https://www.mdpi.com/2076-3417/13/6/3438. doi:
          <volume>10</volume>
          .3390/app13063438.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>G.</given-names>
            <surname>De Magistris</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Caprari</surname>
          </string-name>
          , G. Castro,
          <string-name>
            <given-names>S.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Iocchi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Nardi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <article-title>Vision-based holistic scene understanding for context-aware humanrobot interaction 13196 LNAI (</article-title>
          <year>2022</year>
          )
          <fpage>310</fpage>
          -
          <lpage>325</lpage>
          . doi:
          <volume>10</volume>
          .1007/978-3-
          <fpage>031</fpage>
          -08421-8_
          <fpage>21</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>G.</given-names>
            <surname>Borowik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Woźniak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Fornaia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Giunta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Pappalardo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Tramontana</surname>
          </string-name>
          ,
          <article-title>A software architecture assisting workflow executions on cloud resources</article-title>
          ,
          <source>International Journal of Electronics and Telecommunications</source>
          <volume>61</volume>
          (
          <year>2015</year>
          )
          <fpage>17</fpage>
          -
          <lpage>23</lpage>
          . doi:
          <volume>10</volume>
          .1515/eletel-2015-0002.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>G.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Ye</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Xia</surname>
          </string-name>
          ,
          <article-title>Unbox the black-box for the medical explainable ai via multi-modal and multicentre data fusion: A mini-review, two showcases and beyond</article-title>
          ,
          <source>Information Fusion</source>
          <volume>77</volume>
          (
          <year>2022</year>
          )
          <fpage>29</fpage>
          -
          <lpage>52</lpage>
          . URL: https://www.sciencedirect.com/science/ article/pii/S1566253521001597. doi:https://doi. org/10.1016/j.inffus.
          <year>2021</year>
          .
          <volume>07</volume>
          .016.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>V.</given-names>
            <surname>Ponzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Bianco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Wajda</surname>
          </string-name>
          ,
          <article-title>Psychoeducative social robots for an healthier lifestyle using artificial intelligence: a case-study</article-title>
          , volume
          <volume>3118</volume>
          ,
          <year>2021</year>
          , pp.
          <fpage>26</fpage>
          -
          <lpage>33</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S.</given-names>
            <surname>Khawaldeh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Pervaiz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rafiq</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. S.</given-names>
            <surname>Alkhawaldeh</surname>
          </string-name>
          ,
          <article-title>Noninvasive grading of glioma tumor using magnetic resonance imaging with convolutional neural networks</article-title>
          ,
          <source>Applied Sciences</source>
          <volume>8</volume>
          (
          <year>2018</year>
          ). URL: https://www.mdpi.com/2076-3417/8/1/ 27. doi:
          <volume>10</volume>
          .3390/app8010027.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>R.</given-names>
            <surname>Ranjbarzadeh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. Bagherian</given-names>
            <surname>Kasgari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. Jafarzadeh</given-names>
            <surname>Ghoushchi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Anari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Naseri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bendechache</surname>
          </string-name>
          ,
          <article-title>Brain tumor segmentation based on deep learning and an attention mechanism using mri multi-modalities brain images</article-title>
          ., Scientific reports (
          <year>2021</year>
          ). URL: http://hdl.handle.net/10147/ 631439. doi:
          <volume>10</volume>
          .1038/s41598-021-90428-8.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>J.</given-names>
            <surname>Nodirov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. B.</given-names>
            <surname>Abdusalomov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. K.</given-names>
            <surname>Whangbo</surname>
          </string-name>
          ,
          <article-title>Attention 3d u-net with multiple skip connections for segmentation of brain tumor images</article-title>
          ,
          <source>Sensors</source>
          <volume>22</volume>
          (
          <year>2022</year>
          ). URL: https://www.mdpi.com/1424-8220/22/ 17/6501. doi:
          <volume>10</volume>
          .3390/s22176501.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>C.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Barnett</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Su</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Rudin</surname>
          </string-name>
          ,
          <article-title>This looks like that: deep learning for interpretable image recognition</article-title>
          , CoRR abs/
          <year>1806</year>
          .10574 (
          <year>2018</year>
          ). URL: http: //arxiv.org/abs/
          <year>1806</year>
          .10574. arXiv:
          <year>1806</year>
          .10574.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>A. J.</given-names>
            <surname>Barnett</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. R.</given-names>
            <surname>Schwartz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Tao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Ren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. Y.</given-names>
            <surname>Lo</surname>
          </string-name>
          ,
          <string-name>
            <surname>C.</surname>
          </string-name>
          <article-title>Rudin, IAIA-BL: A case-based interpretable deep learning model for classification of mass lesions in digital mammography</article-title>
          ,
          <source>CoRR abs/2103</source>
          .12308 (
          <year>2021</year>
          ). URL: https://arxiv.org/abs/ 2103.12308. arXiv:
          <volume>2103</volume>
          .
          <fpage>12308</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>R. R.</given-names>
            <surname>Selvaraju</surname>
          </string-name>
          ,
          <string-name>
            <surname>A. Das</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <string-name>
            <surname>Vedantam</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Cogswell</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Parikh</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Batra</surname>
          </string-name>
          ,
          <article-title>Grad-cam: Why did you say that</article-title>
          ?,
          <year>2017</year>
          . arXiv:
          <volume>1611</volume>
          .
          <fpage>07450</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>M.</given-names>
            <surname>Eder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Moser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Holzinger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>JeanQuartier</surname>
          </string-name>
          ,
          <string-name>
            <surname>F. Jeanquartier,</surname>
          </string-name>
          <article-title>Interpretable machine learning with brain image and survival data</article-title>
          ,
          <source>BioMedInformatics</source>
          <volume>2</volume>
          (
          <year>2022</year>
          )
          <fpage>492</fpage>
          -
          <lpage>510</lpage>
          . URL: https: //www.mdpi.com/2673-7426/2/3/31. doi:
          <volume>10</volume>
          .3390/ biomedinformatics2030031.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>O.</given-names>
            <surname>Ronneberger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Fischer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Brox</surname>
          </string-name>
          , U-net:
          <article-title>Convolutional networks for biomedical image segmentation</article-title>
          ,
          <source>CoRR abs/1505</source>
          .04597 (
          <year>2015</year>
          ). URL: http: //arxiv.org/abs/1505.04597. arXiv:
          <volume>1505</volume>
          .
          <fpage>04597</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Lundberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <article-title>A unified approach to interpreting model predictions</article-title>
          ,
          <source>CoRR abs/1705</source>
          .07874 (
          <year>2017</year>
          ). URL: http://arxiv.org/abs/ 1705.07874. arXiv:
          <volume>1705</volume>
          .
          <fpage>07874</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>S.</given-names>
            <surname>Jadon</surname>
          </string-name>
          ,
          <article-title>A survey of loss functions for semantic segmentation</article-title>
          ,
          <source>in: 2020 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB)</source>
          , IEEE,
          <year>2020</year>
          . URL: https://doi.org/10.1109%
          <fpage>2Fcibcb48159</fpage>
          .
          <year>2020</year>
          .
          <volume>9277638</volume>
          . doi:
          <volume>10</volume>
          .1109/cibcb48159.
          <year>2020</year>
          .
          <volume>9277638</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>A.</given-names>
            <surname>Alfarano</surname>
          </string-name>
          , G. De Magistris,
          <string-name>
            <given-names>L.</given-names>
            <surname>Mongelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Starczewski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <article-title>A novel convmixer transformer based architecture for violent behavior detection 14126 LNAI (</article-title>
          <year>2023</year>
          )
          <fpage>3</fpage>
          -
          <lpage>16</lpage>
          . doi:
          <volume>10</volume>
          .1007/ 978-3-
          <fpage>031</fpage>
          -42508-
          <issue>0</issue>
          _
          <fpage>1</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>H.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Shi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Qi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Jia</surname>
          </string-name>
          , Pyramid scene parsing network,
          <year>2017</year>
          . arXiv:
          <volume>1612</volume>
          .
          <fpage>01105</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>V.</given-names>
            <surname>Marcotrigiano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. D.</given-names>
            <surname>Stingi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Fregnan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Magarelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Pasquale</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. B.</given-names>
            <surname>Orsi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. T.</given-names>
            <surname>Montagna</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napoli</surname>
          </string-name>
          ,
          <article-title>An integrated control plan in primary schools: Results of a field investigation on nutritional and hygienic features in the apulia region (southern italy)</article-title>
          ,
          <source>Nutrients</source>
          <volume>13</volume>
          (
          <year>2021</year>
          ). doi:
          <volume>10</volume>
          .3390/nu13093006.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>J.</given-names>
            <surname>Long</surname>
          </string-name>
          , E. Shelhamer, T. Darrell,
          <article-title>Fully convolutional networks for semantic segmentation</article-title>
          ,
          <year>2015</year>
          . arXiv:
          <volume>1411</volume>
          .
          <fpage>4038</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>A.</given-names>
            <surname>Kirillov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Mintun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ravi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Mao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Rolland</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Gustafson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Xiao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Whitehead</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. C.</given-names>
            <surname>Berg</surname>
          </string-name>
          , W.- Y. Lo,
          <string-name>
            <given-names>P.</given-names>
            <surname>Dollár</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Girshick</surname>
          </string-name>
          , Segment anything,
          <year>2023</year>
          . arXiv:
          <volume>2304</volume>
          .
          <fpage>02643</fpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>