<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
<journal-title>These authors contributed equally. Contact: contact@soumick.com (S. Chatterjee); https://www.soumick.com/ (S. Chatterjee)</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Unboxing the black-box of deep learning based reconstruction of undersampled MRIs</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Soumick Chatterjee</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff4">4</xref>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Arnab Das</string-name>
          <xref ref-type="aff" rid="aff5">5</xref>
          <xref ref-type="aff" rid="aff6">6</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Rupali Khatun</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
          <xref ref-type="aff" rid="aff7">7</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andreas Nürnberger</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Centre for Behavioural Brain Sciences</institution>
          ,
          <addr-line>Magdeburg</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Comprehensive Cancer Centre Erlangen-EMN</institution>
          ,
          <addr-line>Erlangen</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Data and Knowledge Engineering Group, Otto von Guericke University Magdeburg</institution>
          ,
          <addr-line>Magdeburg</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Department of Radiation Oncology, Universitätsklinikum Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>Faculty of Computer Science, Otto von Guericke University Magdeburg</institution>
          ,
          <addr-line>Magdeburg</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff5">
          <label>5</label>
          <institution>Genomics Research Centre, Human Technopole</institution>
          ,
          <addr-line>Milan</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff6">
          <label>6</label>
          <institution>German Research Centre for Artificial Intelligence</institution>
          ,
          <addr-line>Kaiserslautern</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff7">
          <label>7</label>
          <institution>Translational Radiobiology, Department of Radiation Oncology</institution>
          ,
          <addr-line>Universitätsklinikum Erlangen, Erlangen</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>000</volume>
      <fpage>0</fpage>
      <lpage>0001</lpage>
      <abstract>
        <p>Deep learning has emerged as a very important area of research and has shown immense potential in solving different kinds of problems, including in the medical field. For tasks like undersampled MRI reconstruction - the process of speeding up MRI acquisition with the help of undersampling - deep learning has shown its dominance over the years. But one of the major problems with deep learning is trust: the complex reasoning done by these models appears black-box to the users. Therefore, to build trust and better acceptability, it is important to open up the black-box nature of these models. For classification models, several approaches have been proposed. Nevertheless, for models dealing with inverse problems, like the reconstruction of undersampled MRIs, this is more challenging, as the output of the model has the same number of pixels as the input, making the interpretability of such models more complex. This research explores different methods to understand the working mechanism of a deep learning model for the task of undersampled MRI reconstruction.</p>
      </abstract>
      <kwd-group>
        <kwd>Deep Learning</kwd>
        <kwd>Blackbox</kwd>
        <kwd>Inverse Problem</kwd>
        <kwd>Interpretability</kwd>
        <kwd>Explainability</kwd>
        <kwd>MRI</kwd>
        <kwd>MR Image Reconstruction</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Deep learning models have proven very successful for a wide variety of tasks. Nowadays,
they are applied in fields ranging from energy, consumer studies, and
sociocultural linguistics to critical domains such as autonomous driving, medical image analysis,
and many more. The decisions made by these models directly or indirectly affect human life.
The main reasons for the success of these deep learning models are the availability of digitised
data and their power to find complex patterns in it - to learn to perform the trained task. For
computer vision-related tasks, which are often very complex, deep models with hundreds of
thousands of parameters are employed. Such models can be understood as parameterised complex
function estimators that map input domain data to decision domains of classification [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ],
segmentation [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], regression, image reconstruction [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], de-noising [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], and many more. But
from an external perspective, a deep learning model, making decisions after learning from the
given training data, may appear as a "black box" with no direct accountability for the decisions
it makes. This is often true, since these models typically do not provide any reason for their
predictions. Hence, for critical domains such as biomedical applications, where the slightest
of mistakes may have grave effects and can even be fatal, the use of these methods becomes
a widely debated topic, as it has been seen in the past that a model giving the best accuracy
during testing might not be using the correct reasoning to arrive at its decisions [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. This
not only increases the chances of failure in production but also makes it difficult to
trust such methods. So, for better acceptability and applicability, opening up the black-box nature
of these models is the need of the hour. This will build trust in the decisions made by deep
learning models, as predictions will be better grounded and explained.
      </p>
      <p>
        Recent years have seen an increase in different interpretability and explainability techniques
that try to understand the working mechanism of these complex models. Captum [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] is one of
the many available packages that enable the application of post hoc methods to already trained deep learning
models to help better understand them. The primary focus of Captum,
as well as of most existing methods, is on classification models. TorchEsegeta [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] is a unified
pipeline, including Captum and several other methods, which enables developers and decision
makers (e.g., doctors) to apply several post hoc interpretability and explainability methods to
already trained classification models. This pipeline also extends these methods to explain deep
segmentation models. However, there has not been any significant research on reconstruction
models.
      </p>
      <p>
        Image reconstruction is another task where deep learning models have demonstrated their
superiority. An example of image reconstruction in the field of medical imaging is the task
of undersampled image reconstruction. Magnetic resonance imaging (MRI) is an inherently
slow process - making it difficult to use in real-time applications. Undersampling, a process
of ignoring parts of the data [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], can make image acquisition faster, but can compromise
image quality (e.g. loss of resolution, presence of artefacts). Deep learning models, such
as UNet [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] and ReconResNet (including the NCC1701 pipeline) [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], have shown superior
performance over non-deep-learning-based techniques for undersampled MRI reconstruction,
such as compressed sensing [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. There are also techniques that aim to combine deep learning
models with compressed sensing [
        <xref ref-type="bibr" rid="ref11 ref12">11, 12</xref>
        ]. All these models primarily aim to reduce artefacts
in the given input undersampled MRIs - learning by comparing their outputs against the
corresponding fully sampled MRIs. Similarly, reconstruction models like ShuffleUNet [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] and
DDoS-UNet [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] attempt to super-resolve (i.e., improve the resolution of) the input
low-resolution undersampled MRIs to the resolution of the corresponding high-resolution fully
sampled MRIs. Although numerous deep learning methods have been proposed, in addition to
uncertainty quantification [
        <xref ref-type="bibr" rid="ref15 ref16">15, 16</xref>
        ], not much exploration has been done from the perspective
of interpretability and explainability. Hence, the objective of this research is to find ways to
understand the inner working mechanism of reconstruction models for undersampled MRI
reconstruction, with the help of different analyses and visualisations, to try to interpret and
explain such models.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Methodology</title>
      <p>
        In the literature, interpretability and explainability methods are grouped under different rubrics,
for example, local vs global, model-dependent vs model-agnostic, intrinsic vs post hoc, etc. [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ].
This work proposes several methods for model understanding, uncertainty estimation, and
interpretability in a post hoc fashion. The methods used are discussed in three subsections
accordingly. As input, the reconstruction models are provided with undersampled brain images, from
which they predict fully sampled images; all models were trained in a supervised
manner.
      </p>
      <sec id="sec-2-1">
        <title>2.1. Uncertainty estimation</title>
        <sec id="sec-2-1-1">
          <title>2.1.1. Model weight perturbation</title>
          <p>
            Epistemic uncertainty gives rise to parameter uncertainty in trained models, meaning that the
parameters can take several values for regions of the input data space where no or very
few data points were presented during training [
            <xref ref-type="bibr" rid="ref18">18</xref>
            ]. To leverage this fact, the model weights were perturbed
by adding small random Gaussian noise in each run [
            <xref ref-type="bibr" rid="ref19">19</xref>
            ]. This was applied several times
for the same input image. The pixel-wise output variance was then calculated over all
the runs, and an uncertainty heatmap was produced from it.
          </p>
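<p>The weight-perturbation procedure described above can be sketched as follows. This is a minimal NumPy illustration with a toy linear stand-in for the trained reconstruction model; the function names, noise scale, and number of runs are illustrative, not taken from the original work.</p>

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a trained reconstruction model: a fixed linear map.
W = rng.standard_normal((64, 64))

def model(x, weights):
    return weights @ x

def perturbation_uncertainty(x, weights, sigma=0.01, n_runs=30):
    """Run the model several times with Gaussian-perturbed weights and
    return the pixel-wise variance of the outputs as an uncertainty map."""
    outputs = []
    for _ in range(n_runs):
        noisy_w = weights + sigma * rng.standard_normal(weights.shape)
        outputs.append(model(x, noisy_w))
    return np.var(np.stack(outputs), axis=0)

x = rng.standard_normal(64)
heatmap = perturbation_uncertainty(x, W)
```

<p>In the actual experiments, the heatmap would be rendered per pixel of the reconstructed MRI rather than per element of a toy vector.</p>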
        </sec>
        <sec id="sec-2-1-2">
          <title>2.1.2. Monte-Carlo Simulation</title>
          <p>
            This is a popular method for estimating the parameter uncertainty of a model's predictions,
referred to as epistemic uncertainty in the literature. The dropout layer is often used in deep
learning models as a Bayesian approximation of a model ensemble and as a regulariser to tackle
overfitting [
            <xref ref-type="bibr" rid="ref20">20</xref>
            ]. But, as standard practice, dropout is only enabled at training time
and disabled at test/inference time to output a deterministic and reproducible prediction [
            <xref ref-type="bibr" rid="ref21">21</xref>
            ].
Here, dropout was enabled at test time and the model was run several times on the same input
image. The pixel-wise variance was then calculated over all the runs and an uncertainty
heatmap was produced from it. Several additional dropout layers were also introduced into the model at test
time and the experiments were repeated.
          </p>
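<p>Monte-Carlo dropout at inference time can be sketched as follows. This is a toy two-layer NumPy model, not the authors' network; the dropout rate and number of runs are illustrative.</p>

```python
import numpy as np

rng = np.random.default_rng(42)

W1 = rng.standard_normal((32, 16))
W2 = rng.standard_normal((16, 32))

def model_with_dropout(x, p=0.2):
    """Toy two-layer model with dropout kept ACTIVE at inference time
    (Monte-Carlo dropout), unlike the usual deterministic test mode."""
    h = np.maximum(W1.T @ x, 0.0)           # hidden layer + ReLU
    mask = rng.random(h.shape) >= p          # random dropout mask per run
    h = np.where(mask, h / (1.0 - p), 0.0)   # inverted dropout scaling
    return W2.T @ h

def mc_dropout_uncertainty(x, n_runs=50):
    """Run the stochastic model repeatedly; the pixel-wise variance over
    runs serves as the uncertainty heatmap."""
    outputs = np.stack([model_with_dropout(x) for _ in range(n_runs)])
    return outputs.mean(axis=0), outputs.var(axis=0)

x = rng.standard_normal(32)
mean_pred, heatmap = mc_dropout_uncertainty(x)
```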
        </sec>
        <sec id="sec-2-1-3">
          <title>2.1.3. Subject Level Uncertainty score</title>
          <p>For patch-based segmentation networks, [22] proposed a way to estimate the uncertainty at the
subject level. This method estimates a multivariate Gaussian distribution over average pooled
latent space activations from training patches and then calculates the Mahalanobis distance for
test patches. Then, from these distances, it calculates an uncertainty mask for the entire volume
and finally provides a subject-level uncertainty score by averaging the mask over all voxels.</p>
        </sec>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Model Understanding</title>
        <sec id="sec-2-2-1">
          <title>2.2.1. Latent Space Exploration</title>
          <p>Medical domain input data, such as MRI volumes, is extremely high-dimensional, so directly
estimating the prior data distribution is not feasible. Deep learning architectures
that incorporate an encoder-decoder structure learn a representation of the input data in their
latent space, which is comparatively low-dimensional. Exploring the latent space is often a go-to
method for diving into the model’s understanding of the data. In this work, various latent space
exploration experiments [23, 24] on reconstruction models have been performed.
• The simplest technique is to directly visualise the latent space activation/feature maps.
• In the second method, 1000 images were passed through the model and their latent
space representations were captured. Then, t-Distributed Stochastic Neighbour Embedding (tSNE)
was performed to project the high-dimensional data into a 2D map for visualisation.
The same was repeated for training, validation, and test set images to identify any
distribution shift.
• Upon performing tSNE and visualising the result, grouping structures appeared.
Therefore, clustering was performed on the 2D representation, and representative
input images were identified for each cluster by backtracking from the 2D representation to
the input data. The goal is to visualise how different latent representations correspond to
different input images.
• Latent space walk: another popular approach, often performed by the deep learning
community, to understand whether the manifold learnt by the model in its latent space is
continuous and fills the entire space or not. This can also help adapt the architecture of
the model. Two different input images, in this case MRIs, were chosen randomly from
the test set and their latent representations were obtained from the model. Then, the latent
vectors were linearly interpolated with uniform steps to generate intermediate latent
representations. Finally, these representations were decoded and visualised.</p>
        </sec>
        <sec id="sec-2-2-2">
          <title>2.2.2. Noise Tolerance Estimation</title>
          <p>This experiment shows the robustness of the model under test against possible noise, specifically
for the decoder part of the model [25].</p>
          <p>• In the first approach, Gaussian noise of different magnitudes was added to the
latent representation, and the noisy latent was then decoded. Finally, the structural
similarity index (SSIM) [26] between the reconstructed output and the ground truth was calculated.
After the experiment, a 2D graph of the SSIM value against the noise magnitude was plotted
to get a notion of how noise-tolerant the decoder is.
• In the second approach, some of the latent feature maps were randomly zeroed in different
proportions, and the resulting representation was then reconstructed. The reconstructed image
and the ground truth were compared by calculating the SSIM value. It is to be noted that
the latent feature maps are suppressed randomly, and when unimportant feature maps
get suppressed, no effective change occurs in the reconstructed image obtained
by decoding the latent. However, this may also give a false notion of noise robustness.
To mitigate this problem, the same method was applied several times and the median
image of the reconstructed images was selected.</p>
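<p>The first approach can be sketched as follows. Note the heavy simplifications: the "decoder" is an identity placeholder, and the SSIM is computed globally over the whole image (a single-window simplification of the windowed SSIM [26] that libraries such as scikit-image provide); noise levels and image sizes are illustrative.</p>

```python
import numpy as np

rng = np.random.default_rng(7)

def global_ssim(x, y, c1=1e-4, c2=9e-4):
    """Single-window (global) SSIM - a simplified stand-in for windowed SSIM."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cxy = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cxy + c2)) / ((mx**2 + my**2 + c1) * (vx + vy + c2))

def decoder(h):
    return h  # placeholder: an identity "decoder", just for the sketch

ground_truth = rng.random((32, 32))
latent = ground_truth.copy()  # pretend this is the clean latent code

ssims = []
for beta in [0.0, 0.05, 0.1, 0.2, 0.4]:
    noisy_latent = latent + beta * rng.standard_normal(latent.shape)
    recon = decoder(noisy_latent)
    ssims.append(global_ssim(recon, ground_truth))
# `ssims` would then be plotted against the noise magnitudes (beta values).
```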
        </sec>
        <sec id="sec-2-2-3">
          <title>2.2.3. Probabilistic Model Understanding Approximation</title>
          <p>Image reconstruction models with bottleneck layers can be considered analogous to
autoencoders (AE). The bottleneck layer holds the latent space representation of a given
input learnt by the model. In analogy to an AE, the layers up to the bottleneck layer in the
model can be considered the encoder, and the rest of the layers the decoder. Reconstruction
models are capable of removing unintended artefacts that occur due to violation of the
Nyquist-Shannon sampling theorem [27, 28] when undersampling the MRI slices. These models
can therefore be considered analogous to energy-based generative models, and it can be expected
that, after training, the model has its own understanding of the data distribution. With:
• Latent space representation: h
• Input data: x
• Reconstructed data: x'
• Encoder and decoder parameters: θ and φ
• True data distribution: p(x) or p_data(x)
• Model's data distribution: p_model(x) or p_θ(x)
p_model(x) can be estimated using repeated Gibbs updates, sampling alternately from p(h|x) and
p(x|h). As this is a directed model, a single update means just one pass through the
encoder and the decoder. But the problem is that, as with an AE, there is no mechanism to obtain
an initial h to burn in the chain, so two solutions are proposed in this work, inspired by
naive Markov Chain Monte Carlo (MCMC) and Contrastive Divergence (CD) [29]. For naive
MCMC, the initial h is sampled from a Normal distribution with zero mean and unit standard
deviation. For the CD-inspired solution, a sample from the training data is
passed through the encoder once to obtain the initial h.</p>
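<p>The two chain-initialisation strategies can be sketched as follows, with toy deterministic linear maps standing in for the trained encoder/decoder (so each "sample" from p(h|x) or p(x|h) is just one forward pass); all shapes and step counts are illustrative.</p>

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy linear encoder/decoder standing in for the trained reconstruction model.
We = rng.standard_normal((16, 64)) * 0.1
Wd = rng.standard_normal((64, 16)) * 0.1

encode = lambda x: We @ x   # stand-in for a sample from p(h|x)
decode = lambda h: Wd @ h   # stand-in for a sample from p(x|h)

def gibbs_chain(h0, n_steps):
    """Repeated Gibbs-like updates; one step = decode then re-encode."""
    h = h0
    for _ in range(n_steps):
        x = decode(h)
        h = encode(x)
    return decode(h)

# Naive MCMC: burn the chain in from a standard Normal latent.
x_mcmc = gibbs_chain(rng.standard_normal(16), n_steps=10)

# CD-inspired: initialise h by encoding a training sample once.
train_sample = rng.standard_normal(64)
x_cd = gibbs_chain(encode(train_sample), n_steps=1)
```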
        </sec>
        <sec id="sec-2-2-4">
          <title>2.2.4. Input Anomaly test</title>
          <p>The goal of the input anomaly test is to verify the robustness of the model against an additional
anomaly in the input image at inference time that it has not seen at training time. This test is
inspired by the fact that deep learning models might be prone to adversarial noise [30, 31] or might react
differently when encountering anomalies if trained only with non-anomalous data - the
idea behind unsupervised anomaly detection [32]. This work mainly deals with brain image
reconstruction, so the expectation is that the model should be able to perform a
proper reconstruction of the image if a tumour or tumour-like structure is present in the
brain tissue. Brain images without tumours were selected, a tumour-like circular structure
was added to these images, and these modified images were then used as ground truth. The images
were then undersampled and passed through the model to visualise the final reconstructed
image. This experiment was performed with various pixel values for the circular lesion.</p>
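<p>The construction of the synthetic ground truth can be sketched as follows; the image size, lesion position, radius, and intensity values are illustrative, not the settings used in the experiments.</p>

```python
import numpy as np

def add_circular_lesion(image, center, radius, value):
    """Paint a tumour-like circular structure of a given intensity into a
    lesion-free image; the result serves as the new ground truth."""
    yy, xx = np.mgrid[:image.shape[0], :image.shape[1]]
    mask = (yy - center[0]) ** 2 + (xx - center[1]) ** 2 <= radius ** 2
    out = image.copy()
    out[mask] = value
    return out, mask

brain = np.zeros((64, 64))               # stand-in for a lesion-free slice
for value in (0.2, 0.5, 0.9):            # various lesion pixel intensities
    gt, mask = add_circular_lesion(brain, center=(32, 32), radius=5, value=value)
# `gt` would then be undersampled and fed to the model for reconstruction.
```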
        </sec>
        <sec id="sec-2-2-5">
          <title>2.2.5. Targeted Activation Maximisation</title>
          <p>Activation maximisation is a technique for classification models that shows which types of input
images activate a particular output neuron the most [33, 34]. This idea has been extended
to the reconstruction model here. For a classification task, there is a final output neuron (in
the case of binary classification), and this is the one to which activation maximisation is
applied: the activation of the output neuron is repeatedly taken as
the loss, and gradient ascent is performed on the input to maximise the
activation of this neuron as much as possible. The selection of this output neuron thus gives
the network a hint about which "concept" is to be maximised. The problem with
a reconstruction network is that selecting a single output neuron is not meaningful:
each pixel in the reconstruction output is a regressed value, so selecting just one output
pixel for activation maximisation does not provide a meaningful concept. To give the network
a hint about the concept to be maximised, a group of pixels from the reconstruction
output can be chosen rather than a single one. But this raises the question of which pixels
to choose. For this, one can rely on the ground truth. A fully sampled
ground-truth image was used to generate two binary masks: the first mask indicates
which pixels have a value greater than a threshold, and the second mask indicates which pixels
have values lower than the threshold in the ground truth. These masks can be used as hints for
the activation maximisation of the network. The computation starts from random
Gaussian noise; gradient ascent is then performed for the pixels in the first mask, and
gradient descent for the pixels in the second mask. This is achieved
by multiplying each binary mask by the reconstructed output at each step before summing. This
work also showed the different maximised concepts obtained for different threshold values.</p>
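<p>The masked gradient ascent/descent can be sketched as follows. To keep the example self-contained, a linear map replaces the trained network (so the gradient is analytic rather than obtained via autograd); the threshold, learning rate, and sizes are illustrative.</p>

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy linear "reconstruction model": out = W @ x, so d(out)/d(x) = W.T.
n = 64
W = rng.standard_normal((n, n)) * 0.1

ground_truth = rng.random(n)
threshold = 0.5
mask_hi = (ground_truth > threshold).astype(float)  # pixels to push up (ascent)
mask_lo = 1.0 - mask_hi                             # pixels to push down (descent)

x = rng.standard_normal(n) * 0.01  # start from small random noise
lr = 0.1
for _ in range(100):
    # Objective: sum(mask_hi * out) - sum(mask_lo * out); ascend on x.
    grad = W.T @ (mask_hi - mask_lo)
    x = x + lr * grad

out = W @ x
score = float(mask_hi @ out - mask_lo @ out)  # should grow over iterations
```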
        </sec>
        <sec id="sec-2-2-6">
          <title>2.2.6. Activation/Reconstruction Comparison</title>
          <p>The primary motivation for this experiment is to capture the model’s response when presented
with out-of-distribution data, which it has not seen before during the training period [35].
This helps to understand the model’s understanding of the data by analysing what sort
of input it can successfully reconstruct and what it cannot. This, in turn, helps to understand
whether the model has learnt any prior knowledge about the structure that it is reconstructing.
As input, the models were presented with one in-distribution brain image, one noise
input, and one completely out-of-distribution flower image, and then the histograms of latent
space values and the reconstructed images were compared.</p>
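<p>One quantity such a histogram comparison exposes is the fraction of inactive latent neurons per input. A toy NumPy sketch (a random ReLU layer stands in for the trained encoder, and a zero vector stands in for a degenerate out-of-distribution input; nothing here is from the original setup):</p>

```python
import numpy as np

rng = np.random.default_rng(11)

W = rng.standard_normal((128, 256))

def latent(x):
    """ReLU latent activations of a toy encoder layer."""
    return np.maximum(W @ x, 0.0)

def dead_fraction(h, eps=1e-6):
    """Fraction of latent neurons that are (near-)zero - the quantity the
    latent-space histograms in this experiment compare across inputs."""
    return float(np.mean(np.abs(h) < eps))

brain_like = rng.standard_normal(256)  # in-distribution stand-in
ood_like = np.zeros(256)               # degenerate out-of-distribution stand-in

frac_brain = dead_fraction(latent(brain_like))
frac_ood = dead_fraction(latent(ood_like))
```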
        </sec>
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Interpretability</title>
        <p>
          The TorchEsegeta project [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] provides more than 40 interpretability methods from the literature
and third-party libraries, catering to classification and segmentation models. As part of the
current research, some of these methods were extended to reconstruction models. This was
achieved with the help of a wrapping mechanism which converts the output of reconstruction
models to be similar to that of classification models. The wrapping mechanism was inspired by
the wrapper proposed in TorchEsegeta [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. It performs class identification by Otsu
thresholding of the reconstruction output and then sums up the pixels for each class. This is
performed in two steps:
        </p>
        <p>a. Normalisation - In this step, the output reconstructed image is normalised by the following
function:</p>
        <p>x_norm = (x - min(x)) / (max(x) - min(x))</p>
        <p>b. Pixel-wise binarisation - Pixel-wise binarisation is performed with the help of Otsu
thresholding:</p>
        <p>b_i = 1 if x_norm,i &gt; th, else 0, where th = otsu(x_norm)</p>
        <p>
          The output of both processes is a tensor with a single value for each class. However, the output
range would not strictly be in the range [0, 1]. As of now, the methods present in TorchEsegeta,
belonging to the two libraries Captum and CNN Visualisation, have been extended and tested
for the reconstruction models.
        </p>
      </sec>
      <sec id="sec-2-4">
        <title>2.4. Experimental Setup</title>
        <p>
          This research analysed the ReconResNet model [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] for the task of undersampled MRI reconstruction.
Following the original article, two different publicly available brain MRI datasets were used for
all experiments with the undersampled MRIs - OASIS [36] and IXI (available online: ). The
MRIs from the datasets were treated as fully sampled ground-truth images and were artificially
undersampled. The model was trained by supplying undersampled images as input (i.e., images
with artefacts), and the predictions were compared with the ground-truth images.
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Results</title>
      <p>This section presents the results obtained using some of the methods discussed in the
methodology section.</p>
      <sec id="sec-3-1">
        <title>3.1. Uncertainty estimation</title>
        <p>The rightmost heatmaps in Figure 1 are generated using the ’hot’ colourmap from the Matplotlib
library, in which black represents the lowest uncertainty and bright yellow represents
a high amount of uncertainty. As the pictures depict, the model is quite certain about the areas
outside of the brain, and hence no noisy undersampling artefact of the input image is
transferred to the reconstructed output image. The most uncertainty arises in the skull and
brain tissues. It can also be seen that the model is quite robust against dropout, but it produces
higher uncertainty when the model weights are perturbed.</p>
        <p>The uncertainty map in Figure 1 clearly shows an increase in uncertainty, and quantitatively
the variance of the maximum uncertainty value also increases.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Model Understanding</title>
        <p>As shown in Figure 3a, the second method of latent space exploration reveals that
three clusters exist in the dimensionality-reduced latent space data. The authors
took three representative samples from the clusters and then backtracked them to the input
space. The lower subplot shows the three different input images corresponding to those three
samples.</p>
        <p>Two latent codes were selected from the latent space, linearly interpolated with uniform steps,
and all the resulting codes were reconstructed. In Figure 3b, one can see how one image slowly
interpolates into another. This experiment helps to understand whether the manifold learnt in
the latent space is continuous and sufficiently covers the whole space or not. As the transition
is quite smooth and the intermediate images are not that blurry, one can say that the model has
learnt a manifold that can cover the latent space sufficiently and continuously.</p>
        <p>[Figure 3 captions: (a) tSNE projection of the latent space. The top left subplot shows the
distribution shift between the training, validation, and test datasets; the top right subplot shows
the outcome of clustering the 2D embeddings. (b) Outcome of the latent space walk experiment.
Figure 4 captions: (a) Reconstructed images. (b) SSIM values.]</p>
        <p>Figure 4a and Figure 4b are from the decoder’s noise tolerance experiment. The top image
shows how the reconstructed image changes depending on the amount of noise added to the
latent space representation. While the top image is a visual representation, the bottom image
shows the result quantitatively. The beta values along the x-axis are the noise levels, and the
y-axis depicts the SSIM values of the reconstructed image against the ground truth. The graph
starts out horizontal up to a certain beta value, which shows the noise robustness of the
model in that region.</p>
        <p>Figure 5 shows the quality (visual and quantitative) of the reconstructed image when randomly
suppressing a certain fraction of the latent feature maps.</p>
        <p>The result of the probabilistic model understanding approximation experiment is depicted in
Figure 6a. The outcome shows different p(x|h) outputs after running different numbers of
CD steps. This is the result of the Varden1D reconstruction model. The same experiment was
also performed for the radial model; Figure 6b shows the same.</p>
        <p>Figure 7 shows the result of the targeted activation maximisation experiment.
The result shows different input images, which maximise the output of the model for
different selected threshold values. It is interesting to see, for different threshold values, how
the activation of the network is maximised for different parts of the brain when the input is mere noise.
The images correspond to threshold values from 0.0 to 0.6, going from left to right and top to
bottom.</p>
        <p>Figure 8a presents the result of the input anomaly test experiment, in which the authors
check the robustness, or the ability of the network to reconstruct a lesion in brain tissue, which the
model has not seen during training. When presented with lesion images of various pixel values,
the model generates plausible results when the pixel value is in a higher range. When
the pixel value of the lesion is similar to the neighbouring brain tissue, the reconstructed pixel
values are underpredicted, and the reconstruction is not that prominent.</p>
        <p>In the activation/reconstruction comparison experiment, it was found that the reconstruction
model failed to reconstruct the out-of-distribution input images. Figure 8b shows the experiment
output.</p>
        <p>[Figure 8 captions: (a) Unseen lesion (anomaly) with different pixel intensity.
(b) In-distribution vs out-of-distribution data.]</p>
        <p>For the flower image in the middle, the model reconstructed the parts that are similar
to the in-distribution brain image. But it made most of the reconstructed pixels zero, following
the distribution of brain images. Furthermore, the histogram shows that for the flower image,
the latent space activation is zero for more neurons compared to the in-distribution data. That
means that most neurons are not activated when presented with the flower image.</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Interpretability</title>
        <p>
          The attribution results of the reconstruction model generated by TorchEsegeta [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] are shown in
Figures 9 and 10. As elaborated in the Methods section, an Otsu-based wrapper has been used
for generating these attributions. For all figures, the positive attribution of the corresponding
methods is overlaid on top of the input images.
        </p>
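The Otsu-based wrapping itself is part of TorchEsegeta; the underlying idea can be sketched in plain NumPy as below. The function names are illustrative, not TorchEsegeta's API.

```python
import numpy as np

def otsu_threshold(values, bins=256):
    """Plain-NumPy Otsu: choose the cut that maximises between-class variance."""
    hist, edges = np.histogram(values, bins=bins)
    hist = hist.astype(float)
    mids = 0.5 * (edges[:-1] + edges[1:])
    w0 = np.cumsum(hist)                        # weight of the "below" class
    w1 = w0[-1] - w0                            # weight of the "above" class
    cs = np.cumsum(hist * mids)
    m0 = cs / np.maximum(w0, 1e-12)             # mean of the "below" class
    m1 = (cs[-1] - cs) / np.maximum(w1, 1e-12)  # mean of the "above" class
    between = w0 * w1 * (m0 - m1) ** 2
    return mids[np.argmax(between[:-1])]

def positive_attribution_mask(attr):
    """Keep only positive attributions above the Otsu cut, for overlaying."""
    pos = np.clip(attr, 0.0, None)
    return pos * (pos >= otsu_threshold(pos[pos > 0]))

rng = np.random.default_rng(0)
mask = positive_attribution_mask(rng.standard_normal((32, 32)))
```

Thresholding with Otsu rather than a fixed cut-off adapts the overlay to each attribution map's own intensity distribution, which is why it suits heterogeneous attribution methods.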
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusion and Future Work</title>
        <p>This research presented several methods for understanding ReconResNet, a deep learning-based
undersampled MRI reconstruction model, and serves as a starting point for exploring
these and other methods for the explainability and interpretability of such models. Here, some
of the proposed methods were applied at a limited scale. In the future, all these methods will be
evaluated in more detail, and a user study involving medical professionals will be conducted
to evaluate the advantage of these methods in terms of trust-building in clinical practice. In
addition, further undersampling techniques for the reconstruction models, different datasets, and
comparisons between different models will also be investigated in the near future. The remaining
interpretability methods in the TorchEsegeta pipeline will also be extended and evaluated for
reconstruction models.
[22] C. Gonzalez, K. Gotkowski, A. Bucher, R. Fischbach, I. Kaltenborn, A. Mukhopadhyay,
Detecting when pre-trained nnu-net models fail silently for covid-19 lung lesion
segmentation, in: International Conference on Medical Image Computing and Computer-Assisted
Intervention, Springer, 2021, pp. 304–314.
[23] L. Fetty, M. Bylund, P. Kuess, G. Heilemann, T. Nyholm, D. Georg, T. Löfstedt, Latent space
manipulation for high-resolution medical image synthesis via the stylegan, Zeitschrift für
Medizinische Physik 30 (2020) 305–314.
[24] C. Qin, S. Wang, C. Chen, W. Bai, D. Rueckert, Generative myocardial motion tracking via
latent space exploration with biomechanics-informed prior, Medical Image Analysis 83
(2023) 102682.
[25] Into the latent space, Nature Machine Intelligence 2 (2020) 151. URL: https://doi.org/
10.1038/s42256-020-0164-7. doi:10.1038/s42256-020-0164-7.
[26] G. P. Renieblas, A. T. Nogués, A. M. González, N. G. León, E. G. Del Castillo, Structural
similarity index family for image quality assessment in radiological images, Journal of
Medical Imaging 4 (2017) 035501.
[27] H. Nyquist, Certain topics in telegraph transmission theory, Transactions of the American</p>
        <p>Institute of Electrical Engineers 47 (1928) 617–644.
[28] C. E. Shannon, Communication in the presence of noise, Proceedings of the IRE 37 (1949)
10–21.
[29] I. J. Goodfellow, Y. Bengio, A. Courville, Deep Learning, MIT Press, Cambridge, MA, USA,
2016. http://www.deeplearningbook.org.
[30] A. Fawzi, S.-M. Moosavi-Dezfooli, P. Frossard, Robustness of classifiers: from adversarial
to random noise, Advances in Neural Information Processing Systems 29 (2016).
[31] T. Y. Liu, Y. Yang, B. Mirzasoleiman, Friendly noise against adversarial noise: a powerful
defense against data poisoning attack, Advances in Neural Information Processing Systems
35 (2022) 11947–11959.
[32] S. Chatterjee, A. Sciarra, M. Dünnwald, P. Tummala, S. K. Agrawal, A. Jauhari, A. Kalra,
S. Oeltze-Jafra, O. Speck, A. Nürnberger, StRegA: Unsupervised anomaly detection in brain
mris using a compact context-encoding variational autoencoder, Computers in Biology
and Medicine 149 (2022) 106093.
[33] K. Simonyan, A. Vedaldi, A. Zisserman, Deep inside convolutional networks: Visualising
image classification models and saliency maps, 2014. arXiv:1312.6034.
[34] D. Erhan, Y. Bengio, A. Courville, P. Vincent, Visualizing higher-layer features of a deep
network, Technical Report, Université de Montréal (2009).
[35] J. Linmans, S. Elfwing, J. van der Laak, G. Litjens, Predictive uncertainty estimation for
out-of-distribution detection in digital pathology, Medical Image Analysis 83 (2023) 102655.
[36] D. S. Marcus, T. H. Wang, J. Parker, J. G. Csernansky, J. C. Morris, R. L. Buckner, Open
access series of imaging studies (oasis): cross-sectional mri data in young, middle aged,
nondemented, and demented older adults, Journal of Cognitive Neuroscience 19 (2007)
1498–1507.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>L.</given-names>
            <surname>Mou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Ghamisi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X. X.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <article-title>Unsupervised spectral-spatial feature learning via deep residual conv-deconv network for hyperspectral image classification</article-title>
          ,
          <source>IEEE Transactions on Geoscience and Remote Sensing</source>
          <volume>56</volume>
          (
          <year>2017</year>
          )
          <fpage>391</fpage>
          -
          <lpage>406</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Xie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Xia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <article-title>Attention residual learning for skin lesion classification</article-title>
          ,
          <source>IEEE transactions on medical imaging 38</source>
          (
          <year>2019</year>
          )
          <fpage>2092</fpage>
          -
          <lpage>2103</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>D.</given-names>
            <surname>Pakhomov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Premachandran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Allan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Azizian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Navab</surname>
          </string-name>
          ,
          <article-title>Deep residual learning for instrument segmentation in robotic surgery</article-title>
          ,
          <source>in: International Workshop on Machine Learning in Medical Imaging</source>
          , Springer,
          <year>2019</year>
          , pp.
          <fpage>566</fpage>
          -
          <lpage>573</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>C. M.</given-names>
            <surname>Hyun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. P.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. K.</given-names>
            <surname>Seo</surname>
          </string-name>
          ,
          <article-title>Deep learning for undersampled mri reconstruction</article-title>
          ,
          <source>Physics in Medicine &amp; Biology</source>
          <volume>63</volume>
          (
          <year>2018</year>
          )
          <fpage>135007</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>W.</given-names>
            <surname>Jifara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Rho</surname>
          </string-name>
          , M. Cheng, S. Liu,
          <article-title>Medical image denoising using convolutional neural network: a residual learning approach</article-title>
          ,
          <source>The Journal of Supercomputing</source>
          <volume>75</volume>
          (
          <year>2019</year>
          )
          <fpage>704</fpage>
          -
          <lpage>718</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>S.</given-names>
            <surname>Chatterjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Saad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Sarasaen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ghosh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Khatun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Radeva</surname>
          </string-name>
          , G. Rose,
          <string-name>
            <given-names>S.</given-names>
            <surname>Stober</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Speck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Nürnberger</surname>
          </string-name>
          ,
          <article-title>Exploration of interpretability techniques for deep covid-19 classification using chest x-ray images</article-title>
          ,
          <source>arXiv preprint arXiv:2006.02570</source>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>N.</given-names>
            <surname>Kokhlikyan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Miglani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Martin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Alsallakh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Reynolds</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Melnikov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Kliushkina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Araya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Yan</surname>
          </string-name>
          , et al.,
          <article-title>Captum: A unified and generic model interpretability library for pytorch</article-title>
          ,
          <source>arXiv preprint arXiv:2009.07896</source>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S.</given-names>
            <surname>Chatterjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Das</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Mandal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Mukhopadhyay</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Vipinraj</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Shukla</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. Nagaraja</given-names>
            <surname>Rao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Sarasaen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Speck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Nürnberger</surname>
          </string-name>
          ,
          <article-title>Torchesegeta: Framework for interpretability and explainability of image-based deep learning models</article-title>
          ,
          <source>Applied Sciences</source>
          <volume>12</volume>
          (
          <year>2022</year>
          )
          <fpage>1834</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>S.</given-names>
            <surname>Chatterjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Breitkopf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Sarasaen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Yassin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Rose</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Nürnberger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Speck</surname>
          </string-name>
          ,
          <article-title>Reconresnet: Regularised residual learning for mr image reconstruction of undersampled cartesian and radial data</article-title>
          ,
          <source>Computers in Biology and Medicine</source>
          (
          <year>2022</year>
          )
          <fpage>105321</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>M.</given-names>
            <surname>Lustig</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Donoho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Pauly</surname>
          </string-name>
          ,
          <article-title>Sparse mri: The application of compressed sensing for rapid mr imaging</article-title>
          ,
          <source>Magnetic resonance in medicine 58</source>
          (
          <year>2007</year>
          )
          <fpage>1182</fpage>
          -
          <lpage>1195</lpage>
          . doi:10.1002/mrm.21391.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>K.</given-names>
            <surname>Hammernik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Klatzer</surname>
          </string-name>
          , E. Kobler,
          <string-name>
            <given-names>M. P.</given-names>
            <surname>Recht</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. K.</given-names>
            <surname>Sodickson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Pock</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Knoll</surname>
          </string-name>
          ,
          <article-title>Learning a variational network for reconstruction of accelerated mri data</article-title>
          ,
          <source>Magnetic resonance in medicine 79</source>
          (
          <year>2018</year>
          )
          <fpage>3055</fpage>
          -
          <lpage>3071</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>A.</given-names>
            <surname>Sriram</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zbontar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Murrell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Defazio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. L.</given-names>
            <surname>Zitnick</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Yakubova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Knoll</surname>
          </string-name>
          , P. Johnson,
          <article-title>End-to-end variational networks for accelerated mri reconstruction</article-title>
          ,
          <source>in: International Conference on Medical Image Computing and Computer-Assisted Intervention</source>
          , Springer,
          <year>2020</year>
          , pp.
          <fpage>64</fpage>
          -
          <lpage>73</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>S.</given-names>
            <surname>Chatterjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sciarra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dünnwald</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. V.</given-names>
            <surname>Mushunuri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Podishetti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. N.</given-names>
            <surname>Rao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. D.</given-names>
            <surname>Gopinath</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Oeltze-Jafra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Speck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Nürnberger</surname>
          </string-name>
          ,
          <article-title>Shuffleunet: Super resolution of diffusion-weighted mris using deep learning</article-title>
          ,
          <source>in: 2021 29th European Signal Processing Conference (EUSIPCO)</source>
          , IEEE,
          <year>2021</year>
          , pp.
          <fpage>940</fpage>
          -
          <lpage>944</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>S.</given-names>
            <surname>Chatterjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Sarasaen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Rose</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Nürnberger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Speck</surname>
          </string-name>
          , Ddos-unet:
          <article-title>Incorporating temporal information using dynamic dual-channel unet for enhancing super-resolution of dynamic mri</article-title>
          ,
          <source>arXiv preprint arXiv:2202.05355</source>
          (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>V.</given-names>
            <surname>Edupuganti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mardani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Vasanawala</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Pauly</surname>
          </string-name>
          ,
          <article-title>Uncertainty quantification in deep mri reconstruction</article-title>
          ,
          <source>IEEE Transactions on Medical Imaging</source>
          <volume>40</volume>
          (
          <year>2020</year>
          )
          <fpage>239</fpage>
          -
          <lpage>250</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>S.</given-names>
            <surname>Chatterjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sciarra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dünnwald</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. B.</given-names>
            <surname>Talagini Ashoka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Oeltze-Jafra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Speck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Nürnberger</surname>
          </string-name>
          ,
          <article-title>Uncertainty quantification for ground-truth free evaluation of deep learning reconstructions</article-title>
          ,
          <source>in: Joint Annual Meeting ISMRM-ESMRMB</source>
          <year>2022</year>
          ,
          <year>2022</year>
          , p.
          <fpage>5631</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>A.</given-names>
            <surname>Barredo Arrieta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Díaz-Rodríguez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. Del</given-names>
            <surname>Ser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bennetot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Tabik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Barbado</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Garcia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gil-Lopez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Molina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Benjamins</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Chatila</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Herrera</surname>
          </string-name>
          ,
          <article-title>Explainable artificial intelligence (xai): Concepts, taxonomies, opportunities and challenges toward responsible ai</article-title>
          ,
          <source>Information Fusion</source>
          <volume>58</volume>
          (
          <year>2020</year>
          )
          <fpage>82</fpage>
          -
          <lpage>115</lpage>
          . URL: https://www.sciencedirect.com/science/article/pii/S1566253519308103. doi:10.1016/j.inffus.2019.12.012.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>S.</given-names>
            <surname>MacDonald</surname>
          </string-name>
          , H. Foley,
          <string-name>
            <given-names>M.</given-names>
            <surname>Yap</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. L.</given-names>
            <surname>Johnston</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Steven</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. T.</given-names>
            <surname>Koufariotis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sharma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Wood</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Addala</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. V.</given-names>
            <surname>Pearson</surname>
          </string-name>
          , et al.,
          <article-title>Generalising uncertainty improves accuracy and safety of deep learning analytics applied to oncology</article-title>
          ,
          <source>Scientific Reports</source>
          <volume>13</volume>
          (
          <year>2023</year>
          )
          <fpage>7395</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <surname>Y.-L. Tsai</surname>
          </string-name>
          , C.-Y. Hsu,
          <string-name>
            <surname>C.-M. Yu</surname>
          </string-name>
          , P.-Y. Chen,
          <article-title>Formalizing generalization and adversarial robustness of neural networks to weight perturbations</article-title>
          ,
          <source>Advances in Neural Information Processing Systems</source>
          <volume>34</volume>
          (
          <year>2021</year>
          )
          <fpage>19692</fpage>
          -
          <lpage>19704</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>G. E.</given-names>
            <surname>Hinton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Srivastava</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Krizhevsky</surname>
          </string-name>
          ,
          <string-name>
            <surname>I. Sutskever</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. R.</given-names>
            <surname>Salakhutdinov</surname>
          </string-name>
          ,
          <article-title>Improving neural networks by preventing co-adaptation of feature detectors</article-title>
          ,
          <source>arXiv preprint arXiv:1207.0580</source>
          (
          <year>2012</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Gal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Ghahramani</surname>
          </string-name>
          ,
          <article-title>Dropout as a bayesian approximation: Representing model uncertainty in deep learning</article-title>
          ,
          <source>in: international conference on machine learning, PMLR</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>1050</fpage>
          -
          <lpage>1059</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>