<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <article-id pub-id-type="doi">10.1007/978-3-030</article-id>
      <title-group>
        <article-title>Application of explainable AI to healthcare: a review*</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Samuel Gbenga Faluyi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yousra Chabchoub</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Maurras Togbe</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jérémie Sublime</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Isep</institution>
          ,
          <addr-line>10 rue de Vanves 92130 Issy Les Moulineaux</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2020</year>
      </pub-date>
      <volume>12269</volume>
      <issue>3</issue>
      <fpage>448</fpage>
      <lpage>469</lpage>
      <abstract>
        <p>The world of technology is advancing by the day, presenting innovative and efficient solutions across various sectors, and healthcare is no exception. This review focuses primarily on the impact of machine learning and deep learning techniques on improving the delivery of healthcare. It investigates the frameworks of previous research studies to establish facts regarding the application of machine learning and deep learning, as well as where enhancement of the models is required. The strengths and weaknesses of the techniques used are identified. Our review shows that the impact of machine learning and deep learning techniques cannot be overstated, notably in prediction modelling, pattern recognition, classification, regression, and image processing, among other applications. Furthermore, the study identifies numerous benefits of model explainability and different model explanation tools, such as Alibi, InterpretML, and Explainerdashboard. We also show that prospective studies could employ ensemble learning using boosting and deep learning algorithms as core learning units.</p>
      </abstract>
      <kwd-group>
        <kwd>Healthcare</kwd>
        <kwd>Machine learning</kwd>
        <kwd>Deep learning</kwd>
        <kwd>Boosting algorithms</kwd>
        <kwd>Model Explainability</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The involvement of technology in healthcare has achieved major improvements in resolving human
health challenges. Explainability focuses on making an AI model's decisions understandable and
accessible, providing user-friendly explanations that support causal reasoning [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. In clinical
medicine, it is crucial to have a clearer and deeper understanding of the decisions made by the algorithms
used, to prevent faulty conclusions and adverse patient outcomes [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. To ensure these
explanations are accessible to professionals who are not computer experts and to obtain a
greater level of fundamental comprehension among experts, straightforward explanation interfaces
are essential [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        The integration of Internet of Things (IoT) facilities for dataset collection along with patient
monitoring has contributed to the improvement of healthcare mainly for medical staff decision
support systems [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Furthermore, with its prominent advantages like networking, sensing,
expression, safety, and intelligence, the IoT has become a vital component of the healthcare industry
[
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. The IoT represents the interconnectedness of physical objects in cyberspace to exchange data.
In addition to communicating, these objects can be remotely controlled and observed. To maintain health records,
data are gathered from a variety of devices, including blood glucose monitors, electrocardiograms
(ECGs), and fetal monitors [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. IoT facilities help collect individuals' health-relevant information
in real time. By leveraging data mining and ML/DL techniques, these data are often used to recommend
health-related services or suggest lifestyle changes for the individual [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Many modern medical
sensors and gadgets are often linked over different networks, giving access to vital data regarding
patients' status. The data can be employed for many functions, including remote patient monitoring,
prognosticating disease, and recuperation by gaining a deeper understanding of symptoms and
enhancing the diagnosis and treatment procedure through enhanced automation and mobility [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>
        Machine learning and deep learning (ML/DL) are two subfields of artificial intelligence (AI) that
learn from collected data to make intelligent decisions. Their use in healthcare has improved
the precision of diagnoses, customized treatment regimens, forecast patient outcomes, and accelerated
operational efficiency [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. There has been significant growth in the use of electronic health record (EHR) resources
recently, which has accelerated the application of ML/DL to create patient phenotypes from EHR data
[
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. The EHR is regarded as a rich source of longitudinal experimental data with the capacity to house
all important clinical and administrative data pertinent to a patient's care under a specific provider,
including vital signs, prescriptions, medical history notes, demographics, and laboratory results [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>
        ML models are often considered as black box algorithms, where the internal processes behind their
predictions are not easily interpretable. In medicine, however, trust and explainability are crucial, as
healthcare professionals and patients need to understand how decisions are being made.
Explainability in machine learning refers to the ability to understand and articulate the inner
workings of a model, its predictions, and the factors influencing those predictions. This transparency
is essential for ensuring trust, accountability, and effective communication with stakeholders in
automated decision-making systems [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>Artificial intelligence, through the use of ML/DL, has contributed immensely to the
rapidly growing achievements in healthcare. However, there is room for improvement. This review
examines various ML/DL approaches previously used to extract critical information from
state-of-the-art techniques. It also highlights their limitations and identifies areas where further solutions are
necessary to enhance performance in healthcare.</p>
      <p>This review paper is organized as follows. Section 2 presents the context and objectives of our
study. Section 3 highlights the main research studies applying AI to healthcare. Detailed information
regarding the ML/DL algorithms applied to healthcare can be found in Section 4. In Section 5, we
address healthcare data collection and data types. Explainability in the healthcare context is discussed
in Section 6. Finally, Section 7 presents the conclusion and future works.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Context and Objectives</title>
      <p>To develop individualized treatment routines, data is often collected and analyzed using ML/DL.
Algorithms can predict a patient's response to different therapies based on their genetic composition.
Moreover, predictive analytics is performed using models to forecast each patient's unique disease
risk, making more individualized medical interventions possible.</p>
      <p>
        ML/DL techniques are often used to examine EHR data to predict patient outcomes and readmission
rates, and to identify patients at high risk for certain conditions. The EHR, formerly known as the Clinical
Information System, has been described as a warehouse for healthcare big data [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. The data can be
numerical, text (for NLP), or medical imaging (e.g. Positron Emission Tomography, X-ray, Computed
Tomography, and Ultrasound identification of tumors, fractures, and lesions) [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Most previous
research studies have adopted EHR data for their analytics. As examples, we can cite the studies
“Prediction of mortality in paralytic ileus patients” [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], “Predicting post-stroke pneumonia using deep neural
network approaches” [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], or also “Predicting the onset of type 2 diabetes” [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], among other research
studies.
      </p>
      <p>
        Real-time health data, like blood pressure, glucose levels, and heart rate, are gathered by wearable
technology and IoTs sensors. These data are analyzed using ML/DL algorithms to monitor patient
health and send out notifications to the concerned stakeholders in case of any abnormalities.
AI-powered chatbots and virtual assistants facilitate telehealth by setting up appointments, making
initial diagnoses, and responding to patient questions. Moreover, IoT has been used to collect
live data in real time (time series), as in the study [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] about heart disease prediction.
      </p>
      <p>
        To guarantee that medical professionals understand and can rely on AI-driven decisions, there is
an increasing emphasis on creating models that yield clear and comprehensible outcomes. ML/DL
models are sometimes considered black boxes, due to difficulties in explaining the internal operations
of the models. However, interpretable models, model-agnostic methods, visualization
techniques, and a balance between complexity and interpretability can give practitioners a clear
understanding of the results, fostering better and more responsible use of machine learning
technologies [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
      </p>
      <p>
        There are numerous techniques for achieving explainability, but it is crucial to grasp the key
themes underlying different types of explainers. These include factors such as scope (local vs. global),
model type (black box vs. white box), task (e.g., classification or regression), data type (tabular,
images, text, etc.), and insights (feature attributions, counterfactuals, influential training instances,
and more) [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. Furthermore, explainers serve as interfaces that interact with the model.
For black-box techniques, this interaction typically involves analyzing the inputs and outputs. In
contrast, for white-box techniques, explainers can access and interpret the internal workings of the
model [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
      </p>
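      <p>To make these distinctions concrete, the sketch below (illustrative only: synthetic data and scikit-learn's permutation importance, rather than any of the explainability tools discussed in this review) shows a global, model-agnostic, feature-attribution explainer. The model is treated as a black box, and each feature's score is the average drop in accuracy when that feature is shuffled.</p>

```python
# Sketch of a global, model-agnostic explainer: permutation importance
# treats the trained model as a black box and measures how much shuffling
# each feature degrades held-out accuracy.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=5, n_informative=2,
                           n_redundant=0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

result = permutation_importance(model, X_te, y_te, n_repeats=10, random_state=0)
importances = result.importances_mean  # one attribution score per feature
```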
    </sec>
    <sec id="sec-3">
      <title>3. State of the art</title>
      <p>
        Ahmed et al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] proposed a predictive model that combines statistical techniques with machine
learning algorithms to enhance predictive performance. Statistical analysis of the collected data
was performed using IBM SPSS to examine the statistical significance of the
features, for feature reduction. The machine learning algorithms used were decision tree, linear
discriminant analysis, K-Nearest Neighbors, Gaussian Naive Bayes, and support vector machines with
linear and radial basis function kernels. Based on the results, the SVM with the radial basis
function (RBF) kernel achieved the highest accuracy and ROC-AUC score. However, no data validation
method, such as k-fold cross-validation, was applied. Hyper-parameter tuning could be used for better
optimization of the model.
      </p>
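      <p>The two improvements suggested above, k-fold validation and hyper-parameter tuning, can be combined in a single step. The sketch below is illustrative only (synthetic data stands in for the study's clinical dataset): a grid search over the RBF-kernel SVM's C and gamma, with 5-fold cross-validation scoring ROC AUC at each setting.</p>

```python
# Sketch: k-fold cross-validation plus hyper-parameter tuning for an
# RBF-kernel SVM (synthetic data stands in for EHR data).
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=8, random_state=0)
param_grid = {"C": [0.1, 1, 10], "gamma": ["scale", 0.1, 1.0]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5,
                      scoring="roc_auc")  # 5-fold CV for each setting
search.fit(X, y)
best_auc = search.best_score_  # mean cross-validated AUC of the best setting
```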
      <p>
        Ge et al. [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] developed a post-stroke pneumonia predictive model applying ML/DL techniques,
which combines both time-series and time-insensitive attributes. As part of the preprocessing, the
numerical observations were normalized to tackle data sparseness, and the
laboratory tests were converted to categorical values. The data were divided into two parts (85% and 15%)
for training and testing respectively, and three machine learning (LR, SVM, XGBoost) and two deep
learning (MLP and attention-augmented GRU) algorithms were tested. 10-fold cross-validation
was applied to the training set. To assess model performance, ROC AUC, sensitivity, and
specificity were computed on the test set. Based on the results, deep learning outperformed all the
machine learning methods employed, with the attention-augmented gated recurrent unit (GRU) model
achieving the highest AUC score. In future work, hyper-parameter tuning techniques such as grid
search or Bayesian optimization could be applied to identify the best hyper-parameter values.
Moreover, exploratory data analysis could be applied to the dataset to aid feature selection
and to identify the statistical significance of the features.
      </p>
      <p>
        Gupta et al. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] proposed ML-based models for heart disease detection. The statistical correlation
matrix was used on the collected data to examine the significance of the features. The model phase
was carried out by dividing the entire collected data into two proportions for training and testing of
the model. The algorithms used include K-Nearest Neighbors (K-NN), Support Vector Machine (SVM),
Naive Bayes (NB), Random Forest (RF), and Decision Tree (DT). Furthermore, to determine the
K-value of the K-NN, a score graph method was used, where the highest score was identified at a
K-value of 3. The performance metrics were accuracy, sensitivity, miss rate, and confusion matrix.
The best model was then selected for validation on the live dataset. Thus, real-time data are gathered
by attaching several sensors to the body of the patients to measure different parameters. This data is
then fed into the trained model to predict the outcome. The result of training and testing identified
K-NN (3-NN) as the best algorithm. Therefore, K-NN was used for the prediction of heart diseases on
the collected live data.
      </p>
      <p>
        Nguyen et al. [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] developed a prediction model that can identify patients at high risk for
developing type 2 diabetes using electronic health record data. The collected data were divided into
training and test sets (70% and 30% respectively). 10-fold cross-validation was applied to the training set.
Due to an imbalance in the data, SMOTE [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] was adopted. The stochastic gradient descent
optimizer and the binary cross-entropy loss function were used for model training, with an ensemble
learning model. The sensitivity, specificity, and ROC AUC were the performance metrics used for this
study. The predictive model for T2DM was developed and the algorithms used were compared,
both with and without the application of SMOTE. Models using SMOTE
showed higher sensitivity but no significant improvement on the other metrics.
Moreover, ensemble models without SMOTE showed higher AUC and specificity compared to
SMOTE-enhanced models. No explainability techniques were adopted, which would have offered a better
understanding of the model and greater transparency.
      </p>
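      <p>The core idea of SMOTE is to synthesize new minority-class samples by interpolating between a minority point and one of its nearest minority neighbors. The sketch below is a simplified illustration of that mechanism in plain numpy (toy data; not the reference imblearn implementation used in practice).</p>

```python
# Minimal SMOTE-style oversampling: synthesize minority-class samples by
# interpolating between a minority point and a random near minority neighbor.
import numpy as np

def smote_like(X_min, n_new, k=3, seed=None):
    rng = np.random.default_rng(seed)
    new = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        # distances from sample i to every other minority sample
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        neighbors = np.argsort(d)[1:k + 1]   # k nearest, excluding itself
        j = rng.choice(neighbors)
        lam = rng.random()                   # interpolation factor in [0, 1)
        new.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.array(new)

minority = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
synthetic = smote_like(minority, n_new=6, seed=0)
```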
      <p>
        Sood et al. [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] proposed healthcare IoT-fog technology to diagnose patients' hypertension stages
and predict hypertension occurrences based on collected user health data. The study
aims to leverage fog computing [15] to provide continuous monitoring for hypertensive patients and
establish an efficient mechanism for sharing medical records and implementing precautionary
measures. The system consists of three subsystems: an IoT-based subsystem for users, a health smart
gateway (fog subsystem), and a cloud subsystem. The user subsystem utilizes various IoT devices to
capture hypertension-related data, which is then transmitted to the health fog subsystem for
real-time processing and diagnosis. Upon identifying a potential health issue, the health fog subsystem
generates an alert message, sent directly to the user's mobile phone, allowing for timely precautionary
action. Simultaneously, the analysis results and compiled medical records are stored in a cloud system,
where they can be shared with authorized medical professionals, including doctors, pharmacies,
hospitals, and healthcare providers. The cloud subsystem facilitates data storage and sharing,
enabling domain experts to take swift action and offer precautionary advice in emergencies. The
algorithms used for classification and prediction are Artificial Neural Network (ANN), K-Nearest
Neighbours (K-NN), Multi-Layer Perceptron (MLP), and Logistic Regression (LR) [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. The evaluation
metrics include accuracy, time, sensitivity and precision. According to the system result, ANN
outperforms all other classification algorithms in terms of accuracy, time, and standard metrics, for
predicting the classification of hypertension attacks. The alert-generating result also reveals high
values for sensitivity, specificity, precision, and coverage and low values for the Mean Absolute Error
(MAE), Root Absolute Squared Error (RASE), Relative Absolute Error (RAE) and Root Relative
Squared error (RRSE). The latter two are given by the following formulas:
      </p>
      <p>RAE = Σ(i=1..n) |yi − ŷi| / Σ(i=1..n) |yi − ȳ|</p>
      <p>RRSE = √( Σ(i=1..n) (yi − ŷi)² / Σ(i=1..n) (yi − ȳ)² )</p>
      <p>Where yi is the actual value, ŷi is the predicted value, n is the number of instances, and ȳ is the
mean of the actual values.</p>
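      <p>These error metrics are straightforward to compute. The sketch below uses illustrative values, not the study's data.</p>

```python
# Computing RAE and RRSE for a set of predictions, following the
# definitions above (illustrative values only).
import numpy as np

y = np.array([3.0, 5.0, 2.0, 8.0])       # actual values yi
y_hat = np.array([2.5, 5.5, 2.0, 7.0])   # predicted values yhat_i
y_bar = y.mean()                          # mean of the actual values

rae = np.sum(np.abs(y - y_hat)) / np.sum(np.abs(y - y_bar))      # = 0.25 here
rrse = np.sqrt(np.sum((y - y_hat) ** 2) / np.sum((y - y_bar) ** 2))
```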
      <p>Furthermore, when it comes to alert production efficiency, fog monitoring-based alerts have the
lowest delay times when compared to alerts based on cloud monitoring and alerts based on manual
monitoring. Nonetheless, the security and privacy of data created by the several layers of fog and cloud
computing could be addressed in future studies.</p>
      <p>Nguyen et al. proposed in [16] three deep ensemble learning (DEL) approaches for different data
types (statistical, image-based, and sequential): deep-stacked generalization ensemble
learning, gradient deep learning boosting, and deep aggregation learning. Following the data reading
phase, the datasets were preprocessed with various techniques to convert the data into
appropriate forms (e.g., converting pictures to numerical data with a
Convolutional Auto-Encoder, CAE). In the next phase, the suggested models and
additional conventional machine learning models were constructed on the resulting datasets, and the
models' hyperparameters were also adjusted. In the last phase, the authors assessed all
models and compared their performances. A confusion matrix and metrics derived from it were used to evaluate the predictions'
performance: accuracy, Matthew's correlation coefficient (MCC), precision, F1-score, recall, and
AUC. MCC is defined as:
MCC = (TP × TN − FP × FN) / √((TP + FP)(TP + FN)(TN + FP)(TN + FN))</p>
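      <p>Matthew's correlation coefficient can be computed directly from the four confusion-matrix counts. The sketch below uses illustrative counts, not values from the study.</p>

```python
# Computing Matthew's correlation coefficient (MCC) from confusion-matrix
# counts TP, TN, FP, FN (illustrative counts only).
import math

tp, tn, fp, fn = 40, 45, 5, 10
num = tp * tn - fp * fn
den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
mcc = num / den if den else 0.0  # conventionally 0 when any margin is empty
```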
      <p>The findings demonstrate that, across all dataset types, the deep ensemble learning (DEL) family
of techniques outperforms all the other deployed baseline models. The experimental
results show that when DL is used as the core learning unit (CLU), the GDLB strategy is appropriate for numerical datasets,
the DAL method is suggested for image datasets, and the DeSGEL [16] technique is suggested for
sequential data. For more transparency of the developed model, explainability
techniques could be adopted. Another direction is developing an XGBoost-style algorithm for neural networks,
which would drop layers instead of pruning branches in decision trees.</p>
    </sec>
    <sec id="sec-4">
      <title>4. ML/DL algorithms applied to healthcare</title>
      <p>Research studies have shown that there are various ML/DL algorithms, which can be applied to
healthcare. This section presents basic concepts of some common algorithms.</p>
      <sec id="sec-4-1">
        <title>Ensemble Learning: Boosting, Bagging and Stacking</title>
        <p>Ensemble methods combine several algorithms to build more powerful predictive models and obtain
better performance than an individual base model [17]. The most typical kinds are bagging, boosting,
and stacking.</p>
        <p>
          XGBoost is a flexible tree-based gradient boosting technique [18]. It integrates weak classifiers to
produce predictions with a high level of precision. While the classic gradient boosting approach
optimizes using only the first derivative, XGBoost conducts a second-order Taylor expansion of the
cost function and, for improved efficiency, adds a regularization term to the cost function [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ].
        </p>
        <p>Bagging is an ensemble machine learning technique, where many models are trained
simultaneously on random subsets using the bootstrap aggregating approach. Using samples
generated through sampling with replacement, multiple models are trained in the
bootstrapping approach. The predictions from each model are then averaged or, for
classification, combined through a voting mechanism.
Reducing variance during training minimizes the likelihood of overfitting, which is a well-known
characteristic of this approach [17], [19]. The random forest algorithm is one of the major examples of
bagging.</p>
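        <p>The bootstrap-aggregation scheme described above can be sketched as follows (synthetic data; scikit-learn's bagging implementation, whose default base learner is a decision tree).</p>

```python
# Sketch of bagging: many base learners trained on bootstrap resamples
# (sampling with replacement), with predictions combined by voting.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier

X, y = make_classification(n_samples=400, random_state=0)
bag = BaggingClassifier(n_estimators=25, bootstrap=True, random_state=0)
bag.fit(X, y)              # each of the 25 trees sees its own bootstrap sample
score = bag.score(X, y)    # ensemble vote over all trees
```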
        <p>Boosting trains weak learners in a stepwise manner, where the mistakes made by earlier learners
in the series are corrected by each succeeding learner. First, a subset of the original dataset is selected.
After training on this data, the first model predicts the outcome. Predictions for individual samples
may be accurate or inaccurate. The incorrectly predicted samples are presented again for training the
subsequent model. This allows later models to correct the errors of earlier ones. By
aggregating the outcomes at each stage, boosting gathers results progressively, whereas bagging does
so at the conclusion. The results are combined using a weighted average: based on how successfully a model
predicts, it is given a varying weight [17], [19], [20]. Row and
column resampling are also used by boosting algorithms to prevent
overfitting [21].</p>
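        <p>The stepwise reweighting described above can be sketched with AdaBoost as a concrete instance (synthetic data for illustration): each successive learner concentrates on the samples its predecessors misclassified, and the learners' votes are combined by weighted averaging.</p>

```python
# Sketch of boosting: learners added sequentially, each focusing on the
# samples earlier learners got wrong (AdaBoost as a concrete instance).
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

X, y = make_classification(n_samples=400, random_state=1)
boost = AdaBoostClassifier(n_estimators=50, random_state=1)
boost.fit(X, y)                  # 50 weak learners trained in sequence
train_acc = boost.score(X, y)    # weighted vote of all weak learners
```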
      </sec>
      <sec id="sec-4-2">
        <title>Support Vector Machine (SVM)</title>
        <p>
          SVM algorithms, which are supervised learning methods, are often used to create a model that
classifies objects or data into different categories, by finding the maximum-margin hyperplane for
the binary classification of new data points from known classified data points [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ], [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ], [22], [23]. This
allows new inputs to be predicted more quickly than in other predictive models, regardless of
the size of the training set. The goal of SVM is to locate the largest-margin hyperplane,
to divide the data and provide the best fit to arrange it [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. By applying classical statistical learning
theory, the SVM produces a model that is easily interpreted and generalizes well to fresh
data. Since they support the placement of the dividing hyperplanes, the nearest points are known as
support vectors. This implies that the hyperplanes cannot be changed by shifting the non-support
vectors, and vice versa [22]. Due to its ability to classify objects, SVM is often utilized in clinical
imaging analysis to classify or categorize diagnoses [23].
        </p>
        <p>
          SVMs exhibit the advantage of being quite resistant to overfitting. SVMs are not limited
to employing linear classifiers; they can also be used to classify data using non-linear functions by
utilizing non-linear kernels [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. An SVM classifier that performed better than
previous classifiers in determining whether a person has an influenza-like illness (ILI), often referred
to as an acute respiratory infection, was proposed in [24]. In location verification, SVM is determined
to be the most accurate, and it doesn't need channel characteristics data to function [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ].
        </p>
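        <p>The effect of a non-linear kernel can be sketched on a toy problem (concentric circles, which no linear hyperplane can separate): the RBF kernel lets the SVM find a separating boundary that the linear kernel cannot.</p>

```python
# Sketch: an RBF-kernel SVM separates data that a linear-kernel SVM
# cannot (concentric circles as a toy, linearly inseparable example).
from sklearn.datasets import make_circles
from sklearn.svm import SVC

X, y = make_circles(n_samples=300, factor=0.3, noise=0.05, random_state=0)
linear_acc = SVC(kernel="linear").fit(X, y).score(X, y)  # near chance level
rbf_acc = SVC(kernel="rbf").fit(X, y).score(X, y)        # near perfect
```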
      </sec>
      <sec id="sec-4-3">
        <title>Random Forest (RF)</title>
        <p>
          Multiple decision trees can be trained concurrently using random forests to generate a single output.
Random Forest is a bagging ensemble machine learning method that involves merging decision trees
[
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. Al Hossain et al. [25] provided evidence of the use of a random forest algorithm that outperformed
alternative models in estimating the number of influenza cases in public areas with a 95% accuracy
rate. Because it can integrate the results from every decision tree, it demonstrates a high degree of
accuracy.
        </p>
      </sec>
      <sec id="sec-4-4">
        <title>Naive Bayes (NB)</title>
        <p>
          The Bayes theorem operates as the conceptual foundation for Naive Bayes classification. The term
"naive" describes the belief that each attribute is independent of the others. A response vector and a
feature matrix are created from the data [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. The whole dataset is presented in the feature matrix's
rows as vectors, each of which holds the values of the predictor variables. Conversely, every row in
the response vector denotes an outcome class. The Naive Bayes classifier performed significantly well
in classifying social network content for monitoring purposes during the pandemic, where it outperformed
other classifiers [26].
        </p>
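        <p>The feature-matrix / response-vector setup described above can be sketched as follows (synthetic data; Gaussian Naive Bayes, which models each feature as conditionally independent given the class).</p>

```python
# Naive Bayes sketch: a feature matrix X (one row per sample) and a
# response vector y (one class label per sample); the classifier treats
# each feature as conditionally independent given the class.
from sklearn.datasets import make_classification
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=300, n_features=4, random_state=0)
nb = GaussianNB().fit(X, y)
acc = nb.score(X, y)
```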
      </sec>
      <sec id="sec-4-5">
        <title>Extreme Gradient Boost</title>
        <p>
          As an ensemble learning technique, where weak learners are combined to produce a stronger learner
for better accuracy, boosting establishes decision boundaries for every weak learner and weights them
according to how well the boundaries identified or approximated the data. Until a workable model is
produced, this is repeated. Gradient boosting involves the sequential creation of numerous
boundaries, or learners so that each learner can partially account for the errors of the preceding one
[
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. Through parallel processing, pruning of decision trees, management of missing values, and
reducing the likelihood of bias or overfitting in a model, extreme gradient boosting (XGBoost) is used
to optimize gradient boosting techniques. Each iteration's tree is computed using the first- and
second-order gradients of the loss function, and a shrinkage parameter scales the predictions added
to the current function at each iteration [30].
XGBoost is a potent and adaptable algorithm that may be applied to a range of problems, including
regression and classification, forecasting, and ranking. Ensuring that machines are operating as
efficiently as possible in terms of mobility, scalability, and accuracy is the primary objective of the
XGBoost model. XGBoost is also renowned for its
meticulous tuning, yielding better outcomes with fewer resources while remaining effective [27]. Heart
patients' irregular cycles were predicted with 92.1% accuracy using extreme gradient boosting [28].
Analogously, speech signals obtained from wearables can be utilized to identify Parkinson's disease
symptoms in their early stages [29], while predictive analytics can be employed to identify diabetes
[
          <xref ref-type="bibr" rid="ref3">3</xref>
          ], [30].
        </p>
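        <p>The sequential residual-fitting idea behind XGBoost can be sketched with scikit-learn's gradient boosting implementation (synthetic data; the XGBoost library itself, via xgboost.XGBClassifier, follows the same scheme with second-order gradients and regularization added).</p>

```python
# Gradient boosting sketch: each new tree partially corrects the errors
# of the ensemble built so far; the learning rate is the shrinkage
# parameter scaling each tree's contribution.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=400, random_state=0)
gb = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1,
                                random_state=0).fit(X, y)
acc = gb.score(X, y)
```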
      </sec>
      <sec id="sec-4-6">
        <title>Artificial Neural Network (ANN)</title>
        <p>
          ANN is an ML model that simulates the way the human brain learns. It consists of an input layer that
accepts information, many hidden layers that analyze the input, and an output layer that gives results. If the
outputs are inaccurate, the errors are propagated backwards through the preceding layers using a cost
function to adjust the weights until the answers reach a high enough degree of precision.
“It calculates several weighted sums, which are then passed through layers with weights and sums
until they reach the last layer, which uses an activation function to determine the output” [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. ANNs
are very flexible in application and are often used in pattern recognition-related fields. Sood and
Mahajan [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] employed a fog-layer system to store patient data related to heart attacks and to detect,
monitor, and treat hypertension (high blood pressure) cases [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ], [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ].
        </p>
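        <p>The layered weighted-sum computation described above can be made concrete with a minimal forward pass in plain numpy (toy weights and input, for illustration only).</p>

```python
# Minimal forward pass of a tiny neural network, mirroring the text:
# weighted sums passed through layers, with an activation at the output.
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 4)), np.zeros(4)   # input layer to hidden layer
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)   # hidden layer to output

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.0, 2.0])                  # one input sample
hidden = relu(x @ W1 + b1)                      # weighted sums + ReLU
output = sigmoid(hidden @ W2 + b2)              # activation gives the output
```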
      </sec>
      <sec id="sec-4-7">
        <title>Convolutional Neural Network (CNN)</title>
        <p>
          CNN is regarded as a feed-forward neural network and is usually used in classification challenges.
The input is decomposed into its constituent pieces, which are subsequently sent to a convolution
layer, and then these parts are combined in various ways until patterns are produced (convolution)
[
          <xref ref-type="bibr" rid="ref3">3</xref>
          ], [31]. The input images are then mapped against these patterns using a Rectified Linear Unit
(ReLU) layer, creating a rectified feature layer, which is then passed on to a pooling layer. To create
a pooled feature map, the map is reduced by the pooling layer. This map is then flattened to create a
linear vector, which is then fed into a fully connected network to classify the input. CNNs are
widely utilized in fields where interpretation of images with grid-like topology is required [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. Brain-wave
values acquired as a 2D time series were used to forecast epileptic incidents and
immediately notify health authorities [32]. Ke et al. [33] suggested utilizing a lightweight CNN to assess
depression using raw electroencephalograms (EEGs). Ciocca et al. [34] used image recognition to
identify food and, consequently, calories, a finding with implications for fitness and nourishment.
Alhussein and Muhammad [35] applied deep learning to pitch tones in mobile healthcare frameworks
to identify speech disorders. Using the LUNA16 dataset, Bansal et al. [36] developed a ResNet-based model
for lung disease classification and 3D segmentation, where excellent segmentation and
classification accuracy was obtained [36].
        </p>
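<p>The pipeline described above (convolution, ReLU, pooling, flattening) can be illustrated with plain NumPy. The 6x6 input image and the 3x3 edge-detecting kernel below are illustrative assumptions:</p>
<preformat>
```python
import numpy as np

# A 6x6 "image" with a horizontal intensity gradient and a 3x3 kernel
# that responds to vertical edges (both are illustrative assumptions).
image = np.arange(36, dtype=float).reshape(6, 6)
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)

def conv2d(img, k):
    """Valid convolution: slide the kernel and take weighted sums."""
    kh, kw = k.shape
    out = np.empty((img.shape[0] - kh + 1, img.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (img[i:i + kh, j:j + kw] * k).sum()
    return out

feature_map = conv2d(image, kernel)                      # convolution layer
rectified = np.maximum(feature_map, 0.0)                 # ReLU layer
pooled = rectified.reshape(2, 2, 2, 2).max(axis=(1, 3))  # 2x2 max pooling
flat = pooled.reshape(-1)  # flattened vector fed to the fully connected part
```
</preformat>
<p>The flattened vector would then be classified by a fully connected network, as described above.</p>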
        <p>Thus, numerous ML/DL techniques can be applied to healthcare to perform various tasks,
including prediction, classification, and regression, among others. Some of these algorithms have been
highlighted above, and various research studies have demonstrated their efficiency
based on the adopted parameters.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Healthcare data collection and data types</title>
      <p>In the application or development of ML/DL modelling, one of the major factors to be considered is
the approach of data gathering and the type of collected data. This section presents some instances
of data collected and used in previous research studies.</p>
      <sec id="sec-5-1">
        <title>Examples of datasets used in previous research studies</title>
        <table-wrap id="tbl1">
          <label>Table 1</label>
          <caption>
            <p>Data collection approaches, components of the data, and data types in the reviewed studies.</p>
          </caption>
          <table>
            <thead>
              <tr>
                <th>Data collection</th>
                <th>Component of the data</th>
                <th>Data type</th>
              </tr>
            </thead>
            <tbody>
              <tr>
                <td/>
                <td>The dataset, made up of 46476 patients admitted to ICU, where 1021 patients aged 18 years or older were diagnosed with paralytic ileus, was used for the prediction model.</td>
                <td>Numerical and categorical</td>
              </tr>
              <tr>
                <td>The dataset was collected from the EHR of a hospital over a period of 10 years, from 2007 to 2017.</td>
                <td>The data contains 13930 records of patients, where 1012 had pneumonia while in the hospital. Some of the records are time sensitive (medication, laboratory tests) and others are time insensitive (demographic information).</td>
                <td>Numerical and categorical</td>
              </tr>
              <tr>
                <td>Data used for the study was collected from the UCI repository.</td>
                <td>The data consists of 303 instances of 14 features, which are grouped into numerical (such as age) and categorical (sex, chest pain type, etc.) features.</td>
                <td>Numerical and categorical</td>
              </tr>
              <tr>
                <td>The data was collected from the EHR of a hospital in the United States from 2009 to 2011.</td>
                <td>The data comprises 9948 patients&#x2019; records, where 1904 patients were diagnosed with type 2 diabetes mellitus.</td>
                <td>Numerical and categorical</td>
              </tr>
              <tr>
                <td>The data were collected via users&#x2019; subsystems comprising several IoT facilities to obtain hypertension activities. They are then communicated to a fog system for concurrent processing and diagnosis. Alerts are generated and shared with the staff concerned. The data is stored in the cloud.</td>
                <td>Data collected by this system are categorized into six groups: health data (obesity, SBP, DBP, etc.), environmental data (room temperature, noise level, air quality), physical activity data (sleeping, sitting, walking, etc.), behavioural data (anxiety level, restlessness, etc.), dietary data (diet type, quantity), and GPS data (location and time).</td>
                <td>Numerical and categorical</td>
              </tr>
              <tr>
                <td>Three different open datasets were used: Heart Disease UCI (HDU) data, X-ray data, and the Depresjon data.</td>
                <td>HDU: 270 instances (120 and 150 records of having and not having heart disease, respectively) with 13 attributes. X-ray: 5856 samples (4273 and 1583 images with and without pneumonia, respectively). Depresjon: 267 and 547 samples of depressed patients and non-depressed people, respectively.</td>
                <td>HDU: numerical; X-ray: image; Depresjon: numerical</td>
              </tr>
            </tbody>
          </table>
        </table-wrap>
        <p>We showed in table 1 that healthcare data for ML/DL modelling can be obtained from different
sources. Moreover, part of the benefit of adopting ML/DL techniques is the capability to deal with
different types of input data (numerical, categorical, image-based, text-based, etc.). The data can be
time sensitive or time insensitive. The selection of the most appropriate ML/DL model depends on all
these characteristics of the dataset, in addition to the target result of the application (prediction,
classification…). Trust in the obtained result also depends closely on its explainability, which
is discussed in the following section.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Explainability in healthcare context</title>
      <p>
        Model explainability, often known as explainable AI, describes methods for making machine learning
(black box and white box) models' predictions more comprehensible to human observers, particularly
when there are difficulties in explaining the internal operations of the models [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. A strong machine
learning system must be able to justify its predictions in order to foster confidence in the decisions
produced by the model [37]. The insights an explanation should provide vary greatly depending on who
uses it, from regulators auditing the models to data scientists troubleshooting them. A variety of
approaches is therefore required to meet the needs of each target audience, since a standalone
explanation technique may produce explanations that are deceptive or lacking in context [38],
[39]. This implies that models need to be explained holistically. Explainability in machine learning
is essential for ensuring transparency, trust, and accountability in automated decision-making
systems [40]. It involves understanding how models make predictions and being able to communicate
this knowledge to various stakeholders [41], [42]. The ability to interpret the influence, relationships,
and correlations of the conditioning components within a model highlights the advantage of XAI
over traditional techniques. For instance, explainability was used to enhance the prediction of
healthcare-associated infections in patients admitted to intensive care units while preserving the
model: going beyond the artificial neural network black-box paradigm, a parsimonious and robust
semi-parametric approach was adopted, and a saliency map was used to examine
and justify the additional predictive capability of the model [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>Explainability and causality in the medical field are also critical for regulatory compliance, further
highlighting their relevance [43]. Explanation interfaces not only help keep humans involved in the process
but also allow the incorporation of their experiential knowledge and conceptual understanding
into AI operations. While the importance of a person-in-the-loop is sometimes undervalued, implicit
knowledge and human expertise remain indispensable in medical diagnosis [44]. By following
diagnostic steps, individual components that contribute to a diagnosis can be identified and applied
to train and improve models prospectively [45].</p>
    </sec>
    <sec id="sec-7">
      <title>Explainability frameworks</title>
      <p>The rapid advancements in ML/DL technologies, along with the increasing adoption of AI, highlight
the need for greater awareness of AI's operational mechanisms, making explainable modelling
essential. There are various examples of model explainers, which include but are not limited to the
ones highlighted in table 2.</p>
      <p>
        Although there is a wide variety of approaches accessible for explainability, it is critical to
comprehend the overarching themes of the many categories of Explainers. Among them are: scope
(local (L) and global (G)), type of model (white box (Wbox) and black box (Bbox)), Task (regression
(R), classification (C), time-series (TS), image (I), etc.), type of data (text (Tt), image (I), tabular (Tab),
etc.) and Insight (attributions of features, counterfactuals, significant training examples, etc.).
However, these systems exhibit data flow patterns like those of explainers functioning as interfaces.
Particularly, a lot of them call for the users to enable them to communicate both with the model and
the data it processes; in the case of black-box techniques, this refers to the inputs and outputs, while
in the case of white-box techniques, it refers to the internal workings of the models [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
The Insight row of table 2 lists, for each framework, the explanation artefacts it provides: feature
attributions, influential training instances, feature importance, PDP, residual diagnostics, surrogate
models, EBM, SHAP, LIME, decision path visualization, summary plots, interaction effects, Local
Outlier Factor, Isolation Forest, anomaly scores, and saliency, layer-wise and neuron attributions.
      </p>
      <p>
        In [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], authors introduced a two-step methodology for predicting ICU-acquired infections
(ICUAIs) using high-resolution longitudinal data combined with survival models. The study applied a
saliency-map model explainer to examine the signal images present in the data used and to interpret the
outcome of the model [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
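<p>The saliency-map idea can be sketched generically: the gradient magnitude of the model output with respect to each input pixel indicates which regions drive the prediction. The sketch below is an assumption-laden illustration, not the procedure of the cited study: the fixed random "model" stands in for a trained network, and finite differences stand in for automatic differentiation.</p>
<preformat>
```python
import numpy as np

# Stand-in "trained model": a fixed random linear map through tanh
# (an illustrative assumption, not a model from the reviewed studies).
rng = np.random.default_rng(2)
W = rng.normal(size=(16,))

def model(x_flat):
    return float(np.tanh(x_flat @ W))

def saliency(x_flat, eps=1e-5):
    """Finite-difference gradient of the output w.r.t. each input pixel."""
    base = model(x_flat)
    grads = np.empty_like(x_flat)
    for i in range(x_flat.size):
        xp = x_flat.copy()
        xp[i] += eps
        grads[i] = (model(xp) - base) / eps
    return np.abs(grads).reshape(4, 4)  # saliency map over a 4x4 image

image = rng.normal(size=16)
smap = saliency(image)  # larger values mark more influential pixels
```
</preformat>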
      <p>
        Model explainability can be categorized as either ante-hoc or post-hoc. Ante-hoc models are
inherently self-explainable, while post-hoc models require the use of explainable AI (XAI) methods
to provide explanations for their decisions [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], [43]. Once a machine learning model has been trained
and has made its predictions, post-hoc explainer techniques examine and clarify its decision-making
procedure to provide insights into how the model operates. In contrast, ante-hoc approaches are
inherently interpretable; often referred to as intrinsically explainable, transparent, or glass-box
models. Like interactive machine learning (iML), these approaches focus on embedding
interpretability directly into the model's architecture, ensuring transparency and explainability from
the outset [43], [46], [47], [48].
      </p>
      <p>One common post-hoc method involves determining the significance of various attributes in
producing a specific result [49]. Post-hoc methods based on game theory, such as Shapley values, can
quantify the importance of individual features. Similarly, Anchors, another post-hoc approach,
provides insight into coverage, i.e. the region where the explanation is applicable, and helps define
the boundaries of attributes. Anchors are particularly useful for classification models involving
text-based and tabular data [50]. According to Dandl et al. [51], counterfactuals are an XAI technique that
describes the smallest modification to the attribute values that affects the prediction to explain
specific forecasts [43], [51].</p>
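<p>Permutation importance is one simple post-hoc attribution technique in the spirit described above; the sketch below uses it rather than Shapley values, and its synthetic data and linear stand-in model are illustrative assumptions:</p>
<preformat>
```python
import numpy as np

# Synthetic data: only the first two features actually drive the target.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=200)

# Least-squares fit stands in for any trained black-box predictor.
w, *_ = np.linalg.lstsq(X, y, rcond=None)
predict = lambda M: M @ w
base_error = np.mean((predict(X) - y) ** 2)

def permutation_importance(feature, n_repeats=10):
    """Post-hoc importance: mean error increase when one feature is shuffled."""
    increases = []
    for _ in range(n_repeats):
        Xp = X.copy()
        Xp[:, feature] = rng.permutation(Xp[:, feature])
        increases.append(np.mean((predict(Xp) - y) ** 2) - base_error)
    return float(np.mean(increases))

scores = [permutation_importance(f) for f in range(3)]
# Feature 0 (largest true coefficient) receives the largest score,
# and the irrelevant feature 2 the smallest.
```
</preformat>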
      <p>Decision trees (DT) are one of the well-known examples of interpretable machine learning models.
They operate by repeatedly splitting the data based on specific threshold values of the features,
creating distinct subsets of the dataset, with each instance assigned to one of these subsets [52]. These
models are interpretable because their structure can be easily followed, commencing from the root
node, through the subsequent nodes and edges, until the leaf node with the predicted outcome is
reached. DT algorithms are considered interpretable due to their hierarchical structure of
if-then-else rules, which can be easily visualized, understood, and interpreted by humans [43].</p>
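<p>The hierarchical if-then-else structure that makes a decision tree interpretable can be written out by hand. The splits, thresholds, and feature names below are illustrative assumptions, not rules learned from any dataset in the reviewed studies:</p>
<preformat>
```python
def predict_heart_risk(age, resting_bp, chest_pain_type):
    # Each branch is a node of the tree; each return is a leaf.
    if chest_pain_type == "typical":    # root-node split
        return "high risk"
    elif age >= 45:                     # internal node
        if resting_bp >= 140:
            return "high risk"          # leaf node
        return "moderate risk"          # leaf node
    return "low risk"                   # leaf node
```
</preformat>
<p>The path from the root to a leaf can be read directly as the explanation for each individual prediction.</p>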
    </sec>
    <sec id="sec-8">
      <title>7. Conclusion and Future works</title>
      <p>We focus in this paper on identifying the impact of ML/DL and their applications in providing
solutions to health challenges. These include prediction modelling, enhancing models using ensemble
machine learning and optimization techniques, application of fog facilities and cloud systems for
collecting, sharing and storing data as well as easy retrieval of the data. We considered the main
research studies applying ML/DL algorithms to healthcare. We highlighted the benefits of using EHR:
some of the collected datasets adopted for the training, validation and testing in the simulation
processes of the model development were stored in the EHR. Various data types are considered.
Moreover, we examined the most well-known explainability frameworks and their different
characteristics.</p>
      <p>However, there are several challenges or shortfalls in the previous research studies, which require
prospective investigation to further enhance the impact of ML/DL applications in the healthcare
sector. For example, making predictions requires the involvement of optimization techniques such as
hyperparameter tuning, grid-search techniques, etc. Further studies could apply
ensemble learning approaches along with model-explainer techniques to enhance the adoption of
ML/DL in healthcare. Furthermore, transparency in the model undeniably fosters trust, which in turn
promotes its adoption and contributes to improving the developed model.</p>
    </sec>
    <sec id="sec-9">
      <title>Declaration on Generative AI</title>
      <p>In the preparation of this study, the author(s) used Quillbot and Grammarly to paraphrase and reword
the text and to check its grammar and spelling. After using these tools, the author(s) reviewed and edited
the content as necessary and take(s) full responsibility for the publication’s content.</p>
    </sec>
    <sec id="sec-10">
      <title>Acknowledgement</title>
      <p>We would like to thank the Tertiary Education Trust Fund (TETFUND) for supporting this study.</p>
    </sec>
    <sec id="sec-11">
      <title>References</title>
      <p>
[15] P. Verma and S. K. Sood, “Fog Assisted-IoT Enabled Patient Health Monitoring in Smart Homes,”
IEEE Internet of Things Journal, vol. 5, no. 3, pp. 1789–1796, Jun. 2018, doi:
10.1109/JIOT.2018.2803201.
[16] D.-K. Nguyen, C.-H. Lan, and C.-L. Chan, “Deep Ensemble Learning Approaches in Healthcare
to Enhance the Prediction and Diagnosing Performance: The Workflows, Deployments, and
Surveys on the Statistical, Image-Based, and Sequential Datasets,” International Journal of
Environmental Research and Public Health, vol. 18, no. 20, Art. no. 20, Jan. 2021, doi:
10.3390/ijerph182010811.
[17] Sumbatilinda, “Ensemble Learning in Machine Learning: Bagging, Boosting and Stacking,”
Medium. Accessed: Jul. 23, 2024. [Online]. Available:
https://medium.com/@sumbatilinda/ensemble-learning-in-machine-learning-baggingboosting-and-stacking-a00c6bae971f
[18] T. Chen and C. Guestrin, “XGBoost: A Scalable Tree Boosting System,” in Proceedings of the 22nd
ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, in KDD ’16.
New York, NY, USA: Association for Computing Machinery, Aug. 2016, pp. 785–794. doi:
10.1145/2939672.2939785.
[19] R. Mirzaeian, R. Nopour, Z. Asghari Varzaneh, M. Shafiee, M. Shanbehzadeh, and H.
KazemiArpanahi, “Which are best for successful aging prediction? Bagging, boosting, or simple
machine learning algorithms?,” BioMed Eng OnLine, vol. 22, no. 1, p. 85, Aug. 2023, doi:
10.1186/s12938-023-01140-9.
[20] S. Ganie, P. Dutta Pramanik, M. Malik, and A. Nayyar, “An Improved Ensemble Learning
Approach for Heart Disease Prediction Using Boosting Algorithms,” Computer Systems Science
and Engineering, vol. 46, pp. 3993–4006, Apr. 2023, doi: 10.32604/csse.2023.035244.
[21] S. Faluyi, T. Balogun, G. Ojo, K. Fapohunda, and Akande Adeyemi, “Forecasting transaction card
fraud using boosting algorithms,” in Communication and e-Systems for Economic Stability, 2023.
[22] B. A. Akinnuwesi et al., “Application of support vector machine algorithm for early differential
diagnosis of prostate cancer,” Data Science and Management, vol. 6, no. 1, pp. 1–12, Mar. 2023,
doi: 10.1016/j.dsm.2022.10.001.
[23] E.-J. Lee, Y.-H. Kim, N. Kim, and D.-W. Kang, “Deep into the Brain: Artificial Intelligence in</p>
      <p>Stroke Imaging,” J Stroke, vol. 19, no. 3, pp. 277–285, Sep. 2017, doi: 10.5853/jos.2017.02054.
[24] N. L. W. S. R. Ginantra, I. G. A. D. Indradewi, and E. Hartono, “Machine learning approach for
Acute Respiratory Infections (ISPA) prediction: Case study Indonesia,” J. Phys.: Conf. Ser., vol.
1469, no. 1, p. 012044, Feb. 2020, doi: 10.1088/1742-6596/1469/1/012044.
[25] F. Al Hossain, A. A. Lover, G. A. Corey, N. G. Reich, and T. Rahman, “FluSense: A Contactless
Syndromic Surveillance Platform for Influenza-Like Illness in Hospital Waiting Areas,” Proc.
ACM Interact. Mob. Wearable Ubiquitous Technol., vol. 4, no. 1, pp. 1–28, Mar. 2020, doi:
10.1145/3381014.
[26] N. Assery, Y. Xiaohong, S. Almalki, R. Kaushik, and Q. Xiuli, “Comparing Learning-Based
Methods for Identifying Disaster-Related Tweets,” in 2019 18th IEEE International Conference On
Machine Learning And Applications (ICMLA), Dec. 2019, pp. 1829–1836. doi:
10.1109/ICMLA.2019.00295.
[27] A. Khang, G. Rana, R. K. Tailor, and V. Abdullayev, Eds., Data-centric AI solutions and emerging
technologies in the healthcare ecosystem, First edition. Boca Raton: CRC Press, 2024.
[28] H. Shi, H. Wang, Y. Huang, L. Zhao, C. Qin, and C. Liu, “A hierarchical method based on
weighted extreme gradient boosting in ECG heartbeat classification,” Computer Methods and
Programs in Biomedicine, vol. 171, pp. 1–10, Apr. 2019, doi: 10.1016/j.cmpb.2019.02.005.
[29] S. Barbon Junior, V. G. T. Costa, S.-H. Chen, and R. C. Guido, “U-Healthcare System for
PreDiagnosis of Parkinson’s Disease from Voice Signal,” in 2018 IEEE International Symposium on
Multimedia (ISM), Dec. 2018, pp. 271–274. doi: 10.1109/ISM.2018.00039.
[30] F. Zafar, S. Raza, M. U. Khalid, and M. A. Tahir, “Predictive Analytics in Healthcare for Diabetes
Prediction,” in Proceedings of the 2019 9th International Conference on Biomedical Engineering
and Technology, in ICBET ’19. New York, NY, USA: Association for Computing Machinery, Mar.
2019, pp. 253–259. doi: 10.1145/3326172.3326213.
[31] A. Mehra, M. Mandal, P. Narang, and V. Chamola, “ReViewNet: A Fast and Resource Optimized
Network for Enabling Safe Autonomous Driving in Hazy Weather Conditions,” IEEE Trans.</p>
      <p>Intell. Transport. Syst., vol. 22, no. 7, pp. 4256–4266, Jul. 2021, doi: 10.1109/TITS.2020.3013099.
[32] M. Alhussein, G. Muhammad, M. S. Hossain, and S. U. Amin, “Cognitive IoT-Cloud Integration
for Smart Healthcare: Case Study for Epileptic Seizure Detection and Monitoring,” Mobile Netw
Appl, vol. 23, no. 6, pp. 1624–1635, Dec. 2018, doi: 10.1007/s11036-018-1113-0.
[33] H. Ke et al., “Cloud‐aided online EEG classification system for brain healthcare: A case study of
depression evaluation with a lightweight CNN,” Softw Pract Exp, vol. 50, no. 5, pp. 596–610, May
2020, doi: 10.1002/spe.2668.
[34] G. Ciocca, P. Napoletano, and R. Schettini, “CNN-based features for retrieval and classification
of food images,” Comput. Vis. Image Understand., vol. 176, pp. 70–77, 2018.
[35] M. Alhussein and G. Muhammad, “Voice pathology detection using deep learning on mobile
healthcare framework,” vol. 6, pp. 41034–41041, 2018.
[36] G. Bansal, V. Chamola, P. Narang, S. Kumar, and S. Raman, “Deep3DSCan: Deep residual
network and morphological descriptor based framework for lung cancer classification and 3D
segmentation,” IET Image Processing, vol. 14, no. 7, pp. 1240–1247, 2020, doi:
10.1049/iet-ipr.2019.1164.
[37] J. Klaise, A. Van Looveren, C. Cox, G. Vacanti, and A. Coca, “Monitoring and explainability of
models in production,” Jul. 13, 2020, arXiv: arXiv:2007.06299. doi: 10.48550/arXiv.2007.06299.
[38] J. Klaise, A. Van Looveren, G. Vacanti, and A. Coca, “Alibi explain: algorithms for explaining
machine learning models,” J. Mach. Learn. Res., vol. 22, no. 1, p. 181:8194-181:8200, Jan. 2021.
[39] J. Covell, “Project expl AI n - Interim report | Policy Commons,” 2019, Accessed: Aug. 19, 2024.</p>
      <p>[Online]. Available: https://policycommons.net/artifacts/2440692/project-expl-ai-n/3462416/
[40] H. A. H. Al-Najjar, B. Pradhan, G. Beydoun, R. Sarkar, H.-J. Park, and A. Alamri, “A novel
method using explainable artificial intelligence (XAI)-based Shapley Additive Explanations for
spatial landslide prediction using Time-Series SAR dataset,” Gondwana Research, vol. 123, pp.
107–124, Nov. 2023, doi: 10.1016/j.gr.2022.08.004.
[41] P. Biecek, “XAI in Python with dalex,” Medium. Accessed: Aug. 19, 2024. [Online]. Available:
https://medium.com/@ModelOriented/xai-in-python-with-dalex-4b173486aa92
[42] A. Dhinakaran, “A Look Into Global, Cohort and Local Model Explainability,” Medium.</p>
      <p>Accessed: Aug. 18, 2024. [Online]. Available:
https://towardsdatascience.com/a-look-intoglobal-cohort-and-local-model-explainability-973bd449969f
[43] C. O. Retzlaff et al., “Post-hoc vs ante-hoc explanations: xAI design guidelines for data
scientists,” Cognitive Systems Research, vol. 86, p. 101243, Aug. 2024, doi:
10.1016/j.cogsys.2024.101243.
[44] J. M. Metsch et al., “CLARUS: An interactive explainable AI platform for manual counterfactuals
in graph neural networks,” Journal of Biomedical Informatics, vol. 150, p. 104600, Feb. 2024, doi:
10.1016/j.jbi.2024.104600.
[45] M. Plass, M. Kargl, P. Nitsche, E. Jungwirth, A. Holzinger, and H. Muller, “Understanding and
Explaining Diagnostic Paths: Toward Augmented Decision Making,” IEEE Comput. Grap. Appl.,
vol. 42, no. 6, pp. 47–57, Nov. 2022, doi: 10.1109/MCG.2022.3197957.
[46] A. Holzinger, “Explainable AI (ex-AI),” Informatik Spektrum, vol. 41, no. 2, pp. 138–143, Apr.</p>
      <p>2018, doi: 10.1007/s00287-018-1102-5.
[47] A. Holzinger et al., “Interactive machine learning: experimental evidence for the human in the
algorithmic loop: A case study on Ant Colony Optimization,” Appl Intell, vol. 49, no. 7, pp. 2401–
2414, Jul. 2019, doi: 10.1007/s10489-018-1361-5.
[48] C. O. Retzlaff et al., “Human-in-the-Loop Reinforcement Learning: A Survey and Position on
Requirements, Challenges, and Opportunities,” jair, vol. 79, pp. 359–415, Jan. 2024, doi:
10.1613/jair.1.15348.
[49] C. Glanois et al., “A survey on interpretable reinforcement learning,” Mach Learn, vol. 113, no.</p>
      <p>8, pp. 5847–5890, Aug. 2024, doi: 10.1007/s10994-024-06543-w.
[50] R. Dwivedi et al., “Explainable AI (XAI): Core Ideas, Techniques, and Solutions,” ACM Comput.</p>
      <p>Surv., vol. 55, no. 9, pp. 1–33, Sep. 2023, doi: 10.1145/3561048.
[51] S. Dandl, C. Molnar, M. Binder, and B. Bischl, “Multi-Objective Counterfactual Explanations,”
in Parallel Problem Solving from Nature – PPSN XVI, vol. 12269, T. Bäck, M. Preuss, A. Deutz, H.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>T.</given-names>
            <surname>Evans</surname>
          </string-name>
          et al., “
          <article-title>The explainability paradox: Challenges for xAI in digital pathology,” Future Generation Computer Systems</article-title>
          , vol.
          <volume>133</volume>
          , pp.
          <fpage>281</fpage>
          -
          <lpage>296</lpage>
          , Aug.
          <year>2022</year>
          , doi: 10.1016/j.future.
          <year>2022</year>
          .
          <volume>03</volume>
          .009.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Plass</surname>
          </string-name>
          et al.,
          <article-title>“Explainability and causability in digital pathology,”</article-title>
          <source>The Journal of Pathology CR</source>
          , vol.
          <volume>9</volume>
          , no.
          <issue>4</issue>
          , pp.
          <fpage>251</fpage>
          -
          <lpage>260</lpage>
          , Jul.
          <year>2023</year>
          , doi: 10.1002/cjp2.
          <fpage>322</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>H. K.</given-names>
            <surname>Bharadwaj</surname>
          </string-name>
          et al.,
          <article-title>“A Review on the Role of Machine Learning in Enabling IoT Based Healthcare Applications</article-title>
          ,” IEEE Access, vol.
          <volume>9</volume>
          , pp.
          <fpage>38859</fpage>
          -
          <lpage>38890</lpage>
          ,
          <year>2021</year>
          , doi: 10.1109/ACCESS.
          <year>2021</year>
          .
          <volume>3059858</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S. K.</given-names>
            <surname>Sood</surname>
          </string-name>
          and
          <string-name>
            <given-names>I.</given-names>
            <surname>Mahajan</surname>
          </string-name>
          , “
          <article-title>IoT-Fog-Based Healthcare Framework to Identify and Control Hypertension Attack,” IEEE Internet Things J</article-title>
          ., vol.
          <volume>6</volume>
          , no.
          <issue>2</issue>
          , pp.
          <fpage>1920</fpage>
          -
          <lpage>1927</lpage>
          , Apr.
          <year>2019</year>
          , doi: 10.1109/JIOT.
          <year>2018</year>
          .
          <volume>2871630</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>R.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Tanwar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Tyagi</surname>
          </string-name>
          , and
          <string-name>
            <given-names>N.</given-names>
            <surname>Kumar</surname>
          </string-name>
          , “
          <article-title>Machine Learning Models for Secure Data Analytics: A taxonomy and threat model</article-title>
          ,”
          <source>Computer Communications</source>
          , vol.
          <volume>153</volume>
          , pp.
          <fpage>406</fpage>
          -
          <lpage>440</lpage>
          , Mar.
          <year>2020</year>
          , doi: 10.1016/j.comcom.
          <year>2020</year>
          .
          <volume>02</volume>
          .008.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Bampa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Miliou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Jovanovic</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Papapetrou</surname>
          </string-name>
          , “
          <article-title>M-ClustEHR: A multimodal clustering approach for electronic health records</article-title>
          ,
          <source>” Artificial Intelligence in Medicine</source>
          , vol.
          <volume>154</volume>
          , p.
          <fpage>102905</fpage>
          ,
          Aug.
          <year>2024</year>
          , doi: 10.1016/j.artmed.
          <year>2024</year>
          .
          <volume>102905</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>G.</given-names>
            <surname>Lancia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. R. J.</given-names>
            <surname>Varkila</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O. L.</given-names>
            <surname>Cremer</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Spitoni</surname>
          </string-name>
          , “
          <article-title>Two-step interpretable modeling of ICU-AIs,”</article-title>
          <source>Artificial Intelligence in Medicine</source>
          , vol.
          <volume>151</volume>
          , p.
          <fpage>102862</fpage>
          , May
          <year>2024</year>
          , doi: 10.1016/j.artmed.
          <year>2024</year>
          .
          <volume>102862</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>H.</given-names>
            <surname>Habehh</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Gohel</surname>
          </string-name>
          , “Machine Learning in Healthcare,” CG, vol.
          <volume>22</volume>
          , no.
          <issue>4</issue>
          , pp.
          <fpage>291</fpage>
          -
          <lpage>300</lpage>
          , Dec.
          <year>2021</year>
          , doi: 10.2174/1389202922666210705124359.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>F. S.</given-names>
            <surname>Ahmed</surname>
          </string-name>
          et al.,
          <article-title>“A hybrid machine learning framework to predict mortality in paralytic ileus patients using electronic health records (EHRs</article-title>
          ),
          <source>” J Ambient Intell Human Comput</source>
          , vol.
          <volume>14</volume>
          , no.
          <issue>10</issue>
          , pp.
          <fpage>14367</fpage>
          -
          <lpage>14367</lpage>
          , Oct.
          <year>2023</year>
          , doi: 10.1007/s12652-020-02509-7.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Ge</surname>
          </string-name>
          et al.,
          <article-title>“Predicting post-stroke pneumonia using deep neural network approaches</article-title>
          ,”
          <source>International Journal of Medical Informatics</source>
          , vol.
          <volume>132</volume>
          , p.
          <fpage>103986</fpage>
          ,
          Dec.
          <year>2019</year>
          , doi: 10.1016/j.ijmedinf.
          <year>2019</year>
          .
          <volume>103986</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>B. P.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          et al.,
          “
          <article-title>Predicting the onset of type 2 diabetes using wide and deep learning with electronic health records</article-title>
          ,”
          <source>Computer Methods and Programs in Biomedicine</source>
          , vol.
          <volume>182</volume>
          , p.
          <fpage>105055</fpage>
          , Dec.
          <year>2019</year>
          , doi: 10.1016/j.cmpb.2019.105055.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>C.</given-names>
            <surname>Molnar</surname>
          </string-name>
          , “
          <article-title>8.6 Global Surrogate</article-title>
          ,” in
          <source>Interpretable Machine Learning</source>
          .
          <year>2019</year>
          . Accessed: Jul. 29, 2024. [Online]. Available: https://christophm.github.io/interpretable-ml-book/global.html
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>G.</given-names>
            <surname>Vilone</surname>
          </string-name>
          and
          <string-name>
            <given-names>L.</given-names>
            <surname>Longo</surname>
          </string-name>
          , “
          <article-title>Notions of explainability and evaluation approaches for explainable artificial intelligence</article-title>
          ,”
          <source>Information Fusion</source>
          , vol.
          <volume>76</volume>
          , pp.
          <fpage>89</fpage>
          -
          <lpage>106</lpage>
          , Dec.
          <year>2021</year>
          , doi: 10.1016/j.inffus.2021.05.009.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>A.</given-names>
            <surname>Saucedo</surname>
          </string-name>
          , “
          <article-title>Production Machine Learning Monitoring: Outliers, Drift, Explainers &amp; Statistical Performance</article-title>
          ,”
          <source>Medium</source>
          . Accessed: Jul. 29,
          <year>2024</year>
          . [Online]. Available: https://towardsdatascience.com/production-machine-learning-monitoring-outliers-driftexplainers-statistical-performance-d9b1d02ac158
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>