<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>SMARTERCARE Workshop, November</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>ALFABETO: Supporting COVID-19 hospital admissions with Bayesian Networks</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Giovanna Nicora</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Antonio Lo Tito</string-name>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Antonella Donatelli</string-name>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giovanni Callea</string-name>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Carla Biasibetti</string-name>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Maria Vittoria Galli</string-name>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Federico Comotto</string-name>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Chandra Bortolotto</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Stefano Perlini</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lorenzo Preda</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Riccardo Bellazzi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Dep. of Electrical, Computer and Biomedical Engineering, University of Pavia</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Dep. of Internal Medicine and Therapeutics, University of Pavia</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Dep. of Radiology, Fondazione IRCCS Policlinico San Matteo</institution>
          ,
          <addr-line>Pavia</addr-line>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Emergency Department, I.R.C.C.S. Policlinico San Matteo Foundation</institution>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>Laife Reply</institution>
          ,
          <addr-line>Milan</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff5">
          <label>5</label>
          <institution>Unit of Radiology, Department of Clinical, Surgical, Diagnostic and Pediatric Sciences, University of Pavia</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2021</year>
      </pub-date>
      <volume>29</volume>
      <issue>2021</issue>
      <fpage>79</fpage>
      <lpage>84</lpage>
      <abstract>
        <p>The ongoing coronavirus disease pandemic has accelerated the implementation of machine learning (ML) methods to support clinical decisions. Within this context, we present the ALFABETO project, whose aim is to aid clinicians during the hospital admission of COVID-19 patients through the application of ML approaches exploiting clinical and chest X-ray features. Yet, nonlinear ML classifiers are often perceived as not easily interpretable by users, which hampers trust in ML predictions. Moreover, these ML models, such as Neural Networks or Random Forest, cannot include pre-existing knowledge about a specific domain and are not designed to find causal relationships between variables. For these reasons, we investigated whether Bayesian Networks are able to properly describe the hospital admission decision process. Bayesian Networks are probabilistic graphical models representing a set of variables and their conditional dependencies. The network structure was derived both from existing medical knowledge and from patient data collected during the first wave of the pandemic. We show that the Bayesian Network, while being explainable, performs similarly to a less explainable ML model and generalizes well across COVID-19 waves.</p>
      </abstract>
      <kwd-group>
        <kwd>Bayesian networks</kwd>
        <kwd>COVID-19 hospitalization prediction</kwd>
        <kwd>Clinical decision support system</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        As of October 2021, the ongoing coronavirus disease 2019 (COVID-19) pandemic has caused
more than 200 million confirmed cases and 4 million deaths [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. COVID-19 illness severity
varies greatly: many patients experience mild or no symptoms, while others require
hospitalization of varying length. Since the beginning of the pandemic, Artificial Intelligence (AI) approaches
have been identified as useful tools to support clinicians [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. These tools hold the promise
of effectively supporting different types of decisions, from hospital admission to therapeutic
strategies, and research hospitals have worked to integrate them into clinical practice. Nevertheless,
building machine learning (ML) tools able to generalize over time and/or to patients coming from different hospitals
can be challenging, since ML inherently suffers from dataset shift and poor generalization
across different populations [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. As the pandemic evolves, several sources of data shift
are arising, from new, highly transmissible variants to new treatment protocols. Additionally,
the classifications produced by widely used ML algorithms, from Neural Networks to Gradient Boosting, are
often perceived as opaque. Current research on explainable AI (XAI) aims to make ML
predictions more transparent for the user. To this end, different XAI approaches have
been developed, many of which explain single ML predictions by highlighting
the important features that lead the classifier to its final decision. Yet, as recently stated, in order
to reach explainability in medicine we need to promote causability [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. However, explanations
derived from current XAI methods reflect spurious correlations rather than cause/effect
relationships, leading to erroneous or even biased explanations [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. In this context, we present
the ALFABETO Sars-CoV2 project (ALl FAster BEtter TOgether), whose aim is to develop an
AI-based pipeline integrating data from diagnostic tools and clinical features to support clinicians
during the triage of COVID-19 patients at the Policlinico San Matteo University Hospital,
located in Pavia (Italy). The ML-based component will suggest to clinicians whether the patient
can be treated at home or needs to be hospitalized. In particular, we have developed a
Bayesian Network (BN), a probabilistic graphical model that represents the conditional
dependencies among a set of variables. As a consequence, BNs are particularly suitable for modelling
pre-existing domain knowledge and the automated reasoning process of human experts [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
We were therefore able to model existing medical evidence and suggest potential cause/effect
relationships between clinical variables and hospital admission. We evaluated whether the
predictive performance of the BN is comparable with that of a widely used but less explainable
ML model, such as Random Forest. We also tested the generalization ability of the models across
different pandemic waves.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Materials and Methods</title>
      <sec id="sec-2-1">
        <title>2.1. Datasets</title>
        <p>During the first wave of the COVID-19 pandemic in Italy (March 2020-May 2021), we gathered
data from 660 COVID-19 patients treated at the IRCCS Policlinico S. Matteo hospital, a
center of excellence known for having successfully treated the first diagnosed COVID-19 patients
in Western countries. Half of these patients were hospitalized, while the remainder showed a
better prognosis and were treated at home. For each patient, we collected clinical features, such
as age, gender, and evidence of comorbidities. Deep Learning (DL) was used to extract features from
chest radiograph (RX) images through the X-RAIS platform, developed by Reply. X-RAIS
is a deep network able to analyze different types of medical images and to extract relevant
information for diagnosis. In this context, X-RAIS transforms the RX image of a patient into 5
clinically relevant numerical features: Consolidation, Infiltration, Edema, Effusion and Lung
Opacity. These 5 features, together with 19 clinical features, are the input of an ML
model that predicts whether a patient should be hospitalized (class 1) or not (class 0). We
randomly selected 90% of the patients as the training set. The remaining 10% of patients were held out
for testing and for selecting the best-performing model. During the third wave (March-May 2021),
462 additional patients underwent triage. In this case, 68% of patients were hospitalized.
The third-wave set was used as the validation set.</p>
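        <p>As an illustration of the random 90%/10% partition of the first-wave patients, the following minimal Python sketch splits 660 patient identifiers. The function name split_patients, the seed and the use of integer identifiers are assumptions made for this example; they are not taken from the ALFABETO code base.</p>
        <preformat>
```python
import random


def split_patients(patient_ids, test_fraction=0.10, seed=0):
    """Randomly partition patient identifiers into training and test sets."""
    ids = list(patient_ids)
    # A seeded generator keeps the (hypothetical) split reproducible
    random.Random(seed).shuffle(ids)
    n_test = round(len(ids) * test_fraction)
    return ids[n_test:], ids[:n_test]


# Hypothetical identifiers 0..659 stand in for the 660 first-wave patients
train_ids, test_ids = split_patients(range(660))
print(len(train_ids), len(test_ids))
```
        </preformat>
        <p>With 660 patients, a 10% hold-out corresponds to 66 test patients and 594 training patients.</p>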
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Bayesian Network design and implementation</title>
        <p>
          To implement the BN, we first designed a graph based on our pre-existing knowledge. This
graph contains relationships between a few variables that may represent the clinician's reasoning
process during triage. The graph is shown in Figure 1a: the node labelled “Treatment
(Home vs Hospital)” is the target node representing our outcome of interest, i.e. whether the
patient should be hospitalized or not. To make this decision, we assume that the clinician
would evaluate at least the age, the gender (male patients are more likely to incur more severe
consequences from the infection) and whether the patient has breathing difficulties. The target
node depends on these 3 variables, and we also assume a direct dependency between age and
breathing difficulties. We then enriched the structure of this graph with the remaining collected
variables by applying the hill climbing search algorithm to the training data: starting
from the constraints represented in Figure 1a, this method performs a greedy local search,
applying at each step the single-edge manipulation that maximally increases a fitness score. The search
terminates once a local maximum is found [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. The resulting graph is shown in Figure 1b:
notably, the Boolean feature indicating whether the patient has more than 2 comorbidities
(“ComorbiditiesGreaterThan2”) is explicitly linked to comorbidity nodes, such as the presence
of cancer or cardiovascular diseases. Interestingly, the outcome is not directly linked to the node
“ComorbiditiesGreaterThan2”, but it can be linked to the presence of comorbidities through the
patient’s age. Some DL features directly depend on the target node. The BN is implemented in
Python 3.7, using the bnlearn package.
        </p>
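        <p>The greedy search described above can be sketched in plain Python as follows. This is not the bnlearn implementation used in the project: it is a simplified re-implementation for discrete variables, the BIC score is an assumed choice of fitness score, and the variable names and synthetic records are hypothetical. The expert edges are passed as fixed edges that the search may neither delete nor reverse, which is one possible reading of the constraints in Figure 1a.</p>
        <preformat>
```python
import itertools
import math
import random
from collections import Counter


def node_bic(data, node, parents):
    """BIC contribution of one discrete node given its parent set."""
    parents = sorted(parents)
    joint = Counter((tuple(r[p] for p in parents), r[node]) for r in data)
    marg = Counter(tuple(r[p] for p in parents) for r in data)
    loglik = sum(c * math.log(c / marg[pa]) for (pa, _), c in joint.items())
    levels = lambda v: len({r[v] for r in data})
    n_params = (levels(node) - 1) * math.prod(levels(p) for p in parents)
    return loglik - 0.5 * math.log(len(data)) * n_params


def graph_bic(data, nodes, edges):
    """Decomposable score: sum of the per-node BIC terms."""
    return sum(node_bic(data, n, [u for u, v in edges if v == n]) for n in nodes)


def is_acyclic(nodes, edges):
    """Repeatedly strip nodes without incoming edges; a cycle leaves none to strip."""
    remaining, edges = set(nodes), set(edges)
    while remaining:
        roots = {n for n in remaining if not any(v == n for _, v in edges)}
        if not roots:
            return False
        remaining -= roots
        edges = {(u, v) for u, v in edges if u not in roots}
    return True


def neighbours(nodes, edges, fixed):
    """Graphs reachable by adding, deleting or reversing a single edge."""
    for u, v in itertools.permutations(nodes, 2):
        if (u, v) in edges and (u, v) not in fixed:
            yield edges - {(u, v)}                # delete
            yield (edges - {(u, v)}) | {(v, u)}   # reverse
        elif (u, v) not in edges and (v, u) not in edges:
            yield edges | {(u, v)}                # add


def hill_climb(data, nodes, fixed_edges):
    """Greedy search: take the best single-edge change until none improves."""
    fixed = set(fixed_edges)
    edges = frozenset(fixed)
    best = graph_bic(data, nodes, edges)
    while True:
        scored = [(graph_bic(data, nodes, cand), cand)
                  for cand in neighbours(nodes, edges, fixed)
                  if is_acyclic(nodes, cand)]
        top_score, top = max(scored, key=lambda s: s[0], default=(best, edges))
        if top_score - best > 1e-9:
            edges, best = frozenset(top), top_score
        else:
            return set(edges), best  # local maximum reached


# Synthetic, hypothetical records (not patient data)
rng = random.Random(0)
rows = []
for _ in range(400):
    age = rng.choice(["young", "old"])
    male = rng.choice([0, 1])
    dyspnea = int(rng.random() > (0.4 if age == "old" else 0.8))
    crp_high = int(rng.random() > 0.5)
    risk = 0.1 + 0.3 * dyspnea + 0.2 * (age == "old") + 0.1 * male + 0.3 * crp_high
    rows.append({"Age": age, "Gender": male, "Dyspnea": dyspnea,
                 "CRP": crp_high, "Treatment": int(risk > rng.random())})

nodes = ["Age", "Gender", "Dyspnea", "CRP", "Treatment"]
prior = {("Age", "Treatment"), ("Gender", "Treatment"),
         ("Dyspnea", "Treatment"), ("Age", "Dyspnea")}
learned, score = hill_climb(rows, nodes, prior)
print(sorted(learned))
```
        </preformat>
        <p>Because every accepted move strictly increases the score and the number of graphs is finite, the loop is guaranteed to stop, but only at a local maximum, which is why the starting graph matters.</p>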
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Results</title>
      <p>
        Here, we report the predictive performance of the BN in Figure 1b, whose structure is based both
on evidence and on data. The simplest network, based only on evidence (Fig. 1a), shows good
recall (i.e. the ability to correctly classify hospitalized patients) on the Test set (85%), but low specificity
(around 20%), and it was therefore excluded from the analysis. We trained and tested three additional
models: a regularized Logistic Regression, Gradient Boosting and Random Forest (RF). We show
the performance of the RF only, since it outperforms the other two models on the test data. RF is a
widely applied ensemble classifier that works by training several decision trees through bagging.
Table 1 reports the BN and RF classification performance on the 66 patients of the Test set, in terms of
various metrics, such as the Area Under the ROC Curve (AUC) and the Area Under the Precision-Recall
Curve (PRC). Performance is quite similar, but the BN shows slightly higher values for all the
metrics. To test whether the error rates of the two approaches are significantly different, we
apply McNemar’s test [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. The p-value is 0.6, so we cannot reject the null hypothesis that the
two classifiers have the same error rate. Table 2 reports the predictive performance, with 95% confidence
intervals, on the 462 third-wave patients, where RF has slightly higher performance.
In this case too, the p-value of McNemar’s test calculated on the confusion matrix is high
(0.7), and the estimated confidence intervals overlap. In comparison with the Test set results, both
BN and RF show lower recall but higher specificity. We examined the RF feature importance
by computing the mean decrease in impurity. The most important feature for classification is the
C-reactive protein value (Pcr), which was also directly linked to the outcome by the BN structure
learning algorithm. Pcr levels usually increase when inflammation is occurring. In RF, Pcr is
followed by four DL-extracted features (LungOpacity, Edema, Consolidation and Infiltration).
All these features, except for Consolidation, have a direct link to the outcome (Figure 1b). Age, gender
and breathing difficulties are placed in the 8th, 9th and 10th positions.
      </p>
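      <p>For reference, McNemar's test compares two classifiers on the same patients using only the discordant pairs: b patients classified correctly by the first model and incorrectly by the second, and c patients for which the opposite holds. In the continuity-corrected form the statistic is (|b - c| - 1)^2 / (b + c), compared against a chi-square distribution with one degree of freedom. The sketch below is a minimal illustration of that form; the text does not state which variant of the test was used, and the labels and predictions are hypothetical toy values, not study data.</p>
      <preformat>
```python
import math


def mcnemar(y_true, pred_a, pred_b):
    """Continuity-corrected McNemar test on paired predictions of two classifiers."""
    triples = list(zip(y_true, pred_a, pred_b))
    # b: model A correct and model B wrong; c: model A wrong and model B correct
    b = sum(1 for t, a, p in triples if a == t and p != t)
    c = sum(1 for t, a, p in triples if a != t and p == t)
    if b + c == 0:
        return 0.0, 1.0  # the two models never disagree on correctness
    stat = (abs(b - c) - 1) ** 2 / (b + c)
    # Upper tail of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical toy labels (1 = hospitalized, 0 = treated at home)
y_true = [1, 1, 1, 0, 0, 0, 1, 0, 1, 1, 0, 0]
pred_bn = [1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0]
pred_rf = [0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 0]

stat, p_value = mcnemar(y_true, pred_bn, pred_rf)
print(stat, p_value)
```
      </preformat>
      <p>A large p-value, as in the comparisons reported here, means the data give no evidence that the two error rates differ; it does not prove that they are equal.</p>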
    </sec>
    <sec id="sec-4">
      <title>4. Discussion</title>
      <p>The need for clinical decision support systems implementing ML is increasing. These approaches
are able to detect useful hidden patterns in data, which can be exploited to support knowledge
discovery and/or to implement automatic and highly accurate classifiers. Explainability of the
classification process is needed to safely integrate ML into clinical practice, yet the majority
of high-performing classifiers are perceived as black boxes. Additionally, by learning entirely
from training data, most of them prevent the integration of existing medical knowledge and
evidence. Here, we describe the development of a Bayesian Network whose aim is to predict hospital
admission of COVID-19 patients. The BN allows us to: 1) develop a model that is explainable by
design and 2) combine known evidence about variable dependencies with information encoded
in patient data. The resulting structure can be inspected by clinicians to understand the
classification process. The BN generalizes well to the third wave, even though some
population variables, such as age, changed in comparison with the first-wave patients
used for training. Moreover, the BN's predictive ability is similar to that of a completely data-driven and
less interpretable approach (RF). Future work will explore new network configurations, incorporating
additional medical knowledge, and will investigate potential causal relationships between
variables.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <source>WHO Coronavirus (COVID-19) Dashboard</source>
          . URL: https://covid19.who.int.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Alimadadi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Aryal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Manandhar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. B.</given-names>
            <surname>Munroe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Joe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Cheng</surname>
          </string-name>
          ,
          <article-title>Artificial intelligence and machine learning to fight COVID-19</article-title>
          ,
          <source>Physiological Genomics</source>
          <volume>52</volume>
          (
          <year>2020</year>
          )
          <fpage>200</fpage>
          -
          <lpage>202</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1152/physiolgenomics.00029.2020</pub-id>
          , publisher: American Physiological Society.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>C. J.</given-names>
            <surname>Kelly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Karthikesalingam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Suleyman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Corrado</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>King</surname>
          </string-name>
          ,
          <article-title>Key challenges for delivering clinical impact with artificial intelligence</article-title>
          ,
          <source>BMC Medicine</source>
          <volume>17</volume>
          (
          <year>2019</year>
          )
          <elocation-id>195</elocation-id>
          . doi:
          <pub-id pub-id-type="doi">10.1186/s12916-019-1426-2</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Holzinger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Langs</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Denk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Zatloukal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Müller</surname>
          </string-name>
          ,
          <article-title>Causability and explainability of artificial intelligence in medicine</article-title>
          ,
          <source>WIREs Data Mining and Knowledge Discovery</source>
          <volume>9</volume>
          (
          <year>2019</year>
          )
          <elocation-id>e1312</elocation-id>
          . doi:
          <pub-id pub-id-type="doi">10.1002/widm.1312</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Y.-L.</given-names>
            <surname>Chou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Moreira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Bruza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Ouyang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Jorge</surname>
          </string-name>
          ,
          <article-title>Counterfactuals and Causability in Explainable Artificial Intelligence: Theory, Algorithms, and Applications</article-title>
          , arXiv:2103.04244 [cs] (
          <year>2021</year>
          ). URL: http://arxiv.org/abs/2103.04244.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>S.</given-names>
            <surname>Thirumuruganathan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Huber</surname>
          </string-name>
          ,
          <article-title>Building Bayesian Network based expert systems from rules</article-title>
          ,
          in:
          <source>2011 IEEE International Conference on Systems, Man, and Cybernetics</source>
          ,
          <year>2011</year>
          , pp.
          <fpage>3002</fpage>
          -
          <lpage>3008</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1109/ICSMC.2011.6084157</pub-id>
          , ISSN: 1062-922X.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M.</given-names>
            <surname>Scutari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. E.</given-names>
            <surname>Graafland</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Gutiérrez</surname>
          </string-name>
          ,
          <article-title>Who learns better Bayesian network structures: Accuracy and speed of structure learning algorithms</article-title>
          ,
          <source>International Journal of Approximate Reasoning</source>
          <volume>115</volume>
          (
          <year>2019</year>
          )
          <fpage>235</fpage>
          -
          <lpage>253</lpage>
          . URL: https://www.sciencedirect.com/science/article/pii/S0888613X19301434. doi:
          <pub-id pub-id-type="doi">10.1016/j.ijar.2019.10.003</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>T. G.</given-names>
            <surname>Dietterich</surname>
          </string-name>
          ,
          <article-title>Approximate Statistical Tests for Comparing Supervised Classification Learning Algorithms</article-title>
          ,
          <source>Neural Computation</source>
          <volume>10</volume>
          (
          <year>1998</year>
          )
          <fpage>1895</fpage>
          -
          <lpage>1923</lpage>
          . URL: https://doi.org/10.1162/089976698300017197. doi:
          <pub-id pub-id-type="doi">10.1162/089976698300017197</pub-id>
          , publisher: MIT Press.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>