<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Co-located with STAF</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Clinical Data Modeling Combining Agent-Based and Epidemiological Models</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Denisse Kim</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Manuel Campos</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bernardo Canovas-Segura</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jose M. Juarez</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>MedAI Lab, University of Murcia</institution>
          ,
          <addr-line>Campus Espinardo, Murcia, 30100</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Murcian Bio-Health Institute (IMIB-Arrixaca)</institution>
          ,
          <addr-line>El Palmar, Murcia, 30120</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>18</volume>
      <fpage>18</fpage>
      <lpage>21</lpage>
      <abstract>
        <p>Hospital-acquired infections (HAIs) are a major concern, since they pose a serious threat to society and increase healthcare costs. AI techniques perform well in the development of effective systems to help in their control and prevention. However, many recent studies highlight the lack of available datasets for reproducing their experiments and call for more trustworthy medical AI models. Realistic data simulation is a valid approach for testing these models when data are not publicly available or when clinical data gathering is cumbersome or impossible. Existing simulators often focus on implementing compartmental epidemiological models and contact networks for validating epidemiological hypotheses. However, very little attention is paid to hospital infrastructure (e.g. hospital building, policy, shifts, etc.), which plays a key role in the infection and outbreak processes. This paper proposes a novel approach for a simulation model of HAI spread, combining agent-based patient description, spatial-temporal constraints of the hospital settings, and microorganism behavior driven by epidemiological models.</p>
      </abstract>
      <kwd-group>
        <kwd>simulation model</kwd>
        <kwd>agent-based model</kwd>
        <kwd>hospital-acquired infection</kwd>
        <kwd>infection control</kwd>
        <kwd>epidemiological model</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Multidrug-resistant bacteria (MDR-bacteria) are bacteria that have evolved and acquired resistance
to antimicrobial drugs. This resistance makes treatment more complex, increasing the risk
of infection, of its spread, and of death [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. They are a growing concern, especially in the case
of hospital-acquired infections (HAIs), since they entail an increase in healthcare costs and a
major threat to society. Health systems must have the necessary means to assess the
presence of these infections in hospitals. To this end, the spatial structure of a hospital and the
physical distribution of patients over time play important roles in the detection of infection
outbreaks and the prevention of their spread.
      </p>
      <p>
        In this context, machine learning (ML) and deep learning (DL) techniques present an
opportunity to develop effective systems that can help in clinical decision-making and planning.
The development of these systems requires access to large volumes of high-quality data, both
for training and for validation. However, several concerns might compromise the use of health
data in AI research, such as the quality of accessible data and the risk of bias, the protection of
individual privacy, or the loss of individual autonomy, among others [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        Some approaches to preserving privacy are the anonymization or pseudonymization of
data. These could encourage data use, though not without challenges (e.g. data triangulation)
[
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Part of the solution to this problem is the simulation of realistic data, which has two main
benefits: from a public-health perspective, simulated data allow predictive analysis and early evaluation
of hospital policies in different scenarios; from a medical-AI perspective, they are useful for
implementing and evaluating future ML and DL techniques in a fairer and more reliable way.
      </p>
      <p>This paper presents a simulation model to study the propagation of infections within a
hospital. The core of the simulation tracks the movement of a population of patients inside
a hospital and allows the analysis of the spread and outbreak of an infectious disease in this
population. To this end, it combines an agent-based model (micro-scale
model), the dynamics of an epidemiological compartmental model (macro-scale model), and the
policies and physical structure of a healthcare environment.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Simulation model and structure</title>
      <p>Our proposal is based on a discrete event time model that simulates the spread of a bacterial infection
among patients in a hospital setting. At each time step, hospitalized patients
move around the hospital and can take part in the transmission of a bacterial
infection as well as in its potential outbreaks. To achieve this, the simulation model consists of
three main components: the input parameters, the core of the simulator (i.e. micro-scale modeling
via an agent-based model, macro-scale modeling with a compartmental model, and
spatial-temporal constraints of the hospital settings), and the simulation outputs. The framework
of the simulation model is presented in Figure 1 and the components are explained below.</p>
      <sec id="sec-2-1">
        <title>2.1. Macro-scale model: epidemic model</title>
        <p>
          Compartmental models are modeling techniques often applied in the simulation of infectious
diseases. Individuals in the population are assigned to labeled compartments and can progress from one
to another. In these models, the population dynamics are well known [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ], which makes them
suitable for performing predictions and estimations of epidemiological parameters. Among the
many variations of compartmental models (e.g. SIR, SIS, SEIR, etc.), we chose the SEIRD model
because it fits the disease progression we want to study. To
represent the evolution of the disease, we have adapted this model so that
each patient is in exactly one health state at a time: susceptible (S), exposed (E), infected (I), recovered (R),
deceased (D), or non-susceptible (NS), which represents those who have immunity.
        </p>
        <p>Patients are admitted to the hospital in state S, I, or NS. They can get infected while in
state S. If this happens, they move to state E, which means that they are incubating the disease
but are not contagious. They remain in state E for a period of time that depends on
the incubation duration of the infection. Once this period is over, they move to state I, in which they
can infect others and contaminate the environment. If they survive the disease, they move to state
R; if not, they move to state D.</p>
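        <p>The state transitions described above can be sketched as a small state machine. This is a minimal illustration, not the simulator's implementation: the names Health, ALLOWED, ADMISSION_STATES and advance are hypothetical, and treating R, D and NS as terminal states is an assumption, since the text does not describe any transition out of them.</p>
        <preformat>
```python
from enum import Enum


class Health(Enum):
    S = "susceptible"
    E = "exposed"
    I = "infected"
    R = "recovered"
    D = "deceased"
    NS = "non-susceptible"


# Transitions named in the text: S -> E -> I -> R or D.
# Assumption: R, D and NS have no outgoing transitions.
ALLOWED = {
    Health.S: {Health.E},
    Health.E: {Health.I},
    Health.I: {Health.R, Health.D},
    Health.R: set(),
    Health.D: set(),
    Health.NS: set(),
}

# Patients are admitted in state S, I or NS.
ADMISSION_STATES = {Health.S, Health.I, Health.NS}


def advance(state, steps_in_state, incubation_steps, infected_now=False):
    """Return the next state for S and E patients.

    The outcome of state I (R or D) depends on recovery and death
    probabilities and is deliberately left out of this sketch.
    """
    if state is Health.S and infected_now:
        return Health.E
    if state is Health.E and steps_in_state >= incubation_steps:
        return Health.I
    return state
```
        </preformat>
        <p>For example, under these assumptions advance(Health.E, 3, 3) returns Health.I, while a patient in state NS stays in NS even when infected_now is true.</p>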
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Micro-scale model: agent-based model</title>
        <p>
          Agent-based approaches ease micro-scale simulations by describing individuals and their
interactions. Unlike other approaches, such as [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ], patients are the only agents in the model. This decision
helps us to focus on the evolution of the infection process based on solid epidemiological models,
avoiding unverified assumptions about contagion vectors such as healthcare workers or visitors.
        </p>
        <p>With the proposed patient simulation, we track each stay from arrival at the hospital to
discharge. Each patient has a unique ID and a set of attributes that include their current
location, age, gender, length of stay (LOS) in the hospital, health state, incubation period,
infection duration, and applied treatment. We have considered only adult patients, and we track
their movements through the different areas and services of the hospital at each time step.</p>
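        <p>As an illustration of the patient attributes listed above, a patient agent could be represented by a record such as the following. This is a sketch under stated assumptions: the field names, the use of simulation steps as the time unit, and the age threshold of 18 for adult patients are hypothetical choices, not details taken from the simulator.</p>
        <preformat>
```python
from dataclasses import dataclass, field
from typing import Optional

ADULT_AGE = 18  # assumption: the text only says that patients are adults


@dataclass
class Patient:
    patient_id: int
    age: int
    gender: str
    length_of_stay: int                      # LOS, in simulation steps (assumed unit)
    health_state: str                        # one of "S", "E", "I", "R", "D", "NS"
    location_id: str                         # ID of the current bed or place
    incubation_period: Optional[int] = None
    infection_duration: Optional[int] = None
    treatment: Optional[str] = None
    visited: list = field(default_factory=list)

    def __post_init__(self):
        # Only adult patients are modeled.
        if ADULT_AGE > self.age:
            raise ValueError("only adult patients are modeled")

    def move_to(self, step, place_id):
        """Record a movement so that the stay can be traced step by step."""
        self.location_id = place_id
        self.visited.append((step, place_id))
```
        </preformat>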
        <p>Patients can interact with other patients and with the environment. Interactions between
patients take place when they share a room. During these interactions, a sick patient can infect
another with a given probability. Interactions with the environment happen when a patient has
spent a certain amount of time in the same place, which they can then contaminate with a given probability. This
probability is higher if the infected patient has not started treatment yet, and
lower if they have been on it for more than 3 days. Finally, a contaminated place can infect a
susceptible patient with a given probability.</p>
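        <p>A minimal sketch of these interactions as independent Bernoulli trials follows. The parameter names (p_patient, p_environment, base), the scaling factors for treatment status, and the choice of one trial per infectious roommate are all assumptions made for illustration; the paper does not specify them.</p>
        <preformat>
```python
import random


def bernoulli(p, rng):
    """Return True with probability p."""
    return p > rng.random()


def contamination_probability(base, days_on_treatment,
                              untreated_factor=1.5, treated_factor=0.5):
    """Probability that an infected patient contaminates a place.

    days_on_treatment is None when treatment has not started. The two
    factors are illustrative placeholders, not values from the paper.
    """
    if days_on_treatment is None:
        return min(1.0, base * untreated_factor)   # higher before treatment
    if days_on_treatment > 3:
        return base * treated_factor               # lower after more than 3 days
    return base


def room_step(infectious_ids, susceptible_ids, place_contaminated,
              p_patient, p_environment, rng):
    """Return the susceptible patients who become exposed in one room and step."""
    exposed = set()
    for sid in susceptible_ids:
        # Assumption: one independent trial per infectious roommate.
        if any(bernoulli(p_patient, rng) for _ in infectious_ids):
            exposed.add(sid)
        elif place_contaminated and bernoulli(p_environment, rng):
            exposed.add(sid)
    return exposed
```
        </preformat>
        <p>Passing a seeded random.Random instance as rng keeps a run reproducible, which matters when the generated logs are meant to be reused as a dataset.</p>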
        <p>Regarding recovery, infected patients may have a quick recovery without treatment or a
longer one that requires treatment. Both have a probability of success and a duration. In
case of non-recovery, a patient may die with a given probability. All of these probabilities and time
periods depend on the modeled pathogen and are part of the parameters explained in Section 3.</p>
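        <p>One possible reading of this recovery logic is sketched below. The order of the trials, the parameter names and the option of remaining infected when no outcome is drawn are assumptions; durations are omitted for brevity.</p>
        <preformat>
```python
import random


def resolve_infection(p_quick, p_treated, p_death, rng):
    """Return "R" (recovered), "D" (deceased) or "I" (still infected).

    Assumed order: quick recovery without treatment is tried first, then
    recovery with treatment, and death is only possible after both fail.
    """
    if p_quick > rng.random():
        return "R"
    if p_treated > rng.random():
        return "R"
    if p_death > rng.random():
        return "D"
    return "I"
```
        </preformat>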
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Hospital: spatial-temporal constraints and policies</title>
        <p>The spatial distribution of the hospital plays a key role in the spread of infections. We have
considered a two-story hospital, where we took into account the most likely places in which a
hospitalized patient can become infected: emergency room (ER), operating rooms, radiology
rooms, wards which contain several patient rooms each, and an intensive care unit (ICU). The
ER, the ICU and each room can have a user-defined number of beds. Each bed and each place has a
unique ID and a state indicating whether it is contaminated by the infection or not.</p>
        <p>The places that a patient can go to are divided into two types: temporary and indefinite.
Temporary places are those in which a patient spends a short period of time (e.g.
radiology or surgery). Indefinite places are those where a patient can stay for a long period of
time (e.g. a bed, the ICU). Patients’ movements are spatially and temporally constrained. To
implement these constraints as realistically as possible, we have designed a series of rules
following the suggestions of medical doctors:</p>
        <list list-type="bullet">
          <list-item><p>Temporal constraint: in each step of the simulation, only a limited number of patients
can move to each ward.</p></list-item>
          <list-item><p>Spatial-temporal (ST) constraint: patients must spend a minimum number of steps
away from temporary places before returning to one. For example, if a patient has just
undergone surgery, they will not go back into the operating room right away.</p></list-item>
          <list-item><p>ST constraint: patients who have been in the ER or the ICU for a certain period of time
can be transferred to a ward.</p></list-item>
          <list-item><p>Spatial constraint: when a patient goes to a temporary place, in the next step they return
to the same bed where they were before.</p></list-item>
          <list-item><p>ST constraint: patients can change beds within the same ward in one simulation step.</p></list-item>
          <list-item><p>ST constraint: patients in a ward can move to another ward in one simulation step.</p></list-item>
          <list-item><p>ST constraint: patients in a ward or the ER might be transferred to the ICU.</p></list-item>
        </list>
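        <p>Several of the rules above can be expressed as simple predicates evaluated at each step. The sketch below is illustrative only: the function names, the thresholds (min_gap, min_steps, max_moves_per_ward) and the first-come ordering of movement requests are assumptions, not the simulator's actual rules engine.</p>
        <preformat>
```python
from collections import Counter


def can_visit_temporary(current_step, last_temporary_step, min_gap):
    """ST constraint: enough steps must pass between two temporary places."""
    if last_temporary_step is None:
        return True
    return current_step - last_temporary_step >= min_gap


def can_leave_er_or_icu(steps_in_unit, min_steps):
    """ST constraint: transfer to a ward only after a minimum stay."""
    return steps_in_unit >= min_steps


def filter_ward_moves(requests, max_moves_per_ward):
    """Temporal constraint: cap the arrivals accepted by each ward per step.

    requests is a list of (patient_id, target_ward) pairs; earlier
    requests are served first (an assumption).
    """
    accepted, arrivals = [], Counter()
    for patient_id, ward in requests:
        if max_moves_per_ward > arrivals[ward]:
            arrivals[ward] += 1
            accepted.append((patient_id, ward))
    return accepted


def next_location(current_place, home_bed, temporary_places):
    """Spatial constraint: after a temporary place, return to the same bed."""
    return home_bed if current_place in temporary_places else current_place
```
        </preformat>
        <p>For instance, with max_moves_per_ward set to 2, a third request for the same ward in the same step is rejected and can be retried in a later step.</p>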
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Parameters</title>
      <p>The simulation model receives input parameters with user-assigned values. These parameters
can be classified into three types: population parameters, epidemic-model parameters and
configuration parameters.</p>
      <p>Population parameters are those that represent the population and the characteristics of the
hospital (e.g. the occupancy rate, admission rate through the ER, number of beds, rooms and
wards, etc.). Among these parameters, we also find the patients’ age and LOS distributions.</p>
      <p>
        Epidemic-model parameters are those that configure the infection behavior and govern the
change from one health state to another (e.g. probability of contamination, probability of
recovery, treatment duration, etc.). We obtained or calculated both these and the population
parameters from publicly available data and information from the literature. Those for which there
was not enough information to infer a distribution follow a triangular distribution defined
by a central value that represents its mode, a minimum and a maximum value, based on [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
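      <p>Sampling such a parameter from a triangular distribution can be sketched with the Python standard library, whose random.triangular function takes the lower bound, the upper bound and the mode, in that order. The helper name sample_parameter and the numeric bounds are hypothetical and are not values from the paper.</p>
      <preformat>
```python
import random


def sample_parameter(minimum, mode, maximum, rng):
    """Draw one value from a triangular distribution on [minimum, maximum]."""
    # Note the argument order of random.triangular: low, high, mode.
    return rng.triangular(minimum, maximum, mode)


rng = random.Random(42)
# Hypothetical bounds, e.g. for a duration in days; not taken from the paper.
draws = [sample_parameter(1.0, 2.0, 5.0, rng) for _ in range(1000)]
sample_mean = sum(draws) / len(draws)
```
      </preformat>
      <p>The mean of a triangular distribution is (minimum + mode + maximum) / 3, so the sample mean of many draws should approach that value; this is a quick sanity check on the chosen bounds.</p>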
      <p>Configuration parameters comprise the settings for each run of the simulator (e.g. the cleaning
frequency, the time between the movements of a patient, etc.). We defined these parameters
based on the hospital size, data published by hospitals, and information obtained from an expert.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Simulation outputs</title>
      <p>One advantage of combining micro- and macro-scale modeling is that results can be derived
at both low and high levels of abstraction. Two straightforward outputs are daily statistics of the
processes under study and a database of the patients over the course of the simulation. This
database includes information on each patient and the places in the hospital that they have
visited. Another output is aggregated information on the infection: in this
case, the number of patients in each health state per day is stored. By combining this with the
location of the patients at each step, it is possible to compute any epidemiological indicator
that can be calculated from these data (e.g. prevalence, incidence density, etc.) for the different
areas of the hospital.</p>
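      <p>As an illustration, two of these indicators can be derived from a per-step log with a few lines of code. The row layout (patient ID, health state, place ID) and the function names are assumptions; the indicators use their standard definitions, i.e. point prevalence as the share of patients present who are infected, and incidence density as new cases per unit of patient-time at risk.</p>
      <preformat>
```python
from collections import Counter


def counts_per_state(step_log):
    """step_log: list of (patient_id, health_state, place_id) rows for one step."""
    return Counter(state for _, state, _ in step_log)


def point_prevalence(step_log, area=None):
    """Share of the patients present (optionally in one area) who are in state I."""
    rows = [row for row in step_log if area is None or row[2] == area]
    if not rows:
        return 0.0
    return sum(1 for row in rows if row[1] == "I") / len(rows)


def incidence_density(new_cases, patient_steps_at_risk, steps_per_day=3):
    """New cases per patient-day at risk.

    The default of three steps per day corresponds to 8-hour steps.
    """
    patient_days = patient_steps_at_risk / steps_per_day
    return new_cases / patient_days if patient_days else 0.0
```
      </preformat>
      <p>Filtering by area before computing the ratio is what allows the same indicator to be reported for the whole hospital or for a single ward.</p>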
    </sec>
    <sec id="sec-5">
      <title>5. Experiments and results</title>
      <p>
        To experiment with this simulator, we have configured a hospital with 212 beds and set the input
parameters according to the Santa Lucia General University Hospital in Murcia, Spain. Each
step is 8 hours, since this is the duration of a standard work shift. For the infection, we have
chosen the Clostridium difficile (CD) pathogen, which is the main cause of infectious diarrhea
in hospitalized patients [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Figure 2 presents part of the output from an execution of our
simulator with CD infection (CDI). This includes a representation of the number of patients in each health
state and their evolution over time, as well as an example of the output logs.
The patient log includes information regarding age, sex, LOS, duration of incubation,
duration of infection, duration of treatment, treatment, admission day, last day in the hospital,
and a deceased flag. The movement log includes, for each step, the patients’ IDs and health
states, and their locations’ IDs and contamination flags.
      </p>
    </sec>
    <sec id="sec-6">
      <title>6. Related work</title>
      <p>
        Codella et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] implemented an agent-based simulation model with a Markov model to study
the transmission of CD in a midsized hospital. They represented different types of agents and
applied the model to compare the outcomes of several control strategies. Our model differs
from this approach in that our goal is to generate a dataset with individual as well as aggregated
information for later use in other AI implementations. In addition, our work outputs
epidemiological indicators to help monitor the spread of an MDR-bacteria infection
in a hospital setting.
      </p>
      <p>
        Lee et al. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] presented a software tool called the Regional Healthcare Ecosystem Analyst,
which creates an agent-based model from input data of a healthcare ecosystem. It is configurable
by the user, including the characteristics of the infection to be represented. Its aim is to
serve as a virtual laboratory to help test different policies and interventions. Practically any
healthcare facility type can be represented, and beds are divided into ICUs and general
wards. This differs from our model in that we give more importance to space by studying the
spread of an infection in the most common areas of a hospital setting (e.g. the ER, the ICU, etc.).
Another difference is that they use subroutines to calculate the number of agents in the infectious
and susceptible states in each ward and, based on that, they calculate the number of new cases
in that ward that day. Instead, we monitor all the agents present in the hospital, so that
we know when they shared a room and interacted at a low level.
      </p>
      <p>
        Haber et al. [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] combined an agent-based model with a compartmental model and focused on the
study of second-line drugs in a small hospital. They analyzed several interventions to reduce the
use of antibiotics and the incidence of HAIs. The main difference from our work is that the infection
spread is calculated with differential equations and they do not model patients’ movements in
the hospital, nor give the same importance to space and time as we do.
      </p>
    </sec>
    <sec id="sec-7">
      <title>7. Conclusions and future work</title>
      <p>This paper proposes an agent-based model coupled with the infection dynamics extracted from
an epidemiological model. This model can be used as a generator of synthetic clinical data on
MDR-bacteria infections within hospitals. Combining an agent-based model with the
role of the hospital topology in the spread of an infection can enhance the detection of spatial and
temporal patterns to help in monitoring and decision-making, thanks to
a more precise study of the infection process and its consequences for hospitalized patients. The
capacity to trace patients at a low level and also to obtain aggregate results from them can play
a key role and be a step forward in the creation of more explainable AI and in the generation
of higher-quality synthetic data. In the future, we plan to carry out a thorough evaluation of the
model to ensure its correct implementation, clinical meaning and utility. Once we have performed this
evaluation, the simulator will be made available in an open repository.</p>
      <p>This work was partially funded by the CONFAINCE project (Ref: PID2021-122194OB-I00) by
MCIN/AEI/10.13039/501100011033 and by "ERDF A way of making Europe", by the "European
Union" or by the "European Union NextGenerationEU/PRTR", and by the GRALENIA project
(Ref: 2021/C005/00150055) supported by the Spanish Ministry of Economic Affairs and Digital
Transformation, the Spanish Secretariat of State for Digitization and Artificial Intelligence, Red.es
and by the NextGenerationEU funding. This research is also partially funded by the FPI program
grant (Ref: PRE2019-089806).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <collab>World Health Organization</collab>
          , Antimicrobial resistance,
          <year>2022</year>
          . URL: https://www.who.int/news-room/fact-sheets/detail/antimicrobial-resistance.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <collab>World Health Organization</collab>
          ,
          <source>Ethics and governance of artificial intelligence for health</source>
          ,
          <year>2023</year>
          . URL: https://www.who.int/publications-detail-redirect/9789240029200.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>B. C. L.</given-names>
            <surname>Paisner</surname>
          </string-name>
          ,
          <article-title>At a glance: De-identification, anonymization, and pseudonymization under the GDPR</article-title>
          ,
          <year>2017</year>
          . URL: https://www.bclplaw.com/en-US/events-insights-news/at-a-glance-de-identification-anonymization-and-pseudonymization-1.html.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>E.</given-names>
            <surname>Hunter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Kelleher</surname>
          </string-name>
          ,
          <article-title>A framework for validating and testing agent-based models: a case study from infectious diseases modelling</article-title>
          ,
          <source>34th Annual European Simulation and Modelling Conference</source>
          (
          <year>2020</year>
          ). doi:10.21427/2xjb-cq79.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>B. Y.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. F.</given-names>
            <surname>Wong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Bartsch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. L.</given-names>
            <surname>Yilmaz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. R.</given-names>
            <surname>Avery</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. T.</given-names>
            <surname>Brown</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. S.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. S.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <article-title>The Regional Healthcare Ecosystem Analyst (RHEA): a simulation modeling tool to assist infectious disease control in a health system</article-title>
          ,
          <source>Journal of the American Medical Informatics Association: JAMIA</source>
          <volume>20</volume>
          (
          <year>2013</year>
          )
          <fpage>139</fpage>
          -
          <lpage>146</lpage>
          . doi:10.1136/amiajnl-2012-001107.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J.</given-names>
            <surname>Codella</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Safdar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Heffernan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Alagoz</surname>
          </string-name>
          ,
          <article-title>An agent-based simulation model for Clostridium difficile infection control</article-title>
          ,
          <source>Medical Decision Making: An International Journal of the Society for Medical Decision Making</source>
          <volume>35</volume>
          (
          <year>2015</year>
          )
          <fpage>211</fpage>
          -
          <lpage>229</lpage>
          . doi:10.1177/0272989X14545788.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>L.</given-names>
            <surname>Meyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Espinoza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Quera</surname>
          </string-name>
          ,
          <article-title>Infección por Clostridium difficile: epidemiología, diagnóstico y estrategias terapéuticas</article-title>
          ,
          <source>Revista Medica Clinica Las Condes</source>
          <volume>25</volume>
          (
          <year>2014</year>
          )
          <fpage>473</fpage>
          -
          <lpage>484</lpage>
          . doi:10.1016/S0716-8640(14)70064-1, publisher: Elsevier.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M.</given-names>
            <surname>Haber</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. R.</given-names>
            <surname>Levin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Kramarz</surname>
          </string-name>
          ,
          <article-title>Antibiotic control of antibiotic resistance in hospitals: a simulation study</article-title>
          ,
          <source>BMC Infectious Diseases</source>
          <volume>10</volume>
          (
          <year>2010</year>
          )
          <fpage>254</fpage>
          . doi:10.1186/1471-2334-10-254.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>