=Paper= {{Paper |id=Vol-2538/paper1 |storemode=property |title=Generating T1DM Virtual Patients for In Silico Clinical Trials via AI-Guided Statistical Model Checking |pdfUrl=https://ceur-ws.org/Vol-2538/paper1.pdf |volume=Vol-2538 |authors=Agostina Calabrese,Toni Mancini,Annalisa Massini,Stefano Sinisi,Enrico Tronci |dblpUrl=https://dblp.org/rec/conf/aiia/CalabreseMMST19 }} ==Generating T1DM Virtual Patients for In Silico Clinical Trials via AI-Guided Statistical Model Checking== https://ceur-ws.org/Vol-2538/paper1.pdf
    Generating T1DM Virtual Patients for In Silico
    Clinical Trials via AI-Guided Statistical Model
                        Checking

              Agostina Calabrese? , Toni Mancini, Annalisa Massini,
                        Stefano Sinisi, and Enrico Tronci

             Computer Science Department, Sapienza University of Rome
                    calabrese.1657689@studenti.uniroma1.it
              {tmancini, massini, sinisi, tronci}@di.uniroma1.it


        Abstract. The availability of a representative population of virtual pa-
        tients, i.e., a population large enough to represent all relevant human
        patient behaviours, is a key enabler for the design of In Silico Clinical
        Trials (ISCTs), that is trials following simulation-based approaches for
        the safety and efficacy assessment of pharmacological treatments and
        biomedical devices. This involves the development of Virtual Physio-
        logical Human (VPH) models able to represent the whole phenotypes
        spectrum of the human physiology of interest. Usually, such models are
        open-loop models, i.e., their behaviour depends also on exogenous inputs
        (such as, e.g., pharmacological drugs).
        In this paper, we propose a methodology to convert an open-loop VPH
        model into a closed-loop model. As a case study, we apply our method-
        ology to a state-of-the-art VPH model defining the human glucose reg-
        ulation system of individuals with Type 1 Diabetes Mellitus (T1DM),
        and show how we generate a representative population of T1DM virtual
        patients.

        Keywords: In Silico Clinical Trials · VPH models · AI Global search ·
        Model Checking · Simulation.


1     Introduction
The advance of biomedical engineering has promoted the design of new, rev-
olutionary, biomedical devices. The purpose of such devices is to improve the
quality of life in different kinds of patients by making the treatments they follow
completely, or partially, automated. An In Silico Clinical Trial (ISCT), in oppo-
sition to the more traditional in vivo clinical trial, is used in the development
of products such as drugs and biomedical devices. More specifically, a clinical
trial is performed by means of computer simulations over a population of virtual
patients (see, e.g., http://paeon.di.uniroma1.it).
?
    Corresponding author. Alternative email address: agostina.calabrese3@gmail.com
    Proceedings of RCRA 2019.
    Copyright © 2019 for this paper by its authors. Use permitted under Creative
    Commons License Attribution 4.0 International (CC BY 4.0).
1.1   Motivation

In order to generate virtual patients, a Virtual Physiological Human (VPH)
model is required. A VPH model is a mathematical mechanistic model of the
(patho-) physiology of interest and reactions to drugs (Pharmaco-Kinetics/Dyn-
amics, PKPD) in the form of a parametric system of, e.g., Ordinary Differential
Equations (ODEs). Many of the parameters in the system usually model inter-
subject variability, which means that different assignments to these parameters
determine the behaviour of different patients. Unfortunately, many assignment
to the parameters can lead to model evolutions that are not coherent with the
laws of biology. As a consequence, the population of virtual patients can not be
built by picking assignments arbitrarily, but a smart search in the parameter
space is needed.
    Given a VPH model M , we can define the corresponding population of
Virtual Patients (VPs) as the set S = {λ|λ is an assignment to the parame-
ters in M and λ defines a behaviour that is coherent with the laws of biology}
[56].
    Thanks to the use of in silico clinical trials, risky and costly experiments on
humans can be reduced and postponed to when a deeper knowledge of the side-
effects of the product has been acquired (http://paeon.di.uniroma1.it). ISCTs
basically entail the following steps. First, generate a set of VPs whose predictions
are in agreement with (patho-) physiology, PKPD, and in vivo clinical data.
Second, use the above in vivo validated VPs to assess, through simulation, safety
and efficacy of a candidate drug or biomedical device.


1.2   Contribution

In this paper, we consider, as a case study, patients with Type 1 Diabetes Mel-
litus (T1DM), and generate a population of VPs by exploiting the VP gener-
ator developed by the Model Checking Laboratory (MCLab) (http://mclab.di.
uniroma1.it) and originally presented in [56,41]. This phase is preliminary to the
verification of safety-critical devices such as artificial pancreas devices used for
blood glucose levels monitoring and regulation.
    For our case study, we implemented the Medtronic VPH model of the hu-
man glucose regulation system [22] into Modelica. However, the Medtronic VPH
model could not be used directly within our VP generator, because it requires, as
mandatory external inputs, the time functions defining the daily carbohydrates
intake and insulin administration. Indeed, without such inputs, the model time
evolutions are not significant.
    To overcome this obstruction, we set up a methodology which allows us to
use our VP generator tool [56,41] also in the case of VPH models that require
external inputs (open-loop VPH models). In particular, we extend the Medtronic
VPH model with a new module that, given some information on the patient
(e.g., their body weight), generates (using standardised formulas) the amounts
of carbohydrates and insulin that are recommended for that patient. As the
extended model is closed-loop (i.e., it does not require any external input), we
could finally use our VP generator to compute a population of T1DM VPs. The
generation of the population is a time consuming activity requiring clever AI-
based exploration strategies of the search space on High Performance Computing
(HPC) resources.


1.3    Paper Outline

The paper is organised as follows. After recalling the state of the art on VPH
models and ISCTs (Section 2), in Section 3 we introduce the Medtronic VPH
model of the human glucose regulation system and outline how we implemented
it in Modelica. Then, in Section 4 we describe how we extended our VPH model
to enable the computation of a representative population of T1DM VPs, by
computing and providing proper input patterns for carbohydrates intake and
insulin injections. Finally, in Section 5 we analyse the results of our computation
and in Section 6 we draw conclusions.


2     State of the art

A key enabler to carry out an ISCT is the availability of a suitable cohort of
VPs. Such a cohort must be complete, i.e., large enough to represent all relevant
human patient phenotypes. In hormonal regulatory systems (as is our case), the
whole spectrum of possible phenotypes can be quite large, since such systems
typically occur within a complex network of endocrinological, neurological, and
psychological factors (see, e.g., [18,27,26]). On the other hand, we also need such
a cohort of VPs to be minimal, i.e., to not include behaviours that are not
compatible with the human (patho-) physiology of interest.
    It is worth noting that achieving simultaneously completeness and minimality
is presently out of reach. For this reason, usually, completeness is privileged.
This way, an adequate in silico representation of all human patient phenotypes
is guaranteed.
    In the following we briefly review the state of the art on VPs generation
methods.
    VPH models are usually developed using medical knowledge from the lit-
erature and from pathway databases such as KEGG (Kyoto Encyclopedia of
Genes and Genomes) or Reactome. Open formats, such as the Systems Biol-
ogy Markup Language (SBML), allow exchange of computational models across
different platforms and their integration within open-standard general-purpose
simulation ecosystems [30].
    Unfortunately, physiology knowledge is often qualitative. As a result, the
model behaviour is hard to define and is not uniquely determined. To overcome
such an obstacle, many qualitative as well as quantitative approaches have been
devised. An overview is in [48].
    Qualitative approaches, often referred to as logic based, discretise the values
of interest. For example, using a Boolean approach, a gene up-regulation (re-
spectively down-regulation) may be represented with a logical 1 (respectively 0).
Boolean models have been widely studied (see, e.g., [20] and citations thereof).
Logical models are very useful for a qualitative analysis, but are quite difficult
to use within a compositional framework where quantitative models from phys-
iology or pharmacology are present. Since in an ISCT quantitative aspects are
extremely relevant (e.g., drug dosage) we will focus on quantitative models.
    Quantitative models typically focus on representing dynamics of chemical
reactions through ODEs with stoichiometric parameters and reaction rates esti-
mated from clinical data using model identification techniques, as, e.g., in [58].
Such models are typically used to support ISCTs, as, e.g., in [29]. The main ob-
stacle to overcome in using quantitative models is the lack of knowledge about
the values for their parameters. In fact, very few model parameter values can be
estimated using clinical data (see, e.g., [55] for a survey). As a result, most pa-
rameter values have to be estimated through computational methods, typically
using model identification techniques (see, e.g., [60,58]).
   In such a setting, two approaches are typically used, respectively based on
optimisation and statistical techniques.
    Both global and local optimisation-based approaches, relying on AI as well
as on numerical methods, have been widely studied. Global optimisation–based
approaches are computationally heavy, however convergence is guaranteed to a
global optimum (see, e.g., [31,59,46]). On the other hand, local optimisation ap-
proaches (e.g., [32,38,42]) are in general computationally lighter than global ones,
but convergence is only guaranteed towards a local optimum (e.g., [23,24,44]).
An in-depth comparison among the two approaches can be found in [47], whereas
an example of the use of AI-based optimisation techniques for computing per-
sonalised treatments is presented in [34].
   Typically, both global as well as local optimisation approaches rely on model
identifiability, i.e., different values for the model parameters lead to different
model behaviours (see, e.g., [28]). In such a case, different model identification
techniques can be used. Examples are in [10,49,12,57,1,53].
    When clinical data are scarce, e.g., when clinical assays are costly, risky and
invasive, identification approaches can be applied by averaging data coming from
different patients. However, in this case the resulting parameter value only yields
an inter-patient model behaviour (see, e.g., [50]).
    Since the goal of the above mentioned approaches is to generate a model
parameter value that fits available experimental data, a huge amount of data
per patient is needed in order to generate a virtual population that is represen-
tative of all human phenotypes. As a result, generating a complete set of VPs
using model identification techniques would require considering a large amount
of patients (ideally at least one for each relevant phenotype) along with a huge
amount of clinical data for each of such patients. Unfortunately, this is exactly
one of the obstacles ISCTs aim at overcoming. Thus, while model identification
techniques are essential to validate models against clinical data and to develop
patient-specific models, they can be hardly used to generate a complete set of
VPs.
    Moreover, VPH models are often globally or partially unidentifiable, i.e.,
wide ranges of parameter values lead to very similar model behaviours (see, e.g.,
[8]). However, being able to generate VPs (i.e., parameter values) which yield
clearly distinguishable model behaviours is of crucial importance for ISCTs.
    Statistical approaches are also widely used. In such approaches a param-
eter probability distribution, rather than a single value, is inferred (see, e.g.,
[19,25,52]). They are typically used for physiology-based PKPD models (see,
e.g., [51]), i.e., quantitative VP models where parameter values are measurable
physiological quantities (such as blood flow, organ volumes, etc). Unfortunately,
most VP models have hard-to-measure parameters such as stoichiometric con-
stants and reaction rates. In fact, probability distribution functions for them are
typically unknown.
    Limited knowledge about VPH model parameters calls for methods that
handle models whose parameters are partially unknown. This guarantees com-
pleteness, possibly at the price of minimality. In other words, we use an over-
approximation of the set of all physiologically admissible (i.e., whose behaviour
is plausible with respect to physiology) VPs.
    Since model-checking techniques enable exploration of all possible behaviours
of a model, it is not surprising that such approaches have been investigated in
order to generate VPs for both qualitative as well as quantitative VPH models.
In particular, for qualitative models (e.g., Boolean models) the problem of find-
ing parameter values yielding a biological meaningful behaviour is reduced to the
problem of finding stable states of the model, i.e., attractors. In such a setting,
symbolic model checking is widely used (see, e.g., [45,15,2,21]). There, physio-
logical admissibility can be defined using a temporal logic, such as CTL or LTL.
Some examples are in [7,6,5,13,4] or in [17], where a probabilistic model-checking
approach is used.
    Unfortunately, even using model approximation techniques like those in [3,43],
the above mentioned approaches cannot be used to generate VPs from quanti-
tative ODE-based models, i.e., those sought to support ISCTs. Indeed, in such
a setting only simulation-based approaches are viable. This suggests to investi-
gate use of model checking–driven simulation-based techniques to support VP
generation.


3   Virtual Physiological Human Model of the Human
    Glucose Regulation System

VPH models of human glucose regulation give a quantitative description of pan-
creatic hormones-glucose interactions. These models can be used to predict the
evolution of the species of interest (e.g., blood glucose concentration) under cer-
tain conditions (e.g., the amount of ingested carbohydrates), and are usually
validated by comparing the results with data collected during clinical trials. In
this paper, we use the Medtronic VPH model described in [22]. This model is
simpler than the models in [11,9], but not less effective in predicting the evolution
of blood glucose and plasma insulin concentrations.
     We implemented the Medtronic VPH model using the Modelica language, by
organising it in 4 compartments: glucose, insulin action, gastrointestinal absorp-
tion, subcutaneous insulin absorption. Also, we validated our Modelica imple-
mentation by comparing our simulation outcomes with the results shown in [22].
In order to collect data comparable to those available in that paper, we used the
parameter values estimated by the authors and supplied the subjects with the
same meals and the same insulin dosage specified by the protocol of the clinical
trial [54] (i.e., the same input used in [22]). As an example, Figure 1 shows the
curves generated by our experiments for Subject 7, which are to be compared
to those provided in [22]. Our simulation outcome is very similar to the model
fit curves provided in that paper. There are indeed minor numerical differences,
which are however justified by the fact that the glucose profile in [22] was fit
using actually measured plasma insulin profiles, and not the model-predicted
amounts, as obtained from the corresponding equations. This of course leads
to limited error accumulation. Overall, the high qualitative similarity of all our
predictions with respect to the corresponding model fit curves of [22] allows us
to regard our Modelica porting of the Medtronic VPH model correct.


4   Generating T1DM Virtual Patients

In order to study the behaviour of the patients, an individualisable VPH model
in an executable form is fundamental. Parametric ODE systems are usually em-
ployed to define individualisable VPH models. Changes in the parameter values
entail the definition of patients with different behaviours. One way to generate
VPs could be to instantiate the VPH model by choosing the parameter values at
random. Unfortunately, this strategy usually leads to modelling the behaviour of
a patient which is incompatible with respect to the physiological knowledge, i.e.,
is not Biologically Admissible (BA). Behaviours that are not BA are typically
obtained by choosing parameter values ignoring inter-dependency constraints
among them. This might happen since such constraints are usually not explic-
itly known [56].
    In order to compute a population of patients representative of the real pop-
ulation, we need to define a global search strategy that filters out all the virtual
subjects with a non-BA behaviour. The criterion proposed in [56] defines the
concept of BA behaviour as a behaviour which is similar enough to that of a
known real patient. Our algorithm performs an AI-guided randomised search in
the space of the model parameters, building on our Model Checking tools in,
e.g., [35,16,36,37,33]. It is important to note that, in most cases, the size of the
parameter space is such that, even if we discretise it, an exhaustive search would
be infeasible. To counteract this issue, our algorithm leverages on statistical hy-
pothesis rejection approaches (see, e.g., [14,40,39]), thus computing a set of BA
virtual patients which is complete with an arbitrarily high statistical confidence.
    In the following, we describe the tool we used to generate the population of
patients with T1DM (Section 4.1), and present a new methodology that applies
in the case of VPH models that require inputs (Section 4.2).
                                   400




          Glucose (mg/dl)
                                   300

                                   200

                                   100

                                           0
                                                   8    12        16       20              24   28   32   36
                                                                                Time (h)


                                                                       (a) Glucose
                                   80
          Plasma Insulin (µU/ml)




                                   60

                                   40

                                   20

                                       0
                                               8       12        16       20               24   28   32   36
                                                                               Time (h)


                                                                       (b) Insulin
                                   6
                                   5
          Meal (mg/dl/min)




                                   4
                                   3
                                   2
                                   1
                                   0
                                               8       12        16       20               24   28   32   36
                                                                               Time (h)


                                                             (c) Glucose plasma appearance

         Fig. 1: Predictions of our Modelica model on the parameter
         assignment defining Subject 7 in [22].



4.1   VP Generator


We compute the population of T1DM patients using the tool presented in [56,41],
which combines AI-based global search and statistical model checking. The sys-
tem takes as input a VPH model in the form of a parametric ODE system and
reference values for the parameter vector (i.e., the parameter values that model
the behaviour of one, real, patient), and computes the set of the parameter val-
ues that lead to time evolutions of the species of interest (e.g., in our case study,
blood glucose concentration and insulin among the others) that are qualitatively
similar to those of the reference patient. This is done by computing a set S of
BA parameter values (i.e., the population of virtual patients) which is complete
with an arbitrarily high statistical confidence [56].
   In order to identify the elements of S, the tool iteratively selects a random
parameter λ (i.e., the model parameter vector) from the search space and, if λ
passes the BA-test, adds λ to S.
    Search effectiveness is achieved by sampling the parameter space using a (AI-
based) strategy that conjugates exploration of the search space and exploitation
of the knowledge acquired from the VPs already generated.
    Search efficiency is achieved by using a HPC infrastructure in a master-
slave architecture, where a single master process (Orchestrator) is responsible
of intelligent sampling, storage of set S, and bookkeeping, and many slaves (BA
Verifiers) are devoted to simulation of candidate VPs and assessment of their
admissibility [41].
    Assessment of biological admissibility (BA) of a parameter vector λ is per-
formed by checking whether the predicted time evolution obtained using λ is
qualitatively similar to the evolution obtained using the reference parameter
vector λ0 . The concept of similarity is quantified by the use of signal processing
Key Performance Indicators (KPIs) such as cross-correlation, normalised average
differences, and normalised squared norm differences [56].
    The algorithm termination condition is based on statistical hypothesis rejec-
tion techniques. Namely, given two user-defined values δ (confidence ratio) and
ε (error bound ) in (0, 1), the algorithm, at any state of the search (when the
current population is stored in set S), formulates the following null hypothesis
H0 (S): “the probability to find, by further sampling, a VP not in  l S is atm least
                                                                       log(δ)
ε”. When the algorithm fails to find such a new VP within N =        log(1−ε)   con-
secutive attempts, it rejects the current null hypothesis H0 (S), thus inferring
that the probability to find, by further sampling, a new VP is actually < ε, and
terminates. Our algorithm implements a one-sided error procedure, in that it
might reject H0 (S) when it holds. However, the probability of committing such
an error (which is called a type-II error ) is upper-bounded by δ [56].

4.2   A New Methodology for Open VPH Models
In this section, we present a new methodology to exploit our VP generator, even
in the presence of open VPH models, i.e., models that require external inputs.
Then, we apply such a methodology to generate a population of VPs for the
Medtronic VPH model.
     To do so, we need to take into account the effect of inputs when performing
the BA-test. First, we define a portfolio U of representative input patterns suit-
able for the open VPH model at hand, M . Then, for each input pattern u ∈ U ,
we instantiate the given VPH model by feeding it with u. This leads to |U | in-
stances of the model M , denoted mu , u ∈ U . Last, we create a new VPH model,
M 0 , defined as the composition of such instances. Model M 0 is closed, i.e., no
external inputs are required. In this way, by giving M 0 as input to the tool, the
BA-test implicitly checks whether the behaviour entailed by a parameter vector
λ is qualitatively similar to the behaviour entailed by λ0 , for all model instances
mu , u ∈ U .

Input Generators. The input to the model are the daily dose of insulin and
the daily amount of ingested carbohydrates, together with meal times and time-
of-maximum appearance rate of glucose. In order to enable the simulation of
the behaviour of T1DM VPs, we define the two input generators, describing
daily dose of injected insulin and the amount of daily ingested carbohydrates,
respectively.

Insulin Dose Generator. In order to get analysable results, we need an element
in the model with the capability to supply each patient with the recommended
dose of insulin. According to the Diabetes Teaching Center at the University
of California (https://dtc.ucsf.edu), the correct dose can be calculated with the
following formula:

                 dose = 0.55 × body_weight × insulin_factor

where dose is measured in units of insulin per day (U/day) and body_weight
is in kilograms. Therefore, we introduce the body weight in the set of patient-
specific parameters. Moreover, to enable the study of behaviour of subjects who
received too much or too little insulin, we add a multiplication factor, namely
the insulin_factor, having its nominal value equal to 1.

Meals Generator. Our meals generator is based on the assumption that a subject
with T1DM has to follow a balanced diet, or at least does not eat an amount
of carbohydrates which significantly deviates from the recommended dose. For
this reason, we feed the VPs with 5 meals a day, each one at a fixed time.
    The time-of-maximum appearance rate of glucose (τm ) characterising a meal
depends on the complexity of the ingested carbohydrates, and can vary from 10
minutes to more than 2 hours. We set τm to a different fixed value for each meal,
and we add a multiplication factor in order to model inter-subject variability of
time-of-maximum glucose absorption.
    Finally, we define the percentage of carbohydrates in each meal as follows.
    Our meals generator includes three different functions for calculating the
recommended daily calories, and when invoked chooses the best (i.e., more pre-
cise) formula according to the available information about the patient. The sim-
plest function in the generator has been proposed by Harvard Medical School
(https://hms.harvard.edu) and takes in input only the body weight and the level
of physical activity (e.g., sedentary, moderate, active) of the subject.
    When further information, such as age and sex, are available, the generator
computes the daily calories amount using the World Health Organization (http:
//www.who.int/en) equation.
    The most precise formula in our meals generator is known as the Mifflin -
St. Jeor equation and requires 5 inputs: body weight, level of physical activity,
age, sex and height. The Mifflin - St. Jeor equation was judged by the American
Dietetic Association as the most reliable formula in predicting actual energy
expenditure.
    Since subjects with diabetes are suggested to consume carbohydrates in the
amount of 45–65% of the total calories intake (as estimated by the American
Diabetes Association, http://www.diabetes.org) and 1 gram of carbohydrates
provides 4 kCal, we define the daily amount of carbohydrates using the following
equation:
                             carbo_factor × 55 × daily_calories
              daily_carbos =
                                             400
   In the equation above, carbo_factor is used to alter the amount of carbo-
hydrates provided to subjects in order to simulate the behaviour of over- and
under-eating patients.

Data. In order to provide a reference parameter vector value, λ0 , (i.e., a ref-
erence BA VP) to our VP generator, we use the values of the subjects shown
in [22]. For each component of the model parameter vector, we define a discrete
domain (along the lines of [56]). This is not a rough approximation, since small
deviations in values are meaningless from a biological point of view [41].
    According to our methodology presented above, to perform a closed-loop
simulation of the model we combine the VPH model together with our input
generator. In particular, we define a portfolio of inputs, U , containing the fol-
lowing three elements:
Input 1 The recommended amount of carbohydrates and the recommended
   insulin dose (with respect to the patient in exam).
Input 2 An excessive amount of carbohydrates and an insufficient insulin dose
   (with respect to the patient in exam).
Input 3 An insufficient amount of carbohydrates and an excessive insulin dose
   (with respect to the patient in exam).




       Fig. 2: Medtronic VPH model extended with our input generator.


   In order to compute a population of VPs whose admissibility condition is
based on qualitative similarity with respect to the reference patient under all our
inputs (i.e., normal, hyperglycemic, and hypoglycemic conditions), we define an
outer Modelica model which encapsulates three instances of the Medtronic VPH
model, one for each input. The three Medtronic model instances where config-
ured with insulin_factor and patient_carbo_factor set to, respectively: (1, 1)
(normal conditions); (0.5, 3) (hyperglycemic conditions); (3, 0.5) (hypoglycemic
conditions). Figure 2 shows our extended Medtronic VPH model.
5   Results

In this section we show the outcome of the execution of our VP generator on
the input described in Section 3. Simulations have been executed on 16 nodes,
each with 16 cores, of partition A1 of the Marconi CINECA cluster (located in
Bologna, Italy). We set both our confidence ratio δ and error bound ε to 10−3 .
In other words, our algorithm terminates when, with statistical confidence at
least 99.9%, the probability to find, via further sampling, a new VP is less than
0.1%. The overall computation took 5 days.
    The importance of the obtained computed population resides in its size.
For example, the population computed using Subject 7 from [22] as the initial
reference comprises 16 082 all-different VPs. This has to be contrasted with the
size of the cohorts of in vivo clinical trials, which typically involve at most a few
dozens of volunteers (see, for example, [11,54]).
    Figure 3 shows the time evolution of blood glucose concentration (left) and
exogenous glucose appearance (right) in the three different scenarios (i.e., nor-
mal, hyperglycemic, and hypoglycemic conditions) in the VPs found by our gen-
erator using Subject 7 as the initial reference. It can be seen that our computed
population covers a wide spectrum of behaviours, with glucose concentrations
assuming values in a much larger range than Subject 7 (physiological thresholds
are part of our tool configuration), although keeping a clear qualitative similarity
with respect to the time evolutions of the reference behaviour. On the other hand,
excursions of the exogenous glucose appearance in the population (right plots
of Figure 3) are way more limited. This is expected, as our input generator was
indeed explicitly designed to generate input patterns properly individualised on
the needs of each of the VPs (e.g., carbohydrates intake computation considers
information such as the body weight of each VP). Populations of VPs generated
starting from the other subjects of [22] have a similar size and outlook.


6   Conclusions

To make ISCTs effective, the first step to perform is the generation of a repre-
sentative population of VPs, whose predictions are in agreement with (patho-)
physiology, PKPD, and in vivo clinical data.
    In this paper, we consider, as a case study, patients with T1DM, and com-
pute a representative population of VPs exploiting the Medtronic VPH model
of the glucose regulation mechanism [22] and the VP generation tool originally
presented in [56,41]. Our tool leverages AI-driven global search and Statistical
Model Checking techniques to deal with the huge search space stemming from
the parameter space of the input VPH model.
    Since the Medtronic VPH model requires a time profile for carbohydrates
intake and insulin injection as inputs, we equipped it with a new module that
produces such inputs in a principled-yet-individualised way, by defining both
normal conditions (proper carbohydrates and proper insulin intake for the VP
                                                    Meal (mg/dl/min)
Glucose (mg/dl)




                           Time (min)                                            Time (min)
                  Blood glucose concentration                          Exogenous glucose appearance

                                           Normal conditions


                                                    Meal (mg/dl/min)
Glucose (mg/dl)




                           Time (min)                                            Time (min)
                  Blood glucose concentration                          Exogenous glucose appearance

                                        Hyperglycemic conditions
                                                    Meal (mg/dl/min)
Glucose (mg/dl)




                           Time (min)                                            Time (min)
                  Blood glucose concentration                          Exogenous glucose appearance

                                        Hypoglycemic conditions

       Fig. 3: Blood glucose concentration and exogenous glucose appearance in normal,
       hyperglycemic, and hypoglycemic conditions for the 16 082 VPs found by our VP
       generator using Subject 7 from [22] as initial reference.
at hand) and pathological conditions (hyperglycaemic and hypoglycaemic con-
ditions). Our approach is general, as it can be employed to transform any open-
loop VPH model (i.e., a model requiring inputs to function properly) into a
closed-loop model.
    The computed population of VPs defines a starting point for ISCTs of new
insulin-injection patterns as well as new medical devices (e.g., autonomous in-
sulin pumps or artificial pancreas devices).


Acknowledgements. This work was partially supported by the following re-
search projects/grants: Italian Ministry of University & Research (MIUR) grant
“Dipartimenti di Eccellenza 2018–2022” (Dept. Computer Science, Sapienza Univ.
of Rome); EC FP7 project PAEON (Model Driven Computation of Treatments
for Infertility Related Endocrinological Diseases, 600773); INdAM “GNCS Project
2019”. The experimental part has been run on the Marconi CINECA cluster,
thanks to Class C ISCRA Project n. HP10COBFWG.
    The authors are grateful to Prof. Marianna Maranghi, MD (Dept. Transla-
tional and Precision Medicine, Sapienza Univ. of Rome) for useful discussions.


References
 1. U. Abdulla and R. Poteau. Identification of parameters in systems biology. Math.
    Biosci., 305:133–145, 2018.
 2. W. Abou-Jaoudé, P. Monteiro, A. Naldi, M. Grandclaudon, V. Soumelis,
    C. Chaouiya, and D. Thieffry. Model checking to assess t-helper cell plasticity.
    Front. Bioeng. Biotechnol, 2, 2015.
 3. V. Alimguzhin, F. Mari, I. Melatti, I. Salvo, and E. Tronci. Linearizing discrete-
    time hybrid systems. IEEE TAC, 62(10), 2017.
 4. G. Arellano, J. Argil, E. Azpeitia, M. Benítez, M. Carrillo, P. Góngora, D. Rosen-
    blueth, and E. Alvarez-Buylla. Antelope: A hybrid-logic model checker for
    branching-time boolean grn analysis. BMC Bioinf., 12(1), 2011.
 5. J. Barnat, L. Brim, D. Šafránek, and M. Vejnár. Parameter scanning by parallel
    model checking with applications in systems biology. In PDMC-HIBI 2010. IEEE,
    2010.
 6. G. Batt, D. Ropers, H. De Jong, J. Geiselmann, R. Mateescu, M. Page, and
    D. Schneider. Validation of qualitative models of genetic regulatory networks by
    model checking: analysis of the nutritional stress response in escherichia coli. Bioin-
    formatics, 21(suppl_1), 2005.
 7. N. Chabrier and F. Fages. Symbolic model checking of biochemical networks. In
    CMSB 2003, volume 2602 of LNCS. Springer, 2003.
 8. O.-T. Chis, J. Banga, and E. Balsa-Canto. Structural identifiability of systems
    biology models: A critical comparison of methods. PLoS ONE, 6(11), 2011.
 9. C. Dalla Man, F. Micheletto, D. Lv, M. Breton, B. Kovatchev, and C. Cobelli. The
    UVA/Padova type 1 diabetes simulator: New features. JDST, 8, 2014.
10. P. Docherty, J. Chase, and T. David. Characterisation of the iterative integral
    parameter identification method. Med. Biol. Eng. Comput., 50(2):127–134, 2012.
11. F. El-Khatib, S. Russell, D. Nathan, R. Sutherlin, and E. Damiano. Bi-hormonal
    closed-loop blood glucose control for type 1 diabetes. Sc. Transl. Med., 2, 2010.
12. J. Garcia-Tirado, C. Zuluaga-Bedoya, and M. Breton. Identifiability analysis
    of three control-oriented models for use in artificial pancreas systems. JDST,
    12(5):937–952, 2018.
13. H. Gong, P. Zuliani, Q. Wang, and E. Clarke. Formal analysis for logical models
    of pancreatic cancer. In CDC 2011. IEEE, 2011.
14. R. Grosu and S. Smolka. Monte Carlo model checking. In TACAS 2005, volume
    3440 of LNCS. Springer, 2005.
15. W. Guo, G. Yang, W. Wu, L. He, and M. Sun. A parallel attractor finding algo-
    rithm based on boolean satisfiability for genetic regulatory networks. PloS One,
    9(4):e94258, 2014.
16. B. Hayes, I. Melatti, T. Mancini, M. Prodanovic, and E. Tronci. Residential de-
    mand management using individualised demand aware price policies. IEEE Trans.
    Smart Grid, 8(3), 2017.
17. J. Heath, M. Kwiatkowska, G. Norman, D. Parker, and O. Tymchyshyn. Proba-
    bilistic model checking of complex biological pathways. TCS, 391(3), 2008.
18. M. Hengartner, T. Kruger, K. Geraedts, E. Tronci, T. Mancini, F. Ille, M. Egli,
    S. Roeblitz, R. Ehrig, L. Saleh, K. Spanaus, C. Schippert, Y. Zhang, and B. Leeners.
    Negative affect is unrelated to fluctuations in hormone levels across the menstrual
    cycle: Evidence from a multisite observational study across two successive cycles.
    J. Psycho. Res., 99, 2017.
19. C. F. Higham and D. Husmeier. A bayesian approach for parameter estimation
    in the extended clock gene circuit of arabidopsis thaliana. BMC Bioinf., 14(S-10),
    2013.
20. I. Irurzun-Arana, J. Pastor, I. Trocóniz, and J. Gómez-Mantilla. Advanced boolean
    modeling of biological networks applied to systems pharmacology. Bioinf., 2017.
21. S. Ito, T. Ichinose, M. Shimakawa, N. Izumi, S. Hagihara, and N. Yonezaki. Qual-
    itative analysis of gene regulatory networks by temporal logic. Theor. Comput.
    Sci., 594:151–179, 2015.
22. S. Kanderian, S. Weinzimer, G. Voskanyan, and G. Steil. Identification of intraday
    metabolic profiles during closed-loop glucose control in individuals with type 1
    diabetes. JDST, 3, 2009.
23. K. A. Kim, S. L. Spencer, J. G. Albeck, J. M. Burke, P. K. Sorger, S. Gaudet, and
    D. H. Kim. Systematic calibration of a cell signaling network model. BMC Bioinf.,
    11, 2010.
24. A. Kramer, B. Calderhead, and N. Radde. Hamiltonian monte carlo methods for
    efficient parameter estimation in steady state dynamical systems. BMC Bioinf.,
    15, 2014.
25. M. Krauss, R. Burghaus, J. Lippert, M. Niemi, P. Neuvonen, A. Schuppert, S. Will-
    mann, L. Kuepfer, and L. Görlitz. Using bayesian-pbpk modeling for assessment of
    inter-individual variability and subgroup stratification. In Silico Pharmacol., 1(1),
    2013.
26. B. Leeners, T. Krüger, K. Geraedts, E. Tronci, T. Mancini, M. Egli, S. Röblitz,
    L. Saleh, K. Spanaus, C. Schippert, Y. Zhang, and F. Ille. Associations between
    natural physiological and supraphysiological estradiol levels and stress perception.
    Front. Psycol., 10, 2019.
27. B. Leeners, T. Kruger, K. Geraedts, E. Tronci, T. Mancini, F. Ille, M. Egli, S. Roe-
    blitz, L. Saleh, K. Spanaus, C. Schippert, Y. Zhang, and M. Hengartner. Lack of
    associations between female hormone levels and visuospatial working memory, di-
    vided attention and cognitive bias across two consecutive menstrual cycles. Front.
    Behav. Neuro., 11, 2017.
28. L. Ljung. System identification: theory for the user. Prentice-Hall, 1987.
29. J. Lu, K. Hübner, M. Nanjee, E. Brinton, and N. Mazer. An in-silico model of
    lipoprotein metabolism and kinetics for the evaluation of targets and biomarkers
    in the reverse cholesterol transport pathway. PLoS Comput. Biol., 10(3), 2014.
30. F. Maggioli, T. Mancini, and E. Tronci. SBML2Modelica: Integrating biochemical
    models within open-standard simulation ecosystems. Bioinformatics, 2019.
31. T. Mancini, M. Cadoli, D. Micaletto, and F. Patrizi. Evaluating ASP and com-
    mercial solvers on the CSPLib. Constraints, 13(4), 2008.
32. T. Mancini, P. Flener, and J. Pearson. Combinatorial problem solving over
    relational databases: View synthesis through constraint-based local search. In
    SAC 2012. ACM, 2012.
33. T. Mancini, F. Mari, A. Massini, I. Melatti, F. Merli, and E. Tronci. System level
    formal verification via model checking driven simulation. In CAV 2013, volume
    8044 of LNCS. Springer, 2013.
34. T. Mancini, F. Mari, A. Massini, I. Melatti, I. Salvo, S. Sinisi, E. Tronci, R. Ehrig,
    S. Röblitz, and B. Leeners. Computing personalised treatments through in sil-
    ico clinical trials. A case study on downregulation in assisted reproduction. In
    RCRA 2018, volume 2271 of CEUR W.P. CEUR, 2018.
35. T. Mancini, F. Mari, A. Massini, I. Melatti, I. Salvo, and E. Tronci. On minimising
    the maximum expected verification time. Inf. Proc. Lett., 122, 2017.
36. T. Mancini, F. Mari, A. Massini, I. Melatti, and E. Tronci. Anytime system level
    verification via parallel random exhaustive hardware in the loop simulation. Mi-
    croprocessors and Microsystems, 41, 2016.
37. T. Mancini, F. Mari, A. Massini, I. Melatti, and E. Tronci. SyLVaaS: System level
    formal verification as a service. Fundam. Inform., 1–2, 2016.
38. T. Mancini, F. Mari, I. Melatti, I. Salvo, and E. Tronci. An efficient algorithm for
    network vulnerability analysis under malicious attacks. In ISMIS 2018. Springer,
    2018.
39. T. Mancini, F. Mari, I. Melatti, I. Salvo, E. Tronci, J. Gruber, B. Hayes, and
    L. Elmegaard. Parallel statistical model checking for safety verification in smart
    grids. In SmartGridComm 2018. IEEE, 2018.
40. T. Mancini, F. Mari, I. Melatti, I. Salvo, E. Tronci, J. Gruber, B. Hayes, M. Pro-
    danovic, and L. Elmegaard. User flexibility aware price policy synthesis for smart
    grids. In DSD 2015. IEEE, 2015.
41. T. Mancini, E. Tronci, I. Salvo, F. Mari, A. Massini, and I. Melatti. Computing bi-
    ological model parameters by parallel statistical model checking. In IWBBIO 2015,
    volume 9044 of LNCS. Springer, 2015.
42. T. Mancini, E. Tronci, A. Scialanca, F. Lanciotti, A. Finzi, R. Guarneri, and
    S. Di Pompeo. Optimal fault-tolerant placement of relay nodes in a mission critical
    wireless network. In RCRA 2018, volume 2271 of CEUR W.P. CEUR, 2018.
43. F. Mari, I. Melatti, I. Salvo, and E. Tronci. Model based synthesis of control
    software from system level formal specifications. ACM TOSEM, 23(1), 2014.
44. A. Miró, C. Pozo, G. Guillén-Gosálbez, J. A. Egea, and L. Jiménez. Deterministic
    global optimization algorithm based on outer approximation for the parameter
    estimation of nonlinear dynamic biological systems. BMC Bioinf., 13, 2012.
45. N. Miskov-Zivanov, P. Zuliani, E. Clarke, and J. Faeder. Studies of biological
    networks with statistical model checking: Application to immune system cells. In
    ACM-BCB 2013. ACM, 2013.
46. R. Moss, T. Grosse, I. Marchant, N. Lassau, F. Gueyffier, and S. R. Thomas.
    Virtual patients and sensitivity analysis of the guyton model of blood pressure
    regulation: towards individualized models of whole-body physiology. PLoS Comput.
    Biol., 8(6), 2012.
47. M. Nobile, A. Tangherloni, L. Rundo, S. Spolaor, D. Besozzi, G. Mauri, and P. Caz-
    zaniga. Computational intelligence for parameter estimation of biochemical sys-
    tems. In IEEE CEC, pages 1–8. IEEE, 2018.
48. N. L. Novre. Quantitative and logic modelling of molecular and gene networks.
    Nature Rev. Gen., 2015.
49. Y. Pantazis, M. A. Katsoulakis, and D. G. Vlachos. Parametric sensitivity analysis
    for biochemical reaction networks based on pathwise information theory. BMC
    Bioinf., 14, 2013.
50. S. Röblitz, C. Stötzel, P. Deuflhard, H. Jones, D.-O. Azulay, P. van der Graaf,
    and S. Martin. A mathematical model of the human menstrual cycle for the
    administration of GnRH analogues. J. Theor. Biol., 321, 2013.
51. S. Schaller, S. Willmann, J. Lippert, L. Schaupp, T. Pieber, A. Schuppert, and
    T. Eissing. A generic integrated physiologically based whole-body model of the
    glucose-insulin-glucagon regulatory system. CPT: Pharmacometr. Sys. Pharma-
    col., 2(8), 2013.
52. E. Shockley, J. Vrugt, and C. Lopez. PyDREAM: high-dimensional parameter
    inference for biological models in Python. Bioinf., 34(4):695–697, 2018.
53. P. Stapor, D. Weindl, B. Ballnus, S. Hug, C. Loos, A. Fiedler, S. Krause, S. Hroß,
    F. Fröhlich, and J. Hasenauer. PESTO: Parameter EStimation TOolbox. Bioinf.,
    34(4):705–707, 2018.
54. G. Steil, K. Rebrin, C. Darwin, F. Hariri, and M. Saad. Feasibility of automating
    insulin delivery for the treatment of type 1 diabetes. Diabetes, 55, 2006.
55. J. Sun, J. Garibaldi, and C. Hodgman. Parameter estimation using metaheuristics
    in systems biology: A comprehensive review. IEEE/ACM Trans. Comp. Biol.
    Bioinf., 9(1), 2012.
56. E. Tronci, T. Mancini, I. Salvo, S. Sinisi, F. Mari, I. Melatti, A. Massini, F. Davi’,
    T. Dierkes, R. Ehrig, S. Röblitz, B. Leeners, T. Krüger, M. Egli, and F. Ille.
    Patient-specific models from inter-patient biological models and clinical records.
    In FMCAD 2014. IEEE, 2014.
57. N. Tsiantis, E. Balsa-Canto, and J. Banga. Optimality and identification of dy-
    namic models in systems biology: an inverse optimal control framework. Bioinf.,
    34(14):2433–2440, 2018.
58. A. Villaverde and J. Banga. Reverse engineering and identification in systems
    biology: strategies, perspectives and challenges. J. Royal Soc. Interface, 11(91),
    2014.
59. W. H. Wu, F. S. Wang, and M. S. Chang. Dynamic sensitivity analysis of biological
    systems. BMC Bioinf., 9(S-12), 2008.
60. C. Zhan and L. F. Yeung. Parameter estimation in systems biology models using
    spline approximation. BMC Sys. Biol., 5(1), 2011.