Computing Personalised Treatments through In
               Silico Clinical Trials.
          A Case Study on Downregulation in Assisted
                        Reproduction.

      T. Mancini1 , F. Mari1 , A. Massini1 , I. Melatti1 , I. Salvo1 , S. Sinisi1 ,
             E. Tronci1 , R. Ehrig2 , S. Röblitz2 , and B. Leeners3
           1
             Computer Science Department, Sapienza University of Rome
           2
             Computational Systems Biology Group, Zuse Institute Berlin
        3
          Division of Reproductive Endocrinology, University Hospital Zurich


       Abstract. In-Silico Clinical Trials (ISCT), i.e., clinical experimental
       campaigns carried out by means of computer simulations, hold the promise
       to decrease time and cost for the safety and efficacy assessment of phar-
       macological treatments, reduce the need for animal and human testing,
       and enable precision medicine.
       In this paper we present a case study aiming at quantifying, by means of
       a multi-arm ISCT supervised by intelligent search, the potential impact
       of precision medicine approaches on a real pharmacological treatment,
       namely the downregulation phase of a complex clinical protocol for as-
       sisted reproduction.


1    Introduction
Model-based approaches to safety and efficacy assessment of pharmacological
treatments (In-Silico Clinical Trials, ISCT) hold the promise to decrease time
and cost for the needed experimentations, reduce the need for animal and human
testing, and enable precision medicine, where personalised treatments optimised
for each patient can be designed before being actually administered. This is to
be achieved by developing computational models for human physiology, patho-
physiology, PharmacoKinetics (PK) and PharmacoDynamics (PD), which also
define the possible physiological differences between different individuals (i.e.,
the possible phenotypes).
    Research in Virtual Physiological Human (VPH) provides mechanistic quan-
titative models of the human physiology at levels of scale ranging from body
compartments (see, e.g., [8,36]), organs (see, e.g., [7]) down to cells (see, e.g.,
[2]) or even molecules (see, e.g., [37]).
    Independently of the level of scale used, most VPH models are defined by
means of complex, highly non-linear differential equations. By means of numer-
ical integration, such models can be executed within simulators in order to pro-
vide quantitative information about the time evolution of the modelled biological
quantities (e.g., hormone concentrations).
    Also, VPH models of different body compartments can be integrated in larger
models, eventually defining whole human models. Important attempts in this
direction are given by HumMod [17] and the more recent Physiomodel [33],
a complex whole-body model written in the standard Modelica language and
executable by means of open-source (e.g., OpenModelica, openmodelica.org) as
well as proprietary (e.g., Dymola, dymola.com) simulators.

1.1   Motivations
Current pharmacological treatments are often designed with the average pa-
tient in mind. A key topic in precision medicine is to develop pharmacological
treatments optimised for any given individual (personalised treatments). Several
optimisation criteria can be defined. A typical criterion is minimisation of the
overall amount of administered drug, which often also reduces the probability of
adverse side-effects and contributes to decreasing the treatment cost.
    With their amenability to define different phenotypes, VPH models of proved
accuracy are a key enabler for precision medicine.

1.2   Contributions
In this paper we present a case study aiming at quantifying, by means of ex-
tensive computer simulation–based experimental campaigns (In-Silico Clinical
Trials, ISCT) guided by intelligent search, the impact of precision medicine ap-
proaches on a real pharmacological treatment, namely the downregulation phase
of a complex clinical protocol for assisted reproduction in humans. To do this,
we exploit a large VPH model of the Hypothalamic–Pituitary–Gonadal (HPG)
axis [36] in order to conduct a multi-arm ISCT. This model has been used as
a case study in [39,31], where a parallel algorithm based on statistical model
checking techniques and deployed on a High Performance Computing (HPC) in-
frastructure has been described, which computes a representative set of Virtual
Phenotypes (VPs) for the VPH model given as input.
    Here we select, from the population of VPs computed in [39,31] for the HPG
axis model [36], a representative subset of 98 different VPs of our HPG axis
model, and conduct a distinct arm of our ISCT for each of such phenotypes, in
order to compute the lightest (in terms of overall amount of administered drug)
treatment still effective for that phenotype.
    Our multi-arm ISCT has been conducted on a large HPC infrastructure using
a backtracking-based search algorithm to seek for the lightest but still effective
treatment for each VP. Our search algorithm drives a (black-box) simulator of
the VPH model at hand to intelligently explore the space of possible time-series
of drug administrations.
    We note that a clinical trial with a so high number of arms is not even
conceivable in the classical in vivo setting (it would require a distinct human
patient for each phenotype and for each possible alternative time-series of drug
administrations). This undoubtedly shows the revolutionary potential of artificial
intelligence for model-based (in silico) precision medicine.

                                        2
Paper outline. The paper is organised as follows. After describing some back-
ground knowledge on the HPG axis model used in our case study and its pheno-
types in Section 2, we outline our reference downregulation protocol in Section 3.
In Section 4 we state our problem and describe our algorithm to solve it. Section 5
describes and analyses the outcome of our multi-arm ISCT. Finally, Section 6
discusses related work and Section 7 draws conclusions.


2     Preliminaries

In this section we give the necessary background knowledge.


2.1   The GynCycle Virtual Physiological Human Model

We focus on the GynCycle Virtual Physiological Human (VPH) model, initially
presented in [36]. GynCycle is a continuous-time differential-algebraic equation–
based mathematical model of the human female Hypothalamic–Pituitary–Gonadal
(HPG) axis, with a special focus on the interactions and feedback mechanisms
between hormones like GnRH, Follicle-Stimulating Hormone (FSH), Luteinizing
Hormone (LH), Estradiol (E2), Progesterone (P4), Inhibin A (IhA) and Inhibin
B (IhB) during the different stages of the human female menstrual cycle.
    The model defines the time evolution of overall 33 biological quantities (mostly
blood concentration of hormones) and the Pharmacokinetics/Pharmacodynam-
ics (PKPD) of several pharmaceutical drugs used in assisted reproduction (in
particular, GnRH analogues like Triptorelin, or FSH/LH–based drugs) by means
of highly non-linear differential equations.
    As such, GynCycle is a hybrid system whose inputs represent drug adminis-
trations. As it happens with VPH models of practical interest as GynCycle, the
complexity of the differential equations makes their symbolic analysis (by, e.g.,
a model checker for hybrid systems) very challenging. Indeed, the model can
be simulated by means of numerical integration in order to compute the time
evolution of the hormones of interest upon administration of a sequence of doses
of one or more drugs. Thus, GynCycle is used as an executable black-box model.


2.2   GynCycle Virtual Phenotypes

VPH models like GynCycle typically take into account inter-subject variabil-
ity (i.e., the physiological differences among different individuals) by including
suitable parameters in their equations. Different value assignments to model pa-
rameters yield different model time evolutions and/or different reactions to drug
administrations, thus defining different Virtual Phenotypes (VPs). Intuitively,
each VP represents a class of patients behaving similarly to each other.
    Computing a complete set of VPs for the model at hand is the starting point
to obtain a representative population of virtual patients (hence, ideally showing
all possible phenotypes). In turn, this is the key enabler to perform In-Silico


                                        3
Clinical Trials (ISCT) to assess, e.g., safety and/or efficacy of a pharmacological
treatment.
    Unfortunately, computing the complete set of VPs defined by a complex
VPH model is all but easy from the combinatorial point of view, and may re-
quire weeks of massive simulation on large High Performance Computing (HPC)
infrastructures. In [39,31] a statistical model checking–based approach using a
non-uniform sampling process finely tuned against the model has been discussed,
and a representative population of VPs for GynCycle consisting of around 104
individuals has been computed (once and for all ) from a huge search space of
1075 possible candidates.
    The availability of such a representative population of VPs is the key enabler
for our ISCT.


3   Downregulation in Assisted Reproduction Treatments

In this section we define the treatment we used as our case study: the downreg-
ulation phase of an assisted reproduction protocol currently administered at the
Department of Reproductive Endocrinology of University Hospital Zurich.
     At each menstrual cycle of a human female, among the several follicles ini-
tially present in the ovaries, only one, unless in exceptional cases, grows and
reaches maturity. A complex competition among follicles takes place, driven by
a hormone-based feedback loop, in order to inhibit the maturation of multiple
follicles.
     In a so-called “long protocol”, the treatment aims at neutralising (through
drugs) such a feedback loop (downregulation phase), in order to achieve a con-
trolled and an as-simultaneous-as-possible growth of 5–15 ovarian follicles (stim-
ulation phase). When the first three follicles reach a size where a mature oocyte
can be expected (about 18 mm in diameter), ovulation is induced, the oocytes
are retrieved, those which are mature are tried to be fertilised in vitro, and
implanted either immediately within the treatment cycle or later after cryop-
reservation back into the uterus.
     Assisted reproduction treatments are complex and challenging, with low av-
erage success rates (around 30%) even in the top clinics, and with many factors
that, to date, can be hardly kept under full control. Indeed, as hormonal regu-
latory systems occur within a complex network of endocrinological, neurological
and psychological factors [19,16], they are difficult to capture within clinical
studies, and model-based approaches might be of great aid in taking these many
factors under better control.
     Our case study is one of the worldwide classically used downregulation pro-
tocols aiming at suppressing the usual hormonal oscillations of the menstrual
cycle (as in Figure 1) and preparing the patient to the following stimulation. At
the Deptartment of Reproductive Endocrinology of University Hospital Zurich,
this protocol currently consists of a sequence of daily administrations of 0.1 mg
of Triptorelin (a GnRH analogue). For physiological reasons, downregulation is
started within a precise time window of the menstrual cycle, namely between


                                        4
        450                                                                               16
                                                     No treatment                                                                    No treatment
        400                                Triptorelin 0.1 mg/day                         14                               Triptorelin 0.1 mg/day
        350                                                                               12
        300
                                                                                          10
pg/mL


                                                                                  ng/mL
        250
                                                                                           8
        200
                                                                                           6
        150
        100                                                                                4
         50                                                                                2
             0                                                                             0
                     0     7   14   21      28   35   42     49     56   63                        0       7   14   21      28   35     42   49     56   63
                                          time (days)                                                                     time (days)

                                         (a) E2                                                                          (b) P4
        20                                                                                140
                                                     No treatment                                                                    No treatment
        18                                 Triptorelin 0.1 mg/day                         120                              Triptorelin 0.1 mg/day
        16
                                                                                          100
        14
        12                                                                                 80
IU/L


                                                                                  IU/L
        10                                                                                 60
         8
                                                                                           40
         6
         4                                                                                 20
         2                                                                                     0
                 0        7    14   21      28   35     42   49     56   63                            0   7   14   21      28   35   42     49     56   63
                                          time (days)                                                                     time (days)

                                     (c) FSH                                                                         (d) LH

Fig. 1: Expected downregulation effects on some of the hormones playing a role
in the human female menstrual cycle. The purple area defines the time window
during which Triptorelin is administered.


day 21 and day 25. The treatment might have different duration in order to
address different patient reactions. In particular, a downregulation treatment is
considered successful (effective) for a patient in case the blood concentrations
of a given set of hormones and other physiological quantities go below certain
thresholds within 9 days from the first drug administration, and stay always
below such thresholds for the following 21 days. As a consequence, a downregu-
lation treatment might last up to 30 days.


4                Computing Optimal Personalised Fertility Treatments

In the following, we denote with R, R≥0 , R+ and N+ the sets of, respectively,
all real, non-negative real, strictly positive real, and strictly positive natural
numbers.


4.1                      Formalising the Virtual Physiological Human (VPH) model

VPH models (as GynCycle) are hybrid systems. In the typical case (as ours) in
which model inputs are drug administrations, VPH models can be abstracted
into Discrete Event Systems (DESs) (see, e.g., [23,27]), i.e., continuous-time


                                                                              5
input-state-output deterministic causal dynamical systems [38] whose input func-
tions are discrete event sequences.
    Events for a DES defining a VPH model as GynCycle represent clinical ac-
tions, and take values from a finite alphabet A (in our case, defining all the
possible doses for the treatment drug). We assume that a distinguished element
nop exists in A, representing the “null” event (“no action”).
    A discrete event sequence (defining an input function for our VPH model) is
a function u(t) associating to each (continuous) time instant t ∈ R≥0 an event
(i.e., a clinical action) in A, and such that u(t) differs from nop only in time
points multiple of a given time quantum τ ∈ R+ . Intuitively, in our setting,
a discrete event sequence u(t) defines a series of drug administration actions
(including “no action”) occurring at time points multiple of τ .
    Given an assignment λ to the VPH model parameters (i.e., a Virtual Pheno-
type, VP) and a discrete event sequence u, the associated VPH model trajectory
is a continuous-time function x(λ, t) representing the time evolution of the bio-
logical quantities defined by the model when fed with u starting from its initial
state and with its parameters set to λ. Of course, as the VPH model is causal,
its trajectories up to any time point t ∈ R≥0 only depend on the restriction of
the input discrete sequence up to time point t [24].

4.2   Modelling treatment invariants and goals
Treatment invariants and goals define conditions that must be, respectively,
always and eventually satisfied by a successful treatment. Invariants and goals
for the treatment being sought can be modelled as continuous-time monitors
embedded within the VPH (DES) model, along the lines of [23,28]. Monitors
observe the state of the system and check whether the properties of interest are
satisfied.
    In particular, given a model trajectory x(λ, t) under a given input function
u(t), our monitor output is Undet as long as x(λ, t) satisfies the invariants (in
other words, the monitor decision is “undetermined” as there is hope to extend
the current treatment into a successful treatment) and goes to and stays at
value Fail as soon as invariants are violated. When a Undet model trajectory
satisfies the goal conditions, the monitor output turns to Success, informing the
caller that the input function u(t) defines a successful treatment (i.e., a effective
treatment which always satisfies invariants).
    The use of continuous-time monitors embedded in the VPH model gives us a
flexible way to model both bounded safety and bounded liveness properties (see,
e.g., [25,27] for a use of monitors to define safety properties for cyber-physical
systems).
    In our setting, the properties of interest are the conditions of successful down-
regulation treatments (Section 3). In particular, our invariant requires that the
value of all the biological quantities under observation go and stay beyond their
thresholds from the 9-th day after the first drug administration. Our goal condi-
tion instead requires that values for those quantities stay below their thresholds
for 21 consecutive days.


                                         6
                      invariants violated
            Undet                           Fail                     Success

                                       goal reached

                            Fig. 2: Treatment monitor.


    As a consequence, our monitor output is Undet in the initial state, and
eventually turns either to Fail or to Success (see Figure 2). Model output turns
to value Fail as soon as, from the 9-th day after the first drug administration,
the value of any of the physiological quantities under observation goes beyond
its threshold. Model output turns instead from value Undet to value Success
as soon as all the thresholds are satisfied for 21 consecutive days.


4.3   VPH model simulation

As anticipated in Section 2 the complexity of the (highly non-linear) differential
equations typically occurring in actual VPH models hinders the possibility for
their symbolic analysis. Numerical simulation is often the only means to compute
the time evolution of the model quantities of interests (mainly blood hormone
concentrations in GynCycle) upon a sequence of clinical actions (administration
of one or more drugs with their associated doses). Also, as VPH model equations
are often stiff, simulation can be an expensive process from a computational point
of view.
    Our algorithm for optimal personalised treatment computation regards the
input VPH model (GynCycle in our case study) as an executable black-box
model. In order to drive the VPH model simulator, our algorithm assumes that
it accepts a set of basic commands (which are available or can be readily im-
plemented within most modern simulators [27]) in order to: (i) set given values
(λ, the VP at hand) for the model parameters; (ii) prepare the model to be
simulated from its initial state; (iii) inject a clinical action (i.e., administer a
drug dose) and advance the simulation by time τ (e.g., 1 day); (iv) rollback to a
previous simulation state. A simulation campaign (see, e.g., [26]) is a sequence
of such simulator commands.


4.4   Computing an optimal personalised treatment

In this section we describe the search-based algorithm that we used at the core
of our In-Silico Clinical Trial (ISCT).
    Given a VP λ of our GynCycle VPH model and a set of possible alternative
doses for the drug at hand defining the set A of possible clinical actions (including
action “no action”, i.e., dose = 0), our backtracking-based algorithm performs a
depth-first search in the space of possible treatments (i.e., discrete sequences of
clinical actions in A) lasting at most h = 25 + 9 + 21 = 55 days from day 1 of


                                            7
the patient menstrual cycle (i.e., the latest cycle day, 25, when the treatment
can start, plus the latest day, 9, within which the safety conditions must be met,
plus the number of consecutive days, 21, in which the safety conditions must be
always satisfied in order to declare success).
    The search algorithm is implemented in C and drives a GynCyle model simu-
lator (implemented within the Limex Solver [12]) using the commands described
in Section 4.3. As the downregulation treatment forbids to administer multiple
drug doses in a single day, we can fix the time step τ (time quantum between
clinical actions) to 1 day.
    The initial state of the GynCycle model represents a patient of VP λ on day
1 of her menstrual cycle. Furthermore, as described in Section 4.2, the model
is equipped with a continuous-time monitor which checks at any time whether
the current model trajectory satisfies the treatment safety and goal conditions
or not.
    The algorithm initialises the current treatment to the empty treatment and
advances the simulator to cycle day 21 (the earliest day of the first clinical
action) with no drug administrations. From this point, at each step it extends
the current treatment with a new clinical action a ∈ A (possibly nop, i.e., “no
drug”).


Checking safety and goal conditions and backtracking After having ap-
pended to the current partial treatment a new action a ∈ A, the action is injected
into the simulator, the simulator is advanced by time τ , and the monitor output
at the end of the last simulation step is retrieved, thus evaluating the safety and
goals treatment conditions (as defined in Section 4.2) after one day from the last
clinical action. The simulated (partial) treatment is said safe if the monitor out-
put is Undet, and unsafe if the monitor output if Fail. Whenever the monitor
output turns from Undet to Success, the current (safe) treatment represents
a successful treatment. As soon as the monitor returns Fail, a backtracking is
performed, with the previous simulator state being restored (together with the
state of the monitor and the monitor output value in the restored state). If no
further actions can be executed in the current node of the search tree, a back-
track is triggered which restores the simulator (plus monitor) state one level up
in the stack.


Optimality constraint Our algorithm does not stop at the first found suc-
cessful treatment. Indeed, along the lines of [22], it keeps track of the lightest
successful treatment found so far, i.e., the one envisioning the administration of
the minimum overall drug amount Dmin ∈ R≥0 . Initially, Dmin is set to +∞.
    At each node of the search tree, any action extending the current partial
treatment with an administration of a drug dose which would make the over-
all administered drug amount reaching or exceeding Dmin is regarded as not
applicable.

                                        8
Preference ordering among actions In a black-box setting as ours, no in-
ference can be made on the effects of each candidate action without performing
a simulator run to actually advance the model and then querying the model
monitor output. Hence, approaches to compute, through inference, a dynamic
preference order among the candidate actions to be tried during search (as those
exploited in, e.g., classical planning, planning for white-box hybrid systems –see
Section 6– or CSP, SAT or local search solvers) cannot be applied.
    The only information available to the algorithm without running the simu-
lator is value Dmin and the overall cost of the current partial treatment (overall
amount of drug administered). Hence, given that our algorithm searches for a
successful treatment of minimum cost and that any action has a non-negative
contribution to the overall treatment cost, it not surprisingly tries the clinical
actions on each node in the search tree in ascending order of their associated
dose (i.e., cost), hence performing a greedy, optimistic choice.
    The optimality constraint (whose threshold is updated each time a new opti-
mal treatment is found), the presence of a bounded horizon h and the fact that
our algorithm does not stop at the first successful treatment clearly guarantee
that a global optimum is always returned (if any successful treatment exists).

5     Multi-Arm In-Silico Clinical Trial
In this section we outline how we set up and conducted our In-Silico Clinical
Trial (ISCT).

5.1    Selection of Virtual Phenotypes (VPs) and exclusion criteria
The GynCycle model defines around 104 VPs (Section 2.1) whose time behaviour
(under no treatment) is shown in Figure 3 (light curves). In order to carry out
our multi-arm ISCT, we clustered such VPs by merging in the same cluster
phenotypes behaving similarly, where similarity has been defined (along the lines
of [39,31]) by means of signal processing metrics. For each obtained cluster, a
single representative has been randomly selected.
    Not surprisingly, the reference downregulation treatment of Section 3 does
not succeed on all possible phenotypes. Hence, we excluded from our ISCT those
VPs for which the reference treatment fails. Our exclusion criterion directly
stems from domain knowledge on downregulation treatments (namely: for these
protocols, if the reference treatment, which envisions one full drug dose per day,
fails for a patient, a lighter treatment will fail as well).
    As a result of the application of clustering and of our exclusion criteria, we
obtained a population of 98 VPs representative of the spectrum of all model
VPs for which the reference downregulation treatment works (dark curves in
Figure 3). Any two different VPs in our selection differ significantly for the time
behaviour of at least one of the 33 biological quantities defined by the model
(although they might appear similar on others, e.g., those in Figure 3).
    Each of the 98 VPs in our selected population defines a distinct arm of our
multi-arm ISCT.

                                        9
                (a) E2                                     (b) P4


               (c) FSH                                     (d) LH

Fig. 3: Time evolutions of all GynCycle VPs for some of the biological quantities
defined by the model.


5.2   Multi-arm ISCT run
We ran our 98-arm ISCT using a large High Performance Computing (HPC)
infrastructure (the Marconi cluster) kindly provided by the Cineca consortium.
    For each VP in the selected population, we ran our algorithm of Section 4
on an independent node of the cluster searching for an optimal (i.e., lightest)
downregulation treatment for that VP, thus in an embarrassing parallel fashion.

5.3   Computational results
Figure 4a shows the distribution of the computation times of each parallel process
implementing a single arm of our multi-arm ISCT.
   In particular, the average completion time of the arms of our ISCT is 19.09 h.
Although standard deviation is very large (21.31 h), from Figure 4a it can be
seen that the vast majority (around 90%) of ISCT arms terminate in less than
48 hours.
   Figure 4b shows the distribution of the number of computation nodes among
our ISCT arms. On average, 918 770 nodes have been expanded by each parallel
search process in order to find an optimal treatment for the VP at hand. Also
here, standard deviation is very large (1 144 123 nodes).


                                       10
    The high standard deviations of the distributions of computation time and
number of expanded nodes among VPs can be explained by observing that our
algorithm deterministically orders clinical actions to be selected in each search
tree node in ascending order of the administered drug dose (i.e., on their con-
tribution to the overall cost function). In Section 4.4 we have argued that any
choice of the actions based on inference of the prospective effect of each action is
a not viable option, because of the black-box nature of the Virtual Physiological
Human (VPH) model. That said, the left skew of the distributions in Figures 4a
and 4b suggests that our chosen action preference order does behave very well
in the vast majority of cases.
    Overall, Figures 4a and 4b show the revolutionary potential of ISCT: an
ISCT over a population of 98 VPs has been conducted in a few days of parallel
computation, whilst an equivalent in vivo clinical trial would have been not even
conceivable. In fact, we would need to recruit in the clinical trial a distinct human
patient for each phenotype and for each of the tested treatment variations.
The overall number of human patients to be recruited would have been several
millions, thus clearly making the approach infeasible.


5.4   ISCT outcomes

Figure 5 shows the distribution of the total amount of drug employed by our
optimal personalised treatments, as found by our algorithm. In particular, our
computed personalised treatments use only a tiny fraction (on average 16.96%)
of the drug quantity employed by the reference treatment of Section 3 (i.e., 0.1
mg on each day for up to 30 days, totalling up to 3 mg and never less than 2.8
mg, since the required thresholds are never reached before day 7).
    Figure 5 shows that optimal personalised treatments save, on average, 83.04%
of the drug administered in the reference treatment (standard deviation: 2.74%).
    However, these extremely favourable results must be read with care. Differ-
ently from what happens in clinical practice, our algorithm has perfect knowledge
about the phenotype of the patient under treatment, hence it can predict ex-
actly her reaction to each tentative clinical action (drug administration). As a
consequence, both the dose and the timing of each drug administration can be
optimally chosen in the personalised treatment.
    The goal of presenting results as those in Figure 5 is then to show the potential
huge room for improvement that in silico approaches can bring to precision
medicine. However, many critical issues are still open in the area of ISCT. Some
of them are recapped in Section 7.


6     Related Work

Individualised treatments have the potential value to reduce costs and improve
outcomes of standard clinical treatments. In recent years data-driven techniques
have been investigated thanks to the availability of big data [34]. For example,
the knowledge-base approach in [13] has been used to optimise treatment plans


                                         11
                                                          Cumulative distribution
                                 100%                     Computation time
                                     90%
                                     80%


                 Percentage of VPs
                                     70%
                                     60%
                                     50%
                                     40%
                                     30%
                                     20%
                                     10%

                                              < 6 6 12 18 24 30 36 42 48 54 60 66 72 78 84 90 96
                                                            Computation time (hours)


                                                                                     (a)
                                                          Cumulative distribution
                                 100%                     Nodes expanded
                                     90%
                                     80%
                 Percentage of VPs


                                     70%
                                     60%
                                     50%
                                     40%
                                     30%
                                     20%
                                     10%

                                                  < 1.0e06 1.0e06 2.0e06 3.0e06 4.0e06 5.0e06 6.0e06
                                                                Number of expanded nodes


                                                                                     (b)

Fig. 4: Distributions of computation time (a) and number of expanded search
tree nodes (b) among the different ISCT arms.

                                                                      Percentage of dose saving
                                           91.1%89.3%87.5%85.7%83.9%82.1%80.4%78.6%76.8%75.0%73.2%                   ...   0%
                                                  Cumulative distribution of optimised treatment
                            100%                  Optimised treatment
                                                  Reference treatment
                                 90%

                                 80%
             Percentage of VPs


                                 70%

                                 60%

                                 50%

                                 40%

                                 30%

                                 20%

                                 10%


                                           0.25    0.3   0.35   0.4    0.45 0.5 0.55 0.6 0.65 0.7 0.75               ...   2.8
                                                                      Overall amount administered Triptorelin (mg)


                                                  Fig. 5: Multi-arm ISCT outcomes.


                                                                                      12
for lung cancer. Unfortunately in presence of scarce experimental data, the above
approaches cannot be applied. For example, in our case study hormones blood
concentrations are not measured every day, since those measurements are costly
and invasive.
    Model-based approaches, exploiting PharmacoKinetics (PK), as e.g., [42], are
used instead to build virtual phenotypes populations. Such populations are used
to optimise and individualise drug doses [18,41]. PK-based models, however, do
not define how administered drugs can affect a Virtual Phenotype (VP) (namely,
PharmacoDynamics, PD), i.e., possible side-effects due to drug administrations
are not taken into account.
    In our model-based setting, we have to face with complex Virtual Physio-
logical Human (VPH) models, e.g., HumMod [17], Physiomodel [33] and Gy-
nCycle [36] defined through highly non-linear differential equations modelling
underlying biological mechanisms (e.g., inhibitory and stimulatory effects).
    As outlined in Section 4.1, such VPH models are hybrid systems that can be
abstracted into Discrete Event Systems (DESs) (see, e.g., [23,27]) whose inputs
are discrete event sequences. To find an optimal treatment means to find an opti-
mal plan in hybrid domains, where the behaviour of the given system is described
by both discrete and continuous quantities. In the literature, there are many tech-
niques and tools to model planning problems in hybrid domains. Examples are:
PDDL+ [14,40], Satisfiability Modulo Theories (SMT)-based PDDL+ [5] and
other PDDL+ extensions [11,10,6,3]. Model checking techniques are also used
to find plans. Examples in this direction are [4], which exploits symbolic model
checking, UPMurphi [11], which given as input a PDDL+ problem specification
computes a universal plan, CGMurphi [9], an explicit model checker used to
compute optimal controllers, and [1,32], which define a methodology to compute
controllers for non-linear systems.
    However, as outlined in Section 4.3, the complexity of the differential equa-
tions of VPH models relevant for in silico clinical practice makes such models
out of reach for symbolic approaches like those mentioned above, and appoints
numerical integration as the only viable means to compute (black-box) the model
evolutions under a given input function.
    In particular, even considering that clinical actions have constant and equal
duration (as, e.g., in [20]), no reasoning or inference can be made on action effects
in a black-box setting as ours. The automated synthesis of rational decisions and
plans in black-box environments, where numerical simulation is the only means
to discover the effect of actions, is common in several other application domains
of high industrial relevance, like smart grids (see, e.g., [29,30,15]), games (see,
e.g., [21]) and real-time manoeuvring of Unmanned Aerial Vehicles (see, e.g.,
[35]).


7   Conclusions

We presented a case study showing how model-based approaches coupled with
intelligent search techniques can be used to support precision medicine by com-


                                         13
puting, for each given patient, a personalised treatment maximising effectiveness
while minimising cost as well as likelihood and severity of adverse effects.
    By exploiting a complex Virtual Physiological Human (VPH) model of the
human female Hypothalamic–Pituitary–Gonadal (HPG) axis, its set of Virtual
Phenotypes (VPs), and a backtracking-search algorithm that intelligently drives
a simulator for the VPH model, we set up and conducted a multi-arm In-Silico
Clinical Trial (ISCT) which computes, for each VP satisfying our inclusion cri-
teria, a personalised variation of a reference pharmaceutical treatment which,
preserving effectiveness, minimises the overall amount of drug used.
    Our ISCT can be carried out with a few days of parallel computation on a
regular High Performance Computing (HPC) infrastructure, while an equivalent
(classical) in vivo trial would require the recruiting of several millions of human
patients in order to test all treatment variations assessed in silico, thus resulting
just infeasible.
    Although our trial shows extremely favourable results (with computed per-
sonalised treatments using only a tiny fraction of the drug quantity employed by
the reference treatment), there are sill open issues in the area of ISCT. In partic-
ular, mapping each human patient to her corresponding VP calls for novel and
highly inter-disciplinary approaches at the intersection of Artificial Intelligence,
Medicine, Mathematics and Biology.


Acknowledgements. This work was partially supported by the Italian Min-
istry of University and Research (MIUR) under grant “Dipartimenti di eccellenza
2018–2022” of the Department of Computer Science of Sapienza University of
Rome and by the EC FP7 project PAEON (Model Driven Computation of Treat-
ments for Infertility Related Endocrinological Diseases, 600773).


References
 1. V. Alimguzhin, F. Mari, I. Melatti, I. Salvo, and E. Tronci. Linearizing discrete
    time hybrid systems. IEEE TAC, 62(10), 2017.
 2. M. Bächler, D. Menshykau, Ch. De Geyter, and D. Iber. Species-specific differences
    in follicular antral sizes result from diffusion-based limitations on the thickness of
    the granulosa cell layer. Mol. Hum. Reprod., 20(3), 2014.
 3. S. Bogomolov, D. Magazzeni, S. Minopoli, and M. Wehrle. PDDL+ planning with
    hybrid automata: Foundations of translating must behavior. In Proc. ICAPS 2015.
    AAAI, 2015.
 4. S. Bogomolov, D. Magazzeni, A. Podelski, and M. Wehrle. Planning as model
    checking in hybrid domains. In Proc. AAAI 2014. AAAI, 2014.
 5. D. Bryce, S. Gao, D.J. Musliner, and R.P. Goldman. Smt-based nonlinear pddl+
    planning. In Proc. AAAI 2015. AAAI, 2015.
 6. A.J. Coles and A.I. Coles. Pddl+ planning with events and linear processes. In
    Proc. ICAPS 2014, 2014.
 7. L.G.E. Cox, S. Loerakker, M.C.M. Rutten, B.A.J.M. De Mol, and F.N. Van
    De Vosse. A mathematical model to evaluate control strategies for mechanical
    circulatory support. Artif. Organs, 33(8), 2009.


                                           14
 8. C. Dalla Man, F. Micheletto, D. Lv, M. Breton, B. Kovatchev, and C. Cobelli. The
    uva/padova type 1 diabetes simulator: New features. JDST, 8, 2014.
 9. G. Della Penna, B. Intrigila, D. Magazzeni, I. Melatti, and E. Tronci. Cgmurphi:
    Automatic synthesis of numerical controllers for nonlinear hybrid systems. Eur. J.
    Control, 19(1), 2013.
10. G. Della Penna, B. Intrigila, D. Magazzeni, and F. Mercorio. Planning for au-
    tonomous planetary vehicles. In Proc. ICAS 2010. IEEE, 2010.
11. G. Della Penna, D. Magazzeni, F. Mercorio, and B. Intrigila. Upmurphi: A tool
    for universal planning on pddl+ problems. In Proc. ICAPS 2009. AAAI, 2009.
12. R. Ehrig, U. Nowak, L. Oeverdieck, and P. Deuflhard. Advanced extrapolation
    methods for large scale differential algebraic problems. In F.H.-J. Bungartz, editor,
    High Performance Scientific and Engineering Computing, volume 8 of LNCSE.
    Springer, 1999.
13. A. Fogliata, F. Belosi, A. Clivio, P. Navarria, G. Nicolini, M. Scorsetti, E. Vanetti,
    and L. Cozzi. On the pre-clinical validation of a commercial model-based optimi-
    sation engine: Application to volumetric modulated arc therapy for patients with
    lung or prostate cancer. Radiother. Oncol., 113(3):385–391, 2014.
14. M. Fox and D. Long. Modelling mixed discrete-continuous domains for planning.
    JAIR, 27, 2006.
15. B.P. Hayes, I. Melatti, T. Mancini, M. Prodanovic, and E. Tronci. Residential de-
    mand management using individualised demand aware price policies. IEEE Trans.
    Smart Grid, 8(3), 2017.
16. M.P. Hengartner, T.H.C. Kruger, K. Geraedts, E. Tronci, T. Mancini, F. Ille,
    M. Egli, S. Roeblitz, R. Ehrig, L. Saleh, K. Spanaus, C. Schippert, Y. Zhang,
    and B. Leeners. Negative affect is unrelated to fluctuations in hormone levels
    across the menstrual cycle: Evidence from a multisite observational study across
    two successive cycles. J. Psycho. Res., 99, 2017.
17. R. Hester, A. Brown, L. Husband, R. Iliescu, W.A. Pruett, R.L. Summers, and
    T. Coleman. Hummod: a modeling environment for the simulation of integrative
    human physiology. Front. Physiol, 2, 2011.
18. P.M. Jeena, W.R. Bishai, J.G. Pasipanodya, and T. Gumbo. In silico children and
    the glass mouse model: clinical trial simulations to identify and individualize opti-
    mal isoniazid doses in children with tuberculosis. Antimicrob. Agents Chemother.,
    55(2):539–545, 2011.
19. B. Leeners, T.H.C. Kruger, K. Geraedts, E. Tronci, T. Mancini, F. Ille, M. Egli,
    S. Roeblitz, L. Saleh, K. Spanaus, C. Schippert, Y. Zhang, and M.P. Hengartner.
    Lack of associations between female hormone levels and visuospatial working mem-
    ory, divided attention and cognitive bias across two consecutive menstrual cycles.
    Front. Behav. Neuro., 11, 2017.
20. H.X. Li and B.C. Williams. Generative planning for hybrid systems based on flow
    tubes. In Proc. ICAPS 2008. AAAI, 2008.
21. N. Lipovetzky, M. Ramirez, and H. Geffner. Classical planning with simulators:
    Results on the atari video games. In Proc. IJCAI 2015, volume 15, pages 1610–
    1616, 2015.
22. T. Mancini, P. Flener, and J. Pearson. Combinatorial problem solving over rela-
    tional databases: View synthesis through constraint-based local search. In Proc.
    SAC 2012. ACM, 2012.
23. T. Mancini, F. Mari, A. Massini, I. Melatti, F. Merli, and E. Tronci. System
    level formal verification via model checking driven simulation. In Proc. CAV 2013,
    volume 8044 of LNCS. Springer, 2013.


                                           15
24. T. Mancini, F. Mari, A. Massini, I. Melatti, I. Salvo, and E. Tronci. On minimising
    the maximum expected verification time. IPL, 122, 2017.
25. T. Mancini, F. Mari, A. Massini, I. Melatti, and E. Tronci. Anytime system
    level verification via random exhaustive hardware in the loop simulation. In Proc.
    DSD 2014. IEEE, 2014.
26. T. Mancini, F. Mari, A. Massini, I. Melatti, and E. Tronci. System level formal
    verification via distributed multi-core hardware in the loop simulation. In Proc.
    PDP 2014. IEEE, 2014.
27. T. Mancini, F. Mari, A. Massini, I. Melatti, and E. Tronci. SyLVaaS: System level
    formal verification as a service. In Proc. PDP 2015. IEEE, 2015.
28. T. Mancini, F. Mari, A. Massini, I. Melatti, and E. Tronci. Anytime system level
    verification via parallel random exhaustive hardware in the loop simulation. Mi-
    croprocessors and Microsystems, 41, 2016.
29. T. Mancini, F. Mari, I. Melatti, I. Salvo, E. Tronci, J. Gruber, B. Hayes, M. Pro-
    danovic, and L. Elmegaard. Demand-aware price policy synthesis and verification
    services for smart grids. In Proc. SmartGridComm 2014. IEEE, 2014.
30. T. Mancini, F. Mari, I. Melatti, I. Salvo, E. Tronci, J.K. Gruber, B. Hayes, M. Pro-
    danovic, and L. Elmegaard. User flexibility aware price policy synthesis for smart
    grids. In Proc. DSD 2015. IEEE, 2015.
31. T. Mancini, E. Tronci, I. Salvo, F. Mari, A. Massini, and I. Melatti. Comput-
    ing biological model parameters by parallel statistical model checking. In Proc.
    IWBBIO 2015, volume 9044 of LNCS. Springer, 2015.
32. F. Mari, I. Melatti, I. Salvo, and E. Tronci. Model based synthesis of control
    software from system level formal specifications. ACM TOSEM, 23(1), 2014.
33. M. Mateják and J. Kofránek. Physiomodel-an integrative physiology in modelica.
    In Proc. EMBC 2015. IEEE, 2015.
34. W. Raghupathi and V. Raghupathi. Big data analytics in healthcare: promise and
    potential. Health Inf. Sci. Sys., 2(1):3, 2014.
35. M. Ramirez, M. Papasimeon, L. Benke, N. Lipovetzky, T. Miller, and A.R. Pearce.
    Real–time uav maneuvering via automated planning in simulations. In Proc. IJ-
    CAI 2017, pages 5243–5245. AAAI, 2017.
36. S. Röblitz, C. Stötzel, P. Deuflhard, H.M. Jones, D.-O. Azulay, P. van der Graaf,
    and S.W. Martin. A mathematical model of the human menstrual cycle for the
    administration of GnRH analogues. J. Theor. Biol., 321, 2013.
37. P.P. Roy and K. Roy. Molecular docking and qsar studies of aromatase inhibitor
    androstenedione derivatives. J. Pharm. Pharmacol, 62(12):1717–1728, 2010.
38. E.D. Sontag. Mathematical Control Theory: Deterministic Finite Dimensional Sys-
    tems (2nd Ed.). Springer, 1998.
39. E. Tronci, T. Mancini, I. Salvo, S. Sinisi, F. Mari, I. Melatti, A. Massini, F. Davì,
    T. Dierkes, R. Ehrig, S. Röblitz, B. Leeners, T. H. C. Krüger, M. Egli, and F. Ille.
    Patient-specific models from inter-patient biological models and clinical records.
    In Proc. FMCAD 2014. IEEE, 2014.
40. M. Vallati, D. Magazzeni, B. De Schutter, L. Chrpa, and T.L. McCluskey. Effi-
    cient macroscopic urban traffic models for reducing congestion: A pddl+ planning
    approach. In Proc. AAAI 2016, pages 3188–3194, 2016.
41. S.C. van Dijkman, S.G. Wicha, M. Danhof, and O.E. Della Pasqua. Individual-
    ized dosing algorithms and therapeutic monitoring for antiepileptic drugs. Clin.
    Pharmacol. Ther., 103(4):663–673, 2018.
42. S. Willmann, J. Lippert, M. Sevestre, J. Solodenko, F. Fois, and W. Schmitt. Pk-
    sim® : a physiologically based pharmacokinetic ’whole-body’ model. BIOSILICO,
    1(4):121–124, 2003.


                                           16