<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Multi-Scale Long Short-Term Memory Network with Multi-Lag Structure for Blood Glucose Prediction</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Tao Yang</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ruikun Wu</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Rui Tao</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yuhang Zhao</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Xia Yu</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ning Ma</string-name>
        </contrib>
      </contrib-group>
      <abstract>
        <p>Accurate blood glucose (BG) prediction is necessary for daily glucose management in diabetes therapy. As glucose dynamics are affected by various factors, such as diet, physical exercise, and insulin injection, it is difficult for a deep learning network to consider all the relevant information while balancing high-dimensional inputs against learning efficiency. In this work, a novel multivariate predictor with a multi-scale long short-term memory (MS-LSTM) network was developed to automatically characterize the high-dimensional temporal dynamics and to extract the features of blood glucose fluctuation and temporal trends. Meanwhile, a multi-lag structure is designed for multiple variables, which can extract the dependence between different variables and blood glucose fluctuations more effectively. Furthermore, long-term sparse information was encoded and compressed to improve the learning efficiency of the network. The predictive capability of the proposed method was illustrated through 30-min and 60-min ahead glucose prediction on the OhioT1DM-2 Dataset. The root mean square error (RMSE) values of the 30-min and 60-min ahead predictions were 19.048 and 32.029, respectively, and the mean absolute error (MAE) values were 13.503 and 23.833, respectively. The results demonstrate the efficiency and prediction accuracy of the offline deep learning network, especially when high-dimensional variables are available.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>
        Diabetes is a chronic disease characterized by the inability to
maintain glucose homeostasis. A healthy pancreas controls the release of
glucagon and insulin through α-cells and β-cells, respectively, to
maintain normal blood glucose levels [
        <xref ref-type="bibr" rid="ref8">7</xref>
        ]. Type 1 diabetics
cannot produce insulin normally because the β-cells are compromised,
which leads to hyperglycemia and hypoglycemia [
        <xref ref-type="bibr" rid="ref6">5</xref>
        ], [
        <xref ref-type="bibr" rid="ref18">17</xref>
        ]. In recent
years, advances in continuous glucose monitoring (CGM) and
continuous subcutaneous insulin infusion (CSII) technologies have
contributed to the closed-loop treatment of diabetes [
        <xref ref-type="bibr" rid="ref2">1</xref>
        ], [
        <xref ref-type="bibr" rid="ref3">2</xref>
        ], and [
        <xref ref-type="bibr" rid="ref5">4</xref>
        ]. The
subcutaneous glucose concentration prediction algorithm has the
potential to further improve the closed-loop treatment system for
diabetes [
        <xref ref-type="bibr" rid="ref9">8</xref>
        ], [
        <xref ref-type="bibr" rid="ref15">14</xref>
        ], [
        <xref ref-type="bibr" rid="ref16">15</xref>
        ], and [
        <xref ref-type="bibr" rid="ref19">18</xref>
        ]. However, it is difficult to establish a
multivariate physiological model to predict blood glucose precisely
due to the influence of daily behaviors such as diet, physical exercise,
and insulin injection [
        <xref ref-type="bibr" rid="ref7">6</xref>
        ]. Recently, some multivariate data-driven
models have been used to predict blood glucose levels and have achieved
satisfactory results. A successful case is the multivariable LSTM network
proposed in [
        <xref ref-type="bibr" rid="ref13">12</xref>
        ], which has obtained better prediction results
than the support vector regression model and diabetes experts.
Nevertheless, different behaviors have different temporal effects
on glucose fluctuation [
        <xref ref-type="bibr" rid="ref4">3</xref>
        ]. Using a unified lag for all variables may
not be able to extract information about different characteristics
sufficiently. Therefore, using multiple lags for each variable has
positive implications for blood glucose prediction. An end-to-end
recurrent neural network framework is proposed in paper [
        <xref ref-type="bibr" rid="ref14">13</xref>
        ], which is
equipped with an adaptive input selection mechanism to improve the
prediction performance of the multivariate time series. Based on this
work, we develop a multi-scale LSTM (MS-LSTM) network that can
capture the high-dimensional temporal dynamics and extract the
features of blood glucose fluctuation and temporal trends sufficiently.
Meanwhile, the multi-lag structure in the network can more
effectively extract the dependence between different variables and blood
glucose fluctuations. Compared with the traditional single-lag
structure, using the multi-lag structure can extract more comprehensive
features. Furthermore, long-term sparse information is encoded and
compressed to accelerate the learning of deep networks. The
MS-LSTM model was tested independently several times on the testing
dataset, and the results show that its predictions are accurate
and robust.
      </p>
      <p>This paper is organized as follows: section 2 explains the data
preprocessing used; section 3 describes the architecture of the
MS-LSTM network; section 4 illustrates the model-free prediction methods
used in case of missing data; section 5 analyses the experimental results;
section 6 summarizes the main findings of this study.</p>
    </sec>
    <sec id="sec-2">
      <title>DATA PREPROCESSING</title>
      <p>
        The variables selected for prediction included BG value, basal
insulin dosage, bolus insulin dosage, carbohydrate intake, and
timestamp [
        <xref ref-type="bibr" rid="ref12">11</xref>
        ]. Other variables provided in the dataset, such as galvanic skin
response, skin temperature, and acceleration, were not selected for prediction.
The data were preprocessed by aligning the
original data, filling in the missing data, detecting and reconciling
BG outliers, and normalizing the data. These steps
are described in detail in the following sections.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Data alignment</title>
      <p>
        The data in the OhioT1DM-2 Dataset were collected by multiple devices,
and some of the data were manually recorded by the patient, which
caused the raw data to be asynchronous [
        <xref ref-type="bibr" rid="ref17">16</xref>
        ]. Therefore, the data
need to be aligned before being fed to the prediction model. First,
a time grid with a 5-minute sample period was derived from the
continuous glucose monitoring (CGM) data, and missing samples
were filled with zeros. Second, the timestamps of some insulin
injection and carbohydrate intake records do not exactly match
the timestamps of the CGM data. They were reset to the CGM
timestamp with the smallest time difference to preserve the temporal
correlation between the variables as much as possible [
        <xref ref-type="bibr" rid="ref4">3</xref>
        ].
      </p>
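      <p>The second alignment step, resetting an event timestamp to the CGM timestamp with the smallest time difference, can be sketched as follows. This is a minimal illustration and not the authors' code: the function name snap_to_grid and the representation of timestamps as minutes are assumptions made for the example.</p>
      <preformat>
```python
from bisect import bisect_left


def snap_to_grid(event_minutes, grid_minutes):
    """Map each event time to the nearest time on a sorted CGM grid.

    Both arguments hold times in minutes (an assumption of this sketch).
    """
    snapped = []
    for t in event_minutes:
        i = bisect_left(grid_minutes, t)
        # Candidates: the grid point just before and just after the event.
        candidates = grid_minutes[max(i - 1, 0):i + 1]
        snapped.append(min(candidates, key=lambda g: abs(g - t)))
    return snapped


grid = list(range(0, 60, 5))            # 5-minute CGM grid: 0, 5, ..., 55
print(snap_to_grid([3, 12, 58], grid))  # [5, 10, 55]
```
      </preformat>
      <p>When an event lies exactly halfway between two grid points, this sketch picks the earlier one; the paper does not state how such ties were resolved.</p>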
    </sec>
    <sec id="sec-4">
      <title>CGM outlier detection and reconciliation</title>
      <p>CGM measurements contain noise because of physical interference.
Therefore, outlier detection and reconciliation are necessary to
remove potential noise. First, a Gaussian process regression (GPR)
model was trained to detect outliers in the CGM measurements. The
GPR model was trained on the first 288 points of the
training dataset. Its input was the CGM
measurements from time t − 30 to t − 5 (6 points in total), and its output was
the mean μ(t) and variance σ²(t) of the CGM prediction at time
t. Then μ(t) and σ²(t) were used to reconcile CGM outliers at time
t according to equation (1).</p>
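      <p>
        <disp-formula>
          <label>(1)</label>
          <tex-math><![CDATA[
g(t) =
\begin{cases}
\mu(t) - 4.5\,\sigma^{2}(t), & g(t) < \mu(t) - 4.5\,\sigma^{2}(t) \\
\mu(t) + 4.5\,\sigma^{2}(t), & g(t) > \mu(t) + 4.5\,\sigma^{2}(t) \\
g(t), & \text{others}
\end{cases}
]]></tex-math>
        </disp-formula>
      </p>
      <p>where g(t) is the BG level at time t.</p>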
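      <p>Equation (1) amounts to clipping each CGM reading to a band around the GPR mean. The sketch below assumes that the mean and variance have already been produced by a fitted GPR model, which is not reproduced here; the function name reconcile and the example values are illustrative only.</p>
      <preformat>
```python
def reconcile(g, mu, var, k=4.5):
    """Clip a CGM reading g to the band [mu - k*var, mu + k*var] of equation (1).

    mu and var stand for the GPR predictive mean and variance at the same
    time step; they are plain inputs here (an assumption of this sketch).
    """
    lower = mu - k * var
    upper = mu + k * var
    return min(max(g, lower), upper)


print(reconcile(300, 120, 20))  # 210.0 (above the band, pulled down)
print(reconcile(10, 120, 20))   # 30.0  (below the band, pulled up)
print(reconcile(150, 120, 20))  # 150   (inside the band, unchanged)
```
      </preformat>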
    </sec>
    <sec id="sec-5">
      <title>Missing data filling</title>
      <p>In the OhioT1DM-2 Dataset, basal insulin dosage and CGM
measurements have missing data in some situations. As the basal insulin
dosage has daily periodicity, it can be filled with the previous day’s
data. Although many methods exist for filling missing CGM values,
the accumulated error inevitably increases with the number
of filled values. Therefore, to reduce the accumulated error
caused by data filling, a weighted sum of the first-order Taylor series extrapolation and
historical averages was used to fill in the missing
CGM values when the number of consecutive missing items was less than
12. The methods used when 12 or more consecutive values are missing
are explained in detail later. Note that
missing CGM values in the training dataset were not filled, to
avoid introducing additional noise.</p>
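      <p>The short-gap filling rule can be illustrated with the sketch below. The paper gives neither the weights nor the exact definition of the historical averages, so the weight w, the hist_avg input, and the function name fill_gap are all assumptions introduced for illustration.</p>
      <preformat>
```python
def fill_gap(prev, last, hist_avg, n_missing, w=0.5):
    """Fill n_missing samples after `last` by blending a first-order Taylor
    (linear) extrapolation with historical averages.

    prev, last : the two most recent CGM values before the gap
    hist_avg   : one historical average per missing sample (assumed input)
    w          : weight of the extrapolation (hypothetical value)
    """
    if n_missing >= 12:
        raise ValueError("this rule is only used for gaps shorter than 12 samples")
    slope = last - prev  # finite-difference estimate of the derivative per sample
    filled = []
    for k in range(1, n_missing + 1):
        taylor = last + k * slope
        filled.append(w * taylor + (1 - w) * hist_avg[k - 1])
    return filled


print(fill_gap(100, 104, [110, 110, 110], 3))  # [109.0, 111.0, 113.0]
```
      </preformat>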
    </sec>
    <sec id="sec-6">
      <title>Data normalization</title>
      <p>
        Data normalization can accelerate deep network training and
improve the accuracy of the model to a certain extent. Three normalization
methods were compared, and the model
with coefficient normalization performed best. Coefficient
normalization scales only the amplitude of the data, so that
the distribution of the raw data is preserved as far as possible [
        <xref ref-type="bibr" rid="ref11">10</xref>
        ]. The scaling
of the different variables is shown in Table 1.
      </p>
      <p>
As shown in Figure 1, the MS-LSTM model has a multi-scale
hierarchical structure, which can learn the short-term and long-term
dependence of the blood glucose sequence. Meanwhile, the multi-lag structure
can extract features on time windows of different sizes: the features
extracted on a large time window are more abundant, whereas the features
extracted on a small time window are more time-sensitive.
Therefore, compared with a single lag, the multi-lag structure can extract
more comprehensive features and more effectively extract the
dependence between different variables and blood glucose fluctuations.
In theory, the more lags are used, the more comprehensive the extracted
features, but the training time of the model
increases correspondingly. Therefore, three lags were used for all variables to balance
the training time and adaptability, as shown in Table 2, where PH
represents the prediction horizon.
      </p>
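      <p>The multi-scale input can be pictured as several dilated views of the same glucose history. The sketch below assumes that a scale with rate r and 7 levels means 7 samples taken every r steps, ending at the most recent sample; the rates 1, 2, and 3 are the values the paper reports for the 30-minute horizon, while the function name dilated_window is hypothetical.</p>
      <preformat>
```python
def dilated_window(series, rate, length=7):
    """Return `length` samples of `series` taken every `rate` steps,
    oldest first, ending at the most recent sample."""
    needed = (length - 1) * rate + 1
    if needed > len(series):
        raise ValueError("series is too short for this rate and length")
    picked = series[::-1][::rate][:length]  # walk backwards from the newest sample
    return picked[::-1]


history = list(range(30))  # stand-in for 30 consecutive CGM samples
for rate in (1, 2, 3):
    print(rate, dilated_window(history, rate))
# 1 [23, 24, 25, 26, 27, 28, 29]
# 2 [17, 19, 21, 23, 25, 27, 29]
# 3 [11, 14, 17, 20, 23, 26, 29]
```
      </preformat>
      <p>All three views have the same length, so they can feed branches with the same number of LSTM states, while the larger rates reach further back in time.</p>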
      <p>Specifically, for predicting blood glucose 30 minutes ahead, the
three scales adopted for the blood glucose variable were 1×7, 2×7,
and 3×7, which means that all scale levels are 7 and the dilated
sampling rates are 1, 2, and 3, respectively. The three lags of the basal
variable were 8, 16, and 24. To keep the
output dimensions consistent across the multi-scale hierarchical and multi-lag
structure, the number of LSTM states was set equal to the minimum scale of
the blood glucose variable. As shown in Table 3, to sufficiently extract
the useful information of the various variables, the number of LSTM
states in the feature fusion layer was 256. The numbers of nodes in
the subsequent fully connected layers were 256, 64, and 1, respectively, and
dropout layers were added between the fully connected layers to
avoid overfitting.
      </p>
      <p>In this section, we introduce the architecture of the MS-LSTM
model and explain how the model is trained and tested.</p>
      <sec id="sec-6-1">
        <title>Multi-scale Hierarchical Structure</title>
      </sec>
      <sec id="sec-6-2">
        <title>LSTM and Fully Connected Layer Structure</title>
        <p>Blood Glucose</p>
        <sec id="sec-6-2-1">
          <title>Dilated Sampling</title>
        </sec>
        <sec id="sec-6-2-2">
          <title>Time</title>
          <p>The training data was divided into a training set and a validation set
at a ratio of 9:1. The last 10% of the training dataset is closest to the
testing dataset in time, and its distribution is most similar to that of the
testing dataset, so it was set apart as the validation set. During training,
the model was evaluated on the validation set after each iteration. If
the model had not improved after 300 consecutive
evaluations, training was stopped and the model that had performed
best on the validation set was saved. This strategy, known as early
stopping, effectively mitigates overfitting. Because the 13th point on the test set
needs to be predicted, some training data was added at the beginning
of the test set to ensure that the number of prediction points meets
the requirements. In addition, for the first several CGM values after a noticeable
amount of continuously missing data, the model was not used for
prediction. Instead, two model-free prediction algorithms, adaptive
weight prediction and remain prediction, were used.
Finally, the predictions were limited to the range of 40
to 400. The flow diagram is shown in Figure 2.</p>
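          <p>The early stopping rule and the final range limit can be sketched without any deep learning framework. In the sketch below, evaluate is a hypothetical callback that performs one training iteration and returns the validation loss; the patience of 300 and the limits 40 and 400 are taken from the text, and everything else is an assumption.</p>
          <preformat>
```python
def train_with_early_stopping(evaluate, max_iterations, patience=300):
    """Stop once `patience` consecutive evaluations fail to improve the
    best validation loss; return the best iteration and its loss."""
    best_loss = float("inf")
    best_iteration = None
    stale = 0
    for iteration in range(max_iterations):
        loss = evaluate(iteration)  # hypothetical: one training step plus validation
        if best_loss > loss:
            best_loss, best_iteration, stale = loss, iteration, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best_iteration, best_loss


def clip_prediction(value, low=40.0, high=400.0):
    """Limit a predicted glucose value to the range used in the paper."""
    return min(max(value, low), high)


# Toy validation curve with its minimum at iteration 5.
print(train_with_early_stopping(lambda i: abs(i - 5) + 1.0, 100, patience=3))  # (5, 1.0)
print(clip_prediction(412.3), clip_prediction(35.0))  # 400.0 40.0
```
          </preformat>
          <p>In a real training loop the model weights would be saved whenever the best loss improves, so that the best model can be restored after stopping.</p>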
          <p>Training batch size: The experiment used mini-batches for weight
adjustment, and the batch size used for each weight update affects the
accuracy of the model. In the experiment, a larger
batch size was found to improve the accuracy and accelerate the training
of the model, so the batch size was set to 1024.</p>
          <p>Loss function: The experiment compared the negative
log-likelihood (NLL) loss function, the ε-insensitive loss function, the mean
absolute error (MAE) loss function, and the root mean square error
(RMSE) loss function. The results showed that the model trained
with the RMSE loss function had the best performance.</p>
          <p>
            Optimizer: This experiment tested the root mean square prop
(RMSProp) optimizer and adaptive moment estimation (Adam) optimizer
[
            <xref ref-type="bibr" rid="ref10">9</xref>
            ]. The results showed that the performances of RMSProp and
Adam were similar, but Adam had a significant advantage in
convergence speed. Therefore, the Adam optimizer was used to update
the model weights, and the learning rate was set to 0.0001. The
hyperparameters are summarized in Table 4.
          </p>
          <p>The experiments were run on a notebook computer with the Windows 10
Professional 64-bit operating system, an Intel Core i7 9750H
processor, an NVIDIA GeForce GTX 1660 Ti graphics processing unit, and 16 GB of
memory. The development tools were Python 3.6,
Keras 2.2.4, and TensorFlow-GPU 1.12.0. The code used in the
experiment is available on GitHub. In this hardware and software
environment, the average training time for the MS-LSTM model was about
10 minutes.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>MODEL-FREE PREDICTION</title>
      <p>When more than 11 consecutive CGM values are missing, the
predictions of the MS-LSTM model for the following several values
deviate significantly. Therefore, for these CGM data,
adaptive weight prediction and remain prediction are used instead of
the model. The adaptive weight prediction algorithm uses the short-term
maintainability and long-term periodicity of blood glucose levels to
make predictions. Specifically, when few CGM data are missing,
the prediction is close to the last CGM value before the missing data,
that is, it relies on the short-term maintainability of blood glucose
levels. Conversely, when more data are missing, the
prediction is close to the CGM value at the same time on the previous
day, that is, it relies on the long-term periodicity of blood glucose
levels. The process of adaptive weight prediction is described by
equations (2)-(4).</p>
      <p>
        <disp-formula>
          <label>(2)</label>
          <tex-math><![CDATA[f = \frac{n_{miss}}{n_{miss} + c}]]></tex-math>
        </disp-formula>
        <disp-formula>
          <label>(3)</label>
          <tex-math><![CDATA[g_{av}(t) = \frac{1}{2n + 1} \sum_{n_{back} = 288 - n}^{288 + n} g(t - n_{back} T)]]></tex-math>
        </disp-formula>
        <disp-formula>
          <label>(4)</label>
          <tex-math><![CDATA[P_{aw} = f \, g_{av}(t) + (1 - f) \, g_{last}(t)]]></tex-math>
        </disp-formula>
      </p>
      <p>where n<sub>miss</sub> is the number of missing data between the current
prediction and the last CGM measurement before the missing data. c
is a constant not less than 0, and its value in this experiment is
set to 68. f is the adaptive weight factor, which depends on n<sub>miss</sub> and
c. T is the blood glucose measurement period, which is 5 minutes in the
OhioT1DM-2 Dataset. n is an integer
constant not less than 0, and its value in this experiment is set to 1.
n<sub>back</sub> ∈ {288 − n, 288 − n + 1, ..., 288 + n}. g(t) is the BG level
at time t. g<sub>av</sub>(t) is the average value of the CGM data of 2n + 1
points at the same time on the previous day, which represents the
long-term periodicity of BG levels. g<sub>last</sub>(t) is the last CGM value
before the missing data, which represents the short-term
maintainability of BG levels. Finally, P<sub>aw</sub> is the adaptive weight prediction
value. As shown in Figure 3, the black points in the period from time
D to F are the predictions produced by the adaptive weight prediction
algorithm.</p>
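      <p>A minimal sketch of adaptive weight prediction is given below, assuming the weighting described in the text: the prediction stays close to the last CGM value for short gaps and moves towards the previous-day average as the gap grows. The indexing convention (g is a list on the 5-minute grid and t is the sample index being predicted) and the function name are assumptions, and missing values inside the previous-day window are not handled.</p>
      <preformat>
```python
def adaptive_weight_prediction(g, t, g_last, n_miss, c=68, n=1, day=288):
    """Blend the last CGM value with the previous-day average.

    g      : CGM values indexed by sample number on the 5-minute grid
    t      : index of the sample being predicted (assumed convention)
    g_last : last CGM value before the gap
    n_miss : number of missing samples since g_last
    """
    f = n_miss / (n_miss + c)                        # adaptive weight factor
    window = [g[t - n_back] for n_back in range(day - n, day + n + 1)]
    g_av = sum(window) / (2 * n + 1)                 # same time on the previous day
    return f * g_av + (1 - f) * g_last


g = [100.0] * 14
g[11], g[12], g[13] = 150.0, 160.0, 170.0            # samples 289, 288, 287 steps before t
print(adaptive_weight_prediction(g, t=300, g_last=120.0, n_miss=68))  # 140.0
```
      </preformat>
      <p>With c = 68, the two terms receive equal weight once 68 samples (about 5.7 hours) are missing; for shorter gaps the last measured value dominates.</p>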
      <p>[Figure 3. Panel title: Subject 567 - PH=30 - 22th February,2027. Vertical axis: Blood Glucose (mg/dL). Legend: Raw data, Model Predictions, Adaptive Weight Predictions, Remain Prediction, Back Extrapolations.]</p>
      <p>When the first CGM value appears after the missing data, that value
is used directly as the predicted value for the required
prediction horizon; this algorithm is therefore called remain prediction. As shown
by the sky blue point in Figure 3, the blood glucose value at time D
was the prediction value at time G.</p>
      <p>Then, when two CGM values had appeared after the missing data, as
shown by the BG values at times D and E in Figure 3,
a reverse first-order Taylor series extrapolation
was performed based on these two points. The extrapolated data and the average historical
data before the missing data were then weighted and summed to ensure
the smoothness of the filled data. The green points in Figure 3 are
the backward-extrapolated data, which were used by the MS-LSTM
model to predict the BG level after time G.</p>
    </sec>
    <sec id="sec-8">
      <title>RESULTS AND ANALYSIS</title>
      <p>The performance of the model was evaluated by the root mean square
error (RMSE) and mean absolute error (MAE) between the
predictions and the original test data.</p>
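      <p>
        <disp-formula>
          <label>(5)</label>
          <tex-math><![CDATA[RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (\hat{y}_i - y_i)^{2}}]]></tex-math>
        </disp-formula>
        <disp-formula>
          <label>(6)</label>
          <tex-math><![CDATA[MAE = \frac{1}{N} \sum_{i=1}^{N} |\hat{y}_i - y_i|]]></tex-math>
        </disp-formula>
      </p>
      <p>where ŷ<sub>i</sub> is the predicted BG value, y<sub>i</sub> is the target value, and N
represents the size of the testing dataset. Note that the extrapolated
values of BG were removed when evaluating the performance of the
model, which guarantees that the number of predictions equals the number of
test data.</p>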
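      <p>For reference, equations (5) and (6) can be computed as in the following sketch. It uses only the Python standard library; the example values are made up for illustration and are not results from the paper.</p>
      <preformat>
```python
import math


def rmse(pred, target):
    """Root mean square error between predictions and targets, equation (5)."""
    return math.sqrt(sum((p - y) ** 2 for p, y in zip(pred, target)) / len(target))


def mae(pred, target):
    """Mean absolute error between predictions and targets, equation (6)."""
    return sum(abs(p - y) for p, y in zip(pred, target)) / len(target)


pred = [101, 99, 107, 93]
target = [100, 100, 100, 100]
print(rmse(pred, target))  # 5.0
print(mae(pred, target))   # 4.0
```
      </preformat>
      <p>The example shows why RMSE is never smaller than MAE: the two large errors of 7 weigh more heavily after squaring.</p>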
      <p>Following the preceding steps, the results of four independent
experiments are summarized in Table 5, where SD represents the
standard deviation. All subjects used the same experimental
parameters, but the RMSE of each patient varied from 15 to 22. The
smallest RMSE is 15.871 for patient 596, and the largest
RMSE is 21.934 for patient 567. The prediction results are shown
in Figures 4 and 5. It is worth noting that the standard deviation of the average RMSE of
the MS-LSTM model is only 0.061 for the 30-minute prediction horizon,
which reflects the robustness of the model.</p>
      <p>[Figures 4 and 5. Recovered panel title: Subject 596 - PH=30. Vertical axis: Blood Glucose (mg/dL). Horizontal axis: Time Index. Legend: Raw data, Predictions.]</p>
      <p>Subject 567 has many consecutive spikes, which are the primary
source of the prediction error. Another source of prediction
error is the missing data, as shown by the predictions after the missing
data in Figure 6. Finally, a slight time delay is observed in the
prediction curve, which is also a problem for most prediction methods.</p>
      <p>RMSE and MAE values of the MS-LSTM model for 6 subjects.</p>
      <p>The CGM measurements contain noise because of physical
interference. We used the GPR model to detect and reconcile CGM
outliers as far as possible. However, only some severe outliers were
detected and reconciled because there was no judgment standard for
outliers. Many outliers remain in the raw CGM data, which
hinders the learning of the prediction model. Therefore,
denoising the CGM data to obtain high-quality data is very important for
improving the performance of the prediction model.</p>
      <p>[Figure. Horizontal axis: Time (h), 12:00 to 24:00. Legend: Raw data, Predictions.]</p>
    </sec>
    <sec id="sec-9">
      <title>CONCLUSION</title>
      <p>In this paper, the MS-LSTM network is developed to adaptively
characterize high-dimensional temporal dynamics and extract the
long-term and short-term features of glucose fluctuation. Meanwhile, a
multi-lag structure is designed for multiple variables, which can
extract the dependence between different variables and blood glucose
fluctuations more effectively. The long-term sparse temporal data is
encoded and compressed to make it suitable for efficient learning by the
model. The mean value of the RMSE for 6 subjects is 19.048, with
a standard deviation of 0.061, for the 30-minute PH. Missing data
and rapid fluctuations in blood glucose levels are the two main
factors that affect the prediction performance of the model.</p>
    </sec>
    <sec id="sec-10">
      <title>FUNDING REFERENCES</title>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>This research was supported by the National Natural Science Foundation of China (No. 61973067 and No. 61903071).</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>R.M.</given-names>
            <surname>Bergenstal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.C.</given-names>
            <surname>Klonoff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.K.</given-names>
            <surname>Garg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.W.</given-names>
            <surname>Bode</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Meredith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.H.</given-names>
            <surname>Slover</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.J.</given-names>
            <surname>Ahmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.W.</given-names>
            <surname>Welsh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.B.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.R.</given-names>
            <surname>Kaufman</surname>
          </string-name>
          , and AI-HS Group, '
          <article-title>Threshold-based insulin-pump interruption for reduction of hypoglycemia'</article-title>
          ,
          <source>New England Journal of Medicine</source>
          ,
          <volume>369</volume>
          ,
          <fpage>224</fpage>
          -
          <lpage>232</lpage>
          , (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>B.</given-names>
            <surname>Buckingham</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Cameron</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Calhoun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.M.</given-names>
            <surname>Maahs</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.M.</given-names>
            <surname>Wilson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.P.</given-names>
            <surname>Chase</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.W.</given-names>
            <surname>Bequette</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lum</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Sibayan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.W.</given-names>
            <surname>Beck</surname>
          </string-name>
          , '
          <article-title>Outpatient safety assessment of an in-home predictive low-glucose suspend system with type 1 diabetes subjects at elevated risk of nocturnal hypoglycemia'</article-title>
          ,
          <source>Diabetes Technology &amp; Therapeutics</source>
          ,
          <volume>15</volume>
          ,
          <fpage>622</fpage>
          -
          <lpage>627</lpage>
          , (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.W.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.Z.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Herrero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.Y.</given-names>
            <surname>Zhu</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Georgiou</surname>
          </string-name>
          , '
          <article-title>Dilated recurrent neural network for short-time prediction of glucose concentration'</article-title>
          ,
          <source>in CEUR Workshop Proceedings</source>
          , volume
          <volume>2148</volume>
          , pp.
          <fpage>69</fpage>
          -
          <lpage>73</lpage>
          , Stockholm, Sweden, (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>P.</given-names>
            <surname>Dua</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.J.</given-names>
            <surname>Doyle</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E.N.</given-names>
            <surname>Pistikopoulos</surname>
          </string-name>
          ,
          <article-title>'Multi-objective blood glucose control for type 1 diabetes'</article-title>
          ,
          <source>Medical &amp; Biological Engineering &amp; Computing</source>
          ,
          <volume>47</volume>
          ,
          <fpage>343</fpage>
          -
          <lpage>352</lpage>
          , (
          <year>2009</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Facchinetti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Favero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Sparacino</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Cobelli</surname>
          </string-name>
          , '
          <article-title>An online failure detection method of the glucose sensor-insulin pump system: Improved overnight safety of type-1 diabetic subjects'</article-title>
          ,
          <source>IEEE Transactions on Biomedical Engineering</source>
          ,
          <volume>60</volume>
          ,
          <fpage>406</fpage>
          -
          <lpage>416</lpage>
          , (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>E.I.</given-names>
            <surname>Georga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.C.</given-names>
            <surname>Protopappas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ardigo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Marina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Zavaroni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Polyzos</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.I.</given-names>
            <surname>Fotiadis</surname>
          </string-name>
          , '
          <article-title>Multivariate prediction of subcutaneous glucose concentration in type 1 diabetes patients based on support vector regression'</article-title>
          ,
          <source>IEEE Journal of Biomedical and Health Informatics</source>
          ,
          <volume>17</volume>
          ,
          <fpage>71</fpage>
          -
          <lpage>81</lpage>
          , (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [7]
          National Diabetes Data Group, '
          <article-title>Classification and diagnosis of diabetes mellitus and other categories of glucose intolerance'</article-title>
          ,
          <source>Diabetes</source>
          ,
          <volume>28</volume>
          ,
          <fpage>1039</fpage>
          -
          <lpage>1057</lpage>
          , (
          <year>1979</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>C.S.</given-names>
            <surname>Hughes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.D.</given-names>
            <surname>Patek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.D.</given-names>
            <surname>Breton</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B.P.</given-names>
            <surname>Kovatchev</surname>
          </string-name>
          , '
          <article-title>Hypoglycemia prevention via pump attenuation and red-yellow-green “traffic” lights using continuous glucose monitoring and insulin pump data'</article-title>
          ,
          <source>Journal of Diabetes Science &amp; Technology</source>
          ,
          <volume>4</volume>
          ,
          <fpage>1146</fpage>
          -
          <lpage>1155</lpage>
          , (
          <year>2010</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>D.P.</given-names>
            <surname>Kingma</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.L.</given-names>
            <surname>Ba</surname>
          </string-name>
          , '
          <article-title>Adam: A method for stochastic optimization'</article-title>
          ,
          <source>in 3rd International Conference on Learning Representations, ICLR 2015 - Conference Track Proceedings</source>
          , San Diego, CA, United States, (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>J.</given-names>
            <surname>Martinsson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Schliep</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Eliasson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Meijner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Persson</surname>
          </string-name>
          , and
          <string-name>
            <given-names>O.</given-names>
            <surname>Mogren</surname>
          </string-name>
          , '
          <article-title>Automatic blood glucose prediction with confidence using recurrent neural networks'</article-title>
          ,
          <source>in CEUR Workshop Proceedings</source>
          , volume
          <volume>2148</volume>
          , pp.
          <fpage>64</fpage>
          -
          <lpage>68</lpage>
          , Stockholm, Sweden, (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>C.</given-names>
            <surname>Midroni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. J.</given-names>
            <surname>Leimbigler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Baruah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kolla</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.J.</given-names>
            <surname>Whitehead</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Fossat</surname>
          </string-name>
          , '
          <article-title>Predicting glycemia in type 1 diabetes patients: Experiments with XGBoost'</article-title>
          ,
          <source>in CEUR Workshop Proceedings</source>
          , volume
          <volume>2148</volume>
          , pp.
          <fpage>79</fpage>
          -
          <lpage>84</lpage>
          , Stockholm, Sweden, (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>S.</given-names>
            <surname>Mirshekarian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Bunescu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Marling</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Schwartz</surname>
          </string-name>
          , '
          <article-title>Using LSTMs to learn physiological models of blood glucose behavior'</article-title>
          ,
          <source>in Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society</source>
          , pp.
          <fpage>2887</fpage>
          -
          <lpage>2891</lpage>
          , Jeju Island, Republic of Korea, (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>L.</given-names>
            <surname>Munkhdalai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Munkhdalai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.P.</given-names>
            <surname>Kwang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Amarbayasgalan</surname>
          </string-name>
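          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Erdenebaatar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.P.</given-names>
            <surname>Hyun</surname>
          </string-name>
          , and
          <string-name>
            <given-names>H.R.</given-names>
            <surname>Keun</surname>
          </string-name>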
          , '
          <article-title>An end-to-end adaptive input selection with dynamic weights for forecasting multivariate time series'</article-title>
          ,
          <source>IEEE Access</source>
          ,
          <volume>7</volume>
          ,
          <fpage>99099</fpage>
          -
          <lpage>99114</lpage>
          , (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>M.</given-names>
            <surname>Phillip</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Battelino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Atlas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Kordonouri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Bratina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Miller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Biester</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.M.</given-names>
            <surname>Avbelj</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Muller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Nimri</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Danne</surname>
          </string-name>
          , '
          <article-title>Nocturnal glucose control with an artificial pancreas at a diabetes camp'</article-title>
          ,
          <source>New England Journal of Medicine</source>
          ,
          <volume>368</volume>
          ,
          <fpage>824</fpage>
          -
          <lpage>833</lpage>
          , (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>J.C.</given-names>
            <surname>Pickup</surname>
          </string-name>
          , '
          <article-title>Insulin-pump therapy for type 1 diabetes mellitus'</article-title>
          ,
          <source>New England Journal of Medicine</source>
          ,
          <volume>366</volume>
          ,
          <fpage>1616</fpage>
          -
          <lpage>1624</lpage>
          , (
          <year>2012</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>J.Y.</given-names>
            <surname>Xie</surname>
          </string-name>
          and
          <string-name>
            <given-names>Q.</given-names>
            <surname>Wang</surname>
          </string-name>
          , '
          <article-title>Benchmark machine learning approaches with classical time series approaches on the blood glucose level prediction challenge'</article-title>
          ,
          <source>in CEUR Workshop Proceedings</source>
          , volume
          <volume>2148</volume>
          , pp.
          <fpage>97</fpage>
          -
          <lpage>102</lpage>
          , Stockholm, Sweden, (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>S.</given-names>
            <surname>Zavitsanou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mantalaris</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.C.</given-names>
            <surname>Georgiadis</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E.N.</given-names>
            <surname>Pistikopoulos</surname>
          </string-name>
          , '
          <article-title>In silico closed-loop control validation studies for optimal insulin delivery in type 1 diabetes'</article-title>
          ,
          <source>IEEE Transactions on Biomedical Engineering</source>
          ,
          <volume>62</volume>
          ,
          <fpage>2369</fpage>
          -
          <lpage>2378</lpage>
          , (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>C.</given-names>
            <surname>Zecchin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Facchinetti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Sparacino</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Cobelli</surname>
          </string-name>
          , '
          <article-title>Reduction of number and duration of hypoglycemic events by glucose prediction methods: a proof-of-concept in silico study'</article-title>
          ,
          <source>Diabetes Technology &amp; Therapeutics</source>
          ,
          <volume>15</volume>
          ,
          <fpage>66</fpage>
          -
          <lpage>77</lpage>
          , (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>