=Paper= {{Paper |id=Vol-2675/paper20 |storemode=property |title=Prediction of Blood Glucose Levels for People with Type 1 Diabetes using Latent-Variable-based Model |pdfUrl=https://ceur-ws.org/Vol-2675/paper20.pdf |volume=Vol-2675 |authors=Xiaoyu Sun,Mudassir Rashid,Mert Sevil,Nicole Hobbs,Rachel Brandt,Mohammad Reza Askari,Andrew Shahidehpour,Ali Cinar |dblpUrl=https://dblp.org/rec/conf/ecai/SunRSHBASC20 }} ==Prediction of Blood Glucose Levels for People with Type 1 Diabetes using Latent-Variable-based Model== https://ceur-ws.org/Vol-2675/paper20.pdf
Prediction of Blood Glucose Levels for People with Type 1
      Diabetes using Latent-Variable-based Model
       Xiaoyu Sun and Mudassir Rashid and Mert Sevil and Nicole Hobbs and Rachel Brandt and
                    Mohammad Reza Askari and Andrew Shahidehpour and Ali Cinar1


Abstract. Regulation of blood glucose concentrations (BGCs) is a                         insulin, and constraint conditions are taken into account when com-
tough burden for people living with type 1 diabetes mellitus (T1DM).                     puting the future insulin doses to be infused. The data-driven model-
People with T1DM must administer exogenous insulin to maintain                           ing techniques have been evaluated in many studies because of their
their BGC within an euglycemic range. Hyperglycemia (high BGC)                           computational tractability and identification efficiency.
and hypoglycemia (low BGC) can occur because of poor BGC man-                               The empirical modeling technologies for T1DM include linear
agement. Using recursively identified models to predict the future                       and nonlinear methods. For nonlinear models, artificial neural net-
BGC values opens novel possibilities for improving the BGC regu-                         works (ANN) [1], convolutional neural networks (CNN) [26], recur-
lation performance by adjusting the dose of insulin infusion, taking                     rent neural networks (RNN) [13, 3], and other machine learning and
rescue carbohydrates, or both. The BGC prediction model can also                         deep learning techniques [17, 11] have been used to model the glu-
benefit the development of artificial pancreas systems. In this paper,                   cose dynamics. However, the nonlinear models can only be trained
a latent variable (LV) based multivariable statistical modeling ap-                      with a large data set, and it can be difficult to develop personalized
proach is applied to model BGC dynamics and forecast future BGCs.                        models or to update the model parameters or structure. For the lin-
The LV-based model is a powerful linear method to build an empir-                        ear modeling methods, autoregressive model with exogenous inputs
ical model by using the collected data. The model is evaluated with                      (ARX), and autoregressive moving average model with exogenous
the Ohio T1DM dataset that contains BGCs from continuous glu-                            inputs (ARMAX) [7, 19, 25] are used to build personalized glucose
cose monitoring (CGM) sensors, basal and bolus insulin information                       prediction models with recursively updated model parameters where
from insulin pump, and additional information from wristbands or                         the future BGC values are modeled as a linear combination of his-
reported by the subject. The results indicate that the LV-based model                    torical measurements from sensors, administered insulin, and other
can predict future BGC values with high accuracy for prediction hori-                    information such as meal CHOs and exercise. Statistical methods
zons of 30 and 60 minutes.                                                               based on latent variables (LVs) are demonstrated to have powerful
                                                                                         abilities for various data analysis tasks, modeling, and process mon-
                                                                                         itoring [20, 21], and has been proven as a good alternative linear
1     Introduction                                                                       model for type 1 diabetes [24].
                                                                                            In this paper, a novel multivariate statistical method proposed by
People with type 1 diabetes mellitus (T1DM), an immune disorder                          Nelson and MacGregor [14, 8] where the score vector is estimated
where the pancreas does not produce insulin, must administer ex-                         using missing data imputation technique is applied to model the glu-
ogenous insulin either by injections several times every day or in-                      cose dynamics based on LVs derived from principal component anal-
fusion with an insulin pump to maintain their blood glucose con-                         ysis (PCA). There are three key steps in developing a BGC predic-
centrations (BGCs) within a safe range (70-180 mg/dL) [23]. With-                        tion model based on the LVs technique. First, a PCA is performed
out effective regulation, people with T1DM may suffer from several                       on the gathered data to decompose the data into a linear combina-
long-term complications caused by hyperglycemia (high BGC), such                         tion of scores and loadings. Then, the unobserved variables are esti-
as kidney failure, blindness, and the deterioration of cardiovascular                    mated as conditional mean values computed from the gathered data
health. They can also have low BGC (hypoglycemia), which may                             and new measurements. Finally, the score of new observed data is
cause dizziness, diabetes coma, or even death because of the lack of                     estimated using incomplete observations and the future BGC values
energy for the brain [4].                                                                are predicted. The rest of this paper is organized as follows: Section 2
   Artificial pancreas (AP) systems are proposed to provide more re-                     summarizes the pre-processing of the data set. The LV-based method
liable BGC management by automatically calculating the dose of in-                       is described and a glucose prediction model is developed in Section
sulin to infuse by using a closed-loop controller incorporating BGCs                     3. The Ohio T1DM data set [12] is used to assess the performance
measured by continuous glucose monitoring (CGM) sensors, histori-                        of the model and the results are given and discussed in Section 4.
cal infused insulin data, and other available information, such as meal                  Finally, some concluding remarks are provided in Section 5.
carbohydrates (CHOs) and exercise information [15, 9, 16, 18, 2, 10].
The closed-loop controller is the critical component in developing an
AP system. Model-based controllers have become the preferred op-                         2     Data Pre-processing
tion in recent in silico and clinical studies where the future BGCs                      2.1     Data
predicted by a model, historical CGM measurements, administered
                                                                                         The Ohio T1DM dataset provided by the Blood Glucose Level Pre-
1 Illinois Institute of Technology, USA, email: cinar@iit.edu
                                                                                         diction (BGLP) challenge records eight weeks of data collected from




    Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
six T1DM patients with subject ID: 540, 544, 552, 567, 584, and 596.
The data set contains BGC values measured by CGM sensors with                                  300
                                                                                                                                                       Raw CGM
a sampling time of 5 minutes, basal and bolus insulin information                                                                                      Filled CGM
from the insulin pump, information collected from wristbands, and
                                                                                               250
events (i.e., meal, exercise, work, illness, etc.) recorded by the sub-
jects themselves (see [12] for details). However, there is no evidence
on the accuracy of the subject-reported information, which may de-




                                                                                 BGL (mg/dL)
                                                                                               200
grade the reliability of glucose prediction model and the wristband
did not work continuously for a long period of time due to its lim-
ited battery power, which causes lots of missed data. Thus, only data                          150

collected from CGM sensors and insulin pumps is used in this study.
   The insulin infused with an insulin pump includes basal insulin,
                                                                                               100
given continuously since the defined start times, and bolus insulin,
which is usually given to mitigate the effects of the rapid raise of
BGC around meal times. Basal insulin is recorded as “basal” which                               50
is the basal insulin infusion rate and “temp basal” which defines the                            8000       8050      8100        8150         8200   8250          8300
                                                                                                                             samples (/5min)
basal insulin infusion rate in specific time intervals to achieve better
regulation of BGCs. There are three types of bolus insulin defined in                                Figure 1. Illustration of the filled gap in training data
the dataset: normal, normal dual and square dual. For “normal” bolus                                 (Subject ID: 540, 18 CGM measurements are missed)
insulin, a certain dose of insulin is infused immediately to the patient,
while for “normal dual” bolus insulin, a certain percentage of bolus
is given to the subject up front and the remaining insulin amount is        where each row can be considered as an observation and each column
given gradually over a longer time duration. In this study, half of         is the values of a single variable.
the dose is supposed to be given to the patient immediately and the
remaining half is infused over the following 30 minutes. For “square        3         Methods
dual” bolus insulin, the bolus insulin is administered continuously
in the given time interval. The bolus and basal insulin values are          A multivariable statistical technique based on LVs [14, 8] is devel-
converted to U/min and aligned to match the CGM sampling time.              oped to predict the future BGC values, where a LV model is devel-
   The insulin on board (IOB), which represents the insulin that            oped using the PCA algorithm. Then, the conditional mean values are
remains active within the body, is calculated from a physiological          estimated from the same distribution, and future BGCs are predicted
model [22] and is found useful in some previous works [1, 6]. The           with LVs and incomplete observations.
physiological model is represented as
                  dC1 (t)                                                   3.1                Latent Variable Model
                          = u (t) − Kdia C1 (t)                      (1)
                     dt                                                     Consider a data matrix X (N × M ) that contains N observations
               dC2 (t)                                                      (rows) and M variables (columns), then a linear combination of an
                       = Kdia (C1 (t) − C2 (t))                      (2)
                 dt                                                         observation x in dataset X can be written as t = p1 x1 +. . .+pM xM ,
                      IOB (t) = C1 (t) + C2 (t)                      (3)    where t is a new vector in the same space as x. The fundamental
                                                                            idea behind PCA is to find the loading vector p that maximizes the
where C1 and C2 define two insulin compartments, u (t) is the in-
                                                                            variation of t, thus the first LV of PCA model can be calculated by
sulin infusion, and Kdia is a constant related to the duration of in-
                                                                            solving the following problem:
sulin action, set as 0.0182.
    The dataset contains missing CGM measurements for in both the                                        arg max tT t = arg max pT X T Xp
                                                                                                                                                        
                                                                                                                                                                           (4)
training and testing sets that cause discontinuous CGM curves. The                                         kpk=1                  kpk=1
gaps that are not greater than 3 samples in the training dataset are
interpolated using first-order linear model. For the cases where more       where p is the vector of regression coefficients. Accordingly, the
than 3 samples are missed, a single variable statistical model similar      dataset X can be expressed as X = tpT + E with E denoting the
to the one used to modeling glucose dynamics is used. In the testing        residual matrix. We can get more components by solving the follow-
data set, with the intent of on-line implementation, only the historical    ing problem:
                                                                                                                       2
CGM measurements are used to fill the missing gaps until new the                                  arg max X − tpT                           (5)
                                                                                                                      kpk=1
CGM observations become available. Fig. 1 shows an example of the
filled gap in the training dataset (Subject ID 540).                            Traditionally, the successive progress is evaluated by

                                                                                                                     kXk2 − kEk2
2.2    Batch Generation                                                                                                          100%                                      (6)
                                                                                                                        kXk2
The pre-processed CGM readings (y) and calculated IOB (IOB)
are arranged as time series and a sliding window of 2 hours length is       which is referred to as the percentage of explained variation of t and
chosen to generate batch segments from the training data as                 it is fixed as 95% in this study. If the first A (A is much smaller than
                                                                          the rank of X) significant components can summarize sufficiently
           y (1)       ···   y (24)    IOB (1)      ···   IOB (24)          well the dataset X, then a LV model developed using PCA algorithm
          y (2)
Xtrain =              ···   y (25)    IOB (2)      ···   IOB (25)         can be expressed as
             ..         ..      ..       ..          ..      ..
                                                                   
              .          .       .        .           .       .                                X = t1 pT1 + t2 pT2 + ... + tA pTA + E = T1:A P1:A
                                                                                                                                              T
                                                                                                                                                  +E                       (7)
where T1:A             =    [t1 , . . . , tA ] (N × A) and P1:A =          3.3    Glucose prediction based on LV model
[p1 , . . . , pA ] (M × A) are now matrices containing A score
                                                                           An accurate glucose prediction model could benefit diabetes man-
vectors and A loading vectors, respectively.
                                                                           agement and significantly reduce the risk of hypoglycemia. To pre-
                                                                           dict the future 60 minutes glucose values, the preceding one hour of
3.2    Conditional Mean Replacement Method                                 data is assumed available and marked as z # . Then, the glucose pre-
                                                                           diction model can be developed and the subsequent future hour of
For a new test observation z that was not used in the model develop-       BGC values can be predicted online as follows:
ment but is drawn from the same distribution as observations in X,
the scores τ of the first A significant components of the new obser-       1. The batch data set Xtrain is generated for each subject using the
vation can then be calculated as                                              training dataset.
                                                                           2. While a new observation z # is available:
                                       T
                               τ1:A = P1:A z                        (8)     (a) Calculate the similarities between the new object z # and ob-
                                                                                servations in submatrix X # , select the most similar N obser-
   Consider a situation where only part of the object z is observed, it         vations from Xtrain to form a new data matrix X.
is nature to assume the first R variables are measured and the remain-
                                                                            (b) Develop PCA model using data matrix X as stated in (7).
ing (M − R) variables are unobserved, then without loss of gener-
ality, we have                                                              (c) Estimate the score vector τ of the new observation z # as stated
                                                                                in (11).
                                   # 
                                     z
                              z=
                                     z∗                                     (d) Predict the next 1 hour’s glucose values as stated in (12).

where z # is the observed variables and z ∗ represents the missing            The above process was implemented in MATLAB 2019b and
measurements. This induces the following partition in X:                   the future one hour’s BGC values (12 samples) can be forecasted
                                                                         by feeding the testing data to the model. The codes are available:
                        X=            X#       X∗                          https://github.com/xiaoyu1115/BGLP_2020.git.

where X # contains the first R columns of X and X ∗ is made up
                                                                           4     Results
of the remaining (M − R) columns. Correspondingly, the loading
matrix P can be partitioned as                                             In the clinical study, the training data from Ohio T1DM dataset with
                                                                         subject ID: 540, 544, 552, 567, 584, and 596 were first divided into
                                          P#                               two parts: the first 90% of data were used to develop PCA model and
                               P =
                                          P∗                               the model is tested with the remaining 10% of data to determine the
                                                                           number of observations that should be used in building the LV-based
where P # is the submatrix comprising the first R rows of P and the        glucose prediction models. Using this approach, the computational
remaining (M − R) rows are defined as matrix P ∗ . Then, the score         burden in the modeling progress can be reduced significantly. Feed-
vector τ1:A of the new observation z can be rewrite as                     ing both the processed training and testing data into the modeling
                                                                           process described in Section 3, the future glucose concentrations can
                              #T #     ∗T ∗
                     τ̂1:A = P1:A z + P1:A z                        (9)    be predicted with prediction horizons of up to 60 minutes. The root
                                                                           mean square errors (RMSEs) and mean absolute errors (MAEs) of
   For a given data matrix X and a new observation z with z # de-          the 6 subjects are summarized in Table 1. The prediction results are
noting the measured variables that follow the same distribution as         also analysed using Clark Error Grid (CEG) [5] and the percentages
observations in X, the missing variables z ∗ of z can be estimated as      of data distributed in Zone A to Zone D are summarized in Table 1
the expected values from the conditional normal distribution:              as well.                      v
                                                                                                   u  N
                           ∗
                                       ∗   #
                                                                                                  u1 X
                           ẑ = E z         z ,S                   (10)                    RM SE = t    (y (i) − ŷ (i))2                     (13)
                                                                                                         N
                                                                                                             i=1
   Substituting the expression into the estimator of score vector of
                                                                                                             N
the new observation yields:                                                                             1 X
                                                                                             M AE =         |y (i) − ŷ (i)|                  (14)
                                                    − #                                                N
                                 #T
                   τ̂1:A = Θ1:A P1:A S ##               z          (11)                                    i=1

                                                                           where y is the BGC values measured by CGM sensors, ŷ is the pre-
                                                                           dicted BGCs, and N is the total data points in the testing dataset for
                                              
               T             S ## S #∗
where S = X        X
                        =                    is the covariance matrix of   each subject.
                N −1          S ∗# S ∗∗
X and Θ is the square diagonal matrix consisted of the eigenvalues            In Table 1, the RMSE varies from 16.66 to 22.76 mg/dL for the
λ = [λ1 , . . . , λH ] in descending order and H is the rank of X.         prediction horizon of 30 minutes with an average value of 19.37
   Finally, the unmeasured variables z ∗ in z can then be calculated       mg/dL. The RMSE varies from 26.81 to 38.99 mg/dL for the pre-
from the estimated score vector along with the loading matrix:             diction horizon of 60 minutes with an average value of 32.59 mg/dL.
                                                                           It is reasonable to see this increase in RMSE because the unknown
                               ẑ ∗ = P1:A
                                       ∗
                                           τ̂1:A                   (12)    disturbances including meals and exercise that will influence the glu-
                                                                           cose dynamics significantly. The relationships between the predic-
which yields the future predictions given the past observed glucose        tion horizon and RMSE or MAE in Figure 2 also indicate that the
values.                                                                    prediction accuracy decreases with increasing prediction horizon. An
                                              Table 1.       RMSE (mg/dL), MAE (mg/dL) and CEG (%) results of the LV-based model (STD: standard deviation)


                                                                             PH=30 minutes                                                             PH=60 minutes
      Subject                       RMSE                MAE            Zone A          Zone B            Zone C        Zone D   RMSE      MAE       Zone A     Zone B      Zone C     Zone D
                   540                  20.76           15.23           84.92           12.55                0.03       2.50    38.99     29.75      57.84      35.58       0.24        6.35
                   544                  16.70           11.65           93.23            6.29                  0        0.48    26.81     19.77      78.77      19.67       0.07        1.48
                   552                  16.66           12.36           88.05           10.88                  0        1.06    29.41     23.08      64.97      31.63       0.09        3.32
                   567                  22.76           15.30           86.03           12.28                0.04       1.60    37.95     28.13      57.93      34.54       0.34        7.15
                   584                  22.22           15.86           85.11           13.87                  0        1.02    34.81     26.57      67.66      30.08       0.15        2.11
                   596                  17.12           12.15           90.11            8.31                  0        1.57    27.57     20.53      72.06      25.23       0.04        2.67
                   Mean                 19.37           13.76           87.91           10.70                0.01       1.30    32.59     24.64      66.54      29.46       0.15        3.85
                   STD                   2.87            1.89           3.27            2.87                 0.02       0.68     5.35      4.12      8.17       6.03        0.12        2.34


                                                                                                                           average of 87.91% data lie in zone A of CEG which indicates the
                                                                                                                           high clinical accuracy of the model with prediction horizon of 30
                                                                                                                           minutes. The percentage of data in zone B increase dramatically as
                        40          540                                                                                    the prediction horizon increases to 60 minutes.
                                    544
          RMSE(mg/dL)




                        30          552                                                                                       The forecasting performance of the model is better for subjects
                                    567
                                    584
                                                                                                                           544, 552, and 596 than subjects 540, 552, and 567. One of the rea-
                        20
                                    596                                                                                    son is there are less noise and sensor failures contained in the dataset
                        10
                                                                                                                           provided by subjects 544, 552, and 596. A filter might improve pre-
                         0                                                                                                 diction performance for this case. And many gaps are observed in
                             0          2           4             6             8           10          12
                                                          Predict Horizon (5min)                                           CGM measurements, which also deteriorate the prediction accuracy
                                                                                                                           of the model in which the interpolated CGM data were used to pre-
                        30          540
                                    544
                                                                                                                           dict the further BGC values.
          MAE(mg/dL)




                                    552                                                                                       The predicted BGC values for subject 544 with prediction horizon
                        20          567
                                    584                                                                                    of 30 minutes and 60 minutes are shown in Figure 3. For the pre-
                                    596
                        10                                                                                                 diction horizon of 30 minutes, the LV-based model can predict the
                                                                                                                           future BGC values with a comparable high accuracy, and most of the
                         0
                             0          2           4             6             8           10          12                 hyperglycemia and hypoglycemia events can be forecasted on time.
                                                          Predict Horizon (5min)                                           When the prediction horizon is increased to 60 minutes, the predic-
                                                                                                                           tion accuracy decreases, but is still acceptable. The prediction values
Figure 2. Relationship between prediction horizon and prediction accuracy                                                  can still provide an insight on the variation of BGC which can be
                                                                                                                           used to tune the insulin infusion rate as a reference.

                                                                                                                           5    Conclusion
                                                                                                                           In this paper, an online recursively identified LV-based modeling ap-
                  300
                                                    30-min-ahead prediction results                                        proach is developed to predict the future BGC values with predic-
                  250
                                                                                                 Measurements              tion horizons of 30 and 60 minutes. With the pre-processed dataset,
                                                                                                 Predictions
    BGC (mg/dL)




                                                                                                                           the LV-based model is developed to calculate the LVs using a small
                  200
                                                                                                                           submatrix of the training dataset when a new observation is avail-
                  150
                                                                                                                           able. The future blood glucose values are predicted as a linear com-
                  100
                                                                                                                           bination of estimated scores and loadings. The proposed model is
                        50
                         1500    1600       1700   1800    1900       2000    2100   2200    2300     2400      2500
                                                                                                                           evaluated with the Ohio T1DM dataset and the results demonstrate
                                                                                                                           the effectiveness of the model. Although the measurement noise is
                                                    60-min-ahead prediction results
                  300                                                                                                      weight-averaged in the LV-based model, it still has a significant in-
                                                                                                 Measurements
                  250                                                                            Predictions
                                                                                                                           fluence on modeling and prediction progress. Online denoising tech-
    BGC (mg/dL)




                  200
                                                                                                                           niques would be one of the future study directions that might improve
                  150
                                                                                                                           the prediction accuracy. Further, integrating other data fields with the
                                                                                                                           personalized physiological models is a potential approach to improve
                  100
                                                                                                                           the prediction performance in the future work.
                        50
                         1500    1600       1700   1800    1900       2000    2100   2200    2300     2400      2500
                                                             samples (/5min)
                                                                                                                           ACKNOWLEDGEMENTS
    Figure 3. BGC prediction for Subject 544 using LV-based model                                                          The work of Xiaoyu Sun was supported by the China Scholarship
                                                                                                                           Council under grant 201906080136. Funds provided by the Hyosung
                                                                                                                           S. R. Cho Endowed Chair at Illinois Institute of Technology to Ali
                                                                                                                           Cinar is gratefully acknowledged.
REFERENCES                                                                               lating glucose concentration under challenging conditions’, IEEE Con-
                                                                                         trol Systems Magazine, 38(1), 105–124, (2018).
                                                                                  [19]   Kamuran Turksoy, Laurie Quinn, Elizabeth Littlejohn, and Ali Cinar,
 [1] Arthur Bertachi, Lyvia Biagi, Iván Contreras, Ningsu Luo, and Josep                ‘Multivariable adaptive identification and control for artificial pancreas
     Vehı́, ‘Prediction of blood glucose levels and nocturnal hypoglycemia               systems’, IEEE Transactions on Biomedical Engineering, 61(3), 883–
     using physiological models and artificial neural networks’, in 3rd In-              891, (2013).
     ternational Workshop on Knowledge Discovery in Healthcare Data,              [20]   C. Undey and A. Cinar, ‘Statistical monitoring of multistage, multi-
     KDH@ IJCAI-ECAI 2018, 13 July 2018, pp. 85–90.                                      phase batch processes’, IEEE Control Systems Magazine, 22(5), 40–52,
 [2] Marc Breton, Anne Farret, Daniela Bruttomesso, Stacey Anderson,                     (2002).
     Lalo Magni, Stephen Patek, Chiara Dalla Man, Jerome Place, Susan             [21]   Cenk Ündey, Sinem Ertunç, Eric Tatara, Fouad Teymour, and Ali Çιnar,
     Demartini, and Simone Del Favero, ‘Fully integrated artificial pancreas             ‘Batch process monitoring and its application to polymerization sys-
     in type 1 diabetes: modular closed-loop glucose control maintains near              tems’, in Macromolecular Symposia, volume 206, pp. 121–134. Wiley
     normoglycemia’, Diabetes, 61(9), 2230–2237, (2012).                                 Online Library, (2004).
 [3] Jianwei Chen, Kezhi Li, Pau Herrero, Taiyu Zhu, and Pantelis Geor-           [22]   Malgorzata E Wilinska, Ludovic J Chassin, Helga C Schaller, Lukas
     giou, ‘Dilated recurrent neural network for short-time prediction of glu-           Schaupp, Thomas R Pieber, and Roman Hovorka, ‘Insulin kinetics in
     cose concentration’, in 3rd International Workshop on Knowledge Dis-                type-1 diabetes: continuous and bolus delivery of rapid acting insulin’,
     covery in Healthcare Data, KDH@ IJCAI-ECAI 2018, 13 July 2018,                      IEEE Transactions on Biomedical Engineering, 52(1), 3–12, (2005).
     pp. 69–73.                                                                   [23]   Xia Yu, Mudassir Rashid, Jianyuan Feng, Nicole Hobbs, Iman Ha-
 [4] Ali CinAr, ‘Artificial pancreas systems: An introduction to the special             jizadeh, Sediqeh Samadi, Mert Sevil, Caterina Lazaro, Zacharie Mal-
     issue’, IEEE Control Systems, 38(1), 26–29, (2018).                                 oney, Elizabeth Littlejohn, Laurie Quinn, and Ali Cinar, ‘Online glu-
 [5] William L Clarke, ‘The original clarke error grid analysis (ega)’, Dia-             cose prediction using computationally efficient sparse kernel filtering
     betes technology & therapeutics, 7(5), 776–779, (2005).                             algorithms in type-1 diabetes’, IEEE Transactions on Control Systems
 [6] Iván Contreras, Arthur Bertachi, Lyvia Biagi, Josep Vehı́, and Silvia              Technology, 1–13, (2018).
     Oviedo, ‘Using grammatical evolution to generate short-term blood            [24]   Chunhui Zhao, Eyal Dassau, Lois Jovanovič, Howard C Zisser, Fran-
     glucose prediction models.’, in 3rd International Workshop on Knowl-                cis J Doyle III, and Dale E Seborg, ‘Predicting subcutaneous glucose
     edge Discovery in Healthcare Data, KDH@ IJCAI-ECAI 2018, 13 July                    concentration using a latent-variable-based statistical method for type
     2018, pp. 91–96, (2018).                                                            1 diabetes mellitus’, Journal of diabetes science and technology, 6(3),
 [7] Meriyan Eren-Oruklu, Ali Cinar, Lauretta Quinn, and Donald Smith,                   617–633, (2012).
     ‘Estimation of future glucose concentrations with subject-specific re-       [25]   Hong Zhao, Chunhui Zhao, Chengxia Yu, and Eyal Dassau, ‘Multiple
     cursive linear models’, Diabetes Technology and Therapeutics, 11(4),                order model migration and optimal model selection for online glucose
     243–253, (2009).                                                                    prediction in type 1 diabetes’, AIChE Journal, 64(3), 822–834, (2018).
 [8] Jesus Flores-Cerrillo and John F MacGregor, ‘Latent variable mpc             [26]   Taiyu Zhu, Kezhi Li, Pau Herrero, Jianwei Chen, and Pantelis Geor-
     for trajectory tracking in batch processes’, Journal of process control,            giou, ‘A deep learning algorithm for personalized blood glucose pre-
     15(6), 651–663, (2005).                                                             diction’, in 3rd International Workshop on Knowledge Discovery in
 [9] Iman Hajizadeh, Mudassir Rashid, Kamuran Turksoy, Sediqeh Samadi,                   Healthcare Data, KDH@ IJCAI-ECAI 2018, 13 July 2018, pp. 74–78.
     Jianyuan Feng, Mert Sevil, Nicole Hobbs, Caterina Lazaro, Zacharie
     Maloney, Elizabeth Littlejohn, et al., ‘Incorporating unannounced
     meals and exercise in adaptive learning of personalized models for mul-
     tivariable artificial pancreas systems’, Journal of Diabetes Science and
     Technology, 12(5), 953–966, (2018).
[10] Roman Hovorka, Valentina Canonico, Ludovic J. Chassin, Ulrich
     Haueter, Massimo Massi-Benedetti, Marco Orsini Federici, Thomas R.
     Pieber, Helga C. Schaller, Lukas Schaupp, Thomas Vering, and Mal-
     gorzata E. Wilinska, ‘Nonlinear model predictive control of glucose
     concentration in subjects with type 1 diabetes’, Physiological Measure-
     ment, 25(4), 905–920, (2004).
[11] Kezhi Li, John Daniels, Chengyuan Liu, Pau Herrero-Vinas, and Pan-
     telis Georgiou, ‘Convolutional recurrent neural networks for glucose
     prediction’, IEEE journal of biomedical and health informatics, (2019).
[12] Cindy Marling and Razvan Bunescu, ‘The ohiot1dm dataset for blood
     glucose level prediction: Update 2020’.
[13] John Martinsson, Alexander Schliep, Björn Eliasson, Christian Meijner,
     Simon Persson, and Olof Mogren, ‘Automatic blood glucose prediction
     with confidence using recurrent neural networks’, in 3rd International
     Workshop on Knowledge Discovery in Healthcare Data, KDH@ IJCAI-
     ECAI 2018, 13 July 2018, pp. 64–68.
[14] Philip RC Nelson, Paul A Taylor, and John F MacGregor, ‘Missing
     data methods in pca and pls: Score calculations with incomplete ob-
     servations’, Chemometrics and Intelligent Laboratory Systems, 35(1),
     45–65, (1996).
[15] Sediqeh Samadi, Kamuran Turksoy, Iman Hajizadeh, Jianyuan Feng,
     Mert Sevil, and Ali Cinar, ‘Meal detection and carbohydrate estimation
     using continuous glucose sensor data’, IEEE Journal of Biomedical and
     Health Informatics, 21(3), 619–627, (2017).
[16] Mert Sevil, Mudassir Rashid, Zacharie Maloney, Iman Hajizadeh,
     Sediqeh Samadi, Mohammad Reza Askari, Nicole Hobbs, Rachel
     Brandt, Minsun Park, Laurie Quinn, et al., ‘Determining physical ac-
     tivity characteristics from wristband data for use in automated insulin
     delivery systems’, IEEE Sensors Journal, (2020).
[17] Qingnan Sun, Marko V Jankovic, Lia Bally, and Stavroula G
     Mougiakakou, ‘Predicting blood glucose with an lstm and bi-lstm based
     deep neural network’, in 2018 14th Symposium on Neural Networks and
     Applications (NEUREL), pp. 1–5. IEEE, (2018).
[18] Kamuran Turksoy, Elizabeth Littlejohn, and Ali Cinar, ‘Multimodule,
     multivariable artificial pancreas for patients with type 1 diabetes: regu-