Data Fusion of Activity and CGM for Predicting Blood Glucose Levels*

Hoda Nemat1 and Heydar Khadem1 and Jackie Elliott2 and Mohammed Benaissa1

* This paper is submitted to the second Blood Glucose Level Prediction Challenge 2020, the 5th International Workshop on Knowledge Discovery in Healthcare Data.
1 Department of Electronic and Electrical Engineering, University of Sheffield, UK, email addresses: hoda.nemat@sheffield.ac.uk, h.khadem@sheffield.ac.uk, m.benaissa@sheffiels.ac.uk
2 Department of Oncology and Metabolism, University of Sheffield, UK, email address: j.elliott@sheffield.ac.uk
Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Abstract. This work suggests two methods—both relying on stacked regression and data fusion of CGM and activity—to predict the blood glucose level of patients with type 1 diabetes. Method 1 uses histories of CGM data appended with the average of activity data in the same histories to train three base regressions: a multilayer perceptron, a long short-term memory, and a partial least squares regression. In Method 2, histories of CGM and activity data are used separately to train the same base regressions. In both methods, the predictions from the base regressions are used as features to create a combined model, which is then used to make the final predictions. The results obtained show the effectiveness of both methods, with Method 1 providing slightly better results.

1 INTRODUCTION

The literature emphasises the importance of the management of type 1 diabetes mellitus (T1DM) in reducing complications associated with the disease [1], [2]. The key task in T1DM management is to control the blood glucose level (BGL) so that it remains within a normal range [3], [4].

The prediction of BGL from current and past information can be a useful contributor to this task [5]. BGL prediction could provide early warnings of inadequate glycaemic control and thereby prevent the occurrence of an adverse glycaemic status [6], [7].

BGL prediction models can be classified into three main groups: physiological models, data-driven models, and hybrid models. Data-driven models relate present and past information to future BGL. In this regard, machine learning and time series approaches have been widely used [5].

Many studies have proposed data-driven BGL prediction methodologies. Mirshekarian et al. [8], Bertachi et al. [9], Martinsson et al. [10], Zhu et al. [11], and Xie et al. [12], in separate studies, developed prediction models to forecast BGL with a prediction horizon of up to 60 minutes. Mirshekarian's model was based on a recurrent neural network (RNN) with long short-term memory (LSTM) units; CGM, insulin, meal, and activity information were its inputs. Bertachi used physiological models of insulin, carbohydrate, and activity on board to train an artificial neural network (ANN). Martinsson proposed an RNN model trained on historical blood glucose information to predict BGL at two horizons of 30 and 60 minutes. Zhu generated a dilated deep convolutional neural network fed by CGM, insulin, and carbohydrate intake as inputs. Xie applied an autoregression with exogenous inputs approach to predict BGL by exploiting current and past information of CGM data.

Physical activity is a critical factor in diabetes management. Therefore, investigation of activity data in BGL prediction models is encouraged [13]. However, developing models with high accuracy using activity and CGM data is challenging, and limited studies have been done in this area. Data fusion of activity and CGM data normally results in models with a performance not comparable with those using CGM data alone.

This paper proposes two novel CGM and activity data fusion methods that generate BGL prediction models with performance comparable with those using CGM data alone.

2 DATASET

To develop BGL prediction algorithms, we used the OhioT1DM dataset [14]. The dataset contains eight weeks' worth of data for 12 people with T1DM. The data of six patients were released in 2018 for the first BGL prediction challenge [15], and the data of six additional patients (referred to by IDs 540, 544, 552, 567, 584, and 596) were released for the second BGL prediction challenge in 2020 [14]. In this work, we used the data of the latter six patients.

The dataset includes data from a CGM sensor, a physical activity band, a physiological sensor, and self-reported life events. Among the different collected data, we explored the CGM and activity data, which were collected every 5 and 1 minutes, respectively. Detailed information about the sensors and devices, as well as the characteristics of the patients, has been published [14], [15].

In the dataset, there are three types of activity data: galvanic skin response, skin temperature, and magnitude of acceleration. In this work, we only used the magnitude of acceleration. Hereafter, for simplicity, 'magnitude of acceleration' is referred to as 'activity'.
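For orientation, the sketch below shows one way a patient's CGM and activity series could be read from the OhioT1DM XML files into pandas. It is an illustration only: the element names (glucose_level, acceleration), the file name, and the timestamp format are assumptions based on common descriptions of the dataset and should be checked against the released files.

```python
# Illustrative loader for one OhioT1DM patient file (not the authors' code).
# Assumes signals are stored as <glucose_level>/<acceleration> elements whose
# children are <event ts="..." value="..."/> entries.
import xml.etree.ElementTree as ET
import pandas as pd

def load_signal(xml_path, tag):
    """Return a time-indexed pandas Series for one signal in the XML file."""
    root = ET.parse(xml_path).getroot()
    events = root.find(tag)
    stamps = [e.attrib["ts"] for e in events]
    values = [float(e.attrib["value"]) for e in events]
    # Timestamp format is an assumption; adjust to match the data files.
    index = pd.to_datetime(stamps, format="%d-%m-%Y %H:%M:%S")
    return pd.Series(values, index=index, name=tag).sort_index()

# Example usage (file name illustrative):
# cgm = load_signal("540-ws-training.xml", "glucose_level")
# activity = load_signal("540-ws-training.xml", "acceleration")
```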
3 METHODOLOGY

This section presents information about data preprocessing and the methodologies developed for the prediction of BGL.

3.1 Preprocessing

Missing data in the training set are imputed using linear interpolation. For the testing set, on the other hand, linear extrapolation is used. This is to ensure that future data are not seen by the model and that the model can be used in a real-time application. In this way, we convert the CGM and activity data to regular time series without any missing data, at 5-minute and 1-minute intervals, respectively.

The next step was to unify the resolution of the CGM and activity data. To do so, we downsampled the activity time series to 5-minute intervals by keeping the activity data point nearest to each CGM data point and discarding the rest.

There was a considerable number of unavailable activity data points at the beginning and/or end of the training and/or testing sets. This was due to the difference in wear time between the CGM and activity sensors. For these points, the average of the activity data in the training set is used rather than linear interpolation or extrapolation. Table 1 shows the number of unavailable activity data points for each patient ID.

Table 1. The number of non-existent activity data points in the training and testing sets per data contributor.

Patient ID   Testing set   Training set
540          547           31
544          0             125
552          622           505
567          0             108
584          3             123
596          80            18

Another preprocessing step was to reframe the time series problem as a supervised learning task. To this end, the time series data were transformed into samples with lag observations as input and future observations as output. We use a rolling window with a history length of 6 or 12 data points for the input, carrying 30 or 60 minutes of history, respectively. The output of each sample is a vector of 6 or 12 data points, corresponding to prediction horizons of 30 and 60 minutes, respectively.
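As an illustration only, the sketch below shows one way this preprocessing could be implemented with pandas and NumPy, assuming the CGM and activity signals are available as time-indexed Series (as in the loading sketch above). The forward-only linear extrapolation and the helper names are our assumptions; the authors' released code may differ.

```python
# Illustrative preprocessing sketch (not the authors' code): regularise the
# series, align activity to the CGM grid, and reframe as supervised learning.
import numpy as np
import pandas as pd

def extrapolate_forward(series):
    """Fill gaps with a line through the last two observed points, so that
    no future values leak into the filled testing data (assumption)."""
    v = series.to_numpy(dtype=float)
    for i in np.flatnonzero(np.isnan(v)):
        past = np.flatnonzero(~np.isnan(v[:i]))
        if len(past) >= 2:
            x1, x2 = past[-2], past[-1]
            v[i] = v[x2] + (v[x2] - v[x1]) / (x2 - x1) * (i - x2)
        elif len(past) == 1:
            v[i] = v[past[-1]]
    return pd.Series(v, index=series.index)

def regularise(series, freq, is_training):
    """Resample to a fixed grid ('5T' for CGM, '1T' for activity) and impute:
    linear interpolation for training, forward extrapolation for testing."""
    grid = series.resample(freq).mean()
    if is_training:
        return grid.interpolate(method="linear", limit_direction="both")
    return extrapolate_forward(grid)

def make_windows(values, history, horizon):
    """Rolling window: `history` lags as input, the next `horizon` steps as output."""
    X, Y = [], []
    for i in range(len(values) - history - horizon + 1):
        X.append(values[i:i + history])
        Y.append(values[i + history:i + history + horizon])
    return np.asarray(X), np.asarray(Y)

# Example usage with activity aligned to the 5-minute CGM grid:
# cgm_5 = regularise(cgm, "5T", is_training=True)
# act_5 = regularise(activity, "1T", is_training=True).reindex(cgm_5.index, method="nearest")
# act_5 = act_5.fillna(act_5.mean())    # leading/trailing gaps filled with the training mean
# X, Y = make_windows(cgm_5.to_numpy(), history=6, horizon=6)  # 30-minute history and horizon
```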
3.2 Regression tools

Three base regressions and a stacked regression technique are used as tools to develop the final prediction models.

3.2.1 Base regressions

• Multilayer perceptron (MLP)
MLP [16] is an ANN that can be used for time series forecasting. In this work, a single-hidden-layer MLP model was used. The model comprised a dense layer of 100 nodes with a rectified linear unit (ReLU) activation function, followed by an output layer. Adam and mean absolute error were used as the optimiser and loss function, respectively. The learning rate was 0.01, and the model was fitted for 100 epochs.

• Long short-term memory (LSTM)
An RNN is an artificial neural network suitable for working with sequential data. We used a vanilla LSTM recurrent network [17] with vector output for the multi-step-ahead forecast. The model was composed of a hidden layer with 200 units, followed by a fully connected layer with 100 nodes and an output layer. Both hidden layers used ReLU as the activation function. Mean squared error was the loss function, and Adam was the optimiser. The model was trained for 100 epochs with a learning rate of 0.01.

• Partial least squares regression (PLSR)
PLSR carries considerable popularity in different applications, such as glucose sensing [18]. In this work, PLSR was applied as a regression tool. Different values were considered for the number of components, ranging from one to the length of the input window. Each time, the predicted residual sum of squares (PRESS) was calculated as follows:

\mathrm{PRESS} = \sum_{i=1}^{N} (y_i - \hat{y}_i)^2    (1)

where N is the size of the evaluation set, y_i is the reference value, and \hat{y}_i is the predicted value. The number of components (A) resulting in the minimum value of \mathrm{PRESS}/(N - A - 1) is then selected [19].
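The snippet below is a minimal sketch of these three base regressions with Keras and scikit-learn. The layer sizes, losses, optimiser, learning rate, and epoch count follow the description above; the function names and the use of a held-out split for the PRESS-based component selection are our assumptions, since the paper does not specify how that evaluation set is formed.

```python
# Illustrative sketch of the three base regressions (not the authors' code).
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM
from tensorflow.keras.optimizers import Adam
from sklearn.cross_decomposition import PLSRegression

def build_mlp(history, horizon):
    """Single hidden Dense layer of 100 ReLU nodes; MAE loss, Adam with lr 0.01."""
    model = Sequential([Dense(100, activation="relu", input_shape=(history,)),
                        Dense(horizon)])
    model.compile(optimizer=Adam(learning_rate=0.01), loss="mae")
    return model  # fit with model.fit(X, Y, epochs=100)

def build_lstm(history, horizon):
    """Vanilla LSTM (200 units) + Dense(100), both ReLU; MSE loss, Adam with lr 0.01.
    Inputs must be reshaped to (samples, history, 1)."""
    model = Sequential([LSTM(200, activation="relu", input_shape=(history, 1)),
                        Dense(100, activation="relu"),
                        Dense(horizon)])
    model.compile(optimizer=Adam(learning_rate=0.01), loss="mse")
    return model

def fit_plsr(X_train, Y_train, X_eval, Y_eval):
    """Select the number of components A by minimising PRESS / (N - A - 1),
    with PRESS computed on an evaluation set (how that set is formed is an assumption)."""
    best_model, best_score = None, np.inf
    n = len(Y_eval)
    for a in range(1, X_train.shape[1] + 1):
        pls = PLSRegression(n_components=a).fit(X_train, Y_train)
        press = float(np.sum((Y_eval - pls.predict(X_eval)) ** 2))
        score = press / (n - a - 1)
        if score < best_score:
            best_model, best_score = pls, score
    return best_model
```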
3.2.2 Stacked regression

Stacked regression is applied to enhance the performance of BGL prediction [20]. This technique uses the predictions from a number of models—first-level models—as features to train a new model—the second-level model. In this work, a stacked regression structure was employed in which the three base regressions described in Section 3.2.1 were set as the first-level models and a PLSR as the second-level model (Figure 1).

Figure 1. Diagram of the developed stacked regression: the predictions Ŷ1, Ŷ2, and Ŷ3 of the first-level models (MLP, LSTM, and PLSR) are stacked to form the training set of the second-level PLSR, which makes the final prediction Ŷ.

3.3 Prediction methods

We developed two different methods using the stacked regression structure described above to fuse CGM and activity data. Using these methods, models were created to predict the BGL of each patient for both horizons of 30 and 60 minutes. For each prediction horizon, two histories of 30 and 60 minutes were tried for training purposes.

3.3.1 Method 1

This method used the average value of the activity data, appended to the window of CGM data, to train the first-level models.

3.3.2 Method 2

In this method, the first-level models were trained twice: once using a history of CGM data and once using a history of activity data, thus producing six first-level models rather than three.
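To make the two fusion strategies concrete, the sketch below shows one way the first-level predictions could be assembled and fed to the second-level PLSR, reusing the helpers sketched above. The paper does not state how the first-level predictions used to train the second-level model are generated (for example, on held-out folds), so this sketch simply stacks predictions on the training data; all names are illustrative.

```python
# Illustrative sketch of Method 1 / Method 2 feature construction and stacking.
import numpy as np
from sklearn.cross_decomposition import PLSRegression

def method1_features(cgm_windows, act_windows):
    """Method 1: append the average activity of each window to the CGM window."""
    return np.hstack([cgm_windows, act_windows.mean(axis=1, keepdims=True)])

def stack_predictions(first_level_preds):
    """Concatenate the first-level prediction vectors into second-level features."""
    return np.hstack(first_level_preds)

# Method 1: three first-level models trained on CGM windows + mean activity.
# X1 = method1_features(cgm_X, act_X)
# preds = [mlp.predict(X1), lstm.predict(X1[..., None]), plsr.predict(X1)]
#
# Method 2: each base regression is trained twice (CGM-only windows and
# activity-only windows), giving six prediction vectors to stack instead of three.
#
# In both methods, the stacked predictions train a second-level PLSR that
# produces the final multi-step forecast (component count illustrative here;
# the paper selects it via the PRESS criterion):
# meta = PLSRegression(n_components=3).fit(stack_predictions(preds), Y)
# y_hat = meta.predict(stack_predictions(test_preds))
```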
3.4 Evaluation

In the OhioT1DM dataset, the last 10 days' worth of data for each contributor was allocated as the testing set and the rest as the training set [14]. We used the training and testing sets for training and evaluation purposes, respectively. Extrapolated data and the first 60 minutes of the testing set were excluded when calculating the evaluation metrics. The latter is because the testing set starts immediately after the training set, and the two are chronologically close to each other. Summarised statistics of the testing set for each patient are given in Table 2.

Table 2. The statistics of the patients' testing sets.

Patient ID   Original data points   Imputed data points   Evaluation data points
540          2896                   3066                  2884
544          2716                   3136                  2704
552          2364                   3950                  2352
567          2389                   2871                  2377
584          2665                   2995                  2653
596          2743                   3003                  2731

Root mean square error (RMSE) and mean absolute error (MAE) were calculated as follows and considered as evaluation metrics:

\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (y_i - \hat{y}_i)^2}    (2)

\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N} |y_i - \hat{y}_i|    (3)

where y_i, \hat{y}_i, and N have the same meaning as in (1).
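A small sketch of the metric computation under the exclusion rule described above follows; the boolean mask marking extrapolated points and the first 60 minutes of the testing set is assumed to be built during preprocessing.

```python
# Illustrative RMSE/MAE computation over the retained evaluation points only.
import numpy as np

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mae(y_true, y_pred):
    return float(np.mean(np.abs(y_true - y_pred)))

def evaluate(y_true, y_pred, keep_mask):
    """keep_mask is False for extrapolated samples and for the first 60 minutes
    of the testing set, which are excluded from the metrics (bookkeeping assumption)."""
    return rmse(y_true[keep_mask], y_pred[keep_mask]), mae(y_true[keep_mask], y_pred[keep_mask])
```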
4 RESULTS AND DISCUSSION

In this section, the RMSE and MAE results of the prediction models are provided for both prediction horizons of 30 and 60 minutes. Models whose performance depends on random initialisation were run five times, and the mean and standard deviation of the results are reported. The acronym PH denotes the prediction horizon in the tables.

4.1 Method 1

Table 3 displays the evaluation results of the first-level models of Method 1 when a history of 30 minutes is used for training. Based on the RMSE and MAE values, in both prediction horizons, LSTM had the best prediction performance for all patients except 584; for this patient, MLP had the best result. PLSR, as a simple linear regressor, produced results comparable to those of the non-linear neural network models.

Table 3. Evaluation results of the first-level models of Method 1 using a history of 30 minutes.

Patient ID   Model   RMSE (PH 30)   MAE (PH 30)    RMSE (PH 60)   MAE (PH 60)
540          PLSR    22.13          16.60          41.09          31.74
             MLP     21.96 ± 0.29   16.46 ± 0.21   40.53 ± 0.38   30.95 ± 0.33
             LSTM    21.22 ± 0.12   15.82 ± 0.08   39.65 ± 0.28   30.38 ± 0.28
544          PLSR    18.08          13.33          31.80          24.71
             MLP     17.95 ± 0.07   12.87 ± 0.13   31.61 ± 0.32   24.27 ± 0.71
             LSTM    17.62 ± 0.20   12.60 ± 0.32   30.79 ± 0.29   23.02 ± 0.67
552          PLSR    16.76          12.77          30.23          23.67
             MLP     16.96 ± 0.19   12.69 ± 0.21   30.38 ± 0.36   23.42 ± 0.61
             LSTM    16.44 ± 0.17   12.18 ± 0.22   29.89 ± 0.47   22.53 ± 0.40
567          PLSR    20.97          15.04          37.41          28.15
             MLP     21.44 ± 0.63   15.60 ± 0.76   37.96 ± 1.45   29.01 ± 1.35
             LSTM    20.61 ± 0.20   14.64 ± 0.32   36.36 ± 0.31   27.08 ± 0.43
584          PLSR    22.07          16.21          36.85          27.85
             MLP     21.60 ± 0.12   15.61 ± 0.14   36.54 ± 0.74   27.27 ± 0.89
             LSTM    21.55 ± 0.26   15.58 ± 0.27   36.75 ± 1.69   27.62 ± 2.08
596          PLSR    17.79          12.76          29.63          22.05
             MLP     18.01 ± 0.16   12.99 ± 0.17   29.75 ± 0.69   21.93 ± 0.38
             LSTM    17.23 ± 0.17   12.25 ± 0.29   29.17 ± 0.22   21.29 ± 0.32
Average      PLSR    19.63          14.45          34.50          26.36
             MLP     19.65 ± 0.24   14.37 ± 0.27   34.46 ± 0.66   26.14 ± 0.71
             LSTM    19.11 ± 0.19   13.85 ± 0.25   33.77 ± 0.54   25.32 ± 0.70

Table 4 shows the evaluation results of the second-level model of Method 1 when a history of 30 minutes was used for training. Comparing these results with those in Table 3, the second-level model achieved better prediction performance than all the first-level models for all patients and both prediction horizons. This means that the stacked regression technique helped improve prediction performance.

Table 4. Evaluation results of the second-level model of Method 1 using a history of 30 minutes.

Patient ID   RMSE (PH 30)   MAE (PH 30)    RMSE (PH 60)   MAE (PH 60)
540          21.19 ± 0.07   15.73 ± 0.09   39.41 ± 0.09   30.04 ± 0.15
544          17.40 ± 0.08   12.45 ± 0.08   30.48 ± 0.07   22.90 ± 0.08
552          16.25 ± 0.07   12.02 ± 0.05   29.32 ± 0.09   22.21 ± 0.02
567          20.40 ± 0.07   14.44 ± 0.07   36.12 ± 0.02   27.12 ± 0.07
584          21.54 ± 0.06   15.62 ± 0.06   36.27 ± 0.15   27.17 ± 0.16
596          17.17 ± 0.10   12.13 ± 0.09   28.77 ± 0.26   20.80 ± 0.17
Average      18.99 ± 0.08   13.73 ± 0.07   33.39 ± 0.12   25.04 ± 0.11

Table 5 displays the evaluation results of the first-level models of Method 1 when a history of 60 minutes was used for training. As the results show, for both prediction horizons, LSTM had the best performance for the majority of the patients. Overall, PLSR provided the second-best results.

Table 5. Evaluation results of the first-level models of Method 1 using a history of 60 minutes.

Patient ID   Model   RMSE (PH 30)   MAE (PH 30)    RMSE (PH 60)   MAE (PH 60)
540          PLSR    22.10          16.58          41.10          31.76
             MLP     21.58 ± 0.28   16.12 ± 0.22   40.53 ± 1.23   31.12 ± 0.91
             LSTM    21.11 ± 0.18   15.56 ± 0.11   39.18 ± 0.37   30.00 ± 0.33
544          PLSR    18.09          13.33          31.83          24.71
             MLP     18.09 ± 0.03   13.05 ± 0.08   32.34 ± 1.00   24.80 ± 1.76
             LSTM    18.04 ± 0.35   13.06 ± 0.48   30.79 ± 0.39   23.15 ± 0.68
552          PLSR    16.79          12.78          30.25          23.67
             MLP     17.58 ± 0.46   13.39 ± 0.70   30.16 ± 0.43   22.89 ± 0.14
             LSTM    16.97 ± 0.78   12.59 ± 0.55   30.69 ± 0.70   23.19 ± 0.55
567          PLSR    20.99          15.03          37.51          28.21
             MLP     21.71 ± 0.92   15.80 ± 1.06   37.34 ± 0.78   28.02 ± 0.76
             LSTM    20.74 ± 0.50   14.75 ± 0.59   36.67 ± 0.98   27.52 ± 1.06
584          PLSR    22.04          16.19          37.04          27.97
             MLP     22.10 ± 0.25   15.98 ± 0.23   37.13 ± 0.74   27.68 ± 0.89
             LSTM    21.66 ± 0.10   15.63 ± 0.12   36.76 ± 0.46   27.18 ± 0.44
596          PLSR    17.62          12.66          29.48          21.97
             MLP     18.05 ± 0.29   12.71 ± 0.27   29.71 ± 0.35   21.83 ± 0.21
             LSTM    17.58 ± 0.19   12.55 ± 0.34   29.55 ± 0.52   21.63 ± 0.34
Average      PLSR    19.60          14.43          34.53          26.38
             MLP     19.85 ± 0.37   14.51 ± 0.43   34.54 ± 0.75   26.06 ± 0.78
             LSTM    19.35 ± 0.35   14.02 ± 0.36   33.94 ± 0.57   25.44 ± 0.57

The evaluation results of the second-level model of Method 1 using a 60-minute history are shown in Table 6. In comparison with Table 5, it can be observed that the stacked regression technique improved the prediction performance for all patients for this history, too. Also, in comparison with Table 4, Method 1 had a better overall performance when it used a history of 30 minutes than a history of 60 minutes.

Table 6. Evaluation results of the second-level model of Method 1 using a history of 60 minutes.

Patient ID   RMSE (PH 30)   MAE (PH 30)    RMSE (PH 60)   MAE (PH 60)
540          20.98 ± 0.13   15.50 ± 0.14   39.05 ± 0.17   29.68 ± 0.18
544          17.66 ± 0.09   12.66 ± 0.08   30.42 ± 0.36   22.82 ± 0.42
552          16.30 ± 0.09   12.04 ± 0.06   29.38 ± 0.24   22.26 ± 0.21
567          20.52 ± 0.17   14.54 ± 0.10   36.52 ± 0.10   27.31 ± 0.14
584          21.62 ± 0.17   15.63 ± 0.08   37.01 ± 0.28   27.64 ± 0.20
596          17.45 ± 0.08   12.27 ± 0.09   28.92 ± 0.27   20.92 ± 0.19
Average      19.09 ± 0.12   13.77 ± 0.09   33.55 ± 0.24   25.11 ± 0.23

4.2 Method 2

In this section, the evaluation results of Method 2 are presented. To be concise, only the results of the second-level model, which are the final predictions of the method, are reported.

Table 7 shows the evaluation results of Method 2 using a 30-minute history. Comparing these results with those in Table 4, the prediction performance of Method 2 was comparable with that of Method 1 for all patients except patient 552. This may be due to the large number of missing activity data points in this patient's data (as can be seen in Table 1).

Table 7. Evaluation results of Method 2 using a history of 30 minutes.

Patient ID   RMSE (PH 30)   MAE (PH 30)    RMSE (PH 60)   MAE (PH 60)
540          21.26 ± 0.09   15.89 ± 0.07   39.48 ± 0.16   30.26 ± 0.19
544          17.59 ± 0.11   12.62 ± 0.12   30.68 ± 0.15   23.14 ± 0.20
552          19.85 ± 4.51   12.65 ± 0.46   35.70 ± 3.32   23.76 ± 0.40
567          20.52 ± 0.12   14.49 ± 0.12   36.39 ± 0.20   27.14 ± 0.19
584          21.72 ± 0.17   15.78 ± 0.10   36.53 ± 0.13   27.45 ± 0.08
596          17.24 ± 0.11   12.19 ± 0.07   28.83 ± 0.11   21.03 ± 0.13
Average      19.70 ± 0.85   13.94 ± 0.16   34.60 ± 0.68   25.46 ± 0.20

Table 8 lists the evaluation results of Method 2 using a history of 60 minutes. Comparing these results with those in Table 6, the results of the two methods were close to each other. Also, comparing these results with those in Table 7, Method 2 made better predictions using a history of 60 minutes than a history of 30 minutes.

Table 8. Evaluation results of Method 2 using a history of 60 minutes.

Patient ID   RMSE (PH 30)   MAE (PH 30)    RMSE (PH 60)   MAE (PH 60)
540          20.89 ± 0.05   15.49 ± 0.11   39.30 ± 0.35   29.80 ± 0.21
544          17.70 ± 0.14   12.68 ± 0.13   30.71 ± 0.22   23.25 ± 0.29
552          16.73 ± 0.51   12.33 ± 0.18   34.67 ± 3.51   23.47 ± 0.58
567          20.57 ± 0.14   14.63 ± 0.11   36.70 ± 0.30   27.48 ± 0.18
584          21.72 ± 0.06   15.71 ± 0.05   36.85 ± 0.09   27.69 ± 0.13
596          17.53 ± 0.21   12.26 ± 0.18   28.88 ± 0.21   21.02 ± 0.17
Average      19.19 ± 0.18   13.85 ± 0.13   34.52 ± 0.78   25.45 ± 0.26
5 SUMMARY AND CONCLUSION

This work contributes to the prediction of BGL by proposing two methodologies for data fusion of CGM and activity using stacked regression.

In the first method, the average value of activity data appended to a window of CGM data was used as input to train the prediction models. Initially, three base regression models consisting of MLP, LSTM, and PLSR were trained. Subsequently, predictions from these base models were used as features to train a new PLSR model, which then made the final predictions.

In the second method, the same base regressions were trained once using windows of activity data and once using windows of CGM data. The predictions of all trained base models were then fed as features to a new PLSR model for its training. This new PLSR was used to make the refined predictions.

The results obtained show that Method 1 (average value of activity data appended to the window of CGM data) had a slightly better performance than Method 2 (first-level models trained twice, once with a history of CGM data and once with a history of activity data). Overall, Method 1 using a history of 30 minutes had the best results, providing an average RMSE of 18.99 and 33.39 for the prediction horizons of 30 and 60 minutes, respectively.

6 SOFTWARE AND CODE

To implement the models, we used Python 3.6, TensorFlow 1.15.0, and Keras 2.2.5. The Pandas, NumPy, and Sklearn packages of Python were also used. The code was run on a commodity laptop. The code of our implementation is available at:
https://gitlab.com/Hoda-Nemat/data-fusion-stacking.git

REFERENCES

[1] G. S. Jeha et al., "Continuous glucose monitoring and the reality of metabolic control in preschool children with type 1 diabetes," Diabetes Care, vol. 27, no. 12, pp. 2881–2886, 2004.
[2] L. S. Schilling et al., "A new self-report measure of self-management of type 1 diabetes for adolescents," Nurs. Res., vol. 58, no. 4, p. 228, 2009.
[3] E. R. Seaquist et al., "Hypoglycemia and diabetes: A report of a workgroup of the American Diabetes Association and the Endocrine Society," J. Clin. Endocrinol. Metab., vol. 98, no. 5, pp. 1845–1859, 2013.
[4] K. Makris and L. Spanou, "Is there a relationship between mean blood glucose and glycated hemoglobin?," J. Diabetes Sci. Technol., vol. 5, no. 6, pp. 1572–1583, 2011.
[5] A. Z. Woldaregay, E. Årsand, T. Botsis, D. Albers, L. Mamykina, and G. Hartvigsen, "Data-driven blood glucose pattern classification and anomalies detection: machine-learning applications in type 1 diabetes," J. Med. Internet Res., vol. 21, no. 5, p. e11030, 2019.
[6] J. Vehí, I. Contreras, S. Oviedo, L. Biagi, and A. Bertachi, "Prediction and prevention of hypoglycaemic events in type-1 diabetic patients using machine learning," Health Informatics J., p. 1460458219850682, 2019.
[7] C. Berra et al., "Hypoglycemia and hyperglycemia are risk factors for falls in the hospital population," Acta Diabetol., vol. 56, no. 8, pp. 931–938, 2019.
[8] S. Mirshekarian, R. Bunescu, C. Marling, and F. Schwartz, "Using LSTMs to learn physiological models of blood glucose behavior," Proc. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. (EMBS), pp. 2887–2891, 2017.
[9] A. Bertachi, L. Biagi, I. Contreras, N. Luo, and J. Vehí, "Prediction of blood glucose levels and nocturnal hypoglycemia using physiological models and artificial neural networks," in 3rd International Workshop on Knowledge Discovery in Healthcare Data, 2018, pp. 85–90.
[10] J. Martinsson, A. Schliep, B. Eliasson, C. Meijner, S. Persson, and O. Mogren, "Automatic blood glucose prediction with confidence using recurrent neural networks," in 3rd International Workshop on Knowledge Discovery in Healthcare Data, vol. 2148, 2018, pp. 64–68.
[11] T. Zhu, K. Li, P. Herrero, J. Chen, and P. Georgiou, "A deep learning algorithm for personalized blood glucose prediction," in 3rd International Workshop on Knowledge Discovery in Healthcare Data, 2018, pp. 64–78.
[12] J. Xie and Q. Wang, "Benchmark machine learning approaches with classical time series approaches on the blood glucose level prediction challenge," in 3rd International Workshop on Knowledge Discovery in Healthcare Data, 2018, pp. 97–102.
[13] M. H. Jensen, C. Dethlefsen, P. Vestergaard, and O. Hejlesen, "Prediction of nocturnal hypoglycemia from continuous glucose monitoring data in people with type 1 diabetes: A proof-of-concept study," J. Diabetes Sci. Technol., vol. 14, no. 2, pp. 250–256, 2020.
[14] C. Marling and R. Bunescu, "The OhioT1DM dataset for blood glucose level prediction: Update 2020," in 5th International Workshop on Knowledge Discovery in Healthcare Data, 2020.
[15] C. Marling and R. C. Bunescu, "The OhioT1DM dataset for blood glucose level prediction," in 3rd International Workshop on Knowledge Discovery in Healthcare Data, 2018, pp. 60–63.
[16] F. Murtagh, "Multilayer perceptrons for classification and regression," Neurocomputing, vol. 2, no. 5–6, pp. 183–197, 1991.
[17] S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Comput., vol. 9, no. 8, pp. 1735–1780, 1997.
[18] H. Khadem, M. R. Eissa, H. Nemat, O. Alrezj, and M. Benaissa, "Classification before regression for improving the accuracy of glucose quantification using absorption spectroscopy," Talanta, vol. 211, 2020.
[19] S. Wold, M. Sjöström, and L. Eriksson, "PLS-regression: a basic tool of chemometrics," Chemom. Intell. Lab. Syst., vol. 58, no. 2, pp. 109–130, 2001.
[20] L. Breiman, "Stacked regressions," Mach. Learn., vol. 24, no. 1, pp. 49–64, 1996.