On the Predictability of non-CGM Diabetes Data for Personalized Recommendation

Tu Nguyen                          Markus Rokicki
L3S Research Center                L3S Research Center
30167 Hannover, Germany            30167 Hannover, Germany
tunguyen@l3s.de                    rokicki@l3s.de

Abstract

With continuous glucose monitoring (CGM), data-driven models for blood glucose prediction have been shown to be effective in related work. However, such CGM systems are not always available, e.g., for a patient at home. In this work, we conduct a study on 9 patients and examine the predictability of data-driven (i.e., machine learning) models for patient-level blood glucose prediction, with measurements taken only periodically (i.e., every several hours). To this end, we propose several post-prediction methods that account for the noisy nature of these data and marginally improve the performance of the end system.

1 Introduction

Diabetes mellitus has been a major global problem for a long time; it is reported that there are over 400 million patients worldwide 1. Knowledge of the glucose concentration in the blood is a key aspect of the diagnosis and treatment of diabetes. The use of signal processing techniques on glucose data started a long time ago, when glucose time-series for a given individual could be obtained in lab studies from blood samples drawn at a sufficiently high rate. In particular, related work employed not only linear (e.g., correlation and spectrum analysis, peak detection) but also nonlinear (e.g., approximate entropy) methods to investigate oscillations present in glucose (and insulin) time-series obtained during hospital monitoring by drawing blood samples every 10-15 min for up to 48 h [8]. For long-term (e.g., days or months) studies in these settings, researchers resorted to self-monitoring blood glucose (SMBG) data, i.e., approximately 3 samples per day obtained by the patients themselves using fingerstick glucose meters. The retrospective analysis of SMBG time-series was used by physicians, together with information taken from the patient's diary (e.g., insulin dosage, meal intake, physical exercise) and some glycaemic indexes (typically HbA1c), to assess glucose control and the effectiveness of a particular therapy [8].

With the support of continuous glucose monitoring (CGM) sensors, the development of new strategies for the treatment of diabetes has accelerated in recent years. In particular, CGM sensors can be integrated into 'online' recommender systems that generate alerts when the glucose concentration is predicted to exceed the normal-range thresholds. Recently, many complex data-driven prediction models [4, 7, 3, 5] have been built on CGM data and shown to be effective. These data-driven (machine learning / deep learning) models are data-hungry; hence, their performance on sparse, non-continuous data is still an open question. Moreover, CGM data are still not available for all diabetic patients, for many reasons 2, while a personalized, patient-level model trained on the same patient's own data is essential. In this work, we examine the performance of these machine learning approaches on our real, limited data from a group of diabetic patients. Our contributions are two-fold: (1) we provide a quantitative study of the predictability of machine-learned models on limited and sparse data; (2) we propose a prediction system that is robust to noisy data (based on prediction intervals).

1 https://www.diabetes.co.uk/diabetes-prevalence.html
2 http://time.com/4703099/continuous-glucose-monitor-blood-sugar-diabetes/

Copyright © CIKM 2018 for the individual papers by the papers' authors. Copyright © CIKM 2018 for the volume as a collection by its editors. This volume and its papers are published under the Creative Commons License Attribution 4.0 International (CC BY 4.0).

2 Dataset Overview

The data collection study was conducted from the end of February to the beginning of April 2017 and includes 9 patients who were given specially prepared smartphones. Measurements of carbohydrate consumption, blood glucose levels, and insulin intake were made with the Emperra Esysta system 3. Measurements of physical activities were obtained using the Google Fit app; we use only the step information (number of steps) for our study.

We briefly describe some basic patient information. Half of the patients are female, and ages range from 17 to 66, with a mean age of 41.8 years. Body weight, according to BMI (body mass index), is normal for half of the patients; four are overweight and one is obese. The mean BMI value is 26.9. Only one of the patients suffers from type 2 diabetes, and all are in ICT therapy 4. In terms of time since being diagnosed with diabetes, patients range from inexperienced (2 years) to very experienced (35 years), with a mean of 13.9 years. We anonymize the patients and identify them by IDs (from 8 to 17; we do not have information for patient 9).

Frequency of Measurements

We give an overview of the number of different measurements available for each patient. The study duration varies among the patients, ranging from 18 days (patient 8) to 33 days (patient 14). Likewise, the daily numbers of measurements of carbohydrate intake, blood glucose level and insulin units vary across patients. The median number of carbohydrate log entries varies between 2 per day (patient 10) and 5 per day (patient 14). The median number of blood glucose measurements per day varies between 2 and 7. Similarly, insulin is used on average between 3 and 6 times per day. In terms of physical activity, we count the 10-minute intervals with at least 10 steps tracked by the Google Fit app. This very low threshold for now serves to measure very basic movements and to check the validity of the data. Patients 11 and 14 are the most active, both having a median of more than 50 active intervals per day (corresponding to more than 8 hours of activity). Patient 10, on the other hand, has a surprisingly low median of 0 active 10-minute intervals per day, indicating missing values due to, for instance, not carrying the smartphone at all times.

Measurements per Hour of Day

Figure 3 shows the measurements of blood glucose, carbohydrates and insulin per hour of day for patients 13 and 14. Overall, the distributions of all three kinds of values throughout the day roughly correspond to each other. In particular, for most patients the number of glucose measurements roughly matches or exceeds the number of rapid insulin applications throughout the day. Notable exceptions are patients 14, 15, and 17 (figures excluded). For patient 14, in the evening the numbers of meals and rapid insulin applications match, but exceed the number of blood glucose measurements by far. Patient 17 has more rapid insulin applications than glucose measurements in the morning and particularly in the late evening. For patient 15, rapid insulin again slightly exceeds the number of glucose measurements in the morning. Curiously, the number of glucose measurements matches the number of carbohydrate entries – it is possible the discrepancy is a result of missing (glucose and carbohydrate) measurements. We further show the blood glucose distribution of each patient in Figure 1. The different lengths of the interquartile ranges of these distributions also reflect the difficulty of the prediction problem for different patients.

Figure 1: Blood glucose distribution for each patient.

Figure 2: Blood glucose prediction scenario.
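The per-patient frequency statistics above (e.g., median daily counts of log entries) can be derived directly from the raw logs. The following is an illustrative pandas sketch on toy data, not the authors' code; the column names ("patient", "timestamp") are assumptions:

```python
import pandas as pd

# Toy log of measurement events for two hypothetical patients.
log = pd.DataFrame({
    "patient": [8, 8, 8, 8, 10, 10],
    "timestamp": pd.to_datetime([
        "2017-03-01 08:00", "2017-03-01 19:00",
        "2017-03-02 07:30", "2017-03-02 12:00",
        "2017-03-01 09:00", "2017-03-02 21:00",
    ]),
})

# Count entries per patient per calendar day, then take the per-patient median,
# analogous to the "median number of log entries per day" reported above.
daily = (log.groupby(["patient", log["timestamp"].dt.date])
            .size()
            .groupby("patient")
            .median())
print(daily.to_dict())
```

The same grouping, applied separately to the glucose, carbohydrate and insulin logs, yields the per-measurement-type medians reported in this section.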
3 https://www.emperra.com/en/esysta-product-system/
4 ICT describes a model of insulin therapy for diabetics using two different types of insulin.

Figure 3: Glucose, carbohydrate and insulin values per hour of day for patients 13 and 14. Panels: (a) P13 glucose, (b) P14 glucose, (c) P13 carbohydrates, (d) P14 carbohydrates, (e) P13 insulin, (f) P14 insulin.

3 Prediction

Our first approach to blood glucose prediction is based on a regression form of time series prediction. Given historical blood glucose data, we learn a model that predicts future glucose values based on a representation of the current situation (including the recent past), using information on patient context, recent insulin applications, carbohydrate intake, and physical activity levels.

3.1 Setup

Prediction task

Our prediction task is time series prediction of blood glucose values (in mmol/L) with a prediction horizon of 1 hour. Consequently, we can construct a data instance for each glucose measurement found in the dataset and use all information available up until 1 hour before the measurement to predict the glucose value (cf. Figure 2).

Evaluation Protocol

Performance is evaluated on a per-patient basis. In addition, we average performance over patients to get an overview. For each patient, we consider the first 66% of blood glucose measurements as training data to learn the models and the last 34% as test data to evaluate prediction performance.

Performance Measures

Prediction performance is measured in terms of median absolute error (MdAE), root mean squared error (RMSE) and symmetric mean absolute percentage error (SMAPE). Given ground truth values y_i and predictions \hat{y}_i, with i ∈ [1, n], the median absolute error measures the median error made and is defined as

    MdAE = \operatorname{median}_{i} |\hat{y}_i - y_i|.

Root mean squared error weighs larger errors more heavily and is defined as

    RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)^2}.

Symmetric mean absolute percentage error relates prediction errors to the magnitude of the predicted values. It is defined as

    SMAPE = \frac{100\%}{n} \sum_{i=1}^{n} \frac{|\hat{y}_i - y_i|}{(|y_i| + |\hat{y}_i|)/2}.

Note that this yields a result between 0% and 200%. Further, the measure penalizes a) deviations on low values and b) over-forecasting.

3.2 Algorithms

Simple Baselines

As standard simple baselines, we use the last value observed one hour before the value being predicted (Last) and the arithmetic mean of the glucose values in the training set (AVG).

Context-AVG

As a more advanced baseline, we use a (temporal) context-weighted average of previous glucose values. As our analysis showed differences in glucose values according to the time of day, we weight previous glucose values by temporal proximity, with weights decreasing exponentially in the difference of time of day.

LSTM

The LSTM is a recurrent neural network model that effectively accounts for long-term sequence dependence among the glucose inputs.

RandomForest

The Random Forest Regressor (RF) is a meta-estimator that learns an ensemble of regression trees [2], averaging the output of the individual regression trees to perform the prediction.
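The Context-AVG baseline above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the decay rate `alpha` and the use of a circular 24-hour distance are our assumptions, since the paper does not specify them.

```python
import numpy as np

def context_avg(past_hours, past_values, target_hour, alpha=0.5):
    """Weighted average of past glucose values, with weights decaying
    exponentially in the time-of-day difference to the target instant."""
    past_hours = np.asarray(past_hours, dtype=float)
    past_values = np.asarray(past_values, dtype=float)
    # Circular distance on a 24-hour clock (in hours, range 0..12).
    diff = np.abs(past_hours - target_hour) % 24.0
    diff = np.minimum(diff, 24.0 - diff)
    weights = np.exp(-alpha * diff)
    return float(np.sum(weights * past_values) / np.sum(weights))

# Past measurements at 7:00 and 19:00; a prediction for 8:00 leans
# almost entirely on the temporally closer 7:00 value.
print(context_avg([7.0, 19.0], [5.0, 9.0], target_hour=8.0))
```

With `alpha = 0.5`, a measurement 11 hours away contributes roughly e^{-5} times less than one a single hour away, which matches the intuition that glucose levels at the same time of day are most informative.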
We use a standard value of 500 estimators, as well as a minimum leaf size of 4 for the individual trees, to reduce overfitting of the individual models.

ExtraTrees

The Extra-Trees Regressor (ET) is a variation on RandomForest that uses a different base learner: extremely randomized trees [6]. In contrast to regular regression trees, the candidate split values per feature are chosen randomly. We use 300 estimators and a minimum leaf size of 2.

3.3 Overall Results

In this section we report aggregate results, averaged over all patients. Table 1 shows the regression performance averaged over all patients. Performance is based on 42 test instances on average. The simple baselines Last and AVG achieve median errors of 3.3 and 2.5 mmol/L. Weighting previous glucose values by time of day (Context-AVG) improves the average median error to 2.28 mmol/L. The Extra-Trees Regressor achieves the lowest MdAE of 2.16 and similarly slightly outperforms Context-AVG in terms of RMSE and SMAPE. In comparison to predicting the arithmetic mean (AVG), however, RMSE does not improve by much (12.15 vs. 12.96), indicating that the ensemble is not able to predict extreme values well on average. We additionally report the performance of a neural-network-based model, the LSTM, trained for 10 and 100 epochs. The LSTM seems quite stable in MdAE but varies substantially in RMSE and SMAPE. Its performance actually gets much worse after 100 epochs, indicating proneness to overfitting. This implies the instability of such models on this dataset, and thus we do not consider the LSTM results for the model comparisons in Table 1.

Method                         MdAE   RMSE   SMAPE
Last                           3.28   25.71  40.96
AVG                            2.51   12.96  31.42
Context-AVG                    2.28   12.53  29.71
ARIMA                          2.40   13.88  31.61
LSTM (10 iter)                 2.24   10.41  29.02
LSTM (100 iter)                2.76   19.24  35.64
RandomForest                   2.27   12.05  29.98
Extremely (randomized) Trees   2.16   12.15  29.56

Table 1: Overall regression performance averaged over all patients. Best performance per measure is marked in bold (results in italics are not considered for comparison).

4 Prediction Confidence

In this section, we study the confidence of our best-performing tree-based prediction models, RandomForest and ExtraTrees. This helps, to an extent, to answer an important question: when is the system reliable enough to give out predictions? To this end, we study the variability of the predictions and estimate standard errors for the prediction model.

Prediction intervals

When looking at two regression models, the model predictions could be similar while our confidence in them varies: judging from the training data, more spread-out data can warrant lower confidence. Hence, a prediction returning a single value (typically meant to minimize the squared error) likewise does not relay any information about the underlying distribution of the data or the range of response values. We therefore leverage the concept of prediction intervals to compensate for the noisy data and enhance the end model, in the sense that it can refuse to give a prediction at a certain time when the confidence is low.

A prediction interval is an estimate of an interval into which future observations will fall with a given probability. In other words, it quantifies our confidence or certainty in the prediction. Unlike confidence intervals from classical statistics, which concern a population parameter (such as the mean), prediction intervals concern individual predictions [1]. We leverage the confidence interval estimations for Random Forests proposed in [9], which correct for a certain bias in the variability estimation (of individual trees), to conduct the experiments.

4.1 Regression Evaluation

We report here the variability evaluation across all patients for the regression task. Figure 4 shows the error bars using unbiased variance for all patients. We then show in Figure 5 the error bar graphs for patient 8 in an incremental training size setting – meaning that we keep the same actual test set but train on only part of the training data. E.g., 1/4 training data indicates that we 'look back' on only 1/4 of the available past data. The more dots near the diagonal, the more 'accurate' our prediction model; the error bars show the 'confidence' interval. Figure 5(a) indicates high 'confidence' in the predictions with little training data, yet the dots are far away from the diagonal.

Figure 4: Error bar graphs for predicted BG using unbiased variance. Panels (a)-(i): patients 8, 10, 11, 12, 13, 14, 15, 16, 17.

Figure 5: Incremental training size – error bar graphs for predicted BG using unbiased variance for patient 8. Panels: (a) 1/4, (b) 2/4, (c) 3/4, (d) 4/4 of the training data.

4.1.1 When to predict: on the training size.

To answer this question, we set up an evaluation setting with an increasing number of instances, ordered by time. Each training point is evaluated by leave-one-out validation. We show in Figure 6 the results for patient 8. The general conclusion is that the more training data, the better the performance, as witnessed for patients 13, 15 and 17. However, the results for other patients, e.g., patients 8, 11 and 16, show that increasing the training size can also bring more noise and degrade the results. We suspect this is because the learned model is not yet stabilized with the limited number of instances in our experiment. In addition, training size is not the only factor for deciding when to predict. We hence move on to examine two other factors: (1) model stability, via standard deviation, and (2) prediction confidence toward incoming instances.

Figure 6: Leave-one-out cross validation with incremental training size. Panels (a)-(i): patients 8, 10, 11, 12, 13, 14, 15, 16, 17.

4.1.2 When to predict: on model stability.

To answer this question, we measure the stability of the model by the standard deviation of k-fold cross-validation with incremental training size. Figure 10 indicates, on the MAE and RMSE metrics, that the model seems to become more stable with a larger amount of training data. This is a good indicator for the when-to-predict question.

Figure 10: Standard deviation with incremental training size. (a) Patient 15, (b) patient 17.

4.1.3 When to predict: on prediction confidence.

We show in Figures 8 and 9 the confidence distributions at each run of the 5-fold CV for different patients, based on biased and unbiased confidences respectively. The results show that the confidence distributions are rather similar across the different runs, indicating that the temporal order of the instances does not impact the model performance much. Based on these distributions, we move on to tuning the threshold parameter for data filtering using the confidence interval. The idea is to answer the question: "if we filter out low-confidence instances (wide confidence intervals), will the model perform better?" The answer is depicted in Figure 11. For some patients, e.g., patients 10 and 13, the filtering technique substantially enhances the model performance on the MAE and RMSE (not shown) metrics. We observe that the biased confidence measure works somewhat better than the unbiased one across patients. However, for some patients, e.g., patient 8, it does not seem to bring any effect.

We move on to experiment with filtering instances for which we empirically observe a lack of predictable context within the training data: the BG measurements at night. We attempt to filter those out for prediction. Even though this slightly improves results for some patients (cf. Figure 12), overall the filtering attempt does not make a significant difference, indicating that our model learns these cases better than we expected.

Figure 8: Confidence distributions at each run of 5-fold CV for predicted BG using biased variance. Panels (a)-(i): patients 8, 10, 11, 12, 13, 14, 15, 16, 17.

Figure 9: Confidence distributions at each run of 5-fold CV for predicted BG using unbiased variance. Panels (a)-(i): patients 8, 10, 11, 12, 13, 14, 15, 16, 17.

4.1.4 When to predict: combined factors.

Figure 7 shows some highlighted combined filtering techniques. In general, combining the aforementioned factors does improve the model performance. However, the combination is not straightforward; e.g., confidence interval filtering lowers the performance at the starting time, when the model is still unstable (cold start). Hence, there is not enough evidence for us to make a hard decision. More trial-and-error attempts on the fly, or a bigger dataset, can however readily be built on these findings as a foundation.

Figure 7: 5-fold cross validation with incremental training size. (a) Patient 8, (b) patient 13, (c) patient 15, (d) patient 16.

4.2 Overall Results with Filtering Methods

We show in Table 2 the overall results of our models with different filtering approaches for all patients. We use two different filtering approaches: (1) a sanity filter, heuristics that remove noise (e.g., removing wrongly entered measurements, or moments when the last glucose level input lies too far in the past), and (2) a stability filter based on prediction confidence (the standard deviation is not needed when the training size is large enough). The results show that the stability filter (based on biased and bias-corrected confidence) achieves the best performance, without the need for human effort on the sanity filter. The sole stability filter also provides more predictions (24 on average) than the other filtering combinations.

Model                           # predictions   MAE    MdAE   RMSE   SMAPE
rf                              42              2.58   2.27   12.05  29.98
et                              42              2.55   2.16   12.15  29.56
rf + sanity filter              16              2.22   2.01   8.80   28.10
et + sanity filter              16              2.29   2.06   9.01   29.36
rf + sanity + stability filter  15              2.22   1.92   8.71   27.82
rf + stability filter           24              1.92   1.77   7.57   22.65

Table 2: Average performance of different filtering approaches for all patients.

Figure 11: Performance (MAE) with confidence filtering threshold (x-axis) for some patients, when filtering is applied. Panels: patients 8, 10, 13 and 14, each with biased and unbiased ("no bias") variance.

Figure 12: Performance (MAE) with night-time filtering, when filtering is applied; the x-axis is the night time. Panels: (a) patient 8, (b) patient 13, (c) patient 15, (d) patient 16.

5 Conclusion

We studied the predictability of machine-learning models in scenarios of non-continuous blood glucose tracking. Additionally, we studied the stability and robustness of the learned models over time. We show that the Random Forest and Extra-Trees ensemble-based models are the most suitable for this case, as they can account for outliers as well as overfitting problems when the data are limited. Our further study on prediction confidence shows that the model can give reliable predictions after acquiring 25-30 instances.

Acknowledgements. This work was partially funded by the German Federal Ministry of Education and Research (BMBF) under project GlycoRec (16SV7172).

References

[1] Prediction intervals for random forests. http://blog.datadive.net/prediction-intervals-for-random-forests/. Accessed: 2017-09-17.
[2] L. Breiman. Random forests. Machine Learning, 45(1):5–32, 2001.
[3] I. Contreras, S. Oviedo, M. Vettoretti, R. Visentin, and J. Vehí. Personalized blood glucose prediction: A hybrid approach using grammatical evolution and physiological models. PLoS ONE, 12(11):e0187754, 2017.
[4] M. Eren-Oruklu, A. Cinar, L. Quinn, and D. Smith. Estimation of future glucose concentrations with subject-specific recursive linear models. Diabetes Technology & Therapeutics, 11(4):243–253, 2009.
[5] S. Fiorini, C. Martini, D. Malpassi, R. Cordera, D. Maggi, A. Verri, and A. Barla. Data-driven strategies for robust forecast of continuous glucose monitoring time-series. In Engineering in Medicine and Biology Society (EMBC), 2017 39th Annual International Conference of the IEEE, pages 1680–1683. IEEE, 2017.
[6] P. Geurts, D. Ernst, and L. Wehenkel. Extremely randomized trees. Machine Learning, 63(1):3–42, 2006.
[7] K. Plis, R. Bunescu, C. Marling, J. Shubrook, and F. Schwartz. A machine learning approach to predicting blood glucose levels for diabetes management. 2014.
[8] G. Sparacino, A. Facchinetti, and C. Cobelli. Smart continuous glucose monitoring sensors: On-line signal processing issues. Sensors, 10(7):6751–6772, 2010.
[9] S. Wager, T. Hastie, and B. Efron. Confidence intervals for random forests: The jackknife and the infinitesimal jackknife. Journal of Machine Learning Research, 15(1):1625–1651, 2014.
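As a closing illustration, the "refuse to predict when confidence is low" mechanism studied in Section 4 can be sketched as follows. This is a simplified stand-in, not the paper's setup: it uses the spread (standard deviation) of the individual trees' predictions as a confidence proxy instead of the bias-corrected jackknife estimator of [9], the data are synthetic, and the abstention threshold is an assumed value that would in practice be tuned per patient as in Section 4.1.3.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for a patient's (features, glucose) training data.
rng = np.random.RandomState(0)
X = rng.uniform(0, 10, size=(80, 3))
y = X[:, 0] * 1.5 + rng.normal(scale=0.5, size=80)

# Same hyperparameters as the paper's RandomForest (500 trees, min leaf size 4).
rf = RandomForestRegressor(n_estimators=500, min_samples_leaf=4, random_state=0)
rf.fit(X, y)

X_test = rng.uniform(0, 10, size=(10, 3))

# Collect each tree's prediction; the ensemble mean is the point prediction,
# and the per-tree spread serves as a (simplistic) confidence measure.
per_tree = np.stack([tree.predict(X_test) for tree in rf.estimators_])
pred = per_tree.mean(axis=0)
spread = per_tree.std(axis=0)

THRESHOLD = 1.0  # assumed abstention threshold; tuned per patient in practice
confident = spread < THRESHOLD
for p, s, ok in zip(pred, spread, confident):
    print(f"pred={p:.2f} std={s:.2f} -> {'predict' if ok else 'abstain'}")
```

Filtering test instances where `spread` exceeds the threshold is exactly the shape of the stability filter in Table 2: fewer predictions are emitted, but the emitted ones carry lower error.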