<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Personalised Glucose Prediction via Deep Multitask Networks</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>John Daniels</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pantelis Georgiou</string-name>
          <email>pantelis@imperial.ac.uk</email>
        </contrib>
      </contrib-group>
      <abstract>
        <p>Glucose control is an essential requirement in primary therapy for diabetes management. Digital approaches to maintaining tight glycaemic control, such as clinical decision support systems and artificial pancreas systems, rely on continuous glucose monitoring devices and self-reported data, and their performance is typically improved through glucose forecasting. In this work, we develop a multitask approach using convolutional recurrent neural networks (MTCRNN) to provide short-term forecasts using the OhioT1DM dataset, which comprises 12 participants. We obtain the following results - 30 min: 19.79 ± 0.06 mg/dL (RMSE); 13.62 ± 0.05 mg/dL (MAE) and 60 min: 33.73 ± 0.24 mg/dL (RMSE); 24.54 ± 0.15 mg/dL (MAE). Multitask learning facilitates an approach that allows for learning with the data from all available subjects, thereby overcoming the common challenge of insufficient individual datasets while learning appropriate individual models for each participant.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>
        In recent years, the proliferation of biosensors and wearable devices
has facilitated the ability to perform continuous monitoring of
physiological signals. In diabetes management, this has come with the
increasing use of continuous glucose monitoring (CGM) devices for
helping with glucose control. The current literature on clinical impact
of CGM devices shows that continuously monitoring blood glucose
concentration levels has benefit in maintaining tight glycaemic
control [
        <xref ref-type="bibr" rid="ref2 ref5">2, 5</xref>
        ]. As a next step, glucose prediction offers an opportunity
to further improve glucose control by taking actions to avert adverse
glycaemic events, such as suspension of insulin delivery in
closed-loop systems to avert hypoglycaemia.
      </p>
      <p>The general work in this area has typically involved collecting data
covering physiological variables such as glucose concentration
levels, heart rate, and self-reported data covering exercise, sleep, stress,
illness, insulin, and meals. However, public datasets covering
ambulatory monitoring of the T1DM population are not widely available.</p>
      <p>
        Deep learning [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] facilitates learning the optimal features and has
been shown to perform better than other methods involving hand
crafted features that have been employed in recent times for
predicting glucose concentration levels. However, typically these models
require relatively large amounts of data to converge on an appropriate
model.
      </p>
      <p>
        In this work, we employ a multitask learning [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] approach in order
to improve the performance of glucose forecasting in a neural
network, where each individual is viewed as a task, using shared layers
to enable learning from other individuals.
      </p>
      <p>
        Glucose prediction has been a long-standing area of focus in the
diabetes community. As a result, many approaches have been developed in order
to provide near-time glucose concentration level forecasts.
      </p>
      <p>
        Early work in this area have focused on physiological models and
traditional machine learning methods in predicting glucose
concentration levels [
        <xref ref-type="bibr" rid="ref3 ref12">3, 12</xref>
        ]. Recent work, as seen in the 2018 Blood Glucose
Predictive Challenge, has moved towards deep learning methods
with more impressive results [
        <xref ref-type="bibr" rid="ref8 ref9 ref11 ref14">8, 9, 11, 14</xref>
        ]. These have used
convolutional architectures, recurrent architectures, or a combination of both
to model the task of glucose prediction.
      </p>
    </sec>
    <sec id="sec-2">
      <title>DATASET AND DATA PREPROCESSING</title>
      <p>In this section, we detail the transformations that are performed on
the data prior to training and testing the model for each T1DM
participant.</p>
    </sec>
    <sec id="sec-3">
      <title>OhioT1DM Dataset 2020</title>
      <p>
        The OhioT1DM dataset 2020 [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] is a dataset comprising 12 unique
participants, covering eight weeks of daily living. The participants
are given IDs as the data is anonymised. This dataset comprises
physiological data gathered using a continuous glucose monitor (blood
glucose concentration levels) and wristband device (heart rate, skin
conductance, skin temperature), activity data (acceleration, step count),
and self-reported data (meal intake, insulin, exercise, work, sleep,
and stressors).
      </p>
    </sec>
    <sec id="sec-4">
      <title>Dealing with Missing Values</title>
      <p>
        A non-trivial aspect of the datasets used for developing glucose
prediction models is missingness. This is evident in the
OhioT1DM dataset, with missing values present in both physiological
variables and self-reported data [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
      <p>Linear Interpolation: The blood glucose values that are
missing in this dataset are typically missing at random. This could be
attributed to issues around replacing glucose sensors and/or
transmitters, or dealing with faulty communication. As a result, we employ
linear interpolation in the training set to handle imputation of missing
blood glucose concentration levels in the dataset over a period of one
hour. In samples where more than an hour of CGM data is
missing, the sample is discarded from the training set. This is illustrated
with an example sequence in (C) of Fig. 1.</p>
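      <p>The gap-limited interpolation described above can be sketched as follows. This is an illustrative stand-in rather than the authors' released code, and it assumes the usual 5-minute CGM sampling interval (so one hour is 12 samples).</p>
      <preformat>
```python
import numpy as np

def impute_training_cgm(values, max_gap=12):
    """Linearly interpolate interior gaps of missing CGM readings no longer
    than `max_gap` samples (12 samples = 1 hour at 5-minute sampling);
    longer gaps are left as NaN so the affected samples can be discarded
    from the training set."""
    x = np.asarray(values, dtype=float)
    out = x.copy()
    missing = np.isnan(x)
    i, n = 0, len(x)
    while i < n:
        if missing[i]:
            j = i
            while j < n and missing[j]:
                j += 1
            # only fill gaps bounded on both sides and short enough to trust
            if 0 < i and j < n and (j - i) <= max_gap:
                out[i:j] = np.interp(np.arange(i, j), [i - 1, j], [x[i - 1], x[j]])
            i = j
        else:
            i += 1
    return out
```
      </preformat>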
      <p>For features comprising self-reported data, on the other hand, the
assumption is made that any missing values represent an absence of
said feature. Therefore, all missing values in insulin, meal intake, and
reported exercise are imputed with zero.</p>
      <p>Missing values in the self-reported features of the
testing set are tackled in the same way as in the training set. However, this is
not the case for blood glucose concentration levels as interpolation
when a current value at a given timestep is missing would lead to an
inaccurate evaluation of model performance.</p>
      <p>Extrapolation: In order to accurately evaluate the performance of
the model we cannot always rely on interpolation at test time as this
may require, in a real-time setting, an unknown future value to
perform interpolation. Consequently, we need to rely on other methods
of extrapolation to impute the missing glucose concentration levels.
In this scenario (A), for gaps of data less than 30 minutes, we
impute missing values with predicted values from the trained model.
For missing recent values longer than 30 minutes, as in (B), we pad
the remaining values with the last computed value. In cases where a
gap larger than 30 minutes is evident in historical data and a current
value is present at the given timestep, linear interpolation is
employed instead to provide a more accurate imputation.</p>
    </sec>
    <sec id="sec-5">
      <title>Standardisation</title>
      <p>To enable effective training of the proposed model, we transform
the relevant input features (blood glucose concentration,
insulin bolus, meal (carbohydrate) intake, and reported exercise). The
blood glucose concentration levels are scaled down by a factor of
120. Similarly, the insulin bolus values are scaled by 100 and meal intake
values are scaled by 200, so that all features lie in a similar range. The
exercise values are transformed from the recorded exercise intensity,
on a range from 1-10, to a simple binary representation of the
presence or absence of exercise.</p>
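      <p>A sketch of this standardisation, with the divisors taken directly from the text:</p>
      <preformat>
```python
import numpy as np

GLUCOSE_SCALE, INSULIN_SCALE, MEAL_SCALE = 120.0, 100.0, 200.0

def standardise(glucose, insulin, meal, exercise_intensity):
    """Scale each input channel into a similar range and binarise the
    exercise intensity (1-10) into presence/absence, returning the
    (timesteps, 4) feature matrix used as model input."""
    return np.stack([
        np.asarray(glucose, float) / GLUCOSE_SCALE,
        np.asarray(insulin, float) / INSULIN_SCALE,
        np.asarray(meal, float) / MEAL_SCALE,
        (np.asarray(exercise_intensity, float) > 0).astype(float),
    ], axis=-1)
```
      </preformat>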
    </sec>
    <sec id="sec-6">
      <title>METHODS</title>
      <p>In this section we detail the machine learning technique that is used
to provide the means of learning personalised models with the entire
dataset. We detail the approach to develop the deep multitask
network for personalisation. We provide a summary of the
hyperparameters used in training, as well as the setup of the input for personalised
multitask learning.</p>
    </sec>
    <sec id="sec-7">
      <title>Multitask Learning</title>
      <p>
        Multitask learning is an approach in machine learning that can be
broadly described as a method of learning multiple tasks
simultaneously with the aim of improving generalisation [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>
        Multitask learning for personalisation has been used mainly in
affective computing [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] with early work in diabetes management
focusing on using multitask learning for developing prediction models
for clustered groups of Type 1, Type 2, and non-diabetic participants
[
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] rather than leveraging similarities within groups such as gender,
for personalised glucose predictions.
      </p>
      <p>As seen in Figure 2, the output from the shared layers is fed
into the individual(task)-specific fully connected layers of each user.</p>
      <p>In a multitask setting of this kind, a multiplicative gating approach
is used to ensure that the input corresponding to the particular user
trains on just that user in the individual-specific layers. In that sense,
at each iteration a batch that consists of data from a particular
individual is used to train the shared layers and the layers specific to the
individual.</p>
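      <p>This batching scheme can be illustrated with a toy numpy stand-in for the network: a shared projection in place of the convolutional-recurrent layers, and one linear head per individual. All names and sizes here are illustrative; the point is that a batch drawn from one individual updates the shared weights and only that individual's head.</p>
      <preformat>
```python
import numpy as np

class ToyMultitaskModel:
    def __init__(self, n_tasks, n_features, hidden=8, lr=0.01, seed=0):
        rng = np.random.default_rng(seed)
        self.shared = rng.normal(0.0, 0.1, (n_features, hidden))
        self.heads = [rng.normal(0.0, 0.1, hidden) for _ in range(n_tasks)]
        self.lr = lr

    def train_batch(self, x, y, task):
        """One SGD step on the MAE loss for a batch from one individual."""
        h = x @ self.shared                      # shared representation
        pred = h @ self.heads[task]              # task-specific output
        err = np.sign(pred - y)                  # subgradient of MAE
        g_head = (err[:, None] * h).mean(axis=0)
        g_shared = x.T @ (err[:, None] * self.heads[task][None, :]) / len(x)
        self.heads[task] -= self.lr * g_head     # only this user's head moves
        self.shared -= self.lr * g_shared        # shared layers learn from all
        return float(np.abs(pred - y).mean())
```
      </preformat>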
    </sec>
    <sec id="sec-8">
      <title>CRNN Model</title>
      <p>
        The deep learning model trained in the multitask learning setting is
a convolutional recurrent neural network (CRNN) proposed by Li et
al. [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] to perform short-term glucose prediction. This forms the basis
of the single-task (STL) model. The convolutional recurrent model
consists initially of 3 temporal convolutional layers, each performing
a 1-D convolution with a Gaussian kernel over the input
sequence to extract features at various rates of appearance, followed by
a max pooling layer after each convolution operation. The input is
a 4-dimensional sequence that takes a 2-hour window of historical
data.
      </p>
      <p>The convolutional layers thus perform feature
extraction, and their output feeds into a recurrent long short-term memory (LSTM) layer
that is able to better model the temporal nature of the task.</p>
      <p>The output from the shared layers feeds into the fully connected
layers of each user to provide the change in glucose value
over the prediction horizon. This is then added to the current glucose
value to provide the forecast glucose concentration level.</p>
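      <p>The shape arithmetic of the shared convolutional stack can be traced with a small numpy sketch. The kernel widths and filter counts below are illustrative assumptions (the actual hyperparameters are given in the table in Section 4.4), the Gaussian kernel initialisation is replaced by random weights, and a 5-minute sampling interval is assumed, so the 2-hour window gives 24 timesteps of the 4 features.</p>
      <preformat>
```python
import numpy as np

def conv1d_valid(x, kernel):
    """1-D 'valid' convolution: (T, C) with kernel (k, C, F) -> (T-k+1, F)."""
    t, _ = x.shape
    k, _, f = kernel.shape
    out = np.zeros((t - k + 1, f))
    for i in range(t - k + 1):
        out[i] = np.tensordot(x[i:i + k], kernel, axes=([0, 1], [0, 1]))
    return out

def max_pool1d(x, size=2):
    t = (x.shape[0] // size) * size
    return x[:t].reshape(-1, size, x.shape[1]).max(axis=1)

rng = np.random.default_rng(0)
h = rng.normal(size=(24, 4))          # 2-hour window of the 4 input features
for filters in (8, 16, 32):           # illustrative filter counts
    kernel = rng.normal(size=(3, h.shape[1], filters))
    h = max_pool1d(conv1d_valid(h, kernel))
# h is the feature sequence that the shared LSTM layer would consume
```
      </preformat>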
    </sec>
    <sec id="sec-9">
      <title>Loss Function</title>
      <p>The loss function used for converging to the appropriate model for
the glucose forecasting is the mean absolute error. This is expressed
below as:</p>
      <p>L(y, ŷ) = (1/N<sub>batch</sub>) Σ<sub>k=1</sub><sup>N<sub>batch</sub></sup> |y<sub>k</sub> − ŷ<sub>k</sub>|</p>
      <p>where ŷ<sub>k</sub> denotes the predicted results given the historical data, y<sub>k</sub>
denotes the reference change in glucose concentration over the
relevant prediction horizon, and N<sub>batch</sub> refers to the batch size.</p>
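      <p>Numerically, the loss is just the batch-mean absolute error, available in Keras as the built-in 'mae' loss. A numpy check of the formula:</p>
      <preformat>
```python
import numpy as np

def mae_loss(y_true, y_pred):
    """L = (1 / N_batch) * sum_k |y_k - y_hat_k|, over one batch of
    reference and predicted glucose changes."""
    y_true = np.asarray(y_true, float)
    y_pred = np.asarray(y_pred, float)
    return float(np.abs(y_true - y_pred).mean())
```
      </preformat>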
    </sec>
    <sec id="sec-10">
      <title>Hyperparameters</title>
      <p>The following table provides the details of the
hyperparameters used for the model architecture at each layer.</p>
      <p>The optimiser used for this work is Adam, with a learning rate of
0.0053 obtained through grid search optimisation. The model is trained
for 200 epochs.</p>
      <p>The model is developed on Keras 2.2.2, with a Tensorflow 1.5
backend. The training is performed on an NVIDIA GTX 1050 GPU.
The repository for the code accompanying the paper can be found at:
https://github.com/jsmdaniels/ecai-bglp-challenge</p>
      <p>In order to undertake a comprehensive evaluation of the model
performance, the following assessment criteria are used:</p>
      <sec id="sec-10-1">
        <title>Performance evaluation over 30-minute and 60-minute prediction horizon (PH)</title>
        <p>The RMSE and MAE for each participant
are analysed over the same length of values for both prediction
horizons.</p>
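        <p>The two reported metrics, and the mean ± standard deviation aggregation over repeated runs, can be computed as below (a straightforward sketch, not the evaluation script from the repository):</p>
        <preformat>
```python
import numpy as np

def rmse(y_true, y_pred):
    """Root-mean-square error in mg/dL."""
    d = np.asarray(y_true, float) - np.asarray(y_pred, float)
    return float(np.sqrt((d ** 2).mean()))

def summarise_runs(per_run_scores):
    """Mean and standard deviation of a metric over repeated model runs,
    as used for the reported results."""
    a = np.asarray(per_run_scores, float)
    return float(a.mean()), float(a.std())
```
        </preformat>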
      </sec>
      <sec id="sec-10-2">
        <title>Comparison of training setting</title>
        <p>The performance of the multitask
learning (MTL) approach is evaluated in
comparison with the performance of a single task learning (STL)
approach which uses only patient-specific data.</p>
      </sec>
      <sec id="sec-10-3">
        <title>Multiple runs for each participant ID</title>
        <p>The multitask CRNN
(MTCRNN) model uses randomly initialised weights at the start
of training. Given the variable nature of this training procedure,
the results reported are the average of 5 model runs.</p>
        <p>The unit for results reported below is mg/dL. The best
performance is in bold.
As seen in Table 3, the results shown provide a comprehensive
evaluation of the model predictive performance.</p>
        <p>Evidently, the model performance at PH = 30 minutes is better
than the model performance at PH = 60 minutes, given that prediction
at 60 minutes is a more complex task than prediction at 30 minutes.</p>
        <p>Figures 3 and 4 exhibit the differences in performance as seen in the
specific window for participant 596. The increased lag and reduced
predictive performance can also be attributed to the higher chance of
external activities (insulin, meals, exercise) that influence the blood
glucose trajectory occurring over the prediction horizon.</p>
        <p>
          The best predictive performances were achieved by the model
with IDs 544, 552, 596 whereas, IDs 540, 567, and 584 exhibited
worse performances over both 30 and 60 minute prediction horizons.
An investigation of the glycaemic variability, using the coefficient
of variation (CV) [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ], of the training set shows that the former set of
participants are stable (CV ≤ 36%) whereas the latter group are labile
(CV &gt; 36%). The multitask learning approach clearly performs
better over the single task approach over a 30-minute prediction
horizon. However, the performance improvement of the MTL approach
over a 60-minute prediction is not consistent across each participant
and metric.
        </p>
        <p>One potential issue with multitask learning is negative
transfer. This describes a scenario in which one or more
of the tasks (individuals) or sampled batches during training are not
strongly correlated, degrading the learning in the shared layers, and
subsequently the performance at test time.</p>
      </sec>
    </sec>
    <sec id="sec-11">
      <title>CONCLUSION</title>
      <p>In this work, we have presented a multitask convolutional recurrent
neural network that is capable of performing short-term personalised
predictions - 19.79 ± 0.06 mg/dL (RMSE) and 13.62 ± 0.05 mg/dL
(MAE) at 30 minutes, as well as 33.73 ± 0.24 mg/dL (RMSE) and
24.54 ± 0.15 mg/dL (MAE) at 60 minutes. We work towards
leveraging population data while still learning a personalised model. In
the future, we hope to address further challenges such as negative
transfer during learning that could improve the accuracy of
individual models. This approach would enable more accurate models to be
deployed in the face of limited personal data.</p>
    </sec>
    <sec id="sec-12">
      <title>ACKNOWLEDGEMENTS</title>
      <p>This work is supported by the ARISES project (EP/P00993X/1),
funded by the Engineering and Physical Sciences Research Council.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Rich</given-names>
            <surname>Caruana</surname>
          </string-name>
          , 'Multitask Learning',
          <source>Machine Learning</source>
          ,
          <volume>28</volume>
          (
          <issue>1</issue>
          ),
          <fpage>41</fpage>
          -
          <lpage>75</lpage>
          , (
          <year>July 1997</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Antonio</given-names>
            <surname>Ceriello</surname>
          </string-name>
          , Louis Monnier, and David Owens, '
          <article-title>Glycaemic variability in diabetes: clinical and therapeutic implications'</article-title>
          ,
          <source>The Lancet Diabetes &amp; Endocrinology</source>
          , (
          <year>August 2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>E. I.</given-names>
            <surname>Georga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. C.</given-names>
            <surname>Protopappas</surname>
          </string-name>
          , D. Ardigò,
          <string-name>
            <given-names>M.</given-names>
            <surname>Marina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Zavaroni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Polyzos</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D. I.</given-names>
            <surname>Fotiadis</surname>
          </string-name>
          , '
          <article-title>Multivariate Prediction of Subcutaneous Glucose Concentration in Type 1 Diabetes Patients Based on Support Vector Regression'</article-title>
          ,
          <source>IEEE Journal of Biomedical and Health Informatics</source>
          ,
          <volume>17</volume>
          (
          <issue>1</issue>
          ),
          <fpage>71</fpage>
          -
          <lpage>81</lpage>
          , (
          <year>January 2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Marzyeh</given-names>
            <surname>Ghassemi</surname>
          </string-name>
          , Tristan Naumann,
          <string-name>
            <given-names>Peter</given-names>
            <surname>Schulam</surname>
          </string-name>
          , Andrew L.
          <string-name>
            <surname>Beam</surname>
          </string-name>
          , and Rajesh Ranganath, '
          <article-title>Opportunities in Machine Learning for Healthcare'</article-title>
          , arXiv:1806.00388 [cs, stat], (June
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Giacomo</given-names>
            <surname>Cappon</surname>
          </string-name>
          , Giada Acciaroli, Martina Vettoretti, Andrea Facchinetti, and Giovanni Sparacino, '
          <article-title>Wearable Continuous Glucose Monitoring Sensors: A Revolution in Diabetes Treatment'</article-title>
          ,
          <source>Electronics</source>
          ,
          <volume>6</volume>
          (
          <issue>3</issue>
          ),
          <fpage>65</fpage>
          , (
          <year>September 2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Ian</given-names>
            <surname>Goodfellow</surname>
          </string-name>
          , Yoshua Bengio, and Aaron Courville, Deep Learning, MIT Press,
          <year>2016</year>
          . http://www.deeplearningbook.org.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Weixi</given-names>
            <surname>Gu</surname>
          </string-name>
          , Zimu Zhou, Yuxun Zhou, Miao He, Han Zou, and Lin Zhang, '
          <article-title>Predicting Blood Glucose Dynamics with Multi-time-series Deep Learning'</article-title>
          ,
          <source>in Proceedings of the 15th ACM Conference on Embedded Network Sensor Systems - SenSys '17</source>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>2</lpage>
          , Delft, Netherlands, (
          <year>2017</year>
          ). ACM Press.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>K.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Daniels</surname>
          </string-name>
          , C. Liu,
          <string-name>
            <given-names>P.</given-names>
            <surname>Herrero-Vinas</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Georgiou</surname>
          </string-name>
          , '
          <article-title>Convolutional Recurrent Neural Networks for Glucose Prediction'</article-title>
          ,
          <source>IEEE Journal of Biomedical and Health Informatics</source>
          ,
          <fpage>1</fpage>
          -
          <lpage>1</lpage>
          , (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Kezhi</given-names>
            <surname>Li</surname>
          </string-name>
          , Chengyuan Liu, Taiyu Zhu, Pau Herrero, and Pantelis Georgiou, '
          <article-title>GluNet: A Deep Learning Framework for Accurate Glucose Forecasting'</article-title>
          ,
          <source>IEEE Journal of Biomedical and Health Informatics</source>
          ,
          <volume>24</volume>
          (
          <issue>2</issue>
          ),
          <fpage>414</fpage>
          -
          <lpage>423</lpage>
          , (
          <year>February 2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Cindy</given-names>
            <surname>Marling</surname>
          </string-name>
          and Razvan Bunescu, '
          <article-title>The OhioT1DM Dataset for Blood Glucose Level Prediction'</article-title>
          ,
          <source>In: The 5th International Workshop on Knowledge discovery in healthcare data.</source>
          , (
          <year>2020</year>
          ). CEUR proceeding in press. Available at http://smarthealth.cs.ohio.edu/bglp/OhioT1DMdataset-paper.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>John</given-names>
            <surname>Martinsson</surname>
          </string-name>
          , Alexander Schliep, Björn Eliasson, and Olof Mogren, '
          <article-title>Blood Glucose Prediction with Variance Estimation Using Recurrent Neural Networks'</article-title>
          ,
          <source>Journal of Healthcare Informatics Research</source>
          ,
          <volume>4</volume>
          (
          <issue>1</issue>
          ),
          <fpage>1</fpage>
          -
          <lpage>18</lpage>
          , (
          <year>March 2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>C.</given-names>
            <surname>Pérez-Gandía</surname>
          </string-name>
          , A. Facchinetti, G. Sparacino,
          <string-name>
            <given-names>C.</given-names>
            <surname>Cobelli</surname>
          </string-name>
          , E.J. Gómez,
          <string-name>
            <given-names>M.</given-names>
            <surname>Rigla</surname>
          </string-name>
          , A. de Leiva, and M.E. Hernando, '
          <article-title>Artificial Neural Network Algorithm for Online Glucose Prediction from Continuous Glucose Monitoring'</article-title>
          ,
          <source>Diabetes Technology &amp; Therapeutics</source>
          ,
          <volume>12</volume>
          (
          <issue>1</issue>
          ),
          <fpage>81</fpage>
          -
          <lpage>88</lpage>
          , (
          <year>January 2010</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13] Sara Ann Taylor, Natasha Jaques, Ehimwenma Nosakhare, Akane Sano, and Rosalind Picard, '
          <article-title>Personalized Multitask Learning for Predicting Tomorrow's Mood, Stress, and Health'</article-title>
          ,
          <source>IEEE Transactions on Affective Computing</source>
          ,
          <fpage>1</fpage>
          -
          <lpage>1</lpage>
          , (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Taiyu</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Kezhi</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Jianwei</given-names>
            <surname>Chen</surname>
          </string-name>
          , Pau Herrero, and Pantelis Georgiou, '
          <article-title>Dilated Recurrent Neural Networks for Glucose Forecasting in Type 1 Diabetes'</article-title>
          ,
          <source>Journal of Healthcare Informatics Research</source>
          , (
          <year>April 2020</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>