AIH 2012


        Acute Ischemic Stroke Prediction from
         Physiological Time Series Patterns

          Qing Zhang (1,2), Yang Xie (2), Pengjie Ye (1,2), and Chaoyi Pang (2)
             (1) Australian e-Health Research Centre/CSIRO ICT Centre
                   (2) The University of New South Wales
                  {qing.zhang,pengjie.ye,chaoyi.pang}@csiro.au
                                yang.xie@unsw.edu.au


      Abstract. Stroke is a major cause of death. Despite its frequency and
      importance, only a limited number of evidence-based acute treatment
      options are currently available. Recent clinical research indicates that
      early changes in common physiological variables represent a potential
      therapeutic target, so manipulating these variables may eventually yield
      an effective way to optimise stroke recovery. Nevertheless, the accuracy
      of prediction methods based on statistical characteristics of certain
      physiological variables, such as blood pressure and glucose, is still far
      from satisfactory, because the effects and functional domains of those
      physiological determinants are poorly understood. An accurate method for
      predicting stroke outcome from justifiable determinants is therefore
      increasingly important for treatment decisions at the very beginning of
      a stroke. In this work, we use machine learning techniques to find
      correlations between the physiological parameters of stroke patients
      during the first 48 hours after stroke and their outcomes after three
      months. Our prediction method not only incorporates statistical
      characteristics of physiological parameters but also considers
      physiological time series patterns as key features. Experimental results
      on real stroke patients’ data indicate that our method greatly improves
      prediction accuracy, reaching a precision of 94% and a recall of 90%.

      Keywords: Stroke, Outcome Prediction, Time Series Data, Machine
      Learning


1   Introduction
Stroke is a major cause of human death, second only to ischemic heart
disease [1]. The World Health Organisation (WHO) defines it as “rapidly
developing clinical signs of local (or global) disturbance of cerebral function,
with symptoms lasting more than 24 hours or leading to death, and with no
apparent cause other than of vascular origin” [2]. Recent research reveals a
strong association between physiological homeostasis and outcomes of Acute
Ischemic Stroke. Thus understanding determinants of physiological variables, such
           as blood pressure, temperature and blood glucose levels, may eventually yield
           an effective and potentially widely applicable range of therapies for optimis-
           ing stroke recovery, such as abbreviating the duration of ischaemia, preventing
           further stroke, or preventing deterioration due to post-stroke complications.
               The correlations between blood pressure (BP) and stroke outcomes have
           been widely studied in the literature. Current guidelines state that a
           significant decrease of BP during the first hours after admission should be
           avoided, as it correlates with poor outcomes, measured by the Canadian Stroke
           Scale or the modified Rankin Score (mRS), at 3 months [10]. Extreme hypertension
           and hypotension on admission have also been associated with adverse outcomes in
           acute stroke patients [11]. BP values monitored periodically within the first
           72 hours after admission show that extreme values still correlate with
           unfavourable outcomes [9]. For example, a high baseline systolic BP is inversely
           associated with favourable outcome assessed on mRS at 90 days, with OR = 1.220
           (95% CI: 1.01 to 1.49). Other periodically retrieved statistical properties of
           BP within 24 hours of ictus, such as maximum, mean and variability, have also
           been investigated. Yong et al. [12] report a strong independent association
           between those properties and the outcome at 30 days after ischemic stroke. For
           example, variability of systolic BP is inversely associated with favourable
           outcome, with OR = 0.57 (95% CI: 0.35 to 0.92).
               Research also shows associations between other physiological variables and
           stroke outcomes. Abnormalities of blood glucose, heart rate variability, ECG and
           temperature may be predictors of 3-month stroke outcome.
               Most of the above analyses are based on physiological parameters recorded
           periodically, hourly or daily, for up to 3 months. Whether continuous data
           patterns, such as data trends, have a similar predictive role is still uncertain.
           Although it is clear that elevated 24-hour blood pressure levels after stroke
           predict a poor outcome, few studies have investigated the predictive ability of
           more sophisticated trends, e.g. combined trends of several physiological
           parameters. Yet this could be an effective way to readily obtain important
           prognostic information for acute ischemic stroke patients. Dawson et al. did
           pioneering work on associating short (around 10 minutes) beat-to-beat BP
           recordings with acute ischemic stroke outcomes [8]. They conclude that a poor
           outcome, assessed by mRS at 30 days after ischemic stroke, is dependent on
           stroke subtype, beat-to-beat diastolic BP, and Mean Arterial Pressure and
           variability. However, their study still uses the average values of continuous
           recordings, rather than time series patterns, as predictors. This motivates our
           research on mining physiological data patterns as effective predictors of acute
           ischemic stroke outcome.
               Mining physiological data patterns aligns naturally with time series data
           classification, a traditional topic that has attracted intensive study.
           Although many sophisticated time series data mining techniques exist, we find
           that most of them, if not all, are not applicable to our application scenario,
           because the physiological data collected from patients are almost always
           incomplete and non-isometric. Therefore, in this paper, we incorporate a simple
           yet powerful time series data pattern analysing method, trend analysis, into
our prediction method. By utilising those trend features, together with values
of traditional physiological variables, we design an efficient algorithm that can
predict 3-month stroke outcome with high accuracy.
    In summary, we list our contributions in this paper:
 – We propose using trend patterns of physiological time series data as a new
   set of stroke outcome prediction features.
 – We design a novel prediction algorithm that predicts 3-month stroke
   outcomes with high precision and recall when tested against a real data set.
    The rest of this paper is organised as follows. Section 2 introduces work
related to stroke outcome prediction. Section 3 presents our prediction method.
Section 4 reports empirical study results. Section 5 concludes the paper with
possible future studies.


2   Related Work
The relationship between beat-to-beat blood pressure (BP) and the early
outcome after acute ischemic stroke was first described in [8].
    A further study [6] investigated the detrimental effects of blood pressure
reduction in the first 24 hours after acute stroke onset. BP reduction may
worsen an already compromised perfusion of the brain tissue, so the authors
suggest not lowering BP in the early stage after stroke onset. However, that
study does not further discuss the relation between higher BP and outcome.
Ritter et al. characterised blood pressure variation by counting threshold
violations and observed a significant difference in the frequency of upper
threshold violations between different time points after stroke [9]. Wong
observed temporal patterns in the changes of some physiological variables and
attempted to use such patterns to explain and predict early outcomes [5].
However, because the candidate feature sets considered in those studies are
limited, they are unlikely to achieve accurate prediction.
    Relationships between other physiological variables and stroke outcome have
also been studied in the literature. Abnormalities of serum osmolarity,
temperature, blood glucose and SpO2 may be predictors of stroke outcomes. More
specifically, heart rate and ECG can be correlated with stroke outcomes at
3 months:
 – Heart Rate Variability: Gujjar et al. reported that heart rate variability is
   effective in predicting stroke outcome. Specifically, they studied continuous
   electrocardiograms of 25 patients with acute stroke and concluded that the
   eye-opening score of the Glasgow Coma Scale and low-frequency spectral power
   were factors independently predictive of mortality [16].
 – ECG: The relationship between ECG abnormalities and stroke outcomes
   was reported by Christensen et al. They analysed a large cohort of 692
   patients and concluded that ECG abnormalities are frequent in acute stroke
   and may predict 3-month mortality [17].



           3     Stroke outcomes prediction

           Our prediction method uses statistical values of physiological parameters and
           also incorporates the descriptive ability of physiological patterns as features
           to predict 3-month stroke outcomes. In particular, we add the trend patterns of
           the time series data as new features to form an initial feature set. We then
           apply logistic regression to classify stroke patient outcomes into two
           groups: good vs. bad. Note that different clinical criteria exist for defining
           good/bad outcomes; we report empirical results on all criteria in the next
           section. Cross validation is adopted to obtain an unbiased assessment of
           classifier performance, so that the physiological determinants can be accurately
           identified in the last stage. Finally, we select the subset of features that
           most accurately predicts 3-month stroke outcomes. Figure 1 presents the logic
           flow of our method. We use the Rankin Scale at 3 months after stroke (RS3) to
           represent the various outcomes [18].


                               Fig. 1. Stroke outcomes prediction method


           3.1   Construct initial feature set

           Five physiological parameters are usually considered influential factors on
           stroke patient outcomes, namely Blood Sugar Level, Diastolic Blood Pressure,
           Systolic Blood Pressure, Heart Rate and Body Temperature [6, 16, 17]. Existing
           stroke outcome prediction approaches typically take a single parameter as the
           main feature. In our approach, however, we include all five parameters in the
           initial feature set.
    Moreover, for each physiological parameter, we compute trends by partitioning
the time series data into non-overlapping, contiguous blocks. Although many trend
and shape detection methods exist in the literature, such as [3], in our
application we simply use a bi-partition of the time series data recorded in the
first 48 hours after stroke. The reasons are:
 1. most available physiological data records are only within 48-hours after
    stroke.
 2. clinical observation and our initial experiments both suggest that setting the
    granularity level at having only two partitions in the 48-hours, well represents
    the physiological time series pattern changes.
    In each partition, accordingly we generate 6 new features, as shown below,
to represent the trend pattern:
 1. yChange: the difference between the value at the end of a trend and the
    value at the start of a trend

                  yChange = y(end of trend) − y(start of trend)

 2. absYChange: the absolute value of the yChange
 3. slope: the slope of the trend
 4. sign: the direction of the trend
 5. NumofMeasure: the number of values in a partition
 6. FreqofMeasure: the average time interval between measurements, i.e.

                  FreqofMeasure = Trend Length / NumofMeasure

   The initial feature set comprises the physiological values and their trend
patterns. We apply logistic regression to classify good/bad stroke outcomes
based on this initial feature set.
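
The six trend features listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the paper does not say where the 48-hour window is split, how the slope is estimated, or how the sign is encoded, so the sketch assumes a split at 24 hours, an end-to-end slope (yChange divided by the time between the first and last measurement), and a sign in {-1, 0, 1}. The function names and the (hours, value) sample format are hypothetical.

```python
from typing import Dict, List, Tuple

Sample = Tuple[float, float]  # (hours since stroke, measured value)


def trend_features(samples: List[Sample]) -> Dict[str, float]:
    """Six trend features for one partition, from its first and last samples."""
    if len(samples) < 2:
        raise ValueError("a trend needs at least two measurements")
    samples = sorted(samples)  # order by time
    (t_start, y_start), (t_end, y_end) = samples[0], samples[-1]
    trend_length = t_end - t_start
    if trend_length <= 0:
        raise ValueError("measurements must span a positive time interval")
    y_change = y_end - y_start
    n = len(samples)
    return {
        "yChange": y_change,
        "absYChange": abs(y_change),
        # Assumption: end-to-end slope; a least-squares fit is another option.
        "slope": y_change / trend_length,
        # Assumption: direction encoded as -1 (falling), 0 (flat), 1 (rising).
        "sign": (y_change > 0) - (y_change < 0),
        "NumofMeasure": n,
        "FreqofMeasure": trend_length / n,
    }


def bipartition_features(samples: List[Sample], split_hour: float = 24.0):
    """Split the 48-hour record in two (assumed at 24 h) and featurise each half."""
    first = [s for s in samples if s[0] < split_hour]
    second = [s for s in samples if s[0] >= split_hour]
    return trend_features(first), trend_features(second)
```

For one patient, the function would be called once per physiological parameter, giving 12 trend features per parameter on top of the statistical values.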

3.2   Logistic Regression Classifier
In statistics, logistic regression is a type of regression analysis used for
predicting the outcome of a binary dependent variable (a variable which can take
only two possible values, e.g. “yes” vs. “no” or “success” vs. “failure”) based
on one or more predictor variables, which may be either continuous or
categorical. Unlike ordinary linear regression, logistic regression predicts
binary rather than continuous outcomes. Here we use it to predict the outcome of
stroke (“good” vs. “bad”) based on the features in our initial feature set.
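
For reference, the standard form of the logistic regression model can be written as follows. The symbols are our own notation, not taken from the paper: x_1, ..., x_d are the features of one patient, beta_0, ..., beta_d are the fitted coefficients, and y = 1 denotes one of the two outcome groups. The 0.5 decision threshold is a common default and an assumption here.

```latex
\[
  P(y = 1 \mid x) \;=\; \frac{1}{1 + \exp\!\bigl(-(\beta_0 + \sum_{j=1}^{d} \beta_j x_j)\bigr)},
  \qquad
  \hat{y} \;=\;
  \begin{cases}
    1 & \text{if } P(y = 1 \mid x) \ge 0.5,\\
    0 & \text{otherwise.}
  \end{cases}
\]
```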
    To obtain an unbiased assessment of classifier performance, we adopt the
Leave-One-Out Cross validation technique. For a data set of N subjects, the
technique performs N runs. Each run withholds one subject from the training set
for later testing, and the classifier is trained using the remaining N-1
subjects. The withheld subject is then presented to the trained classifier for
classification.
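
The leave-one-out procedure can be sketched as below. This is an illustration only: the paper does not describe how the logistic regression is fitted, so the sketch assumes plain batch gradient descent on the log-loss with hypothetical settings (learning rate 0.1, 500 epochs) and a 0.5 decision threshold. All function names are ours. Real feature vectors would normally be standardised first.

```python
import math
from typing import List


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X: List[List[float]], y: List[int],
                 lr: float = 0.1, epochs: int = 500) -> List[float]:
    """Return weights [bias, w1, ..., wd] fitted by batch gradient descent."""
    d, n = len(X[0]), len(X)
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def predict(w: List[float], x: List[float]) -> int:
    """Classify as 1 when the predicted probability is at least 0.5."""
    p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))
    return 1 if p >= 0.5 else 0


def loocv_accuracy(X: List[List[float]], y: List[int]) -> float:
    """Withhold each subject once, train on the other N-1, test on the one."""
    hits = 0
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        w = fit_logistic(X_train, y_train)
        hits += predict(w, X[i]) == y[i]
    return hits / len(X)
```

Because every subject is tested exactly once by a model that never saw it, the resulting accuracy is not inflated by training-set memorisation, which is the property the text relies on.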


           3.3   Final feature set selection

           We use two greedy search strategies to find the best feature subset that can
           achieve highest prediction accuracy. Specifically, we use backward search and
           forward search:

           backward search : A greedy backward search is performed to identify a
           near-optimal subset of features. Starting with all features, the feature whose
           removal improves prediction accuracy the most (or decreases it the least) is
           removed from the current set, and the remaining features are retained as an
           intermediate feature subset. This is repeated until all features have been
           removed. The intermediate feature subset that provides the maximum performance
           among all subsets evaluated is selected as the final feature set.
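
A minimal sketch of this greedy backward search follows. The `score` callable is a hypothetical stand-in for whatever accuracy estimate is used (for example, leave-one-out accuracy of the classifier on a feature subset). The sketch stops at a single remaining feature, on the assumption that an empty feature set cannot be scored.

```python
from typing import Callable, FrozenSet, Iterable, Tuple

Score = Callable[[FrozenSet[str]], float]


def backward_search(features: Iterable[str],
                    score: Score) -> Tuple[FrozenSet[str], float]:
    """Greedy backward elimination; returns the best intermediate subset."""
    current = frozenset(features)
    best_subset, best_score = current, score(current)
    while len(current) > 1:
        # Remove the feature whose removal leaves the highest-scoring subset.
        candidates = [current - {f} for f in sorted(current)]
        current = max(candidates, key=score)
        current_score = score(current)
        # Remember the best intermediate subset seen so far.
        if current_score > best_score:
            best_subset, best_score = current, current_score
    return best_subset, best_score
```

Each round evaluates one candidate per remaining feature, so a pool of k features needs on the order of k squared score evaluations rather than the 2^k required by exhaustive search.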

           forward search : A sequential forward floating search algorithm is used for
           feature selection, in an attempt to discover the optimal subset of features
           from the pool of available candidate features. This strategy begins with a
           forward-selection step, which selects from the pool the single feature that
           improves prediction accuracy the most. After this selection, removal of a
           feature from the set of selected features is considered. The process of
           possible feature addition, followed by possible feature removal, is iterated
           until the selected feature set converges.
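
The add-then-remove loop can be sketched as follows. This is a simplified variant of sequential forward floating search, not necessarily the exact algorithm used in the paper: it accepts an addition or a removal only when the score strictly improves, and treats "no improving addition" as convergence. As before, `score` is a hypothetical accuracy estimate supplied by the caller.

```python
from typing import Callable, FrozenSet, Iterable

Score = Callable[[FrozenSet[str]], float]


def forward_floating_search(features: Iterable[str],
                            score: Score) -> FrozenSet[str]:
    """Simplified floating search: add the best feature, then try removals."""
    pool = frozenset(features)
    selected: FrozenSet[str] = frozenset()
    best = float("-inf")
    while True:
        # Forward step: add the single feature that gives the best score.
        additions = [selected | {f} for f in sorted(pool - selected)]
        if not additions:
            break
        candidate = max(additions, key=score)
        if score(candidate) <= best:
            break  # no addition improves the score: converged
        selected, best = candidate, score(candidate)
        # Floating step: drop features while that strictly improves the score.
        while len(selected) > 1:
            reduced = max((selected - {f} for f in sorted(selected)), key=score)
            if score(reduced) <= best:
                break
            selected, best = reduced, score(reduced)
    return selected
```

Since the score must strictly increase at every accepted change and the number of subsets is finite, the loop always terminates. The removal step is what lets the search discard an early pick that later turns out to be redundant.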


           4     Empirical Study

           In this section, we report experimental results from testing our prediction
           method on a real data set of stroke patients. First, we introduce the
           physiological data sets of stroke patients and the good/bad criteria used in our
           study. Then we report prediction accuracy based on various combinations of
           feature sets. Our study was approved by an ethics committee of the related
           institution.


           4.1   Experimental data sets

           A cohort of 157 patients with acute ischaemic stroke was recruited. Patients
           presenting to the Emergency Department of the Royal Brisbane and Women’s
           Hospital, an Australian tertiary referral teaching hospital, within 48 hours of
           stroke, or existing inpatients with an intercurrent stroke, were enrolled
           prospectively. Important physiological parameters, such as blood pressure, were
           recorded at least every 4 hours from the time of admission until 48 hours after
           the stroke. These values were used as the outcome variable in the analyses. The
           measurements from patients who died during these first 48 hours were also
           included in the analyses. Furthermore, some demographic and other stroke-related
           data, such as age and gender, were also collected. The age range of these 157
           patients was 16 to 92 years, with a median age of 75 years. The patient
           distribution over the values of RS3 is shown in Figure 2.


                   Fig. 2. Patient distributions on values of RS3


4.2   Classification criteria

As shown in Figure 2, the RS3 score varies between 0 and 6. RS3 = 6 means that
the subject is dead after three months, and RS3 = 0 means that the subject has
recovered well after three months. Based on RS3 values, patient outcomes can be
divided into good/bad groups according to different grouping criteria. Figure 3
illustrates patient distributions under three types of grouping criteria.
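
Turning RS3 scores into binary labels is a simple thresholding step, sketched below. The three grouping criteria themselves are defined in Figure 3 and are not restated in the text, so the cut-off is left as a parameter; the `max_good` name and the value used in the usage note are purely hypothetical.

```python
from typing import List


def dichotomise(rs3_scores: List[int], max_good: int) -> List[int]:
    """Label each patient 1 (good) if RS3 <= max_good, else 0 (bad)."""
    for s in rs3_scores:
        if not 0 <= s <= 6:
            raise ValueError(f"RS3 must be between 0 and 6, got {s}")
    return [1 if s <= max_good else 0 for s in rs3_scores]
```

For instance, with an illustrative cut-off of 2, `dichotomise([0, 2, 3, 6], max_good=2)` labels the first two patients as good and the last two as bad.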


4.3   Prediction accuracy comparisons

Applying the techniques described in Section 3, we run experiments on the
various grouping criteria to test our stroke outcome prediction algorithm. We
consistently observe that ‘backward search’ generates more accurate prediction
results, so we use it as our default feature set search strategy. Figure 4 shows
prediction accuracy comparisons under all three types of grouping criteria. In
Figure 5, we also evaluate the effectiveness of including trend patterns as
prediction features. The experiment shows that adding those simple trend
features consistently boosts prediction accuracy on all three grouping types
from 71% to 89-91%.


                Fig. 3. Good vs Bad outcomes under various criteria


              Fig. 4. Prediction Accuracy on different grouping criteria
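
The accuracy, precision and recall figures quoted in this paper follow the usual confusion-matrix definitions, which can be computed as in the sketch below. Which outcome group is treated as the positive class is not stated in the paper, so `positive=1` is an assumption, as is the function name.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int],
                           positive: int = 1) -> Dict[str, float]:
    """Accuracy, precision and recall from true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Precision: share of predicted positives that are truly positive.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        # Recall: share of true positives that the classifier finds.
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }
```

Under leave-one-out cross validation, `y_pred` would collect the prediction made for each withheld subject.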


           5   Conclusion

           In this paper, we describe novel algorithms to predict three-month stroke
           outcomes. We have quantified the substantial improvement brought by including
           physiological data trend patterns as features of a classifier. We believe that
           these trends play important roles in the three-month outcomes of stroke
           patients. The efficiency and accuracy of our algorithm have also been
           demonstrated through our experiments.


           [Figure 5: bar chart of prediction accuracy (vertical axis from 60% to 100%)
           for grouping criteria Type 1, Type 2 and Type 3, comparing
           features(values, trends) with features(values).]

           Fig. 5. Prediction accuracy improved by adding trend features


    In our future work, we will first try to locate the most important trend
patterns for stroke outcome prediction. Then we will work with healthcare
professionals to find the clinical ground truth beneath those physiological trend
patterns of stroke patients. This will greatly benefit clinical treatment of
acute ischemic stroke. We also plan to run clinical trials to validate our
prediction methods on other real data sets of stroke patients.


References
[1] Australian Institute of Health and Welfare: Australia's health 2006, the tenth
    biennial health report of the Australian Institute of Health and Welfare. ISBN 1
    74024 565 2. 2006
[2] The World Health Organization MONICA Project (monitoring trends and de-
    terminants in cardiovascular disease): a major international collaboration. WHO
    MONICA Project Principal Investigators. Journal of Clinical Epidemiology.
    1988;41(2):105-14.
[3] Ye, L., Keogh, E.: Time series shapelets: a new primitive for data mining. Proceed-
    ings of the 15th ACM SIGKDD international conference on Knowledge discovery
    and data mining, Vol. 22, ACM, Paris, France, pp. 947–956.
[4] Mueen, A., Keogh, E.,Young, N.: Logical-shapelets: an expressive primitive for
    time series classification. Proceedings of the 17th ACM SIGKDD international
    conference on Knowledge discovery and data mining, Vol. 22, ACM, San Diego,
    California, USA, pp. 1154–1162.
[5] Wong, A.: The Natural History and Determinants of Changes in Physiological Vari-
    ables after Ischaemic Stroke. Ph.D. Thesis, The University of Queensland, St.Lucia.
[6] Oliveira-Filho, J., Silva, S.C.S., Trabuco, C.C., Pedreira, B.B., Sousa, E.U., Bacel-
    lar, A.: Detrimental effect of blood pressure reduction in the first 24 hours of acute
    stroke onset. Neurology. 61(8), 1047–1051.
[7] Marti-Fabregas, J., Belvis, R., Guardia, E., Cocho, D., Munoz, J., Marruecos, L.,
    Marti-Vilalta, J.-L.: Prognostic value of Pulsatility Index in Acute Intracerebral
    Hemorrhage. Neurology. 61(8), 1051–1056.



           [8] Dawson, S.L., Manktelow, B.N., Robinson, T.G., Panerai, R.B., Potter, J.F.: Which
               Parameters of Beat-to-Beat Blood Pressure and Variability Best Predict Early
               Outcome After Acute Ischemic Stroke. Stroke. 2000(31), 463–468.
           [9] Ritter, M.A., Kimmeyer, P., Heuschmann, P.U., Dziewas, R., Dittrich, R., Nabavi,
               D.G., Ringelstein, E.B.: Blood Pressure Threshold Violations in the First 24 Hours
               After Admission for Acute Stroke: Frequency, Timing, Predictors, and Impact on
               Clinical Outcome. Stroke. 2009(40), 462–468.
           [10] Castillo, J., et al., Blood pressure decrease during the acute phase of ischemic
               stroke is associated with brain injury and poor stroke outcome. Stroke, 2004. 35(2):
               p. 520-6.
           [11] Ahmed, N., P. Nasman, and N.G. Wahlgren, Effect of intravenous nimodipine on
               blood pressure and outcome after acute stroke. Stroke, 2000. 31(6): p. 1250-5.
           [12] Yong, M. and M. Kaste, Association of characteristics of blood pressure profiles
               and stroke outcomes in the ECASS-II trial. Stroke, 2008. 39(2): p. 366-72.
           [13] Wong AA, Schluter PJ, Henderson RD, O’Sullivan JD, Read SJ. The natural
               history of blood glucose within the first 48 hours after ischemic stroke. Neurology
               2008;70:1036-41.
           [14] Christensen, H., A. Fogh Christensen, and G. Boysen, Abnormalities on ECG
               and telemetry predict stroke outcome at 3 months. J Neurol Sci, 2005. 234(1-2): p.
               99-103.
           [15] Boysen, G. and H. Christensen, Stroke severity determines body temperature in
               acute stroke. Stroke, 2001. 32(2): p. 413-7.
           [16] Gujjar AR, Sathyaprabha TN, Nagaraja D, Thennarasu K and Pradhan N, Heart
               rate variability and outcome in acute severe stroke: role of power spectral analysis.
               Neurocrit Care, 2004. 1(3): p. 347-53.
           [17] Christensen, H., A. Fogh Christensen, and G. Boysen, Abnormalities on ECG and
               telemetry predict stroke outcome at 3 months. J Neurol Sci, 2005. 234(1-2): p.
               99-103.
           [18] Rankin J (May 1957). Cerebral vascular accidents in patients over the age of 60.
               II. Prognosis. Scott Med J 2 (5): 200-15.

