<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Challenges for the Similarity-Based Comparison of Human Physical Activities Using Time Series Data</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Tomasz Szczepanski</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kerstin Bach</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Agnar Aamodt</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer and Information Science, Norwegian University of Science and Technology</institution>
          ,
          <addr-line>Trondheim</addr-line>
          ,
          <country country="NO">Norway</country>
        </aff>
      </contrib-group>
      <fpage>173</fpage>
      <lpage>177</lpage>
      <abstract>
        <p>In this position paper we present various aspects of comparing human physical activities using time series data within the selfBACK project. The goal of the project is to develop a decision support system for patients suffering from non-specific low back pain. The system will give users advice in the form of a self-management plan that is based on self-reported physical and psychological symptoms as well as activity stream data collected by a wristband. Here, we discuss the difficulties of representing activity streams as well as various challenges that arise when comparing the resulting activity streams.</p>
      </abstract>
      <kwd-group>
        <kwd>Case-Based Reasoning</kwd>
        <kwd>Time Series Representation</kwd>
        <kwd>Data Streams</kwd>
        <kwd>Similarity Assessment</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
selfBACK [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] is a research project developing a decision support system that
aims to help patients facilitate, improve and reinforce self-management
of non-specific low back pain. The selfBACK system will be a data-driven,
predictive decision support system that uses the Case-Based Reasoning
(CBR) methodology to capture and reuse patient cases in order to suggest the
most suitable activity goals and plans tailored to an individual patient. This
will be based on data from two different types of sources. One is a questionnaire
presented to the patient at suitable intervals in order to capture self-reported
general information and progress of symptoms. The other is a stream of activity
data collected using a wristband. The incoming data will be analyzed to classify
the patient's current state and recent activities, and matched against past cases
in order to derive follow-up advice for the patient.
      </p>
      <p>Two of the many challenges in matching and similarity assessment with
the CBR methodology are developing a suitable abstraction of the wristband
activity stream and an adequate comparison algorithm for such streams. The
abstraction and the comparison algorithm must capture the important aspects
of a stream and compare them in a way that is meaningful in the non-specific
low back pain domain.</p>
      <p>In the next section we briefly present the different aspects and challenges
that need to be considered in order to develop such a comparison. Specifically,
we focus on the challenges arising when comparing the activity streams of
patients.</p>
    </sec>
    <sec id="sec-2">
      <title>Challenges</title>
      <p>The main goal of the selfBACK project is to give non-specific low back pain
patients personalized advice based on their reported symptoms and activity
patterns collected by a wristband. Three major challenges arise when comparing the
activity streams:
1. The activity stream needs to contain the knowledge needed and used for the
treatment of non-specific low back pain.
2. The comparison of such streams must be made in a way that is meaningful
when comparing patients with non-specific low back pain.
3. The comparison must cope with activity streams that have missing data. The
wristbands have limited battery life and are often not waterproof, so they need
to be taken off during showering or swimming. Users might also find it
uncomfortable to wear the wristbands continuously and loosen the strap or take
breaks in wearing them, which might introduce gaps in the recordings.</p>
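      <p>One way to make such gaps explicit before any comparison is sketched below. It assumes that a stream is a list of (timestamp in seconds, activity label) samples recorded at a fixed interval; the function name <monospace>fill_gaps</monospace>, the gap symbol "?" and the single-letter labels are illustrative assumptions and not part of the selfBACK system.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def fill_gaps(samples, step=1, gap_symbol="?"):
    """Return labels on a regular time grid, marking missing slots.

    samples: list of (timestamp_seconds, label), sorted by timestamp.
    step: expected spacing between consecutive samples, in seconds.
    gap_symbol: placeholder label for slots without a recording (assumed).
    """
    if not samples:
        return []
    filled = [samples[0][1]]
    for (prev_t, _), (t, label) in zip(samples, samples[1:]):
        missing = (t - prev_t) // step - 1  # slots with no recording
        filled.extend([gap_symbol] * max(missing, 0))
        filled.append(label)
    return filled


# Hypothetical recording: the wristband was taken off between t=2 and t=6.
stream = [(0, "W"), (1, "W"), (2, "S"), (6, "S"), (7, "W")]
print("".join(fill_gaps(stream)))
```
]]></preformat>
      <p>With the gaps marked by a dedicated symbol, a comparison can decide explicitly whether to skip, penalize or interpolate the missing slots instead of silently treating two distant samples as adjacent.</p>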
      <p>
        Because these problems are novel, little is known about what
to include in such activity streams and how to compare them. There is some evidence
that sleeping patterns and prolonged periods of inactivity are correlated with an
increase in non-specific low back pain [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. There is, however, very limited
data on which activity types, or intensities of activity types, are beneficial or harmful
for people with non-specific low back pain.
      </p>
      <p>The activity streams that we have experimented with can contain activity
data from periods of up to several weeks. This is because the effect of a change
in behaviour that follows from some advice often becomes visible only after
about 4-6 weeks. Including all kinds of information in an activity stream (type
of activity, intensity, skin temperature, pulse, moisture of the skin, etc.)
increases the computational cost of comparing the resulting activity
streams against each other. See Figure 1 for examples of abstractions that can
be included.</p>
      <p>We have also experimented with different resolutions for our activity streams.
We are able to capture and transform data into windows of, for example, 1 second
or 1 hour. Decreasing the resolution lowers the computational
cost of a comparison, but important data may be lost during the transformation.
Different resolutions can be combined throughout the activity stream to limit
this loss of information. For example, a bigger window can be used during prolonged
periods of inactivity or sleep. On the other hand, during periods of activity where
the intensity or activity type changes frequently, a smaller window can be used
to capture and conserve that information.</p>
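      <p>The idea of mixing resolutions can be illustrated with a small sketch, assuming the stream is available as one activity label per second. Stable stretches are collapsed into large windows, while stretches with frequent changes keep their fine resolution. The function name <monospace>adaptive_windows</monospace>, the <monospace>max_window</monospace> parameter and the labels are hypothetical and chosen for illustration only.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from itertools import groupby


def adaptive_windows(labels, max_window=3600):
    """Collapse runs of identical labels into (label, duration) windows.

    labels: per-second activity labels, e.g. "ZZZZWRW".
    max_window: longest allowed window in seconds; longer runs are split.
    """
    windows = []
    for label, run in groupby(labels):
        remaining = sum(1 for _ in run)
        while remaining > 0:
            size = min(remaining, max_window)
            windows.append((label, size))
            remaining -= size
    return windows


# Hypothetical stream: two hours of sleep ("Z"), then rapidly changing activity.
stream = "Z" * 7200 + "WRWSW"
print(adaptive_windows(stream))
```
]]></preformat>
      <p>In this example 7205 one-second samples shrink to seven windows: two one-hour windows for the sleep period and five one-second windows for the changing activities. A comparison then has to take the window durations into account.</p>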
      <p>Other ideas we have been experimenting with include comparing the activity
streams based on activity intensity rather than activity type. Clinicians
with experience in the field of non-specific low back pain have suggested that, instead
of comparing the streams directly, we compare sleeping patterns and periods
of prolonged inactivity. Comparing activity streams in this way produces very
different results from comparing activity streams that contain activity types.
This is shown in Figure 2.</p>
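      <p>A reduced representation of this kind could be sketched as follows, assuming one label per time window and assuming that "S" (sitting) and "L" (lying) count as inactivity. The function name <monospace>inactivity_bouts</monospace>, the label set and the threshold are illustrative assumptions, not clinical definitions.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from itertools import groupby


def inactivity_bouts(labels, inactive=frozenset("SL"), min_length=3):
    """Return (start_index, length) for each prolonged inactive stretch.

    labels: one activity label per time window.
    inactive: labels counted as inactivity (assumed: S=sitting, L=lying).
    min_length: minimum number of consecutive windows to count as prolonged.
    """
    bouts = []
    position = 0
    for is_inactive, run in groupby(labels, key=lambda c: c in inactive):
        length = sum(1 for _ in run)
        if is_inactive and length >= min_length:
            bouts.append((position, length))
        position += length
    return bouts


# Two hypothetical streams with different activity types but similar rest.
print(inactivity_bouts("WWSSSSWRSLLLW"))
print(inactivity_bouts("RRLLLLRWSSSSR"))
```
]]></preformat>
      <p>Both example streams yield the same two bouts, so they are identical under this representation although their activity types differ at most positions.</p>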
      <p>
        As for the comparison algorithms themselves, we are currently experimenting
with a string-based approach. We convert the activity stream into a string of
characters. When comparing two strings, we compute the percentage of each
character per string and the number of consecutive sequences of those characters.
We then also compute the longest common sub-sequence, the sequence distance,
the number of similar k-mers and the number of unique similar k-mers. The
sequence distance is the Levenshtein distance, which is the least number of
single-character insertions, deletions or substitutions required to transform one
string into the other [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. A k-mer is a consecutive substring of length k. All these
attributes are combined into a final similarity metric.
      </p>
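      <p>The individual string attributes can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the selfBACK implementation: "similar k-mers" is read here as k-mers shared by both strings, the longest common sub-sequence is taken in the non-contiguous sense, character shares are returned as fractions, and the single-letter encoding is hypothetical. How the attributes are weighted in the final metric is not specified in the text and is therefore left out.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import Counter
from itertools import groupby


def char_percentages(s):
    """Share of each character in s, as fractions summing to 1."""
    return {c: n / len(s) for c, n in Counter(s).items()} if s else {}


def run_counts(s):
    """Number of consecutive runs per character; 'AABA' has 2 runs of A."""
    return Counter(c for c, _ in groupby(s))


def levenshtein(a, b):
    """Least number of single-character insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]


def lcs_length(a, b):
    """Length of the longest common (non-contiguous) subsequence."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if ca == cb else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def kmers(s, k):
    """Multiset of all consecutive substrings of length k."""
    return Counter(s[i:i + k] for i in range(len(s) - k + 1))


def shared_kmers(a, b, k):
    """Return (shared k-mers with multiplicity, distinct shared k-mers)."""
    common = kmers(a, k) & kmers(b, k)
    return sum(common.values()), len(common)


# Hypothetical encoding: S=sitting, W=walking, R=running.
s1, s2 = "SSWWRSS", "SSWRRSS"
print(levenshtein(s1, s2), lcs_length(s1, s2), shared_kmers(s1, s2, 2))
```
]]></preformat>
      <p>For the two example strings, which differ in a single position, the sketch prints a Levenshtein distance of 1, a longest common sub-sequence of length 6, and 5 shared 2-mers of which 4 are distinct.</p>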
    </sec>
    <sec id="sec-3">
      <title>Current status and Outlook</title>
      <p>We have presented and discussed various challenges that arise when comparing
activity streams in the selfBACK project.</p>
      <p>Our current work focuses on a representation and comparison of the activity
stream data that is meaningful in the non-specific low back pain domain. We are
investigating which time resolution is beneficial and which information should
be included in the activity streams. We are also exploring whether and how
domain knowledge from non-specific low back pain experts can be incorporated.
As shown in Figure 2, the representation of the activity streams can differ a
lot depending on which information is most relevant (alteration of activities in
S1 and S2 vs. the reduction of prolonged inactivity in S3). Therefore we are
working on how the representation affects the similarity comparison of activity
streams collected from real patients.</p>
      <p>Our next steps include investigating whether rearranging the activity
stream could yield a more precise comparison. To this end, we are investigating whether
parts of the activity stream are more important than others (for example sleep
patterns). When the data collection from the planned randomized controlled trial
is complete, it will also be possible to mine the collected activity streams for
behavioral patterns that contribute to an increase or a decrease in non-specific
low back pain. For example, we could discover that a certain amount of time
standing followed by a certain amount of time sitting increases non-specific
low back pain. Finally, we will conduct experiments on how much weight the
activity stream comparison should have compared to the self-reported
questionnaire data from the patients.</p>
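      <p>The weighting question can be made concrete with a weighted average of two similarity scores, a common form of global similarity in CBR. The sketch below assumes that both the stream similarity and the questionnaire similarity are already normalized to the range 0 to 1. The function name <monospace>combined_similarity</monospace> and the example values are hypothetical; suitable weights are what the planned experiments would have to determine.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def combined_similarity(stream_sim, questionnaire_sim, stream_weight=0.5):
    """Weighted average of two similarities, each assumed to lie in [0, 1]."""
    if not 0.0 <= stream_weight <= 1.0:
        raise ValueError("stream_weight must lie in [0, 1]")
    return stream_weight * stream_sim + (1.0 - stream_weight) * questionnaire_sim


# Sweep the weight to see how the overall score shifts between the two sources.
for w in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(w, combined_similarity(0.8, 0.4, stream_weight=w))
```
]]></preformat>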
      <p>Acknowledgement. The work has been conducted as part of the selfBACK
project, which has received funding from the European Union's Horizon 2020
research and innovation programme under grant agreement No 689043.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Alsaadi</surname>
            ,
            <given-names>S.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McAuley</surname>
            ,
            <given-names>J.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hush</surname>
            ,
            <given-names>J.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lo</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>C.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Williams</surname>
            ,
            <given-names>C.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Maher</surname>
            ,
            <given-names>C.G.</given-names>
          </string-name>
          :
          <article-title>Poor sleep quality is strongly associated with subsequent pain intensity in patients with acute low back pain</article-title>
          .
          <source>Arthritis &amp; Rheumatology</source>
          <volume>66</volume>
          (
          <issue>5</issue>
          ),
          <fpage>1388</fpage>
          -
          <lpage>94</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Bach</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Szczepanski</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Aamodt</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gundersen</surname>
            ,
            <given-names>O.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mork</surname>
            ,
            <given-names>P.J.</given-names>
          </string-name>
          :
          <article-title>Case Representation and Similarity Assessment in the selfBACK Decision Support System</article-title>
          .
          <source>ICCBR 2016</source>
          (accepted for publication)
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Gupta</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Christiansen</surname>
            ,
            <given-names>C.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hallman</surname>
            ,
            <given-names>D.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Korshøj</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Carneiro</surname>
            ,
            <given-names>I.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Holtermann</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Is Objectively Measured Sitting Time Associated with Low Back Pain? A Cross-Sectional Investigation in the NOMAD study</article-title>
          .
          <source>PLoS ONE</source>
          <volume>10</volume>
          (
          <issue>3</issue>
          ) (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Navarro</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>A guided tour to approximate string matching</article-title>
          .
          <source>ACM Computing Surveys</source>
          <volume>33</volume>
          (
          <issue>1</issue>
          ),
          <fpage>31</fpage>
          -
          <lpage>88</lpage>
          (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>van Tulder</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Becker</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bekkering</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Breen</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>del Real</surname>
            ,
            <given-names>M.T.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hutchinson</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Koes</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Laerum</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Malmivaara</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Chapter 3. European guidelines for the management of acute nonspecific low back pain in primary care</article-title>
          .
          <source>European Spine Journal</source>
          <volume>15</volume>
          (
          <issue>2</issue>
          ),
          <fpage>s169</fpage>
          -
          <lpage>s191</lpage>
          (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>