=Paper=
{{Paper
|id=Vol-1538/paper-08
|storemode=property
|title=CORREDOR, A mobile Human-Centric Sensing System for Activity Recognition
|pdfUrl=https://ceur-ws.org/Vol-1538/paper-08.pdf
|volume=Vol-1538
|authors=Luis Jaimes,Idalides Vergara-Laurens
|dblpUrl=https://dblp.org/rec/conf/latincom/JaimesV15
}}
==CORREDOR, A mobile Human-Centric Sensing System for Activity Recognition==
<pdf width="1500px">https://ceur-ws.org/Vol-1538/paper-08.pdf</pdf>
<pre>
                                          7th Latin American Workshop On Communications - 2015


         CORREDOR, A mobile Human-Centric Sensing System for Activity
                              Recognition

                                           Luis G. Jaimes∗ and Idalides J. Vergara-Laurens†
        ∗ College of Science, Engineering, and Mathematics, Bethune-Cookman University, Daytona-Beach, FL, 32114

                                                           Email: jaimesl@cookman.edu
         † Department of Electrical and Computer Engineering - Universidad del Turabo, Gurabo, Puerto Rico, 00778

                                                            Email: ivergara@suagm.edu


   Abstract: This paper presents Corredor, a human-centric sensing system that encourages
people's physical activity. The main objective of Corredor is to help people who suffer from
obesity during the workouts that form part of their treatment. Corredor uses the phone's
embedded sensors along with machine learning algorithms to recognize human activities such as
running, walking, and standing. Corredor runs entirely on the user's phone and does not
require any external server processing. In addition, Corredor displays on the screen the
route followed by the user, indicating the segments where the user was running, walking, or
standing. The system computes a set of 64 features from real-time accelerometer data using a
5-second sliding window with 50% overlap. The computed features are used to train a C4.5
decision tree, which in turn is used to recognize workout activities. After system
evaluation, our results show that Corredor achieves up to 93.7% overall accuracy. Finally,
the application saves the historical data and is able to show them using Google Maps.

                                   I. INTRODUCTION
   Advancements in pervasive computing are rapidly changing preventative healthcare. Under
the status quo, the average healthy individual visits the doctor rarely, perhaps just once a
year. The doctor assesses the patient and then may prescribe medications and recommend
behavior changes (reduce fat consumption, exercise more, etc.). One year later, the patient
returns and this process is repeated. In the emerging new model of health care, the patient
carries sensors that monitor health in real time, as the patient goes about normal daily life
[7], [8], [10], [15], [18], [20]. A smart phone and cloud-based services assess monitored
data at a much higher frequency (on the order of minutes or seconds, if needed). Here
patients play a more significant role in the management of their health. The idea is to build
personal health systems that are designed for use by the patient rather than the doctor, and
that are ubiquitous, meaning anywhere-anytime interaction with one's health via mobile
devices.
   Physical activity is considered a preventive mechanism to avoid and control problems such
as obesity and psychological stress. Both are well-known issues in public health. Obesity is
a leading cause of death worldwide, with increasing prevalence in adults and children.
Obesity-related conditions include heart disease, stroke, type 2 diabetes, and certain types
of cancer. Medical costs associated with obesity were estimated at $147 billion; the medical
costs for people who are obese were $1,429 higher than those of normal weight [11]–[14],
[21].
   Taking these facts into consideration, in this paper we present Corredor, a human-centric
sensing system for activity tracking and recognition with application in preventive health.
The main idea is to employ persuasive and behavioral techniques to keep the patient engaged
and motivated to meet health goals.
   Corredor is a mechanism that allows people to track their workout progress using smart
phones, which has potential application in mHealth. Given that people use their phones on a
daily basis and carry them almost everywhere, this is a promising technology that could
potentially help solve this health epidemic. However, raw sensor data are not sufficient to
identify people's behavior. One of the key challenges in creating useful and robust
ubiquitous applications is context detection from noisy and often ambiguous sensor data [5].
Thus, the proposed mechanism has two stages: training and testing. The first allows the
application to learn the relation between sensor data and a person's activities, since
different people run and walk in different ways, generating different acceleration signals
[16]. The testing stage identifies a person's activities using a feature extraction algorithm
in the frequency and time domains.
   Our application allows users to track their running, walking, or standing activities. The
system has two modules: the activity recognition module and the visualization module. The
first recognizes the performed activities and reports them, with their duration, to the user.
The second uses the phone's GPS and WiFi sensors to collect outdoor and indoor location data,
and allows users to review the route followed during a workout, showing the segments spent
running, walking, and standing. This feature allows users to plan their route in terms of
goals during their workout.
   The rest of the paper presents the related work, followed by the system description, the
experimental settings, and the results. Finally, the conclusions are presented along with
some considerations for future research in this area.

                                   II. RELATED WORK
   The rapid development of mobile devices equipped with very accurate sensors (e.g.,
accelerometers, cameras, GPS, etc.) has facilitated the process of collecting data about
individuals

Copyright © 2015 for the individual papers by the papers’ authors. Copying permitted for private and academic purposes. This volume is published and
copyrighted by its editors. Latin American Workshop On Communications' 2015 Arequipa, Peru Published on CEUR-WS: http://ceur-ws.org/Vol-1538/
and their surroundings. In addition, external sensors equipped with communication
capabilities are available, which allows their integration with other mobile devices within
Personal Area Networks (PANs) or Body Area Networks (BANs) [16]. For instance, the Scosche
Rhythm Bluetooth Armband Pulse Monitor is a device that measures the heartbeat and transmits
it to an Android application; this application monitors the calories burned during the
person's workout [9].
   On the other hand, human activity recognition has become a useful tool for military,
security, and, especially, medical applications [17]. In this last area, for example, people
suffering from diabetes, obesity, or heart disease often need to be monitored during their
treatment.
   Although several applications have been proposed for human activity recognition using
smart phones, many of them require additional devices such as external straps that the
patient must wear in order to sense data. This is the case of Centinela, which requires the
BioHarness™ BT chest sensor strap manufactured by Zephyr [4]. On the other hand, there exist
several options in the Android market that track a user's exercise and running routine. A few
of the most well-known products are Nike+ [2], Runkeeper [3], and Ghost Race Pro [1].
However, within these applications, the user is required to manually activate tracking and
specify the intensity level of the activity. Our proposal is different because it introduces
online activity recognition. This recognition technology is unique in that it activates
automatically, whereas the commercial products available today must be turned on manually.
Some advantages of this approach include convenience, accuracy, and privacy.

                               III. SYSTEM DESCRIPTION
   We designed an Android application that allows users to track their running, walking, or
standing activities. Users can choose whether to input data manually or to use the automatic
recognition module. Recognition can run automatically all day long or be activated manually;
see Figure 1.

                               Fig. 1.   Main Portal

   The system is organized in two main modules: the activity recognition module and the
visualization module. Corredor's activity recognition module is in turn subdivided into two
modules: the collector module and the classification module. The collector application
collects ground truth data, which is used by the classification module to build the
classifier that is later used for activity recognition. The visualization module uses the
phone's GPS and WiFi sensors to collect outdoor and indoor location data. This data is stored
in the phone's database and presented to the user using the Google Maps API. Figure 2 shows
Corredor's main modules and their interrelationships. The following are the main elements of
Corredor.

                               Fig. 2.   System architecture

                               Fig. 3.   Collector application

A. Data collection
   We created an Android application for data collection. The application uses the phone's
accelerometer for activity recognition and GPS for visualization. We collect the three values
associated with accelerometer data, namely the x, y, and z axes, at a sampling rate of 50 Hz.
On average, sensor values were received every 5-10 ms. The ground truth data collection was
performed by a single individual for running, walking, and still. For running and walking,
the phone was held in the hand in various positions to simulate possible real-life scenarios.
For sitting still, the phone was in the pocket and recorded data during normal desk work.
Figure 3 shows the collector application.

B. Feature extraction
   We compute a set of 64 features, 63 in the frequency domain and one in the time domain.
Every time we obtain a new (x, y, z) acceleration sample, we compute its magnitude m using
Equation 1:

                      m = \sqrt{x^2 + y^2 + z^2}                (1)

   We buffer 64 consecutive magnitudes, namely {m_0, ..., m_63}, and compute the Fast Fourier
Transform (FFT) of this buffer in order to form a new frequency vector with elements
{f_0, ..., f_63}. Finally, the last feature corresponds to
max_a = max{m_0, ..., m_63}, forming the feature vector {f_0, ..., f_63, max_a}.
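
   The feature computation described above can be sketched as follows. This is a minimal
illustration, not the authors' implementation: the function names are hypothetical, a plain
discrete Fourier transform (DFT) stands in for an optimized FFT routine (both yield the same
coefficients), and taking the absolute value of each complex coefficient is an assumption,
since the text does not say how the frequency elements are turned into real numbers.

```python
import cmath
import math


def magnitude(x, y, z):
    """Equation 1: Euclidean norm of one (x, y, z) acceleration sample."""
    return math.sqrt(x * x + y * y + z * z)


def dft_magnitudes(samples):
    """Plain O(n^2) DFT, used here in place of an FFT library call.

    Returns the absolute value of each coefficient (an assumption; the
    paper does not state how the complex values are reduced).
    """
    n = len(samples)
    result = []
    for k in range(n):
        coeff = sum(
            samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)
        )
        result.append(abs(coeff))
    return result


def feature_vector(buffer):
    """Build {f_0, ..., f_63, max_a} from a buffer of 64 (x, y, z) samples."""
    mags = [magnitude(x, y, z) for (x, y, z) in buffer]
    return dft_magnitudes(mags) + [max(mags)]
```

   For a phone at rest reporting a constant (0, 0, 1) reading, all of the energy lands in the
first frequency element and the remaining frequency elements are numerically zero.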
                               Fig. 6.   J48 classifier

   The data was divided into five-second time windows. We implemented sliding time windows
that overlap by 50%, as shown in Figure 4. Sliding time windows are widely known to reduce
classification error during activity transitions.
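
   The overlapping windows can be sketched as a small generator. The function name and its
parameters are illustrative assumptions. With the 50 Hz sampling rate reported for data
collection, a five-second window would hold 250 samples and each new window would start 125
samples after the previous one.

```python
def sliding_windows(samples, window_size, overlap=0.5):
    """Yield consecutive windows of `window_size` samples.

    Each window shares the fraction `overlap` of its samples with the
    previous one. Trailing samples that do not fill a whole window are
    dropped.
    """
    step = int(window_size * (1 - overlap))
    if step < 1:
        raise ValueError("overlap leaves no room to advance the window")
    for start in range(0, len(samples) - window_size + 1, step):
        yield samples[start:start + window_size]
```

   For example, list(sliding_windows(list(range(10)), 4)) produces four windows starting at
samples 0, 2, 4, and 6.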


                  Fig. 4.   Overlapping time window


C. Classification
   Using the collection mechanism described in Section III-A, we build a ground truth file
with labeled features for the three activities, as shown in Figure 5.

                               Fig. 5.   Ground truth file

   We download the ground truth data from the phone and use Weka to generate a pruned J48
decision tree, as shown in Figure 6.
   The resulting classifier, namely the jar file, is included as a subroutine of the phone
application and used along with the FFT subroutine for classification in the production
stage, as shown in Figure 7.

                               Fig. 7.   Activity recognition process flow

D. Visualization
   We used GPS for outdoors and WiFi/antenna triangulation for indoors. We then broadcast the
inferred activities to the map application and mapped the GPS signals to the activities. As a
result we obtained the following function:
   The visualization module retrieves the inferred activities stored in the phone database,
as well as the location coordinates at that time, to generate a route map as shown in
Figure 8.

                                   IV. EVALUATION
   The accuracy of the classifier was evaluated using a customized form of stratified
ten-fold cross validation. Ten-fold cross validation randomly splits the data set into ten
equally sized subsets. The folds are stratified, which means each fold contains a
proportional amount of each class. For each fold, we train on the other nine folds and test
on the current fold, and average each fold's classification accuracy for a
total predicted accuracy. Table I presents the confusion matrix; the elements of the main
diagonal are significantly larger than the off-diagonal elements, showing a low level of
false positives and false negatives. Table II shows the detailed accuracy per class, and its
last line presents the weighted average over the three activities. Finally, Table III shows
the number of correctly and incorrectly classified instances as well as the computed
statistical error estimates, such as the mean absolute error.

                    Fig. 8.   Corredor's visualization interface

                                      TABLE I
                                 CONFUSION MATRIX

               Class         Still     Walking     Running
               Still          248         1           3
               Walking         1         232         19
               Running         5         22         225

                                      TABLE II
                            DETAILED ACCURACY BY CLASS

  Class          TP Rate   FP Rate   Precision   Recall   F-Measure   ROC Area
  Still           0.984     0.012      0.976     0.984      0.98       0.986
  Walking         0.921     0.046      0.91      0.921      0.915      0.95
  Running         0.893     0.044      0.911     0.893      0.902      0.935
  Weighted avg    0.933     0.034      0.932     0.933      0.932      0.957

                                      TABLE III
                        SUMMARY OF STATISTICAL ESTIMATORS

             Correctly classified instances              705
             Incorrectly classified instances             51
             Kappa statistic                            0.8988
             Mean absolute error                        0.051
             Root mean squared error                    0.2055
             Relative absolute error                   11.4796%

                                   V. FUTURE WORK
   In this work, we explore a preliminary approach to save energy based on a modification of
the popular C4.5 algorithm. The main idea behind this modification is to take into account
not only information gain as a criterion for branch partition but also energy consumption.
The following sections sketch the main components of our approach.

A. The Power-Aware Decision Tree Algorithm
   The Power-Aware Decision Tree algorithm (PAT) considers the sensors' power consumption
along with each feature's information gain in order to increase the accuracy of the activity
recognition process as well as the power efficiency. PAT is based on the popular C4.5
algorithm developed by Ross Quinlan, which greedily chooses splits on attributes to build a
decision tree by maximizing information gain [19].

B. PAT training stage
   C4.5 uses the concept of information entropy to calculate the level of uncertainty of an
attribute split and compares it with the information entropy without the split. The
Kullback-Leibler (KL) divergence (also known as information gain) is the difference between
those two information measures, and is used as the criterion to generate the splits while the
decision tree is being built. The KL divergence is a way of comparing two probability
distributions, and is defined as follows [6].
   Definition 1 (Kullback-Leibler Divergence): For two distributions q(x) and p(x):

      \mathrm{KL}(q|p) \equiv \langle \log q(x) - \log p(x) \rangle_{q(x)} \geq 0

   We introduce a new criterion for split selection that takes into account not only the KL
divergence, but also the knowledge of sensor power efficiencies. The main idea is to create a
tree that favors a combination of the most power efficient and the most informative
attributes. Table IV shows the weights assigned to each of the sensors that were used, with 1
being the least power efficient and 10 being the most power efficient. It is important to
note that in our experiments we did not assign realistic weights to the sensors; we assigned
these weights so that we could test the behavior of the algorithm. In actual applications,
these weights would correspond to the relative power efficiencies of the sensors.

                                      TABLE IV
   WEIGHTS. 1 MEANS LEAST POWER EFFICIENT AND 10 MEANS MOST POWER EFFICIENT.

      Accelerometer    Gyro    Gravity    Linear Acceleration    Rotation Vector
            2            1        10               4                    8
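
   To make the ingredients of such a criterion concrete, the sketch below computes entropy
and information gain for a candidate split and then blends the gain with a sensor weight from
Table IV. Only the entropy and information-gain functions follow standard definitions. The
function power_aware_score, the division of the weight by 10, and the use of theta as a
mixing factor are all assumptions made for illustration; they are not the authors' formula,
which this text does not give.

```python
import math
from collections import Counter

# Weights copied from Table IV (1 = least, 10 = most power efficient).
# The paper notes these are test values, not measured efficiencies.
SENSOR_WEIGHTS = {
    "accelerometer": 2,
    "gyro": 1,
    "gravity": 10,
    "linear_acceleration": 4,
    "rotation_vector": 8,
}


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(labels, groups):
    """Entropy before the split minus the size-weighted entropy after it.

    `groups` is the list of label lists that the split produces.
    """
    total = len(labels)
    remainder = sum(len(g) / total * entropy(g) for g in groups if g)
    return entropy(labels) - remainder


def power_aware_score(gain, sensor, theta):
    """HYPOTHETICAL blend of gain and power efficiency (not from the paper).

    theta = 1 uses information gain only; theta = 0 uses the weight only.
    """
    efficiency = SENSOR_WEIGHTS[sensor] / 10.0  # assumed normalization
    return theta * gain + (1.0 - theta) * efficiency
```

   Under these assumptions and theta = 0.5, a gyro attribute with gain 1.0 scores lower than
a gravity attribute with gain 0.5, which shows how such a criterion could trade
informativeness for power efficiency.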
   Like C4.5, PAT chooses splits by finding the attribute that maximizes the split
criterion. The split criterion is a linear combination of the Kullback-Leibler divergence and
the power efficiency of the attribute's associated sensor. We control the relative weights of
the KL divergence and the power efficiency with a parameter θ. This new split criterion S is
defined as follows:

                                   VI. CONCLUSIONS
   This paper presents Corredor, a human-centric sensing platform for human activity
recognition based upon human acceleration data. An extensive evaluation was performed for a
set of 64 features, a J48 decision tree, eight classification, and a 5-second sliding window
with 50% overlap. Overall, the mean accuracy achieved was 93.2%. This result supports the
hypothesis that an energy-efficient system based only on acceleration data is enough to reach
high levels of activity recognition accuracy.

                                     REFERENCES
 [1] Ghost race pro, https://play.google.com.
 [2] Nike +, https://secure-nikeplus.nike.com/plus/.
 [3] Runkeeper.
 [4] Centinela: A human activity recognition system based on acceleration and vital sign
     data. Pervasive and Mobile Computing, 8(5):717–729, 2012.
 [5] L. Bao and S.S. Intille. Activity recognition from user-annotated acceleration data. In
     Alois Ferscha and Friedemann Mattern, editors, Pervasive Computing, volume 3001 of
     Lecture Notes in Computer Science, pages 1–17. Springer Berlin Heidelberg, 2004.
 [6] D. Barber. Bayesian reasoning and machine learning. 2012.
 [7] Bibhas Chakraborty and Erica EM Moodie. Statistical Methods for Dynamic Treatment
     Regimes. Springer, 2013.
 [8] Kristin E Heron and Joshua M Smyth. Ecological momentary interventions: incorporating
     mobile technology into psychosocial and health behaviour treatments. British journal of
     health psychology, 15(1):1–39, 2010.
 [9] Scosche Industries. Scosche bluetooth armband pulse rate monitor. In
     http://www.scosche.com/rhythm.
[10] L. G. Jaimes, J. Calderon, J. Lopez, and A. Raij. Trends in mobile cyber-physical
     systems for health just-in time interventions. In Proceedings of the SoutheastCon 2015,
     pages 1–6, April 2015.
[11] L. G. Jaimes, A. Chakeri, J. Lopez, and A. Raij. A cooperative incentive mechanism for
     recurrent crowd sensing. In Proceedings of the SoutheastCon 2015, pages 1–5, April 2015.
[12] L.G. Jaimes, I. Vergara-Laurens, and M.A. Labrador. A location-based incentive mechanism
     for participatory sensing systems with budget constraints. In Proceedings of the 2012
     IEEE International Conference on Pervasive Computing and Communications (PerCom), pages
     103–108, March 2012.
[13] L.G. Jaimes, I. Vergara-Laurens, and A. Raij. A crowd sensing incentive algorithm for
     data collection for consecutive time slot problems. In Proceedings of the 2014 IEEE
     Latin-America Conference on Communications (LATINCOM), pages 1–5, Nov 2014.
[14] L.G. Jaimes, I.J. Vergara-Laurens, and A. Raij. A survey on incentive techniques for
     mobile crowd sensing. Internet of Things Journal, IEEE, PP(99):1–1, 2015.
[15] Luis G. Jaimes, Martin Llofriu, and Andrew Raij. A stress-free life: Just-in-time
     interventions for stress via real-time forecasting and intervention adaptation. 2014.
[16] O.D. Lara and M.A. Labrador. A mobile platform for real-time human activity recognition.
     In Proceedings of the 2012 IEEE Consumer Communications and Networking Conference
     (CCNC), pages 667–671, Jan 2012.
[17] O.D. Lara and M.A. Labrador. A survey on human activity recognition using wearable
     sensors. Communications Surveys Tutorials, IEEE, 15(3):1192–1209, Third 2013.
[18] Kurt Plarre, Andrew Raij, Syed Monowar Hossain, Amin Ahsan Ali, Motohiro Nakajima,
     Mustafa al'Absi, Emre Ertin, Thomas Kamarck, Santosh Kumar, Marcia Scott, Daniel P.
     Siewiorek, Asim Smailagic, and Lorentz E. Wittmers. Continuous inference of
     psychological stress from sensory measurements collected in the natural environment. In
     IPSN, pages 97–108, 2011.
[19] J.R. Quinlan. C4.5: programs for machine learning. 1993.
[20] Saul Shiffman, Arthur A Stone, and Michael R Hufford. Ecological momentary assessment.
     Annu. Rev. Clin. Psychol., 4:1–32, 2008.
[21] I.J. Vergara-Laurens, D. Mendez, and M.A. Labrador. Privacy, quality of information, and
     energy consumption in participatory sensing systems. In Proceedings of the 2014 IEEE
     International Conference on Pervasive Computing and Communications (PerCom), pages
     199–207, March 2014.

</pre>