<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Mining the user profile from a smartphone: a multimodal agent framework</article-title>
      </title-group>
      <contrib-group>
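        <contrib contrib-type="author">
          <string-name>Giuseppe Loseto</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michele Ruta</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Floriano Scioscia</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Eugenio Di Sciascio</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marina Mongiello</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>DEI - Politecnico di Bari</institution>
          ,
          <addr-line>via E. Orabona 4, I-70125, Bari</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>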
      </contrib-group>
      <abstract>
        <p>Nowadays smartphones play a significant role in gathering relevant data about their owners. Micro-devices embedded in Personal Digital Assistants (PDAs) perform continuous sensing, while phone call lists, the PIM (Personal Information Manager), text messages and so on make it possible to collect and mine enough data for a high-level description of a user's daily activities. This paper proposes an agent able to perform automated profile annotation by adopting Semantic Web languages. As a proof of concept, the devised agent has been tested in an Ambient Intelligence (AmI) scenario, i.e., a domotic environment where it interacts with its home counterpart to trigger the services best matching the user's needs. A toy example is presented as a case study to clarify the proposal, and an early experimental evaluation is reported to assess its effectiveness.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Keywords</title>
      <p>Ambient Intelligence; Agent-based Data Mining;
Semantic Web of Things; Home and Building Automation.</p>
    </sec>
    <sec id="sec-2">
      <title>I. INTRODUCTION</title>
      <p>
        Mobile phones are both pervasive and personal: they follow
the user and hold clues about everyday situations, which makes them
extremely useful for inferring context. Embedded micro-devices
(accelerometer, digital compass, gyroscope, GPS, microphone
and camera) can be used to extract significant information
about the user: GPS location traces, call and SMS lists,
PIM (Personal Information Management) records including
contacts and calendar, battery charging habits. By leveraging
the smartphone processing capabilities, ever-expanding ways
to investigate the behavioral, spatial and temporal dimensions of
everyday life can be provided. The personal nature of
mobile phones suggests they are well suited for pervasive
computing, and the data they are able to collect and process can
be profitably used for a large set of context-aware applications,
such as the Ambient Intelligence (AmI) [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] ones.
      </p>
      <p>
        This paper presents a smart profiling agent<sup>1</sup> which
borrows languages and technologies from the Semantic Web
experience to funnel inarticulate raw individual information
toward a semantically rich glossary. A crawler agent runs
on the user smartphone and performs multimodal (i.e.,
involving several heterogeneous data sources) and continuous
sensing [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], collecting and processing information without
human intervention. Multimodality requires specialized
analyses for each kind of collected data. The agent mines
the user habits automatically and annotates them in a
logic-based formalism to build a daily profile to be further
exploited in context-aware knowledge-based applications. The
main motivation for adopting an agent-based approach is that
the mobile profiler must proactively modulate the amount
and complexity of data capture and processing, in order to
use energy efficiently. Smart Home and Building Automation
(HBA) [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] was selected as proof scenario: the profiling agent
sends the inferred preferences to its HBA counterpart so that a
logic-based matchmaking session can finalize the adaptation
of the environment to user needs.
      </p>
      <p><sup>1</sup> Project home page: http://sisinflab.poliba.it/swottools/mobile-user-profiler/</p>
      <p>The remainder of the paper is organized as follows.
Section II contextualizes the overall multi-agent HBA system,
motivating the proposed approach, before Section III presents both the
architecture and the algorithms of the profiler agent. The
toy example in Section IV acts as a case study, while an early
experimental evaluation is reported in Section V. Finally, the most
relevant related work is discussed in Section VI, and concluding
remarks and future research are in Section VII.</p>
      <p>II. SCENARIO: SEMANTIC-BASED HOME AUTOMATION</p>
      <p>
        The user agent proposed in this paper is intended as a
part of a more complex HBA Multi Agent System (MAS) [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]
leveraging the semantic-based evolution of the KNX domotic
protocol in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. That work introduced a semantic micro-layer on
top of the protocol stack, enabling novel services and functions while
keeping full backward compatibility with current domestic
devices and HBA appliances. These enhancements make it possible
to fully describe device features by means of annotations
expressed in logic-based languages such as RDF<sup>2</sup> and OWL<sup>3</sup>.
The knowledge domain of building automation was
conceptualized in a shared ontological vocabulary enabling a rich
characterization of home resources and services. The MAS
was implemented in Java on a testbed composed of
off-the-shelf KNX domotic equipment<sup>4</sup>.
      </p>
      <p>The adopted multi-agent system comprises a home
mediator agent as well as user and device agents. Each agent
adopts the custom service-oriented model sketched in [4,
Fig. 4]. Basically, the agent monitors its internal state and
inputs; when a significant change occurs, it communicates with
the other agents in order to discover suitable services that
maximize its utility. The number of both resources/services and
agents can vary unpredictably (as new users or devices join or
leave the system at any time) without requiring the
communication paradigm to be redefined.</p>
      <p><sup>2</sup> RDF (Resource Description Framework) Primer, W3C Recommendation,
10 February 2004, http://www.w3.org/TR/rdf-primer/</p>
      <p><sup>3</sup> OWL 2 Web Ontology Language, W3C Recommendation, 11 December
2012, http://www.w3.org/TR/owl2-overview/</p>
      <p><sup>4</sup> See the related project home page
http://sisinflab.poliba.it/swottools/smartbuildingautomation/ for more details.</p>
      <list list-type="bullet">
        <list-item><p>The Mediator Agent coordinates the explicit
characterizations of available services, described w.r.t. a reference ontology
modeling the conceptual knowledge for the building
automation problem domain. Furthermore, it acts as a broker in order
to discover the (set of) elementary services that cover (part of)
the request coming from user or device agents.</p></list-item>
        <list-item><p>The Device Agents are designed to run on advanced devices,
i.e., home appliances with some computational capabilities
and memory availability. Each one can expose one or more
semantic descriptions, i.e., functional profiles to be discovered
by other agents; alternatively, each of them can issue
semantic-based requests to the mediator agent when the device
status changes and a home reconfiguration is required.</p></list-item>
        <list-item><p>KNX Device Interface Agents support semantic-based
enhancements in case of legacy or elementary appliances, e.g.,
switches, lamps, and so on. In such cases, there is only a static
interaction between agent and device.</p></list-item>
        <list-item><p>Finally, the User Agents, running on mobile clients, send
requests toward the home environment in order to satisfy user
needs and preferences. W.r.t. the version in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], the main contribution of this paper is an approach
for the automated mining of the user profile, which is carried out by
this kind of agent.</p></list-item>
      </list>
    </sec>
    <sec id="sec-3">
      <title>III. FRAMEWORK AND APPROACH</title>
      <p>
        Figure 1 sketches the general architecture of the profiling
agent. Raw data are extracted from smartphone embedded
micro-devices, communication tools and PIM. The data
mining life cycle consists of the following subsequent stages:
(a) gathering; (b) feature extraction; (c) classification and
interpretation; (d) semantic annotation. High-level information
about user activities, whereabouts, mental and physical status
is inferred and annotated w.r.t. an extension of the HBA
ontology in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. The mined profile is then used to
trigger the activation or deactivation of the most appropriate
home services. A modular architecture allows the
various data sources to be processed with specialized algorithms. In particular,
as shown by the icons in Figure 1, three modules currently make up
the agent: (i) Points of Interest Recognition; (ii)
Transportation Mode Recognition; (iii) User Activity
Recognition.
      </p>
      <p>1. Points of Interest Recognition. A mining algorithm
analyzes the smartphone GPS data in order to:
a. identify Stay Points (SPs) through a slightly refined version
of the algorithm in [6];
b. for each SP, retrieve the nearest Point Of Interest (POI) via
reverse geocoding queries to the Google Places<sup>5</sup> Web service;</p>
    </sec>
    <sec id="sec-4">
      <p><sup>5</sup> http://developers.google.com/places/</p>
      <p>c. associate a “place category” to each POI, so as to further
infer the kind of user activity;
d. enrich the daily user profile conjoining all detected
activities, described w.r.t. a proper HBA ontology.</p>
      <p>An SP represents a narrow geographic region where a user
stays for a while. In particular, given two subsequent detected
GPS locations P<sub>1</sub> and P<sub>2</sub>, acquired at times T<sub>1</sub> and T<sub>2</sub>, an SP satisfies both the following
constraints: (i) maximum distance d(P<sub>1</sub>, P<sub>2</sub>) &lt; D<sub>max</sub>; (ii)
minimum time difference |T<sub>1</sub> − T<sub>2</sub>| &gt; T<sub>min</sub>, where the
thresholds were set to D<sub>max</sub> = 200 m and T<sub>min</sub> = 350 s. The
threshold values were assigned through an empirical evaluation,
selecting those granting the highest precision of the SP recognition
algorithm.</p>
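      <p>The two constraints above can be illustrated with a minimal sketch of a generic stay point detector. It is not the refined algorithm used by the agent: the names GpsPoint, StayPoint and detect_stay_points are hypothetical, the haversine formula is assumed as the distance d, and the centroid of the clustered points is assumed as the SP position. Only the thresholds (200 m, 350 s) come from the text.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math
from typing import List, NamedTuple

class GpsPoint(NamedTuple):
    lat: float  # degrees
    lon: float  # degrees
    t: float    # acquisition time, seconds

class StayPoint(NamedTuple):
    lat: float
    lon: float
    arrive: float
    leave: float

D_MAX_M = 200.0  # distance threshold from the text
T_MIN_S = 350.0  # time threshold from the text

def haversine_m(a: GpsPoint, b: GpsPoint) -> float:
    """Great-circle distance in metres (assumed distance function)."""
    r = 6371000.0
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dphi = phi2 - phi1
    dlmb = math.radians(b.lon - a.lon)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(h))

def detect_stay_points(trace: List[GpsPoint],
                       d_max: float = D_MAX_M,
                       t_min: float = T_MIN_S) -> List[StayPoint]:
    stay_points = []
    i, n = 0, len(trace)
    while i < n:
        j = i + 1
        # Extend the cluster while points stay within d_max of the anchor point.
        while j < n and haversine_m(trace[i], trace[j]) < d_max:
            j += 1
        cluster = trace[i:j]
        # Keep the cluster only if the user remained there longer than t_min.
        if cluster[-1].t - cluster[0].t > t_min:
            stay_points.append(StayPoint(
                lat=sum(p.lat for p in cluster) / len(cluster),
                lon=sum(p.lon for p in cluster) / len(cluster),
                arrive=cluster[0].t,
                leave=cluster[-1].t))
            i = j
        else:
            i += 1
    return stay_points
```
]]></preformat>
      <p>In this sketch a trace with five fixes at the same location spread over 400 s yields one SP, whereas the same location visited for only 200 s yields none.</p>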
      <p>[Figure 2: (a) Home POI; (b) POI Info; (c) Extracted Places; (d) Profile mining; (e) Food place detail; (f) Daily stay period and location visited before.]</p>
      <p>
        Figure 2 shows the GUI of the profiler prototype on the
GPS side. The daily GPS trace is drawn on Google Maps
together with the detected SPs, depicted as markers on the map
in Figure 2(a). The Home and Workplace POIs are set by
the user in a preliminary configuration step. As said, the
SP classification leverages a Web-based reverse geocoding
service: after comparing Google Places and LinkedGeoData
(LGD) [7] (see Section V for further details), the former
has been chosen for the time being, since it provides more
POIs, even though LGD often seems to be more accurate.
In the example reported in Figure 2(c), the agent selected an
SP near the Politecnico di Bari and all the nearby POIs
were retrieved by means of the Google Places API. The main
category of the nearest POI is used as the label of the retrieved
location. Starting from the Google Places classification<sup>6</sup>, the
reference ontology for domotics in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] has been extended to
include a places taxonomy. Finally, as reported in Figure
2(d), a profile is generated through the conjunction of location
information. As shown in Figure 2(e), each SP description
contains an ontology class related to the specific location the
user visited, the overall time spent there (in seconds), the daily
period and the place visited before, if present (Figure 2(f)).
      </p>
      <p><sup>6</sup> http://developers.google.com/places/documentation/supported_types/</p>
      <p>2. Transportation Mode Recognition. GPS data are also exploited
to detect the transportation mode adopted by the user
when moving during a day. Four transportation modes are
supported: bus, train, car or walking. A pre-processing step splits
the whole daily GPS trace P = {T<sub>1</sub>, ..., T<sub>n</sub>} into trajectories
T<sub>i</sub>. In turn, each trajectory T<sub>i</sub> = Q{POI<sub>i</sub>, POI<sub>i+1</sub>} consists
of a set of GPS points Q included between two subsequent
POIs. Starting from the set of trajectories, the transportation
mode detection is based on two reference parameters: (i) the
walking speed threshold (WS<sub>th</sub>), set to an average value of
2 m/s (i.e., 7.2 km/h); (ii) the minimum correspondence ratio
(CR<sub>min</sub>) between user trajectories and bus/train routes, set to
0.8 (i.e., at least an 80% correspondence is required). Also in
this case, an experimental evaluation was performed to select
the most suitable threshold values. The detection algorithm
progresses along the following stages:
a. For each trajectory T<sub>i</sub>, the average user speed is evaluated.
If it is lower than WS<sub>th</sub>, then walking mode is detected.
b. Otherwise, the algorithm queries OpenStreetMap<sup>7</sup> (OSM)
via the Overpass API<sup>8</sup> to retrieve all available bus and train
routes (R<sub>s</sub> = R<sub>bus</sub> ∪ R<sub>train</sub>) in a bounding box covering the
geographical coordinates of the GPS points in T<sub>i</sub>. Figure 3(a)
shows an example.
c. The GPS points of the user trajectory are compared with
the retrieved routes. In case of a
correspondence ratio greater than CR<sub>min</sub> with a bus or train path, the
trajectory T<sub>i</sub> is associated with the bus or train mode, respectively
(Figure 3(b)).
d. Finally, if the detected means of transport is neither walking nor train
nor bus, then the car mode is selected.
      </p>
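      <p>Stages a to d can be sketched as follows. This is only an illustration under stated assumptions: routes are passed in as already retrieved point lists (no Overpass query is issued), coordinates are assumed to be projected to metres, the correspondence ratio is assumed to be the fraction of trajectory points lying within a hypothetical tolerance MATCH_TOL_M of some route point, and bus routes are checked before train routes. The paper does not specify how the comparison is computed; only WS_TH and CR_MIN come from the text.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]   # (x in metres, y in metres, time in seconds)
RoutePoint = Tuple[float, float]     # (x in metres, y in metres)

WS_TH = 2.0          # walking speed threshold in m/s (from the text)
CR_MIN = 0.8         # minimum correspondence ratio (from the text)
MATCH_TOL_M = 30.0   # hypothetical matching tolerance, not from the paper

def average_speed(traj: List[Point]) -> float:
    """Path length divided by elapsed time."""
    dist = sum(math.hypot(b[0] - a[0], b[1] - a[1]) for a, b in zip(traj, traj[1:]))
    elapsed = traj[-1][2] - traj[0][2]
    return dist / elapsed if elapsed > 0 else 0.0

def correspondence_ratio(traj: List[Point], route: List[RoutePoint],
                         tol: float = MATCH_TOL_M) -> float:
    """Fraction of trajectory points lying within tol of some route point."""
    if not traj or not route:
        return 0.0
    matched = sum(
        1 for (x, y, _) in traj
        if any(math.hypot(x - rx, y - ry) <= tol for rx, ry in route))
    return matched / len(traj)

def detect_mode(traj: List[Point],
                routes: Dict[str, List[List[RoutePoint]]]) -> str:
    # Stage a: slow trajectories are classified as walking.
    if average_speed(traj) < WS_TH:
        return "walking"
    # Stages b-c: compare against the bus and train routes found in the area.
    for mode in ("bus", "train"):
        for route in routes.get(mode, []):
            if correspondence_ratio(traj, route) > CR_MIN:
                return mode
    # Stage d: fall back to car.
    return "car"
```
]]></preformat>
      <p>With these assumptions, a 10 m/s trajectory that overlaps a retrieved train route is labelled "train", while the same trajectory with no matching route is labelled "car".</p>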
      <p>Each transportation mode is associated with a semantic-based
annotation fragment which includes a given class of the
ontology, further extended to include concepts and properties
about user movements. Moreover, the description includes
the overall time (in seconds) the user spent moving during the day,
the daily period and the means of transport used
before, if any. Figure 3(c) shows the details of the user profile
section related to a transfer by train.</p>
      <p>3. User Activity Recognition. Beyond the above components,
the profiling agent is completed by a module to detect some
user activities. In particular, at the moment the following
elementary actions can be discovered: sitting, standing, walking,
walking upstairs and downstairs. Starting from data acquired
from the smartphone accelerometer and gyroscope, a
supervised Machine Learning (ML) approach is adopted, exploiting
the Support Vector Machines (SVM) classifier in [8]. W.r.t. the
original approach, the classifier was simplified to improve its
efficiency on PDAs and to reduce the training time. The initial
568 features used on the dataset<sup>9</sup> associated with [8] as input
for the classifier were reduced to 16 (see Table I) by applying
the Recursive Feature Elimination (RFE) algorithm proposed
in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
    </sec>
    <sec id="sec-5">
      <p><sup>7</sup> http://www.openstreetmap.org/</p>
      <p><sup>8</sup> http://wiki.openstreetmap.org/wiki/Overpass_API</p>
      <p><sup>9</sup> http://archive.ics.uci.edu/ml/datasets/Human+Activity+Recognition+Using+Smartphones</p>
      <p>[Figure 3: (a) Overpass routes; (b) Train Mode; (c) Train Mode details.]</p>
      <p>
        A training set composed of raw sensor data has been used
to train the classifier directly on the mobile device. The
smartphone used for the experimental evaluation is equipped
with an accelerometer and a gyroscope measuring the
3-axial linear acceleration and the angular velocity (tAcc-XYZ
and tGyro-XYZ, respectively) at a fixed sampling period of 25
ms (i.e., 40 Hz), which is adequate to identify human body motion. The
collected data are subsequently processed through two
first-order low-pass filters. The first one is used to reduce noise,
while the second filter splits the acceleration signal into body
and gravity components (tBody and tGravity). The classifier
has been implemented using Weka-for-Android<sup>10</sup>, an Android
port of Weka [
        <xref ref-type="bibr" rid="ref6">10</xref>
        ]. The training set has been built with
the smartphone fastened in a vertical position as reference; after the SVM
training, the recognition process starts. Data are sampled in
fixed-width sliding windows of 2.5 s (i.e., 100 samples) with
50% overlap, and processed as described above. From each
window, a vector with the 16 features in Table I is obtained
by processing the accelerometer and gyroscope data
in the time and frequency domains. Finally, an energy saving
strategy is implemented to avoid unnecessary data capture:
after each activity recognition AR<sub>i</sub>, the agent waits for a pause WP<sub>i</sub>,
defined as:
        <disp-formula><tex-math><![CDATA[
WP_i =
\begin{cases}
0\ \mathrm{s} & \text{if } AR_i \neq AR_{i-1} \\
2.5\ \mathrm{s} & \text{if } AR_i = AR_{i-1} \\
(WP_{i-1} \cdot 2)\ \mathrm{s} & \text{if } AR_i = AR_{i-1} = AR_{i-2}
\end{cases}
]]></tex-math></disp-formula>
In this way, if the classifier consecutively detects two similar
activities, then data sampling is stopped for 2.5 seconds.
This value is doubled in case of additional similar recognitions,
up to a maximum value of WP<sub>i</sub> = 80 s. Otherwise, the waiting
period is reset to zero when a different action is detected.
The rationale is that users usually perform similar activities
within a short period (consider for example the case of sitting
and walking), so continuous data gathering can often be
avoided.
      </p>
      <p><sup>10</sup> https://github.com/rjmarsan/Weka-for-Android</p>
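      <p>The energy saving rule above amounts to a capped exponential back-off. The following sketch shows one possible reading of it; the function names next_pause and pause_schedule are hypothetical, and the first recognition of a session is assumed to have no pause. The constants 2.5 s and 80 s are taken from the text.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from typing import List, Optional

BASE_PAUSE_S = 2.5   # pause after two equal consecutive recognitions (from the text)
MAX_PAUSE_S = 80.0   # upper bound on the pause (from the text)

def next_pause(current: str,
               previous: Optional[str],
               before_previous: Optional[str],
               last_pause: float) -> float:
    """Pause to wait after recognizing `current`, given the two earlier activities."""
    if previous is None or current != previous:
        return 0.0                                  # activity changed: sample again at once
    if before_previous is not None and previous == before_previous:
        return min(last_pause * 2, MAX_PAUSE_S)     # third or later repetition: double, capped
    return BASE_PAUSE_S                             # second repetition

def pause_schedule(activities: List[str]) -> List[float]:
    """Pauses produced by a whole sequence of recognized activities."""
    pauses: List[float] = []
    for i, activity in enumerate(activities):
        previous = activities[i - 1] if i >= 1 else None
        before_previous = activities[i - 2] if i >= 2 else None
        last_pause = pauses[-1] if pauses else 0.0
        pauses.append(next_pause(activity, previous, before_previous, last_pause))
    return pauses
```
]]></preformat>
      <p>For a user who keeps sitting, the pauses grow as 0, 2.5, 5, 10, 20, 40, 80, 80 seconds, and drop back to 0 as soon as a different activity is recognized.</p>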
      <p>The vector containing the extracted features is then used
as input to the trained SVM model. Finally, the user profile is
enriched with the annotations related to the detected activities.
For each of them, the overall duration
and the daily period are also recorded.</p>
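      <p>The pre-processing that feeds the feature extraction can be sketched as below, assuming a single acceleration axis held in a plain list. The window size (100 samples, i.e., 2.5 s at 25 ms) and the 50% overlap come from the text; the exponential-smoothing form of the first-order low-pass filter and its coefficient alpha are assumptions, since the paper does not report the filter parameters.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from typing import List, Tuple

SAMPLE_PERIOD_S = 0.025                 # 25 ms sampling period (from the text)
WINDOW_SAMPLES = 100                    # 2.5 s window (from the text)
STEP_SAMPLES = WINDOW_SAMPLES // 2      # 50% overlap (from the text)

def sliding_windows(samples: List[float],
                    size: int = WINDOW_SAMPLES,
                    step: int = STEP_SAMPLES) -> List[List[float]]:
    """Fixed-width windows; a trailing partial window is dropped."""
    return [samples[s:s + size] for s in range(0, len(samples) - size + 1, step)]

def split_gravity_body(acc: List[float],
                       alpha: float = 0.9) -> Tuple[List[float], List[float]]:
    """First-order low-pass filter: the smoothed signal approximates gravity,
    the residual approximates body acceleration. alpha is a hypothetical value."""
    gravity: List[float] = []
    body: List[float] = []
    g = acc[0] if acc else 0.0
    for a in acc:
        g = alpha * g + (1 - alpha) * a
        gravity.append(g)
        body.append(a - g)
    return gravity, body
```
]]></preformat>
      <p>Each window returned by sliding_windows would then be turned into the 16-element feature vector of Table I, a step that is not reproduced here.</p>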
      <p>IV. CASE STUDY</p>
      <p>In order to clarify the rationale behind the proposed
approach and to highlight the goal of the profiling agent, the
following daily scenario is considered as an example. The user
leaves home early in the morning to go to work. He stays at the
office until lunch, then reaches a bar for a quick meal. Afterward,
he goes back to work, then goes to the gym in the evening and
finally returns home late at night. The profiling agent extracts
the daily location sequence reported in Table II. In particular, the
Home and Office POIs are mapped to the user profile directly
as Home and Work activities; Bar is identified as a Food place;
Gym is associated with the Sport place category. The agent also
recognizes the adopted means of transport and the duration of
each trajectory.</p>
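      <table-wrap>
        <label>TABLE II.</label>
        <table>
          <thead>
            <tr><th>Route</th><th>Type</th><th>Duration (min)</th></tr>
          </thead>
          <tbody>
            <tr><td>Home → Office</td><td>car</td><td>30</td></tr>
            <tr><td>Office → Bar</td><td>walk</td><td>4</td></tr>
            <tr><td>Bar → Office</td><td>walk</td><td>5</td></tr>
            <tr><td>Office → Gym</td><td>car</td><td>11</td></tr>
            <tr><td>Gym → Home</td><td>car</td><td>21</td></tr>
          </tbody>
        </table>
      </table-wrap>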
      <p>Throughout the day, the agent also detects the activities of the
user: he was seated for about 6 hours (e.g., at work, in the
car, during lunch), walked for 35 minutes (e.g., to reach the bar
or for short strolls) and was standing for 15 minutes. As a result
of the mining and annotation processes, the following profile is
extracted (expressed in Description Logic [11] notation w.r.t.
the reference ontology; due to space constraints, some sections have been deliberately omitted):</p>
      <p>User Daily Profile ≡ ∀wasAtHome.HomeActivity ⊓
∀wasAtWork.WorkActivity ⊓ ∀wasInFoodPlace.FoodActivity ⊓
∀wasInSportPlace.SportActivity ⊓
∀movedByCar.CarMode ⊓ ∀movedByWalk.WalkMode ⊓
∀wasSitting.SittingActivity ⊓ ∀wasWalking.WalkingActivity ⊓
∀wasStanding.StandingActivity</p>
      <p>HomeActivity ≡ Home ⊓ ∀during.(Morning ⊓ Night) ⊓
∀after.Gym ⊓ =1945 stayTime</p>
      <p>WorkActivity ≡ Work ⊓ ∀during.(Morning ⊓ Afternoon) ⊓
∀after.(Home ⊓ Bar) ⊓ =32470 stayTime</p>
      <sec id="sec-5-1">
        <p>FoodActivity ≡ Bar ⊓ ∀during.Afternoon ⊓
∀after.Work ⊓ =474 stayTime</p>
      </sec>
      <sec id="sec-5-2">
        <p>SportActivity ≡ Gym ⊓ ∀during.Evening ⊓
∀after.Work ⊓ =5362 stayTime</p>
        <p>WalkMode ≡ Walk ⊓ =2115 moveTime ⊓ ∀during.Afternoon ⊓
∀after.Car</p>
        <p>SittingActivity ≡ Sitting ⊓ =21436 moveTime ⊓
∀during.(Morning ⊓ Afternoon ⊓ Evening)</p>
        <p>The profile generated above is used by the user
agent to negotiate with the mediator agent at home the
environmental configuration best fitting the needs and mood of the
inhabitant via semantic-based matchmaking. The elementary
services and appliances covering the mined user profile as
much as possible are automatically activated (or, when appropriate,
deactivated) to increase the overall MAS utility. As an example
of this phase, let us consider the following available home
services/resources:</p>
        <p>CookingService ≡ Service ⊓ ∀wasInSportPlace.(≥1800
stayTime) ⊓ ∀wasAtHome.(∀after.(Sport ⊓ ¬Food)) ⊓
∀suggestedForFeeling.Hungry</p>
        <p>SoftLightLevel ≡ LightLevelRegulation ⊓ ∀wasAtWork.(≥10800
stayTime) ⊓ ∀wasAtHome.(∀after.¬Relax) ⊓
∀suggestedForStamina.MentallyTired ⊓
∀suggestedForDisease.Headache</p>
        <p>PlayMusic ≡ Service ⊓ ∀wasAtHome.(∀after.(¬Work ⊓
Relax) ⊓ ∀during.¬Night) ⊓ ∀suggestedForStamina.Rested ⊓
∀suggestedForDisease.¬Headache</p>
        <p>It should be noted that service annotations are described
in terms of both user features (such as physical status, mood
and health) and the daily events which cause the activation. In this
way, a service/resource selection can be performed through
matchmaking against the user profile. For example, a cooking
service is activated not only if the user explicitly declares he
is hungry, but also if the user agent detects that he comes back
home after a sport activity performed for more than 30 minutes
(1800 seconds) without having eaten before. In a
similar way, a soft lighting setting is selected to improve
comfort at home in case the user is mentally tired or has spent
more than 3 hours (10800 seconds) at work not followed by a restful activity.
The extracted user profile can also lead to the deactivation of
previously enabled services. For example, the music service
is normally activated to welcome the owner at home, but it is
unsuitable if the user comes back during the night, and in that
case it must be turned off.</p>
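        <p>The activation conditions discussed above can be mimicked with plain boolean checks, as in the sketch below. This is deliberately not the semantic matchmaking performed by the agents: Description Logic inference is replaced by hand-written predicates, and the Stay and Profile classes, the place labels and the per-stay split of the working time are hypothetical. The stay times 474 s and 5362 s and the 1800 s threshold are taken from the case study.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from dataclasses import dataclass, field
from typing import List, Optional, Set

@dataclass
class Stay:
    place: str         # place category, e.g. "Home", "Work", "Food", "Sport"
    stay_time_s: int   # time spent there, in seconds
    period: str        # "Morning", "Afternoon", "Evening" or "Night"

@dataclass
class Profile:
    stays: List[Stay]                                  # in chronological order
    feelings: Set[str] = field(default_factory=set)    # explicitly declared by the user

def total_time(profile: Profile, place: str) -> int:
    return sum(s.stay_time_s for s in profile.stays if s.place == place)

def place_before_last_home(profile: Profile) -> Optional[str]:
    """Category of the place visited right before the latest return home."""
    for i in range(len(profile.stays) - 1, 0, -1):
        if profile.stays[i].place == "Home":
            return profile.stays[i - 1].place
    return None

def cooking_service(profile: Profile) -> bool:
    # Active if the user says he is hungry, or comes home straight after
    # at least 30 minutes of sport (hence without a food stop in between).
    if "Hungry" in profile.feelings:
        return True
    return (total_time(profile, "Sport") >= 1800
            and place_before_last_home(profile) == "Sport")

def play_music(profile: Profile) -> bool:
    # Welcome music is suitable only if the user is home and it is not night.
    if not profile.stays:
        return False
    last = profile.stays[-1]
    return last.place == "Home" and last.period != "Night"
```
]]></preformat>
        <p>For the daily scenario of the case study (gym in the evening, return home at night), these checks would enable the cooking service and keep the music service off.</p>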
        <p>The above case study is purposely simplified in order to
make the presentation of the proposed approach clear and
short. In real scenarios, more articulated user profiles and
service descriptions can be used.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>V. EXPERIMENTS</title>
      <p>
        An overall evaluation of the proposed approach has been
carried out by following a reference user for a period of 14
months. The results reported here refer to the first 60 days of
observation. In particular, only the days with at least one Stay Point different from
Home or Workplace (24 in the evaluated
dataset excerpt) have been selected for further
investigation. The profiling agent has been tested on a smartphone
equipped with an ARM Cortex A8 CPU at 1 GHz, 512 MB
of RAM, 8 GB of internal storage, and Android 2.3.3
as operating system. The experiments aimed to
measure: (i) the amount of data retrieved from services on the
Web; (ii) the turnaround time (for which each test was repeated
four times, taking the average of the last three runs); (iii) the
memory usage (for which the final result was the average of
three runs). This experimental analysis only focuses on the user
profiling aspects: [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] reports on evaluation of the remaining
elements of the reference HBA MAS.
      </p>
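      <p>The turnaround time protocol (four repetitions, average of the last three) can be sketched as follows. The helper names timed_runs and turnaround are hypothetical, and time.perf_counter is used here only as a convenient timer; the original measurements were taken on the Android device.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import time
from statistics import mean
from typing import Callable, List

def timed_runs(task: Callable[[], object], runs: int = 4) -> List[float]:
    """Execute task `runs` times and return each duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        task()
        durations.append(time.perf_counter() - start)
    return durations

def turnaround(durations: List[float], discard: int = 1) -> float:
    """Average of the runs that remain after dropping the first `discard` ones."""
    return mean(durations[discard:])
```
]]></preformat>
      <p>Discarding the first run is a common way to exclude one-off start-up costs from the average, although the paper does not state the reason for this choice.</p>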
      <p>Figure 4 shows the total number of stay points detected
with the mining algorithm compared with the overall GPS
coordinates composing a daily trace. It can be noticed that
the user agent collects 53 GPS points per day on average,
detecting about 3 relevant SPs.</p>
      <p>Starting from the detected SPs, the results of the Google Places
and LGD services have been compared in terms of the number of
POIs retrieved in the neighborhood of each SP. As shown in
Figure 4, Google Places returns 16 POIs on average versus 5 POIs
for LGD, so an accurate identification
of the locations the user visited is more likely. Nevertheless,
as reported in Figure 5, in some cases the LGD replies are
longer even though they contain fewer POIs. This is due to the
LGD response format, which includes, for each point, information
annotated according to Linked Data principles [12]: Google
Places uses 830 B per POI on average, whereas LGD uses
1.56 kB.</p>
      <p>The time required by the main processing steps for
POI recognition (GPS trace parsing; SP detection; Google
Places/LGD service querying; profile enrichment),
transportation mode detection (Overpass service querying; trace
comparison; profile enrichment) and activity recognition is
reported in Figure 6. Google Places is slightly slower than
LGD, but this is due to the greater number of retrieved POIs.
Considering Google Places as the reference service, the agent
spends about 1.2 s to retrieve the POIs from a detected SP.
In particular, the last step took about 1.15 s (49% of total
time) to parse the ontology and create the semantic-based
annotation. The remaining steps require only 3% of the
overall turnaround time, as these procedures use elementary
data structures stored in the device main memory. For
transportation mode detection, only 1.7 s were spent to query
the Overpass service, while trace comparison is one of the
slower operations, needing 3.4 s. The activity recognition
process has a very short turnaround time. After a preliminary
task (required to train the SVM classifier) taking about 5.6
s and performed when the profiling agent starts, this module
needs only 45 ms to extract the 16 reference features for each
window and 6 ms to detect the user activity. Finally, a daily
profile was completely composed in about 1.2 seconds.</p>
      <p>[TABLE III. CONFUSION MATRIX for the activities A (Sitting), B (Standing), C (Walking), D (Walking Upstairs), E (Walking Downstairs), with per-class precision (%).]</p>
      <p>[Figure 6: Time (ms, logarithmic axis from 1 to 10000) per processing task: GPS Trace Parsing, SPs Detection, Google Query, LGD Query, Overpass Query, Traces Comparison, SVM Training, Features Extraction, Activity Recognition, Profile Creation.]</p>
      <p>A further evaluation of the activity recognition module
measured the precision and recall of the classifier. To this end, 100
datasets of activities containing a similar number of samples
per class have been used. The confusion matrix shown in
Table III reports the weighted precision of the classifier
and the individual precision and recall values for each activity. It
refers to a single specific dataset with 779 sample vectors.
However, the confusion matrices for the other tests showed
similar outputs, varying only slightly in the classification results.
The classifier precision and recall
are very high despite the use of a small set of features.</p>
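      <p>Per-class precision and recall, and the weighted precision, can be derived from a confusion matrix as in the following sketch. It assumes rows are actual classes and columns are predicted classes, and that the weighted precision is the support-weighted mean of the per-class precisions (the definition used by Weka); the function names are hypothetical and no values from Table III are reproduced.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from typing import List, Tuple

def precision_recall(cm: List[List[int]]) -> Tuple[List[float], List[float]]:
    """Per-class precision and recall; cm[actual][predicted] holds sample counts."""
    n = len(cm)
    precision, recall = [], []
    for k in range(n):
        true_positive = cm[k][k]
        predicted_as_k = sum(cm[r][k] for r in range(n))   # column total
        actually_k = sum(cm[k])                            # row total
        precision.append(true_positive / predicted_as_k if predicted_as_k else 0.0)
        recall.append(true_positive / actually_k if actually_k else 0.0)
    return precision, recall

def weighted_precision(cm: List[List[int]]) -> float:
    """Mean of the per-class precisions, weighted by the number of actual samples."""
    precision, _ = precision_recall(cm)
    support = [sum(row) for row in cm]
    total = sum(support)
    return sum(p * s for p, s in zip(precision, support)) / total if total else 0.0
```
]]></preformat>
      <p>For instance, with a toy two-class matrix [[8, 2], [1, 9]] the first class has precision 8/9 and recall 8/10.</p>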
      <p>RAM usage trend was also evaluated and results are shown
in Figure 7, where memory peaks are reported. The profiler
agent needs very low memory, only 4.2 MB on average, a
satisfactory value for current mobile devices.</p>
    </sec>
    <sec id="sec-7">
      <title>VI. RELATED WORK</title>
      <p>
        The recent popularization of smartphones equipped with
a wide range of embedded sensors and adequate processing
capabilities has attracted increasing research efforts toward
mobile sensing. Lane et al. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] presented a survey of existing
algorithms, applications, and systems. In addition, many
pervasive frameworks have been defined in recent years to collect and capture the user’s
context via cellphones: remarkable works are
ContextPhone [13], UbiqLog [14] and LifeMap [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. The agent
proposed here aims to improve upon these works by leveraging
multimodality: the implemented prototype retrieves
information from richer data sources than the above systems,
even though further mining modules have been planned but not
integrated yet. A comparison should also be carried out with
respect to commercial location and context-aware mobile
software: trekking and fitness applications like Google MyTracks<sup>12</sup>
and Endomondo Sportstracker<sup>13</sup>; personalized assistants like
Google Now<sup>14</sup> and Xme<sup>15</sup>. Nevertheless, these tools either
require explicit user interaction or define context just by means
of GPS location and time of day, hence they are quite far
from the agent proposed here, which uses more parameters and
automatically recognizes a larger variety of contexts.
      </p>
      <p>
        Activity recognition from accelerometer data by means of
machine learning is a frequent sensing application. Among
other proposals, noteworthy are [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], [8] where smartphone
accelerometer data are used to classify six common activities.
With reference to context extraction via GPS data analysis,
there are many approaches in the literature. For example, Zheng
et al. [17] model multiple individuals' GPS trajectories with
a tree-based hierarchical graph to mine location history and
travel sequences in a given geospatial region. In [6] mobile
phones are used as sensors to collect location information.
Places are first grouped using a time-based clustering technique
to discover stay points; then the stay points are clustered into
stay regions through a grid-based algorithm. In [18] a
large-scale dataset is collected from 114 users over 18 months.
      </p>
      <p>In the works cited above, however, the knowledge gap
between acquired data and the understanding of human behavior
is still huge. Stay points and movement patterns need to
be interpreted to extract a user profile, implicitly providing
knowledge about the user habits. Noteworthy attempts to
enrich movement trajectories with semantics are in [19] and
[20]. An ontology-based approach for the semantic modeling of
trajectories is also proposed in [21]. Trajectories are seen as
composed of three main elements: stops, moves and
begin-ends. Each part is described through an annotation referring
to a domain ontology, and time information is also exploited
to annotate activities, to enable rule-based queries and to help
users validate and discover moving objects.</p>
      <p>
        Although previous solutions add a machine-understandable
meaning to data collected by smartphones, a subsequent
exploitation in an articulated AmI framework is still missing.
Usually, collected data are only used to indicate detected user
conditions or activities through messages or alerts displayed on
the mobile phone. On the contrary, in the approach proposed
here, the ontology-based characterization of user activities is
used as an input for a context-aware HBA MAS [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], enabling a
direct environment adaptation and a negotiation between user
and home agents. This feature is not available in any other
current user profiler.
      </p>
      <p><sup>12</sup> http://www.google.com/mobile/mytracks/</p>
      <p><sup>13</sup> http://www.endomondo.com</p>
      <p><sup>14</sup> http://www.google.com/landing/now/</p>
      <p><sup>15</sup> http://xndme.com/</p>
    </sec>
    <sec id="sec-8">
      <title>VII. CONCLUSION AND FUTURE WORK</title>
      <p>The paper presented a lightweight agent able to mine data
collected by the embedded micro-devices, logs and applications
of a smartphone to build a semantic-based daily profile of
its user. According to the AmI paradigm, such a description
can be exploited to transparently adapt the environment to
implicitly inferred user preferences. Specifically,
the agent interacts in a multi-agent framework for Home and
Building Automation, grounded on knowledge representation
theory and reasoning technologies. It has been designed and
then implemented as an Android application, and experiments
in a concrete case study showed its feasibility and effectiveness.</p>
      <p>Future work will include a more extensive experimental
campaign involving several different users to be profiled and
new performance indicators. In particular, both battery drain
and storage peaks will be taken into account to assess the
feasibility of continuous data collection and mining and to
compare the proposed framework with existing approaches.
The use of an agent-based framework, as opposed to
classical approaches, will also be investigated to verify
whether it results in more accurate profiling. Finally, future
research will be devoted to the integration of the current
multimodal information: a fusion of information coming from
data sources that are now distinct and independent will be
pursued in order to reach a more accurate and precise user
characterization.</p>
    </sec>
    <sec id="sec-9">
      <title>ACKNOWLEDGMENT</title>
      <p>The authors acknowledge partial support of Italian PON project Res Novae and EU PO Apulia region FESR project UbiCare.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>D. J.</given-names>
            <surname>Cook</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. C.</given-names>
            <surname>Augusto</surname>
          </string-name>
          , and
          <string-name>
            <given-names>V. R.</given-names>
            <surname>Jakkula</surname>
          </string-name>
          , “
          <article-title>Ambient intelligence: Technologies, applications, and opportunities</article-title>
          ,”
          <source>Pervasive and Mobile Computing</source>
          , vol.
          <volume>5</volume>
          , no.
          <issue>4</issue>
          , pp.
          <fpage>277</fpage>
          -
          <lpage>298</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>N. D.</given-names>
            <surname>Lane</surname>
          </string-name>
          , E. Miluzzo,
          <string-name>
            <given-names>H.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Peebles</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Choudhury</surname>
          </string-name>
          , and A. T. Campbell, “
          <article-title>A survey of mobile phone sensing</article-title>
          ,”
          <source>IEEE Communications Magazine</source>
          , vol.
          <volume>48</volume>
          , no.
          <issue>9</issue>
          , pp.
          <fpage>140</fpage>
          -
          <lpage>150</lpage>
          , Sep.
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>G.</given-names>
            <surname>Loseto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Scioscia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ruta</surname>
          </string-name>
          , and E. Di Sciascio, “
          <article-title>Semantic-based Smart Homes: a Multi-Agent Approach</article-title>
          ,” in
          <source>13th Workshop on Objects and Agents (WOA 2012)</source>
          , ser. CEUR Workshop Proceedings, F. De Paoli and G. Vizzari, Eds., vol.
          <volume>892</volume>
          , Sep.
          <year>2012</year>
          , pp.
          <fpage>49</fpage>
          -
          <lpage>55</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Ruta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Scioscia</surname>
          </string-name>
          , G. Loseto, and E. Di Sciascio, “
          <article-title>Semantic-based resource discovery and orchestration in home and building automation: a multi-agent approach</article-title>
          ,”
          <source>IEEE Transactions on Industrial Informatics</source>
          ,
          <year>2013</year>
          , to appear.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>M.</given-names>
            <surname>Ruta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Scioscia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Di Sciascio</surname>
          </string-name>
          , and G. Loseto, “
          <article-title>Semantic-based Enhancement of ISO/IEC 14543-3 EIB/KNX Standard for Building Automation,”</article-title>
          <source>IEEE Transactions on Industrial Informatics</source>
          , vol.
          <volume>7</volume>
          , no.
          <issue>4</issue>
          , pp.
          <fpage>731</fpage>
          -
          <lpage>739</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>R.</given-names>
            <surname>Montoliu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Blom</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Gatica-Perez</surname>
          </string-name>
          , “
          <article-title>Discovering places of interest in everyday life from smartphone data</article-title>
          ,”
          <source>Multimedia Tools and Applications</source>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>29</lpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>C.</given-names>
            <surname>Stadler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lehmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Höffner</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Auer</surname>
          </string-name>
          , “
          <article-title>LinkedGeoData: A Core for a Web of Spatial Open Data</article-title>
          ,”
          <source>Semantic Web Journal</source>
          , vol.
          <volume>3</volume>
          , no.
          <issue>4</issue>
          , pp.
          <fpage>333</fpage>
          -
          <lpage>354</lpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>D.</given-names>
            <surname>Anguita</surname>
          </string-name>
          , A. Ghio, L. Oneto, X. Parra, and J. L. Reyes-Ortiz, “
          <article-title>Human Activity Recognition on Smartphones using a Multiclass Hardware-Friendly Support Vector Machine</article-title>
          ,” in Workshop of Ambient Assisted Living (IWAAL 2012),
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>I.</given-names>
            <surname>Guyon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Weston</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Barnhill</surname>
          </string-name>
          , and
          <string-name>
            <given-names>V.</given-names>
            <surname>Vapnik</surname>
          </string-name>
          , “
          <article-title>Gene selection for cancer classification using support vector machines</article-title>
          ,”
          <source>Machine Learning</source>
          , vol.
          <volume>46</volume>
          , pp.
          <fpage>389</fpage>
          -
          <lpage>422</lpage>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>M.</given-names>
            <surname>Hall</surname>
          </string-name>
          , E. Frank,
          <string-name>
            <given-names>G.</given-names>
            <surname>Holmes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Pfahringer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Reutemann</surname>
          </string-name>
          , and
          <string-name>
            <given-names>I. H.</given-names>
            <surname>Witten</surname>
          </string-name>
          , “
          <article-title>The WEKA data mining software: an update</article-title>
          ,”
          <source>SIGKDD Explor. Newsl.</source>
          , vol.
          <volume>11</volume>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>10</fpage>
          -
          <lpage>18</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>F.</given-names>
            <surname>Baader</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvanese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>McGuinness</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Nardi</surname>
          </string-name>
          , and P. Patel-Schneider,
          <source>The Description Logic Handbook</source>
          . Cambridge University Press,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>C.</given-names>
            <surname>Bizer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Heath</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Berners-Lee</surname>
          </string-name>
          , “
          <article-title>Linked Data - The Story So Far</article-title>
          ,”
          <source>International Journal on Semantic Web and Information Systems</source>
          , vol.
          <volume>5</volume>
          , no.
          <issue>3</issue>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>22</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>M.</given-names>
            <surname>Raento</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Oulasvirta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Petit</surname>
          </string-name>
          , and
          <string-name>
            <given-names>H.</given-names>
            <surname>Toivonen</surname>
          </string-name>
          , “
          <article-title>ContextPhone: A prototyping platform for context-aware mobile applications</article-title>
          ,”
          <source>IEEE Pervasive Computing</source>
          , vol.
          <volume>4</volume>
          , no.
          <issue>2</issue>
          , pp.
          <fpage>51</fpage>
          -
          <lpage>59</lpage>
          , Apr.
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>R.</given-names>
            <surname>Rawassizadeh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Tomitsch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Wac</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Tjoa</surname>
          </string-name>
          , “
          <article-title>Ubiqlog: a generic mobile phone-based life-log framework</article-title>
          ,”
          <source>Personal and Ubiquitous Computing</source>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>17</lpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>J.</given-names>
            <surname>Chon</surname>
          </string-name>
          and
          <string-name>
            <given-names>H.</given-names>
            <surname>Cha</surname>
          </string-name>
          , “
          <article-title>LifeMap: A Smartphone-Based Context Provider for Location-Based Services</article-title>
          ,”
          <source>IEEE Pervasive Computing</source>
          , vol.
          <volume>10</volume>
          , no.
          <issue>2</issue>
          , pp.
          <fpage>58</fpage>
          -
          <lpage>67</lpage>
          , Apr.
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>J. R.</given-names>
            <surname>Kwapisz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. M.</given-names>
            <surname>Weiss</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S. A.</given-names>
            <surname>Moore</surname>
          </string-name>
          , “
          <article-title>Activity recognition using cell phone accelerometers</article-title>
          ,”
          <source>ACM SIGKDD Explorations Newsletter</source>
          , vol.
          <volume>12</volume>
          , no.
          <issue>2</issue>
          , pp.
          <fpage>74</fpage>
          -
          <lpage>82</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Xie</surname>
          </string-name>
          , and W.-Y. Ma, “
          <article-title>Mining Interesting Locations and Travel Sequences From GPS Trajectories,”</article-title>
          <source>in Proceedings of the 18th International Conference on World Wide Web, ser. WWW '09.</source>
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          New York, NY, USA: ACM,
          <year>2009</year>
          , pp.
          <fpage>791</fpage>
          -
          <lpage>800</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          <string-name>
            <given-names>T. M. T.</given-names>
            <surname>Do</surname>
          </string-name>
          and
          <string-name>
            <given-names>D.</given-names>
            <surname>Gatica-Perez</surname>
          </string-name>
          , “
          <article-title>The Places of Our Lives: Visiting Patterns and Automatic Labeling from Longitudinal Smartphone Data,”</article-title>
          <source>IEEE Transactions on Mobile Computing</source>
          ,
          <year>2013</year>
          , PrePrints.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          <string-name>
            <given-names>C.</given-names>
            <surname>Parent</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Spaccapietra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Renso</surname>
          </string-name>
          , G. Andrienko,
          <string-name>
            <given-names>N.</given-names>
            <surname>Andrienko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Bogorny</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. L.</given-names>
            <surname>Damiani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gkoulalas-Divanis</surname>
          </string-name>
          , J. Macedo,
          <string-name>
            <given-names>N.</given-names>
            <surname>Pelekis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Theodoridis</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Z.</given-names>
            <surname>Yan</surname>
          </string-name>
          , “
          <article-title>Semantic Trajectories Modeling and Analysis</article-title>
          ,”
          <source>ACM Computing Surveys</source>
          , vol.
          <volume>45</volume>
          , no.
          <issue>4</issue>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          <string-name>
            <given-names>R.</given-names>
            <surname>Wannous</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Malki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bouju</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Vincent</surname>
          </string-name>
          , “
          <article-title>Time Integration in Semantic Trajectories Using an Ontological Modelling Approach</article-title>
          ,” in
          <source>New Trends in Databases and Information Systems</source>
          , ser. Advances in Intelligent Systems and Computing
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Pechenizkiy</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Wojciechowski</surname>
          </string-name>
          , Eds. Springer Berlin Heidelberg,
          <year>2013</year>
          , vol.
          <volume>185</volume>
          , pp.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>