<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Deriving Semantic Sensor Metadata from Raw Measurements</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jean-Paul Calbimonte</string-name>
          <email>jp.calbimonte@upm.es</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Zhixian Yan</string-name>
          <email>zhixian.yan@epfl.ch</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hoyoung Jeung</string-name>
          <email>hoyoung.jeung@sap.com</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Oscar Corcho</string-name>
          <email>ocorcho@fi.upm.es</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Karl Aberer</string-name>
          <email>karl.aberer@epfl.ch</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>LSIR, École Polytechnique Fédérale de Lausanne (EPFL)</institution>
          ,
          <country country="CH">Switzerland</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>OEG, Facultad de Informática, Universidad Politécnica de Madrid</institution>
          ,
          <country country="ES">Spain</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>SAP Research</institution>
          ,
          <addr-line>Brisbane</addr-line>
          ,
          <country country="AU">Australia</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Sensor network deployments have become a primary source of big data about the real world that surrounds us, measuring a wide range of physical properties in real time. With such large amounts of heterogeneous data, a key challenge is to describe and annotate sensor data with high-level metadata, using and extending models, for instance with ontologies. However, automating this task requires enriching the sensor metadata using the actual observed measurements and extracting useful meta-information from them. This paper proposes a novel approach for the characterization and extraction of semantic metadata through the analysis of raw sensor observations. The approach consists of three steps: approximating the raw sensor measurements with representations based on distributions of the observation slopes, building a classification scheme that automatically infers sensor metadata such as the type of observed property, and integrating the semantic analysis results with existing sensor network metadata.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Ubiquitous sensor networks are a primary source of observations from the
physical world, from environmental measuring stations and participatory or citizen
sensing to various sensor applications in traffic, media and health monitoring.
Publishing sensor network data on the web has the potential to increase public
awareness and involvement in these different domains at a massive scale [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
Cheap sensing devices can be easily configured, deployed, and plugged into sensor
data platforms such as Cosm<sup>1</sup> for exploitation, storage and querying.
      </p>
      <p>The increasing availability of sensor data on the web introduces higher
heterogeneity, which makes it more difficult for potential users to make sense
of these data sources and to identify which ones are useful for their
applications. An example of this scenario is the Swiss Experiment<sup>2</sup> project, a
platform that enables real-time publishing of environmental data on the web, from
a large-scale federation of sensor networks, mainly in the Swiss Alps. The
published data is heterogeneous, as it comes from different geographical locations,
with different time spans (e.g. observations collected during 1 year, 3 months,
etc.) and varying sampling rates (e.g. per minute, per 10 minutes).
Moreover, the metadata for these sensor types is not always complete and coherent.
As an example, to indicate that a sensor measures temperature (i.e. the observed
property), different sensors use various tag names, like "temperature", "temp",
"t", "msptemperature", "tp", etc. Although the data is available for anyone to
use, these noisy descriptions are not understandable enough and do not provide
semantic information about what the data represents.</p>
      <p><sup>1</sup> Cosm, formerly Pachube: https://cosm.com/</p>
      <p><sup>2</sup> Swiss Experiment: http://www.swiss-experiment.ch/</p>
      <p>In less controlled scenarios than the Swiss Experiment, the problems of
heterogeneity are even more noticeable. For instance, in the Cosm web platform,
users tag their sensor data as a means of providing metadata, identifying which types of
measurements they are publishing. Projects like the Air Quality Egg<sup>3</sup>, which aims at
promoting participatory air-quality sensing, enable almost any citizen to
publish measurements at web scale. However, the user-provided metadata is often
incomplete. In many cases these tags are misleading or not provided at
all, making it very hard for other users to query or make use of this data.</p>
      <p>
        To overcome this problem, previous works have proposed establishing explicit
semantics on the metadata using sensor ontologies [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. When using
these ontologies, the semantic information from the sources sometimes has to be
mapped manually to the new metadata model [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], which is a cumbersome and
error-prone task. In this paper we propose a novel approach of semantic sensor
analysis that infers semantic properties such as the type of observed property,
using the raw sensor observations as input. The main contributions of this paper
are the following:
- We propose a novel method for representing time series as distributions of
the slopes of a linear approximation of the initial numeric sensor
measurements.
- Based on the statistics of the observation slopes, we infer the type of observed
property of the sensor measurements. We use a classification method that
exploits the similarity of the slope distributions.
- We provide a mechanism for enriching the sensor metadata, based on the
SSN Ontology [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], with the metadata inferred from the observation slopes.
- We build a self-contained evaluation system linking raw sensor
measurements to high-level semantics, and validate our method using two real-life
environmental sensor datasets, from the Swiss Experiment and AEMET<sup>4</sup> (the
Spanish meteorological office).
      </p>
      <p>The remainder of this paper is organized as follows: Section 2 describes the
global approach proposed for semantic analysis of sensor data. Section 3 studies
the sensor data representation using slopes, whereas Section 4 focuses on
building a classification algorithm for inferring observed property types and integrating
them into the sensor metadata. In Section 5, we experimentally evaluate our
approach. Section 6 summarizes existing related work. Finally, Section 7 includes
concluding remarks and points to future work.</p>
      <p><sup>3</sup> Air Quality Egg: http://airqualityegg.wikispaces.com/</p>
      <p><sup>4</sup> Agencia Estatal de Meteorología: http://www.aemet.es</p>
    </sec>
    <sec id="sec-3">
      <title>From Raw Measurements to Semantic Metadata</title>
      <p>Sensor data is typically represented as time series, describing the evolution over
time of a certain observed property. Raw sensor data without any descriptive metadata
has limited use, as it is hard to discover, integrate or interpret.
While in controlled environments the sensor metadata can be reasonably well
managed and controlled by the data owners, in the context of the sensor web,
where any citizen is able to produce and publish data, this becomes a more difficult
task. Although semantic metadata has been shown to be effective for managing
large sensor metadata repositories, current proposals require expensive manual
curation and tagging (see Section 6). Moreover, these approaches do not look into
the data values, from which we can derive some of these metadata properties
using analysis and mining techniques.</p>
      <p>We describe in Figure 1 our architecture for deriving semantic metadata from
sensor data measurements. The approach characterizes sensor time
series and extracts their observed property types to enrich sensor metadata,
and consists of four main layers:
- At the sensor deployment layer, sensor nodes provide initial measurements
in terms of real-time numerical values, e.g. temperature, humidity, etc.
- In the semantic sensor analysis layer, we first represent the sensor data
stream using linear approximations and calculate the observation slopes.
Based on the sensor slopes, we are able to compute the similarity between sensor
data series, detect the observed property types through classification, and
perform detection of these types with partial information.
- A semantic representation of the analysis component is integrated into the
semantic metadata. Using the SSN Ontology as a basis, and combined with
domain-specific ontologies, this enriched metadata is made available for
further processing or querying.
- In the application layer, users can build tools and visualizations to query such
sensor data and receive results that include the new metadata computed by
the analysis layer.</p>
      <p>
        The deployment layer is usually built using sensor or stream data
management systems. These systems centralize the data captured by the devices and
provide storage, query interfaces and streaming operators. As for the
semantic metadata, we build upon previous work on semantic management of sensor
networks [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], centered on the use of the SSN Ontology, coupled with domain
ontologies and vocabularies for quantities and units of measurement. For the
analysis of the time series, we propose a representation based on the slopes of
a linear approximation of the data, as described in Section 3. These
representations can then be used to compare and find similarities among new and existing
time series, classifying them according to the detected observed property type.
As a result, we are able to complete and query the sensor metadata, as
detailed in Section 4.
      </p>
    </sec>
    <sec id="sec-4">
      <title>Sensor Data Representation with Slopes</title>
      <p>In environmental time series, similar patterns can be observed periodically over
time. These patterns can be characteristic of a type of sensor data, and therefore
help to recognize it. If we represent a time series using a linear representation,
such as the one in Figure 2(b), the patterns of the data can be associated with the
angles of the linear segments or their corresponding slopes. For instance, a steep
slope indicates a sudden increase of the measured property. The intuition is that
if these slopes are repetitive over time, we can build slope distributions that
are representative of a type of time series. Using slopes makes it possible to find
similarities between time series that do not necessarily have the same value ranges
but show similar behavior, such as the air temperature in two different locations.</p>
      <p>Figure 2: (a) Linear approximation; (b) Constructing the convex hulls and segments.</p>
      <p>
        We can use linear segments to approximate a time series (Piecewise Linear
Representation, PLR), and analyze the trends by observing the angles that the
segments form. For instance in Figure 2(a), we use 2 segments to represent the
original 10 data points. Notice that the number of points per segment can be
variable (adaptive approximations). We used the algorithm of [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] for the
construction of piecewise linear histograms.
      </p>
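      <p>
        Consider a time series of <italic>n</italic> data points
<inline-formula><tex-math>X = x_1, x_2, \ldots, x_n</tex-math></inline-formula> that
we want to fit with <inline-formula><tex-math>m \ll n</tex-math></inline-formula> segments. The algorithm maintains a set <italic>B</italic> of
buckets <inline-formula><tex-math>b_i = \langle i, \mathit{beg}_i, \mathit{end}_i, l_i, r_i, h_i \rangle</tex-math></inline-formula>, where <inline-formula><tex-math>h_i</tex-math></inline-formula> is a convex hull of data points,
and <inline-formula><tex-math>(\mathit{beg}_i, l_i), (\mathit{end}_i, r_i)</tex-math></inline-formula> are the coordinates of the endpoints of the segment that best fits the
convex hull (the segment that bisects the thinnest bounding rectangle of <inline-formula><tex-math>h_i</tex-math></inline-formula> [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]).
The slope of <inline-formula><tex-math>b_i</tex-math></inline-formula> can be calculated as
<disp-formula><tex-math>\mathit{slope}(b_i) = \frac{r_i - l_i}{\mathit{end}_i - \mathit{beg}_i}</tex-math></disp-formula>
The algorithm adds elements from <italic>X</italic> to <italic>B</italic> until there are no buckets available, and then it starts
to merge the adjacent buckets <inline-formula><tex-math>b_i</tex-math></inline-formula> and <inline-formula><tex-math>b_{i+1}</tex-math></inline-formula> that, combined, produce the smallest
increase in total error. Merging reduces to a convex hull merge of <inline-formula><tex-math>h_i</tex-math></inline-formula> and <inline-formula><tex-math>h_{i+1}</tex-math></inline-formula>.
The algorithm iterates until all elements of <italic>X</italic> have been placed in a bucket. The
resulting set of segments, one per bucket <inline-formula><tex-math>b_i</tex-math></inline-formula>, is the linear approximation of <italic>X</italic>.
      </p>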
      <p>For instance in Figure 2(b), the convex hull <inline-formula><tex-math>h_i</tex-math></inline-formula> encloses 8 data points and its
minimum bounding rectangle is bisected by the thick black segment defined by the points
<inline-formula><tex-math>(\mathit{beg}_i, l_i), (\mathit{end}_i, r_i)</tex-math></inline-formula>. This is the linear representation for these 8 points. During
the computation of the linear representation, if merging <inline-formula><tex-math>h_i</tex-math></inline-formula> with the next hull
<inline-formula><tex-math>h_{i+1}</tex-math></inline-formula> reduces the approximation error, they will form a new single hull with its
own bisecting segment. Once we apply this PLR algorithm, we have the time
series represented as line segments, each with a distinctive slope.</p>
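      <p>The slope computation for a bucket can be sketched in Python as follows. This is only a minimal illustration, assuming that each bucket is already available as a hypothetical (beg, l, end, r) tuple holding the segment endpoints; it does not implement the convex hull construction or the bucket merging of [<xref ref-type="bibr" rid="ref4">4</xref>], and the function name segment_slopes is an assumption made for this example.</p>
      <preformat>
```python
import math


def segment_slopes(buckets):
    """Return (slope, angle) pairs for buckets given as (beg, l, end, r) tuples.

    (beg, l) and (end, r) are the endpoints of the segment of one bucket.
    The angle is the arctangent of the slope, in radians.
    """
    result = []
    for beg, l, end, r in buckets:
        if end == beg:
            raise ValueError("a segment must span a non-zero time interval")
        slope = (r - l) / (end - beg)
        result.append((slope, math.atan(slope)))
    return result


# Two hypothetical segments: a rise of 2 units over 2 time steps,
# followed by a fall of 2 units over 4 time steps.
example = segment_slopes([(0, 0.0, 2, 2.0), (2, 2.0, 6, 0.0)])
```
      </preformat>
      <p>Working with the angle rather than the raw slope keeps the values in a bounded range, which is convenient when the slope space is later divided into sectors.</p>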
      <sec id="sec-4-1">
        <title>Slope Distributions</title>
        <p>To build the slope distributions, we first compute a linear approximation of the
time series, using the algorithm described in Section 3.1. It is possible to create
linear approximations of different accuracy, depending on the number of
segments per unit of time. For instance, for a time series of 30 days, if we use 4
segments per day, their slopes will reflect coarse-grained changes in the data
during each day. Time series with originally different sampling times can be
represented using the same segments-per-day rate, in order to be comparable. Obviously,
if the original series has fewer samples per day than the requested number of segments per day, the
representation with that rate is not possible.</p>
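        <p>Once the linear representation is built, we can compute the slopes and
analyze them. The slope or gradient space, bounded in the <inline-formula><tex-math>[-\infty, \infty]</tex-math></inline-formula> interval for the
possible angles <inline-formula><tex-math>[-\frac{\pi}{2}, \frac{\pi}{2}]</tex-math></inline-formula>, can be divided into sectors, each represented by a
symbol <inline-formula><tex-math>\alpha_j</tex-math></inline-formula> from an alphabet <italic>A</italic>, and we can assign each segment to its corresponding
symbol. We propose using the segment representation discussed in the
previous section to compute slope symbolizations, which characterize a time series as
a sequence <italic>S</italic> of symbols <inline-formula><tex-math>s_i</tex-math></inline-formula> from the alphabet <italic>A</italic>, each corresponding to a type of slope.</p>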
        <p>In this way, we characterize a time series by the type of variations present
in the sensor data, regardless of the data values. For example, if we divide the
angle space into 4 sectors (labeled <italic>a</italic>, <italic>b</italic>, <italic>c</italic>, <italic>d</italic>), at intervals of <inline-formula><tex-math>\frac{\pi}{4}</tex-math></inline-formula>, we can match each
segment slope with one symbol. For instance, in Figure 3 we have 4 segments,
whose symbolic representation is <italic>adac</italic>, obtained by matching each slope with a symbol.</p>
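        <p>Having this symbolic representation of the slopes, it is possible to compare
them to check if two series have similar slope patterns. One simple way to do so
is to generate symbol distributions, or histograms that count how many symbols
of each type exist in a time series. A distribution of a sequence <italic>S</italic> can thus be defined
as a set <inline-formula><tex-math>D_S</tex-math></inline-formula> of elements <inline-formula><tex-math>d_{\alpha_j} = |\{ s_i \in S : s_i = \alpha_j \}|</tex-math></inline-formula>, for all symbols <inline-formula><tex-math>\alpha_j</tex-math></inline-formula> in <italic>A</italic>. For
the previous example, it would be the vector <inline-formula><tex-math>\langle 2, 0, 1, 1 \rangle</tex-math></inline-formula>, which can be normalized
by the total elapsed time, so that we can compare series encompassing different
time spans. A simple distance measure is the Euclidean distance, defined for two
distributions <inline-formula><tex-math>D_{S_1}, D_{S_2}</tex-math></inline-formula> of length <italic>n</italic> as:
<disp-formula><tex-math>d_{eucl}(D_{S_1}, D_{S_2}) = \sqrt{\sum_{i=1}^{n} (d_{S_1 i} - d_{S_2 i})^2}</tex-math></disp-formula></p>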
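        <p>The symbolization, the symbol distribution and the Euclidean distance can be illustrated with the following Python sketch. It assumes equal-width sectors over the angle range, with the first symbol assigned to the steepest descending slopes; this ordering of the labels, the function names and the example angles are assumptions made for illustration and are not taken from the original figures.</p>
        <preformat>
```python
import math
from collections import Counter


def symbolize(angles, alphabet="abcd"):
    """Map each angle in [-pi/2, pi/2] to the symbol of its equal-width sector."""
    width = math.pi / len(alphabet)
    symbols = []
    for angle in angles:
        index = int((angle + math.pi / 2) // width)
        # Clamp so that the upper boundary pi/2 falls into the last sector.
        index = min(max(index, 0), len(alphabet) - 1)
        symbols.append(alphabet[index])
    return "".join(symbols)


def distribution(symbols, alphabet="abcd", elapsed=1.0):
    """Count the symbols of each type, normalized by the elapsed time."""
    counts = Counter(symbols)
    return [counts[s] / elapsed for s in alphabet]


def euclidean(d1, d2):
    """Euclidean distance between two distributions of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(d1, d2)))


# Four hypothetical segment angles (radians) that symbolize to "adac".
symbols = symbolize([-1.2, 1.0, -1.0, 0.3])
counts = distribution(symbols)
distance = euclidean(counts, [1, 1, 1, 1])
```
        </preformat>
        <p>With these example angles the sequence is adac and the distribution is 2, 0, 1, 1, matching the example above. Passing the series duration as elapsed makes series of different time spans comparable.</p>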
      </sec>
      <sec id="sec-4-2">
        <title>Choosing the angle divisions</title>
        <p>Although we can arbitrarily choose how to divide the angle space (e.g. 4 sectors
of <inline-formula><tex-math>\frac{\pi}{4}</tex-math></inline-formula> as in the previous example), the actual angles may be more concentrated in
some intervals than in others. For instance, time series with highly changing angles,
such as wind speed, may have steeper gradients than a more stable series. Taking
this into account, we propose to analyze the training datasets to determine
an angle division that better represents the actual distribution of angles in the
training set. Using this distribution information, we can divide the angle space
into divisions that each hold the same number of angles of the training data.</p>
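        <p>One possible way to obtain divisions that hold the same number of training angles is to take quantiles of the sorted angles. The sketch below is an assumption about how this could be implemented, not the implementation used in the experiments; the function names and the training angles are hypothetical.</p>
        <preformat>
```python
import bisect


def equal_frequency_boundaries(angles, sectors):
    """Return sectors - 1 boundaries that split the training angles into
    sectors holding approximately the same number of angles."""
    if sectors > len(angles):
        raise ValueError("need at least one training angle per sector")
    ordered = sorted(angles)
    n = len(ordered)
    return [ordered[(k * n) // sectors] for k in range(1, sectors)]


def symbol_index(angle, boundaries):
    """Index of the sector an angle falls into, given sorted boundaries."""
    return bisect.bisect_right(boundaries, angle)


# Hypothetical training angles (radians), concentrated around zero.
training_angles = [-1.0, -0.2, -0.1, 0.0, 0.05, 0.1, 0.2, 1.2]
boundaries = equal_frequency_boundaries(training_angles, 4)
indices = [symbol_index(a, boundaries) for a in training_angles]
```
        </preformat>
        <p>In this example the sectors around zero are narrow and the outer ones are wide, so each of the four sectors receives two of the eight training angles.</p>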
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Deriving Semantic Metadata</title>
      <p>After establishing how the data is segmented and symbolized, we can use the
symbol distributions for data analysis tasks that help in understanding the semantics
of the data. Given a time series that lacks appropriate metadata, the
potential user of this data can compare it with time series that have
already been analyzed. We show how this can be done using our symbolization and
a simple classification scheme, even with a partial subset of a time series.</p>
      <sec id="sec-5-1">
        <title>Semantic Descriptions</title>
        <p>
          A semantic description of an observation is a collection of statements that
includes the observed property (e.g. humidity, pressure), the feature of interest (e.g.
the air at some location) and the unit of measurement, among others. For instance, using
the vocabulary of the SSN Ontology [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ], we describe a wind speed observation in
Listing 1. The observation, identified as swissex:WindSpeedObservation1, has
been observed by sensor swissex:SensorWind1 and reported a value of 6.245.
The observed property type cf-property:wind_speed (speed of the wind
feature) is defined in a domain-specific vocabulary (in this case the Climate and
Forecast vocabulary defined by the W3C SSN-XG group<sup>5</sup>). Additional metadata
about this observation is omitted for brevity.
<preformat>swissex:WindSpeedObservation1 rdf:type ssn:Observation ;
    ssn:featureOfInterest cf-feature:wind ;
    ssn:observedProperty cf-property:wind_speed ;
    ssn:observationResult
        [ rdf:type ssn:SensorOutput ;
          ssn:hasValue [ qudt:numericValue "6.245"^^xsd:double ] ] ;
    ssn:observedBy swissex:SensorWind1 .</preformat>
        </p>
        <sec id="sec-5-1-1">
          <title>Listing 1: Wind speed observation in RDF according to the SSN Ontology</title>
          <p>Concretely, the cf-property:wind_speed property indicates that this is an
observation of wind speed, and it has further semantic information in the Climate
&amp; Forecast vocabulary, as seen in Listing 2. It states that it is a property of the
wind (cf-feature:wind) and that its general quantity kind is speed
(qu:speed). In order to extract this information, i.e. the type of observed property,
from an unannotated dataset, we propose the classification scheme in the next
subsection. The goal is to identify the ssn:observedProperty for a
time series.
<preformat>cf-property:wind_speed rdf:type dim:VelocityOrSpeed ;
    rdfs:label "wind speed" ;
    ssn:isPropertyOf cf-feature:wind ;
    qu:propertyType qu:scalar ;
    qu:generalQuantityKind qu:speed .</preformat></p>
        </sec>
        <sec id="sec-5-1-2">
          <title>Listing 2: Wind Speed property according to the Climate and Forecast vocabulary</title>
        </sec>
      </sec>
      <sec id="sec-5-2">
        <title>Data Classification</title>
        <p>Given two sets of time series, a training set already annotated according to the
type of data that is captured, and an unannotated test set, we are interested in
finding the observed property for the second set. Assume we have a collection
<italic>D</italic> of symbol distributions <inline-formula><tex-math>D_1, \ldots, D_i, \ldots, D_n</tex-math></inline-formula> as a training set, each of them
corresponding to a time series <inline-formula><tex-math>ts_i</tex-math></inline-formula> already classified with a type of observed property
(e.g. "wind speed"). The classification task consists in finding the best property
for a time series <inline-formula><tex-math>ts_{test}</tex-math></inline-formula> in the test set.</p>
        <p>
          We can use a simple k-nearest neighbor scheme, which has been successfully
used for time series classification [
          <xref ref-type="bibr" rid="ref5 ref6">5,6</xref>
          ]. First, the time series <inline-formula><tex-math>ts_{test}</tex-math></inline-formula> is segmented
and symbolized. Then, we generate a symbol distribution <inline-formula><tex-math>D_{test}</tex-math></inline-formula>, as described in
Section 3.2, which is compared with each of the distributions <inline-formula><tex-math>D_i</tex-math></inline-formula>
in <italic>D</italic>. Among the <italic>k</italic> distributions closest to <inline-formula><tex-math>D_{test}</tex-math></inline-formula>, we select the observed property
of the majority.
        </p>
        <p><sup>5</sup> C&amp;F vocabulary: http://purl.oclc.org/NET/ssnx/cf/cf-property</p>
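        <p>The nearest neighbor vote over symbol distributions can be sketched as follows. This is a minimal illustration that assumes the Euclidean distance as the comparison measure; the function name classify, the training distributions and their labels are hypothetical values chosen for the example.</p>
        <preformat>
```python
import math
from collections import Counter


def classify(d_test, training, k=5):
    """Return the majority observed property among the k nearest distributions.

    training is a list of (distribution, observed_property) pairs, where each
    distribution has the same length as d_test.
    """
    ranked = sorted(training, key=lambda item: math.dist(d_test, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical normalized slope distributions with their annotated property.
training = [
    ([0.50, 0.10, 0.10, 0.30], "wind_speed"),
    ([0.45, 0.15, 0.10, 0.30], "wind_speed"),
    ([0.10, 0.40, 0.40, 0.10], "air_temperature"),
    ([0.05, 0.45, 0.40, 0.10], "air_temperature"),
    ([0.10, 0.35, 0.45, 0.10], "air_temperature"),
]
predicted = classify([0.12, 0.40, 0.38, 0.10], training, k=3)
```
        </preformat>
        <p>With an unbalanced training set, a class with fewer than k/2 series can never win the vote, so the choice of k interacts with the number of annotated series per property.</p>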
      </sec>
      <sec id="sec-5-3">
        <title>Using Partial Data Subsets</title>
        <p>This classification technique may use the complete time series for computing
the symbolization and the slope distribution. However, for types of data with
recurring patterns, such as the ones present in environmental and meteorological
data, a smaller subset of the data can be enough to extract the features that
help detect the type of observed property. In that case, for the construction of
the linear representation of the data, we simply choose a subset of the original
data: <inline-formula><tex-math>X' = x_1, x_2, \ldots, x_{n'}</tex-math></inline-formula>, with a different <inline-formula><tex-math>n'</tex-math></inline-formula> such that <inline-formula><tex-math>n' &lt; n</tex-math></inline-formula>.</p>
      </sec>
      <sec id="sec-5-4">
        <title>Querying using the Analysis Results</title>
        <p>After executing the classification, we can use the extracted information to
complete the sensor metadata, which is then available for querying. In Listing 3 we
show a simple SPARQL query that asks for sensors that measure air temperature.
<preformat>SELECT ?sensor
WHERE {
  ?sensor a ssn:Sensor ;
          ssn:observes cf-property:air_temperature .
}</preformat></p>
        <sec id="sec-5-4-1">
          <title>Listing 3: Query all sensors that measure air temperature</title>
          <p>The streams produced by sensors can be seen as streaming datasets, whose
metadata can also be queried. The stream, identified by a URI, can be seen
as an unbounded dataset of observations, some of which are actually used to
compute the slope symbolizations and classification described above. The
observed properties obtained for the sensor (e.g. cf-property:air_temperature)
are therefore the observed properties of the stream observations. We can also
query for more general types of data, for instance the generic temperature
property. In Listing 4 we ask for all stream URIs of sensors that measure some type
of temperature.</p>
          <p><preformat>SELECT ?stream ?observedProperty
WHERE {
  ?sensor a ssn:Sensor ;
          ssn:observes ?observedProperty .
  ?stream ssn:isProducedBy ?sensor .
  ?observedProperty qu:generalQuantityKind qu:temperature .
}</preformat></p>
        </sec>
        <sec id="sec-5-4-2">
          <title>Listing 4: Query all streams of sensors that measure some type of temperature</title>
          <p>Furthermore, we can expose the similarity measurements computed between
the time series, so that users can also query this information. As an example,
in Listing 5 we use the Similarity Ontology<sup>6</sup> (sim) to represent the computed
distance between two series, using our slope representation. We can then query,
for instance, the five series most similar to a given time series.
<preformat>swissex:slopeSim1_2 a sim:Similarity ;
    sim:subject swissex:timeseries1 ;
    sim:object swissex:timeseries2 ;
    sim:weight 0.32 ;
    sim:method swissex:SlopeDistributionDistance .</preformat></p>
          <p><sup>6</sup> The Similarity Ontology: http://purl.org/ontology/similarity/</p>
        </sec>
        <sec id="sec-5-4-3">
          <title>Listing 5: Slope distribution similarity between two time series</title>
          <p>This type of query allows users not only to use the final results of a
classification task, but also to query more detailed information, including the precision
of the computations. This information can be used to validate the metadata or to
provide insight into the analysis process and the relationship of a sensor stream
with other streams. In the case of the early detection of the observed property
of a time series, the user may be interested in knowing, for example, how many
days of data are typically used for classifying those sensors that measure wind
speed (Listing 6).</p>
          <p><preformat>SELECT ?sensor ?dur
WHERE {
  ?sensor a ssn:Sensor ;
          ssn:observes cf-property:wind_speed .
  ?timeseries ssn:isProducedBy ?sensor .
  ?timeseries swissex:duration [ qu:numericalValue ?dur ] .
}</preformat></p>
        </sec>
        <sec id="sec-5-4-4">
          <title>Listing 6: Query the number of data days used for classifying wind speed sensors</title>
        </sec>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Experimentation</title>
      <p>
        The main goal of these experiments is to show that the proposed sensor data
representation using slopes can be used to characterize sensor data and extract
sensor metadata corresponding to the types of observed properties. First, we show
how the classification behaves with two real-life datasets, in terms of precision.
Next, we experiment with smaller subsets of data samples
and observe how the classification behaves with less data, given that the data
contains repeating patterns. Finally, we compare our approach with a classification
using the widely used SAX symbolic representation of the data [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>To validate the classification approach presented in Section 4.2, we
implemented it and applied it to two different datasets in the environmental domain:
one from the Swiss Experiment<sup>7</sup> and another from AEMET. The Swiss Experiment data is
heterogeneous, as it comes from different geographical locations; some series have different
time spans (e.g. observations collected during 1 year, 3 months, etc.), and others
have different sampling rates. The number of sensors per observation type also
varies (e.g. 78 for temperature, only 4 for snow height). Due to the conditions
of the deployments, some of them experimental and others deployed in harsh
environments, this dataset contains a considerable amount of noise.</p>
      <p>The AEMET dataset consists of sensor data from 100 weather stations
managed by the Spanish meteorological office. The data is heterogeneous, coming
from stations all over Spain, and was originally collected at intervals of 10
minutes. It contains, in general, less noise and fewer anomalies than the Swiss Experiment
dataset, as it comes from stations used daily for meteorological forecasts.</p>
      <p><sup>7</sup> The dataset is available at: http://lsirpeople.epfl.ch/qvhnguye/benchmark/</p>
      <sec id="sec-6-1">
        <title>Classification in Swiss Experiment and AEMET</title>
        <p>The goal of our first experiment is to evaluate the effectiveness of the
classification in terms of precision and recall. The classifier is expected to assign
the correct label (the type of observed property, e.g. "humidity") to time series
from a test set. The classifier uses a training set of time series, and the evaluation
criteria are computed in terms of the number of true positives (<italic>tp</italic>), false positives
(<italic>fp</italic>) and false negatives (<italic>fn</italic>): precision (<inline-formula><tex-math>p = \frac{tp}{tp + fp}</tex-math></inline-formula>) and recall (<inline-formula><tex-math>r = \frac{tp}{tp + fn}</tex-math></inline-formula>).</p>
        <p>Swiss Experiment. The heterogeneity of the Swiss Experiment dataset required
applying different parameters for the linear approximation step. Some time series
had very short sampling intervals (e.g. every 2 seconds for pressure, for at
most two days), while others had very long ones (e.g. every half hour for
several months). Hence, the approximations were very different in these cases
(hundreds of segments per day for short intervals, and only a few per day for
long ones). We applied a 5-fold cross-validation scheme to divide our dataset into
training and test sets, and then applied the nearest neighbor algorithm. We present
the confusion matrix in Table 4, for k = 5.</p>
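        <p>Per-class precision and recall can be derived from a confusion matrix as in the following sketch, which uses the standard definitions p = tp / (tp + fp) and r = tp / (tp + fn). The nested dictionary layout, the function name and the counts are assumptions made for illustration; they are not the values of the experiments.</p>
        <preformat>
```python
def precision_recall(confusion, label):
    """Compute (precision, recall) for one class.

    confusion[actual][predicted] holds the number of series of class
    `actual` that the classifier labeled as `predicted`.
    """
    tp = confusion[label].get(label, 0)
    # Series of other classes that were labeled as `label`.
    fp = sum(row.get(label, 0)
             for actual, row in confusion.items() if actual != label)
    # Series of class `label` that received another label.
    fn = sum(count
             for predicted, count in confusion[label].items()
             if predicted != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical two-class confusion matrix.
confusion = {
    "temperature": {"temperature": 8, "humidity": 2},
    "humidity": {"temperature": 1, "humidity": 4},
}
p, r = precision_recall(confusion, "temperature")
```
        </preformat>
        <p>Reading the matrix by column gives the false positives of a class, and reading it by row gives its false negatives, which is why an unbalanced dataset can affect the two measures differently.</p>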
        <p>We can observe that the effectiveness of the classification varies among the
different types of data. The nearest neighbor scheme is also biased, as the dataset
is highly unbalanced. Since we have comparatively many more samples of
temperature or wind speed than of pressure or snow height, the latter are less
likely to find nearest neighbors of the same class. For instance, for lysimeter and
snow height almost no series are correctly identified, as we have a very small
number of series. Nevertheless, in the cases of pressure or CO2 the precision is
good regardless of the low number of series. This is a special case, since these
series have very different slope distributions and also a very short sampling
interval. Since their sampling interval is much shorter (e.g. every 2 seconds) than that of most
of the other series in the dataset, comparing them with those series yields very large distances,
and such candidates are quickly discarded.</p>
        <p>In cases where the total number of time series was very small (e.g. only 4 for
snow height), the approach is clearly not effective. It requires a larger training
set to reach an acceptable precision. Also, when the series are very irregular
(sometimes due to noise and erroneous, non-curated data in the original dataset),
they predictably fail to be correctly classified.</p>
        <p>AEMET. For the AEMET dataset, we followed the same approach as with the
Swiss Experiment. However, for the AEMET data we had a larger number of
time series for every type of data, thus avoiding the lack of training
data encountered in the previous tests. Moreover, the sampling interval
is the same for all series, making it easier to compare their slope distributions. We applied the
classification scheme with a 10-fold cross-validation for this dataset. We provide
the confusion matrix for k = 5 in Table 5.</p>
        <p>We can notice that in this case the approach achieves better precision, as
expected, since we avoided the problems of sampling times and unbalanced types
(the number of series per type is similar or the same). However, it can be
observed that there are significant numbers of false positives at some specific spots. For
instance, the number of soil temperature series falsely identified as air temperature
is very high. This is in fact an expected result, since both are specializations
of the more general type temperature. Hence, both share patterns in the time
series, which are reflected in the slope distributions that are compared during the
classification process. The same situation can be seen between wind speed and
wind speed (max), and between wind direction and wind direction (max).</p>
        <p>It is also interesting to see that if we consider the "unification" of
similar types of data (e.g. wind speed and maximum wind speed), the precision is
much higher (Figure 6). This suggests that the slope distributions are useful
for identifying related data, because related types produce very similar slope distributions.
This is an expected behavior, for instance for wind speed and wind speed (max),
which are measurements of the same type of data. In order to discern between
small differences like these, other characteristics of the data have to be taken
into account. In these cases where two types of observations are similar, we can
use a higher-level definition of the observed property. For instance, in the Climate
and Forecast vocabulary, the specific properties cf-property:air_temperature
and cf-property:soil_temperature both have qu:temperature as their general
quantity kind.</p>
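        <p>The effect of unifying similar types can be illustrated with a minimal sketch. The label names, the mapping to general types and the (actual, predicted) pairs below are made up for illustration and are not taken from the evaluation; the sketch only shows how precision changes once specific properties are mapped to a shared general type.</p>
        <preformat>
```python
# Hypothetical mapping from specific observed properties to a general type.
# The labels are illustrative and do not come from the paper's tables.
GENERAL_TYPE = {
    "air_temperature": "temperature",
    "soil_temperature": "temperature",
    "wind_speed": "wind_speed",
    "wind_speed_max": "wind_speed",
}


def precision(pairs, target):
    """Precision for `target`: correct predictions / all predictions of it."""
    actuals = [actual for actual, predicted in pairs if predicted == target]
    if not actuals:
        return 0.0
    return actuals.count(target) / len(actuals)


def unify(pairs):
    """Replace every specific label by its general type."""
    return [(GENERAL_TYPE[a], GENERAL_TYPE[p]) for a, p in pairs]


# Made-up (actual, predicted) classification outcomes.
pairs = [
    ("air_temperature", "air_temperature"),
    ("soil_temperature", "air_temperature"),
    ("soil_temperature", "soil_temperature"),
    ("wind_speed", "wind_speed_max"),
    ("wind_speed_max", "wind_speed_max"),
    ("wind_speed", "wind_speed"),
]

specific = precision(pairs, "air_temperature")
general = precision(unify(pairs), "temperature")
print(specific, general)
```
        </preformat>
        <p>In this toy input, the soil temperature series predicted as air temperature counts as an error for the specific type, but as a correct prediction once both are treated as temperature.</p>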
      </sec>
      <sec id="sec-6-2">
        <title>Classification with Partial Information</title>
        <p>In this experiment we aim to show how the classification precision varies
when using smaller subsets of the test data. As we discussed in Section 4.3, for
our environmental and meteorological datasets, recurrent slope patterns in the
data can be representative enough to compute the slope distribution and make
it possible to classify the data. We have tested the classification while reducing the
number of days of data used for the computation. In Figure 7(a) and Figure 7(b) we
plot the precision for the AEMET and Swiss Experiment dataset series, for
different subsets of the data (expressed in terms of the number of days of measured
data). In total we have around 200 days of observations, but
for some types of data far fewer days suffice to obtain similar precision in the
classification. This is especially the case for series that include very repetitive
patterns on a daily basis, but not for others with more unpredictable
behavior, such as wind speed, which needs more days of data
than other types to increase the precision.</p>
        <p>Fig. 7: Classification precision for different partial datasets, in terms of the days of
data used. (a) AEMET. (b) Swiss Experiment.</p>
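        <p>A rough sketch of why repetitive series need fewer days is given below. It is not the representation defined earlier in the paper: as a simplifying assumption, slopes are taken between consecutive raw samples rather than from linear approximation segments, and the bin count, the helper names and the synthetic daily sinusoid are all hypothetical. The sketch compares the slope distribution of a full series with the one computed from its first days only.</p>
        <preformat>
```python
import math


def slope_distribution(values, step, bins=8):
    """Normalized histogram of slope angles between consecutive samples.

    Simplifying assumption: slopes come from raw consecutive points,
    not from the segments of a linear approximation.
    """
    counts = [0] * bins
    width = math.pi / bins  # angles lie in (-pi/2, pi/2)
    for prev, cur in zip(values, values[1:]):
        angle = math.atan((cur - prev) / step)
        index = min(int((angle + math.pi / 2) / width), bins - 1)
        counts[index] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def first_days(values, samples_per_day, days):
    """Keep only the first `days` days of an evenly sampled series."""
    return values[: samples_per_day * days]


# Synthetic, perfectly periodic series: 24 samples per day for 200 days.
series = [math.sin(2 * math.pi * i / 24) for i in range(24 * 200)]

full = slope_distribution(series, step=1.0)
partial = slope_distribution(first_days(series, 24, 10), step=1.0)

# Largest per-bin difference between the partial and the full distribution.
gap = max(abs(a - b) for a, b in zip(full, partial))
print(gap)
```
        </preformat>
        <p>For a strictly periodic signal the gap is close to zero after a few days; a less regular signal such as wind speed would be expected to need more days before its distribution stabilizes.</p>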
      </sec>
      <sec id="sec-6-3">
        <title>Comparison with SAX Classification</title>
        <p>
          The goal of this experiment is to compare our approach with a classification
based on the widely used SAX representation of time series [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. The
comparison is based on the precision obtained with both approaches. By classifying with SAX
we can verify how well our method behaves in comparison to a well-established
technique. The SAX approach also produces a symbolization of the time series,
but it does not take angles and slopes into account, as it relies on a PAA
approximation. We applied the same classification method used for our slope-based
representation. We show the classification precision for the Swiss Experiment and
AEMET datasets in Figure 8(a) and Figure 8(b), respectively.
        </p>
        <p>Fig. 8: (a) Swiss Experiment. (b) AEMET.</p>
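        <p>For readers unfamiliar with SAX, the following sketch outlines the usual pipeline: z-normalization, a PAA reduction and a mapping of segment means to symbols using breakpoints that split the standard normal distribution into equiprobable regions. The function names, the four-letter alphabet and the toy input are assumptions made for illustration; this is not the configuration used in the experiment.</p>
        <preformat>
```python
from bisect import bisect_left
from statistics import NormalDist, mean, pstdev


def paa(values, segments):
    """Piecewise Aggregate Approximation: mean of each equal-width segment.

    Assumes segments is not larger than len(values).
    """
    n = len(values)
    bounds = [round(i * n / segments) for i in range(segments + 1)]
    return [mean(values[a:b]) for a, b in zip(bounds, bounds[1:])]


def sax(values, segments, alphabet="abcd"):
    """Symbolize a series: z-normalize, reduce with PAA, map means to letters."""
    mu, sigma = mean(values), pstdev(values)
    normalized = [(v - mu) / sigma for v in values]  # assumes sigma is not 0
    size = len(alphabet)
    # Breakpoints cut the standard normal curve into equiprobable regions.
    breakpoints = [NormalDist().inv_cdf(i / size) for i in range(1, size)]
    symbols = (alphabet[bisect_left(breakpoints, v)]
               for v in paa(normalized, segments))
    return "".join(symbols)


word = sax([1, 2, 3, 4, 5, 6, 7, 8], segments=4)
print(word)
```
        </preformat>
        <p>Each symbol depends only on the mean value of a segment, which is why two series with similar value ranges but different slopes can receive similar SAX words.</p>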
        <p>As can be seen, the classification yields similar results for both methods,
with small differences in AEMET, and slightly better results for the slope-based
approach in the Swiss Experiment dataset. Using the slope distributions proves
helpful for differentiating time series with similar values but very different angles.
In the case of AEMET, the measured values are already enough to discern
between two different types of observation, and hence the results are not improved
by the slope distribution. While the SAX representation has been exploited in
other ways, for example by considering substrings of a fixed size instead of only
one symbol, this experiment shows that our approach is also able to extract
features that help characterize a type of time series and enable its
semantic identification. A classification technique yields different results depending
on the type of data. Further amendments could be plugged into the classification
scheme, but they risk being too specific to the characteristics of certain
data types, and such methods are outside the scope of this work.</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>Related Work</title>
      <p>
        Previous work on time series classification and mining has studied different
approaches for summarizing and exploiting raw sensor data, and has been
complemented with semantic representations for sensor data management.
      </p>
      <p>
        Data Approximations. High-level representations reduce the dimensionality
of time series data, in order to reduce the complexity of indexing and comparison
algorithms, using different techniques. These include piecewise constant and
linear approximations (e.g. PAA [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], APCA [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], PLR [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]) that use constant and linear
segments, respectively, to represent the original time series. Generally simple to
compute, either in batch mode or online using sliding window algorithms, these
methods offer accurate approximations of the original data. These
representations have been widely used for tasks including similarity search, fuzzy queries,
dynamic time warping, clustering and classification [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
      <p>
        While these approximations reduce dimensionality, some approaches
introduce a further step that consists in the symbolization of the time series. These
techniques, such as SAX [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], have been shown to be space and time efficient for
indexing, classification and clustering, and also for additional tasks such as motif
discovery and visualization [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. These symbolizations can be used to compute
distance measures that help in classification and clustering tasks [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Other works
have also considered the slopes of linear approximations, such as the STS
distance [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] for clustering time series.
      </p>
      <p>
        SAX symbolization has also been used for sensor event detection [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] and
for creating high-level perception abstractions from the raw sensor data, by
matching SAX patterns with low-level thematic abstractions [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
      </p>
      <p>
        Time Series Classification. For the task of classification in particular,
different techniques such as decision trees, neural networks and Bayesian classifiers
have been used [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Classification approaches usually fall into the following three
categories: distance-based, feature-based and model-based [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Simple distance
measures such as the Euclidean distance are very limited because they only consider
one-to-one matches in the time axis. Distance measures with more elastic matching
for the time axis, such as Dynamic Time Warping (DTW), have proved
effective for similarity matching [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. These have been coupled with k-nearest
neighbor (k-NN) classifiers, proving an effective combination for a number of
time series classification problems [
        <xref ref-type="bibr" rid="ref15 ref16">15,16</xref>
        ]. These techniques have space and time
computation limitations in some scenarios, and offer little explanation of why
a series belongs to a particular class [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. Feature-based approaches try to find
properties that are representative of a type of series, in order to classify them.
Most of these approaches use a high-level representation, e.g. symbolization or
discretization methods, before extracting the features [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], while others
extract representative subsequences (e.g. shapelets [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]).
      </p>
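      <p>The combination of an elastic distance with a nearest-neighbor classifier can be sketched as follows. This is a textbook formulation of DTW with an absolute-difference local cost and a 1-NN lookup, written for illustration; the function names and the toy series are assumptions and do not reproduce any of the cited implementations.</p>
      <preformat>
```python
def dtw_distance(a, b):
    """Dynamic Time Warping distance with absolute difference as local cost."""
    inf = float("inf")
    # cost[i][j]: best cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # repeat b[j-1]
                                     cost[i][j - 1],      # repeat a[i-1]
                                     cost[i - 1][j - 1])  # advance both
    return cost[len(a)][len(b)]


def nearest_label(query, labelled):
    """1-NN: label of the training series closest to `query` under DTW."""
    return min(labelled, key=lambda item: dtw_distance(query, item[0]))[1]


# Toy training set and a query that is a time-shifted, longer "peak".
train = [([0, 1, 2, 1, 0], "peak"), ([2, 1, 0, 1, 2], "valley")]
shifted = [0, 0, 1, 2, 1, 0]
print(dtw_distance(train[0][0], shifted), nearest_label(shifted, train))
```
      </preformat>
      <p>The query has a different length from the training series, so a one-to-one Euclidean comparison is not directly applicable, while the warping path absorbs the repeated leading sample. The quadratic cost table also hints at the space and time limitations mentioned above.</p>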
      <p>
        Semantic Sensor Representations. The task of modeling sensor data and
metadata with ontologies has been addressed by the semantic web research
community in recent years. Early ontology proposals for describing wireless sensors
have been reviewed in [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. However, the focus of most of these approaches was
on sensor meta-information, while the description of observations was generally
overlooked. Besides, some of these approaches lack the ontology design best practices
of reuse and alignment with standards and reference ontologies. Others, including
the OntoSensor ontology [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ], use the concepts defined in the OGC SensorML<sup>8</sup>
standard as a basis. More recent proposals like [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] and [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] also consider the
OGC Observations and Measurements (O&amp;M) standard<sup>9</sup> to represent
observations captured by sensor networks.
      </p>
      <p>
        Recently, through the W3C SSN-XG group, the semantic web and sensor
network communities have made an effort to provide a domain-independent
ontology, generic enough to adapt to different use cases, and compatible with
the OGC standards at the sensor and observation levels. The result, the SSN
ontology [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], is based on the stimulus-sensor-observation design pattern [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ] and
the OGC standards.
      </p>
      <p>8 SensorML: http://www.opengeospatial.org/standards/sensorml</p>
      <p>9 OGC O&amp;M: http://www.opengeospatial.org/standards/om</p>
    </sec>
    <sec id="sec-8">
      <title>Conclusions and Future Work</title>
      <p>We have described an approach for identifying the type of data from sensor data
sources, using a symbolic representation of the time series slopes. We have shown
how this representation can be used for enriching semantic sensor metadata.
We have shown specific use cases of time series data classification, providing
similarity measures and metadata aggregation that can be queried in terms of
high-level standard ontologies. Finally, we evaluated our approach with real-life
datasets of the Swiss-Experiment project and AEMET.</p>
      <p>We have shown through experimentation that this representation can be
useful for balanced datasets, as the classification gets biased when there are
small numbers of samples in the training set for a particular type of data.
Moreover, our results show that this representation can help group data of the
same type, regardless of geographical location, since it is based on the distribution of
slopes of a linear approximation. Therefore, it can identify similarities between related
types of data, e.g. air temperature and soil temperature. We have compared our
characterization of sensor data with a competitive approach, and showed that
for the chosen environmental datasets it effectively enables the extraction of
semantic metadata.</p>
      <p>The proposed approach, however, was evaluated within the same dataset, and
in the future we will study its applicability to inter-dataset classification. This
framework could also be used for other tasks such as clustering, or for
identifying simple patterns in streams of sensor data. Moreover, complex
symbolizations consisting of sequences of slopes could be considered, which would
represent more complete patterns that can be exploited. We can also consider
building a more complex representation that includes not only the slope
information but also the value ranges, and even tags and labels provided by the data
publishers. This may enable a more complete and accurate extraction of
metadata that enriches the growing Semantic Sensor Web. As a final future path, we
may consider applying online execution of these techniques for real-time analysis.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Sheth</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Henson</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sahoo</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Semantic sensor web</article-title>
          .
          <source>IEEE Internet Computing</source>
          <volume>12</volume>
          (
          <issue>4</issue>
          ) (
          <year>2008</year>
          )
          <fpage>78</fpage>
          –
          <lpage>83</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Compton</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barnaghi</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bermudez</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>García-Castro</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Corcho</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cox</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Graybeal</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hauswirth</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Henson</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Herzog</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Huang</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Janowicz</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kelsey</surname>
            ,
            <given-names>W.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Phuoc</surname>
            ,
            <given-names>D.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lefort</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Leggieri</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Neuhaus</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nikolov</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Page</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Passant</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sheth</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Taylor</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>The SSN ontology of the W3C semantic sensor network incubator group</article-title>
          .
          <source>Journal of Web Semantics</source>
          (In press) (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Calbimonte</surname>
            ,
            <given-names>J.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jeung</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Corcho</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Aberer</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Semantic sensor data search in a large-scale federated sensor network</article-title>
          .
          <source>In: Proc. 4th International Workshop on Semantic Sensor Networks</source>
          . (
          <year>2011</year>
          )
          <fpage>14</fpage>
          –
          <lpage>29</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Buragohain</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shrivastava</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Suri</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Space efficient streaming algorithms for the maximum error histogram</article-title>
          .
          <source>In: Proc. IEEE 23rd International Conference on Data Engineering (ICDE 2007)</source>
          , IEEE (
          <year>2007</year>
          )
          <fpage>1026</fpage>
          –
          <lpage>1035</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Keogh</surname>
            ,
            <given-names>E.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wei</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lonardi</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Experiencing SAX: a novel symbolic representation of time series</article-title>
          .
          <source>Data Min. Knowl. Discov.</source>
          <volume>15</volume>
          (
          <issue>2</issue>
          ) (
          <year>2007</year>
          )
          <fpage>107</fpage>
          –
          <lpage>144</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Xing</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pei</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Keogh</surname>
            ,
            <given-names>E.J.</given-names>
          </string-name>
          :
          <article-title>A brief survey on sequence classification</article-title>
          .
          <source>SIGKDD Explorations</source>
          <volume>12</volume>
          (
          <issue>1</issue>
          ) (
          <year>2010</year>
          )
          <fpage>40</fpage>
          –
          <lpage>48</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Keogh</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chakrabarti</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pazzani</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mehrotra</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Dimensionality reduction for fast similarity search in large time series databases</article-title>
          .
          <source>Knowledge and Information Systems</source>
          <volume>3</volume>
          (
          <issue>3</issue>
          ) (
          <year>2001</year>
          )
          <fpage>263</fpage>
          –
          <lpage>286</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Chakrabarti</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Keogh</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mehrotra</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pazzani</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Locally adaptive dimensionality reduction for indexing large time series databases</article-title>
          .
          <source>ACM Transactions on Database Systems (TODS)</source>
          <volume>27</volume>
          (
          <issue>2</issue>
          ) (
          <year>2002</year>
          )
          <fpage>188</fpage>
          –
          <lpage>228</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Keogh</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chu</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hart</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pazzani</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Segmenting time series: A survey and novel approach</article-title>
          .
          <source>Data mining in time series databases</source>
          <volume>57</volume>
          (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Kasetty</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stafford</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Walker</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Keogh</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Real-time classification of streaming sensor data</article-title>
          .
          <source>In: Proc. 20th IEEE International Conference on Tools with Artificial Intelligence (ICTAI '08)</source>
          . Volume
          <volume>1</volume>
          ., IEEE (
          <year>2008</year>
          )
          <fpage>149</fpage>
          –
          <lpage>156</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Möller-Levet</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Klawonn</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cho</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wolkenhauer</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          :
          <article-title>Fuzzy clustering of short time-series and unevenly distributed sampling points</article-title>
          .
          <source>Advances in Intelligent Data Analysis V</source>
          (
          <year>2003</year>
          )
          <fpage>330</fpage>
          –
          <lpage>340</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Zoumboulakis</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Roussos</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Escalation: Complex event detection in wireless sensor networks</article-title>
          .
          <source>In: Proceedings of the 2nd European Conference on Smart Sensing and Context</source>
          , Springer-Verlag (
          <year>2007</year>
          )
          <fpage>270</fpage>
          –
          <lpage>285</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Barnaghi</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ganz</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Henson</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sheth</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Computing perception from sensor data</article-title>
          .
          <source>In: Proceedings of the 2012 IEEE Sensors Conference</source>
          (to appear). (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Ding</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Trajcevski</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Scheuermann</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Keogh</surname>
            ,
            <given-names>E.J.</given-names>
          </string-name>
          :
          <article-title>Querying and mining of time series data: experimental comparison of representations and distance measures</article-title>
          .
          <source>PVLDB</source>
          <volume>1</volume>
          (
          <issue>2</issue>
          ) (
          <year>2008</year>
          )
          <fpage>1542</fpage>
          –
          <lpage>1552</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Xi</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Keogh</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shelton</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wei</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ratanamahatana</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Fast time series classification using numerosity reduction</article-title>
          .
          <source>In: Proceedings of the 23rd International Conference on Machine Learning (ICML '06)</source>
          . Volume
          <volume>150</volume>
          ., ACM Press (
          <year>2006</year>
          )
          <fpage>1033</fpage>
          –
          <lpage>1040</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Geurts</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Pattern extraction for time series classification</article-title>
          .
          <source>Principles of Data Mining and Knowledge Discovery</source>
          (
          <year>2001</year>
          )
          <fpage>115</fpage>
          –
          <lpage>127</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Ye</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Keogh</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Time series shapelets: a new primitive for data mining</article-title>
          .
          <source>In: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</source>
          , ACM (
          <year>2009</year>
          )
          <fpage>947</fpage>
          –
          <lpage>956</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Compton</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Henson</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lefort</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Neuhaus</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sheth</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>A survey of the semantic specification of sensors</article-title>
          .
          <source>In: Proc. 2nd International Workshop on Semantic Sensor Networks</source>
          . (
          <year>2009</year>
          )
          <fpage>17</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Russomanno</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kothari</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thomas</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          :
          <article-title>Sensor ontologies: from shallow to deep models</article-title>
          .
          <source>In: Proc. 37th Southeastern Symposium on System Theory</source>
          . (
          <year>2005</year>
          )
          <fpage>107</fpage>
          –
          <lpage>112</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Barnaghi</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Meissner</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Presser</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moessner</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Sense and sens'ability: Semantic data modelling for sensor networks</article-title>
          .
          <source>In: Proceedings of the ICT Mobile Summit</source>
          . (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Compton</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Neuhaus</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Taylor</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tran</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Reasoning about sensors and compositions</article-title>
          .
          <source>In: SSN</source>
          . (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Janowicz</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Compton</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>The Stimulus-Sensor-Observation Ontology Design Pattern and its Integration into the Semantic Sensor Network Ontology</article-title>
          .
          <source>In: SSN</source>
          . (
          <year>2010</year>
          )
          <fpage>7</fpage>
          –
          <lpage>11</lpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>