<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Aggregating Linked Sensor Data</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Christoph Stasch</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sven Schade</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alejandro Llaves</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Krzysztof Janowicz</string-name>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Arne Broring</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>52°North Initiative for Geospatial Open Source Software GmbH</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>ITC Faculty, University of Twente</institution>
          ,
          <addr-line>Enschede</addr-line>
          ,
          <country country="NL">Netherlands</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Institute for Environment and Sustainability, Joint Research Centre</institution>
          ,
          <addr-line>Ispra</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Institute for Geoinformatics, University of Münster</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>University of California</institution>
          ,
          <addr-line>Santa Barbara</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Sensor observations are usually offered in relation to a specific purpose, e.g., for reporting fine dust emissions, following strict procedures and spatio-temporal scales. Consequently, the huge amount of data gathered by today's public and private sensor networks is most often not reused outside of its initial creation context. Fostering the reusability of observations and derived applications calls for (i) spatial, temporal, and thematic aggregation of measured values, and (ii) easy integration mechanisms with external data sources. In this paper, we investigate how work on sensor observation aggregation can be incorporated into a Linked Data framework focusing on external linkage as well as provenance information. We show that Linked Data adds new aspects to the aggregation problem, e.g., whether external links from one of the original observations can be preserved for the aggregate. The Stimulus-Sensor-Observation (SSO) ontology design pattern is extended by classes and relations necessary to model the aggregation of sensor observations.</p>
      </abstract>
      <kwd-group>
        <kwd>Sensor Aggregation</kwd>
        <kwd>Semantic Enablement</kwd>
        <kwd>Linked Data</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Sensor observations are collected with a specific purpose in mind and, therefore,
measuring follows strict procedures and spatio-temporal scales [<xref ref-type="bibr" rid="ref1">1</xref>]. While the
same device, e.g., a thermometer, can be used to measure soil and air
temperature, both measurements follow different procedures and their results cannot be combined.
Similar issues hold for fine dust (PM10)<sup>6</sup> measurements, where data coming from
rural monitoring stations has to be distinguished from data produced by sensors
located in urban areas, particularly at major roads [<xref ref-type="bibr" rid="ref2">2</xref>]. Consequently, the rich
observation data gathered by today's public and private sensor networks is difficult
to reuse outside of the initially intended context. We hope to boost the use of
observation results and the number of innovative observation-based applications
by providing mechanisms for (i) spatial, temporal, and thematic aggregation of
measured values, and (ii) easy integration with other data sources.
      </p>
      <p><sup>6</sup> The notation PM10 is used to describe fine dust particles of 10 micrometers or less.</p>
      <p>
        Building on our previous work on exposing standardized observation data
as Linked Data [<xref ref-type="bibr" rid="ref3">3</xref>], this paper introduces the next steps towards opening up
sensor observations to new usage scenarios: the aggregation of observations and
their exposure as Linked Sensor Data. Having temporal aggregates (e.g., yearly
averaged fine dust measures), spatial aggregates (e.g., fine dust concentration in
the Münsterland region in Germany), thematically aggregated observations (e.g.,
blizzards, landslides, or forest fires), and their combinations available makes
linking more attractive and opens environmental information to new user
communities. On the one hand, observations may be connected to particular features
of interest in the Linked Data cloud. On the other hand, hubs such as DBpedia
may directly refer to aggregated observations, e.g., an entry about the German
city of Münster and its surrounding areas may refer to recent and average
weather conditions or air quality measures.
      </p>
      <p>The main contributions of this paper are threefold. We (i) present a Linked
Data model for aggregated sensor data, (ii) discuss the effects of aggregation
on links from and to observations, and (iii) outline the role of provenance in
this setting. The implementation of the extensions discussed in this paper is
ongoing, and the 52°North semantics community<sup>7</sup> plans to release an updated
prototype in fall 2011.</p>
      <p>The remainder of this paper is structured as follows. In Section 2, we
introduce the concept of aggregated observations and provide background information
about Linked Sensor Data and provenance information in observations. Section 3
discusses the implications of aggregation for Linked Sensor Data. Here, we present
the required extensions to our Linked Data model for observations. Additional
investigations address the effects of aggregation on external linking and issues
of data provenance. In Section 4, we relate our work to current efforts
to provide observations as Linked Data and to provide provenance information
in observation data. The paper concludes with a summary and an outline of the
remaining steps for implementing aggregated observations as Linked Data; see
Section 5.</p>
    </sec>
    <sec id="sec-2">
      <title>Background</title>
      <p>In this section, we provide a brief overview of related work. First, we
introduce definitions and related work on the aggregation of observations. Second, we
introduce the concepts of Linked Data. Finally, we describe related work on the
provenance of sensor data.</p>
      <sec id="sec-2-1">
        <title>Aggregation of Observations</title>
        <p>
          Aggregation of observations in space and time is essential to derive information
that is useful for a certain application purpose and to integrate observation data
with differing spatio-temporal resolutions. Yet, spatio-temporal aggregation of
observations in the Linked Data context has not yet been addressed. However, in
other communities, e.g., the database community or the environmental sciences,
spatio-temporal aggregation has been a research topic for years and is sometimes
also referred to as scaling of observations and environmental models. Vega and
Lopez [<xref ref-type="bibr" rid="ref4">4</xref>] give a comprehensive survey of spatio-temporal aggregation methods
for databases. Besides simple aggregation, complex statistical models might also
be applied, as described for the domain of soil sciences by Bierkens et al. [<xref ref-type="bibr" rid="ref5">5</xref>].
Spatio-temporal aggregation processes for observation data are not yet available
on the Web, and recent approaches demonstrate how to tackle this
challenge. A spatio-temporal aggregation service that can be used to provide
such aggregation functionality on the Sensor Web has been introduced in our
previous work [<xref ref-type="bibr" rid="ref6">6</xref>].
        </p>
        <p><sup>7</sup> Implementations and documentation can be found at http://52north.org.</p>
        <p>
          In this paper, we largely follow the definition of aggregation<sup>8</sup> by Jeong et al.
[<xref ref-type="bibr" rid="ref7">7</xref>]. During an aggregation process, the observations are grouped by a grouping
predicate, e.g., by a spatial predicate which is defined by the polygon representing
the area of a city, or by a temporal predicate defined by the time period of a
month. After grouping, an aggregation function is applied that computes a single
value, an aggregate, for the result values of an observation group. The aggregation
function might be linear (e.g., MEAN), but also non-linear (e.g., MEDIAN, or
areal fraction of spatial blocks where the concentration of a pollutant exceeds
a critical level) [<xref ref-type="bibr" rid="ref8">8</xref>]. The grouping predicate does not necessarily have to be the
target spatial or temporal extent of an aggregated observation. Considering the
example of a block kriging method [<xref ref-type="bibr" rid="ref9">9</xref>], for every aggregate of a spatial block, all
measurements are taken into account and not just the ones lying within the extent
of the block. Similarly, for temporal aggregation, moving windows might be used
to aggregate values to time periods that also include the values before and after
a certain period.
        </p>
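        <p>The two steps described above, grouping by a predicate and applying an aggregation function, can be sketched in a few lines of Python. This is only a minimal illustration under our own assumptions and not the implementation of the aggregation service: the sample values, the function aggregate, and the grouping key by_day are hypothetical, and the temporal predicate is simplified to "same calendar day".</p>
        <preformat>
```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, median

# Hypothetical hourly PM10 observations: (sampling time, result value).
observations = [
    (datetime(2011, 5, 8, 22, 0), 20.0),
    (datetime(2011, 5, 8, 23, 0), 30.0),
    (datetime(2011, 5, 9, 0, 0), 10.0),
    (datetime(2011, 5, 9, 1, 0), 50.0),
    (datetime(2011, 5, 9, 2, 0), 12.0),
]


def aggregate(obs, grouping_key, aggregation_function):
    """Group observations by a key derived from the sampling time and
    reduce the result values of each group to a single aggregate."""
    groups = defaultdict(list)
    for sampling_time, value in obs:
        groups[grouping_key(sampling_time)].append(value)
    return {key: aggregation_function(values) for key, values in groups.items()}


def by_day(sampling_time):
    """Temporal grouping predicate: all observations of one calendar day."""
    return sampling_time.date()


# A linear (MEAN) and a non-linear (MEDIAN) aggregation function.
daily_mean = aggregate(observations, by_day, mean)
daily_median = aggregate(observations, by_day, median)
```
        </preformat>
        <p>A spatial predicate would follow the same shape, with the grouping key derived from the sampling location instead of the sampling time. Methods such as block kriging or moving windows do not fit this simple partitioning, because one observation may contribute to several aggregates.</p>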
        <p>
          Besides spatio-temporal aggregation as introduced above, extracting high-level
events from observations is also done by aggregating observations. Treating
the high-level events as observations again enables an easy integration into
existing infrastructures and tools. Considering the blizzard example described
in [<xref ref-type="bibr" rid="ref10">10</xref>], the event of a blizzard can also be modeled as an observation. The
blizzard is an aggregate of several observations indicating heavy snowfall, very low
temperatures, and high wind speed. This example demonstrates that the
grouping predicate is not merely spatial or temporal, but also contains predicates on
the result values of the observations (e.g., heavy snowfall). We thus refer to this
kind of aggregation as thematic aggregation of observations. However, the
observations are still aggregated spatially or temporally, as the blizzard is observed at
a region in space and for a period in time.
        </p>
        <p><sup>8</sup> The term aggregation might also refer to a complex entity with parts. In the case of
observations, this might be a collection of observations where the non-aggregated
observations are parts of the aggregated observation collection. However, in our work
we consider aggregation as described in this paragraph and as commonly used in the
environmental sciences.</p>
        <p>
          For the aggregation of observations, a mechanism that helps to retrace the original
observations and sensors from the aggregated observations is important. Linked
Data [<xref ref-type="bibr" rid="ref11">11</xref>] provides a promising paradigm for such a mechanism, as the
original observations and the aggregates can be easily linked with clear
semantics. Linked Data proposes unique identifiers for data in the Web and links between
them, and relies on the Resource Description Framework (RDF) [<xref ref-type="bibr" rid="ref12">12</xref>]. The most
common query language for RDF is SPARQL [<xref ref-type="bibr" rid="ref13">13</xref>]. SPARQL has capabilities
similar to those of query languages for relational databases, but works by matching graph
patterns and is optimized for RDF triple stores, such as Sesame or Virtuoso.
In recent years, Linked Data has become the most promising vision for
the Future Internet and has been widely adopted by academia and industry.
        </p>
        <p>
          Several approaches for Linked Sensor Data in the Web are already available
[<xref ref-type="bibr" rid="ref14 ref15 ref16">14,15,16</xref>]. They describe how to identify sensor resources using URIs, how to
link them with clear semantics, and how to expose the sensor data in the Web.
However, the issue of spatio-temporal aggregation, e.g., how aggregation affects
the links from and to observations, is not yet addressed. In our previous work
[<xref ref-type="bibr" rid="ref3">3</xref>], we developed a standards-based approach to expose sensor metadata and
observations stored in a Sensor Observation Service (SOS) [<xref ref-type="bibr" rid="ref17">17</xref>] to the Semantic
Web by following Linked Data principles: we provide dereferenceable HTTP
URIs for sensors, observed properties, features of interest, and observations, link
them (to external sources), and expose their semantics using the SSO ontology
[<xref ref-type="bibr" rid="ref18">18</xref>]. In this work, we extend our previous work on Linked Sensor Data to support
aggregated observations.
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>Provenance in Observation Data</title>
        <p>
          There are several approaches available for providing provenance information in
the Web. The W3C's Provenance Incubator Group<sup>9</sup>, predecessor of the new
Provenance Working Group<sup>10</sup>, compiled a list of requirements to support
provenance in RDF, which includes, for example, that every observation should have a
URI identifier [<xref ref-type="bibr" rid="ref19">19</xref>]. Based on these requirements, the Provenance Vocabulary<sup>11</sup> has
been defined, which can be used in the Web to provide provenance information for
Linked Data [<xref ref-type="bibr" rid="ref20">20</xref>]. Similar to the Provenance Vocabulary, the Open Provenance
Model<sup>12</sup> (OPM) defines nodes and edges to create provenance graphs that allow
retracing the creation of an item back to its origin. The nodes can be artifacts,
processes, and agents, whereas the edges between nodes can be defined as the
causal relationships used, wasGeneratedBy, wasControlledBy, wasTriggeredBy,
and wasDerivedFrom. The graphs can be serialized in different data formats such as
XML.
        </p>
        <p><sup>9</sup> http://www.w3.org/2005/Incubator/prov/charter</p>
        <p><sup>10</sup> http://www.w3.org/2011/01/prov-wg-charter</p>
        <p><sup>11</sup> http://sourceforge.net/apps/mediawiki/trdf/index.php?title=Provenance_Vocabulary</p>
        <p><sup>12</sup> http://openprovenance.org/</p>
        <p>
          Besides general approaches for provenance information in the Web,
providing provenance information in Linked Sensor Data has recently gained attention.
Provenance of sensor data can be defined as information about the source of the
sensor data as well as information about transformations applied to the original
data [<xref ref-type="bibr" rid="ref21">21</xref>]. Patni et al. [<xref ref-type="bibr" rid="ref10">10</xref>] propose an approach for provenance in Linked Sensor
Data and define the capabilities of the sensor, the spatio-temporal parameters
of the observation, and the measurement value as relevant sensor provenance
information. Liu et al. [22] introduce a provenance-aware virtual sensor system
based upon the OPM. Using the OPM for their virtual sensors enables the
description of (i) fetching processes for sensor data streams; (ii) workflow execution,
such as data transformation of raw measurements; and (iii) user interaction with a
web application for managing the virtual sensors. In another approach,
Park and Heidemann [<xref ref-type="bibr" rid="ref21">21</xref>] defined their own provenance model that (for sensor
data) is more comprehensive than the OPM. Among other things, this
alternative model allows the definition of access control for sources. Similar to the
approach of Liu and colleagues, the sensor data is annotated with additional
provenance metadata. Our approach will show how most of the relevant provenance
information is already provided in our Linked Sensor Data and how the links
can be mapped to provenance relationships as defined in the OPM.
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Aggregation of Observations in the Linked Data Cloud</title>
      <p>
        In this section we introduce an approach to enable the aggregation of
observations in the Linked Data cloud. First, we present an extension of the
Stimulus-Sensor-Observation (SSO) ontology design pattern [<xref ref-type="bibr" rid="ref18">18</xref>]. Next, we illustrate how
the change of observation properties during aggregation affects the links from
and to observations in the cloud. Finally, we describe how provenance
information pointing back to the original observations can be provided.
      </p>
      <sec id="sec-3-1">
        <title>Extension of the SSO Design Pattern</title>
        <p>
          Following our previous work [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ], we use an intermediate Linked Data model for
exposing sensor observations. It was derived from an ontology developed by the
W3C SSN-XG [23], namely the Stimulus-Sensor-Observation (SSO) ontology
design pattern [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ]. The SSO pattern forms a generic and adaptable starting point
for the development of sensor ontologies as well as Linked Data vocabularies.
        </p>
        <p>Figure 1 shows the classes and relations from the pattern extended by the
Linked Data model for sensor data, and the new elements that have been added
in order to account for aggregation. In a nutshell, we reuse the following
definitions:</p>
        <list list-type="bullet">
          <list-item><p>FeatureOfInterest: entity that comprises observable properties.</p></list-item>
          <list-item><p>ObservedProperty: property that inheres in a feature of interest.</p></list-item>
          <list-item><p>ObservationCollection: set of observations, grouped by a distinct criterion.</p></list-item>
          <list-item><p>Observation: (social) construct that connects observed properties with sensors, sensing results, and sampling times.</p></list-item>
          <list-item><p>SamplingTime: time instant or interval at which an observation was made.</p></list-item>
          <list-item><p>Result: symbol representing an observed value.</p></list-item>
          <list-item><p>Sensor: entity that performs observations.</p></list-item>
          <list-item><p>Procedure: description that specifies how observations have to be carried out.</p></list-item>
        </list>
        <p>In order to account for aggregated observations as Linked Data, we extend
the SSO pattern with the following elements:</p>
        <list list-type="bullet">
          <list-item><p>isAggregateOf: a relation that allows one observation to be aggregated out of others; e.g., an observation of daily PM10 concentration being an aggregate over hourly measures, or an observation of PM10 in Münster, Germany, being an aggregate over various Point of Interest (POI) measures.</p></list-item>
          <list-item><p>SensingDevice<sup>13</sup>: a sensor which is a physical measuring device; e.g., a particular air sampler including a special PM10 filter.</p></list-item>
          <list-item><p>AggregationProcess: a sensor which implements a concrete aggregation procedure (see below), for example the process that calculates regional PM10 concentrations based on several PM10 concentration observations and additional calibration parameters.</p></list-item>
          <list-item><p>AggregationProcedure: the specific procedure used for aggregating several observations into one; e.g., calculating the MEAN of 24 hourly observations of PM10 concentration, or a kriging interpolation method.</p></list-item>
        </list>
        <p><sup>13</sup> The concept of a SensingDevice is also captured as part of the W3C SSN-XG
ontology. However, it is not part of the SSO pattern, which is applied in our work. We
decided to introduce the SensingDevice in particular as opposed to the notion of the
AggregationProcess in order to stress the difference between a single physical
measurement instrument and the aggregation process that combines multiple sensory
inputs into a new observation.</p>
        <p>
          The relations between the classes presented in Figure 1 act as links in our
model and define the multiple navigation paths and external references; see also
[<xref ref-type="bibr" rid="ref3">3</xref>]. The above-mentioned extensions allow for the generation of aggregated
observations together with an explicit statement of the applied aggregation method,
such as MIN, MEAN, or MAX calculations over a time series. This also
allows for linking aggregated observations back to finer-grained observations
(discussed in Section 3.3). This new model can be used as a URI scheme and
query filter to enable the RESTful Linked Data SOS to serve aggregated data as
well.
        </p>
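        <p>To illustrate how an aggregated observation could be described with these elements, the following Python sketch builds the corresponding links as plain (subject, predicate, object) triples. It is a hedged illustration, not our implementation: the base URI, the identifier patterns, the period notation, and the ldm prefix for isAggregateOf are assumptions made for this example only.</p>
        <preformat>
```python
# Hypothetical base URI; no concrete identifiers are prescribed by the model.
BASE = "http://example.org/sos/"


def aggregated_observation_triples(aggregate_id, source_ids, process_id, period):
    """Build (subject, predicate, object) triples for one aggregated observation."""
    aggregate = BASE + "observations/" + aggregate_id
    triples = [
        # The sampling time of a temporal aggregate is a period, not an instant.
        (aggregate, "ldm:hasSamplingTime", period),
        # The aggregate points to an aggregation process instead of a device.
        (aggregate, "DUL:includesObject", BASE + "processes/" + process_id),
        # Spatial or temporal aggregation leaves the observed property unchanged.
        (aggregate, "ldm:aboutProperty", BASE + "properties/PM10"),
    ]
    # One isAggregateOf link per original observation (prefix is an assumption).
    for source_id in source_ids:
        triples.append(
            (aggregate, "ldm:isAggregateOf", BASE + "observations/" + source_id))
    return triples


def source_observations(triples, aggregate_uri):
    """Follow isAggregateOf links from an aggregate back to its sources."""
    return [o for s, p, o in triples
            if s == aggregate_uri and p == "ldm:isAggregateOf"]


# A daily mean derived from 24 hypothetical hourly observations.
hourly_ids = ["pm10-2011-05-08T%02d" % hour for hour in range(24)]
daily = aggregated_observation_triples(
    "pm10-2011-05-08-mean", hourly_ids, "daily-mean", "2011-05-08/P1D")
sources = source_observations(daily, BASE + "observations/pm10-2011-05-08-mean")
```
        </preformat>
        <p>Keeping one isAggregateOf link per source observation makes the finer-grained observations reachable from the aggregate, at the cost of one additional triple per source.</p>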
      </sec>
      <sec id="sec-3-2">
        <title>Effects on Links from and to Observations</title>
        <p>Aggregating linked observations affects the links from and to the observations.
Questions like 'Are the links to a feature of interest still valid if observations
taken at specific points are spatially aggregated to an area?' or 'Which new links
can be established after aggregation of an observation?' need to be answered.
First of all, the links which are defined in our observation ontology need to be
checked for consistency and changed if necessary<sup>14</sup>. Table 1 shows examples
of objects (i.e., link targets) of the links from observations before and after
aggregation of point observations to an area in space and a period in time.</p>
        <p>Independent of the concrete example, for each aggregation the target of the
ldm:hasSamplingTime link changes from an instant in time (original
observations) to a period in time (aggregated observations) if the observations are
aggregated temporally. Also, the DUL:includesObject link will always point from
the aggregated observation to an instance of an AggregationProcess instead of
pointing to a specific SensingDevice from the original observations. In
environmental applications, the ObservedProperty is usually a continuous phenomenon
which is sampled at certain locations in space or time, e.g., PM10 concentration.
If only spatial and/or temporal aggregations are applied, the ldm:aboutProperty
link remains the same. In the case of a thematic aggregation (see Section 2.1), the
ObservedProperty changes. An example is the blizzard as a combination of high
wind speed, heavy snowfall, and low temperatures: the ObservedProperty of the
blizzard observation is the phenomenon of the blizzard, whereas the original
observations point to the phenomena wind speed, snowfall, and surface
temperature. Similar examples could be constructed for landslides or forest fires.</p>
        <p>Though the ObservedProperty might be unchanged during an aggregation
process, the sso:isPropertyOf link changes if the observations are aggregated in
space. For example, aggregating the point measurements to an area causes the
FeatureOfInterest to change from a sampling point to an upper-level feature like
the area of the city of Münster. Finally, the sso:involves link points to an
aggregate computed during the aggregation process. In the original observations, the
sso:involves link pointed to the measurement values. Besides
changing the original links, additional links might be added pointing to the
aggregated observation. As introduced in our model, the isAggregateOf link points
from an aggregated observation to the original observations. Furthermore, other
observation collections might contain the aggregated observation, resulting in new
ldm:hasObservation links to the aggregated observation. Also, other higher-level
features like cities, administrative areas, etc. might be linked to the aggregated
observations.</p>
        <p><sup>14</sup> Here, changing links means that triples of the original observations might be removed
and replaced by other triples in the aggregated observations for the same relationship.
For example, the hasSamplingTime relationship usually links to a point in time in
the original observations, but to a time period in the aggregated observations, if the
observations are aggregated temporally.</p>
        <table-wrap>
          <label>Table 1</label>
          <table>
            <thead>
              <tr><th>Link in Ontology</th><th>Object Before Aggregation</th><th>Object After Aggregation</th></tr>
            </thead>
            <tbody>
              <tr><td>ldm:hasSamplingTime</td><td>TimeInstant (08/05/2011; 23:15 CEST)</td><td>TimePeriod (one day)</td></tr>
              <tr><td>DUL:includesObject</td><td>SensingDevice (air sampler)</td><td>AggregationProcess (block kriging of PM10 measures)</td></tr>
              <tr><td>ldm:aboutProperty</td><td>ObservedProperty (PM10)</td><td>ObservedProperty (PM10)</td></tr>
              <tr><td>sso:isPropertyOf</td><td>SamplingPoint (N 51°57.466' E 007°37.433')</td><td>GeospatialRegion (area of Münster)</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <p>Formalizing the changes of links during aggregation is challenging and often
domain-specific. However, we consider the identification and formalization of
such changes as crucial to provide a (semi-)automated aggregation of
observations in the Linked Data cloud in the future and are currently working on such a
formalization.</p>
      </sec>
      <sec id="sec-3-3">
        <title>Provenance in Aggregated Linked Observation Data</title>
        <p>In a Linked Data context where different communities might have an interest in
interlinking their datasets, it is important to publish trustworthy datasets.
Provenance information favors trustworthiness of data because users are able to
analyze the historic changes and reproduce them [24]. Especially when aggregating
observations in Linked Data, it is important to be able to retrieve information
about the original observations as well as the aggregation process that has been
applied. Figure 2 shows a provenance graph that illustrates how provenance
information about the aggregation process and original observations is provided
in our Linked Data model and how this can be mapped to the concepts of the
OPM and the Provenance Vocabulary. The reason to extend our model instead
of reusing an existing solution lies in the fact that most of the provenance
information needed for sensors and observations is already available; thus, we avoid
redundancy.</p>
        <p>First of all, the isAggregateOf relation allows tracing the aggregated
observations back to the original observations. Hence, it can be mapped to the
opmv:wasDerivedFrom relationship in the OPM. Though the isAggregateOf
relationship cannot be directly mapped to a relationship in the Provenance
Vocabulary, it basically provides the information that is provided by the prv:usedData
link from a prv:DataCreator to a prv:DataItem. Information about the
aggregation process that has created an aggregated observation is provided by the
DUL:includesObject link from the aggregated observation to the
AggregationProcess. This link can be mapped to the opmv:wasGeneratedBy relationship in
the OPM and to the prv:createdBy relationship of the Provenance Vocabulary.
The ldm:hasSamplingTime attribute provides a link to the time at which the
value represents a physical phenomenon in the world. In the case of observations
taken by a physical sensor, this corresponds to the time when the observation
has been taken. However, if the observations gathered by physical sensors are
aggregated by an AggregationProcess, the SamplingTime is the time period
for which the aggregate is valid. This is no longer the time when
the observation has been produced (time of aggregation). Thus, for aggregated
observations, an additional time link might be added providing this information.
Similarly, additional links might be provided for the opmv:wasControlledBy and
the opmv:used relationships of the OPM, which we did not yet include, as we
focus on retracing the observations and not on the users who are aggregating
or using the observations.</p>
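        <p>The mapping described above can be written down as a small lookup table. The following Python sketch is an illustration under stated assumptions rather than a definitive implementation: the triples, the ex: identifiers, the ldm prefix for isAggregateOf, and the function to_provenance are hypothetical, while the mapped relation names are those given in the text.</p>
        <preformat>
```python
# Model link -> (OPM relation, Provenance Vocabulary relation), as stated in the text.
# isAggregateOf has no direct Provenance Vocabulary counterpart, marked here as None.
PROVENANCE_MAPPING = {
    "ldm:isAggregateOf": ("opmv:wasDerivedFrom", None),
    "DUL:includesObject": ("opmv:wasGeneratedBy", "prv:createdBy"),
}


def to_provenance(triples, vocabulary="opm"):
    """Translate observation triples into provenance triples of one vocabulary."""
    index = 0 if vocabulary == "opm" else 1
    provenance = []
    for subject, predicate, obj in triples:
        mapped = PROVENANCE_MAPPING.get(predicate, (None, None))[index]
        if mapped is not None:
            provenance.append((subject, mapped, obj))
    return provenance


# Hypothetical aggregated observation with two source observations.
observation_triples = [
    ("ex:dailyMean", "ldm:isAggregateOf", "ex:hourly1"),
    ("ex:dailyMean", "ldm:isAggregateOf", "ex:hourly2"),
    ("ex:dailyMean", "DUL:includesObject", "ex:meanProcess"),
    ("ex:dailyMean", "ldm:hasSamplingTime", "ex:day"),
]

opm_triples = to_provenance(observation_triples, "opm")
prv_triples = to_provenance(observation_triples, "prv")
```
        </preformat>
        <p>Links without a provenance counterpart, such as ldm:hasSamplingTime in this sketch, are simply skipped, so the provenance view can be derived on demand instead of being stored alongside the observations.</p>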
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Discussion</title>
      <p>
        The presented research is in line with the theoretical challenges in Sensor Web
research, which were identified during an expert meeting in 2010 [25],
addressing the challenges of interoperability and integration of sensor-based
and model-based systems. Our extension of the SSO design pattern as described
in Section 3.1 allows exposing aggregated observations as Linked Sensor Data.
This goes beyond the approaches available for providing Linked Sensor Data
[<xref ref-type="bibr" rid="ref14 ref15 ref16">14,15,16</xref>], which focus on providing non-aggregated observations. In our
approach, we follow an observation-centric viewpoint: an aggregated
observation is still an observation about a quality in the world and thus can be
modeled as such. However, further discussion is needed on whether the aggregation
process can still be modeled as a sensor or has to be distinguished from the
concept of sensors.
      </p>
      <p>
        Our model also allows retracing the aggregated observations back to the
original observations and retrieving information about the aggregation process
applied, thus providing provenance information about the aggregated
observations (see Section 3.3). Instead of providing additional metadata as in other
approaches described in Section 2.3, we show how the provenance information
can be directly retrieved by using the links established in our Linked Data model
for (aggregated) observations. For example, Patni et al. [<xref ref-type="bibr" rid="ref10">10</xref>] present an approach
for provenance in Linked Sensor Data where a separate provenance ontology has
been defined. In contrast, we aim to avoid duplication of, for example,
information about which sensor has created an observation at what time. This
information is already contained in the existing sensor and observation ontologies.
We rather show how the relationships of the observation ontologies providing
this information can be mapped to relationships of well-established provenance
models like the OPM or the Provenance Vocabulary. In order to enable the
integration of observations in tools relying on these common provenance models, the
observations can either be translated to such models or additional triples
can be added to the observation set. However, both approaches introduce
redundant information, which might cause problems when dealing with large datasets,
which are common in the environmental sciences. As opposed to the general approach
for providing provenance information, e.g., about triples in the Linked Data
cloud [<xref ref-type="bibr" rid="ref20">20</xref>], we do not yet consider provenance information about the instances
of objects and links according to our Linked Data model, e.g., who has created
an observation triple in the Linked Sensor Data at which time. To provide such
information, we think that the general approaches for data provenance in the
Web can be utilized.
      </p>
      <p>Both sensor observations and aggregates provide estimates of physical
phenomena occurring in the world. As it is not possible to observe all relevant
aspects of reality, observations can only represent reality to a certain degree
and are thus uncertain. In studies that use observations,
it is crucial to account for this uncertainty. This is usually referred to as
uncertainty propagation [26]. Aggregation is one means of adjusting the uncertainty
in estimates. In general, the more the data are aggregated, the less uncertainty
remains in the aggregates. At the moment, we do not explicitly account for uncertainty in the
presented work. Investigations into how uncertainty can be propagated in
observation processing workflows on the Web are currently ongoing within the European
research project UncertWeb (http://www.uncertweb.org) [27]. We plan to adopt their approaches and add
the uncertainty to our Linked Data model.</p>
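      <p>As a simple illustration of why aggregation tends to reduce uncertainty,
consider the mean of n observations. The sketch below assumes that the
observation errors are independent and share the same standard deviation; under
that assumption the standard deviation of the mean shrinks with the square root
of n. The function name aggregated_sigma is ours, and correlated errors, which
are common in environmental data, would give a smaller reduction.</p>

```python
import math


def aggregated_sigma(sigma, n):
    """Standard deviation of the mean of n observations.

    Assumes independent errors with a common standard deviation `sigma`.
    """
    if n == 0:
        raise ValueError("at least one observation is required")
    return sigma / math.sqrt(n)


# Averaging four observations halves the uncertainty of a single one.
halved = aggregated_sigma(2.0, 4)
```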
      <p>While we provide the model for exposing aggregated observations as
Linked Data and discuss the effects of aggregation on links from and to observations
(Section 3.2), we have not yet addressed the technological aspect of
executing aggregation processes on Linked Sensor Data. However, we are
currently extending our Spatio-Temporal Aggregation Service to also
handle Linked Data serialized as RDF. This also raises the question of to what
degree observations should be aggregated before exposing them as Linked Data
in order to reduce the number of triples, or whether observations can or should be
provided at different aggregation levels as Linked Sensor Data. For example,
providing high-resolution sensor data as Linked Data might lead to a huge number
of triples, which might cause performance problems. Thus, it might be better to
aggregate the observations first and then expose them as Linked Data.</p>
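      <p>The trade-off can be made concrete with a small sketch of temporal
aggregation before publication. It is not the Spatio-Temporal Aggregation
Service; the function monthly_means, the sample values, and the figure of five
triples per observation are assumptions chosen only to show how the triple count
shrinks.</p>

```python
from collections import defaultdict
from statistics import mean


def monthly_means(observations):
    """Group (ISO timestamp, value) pairs by 'YYYY-MM' and average each group."""
    groups = defaultdict(list)
    for timestamp, value in observations:
        groups[timestamp[:7]].append(value)
    return {month: mean(values) for month, values in sorted(groups.items())}


# Assumed cost of one observation: result, time, sensor, property, feature.
TRIPLES_PER_OBSERVATION = 5

raw = [
    ("2011-01-01T00:00", 10.0),
    ("2011-01-02T00:00", 20.0),
    ("2011-02-01T00:00", 30.0),
    ("2011-02-15T00:00", 50.0),
]
aggregated = monthly_means(raw)

raw_triples = len(raw) * TRIPLES_PER_OBSERVATION
aggregated_triples = len(aggregated) * TRIPLES_PER_OBSERVATION
```

      <p>With real sensor data the reduction is far larger, since a month may
contain thousands of raw observations per sensor.</p>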
    </sec>
    <sec id="sec-5">
      <title>Conclusions and Outlook</title>
      <p>In this paper, we identify the need for spatial, temporal, and thematic
aggregations of sensor observations and for publishing them as Linked Data to ease
integration with other data sources. Aggregates of sensor observations (e.g., the
monthly average fine dust concentration in a city) are much easier to use
in applications than the raw observations. Providing such aggregated observations
as Linked Data facilitates their integration and enables their use across different
applications. We achieve this by: (1) extending the SSO ontology design pattern
to accommodate aggregation information, including concepts such as
AggregationProcedure or AggregationSensor; (2) describing how links from point
observations change after aggregation (e.g., the feature of interest may change from
a sampling point to a city area); (3) supporting provenance information in
the model by enabling traceability back to the original observations and introducing
relations such as isAggregationOf.</p>
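      <p>The following sketch illustrates points (2) and (3): an aggregated
observation receives a new feature of interest and keeps one isAggregationOf
link per original observation. Only isAggregationOf and the notion of an
aggregation procedure come from our model; the helper names, the ex: identifiers,
and the properties ex:procedure, ex:featureOfInterest, and ex:result are
hypothetical.</p>

```python
def build_aggregate(agg_id, source_ids, procedure_id, feature_id, value):
    """Return triples describing one aggregated observation and its origins."""
    triples = [
        (agg_id, "ex:procedure", procedure_id),        # an AggregationProcedure
        (agg_id, "ex:featureOfInterest", feature_id),  # e.g. city area, not sampling point
        (agg_id, "ex:result", value),
    ]
    # One isAggregationOf link per original observation keeps the result traceable.
    triples += [(agg_id, "isAggregationOf", source) for source in source_ids]
    return triples


def original_observations(triples, agg_id):
    """Follow isAggregationOf links from an aggregate back to its sources."""
    return [o for (s, p, o) in triples if s == agg_id and p == "isAggregationOf"]


triples = build_aggregate(
    "ex:agg1", ["ex:obs1", "ex:obs2"], "ex:monthlyMean", "ex:cityArea", 21.5
)
sources = original_observations(triples, "ex:agg1")
```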
      <p>Our future work will follow these lines. Aside from our ongoing
implementation work, we plan to explore the combination of the proposed approach with
event detection mechanisms and stream processing. To this end, we plan
to combine the extension of the SSO ontology pattern presented in this paper
with our previous work on sensor plug &amp; play [28]. In that work, we designed
a framework that enables the on-the-fly integration of sensors and Sensor Web
services by determining the semantic matching between sensor characteristics
and service requirements. This framework can also be put to use in on-stream
processing for the dynamic fusion of incoming data streams from multiple sensors
to produce aggregated observations. This is similar to approaches such as [29],
but also allows the creation of new, combined phenomena. A basic example
is the combination of temperature and conductivity data streams measured by
underwater sensors to derive a stream of salinity observations.</p>
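      <p>A minimal sketch of such a fusion step is given below, assuming two
time-ordered streams of (time, value) pairs that are joined on equal timestamps.
The function names and sample values are hypothetical, and pair_inputs is only a
placeholder: it does not compute salinity, and a real system would apply an
established conversion in its place.</p>

```python
def fuse_streams(left, right, derive):
    """Merge-join two time-ordered streams of (time, value) pairs."""
    left_it, right_it = iter(left), iter(right)
    a, b = next(left_it, None), next(right_it, None)
    while a is not None and b is not None:
        if a[0] == b[0]:
            # Matching timestamps: emit one derived observation.
            yield a[0], derive(a[1], b[1])
            a, b = next(left_it, None), next(right_it, None)
        elif b[0] > a[0]:
            a = next(left_it, None)   # left stream is behind; advance it
        else:
            b = next(right_it, None)  # right stream is behind; advance it


def pair_inputs(temperature, conductivity):
    """Placeholder only: stands in for a real salinity conversion."""
    return {"temperature": temperature, "conductivity": conductivity}


temperature = [(1, 10.0), (2, 10.5), (4, 11.0)]
conductivity = [(2, 3.5), (3, 3.6), (4, 3.7)]
fused = list(fuse_streams(temperature, conductivity, pair_inputs))
```

      <p>Because the join only ever holds one pending element per stream, it can
run on unbounded inputs; readings without a partner at the same timestamp are
dropped in this sketch.</p>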
      <p>Furthermore, we plan to extend our approach with
representations for uncertainty as described in the Uncertainty Markup Language [30].
Our provenance information currently covers the
aggregation procedure applied, its implementation, and the original observations
that have been used to derive the aggregated observation. In the future, it remains
to be explored how to add further provenance information about providers and
users of the (aggregated) observations.</p>
      <p>Our approach to aggregation in Linked Data also makes it possible to exploit the
semantics of the links and the objects. First of all, the changes to links described
in Section 3.2 can be translated into rules that check whether adding or removing
a link is allowed. In a next step, the process of adding and removing links
during aggregation of observations might be automated. Furthermore,
semantic reasoning can be used to decide whether a certain aggregation procedure
can be applied to a certain set of observations. Consider, e.g., a set of water
level measurements along rivers in Germany: these should not be interpolated to
the whole of Germany, and the semantics can be used to recommend appropriate or disallow
inappropriate aggregation processes. However, in order to realize such a system,
an ontology of aggregation processes is needed, which we consider to be
longer-term work. We hope that our approach as presented in this
paper will contribute towards such a semantically-enabled aggregation system.</p>
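      <p>Such a check could, in its simplest form, look as follows. The rule
table stands in for the ontology of aggregation processes that does not yet
exist; all identifiers in it (ex:SpatialInterpolation, ex:TemporalAverage,
ex:ContinuousField, ex:RiverNetwork) and the function may_apply are hypothetical
and serve only to illustrate the water level example.</p>

```python
# Assumed rule table: the kinds of feature each aggregation procedure suits.
ALLOWED_FEATURE_KINDS = {
    "ex:SpatialInterpolation": {"ex:ContinuousField"},
    "ex:TemporalAverage": {"ex:ContinuousField", "ex:RiverNetwork"},
}


def may_apply(procedure, feature_kind, rules=ALLOWED_FEATURE_KINDS):
    """Return True if the procedure is listed as suitable for the feature kind."""
    return feature_kind in rules.get(procedure, set())


# Water levels are bound to the river network, so interpolating them over a
# whole country is rejected, while averaging them over time is accepted.
interpolation_ok = may_apply("ex:SpatialInterpolation", "ex:RiverNetwork")
averaging_ok = may_apply("ex:TemporalAverage", "ex:RiverNetwork")
```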
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>The presented work was developed within the 52 North semantics
community (http://52north.org/semantics) and was partly funded by the European
projects UncertWeb (FP7-248488, http://www.uncertweb.org/), ENVIROFI
(FP7-284898, http://www.envirofi.eu/), and ENVISION (FP7-249170,
http://www.envision-project.eu/), and by the International Research
Training Group on Semantic Integration of Geospatial Information funded by the
DFG (German Research Foundation), GRK 1498. We are thankful for
discussions with members of the Munster Semantic Interoperability Lab (MUSIL), the
Spatial Data Infrastructures Unit of the Joint Research Centre (JRC) of the
European Commission, and colleagues from the W3C Semantic Sensor Network
Incubator Group.</p>
      <p>22. Liu, Y., Futrelle, J., Myers, J., Rodriguez, A., Kooper, R.: A provenance-aware
virtual sensor system using the open provenance model. In: 2010 International
Symposium on Collaborative Technologies and Systems (CTS). (2010) 330–339</p>
      <p>23. Lefort, L., Henson, C., Taylor, K., Barnaghi, P., Compton, M., Corcho, O.,
Castro, R., Graybeal, J., Herzog, A., Janowicz, K., Neuhaus, H., Nikolov, A., Page,
K.: Semantic Sensor Network XG Final Report. W3C Incubator Group Report
28 June 2011. Available at: http://www.w3.org/2005/Incubator/ssn/XGR-ssn/.
Technical report (2011)</p>
      <p>24. Boose, E.R., Ellison, A.M., Osterweil, L.J., Clarke, L.A., Podorozhny, R., Hadley,
J.L., Wise, A., Foster, D.R.: Ensuring reliable datasets for environmental models
and forecasts. Ecological Informatics 2(3) (2007) 237–247. Meta-information
systems and ontologies. A Special Feature from the 5th International Conference on
Ecological Informatics ISEI5, Santa Barbara, CA, Dec. 4-7, 2006 - Novel Concepts
of Ecological Data Management S.I.</p>
      <p>25. Schade, S., Craglia, M.: A future sensor web for the environment in europe. In:
Proceedings of the 24th International Conference on Informatics for Environmental
Protection - Enviroinfo2010. (2010)</p>
      <p>26. Heuvelink, G.: Error Propagation in Environmental Modelling with GIS. Taylor
&amp; Francis (1998)</p>
      <p>27. Pebesma, E., Cornford, D., Nativi, S., Stasch, C.: The uncertainty enabled model
web (uncertweb). In: Environmental Information Systems and Services -
Infrastructures and Platforms. (2010)</p>
      <p>28. Broring, A., Maue, P., Janowicz, K., Nust, D., Malewski, C.: Semantically-enabled
sensor plug &amp; play for the sensor web. Sensors 11(8) (2011) 7568–7605</p>
      <p>29. Madden, S., Franklin, M.J., Hellerstein, J.M., Hong, W.: TAG: a Tiny AGgregation
service for ad-hoc sensor networks. SIGOPS Operating Systems Review 36 (2001)
131–146</p>
      <p>30. Williams, M., Cornford, D., Bastin, L., Pebesma, E.: OGC 08-122r2:
Uncertainty Markup Language (UncertML) (2009)</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Larssen</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sluyter</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Helmis</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Criteria for EUROAIRNET - the EEA air quality monitoring and information network</article-title>
          .
          <source>Technical Report 12</source>
          ,
          European Environmental Agency (EEA) (
          <year>1999</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Horalek</surname>
          </string-name>
          , J., de Smet, P.,
          <string-name>
            <surname>de Leeuw</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Conkova</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Denby</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kurfuerst</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Methodological improvements on interpolating european air quality maps</article-title>
          .
          <source>etc/acc technical paper 2009/16. Technical report, EEA</source>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Janowicz</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          , Broring,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Stasch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            ,
            <surname>Schade</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            ,
            <surname>Everding</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            ,
            <surname>Llaves</surname>
          </string-name>
          ,
          <string-name>
            <surname>A.</surname>
          </string-name>
          :
          <article-title>A restful proxy and data model for linked sensor data</article-title>
          .
          <source>International Journal of Digital Earth</source>
          (
          <year>2011</year>
          ; accepted for publication)
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>Vega</given-names>
            <surname>Lopez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.F.</given-names>
            ,
            <surname>Snodgrass</surname>
          </string-name>
          , R.T.,
          <string-name>
            <surname>Moon</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Spatiotemporal aggregate computation: A survey</article-title>
          .
          <source>IEEE Trans. on Knowl. and Data Eng</source>
          .
          <volume>17</volume>
          (
          <issue>2</issue>
          ) (
          <year>2005</year>
          )
          <fpage>271</fpage>
          –
          <lpage>286</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Bierkens</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Finke</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Willingen</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Upscaling and Downscaling Methods for Environmental Research</article-title>
          . Kluwer Academic Publishers (
          <year>2000</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Stasch</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Autermann</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Foerster</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pebesma</surname>
          </string-name>
          , E.:
          <article-title>Towards a Spatiotemporal Aggregation Service in the Sensor Web. Poster Presentation</article-title>
          .
          <source>In: The 14th AGILE International Conference on Geographic Information Science</source>
          . (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Jeong</surname>
            ,
            <given-names>S.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fernandes</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paton</surname>
            ,
            <given-names>N.W.</given-names>
          </string-name>
          , Griffiths, T.:
          <article-title>A generic algorithmic framework for aggregation of spatio-temporal data</article-title>
          .
          <source>In: SSDBM '04: Proceedings of the 16th International Conference on Scientific and Statistical Database Management</source>
          , Washington, DC, USA, IEEE Computer Society (
          <year>2004</year>
          )
          <fpage>245</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Pebesma</surname>
            , E.J., de Kwaadsteniet,
            <given-names>J.W.</given-names>
          </string-name>
          :
          <article-title>Mapping groundwater quality in the netherlands</article-title>
          .
          <source>Journal of Hydrology</source>
          <volume>200</volume>
          (
          <issue>1-4</issue>
          ) (
          <year>1997</year>
          )
          <fpage>364</fpage>
          –
          <lpage>386</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Journel</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Huijbregts</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Mining geostatistics</article-title>
          . Academic Press (
          <year>1978</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Patni</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sahoo</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Henson</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sheth</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Provenance aware linked sensor data</article-title>
          . In Karger,
          <string-name>
            <given-names>P.</given-names>
            ,
            <surname>Olmedilla</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            ,
            <surname>Passant</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Polleres</surname>
          </string-name>
          , A., eds.
          <source>: Proceedings of the Second Workshop on Trust and Privacy on the Social and Semantic Web</source>
          . (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Bizer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Heath</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Berners-Lee</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Linked Data – The Story So Far</article-title>
          .
          <source>International Journal on Semantic Web and Information Systems</source>
          <volume>5</volume>
          (
          <issue>3</issue>
          ) (
          <year>2009</year>
          )
          <fpage>1</fpage>
          –
          <lpage>22</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Lassila</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Swick</surname>
          </string-name>
          , R.:
          <article-title>Resource description framework (rdf) model and syntax specification</article-title>
          . http://www.w3.org/tr/pr-rdf-syntax/.
          <source>Technical report, W3C</source>
          (
          <year>1999</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Prud'hommeaux</surname>
          </string-name>
          , E.,
          <string-name>
            <surname>Seaborne</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Sparql query language for rdf</article-title>
          .
          <source>w3c recommendation</source>
          , http://www.w3.org/tr/rdf-sparql-query/.
          <source>Technical report</source>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Page</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Roure</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Martinez</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sadler</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kit</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          :
          <article-title>Linked sensor data: Restfully serving rdf and gml</article-title>
          . In K.,
          <string-name>
            <given-names>T.</given-names>
            ,
            <surname>Ayyagari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>De Roure</surname>
          </string-name>
          , D., eds.
          <source>: Proceedings of the 2nd International Workshop on Semantic Sensor Networks (SSN09)</source>
          . Volume Vol-
          <volume>522</volume>
          .,
          <string-name>
            <surname>CEUR</surname>
          </string-name>
          (
          <year>2009</year>
          )
          <fpage>49</fpage>
          –
          <lpage>63</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Phuoc</surname>
            ,
            <given-names>D.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hauswirth</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Linked open data in sensor data mashups</article-title>
          . In Kerry Taylor, A.
          <string-name>
            <surname>A.D.D.R</surname>
          </string-name>
          ., ed.
          <source>: Proceedings of the 2nd International Workshop on Semantic Sensor Networks (SSN09)</source>
          . Volume Vol-
          <volume>522</volume>
          .,
          <string-name>
            <surname>CEUR</surname>
          </string-name>
          (
          <year>2009</year>
          )
          <fpage>1</fpage>
          –
          <lpage>16</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Patni</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Henson</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sheth</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Linked sensor data</article-title>
          .
          <source>In: 2010 International Symposium on Collaborative Technologies and Systems</source>
          , IEEE (
          <year>2010</year>
          )
          <fpage>362</fpage>
          –
          <lpage>370</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Na</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Priest</surname>
            ,
            <given-names>M.:</given-names>
          </string-name>
          <article-title>OGC Implementation Specification 06-009r6: OpenGIS Sensor Observation Service (SOS)</article-title>
          .
          <source>Open Geospatial Consortium</source>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Janowicz</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Compton</surname>
            ,
            <given-names>M.:</given-names>
          </string-name>
          <article-title>The stimulus-sensor-observation ontology design pattern and its integration into the semantic sensor network ontology</article-title>
          . In Ayyagari,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Roure</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.D.</given-names>
            ,
            <surname>Taylor</surname>
          </string-name>
          , K., eds.
          <source>: Proceedings of the 3rd International Workshop on Semantic Sensor Networks 2010 (SSN10) in conjunction with the 9th International Semantic Web Conference (ISWC 2010)</source>
          . Volume
          <volume>668</volume>
          ., CEUR-WS (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Zhao</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bizer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gil</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Missier</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sahoo</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Provenance requirements for the next version of rdf</article-title>
          . In: W3C Workshop RDF Next Steps. (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Hartig</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhao</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Publishing and consuming provenance metadata on the web of linked data</article-title>
          .
          <source>In: Proc. of 3rd Int. Provenance and Annotation Workshop</source>
          . (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Park</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Heidemann</surname>
            ,
            <given-names>J.:</given-names>
          </string-name>
          <article-title>Provenance in sensornet republishing</article-title>
          .
          <source>In: Proceedings of the 2nd International Provenance and Annotation Workshop</source>
          , Salt Lake City, Utah, USA, Springer-Verlag (
          <year>June 2008</year>
          )
          <fpage>208</fpage>
          –
          <lpage>292</lpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>