=Paper= {{Paper |id=None |storemode=property |title=MappingSets for Spatial Observation Data Warehouses |pdfUrl=https://ceur-ws.org/Vol-1075/06.pdf |volume=Vol-1075 |dblpUrl=https://dblp.org/rec/conf/immoa/ViqueiraMVT13 }} ==MappingSets for Spatial Observation Data Warehouses== https://ceur-ws.org/Vol-1075/06.pdf
     MappingSets for Spatial Observation Data Warehouses

                                 José R.R. Viqueira                      Sebastián Villarroya
                                 COGRADE - CITIUS                          COGRADE - CITIUS
                             Universidade de Santiago de               Universidade de Santiago de
                                  Compostela, Spain                         Compostela, Spain
                                jrr.viqueira@usc.es                    sebastian.villarroya@usc.es
                                   David Martı́nez                         José A. Taboada
                                 COGRADE - CITIUS                          COGRADE - CITIUS
                             Universidade de Santiago de               Universidade de Santiago de
                                  Compostela, Spain                         Compostela, Spain
                               david.martinez.casas                    joseangel.taboada@usc.es
                                     @usc.es

ABSTRACT                                                                quency. The latter are started by some external event at
The amount of time evolving spatial data that is currently              any moment in time.
being generated by automatic observation processes is huge.                Observation data has an inherent temporal nature. Be-
In general, observation data consists of both heterogeneous             sides, in many cases FOIs are also spatial. Therefore, sys-
spatio-temporal data and relevant observation metadata.                 tems devoted to observation data analysis should cope with
The former includes data of Spatial Entities (cities, roads,            spatial and spatio-temporal data analysis. In particular,
vehicles, etc.) and data of temporal evolution of both prop-            they should support relevant functionality for the manage-
erties of Spatial Entities (population of a city, position of           ment of Spatial Entities and Spatial Coverages, and their
a vehicle, etc.) and properties of space (temperature, el-              evolution with respect to time [9, 20, 6]. Spatial Entities are
evation, etc.). Real uniform integrated management of all               entities of a given application domain that have geometric
these types of data is still not achieved by current models             valued properties (rivers, municipalities, cities, etc.). Spatial
and systems. The present paper describes the design of a                Coverages are sets of functions with a common spatial do-
data modeling and management framework for observation                  main that describe the continuous or discrete variation over
data warehouses. A hybrid logical-functional data model                 space of some specific phenomenon (temperature, humidity,
based on the concept of MappingSet and relevant language                elevation above sea level, etc.).
enables the specification of spatio-temporal analytical pro-               The amount of data that is currently being obtained from
cesses. The framework is currently being implemented.                   automatic observation processes is huge and the estimated
                                                                        tendency is to have an exponential growth during the up-
                                                                        coming years. The analysis of all these data to support
1.   INTRODUCTION                                                       appropriate decision making is a key challenge for future in-
   According to [16], properties of entities (called Features           formation systems. Many application domains exist that
of Interest - FOI) are either exact values assigned by some             would benefit from innovative technologies in this area, in-
authority (names, prices, geometry of a municipality, etc.)             cluding environmental observation and monitoring, natural
or estimated by some observation process (height, classifica-           disaster management, e-health, etc.
tion, color, etc.). Observation processes may be classified in             Based on the above, in the present paper a data modeling
various different ways [15]. Physical Processes produce their           and management solution is proposed that enables spatio-
data in some spatial context. They are usually hardware                 temporal analysis in data warehouses of observation data. In
sensing devices that perform measurements either locally or             particular, a proposed E-R extension enables the insertion
remotely. Besides, they may be installed in either static or            of observation metadata in spatial models at a conceptual
mobile platforms. Non-Physical Processes are computations               level. At a logical level, a new data model based on Map-
that may be defined in some mathematical way. Any pro-                  pingSets enables the integrated management of any kind of
cess may be either Time-triggered or Event-triggered. The               spatial and temporal data. A MappingSet is a collection of
former produce their results at some predefined time fre-               Mappings, in the functional programming sense, defined on
                                                                        a common domain. Both Spatial Entities and Spatial Cov-
                                                                        erages and both Time-triggered and Event-triggered obser-
                                                                        vation data are modeled uniformly with MappingSets.
                                                                           The remainder of this paper is organized as follows. Sec-
                                                                        tion 2 describes other pieces of work related to the pro-
                                                                        posed solution. The MappingSet based spatio-temporal log-
                                                                        ical model is described in Section 3. The conceptual level
                                                                        E-R extension for observation data is described in Section
                                                                        4, as is its translation to the MappingSet based logi-
                                                                        cal model. Section 5 illustrates the spatio-temporal analysis


         Proceedings IMMoA’13                                    601         http://www.dbis.rwth-aachen.de/IMMoA2013/
capabilities of the model for the definition of Non-Physical           on those input streams. Continuous query languages [1, 11]
spatio-temporal analytical processes. Finally, Section 6 con-          enable the definition of those continuous queries on both
cludes the paper and outlines lines of future work.                    data streams and recorded relations. Operations to create
                                                                       relations from streams and streams from relations are at the
2.   RELATED WORK                                                      core of those languages. A similar approach is followed by
                                                                       some languages specifically designed to access sensor net-
   The OGC defines an abstract specification of a data model
                                                                       works [13, 7]. It is important to notice that spatial data,
for Observations and Measurements [16] in a Geographic
                                                                       including spatial entities and spatial coverages and spatial
Information context. Various types of observations are sup-
                                                                       analysis are not explicitly supported in these solutions.
ported, according to the data type of their values. Simple
observations include: i) measurements that combine a value
of a real type with a unit of measure, ii) categories whose            3.   SPATIO-TEMPORAL MAPPINGSET
results are items of enumerated types, iii) counts of inte-                 BASED DATA MANAGEMENT
ger types, iv) truth observations of boolean type, v) time                This section introduces the MappingSet based data model
observations and vi) geometric observations. Complex ob-               that is the basis of the proposed framework. Temporal and
servations are record structures that combine various simple           spatial data types are first defined. Based on them Mappings
observation types. Metadata of each observation is also rep-           and MappingSets are next formalized. Data management
resented in the model. In particular, each observation refer-          will be based on the intensional definition of MappingSets
ences its observation Process, the observed Property and its           using both logical and functional paradigms.
related FOI, the time instant when the observation applies                Conventional data types include Boolean, CString (vari-
to the FOI observed property (phenomenon time) and the                 able size character strings), Int16, Int32, Int64 (integers),
time instant when the Process obtained the result value (re-           Float32, Float64 (reals with floating point representation).
sult time). Notice for example that if a sample of water is            Fixed point parametric type Numeric(P,D) consists of real
obtained from a river and next analyzed in a laboratory two            numbers with a maximum of P digits, D of them are in the
different observation time instants are involved. Optionally,          fractional part. In order to define temporal and spatial data
other metadata, parameters, data quality information and               types, 1D and 2D samplings are first formalized. Let R and
observation context may also be provided.                              I denote the set of real and integer numbers, respectively,
   In [4] a conceptual model to represent observation data se-         then 1D and 2D samplings are defined as follows.
mantics is defined. Annotating the conventional data mod-
els of available heterogeneous datasets with observation and             Definition 1. A 1D-sampling S with resolution r ∈ R
measurement conceptual constructs enables their integra-               and phase p ∈ R is defined as the infinite subset of R
tion at a semantic level. Integrated query of heterogeneous              {x|x = i · r + p, ∀i ∈ I}
observation datasets becomes therefore possible after the an-
notation process. A similar approach is followed by the E-R               Definition 2. Let vr1 , vr2 , vp1 and vp2 be four vectors
extension proposed in the present paper.                               in R2 defined by respective directions D1 , D2 , D1 , D2 ∈
   Observation data always has a temporal nature. Besides,             (−π, π] and respective magnitudes r1 , r2 , p1 and p2 . A 2D-
the spatial component of observation data and metadata is              sampling S with directions D1 , D2 , resolutions r1 , r2 and
central to many application domains, such as those related             phases p1 ∈ [−r1 /2, r1 /2], p2 ∈ [−r2 /2, r2 /2] is defined as
to environmental observation and monitoring. Spatial and               the infinite subset of R2
temporal extensions of conceptual and logical data models                {(x, y) ∈ R2 |
have to be considered. Examples of spatio-temporal concep-                  x = (i1 r1 + p1 ) cos(D1 ) + (i2 r2 + p2 ) cos(D2 )∧
tual models are [18, 17]. Relational and object-relational                  y = (i1 r1 + p1 ) sin(D1 ) + (i2 r2 + p2 ) sin(D2 ),
spatio-temporal extensions are defined in the area of Spa-                  ∀i1 , i2 ∈ I}
tial Databases [9, 20] to support spatial entity management.
                                                                          An element s of a 1D-sampling (2D-sampling) S is called a
Field [6] and array algebras [3] are behind spatial cover-
                                                                       1D-sample (2D-sample). Integer i, i1 , i2 are called the sam-
age and array management systems [14, 5, 2]. Integrated
                                                                       pling coordinates of s. s(i), s(i1 , i2 ) denote respectively the
management of spatial entities and coverages is also an objec-
                                                                       1D-sample and 2D-sample with sampling coordinates i and
tive of some approaches [19, 12] that incorporate different
                                                                       (i1 , i2 ). Figure 1 illustrates the above definitions with a
structures for those data types. Integrated management of
                                                                       geometrical representation.
entities and coverages in a uniform manner is achieved by
the MappingSet data model proposed in the present paper.                  Definition 3. TimeInstant(D) is defined as a finite sub-
   Various different data management approaches are pos-               set of elements s(i) of a 1D-sampling S with resolution 10^{-D}
sible to deal with spatio-temporal observation data auto-              and phase 0 such that
matically generated by sensing devices. If we consider the               −2^{63} < i < 2^{63} + 1
data generated by each sensor as a virtual temporal rela-              where each s(i) is interpreted as the time instant 1/1/1970+
tion, then the simplest approach is to consider Materialized           s(i) seconds. Maximum allowed D is 6 (microsecond).
Views of such virtual relations. Automatic maintenance of
such views on the arrival of new data from sensors has to                 Definition 4. TimeInstantSample(D, R) is defined as
be solved by the system [8]. Automatically updating these              a finite subset of elements s(i) of a 1D-sampling S with
views through Extraction Transformation and Load (ETL)                 resolution R · 10^{-D} and phase (R · 10^{-D})/2 such that
processes on sensor data is the approach followed by the                 −2^{63} < i < 2^{63} + 1
present framework.                                                     where each s(i) is interpreted as the time interval [1/1/1970+
   A more sophisticated solution is to consider sensor data            s(i) seconds, 1/1/1970 + (s(i) + R · 10^{-D}) seconds). Again,
streams and to enable the continuous execution of queries              maximum allowed D is 6 (microsecond).
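
The sampling definitions above (Definitions 1, 2 and 4) can be sketched in a few lines of Python. This is an illustrative sketch only, not part of the proposed framework: the function names `sample_1d`, `sample_2d` and `time_instant_sample_interval` are hypothetical, and `Fraction` is used merely to keep the 1D arithmetic exact.

```python
from fractions import Fraction
from math import cos, sin, pi, isclose

def sample_1d(i, r, p):
    """1D-sample s(i) = i*r + p of a sampling with resolution r and phase p (Definition 1)."""
    return i * r + p

def sample_2d(i1, i2, r1, r2, p1, p2, d1, d2):
    """2D-sample s(i1, i2) for directions d1, d2 given in radians (Definition 2)."""
    a = i1 * r1 + p1
    b = i2 * r2 + p2
    return (a * cos(d1) + b * cos(d2), a * sin(d1) + b * sin(d2))

def time_instant_sample_interval(i, d, r):
    """Half-open interval [start, end), in seconds since 1/1/1970, denoted by
    sample i of TimeInstantSample(d, r), read literally from Definition 4."""
    res = Fraction(r, 10 ** d)            # resolution R * 10^-D
    start = sample_1d(i, res, res / 2)    # phase is half the resolution
    return (start, start + res)

# Sample 0 of TimeInstantSample(2, 3042): resolution 30.42 s, phase 15.21 s
first = time_instant_sample_interval(0, 2, 3042)

# Sample (3, 2) of an axis-aligned 2D sampling with resolution 0.5 and phase 0
x, y = sample_2d(3, 2, 0.5, 0.5, 0.0, 0.0, 0.0, pi / 2)
```

With directions 0 and π/2 the 2D sampling reduces to a regular axis-aligned grid, which is the configuration that Point2D uses.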


   Figure 1: Illustration of 1D and 2D samplings. (a) 1D Sampling: samples s(-2) to s(2), with phase p and resolution r. (b) 2D Sampling: samples s(0,0), s(1,0), s(0,1) and s(1,1), with vectors vr1, vr2, vp1 and vp2.

                                                                                        • LineString(S): Vector polylines defined by sequences
                                                                                          of elements of S.

                                                                                        • Polygon(S): Vector polygons, possibly with holes, whose
                                                                                          borders are defined by sequences of elements of S.

                                                                                        • GeometryCollection(S): Heterogeneous collections
                                                                                          of Geometries.

                                                                                        • MultiPoint(S): Homogeneous collection of elements
                                                                                          of S.

                                                                                        • MultiLineString(S): Homogeneous collection of ele-
                                                                                          ments of LineString(S).

                                                                                        • MultiPolygon(S): Homogeneous collection of elements
                                                                                          of Polygon(S).

                                                                                        Definition 8. If ADT1 , ADT2 , . . . , ADTn are not necessar-
                                                                                     ily distinct data types, A1 , A2 , . . . , An are distinct names
                                                                                     and RDT is a data type, then:

                                                                                       1. A Mapping with signature M () : RDT is defined as
                                                                                          a value of type RDT
                                                                                       2. A Mapping with signature
                                                                                          M (A1 : ADT1 , A2 : ADT2 , . . . , An : ADTn ) : RDT
   Definition 5. Date is defined as a shorthand for TimeIn-                               is defined as a partial function
stantSample(0, 86400).                                                                    M : ADT1 × ADT2 × . . . × ADTn → RDT
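
Definition 8 may be illustrated with a small Python sketch. It is only one assumed way of representing a Mapping, with a dictionary standing in for the partial function. The class name `Mapping`, the example Mappings `threshold` and `temp`, and all the values are hypothetical and chosen purely for illustration.

```python
class Mapping:
    """A Mapping M(A1: ADT1, ..., An: ADTn): RDT represented as a partial function.

    Arguments are passed by name. Argument combinations that were never
    defined yield None, which models the 'partial' part of the definition.
    """

    def __init__(self, arg_names, table=None, value=None):
        self.arg_names = tuple(arg_names)
        self.table = dict(table or {})
        self.value = value  # used only by zero-argument Mappings

    def __call__(self, **kwargs):
        if not self.arg_names:
            # Signature M(): RDT is simply a value of type RDT
            return self.value
        key = tuple(kwargs[name] for name in self.arg_names)
        return self.table.get(key)  # None where the function is undefined


# Zero-argument Mapping: a plain value (illustrative)
threshold = Mapping([], value=25.0)

# Two-argument Mapping defined on only two argument combinations (illustrative)
temp = Mapping(["station", "day"], {("A", 1): 14.5, ("A", 2): 15.0})
```

Calling `temp(station="B", day=1)` returns `None` because the function is undefined there. A real implementation would also need the implicit castings between resolutions that the text describes, which this sketch omits.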

   Definition 6. Point2D(P,D) is defined as the finite sub-                             Operations are syntactic sugar for Mappings. Implicit
set of elements s(i1 , i2 ) of a 2D-sampling S with directions                       castings between compatible data types are applied during
D1 = 0 and D2 = π/2, resolutions R1 = R2 = 10^{-D} and                               Mapping invocations, enabling transparent transformation
phases Ph1 = Ph2 = 0 such that                                                       between temporal and spatial elements of different resolu-
  −10^{P} < i1 , i2 < 10^{P}                                                         tions by applying constant interpolation. Various primitive
                                                                                     mappings and operations are provided by the model. How-
   Definition 7. Point2DSample(P,D,R) is defined as the
                                                                                     ever, formalizing a complete set of them is out of the scope
finite subset of elements s(i1 , i2 ) of a 2D-sampling S with im-
                                                                                     of the paper. Informal descriptions of required primitive
plementation dependent directions D1 and D2 , resolutions
                                                                                     mappings will be given throughout the paper.
r1 = r2 = K · R · 10^{-D} and phases p1 = p2 = 0 such that
                                                                                        A MappingSet is simply a set of Mappings that share
  1. −10^{P} < i1 , i2 < 10^{P}                                                      a common domain defined as an n-ary relation over data
  2. K < max(|cos((D2 − D1)/2)|, |sin((D2 − D1)/2)|)                                 types. The formal definition is given below.
                                                                                        Definition 9. Let C1 , C2 , . . . , Cn be distinct names, ADT1 ,
  TimeInstant and Point2D data types provide discrete rep-
                                                                                     ADT2 , . . ., ADTn be not necessarily distinct data types
resentations for both time and space, where the user has con-
                                                                                     and RDT1 , RDT2 , . . ., RDTm be not necessarily distinct
trol over the supported precision. Types TimeInstantSam-
                                                                                     data types. Let also D be a n-ary relation with scheme
ple and Point2DSample provide representations for temporal
                                                                                     D(C1 : ADT1 , C2 : ADT2 , . . ., Cn : ADTn ) defined as
and spatial samplings at user defined resolution. Note
                                                                                     a finite subset of ADT1 × ADT2 × . . . × ADTn . Then
that each time instant is approximated by its closest lower
                                                                                     a MappingSet is defined in either of the three following
TimeInstantSample, whereas each 2D point is approximated
                                                                                     forms:
by its closest Point2DSample. It is out of the scope of this
paper to demonstrate that the K factor above ensures that any                          1. A 1-tuple MS = ⟨D⟩.
2D point is approximated by a sample at a distance lower than or
equal to R · 10^{-D}. Type castings are available for the above                        2. An m-tuple MS = ⟨M1 , M2 , . . . , Mm⟩, where each Mi
data types.                                                                               is a Mapping with signature Mi () : RDTi defined as a
  If T is either a numeric or temporal type, then data type                               value of RDTi .
Interval(T) is a new data type whose values are closed in-
tervals over data type T. If t1 , t2 are two elements of data                          3. An (m+1)-tuple MS = ⟨D, M1 , M2 , . . . , Mm⟩, where
type T, then [t1 , t2 ] is used to denote the relevant closed                             each Mi is a Mapping with signature Mi (C1 : ADT1 , C2 :
interval. Similarly, if S is a spatial data type then the follow-                         ADT2 , . . . , Cn : ADTn ) : RDTi defined as a partial
ing geometric data types are also supported, based on the                                 function Mi : ADT1 × ADT2 × . . . × ADTn → RDTi .
standard specification given by [10].
                                                                                       The evolution with respect to time of spatial entities and
   • Geometry(S): Abstract type. Represents any vector                               spatial coverages may be modeled with appropriate Map-
     geometry or set of geometries defined with elements of                          pingSets that contain both Domain and Mappings. n-ary
     S.                                                                              relationships are also modeled with MappingSets, usually


without Mappings. MappingSets without Domain are also
useful to record short collections of key-value pairs that are
common in the specification of configuration settings.
   The Domains and Mappings of a MappingSet may be de-
fined either extensionally or intensionally. If an extensional
definition of the Domain is given, then both extensional
and intensional definitions of Mappings are allowed. On
the other hand, an intensional definition of the Domain may
only be accompanied by intensional definitions of Mappings.
Generally, an extensional definition is a sequence of all the
elements of Domain and Mappings in some specific order.
Both row-wise and column-wise orderings may be used. It
is even possible to combine row-wise and column-wise orders for
different components and Mappings. If the data type of
a Domain component is some integer or sampling data
type, then its extensional definition might be given in the            Figure 2: E-R Diagram Notation for Spatial and
form of a collection of sequence definitions. In general, a            Observation Data. Panels: (a) Spatial Entities,
sequence definition has a start element, a size and a step.            (b) Spatial Coverages, (c) Observed Entities,
For example, for an integer data type, a sequence start-               (d) Observed Relationships, (e) Observed Properties.
ing at 5, with size 4 and step 2 describes the following list
< 5, 7, 9, 11 >. For a TimeInstantSample data type, a se-
quence starting at “2013-05-02 15:00:45.06”, with                      by the system including both statistical and rank functions.
size 2 and step 30.42 describes the following sequence of              MappingSet domains may also be intensionally defined.
type TimeInstantSample(2, 3042): <“2013-05-02 15:00:20.22”,
“2013-05-02 15:00:50.64”>.¹ For Point2DSample data
types, the starting element is of type Point2D and the step has to
                                                                       Intensional Domain. Let e be a functional expression that
                                                                       yields a value s of either Interval(T) or Geometry(S) data
be given by two pairs (direction, resolution).
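
The extensional definition mechanism described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the framework's implementation: `sequence` and the `MappingSet` class are hypothetical names, only integer sequence definitions are covered, and the Mapping `square` is invented for the example.

```python
def sequence(start, size, step):
    """Expand a sequence definition (start element, size, step) into a list."""
    return [start + k * step for k in range(size)]


class MappingSet:
    """Sketch of a MappingSet with both Domain and Mappings.

    The Domain is a finite set of tuples and each Mapping is given
    extensionally as a dictionary keyed by Domain tuples.
    """

    def __init__(self, components, domain, mappings=None):
        self.components = tuple(components)
        self.domain = set(domain)
        self.mappings = dict(mappings or {})

    def rows(self):
        """Row-wise view: each Domain tuple followed by its Mapping values."""
        names = sorted(self.mappings)
        return [key + tuple(self.mappings[n].get(key) for n in names)
                for key in sorted(self.domain)]


# The integer example from the text: start 5, size 4, step 2
values = sequence(5, 4, 2)

ms = MappingSet(
    components=["n"],
    domain=[(n,) for n in values],
    mappings={"square": {(n,): n * n for n in values}},
)
```

Here the Domain is stored row-wise as tuples, while each Mapping is stored separately, in the spirit of the combined row-wise and column-wise orderings mentioned in the text.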
                                                                       type, whose base type T, S is either some integer type or
   Spatio-temporal analysis is enabled through the inten-
                                                                       some sample type. Then, SAMPLING(e) yields all the el-
sional definition of Mappings and MappingSets. Mappings
                                                                       ements of type T or S contained in s. Based on this, the
may be intensionally defined with functional, conditional
                                                                       domain D of a MappingSet M may be defined by an expres-
and aggregate expressions.
                                                                       sion of the form
                                                                          {(e1 , e2 , . . . , en )|P }
Functional expression. A Mapping M with signature M(D):                where P is a domain relational calculus predicate and each ei
DT may be defined by an expression of the form                         is either a functional expression or an expression of the form
  M(D) := e                                                            SAMPLING(e), where e is also a functional expression.
where e is a functional expression of data type DT that may            Expressions e and ei may include variable names bound
include variables referencing components of D, mappings,               to MappingSet domain components in P. Given that nested
operations, constants and castings.                                    structures are not allowed in the model, if an expression
                                                                       SAMPLING(e) is used then the result relation has to be
Conditional Expression. A Mapping M with signature                     unnested.
M(D): DT may be defined by an expression of the form
    M(D) := CASE b1 THEN e1
               CASE b2 THEN e2
                                                                       4.     MODELING OBSERVATION DATA WARE-
               ...                                                            HOUSES
               CASE bn THEN en                                            The data model described in this section captures observa-
               [OTHERWISE en+1 ]                                       tion data semantics and integrates them with spatial entities
where each bi is a functional expression that yields a value           and coverages. An E-R extension is proposed in Subsection
of Boolean type and each ei is a functional expression that            4.1 to model observation metadata. The translation of such
yields a value of type DT. The semantics are the obvious               a conceptual model to the MappingSet based logical model
ones.                                                                  is explained in Subsection 4.2.
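
Functional and conditional expressions can be mimicked with ordinary Python callables. The sketch below is illustrative only. The helper names `functional` and `conditional` are hypothetical, and it assumes that the "obvious" semantics of CASE is that the first case whose condition holds wins, which the text does not state explicitly. The Mappings `fahrenheit` and `comfort` and their thresholds are invented for the example.

```python
def functional(expr):
    """M(D) := e, where e is any Python callable over the domain components."""
    return expr


def conditional(cases, otherwise=None):
    """M(D) := CASE b1 THEN e1 ... CASE bn THEN en [OTHERWISE e_{n+1}].

    `cases` is a list of (condition, expression) pairs of callables.
    Assumption: cases are tried in order and the first true condition wins.
    """
    def mapping(**components):
        for condition, expression in cases:
            if condition(**components):
                return expression(**components)
        if otherwise is not None:
            return otherwise(**components)
        return None  # undefined here: the Mapping is a partial function
    return mapping


# Functional expression over a single domain component `temp`
fahrenheit = functional(lambda temp: temp * 9 / 5 + 32)

# Conditional expression with two cases and an OTHERWISE branch
comfort = conditional(
    cases=[
        (lambda temp: temp < 10, lambda temp: "cold"),
        (lambda temp: temp < 25, lambda temp: "mild"),
    ],
    otherwise=lambda temp: "warm",
)
```

When the optional OTHERWISE branch is omitted and no case applies, the sketch returns `None`, which is consistent with Mappings being partial functions.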

Aggregate Expression. A Mapping M with signature M(D):                 4.1      Conceptual Data Model
DT may be defined by an expression of the form                            Contrary to conventional metadata that is recorded at the
    M(D) := agg_e                                                      level of entity and property types, some observation meta-
               OVER {P}                                                data has to be recorded at the level of entity and property
where P is a domain relational calculus predicate and agg_e            instances, i.e., combined with the data itself. This is the
is a functional expression where variables bound to Map-               case, for example, of observation time instants and observa-
pingSet domains in P must be used as arguments of aggre-               tion processes.
gate functions. Various aggregate functions are provided                  An extension of the E-R model is next proposed to incor-
                                                                       porate spatial and observation data semantics in conceptual
¹ Note that the start instant of the sequence is automati-             models. Spatial Entity types are represented in diagrams
cally adapted to match the underlying time representation              as conventional entities (see Figure 2(a)). Spatial Coverage
for type TimeInstantSample(2, 3042).                                   Types are represented as entities tagged with the symbol
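
Aggregate expressions and intensional domains can be sketched in the same spirit. Everything below is an assumption made for illustration: `sampling` handles only closed integer intervals, the `readings` values are synthetic, and `block_mean` is a hypothetical Mapping that averages readings over fixed-size blocks of hours.

```python
from statistics import mean


def sampling(interval):
    """SAMPLING over a closed integer interval [lo, hi]: all integers it contains."""
    lo, hi = interval
    return list(range(lo, hi + 1))


# Hypothetical source Mapping: one synthetic reading per hour, keyed by (hour,)
readings = {(h,): float(h % 5) for h in range(12)}

# Intensional domain {(SAMPLING([0, 2]))}, unnested into one tuple per element
blocks = {(b,) for b in sampling((0, 2))}


def block_mean(b, size=4):
    """Aggregate expression: mean of readings OVER {h | b*size <= h < (b+1)*size}."""
    return mean(v for (h,), v in readings.items()
                if b * size <= h < (b + 1) * size)


# Evaluate the aggregate Mapping on every element of the intensional domain
result = {key: block_mean(*key) for key in sorted(blocks)}
```

The set comprehension plays the role of the expression {(e1, ..., en) | P}, and the generator condition inside `block_mean` plays the role of the predicate P after OVER.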


[Figure 3: E-R Diagram of a Running Application Example. Entity and coverage types: Species, Vessel, ICES, SST; relationship types: catches, capacity, Effort, enters.]

[Figure 4: E-R Diagram of the Frameworks Catalog. Entity types: MappingSet, DomComponent, Mapping, FOI, ObsFOI, NonObsFOI, ObsProperty, ProcessType, Process.]
C (see Figure 2(b)). Entity Types, either spatial or not, and Coverages whose whole data is obtained through an observation Process are tagged with either symbol TO if it is a Time-triggered Process or symbol EO if it is an Event-triggered Process (see Figure 2(c)). Relationships resulting from observation processes are tagged in the same way (see Figure 2(d)). Finally, properties of either Entity or Coverage types that are obtained through observation processes are also tagged with the same TO and EO symbols, as shown in Figure 2(e) for simple, complex and multivalued properties.
   To illustrate the use of the above notation, the E-R diagram of a reduced running application example is given in Figure 3. Spatial Entity Type ICES records fishing zones defined by the International Council for the Exploration of the Sea (ICES). Spatial Coverage SST records Sea Surface Temperature at each location of the sea, produced daily by the Moderate Resolution Imaging Spectroradiometer (MODIS) sensor installed in the Terra and Aqua NASA satellites. Entity type Vessel records data of fishing vessels, including an identifier (vesId) and its engine power. Vessels incorporate CTD sensors that provide triples of water conductivity, water temperature and depth. Every time a ctd observation is performed, a Non-Physical Process is executed that computes the difference with the value given by MODIS and provides it as a derived property difTemp. Vessels also incorporate GPS sensors from which locations are obtained every 30 seconds. Entity type Species records data of fishing species, including an identifier speId, the species name and an interval of temperature values where the fish feels comfortable (property comfort). The derived property comfGeo records the geometry of the area of the sea where comfortable temperatures for the fish are located. This property is obtained by a Non-Physical Process from the SST data. Property load of relationship catches records the values measured by the vessel bascule for each species. The authorized fishing capacity of a vessel is given by two parameters. The Fishing Effort gives a measure of the number of days, weighted by the vessel engine power, that the vessel may stay in each zone. Relationship Effort records both the initial quota and the available one (property stock). Available Fishing Effort stock is obtained by a Non-Physical Process using the quota and the vessel's GPS information. The Fishing Capacity gives the kilograms of each species that the vessel may get from each zone. Again, quota is recorded and stock is computed by a Non-Physical Process.
   The translation of the above model to the MappingSet logical model of the framework is explained in the following subsection.

4.2 MappingSet Based Logical Model
   To support the implementation of the conceptual model of the previous section, observation metadata has to be added to the framework's catalog. Thus, the catalog contains metadata of the defined Mappings and MappingSets and metadata related to the various observation processes, including observation properties and features of interest. The E-R diagram of such catalog structures is given in Figure 4.
   Entity types MappingSet, DomComponent and Mapping record general metadata of the MappingSets. Entity type FOI records metadata of Features of Interest, and it references the MappingSet that records its data. FOIs that are fully generated by observation processes are registered in ObsFOI. The remaining FOIs, i.e., those that combine observed with non-observed properties, are represented by entity type NonObsFOI. Each observed property of such a FOI is represented by a weak entity of type ObsProperty, which references the MappingSet that records its data. Finally, ProcessType records metadata of the various types of observation processes registered in the framework. Metadata of each specific instance of each process type is recorded in weak entity type Process. Notice the difference between the process type “Vessel Bascule”, which obtains values of the load property of relationship catches, and the specific bascule installed in each vessel, which must be referenced from each observation.
   The rules that enable the transformation of the conceptual model of the previous section to MappingSets are given next. Each Entity Type, either Spatial or not, generates a relevant MappingSet, whose domain is defined by key properties and whose Mappings are defined by the remaining properties. See for example Entity Types Vessel, Species and ICES in Figure 3 and relevant MappingSets in Figure 5. Each Spatial Coverage generates a MappingSet, whose domain has just one component of some Point2DSample type and whose Mappings are generated from coverage properties. Each Relationship Type with cardinalities various to


 MAPPINGSET Vessel
 DOMAIN
   vesId: CString
 MAPPINGS
   power(vesId:CString):Numeric(6,2)

 MAPPINGSET Vessel_loc
 DOMAIN
   obsTime: TimeInstantSample(0, 30),
   vesId: CString
 MAPPINGS
   loc(obsTime: TimeInstantSample(0, 30),
       vesId:CString):Point2D
   process(obsTime: TimeInstantSample(0, 30),
           vesId:CString):CString

 MAPPINGSET Vessel_ctd
 DOMAIN
   obsTime: TimeInstant(0),
   vesId: CString
 MAPPINGS
   cond(obsTime: TimeInstant(0),
        vesId:CString):Numeric(4,1)
   condUOM(obsTime: TimeInstant(0),
           vesId:CString):CString
   temp(obsTime: TimeInstant(0),
        vesId:CString):Numeric(5,2)
   tempUOM(obsTime: TimeInstant(0),
           vesId:CString):CString
   depth(obsTime: TimeInstant(0),
         vesId:CString):Numeric(5,2)
   depthUOM(obsTime: TimeInstant(0),
            vesId:CString):CString
   process(obsTime: TimeInstant(0),
           vesId:CString):CString

 MAPPINGSET Species
 DOMAIN
   speId: CString
 MAPPINGS
   name(speId:CString):CString
   comfort(speId:CString):Interval(Numeric(5,2))

 MAPPINGSET ICES
 DOMAIN
   zoneId: CString
 MAPPINGS
   geo(zoneId:CString):Polygon(Point2D(9,2))

 MAPPINGSET Capacity
 DOMAIN
   vessel: CString,
   ices: CString,
   species: CString
 MAPPINGS
   quota(vessel:CString, ices:CString,
         species:CString):Numeric(7,3)
   quotaUOM(vessel:CString, ices:CString,
            species:CString):CString

 MAPPINGSET Effort
 DOMAIN
   vessel: CString,
   ices: CString
 MAPPINGS
   quota(vessel:CString,
         ices:CString):Numeric(7,3)
   quotaUOM(vessel:CString,
            ices:CString):CString

 MAPPINGSET Catches
 DOMAIN
   species: CString,
   vessel: CString,
   obsTime: TimeInstant(0)
 MAPPINGS
   load(species:CString, vessel:CString,
        obsTime:TimeInstant(0)):Numeric(7,3)
   loadUOM(species:CString, vessel:CString,
           obsTime:TimeInstant(0)):CString
   process(species:CString, vessel:CString,
           obsTime:TimeInstant(0)):CString

 MAPPINGSET SST
 DOMAIN
   loc: Point2DSample(9,2,100000),
   obsTime: Date
 MAPPINGS
   temp(loc:Point2DSample(9,2,100000),
        obsTime:Date):Numeric(5,2)
   tempUOM(loc:Point2DSample(9,2,100000),
           obsTime:Date):CString
   process(loc:Point2DSample(9,2,100000),
           obsTime:Date):CString

                 Figure 5: MappingSets for a Running Application Example.
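
The extensional MappingSets of Figure 5 can be read as sets of functions that share one domain. The following Python fragment is only a minimal sketch of that reading, not the framework's implementation: the class `MappingSet`, its methods `insert`, `contains` and `apply`, the dictionary-based storage and all sample values are assumptions introduced here for illustration, and the parametric data types of the model are not enforced.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Set, Tuple

Key = Tuple[Any, ...]


@dataclass
class MappingSet:
    """Hypothetical in-memory MappingSet: one domain, several named Mappings."""
    name: str
    domain: Tuple[str, ...]                      # domain component names
    keys: Set[Key] = field(default_factory=set)  # domain elements stored so far
    mappings: Dict[str, Dict[Key, Any]] = field(default_factory=dict)

    def insert(self, key: Key, **values: Any) -> None:
        """Append one domain element together with its Mapping values."""
        if len(key) != len(self.domain):
            raise ValueError(f"{self.name} expects {len(self.domain)} domain components")
        self.keys.add(key)
        for mapping_name, value in values.items():
            self.mappings.setdefault(mapping_name, {})[key] = value

    def contains(self, *key: Any) -> bool:
        """Membership test, as in the predicate Vessel_loc(obsTime, vesId)."""
        return key in self.keys

    def apply(self, mapping_name: str, *key: Any) -> Any:
        """Function application, as in Vessel_loc.loc(obsTime, vesId)."""
        return self.mappings[mapping_name][key]


# Vessel and Vessel_loc from Figure 5, filled with made-up sample data.
vessel = MappingSet("Vessel", ("vesId",))
vessel.insert(("V1",), power=350.0)

vessel_loc = MappingSet("Vessel_loc", ("obsTime", "vesId"))
vessel_loc.insert((0, "V1"), loc=(-9.10, 42.50), process="gps-V1")
vessel_loc.insert((30, "V1"), loc=(-9.11, 42.51), process="gps-V1")
```

Under these assumptions, `vessel_loc.apply("loc", 30, "V1")` plays the role of `Vessel_loc.loc(obsTime, vesId)`, and `vessel_loc.contains(30, "V1")` plays the role of the domain predicate `Vessel_loc(obsTime, vesId)`.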




various generates a MappingSet whose domain is defined from the key properties of the participating Entity Types. Properties of those Relationship Types generate Mappings in such a MappingSet. See for example Relationship Types capacity and effort in Figure 3 and MappingSets Capacity and Effort in Figure 5. If an Entity, Coverage or Relationship Type is tagged with the symbol TO, then a component named obsTime of some TimeInstantSample(D,R) data type is added to the MappingSet Domain to enable the recording of observation time.² Besides, a Mapping named process is also added to obtain the id of the process used to produce the observation. See for example Spatial Coverage Type SST in Figure 3 and relevant MappingSet SST in Figure 5. If symbol EO is used instead, then the data type of component obsTime is some TimeInstant(D). See for example Relationship Type catches in Figure 3 and relevant Catches MappingSet in Figure 5. In any of the above cases, an entity of type ObsFOI has to be added to the catalog with relevant relationships to its process type and MappingSet.

² Currently we restrict to phenomenon time semantics; however, it can be extended with result time and other required metadata.

   If a simple or complex property is tagged with symbol TO, then such property is not added as a Mapping to the relevant MappingSet. Instead, a separate MappingSet is created for the property, whose domain has components to reference the key of its Entity Type (FOI of the relevant observation) and has a component named obsTime of some TimeInstantSample(D,R) type to record observation time. The property itself is added as a Mapping to the MappingSet and an additional Mapping named process is added to record the id of the process that generates the observation. An example is the loc property of Entity Type Vessel in Figure 3 and the relevant Vessel_loc MappingSet in Figure 5. If symbol EO is used instead, then the transformation is exactly the same except for the fact that Domain component obsTime is of some TimeInstant(D) type. For an example see the ctd property of Vessel Entity Type in Figure 3 and the relevant Vessel_ctd MappingSet in Figure 5. In any of the above cases an entity of type ObsProperty is added to the catalog, with appropriate references to its MappingSet, ProcessType and NonObsFOI.
   Once the MappingSets are created and the required metadata are added to the catalog, the insertion of observation data may be started. ETL tasks are continuously executed to keep the data warehouse updated with the latest observation data, using extensional MappingSet definitions. Each observation is appended to the appropriate MappingSet with its observation time and reference to its process and FOI.

5. DEFINITION OF SPATIO-TEMPORAL ANALYTICAL PROCESSES
   The capabilities provided by the framework for the intensional definition of MappingSets enable the specification of spatio-temporal analytical processes. These capabilities are now illustrated with some examples.

  Example 1. Define a Non-Physical Process that obtains a derived observed property that computes the difference between the temperature measured by the CTD and the sea surface temperature produced for the same location by MODIS (see difTemp derived property of Vessel in Figure 3).

  MAPPINGSET Vessel_difTem
  DOMAIN
    {(obsTime, vesId) | Vessel_ctd(obsTime, vesId)}
  MAPPINGS
    difTem(obsTime, vesId) :=
      SST.temp(Vessel_loc.loc(obsTime, vesId), obsTime) −
      Vessel_ctd.temp(obsTime, vesId)
    difTemUOM(obsTime, vesId) :=
      Vessel_ctd.tempUOM(obsTime, vesId)
    process(obsTime, vesId) := “difTemProcess”

  Notice that, in the expression above, automatic castings of spatial and temporal types are performed during the evaluation of Mappings Vessel_loc.loc and SST.temp.

  Example 2. Define a Non-Physical Process that detects when a vessel leaves an ICES zone to enter a new one (see enters derived relationship in Figure 3).

  ICESFromLoc(loc) :=
    singleton(zone)
    OVER {ICES(zone) ∧ within(loc, ICES.geo(zone))}

  MAPPINGSET enters
  DOMAIN
    {(vesId, ICESFromLoc(Vessel_loc.loc(obsTime, vesId)),
      obsTime) |
      Vessel_loc(obsTime, vesId) ∧
      ICESFromLoc(Vessel_loc.loc(obsTime, vesId)) <>
      ICESFromLoc(Vessel_loc.loc(predecessor(obsTime), vesId))}
  MAPPINGS
    process(vesId, zoneId, obsTime) := “entersProcess”

  In the above expression, Mapping within(g1, g2) yields true if geometry g1 is within geometry g2. Aggregate function singleton(S) yields the element contained in the unitary set S. Finally, Mapping predecessor(ts) yields the time sample that precedes time sample ts in its data type.

  Example 3. Define a Non-Physical Process that produces a measure of the remaining fishing effort for each vessel and ICES zone for each of the preceding 60 days. Consumed fishing effort is obtained from the temporal evolution of vessel location data and ICES zone geometries (see derived property stock of relationship type Effort in Figure 3).

  ICESFromLoc(loc) :=
    singleton(zone)
    OVER {ICES(zone) ∧ within(loc, ICES.geo(zone))}

  consumed_effort(vesId, zoneId, obsTime) :=
    ((count(obsTime2)*30)/86400)*Vessel.power(vesId)
    OVER {Vessel_loc(obsTime2, vesId2) ∧
      obsTime2 < obsTime ∧ vesId2 = vesId ∧
      ICESFromLoc(Vessel_loc.loc(obsTime2, vesId2)) = zoneId}

  MAPPINGSET Effort_stock
  DOMAIN
    {(vesId, zoneId,
      SAMPLING([cast(difTime(now(), 60 Days) AS Date),
        cast(now() AS Date)])) | Effort(vesId, zoneId)}
  MAPPINGS
    stock(vesId, zoneId, obsTime) :=
      Effort.quota(vesId, zoneId) −
      consumed_effort(vesId, zoneId, obsTime)
    stockUOM(vesId, zoneId, obsTime) :=
      Effort.quotaUOM(vesId, zoneId)
    process(vesId, zoneId, obsTime) := “EffortStockProcess”

  In the above expression, Mapping now() yields the current system time instant. Mapping difTime(t, i) subtracts time interval i from time instant t.
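
The arithmetic behind Examples 2 and 3 may be easier to follow in a conventional language. The Python fragment below is a sketch under stated assumptions and does not reproduce the framework's evaluation strategy: zone geometries are simplified to axis-aligned boxes, observation times are plain integers in seconds sampled every 30 seconds, `ices_from_loc` returns `None` when the matching set is not unitary (the text does not define that case), and all function names, zone names and data values are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]   # (min_x, min_y, max_x, max_y), assumed geometry
Track = Dict[Tuple[int, str], Point]      # (obsTime, vesId) -> location

SAMPLE_SECONDS = 30       # GPS sampling period stated in Section 4.1
SECONDS_PER_DAY = 86400


def within(loc: Point, box: Box) -> bool:
    """Simplified within(g1, g2): point-in-box instead of point-in-polygon."""
    x, y = loc
    min_x, min_y, max_x, max_y = box
    return min_x <= x <= max_x and min_y <= y <= max_y


def ices_from_loc(loc: Point, zones: Dict[str, Box]) -> Optional[str]:
    """singleton(zone) OVER {ICES(zone) AND within(loc, ICES.geo(zone))}."""
    matches = [zone for zone, geo in zones.items() if within(loc, geo)]
    return matches[0] if len(matches) == 1 else None


def enters(track: Track, zones: Dict[str, Box]) -> List[Tuple[str, str, int]]:
    """Domain of MappingSet enters: samples whose zone differs from the predecessor's."""
    result = []
    for (t, ves_id), loc in sorted(track.items()):
        previous = track.get((t - SAMPLE_SECONDS, ves_id))   # predecessor(obsTime)
        if previous is None:
            continue
        zone = ices_from_loc(loc, zones)
        if zone is not None and zone != ices_from_loc(previous, zones):
            result.append((ves_id, zone, t))
    return result


def consumed_effort(ves_id: str, zone_id: str, obs_time: int, track: Track,
                    power: Dict[str, float], zones: Dict[str, Box]) -> float:
    """((count(obsTime2) * 30) / 86400) * Vessel.power(vesId)."""
    count = sum(
        1
        for (t, v), loc in track.items()
        if v == ves_id and t < obs_time and ices_from_loc(loc, zones) == zone_id
    )
    return (count * SAMPLE_SECONDS / SECONDS_PER_DAY) * power[ves_id]


def stock(ves_id: str, zone_id: str, obs_time: int, quota: Dict[Tuple[str, str], float],
          track: Track, power: Dict[str, float], zones: Dict[str, Box]) -> float:
    """Effort.quota(vesId, zoneId) - consumed_effort(vesId, zoneId, obsTime)."""
    return quota[(ves_id, zone_id)] - consumed_effort(
        ves_id, zone_id, obs_time, track, power, zones)


# Made-up data: two zones and four GPS samples of one vessel.
zones = {"Z1": (0.0, 0.0, 10.0, 10.0), "Z2": (10.5, 0.0, 20.0, 10.0)}
track = {(0, "V1"): (1.0, 1.0), (30, "V1"): (2.0, 1.0),
         (60, "V1"): (12.0, 1.0), (90, "V1"): (13.0, 1.0)}
power = {"V1": 288.0}
quota = {("V1", "Z1"): 5.0}

crossings = enters(track, zones)
effort_z1 = consumed_effort("V1", "Z1", 90, track, power, zones)
stock_z1 = stock("V1", "Z1", 90, quota, track, power, zones)
```

With this sample track, two 30-second samples fall inside zone Z1 before instant 90, so the consumed effort is (2 * 30 / 86400) * 288 = 0.2 power-weighted days, and the only zone change is detected at instant 60, when the vessel is first observed in Z2.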


  Example 4. Define a Non-Physical Process that obtains the evolution with respect to time during the last 7 days of the geometry of the comfort zone for each species. Comfort zone is obtained from the temperature interval defined for each species and the sea surface temperature generated by MODIS (see derived property comfGeo of entity type Species in Figure 3).

  MAPPINGSET Species_comfGeo
  DOMAIN
    {(speId,
      SAMPLING([cast(difTime(now(), 7 Days) AS Date),
        cast(now() AS Date)])) | Species(speId)}
  MAPPINGS
    comfGeo(speId, obsTime) :=
      vectorize(loc)
      OVER {SST(loc, obsTime2) ∧ obsTime = obsTime2 ∧
        within(SST.temp(loc, obsTime2), Species.comfort(speId))}
    process(speId, obsTime) := “ComfortZoneProcess”

  In the above expression, aggregate function vectorize(loc) obtains the vector geometry that surrounds the set of sample locations loc. Mapping within(e, i) yields true if element e is within interval i.

6. CONCLUSIONS AND FURTHER WORK
   A data model and data management framework has been proposed for the spatio-temporal analysis of data in data warehouses of spatial observation data. The approach consists of an E-R extension for observation data to be used at a conceptual level and a new logical level model that combines logical and functional paradigms. The advantages of the approach can be summarized as follows:
      • General purpose observation data and metadata coexist with application specific Spatial Entities and Coverages, enabling efficient analysis over the whole set.
      • Few primitive Mappings combined with general purpose logical and functional expressions enable the integrated management of any kind of spatial and spatio-temporal data. Besides, both data and analytical processing are unified under the well known mathematical concept of function.
      • Parametric temporal and spatial types enable the user to have control over the precision and resolution of underlying time and space representations.
      • Specific constructs for the specification of sampled and non-sampled domain components, together with the absence of nested structures, simplify efficient implementation.

   Further work is mainly related to efficient implementation structures and algorithms and the extension of the framework to deal with continuous queries on sensor data.

7. ACKNOWLEDGMENTS
   This work has been partially supported by the Spanish Ministry of Science and Innovation (TIN2010-21246-C02-02).

8. REFERENCES
 [1] A. Arasu, S. Babu, and J. Widom. The cql continuous query language: semantic foundations and query execution. The VLDB Journal, 15(2):121–142, June 2006.
 [2] P. Baumann, A. Dehmel, P. Furtado, R. Ritsch, and N. Widmann. The multidimensional database system rasdaman. In Proceedings of the 1998 ACM SIGMOD international conference on Management of data, SIGMOD ’98, pages 575–577, New York, NY, USA, 1998. ACM.
 [3] P. Baumann and S. Holsten. A comparative analysis of array models for databases. In T.-h. Kim, H. Adeli, A. Cuzzocrea, T. Arslan, Y. Zhang, J. Ma, K.-i. Chung, S. Mariyam, and X. Song, editors, Database Theory and Application, Bio-Science and Bio-Technology, volume 258 of Communications in Computer and Information Science, pages 80–89. Springer Berlin Heidelberg, 2011.
 [4] S. Bowers, J. Madin, and M. Schildhauer. A conceptual modeling framework for expressing observational data semantics. In Q. Li, S. Spaccapietra, E. Yu, and A. Olivé, editors, Conceptual Modeling - ER 2008, volume 5231 of Lecture Notes in Computer Science, pages 41–54. Springer Berlin Heidelberg, 2008.
 [5] P. G. Brown. Overview of scidb: large scale array storage, processing and analysis. In Proceedings of the 2010 ACM SIGMOD International Conference on Management of data, SIGMOD ’10, pages 963–968, New York, NY, USA, 2010. ACM.
 [6] J. P. Cerveira Cordeiro, G. Câmara, U. Moura De Freitas, and F. Almeida. Yet another map algebra. Geoinformatica, 13(2):183–202, June 2009.
 [7] I. Galpin, C. Brenninkmeijer, A. Gray, F. Jabeen, A. Fernandes, and N. Paton. Snee: a query processor for wireless sensor networks. Distributed and Parallel Databases, 29(1-2):31–85, 2011.
 [8] A. Gupta and I. S. Mumick. Materialized views. chapter Maintenance of materialized views: problems, techniques, and applications, pages 145–157. MIT Press, Cambridge, MA, USA, 1999.
 [9] R. H. Güting, M. H. Böhlen, M. Erwig, C. S. Jensen, N. A. Lorentzos, M. Schneider, and M. Vazirgiannis. A foundation for representing and querying moving objects. ACM Trans. Database Syst., 25(1):1–42, Mar. 2000.
[10] International Organization for Standardization (ISO). Information technology – Database languages – SQL multimedia and application packages – Part 3: Spatial. ISO/IEC 13249-3:2011, 2011.
[11] N. Jain, S. Mishra, A. Srinivasan, J. Gehrke, J. Widom, H. Balakrishnan, U. Çetintemel, M. Cherniack, R. Tibbetts, and S. Zdonik. Towards a streaming sql standard. Proc. VLDB Endow., 1(2):1379–1390, Aug. 2008.
[12] M. Kersten, Y. Zhang, M. Ivanova, and N. Nes. Sciql, a query language for science applications. In Proceedings of the EDBT/ICDT 2011 Workshop on Array Databases, AD ’11, pages 1–12, New York, NY, USA, 2011. ACM.
[13] S. R. Madden, M. J. Franklin, J. M. Hellerstein, and W. Hong. Tinydb: an acquisitional query processing system for sensor networks. ACM Trans. Database Syst., 30(1):122–173, Mar. 2005.
[14] M. Neteler and H. Mitasova. Open Source GIS: A GRASS GIS Approach. Third edition. Springer, New York, USA, 2008.
[15] Open Geospatial Consortium (OGC). OpenGIS Sensor Model Language (SensorML) Implementation Specification, 2007. http://www.opengeospatial.org/standards/sensorml.
[16] Open Geospatial Consortium (OGC). Geographic Information: Observations and Measurements. OGC Abstract Specification Topic 20, 2010. http://www.opengeospatial.org/standards/om.
[17] C. Parent, S. Spaccapietra, and E. Zimányi. Spatio-temporal conceptual models: data structures + space + time. In Proceedings of the 7th ACM international symposium on Advances in geographic information systems, GIS ’99, pages 26–33, New York, NY, USA, 1999. ACM.
[18] N. Tryfona, R. Price, and C. Jensen. Chapter 3: Conceptual models for spatio-temporal applications. In T. Sellis, M. Koubarakis, A. Frank, S. Grumbach, R. Güting, C. Jensen, N. Lorentzos, Y. Manolopoulos, E. Nardelli, B. Pernici, B. Theodoulidis, N. Tryfona, H.-J. Schek, and M. Scholl, editors, Spatio-Temporal Databases, volume 2520 of Lecture Notes in Computer Science, pages 79–116. Springer Berlin Heidelberg, 2003.
[19] A. Vaisman and E. Zimányi. A multidimensional model representing continuous fields in spatial data warehouses. In Proceedings of the 17th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, GIS ’09, pages 168–177, New York, NY, USA, 2009. ACM.
[20] J. Viqueira and N. Lorentzos. Sql extension for spatio-temporal data. The VLDB Journal, 16(2):179–200, 2007.