=Paper= {{Paper |id=Vol-1222/paper1 |storemode=property |title=Automating the web publishing process of environmental data by using semantic annotations |pdfUrl=https://ceur-ws.org/Vol-1222/paper1.pdf |volume=Vol-1222 |dblpUrl=https://dblp.org/rec/conf/mir/MossgraberH14 }} ==Automating the web publishing process of environmental data by using semantic annotations== https://ceur-ws.org/Vol-1222/paper1.pdf
   Automating the web publishing process of environmental
            data by using semantic annotations
                         Jürgen Moßgraber                                                            Désirée Hilbring
                            Fraunhofer IOSB                                                            Fraunhofer IOSB
                          Fraunhoferstraße 1                                                         Fraunhoferstraße 1
                           76131 Karlsruhe                                                            76131 Karlsruhe
      juergen.mossgraber@iosb.fraunhofer.de                                             desiree.hilbring@iosb.fraunhofer.de


Copyright © by the paper’s authors. Copying permitted only for private and academic purposes.
In: S. Vrochidis, K. Karatzas, A. Karpinnen, A. Joly (eds.): Proceedings of the International
Workshop on Environmental Multimedia Retrieval (EMR2014), Glasgow, UK, April 1, 2014,
published at http://ceur-ws.org

ABSTRACT
Large amounts of environmental data are still hidden away in databases that are accessible only
to domain experts. These data need to be made available to other experts for further data
fusion. Implementing standards like the Sensor Observation Service (SOS) requires huge efforts
on the side of environmental agencies. At the same time, pressure is growing to make these data
available to the interested public as Linked Open Data (LOD). This additional demand requires
even more programming resources to meet the new requirements and support the new interfaces. In
this paper, we describe a system architecture that simplifies and automates the publishing of
environmental data in different data models. Ontologies are applied to map the syntax and
semantics of the different models. Additionally, we present a proof-of-concept implementation
supporting both SOS and LOD interfaces.

Keywords
Linked Open Data, Semantic, Sensor Observation Service (SOS), Web Publishing, Software
Architecture.

1. INTRODUCTION
   Geographical data play an increasingly important role in many application fields. Especially
in the environmental domain, large amounts of measurement data are stored in expert databases.
However, these are not accessible to other public bodies or to citizens. One reason, among
others, is that standards for accessing the data are not used.
   The challenge is not to address a specific standard but the increasing number of standards
that have to be supported by an environmental information system. Examples are standards of the
Open Geospatial Consortium (OGC) such as the Web Feature Service (WFS) and the Sensor
Observation Service (SOS). At the same time, the pressure to make these data available to the
interested public brings up the requirement to also support standards from the Linked Open Data
(LOD) domain.
   Huge efforts on the side of environmental agencies would be required to support all of them,
which is far beyond the budgets of these institutions. Not only the plain programming work
needs to be considered but also the mapping of the syntax and semantics of the different data
models. The difficulty lies especially in the semantics, which require time-consuming
discussions between domain and IT experts. Furthermore, the domain experts need to be in
control of which data are published. Since this is daily business, no programming should be
required.
   In section 3, relevant standardized and proprietary service interfaces for environmental
data and their data models are described. The challenges of mapping data models are explained
in section 4. After that, we present a method to simplify the task of mapping the data models
by using ontologies (section 5) and show a system architecture and experimental implementation
based on our Extensible Database Application Configurator (XCNF) framework.

2. RELATED WORK
   A lot of research has been carried out in the area of mapping (data) models. The mapping of
relational database schemas, which have been available for a long time, has received particular
attention. A good overview of the state of the art is given by [5]. More current research
focuses on XML and ontology models [7], of which the latter have the advantage of providing the
semantics of the model as well. In addition, mapping between these different kinds of models
has been researched. However, until now there is no fully automatic mapping algorithm that
solves the problem completely [6]. Therefore, we center the following work on simplifying the
manual mapping of models by means of semantic annotations, which can be applied by a domain
expert.
   An overview of the state of the art in Linked Data is given in [4]. Tools such as “D2R
Server” [12] are used to publish data stored in relational databases. The data publisher
defines a mapping between the relational schema of the database and the target ontology
vocabulary with a declarative mapping language. Because such a mapping is static, domain
experts cannot easily apply changes. Example systems are described in [13] and [14].

3. RELEVANT INTERFACES AND DATA MODELS
   The Open Geospatial Consortium (OGC) is concerned with the definition of standardized
interfaces in the domain of geographical information and increasingly in the area of sensor
data ("Sensor Web Enablement").

3.1 Sensor Observation Service (SOS)
   The SOS specification [1] provides operations to retrieve sensor data and specifically
“observation” data.
   The observations themselves are defined by another OGC standard: the Observation and
Measurement Model (O&M) [2]. Observations described by O&M can be seen directly as




measurements from sensors, but they can also represent other data structures.

3.2 Web Feature Service (WFS)
   The Web Feature Service (WFS) represents a change in the way geographic information is
created, modified and exchanged on the Internet. Rather than sharing geographic information at
the file level using File Transfer Protocol (FTP), for example, the WFS offers direct
fine-grained access to geographic information at the feature and feature property level [3].

3.3 Linked Open Data (LOD)
   In computing, linked data (often capitalized as Linked Data) describes a method of
publishing structured data so that it can be interlinked and become more useful. It builds upon
standard Web technologies such as HTTP, RDF and URIs, but rather than using them to serve web
pages for human readers, it extends them to share information in a way that can be read
automatically by computers. This enables data from different sources to be connected and
queried [4].

3.4 XCNF
   XCNF (eXtensible database application CoNFigurator) is a Java-based client/server framework
by Fraunhofer IOSB for developing information systems for time series analysis. While the
framework can be applied to any domain, we mainly apply it to the domains of water management
and water quality. Most of the data are time series with spatial relationships.
   XCNF uses a proprietary metadata model, which describes not only the data but also the
layout of input forms and search masks. XCNF uses a concept called View. A View provides access
to a part of one or more connected databases, quite similar to a database view. In contrast to
a database view, it provides additional annotations to add semantics to its attributes and to
link attributes to other views. As a consequence, every end user creates or extends their own
data model by creating or modifying an XCNF View.

4. PUBLISHING AND MAPPING OF DATA MODELS
   Figure 1 depicts the problem that needs to be solved. Several interface standards with their
specific data models have to be mapped, with respect to their syntax and semantics, to a
proprietary data model of an existing information system in the backend.
   As noted in section 3.4, XCNF provides a metadata model to describe data models, which can
change dynamically. This means that we cannot apply a once-only mapping of the models. Instead,
the mapping needs to be adjusted whenever an end user makes a change and therefore needs to be
dynamic too.

   Figure 1. Required mapping for accessing time series with different standards

4.1 Concept
   To publish data from XCNF, the existing features are used and extended by ontology
annotations:
      • An ontology is required for each interface that should be supported (SOS, WFS, etc.).
        The ontology must contain the specific concepts and properties to describe the model.
        Preferably, an existing ontology should be reused.
      • All required concepts and their accompanying properties contained in the used ontology
        must be mapped to existing XCNF Views and their attributes. This is done by annotating
        them with the URIs of ontology resources. For example, if available datasets are to be
        published as SOS Observations, the appropriate XCNF View is annotated with #Observation
        (only the hash part of the URI is shown for better readability). The attributes of the
        view need to be annotated with properties from the ontologies too, e.g. #hasValue,
        #hasTime, etc.
      • Other interfaces (e.g. LOD) can be supported by annotating the views with URIs from the
        ontology used for the other interface.
      • The specific publishing service (SOS, WFS, etc.) can now read all of the entries from
        the related XCNF Views that are annotated with concepts of its ontology.
      • Since the structure is given by the ontology, the service can relate multiple views
        which belong together.

   Figure 2. Architecture for SOS accessing XCNF
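
   As an illustration of the annotation concept in section 4.1, the following minimal Python
sketch models views that carry ontology annotations and resolves a concept to the annotated
attributes at request time. It is not the XCNF implementation (XCNF is Java based): the View
class and the functions views_for_concept and resolve are hypothetical names, and the view and
attribute names are borrowed from the mapping example in section 5.1.1.

```python
from dataclasses import dataclass, field


@dataclass
class View:
    """Hypothetical stand-in for an XCNF View (not the real XCNF API)."""
    name: str
    # Ontology concepts the view is annotated with, e.g. {"#Phenomenon"}.
    concepts: set = field(default_factory=set)
    # Attribute name -> ontology property, e.g. {"KURZNAME": "#hasName"}.
    attribute_annotations: dict = field(default_factory=dict)


def views_for_concept(views, concept):
    """Return all views that are currently annotated with the concept."""
    return [view for view in views if concept in view.concepts]


def resolve(views, concept):
    """Map each annotated property of the concept to 'VIEW.ATTRIBUTE'.

    The annotations are read on every call, so an edit made by a
    domain expert takes effect with the next request.
    """
    mapping = {}
    for view in views_for_concept(views, concept):
        for attribute, prop in view.attribute_annotations.items():
            mapping[prop] = f"{view.name}.{attribute}"
    return mapping


# Names taken from the mapping example in section 5.1.1.
parameter = View("GEW_PARAMETER", {"#Phenomenon"}, {"KURZNAME": "#hasName"})
views = [parameter]
print(resolve(views, "#Phenomenon"))

# A domain expert adds a further annotation; no re-deployment is needed.
parameter.attribute_annotations["BASIS_NR"] = "#hasID"
print(resolve(views, "#Phenomenon"))
```

   Because resolve reads the annotations on every call, a change made by a domain expert is
visible to the publishing service on its next request, which is the dynamic behaviour asked for
in section 4.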




5. ARCHITECTURE AND IMPLEMENTATION
   The architecture and implementation of an SOS interface is described in the following. Other
interfaces can be supported in the same way.
   Figure 2 depicts the components of the system, which are described in the following
sub-sections.

5.1 Ontology
   Several translations to an ontology are available for the Observation and Measurement Model
(O&M) [8]. Since they tend to be rather complex, we have extracted only those concepts and
properties which were necessary for the mapping. The following concepts and their properties
are used:
      • Observation
           hasObservedProperty
           measuredByProcedure
           hasValue
           relatesToFeatureOfInterest
           hasTime
           hasUnit
      • Phenomenon
           hasName
           hasID
      • Procedure
           hasName
           hasID
      • FeatureOfInterest
           hasName
           hasID
           hasNorthing
           hasEasting

5.1.1 Mapping Example
   Our test data is taken from the Fachinformationssystem Gewässer Qualität (FISGeQua), which
contains water quality data from all measurement stations of the German state
Baden-Württemberg.
   The following table shows how the XCNF Views of FISGeQua have been annotated with resources
from the SOS ontology to support the SOS interface:

   Ontology Concepts and Properties    XCNF Views and Attributes
   #Observation                        GEW_MESSWERT_GUETE, GEW_PROBE
      #hasObservedProperty             GEW_MESSWERT_GUETE.PARAMETER_NR
      #measuredByProcedure             GEW_MESSWERT_GUETE.MESSVERFAHREN_NR
      #hasValue                        GEW_MESSWERT_GUETE.MESSWERT
      #relatesToFeatureOfInterest      GEW_PROBE.PNST_NR
      #hasUnit                         GEW_MESSWERT_GUETE.DIMENSION_NR
   #Phenomenon                         GEW_PARAMETER
      #hasID                           GEW_PARAMETER.BASIS_NR
      #hasName                         GEW_PARAMETER.KURZNAME
   #Procedure                          UIS_SL_MESSVERFAHREN
      #hasName                         UIS_SL_MESSVERFAHREN.LANGNAME
      #hasID                           UIS_SL_MESSVERFAHREN.MESSVERFAHREN_NR
   #FeatureOfInterest                  GEW_PNST, GEW_MST, GEW_POSITION
      #hasName                         GEW_MST.NAME
      #hasID                           GEW_PNST.PNST_NR
      #hasNorthing                     GEW_POSITION.HW
      #hasEasting                      GEW_POSITION.RW

   Since the example above contains German words and acronyms, here is a short glossary:
      • MESSWERT: measurement
      • PROBE: observation
      • GUETE: quality
      • MESSVERFAHREN: measurement procedure
      • DATUM: date
      • KURZNAME: short name
      • LANGNAME: long name
      • HW + RW: the geo location

5.2 SOS Requests and Results
   By using the above mapping, data can be retrieved from the FISGeQua database and made
accessible via an SOS interface. A typical SOS request can be formulated in the following way:
      • Give me all available data which matches the following conditions:
           o The #Phenomenon shall be water temperature.
           o The #Procedure which has been used to determine the water temperature is
             electrometry.
           o The data has been measured in the time range of 2nd to 4th January 2005.
           o The #FeatureOfInterest, which defines the spatial region, is the measuring point
             with id 1051.

   This request in the SOS XML notation looks like the following:
                                                                        
   [SOS request XML; the markup did not survive extraction. Surviving values, in order: 10;
   8289; TW; phenomenonTime; 2005-01-02T14:00:00.000+01:00; 2005-01-04T15:00:00.000+01:00;
   1051.0; http://www.opengis.net/om/2.0]

   As you can see, the request contains no FISGeQua-specific nomenclature. Here is the response
to this request:

   [SOS response XML; the markup did not survive extraction. Surviving values, in order:
   abc8dbd3-13ff-442a-9e23-80a9ec96881f; 2005-01-03T12:10:00.000+01:00; 6.5]

   It says that a water temperature (TW) of 6.5°C has been measured at measuring point 1051 on
January 3rd.

   [...] and the annotations of the views. The service provides the following methods:
      • /xcnfrestservice/capabilities
        Provides a list with all supported models/ontologies.
      • /xcnfrestservice/capabilities/viewNames
        Get the names of all published XCNF views, e.g.:
        {"viewNames":["GEW_MESSWERT_GUETE","GEW_PROBE","GEW_PARAMETER","UIS_SL_MESSVERFAHREN","GEW_PNST","GEW_MST","GEW_POSITION"]}
      • /xcnfrestservice/capabilities/mapping
        Get the mapping of the views to the ontology concepts and properties. Note that
        multiple annotations from different ontologies could be applied if the data should be
        available via different interfaces. The following shows the mapping part for
        #Phenomenon and #FeatureOfInterest and with deleted URIs to keep it [...]:
        {"mappingStructure":{"viewMappingList":[{"viewName":"GEW_PARAMETER","conceptNames":["#Phenomenon"],"mappingList":[{"columnName":"KURZNAME","concept":"#hasName"}]},{"viewName":" [...]
      • /xcnfrestservice/capabilities/model/?uri=uri
        Get the ontology with the given URI.
      • [...]
        [...] data mapped to the concept with the given URI.
      • [...]ri&valueList=value
        Query for the data mapped to the concept with the given URI. The response is filtered
        with the properties provided in the additional parameters.

5.4 SOS server and XCNF-DAO
   [...] software, provides an SOS implementation based on Java. As 52°North develops the
reference implementation for the OGC SOS specification, we chose their software (see [...]) as
the basis for our proof-of-concept implementation.
   We chose an early access version 4 (4.0.0 Beta2) of the software since it provides much
better modularity than version 3. In this new version there is now a defined way of plugging
your own data access into the server via so-called Data Access Objects (DAO). Out of the box it
retrieves its data from a relational database in a proprietary format which did not fit our



needs since we wanted direct access to the data stored in an XCNF server for performance
reasons.
   The implemented XCNF-DAO plugs into the SOS server. It retrieves the data from the XCNF REST
Service by utilizing the SOS ontology annotations. The retrieved data is handed over to the SOS
server, which handles the syntax formatting and encoding (see Figure 2).

6. DISCUSSION
6.1 Distribution of Concept Properties over several Views
   The mapping example described in section 5.1.1 shows that the properties of one concept
often need to be mapped to attributes which belong to several different XCNF Views. Here is an
example:
   The #hasObservedProperty property of an #Observation can be found in the XCNF View
GEW_MESSWERT_GUETE, while the property #relatesToFeatureOfInterest is contained in the XCNF
View GEW_PROBE.
   Requesting the #Observation concept via the integrated XCNF View filtering option filtered
with #hasObservedProperty=A, or requesting the #Observation concept filtered with
#relatesToFeatureOfInterest=B, will lead in both cases to too many results if the second filter
option is missing.
   To support the filtering mechanism of the XCNF REST Service
/xcnfrestservice/data/filter/?uri=uri&propertyList=uri&valueList=value, the implementation must
provide an additional filtering operation before returning the results via the URI.

6.2 Reducing the Amount of Data to be published
   Often only subsets of the data in the database are intended for publication. Therefore, we
need a mechanism for defining which subsets of the data in the database can be delivered via
the XCNF REST Service.
   XCNF makes it possible to create so-called BDOs (“Benutzerdefiniertes Objekt”), which are
user-defined objects. It is possible to create a BDO which reduces the amount of data in the
database to the subset which shall be published, e.g. by defining specific measurement points,
a specific time range or specific phenomena.
   Currently we consider implementing the following mechanism:
      1. The #Observation concept in the ontology needs to be extended with a new property
         #hasBDO.
      2. The owner of the database needs to define a specific BDO for the data subset to be
         published.
      3. This BDO needs to be annotated with #hasBDO.
      4. The implementation of the XCNF REST Service /xcnfrestservice/data?uri=uri and its
         filter mechanism need to be extended with an additional filter (propertyList: #hasBDO,
         valueList: #8289) which is not visible from outside the XCNF REST Service.

6.3 Ideas for Integrating Linked Open Data
   Support for Linked Open Data was another idea we had. The architecture therefore allows for
several interfaces. The additional support of LOD would require that we provide our data in RDF
or OWL format. For our current implementation the following two possibilities exist:
      • Either the XCNF REST Service would need to map its responses to RDF or OWL, or
      • we use our extended SOS implementation and map the resulting XML Observation Collection
        to RDF or OWL.
   The first approach will be faster, because it saves one mapping step. However, it will
contain a proprietary solution, while the second approach can use existing geospatial standards
and might reuse mechanisms described in [10] and [11].

6.4 Adapting the approach to other systems
   In this paper, we used the XCNF framework as an example to demonstrate our approach, but it
can be adapted to other systems as well. To facilitate that, the following steps need to be
taken:
      1. Enable annotation of your relational data (could be done with a standard relational
         mapper).
      2. Support multiple mappings (ontologies).
      3. Add the possibility for the user to dynamically change the mapping.
      4. Provide the means to publish only selections of the data (done by XCNF views in our
         approach).

7. CONCLUSION
   In this paper, we presented a concept for dynamically mapping data models of domain expert
systems to different interface standards by annotating the model with resources from an
ontology. In contrast to static approaches like D2R shown in the related work section, this
allows for quicker adaptations to new requirements by the domain expert.
   The described implementation shows that the concept is applicable to a real world scenario.
In the future, we will work on removing the discussed drawbacks and improving the user
interface for executing the mapping. For example, ontology properties for an annotation could
be suggested to the user depending on the data type and the selected ontology concept.
Furthermore, since XCNF views already contain some metadata annotations, it is interesting to
explore to what degree the mappings can be created automatically.

8. REFERENCES
[1] Bröring, A., Stasch, C., Echterhoff, J. (Eds.) 2012. OGC® Sensor Observation Service
    Interface Standard, Version 2.0, Open Geospatial Consortium Inc., 12-006
[2] Cox, S. (Ed.) 2011. Observations and Measurements - XML Implementation, Version 2.0, Open
    Geospatial Consortium, 10-025r1
[3] Vretanos, P. (Ed.) 2010. OpenGIS Web Feature Service 2.0 Interface Standard, Open
    Geospatial Consortium, OGC 09-025r1 and ISO/DIS 19142
[4] Bizer, C., Heath, T., Berners-Lee, T. 2009. Linked Data - The Story So Far, International
    Journal on Semantic Web and Information Systems 5 (3): 1–22. doi:10.4018/jswis.2009081901
[5] Bellahsene, Z., Bonifati, A., Rahm, E. 2011. Schema Matching and Mapping, Springer Verlag,
    doi:10.1007/978-3-642-16518-4
[6] Bernstein, P., Madhavan, J., Rahm, E. Generic Schema Matching, Ten Years Later, 37th
    International Conference



    on Very Large Data Bases (Seattle, Washington, August 29th - September 3rd 2011)
[7] Gross, A., Hartung, M., Thor, A., Rahm, E. 2012. How do computed ontology mappings evolve?,
    Joint Workshop on Knowledge Evolution and Ontology Dynamics (ISWC 2012)
[8] Compton, M., et al. 2012. The SSN ontology of the W3C semantic sensor network incubator
    group, Web Semantics: Science, Services and Agents on the World Wide Web 17, pp. 25–32.
[9] Fielding, R., Taylor, R. 2002. Principled Design of the Modern Web Architecture, ACM
    Transactions on Internet Technology (TOIT) (New York: Association for Computing Machinery)
    2 (2): 115–150, doi:10.1145/514183.514185, ISSN 1533-5399
[10] Page, K., De Roure, D., Martinez, K., Sadler, J., Kit, O. 2009. Linked Sensor Data:
     RESTfully Serving RDF and GML. In Proceedings of 2nd International Workshop on Semantic
     Sensor Networks (SSN09), in conjunction with the 8th International Semantic Web
     Conference, ISWC 2009, Washington, DC, USA, October 2009; CEUR-WS: Aachen, Germany, 2010;
     Volume 522, pp. 49–63.
[11] Probst, F., Gordon, A., Dornelas, I. 2006. Ontology-based Representation of the OGC
     Observations and Measurements Model, Open Geospatial Consortium
[12] Bizer, C., Cyganiak, R. 2006. D2R Server - Publishing Relational Databases on the Semantic
     Web. Poster at the 5th International Semantic Web Conference (ISWC2006)
[13] Moraru, A., Fortuna, C., Mladenic, D. 2011. A System for Publishing Sensor Data on the
     Semantic Web. Journal of Computing and Information Technology - CIT 19, 2011, 4, 239–245,
     doi:10.2498/cit.1002030
[14] Page, K., Frazer, A., Nagel, B., De Roure, D., Martinez, K. 2011. Semantic Access to
     Sensor Observations through Web APIs, Fifth IEEE International Conference on Semantic
     Computing



