=Paper= {{Paper |id=None |storemode=property |title=Augmenting SDI with Linked Data |pdfUrl=https://ceur-ws.org/Vol-691/paper3.pdf |volume=Vol-691 }} ==Augmenting SDI with Linked Data== https://ceur-ws.org/Vol-691/paper3.pdf
                     Augmenting SDI with Linked Data
                   Sven Schade (1), Carlos Granell (2), Laura Díaz (2)
             (1) European Commission – Joint Research Centre
                 sven.schade@jrc.ec.europa.eu
             (2) Universitat Jaume I – Institute of New Imaging Technologies
                 {carlos.granell, laura.diaz}@uji.es


       Abstract. Spatiotemporal data is provided and consumed by many different
       communities, ranging from groups of environmental experts to decision
       makers and the public. Due to heterogeneous conceptual and technological
       approaches, cross-community communication and cooperation remain
       challenging. Linked Data has been suggested as a means to enable
       interoperability, and first experiments indicate its suitability. In this
       paper, we discuss how solutions for spatiotemporal data management,
       specifically Spatial Data Infrastructures (SDI), can be augmented with
       Linked Data principles. We identify two common usage scenarios, conclude
       that only minor changes to current SDI standards are required for
       implementation, and identify actions for future work.
       Keywords: Linked Data, Spatial Data Infrastructure, Interoperability


1 Introduction
Spatiotemporal data is provided and used by a large number of communities, ranging
from groups of environmental experts to decision makers and the public. In the case
of forest fire monitoring, for example, environmental experts develop fuel maps, fire
maps and burned area maps; decision makers have to determine required actions (such
as tasking of fire fighters); and the public is affected and may also provide valuable
information in the form of observations or photographs. The concept of Spatial Data
Infrastructure (SDI) [1] has been proposed to improve interoperability between those
communities, i.e. to move away from island solutions. Information systems built
using standards-based distributed services have been adopted by the geospatial
community for building such infrastructures. Most relevant data encodings and service
interfaces are standardized by the Open Geospatial Consortium (OGC) and the
International Organization for Standardization (ISO).
   With current developments, we have left island solutions in favor of aquariums [2].
SDIs are being implemented, but many people still view them only from the outside,
through a (thick) glass wall. In most cases, each SDI is strictly separated from the
others, i.e. they use distinct data models and terminology, as well as
community-specific resource discovery facilities. With our work, we try to leave this
stage in favor of a wider use of SDI and easier integration with any form of
information infrastructure. Infrastructures for Volunteered Geographic Information
(VGI) [3] are of particular interest, as they represent the second major case of
spatiotemporal data provision and consumption. We concentrate our work on
spatiotemporal data as a source of value-added information. On the one hand, we aim
at easier data publication; on the other hand, we aim at straightforward data
discovery and access.
   In the (Semantic) Web community, Linked Data has been advocated as a means to
connect heterogeneous resources (data instances, data sets, services, etc.) within a
distributed environment [4]. It is based on the use of uniform identifiers of resources
and on the Resource Description Framework (RDF). Linked Data has recently been
introduced to the geosciences community [5]. In particular, by augmenting SDI, Linked
Data may provide a means to connect groups of environmental experts, decision
makers, and the public [6].
   Assuming that Linked Data principles and technologies provide a way beyond
the aquarium situation, we use this paper to identify common usage scenarios
of an SDI that is augmented with Linked Data principles, and to analyze required
changes in recent SDI standards. We suggest a possible implementation using existing
technologies. While the first scenario addresses the encoding of links in SDIs on the
basis of given standard structures, real Linked Data augmentation is provided by the
second scenario. Only the latter serves the wider Linked Data community.
   The remainder of this paper is structured as follows. Required background is
presented in section 2. Common scenarios for spatiotemporal Linked
Data provision and consumption are introduced in section 3. In section 4, we discuss
the impacts on existing OGC standards and the relevance for recent SDI developments,
present our conclusions, and outline future work.


2 Background

Understanding the main discussions of this paper requires background on SDI
technology and Linked Data principles. Both are introduced in a nutshell.

2.1 Spatial Data Infrastructures

An SDI is an information infrastructure for enhancing geospatial data sharing and
access [1]. Implementations rely on web service technology. The Web Map Service
(WMS, [7]) and the Web Feature Service (WFS, [8]) are two prominent examples. An
abstract structure for data modeling and encoding is provided in the form of the
Geography Markup Language (GML, [9]). GML already provides possibilities for
including metadata; more sophisticated profiles (e.g. for data and service discovery)
are provided separately. The two ISO standards 19115 [10] and 19139 [11] are the
most common examples. Functionalities for data and service publication and
discovery are provided by the Catalogue Service for the Web (CSW) [12].
   Resource metadata in CSW may include links at service level, telling us which
services are related to the current resource. ISO 19115 defines the CI_OnlineResource
complex element, which contains information about services from which resources can
be obtained. This element holds a URL in its linkage element, together
with (optional) information for service definition in the protocol and description
fields. The values contained in this metadata descriptor provide the link to associated
data sets in terms of query parameters to the appropriate service. In addition, the data
resource itself may incorporate links at instance (aka feature) level (footnote 1) to
connect related features among diverse data resources. Both aspects are revisited later
in the paper.
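
The role of the linkage, protocol and description fields can be illustrated with a minimal sketch that reads one CI_OnlineResource element in the ISO 19139 XML encoding. The namespace URIs are those of ISO 19139; the sample fragment, its URL and its text values are hypothetical and serve only as an illustration.

```python
import xml.etree.ElementTree as ET

# Namespaces of the ISO 19139 XML encoding of ISO 19115 metadata.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

# Hypothetical record fragment; the URL and text values are invented.
SAMPLE = """
<gmd:CI_OnlineResource xmlns:gmd="http://www.isotc211.org/2005/gmd"
                       xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:linkage>
    <gmd:URL>http://example.org/sos?service=SOS&amp;request=GetObservation</gmd:URL>
  </gmd:linkage>
  <gmd:protocol>
    <gco:CharacterString>HTTP-GET</gco:CharacterString>
  </gmd:protocol>
  <gmd:description>
    <gco:CharacterString>Download service for fire observations</gco:CharacterString>
  </gmd:description>
</gmd:CI_OnlineResource>
"""


def read_online_resource(xml_text):
    """Return linkage URL, protocol and description of one CI_OnlineResource."""
    root = ET.fromstring(xml_text.strip())

    def text_of(path):
        # Missing optional fields (protocol, description) yield None.
        node = root.find(path, NS)
        return node.text.strip() if node is not None and node.text else None

    return {
        "linkage": text_of("gmd:linkage/gmd:URL"),
        "protocol": text_of("gmd:protocol/gco:CharacterString"),
        "description": text_of("gmd:description/gco:CharacterString"),
    }


resource = read_online_resource(SAMPLE)
print(resource["linkage"], resource["protocol"])
```
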
   Based on these (meta-)data encodings and service interfaces, interoperable clients
for geospatial data provision and consumption are put into place [1]. Government
mandates such as the European Directive on Infrastructure for Spatial Information in
Europe (INSPIRE) [13] recommend such standards for sharing resources (such as
data and processes) with the goal of improving environmental decision-making. In
particular, WFS is recommended for implementing data download services and CSW
is proposed for data and service discovery. A Service Framework, which allows
environmental experts to upload their data and retrieve links to access services, is
under development [14].
   In contrast to classical SDI, the notion of Volunteered Geographic Information
(VGI) emerged recently [3]. VGI highlights that users are active producers of
geographic information rather than passive recipients of geographic information
provided by formal organizations. Possible approaches to merge this bottom-up model
with the top-down SDI model are under investigation [15]. Current implementations
still suffer from the aquarium situation, i.e. a restricted user community.


2.2 Linked Data

Linked Data is a current buzz-phrase promoting access to various forms of data on the
internet [4]. Linked Data is based on two principles that have underpinned the
architecture and scalability of the World Wide Web: (1) Uniform Resource Identifiers
(URI) [16], using the HTTP protocol, which is supported by the DNS, and (2)
hypertext, in which URIs of related resources are embedded within a dataset.
    The Linked Data movement also adds, or re-emphasizes, Semantic Web principles
by following the Resource Description Framework (RDF) data model and encoding
[17]. A basic typing system for subjects, predicates and objects has been proposed as
RDF-Schema (RDF-S) [18]. RDF-S allows for extensions in order to specify
domain-dependent subtypes. It provides one way to describe domain vocabularies, each
with its own namespace; an example is the Simple Knowledge Organization System
(SKOS) [19]. We argue later in this paper that GML and metadata standards serve
similar purposes.
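
As a minimal sketch of the RDF data model, the following lines represent statements as (subject, predicate, object) triples and serialize them in the N-Triples syntax. The rdf:type, rdfs:label and skos:Concept URIs are the published vocabulary terms; the example.org URI and the helper names are hypothetical.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
SKOS_CONCEPT = "http://www.w3.org/2004/02/skos/core#Concept"

# Hypothetical resource URI, used for illustration only.
BURNED_AREA = "http://example.org/vocab/BurnedArea"


def uri(value):
    """Format a URI as an N-Triples term."""
    return f"<{value}>"


def literal(value):
    """Format a plain string literal, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_ntriples(triples):
    """Serialize (subject, predicate, object) terms, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


triples = [
    (uri(BURNED_AREA), uri(RDF_TYPE), uri(SKOS_CONCEPT)),
    (uri(BURNED_AREA), uri(RDFS_LABEL), literal("Burned area")),
]
print(to_ntriples(triples))
```
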
    With content negotiation, the client uses the HTTP Accept header to tell the
service which representations of a resource it accepts [20]. Content negotiation
is not a requirement for publishing Linked Data, but it is common due to the
HTTP 303 publishing pattern. The diversity and richness of Linked Data sources
support a great variety of user interfaces. Browsing becomes an important mode of
user interaction.
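
The server side of content negotiation can be sketched as follows: the Accept header is split into media types with their q weights, and the first acceptable type that the service can deliver is chosen. This is a simplified illustration, not a complete implementation of HTTP negotiation; partial wildcards such as text/* are not handled, and the function names are hypothetical.

```python
def parse_accept(header):
    """Split an Accept header into (media_type, q) pairs, best first."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0]
        if not media_type:
            continue
        q = 1.0  # default weight when no q parameter is given
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    q = float(field[2:])
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not acceptable"
        entries.append((media_type, q))
    # sorted() is stable, so equally weighted types keep the client's order.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def choose_representation(accept_header, available):
    """Return the media type to serve, or None if nothing is acceptable."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type in available:
            return media_type
        if media_type == "*/*" and available:
            return available[0]
    return None


offered = ["text/html", "application/rdf+xml"]
print(choose_representation("application/rdf+xml;q=0.9, text/html;q=0.5", offered))
```
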




1 According to the ISO General Feature Model, the concept of a feature includes feature collections.

3 Augmenting Linked Resources

Considering the growing interest of the geosciences community in Linked Data, we
analyze two common scenarios in Linked Data provision and consumption. We
identified these scenarios based on personal experience and a review of recently
published research. In scenario one, we use a format-agnostic way to codify links
instead of RDF, to remain compatible with the current standard (ISO 19115). We
elaborate on a complete Linked Data augmentation (with RDF) in the second scenario.
The scenarios help us to illustrate potentials, requirements, and changes when
augmenting SDI with Linked Data. We suggest ways of adding link capabilities, both
at the service and at the feature level. Having two levels provides some benefits.
From the service provider perspective, linking capabilities can be added to the SDI
mainstream incrementally, since links at the service level require less effort than
links at the feature level. In addition, when geospatial data are connected at feature
level, data visibility increases greatly, leading to both new synergies and an
unexplored set of new user applications.
    We concentrate on provision, i.e. deployment and publication (Figure 1), before
visiting consumption in the form of discovery and access. In particular, resource
deployment and publication are carried out using the encoding and service standards
of OGC. In Figure 1, for example, a data source provides information in the
Observations and Measurements (O&M) encoding standard [21], a specific encoding for
sensing results; the WMS specification is applied to data visualization; the Sensor
Observation Service (SOS) [22], a service specialized in accessing sensing-based
data, offers O&M; and the CSW allows for resource advertisement and subsequent
discovery. In the remainder of this section, we target scenarios for augmenting data
provision (access service deployment and publication) with Linked Data for open
search and for offering geospatial data encodings based on Linked Data principles.
These scenarios provide a basis for discussing required changes to existing SDI
standards and implementation practices.




  Figure 1. Workflow for data provision.

Scenario One: Embedding Links at the Service Level

In this scenario, we suggest the use of links embedded in the metadata record of a
given SDI resource. A data resource (e.g. observational data) can be deployed in
multiple SDI services, such as view services (WMS) and download services (SOS), at
the same time. The idea is to generate appropriate links between all SDI services
related to the data resource in question. As the resource metadata description resides
in CSWs that codify records in ISO 19115, we elaborate on this standard to find out
where links at the service level might be placed.
   As argued earlier, each metadata record may contain a URI in the linkage element
(see also Figure 1). This may point to the associated resource (e.g. service) by
providing a direct locator with the required query parameters. For SOS data retrieval,
this may for example be the HTTP-GET binding and the GetObservation request [22]. The
linkage element provides a means to link from the metadata record to the access
service, and the protocol field provides required information about the supported
protocol. Nativi and Bigagli propose a similar solution to identify the type of binding
of the access service (HTTP, HTTP-GET, HTTP-POST) [23].
   Connections to other metadata records, related online services and VGI services in
the context of the current metadata record still have to be provided. To address this
issue using the recent standards, we suggest describing links according to ongoing
work on Web Linking [24], which proposes a way to provide format-independent
links within HTTP headers [25] (footnote 2). The syntax of a link header is a set of
parameter-value pairs, as follows:

Link: <URI>; rel="typed_relationship";
 type="accepted_mime_types_of_target_resource";
 title="human-readable_title_for_the_link"

The use of Web Linking yields at least two benefits. First, the link syntax is
format-agnostic, i.e. it does not depend on the actual representation of the resource.
Second, links are annotated with the rel attribute, which adds
semantics to the link in terms of established relation types. A link relation type
conveys the role or purpose of the link and acts as an identifier for the semantics
associated with the link. A list of registered link relation types was already
introduced in HTML and extended later in the Atom specification [26]. Below we describe
some of these standard relation types that may be useful for establishing typed
connections with other related SDI services:
• rel="self" means a link to the preferred URI, i.e., the URI of the download service
   of the resource. The self relation type is equivalent to the current behavior of
   the linkage element as defined in ISO 19115, when the latter field is fully
   qualified.
• rel="previous" means a link to a URI for older versions of the current metadata
   record. This link makes reference to a discovery service. This is common in O&M,
   since this type of data depends strongly on the time variable.
• rel="service" means a link to a URI for related geospatial web services (e.g.,
   WMS or WFS) that serve the same layers. The title attribute may contain a single
   tag (e.g. "WMS", "WFS") to identify the actual OGC service specification.
• rel="related" means links to related resources, for instance, VGI resources.
• rel="via" means a link to the source of the current resource. It refers to the sensor
   or to the process used to transform raw sensor data into value-added information.

2 Each Link header field is semantically equivalent to the atom:link feed-level element
  in Atom (RFC 4287).

Following this suggestion, link headers for a metadata record of a given SOS layer
may be provided like this:

Link:; rel="self"; type="text/xml"
Link:; rel="previous"; type="text/xml"
Link:; rel="service"; type="image/jpeg"; title="WMS"
Link:; rel="related"; type="image/jpeg"
Link:; rel="via"; type="text/html"
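
Link headers of this kind could be generated with a few lines of code. The sketch below assumes the Web Linking form in which the target URI is enclosed in angle brackets; the function name and all example.org URIs are hypothetical, since the actual target URIs are not given here.

```python
def format_link(target, rel, media_type=None, title=None):
    """Build one Link header line with a typed relation."""
    parts = [f"<{target}>", f'rel="{rel}"']
    if media_type:
        parts.append(f'type="{media_type}"')
    if title:
        parts.append(f'title="{title}"')
    return "Link: " + "; ".join(parts)


# Hypothetical targets for the metadata record of an SOS layer.
headers = [
    format_link("http://example.org/sos?request=GetObservation", "self", "text/xml"),
    format_link("http://example.org/csw?id=record-v1", "previous", "text/xml"),
    format_link("http://example.org/wms", "service", "image/jpeg", "WMS"),
    format_link("http://example.org/vgi/photos", "related", "image/jpeg"),
    format_link("http://example.org/sensors/42", "via", "text/html"),
]
print("\n".join(headers))
```
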

The obvious question that arises is where to place these links. A first option is to
place the list of links in the ISO 19115 linkage element. One drawback is that this
element is of type URL, so a list of links encoded in such a way does not fit the
data type constraints of the field. We therefore suggest using the description field to
accommodate links to related services and resources, as illustrated in Figure 2.




  Figure 2. Workflow for data provision + links at the service level.

With respect to provision, client applications require minimal changes to support the
scenario illustrated above (Figure 2). Rather than treating the description field as free
text, client applications have to view it as a set of typed links to related services. From
the server perspective, this solution leaves current implementations of CSW-based
catalog services almost unchanged. For instance, links would be edited through
the CSW transactional interface [12] and stored in the metadata record
(description), as is the case now. No changes are needed except to the semantics
(not the syntax) of the description field.
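
The client-side change can be sketched as a small parser that reads the description field as a set of typed links instead of free text. The sketch assumes one link per line, target URIs in angle brackets, and quoted parameter values; the function names and the example.org URIs are hypothetical.

```python
import re

# A target in angle brackets followed by ;name="value" parameters.
LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*"[^"]*")*)')
PARAM_RE = re.compile(r'([\w-]+)\s*=\s*"([^"]*)"')


def parse_links(description):
    """Turn the text of a description field into a list of link dictionaries."""
    links = []
    for line in description.splitlines():
        line = line.strip()
        if line.lower().startswith("link:"):
            line = line[5:].strip()
        match = LINK_RE.match(line)
        if not match:
            continue  # not a typed link; ignore free text
        link = {"href": match.group(1)}
        link.update(PARAM_RE.findall(match.group(2)))
        links.append(link)
    return links


def targets_for(links, rel):
    """Return the target URIs of all links with the given relation type."""
    return [link["href"] for link in links if link.get("rel") == rel]


# Hypothetical content of a description field.
description = """Link: <http://example.org/sos>; rel="self"; type="text/xml"
Link: <http://example.org/wms>; rel="service"; type="image/jpeg"; title="WMS"
"""
links = parse_links(description)
print(targets_for(links, "service"))
```

Lines that do not match the link pattern are skipped, so a description that still contains free text would not break such a client.
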
   In order to support data providers, tools such as the Service Framework [14]
support some of these typed links at publication time. Figure 3 illustrates the
conceptual architecture of the Service Framework, whose aim is to assist users in the
integration of geospatial data resources within an infrastructure by providing
automatic mechanisms to deploy resources based on OGC, ISO and INSPIRE standards and
to register them in (INSPIRE-based) discovery services.




    Figure 3. Service Framework conceptual diagram.

Link discovery and access would be provided by the current CSW discovery
interfaces (GetRecords and GetRecordById queries). Clients would be in charge of
interpreting the relation types of the set of link headers found in the description
field (footnote 3). The client would submit an HTTP HEAD request to get only the set
of links associated with the resource in question. This method is useful for retrieving
HTTP header fields such as the list of link headers. It gives clients the chance to
retrieve only the links without processing the metadata record of the resource. In this
case, we extend SDI service interfaces slightly, since we introduce the use of the HTTP
HEAD method. An example of such an HTTP HEAD request is given below:

HEAD /catalog?service=csw&request=getrecord HTTP/1.1
Host: server.org
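
A client could issue such a request with standard library calls, as in the following sketch. The helper names are hypothetical; only the request object is built here, and fetch_link_headers is defined but not called, so no network access takes place.

```python
import urllib.request


def build_head_request(url):
    """Create a HEAD request, which asks for the header fields only."""
    return urllib.request.Request(url, method="HEAD")


def fetch_link_headers(url, timeout=10):
    """Send the HEAD request and return all Link header values (may be empty)."""
    with urllib.request.urlopen(build_head_request(url), timeout=timeout) as response:
        return response.headers.get_all("Link") or []


# Same host, path and query string as in the request shown above.
request = build_head_request("http://server.org/catalog?service=csw&request=getrecord")
print(request.get_method(), request.full_url)
```
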

3 See also the HATEOAS (Hypermedia As The Engine of Application State) constraint in REST.

A response would return the list of links contained in the description field:

HTTP/1.1 200 OK
Content-Length:…
Cache-Control:..
…
Link:; rel="self"; type="text/xml"
Link:; rel="previous"; type="text/xml"
Link:; rel="service"; type="image/jpeg"; title="WMS"
Link:; rel="related"; type="image/jpeg"
Link: