<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Discovering Spatial and Temporal Links among RDF Data</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Panayiotis Smeros</string-name>
          <email>panayiotis.smeros@ep</email>
          <email>panayiotis.smeros@epfl.ch</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Manolis Koubarakis</string-name>
          <email>koubarak@di.uoa.gr</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>EPFL</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>National and Kapodistrian University of Athens</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>Link Discovery is a new research area of the Semantic Web which studies the problem of nding semantically related entities lying in di erent knowledge bases. This area has become more crucial recently, as the volume of the available Linked Data on the web has been increasing considerably. Although many link discovery tools have been developed, none of them takes into consideration the discovery of spatial or temporal relations, leaving datasets with such characteristics weakly interlinked and therefore disallowing the exploitation of the rich information they provide. In this paper, we propose new methods for Spatial and Temporal Link Discovery and provide the rst implementation of our techniques based on the well-known framework Silk. Silk, enhanced with the new features, allows data publishers to generate a wide variety of spatial, temporal and spatiotemporal relations between their data and other Linked Open Data, dealing e ectively with the common heterogeneity issues of such data. Furthermore, we experimentally evaluate our implementation by using it in a real-world scenario and demonstrate that it discovers accurately all the existing links in a time e cient and scalable way.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>Linked data is a research area which studies how one can
make RDF data available on the Web, and interlink it with
Work done while the author was at National and
Kapodistrian University of Athens.</p>
      <p>Permission to make digital or hard copies of part or all of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for profit or commercial advantage and that copies bear this notice and the full citation
on the first page. Copyrights for third-party components of this work must be honored.
For all other uses, contact the owner/author(s).</p>
      <p>
        WWW2016 Workshop: Linked Data on the Web (LDOW2016).
April 11, 2016 - Montréal, Canada.
c 2016 Copyright held by the owner/author(s).
other data in order to increase its value for users [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. The
goal of Linked Data is to allow people to share structured
data on the web as easily as they can do with documents
today. An important step for the evolution of the Web of
documents to the Web of data is the transformation of the
data from any form that it exists into a common format, the
Resource Description Framework (RDF) so that it can be
easily integrated with other data already transformed in this
format. All the data that is compatible with the Linked Data
Principles composes the Linked Open Data (LOD) cloud1.
      </p>
      <p>
        Recently, spatial and temporal extensions to RDF have
been proposed and implemented. GeoSPARQL [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ] is a
recent OGC2 standard that allows representing and querying
geospatial data on the Semantic Web. Also, the data model
stRDF accompanied by the query language stSPARQL [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]
are extensions of the standard RDF and SPARQL for
representing and querying geospatial data that changes over
time. Both of the above extensions are implemented in the
open source spatiotemporal RDF store Strabon [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ].
Furthermore, OGC and W3C have established a joined working
group which studies the use of spatial data on the web [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ].
      </p>
      <p>
        Link Discovery is one of the main Linked Data Principles.
Its main objective is to establish semantic links between
entities in order to enhance and enrich the information that is
known about them. Whilst the problem of Entity
Resolution i.e., the problem of nding entities which are equivalent,
has been studied a lot in areas such as Relational Databases
and Information Retrieval, Link Discovery de nes the more
generic problem of nding semantically related entities lying
in di erent knowledge bases [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>Although a lot of e ort has been given in the
representation and querying of geospatial and temporal RDF data,
there are not many works in the respective area of Link
Discovery. In the context of Spatial Link Discovery, state of the
art techniques are focusing on nding only spatial
equivalences between entities, leaving other kinds of relations e.g.,
topological relations, undiscovered and the rich geospatial
information lying in many datasets unexploited. The
situation regarding Temporal Link Discovery is even more
premature and thus, to the best of our knowledge, there are
almost no datasets with temporal information that are
connected with each other with links that denote a temporal
relation.</p>
      <p>
        Another common use of Link Discovery is for detecting
internal links within a single dataset. For example, by
applying an Entity Resolution method on a dataset, we can
1http://www.w3.org/DesignIssues/LinkedData.html
2http://www.opengeospatial.org
discover all the similar entities i.e., all the duplicates of this
dataset. Hence, with Spatial and Temporal Link Discovery
we can materialize all the spatial, temporal and
spatiotemporal relations that hold between the entities of a dataset.
This operation is very useful in the areas of Qualitative
Spatial and Temporal Reasoning where the large graphs that are
created based on the qualitative relations are given as input
to corresponding reasoners [
        <xref ref-type="bibr" rid="ref10 ref7">7, 10</xref>
        ] in order to extract useful
information or to verify the consistency of a dataset.
      </p>
      <p>The lack of research in the area of Spatial and
Temporal Link Discovery will be made more notable when more
datasets with such characteristics will be made available in
the LOD Cloud. Currently, a lot of initiatives are moving
towards this direction by publishing open geospatial data
and metadata coming out of open government directives3
and open Earth Observation (EO) data and metadata that is
currently made available by space and environment agencies
(e.g., ESA, NASA and EEA)4. This data usually consists of
measurements produced by observations with hundreds of
gigabytes of geospatial and temporal information. Making
all this data available as Linked Data and interlinking it with
semantic connections will allow the development of services
with great environmental and commercial value.</p>
      <p>
        EU projects such as LEO, MELODIES and TELEIOS,
have already started exploiting this kind of data by
studying its whole life cycle [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. Their nal goal is to build big
knowledge bases with data from various sources and be able
to perform queries e ciently on them. Use cases from these
projects have shown that combining heterogeneous data,
especially in its spatial dimension, can be a bottleneck to
this procedure, since they signi cantly increase the
execution time5. This happens due to the fact that most of the
spatial join operations (e.g., check of containment or
nonintersection) have by de nition quadratic complexity6. This
complexity can get even higher when we rst have to
homogenize the joining data. Thus, computing such relations
among very complex or heterogeneous data, on query time,
can be extremely time consuming and prohibitory for
realtime applications. Therefore, there is a need for a tool that
will be able to materialize those relations (links) e ciently,
even for highly demanding workloads.
      </p>
      <p>In this paper, we propose new methods for Spatial and
Temporal Link Discovery and provide the rst
implementation of our techniques based on the well-known framework
Silk. Silk, enhanced with the new features, allows data
publishers to generate a wide variety of spatial, temporal
and spatiotemporal relations between their data and other
Linked Open Data, dealing e ectively with the common
heterogeneity issues of such data. Furthermore, we
experimentally evaluate our implementation by using it in a real-world
scenario, with datasets that comprise rich spatial and
temporal information, and demonstrate that it discovers
accurately all the existing links in a time e cient and scalable
way.</p>
      <p>The structure of the paper is organized as follows. In
Section 2 we present related work in the area of Link Discovery.
3http://www.linkedopendata.gr
4http://datahub.io/organization/teleios and
http://datahub.io/organization/leo
5http://www.melodiesproject.eu/content/
enhancing-geospatial-sparql-query-times-silk
6O(nm) where n and m are the numbers of points of the
joining spatial objects.</p>
      <p>In Section 3 we provide the background on which the new
methods for Spatial and Temporal Link Discovery that we
propose in Section 4 are based. In Section 5 we describe the
implementation of our techniques in the framework Silk and
in Section 6 we present the experimental evaluation of them.
Finally, in Section 7 we conclude the work by discussing
future directions.
2.</p>
    </sec>
    <sec id="sec-2">
      <title>RELATED WORK</title>
      <p>Up to now, little e ort has been given in the research
area of Spatial and Temporal Link Discovery. Most of the
approaches on generic Link Discovery do not exploit the rich
spatial and temporal information existing in some datasets,
whereas domain speci c approaches on Spatial Link
Discovery are able to discover only spatial similarities (spatial
duplicates). Hence, to the best of our knowledge, there are
no frameworks, either generic or domain speci c, for
discovering spatial or temporal relations other than equivalences
among RDF datasets. Below we describe the most related
to our work state-of-the-art Link Discovery frameworks.</p>
      <p>
        In the area of generic Link Discovery, the authors of [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]
propose the declarative link speci cation language LinQL,
which is translated to standard SQL by the framework
LinQuer, for discovering semantic links over relational data.
      </p>
      <p>
        The LIMES framework [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] introduces a generic algorithm
for Link Discovery which reduces the number of comparisons
that are needed during the interlinking phase by utilizing the
triangle inequality in metric spaces. For nding link
speci cations, LIMES implements supervised and unsupervised
machine learning algorithms.
      </p>
      <p>
        Similarly to LIMES, Silk [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] is also a generic framework
for discovering relationships between data items within
different Linked Data sources. Silk, which is the only open
source generic Link Discovery framework, features a
declarative link speci cation language for specifying which types
of RDF links should be discovered between data sources as
well as which conditions entities must ful ll in order to be
interlinked. These linkage rules may combine various metrics
and can take the graph around entities into account, which
is addressed using an RDF path language. Silk accesses the
data sources that should be interlinked via the SPARQL
protocol and can thus be used against local as well as remote
SPARQL endpoints.
      </p>
      <p>
        In the area of Spatial Link Discovery there are some
domain speci c approaches which are able to discover only
spatial equivalences among datasets i.e., they focus on
Spatial Entity Resolution. In order to achieve this, they combine
the geographic distance of the geometries of the entities with
other kinds of distances e.g., with the string distance of their
labels. The supported geographic distances can be applied
either between any kind of spatial objects (e.g., Hausdor
distance), or only between point objects (e.g., Orthodromic
distance). Some of these approaches are presented in [
        <xref ref-type="bibr" rid="ref25 ref26 ref30">25,
26, 30</xref>
        ].
      </p>
      <p>
        From the generic frameworks, LIMES also addresses the
problem of Spatial Entity Resolution in [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]. The
computation of the distance between the spatial objects lies on
a combination of Hausdor and Orthodromic metrics. On
the other hand, Silk supports the geographic distance only
between point objects.
      </p>
      <p>
        A detailed review on state of the art algorithms and
frameworks is performed in the surveys [
        <xref ref-type="bibr" rid="ref19 ref2">2, 19</xref>
        ] as well as in [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ].
      </p>
    </sec>
    <sec id="sec-3">
      <title>BACKGROUND</title>
      <p>
        In this section we provide the state-of-the-art on the
representation of spatial and temporal information in the RDF
data model and a formal de nition of the problem of Link
Discovery. We also present the models and calculi on which
we base the new Link Discovery relations that we introduce.
The extended background of this paper is given in [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ].
3.1
      </p>
    </sec>
    <sec id="sec-4">
      <title>Representation of Spatial and Temporal</title>
    </sec>
    <sec id="sec-5">
      <title>Information in the RDF data model</title>
      <p>Spatial information in the RDF data model is usually
represented as serializations of geometries accompanied with a
Coordinate Reference System (CRS) which de nes how to
relate these serializations to real geometries on the surface
of Earth. For encoding this information, Well-Known Text
(WKT) and Geography Markup Language (GML) are used.
WKT is an OGC standard for the representation of
vector geometry objects, CRSs, and transformation rules
between di erent CRSs. On the other hand, GML, developed
by OGC as well, is the most common XML-based encoding
standard for the representation of geospatial data, that
provides XML schemas for de ning a variety of concepts that
are of use in Geography: geographic features, geometries,
CRSs and topologies.</p>
      <p>The main RDF vocabularies for representing spatial as
well as temporal information are described below.
W3C GEO. W3C GEO is an RDF vocabulary for
representing simple location information in RDF. It provides the
basic terminology for serializing point geometries using a
namespace for representing latitude, longitude and other
information about spatially-located things. The CRS of
this vocabulary is encoded in the namespace and it is the
WGS 847. Below, we give an example of the W3C GEO
vocabulary:
_:1 rdf:type wgs84geo:Point .
_:1 wgs84geo:lat "10"^^xsd:double .
_:1 wgs84geo:long "20"^^xsd:double .</p>
      <p>
        GeoSPARQL. GeoSPARQL [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ] is a recent OGC standard
that de nes a vocabulary for representing geospatial data
in RDF, and an extension to the SPARQL query language
for processing geospatial data. It uses literal values to
encode geometries and introduces two RDF datatypes, the
geo:wktLiteral and geo:gmlLiteral, for the WKT and
GML literals. An example of a geometry in GeoSPARQL
is given below:
_:1 rdf:type geo:Geometry .
_:1 geo:hasGeometry
"&lt;epsg:4326&gt; POINT(10 20)"^^geo:wktLiteral .
stRDF: The Spatial Dimension. The data model stRDF
[
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] is an extension of the standard RDF for representing
geospatial data. Similarly to GeoSPARQL, stRDF uses the
OGC standards WKT and GML for the representation of
geospatial data and introduces two new literal datatypes,
the stdf:WKT and strdf:GML. An example of a geometry in
stRDF is shown below:
_:1 rdf:type strdf:Geometry .
_:1 strdf:hasGeometry
"POINT(10 20);&lt;epsg:4326&gt;"^^strdf:WKT .
7http://spatialreference.org/ref/epsg/wgs-84/
stRDF: The Temporal Dimension. An approach for the
representation of temporal information in RDF was
introduced with the temporal dimension of stRDF [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. This
approach assumes a discrete time line and uses the value space
of the datatype xsd:dateTime of XML-Schema8 to model
time. Two kinds of time primitives are supported: time
instants and time periods. Time instants are represented by
literals of the xsd:dateTime datatype and time periods by
literals of the datatype strdf:period. These literals are
used as objects of triples to represent user-de ned time and
as valid time of temporal triples. A temporal triple is an
expression of the form (s, p, o, t) where (s, p, o) is an
RDF triple and t is the valid time of this triple. An example
of a temporal triple in stRDF is shown below:
_:1 rdf:type strdf:Geometry .
_:1 strdf:hasGeometry
"POINT(10 20);&lt;epsg:4326&gt;"^^strdf:WKT
"2000-01-01T00:00:00"^^xsd:dateTime .
3.2
      </p>
    </sec>
    <sec id="sec-6">
      <title>Definition of Link Discovery</title>
      <p>The de nition of Link Discovery based on which we de ne
our new methods is described as follows:</p>
      <p>Definition 1. Let S and T be two sets of entities and R
the set of relations that can be discovered between entities.
For a relation r 2 R, w.l.o.g., a distance function dr and a
distance threshold dr are de ned as follows:
dr : S</p>
      <p>T ! [0; 1] ; dr 2 [0; 1]
The domain of dr is the Cartesian product of S and T and
the range is the computed distance normalized to the interval
[0; 1]. dr is also normalized to [0; 1].</p>
      <p>The set of discovered links for relation r (DLr) is de ned as
follows:</p>
      <p>DLr = f(s; r; t) j s 2 S ^ t 2 T ^ dr(s; t)
dr g
DLr contains triples which have as subject an entity from
dataset S, as object an entity from dataset T and as
predicate the relation r. A triple belongs to DLr i the function
dr returns a distance that does not exceed the threshold dr .
3.3</p>
    </sec>
    <sec id="sec-7">
      <title>Relation Models and Calculi</title>
      <p>
        Region Connection Calculus. The Region Connection
Calculus (RCC) [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ] is a formalization that provides a sound
and complete set of topological relations between two
spatial regions. RCC-8, which is a well-known subset of RCC,
is based on eight topological relations (Figure 1(a)) where
DC stands for DisConnected, EC for Externally Connected,
TPP for Tangential Proper Part, NTPP for Non Tangential
Proper Part, and TPPi and NTPPi are the inverse relations
of TPP and NTPP.
      </p>
      <p>
        Dimensionally Extended 9-Intersection Model. The
Dimensionally Extended 9-Intersection Model (DE-9IM) [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] is
a well-known model for representing topological relations
between geometries. More speci cally, this model captures
topological relations in R2, by considering the dimension
(dim) of the intersections involving the interior (I), the
boundary (B) and the exterior (E) of the two geometries.
Given the intersection matrix (Figure 1(b)), for any two
spatial objects that can be points, lines and/or polygonal
8http://www.w3.org/TR/xmlschema-2#dateTime
      </p>
      <p>Y
X EQ Y
(a)</p>
      <p>X
X TPP Y</p>
      <p>X</p>
      <p>Y
X TPPi Y</p>
      <p>X
X NTPP Y</p>
      <p>X Y
X NTPPi Y
areas, we can de ne relations derived from DE-9IM such as:
Intersects, Overlaps, Equals, Touches, Disjoint, Contains,
Crosses, Covers, CoveredBy and Within.</p>
      <p>
        Egenhofer’s and OGC Simple Features Model. The
Egenhofer's Model [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and the Simple Features Model
proposed by OGC9 contain di erent subsets of the topological
relations that derive from the DE-9IM mentioned above.
Cardinal Direction Calculus. This calculus concentrates
on cardinal direction relations [
        <xref ref-type="bibr" rid="ref11 ref27">11, 27</xref>
        ] which are used to
describe how regions of space are placed relative to one another
e.g., region a is north of region b (Figure 1(c)).
Allen’s Interval Calculus. A widely used algebra for
temporal reasoning is Allen's Interval Calculus [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], which
provides the de nition of possible relations between time
periods. It is based on the thirteen jointly exclusive and
pairwise disjoint qualitative relations given in Figure 1(d). From
these basic relations, one can build new ones by taking
disjunctions of them.
      </p>
      <p>
        Spatiotemporal Constraint Calculus. By pairing a
spatial and a temporal relation model or calculus we can
create a spatiotemporal one. For example, the
Spatiotemporal Constraint Calculus (STCC) [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] combines RCC-8 with
Allen's Interval Calculus. This calculus is useful when
we want to discover relations between spatial objects that
change over time (as we will see in Section 6) or between
moving objects.
      </p>
    </sec>
    <sec id="sec-8">
      <title>METHODS FOR SPATIAL AND TEM</title>
    </sec>
    <sec id="sec-9">
      <title>PORAL LINK DISCOVERY</title>
      <p>In this section we describe the new methods that we
introduce for Spatial and Temporal Link Discovery. Speci cally,
we present the new sets of relations and transformations that
we provide and an optimization technique called Blocking.
Finally, we prove theoretically the soundness and
complete9http://www.opengeospatial.org/standards/sfs
ness of our methods and describe how the latter has as result
their accuracy to be 100%.
4.1</p>
    </sec>
    <sec id="sec-10">
      <title>Spatial and Temporal Relations</title>
      <p>According to the de nition of Link Discovery, presented
above, the set R contains all the relations that can be
discovered between entities. In this paper we introduce the sets
of spatial (Rs), temporal (Rt) and spatiotemporal (Rst)
relations. These sets contain all the relations that are included
in the models and calculi described in Section 3. Speci cally,
Rs contains the relations that are included in the DE-9IM,
Egenhofer's and OGC Simple Features Models and the
Region Connection and Cardinal Direction Calculi, Rt contains
the relations included in the Allen's Interval Calculus and
Rst contains all the per two combinations of the above
models and calculi.</p>
      <p>We consider these relations as Boolean relations (RB) i.e.,
either they hold or they do not hold (Rs; Rt; Rst RB R).
These relations have also been studied in the context of fuzzy
logics but this is out of the scope of this paper.</p>
      <p>RB constitutes a special subset of R. The distance
function dr and the distance threshold dr for a relation r 2 RB
are de ned as follows:
dr(s; t) =
A common issue that occurs when one tries to discover
links between datasets created by di erent data providers is
heterogeneity (e.g., we can have datasets expressed in di
erent time-zones). A preprocessing technique that addresses
this problem is the application of transformations on the
attributes of the entities before checking the existence of a
link. The transformations that we introduce are generic and
not tightly coupled to speci c kind of datasets.
(a)
(b)
Spatial Transformations. The spatial transformations
that we introduce are the following:</p>
      <p>Vocabulary Transformation. As mentioned in
Section 3, the geometries of a dataset can be expressed in
di erent vocabularies (e.g., W3C GEO, GeoSPARQL
or stRDF). This transformation converts the
geometry literals of a dataset in order to be expressed in a
common vocabulary (GeoSPARQL).</p>
      <p>Serialization Transformation. The geometries of a
dataset can be also serialized in di erent ways (e.g.,
using WKT or GML). This transformation converts
the geometries of a dataset to follow a common
serialization (WKT).</p>
      <p>CRS Transformation. Although a CRS such as
WGS 84 is a comprehensive way to describe locations
on Earth, some applications work on a projection of
the Earth. In these cases, a projected CRS is used that
transforms the 3-dimensional ellipsoid approximation
of the Earth into a 2-dimensional surface. This
transformation reverts the CRS of a geometry from a
projected one (e.g., the GGRS87 for Greece) to the World
Geodetic System (WGS 84) for better precision and
homogeneity.</p>
      <p>Validation Transformation. This transformation
converts not valid geometries (e.g., self-intersecting
polygons) to valid ones.</p>
      <p>Simpli cation Transformation. Some datasets have
very complex geometries, which makes the
computation of spatial relations ine cient. This
transformation simpli es a geometry according to a given distance
tolerance, ensuring that the result is a valid geometry
having the same dimension and number of components
as the input.</p>
      <p>Envelope Transformation. This transformation
computes the envelope (i.e., the minimum bounding
rectangle) that contains a geometry and it is useful in cases
that we want to compute approximate spatial relations
between two datasets.</p>
      <p>Area Transformation. In some cases we may want to
compare the areas of two geometries to verify the
existence of a relation. This transformation computes the
area of a given geometry in square metres.</p>
      <p>Points-To-Centroid Transformation. In crowdsourcing
datasets like OpenStreetMap10, multiple users can
dene the position of the same placemark. As a better
approximation of the real position of this placemark
we can compute the centroid of these positions. This
transformation computes the centroid of a cluster of
points.</p>
      <p>Temporal Transformations. The respective temporal
transformations that we introduce are the following:
Period Transformation. The stRDF data model
supports both time instants and periods. This
transformation converts a time instant to a time period with
the same starting and ending point.</p>
      <p>Time-Zone Transformation. Time elements (i.e., time
instants and periods) can be expressed in di erent
time-zones. This transformation converts the
timezone of a given time element to the Coordinated
Universal Time (UTC).
4.3</p>
    </sec>
    <sec id="sec-11">
      <title>Blocking Technique</title>
      <p>
        Since the size of datasets with spatial or temporal
information can be very big, approaches that perform exhaustive
checks between datasets are considered ine cient. Thus,
there is need for a scalable, yet sound and complete
technique for decreasing the number of checks by dismissing
de nitive non-matches prior to the actual check. The most
well-known technique to achieve this is known as Blocking
[
        <xref ref-type="bibr" rid="ref14 ref23">14, 23</xref>
        ]. Blocking is an optimization technique that e
ectively reduces (as we will see in Section 6) the execution
time of our methods without a ecting (as we prove below)
the accuracy of the discovered links.
      </p>
      <p>Spatial Relations. The Blocking technique for spatial
relations that we propose builds blocks that divide the earth
into curved rectangles as depicted in Figure 2(a). The
reference system of our technique is WGS 84 which considers
a spheroid approximation of the Earth. This approximation
requires complicated mathematics for calculations such as
areas, distances, etc. (e.g., the shortest path between two
points in the spheroid is a circle arc whereas, in a planar
approximation it would be just a straight line). Nevertheless,
we chose WGS 84 because we prefer having highly accurate
calculations rather than fast ones.</p>
      <p>The area of the created blocks is measured in square
degrees and can be adjusted by a spatial blocking factor sbf .
The formula for the computation of the area is the following:
10http://www.openstreetmap.org
blockArea =
The bigger the value sbf gets, the more and smaller blocks
will be created. For example, if we assign to sbf the
value 10, our Blocking technique will create 6; 480; 000
non2
overlapping blocks with area 0:01 that cover the whole
surface of the earth11.</p>
      <p>After the division of the space into blocks, we compute the
set of blocks into which each geometry must be inserted. In
order to achieve this, we rst compute the minimum
bounding box (MBB ) that contains each geometry, and then we
nd the blocks that this MBB intersects with. Thus, each
geometry is assigned to all the blocks with which its MBB
intersects.</p>
      <p>Temporal Relations. In the Blocking technique for
temporal relations we follow a similar approach to the one for
spatial relations. Time is one-dimensional and thus the blocks
are one-dimensional as well. Let us suppose that all the
time elements of our data are included in the interval from
the Epoch12 until the Present (Figure 2(b)). This interval
can be manually con gured according to the time range of
our data. Following the same strategy as before, we divide
the time in blocks whose length can be adjusted with a
temporal blocking factor tbf . The formula for the computation
of the length of the blocks is the following:
blockLength =</p>
      <p>time units
1
tbf
where time units range from milliseconds to years.</p>
      <p>After the division of the time into blocks, we insert in
each block all the time periods and instants that temporally
intersect with it.</p>
      <p>Spatiotemporal Relations. The Blocking technique for
spatiotemporal relations is a combination of the techniques
mentioned above. In this case the blocks are
threedimensional (two dimensions for space and one for time).
The formula for the computation of the volume of the blocks
is the following:
blockV olume =
time units
1
sbf 2 tbf
2
4.4</p>
    </sec>
    <sec id="sec-12">
      <title>Link Discovery</title>
      <p>Given that all the entities are inserted in blocks using
the aforementioned Blocking technique, we check a relation
r 2 Rs[Rt[Rst only within the scope of each block. In order
to achieve this, we use the actual spatial and/or temporal
information, computing explicitly the relation.</p>
      <p>Since the blocks are built to be completely independent of
each other, this check can be performed in parallel with
respect to the blocks. Then we construct the set of discovered
links (DLr) by aggregating the respective links that have
been discovered within each block.</p>
      <p>Spatial relation Disjoint is treated in a slightly di erent
way from the other relations. If two entities belong to the
same block, then we follow the previous approach and we
check explicitly the relation. If they don't, the Disjoint
relation holds by de nition and thus we add a link without
actually checking the relation. A similar approach is
followed for the temporal relations Before and After as well as
for the cardinal direction relations.
11The longitude range of WGS 84 is [ 180 ; 180 ] and the
latitude range is [ 90 ; 90 ].
12The Epoch has been set to January 1, 1970, 00:00:00 GMT.
4.5</p>
    </sec>
    <sec id="sec-13">
      <title>Soundness and Completeness</title>
      <p>The soundness and completeness of the proposed methods
for Spatial and Temporal Link Discovery is proved in two
steps. In the rst step we prove that the methods are sound
and complete when performing an exhaustive check of all
the possible pairs of entities (Cartesian product) and in the
second that the Blocking technique that we propose does
not a ect the accuracy of the discovered links.</p>
      <p>
        Cartesian Product Technique. The proposed algorithms
that check if a spatial or temporal relation holds between
two geometries or time instants have been proven sound in
[
        <xref ref-type="bibr" rid="ref1 ref24 ref5">1, 5, 24</xref>
        ]. Also, by de nition, Cartesian product denotes
that we perform an exhaustive (complete) check of all the
pairs of entities from the datasets. Hence, we can state that,
our methods are sound and complete i.e., for each relation
they discover the exact set of pairs of entities for which this
relation holds.
      </p>
      <p>Blocking Technique. We prove that the application of the
Blocking technique does not a ect the completeness of our
methods for Spatial Link Discovery with reduction to
absurdity.</p>
      <p>Proof. Let two geometries lying in di erent blocks with
a spatial relation r 2 Rs nfDisjointg holding between them.</p>
      <p>If r holds between two geometries then they intersect at
least at one point (from the de nition of the spatial
relations).</p>
      <p>If two geometries intersect at one point, so do their MBBs
(from the de nition of MBB ).</p>
      <p>If the MBBs of two geometries intersect, then there is
at least one block in which they will be both inserted (as
described above).</p>
      <p>This results in a contradiction because we assumed that
the two geometries are lying in di erent blocks. Therefore
the initial assumption must be false.</p>
      <p>The above proves that if a relation other than Disjoint holds
between two geometries, then they will be placed in at least
one common block and consequently the relation between
them will be checked and discovered.</p>
      <p>With a similar proof for the Disjoint, the temporal and the
cardinal direction relations we can state that our methods
remain complete, even after the application of the Blocking
technique.</p>
    </sec>
    <sec id="sec-14">
      <title>4.6 Precision and Recall</title>
      <p>Since we proved theoretically that our methods are sound
and complete, their accuracy is guaranteed. Metrics such as
Precision and Recall are by de nition equal to 100%:
P recision =</p>
      <p>Recall =</p>
      <p>T DL
T DL + F DL</p>
      <p>T DL
=</p>
      <p>T DL
T DL</p>
      <p>T DL
=
= 100%</p>
      <p>= 100%
T DL + F NDL</p>
      <p>T DL
T DL stands for True Discovered Links, F DL for False
Discovered Links and F N DL for False Not Discovered Links.
F DL and F N DL are both equal to zero as a consequence
of the soundness and the completeness of our methods.</p>
    </sec>
    <sec id="sec-15">
      <title>EXTENDING THE LINK DISCOVERY</title>
    </sec>
    <sec id="sec-16">
      <title>FRAMEWORK SILK</title>
      <p>All the proposed methods for Spatial and Temporal Link
Discovery have been implemented as extensions to the Silk
framework13. As mentioned in Section 2, Silk is the only, to
the best of our knowledge, open-source generic framework
for discovering relationships between data items within
different Linked Data sources.</p>
      <p>For our implementation we extended transparently the
core components of the Link Discovery Engine of Silk in the
following phases:</p>
      <p>Transformation Phase. In this phase, Silk reads the
incoming entities from the data sources. As an optional
step, a transformation operator can be applied to
normalize the attributes of these entities. For this phase
we implemented the transformation operators that we
presented in Section 4.</p>
      <p>Blocking Phase. Silk employs a blocking technique
which maps entities to a multidimensional index. After
the mapping, the entities are divided into
multidimensional blocks which optionally overlap with each other.
For our implementation we adapted the blocking
technique that we introduced in Section 4 to the one that
Silk uses, by utilizing a two-dimensional index for the
spatial blocking, a one-dimensional for the temporal
and a three-dimensional for the spatiotemporal
blocking.</p>
      <p>Link Generation Phase. Finally, a distance operator
computes the distance for each pair of entities that
have been inserted in the same block and if this does
not exceed a given threshold, it writes the pair to the
output. For this phase we implemented distance
operators for all the spatial14 and temporal relations that
were introduced in Section 4.</p>
      <p>On top of the Link Discovery Engine, Silk implements
two main applications. The rst is used for generating RDF
links on a single machine and is called Silk Single Machine.
The datasets that must be interlinked can either reside on
the same machine or on remote machines which are accessed
via the SPARQL protocol.</p>
      <p>The second application is Silk MapReduce which is used
for generating RDF links between datasets using a cluster of
multiple machines. Silk MapReduce is based on Hadoop15
and it can scale out to very big datasets by distributing the
link generation to multiple machines.</p>
      <p>In both of these applications our Blocking technique
divides the source datasets into blocks and then the distance
operators run in parallel with respect to the blocks. In Silk
Single Machine we have multi-thread parallelization and in
Silk MapReduce multi-machine parallelization.
13The source code of the spatial and temporal extensions
of Silk is publicly available here: https://github.com/
silk-framework/silk.
14For the cardinal direction relations, the developed
operators support only points and not complex geometries as
happens with all the other relations.
15http://hadoop.apache.org
6.</p>
    </sec>
    <sec id="sec-17">
      <title>EXPERIMENTAL EVALUATION</title>
      <p>In this section we experimentally evaluate the spatial and
temporal extensions of Silk by using it in a real-world
scenario. The datasets, detailed instructions and other
useful information for reproducing the experiments are publicly
available16.
6.1</p>
    </sec>
    <sec id="sec-18">
      <title>Compared Frameworks</title>
      <p>
        As we discussed in Section 2, to the best of our
knowledge, there is no related framework with which we can
discover spatial or temporal relations other than equivalences
among RDF datasets. Hence, in the experiments that we
conducted, we compared against variants of Silk and the
state-of-the-art spatiotemporal RDF store Strabon [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ].
      </p>
      <p>Strabon is not considered as a Link Discovery framework
but since it supports the GeoSPARQL and stSPARQL query
languages, NAMED GRAPHS and CONSTRUCT queries
it can be employed for discovering spatiotemporal relations
e.g., the intersects relation, by posing Link Discovery
queries like the one depicted in Figure 3(a). The only
restriction that we face with Strabon is that both the source
and the target datasets must be stored locally, in di erent
named graphs. On the other hand, with Silk, we can
interlink a local dataset with a remote one, that is published by
another data publisher. The only access that we need to it,
is via a SPARQL endpoint.
6.2</p>
    </sec>
    <sec id="sec-19">
      <title>Environment of Experiments</title>
      <p>We conducted our experiments both in a single machine
and a distributed environment. For the single machine
environment, we used a machine with two Intel Xeon E5620
processors (12MB L3 cache, 2.4 GHz), 32 GB of RAM and a
RAID-5 disk array that consists of four disks (32 MB cache,
7200 rpm). For the distributed environment we used a
cluster provided by the European Public Cloud Provider
Interoute17, in which we reserved 1 Master and 20 Slave Nodes
with 2 CPUs, 4GB RAM and 10GB disk each.</p>
      <p>We ran our experiments using the latest version of Silk
with the spatial and temporal enhancements (v2.6.1) and
the latest version of Strabon (v3.2.10) with accordingly
tuned PostgreSQL (v9.1.13) and PostGIS (v2.0) as proposed
by the developers.
6.3</p>
    </sec>
    <sec id="sec-20">
      <title>Scenario</title>
      <p>
        In [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] the authors present a real-time wild re
monitoring service that exploits satellite images and linked
geospatial data to detect and monitor the evolution of re fronts.
This service is now operational at the National Observatory
of Athens and is being used during the summer season by
emergency managers monitoring wild res in Greece18.
      </p>
      <p>A part of the processing chain of the service is to improve
the thematic accuracy of the detected res (hotspots) by
correlating them with auxiliary geospatial data. More speci
cally, the service nds the land cover of each area which is
threatened at a speci c time by a hotspot, in order to avoid
false alarms from res started e.g., by farmers as part of
their agricultural practices. Also, it nds the municipalities
that a hotspot threatens and thus, competent authorities
are made aware about the existence of a re in their area of
16http://silk.di.uoa.gr
17http://www.interoute.com
18http://ocean.space.noa.gr/ res
CONSTRUCT {?s strdf:intersects ?t .} WHERE{
GRAPH ex:source{?s geo:hasGeometry/geo:asWKT ?sg.</p>
      <p>?s strdf:hasValidTime ?st.}
GRAPH ex:target{?t geo:hasGeometry/geo:asWKT ?tg.</p>
      <p>?t strdf:hasValidTime ?tt.}
FILTER(geof:sfIntersects(?sg, ?tg) &amp;&amp;
strdf:intersects(?st, ?tt))}
(a)
Dataset
HG
CLCG
GAG</p>
      <p>Geometries Time Elements
Type #Points Type #Instants
Polygons 148,192 Instants 37,048
Polygons 8,004,058 Periods 9,736
Polygons 979,929 Periods 650
responsibility.</p>
      <p>
        Below, we provide a short description of the datasets of the
scenario, whilst in Figure 3(b) we present some quantitative
characteristics of them. These datasets also constitute a
subset of the datasets used in the state-of-the-art benchmark
for Geospatial RDF Stores, Geographica [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]:
      </p>
      <p>Hotspots of Greece (HG). The HG dataset contains the
location and the acquisition time of detected res for
each re season as produced by the National
Observatory of Athens19 after processing appropriate satellite
images.</p>
      <p>CORINE Land Cover of Greece (CLCG). The Corine
Land Cover project20 is an activity of the European
Environment Agency that provides data regarding the
land cover of European countries. The CLCG is a
subset of the dataset that contains all the available
information about Greece.</p>
      <p>Greek Administrative Geography (GAG). The GAG
dataset contains an ontology that describes the
administrative divisions of Greece (prefectures,
municipalities, districts, etc.) which has been populated with
relevant data that is publicly available in the Greek
open government data portal21.</p>
      <p>With Silk, the above scenario can be translated into two
interlinking tasks between the HG and the CLCG and the
GAG datasets respectively. In these tasks, Silk discovers
the spatiotemporal relation intersects between the datasets.
For this, it rst applies a CRS transformation to
normalize the heterogeneous geometries and a Period and a
TimeZone transformation to normalize the heterogeneous time
elements of the datasets. Finally, it populates the
threedimensional spatiotemporal blocks and discovers the
relation intersects within each one of them. Thus, if a hotspot
threatens a municipality or a land cover area, then the
output of this procedure will contain the respective entities from
the HG, GAG and CLCG datasets, interlinked with an
appropriate predicate (e.g., hg:threatens).
6.4</p>
    </sec>
    <sec id="sec-21">
      <title>Parameters sbf and tbf</title>
      <p>As we have discussed in Section 4, sbf and tbf adjust the
area and the length of the blocks in which we divide the
space and the time respectively. The bigger they get, the
more and smaller blocks are created.</p>
      <p>The optimal value for sbf depends on the distribution and
the size of the geometries which are not known in advance
in the case of our scenario. On the other hand, given the
19http://www.noa.gr
20http://www.eea.europa.eu/publications/COR0-landcover
21http://geodata.gov.gr
metadata of the datasets, we can easily compute the optimal
value for tbf . The temporal resolution of HG dataset is 15
minutes22, while the one of GAG and CLCG is a multiple of
a year (e.g., 5 years). By constructing blocks with temporal
dimension equal to 1 year, we can guarantee that, for all the
pairs of entities of the datasets HG-GAG and HG-CLCG
respectively which belong to the same block, the temporal
relation intersects holds by design (i.e., we do not have to
check it explicitly between each pair). Hence, the optimal
value for tbf is 1 year.
6.5</p>
    </sec>
    <sec id="sec-22">
      <title>Experiment 1:</title>
    </sec>
    <sec id="sec-23">
      <title>Blocking Factor</title>
    </sec>
    <sec id="sec-24">
      <title>Adjusting the Spatial</title>
      <p>Since we know the optimal value for tbf , in this
experiment we try to approximate the optimal value for sbf . In
order to achieve this, we analyze the performance of the
single machine implementation of Silk with respect to di erent
values of sbf . Particularly for this experiment, we use the
full datasets of the scenario and we measure separately the
computation times of HG-GAG and HG-CLCG.
Furthermore, we count the total number of discovered links in each
of these executions.</p>
      <p>The graph of Figure 4(a) summarizes the results of this
experiment. When sbf takes values close to 0, the blocks
are spanning big surfaces of the earth. Hence, most of the
geometries of the datasets are inserted in the same block,
making the link discovery procedure almost as time
consuming as the computation of the Cartesian product.</p>
      <p>As sbf gets bigger, Silk seems to perform better. However,
this improvement continues until a certain value (value 10)
and then, as the sbf increases, the computation time
deteriorates. This is due to the fact that a big value for sbf causes
the division of space into very small blocks and thus each
geometry is inserted into a big number of them. If two
geometries are inserted into multiple blocks, then the check of
the spatial relation is performed independently in each block
that they appear. Hence, in this case we have redundant
checks of the same relation that decrease the performance
of Silk.</p>
      <p>Another useful outcome from this experiment is the
comparison of the time consumed for the link discovery task
between HG-GAG and HG-CLCG. Figure 4(a) shows that
the HG-CLCG interlinking takes orders of magnitude more
than the respective HG-GAG. In Figure 3(b) we observe that
the CLCG dataset has more geometries than GAG, whereas
GAG has more complex ones (with respect to the number of
points and not necessarily to their shape). Hence, in cases
of spatial relations like intersects, the bottleneck is the
number and not the complexity of the geometries.</p>
      <p>One nal observation from this experiment is the number
22METEOSAT Second Generation (MSG) satellites return
images every 15 minutes.</p>
      <p>Links
HG-GAG
HG-CLCG
80000
79000
of discovered links. This number remains the same
independently from the value of sbf . This is expected, since
we have already proven in Section 4 that the Blocking
technique that we propose does not a ect the accuracy of the
discovered links which remains 100%.
6.6</p>
    </sec>
    <sec id="sec-25">
      <title>Experiment 2: Adjusting the Entities per</title>
    </sec>
    <sec id="sec-26">
      <title>Dataset</title>
      <p>In the second experiment we analyze the performance of
three variants of Silk and Strabon with respect to di
erent number of entities per dataset. For that reason, we use
four di erent subsets of the datasets of the scenario23 and
we measure the total execution time for performing the
interlinking tasks i.e., we don't distinguish between HG-GAG
and HG-CLCG.</p>
      <p>The rst variant (Silk (Baseline)), which we consider as
baseline, computes the full Cartesian product of the entities
and then it checks if the spatiotemporal relation intersects
holds between them. The second variant (Silk (Best sbf)),
utilizes the Blocking technique with the best sbf , as the
latter occurred from the previous experiment. The third
one (Silk (MR)), is the distributed variant of Silk, which
also utilizes the Blocking technique with the best sbf . In
the case of Strabon, the datasets are stored locally and a
CONSTRUCT query like the one we described in Figure 3(a) is
performed.</p>
      <p>The results of this experiment can be seen in Figure 4(b).
As we can see from the graph, Silk (Baseline) is the most
ine cient implementation, since even for the 1000 entities
per dataset it is more time consuming that all the other
implementations are for the full datasets.</p>
      <p>On the other hand, Strabon seems to be faster for small
number of entities per dataset whereas Silk (Best sbf) is
faster when interlinking the full datasets. This happens
because Silk fully utilizes the cores of the running machine by
assigning the workload of each block it creates into a new
thread. For big datasets, where the total workload is big
enough, the Blocking approach of Silk seems to be the most
e cient.</p>
      <p>The e ect of the massive parallelization is more
remarkable with the distributed variant of Silk (Silk (MR)). With
23The GAG dataset contains less than 1000 entities
(Figure 3(b)) and thus for the third measurement of the
experiment (Figure 4(b)) we used the full dataset.
10000
)
s
nd 1000
o
c
e
s
(em 100
i
T
10
1</p>
      <p>Links
Silk (Baseline)
Silk (Best sbf)
Strabon</p>
      <p>Silk (MR)
10</p>
      <p>100 1000*
Entities per dataset</p>
      <p>all
(b)
10000
1000
100
10
1
s
k
n
lif
o
#
Silk (MR) the total workload is divided into di erent
machines and in each machine it is divided into di erent cores.
Also, we observe that, the computation time of Hadoop for
small datasets is negligible with respect to the initialization
time and the time consumed to copy the data from the local
le system to the Hadoop Distributed File System (HDFS)
and vice versa. Hence, Silk (MR) outperforms the other Silk
variants and Strabon only for the measurement with the full
datasets. Also, we can observe from the graph that it has
the best scaling factor. Hence, we can claim that, for large
datasets it is the only viable solution.
7.</p>
    </sec>
    <sec id="sec-27">
      <title>CONCLUSIONS AND FUTURE WORK</title>
      <p>In this paper, we proposed new methods for Spatial and
Temporal Link Discovery and provided the rst
implementation of our techniques based on the well-known
framework Silk. Silk, enhanced with the new features, allows
data publishers to generate a wide variety of spatial,
temporal and spatiotemporal relations between their data and
other Linked Open Data, dealing e ectively with the
common heterogeneity issues of such data. Furthermore, we
experimentally evaluated our implementation by using it in a
real-world scenario, with datasets that comprise rich spatial
and temporal information, and demonstrate that it
discovers accurately all the existing links in a time e cient and
scalable way.</p>
      <p>Future work concentrates on extending Silk with more
spatial and temporal relations. These relations will be based
on algebras and calculi that appear frequently in the
relevant bibliography and are useful for speci c use cases. We
will also examine how we can estimate the optimal value of
the blocking factor, for both the spatial and the temporal
dimension, by posing preprocessing queries on the datasets
that we want to interlink. Finally, we will try approximate
blocking techniques which are more e cient than the one
we propose but not 100% accurate.
8.</p>
    </sec>
    <sec id="sec-28">
      <title>ACKNOWLEDGEMENTS</title>
      <p>This work has been partially funded by the FP7 projects
LEO (611141) (http://linkedeodata.eu) and MELODIES
(603525) (http://www.melodiesproject.eu). We also want
to thank Herve Caumont and Cesare Rossi from Terradue
(http://www.terradue.com) for helping us conduct our
experiments to a cloud environment.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J. F.</given-names>
            <surname>Allen</surname>
          </string-name>
          .
          <article-title>Maintaining knowledge about temporal intervals</article-title>
          .
          <source>Commun. ACM</source>
          ,
          <volume>26</volume>
          (
          <issue>11</issue>
          ):
          <volume>832</volume>
          {
          <fpage>843</fpage>
          ,
          <string-name>
            <surname>Nov</surname>
          </string-name>
          .
          <year>1983</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>S.</given-names>
            <surname>Auer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lehmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.-C. N.</given-names>
            <surname>Ngomo</surname>
          </string-name>
          ,
          <article-title>and</article-title>
          <string-name>
            <given-names>A.</given-names>
            <surname>Zaveri</surname>
          </string-name>
          .
          <article-title>Introduction to Linked Data and Its Lifecycle on the Web</article-title>
          . In S. Rudolph, G. Gottlob,
          <string-name>
            <surname>I. Horrocks</surname>
          </string-name>
          , and F. van Harmelen, editors,
          <source>Reasoning Web</source>
          , volume
          <volume>8067</volume>
          of Lecture Notes in Computer Science, pages
          <fpage>1</fpage>
          <lpage>{</lpage>
          90. Springer,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>K.</given-names>
            <surname>Bereta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Smeros</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Koubarakis</surname>
          </string-name>
          .
          <article-title>Representation and querying of valid time of triples in linked geospatial data</article-title>
          . In P. Cimiano,
          <string-name>
            <given-names>O.</given-names>
            <surname>Corcho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Presutti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Hollink</surname>
          </string-name>
          , and S. Rudolph, editors,
          <source>The Semantic Web: Semantics and Big Data</source>
          , volume
          <volume>7882</volume>
          of Lecture Notes in Computer Science, pages
          <volume>259</volume>
          {
          <fpage>274</fpage>
          . Springer Berlin Heidelberg,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>C.</given-names>
            <surname>Bizer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Heath</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Berners-Lee</surname>
          </string-name>
          .
          <article-title>Linked Data - The Story So Far</article-title>
          .
          <source>International Journal on Semantic Web and Information Systems</source>
          ,
          <volume>5</volume>
          (
          <issue>3</issue>
          ):1{
          <fpage>22</fpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>E.</given-names>
            <surname>Clementini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Di Felice</surname>
          </string-name>
          , and P. van
          <string-name>
            <surname>Oosterom</surname>
          </string-name>
          .
          <article-title>A small set of formal topological relationships suitable for end-user interaction</article-title>
          . In D. Abel and
          <string-name>
            <given-names>B.</given-names>
            <surname>Chin</surname>
          </string-name>
          Ooi, editors,
          <source>Advances in Spatial Databases</source>
          , volume
          <volume>692</volume>
          of Lecture Notes in Computer Science, pages
          <volume>277</volume>
          {
          <fpage>295</fpage>
          . Springer Berlin Heidelberg,
          <year>1993</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Egenhofer</surname>
          </string-name>
          .
          <article-title>A formal de nition of binary topological relationships</article-title>
          . In W. Litwin and H.-J. Schek, editors,
          <source>Foundations of Data Organization and Algorithms</source>
          , volume
          <volume>367</volume>
          of Lecture Notes in Computer Science, pages
          <volume>457</volume>
          {
          <fpage>472</fpage>
          . Springer Berlin Heidelberg,
          <year>1989</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Gantner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Westphal</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Woel . GQR -</surname>
          </string-name>
          <article-title>A Fast Reasoner for Binary Qualitative Constraint Calculi</article-title>
          .
          <source>In AAAI Workshop on Spatial and Temporal Reasoning</source>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>G.</given-names>
            <surname>Garbis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kyzirakos</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Koubarakis</surname>
          </string-name>
          .
          <article-title>Geographica: A benchmark for geospatial rdf stores (long version)</article-title>
          .
          <source>In The Semantic Web{ISWC</source>
          <year>2013</year>
          , pages
          <fpage>343</fpage>
          {
          <fpage>359</fpage>
          . Springer,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>A.</given-names>
            <surname>Gerevini</surname>
          </string-name>
          and
          <string-name>
            <given-names>B.</given-names>
            <surname>Nebel</surname>
          </string-name>
          .
          <article-title>Qualitative spatio-temporal reasoning with rcc-8 and allen's interval calculus: Computational complexity</article-title>
          .
          <source>In ECAI</source>
          , volume
          <volume>2</volume>
          , pages
          <fpage>312</fpage>
          {
          <fpage>316</fpage>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>S.</given-names>
            <surname>Giannakopoulou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Nikolaou</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Koubarakis</surname>
          </string-name>
          .
          <article-title>A reasoner for the RCC-5 and RCC-8 calculi extended with constants</article-title>
          . In C. E. Brodley and P. Stone, editors,
          <source>Proceedings of the 28th AAAI Conference</source>
          , Quebec, Canada., pages
          <volume>2659</volume>
          {
          <fpage>2665</fpage>
          . AAAI Press,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>R.</given-names>
            <surname>Goyal</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Egenhofer</surname>
          </string-name>
          .
          <article-title>Cardinal Directions Between Extended Spatial Objects</article-title>
          .
          <source>IEEE Trans. on Data and Knowledge Engineering</source>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>O.</given-names>
            <surname>Hassanzadeh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kementsietsidis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Lim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. J.</given-names>
            <surname>Miller</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Wang</surname>
          </string-name>
          .
          <article-title>A framework for semantic link discovery over relational data</article-title>
          .
          <source>In Proceedings of the 18th ACM conference on Information and knowledge management</source>
          , pages
          <volume>1027</volume>
          {
          <fpage>1036</fpage>
          . ACM,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>R.</given-names>
            <surname>Isele</surname>
          </string-name>
          and
          <string-name>
            <given-names>C.</given-names>
            <surname>Bizer</surname>
          </string-name>
          .
          <article-title>Active learning of expressive linkage rules using genetic programming</article-title>
          .
          <source>Web Semantics: Science, Services and Agents on the World Wide Web</source>
          ,
          <volume>23</volume>
          :2{
          <fpage>15</fpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>R.</given-names>
            <surname>Isele</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Jentzsch</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Bizer</surname>
          </string-name>
          .
          <article-title>E cient multidimensional blocking for link discovery without losing recall</article-title>
          .
          <source>In WebDB</source>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>M.</given-names>
            <surname>Koubarakis</surname>
          </string-name>
          .
          <article-title>Linked Open Earth Observation Data: The LEO Project</article-title>
          .
          <source>In Image Information Mining Conference: The Sentinels Era. ESA-EUSC-JRC</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>M.</given-names>
            <surname>Koubarakis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Kontoes</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Manegold</surname>
          </string-name>
          .
          <article-title>Real-time wild re monitoring using scienti c database and linked data technologies</article-title>
          .
          <source>In Proceedings of the 16th International Conference on Extending Database Technology</source>
          , pages
          <volume>649</volume>
          {
          <fpage>660</fpage>
          . ACM,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>M.</given-names>
            <surname>Koubarakis</surname>
          </string-name>
          and
          <string-name>
            <given-names>K.</given-names>
            <surname>Kyzirakos</surname>
          </string-name>
          .
          <article-title>Modeling and querying metadata in the semantic sensor web: The model stRDF and the query language stSPARQL</article-title>
          .
          <source>In ESWC</source>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>K.</given-names>
            <surname>Kyzirakos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Karpathiotakis</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Koubarakis</surname>
          </string-name>
          .
          <article-title>Strabon: a semantic geospatial dbms</article-title>
          .
          <source>In The Semantic Web{ISWC</source>
          <year>2012</year>
          , pages
          <fpage>295</fpage>
          {
          <fpage>311</fpage>
          . Springer,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>M.</given-names>
            <surname>Nentwig</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hartung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.-C. N.</given-names>
            <surname>Ngomo</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E.</given-names>
            <surname>Rahm</surname>
          </string-name>
          .
          <article-title>A survey of current link discovery frameworks</article-title>
          .
          <source>Semantic Web Journal</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>A.-C. N.</given-names>
            <surname>Ngomo</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Auer</surname>
          </string-name>
          .
          <article-title>Limes: A time-e cient approach for large-scale link discovery on the web of data</article-title>
          .
          <source>In Proceedings of the 22nd International Joint Conference on Arti cial Intelligence - Volume 3, IJCAI'11</source>
          , pages
          <fpage>2312</fpage>
          {
          <fpage>2317</fpage>
          . AAAI Press,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <surname>A.-C. Ngonga Ngomo</surname>
          </string-name>
          .
          <article-title>Orchid - reductionratio-optimal computation of geo-spatial distances for link discovery</article-title>
          .
          <source>In Proceedings of ISWC</source>
          <year>2013</year>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <surname>OGC. GeoSPARQL -</surname>
          </string-name>
          <article-title>A geographic query language for RDF data</article-title>
          ,
          <year>November 2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>G.</given-names>
            <surname>Papadakis</surname>
          </string-name>
          , E. Ioannou,
          <string-name>
            <given-names>T.</given-names>
            <surname>Palpanas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Niederee</surname>
          </string-name>
          , and
          <string-name>
            <given-names>W.</given-names>
            <surname>Nejdl</surname>
          </string-name>
          .
          <article-title>A blocking framework for entity resolution in highly heterogeneous information spaces. Knowledge and Data Engineering</article-title>
          , IEEE Transactions on,
          <volume>25</volume>
          (
          <issue>12</issue>
          ):
          <volume>2665</volume>
          {
          <fpage>2682</fpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>D. A.</given-names>
            <surname>Randell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Cui</surname>
          </string-name>
          ,
          <article-title>and</article-title>
          <string-name>
            <given-names>A. G.</given-names>
            <surname>Cohn</surname>
          </string-name>
          .
          <article-title>A spatial logic based on regions and connection</article-title>
          .
          <source>In KR</source>
          , pages
          <volume>165</volume>
          {
          <fpage>176</fpage>
          ,
          <year>1992</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>J.</given-names>
            <surname>Salas</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Harth</surname>
          </string-name>
          .
          <article-title>Finding spatial equivalences accross multiple RDF datasets</article-title>
          .
          <source>In Proceedings of the Terra Cognita Workshop on Foundations, Technologies and Applications of the Geospatial Web</source>
          , pages
          <volume>114</volume>
          {
          <fpage>126</fpage>
          .
          <string-name>
            <surname>Citeseer</surname>
          </string-name>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>V.</given-names>
            <surname>Sehgal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Getoor</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P. D.</given-names>
            <surname>Viechnicki</surname>
          </string-name>
          .
          <article-title>Entity resolution in geospatial data integration</article-title>
          .
          <source>In Proceedings of the 14th annual ACM international symposium on Advances in geographic information systems</source>
          , pages
          <volume>83</volume>
          {
          <fpage>90</fpage>
          . ACM,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>S.</given-names>
            <surname>Skiadopoulos</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Koubarakis</surname>
          </string-name>
          .
          <article-title>Composing cardinal direction relations</article-title>
          .
          <source>Arti cial Intelligence</source>
          ,
          <volume>152</volume>
          (
          <issue>2</issue>
          ):
          <volume>143</volume>
          {
          <fpage>171</fpage>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>P.</given-names>
            <surname>Smeros</surname>
          </string-name>
          .
          <article-title>Discovering Spatial and Temporal Links among RDF Data</article-title>
          . In M. Koubarakis, editor,
          <source>Master Thesis</source>
          . National and Kapodistrian University of Athens,
          <year>2014</year>
          . Available from: http://openarchives.gr/view/2547590.
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>K.</given-names>
            <surname>Taylor</surname>
          </string-name>
          and E. Parsons.
          <article-title>Where is everywhere: Bringing location to the web</article-title>
          .
          <source>Internet Computing</source>
          , IEEE,
          <volume>19</volume>
          (
          <issue>2</issue>
          ):
          <volume>83</volume>
          {
          <fpage>87</fpage>
          ,
          <string-name>
            <surname>Mar</surname>
          </string-name>
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>L. M.</given-names>
            <surname>Vilches-Blazquez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Saquicela</surname>
          </string-name>
          , and
          <string-name>
            <given-names>O.</given-names>
            <surname>Corcho</surname>
          </string-name>
          .
          <article-title>Interlinking geospatial information in the web of data</article-title>
          .
          <source>In Bridging the Geographic Information Sciences</source>
          , pages
          <volume>119</volume>
          {
          <fpage>139</fpage>
          . Springer,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>