<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Serving Ireland's Geospatial Information as Linked Data</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Christophe Debruyne</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Éamonn Clinton</string-name>
          <email>eamonn.clinton@osi.ie</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lorraine McNerney</string-name>
          <email>lorraine.mcnerney@osi.ie</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Atul Nautiyal</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Declan O'Sullivan</string-name>
          <email>declan.osullivan@scss.tcd.ie</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>The ADAPT Centre for Digital Content Technology, Trinity College Dublin</institution>
          ,
          <addr-line>Dublin 2</addr-line>
          ,
          <country country="IE">Ireland</country>
          ;
          <institution>Ordnance Survey Ireland</institution>
          ,
          <addr-line>Phoenix Park, Dublin 8</addr-line>
          ,
          <country country="IE">Ireland</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>We present data.geohive.ie, which aims to provide an authoritative platform for serving Ireland's national geospatial data, including Linked Data. Currently, the platform provides information on Irish administrative boundaries and was designed to support two use cases: serving boundary data of geographic features at various levels of detail and capturing the evolution of administrative boundaries. We report on the decisions taken for modeling and serving the data, such as the adoption of an appropriate URI strategy, the development of the necessary ontologies, and the use of (named) graphs to support the aforementioned use cases.</p>
      </abstract>
      <kwd-group>
        <kwd>Geospatial Data</kwd>
        <kwd>Linked Data</kwd>
        <kwd>Ontology Engineering</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>have been made available by OSi's Open Data release,<sup>3</sup> taking into account two use
cases: (i) providing different “generalizations” (i.e. different levels of detail) of the
boundaries and (ii) capturing the evolution of boundaries, e.g. as ordered by Statutory
Instruments. The main contributions of this paper are the decisions made for capturing
and representing the aforementioned information in RDF.</p>
      <p>1 https://www.oracle.com/database/spatial/</p>
      <p>2 Some of this data is also available via the Irish Government portal data.gov.ie, but is
currently not available as Linked Data. A further goal is to investigate the feasibility
of having a portal refer to OSi’s data rather than hosting (and therefore also duplicating) it.</p>
    </sec>
    <sec id="sec-2">
      <title>Approach and Implementation</title>
      <p>
        With Prime2, OSi adopted an object-oriented model for capturing geospatial data. In
this model, a clear distinction is made between a geographical object (identified by a
GUID) and its representations. The distinction between objects and their
representations is argued to be important [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], although the literature uses the terms geographic features and
geometries. The geometry of a feature can evolve over time, and these
changes do not have an impact on the feature itself. In other words, the geometry of a
feature is “merely” an attribute. Prime2 drove some of the design decisions we made
for the development of the Linked Data platform.
      </p>
      <sec id="sec-2-1">
        <title>URI Strategy</title>
        <p>Unlike datasets that have been created at a specific time and for a specific purpose,
such as CENSUS data, OSi’s geospatial information is not static in nature; it therefore does not
make sense to include variables such as a creation date in the URIs of resources.
Since each object is assigned a GUID, these can be used to create opaque URIs. We
have, however, decided to include the type of geographical feature in the URI so as to
give developers and consumers some idea of the nature of the entity the URI
refers to. This does not pose a problem, as Prime2 prescribes that features that
change type (e.g. a convent becoming a hospital) are considered new features
and are therefore assigned new GUIDs.</p>
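        <p>As a minimal sketch of the URI strategy described above, the following Python function combines a feature type with a GUID. The base URI, the function name feature_uri and the way the type label is turned into a path segment are assumptions made for illustration; they are not the URI pattern actually used by the platform.</p>
        <preformat>
```python
from urllib.parse import quote
from uuid import UUID

# Hypothetical base URI, used for illustration only.
BASE_URI = "http://data.example.org/resource"


def feature_uri(feature_type: str, guid: str) -> str:
    """Build a URI from a feature type and the GUID of the object."""
    # Turn the type label into a URL-safe path segment,
    # e.g. "County Council" becomes "county-council".
    type_segment = quote("-".join(feature_type.lower().split()))
    # UUID() raises ValueError for a malformed GUID and
    # normalizes letter case and surrounding braces.
    guid_segment = str(UUID(guid))
    return f"{BASE_URI}/{type_segment}/{guid_segment}"
```
        </preformat>
        <p>Because a feature that changes type receives a new GUID, both the type segment and the GUID segment change together in such a scheme, so a URI minted this way never needs to be rewritten.</p>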
      </sec>
      <sec id="sec-2-2">
        <title>Describing the Features and their Geometries</title>
        <p>
          Since we have not found suitable ontologies for appropriately annotating the different
boundaries (11 types in total: Baronies, Counties, County Councils, etc.), we decided
to create a new ontology<sup>4</sup> that extends GeoSPARQL.<sup>5</sup> Ryan et al. noted some
differences between concepts related to Ireland's geographic features and their counterparts in Linked Data
datasets such as DBpedia and GeoNames [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. Other problems include distinct
definitions for “town lands” and “counties” and the absence of an ontology for describing
“county council”, amongst others. GeoSPARQL both provides an ontology for describing
geographical features and their geometries and defines predicates for spatial queries in
SPARQL, making it a suitable candidate for our platform. Subclasses of the concept
geo:Feature were introduced for each type of administrative boundary we serve.
        </p>
        <p>3 http://www.osi.ie/about/open-data/</p>
        <p>4 http://ontologies.geohive.ie/osi</p>
        <p>5 http://www.opengeospatial.org/standards/geosparql</p>
        <p>GeoSPARQL supports the distinction between features and geometries. Since a
geometry is an attribute of a feature in the same way a name is an attribute of a
person, we have, for the time being, chosen not to provide a URI for geometries. The
geometries of a feature thus have to be accessed via the feature with
geo:hasGeometry. Geometries are available in three levels of detail, generalized
up to 100, 50 and 20 meters, which are stored in different (named) graphs. The default
graph contains the features, their labels in English and Gaelic (whenever available), and
their representations generalized up to 100 meters (which are the coarsest and thus require the least
bandwidth). The generalizations up to 50 and 20 meters each have their own named graph.</p>
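        <p>The arrangement of the three levels of detail over the default graph and two named graphs can be sketched as a simple lookup. The graph names below are hypothetical, as is the helper graph_for; None stands for the default graph. The sketch only illustrates how a client might pick the coarsest generalization that still satisfies its tolerance.</p>
        <preformat>
```python
# Hypothetical graph names; None stands for the default graph,
# which holds the geometries generalized up to 100 meters.
GRAPH_BY_TOLERANCE_M = {
    100: None,
    50: "http://data.example.org/graph/generalised-50m",
    20: "http://data.example.org/graph/generalised-20m",
}


def graph_for(max_tolerance_m):
    """Return the graph with the coarsest generalization within the tolerance."""
    # Keep only the generalization levels that do not exceed the tolerance.
    usable = [level for level in GRAPH_BY_TOLERANCE_M if max_tolerance_m >= level]
    if not usable:
        raise ValueError(f"no generalization within {max_tolerance_m} m is served")
    # The coarsest usable level is the cheapest one to transfer.
    return GRAPH_BY_TOLERANCE_M[max(usable)]
```
        </preformat>
        <p>Under these assumptions, a client that tolerates 100 meters is served from the default graph, while a client that needs 20 meters has to address the corresponding named graph explicitly.</p>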
        <p>Finally, Prime2 captures the geometries using the Irish Transverse Mercator (ITM)
coordinate system. At an international level, however, World Geodetic System 84 (or
WGS 84) is the standard used in cartography and navigation (amongst others). As OSi
wishes to encourage the uptake of WGS 84 within Ireland, a decision was made to
serve the geometries in WGS 84 only; third parties can themselves rely on services to
transform the data between coordinate systems.</p>
      </sec>
      <sec id="sec-2-3">
        <title>Capturing the Evolution of Geometries</title>
        <p>Next, we aim to capture the evolution of boundaries. Though this is rare, administrative
boundaries can be changed by so-called Statutory Instruments.<sup>6</sup> Statutory Instruments
are available on the Web and are accessible via a URI, making it possible to relate the
evolution of boundaries to these instruments. To capture the evolution of
boundaries, we have chosen to extend PROV-O<sup>7</sup> with a new prov:Activity called
“Boundary Change”, which is informed by a new prov:Entity called “Statutory
Instrument”.<sup>8</sup> Prior versions of features and their geometries are captured in a separate
graph. Note that geometries do not have a URI, but can be discovered via the feature.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>The Platform</title>
      <p>
        Objects are stored in an Oracle Spatial and Graph instance according to the Prime2
data model. RDF graphs are created by means of several R2RML<sup>9</sup> mappings that
relate tables of the database to predicates in the aforementioned ontologies. The resulting
triples, currently 831,562 in total, are then loaded into a triplestore that supports
GeoSPARQL. To avoid an excessive load on the server, we have for now chosen to limit
access to the SPARQL endpoint and set up a Triple Pattern Fragments (TPF) server
[
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] instead. A TPF server returns a result set for simple triple patterns only; it
is up to a TPF client to compute the result of a SPARQL query. A limitation is that
TPF does not (yet) support the geospatial predicates provided by GeoSPARQL, so
users currently have no way to exploit these on the platform. The platform
furthermore hosts the boundary datasets as dumps and hosts the two aforementioned
ontologies for Irish administrative boundaries according to Linked Data principles.
      </p>
      <p>6 An example of a Statutory Instrument altering borders between counties can be found here:
http://www.irishstatutebook.ie/eli/1994/si/114/made/en/print#</p>
      <p>7 https://www.w3.org/TR/prov-o/</p>
      <p>8 http://ontologies.geohive.ie/osiprov</p>
      <p>9 https://www.w3.org/TR/r2rml/</p>
      <p>
        We presented our ongoing work on creating data.geohive.ie for serving
Ireland’s geospatial information as an authoritative Linked Data dataset on the Web. We
currently serve 11 types of administrative boundaries and focused on two use cases:
serving different levels of detail and capturing the evolution of boundaries.
      </p>
      <p>One of the main limitations of this platform is that it does not yet serve meaningful
data for the latter. This is because the boundary dataset is quite static and
prior versions of administrative boundaries have not (yet) been entered in the Prime2
data model. Although such changes can be simulated, we need access to real data to validate that
aspect of our approach. Future work described below would provide datasets to
validate our approach to capturing the evolution of geometries. Another limitation is that
users are currently unable to use the spatial predicates: Triple Pattern Fragments do not support them,
and the SPARQL endpoint is not (yet) available to the public. OSi will first
monitor the use of the platform before deciding whether the endpoint can be made available.</p>
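      <p>To illustrate why spatial predicates fall outside a Triple Pattern Fragments interface, the sketch below mimics the division of labour between server and client. The data, the prefixed names and the function names are hypothetical, and a real TPF server is an HTTP interface that returns paged fragments rather than Python lists. The server-side function answers one triple pattern at a time; the join that a SPARQL query would express is computed by the client.</p>
      <preformat>
```python
# Toy data with hypothetical identifiers; not taken from the platform.
TRIPLES = [
    ("ex:dublin", "rdf:type", "ex:County"),
    ("ex:dublin", "rdfs:label", "Dublin"),
    ("ex:cork", "rdf:type", "ex:County"),
    ("ex:cork", "rdfs:label", "Cork"),
    ("ex:fingal", "rdf:type", "ex:CountyCouncil"),
]


def fragment(s=None, p=None, o=None):
    """Server side: return the triples matching one pattern (None is a variable)."""
    pattern = (s, p, o)
    return [
        triple for triple in TRIPLES
        if all(term is None or term == value for term, value in zip(pattern, triple))
    ]


def labels_of_type(rdf_type):
    """Client side: join the patterns (?x rdf:type T) and (?x rdfs:label ?l)."""
    results = []
    for subject, _, _ in fragment(p="rdf:type", o=rdf_type):
        # One further request per binding of ?x found so far.
        for _, _, label in fragment(s=subject, p="rdfs:label"):
            results.append((subject, label))
    return results
```
      </preformat>
      <p>In this simplified picture the server only compares terms for equality, so a spatial filter over geometries has no place in the interface and would have to be evaluated by the client or by a full SPARQL endpoint.</p>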
      <p>We aim to incorporate other administrative boundaries and other types of features
in the future. For the former, we are looking at the small areas that were used for the
2011 CENSUS, which would allow us to create links with the CENSUS 2011 information published
at data.cso.ie. For the latter, we will look into the inclusion of features that are
not open and are made available via commercial licenses to certain parties, such as the
geometries of buildings. Serving these datasets as Linked (Closed) Data would
require the investigation of access control mechanisms. Approved construction works
may change the geometry of buildings, and such changes are captured in the Prime2 model. This
dataset could thus be used to validate our approach to capturing geometry evolution.</p>
      <p>Acknowledgements. The ADAPT Centre for Digital Content Technology is funded
under the SFI Research Centres Programme (Grant 13/RC/2106) and is co-funded
under the European Regional Development Fund. We furthermore would like to
acknowledge the Department of Public Expenditure and Reform (DPER) and the
Central Statistics Office (CSO) for their input as stakeholders.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1. Prime2:
          <article-title>Data Concepts and Data Model Overview</article-title>
          .
          <source>Tech. rep., Ordnance Survey Ireland</source>
          (
          <year>2014</year>
          ), http://www.osi.ie/wp-content/uploads/2015/04/Prime2-V-2.pdf
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Battle</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kolas</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Enabling the geospatial semantic web with parliament and GeoSPARQL</article-title>
          .
          <source>Semantic Web</source>
          <volume>3</volume>
          (
          <issue>4</issue>
          ),
          <fpage>355</fpage>
          -
          <lpage>370</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Bizer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Heath</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Berners-Lee</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Linked data - the story so far</article-title>
          .
          <source>Int. J. Semantic Web Inf. Syst</source>
          .
          <volume>5</volume>
          (
          <issue>3</issue>
          ),
          <fpage>1</fpage>
          -
          <lpage>22</lpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Goodwin</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dolbear</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hart</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Geographical linked data: The administrative geography of Great Britain on the semantic web</article-title>
          .
          <source>Transactions in GIS</source>
          <volume>12</volume>
          ,
          <fpage>19</fpage>
          -
          <lpage>30</lpage>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Ryan</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grant</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Carragáin</surname>
            ,
            <given-names>E.Ó.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Collins</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Decker</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lopes</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          :
          <article-title>Linked data authority records for Irish place names</article-title>
          .
          <source>Int. J. on Digital Libraries</source>
          <volume>15</volume>
          (
          <issue>2-4</issue>
          ),
          <fpage>73</fpage>
          -
          <lpage>85</lpage>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Verborgh</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vander Sande</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hartig</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Van Herwegen</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Vocht</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Meester</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Haesendonck</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Colpaert</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Triple pattern fragments: A low-cost knowledge graph interface for the Web</article-title>
          .
          <source>J. Web Sem</source>
          .
          <volume>37</volume>
          ,
          <fpage>184</fpage>
          -
          <lpage>206</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>