=Paper= {{Paper |id=None |storemode=property |title=Geographical Service: a compass for the Web of Data |pdfUrl=https://ceur-ws.org/Vol-628/ldow2010_paper15.pdf |volume=Vol-628 |dblpUrl=https://dblp.org/rec/conf/www/CorrendoSYGS10 }} ==Geographical Service: a compass for the Web of Data== https://ceur-ws.org/Vol-628/ldow2010_paper15.pdf
     Geographical Service: a compass for the Web of Data.

       Gianluca Correndo, Manuel Salvadores, Yang Yang, Nicholas Gibbins, Nigel Shadbolt
                                            Intelligence, Agents, Multimedia (IAM) Group
                                            School of Electronics and Computer Science
                                                           Southampton, UK
                                  {gc3, ms8, yy1402, nmg, nrs}@ecs.soton.ac.uk



ABSTRACT                                                            lishing Public Sector Information (PSI), adopting Linked
This paper describes a Linked Data service that supports            Data tenets as future best practices. Data sets recently de-
the navigation and retrieval of geographical entities for the       livered to the public include: government expenses, NHS
UK territory. Geographical entities, in the scope of this           trusts’ performances, public transportation, and a whole set
paper, are linked data resources that describe objects that         of statistics about crime, mortality, census, environment,
have a geographical extension. The service presented in this        school and social indicators. Some of the data sets men-
paper allows the querying of resources that contain or are          tioned have been published already in Linked Data format,
contained by a given entity URI. The recent publication of          others have been translated within the EnAKTing project,
UK Public Sector Information (PSI) data sets has brought            and many others are waiting to be freed in the LOD cloud.
to the attention of the community the redundant presence               Such a prolific inflow of Linked Data poses new questions
of location based context. At the same time it stresses the         and challenges to the community of researchers and develop-
inadequacy of current Linked Data services for exploiting           ers: how is it possible to integrate such different information
the semantics of such contextual dimensions for easing en-          into a meaningful schema? How is it possible to exploit the
tity retrieval and browsing. We present an approach for a           little semantics that goes a long way? How do we choreo-
geography based service that helps in querying qualitative          graph the publishing activity of separate organizations from
spatial relations for the UK geography (proper containment          the public sector? A common trait of PSI seems to be its
so far). We also provide an exploitation scenario based on          locality: local and national public organisations are in fact
a backlinking service and PSI Open Linked Data, published           mainly concerned with the collection of data about their
within the EnAKTing project.                                        territory, and the distribution of their resources.
                                                                       In the WoD vision, links between resources from differ-
                                                                    ent publishers are particularly important since they are the
Categories and Subject Descriptors                                  ones that allow new data to be discovered and integrated
H.3.4 [Systems and Software]: Distributed systems; H.5.4            into the current discourse. It is frequently the case that
[Web]: Navigation; H.3.5 [Online Information Services]:             different URIs are used to refer to the same things, moti-
Web-based services                                                  vating the use of co-reference services for the resolution of
                                                                    instance equivalences. Knowledge of this type of relation-
General Terms                                                       ship increases the potential for reuse since information from
                                                                    previously unknown sources is now accessible, and makes the
Linked Data, geographical services
                                                                    problem of co-reference resolution of primary importance [9].
                                                                    In any case, we can expect more and more of this linking
Keywords                                                            data to be made available as the number of Linked Data
Linked Data, geographical reasoning, Web of Data                    publishers increases.
                                                                       The publication of an authoritative geography of the UK,
1.   INTRODUCTION                                                   (its regions, counties, districts and their connections) by
                                                                    Ordnance Survey (the national mapping agency for Great
   The Linked Data Initiative represents the first collabora-       Britain, OS henceforth) as Linked Data, has opened inter-
tive effort to create a Web of Data (WoD henceforth) of con-        esting scenarios for exploiting semantics in contextualising
siderable scale, providing few, simple guidelines for publish-      the information sources published on data.gov.uk. The ge-
ing content using well established standards [3]. Such guide-       ographical dimensions in PSI data sets are already repre-
lines and standards are leading the way to a new paradigm of        sented, but their semantics may be lost if they are not ex-
interaction between government and citizens in the UK. In           ploited for creating new collections of data, browsing related
order to pursue better access for citizens to information held      resources, and making connections.
by local as well as national public organisations, the UK gov-         In this paper, we present a service for querying spatial
ernment has recently launched1 a public initiative for pub-         relationships for the UK (extensible to other countries when
1
 Public access to the site http://data.gov.uk was                   authoritative knowledge bases are available). We start in
granted on the 19th of January 2010.                                Section 2, where the available knowledge bases are described
Copyright is held by the author/owner(s).
                                                                    along with an introduction of the qualitative spatial rea-
LDOW2010, April 27, 2010, Raleigh, USA.                             soning supported. Section 3 provides a rationale for the
development of such a service in support of Linked Data brows-    ticularly useful in our case for reasoning are the geographic
ing and retrieval. In Section 4 the implementation of the         location relationships.
geographical service and its APIs are described. The paper
then concludes with a description of an evaluation of the
presented service using public sector information from the
UK government in Section 5 and some concluding notes in
Section 6.

2.   BACKGROUND                                                   Figure 1: RCC Eight Jointly Exhaustive and Pairwise
   The World Wide Web and the WoD can both be under-              Disjoint Relations
stood as hypertext systems, where the general purpose of
the hypertext system is for information discovery by navi-           Within the Linked Data context, there are several ser-
gation. Providing reasoning over hyperlinks for the purpose       vices providing resolvable URIs for geographic locations.
of navigation can benefit information discovery. In 1990,         GeoNames2, for example, is a community based service that
Nanard brought the concept of “semantic network” from Ar-         provides geographical representation of geographical entities
tificial Intelligence [16] into the hypertext field by creating   covering all countries worldwide and manages eight million
a Conceptual Hypertext System [13], in which a hyperlink          URIs for geographical resources. As a further example, the
can be reasoned by using a domain model classification. In        national mapping agency of Great Britain, Ordnance Sur-
the above system, typed links and typed chunks are used           vey, maintains a continuously updated database of the to-
to define relationship between types in order to incorporate      pography of Great Britain3 and is responsible for surveying
knowledge into a hypertext. This domain model classifica-         the boundaries of the administrative areas.
tion is used to classify the documents and documents that            In this paper, we exploited the Administrative Geogra-
share metadata, and which are deemed to be similar in some        phy ontology provided by Ordnance Survey as an author-
way. The Conceptual Open Hypermedia Service (COHSE)               itative knowledge base for querying the UK geographical
project [4] later took this approach forward by providing on-     structure [10]. Such ontology explicitly represents the mere-
tological reasoning based on links of services to bridge the      ological relationships within the administrative hierarchy, as
navigation gap between the Web and Linked Data, where             well as topologically representing the boundary information
the link services provided a mapping between concepts and         between administrative units at the same hierarchical level.
the lexical labels on the web page.                               The following depicts the class hierarchy created in the ad-
   Many of the PSI data sets published so far can be plotted      ministrative ontology from Ordnance Survey:
within a spatial and temporal dimension, in other words, all
data can be linked together by its spatial and temporal in-           • CivilAdministrativeArea
dexes. Within this context, the need to provide services
to reason the spatial and temporal aspects of the linked                  – EuropeanRegion
data is of key importance. This is unsurprising, the spa-                 – Country
tial and temporal reasoning have always been considered
to be an important part of common-sense reasoning in Ar-                  – UnitaryAuthority
tificial Intelligence. In this section, we will mainly focus              – MetropolitanDistrict
on qualitative spatial representation and reasoning. There                – GreaterLondonAuthority
are two major approaches to qualitative spatial represen-
tation - point based and region based [6]. Region based                   – LondonBorough
approaches, such as Topology [7] which describe relation-                 – District
ships between spatial regions are more intuitive than point
                                                                          – CivilParish
based approaches. The commonly known approaches for for-
malizing topological properties of spatial regions are based              – Community
on work from Whitehead [17] and Clarke [5] who axiom-
atized mereotopologies (a theory that combines mereology              • Country
and topology) using a single primitive relation and binary
connectivity relationships. By using these primitive rela-           The topological relations adopted by this ontology were
tions, other relations can be defined. The Region Con-            taken from the RCC8 and correspond to the properties NTP-
nection Calculus (RCC8) proposed by Randell, Cui and              Pi, TPPi, EC and EQ respectively. The topology of admin-
Cohn[14] defines a set of jointly exhaustive and pairwise         istrative geography of Great Britain contains no overlapping
disjoint relations DC, EC, PO, EQ, TPP, NTPP, TPPi and            regions; therefore, the PO relation was not required. Later
NTPPi, as illustrated in Figure 1, and is the most well-          versions of the ontology reported overlapping entities as well.
known approach in the domain. Since the RCC Calculus is           The property of spatial containment used in the OS ontol-
expressed in first-order predicate calculus, a wide range of      ogy (equivalent to the NTPP(i) and TPP(i) relations in Fig-
theorem provers can be used for reasoning. For instance,          ure 1) implies a mereological relationship. For instance, if
given a fixed vocabulary of relations, Ri, and given R1(x,y) and  Hampshire spatially contains Fareham, then Fareham is a
R2(y,z), one can answer questions about the possible rela-        part of Hampshire.
tions (from the set Ri) that can hold between x and z by          2
                                                                   http://www.geonames.org last accessed 10/02/2010
looking up the composition table [8]. Although general 1st-       3
                                                                   With the exception of Northern Ireland that is covered by
order theorem proving is too inefficient to be useful for many    a different agency, the Land and Property Services Northern
purposes [11], it is relatively simple to implement and par-      Ireland.
   Dereferenceable URIs adopted by the Linked Data community
inherit the properties of hyperlinks in the Web hypertext system,
including (among others) uni-directionality. The problem with
this kind of link is that it is not possible to navigate back to
the original resource by using the dereferencing mechanism alone.
This problem becomes even more relevant when URIs from previous
authoritative data sets are reused in order to provide context
and meaning to new data. It is in fact possible to browse from
the new data to the old, but not the other way around. The
back-linking service4 we have implemented for UK public sector
information supports the discovery of back-links between
datasets. The benefit of a back-linking service is that it
enables users to discover, from a single dataset, other datasets
which reference back to it, thereby creating data linkage
opportunities between datasets, increasing the recall of valuable
data sources, and doubling the network effect [15], which
increases even more when co-reference systems are employed.
   In this paper, we will mainly focus on exploring the
possibility of exploiting semantics from authoritative knowledge
bases to provide support for consuming Linked Data resources. The
service provided will allow users to retrieve contained (and
container) entity URIs from popular data sets by exploiting a
co-reference service. Moreover, a back-linking service, which we
previously created in the EnAKTing project5, will allow us to
retrieve the information resources that addressed such URIs. Far
from trying to provide a general purpose reasoner for
geographical entities, the aim of the service described in the
following sections is to exploit the semantically rich knowledge
base for UK geography in order to ease users’ navigation through
the published PSI data sets. Similar capabilities were already
provided by DBpedia Mobile [2], an application that retrieved
DBpedia entries mashed up on a map based on users’ geographical
coordinates. The results provided by our service, however, are
based on a spatial subdivision of the territory, a subdivision
that is already used by public sector organizations to classify
their data (e.g. crime statistics are based on a police based
subdivision of the territory, while MPs’ activities are related
to the constituency they were voted in).

3.     MOTIVATION
   The Linked Data principles [3] promote a Web of Data whose
architecture is inherently decentralised, relying on data already
published (when available) in order to give semantics and context
to new data. The growth the WoD has experienced over recent years
relies on the simplicity of publishing and linking data. However,
up to now a semantically coherent orchestration of data
publishing is still a mirage. Relying purely on data linkage for
the discovery and browsing of linked data resources would lead to
a serious knot to untie in the near future. The use of ontologies
and powerful ontology languages in publishing Linked Data will be
an effort that must be justified against a scenario where such
explicit semantics are rarely exploited.
   In publishing UK Public Sector Information (UK PSI), we have
identified an issue concerning data accessibility and
navigability that relates in particular to the missing
exploitation of semantics (in this case about the qualitative
spatial description of geographical entities). In this paper we
present a solution to overcome this issue that soundly enhances
data retrieval and browsing when geographical dimensions are
involved.
   The issue is about the usage of geographical entities for
contextualising local information (i.e. information that is
related to a particular geographical location, for example the
population of a region, the MPs of a constituency, or various
statistical data based on territory). In publishing this kind of
information, we provided alignments of our data (at least for the
geographical dimensions represented in the data) to authoritative
knowledge bases using co-reference systems [9]. The problem we
have to deal with originates from the fact that, since the public
sector information published originated from different sectors of
UK government, the spatial classifications used were highly
heterogeneous, ranging from local parishes to counties and up to
European regions (e.g. South East of England). The different
granularities used to classify the data mean, in Linked Data
terms, that related information sources link to different URIs.
Some data may in fact be relevant for constituencies, while
others may use a different granularity (by county for example),
and the URI of a county is obviously different from the set of
URIs of all its constituencies. Available knowledge bases about
the geographical or administrative subdivision of a territory can
be exploited to cover such a gap in data granularity.

[Figure 2 shows The County of Hampshire (os:7000000000017765);
items from http://mortality.psi.enakting.org
(mortality:ds_1_299_1) and http://crime.psi.enakting.org
(crime:ds1_37_1) linked via scovo:dimension to mortality:Hampshire
and crime:Hampshire; and resources from
http://parliament.psi.enakting.org linked via dc:coverage:
parliament:member/10395 to parliament:cons-426 (Winchester),
parliament:member/101 to parliament:cons-203 (Eastleigh) and
parliament:member/11884 to parliament:cons-228 (Fareham). Legend:
owl:sameAs, contained_in, resource accessible, resource
inaccessible.]

Figure 2: Resource irretrievable via geographical gap
4
    http://backlinks.psi.enakting.org
5
    http://enakting.org                                              Taking as an example the PSI data sets published re-
cently6 , we adopted the Ordnance Survey administrative            ready partially aligned. The integration of different knowl-
ontology in order to provide context to our data items (i.e.       edge bases could lead to the possible exploitation of such
SCOVO items instances7 and local governmental data). The           alignments in order to bridge data sets and reuse the avail-
SCOVO ontology allows us to describe statistical data as a         able knowledge in more than one context.
collection of Items where each item describes a statistical
value (i.e. a single cell in a multidimensional table) along       4.    GEOGRAPHICAL SERVICE FOR UK
with all the dimensions that characterise it. In the case of
UK PSI statistics, many data sets collected were related to           To support the user’s experience in browsing and discov-
geographical regions (counties, districts, etc.).                  ery of new resources in the WoD, we have developed a ge-
   In this case, users who wished to discover useful informa-      ographical service for querying the UK territory structure.
tion about their own region (e.g. the County of Hampshire,         The decision to restrict the service to the UK territory is
top of Figure 2) would start their search by brows-                mainly due to the fact that the service is primarily used
ing one of its available URIs. The OS URI for such a geo-          to support the discovery of UK PSI resources. Knowl-
graphical entity would be os:7000000000017765 (footnote 8), but any   edge about geographical containment is exploited here to
equivalent URI provided by a co-reference system will pro-         link information that is contextually related because of its
vide the same results, as described in the following.              spatial dimension.
Using a backlinking service for resolving the entities link-          For this use case we have implemented a service for query-
ing to the given URI for Hampshire, we are able to retrieve        ing the topological structure of the UK (from the broader entity
links to mortality statistics (mortality:ds1_299_[1...3], footnote 9)   to the more particular and the other way around) that can
and crime statistics (crime:ds1_37_[1...11], footnote 10). In Figure    be easily integrated into a web of linked data. The service,
2 those URIs are contained in boxes labelled as “accessible”,      accessible at http://geoservice.psi.enakting.org, is de-
meaning that those URIs are retrievable by following al-           signed in order to be easily integrated both into web appli-
ready existing arcs backwards. Those SCOVO data sets’ items in     cations and into linked data resources, and it follows a few basic
fact address Hampshire county as one of their dimensions. What     principles:
is missing is the further data collected that reports valuable
                                                                   Lightweight Service : The service should be easy to use
                                                                        and solve a specific problem. A geographical service
                                                                        is a component of the WoD that supports discovery
                                                                        when geographical entities are involved; it is not a
                                                                        general purpose reasoning engine.
information about regions contained in Hampshire. In particular,
within the EnAKTing project, we published linked data about the
individual constituencies too. In detail we published, for each
constituency, a historical record of the
MP in charge of that constituency, his/her voting records
and expenses. In Figure 2 those resources are contained in         Linked Data Compatible : The geographical service sho-
dotted boxes labelled as “inaccessible”, meaning that they             uld be used as a resolvable URI like any other resource,
cannot be retrieved with the existent knowledge.                       in order to be used in linked data content as a use-
   Example URIs for such inaccessible resources are11 :                ful provider of relevant URIs. Moreover the service
                                                                       should provide the results in a number of different for-
parliament:cons-637 rdfs:label "Winchester"                            mats that will be decided using content negotiation
parliament:cons-203 rdfs:label "Eastleigh"                             and HTTP 303 redirection.
parliament:cons-228 rdfs:label "Fareham"
                                                                   Co-reference Support : The service should exploit the
   The URIs for Winchester, Eastleigh, and Fareham                     already available knowledge about instance equivalence
are therefore not retrieved by resolving the Hampshire                 provided by co-reference systems12 in order to return
URI (obviously), nor by the additional lookup offered                  results useful in a number of different data sets.
by the backlinking service.
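
   The gap described above can be illustrated with a minimal sketch. It is not the implementation behind the backlinking service: the two dictionaries are toy data built from the URIs quoted in this section, the function names are hypothetical, and the direct mapping from the Hampshire URI to constituency URIs is a simplifying assumption that skips co-reference resolution.

```python
# Illustrative data only: a toy backlink index (target URI -> resources that
# link to it) and a toy containment table. The direct mapping from Hampshire
# to constituency URIs is a simplification that skips co-reference resolution.
HAMPSHIRE = "os:7000000000017765"

BACKLINKS = {
    HAMPSHIRE: ["mortality:ds1_299_1", "crime:ds1_37_1"],
    "parliament:cons-203": ["parliament:member/101"],    # Eastleigh
    "parliament:cons-228": ["parliament:member/11884"],  # Fareham
}

PARTS = {
    HAMPSHIRE: ["parliament:cons-203", "parliament:cons-228"],
}


def direct_backlinks(uri):
    """Resources that link straight to `uri`, as a backlink lookup would return."""
    return sorted(BACKLINKS.get(uri, []))


def bridged_backlinks(uri):
    """Backlinks of `uri` plus the backlinks of every entity it contains."""
    found = set(BACKLINKS.get(uri, []))
    for part in PARTS.get(uri, []):
        found.update(BACKLINKS.get(part, []))
    return sorted(found)


if __name__ == "__main__":
    print(direct_backlinks(HAMPSHIRE))   # statistics only
    print(bridged_backlinks(HAMPSHIRE))  # statistics plus constituency data
```

   With the containment table in place, the parliament resources become reachable from the Hampshire URI, which a backlink lookup alone does not achieve.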
   Despite the fact that an entity is still semantically differ-   4.1   Data collection and normalisation
ent from the parts that compose it, the information relevant          OS provides an ontology13 and an RDF dump about spa-
for all its constituent parts can still be relevant for the en-    tial relations between UK regions. The triples from OS
tity as a whole. Without covering such a geographical gap it       have been parsed and only the physical containment rela-
is not possible to access all the relevant sets of information,    tions have been retained, normalised and completed with
provide them to the user or process them in some way in            the inverse relations in a separate knowledge base. The
order to summarise their content.                                  service presented here, for the sake of simplicity and effi-
   The aim of this research is to exploit authoritative knowl-     ciency14 , manages only the NTPP, and the relative inverse,
edge bases in order to cover such gaps, allowing therefore         the NTPPi relations. The knowledge extracted from the OS
citizens to retrieve information resources relevant to their       data set has then been normalised in terms of an internal
region of interest. Moreover, there are many data sources          ontology that represents qualitative spatial relations.
that describes geographical resources, and all of those are al-       The normalisation step has been introduced in order to
                                                                   allow the service to integrate further geographical hierar-
                                                                   chies in the future (e.g. geonames provides containment of
6
   http://browser.psi.enakting.org
7
   http://purl.org/NET/scovo
8
   PREFIX os:
9
   PREFIX mortality:
10
   PREFIX crime:
11
   PREFIX parliament:
12
   Like http://sameas.org
13
   ...ontology/SpatialRelations/v0.2/SpatialRelations.owl
14
   ...in order to provide a very focused service.
[Figure 3 shows three groups of resources connected by owl:sameAs, part and
part_of links: dbpedia:Hampshire (http://dbpedia.org) and crime:Hampshire
(http://crime.psi.enakting.org); the http://data.ordnancesurvey.co.uk
resources os:7000000000017765 (Hampshire county), os:7000000000025157
(Fareham) and os:7000000000025128 (Winchester); and dbpedia:Fareham and
dbpedia:Winchester (UK Parliament constituencies, http://dbpedia.org)
together with parliament:cons-637 and parliament:cons-228
(http://parliament.psi.enakting.org).]

                       Figure 3: Coupling of co-reference and Ordnance Survey geographic ontology


geographical features). The future integration of qualita-                                   a target data set provided by the user, see bottom part of
tive spatial knowledge bases is devised in order to extend                                   Figure 3). The co-reference service used in this paper is
the service outside the borders of UK and for providing an                                   the http://sameas.org service from Glaser et al. [9]. The
assessment of co-references between geographical entities.                                   relevant bundles have been retrieved from the service and
   A simple example of how the normalised triples from the OS                                cached for performance. It is important to note that, in or-
ontology are coupled with a co-reference service for                                         der to choose the desired quality of service, one could opt for
bridging the navigational gap between different data sets is de-                             using one co-reference service instead of another. The func-
picted in Figure 3; in the figure it is possible to see that a sin-                          tionality provided is independent of the provenance of the
gle statement from OS describing the fact that the County                                    co-reference bundles.
of Hampshire contains Fareham and Winchester15 :                                                Exploiting co-reference services and the OS ontology, it is
                                                                                             therefore possible to infer containment relations between re-
os:7...17765 os:contains os:7...25157.                                                       sources from different data sets. For example:
os:7...17765 os:contains os:7...25128.
                                                                                             dbpedia:Hampshire owl:sameAs os:7...17765
  has been translated into an internal representation con-                                   AND
taining both relations: part, and part of; like the following:                               os:7...17765 geoservice:part os:7...25128
                                                                                             AND
os:7...17765 geoservice:part    os:7...25157.                                                os:7...25128 owl:sameAs dbpedia:Winchester
os:7...25157 geoservice:part_of os:7...17765 .                                               =⇒
os:7...17765 geoservice:part    os:7...25128.                                                dbpedia:Hampshire geoservice:part dbpedia:Winchester
os:7...25128 geoservice:part_of os:7...17765 .
                                                                                             4.2        RESTful API
   The containment relations so normalised (see central part                                   The service is accessed via HTTP GET requests and pro-
of Figure 3) are then internally stored in the system and                                    vide two essential information: the list of entities contained
queried for serving users requests.                                                          the input URI, and the list of entities that contains the in-
   The normalised containment relations are integrated with                                  put URI. The interface is then accessible via the following
the information provided by the co-reference system that                                     URIs:
allows to bridge different data sources both in the input
phase (i.e. where the input URI must be translated in                                        http://geoservice.psi.enakting.org/{command}/
the OS equivalent, see top part of Figure 3) and the out-                                      {dictionary}/{format}/{URI}
put phase (i.e. when the results must be translated into
                                                                                               In the above API description, the parameters are enclosed
15
     OS URIs are shortened, the trail of ’0’ are replaced by ’. . . ’.                       in brackets and their meaning is the following:
[Diagram labels: http://geoservice.psi.enakting.org; geoservice; co-reference (http://sameas.org); geoservice:KB (4store); example input dbpedia:Hampshire; OS URI os:7000000000017765; example output dbpedia:Fareham_(UK_Parliament_constituency); interactions numbered 1 to 5.]

Figure 4: Overall architecture and interaction with co-reference system
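The interaction depicted in Figure 4 (the input URI is normalised to OS, the containment closure is computed, and the results are translated to the target data set) could be sketched roughly as follows. This is only an illustrative sketch, not the service's actual implementation: the function names (`normalise`, `bundle_of`, `closure`, `query`), the in-memory dictionaries and the prefix-based filtering are all assumptions made for the example; the real service uses a 4store knowledge base and bundles retrieved from sameas.org. The triples and the two co-reference statements are the ones of the Hampshire example.

```python
from collections import defaultdict

# Toy data copied from the running example; OS URIs keep the shortened form.
OS_CONTAINS = [
    ("os:7...17765", "os:7...25157"),
    ("os:7...17765", "os:7...25128"),
]
# Hypothetical in-memory stand-in for the co-reference bundles.
BUNDLES = [
    {"dbpedia:Hampshire", "os:7...17765"},
    {"dbpedia:Winchester", "os:7...25128"},
]


def normalise(contains):
    """Turn os:contains statements into part / part_of indexes."""
    part, part_of = defaultdict(set), defaultdict(set)
    for container, contained in contains:
        part[container].add(contained)
        part_of[contained].add(container)
    return part, part_of


def bundle_of(uri):
    """Return every URI co-referent with uri (including uri itself)."""
    for bundle in BUNDLES:
        if uri in bundle:
            return bundle
    return {uri}


def closure(index, start):
    """Transitive closure of one relation, starting from an OS URI."""
    seen, stack = set(), [start]
    while stack:
        for nxt in index[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def query(command, dictionary, uri):
    """Answer a 'contains' or 'container' request for uri."""
    part, part_of = normalise(OS_CONTAINS)
    index = part if command == "contains" else part_of
    # Normalise the input URI to its OS equivalents.
    os_uris = [u for u in bundle_of(uri) if u.startswith("os:")]
    results = set()
    for os_uri in os_uris:
        for hit in closure(index, os_uri):  # property closure
            results |= bundle_of(hit)       # translation via co-reference
    if dictionary != "none":                # optional filtering
        results = {u for u in results if u.startswith(dictionary + ":")}
    return sorted(results)
```

With this toy data, `query("contains", "dbpedia", "dbpedia:Hampshire")` yields the inferred containment of dbpedia:Winchester, while the dictionary `none` also keeps the OS URIs.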


command: can be either contains or container: in the first case it returns the URIs of the entities contained by the input URI; in the second case it returns the URIs of the entities that contain the input URI.

dictionary: can be one of the following (dbpedia, os, statistics, geonames, enakting, opencyc, openlylocal, or none) and instructs the service to use the co-reference system in order to retrieve the equivalent URIs in the respective data sets (i.e. DBpedia [1], Ordnance Survey [10], UK National Statistics^16, Geonames^17, PSI enAKTing^18, OpenCYC [12], Openly Local project^19). The value none is used to apply no filter. In this case the URIs returned will be the ones from the Ordnance Survey plus the ones returned from the co-reference service.

format: the format parameter is optional and can be one of the following (rdf, text, ttl, or json). The value of the format parameter decides the format of the returned content: RDF/XML for rdf; a list of URIs separated by new lines for text; RDF/Turtle for turtle; and finally JSON^20 for json. If the parameter is not given, the right content is decided using the 303 HTTP redirection. Even for content requests with Accept:text/html made using the browser, the client is redirected to the HTML page of the service initialised with the input URI.

URI: is the URI of the input entity to query using the service. The service uses a co-reference system in order to find the equivalent URI for the Ordnance Survey and the Geonames data set. This means that the user can use a data set of preference (e.g. DBpedia or Geonames) and ask for contained, or container, entities in one of the desired target data sets (e.g. again DBpedia, Geonames, or enAKTing published information).

^16 http://statistics.data.gov.uk last accessed 10/02/10
^17 http://geonames.org last accessed 10/02/10
^18 http://browser.psi.enakting.org last accessed 10/02/10
^19 Community devoted to providing linked data access for local government data, see http://openlylocal.com last accessed 10/02/10
^20 http://json.org last accessed 10/02/10

   The service returns a list of URIs if the content type is text or json. The RDF content, for both rdf and turtle, describes the containment relations between the input URI and the resulting resources. In both cases the returned URIs are translated into the desired address space.
   The procedure followed by the service, and an overall architecture, is depicted in Figure 4, and can be described as follows:

    1. user generated request (HTTP GET request)
    2. normalisation of the input URI to OS
    3. computation of the property closure (i.e. part or part_of) over the normalised URI
    4. optional phase of translation and filtering of the resulting URIs to the target URI space
    5. formatted content, as per user request, returned to the user (HTTP Response)

   As an example, consider the case of a software client that needs to know all the geographical entities contained in Hampshire. The request can adopt as input one of the many available URIs describing the county of Hampshire; a popular choice could be the DBpedia URI (i.e. URI = http://dbpedia.org/resource/Hampshire). The agent can then make explicit the desired target data set, for example the DBpedia data set itself (i.e. dictionary = dbpedia), and instruct the server to return the JSON format of the document (i.e. the HTTP header contains Accept:application/json). The HTTP request will then be the following:

     GET /contains/dbpedia/http://dbpedia.org/resource/Hampshire
     Host: geoservice.psi.enakting.org
     Accept: application/json

And the service will return a response redirecting the client to the right URL:

     HTTP/1.1 302 Found
     Location: http://geoservice.psi.enakting.org/contains/dbpedia/json/http://dbpedia.org/resource/Hampshire

That, once resolved, will finally return the desired content, a JSON array of strings that represent the URIs of the DBpedia resources describing entities contained in Hampshire:

     HTTP/1.1 200 OK
     Content-Type: application/json

     ["http://dbpedia.org/resource/North_East_Hampshire_%28UK_Parliament_constituency%29",
      "http://dbpedia.org/resource/East_Hampshire_%28UK_Parliament_constituency%29", ...

   The client agent can obviously refer immediately to the right URL and retrieve the content in the right format straight away. A useful way to exploit such a service can be seen when data sets other than the OS one are queried. Not every data set in fact provides a clear semantic representation of mereological relations. This is due to the fact that the focus of many data sets is to provide information about a particular area: encyclopaedic information from DBpedia, statistical information from the UK National Statistics, geographical features from Geonames^21, conceptual descriptions from OpenCYC, local government information from Openly Local, and UK PSI from EnAKTing.

^21 Geonames provides a containment relation that does not however reflect any administrative subdivision

   Using the service presented in this paper, it is easy to exploit the OS administrative ontology in order to retrieve geographically relevant information regardless of the starting data set. As an example, let us consider the case where a user wants to retrieve information about the local government of their own city, for example Southampton, UK. The easiest thing to do is to start from a recognizable URI such as the DBpedia one:

http://dbpedia.org/resource/Southampton

  From this URI the user can retrieve general information about the city, even the names of some of the city leaders. No further information about local government is available on the Southampton DBpedia page. Asking the geographical service to return the contained entities from the Openly Local site, we can then retrieve more resources:

http://openlylocal.com/id/wards/4925
http://openlylocal.com/id/wards/4929
...
http://openlylocal.com/id/wards/4938

  Those URIs are the ones published for each of the wards in the city of Southampton and provide not only the names of the local councillors but also some other statistics about the ward (i.e. demographic and religious statistics). Moreover, asking the service again for the DBpedia URIs, we are able to retrieve the following:

http://dbpedia.org/resource/Southampton_Test_%28UK_Parliament_constituency%29
http://dbpedia.org/resource/Southampton_Itchen_%28UK_Parliament_constituency%29

   From those URIs we are then able to check the identities of the MPs in charge (in the Southampton page from DBpedia they are both mentioned as leaders of the city, whereas an MP is actually in charge only of the constituency where s/he has been elected). Asking then for the URIs from the data sets provided by the EnAKTing project, we would be able to retrieve the following:

http://parliament.psi.enakting.org/id/cons-536
http://parliament.psi.enakting.org/id/cons-535

   Following such links the user would then be able to retrieve other information about the MPs from each constituency (even retrieving a historical record of them) and further information about their political activity.

5.    EVALUATION
   We have evaluated our geographical service from two different perspectives. The first one looks at the direct benefit that our backlinking service for Public Sector Information^22 would gain from expanding its navigability through geographic containments (see Section 5.1). The second evaluation is more analytic and looks at the new knowledge generated as part of the translation process from an authoritative geographic closure to the covered vocabularies (see Section 5.2).

^22 http://backlinks.psi.enakting.org last accessed 10/02/10

5.1    Backlinking Service Integration
   This section studies the navigability improvement that our backlinking service for the PSI in the UK would experience by plugging in the containments from a wide range of vocabularies. The PSI Backlinking Service provides an access point to retrieve backlinks from foreign URIs. Foreign URIs make data discovery difficult because it is not possible to navigate the RDF documents of the WoD bidirectionally. http://backlinks.psi.enakting.org provides an API to retrieve collections of backlinks for a given URI. The study of the covered knowledge bases^23 in the UK PSI Backlinking Service made explicit that some of the most highly connected data sets in the PSI WoD are the ones representing some type of geographic information.

^23 http://backlinks.psi.enakting.org#KBs last accessed 10/02/10

   In this evaluation we have used the Backlinking Service as a client of the Geographical Service in order to expand the backlinks that we can get from geographic resources. We have kept the decentralized nature of the Backlinking and Geographical services: the Backlinking Service simply performs HTTP requests to get the geographic containments (see Figure 5). When the geography extension is enabled, the backlinking service gets the list of contained entities for the input URI and returns the backlinks connected to any URI that is part of the containments. The request to the Geographical Service is performed using “contains” as command, JSON as format and “none” as dictionary. The selected dictionary is “none” because the Backlinking Service doesn’t know before
hand what type of URIs will be the source of backlinks for a certain geographic region. In order to improve the coverage, we aim to get all the possible containments from all the dictionaries supported by the geographical service.

[Diagram labels: HTTP GET http://backlinks.psi.enakting.org/resource/URI?geo=enabled; http://backlinks.psi.enakting.org (RESTful API); Backlinks Knowledge Base (4store); HTTP GET http://geoservice.psi.enakting.org/contains/none/json/URI; http://geoservice.psi.enakting.org (RESTful API); Co-reference http://sameAs.org; for URI' in geoPartonomyBundle: BackLinks += GetBackLinks(URI').]

Figure 5: Interaction of the backlinking and geographical services

   A natural outcome of this integration can be shown by looking at how the system works when asking for backlinks connected to the URI dbpedia:Hampshire. Prior to the use of the geographical extension, a request to retrieve backlinks for dbpedia:Hampshire would give back just 14 URIs related to the UK region of dbpedia:Hampshire or to any equivalent URI that is part of the same co-reference bundle in sameAs.org (see Figure 6). The same request, when the geographical service is integrated, returns the following additional backlinks:

   • 6 010 resources that represent schools from http://education.data.gov.uk. These RDF documents represent the totality of education entities in the region of Hampshire.

   • 42 mortality statistical resources from http://mortality.psi.enakting.org. These statistics are segmented by geography and gender.

   • 981 CO2 emission measurements from http://co2emission.psi.enakting.org. These resources represent the CO2 emissions for the region of Hampshire between 2005 and 2007.

   • 300 resources with information on energy consumption from http://energy.psi.enakting.org. This data set publishes the energy consumption in the UK with respect to fuel in the road network between 2005 and 2007. These results represent all the RDF documents linked to geographical regions contained in Hampshire.

   • 4 788 population census resources segmented by age and sex from http://population.psi.enakting.org.

   • 224 parliamentary identities from http://parliament.psi.enakting.org. These represent mandates for different members of the UK Parliament and House of Commons.

  Figure 6 shows the output of the backlinking service with and without geographical extensions. None of the resources enumerated above is specifically linked to dbpedia:Hampshire or equivalent URIs; they are linked to geographic containments of it in at least one of the data sets covered by the Geographical Service.
   This scenario has shown one possible case where the exploitation of explicit semantics can improve the accessibility of the resources in the Web of Data. In essence, the backlinking service improves its graph connectivity by being aware of the new layer of Linked Data that the Geographical Service publishes via its RESTful API. This case study also shows how different Linked Data RESTful services (such as co-reference, backlinking and geographical services) can cooperate in a layer built on top of the current Web of Data to improve its navigability.

5.2    Vocabulary Closure Coverage
   The geographical service can be seen as an extra layer of linked data based on an initial geographic closure provided by Ordnance Survey and its extensions to other data sets via co-references. This extra layer of linked data is an added value to the Web of Data. This section analyses the interlinking improvement between the data sets by means of the number of triples produced by the Geographical Service.
   Table 1 reports the number of triples generated by our service whose predicate is geoservice:part or geoservice:part_of. The table shows the number of triples linking every pair of data sets in the system. For instance, our Geographical Service has produced 30995 geographic containments between dbpedia and mortality.psi.enakting.org.
   Of particular interest are the results for the geonames data set. In fact, the number of containment relations within this data set is quite small compared to the number of containment relations provided by geonames itself (a rough estimate done by the authors counts about 9K relations). Such an additional source of spatial knowledge opens a scenario where the two knowledge bases can be compared and integrated to provide a better recall for the service. An important aspect to take into account in such a scenario would be the quality of the results computed by the integration.
   The data seed that triggered this new knowledge is the set of OS to OS containments, 60M statements. The total number of triples generated is 223M, and these partially interlink every pair of data sets. Partially, because the completeness of the closure for every pair of data sets relies on the accuracy of the co-reference bundles extracted from sameAs.org. As the number of co-references from sameAs.org grows and their accuracy improves, the Geographical Service will reflect those changes automatically. This side effect is one of the key aspects of the Web of Data and its decentralized nature.

6.    CONCLUSIONS
   We have presented in this paper a service that helps users in browsing geographical resources from different datasets (dbpedia, geonames, data.gov.uk, psi.enakting.org, . . . ) by exploiting an authoritative ontology for the UK territory (Ordnance Survey). One of the novel aspects of this research is the use of a co-reference system (http://sameas.org) to extend the containments from one geographic data set to others where such containments are not so rich or complete. Moreover, the added value of integrating such a geographical service with a backlinking service has been shown by demonstrating a possible exploitation scenario on Public Sector Information. Due to the particular na-
[Screenshot panels: "Backlinks Geographical Service Integration Disabled" (http://backlinks.psi.enakting.org/resource/doc/http://dbpedia.org/resource/Hampshire) and "Backlinks Geographical Service Integration Enabled" (http://backlinks.psi.enakting.org/resource/doc/http://dbpedia.org/resource/Hampshire?geo=enabled)]

Figure 6: Output comparison for dbpedia:Hampshire with and without geographical partonomies
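The difference between the two outputs compared in Figure 6 comes from the expansion loop of Figure 5 (for URI' in geoPartonomyBundle: BackLinks += GetBackLinks(URI')). A minimal sketch of that loop is given below. It is an assumption-laden illustration, not the Backlinking Service's code: `geo_request_url`, `expand_backlinks` and the two `fake_*` callables are hypothetical names, the lookup tables contain invented data, and the de-duplication step is our own addition. Only the URL pattern `contains/none/json/URI` is taken from the paper.

```python
GEO_BASE = "http://geoservice.psi.enakting.org"


def geo_request_url(uri, command="contains", dictionary="none", fmt="json"):
    """Build a Geographical Service URL: {command}/{dictionary}/{format}/{URI}."""
    return f"{GEO_BASE}/{command}/{dictionary}/{fmt}/{uri}"


def expand_backlinks(uri, get_backlinks, get_contained, geo_enabled=True):
    """Collect backlinks for uri and, if enabled, for every contained entity."""
    backlinks = list(get_backlinks(uri))
    if geo_enabled:
        # get_contained stands in for an HTTP GET returning a JSON array of URIs.
        for contained in get_contained(geo_request_url(uri)):
            backlinks.extend(get_backlinks(contained))
    # Drop duplicates while keeping first-seen order (our own addition).
    return list(dict.fromkeys(backlinks))


# Toy stand-ins for the two services (invented data, for illustration only).
FAKE_CONTAINMENTS = {
    geo_request_url("dbpedia:Hampshire"): ["dbpedia:Winchester"],
}
FAKE_BACKLINKS = {
    "dbpedia:Hampshire": ["doc:a"],
    "dbpedia:Winchester": ["doc:b", "doc:a"],
}


def fake_get_contained(url):
    return FAKE_CONTAINMENTS.get(url, [])


def fake_get_backlinks(uri):
    return FAKE_BACKLINKS.get(uri, [])
```

Passing the two lookups as parameters keeps the sketch independent of any HTTP library; in a real client they would wrap the GET requests to the two services.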



                                              Table 1: Datasets linkage improvement statistics
                OS                  dbpedia             statistics             mortality              parliament              crime             geonames   openlylocal   opencyc
 OS             60469910            1757760             45354078               1035901                1338214                 235906            94072      18559453      1106900
 dbpedia        1757760             59640               1393322                30995                  46077                   9570              3035       540619        35250
 statistics     45354078            1393322             36179867               813660                 1056892                 206217            71232      14430773      819965
 mortality      1035901             30995               813660                 19109                  23929                   4607              1488       344436        17415
 parliament     1338214             46077               1056892                23929                  37631                   7654              2410       436883        28070
 crime          235906              9570                206217                 4607                   7654                    2249              334        82559         4160
 geonames       94072               3035                71232                  1488                   2410                    334               224        26427         2475
 openlylocal    18559453            540619              14430773               344436                 436883                  82559             26427      6498462       312120
 opencyc        1106900             35250               819965                 17415                  28070                   4160              2475       312120        27975
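As a reading aid, Table 1 can be loaded as a square matrix and queried by data set name. The sketch below copies the figures of Table 1 verbatim; the names `DATASETS`, `ROWS`, `triples_between` and `is_symmetric` are ours and purely illustrative. As printed, the matrix is symmetric, so each pair of data sets appears twice.

```python
DATASETS = ["OS", "dbpedia", "statistics", "mortality", "parliament",
            "crime", "geonames", "openlylocal", "opencyc"]

# Rows and columns follow the order of DATASETS (values copied from Table 1).
ROWS = [
    [60469910, 1757760, 45354078, 1035901, 1338214, 235906, 94072, 18559453, 1106900],
    [1757760, 59640, 1393322, 30995, 46077, 9570, 3035, 540619, 35250],
    [45354078, 1393322, 36179867, 813660, 1056892, 206217, 71232, 14430773, 819965],
    [1035901, 30995, 813660, 19109, 23929, 4607, 1488, 344436, 17415],
    [1338214, 46077, 1056892, 23929, 37631, 7654, 2410, 436883, 28070],
    [235906, 9570, 206217, 4607, 7654, 2249, 334, 82559, 4160],
    [94072, 3035, 71232, 1488, 2410, 334, 224, 26427, 2475],
    [18559453, 540619, 14430773, 344436, 436883, 82559, 26427, 6498462, 312120],
    [1106900, 35250, 819965, 17415, 28070, 4160, 2475, 312120, 27975],
]


def triples_between(a, b):
    """Number of containment triples linking data sets a and b."""
    return ROWS[DATASETS.index(a)][DATASETS.index(b)]


def is_symmetric():
    """Check that the table reads the same by rows and by columns."""
    n = len(DATASETS)
    return all(ROWS[i][j] == ROWS[j][i] for i in range(n) for j in range(n))
```

For example, `triples_between("dbpedia", "mortality")` gives the 30995 containments quoted in Section 5.2, and `triples_between("geonames", "geonames")` gives the 224 relations within the geonames data set.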




ture of the knowledge provided (i.e. closure of geographical containment properties), there is the possibility of overwhelming the user with information when asking about top level features (e.g. England). In order to cope with this eventuality, the service will soon be provided with the capability to limit the results by depth. Therefore, when asked about all the entities contained in the top level feature England at the first level of depth, the service will return only: North East, North West, South East, Eastern, South West, East Midlands, West Midlands, Yorkshire & the Humber, Scotland, Wales, London (different from the City of London).
   Another important aspect not tackled in this work, and a subject of future research, is the temporal extent of administrative divisions. The administrative geography of the UK will change shortly and has changed frequently over the years (e.g. the number and borders of constituencies are reviewed every 10 or 15 years). New entities can be defined; old ones can be abolished or change status. For example Southampton, once part of Hampshire, became a Unitary Authority on the 1st of April 1997. Since then, Southampton has been administratively detached from the county of Hampshire (i.e. not contained any more), although still being part of it as a ceremonial county. Versioning of information resources is a hot topic in the Linked Data community and it is even more important when publishing Public Sector Information, whose content and validity must be put into context.
   The research work reported here tackles an important aspect of Linked Data, the exploitation of explicit semantic content for enhancing resource retrieval and browsability. The choice to tackle geographical knowledge rather than some other data facet is mainly due to the analysis of the available data sources, their structure and the available knowledge exploitable for a better integration of the available information.
   The use of co-reference systems allowed us to exploit the knowledge created in one organization (Ordnance Survey in this case) in different, and potentially novel, data collections, overlaying a qualitative spatial dimension that was not present before. Such reuse of knowledge is potentially innovative but poses many questions about the management of the quality of the knowledge and the entity alignments used. The presence, integration, and comparison of different
geographical knowledge bases can be beneficial for the maintenance and discovery of entity
alignments of good quality.
   Another interesting aspect related to the use of co-reference services integrated with an
additional knowledge source is the ability to exploit the data semantics in order to change the
navigability of the data sets. Such a change in navigability is clear when new arcs are provided
within the same data set (e.g. between DBpedia resources that were not linked before) or between
resources belonging to different data sets (see Table 1 for a complete account of the data sets
connected).

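   The way co-reference bundles and a containment relation can combine to yield new arcs may be
illustrated with a minimal sketch. It is not the service's implementation: the example.org URIs,
the bundle contents, and the function name derive_arcs are all hypothetical, and the containment
relation is assumed to be stated only between the reference (Ordnance Survey-like) identifiers.

```python
# Hypothetical co-reference bundles: each set groups URIs assumed to denote the same place.
# The example.org URIs are placeholders, not real identifiers.
BUNDLES = [
    {"http://example.org/os/hampshire", "http://example.org/dbpedia/Hampshire"},
    {
        "http://example.org/os/winchester",
        "http://example.org/dbpedia/Winchester",
        "http://example.org/stats/winchester",
    },
]

# Hypothetical containment facts, stated only between the reference URIs.
CONTAINS = [("http://example.org/os/hampshire", "http://example.org/os/winchester")]


def derive_arcs(bundles, contains):
    """Lift containment facts from reference URIs to all of their co-referent URIs."""
    equivalents = {}
    for bundle in bundles:
        for uri in bundle:
            equivalents[uri] = set(bundle)
    arcs = set()
    for parent, child in contains:
        # A URI without a bundle is treated as equivalent only to itself.
        for p in equivalents.get(parent, {parent}):
            for c in equivalents.get(child, {child}):
                arcs.add((p, c))
    # Drop the facts we started from, leaving only the newly navigable arcs.
    return arcs - set(contains)


new_arcs = derive_arcs(BUNDLES, CONTAINS)
for parent, child in sorted(new_arcs):
    print(parent, "contains", child)
```

   In this toy setting a single containment fact becomes navigable both within one data set (the
two dbpedia-style placeholders) and across data sets (e.g. from the dbpedia-style placeholder to
the stats-style one). The sketch also shows why alignment quality matters: every derived arc is
only as reliable as the bundle it was lifted through.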
7.   ACKNOWLEDGEMENTS
  This work was supported by the EnAKTing project funded by the Engineering and Physical Sciences
Research Council under contract EP/G008493/1.
8.   REFERENCES                                                         Company, New York, NY, USA, 1929.
 [1] S. Auer, S. Auer, C. Bizer, G. Kobilarov, J. Lehmann,
     and Z. Ives. Dbpedia: A nucleus for a web of open
     data. in 6th International Semantic Web Conference,
     Busan, Korea, pages 11–15, 2007.
 [2] C. Becker and C. Bizer. DBpedia mobile: A
     location-enabled linked data browser. In 1st Workshop
     about Linked Data on the Web (LDOW2008), April
     2008.
 [3] T. Berners-Lee. Design issues: Linked data.
     http://www.w3.org/DesignIssues/LinkedData.html,
     2006.
 [4] L. Carr, W. Hall, S. Bechhofer, and C. Goble.
     Conceptual linking: ontology-based open hypermedia.
     In WWW ’01: Proceedings of the 10th international
     conference on World Wide Web, pages 334–342, New
     York, NY, USA, 2001. ACM.
 [5] B. L. Clarke. A calculus of individuals based on
     “connection”. Notre Dame J. Formal Logic,
     22(3):204–218, 1981.
 [6] A. G. Cohn, B. Bennett, J. Gooday, and N. M. Gotts.
     Qualitative spatial representation and reasoning with
     the region connection calculus. Geoinformatica,
     1(3):275–316, 1997.
 [7] M. J. Egenhofer. A formal definition of binary
     topological relationships. In 3rd International
     Conference, on Foundations of Data Organization and
     Algorithms (FODO), pages 457–472, New York, NY,
     USA, 1989. Springer-Verlag New York, Inc.
 [8] C. Freksa. Temporal reasoning based on
     semi-intervals. Artif. Intell., 54(1-2):199–227, 1992.
 [9] H. Glaser, A. Jaffri, and I. Millard. Managing
     co-reference on the semantic web. In WWW2009
     Workshop: Linked Data on the Web (LDOW2009),
     April 2009.
[10] J. Goodwin, C. Dolbear, and G. Hart. Geographical
     linked data: The administrative geography of great
     britain on the semantic web. Transaction in GIS,
     12(1):19–30, February 2009.
[11] Grzegorczyk. Undecidability of some topological
     theories. Fundamenta Mathematicae, 38:137–152,
     1951.
[12] D. B. Lenat. Cyc: a large-scale investment in
     knowledge infrastructure. Commun. ACM,
     38(11):33–38, 1995.