<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>OWL-based formalisation of geographic databases specifications</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Nathalie Abadie</string-name>
          <email>nathalie-f.abadie@ign.fr</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ammar Mechouche</string-name>
          <email>ammar.mechouche@ign.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sébastien Mustière</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Institut Géographique National, Laboratoire COGIT</institution>
          ,
          <addr-line>73 Avenue de Paris, 94160 Saint-Mandé, France, +33 1 43 98 80 00 + 71 25</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Institut Géographique National, Laboratoire COGIT</institution>
          ,
          <addr-line>73 Avenue de Paris, 94160 Saint-Mandé, France, +33 1 43 98 80 03</addr-line>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Institut Géographique National, Laboratoire COGIT</institution>
          ,
          <addr-line>73 Avenue de Paris, 94160 Saint-Mandé, France, +33 1 43 98 81 49</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>The ability to share and combine geographic data from different information sources in a consistent way is a key issue for the successful implementation of Spatial Data Infrastructures (SDIs). This requires a deep understanding of database structure and content. In this poster, we propose to achieve this through the elicitation and formalisation of geographic database specifications, relying on OWL ontologies, as recommended in the Semantic Web community. We thus propose a general ontology for eliciting the key concepts manipulated by data specifications, and rules for building local ontologies that represent the knowledge contained in specific data specifications.</p>
      </abstract>
      <kwd-group>
        <kwd>Geographic Database Semantics</kwd>
        <kwd>OWL</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. WHY FORMALISING SPECIFICATIONS?</title>
      <p>
        Over the last decades, the growing number of geographic data acquisition
campaigns has produced a large number of diverse,
heterogeneous and distributed geographic data sources. Even when
these data represent the same real-world geography, they
differ greatly from one another. Consequently, the ability to
share and combine geographic data from different sources in a
consistent way is a key issue for using them efficiently.
Previous geo-data integration efforts mainly addressed syntactic
heterogeneities through the development of standards. Semantic
interoperability, which raises more complex problems, is still
under investigation. Recent work has mainly focused either on
geo-data discovery and retrieval or on the transformation of geo-data
schemas. In the former case, most of the proposed approaches
[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ][
        <xref ref-type="bibr" rid="ref2">2</xref>
        ][
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] use a global domain ontology to specify the precise
meaning of geo-data, either by renaming feature classes with
ontology labels or through semantic annotations. Their aim is
to help users retrieve geo-data that represent a
specific geographic concept, such as ‘buildings’, even when the feature
class names of the available datasets are entirely different. In the latter
case, recent approaches [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ][
        <xref ref-type="bibr" rid="ref5">5</xref>
        ][
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] provide geo-database experts
with a graphical interface that helps them manually describe
their schemas and specify mappings between source and target
schemas.
      </p>
      <p>
        However, each geo-data producer has its own rules for data
capture, and its own point of view about the geographic real world
[
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. For example, a feature class named ‘Building’ may
actually designate only permanent buildings, or may also include precarious
buildings such as cabins or huts. Moreover, a geographic database
is produced at a specific scale of analysis, and geographic features
are captured in the database consistently with this
level of detail. For example, only buildings with an area greater than 50
m<sup>2</sup> may be captured. Furthermore, the geometric representation of
a given geographic feature may vary: a building may be
represented by a polygon describing its perimeter or by a point
captured at its centre.
      </p>
      <p>All these selection and representation criteria are recorded in dedicated
textual documents used as guidelines for data capture, namely the
database specifications. They are a very rich source of knowledge
about geo-data semantics, and using them in a schema matching
process could help to identify and resolve complex
heterogeneities. Consider two different databases covering
the same geographical space. The first has a feature class
named ‘Building’, which represents only “buildings of area greater
than 20 m<sup>2</sup>”, while the second has a feature class named
‘Built-up area’, which represents “buildings of area greater than 50
m<sup>2</sup>”. Comparing the specifications of these feature classes makes it possible to
derive the following mapping rule: ‘Building’ instances of area
greater than 50 m<sup>2</sup> represent the same real-world buildings as
‘Built-up area’ instances. Providing a schema matching
application with formal specifications would therefore allow it to
find such complex mapping rules automatically between
heterogeneous geo-databases.</p>
    </sec>
    <sec id="sec-3">
      <title>2. THE SPECIFICATIONS ONTOLOGY</title>
      <p>
        Several formal models for geographic database specifications
have already been proposed [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ][
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. Since formalising data
specifications in SDIs amounts to eliciting data semantics in
a Web environment, we propose to rely on Semantic Web
standards: our approach is based on ontologies developed
with the Web Ontology Language (OWL 2 [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]).
      </p>
      <p>
        A first step in formalising specifications is to define unambiguously
the key concepts commonly used in geo-database specifications. In
other words, we define a domain ontology, named “Specifications
Ontology” (SO, see Figure 1). SO contains only
concepts specific to geographic data specifications. It relies in
turn on more general ontologies, for instance to define basic
geometric types [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. For example, SO
formalises the concepts of data source and centreline, which are
used in many data specifications.
      </p>
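      <p>As a minimal sketch of what such declarations could look like in OWL 2 Manchester syntax, the fragment below declares the two SO classes and the two SO properties that are used in Section 3 (so:GeographicEntity, so:Feature, so:represents and so:area). The domains, the ranges and the disjointness axiom are assumptions made for illustration and are not taken from SO itself. The so: and xsd: prefixes are assumed to be declared elsewhere.</p>
      <preformat># Real-world concepts described by the specifications
Class: so:GeographicEntity

# Database feature classes (assumption: disjoint from real-world concepts)
Class: so:Feature
    DisjointWith: so:GeographicEntity

# Links a database feature to the real-world entity it stands for
# (assumption: domain and range are not stated in the text)
ObjectProperty: so:represents
    Domain: so:Feature
    Range: so:GeographicEntity

# Area of a real-world entity (assumption: expressed as a double, in square metres)
DataProperty: so:area
    Domain: so:GeographicEntity
    Range: xsd:double</preformat>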
    </sec>
    <sec id="sec-4">
      <title>3. HOW TO FORMALISE A GEO-DATABASE SPECIFICATION?</title>
      <p>
        In addition, we propose to formalise each database specification by
means of an application ontology, named “local specification
ontology” (LSO). This ontology imports SO and extends it to
describe real-world geographic concepts and database classes.
To clearly separate database concepts from the real-world
concepts encountered in specifications, two main classes from SO
are used: GeographicEntity and Feature. On the one hand,
the subclasses of GeographicEntity are concepts imported from a
domain ontology of topographic concepts. They may be created
from the specification text using natural language processing
tools [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. On the other hand, the subclasses of Feature are concepts
directly derived from the corresponding database schema. They
carry information such as the selection criteria used to populate
the database, for instance the fact that “only habitation buildings of
area greater than 20 m<sup>2</sup> are represented in the feature class
‘Building’ of the database”. We therefore require feature classes such
as ‘Building’ to be modelled as OWL classes,
and selection constraints to be modelled as axioms that
restrict the possible interpretations of the defined term.
These axioms are expressed using concepts and relations
defined in SO and LSO. For the example above, the
feature class ‘Building’ of this geo-database is defined in
LSO as follows:
      </p>
      <preformat>Class: lso:db_Building
    EquivalentTo: so:represents some (lso:Habitation and so:area some double[&gt;20.0])</preformat>
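      <p>Returning to the two-database example of Section 1, the comparison of formal specifications could be sketched as follows. The prefixes lsoA: and lsoB:, the class names db_Building and db_BuiltUpArea, and the shared topographic concept topo:Building are hypothetical names introduced for illustration only. The sketch also assumes that both local ontologies reuse the same so:represents and so:area properties.</p>
      <preformat># Database A: 'Building' keeps buildings whose area exceeds 20 m2
Class: lsoA:db_Building
    EquivalentTo: so:represents some (topo:Building and so:area some xsd:double[&gt; 20.0])

# Database B: 'Built-up area' keeps buildings whose area exceeds 50 m2
Class: lsoB:db_BuiltUpArea
    EquivalentTo: so:represents some (topo:Building and so:area some xsd:double[&gt; 50.0])</preformat>
      <p>Under these assumed definitions, every individual that satisfies the second definition also satisfies the first, because an area greater than 50.0 is also greater than 20.0. An OWL 2 reasoner that supports datatype restrictions should therefore be able to classify lsoB:db_BuiltUpArea as a subclass of lsoA:db_Building, which is the kind of relationship a schema matching application could exploit to propose the mapping rule discussed in Section 1.</p>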
    </sec>
    <sec id="sec-6">
      <title>4. CONCLUSION</title>
      <p>In this poster we proposed an OWL 2 based model for formalising geographic
database specifications, which aims at eliciting
the semantics of geographic databases by describing the link between
data and what they represent. Key concepts used in data
specifications are defined in a specifications domain ontology
(SO), whereas the knowledge contained in a given database
specification is described in a specification application ontology
(LSO) which uses the concepts of SO. A tool enabling the automatic
comparison of formal specifications is being implemented. It aims
at providing expressive schema mappings between heterogeneous
geographic databases, for schema translation or schema
integration purposes.</p>
    </sec>
    <sec id="sec-7">
      <title>5. ACKNOWLEDGMENTS</title>
      <p>This work is partly funded by the French Research Agency
through the GeOnto project ANR-07-MDCO-005 on “creation,
alignment, comparison and use of geographic ontologies”
(http://geonto.lri.fr/).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Paul</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          and
          <string-name>
            <surname>Ghosh</surname>
            ,
            <given-names>S.K.</given-names>
          </string-name>
          <year>2006</year>
          .
          <article-title>An Approach for Service Oriented Discovery and Retrieval of Spatial Data</article-title>
          ,
          <source>In Proceedings of International Workshop on Service Oriented Software Engineering</source>
          , Shanghai, China,
          <fpage>84</fpage>
          -
          <lpage>94</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Nambiar</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ludäscher</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Baru</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <year>2006</year>
          .
          <article-title>The GEON portal: accelerating knowledge discovery in the geosciences</article-title>
          ,
          <source>In Proceedings of the 8th ACM International Workshop on Web Information and Data Management</source>
          , Arlington, Virginia, USA,
          <fpage>83</fpage>
          -
          <lpage>90</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Klien</surname>
            ,
            <given-names>E.M.</given-names>
          </string-name>
          <year>2008</year>
          .
          <article-title>Semantic Annotation of Geographic Information</article-title>
          .
          <source>PhD thesis</source>
          , Institute for Geoinformatics, University of Muenster. Muenster, Germany.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Balley</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <year>2007</year>
          .
          <article-title>Aide à la restructuration de données géographiques sur le Web - Vers la diffusion à la carte d'information géographique</article-title>
          .
          <source>PhD thesis</source>
          , University of Marne-La-Vallée, France.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Schade</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <year>2009</year>
          .
          <article-title>Translation of Geospatial Data, Challenges, Solution and Vision</article-title>
          ,
          <source>In Proceedings of the 12th International Conference on Geographic Information Science (AGILE'09)</source>
          , Pre-Conference Workshop “Challenges in Spatial Data Harmonisation”, Hannover, Germany.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Reitz</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>de Vries</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Fitzner</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <year>2009</year>
          .
          <article-title>A5.2-D3 [3.3] Conceptual Schema Specification and Mapping</article-title>
          ,
          <source>HUMBOLDT Technical Report.</source>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Fonseca</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Davis</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Camara</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          <year>2003</year>
          .
          <article-title>Bridging Ontologies and Conceptual Schemas in Geographic Information Integration</article-title>
          . In GeoInformatica,
          <volume>7</volume>
          (
          <issue>4</issue>
          ),
          <fpage>355</fpage>
          -
          <lpage>378</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Gesbert</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          <year>2005</year>
          .
          <article-title>Etude de la formalisation des spécifications de bases de données géographiques en vue de leur intégration</article-title>
          .
          <source>PhD thesis</source>
          , University of Marne-la-Vallée.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Christensen</surname>
            ,
            <given-names>J. V.</given-names>
          </string-name>
          <year>2006</year>
          .
          <article-title>Formalizing Specifications for Geographic Information</article-title>
          .
          <source>In Proceedings of the 9th AGILE Conference on Geographic Information Science</source>
          . College of Geoinformatics, University of West Hungary,
          <fpage>186</fpage>
          -
          <lpage>194</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <collab>W3C</collab>
          .
          <year>2009</year>
          .
          <article-title>OWL 2 Web Ontology Language Primer</article-title>
          ,
          <source>W3C Recommendation</source>
          , 27 October 2009
          . http://www.w3.org/TR/owl2-primer/
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Lieberman</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Singh</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Goad</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <year>2007</year>
          .
          <article-title>W3C geospatial ontologies</article-title>
          ,
          <source>W3C incubator group report</source>
          , 23 October
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Kamel</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Aussenac-Gilles</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          <year>2009</year>
          .
          <article-title>Ontology Learning by Analysing XML Document Structure and Content</article-title>
          ,
          <source>In Proceedings of Knowledge Engineering and Ontology Development</source>
          , Madeira, Portugal.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>