<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Integration of information from di erent ontologies</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Julia Dmitrieva</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Fons J. Verbeek</string-name>
          <email>fverbeek@liacs.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Universiteit Leiden, Leiden Institute of Advanced Computer Science</institution>
          ,
          <addr-line>Niels Bohrweg 1, 2333 CA Leiden</addr-line>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this work we describe our approach directed to the integration of information from di erent ontologies. We propose a method to extract and integrate modules from ontologies. These modules are generated as subgraphs from original ontologies on the basis of a term de ned by the user. The module extraction is related to the view traversal method [1]. The di erence is that we not only extract modules from OWL ontologies but use these modules for further integration. The small modules are integrated to one graph structure on basis of mappings. These mappings are based on the similarity between concepts in di erent ontologies. With these mappings we introduce merged nodes in the integrated graph which connect concepts from di erent ontologies. The integrated graph can be considered as a view or as a query answer and contains information collected from di erent ontologies. The method can also be useful to check the correctness of the mappings, because the user can see whether the merged concept is correctly placed in the hierarchy.</p>
      </abstract>
      <kwd-group>
        <kwd>information integration</kwd>
        <kwd>ontology integration</kwd>
        <kwd>ontology mapping</kwd>
        <kwd>module extraction</kwd>
        <kwd>ontology visualization</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>In order to nd similar concepts we use the string similarity. For each concept
we extract the following characteristics which are used for the calculation of the
similarity.</p>
      <p>Label The label of the concept. This is in most cases the name of the concept.
Description Description of the concept. This is an annotation property, where
a description of the meaning of the concept is given in a human language.</p>
      <p>At this moment, we make only use of English language in our prototype.
ID This is a concept identi er. In some ontologies this is a unique string
automatically generated during the serialization of the ontology. In other
ontologies the concept ID can be the same as the name of the concept. Because this
property depends on how an ontology is serialized, a little weight is assigned
to it during the calculation of the similarity.</p>
      <p>
        The comparison of concepts from di erent ontologies is done on the basis of
Levenstein distance algorithm [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. By this mapping the concepts from ontology O1
(seed ontology ) are mapped to the concepts from other ontologies O2; O3; : : : ; On.
3
      </p>
    </sec>
    <sec id="sec-2">
      <title>Modules extraction: Graph-based Approach</title>
      <p>We introduce a graph-based approach for module extraction. Although we are
aware that our approach can lead to information loss, it is easy to implement
and is su cient for testing our prototype.</p>
      <p>
        In our previous work [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] we have explained how to represent an ontology as
a graph structure. In this work we elaborate further on this approach. The most
important di erence and di culty that we have to cope with during a module
extraction is that the interesting term is matched not by one but by multiple
concepts in the ontology. It means that we have to redesign our graph extraction
algorithm, and apply it for multiple nodes.
      </p>
      <p>
        The backbone of the graph that represents an ontology module is a spanning
tree. Nodes in the spanning tree represent classes in the ontology and edges
represent hierarchical relations as well as other kinds of relations between classes.
These relations are extracted with the help of the reasoner [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Inferring that two
classes C and E are related to each other via R requires extraction of properties
from axioms of the concept. Then the reasoner can be asked whether the concept
C u :9R:E is satis able, if not then it is necessary for the class C to have the
property R with the ller E.
      </p>
      <p>
        In order to extract a module we need to collect multiple graphs created on
the basis of concepts matched by the interesting term. Let S0 be a set of the
concepts which are matched during the term search procedure. This procedure
is implemented as the comparison of label, comment and id of each concept
from the ontology with the user query. We are constructing a graph G that
corresponds to the extracted module, where Go is the set of nodes, and Ge is
the set of edges. The algorithm for module extraction works as follows:
1. Initialize G0 S0
2. G0 G0 [ P0; where P0 is a set of parents of S0 .
3. G0 G0 [ S1; where S1 contains the siblings of the nodes in S0, if the parent
of ci 2 S0 has 2 or more children in S0.
4. G0 G0 [ P1; where P1 contains the parents of nodes in G0; if these parents
have 2 or more children in G0.
5. Ge [eij ; where eij is an edge between nodes ci 2 G0 and cj 2 G0 that
exists in the original ontology.
6. After this point we have the set of probably unconnected trees. The trees
are connected by the following algorithm:
- if the root r of tree ti is descendant of some node s1 of tree tj , then make
s parent of r. Introduce an edge from s to r.
1 With descendant here we mean that s and r are in the transitive closure of the
sublClassOf relation. Whether r is a subclass of s can be solved by Description
Logics[
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] (DL) reasoner.
      </p>
      <p>- otherwise we have trees t1; t2; : : : ; tn which are not subtrees of each
others. For these trees we need to nd the least common subsumer2 a. This
node will be the parent of trees t1; t2; : : : ; tn, and need to be added to
the set G0. For each ti introduce an edge from node a to root of ti.
From this tree an ontology can be created, where the hierarchical relation A
is subclass of B is represented as subclass axiom A v B, and relations of the type
A Relation B are represented as A v 9Relation:B. The annotation properties,
e.g. label, comment, description, etc., can be copied from the original ontology.
4</p>
    </sec>
    <sec id="sec-3">
      <title>Integrated Graph</title>
      <p>We are interested in the integration of a set of di erent ontologies in one graph
structure. We begin with the term that was chosen by the user during the
ontology extraction procedure (root concept ). For each ontology we connect the
children of the most general Thing with this root concept. Then we traverse
each ontology and add the children recursively until we reach the leaves of the
hierarchy. During this walk, for each concept C1, we interrogate the mapping
le in order to check whether there is a mapping for this concept. If this is the
case then we merge the information from every concept in this mapping and
introduce a merged concept M . The concept M will represent C1 and all other
concepts which are mapped to M . When another ontology is traversed and a
concept C2 is found which is also mapped to M , we don't insert this concept
into the graph. Instead, we introduce a new link from the parent of C2 to M .</p>
      <p>
        The integrated graph structure can be visualized with our visualization
approach [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] and further explored by the user. It can also be transformed to the
new ontology.
      </p>
    </sec>
    <sec id="sec-4">
      <title>Acknowledgements References</title>
      <p>This work has been partially supported by the BioRange program of the
Netherlands BioInformatics Centre (NBIC, BSIK grant).
2 Least common subsumer can be calculated by a DL reasoner.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Noy</surname>
            ,
            <given-names>N.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Musen</surname>
            ,
            <given-names>M.A.</given-names>
          </string-name>
          :
          <article-title>Specifying ontology views by traversal</article-title>
          .
          <source>LNCS</source>
          <volume>3298</volume>
          (
          <year>2004</year>
          )
          <volume>713</volume>
          {
          <fpage>725</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Levenshtein</surname>
            ,
            <given-names>V.I.</given-names>
          </string-name>
          :
          <article-title>Binary codes capable of correcting deletions, insertions, and reversals</article-title>
          . (
          <year>1966</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Dmitrieva</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Verbeek</surname>
            ,
            <given-names>F.J.:</given-names>
          </string-name>
          <article-title>Multi-view ontology visualization</article-title>
          .
          <source>In: The 11th International Protege Conference, June 23-26</source>
          ,
          <fpage>2009</fpage>
          - Amsterdam
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4. Pellet:
          <article-title>Owl 2 reasoner for java</article-title>
          . http://clarkparsia.com/pellet
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Baader</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Calvanese</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McGuinness</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nardi</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Patel-Schneider</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>The Description Logic Handbook</article-title>
          . Cambridge University Press, Cambridge (
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>