<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Towards an Ontology-based Mediation Framework for Integrating Biological Data</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Amine Kerzazi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ismael Navas-Delgado</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jose F. Aldana-Montes</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>E.T.S. Ingeniería Informática, Universidad de Málaga, Campus de Teatinos</institution>
          ,
          <addr-line>29071 Málaga</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>In the context of the Life Sciences, the Systems Biology approach is emerging. It is supported by high-throughput methods, which generate more data than can be processed by human inspection alone. Integrating data from heterogeneous knowledge sources means consolidating them to generate new knowledge that cannot be obtained from any single data source. In this paper, we introduce recent improvements to the mediator components and describe their function and importance for biological data integration.</p>
      </abstract>
      <kwd-group>
        <kwd>Semantic Directory</kwd>
        <kwd>ontology</kwd>
        <kwd>Data Integration</kwd>
        <kwd>mapping</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Integrating data from heterogeneous knowledge sources means
consolidating those data to generate new knowledge that cannot
be obtained from any single data source. The field of data integration in the
Semantic Web has gained popularity in recent years; integrated access to multiple
distributed and autonomous data sources is a key challenge for many Semantic Web
applications. In this paper, we introduce an ontology-based mediator framework
(Khaos Ontology-based Mediator Framework, KOMF, http://khaos.uma.es/KOMF/ [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]),
which uses a generic infrastructure to register and manage ontologies, their
relationships, and information about the resources. KOMF has been
successfully instantiated in the context of Molecular Biology to integrate dispersed
data sources. The most prominent approach to the data integration
problem is the wrapper/mediator architecture. In this architecture, a mediator (an
intermediate virtual database with a schema G, following an earlier definition
of a data integration system) is established between the data sources (with a set
of schemas S) and the applications.
      </p>
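      <p>The wrapper/mediator idea described above can be illustrated with a minimal Python sketch. It is not KOMF code: the class <monospace>Mediator</monospace>, the wrapper functions, the global term <monospace>Protein</monospace> and the two toy sources are all hypothetical names introduced here for illustration. The mediator stores no data; it only maps a term of the global schema G to wrappers, each of which hides one source schema from S.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Record = Dict[str, str]
# A wrapper hides one source schema (an element of S) behind a callable
# that returns records already expressed in the vocabulary of G.
Wrapper = Callable[[], List[Record]]


@dataclass
class Mediator:
    """Virtual database: holds no data, only mappings from G to wrappers."""
    mappings: Dict[str, List[Wrapper]] = field(default_factory=dict)

    def register(self, global_term: str, wrapper: Wrapper) -> None:
        self.mappings.setdefault(global_term, []).append(wrapper)

    def query(self, global_term: str) -> List[Record]:
        # Ask every wrapper mapped to the term and concatenate the answers.
        results: List[Record] = []
        for wrapper in self.mappings.get(global_term, []):
            results.extend(wrapper())
        return results


# Two toy sources with different local schemas (hypothetical data).
SOURCE_A = [{"prot_name": "HDC", "org": "Homo sapiens"}]
SOURCE_B = [("ODC", "Mus musculus")]


def wrap_a() -> List[Record]:
    return [{"name": r["prot_name"], "organism": r["org"]} for r in SOURCE_A]


def wrap_b() -> List[Record]:
    return [{"name": n, "organism": o} for n, o in SOURCE_B]


mediator = Mediator()
mediator.register("Protein", wrap_a)
mediator.register("Protein", wrap_b)
proteins = mediator.query("Protein")
print(proteins)
```
]]></preformat>
      <p>In this sketch the application only sees the global term; which sources answer, and in what local format, stays hidden behind the wrappers.</p>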
      <p>The study of data integration proposals has enabled the design of the novel
architecture proposed here, KOMF. This architecture can be used to develop different
data integration systems, and can even be used to emulate existing systems.</p>
      <p>
        This architecture can be used for any kind of database system, including
centralized, distributed, or parallel systems. The main purpose of our architecture is to
provide a semantically unified interface for querying heterogeneous information
sources. The architecture is composed of four major kinds of elements (Figure
1):
        <list list-type="bullet">
          <list-item>
            <p>Semantic directories [<xref ref-type="bibr" rid="ref2">2</xref>] store and manage meta-data concerning a number of
domain ontologies, as well as the relationships among the ontologies and the
data sources/databases.</p>
          </list-item>
          <list-item>
            <p>Data services provide access to data sources/databases that can be
queried through the Web.</p>
          </list-item>
          <list-item>
            <p>The mediator takes queries from the applications and
produces integrated results.</p>
          </list-item>
          <list-item>
            <p>Applications should provide end-user-focused interfaces, so users do not need
to know that a mediation system is being used.</p>
          </list-item>
        </list>
      </p>
      <p>
        We have started working on a pilot system called the Amine System Project
(ASP, http://asp.uma.es/WebMediator) for the integration of biological
information, related to Biochemistry, Molecular Biology and Physiopathology, of a
group of compounds known as biogenic amines. Two general objectives can be
distinguished in this project:
        <list list-type="order">
          <list-item>
            <p>Development of new and more efficient tools for the integration of
information stored in databases, with the aim of detecting new emergent properties
of this system.</p>
          </list-item>
          <list-item>
            <p>Generation of in silico predictive models at different levels of complexity.</p>
          </list-item>
        </list>
        The project is being carried out by a multi-disciplinary group consisting of biochemists,
molecular biologists and computer scientists.
      </p>
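      <p>The role of the semantic directory among the four kinds of elements listed earlier in this section can be sketched as follows. This is an illustrative Python sketch under our own assumptions, not the SD-CORE interface: the class <monospace>SemanticDirectory</monospace>, its method names, the ontology name <monospace>amine-ontology</monospace> and the <monospace>example.org</monospace> service URLs are hypothetical. The directory keeps only meta-data: which ontologies exist, and which data services can supply instances of which concept.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


class SemanticDirectory:
    """Meta-data registry: ontologies, and concept-to-service mappings."""

    def __init__(self) -> None:
        self.ontologies: Dict[str, Set[str]] = {}
        self.services: Dict[Tuple[str, str], List[str]] = defaultdict(list)

    def register_ontology(self, name: str, concepts: Set[str]) -> None:
        self.ontologies[name] = set(concepts)

    def register_service(self, ontology: str, concept: str, url: str) -> None:
        # A service may only be mapped to a concept of a registered ontology.
        if concept not in self.ontologies.get(ontology, set()):
            raise KeyError(f"{concept!r} is not a concept of {ontology!r}")
        self.services[(ontology, concept)].append(url)

    def services_for(self, ontology: str, concept: str) -> List[str]:
        return list(self.services.get((ontology, concept), []))


directory = SemanticDirectory()
directory.register_ontology("amine-ontology", {"Protein", "Compound"})
# Hypothetical data services reachable through the Web.
directory.register_service("amine-ontology", "Protein",
                           "http://example.org/services/proteins-a")
directory.register_service("amine-ontology", "Protein",
                           "http://example.org/services/proteins-b")

# A mediator would consult the directory to plan a query over "Protein".
plan = directory.services_for("amine-ontology", "Protein")
print(plan)
```
]]></preformat>
      <p>Keeping this registry separate from the mediator means that new data services can be registered without changing the mediator or the applications.</p>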
      <p>The main activity carried out in this pilot has been the development of a
prototype for solving a specific problem. The result is AMMO-Prot, the ASP
Model Finder. The problem to be solved by this tool is the following:</p>
      <p>A common and useful strategy to determine the 3D structure of a protein
when it cannot be obtained by crystallization is to apply comparative modelling
techniques. These techniques start from the primary sequence of the target protein
and ultimately predict its 3D structure by comparing the target polypeptide to those of
solved homologous proteins.</p>
      <p>
        Another tool built on KOMF is the Systems
Biology Metabolic Modelling Assistant (http://www.sbmm.uma.es) [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], a
tool developed to search, visualize, manipulate and annotate identity data and
to assist in annotating kinetic data. The inputs for a pathway search are the
pathway's name or KEGG code, a set of enzymes, or a correctly annotated model
(homemade or not). The application queries KOMF via conjunctive queries and
obtains a set of RDF instances. The results are converted to a format that the
application understands, and the user can ask for information on enzymes,
compounds and reactions. Users can also edit the pathway freely or with an
assistant, adding well-formed kinetic rules. At the end of the process the user
can export the pathway to SBML format, enriched automatically without any
previous configuration.
      </p>
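      <p>To make the notion of a conjunctive query over RDF instances concrete, the following Python sketch evaluates a conjunction of triple patterns against a small in-memory set of triples. It is a naive nested-loop evaluator written for illustration only; it does not reproduce the KOMF query language or API, and the prefixes, predicate names and data are hypothetical. Terms starting with <monospace>?</monospace> are variables, and an answer is a variable binding that satisfies every pattern.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from typing import Dict, Iterable, List, Optional, Tuple

Triple = Tuple[str, str, str]
Binding = Dict[str, str]


def is_var(term: str) -> bool:
    return term.startswith("?")


def match(pattern: Triple, triple: Triple, binding: Binding) -> Optional[Binding]:
    """Extend binding so that pattern equals triple, or return None."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            # Bind the variable, or check consistency with an earlier binding.
            if new.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return new


def evaluate(query: List[Triple], triples: Iterable[Triple]) -> List[Binding]:
    """Return all bindings that satisfy every pattern of the conjunction."""
    data = list(triples)
    bindings: List[Binding] = [{}]
    for pattern in query:
        next_bindings: List[Binding] = []
        for b in bindings:
            for t in data:
                extended = match(pattern, t, b)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


# Toy instance data (hypothetical identifiers and predicates).
TRIPLES = [
    ("enz:HDC", "rdf:type", "Enzyme"),
    ("enz:HDC", "catalyses", "rx:R1"),
    ("rx:R1", "hasProduct", "cpd:histamine"),
    ("enz:ODC", "rdf:type", "Enzyme"),
    ("enz:ODC", "catalyses", "rx:R2"),
    ("rx:R2", "hasProduct", "cpd:putrescine"),
]

# "Which enzymes catalyse a reaction that produces histamine?"
QUERY = [
    ("?e", "rdf:type", "Enzyme"),
    ("?e", "catalyses", "?r"),
    ("?r", "hasProduct", "cpd:histamine"),
]

answers = evaluate(QUERY, TRIPLES)
print(answers)
```
]]></preformat>
      <p>A real mediator would split such a query among the registered data services and join the partial answers; the sketch only shows the join semantics on data that is already in one place.</p>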
    </sec>
    <sec id="sec-2">
      <title>Discussion and Conclusions</title>
      <p>This paper has introduced KOMF, an ontology-based mediation framework
that uses an ontology as the mediated schema. Queries are therefore expressed
over that ontology, and results are ontology instances. The use of ontologies enables
reasoning to be included at different levels, making it possible to infer new knowledge.
This framework has been validated in molecular biology and systems biology. In
this context we have defined a domain ontology, and a set of data sources has been
registered and successfully integrated. KOMF is used to integrate
biological information related to the Biochemistry, Molecular Biology and
Physiopathology of a group of compounds known as biogenic amines. It is also
used by the Systems Biology Metabolic Modelling Assistant to search, visualize,
manipulate and annotate identity data and to assist in annotating kinetic data.
As future work, we intend to use reasoning to infer information that is not stored anywhere
but is a logical consequence of the stored information. Furthermore,
we are defining how to integrate data transformation tools so that
integrated data can be transformed to solve more complex tasks (such as protein
structure prediction and protein alignment).</p>
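      <p>As an illustration of what inferring information that is a logical consequence of the stored data can mean, the following Python sketch applies a single RDFS-style rule (an instance of a class is also an instance of its superclasses) until no new facts appear. The rule, the function name <monospace>infer_types</monospace> and the toy class hierarchy are our own assumptions for this example; the reasoning planned for KOMF is not specified here and may differ.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]


def infer_types(triples: Set[Triple]) -> Set[Triple]:
    """Forward-chain one rule to a fixed point:
    (x type C) and (C subClassOf D)  implies  (x type D)."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        for (x, p1, c) in list(inferred):
            if p1 != "type":
                continue
            for (c2, p2, d) in list(inferred):
                if p2 == "subClassOf" and c2 == c:
                    fact = (x, "type", d)
                    if fact not in inferred:
                        inferred.add(fact)
                        changed = True
    return inferred


# Stored facts (hypothetical toy ontology and instance).
STORED = {
    ("HDC", "type", "Decarboxylase"),
    ("Decarboxylase", "subClassOf", "Lyase"),
    ("Lyase", "subClassOf", "Enzyme"),
}

closure = infer_types(STORED)
new_facts = closure - STORED
print(sorted(new_facts))
```
]]></preformat>
      <p>The two derived facts are stored in no source, yet a query for all enzymes would return HDC once they are taken into account.</p>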
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>Othmane</given-names>
            <surname>Chniber</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Amine</given-names>
            <surname>Kerzazi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Ismael</given-names>
            <surname>Navas-Delgado</surname>
          </string-name>
          and
          <string-name>
            <given-names>Jose F.</given-names>
            <surname>Aldana-Montes</surname>
          </string-name>
          .
          <article-title>KOMF: the Khaos ontology-based mediation framework</article-title>
          .
          In Luciano Milanesi and Paolo Romano (eds.),
          <source>Bioinformatics Methods for Biomedical Complex System Applications</source>
          .
          19-21 May 2008, Villa Monastero, Varenna, Italy. pp.
          <fpage>57</fpage>
          -
          <lpage>60</lpage>
          . NETTAB,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Navas-Delgado</surname>
            <given-names>I</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kerzazi</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chniber</surname>
            <given-names>O</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Aldana-Montes</surname>
            <given-names>J</given-names>
          </string-name>
          :
          <article-title>SD-CORE: a semantic middleware applied to molecular biology</article-title>
          .
          In Proceedings of
          <source>On the Move to Meaningful Internet Systems: OTM Workshops</source>
          ; 9-14 November 2008; Monterrey.
          <year>2008</year>
          :
          <fpage>976</fpage>
          -
          <lpage>985</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Reyes-Palomares</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Montanez</surname>
            <given-names>R</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Real-Chicharro</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chniber</surname>
            <given-names>O</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kerzazi</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Navas-Delgado</surname>
            <given-names>I</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Medina</surname>
            <given-names>MA</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Aldana-Montes</surname>
            <given-names>JF</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sanchez-Jimenez</surname>
            <given-names>F</given-names>
          </string-name>
          :
          <article-title>Systems biology metabolic modeling assistant: an ontology-based tool for the integration of metabolic data in kinetic modeling</article-title>
          .
          <source>Bioinformatics</source>
          <year>2009</year>
          ,
          <volume>25</volume>
          :
          <fpage>834</fpage>
          -
          <lpage>835</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>