<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Applying Reasoning to Instance Transformation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Adrian Mocan</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mick Kerrigan</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Emilia Cimpian</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Semantic Technology Institute Innsbruck, Leopold-Franzens Universität Innsbruck</institution>
          ,
          <country country="AT">Austria</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Significant effort has been invested by the Semantic Web community in methodologies and tools for creating ontology mappings. Various techniques for mapping creation have been developed; however, less interest has been shown in the actual usage of the created mappings and their application in concrete mediation scenarios. In this paper we show how mappings can be converted to logical rules and evaluated by a Datalog reasoner in order to produce the desired mediation result. The mediation scenario discussed in this work is instance transformation: data expressed as ontology instances is transformed from the terms of one ontology into the terms of another ontology based on mappings created in a design-time phase. We analyze the required reasoning task and describe how the mappings created at design-time can be grounded at runtime. We also explore strategies and techniques that we employed in order to improve the efficiency of the overall process.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Data mediation in the context of ontologies involves the creation of mappings
between ontologies at the schema level. The set of mappings (or alignment) is
created in advance between two or more ontologies so they can be used
automatically at run-time to solve various heterogeneity problems in a given mediation
scenario. While the process of creating ontology alignments has been well
explored in the last decade in the Semantic Web community [
        <xref ref-type="bibr" rid="ref11 ref14 ref4">4,14,11</xref>
        ], the usage
of the alignments has so far been considered an application-specific detail.
      </p>
      <p>This paper describes the usage of ontology mappings in the instance
transformation scenario. It shows how a relatively simple mapping representation format
can be grounded automatically to complex logical rules. After the grounding is
applied, a general-purpose Datalog-based reasoner can be used to evaluate the
rules and to retrieve the mediated data by query. We argue that separating
the representation of the mappings from their executable form offers advantages that
can be exploited by applying a set of optimizations to the overall instance
transformation process. We propose a grounding mechanism capable of transforming
mappings from an abstract form into executable mapping rules, and then
provide an overview of several optimization techniques suited for this scenario.</p>
      <p>Section 2 gives an overview of the main technologies used in our approach,
while Section 3 describes the grounding of abstract mappings to mapping rules
and the main reasoning task that is being performed. Section 4 briefly introduces
several optimization types that can be applied in this context, and we close the
paper with a presentation of related work and some conclusions.</p>
    </sec>
    <sec id="sec-2">
      <title>Ontology Mappings</title>
      <p>
        The work presented in this paper is part of a broader context, namely the Web
Service Execution Environment (WSMX) [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], and it targets ontologies that
conform to the Web Service Modeling Ontology (WSMO) [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. WSMO ontologies
are expressed using the Web Service Modeling Language (WSML) [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], a language
which offers several language variants. For the approach used in this paper we
consider the Logic Programming branch of WSML, namely the WSML-Flight
and WSML-Rule [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] variants. WSML-Flight is a powerful rule language based
on a logic programming variant of F-Logic [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] and is semantically equivalent to
Datalog with inequality and (locally) stratified negation. WSML-Rule extends
WSML-Flight with further features from Logic Programming, namely the use of
function symbols, unsafe rules, and unstratified negation under the well-founded
semantics.
      </p>
      <p>
        The instance transformation scenario relies on a set of pre-existing mappings
(i.e. an alignment) between the source and the target ontologies. This paper does
not discuss the methods used to create the alignments between the ontologies;
they could be either entirely manual or semi-automatic, assisted by a graphical
tool like the one we proposed in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. In this scenario we choose not to represent the ontology mappings directly
in a specific WSML variant, but instead to
use an intermediary, abstract representation. The Abstract Mapping Language
(AML), a language proposed in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] (extended and elaborated in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] as the
Alignment Format) is used to express the mappings, because it does not commit to
any existing ontology representation language. The abstract mappings state that
a semantic relationship exists between the mapped entities, but the actual
semantics and interpretation are associated in a second step, called grounding (see
Section 3.1). There are a number of reasons behind this design decision.
Reusability: the same set of mappings can be used in various mediation
scenarios. A grounding mechanism can be applied at runtime and a formal semantics
can be associated with these mappings in order to suit the targeted scenario.
Manageability: as ontologies evolve, the set of mappings between them must
be updated. If mappings are represented as rules in a particular ontology
language, the special features and peculiarities of this language are reflected in the
rules as well, which makes them harder to maintain.
      </p>
      <p>Although the AML has its own syntax, in this paper we will use a more
schematic representation of the mappings in examples, both for brevity and to
emphasize the simplicity of this form of mapping representation. As such, for
every mapping we use the following form:</p>
      <p><preformat>mapping(sourceEntity, [sourceEntityRestriction],
        targetEntity, [targetEntityRestriction], typeOfMapping)</preformat></p>
      <p>The source and target entity restrictions are optional conditions on the mapped
entities, and typeOfMapping ∈ {C2C, C2A, A2C, A2A}, where C2C stands for a
concept to concept (or class to class) mapping, A2A stands for an attribute to
attribute mapping<sup>1</sup>, A2C stands for an attribute to concept mapping, and C2A
stands for a concept to attribute mapping. The AML can also be used to express
mappings between relations of arbitrary arity, but due to space constraints
we restrict ourselves here to mappings involving only concepts and attributes.</p>
    </sec>
    <sec id="sec-3">
      <title>Instance Transformation</title>
      <p>The instance transformation scenario can be summarized as follows: different
business actors use ontologies to describe their internal business logic and their
data. Each of these actors uses its own information system, and they interact
with each other as part of arbitrary business processes. However, as the ontologies
of the actors are likely to differ, there is a need for a specialized
service capable of transforming data expressed in terms of a given ontology
(the source ontology) into the terms of another ontology (the target ontology),
allowing the two actors to communicate effectively without changing the way
they represent their data. As the instance transformation occurs at run-time, it
has to be performed completely automatically, using mappings that have already
been created at the schema level during a design-time phase.</p>
      <sec id="sec-3-1">
        <title>Grounding</title>
        <p>Each of the abstract mappings has to be grounded to rules expressed in WSML
in order to be used in a reasoner. WSML-Rule has been preferred
over WSML-Flight since it offers function symbols, which allow constructed
terms to be built and used as new identifiers for the mediated instances,
based on the identifiers of the original source instances. The identifiers of the
target instances are built using the function symbol mediated with two
parameters: the first is the source instance from which the target instance is
derived, and the second is the target concept the source instance is mediated to (e.g.
mediated(johnS, o2#Citizen)).</p>
        <p>Formula 1 shows the grounding of a concept to concept mapping (C2C) to
WSML, where Γ is a grounding function that takes as parameter an abstract
mapping (or a fragment of a mapping, e.g. restrictions) and produces a WSML
rule<sup>2</sup>.
<fn><label>1</label><p>In WSML, attributes are binary relations local to the concept they are defined on,
and they can have either a data-type or another concept as range.</p></fn>
<fn><label>2</label><p>In WSML, variables are preceded by a "?", class memberships are denoted
by "memberOf", conjunctions by "and" and inheritance relationships by
"subConceptOf". Also, α[β hasValue γ] and α[β ofType γ] are atomic
formulas called molecules; α and γ identify instances (or values) and concepts (or
data-types), respectively, while β identifies an attribute.</p></fn></p>
        <p>Formula 2 describes the grounding of the A2A mappings when both the
source and the target attributes have a data-type as range. In this case the
source value is simply copied to the target instance. Lines 3 and 4 ensure that
this rule also covers the cases when the mapped source attribute is inherited
from an upper concept and the actual instance to be mediated is an instance of
a specialized concept (denoted by the ?subCS variable).</p>
        <p><preformat>Γ(mapping(CS.AS, ASR, CT.AT, ATR, A2A)) ↦
  mediated(?x, CT)[AT hasValue ?v] memberOf CT :-
    ?x[AS hasValue ?v] memberOf ?subCS and
    ?subCS subConceptOf CS and mapped(?subCS, CT, ?x) and
    Γ(ASR) and Γ(ATR).                                              (2)</preformat></p>
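        <p>The shape of Formula 2 can be illustrated with a minimal Python sketch. The class A2AMapping and the function ground_a2a are hypothetical names introduced only for this illustration; they are not part of WSMX or of the AML tooling. The rule is emitted as plain text, and restrictions are assumed to have been grounded separately and passed in as strings.</p>
        <preformat><![CDATA[
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class A2AMapping:
    """Abstract attribute-to-attribute mapping with data-type ranges (illustrative)."""
    source_concept: str    # CS
    source_attribute: str  # AS
    target_concept: str    # CT
    target_attribute: str  # AT


def ground_a2a(m: A2AMapping, conditions: tuple = ()) -> str:
    """Return a WSML-like rule string that follows the shape of Formula 2."""
    head = (f"mediated(?x, {m.target_concept})"
            f"[{m.target_attribute} hasValue ?v] memberOf {m.target_concept}")
    body = [
        # Line 3 of the formula: the value is read from the source instance.
        f"?x[{m.source_attribute} hasValue ?v] memberOf ?subCS",
        # Line 4: the instance may belong to a sub-concept of CS.
        f"?subCS subConceptOf {m.source_concept}",
        f"mapped(?subCS, {m.target_concept}, ?x)",
        *conditions,  # already-grounded restrictions, if any (assumption)
    ]
    return head + " :- " + " and ".join(body) + "."


rule = ground_a2a(A2AMapping("o1#Person", "o1#hasChristianName",
                             "o2#Name", "o2#hasFirstName"))
print(rule)
```
]]></preformat>
        <p>Keeping the grounding as a pure function of the abstract mapping is what allows the same mapping to be grounded differently for another scenario or another target language.</p>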
        <p>In Formula 3 the mappings between attributes having as ranges other
concepts are addressed. As shown at line 2, the value of the target attribute is set
to another mediated instance, which is produced by one of the rules generated
by Formulas 2 to 5. At line 5, it is checked if this other mediated instance has
any attributes, in order to avoid the generation of meaningless instances due to
incomplete mappings.</p>
        <p><preformat>Γ(mapping(CS.AS, ASR, CT.AT, ATR, A2A)) ↦
  mediated(?x, CT)[AT hasValue mediated(?i, SubRT)] memberOf CT :-
    ?x[AS hasValue ?i] memberOf ?subCS and
    ?subCS subConceptOf CS and mapped(?subCS, CT, ?x) and
    mediated(?i, SubRT)[?anAttribute hasValue ?aValue] and
    Γ(ASR) and Γ(ATR).                                              (3)</preformat></p>
        <p>Formula 5 is the symmetric counterpart of Formula 4 and grounds the C2A mappings.
<preformat>Γ(mapping(CS, CSR, CT.AT, ATR, C2A)) ↦
  mediated(?x, CT)[AT hasValue mediated(?x, SubRT)] memberOf CT :-
    ?x memberOf CS and
    mediated(?x, SubRT)[?anA hasValue ?aValue] memberOf SubRT and
    Γ(CSR) and Γ(ATR).                                              (5)</preformat></p>
        <p>
          The concept SubRT is extracted during the grounding process based on the
range of the mapped target attribute, i.e. a distinct rule is generated for the
attribute's range and for each of the range's subclasses (since every instance of
a sub-concept of the attribute's range is a valid filler of that attribute). As described
in [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ], mappings between an attribute whose range is a data-type and an attribute
whose range is a concept are not allowed (C2A and A2C mappings compensate for this).
        </p>
        <p>The grounding of the A2C mappings, as seen in Formula 4, handles the
cases when different levels of aggregation are used in the ontologies, e.g. the
attribute CS.A has the concept CRS as range, and the instances of CRS also need to
be transformed into instances of CT. This type of mapping can be applied
only when CRS is a concept, not a data-type.</p>
        <p><preformat>Γ(mapping(CS.AS, ASR, CT, CTR, A2C)) ↦
  mediated(?x, CT)[?anA hasValue ?aValue] memberOf CT :-
    ?x[AS hasValue ?i] memberOf CS and
    mediated(?i, CT)[?anA hasValue ?aValue] memberOf CT and
    Γ(ASR) and Γ(CTR).                                              (4)</preformat></p>
        <p>The restrictions can be seen as conditions that have to hold in order for the
mapping to apply. They are divided into three main classes: attribute occurrence
(aoc), attribute type (atc) and attribute value (avc) conditions. Due to space
constraints, this paper does not include a complete and detailed description of
the grounding of conditions. However, Formulas 6 to 9 show the grounding of
the conditions when the "equals" operator is used. The grounding in Formulas 6
and 7 can be applied to any conditions, while, in order to avoid the creation of
unsafe rules, Formula 8 is used to ground only the avc set on the source entities
and Formula 9 is used to ground only the avc set on the target entities.</p>
        <p><preformat>Γ(atc(C.A, "equals", R)) ↦ C[A ofType R]                            (6)
Γ(aoc(C.A)) ↦ C[A ofType ?aRange]                                   (7)
Γ(avc(C.A, "equals", Value)) ↦ C[A hasValue Value]                  (8)
Γ(avc(C.A, "equals", Value, tVariable)) ↦
    assignement(Id, ?tVariable) and C[A hasValue ?tVariable]
    assignement(Id, Value).                                         (9)</preformat></p>
        <p>The Id uniquely identifies different assignments: one distinct fact is generated
each time a value from a target avc has to be assigned to a target attribute.</p>
      </sec>
      <sec id="sec-3-2">
        <title>Reasoning Task</title>
        <p>After grounding, the alignment can be seen as an ontology that imports the
source and the target ontologies and contains the set of mapping rules.
Conceptually, it can also be seen as a "merged ontology" in which the input
ontologies are put together independently (the separation is realized by
namespaces) and linked by rules. By adding the source data to it and posing queries
in terms of the target ontology, the appropriate mapping rules are triggered and
the mediated data is produced as a result.</p>
        <p>Considering the two ontology fragments in Table 1, a set of mappings such as the
one presented in Listing 1.1 can be created.</p>
        <sec id="sec-3-2-1">
          <title>Listing 1.1. Abstract mappings</title>
          <p>
            Listing 1.2 shows the mapping rules generated by grounding the abstract
mappings depicted in Listing 1.1, using Formulas 1 to 9. These mapping
rules are expressed as WSML axioms and they have a precise semantics, as
defined by WSML-Rule [
            <xref ref-type="bibr" rid="ref3">3</xref>
            ]. They are included in an ontology that imports the
source and the target ontologies. The source instances to be mediated are also
included in this ontology, which is then registered in the reasoner. Listing 1.3
depicts a set of source instances to be mediated by using the mappings shown
above.
          </p>
          <p>Listing 1.2. WSML mapping rules between the Belgian and Italian ontologies</p>
          <p><preformat>wsmlVariant "http://www.wsmo.org/wsml/wsml-syntax/wsml-rule"
namespace { o1 "http://www.semantic-gov.org/BelgianCitizenOntology#",
            o2 "http://www.semantic-gov.org/ItalianCitizenOntology#" }

ontology merged_ontology
  importsOntology { o1#BelgianCitizenOntology, o2#ItalianCitizenOntology }

axiom o2#ccMappingRule18 definedBy
  o2#mappedConcepts(o1#Person, o2#Citizen, ?X17) and
  o2#mediated(?X17, o2#Citizen) memberOf o2#Citizen :- ?X17 memberOf o1#Person.

axiom o2#caMappingRule72 definedBy
  o2#mediated(?X69, o2#Citizen)[o2#hasName hasValue
    o2#mediated1(?X69, o2#Name)] memberOf o2#Citizen :- ?X69 memberOf o1#Person
    and o2#mediated1(?X69, o2#Name)[?A70 hasValue ?V71] memberOf o2#Name.

axiom o2#ccMappingRule12 definedBy
  o2#mappedConcepts(o1#Person, o2#Name, ?X11) and
  o2#mediated(?X11, o2#Name) memberOf o2#Name :- ?X11 memberOf o1#Person.

axiom o2#aaMappingRule48 definedBy
  o2#mediated(?X45, o2#Name)[o2#hasFirstName hasValue ?Y46] memberOf o2#Name :-
    ?X45[o1#hasChristianName hasValue ?Y46] memberOf ?SC47 and
    ?SC47 subConceptOf o1#Person and o2#mappedConcepts(?SC47, o2#Name, ?X45).
...
axiom o2#aaMappingRule84 definedBy
  o2#mediated(?X79, o2#Citizen)[o2#hasSex hasValue ?Y81] memberOf o2#Citizen :-
    ?X79[o1#hasGender hasValue ?Y80] memberOf ?SC83 and
    ?SC83 subConceptOf o1#Person and o2#mappedConcepts(?SC83, o2#Citizen, ?X79)
    and ?X79[o1#hasGender hasValue o1#_1] and o2#assignement_82(?Y81).

axiom o2#assignement_82 definedBy
  o2#assignement_82(o2#M).
...
axiom o2#ccMappingRule14 definedBy
  o2#mappedConcepts(o1#Gender, o2#Sex, ?X13) and
  o2#mediated(?X13, o2#Sex) memberOf o2#Sex :- ?X13 memberOf o1#Gender.</preformat></p>
          <p>Once this ontology is registered in the reasoner, queries can be posed to
retrieve the mediated data. The mediation takes place within the reasoner when
the rules are evaluated and the target instances become implicit knowledge in
the reasoning space. Normally, one or more instances from the source set are
considered to be root instances and are used to determine a starting query point.
Otherwise, the procedure exemplified below must be applied to all the instances
in the source set (although this could lead to redundant computations).</p>
          <p>Listing 1.3. Source instances to be mediated</p>
          <p>Assuming that johnS is the root instance in the set shown in Listing 1.3,
the first step is to identify the most relevant concept from the target
to be mediated to. johnS is an instance of the concept Person, and there are two
concepts in the target that Person is mapped to: Citizen and Name. Since the
concept Name can be reached from Citizen via the hasName attribute (paths
of any length can be explored in this way), the concept Citizen is considered to be
an ancestor of the concept Name and therefore better suited for the instance
johnS to be mapped to<sup>3</sup>.</p>
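          <p>The selection of the most relevant target concept can be sketched in Python as a reachability check. This is only an illustration under stated assumptions: the function names and the dictionary that encodes which concepts occur as attribute ranges are hypothetical, and the inheritance case mentioned in the footnote is not handled.</p>
          <preformat><![CDATA[
```python
from collections import deque


def reachable(start, attribute_ranges):
    """Concepts reachable from `start` by following attribute ranges (any path length)."""
    seen, queue = set(), deque([start])
    while queue:
        concept = queue.popleft()
        for target in attribute_ranges.get(concept, ()):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def select_root_targets(candidates, attribute_ranges):
    """Keep only the candidate concepts that no other candidate can reach."""
    return [c for c in candidates
            if not any(c in reachable(other, attribute_ranges)
                       for other in candidates if other != c)]


# Hypothetical fragment of the target ontology: concept -> ranges of its attributes.
ranges = {"o2#Citizen": ["o2#Name", "o2#Sex", "o2#Date"]}
roots = select_root_targets(["o2#Citizen", "o2#Name"], ranges)
print(roots)
```
]]></preformat>
          <p>With the two candidates from the example, only Citizen remains, because Name is reachable from it through hasName.</p>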
          <p>Once the concept Citizen is selected, queries can be posed to the reasoner.
The first query and the results obtained are shown in Listing 1.4.</p>
          <p>Listing 1.4. The target instance obtained by mediating the johnS instance</p>
          <p><preformat>?- ?x memberOf o2#Citizen.

Found &lt;1&gt; results to the query:
(1) -- ?x = mediated(johnS, o2#Citizen)</preformat></p>
          <p>The target instance mediated(johnS,o2#Citizen) has to be explored and its
attributes and their values retrieved. The next query and the obtained answers
are shown in Listing 1.5. For each of the attributes (?y) this query retrieves both
the value (?z) and the type of this value (?avC).</p>
          <p>Listing 1.5. Finding the attributes and their values for a target instance</p>
          <p><preformat>?- mediated(johnS, o2#Citizen)[?y hasValue ?z] memberOf o2#Citizen and ?z memberOf ?avC.
Found &lt;3&gt; results to the query:
(1) -- ?z = o2#M, ?avC = o2#Sex, ?y = o2#hasSex
(2) -- ?z = mediated(johnS, o2#Name), ?avC = o2#Name, ?y = o2#hasName
(3) -- ?z = mediated(johnS_birthDate, o2#Date), ?avC = o2#Date, ?y = o2#hasBirthday

?- mediated(johnS, o2#Name)[?y hasValue ?z] memberOf o2#Name and ?z memberOf ?avC.
Found &lt;2&gt; results to the query:
(1) -- ?z = John, ?avC = string, ?y = o2#hasFirstName
(2) -- ?z = Smith, ?avC = string, ?y = o2#hasSurname</preformat>
<fn><label>3</label><p>In the presence of an inheritance hierarchy, if for example there are multiple
mappings from the concept Person to several concepts in this hierarchy, the most specific
concept would qualify.</p></fn></p>
          <p>For each of the attribute values having as type a concept the query process
continues recursively. Listing 1.5 illustrates the queries for the hasName attribute
and mediated(johnS,o2#Name) value.</p>
          <p>By applying this querying mechanism all the target instances can be retrieved
and materialized. The obtained mediated instances are shown in Listing 1.6.</p>
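          <p>The recursive querying procedure can be sketched as a simple worklist loop. The following Python fragment is a minimal illustration and not the WSMX implementation: the reasoner is replaced by a hard-coded table that mirrors the answers of Listing 1.5, and the names materialize, ANSWERS and the test used to recognise concept-typed values are assumptions made for this example.</p>
          <preformat><![CDATA[
```python
def materialize(root, query_attributes, is_concept):
    """Collect all instances reachable from `root` by repeated attribute queries."""
    result, pending = {}, [root]
    while pending:
        instance = pending.pop()
        if instance in result:
            continue  # already explored; avoids redundant queries
        rows = query_attributes(instance)
        result[instance] = [(attr, value) for attr, value, _ in rows]
        # Values whose type is a concept are instances themselves: explore them too.
        pending.extend(value for _, value, vtype in rows if is_concept(vtype))
    return result


# Stand-in for the reasoner: (attribute, value, type of value) per instance.
# Identifier spellings are assumed from the listings.
ANSWERS = {
    "mediated(johnS, o2#Citizen)": [
        ("o2#hasName", "mediated(johnS, o2#Name)", "o2#Name"),
        ("o2#hasSex", "o2#M", "o2#Sex"),
        ("o2#hasBirthday", "mediated(johnS_birthDate, o2#Date)", "o2#Date"),
    ],
    "mediated(johnS, o2#Name)": [
        ("o2#hasFirstName", "John", "string"),
        ("o2#hasSurname", "Smith", "string"),
    ],
}

instances = materialize("mediated(johnS, o2#Citizen)",
                        lambda i: ANSWERS.get(i, []),
                        lambda t: t.startswith("o2#"))
for name, attributes in instances.items():
    print(name, attributes)
```
]]></preformat>
          <p>Each distinct instance is queried once, so the number of queries grows with the number of mediated instances rather than with the number of paths leading to them.</p>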
        </sec>
        <sec id="sec-3-2-2">
          <title>Listing 1.6. Mediated target instances</title>
          <p><preformat>instance mediated(johnS, o2#Citizen) memberOf o2#Citizen
  o2#hasName hasValue mediated(johnS, o2#Name)
  o2#hasSex hasValue o2#M
  o2#hasBirthday hasValue mediated(johnS_birthDate, o2#Date)

instance mediated(johnS, o2#Name) memberOf o2#Name
  o2#hasFirstName hasValue "John"
  o2#hasSurname hasValue "Smith"

instance mediated(johnS_birthDate, o2#Date) memberOf o2#Date
  o2#day hasValue 11
  o2#year hasValue 1980
  o2#month hasValue 4</preformat></p>
          <p>
            The logic program corresponding to the "merged ontology" is decidable even
if this ontology is expressed in WSML-Rule, which is generally undecidable [
            <xref ref-type="bibr" rid="ref3">3</xref>
            ].
The reason is that the function symbols are used in such a way that only one
level of constructed terms can be generated. That is, the mapping rules will never
generate terms like mediated(mediated(...(X, C)...)). Formulas 1 to 5 build the
constructed term mediated(?x, CT) memberOf CT only if ?x memberOf CS
holds. Since CS and CT are concepts from the source and the target ontology,
respectively, they are separated by namespaces and, as a consequence, distinct.
          </p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Effective Data Mediation</title>
      <p>In this section we introduce the lessons learned from applying the instance
transformation scenario described in Section 3 to use cases in two EU funded projects,
namely SEEMP and SemanticGov<sup>4</sup>. SEEMP enables the exchange of data
between different Employment Services in Europe that use their own data
structures and taxonomies. SemanticGov aims to build the infrastructure necessary
to enable the provision of Semantic Web Services for Public Administration.</p>
      <p>
        <bold>Optimizations for the Instance Transformation Scenario.</bold>
In the context of the SEEMP project, mappings between the local ontology of
each employment service and the reference ontology of the marketplace were
made at design time using the mapping tools [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] in the Web Service Modeling
Toolkit (WSMT) [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], thus transforming an instance from one local employment
service to another was done by transforming the source instance into terms of
the reference ontology and then transforming this instance into terms of the
other local ontology. However, once the mappings had been created it became obvious that
the performance of the reasoning during instance transformation was
an issue. Further investigation showed that the number of
facts and rules registered in the reasoner was very high. Essentially, for a given
instance transformation the source and target ontologies were very large,
and the mappings between these two large structures caused the forward
chaining algorithms within the reasoner to require significant computation time.
Also, the instance data to transform was very large and parts of this information
were untransformable due to a lack of mappings. In this section we explore the
optimizations performed in the SEEMP project to improve the performance of
the overall instance transformation process.
<fn><label>4</label><p>For more details see http://www.seemp.org and http://www.semantic-gov.org.</p></fn>
      </p>
      <p>Source Instance Filtering: In a given scenario only a certain proportion of the
input source instances can be transformed, depending on the coverage of the available
mappings. In scenarios where this coverage is low the reasoner contains a lot of
instance data that will never be transformed to the target ontology. This
additional information is unnecessary for the instance transformation step, but
having it within the reasoner means that it is used in the model
computations, so its removal improves the overall performance. To this end a
source instance pre-filter was added to the instance transformation process that
removes those parts of the source instances that cannot be transformed. This
filter can have quite dramatic effects on the quantity of data loaded into the
reasoner, as the absence of one attribute to attribute mapping can remove entire
branches of source instances. With this filter in place, an improvement of
between 5% and 10% was observed in most of our test cases. The
performance improvement that can be gained by applying this filter is proportional
to the amount of extra source data that is untransformable. Thus in scenarios
where there is 100% coverage the filter will not remove any content and the
performance will remain the same.</p>
      <p>Mapping Filtering: In other cases the source instance set can be very small
and the number of mappings very large. The extra mappings that are
loaded into the reasoner will only partially fire, without producing any meaningful
and valid result. Further exploration of the SEEMP project test cases showed
that this was not an extraordinary scenario but in fact the normal case. Thus
a mapping filter was added to the instance transformation process, removing
those mappings unrelated to the source instances. This filter showed a large
performance improvement of between 40% and 50% for the average test case in
the SEEMP project. The ontologies in the test cases contain a number of very
large taxonomies with many mappings between them, and approximately 70% of the
mappings created at design time are filtered away at runtime by the mapping
filter for these test cases.</p>
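      <p>A mapping filter of this kind can be sketched in a few lines of Python. The criterion used here, keeping a mapping only if its source concept is the concept of some source instance or one of that concept's super-concepts, is an assumption made for illustration, as are the function name, the tuple layout and the concepts o1#Student, o1#Company and o2#Organisation.</p>
      <preformat><![CDATA[
```python
def filter_mappings(mappings, instance_concepts, superconcepts):
    """Drop mappings whose source concept is unrelated to the source instances.

    mappings          -- iterable of (source_concept, target_concept, kind) tuples
    instance_concepts -- concepts that the source instances are members of
    superconcepts     -- concept -> iterable of all its super-concepts
    """
    relevant = set(instance_concepts)
    for concept in instance_concepts:
        # Mappings defined on a super-concept still apply to its instances.
        relevant.update(superconcepts.get(concept, ()))
    return [m for m in mappings if m[0] in relevant]


mappings = [
    ("o1#Person", "o2#Citizen", "C2C"),
    ("o1#Gender", "o2#Sex", "C2C"),
    ("o1#Company", "o2#Organisation", "C2C"),  # hypothetical, unrelated to the input
]
kept = filter_mappings(mappings,
                       {"o1#Student", "o1#Gender"},        # hypothetical input concepts
                       {"o1#Student": ["o1#Person"]})
print(kept)
```
]]></preformat>
      <p>Because the filter only inspects concept membership, it can run before anything is registered in the reasoner.</p>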
      <p>Source and Target Ontology Filtering: Once both the source
instances and the mappings have been filtered, the bulk of the data remaining in the reasoner is the
source and target ontologies. Dynamically filtering ontologies prior to putting
them in a reasoner is a non-trivial affair; however, initial research appears to show
that a naive filter of the ontologies based on the source instances and the available
mappings can be used to remove those parts of the source and target ontologies
which are unnecessary for a given instance transformation process. Our ongoing
research seems to show that a 50% to 60% performance improvement can be gained on top of
the improvements already obtained from the source instance and mapping filters.</p>
      <p>The combination of these filters can dramatically reduce the amount of data
that is loaded into the reasoner at runtime and therefore improve the performance of the
overall instance transformation process. Crucially, these filters are all dynamic
and applied at runtime, so no human effort is needed to configure
them. In our current system each of these filters is turned on and off via a flag,
so they can be applied in those scenarios where they are desirable and turned
off where they are not.</p>
      <p>There are situations when the data values embedded in the source instances need
to be transformed before they can be assigned to the attributes of the target
instances. Such transformations could involve simple string concatenations or
more complex, dynamic transformations based on external factors, for example
currency conversions.</p>
      <p>Listing 1.7. Example of a value transformation service usage</p>
      <p><preformat>axiom o2#aaMappingRule definedBy
  o2#mediated(?X45, o2#Citizen)[o2#hasName hasValue
    http://example.org/concatService(?Y46, ?Y47)] memberOf o2#Citizen :-
    ?X45[o1#hasChristianName hasValue ?Y46] memberOf ?SC47 and
    ?X45[o1#hasSurname hasValue ?Y47] memberOf o1#Person and
    ?SC47 subConceptOf o1#Person and o2#mappedConcepts(?SC47, o2#Citizen, ?X45).

instance mediated(johnS, o2#Citizen) memberOf o2#Citizen
  o2#hasName hasValue http://example.org/concatService("John", "Smith")
  o2#hasSex hasValue o2#M
  o2#hasBirthday hasValue mediated(johnS_birthDate, o2#Date)</preformat></p>
      <p>Normally, conversions of this kind require either the use of specialized data
manipulation functions within the reasoning environment or access to
external services that can apply the conversion after the reasoning has occurred. Since the
first option would significantly limit the types of transformation allowed, we
rely on the second approach. During the creation
of mappings, the domain experts can specify an arbitrary identifier to denote the service they would like to use.
At run-time, the reasoner is used only to assign to the target recipient a string
encoding of the service and its parameters. After the mediated target instances
are retrieved from the reasoner, a post-processing phase identifies and evaluates
every service/parameter encoding.</p>
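      <p>The post-processing phase can be sketched as follows. This Python fragment is only an illustration: the registry SERVICES, the regular expression used to recognise an encoding, the assumption that parameters are quoted strings, and the use of a single space as separator in the concatenation are all choices made for this example and are not taken from the WSMX implementation.</p>
      <preformat><![CDATA[
```python
import re

# Hypothetical registry: service identifier -> implementation available at run-time.
SERVICES = {
    "http://example.org/concatService": lambda first, last: f"{first} {last}",
}

# An encoding is assumed to look like: identifier("arg1", "arg2", ...)
ENCODING = re.compile(r'^([^()\s]+)\((.*)\)$')


def evaluate(value):
    """Replace a service/parameter encoding by the result of the service call."""
    match = ENCODING.match(value)
    if not match or match.group(1) not in SERVICES:
        return value  # plain value or unknown identifier: leave untouched
    args = re.findall(r'"([^"]*)"', match.group(2))
    return SERVICES[match.group(1)](*args)


name = evaluate('http://example.org/concatService("John", "Smith")')
print(name)
```
]]></preformat>
      <p>Values that are not registered service encodings, such as o2#M or mediated(johnS, o2#Citizen), pass through unchanged, so the same pass can be applied to every attribute value retrieved from the reasoner.</p>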
      <p>For example, assuming that the attribute hasName of the Citizen concept in
the target ontology has as range a string representing the concatenation
of the first name and the surname of a person, such a value transformation
service would be required. Listing 1.7 shows the corresponding WSML mapping
rule and the attribute value produced after reasoning.</p>
      <p>The set of services that can be used in the mapping process should be
customizable at the level of the mapping creation tool; services could be added, removed or briefly
described. It is important to note that at that level no implementation has to
be provided: the implementation has to be available and integrated with the
instance transformation component only at run-time.</p>
    </sec>
    <sec id="sec-5">
      <title>Related Work</title>
      <p>
        TSIMMIS [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] is a system for integrating information coming from heterogeneous
data sources. TSIMMIS includes a mediator-generator able to produce
mediator descriptions in the Mediator Specification Language (MSL) [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. MSL and
the abstract mapping language's grounding (AMLg) proposed in this paper are
based on the same general principles: they both rely on Datalog-like rules which
construct target data based on the information from the sources. However, MSL
relies on patterns to match and create data from the source and target data sets,
while AMLg acts on schema level elements. Additionally, using MSL, one needs
to construct paths from root objects down to the relevant information in the
model, while with AMLg separate rules are created. These rules are eventually
"composed" by a reasoner in order to produce the desired target data out of the
source instances. This strategy ensures that when new schema elements need to
be included in the mappings set, no re-engineering of the existing mappings or
rules is necessary. Another relevant similarity between TSIMMIS and the
approach in this paper is the usage of Skolem functions [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] or function symbols.
Both approaches use them to encode information regarding how the target
object or instance has been derived from the source. While this information is used
by MSL only to build special target objects, it plays a crucial role in AMLg,
allowing the results from individual rules to be combined into one complex
result instance.
      </p>
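      <p>The role of function symbols in combining the results of separate rules can be illustrated with a small sketch. It is a toy forward evaluation in Python, not a Datalog reasoner and not the AMLg grounding itself; the fact layout, the rule functions and the function symbol "mapped" are assumptions made for illustration. Each rule derives facts about a target instance identified by a Skolem term built from the source instance, so independent rules contribute to the same target instance.</p>
      <preformat>
```python
from collections import defaultdict

# Source instance data as (instance, attribute, value) facts (hypothetical names).
source_facts = [
    ("p1", "memberOf", "Human"),
    ("p1", "hasName", "Ann"),
    ("p1", "hasAge", 30),
]


def skolem(function_symbol, *args):
    """Build a Skolem term: equal symbol and arguments give an equal identifier."""
    return (function_symbol,) + args


# Each rule is written separately and knows nothing about the other rules.
def rule_concept(facts):
    return [(skolem("mapped", s), "memberOf", "Person")
            for (s, a, v) in facts if a == "memberOf" and v == "Human"]


def rule_name(facts):
    return [(skolem("mapped", s), "name", v)
            for (s, a, v) in facts if a == "hasName"]


def rule_age(facts):
    return [(skolem("mapped", s), "age", v)
            for (s, a, v) in facts if a == "hasAge"]


def transform(facts, rules):
    """Apply every rule and group the derived facts by target identifier."""
    target = defaultdict(dict)
    for rule in rules:
        for (t, a, v) in rule(facts):
            target[t][a] = v
    return dict(target)


result = transform(source_facts, [rule_concept, rule_name, rule_age])
```
      </preformat>
      <p>Because all three rules produce the same term for the source instance p1, their outputs end up on a single target instance. Adding a rule for a further attribute only extends the list passed to transform and leaves the existing rules unchanged.</p>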
      <p>
        Abiteboul and colleagues [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] propose a middleware data model as the basis for
the integration task, and declarative rules to specify the integration. The model
is a minimal one and the data structure consists of ordered labeled trees. The
authors also consider two types of rules: correspondence rules used to express
relationships between the nodes of their model (similar to our mappings in the
AML but expressed in Datalog) and translation rules, a decidable sub-case for
the actual data translation (resembling our mapping rules expressed in WSML).
      </p>
    </sec>
    <sec id="sec-6">
      <title>Conclusions</title>
      <p>This paper describes an approach that uses reasoning to perform data mediation,
based on pre-existing alignments between ontologies. The alignments consist of
mappings expressed in the Abstract Mapping Language (AML), a form of representation
that does not commit to any ontology representation language or formalism. In
order to apply these mappings in a concrete mediation scenario they have to
be grounded to a concrete language and given a formal semantics.
This paper provides a grounding of the AML to WSML-Rule, suited for the
instance transformation scenario. A detailed example of the reasoning task is
also provided, together with a set of optimizations that can be applied in order to
improve the performance of the instance transformation process. Additionally,
the paper describes a way of using value transformation services in conjunction
with reasoning without being restricted to the reasoner's set of built-ins.</p>
      <p>A full implementation of the instance transformation component is available
as part of WSMX and can be downloaded from http://sourceforge.net/projects/wsmx/.
As future work we plan to conduct a full evaluation of the
performance improvements discussed in this paper using the showcases developed
in the EU-funded projects where this prototype is being developed and used.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgements</title>
      <p>This work is funded by the European Commission under the projects
KnowledgeWeb, SEEMP, SemanticGov, SUPER and SHAPE, and by the FFG (Österreichische
Forschungsförderungsgesellschaft mbH) under the project SemBiz.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>S.</given-names>
            <surname>Abiteboul</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Cluet</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Milo</surname>
          </string-name>
          .
          <article-title>Correspondence and Translation for Heterogeneous Data</article-title>
          .
          <source>In Proc. of the 6th Intl Conf on Database Theory (ICDT-1997)</source>
          , pages
          <fpage>351</fpage>
          –
          <lpage>363</lpage>
          , Delphi, Greece,
          <year>1997</year>
          . Springer-Verlag.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>J.</given-names>
            <surname>de Bruijn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Foxvog</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K.</given-names>
            <surname>Zimmerman</surname>
          </string-name>
          .
          <article-title>Ontology Mediation Patterns Library</article-title>
          .
          <source>SEKT Project D4.3.1</source>
          , available at: http://www.sekt-project.com,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>J.</given-names>
            <surname>de Bruijn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Lausen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Krummenacher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Polleres</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Predoiu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kifer</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Fensel</surname>
          </string-name>
          .
          <article-title>The Web Service Modeling Language WSML</article-title>
          .
          <source>WSML Working Draft D16.1v0.21</source>
          , available at: http://www.wsmo.org/TR/,
          <year>Oct 2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>M.</given-names>
            <surname>Ehrig</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Staab</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Sure</surname>
          </string-name>
          .
          <article-title>Bootstrapping Ontology Alignment Methods with APFEL</article-title>
          .
          <source>Proc of the 4th Intl Semantic Web Conf (ISWC-2005)</source>
          ,
          <year>Nov 2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>J.</given-names>
            <surname>Euzenat</surname>
          </string-name>
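          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Scharffe</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L.</given-names>
            <surname>Serafini</surname>
          </string-name>
          .
          <article-title>Specification of the alignment format</article-title>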
          .
          <source>Knowledge Web Deliverable D2.2.6</source>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>H.</given-names>
            <surname>Garcia-Molina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Papakonstantinou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Quass</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rajaraman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Sagiv</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. D.</given-names>
            <surname>Ullman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Vassalos</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Widom</surname>
          </string-name>
          .
          <article-title>The TSIMMIS approach to mediation: Data models and languages</article-title>
          .
          <source>Journal of Intelligent Information Systems</source>
          ,
          <volume>8</volume>
          (
          <issue>2</issue>
          ),
          <year>1997</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>M.</given-names>
            <surname>Kerrigan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mocan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Tanler</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Fensel</surname>
          </string-name>
          .
          <article-title>The Web Service Modeling Toolkit - An Integrated Development Environment for SWS</article-title>
          .
          <source>In Proc of the 4th European Semantic Web Conf (ESWC-2007)</source>
          , Austria,
          <year>Jun 2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>M.</given-names>
            <surname>Kifer</surname>
          </string-name>
          , G. Lausen, and
          <string-name>
            <given-names>J.</given-names>
            <surname>Wu</surname>
          </string-name>
          .
          <article-title>Logical foundations of object-oriented and frame-based languages</article-title>
          .
          <source>Journal of the ACM</source>
          , (
          <volume>42</volume>
          ):
          <fpage>741</fpage>
          –
          <lpage>843</lpage>
          ,
          <year>July 1995</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>A.</given-names>
            <surname>Mocan</surname>
          </string-name>
          and
          <string-name>
            <given-names>E.</given-names>
            <surname>Cimpian</surname>
          </string-name>
          .
          <article-title>An Ontology-Based Data Mediation Framework for Semantic Environments</article-title>
          .
          <source>Intl Journal on Semantic Web and Information Systems (IJSWIS)</source>
          ,
          <volume>3</volume>
          (
          <issue>2</issue>
          ), April - June
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>A.</given-names>
            <surname>Mocan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Moran</surname>
          </string-name>
          , E. Cimpian, and
          <string-name>
            <given-names>M.</given-names>
            <surname>Zaremba</surname>
          </string-name>
          .
          <article-title>Filling the Gap - Extending Service Oriented Architectures with Semantics</article-title>
          .
          <source>In Proc of the IEEE Int Conf on e-Business Engineering (ICEBE-2006)</source>
          , China, Oct
          <year>2006</year>
          . IEEE Computer Society.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>N. F.</given-names>
            <surname>Noy</surname>
          </string-name>
          and
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Musen</surname>
          </string-name>
          .
          <article-title>The PROMPT Suite: Interactive Tools For Ontology Merging And Mapping</article-title>
          .
          <source>Intl Jrnl of Human-Computer Stud.</source>
          ,
          <volume>59</volume>
          (
          <issue>6</issue>
          ):
          <fpage>983</fpage>
          –
          <lpage>1024</lpage>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>Y.</given-names>
            <surname>Papakonstantinou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Garcia-Molina</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J. D.</given-names>
            <surname>Ullman</surname>
          </string-name>
          .
          <article-title>MedMaker: A Mediation System Based on Declarative Specifications</article-title>
          .
          <source>In Proc of the 12th Intl Conf on Data Engineering</source>
          , USA,
          <year>1996</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <given-names>D.</given-names>
            <surname>Roman</surname>
          </string-name>
          , U. Keller, H. Lausen, J. de Bruijn,
          <string-name>
            <given-names>R.</given-names>
            <surname>Lara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Stollberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Polleres</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Feier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bussler</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Fensel</surname>
          </string-name>
          .
          <article-title>Web Service Modeling Ontology</article-title>
          .
          <source>Applied Ontology</source>
          ,
          <volume>1</volume>
          (
          <issue>1</issue>
          ):
          <fpage>77</fpage>
          –
          <lpage>106</lpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <given-names>N.</given-names>
            <surname>Silva</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Rocha</surname>
          </string-name>
          .
          <article-title>Semantic Web Complex Ontology Mapping</article-title>
          .
          <source>In Proc of the IEEE Web Intelligence (WI-2003)</source>
          , Canada, Oct
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>