
Applying Reasoning to Instance Transformation

Adrian Mocan

Mick Kerrigan

Emilia Cimpian

Semantic Technology Institute Innsbruck, Leopold-Franzens Universität Innsbruck, Austria

Significant effort has been invested by the Semantic Web community in methodologies and tools for creating ontology mappings. Various techniques for mapping creation have been developed; however, less interest has been shown in the actual usage of the created mappings and their application in concrete mediation scenarios. In this paper we show how mappings can be converted to logical rules and evaluated by a Datalog reasoner in order to produce the desired mediation result. The mediation scenario discussed in this work is instance transformation: data expressed as ontology instances is transformed from the terms of one ontology into the terms of another ontology based on mappings created in a design-time phase. We analyze the required reasoning task and describe how the mappings created at design-time can be grounded at runtime. We also explore strategies and techniques that we employed in order to improve the efficiency of the overall process.

1 Introduction

Data mediation in the context of ontologies involves the creation of mappings between ontologies at the schema level. The set of mappings (or alignment) is created in advance between two or more ontologies so they can be used automatically at run-time to solve various heterogeneity problems in a given mediation scenario. While the process of creating ontology alignments has been well explored in the last decade in the Semantic Web community [4, 14, 11], the usage of the alignments has so far been considered an application-specific detail.

This paper describes the usage of ontology mappings in the instance transformation scenario. It shows how a relatively simple mapping representation format can be grounded automatically to complex logical rules. After the grounding is applied, a general-purpose Datalog-based reasoner can be used to evaluate the rules and to retrieve the mediated data by query. We argue that separating the representation of the mappings from their executable form offers advantages that can be exploited in applying a set of optimizations to the overall instance transformation process. We propose a grounding mechanism capable of transforming mappings from an abstract form into executable mapping rules, and we go on to provide an overview of several optimization techniques suited for this scenario.

Section 2 gives an overview of the main technologies used in our approach, while Section 3 describes the grounding of abstract mappings to mapping rules and the main reasoning task that is being performed. Section 4 briefly introduces several optimization types that can be applied in this context, and we close the paper with a presentation of related work and some conclusions.

2 Ontology Mappings

The work presented in this paper is part of a broader context, namely the Web Service Execution Environment (WSMX) [10], and it targets ontologies that conform to the Web Service Modeling Ontology (WSMO) [13]. WSMO ontologies are expressed using the Web Service Modeling Language (WSML) [3], a language which offers several language variants. For the approach used in this paper we consider the Logic Programming branch of WSML, namely the WSML-Flight and WSML-Rule [3] variants. WSML-Flight is a powerful rule language based on a logic programming variant of F-Logic [8] and is semantically equivalent to Datalog with inequality and (locally) stratified negation. WSML-Rule extends WSML-Flight with further features from Logic Programming, namely the use of function symbols, unsafe rules, and unstratified negation under the well-founded semantics.

The instance transformation scenario relies on a set of pre-existing mappings (i.e. an alignment) between the source and the target ontologies. This paper does not discuss the methods used in creating the alignments between the ontologies; they could be either entirely manual or semi-automatic, assisted by a graphical tool like the one we proposed in [9]. In this scenario we choose not to represent the ontology mappings directly in a specific WSML variant, but instead to use an intermediary, abstract representation. The Abstract Mapping Language (AML), a language proposed in [2] (extended and elaborated in [5] as the Alignment Format), is used to express the mappings because it does not commit to any existing ontology representation language. The abstract mappings state that a semantic relationship exists between the mapped entities, but the actual semantics and interpretation are associated in a second step, called grounding (see Section 3.1). There are a number of reasons behind this design decision:

Reusability: The same set of mappings can be used in various mediation scenarios. A grounding mechanism can be applied at runtime and a formal semantics can be associated with these mappings in order to suit the targeted scenario.

Manageability: As ontologies evolve, the set of mappings between them must be updated. If mappings are represented as rules in a particular ontology language, the special features and peculiarities of this language are reflected in the rules as well.

Although the AML has its own syntax, in this paper we will use a more schematic representation of the mappings in examples, both for brevity and for emphasizing the simplicity of this form of mapping representation. As such, for every mapping we use the following form:

mapping(sourceEntity, [sourceEntityRestriction], targetEntity, [targetEntityRestriction], typeOfMapping)
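For readers who prefer a concrete data structure, the schematic form can be modelled as a small record. The Python sketch below is only an illustration under stated assumptions: the names `MappingType`, `AbstractMapping`, `schematic`, and the field names are hypothetical and are not part of the AML, and the choice to render a missing restriction as `[]` is made here for readability. Only the five components of a mapping and the four mapping types used in this paper come from the text.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class MappingType(Enum):
    """The four mapping types considered in the paper."""
    C2C = "C2C"  # concept to concept
    C2A = "C2A"  # concept to attribute
    A2C = "A2C"  # attribute to concept
    A2A = "A2A"  # attribute to attribute


@dataclass(frozen=True)
class AbstractMapping:
    """One abstract mapping; restrictions are optional (None when absent)."""
    source_entity: str
    target_entity: str
    mapping_type: MappingType
    source_restriction: Optional[str] = None
    target_restriction: Optional[str] = None

    def schematic(self) -> str:
        """Render the schematic form; '[]' marks a missing restriction (an assumption)."""
        parts = [
            self.source_entity,
            self.source_restriction or "[]",
            self.target_entity,
            self.target_restriction or "[]",
            self.mapping_type.value,
        ]
        return "mapping(" + ", ".join(parts) + ")"


# Hypothetical example using concept names that appear in the paper's listings.
person_to_citizen = AbstractMapping("o1#Person", "o2#Citizen", MappingType.C2C)
print(person_to_citizen.schematic())
```

Keeping the record immutable (`frozen=True`) reflects the idea that abstract mappings are design-time artifacts that are only read, never modified, at run-time.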

The source and target entity restrictions are optional conditions on the mapped entities, and $\mathit{typeOfMapping} \in \{C2C, C2A, A2C, A2A\}$, where C2C stands for a concept to concept (or class to class) mapping, A2A stands for an attribute to attribute mapping¹, A2C stands for an attribute to concept mapping, and C2A stands for a concept to attribute mapping. The AML can also be used to express mappings between relations of arbitrary arity, but here, due to space constraints, we restrict ourselves to mappings involving only concepts and attributes.

3 Instance Transformation

The instance transformation scenario can be summarized as follows: different business actors use ontologies to describe their internal business logic and their data. Each of these actors uses its own information system, and they interact with each other as part of arbitrary business processes. However, as the ontologies of the actors are likely to be different, there is a need for a specialized service capable of transforming data expressed in terms of a given ontology (the source ontology) into the terms of another ontology (the target ontology), allowing the two actors to communicate effectively without changing the way they represent their data. As the instance transformation occurs at run-time, it has to be performed completely automatically using mappings that have already been created at the schema level during a design-time phase.

3.1 Grounding

Each of the abstract mappings has to be grounded to rules expressed in WSML in order to be used in a reasoner. WSML-Rule has been preferred over WSML-Flight since it offers function symbols, which allow the building and usage of constructed terms as new identifiers for the mediated instances, based on the identifiers of the original source instances. The identifiers of the target instances are built using the function symbol mediated with two parameters: the first is the source instance from which the target instance is derived, and the second is the target concept the source instance is mediated to (e.g. mediated(johnS, o2#Citizen)).

Formula 1 shows the grounding of a concept to concept mapping (C2C) to WSML, where $\Gamma$ is a grounding function that takes as parameter an abstract mapping (or a fragment of a mapping, e.g. restrictions) and produces a WSML rule².

¹ In WSML, attributes are binary relations local to the concept they are defined on, and they can have either a data-type or another concept as range.

² In WSML, variables are preceded by a "?", class memberships are denoted by "memberOf", conjunctions by "and" and inheritance relationships by "subConceptOf". Also, $\alpha[\beta\ \mathbf{hasValue}\ \gamma]$ and $\alpha[\beta\ \mathbf{ofType}\ \gamma]$ are atomic formulas called molecules; $\alpha$ and $\gamma$ identify instances (or values) and concepts (or data-types) respectively, while $\beta$ identifies an attribute.

Formula 2 describes the grounding of the A2A mappings when both the source and the target attributes have a data-type value as range. In this case the source value is simply copied to the target instance. Lines 3 and 4 ensure that this rule also covers the cases when the mapped source attribute is inherited from an upper concept and the actual instance to be mediated is an instance of a specialized concept (denoted by the ?subCS variable).

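$$
\begin{aligned}
&\Gamma(\mathit{mapping}(C_S.A_S,\ A_{SR},\ C_T.A_T,\ A_{TR},\ \mathit{A2A})) \mapsto \\
&\quad \mathit{mediated}(?x, C_T)[A_T\ \mathbf{hasValue}\ ?v]\ \mathbf{memberOf}\ C_T\ \texttt{:-} \\
&\quad ?x[A_S\ \mathbf{hasValue}\ ?v]\ \mathbf{memberOf}\ ?\mathit{subC}_S\ \mathbf{and}\ ?\mathit{subC}_S\ \mathbf{subConceptOf}\ C_S\ \mathbf{and} \\
&\quad \mathit{mapped}(?\mathit{subC}_S, C_T, ?x)\ \mathbf{and} \\
&\quad \Gamma(A_{SR})\ \mathbf{and}\ \Gamma(A_{TR}).
\end{aligned}
\tag{2}
$$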
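The pattern of Formula 2 can be illustrated with a small string-templating function. This is a minimal sketch and not the grounding component of WSMX: the function name `ground_a2a_datatype` and its parameters are hypothetical, the variable names `?x`, `?v` and `?subCS` are fixed rather than freshly generated per rule, and already-grounded restrictions are simply passed in as strings.

```python
def ground_a2a_datatype(cs, a_s, ct, a_t, source_cond="", target_cond=""):
    """Build a WSML-like rule string that follows the shape of Formula 2.

    cs, a_s: source concept and source attribute
    ct, a_t: target concept and target attribute
    source_cond, target_cond: optional, already grounded restrictions
    """
    head = f"mediated(?x, {ct})[{a_t} hasValue ?v] memberOf {ct}"
    body = [
        # The instance may belong to any sub-concept of the mapped source concept.
        f"?x[{a_s} hasValue ?v] memberOf ?subCS",
        f"?subCS subConceptOf {cs}",
        # The instance must already be mapped to the target concept.
        f"mapped(?subCS, {ct}, ?x)",
    ]
    # Append the restrictions only when they are present.
    body += [cond for cond in (source_cond, target_cond) if cond]
    return head + " :- " + " and ".join(body) + "."


rule = ground_a2a_datatype(
    "o1#Person", "o1#hasChristianName", "o2#Name", "o2#hasFirstName"
)
print(rule)
```

Generating plain rule text keeps the sketch independent of any particular WSML toolkit; a real implementation would more likely build the rule through the object model of the toolkit it targets.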

Formula 3 addresses the mappings between attributes that have other concepts as ranges. As shown at line 2, the value of the target attribute is set to another mediated instance, which is produced by one of the rules generated by Formulas 2 to 5. At line 5, it is checked whether this other mediated instance has any attributes, in order to avoid the generation of meaningless instances due to incomplete mappings.

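$$
\begin{aligned}
&\Gamma(\mathit{mapping}(C_S.A_S,\ A_{SR},\ C_T.A_T,\ A_{TR},\ \mathit{A2A})) \mapsto \\
&\quad \mathit{mediated}(?x, C_T)[A_T\ \mathbf{hasValue}\ \mathit{mediated}(?i, \mathit{SubR}_T)]\ \mathbf{memberOf}\ C_T\ \texttt{:-} \\
&\quad ?x[A_S\ \mathbf{hasValue}\ ?i]\ \mathbf{memberOf}\ ?\mathit{subC}_S\ \mathbf{and}\ ?\mathit{subC}_S\ \mathbf{subConceptOf}\ C_S\ \mathbf{and} \\
&\quad \mathit{mapped}(?\mathit{subC}_S, C_T, ?x)\ \mathbf{and} \\
&\quad \mathit{mediated}(?i, \mathit{SubR}_T)[?\mathit{anAttribute}\ \mathbf{hasValue}\ ?\mathit{aValue}]\ \mathbf{and} \\
&\quad \Gamma(A_{SR})\ \mathbf{and}\ \Gamma(A_{TR}).
\end{aligned}
\tag{3}
$$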

The concept $\mathit{SubR}_T$ is extracted during the grounding process based on the range of the mapped target attribute, i.e. a distinct rule is generated for the attribute's range and for each of the range's subclasses (since every instance of a sub-concept of the attribute's range is a valid filler of that attribute). As described in [9], no mappings between two attributes having as range a data-type and a concept, respectively, are allowed (this is compensated by the usage of C2A and A2C mappings).

The grounding of the A2C mappings, as seen in Formula 4, handles the cases when different levels of aggregation are used in the ontologies, e.g. the attribute $C_S.A$ has the concept $C_{RS}$ as range, and the instances of $C_{RS}$ also need to be transformed into instances of $C_T$. This type of mapping can be applied only when $C_{RS}$ is a concept, not a data-type.

$$
\begin{aligned}
&\Gamma(\mathit{mapping}(C_S.A_S,\ A_{SR},\ C_T,\ C_{TR},\ \mathit{A2C})) \mapsto \\
&\quad \mathit{mediated}(?x, C_T)[?\mathit{anA}\ \mathbf{hasValue}\ ?\mathit{aValue}]\ \mathbf{memberOf}\ C_T\ \texttt{:-} \\
&\quad ?x[A_S\ \mathbf{hasValue}\ ?i]\ \mathbf{memberOf}\ C_S\ \mathbf{and} \\
&\quad \mathit{mediated}(?i, C_T)[?\mathit{anA}\ \mathbf{hasValue}\ ?\mathit{aValue}]\ \mathbf{memberOf}\ C_T\ \mathbf{and} \\
&\quad \Gamma(A_{SR})\ \mathbf{and}\ \Gamma(C_{TR}).
\end{aligned}
\tag{4}
$$

Formula 5 is the symmetric counterpart of Formula 4, grounding the C2A mappings.

$$
\begin{aligned}
&\Gamma(\mathit{mapping}(C_S,\ C_{SR},\ C_T.A_T,\ A_{TR},\ \mathit{C2A})) \mapsto \\
&\quad \mathit{mediated}(?x, C_T)[A_T\ \mathbf{hasValue}\ \mathit{mediated}(?x, \mathit{SubR}_T)]\ \mathbf{memberOf}\ C_T\ \texttt{:-} \\
&\quad ?x\ \mathbf{memberOf}\ C_S\ \mathbf{and} \\
&\quad \mathit{mediated}(?x, \mathit{SubR}_T)[?\mathit{anA}\ \mathbf{hasValue}\ ?\mathit{aValue}]\ \mathbf{memberOf}\ \mathit{SubR}_T\ \mathbf{and} \\
&\quad \Gamma(C_{SR})\ \mathbf{and}\ \Gamma(A_{TR}).
\end{aligned}
\tag{5}
$$

The restrictions can be seen as conditions that have to hold in order for the mapping to apply. They are divided into three main classes: attribute occurrence (aoc), attribute type (atc) and attribute value (avc) conditions. Due to space constraints, this paper does not include a complete and detailed description of the grounding of conditions. However, Formulas 6 to 9 show the grounding of the conditions when the "equals" operator is used. The grounding in Formulas 6 and 7 can be applied to any conditions, while, in order to avoid the creation of unsafe rules, Formula 8 is used to ground only the avc set on source entities and Formula 9 is used to ground only the avc set on target entities.

$$
\begin{aligned}
\Gamma(\mathit{atc}(C.A,\ \text{``equals''},\ R)) &\mapsto C[A\ \mathbf{ofType}\ R] && (6)\\
\Gamma(\mathit{aoc}(C.A)) &\mapsto C[A\ \mathbf{ofType}\ ?\mathit{aRange}] && (7)\\
\Gamma(\mathit{avc}(C.A,\ \text{``equals''},\ \mathit{Value})) &\mapsto C[A\ \mathbf{hasValue}\ \mathit{Value}] && (8)\\
\Gamma(\mathit{avc}(C.A,\ \text{``equals''},\ \mathit{Value},\ \mathit{tVariable})) &\mapsto \mathit{assignement}(\mathit{Id}, ?\mathit{tVariable})\ \mathbf{and}\ C[A\ \mathbf{hasValue}\ ?\mathit{tVariable}] && (9)\\
&\phantom{\mapsto{}} \mathit{assignement}(\mathit{Id}, \mathit{Value}).
\end{aligned}
$$
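The four condition groundings for the "equals" operator can be sketched as simple string templates. The function names below are hypothetical, the identifier of an assignment is passed in explicitly rather than generated, and the predicate spelling `assignement` is kept as it appears in the formulas. The point of the sketch is the difference between Formula 8 and Formula 9: a target value condition yields both a condition over a variable and a separate fact that binds that variable.

```python
def ground_atc(concept, attribute, range_):
    """Attribute type condition, following the shape of Formula 6."""
    return f"{concept}[{attribute} ofType {range_}]"


def ground_aoc(concept, attribute):
    """Attribute occurrence condition, following the shape of Formula 7."""
    return f"{concept}[{attribute} ofType ?aRange]"


def ground_source_avc(concept, attribute, value):
    """Attribute value condition on a source entity (shape of Formula 8)."""
    return f"{concept}[{attribute} hasValue {value}]"


def ground_target_avc(concept, attribute, value, t_variable, ident):
    """Attribute value condition on a target entity (shape of Formula 9).

    Returns the condition to place in the rule and the separate fact
    that supplies the value, so that the rule stays safe.
    """
    condition = (
        f"assignement({ident}, ?{t_variable}) and "
        f"{concept}[{attribute} hasValue ?{t_variable}]"
    )
    fact = f"assignement({ident}, {value})."
    return condition, fact


condition, fact = ground_target_avc("C", "A", "o2#M", "tVariable", "id82")
print(condition)
print(fact)
```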

The Id uniquely identifies different assignments; one distinct fact is generated each time a value from a target avc has to be assigned to a target attribute.

3.2 Reasoning Task

After grounding, the alignment can be seen as an ontology that imports the source and the target ontologies and contains the set of mapping rules. Conceptually, it can also be seen as a "merged ontology" where the input ontologies were put together independently (the separation is realized by namespaces) and linked by rules. By adding the source data to it and posing queries in terms of the target ontology, the appropriate mapping rules are triggered and the mediated data is produced as a result.
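The idea of adding source data to the merged ontology and then querying in target terms can be illustrated with a toy evaluation in Python. This is a deliberately simplified sketch and not a Datalog reasoner: it handles only concept to concept and data-type attribute to attribute mappings, ignores sub-concepts and restrictions, and represents a constructed term `mediated(x, C)` as a tuple. The function names and the sample data are assumptions made for this illustration; the concept and attribute names are borrowed from the paper's listings.

```python
def mediate(instances, concept_maps, attribute_maps):
    """Derive target instances from source instances.

    instances:      {id: (concept, {attribute: value})}
    concept_maps:   [(source_concept, target_concept)]
    attribute_maps: [(source_concept, source_attr, target_concept, target_attr)]
    Returns {("mediated", id, target_concept): {target_attr: value}}.
    """
    derived = {}
    # Concept mappings create the mediated target instances.
    for inst_id, (concept, _attrs) in instances.items():
        for source_concept, target_concept in concept_maps:
            if concept == source_concept:
                derived[("mediated", inst_id, target_concept)] = {}
    # Attribute mappings copy values onto already mediated instances.
    for inst_id, (concept, attrs) in instances.items():
        for source_concept, source_attr, target_concept, target_attr in attribute_maps:
            key = ("mediated", inst_id, target_concept)
            if concept == source_concept and source_attr in attrs and key in derived:
                derived[key][target_attr] = attrs[source_attr]
    return derived


def members_of(derived, target_concept):
    """Answer the query '?x memberOf target_concept' over the derived data."""
    return sorted(key for key in derived if key[2] == target_concept)


SOURCE = {
    "johnS": ("o1#Person", {"o1#hasChristianName": "John", "o1#hasSurname": "Smith"}),
}
CONCEPT_MAPS = [("o1#Person", "o2#Citizen"), ("o1#Person", "o2#Name")]
ATTRIBUTE_MAPS = [
    ("o1#Person", "o1#hasChristianName", "o2#Name", "o2#hasFirstName"),
    ("o1#Person", "o1#hasSurname", "o2#Name", "o2#hasSurname"),
]

TARGET = mediate(SOURCE, CONCEPT_MAPS, ATTRIBUTE_MAPS)
print(members_of(TARGET, "o2#Citizen"))
```

In the approach described in the paper the derivation is not programmed by hand like this; it falls out of the reasoner evaluating the grounded rules, which is what makes the same mappings reusable across scenarios.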

Considering the two ontology fragments in Table 1, a set of mappings as the one presented in Listing 1.1 can be created.

Listing 1.1. Abstract mappings

Listing 1.2 shows the mapping rules generated by grounding the abstract mappings depicted in Listing 1.1, using Formulas 1 to 9. These mapping rules are expressed as WSML axioms and they have a precise semantics, as defined by WSML-Rule [3]. They are included in an ontology that imports the source and the target ontologies. The source instances to be mediated are also included in this ontology, which is then registered in the reasoner. Listing 1.3 depicts a set of source instances to be mediated by using the mappings shown above.

Listing 1.2. WSML mapping rules between the Belgian and Italian ontologies

wsmlVariant _"http://www.wsmo.org/wsml/wsml-syntax/wsml-rule"
namespace { o1 _"http://www.semantic-gov.org/BelgianCitizenOntology#",
            o2 _"http://www.semantic-gov.org/ItalianCitizenOntology#" }

ontology merged_ontology
    importsOntology { o1#BelgianCitizenOntology, o2#ItalianCitizenOntology }

axiom o2#ccMappingRule18 definedBy
    o2#mappedConcepts(o1#Person, o2#Citizen, ?X17) and
    o2#mediated(?X17, o2#Citizen) memberOf o2#Citizen
    :- ?X17 memberOf o1#Person.

axiom o2#caMappingRule72 definedBy
    o2#mediated(?X69, o2#Citizen)[o2#hasName hasValue o2#mediated1(?X69, o2#Name)] memberOf o2#Citizen
    :- ?X69 memberOf o1#Person and
       o2#mediated1(?X69, o2#Name)[?A70 hasValue ?V71] memberOf o2#Name.

axiom o2#ccMappingRule12 definedBy
    o2#mappedConcepts(o1#Person, o2#Name, ?X11) and
    o2#mediated(?X11, o2#Name) memberOf o2#Name
    :- ?X11 memberOf o1#Person.

axiom o2#aaMappingRule48 definedBy
    o2#mediated(?X45, o2#Name)[o2#hasFirstName hasValue ?Y46] memberOf o2#Name
    :- ?X45[o1#hasChristianName hasValue ?Y46] memberOf ?SC47 and
       ?SC47 subConceptOf o1#Person and
       o2#mappedConcepts(?SC47, o2#Name, ?X45).

...

axiom o2#aaMappingRule84 definedBy
    o2#mediated(?X79, o2#Citizen)[o2#hasSex hasValue ?Y81] memberOf o2#Citizen
    :- ?X79[o1#hasGender hasValue ?Y80] memberOf ?SC83 and
       ?SC83 subConceptOf o1#Person and
       o2#mappedConcepts(?SC83, o2#Citizen, ?X79) and
       ?X79[o1#hasGender hasValue o1#_1] and
       o2#assignement_82(?Y81).

axiom o2#assignement_82 definedBy
    o2#assignement_82(o2#M).

...

axiom o2#ccMappingRule14 definedBy
    o2#mappedConcepts(o1#Gender, o2#Sex, ?X13) and
    o2#mediated(?X13, o2#Sex) memberOf o2#Sex
    :- ?X13 memberOf o1#Gender.

Once this ontology is registered in the reasoner, queries can be asked to retrieve the mediated data. The mediation takes place within the reasoner when the rules are evaluated and the target instances become implicit knowledge in the reasoning space. Normally, one or more instances from the source set are considered to be root instances and are used in determining a starting query point. Otherwise, the procedure exemplified below must be applied to all the instances in the source set (although this could lead to redundant computations).

Listing 1.3. Source instances to be mediated

Assuming that johnS is the root instance in the set shown in Listing 1.3, the first step is to identify the most relevant concept in the target ontology to mediate to. johnS is an instance of the concept Person, and there are two concepts in the target that Person is mapped to: Citizen and Name. Since the concept Name can be reached from Citizen via the hasName attribute (paths of any length can be explored in this way), the concept Citizen is considered to be an ancestor of the concept Name and is therefore better suited for the instance johnS to be mapped to³.
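The selection step described above amounts to a reachability check over attribute ranges. The sketch below is one possible reading of it, not the authors' implementation: the functions `reachable` and `most_relevant` are hypothetical, and the target ontology fragment is an assumption pieced together from the attribute names that appear in the listings. A candidate is kept only if no other candidate can reach it through a chain of attributes.

```python
def reachable(start, attribute_ranges):
    """Return all concepts reachable from start via attribute ranges (any path length)."""
    seen, stack = set(), [start]
    while stack:
        concept = stack.pop()
        for range_concept in attribute_ranges.get(concept, {}).values():
            if range_concept not in seen:
                seen.add(range_concept)
                stack.append(range_concept)
    return seen


def most_relevant(candidates, attribute_ranges):
    """Keep the candidates that are not reachable from any other candidate."""
    return [
        candidate
        for candidate in candidates
        if not any(
            candidate in reachable(other, attribute_ranges)
            for other in candidates
            if other != candidate
        )
    ]


# Assumed fragment of the target ontology: concept -> {attribute: range}.
TARGET_RANGES = {
    "o2#Citizen": {
        "o2#hasName": "o2#Name",
        "o2#hasSex": "o2#Sex",
        "o2#hasBirthday": "o2#Date",
    },
    "o2#Name": {"o2#hasFirstName": "string", "o2#hasSurname": "string"},
}

print(most_relevant(["o2#Citizen", "o2#Name"], TARGET_RANGES))
```

The `seen` set guards against cycles in the attribute graph, which can occur when two concepts refer to each other.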

Once the concept Citizen is selected, queries can be posed to the reasoner. The first query and the results obtained are shown in Listing 1.4.

Listing 1.4. The target instance obtained by mediating the johnS instance

?- ?x memberOf o2#Citizen.

Found <1> results to the query:
(1) -- ?x = mediated(johnS, o2#Citizen)

The target instance mediated(johnS, o2#Citizen) has to be explored and its attributes and their values retrieved. The next query and the obtained answers are shown in Listing 1.5. For each of the attributes (?y) this query retrieves both the value (?z) and the type of this value (?avC).

Listing 1.5. Finding the attributes and their values for a target instance

?- mediated(johnS, o2#Citizen)[?y hasValue ?z] memberOf o2#Citizen and ?z memberOf ?avC.

Found <3> results to the query:
(1) -- ?z = o2#M, ?avC = o2#Sex, ?y = o2#hasSex
(2) -- ?z = mediated(johnS, o2#Name), ?avC = o2#Name, ?y = o2#hasName
(3) -- ?z = mediated(johnS_birthDate, o2#Date), ?avC = o2#Date, ?y = o2#hasBirthday

?- mediated(johnS, o2#Name)[?y hasValue ?z] memberOf o2#Name and ?z memberOf ?avC.

Found <2> results to the query:
(1) -- ?z = John, ?avC = string, ?y = o2#hasFirstName
(2) -- ?z = Smith, ?avC = string, ?y = o2#hasSurname

³ In the presence of an inheritance hierarchy, if for example there are multiple mappings from the concept Person to several concepts in this hierarchy, the most specific concept would qualify.

For each attribute value whose type is a concept, the query process continues recursively. Listing 1.5 illustrates the queries for the hasName attribute and the mediated(johnS, o2#Name) value.
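The recursive exploration can be sketched as follows. The reasoner is replaced here by a stub, `ask`, that returns canned (attribute, value, type) triples transcribed from the query results in Listing 1.5; the stub, the `explore` function, and the `DATATYPES` set are assumptions made for the illustration. Values whose type is a data-type are recorded as they are, while values whose type is a concept are explored in turn.

```python
# Types treated as data-types; everything else is assumed to be a concept.
DATATYPES = {"string", "integer"}

# Canned answers standing in for the reasoner (transcribed from Listing 1.5).
ANSWERS = {
    "mediated(johnS, o2#Citizen)": [
        ("o2#hasSex", "o2#M", "o2#Sex"),
        ("o2#hasName", "mediated(johnS, o2#Name)", "o2#Name"),
        ("o2#hasBirthday", "mediated(johnS_birthDate, o2#Date)", "o2#Date"),
    ],
    "mediated(johnS, o2#Name)": [
        ("o2#hasFirstName", "John", "string"),
        ("o2#hasSurname", "Smith", "string"),
    ],
}


def ask(instance):
    """Stub for the query '?- instance[?y hasValue ?z] and ?z memberOf ?avC'."""
    return ANSWERS.get(instance, [])


def explore(instance, ask, results=None):
    """Collect the attributes of instance and, recursively, of concept-typed values."""
    if results is None:
        results = {}
    if instance in results:  # already explored; also protects against cycles
        return results
    results[instance] = {}
    for attribute, value, value_type in ask(instance):
        results[instance][attribute] = value
        if value_type not in DATATYPES:
            explore(value, ask, results)
    return results


materialized = explore("mediated(johnS, o2#Citizen)", ask)
for name, attributes in materialized.items():
    print(name, attributes)
```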

By applying this querying mechanism all the target instances can be retrieved and materialized. The obtained mediated instances are shown in Listing 1.6.

Listing 1.6. Mediated target instances

instance mediated(johnS, o2#Citizen) memberOf o2#Citizen
    o2#hasName hasValue mediated(johnS, o2#Name)
    o2#hasSex hasValue o2#M
    o2#hasBirthday hasValue mediated(johnS_birthDate, o2#Date)

instance mediated(johnS, o2#Name) memberOf o2#Name
    o2#hasFirstName hasValue "John"
    o2#hasSurname hasValue "Smith"

instance mediated(johnS_birthDate, o2#Date) memberOf o2#Date
    o2#day hasValue 11
    o2#month hasValue 4
    o2#year hasValue 1980

The logic program corresponding to the "merged ontology" is decidable even though this ontology is expressed in WSML-Rule, which is generally undecidable [3]. The reason is that the function symbols are used in such a way that only one level of constructed terms can be generated. That is, the mapping rules will never generate terms like mediated(mediated(...(X, C)...)). Formulas 1 to 5 build the constructed term mediated(?x, CT) memberOf CT only if ?x memberOf CS exists. Since CS and CT are concepts from the source and the target ontology, respectively, they are separated by namespaces and, as a consequence, distinct.

4 Effective Data Mediation

In this section we introduce the lessons learned from applying the instance transformation scenario described in Section 3 to use cases in two EU funded projects, namely SEEMP and SemanticGov⁴. SEEMP enables the exchange of data between different Employment Services in Europe that use their own data structures and taxonomies. SemanticGov aims to build the infrastructure necessary to enable the provision of Semantic Web Services for Public Administration.

⁴ For more details see http://www.seemp.org and http://www.semantic-gov.org.

4.1 Optimizations for the Instance Transformation Scenario

In the context of the SEEMP project, mappings between the local ontology of each employment service and the reference ontology of the marketplace were made at design time using the mapping tools [9] in the Web Service Modeling Toolkit (WSMT) [7]. Transforming an instance from one local employment service to another was thus done by transforming the source instance into terms of the reference ontology and then transforming this instance into terms of the other local ontology. However, once the mappings had been created, it was obvious that the performance of the reasoning during instance transformation was an issue. Further investigation made it apparent that the number of facts and rules registered in the reasoner was very high. Essentially, for a given instance transformation the source and target ontologies were very large, and the mappings between these two large structures were causing the forward chaining algorithms within the reasoner to require significant computation time. Also, the instance data to transform was very large and parts of this information were untransformable due to a lack of mappings. In this section we explore the optimizations performed in the SEEMP project to improve the performance of the overall instance transformation process.

Source Instance Filtering: In a given scenario only a certain portion of the input source instances can be transformed, depending on the coverage of the available mappings. In scenarios where this coverage is low, the reasoner contains a lot of instance data that will never be transformed to the target ontology. This additional information is unnecessary for the instance transformation step, but having it within the reasoner means that it gets used in the model computations, and thus its removal would improve the overall performance. To this end a source instance pre-filter was added to the instance transformation process that removes those parts of the source instances that cannot be transformed. This filter can have quite dramatic effects on the quantity of data loaded into the reasoner, as the absence of one attribute to attribute mapping can remove entire branches of source instances. Having implemented this filter, we observed an improvement of between 5% and 10% in most of our test cases. The amount of performance improvement that can be gained by applying this filter is proportional to the amount of extra source data that is untransformable. Thus in scenarios where there is 100% coverage the filter will not remove any content and the performance will remain the same.
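A source instance pre-filter of this kind can be sketched as a traversal from the root instances that follows only mapped attributes. The code below is an illustration under simplifying assumptions and not the SEEMP implementation: the function name, the data layout, and the concept `o1#Address` with its attributes are hypothetical, and inheritance between concepts is ignored. It shows how a single missing attribute mapping removes the whole branch of instances hanging off that attribute.

```python
def filter_source_instances(instances, roots, mapped_concepts, mapped_attributes):
    """Keep only the parts of the source data that some mapping can transform.

    instances:         {id: (concept, {attribute: value_or_instance_id})}
    roots:             ids of the root instances to start from
    mapped_concepts:   set of source concepts that have a mapping
    mapped_attributes: set of (source_concept, attribute) pairs that have a mapping
    """
    kept, stack = {}, list(roots)
    while stack:
        inst_id = stack.pop()
        if inst_id in kept or inst_id not in instances:
            continue
        concept, attrs = instances[inst_id]
        if concept not in mapped_concepts:
            continue  # no mapping for this concept: drop the instance
        kept_attrs = {
            attribute: value
            for attribute, value in attrs.items()
            if (concept, attribute) in mapped_attributes
        }
        kept[inst_id] = (concept, kept_attrs)
        # Follow only the surviving attributes that point to other instances.
        stack.extend(value for value in kept_attrs.values() if value in instances)
    return kept


SOURCE = {
    "johnS": ("o1#Person", {"o1#hasChristianName": "John", "o1#hasAddress": "addr1"}),
    "addr1": ("o1#Address", {"o1#street": "Main St"}),
}
filtered = filter_source_instances(
    SOURCE,
    roots=["johnS"],
    mapped_concepts={"o1#Person"},
    mapped_attributes={("o1#Person", "o1#hasChristianName")},
)
print(filtered)
```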

Mapping Filtering: In other cases the source instance set can be very small and the number of mappings very large. The extra mappings that are loaded into the reasoner will only partially fire, without producing any meaningful and valid result. Further exploration of the SEEMP project test cases showed that this was not an extraordinary scenario but in fact the normal case. Thus a mapping filter was added to the instance transformation process, removing those mappings unrelated to the source instances. This filter showed a large performance improvement of between 40% and 50% for the average test case in the SEEMP project, as the ontologies in the test cases contain a number of very large taxonomies with many mappings between them. For these test cases, approximately 70% of the mappings created at design time are filtered away at runtime by the mapping filter.
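The mapping filter can be sketched in the same spirit: collect the concepts and attributes that actually occur in the source instances and keep only the mappings whose source side mentions them. The representation of a mapping as a tuple `(source_concept, source_attribute_or_None, target_entity, type)` and the function name are assumptions made for this illustration; sub-concept relationships, which a real filter would need to take into account, are ignored.

```python
def filter_mappings(mappings, instances):
    """Drop the mappings whose source entity does not occur in the instance data.

    mappings:  [(source_concept, source_attribute_or_None, target_entity, type)]
    instances: {id: (concept, {attribute: value})}
    """
    used_concepts = {concept for concept, _attrs in instances.values()}
    used_attributes = {
        (concept, attribute)
        for concept, attrs in instances.values()
        for attribute in attrs
    }
    kept = []
    for mapping in mappings:
        source_concept, source_attribute = mapping[0], mapping[1]
        if source_concept not in used_concepts:
            continue  # no instance of this concept is being transformed
        if (source_attribute is not None
                and (source_concept, source_attribute) not in used_attributes):
            continue  # the attribute never occurs in the data
        kept.append(mapping)
    return kept


MAPPINGS = [
    ("o1#Person", None, "o2#Citizen", "C2C"),
    ("o1#Person", "o1#hasChristianName", "o2#Name.o2#hasFirstName", "A2A"),
    ("o1#Person", "o1#hasGender", "o2#Citizen.o2#hasSex", "A2A"),
    ("o1#Gender", None, "o2#Sex", "C2C"),
]
SOURCE = {"johnS": ("o1#Person", {"o1#hasChristianName": "John"})}
print(filter_mappings(MAPPINGS, SOURCE))
```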

Source and Target Ontology Filtering: Having filtered both the source instances and the mappings, the bulk of the data remaining in the reasoner is the source and target ontologies. Dynamically filtering ontologies prior to putting them in a reasoner is a non-trivial affair; however, initial research appears to show that a naive filter of the ontologies based on the source instances and the available mappings can be used to remove those parts of the source and target ontologies which are unnecessary for a given instance transformation process. Our ongoing research seems to show that a 50% to 60% performance improvement on top of the improvements already garnered by the source instance and mapping filters can be gained.

The combination of these filters can dramatically reduce the amount of data that is loaded in the reasoner at runtime and therefore improve the performance of the overall instance transformation process. Crucially, these filters are all dynamic and applied at runtime, thus they involve no human effort to configure them. In our current system each of these filters is turned on and off via a flag, so they can be applied in those scenarios where they are desirable and turned off where they are not.

There are situations when the data-values embedded in the source instances need to be transformed before they can be assigned to the attributes of the target instances. Such transformations could involve simple string concatenations or more complex, dynamic transformations based on external factors, for example currency conversions.

Normally, these kinds of conversions require either the use of specialized data manipulation functions within the reasoning environment or access to external services that can apply the conversion after the reasoning has occurred. Since the first option would significantly limit the types of allowed transformations, we rely on the second approach. During the creation of mappings, the domain experts can specify an arbitrary identifier to denote the service they would like to use. At run-time, the reasoner is used only to assign to the target recipient a string encoding of the service and its parameters. After the mediated target instances are retrieved from the reasoner, a post-processing phase identifies and evaluates every service/parameter encoding.

Listing 1.7. Example of a value transformation service usage

axiom o2#aaMappingRule definedBy
    o2#mediated(?X45, o2#Citizen)[o2#hasName hasValue http://example.org/concatService(?Y46, ?Y47)] memberOf o2#Citizen
    :- ?X45[o1#hasChristianName hasValue ?Y46] memberOf ?SC47 and
       ?X45[o1#hasSurname hasValue ?Y47] memberOf o1#Person and
       ?SC47 subConceptOf o1#Person and
       o2#mappedConcepts(?SC47, o2#Citizen, ?X45).

instance mediated(johnS, o2#Citizen) memberOf o2#Citizen
    o2#hasName hasValue http://example.org/concatService("John", "Smith")
    o2#hasSex hasValue o2#M
    o2#hasBirthday hasValue mediated(johnS_birthDate, o2#Date)

For example, assuming that the attribute hasName of the Citizen concept in the target ontology has as range a string representing a concatenation of the first name and the surname of a person, such a value transformation service would be required. Listing 1.7 shows the corresponding WSML mapping rule and the attribute value produced after reasoning.
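The post-processing phase can be illustrated with a small evaluator that recognizes a service/parameter encoding and dispatches it to a registered implementation. Everything in this sketch is an assumption made for illustration: the registry `SERVICES`, the function `evaluate`, the regular expressions used to parse the encoding, and the decision to join the two names with a single space. Only the service identifier and the encoded value are taken from Listing 1.7.

```python
import re

# Hypothetical registry: service identifier -> implementation available at run-time.
SERVICES = {
    "http://example.org/concatService": lambda first, last: f"{first} {last}",
}

# An encoding has the form identifier(arguments), e.g. service("a", "b").
_CALL = re.compile(r'^(?P<service>[^()\s]+)\((?P<args>.*)\)$')


def evaluate(value):
    """Evaluate a service/parameter encoding; return any other value unchanged."""
    match = _CALL.match(value) if isinstance(value, str) else None
    if match is None or match.group("service") not in SERVICES:
        return value  # plain value or an unknown function symbol such as mediated(...)
    # Only double-quoted string arguments are supported in this sketch.
    arguments = re.findall(r'"([^"]*)"', match.group("args"))
    return SERVICES[match.group("service")](*arguments)


print(evaluate('http://example.org/concatService("John", "Smith")'))
print(evaluate("o2#M"))
```

Because values that do not name a registered service pass through untouched, the same function can be applied uniformly to every attribute value retrieved from the reasoner.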

The set of services that can be used in the mapping process should be customizable at the level of the mapping creation tool; services could be added, removed or briefly described. It is important to note that at that level no implementation has to be provided: the implementation has to be available and integrated with the instance transformation component only at run-time.

5 Related Work

TSIMMIS [6] is a system for integrating information coming from heterogeneous data sources. TSIMMIS includes a mediator-generator able to produce mediator descriptions in the Mediator Specification Language (MSL) [12]. MSL and the abstract mapping language's grounding (AMLg) proposed in this paper are based on the same general principles: they both rely on datalog-like rules which construct target data based on the information from the sources. However, MSL relies on patterns to match and create data from the source and target data sets, while AMLg acts on schema level elements. Additionally, using MSL, one needs to construct paths from root objects down to the relevant information in the model, while with AMLg separate rules are created. These rules are eventually "composed" by a reasoner in order to produce the desired target data out of the source instances. This strategy ensures that when new schema elements need to be included in the mappings set, no re-engineering of the existing mappings or rules is necessary. Another relevant similarity between TSIMMIS and the approach in this paper is the usage of Skolem functions [8] or function symbols. Both approaches use them to encode information regarding how the target object or instance has been derived from the source. While this information is used by MSL only to build special target objects, it plays a crucial role for AMLg, allowing the combination of the results from individual rules into one complex result instance.

Abiteboul and colleagues [1] propose a middleware data model as the basis for the integration task, and declarative rules to specify the integration. The model is a minimal one and the data structure consists of ordered label trees. The authors also consider two types of rules: correspondence rules, used to express relationships between the nodes of their model (similar to our mappings in the AML but expressed in Datalog), and translation rules, a decidable sub-case for the actual data translation (resembling our mapping rules expressed in WSML).

6 Conclusions

This paper describes an approach that uses reasoning to perform data mediation, based on pre-existing alignments between ontologies. The alignments consist of mappings expressed in the Abstract Mapping Language, a form of representation that does not commit to any ontology representation language or formalism. In order to apply these mappings in a concrete mediation scenario, they have to be grounded to a concrete language and to have a formal semantics associated. This paper provides a grounding of the AML to WSML-Rule, suited for the instance transformation scenario. A detailed example of the reasoning task is also provided, together with a set of optimizations that can be applied in order to improve the performance of the instance transformation process. Additionally, the paper describes a way of using value transformation services in conjunction with reasoning without being restricted to the reasoner's set of built-ins.

A full implementation of the instance transformation component is part of WSMX and is available for download at http://sourceforge.net/projects/wsmx/. As future work we plan to conduct a full evaluation of the performance improvements discussed in this paper using the showcases developed in the EU funded projects where this prototype is being developed and used.

7 Acknowledgements

The work is funded by the European Commission under the projects KnowledgeWeb, SEEMP, SemanticGov, SUPER and SHAPE; by the FFG (Österreichische Forschungsförderungsgesellschaft mbH) under the project SemBiz.

References

1. S. Abiteboul, S. Cluet, and T. Milo. Correspondence and Translation for Heterogeneous Data. In Proc. of the 6th Intl Conf on Database Theory (ICDT-1997), pages 351-363, Delphi, Greece, 1997. Springer-Verlag.

2. J. de Bruijn, D. Foxvog, and K. Zimmerman. Ontology Mediation Patterns Library. SEKT Project D4.3.1, available at: http://www.sekt-project.com, 2004.

3. J. de Bruijn, H. Lausen, R. Krummenacher, A. Polleres, L. Predoiu, M. Kifer, and D. Fensel. The Web Service Modeling Language WSML. WSML Working Draft D16.1v0.21, available at: http://www.wsmo.org/TR/, Oct 2005.

4. M. Ehrig, S. Staab, and Y. Sure. Bootstrapping Ontology Alignment Methods with APFEL. In Proc of the 4th Intl Semantic Web Conf (ISWC-2005), Nov 2005.

5. J. Euzenat, F. Scharffe, and L. Serafini. Specification of the alignment format. Knowledge Web Deliverable D2.2.6, 2006.

6. H. Garcia-Molina, Y. Papakonstantinou, D. Quass, A. Rajaraman, Y. Sagiv, J. D. Ullman, V. Vassalos, and J. Widom. The TSIMMIS approach to mediation: Data models and languages. Journal of Intelligent Information Systems, 8(2), 1997.

7. M. Kerrigan, A. Mocan, M. Tanler, and D. Fensel. The Web Service Modeling Toolkit - An Integrated Development Environment for SWS. In Proc of the 4th European Semantic Web Conf (ESWC-2007), Austria, Jun 2007.

8. M. Kifer, G. Lausen, and J. Wu. Logical foundations of object-oriented and frame-based languages. Journal of the ACM, 42:741-843, July 1995.

9. A. Mocan and E. Cimpian. An Ontology-Based Data Mediation Framework for Semantic Environments. Intl Journal on Semantic Web and Information Systems (IJSWIS), 3(2), April-June 2007.

10. A. Mocan, M. Moran, E. Cimpian, and M. Zaremba. Filling the Gap - Extending Service Oriented Architectures with Semantics. In Proc of the IEEE Int Conf on e-Business Engineering (ICEBE-2006), China, Oct 2006. IEEE Computer Society.

11. N. F. Noy and M. A. Musen. The PROMPT Suite: Interactive Tools For Ontology Merging And Mapping. Intl Jrnl of Human-Computer Stud., 59(6):983-1024, 2003.

12. Y. Papakonstantinou, H. Garcia-Molina, and J. D. Ullman. MedMaker: A Mediation System Based on Declarative Specifications. In Proc of the 12th Intl Conf on Data Engineering, USA, 1996.

13. D. Roman, U. Keller, H. Lausen, J. de Bruijn, R. Lara, M. Stollberg, A. Polleres, C. Feier, C. Bussler, and D. Fensel. Web Service Modeling Ontology. Applied Ontology, 1(1):77-106, 2005.

14. N. Silva and J. Rocha. Semantic Web Complex Ontology Mapping. In Proc of the IEEE Web Intelligence (WI-2003), Canada, Oct 2003.