<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">A pattern-based ontology matching approach for detecting complex correspondences</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Dominique</forename><surname>Ritze</surname></persName>
							<email>dritze@mail.uni-mannheim.de</email>
							<affiliation key="aff0">
								<orgName type="institution">University of Mannheim</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Christian</forename><surname>Meilicke</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">University of Mannheim</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Ondřej</forename><surname>Šváb-Zamazal</surname></persName>
							<email>ondrej.zamazal@vse.cz</email>
							<affiliation key="aff1">
								<orgName type="institution">University of Economics</orgName>
								<address>
									<settlement>Prague</settlement>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Heiner</forename><surname>Stuckenschmidt</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">University of Mannheim</orgName>
							</affiliation>
						</author>
						<title level="a" type="main">A pattern-based ontology matching approach for detecting complex correspondences</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">45BF92CFE0B52BB2787B596A0BA5511C</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T12:31+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>State of the art ontology matching techniques are limited to detecting simple correspondences between atomic concepts and properties. Nevertheless, for many concepts and properties atomic counterparts will not exist, while it is possible to construct equivalent complex concept and property descriptions. We define a correspondence where at least one of the linked entities is non-atomic as a complex correspondence. Further, we introduce several patterns describing complex correspondences. In particular, we focus on methods for automatically detecting complex correspondences. These methods are based on a combination of basic matching techniques. We conduct experiments with different datasets and discuss the results.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Ontology matching is referred to as a means for resolving the problem of semantic heterogeneity <ref type="bibr" target="#b2">[3]</ref>. This problem arises because the same domain can be described by ontologies that differ to a large degree. Ontology engineers might, for example, choose different vocabularies to describe the same entities. There might also be ontologies where some parts are modeled in a fine-grained way, while in other ontologies there are only shallow concept hierarchies in the relevant branches. These kinds of heterogeneities can be resolved by state of the art ontology matching systems, which might e.g. detect that hasAuthor and writtenBy are equivalent properties and only different vocabulary is used. Moreover, a matching system might identify that Author is more general than both concepts FirstAuthor and CoAuthor.</p><p>However, ontological heterogeneities are not restricted to these kinds of problems: different modeling styles might require more than equivalence or subsumption correspondences between atomic concepts and properties. <ref type="foot" target="#foot_0">1</ref> Semantic relations between complex descriptions become necessary. This is illustrated by the following example: While in one ontology we have an atomic concept AcceptedPaper, in another ontology we have the general concept Paper and the boolean property accepted. An AcceptedPaper in the first ontology corresponds in the second ontology to a Paper that has been accepted. Such a correspondence, where at least one of the linked entities is a complex concept or property description, is referred to as a complex correspondence in the following. As the main contribution of this paper we suggest an automated pattern-based approach to detect certain types of complex correspondences and study its performance by applying it to different datasets. 
Even though different researchers were concerned with similar topics (see <ref type="bibr" target="#b10">[11]</ref>), to our knowledge none of the resulting works was concerned with automated detection in an experimental setting. Exceptions can be found in the machine learning community (see <ref type="bibr">Section 2)</ref>.</p><p>We first discuss related work centered around the notion of a complex correspondence in Section 2. We then present four patterns of complex correspondences in Section 3. In Section 4 we suggest the algorithms we designed to detect occurrences of these patterns. Each of these algorithms is described as a conjunction of conditions, which are easy to check by basic matching techniques. In Section 5 we apply the algorithms on two datasets from the OAEI and show that the proposed techniques can be used to detect a significant amount of complex correspondences. We end with a conclusion in Section 6.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Related Work</head><p>Complex matching is a well known topic in database schema matching. In <ref type="bibr" target="#b0">[1]</ref> the authors describe complex matches as matching corresponding attributes on which some operation was applied, e.g. a name is equivalent to the concatenation of a first name and a last name. There are several systems dealing with this kind of database schema matching. On the other hand, complex matching is relatively new in the ontology matching field. Most of the state of the art matchers just find (simple) correspondences between two atomic terms. However, pragmatic concerns call for complex matching. We also experienced this during discussions at OM-2008. It turns out that simple correspondences are too limited to capture all meaningful relations between concepts and properties of two related ontologies. This is an important aspect with respect to application scenarios making use of alignments, e.g. instance migration scenarios. There are three diverse aspects of complex correspondences: designing (defining), finding and representing them.</p><p>In <ref type="bibr" target="#b7">[8]</ref> complex correspondences are mainly considered from design and representation aspects. Complex correspondences are captured as correspondence patterns. They are solutions for recurring mismatches that arise when aligning two ontologies. These patterns are now being included within Ontology Design Patterns (ODP) <ref type="foot" target="#foot_1">2</ref>. This work considers complex matching as a task that has to be conducted by a human user, which might e.g. be a domain expert. Experts can take advantage of diverse templates for capturing complex and correct matching. 
However, this collection of patterns can also be exploited by some automated matching approach, as suggested and shown in this paper.</p><p>In <ref type="bibr" target="#b10">[11]</ref> the authors tried to find complex correspondences using pattern-based detection of different semantic structures in ontologies. The most refined pattern is concerned with 'N-ary' relation detection. After detecting an instance of the pattern (using a query language and some string-based heuristics), additional conditions (mainly string-based comparisons) over related entities are checked with respect to matching. While there are some experiments with pattern detection in one ontology, experiments with matching tasks are missing.</p><p>Furthermore, in <ref type="bibr" target="#b11">[12]</ref> the authors consider an approach for pattern-based ontology transformation useful for diverse purposes. One particular use case is ontology matching, where this method enables finding further correspondences that were originally missed. Ontologies are transformed according to transformation patterns and then any matcher can be applied. The authors hypothesize that matchers can work with some structures better than with others. This approach uses the Expressive alignment language<ref type="foot" target="#foot_2">3</ref> based on <ref type="bibr" target="#b1">[2]</ref>, which extends the original INRIA alignment format. This language makes it possible to express complex structures on each side of an alignment (set operators, restrictions for entities and relations). Furthermore it is possible to use variables and transformation functions for transforming attribute values. "Basically, complex correspondences are employed indirectly in the ontology matching process at a pre-processing step where ontology patterns are detected and transformed <ref type="bibr" target="#b12">[13]</ref>." 
In contrast, in this paper complex correspondences are detected directly, taking advantage of information not only from the two ontologies being aligned but also from a reference alignment composed of simple correspondences.</p><p>Regarding ontology matching, there are a few matchers trying to find complex correspondences based on machine learning approaches (see <ref type="bibr" target="#b8">[9]</ref> for a general description). A concrete matching system is presented in <ref type="bibr" target="#b5">[6]</ref>. These approaches take correspondences with more than two atomic terms into account, but require the ontologies to include matchable instances. However, ontologies often contain disjoint sets of instances, such that for each instance of one ontology there exists no counterpart in the other ontology and vice versa. The approach proposed in this paper does not require the existence of matchable instances at all.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Complex Correspondence Patterns</head><p>In the following we propose four patterns for complex correspondences that, based on a preparatory study, we expect to occur frequently within ontology matching problems. We first report on our preparatory study, followed by a detailed presentation of each pattern. Each pattern is also explained by an example depicted in Figure <ref type="figure">1</ref>. Without explicitly mentioning it, we will refer to Figure <ref type="figure">1</ref> throughout this section. Further we use O₁ and O₂ to refer to two aligned ontologies, and we use prefix notation i#C to refer to an entity C from ontology Oᵢ.</p><p>Fig. <ref type="figure">1</ref>. Two example ontologies to explain the complex patterns</p><p>First of all we had to collect different types of complex correspondences. We considered the examples found in <ref type="bibr" target="#b8">[9]</ref> and also profited from the discussion of the consensus track at OM 2008, which highlighted the need for complex correspondences. <ref type="foot" target="#foot_3">4</ref> After we had a few ideas, we started inspecting two sets of ontologies manually to find concrete examples of complex correspondences. The specific ontologies which we examined are the SIGKDD, CMT, EKAW, IASTED, and CONFOF ontologies of the conference dataset and ontologies 101, 301, 302, 303, and 304 of the benchmark track. The first dataset describes the domain of conferences. This seems to be suitable <ref type="bibr" target="#b9">[10]</ref> because most persons dealing with ontologies are academics and know this topic already. Therefore it is easier to understand complex interdependencies in this domain than in an unfamiliar domain such as medicine. The OAEI Benchmark ontologies cover the domain of bibliography, which is also well known to academics. Another reason for choosing these ontologies is the existence of freely available reference alignments. For the conference dataset an alignment is available for every pair of ontologies. 
For the benchmark ontologies an alignment is available only for each combination with ontology 101, resulting in four matching tasks. In Section 4 we will explain to what extent and for which purpose a reference alignment, which consists of simple correspondences, is required.</p><p>The first three patterns are very similar; nevertheless, it will turn out that different algorithms are required to detect concrete complex correspondences. In accordance with <ref type="bibr" target="#b7">[8]</ref> we will refer to them as Class by Attribute Type pattern, Class by Inverse Attribute Type pattern, and Class by Attribute Value pattern. In the following we give a formal description as well as an example for each pattern.</p><p>Class by Attribute Type pattern (CAT) This pattern occurs very often when we have disjoint sibling concepts. In such a situation the same pattern can be used to define each of the sibling concepts.</p><p>Formal Pattern:</p><formula xml:id="formula_0">1#A ≡ ∃2#R.2#B Example: 1#PositiveReviewedPaper ≡ ∃2#hasEvaluation.2#Positive</formula><p>With respect to the ontologies depicted in Figure <ref type="figure">1</ref> we can construct correspondences of this type for the concepts Positive-, Neutral-, and NegativeReviewedPaper.</p><p>Class by Inverse Attribute Type pattern (CAT⁻¹) The following pattern requires the inverse 2#R⁻¹ of property 2#R, since we want to define 1#A as a subconcept of 2#R's range.</p><p>Formal Pattern:</p><formula xml:id="formula_1">1#A ≡ 2#B ⊓ ∃2#R⁻¹.⊤ Example: 2#Researcher ≡ 1#Person ⊓ ∃1#researchedBy⁻¹.⊤</formula><p>Given an ontology which contains a property and its inverse property as named entities, it is possible to describe the same correspondences as Class by Attribute Type pattern and as Class by Inverse Attribute Type pattern. 
Nevertheless, an inverse property might often not be defined as an atomic entity in the ontology or might be named in a way which makes a correct matching harder.</p><p>Class by Attribute Value pattern (CAV) While in the Class by Attribute Type pattern membership in a concept was a necessary condition, we now make use of nominals defined by concrete data values.</p><p>Formal Pattern:</p><formula xml:id="formula_2">1#A ≡ ∃2#R.{…} (where {…} is a set of concrete data values) Example: 1#submittedPaper ≡ ∃2#submission.{true}</formula><p>Another typical example is the distinction between LateRegisteredParticipant and EarlyRegisteredParticipant. In particular, the boolean variant of the pattern is used to distinguish between complementary subclasses. However, in general there might be more than two relevant values. The following correspondence is a more complex example:</p><formula xml:id="formula_3">1#StudentPassedExam ≡ ∃2#hasExamScore.{A, B, C, D}.</formula><p>Property Chain pattern (PC) <ref type="foot" target="#foot_4">5</ref> In the following we assume that in O₁ property 1#author relates a paper to the name of its author, while in O₂ 2#hasAuthor relates a paper to its author and the datatype property 2#name relates a person to its name. Under these circumstances a chain of properties in O₂ is equivalent to an atomic property in O₁.</p><p>Formal Pattern:</p><formula xml:id="formula_4">1#R ≡ 2#P • 2#Q Example: 1#author ≡ 2#hasAuthor • 2#name</formula><p>Conventional matching systems focus only on correspondences between atomic entities. Therefore, a matcher might detect a similarity between 1#R and 2#P and one between 1#R and 2#Q, but will finally decide to output the one with higher similarity. This observation already indicates that state of the art matching techniques can be exploited to generate complex correspondences. 
In particular, we will argue in the next section that it is possible to detect complex correspondences by combining simple techniques in an intelligent way.<ref type="foot">6</ref></p></div>
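<div xmlns="http://www.tei-c.org/ns/1.0"><p>As a minimal sketch, the CAT, CAV and PC patterns described above can be held in small record types that render the formal notation. The class names, field names and the render method are assumptions introduced for illustration only; they are not part of the approach described in this paper. Only the notation itself is taken from the text.</p><p>
```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ClassByAttributeType:
    """CAT pattern: 1#A ≡ ∃2#R.2#B (hypothetical record type)."""
    concept: str  # 1#A
    prop: str     # 2#R
    filler: str   # 2#B

    def render(self) -> str:
        return f"{self.concept} ≡ ∃{self.prop}.{self.filler}"


@dataclass(frozen=True)
class ClassByAttributeValue:
    """CAV pattern: 1#A ≡ ∃2#R.{...} with a set of concrete data values."""
    concept: str             # 1#A
    prop: str                # 2#R
    values: Tuple[str, ...]  # concrete data values

    def render(self) -> str:
        joined = ", ".join(self.values)
        # Doubled braces produce literal braces around the value list.
        return f"{self.concept} ≡ ∃{self.prop}.{{{joined}}}"


@dataclass(frozen=True)
class PropertyChain:
    """PC pattern: 1#R ≡ 2#P • 2#Q."""
    prop: str    # 1#R
    first: str   # 2#P
    second: str  # 2#Q

    def render(self) -> str:
        return f"{self.prop} ≡ {self.first} • {self.second}"


if __name__ == "__main__":
    print(ClassByAttributeType(
        "1#PositiveReviewedPaper", "2#hasEvaluation", "2#Positive").render())
    print(ClassByAttributeValue(
        "1#submittedPaper", "2#submission", ("true",)).render())
    print(PropertyChain("1#author", "2#hasAuthor", "2#name").render())
```
</p><p>The three calls reuse the examples given for each pattern in this section.</p></div>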
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Algorithms</head><p>The techniques we are using for detecting complex correspondences are based on combinations of both linguistic and structural methods. In the following we briefly list and describe these approaches. The structural techniques require the existence of a reference alignment R that consists of simple equivalence correspondences between atomic concepts. In principle, it would also be possible to use a matcher-generated (and partially incorrect) alignment, but in our first experiments we wanted to avoid any additional source of error.</p><p>Structural Criteria To decide whether two or more entities are related via complex correspondences, information about their position in the ontology hierarchy is required. Therefore, we have to check whether two concepts are in a subclass or superclass relation, or are even equivalent concepts. It might also be important to know whether two concepts are non-overlapping, disjoint concepts. Properties are connected to the concept hierarchy via domain and range restrictions, which are thus also important context information. All of these notions are clearly defined within a single ontology; however, we extend these notions to a pair of aligned ontologies. 1#C is also referred to as a subconcept of 2#D if there exists a correspondence</p><formula xml:id="formula_5">1#C′ = 2#D′ ∈ R such that O₁ |= 1#C ⊆ 1#C′ and O₂ |= 2#D′ ⊆ 2#D.</formula></div>
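<div xmlns="http://www.tei-c.org/ns/1.0"><p>The extension of the subconcept relation to a pair of aligned ontologies can be sketched as follows. This is an illustration under simplifying assumptions: each ontology is encoded as a plain dictionary from a concept to its asserted direct superclasses, and entailment is approximated by the transitive closure of these assertions, whereas a real system would typically use a reasoner. The function names and the toy concepts 2#Document and the toy alignment are hypothetical.</p><p>
```python
from typing import Dict, Iterable, List, Set, Tuple

# Toy encoding (assumption): concept -> list of asserted direct superclasses.
Hierarchy = Dict[str, List[str]]


def ancestors_or_self(hierarchy: Hierarchy, concept: str) -> Set[str]:
    """Return the concept together with all of its transitive superclasses."""
    seen = {concept}
    stack = [concept]
    while stack:
        for parent in hierarchy.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_cross_subconcept(
    o1: Hierarchy,
    o2: Hierarchy,
    alignment: Iterable[Tuple[str, str]],
    c: str,
    d: str,
) -> bool:
    """Check whether 1#C is a subconcept of 2#D via some aligned pair.

    True if the alignment contains a pair (c_prime, d_prime) such that
    c is below c_prime in o1 and d_prime is below d in o2.
    """
    above_c = ancestors_or_self(o1, c)
    return any(
        c_prime in above_c and d in ancestors_or_self(o2, d_prime)
        for c_prime, d_prime in alignment
    )


if __name__ == "__main__":
    o1 = {"1#AcceptedPaper": ["1#Paper"]}
    o2 = {"2#Paper": ["2#Document"]}
    reference = [("1#Paper", "2#Paper")]
    print(is_cross_subconcept(o1, o2, reference, "1#AcceptedPaper", "2#Document"))
```
</p><p>Because the closure is reflexive, an aligned pair itself also counts, so 1#AcceptedPaper is reported as a subconcept of 2#Paper as well as of 2#Document in this toy setting.</p></div>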
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Syntactical Criteria</head><p>The most efficient methods used in ontology matching are based on string comparisons, e.g. comparing the concept id (the fragment of the concept's URI) or the label to compute a similarity between ontological elements. We also make use of this basic method by computing a similarity measure between normalized strings based on the Levenshtein measure <ref type="bibr" target="#b3">[4]</ref>. For the sake of simplicity we refer to the maximum value obtained from id and label comparison as label similarity in the following. For some operations we need to determine the head noun of a given compound concept/property label. Thus, we can e.g. detect that Reviewer is the head noun of ExternalReviewer. Sometimes we are simply interested in the first part of a label, sometimes in the head noun and sometimes in the remaining parts.</p><p>Data type Compatibility Two data types are compatible if one data type can be translated into the other and vice versa. This becomes relevant whenever datatype properties are involved. We determined compatibility in a wide sense. E.g. data type String is compatible with every other data type while Date is not compatible with Boolean.</p><p>A more detailed description can be found in <ref type="bibr" target="#b6">[7]</ref>. Overall we emphasize that our methodology does not exceed basic functionalities which we normally would expect to be part of any state of the art matching system.</p><note place="foot" n="6">Even experts tend to avoid the introduction of complex correspondences. The property chain <formula xml:id="formula_6">1#R ≡ 2#P • 2#Q, for</formula></note><p>Class by Attribute Type pattern A correspondence 1#A ≡ ∃2#R.2#B of the CAT type is generated by our algorithm if all of the following conditions hold.</p><p>1. The string that results from removing the head noun from the label of 1#A is similar to the label of 2#B. 2. There exists a class 2#C that is a superclass of 2#B, is the range of 2#R, and also has a label similar to 2#R. 3. The domain of 2#R is a superclass of 1#A due to R.</p><p>Notice that these conditions are a complete description of our approach for detecting the CAT pattern. The following example will clarify why such a straightforward approach works.</p></div>
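<div xmlns="http://www.tei-c.org/ns/1.0"><p>Separately from the example shown in Fig. 2, the syntactical criteria and the first CAT condition can be sketched in Python. Several details are assumptions, since the text does not fix them: the normalization (lowercasing and dropping non-alphanumeric characters), the scaling of the Levenshtein distance by the longer string length, and the heuristic that the head noun is the last CamelCase token of a label. All function names are hypothetical.</p><p>
```python
import re
from typing import Tuple


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def normalize(label: str) -> str:
    """Assumed normalization: lowercase, keep only letters and digits."""
    return re.sub(r"[^a-z0-9]", "", label.lower())


def label_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; the scaling by the longer length is an assumption."""
    a, b = normalize(a), normalize(b)
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def split_head_noun(label: str) -> Tuple[str, str]:
    """Return (head noun, remaining part), taking the last CamelCase token as head."""
    tokens = re.findall(r"[A-Z][a-z0-9]*|[a-z0-9]+", label)
    if not tokens:
        return "", ""
    return tokens[-1], "".join(tokens[:-1])


def cat_condition_1(label_a: str, label_b: str, threshold: float) -> bool:
    """CAT condition 1: label of 1#A without its head noun is similar to label of 2#B."""
    _, remainder = split_head_noun(label_a)
    return label_similarity(remainder, label_b) >= threshold


if __name__ == "__main__":
    print(split_head_noun("ExternalReviewer"))
    print(cat_condition_1("AcceptedPaper", "Acceptance", 0.5))
```
</p><p>Conditions 2 and 3 additionally need the class hierarchy and the reference alignment R, so they are not covered by this string-only sketch.</p></div>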
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Class by Inverse Attribute Type pattern</head><p>A correspondence</p><formula xml:id="formula_7">1#A ≡ 2#B ⊓ ∃2#R⁻¹.⊤</formula><p>of the CAT⁻¹ type is generated if all of the following conditions hold.</p><p>1. The labels of 1#A and 2#R are similar. 2. There exists a concept 2#B which is a proper subset of the range of 2#R. 3. The concept 2#B is, due to R, a superclass of 1#A.</p><p>Notice that for the CAT pattern we did not demand similarity between 1#A and 2#R. This is related to the fact that the label of a property often describes some aspects of its range and not its domain (e.g. hasAuthor relates a paper to its author). Thus, the label of a property is relevant for the inverse pattern CAT⁻¹. The other two conditions are related to structural aspects and filter out candidates that are caused by accidental string similarities.</p><p>Class by Attribute Value pattern Although above we described the pattern CAV in general, our algorithm will only detect the boolean variant of this pattern. A correspondence 1#A ≡ ∃2#R.{true} is generated by our algorithm if all of the following conditions hold.</p><p>1. The range of the datatype property 2#R is Boolean. 2. In the following the label of 1#A is split into its head noun hn(1#A) and the remaining part of the label ¬hn(1#A). Again, ¬hn(1#A) is split into a first part ¬hn₁(1#A) and a remaining part ¬hn₂(1#A). (a) hn(1#A) is similar to the label of 2#R's domain. (b) ¬hn(1#A) is similar to the label of 2#R. (c) ¬hn₁(1#A) is similar to the label of 2#R. 3. The domain of 2#R is a superclass of 1#A due to R.</p><p>Given a non-boolean datatype property range, more sophisticated techniques are required to decide which set of values is adequate for which concept. In our case this distinction is based on condition 2c. If the similarity value does not exceed a certain threshold, we generate 1#A ≡ ∃2#R.{false} instead of 1#A ≡ ∃2#R.{true}. An example detected in our experimental study is 1#Early Registered Participant ≡ ∃2#earlyRegistration.{true} exploiting 1#Participant ≡ 2#Participant in R.</p></div>
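<div xmlns="http://www.tei-c.org/ns/1.0"><p>The choice between the two boolean variants of the CAV pattern can be sketched as a single comparison. The sketch assumes that the similarity value from condition 2c has already been computed and that the remaining conditions hold; the function name and the example similarity values are hypothetical.</p><p>
```python
def cav_correspondence(concept: str, prop: str,
                       similarity: float, threshold: float) -> str:
    """Render the boolean CAV correspondence for a candidate pair.

    The value is {true} only if the similarity exceeds the threshold;
    otherwise {false} is generated, as described in the text.
    """
    value = "true" if similarity > threshold else "false"
    # Doubled braces produce the literal braces of the nominal.
    return f"{concept} ≡ ∃{prop}.{{{value}}}"


if __name__ == "__main__":
    # Hypothetical similarity values, chosen only to show both outcomes.
    print(cav_correspondence("1#EarlyRegisteredParticipant",
                             "2#earlyRegistration", 0.9, 0.8))
    print(cav_correspondence("1#LateRegisteredParticipant",
                             "2#earlyRegistration", 0.3, 0.8))
```
</p><p>A similarity exactly equal to the threshold yields {false}, because the text requires the value to exceed the threshold.</p></div>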
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Property Chain pattern</head><p>A correspondence</p><formula xml:id="formula_8">1#R ≡ 2#P • 2#Q</formula><p>of type PC is generated if all of the following conditions hold.</p><p>1. Due to R, the domain of 1#R is a subclass or superclass of the domain of 2#P. 2. The range of 2#P is a subclass or superclass of the domain of 2#Q. 3. Datatype properties 1#R and 2#Q have a compatible data range. 4. The labels of 1#R and 2#P are similar. 5. The label of 2#Q is name, or it is contained in the label of 1#R or vice versa.</p><p>Due to the condition that the range of 2#P and the domain of 2#Q are in a subclass or superclass relation, the successive application of the properties can be ensured. Often 1#R maps a class onto a name; therefore properties which are labeled with name are especially likely mapping candidates. An example for this pattern has already been given in the previous section. With respect to Figure <ref type="figure">1</ref> we have 1#R = 1#author, 2#P = 2#hasAuthor, 2#Q = 2#name. The property 1#author relates a paper to the name of its author, 2#hasAuthor relates a paper to its author and 2#name relates an author to its name. Thus, a chain of properties is required to express 1#author in the terminology defined by O₂.</p><p>A second set of conditions aims to cover a different naming strategy. The first three conditions are the same as above, but the last ones have to be replaced as follows.</p><p>4. The labels of 1#R and 2#Q are similar. 5. The labels of 2#P and its range or the labels of the properties 2#P and 2#Q are similar.</p><p>An example of a property chain that fulfills these conditions, depicted in Figure <ref type="figure">3</ref>, is</p><p>1#hasYear ≡ 2#date • 2#year, where 2#date is an object property with 2#Date as abstract range. For all patterns of the class by and property chain family we additionally check for each candidate correspondence whether there exists a constituent that already occurs in the reference alignment. 
In this case we trust the simple correspondence in the reference alignment and do not generate the complex correspondence.</p></div>
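<div xmlns="http://www.tei-c.org/ns/1.0"><p>The first set of PC conditions is a plain conjunction and can be sketched as follows. This is an illustration, not the implementation evaluated in Section 5: the Property record, the callables related and compatible, and the toy data are hypothetical, and the ratio of difflib.SequenceMatcher from the Python standard library is used as a stand-in for the Levenshtein-based label similarity.</p><p>
```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Callable


@dataclass(frozen=True)
class Property:
    label: str
    domain: str
    range_: str  # a class for object properties, a data type otherwise


def similar(a: str, b: str, threshold: float) -> bool:
    # Stand-in measure (assumption); the text uses a Levenshtein-based similarity.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def is_property_chain(r: Property, p: Property, q: Property,
                      related: Callable[[str, str], bool],
                      compatible: Callable[[str, str], bool],
                      threshold: float) -> bool:
    """Check the first set of PC conditions for the candidate r ≡ p • q."""
    contained = q.label in r.label or r.label in q.label
    return (
        related(r.domain, p.domain)               # condition 1
        and related(p.range_, q.domain)           # condition 2
        and compatible(r.range_, q.range_)        # condition 3
        and similar(r.label, p.label, threshold)  # condition 4
        and (q.label == "name" or contained)      # condition 5
    )


# Toy knowledge (assumption): pairs that are aligned via R or in a subclass relation.
RELATED_PAIRS = {("1#Paper", "2#Paper"), ("2#Author", "2#Person")}


def related(c: str, d: str) -> bool:
    return c == d or (c, d) in RELATED_PAIRS or (d, c) in RELATED_PAIRS


def compatible(t1: str, t2: str) -> bool:
    # Wide notion of compatibility: equal types, or one of them is a string.
    return t1 == t2 or "string" in (t1, t2)


if __name__ == "__main__":
    r = Property("author", "1#Paper", "string")
    p = Property("hasAuthor", "2#Paper", "2#Author")
    q = Property("name", "2#Person", "string")
    print(is_property_chain(r, p, q, related, compatible, 0.7))
```
</p><p>The second set of conditions would replace only the last two conjuncts, and the final check against constituents that already occur in the reference alignment is left out of this sketch.</p></div>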
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Experiments</head><p>The algorithms described in the previous section have been implemented in a matching tool available at http://dominique-ritze.de/complex-mappings/. We applied our tool on three datasets referred to as CONFERENCE 1, CONFERENCE 2 and BENCHMARK. These datasets have been taken from corresponding tracks of the Ontology Alignment Evaluation Initiative (OAEI). As BENCHMARK we refer to the matching tasks #301 -#304 of the OAEI Benchmark track. We abstained from using the other test cases, because they are generated by systematic variations of the #101 ontology, which do not exceed a certain degree of structural difference. The CONFERENCE 1 dataset consists of all pairs of ontologies for which a reference alignment is available. Additionally, we used the reference alignment between concepts created for the experiments conducted in <ref type="bibr" target="#b4">[5]</ref> to extend our datasets. This dataset is referred to as CONFERENCE 2 and has not been regarded while looking for complex correspondences. Notice that all conditions in our algorithms express hard boolean constraints. The only exception is the threshold that determines whether two strings are similar. Therefore, we conducted our experiments with different thresholds from 0.6 to 0.9.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Correct Correspondences (true positives)</head><p>Incorrect Correspondences (false positives)</p><formula xml:id="formula_9">Type CAT &amp; CAT⁻¹ PC CAT &amp; CAT⁻¹ PC</formula><p>Threshold 0.6 0.7 0.8 0.9 0.6 0.7 0.8 0.9 0.6 0.7 0.8 0.9 0.6 0.7 0.8 0.9 0.6 0.7 0.8 0.9 0.6 0.7 0.8 0.9</p><p>Table <ref type="table">1</ref>. Results with four different thresholds</p><p>Table <ref type="table">1</ref> gives an overview of the results of our experiments. We carefully analyzed all generated correspondences and divided them into correct (true positives) and incorrect ones (false positives). One might first notice that we did not include a column for the CAV pattern. Unfortunately, only two correct and one incorrect correspondence of this type have been detected in the CONFERENCE 1 dataset. Remember that we only focused on boolean datatype properties. A more general strategy might result in higher recall. Nevertheless, to our knowledge all correspondences of the boolean CAV have been detected and even with low thresholds only one incorrect correspondence occurred.</p><p>Obviously there is a clear distinction between different datasets. While our matching system detected correct complex correspondences of class by types in the CONFERENCE datasets, none have been detected in the BENCHMARK dataset. Nearly the same holds vice versa. This is based on the fact that the ontologies of the BENCHMARK dataset are dedicated to the very narrow domain of bibliography and do not strongly vary with respect to their concept hierarchy, while differences can be found with regard to the use of properties. The CONFERENCE ontologies on the other hand have very different conceptual hierarchies.</p><p>Correspondences of the pattern CAT and CAT⁻¹ can be found in both CONFERENCE 1 &amp; 2 datasets. 
As expected we find the typical relation between precision and recall on the one hand and the chosen threshold on the other hand: low thresholds cause low precision of approx. 30% and allow us to detect a relatively high number of correct correspondences. A nearly balanced ratio between true and false positives is reached with a threshold of 0.8.</p><p>For the PC pattern a threshold of 0.6 results in 18 correct and 21 incorrect correspondences. Surprisingly, the number of correct correspondences does not decrease with increasing threshold, although the number of incorrect correspondences decreases significantly. This is based on the fact that the relevant entities occurring in the PC pattern are very often not only similar but identical after normalization (e.g. concept Date and property date). This observation indicates that there is still room for improvement by choosing different thresholds for different patterns.</p><p>Another surprising result is the high number of false property chains in the CONFERENCE 1 and in particular in the CONFERENCE 2 dataset compared to the BENCHMARK dataset. Due to the existence of a reference alignment with high coverage of properties for the BENCHMARK dataset many incorrect property chains have not been generated. Their constituents already occurred in simple correspondences of the reference alignment. The same does not hold for the CONFERENCE datasets. There are many properties that have no counterpart in one of the other ontologies.</p><p>Our experimental study points to the problem of evaluating the quality of a complex alignment. Due to the fact that complex correspondences are missing in the reference alignments, our results cannot be compared against a gold standard, resulting in missing recall values. 
Even though it might be possible to construct a complete reference alignment for a finite number of patterns, it will be extremely laborious to construct a complete reference alignment which contains all non-trivial complex correspondences. Nevertheless, a comparison against the size of the simple reference alignments might deliver some useful insights. The number of property correspondences in the union of all BENCHMARK reference alignments is 139 (only 63 concept correspondences), while we could find 17 additional property chains with our approach. For the CONFERENCE datasets we counted 275 concept correspondences (only the CONFERENCE 1 dataset comprised additionally 12 property correspondences). Here we detected 12 complex correspondences of different class by types. These results indicate that the proposed complex ontology matching strategy increased recall by approx. 4% with respect to concept correspondences and by approx. 10% with respect to property correspondences.</p><p>Interpreting these results, we have to keep in mind that the generation of complex correspondences is much harder compared to the generation of simple correspondences. While a balanced rate of correct and incorrect correspondences will not be acceptable for simple matching tasks, a similar result is positive with respect to the complex matching task which we tackle with our approach.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6">Conclusion</head><p>We proposed a pattern-based approach to detect different types of complex correspondences. Our approach does not rely on machine learning techniques, which require the availability of instance correspondences. On the contrary, it is based on state of the art matching techniques and additionally exploits an input alignment which consists of simple correspondences. In an experimental study we have shown that our approach, which is simply based on checking conditions specific to a particular pattern, is sufficient to detect a significant number of complex correspondences, while the number of false positives is relatively low, considering that complex correspondences are quite hard to detect.</p><p>Although first results are promising, we know that the task of verifying the correctness of complex correspondences requires human interaction. A pattern-based approach, as proposed in this paper, will in most cases fail to generate highly precise alignments. This is based on the fact that the generation of complex correspondences is significantly harder compared to the task of generating simple correspondences. Suppose, given concept AcceptedPaper of O₁, a user is searching in O₂ for an equivalent concept. First of all, there are as many simple hypotheses available as there are atomic concepts in O₂. The situation changes dramatically when there exists no atomic counterpart and a complex correspondence is required. The search space explodes and it becomes impossible for a human expert to evaluate each possible combination. We know that the proposed patterns cover only a small part of an infinite search space. 
Nevertheless, this small part might still be large enough to find a significant fraction of those correspondences that will not be detected at all without a supporting system.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head></head><label></label><figDesc>throughout this section. Further, we use O₁ and O₂ to refer to two aligned ontologies, and we use the prefix notation i#C to refer to an entity C from ontology Oᵢ.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Fig. 2 .</head><label>2</label><figDesc>Fig. 2. Conditions relevant for detecting CAT correspondence 1#Accepted Paper ≡ ∃2#hasDecision.2#Acceptance.</figDesc><graphic coords="7,189.84,350.12,235.68,138.33" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>2 .</head><label></label><figDesc>In the following, the label of 1#A is split into its head noun hn(1#A) and the remaining part of the label ¬hn(1#A). In turn, ¬hn(1#A) is split into a first part ¬hn₁(1#A) and a remaining part ¬hn₂(1#A). (a) hn(1#A) is similar to the label of 2#R's domain. (b) ¬hn(1#A) is similar to the label of 2#R. (c) ¬hn₁(1#A) is similar to the label of 2#R. 3. The domain of 2#R is a superclass of 1#A due to R.</figDesc></figure>
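<note xmlns="http://www.tei-c.org/ns/1.0" type="example"><p>The label conditions (a) to (c) listed above can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions that the text does not state: the head noun is taken to be the last token of a camel-case label, the first part of the remaining label is taken to be its first token, similarity is a normalized Levenshtein similarity, and the threshold of 0.8 is a hypothetical value. The function names split_label, similar and label_conditions are illustrative only. How the three conditions are combined is left to the caller, and condition 3 (the superclass check) is not covered.</p>

```python
import re
from typing import Dict, List


def split_label(label: str) -> List[str]:
    """Split a camel-case, underscore- or space-separated label into lowercase tokens."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", label)
    return [token.lower() for token in re.split(r"[\s_-]+", spaced) if token]


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def similar(a: str, b: str, threshold: float = 0.8) -> bool:
    """Normalized Levenshtein similarity check; empty strings are never similar."""
    if not a or not b:
        return False
    return 1.0 - levenshtein(a, b) / max(len(a), len(b)) >= threshold


def label_conditions(label_a: str, label_r: str, label_r_domain: str,
                     threshold: float = 0.8) -> Dict[str, bool]:
    """Evaluate label conditions (a), (b) and (c) separately for concept A and property R."""
    tokens = split_label(label_a)
    if not tokens:
        raise ValueError("label_a contains no tokens")
    head = tokens[-1]                    # assumption: head noun = last token
    rest = tokens[:-1]                   # remaining part of the label
    rest_first = rest[0] if rest else ""  # assumption: first part = first token
    r = "".join(split_label(label_r))
    domain = "".join(split_label(label_r_domain))
    return {
        "a": similar(head, domain, threshold),
        "b": similar("".join(rest), r, threshold),
        "c": similar(rest_first, r, threshold),
    }


if __name__ == "__main__":
    # Hypothetical labels, not taken from the evaluated ontologies.
    print(label_conditions("EarlyRegisteredParticipant", "earlyRegistered", "Participant"))
```

<p>For the hypothetical labels in the usage example, conditions (a) and (b) hold and (c) does not, since the first token alone is too far from the full property label under the assumed threshold.</p></note>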
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Fig. 3 .</head><label>3</label><figDesc>Fig. 3. Conditions relevant for detecting PC correspondence 1#hasYear ≡ 2#date • 2#year</figDesc><graphic coords="9,134.77,310.33,345.84,123.56" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0"><head></head><label></label><figDesc></figDesc><graphic coords="4,134.77,115.84,345.83,225.21" type="bitmap" /></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">Atomic concepts/properties are sometimes also referred to as named concepts/properties or as concept/property names.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">In the taxonomy of patterns at the ODP portal (http://ontologydesignpatterns.org/wiki/OPTypes), the category AlignmentODP corresponds best with the patterns in this paper, while the category CorrespondeceODP is more general.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_2">http://alignapi.gforge.inria.fr/language.html</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_3">http://nb.vse.cz/~svabo/oaei2008/cbw08.pdf</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_4">The correspondence patterns library <ref type="bibr" target="#b7">[8]</ref> explicitly contains (CAT) and (CAV); the other two patterns, (PC) and (CAT −1 ), are not explicitly presented there.</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Acknowledgment. The work has been partially supported by the German Science Foundation (DFG) under contracts STU 266/3-1 and STU 266/5-1 and by the IGA VSE grant no. 20/08 "Evaluation and matching ontologies via patterns".</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Semantic-integration research in the database community</title>
		<author>
			<persName><forename type="first">A</forename><surname>Doan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">Y</forename><surname>Halevy</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">AI Magazine</title>
		<imprint>
			<biblScope unit="page" from="83" to="94" />
			<date type="published" when="2005">2005</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Expressive alignment language and implementation</title>
		<author>
			<persName><forename type="first">J</forename><surname>Euzenat</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Scharffe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Zimmermann</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Knowledge Web</title>
		<imprint>
			<date type="published" when="2007">2007</date>
		</imprint>
	</monogr>
	<note>Deliverable 2.2.10</note>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<title level="m" type="main">Ontology Matching</title>
		<author>
			<persName><forename type="first">J</forename><surname>Euzenat</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Shvaiko</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2007">2007</date>
			<publisher>Springer</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Binary codes capable of correcting deletions and insertions and reversals</title>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">I</forename><surname>Levenshtein</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Soviet Physics Doklady</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="issue">8</biblScope>
			<biblScope unit="page" from="707" to="710" />
			<date type="published" when="1966">1966</date>
		</imprint>
	</monogr>
	<note>English translation; originally published in Russian in Doklady Akademii Nauk SSSR, 1965</note>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Repairing Ontology Mappings</title>
		<author>
			<persName><forename type="first">C</forename><surname>Meilicke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Tamilin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Stuckenschmidt</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 22nd Conference on Artificial Intelligence</title>
				<meeting>the 22nd Conference on Artificial Intelligence<address><addrLine>Vancouver, Canada</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2007">2007</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Discovering Executable Semantic Mappings Between Ontologies</title>
		<author>
			<persName><forename type="first">H</forename><surname>Qin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Dou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>LePendu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">On the Move to Meaningful Internet Systems 2007: CoopIS, DOA, ODBASE, GADA, and IS</title>
				<imprint>
			<date type="published" when="2007">2007</date>
			<biblScope unit="page" from="832" to="849" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<title level="m" type="main">Generating Complex Ontology Alignments</title>
		<author>
			<persName><forename type="first">D</forename><surname>Ritze</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2009">2009</date>
		</imprint>
		<respStmt>
			<orgName>University of Mannheim</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">Bachelor thesis</note>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<title level="m" type="main">Correspondence Patterns Representation</title>
		<author>
			<persName><forename type="first">F</forename><surname>Scharffe</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2009">2009</date>
		</imprint>
		<respStmt>
			<orgName>University of Innsbruck</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">PhD thesis</note>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Learning Complex Ontology Alignments: A Challenge for ILP Research</title>
		<author>
			<persName><forename type="first">H</forename><surname>Stuckenschmidt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Predoiu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Meilicke</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 18th International Conference on Inductive Logic Programming</title>
				<meeting>the 18th International Conference on Inductive Logic Programming</meeting>
		<imprint>
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">OntoFarm: Towards an Experimental Collection of Parallel Ontologies</title>
		<author>
			<persName><forename type="first">O</forename><surname>Šváb</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Svátek</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Berka</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Rak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Tomášek</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Poster Proceedings of the International Semantic Web Conference</title>
				<imprint>
			<date type="published" when="2005">2005</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Towards Ontology Matching via Pattern-Based Detection of Semantic Structures in OWL Ontologies</title>
		<author>
			<persName><forename type="first">O</forename><surname>Šváb-Zamazal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Svátek</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Znalosti Czecho-Slovak Knowledge Technology conference</title>
				<meeting>the Znalosti Czecho-Slovak Knowledge Technology conference</meeting>
		<imprint>
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Towards Metamorphic Semantic Models</title>
		<author>
			<persName><forename type="first">O</forename><surname>Šváb-Zamazal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Svátek</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>David</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Scharffe</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Poster session at European Semantic Web Conference</title>
				<imprint>
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Pattern-based Ontology Transformation Service</title>
		<author>
			<persName><forename type="first">O</forename><surname>Šváb-Zamazal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Svátek</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Scharffe</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 1st International Conference on Knowledge Engineering and Ontology Development</title>
				<meeting>the 1st International Conference on Knowledge Engineering and Ontology Development</meeting>
		<imprint>
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
