<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Completeness and Optimality in Ontology Alignment Debugging</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Jan</forename><surname>Noessner</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">University of Mannheim</orgName>
								<address>
									<postCode>68163</postCode>
									<settlement>Mannheim</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Heiner</forename><surname>Stuckenschmidt</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">University of Mannheim</orgName>
								<address>
									<postCode>68163</postCode>
									<settlement>Mannheim</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Christian</forename><surname>Meilicke</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">University of Mannheim</orgName>
								<address>
									<postCode>68163</postCode>
									<settlement>Mannheim</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author role="corresp">
							<persName><forename type="first">Mathias</forename><surname>Niepert</surname></persName>
							<email>mniepert@cs.washington.edu</email>
							<affiliation key="aff1">
								<orgName type="institution">University of Washington</orgName>
								<address>
									<postCode>98195-2350</postCode>
									<settlement>Seattle</settlement>
									<region>WA</region>
									<country key="US">USA</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Completeness and Optimality in Ontology Alignment Debugging</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">24C20B0B9819408626529A607D2FCAAF</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T03:14+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>ontology matching</term>
					<term>expressiveness</term>
					<term>alignment debugging</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>The benefit of light-weight reasoning in ontology matching has been recognized by a number of researchers resulting in alignment repair systems such as Alcomo and LogMap. While the general benefit of logical reasoning has been shown in principle, there is no systematic empirical evaluation analyzing (i) the impact of completeness of the reasoning methods and (ii) whether approximate or optimal solutions to the conflict resolution problem have to be preferred. Using standard benchmark data sets, we show that increasing the expressive power does improve the matching results and that optimal resolution methods slightly outperform approximate ones.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Research in ontology matching has been strongly influenced by earlier results in schema matching <ref type="bibr" target="#b18">[19]</ref>. There are several approaches that aim at being universally applicable across ontologies and database schemas by relying on a representation of ontologies and schemas as directed graphs <ref type="bibr" target="#b0">[1]</ref>. While various studies have verified the benefit of explicit, logical schema semantics such as description logics and logical reasoning (e.g. <ref type="bibr" target="#b16">[17]</ref>), there is only a limited number of approaches that exploit schema semantics to improve matching results in a principled manner. Early approaches exploiting the logical structure of class descriptions were based on specialized similarity measures that take logical operators into account (e.g. <ref type="bibr" target="#b1">[2]</ref>). Additional methods avoid structural properties that mimic unwanted reasoning results <ref type="bibr" target="#b5">[6]</ref> or require user interaction <ref type="bibr" target="#b17">[18]</ref>. More recently, a number of approaches have been proposed that explicitly use ontological reasoning. Meilicke et al., for instance, compute and leverage logical inconsistencies to eliminate conflicts between alignment hypotheses <ref type="bibr" target="#b10">[11]</ref>. A related approach was proposed by Jiménez-Ruiz et al. <ref type="bibr" target="#b6">[7]</ref>. Additional debugging strategies remove incoherent alignments during a post-processing step <ref type="bibr" target="#b19">[20,</ref><ref type="bibr" target="#b12">13]</ref>. Giunchiglia and colleagues use reasoning over logic-based representations of class labels but solely focus on the problem of matching class hierarchies <ref type="bibr" target="#b3">[4]</ref>. 
Most of these approaches exploit restricted forms of reasoning so as to ensure the scalability to large models. While these approaches demonstrated the benefits of logical reasoning for matching expressive ontologies, there has not been a systematic investigation of the impact logical reasoning has on matching results. In particular, it is not obvious whether more expressive reasoning methods provide more benefits than less expressive ones. Furthermore, the impact of applying different strategies for resolving detected logical conflicts has not been analyzed in detail. In this paper we report on experiments that shed light on both research questions. Another systematic evaluation is provided in <ref type="bibr" target="#b7">[8]</ref>, where the authors focus on the need for debugging and provide a comparison of two debugging systems, while we focus on completeness and optimality.</p><p>The paper is structured as follows. In Section 2 we explain alignment incoherence and introduce the notions of completeness and optimality with respect to alignment debugging. Moreover, we describe three existing debugging systems that we use in our experiments. We discuss the setting of our experiments in Section 3 with a focus on data sets and evaluation metrics. The results of these experiments are presented in Section 4. We close with a discussion in Section 5.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Incoherence in Ontology Matching</head><p>Ontology matching is the task of finding correspondences between entities of two ontologies O 1 and O 2 . According to <ref type="bibr" target="#b2">[3]</ref>, a correspondence between an entity e 1 defined in O 1 and an entity e 2 defined in O 2 is a 4-tuple ⟨e 1 , e 2 , r, n⟩ where r is a semantic relation (such as equivalence) and n is a real-valued confidence value. A set of correspondences is called an alignment. In line with most matching systems and benchmarks, we focus on equivalence correspondences, i.e., ⟨e 1 , e 2 , ≡, n⟩, where the matched entities are either both classes or both properties. However, the overall approach can also be applied to other kinds of axioms as long as these axioms are supported by the debugging system (e.g., all three systems used in our experiments also support subsumption axioms as correspondences).</p><p>An alignment A can be created by a human expert or by an automated matching system. In both cases, A might include erroneous correspondences. However, it is reasonable to assume that O 1 and O 2 do not contain erroneous axioms. For that reason, an alignment A can be interpreted as a set of uncertain, weighted equivalence axioms, while O 1 ∪ O 2 comprises the certain axioms. Merging A, O 1 , and O 2 can then result in an incoherent ontology, i.e., some of the classes of O 1 or O 2 might be unsatisfiable due to the additional information encoded in A. The following example shows an incoherent alignment.</p><formula xml:id="formula_0">O_1 = \{\mathit{Jaguar}_1 \sqsubseteq \mathit{Cat}_1,\ \mathit{Cat}_1 \sqsubseteq \mathit{Animal}_1\}, \quad O_2 = \{\mathit{Jaguar}_2 \sqsubseteq \mathit{Brand}_2,\ \mathit{Animal}_2 \sqsubseteq \neg \mathit{Brand}_2\}, \quad A = \{\langle \mathit{Jaguar}_1 \equiv \mathit{Jaguar}_2,\ 0.9 \rangle,\ \langle \mathit{Animal}_1 \equiv \mathit{Animal}_2,\ 0.95 \rangle\}</formula><p>In this example the classes Jaguar 1 and Jaguar 2 are unsatisfiable in the merged ontology. 
There are three possible ways to resolve this incoherence: (1) discard both correspondences, (2) discard ⟨Jaguar 1 ≡ Jaguar 2 , 0.9⟩, or (3) discard ⟨Animal 1 ≡ Animal 2 , 0.95⟩. Obviously, we prefer (2) and (3) over (1). Moreover, it seems to make more sense to remove the correspondence that is less confident, i.e., the most reasonable decision is (2), given that no further information is available.</p><p>However, with larger matching problems a solution to the debugging problem becomes more complex for two reasons. First, not all conflicts (i.e., subsets of correspondences resulting in incoherence) might be detected. This might be caused by using an incomplete reasoning technique, for example, because only certain types of axioms are analyzed. Second, the detected conflicts might overlap, so that there are several ways to resolve the incoherence. In such a situation, a solution that removes as little confidence as possible should be preferred. We call such a solution an optimal solution and define it as a subset ∆ ⊆ A such that A \ ∆ is coherent and there exists no other ∆* ⊆ A such that A \ ∆* is coherent and ∑_{c∈∆} conf(c) &gt; ∑_{c∈∆*} conf(c). This definition corresponds to the definition of a global optimal diagnosis given in <ref type="bibr" target="#b8">[9]</ref>.</p><p>Note that optimality and completeness are independent characteristics of a debugging system. It is possible to construct a debugging system that is complete in terms of reasoning but cannot guarantee the optimality of the solution, while it is also possible to construct a system that is incomplete and optimal, in the sense that the solution is optimal with respect to all detected conflicts, even though these conflicts are only a subset of all conflicts due to the incompleteness. 
Note also that the notion of optimality is a technical notion, i.e., an optimal solution might not always be the best solution in terms of precision and recall.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Experimental Set-Up</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Datasets</head><p>The ontologies we use for the experiments are from the Ontology Alignment Evaluation Initiative (OAEI) <ref type="bibr" target="#b4">[5]</ref>. We selected the conference and the large BioMed ontologies because these benchmarks are not artificially created (unlike, for instance, the benchmarks dataset), are not focused on a narrow alignment problem (unlike, for instance, the multifarm dataset, which is concerned with multilingual ontology matching), and provide coherent reference alignments. Moreover, the size of the large BioMed ontologies allows us to assess the scalability of the presented approach.</p><p>The conference dataset consists of 15 ontologies which model the domain of conference organization <ref type="bibr" target="#b20">[21]</ref>. The number of classes, properties, and axioms of a particular type for 7 of these ontologies are listed in Table <ref type="table" target="#tab_0">1</ref>, ordered by increasing expressiveness. Every row in the table, with the exception of the last row, corresponds to one expressiveness level we used for the experiments (see Section 4.1). For the 7 listed ontologies, reference alignments were created for each possible pair, resulting in 21 ontology pairs with a reference alignment.</p><p>Since the ontologies in the conference dataset are relatively small, we also performed experiments with the large BioMed ontologies. This data set consists of the Foundational Model of Anatomy (fma) <ref type="foot" target="#foot_0">3</ref> , National Cancer Institute Thesaurus (nci) <ref type="foot" target="#foot_1">4</ref> , and SNOMED clinical terms <ref type="foot" target="#foot_2">5</ref> ontologies. Because these ontologies are semantically rich and contain thousands of classes, aligning them is one of the computationally most challenging problems in the OAEI campaign. 
For the 2013 OAEI campaign, only 12 out of 21 participating system configurations were able to compute results for the three combinations. We used the "small fragment" matching problems of the track. For more details on these data sets we refer the reader to the OAEI track website <ref type="foot" target="#foot_3">6</ref> . The properties of the ontologies are summarized in Table <ref type="table" target="#tab_1">2</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Alignment Aggregation</head><p>For each of the matching tasks described above, there are several alignments available that have been generated by different matching systems. We decided to aggregate these alignments for each matching task in a preprocessing step. Thus, we can work with large input alignments and can avoid an additional subsequent aggregation of the debugging results. We aggregated the results of all matchers participating in the 2013 OAEI campaign. For the conference benchmark, we included all matchers which performed better than the string equality baseline <ref type="bibr" target="#b4">[5]</ref>. For the large BioMed benchmark, we included the results of the 6 matchers which were able to compute a solution for every combination <ref type="bibr" target="#b4">[5]</ref>. The participants in the large BioMed track were allowed to submit results for different settings of their system. We always used the best results of each system in terms of F-measure. The method of alignment aggregation resembles the approach described in <ref type="bibr" target="#b8">[9]</ref>. For each pair of ontologies, we merge the alignments A 1 , . . . , A i , . . . , A n of the matching systems 1, . . . , n into one alignment A. To that end, we first rescale the confidence value w of each correspondence ⟨w, a⟩ in alignment A i to the range (0, 1]. This ensures that the confidence values of the individual matchers are scaled identically. We then compute the aggregated a-priori confidence value of a correspondence as the normalized sum of all a-priori confidences of that correspondence. The average size of one alignment for the conference benchmark is 42, ranging from 29 to 60 correspondences. For the large BioMed benchmark, we obtain 3396 correspondences for the ontology pair nci and fma; 10760 for the pair fma and snomed; and 18842 for snomed and nci.</p></div>
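<div xmlns="http://www.tei-c.org/ns/1.0"><p>The aggregation step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used in the experiments: the text does not specify how confidences are rescaled or how the sum is normalized, so the sketch assumes division by each matcher's maximum confidence and division of the sum by the number of matchers. The function names and the entity names in the example are hypothetical.</p>
<p><code lang="python">
```python
from collections import defaultdict


def rescale(alignment):
    """Scale one matcher's confidences into (0, 1].

    Assumption: all confidences are positive and we divide by the maximum.
    """
    top = max(alignment.values())
    return {corr: conf / top for corr, conf in alignment.items()}


def aggregate(alignments):
    """Merge several alignments into one.

    Assumption: the aggregated confidence is the sum of the rescaled
    confidences divided by the number of matchers.
    """
    totals = defaultdict(float)
    for alignment in alignments:
        for corr, conf in rescale(alignment).items():
            totals[corr] += conf
    n = len(alignments)
    return {corr: total / n for corr, total in totals.items()}


# Hypothetical correspondences between two conference ontologies.
matcher_a = {("cmt#Paper", "ekaw#Paper"): 0.8, ("cmt#Author", "ekaw#Person"): 0.4}
matcher_b = {("cmt#Paper", "ekaw#Paper"): 1.0}
merged = aggregate([matcher_a, matcher_b])
```
</code></p>
<p>Under these assumptions, a correspondence found by every matcher with its top confidence keeps the value 1, while a correspondence proposed by a single matcher is down-weighted in proportion to the number of matchers that missed it.</p></div>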
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3">Debugging Systems</head><p>In our evaluation we present results for the debugging systems ELog, LogMap, and Alcomo, which we apply to the ontologies and alignments described so far.</p><p>-ELog <ref type="bibr" target="#b15">[16]</ref> is a reasoner for log-linear description logics, which offers complete reasoning capabilities for EL++. ELog can be used for debugging ontology alignments (details can be found in <ref type="bibr" target="#b14">[15]</ref>). Since ELog transforms the debugging problem to finding the MAP state of a Markov Logic Network, it guarantees the optimality of the solution, i.e., the MAP state corresponds to an optimal solution. However, ELog is not complete with respect to the full expressiveness of OWL DL.</p><p>-LogMap <ref type="bibr" target="#b6">[7]</ref> is a matching system including a component for alignment debugging. In our experiments we report only on applying this component and refer to it, for the sake of simplicity, as LogMap. This component translates the ontologies into a set of Horn clauses and applies the linear Dowling-Gallier algorithm for propositional Horn satisfiability multiple times for repairing. The algorithm is not optimal and, to our knowledge, also not complete with respect to the OWL DL profile. LogMap is known to be the most efficient debugging tool currently available (see for example <ref type="bibr" target="#b7">[8]</ref>).</p><p>-Alcomo <ref type="bibr" target="#b8">[9]</ref> has specifically been developed for the purpose of debugging ontology alignments. Alcomo can be used in a setting that ensures the completeness (for OWL DL) and the optimality of the solution. The optimality of the solution is guaranteed by applying an exhaustive search algorithm to check potential solutions. However, this setting is applicable only to small matching problems. Using a lightweight setting, Alcomo can also be applied to larger matching problems, losing both optimality and completeness.</p></div>
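<div xmlns="http://www.tei-c.org/ns/1.0"><p>To illustrate what an exhaustive search for an optimal solution involves, the following sketch enumerates all removal sets for the Jaguar example of Section 2 and keeps the one with the smallest total confidence. It is a toy illustration, not the implementation of Alcomo or ELog. It assumes that the conflict sets have already been computed by a reasoner and are passed in explicitly, so that coherence checking reduces to testing whether every conflict has lost at least one correspondence. The function name and the string labels are hypothetical.</p>
<p><code lang="python">
```python
from itertools import combinations


def optimal_diagnosis(confidences, conflicts):
    """Return the removal set with minimal total confidence that hits every conflict.

    confidences: dict mapping a correspondence label to its confidence.
    conflicts:   list of sets of labels; each set is jointly incoherent
                 (assumed to be supplied by a reasoner).
    """
    names = sorted(confidences)
    best, best_cost = None, float("inf")
    # Brute force over all subsets: exponential, feasible only for tiny inputs.
    for size in range(len(names) + 1):
        for removal in combinations(names, size):
            removed = set(removal)
            # The remaining alignment counts as coherent if no conflict survives intact.
            if all(removed.intersection(conflict) for conflict in conflicts):
                cost = sum(confidences[c] for c in removed)
                if best_cost > cost:
                    best, best_cost = removed, cost
    return best, best_cost


confidences = {"Jaguar1=Jaguar2": 0.9, "Animal1=Animal2": 0.95}
conflicts = [{"Jaguar1=Jaguar2", "Animal1=Animal2"}]
diagnosis, cost = optimal_diagnosis(confidences, conflicts)
```
</code></p>
<p>For this input the search selects the removal of the Jaguar correspondence, which matches option (2) in Section 2. The number of candidate subsets doubles with every additional correspondence, which is consistent with the observation that the exhaustive setting is applicable only to small matching problems.</p></div>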
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4">Metrics</head><p>F-Measure. Precision and recall of an alignment A measure the correctness of A and the completeness of A, respectively. Both measures are defined with respect to a given reference alignment or gold standard G. The F-measure is the harmonic mean of precision and recall. Precision P , recall R, and F-measure F can be formally defined as</p><formula xml:id="formula_1">P = \frac{|A \cap G|}{|A|}, \quad R = \frac{|A \cap G|}{|G|}, \quad \text{and}</formula><formula xml:id="formula_2">F = \frac{2 \cdot P \cdot R}{P + R}.</formula></div>
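<div xmlns="http://www.tei-c.org/ns/1.0"><p>As a small worked example of these definitions, the following sketch computes precision, recall, and F-measure for alignments represented as sets of entity pairs. The function name and the entity labels are hypothetical, and returning 0 when a denominator is empty is an assumption that the definitions above do not cover.</p>
<p><code lang="python">
```python
def evaluate(alignment, reference):
    """Compute precision, recall, and F-measure of an alignment against a reference."""
    correct = len(alignment.intersection(reference))
    # Assumption: empty denominators yield 0.0 instead of raising an error.
    precision = correct / len(alignment) if alignment else 0.0
    recall = correct / len(reference) if reference else 0.0
    denominator = precision + recall
    f_measure = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f_measure


# Hypothetical alignment A with four correspondences, two of which are in G.
A = {("a1", "b1"), ("a2", "b2"), ("a3", "b9"), ("a4", "b4")}
G = {("a1", "b1"), ("a2", "b2"), ("a3", "b3")}
P, R, F = evaluate(A, G)
```
</code></p>
<p>Here two of the four proposed correspondences are correct and two of the three reference correspondences are found, so P = 1/2, R = 2/3, and F = 4/7.</p></div>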
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Number of Unsatisfiable Classes</head><p>The number of unsatisfiable classes is proposed as a quality measure for ontology matching in <ref type="bibr" target="#b9">[10]</ref>. It refers to the number of classes that are unsatisfiable in the merged ontology A ∪ O 1 ∪ O 2 where O 1 and O 2 are the matched ontologies and A is the alignment between them. The smaller the number of unsatisfiable classes, the higher the quality of the alignment. We computed the number of unsatisfiable classes with the HermiT <ref type="bibr" target="#b11">[12]</ref> reasoner since it is known from previous work <ref type="bibr" target="#b7">[8]</ref> that HermiT outperforms other reasoners in the computation of unsatisfiable classes. Unfortunately, we were not able to compute the unsatisfiable classes for the nci and snomed pair within 5 hours and, thus, cannot provide the number of unsatisfiable classes for the large BioMed benchmark. The conference benchmark experiments were performed on a virtual machine with 8 GB RAM and 2 cores with 2.4 GHz. The large BioMed experiments were executed on a virtual machine with 60 GB RAM and 2 cores.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Experimental Results</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">Expressiveness</head><p>In this section we report on experiments that include axiom types with increasing expressiveness. In these experiments we use the ELog debugging system. For the lowest level of expressiveness, we only include subsumption axioms A ⊑ B. For the second level, we add disjointness axioms A ⊓ B ⊑ ⊥. For the third level, we include domain and range restrictions. Finally, for the most expressive level, we include all axioms representable with the DL EL++. The size of the resulting ontologies is shown in Tables <ref type="table" target="#tab_0">1</ref> and <ref type="table" target="#tab_1">2</ref> presented in the previous section. The ELog debugging system is complete with respect to each of the resulting matching problems. With this approach, however, we simulate different types of debugging systems that are restricted to exploit different levels of expressiveness. For example, on the second level we simulate a debugging system that bases its reasoning techniques only on the inter-dependencies between subsumption and disjointness axioms. Note that we analyze results for the ontologies in their full expressiveness in the subsequent section.</p><p>Figure <ref type="figure">1</ref> and Figure <ref type="figure">2</ref> depict the results for the various levels of expressiveness and for different thresholds for the conference and large BioMed benchmarks, respectively. The x-axis shows the different thresholds that we applied prior to the debugging step. 
The results show that the differences between the various levels are more pronounced for lower thresholds. Hence, we put a special emphasis on the threshold areas below 0.2 (for the conference benchmark) and below 0.7 (for the large BioMed benchmark) since results for higher thresholds were nearly identical. Please note that in Figure <ref type="figure">1</ref> the step size in each chart changes at threshold 0.2 from 0.01 to 0.1 since, beyond that threshold, there are only very few logical conflicts.</p><figure xmlns="http://www.tei-c.org/ns/1.0"><head></head><label></label><figDesc>Fig. 2. Results for the large BioMed benchmark. With increasing expressiveness, F-measure scores and running time (in seconds) are increasing. These effects are stronger for lower thresholds since more conflicts occur. We do not provide the number of incoherent classes because the HermiT reasoner did not terminate within 5 hours.</figDesc></figure><p>We observe a positive correlation between increased expressiveness and F-measure scores. Considering only subsumption axioms results in lower scores compared to the setting with additional disjointness axioms. Even higher F-measure scores are achieved if domain and range restrictions are taken into account. The highest F-measure scores are obtained if we incorporate all EL++ axioms. This also holds for the choice of a well-suited threshold in the range of 0.15 to 0.2 in case of the conference benchmark, where we clearly observe the benefits of exploiting the full expressiveness of EL++. As expected, the number of unsatisfiable classes (center figure of Figure <ref type="figure">1</ref>) is higher for settings with decreased expressiveness. For the subsumption-only configuration, we observe the highest number of unsatisfiable classes in the final alignment. On the other hand, there are only few unsatisfiable classes for the EL++ setting. Aside from the F-measure results, this is another indication of an improved alignment quality. 
The reason why we obtain unsatisfiable classes at all for EL++ expressiveness is that the expressiveness of our underlying ontologies is higher than EL++. In case of the large BioMed benchmark the HermiT reasoner was not able to determine the number of unsatisfiable classes within 5 hours. Thus, we do not provide a graphic for this benchmark.</p><p>Also as expected, runtimes (right figures) increase with the number of conflicts that have to be resolved and are therefore higher for low thresholds. Furthermore, runtimes also increase with increasing expressiveness. This is in line with our expectation, because a higher level of expressiveness also results in a more complex optimization problem that needs to be solved when computing the most probable coherent ontology query.</p><p>In summary, the results show that the alignment quality increases with an increase in expressiveness. F-measure scores are higher and the number of unsatisfiable classes is lower if expressiveness increases. We can also conclude that a debugging system that is more complete in terms of the supported expressivity will generate better results compared to a less complete system. Runtimes, however, increase with higher expressiveness. This shows a trade-off between runtime and alignment quality depending on the choice of the supported expressiveness.</p><figure xmlns="http://www.tei-c.org/ns/1.0"><head></head><label></label><figDesc>Fig. 3. Results for ELog compared with other approaches on the conference benchmark. For lower thresholds, optimal approaches achieve a higher F-measure than approximate approaches but require a longer runtime. The number of unsatisfiable classes is low (1.7% or lower) for all systems. For thresholds lower than 0.12 the HermiT reasoner failed to compute the number of incoherent classes. In total, the conference benchmark contains 2,973 classes. Runtimes are given in seconds.</figDesc></figure></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">Approximate vs. optimal solutions</head><p>In this section, we experimentally address the question whether optimal algorithms lead to higher quality than approximate algorithms. To that end, we compare the log-linear description logic system ELog and the optimal algorithm of Alcomo against the approximate algorithms of LogMap and Alcomo. <ref type="foot" target="#foot_4">7</ref> The results for the conference and large BioMed benchmark are depicted in Figure <ref type="figure">3</ref> and Figure <ref type="figure">4</ref>, respectively. Again, we focus on the discussion of results for thresholds below 0.2 (for the conference benchmark) and 0.7 (for the large BioMed benchmark).</p><p>The system ELog and the optimal algorithm of Alcomo achieve the highest F-measure scores (left figures). The approximate algorithms of Alcomo and LogMap reach lower F-measure scores. The difference in F-measure results between ELog and the optimal algorithm of Alcomo is due to the fact that the associated optimization problems often have more than one solution. Each of these optimal solutions has the same objective value, i.e., the confidence total of the resulting alignments is the same, but sometimes a different F-measure score. Thus, ELog might choose a different optimum than the optimal algorithm of Alcomo.</p><p>ELog has the highest number of unsatisfiable classes (center figure of Figure <ref type="figure">3</ref>) of all three algorithms. However, the 53 unsatisfiable classes amount to only 1.7% of the 2,973 classes in total. As explained above, ELog is complete only for EL++. 
Thus, all inconsistencies were caused by axioms that are out of the scope of EL++. The results indicate that the restricted expressivity seems to be less important than the optimality of the solution, since ELog generates at the same time results with the best F-measure.</p><p>The approximate algorithms of LogMap and Alcomo are more efficient, especially for lower thresholds. In case of the conference benchmark, ELog outperforms the approximate Alcomo algorithm for thresholds higher than 0.15. Except for the thresholds of 0.11 and 0.12, the exact Alcomo algorithm is slower than ELog and does not terminate within one hour for thresholds below 0.09. For the large BioMed benchmark, the approximate algorithms are faster. For thresholds below 0.7 the exact Alcomo algorithm does not terminate within one hour. LogMap achieves by far the best runtime results, which is also supported by the results reported in <ref type="bibr" target="#b7">[8]</ref>. This is (at least partially) caused by incomplete reasoning and non-optimal conflict resolution techniques.</p><p>The non-optimal variant of Alcomo and LogMap generate very similar alignments. This becomes obvious when comparing the F-measure scores presented in the left plots of Figures <ref type="figure">3</ref> and <ref type="figure">4</ref>. Evidently, the systems show a similar behaviour and seem to apply a similar conflict resolution strategy. The same observation can be made for the optimal variant of Alcomo and ELog. Thus, the distinction between optimal and non-optimal algorithms becomes visible in the threshold/F-measure plots, which supports the importance of this distinction.</p><p>Overall, we can conclude that optimal systems achieve higher F-measure scores than the approximate algorithms. With respect to runtime, the approximate algorithms are faster than the optimal approaches. In particular, LogMap outperforms all other systems. Furthermore, ELog has shorter runtimes than the optimal algorithm of Alcomo. 
This is remarkable since LogMap and Alcomo are specialized for ontology matching, whereas ELog is a general reasoner for log-linear description logics. The specialized systems leverage the fact that weighted axioms can only occur between ontologies and that those axioms are either subsumption or equivalence axioms.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Conclusions</head><p>Our experiments indicate that an increase in expressiveness leads to an increase in F-measure scores. Furthermore, the comparison of approximate and optimal ontology alignment repairing systems shows that optimal approaches achieve better F-measure scores. However, we observe a trade-off between F-measure and runtime. Runtimes are longer for higher expressiveness and optimal approaches have, on average, longer runtimes than approximate approaches. Thus, we advise users to employ optimal approaches for non-time-critical data integration tasks. If real-time ontology alignment is required, we recommend the use of approximate approaches combined with reasoning techniques that might be incomplete.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head></head><label></label><figDesc>Fig. 1. Results for the conference benchmark. With increasing expressiveness, F-measure and runtime (in seconds) are increasing. In contrast, the number of incoherent classes decreases. These effects are stronger for lower thresholds since more conflicts occur. For thresholds lower than 0.12 the HermiT reasoner failed to compute the number of incoherent classes. In total, the conference benchmark contains 2,973 classes.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head></head><label></label><figDesc>Fig. 4. Results for ELog compared with other approaches on the large BioMed benchmark. For lower thresholds, optimal approaches achieve a higher F-measure than approximate approaches but require a longer runtime. We do not provide the number of incoherent classes because the HermiT reasoner did not terminate within 5 hours. Runtimes are given in seconds.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 .</head><label>1</label><figDesc>Number of classes, properties, and axioms of the conference ontologies.</figDesc><table><row><cell></cell><cell>cmt</cell><cell>conf.</cell><cell>confof</cell><cell>edas</cell><cell>ekaw</cell><cell>iasted</cell><cell>sigkdd</cell></row><row><cell>classes</cell><cell>36</cell><cell>60</cell><cell>38</cell><cell>104</cell><cell>77</cell><cell>140</cell><cell>49</cell></row><row><cell>properties</cell><cell>59</cell><cell>64</cell><cell>36</cell><cell>50</cell><cell>33</cell><cell>41</cell><cell>28</cell></row><row><cell>subsumption</cell><cell>25</cell><cell>49</cell><cell>33</cell><cell>84</cell><cell>71</cell><cell>132</cell><cell>41</cell></row><row><cell>+ disjointness</cell><cell>52</cell><cell>63</cell><cell>76</cell><cell>491</cell><cell>145</cell><cell>133</cell><cell>41</cell></row><row><cell>+ domain and range restrictions</cell><cell>149</cell><cell>149</cell><cell>100</cell><cell>543</cell><cell>184</cell><cell>193</cell><cell>73</cell></row><row><cell>+ all other EL++ axioms</cell><cell>263</cell><cell>331</cell><cell>293</cell><cell>865</cell><cell>309</cell><cell>505</cell><cell>186</cell></row><row><cell>every axiom</cell><cell>318</cell><cell>408</cell><cell>335</cell><cell>903</cell><cell>341</cell><cell>539</cell><cell>193</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2 .</head><label>2</label><figDesc>Number of classes, properties, and deterministic axioms of the large BioMed ontologies. For each ontology there exist two fragments. In the case of the fma ontology, for example, one fragment contains the axioms overlapping with the nci ontology and one fragment contains the axioms overlapping with the snomed ontology.</figDesc><table><row><cell></cell><cell>fmanci</cell><cell>fmasnomed</cell><cell>ncifma</cell><cell>ncisnomed</cell><cell>snomedfma</cell><cell>snomednci</cell></row><row><cell>classes</cell><cell>3696</cell><cell>10157</cell><cell>6488</cell><cell>23958</cell><cell>13412</cell><cell>51128</cell></row><row><cell>properties</cell><cell>24</cell><cell>24</cell><cell>63</cell><cell>82</cell><cell>18</cell><cell>51</cell></row><row><cell>subsumption</cell><cell>3693</cell><cell>10154</cell><cell>4917</cell><cell>18946</cell><cell>16287</cell><cell>31299</cell></row><row><cell>+ disjointness</cell><cell>3732</cell><cell>10196</cell><cell>5022</cell><cell>19099</cell><cell>16287</cell><cell>31299</cell></row><row><cell>+ domain and range restrictions</cell><cell>3732</cell><cell>10196</cell><cell>5130</cell><cell>19233</cell><cell>16287</cell><cell>31299</cell></row><row><cell>+ all other EL++ axioms</cell><cell>7521</cell><cell>20449</cell><cell>14269</cell><cell>50218</cell><cell>33673</cell><cell>122221</cell></row><row><cell>every axiom (without annotations)</cell><cell>7548</cell><cell>20478</cell><cell>15634</cell><cell>54452</cell><cell>47104</cell><cell>122221</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_0">http://sig.biostr.washington.edu/projects/fm/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_1">http://ncit.nci.nih.gov/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_2">http://www.ihtsdo.org/index.php?id=545</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_3">http://www.cs.ox.ac.uk/isg/projects/SEALS/oaei/2013/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="7" xml:id="foot_4">Alcomo can be executed in different settings. We refer to the setting using the parameters METHOD OPTIMAL/REASONING COMPLETE as the optimal algorithm and to the setting METHOD GREEDY/REASONING EFFICIENT as the approximate algorithm. Note that the latter setting is incomplete and does not generate an optimal solution.</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Schema and ontology matching with coma++</title>
		<author>
			<persName><forename type="first">D</forename><surname>Aumueller</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">H</forename><surname>Do</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Massmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Rahm</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 24th International Conference on Management of Data (SIGMOD)</title>
				<meeting>the 24th International Conference on Management of Data (SIGMOD)</meeting>
		<imprint>
			<date type="published" when="2005">2005</date>
			<biblScope unit="page" from="906" to="908" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Similarity-based ontology alignment in OWL-lite</title>
		<author>
			<persName><forename type="first">J</forename><surname>Euzenat</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Valtchev</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 16th European Conference on Artificial Intelligence (ECAI)</title>
				<meeting>the 16th European Conference on Artificial Intelligence (ECAI)</meeting>
		<imprint>
			<date type="published" when="2004">2004</date>
			<biblScope unit="page" from="333" to="337" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<title level="m" type="main">Ontology matching</title>
		<author>
			<persName><forename type="first">J</forename><surname>Euzenat</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Shvaiko</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2007">2007</date>
			<publisher>Springer</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Semantic matching</title>
		<author>
			<persName><forename type="first">F</forename><surname>Giunchiglia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Shvaiko</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Knowledge Eng. Review</title>
		<imprint>
			<biblScope unit="volume">18</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="265" to="280" />
			<date type="published" when="2003">2003</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Results of the ontology alignment evaluation initiative</title>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">C</forename><surname>Grau</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Dragisic</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Eckert</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Euzenat</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Ferrara</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Granada</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Ivanova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Jiménez-Ruiz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">O</forename><surname>Kempf</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Lambrix</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Nikolov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Paulheim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Ritze</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Scharffe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Shvaiko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">T</forename><surname>Dos Santos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Zamazal</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 8th Ontology Matching Workshop</title>
				<meeting>the 8th Ontology Matching Workshop</meeting>
		<imprint>
			<date type="published" when="2013">2013</date>
			<biblScope unit="page" from="61" to="100" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Ontology matching with semantic verification</title>
		<author>
			<persName><forename type="first">Y</forename><forename type="middle">R</forename><surname>Jean-Mary</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">P</forename><surname>Shironoshita</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">R</forename><surname>Kabuka</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">J. Web Sem</title>
		<imprint>
			<biblScope unit="volume">7</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="235" to="251" />
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">LogMap: Logic-based and scalable ontology matching</title>
		<author>
			<persName><forename type="first">E</forename><surname>Jiménez-Ruiz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">C</forename><surname>Grau</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 10th International Semantic Web Conference (ISWC)</title>
				<meeting>the 10th International Semantic Web Conference (ISWC)</meeting>
		<imprint>
			<date type="published" when="2011">2011</date>
			<biblScope unit="page" from="273" to="288" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Evaluating mapping repair systems with large biomedical ontologies</title>
		<author>
			<persName><forename type="first">E</forename><surname>Jiménez-Ruiz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Meilicke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">C</forename><surname>Grau</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Horrocks</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 26th International Workshop on Description Logics</title>
				<meeting>the 26th International Workshop on Description Logics</meeting>
		<imprint>
			<date type="published" when="2013">2013</date>
			<biblScope unit="page" from="246" to="257" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<monogr>
		<title level="m" type="main">Alignment incoherence in ontology matching</title>
		<author>
			<persName><forename type="first">C</forename><surname>Meilicke</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2011">2011</date>
		</imprint>
		<respStmt>
			<orgName>University of Mannheim</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">Ph.D. thesis</note>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Incoherence as a basis for measuring the quality of ontology mappings</title>
		<author>
			<persName><forename type="first">C</forename><surname>Meilicke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Stuckenschmidt</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 3rd Ontology Matching Workshop (OM)</title>
				<meeting>the 3rd Ontology Matching Workshop (OM)</meeting>
		<imprint>
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Repairing ontology mappings</title>
		<author>
			<persName><forename type="first">C</forename><surname>Meilicke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Stuckenschmidt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Tamilin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 22nd Conference on Artificial Intelligence (AAAI)</title>
				<meeting>the 22nd Conference on Artificial Intelligence (AAAI)</meeting>
		<imprint>
			<date type="published" when="2007">2007</date>
			<biblScope unit="page" from="1408" to="1413" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Hypertableau reasoning for description logics</title>
		<author>
			<persName><forename type="first">B</forename><surname>Motik</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Shearer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Horrocks</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Artificial Intelligence Research</title>
		<imprint>
			<biblScope unit="volume">36</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="165" to="228" />
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Yam++-a combination of graph matching and machine learning approach to ontology alignment task</title>
		<author>
			<persName><forename type="first">D</forename><surname>Ngo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Bellahsene</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Web Semantics</title>
		<imprint>
			<biblScope unit="volume">16</biblScope>
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Log-linear description logics</title>
		<author>
			<persName><forename type="first">M</forename><surname>Niepert</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Noessner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Stuckenschmidt</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 22nd International Joint Conference on Artificial Intelligence (IJCAI)</title>
				<meeting>the 22nd International Joint Conference on Artificial Intelligence (IJCAI)</meeting>
		<imprint>
			<date type="published" when="2011">2011</date>
			<biblScope unit="page" from="2153" to="2158" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<monogr>
		<title level="m" type="main">Efficient Maximum A-Posteriori Inference in Markov Logic and Application in Description Logics</title>
		<author>
			<persName><forename type="first">J</forename><surname>Noessner</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2014">2014</date>
		</imprint>
		<respStmt>
			<orgName>University of Mannheim</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">Ph.D. thesis</note>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">ELog: A probabilistic reasoner for OWL EL</title>
		<author>
			<persName><forename type="first">J</forename><surname>Noessner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Niepert</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 5th Conference on Web Reasoning and Rule Systems (RR)</title>
				<meeting>the 5th Conference on Web Reasoning and Rule Systems (RR)</meeting>
		<imprint>
			<date type="published" when="2011">2011</date>
			<biblScope unit="page" from="281" to="286" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Ontology evolution: Not the same as schema evolution</title>
		<author>
			<persName><forename type="first">N</forename><surname>Noy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Klein</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Knowledge and Information Systems</title>
		<imprint>
			<biblScope unit="volume">6</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page" from="428" to="440" />
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Algorithm and tool for automated ontology merging and alignment</title>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">F</forename><surname>Noy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">A</forename><surname>Musen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 17th National Conference on Artificial Intelligence</title>
				<meeting>the 17th National Conference on Artificial Intelligence</meeting>
		<imprint>
			<date type="published" when="2000">2000</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">A survey of approaches to automatic schema matching</title>
		<author>
			<persName><forename type="first">E</forename><surname>Rahm</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">A</forename><surname>Bernstein</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">VLDB Journal</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<date type="published" when="2001">2001</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Kosimap: Use of description logic reasoning to align heterogeneous ontologies</title>
		<author>
			<persName><forename type="first">Q</forename><surname>Reul</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">Z</forename><surname>Pan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">23rd International Workshop on Description Logics DL2010</title>
				<imprint>
			<publisher>Citeseer</publisher>
			<date type="published" when="2010">2010</date>
			<biblScope unit="page">489</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Ontofarm: Towards an experimental collection of parallel ontologies</title>
		<author>
			<persName><forename type="first">O</forename><surname>Šváb</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Svátek</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Berka</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Rak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Tomášek</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Poster Track of ISWC</title>
				<imprint>
			<date type="published" when="2005">2005</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
