<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">On direct debugging of aligned ontologies</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Kostyantyn</forename><surname>Shchekotykhin</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Alpen-Adria Universität</orgName>
								<address>
									<postCode>9020</postCode>
									<settlement>Klagenfurt</settlement>
									<country key="AT">Austria</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Philipp</forename><surname>Fleiss</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Alpen-Adria Universität</orgName>
								<address>
									<postCode>9020</postCode>
									<settlement>Klagenfurt</settlement>
									<country key="AT">Austria</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Patrick</forename><surname>Rodler</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Alpen-Adria Universität</orgName>
								<address>
									<postCode>9020</postCode>
									<settlement>Klagenfurt</settlement>
									<country key="AT">Austria</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Gerhard</forename><surname>Friedrich</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Alpen-Adria Universität</orgName>
								<address>
									<postCode>9020</postCode>
									<settlement>Klagenfurt</settlement>
									<country key="AT">Austria</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">On direct debugging of aligned ontologies</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">D9A9F667673862FB87FE16D1987281B7</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T11:45+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Modern ontology debugging methods allow efficient identification and localization of faulty axioms defined by a user while developing an ontology. However, in many use cases such as ontology alignment the ontologies might include many conflict sets, i.e. sets of axioms preserving the faults, thus making the ontology diagnosis infeasible. In this paper we present a debugging approach based on a direct computation of diagnoses that omits calculation of conflict sets. The proposed algorithm is able to identify diagnoses for an ontology which includes a large number of faults and for which application of standard diagnosis methods fails. The evaluation results show that the approach is practicable and is able to identify a fault in adequate time.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Motivation and algorithm details</head><p>Ontology development and maintenance relies on an ability of users to express their knowledge in form of logical axioms. However, the knowledge acquisition process might be problematic since a user can make a mistake in an axiom being modified or a correctly specified axiom can trigger a hidden bug in an ontology. These bugs might be of a different nature and result in violation of such requirements as consistency of an ontology or satisfiability of its classes. In such scenarios as ontology matching the complexity of faults might be very high because multiple disagreements between ontological definitions and/or modeling problems can be triggered by aliments at once.</p><p>Ontology debuggers simplify the development by allowing their users specification of requirements to the intended (target) ontology. In addition, a user can provide B set of background (correct) axioms and sets of positive P and negative N test cases. A positive test case is a set of axioms that must be entailed by the intended ontology, whereas a negative test case must not. If any of the requirements or test cases are broken, i.e. an ontology O is faulty, then the tuple O, B, P, N is a diagnosis problem instance. For a given problem instance a debugging tool computes a set of axioms D ⊆ O called diagnosis. An expert should remove or modify at least all axioms of a diagnosis in order to be able to formulate the target ontology O t .</p><p>Nevertheless, in real-world scenarios debugging tools can return a set of alternative diagnoses D. The reason is the practical impossibility for a user to specify such a set of requirements and test cases, prior to a debugging session, providing all information required for identification of the target diagnosis D t , i.e. the diagnosis which application allows formulation of the intended ontology. Diagnosis discrimination methods allow their users to reduce the number of diagnoses to be considered. An interactive algorithm suggested in <ref type="bibr" target="#b3">[4]</ref> identifies the target diagnosis by asking an oracle a sequence of questions: whether some axiom is entailed by the target ontology or not. A general interactive ontology diagnosis algorithm can be described as follows: (1) Generate a set of diagnoses D including at most n diagnoses. (2) Compute a set of queries and select the best one using a predefined measure. (3) Ask the oracle and, depending on the answer, add the query either to P or to N . (4) Update the set of diagnoses D and remove the ones that do not comply with the newly acquired test case. (5) Update the tree and repeat from Step 1 if the tree contains open nodes. (6) Return the set of diagnoses D. The resulting set D includes only diagnoses that are not differentiable in terms of their entailments, but have some syntactical differences. The preferred target diagnosis D t , in this case, should be selected by the user using some ontology editor such as Protégé.</p><p>Most of the debugging approaches, including <ref type="bibr" target="#b3">[4]</ref>, follow the standard model-based diagnosis approach <ref type="bibr" target="#b2">[3]</ref> and compute diagnoses using conflict sets CS, i.e. irreducible sets of axioms ax i ∈ O that preserve violation of at least one requirement. The computation of one conflict set can be done within a polynomial number of calls to the reasoner, e.g. by QUICKXPLAIN algorithm <ref type="bibr" target="#b1">[2]</ref>. To identify a diagnosis of cardinality |D| = m the hitting set algorithm suggested in <ref type="bibr" target="#b2">[3]</ref> requires computation of m conflict sets. In some practical scenarios, such ontology matching, the number of conflict sets m can be large, thus making the ontology debugging practically infeasible.</p><p>In this paper we present two algorithms INV-HS-TREE and INV-QUICKXPLAIN, which inverse the standard model-based approach to ontology debugging and compute diagnoses directly, rather than by means of conflict sets (see <ref type="bibr" target="#b4">[5]</ref> for a detailed description of the algorithms). The main function of the latter algorithm splits recursively the initial diagnosis problem instance into two sub-problems by partitioning the axioms of a given faulty ontology into two subsets. In many cases SPLIT simply partitions the set of axioms into two sets of equal cardinality. The algorithm continues to divide diagnosis problems until it identifies a set D such that O \ D fulfills all the requirements, but O \ D i , where D i are partitions of D , not. In further iterations the algorithm minimizes the D by splitting it into sub-problems of the form D = D ∪ O ∆ , where O ∆ contains only one axiom. In the case when D is a diagnosis and D is not, the algorithm decides that O ∆ is a subset of the sought diagnosis. Just as the original algorithm, INV-QUICKXPLAIN always terminates and returns either a diagnosis D or "no diagnosis".</p><p>In order to enumerate all possible diagnoses we modified the HS-TREE algorithm <ref type="bibr" target="#b2">[3]</ref> to accept diagnoses as node labels instead of conflict sets. The INV-HS-TREE algorithm constructs a directed tree from root to the leaves, where each node nd is labeled either with a diagnosis D(nd) or (closed) or × (pruned). The latter two labels indicate that the node cannot be extended. Each edge outgoing from the open node nd is labeled with an element s ∈ D(nd). HS(nd) is a set of edge labels on the path from the root to the node nd. Initially the algorithm creates an empty root node and adds it to the queue, thus, implementing a breadth-first search strategy. Until the queue is empty, the algorithm retrieves the first node nd from the queue and labels it with either:</p><p>1. (closed). The leaf nodes of a complete tree are either pruned (×) or closed ( ) nodes.</p><formula xml:id="formula_0">×</formula><p>In the diagnosis discrimination settings the ontology debugger acquires new knowledge that can invalidate some of the diagnoses labeling the tree nodes. During the tree update INV-HS-TREE searches for the nodes with invalid labels, removes its label and places it to the list of open nodes. Moreover, the algorithm removes all the nodes of a subtree originating from this node. After all nodes with invalid labels are cleanedup, the algorithm attempts to reconstruct the tree by reusing the remaining valid minimal diagnoses (rule 2, INV-HS-TREE). Such aggressive pruning of the tree is feasible since a) the tree never contains more than n nodes that were computed with INV-QUICKXPLAIN and b) computation of a possible modification to the minimal diagnosis, that can restore its validness, requires invocation of INV-QUICKXPLAIN and, therefore, as hard as computation of a new diagnosis. Note also, that in a common diagnosis discrimination setting n is often set to a small number, e.g. 10, in order to achieve good responsiveness of the system. In the direct approach this setting limits the number of calls to INV-QUICKXPLAIN to n and results in a small size of the search tree. The latter is another advantage of the direct method as it requires much less memory in comparison to a debugger based on original HS-TREE.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Evaluation</head><p>We evaluated the direct ontology debugging technique using aligned ontologies generated in the framework of OAEI 2011 <ref type="bibr" target="#b0">[1]</ref>. These ontologies represent a real-world scenario in which a user generated ontology alignments by means of (semi-)automatic tools. All ontologies used in the experiments are available online, together with detailed evaluation results <ref type="foot" target="#foot_1">1</ref> .</p><p>In the first experiment we applied the debugging technique to the set of aligned ontologies resulting from "Conference" set of problems, which is characterized by lower precision and recall of the applied systems (the best F-measure 0.65) in comparison, for instance, to the "Anatomy" problem (average F-measure 0.8). The Conference test suite includes 286 ontology alignments generated by 14 ontology matching systems including: a) 140 ontologies are consistent and coherent; b) 122 ontologies are incoherent; c) 26 ontologies are inconsistent; and in 8 cases a reasoner was unable to classify in two hours <ref type="foot" target="#foot_2">2</ref> . Note that only two systems CODI and MaasMtch out of 14 were able to generate consistent and coherent alignments. This observation confirms the importance of high-performance ontology debugging methods.</p><p>The 146 ontologies of the cases b) and c) were analyzed with both HS-TREE and INV-HS-TREE. The results of the experiment show that for 133 ontologies out of 146 both approaches were able to compute the required amount of diagnoses. HT-TREE required 49, 61 and 80 sec. on average to find 1, 9 and 30 leading minimal diagnoses. The direct algorithm required less time and returned the requested number of diagnoses on average in 27, 52 and 72 sec. respectively. In the 13 cases HS-TREE was unable to find all requested diagnoses in each experiment. Within 2 hours the algorithm calculated only 1 diagnosis for csa-conference-ekaw and for ldoa-conference-confof it was able to find 1 and 9 diagnoses, whereas INV-HS-TREE required 9 sec. for 1, 40 sec. for 9 and 107 sec. for 30 diagnoses on average. Moreover, in the first experiment we evaluated the efficiency of the interactive direct debugging approach applied to the cases listed in Table <ref type="table" target="#tab_1">1</ref> alignments computed from the set of correct alignments M c , provided by the organizers of OAEI 2011, and the set M ij generated by a ontology matching system. In the experiment the used the Entropy scoring function <ref type="bibr" target="#b3">[4]</ref> with prior fault probabilities of axioms corresponding to aliments set to 1 − v, where v is the confidence value of the matcher for an axiom. All axioms of O i and O j were assumed to be correct and were assigned small probabilities. The results presented in Table <ref type="table" target="#tab_1">1</ref> were computed for the diagnosis problem instance M ij , O i ∪ O j , ∅, ∅ . The experiment shows that the system was able to identify the target diagnosis efficiently with small reaction times. The system's performance decreased only in the cases when a reasoner required much time to verify the consistency of an ontology.</p><p>In the second scenario we applied the direct method to unsatisfiable and classifiable within 2 hours ontologies, generated for the Anatomy problem. The source ontologies O 1 and O 2 include 11545 and 4838 axioms correspondingly, whereas the size of the alignments varies between 1147 and 1461 axioms. The target diagnosis selection process was performed in the same way as in the first experiment. The results of the experiment show that the target diagnosis can be computed within 40 second in an average case. Moreover, INV-HS-TREE slightly outperformed HS-TREE.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head></head><label></label><figDesc>if there is a node nd , labeled with either or ×, such that H(nd ) ⊆ H(nd) (pruning non-minimal paths), or 2. D(nd ) if a node nd exists such that its label D(nd ) ∩ H(nd) = ∅ (reuse), or 3. D if D is a diagnosis for the diagnosis problem instance O, B ∪ H(nd), P, N computed by INV-QUICKXPLAIN (compute), or 4.</figDesc><table /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 1 .</head><label>1</label><figDesc>. We selected the target diagnosis randomly among all diagnoses of the following diagnosis problem instanceM f , O i ∪ O j ∪ M t , ∅, ∅, where M f and M t are the sets of false and true positive Diagnosis discrimination using direct ontology debugging and Entropy scoring function. React time required to compute 9 diagnoses and a query, #CC number of consistency checks, CC gives average time needed for one consistency check.</figDesc><table><row><cell cols="7">Matcher Ontology 1 Ontology 2 Time (sec) #Query React (sec) #CC CC (ms)</cell></row><row><cell>ldoa</cell><cell cols="2">conference confof</cell><cell>11.6</cell><cell>6</cell><cell>1.4 430</cell><cell>3</cell></row><row><cell>ldoa</cell><cell>cmt</cell><cell>ekaw</cell><cell>48.5</cell><cell>21</cell><cell>2.2 603</cell><cell>16</cell></row><row><cell cols="2">mappso confof</cell><cell>ekaw</cell><cell>9.9</cell><cell>5</cell><cell>1.8 341</cell><cell>7</cell></row><row><cell cols="3">optima conference ekaw</cell><cell>16.7</cell><cell>5</cell><cell>2.5 553</cell><cell>8</cell></row><row><cell cols="2">optima confof</cell><cell>ekaw</cell><cell>23.9</cell><cell>20</cell><cell>1.1 313</cell><cell>14</cell></row><row><cell>ldoa</cell><cell cols="2">conference ekaw</cell><cell>56.6</cell><cell>35</cell><cell>1.4 253</cell><cell>53</cell></row><row><cell>csa</cell><cell cols="2">conference ekaw</cell><cell>6.7</cell><cell>2</cell><cell>2.7 499</cell><cell>3</cell></row><row><cell cols="3">mappso conference ekaw</cell><cell>27.4</cell><cell>13</cell><cell>1.8 274</cell><cell>28</cell></row><row><cell>ldoa</cell><cell>cmt</cell><cell>edas</cell><cell>24.7</cell><cell>22</cell><cell>1.1 303</cell><cell>8</cell></row><row><cell>csa</cell><cell cols="2">conference edas</cell><cell>18.4</cell><cell>6</cell><cell>2.7 419</cell><cell>5</cell></row><row><cell>csa</cell><cell>edas</cell><cell>iasted</cell><cell>1744.6</cell><cell>3</cell><cell>349.2 1021</cell><cell>1333</cell></row><row><cell>ldoa</cell><cell>ekaw</cell><cell>iasted</cell><cell>23871.4</cell><cell>10</cell><cell cols="2">1885.9 287 72607</cell></row><row><cell cols="2">mappso edas</cell><cell>iasted</cell><cell>18400.2</cell><cell>5</cell><cell cols="2">2028.2 723 17844</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" xml:id="foot_0">This research is funded by Austrian Science Fund (Project V-Know, contract 19996).</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_1">http://code.google.com/p/rmbd/wiki/DirectDiagnosis</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_2"> </note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_3">.2Ghz, 32GB RAM running Ubuntu 11.04, Java 6 and HetmiT 1.3.6.</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Final results of the Ontology Alignment Evaluation Initiative</title>
		<author>
			<persName><forename type="first">J</forename><surname>Euzenat</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Ferrara</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">R</forename><surname>Van Hage</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Hollink</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Meilicke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Nikolov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Ritze</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Scharffe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Shvaiko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Stuckenschmidt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Sváb-Zamazal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">T</forename><surname>Dos Santos</surname></persName>
		</author>
		<ptr target="CEUR-WS.org" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 6th International Workshop on Ontology Matching</title>
				<meeting>the 6th International Workshop on Ontology Matching</meeting>
		<imprint>
			<date type="published" when="2011">2011. 2011</date>
			<biblScope unit="page" from="1" to="29" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">QUICKXPLAIN: Preferred Explanations and Relaxations for Over-Constrained Problems</title>
		<author>
			<persName><forename type="first">U</forename><surname>Junker</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. 19th National Conference on Artificial Intelligence</title>
				<meeting>19th National Conference on Artificial Intelligence</meeting>
		<imprint>
			<date type="published" when="2004">2004</date>
			<biblScope unit="page" from="167" to="172" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">A Theory of Diagnosis from First Principles</title>
		<author>
			<persName><forename type="first">R</forename><surname>Reiter</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Artificial Intelligence</title>
		<imprint>
			<biblScope unit="volume">32</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="57" to="95" />
			<date type="published" when="1987">1987</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Interactive ontology debugging : two query strategies for efficient fault localization</title>
		<author>
			<persName><forename type="first">K</forename><surname>Shchekotykhin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Friedrich</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Fleiss</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Rodler</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Web Semantics: Science, Services and Agents on the World Wide Web</title>
		<imprint>
			<biblScope unit="volume">12</biblScope>
			<biblScope unit="issue">13</biblScope>
			<biblScope unit="page" from="88" to="103" />
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<title level="m" type="main">Direct computation of diagnoses for ontology debugging</title>
		<author>
			<persName><forename type="first">K</forename><surname>Shchekotykhin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Friedrich</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Fleiss</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Rodler</surname></persName>
		</author>
		<idno>arXiv 1-16</idno>
		<ptr target="http://arxiv.org/abs/1209.0997" />
		<imprint>
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
