<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Distributed Resolution for ALC -First Results</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Anne</forename><surname>Schlicht</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">University of Mannheim</orgName>
								<address>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Heiner</forename><surname>Stuckenschmidt</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">University of Mannheim</orgName>
								<address>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Distributed Resolution for ALC -First Results</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">210FDF4F443CC210A0886D3856025C4E</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-25T02:21+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>The use of description logic as the basis for semantic web languages has led to new requirements with respect to scalable and non-standard reasoning. Description logic is a decidable fragment of FOL but still, the standard reasoning tasks are of exponential complexity, satisfiability and subsumption tests are often intractable on large ontologies. Existing large ontologies have a modular structure like networks of linked ontologies, caused by the development process. However, current reasoning approaches do scarcely take advantage of this structure. The available reasoners do not exploit parallel computation and scalability improvements enabled by distributed reasoning. In this paper, we lay the foundation for developing distributed reasoning methods by showing that the description logic fragment ALC can be distributed. We propose a distributed, complete and terminating algorithm that decides satisfiability of terminologies in ALC. The algorithm is based on recent results on applying resolution to description logics. We show that the resolution procedure proposed by Tammet can be distributed amongst multiple resolution solvers by assigning unique sets of literals to individual solvers. This provides the basis for a highly scalable reasoning infrastructure for description logics. P air Set {¬P air(x1), Set(x1)} P air ∀part.¬Set {¬P air(x2), ¬part(x2, x3), ¬Set(x3)} Set ∃part.Set {¬Set(x4), part(x4, f (x4))}, {¬Set(x4), Set(f (x4))} P air(a) {P air(a)}</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>The use of description logics as one of the primary logical languages for knowledge representation on the Web has created new challenges with respect to reasoning in these logics. In order to be able to support the vision of a semantic web of interrelated ontologies, reasoning procedures have to be highly scalable and able to deal with physically distributed knowledge models. A natural way of addressing these problems is to rely on distributed inference procedures that can distribute the load between different solvers thus reducing potential bottlenecks both in terms of memory and computation time. Currently, almost all the work on description logic reasoning still assumes a centralized approach where the complete terminology has to be present on a single system and all inference steps are are carried out on this system. Exceptions to this rule are approaches like Distributed Description Logics (DDL) <ref type="bibr" target="#b1">[2]</ref> that support local reasoning at the price of sacrificing expressiveness in the links between local models and by dropping some formal properties on the level of the overall model. In DDL for example, certain types of inconsistencies are not propagated on the global level. The goal of our work is to support local reasoning in description logics without the need to reduce the expressiveness of links between local models. Based on results in resolution reasoning Fig. <ref type="figure" target="#fig_0">1</ref>. ALC example and equisatisfiable clauses. This example states that every pair is a set, every part of a pair is not a set and every set has a part that is a set. For testing satisfiability of the concept P air we additionally state that it has an instance a. The translation to FOL clauses requires replacing the existencial quantified variable y with a skolem function.</p><p>for description logic <ref type="bibr" target="#b7">[8]</ref>, we present an algorithm that decides satisfiability of a set of ALC ontologies. This algorithm allows us to provide distributed reasoning support on sets of terminologies that share non-logical symbols without merging the modules. A possible application is the provision of distributed reasoning support for (ALC equivalent parts of) OWL ontologies linked by import statements. Our method guarantees that the global semantics is preserved.</p><p>The contribution of this paper is threefold: <ref type="bibr" target="#b0">(1)</ref> We identify general requirements for a resolution procedure needed to guaranteeing soundness and completeness. <ref type="bibr" target="#b1">(2)</ref> we show that the resolution method proposed in <ref type="bibr" target="#b7">[8]</ref> satisfies these requirements. <ref type="bibr" target="#b2">(3)</ref> we present first results on the practical applicability of the algorithm. The paper is organized as follows: after a brief discussion of related work in the remainder of this section, we first explain the general idea of distributing standard resolution and discuss the problems that have to be solved for guaranteeing completeness. We then discuss ordered resolution as a adequate basis for distribution and its adaptation to ALC knowledge bases before turning to the distributed ordered resolution algorithm. Finally, we present the first results of this method and discuss possible optimizations.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.1">Related Work</head><p>Modular DL Reasoning Current approaches to distributed reasoning on description logic mostly rely on tableaux methods. Distribution by solving the two choices of nondeterminitic tableaux rules in parallel is difficult as it hampers the application of optimization and blocking strategies. Instead most distributed tableaux approaches try to identify all possible conflicts, i.e. all axioms that might follow from another module and would cause a contradiction and send these as queries to the other modules. So far, this is only done for links with rather restricted expressiveness between the modules. The most prominent actually distributed T-Box reasoning implementation for ontologies Distributed Description Logic (DDL) <ref type="bibr" target="#b1">[2]</ref> supports only a special type of links (called bridge rules) between ontologies. The local domains have to be disjoint, i.e. there is no real subsumption between elements of different modules. Like DDL, E-Connections <ref type="bibr" target="#b2">[3]</ref> treat local domains as disjoint and do not support subsumption relations between modules. Currently, a modular ontology framework based on Conservative Extensions <ref type="bibr" target="#b4">[5]</ref> is in development. This framework, however, imposes severe requirements to the self-containement of the modules. Reasoning with these modules is relatively easy but the computational difficulties are transfered to checking and maintaining the properties of the modules. We think that this approach overshoots the mark and poses too many restrictions on the way axioms might be distributed across different sites.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Distributed Resolution Methods</head><p>Resolution is used by all successful FOL provers. Approaches to distributed first order reasoning are motivated by efficiency considerations, performance is improved by using multiple processors in parallel. Roo <ref type="bibr" target="#b3">[4]</ref>, for example, is a parallelization of the widely (re)used first order reasoner Otter. Parallelization differs from the distribution setting, while the former usually uses a shared memory, the latter constitutes a physical separation of modules and resulting higher communication costs. Partition-Based Reasoning <ref type="bibr" target="#b0">[1]</ref> is a distributed resolution strategy that requires local reasoning to be complete for consequence finding. The requirement inevitably causes derivation of much more clauses than necessary for refutation.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Resolution for Description Logics</head><p>The main problem with applying FOL approaches for DL is that termination is not guaranteed. Although DL is a decidable fragment of FOL, causing a FOL reasoner to terminate on DL input requires considerable adaption of the algorithm. This paper is based on resolution methods introduced to description logics by <ref type="bibr" target="#b5">[6]</ref> that decide satisfiability. The algorithm we present can be seen as a distributed version of the algorithm for deciding ALC satisfiability proposed there.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Distributed Standard Resolution</head><p>Analyzing the applicability of the Partition-Based Reasoning approach of <ref type="bibr" target="#b0">[1]</ref> to description logic revealed that the only way to avoid redundant derivations is to perform the same inferences as a common resolution reasoner instead of connecting a set of blackbox reasoners. Therefore, the basic idea proposed in this paper is to distribute common standard resolution by allocating each derivation step to a unique site in the distributed system. We first define distributed resolution before detailing this essential property of a distributed resolution calculus. A distributed calculus can be created from a given resolution calculus, based on an allocation relation that determines the modules.</p><p>Definition 2 (Allocation). An allocation function for C is a total function a : C → M that maps clauses to module identifiers.</p><p>The allocation could also be defined by an allocation relation, thus allowing for duplication of clauses to multiple modules. However, we will show that for ALC duplication is not necessary and we will define an appropriate allocation function. Before we turn to distributed resolution for ALC we give a general notation of distributed resolution.</p><p>Definition 3 (Distributed Resolution). A distributed resolution method for a set of clauses C is a tuple (R, a) of a resolution calculus and an allocation function such that for every inference with premises P i ∈ C, i ∈ I the allocation restriction ∃m ∈ M, ∀i ∈ I : a(P i ) = m is satisfied.</p><p>For rules with only one premise this restriction is trivially true. A trivial allocation is obtained by allocating all clauses to the same module (a(c) = m 1 ∀c ∈ C). Non-trivial allocations constitute a seperation into modules. The allocation restriction enables implementation of distributed resolution by connecting a set of resolution reasoners, each with a local set of clauses:</p><p>-Every module is saturated seperately.</p><p>-Newly derived clauses are propagated according to the allocation relation.</p><p>Considering one of these reasoners, the difference to performing centralized resolution is, that occasionally locally derived clauses are send to another reasoner and in return, clauses are received from other reasoners and have to be integrated in the local clause list. In addition, a global initialization and termination routine has to be added, that starts the separate solvers and stops if one of them finds a proof or all are saturated.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Definition 4 (Local Resolution</head><p>). The local calculus for module m ∈ M in a distributed resolution method (R, a) is R with additional restriction a(P i ) = m for every premise P i to every rule r ∈ R.</p><p>In contrast to the centralized case, a module that is saturated locally may have to continue reasoning once a new clause is propagated from another module. The allocation a was defined to be functional for guarateeing that inferences are never duplicated. Most relevant are inferences with multiple premises because of the expensive search for corresponding sets of premises. The allocation restrictions ensure that every inference of the original calculus occurs in the distributed setting.</p><p>Corollary 1. If a calculus R guarantees completeness (termination respective) for input ⊆ C than every distributed reasoning method (R, a) for C with a finite set M of modules is complete (terminates).</p><p>Considering a derivation of the empty clause in centralized resolution that is not derived in the corresponding distributed resolution method leads to a contradiction: Assume the empty clause is not derived from an unsatisfiable KB by distributed ordered resolution and consider the first inference in the corresponding centralized reasoner that does not occur in the distributed version. By the allocation restriction, all premises are allocated to some module m the inference is carried out by local resolution in that module. Termination is obviously preserved, no inference is added by distribution.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Problems Distributing Resolution</head><p>In general, clauses might have to be allocated to more than one module to make sure that the premises of every derivation step are allocated to the same module. In order to achieve an efficient method, we have to avoid duplication and allocate every clause to as few sites as possible. Even if we assume the case of only two modules where each may use terms defined by the other, distributing standard resolution might cause most of the axioms in inferences be copied to the other module. To make sure that every pair of clauses that can be resolved with each other meets in a module, we need to send all clauses containing a foreign term (i.e. a term defined in the other module) to the other module. Unfortunately the resolvent of clauses from different modules is very likely to contain terms from both modules that were only used locally before, thus it has to be copied to both modules. If we extend the approach to multiple modules we can expect the number of duplicates to further increase.</p><p>The partition based approach <ref type="bibr" target="#b0">[1]</ref> adresses this problem by limiting propagation to clauses that contain only shared symbols. It defines the shared language between modules as the clauses consisting of symbols shared by a pair of modules. The system can be depicted as a graph, with modules as nodes and edges for every shared language, labelled with the shared symbols. The graph is modified to form a tree by deleting edges and adding the corresponding symbols to other edge labels. Modules that share a symbol "s" remain connected, there is still some path in the tree connecting the modules with "s" contained in the label of every edge on the path. While the edges in the graph are undirected, edges in the resulting tree point towards the root. Whenever a local reasoner derives an axiom that is contained in a shared language (i.e. it contains only symbols from an outgoing edge), the axiom is propagated to the corresponding neighbor, towards the root. In difference to our approach clauses are not allocated to a specific module but may be passed on from one to the other until they reach the root. This approach faces some efficiency problems: First, as mentioned before, the system infers much more clauses than necessary for refutation, because the local reasoning has to be complete for consequence finding to guarantee completeness. If the modules are small (i.e. the knowledge base is distributed to many modules) this effect is negligible, but then the communication cost are very high. Second, the computation and communication is not distributed equably among the modules. The root is a distinguished module, communication load boosts towards it. Not only are the edges directed towards the root, also the number of shared language symbols is larger on edges close to the root. (Recall that all modules that share some symbol remain connected by a path labelled with this symbol.) Third, the partitioning depends on the query. If the query clauses are not contained in some local language, new links have to be introduced and repartitioning might be necessary.</p><p>Nevertheless, we will show that with a different approach it is possible to define a distributed resolution method for ALC that is sound and complete, and terminates. It does not require duplication of axioms, every clause is allocated to exactly one module. New axioms (e.g. queries) can be easily added to the system. The approach is fairly distributed, no module stands out from the others.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Ordered Resolution</head><p>The problem with applying first order methods for DL is loosing decidability. An DLontology input to a complete first order reasoner will not terminate in general. <ref type="bibr" target="#b7">[8]</ref> showed that using ordered resolution, decidability of some description logics can be preserved by transforming the ontology into a special normal form. Before we give the definition of the method for ALC from <ref type="bibr" target="#b5">[6]</ref> we define the resolution rules it is based on. Ordered resolution depends on two parameters that can be modified to improve performance e.g. for restricted subsets of first order logic like description logic. First parameter is the order of literals (for defining the maximal literals) that can be defined on top of an order of predicate symbols, function symbols and constants. The second parameter is the selection function that maps every clause C to a subset S(C) of it's negative literals. Compared to unrestricted resolution, many derivations are skipped in ordered resolution. A literal A that is not selected in a clause C can only be resolved upon if nothing else is selected in the clause and Aσ is maximal. Intuitively, the effect of the selection function is a change in the order of literals in a clause, the selected literals are moved to the front. In addition, the combination of multiple inferences for skipping redundant intermediate results (i.e. hyperresolution) is controlled by selection.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Definition 5 (Ordered Resolution).</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Ordered resolution</head><formula xml:id="formula_0">C ∨ A D ∨ ¬B Cσ ∨</formula></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Adaption to ALC</head><p>We assume a ALC ontology transformed into first order clauses (consider Figure <ref type="figure" target="#fig_0">1</ref> for an example). Modules are thus sets of clauses that can be derived from the given ontology. Different ontologies that share terms (e.g. one ontologie uses a concept defined in another ontology) can be considered as modules, too. The ontologies are translated into clauses seperately resulting in the same structure as an ontology that is first clausified and than partitioned into modules.</p><p>For preserving decidability, the ontology has to be transformed into definitorial form before clausification. Then, resolving the clauses with ordered resolution derives an empty clause iff the ontology is inconsistent and terminates for any ontology. Intuitively, the definitorial form can be seen as contrary of unfolding, it splits up nested axioms into simple ones by introducing new concepts. The requirement to the literal ordering is satisfied by every lexicographic path ordering based on a total precedence &gt; with f &gt; P &gt; ¬ for every function symbol f and predicate symbol P .</p><p>Definition 9 (Lexicographic Path Ordering). A lexicographic path ordering (LPO) is a term ordering induced by a well-founded strict precedence &gt; over function, predicate and logical symbols, defined by: s = f (s 1 , . . . , s m ) g(t 1 , . . . , t n ) = t if and only if at least one of the following holds (i) f &gt; g and s t i for all i with 1 ≤ i ≤ n (ii) f = g and for some j we have (s 1 , . . . , s j−1 ) = (t 1 , . . . , t j−! ), s j t j and s t k for all k with j &lt; k ≤ n (iii) s j t for some j with 1 ≤ j ≤ m Theorem 1 (R DL complexity). For an ALC knowledge base KB, clausifying its definitorial form and saturating the clauses by R ALC decides satisfiability of KB and runs in time exponential in KB.</p><p>This theorem is the reason for using ordered resolution in our approach, we thereby guarantee termination. Note that ordered resolution is not complete for consequence finding and hence, applying it for the partition based reasoning approach of <ref type="bibr" target="#b0">[1]</ref> results in an incomplete procedure.</p><p>Inuitively, the idea for distributing resolution is to allocate clauses based on their resolvable literals.</p><p>-Every module "hosts" a subset of the literals, i.e. it is responsible for all inferences resolving upon one of these literals. -Every (derived or stated) clause is moved to the modules that host a resolvable literal of the clause.</p><p>Essential for distributing resolution without duplicating clauses and derivations is to reduce the number of modules that may be responsible for the next derivation to a single module i.e. to define the resolution method in such a way that there is a unique resolvable literal for every clause. Note that literals can be allocated to modules based on their top symbol (i.e. the predicate P of the literal (¬)P (t) where t is a term) and a partitioning of symbols that allocates every symbol to exactly one module.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">Resolvable Literals of a Given Clause</head><p>For defining a resolution method that is complete for this communication strategy, we investigate the options for using a given clause in an inference. The first thing we discover from the definition of ordered resolution is that a clause can only be main premise and not side premise if some literal is selected by the selection function. By exploiting the other requirements we obtain the decision tree in Figure <ref type="figure" target="#fig_2">2</ref> (assuming the decision on the resolvable literal is not affected by unification). If nothing is selected, only a maximal literal can be resolved. If there are multiple maximal literals, then the clause cannot be a side premise. Problematic are multiple negative maximal literals because any of them can be the resolved literal in the next derivation. If all multiple maximal literals are positive the clause cannot be premise for ordered resolution because it can neither be a side premise nor a main premise. A given clause can never be side premise and main premise at same time (in different inferences). If the selection function is empty, the first branch on the right is pruned. If the order is total on all literals, the branch in the middle is pruned. By exploiting the effects of the selection and ordering on the clause tree, two problematic cases are identified:</p><p>Multiple selected literals: If literals from different moduls are selected, we cannot allocate the inference to one module. Multiple negative maximal literals, none selected: We have to send the clause to every module that is responsible for derivations resolving upon one of these literals.</p><p>Note that these restrictions apply to the unified clause: The definition of ordered resolution requires each resolved atom Aσ to be (strictly) maximal with respect to Dσ (Cσ) but this does in general not imply that A is (strictly) maximal w.r.t. the premise before substitution. However, the ordering defined for R ALC is admissible which implies invariance under substitutions and the selection function is independent of substitutions, too. For guaranteeing termination of R ALC , ordering and selection were defined in a way that restricts the clauses to five different types <ref type="bibr" target="#b5">[6]</ref>:</p><formula xml:id="formula_1">clause type type of resolvable literal 1 R(x, f (x)) ∨ P(x) R(x, f (x)) 2a P(x) (¬)P (x) 2b P1(f (x)) ∨ P2(x) (¬)P (f (x)) 3 ¬R(x, y) ∨ P1(x) ∨ P2(y) ¬R(x, y) 4 P(a) (¬)P (a) 5 (¬)R(a, b) (¬)R(a, b)</formula><p>Table <ref type="table">1</ref>. ALC clauses. For a term t, P(t) denotes a possibly empty disjunction of the form (¬)P1(t) . . . (¬)Pn(t). P(f (x)) denotes a possibly empty disjunction of the form P1(f1(x)) . . . Pm(fm(x)). Note that this definition allows each P i (fi(x)) to contain positive and negative literals. We choose a slightly different notation of the ALC clauses with the type 2 clauses subdivided because the resolvable literal is of type P (f (x)) if one exists and otherwise P (x).</p><p>Obviously, ALC clauses never contain more than one selected literal, thus, the first problem does not occur in R ALC . The second problem can be avoided, too, by either applying a total order on literals or selecting one of the multiple maximal literals. The effect is similar because considering the possible inferences, a selected literal is treated like a strictly maximal literal. We choose the former and extend the ordering defined for R ALC as follows.  <ref type="formula">2</ref>). The first literal is the resolvable literal of a clause (let Set &gt; P air). Arrows depict propagation of clauses, we ommit braces for purpose of readability. Note that in this example most clauses contain symbols hosted by the other module. In realistic examples this overlap is very small and we can expect many of the clauses to be kept in the module where they were infered.</p><p>contain the same number of function symbols and the top predicate of A precedes the top predicate of B.</p><p>The calculus R ALC with extended ordering D is denoted by R D ALC . The literals that remain incomparable wrt. D have the same top predicate and are therefore hosted by the same module.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Corollary 2 (Distributed Ordered Resolution).</head><p>Distributed Ordered Resolution, i.e. the method (R D ALC , a) with allocation a defined by a(c) = part(topSymbol(resolvableLiteral(c))) based on a partitioning of predicates part that maps every predicate to a module identifier is a distributed resolution method.</p><p>Since the resolvable literal is unique, the allocation is a function. The allocation restrictions are satisfied because for any pair of clauses that are premises of an inference the resolvable literals have the same top symbol and are thus allocated to the same module. Hence, if R D ALC is complete and terminates, by Corollary 1 the same holds for (R D ALC , a). Theorem 2 (Completeness and Termination of Distributed Ordered Resolution). Distributed Ordered Resolution is a complete and terminating method for deciding ALC satisfiability.</p><p>Proof. The theorem follows from completeness and termination of R ALC (Theorem 1). We have to show that R D ALC is complete and terminates. The definition of ordered resolution allows for an arbitrary ordering of literals, hence R D ALC is complete. We only have to make sure that Theorem 1 is not affected by this modification and hence termination is preserved. In fact, the requirements R(x, f (x)) D ¬C(x) and D(f (x)) D ¬C(x) still hold for all function symbols f , and predicates R, C, and D.</p><p>Consider Figure <ref type="figure" target="#fig_2">2</ref> for an example refutation.</p><p>The algorithm presented above is not yet implemented in an actually distributed reasoner, but first experiments were conducted simulating the distributed setting. For investigating distributed ordered resolution we implemented the definitorial form normalization and modified the resolution reasoner SPASS to perform the R D ALC resolution. The crucial point for determining performance of distributed ordered resolution is the number of propagated clauses. If the modules happen to be completely independent, performance would be improved by a factor propotional to the number of modules. But, in realistic settings the modules will allways share some symbols and clauses will be exchanged between them. Propagation of a derived clause to another module is considerably more expensive then just adding it to the local clause list. Apart from the distance that has to be bridged over, a received clause requires modification of the local index structures that are used to manage the clause lists. Hence, the ratio of derived clauses that have to be propagated to another module is the dicisive criterion.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1">Setting</head><p>For our experiments we used the FOODSWAP<ref type="foot" target="#foot_0">1</ref> ontology. First we transformed the ontology into its definitorial form. Then we used our partitioning tool PATO for assigning each term (i.e. concept or property name) a module number. PATO first builds a dependency graph from the ontology that is afterwards partitioned into parts that are stronger internally connected than extenally connected <ref type="foot" target="#foot_1">2</ref> . For minimizing the communication we configured Pato to create a link in the dependency graph for every pair of terms appearing in the same axiom. Finally we started the SPASS reasoner with appropriate configuration to check consistency of the ontology. For every ordered resolution derivation we recorded the module that performed the derivation (i.e. the module number of the resolved literal) and the module to which the derived clause is send (i.e. the module number of the resolvable literal of the derived clause).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2">Results</head><p>Analysing the list of source and destination of every derived clause, we found that out of the 229 derived clauses 110 were kept in the same module, the other 119 clauses were propagated to another module. The complete picture of the communication is depicted in Figure <ref type="figure" target="#fig_4">4</ref>. The communication load of 52% of the derived clauses may already be acceptable in many applications, but still further reduction can be achieved by merging modules as depicted in the subsequent graphs. In the three module setting only 28% of the clauses are propagated and 26% for the two modules while the sizes remain balanced. Positive tests like this consistency check require complete saturation whereas negative test are often answered faster by resolution reasoners. For our small test case the time required for reasoning is dominated by the time for loading the input, but the number of derived clauses is smaller for positive subsumption checks (i.e. derivation of the empty clause). E.g. testing the subsumption (meat f ood) by adding its converse (meat ∧ ¬f ood) to the clauses required only 45 derivations (with 47% propagation in the four module setting). In contrary, the negative subsumption test (eggs dairy) is similar to the positive consistency check (226 derivations, 115 propagations).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6">Optimizations</head><p>The first results of distributed ordered resolution are very encouraging. However, additional tests with different larger ontologies are required and further reduction of the propagation ratio is desirable.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.1">Partitioning</head><p>One of the next steps for improving performance will be to optimize the partitioning wrt. communication. In our experiment we started with a partitioning generated by PATO and than reduced communication by merging modules. In the same way we could start with all modules containing only a single term and find the optimal partitioning wrt. a given reasoning task. Finding the optimal partitioning is an apparently hard problem, but approximations will probably meet the needs.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.2">Selection Function</head><p>Another topic of investigation is the selection function. Up to now, it selects at most one literal. Thus, a derived clause is very likely to be moved to another module. It would be more efficient to select local literals first. Moreover, if we select multiple literals, we can combine multiple inferences to a hyperresolution inference and skip redundant intermediate results. By investigating the requirements necessary to make sure the set of ALC clauses is closed under R ALC , we can improve the selection function:</p><p>Definition 11 (Adapted Distributed Selection).</p><p>1. Select nothing from clauses of type 1 and 5. 2. From type 3 clauses select the negative binary literal. 3. From clauses of type 2b select the maximal number of negative literals of type P (f (x)) hosted by a single module. 4. From clauses of type 2a and 4 select the maximal number of negative literals hosted by a single module.</p><p>We may not select all local literals but selecting all literals hosted by one module has a similar effect. First the clause is moved to that module, but then all local literals are solved in the next inference.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.3">Reduction</head><p>For efficient resolution reasoning it is not sufficient to avoid unneccessary derivations.</p><p>Additionally, reduction techniques have to be applied that delete clauses if they are tautologies or subsumed by other clauses. Deleting newly derived clauses that are subsumed is forward subsumption, backward subsumption refers to the deletion of clauses that are subsumed by a newly derived clause. The naïve approach to distributed subsumption would be to propagate new clauses to all modules that host one of its literals for forward subsumption checks and send it to modules that are responsible for literals equal or smaller than the resolvable literal of the new clause for backward subsumtion.</p><p>Fortunately we can again take advantage of the clause typology. Of all different clause types only type 2a can be subsumed by a clause of different type, every other clause can only be subsumed by a clause of the same type. Thus, by indexing the clauses with their types we can skip a large part of the subsumption tests.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7">Summary</head><p>We addressed the problem of improving the scalability of description logic reasoning by proposing a distributed resolution method for ALC terminologies. The algorithm extends ordered resolution with a method for assigning clauses to a unique location at which all possible resolution steps are executed by a local solver. We analyzed the properties of such a distributed method and identified necessary conditions for guaranteeing completeness and termination of the algorithm. Further, we have shown that the resolution method for ALC described by Motik satisfies these conditions and can therefore be distributed across different distributed reasoners. Our investigations lay the foundation for the implementation of a large scale reasoning infrastructure for ALC terminologies and can be seen as a first step towards supporting the vision of the semantic web as a distributed system of interlinked ontologies that can be reasoned upon. First experiments on the performance of the proposed algorithm are promising but more detailed investigation is necessary. In particular, we have to find a way to control the cost of communication between local reasoners. The major task in this context is the definition of a suitable distribution strategy based on the nature of the terminologies involved. Further, we have to investigate possible optimizations of the reasoning method some of which are already mentioned in Section 6. Another major issue is the development of distributed reasoning methods for more expressive languages -the final goal is to support reasoning in OWL which is known to be equivalent to SHOIN (D). A direct extension of the approach proposed here is not possible because dealing with number restrictions requires a different resolution approach, i.e. the use of paramodulation. In this context an assignment of clauses to a certain reasoner is not feasible, duplication not be completely avoided. An adaption of our approach with limited duplication to a calculus that supports equality is the focus of future work.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Definition 1 (</head><label>1</label><figDesc>Standard Resolution). For clauses C ∨ A and D ∨ ¬B with literals A and ¬B, standard resolution is defined by Resolution C ∨ A D ∨ ¬B Cσ ∨ Dσ where σ is the most general unifier of A and B.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Definition 7 (</head><label>7</label><figDesc>Definitorial Form). The definitorial form of a concept C in negation normal form isDef(C) = C if all subterms of C are literal concepts {¬Q C| p } ∪ Def(C[Q] p ) elsewhere C| p is a concept that occurs in C (subterm of C), but not a literal concept and all subterms of C| p are literal concepts. C[Q] p is the concept definition that results from replacing C| p with Q.Definition 8 (ALC Resolution). ALC resolution (R ALC ) is the calculus withrules ordered resolution and factoring, selection of exactly the negative binary literals, and literal ordering with R(x, f (x)) ¬C(x) and D(f (x)) ¬C(x), for all function symbols f , and predicates R, C, and D.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Fig. 2 .</head><label>2</label><figDesc>Fig. 2. Clause decision tree depicting the resolution options for a given clause.</figDesc><graphic coords="8,169.35,435.69,276.67,112.42" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Definition 10 ( 1 )( 3 )Fig. 3 .</head><label>10133</label><figDesc>Fig. 3. Example of Distributed Ordered Resolution. The example from Figure 1 is distributed to two modules. Module A hosts literals with top predicate Set, B hosts literals with top predicate part or P air. Stated clauses are noted above the horizontal line, below (4 1+3 ) means clause (4) was derived from clauses (1) and (2). The first literal is the resolvable literal of a clause (let Set &gt; P air). Arrows depict propagation of clauses, we ommit braces for purpose of readability. Note that in this example most clauses contain symbols hosted by the other module. In realistic examples this overlap is very small and we can expect many of the clauses to be kept in the module where they were infered.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Fig. 4 .</head><label>4</label><figDesc>Fig. 4. Communication between modules for the satisfiability test. Modules are denoted by circles with the size (i.e. number of terms) of the module denoted inside. Edges are labelled with the number of clauses propagated between the corresponding modules, propagation direction is not distinguished for purpose of readability. The numbers below the graphs give the percentage of derived clauses that have to be propagated.</figDesc><graphic coords="12,178.86,115.84,94.51,85.04" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head></head><label></label><figDesc>Aσ is strictly maximal with respect to Cσ 4. nothing is selected in Cσ ∨ Aσ Definition 6 (Positive Factoring).</figDesc><table><row><cell>Positive Factoring</cell><cell>C ∨ A ∨ B Cσ ∨ Aσ</cell></row></table><note>Dσ where 1. σ is the most general unifier of A and B 2. either B is selected in D ∨ ¬B or else nothing is selected in D ∨ ¬B and Bσ is maximal w.r.t. Dσ 3. where 1. σ is the most general unifier of A and B 2. Aσ is maximal with respect to Cσ ∨ Bσ 3. nothing is selected in Cσ ∨ Aσ ∨ Bσ</note></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">http://www.mindswap.org/dav/ontologies/commonsense/food/foodswap.owl</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">A detailed description of PATO can be found in<ref type="bibr" target="#b6">[7]</ref> </note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgement</head><p>This work was partially supported by the German Science Foundation in the Emmy-Noether Program under contract Stu 266/3-1.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Partition-based logical reasoning for first-order and propositional theories</title>
		<author>
			<persName><forename type="first">E</forename><surname>Amir</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Mcilraith</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Artificial Intelligence</title>
		<imprint>
			<biblScope unit="volume">162</biblScope>
			<biblScope unit="issue">1-2</biblScope>
			<biblScope unit="page" from="49" to="88" />
			<date type="published" when="2005">2005</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Distributed description logics: Assimilating information from peer sources</title>
		<author>
			<persName><forename type="first">A</forename><surname>Borgida</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Serafini</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Data Semantics</title>
		<imprint>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="153" to="184" />
			<date type="published" when="2003">2003</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Combining owl ontologies using econnections</title>
		<author>
			<persName><forename type="first">Bijan</forename><surname>Bernardo Cuenca Grau</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Evren</forename><surname>Parsia</surname></persName>
		</author>
		<author>
			<persName><surname>Sirin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal Of Web Semantics</title>
		<imprint>
			<biblScope unit="volume">4</biblScope>
			<biblScope unit="issue">1</biblScope>
			<date type="published" when="2005">2005</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Roo: a parallel theorem prover</title>
		<author>
			<persName><forename type="first">L</forename><surname>Ewing</surname></persName>
		</author>
		<author>
			<persName><forename type="first">William</forename><forename type="middle">W</forename><surname>Lusk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">John</forename><surname>Mccune</surname></persName>
		</author>
		<author>
			<persName><surname>Slaney</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 11th CADE</title>
				<meeting>the 11th CADE</meeting>
		<imprint>
			<publisher>Springer Verlag</publisher>
			<date type="published" when="1992">1992</date>
			<biblScope unit="volume">607</biblScope>
			<biblScope unit="page" from="731" to="734" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Conservative extensions in expressive description logics</title>
		<author>
			<persName><forename type="first">C</forename><surname>Lutz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Walther</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Wolter</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Twentieth International Joint Conference on Artificial Intelligence IJCAI-07</title>
				<meeting>the Twentieth International Joint Conference on Artificial Intelligence IJCAI-07</meeting>
		<imprint>
			<publisher>AAAI Press</publisher>
			<date type="published" when="2007">2007</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<monogr>
		<title level="m" type="main">Reasoning in Description Logics using Resolution and Deductive Databases</title>
		<author>
			<persName><forename type="first">Boris</forename><surname>Motik</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2006-01">January 2006</date>
			<pubPlace>Karlsruhe, Germany</pubPlace>
		</imprint>
		<respStmt>
			<orgName>Universität Karlsruhe (TH)</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">PhD thesis</note>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Structure-based partitioning of large concept hierarchies</title>
		<author>
			<persName><forename type="first">Heiner</forename><surname>Stuckenschmidt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Michel</forename><surname>Klein</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Third International Semantic Web Conference (ISWC 2004)</title>
				<editor>
			<persName><forename type="first">Sheila</forename><forename type="middle">A</forename><surname>Mcilraith</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Dimitris</forename><surname>Plexousakis</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Frank</forename><surname>Van Harmelen</surname></persName>
		</editor>
		<meeting>the Third International Semantic Web Conference (ISWC 2004)<address><addrLine>Hiroshima, Japan</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2004-11">nov 2004</date>
			<biblScope unit="page" from="289" to="303" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<title level="m" type="main">Resolution methods for Decision Problems and Finite Model Building</title>
		<author>
			<persName><forename type="first">Tanel</forename><surname>Tammet</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1992">1992</date>
		</imprint>
		<respStmt>
			<orgName>Chalmers University of Technology and University of Göteborg</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">PhD thesis</note>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
