<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Distributed Resolution for ALC - First Results</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Anne Schlicht</string-name>
          <email>anne@informatik.uni-mannheim.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Heiner Stuckenschmidt</string-name>
          <email>heiner@informatik.uni-mannheim.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Mannheim</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The use of description logic as the basis for semantic web languages has led to new requirements with respect to scalable and non-standard reasoning. Description logic is a decidable fragment of FOL but still, the standard reasoning tasks are of exponential complexity, satisfiability and subsumption tests are often intractable on large ontologies. Existing large ontologies have a modular structure like networks of linked ontologies, caused by the development process. However, current reasoning approaches do scarcely take advantage of this structure. The available reasoners do not exploit parallel computation and scalability improvements enabled by distributed reasoning. In this paper, we lay the foundation for developing distributed reasoning methods by showing that the description logic fragment ALC can be distributed. We propose a distributed, complete and terminating algorithm that decides satisfiability of terminologies in ALC. The algorithm is based on recent results on applying resolution to description logics. We show that the resolution procedure proposed by Tammet can be distributed amongst multiple resolution solvers by assigning unique sets of literals to individual solvers. This provides the basis for a highly scalable reasoning infrastructure for description logics.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>
        The use of description logics as one of the primary logical languages for knowledge
representation on the Web has created new challenges with respect to reasoning in these
logics. In order to be able to support the vision of a semantic web of interrelated
ontologies, reasoning procedures have to be highly scalable and able to deal with physically
distributed knowledge models. A natural way of addressing these problems is to rely on
distributed inference procedures that can distribute the load between different solvers
thus reducing potential bottlenecks both in terms of memory and computation time.
Currently, almost all the work on description logic reasoning still assumes a
centralized approach where the complete terminology has to be present on a single system
and all inference steps are are carried out on this system. Exceptions to this rule are
approaches like Distributed Description Logics (DDL) [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] that support local reasoning
at the price of sacrificing expressiveness in the links between local models and by
dropping some formal properties on the level of the overall model. In DDL for example,
certain types of inconsistencies are not propagated on the global level. The goal of our
work is to support local reasoning in description logics without the need to reduce the
expressiveness of links between local models. Based on results in resolution reasoning
P air v Set
P air v ∀part.¬Set
Set v ∃part.Set
P air(a)
{¬P air(x1), Set(x1)}
{¬P air(x2), ¬part(x2, x3), ¬Set(x3)}
{¬Set(x4), part(x4, f (x4))}, {¬Set(x4), Set(f (x4))}
{P air(a)}
for description logic [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], we present an algorithm that decides satisfiability of a set of
      </p>
    </sec>
    <sec id="sec-2">
      <title>ALC ontologies. This algorithm allows us to provide distributed reasoning support on</title>
      <p>sets of terminologies that share non-logical symbols without merging the modules. A
possible application is the provision of distributed reasoning support for (ALC
equivalent parts of) OWL ontologies linked by import statements. Our method guarantees that
the global semantics is preserved.</p>
      <p>
        The contribution of this paper is threefold: (1) We identify general requirements
for a resolution procedure needed to guaranteeing soundness and completeness. (2) we
show that the resolution method proposed in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] satisfies these requirements. (3) we
present first results on the practical applicability of the algorithm. The paper is
organized as follows: after a brief discussion of related work in the remainder of this
section, we first explain the general idea of distributing standard resolution and discuss
the problems that have to be solved for guaranteeing completeness. We then discuss
ordered resolution as a adequate basis for distribution and its adaptation to ALC
knowledge bases before turning to the distributed ordered resolution algorithm. Finally, we
present the first results of this method and discuss possible optimizations.
1.1
      </p>
      <p>Related Work
Modular DL Reasoning Current approaches to distributed reasoning on description
logic mostly rely on tableaux methods. Distribution by solving the two choices of
nondeterminitic tableaux rules in parallel is difficult as it hampers the application of
optimization and blocking strategies. Instead most distributed tableaux approaches try to
identify all possible conflicts, i.e. all axioms that might follow from another module and
would cause a contradiction and send these as queries to the other modules. So far, this
is only done for links with rather restricted expressiveness between the modules.</p>
      <sec id="sec-2-1">
        <title>The most prominent actually distributed T-Box reasoning implementation for ontolo</title>
        <p>
          gies Distributed Description Logic (DDL) [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] supports only a special type of links
(called bridge rules) between ontologies. The local domains have to be disjoint, i.e.
there is no real subsumption between elements of different modules. Like DDL,
E
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>Connections [3] treat local domains as disjoint and do not support subsumption relations between modules.</title>
      </sec>
      <sec id="sec-2-3">
        <title>Currently, a modular ontology framework based on Conservative Extensions [5] is</title>
        <p>in development. This framework, however, imposes severe requirements to the
selfcontainement of the modules. Reasoning with these modules is relatively easy but the
computational difficulties are transfered to checking and maintaining the properties of
the modules. We think that this approach overshoots the mark and poses too many
restrictions on the way axioms might be distributed across different sites.</p>
      </sec>
      <sec id="sec-2-4">
        <title>Distributed Resolution Methods Resolution is used by all successful FOL provers.</title>
      </sec>
      <sec id="sec-2-5">
        <title>Approaches to distributed first order reasoning are motivated by efficiency consider</title>
        <p>
          ations, performance is improved by using multiple processors in parallel. Roo[
          <xref ref-type="bibr" rid="ref4">4</xref>
          ], for
example, is a parallelization of the widely (re)used first order reasoner Otter.
        </p>
      </sec>
      <sec id="sec-2-6">
        <title>Parallelization differs from the distribution setting, while the former usually uses a</title>
        <p>
          shared memory, the latter constitutes a physical separation of modules and resulting
higher communication costs. Partition-Based Reasoning [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] is a distributed resolution
strategy that requires local reasoning to be complete for consequence finding. The
requirement inevitably causes derivation of much more clauses than necessary for
refutation.
        </p>
      </sec>
      <sec id="sec-2-7">
        <title>Resolution for Description Logics The main problem with applying FOL approaches</title>
        <p>for DL is that termination is not guaranteed. Although DL is a decidable fragment of</p>
      </sec>
      <sec id="sec-2-8">
        <title>FOL, causing a FOL reasoner to terminate on DL input requires considerable adaption</title>
        <p>
          of the algorithm. This paper is based on resolution methods introduced to description
logics by [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] that decide satisfiability. The algorithm we present can be seen as a
distributed version of the algorithm for deciding ALC satisfiability proposed there.
2
        </p>
        <sec id="sec-2-8-1">
          <title>Distributed Standard Resolution</title>
        </sec>
      </sec>
      <sec id="sec-2-9">
        <title>Analyzing the applicability of the Partition-Based Reasoning approach of [1] to descrip</title>
        <p>tion logic revealed that the only way to avoid redundant derivations is to perform the
same inferences as a common resolution reasoner instead of connecting a set of
blackbox reasoners. Therefore, the basic idea proposed in this paper is to distribute common
standard resolution by allocating each derivation step to a unique site in the distributed
system.</p>
      </sec>
      <sec id="sec-2-10">
        <title>We first define distributed resolution before detailing this essential property of a distributed resolution calculus.</title>
        <p>Definition 1 (Standard Resolution). For clauses C ∨ A and D ∨ ¬B with literals A
and ¬B, standard resolution is defined by</p>
        <p>Resolution C ∨ A D ∨ ¬B</p>
        <p>Cσ ∨ Dσ
where σ is the most general unifier of A and B.</p>
      </sec>
      <sec id="sec-2-11">
        <title>A distributed calculus can be created from a given resolution calculus, based on an allocation relation that determines the modules.</title>
        <p>Definition 2 (Allocation). An allocation function for C is a total function a : C → M
that maps clauses to module identifiers.</p>
      </sec>
      <sec id="sec-2-12">
        <title>The allocation could also be defined by an allocation relation, thus allowing for dupli</title>
        <p>cation of clauses to multiple modules. However, we will show that for ALC duplication
is not necessary and we will define an appropriate allocation function. Before we turn
to distributed resolution for ALC we give a general notation of distributed resolution.
Definition 3 (Distributed Resolution). A distributed resolution method for a set of
clauses C is a tuple (R, a) of a resolution calculus and an allocation function such that
for every inference with premises Pi ∈ C, i ∈ I the allocation restriction ∃m ∈ M, ∀i ∈
I : a(Pi) = m is satisfied.</p>
      </sec>
      <sec id="sec-2-13">
        <title>For rules with only one premise this restriction is trivially true. A trivial allocation is</title>
        <p>obtained by allocating all clauses to the same module (a(c) = m1 ∀c ∈ C). Non-trivial
allocations constitute a seperation into modules.</p>
      </sec>
      <sec id="sec-2-14">
        <title>The allocation restriction enables implementation of distributed resolution by connect</title>
        <p>ing a set of resolution reasoners, each with a local set of clauses:
– Every module is saturated seperately.</p>
        <p>– Newly derived clauses are propagated according to the allocation relation.</p>
      </sec>
      <sec id="sec-2-15">
        <title>Considering one of these reasoners, the difference to performing centralized resolution</title>
        <p>is, that occasionally locally derived clauses are send to another reasoner and in return,
clauses are received from other reasoners and have to be integrated in the local clause
list. In addition, a global initialization and termination routine has to be added, that
starts the separate solvers and stops if one of them finds a proof or all are saturated.
Definition 4 (Local Resolution). The local calculus for module m ∈ M in a
distributed resolution method (R, a) is R with additional restriction a(Pi) = m for every
premise Pi to every rule r ∈ R.</p>
      </sec>
      <sec id="sec-2-16">
        <title>In contrast to the centralized case, a module that is saturated locally may have to con</title>
        <p>tinue reasoning once a new clause is propagated from another module.</p>
      </sec>
      <sec id="sec-2-17">
        <title>The allocation a was defined to be functional for guarateeing that inferences are never</title>
        <p>duplicated. Most relevant are inferences with multiple premises because of the
expensive search for corresponding sets of premises. The allocation restrictions ensure that
every inference of the original calculus occurs in the distributed setting.
Corollary 1. If a calculus R guarantees completeness (termination respective) for
input ⊆ C than every distributed reasoning method (R, a) for C with a finite set M of
modules is complete (terminates).</p>
      </sec>
      <sec id="sec-2-18">
        <title>Considering a derivation of the empty clause in centralized resolution that is not derived</title>
        <p>in the corresponding distributed resolution method leads to a contradiction: Assume the
empty clause is not derived from an unsatisfiable KB by distributed ordered resolution
and consider the first inference in the corresponding centralized reasoner that does not
occur in the distributed version. By the allocation restriction, all premises are allocated
to some module m the inference is carried out by local resolution in that module.</p>
      </sec>
      <sec id="sec-2-19">
        <title>Termination is obviously preserved, no inference is added by distribution.</title>
        <p>2.1</p>
        <p>Problems Distributing Resolution
In general, clauses might have to be allocated to more than one module to make sure
that the premises of every derivation step are allocated to the same module. In order to
achieve an efficient method, we have to avoid duplication and allocate every clause to
as few sites as possible. Even if we assume the case of only two modules where each
may use terms defined by the other, distributing standard resolution might cause most
of the axioms in inferences be copied to the other module. To make sure that every pair
of clauses that can be resolved with each other meets in a module, we need to send all
clauses containing a foreign term (i.e. a term defined in the other module) to the other
module. Unfortunately the resolvent of clauses from different modules is very likely to
contain terms from both modules that were only used locally before, thus it has to be
copied to both modules. If we extend the approach to multiple modules we can expect
the number of duplicates to further increase.</p>
        <p>
          The partition based approach [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] adresses this problem by limiting propagation to
clauses that contain only shared symbols. It defines the shared language between
modules as the clauses consisting of symbols shared by a pair of modules. The system can
be depicted as a graph, with modules as nodes and edges for every shared language,
labelled with the shared symbols. The graph is modified to form a tree by deleting edges
and adding the corresponding symbols to other edge labels. Modules that share a
symbol ”s” remain connected, there is still some path in the tree connecting the modules
with ”s” contained in the label of every edge on the path. While the edges in the graph
are undirected, edges in the resulting tree point towards the root.
        </p>
        <p>Whenever a local reasoner derives an axiom that is contained in a shared language (i.e.
it contains only symbols from an outgoing edge), the axiom is propagated to the
corresponding neighbor, towards the root. In difference to our approach clauses are not
allocated to a specific module but may be passed on from one to the other until they
reach the root. This approach faces some efficiency problems: First, as mentioned
before, the system infers much more clauses than necessary for refutation, because the
local reasoning has to be complete for consequence finding to guarantee completeness.
If the modules are small (i.e. the knowledge base is distributed to many modules) this
effect is negligible, but then the communication cost are very high. Second, the
computation and communication is not distributed equably among the modules. The root is a
distinguished module, communication load boosts towards it. Not only are the edges
directed towards the root, also the number of shared language symbols is larger on edges
close to the root. (Recall that all modules that share some symbol remain connected by
a path labelled with this symbol.) Third, the partitioning depends on the query. If the
query clauses are not contained in some local language, new links have to be introduced
and repartitioning might be necessary.</p>
      </sec>
      <sec id="sec-2-20">
        <title>Nevertheless, we will show that with a different approach it is possible to define a</title>
        <p>distributed resolution method for ALC that is sound and complete, and terminates. It
does not require duplication of axioms, every clause is allocated to exactly one module.</p>
      </sec>
      <sec id="sec-2-21">
        <title>New axioms (e.g. queries) can be easily added to the system. The approach is fairly distributed, no module stands out from the others.</title>
        <sec id="sec-2-21-1">
          <title>Ordered Resolution</title>
        </sec>
      </sec>
      <sec id="sec-2-22">
        <title>The problem with applying first order methods for DL is loosing decidability. An DL</title>
        <p>
          ontology input to a complete first order reasoner will not terminate in general. [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]
showed that using ordered resolution, decidability of some description logics can be
preserved by transforming the ontology into a special normal form. Before we give the
definition of the method for ALC from [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] we define the resolution rules it is based on.
        </p>
      </sec>
      <sec id="sec-2-23">
        <title>Ordered resolution depends on two parameters that can be modified to improve perfor</title>
        <p>mance e.g. for restricted subsets of first order logic like description logic. First
parameter is the order of literals (for defining the maximal literals) that can be defined on top of
an order of predicate symbols, function symbols and constants. The second parameter is
the selection function that maps every clause C to a subset S(C) of it’s negative literals.
Definition 5 (Ordered Resolution).
where</p>
        <p>Ordered resolution</p>
        <p>C ∨ A D ∨ ¬B</p>
        <p>Cσ ∨ Dσ
1. σ is the most general unifier of A and B
2. either B is selected in D ∨ ¬B or else nothing is selected in D ∨ ¬B and Bσ is
maximal w.r.t. Dσ
3. Aσ is strictly maximal with respect to Cσ
4. nothing is selected in Cσ ∨ Aσ
Definition 6 (Positive Factoring).</p>
        <p>Positive Factoring C ∨ A ∨ B</p>
        <p>Cσ ∨ Aσ
where
1. σ is the most general unifier of A and B
2. Aσ is maximal with respect to Cσ ∨ Bσ
3. nothing is selected in Cσ ∨ Aσ ∨ Bσ
Compared to unrestricted resolution, many derivations are skipped in ordered
resolution. A literal A that is not selected in a clause C can only be resolved upon if nothing
else is selected in the clause and Aσ is maximal. Intuitively, the effect of the selection
function is a change in the order of literals in a clause, the selected literals are moved
to the front. In addition, the combination of multiple inferences for skipping redundant
intermediate results (i.e. hyperresolution) is controlled by selection.
3.1</p>
        <sec id="sec-2-23-1">
          <title>Adaption to ALC</title>
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>We assume a ALC ontology transformed into first order clauses (consider Figure 1 for</title>
      <p>an example). Modules are thus sets of clauses that can be derived from the given
ontology. Different ontologies that share terms (e.g. one ontologie uses a concept defined in
another ontology) can be considered as modules, too. The ontologies are translated into
clauses seperately resulting in the same structure as an ontology that is first clausified
and than partitioned into modules.</p>
      <sec id="sec-3-1">
        <title>For preserving decidability, the ontology has to be transformed into definitorial form</title>
        <p>before clausification. Then, resolving the clauses with ordered resolution derives an
empty clause iff the ontology is inconsistent and terminates for any ontology. Intuitively,
the definitorial form can be seen as contrary of unfolding, it splits up nested axioms into
simple ones by introducing new concepts.</p>
        <p>Definition 7 (Definitorial Form). The definitorial form of a concept C in negation
normal form is</p>
        <p>Def(C) =</p>
        <p>C
{¬Q u C|p} ∪ Def(C[Q]p)
if all subterms of C are literal concepts
else
where C|p is a concept that occurs in C (subterm of C), but not a literal concept and all
subterms of C|p are literal concepts. C[Q]p is the concept definition that results from
replacing C|p with Q.</p>
        <p>Definition 8 (ALC Resolution). ALC resolution (RALC) is the calculus with
– rules ordered resolution and factoring,
– selection of exactly the negative binary literals, and
– literal ordering with R(x, f (x)) ¬C(x) and D(f (x))
function symbols f , and predicates R, C, and D.
¬C(x), for all</p>
      </sec>
      <sec id="sec-3-2">
        <title>The requirement to the literal ordering is satisfied by every lexicographic path ordering</title>
        <p>based on a total precedence &gt; with f &gt; P &gt; ¬ for every function symbol f and
predicate symbol P .</p>
        <p>Definition 9 (Lexicographic Path Ordering). A lexicographic path ordering (LPO) is
a term ordering induced by a well-founded strict precedence &gt; over function, predicate
and logical symbols, defined by:
s = f (s1, . . . , sm) g(t1, . . . , tn) = t if and only if
at least one of the following holds
(i) f &gt; g and s ti for all i with 1 ≤ i ≤ n
(ii) f = g and for some j we have (s1, . . . , sj−1) = (t1, . . . , tj−!), sj
for all k with j &lt; k ≤ n
(iii) sj t for some j with 1 ≤ j ≤ m
tj and s
tk</p>
        <sec id="sec-3-2-1">
          <title>Theorem 1 (RDL complexity).</title>
          <p>For an ALC knowledge base KB, clausifying its definitorial form and saturating the
clauses by RALC decides satisfiability of KB and runs in time exponential in KB.</p>
        </sec>
      </sec>
      <sec id="sec-3-3">
        <title>This theorem is the reason for using ordered resolution in our approach, we thereby guarantee termination. Note that ordered resolution is not complete for consequence finding and hence, applying it for the partition based reasoning approach of [1] results in an incomplete procedure.</title>
        <sec id="sec-3-3-1">
          <title>Distributed Ordered Resolution</title>
        </sec>
      </sec>
      <sec id="sec-3-4">
        <title>Inuitively, the idea for distributing resolution is to allocate clauses based on their resolvable literals.</title>
        <p>– Every module ”hosts” a subset of the literals, i.e. it is responsible for all inferences
resolving upon one of these literals.
– Every (derived or stated) clause is moved to the modules that host a resolvable
literal of the clause.</p>
        <p>Essential for distributing resolution without duplicating clauses and derivations is to
reduce the number of modules that may be responsible for the next derivation to a
single module i.e. to define the resolution method in such a way that there is a unique
resolvable literal for every clause. Note that literals can be allocated to modules based
on their top symbol (i.e. the predicate P of the literal (¬)P (t) where t is a term) and a
partitioning of symbols that allocates every symbol to exactly one module.
4.1</p>
        <p>Resolvable Literals of a Given Clause</p>
      </sec>
      <sec id="sec-3-5">
        <title>For defining a resolution method that is complete for this communication strategy, we</title>
        <p>investigate the options for using a given clause in an inference. The first thing we
discover from the definition of ordered resolution is that a clause can only be main premise
and not side premise if some literal is selected by the selection function. By exploiting
the other requirements we obtain the decision tree in Figure 2 (assuming the decision
on the resolvable literal is not affected by unification).</p>
        <p>
          If nothing is selected, only a maximal literal can be resolved. If there are multiple
maximal literals, then the clause cannot be a side premise. Problematic are multiple
negative maximal literals because any of them can be the resolved literal in the next
derivation. If all multiple maximal literals are positive the clause cannot be premise
for ordered resolution because it can neither be a side premise nor a main premise. A
given clause can never be side premise and main premise at same time (in different
inferences). If the selection function is empty, the first branch on the right is pruned.
If the order is total on all literals, the branch in the middle is pruned. By exploiting
the effects of the selection and ordering on the clause tree, two problematic cases are
identified:
? Multiple selected literals: If literals from different moduls are selected, we cannot
allocate the inference to one module.
?? Multiple negative maximal literals, none selected: We have to send the clause to
every module that is responsible for derivations resolving upon one of these literals.
Note that these restrictions apply to the unified clause: The definition of ordered
resolution requires each resolved atom Aσ to be (strictly) maximal with respect to Dσ (Cσ)
but this does in general not imply that A is (strictly) maximal w.r.t. the premise before
substitution. However, the ordering defined for RALC is admissible which implies
invariance under substitutions and the selection function is independent of substitutions,
too. For guaranteeing termination of RALC , ordering and selection were defined in a
way that restricts the clauses to five different types [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]:
1
2a
2b
3
4
5
        </p>
        <p>clause type
R(x, f (x)) ∨ P(x)</p>
        <p>P(x)</p>
        <p>P1(f (x)) ∨ P2(x)
¬R(x, y) ∨ P1(x) ∨ P2(y)</p>
        <p>P(a)
(¬)R(a, b)
type of resolvable literal</p>
        <p>R(x, f (x))
(¬)P (x)
(¬)P (f (x))
¬R(x, y)
(¬)P (a)
(¬)R(a, b)</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Obviously, ALC clauses never contain more than one selected literal, thus, the first</title>
      <p>problem does not occur in RALC . The second problem can be avoided, too, by either
applying a total order on literals or selecting one of the multiple maximal literals. The
effect is similar because considering the possible inferences, a selected literal is treated
like a strictly maximal literal. We choose the former and extend the ordering defined
for RALC as follows.</p>
      <p>Definition 10 (Ordering for Distributed Resolution).</p>
      <p>For two literals of ALC clauses A and B, A D B iff A B or A and B are
incomparable wrt. and i) A contains a function symbol but not B or ii) A and B</p>
      <p>Module A
(1) Set(f (x4)), ¬Set(x4)
(2) Set(x1), ¬P air(x1)</p>
      <p>Module B
¬part(x2, x3), ¬Set(x3), ¬P air(x2) (1)
part(x4, f (x4)), ¬Set(x4) (2)</p>
      <p>P air(a) (3)
(3) ¬Set(f (x4)), ¬Set(x4), ¬P air(x4)←¬Set(f (x4)), ¬Set(x4), ¬P air(x4) (4 1+2)
(1+3 4) ¬Set(x4), ¬Set(x4), ¬P air(x4)
(2+4 5) ¬P air(x4)
→
¬P air(x4) (5)
(6 3+5)
contain the same number of function symbols and the top predicate of A precedes the
top predicate of B.</p>
      <p>The calculus RALC with extended ordering D is denoted by RADLC .</p>
      <sec id="sec-4-1">
        <title>The literals that remain incomparable wrt.</title>
        <p>therefore hosted by the same module.</p>
      </sec>
      <sec id="sec-4-2">
        <title>D have the same top predicate and are</title>
        <p>Corollary 2 (Distributed Ordered Resolution).</p>
        <p>Distributed Ordered Resolution, i.e. the method (RADLC , a) with allocation a defined by
a(c) = part(topSymbol(resolvableLiteral(c))) based on a partitioning of predicates
part that maps every predicate to a module identifier is a distributed resolution method.</p>
      </sec>
      <sec id="sec-4-3">
        <title>Since the resolvable literal is unique, the allocation is a function. The allocation restric</title>
        <p>tions are satisfied because for any pair of clauses that are premises of an inference the
resolvable literals have the same top symbol and are thus allocated to the same module.
Hence, if RADLC is complete and terminates, by Corollary 1 the same holds for (RADLC , a).
Theorem 2 (Completeness and Termination of Distributed Ordered Resolution).
Distributed Ordered Resolution is a complete and terminating method for deciding
ALC satisfiability.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Proof. The theorem follows from completeness and termination of RALC (Theorem 1).</title>
      <p>We have to show that RADLC is complete and terminates. The definition of ordered
resolution allows for an arbitrary ordering of literals, hence RADLC is complete. We only have
to make sure that Theorem 1 is not affected by this modification and hence termination
is preserved. In fact, the requirements R(x, f (x)) D ¬C(x) and D(f (x)) D ¬C(x)
still hold for all function symbols f , and predicates R, C, and D.</p>
      <sec id="sec-5-1">
        <title>Consider Figure 2 for an example refutation.</title>
        <sec id="sec-5-1-1">
          <title>Experiments</title>
          <p>The algorithm presented above is not yet implemented in an actually distributed
reasoner, but first experiments were conducted simulating the distributed setting. For
investigating distributed ordered resolution we implemented the definitorial form
normalization and modified the resolution reasoner SPASS to perform the RADLC
resolution. The crucial point for determining performance of distributed ordered resolution is
the number of propagated clauses. If the modules happen to be completely independent,
performance would be improved by a factor propotional to the number of modules. But,
in realistic settings the modules will allways share some symbols and clauses will be
exchanged between them. Propagation of a derived clause to another module is
considerably more expensive then just adding it to the local clause list. Apart from the distance
that has to be bridged over, a received clause requires modification of the local index
structures that are used to manage the clause lists. Hence, the ratio of derived clauses
that have to be propagated to another module is the dicisive criterion.
5.1</p>
          <p>Setting
For our experiments we used the FOODSWAP1 ontology. First we transformed the
ontology into its definitorial form. Then we used our partitioning tool PATO for assigning
each term (i.e. concept or property name) a module number. PATO first builds a
dependency graph from the ontology that is afterwards partitioned into parts that are stronger
internally connected than extenally connected2. For minimizing the communication we
configured Pato to create a link in the dependency graph for every pair of terms
appearing in the same axiom. Finally we started the SPASS reasoner with appropriate
configuration to check consistency of the ontology. For every ordered resolution
derivation we recorded the module that performed the derivation (i.e. the module number of
the resolved literal) and the module to which the derived clause is send (i.e. the module
number of the resolvable literal of the derived clause).
5.2</p>
          <p>Results
Analysing the list of source and destination of every derived clause, we found that out of
the 229 derived clauses 110 were kept in the same module, the other 119 clauses were
propagated to another module. The complete picture of the communication is depicted
in Figure 4. The communication load of 52% of the derived clauses may already be
acceptable in many applications, but still further reduction can be achieved by merging
modules as depicted in the subsequent graphs. In the three module setting only 28% of
the clauses are propagated and 26% for the two modules while the sizes remain
balanced.</p>
        </sec>
      </sec>
      <sec id="sec-5-2">
        <title>1 http://www.mindswap.org/dav/ontologies/commonsense/food/foodswap.owl</title>
      </sec>
      <sec id="sec-5-3">
        <title>2 A detailed description of PATO can be found in [7]</title>
        <p>52%
28%</p>
        <p>Positive tests like this consistency check require complete saturation whereas
negative test are often answered faster by resolution reasoners. For our small test case the
time required for reasoning is dominated by the time for loading the input, but the
number of derived clauses is smaller for positive subsumption checks (i.e. derivation of the
empty clause). E.g. testing the subsumption (meat v f ood) by adding its converse
(meat ∧ ¬f ood) to the clauses required only 45 derivations (with 47% propagation in
the four module setting). In contrary, the negative subsumption test (eggs v dairy) is
similar to the positive consistency check (226 derivations, 115 propagations).
6</p>
        <sec id="sec-5-3-1">
          <title>Optimizations</title>
        </sec>
      </sec>
      <sec id="sec-5-4">
        <title>The first results of distributed ordered resolution are very encouraging. However, additional tests with different larger ontologies are required and further reduction of the propagation ratio is desirable.</title>
        <p>6.1</p>
        <p>Partitioning
One of the next steps for improving performance will be to optimize the partitioning wrt.
communication. In our experiment we started with a partitioning generated by PATO
and than reduced communication by merging modules. In the same way we could start
with all modules containing only a single term and find the optimal partitioning wrt. a
given reasoning task. Finding the optimal partitioning is an apparently hard problem,
but approximations will probably meet the needs.
6.2</p>
        <p>Selection Function</p>
      </sec>
      <sec id="sec-5-5">
        <title>Another topic of investigation is the selection function. Up to now, it selects at most one</title>
        <p>literal. Thus, a derived clause is very likely to be moved to another module. It would
be more efficient to select local literals first. Moreover, if we select multiple literals,
we can combine multiple inferences to a hyperresolution inference and skip redundant
intermediate results. By investigating the requirements necessary to make sure the set
of ALC clauses is closed under RALC , we can improve the selection function:
Definition 11 (Adapted Distributed Selection).
1. Select nothing from clauses of type 1 and 5.
2. From type 3 clauses select the negative binary literal.
3. From clauses of type 2b select the maximal number of negative literals of type</p>
        <p>P (f (x)) hosted by a single module.
4. From clauses of type 2a and 4 select the maximal number of negative literals hosted
by a single module.</p>
      </sec>
      <sec id="sec-5-6">
        <title>We may not select all local literals but selecting all literals hosted by one module has a similar effect. First the clause is moved to that module, but then all local literals are solved in the next inference.</title>
        <p>6.3
Reduction</p>
      </sec>
      <sec id="sec-5-7">
        <title>For efficient resolution reasoning it is not sufficient to avoid unneccessary derivations.</title>
        <p>Additionally, reduction techniques have to be applied that delete clauses if they are
tautologies or subsumed by other clauses. Deleting newly derived clauses that are
subsumed is forward subsumption, backward subsumption refers to the deletion of clauses
that are subsumed by a newly derived clause. The na¨ıve approach to distributed
subsumption would be to propagate new clauses to all modules that host one of its literals
for forward subsumption checks and send it to modules that are responsible for literals
equal or smaller than the resolvable literal of the new clause for backward subsumtion.</p>
      </sec>
      <sec id="sec-5-8">
        <title>Fortunately we can again take advantage of the clause typology. Of all different clause types only type 2a can be subsumed by a clause of different type, every other clause can only be subsumed by a clause of the same type. Thus, by indexing the clauses with their types we can skip a large part of the subsumption tests.</title>
        <p>7</p>
        <sec id="sec-5-8-1">
          <title>Summary</title>
          <p>We addressed the problem of improving the scalability of description logic reasoning
by proposing a distributed resolution method for ALC terminologies. The algorithm
extends ordered resolution with a method for assigning clauses to a unique location at
which all possible resolution steps are executed by a local solver. We analyzed the
properties of such a distributed method and identified necessary conditions for guaranteeing
completeness and termination of the algorithm. Further, we have shown that the
resolution method for ALC described by Motik satisfies these conditions and can therefore be
distributed across different distributed reasoners. Our investigations lay the foundation
for the implementation of a large scale reasoning infrastructure for ALC terminologies
and can be seen as a first step towards supporting the vision of the semantic web as
a distributed system of interlinked ontologies that can be reasoned upon. First
experiments on the performance of the proposed algorithm are promising but more detailed
investigation is necessary. In particular, we have to find a way to control the cost of
communication between local reasoners. The major task in this context is the definition
of a suitable distribution strategy based on the nature of the terminologies involved.
Further, we have to investigate possible optimizations of the reasoning method some
of which are already mentioned in Section 6. Another major issue is the development
of distributed reasoning methods for more expressive languages – the final goal is to
support reasoning in OWL which is known to be equivalent to SHOIN (D). A direct
extension of the approach proposed here is not possible because dealing with number
restrictions requires a different resolution approach, i.e. the use of paramodulation. In
this context an assignment of clauses to a certain reasoner is not feasible, duplication
not be completely avoided. An adaption of our approach with limited duplication to a
calculus that supports equality is the focus of future work.</p>
          <p>Acknowledgement</p>
        </sec>
      </sec>
      <sec id="sec-5-9">
        <title>This work was partially supported by the German Science Foundation in the Emmy</title>
      </sec>
      <sec id="sec-5-10">
        <title>Noether Program under contract Stu 266/3-1.</title>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>E.</given-names>
            <surname>Amir</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>McIlraith</surname>
          </string-name>
          .
          <article-title>Partition-based logical reasoning for first-order and propositional theories</article-title>
          .
          <source>Artificial Intelligence</source>
          ,
          <volume>162</volume>
          (
          <issue>1-2</issue>
          ):
          <fpage>49</fpage>
          -
          <lpage>88</lpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>A.</given-names>
            <surname>Borgida</surname>
          </string-name>
          and
          <string-name>
            <given-names>L.</given-names>
            <surname>Serafini</surname>
          </string-name>
          .
          <article-title>Distributed description logics: Assimilating information from peer sources</article-title>
          .
          <source>Journal of Data Semantics</source>
          ,
          <volume>1</volume>
          :
          <fpage>153</fpage>
          -
          <lpage>184</lpage>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>Bernardo</given-names>
            <surname>Cuenca</surname>
          </string-name>
          <string-name>
            <surname>Grau</surname>
          </string-name>
          , Bijan Parsia, and
          <string-name>
            <given-names>Evren</given-names>
            <surname>Sirin</surname>
          </string-name>
          .
          <article-title>Combining owl ontologies using econnections</article-title>
          .
          <source>Journal Of Web Semantics</source>
          ,
          <volume>4</volume>
          (
          <issue>1</issue>
          ),
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Ewing L. Lusk</surname>
          </string-name>
          , William W.
          <string-name>
            <surname>McCune</surname>
            ,
            <given-names>and John</given-names>
          </string-name>
          <string-name>
            <surname>Slaney</surname>
          </string-name>
          .
          <article-title>Roo: a parallel theorem prover</article-title>
          .
          <source>In In Proceedings of the 11th CADE</source>
          , volume
          <volume>607</volume>
          <source>of LNAI</source>
          , pages
          <fpage>731</fpage>
          -
          <lpage>734</lpage>
          . Springer Verlag,
          <year>1992</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>C.</given-names>
            <surname>Lutz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Walther</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Wolter</surname>
          </string-name>
          .
          <article-title>Conservative extensions in expressive description logics</article-title>
          .
          <source>In Proceedings of the Twentieth International Joint Conference on Artificial Intelligence IJCAI07</source>
          . AAAI Press,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>Boris</given-names>
            <surname>Motik</surname>
          </string-name>
          .
          <article-title>Reasoning in Description Logics using Resolution and Deductive Databases</article-title>
          .
          <source>PhD thesis</source>
          , Universita¨t
          <source>Karlsruhe (TH)</source>
          , Karlsruhe, Germany,
          <year>January 2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>Heiner</given-names>
            <surname>Stuckenschmidt</surname>
          </string-name>
          and
          <string-name>
            <given-names>Michel</given-names>
            <surname>Klein</surname>
          </string-name>
          .
          <article-title>Structure-based partitioning of large concept hierarchies</article-title>
          .
          <source>In Sheila A. McIlraith</source>
          ,
          <string-name>
            <surname>Dimitris Plexousakis</surname>
          </string-name>
          , and Frank van Harmelen, editors,
          <source>Proceedings of the Third International Semantic Web Conference (ISWC</source>
          <year>2004</year>
          ), pages
          <fpage>289</fpage>
          -
          <lpage>303</lpage>
          , Hiroshima, Japan, nov
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>Tanel</given-names>
            <surname>Tammet</surname>
          </string-name>
          .
          <article-title>Resolution methods for Decision Problems and Finite Model Building</article-title>
          .
          <source>PhD thesis</source>
          , Chalmers University of Technology and University of Go¨teborg,
          <year>1992</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>