<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Optimised Absorption for Expressive Description Logics</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Andreas Steigmiller</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Birte Glimm</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Thorsten Liebig</string-name>
          <email>liebig@derivo.de</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Ulm</institution>
          ,
          <addr-line>Ulm, Germany, &lt;first name&gt;.&lt;last</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>derivo GmbH</institution>
          ,
          <addr-line>Ulm</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The Web Ontology Language (OWL 2) [19] is based on the very expressive Description Logic (DL) SROIQ [3] for which sophisticated algorithms are required to handle (standard) reasoning tasks. In practice, variants of tableau algorithms are often used since they are easily extensible and adaptable. Since such algorithms have a very high worst-case complexity, developing optimisations to nevertheless allow for highly e cient implementations is a long-standing research area in DLs (see, e.g., [4,18]). A very e ective and widely implemented optimisation is absorption, which is a preprocessing step that aims at rewriting general concept inclusion (GCI) axioms such that non-determinism in the tableau algorithm is avoided as much as possible. In this paper, we present an improved variant of a recursive binary absorption algorithm that generalises the well-known techniques of binary absorption [7] and role absorption [17]. The algorithm also allows for absorbing parts of concepts and, as a result, more expressive concept constructors can be handled and non-determinism can often be delayed further. The algorithm has already been introduced in our previous work [12] as an essential pre-requisite for handling nominal schemas [8]. Here we simplify its presentation, prove the correctness, present several extensions for the algorithm, and provide a comparison with other absorption techniques. The algorithm is implemented in the tableau-based reasoner Konclude [15] and is essential for Konclude's e ciency. In fact, many other optimisation techniques in Konclude are based on the presented absorption and they significantly benefit from the reduced or eliminated non-determinism [13,14]. We next introduce the basics of tableau algorithms and absorption. In Section 3, we introduce our recursive absorption algorithm for which we present several extension in Section 4. We discuss related work in Section 5 before we conclude in Section 6.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Model construction calculi, such as tableau [
        <xref ref-type="bibr" rid="ref3 ref6">3,6</xref>
        ], decide the consistency of a
knowledge base K by trying to construct an abstraction of a model for K , a so-called
completion graph. A completion graph G is a tuple (V; E; L; ,˙ ), where each node x 2 V (edge
hx; yi 2 E) represents one or more (pairs of) individuals. Each node x (edge hx; yi) is
labelled with a set of concepts (roles), L(x) (L(hx; yi)), which the (pairs of)
individuals represented by x (hx; yi) are instances of. The relation ,˙ records inequalities, which
must hold between nodes, e.g., due to at-least cardinality restrictions. The algorithm
works by decomposing concepts in the completion graph with a set of expansion rules.
For example, if C1 t C2 2 L(v) for some node v, but neither C1 2 L(v) nor C2 2 L(v),
then the rule for handling disjunctions non-deterministically adds one of the disjuncts
to L(v). Similarly, if 9r:C 2 L(v), but v does not have an r-successor with C in its
label, then the algorithm expands the completion graph by adding the required successor
node. Unrestricted application of this rule or the rule that handles at-least cardinality
restrictions can lead to the introduction of infinitely many new tableau nodes. To
nevertheless guarantee termination, a cycle detection technique called (pairwise) blocking
[
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] restricts the application of these rules. The rules are repeatedly applied until either
the completion graph is fully expanded (no more rules are applicable), in which case
the graph can be used to construct a model that witnesses the consistency of K , or an
obvious contradiction (called a clash) is discovered (e.g., both C and :C in a node
label), proving that the completion graph does not correspond to a model. The input
knowledge base K is consistent if the rules can be applied such that they build a fully
expanded and clash-free completion graph.
      </p>
      <p>In order to guarantee that each node of the completion graph indeed satisfies all
axioms of the TBox, one can “internalise” the TBox into a concept that is added to each
node label. For example, if the TBox contains the axioms</p>
      <p>A v 9r:(B1 u B2)
(1)</p>
      <p>A u B1 u 9r:(B1 t B2) v 9s:A
(2)
the internalisation contains one conjunct for each axiom that is a disjunction with the
negated left-hand side of the axiom and the right-hand side: nnf( (:A t 9r:(B1 u B2)) u
(:(A u B1 u 9r:(B1 t B2)) t 9s:A) ). Given that &gt; is satisfied at each node, a tableau
rule can be specified that adds such an internalised concept, say CI , with an auxiliary
axiom of the form &gt; v CI to each node label. Clearly, internalisation introduces a
large number of disjunctions in each node label, which possibly require several
nondeterministic choices and backtracking if the choice resulted in a clash.
2.1</p>
      <sec id="sec-1-1">
        <title>Absorption</title>
        <p>Instead of internalising the TBox axioms, it is more e cient to integrate a rule into the
tableau calculus that checks, for each GCI C v D, whether C is satisfied for a node and
if this is the case, then D is also added to the node label. For Axiom (1), for example, it
is easy to check whether the atomic concept A is satisfied at a node. This lazy unfolding
for atomic concepts can be realised by additionally using Rv1 of Table 1 in the tableau
algorithm. With this rule, we do not have to internalise axioms of the form A v C.</p>
        <p>Checking whether a complex left-hand side of an axiom is satisfied can, however,
be non-trivial. For example, it can often not be verified syntactically, whether a node
satisfies 9r:(B1 t B2), which would be required for Axiom (2). For example, with
Axioms (1) and (2), any instance of A has an r-successor that satisfies B1 u B2 and,
therefore, this A instance (semantically) also satisfies 9r:(B1 t B2) (but possibly not B1). In
order to nevertheless avoid the internalisation of such GCIs into axioms of the form
&gt; v nnf(:C t D) (and the resulting processing of disjunctions), practical reasoning
systems use elaborate transformations in a preprocessing step called absorption.</p>
        <p>An absorption algorithm rewrites axioms in such a way that internalisation and the
processing of disjunctions in the tableau algorithm are avoided as much as possible.
To achieve this, the absorption algorithm extracts conditions for an axiom that can
easily be checked, and only if the conditions hold for a node in a completion graph,
then the remaining (complex) part of the axiom has to be added to the node’s label,
while otherwise the axiom is trivially satisfied. For example, consider the
internalisation &gt; v :A t :B1 t 8r:(:B1 u :B2) t 9s:A of Axiom (2). We can observe that any
node in a tableau that does not have an r-neighbour trivially satisfies 8r:(:B1u:B2) and,
hence, the overall disjunction. Furthermore, even if there is an r-successor, this is only
problematic, if the successor satisfies B1 or B2. To check this, we can introduce axioms
that make sure that a fresh marker concept T1 is added to each node label that contains
B1 or B2, which then triggers the propagation of another marker T2 to the r-predecessor
to indicate that one of the other disjuncts :A, :B1, or 9s:A has to be satisfied. Since it
can also easily be checked whether A or B1 is satisfied, we can use the axioms:</p>
        <p>B1 v T1 B2 v T1 T1 v 8r :T2 A u B1 u T2 v 9s:A
to express Axiom (2) without any disjunction. We say that :A, :B1, and 8r:(:B1 u:B2)
are completely absorbable since they can be moved out of the disjunction, whereas 9s:A
is non-absorbable, i.e., it cannot easily be checked when the concept is satisfied and,
hence, the concept has to remain on the right-hand side of the axiom.</p>
        <p>
          Note that the last axiom cannot be handled by the lazy unfolding rule Rv1 since its
left-hand side consists of a conjunction of atomic concepts. Instead of adding a rule that
processes axioms of the form A1 u: : :uAn v C with Ai atomic concepts, an e cient way
of handling such axioms is binary absorption [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. Binary absorption splits such axioms
such that A1 and A2 imply a fresh concept T1 (A1 u A2 v T1), then T1 is combined with
the next condition A3 and so on, until Tn 2 u An v Tn 1. The concept Tn 1 then implies
the concept C. Hence, rule Rv2 in Table 1 is su cient for handling such binary axioms.
3
        </p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>A Recursive Algorithm for Partial Absorption</title>
      <p>
        We next present an improved variant of a recursive binary absorption algorithm. In
comparison to other absorption techniques, the presented approach is often able to
further delay non-determinism. The recursion also facilitates further optimisations such as
backward chaining for the handling of nominal schemas [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
      </p>
      <sec id="sec-2-1">
        <title>Algorithm 1 absorbTBox</title>
        <p>Input: T : a TBox, i.e., a set of GCIs C v D
Output: T 0: a new TBox with absorbed axioms of T
1: T 0 ;
2: for all C v D 2 T do
3: S fA j A = absorb(C0; T 0); C0 2 PA(nnf(:C t D))g
4: A join(S ; T 0)
5: fD1; : : : ; Dmg notCA(nnf(:C t D))
6: T 0 T 0 [ fA v D1 t : : : t Dmg
7: end for</p>
        <p>
          Due to the recursion, the presented algorithm works quite di erently compared to
the original binary absorption [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. We exploit this for the integration of further
improvements. The presented algorithm completely handles an axiom C v D in one step, i.e.,
the axiom is first internalised into &gt; v :C tD and then the absorbable (sub-)concepts of
:C t D are recursively handled. Each recursion step creates (simple) inclusion axioms
for the handled (sub-)concepts such that corresponding marker concepts are implied,
and the final marker concept (Tn 1) is used to imply the non-absorbable part of the
axiom. In contrast, the original binary absorption algorithm introduces new GCIs during
the absorption of an axiom, which are then (possibly) further absorbed in separate steps.
        </p>
        <p>This complete handling of axioms allows for partially absorbing parts of axioms
without creating additional disjunctions. Roughly speaking, this partial absorption also
uses parts of non-absorbable concepts as conditions to delay the processing of the
nonabsorbable part of the axiom. For example, the axiom &gt; 5 r:A v D is, without
absorption, handled as &gt; v 6 4 r:A t D. None of the disjuncts can be absorbed completely,
but it is nevertheless possible to delay the processing of the disjunction until there is an
r-neighbour with the concept A in its label. Partial absorption is able to extract these
conditions, and rewrites the axiom such that the disjunction is propagated from a node
with A in its label to all r -neighbours (if there are any), which results in A v 8r :T1
and T1 v 6 4 r:A t D. Hence, partial absorption can significantly improve the original
binary absorption for more expressive Description Logics.</p>
        <p>We now make important terms for our partial absorption algorithm clearer.
Definition 1. A concept C is completely absorbable if (i) C = :fag for a a nominal, (ii)
C = :A for A an atomic concept, (iii) C = C1 t : : : tCn or C = C1 u : : : uCn and, for 1
i n, Ci is completely absorbable, or (iv) C = 8r:D and D is completely absorbable.
Otherwise C is not completely absorbable. A concept C is partially absorbable if C is
completely absorbable or (i) C = 8r:D, (ii) C = 6 n r:D, (iii) C = C1 t : : : t Cn and,
for some i, 1 i n, Ci is partially absorbable, or (iv) C = C1 u : : : u Cn and, for each
i, 1 i n, Ci is partially absorbable. For a disjunction C = C1 t : : : t Cn, we set
notCA(C) :=fCi j 1 i n; Ci is not completely absorbableg</p>
        <p>PA(C) :=fCi j 1
i</p>
        <p>n; Ci is partially absorbableg:
Note that if a concept C is not partially absorbable, then it is also not completely
absorbable. In the following, we describe the absorption algorithm.</p>
        <p>Algorithm 1 (absorbTBox) takes as input a TBox T and produces a TBox T 0 such
that each axiom in T 0 has the form A v C, A1 u A2 v C, &gt; v C, or fag v C, i.e., T 0
can be e ciently handled by the rules of Table 1 or as concept assertion. The method
absorbTBox considers each axiom C v D 2 T as disjunction in negation normal
form. Each partially absorbable disjunct from PA(nnf(:C t D)) is passed to the function
absorb (Algorithm 2), which (recursively) performs the (binary) absorption and returns
a single atomic concept that represents the disjunct. The concepts are then joined and
used to imply the non-absorbable part of the axiom in Line 6. Please note that if all
disjuncts of the axiom can be completely absorbed, then an empty disjunction is created
in Line 6, which we consider as ?.</p>
        <p>The method absorb is recursive with the base case of negated concept names (C =
:A) or nominals (C = :fag). For the former, the algorithm directly returns the atomic
concept A. For the latter, the algorithm returns the fresh concept T , which is axiomatised
by fag v T in the TBox constructed during the absorption. For the recursion, partially
absorbable subconcepts of C are first absorbed, which results in an atomic concept for
each such subconcept. The resulting atomic concepts are then joined by Algorithm 3
(join), which implements the binary absorption introduced in Section 3 and which
returns just the final concept (Tn 1).</p>
        <p>Partial absorption also significantly delays non-determinism for axioms that use
more expressive concept constructors. Consider, for example, a TBox T with
9r:(A u 8r:B1) v B2 (3)
as the one and only axiom of the form C v D. Without absorption, it would be
necessary to internalise this axiom into the form &gt; v nnf(:C t D), which would result
in &gt; v 8r:(:A t 9r::B1) t B2. Obviously, this would produce a significant amount
of non-determinism. Clearly, none of the parts of (3) can be absorbed completely, i.e.,
notCA(nnf(:C t D)) = f8r:(:At9r::B1); B2g. The concept 8r:(:At9r::B1), however,
is partially absorbable, i.e., it is in PA(nnf(:C t D)). This allows for delaying the
processing of the disjunctions until there is an r-neighbour with the concept A in its label.
In order to capture this, the presented absorption algorithm rewrites the axiom such that
the disjunction is propagated from a node with A in its label to all r -neighbours as
follows: In Line 3 of absorbTBox, we call absorb(8r:(:At9r::B1); T 0) with T 0 = ;. The
algorithm proceeds with Lines 7–10 and recursively calls itself for :A and T 0, which
returns A, while 9r::B1 is not absorbable. The following call of join(fAg; T 0) directly
returns A (no binarisation is needed). Finally, in Line 10, T 0 is extended with A v 8r :T
for the fresh concept T and T is returned by the algorithm. Back in Algorithm 1, the call
of join(fT g; T 0) again directly returns T . After extending T 0 in Line 6 of Algorithm 1,
we have T 0 = fA v 8r :T; T v 8r:(:A t 9r::B1) t B2g.</p>
        <p>It is clear that an existential restriction of the form 9r:C cannot be absorbed directly
since it does not provide conditions that could be used to trigger other concepts. In
contrast, the absorption can be extended to several concept constructors of more expressive
DLs (see also Section 4). For example, the :9r:Self concept of SROIQ can be partially
absorbed with &gt; v 8r :A and A v :9r:Self.</p>
        <p>Note that, in principle, it is not necessary to always create new axioms with fresh
atomic concepts for the absorption of identical concepts. In practice, the binary
absorption axioms as well as the axioms for absorbing specific concepts can be reused.
Lemma 1. Let T be a TBox and T 0 the TBox produced by Algorithm 1 when given T
as input, then T 0 is a model-conservative extension of T , i.e., (a) every model of T
can be extended to a model of T 0 by adding suitable interpretations for fresh symbols
in T 0 and (b) every model of T 0 is a model of T (T 0 j= T ).</p>
        <p>
          A complete proof is available in an online technical report [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ].
        </p>
        <p>Proof (sketch). Note that by showing T 0 j= T , we do not need to require T T 0 as
usual for (model) conservative extensions.</p>
        <p>Let I be some model of T , C v D 2 T , E a completely (partially) absorbable
disjunction that is an absorbed subconcept of nnf(:C t D), and TE the concept returned
by absorb(E; T 0). For (a), we construct an interpretation J for T 0 that coincides with
I apart from the interpretations of concepts that are fresh in T 0. To show J j= T 0,
we establish (1) (:TE)J = EJ ((:TE)J EJ ). For showing (b), we first show (2)
T 0 j= :TE v E.</p>
        <p>The base case, where each disjunct of a subconcept E = E1 t: : :t En of nnf(:C t D)
is either non-absorbable or of the form :A or :fag, is straightforward. For the induction,
E might also contain other absorbable disjuncts. We just sketch the case of Ei = 6
n r:E0, 1 i n, where absorb(Ei; T 0) returns TEi and T:E0 v 8inv(r):TEi 2 T 0, which
is equivalent to 9r:T:E0 v TEi . We set TEJi = (9r:T:E0 )J , so J j= T:E0 v 8inv(r):TEi .
To show (1): we have (:TEi )J = (8r:(:T:E0 ))J , and, by induction, (8r:(:T:E0 ))J
(6 n r:E0)J = EiJ . For (2): since T 0 j= 9r:T:E0 v TEi , T 0 j= :TEi v 8r:(:T:E0 ). By
induction, T j 0 = :TEi v 6 n r:E0 = Ei as required.</p>
        <p>0 = :T:E0 v E0, hence, T j</p>
        <p>In absorbTBox, we eventually have a concept TE for the disjuncts E1; : : : ; En 2
PA(nnf(:C t D)) and TE v D1 t : : : t Dm 2 T 0. By induction, T 0 j= TE1 u : : : u TEn v TE
and T 0 j= :TEi v Ei for 1 i n. Hence, T 0 j= &gt; v E1 t : : : t En t D1 t : : : t Dm
and, hence, T 0 j= &gt; v nnf(:C t D). To show that J j= TE v D1 t : : : t Dm we
show J j= &gt; v :TE1 t : : : t :TEn t D1 t : : : t Dm since J j= TE1 u : : : u TEn v
TE. W.l.o.g., let E1; : : : ; E` be the completely and E`+1; : : : ; En the partially, but not
completely absorbable disjuncts of E. By induction, (:TEi )J = EiJ ((:TEi )J EiJ ),
for each 1 i ` (` &lt; i n). Since I j= T , I j= &gt; v E1 t : : : t E` t D1 t : : : t Dm
and, hence, J j= &gt; v :TE1 t : : : t :TE` t D1 t : : : t Dm. Since, for ` &lt; i n, Ei = D j
for some 1 j m, J j= &gt; v :TE1 t : : : t :TEn t D1 t : : : t Dm as required.
4</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Extensions to the Absorption Algorithm</title>
      <p>In this section, we sketch several extensions of the above presented absorption
algorithm, which improve the handling of completely defined concepts, optimise the
processing of disjunctions, and allow for absorbing axioms that contain data ranges.
4.1</p>
      <sec id="sec-3-1">
        <title>Handling of Completely Defined Concepts</title>
        <p>In the previous sections, we considered a TBox to be a set of GCIs of the form C v D.
This is w.l.o.g. since C D can be rewritten to C v D and D v C. For definitions of
the form A C with A an atomic concept (also called a completely defined concept),
it is often ine cient to decompose the axiom A C into A v C and C v A, because
nnf(:C) might not be completely absorbable and then the disjunction :C t A has to be
processed for all nodes in the completion graph. In order to determine the satisfiability
of a concept, it is for many nodes not relevant whether A or :A is in their label as long
as it can be ensured that one of both alternatives is not causing a clash. Obviously, if
the atomic concept A is only defined once in the knowledge base and A is acyclic,3
then C or :C and, therefore, also A or :A must be satisfiable and, only in this case,
3 Acyclicity is defined as follows: A1 directly uses A2 w.r.t. a TBox T if A1 C 2 T or
A1 v C 2 T and A2 occurs in C; uses is the transitive closure of “directly uses”. Then, a
concept D is acyclic w.r.t. a TBox T if it contains no concept A that uses itself.
1: for all A C 2 T do
2: TA absorb(PA(nnf(:C)); T 0)
3: if nnf(:C) is completely absorbable then
4: T 0 T 0 [ fA v C; TA v A; TA v A+g
5: else if jfC0 j A v C0 2 T or A C0 2 T gj &gt; 1 then
6: T 0 T 0 [ fA v C; TA v nnf(:C t A); TA v A+g
7: else
8: T 0 T 0 [ fA C; TA v A+g
9: end if
10: end for
the axiom A C can directly be handled by an additional rule, which unfolds A to
C and :A to :C. To capture this, we extend the algorithm absorbTBox (cf. Figure 1)
such that axioms of the from A C are only eliminated if there are several definitions
of A. In such cases, A v C and C v A must be explicitly represented in T 0 such
that possible interactions between these several definitions are handled. In addition, a
concept of the form :A is now only completely (partially) absorbable if A is acyclic
and, for each A C 2 T , nnf(:C) is completely (partially) absorbable. Note that the
definition of completely and partially absorbable concepts now refers to the TBox. To
correctly handle the completely defined concepts in the recursive absorption algorithm,
absorb also has to be adapted for concepts of the form :A. The algorithm still returns A
if there is no axiom of the form A C 2 T . Otherwise, for each A C, all absorbable
disjuncts of nnf(:C) are recursively absorbed, the resulting atomic concepts are joined
into a fresh concept TC, and TC v A+ is added to T 0 for a so-called candidate concept
A+ for A. After processing all definitions for A, the candidate concept A+ is returned.
Note that if nnf(:C) is not absorbable, then the absorption returns &gt; and &gt; v A+ is
added to T 0. An occurrence of A+ in the label of a node signalises that the node might
be an instance of the concept A, i.e., if A+ is not in the node label of a clash-free and fully
expanded completion graph, then this node is also not an instance of A. In particular, if
A+ is not in the label of a node, then we know that :A can be safely added since one
disjunct of nnf(:C) is trivially satisfied. Hence, the generated candidate concept A+ can
be used as condition for further absorption if :A occurs and, therefore, it can be used
to further delay branching.</p>
        <p>
          Such candidate concepts are also very useful for identifying completely defined
concepts as possible subsumers for classification (cf. [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]) without forcing the decision
between A and :A for the nodes in the completion graph. As a result, these candidate
concepts can significantly improve the classification performance, especially in
combination with model merging techniques [
          <xref ref-type="bibr" rid="ref13 ref14">13,14</xref>
          ], and, therefore, it is almost always useful
to generate these candidate concepts even if they are not required for further absorption.
The presented absorption algorithm can also be used to optimise the handling of
ordinary disjunctions. For example, the axiom A v 9r:(:B t A) cannot be absorbed and
contains the disjunction :B t A. Obviously, we can replace the disjunction with a fresh
marker concept T by adding the additional axiom T v :B t A to the TBox, which can
be absorbed to T u B v A. Consequently, we have the axioms A v 9r:T and T u B v A,
which can be processed by the tableau algorithm without causing any non-determinism.
Another advantage of the presented absorption algorithm is the straightforward
extensibility to handle datatypes such as strings, integers, or decimals, which represent sets
of concrete data values. In OWL, one can also use datatype restrictions to build custom
datatypes, e.g., the datatype restriction real[int ^ &gt;5] restricts OWL’s datatype owl:real,
which we abbreviate to real, with the facets int and &gt;5 to real numbers that are integers
and greater than 5. We call int ^ &gt;5 a facet expression. One can further use Boolean
constructors to build data ranges, e.g., string _ real[int ^ &gt;5] is a data range that contains
all data values that are strings or reals as described above. Concepts refer to data ranges
via concrete roles, e.g., the formula 9q:real[int ^ &gt;5], for q a concrete role, describes
individuals that have some value of the data range real[int ^ &gt;5] as q-successor.
        </p>
        <p>
          Elements that represent data values are represented in the tableau algorithm by
concrete nodes. For example, for a node v with 9q:real[int ^ &gt;5] 2 L(v), the
tableau algorithm adds a new concrete node z and an edge hv; zi with q 2 L(hv; zi) and
real[int ^ &gt;5] 2 L(z). Typically, the tableau algorithm is then combined with a datatype
checker that checks the consistency of datatype constraints [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. Such constraints are
represented as a set of assertions of the form d(z), :d(z), and z1 0 z2, where z, z1, and
z2 are concrete nodes and d is an explicit enumeration of data values fc1; : : : ; cng or an
expression of the form dt['] for dt a datatype and ' a facet expression. Given a set of
such assertions, the datatype checker determines whether each concrete node can be
assigned a data value in a way that satisfies all of the given assertions.
        </p>
        <p>
          It is well-known that facets can be used to encode disjunctions [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]. For example,
A v 8r:B1 t B2 can be expressed by the axioms
        </p>
        <p>
          A v 9q:real[int ^ &gt;5] (4) 9q:real[
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] v 8r:B1 (5) 9q:real[&gt;10] v B2; (6)
where q is a concrete role. For the data range real[int ^ &gt;5], it cannot directly be
determined whether real[
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] or its complement is satisfied. This has to be clarified in
a case-by-case analysis, where once the integer interval int(5; 10] and once the
interval int(10; 1] is considered. Hence, the datatype checker is forced to make
nondeterministic decisions. This also means that it is usually not possible to (completely)
absorb data ranges since it is not easily possible to determine if a certain data range
is satisfied or not. Of course, it is possible to absorb other parts of the axioms, e.g.,
Axiom (5) can be absorbed to
        </p>
        <p>
          &gt; v 8r :T T v 8q:real[
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] t 8r:B1;
where real[
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] expresses the negation of real[
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. However, without any further
optimisations, Axiom (6) introduces non-determinism for every node as it is handled as
&gt; v 8q:real[&gt;10] t B2:
        </p>
        <p>
          For partial absorption it is only required that the tableau algorithm or the datatype
checker can identify those concepts and data ranges that might be satisfied. This is
obviously possible for many data ranges and, therefore, they can be absorbed partially.
For this, the datatype checker has to be extended such that it identifies data range
candidates, i.e., data ranges that are possibly satisfied. Let us assume that real[
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]+ is the
data range candidate for real[
          <xref ref-type="bibr" rid="ref10">10</xref>
          ], i.e., if a concrete node in the completion graph
possibly represents numerical data values that are less or equal to 10, then real[
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]+ is
satisfied and this can be identified e ciently by the datatype checker. Consequently, we
can use such data range candidates in lazy unfolding rules, e.g., real[
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]+ v T is used
to trigger the marker concept T if real[
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] is possibly satisfied. Note that this adds
ordinary concepts to the labels of concrete nodes, which is, however, not problematic
since these concepts are only used internally as auxiliary constructs. With the data range
candidates, we can improve the absorption of Axiom (5) to
        </p>
        <p>
          real[
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]+ v T1 T1 v 8q :T2 &gt; v 8r :T3 T2 u T3 v 8q:real[
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] t 8r:B1
and we can absorb Axiom (6) to
        </p>
        <p>real[&gt;10]+ v T4 T4 v 8q :T5 T5 v 8q:real[&gt;10] t B2:
Now, the processing of the disjunctions is delayed until there are concrete nodes that
possibly represent the absorbed data ranges.</p>
        <p>
          The required extensions for the absorption algorithm and the datatype checker are
straightforward. For the absorption, the definition of partial absorbability has to be
extended to the supported data ranges and the concepts related to datatypes. Moreover,
the absorb function has to be adjusted such that the corresponding axioms are created.
The datatype checker has to be extended to identify those data range candidates that
might be satisfied. Usually, the datatype checker manages the data intervals that are still
possible for a concrete node in a sorted array [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. By counting and iterating through
the possible values, it is easily possible to identify those data range candidates that are
satisfied. For example, the possible values of a concrete node z are restricted to the
integer interval int(5; 1] if the data range real[int ^ &gt;5] is asserted to z. If z is not distinct
to other concrete nodes (i.e., there is no assertion z 0 z0), then the datatype checker can
pick the first possible data value of 6 to satisfy the constraints for z. Hence, the datatype
checker has to trigger only those data range candidates that are satisfied for this first
data value (i.e., real[
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]+). If z is distinct to 5 other concrete nodes, then real[&gt;10]+
is also triggered (although it is not necessarily the case that indeed 6 data values are
required to assign all (distinct) concrete nodes a data value).
5
        </p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Comparison with Related Absorption Techniques</title>
      <p>In comparison to other absorption techniques, partial absorption often delays
nondeterminism further. Consider again Axiom (3) from Section 3, which is rewritten by
the original binary absorption algorithm into the following axioms:4</p>
      <p>A v 9r::B1 t 8r :T</p>
      <p>T v B2:
In contrast, partial absorption creates the following axioms (as shown in Section 3):
A v 8r :T</p>
      <p>
        T v 8r:(:A t 9r::B1) t B2:
4 The binary absorption algorithm presented in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] has a looping bug for concepts of the form
8r:C, which can, however, easily be repaired. We simply ignored the repeated absorption of
concepts that are introduced by the algorithm.
      </p>
      <p>
        Hence, with the original binary absorption technique, the disjunction has already to be
processed when A is added to the label of a node, whereas partial absorption delays
the processing until there is also an r-neighbour. It is also well-known that
clausification for the hypertableau algorithm reduces many cases of non-determinism [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. For
Axiom (3), however, clausification creates the following DL-clauses:
      </p>
      <p>r(x; y) ! B2(x) _ T (y) T (x) ^ A(x) ! 9r::B1(x)
Hence, the hypertableau algorithm has to choose between B2 and T for every node with
an r-neighbour, whereas the partial absorption technique allows for further delaying the
processing of disjunctions until the node also has the concept A in its label. Besides
that, the presented partial absorption technique shares other interesting features with
clausification of the hypertableau algorithm. In particular, both use conditions of more
expressive concept constructors to trigger the processing of disjunctions. For instance,
the axiom &gt; 2 r:A v B is automatically satisfied if there is no r-successor with the
concept A in its label. This is utilised by partial absorption, which generates</p>
      <p>A v 8r :T T v 6 1 r:A t B
and also by clausification of the hypertableau calculus, which generates:
r(x; y1) ^ A(y1) ^ r(x; y2) ^ A(y2) ! B(x) _ y1 y2
Note that clausification eliminates at-most cardinality restrictions, whereas partial
absorption has to use the concept 6 1 r:A since &gt; 2 r:A is not completely absorbable.
Consequently, a choose-rule is not required in the hypertableau algorithm, which possibly
further reduces non-determinism in comparison to a standard tableau algorithm that has
to choose between :A and A for all r-neighbours of nodes with T in their label.</p>
      <p>By using a recursive algorithm that absorbs GCIs completely in one step, we gain
a further advantage in comparison to traditional absorption algorithms. Traditional
algorithms absorb parts of axioms in separate steps, e.g., by introducing new GCIs that
are handled separately, which often creates additional disjunctions, especially if the
axioms have more complex (sub-)concepts. For instance, by simplifying the axiom
&gt; 5 r:(&gt; 3 s:A) v D into &gt; 5 r:T v D and T &gt; 3 s:A, for T a fresh atomic concept
that is used to split the original axiom, the GCI T &gt; 3 s:A cannot further be absorbed
without creating additional disjunctions. In contrast, our recursive absorption algorithm
creates the axioms A v 8s :T1, T1 v 8r :T2, and T2 v 6 4 r:(&gt; 3 s:A) t D, which
delays the processing of the disjunction significantly (or avoids it completely if there is no
node in the completion graph for which T2 is added to its label).
6</p>
    </sec>
    <sec id="sec-5">
      <title>Conclusions</title>
      <p>
        In this paper, we have presented the partial absorption algorithm used in the
reasoning system Konclude together with several extensions. The candidate concepts can, for
example, be combined with model merging techniques to reduce Konclude’s
classification time for several thousand ontologies by 60 % on average [
        <xref ref-type="bibr" rid="ref13 ref14">13,14</xref>
        ], which allows
Konclude to often outperform other state-of-the-art reasoners. The presented absorption
techniques are also essential for further optimisations such as an e cient handling of
nominal schemas [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] or the combination of tableau with saturation-based procedures
[
        <xref ref-type="bibr" rid="ref13 ref14">13,14</xref>
        ], where the reduced non-determinism is crucial.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Baader</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Calvanese</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McGuinness</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nardi</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Patel-Schneider</surname>
            ,
            <given-names>P</given-names>
          </string-name>
          . (eds.):
          <article-title>The Description Logic Handbook: Theory, Implementation, and Applications</article-title>
          . Cambridge University Press, second edn. (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Glimm</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Horrocks</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Motik</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shearer</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stoilos</surname>
          </string-name>
          , G.:
          <article-title>A novel approach to ontology classification</article-title>
          .
          <source>J. of Web Semantics: Science, Services and Agents on the World Wide Web</source>
          <volume>14</volume>
          ,
          <fpage>84</fpage>
          -
          <lpage>101</lpage>
          (
          <year>July 2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Horrocks</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kutz</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sattler</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          :
          <article-title>The even more irresistible SROIQ</article-title>
          .
          <source>In: Proc. 10th Int. Conf. on Principles of Knowledge Representation and Reasoning (KR'06)</source>
          . pp.
          <fpage>57</fpage>
          -
          <lpage>67</lpage>
          . AAAI Press (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Horrocks</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Patel-Schneider</surname>
            ,
            <given-names>P.F.</given-names>
          </string-name>
          :
          <article-title>Optimizing description logic subsumption</article-title>
          .
          <source>J. of Logic and Computation</source>
          <volume>9</volume>
          (
          <issue>3</issue>
          ),
          <fpage>267</fpage>
          -
          <lpage>293</lpage>
          (
          <year>1999</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Horrocks</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sattler</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          :
          <article-title>A description logic with transitive and inverse roles and role hierarchies</article-title>
          .
          <source>J. of of Logic and Computation</source>
          <volume>9</volume>
          (
          <issue>3</issue>
          ),
          <fpage>385</fpage>
          -
          <lpage>410</lpage>
          (
          <year>1999</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Horrocks</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sattler</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          :
          <article-title>A tableau decision procedure for SH OIQ</article-title>
          .
          <source>J. of Automated Resoning</source>
          <volume>39</volume>
          (
          <issue>3</issue>
          ),
          <fpage>249</fpage>
          -
          <lpage>276</lpage>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Hudek</surname>
            ,
            <given-names>A.K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weddell</surname>
            ,
            <given-names>G.E.</given-names>
          </string-name>
          :
          <article-title>Binary absorption in tableaux-based reasoning for description logics</article-title>
          .
          <source>In: Proc. 19th Int. Workshop on Description Logics (DL'06)</source>
          . vol.
          <volume>189</volume>
          .
          <string-name>
            <surname>CEUR</surname>
          </string-name>
          (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Krötzsch</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Maier</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Krisnadhi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hitzler</surname>
            ,
            <given-names>P.:</given-names>
          </string-name>
          <article-title>A better uncle for OWL: nominal schemas for integrating rules and ontologies</article-title>
          .
          <source>In: Proc. 20th Int. Conf. on World Wide Web (WWW'11)</source>
          . pp.
          <fpage>645</fpage>
          -
          <lpage>654</lpage>
          . ACM (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Magka</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kazakov</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Horrocks</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Tractable extensions of the description logic EL with numerical datatypes</article-title>
          .
          <source>J. of Automated Reasoning</source>
          <volume>47</volume>
          (
          <issue>4</issue>
          ),
          <fpage>427</fpage>
          -
          <lpage>450</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Motik</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Horrocks</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>OWL datatypes: Design and implementation</article-title>
          .
          <source>In: Proc. 7th Int. Semantic Web Conf. (ISWC'08)</source>
          . LNCS, vol.
          <volume>5318</volume>
          , pp.
          <fpage>307</fpage>
          -
          <lpage>322</lpage>
          . Springer (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Motik</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shearer</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Horrocks</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Hypertableau reasoning for description logics</article-title>
          .
          <source>J. of Artificial Intelligence Research</source>
          <volume>36</volume>
          ,
          <fpage>165</fpage>
          -
          <lpage>228</lpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Steigmiller</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Glimm</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liebig</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Nominal schema absorption</article-title>
          .
          <source>In: Proc. 23rd Int. Joint Conf. on Artificial Intelligence (IJCAI'13)</source>
          . pp.
          <fpage>1104</fpage>
          -
          <lpage>1110</lpage>
          . AAAI Press/The MIT Press (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Steigmiller</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Glimm</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liebig</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Coupling tableau algorithms for expressive description logics with completion-based saturation procedures</article-title>
          .
          <source>In: Proc. 7th Int. Joint Conf. on Automated Reasoning (IJCAR'14)</source>
          . LNCS, Springer (
          <year>2014</year>
          ), accepted
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Steigmiller</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Glimm</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liebig</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Coupling tableau algorithms for the DL SROIQ with completion-based saturation procedures</article-title>
          .
          <source>Tech. Rep. UIB-2014-02</source>
          , University of Ulm, Ulm, Germany (
          <year>2014</year>
          ), available online at http://www.uni-ulm.de/fileadmin/website_ uni_ulm/iui/Ulmer_Informatik_Berichte/2014/UIB-2014-02.pdf
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Steigmiller</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liebig</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Glimm</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Konclude: system description</article-title>
          .
          <source>J. of Web Semantics</source>
          (
          <year>2014</year>
          ), accepted
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Steigmiller</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liebig</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Glimm</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Optimised absorption for expressive description logics</article-title>
          .
          <source>Tech. rep.</source>
          , Ulm University, Ulm, Germany (
          <year>2014</year>
          ), available online at https://www.uni-ulm.de/fileadmin/website_uni_ulm/iui.inst.090/ Publikationen/2014/StGL14c.pdf
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Tsarkov</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Horrocks</surname>
          </string-name>
          , I.:
          <article-title>E cient reasoning with range and domain constraints</article-title>
          .
          <source>In: Proc. 17th Int. Workshop on Description Logics (DL'04)</source>
          . vol.
          <volume>104</volume>
          .
          <string-name>
            <surname>CEUR</surname>
          </string-name>
          (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Tsarkov</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Horrocks</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Patel-Schneider</surname>
            ,
            <given-names>P.F.</given-names>
          </string-name>
          :
          <article-title>Optimizing terminological reasoning for expressive description logics</article-title>
          .
          <source>J. of Automated Reasoning</source>
          <volume>39</volume>
          ,
          <fpage>277</fpage>
          -
          <lpage>316</lpage>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19. W3C OWL Working Group:
          <article-title>OWL 2 Web Ontology Language: Document Overview</article-title>
          . W3C
          <source>Recommendation (27 October</source>
          <year>2009</year>
          ), available at http://www.w3.org/TR/ owl2-overview/
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>