Optimizing Reasoning with Qualified Number Restrictions in SHQ

                           N. Farsiniamarj and V. Haarslev
                        Concordia University, Montreal, Canada

1 Introduction
Using SHQ one can express number restrictions on role fillers of individuals. A typi-
cal question for a DL reasoner would be whether the concept ∀hasCredit.(Science t
Engineering t Business) u (≥ 120 hasCredit.(Science t Engineering)) u (≥ 140
hasCredit)u(≤ 32 hasCredit.(Science tBusiness))u(≤ 91 hasCredit.Engineering)
is satisfiable. Most DL tableau algorithms [9, 1, 12] test the satisfiability of such a con-
cept by first satisfying all at-least restrictions, e.g., by creating 260 hasCredit-fillers,
of which 120 are instances of (Science t Engineering). Eventually, a nondetermin-
istic choose-rule assigns to each of these 260 individuals (Science t Business) or
¬(Science t Business), and Engineering or ¬Engineering. In case an at-most re-
striction is violated, e.g., a student has more than 91 hasCredit-fillers of Engineering,
a nondeterministic merge-rule tries to reduce the number of these individuals by merg-
ing a pair of individuals until the upper bound specified in this at-most restriction is
satisfied. Searching for a model in such an arithmetically uninformed or blind way is
usually very inefficient.
     Our hybrid calculus is based on a standard tableau for SH [1] modified and ex-
tended to deal with qualified number restrictions and works with an inequation solver
based on integer linear programming. The tableau rules encode number restrictions into
a set of inequations using the so-called atomic decomposition technique [16]. The set
of inequations is processed by the inequation solver which finds, if possible, a minimal
non-negative integer solution (distribution of role fillers constrained by number restric-
tions) satisfying the inequations. The tableau rules ensure that such a distribution of role
fillers also satisfies the logical restrictions.
     Since this hybrid algorithm collects all the information about arithmetic expressions
before creating any role filler, it will not satisfy any at-least restriction by violating an
at-most restriction and there is no need for a mechanism of merging role fillers and
its performance is not affected by the values of numbers occurring in number restric-
tions. Since the solution from the inequation solver satisfies all numerical restrictions
imposed by at-least and at-most restrictions, our calculus needs to create only one so-
called proxy individual (inspired by [6]) representing a set of role fillers. Considering
all these features the proposed hybrid algorithm is well suited to improve average case
performance. Furthermore, in [2, Chapter 6] it has been shown that a tableau procedure
extended by global caching and an algebraic method similar to the one presented in this
paper and in [16, 8] is worst-case optimal for SHIQ. Although the calculus presented
in our paper is different from the one in [2] we conjecture that the result presented in
[2] can be transferred to our work because the use of integer linear programming was
essential to proving worst-case optimality.
     This calculus extends our work presented in [3] by (i) covering Abox consistency
for SHQ with general Tboxes, and (ii) introducing the use of proxy individuals rep-
resenting sets of role fillers. It complements the work in [4] (which extends the work
                           Table 1. Syntax and semantics of SHQ

 Syntax Semantics                                                Syntax Semantics
   A    A I ⊆ ∆ I , A ∈ NC                                      R ∈ NR R I ⊆ ∆ I × ∆ I
  ¬C    ∆I \C I                                                 R ∈ NRT RI = (RI )+
 C u D C I ∩ DI                                                  R v S RI ⊆ S I
          I     I
 C tD C ˘ ∪D                                                     C v D C I ⊆ DI
 ∀R.C ˘x | ∀y : (x, y) ∈ RI¯⇒ y ∈ C I
                                      ¯
                                                                  a : C aI ∈ C I
≥ n R.C ˘x | #RI (x, C) ≥ n ¯                                   (a, b) : R (aI , bI ) ∈ RI
≤ m R.C x | #RI (x, C) ≤ m                                           .
                                                                  a=6 b aI 6= bI

in [3] to ALCOQ) by additionally dealing with role hierarchies and transitive roles.
This paper also discusses practical aspects of using the hybrid method and presents
evaluation results based on an implemented prototype. It also significantly differs from
the work presented in [15, 16] where no formal calculus was proposed and the covered
logic did not support Abox consistency for SHQ with general Tboxes. Our early work
presented in [8] did not deal with Aboxes and was based on a recursive calculus without
using proxy individuals and did not provide any proof for soundness, completeness, or
termination.
2 Preliminaries
2.1 Description Logic SHQ
In the following, three disjoint sets are defined; NC is the set of concepts names;
NR = NRT ∪ NRS is the set of all role names which consists of transitive (NRT )
and non-transitive (NRS ) roles; I is the set of all individuals, while IA ⊆ I is the set
of individuals occurring in an Abox A. A role is called simple if it is neither transitive
nor has any transitive sub-role. In order to remain decidable qualified number restric-
tions are only allowed for simple roles (except for concepts of the form ≥ 1 R.C). In
Table 1, we use a standard Tarski-style semantics I and assume that C, D are arbitrary
concept descriptions, R, S ∈ NR , a, b ∈ IA , and #RI (x, C) denotes the cardinality of
{y | (x, y) ∈ RI , y ∈ C I }. Let v∗ be the transitive, reflexive closure of v over NR . A
concept description C is said to be satisfiable by an interpretation I iff C I 6= ∅. A role
hierarchy R is a set of assertions of the form R v S where R, S ∈ NR .
    A SHQ Tbox T is a finite set of axioms of the form C v D or C ≡ D where
C, D are concept expressions and C ≡ D is the placeholder for {C v D, D v C}.
A Tbox T is satisfiable by an interpretation I iff I satisfies all axioms in T and R. A
SHQ Abox A w.r.t. a Tbox T is a finite set of assertions of the form a : C, (a, b) : R,
         .
and a = 6 b. An Abox A is satisfiable by an interpretation I iff I satisfies T and R and
all assertions in A. Let a, b ∈ I be (possibly unnamed) individuals, given an assertion
(a, b) : R, a is called a predecessor of b and b a successor or role filler of a. The R-fillers
of a are defined as Fil (a, R) = {b | (aI , bI ) ∈ RI }.
2.2 Reducing SHQ to SHN \
Inspired by [16] we transform given qualified number restrictions to unqualified number
restrictions and add a new role-set difference operator ∀(R\R0 ).C with (∀(R\R0 ).C)I =
{x | ∀y : (x, y) ∈ (RI \ R0I ) ⇒ y ∈ C I }. The resulting language is called SHN \ .
     ˙ denote the standard negation normal form (NNF) of ¬C. We define a recursive
Let ¬C
function unQ which rewrites a SHQ concept description into SHN \ . It is important to
note that this rewriting process always introduces for each transformed qualified num-
ber restriction a unique new role.
    The function unQ transforms the input description into its NNF and replaces qual-
ified number restrictions. In the following each R0 is a new role in NR with R :=
R ∪ {R0 v R}: unQ(C) := C if C ∈ NC ; unQ(¬C) := ¬C if C ∈ NC , unQ(¬C)             ˙
otherwise; unQ(∀R.C) := ∀R.unQ(C); unQ(C u D) := unQ(C) u unQ(D);
unQ(C t D) := unQ(C) t unQ(D); unQ(≥ nR.C) := (≥ nR0 ) u ∀R0 .unQ(C);
unQ(≤ nR.C) := (≤ nR0 ) u ∀(R\R0 ).unQ(¬C);     ˙    unQ(a : C) := a : unQ(C).
    It is easy to show that the concept (≤ n R.C) is equisatisfiable with the expression
(∀(R\R0 ).¬C u ≤ n R0 ) with R0 fresh in R and R0 v R (see [5] for the proof). Since
unQ introduces a negation itself, the negated description needs to be converted to NNF
before further applying unQ. Our language is not closed under negation w.r.t. the con-
cept descriptions created by unQ. However, our calculus ensures that these concept
descriptions will never be negated.
3 A Hybrid Abox Tableau Calculus for SHQ
3.1 Preprocessing
As a preprocessing step, the input Abox and Tbox are transformed into SHN \ as
defined in Section 2.2. Similar to [13], indorder to propagate Tbox axioms to all in-
dividuals we define the concept CT := CvD∈T unQ(¬C               ˙ t D) and U as a new
transitive role in the role hierarchy R extended by {R v U | R ∈ NR }. In order to ex-
amine the consistency of A, the algorithm first extends A by {x0 : (CT u ∀U.CT )} ∪
{(x0 , x) : U | x occurs in A}, where x0 is new in A. By this means, we impose the ax-
ioms in the Tbox on all individuals in A. Furthermore, we rewrite the role assertions in
A as follows.
    If x, y ∈ IA then there exists a unique role name Rxy ∈ NR only used to represent
that y is an R-filler of x with Rxy v R ∈ R. In other words whenever (x, z) : Rxy we
have y I = z I . We replace role assertions of the form (b, c) : R by b : (≥ 1 Rbc u ≤
1 Rbc ) because they impose a numerical restriction on individuals. For example, an
assertion (b, c) : R implies there exists exactly one Rbc -filler for b which is c.
3.2 Completion Forest
The closure clos(E) of a concept expression E is the smallest set of concepts such that:
E ∈ clos(E), (¬D) ∈ clos(E) ⇒ D ∈ clos(E), (C t D) ∈ clos(E) or (C u D) ∈
clos(E) ⇒ C ∈ clos(E) and D ∈ clos(E), (∀R.C) ∈ clos(E) ⇒ C ∈ clos(E),
(≥ n R.C) ∈ clos(E) or (≤ m R.C) ∈ clos(E) ⇒ C ∈ clos(E). For a Tbox T we
define clos(T ) such that if (C v D) ∈ T or (C ≡ D) ∈ T then clos(C) ⊆ clos(T )
and clos(D) ⊆ T . Similarly for an Abox A we define clos(A) such that if (a : C) ∈ A
then clos(C) ⊆ clos(A).
    A completion forest F = (V, E, L, LE ) for a SHQ Abox A is composed of a
set of arbitrarily connected nodes as the roots of completion trees (if one ignores the
connections between root nodes). Every node     Px ∈ V is labeled by L(x) ⊆ clos(A) and
LE (x) as a set of inequations of the form ( i∈1..k vi ) ./ n with ./ ∈ {≤, ≥}, n ∈ N,
and vi ∈ V where V is a set of variables (see also below); each edge hx, yi ∈ E is
labeled by the set L(hx, yi) ⊆ NR , and x (y) is called a predecessor (successor or
role filler) of y (x), and the transitively closed set of successors (predecessors) is called
                                           p1 = {R}, p2 = {T }, p4 = {S},
        p1     p5    p4
                                           p3 = {R, T }, p5 = {R, S}, p6 = {S, T },
              RS                           p7 = {R, S, T }
    R                          S
               p7
                                           α(v001 ) = p1 , α(v010 ) = p2 , α(v100 ) = p4 ,
        p3                p6               α(v011 ) = p3 , α(v101 ) = p5 , α(v110 ) = p6 ,
              RST                          α(v111 ) = p7
        RT            ST
                                                     8                                     9
                                                     >
                                                     >  v001 + v011 + v101 + v111 ≥ 3, >   >
                                                        v010 + v011 + v110 + v111 ≤ 2,
                                                     <                                     =
                          p2               LE (x) =
               T                                     >
                                                     >  v100 + v101 + v110 + v111 ≤ 1, >   >
                                                        v100 + v101 + v110 + v111 ≥ 1
                                                     :                                     ;

                               Fig. 1. Atomic Decomposition Example

descendants (ancestors) respectively. We maintain the distinction between nodes of a
                         .
forest by the relation =.
                       6
    The set Rx of related roles for a node x is defined as Rx = {S | {≤ n S, ≥ m S} ∩
L(x) 6= ∅}. We define a partitioning RS x = ( RS ⊆Rx {RS }) \ ∅ and for a partition
                                                 S

RS x ∈ RS x we define RS Ix = ( S∈RS x Fil I (x, S)) \ ( S∈Rx \RS x Fil I (x, S)) such
                                  T                        S

that Fil I (x, S) = {y I | y ∈ Fil (x, S)}. RS Ix represents the interpretation of fillers of
x that are fillers for the roles in RS but not fillers for the roles in Rx \ RS . Therefore,
by definition the fillers of x associated with the partitions in RS x are mutually disjoint
w.r.t. the interpretation I.
     We assume a set V of variables and by using a mapping α : V ↔ RS x we associate
a unique variable v ∈ V with each partition RS x in RS x such that α(v) = RS x . Let
V R be defined as the set of all variables related to a role R such that V R = {v ∈
V | R ∈ α(v)}. A function      ξ is used to map number restrictions to inequations of the
form ξ(R, ./, n) := ( vi ∈V R vi ) ./ n where ./ ∈ {≤, ≥} and n ∈ N.
                        P

     Since all concept restrictions for node successors that are not root nodes are prop-
agated through universal restrictions for roles and a new role is created for each role
filler occurring in a role assertion, we can conclude that all nodes in a certain node par-
tition share the same restrictions and can be dealt with as a unit. We call this unit proxy
node which is a representative of possibly more than one node. We define the mapping
card : V → N to indicate the cardinality associated with proxy nodes.
     A node x in a forest F is blocked by a node y iff x, y are not root nodes and y is
an ancestor of x such that L(x) ⊆ L(y). A node x contains a clash iff there exists a
concept name A ∈ NC such that {A, ¬A} ⊆ L(x) (logical clash) or LE (x) does not
have a non-negative integer solution (arithmetic clash).
     An arithmetic solution is represented using the function σ : V → N which assigns a
non-negative integer value to each variable in V. Let Vx be the set of variables assigned
to an individual x, Vx = {v ∈ V | v occurs in LE (x)}. We define a set of solutions
Ω for x as Ω(x) := {σ(v) = n | n ∈ N, v ∈ Vx }. Notice that the goal function of
the inequation-solver is to minimize the sum of the variables occurring in the input
inequations.
     The algorithm starts with the forest FA with the individuals mentioned in A as root
nodes. For each a ∈ IA a root node xa will be created with L(xa ) = {C | (a : C) ∈ A}.
Additionally, for every root node x we set card(x) = 1.
    We illustrate these definitions with a simple example shown in Figure 1. Assume
for a node x that we have L(x) = {≥ 3 R, ≤ 2 T , ≥ 1 S, ≤ 1 S}. Therefore, we have
Rx = {R, T, S} and RS x consists of 7 different partitions named p1 , . . . , p7 (see
Figure 1). Assuming a binary coding of the indices of variables, where the first digit
from the right represents R, the second T , and the last S, we define the variables such
that each vi is associated with its corresponding partition (see α(v) in Figure 1). Hence,
the number restrictions in L(x) can be translated to the set of inequations in LE (x) as
shown in the bottom-right part of Figure 1.
3.3 Completion Rules
The completion rules are shown in Figure 2. The completion rules are always ap-
plied with the following priorities (given in decreasing order). A rule of lower priority
can never be applied if another one with a higher priority is applicable. 1. reset-Rule
and merge-Rule. 2. u-Rule, t-Rule, ∀-Rule, ∀\ -Rule, ∀+ -Rule, ch-Rule, disjoint-Rule,
equal-Rule, and hierarchy-Rule 3. ≤-Rule and ≥-Rule. 4. fil -Rule. There are two limi-
tations on the application of the rules: (i) priority of the rules, (ii) rules are only appli-
cable to nodes that are not blocked.
    A completion forest F is called complete if no completion rule is applicable to any
node of F. A completion forest F is called clash-free if none of the clash triggers is
applicable to any node of F.
    The u-Rule, t-Rule, ∀-Rule, and the ∀+ -Rule are similar to the ones in standard
tableau algorithms. The ∀\ -Rule implements the universal restriction based on role set
difference. ≤-Rule, ≥-Rule: The ≤-Rule and the ≥-Rule translate the number restric-
tions, based on the atomic decomposition technique, into inequations and add them to
LE (x).
    ch-Rule: The intuition behind the ch-Rule is due to the partitioning consequences.
When we partition all successors for a node, we actually consider all possible cases for
the role successors. If a partition px for a node x is logically unsatisfiable, the corre-
sponding variable v with α(v) = px should be zero. But if it is logically satisfiable,
nothing but the current set of inequations can restrict the number of nodes being fillers
for the roles in the partition px .
                                                                                           .
    The disjoint-Rule preserves the semantics of Abox assertions of the form a =          6 b.
The equal-Rule ensures the equivalence of the Rxy - and Sxy -fillers. The hierarchy-
Rule enforces the semantics of role hierarchies on partition variables. The merge-Rule
merges two successors (and root nodes) za , zb of a root node zx into one node by re-
placing every occurrence of za in the completion graph by zb . The labels L and LE of
zb are unified with the corresponding labels of za . All incoming (outgoing) edges of za
become now incoming (outgoing) edges of zb .
    fil-Rule: The fil-Rule, which has the lowest priority, is the only generating rule.
It creates one successor (proxy node) for a node zx that is not blocked and sets the
cardinality of the proxy node to σ(v). If zx is a root node and has an already existing
successor xb , which is a also a root node, and the role set for hzx , zb i is empty, i.e., it
has not yet been initialized or has been reset by the reset-rule, the role set is set to α(v).
    Due to possible cycles between root nodes, one has to consider incremental parti-
tioning for root nodes because a previously unknown number restriction could be prop-
agated to a root node. Whenever (≤ n R) or (≥ m R) is added to L(x), the following
reset-Rule if x is a root node, {(≤ nR), (≥ nR)} ∩ L(x) 6= ∅, and ∀v ∈ Vx : R ∈       / α(v)
           then set LE (x) := ∅ and for every successor y of x set L(hx, yi) := ∅
merge-Rule if there exists root nodes zx , za , zb for x, a, b ∈ IA such that Rxa ∈ L(hzx , zb i)
           then merge the nodes za , zb and their labels and replace every occurrence of za in
           the completion graph by zb
u-Rule     if (C1 u C2 ) ∈ L(x) and {C1 , C2 } 6⊆ L(x)
           then set L(x) = L(x) ∪ {C1 , C2 }
t-Rule     if (C1 t C2 ) ∈ L(x) and {C1 , C2 } ∩ L(x) = ∅
           then set L(x) = L(x) ∪ {X} with X ∈ {C1 , C2 }
∀-Rule     if ∀R.C ∈ L(x) and there exists a y and R0 with R0 ∈ L(hx, yi), C ∈            / L(y),
           R 0 v∗ R
           then set L(y) = L(y) ∪ {C}
∀\ -Rule   if ∀R\S.C ∈ L(x) and there exists a y and R0 with R0 ∈ L(hx, yi), R0 v∗ R,
           C∈  / L(y), and there exists no S 0 such that S 0 v∗ S and S 0 ∈ L(hx, yi)
           then set L(y) = L(y) ∪ {C}
∀+ -Rule   if ∀R.C ∈ L(x) and there exists a y and R0 , S with R0 ∈ L(hx, yi), R0 v∗ S,
           S v∗ R, S ∈ NRT , and ∀S.C ∈       / L(y)
           then set L(y) = L(y) ∪ {∀S.C}
ch-Rule    If there occurs v in LE (x) with {v ≥ 1, v ≤ 0} ∩ LE (x) = ∅
           then set LE (x) = LE (x) ∪ {X} with X ∈ {v ≥ 1, v ≤ 0}
disjoint-  if there occurs v in LE (zx ) with {R0 , S 0 } ⊆ α(v), R0 v∗ Rxa , S 0 v∗ Rxb ,
Rule       x, a, b ∈ IA , v ≤ 0 ∈ / LE (zx )
           then set LE (zx ) := LE (zx ) ∪ {v ≤ 0}
equal-Rule if there occurs v in LE (zx ) with R0 ∈ α(v), S 0 ∈    / α(v), R0 v∗ Rxa , S 0 v∗ Sxa ,
           x, a, b ∈ IA , v ≤ 0 ∈ / LE (zx )
           then set LE (zx ) := LE (zx ) ∪ {v ≤ 0}
hierarchy- if there occurs v in LE (x) with R ∈ α(v), S ∈      / α(v), R v∗ S, v ≤ 0 ∈/ LE (x)
Rule       then set LE (x) := LE (x) ∪ {v ≤ 0}
≥-Rule     If (≥ nR) ∈ L(x) and ξ(R, ≥, n) ∈       / LE (x)
           then set LE (x) = LE (x) ∪ {ξ(R, ≥, n)}
≤-Rule     If (≤ nR) ∈ L(x) and ξ(R, ≤, n) ∈       / LE (x)
           then set LE (x) = LE (x) ∪ {ξ(R, ≤, n)}
fil-Rule   If there exists v occurring in LE (zx ) such that σ(v) = n with n > 0:
                 if n = 1, zx , zb root nodes, and Rxb ∈ α(v) with x, b ∈ IA
                 then if L(hzx , zb i) = ∅ then set L(hzx , zb i) := α(v) end if
                 else if zx is not blocked and ¬∃y : L(hzx , yi) = α(v)
                 then create a new node y and set L(hzx , yi) := α(v) and card(y) = n

      Fig. 2. Completion rules for SHQ Abox consistency (listed in decreasing priority)

takes place: (i) The reset-Rule becomes applicable for x, which sets LE (x) := ∅ and
removes all outgoing edges of x. (ii) Now that LE (x) is empty, the ≤-Rule and ≥-
Rule will be invoked again to recompute the partitions and variables. Afterwards, the
set of inequations based on the new partitioning is added. (iii) If (σ(vi ) = n) ∈ Ω(x),
 P vi ∈ Vx corresponds
where                         to the previous partitioning, then set LE (x) := LE (x) ∪
{ v0 ∈Vx,i vj0 ≥ n, v0 ∈Vx,i vj0 ≤ n} where Vx,i := {vj0 ∈ Vx0 | α(vi ) ⊆ α(vj0 )} and
                    P
     j                  j
vj0 ∈ Vx0 are based on the new partitioning. The third task in fact maintains the previous
solution in x and records it by means of inequalities in LE (x). Therefore, the solution
based on the new partitioning will be recreated by the arithmetic reasoner.
    An interpretation I satisfies a SHQ Abox A iff the hybrid tableau calculus can
derive a complete and clash-free completion forest for A. Due to lack of space we
refer the reader to [5] for the proofs on termination, soundness and completeness of the
presented calculus.
4 Practical reasoning
4.1 Complexity Analysis
To analyze the complexity of the concept satisfiability test with respect to qualified
number restrictions, we count the number of branches that the algorithm creates in the
search space. In the following we assume a node x ∈ V in the completion forest that
contains p at-least restrictions and q at-most restrictions in its label (Ri , Rj0 ∈ NR and
Ci , Cj0 ∈ clos(T )): {≥ n1 R1 .C1 , . . . , ≥ np Rp .Cp } ⊆ L(x), {≤ m1 R10 .C10 , . . . , ≤
mq Rq0 .Cq0 } ⊆ L(x)
     Standard tableau: The complexity of a standard tableau (e.g., see [9]) can be char-
acterized by (2q )N × f (N, m), where N depends on the biggest number occurring in
at-least restrictions and f (N, m) describes the number of possible ways to merge N in-
dividuals into m individuals with m the smallest non-zero number occurring in at-most
restrictions (see [5] for more details).
     Hybrid tableau: During the preprocessing step, the hybrid algorithm converts all
qualified number restrictions into unqualified ones which introduces p + q new role
names. According to the atomic decomposition presented in Section 3.2, the hybrid
algorithm defines 2p+q − 1 partitions and consequently variables for x; i.e. |Vx | =
2p+q −1. The ch-Rule opens two branches for each variable in Vx . Therefore, there will
be totally 2|Vx | cases to be examined by the arithmetic reasoner. The ch-Rule will be
                                                  p+q
invoked |Vx | = 2p+q − 1 times and creates 22         branches in the search space. Hence,
the complexity of the algorithm is characterized by a double-exponential function of
p+q. In [14] a polynomial-time algorithm for integer programming with a fixed number
of variables is given. However, our prototype implementation employed the Simplex
method which is known to be NP in the worst case but usually behaves very well on
average.
     The complexity of the standard algorithm is a function of N and therefore the num-
bers occurring in the at-most restrictions can affect the standard algorithm exponen-
tially. Whereas in the hybrid algorithm, the complexity is independent from N due to
its arithmetic approach to the problem.
4.2 Partition Optimization Techniques
Although it seems that the hybrid algorithm is double-exponential in the worst case and
the large number of variables seems to be hopelessly inefficient, there are some effective
heuristics and optimization techniques which make it feasible to use.
    Default Value for Variables: In the semantic branching based on the concept choose-
Rule (see [9]), in one branch we have C and in the other branch we have ¬C in the
label of the nodes. However, due to the ch-Rule (for variables) in one branch we have
v ≥ 1 whereas in the other branch v ≤ 0. In contrast to concept branching based on
the choose-Rule, in variable branching we can ignore the existence of variables that are
less or equal to zero. In other words, the arithmetic reasoner only considers variables
that are greater or equal to one.
     Therefore, by setting the default value to v ≤ 0 for every variable, the algorithm
does not need to invoke the ch-Rule |Vx | times before starting to find a solution for
the inequations. More precisely, the algorithm starts with the default value of v ≤ 0
for all variables in |Vx |. Obviously, the solution for this set of inequations, which is
∀ vi ∈ Vx : σ(vi ) = 0, cannot satisfy any at-least restriction. Therefore, the algorithm
must choose some variables in Vx to make them greater or equal to one. Although in
the worst case the algorithm still needs to try 2|Vx | cases, by setting this default value it
does not need to invoke the ch-Rule when it is not necessary.
     We define don’t care variables as the set of variables that appear in an at-least re-
striction but in no at-most restriction. These variables have only logical restrictions
which later on will be processed by the algorithm. Therefore, any non-negative integer
value satisfying all known arithmetic restrictions can be assigned to these variables and
we can leave them unchanged in all inequations unless a logical clash is triggered.
     We define satisfying variables as the set of variables which occur in an at-least re-
striction and are not don’t care variables. Since these are the variables that occur in
an at-least restriction, by assigning them to be greater or equal to one, the algorithm
can lead the arithmetic reasoner to a solution. Whenever a node that is created based
on v causes a clash, by means of dependency-directed backtracking we will set v ≤ 0
and therefore remove v from the satisfying variables set. When the satisfying variables
set becomes empty the algorithm can conclude that the set of qualified number restric-
tions in L(x) is unsatisfiable. Notice that the number of variables that can be decided
to be greater than zero in an inequation is bounded by the number occurring in their
corresponding number restriction.
     Variable Encoding: One can encode indices of variables in binary format to easily
retrieve the role names related to them. We do not need to assign any memory space for
them unless they have a value greater than zero based on an arithmetic solution.
4.3 Dependency-Directed Backtracking
Dependency-directed backtracking or backjumping is a backtracking method which de-
tects the sources of a clash and tries to bypass branching points that are not related to
the sources of the clash [12, 11]. In the hybrid algorithm, whenever we encounter a log-
ical clash for a successor y of x, we can conclude that the corresponding variable vy for
the partition in which y resides must be zero. Therefore, we can prune all branches for
which vy ≥ 1 ∈ LE (x). This method which is called simple backtracking can decrease
the size of the search space by pruning half of the branches each time the algorithm
detects a clash. For example, for an arbitrary L(x), by pruning all the branches where
                                                p+q
vy ≥ 1, we will in fact prune 2|Vx |−1 = 22 −1 branches w.r.t. the ch-Rule, which
amounts to half of the branches.
    We can improve this by a better version in which we prune all branches that have the
same reason for the clash caused by vy . This methods is called complex backtracking.
For instance, assume the node y that is created based on σ(vy ) = k where k ≥ 1
ends up with a clash. Since we have only one type of clash other than the arithmetic
clash, assume the clash is because of {A, ¬A} ⊆ L(y) for some A ∈ NC . Moreover,
assume we know that A is caused by a ∀Ri .A restriction in its predecessor x and ¬A
by ∀S\Tj .(¬A) ∈ L(x). It is possible to conclude that all the variables v for which
Ri ∈ α(v) ∧ Tj ∈   / α(v) will end up with the same clash.
    Considering the binary coding for the indices of the variables in which the ith digit
represents Ri and the jth digit represents Tj , all the variables, where the ith digit is
equal to 1 and the jth digit equal to zero, must be zero. Since the binary coding of the
variable indices requires a total of p + q digits, the number of variables that must be
zero will be 2p+q−2 . The 2p+q − 2p+q−2 other variables are free to be constrained and
will open two branches in the search space. Therefore, the number of branches will be
reduced from 2|Vx | to 23/4|Vx | which is a significant improvement.
5 Evaluation
Our prototype reasoner can process ALCHQ concept expressions (see also [5]). The
system is implemented in Java using the OWL-API 2.1.1 [10]. Qualified number restric-
tions are expressive constructs added to the forthcoming OWL 2 [17]. Therefore, current
OWL benchmarks do not contain qualified number restrictions. In fact, to the best of our
knowledge there is no real-world SHQ benchmark available which contains non-trivial
qualified number restrictions. Furthermore, our prototype reasoner does not implement
any optimization techniques except the ones introduced above, which target only rea-
soning with the Q-component of SHQ. Accordingly, we built a set of synthetic test
cases to empirically evaluate the hybrid reasoner. We focus our evaluation on concept
expressions only containing qualified number restrictions. We identified the following
parameters that may affect the complexity of reasoning with number restrictions: (i) The
size of numbers occurring in number restrictions. (ii) The number of qualified number
restrictions. (iii) The ratio of the number of at-least to the number of at-most restric-
tions. (iv) Satisfiability versus unsatisfiability of the input concept expression. Due to
lack of space we focus in this paper only on the cases (i) and (iv).
    Well-known reasoners that support qualified number restrictions such as Racer [7],
FaCT++ [19] or Pellet [18] implement numerous optimization techniques. However,
to the best of our knowledge, Pellet as well as FaCT++ have no specific optimization
techniques for dealing with qualified number restrictions. We decided to select Pellet
(version 2.0 RC7 from June 11, 2009) as the representative for a state-of-the-art DL rea-
soner. We based our evaluations only partially on a comparison with one of the existing
reasoners and focused on the study of the behavior of the hybrid reasoner. The experi-
ments were performed under Windows 32 on a standard PC with a dual-core (2.1 GHz)
processor and 3 GB of physical memory. Every test was executed in five independent
runs (with a timeout of 1000 seconds) and the average of these runs is presented.
    In order to assess the scalability of the hybrid algorithm with respect to the size of
the numbers, we compared its performance with Pellet. We used the following concept
expressions (assuming a role hierarchy {R v T, S v T, RS v R, RS v S}; i is a
number incremented for each benchmark):
(≥ 2i RS.(A t B)) u (≤ i S.A) u (≤ i R.B) u (≤ (i−1) T.(¬A)) t (≤ i T.(¬B))                (CSAT )
(≥ 2i RS.(AtB))u(≤ i S.A)u(≤ i R.B) u(≤ (i−1) T.(¬A))t(≤ (i−1) T.(¬B))(CUNSAT )
    CSAT is a satisfiable concept while CUNSAT is unsatisfiable. In fact, CSAT is not
trivially satisfiable. In the hybrid algorithm, the set of inequations is only satisfied in
the case where all variables are zero except v, v 0 ≥ 1 and α(v) = {RS 0 , S 0 , T 0 } and
α(v 0 ) = {RS 0 , R0 }.1 The standard algorithm in order to examine the satisfiability of
 1
     Assume R0 , S 0 , RS 0 , and T 0 are new sub-roles of R, S, RS, and T respectively.
               10
                        6
                                                         ....
                                Effect of the Size of Numbers: Hybrid vs. Pellet                                                            100
                                                                                                                                                                       Effect of Exponential Growth of i


                                                                                                                                                                                                       .
                                                       . .
                        5
                                                                                                                                            90
                                                                                                                                                                                                               Hybrid-Sat
                    2
                                                                                                                                                                                                               Hybrid-Unsat
                                                                                     Hybrid-Sat                                             80


                                                     .
                        5
               10
                        5
                                                                                     Hybrid-Unsat
                                                                                                                                            70
                                                                                     Pellet-Sat


                                                                                                                             Runtime (ms)
                    2                                                                Pellet-Unsat
Runtime (ms)


                                                                                                                                            60
                        4


                                                                                                                                                . . . . . .
               10
                        5                                                                                                                   50


                    ...
                    2                                                                                                                       40
                        3
               10
                        5                                                                                                                   30

                    2                                                                                                                       20
                        2
               10
                        5
                                                                                                                                            10

                                                                                                                                                0
                            2       3       4          5        6        7        8           9         10                                          1              2               3               4               5                  6
                                                i in concept expression C                                                                                                 log10 i in concept expression C

                    Fig. 3. Increasing the value of numbers                                                                                                  Fig. 4. Exponential growth of i


                                                                                          .
                                                Effect of Backtracking
               10
                5
                    6
                                                                                                                            16000
                                                                                                                                                             . . . . .....
                                                                                                                                                    Satisfiable vs. Unsatisfiable: Standard and Hybrid Algorithms

                                                                                                                                                                                                                   .         Hybrid


                                                                                        .
                2                                                                                                           14000                                                                                            Pellet


                             .
                    5
               10
                5                                                                                                           12000


                           .
                2

                                                                                                             Runtime (ms)
                                                                                                                            10000
Runtime (ms)


                    4


                         .
               10
                5                                                                                                            8000


                       .
                2


                     .
                    3                                                                                                        6000
               10


                   .
                5


                           .
                                                                                                                             4000


                ..
                2


                                                                                                                                            ...                                        .............
                    2
               10                                                                No-Backtracking                             2000
                5                                                                Simple
                                                                                 Complex                                              0
                2                                                                                                                           0           20    40   60      80    100   120   140       160   180       200   220   240
                        2       3       4       5          6        7        8   9       10        11                                                                     i in the concept expression Ei
                                                    Number of Restrictions


               Fig. 5. Different backtracking strategies                                                                                                 Fig. 6. (Un)Satisfiability of Ei

CSAT creates (2 × i) RS-successors for x in (A t B) and according to the three at-
most restrictions it opens 8 new branches for each successor. However, since (2 × i)
is much larger than i or i − 1, one must start merging the extra successors when an
at-most restriction is violated which happens in all branches in this test case. As shown
in Figure 3 (logarithmic scale), the growth of i from 2 to 10 has almost no effect on the
hybrid reasoner while it dramatically increases the runtime of the standard algorithm
for numbers as small as 6 and 7. In contrast to the standard reasoner, the performance
of the hybrid reasoner is unaffected by the value of the numbers. Furthermore, to assure
that this independence will be preserved also with respect to an exponential growth of
i, in Figure 4 we present the performance of the hybrid reasoner for i = 10n , n ∈ 1..6.
     In the next experiment we observed the effect of backtracking on the performance
of the hybrid reasoner. In three settings, we first turn off backtracking, secondly enable
simple backtracking, and finally utilize complex backtracking (see Section 4.3). We
tested an unsatisfiable concept DUNSAT which follows the pattern where Dj u Dk = ⊥
for 1 ≤ j, k ≤ i, j 6= k with respect to a role hierarchy R v T : (≥ 3 R.D1 ) u . . . u
(≥ 3 R.Di ) u (≤ (3i−1) T ). As shown in Figure 5 (logarithmic scale), the high non-
determinism of the ch-Rule makes it inevitable to utilize backtracking. Moreover, we
can conclude that by using a more comprehensive and informed method of backtracking
we can improve the performance significantly.
     In the last experiment the test cases are concepts containing four qualified at-least
restrictions and one unqualified at-most restriction according to the following pattern.
We abbreviate these concepts with Ei where R v T for i = 1, 20, 40, . . . , 220, 240:
≥ 30 R.(B u C)u≥ 30 R.(B u ¬C) u≥ 30 R.(¬B u C)u≥ 30 R.(¬B u ¬C)u≤ i T .
The concept Ei is satisfiable for i ≥ 120 and unsatisfiable for i < 120.
     As illustrated in Figure 6, the standard algorithm can easily infer for i ∈ 1..29 that
Ei is unsatisfiable since x has at least 30 distinguished successors. However, from E30
to E120 it becomes very time-consuming for the standard algorithm to merge all 120
successors into i individuals. Moreover, Figure 6 shows that no matter which value i
takes between 30 and 119, the standard algorithm performs similarly. In other words,
we can conclude that it tries the same number of possible ways of merging which is
all the possibilities to merge 4 sets of mutually distinguished individuals. As soon as i
becomes greater or equal 120, since the at-most restriction is not violated, the standard
algorithm simply ignores it and reasoning becomes trivial for the standard algorithm.
     Furthermore, we can conclude from Figure 6 that for the hybrid algorithm i = 1
is a trivial case since not more than one variable can have the type of v ≥ 1 which is
the case that easily leads to unsatisfiability for E1 . However, it becomes more difficult
as i grows and reaches its maximum for i = 30 and starts to decrease gradually until
i = 70 and remains unchanged until i = 120. In fact, this unexpected behavior did not
correspond to the formal analysis of the hybrid algorithm. Therefore, we measured
the time spent on arithmetic reasoning and logical reasoning as well as the number of
different clashes. The reason that i = 30 is a breaking point is due to the fact that for
i < 30 no arithmetic solution exists for the set of inequations. It seems that the increase
in runtime is due to this fact and a straight-forward and unoptimized implementation
of the arithmetic reasoner. When i grows from 30 to 120, the arithmetic reasoner finds
more solutions for the set of inequations which will fail due to logical clashes.
6 Discussion and Future Work
Based on the previous sections we identify the following advantages of the hybrid al-
gorithm: (i) Insensitivity to the value of numbers in number restrictions. (ii) Since all
number restrictions are collected before expanding the completion graph, its solution is
more probable to survive. (iii) Due to the atomic decomposition the search for a model
is semantically structured. When encountering a clash, one can efficiently backtrack to
the source of the clash and optimally prune the branches which would lead to the same
clash (see Section 4.3).
     The following disadvantages of the hybrid algorithm can be observed: (i) According
to the nature of the atomic decomposition, the hybrid algorithm introduces an exponen-
tial number of variables. (ii) The implemented prototype still requires a preprocessing
phase affected by the number of partitions. This initialization time could be saved for
trivially satisfiable or unsatisfiable concepts.
     We plan to address the following topics in our future work. (i) Since the minimal
number of successor property is unnecessary in many cases, one could reduce the arith-
metic reasoning time by modifying the goal function and allowing a solution with more
successors. (ii) Optimizing the arithmetic reasoner by implementing techniques such as
incremental arithmetic reasoning, caching, and heuristics for controlling the order of
applying the ch-rule.
Acknowledgements
We express our gratitude to our colleagues Jocelyne Faddoul and Ralf Möller for their
valuable advice during this investigation.
References
 1. F. Baader and U. Sattler. An overview of tableau algorithms for description logics. Studia
    Logica, 69:5–40, 2001.
 2. Y. Ding. Tableau-based Reasoning for Description Logics with Inverse Roles and Number
    Restrictions. PhD thesis, Department of Computer Science and Software Engineering, Con-
    cordia University, April 2008. Available at http://users.encs.concordia.ca/
    ~haarslev/students/Yu_Ding.pdf.
 3. J. Faddoul, N. Farsinia, V. Haarslev, and R. Möller. A hybrid tableau algorithm for ALCQ. In
    Proceedings of the 18th European Conference on Artificial Intelligence (ECAI 2008), Patras,
    Greece, July 21-25, pages 725–726, 2008. An extended version appeared in Proceedings of
    the 2008 International Workshop on Description Logics (DL-2008), Dresden, Germany, May
    13-16, 2008.
 4. J. Faddoul, V. Haarslev, and R. Möller. Hybrid reasoning for description logics with nom-
    inals and qualified number restrictions. Technical report, Institute for Software Systems
    (STS), Hamburg University of Technology, 29 pages, 2008. See also http://www.sts.tu-
    harburg.de/tech-reports/papers.html.
 5. N. Farsiniamarj. Combining integer programming and tableau-based reasoning: A hybrid
    calculus for the description logic SHQ. Master’s thesis, Concordia University, Depart-
    ment of Computer Science and Software Engineering, 2008. Available at http://users.
    encs.concordia.ca/~haarslev/students/Nasim_Farsinia.pdf.
 6. V. Haarslev and R. Möller. Optimizing reasoning in description logics with qualified number
    restriction. In Proceedings of the International Workshop on Description Logics (DL’2001),
    Aug. 1-3, 2001, Stanford, CA, USA, pages 142–151, Aug. 2001.
 7. V. Haarslev and R. Möller. RACER system description. In R. Goré, A. Leitsch, and T. Nip-
    kow, editors, Proceedings of the International Joint Conference on Automated Reasoning,
    IJCAR’2001, June 18-23, 2001, Siena, Italy, Lecture Notes in Computer Science, pages 701–
    705. Springer-Verlag, June 2001.
 8. V. Haarslev, M. Timmann, and R. Möller. Combining tableaux and algebraic methods for
    reasoning with qualified number restrictions. In Proceedings of the International Workshop
    on Description Logics (DL’2001), Aug. 1-3, Stanford, USA, pages 152–161, 2001.
 9. B. Hollunder and F. Baader. Qualifying number restrictions in concept languages. In
    J. Allen, R. Fikes, and E. Sandewall, editors, Second International Conference on Princi-
    ples of Knowledge Representation, Cambridge, Mass., April 22-25, 1991, pages 335–346,
    Apr. 1991.
10. M. Horridge, S. Bechhofer, and O. Noppens. Igniting the OWL 1.1 touch paper: The OWL
    API. In Proceedings of the OWLED 2007 Workshop on OWL: Experiences and Directions,
    volume 258, Innsbruck, Austria, June 2007.
11. I. Horrocks. Backtracking and qualified number restrictions: Some preliminary results. In
    Proc. of the 2002 Description Logic Workshop (DL 2002), pages 99–106, 2002.
12. I. Horrocks, U. Sattler, and S. Tobies. Practical reasoning for very expressive description
    logics. Logic Journal of the IGPL, 8(3):239–264, 2000.
13. I. Horrocks, U. Sattler, and S. Tobies. Reasoning with individuals for the description logic
    SHIQ. In D. MacAllester, editor, Proceedings of the 17th International Conference on Au-
    tomated Deduction (CADE-17), Lecture Notes in Computer Science, pages 482–496, Ger-
    many, 2000. Springer Verlag.
14. H. Lenstra. Integer programming with a fixed number of variables. Mathematics of Opera-
    tions Research, 8(4):538–548, November 1983.
15. H. Ohlbach and J. Koehler. Role hierarchies and number restrictions. In Proceedings of the
    International Workshop on Description Logics (DL-97), Sept. 27 - 29, Gif sur Yvette (Paris),
    France, 1997.
16. H. Ohlbach and J. Köhler. Modal logics, description logics and arithmetic reasoning. Artifi-
    cial Intelligence, 109(1-2):1–31, 1999.
17. OWL 2 Web Ontology Language: Structural Specification and Functional-Style Syntax,
    W3C Working Draft, 02 December 2008, October 2008. http://www.w3.org/TR/
    owl2-syntax/ (last visited June 2009).
18. E. Sirin, B. Parsia, B. C. Grau, A. Kalyanpur, and Y. Katz. Pellet: a practical OWL-DL
    reasoner. Journal of Web Semantics, 5(2):51–53, 2005.
19. D. Tsarkov and I. Horrocks. FaCT++ description logic reasoner: System description. In
    Proc. of the Int. Joint Conf. on Automated Reasoning (IJCAR 2006), volume 4130 of Lecture
    Notes in Artificial Intelligence, pages 292–297. Springer, 2006.