Extending Consequence-Based Reasoning to SHIQ

                    Andrew Bate, Boris Motik, Bernardo Cuenca Grau,
                         František Simančı́k, and Ian Horrocks

                                    University of Oxford
                                   Oxford, United Kingdom
                                first.last@cs.ox.ac.uk


1      Introduction
Description logics (DLs) [3] are a family of knowledge representation formalisms with
numerous practical applications. SHIQ is a particularly important DL as it provides
the formal underpinning for the Web Ontology Language (OWL). DLs model a domain
of interest using concepts (i.e., unary predicate symbols) and roles (i.e., binary pred-
icate symbols). DL applications often rely on subsumption—the problem of checking
logical entailment between concepts—and so the development of practical subsumption
procedures for DLs such as SHIQ has received a lot of attention.
     Most DLs are fragments of the guarded fragment [1] of first-order logic; however,
SHIQ provides a restricted form of counting that does not fall within the guarded frag-
ment. Moreover, most DLs, including SHIQ, can be captured using the two-variable
fragment of first-order logic with counting (C 2 ) [11], but this provides us with neither
a practical nor a worst-case optimal reasoning procedure (C 2 and SHIQ are NE XP -
T IME- and E XP T IME-complete, respectively). Algorithms for more general logics thus
do not satisfy the requirements of DL applications, and so numerous alternatives spe-
cific to DLs have been explored. Many DLs can be decided in the framework of res-
olution [18, 13], including SHIQ [14]. These procedures are usually worst-case op-
timal and can be practical, but, as we discuss in Section 3, in even very simple cases
they can draw unnecessary inferences. Practically successful SHIQ reasoners, such as
FaCT++ [26], HermiT [9], Pellet [25], and Racer [12], use variants of highly-optimised
(hyper)tableau algorithms [6]—model-building algorithms that ensure termination by
a variant of blocking [7]. Although worst-case optimal tableau algorithms are known
[10], practical implementations are typically not worst-case optimal. While generally
very effective, tableau algorithms still cannot process certain ontologies; for example,
the GALEN ontology1 has proved particularly challenging, mainly because tableau cal-
culi tend to construct very large models.
     A breakthrough in practical ontology reasoning came in the form of consequence-
based calculi. Although not originally presented in the consequence-based framework,
the algorithm for the DL EL [2] can be seen as the first such calculus. This algorithm
was later reformulated and extended to Horn-SHIQ [15] and Horn-SROIQ [19]—
DLs that support functional roles, but not disjunctive reasoning. Recently, consequence-
based calculi were also developed for the DLs ALCH [24] and ALCI [23], which sup-
port disjunctive reasoning, but not counting. Consequence-based calculi can be seen as
 1
     http://www.opengalen.org
a combination of resolution and hypertableau (see Section 3 for details). As in resolu-
tion, they describe ontology models by systematically deriving certain ontology conse-
quences; and as in hypertableau, the ontology axioms can be used to guide the derivation
process, and to avoid drawing unnecessary inferences. Moreover, consequence-based
calculi are not just refutationally complete, but can classify an ontology in a single
pass, which greatly reduces the overall work. These advantages allowed the CB system
to be the first to classify all of GALEN [15].
     Existing consequence-based algorithms can handle either disjunctions or counting,
but not both. As we discuss in detail in Section 4, it is challenging to extend these algo-
rithms to DLs such as SHIQ that combine both kinds of construct: counting quantifiers
require equality reasoning, which together with disjunctions can impose complex con-
straints on ontology models. Unlike in existing consequence-based calculi, these con-
straints cannot be captured using DLs themselves; instead, a more expressive first-order
fragment is needed, which makes the reasoning process much more involved.
     In Section 5 we present a consequence-based calculus for SHIQ. Borrowing ideas
from resolution theorem proving, we encode the required consequences using a special
kind of first-order clause; and to handle equality effectively, we base our calculus on
ordered paramodulation [17]—a state of the art calculus for equational theorem proving
used in modern systems such as E [22] and Vampire [20]. To make the calculus efficient
on EL, we have carefully constrained the rules so that, on EL ontologies, it mimics
existing EL calculi. Thus, although a practical evaluation of our calculus is still pending,
we believe that it is likely to perform well in practice on ‘mostly-EL’ ontologies due to
is close relationship with existing and well-proven calculi.


2   Preliminaries

First-Order Logic. To simplify matters technically, it is common practice in equational
theorem proving to encode atoms as terms. An atomic formula P (~s) can be encoded as
P (~s) ≈ t, where t is a new special constant, and P is considered as a function symbol
rather than as a predicate symbol. Note however that, in order to avoid meaningless
expressions in which predicate symbols occur at proper subterms, a multi-sorted type
discipline on terms in the encoding is adopted. Thus, the set of symbols in the signature
is partitioned into a set P of predicate symbols (which includes t), and a set F of
function symbols.
     A term is constructed as usual using variables and the signature symbols. Terms
containing predicate symbols as their outermost symbol are called P-terms, while all
other terms are F-terms. For example, for P a predicate and f a function symbol, both
f (P (x)) and P (P (x)) are malformed; P (f (x)) is a well-formed P-term; and f (x)
is a well-formed F-term. An (in)equality is an expression of the form s ≈ t (s 6≈ t)
where s and t are both either F- or P-terms. We assume that ≈ and 6≈ are implicitly
symmetric—that is, s ./ t and t ./ s are one and the same expression, for ./ ∈ {≈, 6≈ }.
A literal is an equality or an inequality. An atom is an equality of the form P (~s) ≈ t,
and we write it simply as P (~s) whenever it is clear from the context whether P (~s) is
intended to be a P-term or an atom. A clause is an expression of the form Γ → ∆ where
Γ is a conjunction of atoms called the body, and ∆ is a disjunction of literals called the
             Table 1. Translating Normalised SHIQ Ontologies into DL-Clauses

                                                   B1 (x) → S(x, fi (x)) for 1 ≤ i ≤ n
        B1 v > n S.B2                              B1 (x) → B2 (fi (x))      for 1 ≤ i ≤ n
                                                   B1 (x) → fi (x) 6≈ fj (x) for 1 ≤ i < j ≤ n

        B1 v 6 n S.B2                   V x) ∧ B2 (x) → SB2 W
                                        S(y,                (y, x)     for fresh SB2
                           B1 (x) ∧          SB2 (x, zi ) →        zi ≈ zj
                                      1≤i≤n+1                 1≤i<j≤n+1

    B1 v ∀S.B2                         B1 (x) ∧ S(x, z1 ) → B2 (z1 )
  d        F                                  V              W
    Bi v       Bj                                 Bi (x) →          Bj (x)
1≤i≤n        1≤j≤m                         1≤i≤n              1≤i≤m

        S1 v S2                                 S1 (x, z1 ) → S2 (x, z1 )
        S1 v S2−                                S1 (x, z1 ) → S2 (z1 , x)


head. We often treat conjunctions and disjunctions as sets; and we write the empty
conjunction (disjunction) as > (⊥). We use the standard notion of subterm positions;
then, s|p is the subterm of s at position p; moreover, s[t]p is the term obtained from s
by replacing the subterm at position p with t; finally, position p is proper in t if t|p 6= t.
    Orders. A term order  is a strict order on the set of all terms. The multiset exten-
sion mul of  compares multisets M and N on a universe U such that M mul N
if and only if M 6= N and, for each n ∈ N \ M , some m ∈ M \ N exists such that
m  n, where \ is the multiset difference. We extend  to literals by identifying each
s 6≈ t with the multiset {s, s, t, t} and each s ≈ t with the multiset {s, t}, and by com-
paring the result using mul . Given an order , element b ∈ U , and subset S ⊂ U , the
notation S  b abbreviates ∃a ∈ S : a  b.
    Description Logic SHIQ. In this paper, a SHIQ ontology is represented as a
set of DL-clauses, which we define next. Let P1 and P2 be countable sets of unary and
binary predicate symbols, and let F be a countable set of unary function symbols. DL-
clauses are constructed using the central variable x and variables zi . A DL-F-term has
the form x, zi , or f (x) with f ∈ F; a DL-P-term has the form B(zi ), B(x), B(f (x)),
S(x, zi ), S(zi , x), S(x, f (x)), S(f (x), x) with B ∈ P1 and S ∈ P2 ; and a DL-term is
a DL-F- or a DL-P-term. A DL-literal has the form A ≈ t with A a DL-P-term (called
a DL-atom), or f (x) ./ g(x), f (x) ./ zi , or zi ./ zj with ./ ∈ {≈, 6≈ }. A DL-clause
contains only DL-atoms of the form B(x), S(x, zi ), and S(zi , x) in the body and only
DL-literals in the head, and where each variable zi occurring in the head also occurs in
the body. An ontology O is a finite set of DL-clauses. A query clause is a DL-clause in
which all atoms are of the form B(x). Given an ontology O and a query clause Γ → ∆,
the query clause entailment problem is to decide whether O |= ∀x.(Γ → ∆) holds; we
often leave out ∀x and write the latter as O |= Γ → ∆.
    SHIQ ontologies are commonly written using a DL-style syntax, but we can al-
ways transform such ontologies into DL-clauses without affecting the entailment of
query clauses. Transitivity is encoded away as described in [21, 8], and the resulting
axioms are normalised to the forms shown on the left-hand side of Table 1 as described
in [15, 23]. The normalised axioms are translated to DL-clauses as shown in Table 1.
                   O1 = {      Bi v ∃Sj .Bi+1 for 0 ≤ i < n and 1 ≤ j ≤ 2 (1)
                              Bn v Cn                                     (2)
                             Ci+1 v ∀Sj− .Ci for 0 ≤ i < n and 1 ≤ j ≤ 2 (3) }

               B0 (x)     Succ[5]:   ∃S1 .B1   (6)     B1 (x)                                 Bn (x)
                vB0       Succ[5]:   ∃S2 .B1   (7)      vB1                 ···                vBn

    Initialisation: > → B0   (4)           Succ[5]:      > → B1       (8)         Succ[. . . ]: > → Bn       (10)
    Hyper[1+4]: > → ∃Sj .B1 (5)            Hyper[1+8]: > → ∃Sj .B2 (9)            Hyper[2+10]: > → Cn        (11)
                                                                                                      −
    Pred[14]:       > → C0  (15)           Pred[. . . ]: > → C1      (13)         Hyper[3+11]: > → ∀Sj .Cn−1 (12)
                                           Hyper[3+13]: > → ∀Sj− .C0 (14)


                        Fig. 1. Example Motivating Consequence-Based Calculi


3      Why Consequence-Based Calculi?

Consider the ontology O1 (written using DL notation) shown in Figure 1. Axiom (3)
can be reformulated as ∃Sj .Ci+1 v Ci , and so O1 is in EL. One can readily verify that
O |= Bi v Ci holds for each 1 ≤ i ≤ n.
    To prove, say, O |= B0 v C0 , a (hyper)tableau calculus constructs in a forward-
chaining manner a tree-shaped model of depth n and of fanout two, where nodes at
depth i are labelled by Bi and Ci . Forward chaining ensures that reasoning is goal-
oriented and thus amenable to practical implementation. However, all nodes labelled
with Bi are of the same type and behave in the same way, which reveals a weakness of
(hyper)tableau calculi: the constructed model can be large (exponential in our example)
and highly redundant. Techniques such as caching [10] or anywhere blocking [16] can
be used to constrain model construction, but their effectiveness often depends on the
order of rule applications. Thus, model size has proved to be a key limiting factor for
(hyper)tableau-based reasoners in practice [16].
    In contrast, resolution describes models using universally quantified clauses that
‘summarise’ the model. This prevents redundancy and ensures worst-case optimality of
many resolution decision procedures. Nevertheless, resolution can still derive unneces-
sary clauses. In our example, axioms (1) and (3) are translated into clauses (16) and
(17), respectively, which can be used to derive all 2n2 clauses of the form (18).

                          Bi (x) → Sj (x, fi,j (x))             for i ∈ {1, . . . , n} and j ∈ {1, 2}         (16)
       Sj (z1 , x) ∧ Ck+1 (x) → Ck (z1 )                        for k ∈ {1, . . . , n} and j ∈ {1, 2} (17)
    Bi (x) ∧ Ck+1 (fi,j (x)) → Ck (x)                           for i, k ∈ {1, . . . , n} and j ∈ {1, 2} (18)

Of these 2n2 clauses, only those where i = k are relevant to proving O |= B0 v C0 .
Moreover, if we extend O with additional clauses that contain Bi and Ci , each of the
2n2 clauses from (18) can participate in further inferences, which can give rise to many
more irrelevant conclusions. This problem is exacerbated in satisfiable cases since all
resolution consequences must then be computed in full.
    Consequence-based calculi combine the goal-directed reasoning of (hyper)tableau
calculi with the ‘summarisation’ of resolution. In [23], we presented a very general
framework for ALCI ontologies that captures the key elements of consequence-based
calculi such as [2, 15, 19, 24]. We use this framework as basis for our extension to
SHIQ so, before presenting our extension, we explain the main concepts on O1 . Due
to space restrictions we cannot reproduce in full the inference rules from [23]; however,
these are similar in spirit to our inference rules for SHIQ presented in Table 2.
    Our calculus constructs a context structure D = hV, E, S, core, i—a graph whose
vertices V are called contexts and whose directed edges are labelled with concepts of
the form ∃S.B. Let I be a model of O. Instead of representing each element of I
individually as in (hyper)tableau calculi, we ‘summarise’ all elements of a certain kind
using a single context v. Each context v ∈ V is associated with a (possibly empty)
set corev of core concepts that hold in all domain elements that v represents; thus,
corev determines the kind of context v. We use a set Sv of clauses to capture additional
constraints that the elements represented by vdmust satisfy;
                                                          F       inFALCI, we F  can do so
using clauses over DL concepts of the form Bi v Bj t ∃Sk .Bk t ∀S` .B` .
Thus, unlike in resolution where all consequences belong to a single set, we assign a
consequence a particular set in order to reduce the number of inferences. Clauses in Sv
are ‘relative’ to corev : for each Γ v ∆ ∈ Sv , we have O |= corev u Γ v ∆—that is,
we choose not to include corev in clause bodies since corev always holds. Finally, 
provides each context v ∈ V with a concept order v that restricts resolution inferences
in the presence of disjunctions.
    Consequence-based calculi are not just refutation-complete: they actually derive
the required consequences. Figure 1 shows how this is achieved for O1 |= B0 v C0 ;
the core and the clauses are shown, respectively, above and below a context. To prove
B0 v C0 , we introduce context vB0 with corevB0 = {B0 } and clause (4) stating that
B0 holds in this context. Next, using the Hyper rule, we derive (5) from (1) and (4); this
rule performs hyperresolution, but restricted to one context at a time.
    Next, the Succ rule satisfies the existential quantifiers in (5). To this end, the rule
uses a parameter called an expansion strategy. A strategy is given two sets of constraints
that a successor of vB0 must satisfy due to universal restrictions: K1 contains constraints
that must hold, and K2 contains constraints that might hold. Given such K1 and K2 ,
the strategy then decides whether to reuse an existing context or create a fresh one,
and in the latter case it also determines how to initialise the new context’s core. In our
example, there are no universal restrictions and all information in vB0 is deterministic,
so K1 = K2 = {B1 }. For EL, a reasonable strategy is to associate with each concept
Bi a context vBi with corevBi = {Bi }, and to always to satisfy existential quantifiers
of the form ∃S.Bi using vBi ; thus, in our example we introduce vB1 and initialise it
with (8). Note that (5) represents two existential quantifiers, both of which we satisfy
(in separate applications of the Succ rule) using vB1 . Different strategies may be used
with more expressive DLs; please refer to [23, Section 3.4] for an in-depth discussion.
    We construct contexts vB2 , . . . , vBn in a similar way, finally deriving (11) by hy-
perresolving (2) and (10), and then (12) by hyperresolving (3) and (11). Clause (12)
imposes a constraint on the predecessor context, which we propagate backwards using
the Pred rule, obtaining (13) and (15). Since, however, clauses in SvB0 are ‘relative’ to
corevB0 , clause (15) actually represents our query clause B0 v C0 .
    Thus, like resolution, consequence-based calculi ‘summarise’ models to prevent re-
dundant computation, and, like (hyper)tableau calculi, they differentiate elements in a
model of O to prevent the derivation of consequences such as (18).
O2 = {             B0 (x) → S(f1 (x), x) (19)   B1 (x) → S(x, fi (x)) for 2 ≤ i ≤ 3 (22)
                   B0 (x) → B1 (f1 (x)) (20)    B1 (x) → Bi (fi (x)) for 2 ≤ i ≤ 3 (23)
          B2 (x) ∧ B3 (x) → ⊥ V          (21)       W → B4 (x)
                                                Bi (x)                for 2 ≤ i ≤ 3 (24)
                        B1 (x) ∧ 1≤j≤3 S(x, zi ) → 1≤j<k≤3 zj ≈ zk                  (25) }

                  B0 (x)                                                                      S(y, x), B2 (x)
                            Succ[27+28]:    f1   (29)      Succ[32+33+37]:   f2    (38)
                   v1                                                                               v3

 Initialisation: > → B0 (x)          (26)                                         Succ[32+33+37]:      > → S(y, x) (39)
 Hyper[19+26]: > → S(f1 (x), x)      (27)                                         Succ[32+33+37]:      > → B2 (x) (40)
 Hyper[20+26]: > → B1 (f1 (x))       (28)                                         Succ[32+32+37]: B3 (x) → B3 (x) (41)
 Pred[48]:       > → B2 (x) ∨ B3 (x) (49)                                         Hyper[21+40+41]: B3 (x) → ⊥      (42)
 Hyper[24+49]: > → B4 (x) ∨ B2 (x) (50)
 Hyper[24+50]: > → B4 (x)            (51)             S(x, y), B1 (x)    Succ[32+33+37]:     f2   (38)
                                                 v2
                                                                                                     S(y, x), B3 (x)
            Succ[27+28]:           > → S(x, y)                                   (30)
            Succ[27+28]:           > → B1 (x)                                    (31)                       v4
            Hyper[22+31]:          > → S(x, f2 (x))                              (32)
                                                                                            Succ[34+35]: > → S(y, x) (44)
            Hyper[23+31]:          > → B2 (f2 (x))                               (33)
                                                                                            Succ[34+35]: > → B3 (x) (45)
            Hyper[22+31]:          > → S(x, f3 (x))                              (34)
            Hyper[23+31]:          > → B3 (f3 (x))                               (35)
            Hyper[25+30+31+32+34]: > → f2 (x) ≈ y ∨ f3 (x) ≈ y ∨ f3 (x) ≈ f2 (x) (36)
            Eq[35+36]:             > → f2 (x) ≈ y ∨ f3 (x) ≈ y ∨ B3 (f2 (x))     (37)
            Pred[37+42]:           > → f2 (x) ≈ y ∨ f3 (x) ≈ y                   (46)
            Eq[35+46]:             > → B3 (y) ∨ f2 (x) ≈ y                       (47)
            Eq[33+47]:             > → B2 (y) ∨ B3 (y)                           (48)


          Fig. 2. Challenges in Extending the Consequence-Based Framework to SHIQ

4    Extending the Framework to SHIQ
We now present an example before formalising the calculus. Due to an interaction
between counting quantifiers and inverse roles, a SHIQ ontology can impose more
complex constraints on model elements than ALCI. Let O2 be the SHIQ ontology
shown in Figure 2; we argue that O2 |= B0 (x) → B4 (x) holds. To see why, consider
an equality Herbrand interpretation I constructed from B0 (a). Then, (19) and (20) de-
rive S(f1 (a), a) and B1 (f1 (a)); moreover, (22) and (23) derive S(f1 (a), f2 (f1 (a)))
and B2 (f2 (f1 (a))), and S(f1 (a), f3 (f1 (a))) and B3 (f3 (f1 (a))). Due to (24) we de-
rive B4 (f2 (f1 (a))) and B4 (f3 (f1 (a))). Finally, from (25) we derive (52).
                    f2 (f1 (a)) ≈ a ∨ f3 (f1 (a)) ≈ a ∨ f3 (f1 (a)) ≈ f2 (f1 (a))                                      (52)
We must satisfy at least one disjunct in (52). Disjunct f3 (f1 (a)) ≈ f2 (f1 (a)) cannot
be satisfied due to (21); but then, regardless of whether we satisfy f3 (f1 (a)) ≈ a or
f2 (f1 (a)) ≈ a, we derive B4 (a); hence, the inference holds.
    To prove this in our consequence-based framework, we must capture constraint (52)
and its consequences. However, this cannot be done using standard description logic
notation because DL concepts cannot identify specific successors and predecessors of
f1 (a)—that is, they cannot say ‘either the first or the second successor is equal to the
predecessor’. Thus, our main challenges are to devise a method for representing all the
relevant constraints that can be induced by SHIQ ontologies, and to ensure that such
constraints are correctly propagated between adjacent contexts.
    To address these challenges, we Skolemise existential quantifiers and transform ax-
ioms into DL-clauses. Skolemisation introduces function symbols that act as names for
successors. Our clauses thus contain terms of the form x, fi (x), and y which have a
special meaning in our setting: variable x represents the elements that a context stands
for; fi (x) represents a successor of x; and y represents the predecessor of x. This allows
us to represent constraint (52) as
                        f2 (x) ≈ y ∨ f3 (x) ≈ y ∨ f3 (x) ≈ f2 (x).                        (53)
     Table 2 shows the inference rules of our calculus that are applicable to such a clausal
representation. In each clause, literals are ordered from the smallest to the largest, and
so the maximal literal is always on the right; moreover, clause numbers correspond
to the order of clause derivation. In the rest of this section, we discuss the rules on
our running example and show how they verify O2 |= B0 (x) → B4 (x); for brevity, we
present only the inferences needed to produce the desired conclusion.
     We first create context v1 and initialise it with (26); this ensures that each interpreta-
tion represented by the context structure contains an element for which B0 holds. Next,
we derive (27) and (28) using hyperresolution. At this point, we could hyperresolve
(22) and (28) to obtain > → S(f1 (x), f2 (f1 (x))); however, such inferences could eas-
ily lead to nontermination of the calculus due to increased term nesting. Therefore, we
require hyperresolution to map variable x in the DL-clauses to variable x in the context
clauses; thus, in each context, hyperresolution derives only consequences about x.
     Function symbol f1 in clauses (27) and (28) is akin to an existential quantifier;
consequently, the Succ rule introduces a fresh context v2 . Due to Skolemisation, edges
in our context structure are labelled with function symbols, rather than concepts of
the form ∃S.B as in [23]. The rule uses an expansion strategy analogous to the EL
strategy from Section 3. To determine which information to propagate to a successor,
Definition 2 given below introduces a set Su(O) of successor triggers. In our example,
DL-clause (25) contains atoms B1 (x) and S(x, zi ) in its body, where zi can be mapped
to a predecessor or a successor of x, so a context in which hyperresolution is applied to
(25) will be interested in information about its predecessors; we reflect this by adding
B1 (x) and S(x, y) to Su(O). Thus, the Succ rule introduces context v2 , sets its core to
B1 (x) and S(x, y), and initialises the context with (30) and (31).
     We next introduce (32)–(35) using hyperresolution, at which point we have suffi-
cient information to apply hyperresolution to (25) to derive (36). Please note how the
presence of (30) is crucial for this inference. We use paramodulation to deal with equal-
ity in clause (36). As is common in resolution-based theorem proving, we order the
literals in a clause and apply inferences only to maximal literals; thus, we derive (37).
     Clauses (32), (33), and (37) contain function symbol f2 , so we again apply the Succ
rule and introduce context v3 . Due to clause (33), we know that B2 (x) must always
hold in v2 , so we add B2 (x) to corev2 . However, B3 (f2 (x)) occurs in clause (37) in a
disjunction, so it holds only conditionally in v2 ; we reflect this by including B3 (x) in
the body of clause (41). This allows us derive (42) using hyperresolution.
     Clause (42) essentially says ‘B3 (f2 (x)) should not hold in the predecessor’, so
the Pred rule propagates this information to v2 . This produces clause (46); one can
intuitively understand this inference as hyperresolution of (37) and (42), where we take
into account that term f2 (x) in context v2 is represented as variable x in context v3 .
    After two paramodulation steps, we derive clause (48), which essentially says ‘the
predecessor must satisfy B2 (x) or B3 (x)’. The set Pr(O) of predecessor triggers from
Definition 2 identifies this information as relevant to v1 : the DL-clauses in (24) contain
B2 (x) and B3 (x) in their bodies, which are represented in v2 as B2 (y) and B3 (y).
Hence Pr(O) contains B2 (y) and B3 (y), which allows the Pred rule to derive (49).
    After two more hyperresolution steps, we finally derive our target clause (51). Please
note, however, that we cannot derive this if B4 (x) were maximal in (50); thus, for
completeness we require all atoms in the head of a query clause to be smallest. A similar
observation applies to Pr(O): if B3 (y) were maximal in (47), we would not derive (48)
and propagate it to v1 ; thus, we require all atoms in Pr(O) to be smallest too.

5   Formalising the Consequence-Based Algorithm for SHIQ
Our calculus manipulates context clauses, which are constructed from context terms and
context literals as described in Definition 1. Unlike in general resolution, we restrict
context clauses to contain only variables x and y, which have a special meaning in our
setting: variable x represents a point (i.e., a term) in the model, and y represents the
predecessor of x; this naming convention is important for the rules of our calculus. This
is in contrast to the DL-clauses of an ontology: these can contain variables x and zi ,
and the latter can refer to either the predecessor or a successor of x.
Definition 1. A context F-term is a term of the form x, y, or f (x) for f ∈ F; a context
P-term is a term of the form B(y), B(x), B(f (x)), S(x, y), S(y, x), S(x, f (x)), or
S(f (x), x) for B, R ∈ P and f ∈ F; and a context term is an F-term or a P-term. A
context literal is a literal of the form A ≈ t (called a context atom), f (x) ./ g(x), or
f (x) ./ y, for A a context P-term and ./ ∈ {≈, 6≈ }. A context clause is a clause with
only function-free context atoms in the body, and only context literals in the head.
    Definition 2 introduces sets Su(O) and Pr(O), that identify the information that
must be exchanged between adjacent contexts. Intuitively, Su(O) contains atoms that
are of interest to a context’s successor, and it guides the Succ rule whereas Pr(O) con-
tains atoms that are of interest to a context’s predecessor and it guides the Pred rule.
Definition 2. Let O be an ontology. The set Su(O) of successor triggers of O is the
smallest set of atoms such that, for each clause Γ → ∆ ∈ O, we have (i) B(x) ∈ Γ im-
plies B(x) ∈ Su(O), (ii) S(x, zi ) ∈ Γ implies S(x, y) ∈ Su(O), and (iii) S(zi , x) ∈ Γ
implies S(y, x) ∈ Su(O). The set Pr(O) of predecessor triggers is defined as
      Pr(O) = { A{x 7→ y, y 7→ x} | A ∈ Su(O) } ∪ { B(y) | B occurs in O }.
    Similarly to resolution, our calculus uses a term order  to restrict its inferences.
Definition 3 specifies the conditions that the order must satisfy. Conditions 1 and 2
ensure that F-terms are compared uniformly across contexts; however, P-terms can
be compared in different ways in different contexts. Conditions 1 through 4 ensure that,
when grounded with x and y mapping to a term its predecessor, respectively, the order is
a simplification order [4]—a kind of term order commonly used in equational theorem
proving. Finally, condition 5 ensures that atoms that might be propagated to a context’s
predecessor via the Pred rule are smallest, which is important for completeness.
Definition 3. Let m be a total, well-founded order on function symbols. A context term
order  is an order on context terms satisfying the following conditions:
 1. for each f ∈ F, we have f (x)  x  y;
 2. for all f, g ∈ F with f m g, we have f (x)  g(x);
 3. for all terms s1 , s2 , and t and each position p in t, if s1  s2 , then t[s1 ]p  t[s2 ]p ;
 4. for each term s and each proper position p in s, we have s  s|p ; and
 5. for each atom A ≈ t ∈ Pr(O) and each context term s 6∈ {x, y}, we have A 6 s.
Order  is extended to literals, also written , as described in Section 2.
    A lexicographic path order (LPO) [4] over context F-terms and context P-terms, in
which x and y are treated as constants such that x  y, satisfies conditions 1 through 4.
Furthermore, Pr(O) contains only atoms of the form B(y), S(x, y), and S(y, x), which
can always be smallest in the order; thus, condition 5 does not contradict the other con-
ditions. Hence, an LPO that is relaxed for condition 5 satisfies Definition 3. In addition
to orders, redundancy elimination techniques have proven crucial to the practical effec-
tiveness of resolution calculi. We now define a criterion compatible with our setting.
Definition 4. A set of clauses U contains a clause Γ → ∆ up to redundancy, written
Γ →∆∈    ˆ U , if
 1. terms s and s0 exist such that s ≈ s ∈ ∆ or {s ≈ s0 , s 6≈ s0 } ⊆ ∆, or
 2. a clause Γ 0 → ∆0 ∈ U exist such that Γ 0 ⊆ Γ and ∆0 ⊆ ∆.
                                                                     ˆ U \ {C }, for
Proposition 1. For U a set of clauses and C ∈ U a clause such that C ∈
each clause C 0 with C 0 ∈
                         ˆ U , we have C 0 ∈
                                           ˆ U \ {C }.
    Proposition 1 shows that our calculus is compatible with backward subsumption
(which is captured in the Elim rule). Note that tautologies of the form A → A are not
necessarily redundant since they can be used to initialise contexts. However, if our cal-
culus were to derive both A → A and A → A ∨ A0 then the latter is always redundant.
    We now formalise the notion of a context structure, and define soundness for a
context structure. The latter captures the fact that corev is not contained in the body of
any context clause in Sv .
Definition 5. A context structure for an ontology O is a tuple D = hV, E, S, core, i,
where V is a finite set of contexts, E ⊆ V × V × F is a finite set of edges labelled with
function symbols, function core assigns to each context v a conjunction corev of atoms
over the P-terms from Su(O), function S assigns to each context v a finite set Sv of
context clauses, and function  assigns to each context v a context term order v . A
context structure D is sound for O if the following conditions both hold:
 S1. O |= corev ∧ Γ → ∆ for each context v ∈ V and each clause Γ → ∆ ∈ Sv , and
 S2. O |= coreu → corev {x 7→ f (x), y 7→ x} for each edge hu, v, f i ∈ E.
    Definition 6 introduces an expansion strategy—a parameter of our calculus that
determines when and how to reuse contexts in order to satisfy existential restrictions.
We have discussed the roles of expansion strategies in Section 3; moreover, in [23] we
presented several expansion strategies for the DLs contained in ALCI, and adapting
these to SHIQ is straightforward.
                   Table 2. The rules of the consequence-based calculus for SHIQ

         If    A ∈ corev ,
Core


               and > → A ∈      / Sv ,
         then add > → A to Sv .
               Vn
         If       i=1 Ai → ∆ ∈ O,
               σ is a substitution such that σ(x) = x,
Hyper


               Γi → V∆i ∨ Ai σ ∈ Sv with   Wn ∆i 6v Ai σ for i ∈ {1, . . . , n},
               and Vn i=1 Γi → ∆σ ∨ Wi=1 ∆i ∈         6 ˆ Sv ,
         then add n   i=1 iΓ    →   ∆σ   ∨   n
                                             i=1 ∆  i to  Sv .
         If    Γ1 → ∆1 ∨ s1 ≈ t1 ∈ Sv with s1 v t1 and ∆1 6v s1 ≈ t1 ,
               Γ2 → ∆2 ∨ s2 ./ t2 ∈ Sv with ./ ∈ {≈, 6≈ } and s2 v t2 and ∆2 6v s2 ./ t2 ,
Eq


               s2 |p = s1 ,
               and Γ1 ∧ Γ2 → ∆1 ∨ ∆2 ∨ s2 [t1 ]p ./ t2 ∈          6 ˆ Sv ,
         then add Γ1 ∧ Γ2 → ∆1 ∨ ∆2 ∨ s2 [t1 ]p ./ t2 to Sv .
         If    Γ → ∆ ∨ t 6≈ t ∈ Sv with ∆ 6v t 6≈ t,
Ineq


               and Γ → ∆ ∈    6 ˆ Sv ,
         then add Γ → ∆ to Sv .
         If    Γ → ∆ ∨ s ≈ t ∨ s ≈ t0 ∈ Sv with ∆ ∪ {s ≈ t} 6v s ≈ t0 and s v t0 ,
Factor


               and Γ → ∆ ∨ t 6≈ t0 ∨ s ≈ t0 ∈     6 ˆ Sv ,
         then add Γ → ∆ ∨ t 6≈ t0 ∨ s ≈ t0 to Sv .
         If    Γ → ∆ ∈ Sv and
Elim


               Γ →∆∈      ˆ Sv \ {Γ → ∆}
         then remove Γ → ∆ from Sv .
         If    hu, v, f i ∈ E,
               Vm             Wm+n
                  i=1 Ai →        i=m+1 Ai ∈ Sv ,
               Γi → ∆i ∨ Ai σ ∈ Su with ∆i 6u Ai σ for 1 ≤ i ≤ m,
Pred


               Ai ∈ Pr(O) for each m + 1 ≤ i ≤ m + n,
               and m       Γi → m          ∆i ∨ m+n                 6 ˆ Su ,
                    V               W           W
                                                            Aσ∈
                    Vi=1            Wi=1        Wi=m+1 i
         then add i=1 Γi → i=1 ∆i ∨ m+n
                      m                m
                                                  i=m+1     A  i σ  to  Su ;
         where σ = {x 7→ f (x), y 7→ x}.
         If    Γ → ∆ ∨ A ∈ Su where ∆ 6u A and A contains f (x), and
               no edge hu, v, f i ∈ E exists such that A → A ∈           ˆ Sv for each A ∈ K2 \ corev ,
         then let hv, core0 , 0 i := strategy(K1 , D);
               if v ∈ V, then let v := v ∩ 0 , and
Succ


               otherwise let V := V ∪ {v}, corev := core0 , v := 0 , and Sv := ∅;
               add the edge hu, v, f i to E; and
               add A → A to Sv for each A ∈ K2 \ corev ;
         where σ = {x 7→ f (x), y 7→ x},
               K1 = { A ∈ Su(O) | > → Aσ ∈ Su }, and
               K2 = { A ∈ Su(O) | Γ 0 → ∆0 ∨ Aσ ∈ Su and ∆0 6u Aσ }.
Definition 6. An expansion strategy is a polynomial-time computable function strategy
that takes a set of atoms K and a context structure D = hV, E, S, core, i. The result
of strategy(K, D) is a triple hv, core0 , 0 i where core0 is a subset of K; either v ∈
                                                                                      / V is
a fresh context, or v ∈ V is an existing context in D such that corev = core0 ; and 0 is
a context term order.
    We now present the main theorems. Full proofs of all technical results can be found
in [5]. Theorem 1 proves that all clauses derived by our calculus are indeed conclusions
of the input ontology, and Theorem 2 is our statement of completeness.
Theorem 1. For any strategy, applying a rule from Table 2 to an ontology O and a
context structure D that is sound for O produces a context structure that is sound for O.

Theorem 2. Let O be an ontology, and let D = hV, E, S, core, i be a context struc-
ture such that no inference rule from Table 2 is applicable to O and D. Then, for each
query clause ΓQ → ∆Q and each context q ∈ V that satisfy all of the following condi-
tions, we have ΓQ → ∆Q ∈   ˆ Sq .
 C1. O |= ΓQ → ∆Q .
 C2. For each atom A ≈ t ∈ ∆Q and each context term s 6∈ {x, y} such that A q s,
     we have s ≈ t ∈ ∆Q ∪ Pr(O).
                                      ˆ Sq .
 C3. For each A ∈ ΓQ , we have ΓQ → A ∈
    Theorems 1 and 2 result in the following algorithm for deciding O |= ΓQ → ∆Q .
 1. Create an empty context structure D, introduce a context q, and, for each A ∈ ΓQ ,
    add ΓQ → A to Sq in order to satisfy condition C3.
 2. Initialise q in a way that satisfies condition C2, and select an expansion strategy.
 3. Saturate D and O using the inference rules from Table 2.
 4. ΓQ → ∆Q holds if and only if ΓQ → ∆Q ∈        ˆ Sv .
Proposition 2. For each expansion strategy that introduces at most exponentially many
contexts, the consequence-based calculus for SHIQ is worst-case optimal.
Proof. The number ℘ of different context clauses that can be generated using the sym-
bols in O is clearly at most exponential in the size of O, and the number m of clauses
participating in each inference is linear in the size of O. Hence, with k contexts, the
number of inferences is bounded by (k · ℘)m ; if k is at most exponential in the size of
O, the number of inferences is exponential as well. Thus, if at most exponentially many
contexts are introduced, our algorithm runs in exponential time, which is worst-case
optimal for SHIQ [3].                                                                 t
                                                                                      u

6   Conclusion
We have presented the first consequence based calculus for SHIQ, a DL that includes
both disjunction and counting quantifiers. Our calculus combines ideas from state of the
art resolution and (hyper)tableau calculi, including the use of ordered paramodulation
for efficient equality reasoning. In spite of its increased complexity, the calculus mimics
existing and well proven EL calculi on EL ontologies. Thus, although implementation
and evaluation remain as future work, we believe that the calculus is likely to work well
on ‘mostly-EL’ ontologies—a type of ontology that occurs commonly in practice.
References
 1. H. Andréka, J. van Benthem, and I. Németi. Modal Languages and Bounded Fragments of
    Predicate Logic. Journal of Philosophical Logic, 27(3):217–274, 1998.
 2. F. Baader, S. Brandt, and C. Lutz. Pushing the EL Envelope. In L. Pack Kaelbling and
    A. Saffiotti, editors, Proc. of the 19th Int. Joint Conference on Artificial Intelligence (IJCAI
    2005), pages 364–369, Edinburgh, UK, July 30–August 5 2005. Morgan Kaufmann Publish-
    ers.
 3. F. Baader, D. Calvanese, D. McGuinness, D. Nardi, and P. F. Patel-Schneider, editors. The
    Description Logic Handbook: Theory, Implementation and Applications. Cambridge Uni-
    versity Press, January 2003.
 4. F. Baader and T. Nipkow. Term Rewriting and All That. Cambridge University Press, 1998.
 5. A. Bate, B. Motik, B. Cuenca Grau, F. Simančı́k, and I. Horrocks. Extending Consequence-
    Based Reasoning to SHIQ. Technical report, Department of Computer Science, University
    of Oxford, May 2015.
 6. P. Baumgartner, U. Furbach, and I. Niemelä. Hyper Tableaux. In Proc. of the European
    Workshop on Logics in Artificial Intelligence (JELIA ’96), number 1126 in LNAI, pages
    1–17, Évora, Portugal, September 30–October 3 1996. Springer.
 7. P. Baumgartner and R. A. Schmidt. Blocking and Other Enhancements for Bottom-Up Model
    Generation Methods. In U. Furbach and N. Shankar, editors, Proc. of the 3rd Int. Joint Conf.
    on Automated Reasoning (IJCAR 2006), volume 4130 of LNCS, pages 125–139, Seattle, WA,
    USA, August 17–20 2006. Springer.
 8. S. Demri and H. de Nivelle. Deciding Regular Grammar Logics with Converse Through
    First-Order Logic. Journal of Logic, Language and Information, 14(3):289–329, 2005.
 9. Birte Glimm, Ian Horrocks, Boris Motik, Giorgos Stoilos, and Zhe Wang. HermiT: An OWL
    2 Reasoner. Journal of Automated Reasoning, 53(3):245–269, 2014.
10. Rajeev Goré and Linh Anh Nguyen. EXPTIME Tableaux with Global Caching for Descrip-
    tion Logics with Transitive Roles, Inverse Roles and Role Hierarchies. In Nicola Olivetti,
    editor, Proc. of the 16th Int. Conf. on Automated Reasoning with Tableaux and Related Meth-
    ods (TABLEAUX 2007), volume 4548 of LNCS, pages 133–148, Aix en Provence, France,
    July 3–6 2007. Springer.
11. E. Grädel, M. Otto, and E. Rosen. Two-Variable Logic with Counting is Decidable. In Proc.
    of the 12th IEEE Symposium on Logic in Computer Science (LICS ’97), pages 306–317,
    Warsaw, Poland, June 29–July 2 1997. IEEE Computer Society.
12. V. Haarslev and R. Möller. RACER System Description. In R. Goré, A. Leitsch, and T. Nip-
    kow, editors, Proc. of the 1st Int. Joint Conf. on Automated Reasoning (IJCAR 2001), volume
    2083 of LNAI, pages 701–706, Siena, Italy, June 18–23 2001. Springer.
13. U. Hustadt and R.A. Schmidt. On the Relation of Resolution and Tableaux Proof Systems for
    Description Logics. In Thomas Dean, editor, Proc. of the 16th Int. Joint Conf. on Artificial
    Intelligence (IJCAI 1999), pages 110–117, Stockholm, Sweden, July 31 – August 6 1999.
    Morgan Kaufmann.
14. Ullrich Hustadt, Boris Motik, and Ulrike Sattler. Deciding Expressive Description Logics in
    the Framework of Resolution. Information & Computation, 206(5):579–601, 2008.
15. Y. Kazakov. Consequence-Driven Reasoning for Horn SHIQ Ontologies. In Craig Boutilier,
    editor, Proc. of the 21st Int. Joint Conf. on Artificial Intelligence (IJCAI 2009), pages 2040–
    2045, Pasadena, CA, USA, July 11–17 2009.
16. B. Motik, R. Shearer, and I. Horrocks. Hypertableau Reasoning for Description Logics.
    Journal of Artificial Intelligence Research, 36:165–228, 2009.
17. R. Nieuwenhuis and A. Rubio. Theorem Proving with Ordering and Equality Constrained
    Clauses. Journal of Symbolic Computation, 19(4):312–351, 1995.
18. H. De Nivelle, R. A. Schmidt, and U. Hustadt. Resolution-Based Methods for Modal Logics.
    Logic Journal of the IGPL, 8(3):265–292, 2000.
19. M. Ortiz, S. Rudolph, and M. Simkus. Worst-Case Optimal Reasoning for the Horn-DL
    Fragments of OWL 1 and 2. In F. Lin, U. Sattler, and M. Truszczynski, editors, Proc. of
    the 12th Int. Conf. on Knowledge Representation and Reasoning (KR 2010), pages 269–279,
    Toronto, ON, Canada, May 9–13 2010. AAAI Press.
20. A. Riazanov and A. Voronkov. The design and implementation of VAMPIRE. AI Communi-
    cations, 15(2–3):91–110, 2002.
21. R. A. Schmidt and U. Hustadt. A Principle for Incorporating Axioms into the First-Order
    Translation of Modal Formulae. In F. Baader, editor, Proc. of the 19th Int. Conf. on Au-
    tomated Deduction (CADE-19), volume 2741 of LNAI, pages 412–426, Miami Beach, FL,
    USA, July 28–August 2 2003. Springer.
22. S. Schulz. E—A Brainiac Theorem Prover. AI Communications, 15(2–3):111–126, 2002.
23. František Simančı́k, Boris Motik, and Ian Horrocks. Consequence-Based and Fixed-
    Parameter Tractable Reasoning in Description Logics. Artificial Intelligence, 209:29–77,
    2014.
24. F. Simančı́k, Y. Kazakov, and I. Horrocks. Consequence-Based Reasoning beyond Horn
    Ontologies. In Toby Walsh, editor, Proc. of the 22nd Int. Joint Conf. on Artificial Intelligence
    (IJCAI 2011), pages 1093–1098, Barcelona, Spain, July 16–22 2011.
25. E. Sirin, B. Parsia, B. Cuenca Grau, A. Kalyanpur, and Y. Katz. Pellet: A practical OWL-DL
    reasoner. Journal of Web Semantics, 5(2):51–53, 2007.
26. D. Tsarkov and I. Horrocks. FaCT++ Description Logic Reasoner: System Description. In
    Proc. of the 3rd Int. Joint Conf. on Automated Reasoning (IJCAR 2006), volume 4130 of
    LNAI, pages 292–297, Seattle, WA, USA, August 17–20 2006. Springer.