Satisfiability Checking and Conjunctive Query
Answering in Description Logics with Global and
          Local Cardinality Constraints

        Franz Baader1 , Bartosz Bednarczyk12 , and Sebastian Rudolph1
               1
                 Faculty of Computer Science, TU Dresden, Germany
                        firstname.lastname@tu-dresden.de
          2
            Institute of Computer Science, University of Wrocław, Poland
                       bartosz.bednarczyk@cs.uni.wroc.pl


      Abstract. We consider an expressive description logic (DL) in which
      the global and local cardinality constraints introduced in previous papers
      can be mixed. On the one hand, we show that this does not increase
      the complexity of satisfiability checking and other standard inference
      problems. On the other hand, conjunctive query entailment in this DL
      turns out to be undecidable. We prove that decidability of querying can
      be regained if global and local constraints are not mixed and the global
      constraints are appropriately restricted.


1   Introduction

DLs that can express both local cardinality constraints (i.e., constraints concern-
ing the role successors of specific individuals) and global cardinality constraints
(i.e., constraints on the overall cardinality of concepts) can, for instance, be used
to check the correctness of statistical statements. For example, if a German car
company claims that they have produced more than N cars in a certain year,
and P % of the tires used for their cars were produced by Betteryear, this may
be contradictory to a statement of Betteryear that they have sold less than M
tires in Germany. Such statistical information may, of course, also influence the
answers to queries. If we know that the car company VMW uses only tires from
Betteryear or Badmonth, but the statistical information allows us to conclude
that another car company has actually bought all the tires sold by Betteryear,
then we know that the cars sold by VMW all have tires produced by Badmonth.
This motivates investigating DLs with expressive cardinality constraints, and
to consider not just standard inferences such as satisfiability checking for these
DLs, but also query answering.
    In two previous publications we have, on the one hand, extended the DL
ALCQ by more expressive number restrictions using cardinality and set con-
straints expressed in the quantifier-free fragment of Boolean Algebra with Pres-
burger Arithmetic (QFBAPA) [10]. In the resulting DL ALCSCC, which was
introduced and investigated in [1], cardinality and set constraints are applied
locally, i.e., they refer to the role successors of an individual under consid-
eration. It was shown in [1] that pure concept satisfiability in ALCSCC is a
PSpace-complete problem, and concept satisfiability w.r.t. a general TBox is
ExpTime-complete. This shows that the more expressive number restrictions
do not increase the complexity of reasoning since reasoning in ALCQ has the
same complexity [17,19].
    On the other hand, we have extended the terminological formalism of the
well-known description logic ALC from general TBoxes to more general cardinal-
ity constraints expressed in QFBAPA [4], which we called extended cardinality
constraints (ECBoxes). These constraints are global since they refer to all indi-
viduals in the interpretation domain. It was shown in [4] that reasoning w.r.t.
ECBoxes is in NExpTime even if the numbers occurring in the constraints are
encoded in binary. A NExpTime lower bound follows from a result of Tobies [18]
for a restricted form of cardinality constraints, where the cardinality of a concept
can only be compared with a fixed number. This complexity can be lowered to
ExpTime if a restricted form of cardinality constraints (RCBoxes) is used. Such
RCBoxes are still powerful enough to express statistical knowledge bases [13].
    An obvious way to generalize these two approaches is to combine the two ex-
tensions, i.e., to consider extended cardinality constraints, but now on ALCSCC
concepts rather than just ALC concepts. This combination was investigated
in [2], where a NExpTime upper bound was established for reasoning in ALCSCC
w.r.t. ECBoxes. It is also shown in [2] that reasoning w.r.t. RCBoxes stays in
ExpTime also for ALCSCC.
    Here we go one step further by allowing for a tighter integration of global
and local constraints. The resulting logic, which we call ALCSCC ++ , allows, for
example, to relate the number of role successors of a given individual with the
overall number of elements of a certain concept. We show that, from a worst-case
complexity point of view, this extended expressivity comes for free if we consider
classical reasoning problems. Concept satisfiability in ALCSCC ++ has the same
complexity as in ALC and ALCSCC with global cardinality constraints: it is
NExpTime-complete. Yet, for effective conjunctive query answering this logic
turns out to be too expressive. We show that conjunctive query entailment w.r.t.
ALCSCC ++ knowledge bases is, in fact, undecidable. In contrast, we can show
that conjunctive query entailment w.r.t. ALCSCC RCBoxes is decidable.
    We assume the reader to be sufficiently familiar with all the standard notions
of description logics [3,5,15].


2   The logic ALCSCC ++

As in [1,4], we use the quantifier-free fragment of Boolean Algebra with Pres-
burger Arithmetic (QFBAPA) to express our constraints. In this logic, one can
build set terms by applying Boolean operations (intersection ∩, union ∪, and
complement ·c ) to set variables as well as the constants ∅ and U. Set terms s, t
can then be used to state set constraints, which are equality and inclusion con-
straints of the form s = t, s ⊆ t, where s, t are set terms. Presburger Arithmetic
(PA) expressions are built from integer constants and set cardinalities |s| using
addition as well as multiplication with an integer constant. They can be used
to form cardinality constraints of the form k = `, k < `, N dvd `, where k, ` are
PA expressions, N is an integer constant, and dvd stands for divisibility. A QF-
BAPA formula is a Boolean combination of set and cardinality constraints. A
solution σ of a QFBAPA formula φ assigns a finite set σ(U) to U and subsets
of σ(U) to set variables such that φ is satisfied by this assignment (see [1] for
more details). A QFBAPA formula φ is satisfiable if it has a solution. In [10] it
is shown that the satisfiability problem for QFBAPA formulae is NP-complete.
    We are now ready to define our new logic, which we call ALCSCC ++ to indi-
cate that it is an extension of the logic ALCSCC introduced in [1]. When defining
the semantics of ALCSCC ++ , we restrict the attention to finite interpretations
to ensure that cardinalities of concept descriptions are always well-defined non-
negative integers.

Definition 1. Given disjoint finite sets NC and NR of concept names and
role names, respectively, ALCSCC ++ concept descriptions (short: concepts) are
Boolean combinations of concept names and constraint expressions, where a con-
straint expression is of the form sat(c) for a set constraint or a cardinality con-
straint c that uses role names and ALCSCC ++ concept descriptions in place of
set variables. As usual, we use > ( top) and ⊥ ( bottom) as abbreviations for
A t ¬A and A u ¬A, respectively.
    A finite interpretation of NC and NR consists of a finite, non-empty set ∆I
and a mapping ·I that maps every concept name A ∈ NC to a subset AI of ∆I
and every role name r ∈ NR to a binary relation rI over ∆I . For a given element
d ∈ ∆I we define rI (d) := {e ∈ ∆I | (d, e) ∈ rI }. The interpretation function ·I
is inductively extended to ALCSCC ++ concept descriptions by interpreting the
Boolean operators as usual, and the constraint expressions as follows:
sat(c)I := {d ∈ ∆I | the mapping ·Id satisfies c}, where ·Id maps
– ∅ to ∅Id := ∅ and U to U Id := ∆I ,
– the ALCSCC ++ concept descriptions C occurring in c to C Id := C I ,
– and the role names r occurring in c to rId := rI (d).
The ALCSCC ++ concept description C is satisfiable if there is a finite interpre-
tation I such that C I 6= ∅.

    Note that the interpretation of concepts as set variables in ALCSCC ++ is
global in the sense that it does not depend on d, i.e., C Id = C Ie for all d, e ∈ ∆I .
In contrast, the interpretation of role names r as set variables is local since only
the r-successors of d are considered by ·Id . In ALCSCC, also the interpretation of
concepts as set variables is local since in the semantics of ALCSCC the mapping
·Id considers only the elements of C I that are role successors of d for some      role
name in NR . This can clearly be simulated in ALCSCC ++ by using C ∩( r∈NR r)
                                                                             S
instead of C when formulating the constraint. Thus, ALCSCC concepts can be
expressed by ALCSCC ++ concepts. In addition, extended cardinality constraints
(ECBoxes), as introduced in [4], are expressible within ALCSCC ++ concept de-
scriptions, as are nominals, the universal role, and role negation.
Proposition 1. ALCSCC ++ concepts can polynomially express nominals, role
conjunctions, and ALCSCC ECBoxes, and thus also ABoxes, ALC ECBoxes
and ALCSCC TBoxes. In addition, they have the same expressivity as concepts
of ALCSCC extended with the universal role or with role negation, whereas both
of these features are not expressible in plain ALCSCC.

Proof. ECBoxes correspond to Boolean combinations of concept descriptions of
the form sat(c) where c contains only concept descriptions as set variables. Since
such concept descriptions are satisfied either by no element of ∆I or by all of
them, their effect is to enforce the constraint on the whole interpretation domain
if they are conjoined to a concept description.
    Regarding role negation, for given role names r, r, the constraint sat(> ⊆
sat(r∩r ⊆ ∅)) enforces that, for every individual, the sets of its r and r successors
are disjoint. In addition, the constraint sat(> ⊆ sat(|r| + |r| = |U|)) says that
elements of the domain that are not r successors of a given individual must
be r successors. Thus, we can express that the role r is interpreteted as the
complement of r, i.e. rI = ∆I × ∆I \ rI for every finite interpretation I.
    The other statements in the proposition are also easy to see.                  t
                                                                                   u


3   Satisfiability of ALCSCC ++ concept descriptions
In the following we consider an ALCSCC ++ concept description E and show
how to test E for satisfiability by reducing this problem to the problem of test-
ing satisfiability of QFBAPA formulae. Since the reduction is exponential and
satisfiability in QFBAPA is in NP, this yields a NExpTime upper bound for
satisfiability of ALCSCC ++ concept descriptions. This bound is optimal since
consistency of extended cardinality constraints in ALC, as introduced in [4], is
already NExpTime hard, and can be expressed as an ALCSCC ++ satisfiability
problem by Proposition 1.
    Our NExpTime algorithm combines ideas from the satisfiability algorithm
for ALCSCC concept descriptions [1] and the consistency procedure for ALC
ECBoxes [4]. In particular, we use the notion of a type, as introduced in [4].
This notion is also similar to the Venn regions employed in [1]. Given a set of
concept descriptions M, the type of an individual in an interpretation consists
of the elements of M to which the individual belongs. Such a type t can also
be seen as a concept description Ct , which is the conjunction of all the elements
of t. We assume in the following that M consists of all subdescriptions of the
concept description E as well as the negations of these subdescriptions.

Definition 2. A subset t of M is a type for E if it satisfies the following three
properties: (1) for every concept description ¬C ∈ M, either C or ¬C belongs
to t; (2) for every concept description C u D ∈ M, we have that C u D ∈ t iff
C ∈ t and D ∈ t; (3) for every concept description C t D ∈ M, we have that
C t D ∈ t iff C ∈ t or D ∈ t.
    We denote the set of all types for E with types(E). Given an interpretation
I and an individual d ∈ ∆I , the type of d is the set tI (d) := {C ∈ M | d ∈ C I }.
    Due to Condition (1) in the definition of types, concept descriptions Ct , Ct0
induced by different types t 6= t0 are disjoint, and all concept descriptions in
M can be obtained as the disjoint union of the concept      descriptions induced
by the types containing them, i.e., we have C I = t type with C∈t CtI for all C ∈
                                                   S
M and finite interpretations I. In particular, the following holds for all finite
interpretations I:
                          X                               \
               |C I | =           |CtI | and |CtI | = |      C I |,
                       t type with C∈t                       C∈t

where the latter identity is an immediate consequence of the definition of Ct as
the conjunction of all the elements of t.
    Given a type t, the constraints occurring in the top-level Boolean structure
of t induce a QFBAPA formula ψt , in which the concepts C and roles r occurring
in these constraints are replaced by set variables XC and Xrt , respectively. Note
that set variables corresponding to concepts are independent of the type t, i.e.,
they are shared by all types, whereas the set variables corresponding to roles are
different for different types. This corresponds to the fact that roles are evaluated
locally, but concepts are evaluated globally in the semantics of ALCSCC ++ . In
order to ensure that the Boolean structure of concepts is respected by the set
variables, we introduce the formula β :=
   ^                               ^                            ^                  c
         XCuD = XC ∩ XD ∧               XCtD = XC ∪ XD ∧             X¬C = (XC ) .
CuD∈M                           CtD∈M                          ¬C∈M

Overall, we translate the ALCSCC ++ concept E into the QFBAPA formula
                                      ^       \
              δE := (|XE | ≥ 1) ∧ β ∧      (|    XC | = 0) ∨ ψt .
                                         t∈types(E)   C∈t

Intuitively, to satisfy E, we need to have at least one element in it, which explains
the first conjunct. The third conjunct together with β ensures that, for any type
that is realized (i.e., has elements), the constraints of this type are satisfied. The
next lemma states that there is a 1–1-relationship between solvability of δE and
satisfiability of E.
Lemma 1. The QFBAPA formula δE is of size at most exponential in the size
of E, and it is satisfiable iff the ALCSCC ++ concept description E is satisfiable.
   Since satisfiability of QFBAPA formulae can be decided within NP even for
binary coding of numbers [10], this lemma shows that satisfiability of ALCSCC ++
concept descriptions can be decided within NExpTime. Together with the known
NExpTime lower bound for consistency of ALC ECBoxes in [4], this yields:
Theorem 1. Satisfiability of ALCSCC ++ concept descriptions with numbers en-
coded in binary is NExpTime-complete.
   Thanks to Proposition 1, this carries over to satisfiability of ALCSCC ++
knowledge bases which may feature an ABox, a TBox and an ECBox.
4   Query entailment in ALCSCC ++

The final result of this section is the undecidability of conjunctive query entail-
ment for ALCSCC ++ . To this end, we first briefly recap the notion of (Boolean)
conjunctive queries and define query entailment.
    In queries, we use variables from a countably infinite set V . A Boolean con-
junctive query (CQ) q is a finite set of atoms of the form r(x, y) or C(z), where
r is a role, C is concept, and x, y, z ∈ V . A CQ q is satisfied by I (written:
I |= q) if there is a variable assignment π : V → ∆I (called match) such that
(π(x), π(y)) ∈ rI for every r(x, y) ∈ q and π(z) ∈ C I for every C(z) ∈ q. A CQ
q is (finitely) entailed from a knowledge base K (written: K |= q) if every (finite)
model I of K satisfies q.
    We actually show undecidability of CQ entailment for a much weaker logic,
thereby providing a very restricted fragment of constant-free and equality-free
two-variable first-order logic for which finite CQ entailment is already unde-
cidable, significantly strengthening and solidifying earlier results along those
lines [14]. Our proof makes use of deterministic Turing machines (DTMs). For
our purposes, it is sufficient to consider only computations starting with an
empty tape. For space reasons, we assume the reader to be familiar with stan-
dard notions and constructions concerning DTMs. We call a DTM looping if its
run starting contains repeating configurations,i.e., there are two different (and
hence – due to determinism – infinitely many) points in time, where the ma-
chine’s tape content, head position, and state are the same. It is easy to see that
the problem of determining if a given TM is looping is undecidable.
    We show our undecidability result for the DL ALC cov , a slight extension
of ALC by role cover axioms of the form cov(r, s) for role names r and s. An
interpretation I satisfies cov(r, s) if rI ∪ sI = ∆I × ∆I . Role
                                                                cover axioms can
be expressed in ALCSCC ++ via sat > ⊆ sat(|r ∪ s| = |U|) , hence ALC cov is
subsumed by ALCSCC ++ .
    In what follows, assume that a DTM M is given. We now describe an ALC cov
TBox T and conjunctive query q such that T |= q exactly if M is not looping. We
provide q and T together with the underlying intuitions of our construction. The
goal of our construction is that a countermodel (i.e., an interpretation satisfying
T but not q) corresponds to a looping configuration sequence of M. Thereby,
the domain elements represent tape cells at certain computation steps of M.
The role h connects consecutive tape cells of the same configuration, whereas
the role v connects a configuration’s tape cell with the same tape cell of the
successor configuration.
    We start by providing the query. Intuitively, the query is meant to catch the
unwanted situation that two corresponding tape cells of consecutive configura-
tions are v-connected, but the cells to their right aren’t.

              q = ∃x, y, x0 , y 0 .v(x, y) ∧ h(x, x0 ) ∧ h(y, y 0 ) ∧ v(x0 , y 0 )   (1)
                 Table 1. TBox axioms for DTM implementation

                    > v ∃aux.(TapeStart u InitConf u State qini )               (3)
                                 > v ∃h.> u ∃v.>                                (4)
                             TapeStart v ∀v.TapeStart                           (5)
                 InitConf v ∀h.InitConf       InitConf v Symbol                (6)

   State q v ∀h.NoHeadR      NoHeadR v ∀h.NoHeadR          NoHeadR v NoHead     (7)
∃h.Stateq v NoHeadL       ∃h.NoHeadL v NoHeadL             NoHeadL v NoHead     (8)
                                   State q u NoHead v ⊥                         (9)

                Symbolσ u Symbol σ0 v ⊥           Stateq u State q0 v ⊥        (10)
                         N oHead u Symbol σ v ∀v.Symbolσ                       (11)

                        Stateq u Symbol σ v ∀v.(Symbolσ0 u ∀h.Stateq0 )        (12)
                   ∃h.(Stateq u Symbol σ ) v ∀v.(Stateq0 u ∀h.Symbolσ0 )       (13)
            TapeStart u State q u Symbol σ v ∀v.(Stateq0 u Symbol σ0 )         (14)


We proceed by giving the axioms of T . The following covering axiom ensures
that, whenever two elements are not v-connected, they must be v-connected.
This is needed to enable the above query to catch the described problem.
                                      cov(v, v)                                (2)
The remaining TBox axioms can be found in Table 1. Axiom 3 ensures (by
means of an auxiliary role aux which serves no further purpose) that there is
a first tape cell of the first (initial) configuration where the head of the TM
is positioned in the initial state. Axiom 4 enforces that for every cell of every
configuration there is both a tape cell to its right and a corresponding tape
cell in the successor configuration. Axiom 5 makes sure that, for every cell that
is the first on its tape, the corresponding successor configuration’s tape cell is
also the first. Axioms 6 propagates the information that a cell belongs to the
initial configuration along the tape, and fills the tape with blanks. Axioms 7–9
(instantiated for every state q) make sure that in every configuration there can
only be one cell where the head is positioned. Every cell can only carry one
symbol and the head can be in only one state, as ensured by Axioms 10 (for
distinct symbols σ, σ 0 and distinct states q, q0 ). Thanks to Axiom 11, symbols on
head-free cells carry over to the next configuration. As specified by the DTM’s
transition function, the head reads a symbol σ, writes a symbol σ 0 , changes its
state from q to q0 and moves right (Axiom 12) or left (Axiom 13) or stays in its
place whenever it is supposed to move left but is already at the leftmost tape cell
(Axiom 14). This finishes the description of the TBox T , allowing us to establish
the claimed property and consequenty the undecidability result.
Proposition 2. M is looping iff there is a finite model I of T with I 6|= Q.
Theorem 2. Finite CQ entailment over ALC cov TBoxes is undecidable.

Proof. According to Proposition 2, the TM looping problem can be reduced to
the problem if for a given ALC cov TBox T and conjunctive query q, there is a
finite interpretation I with I |= T with I 6|= q. Note that the latter is the case
exactly if T does not finitely entail q.

   Finally, taking into account that ALCSCC ++ subsumes ALC cov and only al-
lows for finite models, we obtain the wanted result.

Corollary 1. Conjunctive query entailment for ALCSCC ++ is undecidable.


5   Query entailment from ALCSCC RCBoxes

As the last result of the paper, we show that decidability of CQ entailment can
be regained by moving to a less expressive logic, namely ALCSCC RCBoxes, i.e.,
finite sets of restricted cardinality constraints of the form

            N1 |C1 | + . . . + Nk |Ck | ≤ Nk+1 |Ck+1 | + . . . + Nk+` |Ck+` |,

where Ci are ALCSCC concept descriptions and Ni are integer constants for
1 ≤ i ≤ k + `, with the obvious semantics. RCBoxes can express general concept
inclusions (GCIs) C v D via |C u ¬D| ≤ |⊥|, so we use GCIs when appropriate.
    We first sketch our approach. Let R be an input RCBox and let q be an
input CQ. Assume that q is not entailed by R. Hence there is a counter-model I,
satisfying R but not q. Our goal is to develop an algorithm to produce another
RCBox R0 consisting of R and some additional knowledge, in a way that R0 is
satisfiable if and only if there is a counter-model of R for q. This can be done
by a careful analysis of query matches. Note that if a query q is entailed by R,
every model I falls into one of the following two categories: (i) either there is an
acyclic (also called tree-shaped ) query match or (ii) every query match contains
some cyclic substructure of the model.
    To deal with these two cases we proceed as follows. For the first case, in
order to rule out such models having tree-shaped query matches, we enrich our
RCBox R with additional knowledge forbidding these matches. This can be done
by using the so-called rolling-up technique of transforming query matches into
concepts, as, e.g., in [7,11,8]. For the second case, we show that it actually cannot
happen: we argue that if there is a model I with only cyclic query matches, one
can employ an appropriate model transformation, called pumping, to turn I into
a proper counter-model. This technique is sometimes called big-cycle method [12]
or Rosati covers and was successfully used in the context of finite query entail-
ment, e.g, in [6,14,9]. Concluding, it suffices to enrich an ALCSCC RCBox with
additional statements ruling out all models with acyclic query matches, to obtain
an RCBox whose satisfiability coincides with the existence of a counter-model.
Checking (un)satisfiability of ALCSCC RCBoxes is decidable [2], so we can con-
clude following theorem:
Theorem 3. CQ entailment from ALCSCC RCBoxes is decidable.

    The rest of this section is devoted to a sketch of the proof of Theorem 3 as
outlined above. In Section 5.1 we show how to forbid tree-shaped query matches,
in Section 5.2 we handle the case of cyclic matches and construct counter-models.


5.1   Forbidding acyclic query matches

In this section, we provide a method for forbidding tree-shaped query matches.
We strongly rely on previous results on query entailment for ALCHQ knowledge
bases [11]. For the sake of simplicity we assume that all queries under consider-
ation are connected (in the graph-theoretic sense).
    We first recall some standard definitions. Let q be a conjunctive query and
let Vq be the set of variables appearing in q. We can see a query q as a directed
graph Gq = (Vq , Eq ), where the nodes are variables from q and any two nodes x, y
are connected if some atom r(x, y) appears in q. A query is tree-shaped, if the
underlying graph does not contain any (undirected) cycles.
    We introduce the notion of treeification, which for a given query q essentially
describes the set of all its ways to match in a tree-shaped way.
Definition 3. Let q be a CQ. We say that a tree-shaped query q 0 is a treeification
of q, if q 0 can be obtained from q by (possibly multiple times) selecting some atoms
r(x, z) and s(y, z) and replacing all variables x in the query by y. By trees(q) we
denote the set of all treeifications of q.
Note that trees(q) is finite.
    Next we describe the so-called rolling-up technique, which transforms a tree-
shaped query match into a single concept [16,11]. Let us fix a conjunctive query q
and some treeification q 0 of it. For each variable x ∈ Vq we construct a con-
cept Cq0 ,x , with the supposed meaning that d ∈ CqI0 ,x if variable x from q 0 can
be mapped to d in a query match represented by q 0 . We define these concepts
as follows: Picking one arbitrary variable xrq0 ∈ Vq0 , we let (Vq0 , ≺) be the tree
obtained from Gq0 by orienting all edges away from xrq0 . Now, wed define Cq ,x for
                                                                              0

every x ∈ Vq in a bottom-up manner as follows: Cq ,x equals C(x)∈q0 C if x is
               0                                       0

a leaf (i.e. ≺-minimal), otherwise:
               l         l  \                   l  \                   
                  C u         ∃     s.Cq0 ,y u           ∃     s− .Cq0 ,y
           C(x)∈q 0   (x,y)∈E 0
                             q    s(x,y)∈q 0    (y,x)∈E 0
                                                       q    s(y,x)∈q 0
                         y≺x                       y≺x


Concepts of the form ∃ s(y,x)∈q0 s− .Cq0 ,y in the above definition are clearly not
                            T

in ALC (due to the presence of inverse roles), but this can be remedied as fol-
lows: We replace any ∃r− .Cq0 ,y with a (conjunction of) inverted role name(s) r−
by a newly introduced concept C∃r− .Cq0 ,y for which we also specify Cq0 ,y v
∀r.Cr− .Cq0 ,y . Like this, any of the concepts Cq0 ,y is free of inverses.
   For a given CQ q, we enumerate all of its treeifications and roll them up into
a concept. Let RMatch q      be an RCBox defining that there exists a tree-shaped
query match in a model:
                                   G
                                               Cq0 ,xrq0 v Matchq                        (15)
                               q 0 ∈trees(q)

      The two coming lemmas give the precise meaning to the defined concepts.
Lemma 2. Let R be an ALCSCC RCBox and let q be a conjunctive query.
Let RMatch
      q     be as defined above. Assume that R ∪ RMatch
                                                   q     has a model I such
          I
that Matchq is empty. Then I does not have any tree-shaped query matches.
Lemma 3. Let R be an ALCSCC RCBox and let q be a conjunctive query
and RMatch
      q    as defined above. If there is a model I of R without any tree-shaped
query matches, then R∗ = R ∪ RMatch
                                 q      ∪ {> v ¬Matchq } is satisfiable.
   Satisfiability checking of R∗ can be performed with an algorithm from [2].
Together with Lemma 2 and Lemma 3, we establish:
Theorem 4. It is decidable whether for a given CQ q and a given ALCSCC
RCBox R, there is a model of R without any tree-shaped query matches of q.

5.2     Pumping models to enforce high girth
Now we take a closer look at the previously announced pumping method for
eliminating non-tree-shaped query matches.
Definition 4. Let I be an interpretation, ∆I = {d1 , d2 , . . . , dn } be the set of
domain elements and E the set of “edges”, i.e., e = (d, d0 ) occur in E if there
is a role r, s.t. (d, d0 ) ∈ rI . Additionally, let F be the set of all functions f of
type f : E → {0, 1}. We define a pumping of I, denoted with pump(I) = I 0 ,
                                          0
in the following way: (i) we set ∆I = ∆I × F , (ii) for a concept name A, an
                                                                0
element d ∈ ∆ and any function f we set (d, f ) ∈ AI iff d ∈ AI (iii) for any
                 I
                                                                            0
role name r and a pair ((d, f ), (d , f )) we set ((d, f ), (d , f 0 )) ∈ rI iff (d, d0 ) ∈ rI
                                    0   0                     0

and functions f, f are equal on all arguments except for the argument e = (d, d0 )
                    0

for which f 0 (e) = 1 − f (e).
    The girth of I is the length of a shortest (undirected) proper cycle contained
in ∆I (proper means that no element from E is used more than once). It is not
difficult to see that the girth of pump(I) is at least twice the girth of I: due to
the fact that we need to flip the e value in f every time we cross an edge e.
Lemma 4. Let I be an interpretation with girth k. Then the girth of pump(I)
is at least 2k.
    Correctness of the pumping method is guaranteed by the following lemma.
Its proof relies on two observations (i) degree of each node and “types” of its
successors are preserved during pumping, and (ii) all global constraints from the
RCBox are still satisfied since all cardinalities from inequalities were multiplied
by the same number (i.e. |F |). Formally, the proof goes via an induction over
the depth of ALCSCC concepts.
Lemma 5. Let R be an RCBox with a model I. Then pump(I) is a model of R.
   Now we explain how to use the pumping method from the Definition 4 to
erase non-tree query matches and obtain a counter-model.
Lemma 6. Let q be a CQ and let R be an ALCSCC RCBox. Assume R has a
model I that does not have any acyclic query matches of q. Then there exists a
counter-model I 0 for q and R.
Proof. We define I 0 as the |q|-fold application of pump to I. Since each it-
eration of pumping doubles the girth of the input structure, the girth of I 0 is
strictly greater than |q|. Moreover, during the pumping process, we haven’t intro-
duced any new query matches. Hence, since any cyclic match of q would require
girth ≤ |q|, the query q cannot match into I 0 anymore and I 0 is a counter-model.
     By combining (i) a method of pumping a structure from this Section, (ii)
a method of enriching a knowledge-base with a knowledge to forbid all acyclic
query matches from the previous section and (iii) an algorithm for testing (un)sat-
isfiability for ALCSCC concepts w.r.t RCBoxes, we conclude Theorem 3.
     A rough complexity analysis give us a 2ExpTime upper bound. We believe
that this bound can be improved to ExpTime by adapting techniques from [11].

6   Conclusion
In our work, we have significantly pushed the boundaries of quantitative exten-
sions to description logics. We showed that ALC can not only be extended by
both global (extension-based) and local (neighbourhood-based) arithmetic con-
straints but even by hybrid constraints mixing the two, yielding a significant
increase in expressivity without negative impact on the complexity of satisfia-
bility testing. On the downside, we had to find out that this extension leads to
undecidability of conjunctive query answering. However, we were able to regain
decidability and even a favourable complexity under appropriate restrictions.
    Without any doubt, the ability to deal with factual data, i.e. ABoxes, is
of utmost importance in the context of statistical considerations and query-
ing. Therefore, our next step will be to extend our investigations to full-fledged
ALCSCC knowledge bases including ABoxes. In fact, we have already established
that ABoxes can be added to ALCSCC RCBoxes without impacting the Exp-
Time complexity of satisfiability checking and are confident that standard query
partitioning and a slight modification of model pumping will allow us to show
ExpTime completeness of CQ answering.

Acknowledgements
Bartosz Bednarczyk was supported by the Polish Ministry of Science and Higher
Education program “Diamentowy Grant” no. DI2017 006447. Sebastian Rudolph
was supported by the European Research Council (ERC) through the Consol-
idator Grant 771779 (DeciGUT). Franz Baader was partially supported by the
German Research Foundation (DFG) within the Research Unit 1513 Hybris.
References
 1. Franz Baader. A new description logic with set constraints and cardinality con-
    straints on role successors. In Clare Dixon and Marcelo Finger, editors, Pro-
    ceedings of the 11th International Symposium on Frontiers of Combining Systems
    (FroCoS’17), volume 10483 of Lecture Notes in Computer Science, pages 43–59,
    Brasília, Brazil, 2017. Springer-Verlag.
 2. Franz Baader. Expressive cardinality constraints on ALCSCC concepts. In Pro-
    ceedings of the 34th ACM/SIGAPP Symposium On Applied Computing (SAC’19).
    ACM, 2019.
 3. Franz Baader, Diego Calvanese, Deborah McGuinness, Daniele Nardi, and Peter
    Patel-Schneider, editors. The Description Logic Handbook: Theory, Implementa-
    tion, and Applications. Cambridge University Press, second edition, 2007.
 4. Franz Baader and Andreas Ecke. Extending the description logic alc with more
    expressive cardinality constraints on concepts. In GCAI 2017. 3rd Global Con-
    ference on Artificial Intelligence, volume 50 of EPiC Series in Computing, pages
    6–19. EasyChair, 2017.
 5. Franz Baader, Ian Horrocks, Carsten Lutz, and Uli Sattler. An Introduction to
    Description Logic. Cambridge University Press, 2017.
 6. Vince Bárány, Georg Gottlob, and Martin Otto. Querying the guarded fragment.
    Logical Methods in Computer Science, 10(2), 2014.
 7. Diego Calvanese, Giuseppe De Giacomo, and Maurizio Lenzerini. On the decid-
    ability of query containment under constraints. In Proc. of the 17th ACM SIGACT
    SIGMOD SIGART Symp. on Principles of Database Systems (PODS’98), pages
    149–158, 1998.
 8. Birte Glimm, Carsten Lutz, Ian Horrocks, and Ulrike Sattler. Conjunctive query
    answering for the description logic SHIQ. J. of Artificial Intelligence Research,
    31:157–204, 2008.
 9. Yazmin Angélica Ibáñez-García, Carsten Lutz, and Thomas Schneider. Finite
    model reasoning in horn description logics. In Proc. of the 14th Int. Conf. on
    Principles of Knowledge Representation and Reasoning (KR 2014), pages 288–297,
    2014.
10. Viktor Kuncak and Martin C. Rinard. Towards efficient satisfiability checking for
    Boolean algebra with Presburger arithmetic. In Frank Pfenning, editor, Proc. of
    the 21st Int. Conf. on Automated Deduction (CADE-07), volume 4603 of Lecture
    Notes in Computer Science, pages 215–230. Springer, 2007.
11. Carsten Lutz. The complexity of conjunctive query answering in expressive de-
    scription logics. In Alessandro Armando, Peter Baumgartner, and Gilles Dowek,
    editors, Proc. of the Int. Joint Conf. on Automated Reasoning (IJCAR 2008), Lec-
    ture Notes in Artificial Intelligence, pages 179–193. Springer-Verlag, 2008.
12. Martin Otto. Highly acyclic groups, hypergraph covers, and the guarded fragment.
    J. ACM, 59(1):5:1–5:40, 2012.
13. Rafael Peñaloza and Nico Potyka. Towards statistical reasoning in description
    logics over finite domains. In Serafin Moral and Olivier Pivert, editors, Proc. of
    the 11th Int. Conf. on Scalable Uncertainty Management (SUM 2017), volume
    10564 of Lecture Notes in Computer Science. Springer-Verlag, 2017.
14. Ian Pratt-Hartmann. Data-complexity of the two-variable fragment with counting
    quantifiers. Inf. Comput., 207(8):867–888, 2009.
15. Sebastian Rudolph. Foundations of description logics. In Axel Polleres, Claudia
    d’Amato, Marcelo Arenas, Siegfried Handschuh, Paula Kroner, Sascha Ossowski,
    and Peter F. Patel-Schneider, editors, Reasoning Web. Semantic Technologies for
    the Web of Data – 7th International Summer School 2011, volume 6848 of LNCS,
    pages 76–136. Springer, 2011.
16. Sergio Tessaris. Querying expressive dls. In Proc. of the 2001 Description Logic
    Workshop (DL 2001), volume 49 of CEUR Workshop Proceedings. CEUR-WS.org,
    2001.
17. Stephan Tobies. A PSPACE algorithm for graded modal logic. In Har-
    ald Ganzinger, editor, Proc. of the 16th Int. Conf. on Automated Deduction
    (CADE’99), volume 1632 of Lecture Notes in Artificial Intelligence, pages 52–66.
    Springer-Verlag, 1999.
18. Stephan Tobies. The complexity of reasoning with cardinality restrictions and
    nominals in expressive description logics. J. of Artificial Intelligence Research,
    12:199–217, 2000.
19. Stephan Tobies. Complexity Results and Practical Algorithms for Logics in Knowl-
    edge Representation. PhD thesis, LuFG Theoretical Computer Science, RWTH-
    Aachen, Germany, 2001.