<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Satisfiability Checking and Conjunctive Query Answering in Description Logics with Global and Local Cardinality Constraints</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Franz Baader</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bartosz Bednarczyk</string-name>
          <email>bartosz.bednarczyk@cs.uni.wroc.pl</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sebastian Rudolph</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Faculty of Computer Science</institution>
          ,
          <addr-line>TU Dresden</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Institute of Computer Science, University of Wrocław</institution>
          ,
          <country country="PL">Poland</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>We consider an expressive description logic (DL) in which the global and local cardinality constraints introduced in previous papers can be mixed. On the one hand, we show that this does not increase the complexity of satisfiability checking and other standard inference problems. On the other hand, conjunctive query entailment in this DL turns out to be undecidable. We prove that decidability of querying can be regained if global and local constraints are not mixed and the global constraints are appropriately restricted.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>Introduction
DLs that can express both local cardinality constraints (i.e., constraints
concerning the role successors of specific individuals) and global cardinality constraints
(i.e., constraints on the overall cardinality of concepts) can, for instance, be used
to check the correctness of statistical statements. For example, if a German car
company claims that they have produced more than N cars in a certain year,
and P % of the tires used for their cars were produced by Betteryear, this may
be contradictory to a statement of Betteryear that they have sold less than M
tires in Germany. Such statistical information may, of course, also influence the
answers to queries. If we know that the car company VMW uses only tires from
Betteryear or Badmonth, but the statistical information allows us to conclude
that another car company has actually bought all the tires sold by Betteryear,
then we know that the cars sold by VMW all have tires produced by Badmonth.
This motivates investigating DLs with expressive cardinality constraints, and
to consider not just standard inferences such as satisfiability checking for these
DLs, but also query answering.</p>
      <p>
        In two previous publications we have, on the one hand, extended the DL
ALCQ by more expressive number restrictions using cardinality and set
constraints expressed in the quantifier-free fragment of Boolean Algebra with
Presburger Arithmetic (QFBAPA) [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. In the resulting DL ALCSCC, which was
introduced and investigated in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], cardinality and set constraints are applied
locally, i.e., they refer to the role successors of an individual under
consideration. It was shown in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] that pure concept satisfiability in ALCSCC is a
PSpace-complete problem, and concept satisfiability w.r.t. a general TBox is
ExpTime-complete. This shows that the more expressive number restrictions
do not increase the complexity of reasoning since reasoning in ALCQ has the
same complexity [
        <xref ref-type="bibr" rid="ref17 ref19">17,19</xref>
        ].
      </p>
      <p>
        On the other hand, we have extended the terminological formalism of the
well-known description logic ALC from general TBoxes to more general
cardinality constraints expressed in QFBAPA [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], which we called extended cardinality
constraints (ECBoxes). These constraints are global since they refer to all
individuals in the interpretation domain. It was shown in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] that reasoning w.r.t.
ECBoxes is in NExpTime even if the numbers occurring in the constraints are
encoded in binary. A NExpTime lower bound follows from a result of Tobies [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]
for a restricted form of cardinality constraints, where the cardinality of a concept
can only be compared with a fixed number. This complexity can be lowered to
ExpTime if a restricted form of cardinality constraints (RCBoxes) is used. Such
RCBoxes are still powerful enough to express statistical knowledge bases [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
      </p>
      <p>
        An obvious way to generalize these two approaches is to combine the two
extensions, i.e., to consider extended cardinality constraints, but now on ALCSCC
concepts rather than just ALC concepts. This combination was investigated
in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], where a NExpTime upper bound was established for reasoning in ALCSCC
w.r.t. ECBoxes. It is also shown in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] that reasoning w.r.t. RCBoxes stays in
ExpTime also for ALCSCC.
      </p>
      <p>Here we go one step further by allowing for a tighter integration of global
and local constraints. The resulting logic, which we call ALCSCC++, allows, for
example, to relate the number of role successors of a given individual with the
overall number of elements of a certain concept. We show that, from a worst-case
complexity point of view, this extended expressivity comes for free if we consider
classical reasoning problems. Concept satisfiability in ALCSCC++ has the same
complexity as in ALC and ALCSCC with global cardinality constraints: it is
NExpTime-complete. Yet, for effective conjunctive query answering this logic
turns out to be too expressive. We show that conjunctive query entailment w.r.t.
ALCSCC++ knowledge bases is, in fact, undecidable. In contrast, we can show
that conjunctive query entailment w.r.t. ALCSCC RCBoxes is decidable.</p>
      <p>
        We assume the reader to be sufficiently familiar with all the standard notions
of description logics [
        <xref ref-type="bibr" rid="ref15 ref3 ref5">3,5,15</xref>
        ].
2
      </p>
    </sec>
    <sec id="sec-2">
      <title>The logic ALCS CC</title>
      <p>
        ++
As in [
        <xref ref-type="bibr" rid="ref1 ref4">1,4</xref>
        ], we use the quantifier-free fragment of Boolean Algebra with
Presburger Arithmetic (QFBAPA) to express our constraints. In this logic, one can
build set terms by applying Boolean operations (intersection \, union [, and
complement c) to set variables as well as the constants ; and U . Set terms s; t
can then be used to state set constraints, which are equality and inclusion
constraints of the form s = t; s t, where s; t are set terms. Presburger Arithmetic
(PA) expressions are built from integer constants and set cardinalities jsj using
addition as well as multiplication with an integer constant. They can be used
to form cardinality constraints of the form k = `; k &lt; `; N dvd `, where k; ` are
PA expressions, N is an integer constant, and dvd stands for divisibility. A
QFBAPA formula is a Boolean combination of set and cardinality constraints. A
solution of a QFBAPA formula assigns a finite set (U ) to U and subsets
of (U ) to set variables such that is satisfied by this assignment (see [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] for
more details). A QFBAPA formula is satisfiable if it has a solution. In [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] it
is shown that the satisfiability problem for QFBAPA formulae is NP-complete.
      </p>
      <p>
        We are now ready to define our new logic, which we call ALCSCC++ to
indicate that it is an extension of the logic ALCSCC introduced in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. When defining
the semantics of ALCSCC++, we restrict the attention to finite interpretations
to ensure that cardinalities of concept descriptions are always well-defined
nonnegative integers.
      </p>
      <p>Definition 1. Given disjoint finite sets NC and NR of concept names and
role names, respectively, ALCSCC++ concept descriptions (short: concepts) are
Boolean combinations of concept names and constraint expressions, where a
constraint expression is of the form sat (c) for a set constraint or a cardinality
constraint c that uses role names and ALCSCC++ concept descriptions in place of
set variables. As usual, we use &gt; ( top) and ? ( bottom) as abbreviations for
A t :A and A u :A, respectively.</p>
      <p>A finite interpretation of NC and NR consists of a finite, non-empty set I
and a mapping I that maps every concept name A 2 NC to a subset AI of I
and every role name r 2 NR to a binary relation rI over I . For a given element
d 2 I we define rI (d) := fe 2 I j (d; e) 2 rI g: The interpretation function I
is inductively extended to ALCSCC++ concept descriptions by interpreting the
Boolean operators as usual, and the constraint expressions as follows:
sat (c)I := fd 2</p>
      <sec id="sec-2-1">
        <title>I j the mapping Id satisfies cg, where Id maps</title>
        <p>– ; to ;Id := ; and U to U Id := I ,
– the ALCSCC++ concept descriptions C occurring in c to CId := CI ,
– and the role names r occurring in c to rId := rI (d).</p>
        <p>The ALCSCC++ concept description C is satisfiable if there is a finite
interpretation I such that CI 6= ;.</p>
        <p>
          Note that the interpretation of concepts as set variables in ALCSCC++ is
global in the sense that it does not depend on d, i.e., CId = CIe for all d; e 2 I .
In contrast, the interpretation of role names r as set variables is local since only
the r-successors of d are considered by Id . In ALCSCC, also the interpretation of
concepts as set variables is local since in the semantics of ALCSCC the mapping
Id considers only the elements of CI that are role successors of d for some role
name in NR. This can clearly be simulated in ALCSCC++ by using C \(Sr2NR r)
instead of C when formulating the constraint. Thus, ALCSCC concepts can be
expressed by ALCSCC++ concepts. In addition, extended cardinality constraints
(ECBoxes), as introduced in [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ], are expressible within ALCSCC++ concept
descriptions, as are nominals, the universal role, and role negation.
Proposition 1. ALCSCC++ concepts can polynomially express nominals, role
conjunctions, and ALCSCC ECBoxes, and thus also ABoxes, ALC ECBoxes
and ALCSCC TBoxes. In addition, they have the same expressivity as concepts
of ALCSCC extended with the universal role or with role negation, whereas both
of these features are not expressible in plain ALCSCC.
        </p>
        <p>Proof. ECBoxes correspond to Boolean combinations of concept descriptions of
the form sat (c) where c contains only concept descriptions as set variables. Since
such concept descriptions are satisfied either by no element of I or by all of
them, their effect is to enforce the constraint on the whole interpretation domain
if they are conjoined to a concept description.</p>
        <p>Regarding role negation, for given role names r; r, the constraint sat (&gt;
sat (r\r ;)) enforces that, for every individual, the sets of its r and r successors
are disjoint. In addition, the constraint sat (&gt; sat (jrj + jrj = jU j)) says that
elements of the domain that are not r successors of a given individual must
be r successors. Thus, we can express that the role r is interpreteted as the
complement of r, i.e. rI = I I n rI for every finite interpretation I.</p>
        <p>The other statements in the proposition are also easy to see.
tu
3</p>
        <p>
          Satisfiability of ALCSCC++ concept descriptions
In the following we consider an ALCSCC++ concept description E and show
how to test E for satisfiability by reducing this problem to the problem of
testing satisfiability of QFBAPA formulae. Since the reduction is exponential and
satisfiability in QFBAPA is in NP, this yields a NExpTime upper bound for
satisfiability of ALCSCC++ concept descriptions. This bound is optimal since
consistency of extended cardinality constraints in ALC, as introduced in [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ], is
already NExpTime hard, and can be expressed as an ALCSCC++ satisfiability
problem by Proposition 1.
        </p>
        <p>
          Our NExpTime algorithm combines ideas from the satisfiability algorithm
for ALCSCC concept descriptions [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] and the consistency procedure for ALC
ECBoxes [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]. In particular, we use the notion of a type, as introduced in [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ].
This notion is also similar to the Venn regions employed in [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]. Given a set of
concept descriptions M, the type of an individual in an interpretation consists
of the elements of M to which the individual belongs. Such a type t can also
be seen as a concept description Ct, which is the conjunction of all the elements
of t. We assume in the following that M consists of all subdescriptions of the
concept description E as well as the negations of these subdescriptions.
Definition 2. A subset t of M is a type for E if it satisfies the following three
properties: (1) for every concept description :C 2 M, either C or :C belongs
to t; (2) for every concept description C u D 2 M, we have that C u D 2 t iff
C 2 t and D 2 t; (3) for every concept description C t D 2 M, we have that
C t D 2 t iff C 2 t or D 2 t.
        </p>
        <p>We denote the set of all types for E with types(E). Given an interpretation
I and an individual d 2 I , the type of d is the set tI (d) := fC 2 M j d 2 CI g:</p>
        <p>Due to Condition (1) in the definition of types, concept descriptions Ct; Ct0
induced by different types t 6= t0 are disjoint, and all concept descriptions in
M can be obtained as the disjoint union of the concept descriptions induced
by the types containing them, i.e., we have CI = St type with C2t CtI for all C 2
M and finite interpretations I. In particular, the following holds for all finite
interpretations I:
jCI j =</p>
        <p>X
t type with C2t
jCtI j
and
jCtI j = j \ CI j;</p>
        <p>C2t
where the latter identity is an immediate consequence of the definition of Ct as
the conjunction of all the elements of t.</p>
        <p>Given a type t, the constraints occurring in the top-level Boolean structure
of t induce a QFBAPA formula t, in which the concepts C and roles r occurring
in these constraints are replaced by set variables XC and Xrt, respectively. Note
that set variables corresponding to concepts are independent of the type t, i.e.,
they are shared by all types, whereas the set variables corresponding to roles are
different for different types. This corresponds to the fact that roles are evaluated
locally, but concepts are evaluated globally in the semantics of ALCSCC++. In
order to ensure that the Boolean structure of concepts is respected by the set
variables, we introduce the formula :=</p>
        <p>^
CuD2M</p>
        <p>XCuD = XC \ XD ^</p>
        <p>XCtD = XC [ XD ^
^
CtD2M</p>
        <p>^
:C2M</p>
        <p>X:C = (XC )c:
Overall, we translate the ALCSCC++ concept E into the QFBAPA formula
E := (jXE j
1) ^
^</p>
        <p>^ (j \ XC j = 0) _ t:
t2types(E) C2t
Intuitively, to satisfy E, we need to have at least one element in it, which explains
the first conjunct. The third conjunct together with ensures that, for any type
that is realized (i.e., has elements), the constraints of this type are satisfied. The
next lemma states that there is a 1–1-relationship between solvability of E and
satisfiability of E.</p>
        <p>Lemma 1. The QFBAPA formula E is of size at most exponential in the size
of E, and it is satisfiable iff the ALCSCC++ concept description E is satisfiable.</p>
        <p>
          Since satisfiability of QFBAPA formulae can be decided within NP even for
binary coding of numbers [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ], this lemma shows that satisfiability of ALCSCC++
concept descriptions can be decided within NExpTime. Together with the known
NExpTime lower bound for consistency of ALC ECBoxes in [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ], this yields:
Theorem 1. Satisfiability of ALCSCC++ concept descriptions with numbers
encoded in binary is NExpTime-complete.
        </p>
        <p>Thanks to Proposition 1, this carries over to satisfiability of ALCSCC++
knowledge bases which may feature an ABox, a TBox and an ECBox.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Query entailment in ALCSCC</title>
      <p>++
The final result of this section is the undecidability of conjunctive query
entailment for ALCSCC++. To this end, we first briefly recap the notion of (Boolean)
conjunctive queries and define query entailment.</p>
      <p>In queries, we use variables from a countably infinite set V . A Boolean
conjunctive query (CQ) q is a finite set of atoms of the form r(x; y) or C(z), where
r is a role, C is concept, and x; y; z 2 V . A CQ q is satisfied by I (written:
I j= q) if there is a variable assignment : V ! I (called match) such that
( (x); (y)) 2 rI for every r(x; y) 2 q and (z) 2 CI for every C(z) 2 q. A CQ
q is (finitely) entailed from a knowledge base K (written: K j= q) if every (finite)
model I of K satisfies q.</p>
      <p>
        We actually show undecidability of CQ entailment for a much weaker logic,
thereby providing a very restricted fragment of constant-free and equality-free
two-variable first-order logic for which finite CQ entailment is already
undecidable, significantly strengthening and solidifying earlier results along those
lines [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. Our proof makes use of deterministic Turing machines (DTMs). For
our purposes, it is sufficient to consider only computations starting with an
empty tape. For space reasons, we assume the reader to be familiar with
standard notions and constructions concerning DTMs. We call a DTM looping if its
run starting contains repeating configurations,i.e., there are two different (and
hence – due to determinism – infinitely many) points in time, where the
machine’s tape content, head position, and state are the same. It is easy to see that
the problem of determining if a given TM is looping is undecidable.
      </p>
      <p>We show our undecidability result for the DL ALCcov, a slight extension
of ALC by role cover axioms of the form cov(r; s) for role names r and s. An
interpretation I satisfies cov(r; s) if rI [ sI = I I . Role cover axioms can
be expressed in ALCSCC++ via sat &gt; sat(jr [ sj = jU j) , hence ALCcov is
subsumed by ALCSCC++.</p>
      <p>In what follows, assume that a DTM M is given. We now describe an ALCcov
TBox T and conjunctive query q such that T j= q exactly if M is not looping. We
provide q and T together with the underlying intuitions of our construction. The
goal of our construction is that a countermodel (i.e., an interpretation satisfying
T but not q) corresponds to a looping configuration sequence of M. Thereby,
the domain elements represent tape cells at certain computation steps of M.
The role h connects consecutive tape cells of the same configuration, whereas
the role v connects a configuration’s tape cell with the same tape cell of the
successor configuration.</p>
      <p>We start by providing the query. Intuitively, the query is meant to catch the
unwanted situation that two corresponding tape cells of consecutive
configurations are v-connected, but the cells to their right aren’t.</p>
      <p>q = 9x; y; x0; y0:v(x; y) ^ h(x; x0) ^ h(y; y0) ^ v(x0; y0)
(1)
We proceed by giving the axioms of T . The following covering axiom ensures
that, whenever two elements are not v-connected, they must be v-connected.
This is needed to enable the above query to catch the described problem.
The remaining TBox axioms can be found in Table 1. Axiom 3 ensures (by
means of an auxiliary role aux which serves no further purpose) that there is
a first tape cell of the first (initial) configuration where the head of the TM
is positioned in the initial state. Axiom 4 enforces that for every cell of every
configuration there is both a tape cell to its right and a corresponding tape
cell in the successor configuration. Axiom 5 makes sure that, for every cell that
is the first on its tape, the corresponding successor configuration’s tape cell is
also the first. Axioms 6 propagates the information that a cell belongs to the
initial configuration along the tape, and fills the tape with blanks. Axioms 7–9
(instantiated for every state q) make sure that in every configuration there can
only be one cell where the head is positioned. Every cell can only carry one
symbol and the head can be in only one state, as ensured by Axioms 10 (for
distinct symbols ; 0 and distinct states q; q0). Thanks to Axiom 11, symbols on
head-free cells carry over to the next configuration. As specified by the DTM’s
transition function, the head reads a symbol , writes a symbol 0, changes its
state from q to q0 and moves right (Axiom 12) or left (Axiom 13) or stays in its
place whenever it is supposed to move left but is already at the leftmost tape cell
(Axiom 14). This finishes the description of the TBox T , allowing us to establish
the claimed property and consequenty the undecidability result.
Proposition 2. M is looping iff there is a finite model I of T with I 6j= Q.
(3)
(4)
(5)
(6)
(7)
(8)
(9)
(10)
(11)
(12)
(13)
(14)
(2)
Theorem 2. Finite CQ entailment over ALCcov TBoxes is undecidable.
Proof. According to Proposition 2, the TM looping problem can be reduced to
the problem if for a given ALCcov TBox T and conjunctive query q, there is a
finite interpretation I with I j= T with I 6j= q. Note that the latter is the case
exactly if T does not finitely entail q.</p>
      <p>Finally, taking into account that ALCSCC++subsumes ALCcov and only
allows for finite models, we obtain the wanted result.</p>
      <p>Corollary 1. Conjunctive query entailment for ALCSCC++is undecidable.
5</p>
      <p>Query entailment from</p>
    </sec>
    <sec id="sec-4">
      <title>ALCSCC RCBoxes</title>
      <p>As the last result of the paper, we show that decidability of CQ entailment can
be regained by moving to a less expressive logic, namely ALCSCC RCBoxes, i.e.,
finite sets of restricted cardinality constraints of the form</p>
      <p>N1jC1j + : : : + NkjCkj</p>
      <p>Nk+1jCk+1j + : : : + Nk+`jCk+`j;
where Ci are ALCSCC concept descriptions and Ni are integer constants for
1 i k + `, with the obvious semantics. RCBoxes can express general concept
inclusions (GCIs) C v D via jC u :Dj j?j, so we use GCIs when appropriate.</p>
      <p>We first sketch our approach. Let R be an input RCBox and let q be an
input CQ. Assume that q is not entailed by R. Hence there is a counter-model I,
satisfying R but not q. Our goal is to develop an algorithm to produce another
RCBox R0 consisting of R and some additional knowledge, in a way that R0 is
satisfiable if and only if there is a counter-model of R for q. This can be done
by a careful analysis of query matches. Note that if a query q is entailed by R,
every model I falls into one of the following two categories: (i) either there is an
acyclic (also called tree-shaped ) query match or (ii) every query match contains
some cyclic substructure of the model.</p>
      <p>
        To deal with these two cases we proceed as follows. For the first case, in
order to rule out such models having tree-shaped query matches, we enrich our
RCBox R with additional knowledge forbidding these matches. This can be done
by using the so-called rolling-up technique of transforming query matches into
concepts, as, e.g., in [
        <xref ref-type="bibr" rid="ref11 ref7 ref8">7,11,8</xref>
        ]. For the second case, we show that it actually cannot
happen: we argue that if there is a model I with only cyclic query matches, one
can employ an appropriate model transformation, called pumping, to turn I into
a proper counter-model. This technique is sometimes called big-cycle method [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]
or Rosati covers and was successfully used in the context of finite query
entailment, e.g, in [
        <xref ref-type="bibr" rid="ref14 ref6 ref9">6,14,9</xref>
        ]. Concluding, it suffices to enrich an ALCSCC RCBox with
additional statements ruling out all models with acyclic query matches, to obtain
an RCBox whose satisfiability coincides with the existence of a counter-model.
Checking (un)satisfiability of ALCSCC RCBoxes is decidable [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], so we can
conclude following theorem:
Theorem 3. CQ entailment from ALCSCC RCBoxes is decidable.
      </p>
      <p>
        The rest of this section is devoted to a sketch of the proof of Theorem 3 as
outlined above. In Section 5.1 we show how to forbid tree-shaped query matches,
in Section 5.2 we handle the case of cyclic matches and construct counter-models.
In this section, we provide a method for forbidding tree-shaped query matches.
We strongly rely on previous results on query entailment for ALCHQ knowledge
bases [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. For the sake of simplicity we assume that all queries under
consideration are connected (in the graph-theoretic sense).
      </p>
      <p>We first recall some standard definitions. Let q be a conjunctive query and
let Vq be the set of variables appearing in q. We can see a query q as a directed
graph Gq = (Vq; Eq), where the nodes are variables from q and any two nodes x; y
are connected if some atom r(x; y) appears in q. A query is tree-shaped, if the
underlying graph does not contain any (undirected) cycles.</p>
      <p>We introduce the notion of treeification, which for a given query q essentially
describes the set of all its ways to match in a tree-shaped way.</p>
      <p>Definition 3. Let q be a CQ. We say that a tree-shaped query q0 is a treeification
of q, if q0 can be obtained from q by (possibly multiple times) selecting some atoms
r(x; z) and s(y; z) and replacing all variables x in the query by y. By trees(q) we
denote the set of all treeifications of q.</p>
      <p>Note that trees(q) is finite.</p>
      <p>
        Next we describe the so-called rolling-up technique, which transforms a
treeshaped query match into a single concept [
        <xref ref-type="bibr" rid="ref11 ref16">16,11</xref>
        ]. Let us fix a conjunctive query q
and some treeification q0 of it. For each variable x 2 Vq we construct a
concept Cq0;x, with the supposed meaning that d 2 CqI0;x if variable x from q0 can
be mapped to d in a query match represented by q0. We define these concepts
as follows: Picking one arbitrary variable xqr0 2 Vq0 , we let (Vq0 ; ) be the tree
obtained from Gq0 by orienting all edges away from xqr0 . Now, we define Cq0;x for
every x 2 Vq0 in a bottom-up manner as follows: Cq0;x equals dC(x)2q0 C if x is
a leaf (i.e. -minimal), otherwise:
      </p>
      <p>l C u
C(x)2q0</p>
      <p>l
(x;yy)2xEq0
9</p>
      <p>\ s:Cq0;y
s(x;y)2q0
u l 9 \ s :Cq0;y
(y;xy)2xEq0 s(y;x)2q0
Concepts of the form 9Ts(y;x)2q0 s :Cq0;y in the above definition are clearly not
in ALC (due to the presence of inverse roles), but this can be remedied as
follows: We replace any 9r :Cq0;y with a (conjunction of) inverted role name(s) r
by a newly introduced concept C9r :Cq0;y for which we also specify Cq0;y v
8r:Cr :Cq0;y . Like this, any of the concepts Cq0;y is free of inverses.</p>
      <p>For a given CQ q, we enumerate all of its treeifications and roll them up into
a concept. Let RqMatch be an RCBox defining that there exists a tree-shaped
query match in a model:</p>
      <p>G
q02trees(q)</p>
      <sec id="sec-4-1">
        <title>Cq0;xqr0 v Matchq</title>
        <p>(15)</p>
        <p>The two coming lemmas give the precise meaning to the defined concepts.
Lemma 2. Let R be an ALCSCC RCBox and let q be a conjunctive query.
Let RqMatch be as defined above. Assume that R [ RqMatch has a model I such
that MatchqI is empty. Then I does not have any tree-shaped query matches.
Lemma 3. Let R be an ALCSCC RCBox and let q be a conjunctive query
and RqMatch as defined above. If there is a model I of R without any tree-shaped
query matches, then R = R [ RqMatch [ f&gt; v :Matchqg is satisfiable.</p>
        <p>
          Satisfiability checking of R can be performed with an algorithm from [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ].
Together with Lemma 2 and Lemma 3, we establish:
Theorem 4. It is decidable whether for a given CQ q and a given ALCSCC
RCBox R, there is a model of R without any tree-shaped query matches of q.
5.2
        </p>
        <p>Pumping models to enforce high girth
Now we take a closer look at the previously announced pumping method for
eliminating non-tree-shaped query matches.</p>
        <p>Definition 4. Let I be an interpretation, I = fd1; d2; : : : ; dng be the set of
domain elements and E the set of “edges”, i.e., e = (d; d0) occur in E if there
is a role r, s.t. (d; d0) 2 rI . Additionally, let F be the set of all functions f of
type f : E ! f0; 1g. We define a pumping of I, denoted with pump(I) = I0,
in the following way: (i) we set I0 = I F , (ii) for a concept name A, an
role name r andI aanpdairan((ydf;ufn)c;t(ido0n;ff0))wweesesett((d(;df;)f 2); (AdI0;0fi0ff))d22rIA0 Iiff(i(idi); df0o)r2anrIy
element d 2
and functions f; f 0 are equal on all arguments except for the argument e = (d; d0)
for which f 0(e) = 1 f (e).</p>
        <p>The girth of I is the length of a shortest (undirected) proper cycle contained
in I (proper means that no element from E is used more than once). It is not
difficult to see that the girth of pump(I) is at least twice the girth of I: due to
the fact that we need to flip the e value in f every time we cross an edge e.
Lemma 4. Let I be an interpretation with girth k. Then the girth of pump(I)
is at least 2k.</p>
        <p>Correctness of the pumping method is guaranteed by the following lemma.
Its proof relies on two observations (i) degree of each node and “types” of its
successors are preserved during pumping, and (ii) all global constraints from the
RCBox are still satisfied since all cardinalities from inequalities were multiplied
by the same number (i.e. jF j). Formally, the proof goes via an induction over
the depth of ALCSCC concepts.</p>
        <p>Lemma 5. Let R be an RCBox with a model I. Then pump(I) is a model of R.</p>
        <p>Now we explain how to use the pumping method from the Definition 4 to
erase non-tree query matches and obtain a counter-model.</p>
        <p>Lemma 6. Let q be a CQ and let R be an ALCSCC RCBox. Assume R has a
model I that does not have any acyclic query matches of q. Then there exists a
counter-model I0 for q and R.</p>
        <p>Proof. We define I0 as the jqj-fold application of pump to I. Since each
iteration of pumping doubles the girth of the input structure, the girth of I0 is
strictly greater than jqj. Moreover, during the pumping process, we haven’t
introduced any new query matches. Hence, since any cyclic match of q would require
girth jqj, the query q cannot match into I0 anymore and I0 is a counter-model.</p>
        <p>By combining (i) a method of pumping a structure from this Section, (ii)
a method of enriching a knowledge-base with a knowledge to forbid all acyclic
query matches from the previous section and (iii) an algorithm for testing
(un)satisfiability for ALCSCC concepts w.r.t RCBoxes, we conclude Theorem 3.</p>
        <p>
          A rough complexity analysis give us a 2ExpTime upper bound. We believe
that this bound can be improved to ExpTime by adapting techniques from [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ].
6
        </p>
        <p>Conclusion
In our work, we have significantly pushed the boundaries of quantitative
extensions to description logics. We showed that ALC can not only be extended by
both global (extension-based) and local (neighbourhood-based) arithmetic
constraints but even by hybrid constraints mixing the two, yielding a significant
increase in expressivity without negative impact on the complexity of
satisfiability testing. On the downside, we had to find out that this extension leads to
undecidability of conjunctive query answering. However, we were able to regain
decidability and even a favourable complexity under appropriate restrictions.</p>
        <p>Without any doubt, the ability to deal with factual data, i.e. ABoxes, is
of utmost importance in the context of statistical considerations and
querying. Therefore, our next step will be to extend our investigations to full-fledged
ALCSCC knowledge bases including ABoxes. In fact, we have already established
that ABoxes can be added to ALCSCC RCBoxes without impacting the
ExpTime complexity of satisfiability checking and are confident that standard query
partitioning and a slight modification of model pumping will allow us to show
ExpTime completeness of CQ answering.</p>
        <p>Acknowledgements
Bartosz Bednarczyk was supported by the Polish Ministry of Science and Higher
Education program “Diamentowy Grant” no. DI2017 006447. Sebastian Rudolph
was supported by the European Research Council (ERC) through the
Consolidator Grant 771779 (DeciGUT). Franz Baader was partially supported by the
German Research Foundation (DFG) within the Research Unit 1513 Hybris.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>Franz</given-names>
            <surname>Baader</surname>
          </string-name>
          .
          <article-title>A new description logic with set constraints and cardinality constraints on role successors</article-title>
          .
          <source>In Clare Dixon and Marcelo Finger</source>
          , editors,
          <source>Proceedings of the 11th International Symposium on Frontiers of Combining Systems (FroCoS'17)</source>
          , volume
          <volume>10483</volume>
          of Lecture Notes in Computer Science, pages
          <fpage>43</fpage>
          -
          <lpage>59</lpage>
          , Brasília, Brazil,
          <year>2017</year>
          . Springer-Verlag.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>Franz</given-names>
            <surname>Baader</surname>
          </string-name>
          .
          <article-title>Expressive cardinality constraints on ALCSCC concepts</article-title>
          .
          <source>In Proceedings of the 34th ACM/SIGAPP Symposium On Applied Computing (SAC'19)</source>
          . ACM,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>Franz</given-names>
            <surname>Baader</surname>
          </string-name>
          , Diego Calvanese,
          <string-name>
            <surname>Deborah</surname>
            <given-names>McGuinness</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Daniele</given-names>
            <surname>Nardi</surname>
          </string-name>
          , and
          <string-name>
            <surname>Peter</surname>
          </string-name>
          Patel-Schneider, editors.
          <source>The Description Logic Handbook: Theory</source>
          , Implementation, and
          <string-name>
            <surname>Applications</surname>
          </string-name>
          . Cambridge University Press, second edition,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>Franz</given-names>
            <surname>Baader</surname>
          </string-name>
          and
          <string-name>
            <given-names>Andreas</given-names>
            <surname>Ecke</surname>
          </string-name>
          .
          <article-title>Extending the description logic alc with more expressive cardinality constraints on concepts</article-title>
          .
          <source>In GCAI 2017. 3rd Global Conference on Artificial Intelligence</source>
          , volume
          <volume>50</volume>
          of EPiC Series in Computing, pages
          <fpage>6</fpage>
          -
          <lpage>19</lpage>
          . EasyChair,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>Franz</given-names>
            <surname>Baader</surname>
          </string-name>
          , Ian Horrocks, Carsten Lutz, and
          <string-name>
            <given-names>Uli</given-names>
            <surname>Sattler</surname>
          </string-name>
          .
          <article-title>An Introduction to Description Logic</article-title>
          . Cambridge University Press,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>Vince</given-names>
            <surname>Bárány</surname>
          </string-name>
          , Georg Gottlob, and
          <string-name>
            <given-names>Martin</given-names>
            <surname>Otto</surname>
          </string-name>
          .
          <article-title>Querying the guarded fragment</article-title>
          .
          <source>Logical Methods in Computer Science</source>
          ,
          <volume>10</volume>
          (
          <issue>2</issue>
          ),
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7. Diego Calvanese, Giuseppe De Giacomo, and
          <string-name>
            <given-names>Maurizio</given-names>
            <surname>Lenzerini</surname>
          </string-name>
          .
          <article-title>On the decidability of query containment under constraints</article-title>
          .
          <source>In Proc. of the 17th ACM SIGACT SIGMOD SIGART Symp. on Principles of Database Systems (PODS'98)</source>
          , pages
          <fpage>149</fpage>
          -
          <lpage>158</lpage>
          ,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>Birte</given-names>
            <surname>Glimm</surname>
          </string-name>
          , Carsten Lutz, Ian Horrocks, and
          <string-name>
            <given-names>Ulrike</given-names>
            <surname>Sattler</surname>
          </string-name>
          .
          <article-title>Conjunctive query answering for the description logic SHIQ</article-title>
          .
          <source>J. of Artificial Intelligence Research</source>
          ,
          <volume>31</volume>
          :
          <fpage>157</fpage>
          -
          <lpage>204</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>Yazmin</given-names>
            <surname>Angélica</surname>
          </string-name>
          Ibáñez-García,
          <string-name>
            <given-names>Carsten</given-names>
            <surname>Lutz</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Thomas</given-names>
            <surname>Schneider</surname>
          </string-name>
          .
          <article-title>Finite model reasoning in horn description logics</article-title>
          .
          <source>In Proc. of the 14th Int. Conf. on Principles of Knowledge Representation and Reasoning (KR</source>
          <year>2014</year>
          ), pages
          <fpage>288</fpage>
          -
          <lpage>297</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>Viktor</given-names>
            <surname>Kuncak</surname>
          </string-name>
          and
          <string-name>
            <given-names>Martin C.</given-names>
            <surname>Rinard</surname>
          </string-name>
          .
          <article-title>Towards efficient satisfiability checking for Boolean algebra with Presburger arithmetic</article-title>
          . In Frank Pfenning, editor,
          <source>Proc. of the 21st Int. Conf. on Automated Deduction (CADE-07)</source>
          , volume
          <volume>4603</volume>
          of Lecture Notes in Computer Science, pages
          <fpage>215</fpage>
          -
          <lpage>230</lpage>
          . Springer,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>Carsten</given-names>
            <surname>Lutz</surname>
          </string-name>
          .
          <article-title>The complexity of conjunctive query answering in expressive description logics</article-title>
          . In Alessandro Armando,
          <string-name>
            <given-names>Peter</given-names>
            <surname>Baumgartner</surname>
          </string-name>
          , and Gilles Dowek, editors,
          <source>Proc. of the Int. Joint Conf. on Automated Reasoning (IJCAR 2008), Lecture Notes in Artificial Intelligence</source>
          , pages
          <fpage>179</fpage>
          -
          <lpage>193</lpage>
          . Springer-Verlag,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>Martin</given-names>
            <surname>Otto</surname>
          </string-name>
          .
          <article-title>Highly acyclic groups, hypergraph covers, and the guarded fragment</article-title>
          .
          <source>J. ACM</source>
          ,
          <volume>59</volume>
          (
          <issue>1</issue>
          ):5:
          <fpage>1</fpage>
          -
          <lpage>5</lpage>
          :
          <fpage>40</fpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <given-names>Rafael</given-names>
            <surname>Peñaloza</surname>
          </string-name>
          and
          <string-name>
            <given-names>Nico</given-names>
            <surname>Potyka</surname>
          </string-name>
          .
          <article-title>Towards statistical reasoning in description logics over finite domains</article-title>
          .
          <source>In Serafin Moral and Olivier Pivert</source>
          , editors,
          <source>Proc. of the 11th Int. Conf. on Scalable Uncertainty Management (SUM</source>
          <year>2017</year>
          ), volume
          <volume>10564</volume>
          of Lecture Notes in Computer Science. Springer-Verlag,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Ian</surname>
          </string-name>
          Pratt-Hartmann.
          <article-title>Data-complexity of the two-variable fragment with counting quantifiers</article-title>
          .
          <source>Inf. Comput.</source>
          ,
          <volume>207</volume>
          (
          <issue>8</issue>
          ):
          <fpage>867</fpage>
          -
          <lpage>888</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <given-names>Sebastian</given-names>
            <surname>Rudolph</surname>
          </string-name>
          .
          <article-title>Foundations of description logics</article-title>
          .
          <source>In Axel Polleres, Claudia d'Amato</source>
          , Marcelo Arenas, Siegfried Handschuh, Paula Kroner, Sascha Ossowski, and
          <string-name>
            <surname>Peter F.</surname>
          </string-name>
          Patel-Schneider, editors,
          <source>Reasoning Web. Semantic Technologies for the Web of Data - 7th International Summer School</source>
          <year>2011</year>
          , volume
          <volume>6848</volume>
          <source>of LNCS</source>
          , pages
          <fpage>76</fpage>
          -
          <lpage>136</lpage>
          . Springer,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <given-names>Sergio</given-names>
            <surname>Tessaris</surname>
          </string-name>
          .
          <article-title>Querying expressive dls</article-title>
          .
          <source>In Proc. of the 2001 Description Logic Workshop (DL</source>
          <year>2001</year>
          ), volume
          <volume>49</volume>
          <source>of CEUR Workshop Proceedings. CEUR-WS.org</source>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <given-names>Stephan</given-names>
            <surname>Tobies</surname>
          </string-name>
          .
          <article-title>A PSPACE algorithm for graded modal logic</article-title>
          . In Harald Ganzinger, editor,
          <source>Proc. of the 16th Int. Conf. on Automated Deduction (CADE'99)</source>
          , volume
          <volume>1632</volume>
          <source>of Lecture Notes in Artificial Intelligence</source>
          , pages
          <fpage>52</fpage>
          -
          <lpage>66</lpage>
          . Springer-Verlag,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <given-names>Stephan</given-names>
            <surname>Tobies</surname>
          </string-name>
          .
          <article-title>The complexity of reasoning with cardinality restrictions and nominals in expressive description logics</article-title>
          .
          <source>J. of Artificial Intelligence Research</source>
          ,
          <volume>12</volume>
          :
          <fpage>199</fpage>
          -
          <lpage>217</lpage>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <given-names>Stephan</given-names>
            <surname>Tobies</surname>
          </string-name>
          .
          <article-title>Complexity Results and Practical Algorithms for Logics in Knowledge Representation</article-title>
          .
          <source>PhD thesis</source>
          , LuFG Theoretical Computer Science, RWTHAachen, Germany,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>