Satisfiability Checking and Conjunctive Query Answering in Description Logics with Global and Local Cardinality Constraints Franz Baader1 , Bartosz Bednarczyk12 , and Sebastian Rudolph1 1 Faculty of Computer Science, TU Dresden, Germany firstname.lastname@tu-dresden.de 2 Institute of Computer Science, University of Wrocław, Poland bartosz.bednarczyk@cs.uni.wroc.pl Abstract. We consider an expressive description logic (DL) in which the global and local cardinality constraints introduced in previous papers can be mixed. On the one hand, we show that this does not increase the complexity of satisfiability checking and other standard inference problems. On the other hand, conjunctive query entailment in this DL turns out to be undecidable. We prove that decidability of querying can be regained if global and local constraints are not mixed and the global constraints are appropriately restricted. 1 Introduction DLs that can express both local cardinality constraints (i.e., constraints concern- ing the role successors of specific individuals) and global cardinality constraints (i.e., constraints on the overall cardinality of concepts) can, for instance, be used to check the correctness of statistical statements. For example, if a German car company claims that they have produced more than N cars in a certain year, and P % of the tires used for their cars were produced by Betteryear, this may be contradictory to a statement of Betteryear that they have sold less than M tires in Germany. Such statistical information may, of course, also influence the answers to queries. If we know that the car company VMW uses only tires from Betteryear or Badmonth, but the statistical information allows us to conclude that another car company has actually bought all the tires sold by Betteryear, then we know that the cars sold by VMW all have tires produced by Badmonth. This motivates investigating DLs with expressive cardinality constraints, and to consider not just standard inferences such as satisfiability checking for these DLs, but also query answering. In two previous publications we have, on the one hand, extended the DL ALCQ by more expressive number restrictions using cardinality and set con- straints expressed in the quantifier-free fragment of Boolean Algebra with Pres- burger Arithmetic (QFBAPA) [10]. In the resulting DL ALCSCC, which was introduced and investigated in [1], cardinality and set constraints are applied locally, i.e., they refer to the role successors of an individual under consid- eration. It was shown in [1] that pure concept satisfiability in ALCSCC is a PSpace-complete problem, and concept satisfiability w.r.t. a general TBox is ExpTime-complete. This shows that the more expressive number restrictions do not increase the complexity of reasoning since reasoning in ALCQ has the same complexity [17,19]. On the other hand, we have extended the terminological formalism of the well-known description logic ALC from general TBoxes to more general cardinal- ity constraints expressed in QFBAPA [4], which we called extended cardinality constraints (ECBoxes). These constraints are global since they refer to all indi- viduals in the interpretation domain. It was shown in [4] that reasoning w.r.t. ECBoxes is in NExpTime even if the numbers occurring in the constraints are encoded in binary. A NExpTime lower bound follows from a result of Tobies [18] for a restricted form of cardinality constraints, where the cardinality of a concept can only be compared with a fixed number. This complexity can be lowered to ExpTime if a restricted form of cardinality constraints (RCBoxes) is used. Such RCBoxes are still powerful enough to express statistical knowledge bases [13]. An obvious way to generalize these two approaches is to combine the two ex- tensions, i.e., to consider extended cardinality constraints, but now on ALCSCC concepts rather than just ALC concepts. This combination was investigated in [2], where a NExpTime upper bound was established for reasoning in ALCSCC w.r.t. ECBoxes. It is also shown in [2] that reasoning w.r.t. RCBoxes stays in ExpTime also for ALCSCC. Here we go one step further by allowing for a tighter integration of global and local constraints. The resulting logic, which we call ALCSCC ++ , allows, for example, to relate the number of role successors of a given individual with the overall number of elements of a certain concept. We show that, from a worst-case complexity point of view, this extended expressivity comes for free if we consider classical reasoning problems. Concept satisfiability in ALCSCC ++ has the same complexity as in ALC and ALCSCC with global cardinality constraints: it is NExpTime-complete. Yet, for effective conjunctive query answering this logic turns out to be too expressive. We show that conjunctive query entailment w.r.t. ALCSCC ++ knowledge bases is, in fact, undecidable. In contrast, we can show that conjunctive query entailment w.r.t. ALCSCC RCBoxes is decidable. We assume the reader to be sufficiently familiar with all the standard notions of description logics [3,5,15]. 2 The logic ALCSCC ++ As in [1,4], we use the quantifier-free fragment of Boolean Algebra with Pres- burger Arithmetic (QFBAPA) to express our constraints. In this logic, one can build set terms by applying Boolean operations (intersection ∩, union ∪, and complement ·c ) to set variables as well as the constants ∅ and U. Set terms s, t can then be used to state set constraints, which are equality and inclusion con- straints of the form s = t, s ⊆ t, where s, t are set terms. Presburger Arithmetic (PA) expressions are built from integer constants and set cardinalities |s| using addition as well as multiplication with an integer constant. They can be used to form cardinality constraints of the form k = `, k < `, N dvd `, where k, ` are PA expressions, N is an integer constant, and dvd stands for divisibility. A QF- BAPA formula is a Boolean combination of set and cardinality constraints. A solution σ of a QFBAPA formula φ assigns a finite set σ(U) to U and subsets of σ(U) to set variables such that φ is satisfied by this assignment (see [1] for more details). A QFBAPA formula φ is satisfiable if it has a solution. In [10] it is shown that the satisfiability problem for QFBAPA formulae is NP-complete. We are now ready to define our new logic, which we call ALCSCC ++ to indi- cate that it is an extension of the logic ALCSCC introduced in [1]. When defining the semantics of ALCSCC ++ , we restrict the attention to finite interpretations to ensure that cardinalities of concept descriptions are always well-defined non- negative integers. Definition 1. Given disjoint finite sets NC and NR of concept names and role names, respectively, ALCSCC ++ concept descriptions (short: concepts) are Boolean combinations of concept names and constraint expressions, where a con- straint expression is of the form sat(c) for a set constraint or a cardinality con- straint c that uses role names and ALCSCC ++ concept descriptions in place of set variables. As usual, we use > ( top) and ⊥ ( bottom) as abbreviations for A t ¬A and A u ¬A, respectively. A finite interpretation of NC and NR consists of a finite, non-empty set ∆I and a mapping ·I that maps every concept name A ∈ NC to a subset AI of ∆I and every role name r ∈ NR to a binary relation rI over ∆I . For a given element d ∈ ∆I we define rI (d) := {e ∈ ∆I | (d, e) ∈ rI }. The interpretation function ·I is inductively extended to ALCSCC ++ concept descriptions by interpreting the Boolean operators as usual, and the constraint expressions as follows: sat(c)I := {d ∈ ∆I | the mapping ·Id satisfies c}, where ·Id maps – ∅ to ∅Id := ∅ and U to U Id := ∆I , – the ALCSCC ++ concept descriptions C occurring in c to C Id := C I , – and the role names r occurring in c to rId := rI (d). The ALCSCC ++ concept description C is satisfiable if there is a finite interpre- tation I such that C I 6= ∅. Note that the interpretation of concepts as set variables in ALCSCC ++ is global in the sense that it does not depend on d, i.e., C Id = C Ie for all d, e ∈ ∆I . In contrast, the interpretation of role names r as set variables is local since only the r-successors of d are considered by ·Id . In ALCSCC, also the interpretation of concepts as set variables is local since in the semantics of ALCSCC the mapping ·Id considers only the elements of C I that are role successors of d for some role name in NR . This can clearly be simulated in ALCSCC ++ by using C ∩( r∈NR r) S instead of C when formulating the constraint. Thus, ALCSCC concepts can be expressed by ALCSCC ++ concepts. In addition, extended cardinality constraints (ECBoxes), as introduced in [4], are expressible within ALCSCC ++ concept de- scriptions, as are nominals, the universal role, and role negation. Proposition 1. ALCSCC ++ concepts can polynomially express nominals, role conjunctions, and ALCSCC ECBoxes, and thus also ABoxes, ALC ECBoxes and ALCSCC TBoxes. In addition, they have the same expressivity as concepts of ALCSCC extended with the universal role or with role negation, whereas both of these features are not expressible in plain ALCSCC. Proof. ECBoxes correspond to Boolean combinations of concept descriptions of the form sat(c) where c contains only concept descriptions as set variables. Since such concept descriptions are satisfied either by no element of ∆I or by all of them, their effect is to enforce the constraint on the whole interpretation domain if they are conjoined to a concept description. Regarding role negation, for given role names r, r, the constraint sat(> ⊆ sat(r∩r ⊆ ∅)) enforces that, for every individual, the sets of its r and r successors are disjoint. In addition, the constraint sat(> ⊆ sat(|r| + |r| = |U|)) says that elements of the domain that are not r successors of a given individual must be r successors. Thus, we can express that the role r is interpreteted as the complement of r, i.e. rI = ∆I × ∆I \ rI for every finite interpretation I. The other statements in the proposition are also easy to see. t u 3 Satisfiability of ALCSCC ++ concept descriptions In the following we consider an ALCSCC ++ concept description E and show how to test E for satisfiability by reducing this problem to the problem of test- ing satisfiability of QFBAPA formulae. Since the reduction is exponential and satisfiability in QFBAPA is in NP, this yields a NExpTime upper bound for satisfiability of ALCSCC ++ concept descriptions. This bound is optimal since consistency of extended cardinality constraints in ALC, as introduced in [4], is already NExpTime hard, and can be expressed as an ALCSCC ++ satisfiability problem by Proposition 1. Our NExpTime algorithm combines ideas from the satisfiability algorithm for ALCSCC concept descriptions [1] and the consistency procedure for ALC ECBoxes [4]. In particular, we use the notion of a type, as introduced in [4]. This notion is also similar to the Venn regions employed in [1]. Given a set of concept descriptions M, the type of an individual in an interpretation consists of the elements of M to which the individual belongs. Such a type t can also be seen as a concept description Ct , which is the conjunction of all the elements of t. We assume in the following that M consists of all subdescriptions of the concept description E as well as the negations of these subdescriptions. Definition 2. A subset t of M is a type for E if it satisfies the following three properties: (1) for every concept description ¬C ∈ M, either C or ¬C belongs to t; (2) for every concept description C u D ∈ M, we have that C u D ∈ t iff C ∈ t and D ∈ t; (3) for every concept description C t D ∈ M, we have that C t D ∈ t iff C ∈ t or D ∈ t. We denote the set of all types for E with types(E). Given an interpretation I and an individual d ∈ ∆I , the type of d is the set tI (d) := {C ∈ M | d ∈ C I }. Due to Condition (1) in the definition of types, concept descriptions Ct , Ct0 induced by different types t 6= t0 are disjoint, and all concept descriptions in M can be obtained as the disjoint union of the concept descriptions induced by the types containing them, i.e., we have C I = t type with C∈t CtI for all C ∈ S M and finite interpretations I. In particular, the following holds for all finite interpretations I: X \ |C I | = |CtI | and |CtI | = | C I |, t type with C∈t C∈t where the latter identity is an immediate consequence of the definition of Ct as the conjunction of all the elements of t. Given a type t, the constraints occurring in the top-level Boolean structure of t induce a QFBAPA formula ψt , in which the concepts C and roles r occurring in these constraints are replaced by set variables XC and Xrt , respectively. Note that set variables corresponding to concepts are independent of the type t, i.e., they are shared by all types, whereas the set variables corresponding to roles are different for different types. This corresponds to the fact that roles are evaluated locally, but concepts are evaluated globally in the semantics of ALCSCC ++ . In order to ensure that the Boolean structure of concepts is respected by the set variables, we introduce the formula β := ^ ^ ^ c XCuD = XC ∩ XD ∧ XCtD = XC ∪ XD ∧ X¬C = (XC ) . CuD∈M CtD∈M ¬C∈M Overall, we translate the ALCSCC ++ concept E into the QFBAPA formula ^ \ δE := (|XE | ≥ 1) ∧ β ∧ (| XC | = 0) ∨ ψt . t∈types(E) C∈t Intuitively, to satisfy E, we need to have at least one element in it, which explains the first conjunct. The third conjunct together with β ensures that, for any type that is realized (i.e., has elements), the constraints of this type are satisfied. The next lemma states that there is a 1–1-relationship between solvability of δE and satisfiability of E. Lemma 1. The QFBAPA formula δE is of size at most exponential in the size of E, and it is satisfiable iff the ALCSCC ++ concept description E is satisfiable. Since satisfiability of QFBAPA formulae can be decided within NP even for binary coding of numbers [10], this lemma shows that satisfiability of ALCSCC ++ concept descriptions can be decided within NExpTime. Together with the known NExpTime lower bound for consistency of ALC ECBoxes in [4], this yields: Theorem 1. Satisfiability of ALCSCC ++ concept descriptions with numbers en- coded in binary is NExpTime-complete. Thanks to Proposition 1, this carries over to satisfiability of ALCSCC ++ knowledge bases which may feature an ABox, a TBox and an ECBox. 4 Query entailment in ALCSCC ++ The final result of this section is the undecidability of conjunctive query entail- ment for ALCSCC ++ . To this end, we first briefly recap the notion of (Boolean) conjunctive queries and define query entailment. In queries, we use variables from a countably infinite set V . A Boolean con- junctive query (CQ) q is a finite set of atoms of the form r(x, y) or C(z), where r is a role, C is concept, and x, y, z ∈ V . A CQ q is satisfied by I (written: I |= q) if there is a variable assignment π : V → ∆I (called match) such that (π(x), π(y)) ∈ rI for every r(x, y) ∈ q and π(z) ∈ C I for every C(z) ∈ q. A CQ q is (finitely) entailed from a knowledge base K (written: K |= q) if every (finite) model I of K satisfies q. We actually show undecidability of CQ entailment for a much weaker logic, thereby providing a very restricted fragment of constant-free and equality-free two-variable first-order logic for which finite CQ entailment is already unde- cidable, significantly strengthening and solidifying earlier results along those lines [14]. Our proof makes use of deterministic Turing machines (DTMs). For our purposes, it is sufficient to consider only computations starting with an empty tape. For space reasons, we assume the reader to be familiar with stan- dard notions and constructions concerning DTMs. We call a DTM looping if its run starting contains repeating configurations,i.e., there are two different (and hence – due to determinism – infinitely many) points in time, where the ma- chine’s tape content, head position, and state are the same. It is easy to see that the problem of determining if a given TM is looping is undecidable. We show our undecidability result for the DL ALC cov , a slight extension of ALC by role cover axioms of the form cov(r, s) for role names r and s. An interpretation I satisfies cov(r, s) if rI ∪ sI = ∆I × ∆I . Role  cover axioms can be expressed in ALCSCC ++ via sat > ⊆ sat(|r ∪ s| = |U|) , hence ALC cov is subsumed by ALCSCC ++ . In what follows, assume that a DTM M is given. We now describe an ALC cov TBox T and conjunctive query q such that T |= q exactly if M is not looping. We provide q and T together with the underlying intuitions of our construction. The goal of our construction is that a countermodel (i.e., an interpretation satisfying T but not q) corresponds to a looping configuration sequence of M. Thereby, the domain elements represent tape cells at certain computation steps of M. The role h connects consecutive tape cells of the same configuration, whereas the role v connects a configuration’s tape cell with the same tape cell of the successor configuration. We start by providing the query. Intuitively, the query is meant to catch the unwanted situation that two corresponding tape cells of consecutive configura- tions are v-connected, but the cells to their right aren’t. q = ∃x, y, x0 , y 0 .v(x, y) ∧ h(x, x0 ) ∧ h(y, y 0 ) ∧ v(x0 , y 0 ) (1) Table 1. TBox axioms for DTM implementation > v ∃aux.(TapeStart u InitConf u State qini ) (3) > v ∃h.> u ∃v.> (4) TapeStart v ∀v.TapeStart (5) InitConf v ∀h.InitConf InitConf v Symbol  (6) State q v ∀h.NoHeadR NoHeadR v ∀h.NoHeadR NoHeadR v NoHead (7) ∃h.Stateq v NoHeadL ∃h.NoHeadL v NoHeadL NoHeadL v NoHead (8) State q u NoHead v ⊥ (9) Symbolσ u Symbol σ0 v ⊥ Stateq u State q0 v ⊥ (10) N oHead u Symbol σ v ∀v.Symbolσ (11) Stateq u Symbol σ v ∀v.(Symbolσ0 u ∀h.Stateq0 ) (12) ∃h.(Stateq u Symbol σ ) v ∀v.(Stateq0 u ∀h.Symbolσ0 ) (13) TapeStart u State q u Symbol σ v ∀v.(Stateq0 u Symbol σ0 ) (14) We proceed by giving the axioms of T . The following covering axiom ensures that, whenever two elements are not v-connected, they must be v-connected. This is needed to enable the above query to catch the described problem. cov(v, v) (2) The remaining TBox axioms can be found in Table 1. Axiom 3 ensures (by means of an auxiliary role aux which serves no further purpose) that there is a first tape cell of the first (initial) configuration where the head of the TM is positioned in the initial state. Axiom 4 enforces that for every cell of every configuration there is both a tape cell to its right and a corresponding tape cell in the successor configuration. Axiom 5 makes sure that, for every cell that is the first on its tape, the corresponding successor configuration’s tape cell is also the first. Axioms 6 propagates the information that a cell belongs to the initial configuration along the tape, and fills the tape with blanks. Axioms 7–9 (instantiated for every state q) make sure that in every configuration there can only be one cell where the head is positioned. Every cell can only carry one symbol and the head can be in only one state, as ensured by Axioms 10 (for distinct symbols σ, σ 0 and distinct states q, q0 ). Thanks to Axiom 11, symbols on head-free cells carry over to the next configuration. As specified by the DTM’s transition function, the head reads a symbol σ, writes a symbol σ 0 , changes its state from q to q0 and moves right (Axiom 12) or left (Axiom 13) or stays in its place whenever it is supposed to move left but is already at the leftmost tape cell (Axiom 14). This finishes the description of the TBox T , allowing us to establish the claimed property and consequenty the undecidability result. Proposition 2. M is looping iff there is a finite model I of T with I 6|= Q. Theorem 2. Finite CQ entailment over ALC cov TBoxes is undecidable. Proof. According to Proposition 2, the TM looping problem can be reduced to the problem if for a given ALC cov TBox T and conjunctive query q, there is a finite interpretation I with I |= T with I 6|= q. Note that the latter is the case exactly if T does not finitely entail q. Finally, taking into account that ALCSCC ++ subsumes ALC cov and only al- lows for finite models, we obtain the wanted result. Corollary 1. Conjunctive query entailment for ALCSCC ++ is undecidable. 5 Query entailment from ALCSCC RCBoxes As the last result of the paper, we show that decidability of CQ entailment can be regained by moving to a less expressive logic, namely ALCSCC RCBoxes, i.e., finite sets of restricted cardinality constraints of the form N1 |C1 | + . . . + Nk |Ck | ≤ Nk+1 |Ck+1 | + . . . + Nk+` |Ck+` |, where Ci are ALCSCC concept descriptions and Ni are integer constants for 1 ≤ i ≤ k + `, with the obvious semantics. RCBoxes can express general concept inclusions (GCIs) C v D via |C u ¬D| ≤ |⊥|, so we use GCIs when appropriate. We first sketch our approach. Let R be an input RCBox and let q be an input CQ. Assume that q is not entailed by R. Hence there is a counter-model I, satisfying R but not q. Our goal is to develop an algorithm to produce another RCBox R0 consisting of R and some additional knowledge, in a way that R0 is satisfiable if and only if there is a counter-model of R for q. This can be done by a careful analysis of query matches. Note that if a query q is entailed by R, every model I falls into one of the following two categories: (i) either there is an acyclic (also called tree-shaped ) query match or (ii) every query match contains some cyclic substructure of the model. To deal with these two cases we proceed as follows. For the first case, in order to rule out such models having tree-shaped query matches, we enrich our RCBox R with additional knowledge forbidding these matches. This can be done by using the so-called rolling-up technique of transforming query matches into concepts, as, e.g., in [7,11,8]. For the second case, we show that it actually cannot happen: we argue that if there is a model I with only cyclic query matches, one can employ an appropriate model transformation, called pumping, to turn I into a proper counter-model. This technique is sometimes called big-cycle method [12] or Rosati covers and was successfully used in the context of finite query entail- ment, e.g, in [6,14,9]. Concluding, it suffices to enrich an ALCSCC RCBox with additional statements ruling out all models with acyclic query matches, to obtain an RCBox whose satisfiability coincides with the existence of a counter-model. Checking (un)satisfiability of ALCSCC RCBoxes is decidable [2], so we can con- clude following theorem: Theorem 3. CQ entailment from ALCSCC RCBoxes is decidable. The rest of this section is devoted to a sketch of the proof of Theorem 3 as outlined above. In Section 5.1 we show how to forbid tree-shaped query matches, in Section 5.2 we handle the case of cyclic matches and construct counter-models. 5.1 Forbidding acyclic query matches In this section, we provide a method for forbidding tree-shaped query matches. We strongly rely on previous results on query entailment for ALCHQ knowledge bases [11]. For the sake of simplicity we assume that all queries under consider- ation are connected (in the graph-theoretic sense). We first recall some standard definitions. Let q be a conjunctive query and let Vq be the set of variables appearing in q. We can see a query q as a directed graph Gq = (Vq , Eq ), where the nodes are variables from q and any two nodes x, y are connected if some atom r(x, y) appears in q. A query is tree-shaped, if the underlying graph does not contain any (undirected) cycles. We introduce the notion of treeification, which for a given query q essentially describes the set of all its ways to match in a tree-shaped way. Definition 3. Let q be a CQ. We say that a tree-shaped query q 0 is a treeification of q, if q 0 can be obtained from q by (possibly multiple times) selecting some atoms r(x, z) and s(y, z) and replacing all variables x in the query by y. By trees(q) we denote the set of all treeifications of q. Note that trees(q) is finite. Next we describe the so-called rolling-up technique, which transforms a tree- shaped query match into a single concept [16,11]. Let us fix a conjunctive query q and some treeification q 0 of it. For each variable x ∈ Vq we construct a con- cept Cq0 ,x , with the supposed meaning that d ∈ CqI0 ,x if variable x from q 0 can be mapped to d in a query match represented by q 0 . We define these concepts as follows: Picking one arbitrary variable xrq0 ∈ Vq0 , we let (Vq0 , ≺) be the tree obtained from Gq0 by orienting all edges away from xrq0 . Now, wed define Cq ,x for 0 every x ∈ Vq in a bottom-up manner as follows: Cq ,x equals C(x)∈q0 C if x is 0 0 a leaf (i.e. ≺-minimal), otherwise: l l  \  l  \  C u ∃ s.Cq0 ,y u ∃ s− .Cq0 ,y C(x)∈q 0 (x,y)∈E 0 q s(x,y)∈q 0 (y,x)∈E 0 q s(y,x)∈q 0 y≺x y≺x Concepts of the form ∃ s(y,x)∈q0 s− .Cq0 ,y in the above definition are clearly not T in ALC (due to the presence of inverse roles), but this can be remedied as fol- lows: We replace any ∃r− .Cq0 ,y with a (conjunction of) inverted role name(s) r− by a newly introduced concept C∃r− .Cq0 ,y for which we also specify Cq0 ,y v ∀r.Cr− .Cq0 ,y . Like this, any of the concepts Cq0 ,y is free of inverses. For a given CQ q, we enumerate all of its treeifications and roll them up into a concept. Let RMatch q be an RCBox defining that there exists a tree-shaped query match in a model: G Cq0 ,xrq0 v Matchq (15) q 0 ∈trees(q) The two coming lemmas give the precise meaning to the defined concepts. Lemma 2. Let R be an ALCSCC RCBox and let q be a conjunctive query. Let RMatch q be as defined above. Assume that R ∪ RMatch q has a model I such I that Matchq is empty. Then I does not have any tree-shaped query matches. Lemma 3. Let R be an ALCSCC RCBox and let q be a conjunctive query and RMatch q as defined above. If there is a model I of R without any tree-shaped query matches, then R∗ = R ∪ RMatch q ∪ {> v ¬Matchq } is satisfiable. Satisfiability checking of R∗ can be performed with an algorithm from [2]. Together with Lemma 2 and Lemma 3, we establish: Theorem 4. It is decidable whether for a given CQ q and a given ALCSCC RCBox R, there is a model of R without any tree-shaped query matches of q. 5.2 Pumping models to enforce high girth Now we take a closer look at the previously announced pumping method for eliminating non-tree-shaped query matches. Definition 4. Let I be an interpretation, ∆I = {d1 , d2 , . . . , dn } be the set of domain elements and E the set of “edges”, i.e., e = (d, d0 ) occur in E if there is a role r, s.t. (d, d0 ) ∈ rI . Additionally, let F be the set of all functions f of type f : E → {0, 1}. We define a pumping of I, denoted with pump(I) = I 0 , 0 in the following way: (i) we set ∆I = ∆I × F , (ii) for a concept name A, an 0 element d ∈ ∆ and any function f we set (d, f ) ∈ AI iff d ∈ AI (iii) for any I 0 role name r and a pair ((d, f ), (d , f )) we set ((d, f ), (d , f 0 )) ∈ rI iff (d, d0 ) ∈ rI 0 0 0 and functions f, f are equal on all arguments except for the argument e = (d, d0 ) 0 for which f 0 (e) = 1 − f (e). The girth of I is the length of a shortest (undirected) proper cycle contained in ∆I (proper means that no element from E is used more than once). It is not difficult to see that the girth of pump(I) is at least twice the girth of I: due to the fact that we need to flip the e value in f every time we cross an edge e. Lemma 4. Let I be an interpretation with girth k. Then the girth of pump(I) is at least 2k. Correctness of the pumping method is guaranteed by the following lemma. Its proof relies on two observations (i) degree of each node and “types” of its successors are preserved during pumping, and (ii) all global constraints from the RCBox are still satisfied since all cardinalities from inequalities were multiplied by the same number (i.e. |F |). Formally, the proof goes via an induction over the depth of ALCSCC concepts. Lemma 5. Let R be an RCBox with a model I. Then pump(I) is a model of R. Now we explain how to use the pumping method from the Definition 4 to erase non-tree query matches and obtain a counter-model. Lemma 6. Let q be a CQ and let R be an ALCSCC RCBox. Assume R has a model I that does not have any acyclic query matches of q. Then there exists a counter-model I 0 for q and R. Proof. We define I 0 as the |q|-fold application of pump to I. Since each it- eration of pumping doubles the girth of the input structure, the girth of I 0 is strictly greater than |q|. Moreover, during the pumping process, we haven’t intro- duced any new query matches. Hence, since any cyclic match of q would require girth ≤ |q|, the query q cannot match into I 0 anymore and I 0 is a counter-model. By combining (i) a method of pumping a structure from this Section, (ii) a method of enriching a knowledge-base with a knowledge to forbid all acyclic query matches from the previous section and (iii) an algorithm for testing (un)sat- isfiability for ALCSCC concepts w.r.t RCBoxes, we conclude Theorem 3. A rough complexity analysis give us a 2ExpTime upper bound. We believe that this bound can be improved to ExpTime by adapting techniques from [11]. 6 Conclusion In our work, we have significantly pushed the boundaries of quantitative exten- sions to description logics. We showed that ALC can not only be extended by both global (extension-based) and local (neighbourhood-based) arithmetic con- straints but even by hybrid constraints mixing the two, yielding a significant increase in expressivity without negative impact on the complexity of satisfia- bility testing. On the downside, we had to find out that this extension leads to undecidability of conjunctive query answering. However, we were able to regain decidability and even a favourable complexity under appropriate restrictions. Without any doubt, the ability to deal with factual data, i.e. ABoxes, is of utmost importance in the context of statistical considerations and query- ing. Therefore, our next step will be to extend our investigations to full-fledged ALCSCC knowledge bases including ABoxes. In fact, we have already established that ABoxes can be added to ALCSCC RCBoxes without impacting the Exp- Time complexity of satisfiability checking and are confident that standard query partitioning and a slight modification of model pumping will allow us to show ExpTime completeness of CQ answering. Acknowledgements Bartosz Bednarczyk was supported by the Polish Ministry of Science and Higher Education program “Diamentowy Grant” no. DI2017 006447. Sebastian Rudolph was supported by the European Research Council (ERC) through the Consol- idator Grant 771779 (DeciGUT). Franz Baader was partially supported by the German Research Foundation (DFG) within the Research Unit 1513 Hybris. References 1. Franz Baader. A new description logic with set constraints and cardinality con- straints on role successors. In Clare Dixon and Marcelo Finger, editors, Pro- ceedings of the 11th International Symposium on Frontiers of Combining Systems (FroCoS’17), volume 10483 of Lecture Notes in Computer Science, pages 43–59, Brasília, Brazil, 2017. Springer-Verlag. 2. Franz Baader. Expressive cardinality constraints on ALCSCC concepts. In Pro- ceedings of the 34th ACM/SIGAPP Symposium On Applied Computing (SAC’19). ACM, 2019. 3. Franz Baader, Diego Calvanese, Deborah McGuinness, Daniele Nardi, and Peter Patel-Schneider, editors. The Description Logic Handbook: Theory, Implementa- tion, and Applications. Cambridge University Press, second edition, 2007. 4. Franz Baader and Andreas Ecke. Extending the description logic alc with more expressive cardinality constraints on concepts. In GCAI 2017. 3rd Global Con- ference on Artificial Intelligence, volume 50 of EPiC Series in Computing, pages 6–19. EasyChair, 2017. 5. Franz Baader, Ian Horrocks, Carsten Lutz, and Uli Sattler. An Introduction to Description Logic. Cambridge University Press, 2017. 6. Vince Bárány, Georg Gottlob, and Martin Otto. Querying the guarded fragment. Logical Methods in Computer Science, 10(2), 2014. 7. Diego Calvanese, Giuseppe De Giacomo, and Maurizio Lenzerini. On the decid- ability of query containment under constraints. In Proc. of the 17th ACM SIGACT SIGMOD SIGART Symp. on Principles of Database Systems (PODS’98), pages 149–158, 1998. 8. Birte Glimm, Carsten Lutz, Ian Horrocks, and Ulrike Sattler. Conjunctive query answering for the description logic SHIQ. J. of Artificial Intelligence Research, 31:157–204, 2008. 9. Yazmin Angélica Ibáñez-García, Carsten Lutz, and Thomas Schneider. Finite model reasoning in horn description logics. In Proc. of the 14th Int. Conf. on Principles of Knowledge Representation and Reasoning (KR 2014), pages 288–297, 2014. 10. Viktor Kuncak and Martin C. Rinard. Towards efficient satisfiability checking for Boolean algebra with Presburger arithmetic. In Frank Pfenning, editor, Proc. of the 21st Int. Conf. on Automated Deduction (CADE-07), volume 4603 of Lecture Notes in Computer Science, pages 215–230. Springer, 2007. 11. Carsten Lutz. The complexity of conjunctive query answering in expressive de- scription logics. In Alessandro Armando, Peter Baumgartner, and Gilles Dowek, editors, Proc. of the Int. Joint Conf. on Automated Reasoning (IJCAR 2008), Lec- ture Notes in Artificial Intelligence, pages 179–193. Springer-Verlag, 2008. 12. Martin Otto. Highly acyclic groups, hypergraph covers, and the guarded fragment. J. ACM, 59(1):5:1–5:40, 2012. 13. Rafael Peñaloza and Nico Potyka. Towards statistical reasoning in description logics over finite domains. In Serafin Moral and Olivier Pivert, editors, Proc. of the 11th Int. Conf. on Scalable Uncertainty Management (SUM 2017), volume 10564 of Lecture Notes in Computer Science. Springer-Verlag, 2017. 14. Ian Pratt-Hartmann. Data-complexity of the two-variable fragment with counting quantifiers. Inf. Comput., 207(8):867–888, 2009. 15. Sebastian Rudolph. Foundations of description logics. In Axel Polleres, Claudia d’Amato, Marcelo Arenas, Siegfried Handschuh, Paula Kroner, Sascha Ossowski, and Peter F. Patel-Schneider, editors, Reasoning Web. Semantic Technologies for the Web of Data – 7th International Summer School 2011, volume 6848 of LNCS, pages 76–136. Springer, 2011. 16. Sergio Tessaris. Querying expressive dls. In Proc. of the 2001 Description Logic Workshop (DL 2001), volume 49 of CEUR Workshop Proceedings. CEUR-WS.org, 2001. 17. Stephan Tobies. A PSPACE algorithm for graded modal logic. In Har- ald Ganzinger, editor, Proc. of the 16th Int. Conf. on Automated Deduction (CADE’99), volume 1632 of Lecture Notes in Artificial Intelligence, pages 52–66. Springer-Verlag, 1999. 18. Stephan Tobies. The complexity of reasoning with cardinality restrictions and nominals in expressive description logics. J. of Artificial Intelligence Research, 12:199–217, 2000. 19. Stephan Tobies. Complexity Results and Practical Algorithms for Logics in Knowl- edge Representation. PhD thesis, LuFG Theoretical Computer Science, RWTH- Aachen, Germany, 2001.