-

Counting Queries over E LHI ? Ontologies ? ??

University of Bordeaux

Bordeaux INP

LaBRI

Talence

France

Inria

DI ENS

University PSL

Paris

France

While ontology-mediated query answering most often adopts (unions of) conjunctive queries as the query language, some recent works have explored the use of counting queries coupled with DL-Lite ontologies. In the present paper, we initiate the study of counting queries for Horn description logics outside the DL-Lite family. Via a non-trivial adaptation of existing techniques, we devise a decision procedure for answering counting conjunctive queries over ELHI? ontologies, which yields the same upper bounds as are known for DL-LiteR, namely, coNP in data complexity and coN2EXP w.r.t. combined complexity. We further show that the best known lower bounds for DL-LiteR (coNP-hard for data, coNEXP-hard for combined) hold also for EL.

Ontology-mediated query answering Counting queries

Introduction In the context of ontology-mediated query answering, the most commonly considered queries are conjunctive queries (CQs), but several works have explored ways of equipping CQs with some form of counting [ 8,10,9 ]. A recent approach, proposed in [ 3 ] as a generalization of [ 10 ], considers counting conjunctive queries (CCQs) that are syntactically de ned like standard CQs except that some variables may be designated as counting variables. In each model of the knowledge base, we can count the number of possible assignments to the counting variables that make the query hold. Certain answers are then de ned as intervals that provide upper and lower bounds on the count values across all models.

The problem of answering CCQs over DL-LiteR ontologies is intractable in general [ 10 ], and recent works have shown that intractability arises even in quite restricted settings [ 6,4 ]. However, some interesting tractable cases have also been identi ed, notably, rooted CCQs [ 3,6,11 ] and cardinality queries (= Boolean atomic CCQs) [ 4 ] coupled with DL-Litecore ontologies; query rewriting techniques have also begun to be explored [ 7 ]. To the best of our knowledge, however, CCQ answering has not yet been studied for ontologies outside the DL-Lite family. ? Partially supported by ANR project CQFD (ANR-18-CE23-0003) ?? © 2021 for this paper by its authors. Use permitted under Creative Commons

License Attribution 4.0 International (CC BY 4.0).

This motivates us to extend the study of CCQ answering to other well-known Horn description logics, such as E L and the more expressive E LHI?. While E L enjoys good computational properties for CQ answering [ 1 ], the techniques developed for CCQs in the DL-Lite context do not easily transfer. Indeed, a key property of DL-Lite is that merging elements in a model preserves modelhood so long as disjointness axioms are not violated. This property does not hold in E L, due to conjunctions of concepts on the left-hand side (LHS) of inclusions.

Through a non-trivial adaptation of the DL-Lite approach, we obtain similar upper bounds for CCQ answering over E LHI? ontologies as were known for DL-LiteR, namely, coNP membership w.r.t. data complexity and coN2EXP membership w.r.t. combined complexity. We further prove that even if we restrict to E L, we have the same lower bounds as in DL-LiteR: coNP-hardness w.r.t. data complexity and coNEXP-hardness w.r.t. combined complexity. 2

Preliminaries Knowledge Bases. We assume mutually disjoint sets NC, NR, and NI of concept, role, and individual names. A knowledge base (KB) K = (T ; A) consists of an ABox A and a TBox T . An ABox is a nite set of concept assertions A(b) (with A 2 NC, b 2 NI) and role assertions P(a; b) (with P 2 NR, a; b 2 NI). Let Ind(A) denote the set of individuals occurring in an ABox A.

A TBox is a nite set of axioms. In our considered DL E LHI?, TBoxes consist of concept inclusions B1 v B2, positive role inclusions R1 v R2, and negative role inclusions3 R1 u R2 v ?, where the Ri are roles drawn from NR = fP; P j P 2 NRg and the Bi are (complex) concepts constructed as follows: B := ? j > j A j B1 u B2 j 9R:B where A 2 NC; R 2 NR Let sig(T ) denote the set of concept and role names appearing in TBox T . Semantics of KBs. An interpretation takes the form I = ( I ; I ), where I is a non-empty set (called the domain) and I is the interpretation function that maps each A 2 NC to AI I , each P 2 NR to PI I I , and each a 2 NI to aI . In this paper, we will make the Standard Names Assumption by setting aI = a. Note however that our results only rely upon the weaker Unique Names Assumption (UNA), which stipulates that aI 6= bI whenever a 6= b.

The function I naturally extends to roles and complex concepts: (P )I = f(y; x) j (x; y) 2 PI g, ?I = ;, >I = I , (B1 u B2)I = B1I \ B2I and (9P:B)I = fd j (d; e) 2 RI ; e 2 BI g. An inclusion G v H is satis ed in I if GI HI ; an assertion A(b) (resp. P(a; b)) is satis ed in I if b 2 AI (resp. (a; b) 2 PI ). An interpretation is a model of a TBox T (resp. KB K) if it satis es all axioms in T (resp. axioms and assertions in K). A KB is satis able if it has at least one model. An inclusion (resp. assertion) is entailed from T (resp. K), written T j= (resp. K j= ), if is satis ed in every model of T (resp. K). 3 We follow e.g. [ 2 ] by considering a version of ELHI? with negative role inclusions. Example 1. Consider the ABox Ae = fA(a); B(b)g and the E LHI? TBox Te: Counting Queries. We consider counting queries as de ned in [ 3 ] (which generalizes the queries considered in [ 10,6 ]). A counting conjunctive query (CCQ) takes the form q(x) = 9y9z (x; y; z), where x; y; z are tuples of answer, existential, and counting variables, respectively, and is a conjunction of concept and role atoms with terms from NI [ x [ y [ z. A CCQ q is Boolean if x = ;.

A match for a CCQ q in an interpretation I is a homomorphism from q into I. If a match maps x to a, then the restriction of to z is called a counting match (c-match) of q(a) in I. The set of answers to q in I, denoted qI , contains all pairs (a; [m; M ]), with m; M 2 N [ f+1g, such that the number of distinct c-matches of q(a) in I belongs to the interval [m; M ]. A certain answer to q w.r.t. K is an answer in every model of K, that is a pair from TIj=K qI .

As usual, it is su cient to consider the Boolean case: (a; [m; M ]) is a certain answer to a CCQ q(x) i (;; [m; M ]) is a certain answer to the Boolean CCQ q(a) obtained by replacing x with a. Thus, from now on, we focus on Boolean CCQs, and work with candidate answers [m; M ] in place of (;; [m; M ]).

We further observe that since E LHI? cannot restrict the size of models, the value M in a certain answer [m; M ] is: 0 if the underlying CQ is unsatis able w.r.t. T , any number greater than 1 if q has a match in every model but z = ;; and +1 otherwise. As the rst two cases can be readily handled using existing techniques, we focus on identifying certain answers of the form [m; +1]. Example 2. Let qe := 9z D(z) be a Boolean CCQ. Intervals [0; +1] and [1; +1] are certain answers to qe over Ke. Interval [3; +1] is not a certain answer as the models depicted on Figures 1a, 1c, 1d and 1e contain only 2 matches for qe. Complexity. Given a E LHI? knowledge base K = (T ; A), a Boolean CCQ q, and an integer m 0 (in binary), we are interested in the complexity of deciding whether [m; +1] is a certain answer to q w.r.t. K. We will consider the two usual complexity measures: combined complexity which is in terms of the size of the whole input, and data complexity which is only in terms of the size of A and m (T and q are treated as xed). If O is a TBox, ABox, KB, or CCQ, then the size of O, denoted jOj, is the number of occurrences of concept and role names in O. Normal form. As is standard (see e.g. [ 2 ]), we work with E LHI? TBoxes in a convenient normal form, where every concept inclusion has one of the following restricted shapes:

A v ? > v A

A1 u A2 v A

A1 v 9R:A2 9R:A1 v A2 with A; A1; A2 2 NC; R 2 NR . Through the introduction of fresh concept names, we can transform in polynomial time any TBox T into a normal-form TBox T 0 that is a model-conservative extension of T (hence, indistinguishable from T from the point of view of queries). We therefore assume w.l.o.g. that all considered TBoxes are in normal form.

Canonical model. It is well known that every satis able E LHI? KB admits a canonical (or universal) model that embeds homomorphically into each of its models. We recall how such a model CK can be constructed (see e.g. [ 5 ]). The domain CK consists of all sequences aR1:M1 : : : Rn:Mn (n 0) such that a 2 Ind(A), each Ri belongs to NR , each Mi is a conjunction of concepts from NC [ f>g (treated as a set when convenient), and the following conditions hold: { If n 1, then T j= M0 v 9R1:M1 where M0 = fA 2 NC [ f>g j K j= A(a)g and M1 is maximal, as a set of concept names, for this property. { For every 1 i < n, T j= Mi v 9Ri+1:Mi+1 and Mi+1 is maximal, as a set of concept names, for this property.

Individual names are interpreted as themselves (aCK = a), and concept and role names are interpreted as follows:

ACK = fa j K j= A(a)g [ fe R:M j A 2 Mg PCK = f(a; b) j K j= P(a; b)g

[ f(e; e P0:M) j T j= P0 v Pg [ f(e P0:M; e) j T j= P0 v P g 3

Upper Bounds via Countermodels This section presents our main contribution: a decision procedure (and associated complexity upper bounds) for deciding whether a candidate interval [m; +1] is a certain answer to a Boolean CCQ q and E LHI? KB K = (T ; A) in normal form. It is more convenient in fact to focus on the complementary problem of checking whether [m; +1] is not a certain answer, as the latter holds i there exists a countermodel, i.e., a model of K with fewer than m c-matches.

We start by observing that if m is large enough, a countermodel always exists.

Lemma 1. There exists a countermodel for all m

(jInd(A)j + 3 jT j 2jT j)jqj.

Proof (sketch). We exhibit a model of size at most jInd(A)j + 3 jT j 2jT j. We may thus assume that the input m is such that m (jInd(A)j + 3 jT j 2jT j)jqj.

The main ingredient underlying our decision procedure and upper bounds is the following result, which restricts the size of countermodels we need to consider.

Theorem 1. If there exists a countermodel for input [m; +1], CCQ q and KB K, then there exists a countermodel with a polynomial-size (resp. double

exponential size) domain w.r.t. data complexity (resp. combined complexity).

Theorem 1 generalizes analogous results for DL-Lite KBs in [ 10,3 ] and gives rise to a decision procedure that guesses an interpretation of bounded size and checks whether it is a countermodel, yielding the following upper bounds:

Theorem 2. Deciding if input [m; +1] is a certain answer for q over K is in

coNP (resp. in coN2EXP) w.r.t. data complexity (resp. combined complexity).

The remainder of the section is devoted to proving Theorem 1. The highlevel idea is to start from an arbitrary countermodel I and merge its elements so as to reduce its size, while at the same time not introducing any new query matches. But how can we decide which elements of I can be safely merged? Looking to existing DL-Lite approaches [ 10,3 ] for inspiration, we observe that they proceed in two steps. First, they de ne an intermediate model I0 (called interleaving ) that, informally, retains the useful parts of I (i.e., those involved in query matches or needed to satisfy the ABox) and replaces the rest with treeshaped structures taken from the corresponding parts of the canonical model. With this more structured countermodel I0, it is easier to identify, via a wellchosen equivalence relation, the elements that behave similarly and thus can be safely merged. In a second step, elements of I0 from the same equivalence class are merged to obtain the desired bounded-size countermodel.

A naive adaptation of the DL-Lite approach to E LHI? (or even E L) fails already at the rst step. Indeed, as the next example illustrates, due to conjunction in the LHS of concept inclusions, the interleaving need not be a model.

Example 3. A countermodel Ie of Ke for qe and candidate integer 3 is depicted

in Figure 1c. Indeed Ie yields only 2 possible values for z: and .

The naive adaptation of the DL-Lite approach builds the interpretation de

picted in Figure 1b and induced from CKe . Observe that violates the axiom

A0 u B0 v A0. Generally speaking, the issue is that the canonical model may not

contain elements witnessing conjunctions of concepts that occur in the initial countermodel, so it is not enough to copy over parts of the canonical model.

We now show how to revamp the DL-Lite approach to make it work for E LHI?. To aid comprehension, we give a graphical overview in Figure 2. 3.1

Interlacing

We propose a new intermediate countermodel, called interlacing, which retains the desirable features of the interleaving but avoids the issues highlighted in Example 3. Essentially, the idea is to replace the canonical model by an alternative tree-shaped domain (called existential extraction) that is built from the countermodel I by keeping track of the RHS existential concepts satis ed in I.

The de nition of existential extraction uses the alphabet consisting of all R:A such that 9R:A is the RHS of an axiom in T . Furthermore, it assumes that, for every R:A 2 , we have chosen a function succIR:A that maps every element e 2 (9R:A)I to an element e0 2 I such that (e; e0) 2 RI and e0 2 AI . B00 B00

V V V B0

B0; D a P:A0 u D

b Q:B0 u D P

A a B b (a) Canonical model of Ke.

A0; B0; D

A a B b (b) Interleaving of Ie: not a model! B00 B00 B00 . . .

V V V B0 1 S 2 V 1

V U;V

V3 C2 C1 C0

C3; D V3 B00

V B0 S A3 S

RR23 AAA123 RR23 AAA123 RR23 AA12 (Ad) a0-PRin1terlaAcRi0n;Q1BBg0;oDf; AIb0e. (e) ReduAced a 0P-inteR1rlAa0c;QiBBn0g;Do;fAbI0e.

Fig. 1: Interpretations used along our examples. Existential extraction f

, inductively build the following mapping: f : Ind(A)

! a 7! a

I [ f"g w R:A 7! " if f (w) = " or f (w) 2= (9R:A)I succIR:A(f (w)) otherwise The existential extraction4 of I is := fw j w 2 Ind(A) ; f (w) 6= "g.

Remark 1. can be seen as the domain of a form of unravelling of I starting

from Ind(A), in which we only follow the selected successors for the RHS existential concepts. The selective nature of the unravelling is visible in Figure 1d which is based on the existential extraction of Ie: observe there is no edge that corresponds to following the red edge from Ie (Figure 1c) since V:A2 2= .

We proceed to de ne -interlacings, parametrized by a set of interest with Ind(A) . Intuitively, we preserve the portion of I corresponding to f ( ) and complete it with (locally) tree-shaped structures issuing from . De nition 2. The

-interlacing mapping of I is: The -interlacing I0 of I is the interpretation given by I0 := f 0( ) and: AI0 = ff 0(u) j u 2

; f (u) 2 AI g (= f 0(f 1(AI ))) PI0 = f(a; b) j a; b 2 Ind(A) ^ K j= P(a; b)g [ f(f 0(u); f 0(u P0:B)) j u; u P0:B 2 [ f(f 0(v P0:B); f 0(v)) j v; v P0:B 2 ^ T j= P0 v Pg ^ T j= P0 v Pg (Role0A) (Role0+) (Role0 ) Like the interleaving, the

-interlacing is a model which embeds in I.

Lemma 2. I0 is a model, and it embeds homomorphically in I via the mapping:

: I0 !

I w 7! w if w 2 f ( ) f (w) otherwise, that is w 2 n 4 While the de nitions of f , , and later constructions depend on the choice of successor functions, all choices lead to the desired result.

Proof (sketch). Both statements rely upon the facts that (i) if f 0(u) 2 AI0 , then f (u) 2 AI , and (ii) if (f 0(u); f 0(v)) 2 RI0 , then (f (u); f (v)) 2 RI .

Remark 2. It is easily veri ed that

f 0 = f .

To obtain a countermodel, we identify the crucial elements of I, namely, := Ind(A) [ fe 2 I j 9 2 m(q; I); 9z 2 z; (z) = eg, where m(q; I) is the set of matches of q in I. We then take 0 := f 1( ) as our set of interest.

Lemma 3. If I is a countermodel, then so is its 0-interlacing I0. Proof (sketch). We show that

injectively maps matches in I0 to matches in I.

Example 4. Figure 1d depicts the 0-interlacing of Ie. Like the initial model Ie, it is a countermodel for qe and integer 3. Notice Ie0 has an in nite domain.

3.2

Reduced interlacing

It remains to merge elements of the 0-interlacing I0 to obtain a countermodel of the required size. To identify similar elements, we consider their neighbourhoods.

De nition 3. Consider an interpretation M and an element c 2 M. Its n

neighbourhood NnM;D(c) w.r.t. a subdomain D M is de ned inductively as: NN0nMM+;;1DD((cc)) ::== fNcngM;D(c) [ e

9d 2 NnM;D(c) n D; 9R 2 NR ; (d; e) 2 RM

Observe that we stop adding successors when we reach an element from D. In particular, for c 2 D, we have NnM;D(c) = fcg for every value of n. It follows that the statement `c1 2 NnM;D(c2) i c2 2 NnM;D(c1)' does not hold in general. Example 5. In Figure 1d, neighbourhoods N2Ie0; ( ) and N2Ie0; ( ) are depicted (in green, resp. blue). In particular, notice a 2= N2Ie0; ( ) since 2 .

Recall that the de nition of I0 ensures that any c 2 I0 n is actually an element of and therefore we have c = aw for some individual name a and word w 2 . The tree-shaped structure of ensures that for all n, there exists a unique pre x rn;c of aw such that (i) f 0(rn;c) 2 NnI0; (c) and (ii) for any d 2 NnI0; (c), there exists a unique word wnd;c such that d = f 0(rn;c wnd;c).

This leads us to characterize the n-neighbourhood of an element c via the following function n;c, whose domain n is the set of words over with length 2n. Notice that, departing from [ 10 ], we keep track of sets of satis ed concepts, in order to handle conjunction of concepts on the LHS of axioms. n;c : n ! w 7!

[ 2sig(T ) [ f;g 8< f 0(rn;cw) if rn;cw 2

fA j A 2 sig(T ); f 0(rn;cw) 2 AI g if rn;cw 2 : ; otherwise CK ^ f 0(rn;cw) 2 CK ^ f 0(rn;cw) 2= We can now introduce the equivalence relation we use to merge elements:

De nition 4. The equivalence relation n on I0 is de ned as follows: an

element from is n-equivalent only to itself; two elements c1; c2 from I0 n are n-equivalent i wnc1;c1 = wnc2;c2 , n;c1 = n;c2 , and jc1j = jc2j mod 2jqj + 3. Remark 3. The set of concepts from sig(T ) satis ed by c 2 I0 is exactly n;c(wnc;c). Therefore, if c n c0, then c and c0 satisfy the same concept names.

Remark 4. If c

n c0, then c m c0 for any m

We obtain a smaller countermodel for our CCQ q by merging elements with respect to jqj+1. We will use e for the equivalence class of e w.r.t. jqj+1 and let p : e 7! e denote the canonical projection from I0 to J . To improve the readability of later material, we introduce notation for the set f j 2 g.

De nition 5. The reduced interlacing J is the interpretation with domain

I0 = jqj+1 and interpretation function J := p I0 .

Example 6. The reduced interlacing Je, together with two 2-neighbourhoods

N2Je; ( ) and N2Je; ( ), are displayed in Figure 1e. Notice Je remains a countermodel for qe and candidate integer 3.

It remains to show that the reduced interlacing J is a countermodel. As elements were merged using a local criteria, it is not possible in general to exhibit a homomorphism from the whole J to I0, injecting matches as in Lemma 3. However, local solutions are possible: a match of q in J maps each connected component C of q into a jqj-neighbourhood NjJqj; (c). By exhibiting a homomorphism c : NjJqj; (c) ! NjIqj0; (c), we can nd a match of C in I0. Such matches for q's connected components together form a match of the full q in I0.

We de ne such homomorphisms c inductively on NkJ ; (c) with k increasing from 0 to jqj. Starting from the element c 2 N0J ; (c), we can naturally carry it back as c(c) = c 2 N0I0; (c). Assume now that we have de ned c(d) for some d 2 NnJ ; (c) and that we are moving further to an element e 2 NnJ+;1 (c) along an edge (d; e) in J . In the case of e 2= , the following lemma produces a candidate c(e), namely e0, which is to c(d), namely d0, what e is to d.

Lemma 4. Given two elements d; e 2 J n , if there exists a role P from NR

such that (d; e) 2 PJ , then there exists a unique element R:B 2 such that one of the two following conditions is satis ed: edge+. jej = jdj + 1 mod 2jqj + 3, wjeqj+1;e = wjdqj+1 1;d

R:B and T j= R v P.

k 1 e. edge . jdj = jej + 1 mod 2jqj + 3, wjdqj+1;d = wjeqj+1 1;e R:B and T j= R v P.

Furthermore, for all d0 k d, we have e0 such that d0 = e0 R:B and the pre x e0 satis es e0 k 1 e.

Furthermore, for all d0

satis es e0 k d, the element e0 := d0 R:B belongs to I0 and

Notice the \strength" of the equivalence relation k between e and c(e) decreases as we move further in the neighbourhood of c. For example in Figure 1e: building ( 2) from ( ) := we obtain 1, with 1 1 2 but 1 2 2. However, since we start from c(c) := c jqj+1 c and explore a jqj-neighbourhood, the index remains at least 1. This is essential as 1 encodes relations to elements of as the next lemma shows. It allows in particular to treat the case of e 2 .

Lemma 5. If (d; e) 2 RJ for some e 2

, and if d0 1 d, then (d0; e) 2 RI0 .

It remains to free ourselves from the particular choice of d, which is likely not to be the only element of NnJ ; (c) connected to e. This cannot be observed on our running example as jqej = 1, but the possibility of cycles in the reduced interlacing should still be clear from Figure 1e. Taking a closer look at Lemma 4, we observe that c(e), that is e0, is obtained either by adding a letter to c(d), that is d0, or by removing the last letter of c(d), and that these letters coincide with those in the su xes of elements d and e. Therefore, when moving from c to e and ignoring self-cancelling steps, each added letter must appear in the su x of e and, similarly, each removed letter must appear in the su x of c. The challenge is therefore to quantify the number of additions and removals to build c(e) directly from c and e. The next de nition captures the relative di erence of letters between c and e, encoded in jcj and jej mod 2jqj + 3.

De nition 6. Let c 2 J and n jqj. The relative depth of e 2 NnJ ; from c is the integer c(e) 2 [ n; n] such that jej = jcj + c(e) mod 2jqj + 3. (c) jqj such that e 2 NnJ ;

Remark 5. By induction on n jqj, it is straightforward to see that c(e) is

well de ned. Unicity is ensured by c(e) n jqj. A consequence of Lemma 4 is that for the smallest n (c) we have c(e) = n mod 2.

We can now identify how many additions and removals cancelled each other. Indeed, if it takes n steps to reach e from c, with relative di erence of := c(e), then n j j is the length of the self-cancelling path, hence: n 2j j cancelled additions and n 2j j cancelled removals. Therefore, the actual amount of additions is nn 2j j + if

0, or n 2j j if 0, that is in both cases n+2 . Similarly we obtain 2 for the actual amount of removals. The next theorem formalizes all these intuitions: n;c(e) (in non-trivial cases) is obtained by removing the n2 last letters of c and keeping the n+ last letters from the su x of e. For example on 2 Figures 1d, element ( 2) = 1 is obtained by removing 1 (2 1) = 1 letter from and keeping 1+(2 1) = 0 letter from the su x of 2. It is then a technicality to verify these syntactical operations on words make sense in the domain of I0. Theorem 3. For all c 2

I0 and all n jqj, the following mapping: n;c(e) : NnJ ; (c) ! NnI0; (c) e 7! 8 > n 1;c(e) < e >: r n 2c(e) ;c

J ; if e 2 Nn 1 (c) if e 2 wen+ c(e) ;e otherwise 2 is a homomorphism satisfying n;c(e) jqj+1 n e and n;1c( )

Let us clarify how Theorem 3 concludes our proof with the following lemma.

Lemma 6. If I is a countermodel, then so is its reduced interlacing J .

Proof (sketch). It is mostly routine work of de nition chaining to show that J is a model, except for negative role inclusions, where Theorem 3 is needed to move violations of R1 u R2 v ? in J back into I0. As sketched earlier, we can employ the local homomorphisms between neighbourhoods to transform a match of q in J into a match in I0. Matches in I0 being contained in , our original match in J must be contained in . Thus, the mapping of matches in J to matches in I0 is essentially the identity (if we con ate and ), hence injective.

Finally, we obtain Theorem 1 by analyzing the size of J (i.e. counting the equivalence classes in J ), keeping in mind that due to Lemma 1 and our assumption on m, we have j j jInd(A)j + jqj (jInd(A)j + 3 jT j 2jT j)jqj. 4

Lower bounds for E L We now consider the simpler setting of E L, incomparable with DL-LiteR, and show that the same lower bounds hold, both in data and combined complexity.

Theorem 4. CCQ answering in E L is coNP-complete w.r.t. data complexity.

Proof (sketch). We reduce the complement of the graph 3-colorability problem to answering the E L OMQ (q; T ), with q = 9z B(z) and T containing A v 9R:B and 9R:Ck u 9E:(9R:Ck) v B for k 2 f1; 2; 3g.

Theorem 5. CCQ answering in E L is coNEXP-hard w.r.t. combined complexity.

Proof (sketch). The proof adapts a reduction from the exponential grid tiling problem (Lemma 18 from [ 10 ]), the key di erence being the use of existential restriction to replace role inclusions. 5

Outlook We have initiated the study of CCQ answering for Horn DLs outside the DL-Lite family, establishing the same complexity bounds for E L(HI?) as were known for DL-LiteR. There remain many questions to explore. Let us mention two potential research directions towards obtaining more practical algorithms. First, we can look for additional restrictions on the query or the ontology that ensure polynomial data complexity, as has been considered for DL-Lite [ 6,3,4 ]. Unfortunately, it appears that restrictions that work for DL-Lite are not su cient to obtain tractability in E L, so novel restrictions need to be identi ed. Second, it would be desirable, for E L but also for DL-LiteR, to develop a more re ned coNP procedure that is amenable to implementation, e.g. using SAT solvers.

1. Baader , F. , Brandt , S. , Lutz , C. : Pushing the EL envelope . In: Proceedings of the 19th international joint conference on Arti cial intelligence , IJCAI . pp. 364 { 369 ( 2005 )

2. Bienvenu , M. , Calvanese , D. , Ortiz , M. , Simkus , M. : Nested regular path queries in description logics . In: Proc. of the 14th Int. Conf. on Principles of Knowledge Representation and Reasoning (KR) . pp. 218 { 227 ( 2014 )

3. Bienvenu , M. , Maniere , Q. , Thomazo , M. : Answering counting queries over DLLite ontologies . In: Proc. of the 29th International Joint Conference on Arti cial Intelligence (IJCAI) . pp. 1608 { 1614 ( 2020 )

4. Bienvenu , M. , Maniere , Q. , Thomazo , M. : Cardinality queries over DL-Lite ontologies . In: Proc. of the 30th International Joint Conference on Arti cial Intelligence (IJCAI) . pp. 1801 { 1807 ( 2021 )

5. Bienvenu , M. , Ortiz , M. : Ontology-mediated query answering with data-tractable description logics . In: Tutorial Lectures of the 11th Reasoning Web International Summer School . pp. 218 { 307 ( 2015 )

6. Calvanese , D. , Corman , J. , Lanti , D. , Razniewski , S. : Counting query answers over a DL-Lite knowledge base . In: Proc. of the 29th International Joint Conference on Arti cial Intelligence (IJCAI) . pp. 1658 { 1666 ( 2020 )

7. Calvanese , D. , Corman , J. , Lanti , D. , Razniewski , S. : Rewriting count queries over DL-Lite TBoxes with number restrictions . In: Proc. of the 33rd International Workshop on Description Logics (DL) ( 2020 )

8. Calvanese , D. , Kharlamov , E. , Nutt , W. , Thorne , C. : Aggregate queries over ontologies . In: Proc. of the 2nd International Workshop on Ontologies and Information Systems for the Semantic Web (ONISW) . pp. 97 { 104 ( 2008 )

9. Feier , C. , Lutz , C. , Przybylko , M. : Answer counting under guarded TGDs . In: Proc. of the 24th International Conference on Database Theory (ICDT) ( 2021 )

10. Kostylev , E.V. , Reutter , J.L. : Complexity of answering counting aggregate queries over DL-Lite . Journal of Web Semantics (JWS) pp. 94 { 111 ( 2015 )

11. Nikolaou , C. , Kostylev , E.V. , Konstantinidis , G. , Kaminski , M. , Grau , B.C. , Horrocks , I. : Foundations of ontology-based data access under bag semantics . Arti cial Intelligence (AIJ ) pp. 91 { 132 ( 2019 )