<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Counting Queries over E LHI ? Ontologies ? ??</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>University of Bordeaux</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bordeaux INP</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>LaBRI</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Talence</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>France</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Inria</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>DI ENS</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>University PSL</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Paris</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>France</string-name>
        </contrib>
      </contrib-group>
      <abstract>
        <p>While ontology-mediated query answering most often adopts (unions of) conjunctive queries as the query language, some recent works have explored the use of counting queries coupled with DL-Lite ontologies. In the present paper, we initiate the study of counting queries for Horn description logics outside the DL-Lite family. Via a non-trivial adaptation of existing techniques, we devise a decision procedure for answering counting conjunctive queries over ELHI? ontologies, which yields the same upper bounds as are known for DL-LiteR, namely, coNP in data complexity and coN2EXP w.r.t. combined complexity. We further show that the best known lower bounds for DL-LiteR (coNP-hard for data, coNEXP-hard for combined) hold also for EL.</p>
      </abstract>
      <kwd-group>
        <kwd>Ontology-mediated query answering</kwd>
        <kwd>Counting queries</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>
        Introduction
In the context of ontology-mediated query answering, the most commonly
considered queries are conjunctive queries (CQs), but several works have explored
ways of equipping CQs with some form of counting [
        <xref ref-type="bibr" rid="ref10 ref8 ref9">8,10,9</xref>
        ]. A recent approach,
proposed in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] as a generalization of [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], considers counting conjunctive queries
(CCQs) that are syntactically de ned like standard CQs except that some
variables may be designated as counting variables. In each model of the knowledge
base, we can count the number of possible assignments to the counting variables
that make the query hold. Certain answers are then de ned as intervals that
provide upper and lower bounds on the count values across all models.
      </p>
      <p>
        The problem of answering CCQs over DL-LiteR ontologies is intractable in
general [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], and recent works have shown that intractability arises even in quite
restricted settings [
        <xref ref-type="bibr" rid="ref4 ref6">6,4</xref>
        ]. However, some interesting tractable cases have also been
identi ed, notably, rooted CCQs [
        <xref ref-type="bibr" rid="ref11 ref3 ref6">3,6,11</xref>
        ] and cardinality queries (= Boolean
atomic CCQs) [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] coupled with DL-Litecore ontologies; query rewriting techniques
have also begun to be explored [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. To the best of our knowledge, however, CCQ
answering has not yet been studied for ontologies outside the DL-Lite family.
? Partially supported by ANR project CQFD (ANR-18-CE23-0003)
?? © 2021 for this paper by its authors. Use permitted under Creative Commons
      </p>
      <p>License Attribution 4.0 International (CC BY 4.0).</p>
      <p>
        This motivates us to extend the study of CCQ answering to other well-known
Horn description logics, such as E L and the more expressive E LHI?. While
E L enjoys good computational properties for CQ answering [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], the techniques
developed for CCQs in the DL-Lite context do not easily transfer. Indeed, a key
property of DL-Lite is that merging elements in a model preserves modelhood
so long as disjointness axioms are not violated. This property does not hold in
E L, due to conjunctions of concepts on the left-hand side (LHS) of inclusions.
      </p>
      <p>Through a non-trivial adaptation of the DL-Lite approach, we obtain
similar upper bounds for CCQ answering over E LHI? ontologies as were known
for DL-LiteR, namely, coNP membership w.r.t. data complexity and coN2EXP
membership w.r.t. combined complexity. We further prove that even if we
restrict to E L, we have the same lower bounds as in DL-LiteR: coNP-hardness
w.r.t. data complexity and coNEXP-hardness w.r.t. combined complexity.
2</p>
      <p>Preliminaries
Knowledge Bases. We assume mutually disjoint sets NC, NR, and NI of concept,
role, and individual names. A knowledge base (KB) K = (T ; A) consists of an
ABox A and a TBox T . An ABox is a nite set of concept assertions A(b) (with
A 2 NC, b 2 NI) and role assertions P(a; b) (with P 2 NR, a; b 2 NI). Let Ind(A)
denote the set of individuals occurring in an ABox A.</p>
      <p>A TBox is a nite set of axioms. In our considered DL E LHI?, TBoxes
consist of concept inclusions B1 v B2, positive role inclusions R1 v R2, and
negative role inclusions3 R1 u R2 v ?, where the Ri are roles drawn from NR =
fP; P j P 2 NRg and the Bi are (complex) concepts constructed as follows:
B := ? j &gt; j A j B1 u B2 j 9R:B
where A 2 NC; R 2 NR
Let sig(T ) denote the set of concept and role names appearing in TBox T .
Semantics of KBs. An interpretation takes the form I = ( I ; I ), where I is
a non-empty set (called the domain) and I is the interpretation function that
maps each A 2 NC to AI I , each P 2 NR to PI I I , and each a 2 NI
to aI . In this paper, we will make the Standard Names Assumption by setting
aI = a. Note however that our results only rely upon the weaker Unique Names
Assumption (UNA), which stipulates that aI 6= bI whenever a 6= b.</p>
      <p>
        The function I naturally extends to roles and complex concepts: (P )I =
f(y; x) j (x; y) 2 PI g, ?I = ;, &gt;I = I , (B1 u B2)I = B1I \ B2I and (9P:B)I =
fd j (d; e) 2 RI ; e 2 BI g. An inclusion G v H is satis ed in I if GI HI ; an
assertion A(b) (resp. P(a; b)) is satis ed in I if b 2 AI (resp. (a; b) 2 PI ). An
interpretation is a model of a TBox T (resp. KB K) if it satis es all axioms in
T (resp. axioms and assertions in K). A KB is satis able if it has at least one
model. An inclusion (resp. assertion) is entailed from T (resp. K), written
T j= (resp. K j= ), if is satis ed in every model of T (resp. K).
3 We follow e.g. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] by considering a version of ELHI? with negative role inclusions.
Example 1. Consider the ABox Ae = fA(a); B(b)g and the E LHI? TBox Te:
Counting Queries. We consider counting queries as de ned in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] (which
generalizes the queries considered in [
        <xref ref-type="bibr" rid="ref10 ref6">10,6</xref>
        ]). A counting conjunctive query (CCQ) takes
the form q(x) = 9y9z (x; y; z), where x; y; z are tuples of answer, existential,
and counting variables, respectively, and is a conjunction of concept and role
atoms with terms from NI [ x [ y [ z. A CCQ q is Boolean if x = ;.
      </p>
      <p>A match for a CCQ q in an interpretation I is a homomorphism from q
into I. If a match maps x to a, then the restriction of to z is called a
counting match (c-match) of q(a) in I. The set of answers to q in I, denoted qI ,
contains all pairs (a; [m; M ]), with m; M 2 N [ f+1g, such that the number of
distinct c-matches of q(a) in I belongs to the interval [m; M ]. A certain answer
to q w.r.t. K is an answer in every model of K, that is a pair from TIj=K qI .</p>
      <p>As usual, it is su cient to consider the Boolean case: (a; [m; M ]) is a certain
answer to a CCQ q(x) i (;; [m; M ]) is a certain answer to the Boolean CCQ
q(a) obtained by replacing x with a. Thus, from now on, we focus on Boolean
CCQs, and work with candidate answers [m; M ] in place of (;; [m; M ]).</p>
      <p>
        We further observe that since E LHI? cannot restrict the size of models, the
value M in a certain answer [m; M ] is: 0 if the underlying CQ is unsatis able
w.r.t. T , any number greater than 1 if q has a match in every model but z = ;;
and +1 otherwise. As the rst two cases can be readily handled using existing
techniques, we focus on identifying certain answers of the form [m; +1].
Example 2. Let qe := 9z D(z) be a Boolean CCQ. Intervals [0; +1] and [1; +1]
are certain answers to qe over Ke. Interval [3; +1] is not a certain answer as
the models depicted on Figures 1a, 1c, 1d and 1e contain only 2 matches for qe.
Complexity. Given a E LHI? knowledge base K = (T ; A), a Boolean CCQ q,
and an integer m 0 (in binary), we are interested in the complexity of deciding
whether [m; +1] is a certain answer to q w.r.t. K. We will consider the two usual
complexity measures: combined complexity which is in terms of the size of the
whole input, and data complexity which is only in terms of the size of A and m
(T and q are treated as xed). If O is a TBox, ABox, KB, or CCQ, then the size
of O, denoted jOj, is the number of occurrences of concept and role names in O.
Normal form. As is standard (see e.g. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]), we work with E LHI? TBoxes in a
convenient normal form, where every concept inclusion has one of the following
restricted shapes:
      </p>
      <p>A v ?
&gt; v A</p>
      <p>A1 u A2 v A</p>
      <p>A1 v 9R:A2
9R:A1 v A2
with A; A1; A2 2 NC; R 2 NR . Through the introduction of fresh concept names,
we can transform in polynomial time any TBox T into a normal-form TBox T 0
that is a model-conservative extension of T (hence, indistinguishable from T
from the point of view of queries). We therefore assume w.l.o.g. that all
considered TBoxes are in normal form.</p>
      <p>
        Canonical model. It is well known that every satis able E LHI? KB admits
a canonical (or universal) model that embeds homomorphically into each of
its models. We recall how such a model CK can be constructed (see e.g. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]).
The domain CK consists of all sequences aR1:M1 : : : Rn:Mn (n 0) such that
a 2 Ind(A), each Ri belongs to NR , each Mi is a conjunction of concepts from
NC [ f&gt;g (treated as a set when convenient), and the following conditions hold:
{ If n 1, then T j= M0 v 9R1:M1 where M0 = fA 2 NC [ f&gt;g j K j= A(a)g
and M1 is maximal, as a set of concept names, for this property.
{ For every 1 i &lt; n, T j= Mi v 9Ri+1:Mi+1 and Mi+1 is maximal, as a set
of concept names, for this property.
      </p>
      <p>Individual names are interpreted as themselves (aCK = a), and concept and role
names are interpreted as follows:</p>
      <p>ACK = fa j K j= A(a)g [ fe R:M j A 2 Mg
PCK = f(a; b) j K j= P(a; b)g</p>
      <p>[ f(e; e P0:M) j T j= P0 v Pg [ f(e P0:M; e) j T j= P0 v P g
3</p>
      <p>Upper Bounds via Countermodels
This section presents our main contribution: a decision procedure (and associated
complexity upper bounds) for deciding whether a candidate interval [m; +1] is
a certain answer to a Boolean CCQ q and E LHI? KB K = (T ; A) in normal
form. It is more convenient in fact to focus on the complementary problem of
checking whether [m; +1] is not a certain answer, as the latter holds i there
exists a countermodel, i.e., a model of K with fewer than m c-matches.</p>
      <p>We start by observing that if m is large enough, a countermodel always exists.</p>
    </sec>
    <sec id="sec-2">
      <title>Lemma 1. There exists a countermodel for all m</title>
      <p>(jInd(A)j + 3 jT j 2jT j)jqj.</p>
      <p>Proof (sketch). We exhibit a model of size at most jInd(A)j + 3 jT j 2jT j.
We may thus assume that the input m is such that m (jInd(A)j + 3 jT j 2jT j)jqj.</p>
      <p>The main ingredient underlying our decision procedure and upper bounds is
the following result, which restricts the size of countermodels we need to consider.</p>
      <sec id="sec-2-1">
        <title>Theorem 1. If there exists a countermodel for input [m; +1], CCQ q and</title>
      </sec>
      <sec id="sec-2-2">
        <title>KB K, then there exists a countermodel with a polynomial-size (resp. double</title>
        <p>exponential size) domain w.r.t. data complexity (resp. combined complexity).</p>
        <p>
          Theorem 1 generalizes analogous results for DL-Lite KBs in [
          <xref ref-type="bibr" rid="ref10 ref3">10,3</xref>
          ] and gives
rise to a decision procedure that guesses an interpretation of bounded size and
checks whether it is a countermodel, yielding the following upper bounds:
        </p>
      </sec>
      <sec id="sec-2-3">
        <title>Theorem 2. Deciding if input [m; +1] is a certain answer for q over K is in</title>
        <p>coNP (resp. in coN2EXP) w.r.t. data complexity (resp. combined complexity).</p>
        <p>
          The remainder of the section is devoted to proving Theorem 1. The
highlevel idea is to start from an arbitrary countermodel I and merge its elements
so as to reduce its size, while at the same time not introducing any new query
matches. But how can we decide which elements of I can be safely merged?
Looking to existing DL-Lite approaches [
          <xref ref-type="bibr" rid="ref10 ref3">10,3</xref>
          ] for inspiration, we observe that
they proceed in two steps. First, they de ne an intermediate model I0 (called
interleaving ) that, informally, retains the useful parts of I (i.e., those involved
in query matches or needed to satisfy the ABox) and replaces the rest with
treeshaped structures taken from the corresponding parts of the canonical model.
With this more structured countermodel I0, it is easier to identify, via a
wellchosen equivalence relation, the elements that behave similarly and thus can be
safely merged. In a second step, elements of I0 from the same equivalence class
are merged to obtain the desired bounded-size countermodel.
        </p>
        <p>A naive adaptation of the DL-Lite approach to E LHI? (or even E L) fails
already at the rst step. Indeed, as the next example illustrates, due to conjunction
in the LHS of concept inclusions, the interleaving need not be a model.</p>
      </sec>
      <sec id="sec-2-4">
        <title>Example 3. A countermodel Ie of Ke for qe and candidate integer 3 is depicted</title>
        <p>in Figure 1c. Indeed Ie yields only 2 possible values for z: and .</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>The naive adaptation of the DL-Lite approach builds the interpretation de</title>
      <p>picted in Figure 1b and induced from CKe . Observe that violates the axiom</p>
      <sec id="sec-3-1">
        <title>A0 u B0 v A0. Generally speaking, the issue is that the canonical model may not</title>
        <p>contain elements witnessing conjunctions of concepts that occur in the initial
countermodel, so it is not enough to copy over parts of the canonical model.</p>
        <p>We now show how to revamp the DL-Lite approach to make it work for
E LHI?. To aid comprehension, we give a graphical overview in Figure 2.
3.1</p>
        <sec id="sec-3-1-1">
          <title>Interlacing</title>
          <p>We propose a new intermediate countermodel, called interlacing, which retains
the desirable features of the interleaving but avoids the issues highlighted in
Example 3. Essentially, the idea is to replace the canonical model by an
alternative tree-shaped domain (called existential extraction) that is built from the
countermodel I by keeping track of the RHS existential concepts satis ed in I.</p>
          <p>The de nition of existential extraction uses the alphabet consisting of all
R:A such that 9R:A is the RHS of an axiom in T . Furthermore, it assumes that,
for every R:A 2 , we have chosen a function succIR:A that maps every element
e 2 (9R:A)I to an element e0 2 I such that (e; e0) 2 RI and e0 2 AI .
B00
B00</p>
          <p>V
V
V B0</p>
          <p>B0; D
a P:A0 u D</p>
          <p>b Q:B0 u D
P</p>
          <p>Q</p>
          <p>A a B b
(a) Canonical model of Ke.</p>
          <p>A0; B0; D</p>
          <p>Q</p>
          <p>P</p>
          <p>A a B b
(b) Interleaving of Ie: not a model!
B00
B00
B00
.
.
.</p>
          <p>V
V
V B0
1 S
2
V
1</p>
          <p>V
U;V</p>
          <p>V3
C2
C1
C0</p>
          <p>C3; D V3
B00</p>
          <p>V B0
S A3 S</p>
          <p>RR23 AAA123 RR23 AAA123 RR23 AA12
(Ad) a0-PRin1terlaAcRi0n;Q1BBg0;oDf; AIb0e. (e) ReduAced a 0P-inteR1rlAa0c;QiBBn0g;Do;fAbI0e.</p>
          <p>Fig. 1: Interpretations used along our examples.
Existential extraction
f</p>
          <p>, inductively build the following mapping:
f : Ind(A)</p>
          <p>!
a 7! a</p>
          <p>I [ f"g
w R:A 7!
" if f (w) = " or f (w) 2= (9R:A)I
succIR:A(f (w)) otherwise
The existential extraction4 of I is
:= fw j w 2 Ind(A)
; f (w) 6= "g.</p>
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>Remark 1. can be seen as the domain of a form of unravelling of I starting</title>
        <p>from Ind(A), in which we only follow the selected successors for the RHS
existential concepts. The selective nature of the unravelling is visible in Figure 1d
which is based on the existential extraction of Ie: observe there is no edge that
corresponds to following the red edge from Ie (Figure 1c) since V:A2 2= .</p>
        <p>We proceed to de ne -interlacings, parametrized by a set of interest
with Ind(A) . Intuitively, we preserve the portion of I corresponding
to f ( ) and complete it with (locally) tree-shaped structures issuing from .
De nition 2. The</p>
        <p>-interlacing mapping of I is:
The
-interlacing I0 of I is the interpretation given by
I0 := f 0(
) and:
AI0 = ff 0(u) j u 2</p>
        <p>; f (u) 2 AI g (= f 0(f 1(AI )))
PI0 = f(a; b) j a; b 2 Ind(A) ^ K j= P(a; b)g
[ f(f 0(u); f 0(u P0:B)) j u; u P0:B 2
[ f(f 0(v P0:B); f 0(v)) j v; v P0:B 2
^ T j= P0 v Pg
^ T j= P0 v Pg
(Role0A)
(Role0+)
(Role0 )
Like the interleaving, the</p>
        <p>-interlacing is a model which embeds in I.</p>
      </sec>
      <sec id="sec-3-3">
        <title>Lemma 2. I0 is a model, and it embeds homomorphically in I via the mapping:</title>
        <p>: I0 !</p>
        <p>I
w 7!
w if w 2 f ( )
f (w) otherwise, that is w 2
n
4 While the de nitions of f , , and later constructions depend on the choice of
successor functions, all choices lead to the desired result.</p>
        <p>Proof (sketch). Both statements rely upon the facts that (i) if f 0(u) 2 AI0 , then
f (u) 2 AI , and (ii) if (f 0(u); f 0(v)) 2 RI0 , then (f (u); f (v)) 2 RI .</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Remark 2. It is easily veri ed that</title>
      <p>f 0 = f .</p>
      <p>To obtain a countermodel, we identify the crucial elements of I, namely,
:= Ind(A) [ fe 2 I j 9 2 m(q; I); 9z 2 z; (z) = eg, where m(q; I) is
the set of matches of q in I. We then take 0 := f 1( ) as our set of interest.</p>
      <sec id="sec-4-1">
        <title>Lemma 3. If I is a countermodel, then so is its 0-interlacing I0.</title>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Proof (sketch). We show that</title>
      <p>injectively maps matches in I0 to matches in I.</p>
      <sec id="sec-5-1">
        <title>Example 4. Figure 1d depicts the 0-interlacing of Ie. Like the initial model</title>
      </sec>
      <sec id="sec-5-2">
        <title>Ie, it is a countermodel for qe and integer 3. Notice Ie0 has an in nite domain.</title>
        <p>3.2</p>
        <sec id="sec-5-2-1">
          <title>Reduced interlacing</title>
          <p>It remains to merge elements of the 0-interlacing I0 to obtain a countermodel of
the required size. To identify similar elements, we consider their neighbourhoods.</p>
        </sec>
      </sec>
      <sec id="sec-5-3">
        <title>De nition 3. Consider an interpretation M and an element c 2 M. Its n</title>
        <p>neighbourhood NnM;D(c) w.r.t. a subdomain D M is de ned inductively as:
NN0nMM+;;1DD((cc)) ::== fNcngM;D(c) [ e</p>
        <p>9d 2 NnM;D(c) n D; 9R 2 NR ; (d; e) 2 RM</p>
        <p>Observe that we stop adding successors when we reach an element from D.
In particular, for c 2 D, we have NnM;D(c) = fcg for every value of n. It follows
that the statement `c1 2 NnM;D(c2) i c2 2 NnM;D(c1)' does not hold in general.
Example 5. In Figure 1d, neighbourhoods N2Ie0; ( ) and N2Ie0; ( ) are
depicted (in green, resp. blue). In particular, notice a 2= N2Ie0; ( ) since 2 .</p>
        <p>Recall that the de nition of I0 ensures that any c 2 I0 n is actually
an element of and therefore we have c = aw for some individual name a
and word w 2 . The tree-shaped structure of ensures that for all n, there
exists a unique pre x rn;c of aw such that (i) f 0(rn;c) 2 NnI0; (c) and (ii) for
any d 2 NnI0; (c), there exists a unique word wnd;c such that d = f 0(rn;c wnd;c).</p>
        <p>
          This leads us to characterize the n-neighbourhood of an element c via the
following function n;c, whose domain n is the set of words over with length
2n. Notice that, departing from [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ], we keep track of sets of satis ed concepts,
in order to handle conjunction of concepts on the LHS of axioms.
n;c : n !
w 7!
        </p>
        <p>[ 2sig(T ) [ f;g
8&lt; f 0(rn;cw) if rn;cw 2</p>
        <p>fA j A 2 sig(T ); f 0(rn;cw) 2 AI g if rn;cw 2
: ; otherwise
CK ^ f 0(rn;cw) 2
CK ^ f 0(rn;cw) 2=
We can now introduce the equivalence relation we use to merge elements:</p>
      </sec>
      <sec id="sec-5-4">
        <title>De nition 4. The equivalence relation n on I0 is de ned as follows: an</title>
        <p>element from is n-equivalent only to itself; two elements c1; c2 from I0 n
are n-equivalent i wnc1;c1 = wnc2;c2 , n;c1 = n;c2 , and jc1j = jc2j mod 2jqj + 3.
Remark 3. The set of concepts from sig(T ) satis ed by c 2 I0 is exactly
n;c(wnc;c). Therefore, if c n c0, then c and c0 satisfy the same concept names.</p>
        <sec id="sec-5-4-1">
          <title>Remark 4. If c</title>
          <p>n c0, then c
m c0 for any m</p>
          <p>We obtain a smaller countermodel for our CCQ q by merging elements with
respect to jqj+1. We will use e for the equivalence class of e w.r.t. jqj+1 and
let p : e 7! e denote the canonical projection from I0 to J . To improve the
readability of later material, we introduce notation for the set f j 2 g.</p>
        </sec>
      </sec>
      <sec id="sec-5-5">
        <title>De nition 5. The reduced interlacing J is the interpretation with domain</title>
        <p>I0 = jqj+1 and interpretation function J := p I0 .</p>
      </sec>
      <sec id="sec-5-6">
        <title>Example 6. The reduced interlacing Je, together with two 2-neighbourhoods</title>
        <p>N2Je; ( ) and N2Je; ( ), are displayed in Figure 1e. Notice Je remains a
countermodel for qe and candidate integer 3.</p>
        <p>It remains to show that the reduced interlacing J is a countermodel. As
elements were merged using a local criteria, it is not possible in general to exhibit
a homomorphism from the whole J to I0, injecting matches as in Lemma 3.
However, local solutions are possible: a match of q in J maps each connected
component C of q into a jqj-neighbourhood NjJqj; (c). By exhibiting a
homomorphism c : NjJqj; (c) ! NjIqj0; (c), we can nd a match of C in I0. Such
matches for q's connected components together form a match of the full q in I0.</p>
        <p>We de ne such homomorphisms c inductively on NkJ ; (c) with k increasing
from 0 to jqj. Starting from the element c 2 N0J ; (c), we can naturally carry
it back as c(c) = c 2 N0I0; (c). Assume now that we have de ned c(d) for
some d 2 NnJ ; (c) and that we are moving further to an element e 2 NnJ+;1 (c)
along an edge (d; e) in J . In the case of e 2= , the following lemma produces
a candidate c(e), namely e0, which is to c(d), namely d0, what e is to d.</p>
      </sec>
      <sec id="sec-5-7">
        <title>Lemma 4. Given two elements d; e 2 J n , if there exists a role P from NR</title>
        <p>such that (d; e) 2 PJ , then there exists a unique element R:B 2 such that one
of the two following conditions is satis ed:
edge+. jej = jdj + 1 mod 2jqj + 3, wjeqj+1;e = wjdqj+1 1;d</p>
        <p>R:B and T j= R v P.</p>
        <p>k 1 e.
edge . jdj = jej + 1 mod 2jqj + 3, wjdqj+1;d = wjeqj+1 1;e R:B and T j= R v P.</p>
        <p>Furthermore, for all d0 k d, we have e0 such that d0 = e0 R:B and the pre x
e0 satis es e0 k 1 e.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Furthermore, for all d0</title>
      <p>satis es e0
k d, the element e0 := d0 R:B belongs to
I0 and</p>
      <p>Notice the \strength" of the equivalence relation k between e and c(e)
decreases as we move further in the neighbourhood of c. For example in Figure 1e:
building ( 2) from ( ) := we obtain 1, with 1 1 2 but 1 2 2.
However, since we start from c(c) := c jqj+1 c and explore a jqj-neighbourhood,
the index remains at least 1. This is essential as 1 encodes relations to elements
of as the next lemma shows. It allows in particular to treat the case of e 2 .</p>
      <sec id="sec-6-1">
        <title>Lemma 5. If (d; e) 2 RJ for some e 2</title>
        <p>, and if d0
1 d, then (d0; e) 2 RI0 .</p>
        <p>It remains to free ourselves from the particular choice of d, which is likely
not to be the only element of NnJ ; (c) connected to e. This cannot be observed
on our running example as jqej = 1, but the possibility of cycles in the reduced
interlacing should still be clear from Figure 1e. Taking a closer look at Lemma 4,
we observe that c(e), that is e0, is obtained either by adding a letter to c(d),
that is d0, or by removing the last letter of c(d), and that these letters coincide
with those in the su xes of elements d and e. Therefore, when moving from
c to e and ignoring self-cancelling steps, each added letter must appear in the
su x of e and, similarly, each removed letter must appear in the su x of c. The
challenge is therefore to quantify the number of additions and removals to build
c(e) directly from c and e. The next de nition captures the relative di erence
of letters between c and e, encoded in jcj and jej mod 2jqj + 3.</p>
        <p>De nition 6. Let c 2 J and n jqj. The relative depth of e 2 NnJ ;
from c is the integer c(e) 2 [ n; n] such that jej = jcj + c(e) mod 2jqj + 3.
(c)
jqj such that e 2 NnJ ;</p>
      </sec>
      <sec id="sec-6-2">
        <title>Remark 5. By induction on n jqj, it is straightforward to see that c(e) is</title>
        <p>well de ned. Unicity is ensured by c(e) n jqj. A consequence of Lemma 4 is
that for the smallest n (c) we have c(e) = n mod 2.</p>
        <p>We can now identify how many additions and removals cancelled each other.
Indeed, if it takes n steps to reach e from c, with relative di erence of := c(e),
then n j j is the length of the self-cancelling path, hence: n 2j j cancelled
additions and n 2j j cancelled removals. Therefore, the actual amount of additions is
nn 2j j + if</p>
        <p>0, or n 2j j if 0, that is in both cases n+2 . Similarly we obtain
2 for the actual amount of removals. The next theorem formalizes all these
intuitions: n;c(e) (in non-trivial cases) is obtained by removing the n2 last
letters of c and keeping the n+ last letters from the su x of e. For example on
2
Figures 1d, element ( 2) = 1 is obtained by removing 1 (2 1) = 1 letter from
and keeping 1+(2 1) = 0 letter from the su x of 2. It is then a technicality to
verify these syntactical operations on words make sense in the domain of I0.
Theorem 3. For all c 2</p>
        <p>I0 and all n
jqj, the following mapping:
n;c(e) : NnJ ;
(c) ! NnI0;
(c)
e 7!
8
&gt; n 1;c(e)
&lt; e
&gt;: r n 2c(e) ;c</p>
        <p>J ;
if e 2 Nn 1 (c)
if e 2
wen+ c(e) ;e otherwise
2
is a homomorphism satisfying n;c(e)
jqj+1 n e and
n;1c(
)</p>
        <p>Let us clarify how Theorem 3 concludes our proof with the following lemma.</p>
      </sec>
      <sec id="sec-6-3">
        <title>Lemma 6. If I is a countermodel, then so is its reduced interlacing J .</title>
        <p>Proof (sketch). It is mostly routine work of de nition chaining to show that J is
a model, except for negative role inclusions, where Theorem 3 is needed to move
violations of R1 u R2 v ? in J back into I0. As sketched earlier, we can employ
the local homomorphisms between neighbourhoods to transform a match of q in
J into a match in I0. Matches in I0 being contained in , our original match
in J must be contained in . Thus, the mapping of matches in J to matches
in I0 is essentially the identity (if we con ate and ), hence injective.</p>
        <p>Finally, we obtain Theorem 1 by analyzing the size of J (i.e. counting the
equivalence classes in J ), keeping in mind that due to Lemma 1 and our
assumption on m, we have j j jInd(A)j + jqj (jInd(A)j + 3 jT j 2jT j)jqj.
4</p>
        <p>Lower bounds for E L
We now consider the simpler setting of E L, incomparable with DL-LiteR, and
show that the same lower bounds hold, both in data and combined complexity.</p>
      </sec>
      <sec id="sec-6-4">
        <title>Theorem 4. CCQ answering in E L is coNP-complete w.r.t. data complexity.</title>
        <p>Proof (sketch). We reduce the complement of the graph 3-colorability problem
to answering the E L OMQ (q; T ), with q = 9z B(z) and T containing A v 9R:B
and 9R:Ck u 9E:(9R:Ck) v B for k 2 f1; 2; 3g.</p>
      </sec>
      <sec id="sec-6-5">
        <title>Theorem 5. CCQ answering in E L is coNEXP-hard w.r.t. combined complexity.</title>
        <p>
          Proof (sketch). The proof adapts a reduction from the exponential grid tiling
problem (Lemma 18 from [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]), the key di erence being the use of existential
restriction to replace role inclusions.
5
        </p>
        <p>
          Outlook
We have initiated the study of CCQ answering for Horn DLs outside the DL-Lite
family, establishing the same complexity bounds for E L(HI?) as were known
for DL-LiteR. There remain many questions to explore. Let us mention two
potential research directions towards obtaining more practical algorithms. First,
we can look for additional restrictions on the query or the ontology that ensure
polynomial data complexity, as has been considered for DL-Lite [
          <xref ref-type="bibr" rid="ref3 ref4 ref6">6,3,4</xref>
          ].
Unfortunately, it appears that restrictions that work for DL-Lite are not su cient
to obtain tractability in E L, so novel restrictions need to be identi ed. Second,
it would be desirable, for E L but also for DL-LiteR, to develop a more re ned
coNP procedure that is amenable to implementation, e.g. using SAT solvers.
        </p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Baader</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brandt</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lutz</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Pushing the EL envelope</article-title>
          .
          <source>In: Proceedings of the 19th international joint conference on Arti cial intelligence</source>
          ,
          <source>IJCAI</source>
          . pp.
          <volume>364</volume>
          {
          <issue>369</issue>
          (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Bienvenu</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Calvanese</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ortiz</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Simkus</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Nested regular path queries in description logics</article-title>
          .
          <source>In: Proc. of the 14th Int. Conf. on Principles of Knowledge Representation and Reasoning (KR)</source>
          . pp.
          <volume>218</volume>
          {
          <issue>227</issue>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Bienvenu</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Maniere</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thomazo</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Answering counting queries over DLLite ontologies</article-title>
          .
          <source>In: Proc. of the 29th International Joint Conference on Arti cial Intelligence (IJCAI)</source>
          . pp.
          <volume>1608</volume>
          {
          <issue>1614</issue>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Bienvenu</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Maniere</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thomazo</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Cardinality queries over DL-Lite ontologies</article-title>
          .
          <source>In: Proc. of the 30th International Joint Conference on Arti cial Intelligence (IJCAI)</source>
          . pp.
          <year>1801</year>
          {
          <year>1807</year>
          (
          <year>2021</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Bienvenu</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ortiz</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Ontology-mediated query answering with data-tractable description logics</article-title>
          .
          <source>In: Tutorial Lectures of the 11th Reasoning Web International Summer School</source>
          . pp.
          <volume>218</volume>
          {
          <issue>307</issue>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Calvanese</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Corman</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lanti</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Razniewski</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Counting query answers over a DL-Lite knowledge base</article-title>
          .
          <source>In: Proc. of the 29th International Joint Conference on Arti cial Intelligence (IJCAI)</source>
          . pp.
          <volume>1658</volume>
          {
          <issue>1666</issue>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Calvanese</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Corman</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lanti</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Razniewski</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Rewriting count queries over DL-Lite TBoxes with number restrictions</article-title>
          .
          <source>In: Proc. of the 33rd International Workshop on Description Logics (DL)</source>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Calvanese</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kharlamov</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nutt</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thorne</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Aggregate queries over ontologies</article-title>
          .
          <source>In: Proc. of the 2nd International Workshop on Ontologies and Information Systems for the Semantic Web (ONISW)</source>
          . pp.
          <volume>97</volume>
          {
          <issue>104</issue>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Feier</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lutz</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Przybylko</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Answer counting under guarded TGDs</article-title>
          .
          <source>In: Proc. of the 24th International Conference on Database Theory (ICDT)</source>
          (
          <year>2021</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Kostylev</surname>
            ,
            <given-names>E.V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Reutter</surname>
            ,
            <given-names>J.L.</given-names>
          </string-name>
          :
          <article-title>Complexity of answering counting aggregate queries over DL-Lite</article-title>
          .
          <source>Journal of Web Semantics (JWS)</source>
          pp.
          <volume>94</volume>
          {
          <issue>111</issue>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Nikolaou</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kostylev</surname>
            ,
            <given-names>E.V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Konstantinidis</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kaminski</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grau</surname>
            ,
            <given-names>B.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Horrocks</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Foundations of ontology-based data access under bag semantics</article-title>
          .
          <source>Arti cial Intelligence (AIJ</source>
          ) pp.
          <volume>91</volume>
          {
          <issue>132</issue>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>