<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Worst-Case Optimal Querying of Very Expressive Description Logics with Path Expressions and Succinct Counting</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Bartosz Bednarczyk</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sebastian Rudolph</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Computational Logic Group, TU Dresden</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Institute of Computer Science, University of Wrocław</institution>
          ,
          <country country="PL">Poland</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Among the most expressive knowledge representation formalisms are the description logics of the Z family. For well-behaved fragments of ZOIQ, entailment of positive two-way regular path queries is well known to be 2EXPTIMEcomplete under the proviso of unary encoding of numbers in cardinality constraints. We show that this assumption can be dropped without an increase in complexity and EXPTIME-completeness can be achieved when bounding the number of query atoms, using a novel reduction from query entailment to knowledge base satisfiability. These findings allow to strengthen other results regarding query entailment and query containment problems in very expressive description logics. Our results also carry over to GC2, the two-variable guarded fragment of firstorder logic with counting quantifiers, for which hitherto only conjunctive query entailment has been investigated.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>
        1 Introduction
Recent years have seen a convergence of the fields of logic-based knowledge
representation (KR) and databases, with querying over knowledge bases (KBs, aka ontologies)
as the archetypical inferencing task considered. Thereby, the query languages of interest
have evolved from plain conjunctive queries to more expressive formalisms with
pathnavigational components such as the positive two-way regular path queries (P2RPQs)
considered here. One popular and practically used way of specifying the to-be-queried
knowledge bases is via description logics (DLs), whose most expressive exemplars
feature advanced constructors for roles and path expressions. Certainly, the DLs from the
Z family [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] are among the most powerful KR formalisms on the verge of decidability.
For its most expressive proponent, ZOIQ, featuring nominals (O), role inverses (I),
and number restrictions (Q), querying is undecidable [
        <xref ref-type="bibr" rid="ref13 ref8">8,13</xref>
        ] and even decidability of KB
satisfiability is open, owing to the intricate interplay of the three mentioned features, but
restricting the interaction of O, I, and Q (or excluding one of the features altogether)
leads to beneficial model-theoretic properties, which give rise to upper bounds of
EXPTIME for KB satisfiability and 2EXPTIME for querying, established using elaborate
automata techniques [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. While the first bound holds under the assumption of binary
? Supported by “Diamentowy Grant” no. DI2017 006447
?? Supported by ERC Consolidator Grant 771779 (DeciGUT)
encoding of number restrictions, the proof for the second one requires unary encoding.
While unary encoding of numbers is often silently accepted in the DL community, it is
considered “cheating” in others.
      </p>
      <p>
        In our paper, we overcome this restriction. Leveraging said model-theoretic property
of the considered DLs, we provide a novel reduction from the query entailment problem
K j= q to the problem of checking unsatisfiability of some other knowledge base K:q.
Thereby, the size of the latter is exponential in the total size of q and K (even when
assuming binary encoding), leading to a 2EXPTIME-complete complexity. Our technique
also yields the new insight that the complexity drops to EXPTIME, when the number of
atoms in q is bounded by a constant. The obtained results allow to correspondingly
improve a number of previous results on query containment and can be transferred to DLs
from the SR family. Reaching out to the community researching decidable fragments
of first-order logic, we show that our results also extend to entailment of P2RPQs by
GC2 databases, i.e. sets of unary and binary ground facts in the presence of a sentence
of the guarded two-variable fragment with counting as defined by Pratt-Hartmann [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
2
      </p>
      <p>
        Preliminaries
Description Logics. We will very briefly recap the very expressive DL ZOIQ and its
relevant sublogics following Calvanese et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. We assume a fixed signature
consisting of sets NI of individual names, NC of concept names, and NR of role names. We
let NC contain the special concept names &gt; and ?, and NR the special role names &gt;
and ?. The following EBNF grammar defines atomic concepts B, concepts C, atomic
roles R, simple roles S and roles T , with o; a; b 2 NI, A 2 NC, P 2 NR n f&gt;g:
B ::= A j fog
C ::= B j :C j C u C j C t C j 8T:C j 9T:C j &gt;nS:C j 6nS:C j 9S:Self
R ::= P j P
S ::= R j S \ S j S [ S j S n S
      </p>
      <p>T ::= &gt; j S j T [ T j T T j T j id(C)
An assertion is of the form C(a), R(a; b), a b, or a 6 b, a general concept inclusion
(GCI) has the form C1 v C2. We use C1 C2 as abbreviation for C1 v C2, C2 v C1.
A ZOIQ knowledge base (short: KB) K = (A; T ) consists of a finite set A (called
ABox) of assertions and a finite set T (called TBox) of GCIs.</p>
      <p>The semantics of ZOIQ is defined via interpretations I = ( I ; I ) composed
of a non-empty set I called the domain of I and a function I mapping individual
names to elements of I , concept names to subsets of I , and role names to subsets
of I I . This mapping is extended to concepts, simple roles, and roles (cf. Table 1).
Then, a GCI C1 v C2 is satisfied by I if C1I C2I , satisfaction of ABox assertions is
defined in the obvious way. We say that an interpretation I satisfies a KB K = (A; T )
(or I is a model of K, written: I j= K) if it satisfies all axioms of A and T . K is
called satisfiable if it has a model and unsatisfiable otherwise. From ZOIQ, we obtain
ZIQ by disallowing nominal concepts fog, ZOQ, by disallowing role inverses (:)
and ZOI by disallowing number restrictions.
Queries. In queries, we use variables from a countably infinite set V. A Boolean
positive two-way regular path query (P2RPQ) is an expression q = 9x:', where ' is a
positive formula (i.e., one using ^ and _) over expressions of the form T (s; t) or C(t),
where T is a role, C is concept, and s and t are elements of x [ NI. A P2RPQ is a
conjunctive two-way regular path query (C2RPQ) if it does not use disjunction. It is a
union of conjunctive two-way regular path queries (UC2RPQ) if it is a disjunction of
C2RPQs.3 A variable assignment Z for I is a mapping V ! I . For x 2 V, we set
xI;Z := Z(x); for c 2 NI, we set cI;Z := cI . T (s; t) evaluates to true under Z and
I if (sI;Z ; tI;Z ) 2 T I . C(t) evaluates to true under Z and I if tI;Z 2 CI . A P2RPQ
q = 9x:' is satisfied by I (written: I j= q) if there is a variable assignment Z (called
match) such that ' evaluates to true under I and Z. A P2RPQ q is entailed by a KB K
(written: K j= q) if every model of K satisfies q. If q uses only signature elements of K,
we call it a query over K.</p>
      <p>Simplifications. A P2RPQ is simplified if all its atoms are of the form T (x; y) where
x and y are variables and T is built from role names using [, , and . For any KB K
and P2RPQ q, we can construct in polynomial time a KB K0 and simplified P2RPQ q0
such that K j= q iff K0 j= q0. Hence from here on, we will focus on simplified queries.
Automata. For each atom T (x; y) of a simplified P2RPQ, T is a regular expression
over NR. Every such expression can be represented by a nondeterministic finite
automaton A = ( ; Q; I; F; T) with an alphabet fP; P j P 2 NRg of possibly
inverted role names, a finite set Q of states, initial states I Q, final states F Q,
and transitions T Q Q.4 The size of A is polynomially bounded by T . In
the following, we assume queries in automaton form, where the atoms T (x; y) have
been replaced by the corresponding A(x; y). Given A = ( ; Q; I; F; T), let A =
( ; Q; F; I; T ) with = fP j P 2 g be the corresponding reverse
automaton5 with initial and final states swapped and the state transitions flipped: (q0; R ; q) 2
3 As 9x: Wi 'i Wi 9x:'i, the order of disjunction and existential quantification is irrelevant.
4 The correspondence is such that ( ; 0) 2 T I iff there is a role path P!1 P!k 0 in I
such that A accepts P1 : : : Pk.
5 For convenience, we consider (P ) identical to P .</p>
      <p>T for every (q; R; q0) 2 T. Moreover, we obtain the “state-parameterized” automaton
Aq;q0 = ( ; Q; fqg; fq0g; T) from A by setting the initial state to q and the final to q0.
3</p>
      <p>A Little Bit of Model Theory
The “well-behavedness” of certain sublogics of ZOIQ can be conveniently
characterised in model-theoretic terms.</p>
      <p>
        Definition 1 (quasi-forest model, [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]). Let K be a KB. A model I of K is a quasi-forest
model if:
– the domain I of I is a forest, i.e., a prefix-closed subset of Roots N for some
finite set Roots,
– Roots = foI j o is an individual name in Kg, and
– for every ; 0 2 I with ( ; 0) 2 P I for some role name P 2 NR n f&gt;g, either (i)
f ; 0g \ Roots 6= ;, or (ii) = 0; or (iii) is a child of 0, or (iv) 0 is a child of .
A KB K has the quasi-forest model property (short: qfmp) if K is either unsatisfiable or
it has a quasi-forest model. A DL L has the qfmp if every L KB K has the qfmp.
For such well-behaved KBs, tight and quite decent complexity bounds for satisfiability
checking have been established.
      </p>
      <p>
        Theorem 1 ([
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]). Satisfiability checking of ZOIQ KBs having the qfmp is
EXPTIMEcomplete even with binary encoding of number restrictions.
      </p>
      <p>When it comes to query answering, a useful notion for relating interpretations to each
other is via homomorphisms. For two interpretations I = ( I ; I ) and J = ( J ; J ),
a function h : I ! J is a homomorphism from I to J if for all ; 0 2 I holds
2 AI ) h( ) 2 AJ for all concept names A, and ( ; 0) 2 P I ) (h( ); h( 0)) 2
P J for all role names P . If a homomorphism from I to J exists, we write I B J . We
note that I j= q and I B J imply J j= q for every simplified P2RPQ q.</p>
      <p>
        Definition 2 (qfhcp, [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]). A KB K has the quasi-forest homomorphism-cover property
(short: qfhcp) if for every model I of K, there exists a quasi-forest model J of K with
J B I. A description logic L has the qfhcp if every L KB K has the qfhcp. We let tame
ZOIQ denote the set of all ZOIQ KBs having the qfhcp.
      </p>
      <p>
        Obviously, the qfhcp implies the qfmp. Note that unrestricted ZOIQ does not even
have the qfmp. The most expressive syntactically defined sublogics of tame ZOIQ
known today are ZIQ, ZOQ, and ZOI [
        <xref ref-type="bibr" rid="ref2 ref6">6,2</xref>
        ]. From the fact that the set of models for
any P2RPQ is closed under homomorphisms now follows that for any tame ZOIQ KB
with K 6j= q, there must exist a quasi-forest model I, satisfying I j= K and I 6j= q.
4
      </p>
      <p>Annotating (Partial) Query Matches
In the following, we consider a tame ZOIQ KB K and a simplified P2RPQ q over
K in automaton form. We note that q can be rewritten into an equivalent disjunction
W1 i n qi of n (also simplified) C2RPQs, where n may be exponential in the size of q,
but every single qi’s size is polynomial. Let [q] denote fq1; : : : qng. Then, testing K 6j= q
amounts to determining if K has a model I which is a simultaneous countermodel (i.e.,
I 6j= qi) for all qi 2 [q].</p>
      <p>We now give an intuitive description of our technique to detect (or refute) matches
of q. Given some model I of K, we iteratively, deterministically annotate all its domain
elements with fresh concept names QM that indicate which “parts” M of some query
qi match into I and how participates in these partial matches. To this end, we employ
descriptions M of query parts which contain information about (i) the query variables
matched (ii) (the existence of) paths realising certain state transitions in the query’s
automata and (iii) optionally, the “position” of the current in the query match by use
of a marker acting like an additional, distinguished query variable. In the annotation
process, the QM s will be assigned to domain elements based on their local
environment and known annotations QM 0 for “smaller” partial queries to the same element
or its direct (role) neighborhood. As an exception, non-localised query matches M
not containing will be “broadcast”, i.e., uniformly attached to all domain elements.
This way, in the annotation process, QM s for larger and larger partial queries are
assigned, until finally (after reaching the unique fixpoint) also the full matches for any qi
are recognised by annotations Qqi . This process will accurately detect partial and full
query matches, if I is a quasi-forest model (which is sufficient for our purposes as K
has the qfhcp). The annotation process is realised by virtue of a KB Kq , the size of
which is exponential in q, but only polynomial in K. In the following, we will stepwise
introduce the KB, interleaved with some necessary definitions.</p>
      <p>Definition 3 (partial query). Consider a C2RPQ qi = fA1(x1; y1);:::; Am(xm; ym)g.6
A partial query for qi and a set X fx1; y1; : : : ; xm; ymg of variables from qi is a set
M consisting of
– all atoms A(x; y) from qi for which fx; yg X ,
– for every A(x; y) from qi with x 2 X but y 62 X , one of the atoms Aq;q0 (x; ) or</p>
      <p>Aq;q0 (x; o) where q is initial in A,
– for every A(x; y) from qi with x 62 X but y 2 X , one of the atoms Aq;q0 (y; ) or</p>
      <p>Aq;q0 (y; o) where q is final in A,
– in case X = ;, exactly one atom of the form Aq;q0 ( ; o) or Aq;q0 ( ; o) (called
nominal-anchored path) or Aq;q0 ( ; ) or Aq;q0 ( ; ) (called round trip) for some
A from qi.</p>
      <p>A partial query is called monodic, if jX j = 1. A partial query is called local if it
contains and global otherwise.</p>
      <p>Nominal-Anchored Paths. First, we want to detect role paths that start in the
to-beannotated individual, end in named individual oI , and realise state transitions in one of
the query’s automata. Let o be any individual name in K, let A be either A or A for
any automaton A occurring in q with states q; q0; q00. Then let Kq contain the axiom
QfAq;q( ;o)g(o)
(1)
6 In cases, we may equate a C2RPQ qi with the set of its atoms.
and whenever A
has a transition (q; R; q0):</p>
      <p>9R:QfAq0;q00 ( ;o)g v QfAq;q00 ( ;o)g
Round Trips. Next, we are concerned with paths which start and end in the
to-beannotated individual. Assuming an automaton A with states q; q0; q00; q000 and transition
set T as well as an individual name o, we add the axioms:
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9)
Initialising monodic partial queries. Next we determine domain elements to which
separate query variables could possibly be mapped in a match.</p>
      <p>Definition 4. We call a monodic partial query M original, whenever q = q0 for
every Aq;q0 (x; ) 2 M and every Aq;q0 (x; ) 2 M . For M an original monodic partial
query for q and fxg, the set PrecM of precondition concepts consists of:
– QfAq;q0 ( ;o)g for each atom Aq;q0 (x; o) in M ,
– QfAq;q0 ( ;o)g for each atom Aq;q0 (x; o) in M ,
– F q initial QfAq;q0 ( ; )g for each atom A(x; x) in M .</p>
      <p>q0 final
Now we can initialise all original monodic partial queries M by adding the following
axioms to Kq:</p>
      <p>l PrecM v QM
Updating partial queries. Partial queries can be updated by roundtrips: Let M be a
partial query containing some Aq;q0 (x; ). Then we realise the update by adding the
following axioms to Kq:</p>
      <p>QM u QfAq0;q00 ( ; )g v QMnfAq;q0 (x; )g[fAq;q00 (x; )g</p>
      <p>&gt; v QfAq;q( ; )g
QfAq;q0 ( ; )g u QfAq0;q00 ( ; )g v QfAq;q00 ( ; )g</p>
      <p>QfAq;q0 ( ;o)g u QfAq00;q0 ( ;o)g v QfAq;q00 ( ; )g
for (q; R; q0) 2 T:</p>
      <p>9R:Self v QfAq;q0 ( ; )g
for f(q; R; q0); (q00; R0; q000)g</p>
      <p>T:
9(R \ R0 ):QfAq0;q00 ( ; )g v QfAq;q000 ( ; )g
Furthermore, a partial query can “progress” when making a “step” in the model, moving
from the considered to some role neighbour. In such a step, all the “unready” paths
(corresponding to atoms with ) must be updated in a synchronous manner. Given a
partial query M for qi and nonempty X as well as a set fR1; : : : ; Rkg of (possibly
inverted) role names, assume M 0 can be obtained from M by replacing each atom of
the form Aq;q0 (x; ) by an atom Aq;q00 (x; ) such that A has a transition (q0; R; q00)
for for any R 2 fR1; : : : ; Rkg. Then add the following axiom to the KB Kq:
(10)
Joining partial queries. When two partial queries corresponding to the same query
qi “meet” in a domain element , they can – under certain circumstances – be “glued
together” into a “bigger” partial query of qi.</p>
      <p>Definition 5 (joinable, join). Let M1 be a partial query for qi and X1 and let M2 be
a partial query for qi and X2. We call M1 and M2 joinable if
– X1 6= ;, X2 6= ;, X1 \ X2 = ;, and
– for each A(x; y) 2 qi n (M1[M2) with fx; yg X1[X2, there are states q; q0; q00 of
A with q initial and q00 final, such that either fAq;q0 (x; ); Aq00;q0 (y; )g 2 M1[M2
or fAq;q0 (x; o); Aq00;q0 (y; o)g 2 M1 [ M2 for some o 2 NI.</p>
      <p>For joinable M1 and M2, their join, denoted M1 ./ M2, is the partial query obtained
from M1 [ M2 by replacing any pair of atoms Aq;q0 (x; ) and Aq00;q0 (y; ) by A(x; y),
where is either an individual name o or .</p>
      <p>We implement the join operation for every pair M1, M2 of joinable partial queries for
a qi 2 [q] by extending Kq with:</p>
      <p>QM1 u QM2 v QM1./M2
Broadcasting of global partial queries. Whenever a partial query M does not have any
occurrences of , which would tie it to a specific element, it will be “broadcast” to all
domain elements:
9&gt;:QM v QM
(11)
(12)
This concludes the definition of Kq. Revisiting our initial intuition of Kq’s purpose of
deterministically annotating a model I of K, the following lemma formalises this by
singling out one unique annotated model I for every I.</p>
      <p>Lemma 1. Let K be a ZOIQ KB and q a simplified P2RPQ. For every model I of K
there exists a unique model I of K [ Kq which extends I and satisfies QIM QJM for
every QM in Kq and for every model J of K [ Kq that extends I.</p>
      <p>We will call models I from Lemma 1 Q-minimal. Note that for every model I of K,
any concept membership 2 QIM holding in the corresponding Q-minimal model I
can be derived from concept and role memberships in I through a finite sequence of
“applications” of axioms from Kq.7
5</p>
      <p>Reduction to Satisfiability
Now we establish technical results relating the presence of QM -annotations in models
and the actual semantic satisfaction of the corresponding partial query M . Note that any
partial query M can be seen as a (set representation of a) C2RPQ in automaton form,
assuming is just an ordinary variable.
7 To see this, it may help to realise that all axioms of Kq can be easily expressed in monadic</p>
      <p>Datalog.</p>
      <p>Definition 6 ((tight) satisfaction of a partial query). Given an interpretation I, a
domain element 2 I , and a partial query M , we say M is satisfied or holds in
, written I; j= M , if there is a match Z for M in I with Z( ) = . M is satisfied
or holds tightly in , written I; j=j= M , if Z is such that every Aq1;q2 (x; ) 2 M is
realized by a path Z(x) ; not containing oI for any individual name o.
Note that j= and j=j= coincide whenever M is global or does not contain variables. Note
also that, as a consequence of this definition, if M is global, then QM holds (tightly)
everywhere or nowhere throughout the domain. With this notion in place, the next two
lemmas can be seen as soundness and completeness results regarding the deduction
calculus for (partial) query matches embodied by Kq.</p>
      <p>Lemma 2 (soundness). Let I be a Q-minimal model of K [ Kq and let 2 I . Let
M be any partial query for any C2RPQ qi 2 [q]. Then 2 QIM implies I; j= M .
Lemma 3 (completeness). Let I be a quasi-forest model of K [ Kq and let
Let M be any partial query for any C2RPQ qi 2 [q]. Then I; j=j= M implies
2
2 QIM .</p>
      <p>I .</p>
      <p>Soundness is proven by induction over the length of the derivation for 2 QIM .
Completeness is shown by induction over the spread of the match for M , i.e., the number of
variables (except ) plus the sum of the lengths of all paths realising the query atoms.</p>
      <p>Thanks to these correspondences, we can essentially rule out models with matches
of any q1; : : : ; qn by forcing the extensions of Qq1 ; : : : ; Qqn to be empty. Therefore, let
K:q = K [ Kq [ fQqi v ? j qi 2 [q]g:
(13)
We establish some syntactic and semantic properties of K:q: Some easy calculations
yield size bounds for K:q, tameness is not affected by the reduction, and unsatisfiability
of K:q indeed coincides with query entailment.</p>
      <p>Fact 4 (size of K:q) The size of K:q is single exponential in the total size of K and q.
It is polynomial, if the number of atoms in q is bounded by a constant.</p>
      <p>Lemma 5. If K is in tame ZOIQ then so is K:q.</p>
      <p>Theorem 2. Let K be a tame ZOIQ KB and let q be a simplified P2RPQ over K. Then
K j= q if and only if K:q is unsatisfiable.</p>
      <p>Proof. We show the equivalent statement: K 6j= q if and only if K:q satisfiable.
“only if”: Assume K 6j= q, that is there exists an I with I j= K and I 6j= q. Consider
I , the Q-minimal model of K [ Kq extending I. If some QqIi were nonempty, then
there would be a query match of q into I by Lemma 2, contradicting our assumption.
Therefore, I j= Qqi v ? for all qi 2 [q] and hence, I j= K:q.
“if”: Assume K:q is satisfiable and hence has a model I. As K:q has the qfmp thanks
to Lemma 5, we can assume I to be a quasi-forest model. I j= Qqi v ? implies
emptiness of QqIi for every qi 2 [q]. If I j= q held true, witnessed by I j= qi for some
qi 2 [q], Lemma 3 would require 2 QqIi for all 2 I , leading to a contradiction.
Hence I 6j= q. On the other hand, I j= K:q implies I j= K, therefore K 6j= q.
Based on this theorem, we can now prove our central result:
Theorem 3. Checking entailment of simplified P2RPQs from tame ZOIQ KBs with
binary number encoding is 2EXPTIME-complete. It is EXPTIME-complete if the
number of atoms in the query is bounded.</p>
      <p>
        Proof. By Theorem 2, entailment of a simplified C2RPQ q from a tame ZOIQ KB K
can be reduced to checking unsatisfiability of K:q, which can be computed in
outputpolynomial time and the size of which is exponential (polynomial, if number of atoms in
q is bounded). By Theorem 1, (un)satisfiability of K:q can be checked in EXPTIME in
the size of K:q, hence we obtain the desired 2EXPTIME and EXPTIME upper bounds.
The matching lower bounds are well-known even for much weaker logics [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
6
      </p>
      <p>
        Querying GC2 Databases
We now consider the problem of P2RPQ entailment from GC2 databases, i.e., sets of
unary and binary ground facts in the presence of a sentence in the guarded two-variable
fragment with counting quantifiers as defined by Pratt-Hartmann ([
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]). We will show
that there is a polytime query-entailment-preserving transformation from GC2 databases
to ZIQ KBs. This allows us to transfer the results from Theorem 3. Matching lower
bounds follow from known results for much weaker logics and query formalisms (such
as CQs over ALCI [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]).
      </p>
      <p>
        Definition 7 (GC2, [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]). For a signature S of nullary (N0), unary (NC), and binary
(NR [ f g) predicates,8 let GC2 be the smallest set of formulae that contains all atoms
over S using only variables from fx; yg, that is closed under Boolean combinations,
and that contains
– all formulae 9u:' and 8u:' with u 2 fx; yg whenever ' is a GC2 formula with at
most one free variable, and
– all formulae 8u:( ! ') as well as uQ:( ^ ') and uQ:( ) where is a binary
atom (called “guard”) containing both x and y, ' is a GC2 formula, and Qis any of
9, 96n, 9=n, or 9&gt;n.
      </p>
      <p>A GC2 database DB is a theory f g [ A consisting of a GC2 sentence and a finite set
2
A of unary and binary ground facts over constants from NI.9 We denote by GCqef the
set of GC2 formulae which are quantifier-free and do not use .</p>
      <p>A P2RPQ over a GC2 database is a P2RPQ where for every atom of the form C(t) holds
C 2 NC and for every atom of the form T (s; t), every subexpression of the form id(C)
satisfies C 2 NC. P2RPQ entailment from GC2 databases is defined in the same way
as for DLs.</p>
      <p>For convenience, we make use of a special normal form for GC2 introduced in the
following.
8 As unary and binary predicates correspond to concept and role names, respectively, we use the
same set symbols to denote them.
9 Note that ABoxes, as defined before, are syntactically and semantically identical to finite sets
of unary and binary ground facts, so we consider the notions the same and use the same symbol
A. Note also that by definition, does not contain constants.</p>
      <p>
        Lemma 6 (normal form, [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]). For any GC2 sentence , one can compute in
polynomial time a GC2 sentence NF( ) of the form
8x
^ V1 h ` 8x8y (Eh(x; y) !
h _ x
      </p>
      <p>y))
^ V1 i m 8x9=ni y (Fi(x; y) ^ x 6 y) ;
(14)
with 2 GC2qef not using y, fE1; ::: ; E`; F1; ::: ; Fmg NR, f 1; ::: ; `g GC2qef ,
and fn1; ::: ; nmg N, such that (i) NF( ) j= and (ii) every model of with at least
maxh nh elements can be extended to a model of NF( ).</p>
      <p>Given a GC2 database DB = f g [ A, we let NF(DB) = fNF( )g [ A. Thanks to the
previous lemma and the known property that models of GC2 sentences are closed under
disjoint self-union, the normal form can be shown to be query-entailment-preserving.
Lemma 7. For every P2RPQ q over a GC2 database DB holds DB j= q if and only if
NF(DB) j= q.</p>
      <p>We next provide a query-entailment-preserving polynomial translation of GC2 databases
into ZIQ KBs.</p>
      <p>Definition 8. Let be a GC2 sentence in normal form as in Lemma 6. We define the
ZIQ TBox T as follows:</p>
      <p>[
1 h l</p>
      <p>[
1 i m
T
= T
– T = f&gt; v 61R :&gt;; &gt; v 9R :Selfg
– Tps = fP ropP 9&gt;:P ropP j P 2 N0g
– T = f&gt; v tr( )g, where</p>
      <p>tr(P ) = P ropP
tr(P (x)) = P
tr(P (x; x)) = 9P:Self</p>
      <p>tr(:') = :tr(')
tr(' ^ ) = tr(') u tr( )
tr(' _ ) = tr(') t tr( )
[ f&gt; v 8(Ehn(R
in h to sets of GCIs as follows:
– TEh; h = Sat2atoms( h) (axEh (at) [ f&gt; v 8(REh;atnEh):?g) where axEh maps atoms
[ trEh ( h))):?g
P 7! f
P (x) 7! f
P (x; x) 7! f
P (y) 7! f
P (y; y) 7! f</p>
      <p>9REh;P (y;y):&gt; v 9P:Self;
P (x; y) 7! f 9(REh;P (x;y)nP ):&gt; v ?;
P (y; x) 7! f 9(REh;P (y;x)nP ):&gt; v ?;</p>
      <p>2
and trEh maps GCqef formulae to simple roles:</p>
      <p>9REh;P :&gt; v P ropP ;
9REh;P (x):&gt; v P;
9REh;P (x;x):&gt; v 9P:Self;
9REh;P (y):&gt; v P;</p>
      <p>9(EhnREh;P ):&gt; v :P ropP g
9(EhnREh;P (x)):&gt; v :P g
9(EhnREh;P (x;x)):&gt; v :9P:Selfg
9(Eh nREh;P (y)):&gt; v :P g
9(Eh nREh;P (y;y)):&gt; v :9P:Selfg
9(Eh\P nREh;P (x;y)):&gt; v ?
9(Eh \P nREh;P (y;x)):&gt; v ?
g
g
trEh (at ) = REh;at
trEh (: ) = Ehntr( )
trEh ( ^ 0) = trEh ( ) \ trEh ( 0)
trEh ( _ 0) = trEh ( ) [ trEh ( 0)
– TFi;ni = f &gt; v 6ni(Fi nR ):&gt; u &gt;ni(Fi nR ):&gt; g
Given a GC2 database DB = f g [ A, we let KDB denote the ZIQ KB (TNF( ); A).
Intuitively, T axiomatises equality, Tps introduces synchronised concept names P ropP
for every propositional symbol P from DB, since ZIQ does not allow for nullary
predicates. T , TEh; h , and TFi;ni implement the conjuncts from NF( ).</p>
      <p>We can now observe that KDB can be computed in polytime wrt. the size of DB and
that it has the qfhcp (since it is a ZIQ KB). Moreover we can show that any model of
KDB can be extended to one of NF(DB) and, conversely, any model of NF(DB) can be
extended to one of KDB. With the help of these correspondences and Lemma 7, we can
then argue that for any P2RPQ q over DB holds DB j= q iff KDB j= q. These findings
allow us to apply Theorem 3 to establish the following result.</p>
      <p>Theorem 4. Checking P2RPQ entailment from GC2 databases is 2EXPTIME-complete.
It is EXPTIME-complete if the number of query atoms is bounded by a constant.
7</p>
      <p>
        Conclusion
We have established tight complexity bounds for expressive querying in very expressive
DLs under the assumption of succinct (i.e. binary) encoding of number restrictions.
Arguing along the lines of Calvanese et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], we can leverage our findings to strengthen
their results on query containment as well as the SR family as follows:10
Theorem 5 (query containment). Testing query entailment K j= q0 q is in
2EXPTIME with respect to the total size of q0, q, and K (with binary encoding of number
restrictions) if (i) K is in ZOQ or ZOI and q0 and q are P2RPQs over K, or (ii) K is
in ZIQ, q0 is a conjunctive query and q is a P2RPQ over K. The complexity drops to
EXPTIME if the number of atoms occurring in q is bounded by a constant.
Theorem 6 (SR family). Deciding K j= q and K j= q0 q is in 3EXPTIME in the
total size of q0, q, K (with binary encoding of numbers) – and in 2EXPTIME if K’s RBox
is given by defining regular expressions for the non-simple roles – if (i) K is in SROQ
or SROI and q0, q are P2RPQs over K, or (ii) K is in SRIQ, q0 is a conjunctive
query and q is a P2RPQ over K. The complexities are in 2EXPTIME and in EXPTIME,
respectively, if the number of atoms occurring in q is bounded by a constant.
There are plenty of open questions left for future work. First, the results established
here hold for arbitrary models, however, so far, very little is known about finite (model)
query answering in DLs from the Z family and the methods applied here do not seem
to easily carry over to the finite model setting. Second, while the extension of our query
formalism by nestings in the sense of Bienvenu et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] seems straightforward without
impacting complexities, our technique seems not readily extendable to capture more
elaborate query languages [
        <xref ref-type="bibr" rid="ref12 ref14 ref4">14,4,12</xref>
        ].
10 For space reasons, we assume the reader to be familiar with the notions used in these theorems.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>Meghyn</given-names>
            <surname>Bienvenu</surname>
          </string-name>
          , Diego Calvanese, Magdalena Ortiz, and
          <string-name>
            <given-names>Mantas</given-names>
            <surname>Simkus</surname>
          </string-name>
          .
          <article-title>Nested regular path queries in description logics</article-title>
          . In Chitta Baral, Giuseppe De Giacomo, and Thomas Eiter, editors,
          <source>Proc. 14th Int. Conf. on Principles of Knowledge Representation and Reasoning (KR'14)</source>
          , pages
          <fpage>218</fpage>
          -
          <lpage>227</lpage>
          . AAAI Press,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Piero</surname>
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Bonatti</surname>
          </string-name>
          , Carsten Lutz, Aniello Murano, and
          <string-name>
            <surname>Moshe</surname>
            <given-names>Y.</given-names>
          </string-name>
          <string-name>
            <surname>Vardi</surname>
          </string-name>
          .
          <article-title>The complexity of enriched mu-calculi</article-title>
          .
          <source>Logical Methods in Computer Science</source>
          ,
          <volume>4</volume>
          (
          <issue>3</issue>
          :11):
          <fpage>1</fpage>
          -
          <lpage>27</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>Pierre</given-names>
            <surname>Bourhis</surname>
          </string-name>
          , Markus Krötzsch, and
          <string-name>
            <given-names>Sebastian</given-names>
            <surname>Rudolph</surname>
          </string-name>
          .
          <article-title>How to best nest regular path queries</article-title>
          .
          <source>In Proc. 27th Int. Workshop on Description Logics (DL'14)</source>
          , volume
          <volume>1193</volume>
          <source>of CEUR Workshop Proceedings</source>
          , pages
          <fpage>404</fpage>
          -
          <lpage>415</lpage>
          . CEUR-WS.org,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>Pierre</given-names>
            <surname>Bourhis</surname>
          </string-name>
          , Markus Krötzsch, and
          <string-name>
            <given-names>Sebastian</given-names>
            <surname>Rudolph</surname>
          </string-name>
          .
          <article-title>Reasonable highly expressive query languages</article-title>
          .
          <source>In Qiang Yang and Michael Wooldridge</source>
          , editors,
          <source>Proc. 24th Int. Joint Conf. on Artificial Intelligence (IJCAI'15)</source>
          , pages
          <fpage>2826</fpage>
          -
          <lpage>2832</lpage>
          . AAAI Press,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5. Diego Calvanese, Thomas Eiter, and
          <string-name>
            <given-names>Magdalena</given-names>
            <surname>Ortiz</surname>
          </string-name>
          .
          <article-title>Answering regular path queries in expressive description logics: An automata-theoretic approach</article-title>
          .
          <source>In Proc. 22nd AAAI Conf. on Artificial Intelligence (AAAI'07)</source>
          , pages
          <fpage>391</fpage>
          -
          <lpage>396</lpage>
          . AAAI Press,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6. Diego Calvanese, Thomas Eiter, and
          <string-name>
            <given-names>Magdalena</given-names>
            <surname>Ortiz</surname>
          </string-name>
          .
          <article-title>Regular path queries in expressive description logics with nominals</article-title>
          . In Craig Boutilier, editor,
          <source>Proc. 21st Int. Joint Conf. on Artificial Intelligence (IJCAI'09)</source>
          , pages
          <fpage>714</fpage>
          -
          <lpage>720</lpage>
          . IJCAI,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>Carsten</given-names>
            <surname>Lutz</surname>
          </string-name>
          .
          <article-title>The complexity of conjunctive query answering in expressive description logics</article-title>
          . In Alessandro Armando,
          <string-name>
            <given-names>Peter</given-names>
            <surname>Baumgartner</surname>
          </string-name>
          , and Gilles Dowek, editors,
          <source>Proc. 4th Int. Joint Conf. on Automated Reasoning (IJCAR'08)</source>
          , number 5195 in LNAI, pages
          <fpage>179</fpage>
          -
          <lpage>193</lpage>
          . Springer,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>Magdalena</given-names>
            <surname>Ortiz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Sebastian</given-names>
            <surname>Rudolph</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Mantas</given-names>
            <surname>Simkus</surname>
          </string-name>
          .
          <article-title>Query answering is undecidable in DLs with regular expressions, inverses, nominals, and counting</article-title>
          .
          <source>INFSYS Research Report 1843-10-03</source>
          , TU Vienna,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>Magdalena</given-names>
            <surname>Ortiz</surname>
          </string-name>
          and
          <string-name>
            <given-names>Mantas</given-names>
            <surname>Simkus</surname>
          </string-name>
          .
          <article-title>Reasoning and query answering in description logics</article-title>
          . In Thomas Eiter and Thomas Krennwallner, editors,
          <source>Proc. 8th Int. Reasoning Web Summer School (RW'12)</source>
          , volume
          <volume>7487</volume>
          of Lecture Notes in Computer Science, pages
          <fpage>1</fpage>
          -
          <lpage>53</lpage>
          . Springer,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Ian</surname>
          </string-name>
          Pratt-Hartmann.
          <article-title>Complexity of the guarded two-variable fragment with counting quantifiers</article-title>
          .
          <source>J. Log. Comput.</source>
          ,
          <volume>17</volume>
          (
          <issue>1</issue>
          ):
          <fpage>133</fpage>
          -
          <lpage>155</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Ian</surname>
          </string-name>
          Pratt-Hartmann.
          <article-title>Data-complexity of the two-variable fragment with counting quantifiers</article-title>
          .
          <source>Inf. Comput.</source>
          ,
          <volume>207</volume>
          (
          <issue>8</issue>
          ):
          <fpage>867</fpage>
          -
          <lpage>888</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Juan L. Reutter</surname>
          </string-name>
          , Miguel Romero, and
          <string-name>
            <surname>Moshe</surname>
            <given-names>Y.</given-names>
          </string-name>
          <string-name>
            <surname>Vardi</surname>
          </string-name>
          .
          <article-title>Regular queries on graph databases</article-title>
          .
          <source>Theor. Comp. Sys.</source>
          ,
          <volume>61</volume>
          (
          <issue>1</issue>
          ):
          <fpage>31</fpage>
          -
          <lpage>83</lpage>
          ,
          <year>July 2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <given-names>Sebastian</given-names>
            <surname>Rudolph</surname>
          </string-name>
          .
          <article-title>Undecidability Results for Database-Inspired Reasoning Problems in Very Expressive Description Logics</article-title>
          . In Chitta Baral,
          <string-name>
            <given-names>James P.</given-names>
            <surname>Delgrande</surname>
          </string-name>
          , and Frank Wolter, editors,
          <source>Proc. 15th Int. Conf. on Knowledge Representation and Reasoning (KR'16)</source>
          , pages
          <fpage>247</fpage>
          -
          <lpage>257</lpage>
          . AAAI Press,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <given-names>Sebastian</given-names>
            <surname>Rudolph</surname>
          </string-name>
          and
          <string-name>
            <given-names>Markus</given-names>
            <surname>Krötzsch</surname>
          </string-name>
          .
          <article-title>Flag &amp; check: Data access with monadically defined queries</article-title>
          . In Richard Hull and Wenfei Fan, editors,
          <source>Proc. 32nd Symposium on Principles of Database Systems (PODS'13)</source>
          , pages
          <fpage>151</fpage>
          -
          <lpage>162</lpage>
          . ACM,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>