Acyclic Query Answering Under Guarded Disjunctive
         Existential Rules and Consequences to DLs

                  Pierre Bourhis1 , Michael Morak2 , and Andreas Pieris2
                      1
                       CNRS LIFL Université Lille1/INRIA Lille, France.
                          pierre.bourhis@univ-lille1.fr
          2
            University of Oxford, Department of Computer Science, United Kingdom.
                {michael.morak,andreas.pieris}@cs.ox.ac.uk


        Abstract. The complete picture of the complexity of conjunctive query answer-
        ing under guarded disjunctive existential rules has been recently settled. However,
        in the case of (unions of) acyclic conjunctive queries ((U)ACQs) there are some
        fundamental questions which are still open. It is the precise aim of the present
        paper to close those questions, and to understand whether the acyclicity of the
        query has a positive impact on the complexity of query answering. Our main re-
        sult states that acyclic conjunctive query answering under a fixed set of guarded
        disjunctive existential rules is E XP T IME-hard. This result together with an E XP -
        T IME upper bound obtained by exploiting classical results on guarded first-order
        logic, gives us a complete picture of the complexity of our problem. We also
        show that our results can be used as a generic tool for establishing results on
        (U)ACQ answering under several central DLs. In fact, restricting the query lan-
        guage to UACQs improves the complexity to E XP T IME-complete for any DL
        between DL-Litebool and ALCHI; this holds even for fixed TBoxes.


1     Introduction
Rule-based languages have a prominent presence in the areas of artificial intelligence
and databases. A noticeable formalism, originally intended for expressing complex
queries over relational databases, is Datalog, i.e., function-free first-order Horn logic.
As Datalog itself is not able to infer the existence of new objects that do not occur in
the extensional database, strong interest in enhancing Datalog with existential quantifi-
cation in rule-heads emerged in recent years; see, e.g., [5, 6, 12, 23, 26]. The obtained
rules are known under a variety of names such as existential rules, tuple-generating
dependencies (TGDs), and Datalog± rules. However, these languages are still not ex-
pressive enough for non-deterministic reasoning, e.g., to express simple statements like
“a grandchild is a male or a female”, which can be logically expressed using the rules
    ∀X∀Y (parentOf (X, Y ) ∧ isfather (Y ) → ∃Z(grparentOf (X, Z) ∧ human(Z)))
                         ∀X(human(X) → male(X) ∨ female(X)).
Extending TGDs to allow such a disjunction in the head yields the formalism of disjunc-
tive TGDs (DTGDs) [17]. Such an extension of plain Datalog (a.k.a. full TGDs), called
disjunctive Datalog, has been studied in [18]. However, the addition of existential quan-
tifiers easily leads to undecidability of the main reasoning tasks, including conjunctive
query (CQ) answering [8].
     Several concrete languages which ensure decidability have thus been proposed; see,
e.g., [5, 10, 12, 13, 19, 23, 24]. In this paper, we will focus on a concept called “guard-
edness” [10], which requires that all universally quantified variables in a rule appear
together in an atom in the left-hand side of the implication. Guardedness is a well-
accepted paradigm, giving rise to robust languages that capture important description
logics such as DL-Lite [14], EL [4] and even ALCI. Recently, the complexity picture
for query answering under guarded DTGDs has been investigated and completed for
(unions of) arbitrary CQs [9]. Also for simple atomic queries the complexity has been
investigated (see, e.g., [1, 21]). As acyclic CQs (ACQs) in the literature usually lead to a
reduction in the complexity of the query answering problem, the objective of the present
paper is to investigate whether this positive impact can also be observed in our setting.
To this end, we aim to pinpoint the exact complexity of answering (unions of) acyclic
CQs ((U)ACQs), in order to compare it to existing results for answering (unions of)
arbitrary CQs. More precisely, we concentrate on the following fundamental question:
What is the exact complexity of answering (U)ACQs under guarded DTGDs, and how
is it affected if we consider a signature of bounded arity, or a fixed set of dependencies?
     After establishing the complexity of (U)ACQ answering, we then investigate the
consequences to DLs. We show that our results can be used as a generic tool for es-
tablishing results on (U)ACQ answering under several central DLs that can be found
in the literature such as the key DL ALCHI, that is, the DL ALC extended with role
hierarchies and inverse roles. Our contributions can be summarized as follows:

 1. We show that UACQ answering under guarded DTGDs with predicates of bounded
    arity is in E XP T IME. This upper bound is established by exploiting results from
    guarded first-order logic [22];
 2. We show that ACQ answering under fixed sets of guarded DTGDs is E XP T IME-
    hard. This is a strong and general lower bound obtained by simulating the behavior
    of an alternating linear space Turing machine.
 3. Finally, we show that our results for fixed arity and fixed sets extend to a number
    of expressive DLs, namely ALCHI, ELUHI, DL-LiteH            bool and also DL-Litebool .
    For each of these DLs, and any others between them, answering UACQs, even over
    fixed TBoxes, is E XP T IME-complete. The upper bounds follow from the fact that
    guarded DTGDs are powerful enough to capture each of these DLs, and they thus
    inherit the positive effects of restricting the query language to acyclic queries. The
    lower bounds follow from the construction of our proof for guarded DTGDs.


2   Preliminaries

General. Let C, N and V be pairwise disjoint infinite countable sets of constants, (la-
beled) nulls and variables, respectively. We denote by X sequences (or sets) of variables
X1 , . . . , Xk . Let [n] = {1, . . . , n}, for n > 1. A term is a constant, null or variable.
An atom has the form p(t1 , . . . , tn ), where p is an n-ary predicate, and t1 , . . . , tn are
terms. For an atom a, dom(a) and var (a) are the set of its terms and the set of its vari-
ables, respectively; those notations extend to sets of atoms. Usually conjunctions and
disjunctions of atoms are treated as sets of atoms. An instance I is a (possibly infinite)
set of atoms of the form p(t), where t is a tuple of constants and nulls. A database D
is a finite instance with only∧constants. Whenever an instance I is treated as a logical
formula, is the formula ∃X ( a∈I I), where X contains a variable for each null in I.
    Conjunctive Queries. A conjunctive query (CQ) q is a sentence ∃Xφ(X), where
φ is a conjunction of atoms. If q does not have free variables, then it is called Boolean.
For brevity, we consider only Boolean CQs; however, all the results of the paper can
be easily extended to non-Boolean CQs. A union of conjunctive queries (UCQ) is a
disjunction of a finite number of CQs. By abuse of notation, sometimes we consider
a UCQ as set of CQs. A CQ q = ∃Xφ(X) has a positive answer over an instance I,
written I |= q, if there exists a homomorphism h such that h(φ(X)) ⊆ I. The answer
to a UCQ Q over I is positive, written I |= Q, if there exists q ∈ Q such that I |= q.
    Several subclasses of conjunctive queries have been considered in the literature,
with the aim of reducing the complexity of the CQ evaluation problem. This paper
focuses on the class of acyclic CQs (ACQs); see, e.g., [16]. The acyclicity of a CQ
is defined via the acyclicity of its hypergraph. Informally, a hypergraph is acyclic if it
can be reduced to the empty hypergraph by iteratively eliminating some non-maximal
hyperedge, or some vertex contained in at most one hyperedge; this procedure is known
as the GYO algorithm. The formal definition is as follows. Consider a hypergraph H.
The GYO-reduct of H, denoted GYO(H), is obtained by applying exhaustively the
following two steps: (1) eliminate vertices that are contained in at most one hyperedge;
and (2) eliminate hyperedges that are empty or contained in other hyperedges. We say
that H is acyclic if GYO(H) = ⟨∅, ∅⟩. The hypergraph of a CQ q, denote H(q), is
a hypergraph ⟨V, H⟩, where V = dom(q), and, for each atom a in q, there exists a
hyperedge h ∈ H such that h = dom(a). We say that q is acyclic iff H(q) is acyclic.
    Disjunctive Tuple-Generating Dependencies. A disjunctive      ∨ntuple-generating de-
pendency (DTGD) σ is a first-order formula ∀X (φ(X) → i=1 ∃Yi ψi (X, Yi )),
where n > 1, X ∪ Y1 ∪ . . . ∪ Yn ⊂ V, and φ, ψ1 , . . . , ψn are∨conjunctions     of atoms.
                                                                  n
The formula φ is called the body of σ, denoted body(σ), while i=1 ψi is the head of σ,
denoted head (σ). If n = 1, then σ is called tuple-generating dependency (TGD). The
schema of a set Σ of DTGDs, denoted sch(Σ), is the set of all predicates occurring in
Σ. For brevity, we will omit the universal quantifiers in front of DTGDs, and use the
comma (instead of ∧) for conjoining atoms. An instance I satisfies σ, written I |= σ, if
the following holds: Whenever there exists a homomorphism h such that h(φ(X)) ⊆ I,
then there exists i ∈ [n] and h′ ⊇ h such that h′ (ψi (X, Yi )) ⊆ I; I satisfies a set Σ of
DTGDs, denoted I |= Σ, if I |= σ, for each σ ∈ ∧      Σ. Whenever a set Σ of DTGDs is
treated as a logical formula, in fact is the formula ( σ∈Σ σ).
    As said, (U)CQ answering under DTGDs (which is defined below) is undecidable;
in fact, is undecidable already for TGDs [8], even when the set of TGDs is fixed and the
query is acyclic [11], or even when the set of TGDs is singleton [5]. A well-known prop-
erty which guarantees decidability of query answering is guardedness [11]. A DTGD
σ is guarded if there exists an atom a ∈ body(σ), called guard, which contains all the
variables occurring in body(σ), i.e., var (a) = var (body(σ)).
    Query Answering. The models of a database D and a set Σ of DTGDs, denoted
mods(D, Σ), is the set of instances {I | I ⊇ D and I |= Σ}. The answer to a CQ q
w.r.t. D and Σ is positive, denoted D ∪ Σ |= q, if I |= q, for each I ∈ mods(D, Σ).
The answer to a UCQ w.r.t. D and Σ is defined analogously. The problem tackled in
this work is defined as follows: Given a CQ q, a database D, and a set Σ of DTGDs,
decide whether D ∪ Σ |= q. If the input query is an ACQ, then the above problem
is called ACQ answering. Analogously, the problem UACQ answering for unions of
ACQs is defined. The data complexity of the above problems is calculated taking only
the database as input. For the combined complexity, the query and set of DTGDs count
as part of the input as well.
    Disjunctive Chase. We employ    ∨n the disjunctive chase [17]. Consider an instance I,
and a DTGD σ : φ(X) → i=1 ∃Y ψi (X, Y). We say that σ is applicable to I if
there exists a homomorphism h such that h(φ(X)) ⊆ I, and the result of applying σ
to I with h is the set {I1 , . . . , In }, where Ii = I ∪ h′ (ψi (X, Y)), for each i ∈ [n],
and h′ ⊇ h is such that h′ (Y ) is a “fresh” null not occurring in I, for each Y ∈ Y.
For such an application of a DTGD, which defines a single DTGD chase step, we write
I⟨σ, h⟩{I1 , . . . , In }. A disjunctive chase tree of a database D and a set Σ of DTGDs
is a (possibly infinite) tree such that the root is D, and for every node I, assuming that
{I1 , . . . , In } are the children of I, there exists σ ∈ Σ and a homomorphism h such that
I⟨σ, h⟩{I1 , . . . , In }. The disjunctive chase algorithm for D and Σ consists of an ex-
haustive application of DTGD chase steps in a fair fashion, which leads to a disjunctive
chase tree T of D and Σ; we denote by chase(D, Σ) the set {I | I is a leaf of T }. It is
well-known that, given a UCQ Q, D ∪ Σ |= Q iff I |= Q, for each I ∈ chase(D, Σ).
    The Guarded Fragment of First-Order Logic. The guarded fragment (GFO) has
been introduced in [2]. The set of GFO formulas over a schema R is the smallest set (1)
containing all atomic R-formulas and equalities; (2) closed under the logical connec-
tives ¬, ∧, ∨, →; and (3) if a is an R-atom containing all the variables of X ∪ Y, and
φ is a GFO formula with free variables contained in (X ∪ Y), then ∀X(a → φ) and
∃X(a∧φ) are GFO formulas. It is known that the problem of deciding whether an GFO
sentence is satisfiable is 2E XP T IME-complete, and E XP T IME-complete for predicates
of bounded arity [22].


3   Answering (Unions of) Acyclic Queries

The problem of answering (unions of) arbitrary conjunctive queries has recently been
investigated in [9]. It turns out that even for fixed sets of very simple forms of DTGDs,
the problem is already 2E XP T IME-hard. Restricting the query language to (unions of)
acyclic queries has often proven to be a way to reduce the complexity. Our goal in this
section is thus to investigate if this also holds true for guarded DTGDs, and to study
how exactly the complexity of (U)CQ answering under our guarded DTGDs is affected
if we focus on acyclic queries. We will begin this section by summarizing our results,
and then proceed with our new complexity bounds.
    Table 1 summarizes the complexity results for answering (U)ACQs under guarded
DTGDs. Compared with the results for (U)CQs, as presented in [9], two observa-
tions can be made: Firstly, the combined and the data complexity do not change if
we restrict ourselves to acyclic queries. Secondly, however, the complexity decreases
from 2E XP T IME-completeness to E XP T IME-completeness, in the case of predicates of
bounded arity, and also if we consider a fixed set of DTGDs. The impact of restricting
            Combined Complexity Bounded Arity Fixed Rules Data Complexity
               2E XP T IME       E XP T IME   E XP T IME       coNP
                UB: [9, Thm. 1]      UB: Thm. 1    UB: Thm. 1    UB: [7, Thm. 19]
               LB: [11, Thm. 6.1]    LB: Thm. 2    LB: Thm. 2   LB: [15, Thm. 4.5]

         Table 1: Complexity of answering (U)ACQs under guarded DTGDs.

the query language to acyclic queries is this only noticeable in these two cases. The
novel results of this section follow:

 1. UACQ answering under guarded DTGDs is in E XP T IME if we focus on predicates
    of bounded arity (Theorem 1) – this is shown by a reduction to satisfiability of
    a slight extension of GFO, and then exploit the fact that this problem is in E XP -
    T IME in case of predicates of bounded arity [22]; and
 2. ACQ answering under fixed sets of guarded DTGDs is E XP T IME-hard (Theorem 2)
    – by simulating the behavior of an alternating linear space Turing machine.

    Existing results from the literature, provide us with the missing optimal upper and
lower bounds, which complete the entire picture of the complexity of (U)ACQ answer-
ing. Firstly, the missing upper bounds are immediately inherited from results presented
in [9], in particular the combined and data complexity of guarded DTGDs, which are in
2E XP T IME and coNP, respectively. Secondly, the coNP-hardness of ACQ answering
under guarded DTGDs follows from [15, Theorem 4.5] – ∧      the CQ employed in the con-
struction of the proof of this result is ∃X∃Y1 ∃Y2 ∃Z1 ∃Z2 ( i∈{1,2} (pi (X, Yi ) ∧ s(Yi ) ∧
ri (X, Zi ) ∧ t(Zi ))), which is clearly acyclic.
    From the results of this section, we observe that the acyclicity of the query does not
make our problem easier w.r.t. the combined and data complexity. However, in the cases
of bounded arity and fixed sets of DTGDs, the complexity decreases from 2E XP T IME to
E XP T IME. Let us now proceed with the formal proofs of our results.


3.1 Complexity Results

We start this section by showing that UACQ answering under guarded DTGDs is in E X -
P T IME in case of predicates of bounded arity. We can assume that each guarded DTGD
                                     ∨
σ is of the form α(X) ∧ ϕ(X) → 16i6m ∃Yψi (X, Y), where ϕ and ψi are conjunc-
tions of atoms, and α is the atom that forms the guard of σ. Each
                                                              ∨ such formula can now
in fact be rewritten as the formula ∀X(α(X) → (ϕ(X) → 16i6m ∃Yψi (X, Y))). It
can be verified that, if each ψi is a single atom, then the above formula falls into GFO.
However, if ψi is a conjunction of atoms, then the formula is “almost” guarded since
the existentially quantified variables are not necessarily guarded. Interestingly, the sat-
isfiability algorithm proposed in [22] is able to treat formulas as the one above, even if
the existentially quantified variables are not guarded, without increasing the complex-
ity – this is explicitly stated in a remark on page 8 of [22]. By exploiting the above
observation, it is now easy to reduce our problem to the satisfiability problem of sen-
tences which are “almost” guarded, which is E XP T IME-complete in case of predicates
of bounded arity [22]. Consider a database D, a set Σ of guarded DTGDs, and a UACQ
Q. The database D itself contains only ground atoms and is thus trivially guarded. The
set Σ can be rewritten as a sentence ΦΣ which is “almost” guarded, as explained above.
Finally, the query Q, although is not guarded, it can be rewritten as a GFO sentence ΦQ
due to acyclicity [20]. Thus, D∪Σ |= Q iff the “almost” GFO sentence (D∧ΦΣ ∧¬ΦQ )
is not satisfiable, and we obtain the following:
Theorem 1. UACQ answering under guarded DTGDs is in E XP T IME in case of pred-
icates of bounded arity.
    We continue this section by showing that ACQ answering under fixed sets of guarded
DTGDs is E XP T IME-hard. The construction simulates an alternating L IN S PACE Turing
machine. Using a fixed set of rules, chains of non-deterministic length can be generated,
where the length represents a number. By linking these chains appropriately, configu-
rations of the Turing machine can be guessed. An acyclic query is then used to check
consistency w.r.t. the configuration and the transition function. The desired result fol-
lows since alternating L IN S PACE already coincides with E XP T IME.
Theorem 2. ACQ answering under fixed sets of guarded DTGDs is E XP T IME-hard.
Proof. The proof is by a reduction from the non-acceptance of an alternating linear
space Turing machine M on input I. Let M = (S, Λ, δ, s0 ), where S = S∀ ⊎ S∃ ⊎
{sa } ⊎ {sr } is a finite set of states partitioned into universal states, existential states, an
accepting state and a rejecting state, Λ = {0, 1, ⊔} is the tape alphabet with ⊔ be the
blank symbol, δ : S×Λ → (S×Λ×{−1, 0, +1})2 is the transition function, and s0 ∈ S
is the initial state. We assume that M is well-behaved and never tries to read beyond
its tape boundaries, always halts, and uses exactly n tape cells, where n = O(|I|). Fur-
thermore, we assume that a rejecting configuration does not have a subsequent config-
uration, while an accepting configuration has only itself as a subsequent configuration.
Finally, we assume that s0 ∈ S∃ , and also that every universal configuration is followed
by two existential configurations and vice versa. The above assumptions can be made,
without sacrificing the generality of our proof, since the non-acceptance problem of M
remains E XP T IME-hard. We are going to construct a database D, a set Σ of DTGDs,
and a UACQ Q such that D ∪ Σ |= Q iff M rejects I, and Σ is a fixed set of guarded
DTGDs. By [9, Lemma 4], the obtained lower bound holds even if we focus on ACQs.
The idea of the proof is to construct trees, which encode possible computation trees of
M on the input string I, by chasing D with Σ, and then exploit Q to check their consis-
tency. On each configuration node v, which represents the configuration Cv of M , we
attach a chain of length n, which mimics the tape in Cv , and a chain of length at most
|S|, which encodes the state of Cv . The formal definition of D, Σ and Q follows.
     The Database D. Let D = {conf ∃ (c)}, where c ∈ C is a special constant which
represents the initial configuration (which is, by assumption, existential).
     The Set Σ. The predicates that we are going to use are self-explanatory, and thus
we proceed with the construction of Σ without describing the predicates of sch(Σ).
 – Each existential (universal) configuration has one (two) successor configuration(s)
   which is (are) universal (existential):
          conf ∃ (X) → ∃Y succ(X, Y ), conf ∀ (Y ),
          conf ∀ (X) → ∃Y ∃Z succ 1 (X, Y ), conf ∃ (Y ), succ 2 (X, Z), conf ∃ (Z).
 – Each configuration has a cell- and state-chain which simulates its tape and encodes
   its state, respectively:

                    conf x (X) → ∃Y cell (X, Y ), for each x ∈ {∃, ∀},
                    cell (X, Y ) → ∃Z cell (Y, Z) ∨ end (Y ),
                     conf x (X) → ∃Y state(X, Y ), for each x ∈ {∃, ∀},
                  state(X, Y ) → ∃Z state(Y, Z) ∨ end (Y ).

 – Finally, we guess the content of each cell and the cursor position:

                         cell (X, Y ) → 0(Y ) ∨ 1(Y ) ∨ ⊔(Y ),
                         cell (X, Y ) → cursor (Y ) ∨ notCursor (Y ).

The construction of Σ is now completed. It is easy to see that Σ does not depend on
M , and also that Σ is a set of guarded DTGDs.
    The UACQ Q. We now proceed with the construction of Q which ensures that each
model of D ∪ Σ encodes a valid computation tree. Roughly, Q consists of the following
disjuncts:
 1. Qinitial for checking that in the initial configuration the tape contains the input
    string I = a1 . . . ak , the state is s0 , and the cursor is at the first cell;
 2. Qlength for checking that cell-chains are of length n, and state-chains are of length
    at most |S|;
 3. Qcursor for checking that exactly one cell is pointed by the cursor;
 4. Qtrans for checking the consistency w.r.t. the transition function of M ; and
 5. Qinertia for checking that the tape cells not under the cursor keep their old value
    during a transition.
We assume a fixed order on S. Given a state s ∈ S, let #s be its position in this order.
The length of the state-chain which encodes s is #s . For a predicate p ∈ {cell , state},
pi (X, Y ) is used as a shorthand for a chain of length i along predicate p, namely
∃Z1 . . . ∃Zi−1 (p(X, Z1 ) ∧ . . . ∧ p(Zi−1 , Y )). A useful subquery that we will use is
                                               (                    )
                      length[p]k (X) ≡ ∃Y pk (X, Y ) ∧ end (Y )

stating that there exists a p-chain of length k. Having all the necessary ingredients in
place, we are now ready to formalize the components of Q.
 1. Qinitial is defined as (recall that c ∈ C represents the initial configuration)
       ∨         ∨         (                                 )
                       ∃X conf ∃ (c) ∧ cell i (c, X) ∧ a(X) ∨
       i∈[n] a∈Λ\{ai }
                              ∨
                                       (conf ∃ (c) ∧ length[state]#s (c)) ∨
                           s∈S\{s0 }
                                          ∨       (                                     )
                                                ∃X conf ∃ (c) ∧ cell i (c, X) ∧ head (X) .
                                        1<i6n
 2. Qlength is defined as
        ∨
             ∃X length[cell ]i (X) ∨ ∃X∃Y cell n+1 (X, Y )
         16i<n

                                                              ∨ ∃X length[state]|S|+1 (X).
 3. Qcursor is defined as
        ∨                 (                                                     )
               ∃X∃Y ∃Z cell i (X, Y ) ∧ cell j (X, Z) ∧ cursor (Y ) ∧ cursor (Z) .
        16i<j6n

 4. In the definition of Qtrans , we use the function f : S → {∃, ∀}, where f (s) = ∃
    if s ∈ S∃ ; otherwise, f (s) = ∀. Moreover, we use the binary operator ⊙s , where
    s ∈ S, which is defined as ∧, if s ∈ S∃ , and as ∨, if s ∈ S∀ . Qtrans is defined as
               ∨                      ⊙
                                      s        ∨
                                                       ∃X∃Y ∃Z∃V ∃W
(s,a)→((s1 ,a1 ,d1 ),(s2 ,a2 ,d2 ))̸∈δ i∈{1,2} j∈[n]
  (
      conf f (s) (X) ∧ length[state]#s (X) ∧ cell j (X, Y ) ∧ cursor (Y ) ∧ a(Y )∧
                         (succ(X, Z) ∨ succ 1 (X, Z) ∨ succ 2 (X, Z))∧
                                                                                         )
        length[state]#si (Z) ∧ cell j (Z, V ) ∧ ai (V ) ∧ cell j+di (Z, W ) ∧ cursor (W ) .

 5. Finally, Qinertia is defined as
        ∨             ∨          ∨
                                           ∃X∃Y ∃Z1 ∃Z2 (conf x (X)∧
        x∈{∃,∀} (a,a′ )∈Λ×Λ,a̸=a′ i∈[n]

             (succ(X, Y ) ∨ succ 1 (X, Y ) ∨ succ 2 (X, Y ))∧
                                                                                           )
                  cell i (X, Z1 ) ∧ cell i (Y, Z2 ) ∧ notCursor (Z1 ) ∧ a1 (Z1 ) ∧ a2 (Z2 ) .
This completes the construction of Q. It is easy to verify that each disjunct of Q is an
acyclic query.

     By construction, and the assumption that a rejecting configuration does not have a
successor configuration, while an accepting configuration has only itself as a successor,
it is not difficult to see that D ∪ Σ |= Q iff M rejects I.

    By Theorems 1 and 2, as well as inherited results described earlier, we can now state
our main result that closes the complexity picture for guarded DTGDs w.r.t. answering
(unions of) acyclic conjunctive queries.
Corollary 1. (U)ACQ answering under guarded DTGDs is 2E XP T IME-complete in the
combined complexity, E XP T IME-complete in case of fixed arity and fixed sets of DT-
GDs, and coNP-complete in the data complexity.
    It can be seen that, in comparison to arbitrary (unions of) conjunctive queries, the
complexity in the general case remains the same. However, there is a notable complexity
decrease from 2E XP T IME to E XP T IME in case where at least the arity is fixed. However,
the data complexity remains coNP-complete.
4     Consequences to DLs
In this section, we will investigate the relationship between our previously discussed
formalism of guarded DTGDs with DLs. Our aim is to show that our techniques and
results can be used as a generic tool for establishing results on UACQ answering under
several central DLs that can be found in the literature. To this end, we focus on the
key DL ALCHI, that is, the well known DL ALC extended with role hierarchies and
inverse roles. The set of ALCHI concepts is the smallest set such that (1) any atomic
concept, as well as ⊥ and ⊤ are concepts; (2) for any role R, if C and D are concepts,
then so are C ⊓ D, ¬C, ∀R.C, ∀R− .C, ∃R.C and ∃R− .C. A TBox consists of a finite
set of concept inclusions of the form C ⊑ D, where C and D are concepts, as well as
a finite set of role inclusions R ⊑ S, where R and S are (possibly inverse) roles. An
ABox consists of a finite set of assertions of the form C(a) or R(a, b), where a and
b are individuals and C and R is a concept and role, respectively. A TBox T together
with an ABox A consists a knowledge base (KB) K = (T , A). Its semantics are defined
as usual via first-order interpretations.

4.1   A Generic Complexity Result
By exploiting our techniques and results, we can establish the following result about
UACQ answering; L1 ⊆ L2 means that L1 is a subformalism of L2 :

Theorem 3. Consider a DL L such that L ⊆ ALCHI and L is powerful enough for
expressing the following inclusion assertions:

                   A1 ⊑ A2 ⊔ A3         A ⊑ ∃R.⊤         ∃R− .⊤ ⊑ A,

where A, A1 , A2 , A3 are concept names. Then, UACQ answering under L is E XP T IME-
complete in combined complexity, even when the TBox is fixed.

    As we shall see, the above generic result closes some interesting open problems in
a uniform way. The rest of this section is devoted to establish Theorem 3.
    Upper Bound. The upper bound is obtained by reducing UACQ answering under
ALCHI to UACQ answering under guarded DTGDs. First, we discuss how complex
concepts can be eliminated from the given ABox by pushing them in the TBox. Such
a reformulation of the given KB will allow us to consider the ABox as a relational
database. Consider an ALCHI KB K = (T , A). We construct an ABox A′ from A
by replacing each assertion C(a), where C is a complex concept, by C ⋆ (a), where C ⋆
is an auxiliary concept name not occurring in A. Then, the TBox T ′ is constructed by
adding to T an assertion of the form C ⋆ ⊑ C, for each complex concept C occurring in
A. It is easy to see that (T , A) and (T ′ , A′ ) are equivalent w.r.t. UACQ answering. Let
us now explain how an ALCHI TBox can be normalized in such a way that can be seen
as a set of guarded DTGDs. In fact, we exploit the normalization procedure proposed
in [25] for ALCH, i.e., ALCHI without inverse roles. An ALCHI TBox is in normal
form if it only contains axioms of the form given on the left column of Table 2. It is
implicit in [25] that every ALCH TBox T can be transformed in polynomial time into
an ALCH TBox in normal form which is equivalent to T w.r.t. UACQ answering. This
          ALCHI Axiom                           First-Order Representation
       A1 ⊓ . . . ⊓ An ⊑ ⊥                  ∀X(A1 (X) ∧ . . . ∧ An (X) → ⊥)
 A1 ⊓ . . . ⊓ An ⊑ B1 ⊔ . . . ⊔ Bm ∀X(A1 (X) ∧ . . . ∧ An (X) → B1 (X) ∨ . . . ∨ Bm (X))
             A ⊑ ∃R.B                      ∀X(A(X) → ∃Y R(X, Y ) ∧ B(Y ))
             ∃R.A ⊑ B                       ∀X(R(X, Y ) ∧ A(Y ) → B(X))
             A ⊑ ∀R.B                       ∀X(A(X) ∧ R(X, Y ) → B(Y ))
               R⊑S                           ∀X∀Y (R(X, Y ) → S(X, Y ))
              R ⊑ S−                         ∀X∀Y (R(X, Y ) → S(Y, X))
                 Table 2: Normal form and first-order representation.

result can be straightforwardly extended to ALCHI. From the above discussion, we
immediately get the following auxiliary result:

Lemma 1. Consider an ALCHI KB K = (T , A). We can construct in polynomial
time an ALCHI KB K′ = (T ′ , A′ ), where T ′ is in normal form and A′ contains only
concept and role names, such that K |= Q iff K′ |= Q, for every UACQ Q.

    As shown in Table 2, given an ALCHI TBox T in normal form, every axiom of T
can be equivalently rewritten as a first-order sentence which is either a guarded DTGD
or a constraint of the form ∀X(A1 (X) ∧ . . . ∧ An (X) → ⊥), where ⊥ denotes the
truth constant false, which may lead to an inconsistency. Such inconsistencies can be
encoded in the query by considering the left-hand side of the constraints as disjuncts
of the given UACQ. Moreover, observe that those disjuncts will be ACQs, and thus
the obtained query remains acyclic. Now, an ABox which contains only concept and
role names (and not complex concepts) can be naturally seen as a relational database.
Therefore, from Lemma 1 and the above discussion, we conclude the following: given
an ALCHI KB K = (T , A) and a UACQ Q, we can construct in polynomial time
a database DA , a set ΣT of guarded DTGDs, and a UACQ QT such that K |= Q iff
DA ∪ ΣT |= QT , and the next result follows:

Proposition 1. UACQ answering under ALCHI KBs can be polynomially reduced to
UACQ answering under guarded DTGDs.

    Clearly, the above proposition, combined with our result on UACQ answering under
guarded DTGDs, implies the upper bound stated in Theorem 3.
    Lower Bound. The desired lower bound is inherited from Theorem 2. More pre-
cisely, the guarded DTGDs employed in the proof of Theorem 2 have one of the fol-
lowing forms which corresponds to DL axioms (possibly more than one):

                 A(X) → ∃Y R(X, Y ) ≡ A ⊑ ∃R.⊤
                    R(X, Y ) → A(Y ) ≡ ∃R− .⊤ ⊑ A
    R(X, Y ) → ∃Z S(Y, Z) ∨ A(Y ) ≡ ∃R− .⊤ ⊑ B1 B1 ⊑ B2 ⊔ A                  B2 ⊑ ∃S.⊤
R(X, Y ) → A1 (Y ) ∨ A2 (Y ) ∨ A3 (Y ) ≡ ∃R− .⊤ ⊑ B B ⊑ A1 ⊔ A23
                                         A23 ⊑ A2 ⊔ A3 ,

and the next result follows.
Proposition 2. Consider a DL L which is powerful enough for expressing the following
inclusion assertions: A1 ⊑ A2 ⊔A3 , A ⊑ ∃R.⊤ and ∃R− .⊤ ⊑ A, where A, A1 , A2 , A3
are concept names. Then, UACQ answering under L is E XP T IME-hard in combined
complexity, even when the TBox is fixed.

4.2   Other Description Logics.
In [3] the data complexity of answering arbitrary CQs over DL-LiteH bool and DL-Litebool
was investigated and shown to be coNP-complete; more recently, in [9], the combined
complexity of this question was investigated and the problem shown to be 2E XP T IME-
hard for the former DL. However, to the best of our knowledge, there do not exist
any results on answering UACQs. Since DL-Litebool is already equipped with limited
existential quantification, role inverse and union, Theorem 3 implies the following:
Corollary 2. UACQ answering under DL-Litebool is E XP T IME-complete in combined
complexity, even when the TBox is fixed.
    By extension, for every DL that is a subset of ALCHI and that possesses the fea-
tures listed in Theorem 3, answering unions of acyclic queries is E XP T IME-complete,
even if the TBox is fixed. In addition to the above corollary, this applies to, e.g., DL-
LiteH
    bool , ELU , i.e., the extension of the well-known DL EL with union of concepts, and
ELUHI, that is, the extension of ELU with role hierarchies and inverse roles.


5     Conclusion
In this paper, we studied the acyclic query answering problem under guarded disjunc-
tive TGDs. While in the combined complexity this task remains 2E XP T IME-complete,
in case of fixed arity and fixed sets of rules, we observe the classical positive effect
of restricting the query language to ACQs, as the complexity drops to E XP T IME. We
also investigated the impact of our results on acyclic query answering under DL-based
formalisms; in particular, we showed that for any DL at most as expressive as ALCHI,
the complexity of query answering is in E XP T IME if the query language is restricted to
UACQs. We also showed that this problem for DLs equipped with limited existential
quantification, role inverse and union is E XP T IME-hard, even when the TBox is fixed.

Acknowledgements. Pierre thankfully acknowledges projet équipe associée Inria North-
European Labs 2013-2016, Michael his DOC Fellowship of the Austrian Academy
of Sciences, and Andreas the EPSRC grant EP/J008346/1 (PrOQAW). We thank the
anonymous referees for many helpful comments.


References
 1. Alviano, M., Faber, W., Leone, N., Manna, M.: Disjunctive Datalog with existential quanti-
    fiers: Semantics, decidability, and complexity issues. TPLP 12(4-5), 701–718 (2012)
 2. Andréka, H., van Benthem, J., Németi, I.: Modal languages and bounded fragments of pred-
    icate logic. J. Philosophical Logic 27, 217–274 (1998)
 3. Artale, A., Calvanese, D., Kontchakov, R., Zakharyaschev, M.: The DL-Lite family and re-
    lations. J. Artif. Intell. Res. 36, 1–69 (2009)
 4. Baader, F.: Least common subsumers and most specific concepts in a description logic with
    existential restrictions and terminological cycles. In: Proc. of IJCAI. pp. 319–324 (2003)
 5. Baget, J.F., Leclère, M., Mugnier, M.L., Salvat, E.: On rules with existential variables: Walk-
    ing the decidability line. Artif. Intell. 175(9-10), 1620–1654 (2011)
 6. Baget, J.F., Mugnier, M.L., Rudolph, S., Thomazo, M.: Walking the complexity lines for
    generalized guarded existential rules. In: Proc. of IJCAI. pp. 712–717 (2011)
 7. Bárány, V., Gottlob, G., Otto, M.: Querying the guarded fragment. In: Proc. of LICS. pp.
    1–10 (2010)
 8. Beeri, C., Vardi, M.Y.: The implication problem for data dependencies. In: Proc. of ICALP.
    pp. 73–85 (1981)
 9. Bourhis, P., Morak, M., Pieris, A.: The impact of disjunction on query answering under
    guarded-based existential rules. In: Proc. IJCAI (2013)
10. Calı̀, A., Gottlob, G., Kifer, M.: Taming the infinite chase: Query answering under expressive
    relational constraints. In: Proc. of KR. pp. 70–80 (2008)
11. Calı̀, A., Gottlob, G., Kifer, M.: Taming the infinite chase: Query answering under expressive
    relational constraints. J. Artif. Intell. Res. (JAIR) 48, 115–174 (2013)
12. Calı̀, A., Gottlob, G., Lukasiewicz, T.: A general Datalog-based framework for tractable
    query answering over ontologies. J. Web. Sem. 14, 57–83 (2012)
13. Calı̀, A., Gottlob, G., Pieris, A.: Towards more expressive ontology languages: The query
    answering problem. Artif. Intell. 193, 87–128 (2012)
14. Calvanese, D., De Giacomo, G., Lembo, D., Lenzerini, M., Rosati, R.: Tractable reasoning
    and efficient query answering in description logics: The DL-Lite family. J. Autom. Reasoning
    39(3), 385–429 (2007)
15. Calvanese, D., Giacomo, G.D., Lembo, D., Lenzerini, M., Rosati, R.: Data complexity of
    query answering in description logics. Artif. Intell. 195, 335–360 (2013)
16. Chekuri, C., Rajaraman, A.: Conjunctive query containment revisited. Theor. Comput. Sci.
    239(2), 211–229 (2000)
17. Deutsch, A., Tannen, V.: Reformulation of XML queries and constraints. In: Proc. of ICDT.
    pp. 225–241 (2003)
18. Eiter, T., Gottlob, G., Mannila, H.: Disjunctive Datalog. ACM Trans. Database Syst. 22(3),
    364–418 (1997)
19. Fagin, R., Kolaitis, P.G., Miller, R.J., Popa, L.: Data exchange: Semantics and query answer-
    ing. Theor. Comput. Sci. 336(1), 89–124 (2005)
20. Gottlob, G., Leone, N., Scarcello, F.: Robbers, marshals, and guards: game theoretic and
    logical characterizations of hypertree width. J. Comput. Syst. Sci. 66(4), 775–808 (2003)
21. Gottlob, G., Manna, M., Morak, M., Pieris, A.: On the complexity of ontological reasoning
    under disjunctive existential rules. In: Proc. of MFCS. pp. 1–18 (2012)
22. Grädel, E.: On the restraining power of guards. J. Symb. Log. 64(4), 1719–1742 (1999)
23. Krötzsch, M., Rudolph, S.: Extending decidable existential rules by joining acyclicity and
    guardedness. In: Proc. of IJCAI. pp. 963–968 (2011)
24. Leone, N., Manna, M., Terracina, G., Veltri, P.: Efficiently computable Datalog∃ programs.
    In: Proc. of KR (2012)
25. Simancik, F., Kazakov, Y., Horrocks, I.: Consequence-based reasoning beyond horn ontolo-
    gies. In: Proc. of IJCAI. pp. 1093–1098 (2011)
26. Thomazo, M., Baget, J.F., Mugnier, M.L., Rudolph, S.: A generic querying algorithm for
    greedy sets of existential rules. In: Proc. of KR (2012)