<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Non-Uniform Data Complexity of Query Answering in Description Logics</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Carsten Lutz</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Frank Wolter</string-name>
          <email>Wolter@liverpool.ac.uk</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, University of Bremen</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Computer Science, University of Liverpool</institution>
          ,
          <country country="UK">UK</country>
        </aff>
      </contrib-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>In recent years, the use of ontologies to access instance data has become increasingly
popular. The general idea is that an ontology provides a vocabulary or conceptual model
for the application domain, which can then be used as an interface for querying instance
data and to derive additional facts. In this emerging area, called ontology-based data
access (OBDA), it is a central research goal to identify ontology languages for which
query answering scales to large amounts of instance data. Since the size of the data is
typically very large compared to the size of the ontology and the size of the query, the
central measure for such scalability is provided by data complexity—the complexity of
query answering where only the data is considered to be an input, but both the query
and the ontology are fixed.</p>
      <p>
        In description logic (DL), ontologies take the form of a TBox, instance data is stored
in an ABox, and the most important class of queries are conjunctive queries (CQs).
A fundamental observation regarding this setup is that, for expressive DLs such as
ALC and SHIQ, the complexity of query answering is coNP-complete [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] and thus
intractable (when speaking of complexity, we always mean data complexity). The most
popular strategy to avoid this problem is to replace ALC and SHIQ with less
expressive DLs that are Horn in the sense that they can be embedded into the Horn fragment
of first-order (FO) logic and have minimal models that can be exploited for PTIME
query answering. Horn DLs in this sense include, for example, logics from the EL and
DL-Lite families as well as Horn-SHIQ, a large fragment of SHIQ for which
CQ-answering is still in PTIME [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. While CQ-answering in Horn-SHIQ and the EL
family of DLs is also hard for PTIME, the problem has even lower complexity in
DL-Lite. In fact, the design goal of DL-Lite was to achieve FO-rewritability, i.e., that any
CQ q and TBox T can be rewritten into an FO query q′ such that the answers to q
w.r.t. T coincide with the answers that a standard database system produces for q′ [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
Achieving this goal requires CQ-answering to be in AC<sup>0</sup>.
      </p>
      <p>It thus seems that the data complexity of query answering in a DL context is
well-understood. However, all results discussed above are on the level of logics, i.e., each
result concerns a class of TBoxes that is defined syntactically through expressibility in a
certain logic, but no attempt is made to identify more structure inside these classes. The
aim of this paper is to advocate a fresh look at the subject by taking a novel approach.
Specifically, we advocate a non-uniform study of the complexity of query answering
by considering data complexity on the level of individual TBoxes. For a TBox T , we
say that CQ-answering w.r.t. T is in PTIME if for every CQ q, there is a PTIME
algorithm that, given an ABox A, computes the answers to q in A w.r.t. T . In a similar way,
we can define coNP-hardness and FO-rewritability on the TBox level. The non-uniform
perspective allows us to investigate more fine-grained questions regarding the data
complexity of query answering such as: given an expressive DL L such as ALC or SHIQ,
how can one characterize those L-TBoxes T for which CQ-answering is in PTIME?
How can we do the same for FO-rewritability? Is there a dichotomy for the complexity
of query answering w.r.t. TBoxes formulated in L, such as: for any L-TBox T,
CQ-answering w.r.t. T is either in PTIME or coNP-hard?</p>
      <p>In this paper, we consider TBoxes formulated in the expressive DL ALCFI, answer
some of the above questions, and take some steps towards others. Our main results are:</p>
      <p>1. there is a dichotomy between PTIME and coNP-complete for CQ-answering w.r.t.
ALC-TBoxes if, and only if, Feder and Vardi’s dichotomy conjecture that
“constraint satisfaction problems (CSPs) with finite template are in PTIME or
NP-complete” [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] is true; the same holds for ALCI-TBoxes;</p>
      <p>2. there is no dichotomy between PTIME and coNP-complete for CQ-answering w.r.t.
ALCF-TBoxes, unless PTIME = NP; moreover, PTIME-complexity of CQ-answering
and many related problems are undecidable for ALCF;</p>
      <p>3. there is a dichotomy between PTIME and coNP-complete for CQ-answering w.r.t.
ALCFI-TBoxes of depth one, i.e., TBoxes where concepts have role depth 1;</p>
      <p>4. FO-rewritability is decidable for Horn-ALCFI-TBoxes of depth two and all
Horn-ALCF-TBoxes.</p>
      <p>
        It should be noted that there has been steady progress regarding the dichotomy
conjecture of Feder and Vardi over the last fifteen years and though the problem is still
open, a solution does not seem completely out of reach [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ]. Our proof of Point 1 is
based on a novel connection between CSPs and query answering w.r.t. ALCI-TBoxes
that can be exploited to transfer numerous results from the CSP world to query
answering w.r.t. ALCI-TBoxes and related problems. For example, together with [
        <xref ref-type="bibr" rid="ref16 ref5">16, 5</xref>
        ] we
obtain the following results on ‘FO-rewritability of ABox consistency’:</p>
      <p>5. Given an ALCI-TBox T, it can be decided in NEXPTIME whether there is an
FO-sentence φ<sub>T</sub> such that for all ABoxes A, A is consistent w.r.t. T iff A viewed as an
FO-structure satisfies φ<sub>T</sub>. Moreover, such a sentence φ<sub>T</sub> exists iff ABox
consistency w.r.t. T can be decided in non-uniform AC<sup>0</sup>. Finally, if no such sentence φ<sub>T</sub>
exists, then ABox consistency w.r.t. T is LOGSPACE-hard (under FO-reductions).</p>
      <p>To prove our results, we introduce some new notions that are relevant for studying
the questions raised and prove some additional results of general interest. A central
such notion is materializability of a TBox T, which formalizes the existence of
minimal models as known from Horn-DLs. We show that, in the case of TBoxes of depth
one, materializability characterizes PTIME CQ-answering, which allows us to establish
Point 3 above. For TBoxes of unrestricted depth, non-materializability still provides a
sufficient condition for coNP-hardness of CQ-answering. We also develop the notion
of unraveling tolerance of a TBox T, which provides a sufficient condition for query
answering to be in PTIME. The resulting upper bound strictly generalizes the known
result that CQ-answering in Horn-ALCFI is in PTIME. Our framework also allows us
to formally establish some common intuitions and beliefs held in the context of
CQ-answering in description logics. For example, we show that for any ALCFI-TBox T,
CQ-answering is in PTIME iff answering positive existential queries is in PTIME iff
answering ELI-instance queries is in PTIME, and likewise for FO-rewritability.
Another observation in this spirit is that an ALCFI-TBox is materializable (has minimal
models) iff it is convex (a notion related to the entailment of disjunctions).
      </p>
      <p>Most proofs in this paper are deferred to the (appendix of the) long version, which
is available at http://www.csc.liv.ac.uk/ frank/publ/publ.html.</p>
    </sec>
    <sec id="sec-2">
      <title>Preliminaries</title>
      <p>We use standard notation for the syntax and semantics of ALCFI and other
well-known DLs. Our TBoxes are finite sets of concept inclusions C ⊑ D, where C and D
are potentially compound concepts, and functionality assertions func(r), where r is a
potentially inverse role. ABoxes are finite sets of assertions A(a) and r(a, b) with A a
concept name and r a role name. We use Ind(A) to denote the set of individual names
used in the ABox A and sometimes write r<sup>−</sup>(a, b) ∈ A instead of r(b, a) ∈ A. For the
interpretation of individual names, we make the unique name assumption.</p>
      <p>A first-order query (FOQ) q(x) is a first-order formula with free variables x
constructed from atoms A(t), r(t, t′), and t = t′ (where t, t′ range over individual names
and variables) using negation, conjunction, disjunction, and existential quantification.
The variables in x are the answer variables of q. A FOQ without answer variables is
Boolean.</p>
      <sec id="sec-2-1">
        <title>We say that a tuple a ⊆ Ind(A) is an answer to q(x) in an interpretation I if</title>
        <p>I ⊨ q[a], where q[a] results from replacing the answer variables x in q(x) with a. A
tuple a ⊆ Ind(A) is a certain answer to q(x) in A given T, in symbols T, A ⊨ q(a),
if I ⊨ q[a] for all models I of A and T. Set cert<sub>T</sub>(q, A) = {a | T, A ⊨ q(a)}.
A positive existential query (PEQ) q(x) is a FOQ without negation and equality and
a conjunctive query (CQ) is a positive existential query without disjunction. If C is
an ELI-concept and a ∈ N<sub>I</sub>, then C(a) is an ELI-query (ELIQ). EL-queries (ELQs)
are defined analogously. Note that ELI-queries and EL-queries are always Boolean. In
what follows, we sometimes slightly abuse notation and use FOQ to denote the set of
all first-order queries, and likewise for CQ, PEQ, ELIQ, and ELQ.</p>
      </sec>
      <sec id="sec-2-2">
        <title>Definition 1. Let T be an ALCFI-TBox. Let Q ∈ {CQ, PEQ, ELIQ, ELQ}. Then</title>
        <p>– Q-answering w.r.t. T is in PTIME if for every q(x) ∈ Q, there is a polytime
algorithm that computes, given an ABox A, the answer cert<sub>T</sub>(q, A);</p>
        <p>– Q-answering w.r.t. T is coNP-hard if there is a Boolean q ∈ Q such that, given an
ABox A, it is coNP-hard to decide whether T, A ⊨ q;</p>
        <p>– T is FO-rewritable for Q if for every q(x) ∈ Q one can effectively construct an
FO-formula q′(x) such that for every ABox A, cert<sub>T</sub>(q, A) = {a | I<sub>A</sub> ⊨ q′(a)},
where I<sub>A</sub> denotes A viewed as an interpretation.</p>
        <p>The above notions of complexity are rather robust under changing the query language:
as we show next, neither the PTIME bounds nor FO-rewritability depend on whether
we consider PEQs, CQs, or ELIQs.</p>
      </sec>
      <sec id="sec-2-3">
        <title>Theorem 1. For all ALCFI-TBoxes T, the following equivalences hold:</title>
      </sec>
      <sec id="sec-2-4">
        <title>1. CQ-answering w.r.t. T is in PTIME iff PEQ-answering w.r.t. T is in PTIME iff</title>
      </sec>
      <sec id="sec-2-5">
        <title>ELIQ-answering w.r.t. T is in PTIME;</title>
      </sec>
      <sec id="sec-2-6">
        <title>2. T is FO-rewritable for CQ iff it is FO-rewritable for PEQ iff it is FO-rewritable</title>
        <p>for ELIQ.</p>
      </sec>
      <sec id="sec-2-7">
        <title>If T is an ALCF-TBox, then we can replace ELIQ in Points 1 and 2 with ELQ.</title>
        <p>The proof is based on Theorems 2 and 3 below. Theorem 1 allows us to (sometimes)
speak of the ‘complexity of query answering’ without reference to a query language.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Materializability</title>
      <p>An important tool we use for analyzing the complexity of query answering is the notion
of materializability of a TBox T , which means that computing the certain answers to
any query q and ABox A w.r.t. T reduces to evaluating q in a single model of A and T .</p>
      <sec id="sec-3-1">
        <title>Definition 2. Let T be an ALCFI-TBox and Q ∈ {CQ, PEQ, ELIQ, ELQ}. T is Q-materializable if for every ABox A that is consistent w.r.t. T, there exists a model I of T and A such that I ⊨ q[a] iff T, A ⊨ q(a) for all q(x) ∈ Q and a ⊆ Ind(A).</title>
        <p>We show that PEQ-, CQ-, and ELIQ-materializability coincide (and for ALC-TBoxes, all
these also coincide with ELQ-materializability). Materializability is also equivalent to
the following disjunction property (sometimes also called convexity): a TBox T has the
ABox disjunction property if for all ABoxes A and ELIQs C<sub>1</sub>(a<sub>1</sub>), …, C<sub>n</sub>(a<sub>n</sub>), from
T, A ⊨ C<sub>1</sub>(a<sub>1</sub>) ∨ … ∨ C<sub>n</sub>(a<sub>n</sub>) it follows that T, A ⊨ C<sub>i</sub>(a<sub>i</sub>) for some i ≤ n.</p>
      </sec>
      <sec id="sec-3-2">
        <title>Theorem 2. Let T be an ALCFI-TBox. The following equivalences hold: T is PEQ-materializable</title>
        <p>iff T is CQ-materializable iff T is ELIQ-materializable iff T has the
ABox disjunction property.</p>
      </sec>
      <sec id="sec-3-3">
        <title>If T is an ALC-TBox, the above are equivalent to ELQ-materializability.</title>
        <p>
          Because of Theorem 2, we sometimes use the term materializability without reference
to a query language. We call an interpretation I that satisfies the condition formulated
in Definition 2 for PEQs a minimal model of T and A. Note that in many cases, only
infinite minimal models exist. For example, for T = {A ⊑ ∃r.A} and A = {A(a)},
every minimal model I of T and A comprises an infinite r-chain starting at a<sup>I</sup>. Every
TBox that is equivalent to an FO Horn sentence (in the general sense of [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]) is
materializable: to construct a minimal model for such a TBox T and some ABox A, one
can take the direct product of all at most countable models of T and A (for additional
information on direct products in DLs, see [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]). Conversely, however, there are simple
materializable TBoxes that are not equivalent to FO Horn sentences.
        </p>
        <p>Example 1. Let T = {∃r.(A ⊓ ¬B ⊓ ¬E) ⊑ ∃r.(¬A ⊓ ¬B ⊓ ¬E)}. One can easily
show that T is not preserved under direct products; thus, it is not equivalent to a Horn
sentence. However, one can construct a minimal model I for T and any ABox A by
taking the interpretation I<sub>A</sub> obtained by viewing A as an interpretation and then adding,
for any a ∈ Ind(A) with a ∈ (∃r.(A ⊓ ¬B ⊓ ¬E))<sup>I<sub>A</sub></sup>, a fresh d<sub>a</sub> such that (a, d<sub>a</sub>) ∈ r<sup>I</sup>
and d<sub>a</sub> is not in the extension of any concept name. PEQ-answering w.r.t. T is
FO-rewritable since for any PEQ q, cert<sub>T</sub>(q, A) consists of precisely the answers to q in I<sub>A</sub>
(i.e., no query rewriting is necessary). Thus, PEQ-answering w.r.t. T is also in PTIME.</p>
        <p>We show that materializability is a necessary condition for query answering being in
PTIME.</p>
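        <p>The minimal-model construction of Example 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ABox is passed as a dictionary of concept extensions plus a set of r-edges, the function name minimal_model and the naming scheme "d_" + a for fresh elements are hypothetical, and the fresh names are assumed not to clash with individuals already in the ABox.</p>
        <preformat>
```python
from typing import Dict, Set, Tuple


def minimal_model(concepts: Dict[str, Set[str]], roles: Set[Tuple[str, str]]):
    """Extend an ABox, viewed as an interpretation, as described in Example 1.

    concepts maps the concept names "A", "B", "E" to the sets of individuals
    asserted to belong to them; roles is the set of r-edges (a, b).
    Returns the map from individuals to their fresh successors and the
    extended set of r-edges.
    """
    in_a = concepts.get("A", set())
    in_b = concepts.get("B", set())
    in_e = concepts.get("E", set())
    new_roles = set(roles)
    fresh = {}
    for (a, b) in sorted(roles):
        # a satisfies the left-hand side of the inclusion if some r-successor b
        # is in A but neither in B nor in E (closed-world reading of the ABox).
        if b in in_a and b not in in_b and b not in in_e and a not in fresh:
            fresh[a] = "d_" + a  # hypothetical name for the fresh element d_a
            new_roles.add((a, fresh[a]))
    # The fresh elements are deliberately added to no concept extension.
    return fresh, new_roles


if __name__ == "__main__":
    print(minimal_model({"A": {"b"}}, {("a", "b")}))
```
        </preformat>
        <p>Since the fresh elements carry no concept names, they satisfy the right-hand side of the inclusion, which is why one fresh successor per triggering individual suffices in this sketch.</p>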
      </sec>
      <sec id="sec-3-4">
        <title>Theorem 3. If an ALCFI-TBox T (ALCF-TBox T) is not materializable, then ELIQ-answering (ELQ-answering) is coNP-hard w.r.t. T.</title>
        <p>
          The proof uses the violation of the ABox disjunction property stated in Theorem 2 and
generalizes the reduction of 2+2-SAT used in [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ] to prove that instance checking in a
variant of EL is coNP-hard.
        </p>
        <p>
          Materializability is not a sufficient condition for query answering to be in PTIME. In
fact, we show that for any non-uniform constraint satisfaction problem, there is a
materializable ALC-TBox for which Boolean CQ-answering has the same complexity, up to
complementation of the complexity class. For two finite relational FO-structures R and
R′ over relation symbols Σ, we write Hom(R′, R) if there is a homomorphism from
R′ to R. The non-uniform constraint satisfaction problem for R, denoted by CSP(R),
is the problem to decide, for every finite R′ over Σ, whether Hom(R′, R).
Numerous algorithmic problems, among them many NP-complete ones such as k-SAT and
k-colourability of graphs, can be given in the form CSP(R). It is known that every
problem of the form CSP(R) is polynomially equivalent to some CSP(R′) with R′ a
digraph [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. Thus, in what follows we can restrict ourselves to considering CSPs of
the form CSP(I), where I is a DL interpretation. A signature Σ is a set of concept and
role names. The signature sig(T) of a TBox T is the set of concept and role names that
occur in T. A Σ-TBox is a TBox that uses symbols from Σ only. Similar notation is
used for ABoxes, concepts, and interpretations. For an ABox A, we denote by A<sub>Σ</sub> the
subset of A containing symbols from Σ only. We will often not distinguish between
ABoxes and finite interpretations.
        </p>
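        <p>To make the relation Hom(R′, R) concrete for digraphs, the following Python sketch tests it by brute force. It is a minimal illustration, not an efficient CSP solver: it runs in time exponential in the size of the source structure, and the names hom_exists and symmetric_closure are hypothetical helpers introduced here.</p>
        <preformat>
```python
from itertools import product


def hom_exists(src_nodes, src_edges, tgt_nodes, tgt_edges):
    """Return True iff some map from src_nodes to tgt_nodes preserves all edges."""
    src_nodes = list(src_nodes)
    tgt_nodes = list(tgt_nodes)
    tgt_edges = set(tgt_edges)
    # Enumerate every candidate map h and check the homomorphism condition.
    for image in product(tgt_nodes, repeat=len(src_nodes)):
        h = dict(zip(src_nodes, image))
        if all((h[u], h[v]) in tgt_edges for (u, v) in src_edges):
            return True
    return False


def symmetric_closure(edges):
    """Add the reverse of every edge, turning a digraph into an undirected graph."""
    return list(edges) + [(v, u) for (u, v) in edges]


# Template for 2-colourability: a single undirected edge between two colours.
K2_NODES = [0, 1]
K2_EDGES = [(0, 1), (1, 0)]

if __name__ == "__main__":
    triangle = symmetric_closure([(0, 1), (1, 2), (2, 0)])
    square = symmetric_closure([(0, 1), (1, 2), (2, 3), (3, 0)])
    print(hom_exists([0, 1, 2], triangle, K2_NODES, K2_EDGES))
    print(hom_exists([0, 1, 2, 3], square, K2_NODES, K2_EDGES))
```
        </preformat>
        <p>With the two-element template, a homomorphism exists exactly when the input graph is 2-colourable, so the odd cycle is rejected and the even cycle is accepted.</p>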
      </sec>
      <sec id="sec-3-5">
        <title>Theorem 4. For every non-uniform constraint satisfaction problem CSP(I), one can</title>
        <p>compute in polytime a materializable ALC-TBox T such that for all ABoxes A,
1. Hom(A<sub>Σ</sub>, I), with Σ = sig(I), iff A is consistent w.r.t. T;</p>
      </sec>
      <sec id="sec-3-6">
        <title>2. for any Boolean CQ q, answering q w.r.t. T is polynomially reducible to the complement of CSP(I).</title>
        <p>The proof of Theorem 4 relies on the existence of ALC-concepts H whose value H<sup>I</sup> in
interpretations I cannot be detected directly using CQs, but which can be used in a
TBox to influence the values A<sup>I</sup> of concept names A and, therefore, have an indirect
effect on the answers to CQs. From the viewpoint of CQ-answering, they thus
behave similarly to second-order variables. More precisely, let, for a finite set V of
indices, Z<sub>v</sub>, r<sub>v</sub>, s<sub>v</sub> be concept and role names, respectively. Let</p>
        <p><disp-formula><tex-math>T_V = \{\top \sqsubseteq \exists r_v.\top,\ \top \sqsubseteq \exists s_v.Z_v \mid v \in V\}, \qquad H_v = \forall r_v.\exists s_v.\neg Z_v.</tex-math></disp-formula></p>
      </sec>
      <sec id="sec-3-7">
        <title>Lemma 1. For any ABox A and sets I<sub>v</sub> ⊆ Ind(A), v ∈ V, one can construct a minimal model I of (T<sub>V</sub>, A) such that H<sub>v</sub><sup>I</sup> = I<sub>v</sub> for all v ∈ V. T<sub>V</sub> is FO-rewritable for PEQ.</title>
        <p>To prove Theorem 4, one extends the TBox T<sub>V</sub>. Assume CSP(I) is given. Let V = Δ<sup>I</sup>
and assume, for simplicity, that sig(I) = {r}. Define
<disp-formula><tex-math>T = T_V \cup \{H_v \sqcap \exists r.H_w \sqsubseteq \bot \mid v, w \in V,\ (v, w) \notin r^I\} \cup \{H_v \sqcap H_w \sqsubseteq \bot \mid v, w \in V,\ v \neq w\} \cup \{\sqcap_{v \in V} \neg H_v \sqsubseteq \bot\}.</tex-math></disp-formula>
Based on Lemma 1, it is possible to verify Points 1 and 2 of Theorem 4. For Point 2, it
can be seen that for all Boolean CQs q and ABoxes A, (T, A) ⊨ q iff (T<sub>V</sub>, A) ⊨ q or
not Hom(A<sub>Σ</sub>, I); since T<sub>V</sub> is FO-rewritable, the former can be checked in PTIME.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>(Towards) Dichotomies</title>
      <p>We start with a reduction of Boolean CQ-answering w.r.t. ALCI-TBoxes to CSPs that
yields, together with Theorem 4, a proof of Point 1 in the introduction: the dichotomy
problem for CSPs is equivalent to the dichotomy problem for CQ-answering w.r.t.
ALC- (and ALCI-) TBoxes.</p>
      <p>Theorem 5. Let T be an ALCI-TBox and C(a) an ELIQ. Then one can construct, in
time exponential in |T| + |C|,
1. a Σ-interpretation I, Σ = (sig(T) ∪ sig(C)) ⊎ {P}, with P a concept name, such
that for all ABoxes A,
(a) there is a polynomial reduction of answering C(a) w.r.t. T to the complement
of CSP(I);
(b) there is a polynomial reduction from the complement of CSP(I) to Boolean
CQ-answering w.r.t. T;</p>
      <sec id="sec-4-1">
        <title>2. a Σ-interpretation I, Σ = sig(T), such that for every ABox A, A is consistent w.r.t. T iff Hom(A<sub>Σ</sub>, I).</title>
        <p>For Point 1, I is in fact the interpretation that is obtained by the standard type
elimination procedure for ALCI-TBoxes T and concepts C. More specifically, let S be the
closure under single negation of all subconcepts of T and C. A type t is a maximal
subset of S that is satisfiable w.r.t. T. Then Δ<sup>I</sup> is the set of all types, t ∈ A<sup>I</sup> iff A ∈ t,
and (t, t′) ∈ r<sup>I</sup> iff ∀r.D ∈ t implies D ∈ t′ and ∀r<sup>−</sup>.D ∈ t′ implies D ∈ t. For the
special concept name P, set P<sup>I</sup> = {t | C ∉ t}. With the type elimination algorithm, I
can be constructed in exponential time. The mentioned reductions are then as follows:</p>
        <p>(a) (T, A) ⊨ C(a) iff not Hom(A<sub>P(a)</sub>, I), where A<sub>P(a)</sub> results from A by adding
P(a) to A and removing all other assertions using P from A;</p>
        <p>(b) not Hom(A<sub>Σ</sub>, I) iff (T, A) ⊨ ∃v.(P(v) ∧ C(v)).</p>
        <p>Result 1 from the introduction can be derived as follows. Let CSP(I) be an
NP-intermediate CSP, i.e., a CSP that is neither in PTIME nor NP-hard. Take the TBox T
from Theorem 4. By Point 1 of that theorem and since consistency of ABoxes w.r.t. T
can trivially be reduced to the complement of answering Boolean CQs w.r.t. T,
CQ-answering w.r.t. T is not in PTIME. By Point 2, CQ-answering w.r.t. T is not
coNP-hard either. Conversely, let T be a TBox for which CQ-answering w.r.t. T is neither in
PTIME nor coNP-hard. Then by Theorem 1 and since every ELIQ is a CQ, the same
holds for ELIQ-answering w.r.t. T . Thus, there is a concrete ELIQ C(a) such that
answering C(a) w.r.t. T is coNP-intermediate. Let I be the interpretation constructed
in Point 1 of Theorem 5 for T and C(a). By Point 1a, CSP(I) is not in PTIME; by
Point 1b, it is not NP-hard either.</p>
        <p>
          Result 5 from the introduction can be derived as follows. It is proved in [
          <xref ref-type="bibr" rid="ref16 ref5">16, 5</xref>
          ] that
the problem to decide whether the class of structures {I′ | Hom(I′, I)} is FO-definable
is NP-complete. We obtain a NEXPTIME upper bound since the template I associated
with T can be constructed in exponential time. The claims for AC<sup>0</sup> and LOGSPACE
follow in the same way from other results in [
          <xref ref-type="bibr" rid="ref16 ref5">16, 5</xref>
          ].
        </p>
        <p>We now develop a condition on TBoxes, called unraveling tolerance, that is
sufficient for PTIME CQ-answering and strictly generalizes Horn-ALCFI, the
ALCFI-fragment of Horn-SHIQ. For the case of TBoxes of depth one, we obtain a PTIME/coNP
dichotomy result. The notion of unraveling tolerance is based on an unraveling
operation on ABoxes, in the same spirit as the well-known unraveling of an interpretation
into a tree interpretation. This is inspired by (i) the observation that, in the proof of
Theorem 3, the non-tree-shape of ABoxes is essential; and (ii) Theorem 5 together
with the known fact that non-uniform CSPs are tractable when restricted to tree-shaped
input structures. The unraveling A<sup>u</sup> of an ABox A is the following ABox:</p>
        <p>– the individual names Ind(A<sup>u</sup>) of A<sup>u</sup> are the sequences b<sub>0</sub>r<sub>0</sub>b<sub>1</sub> ⋯ r<sub>n−1</sub>b<sub>n</sub> with b<sub>0</sub>, …, b<sub>n</sub> ∈
Ind(A) and r<sub>0</sub>, …, r<sub>n−1</sub> (possibly inverse) roles such that for all i &lt; n, we have
r<sub>i</sub>(b<sub>i</sub>, b<sub>i+1</sub>) ∈ A and b<sub>i+1</sub> ≠ b<sub>i−1</sub> (whenever i &gt; 0);</p>
        <p>– for each C(b) ∈ A and α = b<sub>0</sub>r<sub>0</sub>b<sub>1</sub> ⋯ r<sub>n−1</sub>b<sub>n</sub> ∈ Ind(A<sup>u</sup>) with b<sub>n</sub> = b, we have
C(α) ∈ A<sup>u</sup>;</p>
        <p>– for each α = b<sub>0</sub>r<sub>0</sub>b<sub>1</sub> ⋯ r<sub>n−1</sub>b<sub>n</sub> ∈ Ind(A<sup>u</sup>) with n &gt; 0, we have r<sub>n−1</sub>(β, α) ∈ A<sup>u</sup>,
where β = b<sub>0</sub>r<sub>0</sub> ⋯ b<sub>n−1</sub> is α without its last step.</p>
        <p>For all α = b<sub>0</sub>r<sub>0</sub> ⋯ r<sub>n−1</sub>b<sub>n</sub> ∈ Ind(A<sup>u</sup>), we write tail(α) to denote b<sub>n</sub>. Note that the
condition b<sub>i+1</sub> ≠ b<sub>i−1</sub> is needed to ensure that functional roles can still be interpreted
in a functional way after unraveling, despite the UNA.</p>
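        <p>Because the unraveling of a cyclic ABox is infinite, any executable illustration has to cut it off. The following Python sketch builds the unraveling up to a given number of role steps. It is an illustration only: the function name unravel, the depth bound, the encoding of sequences as tuples, and the encoding of an inverse role as the role name followed by "-" are all assumptions made here, and the input ABox is assumed to contain role names only.</p>
        <preformat>
```python
def unravel(concept_assertions, role_assertions, depth):
    """Unravel an ABox up to `depth` role steps.

    concept_assertions: set of pairs (C, b); role_assertions: set of triples
    (r, a, b) standing for r(a, b). Individuals of the result are tuples
    (b0, r0, b1, ..., bn).
    """
    # steps[b] lists the pairs (role, b2) with role(b, b2) in the ABox,
    # where role may be an inverse role.
    steps = {}
    for (r, a, b) in sorted(role_assertions):
        steps.setdefault(a, []).append((r, b))
        steps.setdefault(b, []).append((r + "-", a))
    individuals = {b for (_, b) in concept_assertions} | set(steps)
    paths = [(b,) for b in sorted(individuals)]
    frontier = list(paths)
    new_roles = set()
    for _ in range(depth):
        next_frontier = []
        for p in frontier:
            for (r, b) in steps.get(p[-1], []):
                # Condition b_{i+1} != b_{i-1}: never step straight back.
                if len(p) >= 3 and p[-3] == b:
                    continue
                q = p + (r, b)
                next_frontier.append(q)
                new_roles.add((r, p, q))  # role assertion r(p, q)
        paths.extend(next_frontier)
        frontier = next_frontier
    # Every sequence inherits the concept assertions of its tail.
    new_concepts = {(c, p) for p in paths
                    for (c, b) in concept_assertions if p[-1] == b}
    return paths, new_concepts, new_roles


if __name__ == "__main__":
    cycle = {("r", "a", "b"), ("r", "b", "c"), ("r", "c", "a")}
    paths, concepts, roles = unravel({("C", "c")}, cycle, 2)
    print(len(paths), len(concepts), len(roles))
```
        </preformat>
        <p>On a tree-shaped ABox the construction stops by itself after finitely many steps, whereas on a cycle the number of sequences keeps growing with the depth bound.</p>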
      </sec>
      <sec id="sec-4-2">
        <title>Definition 3. A TBox T is unraveling tolerant if for all ABoxes A and ELIQs q, we have that T, A ⊨ q implies T, A<sup>u</sup> ⊨ q.</title>
        <p>
          It is not hard to prove that the converse direction ‘T, A<sup>u</sup> ⊨ q implies T, A ⊨ q’
is true for all ALCFI-TBoxes. We now show that the class of unraveling tolerant
ALCFI-TBoxes generalizes Horn-ALCFI. This is based on the original and most
general definition of Horn-SHIQ in [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] and thus also captures weaker variants as
used e.g. in [
          <xref ref-type="bibr" rid="ref13 ref9">13, 9</xref>
          ]. The TBox in Example 1, which is unraveling tolerant but not a
Horn-ALCFI-TBox, demonstrates that the generalization is strict.
        </p>
      </sec>
      <sec id="sec-4-3">
        <title>Lemma 2. Every Horn-ALCFI-TBox is unraveling tolerant.</title>
        <p>It is interesting to note that unraveling tolerance implies materializability. We shall see
that the converse is, in general, not true.</p>
      </sec>
      <sec id="sec-4-4">
        <title>Lemma 3. Every unraveling-tolerant ALCFI-TBox is materializable.</title>
        <p>
          We now show that unraveling tolerance yields a class of ALCFI-TBoxes for which
query answering is in PTIME. By Lemma 2 and since we actually exhibit a uniform
algorithm for query answering w.r.t. unraveling tolerant TBoxes, this also reproves the
known PTIME upper bound for CQ-answering in Horn-ALCFI [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]. This result is not
a consequence of Theorem 5 and known results for CSPs since we capture full ALCFI.
        </p>
      </sec>
      <sec id="sec-4-5">
        <title>Theorem 6. If an ALCFI-TBox T is unraveling tolerant, then PEQ-answering w.r.t.</title>
        <p>T is in PTIME.</p>
        <p>To see that unraveling tolerance does not capture all ALCFI-TBoxes for which query
answering is in PTIME, we can invoke Theorem 4. For example, taking a CSP for
2-colorability, we obtain a TBox T for which CQ-answering is in PTIME and such
that an ABox A with sig(A) = {r} is consistent w.r.t. T iff A is 2-colorable. Thus,
T, A ⊨ X(a), X a fresh concept name, iff A is not 2-colorable. It follows that T is not
unraveling tolerant: the unraveling of an odd cycle is tree-shaped and thus 2-colorable.
We conjecture that it is possible to generalize Theorem 6 to larger
classes of TBoxes by relaxing the operation of ABox unraveling such that it yields
ABoxes of bounded treewidth instead of tree-shaped ABoxes. Such a generalization
would still not capture 2-colorability.</p>
        <p>We now turn to TBoxes of depth one. The central observation is that for this special
case, we can prove a converse of Lemma 3.</p>
      </sec>
      <sec id="sec-4-6">
        <title>Lemma 4. Every materializable ALCFI-TBox of depth one is unraveling tolerant.</title>
        <p>This brings us into the position where we can establish the announced dichotomy result
for ALCFI-TBoxes of depth one. If such a TBox T is materializable, then Lemma 4
and Theorem 6 yield that PEQ-answering w.r.t. T is in PTIME. Otherwise,
ELIQ-answering w.r.t. T is coNP-complete by Theorem 3. We thus obtain the following.</p>
      </sec>
      <sec id="sec-4-7">
        <title>Theorem 7 (Dichotomy). For every ALCFI-TBox T of depth one, one of the following is true:</title>
        <p>– Q-answering w.r.t. T is in PTIME for any Q ∈ {PEQ, CQ, ELIQ};</p>
        <p>– Q-answering w.r.t. T is coNP-complete for any Q ∈ {PEQ, CQ, ELIQ}.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Deciding FO-Rewritability</title>
      <p>The results of this section are based on the observation that for materializable TBoxes of
depth one, FO-rewritability for CQ follows from FO-rewritability for atomic concepts,
i.e., concept names and ⊥. We say that an atomic concept A is FO-rewritable w.r.t. a
TBox T and a signature Σ if there exists an FO-formula φ<sub>A</sub> such that for all Σ-ABoxes
A and a ∈ Ind(A): T, A ⊨ A(a) iff I<sub>A</sub> ⊨ φ<sub>A</sub>[a]. Clearly, if T is FO-rewritable
for CQ, then every atomic concept is FO-rewritable w.r.t. T and any signature. For
materializable TBoxes of depth one, the converse is also true.</p>
      <sec id="sec-5-1">
        <title>Lemma 5. A materializable ALCFI-TBox T of depth one is FO-rewritable for CQs iff all atomic concepts are FO-rewritable w.r.t. T and sig(T).</title>
        <p>
          Based on Lemma 5, we can use Theorem 5 and results from [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ] to obtain the following
result, in a similar (but slightly more involved) way as in the proof of Result 5 from the
introduction.
        </p>
        <p>Theorem 8. FO-rewritability for CQs is decidable in NEXPTIME for any of the
following classes of TBoxes: materializable ALCI-TBoxes of depth one,</p>
      </sec>
      <sec id="sec-5-2">
        <title>Horn-ALC-TBoxes, and Horn-ALCI-TBoxes of depth two.</title>
        <p>
          Theorem 5 does not apply to DLs with functional roles. To analyze FO-rewritability
in the presence of functional roles, we associate with every materializable TBox T of
depth one a monadic datalog program Π<sub>T</sub> such that T and Π<sub>T</sub> give the same answers
to queries A(a), A atomic. We then show that T is FO-rewritable if, and only if, Π<sub>T</sub> is
equivalent to a non-recursive datalog program. The latter property is known as
boundedness of a datalog program and has been studied extensively for fixpoint logics [
          <xref ref-type="bibr" rid="ref18 ref3">3, 18</xref>
          ]
and datalog programs [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. Using existing decidability results for boundedness, we can
then establish a counterpart of Theorem 8 for the case of ALCFI.
        </p>
        <p>For our purposes, a monadic datalog program Π consists of rules A(x) ← X,
where A is a concept name and X is a finite set consisting of assertions of the form
B(x), r(x<sub>1</sub>, x<sub>2</sub>), and inequalities x<sub>1</sub> ≠ x<sub>2</sub>, where B is a concept name, r a role, and
x, x<sub>1</sub>, x<sub>2</sub> range over variables. Inequalities are required to model functional roles. We
also use a special unary predicate ⊥ and rules ⊥(x) ← X stating that X is inconsistent.
For an ABox A, we denote by Π<sup>i</sup>(A) the set of all assertions A(a) that can be derived
using i applications of rules from Π to A. We set Π<sup>∞</sup>(A) = ⋃<sub>i ≥ 0</sub> Π<sup>i</sup>(A).</p>
        <p>Definition 4 (Boundedness). Let Π be a datalog program and Σ a signature. An
atomic concept A is bounded in Π for Σ-ABoxes if there exists a k &gt; 0 such that
for all Σ-ABoxes A and all a ∈ Ind(A): A(a) ∈ Π<sup>∞</sup>(A) iff A(a) ∈ Π<sup>k</sup>(A).</p>
        <p>Let T be a materializable TBox of depth one. A Σ-neighbourhood ABox (Σ-NH)
consists of a Σ-ABox A with a distinguished individual name f such that A consists of
assertions of the form r(f, a) with r a role and a ≠ f and A(b) such that</p>
        <p>– for each b ≠ f with b ∈ Ind(A) there is exactly one r such that r(f, b) ∈ A;</p>
        <p>– if r(f, b<sub>1</sub>), r(f, b<sub>2</sub>) ∈ A and b<sub>1</sub> ≠ b<sub>2</sub>, then there exists A(b<sub>1</sub>) ∈ A with A(b<sub>2</sub>) ∉
A or vice versa.</p>
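        <p>The derivation stages of a monadic datalog program and the notion of boundedness from Definition 4 can be illustrated with a naive evaluator. The Python sketch below is an assumption-laden illustration: rules are encoded as triples (head predicate, head variable, body), a stage is taken to be one simultaneous round of rule applications, inverse roles are not handled, and the names apply_once, holds, and stages are hypothetical.</p>
        <preformat>
```python
from itertools import product


def holds(atom, env, unary, binary):
    """Check one body atom under a variable assignment env."""
    if atom[0] == "!=":
        return env[atom[1]] != env[atom[2]]
    if len(atom) == 2:  # unary atom B(x)
        return (atom[0], env[atom[1]]) in unary
    return (atom[0], env[atom[1]], env[atom[2]]) in binary  # role atom r(x, y)


def apply_once(rules, unary, binary, individuals):
    """One round: apply every rule to the current unary facts."""
    derived = set(unary)
    for head, head_var, body in rules:
        variables = sorted({v for atom in body for v in atom[1:]} | {head_var})
        for values in product(individuals, repeat=len(variables)):
            env = dict(zip(variables, values))
            if all(holds(atom, env, unary, binary) for atom in body):
                derived.add((head, env[head_var]))
    return derived


def stages(rules, unary, binary, individuals):
    """Iterate rounds to the fixpoint; return it with the number of rounds used."""
    current, rounds = set(unary), 0
    while True:
        nxt = apply_once(rules, current, binary, individuals)
        if nxt == current:
            return current, rounds
        current, rounds = nxt, rounds + 1


# Example program: A(x) is derived from r(x, y) and A(y).
RULES = [("A", "x", [("r", "x", "y"), ("A", "y")])]

if __name__ == "__main__":
    chain = {("r", "a0", "a1"), ("r", "a1", "a2")}
    print(stages(RULES, {("A", "a2")}, chain, ["a0", "a1", "a2"]))
```
        </preformat>
        <p>For this example program the number of rounds needed grows with the length of the r-chain in the ABox, so no fixed k works for all ABoxes and A is not bounded in it; a bounded program is one where some fixed number of rounds always suffices.</p>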
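        <p>The ABox A in which each individual b is replaced by a variable x<sub>b</sub> is denoted by A<sub>x</sub>.
Now define a monadic datalog program Π<sub>T</sub> associated with T, where Σ = sig(T):
<disp-formula><tex-math>\begin{array}{l}
\Pi_{\mathcal{T}} = \{A(x_a) \leftarrow \mathcal{A}_x \mid \mathcal{A} \text{ is a } \Sigma\text{-NH},\ a \in \mathsf{Ind}(\mathcal{A}),\ A \in \Sigma,\ (\mathcal{T}, \mathcal{A}) \models A(a)\} \;\cup \\
\qquad \{\bot(x) \leftarrow \mathcal{A}_x \mid \mathcal{A} \text{ is a } \Sigma\text{-NH that is not consistent w.r.t. } \mathcal{T}\} \;\cup \\
\qquad \{\bot(x) \leftarrow r(y, y_1), r(y, y_2), y_1 \neq y_2 \mid \mathsf{func}(r) \in \mathcal{T}\} \;\cup \\
\qquad \{A(x) \leftarrow \bot(x) \mid A \in \Sigma\}.
\end{array}</tex-math></disp-formula></p>
        <p>The following lemma states that Π<sub>T</sub> behaves as intended.</p>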
      </sec>
      <sec id="sec-5-3">
        <title>Lemma 6. For every materializable ALCFI-TBox T of depth one, every A ∈ sig(T),</title>
        <p>every ABox A, and every a ∈ Ind(A), (T, A) ⊨ A(a) iff A(a) ∈ Π<sub>T</sub><sup>∞</sup>(A). Moreover,
⊥(a) ∈ Π<sub>T</sub><sup>∞</sup>(A) iff A is not consistent w.r.t. T.</p>
        <p>Using unraveling tolerance of materializable TBoxes of depth one, one can show the
following equivalence for FO-rewritability and boundedness.</p>
      </sec>
      <sec id="sec-5-4">
        <title>Lemma 7. For every materializable ALCFI-TBox T of depth one and signature Σ,</title>
        <p>
          an atomic concept A is bounded in Π<sub>T</sub> for Σ-ABoxes iff A is FO-rewritable w.r.t. T
and Σ.</p>
        <p>Unfortunately, decidability results for boundedness of monadic datalog programs are
not directly applicable to Π<sub>T</sub> since they assume programs without inequalities [
          <xref ref-type="bibr" rid="ref11 ref8">8, 11</xref>
          ].
However, using unfolding tolerance, one can employ instead recent decidability results
on boundedness of least fixed points over trees [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ] to obtain the following theorem.
Theorem 9. FO-rewritability for CQs is decidable, for any of the following classes
of TBoxes: materializable ALCF I-TBoxes of depth one, Horn-ALCF -TBoxes, and
        </p>
      </sec>
      <sec id="sec-5-5">
        <title>Horn-ALCF I-TBoxes of depth two.</title>
        <p>6</p>
        <p>Non-Dichotomy and Undecidability in ALCF
The aim of this section is to show that the addition of functional roles significantly
complicates the problems studied in the previous sections. More precisely, we show that
(i) for CQ-answering w.r.t. $\mathcal{ALCF}$-TBoxes, there is no dichotomy between PTIME and
coNP unless PTIME = NP; and (ii) it is undecidable whether CQ-answering w.r.t. an $\mathcal{ALCF}$-TBox
is in PTIME, and likewise for coNP-hardness, materializability, and FO-rewritability. Point (i)
is a consequence of the following result.</p>
      </sec>
      <sec id="sec-5-6">
        <title>Theorem 10.</title>
        <p>For every language $L$ in coNP, there is an $\mathcal{ALCF}$-TBox $\mathcal{T}$ and a query
$\mathsf{rej}(a)$, with $\mathsf{rej}$ a concept name, such that the following holds:</p>
      </sec>
      <sec id="sec-5-7">
        <title>1. there exists a polynomial reduction of deciding $v \in L$ to answering $\mathsf{rej}(a)$ w.r.t. $\mathcal{T}$;</title>
      </sec>
      <sec id="sec-5-8">
        <title>2. for every ELIQ $q$, answering $q$ w.r.t. $\mathcal{T}$ is polynomially reducible to deciding $v \in L$.</title>
        <p>
          Ladner's theorem [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ] states that, unless PTIME = NP, coNP-intermediate problems exist, i.e., problems in coNP
that are neither in PTIME nor coNP-hard. Suppose, contrary to Point (i), that for every $\mathcal{ALCF}$-TBox $\mathcal{T}$, CQ-answering
w.r.t. $\mathcal{T}$ is in PTIME or coNP-hard. Take a coNP-intermediate language $L$ and let $\mathcal{T}$
be the TBox from Theorem 10. By Point 1 of the theorem, CQ-answering w.r.t. $\mathcal{T}$ is
not in PTIME. Thus it must be coNP-hard. By Theorem 1 and since a dichotomy for
CQ-answering w.r.t. $\mathcal{T}$ also implies a dichotomy for ELIQ-answering w.r.t. $\mathcal{T}$,
ELIQ-answering w.r.t. $\mathcal{T}$ is also coNP-hard. By Point 2 of Theorem 10, $L$ would then be
coNP-hard as well, which is impossible since $L$ is coNP-intermediate.
        </p>
        <p>
          The proof of Theorem 10 combines the ‘hidden’ concepts Hv from the proof of
Theorem 4 with ideas from a proof in [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] which establishes undecidability of a certain
query emptiness problem in $\mathcal{ALCF}$. Using a similar strategy, we establish the
undecidability results announced as Point (ii) above, summarized by the following theorem.
        </p>
      </sec>
      <sec id="sec-5-9">
        <title>Theorem 11.</title>
        <p>For $\mathcal{ALCF}$-TBoxes $\mathcal{T}$, the following problems are undecidable (Points 1 and 2 are subject to the side condition that PTIME $\neq$ NP):</p>
      </sec>
      <sec id="sec-5-10">
        <title>1. CQ-answering w.r.t. $\mathcal{T}$ is in PTIME;</title>
      </sec>
      <sec id="sec-5-11">
        <title>2. CQ-answering w.r.t. $\mathcal{T}$ is coNP-hard;</title>
      </sec>
      <sec id="sec-5-12">
        <title>3. $\mathcal{T}$ is materializable.</title>
        <p>In the appendix, we also prove that FO-rewritability for CQs is undecidable in $\mathcal{ALCF}$,
for a slightly modified definition of FO-rewritability that only considers consistent
ABoxes.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Conclusions</title>
      <p>We have introduced non-uniform data complexity of query answering w.r.t.
description logic TBoxes and proved that it enables a more fine-grained analysis than the
standard approach. Many questions remain. In particular, the newly established
CSP-connection should be exploited further. We believe that the techniques introduced in
this paper can be extended to richer DLs such as $\mathcal{SHIQ}$.</p>
      <p>Acknowledgments. C. Lutz was supported by the DFG SFB/TR 8 “Spatial Cognition”.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>F.</given-names>
            <surname>Baader</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bienvenu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Lutz</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Wolter</surname>
          </string-name>
          .
          <article-title>Query and predicate emptiness in description logics</article-title>
          .
          <source>In Proc. of KR2010</source>
          . AAAI Press,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>F.</given-names>
            <surname>Baader</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Brandt</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Lutz</surname>
          </string-name>
          .
          <article-title>Pushing the EL envelope</article-title>
          .
          <source>In Proc. of IJCAI05</source>
          , pages
          <fpage>364</fpage>
          -
          <lpage>369</lpage>
          . Professional Book Center,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>J.</given-names>
            <surname>Barwise</surname>
          </string-name>
          and
          <string-name>
            <given-names>Y. N.</given-names>
            <surname>Moschovakis</surname>
          </string-name>
          .
          <article-title>Global inductive definability</article-title>
          .
          <source>J. Symb. Log.</source>
          ,
          <volume>43</volume>
          (
          <issue>3</issue>
          ):
          <fpage>521</fpage>
          -
          <lpage>534</lpage>
          ,
          <year>1978</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>A. A.</given-names>
            <surname>Bulatov</surname>
          </string-name>
          .
          <article-title>A dichotomy theorem for constraints on a three-element set</article-title>
          .
          <source>In Proc. of FOCS02</source>
          , pages
          <fpage>649</fpage>
          -
          <lpage>658</lpage>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>A. A.</given-names>
            <surname>Bulatov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. A.</given-names>
            <surname>Krokhin</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B.</given-names>
            <surname>Larose</surname>
          </string-name>
          .
          <article-title>Dualities for constraint satisfaction problems</article-title>
          .
          <source>In Complexity of Constraints</source>
          , pages
          <fpage>93</fpage>
          -
          <lpage>124</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvanese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>De Giacomo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lembo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lenzerini</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Rosati</surname>
          </string-name>
          .
          <article-title>Tractable reasoning and efficient query answering in description logics: The DL-Lite family</article-title>
          .
          <source>J. of Autom. Reasoning</source>
          ,
          <volume>39</volume>
          (
          <issue>3</issue>
          ):
          <fpage>385</fpage>
          -
          <lpage>429</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>C. C.</given-names>
            <surname>Chang</surname>
          </string-name>
          and
          <string-name>
            <given-names>H. J.</given-names>
            <surname>Keisler</surname>
          </string-name>
          .
          <source>Model Theory</source>
          , volume
          <volume>73</volume>
          of
          <source>Studies in Logic and the Foundations of Mathematics</source>
          . Elsevier
          ,
          <year>1990</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>S. S.</given-names>
            <surname>Cosmadakis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Gaifman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. C.</given-names>
            <surname>Kanellakis</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M. Y.</given-names>
            <surname>Vardi</surname>
          </string-name>
          .
          <article-title>Decidable optimization problems for database logic programs (preliminary report)</article-title>
          .
          <source>In Proc. of STOC88</source>
          , pages
          <fpage>477</fpage>
          -
          <lpage>490</lpage>
          ,
          <year>1988</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>T.</given-names>
            <surname>Eiter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Gottlob</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ortiz</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Simkus</surname>
          </string-name>
          .
          <article-title>Query answering in the description logic Horn-SHIQ</article-title>
          .
          <source>In Proc. of JELIA08</source>
          , volume
          <volume>5293</volume>
          <source>of LNCS</source>
          , pages
          <fpage>166</fpage>
          -
          <lpage>179</lpage>
          . Springer,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>T.</given-names>
            <surname>Feder</surname>
          </string-name>
          and
          <string-name>
            <given-names>M. Y.</given-names>
            <surname>Vardi</surname>
          </string-name>
          .
          <article-title>Monotone monadic SNP and constraint satisfaction</article-title>
          .
          <source>In Proc. of STOC93</source>
          , pages
          <fpage>612</fpage>
          -
          <lpage>622</lpage>
          ,
          <year>1993</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>H.</given-names>
            <surname>Gaifman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. G.</given-names>
            <surname>Mairson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Sagiv</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M. Y.</given-names>
            <surname>Vardi</surname>
          </string-name>
          .
          <article-title>Undecidable optimization problems for database logic programs</article-title>
          .
          <source>In Proc. of LICS87</source>
          , pages
          <fpage>106</fpage>
          -
          <lpage>115</lpage>
          ,
          <year>1987</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>U.</given-names>
            <surname>Hustadt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Motik</surname>
          </string-name>
          , and
          <string-name>
            <given-names>U.</given-names>
            <surname>Sattler</surname>
          </string-name>
          .
          <article-title>Reasoning in description logics by a reduction to disjunctive datalog</article-title>
          .
          <source>J. Autom. Reasoning</source>
          ,
          <volume>39</volume>
          (
          <issue>3</issue>
          ):
          <fpage>351</fpage>
          -
          <lpage>384</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <given-names>Y.</given-names>
            <surname>Kazakov</surname>
          </string-name>
          .
          <article-title>Consequence-driven reasoning for Horn SHIQ ontologies</article-title>
          .
          <source>In Proc. of IJCAI09</source>
          , pages
          <fpage>2040</fpage>
          -
          <lpage>2045</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <given-names>M.</given-names>
            <surname>Krötzsch</surname>
          </string-name>
          .
          <article-title>Efficient inferencing for OWL EL</article-title>
          .
          <source>In Proc. of JELIA10</source>
          , volume
          <volume>6341</volume>
          <source>of LNCS</source>
          , pages
          <fpage>234</fpage>
          -
          <lpage>246</lpage>
          . Springer,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <given-names>R. E.</given-names>
            <surname>Ladner</surname>
          </string-name>
          .
          <article-title>On the structure of polynomial time reducibility</article-title>
          .
          <source>JACM</source>
          ,
          <volume>22</volume>
          (
          <issue>1</issue>
          ):
          <fpage>155</fpage>
          -
          <lpage>171</lpage>
          ,
          <year>1975</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <given-names>B.</given-names>
            <surname>Larose</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Loten</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Tardif</surname>
          </string-name>
          .
          <article-title>A characterisation of first-order constraint satisfaction problems</article-title>
          .
          <source>Logical Methods in Computer Science</source>
          ,
          <volume>3</volume>
          (
          <issue>4</issue>
          ),
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <given-names>C.</given-names>
            <surname>Lutz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Piro</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Wolter</surname>
          </string-name>
          .
          <article-title>Description logic TBoxes: Model-theoretic characterizations and rewritability</article-title>
          .
          <source>In Proc. of IJCAI11</source>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <given-names>M.</given-names>
            <surname>Otto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Blumensath</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Weyer</surname>
          </string-name>
          .
          <article-title>Decidability results for the boundedness problem</article-title>
          .
          <source>Technical report, TU Darmstadt</source>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <given-names>A.</given-names>
            <surname>Schaerf</surname>
          </string-name>
          .
          <article-title>On the complexity of the instance checking problem in concept languages with existential quantification</article-title>
          .
          <source>J. of Intell. Inf. Sys.</source>
          ,
          <volume>2</volume>
          :
          <fpage>265</fpage>
          -
          <lpage>278</lpage>
          ,
          <year>1993</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>