<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>On the Data Complexity of Ontology-Mediated Queries with a Covering Axiom</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>O. Gerasimova</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>S. Kikot</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>V. Podolskii</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>M. Zakharyaschev</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Birkbeck, University of London</institution>
          ,
          <country country="UK">U.K</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>National Research University Higher School of Economics</institution>
          ,
          <addr-line>Moscow</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Steklov Mathematical Institute</institution>
          ,
          <addr-line>Moscow</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper reports on our ongoing work that aims at a classification of conjunctive queries q according to the data complexity of answering ontologymediated queries (fA v T t F g; q). We give examples of queries from the complexity classes C 2 fAC0; L; NL; P; CONPg, and obtain a few syntactical conditions for C-membership and C-hardness.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        The OWL 2 QL profile of OWL 2 —as well as the underlining description logics from
the DL-Lite family [
        <xref ref-type="bibr" rid="ref2 ref4">4, 2</xref>
        ]—were designed to ensure that every ontology-mediated query
(OMQ, for short) (T ; q) with an OWL 2 QL ontology T and a conjunctive query (CQ)
q is first-order (FO) rewritable. However, when developing ontologies for
ontologybased data access (OBDA) [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] applications, domain experts are often tempted to use
axioms with constructs that are not available in OWL 2 QL. For example, the NPD
FactPages ontology,4 which was created to facilitate querying the datasets of the
Norwegian Petroleum Directorate,5 contains cardinality restrictions and covering axioms of
the form A v B1 t t Bn. Typical answers to the question whether such axioms
could have a negative impact on OMQ rewriting are as follows: (i) the data satisfies
the axioms anyway (because of the database schema), (ii) our ‘real-world queries’ are
never affected by them, and (iii) OBDA systems such as Ontop drop everything outside
OWL 2 QL. Ideally, of course, we would rather want our system to detect automatically
whether the given OMQ is FO-rewritable and alert the user if this is not so. Furthermore,
in case of non-FO-rewritability, we might want the system to check whether a datalog
rewriting is possible, and so on. From the complexity-theoretic point of view, we are
thus interested in the data complexity of answering a given OMQ with an expressive
ontology.
      </p>
      <p>
        A systematic investigation of this problem was started in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], which showed among
other results that answering OMQs of the form (Disn; u), where u is a union of CQs
(UCQ) and Disn = fA v B1 t t Bng, is polynomially equivalent to the constraint
satisfaction problems CSP(A). In particular, a P/CONP dichotomy for such OMQs
would give a dichotomy for CSPs, thereby confirming the Feder-Vardi conjecture. As
4 http://sws.ifi.uio.no/project/npd-v2/
5 http://factpages.npd.no/factpages/
shown in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], answering CQs with basic schema.org ontologies (in particular, Disn)
and CQs of qvar-size 2 is in P for combined complexity, where q is of qvar-size n if
the restriction of q to its quantified variables is a disjoint union of CQs with at most n
variables each. Moreover, FO- and datalog-rewritability of OMQs of the form (T ; u),
where T is a schema.org ontology and u is a UCQ, are decidable in NEXPTIME. It has
also been recently established in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] that checking FO-rewritability of OMQs with
ontologies formulated in any description logic between ALCI and SHI is
2NEXPTIMEcomplete. Datalog rewritability of OMQs with ontologies given in disjunctive datalog
has been investigated in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <p>In this paper, we consider one fixed non-Horn ontology Dis = fA v T t F g.
Ultimately aiming at a complete classification of CQs q according to the data complexity
of answering OMQs Q = (Dis; q), here we present our initial observations about this
problem. Ideally, we would like to obtain transparent necessary and sufficient conditions
relating the structure of q—say, the way how T and F occur in it—with the complexity
of answering Q. For example, one such condition guaranteeing datalog rewritability,
and so tractability of answering Q follows from [8, Theorem 27]: it suffices that q
contains at most one occurrence of F or at most one occurrence of T . We obtain a few
conditions in the same spirit for the complexity classes AC0, L, NL and P. We also give
quite a few simple and instructive CQs distinguishing between NL and P, and develop
techniques for establishing P and CONP lower bounds.
2</p>
    </sec>
    <sec id="sec-2">
      <title>Preliminaries</title>
      <p>In our context, a conjunctive query (CQ) is a first-order (FO) formula of the form
q(x) = 9y '(x; y), where ' is a conjunction of unary or binary atoms P (z) with
z x [ y.Unions of conjunctive queries (UCQ) is a disjunction of conjunctive queries.
Given an ABox (or data instance) A, we denote by ind(A) the set of individual names
that occur in A. A tuple a ind(A) is a certain answer to the OMQ Q = (Dis; q(x))
over A if M j= q(a), for every model M of Dis [ A; in this case we write Dis; A j=
q(a). If the set x of answer variables is empty, a certain answer to Q over D is ‘yes’ if
M j= q, for every model M of Dis [ A, and ‘no’ otherwise. OMQs and CQs without
answer variables x are called Boolean. We often regard CQs as sets of their atoms. In
this paper, we assume that the all CQs are connected.</p>
      <p>Let Q = (Dis; q(x)) be a fixed OMQ. By answering Q, we understand the problem
of checking, given an ABox A and a tuple a ind(A), whether Dis; A j= q(a). It is
readily seen that this problem is always in CONP. It is in the complexity class AC0 if
there is an FO-formula q0(x), called an FO-rewriting of Q, such that Dis; A j= q(a)
iff q0(a) holds in the model given by A, for any ABox A and any tuple a ind(A).</p>
      <p>A datalog program, , is a finite set of rules of the form 8z ( 0 1 ^ ^ m),
where each i is an atom Q(y) with y z or an equality (z = z0) with z; z0 2 z. (As
usual, we omit 8z.) The atom 0 is the head of the rule, and 1; : : : ; m its body. All
variables in the head must occur in the body, and = can only occur in the body. The
predicates in the heads of rules in are IDB predicates, the rest (including =) EDB
predicates. A program is called linear if the body of every rule in contains at most
one IDB predicate.</p>
      <p>
        A datalog query is a pair ( ; G(x)), where is a datalog program and G(x) an
atom. A tuple a ind(A) is an answer to ( ; G(x)) over an ABox A if G(a) holds
in the first-order structure with domain ind(A) obtained by closing A under the rules
in ; in this case we write ; A j= G(a). A datalog query ( ; G(x)) is a datalog
rewriting of an OMQ Q = (Dis; q(x)) in case Dis; A j= q(a) iff ; A j= G(a),
for any ABox A and any a ind(A). The evaluation problem for ( ; G(x))—that
is, checking, given an ABox A and a tuple a ind(A), whether ; A j= G(a)—is
known to be in P; for linear , this problem is in NL; see [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and references therein.
3
      </p>
      <p>AC0
By a solitary occurrence of F in a CQ q we mean any F (x) 2 q such that T (x) 2= q;
likewise, a solitary occurrence of T in q is any T (x) 2 q such that F (x) 2= q.
Theorem 1. For any CQ q without solitary occurrences of F (or T ), answering the
OMQ Q = (Dis; q) is in AC0.</p>
      <p>Proof. We show that Dis; A j= q(a) iff A j= q(a). Suppose that A 6j= q(a) and
F (x) 2 q ) T (x) 2 q. Take A0 = A [ fF (a) j a 2 ind(A) ^ T (a) 2= Ag. Clearly,
A0 j= Dis and A0 6j= q(a). The converse direction is trivial.</p>
      <p>In particular, answering any OMQ Q = (Dis; q), where q does not contain one of
F or T , is in AC0. This observation can be easily generalised to OMQs with ontologies
Disn = fA v B1 t t Bng, for n 2:
Theorem 2. Suppose q is any CQ that does not contain an occurrence of Bi, for some
i (1 i n). Then answering the OMQ Q = (Disn; q) is in AC0.</p>
      <p>Thus, only those CQs can ‘feel’ Disn as far as FO-rewritability is concerned that
contain all the Bn (which makes them quite complex in practice). Theorem 1 also shows
that Q = (Dis; q) satisfying the respective condition has a trivial FO-rewriting, viz. q
itself. This is not accidental as shown by the following observation:
Proposition 1. If Q = (Dis; q) is in AC0, then q is a rewriting of Q.
Proof. By [3, Proposition 5.9], if Q is FO-rewritable, it has a UCQ rewriting. Then
there is a homomorphism from q to any CQ q0 in this rewriting.</p>
      <p>We do not know yet whether the sufficient condition for FO-rewritability given
by Theorem 1 is also a necessary one for minimal CQs q (that are not equivalent to
any of their proper subqueries). For non-minimal CQs, this is not the case as shown
by F</p>
      <p>R</p>
      <p>R F T</p>
      <p>R</p>
      <p>R T which is in AC0 because it is equivalent to the CQ</p>
      <p>R F T R . Below we obtain some partial results showing how a single F -atom
and a single T -atom in q can cause L- and NL-hardness.</p>
      <p>
        L and NL
We say that a Boolean CQ q is an F -T -CQ if it has exactly one atom of the form F (x),
exactly one atom of the form T (y), and the variables x and y are distinct.
Theorem 3. Answering any OMQ Q = (Dis; q) with an F -T -CQ q is L-hard.
Proof. The proof is by reduction to the reachability problem for undirected graphs,
which is known to be L-complete; see, e.g., [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Let q0 be the CQ obtained from q
by removing the atoms F (x) and T (y). Suppose we are given an undirected graph
G = (V; E) and two vertices s; t 2 V . It will be convenient to regard G as a directed
graph such that (u; v) 2 E iff (v; u) 2 E, for any u; v 2 V . We encode G by means
of an ABox AG that is obtained from G as follows. For every edge e = (u; v) 2 E,
let q0e be the set of atoms in q0 with x renamed to u, y to v and all other variables z
to ze. Then AG comprises all such qe, for e 2 E, as well as F (s), T (t) and A(v), for
v 2 V n fs; tg. Our aim is to show that s !G t iff Dis; AG j= q.
      </p>
      <p>Suppose s !G t, that is, there exists a path s = v0; v1; : : : ; vn = t in G with
ei = (vi; vi+1) 2 E, for i &lt; n. Consider an arbitrary model I of Dis and AG. Since
I j= Dis and F (s), T (t), A(vi), for 1 i &lt; n, are all in AG, we can find some i &lt; n
such that I j= F (vi) and I j= T (vi+1). As q0ei is an isomorphic copy of q0, we obtain
I j= q. Conversely, suppose s 6!G t. Define an interpretation I by extending the ABox
AG with F I = fv 2 V j s !G vg and T I = fv 2 V j s 6!G vg. Clearly, I is a model
of Dis. By the construction, the elements of the connected component of I containing
s cannot be instances of T , while the remaining elements of I cannot be instances of
F . Since q is connected, it follows that I 6j= q.</p>
      <p>We call a Boolean CQ q linear-directed if all of its variables can be arranged in a
sequence v0; : : : ; vm such that all binary predicates in q are of the form R(vi; vi+1), for
some i, 0 i &lt; m.</p>
      <p>Theorem 4. Answering any OMQ Q = (Dis; q) with a linear-directed CQ q
containing both a solitary F and a solitary T is NL-hard.</p>
      <p>
        Proof. Suppose F (vk) 2 q, T (vk) 2= q and F (vl) 2= q, T (vl) 2 q, for some k; l with
0 k &lt; l m. We rename the sequence vk; : : : ; vl to x0; : : : ; xn. The proof proceeds
by reduction to the reachability problem in directed graphs, which is known to be
NLcomplete; see, e.g., [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Given a directed graph G = (V; E) and vertices s; t 2 V , we
construct the ABox AG in the same way as in the proof of Theorem 3 treating x0 as x
and xn as y. Again, we show that s !G t iff Dis; AG j= q. The implication ()) is
established exactly as above.
      </p>
      <p>To prove ((), we assume that s 6!G t and consider the same model I as defined
in the proof of Theorem 3. Taking account of linear-directedness of q, we immediately
conclude that there is no homomorphism h : q ! I with h(x0) 2 V . It remains to
show that there is no homomorphism h : q ! I with h(x0) 2= V either. Suppose to
the contrary that such a homomorphism exists. Then there exist B 2 fF; T g and a
homomorphism f : q ! (AG2 [ fB(r)g), where G2 = (fs; r; tg; f(s; r); (r; t)g). We
denote the points of AG2 between s and r by x0; x1; : : : ; xn and those between r and
t by xn; x01; : : : ; x0n. By comparing the lengths of appropriate segments of q, we obtain
f (x0) = xi, for some i (0 &lt; i &lt; n). As F (x0) 2 q, we must have F (xi) 2 q; see
the picture below. As f (xi) = x2i if 2i n, and f (xi) = x02i mod n otherwise, we
also have F (x2i mod n) 2 q; more generally, F (xki mod n) 2 q for all natural k. Now,
since the equation of the form ‘iX = n mod n’ always has a solution, F (xn) 2 q,
which is impossible if B = T . If B = F , we use a similar argument starting from
T (xi) 2 q and show that T (xn) 2 q, which is again a contradiction.</p>
      <p>F
x0
: : :
f (x0)
F
xi
x0
: : :
: : :
f (xi)
xi
: : :
: : :</p>
      <p>B
xn
xj
: : :
: : :
x0</p>
      <p>i
xn</p>
      <p>T
: : :
x0
n
T</p>
      <p>Theorems 1 and 4 give the following dichotomy for OMQs Q = (Dis; q) with
linear-directed CQs q:
– either q does not contain a solitary F or a solitary T , and answering Q is in AC0,
– or q contains both solitary F and T , and answering Q is NL-hard.</p>
      <p>We now complement the sufficient conditions of L- and NL-hardness obtained above
with sufficient conditions of OMQ answering in L- and NL.</p>
      <p>A CQ q0(x; y) is symmetric if the CQs q0(x; y) and q0(y; x) are equivalent in the
sense that q0(a; b) holds in A iff q0(b; a) holds in A, for any ABox A and a; b 2 ind(A).
Theorem 5. Let Q = (Dis; q) be any OMQ such that</p>
      <p>q = 9x; y (F (x) ^ q01(x) ^ q0(x; y) ^ q02(y) ^ T (y));
for some connected CQs q0(x; y); q01(x) and q02(y) that do not contain solitary T and
F , and q0(x; y) is symmetric. Then answering Q can be done in L.</p>
      <p>Proof. It is not hard to show that, for any ABox A, we have Dis; A j= q iff there exist
v0; v1; : : : ; vn 2 ind(A), for some n 1, such that the following conditions hold:
– F (v0); A(v1); : : : ; A(vn 1); T (vn) 2 A;
– A j= q0(vi; vi+1) for 0 i &lt; n;
– A j= q01(vi) for 0 i &lt; n;
– A j= q02(vi) for 1 i n.</p>
      <p>It remains to observe that checking these conditions reduces to checking VT –VF
reachability in the undirected graph GA = (VA; EA) defined below. The vertices in GA
comprise the set VA = VT [ VA [ VF , where
– VT = fv 2 ind(A) j A j= T (v) ^ q02(v)g;
– VA = fv 2 ind(A) j A j= A(v) ^ q01(v) ^ q02(v)g;
– VF = fv 2 ind(A) j A j= F (v) ^ q01(v)g.</p>
      <p>The edges in GA comprise the set EA = ET A [ EAA [ EF A, where
– Eall = f(x; y) j A j= q0(x; y)g;
– ET A = f(x; y) 2 Eall j (x 2 VT ^ y 2 VA) _ (y 2 VT ^ x 2 VA)g;
– EAA = f(x; y) 2 Eall j x 2 VA ^ y 2 VAg;
– EF A = f(x; y) 2 Eall j (x 2 VF ^ y 2 VA) _ (y 2 VF ^ x 2 VA)g.
It is readily seen that GA = (VA; EA) is undirected in the sense that, for all of its
vertices u and v, (u; v) 2 EA iff (u; v) 2 EA.</p>
      <p>If we do not require q0(x; y) to be symmetric, the complexity upper bound increases
to NL:
Theorem 6. Let Q = (Dis; q) be any OMQ such that</p>
      <p>q = 9x; y (F (x) ^ T (y) ^ q0(x; y));
for some connected CQ q0(x; y) without solitary occurrences of F and T . Then
answering Q can be done in NL.</p>
      <p>Proof. We claim that the datalog query ( ; G) with the following linear datalog
program , where q~0 is the result of omitting all the 9 from q0:
is a datalog rewriting of Q. Indeed, if ; A j= G then there are v0; v1; : : : ; vn 2 ind(A)
such that F (v0); A(v1); : : : ; A(vn 1); T (vn) 2 A and q0(vi; vi+1) holds in A, for
0 i &lt; n. Clearly, in any model I of Dis and A there is i with I j= F (vi) ^ T (vi+1).
It follows that Dis; A j= q.</p>
      <p>Conversely, suppose ; A 6j= G. Let VP = fv 2 ind(A) j ; A j= P (v)g. Define
a model I of Dis with domain ind(A) by setting
T I = fv j T (v) 2 Ag [ fv 2 VP j A(v) 2 Ag;
F I = F A [ fv 2= VP j A(v) 2 Ag:
We claim that I 6j= q. Indeed, otherwise there is a homomorphism h : q ! I. As
h(y) 2 T I , we have ; A j= P (h(y)). As h(x) 2 F I , we have either F (h(x)) 2 A or
A(h(x)) 2 A, contrary to ; A 6j= G.</p>
      <p>The sufficient conditions of Theorems 5 and 6 only apply to CQs with exactly one
solitary occurrence of F and exactly one solitary occurrence of T . What happens if we
allow more than one solitary occurrences of F or T ?
The following result is a consequence of [8, Theorem 27]:
Theorem 7. Let Q = (Dis; q) be any OMQ such that
q = 9x; y1; : : : ; yn (F (x) ^ T (y1) ^
^ T (yn) ^ q0(x; y1; : : : ; yn));
for some connected CQ q0(x; y1; : : : ; yn) without solitary occurrences of T and F .
Then answering Q can be done in P.</p>
      <p>
        Indeed, for any ABox A, we have Dis; A j= q iff ; A j= G, where
following datalog program and q~0 is the result of omitting all the 9 from q0:
is the
The proof is by reduction of the alternating monotone circuit evaluation problem, which
is known to be P-complete [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. An example of an alternating monotone circuit is shown
in the picture below. Given such a circuit C and an input , we define an ABox AC as
the set of the following atoms:
– R(g; h), if a gate g is an input of a gate h;
– S(g; h), if g and h are distinct inputs of some AND-gate;
– S(g; g), if g is an input gate or a non-output AND-gate;
– T (g), if g is an input gate with 1 under ;
– F (g), for the only output gate g;
– A(g), for those g that are neither inputs nor the output.
      </p>
      <p>To illustrate, the picture below shows an alternating monotone circuit C, an input
it, and the ABox AC , where the solid arrows represent R and the dashed ones S:
F
for</p>
      <p>OR
1
0
AND</p>
      <p>AND</p>
      <p>AND
OR
OR
1</p>
      <p>AND</p>
      <p>OR
OR
0</p>
      <p>OR
0
0</p>
      <p>T</p>
      <p>A</p>
      <p>A
A</p>
      <p>A
A
T</p>
      <p>A
A</p>
      <p>A</p>
      <p>A
One can show that C( ) = 1 iff Dis; AC j= q.</p>
      <p>Curiously, by changing S to R in the CQ from Example 1, we obtain an OMQ that
is NL-complete as follows from Theorem 8 below.
6</p>
      <p>NL vs. P
Theorem 8. Answering any OMQ Q = (Dis; qn) with</p>
      <p>n 1
qn = 9x1; : : : ; xn; y ^</p>
      <p>T (xi) ^ R(xi; xi+1) ^ T (xn) ^ R(xn; y) ^ F (y);
for n</p>
      <p>Proof. The lower bound follows from Theorem 4. The proof of the upper one is by
reduction to directed reachability. We split qn into two CQs:
One can show that, for any ABox A, we have Dis; A j= qn iff there exist a
homomorphism f : q0n ! A and a directed R-path f (xn); v0; v1; : : : ; vm 2 ind(A) such that
A(vi) 2 A, for i = 1; : : : ; m 1, and F (vm) 2 A. Clearly, this criterion reduces to
directed reachability.</p>
      <p>To further illustrate how minor modifications to the structure of CQs can send them
to different complexity classes, we collect in Table 1 a number of CQs in the scope of
Theorem 7, some of which turn out to be NL-complete, while others are P-complete.
(All the omitted labels on the arrows in Table 1 are assumed to be R, =A means either
blank or A, and F T =A means either F T or A).</p>
      <p>Here, we only sketch the proof of P-hardness for the OMQ (Dis; q), where q is
T F T</p>
      <p>R R
The proof is by reduction of the monotone circuit evaluation problem. Given a
monotone circuit C and an input , we define an ABox AC as the following labelled directed
graph, all of whose edges are labelled with R. For each gate g of C except the inputs
and output, the graph contains two vertices g and g0 labelled with A; the output gate
g gives rise to only one vertex g labelled with F , while each input gate g to only one
vertex g0 labelled according to . For an OR-gate g = h1 _ h2, we have the directed
edges (h01; g); (h02; g); (g; rg), where rg is a new vertex labelled with T . For an
ANDgate g = h1 ^ h2, we have the edges (h01; g); (g; h02). Also, for each gate g, we have the
edges (g; g0); (g0; tg), where tg is a new vertex labelled with T . An example illustrating
the construction is given below. One can show that C( ) = 1 iff Dis; AC j= q.
g3
AND</p>
      <p>1
g1 AND
1</p>
      <p>OR g2
0</p>
      <p>T
T
tg1</p>
      <p>The membership in NL for the CQs in the left column of Table 1 can be shown
by constructing appropriate linear datalog programs. For example, answering the OMQ
T
T
T
T
T
T
T
T
T
T
T</p>
      <p>T
T
T
T
T
=A
T
T
F
T
T</p>
      <p>F
T
FT
FT
T
=A
=A
T</p>
      <p>F
F
FT
FT
FT
=A
T</p>
      <p>F
FT
F
FT</p>
      <p>F
=A</p>
      <p>FT
FT
FT</p>
      <p>T
with the last CQ of the left column can be done by the following linear program:
P (x)
P (x)
P (x)</p>
      <p>G</p>
      <p>R(x; y); T (y); R(x; z); R(z; v); T (v)
R(x; y); T (y); R(x; z); R(z; v); P (v); A(v)
R(x; y); P (y); A(y)</p>
      <p>P (x); F (x)
Note that the classification problem we deal with in this section can be regarded as an
instance of a more general problem of classifying datalog programs in terms of their
data complexity, in particular, finding an NL/P dichotomy.
7</p>
    </sec>
    <sec id="sec-3">
      <title>CONP</title>
      <p>On the other hand, a minor extension of the CQ from Example 1 can lead to
CONPcompleteness. First we show that answering the OMQ Q = (Dis; q) with the Boolean
CQ q given in the picture below is CONP-complete.</p>
      <p>R
R
Q</p>
      <p>S
b2 F
c1 A
S</p>
      <p>S
a3
T
T
a1
F
b3
Q
A
c0</p>
      <p>S
A c3
F
b0</p>
      <p>S
a2 T</p>
      <p>T a0</p>
      <p>Q
Q
R
Q
Consider the ABoxes AN constructed according to the pattern shown below for N = 3:</p>
      <p>Q R</p>
      <p>R
Let V = fa0; : : : ; aN g. It is not hard to see that (i) for any interpretation I based
on AN , if I 6j= q then either V T I or V F I ; (ii) the interpretations I and I0
obtained by extending AN with T I = T A [ V and F I0 = F A [ V , respectively, are
both models of Dis that do not satisfy q.</p>
      <p>Given a 2+2-CNF with clauses D1; : : : ; DN and variables p1; : : : ; pM , we take
M disjoint copies of AN , distinguishing between them by the superscripts 1; : : : ; M .
For example, a23 is the a3-point of the second copy of AN and V 2 = fa02; : : : ; a2N g. For
each Dn of the form :pi _ :pj _ pk _ pl, we add to those copies the atoms R(ain; ajn),
S(ajn; akn) and Q(akn; aln), and denote the resulting ABox by A .</p>
      <p>A ain 1</p>
      <p>A ajn 1</p>
      <p>k
A an 1</p>
      <p>A aln 1
A ain
pi</p>
      <p>A ajn
pj</p>
      <p>S</p>
      <p>A akn
pk</p>
      <p>A aln
pl
We show that is satisfiable iff Dis; A 6j= q. Let q0 = R(x; y) ^ S(y; z) ^ Q(z; w).
Observe that any possible match of q0 in A falls into one of the two groups:
(A) (ain; bin; cin; ain+1), for 0 n N , 1 i M and addition modulo N + 1;
(B) (ain; aj ; an; aln), for some clause Dn = (:pi _ :pj _ pk _ pl) in .</p>
      <p>k
n</p>
      <p>Suppose is satisfiable under an assignment a. We define a model I of Dis by
extending A with T I = T A [ SfV i j a(pi) = 1g, F I = F A [ SfV i j a(pi) = 0g.
We claim that I 6j= q. Indeed, the tuples in (A) cannot yield a match by (ii) above,
while the tuples in (B) do not give a match since a(Dn) = 1, for all n N . To see this,
suppose a tuple (ain; ajn; akn; aln) from (B) is a match for q in I. Then fain; ajng T I
and fakn; alng F I , from which a(pi) = 1, a(pj ) = 1, a(pk) = 0 and a(pl) = 0, and
so the clause Dn = :pi _ :pj _ pk _ pl is false under a.</p>
      <p>Conversely, suppose Dis; A 6j= q. Then there is a model I of Dis based on A
such that I 6j= q. By (i) above applied to the copies of AN , for every i M , we have
either Vi T I or Vi F I . In the former case, we set a(pi) = 1; in the latter one, we
set a(pi) = 0. We claim that is satisfiable under a. Indeed, if Dn = :pi _:pj _pk _pl
is false under a, then a(pi) = 1, a(pj ) = 1, a(pk) = 0 and a(pl) = 0, and so the tuple
(ain; ajn; akn; al ) would be a match for q in I.</p>
      <p>n</p>
      <p>The proposed method is generic in the sense that we can try to apply it to any
‘sufficiently asymmetric’ CQ q with two T -atoms and two F -atoms: we use a T -F
fragment of q for copying the values of the Boolean variables, and the whole q for
encoding the clauses of a 2 + 2-CNF. However, this method does not work for the CQ</p>
      <p>T T F F
which requires a somewhat different technique. We show CONP-hardness of (Dis; q0)
by reduction of 3SAT. Given a 3CNF , we define an ABox A as follows. First, for
every variable p in , we construct a ‘gadget’ shown in the picture below, where the
number of A-nodes above each of the circles matches the number of clauses in ; we
refer to these nodes as p-nodes and, respectively, :p-nodes (below the circles, there are
2 p- and 2 :p-nodes):
q0</p>
      <p>R</p>
      <p>R</p>
      <p>R
F</p>
      <p>F
T</p>
      <p>F
T</p>
      <p>F
T
.</p>
      <p>A
.
.</p>
      <p>A
A
p
A
A
A
A
A
A
:p</p>
      <p>T
F</p>
      <p>F
T
:p
.</p>
      <p>A
.
.</p>
      <p>A
A
A
A
A</p>
      <p>T
F</p>
      <p>T</p>
      <p>T
T
F</p>
      <p>F
T</p>
      <p>A
A
q
A</p>
      <p>T
F</p>
      <p>F
T</p>
      <p>T</p>
      <p>F
A
A
r
A
Observe that, for any model I of Dis and the constructed gadget for p, if I 6j= q then
either (i) the p-nodes are all in F I and the :p-nodes are all in T I , or (ii) the p-nodes
are all in T I and the :p-nodes are all in F I .</p>
      <p>Now, for every clause c = (l1 _ l2 _ l3) in , we add to the constructed gadgets the
atoms T (c), R(c; ac
fresh :l1-node, alc2 :al1fr)e,sRh(la2-c:nlo1;dae,lc2a)n, dRa(lca3lc2a;farelc3s)h, lw3-hneoredec. Fisoar enxeawmipnlde,ivfiodrutahle, aclc:alu1sea
c = (p _ q _ r), we obtain the fragment below. The resulting ABox is denoted by A .
One can show that is satisfiable iff Dis; A 6j= q0.</p>
      <p>Acknowledgements. The work of O. Gerasimova and M. Zakharyaschev was carried
out at the National Research University Higher School of Economics and supported
by the Russian Science Foundation under grant 17-11-01294; the work of V.
Podolskii was supported by the Russian Academic Excellence Project ‘5-100’ and by grant
MK-7312.2016.1. Thanks are due to Frank Wolter and Carsten Lutz for comments,
suggestions and discussions.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Arora</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barak</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Computational Complexity: A Modern Approach</article-title>
          . Cambridge University Press, New York, NY, USA, 1st edn. (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Artale</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Calvanese</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kontchakov</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zakharyaschev</surname>
            ,
            <given-names>M.:</given-names>
          </string-name>
          <article-title>The DL-Lite family and relations</article-title>
          .
          <source>Journal of Artificial Intelligence Research (JAIR) 36</source>
          ,
          <fpage>1</fpage>
          -
          <lpage>69</lpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Bienvenu</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>ten Cate</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lutz</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wolter</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Ontology-based data access: A study through disjunctive datalog, csp, and MMSNP</article-title>
          .
          <source>ACM Transactions on Database Systems</source>
          <volume>39</volume>
          (
          <issue>4</issue>
          ),
          <volume>33</volume>
          :
          <fpage>1</fpage>
          -
          <lpage>44</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Calvanese</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Giacomo</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lembo</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lenzerini</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosati</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Tractable reasoning and efficient query answering in description logics: the DL-Lite family</article-title>
          .
          <source>Journal of Automated Reasoning</source>
          <volume>39</volume>
          (
          <issue>3</issue>
          ),
          <fpage>385</fpage>
          -
          <lpage>429</lpage>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Feier</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kuusisto</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lutz</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Rewritability in monadic disjunctive datalog, MMSNP, and expressive description logics</article-title>
          .
          <source>CoRR abs/1701</source>
          .02231 (
          <year>2017</year>
          ), http://arxiv.org/ abs/1701.02231
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Gottlob</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Papadimitriou</surname>
            ,
            <given-names>C.H.</given-names>
          </string-name>
          :
          <article-title>On the complexity of single-rule datalog queries</article-title>
          .
          <source>Inf. Comput</source>
          .
          <volume>183</volume>
          (
          <issue>1</issue>
          ),
          <fpage>104</fpage>
          -
          <lpage>122</lpage>
          (
          <year>2003</year>
          ), http://dx.doi.org/10.1016/S0890-
          <volume>5401</volume>
          (
          <issue>03</issue>
          )
          <fpage>00012</fpage>
          -
          <lpage>9</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Hernich</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lutz</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ozaki</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wolter</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Schema.org as a description logic</article-title>
          . In: Calvanese,
          <string-name>
            <given-names>D.</given-names>
            ,
            <surname>Konev</surname>
          </string-name>
          ,
          <string-name>
            <surname>B</surname>
          </string-name>
          . (eds.)
          <source>Proceedings of the 28th International Workshop on Description Logics</source>
          , Athens,Greece, June 7-10,
          <year>2015</year>
          .
          <source>CEUR Workshop Proceedings</source>
          , vol.
          <volume>1350</volume>
          .
          <string-name>
            <surname>CEUR-WS.org</surname>
          </string-name>
          (
          <year>2015</year>
          ), http://ceur-ws.
          <source>org/</source>
          Vol-
          <volume>1350</volume>
          /paper-24.pdf
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Kaminski</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nenov</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grau</surname>
            ,
            <given-names>B.C.</given-names>
          </string-name>
          :
          <article-title>Datalog rewritability of disjunctive datalog programs and non-Horn ontologies</article-title>
          .
          <source>Artif. Intell</source>
          .
          <volume>236</volume>
          ,
          <fpage>90</fpage>
          -
          <lpage>118</lpage>
          (
          <year>2016</year>
          ), http://dx.doi.org/10. 1016/j.artint.
          <year>2016</year>
          .
          <volume>03</volume>
          .006
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Papadimitriou</surname>
            ,
            <given-names>C.: Computational</given-names>
          </string-name>
          <string-name>
            <surname>Complexity. Addison-Wesley</surname>
          </string-name>
          (
          <year>1994</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Poggi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lembo</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Calvanese</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Giacomo</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lenzerini</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosati</surname>
          </string-name>
          , R.:
          <article-title>Linking data to ontologies</article-title>
          .
          <source>Journal on Data Semantics X</source>
          ,
          <fpage>133</fpage>
          -
          <lpage>173</lpage>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>