=Paper= {{Paper |id=None |storemode=property |title=The Complexity of Conjunctive Query Abduction in DL-Lite |pdfUrl=https://ceur-ws.org/Vol-745/paper_49.pdf |volume=Vol-745 |dblpUrl=https://dblp.org/rec/conf/dlog/CalvaneseOSS11 }} ==The Complexity of Conjunctive Query Abduction in DL-Lite== https://ceur-ws.org/Vol-745/paper_49.pdf
    The Complexity of Conjunctive Query Abduction in
                        DL-Lite?

    Diego Calvanese1 , Magdalena Ortiz2 , Mantas Šimkus2 , and Giorgio Stefanoni12
                      1
                     KRDB Research Centre for Knowledge and Data
                       Free University of Bozen-Bolzano, Italy
           calvanese@inf.unibz.it, giorgio.stefanoni@gmail.com
                         2
                            Institute of Information Systems
                      Vienna University of Technology, Austria
             ortiz@kr.tuwien.ac.at, simkus@dbai.tuwien.ac.at



        Abstract. In order to meet usability requirements, most logic-based applications
        provide explanation facilities for reasoning services. This holds also for DLs,
        where research focused on the explanation of both TBox reasoning and, more
        recently, query answering. Besides explaining the presence of a tuple in a query
        answer, it is important to explain also why a given tuple is missing. We address
        this latter problem for (conjunctive) query answering over DL-Lite ontologies,
        by adopting abductive reasoning, that is, we look for additions to the ABox that
        force a given tuple to be in the result. As reasoning tasks, we consider existence
        and recognition of an explanation, and relevance and necessity of a certain asser-
        tion for an explanation. We characterize the computational complexity of these
        problems for subset minimal and cardinality minimal solutions.


1     Introduction
Query answering over ontologies formulated in Description Logics (DLs) has received
considerable attention both in research and industry. Given an ontology, users typically
pose queries over the conceptual schema and get answers that take into account the con-
straints specified at the conceptual level. Many efforts have concentrated on lightweight
description logics. For instance, DL-LiteA has been tailored for query answering over
large data sets [7]. For this reason, expressive power is traded in favor of a better com-
putational behavior in terms of data-complexity. In fact, conjunctive query answering in
DL-LiteA enjoys FOL-rewritability, i.e., it can be reduced to the problem of evaluating
a suitably constructed FOL query over a database instance.
    In order to meet usability requirements set by domain users, most logic-based ap-
plications provide explanation algorithms for reasoning services. This holds also for
DLs, where research focused on the explanation of both TBox reasoning [12,6,14,3]
and, more recently, query answering [5]. In addition, the latter paper advocates the
importance of tackling the problem of explaining the absence of a tuple in the an-
swers to a query over an ontology. This problem stems from the database community,
where it has been solved in the context of databases extended with provenance infor-
mation [9]. We address this problem by considering explanations for the absence of a
?
    This work was partially supported by the Austrian Science Fund (FWF) grant P20840.
2          Diego Calvanese, Magdalena Ortiz, Mantas Šimkus, and Giorgio Stefanoni

tuple in the context of query answering over DL-LiteA ontologies. We adopt abductive
reasoning [10,11], that is, we consider which additions need to be made to the ABox to
force the given tuple to be in the result. More precisely, given a TBox T , an ABox A,
and a query q, an explanation for a given tuple t is a new ABox U such that the answer
to q over hT , A ∪ U i contains t. An important aspect in explanations is to provide the
user with solutions that are simple to understand and free of redundancy, hence as small
as possible. To address this requirement, we study various restrictions on solutions, in
particular, we focus on subset minimal and cardinality minimal ones. We consider stan-
dard decision problems associated to logic-based abduction: (i) existence of an expla-
nation, (ii) recognition of a given ABox as being an explanation, and (iii) relevance and
(iv) necessity of an ABox assertion, i.e., whether it occurs in some or all explanations.
After motivating such problems and formalizing them, we provide algorithms to solve
them and a precise characterization of their computational complexity for DL-LiteA .
The complexity results for the various reasoning tasks are summarized in Table 1.


2      Preliminaries

DL-LiteA . DL-LiteA is a member of the DL-Lite family of DLs [7], which have been
designed for dealing efficiently with large amounts of extensional information. In DL-
LiteA , concept expressions C, denoting sets of objects, and role expressions R, denoting
binary relations between objects, are formed as follows:

                        C −→ A | ∃R,                    R −→ P | P − .

where A denotes an atomic concept and P an atomic role3 . In a DL-LiteA ontology
O = hT , Ai, the TBox T consists of axioms of the form

                   C1 v C2 ,                R1 v R2 ,
                                                                     (funct R),
                   C1 v ¬C2 ,               R1 v ¬R2 ,

and the ABox A consists of assertions of the form A(c) and R(c, c0 ), where c, c0 are
constants (or, individuals) from a countably infinite set C. An interpretation is a pair
I = h∆I , ·I i, where ∆I is a non-empty domain, and the interpretation function ·I is
defined as usual. We adopt here the unique name assumption (UNA), i.e., cI1 6= cI2 for
all c1 , c2 ∈ C with c1 6= c2 . We refer to [7] for more details.

Conjunctive Queries. Let V be a countably infinite set of variables. Expressions A(t)
and P (t, t0 ) are called atoms, where t, t0 ∈ V ∪ C. A conjunctive query (CQ) q is
an expression q(x1 , . . . , xn ) ← a1 , . . . , am , where each ai , 1 ≤ i ≤ m, is an atom.
Let V(q) denoteSthe set of variables occurring in q, C(q) the set of constants in q,
and let at(q) = 1≤i≤m {ai }. A match for q in an interpretation I is a mapping π :
V(q) ∪ C(q) → ∆I such that π is the identity on constants, π(t) ∈ AI for each
 3
     We ignore here the distinction between data values and objects, since it is immaterial for our
     results. As a consequence, we do not consider value domains and attributes, which are present
     in DL-LiteA .
                            The Complexity of Conjunctive Query Abduction in DL-Lite                   3

A(t) ∈ at(q), and hπ(t), π(t0 )i ∈ P I for each P (t, t0 ) ∈ at(q). The tuple hx1 , . . . , xn i
is the tuple of answer variables of q. The answer to q over I, denoted ans(q, I), is the
set of all n-tuples hd1 , . . . , dn i ∈ Cn such that hdI1 , . . . , dIn i = hπ(x1 ), . . . , π(xn )i for
some match π for q in I. A union of conjunctive queries (UCQ)Sis a set of CQs with
the same answer variable tuple. For a UCQ q, we let ans(q, I) = q0 ∈q ans(q 0 , I). The
certain answer to a CQ or a UCQ q over O is defined as cert(q, O) = {c ∈ Cn | c ∈
ans(q, I) for each model I of O}.


3    Explaining Negative Query Answers

We now define the problem considered in this paper:

Definition 1. Let O = hT , Ai be an ontology, q a UCQ, and c a tuple of constants. We
call P = hO, q, ci a query abduction problem (QAP). A solution to P (or an explana-
tion for P) is any ABox U such that the ontology O0 = hT , A ∪ U i is consistent and
c ∈ cert(q, O0 ). The set of all explanations for P is denoted expl(P).

    If c ∈
         / cert(q, O), then we call c a negative answer to q over O. Note that a query
over the ontology can have a negative answer only if the ontology is satisfiable. Dif-
ferently, if the ontology is unsatisfiable then the QAP P does not have any solution.
In the following, we will examine various restrictions to expl(P) to reduce redundancy
in explanations. This is achieved by the introduction of a preference relation among
explanations. This relation is reflexive and transitive, i.e., we have a pre-order among
solutions.

Definition 2. Assume a QAP P. Let  denote a pre-order on the set expl(P) of solu-
tions. We write U ≺ U 0 if U  U 0 and U 0  U. The preferred explanations expl (P) of
a QAP P under the pre-order  are defined as follows: expl (P) = { U ∈ expl(P) |
 there is no U 0 ∈ expl(P) s.t. U 0 ≺ U }, i.e., expl (P) contains all the -explanations
that are minimal under .

    Two preference orders are considered here: the subset-minimality order, denoted by
⊆, and the minimum explanation size order, denoted by ≤. The latter order is defined
by U ≤ U 0 iff |U| ≤ |U 0 |. Observe that expl≤ (P) ⊆ expl⊆ (P).
    We define now four decision problems related to (minimal) explanations, which are
parametric w.r.t. the chosen preference order . Given a QAP P:
  – -EXISTENCE: Does there exist a -explanation for P?
  – -RECOGNITION: Is a set U of ABox assertions a -explanation for P?
  – -RELEVANCE: Does an assertion α occur in some -explanation for P?
  – -NECESSITY: Does an assertion α occur in all -explanations for P?
In the following, whenever no preference is applied (i.e., when  is the identity) we
omit to write  in front of the problem’s names. We provide an example, in which we
highlight the consequences of choosing among the various orderings.
4       Diego Calvanese, Magdalena Ortiz, Mantas Šimkus, and Giorgio Stefanoni


               Table 1: Summary of main complexity results (completeness)
               -EXISTENCE     -RECOGNITION    -RELEVANCE      -NECESSITY
        none     PT IME (4.1)        NP            PT IME (4.3)    PT IME (4.2)
         ≤          PT IME           DP                PNP
                                                        k            PNP
                                                                      k (4.2)
         ⊆          PT IME           DP              Σ2P (4.3)     PT IME (4.2)



Example 1. Let O = hT , Ai be an ontology describing a university domain, where T
is

 PostGrad v Student,         Tutor v Professor , Advanced v Course,
UnderGrad v Student,     ∃hasTutor v PartTime,    ∃teaches v Professor ,
UnderGrad v ¬PostGrad , ∃hasTutor − v Tutor ,    ∃teaches − v Course,
 PartTime v UnderGrad .

    That is, there are two different kinds of students, PostGrad and UnderGrad . More-
over, PartTime students are tutored by Tutor s, who are particular professors. Addi-
tionally, the university offers some Advanced courses. Let the ABox A consist of the
assertions teaches(rob, SWT ), hasTutor (peter , rob). Now, assume that a user is in-
terested in finding all those who both teach an advanced course and tutor a student.
Then, she would write the query

                  q(x) ← teaches(x, y), Advanced (y), hasTutor (z, x).

Moreover, she may expect rob to be part of the result, i.e., rob ∈ cert(q, O),
but this is not the case. Intuitively, rob satisfies all the constraints imposed by the
query, except that the SWT course is not known to be Advanced . One can easily
see that {teaches(rob, TOC ), Advanced (TOC ), hasTutor (john, rob)} is an expla-
nation, {teaches(rob, ALG), Advanced (ALG)} is a ⊆-minimal explanation, while
{Advanced (SWT )} is a ≤-minimal explanation.

    In the next section, the complexity of the four main problems is studied in the light
of the different preference relations.


4   Complexity of Explanations
Table 1 provides an overview of our complexity results. Recall that the class Σ2P is
a member of the Polynomial Hierarchy [13]; it is the class of all decision problems
solvable in non-deterministic polynomial time using an NP oracle. Moreover, the class
PNP
  k contains all the decision problems that can be solved in polynomial time with an
NP oracle, where all oracle calls must be first prepared and then issued in parallel. The
class DP contains all problems that, considered as languages, can be characterized as
the intersection of a language in NP and a language in CO NP [13]. Note that: PT IME ⊆
NP ⊆ DP ⊆ PNP            P
                k ⊆ Σ2 is believed to be a strict hierarchy of inclusions and here we
make such an assumption.
                        The Complexity of Conjunctive Query Abduction in DL-Lite         5

   Our results can be explained as follows. We show in the next section that EX -
ISTENCE can be reduced to the PT IME -complete satisfiability problem for DL-LiteA
without the UNA [1], which justifies our PT IME upper bound. This result can then be
used to characterize the complexity of RELEVANCE, NECESSITY, and ⊆-NECESSITY.
≤-RELEVANCE and ≤-NECESSITY are harder. The reason being that in order to solve
these problems one has to compute first the minimal size of a solution and, then, inspect
all the solutions of that size. Additionally, there is another increase in complexity when
dealing with ⊆-RELEVANCE. The intuition is that there is an exponential number of
candidate solutions to examine and for each of them one has to check that none of its
subsets is itself a solution, which requires a CO NP computation. Due to space limita-
tions, the results on -RECOGNITION are not detailed in this paper (see [8] for more
details). The intuition for the NP bound for RECOGNITION is that one needs simply to
check consistency and perform query evaluation to solve the problem. In case a prefer-
ence order is in place, one has to check minimality as well, which is a CO NP check for
⊆- and ≤-minimal explanations that leads to completeness for DP.


4.1   Complexity of -EXISTENCE

For -EXISTENCE note first that the existence of an explanation for P implies the
existence of an explanation under the ⊆ and ≤ orderings. Thus, we only consider EX -
ISTENCE . Our first task is to show that we can restrict ourselves to explanations built
from the original signature of the input QAP plus a small number of fresh constants.

Proposition 1. If P = hO, q, ci has a solution, then P has a solution U 0 with con-
cepts, roles, and attributes only from O and at most m = maxqi ∈q |at(qi )| fresh ABox
individuals.

Proof. Assume an arbitrary solution U to P. Given the consistency of O0 = hT , A ∪ U i,
it follows that there exists a model I of O0 under the UNA. W.l.o.g. we assume that
∆I = C with cI = c for each c ∈ C. Additionally, the interpretation I admits a match
π for some qi (x) ∈ q, such that π(x) = c. Let U 0 = {A(o) | A(t) ∈ at(qi ) and π(t) =
o} ∪ {R(o, o0 ) | R(t, t0 ) ∈ at(qi ) and π(t) = o and π(t0 ) = o0 }. Observe that U 0
has no more individuals than qi has variables. It remains to see that U 0 is a solution.
Clearly the original match π witnesses also c ∈ ans(qi , A ∪ U 0 ). It remains to see that
O00 = hT , A ∪ U 0 i is consistent. But this follows from the fact that I is a model of O0
and that the atoms in U 0 hold in I                                                      t
                                                                                         u

    The above restriction allows us to consider canonical explanations, i.e., explana-
tions resulting from suitable instantiations of the bodies of CQs qi ∈ q. Keeping in
mind that CQs, seen as FOL formulae, are always satisfiable, an explanation does not
exist only if the structure of the query is not compliant with the constraints expressed in
the ontology. That is, for all the interpretations J of q with ans(q, J ) 6= ∅, there is no
model I of O, such that I ∪ J |= O. To check whether a UCQ is compliant with the
ontological constraints, a naïve method is to iteratively go through all the CQs in q and
instantiate them in the ABox. If for none of the CQs we obtain a consistent ontology,
then the query violates some of the constraints imposed at the conceptual level.
6       Diego Calvanese, Magdalena Ortiz, Mantas Šimkus, and Giorgio Stefanoni

Proposition 2. For DL-LiteA ontologies, EXISTENCE is PT IME-complete.
Proof. (MEMBERSHIP) Note that P = hO, q, ci, with q a UCQ, has a solution iff Pq0 =
hO, q 0 , ci has a solution for some q 0 ∈ q. Hence, it suffices to show the upper bound
for CQs. To this end, we provide a logspace reduction from EXISTENCE to consistency
in DL-LiteA without UNA, which in turn is PT IME-complete [1]. Assume a QAP P =
hO, q, ci, where q is a CQ and O = hT , Ai. We argue that P has a solution iff O0 =
hT ∪ T 0 , A ∪ Uq ∪ A0 i is consistent, where O0 is an ontology obtained from O and
q(c) as follows. The ABox Uq is obtained from at(q(c)) by replacing each variable
x with a fresh individual name ax . The ABox A0 consists of assertions Ao (o) for all
constants o occurring in T , A and Uq , where each Ao is a fresh concept name. The
TBox T 0 consists of axioms Ao v ¬Ao0 for all pairs o 6= o0 of constants occurring in A
and q(c). We now show that P has a solution iff O0 is consistent.
      Assume that P has a solution U. Then, due to consistency of O00 = (T , A ∪ U),
there is a model I of O00 under the UNA. Additionally, I admits a match π for q(c). Let
                                                                                     0
I 0 be the extension of I that additionally interprets (i) constants in Uq as aIx = π(x)
                                      0
for all variables x in q, and (ii) AIo = {oI } for all freshly introduced Ao . It remains
                 0                 0
to show that I is a model of O . Observe that since I is under the UNA, we have that
    0       0
AIo ∩ AIo0 = ∅ for all constant pairs o 6= o0 . Thus I 0 satisfies all the disjointness axioms
Ao v ¬Ao0 in T 0 . The assertions in A0 are satisfied due to (ii), while the assertions in
Uq due to (i) above.
      The other direction of the proof is obvious and we omit it here.
      (HARDNESS) Let us reduce consistency in DL-LiteA without UNA to EXISTENCE.
Given O = hT , Ai, we create QAP P = hO0 , q(), hii simply by encoding the ABox A
in the CQ q by replacing each constant a ∈ A by a distinct variable name xa in q, while
the ontology O0 consists only of the TBox T .                                                t
                                                                                             u

4.2   Complexity of (-)NECESSITY
The existence of an explanation is most of the times not very informative to the user.
In fact, given a negative answer to a query, it is important to delineate the fundamental
reasons leading to the absence of the expected tuple. That is, users would like to know
which assertions occur in all the solutions to a QAP P.
Proposition 3. For DL-LiteA ontologies, the NECESSITY and ⊆-NECESSITY problems
are PT IME-complete.
Proof. (MEMBERSHIP) We assume a QAP P = hO, q, ci with O = hT , Ai, an asser-
tion ϕ(t) and consider NECESSITY first. This problem can be reduced to non-EXISTENCE,
which was shown to be in PT IME in the previous section. We build O0 = hT 0 , A0 i such
that ϕ(t) occurs in all explanations for P iff P 0 = hO0 , q, ci has no explanation. We
define O0 by setting A0 = A ∪ {ϕ̄(t)} and T 0 = T ∪ {ϕ̄ v ¬ϕ}, where ϕ̄ is a fresh
predicate name. It is easy to see that if P 0 has no explanation, then either P has no ex-
planation as well, or, all the explanations for P must contain ϕ(t). For ⊆-NECESSITY,
observe that ϕ(t) occurs in all explanation for P iff ϕ(t) occurs in all ⊆-minimal ex-
planations for P. Thus ⊆-NECESSITY can be decided in polynomial time using our
algorithm for NECESSITY.
                         The Complexity of Conjunctive Query Abduction in DL-Lite          7

   (HARDNESS) The lower-bound can be proved through a logspace reduction from
EXISTENCE to non- NECESSITY , that is, deciding whether there exists a solution to a
QAP that does not contain the given assertion. Let P = hO, q, ci be a QAP with q
a CQ, we build hP, αi such that P has a solution iff hP, αi is a negative instance to
NECESSITY . Let α = A(o), for some fresh concept A and constant o not occurring
in P. Now, assume P has a solution U. By Proposition 1, we know that there exists a
solution U 0 to P containing only concept and role names from O. Hence, A(o) 6∈ U 0 ,
since concept name A is not in the ontology. Therefore, one can conclude that A(o) is
not a necessary assertion to P.
    The other direction is straightforward.                                        t
                                                                                   u

      Let us now show the complexity of necessity under the ≤ preference order.
Theorem 1. For DL-LiteA ontologies, ≤-NECESSITY is PNP
                                                    k -complete.

Proof. (MEMBERSHIP only, HARDNESS in [8]) Let’s assume a QAP P = hO, q, ci and
an assertion α. By the use of canonical explanations, we know that the size m of the
largest solution to P corresponds to the size of the largest CQ in q. Observe that (P, α)
is a negative instance of ≤-NECESSITY iff there is an 0 ≤ i ≤ m such that (a) P has
an explanation U with |U| = i and α 6∈ U, and (b) U is ≤-minimal. Thus, we use an
auxiliary problem SIZE - OUT, which is to decide given a tuple hP 0 , α0 , n0 i, where P 0 is
a QAP, α0 is an assertion, and n0 is an integer, whether there exists an explanation U 0
for P 0 such that |U 0 | = n0 and α0 6∈ U 0 . Furthermore, the problem NO - SMALLER is to
decide given a tuple hP 0 , n0 i of a QAP and an integer whether there is no explanation
U 0 for P 0 such that |U 0 | < n0 . Observe that SIZE - OUT is in NP, while NO - SMALLER
is in CO NP. Take the tuple S = hA0 , B0 , . . . , Am , Bm i, where (a) Ai = (P, α, i),
for all 0 ≤ i ≤ m, and (b) Bi = (P, i), for all 0 ≤ i ≤ m. Due to the above
observation, α occurs in all ≤-minimal explanations U for P iff for all 0 ≤ i ≤ m, one
of the following holds: (i) Ai is a negative instance of SIZE - OUT, or (ii) Bi is a negative
instance of NO - SMALLER. Note that S can be built in polynomial time in the size of the
input, while the positivity of the instances in S can be decided by making 2m parallel
calls to an NP oracle. Thus we obtain membership in PNP    k .                              t
                                                                                            u


4.3    Complexity of -RELEVANCE

A domain user faced with a negative answer to a query may ask herself whether, the
absence of a certain ABox assertion α in the ontology is related with the lack of the tuple
in the results. That is, she would like to know whether α occurs in some explanation to
QAP P.

Proposition 4. For DL-LiteA ontologies, RELEVANCE is PT IME-complete.

Proof. (MEMBERSHIP) We assume a QAP P = hO, q, ci with O = hT , Ai and an
assertion φ(t). We now provide a reduction from RELEVANCE to EXISTENCE. We con-
struct O0 = hT , A0 i, where A0 = A∪φ(t). Then, P has an explanation U with φ(t) ∈ U
iff there exists an explanation to P 0 = (O0 , q, c). This is because, any explanation to
P 0 can be extended by adding φ(t). It is simple to see that any such explanation is an
8        Diego Calvanese, Magdalena Ortiz, Mantas Šimkus, and Giorgio Stefanoni

explanation to P, as well. Finally, if P 0 does not admit explanations, then φ(t) is a
source of inconsistency in P.
   (HARDNESS) Hardness can again be proved with a reduction from EXISTENCE in a
similar way as done in Section 4.2.                                                 t
                                                                                    u

   Let us now tackle the problem of ⊆-RELEVANCE, for which we recall the FOL-
rewritability of query answering in DL-LiteA . Given an ABox A, let DB(A) be the fol-
lowing interpretation: (i) ∆DB(A) is the set of constants occurring in A, (ii) ADB(A) =
{o ∈ ∆DB(A) | A(o) ∈ A}, for each atomic concept A, and (iii) P DB(A) = {ho, o0 i ∈
∆DB(A) | P (o, o0 ) ∈ A}, for each atomic role P .

Proposition 5 ([7]). Let PerfectRef(q, T ) be the perfect reformulation of q w.r.t. T ,
which
S     is a UCQ obtained by applying the rewrite rules given in [7]. Then, cert(q, O) =
  r∈PerfectRef (q,T ) ans(r, DB(A)).

In other words, the certain answers to a UCQ can be computed by rewriting the query
into a FOL query to be evaluated over the ABox.

Theorem 2. For DL-LiteA ontologies, ⊆-RELEVANCE is Σ2P -complete.

Proof. (MEMBERSHIP) The membership in Σ2P is clear from Algorithm 1, which works
as follows. An explanation U containing φ(t) is non-deterministically computed by
guessing an instantiation of a subquery in PerfectRef(q(c), T ), where Anon is a set of
fresh ABox individuals (see Proposition 1). Let HAS - SUBEXPL solve the problem of
deciding whether a solution U has a subset which is itself an explanation. The prob-
lem can be easily proved to be in NP. Then, the algorithm checks the complement
of HAS - SUBEXPL in order to assure that none of the subsets of U is itself an expla-
nation, from which it follows that φ(t) is ⊆-relevant. Checking the complement of
HAS - SUBEXPL requires the power of a CO NP machine. For this reason, the algorithm
is solvable in non-deterministic polynomial time by a TM with an NP oracle.
    (HARDNESS) We prove it by a reduction from the Σ2P -complete problem co-CERT 3 COL
[15] (see also [4]). An instance of co-CERT 3 COL is given by a graph G = (V, E) with
vertices V = {0, . . . , n − 1} such that every edge is labeled with a disjunction of
two literals over the Boolean variables {p(i,j) | i, j < n}. G is a positive instance if
there is a truth value assignment t to the Boolean variables such that the graph t(G)


Algorithm 1
INPUT: QAP P = hq, O, ci and ABox assertion φ(t)
OUTPUT: yes iff φ(t) is relevant to P
 1: Guess qi ∈ {q1 , . . . , qn } = q
 2: Guess the derivation of one rewriting r(c) in PerfectRef(qi (c), T )
 3: Guess a set of atoms U ⊆ at(r)
 4: Guess a mapping π from V(q) to constants in DB(A) and Anon
 5: Check that (T , A ∪ U) 6|= ⊥, where U is the instantiation of U through π.
 6: Check that φ(t) ∈ U and π is a match for r(c) over DB(A ∪ U)
 7: Check that HAS - SUBEXPL (P, U) = no.
                            The Complexity of Conjunctive Query Abduction in DL-Lite                  9

obtained from G by including only those edges whose label evaluates to true under t
is not 3-colorable. Assume an instance G of co-CERT 3 COL. We show how to build in
polynomial time a QAP PG = h(TG , AG ), qG , cG i and an ABox assertion αG such
that: G is a positive instance of co-CERT 3 COL iff αG is ⊆-relevant for PG . We use
an empty TBox and a Boolean query, thus TG = ∅ and cG = hi. The query qG is
a UCQ qG = qe1 ∪ · · · ∪ qek ∪ q 0 , where {e1 , . . . , ek } = E, each qei is an atomic
query qei () ← Wei (x, y), and q 0 is defined as follows. Assume B = {t, f } to be the
set of truth values. The query q 0 has the following atoms for each edge e = (i, j) in E:
(a) B(xi ), Re (xi , ye ), Re (ye , xj ), B(xj ), and (b) P (ye , zpi ), Api (zpi ), Wpi (zpi , zp0 i ),
where pi ∈ {p1 , p2 } and p1 , p2 are the first and the second proposition in the labeling
of e, respectively. The query q 0 simply incorporates G together with the disjunctions on
the edges. Observe that if two edges have the same proposition in their label, then this
will be reflected in q 0 by some shared variables zpi .
    To build AG we use individuals cp and c¬p for the truth value of proposition p.
Intuitively, the truth value of p will be determined by either Ap (cp ) or Ap (c¬p ) being
in the update. Assume a tuple t = he, v1 , v2 , a, bi, where e ∈ E, {v1 , v2 } ⊆ B, and a, b
are individuals. Let p1 , p2 be the first and the second propositions of e. For i ∈ {1, 2}
and vi = t , let li = pi if pi is positive and li = ¬pi otherwise. Similarly, for i ∈ {1, 2}
and vi = f , let li = ¬pi if pi is positive and li = pi otherwise. Then, the ABox A(t)
consists of the assertions Re (a, dT ), Re (dT , b), P (dT , cl1 ) and P (dT , cl2 ) depending
on the boolean values in input.
    The ABox AG is the union of the following ABoxes:
(A1) A(he, v, v 0 , ai , aj i) for all e ∈ E, v, v 0 ∈ B, 0 ≤ i, j ≤ 2, and i 6= j;
(A2) A(he, f, f, ai , ai i) for all e ∈ E, and 0 ≤ i ≤ 2;
(A3) A(he, v, v 0 , b, bi) for all e ∈ E, v, v 0 ∈ B;
(A4) The ABox {B(a0 ), B(a1 ), B(a2 )};
(A5) The assertions Wp (cp , c¬p ) and Wp (c¬p , cp ) for all propositions.
     Let αG = B(b). It is not too difficult to see that G is a positive instance of co-
CERT 3 COL iff there exists an ⊆-explanation U to P such that αG       ∈ U. Basically,
definitions (A1)-(A3) encode a triangular structure T in which edges in G that evaluate
to false according to a given truth assignment can be mapped on any edge of T , reflexive
edges included. If an edge of G evaluates to true, then it must be mapped to one of the
non-reflexive edges. This ensures that if G can be mapped to T under truth assignment
t, then t(G) is 3-colorable. Instead, definitions (A4)-(A5) define a cyclic structure C
into which any graph G can be embedded. It has to be noted that the node b is not
asserted to be a member of B, hence qG cannot be mapped there directly with any truth
assignment. We see this more formally next:
    “⇒” Suppose there is a truth assignment t such that t(G) is not 3-colorable. Let
U = {B(b)}∪U1 , where U1 = {Ap (cp ) | t(p) = t}∪{Ap (c¬p ) | t(p) = f }. It remains
to argue that U is a ⊆-explanation to P. It is not hard to see that U is an explanation.
Indeed qG matches already in the ABox obtained by point (A3) (hint: since B(b) ∈ U,
we match qG by mapping all variables of qG to (interpretation of) b). Suppose there is
a smaller update U 0 ⊂ U. Observe that U1 ⊆ U 0 . This is because for all propositions p,
the symbol Ap does not occur in AG but does occur in qG . Then, U \ {B(b)} must be an
10      Diego Calvanese, Magdalena Ortiz, Mantas Šimkus, and Giorgio Stefanoni

update. If this is the case, then qG can be matched in AG without the ABox from (A3),
i.e. in the triangle part. Then t(G) is 3-colorable, which contradicts the assumption.
     “⇐” Let U be a ⊆-minimal explanation containing B(b). Due to the presence of
qe1 ∪ . . . ∪ qek in qG and the assumption, the role Wp cannot occur in U for any
proposition p. Since U is an explanation, by the definition of q 0 and (A5) we have
that Ap (cp ) ∈ U or Ap (c¬p ) ∈ U for all propositions p. Since for any proposition p
we have that Ap occurs in qG with one and only variable zp , we know that exactly one
of Ap (cp ) ∈ U and Ap (c¬p ) ∈ U holds. Due to the atoms Wp (zp , zp0 ) in qG , we also
have that individuals of the form cp and c¬p are the only ones that can get an Ap label.
Consider the assignment t defined as follows: t(e) = t if Ap (cp ) ∈ U, and t(e) = f
if Ap (c¬p ) ∈ U. It is not difficult to argue that t(G) is not 3-colorable and thus G is a
positive instance of co-CERT 3 COL. Indeed, if t(G) was 3-colorable, Q should be map-
pable into the triangle part obtained in (A1)-(A3). Then U \ {B(b)} would be a smaller
update, which would mean a contradiction.                                                t
                                                                                         u

Note that the above lower bound holds already for empty TBoxes.

Proposition 6. For DL-LiteA ontologies, ≤-RELEVANCE is PNP
                                                        k -complete.

Proof. (MEMBERSHIP only, HARDNESS in [8]) ≤-RELEVANCE can be tackled in a way
similar to ≤-NECESSITY. In fact, the algorithm described in Theorem 1 can be modi-
fied in order to solve this problem. Let SIZE - IN solve the following problem: given a
tuple hP, α, ni, where P is a QAP, α an assertion, and n an integer, decide whether
there exists an explanation U, with |U| = n and α ∈ U. Then, we change the positivity
condition of the ≤-NECESSITY algorithm as follows: α occurs in some ≤-minimal ex-
planations U for P iff for some i, 0 ≤ i ≤ m, it holds that: (i) Ai is a positive instance
of SIZE - IN, and (ii) Bi is a positive instance of NO - SMALLER. It is easy to see that
                                                                   NP
SIZE - IN is solvable in NP, hence the whole problem is again in Pk .                   t
                                                                                        u


5    Conclusions

In this paper we have provided the formalization of a new problem, namely the explana-
tion of negative answers to user queries over ontologies. A tuple is said to be a negative
answer, if the user expects it to be part of cert(q, O) but the tuple is actually not. In
our framework, an explanation consists of an ABox that when added to the ontology
leads the negative answer to be returned in the results of the query. We define various
problems that help us in characterizing the complexity of finding explanations, such
as the existence of explanations and relevance/necessity of assertions. We further con-
sider a minimality criterion to be applied over explanations, such as subset-minimal and
minimum-size preference orders. Within this framework, we provide a characterization
of the computational complexity of the various problems for the DL DL-LiteA .
    Future work includes studying the application of this framework to other lightweight
description logics, starting with the EL-family. We would also like to investigate the
problem in the case where the ontology signature and the explanation signature may be
different. That is, the signature over which explanations can be constructed is restricted
only to a subset of the ontology signature [2].
                           The Complexity of Conjunctive Query Abduction in DL-Lite             11

References
 1. A. Artale, D. Calvanese, R. Kontchakov, and M. Zakharyaschev. The DL-Lite family and
    relations. J. of Artificial Intelligence Research, 36:1–69, 2009.
 2. F. Baader, M. Bienvenu, C. Lutz, and F. Wolter. Query and predicate emptiness in description
    logics. In Proc. of the 12th Int. Conf. on the Principles of Knowledge Representation and
    Reasoning (KR 2010), 2010.
 3. F. Baader and R. Peñaloza. Axiom pinpointing in general tableaux. J. of Logic and Compu-
    tation, 20(1):5–34, 2010.
 4. P. A. Bonatti, C. Lutz, and F. Wolter. The complexity of circumscription in description logics.
    J. of Artificial Intelligence Research, 35:717–773, 2009.
 5. A. Borgida, D. Calvanese, and M. Rodríguez-Muro. Explanation in the DL-Lite family of
    description logics. In Proc. of the 7th Int. Conf. on Ontologies, DataBases, and Applications
    of Semantics (ODBASE 2008), volume 5332 of Lecture Notes in Computer Science, pages
    1440–1457. Springer, 2008.
 6. A. Borgida, E. Franconi, and I. Horrocks. Explaining ALC subsumption. In Proc. of the
    14th Eur. Conf. on Artificial Intelligence (ECAI 2000), 2000.
 7. D. Calvanese, G. De Giacomo, D. Lembo, M. Lenzerini, A. Poggi, M. Rodríguez-Muro, and
    R. Rosati. Ontologies and databases: The DL-Lite approach. In S. Tessaris and E. Franconi,
    editors, Semantic Technologies for Informations Systems – 5th Int. Reasoning Web Summer
    School (RW 2009), volume 5689 of Lecture Notes in Computer Science, pages 255–356.
    Springer, 2009.
 8. D. Calvanese, M. Ortiz, M. Šimkus, and G. Stefanoni. The complexity of conjunctive query
    abduction in DL-Lite. Technical Report INFSYS RR 1843-11-01, Institute of Informa-
    tion Systems, Vienna University of Technology, 2011. Available at: http://www.kr.
    tuwien.ac.at/research/reports/.
 9. A. Chapman and H. V. Jagadish. Why not? In Proc. of the ACM SIGMOD Int. Conf. on
    Management of Data, pages 523–534, 2009.
10. T. Eiter and G. Gottlob. The complexity of logic-based abduction. J. of the ACM, 42(1):3–42,
    1995.
11. S. Klarman, U. Endriss, and S. Schlobach. ABox abduction in the description logic ALC. J.
    of Automated Reasoning, 46(1):43–80, 2011.
12. D. L. McGuinness and A. Borgida. Explaining subsumption in description logics. In Proc.
    of the 14th Int. Joint Conf. on Artificial Intelligence (IJCAI’95), pages 816–821, 1995.
13. C. H. Papadimitriou. Computational Complexity. Addison Wesley Publ. Co., 1994.
14. R. Penaloza and B. Sertkaya. Complexity of axiom pinpointing in the DL-Lite family. In
    Proc. of the 23rd Int. Workshop on Description Logic (DL 2010), volume 573 of CEUR
    Electronic Workshop Proceedings, http://ceur-ws.org/, 2010.
15. I. A. Stewart. Complete problems involving boolean labelled structures and projection trans-
    actions. J. of Logic and Computation, 1(6):861–882, 1991.