=Paper= {{Paper |id=Vol-1350/paper-47 |storemode=property |title=Answering EL Queries in the Presence of Preferences |pdfUrl=https://ceur-ws.org/Vol-1350/paper-47.pdf |volume=Vol-1350 |dblpUrl=https://dblp.org/rec/conf/dlog/CeylanLP15 }} ==Answering EL Queries in the Presence of Preferences== https://ceur-ws.org/Vol-1350/paper-47.pdf
         Answering EL Queries in the Presence of
                      Preferences

       İsmail İlkan Ceylan1? , Thomas Lukasiewicz2 , and Rafael Peñaloza3??
                1
                 Theoretical Computer Science, TU Dresden, Germany
                          ceylan@tcs.inf.tu-dresden.de
             2
               Department of Computer Science, University of Oxford, UK
                         Thomas.Lukasiewicz@cs.ox.ac.uk
           3
             KRDB Research Centre, Free University of Bozen-Bolzano, Italy
                             rafael.penaloza@unibz.it

    Conjunctive query (CQ) answering is an important reasoning task in descrip-
tion logics (DLs). Its goal is to retrieve the tuples of individuals that satisfy a
conjunctive query; i.e., a finite set of atomic queries. These tuples are called an-
swers. Clearly, a given CQ may have a considerable number of answers, specially
if the set of individual names appearing in the ABox is large, as is the case for
many existing DL ontologies. In order to manage all these answers in a structural
manner, one can try to extend query answering with preference criteria, in such
a way that the most preferred answers are returned first.
    Possibilistic networks (PNs) have arisen as a way of representing conditional
preferences over a finite set of events in a compact way [1]. The general idea is
to provide a possibility degree to each conditional event which is proportional to
the preference given to that event. We apply this idea to model the preferences
of query answers indirectly, by modeling the preferences over the contexts that
entail them. In a nutshell, we divide an EL knowledge base (KB) into contexts,
and use a possibilistic network to describe the joint possibility distribution over
these contexts. Our formalism is based on ideas previously presented for rea-
soning under probabilistic uncertainty described by a Bayesian network [3]. The
preference of an answer to the query is the possibility degree of the best con-
text that entails this answer. Dually, we also compute, given a query, the most
preferred source; that is, the context with the highest degree that entails this
query.
    Similar to Bayesian networks [4], PNs are graphical models providing a com-
pact representation of a discrete possibility distribution, through some inde-
pendence assumptions [2]. A possibility distribution over a set Ω is a func-
tion Pos : Ω → [0, 1] that intuitively provides a degree of how possible is an
event ω ∈ Ω to happen. This function is extended to sets Γ ⊆ Ω by defining
Pos(Γ ) = supω∈Γ Pos(ω). The product conditional distribution which is defined
by the equation Pos(Γ ∩ Θ) = Pos(Γ | Θ) · Pos(Θ).
    Possibilistic networks decompose a possibility distribution into a product of
conditional probability distributions that depend on the structure of a graph. A
?
     Supported by DFG within the Research Training Group “RoSI” (GRK 1907).
??
     The work was developed while the author was still affiliated with TU Dresden and
     the author has been partially supported by the Cluster of Excellence “cfAED”.
                                   x ¬x
                                  0.4 0.7         x

                                                                    z ¬z
                       y ¬y
                                                           x y     0.4 0.9
               x      0.5 1        y                  z    x ¬y    0.1 0.1
               ¬x     0.3 0.1                             ¬x y     0.6 1
                                                          ¬x ¬y    0.9 0


                    Fig. 1. A possibilistic network over V0 = {x, y, z}


possibilistic network (PN) is a pair P = (G, Φ), where G = (V, E) is a DAG,
and Φ contains a conditional possibility distribution PosP (x | pa(x)) of every
variable x ∈ V given its parents pa(x) (see Figure 1). This PN defines the joint
possibility distribution over the valuations of the variables in V
                                      Y
                         PosP (V ) =      PosP (x | pa(x)).
                                            x∈V


     Let V be a fixed but arbitrary finite set of propositional variables. A V -context
is a propositional formula over V . A V -GCI is of the form hC v D : ϕi with C, D
concepts and ϕ a V -context. A V -TBox is a finite set of V -GCIs. V -assertions
are of the form hC(a) : ϕi or hr(a, b) : ϕi where r ∈ NC , a, b ∈ NI , C is a concept
and ϕ is a V -context. A V -ABox is a finite set of V -assertions. A PEL KB is a
tuple K = (P, T , A) where P is a PN over V , T is a V -TBox and A is a V -ABox.
     The semantics of this logic is defined using multiple worlds. A contextual in-
terpretation is a pair (I, W) where I is an EL interpretation and W is a valuation
of the variables in V . (I, W) satisfies the axiom hλ : ϕi ((I, W) |= hλ : ϕi), iff
either (i) W 6|= ϕ, or (ii) I |= λ. It is a model of the PEL TBox T (resp. ABox A)
iff it satisfies all the axioms in T (resp. A). A possibilistic interpretation is a pair
P = (J, Pos), where J is a finite set of contextual interpretations and Pos is a
possibility distribution over I. P is a model of the PEL TBox T (resp. ABox A)
if every (I, W) ∈ J is a model of T (resp. A). P is a model of the PN P if for
every valuation W,
                              max Pos(I, W) = PosP (W).
                            (I,W)∈J

P is a model of the PEL KB K = (P, T , A) iff it is a model of T , A, and P.
    Each possibilistic interpretation P = (I, Pos) defines a possibility distribu-
tion PosP over all CQs given by PosP (q) := max(I,W)∈I, I|=q {Pos(I, W)}. The
entailment degree of q w.r.t. the PEL KB K is

                                PosK (q) := inf {PosP (q)}.
                                              P|=K

These possibility distributions are extended to contexts in the obvious way, by
setting PosP (ϕ) := PosP (ϕ) = maxW|=ϕ PosP (W). We can then define the con-
               Table 1. PEL reasoning problems and their complexity

                   Problem            data         KB      network     combined
                 p-entailment           P           P       NP-c          NP-c
                 top-k answer           P           P       ∆p2 -c        ∆p2 -c
           conditional top-k answer     P           P       ∆p2 -c        ∆p2 -c
           k most preferred worlds      P           P      coNP-c        coNP-c



ditional possibilities of a query given a context, and of a context given a query,
using the standard product rule. Formally,

                         PosK (q ∧ ϕ) = PosK (q | ϕ)PosK (ϕ),
                         PosK (q ∧ ϕ) = PosK (ϕ | q)PosK (q),

where                                                                    
                 PosK (q ∧ ϕ) =       inf            max        Pos(I, W)} .
                                  (I,Pos)|=K       I|=q, W|=φ

     We consider three main reasoning problems in this setting; namely, decid-
ing p-entailment, retrieving the top-k answers to a query, and the k most pre-
ferred worlds entailing a given query. We formally define these problems next.
The problem of p-entailment refers to deciding whether PosK (q) ≥ p for some
given p ∈ (0, 1]. The top-k answer problem consists in deciding whether a tuple
(a1 , . . . , ak ) of different answers to q w.r.t. K is such that (i) for all i, 1 ≤ i < k,
PosK (ai ) ≥ PosK (ai+1 ), and (ii) for every other answer a, PosK (ak ) ≥ PosK (a).
This problem can be generalized to consider additional contextual evidence;
that is, verify whether (a1 , . . . , ak ) are the top-k answers to q given the con-
text ϕ. Finally, the k most preferred worlds problem is the problem of deciding
whether a tuple of k valuations of the variables V (W1 , . . . , Wk ) is such that
PosK (Wi | q) ≥ PosK (Wi+1 | q) holds for all i, 1 ≤ i < k, and there exists no
other valuation W such that PosK (W | q) > PosK (Wk | q).
     The complexity of all these problems is summarized in Table 1, where net-
work complexity refers to the complexity considering only the size of the PN as
input, KB complexity considers the size of the ABox and TBox, while combined
complexity considers the whole KB together with the PN and the query as the
size of the input. As it can be seen, all the problems remain tractable w.r.t. data
and KB complexity, but the complexity increases as soon as the PN or the query
is considered part of the input. This corresponds to the behaviour exhibited by
query answering in the classical EL [5]. The full details of these results can be
found in the appendix.
     Although all the complexity bounds are tight, they are all based on perform-
ing black-box query entailment tests on EL KBs. As future work we plan to
adapt specific query answering techniques to produce effective algorithms that
can be used in practice. We will also extend our framework to other kinds of
standard and non-stardard reasoning tasks.
References
1. BenAmor, N., Dubois, D., Gouider, H., Prade, H.: Possibilistic Networks : A New
   Setting for Modeling Preferences. In: Proc. of SUM’14. LNCS, vol. 8720. Springer
   Verlag (2014)
2. Benferhat, S., Dubois, D., Garcia, L., Prade, H.: Possibilistic logic bases and possi-
   bilistic graphs. In: Proc. of UAI’99. Morgan-Kaufmann Publishers (1999)
3. Ceylan, İ.İ., Peñaloza, R.: The Bayesian Description Logic BEL. In: Proc. of IJ-
   CAR’14. LNCS, vol. 8562. Springer Verlag (2014)
4. Darwiche, A.: Modeling and Reasoning with Bayesian Networks. Cambridge Uni-
   versity Press (2009)
5. Rosati, R.: On conjunctive query answering in EL. In: Proc. of DL’07. CEUR Work-
   shop Proceedings, vol. 250. CEUR-WS (2007)