Bayesian Description Logics

İsmail İlkan Ceylan¹* and Rafael Peñaloza¹,²**

¹ Theoretical Computer Science, TU Dresden, Germany
² Center for Advancing Electronics Dresden
{ceylan,penaloza}@tcs.inf.tu-dresden.de

Abstract. We present Bayesian Description Logics (BDLs): an extension of Description Logics (DLs) with contextual probabilities encoded in a Bayesian network (BN). Classical DL reasoning tasks are extended to consider also the contextual and probabilistic information in BDLs. A complexity analysis of these problems shows that, for propositionally closed DLs, this extension comes without cost, while for tractable DLs the complexity is affected by the cost of reasoning in the BN.

1 Introduction

Description logic (DL) ontologies are usually composed of axioms that restrict the class of possible interpretations. As these are hard restrictions, DL ontologies can only encode absolute, immutable knowledge. For some application domains, however, knowledge depends on the situation (or context) in which it is considered. For example, the notion of a luxury hotel in a small rural center will be different from the one in a large cosmopolis. When building an ontology for hotels, it makes sense to contextualize the axioms according to location, and possibly other factors like season, type of weather, etc. Since these contexts refer to notions that are external to the domain of interest, it is not always desirable, or even possible, to encode them directly into the classical DL axioms.

To handle contextual knowledge, we label every axiom with the context in which it holds. For example, ⟨LuxuryHotel ⊑ ∃hasFeature.MeetingRoom : city⟩ states that in the context of a city, every luxury hotel has a meeting room. This axiom imposes no restriction in case the context is not a city: it might still hold, or not, depending on other factors. Labeling the axioms in an ontology allows us to give a more detailed description of the knowledge domain.
Reasoning in these cases can be used to infer knowledge that is guaranteed to hold in any given context. While the knowledge within this context is precise, there might be a level of uncertainty regarding the current context. To model this uncertainty, we attach a probability to each of the possible contexts. Since we cannot assume that the contexts are (probabilistically) independent, we need to describe the joint probability distribution over the space of all contexts.

* Supported by DFG within the Research Training Group "RoSI" (GRK 1907).
** Partially supported by DFG within the Cluster of Excellence 'cfAED'.

Fig. 1: The BN B₀ over V₀ = {x, y, z}. Its DAG has the edges x → y, x → z, and y → z, and the conditional probability tables are:
  P(x) = 0.7
  P(y | x) = 1        P(y | ¬x) = 0.5
  P(z | x, y) = 0.3   P(z | x, ¬y) = 0.1   P(z | ¬x, y) = 0   P(z | ¬x, ¬y) = 0.9

Thus, we consider knowledge bases that are composed of an ontology labeled with contextual information, together with a joint probability distribution over the space of contexts. To represent the probabilistic component of the knowledge base, we use Bayesian networks (BNs) [12], a well-known probabilistic graphical model that allows for a compact representation of the probability distribution, with the help of conditional independence assumptions. For the logical component, we focus on classical DLs, where the consequence relation is monotonic. In previous work we have studied the Bayesian extension of the DL EL [5–7]. Here, we extend those ideas to arbitrary DLs and (monotonic) reasoning tasks.

2 Bayesian Description Logics

We now introduce Bayesian DLs. These logics extend classical DLs to handle uncertainty, expressed through a Bayesian network [12]. Formally, a Bayesian network (BN) is a pair B = (G, Φ), where G = (V, E) is a finite directed acyclic graph (DAG) whose nodes represent Boolean random variables,³ and Φ contains, for every x ∈ V, a conditional probability distribution P_B(x | π(x)) of x given its parents π(x). If V is the set of nodes in G, we say that B is a BN over V.
Intuitively, G = (V, E) encodes a series of conditional independence assumptions between the random variables. More precisely, every variable x ∈ V is conditionally independent of its non-descendants given its parents. Thus, every BN B defines a unique joint probability distribution (JPD) over V given by

  P_B(V) = ∏_{x ∈ V} P_B(x | π(x)).

A very simple BN is shown in Figure 1. Every node of the graph has an associated conditional probability distribution table. For the node x, the probability distribution is not conditional, since x has no parents in B₀; for z, the table describes the conditional probability of z given its two parents x and y. From this network we can derive, e.g., P(x, ¬y, z) = P(z | x, ¬y) · P(¬y | x) · P(x) = 0.

³ In their general form, BNs allow for arbitrary discrete random variables. We restrict w.l.o.g. to Boolean variables for ease of presentation.

We abstract from the specific DL used, and consider an arbitrary ontology language L that defines possibly infinite classes (i) A of well-formed axioms; (ii) C of consequences; and (iii) O of ontologies, such that every element of O is a finite subset of A, and if O ∈ O, then every subset of O is also an ontology in O. The semantics of this language is given by a class I of interpretations and an entailment relation ⊨ ⊆ I × (A ∪ C) that expresses which interpretations satisfy which axioms and consequences. We say that an interpretation I is a model of the ontology O, denoted by I ⊨ O, if I ⊨ α for all α ∈ O.⁴ Finally, we say that O entails the consequence c, denoted by O ⊨ c, if every model of O satisfies c. Notice that the entailment relation is monotonic; i.e., if O ⊨ c and O ⊆ O′ ∈ O, then O′ ⊨ c. Any DL with acyclic TBoxes, or with general ontologies, is an ontology language of this kind; consequences in these languages are, e.g., concept unsatisfiability, concept subsumption, or ontology inconsistency.
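As a sanity check on the chain rule, the following minimal Python sketch (ours; the dictionary encoding of the CPTs is an assumption, not part of the formalism) evaluates the JPD of B₀ from Figure 1:

```python
# CPTs of B0 (Fig. 1), stored as plain dictionaries.
P_x = {True: 0.7, False: 0.3}
P_y = {True: 1.0, False: 0.5}            # P(y | x), keyed by the value of x
P_z = {(True, True): 0.3, (True, False): 0.1,
       (False, True): 0.0, (False, False): 0.9}

def joint(x: bool, y: bool, z: bool) -> float:
    """Chain rule: P(x, y, z) = P(z | x, y) * P(y | x) * P(x)."""
    px = P_x[x]
    py = P_y[x] if y else 1.0 - P_y[x]
    pz = P_z[(x, y)] if z else 1.0 - P_z[(x, y)]
    return pz * py * px

print(joint(True, False, True))   # P(x, ¬y, z) = 0.1 * 0.0 * 0.7 = 0.0
```

The eight values of `joint` sum to 1, as the factorization guarantees.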
For the rest of this paper, we consider L to be an arbitrary but fixed ontology language, with axioms A, ontologies O, consequences C, and interpretations I. As an example scenario we use subsumption in EL with general TBoxes. In the Bayesian ontology language BL, ontologies are generalized by annotating every axiom with a context, which is defined as a set of literals over the variables of a BN.

Definition 1 (KB). Let V be a finite set of Boolean variables. A V-literal is either x or ¬x, where x ∈ V; a V-context is a consistent set of V-literals. A V-axiom is of the form ⟨α : κ⟩ where α ∈ A and κ is a V-context. A V-ontology is a finite set O of V-axioms such that {α | ⟨α : κ⟩ ∈ O} ∈ O. A V-consequence is a pair ⟨c : κ⟩, where c ∈ C and κ is a V-context. A BL knowledge base (KB) over V is a pair K = (B, O) where B is a BN over V and O is a V-ontology.⁵

Intuitively, a V-axiom is only guaranteed to hold when the context κ is satisfied. The semantics of this logic is defined by extending interpretations to evaluate the random variables from the BN.

Definition 2 (interpretation). Given a finite set of Boolean variables V, a V-interpretation is a pair I = (I, V^I) where I ∈ I and V^I : V → {0, 1} is a valuation of the variables in V.

If there is no danger of ambiguity, we will usually omit the prefix V. The valuation V^I is extended to contexts by defining V^I(¬x) = 1 − V^I(x) for all x ∈ V, and, for every context κ,

  V^I(κ) = min_{ℓ ∈ κ} V^I(ℓ),

where V^I(∅) := 1. The V-interpretation I = (I, V^I) is a model of the V-axiom ⟨α : κ⟩ or the V-consequence ⟨c : κ⟩, denoted as I ⊨ ⟨α : κ⟩ (respectively, I ⊨ ⟨c : κ⟩), iff V^I(κ) = 0 or I ⊨ α (resp., I ⊨ c). It is a model of the V-ontology O iff it is a model of all the V-axioms in O. The ontology O entails ⟨c : κ⟩ if every model of O is a model of ⟨c : κ⟩.

⁴ We use the infix notation for the relation ⊨.
⁵ Unless stated otherwise, we assume that K is over V in the rest of the paper.

The idea is that the
restriction α is only required to hold whenever the context κ is satisfied. Thus, any interpretation that violates the context trivially satisfies the whole V-axiom.

Example 3. Let V₀ = {x, y, z}, and consider the BEL V₀-ontology

  O₀ := { ⟨A ⊑ C : {x, y}⟩, ⟨A ⊑ B : {¬x}⟩, ⟨B ⊑ C : {¬x}⟩ }.

The interpretation I₀ = (I₀, V₀), where V₀({x, ¬y, z}) = 1 and I₀ = ({d}, ·^{I₀}) with A^{I₀} = {d} and B^{I₀} = C^{I₀} = ∅, is a model of O₀, but not of ⟨A ⊑ B : {x}⟩.

BL generalizes L: an ontology from L is a special case of a BL V-ontology in which all V-axioms are associated with the empty context; i.e., are of the form ⟨α : ∅⟩. Since every valuation satisfies the empty context ∅, a V-interpretation I = (I, V^I) satisfies the V-axiom ⟨α : ∅⟩ iff I ⊨ α, and analogously for consequences ⟨c : ∅⟩. For brevity, we use O ⊨ c to denote that O entails ⟨c : ∅⟩. For a valuation W of the variables in V, we define the V-ontology containing the V-axioms that must be satisfied in any V-interpretation I = (I, V^I) with V^I = W.

Definition 4 (restriction). Let K = (B, O) be a KB. The restriction of O to a valuation W of the variables in V is the V-ontology

  O_W := { ⟨α : ∅⟩ | ⟨α : κ⟩ ∈ O, W(κ) = 1 }.

So far, our semantics has focused on the evaluation of the Boolean variables and the interpretation of concepts, ignoring the probabilistic information provided by the BN. To handle these probabilities, we next introduce multiple-world semantics. Intuitively, a V-interpretation describes a possible world; by assigning a probability distribution over these interpretations, we describe the required probabilities, which should be consistent with the BN.

Definition 5 (probabilistic model). A probabilistic interpretation is a pair P = (I, P_I), where I is a set of V-interpretations and P_I is a probability distribution over I such that P_I(I) > 0 only for finitely many interpretations I ∈ I.
It is a model of the V-ontology O (resp., of the V-consequence ⟨c : κ⟩) if every I ∈ I is a model of O (resp., of ⟨c : κ⟩). P is consistent with the BN B if for every valuation W of the variables in V it holds that

  Σ_{I ∈ I, V^I = W} P_I(I) = P_B(W).

The probabilistic interpretation P is a model of the KB (B, O) iff it is a (probabilistic) model of O and consistent with B.

As a consequence of this semantics, probabilistic models preserve the probability distribution of B for sets of literals, i.e., contexts. The proof follows from the fact that a context describes a partial valuation; hence, the probability of a context κ is the sum of the probabilities of all valuations that extend κ. Formally, let K = (B, O) be a KB, and κ a context. For every model P of K it holds that

  Σ_{I ∈ I, V^I(κ) = 1} P_I(I) = P_B(κ).

It is sometimes useful to consider a special kind of probabilistic interpretations, which we call pithy. These interpretations contain at most one V-interpretation for each valuation of the variables in V. Each of these V-interpretations provides the essential information associated to the corresponding valuation.

Definition 6 (pithy). The probabilistic interpretation P = (I, P_I) is called pithy if for every valuation W of the variables in V there exists at most one V-interpretation I = (I, V^I) ∈ I such that V^I = W.

We now study the main reasoning problems in BL, and their complexity.

3 Reasoning in BL

We have shown how to represent probabilistic knowledge using a BL KB. We now focus on reasoning with this knowledge. Recall that in the classical case an ontology O entails a consequence c iff every model of O is also a model of c. We generalize this notion to consider contexts and the probabilities from the BN.

Definition 7 (entailment). Let ⟨c : κ⟩ be a V-consequence and K a BL KB. K entails ⟨c : κ⟩, denoted as K ⊨ ⟨c : κ⟩, if every probabilistic model of K is a model of ⟨c : κ⟩.
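The context probability P_B(κ) used above — the sum of P_B(W) over all valuations W extending κ — can be computed by brute-force enumeration. A toy sketch (ours), with contexts encoded as sets of (variable, polarity) pairs and the chain rule of B₀ from Figure 1 hard-coded:

```python
from itertools import product

def joint(W):
    """Chain rule for B0 of Fig. 1; W maps each of x, y, z to a Boolean."""
    px = 0.7 if W["x"] else 0.3
    py_true = 1.0 if W["x"] else 0.5
    py = py_true if W["y"] else 1.0 - py_true
    pz_true = {(True, True): 0.3, (True, False): 0.1,
               (False, True): 0.0, (False, False): 0.9}[(W["x"], W["y"])]
    pz = pz_true if W["z"] else 1.0 - pz_true
    return px * py * pz

def prob_context(kappa):
    """P_B(kappa): sum of P_B(W) over all valuations W with W(kappa) = 1."""
    total = 0.0
    for bits in product([True, False], repeat=3):
        W = dict(zip(["x", "y", "z"], bits))
        if all(W[v] == pol for (v, pol) in kappa):
            total += joint(W)
    return total

print(prob_context({("x", True), ("y", True)}))   # ≈ 0.7 = P(x) · P(y | x)
```

The empty context gets probability 1, and P_B({x, ¬y}) = 0, matching the tables of B₀.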
For a probabilistic interpretation P = (I, P_I), the probability of ⟨c : κ⟩ is

  P_P(⟨c : κ⟩) := Σ_{I ∈ I, I ⊨ ⟨c : κ⟩} P_I(I).

The probability of ⟨c : κ⟩ w.r.t. K is

  P_K(⟨c : κ⟩) := inf_{P ⊨ K} P_P(⟨c : κ⟩).

We say that c is a positive entailment in κ if P_K(⟨c : κ⟩) > 0, and a p-entailment in κ, for p ∈ (0, 1], if P_K(⟨c : κ⟩) ≥ p. 1-entailments are also called almost-sure. Clearly, if K ⊨ ⟨c : κ⟩, then P_K(⟨c : κ⟩) = 1. The converse may not hold, since the entailment relation might be violated in V-interpretations of probability zero.

Example 8. Consider the KB K₀ = (B₀, O₀), where B₀ is the BN depicted in Figure 1 and O₀ the ontology from Example 3. Then, P_{K₀}(⟨A ⊑ C : ∅⟩) = 1 and P_{K₀}(⟨C ⊑ B : {x, y}⟩) = 1 − P_{B₀}({x, y}) = 0.3. Moreover, for any two concepts E, F, it holds that P_{K₀}(⟨E ⊑ F : {x, ¬y}⟩) = 1, since ⟨E ⊑ F : {x, ¬y}⟩ can only be violated in V-interpretations with probability 0. But, in general, K₀ ⊭ ⟨E ⊑ F : {x, ¬y}⟩.

3.1 Probabilistic Entailments

We first consider the problem of computing the probability of an entailment, or deciding the properties of this probability. As an intermediate step, we show that it is possible w.l.o.g. to restrict reasoning to pithy models.

Lemma 9. Let K be a KB. If P is a probabilistic model of K, then there exists a pithy model Q of K such that for every consequence ⟨c : κ⟩ it holds that P_Q(⟨c : κ⟩) ≤ P_P(⟨c : κ⟩).

Proof (Sketch). Let W be a valuation and I, I′ ∈ I two V-interpretations such that V^I = V^{I′} = W. Construct a new interpretation J as the disjoint union of I and I′. The probabilistic interpretation (H, P_H) with H = (I ∪ {J}) \ {I, I′} and

  P_H(H) := P_I(H) if H ≠ J,   P_H(J) := P_I(I) + P_I(I′)

is a model of K. Moreover, J ⊨ ⟨α : κ⟩ iff both I ⊨ ⟨α : κ⟩ and I′ ⊨ ⟨α : κ⟩. □

As we show next, the probability of a consequence can be computed by reasoning over the restrictions O_W of the V-ontology O.

Theorem 10. Let K = (B, O) be a KB and ⟨c : κ⟩ a consequence. Then

  P_K(⟨c : κ⟩) = 1 − P_B(κ) + Σ_{O_W ⊨ c, W(κ) = 1} P_B(W).

Proof.
For every valuation W, construct the V-interpretation I_W as follows. If O_W ⊨ c, then I_W is any model (I, W) of O_W; otherwise, I_W is any model (I, W) of O_W that does not satisfy ⟨c : κ⟩, which must exist by definition. The probabilistic interpretation P_K = (I, P_I) with I = {I_W | W a valuation of V} and P_I(I_W) = P_B(W) for all W is a model of K, and

  P_{P_K}(⟨c : κ⟩) = Σ_{I_W ⊨ ⟨c : κ⟩} P_I(I_W)
                   = Σ_{W(κ) = 0} P_I(I_W) + Σ_{W(κ) = 1, I_W ⊨ ⟨c : κ⟩} P_I(I_W)
                   = 1 − P_B(κ) + Σ_{O_W ⊨ c, W(κ) = 1} P_B(W).

Thus, P_K(⟨c : κ⟩) ≤ 1 − P_B(κ) + Σ_{O_W ⊨ c, W(κ) = 1} P_B(W). Suppose now that the inequality is strict. Then there exists a probabilistic model P = (J, P_J) of K such that P_P(⟨c : κ⟩) < P_{P_K}(⟨c : κ⟩). By Lemma 9, we can assume w.l.o.g. that P is pithy, and hence for every valuation W with P_B(W) > 0 there exists exactly one J_W ∈ J with V^{J_W} = W. We thus have

  Σ_{J_W ⊨ ⟨c : κ⟩, W(κ) = 1} P_J(J_W) < Σ_{I_W ⊨ ⟨c : κ⟩, W(κ) = 1} P_I(I_W).

Since P_I(I_W) = P_J(J_W) for all W, there exists a valuation W′ with I_{W′} ⊨ ⟨c : κ⟩ but J_{W′} ⊭ ⟨c : κ⟩. Since J_{W′} is a model of O_{W′}, it follows that O_{W′} ⊭ c. By construction, then, I_{W′} ⊭ ⟨c : κ⟩, which is a contradiction. □

Based on this theorem, we can compute the probability of an entailment as described in Algorithm 1. The algorithm simply verifies, for every valuation W, whether O_W entails c. Clearly, the for loop is executed 2^|V| times; once for each valuation of the variables in V.

Algorithm 1 Probability of an entailment
Input: KB K = (B, O), consequence ⟨c : κ⟩
Output: P_K(⟨c : κ⟩)
 1: P ← 0, Q ← 0
 2: for all valuations W do
 3:   if W(κ) = 0 then
 4:     Q ← Q + P_B(W)
 5:   else if O_W ⊨ c then
 6:     P ← P + P_B(W)
 7: return Q + P

Each of these executions needs to compute the probability P_B(W) and, possibly, decide whether O_W ⊨ c.
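Algorithm 1 can be prototyped directly. In the sketch below (ours), `entails` is only a stand-in that closes atomic subsumptions under reflexivity and transitivity — enough for the toy ontology O₀ of Example 3, but not a DL reasoner — and `joint_B0` hard-codes the chain rule for B₀ of Figure 1:

```python
from itertools import product

def satisfies(W, kappa):                     # W(kappa) = 1 iff all literals hold
    return all(W[v] == pol for (v, pol) in kappa)

def restriction(O, W):                       # O_W from Definition 4
    return {ax for (ax, ctx) in O if satisfies(W, ctx)}

def entails(ontology, goal):                 # stand-in reasoner: reflexive-
    subs = set(ontology)                     # transitive closure of A ⊑ B pairs
    changed = True
    while changed:
        changed = False
        for (a, b) in list(subs):
            for (c, d) in subs.copy():
                if b == c and (a, d) not in subs:
                    subs.add((a, d)); changed = True
    return goal[0] == goal[1] or goal in subs

def prob_entailment(variables, O, c, kappa, prob):
    P = Q = 0.0
    for bits in product([True, False], repeat=len(variables)):
        W = dict(zip(variables, bits))
        if not satisfies(W, kappa):
            Q += prob(W)                     # mass of valuations outside kappa
        elif entails(restriction(O, W), c):
            P += prob(W)                     # valuations where O_W |= c
    return Q + P                             # = 1 - P_B(kappa) + sum P_B(W)

def joint_B0(W):                             # chain rule for B0 of Fig. 1
    px = 0.7 if W["x"] else 0.3
    py = 1.0 if W["x"] else 0.5
    py = py if W["y"] else 1.0 - py
    pz = {(True, True): 0.3, (True, False): 0.1,
          (False, True): 0.0, (False, False): 0.9}[(W["x"], W["y"])]
    pz = pz if W["z"] else 1.0 - pz
    return px * py * pz

O0 = {(("A", "C"), frozenset({("x", True), ("y", True)})),
      (("A", "B"), frozenset({("x", False)})),
      (("B", "C"), frozenset({("x", False)}))}
print(prob_entailment(["x", "y", "z"], O0, ("A", "C"), frozenset(), joint_B0))  # ≈ 1.0
```

With O₀ and B₀ this reproduces P_{K₀}(⟨A ⊑ C : ∅⟩) = 1 from Example 8.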
The former can be done in polynomial time in the size of B, using the standard chain rule [12], while the complexity of the latter depends on the complexity of deciding entailments in the language L, which we denote by k. Notice that the different valuations can be enumerated using only polynomial space, and the algorithm handles them independently. This yields the following result.

Theorem 11. Deciding p-entailment in BL is in PSpace^k.

As a lower bound, unsurprisingly, p-entailment is at least as hard as deciding entailment in L, and as deciding probabilities from the BN. Since the latter problem is hard for the class PP [19], we get the following result.

Theorem 12. Deciding p-entailment is both PP-hard and k-hard.

From these two theorems it follows that if the complexity of deciding entailments in L is at least PSpace, then p-entailment in BL is in the same complexity class. This means that any DL at least as expressive as ALC (without a TBox) can be extended with our probabilistic semantics without affecting the complexity of reasoning. For tractable DLs like the EL and DL-Lite families, the complexity increases to PP. If we restrict to deciding positive or almost-sure entailments, then these bounds can be lowered.

Theorem 13. The complexity of deciding positive entailment is between NP and NP^k; deciding almost-sure entailments is between coNP and coNP^k.

Proof. To decide positive entailment, we can simply guess a valuation W and check in polynomial time, with a k-oracle, that (i) P_B(W) > 0 and (ii) either W(κ) = 0 or O_W ⊨ c. The correctness of this algorithm is given by Theorem 10; thus the problem is in NP^k. To show hardness, we recall that deciding, given a BN B and a variable x ∈ V, whether P_B(x) > 0 is NP-hard [9]. Consider the KB K = (B, ∅) and c ∈ C. It follows from Theorem 10 that P_B(x) > 0 iff P_K(⟨c : {¬x}⟩) > 0. Thus positive entailment is NP-hard. The bounds for almost-sure entailments can be shown analogously.
□

This theorem implies that if k contains Δ^p₂, then positive and almost-sure reasoning in BL is in the same complexity class as reasoning in L. Moreover, if k is contained in PTime, then these problems are NP-complete and coNP-complete, respectively. It can also easily be seen from Algorithm 1 that if k is in PTime, then all these probabilistic reasoning problems are fixed-parameter tractable with |V| as the parameter.⁶

Algorithm 1 shows that the probabilistic and the logical components of the KB can be decoupled while reasoning. This is an encouraging result, as it means that one can apply the optimized methods developed for BN inference and for DL reasoning directly in BL without major modifications.

3.2 Contextual Entailments

We now turn our attention to deciding whether a consequence follows from all models of the KB in a classical sense; that is, whether K ⊨ ⟨c : κ⟩ holds. This problem is intractable already for the sublogic of BEL that allows only the conjunction constructor, even if restricted to the empty context.

Theorem 14. Let K be a BEL KB and C, D two concepts. Deciding whether K ⊨ ⟨C ⊑ D : ∅⟩ is coNP-hard.

Proof. We present a reduction from validity of DNF formulas, which is known to be coNP-hard [8]. Let φ = σ₁ ∨ … ∨ σₙ be a DNF formula where each σᵢ is a conjunctive clause, and let V be the set of all variables appearing in φ. For each variable x ∈ V, we introduce the concept names B_x and B_{¬x} and define the TBox

  O_x := { ⟨A ⊑ B_x : {x}⟩, ⟨A ⊑ B_{¬x} : {¬x}⟩ }.

For every conjunctive clause σ = ℓ₁ ∧ … ∧ ℓ_m, define the TBox

  O_σ := { ⟨B_{ℓ₁} ⊓ … ⊓ B_{ℓ_m} ⊑ C : ∅⟩ }.

Let now K = (B, O), where B is an arbitrary BN over V and O = ⋃_{x ∈ V} O_x ∪ ⋃_{1 ≤ i ≤ n} O_{σᵢ}. It is easy to see that φ is valid iff K ⊨ ⟨A ⊑ C : ∅⟩. □

The main reason for this hardness is that the interaction of contexts might produce consequences that are not obvious at first sight.
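The construction in this proof is easy to run mechanically. The sketch below uses our own encoding (literals as (variable, polarity) pairs; a conjunction B_{ℓ₁} ⊓ … ⊓ B_{ℓ_m} as a sorted tuple of concept names) and only builds the labeled TBox; deciding validity still requires the reasoning step of Theorem 14:

```python
def reduction_tbox(dnf):
    """dnf: list of conjunctive clauses, each a list of (var, polarity) literals.
    Returns the V-ontology of Theorem 14: the union of all O_x and O_sigma_i."""
    variables = {v for clause in dnf for (v, _) in clause}
    O = set()
    for v in variables:
        # O_x = { <A ⊑ B_x : {x}>, <A ⊑ B_¬x : {¬x}> }
        O.add((("A", f"B_{v}"), frozenset({(v, True)})))
        O.add((("A", f"B_not_{v}"), frozenset({(v, False)})))
    for clause in dnf:
        # O_sigma = { <B_l1 ⊓ ... ⊓ B_lm ⊑ C : ∅> }, conjunction as sorted tuple
        lhs = tuple(sorted(f"B_{v}" if pol else f"B_not_{v}" for (v, pol) in clause))
        O.add(((lhs, "C"), frozenset()))
    return O

# phi = (x ∧ ¬y) ∨ y over V = {x, y}: 4 variable axioms + 2 clause axioms
phi = [[("x", True), ("y", False)], [("y", True)]]
print(len(reduction_tbox(phi)))   # 6
```

Note that the size of the constructed TBox is linear in the size of φ, as the polynomial reduction requires.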
For instance, a consequence might follow in context κ not because the axioms in O_κ entail the consequence, but rather because every valuation satisfying κ yields it. That is the main idea in the proof of Theorem 14: the axioms that hold directly in the empty context never entail the axiom A ⊑ C, but if φ is valid, then this axiom follows from all valuations. We obtain the following result.

Lemma 15. Let K = (B, O) be a KB. Then K ⊨ ⟨c : κ⟩ iff for every valuation W with W(κ) = 1 it holds that O_W ⊨ c.

It thus suffices to identify all valuations that define V-ontologies entailing the consequence. To do this, we take advantage of techniques developed in the areas of axiom pinpointing [4], access control [1], and context-based reasoning [2].

⁶ A problem is fixed-parameter tractable if it can be solved in polynomial time, assuming that the parameter is fixed [13].

Notice that a consequence depends only on the V-ontology and not on the BN. For that reason, for the rest of this section we focus only on this part of the KB. We can see every context κ as the conjunctive clause χ_κ := ⋀_{ℓ ∈ κ} ℓ. In this view, the V-ontology O is a set of labeled axioms over the distributive lattice B of all Boolean formulas over the variables V, modulo equivalence. Each formula φ in this lattice defines a sub-ontology O_φ containing all axioms ⟨α : κ⟩ ∈ O with χ_κ ⊨ φ. Using the terminology of [2], we are interested in finding a boundary for a consequence. Given a V-ontology O labeled over the lattice B and a consequence c, a boundary for c w.r.t. O is an element φ ∈ B such that for every join-prime element ψ ∈ B it holds that ψ ⊨ φ iff O_ψ ⊨ c (see [2] for further details). Notice that the join-prime elements of B are exactly the valuations of the variables in V. Using Lemma 15, we obtain the following result.

Theorem 16. Let φ be a boundary for c w.r.t. O in B. Then, for every context κ, we have that K ⊨ ⟨c : κ⟩ iff χ_κ ⊨ φ.
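Before turning to boundaries, note that Lemma 15 already yields a naive decision procedure, exponential in |V|: enumerate the valuations extending κ and call a reasoner on each restriction. A sketch (ours), again with a stand-in transitive-closure "reasoner" for atomic subsumptions rather than a general DL reasoner:

```python
from itertools import product

def entails(ontology, goal):                 # stand-in: reflexive-transitive
    subs = set(ontology)                     # closure of atomic A ⊑ B pairs
    changed = True
    while changed:
        changed = False
        for (a, b) in list(subs):
            for (c, d) in subs.copy():
                if b == c and (a, d) not in subs:
                    subs.add((a, d)); changed = True
    return goal[0] == goal[1] or goal in subs

def contextual_entailment(variables, O, c, kappa):
    """K |= <c : kappa> iff O_W |= c for every valuation W with W(kappa) = 1
    (Lemma 15); the BN plays no role in this decision."""
    for bits in product([True, False], repeat=len(variables)):
        W = dict(zip(variables, bits))
        if all(W[v] == pol for (v, pol) in kappa):            # W(kappa) = 1
            O_W = {ax for (ax, ctx) in O
                   if all(W[v] == pol for (v, pol) in ctx)}   # restriction O_W
            if not entails(O_W, c):
                return False
    return True

O0 = {(("A", "C"), frozenset({("x", True), ("y", True)})),
      (("A", "B"), frozenset({("x", False)})),
      (("B", "C"), frozenset({("x", False)}))}
print(contextual_entailment(["x", "y", "z"], O0, ("A", "C"), frozenset({("x", False)})))  # True
print(contextual_entailment(["x", "y", "z"], O0, ("A", "C"), frozenset()))                # False
```

The second call illustrates Example 8: ⟨A ⊑ C : ∅⟩ has probability 1 w.r.t. K₀, yet it is not entailed classically, because the valuation with x and ¬y yields the empty restriction.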
Several methods have been developed for computing boundaries, either through a black-box approach that makes several calls to an external reasoner [1, 2, 15], or through a glass-box approach that modifies a pre-existing reasoner [3, 4, 14]. In general, it is easy to see that a boundary can be computed by performing exponentially many entailment tests. Once a boundary φ for c w.r.t. O has been computed, contextual entailment is decided by verifying whether χ_κ ⊨ φ. This decision is in NP in |V|. Thus, overall we have the following.

Corollary 17. Contextual entailment in BL can be decided in ExpTime^k.

Clearly, a boundary for c provides more information than necessary for deciding whether the axiom holds in a given context κ: it encodes all contexts that entail the desired axiom. We can use this knowledge to deduce the most likely context.

3.3 Most Likely Context

The problem of finding the most likely context for a consequence can be seen as the dual of computing the probability of this consequence. Intuitively, we are interested in finding the most likely explanation for an event: assuming that a consequence holds, we look for an explanation for it, in the form of a context, that has the maximal probability of occurring.

Definition 18 (most likely context). Given a KB K = (B, O) and c ∈ C, the context κ is called a most likely context for c w.r.t. K if (i) K ⊨ ⟨c : κ⟩, and (ii) for every context κ′, if K ⊨ ⟨c : κ′⟩, then P_B(κ′) ≤ P_B(κ).

Notice that we are not interested in maximizing P_K(⟨c : κ⟩) but rather P_B(κ). Indeed, these two problems can be seen as dual, since P_K(⟨c : κ⟩) depends inversely, but not exclusively, on P_B(κ) (see Theorem 10).

Algorithm 2 computes the set of all most likely contexts for c w.r.t. K, together with their probability.

Algorithm 2 Compute all most likely contexts
Input: KB K = (B, O), consequence c
Output: The set Λ of most likely contexts for c w.r.t. K and probability p ∈ [0, 1]
 1: Λ ← ∅, p ← 0
 2: φ ← boundary(c, O)    ▷ compute a boundary for c w.r.t. O
 3: for all contexts κ do
 4:   if χ_κ ⊨ φ then
 5:     if P_B(κ) > p then
 6:       Λ ← {κ}
 7:       p ← P_B(κ)
 8:     else if P_B(κ) = p then
 9:       Λ ← Λ ∪ {κ}
10: return Λ, p

The algorithm maintains a value p of the highest known probability for a context, and a set Λ of all the contexts that have probability p. It first computes a boundary for the consequence, which is used to test, for every context κ, whether K ⊨ ⟨c : κ⟩. If so, it compares P_B(κ) with p. If the former is larger, then the highest probability is updated to this value, and the set Λ is restarted to contain only κ. If they are equal, then κ is added to the set of most likely contexts. The number of contexts is exponential in |V|, and for each of them we have to test propositional entailment, which is also exponential in |V|. The most expensive step is the computation of the boundary, whose complexity depends on the complexity of reasoning.

Theorem 19. Algorithm 2 computes all most likely contexts for c w.r.t. K in exponential time, with a k-oracle.

In general, it is not possible to lower this exponential upper bound, since a simple consequence may have exponentially many most likely contexts. However, Algorithm 2 can be adapted to compute one most likely context more efficiently. The main idea is to order the iterations of the for loop by decreasing probability. Once a context κ with χ_κ ⊨ φ has been found, it is guaranteed to be a most likely context and the algorithm may stop. This approach would still require exponential time in the worst case, even with a k-oracle, since it requires the computation of the boundary.

4 Related Work

The amount of work on handling uncertain knowledge with description logics is too vast to cover in detail here. Many probabilistic description logics have been defined; they differ not only in their syntax but also in their use of the probabilities and their application. These logics were recently surveyed in [17].
We discuss here only the logics most closely related to ours. One of the first attempts to combine BNs and DLs was P-Classic [16], which extended Classic through probability distributions over the interpretation domain. The more recent PR-OWL [10] uses multi-entity BNs to describe the probability distributions of some domain elements. In both cases, the probabilistic component is interpreted as providing individuals with a probability distribution; this differs greatly from our multiple-world semantics, in which we consider a probability distribution over a set of classical DL interpretations.

Perhaps closest to our approach are the Bayesian extension of DL-Lite [11] and DISPONTE [18]. The latter allows for so-called epistemic probabilities that express the uncertainty associated to a given axiom. Its semantics is based, as ours, on a probability distribution over a set of interpretations. The main difference from our approach is that in [18] the authors assume that all probabilities are independent, while we allow a joint probability distribution through the BN. Another minor difference is that DISPONTE, unlike our formalism, cannot express classical (certain) consequences. The logic described in [11] looks very similar to our framework. There is, however, a subtle but important difference: in our approach, an interpretation I satisfies an axiom ⟨α : κ⟩ if V^I(κ) = 1 implies I ⊨ α. In [11], the authors employ a closed-world assumption over the contexts, replacing this implication by an equivalence; i.e., V^I(κ) = 0 also implies I ⊭ α.

5 Conclusions

We have introduced probabilistic extensions of DLs. Our basic assumption is that we have certain knowledge, which depends on an uncertain situation, or context. In practical terms, this means that every axiom is associated to a context with the intended meaning that, if the context holds, then the axiom must be true.
Uncertainty is represented through a BN that encodes the probability distribution of the contexts. The advantage of using Bayesian networks lies in their capacity to describe conditional independence assumptions compactly.

We have studied the complexity of reasoning in this probabilistic logic. We have shown that for expressive DLs, the complexity of reasoning is not affected by the introduction of the contextual and probabilistic components. For tractable DLs the complexity of reasoning increases, which was expected, since handling BNs is intractable in general. Moreover, we have shown that reasoning can be decoupled between the probabilistic and the logical components.

There are several directions for future work. One of them is to tighten the complexity bounds for inexpressive DLs, in particular for the DL-Lite family, as was done for EL [6]. We also plan to develop more goal-directed algorithms for reasoning in these logics. We will also study the possibility of using graphical models for other kinds of reasoning, like preferences, and of considering different restrictions on the underlying graph. Finally, we will consider problems that tighten the relationship between the probabilistic and the logical components. One such problem would be to update the BN according to evidence attached to the ontology.

References

1. Baader, F., Knechtel, M., Peñaloza, R.: A generic approach for large-scale ontological reasoning in the presence of access restrictions to the ontology's axioms. In: Proc. 8th International Semantic Web Conference (ISWC 2009). LNCS, vol. 5823, pp. 49–64. Springer (2009)
2. Baader, F., Knechtel, M., Peñaloza, R.: Context-dependent views to axioms and consequences of semantic web ontologies. J. of Web Semantics 12–13, 22–40 (2012)
3. Baader, F., Peñaloza, R.: Automata-based axiom pinpointing. J. of Automated Reasoning 45(2), 91–129 (2010)
4. Baader, F., Peñaloza, R.: Axiom pinpointing in general tableaux. J.
of Logic and Computation 20(1), 5–34 (2010)
5. Ceylan, İ.İ.: Context-Sensitive Bayesian Description Logics. Master's thesis, Dresden University of Technology, Germany (2013)
6. Ceylan, İ.İ., Peñaloza, R.: The Bayesian Description Logic BEL. In: Proc. 7th International Joint Conference on Automated Reasoning (IJCAR 2014) (2014), to appear
7. Ceylan, İ.İ., Peñaloza, R.: Reasoning in the Description Logic BEL using Bayesian Networks. In: Proc. 4th International Workshop on Statistical Relational AI (StarAI 2014) (2014), to appear
8. Cook, S.A.: The complexity of theorem-proving procedures. In: Proc. Third Annual ACM Symposium on Theory of Computing. pp. 151–158. STOC '71, ACM, New York, NY, USA (1971), http://doi.acm.org/10.1145/800157.805047
9. Cooper, G.F.: The computational complexity of probabilistic inference using Bayesian belief networks (research note). Artif. Intell. 42(2–3), 393–405 (1990)
10. da Costa, P.C.G., Laskey, K.B., Laskey, K.J.: PR-OWL: A Bayesian ontology language for the semantic web. In: Uncertainty Reasoning for the Semantic Web I, URSW 2005–2007. LNCS, vol. 5327, pp. 88–107. Springer (2008)
11. d'Amato, C., Fanizzi, N., Lukasiewicz, T.: Tractable reasoning with Bayesian description logics. In: Proc. Second International Conference on Scalable Uncertainty Management (SUM 2008). LNCS, vol. 5291, pp. 146–159. Springer (2008)
12. Darwiche, A.: Modeling and Reasoning with Bayesian Networks. Cambridge University Press (2009)
13. Downey, R.G., Fellows, M.: Parameterized Complexity. Monographs in Computer Science, Springer (1999)
14. Fu, W., Peñaloza, R.: Adding context to tableaux for DLs. In: Kazakov, Y., Lembo, D., Wolter, F. (eds.) Proc. 2012 International Workshop on Description Logics (DL'12). CEUR-WS, vol. 846. Rome, Italy (2012)
15. Kalyanpur, A., Parsia, B., Horridge, M., Sirin, E.: Finding all justifications of OWL DL entailments. In: Proc. 6th International Semantic Web Conference (ISWC 2007).
Lecture Notes in Computer Science, vol. 4825, pp. 267–280. Springer (2007)
16. Koller, D., Levy, A.Y., Pfeffer, A.: P-Classic: A tractable probabilistic description logic. In: Proc. 14th National Conference on Artificial Intelligence (AAAI-97). pp. 390–397. AAAI Press (1997)
17. Lukasiewicz, T., Straccia, U.: Managing uncertainty and vagueness in description logics for the semantic web. J. of Web Semantics 6(4), 291–308 (2008)
18. Riguzzi, F., Bellodi, E., Lamma, E., Zese, R.: Epistemic and statistical probabilistic ontologies. In: Proc. 8th Int. Workshop on Uncertainty Reasoning for the Semantic Web (URSW-12). vol. 900, pp. 3–14. CEUR-WS (2012)
19. Roth, D.: On the hardness of approximate reasoning. Artif. Intell. 82(1–2), 273–302 (1996)