=Paper= {{Paper |id=Vol-477/paper-27 |storemode=property |title=Reasoning About Typicality in ALC and EL |pdfUrl=https://ceur-ws.org/Vol-477/paper_15.pdf |volume=Vol-477 |dblpUrl=https://dblp.org/rec/conf/dlog/GiordanoGOP09 }} ==Reasoning About Typicality in ALC and EL== https://ceur-ws.org/Vol-477/paper_15.pdf
     Reasoning About Typicality in ALC and EL

Laura Giordano1, Valentina Gliozzi2 , Nicola Olivetti3 , and Gian Luca Pozzato2
1
    Dipartimento di Informatica - Università del Piemonte Orientale “A. Avogadro” -
                                laura@mfn.unipmn.it
           2
             Dipartimento di Informatica - Università edgli Studi di Torino -
                          {gliozzi,pozzato}@di.unito.it
       3
         LSIS-UMR CNRS 6168 Universitè “Paul Cézanne” - Aix-Marseille 3 -
                         nicola.olivetti@univ-cezanne.fr

        Abstract. In this work we summarize our recent results on extending
        Description Logics for reasoning about prototypical properties and inher-
        itance with exceptions. First, we focus our attention on the logic ALC.
        We present a nonmonotonic logic ALC + Tmin , which is built upon a
        monotonic logic ALC + T obtained by adding a typicality operator T to
        ALC. The operator T is intended to select the “most normal” or “most
        typical” instances of a concept, so that knowledge bases may contain
        subsumption relations of the form“T(C) is subsumed by P ”, express-
        ing that typical C-members have the property P . In order to perform
        nonmonotonic inferences, we define a “minimal model” semantics: the
        intuition is that preferred, or minimal models are those that maximise
        typical instances of concepts. By means of ALC +Tmin we are able to in-
        fer defeasible properties of (explicit or implicit) individuals. We also show
        that the satisfiability of an ALC + T-knowledge base is in EXPTIME,
        whereas deciding query entailment in ALC + Tmin is in co-NExpNP .
        We apply our approach based on the operator T also to the low com-
                                         ⊥                                   ⊥
        plexity Description Logic EL+ . We propose an extension EL+ T and
                                                             ⊥
        we show that the problem of entailment in EL+ T is in co-NP.


1     Introduction
Description logics (DLs) represent one of the most important formalisms of
knowledge representation. A DL knowledge base (KB) comprises a TBox, con-
taining the definition of concepts (and possibly roles), and a specification of
inclusions relations among them, and an ABox containing instances of concepts
and roles. Since the very objective of the TBox is to build a taxonomy of con-
cepts, the need of representing prototypical properties and of reasoning about
defeasible inheritance of such properties naturally arises.
    Several approaches to handle prototypical reasoning and inheritance with
exceptions in DL have been proposed in the literature, all of them are based
on the integration of DLs with some nonmonotonic reasoning mechanism: either
default logic (see [4, 19, 5]), or autoepistemic logic (see [9, 15] for some recents
developmentes) and circumscription (see [7] and [6]). In particular, [6] analyzes
the complexity of reasoning with circumscribed low complexity description log-
ics, such as DL-lite and the EL family. While reasoning with circumscribed ALC
knowledge bases is NExpNP -hard [7], in circumscribed DL-liteR complexity
drops to the second level of the polynomial hierarchy. In [6] is also shown that
in EL reasoning over circumscribed knowledge bases remain ExpTime-hard in
general. However, by limiting occurrences of existential restrictions, complexity
drops to the second level of the polynomial hierarchy.
    In this paper, we present an overview of our approach to reasoning about
typicality in DLs, which is based on the idea of introducing in the language a
typicality operator T. First, we present the monotonic logic ALC + T, obtained
by adding the operator T to ALC. Then we introduce a minimal model seman-
tics for it, which allows typical instances of concepts to be maximized. Finally,
                           ⊥                                        ⊥
we present the logic EL+ T, obtained by extending the logic EL+ with T. We
analyze and compare the complexity of these logics.
    The intended meaning of the operator T, for any concept C, is that T(C)
singles out the instances of C that are considered as “typical” or “normal”.
Thus assertions as “normally students do not pay taxes” are represented by
T(Student) ⊑ ¬TaxPayer . The operator T is characterised by a set of pos-
tulates that are essentially a reformulation of KLM [16] axioms of preferential
logic P, namely the assertion T(C) ⊑ P is equivalent to the conditional asser-
tion C |∼ P of P. It turns out that the semantics of the typicality operator can
be equivalently specified by a suitable modal logic.
    The idea underlying the modal interpretation is that there is a global prefer-
ence relation (a strict partial order) < on individuals, so that typical instances
of a concept C can be defined as the instances of C that are minimal with
respect to <. In this modal logic, < works as an accessibility relation R with
R(x, y) ≡ y < x, so that we can define T(C) as C ⊓¬C. The preference relation
< does not have infinite descending chains as we adopt the so-called Smooth-
ness condition or Limit Assumption of conditional logics. As a consequence, the
corresponding modal operator  has the same properties as in Gödel-Löb modal
logic G of arithmetic provability.
    In this setting, we assume that a KB comprises, in addition to the standard
TBox and ABox, a set of assertions of the type T(C) ⊑ D where D is a concept
not mentioning T. For instance, let the KB contain:

    T(Student) ⊑ ¬TaxPayer
    T(Student ⊓ Worker ) ⊑ TaxPayer
    T(Student ⊓ W orker ⊓ ∃HasChild .⊤) ⊑ ¬TaxPayer

corresponding to the assertions: normally a student does not pay taxes, nor-
mally a working student pays taxes, but normally a working student having
children does not pay taxes. Suppose further that the ABox contains alter-
natively the following facts about john: 1. Student(john); 2. Student(john),
Worker (john); 3. Student(john), Worker (john), ∃HasChild .⊤(john). We would
like to infer the expected (defeasible) conclusion about john in each case: 1.
¬TaxPayer (john), 2. TaxPayer (john), 3. ¬TaxPayer (john). Furthermore, we
would like to infer (defeasible) properties also of individuals implicitly intro-
duced by existential restrictions, for instance, if the ABox further contains
∃HasChild .(Student ⊓ Worker )(jack ) it should derive (defeasibly) the “right”
conclusion ∃HasChild .TaxPayer (jack ) in the latter. Finally, adding irrelevant
information should not affect the conclusions. Given the KB as above, one
should be able to infer as well T(Student ⊓ SportLover ) ⊑ ¬TaxPayer and
T(Student ⊓ Worker ⊓ SportLover ) ⊑ TaxPayer , as SportLover is irrelevant
with respect to being a TaxPayer or not. For the same reason, the conclu-
sion about john being a TaxPayer or not should not be influenced by adding
SportLover (john) to the ABox.
    The monotonic logic ALC + T allows some weak forms of inference through
cautious monotonicity. For instance, if typical students are young, from the KB
above we can derive that typical young students do not pay taxes. Inference in
ALC + T is in ExpTime. However, ALC + T is not sufficient to perform the
kind of defeasible reasoning illustrated above. Concerning the example, we get
for instance that: KB ∪ {Student(john), Worker (john)} 6|= TaxPayer (john);
KB 6|= T(Student ⊓ SportLover ) ⊑ ¬TaxPayer . In order to derive the conclusion
about john we should know (or assume) that john is a typical working student,
but we do not dispose of this information. Similarly, in order to derive that also
a typical student who loves sport does not have to pay taxes, we must be able
to infer or assume that a “typical student loving sport” is also a “typical stu-
dent”, since there is no reason why it should not be the case; this cannot be
derived by the logic itself given the nonmonotonic nature of T. The basic mono-
tonic logic ALC + T is then too weak to enforce these extra assumptions. In
order to perform defeasible inferences, we strenghten the semantics of ALC + T
by proposing a minimal model semantics. Intuitively, the idea is to restrict our
consideration to models that maximise typical instances of a concept. In order
to define the preference relation on models we take advantage of the modal se-
mantics of ALC + T: the preference relation on models (with the same domain)
is defined by comparing, for each individual, the set of modal (or more pre-
cisely -ed) concepts containing the individual in the two models. Similarly to
circumscription, where we must specify a set of minimised predicates, here we
must specify a set of concepts LT of which we want to maximise the set of typical
instances (it may just be the set of all concepts occurring in the knowledge base).
We call the new logic ALC + Tmin and we denote by |=L     min semantic entailment
                                                             T


determined by minimal models. Taking the KB of the examples above we obtain,
for instance, KB ∪ {Student(john), Worker (john)} |=L    min TaxPayer (john); KB
                                                           T

                                               LT
∪ {∃HasChild .(Student ⊓ Worker )(jack)} |=min ∃HasChild .TaxPayer (jack ) and
KB |=L min T(Student ⊓SportLover ) ⊑ ¬TaxPayer . As the second example shows,
        T


we are able to infer the intended conclusion also for the implicit individuals. In
[11] we have provided a decision procedure for checking satisfiability and validity
in ALC + Tmin . Our procedure, not presented here, can be used to determine
constructively an upper bound of the complexity of ALC + Tmin . Namely we
obtain that checking query entailment for ALC + Tmin is in co-NExpNP .
                                                       ⊥
    Finally, we introduce an extension of the logic EL+ with typicality. The log-
ics of the EL family allow for conjunction (⊓) and existential restriction (∃R.C).
Despite their relatively low expressivity, a renewed interest has recently emerged
for these logics. Indeed, it has been shown that EL has better algorithmic prop-
erties than F L0 , which allows for conjunction and value restriction (∀R.C). [1, 8]
show that reasoning in EL and several of its extensions remains tractable (i.e.,
polynomial-time decidable) in the presence of the TBox, and even of general
concept inclusions (GCIs). Furthermore, it has turned out that the logics of the
EL family are relevant for several applications, in particular in the bio-medical
domain; for instance, medical terminologies, such as the Galen Medical Knowl-
edge Base (GALEN, [17]), the Systemized Nomenclature of Medicine (SNOMED
[18]), and the Gene Ontology [20] used in bioinformatics, can be formalized in
small extensions of EL.                                                    ⊥
    We present some preliminary results about the complexity of EL+ T. In
                                            ⊥
particular, we show that a consistent EL+ T knowledge base has a small model
whose size is polynomial in the size of the knowledge base. In the paper, we
sketch the construction of the model. We show that, as a consequence of this
                                                    ⊥
result, the problem of deciding entailment in EL+ T is in co-NP.


2   The logic ALC + T
We consider an alphabet of concept names C, of role names R, and of individuals
O. The language L of the logic ALC + T is defined by distinguishing concepts
and extended concepts as follows: (Concepts) A ∈ C, ⊤ and ⊥ are concepts of
L; if C, D ∈ L and r ∈ R, then C ⊓ D, C ⊔ D, ¬C, ∀r.C, ∃r.C are concepts of L.
(Extended concepts) if C is a concept, then C and T(C) are extended concepts,
and all the Boolean combinations of extended concepts are extended concepts
of L. A knowledge base is a pair (TBox,ABox). TBox is a finite set of GCIs
C ⊑ D, where C ∈ L is an extended concept (either C ′ or T(C ′ )), and D ∈ L is
a concept. ABox contains expressions of the form C(a) and r(a, b) where C ∈ L
is an extended concept, r ∈ R, and a, b ∈ O.
    In order to provide a semantics to the operator T, we extend the definition
of a model used in “standard” terminological logic ALC:

Definition 1 (Semantics of T with selection function). A model is any
structure h∆, I, fT i, where: ∆ is the domain; I is the extension function that
maps each extended concept C to C I ⊆ ∆, and each role r to a rI ⊆ ∆I × ∆I .
I is defined in the usual way (as for ALC) and, in addition, (T(C))I = fT (C I ).
fT : P ow(∆) → P ow(∆) is a function satisfying the following properties:
(fT − 1) fT (S) ⊆ S                              (fT − 2) if S "= ∅, then also fT (S) "= ∅
                                                                       !        !
(fT − 3) if fT (S) ⊆ R, then fT (S) = fT (S ∩ R)        (fT − 4) fT ( Si ) ⊆       fT (Si )
          !                 "
(fT − 5)     fT (Si ) ⊆ fT ( Si )

Intuitively, given the extension of some concept C, fT selects the typical instances
of C. (fT − 1) requests that typical elements of S belong to S. (fT − 2) requests
that if there are elements in S, then there are also typical such elements. The next
properties constraint the behavior of fT wrt ∩ and ∪ in such a way that they do
not entail monotonicity. According to (fT − 3), if the typical elements of S are in
R, then they coincide with the typical elements of S ∩ R, thus expressing a weak
form of monotonicity (namely cautious    S monotonicity).
                                                  S        (fT − 4) corresponds to
one direction of the equivalence fT ( Si ) = fT (Si ), the one that does      T not
entail
T       monotonicity.   Similar  considerations
                                        T       apply toT the equation   f T (  Si ) =
   fT (Si ), of which only the inclusion fT (Si ) ⊆ fT ( Si ) is derivable. (fT −5) is
a further constraint on the behavior of fT wrt arbitrary unions and intersections;
it would be derivable if fT were monotonic.
    We can give an alternative semantics for T based on a preference relation.
The idea is that there is a global preference relation among individuals and that
the typical members of a concept C, i.e. selected by fT (C I ), are the minimal
elements of C wrt this preference relation. Observe that this notion is global,
that is to say, it does not compare individuals wrt a specific concept (something
like y is more typical than x wrt concept C). In this framework, an object x ∈ ∆
is a typical instance of some concept C, if x ∈ C I and there is no C-element in
∆ more typical than x. The typicality preference relation is partial since it is not
always possible to establish which object is more typical than which other. The
following definition is needed:
Definition 2. Given a relation <, which is a strict partial order (i.e. an ir-
reflexive and transitive relation) over a domain ∆, for all S ⊆ ∆, we define
M in< (S) = {x : x ∈ S and ∄y ∈ S s.t. y < x}. We say that < satisfies the
Smoothness Condition iff for all S ⊆ ∆, for all x ∈ S, either x ∈ M in< (S) or
∃y ∈ M in< (S) such that y < x.
In [10] it is shown that given a model with a selection function, it is possible
to define on the same domain a preference relation < such that, for all S ⊆ ∆,
fT (S) = M in< (S). Formally, given any model h∆, I, fT i, fT satisfies postulates
(fT − 1) to (fT − 5) above if and only if it is possible to define on ∆ a strict
partial order <, satisfying the Smoothness Condition, such that for all S ⊆ ∆,
fT (S) = M in< (S). By this result, we can refer to the following semantics for
ALC + T:
Definition 3 (Semantics of ALC +T). A model M is any structure h∆, <, Ii,
where ∆ and I are defined as in Definition 1, and < is a strict partial order over
∆ satisfying the Smoothness Condition (see Definition 2 above). As a difference
wrt Definition 1, the semantics of the T operator is: (T(C))I = M in< (C I ). For
concepts (built from operators of ALC), C I is defined in the usual way.
We introduce the following definition:
Definition 4 (Model satisfying a Knowledge Base). Consider a model M,
as defined in Definition 3. We extend I so that it assigns to each individual a of
O an element aI of the domain ∆. Given a KB (TBox,ABox), we say that:
 – M satisfies TBox if for all inclusions C ⊑ D in TBox, and all elements
    x ∈ ∆, if x ∈ C I then x ∈ DI .
 – M satisfies ABox if: (i) for all C(a) in ABox, we have that aI ∈ C I , (ii)
    for all r(a, b) in ABox, we have that (aI , bI ) ∈ rI .
M satisfies a knowledge base if it satisfies both its TBox and its ABox.

Notice that the meaning of T can be split into two parts: for any a of the domain
∆, a ∈ (T(C))I just in case (i) a ∈ C I , and (ii) there is no b ∈ C I such that b < a.
In order to isolate the second part of the meaning of T (for the purpose of the
calculus that we will present later), we introduce a new modality . The basic
idea is simply to interpret the preference relation < as an accessibility relation.
By the Smoothness Condition, it turns out that  has the properties as in Gödel-
Löb modal logic of provability G. The Smoothness Condition ensures that typical
elements of C I exist whenever C I 6= ∅, by preventing infinitely descending chains
of elements. This condition therefore corresponds to the finite-chain condition
on the accessibility relation (as in G). The interpretation of  in M is as follows:
(C)I = {a ∈ ∆ | for every b ∈ ∆, if b < a then b ∈ C I }. Therefore, we have
that a is a typical instance of C (a ∈ (T(C))I ) iff a ∈ (C ⊓ ¬C)I . Since we
only use  to capture the meaning of T, in the following we will always use the
modality  followed by a negated concept, as in ¬C.
    It is possible to prove that the satisfiability of an ALC + T-knowledge base
is in EXPTIME. We omit the proof that can be found in section 3.1 of [13].

Theorem 1 (Complexity of ALC + T). Given an ALC + T-knowledge base
(TBox,ABox), the problem of checking whether it is satisfiable can be solved in
exponential time.


3    The logic ALC + Tmin
The logic ALC + T allows one to reason about typicality. As a difference with
respect to standard ALC, in ALC + T we can consistently express, for instance,
the fact that three different concepts, as student, working student and work-
ing student with children, have a different status as taxpayers. As we have seen
in the introduction, this can be consistently expressed by including in a knowl-
edge base the three formulas: T(Student) ⊑ ¬TaxPayer ; T(Student ⊓Worker ) ⊑
TaxPayer ; T(Student ⊓Worker ⊓∃HasChild .⊤) ⊑ ¬TaxPayer . Assume that john
is an instance of the concept Student ⊓Worker ⊓∃HasChild .⊤. What can we con-
clude about john? If the ABox contains the assertion (∗) T(Student ⊓ Worker ⊓
∃HasChild .⊤)(john), then, in ALC +T, we can conclude that ¬TaxPayer (john).
However, in the absence of (*), we cannot derive ¬TaxPayer (john).
    We would like to infer that individuals are typical instances of the concepts
they belong to, if consistent with the KB. In order to maximize the typicality of
instances, we define a preference relation on models, and we introduce a semantic
entailment determined by minimal models. Informally, we prefer a model M to
a model N if M contains more typical instances of concepts than N .
    Given a KB, we consider a finite set LT of concepts occurring in the KB, the
typicality of whose instances we want to maximize. The maximization of the set
of typical instances will apply to individuals explicitly occurring in the ABox as
well as to implicit individuals. We assume that the set LT contains at least all
concepts C such that T(C) occurs in the KB.
    We have seen that a is a typical instance of a concept C (a ∈ (T(C))I )
when it is an instance of C and there is not another instance of C preferred to
a, i.e. a ∈ (C ⊓ ¬C)I . In the following, in order to maximize the typicality
of the instances of C, we minimize the instances of ¬¬C. Notice that this is
different from maximising the instances of T(C). We have adopted this solution
since it allows to maximise the set of typical instances of C without affecting
the extension of C (whereas maximising the extension of T(C) would imply
maximising also the extension of C).
                         −
    We define the set MLT of negated boxed formulas holding in a model, relative
                                                              −
to the concepts in LT . Given a model M = h∆, <, Ii, let M  LT = {(a, ¬¬C) |
a ∈ (¬¬C)I , with a ∈ ∆, C ∈ LT }.

Definition 5 (Preferred and minimal models). Given a model M = h∆M ,