=Paper= {{Paper |id=Vol-434/paper-3 |storemode=property |title=Combining Logic Programming with Description Logics and Machine Learning for the Semantic Web |pdfUrl=https://ceur-ws.org/Vol-434/paper3.pdf |volume=Vol-434 }} ==Combining Logic Programming with Description Logics and Machine Learning for the Semantic Web== https://ceur-ws.org/Vol-434/paper3.pdf
        Combining Logic Programming with
      Description Logics and Machine Learning
                for the Semantic Web

                    Francesca A. Lisi and Floriana Esposito

            Dipartimento di Informatica, Università degli Studi di Bari
                      Via E. Orabona 4, 70125 Bari, Italy
                        {lisi, esposito}@di.uniba.it



      Abstract. In this paper we consider an extension of Logic Programming
      that tackles the Semantic Web challenge of acquiring rules combined with
      ontologies. To face this bottleneck problem we propose a framework that
      resorts to the expressive and deductive power of DL+log and adopts the
      methodological apparatus of Inductive Logic Programming.


1   Introduction

Combining rules and ontologies is a hot topic in the (Semantic) Web area as
testified by the intense activity and the standardization efforts of the Rule Inter-
change Format working group at W3C. Yet the debate around a unified language
for (Semantic) Web rules is still open. Indeed, combining rules and ontologies
raises several issues in Knowledge Representation (KR) due to the many differ-
ences between the underlying logics, Clausal Logics (CLs) [17] and Description
Logics (DLs) [1] respectively. Among the many recent KR proposals, DL+log
[23] is a very powerful framework that allows for the tight integration of DLs
and disjunctive Datalog with negation (Datalog¬∨ ) [7]. A point in favour
of DL+log is its decidability for many DLs, notably for SHIQ [12]. Since the
design of OWL has been based on the SH family of very expressive DLs [11],
SHIQ+log is a good candidate for investigation in the Semantic Web context.
    The upcoming standard rule language for the Semantic Web, if well-founded
from the KR viewpoint, will be equipped with reasoning algorithms. In KR
tradition deductive reasoning is the most widely studied. Yet, other forms of
reasoning will become necessary. E.g., acquiring and maintaining Semantic Web
rules is very demanding and can be automated though partially by applying Ma-
chine Learning algorithms. In this paper, we consider a decidable instantiation
of DL+log obtained by choosing SHIQ for the DL part and Datalog¬ for the
CL part, and face the problem of defining inductive reasoning mechanisms on it.
To solve the problem, we propose to resort to the methodological apparatus of
that form of Machine Learning known under the name of Inductive Logic Pro-
gramming (ILP) [19]. We extend some known ILP techniques to SHIQ + log¬
and illustrate them with examples relevant to the Semantic Web context.
   The paper is organized as follows. Section 2 briefly introduces hybrid DL-CL
formalisms and ILP. Section 3 introduces the KR framework of DL+log. Section
4 defines the ILP framework for inducing SHIQ+log¬ rules. Section 5 provides
a comparative analysis of our proposal with related work. Section 6 concludes
the paper with final remarks.


2     Background
2.1   Logic Programming and Description Logics
Description Logics (DLs) are a family of KR formalims that allow for the spec-
ification of knowledge in terms of classes (concepts), binary relations between
classes (roles), and instances (individuals) [1]. Complex concepts can be defined
from atomic concepts and roles by means of constructors (see Table 1). E.g.,
concept descriptions in the basic DL AL are formed according to only the con-
structors of atomic negation, concept conjunction, value restriction, and limited
existential restriction. The DLs ALC and ALN are members of the AL fam-
ily. The former extends AL with (arbitrary) concept negation (or complement),
whereas the latter with number restriction. The DL ALCN R adds to the con-
structors inherited from ALC and ALN a further one: role intersection (see
Table 1). Conversely, in the DL SHIQ [12] it is allowed to invert roles and to
express qualified number restrictions of the form ≥ nS.C and ≤ nS.C where S
is a simple role (see Table 1).
     A DL knowledge base (KB) can state both is-a relations between concepts
(axioms) and instance-of relations between individuals (resp. couples of individu-
als) and concepts (resp. roles) (assertions). Concepts and axioms form the TBox
whereas individuals and assertions form the ABox. A SHIQ KB encompasses
also a RBox which consists of axioms concerning abstract roles. The semantics
of DLs is usually defined through a mapping to First Order Logic (FOL) [2]. An
interpretation I = (∆I , ·I ) for a DL KB consists of a non-empty domain ∆I
and a mapping function ·I . In particular, individuals are mapped to elements of
∆I such that aI 6= bI if a 6= b (Unique Names Assumption (UNA) [21]). Yet in
SHIQ UNA does not hold by default [10]. Thus individual equality (inequality)
assertions may appear in a SHIQ KB (see Table 1). Also the KB represents
many different interpretations, i.e. all its models. This is coherent with the Open
World Assumption (OWA) that holds in FOL semantics. The main reasoning
task for a DL KB is the consistency check that is performed by applying decision
procedures based on tableau calculus. Decidability of reasoning is crucial in DLs.
     The integration of DLs and Logic Programming follows the tradition of KR
research on hybrid systems, i.e. those systems which are constituted by two or
more subsystems dealing with distinct portions of a single KB by performing
specific reasoning procedures [8], and gives raise to KR systems that will be
referred to as DL-CL hybrid systems in the rest of the paper. The motivation
for investigating and developing such systems is to improve on representational
adequacy and deductive power by preserving decidability. In particular, combin-
ing DLs with CLs can easily yield to undecidability if the interface between
                        Table 1. Syntax and semantics of DLs.



       bottom (resp. top) concept ⊥ (resp. >) ∅ (resp. ∆I )
                   atomic concept      A      AI ⊆ ∆I
                    (abstract) role    R      RI ⊆ ∆I × ∆I
            (abstract) inverse role   R−      (RI )−
             (abstract) individual     a      aI ∈ ∆I
                   concept negation       ¬C         ∆I \ C I
               concept intersection     C1 u C2      C1I ∩ C2I
                      concept union     C1 t C2      C1I ∪ C2I
                    value restriction    ∀R.C        {x ∈ ∆I | ∀y (x, y) ∈ RI → y ∈ C I }
              existential restriction    ∃R.C        {x ∈ ∆I | ∃y (x, y) ∈ RI ∧ y ∈ C I }
        at least number restriction      ≥ nR        {x ∈ ∆I | |{y|(x, y) ∈ RI }| ≥ n}
       at most number restriction        ≤ nR        {x ∈ ∆I | |{y|(x, y) ∈ RI }| ≤ n}
at least qualif. number restriction     ≥ nS.C       {x ∈ ∆I | |{y ∈ C I |(x, y) ∈ S I }| ≥ n}
at most qualif. number restriction      ≤ nS.C       {x ∈ ∆I | |{y ∈ C I |(x, y) ∈ S I }| ≤ n}
                    role intersection   R1 u R2      R1I ∩ R2I
       concept equivalence axiom        C1 ≡ C2      C1I = C2I
      concept subsumption axiom         C1 v C2      C1I ⊆ C2I
            role equivalence axiom       R≡S         RI = S I
               role inclusion axiom      RvS         RI ⊆ S I
                 concept assertion        a:C        aI ∈ C I
                     role assertion     ha, bi : R   (aI , bI ) ∈ RI
      individual equality assertion       a≈b        aI = bI
    individual inequality assertion       a 6≈ b     aI 6= bI




them is not reduced. In [14] the family Carin of languages combining any DL
and HCL is presented. Among the many important results of this study, it is
proved that query answering in a logic obtained by extending ALCN R with non-
recursive Datalog rules, where both concepts and roles can occur in rule bodies,
is decidable. Query answering is decided using constrained SLD-resolution, i.e.
an extension of SLD-resolution with a modified version of tableau calculus. An-
other DL-CL hybrid system is AL-log [6] that integrates ALC [26] and Datalog
[4] by constraining the variables occurring in the body of rules with ALC con-
cept assertions. Constrained SLD-resolution for AL-log is decidable and offers a
complete and sound method for answering ground queries by refutation. Besides
decidability, another relevant issue is DL-safeness of hybrid DL-CL systems [22].
A safe interaction between the DL and the CL part of an hybrid KB allows to
solve the semantic mismatch between DLs and CLs due to the different inferences
that can be made under OWA and CWA respectively. In this respect, AL-log is
DL-safe whereas Carin is not.
2.2   Logic Programming and Machine Learning

The research area born at the intersection of Logic Programming and Machine
Learning, more precisely Concept Learning [18], is known under the name of
Inductive Logic Programming (ILP) [19]. From Logic Programming ILP has
borrowed the KR framework, i.e. Horn Clausal Logic (HCL). From Concept
Learning it has inherited the inferential mechanisms for induction, the most
prominent of which is generalization characterized as search through a partially
ordered space of hypotheses. According to this vision, in ILP a hypothesis is
a clausal theory (i.e., a set of rules) and the induction of a single clause (rule)
requires (i) structuring, (ii) searching and (iii) bounding the space of hypotheses.
First we focus on (i) by clarifying the notion of ordering for clauses. An ordering
allows for determining which one, between two clauses, is more general than
the other. Actually quasi-orders are considered, therefore uncomparable pairs
of clauses are admitted. One such ordering is θ-subsumption [20]: Given two
clauses C and D, we say that C θ-subsumes D if there exists a substitution
θ, such that Cθ ⊆ D. Given the usefulness of Background Knowledge (BK) in
ILP, orders have been proposed that reckon with it, e.g. Buntine’s generalized
subsumption [3]. Generalized subsumption only applies to definite clauses and
the BK should be a definite program. Once structured, the space of hypotheses
can be searched (ii) by means of refinement operators. A refinement operator is
a function which computes a set of specializations or generalizations of a clause
according to whether a top-down or a bottom-up search is performed. The two
kinds of refinement operator have been therefore called downward and upward,
respectively. The definition of refinement operators presupposes the investigation
of the properties of the various quasi-orders and is usually coupled with the
specification of a declarative bias for bounding the space of clauses (iii). Bias
concerns anything which constrains the search for theories, e.g. a language bias
specifies syntactic constraints on the clauses in the search space.
    Induction with ILP generalizes from individual instances/observations in the
presence of BK, finding valid hypotheses. Validity depends on the underlying set-
ting. At present, there exist several formalizations of induction in clausal logic
that can be classified according to the following two orthogonal dimensions: the
scope of induction (discrimination vs characterization) and the representation of
observations (ground definite clauses vs ground unit clauses) [5]. Discriminant
induction aims at inducing hypotheses with discriminant power as required in
tasks such as classification. In classification, observations encompass both posi-
tive and negative examples. Characteristic induction is more suitable for finding
regularities in a data set. This corresponds to learning from positive examples
only. The second dimension affects the notion of coverage, i.e. the condition un-
der which a hypothesis explains an observation. In learning from entailment (or
from implications), hypotheses are clausal theories, observations are ground def-
inite clauses, and a hypothesis covers an observation if the hypothesis logically
entails the observation. In learning from interpretations, hypotheses are clausal
theories, observations are Herbrand interpretations (ground unit clauses) and a
hypothesis covers an observation if the observation is a model for the hypothesis.
3     Combining LP and DLs with DL+log
The KR framework of DL+log [23] allows for the tight integration of DLs [1]
and Datalog¬∨ [7]. More precisely, it allows a DL KB to be extended with
weakly-safe Datalog¬∨ rules. The condition of weak safeness allows to overcome
the main representational limits of the approaches based on the DL-safeness
condition, e.g. the possibility of expressing conjunctive queries (CQ) and unions
of conjunctive queries (UCQ)1 , by keeping the integration scheme still decidable.
To a certain extent, DL+log is between AL-log [6] and Carin [14].

3.1     Syntax
Formulas in DL+log are built upon three mutually disjoint predicate alphabets:
an alphabet of concept names PC , an alphabet of role names PR , and an alphabet
of Datalog predicates PD . We call a predicate p a DL-predicate if either p ∈ PC
or p ∈ PR . Then, we denote by C a countably infinite alphabet of constant names.
An atom is an expression of the form p(X), where p is a predicate of arity n
and X is a n-tuple of variables and constants. If no variable symbol occurs in X,
then p(X) is called a ground atom (or fact). If p ∈ PC ∪ PR , the atom is called
a DL-atom, while if p ∈ PD , it is called a Datalog atom.
    Given a description logic DL, a DL+log KB B is a pair (Σ, Π), where Σ is
a DL KB and Π is a set of Datalog¬∨ rules, where each rule R has the form
                                   p1 (X1 ) ∨ . . . ∨ pn (Xn ) ←
         r1 (Y1 ), . . . , rm (Ym ), s1 (Z1 ), . . . , sk (Zk ), ¬u1 (W1 ), . . . , ¬uh (Wh )
with n, m, k, h ≥ 0, each pi (Xi ), rj (Yj ), sl (Zl ), uk (Wk ) is an atom and:
 – each pi is either a DL-predicate or a Datalog predicate;
 – each rj , uk is a Datalog predicate;
 – each sl is a DL-predicate;
 – (Datalog safeness) every variable occurring in R must appear in at least
   one of the atoms r1 (Y1 ), . . . , rm (Ym ), s1 (Z1 ), . . . , sk (Zk );
 – (weak safeness) every head variable of R must appear in at least one of the
   atoms r1 (Y1 ), . . . , rm (Ym ).
    We remark that the above notion of weak safeness allows for the presence
of variables that only occur in DL-atoms in the body of R. On the other hand,
the notion of DL-safeness can be expressed as follows: every variable of R must
appear in at least one of the atoms r1 (Y1 ), . . . , rm (Ym ). Therefore, DL-safeness
forces every variable of R to occur also in the Datalog atoms in the body of
R, while weak safeness allows for the presence of variables that only occur in
DL-atoms in the body of R. Without loss of generality, we can assume that in a
DL+log KB (Σ, Π) all constants occurring in Σ also occur in Π.
1
    A Boolean UCQ over a predicate alphabet P is a first-order sentence of the form
    ∃X.conj1 (X) ∨ . . . ∨ conjn (X), where X is a tuple of variable symbols and each
    conji (X) is a set of atoms whose predicates are in P and whose arguments are either
    constants or variables from X. A Boolean CQ corresponds to a Boolean UCQ in the
    case when n = 1.
Example 1. Let us consider a DL+log KB B (adapted from [23]) integrating the
following DL-KB Σ (ontology about persons)

[A1] PERSON v ∃ FATHER− .MALE
[A2] MALE v PERSON
[A3] FEMALE v PERSON
[A4] FEMALE v ¬MALE
     MALE(Bob)
     PERSON(Mary)
     PERSON(Paul)
     FATHER(John,Paul)
and the following Datalog¬∨ program Π (rules about students):

[R1] boy(X) ← enrolled(X,c1,bsc), PERSON(X), ¬girl(X)
[R2] girl(X) ← enrolled(X,c2,msc), PERSON(X)
[R3] boy(X)∨ girl(X) ← enrolled(X,c3,phd), PERSON(X)
[R4] FEMALE(X) ← girl(X)
[R5] MALE(X) ← boy(X)
[R6] man(X) ← enrolled(X,c3,phd), FATHER(X,Y)
     enrolled(Paul,c1,bsc)
     enrolled(Mary,c1,bsc)
     enrolled(Mary,c2,msc)
     enrolled(Bob,c3,phd)
     enrolled(John,c3,phd)
Note that the rules mix DL-literals and Datalog-literals. Notice that the vari-
able Y in rule R6 is weakly-safe but not DL-safe, since Y does not occur in any
Datalog predicate in R6.


3.2   Semantics

For DL+log two semantics have been defined: a first-order logic (FOL) semantics
and a nonmonotonic (NM) semantics. In particular, the latter extends the stable
model semantics of Datalog¬∨ [9]. According to it, DL-predicates are still
interpreted under OWA, while Datalog predicates are interpreted under CWA.
Notice that, under both semantics, entailment can be reduced to satisfiability. In
a similar way, it can be seen that CQ answering can be reduced to satisfiability
in DL+log. Consequently, Rosati [23] concentrates on the satisfiability problem
in DL+log KBs. It has been shown that, when the rules are positive disjunctive,
the above two semantics are equivalent with respect to the satisfiability problem.
In particular, FOL-satisfiability can always be reduced (in linear time) to NM-
satisfiability. Hence, the satisfiability problem under the NM semantics is in the
focus of interest.

Example 2. With reference to Example 1, it can be easily verified that all NM-
models for B satisfy the following ground atoms:
 – boy(Paul) (since rule R1 is always applicable for {X/Paul} and R1 acts like
   a default rule, which can be read as follows: if X is a person enrolled in course
   c1, then X is a boy, unless we know for sure that X is a girl);
 – girl(Mary) (since rule R2 is always applicable for {X/Mary});
 – boy(Bob) (since rule R3 is always applicable for {X/Bob}, and, by rule R4,
   the conclusion girl(Bob) is inconsistent with Σ);
 – MALE(Paul) (due to rule R5);
 – FEMALE(Mary) (due to rule R4).

Notice that B |=N M FEMALE(Mary), while Σ 6|=F OL FEMALE(Mary). In other
words, adding rules has indeed an effect on the conclusions one can draw about
DL-predicates. Moreover, such an effect also holds under the FOL semantics of
DL+log-KBs, since it can be verified that B |=F OL FEMALE(Mary) in this case.


3.3     Reasoning

The problem statement of satisfiability for finite DL+log KBs relies on the fol-
lowing problem known as the Boolean CQ/UCQ containment problem 2 in DLs:
Given a DL-TBox T , a Boolean CQ Q1 and a Boolean UCQ Q2 over the alpha-
bet PC ∪PR , Q1 is contained in Q2 with respect to T , denoted by T |= Q1 ⊆ Q2 ,
iff, for every model I of T , if Q1 is satisfied in I then Q2 is satisfied in I. The
algorithm NMSAT-DL+log for deciding NM-satisfiability of DL+log KBs looks
for a guess (GP , GN ) of the Boolean CQs in the DL-grounding of Π, denoted
as grp (Π), that is consistent with the DL-KB Σ (Boolean CQ/UCQ contain-
ment problem) and such that the Datalog¬∨ program Π(GP , GN ) has a stable
model. Details of how obtaining grp (Π) and Π(GP , GN ) can be found in [23].
     The decidability of reasoning in DL+log, thus of ground query answering, de-
pends on the decidability of the Boolean CQ/UCQ containment problem in DL.
Consequently, ground queries can be answered by applying NMSAT-DL+log.

Theorem 1 [23] For every description logic DL, satisfiability of DL+log-KBs
(both under FOL semantics and under NM semantics) is decidable iff Boolean
CQ/UCQ containment is decidable in DL.

Corollary 1. Given a DL+log KB (Σ, Π) and a ground atom α, (Σ, Π) |= α
iff (Σ, Π ∪ {← α}) is unsatisfiable.

From Theorem 1 and from previous results on query answering and query con-
tainment in DLs, it follows the decidability of reasoning in several instantia-
tions of DL+log. Since SHIQ is the most expressive DL for which the Boolean
CQ/UCQ containment is decidable [10], we consider SHIQ+log¬ (i.e. SHIQ
extended with weakly-safe Datalog¬ rules) as the KR framework in our study
of ILP for the Semantic Web.
2
    This problem was called existential entailment in [14].
4     Inducing SHIQ+log¬ Rules with ILP
We consider the task of inducing new SHIQ+log¬ rules from an already existing
SHIQ+log¬ KB. At this stage of work the scope of induction does not matter.
Therefore the term ’observation’ is to be preferred to the term ’example’. We
choose to work within the setting of learning from interpretations which requires
an observation to be represented as a set of ground unit clauses.
    We assume that the data are represented as a SHIQ+log¬ KB B where the
intensional part K (i.e., the TBox T plus the set ΠR of rules) plays the role of
background knowledge and the extensional part (i.e., the ABox A plus the set
ΠF of facts) contributes to the definition of observations. Therefore ontologies
may appear as input to the learning problem of interest.
Example 3. Suppose we have a SHIQ+log¬ KB (adapted from [23]) consisting
of the following intensional knowledge K:
[A1] RICHuUNMARRIED v ∃ WANTS-TO-MARRY− .>
[R1] RICH(X) ← famous(X), ¬ scientist(X)
and the following extensional knowledge F:
      UNMARRIED(Mary)
      UNMARRIED(Joe)
      famous(Mary)
      famous(Paul)
      famous(Joe)
      scientist(Joe)
that can be split into FJoe = {UNMARRIED(Joe), famous(Joe), scientist(Joe)},
FMary = {UNMARRIED(Mary), famous(Mary)}, and FPaul = {famous(Paul)}.
    The language L of hypotheses must allow for the generation of SHIQ+log¬
rules starting from three disjoint alphabets PC (L) ⊆ PC (B), PR (L) ⊆ PR (B),
and PD (L) ⊆ PD (B). More precisely, we consider linked3 and range-restricted4
weakly-safe Datalog¬ clauses of the form
    p(X) ← r1 (Y1 ), . . . , rm (Ym ), s1 (Z1 ), . . . , sk (Zk ), ¬u1 (W1 ), . . . , ¬uh (Wh )
where the unique literal p(X) in the head represents the target predicate, de-
noted as c if p is a Datalog-predicate and as C if p is a SHIQ-predicate. In
the following we provide examples for these two cases of rule learning, one aimed
at inducing c(X) ← rules and the other C(X) ← rules. The former kind of rule
will enrich the Datalog part of the KB, whereas the latter will extend the DL
part (i.e., the input ontology).
3
  A clause H is linked if each literal li ∈ H is linked. A literal li ∈ H is linked if at least
  one of its terms is linked. A term t in some literal li ∈ H is linked with linking-chain
  of length 0, if t occurs in head(H), and with linking-chain of length d + 1, if some
  other term in li is linked with linking-chain of length d. The link-depth of a term t
  in li is the length of the shortest linking-chain of t.
4
  A clause H is range-restricted if each variable occurring in head(H) also occur in
  body(H).
Example 4. Suppose that the Datalog-predicate happy is the target predicate
and the set PD (Lhappy ) ∪ PC (Lhappy ) ∪ PR (Lhappy ) = {famous/1} ∪ {RICH/1} ∪
{WANTS-TO-MARRY/2} provides the building blocks for the language Lhappy . The
following SHIQ+log¬ rules

H1happy      happy(X) ← RICH(X)
  happy
H2           happy(X) ← famous(X)
H3happy      happy(X) ← famous(X), WANTS-TO-MARRY(Y,X)

belonging to Lhappy can be considered hypotheses for the target predicate happy.
Note that H3happy is weakly-safe.

Example 5. Suppose now that the target predicate is the DL-predicate LONER.
If LLONER is defined over PD (LLONER ) ∪ PC (LLONER ) = {famous/1, scientist/1} ∪
{UNMARRIED/1}, then the following SHIQ+log¬ rules

H1LONER      LONER(X) ← scientist(X)
H2LONER      LONER(X) ← scientist(X), UNMARRIED(X)
H3LONER      LONER(X) ← ¬famous(X)

belong to LLONER and represent hypotheses for the target predicate LONER.

   In order to support with ILP techniques the induction of SHIQ+log¬ rules,
the language L of hypotheses needs to be equipped with a generality order ,
and a coverage relation covers so that (L, ) is a search space and covers defines
the mappings from (L, ) to the set O of observations. The next subsections are
devoted to these issues.


4.1   The hypothesis ordering

The definition of a generality order for hypotheses in L can disregard neither the
peculiarities of SHIQ+log¬ nor the methodological apparatus of ILP. One issue
arises from the presence of NAF literals (i.e., negated Datalog literals) both in
the background knowledge and in the language of hypotheses. As pointed out in
[25], rules in normal logic programs are syntactically regarded as Horn clauses
by viewing the NAF-literal ¬p(X) as an atom not p(X) with the new predicate
not p. Then any result obtained on Horn logic programs is directly carried over
to normal logic programs. Assuming one such treatment of NAF literals, we
propose to adapt generalized subsumption [3] to the case of SHIQ+log¬ rules.
The resulting generality relation will be called K-subsumption, briefly K , from
now on. We provide a characterization of K that relies on the reasoning tasks
known for DL+log and from which a test procedure can be derived.

Definition 1. Let H1 , H2 ∈ L be two hypotheses standardized apart, K a back-
ground knowledge, and σ a Skolem substitution for H2 with respect to {H1 } ∪ K.
We say that H1 K H2 iff there exists a ground substitution θ for H1 such that
(i) head(H1 )θ = head(H2 )σ and (ii) K ∪ body(H2 )σ |= body(H1 )θ.
   Note that condition (ii) is a variant of the Boolean CQ/UCQ containment
problem because body(H2 )σ and body(H1 )θ are both Boolean CQs. The differ-
ence between (ii) and the original formulation of the problem is that K encom-
passes not only a TBox but also a set of rules. Nonetheless this variant can
be reduced to the satisfiability problem for finite SHIQ+log¬ KBs. Indeed the
skolemization of body(H2 ) allows to reduce the Boolean CQ/UCQ containment
problem to a CQ answering problem5 . Due to the aforementioned link between
CQ answering and satisfiability, checking (ii) can be reformulated as proving
that the KB (T , ΠR ∪ body(H2 )σ ∪ {← body(H1 )θ}) is unsatisfiable. Once refor-
mulated this way, (ii) can be solved by applying the algorithm NMSAT-DL+log.
Example 6. Let us consider the hypotheses
H1happy        happy(A) ← RICH(A)
  happy
H2             happy(X) ← famous(X)
reported in Example 4 up to variable renaming. We want to check whether
H1happy K H2happy holds. Let σ = {X/a} a Skolem substitution for H2happy with
respect to K ∪ H1happy and θ = {A/a} a ground substitution for H1happy . The
condition (i) is immediately verified. The condition (ii) K ∪ {famous(a)} |=
RICH(a) is nothing else that a ground query answering problem in SHIQ+log.
It can be proved that the query RICH(a) can not be satisfied because the rule
R1 is not applicable for a. Thus, H1happy 6K H2happy . Since H2happy 6K H1happy , the
two hypotheses are incomparable under K-subsumption. Conversely, it can be
proved that H2happy K H3happy but not viceversa.

Example 7. Let us consider the hypotheses
H1LONER        LONER(A) ← scientist(A)
H2LONER        LONER(X) ← scientist(X),UNMARRIED(X)
reported in Example 5 up to variable renaming. We want to check whether
H1LONER K H2LONER holds. Let σ = {X/a} a Skolem substitution for H2LONER with
respect to K ∪ H1LONER and θ = {A/a} a ground substitution for H1LONER . The
condition (i) is immediately verified. The condition
           (ii) K ∪ {scientist(a), UNMARRIED(a)} |= {scientist(a)}
is a ground query answering problem in SHIQ+log. It can be easily proved that
all NM-models for K ∪ {scientist(a), UNMARRIED(a)} satisfy scientist(a).
Thus, H1LONER K H2LONER . The viceversa does not hold. Also it can be proved that
H3LONER is incomparable with both H1LONER and H2LONER under K-subsumption.

It is straightforward to see that the decidability of K-subsumption follows from
the decidability of SHIQ+log¬ . It can be proved that K is a quasi-order (i.e. it
is a reflexive and transitive relation) for SHIQ+log¬ rules, therefore the space
of hypotheses can be searched by refinement operators.
5
    Since UNA does not necessarily hold in SHIQ, the (Boolean) CQ/UCQ containment
    problem for SHIQ boils down to the (Boolean) CQ/UCQ answering problem.
4.2   The hypothesis coverage of observations
The definition of a coverage relation depends on the representation choice for
observations. An observation oi ∈ O is represented as a couple (p(ai ), Fi ) where
Fi is a set containing ground facts concerning the tuple of individuals ai . We
assume K ∩ O = ∅.

Definition 2. Let H ∈ L be a hypothesis, K a background knowledge and oi ∈
O an observation. We say that H covers oi under interpretations w.r.t. K iff
K ∪ Fi ∪ H |= p(ai ).

Note that the coverage test can be reduced to query answering in SHIQ+log¬
KBs which in its turn can be reformulated as a satisfiability problem of the KB.
                                happy
Example 8. The hypothesis H3       mentioned in Example 4 covers the observa-
                                                           happy
tion oMary = (happy(Mary), FMary ) because K ∪ FMary ∪ H3        |= happy(Mary).
                                             happy
Indeed, all NM-models for B = K ∪ FMary ∪ H3       satisfy:
 – famous(Mary) (trivial!);
 – ∃ WANTS-TO-MARRY− .>(Mary), due to the axiom A1 and to the fact that
   both RICH(Mary) and UNMARRIED(Mary) hold in every model of B;
 – happy(Mary), due to the above conclusions and to the rule R1. Indeed, since
   ∃WANTS-TO-MARRY− .>(Mary) holds in every model of B, it follows that in
   every model there exists a constant x such that WANTS-TO-MARRY(x,Mary)
   holds in the model, consequently from rule R1 it follows that happy(Mary)
   also holds in the model.
             happy
Note that H3      does not cover the observations oJoe = (happy(Joe), FJoe ) and
oPaul = (happy(Paul), FPaul ). More precisely, K ∪ FJoe ∪ H3happy 6|= happy(Joe)
because scientist(Joe) holds in every model of B = K ∪ FJoe ∪ H3happy , thus
making the rule R1 not applicable for {X/Joe}, therefore RICH(Joe) not deriv-
able. Finally, K ∪ FPaul ∪ H3happy 6|= happy(Paul) because UNMARRIED(Paul)
is not forced to hold in every model of B = K ∪ FPaul ∪ H3happy , therefore
∃WANTS-TO-MARRY− .>(Paul) is not forced by A1 to hold in every such model.
    It can be proved that H1happy covers oMary and oPaul , while H2happy all the three
observations.

Example 9. With reference to Example 5, the hypothesis H3LONER does not cover
the observation oMary = (LONER(Mary), FMary ) because all NM-models for B =
K ∪ FMary ∪ H3LONER do satisfy famous(Mary). Note that it does not cover the
observations oPaul = (LONER(Paul), FPaul ) and oJoe = (LONER(Joe), FJoe ) for
analogous reasons. It can be proved that H2LONER covers oMary and oJoe while
H1LONER all three observations.


5     Related Work
Two ILP frameworks have been proposed so far that adopt a hybrid DL-CL
representation for both hypotheses and background knowledge. The framework
proposed in [24] focuses on discriminant induction and adopts the ILP setting
of learning from interpretations. Hypotheses are represented as Carin-ALN
non-recursive rules with a Horn literal in the head that plays the role of tar-
get concept. The coverage relation of hypotheses against examples adapts the
usual one in learning from interpretations to the case of hybrid Carin-ALN
BK. The generality relation between two hypotheses is defined as an extension
of generalized subsumption. Procedures for testing both the coverage relation
and the generality relation are based on the existential entailment algorithm
of Carin. Following [24], Kietz studies the learnability of Carin-ALN , thus
providing a pre-processing method which enables ILP systems to learn Carin-
ALN rules [13]. In [15], the representation and reasoning means come from
AL-log. Hypotheses are represented as constrained Datalog clauses. Note that
this framework is general, meaning that it is valid whatever the scope of induc-
tion is. The generality relation for one such hypothesis language is an adapta-
tion of generalized subsumption to the AL-log KR framework. It gives raise to a
quasi-order and can be checked with a decidable procedure based on constrained
SLD-resolution. Coverage relations for both ILP settings of learning from inter-
pretations and learning from entailment have been defined on the basis of query
answering in AL-log. As opposite to [24], the framework has been partially im-
plemented in an ILP system [16] that supports a variant of frequent pattern
discovery where rich prior conceptual knowledge is taken into account in order
to find patterns at multiple levels of description granularity.


            Table 2. Comparison between ILP frameworks for DL-CL systems.


                    Learning in Carin-ALN [24]      Learning in AL-log [15]        Learning in SHIQ+log¬
prior knowledge     Carin-ALN KB                    AL-log KB                      SHIQ+log¬ KB
  ontology lang.    ALN                             ALC                            SHIQ
       rule lang.   Horn clauses                    Datalog clauses                Datalog¬ clauses
hypothesis lang.    Carin-ALN non-recursive rules   constrained Datalog clauses    SHIQ+log¬ non-recursive rules
target predicate    Horn literal                    Datalog literal                SHIQ/Datalog literal
    observations    interpretations                 interpretations/implications   interpretations
       induction    predictive                      predictive/descriptive         predictive/descriptive
generality order    extension of [3] to Carin-ALN   extension of [3] to AL-log     extension of [3] to SHIQ+log¬
   coverage test    Carin-ALN query answering       AL-log query answering         SHIQ+log¬ query answering
  ref. operators    no                              downward                       no
implementation      no                              partially                      no
    application     no                              yes                            no




    The ILP framework presented in this paper differs from [24] and [15] in several
respects as summarized in Table 2, notably the following ones. First, it relies on a
more expressive DL, i.e. SHIQ. Second, it allows for inducing definitions for new
DL concepts, i.e. rules with a SHIQ literal in the head. Third, it relies on a more
expressive yet decidable CL, i.e. Datalog¬ . Forth, it adopts a tighter form of
integration between the DL and the CL part, i.e. the weakly-safe one. Similarities
also emerge from Table 2 such as the use of a semantic ordering for hypotheses
in order to accommodate ontologies in ILP. Note that generalized subsumption
is chosen for adaptation in all three ILP frameworks because definite clauses,
though enriched with DL and NAF literals, are still used.


6   Final Remarks
In this paper, we have proposed an ILP framework built upon SHIQ+log¬ . In-
deed, well-known ILP techniques for induction have been reformulated in terms
of the deductive reasoning mechanims of DL+log. Notably, we have defined a de-
cidable generality ordering, K-subsumption, for SHIQ+log¬ rules on the basis
of the decidable algorithm NMSAT-SHIQ+log. We would like to point out that
the ILP framework proposed is suitable for inductive reasoning in the context of
the Semantic Web for two main reasons. First, it adopts the DL which was the
starting point for the design of the Web ontology language OWL. Second, it can
deal with incomplete knowledge, thus coping with a more plausible scenario of
the Web. Though the work presented in this paper can be considered as a fea-
sibility study, it provides the principles for inductive reasoning in SHIQ+log¬ .
We would like to emphasize that they will be still valid for any other upcoming
decidable instantiation of DL+log, provided that Datalog¬ is still considered
for the CL part.
    The Semantic Web offers several use cases for rules among which we can
choose in order to see our ILP framework at work. As next step towards any
practice, we plan to define ILP algorithms starting from the ingredients identified
in this paper. Tractable cases, e.g. the instantiation of DL+log with DL-Lite
(subset of SHIQ), will be of major interest. Also we would like to investigate
the impact of having Datalog¬∨ both in the language of hypotheses and in the
language for the background theory. The inclusion of the nonmonotonic features
of SHIQ+log full will strengthen the ability of our ILP framework to deal
with incomplete knowledge by performing an inductive form of commonsense
reasoning. One such ability can turn out to be useful in the Semantic Web, and
complementary to reasoning with uncertainty and under inconsistency. Finally,
we would like to study the complexity of K-subsumption.


References
 1. F. Baader, D. Calvanese, D. McGuinness, D. Nardi, and P.F. Patel-Schneider, edi-
    tors. The Description Logic Handbook: Theory, Implementation and Applications.
    Cambridge University Press, 2003.
 2. A. Borgida. On the relative expressiveness of description logics and predicate
    logics. Artificial Intelligence, 82(1–2):353–367, 1996.
 3. W. Buntine. Generalized subsumption and its application to induction and redun-
    dancy. Artificial Intelligence, 36(2):149–176, 1988.
 4. S. Ceri, G. Gottlob, and L. Tanca. What you always wanted to know about datalog
    (and never dared to ask). IEEE Transactions on Knowledge and Data Engineering,
    1(1):146–166, 1989.
 5. L. De Raedt and L. Dehaspe. Clausal Discovery. Machine Learning, 26(2–3):99–
    146, 1997.
 6. F.M. Donini, M. Lenzerini, D. Nardi, and A. Schaerf. AL-log: Integrating Datalog
    and Description Logics. Journal of Intelligent Information Systems, 10(3):227–252,
    1998.
 7. T. Eiter, G. Gottlob, and H. Mannila. Disjunctive Datalog. ACM Transactions
    on Database Systems, 22(3):364–418, 1997.
 8. A.M. Frisch and A.G. Cohn. Thoughts and afterthoughts on the 1988 workshop
    on principles of hybrid reasoning. AI Magazine, 11(5):84–87, 1991.
 9. M. Gelfond and V. Lifschitz. Classical negation in logic programs and disjunctive
    databases. New Generation Computing, 9(3/4):365–386, 1991.
10. B. Glimm, I. Horrocks, C. Lutz, and U. Sattler. Conjunctive query answering for
    the description logic SHIQ. Journal of Artificial Intelligence Research, 31:151–
    198, 2008.
11. I. Horrocks, P.F. Patel-Schneider, and F. van Harmelen. From SHIQ and RDF
    to OWL: The making of a web ontology language. Journal of Web Semantics,
    1(1):7–26, 2003.
12. I. Horrocks, U. Sattler, and S. Tobies. Practical reasoning for very expressive
    description logics. Logic Journal of the IGPL, 8(3):239–263, 2000.
13. J.-U. Kietz. Learnability of description logic programs. In S. Matwin and C. Sam-
    mut, editors, Inductive Logic Programming, volume 2583 of Lecture Notes in Arti-
    ficial Intelligence, pages 117–132. Springer, 2003.
14. A.Y. Levy and M.-C. Rousset. Combining Horn rules and description logics in
    CARIN. Artificial Intelligence, 104:165–209, 1998.
15. F.A. Lisi. Building Rules on Top of Ontologies for the Semantic Web with Inductive
    Logic Programming. Theory and Practice of Logic Programming, 8(03):271–300,
    2008.
16. F.A. Lisi and D. Malerba. Inducing Multi-Level Association Rules from Multiple
    Relations. Machine Learning, 55:175–210, 2004.
17. J.W. Lloyd. Foundations of Logic Programming. Springer, 2nd edition, 1987.
18. T.M. Mitchell. Generalization as search. Artificial Intelligence, 18:203–226, 1982.
19. S.-H. Nienhuys-Cheng and R. de Wolf. Foundations of Inductive Logic Program-
    ming, volume 1228 of Lecture Notes in Artificial Intelligence. Springer, 1997.
20. G.D. Plotkin. A note on inductive generalization. Machine Intelligence, 5:153–163,
    1970.
21. R. Reiter. Equality and domain closure in first order databases. Journal of ACM,
    27:235–249, 1980.
22. R. Rosati. Semantic and computational advantages of the safe integration of on-
    tologies and rules. In F. Fages and S. Soliman, editors, Principles and Practice
    of Semantic Web Reasoning, volume 3703 of Lecture Notes in Computer Science,
    pages 50–64. Springer, 2005.
23. R. Rosati. DL+log: Tight integration of description logics and disjunctive datalog.
    In P. Doherty, J. Mylopoulos, and C.A. Welty, editors, Proc. of Tenth International
    Conference on Principles of Knowledge Representation and Reasoning, pages 68–
    78. AAAI Press, 2006.
24. C. Rouveirol and V. Ventos. Towards Learning in CARIN-ALN . In J. Cussens
    and A. Frisch, editors, Inductive Logic Programming, volume 1866 of Lecture Notes
    in Artificial Intelligence, pages 191–208. Springer, 2000.
25. C. Sakama. Nonmonotonic inductive logic programming. In T. Eiter, W. Faber,
    and M. Truszczynski, editors, Logic Programming and Nonmonotonic Reasoning,
    volume 2173 of Lecture Notes in Computer Science, pages 62–80. Springer, 2001.
26. M. Schmidt-Schauss and G. Smolka. Attributive concept descriptions with com-
    plements. Artificial Intelligence, 48(1):1–26, 1991.