=Paper= {{Paper |id=Vol-1433/tc_89 |storemode=property |title=An Abductive Framework for Datalog± Ontologies |pdfUrl=https://ceur-ws.org/Vol-1433/tc_89.pdf |volume=Vol-1433 |dblpUrl=https://dblp.org/rec/conf/iclp/GavanelliLRBZC15 }} ==An Abductive Framework for Datalog± Ontologies == https://ceur-ws.org/Vol-1433/tc_89.pdf
Technical Communications of ICLP 2015. Copyright with the Authors.




An Abductive Framework for Datalog± Ontologies
       MARCO GAVANELLI, EVELINA LAMMA, FABRIZIO RIGUZZI,
         ELENA BELLODI, RICCARDO ZESE and GIUSEPPE COTA
                                University of Ferrara, Italy

                       submitted 29 May 2015; accepted 14 July 2015



                                       Abstract
Ontologies are a fundamental component of the Semantic Web since they provide a formal
and machine manipulable model of a domain. Description Logics (DLs) are often the
languages of choice for modeling ontologies. Great effort has been spent in identifying
decidable or even tractable fragments of DLs. Conversely, for knowledge representation
and reasoning, integration with rules and rule-based reasoning is crucial in the so-called
Semantic Web stack vision. Datalog± is an extension of Datalog which can be used for
representing lightweight ontologies, and is able to express the DL-Lite family of ontology
languages, with tractable query answering under certain language restrictions.
   In this work, we show that Abductive Logic Programming (ALP) is also a suitable
framework for representing Datalog± ontologies, supporting query answering through an
abductive proof procedure, and smoothly achieving the integration of ontologies and rule-
based reasoning. In particular, we consider an Abductive Logic Programming framework
named SCIFF and derived from the IFF abductive framework, able to deal with exis-
tentially (and universally) quantified variables in rule heads, and Constraint Logic Pro-
gramming constraints. Forward and backward reasoning is naturally supported in the
ALP framework. We show that the SCIFF language smoothly supports the integration
of rules, expressed in a Logic Programming language, with Datalog± ontologies, mapped
into SCIFF (forward) integrity constraints.

KEYWORDS: Abductive Logic Programming, Datalog±, Description Logics, Semantic
Web.




                                   1 Introduction
The main idea of the Semantic Web is making information available in a form that
is understandable and automatically manageable by machines (Hitzler et al. 2009).
Ontologies are engineering artefacts consisting of a vocabulary describing some
domain, and an explicit specification of the intended meaning of the vocabulary (i.e.,
how concepts should be classified), possibly together with constraints capturing
additional knowledge about the domain. Ontologies provide a formal and machine
manipulable model of a domain, and this justifies their use in the Semantic Web.
   In order to realize this vision, the W3C has supported the development of a
family of knowledge representation formalisms of increasing complexity for defining
ontologies, called Web Ontology Language (OWL). In particular, OWL 1 defines
the sublanguages OWL-Lite, OWL-DL (based on Description Logics) and OWL-
Full. Therefore, ontologies are a fundamental component of the Semantic Web and
Description Logics (DLs) are often the languages of choice for modeling them.
   Extensive work has focused on developing tractable DLs, identifying the DL-Lite
family (Calvanese et al. 2007), for which answering conjunctive queries is in AC⁰
in data complexity.
   In a related research direction, Calì et al. (2009b) proposed Datalog±, an extension
of Datalog with existential rules for defining ontologies. Datalog± can be used
for representing lightweight ontologies, and encompasses the DL-Lite family (Calì
et al. 2009a). By suitably restricting the language syntax and adopting appropriate
syntactic conditions, Datalog± also achieves tractability (Calì et al. 2008).
   In this work, we consider the Datalog± language and show how ontologies ex-
pressed in this language can be also modeled in an Abductive Logic Programming
(ALP, for short) framework (Kakas et al. 1993), where query answering is supported
by the underlying ALP proof procedure. ALP has proved to be a powerful
tool for knowledge representation and reasoning, also thanks to its operational
support as a (static or dynamic) verification tool. ALP languages are usually
equipped with a declarative (model-theoretic) semantics, and an operational
semantics given in terms of a proof procedure. Several abductive proof procedures
have been defined (backward, forward, and a mix of the two) (Kakas and
Mancarella 1990; Bry 1990; Kakas 2000; Denecker and Schreye 1998; Alferes et al.
1999; Abdennadher and Christiansen 2000; Christiansen and Dahl 2005; Endriss
et al. 2004), with many different applications (diagnosis, monitoring, verification,
etc.). Among them, a notable one is the IFF abductive proof-procedure (Fung and
Kowalski 1997) which was proposed to deal with forward rules, and with non-ground
abducibles. This proof procedure has been later extended in (Alberti et al. 2008),
and the resulting proof procedure, named SCIFF, can deal with both existentially
and universally quantified variables in rule heads, and Constraint Logic Program-
ming (CLP) constraints (Jaffar and Maher 1994). The resulting system has been
used for modeling and implementing several knowledge representation frameworks,
such as deontic logic (Alberti et al. 2006), normative systems (Alberti et al. 2012),
interaction protocols for multi-agent systems (Alberti et al. 2004), and Web service
choreographies (Alberti et al. 2006), while also providing an effective reasoning
system.
   Here we concentrate on Datalog± ontologies, and show how an ALP language
enriched with quantified variables (existential, for our purposes) can be a useful
knowledge representation and reasoning framework for them. We do not focus here
on complexity results for the overall system, which is, however, not tractable.
   Forward and backward reasoning is naturally supported by the ALP proof pro-
cedure, and the considered SCIFF language smoothly supports the integration of
rules, expressed in a Logic Programming language, with ontologies expressed in
Datalog±. In fact, SCIFF allows us to map Datalog± ontologies into the
forward integrity constraints on which it is based. Other ALP languages could be
used (Kakas et al. 2001; Mancarella et al. 2009); we chose SCIFF because it is
freely available on the web and is supported by the latest versions of commercial
and open source Prolog systems (SICStus (Carlsson and Mildner 2012) and SWI
(Wielemaker et al. 2012)).
   In the following, Section 2 introduces Datalog±. Section 3 introduces Abductive
Logic Programming and the SCIFF language. Section 4 shows how the considered
Datalog± language can be mapped into SCIFF. Section 5 concludes the paper and
outlines future work.


                                      2 Datalog±
Datalog± extends Datalog by allowing existential quantifiers, the equality predicate
and the truth constant false in rule heads. Datalog± can be used for representing
lightweight ontologies and is able to express the DL-Lite family of ontology
languages (Calì et al. 2009a). By restricting the language syntax, Datalog± achieves
decidability (Calì et al. 2008).
   In order to describe Datalog± , let us assume (i) an infinite set of data constants
∆, (ii) an infinite set of labeled nulls ∆N (used as “fresh” Skolem terms), and (iii)
an infinite set of variables ∆V . Different constants represent different values (unique
name assumption), while different nulls may represent the same value. We assume
a lexicographic order on ∆ ∪ ∆N , with every symbol in ∆N following all symbols in
∆. We denote by X vectors of variables X1 , . . . , Xk with k ≥ 0. A relational schema
R is a finite set of relation names (or predicates). A term t is a constant, null or
variable. An atomic formula (or atom) has the form p(t1 , . . . , tn ), where p is an n-
ary predicate, and t1 , . . . , tn are terms. A database D for R is a possibly infinite set
of atoms with predicates from R and arguments from ∆ ∪ ∆N . A conjunctive query
(CQ) over R has the form q(X) = ∃YΦ(X, Y), where Φ(X, Y) is a conjunction
of atoms having as arguments variables X and Y and constants (but no nulls). A
Boolean CQ (BCQ) over R is a CQ having head predicate q of arity 0 (i.e., no
variables in X).
   We often write a BCQ as the set of all its atoms, having constants and variables
as arguments, and omitting the quantifiers. Answers to CQs and BCQs are defined
via homomorphisms, which are mappings µ : ∆ ∪ ∆N ∪ ∆V → ∆ ∪ ∆N ∪ ∆V such
that (i) c ∈ ∆ implies µ(c) = c, (ii) c ∈ ∆N implies µ(c) ∈ ∆ ∪ ∆N , and (iii)
µ is naturally extended to term vectors, atoms, sets of atoms, and conjunctions
of atoms. The set of all answers to a CQ q(X) = ∃YΦ(X, Y) over a database D,
denoted q(D), is the set of all tuples t over ∆ for which there exists a homomorphism
µ : X ∪ Y → ∆ ∪ ∆N such that µ(Φ(X, Y)) ⊆ D and µ(X) = t. The answer to
a BCQ q = ∃YΦ(Y) over a database D, denoted q(D), is Yes, denoted D |= q,
iff there exists a homomorphism µ : Y → ∆ ∪ ∆N such that µ(Φ(Y)) ⊆ D, i.e., iff
q(D) ≠ ∅.
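   The definition of q(D) via homomorphisms can be illustrated with a small Python sketch. It is only a naive backtracking search under assumptions of ours that are not part of the paper: atoms are encoded as (predicate, arguments) pairs, variables are capitalised strings, and labelled nulls are strings starting with an underscore.

```python
def is_var(term):
    # Assumption: variables start with an upper-case letter, as in the examples.
    return term[:1].isupper()

def is_null(term):
    # Assumption: labelled nulls are written "_z1", "_z2", ...
    return term.startswith("_")

def homomorphisms(body, db, mu=None):
    """Yield every variable mapping that sends all atoms of `body` into `db`."""
    mu = dict(mu or {})
    if not body:
        yield mu
        return
    (pred, args), rest = body[0], body[1:]
    for fpred, fargs in db:
        if fpred != pred or len(fargs) != len(args):
            continue
        ext = dict(mu)
        # Constants must map to themselves; variables must be bound consistently.
        if all((ext.setdefault(a, f) == f) if is_var(a) else a == f
               for a, f in zip(args, fargs)):
            yield from homomorphisms(rest, db, ext)

def answers(head_vars, body, db):
    """q(D): tuples over constants only, so bindings to nulls are discarded."""
    result = set()
    for mu in homomorphisms(body, db):
        t = tuple(mu[v] for v in head_vars)
        if not any(is_null(c) for c in t):
            result.add(t)
    return result

db = {("loc", ("prop1", "summertown")), ("loc", ("prop1", "_z1"))}
cq = answers(("L",), [("loc", ("prop1", "L"))], db)
bcq = bool(answers((), [("loc", ("prop1", "L"))], db))
```

A BCQ is treated here as a CQ with an empty tuple of head variables, so its answer is Yes exactly when the returned set is non-empty.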
   Datalog± syntax includes three types of implication rules. Given a relational
schema R, a tuple-generating dependency (or TGD) F is a first-order formula of the
form ∀X∀YΦ(X, Y) → ∃ZΨ(X, Z), where Φ(X, Y) and Ψ(X, Z) are conjunctions
of atoms over R, called the body and the head of F , respectively. Such F is satisfied
in a database D for R iff, whenever there exists a homomorphism h such that
h(Φ(X, Y)) ⊆ D, there exists an extension h′ of h such that h′(Ψ(X, Z)) ⊆ D. We
usually omit the universal quantifiers in TGDs.
    Query answering under TGDs is defined as follows. For a set of TGDs T on R,
and a database D for R, the set of models of D given T , denoted mods(D, T ), is
the set of all (possibly infinite) databases B such that D ⊆ B and every F ∈ T is
satisfied in B . The set of answers to a CQ q on D given T , denoted ans(q, D, T ), is
the set of all tuples t such that t ∈ q(B ) for all B ∈ mods(D, T ). The answer to a
BCQ q over D given T is Yes, denoted D ∪T |= q, iff B |= q for all B ∈ mods(D, T ).
    The second component of a Datalog± theory is represented by negative con-
straints (NC): first-order formulas of the form ∀XΦ(X) → false, where Φ(X) is a
conjunction of atoms. The universal quantifiers are usually left implicit.
    Equality-generating dependencies (EGDs) are the third component of a Datalog±
theory. An EGD F is a first-order formula of the form ∀XΦ(X) → Xi = Xj , where
Φ(X), called the body of F , is a conjunction of atoms, and Xi and Xj are variables
from X. We call Xi = Xj the head of F . Such F is satisfied in a database D for R
iff, whenever there exists a homomorphism h such that h(Φ(X)) ⊆ D, it holds that
h(Xi ) = h(Xj ). We usually omit the universal quantifiers in EGDs.


                                   2.1 The Chase
The chase is a bottom-up procedure for deriving atoms entailed by a database and
a Datalog± theory. The chase works on a database through the so-called TGD and
EGD chase rules.
   The TGD chase rule is defined as follows. Given a relational database D for a
schema R, and a TGD F on R of the form ∀X∀YΦ(X, Y) → ∃ZΨ(X, Z), F is
applicable to D if there is a homomorphism h that maps the atoms of Φ(X, Y)
to atoms of D. Let F be applicable and h1 be a homomorphism that extends h as
follows: for each Xi ∈ X, h1 (Xi ) = h(Xi ); for each Zj ∈ Z, h1 (Zj ) = zj , where zj
is a “fresh” null, i.e., zj ∈ ∆N , zj ∉ D, and zj lexicographically follows all other
labeled nulls already introduced. The result of the application of the TGD chase
rule for F is the addition to D of all the atomic formulas in h1 (Ψ(X, Z)) that are
not already in D.
   The EGD chase rule is defined as follows. An EGD F on R of the form Φ(X) →
Xi = Xj is applicable to a database D for R iff there exists a homomorphism
h : Φ(X) → D such that h(Xi ) and h(Xj ) are different and not both constants. If
h(Xi ) and h(Xj ) are different constants in ∆, then there is a hard violation of F .
Otherwise, the result of the application of F to D is the database h(D) obtained
from D by replacing every occurrence of a non-constant element e ∈ {h(Xi ), h(Xj )}
in D by the other element e′ (if e and e′ are both nulls, then e precedes e′ in the
lexicographic order).
   The chase algorithm consists of an exhaustive application of the TGD and EGD
chase rules that may lead to an infinite result. The chase rules are applied iteratively:
in each iteration (1) a single TGD is applied once and then (2) the EGDs are applied
until a fixpoint is reached. EGDs are assumed to be separable (Calì et al. 2010).
Intuitively, separability holds whenever: (i) if there is a hard violation of an EGD
in the chase, then there is also one on the database w.r.t. the set of EGDs alone
(i.e., without considering the TGDs); and (ii) if there is no hard violation, then
the answers to a BCQ w.r.t. the entire set of dependencies equals those w.r.t. the
TGDs alone (i.e., without the EGDs).
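   The chase loop described above can be sketched in Python as follows. This is a simplified illustration, not the algorithm of the cited works: the atom encoding, the "_z" naming of nulls, the rule that each (TGD, homomorphism) pair fires at most once, and the max_steps bound (needed because the chase may be infinite) are all assumptions of ours. The data used at the end is the ontology and database of Example 1.

```python
from itertools import count

class HardViolation(Exception):
    """An EGD tried to equate two distinct constants."""

def is_var(t):
    return t[:1].isupper()      # assumption: variables are capitalised

def is_null(t):
    return t.startswith("_")    # assumption: labelled nulls look like "_z1"

def matches(body, db, mu=None):
    """Yield the homomorphisms mapping every atom of `body` into `db`."""
    mu = dict(mu or {})
    if not body:
        yield mu
        return
    (pred, args), rest = body[0], body[1:]
    for fpred, fargs in sorted(db):
        if fpred != pred or len(fargs) != len(args):
            continue
        ext = dict(mu)
        if all((ext.setdefault(a, f) == f) if is_var(a) else a == f
               for a, f in zip(args, fargs)):
            yield from matches(rest, db, ext)

def apply_egds(db, egds):
    """Apply EGDs, given as (body, (Xi, Xj)), until none is applicable."""
    while True:
        hit = next(((mu[xi], mu[xj]) for body, (xi, xj) in egds
                    for mu in matches(body, db) if mu[xi] != mu[xj]), None)
        if hit is None:
            return db
        a, b = hit
        if not is_null(a) and not is_null(b):
            raise HardViolation(f"{a} = {b}")
        if is_null(a) and is_null(b):
            old, new = min(a, b), max(a, b)   # e precedes e', as in the text
        else:
            old, new = (a, b) if is_null(a) else (b, a)
        db = {(p, tuple(new if t == old else t for t in args)) for p, args in db}

def chase(db, tgds, egds, max_steps=1000):
    """TGDs are (body, head) pairs of atom lists; returns the chased database."""
    db, fresh, fired = set(db), count(1), set()
    for _ in range(max_steps):
        trigger = next(((i, mu) for i, (body, _h) in enumerate(tgds)
                        for mu in matches(body, db)
                        if (i, frozenset(mu.items())) not in fired), None)
        if trigger is None:
            return db                 # no applicable TGD is left
        i, mu = trigger
        fired.add((i, frozenset(mu.items())))
        h1 = dict(mu)
        for _p, args in tgds[i][1]:
            for a in args:
                if is_var(a) and a not in h1:
                    h1[a] = f"_z{next(fresh)}"   # fresh labelled null
        db |= {(p, tuple(h1.get(a, a) for a in args)) for p, args in tgds[i][1]}
        db = apply_egds(db, egds)     # EGDs are applied after each TGD step
    return db                         # step bound hit: a finite prefix only

tgds = [
    ([("ann", ("X", "label")), ("ann", ("X", "price")), ("visible", ("X",))],
     [("priceElem", ("X",))]),
    ([("ann", ("X", "label")), ("ann", ("X", "priceRange")), ("visible", ("X",))],
     [("priceElem", ("X",))]),
    ([("priceElem", ("E",)), ("group", ("E", "X"))], [("forSale", ("X",))]),
    ([("forSale", ("X",))], [("price", ("X", "P"))]),
    ([("hasCode", ("X", "C")), ("codeLoc", ("C", "L"))], [("loc", ("X", "L"))]),
    ([("hasCode", ("X", "C"))], [("codeLoc", ("C", "L")), ("loc", ("X", "L"))]),
    ([("loc", ("X", "L"))], [("advertised", ("X",))]),
]
egds = [([("loc", ("X", "L1")), ("loc", ("X", "L2"))], ("L1", "L2"))]
database = {
    ("codeLoc", ("ox1", "central")), ("codeLoc", ("ox1", "south")),
    ("codeLoc", ("ox2", "summertown")), ("hasCode", ("prop1", "ox2")),
    ("ann", ("e1", "price")), ("ann", ("e1", "label")),
    ("visible", ("e1",)), ("group", ("e1", "prop1")),
}
result = chase(database, tgds, egds)
```

Firing each trigger only once keeps the sketch terminating on non-recursive rule sets such as this one; for rule sets whose chase is infinite, only the prefix computed within max_steps is returned.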
   The two problems of CQ and BCQ evaluation under TGDs and EGDs are
logspace-equivalent (Calì et al. 2009b). Moreover, query answering under TGDs
is equivalent to query answering under TGDs with only single atoms in their
heads (Calì et al. 2008). Henceforth, we focus only on the BCQ evaluation problem
and we assume that every TGD has a single atom in its head (without loss
of generality, since by introducing new predicate symbols we can always transform
TGDs with multiple atoms in the head into sets of rules with only one). A BCQ
q on a database D, a set TT of TGDs and a set TE of EGDs can be answered by
performing the chase and checking whether the query is entailed by the extended
database that is obtained. In this case we write D ∪ TT ∪ TE |= q.
Example 1 (Adapted from (Gottlob et al. 2011))
Consider the following ontology for a real estate information extraction system:
   F1 = ann(X, label), ann(X, price), visible(X) → priceElem(X)
If X is annotated as a label, as a price and is visible, then it is a price element.
   F2 = ann(X, label), ann(X, priceRange), visible(X) → priceElem(X)
If X is annotated as a label, as a price range, and is visible, then it is a price
element.
   F3 = priceElem(E), group(E, X) → forSale(X)
If E is a price element and is grouped with X, then X is for sale.
   F4 = forSale(X) → ∃P price(X, P)
If X is for sale, then there exists a price for X.
   F5 = hasCode(X, C), codeLoc(C, L) → loc(X, L)
If X has postal code C, and C’s location is L, then X’s location is L.
   F6 = hasCode(X, C) → ∃L codeLoc(C, L), loc(X, L)
If X has postal code C, then there exists L s.t. C has location L and so does X.
   F7 = loc(X, L1), loc(X, L2) → L1 = L2
If X has the locations L1 and L2, then L1 and L2 are the same.
   F8 = loc(X, L) → advertised(X)
If X has a location L then X is advertised.
   Suppose we are given the database
 codeLoc(ox1, central), codeLoc(ox1, south), codeLoc(ox2, summertown)
 hasCode(prop1, ox2), ann(e1, price), ann(e1, label), visible(e1), group(e1, prop1)
The atomic BCQs priceElem(e1), forSale(prop1) and advertised(prop1) evaluate to
true, while the CQ loc(prop1, L) has answers q(L) = {summertown}. In fact, even
if loc(prop1, z1) with z1 ∈ ∆N is entailed by formula F6, formula F7 imposes that
summertown = z1. If F7 were absent then q(L) = {summertown, z1}.
   Answering BCQs q over databases and ontologies containing NCs can be performed
by first checking whether the BCQ Φ(X) evaluates to false for each NC of the form
∀XΦ(X) → false. If one of these checks fails, then the theory is inconsistent and
the answer to the original BCQ q is positive; otherwise the negative constraints
can simply be ignored when answering the original BCQ q.
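   The two-step treatment of NCs can be sketched as below. The sketch assumes that the database passed in has already been chased, and it uses hypothetical predicates (sold, rented) that do not occur in Example 1; the atom encoding is again an assumption of ours.

```python
def is_var(t):
    return t[:1].isupper()      # assumption: variables are capitalised

def has_match(body, db, mu=None):
    """True iff some homomorphism maps every atom of `body` into `db`."""
    mu = dict(mu or {})
    if not body:
        return True
    (pred, args), rest = body[0], body[1:]
    for fpred, fargs in db:
        if fpred != pred or len(fargs) != len(args):
            continue
        ext = dict(mu)
        if all((ext.setdefault(a, f) == f) if is_var(a) else a == f
               for a, f in zip(args, fargs)):
            if has_match(rest, db, ext):
                return True
    return False

def answer_bcq(query, chased_db, ncs):
    """Answer a BCQ given the bodies of the negative constraints."""
    # If the body of some NC is matched, every BCQ is answered positively.
    if any(has_match(nc_body, chased_db) for nc_body in ncs):
        return True
    # Otherwise the NCs can be ignored.
    return has_match(query, chased_db)

chased = {("forSale", ("prop1",)), ("sold", ("prop1",))}
nc = [("forSale", ("X",)), ("sold", ("X",))]    # forSale(X), sold(X) → false
query = [("rented", ("prop1",))]
with_nc = answer_bcq(query, chased, [nc])
without_nc = answer_bcq(query, chased, [])
```

With the NC present the query is answered positively only because the constraint is violated; without it the same query has no match in the database.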


                       3 ALP and the SCIFF language
Abductive Logic Programming (ALP, for short) is a family of programming lan-
guages that integrate abductive reasoning into logic programming. An ALP pro-
gram consists of a set of clauses, that can contain in the body some distinguished
predicates, belonging to a set A and called abducibles. The aim is finding a set of
abducibles EXP, built from symbols in A that, together with the knowledge base,
is an explanation for a given known effect (called goal G) and satisfies a set of logic
formulae, called Integrity Constraints (IC ):
                                  KB ∪ EXP |= G
                                  KB ∪ EXP |= IC
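   To make the two conditions concrete, the following Python sketch enumerates candidate sets EXP by brute force. It is an illustration only, under strong simplifying assumptions of ours: atoms are propositional, the knowledge base is a definite program, integrity constraints are restricted to denials (bodies that must not all hold), and the atom names are hypothetical. It is not the SCIFF proof procedure.

```python
from itertools import combinations

def closure(facts, clauses):
    """Least model of propositional definite clauses given as (head, body)."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for head, body in clauses:
            if head not in known and all(b in known for b in body):
                known.add(head)
                changed = True
    return known

def explanations(kb, abducibles, goal, denials):
    """Yield sets EXP, smallest first, entailing the goal and violating no denial."""
    for k in range(len(abducibles) + 1):
        for exp in combinations(sorted(abducibles), k):
            model = closure(exp, kb)
            if goal <= model and not any(set(d) <= model for d in denials):
                yield set(exp)

kb = [("wet", ["rain"]), ("wet", ["sprinkler"]), ("sunny", [])]
denials = [["rain", "sunny"]]       # rain ∧ sunny → false
found = list(explanations(kb, {"rain", "sprinkler"}, {"wet"}, denials))
```

Here the hypothesis rain explains the goal but violates the denial, so the only explanation returned is the one assuming sprinkler.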
   SCIFF (Alberti et al. 2008) is a language in the ALP class, originally designed
to model and verify interactions in open societies of agents; it is an extension
of the IFF proof procedure (Fung and Kowalski 1997). Like IFF, it relies on the
three-valued completion semantics (Kunen 1987) and considers integrity constraints
of the form body → head, where the body is a conjunction of literals and the head
is a disjunction of conjunctions of literals. While in the IFF the literals can be
built only on defined or abducible predicates, in SCIFF they can also be CLP
constraints (Jaffar and Maher 1994), occurring events (only in the body), or positive
and negative expectations, as will be explained soon.
Definition 1
A SCIFF Program is a pair ⟨KB, IC⟩ where KB is a set of clauses (an extended
logic program) and IC is a set of forward rules called Integrity Constraints (ICs,
for short in the following).
   SCIFF considers a (possibly dynamically growing) set of facts (called history)
HAP, that contains ground atoms H(Event[, Time]). This set can grow dynami-
cally, during the computation, thus implementing a dynamic acquisition of events.
Some distinguished abducibles are called expectations. A positive expectation, writ-
ten E(Event[, Time]) means that a corresponding event H(Event[, Time]) is ex-
pected to happen, while EN(Event[, Time]) is a negative expectation, and requires
events H(Event[, Time]) not to happen. To simplify the notation, we will omit the
Time argument from events and expectations.
   Variables occurring only in positive expectations are existentially quantified (ex-
pressing the idea that a single event is enough to support them), while those in
negative expectations are universally quantified, so that any event matching with a
negative expectation leads to inconsistency with the current hypothesis. CLP (Jaf-
far and Maher 1994) constraints can be imposed on variables. The computed answer
includes in general three elements: a substitution for the variables in the goal (as
usual in Prolog), the constraint store (as in CLP), and the set EXP of abduced
literals.
  The declarative semantics of SCIFF includes the classic conditions of ALP:

      KB ∪ HAP ∪ EXP         |= G                                                  (1)
      KB ∪ HAP ∪ EXP         |= IC                                                 (2)

plus specific conditions to support the confirmation of expectations.
  Positive/negative expectations are confirmed (not violated) if

      KB ∪ HAP ∪ EXP         |= E(X ) → H(X )                                      (3)
      KB ∪ HAP ∪ EXP         |= EN(X ) ∧ H(X ) → false                             (4)

  The declarative semantics of SCIFF also requires that the same event cannot be
expected both to happen and not to happen

      KB ∪ HAP ∪ EXP |= E(X ) ∧ EN(X ) → false                                     (5)
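   For ground events, conditions (3)-(5) reduce to simple set tests, as the following sketch shows. It assumes that events are encoded as strings and that all expectations are ground, so the universal quantification of variables in negative expectations is not modelled.

```python
def expectations_ok(hap, expected, forbidden):
    """Check conditions (3)-(5) for ground events only."""
    hap, expected, forbidden = set(hap), set(expected), set(forbidden)
    cond3 = expected <= hap               # every E(x) has a matching H(x)
    cond4 = not (forbidden & hap)         # no EN(x) has a matching H(x)
    cond5 = not (expected & forbidden)    # nothing is both E and EN
    return cond3 and cond4 and cond5

hap = {"forSale(prop1)", "advertised(prop1)"}
fulfilled = expectations_ok(hap, {"forSale(prop1)"}, {"sold(prop1)"})
violated = expectations_ok(hap, {"price(prop1, 100)"}, set())
```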


Definition 2 (SCIFF answer )
Given a SCIFF program ⟨KB, IC⟩ and a history HAP, a goal G is a SCIFF answer
if there is a set EXP such that equations (1)-(5) are satisfied. In this case, we write

                                 ⟨KB, IC⟩ |=_HAP G

  The SCIFF proof-procedure is a rewriting system that defines a proof tree, whose
nodes represent states of the computation. A set of transitions rewrite a node into
one or more children nodes.
  The main transitions, inherited from the IFF are:

Unfolding: replaces a (non-abducible) atom with its definitions;
Propagation: if an abduced atom a(X) occurs in the condition of an IC (e.g.,
  a(Y) → p), the atom is removed from the condition (generating X = Y → p);
Case Analysis: given an implication containing an equality in the condition (e.g.,
  X = Y → p), generates two children in logical or (in the example, either X = Y
  and p, or X ≠ Y);
Equality rewriting: rewrites equalities as in Clark’s equality theory;
Logical simplifications: other simplifications like (true → A) ⇔ A, etc.

SCIFF includes also the transitions of CLP (Jaffar and Maher 1994) for constraint
solving.
  A complete description is in (Alberti et al. 2008), with proofs of soundness,
completeness, and termination. SCIFF was implemented in CHR (Frühwirth 1998);
an efficient implementation is described in (Alberti et al. 2013).
  In this paper we consider the generative version of SCIFF, called g-SCIFF (Alberti
et al. 2006), in which the H events are also considered as abducibles, and can
be assumed like the other abducible predicates, besides being provided as input in
the history HAP; they are then collected in a set HAP′ ⊇ HAP.
Definition 3 (g-SCIFF answer )
Given a SCIFF program ⟨KB, IC⟩ and a history HAP, we say that a goal G is
a g-SCIFF answer if there exist a set EXP and a set HAP′ ⊇ HAP such that
equations (1)-(5) are satisfied¹. In this case, we write

                                     ⟨KB, IC⟩ |=^g_HAP G


                    4 Mapping Datalog± into ALP programs
In this section, we show that a Datalog± program can be represented as a set of
SCIFF integrity constraints and a history. SCIFF abductive declarative semantics
provides the model-theoretic counterpart to Datalog± semantics. Operationally,
query answering is achieved bottom-up via the chase in Datalog± , while in the
ALP framework it is supported by the SCIFF proof procedure. SCIFF is able to
integrate a knowledge base KB , expressed in terms of Logic Programming clauses,
possibly with abducibles in their body, and to deal with integrity constraints.
   For our purposes, we consider only SCIFF programs with an empty KB and ICs with
only conjunctions of positive expectations and CLP constraints in their heads. We
show that this subset of the language suffices to represent Datalog± ontologies².
   We map the finite set of relation names of a Datalog± relational schema R into
the set of predicates of the corresponding SCIFF program.

Definition 4
The τ mapping is recursively defined as follows, where A is an atom, M can be
either H or E, and F1, F2, . . . are Datalog± formulae:
                     τ(Body → Head)    =    τH(Body) → τE(Head)
                     τH(A)             =    H(A)
                     τE(A)             =    E(A)
                     τM(F1 ∧ F2)       =    τM(F1) ∧ τM(F2)
                     τM(false)         =    false
                     τM(Yi = Yj)       =    Yi = Yj
                     τE(∃X A)          =    E(A)
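
   The recursive definition of τ can be transcribed almost literally into code. The sketch below uses a hypothetical tuple encoding of formulae (the tags "atom", "and", "false", "eq" and "exists" are ours) and produces the resulting integrity constraint as a string. As a small generalisation of the last case, an existential quantifier is dropped in front of any head formula, not only in front of a single atom.

```python
def tau_m(f, m):
    """Map formula `f` with functor `m`, where m is "H" (body) or "E" (head)."""
    kind = f[0]
    if kind == "atom":
        _, pred, args = f
        return f"{m}({pred}({', '.join(args)}))"
    if kind == "and":
        return f"{tau_m(f[1], m)} ∧ {tau_m(f[2], m)}"
    if kind == "false":
        return "false"
    if kind == "eq":
        return f"{f[1]} = {f[2]}"
    if kind == "exists":
        if m != "E":
            raise ValueError("existential quantifiers are only mapped in heads")
        return tau_m(f[2], "E")     # the quantifier is left implicit
    raise ValueError(f"unknown formula kind: {kind}")

def tau(body, head):
    """Map a rule Body → Head into the text of a SCIFF integrity constraint."""
    return f"{tau_m(body, 'H')} → {tau_m(head, 'E')}"

# F4 and F7 of Example 1 in the hypothetical encoding
f4 = tau(("atom", "forSale", ("X",)),
         ("exists", ("P",), ("atom", "price", ("X", "P"))))
f7 = tau(("and", ("atom", "loc", ("X", "L1")), ("atom", "loc", ("X", "L2"))),
         ("eq", "L1", "L2"))
```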

  A Datalog± database D for R corresponds to the (possibly infinite) SCIFF his-
tory HAP, since there is a one-to-one correspondence between each tuple in D and
each (ground) fact in HAP. This mapping is denoted as HAP = τH (D).
  A Datalog± TGD F of the kind body → head is mapped into the SCIFF integrity
constraint IC = τ (F ), where the body is mapped into conjunctions of SCIFF atoms,
and head into conjunctions of SCIFF abducible atoms. Existential quantifications
of variables occurring in the head of the TGD are maintained in the head of the
SCIFF IC , but they are left implicit in the SCIFF syntax, while the rest of the
variables are universally quantified with scope the entire IC .


1 In equations (1)-(5) the set HAP should be substituted with HAP′.
2 As should be clear soon, the only CLP constraint used in the mapping is the equality constraint.
   Given a set of TGDs T , let us denote the mapping of T into the corresponding
set IC of SCIFF integrity constraints, as IC = τ (T ).
   Recall that for a set of TGDs T on R, and a database D for R, the set of models
of D given T , denoted mods(D, T ), is the set of all (possibly infinite) databases B
such that D ⊆ B and every F ∈ T is satisfied in B . For any such database B , we can
prove that there exists an abductive explanation EXP = τE (B ), HAP′ = τH (B )
such that:
                                HAP′ ∪ EXP |= IC
where HAP′ ⊇ HAP = τH (D), and IC = τ (T ).
   Finally, Datalog± negative constraints NC are mapped into SCIFF ICs with head
false, and equality-generating dependencies EGDs into SCIFF ICs, each one with
an equality CLP constraint in its head.
   Therefore, informally speaking, the set of models of D given T , mods(D, T ),
corresponds to the set of all the abductive explanations EXP satisfying the set of
SCIFF integrity constraints IC = τ (T ).
   A Datalog± CQ q(X) = ∃YΦ(X, Y) over R is mapped into a SCIFF goal G =
τE (Φ(X, Y)), where τE (Φ(X, Y)) is a conjunction of SCIFF atoms. Notice that in
the SCIFF framework we have therefore a goal with existential variables only, and
among them, we are interested in computed answer substitutions for the original
(tuple of) variables X (and therefore Y variables can be made anonymous).
   A Datalog± BCQ q = Φ(Y) is mapped similarly: G = τE (Φ(Y)).
   Recall that in Datalog± the set of answers to a CQ q on D given T , denoted
ans(q, D, T ), is the set of all tuples t such that t ∈ q(B ) for all B ∈ mods(D, T ).
With abuse of notation, we will write q(t) to mean answer t for q on D given T .
   We can hence state the following theorems for (model-theoretic) completeness of
query answering.
Theorem 1 (Completeness of query answering)
For each answer q(t) of a CQ q(X) = ∃YΦ(X, Y) on D given T , in the corresponding
SCIFF program ⟨∅, A, τ (T )⟩ there exists an answer substitution θ and an
abductive explanation EXP ∪ HAP′ for goal G = τE (Φ(X, _)) such that:
                                 ⟨∅, IC⟩ |=^g_HAP Gθ
where HAP = τH (D), IC = τ (T ), and Gθ = τE (Φ(t, _)).

Corollary 1 (Completeness of boolean query answering)
If the answer to a BCQ q = ∃YΦ(Y) over D given T is Yes, denoted D ∪ T |= q,
then in the corresponding SCIFF program there exists an abductive explanation
EXP ∪ HAP′ such that:
                                 ⟨∅, IC⟩ |=^g_HAP Gθ
where HAP = τH (D), IC = τ (T ), and G = τE (Φ(_)).
  The SCIFF proof procedure was proved sound and complete w.r.t. the SCIFF declarative
semantics in (Alberti et al. 2008); thus, for each abductive explanation EXP
for a given goal G in a SCIFF program, there exists a SCIFF-based computation
producing a set of abducibles (positive expectations, for our purposes) δ ⊆ EXP,
and a computed answer substitution for goal G possibly more general than θ.

Example 2 (Real estate information extraction system in ALP )
The rules F1-F8 from the Datalog± ontology of Example 1 are mapped one-to-one
into the following SCIFF ICs:³
IC1: H(ann(X, label)), H(ann(X, price)), H(visible(X)) → E(priceElem(X))
IC2: H(ann(X, label)), H(ann(X, priceRange)), H(visible(X)) → E(priceElem(X))
IC3: H(priceElem(E)), H(group(E, X)) → E(forSale(X))
IC4: H(forSale(X)) → (∃P) E(price(X, P))
IC5: H(hasCode(X, C)), H(codeLoc(C, L)) → E(loc(X, L))
IC6: H(hasCode(X, C)) → (∃L) E(codeLoc(C, L)), E(loc(X, L))
IC7: H(loc(X, L1)), H(loc(X, L2)) → L1 = L2
IC8: H(loc(X, L)) → E(advertised(X))
   The database is then simply mapped into the following history HAP:
     {H(codeLoc(ox1, central)), H(codeLoc(ox1, south)),
      H(codeLoc(ox2, summertown)), H(hasCode(prop1, ox2)), H(ann(e1, price)),
      H(ann(e1, label)), H(visible(e1)), H(group(e1, prop1))}
   The SCIFF proof procedure applies ICs in a forward manner, and it infers the
following set of abducibles from the program above:

       EXP =       {E(priceElem(e1)), E(forSale(prop1)), ∃P E(price(prop1, P)),
                     E(loc(prop1, summertown)), E(advertised(prop1))}

plus the corresponding H atoms, that are not reported for the sake of brevity.
  Each of the (ground) atomic queries of Example 1 is entailed in the SCIFF
program above, since there exist sets EXP and HAP′ such that:

     HAP′ ∪ EXP |= E(priceElem(e1)), E(forSale(prop1)), E(advertised(prop1))

The query ∃L E(loc(prop1, L)) is entailed as well (with unification L = summertown)
since:

                    HAP′ ∪ EXP |= E(loc(prop1, summertown))
   In this case too, if we remove IC7 we obtain the previous answer and a further
one, similar to the one obtained by Datalog±, where the set HAP′ contains two
loc events:
                 H(loc(prop1, summertown)),          ∃L H(loc(prop1, L))
and the answer includes a further CLP constraint L ≠ summertown (whereas
L and summertown were unified in the previous answer).


3 We show for clarity the quantification of existentially quantified variables, although in the
 SCIFF syntax the quantification is implicit.
                       5 Conclusions and Future Work
In this paper, we addressed representation and reasoning for Datalog± ontologies in
an Abductive Logic Programming framework, with existential (and universal) vari-
ables, and Constraint Logic Programming constraints in rule heads. The underlying
proof procedure, named SCIFF and inspired by the IFF proof procedure, was im-
plemented in Constraint Handling Rules (Frühwirth 1998). The SCIFF system has
already been used for modeling and implementing several knowledge representation
frameworks, also providing an effective reasoning system.
   Here we have shown how the SCIFF language can be a useful knowledge representation
and reasoning framework for Datalog± ontologies. In fact, the underlying
abductive proof procedure can be directly exploited as an ontological reasoner for
query answering and consistency checking, also supporting incremental growth of the
extensional part of the knowledge base (namely, the ABox), represented in SCIFF
as a (possibly incremental) set of events. To the best of our knowledge, this is the
first application of ALP to model and reason upon ontologies.
   Many issues have not been addressed in this paper, and they will be the subject of
future work. First of all, we have not focused here on complexity results. Future work
will be devoted to identifying syntactic conditions guaranteeing tractable ontologies in
SCIFF, in the style of what has been done for Datalog±.
   A second issue for future work concerns experimentation and comparison with
other approaches, even not Logic Programming based, on real-size ontologies.
   Finally, the SCIFF language is richer than the subset used here to represent Datalog±
ontologies. SCIFF integrity constraints can have disjunctions in the
head, as well as negative expectations, possibly with universally quantified variables.
A negative expectation states that something ought not to happen, and
the proof procedure can identify violations of it. Moreover, the SCIFF language
smoothly supports the integration of rules, expressed in a Logic Programming
language, with ontologies expressed in Datalog±, since a logic program
can be added to the set of ICs, making it possible to use deductive rules
alongside the forward ICs themselves. SCIFF also allows for CLP constraints
other than equality, and these can be used in the ICs as well. This rich
language could also be used to add further expressivity to query languages.
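
   As a purely illustrative sketch of the richer constructs mentioned above (not
an example taken from this paper), two integrity constraints could be written
roughly as follows, using the H (happened event), E (positive expectation) and
EN (negative expectation) notation of the SCIFF literature (Alberti et al. 2008).
The predicate names person, employed, unemployed, retired and worksFor are
hypothetical, and time arguments are omitted as a simplifying assumption. The
first constraint has a disjunctive head. In the second, the variable Y occurs
only in the negative expectation and is therefore read as universally quantified:
a retired X is expected not to work for any Y, so an event worksFor(x, y) for a
retired x would be reported as a violation.

```latex
\begin{align*}
\mathbf{H}(\mathit{person}(X)) &\rightarrow
    \mathbf{E}(\mathit{employed}(X)) \;\lor\; \mathbf{E}(\mathit{unemployed}(X)) \\
\mathbf{H}(\mathit{retired}(X)) &\rightarrow
    \mathbf{EN}(\mathit{worksFor}(X, Y))
\end{align*}
```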


                                    References
Abdennadher, S. and Christiansen, H. 2000. An experimental CLP platform for
 integrity constraints and abduction. In FQAS, Flexible Query Answering Systems,
 H. Larsen, J. Kacprzyk, S. Zadrozny, T. Andreasen, and H. Christiansen, Eds. LNCS.
 Springer-Verlag, Warsaw, Poland, 141–152.
Alberti, M., Chesani, F., Gavanelli, M., Lamma, E., Mello, P., and Montali, M.
 2006. An abductive framework for a-priori verification of web services. In Proceed-
 ings of the Eighth Symposium on Principles and Practice of Declarative Programming,
 M. Maher, Ed. ACM Press, New York, USA, 39–50.
Alberti, M., Chesani, F., Gavanelli, M., Lamma, E., Mello, P., and Torroni, P.
 2006. Security protocols verification in abductive logic programming: a case study. In
  ESAW 2005 Post-proceedings, O. Dikenelli, M.-P. Gleizes, and A. Ricci, Eds. Number
  3963 in LNAI. Springer-Verlag, Kusadasi, Aydin, Turkey, 106–124.
Alberti, M., Chesani, F., Gavanelli, M., Lamma, E., Mello, P., and Torroni, P.
 2008. Verifiable agent interaction in abductive logic programming: the SCIFF frame-
 work. ACM Transactions on Computational Logic 9, 4, 29:1–29:43.
Alberti, M., Gavanelli, M., and Lamma, E. 2012. Deon+: Abduction and constraints
 for normative reasoning. In Logic Programs, Norms and Action - Essays in Honor of
 Marek J. Sergot on the Occasion of His 60th Birthday, A. Artikis, R. Craven, N. K.
 Cicekli, B. Sadighi, and K. Stathis, Eds. Lecture Notes in Computer Science, vol. 7360.
 Springer, 308–328.
Alberti, M., Gavanelli, M., and Lamma, E. 2013. The CHR-based implementation of
 the SCIFF abductive system. Fundamenta Informaticae 124, 4, 365–381.
Alberti, M., Gavanelli, M., Lamma, E., Mello, P., Sartor, G., and Torroni,
 P. 2006. Mapping deontic operators to abductive expectations. Computational and
 Mathematical Organization Theory 12, 2–3 (Oct.), 205–225.
Alberti, M., Gavanelli, M., Lamma, E., Mello, P., and Torroni, P. 2004. Specifi-
 cation and verification of agent interactions using social integrity constraints. Electronic
 Notes in Theoretical Computer Science 85, 2 (Apr.), 94–116.
Alferes, J. J., Pereira, L. M., and Swift, T. 1999. Well-founded abduction via tabled
 dual programs. In Logic Programming: The 1999 International Conference, Las Cruces,
 New Mexico, USA, D. D. Schreye, Ed. MIT Press, Cambridge, MA, 426–440.
Bry, F. 1990. Intensional updates: Abduction via deduction. In Logic Programming,
 Proceedings of the Seventh International Conference, Jerusalem, Israel, D. Warren and
 P. Szeredi, Eds. MIT Press, Cambridge, MA, 561–578.
Calì, A., Gottlob, G., and Kifer, M. 2008. Taming the infinite chase: Query answering
 under expressive relational constraints. In International Conference on Principles of
 Knowledge Representation and Reasoning. AAAI Press, 70–80.
Calì, A., Gottlob, G., and Lukasiewicz, T. 2009a. A general datalog-based framework
 for tractable query answering over ontologies. In Symposium on Principles of Database
 Systems. ACM, 77–86.
Calì, A., Gottlob, G., and Lukasiewicz, T. 2009b. Tractable query answering over
 ontologies with Datalog±. In International Workshop on Description Logics. CEUR
 Workshop Proceedings, vol. 477. CEUR-WS.org.
Calì, A., Gottlob, G., Lukasiewicz, T., Marnette, B., and Pieris, A. 2010.
 Datalog±: A family of logical knowledge representation and query languages for new
 applications. In IEEE Symposium on Logic in Computer Science. 228–242.
Calvanese, D., Giacomo, G. D., Lembo, D., Lenzerini, M., and Rosati, R. 2007.
 Tractable reasoning and efficient query answering in description logics: The DL-Lite family.
 J. Autom. Reasoning 39, 3, 385–429.
Carlsson, M. and Mildner, P. 2012. SICStus Prolog – the first 25 years. Theory and
 Practice of Logic Programming 12, 35–66.
Christiansen, H. and Dahl, V. 2005. HYPROLOG: A new logic programming language
 with assumptions and abduction. In Proc. ICLP 2005, M. Gabbrielli and G. Gupta,
 Eds. LNCS, vol. 3668. Springer, 159–173.
Denecker, M. and Schreye, D. D. 1998. SLDNFA: an abductive procedure for abduc-
 tive logic programs. Journal of Logic Programming 34, 2, 111–167.
Endriss, U., Mancarella, P., Sadri, F., Terreni, G., and Toni, F. 2004. The CIFF
 proof procedure for abductive logic programming with constraints. In Proc. JELIA
 2004, J. J. Alferes and J. A. Leite, Eds. LNAI, vol. 3229. Springer-Verlag, 31–43.
Frühwirth, T. 1998. Theory and practice of constraint handling rules. Journal of Logic
  Programming 37, 1-3 (Oct.), 95–138.
Fung, T. H. and Kowalski, R. A. 1997. The IFF proof procedure for abductive logic
  programming. Journal of Logic Programming 33, 2 (Nov.), 151–165.
Gottlob, G., Lukasiewicz, T., and Simari, G. I. 2011. Conjunctive query answering
  in probabilistic Datalog+/- ontologies. In International Conference on Web Reasoning
  and Rule Systems. LNCS, vol. 6902. Springer, 77–92.
Hitzler, P., Krötzsch, M., and Rudolph, S. 2009. Foundations of Semantic Web
  Technologies. CRC Press.
Jaffar, J. and Maher, M. 1994. Constraint logic programming: a survey. Journal of
  Logic Programming 19-20, 503–582.
Kakas, A. C. 2000. ACLP: integrating abduction and constraint solving. In Proceedings
  of the 8th International Workshop on Non-Monotonic Reasoning, NMR’00, Brecken-
  ridge, CO.
Kakas, A. C., Kowalski, R. A., and Toni, F. 1993. Abductive Logic Programming.
  Journal of Logic and Computation 2, 6, 719–770.
Kakas, A. C. and Mancarella, P. 1990. On the relation between Truth Maintenance
  and Abduction. In Proceedings of the 1st Pacific Rim International Conference on
  Artificial Intelligence, PRICAI-90, Nagoya, Japan, T. Fukumura, Ed. Ohmsha Ltd.
Kakas, A. C., van Nuffelen, B., and Denecker, M. 2001. A-System: Problem solving
  through abduction. In Proceedings of the Seventeenth International Joint Conference
  on Artificial Intelligence, Seattle, Washington, USA (IJCAI-01), B. Nebel, Ed. Morgan
  Kaufmann Publishers, Seattle, Washington, USA, 591–596.
Kunen, K. 1987. Negation in logic programming. Journal of Logic Programming 4,
  289–308.
Mancarella, P., Terreni, G., Sadri, F., Toni, F., and Endriss, U. 2009. The CIFF
  proof procedure for abductive logic programming with constraints: Theory, implemen-
  tation and experiments. TPLP 9, 6, 691–750.
Wielemaker, J., Schrijvers, T., Triska, M., and Lager, T. 2012. SWI-Prolog.
  Theory and Practice of Logic Programming 12.