    ABox Abduction for Description Logics: The Case of
                 Multiple Observations

                           Júlia Pukancová and Martin Homola

                              Comenius University in Bratislava
                      Faculty of Mathematics, Physics, and Informatics,
                              Mlynská dolina, 84248 Bratislava
                       {pukancova,homola}@fmph.uniba.sk




       Abstract. We develop an ABox abduction algorithm for description logics based
       on Reiter’s minimal hitting set algorithm. It handles abduction problems with mul-
       tiple observations and it supports the class of explanations allowing atomic and
       negated atomic concept and role assertions. As shorter explanations are preferred,
       the algorithm computes shorter explanations first and allows their length to be limited.
       The algorithm is sound and complete for this class of explanations and for any
       given maximal length of explanations. In this paper we specifically focus on com-
       puting explanations for abduction problems with multiple observations, for which
       we explore two distinct approaches. The DL expressivity is limited only by the
       DL reasoner that our algorithm calls as a black box. We provide an implementa-
       tion on top of Pellet, which is a full OWL 2 reasoner, so the expressivity is up to
       SROIQ(D). We evaluate the implementation on three different ontologies.

       Keywords: description logics ⋅ abduction ⋅ implementation ⋅ evaluation



1   Introduction

The goal of abduction [20,5] is to explain why a set of axioms 𝒪 (called observation)
does not follow from a knowledge base 𝒦: an explanation for 𝒪 is another set of axioms
ℰ s.t. 𝒪 follows from 𝒦 ∪ ℰ. ABox abduction, i.e. the case when both 𝒪 and ℰ are limited
to ABox axioms, has found applications in areas such as diagnostic reasoning [16,21,5]
or multimedia interpretation [6,2].
    On the other hand, general-purpose ABox abduction solvers, especially for more ex-
pressive DLs are still underdeveloped. Some approaches are based on translation, e.g.,
Klarman et al. [18] rely on a translation to first-order and modal logic, while Du et al. [4]
rely on a translation to logic programs. The former work is purely theoretical; it is sound
and complete, but limited to ALC. The latter has some interesting computational results,
but it is only complete w.r.t. a specific Horn fragment. Other works, e.g., of Halland and
Britz [12,11] and Ma et al. [19], directly exploit DL tableau reasoning, but their expressivity
is still limited to ALC and ALCI. The former work is a theoretical proposal; it is sound, but
not complete. The latter was implemented, but neither soundness nor completeness has been
shown. An ABox abduction service based on backward chaining of DL-safe rules [6,2] is
part of RacerPro [10]. To the best of our knowledge, soundness and completeness results
are not known. An interesting proposal based on forgetting was recently developed by
Del-Pinto and Schmidt [3]. Their work includes an implementation,
it is sound and complete for ALC, and it handles semantic, not just syntactic, minimality.
     Building on ideas of Halland and Britz [11,12], we have developed an ABox ab-
duction solver AAA [22] based on Reiter’s [23] Minimal Hitting Set (MHS) algorithm.
AAA supports atomic and negated atomic concept and role assertions in explanations. It
uses a DL reasoner as a black box, hence it handles any DL expressivity supported by the
reasoner – Pellet in the case of our implementation. It also exploits optimization techniques
suggested by Reiter, such as model reuse and pruning.
     In this paper, we extend our previous results as follows: (a) we develop two distinct
approaches for computing explanations for abduction problems with multiple observa-
tions; (b) we incorporate a limit on the maximum length of explanations and improve
the exploration of the search space, starting from the more desired (i.e., shorter) ex-
planations; (c) we conduct a preliminary empirical evaluation, specifically focusing on
the comparison of the two multiple-observation approaches and on the evaluation of the
implemented optimization techniques.


2   Preliminaries

For brevity we will only introduce the ALCHO DL [1], which contains all features
essential to our approach – especially due to the constructions involved in handling multiple
observations and role explanations. The lowest expressivity that the DL reasoner used
in our abduction algorithm should support is ALCHO. Role hierarchies are not strictly
needed, but without them the number of explanations involving roles is limited.
    A DL vocabulary consists of three countable mutually disjoint sets: the set of indi-
viduals 𝑁I = {𝑎, 𝑏, …}, the set of atomic concepts 𝑁C = {𝐴, 𝐵, …}, and the set of roles
𝑁R = {𝑅, 𝑆, …}. Complex ALCHO concepts are recursively constructed as stated in
Table 1, where 𝐶, 𝐷 are ALCHO concepts, 𝑅, 𝑆 are roles, and 𝑎, 𝑏 are individuals. A
TBox 𝒯 is a finite set of GCIs and RIAs, and an ABox 𝒜 is a finite set of concept, role,
or negated role assertions, as given in Table 1. A knowledge base is a pair 𝒦 = (𝒯, 𝒜).
By the negation of any assertion 𝜑, i.e. ¬𝜑, we mean ¬𝐶(𝑎) for 𝜑 = 𝐶(𝑎), ¬𝑅(𝑎, 𝑏) for
𝜑 = 𝑅(𝑎, 𝑏), and 𝑅(𝑎, 𝑏) for 𝜑 = ¬𝑅(𝑎, 𝑏). Let ¬𝒜 = {¬𝜑 ∣ 𝜑 ∈ 𝒜} for any ABox 𝒜.
    An interpretation of a knowledge base 𝒦 is a pair ℐ = (Δ^ℐ, ⋅^ℐ), where Δ^ℐ ≠ ∅,
and ⋅^ℐ is an interpretation function s.t. 𝑎^ℐ ∈ Δ^ℐ for 𝑎 ∈ 𝑁I , 𝐴^ℐ ⊆ Δ^ℐ for 𝐴 ∈ 𝑁C , and
𝑅^ℐ ⊆ Δ^ℐ × Δ^ℐ for 𝑅 ∈ 𝑁R . Interpretation of complex concepts is inductively defined
in Table 1. ℐ satisfies an axiom 𝜑 (ℐ ⊧ 𝜑) as given in Table 1. ℐ is a model of 𝒦 if it
satisfies all axioms included in 𝒦; 𝒦 is consistent if there is a model ℐ of 𝒦. A concept
𝐶 is satisfiable w.r.t. 𝒦 if there is a model ℐ of 𝒦 s.t. 𝐶^ℐ ≠ ∅. An axiom 𝜑 is entailed
by 𝒦 (𝒦 ⊧ 𝜑) if for every model ℐ of 𝒦 it holds that ℐ ⊧ 𝜑. A set of axioms 𝒜 is entailed
by 𝒦 (𝒦 ⊧ 𝒜) if 𝒦 ⊧ 𝜑 for all 𝜑 ∈ 𝒜.
    Entailment and consistency checking are inter-reducible: e.g., for an ABox assertion
𝜑, 𝒦 ⊧ 𝜑 if and only if 𝒦 ∪ {¬𝜑} is inconsistent. There are a number of DL reasoners
[14,15,25,24,7,26] for consistency checking, using the tableau algorithm (TA) [1].
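    To make this reduction concrete, the following Python sketch (illustrative only, not part
of the AAA implementation) represents ABox assertions, their negation as defined above,
and an entailment check that calls a black-box consistency oracle; the oracle is_consistent
stands in for a DL reasoner such as Pellet and is an assumption of this sketch.

    # Illustrative sketch (not part of AAA): ABox assertions, their negation as
    # defined above, and entailment reduced to a consistency check. The reasoner
    # is assumed to be a black-box predicate is_consistent(axioms) -> bool.
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class ConceptAssertion:          # C(a), where C is "A" or "¬A"
        concept: str
        individual: str

    @dataclass(frozen=True)
    class RoleAssertion:             # R(a, b) or ¬R(a, b)
        role: str
        subj: str
        obj: str
        negated: bool = False

    def negate(phi):
        """¬φ: ¬C(a) for C(a), ¬R(a, b) for R(a, b), and R(a, b) for ¬R(a, b)."""
        if isinstance(phi, ConceptAssertion):
            c = phi.concept
            neg = c[1:] if c.startswith("¬") else "¬" + c
            return ConceptAssertion(neg, phi.individual)
        return RoleAssertion(phi.role, phi.subj, phi.obj, not phi.negated)

    def entails(kb, phi, is_consistent):
        """K ⊧ φ iff K ∪ {¬φ} is inconsistent."""
        return not is_consistent(set(kb) | {negate(phi)})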
    In DL we distinguish between TBox and ABox abduction [5]; the latter is of our
interest in this paper. An ABox abduction problem is a pair 𝒫 = (𝒦, 𝒪), where 𝒦 is
                         Table 1. Syntax and Semantics of ALCHO

Constructor              Syntax     Semantics
complement               ¬𝐶         Δ^ℐ ⧵ 𝐶^ℐ
intersection             𝐶 ⊓ 𝐷      𝐶^ℐ ∩ 𝐷^ℐ
union                    𝐶 ⊔ 𝐷      𝐶^ℐ ∪ 𝐷^ℐ
existential restriction  ∃𝑅.𝐶       {𝑥 ∣ ∃𝑦 (𝑥, 𝑦) ∈ 𝑅^ℐ ∧ 𝑦 ∈ 𝐶^ℐ}
value restriction        ∀𝑅.𝐶       {𝑥 ∣ ∀𝑦 (𝑥, 𝑦) ∈ 𝑅^ℐ → 𝑦 ∈ 𝐶^ℐ}
nominal                  {𝑎}        {𝑎^ℐ}
Axiom                    Syntax     Semantics
concept incl. (GCI)      𝐶 ⊑ 𝐷      𝐶^ℐ ⊆ 𝐷^ℐ
role incl. (RIA)         𝑅 ⊑ 𝑆      𝑅^ℐ ⊆ 𝑆^ℐ
concept assertion        𝐶(𝑎)       𝑎^ℐ ∈ 𝐶^ℐ
role assertion           𝑅(𝑎, 𝑏)    (𝑎^ℐ, 𝑏^ℐ) ∈ 𝑅^ℐ
neg. role assertion      ¬𝑅(𝑎, 𝑏)   (𝑎^ℐ, 𝑏^ℐ) ∉ 𝑅^ℐ



a DL knowledge base and 𝒪 is a set of ABox assertions. A set of ABox assertions
ℰ is an explanation of 𝒫 if 𝒦 ∪ ℰ ⊧ 𝒪. If 𝒪 = {𝑂} we call 𝒫 = (𝒦, 𝒪) a single-
observation abduction problem, and we also denote it by 𝒫 = (𝒦, 𝑂). Otherwise 𝒫 is
called a multiple-observation abduction problem.
     Not all explanations are acceptable [5]. We will require an explanation ℰ of 𝒫 to
be explanatory (𝒦 ⊭ 𝒪), consistent (𝒦 ∪ ℰ is consistent), and relevant (for each 𝑂𝑖 ∈
𝒪: ℰ ⊭ 𝑂𝑖 ). Even after ruling out such undesired explanations, there can still be too
many of them. Therefore some notion of minimality is often used. We will use syntactic
minimality: an explanation ℰ of 𝒫 is minimal if there is no other explanation ℰ′ of 𝒫
s.t. ℰ′ ⊊ ℰ.
     In this work we are interested in explanations in the form of atomic and negated atomic
ABox assertions. More specifically, we require ℰ ⊆ {𝐴(𝑎), ¬𝐴(𝑎), 𝑅(𝑎, 𝑏), ¬𝑅(𝑎, 𝑏) ∣
𝐴 ∈ 𝑁C , 𝑅 ∈ 𝑁R , 𝑎, 𝑏 ∈ 𝑁I }. This class of explanations will be denoted A^¬R^¬. The
subset of this class which only contains explanatory, consistent, relevant, and minimal
explanations will be denoted A^¬R^¬_{CER,⊆}.
     Note that for any abduction problem 𝒫 = (𝒦, 𝒪) both 𝒦 and 𝒪 are finite and the
class of explanations is defined w.r.t. their finite combined signature, hence there are only
finitely many explanations in either of these classes for any abduction problem 𝒫.
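     The acceptability conditions above can also be stated operationally. The following
sketch is illustrative only and assumes black-box procedures entails(kb, axioms) and
is_consistent(axioms) provided by a DL reasoner (as in the sketch in the previous section).

    # Illustrative sketch of the conditions on explanations from this section,
    # assuming black-box procedures entails(kb, axioms) and is_consistent(axioms).
    def is_acceptable(kb, observations, explanation, entails, is_consistent):
        explanatory = not entails(kb, observations)                         # K ⊭ O
        consistent = is_consistent(set(kb) | set(explanation))             # K ∪ E is consistent
        relevant = all(not entails(explanation, {o}) for o in observations)  # E ⊭ O_i for each O_i
        return explanatory and consistent and relevant

    def is_minimal(explanation, all_explanations):
        # syntactic minimality: no other explanation is a proper subset of this one
        return not any(set(other) < set(explanation)
                       for other in all_explanations if other != explanation)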


3    ABox Abduction with Single Observation
To compute explanations for single-observation abduction problems we build on our
previous work [22] which in turn extends the works of Halland and Britz [12,11] and
the seminal work of Reiter [23].
     In order to find an explanation for an ABox abduction problem 𝒫 = (𝒦, 𝑂) we need
to find a set of ABox assertions ℰ such that 𝒦 ∪ ℰ ⊧ 𝑂, i.e., such that 𝒦 ∪ ℰ ∪ {¬𝑂}
is inconsistent. This can be done by finding a hitting set [17] of the collection ℳ of all
models of 𝒦 ∪ {¬𝑂} and then negating each assertion in the hitting set.
     To stay within the class A^¬R^¬ we will rely on the ABox encoding of models. The ABox
encoding of ℐ is 𝑀ℐ = {𝐶(𝑎) ∣ ℐ ⊧ 𝐶(𝑎), 𝐶 ∈ {𝐴, ¬𝐴}, 𝐴 ∈ 𝑁C , 𝑎 ∈ 𝑁I } ∪ {𝑅(𝑎, 𝑏) ∣ ℐ ⊧
𝑅(𝑎, 𝑏), 𝑅 ∈ 𝑁R , 𝑎, 𝑏 ∈ 𝑁I } ∪ {¬𝑅(𝑎, 𝑏) ∣ ℐ ⊧ ¬𝑅(𝑎, 𝑏), 𝑅 ∈ 𝑁R , 𝑎, 𝑏 ∈ 𝑁I }. Similarly
as above, 𝒦 and 𝑂 have a finite combined signature, hence each ABox encoding
is finite. What is more, 𝑀ℐ is not necessarily homomorphic with the original model ℐ;
it ignores the anonymous part of the model (on purpose). Hereafter we automatically
assume the ABox encoding whenever we talk about models.
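     The following sketch is illustrative only: it computes the ABox encoding of a finitely
represented model (given here as plain Python dictionaries) over a given finite signature,
ignoring any anonymous domain elements.

    # Illustrative sketch: ABox encoding M_I of a finitely represented model I,
    # restricted to the named individuals, atomic concepts, and roles of the
    # combined signature of K and O; anonymous domain elements are ignored.
    def abox_encoding(interp, concepts, roles, individuals):
        # interp = {"ind": {a: d, ...}, "concept": {A: {d, ...}}, "role": {R: {(d, d2), ...}}}
        encoding = set()
        for a in individuals:
            da = interp["ind"][a]
            for A in concepts:                       # exactly one of A(a), ¬A(a)
                holds = da in interp["concept"].get(A, set())
                encoding.add((A if holds else "¬" + A, a))
            for b in individuals:
                db = interp["ind"][b]
                for R in roles:                      # exactly one of R(a,b), ¬R(a,b)
                    holds = (da, db) in interp["role"].get(R, set())
                    encoding.add((R if holds else "¬" + R, a, b))
        return encoding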
     According to Reiter [23], hitting sets are found by constructing an HS-tree 𝑇 =
(𝑉 , 𝐸, 𝐿), a labelled tree which contains negated models of 𝒦 ∪ {¬𝑂} as node labels
and whose edges are labelled by ABox assertions contained in their parent node's label.
The HS-tree has the property that the node label 𝐿(𝑛) and the union 𝐻(𝑛) of the edge
labels on the path from the root 𝑟 to each node 𝑛 are disjoint. If 𝑛 cannot be extended by
any successor satisfying this property then 𝐻(𝑛) corresponds to a hitting set.
     Additional pruning conditions apply to the HS-tree [23,22]. The most obvious case
is when a node 𝑛 already corresponds to a hitting set 𝐻(𝑛) and there is another node 𝑛′
with 𝐻(𝑛) ⊆ 𝐻(𝑛′ ). Any such 𝑛′ can be pruned. For further conditions refer to the literature
[23,22]. A pruned HS-tree (i.e., one to which all pruning conditions were applied), once
completed, contains all minimal hitting sets [23].
Theorem 1. Given an ABox abduction problem 𝒫 = (𝒦, 𝑂), let ℳ be the set of all
models of 𝒦 ∪ {¬𝑂} and let 𝑇 = (𝑉 , 𝐸, 𝐿) be a pruned HS-tree for the collection of sets
{¬𝑀 ∣ 𝑀 ∈ ℳ}. Then {𝐻(𝑛) ∣ 𝑛 ∈ 𝑉 is a leaf in 𝑇 that is not pruned} is the collection
of all minimal explanations of 𝒫.
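     For illustration, the following sketch applies the minimal-hitting-set construction
underlying Theorem 1 to an explicitly given collection of sets (the negated models); the
actual SOA algorithm below obtains these sets lazily from the DL reasoner and applies
further pruning and relevance checks.

    # Illustrative sketch: breadth-first HS-tree exploration over an explicit
    # collection of sets, with subset pruning against already found hitting sets.
    from collections import deque

    def minimal_hitting_sets(collection, max_size=None):
        hitting_sets = []
        queue = deque([frozenset()])                 # H(r) of the root is empty
        while queue:
            h = queue.popleft()
            if max_size is not None and len(h) > max_size:
                continue
            if any(hs <= h for hs in hitting_sets):  # prune: a found hitting set is contained in h
                continue
            missed = next((s for s in collection if not (s & h)), None)
            if missed is None:                       # h hits every set: minimal hitting set found
                hitting_sets.append(h)
                continue
            for sigma in missed:                     # branch on the elements of the node label
                queue.append(h | {sigma})
        return hitting_sets

    # e.g. minimal_hitting_sets([{1, 2}, {2, 3}]) yields the hitting sets {2} and {1, 3}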
     The Single Observation Abduction algorithm (SOA) is given in Algorithm 1. This
is a simplified and shortened pseudocode with some details omitted. For a more detailed
version please refer to our previous report [22].
     The algorithm takes a knowledge base 𝒦 and an observation 𝑂 as its first two inputs.
One improvement w.r.t. the previous work [22] is that the search can be limited to
explanations up to a certain length; this is done by the third parameter 𝑙. The current
version also implements model reuse [23,22] and it has a few additional improvements,
notably it adds further pruning, but the details are beyond the scope of this paper.
     The last two parameters are auxiliary and they are only relevant when the algorithm is
called as a subroutine to find explanations for multiple-observation abduction problems.
In the single-observation case they will be set to 𝒪 = {𝑂} and some new individual 𝑠0
(w.r.t. both 𝒦 and 𝑂).
     The algorithm first tries to obtain a model 𝑀 of 𝒦 ∪ {¬𝑂} to use ¬𝑀 as the label of the
root node 𝑟. If there is none (i.e., 𝒦 ⊧ 𝑂), no (explanatory) explanations exist. Otherwise,
successors of 𝑟 are initialized and the tree is traversed breadth-first down to the depth 𝑙.
For each node 𝑛, we first evaluate 𝐻(𝑛). If it is inconsistent, irrelevant, or pruning applies,
the node is pruned.
     Next, if 𝒦 ∪ {¬𝑂} ∪ 𝐻(𝑛) has a model, it is used to label 𝑛; otherwise we have found
an explanation, which is stored. Finally, after the desired depth 𝑙 is fully traversed, all
explanations are returned.
     The algorithm calls the external solver (TA) on lines 1, 10, 11, and 15, though some
TA calls can be saved thanks to model reuse. Also, while pruning ensures minimality,
all other undesired explanations are filtered out as well, hence the following result.
Algorithm 1 SOA(𝒦,𝑂,𝑙,𝒪,𝑠0 )
Require: knowledge base 𝒦, observation 𝑂, max explanation length 𝑙, set of observations 𝒪,
     individual 𝑠0
Ensure: set 𝐒 of all explanations of 𝒫 = (𝒦, 𝑂) of the class A^¬R^¬_{CER,⊆} up to the length 𝑙
 1: 𝑀 ← a model 𝑀 of 𝒦 ∪ {¬𝑂} obtained from TA
 2: if 𝑀 = 𝚗𝚞𝚕𝚕 then
 3:     return "nothing to explain"
 4: end if
 5: create new HS-tree 𝑇 = (𝑉 , 𝐸, 𝐿) with root 𝑟
 6: label 𝑟 by 𝐿(𝑟) ← ¬𝑀 ⧵ {𝜑 ∈ ¬𝑀 ∣ 𝑠0 occurs in 𝜑}
 7: for each 𝜎 ∈ ¬𝑀 create a successor 𝑛𝜎 of 𝑟 and label the resp. edge by 𝜎
 8: 𝐒 ← {}
 9: while there is a next node 𝑛 in 𝑇 and |𝐻(𝑛)| ≤ 𝑙 do
10:     if 𝒦 ∪ 𝐻(𝑛) is inconsistent
11:         or 𝐻(𝑛) ∪ {¬𝑂𝑖 } is inconsistent for some 𝑂𝑖 ∈ 𝒪
12:         or 𝑛 can be pruned
13:     then
14:         prune 𝑛
15:     else if there is a model 𝑀 of 𝒦 ∪ {¬𝑂} ∪ 𝐻(𝑛), reused or obtained from TA then
16:         label 𝑛 by 𝐿(𝑛) ← ¬𝑀 ⧵ {𝜑 ∈ ¬𝑀 ∣ 𝑠0 occurs in 𝜑}
17:     else
18:         𝐒 ← 𝐒 ∪ {𝐻(𝑛)}
19:     end if
20:     for each 𝜎 ∈ ¬𝑀 create a successor 𝑛𝜎 of 𝑛 and label the resp. edge by 𝜎
21: end while
22: return 𝐒



Theorem 2. Let 𝒫 = (𝒦, 𝑂) be a single-observation ABox abduction problem, 𝑙 ≥ 1,
and 𝑠0 a new individual w.r.t. 𝒦 and 𝑂. SOA(𝒦, 𝑂, 𝑙, {𝑂}, 𝑠0 ) terminates, and it is sound
and complete w.r.t. all explanations of 𝒫 of the class A^¬R^¬_{CER,⊆} up to the length 𝑙.
     In addition, if we remove the length limitation (i.e., we set 𝑙 to ∞), the algorithm
still terminates, as the depth of the HS-tree is also bounded by the number of possible
ABox assertions. In such a case the algorithm finds all explanations of the input ABox
abduction problem.

4     Handling Multiple Observations
We now extend SOA to handle multiple observations. This algorithm is called ABox
Abduction Algorithm (AAA). In fact, we explore two versions, one based on reducing
the set of observations to a single observation (AAAR ), the other based on splitting the
multiple-observation problem into separate single-observation subproblems (AAAS ).

4.1   Reduction
An observation consisting of multiple concept assertions involving the same individual,
say 𝒪 = {𝐶1 (𝑎), … , 𝐶𝑛 (𝑎)}, can easily be reduced to an equivalent single observation
𝑂′ = (𝐶1 ⊓ ⋯ ⊓ 𝐶𝑛 )(𝑎). Cases involving multiple individuals or even role assertions are
less straightforward, but in DLs featuring nominals they may be encoded as follows.
Lemma 1. Let 𝒫 = (𝒦, 𝒪) be a multiple-observation ABox abduction problem with
𝒪 = {𝐶1 (𝑎1 ), … , 𝐶𝑛 (𝑎𝑛 ), 𝑅1 (𝑏1 , 𝑐1 ), … , 𝑅𝑚 (𝑏𝑚 , 𝑐𝑚 ), ¬𝑄1 (𝑑1 , 𝑒1 ), … , ¬𝑄𝑙 (𝑑𝑙 , 𝑒𝑙 )},
where the 𝐶𝑖 are concepts, 𝑅𝑗 , 𝑄𝑘 ∈ 𝑁R and 𝑎𝑖 , 𝑏𝑗 , 𝑐𝑗 , 𝑑𝑘 , 𝑒𝑘 ∈ 𝑁I for 𝑖 ∈ [1, 𝑛], 𝑗 ∈ [1, 𝑚],
𝑘 ∈ [1, 𝑙]. Let ℰ ⊆ {𝐴(𝑎), ¬𝐴(𝑎), 𝑅(𝑎, 𝑏), ¬𝑅(𝑎, 𝑏) ∣ 𝐴 ∈ 𝑁C , 𝑅 ∈ 𝑁R , 𝑎, 𝑏 ∈ 𝑁I }. Let
𝒫′ = (𝒦, 𝑂′ ) be a single-observation ABox abduction problem with 𝑂′ = 𝑋(𝑠0 ), s.t. 𝑠0
is new w.r.t. 𝒦, 𝒪, and ℰ, and

                𝑋 = (¬{𝑎1 } ⊔ 𝐶1 ) ⊓ ⋯ ⊓ (¬{𝑎𝑛 } ⊔ 𝐶𝑛 )
                      ⊓ (¬{𝑏1 } ⊔ ∃𝑅1 .{𝑐1 }) ⊓ ⋯ ⊓ (¬{𝑏𝑚 } ⊔ ∃𝑅𝑚 .{𝑐𝑚 })
                      ⊓ (¬{𝑑1 } ⊔ ∀𝑄1 .¬{𝑒1 }) ⊓ ⋯ ⊓ (¬{𝑑𝑙 } ⊔ ∀𝑄𝑙 .¬{𝑒𝑙 }) .

Then ℰ is an explanation of 𝒫 if and only if it is an explanation of 𝒫′ .
    Notice that the lemma rules out explanations involving the individual 𝑠0 introduced
during the reduction. If this were not the case, 𝒫′ might indeed have more explanations than
𝒫. These unwanted explanations need to be filtered out.
Example 1. Let 𝒦 = {𝐴 ⊑ 𝐵, 𝐶 ⊑ 𝐷}, and 𝒪 = {𝐵(𝑎), 𝐷(𝑏)}. The only explanation
of 𝒫 = (𝒦, 𝒪) is ℰ1 = {𝐴(𝑎), 𝐶(𝑏)}. Using the reduction we obtain 𝒫′ = (𝒦, 𝑂′ ) with
𝑂′ = ((¬{𝑎} ⊔ 𝐵) ⊓ (¬{𝑏} ⊔ 𝐷))(𝑠0 ). However, besides ℰ1 , which is an explanation
of 𝒫′ courtesy of Lemma 1, also ℰ2 = {𝐴(𝑠0 ), 𝐶(𝑠0 )}, ℰ3 = {𝐴(𝑎), 𝐶(𝑠0 )}, and
ℰ4 = {𝐴(𝑠0 ), 𝐶(𝑏)} are explanations of 𝒫′ .
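    For illustration, the reduction of Lemma 1 is a straightforward syntactic construction;
the sketch below (illustrative only) builds the concept 𝑋 as a plain string in DL syntax from
the three kinds of assertions in 𝒪.

    # Illustrative sketch of the reduction in Lemma 1: build the concept X from
    # concept assertions (C, a), role assertions (R, b, c), and negated role
    # assertions (Q, d, e), using nominals.
    def reduce_observations(concept_asserts, role_asserts, neg_role_asserts):
        conjuncts  = [f"(¬{{{a}}} ⊔ {c})" for (c, a) in concept_asserts]
        conjuncts += [f"(¬{{{b}}} ⊔ ∃{r}.{{{c}}})" for (r, b, c) in role_asserts]
        conjuncts += [f"(¬{{{d}}} ⊔ ∀{q}.¬{{{e}}})" for (q, d, e) in neg_role_asserts]
        return " ⊓ ".join(conjuncts)

    # For Example 1: reduce_observations([("B", "a"), ("D", "b")], [], [])
    # yields "(¬{a} ⊔ B) ⊓ (¬{b} ⊔ D)", and O' = X(s0) for a fresh individual s0.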
    The AAAR algorithm is listed in Algorithm 2. It takes a knowledge base 𝒦, a set
of observations 𝒪, and a length upper bound 𝑙 ≥ 1 as inputs. It reduces the set 𝒪 to a
single observation 𝑂′ according to Lemma 1 and passes the reduced single-observation
abduction problem to SOA.


Algorithm 2 AAAR (𝒦,𝒪,𝑙): AAA based on Reduction
Require: knowledge base 𝒦, set of observations 𝒪 = {𝐶1 (𝑎1 ), … , 𝐶𝑛 (𝑎𝑛 ), 𝑅1 (𝑏1 , 𝑐1 ), … ,
     𝑅𝑚 (𝑏𝑚 , 𝑐𝑚 ), ¬𝑄1 (𝑑1 , 𝑒1 ), … , ¬𝑄𝑙 (𝑑𝑙 , 𝑒𝑙 )}, max length of an explanation 𝑙
Ensure: set 𝐒 of all explanations of 𝒫 = (𝒦, 𝒪) of the class A^¬R^¬_{CER,⊆} up to the length 𝑙
 1: 𝑠0 ← new individual w.r.t. 𝒦 and 𝒪
 2: 𝑋 ← (¬{𝑎1 } ⊔ 𝐶1 ) ⊓ ⋯ ⊓ (¬{𝑎𝑛 } ⊔ 𝐶𝑛 )
         ⊓ (¬{𝑏1 } ⊔ ∃𝑅1 .{𝑐1 }) ⊓ ⋯ ⊓ (¬{𝑏𝑚 } ⊔ ∃𝑅𝑚 .{𝑐𝑚 })
         ⊓ (¬{𝑑1 } ⊔ ∀𝑄1 .¬{𝑒1 }) ⊓ ⋯ ⊓ (¬{𝑑𝑙 } ⊔ ∀𝑄𝑙 .¬{𝑒𝑙 })
 3: 𝑂′ ← 𝑋(𝑠0 )
 4: 𝐒 ← SOA(𝒦, 𝑂′ , 𝑙, 𝒪, 𝑠0 )
 5: return 𝐒



    Instead of filtering out the unwanted explanations involving the auxiliary individual 𝑠0 ex
post, AAAR passes 𝑠0 to SOA as the fifth parameter. SOA then excludes the assertions
involving 𝑠0 already from the models returned by TA, reducing the HS-tree.
    Since for multiple-observation abduction the relevance needs to be checked w.r.t.
all observations, AAAR also passes the original set of observations 𝒪 to SOA as the
fourth parameter. Observing that SOA handles the auxiliary parameters correctly, the
correctness of AAAR is then a consequence of the correctness of SOA.

Theorem 3. Let 𝒫 = (𝒦, 𝒪) be a multiple-observation ABox abduction problem, and
let 𝑙 ≥ 1. Then AAAR (𝒦,𝒪,𝑙) terminates, and it is sound and complete w.r.t. all expla-
nations of 𝒫 of the class A^¬R^¬_{CER,⊆} up to the length 𝑙.

    As for SOA, if we remove the depth limitation, the algorithm finds all explanations
of the input ABox abduction problem.


4.2   Splitting into Subproblems

A multiple-observation abduction problem may also be solved by splitting 𝒪 into 𝑛 sub-
problems 𝒫𝑖 = (𝒦, 𝑂𝑖 ) and answering each 𝒫𝑖 separately using SOA. If there is no
bound on length, this is quite easy: we simply combine the results by taking unions,
with some additional filtering [22]. But the partial explanations may overlap or even
repeat. If a length bound 𝑙 is given, we need to run SOA up to 𝑙 for each subproblem,
and only then combine the results. Only this assures all explanations up to the length 𝑙
for 𝒫 (plus possibly some which are longer). This may seem like unnecessary overhead
compared to AAAR , but as we show below, sometimes it may be useful.
The AAAS algorithm is listed in Algorithm 3. It receives a knowledge base 𝒦, observa-
tions 𝒪 = {𝑂1 , … , 𝑂𝑛 }, and a length upper bound 𝑙 ≥ 1 as inputs.


Algorithm 3 AAAS (𝒦,𝒪,𝑙): AAA with Splitting
Require: knowledge base 𝒦, set of observations 𝒪, max explanation length 𝑙
Ensure: set 𝐒 of all explanations of 𝒫 = (𝒦, 𝒪) of the class A^¬R^¬_{CER,⊆} up to the length 𝑙
 1: 𝑠0 ← new individual w.r.t. 𝒦 and 𝒪                                               ⊳ auxiliary
 2: Σ ← {}                                                        ⊳ collection of partial results
 3: 𝒦′ ← 𝒦 ∪ {⊤(𝑎) ∣ 𝑎 occurs in 𝑂𝑖 , 𝑂𝑖 ∈ 𝒪}
 4: for all 𝑂𝑖 ∈ 𝒪 do
 5:     𝐒𝑖 ← SOA(𝒦′ , 𝑂𝑖 , 𝑙, 𝒪, 𝑠0 )                               ⊳ get partial result for 𝑂𝑖
 6:     if 𝐒𝑖 = {} then
 7:         return {}                                                   ⊳ 𝒫 has no explanation
 8:     else if 𝐒𝑖 ≠ "nothing to explain" then
 9:         Σ ← Σ ∪ {𝐒𝑖 }                                                ⊳ store partial result
10:     end if
11: end for
12: if Σ = {} then
13:     return "nothing to explain"
14: else
15:     𝐒 ← {ℰ1 ∪ ⋯ ∪ ℰ𝑚 ∣ ℰ𝑖 ∈ 𝐒𝑖 , 𝐒𝑖 ∈ Σ, 𝑚 = |Σ|}
16:     𝐒 ← {ℰ ∈ 𝐒 ∣ ℰ is minimal, consistent, and relevant}
17: end if
18: return 𝐒
    The algorithm initializes an empty collection Σ, which will be used to accumulate
the partial results returned by SOA, and a dummy new individual 𝑠0 (lines 1–2).
    We cannot just directly use 𝒦 in each subproblem 𝒫𝑖 , as we might miss explanations
involving individuals from other observations. Hence 𝒦′ is used, obtained by adding
⊤(𝑎) into 𝒦 for all such individuals 𝑎 (line 3).
    The algorithm then loops through 𝑂𝑖 ∈ 𝒪 (lines 4–11) and calls SOA for 𝒦′ , 𝑂𝑖 ,
and 𝑙. We also pass the set of observations 𝒪 (due to relevance checks) and the dummy
new individual 𝑠0 to SOA as auxiliary parameters. If one of the observations cannot be
explained (SOA returned {}), then neither can the whole set 𝒪 be explained (line 7).
    Observe that 𝒪 and 𝒪 ⧵ {𝑂𝑖 ∈ 𝒪 ∣ 𝒦 ⊧ 𝑂𝑖 } have the same set of explanations.
Therefore if SOA returned "nothing to explain" for some 𝑂𝑖 , the result is sim-
ply excluded from Σ. If this happens for all 𝑂𝑖 ∈ 𝒪, the overall result is "nothing
to explain".
    If Σ is non-empty, the explanations of 𝒫 are computed as unions ℰ1 ∪ ⋯ ∪ ℰ𝑚 , com-
bining all possible ℰ𝑖 from each 𝐒𝑖 ∈ Σ with the others (line 15). While SOA already
did some filtering of the partial results, supersets, irrelevance, and inconsistency may
have been introduced by combining them, hence these are filtered out (line 16).
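    The combination step on lines 15–16 can be sketched as follows. The sketch is
illustrative only; the parameter acceptable stands for the consistency and relevance checks
w.r.t. 𝒦 and the whole set of observations 𝒪, as in Section 2.

    # Illustrative sketch of lines 15-16 of Algorithm 3: take one explanation
    # from each partial result, union them, then filter the combined candidates.
    from itertools import product

    def combine_partial_results(partial_results, kb, observations, acceptable):
        candidates = {frozenset().union(*combo) for combo in product(*partial_results)}
        candidates = {e for e in candidates if acceptable(kb, observations, e)}
        # keep only subset-minimal combined explanations
        return [e for e in candidates
                if not any(f < e for f in candidates if f != e)]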
    Finally, AAAS is also correct up to any length 𝑙, and if the length limitation is
removed, it finds all explanations of the input ABox abduction problem.

Theorem 4. Let 𝒫 = (𝒦, 𝒪) be a multiple-observation ABox abduction problem, and
let 𝑙 ≥ 1. Then AAAS (𝒦,𝒪,𝑙) terminates, and it is sound and complete w.r.t. all expla-
nations of 𝒫 of the class A^¬R^¬_{CER,⊆} up to the length 𝑙.


5     Evaluation
We have implemented the ABox abduction algorithm in the AAA solver. For im-
plementation details, see our previous report [22]. The implementation is available at
http://dai.fmph.uniba.sk/~pukancova/aaa/. We present evaluation results
with the aim of comparing the two versions of the multiple-observation algorithm.

5.1    Dataset and Methodology
We have chosen three ontologies for the evaluation: Family ontology (our own small
ontology of family relations)1 , Coffee ontology by Carlos Mendes2 , and LUBM (Lehigh
University Benchmark [9]). The parameters of the ontologies are stated in Table 2.
    On each input we ran AAA iteratively, raising the maximal explanation length (i.e., the
maximal depth of the HS-tree). The following properties were recorded for each run:
time of execution, number of explanations, number of nodes in the HS-tree, number
of TA calls, number of reused models, and number of pruned nodes.
    Apart from exceptional cases, all experiments were repeated 10 times on the same
input and the results were averaged. All experiments were executed both allowing and
disallowing loops (i.e., reflexive role assertions) in explanations. For lack of space
 1
      http://dai.fmph.uniba.sk/~pukancova/aaa/ont/
 2
      https://gist.github.com/cmendesce/56e1e16aee5a556a186f512eda8dabf3
                              Table 2. Parameters of the ontologies

                   Ontology            Concepts Roles Individuals Axioms
                   Family ontology        8      1         2        24
                   Coffee ontology       41      6         2       291
                   LUBM                  43      25        1        46



we report only the former in this paper. On average, the experiments with loops took
243.24 % more time and found 26.99 % more explanations.
    All experiments were done on a 6-core 3.2 GHz AMD Phenom™ II X6 1090T pro-
cessor with 8 GB RAM, running Ubuntu 17.10 (Linux 4.13.0), with the maximum Java
heap size set to 4 GB. We used the GNU time utility to measure the CPU time con-
sumed by AAA while running in user mode, summed over all threads.



5.2   Experiment


The algorithm was executed once on each ontology with a specifically chosen observa-
tion set, each consisting of three assertions: {𝖥𝖺𝗍𝗁𝖾𝗋(𝗃𝖺𝖼𝗄), 𝖬𝗈𝗍𝗁𝖾𝗋(𝖾𝗏𝖺), 𝖯𝖾𝗋𝗌𝗈𝗇(𝖿𝗋𝖾𝖽)}
for the Family ontology, {𝖬𝗂𝗅𝗄(𝖺), 𝖢𝗈𝖿𝖿𝖾𝖾(𝖻), 𝖯𝗎𝗋𝖾(𝖼)} for the Coffee ontology, and
{𝖯𝖾𝗋𝗌𝗈𝗇(𝗃𝖺𝖼𝗄), 𝖤𝗆𝗉𝗅𝗈𝗒𝖾𝖾(𝗃𝖺𝖼𝗄), 𝖯𝗎𝖻𝗅𝗂𝖼𝖺𝗍𝗂𝗈𝗇(𝖺)} for LUBM. The aim was to compare the
reduction (R) and the splitting (S) approach; therefore all experiments were run with
both options.
    The experiment was processed up to depth 3, as even at this depth the algorithm ran
out of memory in half of the cases. The reason for this is that the observations contain
multiple individuals, which increases the search space. Most runs were repeated 10 times,
with four exceptions: LUBM (R and S) at depth 3 and Coffee (R) at depth 3, as these runs
ran out of memory, and Coffee (S), as the execution time was too high.
    In Figure 1 the execution times for each ontology and both approaches are plotted.
These times are also stated in Table 3, together with the number of computed explanations
in each experiment. The out-of-memory cases show the time at which the memory was
exceeded. The average deviation in time between runs was 2.66 % for the splitting approach
and 1.6 % for the reduction approach.


                  Table 3. Parameters of HS-trees (D = depth; Time in seconds)

                       Family                           Coffee                      LUBM
App.  D       Nodes  Expl.     Time          Nodes  Expl.      Time       Nodes  Expl.    Time
  S   1        93.0      7     4.54          480.0     12     23.46       411.0    320   32.08
  S   2      1545.0    144    21.78        41303.0     12    826.74     39756.3    320  421.21
  S   3     17072.6    813  1169.02      2052996.0     12 332177.08           –      –       –
  R   1        31.0      0     2.78          160.0      0     11.85       137.0      0    9.11
  R   2       931.0      0    13.63        25441.0      0    503.72     18633.0    320  229.17
  R   3     13951.0      9    79.68              –      –         –           –      –       –
[Figure 1 plots execution time (time [s], logarithmic scale) against depth 1–3 for each
ontology and approach: LUBM (S), LUBM (R), Coffee (S), Coffee (R), Family (S), Family (R).]

                                       Fig. 1. Depth vs. time


     For each run of the algorithm, the number of pruned nodes, reused models, and TA
calls was recorded. In Figure 2 the proportions of these numbers for each experiment are
plotted (for the depths at which memory was not exceeded).
     The execution time is always higher for the splitting approach than for the reduction
approach (ignoring the out-of-memory cases). Indeed, this is because in the splitting
approach the length limit is applied to each subproblem separately. Thus, also some ex-
planations longer than the limit are computed (as unions of the separate results obtained
for each subproblem).
     From one point of view, the reduction approach is more efficient, as for a length limit
𝑙 it always assures all the explanations up to 𝑙, and in our experiments it always required
less time than the splitting approach for the same limit 𝑙.
     On the other hand, we observed that the splitting approach may often find higher
numbers of explanations much more quickly, e.g., for the Family ontology in Table 3:
AAAR found all 9 explanations up to length 3 in 79.68 seconds, whereas AAAS found
7 of these in 4.54 seconds (after depth 1) and had already found 144 explanations in
21.78 seconds (after depth 2). In fact, it took longer to run it to depth 3 (19.48 minutes),
but during this time it found the 9 explanations up to the length 3 together with an
additional 804 longer explanations. Though we cannot characterize these additional
explanations in any way (apart from being sound), this approach may be suitable for
applications where completeness is not a high priority and the main goal is to compute
as many explanations as possible.


6   Conclusions

We have described a sound and complete ABox abduction algorithm that handles mul-
tiple observations, and supports the class of explanations including atomic and negated
atomic concept and role assertions. Unlike the previous works, which can only support
(or are complete only up to) limited DL expressivity [18,12,11,19], our algorithm plugs in
a DL reasoner as a black box, hence the DL expressivity is only limited by the used
[Figure 2 consists of six stacked-bar panels – Family ontology (S), Coffee ontology (S),
LUBM ontology (S), Family ontology (R), Coffee ontology (R), and LUBM ontology (R) –
showing, for each depth, the proportions of TA calls, reused models, and pruned nodes.]

               Fig. 2. Proportion of pruned nodes, reused models and TA calls


reasoner. We have provided an implementation based on Pellet [25], thus the expressiv-
ity ranges up to SROIQ(D). MHS is NP-complete [23,17], hence the overall combination
inherits the higher complexity of the used DL, starting at ExpTime for ALCHO [13].
     In this paper we have predominantly focused on extending our algorithm and imple-
mentation to handle multiple-observation abduction problems. We have developed and
evaluated two distinct approaches. Our evaluation shows the applicability of the approach,
particularly when looking for explanations of lower lengths (i.e., those most preferred)
and with a smaller number of involved individuals. The evaluation also shows how the
implemented optimization techniques are essential in reducing the search space. Note
that this is only our first attempt at an implementation; in the future we would like to
plug in and compare different reasoners, but also to explore possible improvements to
the MHS algorithm [8,27] to further boost the performance.

Acknowledgments. This work was supported by the Slovak national project VEGA
1/0778/18. Júlia Pukancová is also supported by an extraordinary scholarship awarded
by Faculty of Mathematics, Physics, and Informatics, Comenius University in Bratislava,
and by the Comenius University grant no. UK/266/2018. We would like to thank the
anonymous reviewers of previous DL workshops for their valuable suggestions.
References

 1. Baader, F., Calvanese, D., McGuinness, D.L., Nardi, D., Patel-Schneider, P.F. (eds.): The
    Description Logic Handbook: Theory, Implementation, and Applications. Cambridge Uni-
    versity Press (2003)
 2. Castano, S., Espinosa Peraldí, I.S., Ferrara, A., Karkaletsis, V., Kaya, A., Möller, R., Mon-
    tanelli, S., Petasis, G., Wessel, M.: Multimedia interpretation for dynamic ontology evolution.
    Journal of Logic and Computation 19(5), 859–897 (2009)
 3. Del-Pinto, W., Schmidt, R.A.: Forgetting-based abduction in ALC. In: Proceedings of the
    Workshop on Second-Order Quantifier Elimination and Related Topics (SOQE 2017), Dres-
    den, Germany. CEUR-WS, vol. 2013, pp. 27–35 (2017)
 4. Du, J., Qi, G., Shen, Y., Pan, J.Z.: Towards practical ABox abduction in large description
    logic ontologies. Int. J. Semantic Web Inf. Syst. 8(2), 1–33 (2012)
 5. Elsenbroich, C., Kutz, O., Sattler, U.: A case for abductive reasoning over ontologies. In:
    Proceedings of the OWLED*06 Workshop on OWL: Experiences and Directions, Athens,
    GA, USA. CEUR-WS, vol. 216 (2006)
 6. Espinosa Peraldí, I.S., Kaya, A., Möller, R.: Formalizing multimedia interpretation based on
    abduction over description logic aboxes. In: Proceedings of the 22nd International Workshop
    on Description Logics (DL 2009), Oxford, UK. CEUR-WS, vol. 477 (2009)
 7. Glimm, B., Horrocks, I., Motik, B., Stoilos, G., Wang, Z.: HermiT: An OWL 2 reasoner.
    Journal of Automated Reasoning 53(3), 245–269 (2014)
 8. Greiner, R., Smith, B.A., Wilkerson, R.W.: A correction to the algorithm in Reiter's theory of
    diagnosis. Artificial Intelligence 41(1), 79–88 (1989)
 9. Guo, Y., Pan, Z., Heflin, J.: LUBM: A benchmark for OWL knowledge base systems. Journal
    of Web Semantics 3(2-3), 158–182 (2005)
10. Haarslev, V., Hidde, K., Möller, R., Wessel, M.: The RacerPro knowledge representation and
    reasoning system. Semantic Web 3(3), 267–277 (2012)
11. Halland, K., Britz, K.: ABox abduction in ALC using a DL tableau. In: 2012 South African
    Institute of Computer Scientists and Information Technologists Conference, SAICSIT ’12,
    Pretoria, South Africa. pp. 51–58 (2012)
12. Halland, K., Britz, K.: Naïve ABox abduction in ALC using a DL tableau. In: Proceedings of
    the 2012 International Workshop on Description Logics, DL 2012, Rome, Italy. CEUR-WS,
    vol. 846 (2012)
13. Hladik, J., Model, J.: Tableau systems for SHIO and SHIQ. In: Haarslev, V., Möller, R.
    (eds.) Proc. of the 17th Int. Workshop on Description Logics (DL 2004). CEUR-WS, vol.
    104, pp. 168–177 (2004)
14. Horrocks, I.: The FaCT system. In: Automated Reasoning with Analytic Tableaux and Related
    Methods, International Conference, TABLEAUX ’98, Oisterwijk, The Netherlands, Proceed-
    ings. LNCS, vol. 1397, pp. 307–312. Springer (1998)
15. Horrocks, I.: Using an expressive description logic: FaCT or fiction? In: Proceedings of the
    Sixth International Conference on Principles of Knowledge Representation and Reasoning
    (KR’98), Trento, Italy. pp. 636–649. AAAI (1998)
16. Hubauer, T., Legat, C., Seitz, C.: Empowering adaptive manufacturing with interactive di-
    agnostics: A multi-agent approach. In: Advances on Practical Applications of Agents and
    Multiagent Systems – 9th International Conference on Practical Applications of Agents and
    Multiagent Systems, PAAMS 2011, Salamanca, Spain. pp. 47–56 (2011)
17. Karp, R.M.: Reducibility among combinatorial problems. In: Proceedings of a symposium
    on the Complexity of Computer Computations, IBM Thomas J. Watson Research Center,
    Yorktown Heights, New York. pp. 85–103 (1972)
18. Klarman, S., Endriss, U., Schlobach, S.: ABox abduction in the description logic ALC. Jour-
    nal of Automated Reasoning 46(1), 43–80 (2011)
19. Ma, Y., Gu, T., Xu, B., Chang, L.: An ABox abduction algorithm for the description logic
    ALCI. In: Intelligent Information Processing VI – 7th IFIP TC 12 International Conference,
    IIP 2012, Guilin, China. Proceedings. IFIP AICT, vol. 385, pp. 125–130. Springer (2012)
20. Peirce, C.S.: Deduction, induction, and hypothesis. Popular Science Monthly 13, 470–482
    (1878)
21. Pukancová, J., Homola, M.: Abductive reasoning with description logics: Use case in medical
    diagnosis. In: Proceedings of the 28th International Workshop on Description Logics (DL
    2015), Athens, Greece. CEUR-WS, vol. 1350 (2015)
22. Pukancová, J., Homola, M.: Tableau-based ABox abduction for the ALCHO description
    logic. In: Proceedings of the 30th International Workshop on Description Logics, Montpellier,
    France. CEUR-WS, vol. 1879 (2017)
23. Reiter, R.: A theory of diagnosis from first principles. Artificial Intelligence 32(1), 57–95
    (1987)
24. Shearer, R., Motik, B., Horrocks, I.: HermiT: A highly-efficient OWL reasoner. In: Proceed-
    ings of the Fifth OWLED Workshop on OWL: Experiences and Directions, Karlsruhe, Ger-
    many. CEUR-WS, vol. 432 (2008)
25. Sirin, E., Parsia, B., Cuenca Grau, B., Kalyanpur, A., Katz, Y.: Pellet: A practical OWL-DL
    reasoner. Journal of Web Semantics 5(2), 51–53 (2007)
26. Steigmiller, A., Liebig, T., Glimm, B.: Konclude: system description. Journal of Web Seman-
    tics 27, 78–85 (2014)
27. Wotawa, F.: A variant of Reiter's hitting-set algorithm. Information Processing Letters 79(1),
    45–51 (2001)