=Paper=
{{Paper
|id=Vol-2373/paper-15
|storemode=property
|title=Comparing ABox Abduction Based on Minimal Hitting Set and MergeXplain
|pdfUrl=https://ceur-ws.org/Vol-2373/paper-15.pdf
|volume=Vol-2373
|authors=Katarína Fabianová,Júlia Pukancová,Martin Homola
|dblpUrl=https://dblp.org/rec/conf/dlog/FabianovaPH19
}}
==Comparing ABox Abduction Based on Minimal Hitting Set and MergeXplain==
<pdf width="1500px">https://ceur-ws.org/Vol-2373/paper-15.pdf</pdf>
<pre>
 Comparing ABox Abduction Based on Minimal Hitting
              Set and MergeXplain

               Katarína Fabianová, Júlia Pukancová and Martin Homola

                         Comenius University in Bratislava
                          Mlynská dolina, 84248 Bratislava
           kfabianova11@gmail.com,{pukancova,homola}@fmph.uniba.sk


       Abstract. We investigate an application of the MergeXplain algorithm on ABox
       abduction. This approach was recently proposed to address computational limi-
       tations of more traditional approaches, such as Reiter’s Minimal Hitting Set al-
       gorithm. MergeXplain uses divide and conquer to search through the space of
       possible explanations more quickly, however this comes at a cost, as it is not com-
       plete. In this paper, we report on our preliminary experimental evaluation of both
       approaches in the context of finding explanations of ABox abduction problems in
       description logics. Finally, we also investigate how to combine both approaches
       in order to leverage on the improved effectivity of MergeXplain and still retain
       completeness.

       Keywords: description logics ⋅ abduction ⋅ MergeXplain ⋅ minimal hitting set


1   Introduction
Abduction explains why some observation does not follow from a knowledge base (KB).
This problem was originally introduced by Peirce [16], and it was studied also in the con-
text of description logics (DL) [5,12,10,9,4,13,6,2,17,18,14,3]. We focus on ABox ab-
duction [5] wherein both observations and explanations are extensional (i.e., they come
in the form of ABox assertions).
    Multiple existing works adopted Reiter’s Minimal Hitting Set algorithm [19] exploit-
ing a DL reasoner for satisfiability checking. This was proposed by Halland and Britz
[10,9] and later extended by Pukancová and Homola [17,18] and Mrózek et al. [14] who
have developed a black box approach and its proof-of-concept implementation.
    The main advantage of Reiter’s algorithm is its completeness. Also, thanks to prop-
erties of breath-first search it finds smaller explanations first [18], but to search through
the complete space of explanations may simply take too long.
    However, completeness is not necessarily required. In some applications it may be
sufficient to find just one explanation. For such scenarios Junker [11] proposed Quick-
Xplain (QXP), which uses divide and conquer to find one explanation more effectively.
This method was further extended by Shchekotykhin et al. [20] into MergeXplain (MXP),
capable of finding multiple explanations in one run.
    In this work we report on our experiments with MXP-based abduction. We have
implemented both methods (MHS and MXP) into an ABox abduction solver. Our im-
plementation calls DL reasoners as a black box using OWL API similarly to Mrózek et
                         Table 1. Syntax and Semantics of 

Concept               Syntax    Semantics
Atomic concept          𝐴       𝐴
complement              ¬𝐶      Δ ⧵ 𝐶 
intersection            𝐶 ⊓𝐷    𝐶  ∩ 𝐷
existential restriction ∃𝑅.𝐶    {𝑥 ∣ ∃𝑦 (𝑥, 𝑦) ∈ 𝑅 ∧ 𝑦 ∈ 𝐶  }
nominal                 {𝑎}     {𝑎 }
Axiom                 Syntax    Semantics
concept incl. (GCI)   𝐶⊑𝐷       𝐶  ⊆ 𝐷
role incl. (RIA)      𝑅⊑𝑆       𝑅 ⊆ 𝑆 
concept assertion     𝐶(𝑎)     𝑎 ∈ 𝐶 
role assertion        𝑅(𝑎, 𝑏) (𝑎 , 𝑏 ) ∈ 𝑅
neg. role assertion   ¬𝑅(𝑎, 𝑏) (𝑎 , 𝑏 ) ∉ 𝑅


al. [14]. This allowed us to repeat all our experiments with Pellet [22], HermiT [21], and
JFact [15].
     We have conducted a preliminary experimental evaluation with two use cases. In the
first, simpler use case we focused on explanations of size one. While MXP is incomplete
in general, it is guaranteed to find all explanations of size one in a single run, hence such a
comparison is interesting. In the second use case we did not limit the size of explanations.
We simply set a timeout (of 12 hours) and compared the number of explanations found
by each approach.
     Our evaluation shows that in cases when one is only interested in explanations of
size one MHS is more effective. In cases when larger explanations cannot be ruled out
of consideration MXP was able to find over 80 % of explanations found by MHS before
the timeout expired.
     Finally, we have also investigated a combined approach, in which MXP is run re-
peatedly and MHS is used to steer this process in order to achieve completeness. We
present the resulting algorithm. Empirical evaluation of this approach is subject to our
ongoing work.


2    ABox Abduction in DL
For simplicity we will introduce  [1]. However any other DL may be used
due to our black box approach. A vocabulary consists of countably infinite mutually
disjoint sets of individuals 𝑁I = {𝑎, 𝑏, …}, roles 𝑁R = {𝑃 , 𝑅, …}, and atomic concepts
𝑁C = {𝐴, 𝐵, …}. Concepts are recursively built using constructors ¬, ⊓, ∃, as shown
in Table 1. A KB  =  ∪  consists of a a finite set of GCI and RIA axioms (in TBox
 ), and a finite set of assertions (in ABox ) as given in Table 1.
     An interpretation is a pair  = (Δ , ⋅ ), where Δ ≠ ∅ is a domain, and the inter-
pretation function ⋅ maps each individual 𝑎 ∈ 𝑁I to 𝑎 ∈ Δ , each atomic concept
𝐴 ∈ 𝑁C to 𝐴 ⊆ Δ , each role 𝑅 ∈ 𝑁R to 𝑅 ⊆ Δ × Δ in such a way that the
interpretation of any complex concept is as given on the right-hand side of Table 1.
    An interpretation  satisfies an axiom 𝜑 (denoted  ⊧ 𝜑) if the respective constraint
in Table 1 is satisfied. It is a model of a KB  (denoted  ⊧ ) if  ⊧ 𝜑 for all 𝜑 ∈ .
A KB  is consistent, if there is at least one interpretation  such that  ⊧ .  entails
an axiom 𝜑 (denoted  ⊧ 𝜑) if  ⊧ 𝜑 for each  ⊧ .
    Finally, we define ¬𝜑 ∶= ¬𝐶(𝑎) if 𝜑 = 𝐶(𝑎); ¬𝜑 ∶= 𝐶(𝑎) if 𝜑 = ¬𝐶(𝑎); ¬𝜑 ∶=
¬𝑅(𝑎, 𝑏) if 𝜑 = 𝑅(𝑎, 𝑏); ¬𝜑 ∶= 𝑅(𝑎, 𝑏) if 𝜑 = ¬𝑅(𝑎, 𝑏).
    In ABox abduction [5], we are given a KB  and an observation 𝑂 consisting of an
ABox assertion. The task is to find an explanation , again, consisting of ABox asser-
tions, such that  ∪  ⊧ 𝑂. However, the set of all ABox expressions may be too broad
to draw the explanations from (after all, it is infinite), hence we typically consider some
set of abducibles Abd. In this work we will limit the explanations to atomic and negated
atomic concept assertions; so we set Abd ∶= {𝐴(𝑎), ¬𝐴(𝑎) ∣ 𝐴 ∈ 𝑁C , 𝑎 ∈ 𝑁I }. Note
that we do not limit the observations in any way, apart from allowing only one (possibly
complex) ABox assertion.
Definition 1 (ABox Abduction Problem). Let Abd be a set of ABox assertions. An
ABox abduction problem is a pair  = (, 𝑂) such that  is a knowledge base in DL
and 𝑂 is an ABox assertion. A solution of  (also called explanation) is any finite set
 ⊆ Abd of ABox assertions such that  ∪  ⊧ 𝑂.
    While Definition 1 establishes the basic reasoning mechanism of abduction, some of
the explanations it permits may be unintuitive. According to Elsenbroich et al. [5] given
 = (, 𝑂) and its solution :
 1.  is consistent if  ∪  is a consistent KB,
 2.  is relevant if  ̸⊧ 𝑂,
 3.  is explanatory if  ̸⊧ 𝑂.
     Indeed an explanation should be consistent, as anything follows from inconsistency;
and so, an explanation that makes  inconsistent does not really explain the observation.
It should be relevant – it should not imply the observation directly without requiring the
KB  at all. And it should be explanatory, that is, we should not be able to explain the
observation without it.
     Only explanations which satisfy all three of the above conditions will be called de-
sired. In addition, in order to avoid excess hypothesizing, minimality is required.
Definition 2 (Minimality). Assume an ABox abduction problem  = (, 𝑂). Given
two solutions  and  ′ of , we say that  is (syntactically) smaller than  ′ if  ⊆  ′ .1
We further say that a solution  of  is syntactically minimal if there is no other solution
 ′ of  that is smaller than .

3      Our Approach
We first describe the MHS algorithm and then the MergeXplain algorithm. Both were
implemented and evaluated as reported in Section 4. Finally we also describe how both
algorithms may be combined to obtain a more effective yet still complete approach.
 1
     Note that before we compare two solutions  and  ′ of  syntactically, we typically normalize
     the assertions w.r.t. (outermost) concept conjunction: as 𝐶 ⊓ 𝐷(𝑎) is equivalent to the pair of
     assertions 𝐶(𝑎) and 𝐷(𝑎), we replace the former form by the latter while possible.
3.1   Minimal Hitting Set Algorithm

This approach is based on the fact that  is an explanation of  = (, 𝑂) if and only
if  ∪  ⊧ 𝑂, i.e., if and only if  ∪  ∪ {¬𝑂} is inconsistent. Assume  ∪ {¬𝑂} is
consistent and the set of all its models is . To find some explanation, we may simply
collect in  assertions that invalidate each model 𝑀 ∈ . However we want to draw
these assertions from the set of abducibles Abd. Hence it suffices to find a hitting set
(MHS) of the set {Abd(𝑀) ∣ 𝑀 ∈ } where Abd(𝑀) = {𝜙 ∈ Abd ∣ 𝑀 ̸⊧ 𝜙} is a set
of expressions that invalidate 𝑀 and are part of Abd.
    In fact, as proved by Reiter [19], the set of all minimal explanations corresponds to
the set of all minimal hitting sets of {Abd(𝑀) ∣ 𝑀 ∈ }, modulo negation.
    The algorithm works by constructing a HS-tree 𝑇 = (𝑉 , 𝐸, 𝐿), a labelled tree in
which each node is labelled by Abd(𝑀) for some model 𝑀 of  ∪ {¬𝑂} and whose
edges are labelled by elements of the parent node’s label. The HS-tree has the property
that the node label 𝐿(𝑛) and the union of the edge-labels 𝐻(𝑛) on the path from root
𝑟 to each node 𝑛 are disjoint. If 𝑛 cannot be extended by any successor satisfying this
property then 𝐻(𝑛) corresponds to a hitting set.
    In addition, pruning is applied during the HS-tree construction in order to eliminate
non-minimal hitting sets. The most obvious case is when a node 𝑛 already corresponds
to a hitting set 𝐻(𝑛) and there is another node 𝑛′ with 𝐻(𝑛) ⊆ 𝐻(𝑛′ ). Any such 𝑛′ can
be pruned. A pruned HS-tree (i.e., one on which all pruning conditions were applied),
once completed, contains all minimal hitting sets [19]. In addition we also prune nodes
which correspond to undesired explanations [17].
    The resulting algorithm is given in Algorithm 1. This algorithm is sound and com-
plete [19,17,18].

Theorem 1. The MHS algorithm is sound and complete (i.e., it returns the set  of all
minimal desired explanations of  and 𝑂.)

     The fact that MHS explores the search space using breadth-first search (BFS) allows
to limit the search for explanations by maximum size. The algorithm is still complete
w.r.t. any given target size [18].


3.2   MergeXplain Algorithm

MXP is listed in Algorithm 2. Both QXP and MXP were originally designed to find
minimal inconsistent subsets (dubbed conflicts) of an overconstrained knowledge base
 =  ∪ , where  is the consistent background theory and  is the “suspicious”
part from which the conflicts are drawn. This can be immediately adopted for ABox
abduction: in order to find explanations for an abduction problem  = (, 𝑂) one needs
to call MXP( ∪ {¬𝑂}, Abd). This observation allows us to adopt the following result
from Shchekotykhin et al. [20]:

Theorem 2. Assume an ABox abduction problem  = (, 𝑂) and a set of abducibles
Abd. If there is at least one explanation 𝛾 ⊆ Abd of  then calling MXP(∪{¬𝑂}, Abd)
returns a nonempty set Γ of explanations of .
Algorithm 1 MHS(,𝑂)
Require: knowledge base , observation 𝑂
Ensure: set  of all explanations of  = (, 𝑂) of the class Abd
 1: 𝑀 ← a model 𝑀 of  ∪ {¬𝑂}
 2: if 𝑀 = 𝚗𝚞𝚕𝚕 then
 3:     return "nothing to explain"
 4: end if
 5: create new HS-tree 𝑇 = (𝑉 , 𝐸, 𝐿) with root 𝑟
 6: label 𝑟 by 𝐿(𝑟) ← Abd(𝑀)
 7: for each 𝜎 ∈ 𝐿(𝑟) create a successor 𝑛𝜎 of 𝑟 and label the resp. edge by 𝜎
 8:  ← {}
 9: while there is next node 𝑛 in 𝑇 w.r.t. BFS do
10:     if 𝑛 can be pruned then
11:         prune 𝑛
12:     else if there is a model 𝑀 of  ∪ {¬𝑂} ∪ 𝐻(𝑛) then
13:         label 𝑛 by 𝐿(𝑛) ← Abd(𝑀)
14:     else
15:          ←  ∪ {𝐻(𝑛)}
16:     end if
17:     for each 𝜎 ∈ 𝐿(𝑛) create a successor 𝑛𝜎 of 𝑛 and label the resp. edge by 𝜎
18: end while
19: return 


     Thus MXP is sound, and it always finds at least one explanation if at least one ex-
ists. However it is not complete, especially in cases of abduction problems which have
multiple partially overlapping explanations.
Example 1. Let  = {𝐴 ⊓ 𝐵 ⊑ 𝐷, 𝐴 ⊓ 𝐶 ⊑ 𝐷} and let 𝑂 = 𝐷(𝑎). Let us ignore
negated ABox expressions and start with Abd = {𝐴(𝑎), 𝐵(𝑎), 𝐶(𝑎)}. There are two
minimal explanations of  = (, 𝑂): {𝐴(𝑎), 𝐵(𝑎)}, and {𝐴(𝑎), 𝐶(𝑎)}. Calling MXP(∪
{¬𝑂}, Abd), it passes the initial tests and calls FINDCONFLICTS( ∪ {¬𝑂}, Abd).
    FINDCONFLICTS needs to decide how to split  = Abd into 1 and 2 . Let us assume
the split was 1 = {𝐴(𝑎)} and 2 = {𝐵(𝑎), 𝐶(𝑎)}. Since both 1 and 2 are now conflict-
free w.r.t. ∪{¬𝑂}, the two consecutive recursive calls return ⟨1′ , ∅⟩ and ⟨2′ , ∅⟩ where
1′ = {𝐴(𝑎)} and 2′ = {𝐵(𝑎), 𝐶(𝑎)}.
    In the while loop, G ETCONFLICT( ∪ {¬𝑂} ∪ {𝐵(𝑎), 𝐶(𝑎)}, {𝐵(𝑎), 𝐶(𝑎)}, {𝐴(𝑎)})
returns 𝑋 = {𝐴(𝑎)} while G ETCONFLICT( ∪ {¬𝑂} ∪ {𝐴(𝑎)}, {𝐴(𝑎)}, {𝐵(𝑎), 𝐶(𝑎)})
returns 𝐵(𝑎), and hence the first confclit 𝛾 = {𝐴(𝑏), 𝐵(𝑎)} is found and added into Γ.
    However, consecutively 𝐴(𝑎) is removed from 𝐶1′ leaving it empty, and thus the other
conflict is not found and Γ = {{𝐴(𝑏), 𝐵(𝑎)}} is returned.

3.3   Combined Approach
As showed by the example, MergeXplain is not always complete. However, both ap-
proaches can be combined in order to regain completeness. The idea is to call MXP
repeatedly and use the construction of the HS-tree to steer this process and to guarantee
completeness.
Algorithm 2 MXP(,)
Input: : Background theory, : the set of possibly faulty constraints
Output: Γ, a set of minimal conflicts
 1: if ¬𝑖𝑠𝐶𝑜𝑛𝑠𝑖𝑠𝑡𝑒𝑛𝑡() then
 2:     return "no solution"
 3: else if 𝑖𝑠𝐶𝑜𝑛𝑠𝑖𝑠𝑡𝑒𝑛𝑡( ∪ ) then
 4:     return ∅
 5: end if
 6: ⟨ , Γ⟩ ← FINDCONFLICTS(, )
 7: return Γ
 8: function FINDCONFLICTS(, ) returns a tuple ⟨ ′ , Γ⟩
 9:    if 𝑖𝑠𝐶𝑜𝑛𝑠𝑖𝑠𝑡𝑒𝑛𝑡( ∪ ) then
10:          return ⟨, ∅⟩
11:    else if || = 1 then
12:          return ⟨∅, {}⟩
13:    end if
14:    Split  into disjoint, non-empty sets 1 and 2
15:    ⟨1′ , Γ1 ⟩ ← FINDCONFLICTS(, 1 )
16:    ⟨2′ , Γ2 ⟩ ← FINDCONFLICTS(, 2 )
17:    Γ ← Γ1 ∪ Γ2
18:    while ¬𝑖𝑠𝐶𝑜𝑛𝑠𝑖𝑠𝑡𝑒𝑛𝑡(1′ ∪ 2′ ∪ ) do
19:          𝑋 ← G ETCONFLICT( ∪ 2′ , 2′ , 1′ )
20:          𝛾 ← 𝑋 ∪ G ETCONFLICT( ∪ 𝑋, 𝑋, 2′ )
21:          1′ ← 1′ ∖{𝑎} where 𝑎 ∈ 𝑋
22:          Γ ← Γ ∪ {𝛾}
23:    end while
24:    return ⟨1′ ∪ 2′ , Γ⟩
25: end function
26: function G ETCONFLICT(, 𝐷, )
27:    if 𝐷 ≠ ∅ ∧ ¬𝑖𝑠𝐶𝑜𝑛𝑠𝑖𝑠𝑡𝑒𝑛𝑡() then
28:        return ∅
29:    else if || = 1 then
30:        return 
31:    end if
32:    Split  into disjoint, non-empty sets 1 and 2
33:    𝐷2 ← G ETCONFLICT( ∪ 1 , 1 , 2 )
34:    𝐷1 ← G ETCONFLICT( ∪ 𝐷2 , 𝐷2 , 1 )
35:    return 𝐷1 ∪ 𝐷2
36: end function


    The combined algorithm MHS-MXP, presented in Algorithm 3, amends the function
FINDCONFLICTS with additional third parameter 𝛿 ⊆  that guarantees that if 𝛿 is a
conflict for  (i.e., if  ∪ 𝛿 is inconsistent) then 𝛿 ∈ Γ, where Γ is the set of conflicts
returned by FINDCONFLICTS(, , 𝛿). This is assured by the modifications in lines 27
and 30. These changes are satisfactory to guarantee that if 𝛿 is a conflict for , it is not
lost from the output. The G ETCONFLICT function did not require any modifications.
Algorithm 3 MHS-MXP(,𝑂)
Require: knowledge base , observation 𝑂
Ensure: set  of all explanations of  = (, 𝑂) of the class Abd
 1: if ¬𝑖𝑠𝐶𝑜𝑛𝑠𝑖𝑠𝑡𝑒𝑛𝑡( ∪ {¬𝑂}) then
 2:     return "nothing to explain"
 3: end if
 4: create new HS-tree 𝑇 = (𝑉 , 𝐸, 𝐿) with root 𝑟
 5:  ← {}
 6: while there is next node 𝑛 in 𝑇 w.r.t. BFS s.t. 𝐿(𝑛) ⊆ Abd, 𝐿(𝑛) ≠ ∅ do
 7:     if 𝛾 ∈  s.t. 𝛾 ⊆ 𝐻(𝑛) then
 8:          𝑛←∅                                                                         ⊳ 𝑛 is pruned
 9:     else
10:           ⟨ , Γ⟩ ← FINDCONFLICTS( ∪ {¬𝑂}, Abd, 𝐻(𝑛))
11:           if Γ = ∅ go to 23 end if
12:           for all 𝛾 ∈ Γ do
13:                for all ordered sequences ⟨𝑎1 , … , 𝑎𝑛 ⟩ s.t. {𝑎1 , … , 𝑎𝑛 } = 𝛾 do
14:                    A DDPATH(𝑟,⟨𝑎1 , … , 𝑎𝑛 ⟩)
15:                end for
16:                for all 𝑎 ∈ 𝐿(𝑛) s.t. there is no 𝑛′ with 𝐿((𝑛, 𝑛′ )) = 𝑎 do
17:                    add node 𝑛′ , 𝐿((𝑛, 𝑛′ )) ← 𝑎
18:                end for
19:                 ←∪Γ
20:           end for
21:     end if
22: end while
23: return  ← {𝛾 ∈  ∣ 𝛾 is desired}
24: function FINDCONFLICTS(, , 𝛿) returns a tuple ⟨ ′ , Γ⟩
25:     if 𝑖𝑠𝐶𝑜𝑛𝑠𝑖𝑠𝑡𝑒𝑛𝑡( ∪ ) then
26:           return ⟨, ∅⟩
27:     else if || = 1 or  = 𝛿 then
28:           return ⟨∅, {}⟩
29:     end if
30:     Split  into disjoint, non-empty sets 1 and 2 s.t. if 𝛿 ⊆  then 𝛿 ⊆ 2
31:     ⟨1′ , Γ1 ⟩ ← FINDCONFLICTS(, 1 , 𝛿)
32:     ⟨2′ , Γ2 ⟩ ← FINDCONFLICTS(, 2 , 𝛿)
33:     Γ ← Γ1 ∪ Γ2
34:     while ¬𝑖𝑠𝐶𝑜𝑛𝑠𝑖𝑠𝑡𝑒𝑛𝑡(1′ ∪ 2′ ∪ ) do
35:           𝑋 ← G ETCONFLICT( ∪ 2′ , 2′ , 1′ )
36:           𝛾 ← 𝑋 ∪ G ETCONFLICT( ∪ 𝑋, 𝑋, 2′ )
37:           1′ ← 𝐶1′ ∖{𝑎} where 𝑎 ∈ 𝑋
38:           Γ ← Γ ∪ {𝛾}
39:     end while
40:     return ⟨1′ ∪ 2′ , Γ⟩
41: end function
42: function A DDPATH(𝑛, ⟨𝑎1 , … , 𝑎𝑛 ⟩)
43:     if there is no 𝑛′ in 𝑇 s.t. 𝐿(𝑛, 𝑛′ ) = 𝑎1 then
44:           add node 𝑛′ , 𝐿((𝑛, 𝑛′ )) ← 𝑎, 𝐿(𝑛′ ) ← 𝐿(𝑛)
45:     end if
46:     if 𝑛 > 1 then
47:           A DDPATH(𝑛′ ,⟨𝑎2 , … , 𝑎𝑛 ⟩)
48:     else
49:           𝐿(𝑛′ ) ← ∅
50:     end if
51: end function
    MHS-MXP starts by checking if  ⊧ 𝑂 (i.e., if  ∪ {¬𝑂} is inconsistent) in which
case there is nothing to explain. It then constructs the HS-tree with the following mod-
ifications: (a) Nodes are not labelled by subsets of Abd, instead the successor edges
are drawn from the whole set Abd. Only nodes corresponding to found conflicts and to
pruned nodes are labelled by ∅ to cut further search. (b) Nodes are explored by BFS,
for each node 𝑛 we call FINDCONFLICTS( ∪ {¬𝑂}, Abd, 𝐻(𝑛)) where by passing 𝐻(𝑛)
as parameter we assure that, if it is a conflict, it is not omitted from the resulting set of
conflicts Γ. (c) If FINDCONFLICTS did not find any conflicts (i.e., if Γ = ∅) we termi-
nate the search. We can do this because MXP always return some conflict, if one exists
[20]. Hence we can be sure that the search is over. (d) The HS-tree is enriched by all
paths found in Γ which are now verified minimal conflicts and hence (the corresponding
leaf-nodes) are omitted from future exploration. This is assured by labelling the ends of
these paths by ∅. (e) From now on all paths corresponding to their supersets will imme-
diately be pruned when encountered by the BFS search – they correspond to nonminimal
conflicts. (f) Finally we filter out conflicts corresponding to undesired explanations.

Theorem 3. The MHS-MXP algorithm is sound and complete (i.e., it returns the set 
of all minimal desired explanations of  and 𝑂.)

    The soundness follows from the soundness of MXP. The completeness follows from
the observation that for every 𝛾 ⊆ Abd that is an explanation of  = (, 𝑂), the HS-tree
contains a path respective to some leaf node 𝑛 s.t. 𝐻(𝑛) = 𝛾 as 𝛾 was only subtracted
from the node labels once at least one such node was added to the HS-tree.
    Note that one additional optimization is possible: after FINDCONFLICTS is called
for the first time and returns Γ, it is safe to reduce Abd by removing all 𝛾 ∈ Γ such
that |𝛾| = 1. This is because all minimal conflicts involving these abducibles were now
found. This reduces the search space, and in the special case when all explanations are
of size one the algorithm will terminate after two calls of FINDCONFLICTS.


4     Evaluation

A preliminary experimental evaluation was conducted with implementations of MHS
and MXP, both paired with three DL reasoners – Pellet, HermiT, and JFact. Both al-
gorithms are implemented in Java and communicate with the reasoners through OWL
API. The source code of both implementations is available online.2
    The evaluation is split into two experiments. Experiment 1 is focused on computing
explanations of size one. In this case MHS can be made more effective by bounding the
HS-tree depth, and on the other hand MXP is complete in this case. Experiment 2 was
conducted without any constraints on the size of explanations, but a timeout needed to
be set. Both experiments were focused on comparing execution times between the two
approaches and the three reasoners. Each time was computed as an average value from
ten runs with ten different observations.
 2
     https://github.com/katuskaa/MasterThesis
4.1    Dataset and Methodology
Three ontologies were chosen. The Family ontology3 , is our own ontology of family
relations. It is smaller, but it is particularly useful in this use case as it generates a number
of explanations of size higher than one. LUBM ontology [8], is a standard benchmark.
Beer ontology4 was chosen. Both LUBM and Beer were chosen because of their larger
size compared to the Family ontology, but on the other hand as in the case of many real
world ontologies their axiomatic structure is less complex which implies that most if not
all explanations are of size one.


                             Table 2. Parameters of the ontologies

                          Ontology        Concepts Roles Individuals Axioms
                          Family ontology   10       1        0        28
                          Beer ontology     58       9        0       165
                          LUBM              43      25        0       243


    All experiments were done on a virtual machine with 8 cores (16 threads) of the pro-
cessor Intel® Xeon® CPU E5-2695 v4, 32 GB RAM, 2.10GHz, running Ubuntu 18.04.1,
while the maximum Java heap size was set to 4GB. Execution times were measured in
Java using ThreadMXBean from the java.lang.management package. We used user
time, that is the actual time without system overhead.

4.2    Experiment 1
The first experiment is focused on comparing both methods (MHS and MXP) on a sim-
plified task of finding all explanations of size one. We focus on this class of explanations
for two reasons. Firstly, many real-world ontologies feature only weak axiomatization,
if any, implying that most explanations, if not all, will be of size one.
     Secondly, both MHS and MXP can be used to find all explanations of size one quite
effectively. MHS – which is complete but normally quite ineffective – can be depth-
bound and stopped after it explored the search space up to size one, which takes consid-
erably less time than to explore the rest of the search space. MXP, in turn, is incomplete
in general, but it does not lose any explanations of size one.
     For each ontology Figure 1 plots the average time of execution over ten observations
for each of the three reasoners and for both methods. In case of the smallest Family
ontology Figure 1 (a) shows that MHS is always quicker than MXP regardless of the
DL reasoner used. Among the reasoners Pellet is the quickest when paired with MHS,
however JFact is the quickest when paired with MXP. In all three cases MXP has found
one additional explanation (of size 2) than MHS for one of the observations.
     The results for medium-sized Beer ontology are plotted in Figure 1 (b). We observe
that MHS is quicker than MXP with Pellet and HermiT, but in case of JFact MXP is
quicker. Again, Pellet is the quickest when paired with MHS, and JFact is the quick-
est when paired with MXP, however the performance of HermiT is improved for this
 3
     http://dai.fmph.uniba.sk/~pukancova/aaa/ont/family2.owl
 4
     https://www.cs.umd.edu/projects/plus/SHOE/onts/beer1.0.html
0.7                                                                   80
                                   17.5                               70
0.6
                                   15.0                               60
0.5                                12.5                               50
0.4                                10.0                               40
0.3                                 7.5                               30
0.2                                 5.0                               20
0.1                                 2.5                               10
0.0                                 0.0                                0
       HermiT    JFact    Pellet          HermiT   JFact    Pellet         HermiT   JFact   Pellet
            (a) Family                         (b) Beer                        (c) LUBM

                   Fig. 1. Result times in seconds for Exp. 1:       MHS      MXP


larger ontology. Both approaches have found the same amount of explanations for each
observation.
    The LUBM ontology is the largest one in our use case. The plot in Figure 1 (c) shows
that the results are essentially similar to the case of the Beer ontology. The only two cases
which took significantly greater time were those of MHS/JFact and MXP/Pellet. We can
also observe that the comparative performance of HermiT is further improved. Again,
both approaches have found the same amount of explanations for each observation.

4.3     Experiment 2
The goal of this experiment was to compute as much explanations as possible using
MHS and MXP. We compare execution times and the number of explanations found.
As runtimes of MHS (especially for larger ontologies) may be too high, a timeout of 12
hours (43200 seconds) was set. MXP terminated way faste in all runs, however, as we
know, it does not explore the whole search space. The results are plotted in Figure 2.
    The Family ontology (a) is much smaller compared to the other two, hence all runs
finished before the timeout. Its richer axiomatization generates a number of explanations
of size higher than one. Hence while MXP took shorter time than MHS it found a smaller
number of explanations. Still, on average it found 82.35 % of the explanations found by
MHS. On this smaller ontology once again Pellet was the quickest when paired with
MHS while JFact was the quickest when paired with MXP.
    The results for the Beer and LUBM ontologies (Figures 2, b–c) are similar. They
show that even in case of medium size ontologies it simply is not feasible for MHS to
explore the whole search space as it either reached the timeout (the cases of HermiT and
JFact) or it exceeded the memory (the case of Pellet).5 Note that it may be feasible to
use MHS in cases when the set of abducibles is further reduced, i.e., if the user knows
or is able to guess beforehand in which part of the search space to look for possible
explanations. Such cases are out of the scope of this paper.
 5
     Note that if the time out was reached we recorded 12 hours, but if the memory was exceeded
     we recorded the time at the end of the last completed depth of the HS-tree – hence the results
     for Pellet in Figure 2 (b–c) are only seemingly good.
1.4 100.0%                                  12h 100.0%        100.0%
                                                                                         12h 100.0%         100.0%


1.2               100.0%                    10h                                          10h
1.0                                          8h                                           8h
0.8      82.35%                100.0%        6h                                           6h
0.6                                82.35%
                                             4h                                           4h
0.4                   82.35%                                                                                             100.0%

0.2                                          2h                                           2h
                                                                           100.0%
0.0                                          0h      100.0%       100.0%        100.0%    0h       100.0%       100.0%       100.0%
    HermiT         JFact        Pellet            HermiT       JFact        Pellet              HermiT       JFact        Pellet
            (a) Family                                   (b) Beer                                     (c) LUBM

Fig. 2. Results in seconds (a) and hours (b–c) for Exp. 2:                                     MHS (depth 1–6)              MXP


    In contrast, MXP only took a fraction of this time. (Note that while exact times of
MXP runs are not readable from Figure 2 (b–c) due to scale, they are the same as in
Figure 1 (b–c)). Both approaches found exactly all explanations of size one in each run
(and none other). Since both ontologies feature only weak axiomatization it is unlikely
that there are any larger explanations.
    Looking at the MHS results of the three reasoners we observe that HermiT was the
most successful in terms of being able to explore the largest part of the search space
before the timeout. Interestingly, this is in contrast with the results on the Family ontol-
ogy which may indicate that it likely features some initial overhead which is outweighed
when the reasoning task grows larger.

5     Conclusions
In this work we have focused on comparison of MHS and MXP on the task of ABox
abduction. We have implemented both approaches and conducted a preliminary evalu-
ation. Our evaluation shows that in cases when there are only explanations of size one,
or one is only interested in this size of explanations MHS is more effective. In cases
when larger explanations cannot be ruled out MXP can be used for fast but incomplete
query; in our tests we were able to obtain over 80 % of explanations in this way. MHS
is complete but it is much slower. Even on medium size ontologies such as Beer and
LUBM MHS did not terminate before the 12 hour timeout.
    As we paired our solvers with three different DL reasoners in all experiments, we
were able to compare the performance of these reasoners as well. Out of them HermiT
was the most effective as it was able to search through the largest part of the search space
before the 12 hour timeout expired.
    We have also described a combined approach which leverages on the effectivity of
MXP and uses MHS to steer the search and retain completeness. Empirical evaluation
of this approach is an ongoing effort.

Acknowledgements. This work presents the results of the Master’s thesis by Katarína
Fabianová [7]. It was supported by the Slovak national project VEGA 1/0778/18.
References
 1. Baader, F., Calvanese, D., McGuinness, D.L., Nardi, D., Patel-Schneider, P.F. (eds.): The
    Description Logic Handbook: Theory, Implementation, and Applications. Cambridge Uni-
    versity Press (2003)
 2. Castano, S., Espinosa Peraldí, I.S., Ferrara, A., Karkaletsis, V., Kaya, A., Möller, R., Mon-
    tanelli, S., Petasis, G., Wessel, M.: Multimedia interpretation for dynamic ontology evolution.
    J. Log. Comput. 19(5), 859–897 (2009)
 3. Del-Pinto, W., Schmidt, R.A.: Forgetting-based abduction in . In: Proceedings of the
    Workshop on Second-Order Quantifier Elimination and Related Topics (SOQE 2017), Dres-
    den, Germany. CEUR-WS, vol. 2013, pp. 27–35 (2017)
 4. Du, J., Qi, G., Shen, Y., Pan, J.Z.: Towards practical ABox abduction in large description
    logic ontologies. Int. J. Semantic Web Inf. Syst. 8(2), 1–33 (2012)
 5. Elsenbroich, C., Kutz, O., Sattler, U.: A case for abductive reasoning over ontologies. In:
    Proceedings of the OWLED*06 Workshop on OWL: Experiences and Directions, Athens,
    GA, US. CEUR-WS, vol. 216 (2006)
 6. Espinosa Peraldí, I.S., Kaya, A., Möller, R.: Formalizing multimedia interpretation based on
    abduction over description logic ABoxes. In: Proceedings of the 22nd International Workshop
    on Description Logics (DL 2009), Oxford, UK. CEUR-WS, vol. 477 (2009)
 7. Fabianová, K.: Optimization of an abductive reasoner for description logics. Master’s thesis,
    Comenius University in Bratislava (2019), to appear
 8. Guo, Y., Pan, Z., Heflin, J.: LUBM: A benchmark for OWL knowledge base systems. Journal
    of Web Semantics 3(2-3), 158–182 (2005)
 9. Halland, K., Britz, K.: Abox abduction in  using a DL tableau. In: 2012 South African
    Institute of Computer Scientists and Information Technologists Conference, SAICSIT ’12,
    Pretoria, South Africa. pp. 51–58 (2012)
10. Halland, K., Britz, K.: Naïve ABox abduction in  using a DL tableau. In: Proceedings of
    the 2012 International Workshop on Description Logics, DL 2012, Rome, Italy. CEUR-WS,
    vol. 846 (2012)
11. Junker, U.: QUICKXPLAIN: Preferred explanations and relaxations for over-constrained
    problems. In: Proceedings of the Nineteenth National Conference on Artificial Intelligence,
    Sixteenth Conference on Innovative Applications of Artificial Intelligence, San Jose, Califor-
    nia, US. pp. 167–172. AAAI Press (2004)
12. Klarman, S., Endriss, U., Schlobach, S.: ABox abduction in the description logic . Jour-
    nal of Automated Reasoning 46(1), 43–80 (2011)
13. Ma, Y., Gu, T., Xu, B., Chang, L.: An ABox abduction algorithm for the description logic
    . In: Intelligent Information Processing VI – 7th IFIP TC 12 International Conference,
    IIP 2012, Guilin, China. Proceedings. IFIP AICT, vol. 385, pp. 125–130. Springer (2012)
14. Mrózek, D., Pukancová, J., Homola, M.: ABox abduction solver exploiting multiple DL rea-
    soners. In: Proceedings of the 31st International Workshop on Description Logics, Tempe,
    Arizona, US. CEUR-WS, vol. 2211 (2018)
15. Palmisano, I.: JFact DL reasoner. http://jfact.sourceforge.net/
16. Peirce, C.S.: Deduction, induction, and hypothesis. Popular science monthly 13, 470–482
    (1878)
17. Pukancová, J., Homola, M.: Tableau-based ABox abduction for the  description
    logic. In: Proceedings of the 30th International Workshop on Description Logics, Montpellier,
    France. CEUR-WS, vol. 1879 (2017)
18. Pukancová, J., Homola, M.: ABox abduction for description logics: The case of multiple ob-
    servations. In: Proceedings of the 31st International Workshop on Description Logics, Tempe,
    Arizona, US. CEUR-WS, vol. 2211 (2018)
19. Reiter, R.: A theory of diagnosis from first principles. Artificial intelligence 32(1), 57–95
    (1987)
20. Shchekotykhin, K.M., Jannach, D., Schmitz, T.: MergeXplain: Fast computation of multiple
    conflicts for diagnosis. In: Proceedings of the Twenty-Fourth International Joint Conference
    on Artificial Intelligence, IJCAI 2015, Buenos Aires, Argentina. AAAI Press (2015)
21. Shearer, R., Motik, B., Horrocks, I.: Hermit: A highly-efficient OWL reasoner. In: Proceed-
    ings of the Fifth OWLED Workshop on OWL: Experiences and Directions, Karlsruhe, Ger-
    many. CEUR-WS, vol. 432 (2008)
22. Sirin, E., Parsia, B., Cuenca Grau, B., Kalyanpur, A., Katz, Y.: Pellet: A practical OWL-DL
    reasoner. Journal of Web Semantics 5(2), 51–53 (2007)

</pre>