=Paper=
{{Paper
|id=Vol-3263/paper-13
|storemode=property
|title=Hybrid MHS-MXP ABox Abduction Solver: First Empirical Results
|pdfUrl=https://ceur-ws.org/Vol-3263/paper-13.pdf
|volume=Vol-3263
|authors=Martin Homola,Júlia Pukancová,Iveta Balintová,Janka Boborová
|dblpUrl=https://dblp.org/rec/conf/dlog/HomolaPBB22
}}
==Hybrid MHS-MXP ABox Abduction Solver: First Empirical Results==
<pdf width="1500px">https://ceur-ws.org/Vol-3263/paper-13.pdf</pdf>
<pre>
Hybrid MHS-MXP ABox Abduction Solver: First
Empirical Results
Martin Homola, Júlia Pukancová, Iveta Balintová and Janka Boborová
Comenius University in Bratislava, Mlynská dolina, 842 41 Bratislava, Slovakia


                                      Abstract
                                      MHS-MXP is a hybrid algorithm which adopts the divide-and-conquer strategy of fast but incomplete
                                      MergeXplain (MXP) and combines it with Minimal Hitting Set (MHS) to regain completeness. We describe
                                      a class of inputs on which MHS-MXP has an advantage. We provide an experimental implementation
                                      which enables us to perform first preliminary empirical evaluation on this class of inputs.

                                      Keywords
                                      ABox abduction, description logics, ontologies


1. Introduction
In the context of DL, ABox abduction [1] assumes a DL KB 𝒦 and an extensional observation 𝑂
(in form of an ABox assertion). Explanations (also extensional) are sets of ABox assertions ℰ
such that 𝒦 together with ℰ entails 𝑂. The MHS algorithm [2] is the classic method to find all
explanations of an ABox abduction problem. MHS is mostly applied relying on an external DL
reasoner for consistency checking as a black box.
   MHS is complete. It searches through the space of all possible explanations, from the smallest
(in terms of cardinality) towards the largest. This approach is useful, particularly in cases
where there are smaller explanations. However in cases where there is (even a small number
of) explanations of larger size this search strategy may be inefficient. Notably, the worst-case
complexity of the underlying problems is also hard. The MHS problem is NP-complete [3] and
consistency checking of DL KBs repeatedly called by MHS depends on the particular DL, but
for many DLs it may be exponential or worse.
   Alternative strategies were explored. QuickXplain (QXP) [4], and more recently its extension
MergeXplain [5] employ the divide and conquer strategy which allows to find one (QXP) or
even multiple explanations (MXP) very efficiently. On the other hand, these methods are not
complete, i.e., there is no warranty that all explanations will be found.
   However, when MXP is run repeatedly, with slightly modified inputs, it divides the search
space differently and it may return a different set of explanations than in the previous runs.
Based on this key observation, we have proposed a combined algorithm, dubbed MHS-MXP [6],

              DL 2022: 35th International Workshop on Description Logics, August 7–10, 2022, Haifa, Israel
Envelope-Open homola@fmph.uniba.sk (M. Homola); pukancova@fmph.uniba.sk (J. Pukancová); balintova37@uniba.sk
(I. Balintová); boborova3@uniba.sk (J. Boborová)
GLOBE https://dai.fmph.uniba.sk/~homola/ (M. Homola); https://dai.fmph.uniba.sk/~pukancova/ (J. Pukancová)
Orcid 0000-0001-6384-9771 (M. Homola)
                                    © 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
 CEUR
 Workshop
 Proceedings
               http://ceur-ws.org
               ISSN 1613-0073
                                    CEUR Workshop Proceedings (CEUR-WS.org)
that iterates runs of MXP and it uses MHS to steer the search space exploration in such a way
that completeness is retained.
   In the current report we study relevant properties of MHS-MXP that help not only to establish
its correctness, but also hint the class of inputs on which it may have an advantage over MHS.
We conjecture, that one such class of favourable inputs are those with a smaller number of
shorter explanations. We provide a preliminary implementation and show first empirical results
that supports the above conjecture.
   The main advantage of our work compared to some other promising approaches in DL
abduction [7, 8] lies in being a truly black box approach and thus it may be immediately paired
with any DL for which a suitable reasoner is available (provided that a model can be extracted
from it, see below). In fact, the approach is not limited to DLs and it can be applied in any case
in which MHS is applicable. Different approaches to DL abduction, on the other hand, have
other advantages, e.g. they are able to provide for richer abducibles than our approach.


2. Preliminaries
We assume familiarity with the basics of DL [9], including vocabulary consisting of individuals
𝑁I = {𝑎, 𝑏, …}, roles 𝑁R = {𝑃, 𝑄, …}, and atomic concepts 𝑁C = {𝐴, 𝐵, …}; complex concepts
𝐶, 𝐷, … built by constructors (e.g. ¬, ⊓, ⊔, ∃, ∀, in case of 𝒜 ℒ 𝒞 [10]); a KB 𝒦 = 𝒯 ∪ 𝒜
composed of a TBox 𝒯 (with subsumption axioms of the form 𝐶 ⊑ 𝐷) and an ABox 𝒜 (with
concept assertions of the form 𝐶(𝑎) and (possibly negated [11]) role assertions of the form 𝑅(𝑎)
and ¬𝑅(𝑎)). We also remind about the semantics that relies on models 𝑀 of a KB 𝒦, that satisfy
all axioms or assertions 𝜙 in 𝒦 (𝑀 ⊧ 𝜙); and the reasoning tasks of checking the consistency of
𝒦 (if it has a model) and entailment (𝒦 ⊧ 𝜙 if 𝑀 ⊧ 𝜙 for all its models 𝑀).
   In ABox abduction [1], we are given a KB 𝒦 and an observation 𝑂 consisting of an ABox
assertion. The task is to find an explanation ℰ, again, consisting of ABox assertions, such that
𝒦 ∪ ℰ ⊧ 𝑂. Explanations are drawn from some set of abducibles Abd.

Definition 1 (ABox Abduction Problem). Let Abd be a finite set of ABox assertions. An ABox
abduction problem is a pair 𝒫 = (𝒦 , 𝑂) such that 𝒦 is a knowledge base in DL and 𝑂 is an ABox
assertion. An explanation of 𝒫 (on Abd) is any finite set of ABox assertions ℰ ⊆ Abd such that
𝒦 ∪ ℰ ⊧ 𝑂.

  We limit the explanations to atomic and negated atomic concept and role assertions; hence
Abd ⊆ {𝐴(𝑎), ¬𝐴(𝑎) ∣ 𝐴 ∈ 𝑁C , 𝑎 ∈ 𝑁I } ∪ {𝑅(𝑎, 𝑏), ¬𝑅(𝑎, 𝑏) ∣ 𝑅 ∈ 𝑁R , 𝑎, 𝑏 ∈ 𝑁I }. Note that we do
not limit the observations, apart from allowing only one (possibly complex) ABox assertion.
  According to Elsenbroich et al. [1] it is reasonable to require from each explanation ℰ of
𝒫 = (𝒦 , 𝑂) to be: (a) consistent (𝒦 ∪ ℰ is consistent); (b) relevant (ℰ ⊧ ̸ 𝑂); and (c) explanatory
(𝒦 ⊧ ̸ 𝑂). Explanations that satisfy these three conditions will be called desired. In addition, in
order to avoid excess hypothesizing, minimality is required.

Definition 2 (Minimality). Assume an ABox abduction problem 𝒫 = (𝒦 , 𝑂). Given explana-
tions ℰ and ℰ ′ of 𝒫, ℰ is (syntactically) smaller than ℰ ′ if ℰ ⊆ ℰ ′ .An explanation ℰ of 𝒫 is
(syntactically) minimal if there is no other explanation ℰ ′ of 𝒫 that is smaller than ℰ.
Algorithm 1 MHS(𝒦,𝑂,Abd)
Input: Knowledge base 𝒦, observation 𝑂, abducibles Abd
Output: 𝒮ℰ all explanations of 𝒫 = (𝒦 , 𝑂) w.r.t. Abd

 1: 𝑀 ← a model 𝑀 of 𝒦 ∪ {¬𝑂}                            10:        prune 𝑛
 2: if 𝑀 = n u l l then                                  11:    else if exists model 𝑀 of 𝒦 ∪ {¬𝑂} ∪ 𝐻 (𝑛) then
 3:     return ” n o t h i n g t o e x p l a i n ”       12:        label 𝑛 by 𝐿(𝑛) ← Abd(𝑀)
 4: end if                                               13:    else if 𝐻 (𝑛) is desired then
 5: 𝑇 ← (𝑉 = {𝑟}, 𝐸 = ∅, 𝐿 = {𝑟 ↦ Abd(𝑀)})               14:        𝒮ℰ ← 𝒮ℰ ∪ {𝐻 (𝑛)}
 6: for each 𝜎 ∈ 𝐿(𝑟) create new 𝜎-successor 𝑛𝜎 of 𝑟     15:    end if
 7: 𝒮ℰ ← {}                                              16:    for each 𝜎 ∈ 𝐿(𝑛) create new 𝜎-successor 𝑛𝜎 of 𝑛
 8: while exists next node 𝑛 in 𝑇 w.r.t. BFS do          17: end while
 9:     if 𝑛 can be pruned then                          18: return 𝒮ℰ


3. Computing Explanations
We now review first the complete MHS algorithm and then the faster but approximative MXP
algorithm. The hybrid approach that tries to combine “the best of both worlds” is then introduced
in Section 4.

3.1. Minimal Hitting Set
Adopting the well-known result of Reiter [2], computing all minimal explanations of (𝒦 , 𝑂)
reduces to finding all minimal hitting sets of the set of models of 𝐾 ∪ {¬𝑂} in the following
sense. Also, if some of the models contain no abducibles then there are no explanations.

Observation 1. The minimal explanations of (𝒦 , 𝑂) on Abd directly corresponds to the minimal
hitting sets of {Abd(𝑀) ∣ 𝑀 ⊧ 𝒦 ∪ {¬𝑂}} where Abd(𝑀) = {𝜙 ∈ Abd ∣ 𝑀 ⊧ ̸ 𝜙}.

Observation 2. If Abd(𝑀) = ∅ for some 𝑀 ⊧ 𝒦 ∪ {¬𝑂}, then (𝒦 , 𝑂) has no explanations on
Abd.

   The MHS algorithm (Algorithm 1) works by constructing an HS-tree 𝑇 = (𝑉 , 𝐸, 𝐿) in which
each node is labelled by Abd(𝑀) for some model 𝑀 of 𝒦 ∪ {¬𝑂} and whose edges are labelled by
elements of the parent’s label. If a node 𝑛1 ∈ 𝑉 has a successor 𝑛2 ∈ 𝑉 such that 𝐿(⟨𝑛1 , 𝑛2 ⟩) = 𝜎
then 𝑛2 is a 𝜎-successor of 𝑛1 .
   The HS-tree has the property that the node-label 𝐿(𝑛) and the union 𝐻 (𝑛) of the edge-labels
on the path from the root 𝑟 to each node 𝑛 are disjoint. Each label 𝐿(𝑛) can be found as Abd(𝑀)
of any model of 𝒦 ∪ {¬𝑂} ∪ 𝐻 (𝑛), by one call to an external DL reasoner. If no such model 𝑀
exists then 𝐻 (𝑛) corresponds to a hitting set. Note that if 𝑀 exists but Abd(𝑀) = ∅, then in
accord with Observation 2 𝐻 (𝑛) cannot be extended to a hitting set.
   Subset pruning eliminates non-minimal hitting sets: given a hitting set 𝐻 (𝑛), nodes 𝑛′ with
𝐻 (𝑛) ⊆ 𝐻 (𝑛′ ) are pruned. Equal-paths pruning prunes also nodes 𝑛′ with 𝐻 (𝑛) = 𝐻 (𝑛′ ), even
if 𝐻 (𝑛) is not a hitting set. Once completed, a pruned HS-tree contains all minimal hitting sets
[2]. MHS is sound and complete [2, 12].
Algorithm 2 MXP(ℬ,𝒞)
Input: background theory ℬ, set of possibly faulty constraints 𝒞
Output: a set of minimal conflicts Γ

 1: if ¬isConsistent(ℬ) then                             19:       𝑋 ← GetConflict(ℬ ∪ 𝒞2′ , 𝒞2′ , 𝒞1′ )
 2:     return ” n o e x p l a n a t i o n ”             20:       𝛾 ← 𝑋 ∪ GetConflict(ℬ ∪ 𝑋 , 𝑋 , 𝒞2′ )
 3: else if isConsistent(ℬ ∪ 𝒞 ) then                    21:       𝒞1′ ← 𝒞1′ \{𝜎} where 𝜎 ∈ 𝑋
 4:     return ∅                                         22:       Γ ← Γ ∪ {𝛾 }
 5: end if                                               23:    end while
 6: ⟨ , Γ⟩ ← FindConflicts(ℬ, 𝒞)                         24:    return ⟨𝒞1′ ∪ 𝒞2′ , Γ⟩
 7: return Γ                                             25: end function
 8: function FindConflicts(ℬ, 𝒞)                         26: function GetConflict(ℬ, 𝐷, 𝒞)
 9:    if isConsistent(ℬ ∪ 𝒞 ) then                      27:    if 𝐷 ≠ ∅ ∧ ¬isConsistent(ℬ) then
10:         return ⟨𝒞 , ∅⟩                               28:        return ∅
11:    else if |𝒞 | = 1 then                             29:    else if |𝒞 | = 1 then
12:         return ⟨∅, {𝒞 }⟩                             30:        return 𝒞
13:    end if                                            31:    end if
14:    Split 𝒞 into disjoint, non-empty sets 𝒞1 and 𝒞2   32:    Split 𝒞 into disjoint, non-empty sets 𝒞1 and 𝒞2
15:    ⟨𝒞1′ , Γ1 ⟩ ← FindConflicts(ℬ, 𝒞1 )               33:    𝐷2 ← GetConflict(ℬ ∪ 𝒞1 , 𝒞1 , 𝒞2 )
16:    ⟨𝒞2′ , Γ2 ⟩ ← FindConflicts(ℬ, 𝒞2 )               34:    𝐷1 ← GetConflict(ℬ ∪ 𝐷2 , 𝐷2 , 𝒞1 )
17:    Γ ← Γ 1 ∪ Γ2                                      35:    return 𝐷1 ∪ 𝐷2
18:    while ¬isConsistent(𝒞1′ ∪ 𝒞2′ ∪ ℬ) do             36: end function


Theorem 1. The MHS algorithm is sound and complete (i.e., it returns the set 𝒮ℰ of all minimal
desired explanations of 𝒦 and 𝑂 on Abd).

   The fact that MHS explores the search space using breadth-first search (BFS) allows to limit
the search for explanations by maximum size. The algorithm is still complete w.r.t. any given
target size [12].

3.2. MergeXplain
Both QXP [4] and MXP [5] were originally designed to find minimal inconsistent subsets
(dubbed conflicts) of an over-constrained knowledge base 𝒦 = ℬ ∪ 𝒞, where ℬ is the consistent
background theory and 𝒞 is the “suspicious” part from which the conflicts are drawn. The
algorithm is listed in Algorithm 2.
   The essence of QXP is captured in the GetConflict function, which cleverly decomposes 𝒞
by splitting it into smaller and smaller subsets such that it is always able to reconstruct one
minimal conflict, if it only exists. The auxiliary function isConsistent(𝒦) encapsulates calls to
an external reasoner; it returns true if 𝒦 is consistent and false otherwise.
   Thus, if we just wanted to find one minimal explanation of an ABox abduction problem,
adopting a result of Junker [4] we could use GetConflict in the following way.

Theorem 2. Assume an ABox abduction problem 𝒫 = (𝒦 , 𝑂) and a set of abducibles Abd. If
there is at least one explanation 𝛾 ⊆ Abd of 𝒫 then calling GetConflict(𝒦 ∪{¬𝑂}, 𝒦 ∪{¬𝑂}, Abd)
returns some minimal explanation 𝛿 ⊆ Abd of 𝒫.
   MXP, captured in the FindConflicts function, finds as many conflicts as it is possible to
reconstruct from one way in which 𝒞 can be split. MXP relies on GetConflict to recover some
of the conflicts that would be lost due to splitting. This ensures that it keeps the important
property of QXP that at least one minimal is found in each run, if it exists.
   This approach can be immediately adopted for ABox abduction: in order to find explanations
for an abduction problem 𝒫 = (𝒦 , 𝑂) on Abd one needs to call MXP(𝒦 ∪ {¬𝑂}, Abd). This
observation allows us to adopt the following result from Shchekotykhin et al. [5]:

Theorem 3. Assume an ABox abduction problem 𝒫 = (𝒦 , 𝑂) and a set of abducibles Abd.
If there is at least one explanation 𝛾 ⊆ Abd of 𝒫 then calling MXP(𝒦 ∪ {¬𝑂}, Abd) returns a
nonempty set Γ of minimal explanations of 𝒫.

  In fact, MXP is thorough in its decomposition of 𝒞, which is broken to smaller and smaller
subsets until they are consistent with ℬ or until only sets of size 1 remain. This directly implies
that all conflicts of size 1 will always be found and returned by a single run of MXP. This
observation will prove to be useful for our hybrid algorithm.

Observation 3. Given an ABox abduction problem 𝒫 = (𝒦 , 𝑂), a set of abducibles Abd, and
any 𝛾 ⊆ Abd s.t. |𝛾 | = 1, if 𝒦 ∪ 𝛾 ⊧ 𝑂 then 𝛾 ∈ MXP(𝒦 ∪ {¬𝑂}, Abd).

   Thus MXP is sound and it always finds at least one minimal explanation (Theorem 3), and it
finds all explanations of size one (Observation 3). Still, MXP is not complete. Some explanations
may be lost, especially in cases with multiple partially overlapping explanations.

Example 1. Let 𝒦 = {𝐴 ⊓ 𝐵 ⊑ 𝐷, 𝐴 ⊓ 𝐶 ⊑ 𝐷} and let 𝑂 = 𝐷(𝑎). Let us ignore negated ABox
expressions and start with Abd = {𝐴(𝑎), 𝐵(𝑎), 𝐶(𝑎)}. There are two minimal explanations of
𝒫 = (𝒦 , 𝑂): {𝐴(𝑎), 𝐵(𝑎)}, and {𝐴(𝑎), 𝐶(𝑎)}. Calling MXP(𝒦 ∪ {¬𝑂}, Abd), it passes the initial
tests and calls FindConflicts(𝒦 ∪ {¬𝑂}, Abd).
   FindConflicts needs to decide how to split 𝒞 = Abd into 𝒞1 and 𝒞2 . Let us assume the
split was 𝒞1 = {𝐴(𝑎)} and 𝒞2 = {𝐵(𝑎), 𝐶(𝑎)}. Since both 𝒞1 and 𝒞2 are now conflict-free w.r.t.
𝒦 ∪ {¬𝑂}, the two consecutive recursive calls return ⟨𝒞1′ , ∅⟩ and ⟨𝒞2′ , ∅⟩ where 𝒞1′ = {𝐴(𝑎)} and
𝒞2′ = {𝐵(𝑎), 𝐶(𝑎)}.
   In the while loop, GetConflict(𝒦 ∪ {¬𝑂} ∪ {𝐵(𝑎), 𝐶(𝑎)}, {𝐵(𝑎), 𝐶(𝑎)}, {𝐴(𝑎)}) returns 𝑋 = {𝐴(𝑎)}
while GetConflict(𝒦 ∪{¬𝑂}∪{𝐴(𝑎)}, {𝐴(𝑎)}, {𝐵(𝑎), 𝐶(𝑎)}) returns 𝐵(𝑎), and hence the first conflict
𝛾 = {𝐴(𝑎), 𝐵(𝑎)} is found and added into Γ.
   However, consecutively 𝐴(𝑎) is removed from 𝒞1′ leaving it empty, and thus the other conflict is
not found and Γ = {{𝐴(𝑎), 𝐵(𝑎)}} is returned.

  Finally, not only is MXP able to find all explanations of size 1, but it also has the property
that if no longer explanations are returned in a given run then in fact this is because there are
none. So in such a case we are sure that we have found all explanations whatsoever in a single
run and we do not have to search any further.

Theorem 4. Given an ABox abduction problem 𝒫 = (𝒦 , 𝑂), a set of abducibles Abd, let Γ =
MXP(𝒦 ∪ {¬𝑂}, Abd). If there is no 𝛾 ∈ Γ s.t. |𝛾 | > 1, then for all minimal 𝛿 ⊆ Abd s.t. 𝒦 ∪ 𝛿 ⊧ 𝑂
we have that 𝛿 ∈ Γ.
Algorithm 3 MHS-MXP(𝒦,𝑂,Abd)
Input: knowledge base 𝒦, observation 𝑂, set of abducibles Abd
Output: set 𝒮ℰ of all explanations of 𝒫 = (𝒦 , 𝑂) of the class Abd
 1: Con ← {}                                                                         ▷ Set of conflicts
 2: Mod ← {}                                                                  ▷ Set of cached models
 3: if ¬isConsistent(𝒦 ∪ {¬𝑂}) then
 4:     return ” n o t h i n g t o e x p l a i n ”
 5: else if Abd(𝑀) = ∅ where Mod = {𝑀} then
 6:     return 𝒮ℰ = ∅
 7: end if
 8: 𝑇 ← (𝑉 = {𝑟}, 𝐸 = ∅, 𝐿 = ∅)                                                       ▷ Init. HS-Tree
 9: while there is next node 𝑛 in 𝑇 w.r.t. BFS do
10:     if 𝛾 ⊈ 𝐻 (𝑛) for all 𝛾 ∈ Con then                                    ▷ Otherwise 𝑛 is pruned
11:         ⟨ , Γ⟩ ← FindConflicts(𝒦 ∪ {¬𝑂} ∪ 𝐻 (𝑛), Abd ⧵ 𝐻 (𝑛))
12:         Con ← Con ∪ {𝐻 (𝑛) ∪ 𝛾 ∣ 𝛾 ∈ Γ}
13:         if Γ ≠ ∅ and ∃𝛾 ∈ Γ ∶ |𝛾 | > 1 then                         ▷ HS-tree is extended under 𝑛
14:             𝐿(𝑛) ← Abd(𝑀) ⧵ 𝐻 (𝑛) for some 𝑀 ∈ Mod s.t. 𝑀 ⊧ 𝐻 (𝑛)
15:             for each 𝜎 ∈ 𝐿(𝑛) create new 𝜎-successor 𝑛𝜎 of 𝑛
16:         end if
17:     end if
18: end while
19: return 𝒮ℰ ← {𝛾 ∈ Con ∣ 𝛾 is desired}
20: function isConsistent(𝒦)
21:    if there is 𝑀 ⊧ 𝒦 then
22:        Mod ← Mod ∪ {𝑀}
23:        return true
24:    else
25:        return false
26:    end if
27: end function


4. Combined MHS-MXP Algorithm
The idea to use MXP to find all explanations is based on the observation that running it multiple
times in a row may result in a consecutive extension of the overall set of conflicts found so far.
A naïve, and possibly to a large extent successful idea, would be to randomize the set splits
MXP does in each recursive call. We would likely find different conflicts each time, however it
would not be clear when to stop.
   We will instead explore a hybrid approach, and we will show that by modifying MXP’s inputs
in its consecutive iterations, the search space exploration can be guided by the construction of
an HS-tree from the obtained outputs, and thus completeness will be achieved.
   The combined MHS-MXP algorithm, listed as Algorithm 3, therefore constructs the HS-tree
𝑇 as usual, but in each node 𝑛, instead of simply retrieving one model of 𝒦 ∪ {¬𝑂} ∪ 𝐻 (𝑛), it
launches MXP by calling FindConflicts.
   It starts by checking the consistency of 𝒦 ∪ {¬𝑂}. We use a modified isConsistent function
which stores all models in the model cache Mod. The stored models are later used to construct
the HS-tree and label its nodes. For this reason we also assume that FindConflicts uses this
modified isConsistent function.
   Then the main loop is initiated. For the root node 𝑟, FindConflicts is simply called passing
𝒦 ∪ {¬𝑂} as the background theory and Abd as the set of conflicts (as 𝐻 (𝑛) = ∅ at this point).
The obtained conflicts Γ are stored in Con. We then verify if all conflicts were already found
or if the search needs to go on (line 13). From Theorem 3 we know that if no conflicts were
returned in Γ, it means there are no conflicts whatsoever. Also from Observation 3 we know
that all conflicts of size 1 are always found and returned in Γ. Finally, by Theorem 4 we have
that if any larger conflicts remain, at least one is also present in Γ. Hence, if there is no 𝛾 ∈ Γ
with |𝛾 | > 1 there are no other explanations to be found and the search can be terminated.
   If however at least one such 𝛾 was returned in Γ then the HS-tree is extended under 𝑟 using
the model 𝑀 that was previously found and stored in Mod.
   When consecutively any other node 𝑛 ≠ 𝑟 is visited by the main loop, we first check if it can
be pruned: it is pruned if 𝐻 (𝑛) contains any of the conflicts already stored in Con. If not, we
now want to use MXP with the goal to explore as much as possible of that part of the space of
explanations that extends 𝐻 (𝑛). Therefore we call FindConflicts passing 𝒦 ∪ {¬𝑂} ∪ 𝐻 (𝑛) as
the background theory and Abd ⧵ 𝐻 (𝑛) as the set of conflicts.
   If we are lucky, we might cut off this branch completely in line 13, that is, if no extension of
𝐻 (𝑛) of size greater than 1 is found (by Theorem 4). Otherwise we extend the HS-tree below 𝑛.
   To be able to do that, we need a model of 𝒦 ∪ {¬𝑂} ∪ 𝐻 (𝑛). However, we do not need to run
another consistency check here, as by design of our algorithm at this point such a model is
already cached in Mod.

Observation 4. For each node 𝑛 of the HS-tree visited by the main loop of MHS-MXP(𝒦 , 𝑂, Abd)
either 𝐻 (𝑛) ∈ Con or 𝒦 ∪ {¬𝑂} ∪ 𝐻 (𝑛) is consistent and at least for one 𝑀 ∈ Mod, 𝑀 ⊧ 𝒦 ∪ {¬𝑂} ∪
𝐻 (𝑛).

   For the root 𝑟 this trivially holds as it was only visited (as the first node in the main loop) if
the consistency check in line 3 was positive, i.e. a suitable model was found and cached.
   For any other node 𝑛, this holds due to FindConflicts was previously called in the parent node
 ′
𝑛 of 𝑛, and from Observation 3 we know that during that call all possible inconsistent extensions
of 𝐻 (𝑛′ ) of size 1 were added to Con. So, if 𝑛 was not pruned in line 10, 𝐻 (𝑛) = 𝐻 (𝑛′ ) ∪ {𝜎 } must
be consistent with 𝒦 ∪ {¬𝑂}. Moreover, since FindConflicts did not prove {𝜎 } being a conflict
for 𝒦 ∪ {¬𝑂} ∪ 𝐻 (𝑛′ ), at some point it must have checked the consistency of 𝒦 ∪ {¬𝑂} ∪ 𝐻 (𝑛′ )
together with {𝜎 } or some superset thereof with a positive result, and at that point the respective
model was added to Mod.
   Finally, by the time a complete HS-tree is constructed, all explanations are accumulated in
Con. However, due to calls to FindConflicts where (nonempty) 𝐻 (𝑛) was passed together with
𝒦 as the consistent background theory, some of these conflicts in Con may be non-minimal and
they have to be filtered out. At this point we also filter out any other undesired explanations.
Then the remaining minimal and desired explanations are returned as 𝒮ℰ .

Theorem 5. The MHS-MXP algorithm is sound and complete (i.e., it returns the set 𝒮ℰ of all
minimal desired explanations of 𝒦 and 𝑂 on Abd).
5. Advantages and Limitations
First of all, it is immediately apparent from the properties of MXP alone, that our approach
absolutely crushes MHS in cases when all explanations are of size one. By Observation 3 and
Theorem 4 the search may immediately stop after one call to MXP in the root node of the
HS-tree. Without this “look ahead” capability provided to the hybrid algorithm by MXP, pure
MHS has no way of knowing it could stop and has to generate the HS-tree completely.
   Let us now consider some cases when bigger explanations come into play.
Example 2. Let 𝒦 = {𝐴 ⊓ 𝐵 ⊑ 𝐹 , 𝐶 ⊓ 𝐷(𝑎), 𝐸(𝑏)}, let 𝑂 = 𝐹 (𝑎) and let Abd =
{𝐴(𝑎), 𝐵(𝑎), 𝐶(𝑎), 𝐷(𝑎)}. There is exactly one explanation of (𝒦 , 𝑂), ℰ1 = {𝐴(𝑎), 𝐵(𝑎)}.
   If we run MHS-MXP, it first checks 𝒦 ∪ {¬𝐹 (𝑎)} for consistency and it obtains a model 𝑀 thereof,
say one with Abd(𝑀) = {𝐴(𝑎), 𝐶(𝑎)}.
   The call to FindConflicts in the root does not allow to terminate the search, since ℰ1 was
returned and |ℰ1 | > 1. Therefore 𝑛1 and 𝑛2 are added to the HS-tree with 𝐻 (𝑛1 ) = {𝐴(𝑎)} and
𝐻 (𝑛2 ) = {𝐶(𝑎)}.
   When FindConflicts is called in 𝑛1 , it returns one conflict {𝐵(𝑎)} which together with 𝐻 (𝑛1 )
makes up for the explanation ℰ1 . What is more, this branch is consecutively cut off, as no greater
conflicts were found. Notably, further exploration of branches extending 𝐻 (𝑛1 ) with 𝐶(𝑎) and 𝐷(𝑎)
is avoided (in comparison with MHS).
   Then FindConflicts is called in 𝑛2 returning one conflict {𝐴(𝑎), 𝐵(𝑎)}, corresponding to the
non-minimal explanation {𝐶(𝑎), 𝐴(𝑎), 𝐵(𝑎)}. However, since there was a conflict extending 𝐻 (𝑛1 )
by a size greater than one, we may not terminate yet and must explore this branch in the HS-tree
further, until only extensions of size one are returned by MXP in each path.
  The case presented in Example 2 and similar cases with a small overall number of explanations
can still be handled rather efficiently, compared to MHS. It can be observed that a significant
part of the search space is cut off. However consider the following modification of the inputs.
Example 3. Given the 𝒦 and 𝑂 as in Example 2, let Abd = {𝐴(𝑎), 𝐵(𝑎), 𝐶(𝑎), 𝐷(𝑎), 𝐸(𝑎), ¬𝐸(𝑎)}.
The abduction problem (𝒦 , 𝑂) now has two explanations ℰ1 and ℰ2 = {𝐸(𝑎), ¬𝐸(𝑎)}, where the
second one is undesired (inconsistent). When FindConflicts is called in the root node 𝑟 it returns
{{𝐴(𝑎), 𝐵(𝑎)}, {𝐸(𝑎), ¬𝐸(𝑎)}}. W.l.o.g. we may assume that the same model 𝑀 was used to label 𝑟
and that 𝑀 ⊧ ̸ 𝐸(𝑎). This time Abd(𝑀) = {𝐴(𝑎), 𝐶(𝑎), 𝐸(𝑎)} and in addition to 𝑛1 and 𝑛2 as above
also 𝑛3 is generated with 𝐻 (𝑛3 ) = {𝐸(𝑎)}.
    Now the search cannot be immediately cut off after MXP is called in any of the three nodes 𝑛1 ,
𝑛2 , or 𝑛3 . E.g., in 𝑛1 FindConflicts returns {{𝐵(𝑎)}, {𝐸(𝑎), ¬𝐸(𝑎)}}. Only branches where all but
one element from each explanation is already present can be cut off safely.
   The exercise we make in Example 3 shows that the larger the overall amount of explanations
(including also various undesired explanations as ℰ2 ) and also the greater their size, the less
advantage MHS-MXP is likely to retain. It should also be noted that while adding complementary
assertions (inducing inconsistent explanations) to abducibles does not make a difference for
MHS, it does make a difference (towards the worse) for MHS-MXP. This is also true of assertions
that are mutually inconsistent due to the background ontology (and therefore inducing another
type of undesired explanations, called irrelevant).
   The problem of finding all explanations of a given abduction problem is hard. MHS itself is
NP-complete [3], and it still has to call DL reasoning (likely exponential or worse) in every node.
While MHS-MXP provides an advantage on certain inputs we have no reason to suppose it is
substantially better in the worst case. We were however interested to explore the improvement
in the advantageous cases empirically on which we report in the following part of this paper.


6. Implementation and Evaluation
An implementation1 of MHS-MXP was developed in Java using Maven for dependency manage-
ment. Compared to the previous version of the implementation [6], the current version features:
pure MHS mode, observations in form of a set of complex concept assertions, and equal-paths
pruning (in addition to subset pruning).
   An external DL reasoner is called in each node to query about the model. The model data
is extracted through OWL API [13], using the g e t T y p e s method. This approach is preliminary
and it has known issues2 , we are currently working on an alternate approach (cf. Conclusions).
   Using the implementation, we ran3 tests in order to understand if MHS-MXP may have a
certain advantage at least on cases of inputs that we identified as favourable. We have used the
LUBM ontology [14]. It does not contain disjunction, so we are able to test on it even given the
limited model extraction method that we currently use.
   LUBM does not feature atomic concepts that would allow explanations of size greater than 1.
Hence in each input we have set the observation to 𝐷(𝑎) where 𝑎 is an individual and 𝐷 is a
new auxiliary concept name. Then we have randomly generated a set of LUBM concept names
𝐴1 , … , 𝐴𝑛 , and added an equivalence 𝐷 ≡ 𝐴1 ⊓ ⋯ ⊓ 𝐴𝑛 into the input ontology. If 𝐴1 , … , 𝐴𝑛 are
independent concepts with no subconcepts there is exactly one explanation {𝐴1 (𝑎), … , 𝐴𝑛 (𝑎)}.
If 𝐴1 , … , 𝐴𝑛 are not independent the explanation will be shorter. If some of 𝐴1 , … , 𝐴𝑛 have
(independent) subconcepts there will be multiple partially overlapping explanations. We targeted
explanations of size up to 5, hence we generated inputs for 𝑛 ∈ [1..5] – 50 inputs altogether. We
have aggregated the inputs into five groups S1–S5 based on the size of the largest explanation.
The inputs were generated randomly (however LUBM concepts with subconcepts were drawn
twice as often to ensure higher number of explanations). The number of generated samples
was consecutively reduced in order to obtain balanced groups S1–S5, each with 10 inputs.
   To verify the second part of our conjecture, i.e. that MHS-MXP may perform better on inputs
with smaller number of explanations, we also aggregated the same 50 inputs differently, into
1
  The implementation is available at https://github.com/IvetBx/MHS-MXP-algorithm.
2
  We would need to extract all atomic concept assertions that are true in a model (i.e. in a complete completion tree
  constructed during a consistency check). OWL API is not well equipped for this task. The g e t T y p e s method instead
  returns all concept assertions that are entailed for a given individual. In cases e.g. when the ontology contain
  disjunction and 𝒦 ⊧ 𝐴 ⊔ 𝐵(𝑎) this would yield incorrect results. However for simple ontologies without disjunction
  there is always one model in which exactly those atomic concept assertions are true for a given individual which are
  also entailed (which is the model that we will extract). Therefore we may employ this preliminary model extraction
  method in our limited evaluation as described in the following.
3
  The experiments were executed on a virtual machine with 8 cores (16 threads) of the processor Intel Xeon CPU
  E5-2695 v4, 32GB RAM, 2.10GHz, running Ubuntu 20.04 and Oracle Java SE Runtime Environment v1.8.0_201. To
  measure the execution times in Java, ThreadMXBean from the package java.lang.management was used. The user
  time measured the actual time without system overhead. The maximum Java heap size to 4GB.
groups C1–C5 accordingly. Basic characteristics of groups S1–S5 and C1–C5 are given in Table 1.

Table 1
Statistics for input groups: #: number of inputs; Cm , Ca , CM : min, average, and max count of explanations;
Sm , Sa , SM : min, average, and max size of the largest explanation
                        Set # Cm      Ca   CM    Set #    Cm    Ca   CM Sm    Sa SM
                        S1   10 2       8 21     C1 9    2 5.8 10         1 1.11   2
                        S2   10 9 70.5 160       C2 11 16    52 100       1    2   3
                        S3   10 48 213.4 480     C3 17 112 260.3 480      2 3.39   4
                        S4   10 168 418.8 840    C4 4 504 640.5 840       4 4.25   5
                        S5   10 504 2628 6720    C5 9 1176 2864 6720      5    5   5


   Our implementation supports atomic and negated atomic concept assertions as abducibles,
where the latter may be suppressed by a switch. We have used this to obtain an unfavourable case
for MHS-MXP (both atomic and negated atomic concepts allowed) and a favourable case (negated
atomic concepts suppressed). However, all generated inputs only have explanations involving
atomic concept assertions, hence each input has exactly the same number of explanations in
either case (favourable and unfavourable) – only the search space in the unfavourable case (and
the inherent difficulty for MHS-MXP to handle it) is larger.
   Testing the inputs on the unfavourable case did not show significant difference between MHS
and MHS-MXP. For the lack of space we omit details. The results on the favourable case are
plotted in Figure 1. MHS runs are plotted on the left, and MHS-MXP runs are plotted on the
right; results from groups S1–S5 are at the top, while groups C1–C5 are at the bottom.
   To observe the efficiency of search space exploration we measured for each level of the
HS-tree (𝑥-axis) the time by which it is fully completed (𝑦-axis). Group averages are plotted.
Recall that when MHS completes level 𝑛 it has already found all explanations of size up to 𝑛. In
contrast MHS-MXP, after completing level 𝑛, has already found all explanations of size up to
𝑛+1 (cf. Observation 3).
   MHS runs show similar times for S1–S5 up to level 2. S4–S5 reached time out in level 3,
S2–S3 in level 4. Few S1 runs managed to finish levels 4 and 5. (Note that the results for levels 4
and 5 are “polluted” by the fact that only a small number of runs which finished before the time
out are averaged here. However, levels 0–3 show the general trend.) In contrast, MHS-MXP
was able to finish all runs within the same time out, even if sometimes required a few more
levels than the maximal size of explanation for the given input. We can see a clear correlation
between the maximal length of explanations and the required running time, in this respect each
consecutive group between S1 and S5 proved to be harder.
   The results on groups C1–C5 shows very similar trends, proving that MHS-MXP has greater
advantage on inputs with smaller number of explanations.


7. Discussion
We have reported on an ABox abduction algorithm based on hybrid combination of MHS and
MXP. The aim of this approach is to obtain a complete algorithm, that is more efficient than
                        MHS                                       MHS-MXP

     104                                        104                                            all
                                                                                               S5
                                                                                               S4
     103                                        103                                            S3
                                                                                               S2
                                                                                               S1
     102                                        102


     101                                        101


     100                                        100

           0   1   2    3    4    5    6    7         0   1   2    3   4    5    6   7
                        MHS                                       MHS-MXP

     104                                        104                                            all
                                                                                               C5
                                                                                               C4
     103                                        103                                            C3
                                                                                               C2
                                                                                               C1
     102                                        102

                                                101
     101

                                                100
     100
           0   1   2   3    4    5    6    7          0   1   2    3   4    5   6    7

Figure 1: Average time (in the seconds) for computing individual levels from the certain group of inputs


MHS, at least on a class of favourable inputs that we have described. We have implemented the
algorithm and our first preliminary evaluation on the LUBM ontology confirms this conjecture.
   Both MHS and MHS-MXP rely on an external DL reasoner (as a black box) to compute models
of the KB. However DL reasoners do not return models as a standard reasoning service. They do
not even necessarily compute them, however, tableau-based reasoners compute a finite represen-
tation of a model which contains sufficient information for our approach. We currently employ
a simplified workaround for model extraction which does not work in general. It allowed us to
make experiments on LUBM, however we want to extend these to other (real world) ontologies.
We are currently looking into the possibility to use OWLKnowledgeExplorerReasoner extension
in OWL API for true model extension. The advantage of using an API would be that different
(tableau-based) DL reasoners could then be plugged in. Another option would be to use a tight
source-code integration of a DL reasoner such as in the case the MHS solver AAA [15] which
tightly integrates Pellet [16].
Acknowledgments
We would like to express our thanks to anonymous reviewers for their valuable feedback on
this and also on the previous version of this report. This work was supported by the Slovak
Research and Development Agency under the Contract no. APVV-19-0220 (ORBIS) and by the
EU H2020 programme under Contract no. 952215 (TAILOR). Martin Homola is also supported
by projects VEGA 1/0621/22 and APVV-20-0353.


References
 [1] C. Elsenbroich, O. Kutz, U. Sattler, A case for abductive reasoning over ontologies, in:
     Proceedings of the OWLED*06 Workshop on OWL: Experiences and Directions, Athens,
     GA, US, volume 216 of CEUR-WS, 2006.
 [2] R. Reiter, A theory of diagnosis from first principles, Artificial intelligence 32 (1987) 57–95.
 [3] R. M. Karp, Reducibility among combinatorial problems, in: Proceedings of a symposium
     on the Complexity of Computer Computations, March 20–22, 1972, at the IBM Thomas J.
     Watson Research Center, Yorktown Heights, New York., 1972, pp. 85–103.
 [4] U. Junker, QuickXplain: Preferred explanations and relaxations for over-constrained
     problems, in: Proceedings of the Nineteenth National Conference on Artificial Intelli-
     gence, Sixteenth Conference on Innovative Applications of Artificial Intelligence, San Jose,
     California, US, AAAI Press, 2004, pp. 167–172.
 [5] K. M. Shchekotykhin, D. Jannach, T. Schmitz, MergeXplain: Fast computation of mul-
     tiple conflicts for diagnosis, in: Proceedings of the Twenty-Fourth International Joint
     Conference on Artificial Intelligence, IJCAI 2015, Buenos Aires, Argentina, AAAI Press,
     2015.
 [6] M. Homola, J. Pukancová, J. Gablíková, K. Fabianová, Merge, explain, iterate, in: S. Borg-
     wardt, T. Meyer (Eds.), Proceedings of the 33rd International Workshop on Description
     Logics (DL 2020) co-located with the 17th International Conference on Principles of Knowl-
     edge Representation and Reasoning (KR 2020), Online Event [Rhodes, Greece], September
     12th to 14th, 2020, volume 2663 of CEUR Workshop Proceedings, CEUR-WS.org, 2020.
 [7] W. Del-Pinto, R. A. Schmidt, Abox abduction via forgetting in ALC, in: The Thirty-Third
     AAAI Conference on Artificial Intelligence, AAAI 2019, Honolulu, Hawaii, USA, AAAI
     Press, 2019, pp. 2768–2775.
 [8] P. Koopmann, W. Del-Pinto, S. Tourret, R. A. Schmidt, Signature-based abduction for
     expressive description logics, in: Proceedings of the 17th International Conference on
     Principles of Knowledge Representation and Reasoning, KR 2020, Rhodes, Greece, 2020,
     pp. 592–602.
 [9] F. Baader, D. Calvanese, D. L. McGuinness, D. Nardi, P. F. Patel-Schneider (Eds.), The
     Description Logic Handbook: Theory, Implementation, and Applications, Cambridge
     University Press, 2003.
[10] M. Schmidt-Schauß, G. Smolka, Attributive concept descriptions with complements,
     Artificial intelligence 48 (1991) 1–26.
[11] I. Horrocks, O. Kutz, U. Sattler, The even more irresistible 𝒮 ℛ𝒪ℐ 𝒬, in: Proceedings,
     Tenth International Conference on Principles of Knowledge Representation and Reasoning,
     Lake District of the United Kingdom, AAAI, 2006, pp. 57–67.
[12] J. Pukancová, M. Homola, ABox abduction for description logics: The case of multiple
     observations, in: Proceedings of the 31st International Workshop on Description Logics,
     Tempe, Arizona, US, volume 2211 of CEUR-WS, 2018.
[13] M. Horridge, S. Bechhofer, The OWL API: A java API for OWL ontologies, Semantic Web
     2 (2011) 11–21.
[14] Y. Guo, Z. Pan, J. Heflin, LUBM: A benchmark for OWL knowledge base systems, Journal
     of Web Semantics 3 (2005) 158–182.
[15] J. Pukancová, M. Homola, The AAA abox abduction solver, Künstliche Intell. 34 (2020)
     517–522.
[16] E. Sirin, B. Parsia, B. Cuenca Grau, A. Kalyanpur, Y. Katz, Pellet: A practical OWL-DL
     reasoner, Journal of Web Semantics 5 (2007) 51–53.

</pre>