Abduction in EL via Translation to FOL ? Fajar Haifani1 , Patrick Koopmann2 , and Sophie Tourret1,3 1 Max Planck Institute for Informatics, Saarland Informatics Campus, Saarbrücken Germany ({fhaifani,stourret}@mpi-inf.mpg.de) 2 Institute for Theoretical Computer Science, Technische Universität Dresden, Germany (patrick.koopmann@tu-dresden.de) 3 Université de Lorraine, CNRS, Inria, LORIA, Nancy, France Abstract. We present a technique for performing TBox abduction in the description logic EL. The input problem is converted into first-order formulas on which a prime implicate generation technique is applied, then EL hypotheses are reconstructed by combining the generated positive and negative implicates. Keywords: Abduction · Description Logic EL· Prime Implicates · First- order Logic. 1 Introduction Description logic (DL) ontologies are used to formalize terminological knowledge in diverse areas such as medicine, biology, or the semantic web. Common rea- soning tasks for ontologies are subsumption checking, which is to decide whether one description of a concept generalizes another, and classification, which is to compute the entire subsumption hierarchy of an ontology, that is, the set of all subsumption relationships that hold between atomic concepts in the ontology. Both problems can nowadays be computed rather efficiently using highly opti- mized description logic reasoners [24]. Modern ontologies can become very large and complex: for instance, the medical ontology SNOMED CT4 contains over 350,000 statements, while the Gene Ontology GO defines over 50,000 concepts. But even for less complex ontologies, it is easy to introduce bugs, and possi- ble inferences may not always be transparent to ontology users and engineers. Correspondingly, services for explaining those inferences can be very helpful. Typical techniques for explaining why subsumption can be inferred include jus- tifications [4, 16, 27] and proofs [1, 2], both of which have been included in the popular ontology development tool Protégé [17, 18]. In order to explain why a subsumption relationship cannot be inferred, one can use abduction [21]. The general setup in abduction is that we have an ? Funded by the DFG grant 389792660 as part of TRR 248 – CPEC, see https://perspicuous-computing.science Copyright © 2021 for this paper by its authors. Use permitted under Creative Com- mons License Attribution 4.0 International (CC BY 4.0). 4 https://www.snomed.org Abduction in EL via Translation to FOL 47 input consisting of a knowledge base K and some logical statement α—the observation—that is not logically entailed by K. The aim is then to compute a hypothesis, a set of axioms whose addition to K would result in the entail- ment α. This way, the hypothesis not only provides for a possible explanation, it also offers a fix to repair the entailment in case it is supposed to hold. One way to avoid trivial hypotheses (such as the axiom α itself) is to restrict solutions syntactically, for instance by specifying a signature of predicates that can be used in the solution [20, 22]. In the context of DLs, depending on the shape of the observation to be explained, abduction can be classified into different categories: – In concept abduction observation and hypothesis are DL concepts. That means we are looking for a concept that is subsumed by a user-given concept (with respect to an ontology) [5]. – In ABox abduction, observations and hypotheses have the form of ABoxes, that is, they describe facts, which is the more typical scenario for diagno- sis [6–9, 11, 15, 19, 20, 25, 26]. – In TBox abduction, observation and hypothesis consist of TBox axioms, which can be used to find fixes for missing subsumption relationships [10,28]. – Finally, in knowledge base abduction, the most general form, both hypotheses and observations can be arbitrarily composed of TBox and ABox axioms [13, 22]. We are interested in explaining missing subsumption relationships, which is what is done by TBox abduction, in which background knowledge, observation, and hypothesis all consist of TBox axioms, that is, DL axioms expressing subsump- tion relationships between concepts. Optionally, a set of abducible concepts can be specified as part of the input, of which every hypothesis needs to be composed. We work in the lightweight DL EL, used in many large scale ontologies [3]. TBox abduction has been considered before, albeit in different settings. In [10], instead of a set of abducibles, a set of justification patterns is given, which is a set of syntactical patterns that hypotheses have to fit in. In contrast, [28] generalises both this work and the one in [10], in that it uses an arbitary oracle function to decide whether a solution is admissible or not (which may use abducibles, justification patterns, or something else). The work in [28] also considers EL, and abduction under various minimality notions such as subset minimality and size minimality. It is there shown that deciding the existence of hypotheses is tractable, while for certain minimality criteria, it can become coNP-complete. In addition to the theoretical results, the authors of [28] present practical algorithms, and an evaluation of an implementation. Different to our approach, they use an external DL reasoner to decide entailment relationships. In contrast, we present an approach that directly exploits first-order reasoning, and thus has the potential to be generalisable to more expressive DLs. The TBox abduction technique we present proceeds in three steps. The first step is the translation of the abduction problem into a first-order formula Φ. Then, we compute the prime implicates of Φ, that is, a minimal set of logical consequences of Φ that subsume all other consequences of Φ. In the final step, we 48 Fajar Haifani, Patrick Koopmann and Sophie Tourret C1 v C2 PI g+ Σ (Φ) T translation Φ PI generation recombination S PI g− Σ (Φ) Σ Fig. 1. EL abduction using prime implicate generation in FOL. construct TBoxes that are hypotheses of the original abduction problem, from the prime implicates of Φ. Figure 1 illustrates this process with some more details. The translation to first-order relies on the assumption that the TBox T is normalized and this results in a set of definite Horn clauses Φ. Prime implicate (PI) generation in first-order logic can be done with an off-the-shelf tool [12, 23] or, in our case, by slightly altering a resolution-based theorem prover, such as included in the SPASS theorem prover [29]. Since the input is Horn, PI g+ Σ (Φ), the set of ground positive prime implicates of Φ, contains only unit clauses. The recombination step looks at the negative ground prime implicates in PI g− Σ (Φ) one after the other and attempts to match each literal in one such clause with clauses in PI g+ Σ (Φ). If this succeeds, the result is a solution to the original problem. We denote S the set of all solutions obtained in this way. In this paper, we describe in details the translation and recombination steps and prove that the S obtained in that way contains only solutions to the original problem. 2 Preliminaries We begin with an introduction of the relevant notions, first in EL, then in first- order logic. 2.1 The Description Logic EL Let NC and NR be pair-wise disjoint, countably infinite sets of respectively atomic concepts and roles. Generally, we use letters A, B, E, F ,... for atomic concepts, and r for roles, possibly annotated, letters C, D, possibly annotated, denote EL concepts, built according to the syntax rule C ::= > | A | C u C | ∃r.C . d We use {C1 , . . . , Cm } as abbreviation for C1 u . . . u Cm , and identify the empty conjunction (when m = 0) with >. An EL TBox T is a finite set of concept inclusions (CIs) of the form C v D. We use C ≡ D as abbreviation for {C v D, D v C}. The semantics of EL is defined as usual using DL interpretations, which are tuples I = (∆I , ·I ) made of a domain ∆I and an interpretation function Abduction in EL via Translation to FOL 49 ·I that maps atomic concepts A ∈ NC to sets AI ⊆ ∆I and roles r ∈ NR to relations rI ⊆ ∆I × ∆I . The interpretation function ·I is extended to complex concepts as follows: >I = ∆I (C u D)I = C I ∩ DI (∃r.C)I = {d ∈ ∆I | ∃(d, e) ∈ rI s.t. e ∈ C I } We say that I satisfies a CI C v D, in symbols I |= C v D, if and only if C I ⊆ DI , and if I satisfies all axioms in a TBox T , we write I |= T and call I a model of T . If a CI C v D is satisfied in every model of T , we write T |= C v D and say that C v D is entailed by T . In this case, we say that C is subsumed by D and call C a subsumee of D and D a subsumer of C. Note that, if we identify NC with the set of unary predicates and NR with the set of binary predicates, I is a classical first-order structure for those predicates. We can thus define satisfaction of first-order formulas in DL interpretations as usual, and also relate first-order formulas with CIs and TBoxes via the usual entailment relation. 2.2 Abduction in EL The abduction problem we are concerned with in this paper is the following. Definition 1. Given a TBox T , a set of atomic concepts Σ ⊆ NC and a concept inclusion C1 v C2 , called the observation, where C1 and C2 are atomic concepts and T 6|= C1 v C2 , the corresponding TBox abduction problem, denoted by the tuple hT , Σ, C1 v C2 i, is to find a TBox H ⊆ {Ai1 u · · · u Ain v Bi1 u · · · u Bim | {Ai1 , . . . , Ain , Bi1 , . . . , Bim } ⊆ Σ} where m > 0, n ≥ 0 and such that T ∪ H |= C1 v C2 and T ∩ H = ∅. Such a solution H to the abduction problem is called a hypothesis. As an example, suppose we have a TBox T = {C1 v ∃r.A, ∃r.B v C2 }. For the abduction problem hT , {A, B}, C1 v C2 i, the set H = {A v B} is a solution since T ] H |= C1 v C2 . Note that since EL TBoxes are always consistent, the consistency condition usually required on T ∪ H is not needed in our setting. 2.3 Prime Implicate Generation in First Order Logic We consider first-order logic with unary and binary predicates, which we identify with the corresponding elements in NC and NR . Let V be a set of variables, and ΣFOL be a signature such that ΣFOL = NC ∪ NR , where NC and NR are the sets of EL atomic concepts and roles, used as unary and binary predicate symbols respectively. Furthermore, let F be a set of function symbols, which includes a set C of constants as nullary function symbols. A term t, possibly annotated, is either a variable x ∈ V, a constant c ∈ C, or of the form f (t1 , . . . , tm ), where f is an m-ary function symbol and t1 , . . ., tm are terms. Atoms are either of 50 Fajar Haifani, Patrick Koopmann and Sophie Tourret the form A(t1 ) or r(t1 , t2 ), where t1 and t2 are terms. Literals are either atoms κ (positive literals) or negated atoms ¬κ (negative literals). Clauses, denoted ϕ with possible annotations, are disjunctions of literals implicitly universally quantified. Formulas, denoted Φ, are described using universal and existential quantification, conjunction, disjunction, and negation. For convenience, by abuse of notation, we may sometimes identify sets of formulas or clauses with the conjunction over those. Skolemization removes existentially quantified variables from formulas in the usual way: for each existentially quantified variable, it adds a fresh function with variable arguments taken from the universally quantified variables outside the scope of the variable under consideration (if no such variables exist, we use a nullary function, ie. a constant). For example, ∀x.¬A(x) ∨ ∃y.r(x, y) is Skolemized into ∀x.¬A(x) ∨ r(x, f (x)). Here, we assume that the formula is in negation normal form, that is, every negation symbol occurs only in front of an atom. Furthermore, we assume that such Skolem functions are taken from a set NS not in the signature. In our context, we only require a single Skolem constant, namely sk0 , since Skolemized atoms, literals, and clauses range over Skolem terms either built over a given variable or sk0 instead of ranging over variables. We let Tx (NS ) be the set of Skolem terms built on x, where x is either sk0 or a variable in V. Since we only consider formulas over unary and binary predicates, they can be satisfied by DL interpretations as defined above, understood naturally as first-order structures. We call interpretations I with ∆I = Tsk0 (NS ) Herbrand interpretations, which for convenience, by abuse of notation, we may treat as sets of ground atoms s.t. A(t) ∈ I if and only if t ∈ AI and r(t, t0 ) ∈ I if and only if (t, t0 ) ∈ rI . Definition 2 (Prime Implicate). Given a set of clauses Φ, a clause ϕ is an implicate of Φ if Φ |= ϕ. Additionally, ϕ is a prime implicate of Φ if for any other implicate ϕ0 of Φ s.t. ϕ0 |= ϕ, it also holds that ϕ |= ϕ0 . Given a Skolemized Φ, the set PI g+ Σ (Φ) is the set of all positive ground prime implicates of Φ and the set PI g− Σ (Φ) is the set of all negative ground prime implicates where all of the predicate symbols belong to Σ. Example 3. Given Φ as follows: Φ = {A(sk0 ), ¬B(sk0 ), ¬A(x) ∨ r(x, sk(x)), ¬A(x) ∨ E(sk(x)), ¬G(x) ∨ ¬r(x, y) ∨ ¬E(y) ∨ B(sk0 )}, where sk0 is a constant and sk is a function symbol, the ground prime implicates of Φ are PI g+ g− Σ (Φ) = {A(sk0 ), E(sk(sk0 ))} and PI Σ (Φ) = {¬B(sk0 ), ¬G(sk0 ) ∨ ¬E(sk(sk0 ))}. They are implicates because all of them are consequences of Φ (they are entailed by Φ). Moreover, these implicates are also prime implicates because it is not possible to obtain an implicate of Φ by removing a literal from one of them. In some cases, the set of prime implicates can even be infinite. Abduction in EL via Translation to FOL 51 Example 4. Consider Φ0 , a slight variation on Φ from the previous example, where E(sk(x)) is replaced by A(sk(x)), so that: Φ0 = {A(sk0 ), ¬B(sk0 ), ¬A(x) ∨ r(x, sk(x)), ¬A(x) ∨ A(sk(x)), ¬G(x) ∨ ¬r(x, y) ∨ ¬E(y) ∨ B(sk0 )}, Whereas the negative ground prime implicates are unchanged (PI g− 0 Σ (Φ ) = g− PI Σ (Φ)), there are now infinitely many positive ground prime implicates, be- cause PI g+ 0 Σ (Φ ) contains A(sk0 ), A(sk(sk0 )), A(sk(sk(sk0 ))) . . . 3 From EL to FOL We assume the EL TBox in the input to be in normal form as defined in [3], which means that every CI is of the form A v B, A1 u A2 v B, ∃r.A v B or A v ∃r.B, where r ∈ NR and A, A1 , A2 , B ∈ NC ∪ {>}. Every EL TBox can be transformed in polynomial time into this normal form through the introduction of fresh concept names. This transformation preserves all entailments in the signature of the original ontology, even after the addition of CIs (assuming the same CIs are added also to the input ontology, and they don’t use any of the introduced concept names). Note that this transformation extends the signature, and thus potentially the solution space of the abduction problem: to avoid this, we may simply intersect the set of abducibles Σ with the signature of the original input ontology. At the level of TBoxes, we use the standard, semantics-preserving translation from EL to FOL [3]. Definition 5 (TBox translation). The function π translates concepts into first-order formulas: π(>, x) = true π(A, x) = A(x) π(∃r.C, x) = ∃y.(r(x, y) ∧ π(C, y)) π(C u D, x) = π(C, x) ∧ π(D, x) We lift π to concept inclusions with: π(C v D) = ∀x.¬π(C, x)∨π(D, x). Finally, we lift π to TBoxes with: π(T ) = {π(C v D) | C v D ∈ T }. 52 Fajar Haifani, Patrick Koopmann and Sophie Tourret If T is in normal form, π(T ) contains only axioms of the following shapes: shapes from CIs with > : ∀x.true ∀x.B(x) ∀x.¬A(x) ∨ ∃y.r(x, y), ∀x.¬(∃y.r(x, y)) ∨ B(x) shapes from CIs without > : ∀x.¬A(x) ∨ B(x), ∀x.¬A1 (x) ∨ ¬A2 (x) ∨ B(x), ∀x.¬(∃y.r(x, y) ∧ A(y)) ∨ B(x), ∀x.¬A(x) ∨ ∃y.(r(x, y) ∧ B(y)). In practice, since the symbol > will be preprocessed, only the four last forms are relevant in our work. The function π is semantics-preserving in the following sense: Theorem 6 (First-Order Semantics Preservation [3]). For any EL TBox T and possibly complex concepts C and D, T |= C v D if and only if π(T ) |= π(C v D). The translation of a TBox abduction problem, built on top of the standard TBox translation, is less straightforward. The translation Π(T , C1 v C2 ) of hT , Σ, C1 v C2 i to first-order logic is the Skolemization of π(T ] T 0 ) ∧ ¬π(C1 v C20 ) where sk0 is used as the unique fresh Skolem constant such that the Skolem- ization of ¬π(C1 v C20 ) results in {C1 (sk0 ), ¬C20 (sk0 )}, and where several other things happen that are explained in the following paragraphs. Before Skolemization, T is first preprocessed by replacing every occurrence of the concept > in T with that of the fresh atomic concept A> and we add ∃r.A> v A> and B v A> to T for every role r and atomic concept B. This simulates for A> the implicit property that C v > holds for any C no matter what the TBox is. In particular, this ensures that whenever there is a positive prime implicate B(t) or r(t, t1 ), A> (t) also becomes a prime implicate. Finally, we define T 0 by renaming all atomic concepts A from T using fresh symbols A0 . Notice that this is also done for C2 , renamed to C20 in Π(T , C1 v C2 ). Thanks to this encoding trick, the inferences can be traced back from C1 (sk0 ) or from ¬C20 (sk0 ) and do not interfere with each other, making it easier to relate ground prime implicates and concept inclusions and capturing interesting solutions that would otherwise not be found. We now look at a complete encoding example in details. Example 7. Suppose given T over the signature where NC = {A, B, C1 , C2 , E, F, G, H} and NR = {r1 , r2 } as follows. T = {C1 v ∃r1 .A, C1 v B, C1 v ∃r2 .G, ∃r2 .E v B, ∃r1 .F v H, B u H v C2 } Abduction in EL via Translation to FOL 53 Consider the abduction problem hT , NC , C1 v C2 i. Preprocessing the > symbol turns T into: T = {C1 v ∃r1 .A, C1 v B, C1 v ∃r2 .G, ∃r2 .E v B, ∃r1 .F v H, B u H v C2 } ]{∃r1 .A> v A> , ∃r2 .A> v A> } ]{J v A> | J ∈ {A, B, C1 , C2 , E, F, G, H}} and the renaming of atomic concepts produces: T 0 = {C10 v ∃r1 .A0 , C10 v B 0 , C10 v ∃r2 .G0 , ∃r2 .E 0 v B 0 , ∃r1 .F 0 v H 0 , B 0 u H 0 v C20 } ]{∃r1 .A0> v A0> , ∃r2 .A0> v A0> } ]{J v A0> | J ∈ {A0 , B 0 , C10 , C20 , E 0 , F 0 , G0 , H 0 }} The translation of T results in ΦT , the Skolemization of π(T ): ΦT = {¬C1 (x) ∨ r1 (x, sk1 (x)), ¬C1 (x) ∨ A(sk1 (x)), ¬C1 (x) ∨ B(x), ¬C1 (x) ∨ r2 (x, sk2 (x)), ¬C1 (x) ∨ G(sk2 (x)), ¬r2 (x, y) ∨ ¬E(y) ∨ B(x), ¬r1 (x, y) ∨ ¬F (y) ∨ H(x), ¬B(x) ∨ ¬H(x) ∨ C2 (x), ]{¬r1 (x, y) ∨ A> (y) ∨ A> (x), ¬r2 (x, y) ∨ A> (y) ∨ A> (x)} ]{¬J(x) ∨ A> (x) | J ∈ {A, B, C1 , C2 , E, F, G, H}} ΦT 0 , the Skolemization of T 0 , is similar except that the atomic concepts are renamed. Finally, Π(T , C1 v C2 ) = ΦT ∪ ΦT 0 ∪ {C1 (sk0 ), ¬C20 (sk0 )}. Let us now look in more detail at the reason for replacing > with A> . First, remember that > is not considered as a part of NC and represents an empty conjunction. Since it is translated to true in first-order logic, it never appears in a translated formula, as can be seen in the text following Definition 5. However, our technique is only looking for the ground prime implicates. There are many other non-ground prime implicates that are generated and that we only use in the hope of finally grounding them. It would not be practical to have to sort out the non-ground PIs due to the > symbol from the rest of them. Consider the following example. Example 8. Given Σ = {A, E, F, G} and the TBox T = {> v ∃r1 .F, C1 v ∃r2 .A, A v ∃r3 .G, ∃r2 .E v J1 , ∃r1 .J1 v J0 , ∃r2 .> v J3 , J0 u J3 v C2 } Some of the ground PIs of Π(T , C1 v C2 ) are as follows: 54 Fajar Haifani, Patrick Koopmann and Sophie Tourret – A(sk2 (sk0 )), F (sk1 (sk2 (sk0 ))), G(sk3 (sk2 (sk0 ))) are positive PIs, – ¬E 0 (sk1 (sk2 (sk0 ))) ∨ ¬A0> (sk3 (sk2 (sk0 ))) is a negative PI, where the Skolem functions sk1 , sk2 , and sk3 , are from the Skolemization of r1 , r2 , and r3 respectively in the first three axioms. Let us take a close look at the reason why F (sk1 (sk2 (sk0 ))) is an implicate. We observe that A(sk2 (sk0 )) is obtained as a consequence of the presence of C1 v ∃r2 .A in T . Thanks to the addition of A v A> to T , A> (sk2 (sk0 )) is also an implicate. Finally due to the presence of the renaming of > v ∃r1 .F in T , namely A> v ∃r1 .F , the clause F (sk1 (sk2 (sk0 ))) is also an implicate. With a direct translation of >, we would only be able to derive F (sk1 (x)) and there would be no way to obtain a ground implicate from this clause. Similarly, it is only thanks to the presence of ∃r3 .A> v A> in T that the prime implicate ¬E 0 (sk1 (sk2 (sk0 ))) ∨ ¬A0> (sk3 (sk2 (sk0 ))) can be deduced. 4 Building EL hypotheses from prime implicates We use the ground prime implicates of Φ = Π(T , C1 v C2 ) to produce solutions to the abduction problem hT , Σ, C1 v C2 i in EL. To do this, we identify sub- sumers of C1 and subsumees of C2 that can be generated from the ground prime implicates in Φ, and that can be matched with each other based on the Skolem terms they contain. Definition 9 (Solution reconstruction). Let hT , Σ, C1 v C2 i be an abduc- tion problem and let Φ = Π(T , C1 v C2 ), and assume there are prime implicates {A1 (t1 ), . . . , An (tn )} ⊆ PI g+ 0 0 0 0 g− Σ (Φ) and ¬B1 (t1 ) ∨ . . . ∨ ¬Bm (tm ) ∈ PI Σ (Φ) such that {ti }i∈[1,n] = {t0i }i∈[1,m] . We can then obtain a hypothesis H for the abduction problem H = {l1 (t) v l2 (t) | t ∈ L and T 6|= l1 (t) v l2 (t)}, such that – L = {ti }di∈[1,n] = {t0i }i∈[1,m] , – l1 (t) = d{A | A(t) ∈ {A1 (t1 ), . . . , An (tn )}}, and – l2 (t) = {B | B 0 (t) ∈ {B10 (t01 ), . . . , Bm 0 (t0m )}}. We denote by S(Φ, Σ) the set containing all such H. Example 10. From Example 7, T = { C1 v ∃r1 .A, C1 v B, C1 v ∃r2 .F, B u ∃r1 .E v C2 , ∃r1 .A> v A> , ∃r2 .A> v A> } ]{J v A> | J ∈ {C1 , C2 , A, B, E, F, G}}. Consider the prime implicates from Π(T , C1 v C2 ) Abduction in EL via Translation to FOL 55 – {B(sk0 ), A(sk1 (sk0 ))} ⊆ PI g+ Σ (Φ), and – ¬B (sk0 ) ∨ C (sk1 (sk0 )) ∈ PI g− 0 0 Σ (Φ). From the Skolem term sk0 and sk1 (sk0 ), we have l1 (sk0 ) v l2 (sk0 ) = B v B and l1 (sk1 (sk0 )) v l2 (sk1 (sk0 )) = A v C respectively. However, B v B is a tautology. Consequently, H = {A v C} is one possible hypothesis. Theorem 11. Given a TBox abduction problem hT , Σ, C1 v C2 i, and denot- ing Φ the set Π(T , C1 v C2 ), the set S(Φ, Σ) contains only hypotheses for the abduction problem. Proof. Let ΦT denote the Skolemization of π(T ). If ¬B10 (t01 ) ∨ . . . ∨ ¬Bm 0 (t0m ) ∈ g− PI Σ (Φ), then ΦT ∪ {C1 (sk0 ), ¬C2 (sk0 )} |= ¬B1 (t1 ) ∨ . . . ∨ ¬Bm (tm ) because using, e.g., resolution, any inference to derive the primed disjunction can be imitated without primes. Moreover, A(t) ∈ PI g+ g+ Σ (Φ) implies A(t) ∈ PI Σ (ΦT ∪ {C1 (sk0 )}) since only non-primed clauses are relevant to entail positive literals. Thus ΦT ∪ {C1 (sk0 )} |= A1 (t1 ) ∧ . . . ∧ An (tn ). By combining the two results for compatible positive and negative clauses, we obtain ΦT ∪ {C1 (sk0 ), ¬C2 (sk0 )} |= π(l1 (t), t) ∧ ¬π(l2 (t), t) for all t ∈ L where L, l1 (t) and l2 (t) are as defined in Definition 9. In turn, this implies ΦT ∪ {C1 (sk0 ), ¬C2 (sk0 )} |= ∃x.(π(l1 (t), x) ∧ ¬π(l2 (t), x)) for all t ∈ L. Thus ΦT ∪ {C1 (sk0 ), ¬C2 (sk0 )} ∪ {∀x.(¬π(l1 (t), x) ∨ π(l2 (t), x)) for all t ∈ L}. Hence, for any hypothesis H ∈ S(Φ, Σ), we know that ΦT ∪ {C1 (sk0 ), ¬C2 (sk0 )} ∪ π(H) is unsatisfiable. Consequently, because Skolemization is satisfiability-preserving, π(T ) ∪ ¬π(C1 v C2 ) ∪ π(H) is also unsatisfiable, hence π(T ) ∪ π(H) |= π(C1 v C2 ). By Theorem 6, this in turn means that T ∪ H |= C1 v C2 . t u 5 Conclusion We have introduced a technique to compute hypotheses to solve TBox abduction problems in EL. This technique turns the input TBox into a first-order formula, of which the ground prime implicates can be used to build hypotheses answering the problem. We have not shown the prime implicate computation step itself for two rea- sons. First, existing tools such as SOLAR [23] and GPiD [12] can compute prime implicates automatically in first-order logic. Second, there is an issue with the termination of such tools for problems with cyclic dependencies in the CIs, e.g., such that T |= A v ∃r.A, because in that case, the set of prime implicates may be unbounded. However, there exists only a finite number of hypotheses to be found and we conjecture that prime implicates with polynomial nesting depth of Skolem terms are sufficient for finding all relevant hypotheses. We plan to verify this conjecture and implement a prime implicate generation tool on top of SPASS [29] as the next step of this work. An orthogonal concern that we are currently addressing is to relate the ob- tained hypotheses with connection-minimality [14], a notion we recently intro- duced to characterize solutions to the abduction problem in EL that we find 56 Fajar Haifani, Patrick Koopmann and Sophie Tourret particularly interesting, because they connect the subsumers of C1 to the sub- sumees of C2 in a less contrived way than other solutions. We believe that the present technique is sound and complete with respect to finding connection- minimal hypotheses, but that remains to be formally proved. Finally, as our approach works by generating ground prime implicates start- ing from a set of ground literals, it should also be possible to use it for a mix of ABox and TBox abduction, where our background knowledge consists of a DL knowledge base containing both a TBox and ground facts, the observation is another set of ground facts, and we are looking for hypotheses in the form of a TBox. This would have applications for TBox learning from data. References 1. Alrabbaa, C., Baader, F., Borgwardt, S., Koopmann, P., Kovtunova, A.: Finding small proofs for description logic entailments: Theory and practice. In: Albert, E., Kovacs, L. (eds.) LPAR-23: 23rd International Conference on Logic for Program- ming, Artificial Intelligence and Reasoning. EPiC Series in Computing, vol. 73, pp. 32–67. EasyChair (2020). https://doi.org/10.29007/nhpp 2. Alrabbaa, C., Baader, F., Borgwardt, S., Koopmann, P., Kovtunova, A.: Find- ing good proofs for description logic entailments using recursive quality measures. In: Platzer, A., Sutcliffe, G. (eds.) Automated Deduction - CADE 28 - 28th In- ternational Conference on Automated Deduction, Virtual Event, July 12-15, 2021, Proceedings. Lecture Notes in Computer Science, vol. 12699, pp. 291–308. Springer (2021). https://doi.org/10.1007/978-3-030-79876-5 17 3. Baader, F., Horrocks, I., Lutz, C., Sattler, U.: An Introduction to Description Logic. Cambridge University Press (2017). https://doi.org/10.1017/9781139025355 4. Baader, F., Suntisrivaraporn, B.: Debugging SNOMED CT using axiom pinpoint- ing in the description logic EL+ . In: Proc. of the 3rd Conference on Knowledge Rep- resentation in Medicine (KR-MED’08): Representing and Sharing Knowledge Us- ing SNOMED. CEUR-WS, vol. 410 (2008), http://ceur-ws.org/Vol-410/Paper01. pdf 5. Bienvenu, M.: Complexity of abduction in the EL family of lightweight description logics. In: Proceedings of KR 2008. pp. 220–230. AAAI Press (2008), http://www. aaai.org/Library/KR/2008/kr08-022.php 6. Calvanese, D., Ortiz, M., Simkus, M., Stefanoni, G.: Reasoning about explanations for negative query answers in DL-Lite. J. Artif. Intell. Res. 48, 635–669 (2013). https://doi.org/10.1613/jair.3870 7. Ceylan, İ.İ., Lukasiewicz, T., Malizia, E., Molinaro, C., Vaicenavicius, A.: Explana- tions for negative query answers under existential rules. In: Calvanese, D., Erdem, E., Thielscher, M. (eds.) Proceedings of KR 2020. pp. 223–232. AAAI Press (2020). https://doi.org/10.24963/kr.2020/23 8. Del-Pinto, W., Schmidt, R.A.: ABox abduction via forgetting in ALC. In: The Thirty-Third AAAI Conference on Artificial Intelligence, AAAI 2019. pp. 2768– 2775. AAAI Press (2019). https://doi.org/10.1609/aaai.v33i01.33012768 9. Du, J., Qi, G., Shen, Y., Pan, J.Z.: Towards practical ABox abduction in large description logic ontologies. Int. J. Semantic Web Inf. Syst. 8(2), 1–33 (2012). https://doi.org/10.4018/jswis.2012040101 Abduction in EL via Translation to FOL 57 10. Du, J., Wan, H., Ma, H.: Practical TBox abduction based on justification pat- terns. In: Proceedings of the Thirty-First AAAI Conference on Artificial In- telligence. pp. 1100–1106 (2017), http://aaai.org/ocs/index.php/AAAI/AAAI17/ paper/view/14402 11. Du, J., Wang, K., Shen, Y.: A tractable approach to ABox abduction over descrip- tion logic ontologies. In: Brodley, C.E., Stone, P. (eds.) Proceedings of the Twenty- Eighth AAAI Conference on Artificial Intelligence. pp. 1034–1040. AAAI Press (2014), http://www.aaai.org/ocs/index.php/AAAI/AAAI14/paper/view/8191 12. Echenim, M., Peltier, N., Sellami, Y.: A generic framework for implicate gener- ation modulo theories. In: Galmiche, D., Schulz, S., Sebastiani, R. (eds.) Au- tomated Reasoning - 9th International Joint Conference, IJCAR 2018, Held as Part of the Federated Logic Conference, FloC 2018, Oxford, UK, July 14-17, 2018, Proceedings. Lecture Notes in Computer Science, vol. 10900, pp. 279–294. Springer (2018). https://doi.org/10.1007/978-3-319-94205-6 19, https://doi.org/ 10.1007/978-3-319-94205-6 19 13. Elsenbroich, C., Kutz, O., Sattler, U.: A case for abductive reasoning over on- tologies. In: Proceedings of the OWLED’06 Workshop on OWL: Experiences and Directions (2006), http://ceur-ws.org/Vol-216/submission 25.pdf 14. Haifani, F., Koopmann, P., Tourret, S.: Introducing connection-minimal abduction for EL ontologies. XLoKR workshop (2021) 15. Halland, K., Britz, K.: ABox abduction in ALC using a DL tableau. In: 2012 South African Institute of Computer Scientists and Information Technologists Confer- ence, SAICSIT ’12. pp. 51–58 (2012). https://doi.org/10.1145/2389836.2389843 16. Horridge, M.: Justification Based Explanation in Ontologies. Ph.D. thesis, Uni- versity of Manchester, UK (2011), https://www.research.manchester.ac.uk/portal/ files/54511395/FULL TEXT.PDF 17. Horridge, M., Parsia, B., Sattler, U.: Explanation of OWL entailments in protege 4. In: Bizer, C., Joshi, A. (eds.) Proceedings of the Poster and Demonstration Session at the 7th International Semantic Web Conference (ISWC2008), Karlsruhe, Ger- many, October 28, 2008. CEUR Workshop Proceedings, vol. 401. CEUR-WS.org (2008), http://ceur-ws.org/Vol-401/iswc2008pd submission 47.pdf 18. Kazakov, Y., Klinov, P., Stupnikov, A.: Towards reusable explanation services in protege. In: Artale, A., Glimm, B., Kontchakov, R. (eds.) Proceedings of the 30th International Workshop on Description Logics, Montpellier, France, July 18- 21, 2017. CEUR Workshop Proceedings, vol. 1879. CEUR-WS.org (2017), http: //ceur-ws.org/Vol-1879/paper31.pdf 19. Klarman, S., Endriss, U., Schlobach, S.: ABox abduction in the descrip- tion logic ALC. Journal of Automated Reasoning 46(1), 43–80 (2011). https://doi.org/10.1007/s10817-010-9168-z 20. Koopmann, P.: Signature-based abduction with fresh individuals and complex concepts for description logics. In: Zhou, Z. (ed.) Proceedings of the Thirti- eth International Joint Conference on Artificial Intelligence, IJCAI 2021, Virtual Event / Montreal, Canada, 19-27 August 2021. pp. 1929–1935. ijcai.org (2021). https://doi.org/10.24963/ijcai.2021/266 21. Koopmann, P.: Two ways of explaining negative entailments in description logics using abduction. In: Explainable Logic-Based Knowledge Representation (XLoKR 2021) (2021) 22. Koopmann, P., Del-Pinto, W., Tourret, S., Schmidt, R.A.: Signature-based abduc- tion for expressive description logics. In: Calvanese, D., Erdem, E., Thielscher, M. 58 Fajar Haifani, Patrick Koopmann and Sophie Tourret (eds.) Proceedings of the 17th International Conference on Principles of Knowl- edge Representation and Reasoning, KR 2020. pp. 592–602. AAAI Press (2020). https://doi.org/10.24963/kr.2020/59 23. Nabeshima, H., Iwanuma, K., Inoue, K., Ray, O.: SOLAR: an automated de- duction system for consequence finding. AI Commun. 23(2-3), 183–203 (2010). https://doi.org/10.3233/AIC-2010-0465, https://doi.org/10.3233/AIC-2010-0465 24. Parsia, B., Matentzoglu, N., Gonçalves, R.S., Glimm, B., Steigmiller, A.: The OWL reasoner evaluation (ORE) 2015 competition report. J. Autom. Reason. 59(4), 455–482 (2017). https://doi.org/10.1007/s10817-017-9406-8 25. Pukancová, J., Homola, M.: Tableau-based ABox abduction for the ALCHO de- scription logic. In: Proceedings of the 30th International Workshop on Description Logics (2017), http://ceur-ws.org/Vol-1879/paper11.pdf 26. Pukancová, J., Homola, M.: The AAA ABox abduction solver. Künstliche Intell. 34(4), 517–522 (2020). https://doi.org/10.1007/s13218-020-00685-4 27. Schlobach, S., Cornet, R.: Non-standard reasoning services for the debugging of description logic terminologies. In: Gottlob, G., Walsh, T. (eds.) Proc. of the 18th Int. Joint Conf. on Artificial Intelligence (IJCAI 2003). pp. 355–362. Morgan Kauf- mann, Acapulco, Mexico (2003), http://ijcai.org/Proceedings/03/Papers/053.pdf 28. Wei-Kleiner, F., Dragisic, Z., Lambrix, P.: Abduction framework for repairing in- complete EL ontologies: Complexity results and algorithms. In: Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence. pp. 1120–1127. AAAI Press (2014), http://www.aaai.org/ocs/index.php/AAAI/AAAI14/paper/ view/8239 29. Weidenbach, C., Schmidt, R.A., Hillenbrand, T., Rusev, R., Topic, D.: Sys- tem description: Spass Version 3.0. In: Pfenning, F. (ed.) Automated Deduc- tion - CADE-21, 21st International Conference on Automated Deduction, Bre- men, Germany, July 17-20, 2007, Proceedings. Lecture Notes in Computer Science, vol. 4603, pp. 514–520. Springer (2007). https://doi.org/10.1007/978-3-540-73595- 3 38, https://doi.org/10.1007/978-3-540-73595-3 38