Signature-Based ABox Abduction in ALC is Hard? Patrick Koopmann Institute for Theoretical Computer Science, Technische Universität Dresden, Germany Abstract. In ABox abduction, we are given a knowledge base together with an ABox—the observation—that is not logically entailed by the knowledge base, and are looking for another ABox—the hypothesis—that, when added to the knowledge base, would make the observation entailed. This has applications in explaining negative entailments and missing query answers, as well as in diag- nosis. To get useful hypotheses, in signature-based abduction, we additionally provide a signature of abducible names, and require the hypothesis to use only names from that signature. In the variant we are considering, the hypothesis may otherwise use fresh individual names, as well as complex concepts constructed in arbitrary way using the names in the signature. It was recently shown that this variant of abduction is in N2E XP T IME NP , and that hypotheses may require con- cepts that are of triple exponential size. We complement those results by showing a matching N2E XP T IME NP lower bound, and show that in the worst case, hy- potheses may also use a double exponential number of fresh individual names. 1 Introduction Since inferences performed by description logic reasoners can be complex, and real on- tologies often contain 10,000s of axioms, explanations of description logic reasoning has since longer been in the focus of research [26]. In particular for explaining posi- tive entailments, that is, entailments that hold for a given knowledge base, there is a plethora of research, mostly focused on using justifications [33,4,30,17], but recently also on using proofs [1,2]. Explaining negative entailments, that is, entailments that do not hold for a given knowledge base, has been less in the focus of attention. Here, com- mon approaches are showing the user a counter example in form of a description logic interpretation [5,3], or using abduction [29,23]. In abduction, one is given a knowledge base and an observation, a formula that is not logically entailed by the knowledge base, and is interested in finding a missing piece in the knowledge base that would make the observation logically entailed, called a hypothesis for the observation. Abduction can be used in different ways to explain negative entailments to ontology users [23,7,8] and to repair missing entailments [34]. Another application of abduction is diagnosis: here, the observation describes symptoms, for instance of a medical condition or a faulty ma- chine [27,8], and the hypothesis is supposed to give a possible explanation for how the symptoms came into place. ? This work was supported by the DFG in grant 389792660 as part of TRR 248. Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons Li- cense Attribution 4.0 International (CC BY 4.0). 62 Patrick Koopmann To illustrate, we reuse a simplified example from [21] from the geology domain. The observation is that the ground under a street has become unstable: {Unstable(street)}, and the background knowledge involves the following axioms: 1. EvaporiteFormation u ∃borders.(WaterWay u ¬∃lining.WaterProof) v ∃affectedBy.Dissolution 2. EvaporiteFormation u ∃affectedBy.Dissolution v ∀above.Unstable 3. (Waterway t Street) u EvaporiteFormation v ⊥ 4. Waterway(canal) 5. Street(street), These state: 1. an evaporite formation bordering to a waterway without waterproof lin- ing is affected by dissolution; 2. if an evaporite formation is affected by dissolution, then everything above it is unstable; 3. waterways and streets are different from evapor- ite formations; 4. canal is a waterway and 5. street is a street. A meaningful hypothesis could then be the following, which uses e as a fresh individual referring to a possible unknown formation under the street, as well as the complex concept ∀lining.⊥: H = { EvaporiteFormation(e), above(e, street), borders(e, canal), ∀lining.⊥(canal) } Depending on whether observation and hypothesis consist of concepts, facts, termi- nological axioms or knowledge bases, one distinguishes between concept abduction [6], ABox abduction [9,8,32,31,7,12,10,16,19], TBox abduction [11,34,15,14], and knowl- edge base abduction [24,13]. In this paper, we focus on ABox abduction, where, as in the above example, both the observation and the hypothesis consist of a set of facts about some set of individuals, a variant especially suited for diagnosis and for generat- ing counter examples as in [23]. To avoid trivial answers (such as the observation itself), usually, the abduction prob- lem is specified with additional constraints on the hypothesis. Those may be purely of logical nature [13], or based on syntactic properties. For instance, in [11], the user specifies syntactical patterns of axioms that the hypothesis must conform to. Another approach is to fix a finite set of abducible axioms—either directly as part of the in- put [31,8], or indirectly by posing strict conditions on the shape of axioms, such as being flat ABox assertions using only names and individuals from the input [32,10]. However, such a restriction might not be feasible in applications where we do not know the exact shape of the axioms we are looking for, for instance because they involve un- known individuals or complex concepts as in the example above. Another approach is to instead specify a set of abducible names or predicates, from which hypotheses can then be constructed in arbitrary fashion using the constructors of the respective descrip- tion logic. In the context of diagnosis, this set would contain names that are connected to causes of symptoms rather than the symptoms themselves, specific terms rather than generic categories, and possibly describing aspects that can be verified to check the hy- pothesis. In the example above, examples for names that would not be included would be Unstable (because it is about the observation, not about the cause), Waterway and Street (because we know all waterways and streets in the area), affectedBy and Dis- solution (because they are too unspecific, and we are not looking for the processes, but the objects that caused the street to become unstable.) Signature-Based ABox Abduction in ALC is Hard 63 This form of abduction, which we call signature-based abduction, has been theo- retically investigated for DL-Lite in [7], and for more expressive DLs in [21]. There is also a practical implementation for signature-based ABox abduction [9] and one for signature-based KB abduction [24]. These two practical implementation reduce abduc- tion to uniform interpolation via (extensions of) the tool L ETHE [20], which can lead to triple exponentially large solutions [25]. As shown in [21], this is indeed in theory un- avoidable. However, some crucial questions were left open in [21]: for the correspond- ing decision problem whether there exists a hypothesis for a given abduction problem, using ALC as description logic and allowing for both fresh individuals and complex concepts in the hypothesis, only an upper bound of N2E XP T IME NP is provided, while the lower bound is left open. Furthermore, for the number of fresh individuals one may need to introduce in the worst case, also only an upper bound was developed. In this paper, we show that those bounds are indeed tight, by proving that the general version of the signature-based ABox abduction problem for ALC is N2E XP T IME NP -complete, and that hypotheses for those problems may require a number of fresh individuals that is double exponential in the size of the abduction problem. 2 Preliminaries We recall the description logic ALC and define the signature-based ABox abduction problem formally. Let NC , NR and NI be pair-wise disjoint, countably infinite sets of respectively con- cept, role and individual names. A signature is a finite set Σ ⊆ NC ∪ NR . Concepts are then built according to the following syntax rule, where A ∈ NC and r ∈ NR : C := > | ⊥ | A | ¬C | C t C | C u C | ∃r.C | ∀r.C A TBox is a finite set of general concept inclusion axioms (GCIs) of the form C v D and concept equivalence axioms of the form C ≡ D, where C and D are concepts. An ABox is a finite set of concept assertions of the form C(a) and role assertions of the form r(a, b), where C is a concept, r ∈ NR and a, b ∈ NI . A knowledge base (KB) K is a union of a TBox and an ABox, and thus generalises both notions. The contents of a KB are called axioms. For a concept/axiom/KB E, we denote its signature, that is, the concept and role names that occur in E, by sig(E). The semantics of KBs is defined in terms of interpretations. An interpretation I is a tuple h∆I , ·I i of a countable but possibly infinite domain ∆I and an interpretation function ·I that assigns to each individual name a ∈ NI a domain element aI ∈ ∆I , to each concept name A ∈ NC a set AI ⊆ ∆I , and to each role r ∈ NR a relation rI ⊆ ∆I × ∆I . The interpretation function is extended as follows to concepts: >I = ∆ I ⊥=∅ (¬C)I = ∆I \ C I (C u D)I = C I ∩ DI (C t D)I = C I ∪ DI (∃r.C)I = {d ∈ ∆I | there is hd, ei ∈ rI s.t. e ∈ C I } (∀r.C)I = {d ∈ ∆I | for every hd, ei ∈ rI we have e ∈ C I }. 64 Patrick Koopmann Such an interpretation I satisfies an axiom α, in symbols I |= α if α = C v D and C I ⊆ DI , α = C ≡ D and C I = DI , α = C(a) and aI ∈ C I , or α = r(a, b) and haI , bI i ∈ rI . If I satisfies all axioms in a KB K, we write I |= K and call I a model of K. If a KB K has no model, we say it is inconsistent and write K |= ⊥. If an axiom α is satisfied in every model of a KB K, we write K |= α and say α is entailed by K. We are now ready to define the signature-based ABox abduction problem for ALC. Definition 1. A signature based ABox abduction problem is given by a tuple A = hK, Φ, Σi of a KB K called the background knowledge, an ABox Φ called the observa- tion, and a signature Σ of abducible names, for which we want to determine whether there exists a hypothesis H, which is an ABox that satisfies the following conditions: A1 K ∪ H 6|= ⊥, A2 K ∪ H |= Φ, and A3 sig(H) ⊆ Σ. Intuitively, the hypothesis should not contradict what we know (Condition A1), it should be sufficient to explain the observation (Condition A2), and it should only be con- structed using the abducible names (Condition A3). Apart from that, we put no further restriction on H: it may use individual names that occur neither in K nor in Φ (fresh in- dividuals), and it may contain assertions C(a) where C is a complex concept, arbitrarily composed using the constructors of ALC and the names in Σ. Regarding the corresponding decision problem, the following was shown in [21]: Theorem 1. Existence of hypotheses for signature-based ABox abduction problems is in N2E XP T IME NP . However, a corresponding lower bound was not yet provided, which is the contribution of this paper. In addition, [21] provided the following bounds on the size of hypotheses. Theorem 2. There exists a family of signature-based ABox abduction problems for which every hypothesis is of size triple exponential in the size of the problem. Further- more, for every signature-based ABox abduction problem, if there exists a hypothesis, then there exist one that is of size at most triple exponential in the size of the problem. The family of abduction problems used for the lower bound is such that only a single assertion is needed for the hypothesis. Therefore, an open problem remained how many fresh individuals may be required in a hypothesis. Inspection of the proof used for the upper bound of Theorem 2 gives at least an upper bound for this. Theorem 3. For every signature-based ALC-ABox abduction problem, if there exists a hypothesis, then there exist one that uses a number of fresh individual names that is at most double exponential in the size of the abduction problem. Proof. This is a consequence of the following lemmas from [21]: 1. every hypothesis can be converted into a so-called hypothesis abstraction (Lemma 1), 2. every hypothesis abstraction can be restricted to refer to at most double exponentially many individual names (Lemma 2), and 3. those bounds are preserved when translating the hypothesis abstraction back into a hypothesis (Lemma 3). t u We will also show that this bound is tight: in our reduction, we create a class of signature-based abduction problems whose hypotheses always need a double exponen- tial number of fresh individuals. Signature-Based ABox Abduction in ALC is Hard 65 3 Overview of the Reduction To prove hardness for N2E XP T IME NP , we provide a reduction from the word problem for non-deterministic, double exponential Turing machines with NP oracle. Specifically, given such a Turing machine M and a word w, we will construct an abduction problem AM,w in polynomial time s.t. M accepts w iff there exists a hypothesis for AM,w . The individual names in this hypothesis will be organised into a N × N × N -cube, n where N = {0, . . . , 2n · 22 }, and n is polynomial in the size of w. That is, for every hx, y, zi ∈ N × N × N , there will be a corresponding individual ax,y,z . The individuals on one side of the cube, that this, those of the form ax,y,0 , will encode an accepting configuration history for the Turing machine. The remaining individuals supply the “guessing space” for the oracle via the models of the hypothesis. Specifically, for a fixed y, the individuals ax,y,z will be such that they cannot possibly encode a successful computation history for the oracle in any model of the hypothesis and the background knowledge. Our abduction problem is composed of different components, that we define as separate abduction problems: 1. we reuse a construction from [21] to obtain three double exponential counters; 2. those are used to create the different individuals for all coordinates in N × N × N ; 3. we then enforce that every hypothesis forms a cube in which every coordinate is represented by exactly one individual; 4. finally, we make sure that hypothesis corresponds to a computation history of the Turing machine as described above. 4 The Double Exponential Counters We encode numbers with 2n bits using chains of 2n individuals, where each individual is an instance of the concept name Bit if it represents a bit with value 1, and otherwise an instance of ¬Bit. An abduction problem that exactly provides this is constructed in the proof for Theorem 5 in [21,22] to show the triple exponential lower bound on the size of hypothesis with complex concepts. This abduction problem is of the form Ac = hKc , Goal(a), Σc i, where {r, Bit, B} ⊆ Σc . Kc is such that, for any individual name b, the only way to entail Goal(b) using only names from Σc is by enforcing that every path of r-successors starting from b encodes a 2n -bit counter as required, starting n from a value of 22 − 1 and counting down to 0, ending at an individual satisfying B. (The construction also increments this counter along another role s, which is however not relevant for our purposes here.) This also means that no such path can contain any n cycles or shortcuts before reaching the instance of B after 2n · 22 r-successors. We will need three such double exponential counters, for which purpose we rename all names in Ac to obtain the three signature-disjoint variants Ax = hKx , Goalx (a), Σx i, Ay = hKy , Goaly (a), Σy i and Az = hKz , Goalz (a), Σz i. Specifically, these use now roles rx , ry and rz to connect the individuals in the counting sequence, the concept names Bitx , Bity and Bitz for the respective bit values, and Bx , By and Bz to mark the end points of the counters. All counters count backwards: that is, they start at the maximal value, and then step-wise decrease along the role-successors. 66 Patrick Koopmann 5 Creating the Coordinates To generate all coordinates in our cube, we now need to connect those abduction prob- lems appropriately. Specifically, we need the counter for a coordinate d ∈ {x, y, z} to start not only at the individual name a (the root of the cube), but also at any other individual name along the corresponding side of the cube. For d ∈ {x, y, z}, we use a concept name Maxd to mark individuals that are at the beginning of the sequence for the d-counter (specifically: at the last bit of the maximal counter value). This is established using the following axioms: l Maxd (a) Maxd v ∀re .Maxd u ∀rd .¬Maxd e∈{x,y,z}\d l ¬Maxd v ∀re .¬Maxd e∈{x,y,z} Intuitively, the counter for the coordinate d should be initialised at all points where Maxd is satisfied. However, we cannot initialise the counters directly, but need to use the respective abduction problem. Here, we use that Goald (b) is only entailed for individual names b from which every path along rd corresponds to a double exponential counter for d. To make sure all counters are initialised at the right positions, we use another concept name AllStarted that requires that all reachable individuals that satisfy Maxd also satisfy Goald . Recall that every counter for d reaches its end at a domain element satisfying B. Bx t By t Bz v AllStarted l l ∃rd .(AllStarted u (¬Maxe t Goale )) v AllStarted d∈{x,y,z} e∈{x,y,z} The concept name Coords now requires all counters to have started: AllStarted u Goalx u Goaly u Goalz v Coords. Set Kcoord to be the union of Kx , Ky , Kz and these new axioms. The abduction problem Acoord = hKcoord , Coords(a), Σx ∪ Σy ∪ Σz i now produces the required counters in its hypotheses, reachable by the corresponding paths of rx , ry and rz -successors. 6 Enforcing the Grid Shape So far, our abduction problem would still allow for a solution with just a single indi- vidual, which would produce the counters using a complex concept. This changes with the next extension, which makes sure that hypotheses need to use a number of fresh individuals that is double exponential in the size of the input, and that those individuals are organised into a three-dimensional grid via the three roles rx , ry and rz . Rather than Signature-Based ABox Abduction in ALC is Hard 67 enforcing this grid shape directly, we define a concept to detect whether a hypothesis is not of the desired grid shape, and then require for the observation to be entailed that the root individual is not an instance of this concept. For every two d, e ∈ {x, y, z} st. d 6= e, we add the following axiom NonSquare v ∃rd .∃re .D u ∃re .∃rd .¬D, NonSquare is only satisfiable by domain elements for which there is a path along rd -re that ends in a different domain element than a path along re -rd . As in an interpretation, there may be more rd and re successors than specified in the ABox, we use the follow- ing axioms to make sure NonSquare cannot be satisfied by an individual in an ABox in which the next two rd and re -successors form a square. ∃rd .D ≡ ∀rd .D ∃re .D ≡ ∀re .D ∃rd .∃re .D ≡ ∀rd .∃re .D ∃re .∃rd .D ≡ ∀re .∃rd .D The only way to entail ¬NonSquare(b) for an individual b, and without using D or NonSquare, is now by providing an rd -re -path and an re -rd -path that both end on the same individual, and thus by creating one cell of a grid. Next, we define the con- cept NonGrid to detect appearances of NonSquare anywhere on a path between the root element and the end of our paths marked by the concept names Bx , By and Bz . Specifically, for all d ∈ {x, y, z}, we add: NonGrid v ¬(Bx t By t Bz ) u (NonSquare t ∃rd .NonGrid) ∃rd .¬NonGrid v ∀rd .¬NonGrid The first axiom ensures that, if a domain element is in NonGrid, then there must be a path to some element in NonSquare that does not go through Bx t By t Bz . Together with the first axiom, the second axiom makes sure that, once one path fails to satisfy NonGrid at some point, all predecessors will fail to do so as well, unless NonSquare is already satisfied. Consequently, to satisfy ¬NonGrid, we have to ensure that on all paths, ¬NonSquare remains satisfied before the paths reach Bx t By t Bz . The only way to do so is by organising the role successors into a grid as required. The final axiom to be added is the following: Coords v NonGrid t Grid Every instance of Coords is either in NonGrid or Grid. To entail Grid(a), we need to entail Coords(a) and ¬NonGrid(a), which is only possible by having all coordinates organised into a cube. Denote the resulting KB with all axioms mentioned until now by Kgrid . Our abduc- tion problem for this section is defined as Agrid v= {Kgrid , Grid(a), Σx ∪Σy ∪Σz }. As now every hypothesis needs an individual for every coordinate in the cube, we obtain the following theorem. Theorem 4. There exists an unbounded family of ALC ABox abduction problems such that every hypothesis uses a number of fresh individuals that is at least double expo- nential in the size of the abduction problem. 68 Patrick Koopmann 7 Encoding the Turing machine It remains to encode the Turing machine to finish the reduction. Let Mo = hQo , Σo , Γo , ∆o , qo0 , b, Fo i be a non-deterministic Turing machine with states Qo , input alphabet Σo , tape alpha- bet Γo , transition relation ∆o ⊆ (Qo × Γo ) × (Qo × Γo × {−1, +1}), initial state qo0 , blank symbol b and accepting states Fo . We assume that there exists a polynomial p such that for input words w, Mo stops after at most p(|w|) steps. Furthermore, let M = hQ, Σ, Γ, δ, q0 , q? , b, F, Mo i be a non-deterministic Turing machine that uses Mo as oracle, where Q is the set of states, Σ the input alphabet, Γ the tape alphabet, ∆ ⊆ (Q × Γ × Γo ) → (Q × Γ × Γo × {−1, +1}) the transition relation that now operates on two tapes (the work tape and the oracle tape), q0 the starting state, q? the oracle query state, q− the oracle answer state, and F the set of accepting states. The definition of runs and accepting runs is as usual, however, for convenience, we define it with two differences: 1) our tape head is always on the same position for the work tape and the oracle tape, and 2) when in the oracle state, the Turing machine M goes into state q− if the oracle rejects the current content of the oracle tape, and otherwise M rejects the input (instead of going into a special state reserved for positive query answers). That 1) is without loss of generality can be seen using a standard trick for reducing multi tape machines to single tape machines: specifically, to simulate a different tape head for the oracle tape, we introduce a “marked” tape symbol γT for every tape symbol γ in the original TM, and use those marked symbols to mark the virtual position of the head on the oracle tape, and similarly for the work tape. In order to move the two tapes in different positions, we just move back and forth between the cells on the tapes that are marked in this way. This is possible with only quadratic time overhead. For Difference 2), we use the following argument: assume we have a Turing ma- chine that, based on the result of the oracle, would go either into a state q+ (if the oracle accepts) or into a state q− (if the oracle rejects). We can simulate this behaviour with our type of Turing machine by first guessing the result of the oracle. If we guess for a positive answer, we verify it without the oracle (since the Turing machine is non- deterministic), if we guess for a negative answer, we use the oracle to verify this. We assume that there is a polynomial q s.t. for words w on the input, M stops steps, that is, it accepts a language in N2E XP T IME NP . Fix one q(|w|) after at most 22 such input word w. We choose our n used in the previous sections s.t. n ≥ p(q(|w|)), this way ensuring that n bounds the length of accepting runs, as well as the maximal length of the tape content, both for M and the oracle machine Mo . Our reduction will now proceed in such a way that for the resulting abduction problem, every hypothesis will encode a successful run on the side of the cube for which the z-coordinate is 0: specifically, we will represent the content of the two tapes of M along the x-axis, and succeeding configurations along the y-axis. Signature-Based ABox Abduction in ALC is Hard 69 For convenience, we assume Q, Γ , Qo and Γo to be pairwise disjoint, and Q ∪ Γ ∪ Qo ∪ Γo ⊆ NC , so that we can identify states and tape symbols directly with concept names. We however need to make sure that every domain element can represent at most one state and tape symbol at a time: q u q0 v ⊥ for every q, q 0 ∈ Q, q 6= q 0 (1) 0 γuγ v⊥ for every γ, γ 0 ∈ Γ, γ 6= γ 0 (2) γo u γo0 v ⊥ for every γo , γo0 ∈ Γo , γo 6= γo0 (3) The input word w = w0 . . . wm is encoded as follows: w0 (a), rx (a, a1 ), w1 (a1 ), rx (a1 , a2 ), . . . , wm (am ), rx (am , am+1 ), b(am+1 ) (4) To make sure the blank symbol is propagated along the available space on the tape, we use the following TBox axiom. Recall that Maxy and Maxz are satisfied on all individuals where the y and z-counters are initialised, and Bx at the end of the counting sequence along rx . b u Maxy u Maxz u ¬Bx v ∀rx .b (5) We represent the current state on every node in the grid, and mark the position of the tape head with a special concept name Head: Head(a) (6) Maxy u Maxz v q0 (7) This completes the encoding of the initial configuration. We further make sure that also on the other configurations, every individual along the x-axis gets assigned the same state q ∈ Q: q ≡ ∀rx .q (8) and that at most one individual satisfies Head. For the latter, we use the concept LeftOfHead to mark individuals left of the head. q ∈ Q: q ≡ ∀rx .q (9) (Maxx u ¬Head) v LeftOfHead (10) LeftOfHead v ∀rx (Head t LeaftOfHead) (11) Head v ¬LeftOfHead u ∀rx .(¬Head u ¬LeftOfHead) (12) ¬LeftOfHead u ¬Head v ∀rx .(¬Head u ¬LeftOfHead) (13) The transitions that do not use the oracle are encoded using the following TBox axioms. For every q ∈ Q \ (F ∪ {q? }), γ ∈ Γ , γo ∈ Γo : ∃rx .(q u Head u γ u γo ) (14) G 0 0 0 v ∀ry .(Head u q u ∀rx .(γ u γo )) (15) hhq,γ,γo i,hq 0 ,γ 0 ,γo0 ,−1ii∈∆ G t 0 0 0 ∀ry .∀rx .(γ 0 u γo0 u q 0 u ∀rx .Head) (16) hhq,γ,γo i,hq ,γ ,γo ,+1ii∈∆ γ u ¬Head v ∀ry .γ (17) γo u ¬Head v ∀ry .γo (18) 70 Patrick Koopmann The following axioms now propagate the acceptance back to the root, so that we can require it for the observation to be entailed. Here, we use a fresh concept name Accept. G q v Accept (19) q∈F ∃ry .Accept v Accept (20) ∃rx .Accept v Accept (21) Our signature ΣT,w for the final abduction problem is defined as ΣT,w = Σx ∪ Σy ∪ Σz ∪ Q ∪ Γ ∪ Γo ∪ Head (22) It remains to encode the oracle. While we want the run of M to be represented in the hypothesis, we cannot do the same for the oracle, as we need to quantify over all possible runs (and ensure that none of them is accepting). We thus use a fresh concept name γo∗ for every γo ∈ Γo , since Γo is already used in ΣT,w . The following axioms, for every γo ∈ Γo , copy the information from the oracle tape to the input tape of the oracle machine, when the oracle is queried: q? u γo v γo∗ . Similarly, we use a concept name Tape∗ outside of the signature ΣT,w for the tape positions of the oracle machine. The following axiom ensures that the head of the oracle tape is always at the left-most position at the beginning: Maxx u Maxz v Tape∗ Except for axioms 14–16, the rest is encoded using the same axioms as in 7 – 18, however with the role ry replaced by rz , Tape replaced by Tape∗ , the tape alphabet now using the names γo∗ instead of γo , and of course all states and transitions now referring to the Turing machine Mo . To detect whether the oracle is rejecting, instead of the axioms 14–16, we represent the transitions as follows for every q ∈ Qo \ Fo and γ ∈ Γo . G ∃rx .(q u Head∗ u γ ∗ ) v ∀rz .(Head∗ u q 0 u ∀rx .(γ 0 )∗ ) hhq,γi,hq 0 ,γ 0 ,−1ii∈∆o G t 0 0 ∀rz .∀rx .((γ 0 )∗ u q 0 u ∀rx .Head∗ ) hhq,γi,hq ,γ ,+1ii∈∆ t Reject ∃x.Reject v Reject ∃z.Reject v Reject Those state that, if every computation path in Mo ends in a dead end that is not an accepting state, then in every model, Reject will be satisfied by the domain elements that represent the beginning of the computation. Finally, we connect both components, the encoding of the Turing machine M and the encoding of the Turing machine Mo , by introducing a concept name OracleChecked that is satisfied if on every configuration between the accepting configuration and the Signature-Based ABox Abduction in ALC is Hard 71 initial configuration, either there is no oracle call (¬q? ), or there is an oracle call (q? ), the oracle rejects, and the next state is q− . G q v OracleChecked q∈F ¬q? u ∃ry .OracleChecked v OracleChecked q? u Reject u ∃ry .(OracleChecked u q− ) v OracleChecked To complete the construction, we add the following GCI: Accept u OracleChecked u Grid v Observation Denote the knowledge base consisting of all axioms shown so far by KT,w . The final abduction problem is now the tuple AT,w = hKT,w , Observation(a), ΣT,w i. Lemma 1. There is a hypothesis for AT,w iff T accepts w. Together with the upper bound from [21], we thus obtain the following theorem. Theorem 5. Signature-based ABox abduction in ALC is N2E XP T IME NP -complete. 8 Conclusion One conclusion one could draw from the high complexity is that it may make sense to look into more restricted variants of ABox abduction, such as flat ABox abduction, where complex concepts are not allowed in the hypothesis: here hypotheses can get at most exponentially large, and the decision problem (for ALC) is only CO NE XP T IME- complete [21]. On the other hand, especially in the context of description logics, it is often observed that high theoretical complexity results do not mean that there cannot be implementations that work well in practice: for instance, reasoning in SROIQ, the DL underlying the web ontology language standard OWL, is N2E XP T IME-complete [18], while modern optimised DL reasoners process even large OWL ontologies in prac- tice [28]. An example more related to signature-based abduction is uniform interpola- tion, which for ALC may also lead to triple exponentially large ontologies [25], while systems like L ETHE [20] and FAME [35] can often successfully compute uniform inter- polants of realistic ontologies in practice. The current prototypes for signature-based abduction for ALC [9,24] performed reasonably well in an evaluation on real ontologies, however, they do not introduce fresh individual names. Instead, the more general approach in [24] uses additional DL constructors like inverse roles and nominals to express connections between named individuals. Arguably, the results of abduction would become more user friendly if they would use fresh individual names instead. An approach could be to first compute hypotheses with those systems, and afterwards simplify concepts by introduction of fresh individuals. The results in this paper indicate however that this could turn out challenging in practice, as the number of those individuals may become large in theory. 72 Patrick Koopmann References 1. Alrabbaa, C., Baader, F., Borgwardt, S., Koopmann, P., Kovtunova, A.: Finding small proofs for description logic entailments: Theory and practice. In: Albert, E., Kovacs, L. (eds.) LPAR-23: 23rd International Conference on Logic for Programming, Artificial In- telligence and Reasoning. EPiC Series in Computing, vol. 73, pp. 32–67. EasyChair (2020). https://doi.org/10.29007/nhpp 2. Alrabbaa, C., Baader, F., Borgwardt, S., Koopmann, P., Kovtunova, A.: Finding good proofs for description logic entailments using recursive quality measures. In: Platzer, A., Sut- cliffe, G. (eds.) Proceedings of the 28th International Conference on Automated Deduction (CADE’21). LNCS (2021), to appear. 3. Alrabbaa, C., Hieke, W., Turhan, A.Y.: Counter model transformation for explaining non- subsumption in EL. In: Beierle, C., Ragni, M., Stolzenburg, F., Thimm, M. (eds.) Proceed- ings of the 7th Workshop on Formal and Cognitive Reasoning (FCR-2021) co-located with the 44th German Conference on Artificial Intelligence (KI-2021). CEUR Workshop Proceed- ings, vol. 2961, pp. 9–22. CEUR-WS.org (2021) 4. Baader, F., Peñaloza, R., Suntisrivaraporn, B.: Pinpointing in the description logic EL+ . In: KI 2007: Advances in Artificial Intelligence, 30th Annual German Conference on AI, KI 2007, Osnabrück, Germany, September 10-13, 2007, Proceedings. pp. 52–67 (2007) 5. Bauer, J., Sattler, U., Parsia, B.: Explaining by example: Model exploration for ontology comprehension. In: Grau, B.C., Horrocks, I., Motik, B., Sattler, U. (eds.) Proceedings of the 22nd International Workshop on Description Logics (DL 2009), Oxford, UK, July 27-30, 2009. CEUR Workshop Proceedings, vol. 477. CEUR-WS.org (2009), http://ceur-ws.org/ Vol-477/paper 37.pdf 6. Bienvenu, M.: Complexity of abduction in the EL family of lightweight description logics. In: Proceedings of KR 2008. pp. 220–230. AAAI Press (2008), http://www.aaai.org/Library/ KR/2008/kr08-022.php 7. Calvanese, D., Ortiz, M., Simkus, M., Stefanoni, G.: Reasoning about explanations for neg- ative query answers in DL-Lite. J. Artif. Intell. Res. 48, 635–669 (2013). https://doi.org/10. 1613/jair.3870 8. Ceylan, İ.İ., Lukasiewicz, T., Malizia, E., Molinaro, C., Vaicenavicius, A.: Explanations for negative query answers under existential rules. In: Calvanese, D., Erdem, E., Thielscher, M. (eds.) Proceedings of KR 2020. pp. 223–232. AAAI Press (2020). https://doi.org/10.24963/ kr.2020/23 9. Del-Pinto, W., Schmidt, R.A.: ABox abduction via forgetting in ALC. In: The Thirty- Third AAAI Conference on Artificial Intelligence, AAAI 2019. pp. 2768–2775. AAAI Press (2019). https://doi.org/10.1609/aaai.v33i01.33012768 10. Du, J., Qi, G., Shen, Y., Pan, J.Z.: Towards practical ABox abduction in large description logic ontologies. Int. J. Semantic Web Inf. Syst. 8(2), 1–33 (2012). https://doi.org/10.4018/ jswis.2012040101 11. Du, J., Wan, H., Ma, H.: Practical TBox abduction based on justification patterns. In: Pro- ceedings of the Thirty-First AAAI Conference on Artificial Intelligence. pp. 1100–1106 (2017), http://aaai.org/ocs/index.php/AAAI/AAAI17/paper/view/14402 12. Du, J., Wang, K., Shen, Y.: A tractable approach to ABox abduction over description logic ontologies. In: Brodley, C.E., Stone, P. (eds.) Proceedings of the Twenty-Eighth AAAI Con- ference on Artificial Intelligence. pp. 1034–1040. AAAI Press (2014), http://www.aaai.org/ ocs/index.php/AAAI/AAAI14/paper/view/8191 13. Elsenbroich, C., Kutz, O., Sattler, U.: A case for abductive reasoning over ontologies. In: Proceedings of the OWLED’06 Workshop on OWL: Experiences and Directions (2006), http://ceur-ws.org/Vol-216/submission 25.pdf Signature-Based ABox Abduction in ALC is Hard 73 14. Haifani, F., Koopmann, P., Tourret, S.: Abduction in EL via translation to FOL. In: Schmidt, R.A., Wernhard, C., Zhao, Y. (eds.) SOQE 2021: The 2nd Workshop on Second-Order Quan- tifier Elimination and Related Topics. CEUR Workshop Proceedings, CEUR-WS.org (2021) 15. Haifani, F., Koopmann, P., Tourret, S.: Introducing connection minimal abduction for EL ontologies. In: Explainable Logic-Based Knowledge Representation (XLoKR 2021) (2021) 16. Halland, K., Britz, K.: ABox abduction in ALC using a DL tableau. In: 2012 South African Institute of Computer Scientists and Information Technologists Conference, SAICSIT ’12. pp. 51–58 (2012). https://doi.org/10.1145/2389836.2389843 17. Horridge, M.: Justification Based Explanation in Ontologies. Ph.D. thesis, University of Manchester, UK (2011), https://www.research.manchester.ac.uk/portal/files/54511395/ FULL TEXT.PDF 18. Kazakov, Y.: RIQ and SROIQ are harder than SHOIQ. In: Brewka, G., Lang, J. (eds.) Princi- ples of Knowledge Representation and Reasoning: Proceedings of the Eleventh International Conference, KR 2008, Sydney, Australia, September 16-19, 2008. pp. 274–284. AAAI Press (2008), http://www.aaai.org/Library/KR/2008/kr08-027.php 19. Klarman, S., Endriss, U., Schlobach, S.: ABox abduction in the description logic ALC. Journal of Automated Reasoning 46(1), 43–80 (2011). https://doi.org/10.1007/ s10817-010-9168-z 20. Koopmann, P.: LETHE: forgetting and uniform interpolation for expressive description log- ics. Künstliche Intell. 34(3), 381–387 (2020). https://doi.org/10.1007/s13218-020-00655-w 21. Koopmann, P.: Signature-based abduction with fresh individuals and complex concepts for description logics. In: Zhou, Z. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI 2021. pp. 1929–1935. ijcai.org (2021), https: //doi.org/10.24963/ijcai.2021/266 22. Koopmann, P.: Signature-based abduction with fresh individuals and complex concepts for description logics (extended version). CoRR abs/2105.00274 (2021), https://arxiv.org/abs/ 2105.00274 23. Koopmann, P.: Two ways of explaining negative entailments in description logics using ab- duction. In: Explainable Logic-Based Knowledge Representation (XLoKR 2021) (2021) 24. Koopmann, P., Del-Pinto, W., Tourret, S., Schmidt, R.A.: Signature-based abduction for ex- pressive description logics. In: Calvanese, D., Erdem, E., Thielscher, M. (eds.) Proceedings of the 17th International Conference on Principles of Knowledge Representation and Rea- soning, KR 2020. pp. 592–602. AAAI Press (2020). https://doi.org/10.24963/kr.2020/59 25. Lutz, C., Wolter, F.: Foundations for uniform interpolation and forgetting in expressive de- scription logics. In: Proceedings of IJCAI 2011. pp. 989–995. IJCAI/AAAI (2011). https: //doi.org/10.5591/978-1-57735-516-8/IJCAI11-170 26. McGuinness, D.L.: Explaining Reasoning in Description Logics. Ph.D. thesis, Rutgers Uni- versity, NJ, USA (1996). https://doi.org/10.7282/t3-q0c6-5305 27. Obeid, M., Obeid, Z., Moubaiddin, A., Obeid, N.: Using description logic and Abox abduc- tion to capture medical diagnosis. In: Wotawa, F., Friedrich, G., Pill, I., Koitz-Hristov, R., Ali, M. (eds.) Advances and Trends in Artificial Intelligence. From Theory to Practice. pp. 376–388. Springer International Publishing, Cham (2019) 28. Parsia, B., Matentzoglu, N., Gonçalves, R.S., Glimm, B., Steigmiller, A.: The OWL reasoner evaluation (ORE) 2015 competition report. J. Autom. Reason. 59(4), 455–482 (2017). https: //doi.org/10.1007/s10817-017-9406-8 29. Peirce, C.S.: Deduction, induction, and hypothesis. Popular science monthly 13, 470–482 (1878) 30. Peñaloza, R.: Axiom-Pinpointing in Description Logics and Beyond. Ph.D. thesis, Technis- che Universität Dresden, Germany (2009) 74 Patrick Koopmann 31. Pukancová, J., Homola, M.: Tableau-based ABox abduction for the ALCHO description logic. In: Proceedings of the 30th International Workshop on Description Logics (2017), http://ceur-ws.org/Vol-1879/paper11.pdf 32. Pukancová, J., Homola, M.: The AAA ABox abduction solver. Künstliche Intell. 34(4), 517– 522 (2020). https://doi.org/10.1007/s13218-020-00685-4 33. Schlobach, S., Cornet, R.: Non-standard reasoning services for the debugging of description logic terminologies. In: Gottlob, G., Walsh, T. (eds.) Proc. of the 18th Int. Joint Conf. on Artificial Intelligence (IJCAI 2003). pp. 355–362. Morgan Kaufmann, Acapulco, Mexico (2003), http://ijcai.org/Proceedings/03/Papers/053.pdf 34. Wei-Kleiner, F., Dragisic, Z., Lambrix, P.: Abduction framework for repairing incomplete EL ontologies: Complexity results and algorithms. In: Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence. pp. 1120–1127. AAAI Press (2014), http://www.aaai. org/ocs/index.php/AAAI/AAAI14/paper/view/8239 35. Zhao, Y., Schmidt, R.A.: FAME(Q): an automated tool for forgetting in description logics with qualified number restrictions. In: Fontaine, P. (ed.) Automated Deduction - CADE 27 - 27th International Conference on Automated Deduction, Natal, Brazil, August 27-30, 2019, Proceedings. Lecture Notes in Computer Science, vol. 11716, pp. 568–579. Springer (2019). https://doi.org/10.1007/978-3-030-29436-6 34