
Abductive Reasoning with Description Logics: Use Case in Medical Diagnosis

Comenius University in Bratislava, Mlynská dolina, 84248 Bratislava, Slovakia

Ontologies have been increasingly used as a core representation formalism in medical information systems. Diagnosis is one of the highly relevant reasoning problems in this domain. In recent years this problem has also captured attention in the description logics (DL) community, and various proposals for formalising abductive reasoning problems and their computational support have appeared. In this paper, we focus on a practical diagnostic problem from the medical domain, the diagnosis of diabetes mellitus, and we try to formalize it in DL in such a way that the expected diagnoses are abductively derived. Our aim in this work is to analyze abductive reasoning in DL from a practical perspective, considering more complex cases than the trivial examples typically considered by the theory- or algorithm-centered literature, and to evaluate the expressivity as well as the particular formulation of the abductive reasoning problem needed to capture medical diagnosis.

Keywords: diagnosis, abduction, use case

Abduction, originally introduced by Peirce [15], is a form of backward reasoning, typically with a diagnostic rationale. We have a knowledge base $\mathcal{K}$ that is supposed to model some problem, and we have an observation $O$ which is supposed to follow in situations captured by $\mathcal{K}$, but we are not able to explain $O$ deductively, i.e., $\mathcal{K} \not\models O$. In abductive reasoning we ask why $O$ does not follow from $\mathcal{K}$, and we look for a hypothesis (or explanation) $H$ such that, if it is added to $\mathcal{K}$, $O$ follows from the resulting knowledge base. Moreover, we most typically look for explanations consisting of extensional rather than intensional knowledge, i.e., some set of ground facts that will, together with $\mathcal{K}$, explain $O$.

Abduction has only recently attracted researchers' interest in the area of ontologies and DL [7], where it has some interesting applications, including possible explanations of incomplete modelling or incomplete matching [4], monitoring malfunctions in complex systems [11], and multimedia interpretation [16], among others.

The problem of diagnosis, often shown as a classic example of abductive reasoning [7,9], is highly relevant in the medical domain, not only for primary diagnosis of a certain disease (as in the model example in this use case), but also in emerging applications such as telemedical monitoring systems and ambient assisted living, where the patient's condition is continually monitored and diagnosed for anomalies, therapy adherence, etc. Most of these applications nowadays rely heavily on ontologies (i.e., DL-based knowledge bases), which have been increasingly used as a core representation formalism for clinical knowledge. While abduction over DL has been studied especially from the theoretical and the algorithmic perspective, we are not aware of any case studies focusing on the practical aspects of modelling problems for abductive reasoning.

In this paper, we focus on a practical diagnostic problem from the medical domain: the diagnosis of diabetes mellitus. Based on information from clinical guidelines and other relevant sources (e.g., [1,2]), we formalize it in DL in such a way that the expected diagnoses are abductively derived. While we simplify the problem for reasons of conciseness, we do abstract a number of distinct, more or less problematic cases that need to be addressed, including: (a) dealing with the hierarchy of symptoms and possible diagnoses, (b) differential and elimination diagnosis, (c) associated conditions with similar symptoms, (d) distinguishing and reporting complications, and some more.

In the analysis that follows, we evaluate the modelled examples from the perspective of which particular variant of abduction is being addressed, what DL expressivity is needed, and we highlight the most important modelling issues that we run into.

In the end, we learned the following lessons. Medical diagnosis especially requires ABox abduction, as hypothesizing intensional knowledge in this domain is typically not desired; that is the area of domain experts. We were mostly able to model our simplified examples in the rather inexpressive DL $\mathcal{ALC}$, for which abductive reasoning is available [9,12,13], although examples requiring more complex constructs can also be found. Finally, modelling diagnostic knowledge bases with DL is different from modelling typical ontologies. In order to get the desired explanations, the statements often need to be formulated more strongly, so that the desired observations follow. Also, to compare the generated hypotheses, at least part of the knowledge is also used deductively. Combining abductive and deductive reasoning within one knowledge base poses some difficulties, even on simplistic examples.

2 Abductive Reasoning with DL

The basic abductive framework for DL was introduced by Elsenbroich et al. [7] who proposed formulations for a number of distinct abductive problems. The main three types are summarized below. We will assume that the reader is already familiar with DL, the split of the knowledge base into the TBox (intensional knowledge) and the ABox (extensional knowledge), basic syntax, and semantics [3].

Definition 1 (Abduction problems in DL [7]). An abduction problem is a pair $P = (\mathcal{K}, O)$ such that $\mathcal{K}$ is a knowledge base in DL and $O$ is a TBox or ABox assertion. A solution of $P$ is any finite set $H$ of TBox and ABox assertions such that $\mathcal{K} \cup H$ is consistent and $\mathcal{K} \cup H \models O$. In addition, $P$ is called a
– TBox abduction problem if $H$ is a set of TBox assertions and $O$ is a TBox assertion;
– ABox abduction problem if $H$ is a set of ABox assertions and $O$ is an ABox assertion;
– knowledge base abduction problem in the general case, i.e., if there are no restrictions on $H$ and $O$.
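As a rough illustration of Definition 1, the following sketch checks the two solution conditions on a propositional Horn approximation rather than on a real DL knowledge base. The rule encoding, the atom names (`diag_DM`, `symp_S2`, `excluded_DM`) and the `BOTTOM` marker are assumptions made only for this example; actual ABox assertions would require a DL reasoner.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

Rule = Tuple[FrozenSet[str], str]  # (body atoms, head atom); hypothetical encoding
BOTTOM = "bottom"                  # hypothetical atom marking inconsistency


def closure(rules: List[Rule], facts: Iterable[str]) -> Set[str]:
    """Forward-chain the Horn rules over the given facts until a fixpoint."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if body <= derived and head not in derived:
                derived.add(head)
                changed = True
    return derived


def is_solution(rules: List[Rule], hypothesis: Iterable[str],
                observation: Iterable[str]) -> bool:
    """K together with H must be consistent and must entail every atom of O."""
    derived = closure(rules, hypothesis)
    return BOTTOM not in derived and set(observation) <= derived


# Toy knowledge base K: DM manifests as excessive thirst (S2), and DM cannot
# be both hypothesized and excluded for the same patient.
K = [
    (frozenset({"diag_DM"}), "symp_S2"),
    (frozenset({"diag_DM", "excluded_DM"}), BOTTOM),
]

print(is_solution(K, {"diag_DM"}, {"symp_S2"}))                 # consistent, entails O
print(is_solution(K, {"diag_DM", "excluded_DM"}, {"symp_S2"}))  # inconsistent with K
print(is_solution(K, set(), {"symp_S2"}))                       # O does not follow from K alone
```

The check deliberately ignores the TBox/ABox distinction; in this flattened setting every hypothesis is a set of ground facts about a single patient.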

The definition gives a generic framework for abduction, but it is not very useful without further constraining the possible hypotheses. The number of possible explanations is very high, even infinite; therefore we need to be able to compare them and select the most preferred ones. The commonly used restrictions include [7]:

Definition 2. Given an abduction problem $P = (\mathcal{K}, O)$ and hypotheses $H$, $H'$, we say that:
1. $H$ is consistent if $H \cup \mathcal{K} \not\models \bot$, i.e., $H$ is consistent w.r.t. $\mathcal{K}$;
2. $H$ is relevant if $H \not\models O$, i.e., $H$ does not entail $O$;
3. $H$ is explanatory if $\mathcal{K} \not\models O$, i.e., $\mathcal{K}$ does not entail $O$;
4. $H$ is stronger than $H'$ ($H \preceq_{\mathcal{K}} H'$) if $\mathcal{K} \cup H$ entails $H'$ (and, vice versa, $H$ is weaker than $H'$ if $\mathcal{K} \cup H'$ entails $H$);
5. $H$ is minimal if for every $H'$, $H' \preceq_{\mathcal{K}} H$, i.e., $H$ is weaker than any other $H'$.

In general, $H$ is a preferred solution if it is consistent, relevant, explanatory, and there is no strictly weaker solution $H'$ (i.e., such that $H \preceq_{\mathcal{K}} H'$ and $H' \not\preceq_{\mathcal{K}} H$). If there is a single such solution, it is called the most preferred. Such hypotheses are (semantically) minimal. Minimality is important, because if there is a (strictly) weaker hypothesis than $H$, it means that $H$ hypothesizes too much. As abduction amounts to guessing, in a sense, we do not want to guess more than necessary in order to derive the observation.
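The comparison of hypotheses can be sketched in a propositional Horn approximation: a hypothesis is stronger than another when, together with the knowledge base, it entails the other, and a solution is preferred when no strictly weaker solution exists. The rule encoding and the atom names below are assumptions for illustration only, and the candidate solutions are assumed to be already consistent, relevant and explanatory.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

Rule = Tuple[FrozenSet[str], str]  # (body atoms, head atom); hypothetical encoding


def closure(rules: List[Rule], facts: Iterable[str]) -> Set[str]:
    """Forward-chain the Horn rules over the given facts until a fixpoint."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if body <= derived and head not in derived:
                derived.add(head)
                changed = True
    return derived


def stronger(rules: List[Rule], h1: Iterable[str], h2: Iterable[str]) -> bool:
    """h1 is stronger than h2 if K together with h1 entails all of h2."""
    return set(h2) <= closure(rules, h1)


def preferred(rules: List[Rule],
              solutions: List[FrozenSet[str]]) -> List[FrozenSet[str]]:
    """Keep the solutions for which no strictly weaker solution exists."""
    return [
        h for h in solutions
        if not any(stronger(rules, h, g) and not stronger(rules, g, h)
                   for g in solutions)
    ]


# Toy K: DM1 is a subtype of DM, and DM manifests as symptom S2.
K = [
    (frozenset({"diag_DM1"}), "diag_DM"),
    (frozenset({"diag_DM"}), "symp_S2"),
]
solutions = [frozenset({"diag_DM"}), frozenset({"diag_DM1"})]
print(preferred(K, solutions))
```

Here the hypothesis about the subtype guesses more than is needed to derive the symptom, so only the weaker hypothesis about DM survives the filter.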

Consistency is required because from an inconsistency ($\mathcal{K} \cup H \models \bot$) one is able to derive everything. Such hypotheses would explain every observation, and so it is not meaningful to find solutions that are not consistent with the knowledge base $\mathcal{K}$. The knowledge base $\mathcal{K}$ represents the background theory from which we are interested in deriving the hypotheses. Therefore we should not be able to explain the observation without it; hypotheses for which $H \models O$ are therefore irrelevant. Similarly, an abduction problem only needs explaining if the observation does not already follow from $\mathcal{K}$.

The diagnostic problems in the medical domain, which are our interest in this paper, will most often call for ABox abduction. This is because our aim is not to enrich the knowledge base with new axioms; these are typically sufficiently described by domain experts. Therefore our purpose is to build a knowledge base formalizing the available expert knowledge (TBox) and use it to find explanations in the form of facts (ABox assertions) for any observation, which is typically also a fact (ABox assertion).

A number of researchers have addressed the computational solution of abduction problems. We will focus on those who addressed ABox abduction. Klarman et al. [12] proposed an algorithm for ABox abduction on top of $\mathcal{ALC}$ based on resolution. They show that it is sound and complete. Halland and Britz [9] developed, also for $\mathcal{ALC}$, a method based on the DL tableau algorithm. Ma et al. [13] also rely on the DL tableau algorithm, but extend the approach towards $\mathcal{ALCI}$. Completeness was not shown by Ma et al., while Halland and Britz explicitly note that their approach is incomplete.

Du et al. [5] solve abductive reasoning in DL via a reduction to logic programming, where abduction has been extensively studied [6]. In logic programming, abductive explanations are not arbitrary, but are typically drawn from a set of distinguished literals called abducibles. This is because the user is often able to characterize in which part of the knowledge the hypothesis is expected, and thus to reduce the search space. To transfer this notion to the area of DL, Du et al. introduced a new variant of the ABox abduction problem, which we will call simple, of the form $P = (\mathcal{K}, A, O)$. Besides the knowledge base $\mathcal{K}$ and the observation $O$ (a set of concept assertions or atomic role assertions), it adds the abducibles $A$, a set of atomic concepts and atomic roles. An abductive solution $H$ for $P = (\mathcal{K}, A, O)$ is a minimal set of ABox axioms composed of individuals of $\mathcal{K}$ and concepts or roles of $A$, such that $\mathcal{K} \cup H \models O$. The solution should be relevant and consistent. The simple ABox abduction problem may be solved by a reduction to logic programming and subsequent evaluation by a ready-made reasoning engine for logic programming.

3 Diabetes Mellitus Use Case


Diabetes mellitus (DM) is a group of metabolic diseases characterized by hyperglycemia resulting from defects in insulin secretion, insulin action, or both. The chronic hyperglycemia of diabetes is associated with long-term damage, dysfunction, and failure of various organs, especially the eyes, kidneys, nerves, heart, and blood vessels [2]. We chose DM for our use case because its diagnosis is a complex problem, with the need to distinguish between particular subtypes and associated conditions, to identify possible complications, etc.

Our aim is to conceptualize a KB in DL that can be used for the diagnosis of DM relying on abductive reasoning. As the problem is rather complex, we will concentrate on selected specific subproblems, and we will also abstract from some details which can be implemented analogously.

3.1 Hierarchy of Symptoms

Typical symptoms of diabetes mellitus include: frequent urination, excessive thirst, hyperglycemia, blurred vision, anorexia, weight loss, fatigue, and weakness. There are some more, but they can be added to the formalization analogously. In the following, diabetes mellitus is represented by the concept symbol $\mathit{DM}$, and the symptoms by $S_1, \dots, S_8$, respectively. In abductive reasoning, we observe some set of symptoms and try to hypothesize the most relevant diagnosis. Therefore we record the relation between the diagnosis and its symptoms by the axiom:

$$\exists \mathit{hasDiag}.\mathit{DM} \sqsubseteq \exists \mathit{hasSymp}.(S_1 \sqcap \dots \sqcap S_8) \tag{1}$$

Literally (more precisely: deductively), this axiom means that whoever has DM must simultaneously manifest all eight symptoms $S_1, \dots, S_8$, which is not always true. It does, however, allow us to generate relevant abductive hypotheses: if the patient $p$ is observed to have any subset of the symptoms $S_1, \dots, S_8$ (e.g., we have an observation $O_1 = p : (\exists \mathit{hasSymp}.S_2) \sqcap (\exists \mathit{hasSymp}.S_8)$), then $H_1 = \{p : \exists \mathit{hasDiag}.\mathit{DM}\}$ is among the generated diagnoses. Note that this kind of modelling is simplified, as it ignores uncertainties, which are often present in medical knowledge. In this paper we accept this simplification, as we want to explore the possibilities of abduction with regular DL. Introducing uncertainties is left for future work.
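The reason why a partial observation is covered can be spelled out as a short entailment chain. This is only a worked restatement of the argument above for $O_1$, using axiom (1) and the fact that a conjunction is subsumed by each of its conjuncts; it adds nothing beyond the notation already introduced.

```latex
\begin{align*}
\mathcal{K} \cup H_1
  &\models p : \exists \mathit{hasDiag}.\mathit{DM}
  && \text{(by } H_1\text{)} \\
  &\models p : \exists \mathit{hasSymp}.(S_1 \sqcap \dots \sqcap S_8)
  && \text{(by axiom (1))} \\
  &\models p : \exists \mathit{hasSymp}.S_2
     \ \text{ and } \ p : \exists \mathit{hasSymp}.S_8
  && \text{(since } S_1 \sqcap \dots \sqcap S_8 \sqsubseteq S_i\text{)} \\
  &\models p : (\exists \mathit{hasSymp}.S_2) \sqcap (\exists \mathit{hasSymp}.S_8)
  && \text{(which is } O_1\text{)}
\end{align*}
```

The same chain works for any non-empty subset of $S_1, \dots, S_8$, which is why the axiom is stated in this deliberately strong form.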

There may be relations between symptoms: one symptom may be a more specific or a synonymous name for another. For instance, polydipsia ($S_9$) is a synonym for excessive thirst ($S_2$). This is modelled by adding:

$$S_9 \equiv S_2 \tag{2}$$

Now, whenever we observe $S_9$ instead of $S_2$ among some other symptoms of DM, we derive the same hypotheses (e.g., if $O_2 = \{p : (\exists \mathit{hasSymp}.S_9) \sqcap (\exists \mathit{hasSymp}.S_7)\}$, $H_1$ is still abductively derived).

A more complex relationship among symptoms may occur in cases when some symptoms are conditions which may themselves be manifested by some other symptoms. For instance, anorexia ($S_5$) is a condition associated with weight loss, fatigue, and weakness ($S_6, \dots, S_8$). This may be modelled in two ways, adding either (3–4) or (5):

$$S_5 \sqsubseteq \exists \mathit{hasSymp}.(S_6 \sqcap \dots \sqcap S_8) \tag{3}$$

$$\mathit{hasSymp} \circ \mathit{hasSymp} \sqsubseteq \mathit{hasSymp} \tag{4}$$

$$S_5 \sqsubseteq S_6 \sqcap \dots \sqcap S_8 \tag{5}$$

This then allows us to keep the axioms that relate diagnoses to symptoms more concise, e.g., we may now replace (1) by:

$$\exists \mathit{hasDiag}.\mathit{DM} \sqsubseteq \exists \mathit{hasSymp}.(S_1 \sqcap \dots \sqcap S_5) \tag{6}$$

Potential diagnoses of our interest are listed in Table 1. These concepts can be readily taken from a number of medical ontologies. We chose to root them in SNOMED CT, where they are present as disorders. As in many cases in the medical domain, the main diagnosis (DM) has some subtypes which we want to distinguish. There are some alternative diagnoses (e.g., diabetes insipidus) which need to be ruled out or confirmed during the diagnostic process. And there are some related diagnoses (e.g., obesity, ketoacidosis) which may take part in the diagnosis of DM as relevant symptoms. The diagnoses thus form a hierarchy.

We will narrow down our focus on the diagnoses listed in Table 1. The respective part of the hierarchy is formalized using the following DL axioms:

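$$\mathit{DM1} \sqcup \mathit{DM2} \sqcup \mathit{GD} \sqsubseteq \mathit{DM} \tag{7}$$

$$\mathit{LADA} \sqsubseteq \mathit{DM1} \tag{8}$$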

The hierarchy of diagnoses has significant influence on the generated hypotheses. For instance, considering the axioms (5–6) formalizing the symptoms of DM together with (7–8), if some of the symptoms of DM are observed (e.g., $O_1$ or $O_2$), the abductive reasoner will generate a number of diagnoses: $H_1 = \{p : \exists \mathit{hasDiag}.\mathit{DM}\}$ as before, but in addition also $H_2 = \{p : \exists \mathit{hasDiag}.\mathit{DM1}\}$, $H_3 = \{p : \exists \mathit{hasDiag}.\mathit{DM2}\}$, $H_4 = \{p : (\exists \mathit{hasDiag}.\mathit{DM1}) \sqcap (\exists \mathit{hasDiag}.\mathit{DM2})\}$, and similar hypotheses for all (asserted or derived) subconcepts of $\mathit{DM}$. This is because they now all allow us to derive the given observations. On the other hand, if we compare these hypotheses semantically, we see that $H_i \preceq_{\mathcal{K}} H_1$ for $i > 1$ in the hypotheses above, as $\mathcal{K} \cup H_i \models H_1$, and never vice versa. Hence only $H_1$ will be preferred.

A typical task in medical diagnostics is to distinguish one diagnosis from another, often similar one. This is called differential diagnosis. The two diagnoses may be distinguished by considering some symptoms relevant to one of them, but not the other.

We will consider DM1 and DM2, as differentiating between them is a relevant medical problem. DM1 and DM2 have some symptoms in common, but they also have some different symptoms. In this case, the common symptoms are those of DM. When the patient has some of these symptoms we say that she has DM; this case is covered by axioms (5–6).

Specific symptoms of DM1 include: belly pain, vomiting, fruity breath odor, drowsiness, and coma ($S_{10}, \dots, S_{14}$). This is formalized as follows:

$$\exists \mathit{hasDiag}.\mathit{DM1} \sqsubseteq \exists \mathit{hasSymp}.(S_{10} \sqcap \dots \sqcap S_{14}) \tag{9}$$

Analogously, the symptoms of DM2 include: skin problems, slow healing, tingling, numbness, and high BMI. We will name these $S_{15}, \dots, S_{19}$, which gives us the axiom:

$$\exists \mathit{hasDiag}.\mathit{DM2} \sqsubseteq \exists \mathit{hasSymp}.(S_{15} \sqcap \dots \sqcap S_{19}) \tag{10}$$

Together the three diagnoses are now covered by (5–7) and (9–10). Let us consider some observations and the respective hypotheses:

– If we observe some set of symptoms that are common to both DM1 and DM2, e.g., again the observation $O_1 = p : (\exists \mathit{hasSymp}.S_2) \sqcap (\exists \mathit{hasSymp}.S_8)$, then the most preferred diagnosis will be $H_1 = \{p : \exists \mathit{hasDiag}.\mathit{DM}\}$. The hypotheses $H_2 = \{p : \exists \mathit{hasDiag}.\mathit{DM1}\}$, $H_3 = \{p : \exists \mathit{hasDiag}.\mathit{DM2}\}$, and $H_4 = \{p : (\exists \mathit{hasDiag}.\mathit{DM1}) \sqcap (\exists \mathit{hasDiag}.\mathit{DM2})\}$ will also be valid abductive explanations (among others), but as we already discussed in Sect. 3.2, they are all stronger than $H_1$, and hence $H_1$ will be the most preferred.
– If we observe at least one symptom specific to DM1, e.g., the observation $O_3 = p : (\exists \mathit{hasSymp}.S_2) \sqcap (\exists \mathit{hasSymp}.S_{10})$, this is no longer abductively explained by $H_1$, nor by $H_3$. The most preferred hypothesis will be $H_2$. $H_4$ is an abductive explanation as well, but it is stronger than $H_2$, and hence $H_2$ is preferred.
– The case when we observe some specific symptoms of DM2 (but none of DM1) is exactly analogous.
– Finally, if we observe specific symptoms of both DM1 and DM2, e.g., we have $O_4 = p : (\exists \mathit{hasSymp}.S_{10}) \sqcap (\exists \mathit{hasSymp}.S_{15})$, the most preferred hypothesis that abductively explains this observation is $H_4$. This is because in the case of $H_1$–$H_3$ there is always some symptom which is not explained. There are other explanations (e.g., $H_5 = \{p : \exists \mathit{hasDiag}.(\mathit{DM1} \sqcap \mathit{DM2})\}$), but they are all stronger than $H_4$.
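The cases above can be reproduced in a small sketch that flattens the axioms into propositional Horn rules over a single patient. This is an approximation and not the DL semantics: each existential restriction becomes one atom, so a hypothesis such as $H_5$ cannot be expressed. The function names, the atom names and the filter that discards hypotheses containing a derivable atom are assumptions made for this example.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, List, Set, Tuple

Rule = Tuple[FrozenSet[str], str]  # (body atoms, head atom); hypothetical encoding


def closure(rules: List[Rule], facts: Iterable[str]) -> Set[str]:
    """Forward-chain the Horn rules over the given facts until a fixpoint."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if body <= derived and head not in derived:
                derived.add(head)
                changed = True
    return derived


def explanations(rules: List[Rule], abducibles: Iterable[str],
                 observation: Iterable[str]) -> List[FrozenSet[str]]:
    """Subsets of abducibles that cover the observation and contain no derivable atom."""
    found = []
    atoms = sorted(abducibles)
    for size in range(1, len(atoms) + 1):
        for combo in combinations(atoms, size):
            h = frozenset(combo)
            redundant = any(a in closure(rules, h - {a}) for a in h)
            if not redundant and set(observation) <= closure(rules, h):
                found.append(h)
    return found


def preferred(rules: List[Rule],
              solutions: List[FrozenSet[str]]) -> List[FrozenSet[str]]:
    """Keep the solutions for which no strictly weaker solution exists."""
    def stronger(h1, h2):
        return h2 <= closure(rules, h1)
    return [h for h in solutions
            if not any(stronger(h, g) and not stronger(g, h) for g in solutions)]


# Flattened versions of the hierarchy (7) and the symptom axioms (1), (9), (10).
K = (
    [(frozenset({"DM1"}), "DM"), (frozenset({"DM2"}), "DM")]
    + [(frozenset({"DM"}), f"S{i}") for i in range(1, 9)]
    + [(frozenset({"DM1"}), f"S{i}") for i in range(10, 15)]
    + [(frozenset({"DM2"}), f"S{i}") for i in range(15, 20)]
)
DIAGNOSES = {"DM", "DM1", "DM2"}


def diagnose(observation: Iterable[str]) -> List[FrozenSet[str]]:
    return preferred(K, explanations(K, DIAGNOSES, observation))


print(diagnose({"S2", "S8"}))    # observation O1
print(diagnose({"S2", "S10"}))   # observation O3
print(diagnose({"S10", "S15"}))  # observation O4
```

The brute-force enumeration is exponential in the number of abducible atoms and is meant only to make the preference ordering tangible on three diagnoses.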

3.4 Case of Secondary Explanation of the Observation

During differential diagnosis it is also important to recognize cases when the given set of observed symptoms may have other possible explanations than the disease in question. For instance, one of the major symptoms of DM (hyperglycemia, named $S_3$) may be caused as a side effect of some medication.

As the axiom (6) includes $S_3$ as one of the DM symptoms, we are already able to explain the observation of $S_3$. Another possible explanation is taking certain medications (call them $M_1$). So we add a new axiom:

$$\exists \mathit{hasMedication}.M_1 \sqsubseteq \exists \mathit{hasSymp}.S_3 \tag{11}$$

As this is a very simple conceptualization, there is only one observation for which this axiom plays a role, namely $O_5 = p : \exists \mathit{hasSymp}.S_3$. We then have exactly two hypotheses, $H_6 = \{p : \exists \mathit{hasMedication}.M_1\}$ and $H_7 = \{p : \exists \mathit{hasDiag}.\mathit{DM}\}$. Neither $H_6$ nor $H_7$ is stronger than the other, so both are preferred.

This result means that we cannot be sure which explanation is correct, and we have to continue with the diagnosis. We are satisfied with this answer, because in the medical domain too, the presence of hyperglycemia alone is insufficient to decide between taking medication and having the diagnosis DM.

3.5 Associated Conditions

In the medical domain, relations between diagnoses may come into play. Some associated conditions may have similar symptoms, or a subset of the symptoms, of other diagnoses. Thus, in fact, the symptoms $S_{10}, \dots, S_{14}$ in Axiom (9) are symptoms of an associated diagnosis called ketoacidosis. Similarly, the symptom $S_{19}$ is a symptom of obesity, an associated diagnosis of DM2.

To take this into account, we add to (9) a new axiom (12), and analogously to (10) a new axiom (13), as follows:

$$\exists \mathit{hasDiag}.\mathit{KA} \sqsubseteq \exists \mathit{hasSymp}.(S_{10} \sqcap \dots \sqcap S_{14}) \tag{12}$$

$$\exists \mathit{hasDiag}.\mathit{Ob} \sqsubseteq \exists \mathit{hasSymp}.S_{19} \tag{13}$$

When we take the observations $O_1$ and $O_3$ and try to explain them using axioms (5–7) and (9–13), we see some changes:

– The case of observation $O_1 = p : (\exists \mathit{hasSymp}.S_2) \sqcap (\exists \mathit{hasSymp}.S_8)$ is not affected by the newly added axioms, as the symptom $S_2$ is not explained by any of them. So, the most preferred hypothesis is again $H_1$.
– If we observe at least one symptom specific to ketoacidosis together with some symptoms of DM (which are all symptoms of DM1, as we already know), e.g., the observation $O_3 = p : (\exists \mathit{hasSymp}.S_2) \sqcap (\exists \mathit{hasSymp}.S_{10})$, then besides $H_2 = \{p : \exists \mathit{hasDiag}.\mathit{DM1}\}$ we also have to consider the hypothesis $H_6 = \{p : (\exists \mathit{hasDiag}.\mathit{DM}) \sqcap (\exists \mathit{hasDiag}.\mathit{KA})\}$: they both explain $O_3$ and both are preferred to any other, but they are mutually incomparable. However, $H_6$ is certainly unexpected, as so far we have modelled the axioms in such a way that the symptoms of DM and ketoacidosis are abductively explained by the diagnosis of DM1. To solve this we have to add yet another axiom:

$$\exists \mathit{hasDiag}.\mathit{DM} \sqcap \exists \mathit{hasDiag}.\mathit{KA} \sqsubseteq \exists \mathit{hasDiag}.\mathit{DM1} \tag{14}$$

The axiom also enables us to compare the two hypotheses deductively, i.e., any patient with both diagnoses DM and ketoacidosis is inferred to also have DM1. Hence $H_6$ is now stronger than $H_2$, and so $H_2$ is the single most preferred hypothesis.
– The case when we observe some specific symptoms of obesity together with symptoms of DM is analogous. We get the expected hypothesis $H_3 = \{p : \exists \mathit{hasDiag}.\mathit{DM2}\}$, but to suppress $H_7 = \{p : (\exists \mathit{hasDiag}.\mathit{DM}) \sqcap (\exists \mathit{hasDiag}.\mathit{Ob})\}$ as less preferred, we need to add:

$$\exists \mathit{hasDiag}.\mathit{DM} \sqcap \exists \mathit{hasDiag}.\mathit{Ob} \sqsubseteq \exists \mathit{hasDiag}.\mathit{DM2} \tag{15}$$

– If we observe only the symptoms shared by obesity and DM2 (in our simplified example, only $O_5 = p : \exists \mathit{hasSymp}.S_{19}$), the two preferred and incomparable hypotheses are $H_8 = \{p : \exists \mathit{hasDiag}.\mathit{Ob}\}$ and $H_3 = \{p : \exists \mathit{hasDiag}.\mathit{DM2}\}$. In this case, this is in accord with the medical knowledge: this symptom can be caused either by one condition or by the other.
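The effect of axiom (14) on the comparison of hypotheses can be checked on a flattened, propositional Horn version of the knowledge base. As before, this is only an approximation of the DL semantics; the helper names, the `with_axiom_14` switch and the atom names are assumptions introduced for this sketch.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, List, Set, Tuple

Rule = Tuple[FrozenSet[str], str]  # (body atoms, head atom); hypothetical encoding


def closure(rules: List[Rule], facts: Iterable[str]) -> Set[str]:
    """Forward-chain the Horn rules over the given facts until a fixpoint."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if body <= derived and head not in derived:
                derived.add(head)
                changed = True
    return derived


def preferred_explanations(rules: List[Rule], abducibles: Iterable[str],
                           observation: Iterable[str]) -> List[FrozenSet[str]]:
    """Enumerate explanations without derivable atoms, then drop those with a strictly weaker rival."""
    atoms = sorted(abducibles)
    found = []
    for size in range(1, len(atoms) + 1):
        for combo in combinations(atoms, size):
            h = frozenset(combo)
            redundant = any(a in closure(rules, h - {a}) for a in h)
            if not redundant and set(observation) <= closure(rules, h):
                found.append(h)

    def stronger(h1, h2):
        return h2 <= closure(rules, h1)

    return [h for h in found
            if not any(stronger(h, g) and not stronger(g, h) for g in found)]


def knowledge_base(with_axiom_14: bool) -> List[Rule]:
    """Flattened axioms (1), (7), (9), (10), (12), (13), optionally with (14)."""
    rules = (
        [(frozenset({"DM1"}), "DM"), (frozenset({"DM2"}), "DM")]
        + [(frozenset({"DM"}), f"S{i}") for i in range(1, 9)]
        + [(frozenset({"DM1"}), f"S{i}") for i in range(10, 15)]
        + [(frozenset({"DM2"}), f"S{i}") for i in range(15, 20)]
        + [(frozenset({"KA"}), f"S{i}") for i in range(10, 15)]
        + [(frozenset({"Ob"}), "S19")]
    )
    if with_axiom_14:
        rules.append((frozenset({"DM", "KA"}), "DM1"))
    return rules


DIAGNOSES = {"DM", "DM1", "DM2", "KA", "Ob"}
O3 = {"S2", "S10"}

print(preferred_explanations(knowledge_base(False), DIAGNOSES, O3))
print(preferred_explanations(knowledge_base(True), DIAGNOSES, O3))
```

Without the extra rule the two hypotheses are incomparable and both survive; with it, the combination of DM and ketoacidosis entails DM1 and is therefore filtered out as strictly stronger.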

3.6 Case of Complications

However, in certain circumstances, one may wish to model the associated diagnoses differently, to capture a closer relation between them. This is, for instance, the case of DM1 and ketoacidosis. When we consider $O_3 = p : (\exists \mathit{hasSymp}.S_2) \sqcap (\exists \mathit{hasSymp}.S_{10})$, the modelling from the previous section gives the single most preferred diagnosis $H_2 = \{p : \exists \mathit{hasDiag}.\mathit{DM1}\}$. While ketoacidosis is also indicated by symptom $S_{10}$, the hypothesis $H_9 = \{p : \exists \mathit{hasDiag}.\mathit{KA}\}$ is not an abductive explanation, as it does not explain $S_2$.

This solution does not correctly capture the importance of ketoacidosis presence in diabetic patients. Ketoacidosis does not merely share symptoms with DM1, but it is an acute complication thereof which rarely occurs in non-diabetic individuals.

Therefore, we would intuitively want the explanation $H_{10} = \{p : (\exists \mathit{hasDiag}.\mathit{DM1}) \sqcap (\exists \mathit{hasDiag}.\mathit{KA})\}$ in this case. In fact, $H_{10}$ also explains $O_3$, but it is stronger than $H_2$ and hence less preferred.

To achieve this, we have to remodel the knowledge so that, given some symptoms that are shared by both DM1 and ketoacidosis, only the conjunction of these diagnoses explains them. We replace (9) together with (12) by a new axiom:

$$(\exists \mathit{hasDiag}.\mathit{DM1}) \sqcap (\exists \mathit{hasDiag}.\mathit{KA}) \sqsubseteq \exists \mathit{hasSymp}.(S_{10} \sqcap \dots \sqcap S_{14}) \tag{16}$$

This, however, still does not force $H_{10}$ as the single most preferred hypothesis: due to the presence of (7) and (14) we now have $H_{10} \preceq_{\mathcal{K}} H_6$ and $H_6 \preceq_{\mathcal{K}} H_{10}$, hence both $H_{10}$ and $H_6$ are equally preferred. However, recall that we asserted (14) only to support a slightly different relation between DM1 and ketoacidosis in the previous section, so when we drop this axiom, $H_{10}$ remains as the single most preferred diagnosis.

This last move reflects the fact that, in order to model different relationships between preferred explanations, we also need to manipulate the deductive part of the KB (used during deductive comparisons of hypotheses). What is more, this manipulation is not just additive but selective, which may cause problems in more complex cases with a higher number of variously interrelated diagnoses.
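The mutual comparison of $H_{10}$ and $H_6$ that holds while (14) is still present can be written out in two steps. This is a worked restatement under the axioms (7) and (14) as given in the text, not an additional result.

```latex
\begin{align*}
H_{10} \preceq_{\mathcal{K}} H_6:\quad
  & \mathcal{K} \cup H_{10} \models
    p : (\exists \mathit{hasDiag}.\mathit{DM1}) \sqcap (\exists \mathit{hasDiag}.\mathit{KA}) \\
  & \Rightarrow\ \mathcal{K} \cup H_{10} \models
    p : (\exists \mathit{hasDiag}.\mathit{DM}) \sqcap (\exists \mathit{hasDiag}.\mathit{KA})
  && \text{(by (7), } \mathit{DM1} \sqsubseteq \mathit{DM}\text{)} \\[4pt]
H_6 \preceq_{\mathcal{K}} H_{10}:\quad
  & \mathcal{K} \cup H_6 \models
    p : (\exists \mathit{hasDiag}.\mathit{DM}) \sqcap (\exists \mathit{hasDiag}.\mathit{KA}) \\
  & \Rightarrow\ \mathcal{K} \cup H_6 \models
    p : (\exists \mathit{hasDiag}.\mathit{DM1}) \sqcap (\exists \mathit{hasDiag}.\mathit{KA})
  && \text{(by (14))}
\end{align*}
```

Once (14) is dropped, the second implication no longer goes through, so $H_6$ no longer entails $H_{10}$; since $H_6$ also fails to explain $O_3$ under (16) alone, it is not a competing solution.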

3.7 Case of Further Examination Needed

Consider again the case of one hypothesis being a more specific case of another. Typically, we have two diagnoses which have some common symptoms, while the more specific one likely has some additional symptoms (like with DM1 and DM, or DM2 and DM above). If we observe only some of the common symptoms, our previous modelling derives the less specific hypothesis. The more specific hypothesis also explains the observations, but there is no additional evidence for it, so we prefer the less specific one.

Most of the time this is expected, but not all the time. It may be necessary to differentiate between such hypotheses by all means, and so, if perhaps some evidence is lacking it should be obtained by additional examination if possible, by a laboratory test for instance. Hence the outcome from the diagnosis procedure should not be just the less specific hypothesis, but instead it should indicate also that additional tests are needed.

From a medical perspective this might be illustrated by the following problem: some patients who fit a certain profile (they are older, and not obese) and typically show symptoms common to DM1 may in fact have a specific type of DM1 called LADA (cf. Table 1 and axiom (8)). To differentiate between these two diagnoses, medical practitioners are advised to test antibodies (e.g., GADA) in a blood sample.

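As this case leads to some complex modelling, we will explain it on simplified examples. Assume that LADA is a specific form of DM1, that DM1 has some symptom ($S'_1$), and that LADA has an additional specific one ($S'_2$). That is, we start from:

$$\mathit{LADA} \sqsubseteq \mathit{DM1} \tag{17}$$

$$\exists \mathit{hasDiag}.\mathit{DM1} \sqsubseteq \exists \mathit{hasSymp}.S'_1 \tag{18}$$

$$\exists \mathit{hasDiag}.\mathit{LADA} \sqsubseteq \exists \mathit{hasSymp}.S'_2 \tag{19}$$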

Now we end up exactly in the situation described above. If only DM1 symptoms are observed (i.e., in our simplified case $O_6 = p : \exists \mathit{hasSymp}.S'_1$), then the preferred hypothesis is $H_2 = \{p : \exists \mathit{hasDiag}.\mathit{DM1}\}$. In order to resolve this we may try to use (20) instead of (18):

$$(\exists \mathit{hasDiag}.\mathit{DM1}) \sqcap (\exists \mathit{needS}.\mathit{LT}) \sqsubseteq \exists \mathit{hasSymp}.S'_1 \tag{20}$$

We now get the expected most preferred hypothesis $H_{11} = \{p : (\exists \mathit{hasDiag}.\mathit{DM1}) \sqcap (\exists \mathit{needS}.\mathit{LT})\}$ for $O_6$, and for $O_7 = p : \exists \mathit{hasSymp}.S'_2$ we get $H_{12} = \{p : \exists \mathit{hasDiag}.\mathit{LADA}\}$ as most preferred. However, for $O_8 = p : (\exists \mathit{hasSymp}.S'_1) \sqcap (\exists \mathit{hasSymp}.S'_2)$ we now get $H_{13} = \{p : (\exists \mathit{hasDiag}.\mathit{LADA}) \sqcap (\exists \mathit{needS}.\mathit{LT})\}$ as most preferred, as the need for the lab test is now necessary to explain $S'_1$. This is unintuitive, as once we observe $S'_2$ we know that the patient has LADA and no more tests are needed. It is easy to verify that it does not help to alter (19) to (21):

$$\exists \mathit{hasDiag}.\mathit{LADA} \sqsubseteq \exists \mathit{hasSymp}.(S'_1 \sqcap S'_2) \tag{21}$$

This is because (17), (20), and (21) now give both $H_{11}$ and $H_{12}$ as preferred for $O_6$, and they are incomparable. But clearly $H_{12}$ is exactly the wrong hypothesis here. We are getting into a vicious circle.

The only solution we were able to come up with is to start treating the symptoms as completely specified. That is, either $S'_1$ or $\neg S'_1$ is always part of the observation, and the same holds for $S'_2$ and other symptoms possibly involved in this part of the derivation. Using (17), (22), and (23), we always get the expected results:

$$(\exists \mathit{hasDiag}.\mathit{DM1}) \sqcap (\exists \mathit{needS}.\mathit{LT}) \sqsubseteq \exists \mathit{hasSymp}.(S'_1 \sqcap \neg S'_2) \tag{22}$$

$$\exists \mathit{hasDiag}.\mathit{LADA} \sqsubseteq \exists \mathit{hasSymp}.(S'_1 \sqcap S'_2) \sqcap \exists \mathit{hasSymp}.(\neg S'_1 \sqcap S'_2) \tag{23}$$

However, we now have to ask queries differently. For $O'_6 = p : \exists \mathit{hasSymp}.(S'_1 \sqcap \neg S'_2)$ the most preferred hypothesis is $H_{11}$, and for both $O'_7 = p : \exists \mathit{hasSymp}.(\neg S'_1 \sqcap S'_2)$ and $O_8$ (no need to change here) we get $H_{12}$.

So, we were able to get the expected results, but at a considerable price. Treating symptoms as completely specified is not in line with the usual intuitions behind abduction, where it is normal to assume that the observations are incomplete, and we are tasked to give the most appropriate explanation for any such given observation. In addition, it leads to a considerable blow-up in the axioms, where all possible combinations of positive and negative symptoms need to be enumerated.
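The size of this blow-up is easy to make concrete: with completely specified symptoms, every symptom is either asserted or negated, so the number of distinct observation patterns doubles with each symptom. The sketch below merely enumerates these sign patterns; the function name and the textual `not` prefix are assumptions for illustration, and how many of the patterns actually need their own axiom depends on the modelling.

```python
from itertools import product
from typing import List, Sequence, Tuple


def complete_observations(symptoms: Sequence[str]) -> List[Tuple[str, ...]]:
    """Enumerate every fully specified observation over the given symptoms."""
    patterns = []
    for signs in product((True, False), repeat=len(symptoms)):
        patterns.append(tuple(
            name if present else f"not {name}"
            for name, present in zip(symptoms, signs)
        ))
    return patterns


# The two symptoms of the simplified LADA example give four patterns.
for pattern in complete_observations(["S1'", "S2'"]):
    print(pattern)

# The eight DM symptoms S1..S8 already give 2**8 patterns.
print(len(complete_observations([f"S{i}" for i in range(1, 9)])))
```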

4 Discussion

All formalizations in our use case are instances of the ABox abduction problem as defined in Definition 1, based on Elsenbroich et al. [7]. The explanations that we seek are typically of the specific form of a single assertion

$$p : (\exists R_1.C_1) \sqcap \dots \sqcap (\exists R_n.C_n) \tag{24}$$

involving a single patient $p$, where $R_1, \dots, R_n$ most typically come from some a priori known set of roles which are relevant based on the domain knowledge. A similar assumption may often hold also for the concepts $C_1, \dots, C_n$ (e.g., most often these will be atomic concepts for various diagnoses and conditions relevant to the patient's state).

While exceptions to these constraints may certainly be found (e.g., a second person involved, from whom the patient contracted the disease), in many cases the form of expected explanations will be reducible to atomic diagnoses by adding "interface" axioms of the form:

$$\mathit{Diag}_1 \equiv (\exists R_1.C_1) \sqcap \dots \sqcap (\exists R_n.C_n) \tag{25}$$

The hypothesis of the complex form (24) now reduces to the atomic form $p : \mathit{Diag}_1$, which makes it possible to postulate the problem as simple ABox abduction. This is an important observation, as Du et al. [5] showed that the simple abduction problem can be effectively solved even for DLs up to $\mathcal{SHIQ}$.
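The reduction to atomic hypotheses can be illustrated on a flattened, propositional Horn version of the complication example. The sketch restricts hypotheses to a set of abducible "interface" atoms and returns the subset-minimal ones; reading "minimal" as subset-minimal, as well as the atom names `DiagDM` and `DiagDM1KA` and the rule encoding, are assumptions of this example and not the procedure of Du et al. [5].

```python
from itertools import combinations
from typing import FrozenSet, Iterable, List, Set, Tuple

Rule = Tuple[FrozenSet[str], str]  # (body atoms, head atom); hypothetical encoding


def closure(rules: List[Rule], facts: Iterable[str]) -> Set[str]:
    """Forward-chain the Horn rules over the given facts until a fixpoint."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if body <= derived and head not in derived:
                derived.add(head)
                changed = True
    return derived


def minimal_explanations(rules: List[Rule], abducibles: Iterable[str],
                         observation: Iterable[str]) -> List[FrozenSet[str]]:
    """Subset-minimal sets of abducible atoms whose closure covers the observation."""
    atoms = sorted(abducibles)
    found: List[FrozenSet[str]] = []
    for size in range(1, len(atoms) + 1):
        for combo in combinations(atoms, size):
            h = frozenset(combo)
            if any(g <= h for g in found):
                continue  # a smaller explanation is already contained in h
            if set(observation) <= closure(rules, h):
                found.append(h)
    return found


K = (
    [(frozenset({"DM1"}), "DM")]                                      # from (7)
    + [(frozenset({"DM"}), f"S{i}") for i in range(1, 9)]             # from (1)
    + [(frozenset({"DM1", "KA"}), f"S{i}") for i in range(10, 15)]    # from (16)
    # interface axioms in the style of (25), both directions of each equivalence
    + [(frozenset({"DiagDM"}), "DM"), (frozenset({"DM"}), "DiagDM")]
    + [(frozenset({"DiagDM1KA"}), "DM1"), (frozenset({"DiagDM1KA"}), "KA"),
       (frozenset({"DM1", "KA"}), "DiagDM1KA")]
)
ABDUCIBLES = {"DiagDM", "DiagDM1KA"}

print(minimal_explanations(K, ABDUCIBLES, {"S2", "S10"}))
```

Because only the interface atoms are abducible, trivial hypotheses that simply restate the symptoms are never generated, which is the search-space reduction that abducibles are meant to provide.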

Most of the examples we have shown rely on the fairly basic DL $\mathcal{ALC}$, for which abduction support is known [12,13,5,9]. We used complex role inclusions to capture dependencies between symptoms, but we also showed a simpler modelling which does not require them. Other complex DL constructs that might be needed in more realistic situations certainly include number restrictions and restrictions over concrete domains (known, e.g., from OWL [14]), which would support more elaborate statements about symptoms (the patient has at least some number of symptoms, or the numeric value of some symptom lies in a specific range). Extensions of abductive reasoning to more expressive DLs that include such constructs are therefore desirable.

From a modelling perspective, an interesting lesson learned from the use case is that modelling a knowledge base to be used in a classical, deductive way and modelling it to support abductive reasoning pose different, and sometimes conflicting, requirements. As we noted in Sect. 3.1, it is often the case that a certain explanation is expected to be valid for a number of symptoms, including any subsets thereof. Hence the typical approach that we relied upon is to formulate the axioms in a stronger fashion: the explanation implies all the symptoms, hence any subset follows. This kind of modelling is also demonstrated in the literature [7,5].

Firstly, we note that this requires the knowledge used in an abductive application to be modelled differently from what is usual for ontologies, which are normally subject to deductive reasoning. Secondly, as we have repeatedly observed in this paper, even the process of abduction has an important subtask in which hypotheses are compared, and strictly deductive reasoning is used for this. This results in complex inter-relations between the "abductive" and the "deductive" part of the knowledge base and may lead to rather complicated modelling. A possible way to deal with this is to treat the abductive and the deductive knowledge separately [16], or to use a more refined formulation of the abduction problem [10]. We plan to try this in the future.

Finally, we note that abduction also has a number of advantages, e.g., when compared to deductive diagnostic reasoning. In a number of works [8,17,18] relying upon the latter, deductive rules of the form $S_1, \dots, S_n \to D$ are used, where the $S_i$ are symptoms and $D$ is the diagnosis. The authors point out a problem occurring in the case of two diagnoses $D_1$, $D_2$, with the former having a subset of the symptoms of the latter. Using deductive inference, and observing all symptoms of $D_2$, both $D_1$ and $D_2$ are derived. This is counterintuitive, as specific symptoms of $D_2$ were observed which are not symptoms of $D_1$. This problem has to be addressed by some suitable workarounds. In comparison, as we have demonstrated in the use case, abductive reasoning naturally eliminates the hypotheses which do not explain all of the observed symptoms.
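The contrast can be shown on a minimal example with two diagnoses, where the symptoms of the first are a subset of the symptoms of the second. The table of symptoms and the names `deduce` and `abduce` are hypothetical and chosen only for this illustration; the abductive variant is reduced to the bare requirement that a diagnosis must explain every observed symptom.

```python
from typing import Dict, Set

# Hypothetical symptom table: the symptoms of D1 are a subset of those of D2.
SYMPTOMS_OF: Dict[str, Set[str]] = {
    "D1": {"s1", "s2"},
    "D2": {"s1", "s2", "s3"},
}


def deduce(observed: Set[str]) -> Set[str]:
    """Deductive rules S1, ..., Sn -> D: fire when all symptoms of D are observed."""
    return {d for d, symptoms in SYMPTOMS_OF.items() if symptoms <= observed}


def abduce(observed: Set[str]) -> Set[str]:
    """Abductive reading: keep D only if it explains every observed symptom."""
    return {d for d, symptoms in SYMPTOMS_OF.items() if observed <= symptoms}


observed = {"s1", "s2", "s3"}
print(sorted(deduce(observed)))  # both rules fire
print(sorted(abduce(observed)))  # only the diagnosis covering s3 remains
```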

Acknowledgments. This work was supported by VEGA project no. 1/1333/12. Júlia Pukancová is also supported by grant GUK no. UK/426/2015.

References

1. American Association of Clinical Endocrinologists: Diabetes care plan guidelines. Endocrine Practice 17 (2011)

2. American Diabetes Association: Diagnosis and classification of diabetes mellitus. Diabetes Care 31 (2008)

3. Baader, F., Calvanese, D., McGuinness, D.L., Nardi, D., Patel-Schneider, P.F. (eds.): The Description Logic Handbook: Theory, Implementation, and Applications. Cambridge University Press (2003)

4. Colucci, S., Noia, T.D., Sciascio, E.D., Donini, F.M., Mongiello, M.: Concept abduction and contraction in description logics. In: Proceedings of the 2003 International Workshop on Description Logics (DL2003), Rome, Italy, September 5-7, 2003 (2003)

5. Du, J., Qi, G., Shen, Y., Pan, J.Z.: Towards practical ABox abduction in large description logic ontologies. Int. J. Semantic Web Inf. Syst. 8(2), 1-33 (2012)

6. Eiter, T., Gottlob, G., Leone, N.: Abduction from logic programs: Semantics and complexity. Theor. Comput. Sci. 189(1-2), 129-177 (1997)

7. Elsenbroich, C., Kutz, O., Sattler, U.: A case for abductive reasoning over ontologies. In: Proceedings of the OWLED*06 Workshop on OWL: Experiences and Directions, Athens, Georgia, USA, November 10-11, 2006 (2006)

8. García-Crespo, Á., Rodríguez, A., Mencke, M., Gómez-Berbís, J.M., Colomo-Palacios, R.: Oddin: Ontology-driven differential diagnosis based on logical inference and probabilistic refinements. Expert Systems with Applications 37(3), 2621-2628 (2010)

9. Halland, K., Britz, K.: Naïve ABox abduction in ALC using a DL tableau. In: Proceedings of the 2012 International Workshop on Description Logics, DL-2012, Rome, Italy, June 7-10, 2012. Sun SITE Central Europe (CEUR) (2012)

10. Hubauer, T., Lamparter, S., Pirker, M.: Relaxed abduction: Robust information interpretation for incomplete models. In: Proceedings of the 24th International Workshop on Description Logics (DL 2011), Barcelona, Spain, July 13-16, 2011 (2011)

11. Hubauer, T., Legat, C., Seitz, C.: Empowering adaptive manufacturing with interactive diagnostics: A multi-agent approach. In: Advances on Practical Applications of Agents and Multiagent Systems - 9th International Conference on Practical Applications of Agents and Multiagent Systems, PAAMS 2011, Salamanca, Spain, 6-8 April 2011. pp. 47-56 (2011)

12. Klarman, S., Endriss, U., Schlobach, S.: ABox abduction in the description logic ALC. Journal of Automated Reasoning 46(1), 43-80 (2011)

13. Ma, Y., Gu, T., Xu, B., Chang, L.: An ABox abduction algorithm for the description logic ALCI. In: Intelligent Information Processing VI - 7th IFIP TC 12 International Conference. pp. 125-130 (2012)

14. OWL Working Group (ed.): OWL 2 Web Ontology Language Document Overview. Recommendation, W3C (27 October 2009)

15. Peirce, C.S.: Deduction, induction, and hypothesis. Popular Science Monthly 13, 470-482 (1878)

16. Petasis, G., Möller, R., Karkaletsis, V.: BOEMIE: Reasoning-based information extraction. In: Proceedings of the 1st Workshop on Natural Language Processing and Automated Reasoning co-located with 12th International Conference on Logic Programming and Nonmonotonic Reasoning (LPNMR 2013), Corunna, Spain, September 15th, 2013. pp. 60-75 (2013)

17. Rodríguez, A., Labra, J., Alor-Hernandez, G., Gómez, J.M., Posada-Gomez, R.: ADONIS: Automated diagnosis system based on sound and precise logical descriptions. In: Proc. of 22nd IEEE International Symposium (2009)

18. Rodríguez-González, A., Labra-Gayo, J.E., Colomo-Palacios, R., Mayer, M.A., Gómez-Berbís, J.M., García-Crespo, A.: SeDeLo: Using semantics and description logics to support aided clinical diagnosis. Journal of Medical Systems 36(4), 2471-2481 (2012)