Representation of Parsimonious Covering Theory in OWL-DL Cory Henson, Krishnaprasad Thirunarayan, Amit Sheth, Pascal Hitzler Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis) Wright State University, Dayton, Ohio, USA {cory, tkprasad, amit, pascal}@knoesis.org Abstract. The Web Ontology Language has not been designed for representing abductive inference, which is often required for applications such as medical disease diagnosis. As a consequence, existing OWL ontologies have limited ability to encode knowledge for such applications. In the last 150 years, many logic frameworks for the representation of abductive inference have been developed. Among these frameworks, Parsimonious Covering Theory (PCT) has achieved wide recognition. PCT is a formal model of diagnostic reasoning in which knowledge is represented as a network of causal associations, and whose goal is to account for observed symptoms with plausible explanatory hypotheses. In this paper, we argue that OWL does provide some of the expressivity required to approximate diagnostic reasoning, and outline a suitable encoding of PCT in OWL-DL. Keywords: OWL, Parsimonious Covering Theory, Abductive Reasoning 1 Introduction Abduction is often described as inference to the best explanation. Given some background knowledge and a set of observations, an abductive reasoner will compute a set of best explanations. In general, abduction is formalized as Σ ⋀ Δ ⊧ Γ where background knowledge Σ and observations Γ are given, and explanations Δ (also called abducibles) are to be computed (⊧ refers to the first-order logic consequence relation). One highly popular abductive reasoning framework is the Parsimonious Covering Theory (PCT) [1]. PCT has predominantly been used in the domain of medical disease diagnosis. Reasoning in PCT is executed by algorithms that support a hypothesize-and-test inference process, and is driven by background knowledge modeled as a bipartite graph causally linking disorders to manifestations (see Figure 1). The basic premise of PCT is that diagnostic reasoning can be divided into two parts: coverage and parsimony. The coverage criterion describes how to generate a set of explanations such that each given observation is caused by a disorder in the explanation (an observation is a manifestation that has been observed). In complicated domains, such as medical disease diagnosis, the number and size of explanations may grow to be large. In order to reduce to a more reasonable size, the parsimony criterion describes how to choose which explanations are best. Many different parsimony criteria have been advanced, including minimum cardinality criterion, subset minimality (irredundancy) criterion, etc. [1]. The single disorder assumption is a simple, yet effective, parsimony criterion that has also proved popular in the past; it states that explanations may contain only a single disorder. While OWL [2] may not have been designed for representing abductive inference, the integration with abductive reasoning has been explored [3]; however previous approaches have required modification of OWL syntax and/or an OWL inference engine [4]. In this work we will Figure 1. Background knowledge modeled demonstrate that OWL does provide some of the as a bipartite graph with an example set of expressivity needed to approximate diagnostic observations and valid explanations. reasoning – without extension of its syntax or semantics – by outlining a suitable encoding of PCT in OWL-DL. We caution the reader, however, that the OWL representation discussed does not explicitly implement PCT, but only approximates PCT, since OWL inference does not support a hypothesize-and-test inference process. 2 Representation of PCT in OWL The task of representing PCT in OWL involves encoding the background knowledge Σ and the set of observations Γ in an OWL ontology such that the execution of OWL reasoning results in explanations Δ that satisfy both the coverage and parsimony criteria. An explanation is a cover if, for each observation, there is a causal relation from a disorder contained in the explanation to the observation. An explanation is parsimonious, or best, if it contains only a single disorder. Thus, an explanation is a parsimonious cover if the disorders in an explanation cause all the given observations. To better clarify our task, we first describe the process of abduction where background knowledge Σ = ⟨D,M,C⟩ and observations Γ are given, and explanations Δ are to be inferred. More specifically, an abduction problem P (in PCT) is a 4-tuple ⟨D,M,C,Γ⟩ where D is a finite set of disorders; M is a finite set of manifestations; C : D ⟶ P(M) is the causation function; and Γ ⊆ M is the set of observations. (P(S) represents the powerset of S, and C maps a disorder to the corresponding set of manifestations it causes). For any disorder d ∈ D and manifestation m ∈ M, effects(d) = C(d) and causes(m) = {d | m ∈ C(d)}. effects(D) = ∈ . The set DI ⊆ D is said to be a cover of MJ ⊆ M if MJ ⊆ effects(DI). A set Δ ⊆ D is said to be an explanation of Γ for a problem P = ⟨D,M,C,Γ⟩ if and only if Δ covers Γ and Δ satisfies a given "miminality" (parsimony) criterion. A cover DI of MJ is said to be minimal if its cardinality is smallest among all covers of M J. A cover DI of MJ is said to be irredundant if none of its proper subsets is also a cover of M J [5]. The next step is to translate the above representation into OWL. The translation – o(P), with PCT problem P – is summarized in Table 1. The translation of Σ into OWL is straightforward. To translate the set of disorders D, create a class Disorder, and for all d ∈ D create an individual of type Disorder by asserting Disorder(d) (1). To translate the set of manifestations M, create a class Manifestation, and for all m ∈ M create an individual of type Manifestation Table 1. PCT-to-OWL by asserting Manifestation(m) (2). Finally, PCT OWL-DL to translate the set of causal relation 1 D for all d ∈ D assert Disorder(d) instances C, create an object property 2 M for all m ∈ M assert Manifestation(m) causes, and for all disorders in the domain 3 C for all m ∈ C(d) assert causes(d,m) of C and for each m ∈ C(d), create a 4 Γ Explanation ≡ ∃causes.{m1} ⊓…⊓ causes fact by asserting causes(d,m) (3). ∃causes.{mn}, where mi ∈ Γ The translation of the set of 5 Δ for each Δ = {d}, Explanation(d) holds observations Γ into OWL is not as straightforward. To translate Γ, first select an observation m1 ∈ Γ and create an existentially quantified property restriction for the causes relation, ∃causes.{m1}. For each additional observation mi ∈ Γ, i=2,...,n, create an additional existentially quantified property restriction for the causes relation and conjoin it to the previous restriction, ∃causes.{m1} ⊓ … ⊓ ∃causes.{mn}. Finally, create a class Explanation and define it to be equivalent to the conjunction of restrictions, Explanation ≡ ∃causes.{m1} ⊓…⊓ ∃causes.{mn} (4). To generate explanations Δ, execute a query for all individuals of type Explanation by posing the query Explanation(?x). Explanation(d) is a result of this query if {d} is a parsimonious cover (5). Note that the resulting knowledge base lies in the tractable OWL 2 EL profile of OWL 2 [2]. Theorem. Given a PCT problem P = ⟨D,M,C,Γ⟩ and its translation o(P) into OWL, Δ = {d} is a PCT explanation if and only if Explanation(d) is deduced by an OWL-DL reasoner; i.e., iff o(P) ⊧ Explanation(d). Proof: (⟹) If {d} is a parsimonious cover of Γ = {m1,…,mn} then, by definition, Γ ⊆ C(d). By construction of causes in o(P), d : ∃causes.{m1} ⊓…⊓ ∃causes.{mn}. Hence, by definition of Explanation, o(P) ⊧ Explanation(d) holds. (⟸) To justify our claim that this OWL representation approximates PCT, we verify that all query results satisfy both the coverage and the parsimony criteria. To satisfy the coverage criterion, each binding of ?x, for the query Explanation(?x), must be a disorder that causes all the observations in Γ. This follows from the definition of Explanation ≡ ∃causes.{m1} ⊓…⊓ ∃causes.{mn}. That is, Explanation(d) implies causes(d,m1) ⊓…⊓ causes(d,mn). To satisfy the parsimony criterion, each binding of ?x must be a single disorder. This follows since each disorder that binds to ?x is a single individual. This completes the proof. If we want to generalize the definition of Explanation to allow for covers with multiple disorders, then the parsimony criterion cannot easily be expressed in OWL, since it would require minimization of the extension of a predicate. Simulation by using multiple queries may be an option, by incrementally generating cover candidates and checking whether each constitutes an explanation. This seems hardly efficient, though, and also unsatisfactory because the parsimony criterion itself is not modeled.1 Previously, we have written a meta-interpreter for more general cases [8]. 1 In a circumscriptive version of OWL [6, 7] it could easily be modeled. 3 Example Consider the following example taken from WebMD [9] which relates the disorders, flu and cold, to their manifestations. Figure 2 provides a graphical representation of the background knowledge and Table 2 shows the PCT to OWL translations for the background knowledge (1-3), a set of given observations {sneezing, sore-throat, mild- cough} (4), and the explanation query results composed of the disorder cold (5). For brevity, we show only a few translations, but the remainder should be apparent. Table 2. Example translation from PCT into OWL PCT OWL 1 D = {flu, cold} Disorder(flu) Disorder(cold) 2 M = {fever, headache, Manifestation(fever) …} Manifestation(headache) … 3 C = {C(flu)={fever, causes(flu, fever) headache} …} causes(flu, headache) … 4 Γ = {sneezing, sore- Explanation ≡ throat, mild-cough} ∃causes.{sneezing}⊓ ∃causes.{sore-throat}⊓ ∃causes.{mild-cough} Figure 2. Background knowledge 5 Δ = {cold} Explanation(cold) causally relating flu and cold to their manifestations (WebMD [7]). 4 Conclusion We have shown an interesting use of OWL for diagnostic reasoning. Specifically, we have shown that a restricted form of explanation, according to PCT, can be obtained through a suitable encoding of PCT in OWL. Based on the popularity of PCT, in combination with the single disorder assumption, it is apparent that many applications can benefit from a representation of PCT in OWL by taking advantage of the machinery of OWL inference and by enabling the easy reuse of data on the Web. Acknowledgements. This research was supported in part by The Dayton Area Graduate Studies Institute (DAGSI), AFRL/DAGSI Research Topic SN08-8: "Architectures for Secure Semantic Sensor Networks for Multi-Layered Sensing." Pascal Hitzler acknowledges support by the National Science Foundation under award 1017225 "III: Small: TROn – Tractable Reasoning with Ontologies." References 1. Reggia, J.A., Peng, Y.: Modeling Diagnostic Reasoning: A Summary of Parsimonious Covering Theory. Computer Methods and Programs Biomedicine, 25:125-34 (1987). 2. Hitzler, P., Krötzsch, M., Parsia, B., Patel-Schneider P.F., Rudolph S., OWL 2 Web Ontology Language: Primer. W3C Recommendation, 27 October 2009 (2009). 3. Elsenbroich, C., Kutz, O., Sattler, U.: A case for abductive reasoning over ontologies. In: Cuenca Grau, B., Hitzler, P., Shankey, C., Wallace, E. (Eds).: Proceedings of the OWLED06 Workshop on OWL: Experiences and Directions, Athens, Georgia, USA, November 10-11, 2006. CEUR Workshop Proceedings 216 (2006). 4. Peraldi, S.E., Kaya, A., Möller, R.: Formalizing multimedia interpretation based on abduction over description logic aboxes. In: Cuenca Grau, B., Horrocks, I., Motik, B., Sattler, U. (Eds).: Proceedings of the 22nd International Workshop on Description Logics (DL 2009), Oxford, UK, July 27-30, 2009. CEUR Workshop Proceedings 477 (2009). 5. Thirunarayan, K., Dasigi, V.: On the Relationship between Abductive Reasoning and Boolean Minimization. In: Proceedings of AAAI 91 Workshop: Towards Domain Independent Strategies for Abduction, pp. 84-88, (1991). 6. Grimm, S., Hitzler, P.: A Preferential Tableaux Calculus for Circumscriptive ALCO. In: Polleres, A., Swift, T. (Eds.), Web Reasoning and Rule Systems, Third International Conference, RR 2009, Chantilly, VA, USA, October 2009, Proceedings. Lecture Notes in Computer Science Vol. 5837, Springer, pp. 40-54, (2009). 7. Bonatti, P., Lutz, C., Wolter, F.: The Complexity of Circumscription in DLs. J. Artif. Intell. Res. 35: 717-773 (2009) 8. Thirunarayan, K., Henson, C., Sheth, A.: Situation Awareness via Abductive Reasoning for Semantic Sensor Data: A Preliminary Report. In: Proceedings of the 2009 International Symposium on Collaborative Technologies and Systems, IEEE, pp. 111-118, (2009). 9. WebMD.: Is it a Cold or the Flu? http://www.webmd.com/cold-and-flu/is-it-a-cold-or-flu (retrieved April 20, 2011).