=Paper= {{Paper |id=Vol-432/paper-20 |storemode=property |title=Modeling ontologies using OWL, Description Graphs, and Rules |pdfUrl=https://ceur-ws.org/Vol-432/owled2008eu_submission_13.pdf |volume=Vol-432 |dblpUrl=https://dblp.org/rec/conf/owled/MotikGHS08 }} ==Modeling ontologies using OWL, Description Graphs, and Rules== https://ceur-ws.org/Vol-432/owled2008eu_submission_13.pdf
    Modeling Ontologies Using OWL, Description
                Graphs, and Rules

    Boris Motik1 , Bernardo Cuenca Grau1 , Ian Horrocks1, and Ulrike Sattler2
                             1
                              University of Oxford, UK
                         2
                             University of Manchester, UK



1     Introduction

Ontologies often describe structured objects, which consist of many parts inter-
connected in complex ways. Such objects abound in molecular biology and the
clinical sciences. Clinical ontologies such as GALEN, the Foundational Model of
Anatomy (FMA), and the National Cancer Institute (NCI) Thesaurus describe
numerous structured objects. For example, FMA models the human hand as
consisting of the fingers, the palm, various bones, blood vessels, and so on, all
of which are highly interconnected.
    Modeling structured objects poses numerous problems to the OWL family
of languages. The design of OWL DL and OWL 2 has been driven by the desire
to provide practically useful knowledge modeling primitives while ensuring de-
cidability of reasoning. The latter goal has been achieved by ensuring that the
selected primitives have a variant of the tree-model property [1]: each satisfiable
OWL knowledge base has a model whose elements are connected in a tree-like
manner. Thus, OWL ontologies describing (usually non-tree-like) structured ob-
jects typically have models that do not reflect the actual structure of the modeled
objects. This technical problem has severe consequences for OWL users [2]. In
search of the “correct” way of describing structured objects, modelers often cre-
ate overly complex ontologies; however, since the required expressive power is
actually missing, these ontologies do not entail the consequences that would fol-
low if the objects were described accurately. Furthermore, the complexity of the
ontologies can cause significant performance problems during reasoning.
    To address this lack of expressivity, we propose to extend OWL with de-
scription graphs, which can be understood as schema-level descriptions of struc-
tured objects. Furthermore, to allow the representation of conditional statements
about structured objects, we also extend OWL with first-order rules [3]. For
example, we can represent the structure of the hand and its parts using descrip-
tion graphs, and we can represent statements such as “if a bone in the hand is
fractured, then the hand is fractured as well” using rules. Finally, we can use
standard OWL-style modeling to represent nonstructural aspects of the domain,
such as “a medical doctor is a person with an MD degree.”
    We thus obtain a powerful knowledge representation formalism that addresses
the expressivity limitations of OWL, but that is, unfortunately, undecidable. It
is widely recognized that reasoning algorithms are more likely to be effective in
practice if the underlying logics are decidable. Therefore, we have analyzed the
main causes for undecidability and have investigated restrictions under which
the formalism becomes decidable.
    In particular, we have observed that structured objects can often be described
by a possibly large, yet bounded number of parts. For example, a human body
consists of a certain number of organs, all of which can be decomposed into
smaller parts; further decomposition, however, will eventually reach the parts
that the modeler cannot or does not want to describe. For example, FMA de-
scribes the skeleton of the hand, but it does not describe the structure of the
distal phalanges of the fingers. The number of parts needed to describe the hand
is thus determined by the granularity of the hierarchical decomposition of the
hand. This decomposition naturally defines an acyclic hierarchy of description
graphs. For example, the fingers will be described by description graphs that
are subordinate to that of the hand; furthermore, the description graph for the
hand is not naturally subordinate to the description graphs for the fingers. We
use this observation to define a particular acyclicity restriction on description
graphs. Roughly speaking, it allows an instance of a description graph up the
hierarchy to imply existence of an instance of a description graph lower in the
hierarchy, but not vice versa. Provided that the OWL TBox is empty, acyclicity
bounds the number of parts that one needs to reason with, which can be used
to obtain a decision procedure for the basic reasoning problems.
    If the OWL TBox is not empty, the acyclicity condition alone does not ensure
decidability due to an interaction between OWL axioms, graphs, and rules [4].
To obtain decidability, we limit their interaction by imposing an additional role
separation condition. In particular, we separate the roles that can be used in
OWL axioms from the roles that can be used in the rules; furthermore, depending
on the expressivity of the used fragment of OWL, we may additionally require
that the OWL axioms do not refer to the roles used in the description graphs.
    This paper summarizes the results published in several recent papers [5, 2].
For the sake of brevity, we omit the proofs and certain technical details, which
can be found in [5, 2]. Furthermore, we assume the reader to be familiar with
OWL and the basics of description logics (DLs).


2   Problems with Modeling Complex Structures

To understand the limitations of modeling structured objects in DLs (and hence
in OWL), we consider the problem of modeling the skeleton of the human hand
(see Figure 1a). The carpal bones form the base of the hand. The central part
contains the metacarpal bones, one leading to each finger. The fingers consist of
phalanges: the proximal phalanges are connected to the metacarpal bones, and
all fingers apart from the thumb contain a middle phalanx between the proximal
and the distal phalanx. This structure can be conceptualized as in Figures 1b–1e.
     Figures 1b–1e could be represented in DLs using an ABox A. ABox asser-
tions, however, represent concrete data; thus, A would represent the structure of
one particular hand. In this paper, we are concerned with modeling structured
objects at the schema level —that is, we want to describe the general structure
of all hands. and instantiate such a description many times. For example, if we
say that each patient has a hand, then, for each concrete patient, we should
instantiate a different hand, each of the structure shown in Figures 1b–1e. This
cannot be achieved using ABox assertions.
    We can give a logical, schema-level interpretation to Figures 1b–1e by treat-
ing vertices as concepts and arrows as participation constraints specifying their
relationships. For example, Hand and Index finger are concepts and the arrow
between them says that the index is a part of the hand.3 Participation constraints
are represented in ontologies using DL axioms such as (1)–(5).
    Let K be a DL knowledge base containing the following axioms, in which, for
the sake of brevity, we omit the suffix of index finger .
(1)                             Index finger ⊑ ∃part .Distal phalanx
(2)                            Index finger ⊑ ∃part .Middle phalanx
(3)                Distal phalanx ⊑ ∃attached to.Middle phalanx
(4)              Middle phalanx ⊑ ∃attached to.Proximal phalanx
(5)                        Proximal phalanx ⊑ ∃part − .Index finger
(6)                                             Sym(attached to)
    Let I be an interpretation corresponding to Figure 1e in the obvious way.
Clearly, I satisfies K, which justifies the formalization of Figure 1e using K.
Unfortunately, the ontology K is underconstrained: some models of K do not
correspond to the actual structure of the index finger from Figure 1e. Axioms
(2) and (4) imply the existence of two middle phalanges of the index finger,
but K does not state that these two middle phalanges must be the same object.
Thus, the interpretation I ′ corresponding to Figure 2 is also a model of K.
    This discrepancy prevents us from drawing conclusions that rely on the non-
tree connections in the structure; for example, if the index finger has a broken
distal phalanx, then we should conclude that the phalanx adjacent to the middle
phalanx is broken (since this is the same broken phalanx). Furthermore, it can
also cause problems with the performance of reasoning. For example, we might
use axioms (2)–(6) to describe the relationships between the index finger, its
proximal phalanx and its middle phalanx.
    The axioms in K do not state that the index finger in (5) is a part of the
“initial” index finger, so the interpretation I ′ contains an infinite tree of index
fingers. In fact, this model is “canonical” in the sense that it reflects the least
amount of information derivable from the axioms. In order to disprove an entail-
ment, a DL reasoner will try to construct such a “canonical” model. In practice,
these models can be highly repetitive and much larger than the intended ones,
which, according to our experience, is the main reason why DL reasoners cannot
process ontologies such as FMA and certain versions of GALEN.
    These problems could be addressed by ensuring that all models of K resemble
as much as possible the intended conceptualization shown in Figures 1b–1e. DL
3
    The role attached to is symmetric, so we do not orient the edges labeled with it.
(a) Anatomy of the Hand                       (b) Hand (Ghand )




(c) Finger (Gfinger )   (d) Thumb (Gthumb )         (e) Index (Gindex finger )

            Fig. 1: The Anatomy of the Hand and its Conceptual Models


languages, however, exhibit (a variant of) the tree model property [1]: whenever
a DL knowledge base K has a model, it has a model of a certain tree shape.
This is a consequence of the form of axioms allowed in OWL, and is generally
considered desirable because it ensures decidability of reasoning. At the same
time, however, it also means that we must leave the confines of DLs and OWL
if we want to faithfully represent structured objects.


3   The Formalism
We now present our formalism. We first introduce description graphs.
Definition 1 (Description Graph). An ℓ-ary description graph is a directed
labeled graph G = (V, E, λ, M ) with V = {1, . . . , ℓ} a set of vertices, E ⊆ V × V
a set of edges, and λ a labeling function that assigns a set of atomic concepts
or the negation of atomic concepts λhii to each vertex i ∈ V and a set of atomic
roles λhi, ji ⊆ NR to each edge hi, ji ∈ E. Finally, M ⊆ NC is a set of main
concepts for G. For A an atomic concept, VA is the set of vertices that contain
A in their label; that is, VA = {k ∈ V | A ∈ λhki}.
   Thus, description graphs are labeled graphs where the nodes are labeled with
concepts and the edges with roles. The main concepts indicate the objects whose
         Middle phalanx           Proximal phalanx     Index finger

                             attached to       part−
               part



Index finger


               part
                             attached to     attached to          part−

          Distal phalanx           Middle phalanx Proximal phalanx        Index finger


                           Fig. 2: Unintended Infinite Model I ′


structure is defined by the graphs. For example, the main concepts for the graph
in Figure 1b (framed with rounded rectangles) are Hand and Palm, meaning
that this graph defines the structure of the hand and the palm. Intuitively, an
instance of a main concept implies the existence of a graph instance.
Definition 2 (Rules). Let NI and NV be disjoint sets of individuals and vari-
ables. An atom is of the form C(s), R(s, t), or s ≈ t, for s, t ∈ NI ∪ NV , C a
concept, and R a role. A rule is an expression of the form

(7)                         U1 ∧ . . . ∧ Um → V1 ∨ . . . ∨ Vn

where Ui and Vj are atoms, m ≥ 0, and n ≥ 0. W.l.o.g we assume that the
body never contains ≈. The conjunction U1 ∧ . . . ∧ Um is called the body, and
the disjunction V1 ∨ . . . ∨ Vn is called the head. Variables x and y are directly
connected in a rule r if they both occur in a body atom of r, and connected is
the transitive closure of directly connected. A rule r is connected if each pair of
variables x and y occurring in r is connected in r.
    A graph rule is a rule of the form (7) where all concepts and roles in atoms
are atomic, and that can also contain graph atoms of the form G(t1 , . . . , tk ), for
G an ℓ-ary description graph and ti ∈ NI ∪ NV .
   Next, we introduce graph specializations, which allow us to represent objects
at different levels of abstraction. For example, we would like to describe the
abstract structure common to all fingers as shown in Figure 1c; then, we should
be able to specialize this structure for the index finger and introduce the middle
phalanx, as in Figure 1e. The graph specialization Gfinger ⊳ Gthumb states that
the graph for the thumb specializes the graph for a finger.
Definition 3 (Graph Specialization). A graph specialization is an axiom of
the form G1 ⊳ G2 , where G1 = (V1 , E1 , λ1 , M1 ) and G2 = (V2 , E2 , λ2 , M2 ) are
description graphs with V1 ⊆ V2 .
   Next, we introduce axioms that allow us to properly connect graph instances.
For example, Ghand contains the vertices 3 and 4 for the thumb and its proximal
phalanx, which correspond to the vertices 1 and 3 of Gthumb . We can specify this
correspondence using a graph alignment of the form Ghand [3, 4] ↔ Gthumb [1, 3].
Intuitively, this ensures that it is not possible for Ghand and Gthumb to share the
thumb without sharing the proximal phalanx as well.
Definition 4 (Graph Alignment). A graph alignment is an expression of the
form G1 [v1 , . . . , vn ] ↔ G2 [w1 , . . . wn ], where G1 and G2 are description graphs
with sets of vertices V1 and V2 , respectively, vi ∈ V1 and wi ∈ V2 for 1 ≤ i ≤ n.
   Finally, we define GBoxes and graph-extended KBs.
Definition 5 (Formalism). A graph box (GBox) is a tuple G = (GG , GS , GA )
where GG , GS , and GA are finite sets of description graphs, graph specializations
over GG , and graph alignments over GG . ABoxes are extended to allow for graph
assertions of the form G(a1 , . . . , aℓ ) for G an ℓ-ary graph. A graph-extended
knowledge base is a 4-tuple K = (T , P, G, A) where T is a TBox, P is a program
with a finite number of connected rules, G is a GBox, and A is an ABox.
   Next, we define the semantics of the formalism.
Definition 6 (Semantics). An interpretation I = (△I , ·I ) is defined as usual,
and it interprets each ℓ-ary description graph G as an ℓ-ary relation over △I ; that
is, GI ⊆ (△I )ℓ . A graph assertion is satisfied in I, written I |= G(a1 , . . . , aℓ ), iff
haI1 , . . . , aIℓ i ∈ GI . Satisfaction of a description graph, graph specialization, and
graph alignment in I is defined in Table 1, and satisfaction of T , P, and A in I
is defined as usual. A knowledge base K = (T , P, G, A) is satisfied in I, written
I |= K, if all its components are satisfied in I.
    Thus, each ℓ-ary graph G is interpreted as an ℓ-ary relation GI in which
each tuple corresponds to an instance of G in the interpretation. The key and
disjointness properties in Table 1 ensure that no two distinct instances of G can
share a vertex; for example, no two distinct instances of Ghand can share the
vertex for the thumb. These properties ensure that no model I contains infinite
“chains” of instances of a description graph, which reflects the intuition that
each description graph instance represents a bounded and isolated part of the
domain. The start property in Table 1 ensures that each instance of a main
concept A of G occurs in an instance of G. For example, since Hand is a main
concept for Ghand , each instance of Hand must occur as vertex 1 in an instance
of Ghand .
    Graph specializations are interpreted as inclusions over the graph relations;
for example, Gfinger ⊳ Gindex finger means that each instance of an index finger
is also an instance of a finger. The two graphs share all the vertices of the more
general graph, and the more specific graph can introduce additional vertices.
Finally, graph alignments state that, whenever two graphs share some vertex
from the specified list, then they share all other vertices from the list as well.
For example, the alignment Ghand [3, 4] ↔ Gthumb [1, 3] states that, if instances
of Ghand and Gthumb share vertices 3 and 1, respectively, then they must also
share vertices 4 and 3, respectively.
                Table 1: Satisfaction of GBox Elements in an Interpretation

I |= G for G = (V, E, λ, M ) iff
Key property:
∀x1 , . . . , xℓ , y1 , . . . , yℓ ∈ △I :             W                         V
  hx1 , . . . , xℓ i ∈ GI ∧ hy1 , . . . , yℓ i ∈ GI ∧   x i = yi →                    x j = yj
                                                        1≤i≤ℓ                 1≤j≤ℓ
Disjointness property:
                                                                                                  V
∀x1 , . . . , xℓ , y1 , . . . , yℓ ∈ △I : hx1 , . . . , xℓ i ∈ GI ∧ hy1 , . . . , yℓ i ∈ GI →             xi 6= yj
                                                                                                1≤i