Proceedings of the 6th Workshop on Formal and Cognitive Reasoning


 Towards Model Transformation in Description
Logics – Investigating the Case of Transductions

                       Willi Hieke and Anni-Yasmin Turhan ?

                           TU Dresden, Dresden, Germany
                      Institute of Theoretical Computer Science
                 {willi.hieke, anniyasmin-turhan}@tu-dresden.de


        Abstract. Models for Description Logic (DL) ontologies can be used
        for explanation purposes, but the models computed by standard DL rea-
        soner systems are not necessarily suitable for this task. In this paper we
        investigate the general task of transforming an arbitrary model into a
        target model that admits the desired property. To this end, we introduce
        a general framework for model transformation that abstracts from the
        DL in use, the property of the target model and the transformation for-
        malism. We consider instantiations of the framework for the DLs ALC
        and EL and use transductions as transformation formalism.

        Keywords: Description logics · Model transformation · Transduction.


1     Introduction

Description Logics (DLs) are a family of knowledge representation languages that
are widely used in ontology-based applications. Most DLs are fragments of first-
order logic (FOL) that have decidable reasoning problems. Ontologies usually
capture definitions for terminologies from an application domain by concept
axioms. These axioms can refer to other concepts or roles, which are unary and
binary predicates, respectively.
    DL ontologies can easily grow very large and complex and thus, a conse-
quence computed by a DL reasoner is not necessarily obvious for a knowledge
engineer and automated ontology services to explain the results of reasoning are
needed. Such a service is the computation of justifications and was intensively
investigated in recent years, see [10] for a recent overview. Using justifications
and proofs for explanation identifies parts of an ontology that is “responsible”
for a consequence, whereas the approach for explanation presented in this paper
uses models. For instance, in the life-cycle of an ontology, the developer team
might change, so to acquire the meaning of concepts and to grasp the overall
structure of an ontology might not be an easy task for a new knowledge engineer.
What could be helpful is a tool for the computation of typical instances that are
easy to comprehend, but still carry the information from the ontology.
?
    This work has been supported by DFG in the Research Training Group QuantLA
    (GRK 1763).


    Copyright c 2020 for this paper by its authors. Use permitted under
    Creative Commons License Attribution 4.0 International (CC BY 4.0).

                                           69
Towards Model Transformation in Description Logics – Investigating the Case
                           of Transductions


    For a range of DLs there are highly-optimized reasoner systems available
such as FaCT++ [14], HermiT [8], or ELK [9]. DL reasoners decide satisfiability
of an ontology (or a concept) by computing a model of it. However, using such
a model for explanation purposes may not be appropriate, since it is simply the
first model found according to the reasoner’s optimization strategies and may
be artificial and counter-intuitive. It is a natural idea to transform a model such
that it would be better suited for explanation. Explorations of this idea in the
context of DLs are scarce. In the work of Bauer et al. [3], where the internals
of the FaCT++ reasoner [14] have been modified to guide the search of a model
s.t. a model (more) suitable for explanation is obtained. As changing the inter-
nals of a highly-optimized reasoner can easily corrupt the implementation, the
approach where reasoner generated models are transformed is clearly preferable.
In [1] the authors adopt this approach and suggest visualizations of concepts by
(classes of) models, where information “not relevant to” the user’s view is fil-
tered out. The general task of model transformation has been investigated in the
more general setting of FOL formulae. For instance, computing minimal models
has been investigated in [4]. However, as for FOL reasoning is not decidable and
transformations for these models need to cater for n-ary relations, these meth-
ods need not be suitable for the DL case. Similarly, there is a lot of work on
model transformation for propositional formulae, but these lack the relational
structures that DL models have and therefore these methods are not applicable.
    As classical DLs use only unary predicates and binary relations, their models
are node and edge labeled graphs and can be easily be depicted. For a graphical
representation of a concept instance, it could be advantageous, if the model was
for instance small, or tree-like, or planar, and so on. We consider these properties
here—not so much to argue for their cognitive adequateness for explanation, but
rather to illustrate by their use that our approach can handle properties that
can reasonably be required for these purposes. We address the task of model
transformation by introducing a framework, in which the intended property to
be satisfied by the model is kept variable. This allows for different properties de-
pending on the application. We introduce a general framework for transforming
arbitrary models of ontologies. This framework is subsequently used to formalize
reasoning problems and to investigate a prominent one in depth in this paper.
    As the formalisms for transformations, we use transductions (as defined in
[7]) in this paper. Transductions specify mappings on relational structures using
logical formulae with free variables. Transductions are a promising tool for trans-
formations as they admit the expressivity of monadic second order logic in their
formulae. However, in principle our framework admits the use of other graph
transformation formalisms, as for instance Graph Rewriting Systems (GRS) [11].
    As a concrete instantiation of our framework, we investigate the case of trans-
forming arbitrary models of ontologies written in the DL called ALC into a
tree(-like) models by transductions. We construct a family of transductions over
interpretations and show that these transformations are model preserving, i.e. in-
put and output model satisfy the same sentences in the DL ALC. We also briefly
consider a lightweight DL that is of limited expressiveness, but has good com-


                                        70
Towards Model Transformation in Description Logics – Investigating the Case
                           of Transductions


putational properties and has the canonical model property (sometimes called
universal model). The canonical model would allow for imposing assumptions
on the model that is to be transformed.
    This paper is structured as follows: in Section 2, we define the basic notions
of DL and transductions. We define the model transformation framework and
dedicated reasoning problems for it in Section 3 and consider the case of trans-
ductions to tree-like models in Section 4 as an instantiation of our framework
for ALC knowledge bases. We also discuss aspects of model transformation for
EL. Last, we draw conclusions and lay out future work in Section 5.


2     Preliminaries

As this paper mainly combines notions from Description Logics and graph trans-
ductions, we briefly introduce the main notions from these fields. For detailed
introductions, see [2] or [7], respectively.


2.1   Description Logics

The main building blocks for DL knowledge bases are concepts, which describe
unary predicates from the modeled domain. We use sets for concept names NC
and for role names NR . Let A ∈ NC and r ∈ NR , then complex ALC concepts can
be built according to the following grammar:

                        C := A | > | ¬C | C u C | ∃r.C .

Commonly used abbreviations are: C1 t C2 := ¬(¬C1 u ¬C2 ), ⊥ := ¬>, and
∀r.C := ¬(∃r.¬C). The semantics is defined as in first-order logic by means of
interpretations, which are a tuple I = (∆I , ·I ), consisting of the domain ∆I
and a function ·I that maps elements from NC to subsets of the domain and
elements from NR to subsets over ∆I × ∆I . The mapping ·I is extended to
(complex) ALC concept descriptions as follows: >I = ∆I , ¬C I = ∆I \ C I ,
(C1 u C2 )I = C1I u C2I and (∃r.C)I = {d ∈ ∆I | ∃e.(d, e) ∈ rI and e ∈ C I }.
The fragment of ALC that only uses the top-concept >, conjunction (u), and
existential restrictions (∃r.C) as concept constructors is called EL. Note that EL
cannot express inconsistencies.
    Terminological knowledge about the application domain can be modeled by
using (complex) concept descriptions. Let C and D be concepts, a general concept
inclusion (GCI) is a statement of the form: C v D. A TBox T is a finite
set of GCIs. An interpretation I satisfies a GCI C v D, if C I ⊆ DI . An
interpretation I is a model of a TBox (I |= T ) if and only if I satisfies C v D
for all C v D ∈ T . A knowledge base K is a pair of a TBox T and an ABox
A. The latter can express knowledge about objects by the use of constants. We
concentrate on terminological knowledge and do not introduce ABoxes formally.
We assume A to be empty in K. We denote by Σ(I) the signature of I, i.e.,
the set of concept and role names occurring in I; and for concepts C and GCIs,


                                       71
Towards Model Transformation in Description Logics – Investigating the Case
                           of Transductions


respectively. We write ΣA := Σ ∩NC and Σr := Σ ∩NR and we assume signatures
to be finite.
    Typical reasoning problems for DLs are to decide satisfiability and subsump-
tion. Satisfiability checks for the existence of a model for a given concept or TBox
and subsumption checks for super-concept relationships between two given con-
cepts w.r.t. a TBox and is formally defined as: D subsumes C w.r.t. TBox T
(written C vT D) iff in all models I of T holds: C I ⊆ DI .
    DLs differ in their expressivity and this often shows in the complexity for
reasoning. For ALC, testing subsumption of concepts w.r.t. a non-empty TBox
is ExpTime-complete [12] and w.r.t. for an empty one it is PSpace-complete
[13]. In EL deciding subsumption can already be done in polynomial time [6].
The expressivity of a DL can be captured by invariance results. Let L be a
logic and ./ be a relation between two pointed interpretations, which are an
interpretation together with a dedicated element form its domain. An invariance
result for L then says, that for all pointed interpretations (I, d) and (J , e) with
(I, d) ./ (J , e) and an L-formula φ: I |= φ(d) holds iff J |= φ(e) holds. In case
of ALC such an invariance result holds for bisimulations [5], while for EL it holds
for homomorphisms.

2.2   Transductions
We now define transductions in regard to DL interpretations. Observe that such
an interpretation is effectively a directed graph with labeled vertices (indicating
concept membership) and labeled edges (indicating membership to a role). In-
tuitively in a transduction, vertices and edges from one interpretation are being
copied into a new interpretation if they satisfy a condition given by a logical
formula. Depending on the fulfillment of the conditions, the new copy can be
modified. If the conditions are formulated in monadic second-order logic (MSO)
or FOL, we call the transduction a MSO- or FOL-transduction, respectively.
    Since typical DL interpretations have at most binary predicates, we restrict
the standard definition of transductions of relational structures from [7] to the
case of at most binary relations. We denote the set of L-formulae (e.g. L ∈
{MSO, FOL}) over Σ, with free variables in X, by L(Σ, X). The set of all Σ-
interpretations is denoted by I(Σ).
    Let Σ and Σ 0 for the remainder of this paper be binary signatures. A trans-
duction τ of type from Σ to Σ 0 is a binary relation on Σ-interpretations or
formally, τ ⊆ I(Σ) × I(Σ 0 ). We call the input of a transduction source inter-
pretation and its output target interpretation. The logical formulae that capture
the transformation conditions and thereby induce a particular transduction are
comprised in a tuple called definition scheme.
Definition 1 (Definition scheme). Let Par be a finite set of variables called
parameters. Let AuxVar be a finite set of variables called auxiliary variables s.t.
Par ∩ AuxVar = ∅, let x, y ∈ AuxVar, and L be a logic. A definition scheme of
type from Σ to Σ 0 with set of parameters Par is a tuple of formulae of the form
                D = (χ, (δi )i∈[k] , (θω )ω∈(ΣA0 ×[k]) , (ηω )ω∈(Σr0 ×[k]2 ) )


                                            72
Towards Model Transformation in Description Logics – Investigating the Case
                           of Transductions


for some k ∈ N+ (we call positive integers N+ index set and abbreviate {1, . . . , k}
with [k]), where
  – χ ∈ L(Σ, Par) is the precondition;
  – δi ∈ L(Σ, Par ∪ {x}) for each i ∈ [k] are domain formulae;
  – θω ∈ L(Σ, Par ∪ {x}) for each ω ∈ ΣA0 × [k], and are concept formulae; and
  – ηω ∈ L(Σ, Par ∪ {x, y}) for each ω ∈ Σr0 × [k]2 are the role formulae.
A definition scheme D is an L-definition scheme, if all its formulae are written
in some logic L. If Par = ∅ for D, then D is called parameterless.
Please note that a definition scheme is also called a graph transducer in [15].
Intuitively, any input interpretation has to satisfy the precondition χ first, before
the domain formulae δi and the relation formulae θω and ηω are being evaluated
on the input interpretation. The domain formulae then select the elements from
the input interpretation that satisfy δi . By using not only one but many domain
formulae, elements can be copied more than once into target interpretation.
Here, k is the upper bound on the number of copies of each domain element from
the source interpretation. Consequently, the domain of the target interpretation
consists of at most k distinct copies of the input domain. The relation formulae
θω and ηω then state the conditions under which a concept name or a role name
is added to the copied elements.
    Next, we define how a target interpretation is induced by a definition scheme
from a source interpretation I. If applied to sets of interpretations and parameter
assignments, this method then gives rise to a transduction.
Definition 2 (Transduction induced by a definition scheme). Let I =
(∆I , ·I ) be a Σ-interpretation, λPar : Par → ∆I , be a parameter assignment, and

                 D = (χ, (δi )i∈[k] , (θω )ω∈(ΣA ×[k]) , (ηω )ω∈(Σr ×[k]2 ) )

a definition scheme. Let λ be the assignment that extends λPar s.t. λ(x) = a and
(I, λPar ) |= δi (a) stand for (I, λ) |= δi (and extending it accordingly for the θi
and ηi ). Then I 0 is the Σ 0 -interpretation defined by D from (I, λPar ) iff:
  – (I, λPar ) |= χ,
         0
  – ∆I := {(a, i) ∈ ∆I × [k] | (I, λPar ) |= δi (a)},
        0                0
  – AI := {(a, i) ∈ ∆I | (I, λPar ) |= θω (a)} for all A ∈ ΣA (I), and
      0                                  0      0
  – rI := {((a1 , i1 ), (a2 , i2 )) ∈ (∆I × ∆I ) | (I, λPar ) |= ηω (a1 , a2 )} for all
    r ∈ Σr (I).
As the interpretation I 0 is uniquely determined by D, I, and λ iff (I, λ) |= χ,
we obtain the transduction τD induced by D and use the function τD (I, λ) = I 0
directly.
Note that transductions, as in [7], are defined over finite structures and we
restrict ourselves to finite interpretations accordingly. This does not pose a re-
striction for our goal to transform models that are computed by a DL reasoner
as these are finite.


                                             73
Towards Model Transformation in Description Logics – Investigating the Case
                           of Transductions


3    A General Framework for Model Transformation
To be able to investigate model transformations in a wider setting in a structured
and well-defined way, we define a general framework that abstracts from the
concrete property to be achieved for the target interpretation, the knowledge
base, and even the logics employed. The parameters of the framework are, the
knowledge base K under consideration together with an interpretation I over
the same signature. We write LK for the logic in which the knowledge base is
formulated. The desired property is a sentence ρ of some logic Lρ . We sometimes
write ρ-model for a model that satisfies property ρ. Another parameter is the
actual transformation mapping. As we concentrate on transductions, we use τ
to transform arbitrary source models of K to desired ρ-models.
Definition 3 (Model transformation framework). Let LK be a description
logic, K an LK -knowledge base, I an interpretation over the signature Σ(K), ρ
a sentence formulated in some logic Lρ and τ : I(Σ(K)) → I(Σ(K)) a trans-
formation mapping over interpretations of the binary signature Σ(K). We call
S = (I, K) the source pair and T = (ρ, τ ) the transformation pair.
We use transductions to describe model transformations and restrict ourselves
to empty ABoxes and finite interpretations. In principle, the framework would
also allow for other kinds of transformations or even structures of higher arity.
The main reasoning tasks using the framework is to decide if a transformation
pair applied to a source pair is successful.
Definition 4 (Successfulness problems). Let S = (I, K) be a source pair
and T = (ρ, τ ) be a transformation pair. Let LK be a DL. A pair (S, T ), is called
successful, denoted by successful(S, T ), iff I |= K implies τ (I) |= K ∪ {ρ}.
  – The successfulness problem for S and T is to decide for a given (S, T ) with
    S = (I, K) and T = (ρ, τ ), whether (S, T ) is successful.
  – The successfulness problem for a DL and a transformation pair is to decide
    for a given DL LK and T = (ρ, τ ), whether for every S = (I, K), where K
    is expressed in LK , the pair (S, T ) is successful.
Please note that, in case I 6|= K, the success of a pair (S, T ) is trivially given.
Furthermore, K and ρ can be inconsistent, meaning that there is simply no model
for K ∪ ρ. Thus the successfulness property does not hold trivially in every case.
    In principle the framework can easily give rise to other reasoning problems.
An interesting question that generalizes the successfulness problem is whether
for a given DL LK and a given property ρ, a LK -KB has a ρ-model.
Definition 5 (ρ-model existence problems). Given a DL LK and a property
ρ formulated in a logic Lρ .
  – The ρ-model existence problem for K decides whether there exists a ρ-model
    for a given KB K formulated in LK .
  – The ρ-model existence problem for LK decides whether there exists a ρ-model
    for every consistent KB K formulated in LK .


                                        74
Towards Model Transformation in Description Logics – Investigating the Case
                           of Transductions


To answer these question may necessitate to compare the expressiveness of LK
and the logic Lρ that ρ is formulated in. To decide the ρ-model existence problem
for a given KB K where Lρ is not more expressive than LK , the condition to
fulfill the property ρ can, in principle be answered by employing the DL reasoner
for LK . For some cases, the ρ-model existence problem for a DL LK is already
answered. For instance, in case of the existence of tree-shaped models, these do
always exist for ALC concepts defined w.r.t. a TBox, because ALC has the tree
model property, i.e. any ALC concept satisfiable w.r.t. a TBox also has a model
that is tree-shaped.
    We focus in this paper on the successfulness problem for a DL and a trans-
formation pair, i.e. we want to obtain general results of the form: given a trans-
formation pair T , for all source pairs S of a given DL LK , successful(S, T ) holds.
This is the case if τ is a model preserving relation w.r.t. a logic LK . For instance,
by the virtue of invariance results it holds that, if source and target interpre-
tation of τ are proven to be globally (ALC)-bisimilar, every (ALC) knowledge
base satisfied by the source interpretation is also satisfied by the target one.
    Ultimately, it would be desirable to not only decide successfulness, but to
compute the actual ρ-model for a source pair. Given a particular ρ, it is not
obvious how to obtain a suitable definition scheme. For instance, if ρ requires
the model to have no isolated vertices, the scheme could either connect or delete
those vertices from the input model. So it is not trivial to derive the formulae for
the definition scheme merely from the property ρ for the target model. We are
interested in the investigation of the relation of ρ to the formulae of a definition
scheme for future work.


4    A Transduction for Model Unraveling
We introduce a family of transductions (τ`-Tree )`∈N as transformation formalism
for source pairs S = (I, K), where K is an ALC knowledge base (with an empty
ABox). The goal is to achieve a tree(-like) model which is defined as follows.
Definition 6 (Property `-tree-like). A pointed interpretation (I, d) satisfies
the property `-tree-like iff it is one connected component and a partitioning of
the edges into 3 sets s.t.
  – d has no predecessor w.r.t. the first partition of the edges and for every
    element that is reachable from d by a non-empty path of length < ` by edges
    of that partition there is only one predecessor,
  – each element e that is reachable from d by a path of length ` is a node in a
    directed graph Ge , whose edges belong exclusively to the second partition and
    each directed graph Ge has at most one node reachable from d by a path of
    length `
  – edges in the third partition exclusively have their first component in the di-
    rected graph Ge and their second component in the path from d to e.
Intuitively, we want to unravel arbitrary models for ALC KBs into a tree of (user
definable) depth `. The DL ALC admits the tree model property as well as the


                                         75
Towards Model Transformation in Description Logics – Investigating the Case
                           of Transductions


finite model property, but need not have finite tree models. Hence, some leaves
of the tree might need to re-use elements as successors, since the transformation
must be model preserving. The idea is to introduce back-loops to the branches of
the tree to the most recent suitable ancestor. The resulting model is `-tree-like.
As an intuitive example consider the TBox {A v ∃r.A} and the interpretation
I = ({a}, {(a, a) ∈ r}). Our transformation would create an ` long chain of
copies of a connected by an r-edge and needs to loop back to some element.
    The idea for the unraveling transduction is the following. The “root” element
d is copied into the domain of the target interpretation I 0 and is the root of the
tree-like interpretation. As in the standard tree-unraveling defined for interpre-
tations (see [2]), each element on a path up to length ` in I from the given
unraveling element d, is copied into the domain of the target interpretation I 0
as often as it is occurring on such a path. These nodes are the “inner nodes”
of the tree-like structure. Elements that occur in I at a greater distance to d
than `, are copied only once into the target domain. These are located at the
“end” of the tree-structure and are called (complex) leaves—despite the fact
that they can have an arbitrary relational neighborhood. There are three kinds
of edges: those edges called branches that connect the nodes being the root or
inner nodes in the tree-structure, those edges called thickets that belong to the
arbitrary structure within a complex leaf and those edges that point from a node
within a complex leaf back to the tree-structure called back-loops. If an element
of a complex leaf has an r-successor in I, that has already a copy on this branch,
the complex leaf is connected to, the transduction introduces a back-loop back
into this copy. More precisely, the back-loop is to the youngest ancestor copy on
the branch. This design choice creates short back-loops. The underlying idea of
the transduction is to identify each domain element by the shortest path from
the unraveling node over ∆I .
    We use the following auxiliary predicate to express reachability (confer [7])
defined by the MSO formula:
                                               _                         
       reach(x,y):=∀X x∈X ∧ ∀u,v:u∈X ∧(             r(u,v))⇒v∈X ⇒y∈X            (1)
                                                   r∈Σr

in the definition schema D`-Tree inducing the unraveling transduction.

Definition 7 (Unraveling transduction). Let I be an interpretation, d ∈
∆I called the unraveling node, ` ∈ N. Also, let the set of path words of up to
                                                    i
length ` in I be α(I, `) := {w = x1 · · · xi ∈ (∆I ) | 0 ≤ i ≤ `} and w = x1 · · · xn
and w0 = y1 · · · ym be two words. The definition scheme (with Par = {y} and
AuxVar = {u, v}) is
                                                                                         
  D`-Tree = χ, (δw )w∈α(I,`) , (θC,w )C∈ΣA ,w∈α(I,`) , (ηr,(w,w0 ) )r∈Σr ,(w,w0 )∈α(I,`)2 ,

where the components are defined as follows:

  – Precondition:                   χ(y) := ∀x : x 6= y → reach(x, y)                  (2)


                                           76
Towards Model Transformation in Description Logics – Investigating the Case
                           of Transductions


  – Domain formulae are defined by means of the following formulae:
                root
               δw    (y, u) := u = y                                                         (3)
                                 |w|                        
                                 ^     _
               inner
              δw     (y, u) :=                  r(xi , xi+1 ) ∧ (x1 = y) ∧ (x|w|+1 = u)      (4)
                                 i=1     r∈Σr
                                                                        |w|
                                                                       ^               
                leaf
               δw    (y, u) := δw0 (y, x|w| ) ∧ reach(x|w| , u) ∧             u 6= xi        (5)
                                                                        i=1
                             
                             
                              δ root (y, u)        , if |w| = 0
                              w
                 δw (y, u) := δw inner                                                       (6)
                                       (y, u)       , if |w| ≤ ` − 1
                             
                             
                              leaf                                           0
                               δw (y, u)            , if |w| = ` & w = w · x`

    The variable xi used in the formulae of D`-Tree needs to be mapped by λ to
    the according xi ∈ w.1
  – Concept name formulae:                   θA,w (u) := A(u)              (7)

  – Role name formulae: ηr,(w,w0 ) are defined by means of the following formulae:
             thicket
            ηr,(w,w  0 ) (u, v) := r(u, v)                                                   (8)
             branch                           0
           ηr,(w,w   0 ) (u, v) := r(u, v) ∧ w = w · u                                       (9)
                                             ^ m          ^n           
            back-loop
           ηr,(w,w  0 ) (u, v) := r(u, v) ∧       xi = yi ∧       xi 6= v                   (10)
                                              i=1           i=m+1
                               
                               
                               η thicket (u, v)         if |w| = |w0 | = `
                                r,(w,w0 )
                                  branch
                  ηr,(w,w0 ) := ηr,(w,w  0 ) (u, v)      if |w0 | = |w| + 1 & |w0 | < `     (11)
                               
                               
                               η back-loop (u, v)            0
                                                         if |w | < |w| & |w| = `
                                       r,(w,w0 )

The definition schema D`-Tree together with the assignment λPar : y 7→ d, then de-
fines the transduction relation τ` . Other (undefined) cases are set to the Boolean
constant ⊥.
Note that the index set is not the natural numbers, but the set of words over
∆I up to length `. This is w.l.o.g. as there always is an order on that index set
isomorphic to the natural numbers, e.g. the lexicographic order. According to
Definition 2, every transduction induced by a definition scheme can only increase
the domain size linearly. Using α(I, `) as index set, however, the increase of the
                                                            P`
domain size is limited to |∆I | · |α(I, `)|, and |α(I, `)| = k=1 |∆I |k , which is
not linear but still polynomial.
 1
     This is an element in ∆I by definition, i.e. the transduction uses more parameters
     than just the unraveling node d. For eased readability we use xi for both the variable
     in the formulae and the domain elements of I since we identify them with λ.


                                                   77
Towards Model Transformation in Description Logics – Investigating the Case
                           of Transductions


                                                        a     Caterpillar


                                                   m
                                                for
                                             lt-

                                           ing
                                          du


                                                                ea
                                                                       has-offspring


                                         pr


                                                                   t
                               s-a


                                                                   s
                                      ffs
                             ha

                                   s-o
                                     ha
                                                       eats
                 Butterfly     b                                            c   Plant


                             Fig. 1. Model Iex of TBox Tex .


   We before we investigate formal properties of the unravelling transduction,
we illustrate the unraveling of a model of TK by an example in the realm of biol-
ogy. Consider the following TBox that speaks about caterpillars, butterflies and
plants. It states that all butterflies eat plants and have caterpillars as offspring.
Also, every caterpillar eats plants and has a butterfly as an adult form. Lastly,
every plant has a plant as offspring.

        Tex := { Butterfly v (∃has-offspring.Caterpillar) u (∃eats.Plant),
                 Caterpillar v (∃has-adult-form.Butterfly) u (∃eats.Plant),
                 Plant v ∃ has-offspring.Plant }

Figure 1 depicts a model Iex for the TBox Tex using only three domain elements.
Concept names are written next to the nodes and the labeled arrows depict roles.
Such a succinct model could be the outcome of the reasoning process of reasoners
that aim at keeping the domain of the model small by eagerly reusing domain
elements and blocking the introduction of new elements possible—a strategy
known as anywhere blocking that is implemented by the HermiT reasoner [8].
    Model Iex , however, has some shortcomings when it comes to explaining
the concepts under consideration. For instance, the offspring of a butterfly is
certainly not the caterpillar that this butterfly developed from. Also, a regular
plant not its own offspring. A model unraveling might help to resolve some of
these artifacts due to reasoner optimization.
    Figure 2 shows the 2-unraveling of IK . The parts of the unraveled model
in solid lines highlight the tree of depth 2 and the dotted lines and are being
added for the transformation to be model preserving. In the part of the model
drawn in solid lines, we now have that the offspring of a butterfly is no longer
the caterpillar it emerged from and also, plants have other (freshly introduced)
plants as offspring, too. For explanation purposes, users might only be interested
in the non-dotted part and an implementation could simply omit the parts drawn
in dotted line if the user permits.
    We show now that the unraveling transduction is successful for all ALC KBs.
To this end we employ bisimulations between source and target interpretations.
Lemma 1. The relation ' := {(x, (x0 , w)) | (x0 , w) ∈ ∆τ`-Tree (I) & x = x0 }) is a
bisimulation between I and τ`-Tree (I).


                                                       78
Towards Model Transformation in Description Logics – Investigating the Case
                           of Transductions


                                                      Caterpillar
                                                                               aε
                                                                        rm
                                                                 t   -fo                ea
                                                               ul                          t   s
                                                          ad
                                                        s-
                           Butterfly
                           rm                         ha                                                Plant
                                                ba                                                 ca
                         fo
                         t-


                                                                                                            ha
                      ul


                                          g                                  has-offspring                     s            has-offspring
                                     rin
                   ad


                                                                ea                                              -o
                                   sp                                                                             ffs
                   s-


                                                                     ts
                 ha


                                off                                                                                     pr
                            s -                                                                                            in
                                                                                                                                g
   Caterpillar           ha

                   aab                                                         cab   Plant                                          cac   Plant
                                              has-offspring
                                  ea
                                     ts


                                               caba        Plant


            Fig. 2. The 2-unraveling of model IK into a 2-tree-like structure.


                                                                                                        0
Proof. We denote τ`-Tree (I) by I 0 . Since x0 in (x0 , w) ∈ ∆I is an element of ∆I ,
we can state x = x0 as condition. Assume I |= χ, so that τ`-Tree (I) is defined.
We need to show three conditions for this claim.
                                                              0
     First, x ' (x0 , w) implies x ∈ AI ⇔ (x0 , w) ∈ AI for all x ∈ ∆I , (x0 , w) ∈
  I0
∆ and A ∈ ΣA ; this condition follows directly from concept name formula (7).
     Second, x ' (x0 , w) and (x, y) ∈ rI implies the existence of (y 0 , w0 ) such that
                           0                                                                 0
((x , w), (y 0 , w0 )) ∈ rI and y ' (y 0 , w) for all x, y ∈ ∆I , (x0 , w), (y 0 , w0 ) ∈ ∆I
    0
                                           I                0
and all r ∈ Σr . Assume (x, y) ∈ r and x ' (x , w). Since I satisfies the
precondition (2), there is a path in I from the unraveling node d to x and hence
to y. Now, domain formulae (3) and (4) create a fresh copy for each element
ending in a d-path in I up to length `. If the path from d to x is of length at
most ` in I, we have to distinguish two cases, because an r-successor could have
been copied before in this path. If the r-successor has been copied before, domain
formula (4) is true for y and is hence copied into the domain of I 0 , i.e. there is
                     0
an (y, w) ∈ ∆I . If the r-successor y of x in I has not been copied yet (because
the d-path is longer than `), domain formula (5) is true for y and hence y is
copied into the domain of I 0 as well. To see that the role relationships are set
correctly, we consider two cases. The considered path from d to y over x in I
is of length up to ` or it is not. Role formula (9) connects two nodes (x, w) and
(y, w0 ) with an redge if r(x, y) holds in I and w = w · x. In the latter case, we
again distinguish two cases. If the r-successor y of x in I is already reachable
over a prefix of path of (x, w), then (10) adds the redge to an earlier copy of y
in I 0 . If there is no copy of a prefix of w, relation formula (8) connects (x0 , w0 )
and (y 0 , w0 ) whenever x is connected to y in I.


                                                                          79
Towards Model Transformation in Description Logics – Investigating the Case
                           of Transductions


                                                     0
     Third, x ' (x, w) and ((x0 , w), (y 0 , w0 )) ∈ rI implies the existence of y ∈ ∆I
                                                                               0
such that: (x, y) ∈ rI and y ' (y 0 , w0 ) for all x ∈ ∆I , x0 , y 0 ∈ ∆I and r ∈
Σr . Similarly to the previous argument, assume we are given (x ' (x0 , w) and
                0
(x0 , y 0 ) ∈ rI . Then, there is a path in I 0 from d over x to y, since y 0 is copied if
and only if there is such a path according to domain formulae (4) and (5). And
here as well as in the previous condition, an r-edge between (x, w) and (y, w) is
added if and only if there is an r-edge between x and y in I.

Lemma 2. Let the set C := {[(x, w)]≡ } of equivalence classes be defined by the
equivalence relation (x, w) ≡ (x0 , w0 ) iff x = x0 over ∆τ`-Tree (I) . Then C is a
partitioning of the domain of τ`-Tree (I).

Corollary 1. Let I be an interpretation and τ`-Tree the unraveling transduction
for some ` ∈ N. Then, if τ`-Tree (I) is defined, I and τ`-Tree (I) are globally bisim-
ilar.
Proof. Recall the relation ' from Lemma 1. We show that for each e ∈ ∆I , there
is an element e0 ∈ ∆τ`-Tree (I) , such that (I, e) ' (τ`-Tree (I), e0 ), and vice versa.
To achieve this, we use the partitioning C from Lemma 2. For each element in
the domain of I exists exactly one corresponding element in C. Following the
argument in Lemma 1 and since τ`-Tree (I) is defined, for each reachable element
from the unraveling node, there is at least one copy. Hence, the relation ' is a
bijection between I and I 0 and hence, we have that C is a partitioning. We also
have that for each representative (e, w) of [(e, w)]≡ , it is the case that e ' (e, w)
as defined and hence we have that ' is global. So, we have that for each elements
of the domains of I and I 0 , there is at least one bisimilar element in the other
one respectively.

Theorem 1. For all source pairs S = (I, T ), where T is an ALC-TBox and I
an interpretation holds: if τ`-Tree (I) is defined, then successful(S, T ).
Proof. Direct consequence of Corollary 1 and Definition 6.

   Note that the computational complexity of applying τ`-Tree to an input inter-
pretation is the sum of evaluating the definition scheme over this interpretation.
The amount of formulae to evaluate for τ`-Tree is polynomial in size of |∆I | as
mentioned before. To compute reachability (here expressed by the predicate
reach(x, y)), well-known PTime time algorithms exist.


Considerations on model transformation for EL. Since the DL EL does
not use negation, every EL-KB has a model. A GCI formulated in EL is satisfied
in one interpretation I is also satisfied in another interpretation I’ iff there is
a homomorphism h : I 7→ I 0 . This seems clear from the fact that homomor-
phisms are relation preserving mappings. Hence, if we instantiate our framework
such that LK = EL, we need to show that τ is a homomorphism in order to
guarantee successfulness for each EL TBox. As global bisimulation implies the
existence of a homomorphism, the τ`-Tree transduction preserves models of EL


                                           80
Towards Model Transformation in Description Logics – Investigating the Case
                           of Transductions


knowledge bases. An important difference to ALC is that EL has the canonical
model property, which means that there always exists a model that can homo-
morphically be embedded into any other model of the same EL-knowledge base.
Canonical models are the basis for reasoning in EL and they are computed by
all dedicated EL reasoners such as ELK [9]. Thus an interesting variant of our
framework with LK = EL is that by the EL-KB K also the source model I is de-
termined and a transduction can make much stronger assumptions on its input.
Whether this gives better computational properties for some properties ρ needs
to be investigated.


5    Conclusions and Future Work
We have defined and discussed a general framework for model transformation in
DL—motivated by the task to generate models of DL TBoxes that are suitable
for explanation. Based on this framework we have formally defined a collection
of reasoning problems. By means of results from model theory one can address
some of these reasoning problems. We have defined and investigated an `-tree
unraveling of (finite) interpretations as a concrete instantiation of the framework
and have shown that the corresponding transduction is model preserving with
respect to any TBox formulated in the logic ALC.
    In this paper we have mainly discussed the decision problems related to the
existence of ρ-models. For the application of generating models that facilitate
explanations, the corresponding computation problems are clearly of bigger in-
terest as they pave the way to automatic support. However, it is not obvious how
to use a property ρ to derive a suitable definition scheme for that property. In
practice it would probably suffice to offer a fixed catalog of properties to a user
in order to display a model with one or even several of the properties from the
catalog. Having a catalog of properties for which there are definition schemes and
results like Theorem 1 (with respect to a given logic), the transformations could
be implemented in a tool, in which the user specifies the source and transforma-
tions pairs. Populating such a catalog with suitable model properties should be
done in regard of results from the fields of cognitive sciences or visualization.
    There are many extensions of this initial work to be studied in future work.
An obvious extension of the work presented here, is to use ABoxes as well when
computing or transforming models. As ABoxes add named individuals to models,
transformations have to obey the unique name assumption, which says that
differently named objects have to be mapped to different domain elements in
interpretations. This assumption must not be violated by transformations.
    The framework is not tied to transductions and as an alternative transforma-
tion formalism, graph rewriting systems are a interesting option. A comparison
of graph rewriting systems and graph transductions in regard of treating models
derived from DL knowledge bases is future work, as well.


                                       81
Towards Model Transformation in Description Logics – Investigating the Case
                           of Transductions


Acknowledgments. We would like to thank Heiko Vogler and the anonymous
reviewers for their detailed and helpful remarks.


References
 1. do Amaral, F.N., Martins, C.B.: Visualization of description logic models. In: Pro-
    ceedings of the 21st International Workshop on Description Logics (DL 2008).
    CEUR Workshop Proceedings, vol. 353 (2008)
 2. Baader, F., Horrocks, I., Lutz, C., Sattler, U.: An Introduction to Description
    Logic. Cambridge University Press (2017)
 3. Bauer, J., Sattler, U., Parsia, B.: Explaining by example: Model exploration for
    ontology comprehension. In: Proceedings of the 22nd Int. Workshop on Description
    Logics (DL 2009). CEUR Workshop Proceedings, vol. 477 (2009)
 4. Baumgartner, P., Fuchs, A., De Nivelle, H., Tinelli, C.: Computing finite models
    by reduction to function-free clause logic. J. of Applied Logic 7(1), 58–74 (2009)
 5. van Benthem, J.: Modal Correspondence Theory. Ph.D. thesis, Universiteit van
    Amsterdam (1976)
 6. Brandt, S.: Polynomial time reasoning in a description logic with existential re-
    strictions, GCI axioms, and—what else? In: Proceedings of the 16th European
    Conference on Artificial Intelligence (ECAI-2004). pp. 298–302. IOS Press (2004)
 7. Courcelle, B., Engelfriet, J.: Graph Structure and Monadic Second-Order Logic: A
    Language-Theoretic Approach. Cambridge University Press (2012)
 8. Glimm, B., Horrocks, I., Motik, B., Stoilos, G., Wang, Z.: HermiT: an OWL 2
    reasoner. Journal of Automated Reasoning 53(3), 245–269 (2014)
 9. Kazakov, Y., Krötzsch, M., Simančı́k, F.: The incredible ELK. Journal of Auto-
    mated Reasoning 53(1), 1–61 (2014)
10. Peñaloza, R.: Explaining axiom pinpointing. In: Description Logic, Theory Com-
    bination, and All That - Essays Dedicated to Franz Baader on the Occasion of His
    60th Birthday. LNCS, vol. 11560, pp. 475–496. Springer (2019)
11. Rozenberg, G.: Handbook of graph grammars and computing by graph transfor-
    mation, vol. 1. World scientific (1997)
12. Schild, K.: A correspondence theory for terminological logics: Preliminary report.
    In: Mylopoulos, J., Reiter, R. (eds.) Proc. IJCAI’91. pp. 466–471 (1991)
13. Tobies, S.: Complexity Results and Practical Algorithms for Logics in Knowledge
    Representation. Doctoral thesis, Rheinisch-Westfälische Technische Hochschule
    Aachen, Aachen, Germany (2001)
14. Tsarkov, D., Horrocks, I.: FaCT++ description logic reasoner: System description.
    Journal of Automated reasoning pp. 292–297 (2006)
15. Vogler, H., Engelfriet, J.: A Büchi-Elgot-Trakhtenbrot theorem for automata with
    MSO graph storage. Discrete Mathematics & Theoretical Computer Science 22
    (2020)


                                         82