=Paper= {{Paper |id=None |storemode=property |title=Concept Definability and Interpolation in Enriched Models of EL-TBoxes |pdfUrl=https://ceur-ws.org/Vol-1014/paper_66.pdf |volume=Vol-1014 |dblpUrl=https://dblp.org/rec/conf/dlog/PonomaryovV13 }} ==Concept Definability and Interpolation in Enriched Models of EL-TBoxes== https://ceur-ws.org/Vol-1014/paper_66.pdf
        Concept definability and interpolation in
           enriched models of EL–TBoxes?

                       Denis Ponomaryov1 and Dmitry Vlasov2
           1
               Institute of Artificial Intelligence, University of Ulm, Germany;
                   Institute of Informatics Systems, Novosibirsk, Russia
                      2
                        Institute of Mathematics, Novosibirsk, Russia
                          ponom@iis.nsk.su, vlasov@academ.org

        Abstract. It is known that in Description Logics explicit concept defin-
        ability is directly related to concept interpolation. The problem to decide
        whether a concept is definable under a TBox wrt a signature usually
        reduces to entailment in the underlying logic. If an explicit definition ex-
        ists, then it can be found as a concept interpolant for a concept inclusion
        entailed by an appropriately chosen TBox. In fact, it can be extracted
        from a corresponding proof of the concept inclusion. We describe a graph
        structure called enriched model that represents proofs in normalized EL-
        TBoxes and show that, built once for a normalization of a given TBox
        T , it can be used for deciding the existence or direct computation of ex-
        plicit definitions of concepts under arbitrary subsets of axioms of T and
        wrt different subsignatures of T . Solving this computational problem has
        applications in collaborative ontology engineering and is an important
        part of the recently proposed algorithms for ontology decomposition.


1     Introduction and Motivation
    In the paper, we consider the Description Logic EL with the standard set
of constructors for concepts, i.e. the conjunction, existential restriction, and the
distinguished concept symbol >. Consider a scenario in which several experts
are working on the same ontology T and each expert has a vocabulary (a set
of concept and role names) she is using for defining complex concepts in the
ontology. Vocabularies need not be distinct between experts. If at some point an
expert encounters an axiom in the ontology and a concept which is built using
foreign vocabularies, she wants to figure out whether it still can be defined in her
own vocabulary. Or at least, whether the encountered concept can be partially
reformulated to use less symbols from foreign vocabularies (and more symbols
from her own one). If the answer is “yes”, then the expert has an option to see
the actual reformulation of the concept and can replace it with the obtained
reformulation (if she is allowed to do so under a collaborative ontology change
policy). To illustrate this idea, consider the following simple example. Consider
the TBox T = {ϕ, A v B, D v A}, where ϕ = A u B v D. Suppose an
expert is interested in a reformulation of the concept A u B. Independently of
what the expert’s vocabulary is, this concept can be reformulated as A, since
T |= A u B ≡ A, and it can be replaced with A giving a TBox equivalent to T .
Now assume that the expert’s vocabulary is {D}. We have T |= AuB ≡ D, but if
?
    The full paper is available at http://persons.iis.nsk.su/en/person/ponom/papers
we replace the concept AuB with D, we obtain the TBox which is not equivalent
to the initial one. In other words, even though the required reformulation of the
concept A u B exists, it can not be used instead of A u B without further
modifications of the TBox (e.g. adding the inclusion A v D as an additional
axiom to T ). The reason is that T \{ϕ} 6|= AuB ≡ D, i.e. the axiom ϕ is essential
for proving this equivalence. Thus, the given example demonstrates the need to
check for definability of concepts under certain subsets of a TBox. Now consider
the TBox S = {ϕ, A v D, ∃r.D v ∃r.A}, where ϕ = ∃r.A v B, and assume
that the expert’s vocabulary is {D}. We have S \ {ϕ} |= ∃r.A ≡ ∃r.D, i.e. the
concept ∃r.A can be partially reformulated as ∃r.D. Note that the subsignature
which can not be changed under any reformulation wrt the expert’s vocabulary
(here, the single symbol r) may not be known in advance and has to be guessed.
It is observed in the results of [6] however, that the mentioned problem of partial
concept reformulation is not harder than entailment in the underlying logic.
     Given a TBox T , a concept C is said to be explicitly definable under T wrt
a signature ∆ if T |= C ≡ E, where E is a concept over ∆. The problem to
decide whether C is equivalent in T to a concept containing less symbols from
sig (C) \ ∆ (thus, possibly having more ∆-symbols than C does) is called the
problem of partial concept reformulation (under T wrt ∆). Explicit definability
was studied in Description Logics in the context of query rewriting (Sect. 3 in
[8]), acyclic representation of TBoxes [2] and is intimately related to modularity
properties of ontologies via the notion of uniform interpolation. It is known
that the Description Logic EL enjoys the concept interpolation property in the
following sense (cf. Definition 3.1 in [3]): if the signatures of some TBoxes T1 , T2
intersect by a set ∆, as well as signatures of some concepts C1 , C2 , then T1 ∪T2 |=
C1 v C2 yields the existence of a concept D in the signature ∆ such that
T1 ∪ T2 |= {C1 v D, D v C2 } holds. It follows from this property that a concept
C is explicitly definable under a TBox T wrt a subsignature ∆ ⊆ sig (T ) iff
T ∪ T ∗ |= C ∗ v C, where sig (T ) ∩ sig (T ∗ ) = ∆, sig (C ∗ ) ∩ sig (C) ⊆ ∆, and
T ∗ , C ∗ are “copies” obtained from T and C, respectively, by replacing all non-
∆-symbols simultaneously with “fresh” ones, not present in sig (C) ∪ sig (T ).
Thus, deciding concept definability is polynomially reduced to entailment in
EL, although the explicit definition itself can be of exponential length in the
size of T and C (cf. Example 28 in [6]). Note that the above construction of
copies for T and C assumes that the signature ∆ is known and it allows to
check whether C is equivalent to a concept having symbols only from ∆ (not
necessarily all of them). In fact, the same idea can be used to check whether
C can be partially reformulated wrt ∆. For instance, the concept ∃r.A from
the TBox S above allows for the partial reformulation wrt ∆ = {D} and the
entailment (T \ {ϕ}) ∪ (T ∗ \ {ϕ∗ }) |= ∃r.A∗ v ∃r.A certifies this, where ϕ∗ =
∃r.A∗ v B and T ∗ and ∃r.A∗ are copies constructed as described above, with
sig (T )∩sig (T ∗ ) = ∆∪{r}. Observe that we had to extend the initial signature
∆ with the symbol r. In fact, if we extend an initial signature ∆ with the set
sig (C)\{x} for a symbol x ∈ sig (C)\∆, then the entailment as above holds iff
x can be removed in some partial reformulation of the concept C wrt ∆. Thus,
in order to check for partial reformulations of concepts under a given TBox T
wrt different signatures ∆ ⊆ sig (T ), one needs to have an efficient procedure
for checking the entailment
                    (T ∪ f ∗ (T )) \ S |= f ∗ (C) v C       (∗)
for subsets S ⊆ T ∪ f (T ), different concepts C, and different functions f ∗
                         ∗

renaming signature symbols into fresh ones injectively and being the identity
function on ∆. If the entailment holds, then the corresponding explicit definition
of the concept C can be extracted from the proof of the inclusion f ∗ (C) v C from
the TBox above. We will consider the case of this entailment problem, where C
is a concept from the LHS or RHS of an axiom of T , although the constructions
proposed for solving the problem can be easily extended to the case of arbitrary
concepts. Since the concept C, the set S, and function f ∗ are the changing
parameters in the problem (∗), but the TBox T and the initial signature ∆ are
fixed, it makes sense to use a precomputed extended representation of T to be
able to solve numerous instances of this problem efficiently.
    The aim of the paper is to present the idea that the standard (canonical)
model of an EL-TBox with the closure operators S and R (in the sense of [1])
can be extended to become a structure (called enriched model ) which, built once
for a TBox T , can be used as a precomputed template for solving problem (∗)
with the changing input parameters (C, S, and f ∗ ). An enriched model is an
extended representation of a TBox used in the implementation of the decomposi-
tion algorithm for EL–ontologies [6] which relies on solving numerous instances of
the partial concept reformulation problem. Our presentation is divided in three
parts. In Section 3, we introduce the notion of enriched model and show that
it can be used for checking the entailment of concept inclusions from different
subsets of a TBox. In Section 4, we describe the process of computing explicit
definitions of concepts in enriched models, which is conceptually similar to the
idea of computing concept interpolants from tableaux (Sect. 4 in [8] and Sect.
3 in [3]). These results constitute the solution for entailment problem (∗) wrt
the changing parameters which is described in Section 5. We note that the main
idea behind the enriched models is the observation applicable to any system with
non-trivial search problems: do a preprocessing of information in order to handle
numerous standard calls to it faster.

2   Basic Notations
    Let Nc and Nr stand, as usual, for the countably infinite sets of concept and
role names in the alphabet of the language EL. Denote N>   c = Nc ∪ {>} and let ./
be the symbol for either v or ≡. A formula is an expression of the form C1 ./ C2 ,
where C1 , C2 are concepts, and a TBox is a finite set of formulas which does not
contain C1 ≡ C2 and C1 v C2 simultaneously for some concepts C1 , C2 . If ∆ is
a signature, then T erm(∆) will stand for the set of all concepts in the signature
∆ and if ∆ ⊆ Nc , then T aut(∆) will denote the set of tautologies of the form
> v >, A v >, A v A, for each A ∈ ∆. Let T be a TBox. The notations
sub (T ), Nc T , and Nr T will stand for the sets of all concepts, concept names,
and role names respectively, occurring in axioms of T , while N>  c (T ) will be the
shorthand for Nc T ∪ {>}. The notation sig (T ) will mean the signature of T
and will be slightly abused to denote the signature of a formula or a concept.
If C ∈ sub (T ) is a concept and A ∈ Nc , then T (C/A) will stand for the TBox
obtained from T by substituting each occurrence of C in the LHS or RHS of
axioms of T with the concept name A. We call a Tbox T 0 conservative extension
of T if sig (T ) ⊆ sig (T 0 ) and for any formula ϕ, with sig (ϕ) ⊆ sig (T ), we
have T |= ϕ iff T 0 |= ϕ.

3   Proofs in Enriched Models of EL-TBoxes
    Informally, an enriched model is a graph structure for representing proofs
in a deductive system for normal TBoxes. It is built iteratively in accordance
with the inference rules and normal axioms obtained from an input TBox by
normalization. In addition, an enriched model stores a map from normal to the
original axioms to allow for checking the entailment of concept inclusions from
different subsets of the input TBox. To illustrate this idea briefly, let us consider
an example of eliminating a redundant axiom from a TBox.
Example 1 Consider the TBox S = {ϕ, A v B, A v D}, where ϕ = A u B v
D and A, B, D ∈ Nc . Note that either the first or the last axiom can be eliminated
from S giving an equivalent TBox. Let T = {ϕ0 , A v B, A v D, AuB v X, X v
A, X v B} be the conservative extension of S, where ϕ0 = X v D and X is
the “pseudonym” for the concept A u B. Note that S \ {ϕ} |= ϕ is equivalent
to T \ {ϕ0 } |= ϕ0 . Consider the following graph, where every vertex has one
or two labels depicted by the containment relation ∈ f (.) and an arrow marked
                                                                                ρ
with δ or ρ. In the figure, if a vertex is marked with labels f (X) and 7→ Y ,
then X, Y ∈ Nc T and T |= Y v X. The vertices marked with δ correspond
to axioms of T witnessing these entailments. Essentially, the graph represents
possible proofs of primitive concept inclusions inside T , note the two disconnected
components of the graph depicting independent proofs:




         Figure 1. Example of enriched model: proofs as paths in the graph
The right-most vertex of the second component corresponds to the entailed con-
cept inclusion ϕ0 = X v D. Moreover, there is a subgraph which witnesses this
entailment and does not contain vertices labelled by ϕ0 . This corresponds to the
fact that ϕ0 is entailed by T \{ϕ0 }and thus, ϕ is redundant in S.
    The graph considered in the illustration generally corresponds to what we
call further enriched model of a TBox. The rest of the section is devoted to a
formal justification of the idea of checking for entailment of concept inclusions
from subsets of a TBox by means of such structures.
Definition 1 (Primitivization). A formula is called primitive if it is of the
form A v B for A, B ∈ N>    c . A TBox Tp is called primitivization of a Tbox T
if every axiom ϕ of T is equivalent in Tp to a primitive axiom ϕp ∈ Tp , i.e. if
(Tp \ {ϕp }) ∪ {ϕ} |= ϕp and Tp |= ϕ.
    An extended primitivization of a TBox T is a pair hTp , τ 0 i, where Tp is a
primitivization of T and τ 0 : Tp → T ∪ {null} is a map, which are both obtained
after the application of the following rewriting rule to each axiom of T :
  • if C1 ./ C2 ∈ T then T := T (Ci /Ai )∪{Ai ≡ Ci }, for each Ci 6∈ N> c , i = 1, 2,
    where Ai is a fresh concept name,
  • τ 0 (Ai ≡ Ci ) = null and if ϕp is the image of C1 ./ C2 under the applied
    substitutions, then τ 0 (ϕp ) = C1 ./ C2 .
Definition 2 (Normalization). A formula is normal if it has one of the fol-
lowing forms: A v ∃r.B, ∃r.B v A, A1 u ... u An v B, for n > 1, with all
A, B, Ai , i = 1, .., n, being concept names. ATBox is normal if each of its axiom
is. A TBox T n is called normalization of a TBox T if it is a conservative ex-
tension of T and each axiom of T n is normal. Let Tep = hTp , τ 0 i be an extended
primitivization of a TBox T . An extended normalization of Tep is a pair hT n , τ i,
where T n is a normalization of Tp and τ : T n → T is a map such that:
  • if A v B ∈ Tp , A, B ∈ N> c , then A v B ∈ T
                                                    nm
                                                        and τ (A v B) = τ 0 (A v B);
  • if A ≡ B ∈ Tp , A, B ∈ N> c , then  {A   v B,  B  v A} ∈ T nm and τ (A v B) =
     τ (B v A) = τ 0 (A ≡ B);
  • τ (ϕ) = null for the rest of the axioms ϕ ∈ T nm .
It follows from the standard decision procedure for EL (see e.g. [1]) that the
following set of inference rules (under the notion of inference defined below) is
complete wrt the semantic entailment of primitive formulas A v B from normal
TBoxes (in the rules, A, B, A1 , ..., An , n > 2, stand for symbols from N>  c and C
is either a symbol from N> c or an expression ∃r.X, with X ∈ Nc ):
                                                                   >

                                                          AvB
        (Ax)             (AxTop)                 (Trans)          BvC∈T
              AvA                   Av>                   AvC
                        A v A1 , ..., A v An
              (AndR)                         A1 u ... u An v B ∈ T
                             AvB
                           A v ∃r.A1 , A1 v A2
                    (Ex)                       ∃r.A2 v B ∈ T
                                 AvB
           Figure 2. A deductive system for entailment of normal formulas
If T is a normal TBox and ϕ is a normal formula, then an inference (a proof )
of ϕ from T is a tree,where ϕ is the root and every node is either a tautology
A v A, A v >, for A ∈ N>  c , or a formula obtained from the child nodes by one
of the rules T rans, AndR, or Ex with the formulas in the side conditions being
axioms of T . If additionally, these formulas are from a subset T \ S, for some
S ⊆ T , then we say that the inference omits S.
    We now introduce the notion of enriched model of a normal TBox. The defi-
nition consists of two parts: the description of an enriched model as an algebraic
structure and the description of the process of its construction.
    Let T be a normal TBox. An enriched model of T is a structure M∗ =
((Γ + , Γ − , E), δ, ρ, f ), where (Γ + , Γ − , E) is a directed bipartite graph with the
set E of edges and the set Γ + ∪ Γ − of vertices augmented with maps:
  – δ : Γ − → T ∪ T aut(Nc T ),
  – ρ : Γ + → N>                 >           >
                    c (T ) ∪ (Nc (T ) × Nc (T )),
           >             T     Γ+
  – f : Nc (T ) ∪ Nr → 2
The process of construction of an enriched model for a normal TBox is given in
the Appendix.
Definition 3 (Submodel of an enriched model). Let M∗ = (Γ = (Γ + , Γ − ,
E), δ, ρ, f ) be an enriched model. The structure M∗s = (Γs = (Γs+ , Γs− , Es ), δs , ρs ,
fs ) is called a submodel of M∗ if Γs is a subgraph of Γ , each δs , ρs , fs is the
restriction of the corresponding map from M∗ onto the vertices of Γs and the
following holds:
  – for every v − ∈ Γs− , all vertices v + ∈ Γ + with (v + , v − ) ∈ E are in Γs+ and
     connected with v − in Γs ;
  – for every v + ∈ Γs+ , there exists v − ∈ Γs− such that (v − , v + ) ∈ Es ;
     if such v − is unique, then the submodel is called deterministic.
In general, the bipartite graph (Γ + , Γ − , E) may not have a tree-shape due to
cycles caused e.g. by equalities between concept names (recall Example 1). How-
ever, there always exist submodels which are trees and these submodels are in
one-to-one correspondence with proofs in the deductive system of Figure 2.
Theorem 1 (Entailment from subset of TBox in enriched model). Let T
be a TBox and T ep = hTp , τ 0 i be an extended primitivization of T . Let hT norm , τ i
be an extended normalization of T ep and M∗ be an enriched model of T norm .
Let C1 , C2 be concepts appearing on the LHS or RHS of an axiom of T and for
i = 1, 2, let Ai ∈ sig (Tp ) be a concept name which is either Ci (if Ci ∈ Nc ),
or such that Ai ∈ sig (Tp ) \ sig (T ) and Ai ≡ Ci ∈ Tp . Then, for a subset
S ⊆ T of axioms, it holds T \ S |= C1 v C2 iff there is a deterministic submodel
Ms = (Γs = (Γs+ , Γs− , Es ), δs , ρs , fs ) of M∗ and a vertex v + ∈ Γs+ such that:
  – v + ∈ fs (A2 ) and ρs (v + ) = A1 ,
  – for any non-leaf vertex v − ∈ Γs− , τ (δs (v − )) 6∈ S.
Therefore, in order to check the entailment T \ S |= C1 v C2 , it suffices to
find first a plus-vertex v + witnessing the entailment T norm |= A1 v A2 in the
enriched model M∗ (the vertex must satisfy the conditions v + ∈ fs (A2 ) and
ρs (v + ) = A1 ) and if it exists, find an inference of A1 v A2 , which does not refer
via the map τ ◦δ to formulas of S, as a deterministic submodel of M∗ containing
v + . The procedure of finding this submodel reduces to a traversal of the graph
(Γ + , Γ − , E) of M∗ starting from the vertex v + . The procedure is conceptually
similar to the one presented in the context of computing concept interpolants in
Section 4 and is therefore skipped.

4    Computing Explicit Definitions with Enriched Models
      Let T1 be a TBox, C1 be a concept from the LHS or RHS of an axiom of T ,
S1 ⊆ T1 be a subset of axioms, and ∆ be a signature. Suppose we want to check
whether C1 is ∆–definable under T1 \ S1 . Let T2 = f ∗ (T1 ) and S2 = f ∗ (S1 ) ⊆ T2
be the corresponding “copy”-TBoxes (with sig (T1 )∩sig (T2 ) = ∆) and let C2 =
f ∗ (C1 ) be the corresponding “copy”-concept obtained by a renaming function
f ∗ . We may assume that f ∗ (T1 \S1 ) = T2 \S2 and thus, by concept interpolation,
definability is equivalent to the condition (T1 \ S1 ) ∪ (T2 \ S2 ) |= C1 v C2 . The
observations from Section 3 imply that it suffices to consider the case when the
TBox T1 is normal and Ci is a concept name from sig (Ti ) \ ∆, for i = 1, 2. Then
(T1 \ S1 ) ∪ (T2 \ S2 ) |= C1 v C2 holds iff there is a submodel Ms in an enriched
model M∗ = ((Γ + , Γ − , E), δ, ρ, f ) of T1 ∪ T2 and a vertex v + ∈ Γ + certifying
this entailment. Let us show how a concept interpolant for the inclusion C1 v C2
(i.e. an explicit ∆-definition of C1 under T \ S) can be computed while searching
for this submodel in M∗ . The submodel Ms must be in direct correspondence
with a proof of C1 v C2 from (T1 \S1 )∪(T2 \S2 ). We have sig (T1 )∩sig (T2 ) = ∆
and Ci ∈ sig (Ti ) \ ∆, for i = 1, 2, so it suffices to understand how a concept
interpolant between C1 and C2 can be extracted from a proof of C1 v C2
in this situation, which is the purpose of Theorem 3 given in the Appendix.
The property of concept interpolation shown in the theorem is well-known for
EL, but the proof is given to justify soundness of the procedure for interpolant
computation described at the end of this section. Let us consider an example
illustrating the main idea.
Example 2 Consider the TBox S = {A u B v C, A ≡ D}, with A, B, C, D ∈
Nc , the corresponding primitivization Tp = {X v C, A ≡ D, X ≡ A u B}, and
a normalization T of Tp . Denote ∆ = {D, B} and let T ∗ be the “copy” of T
under the injective renaming of all non-∆-symbols into fresh ones such that X ∗
and C ∗ are the “copies” of X and C in T ∗ , respectively. Then the entailment
S \ {A u B v C} |= A u B ≡ D u B is equivalent to (T \ {X v C}) ∪ (T ∗ \ {X ∗ v
C ∗ }) |= X ∗ v X and there is a vertex and a subgraph in the enriched model of
T ∪ T ∗ certifying this entailment (note the right-most vertex):




    Figure 3. Constructing concept interpolant from a subgraph in enriched model
Consider the vertex labelled with f (A) and X ∗ in the graph, denote it as v. It
corresponds to the entailed inclusion X ∗ v A and is special, because it contains
non-∆-symbols from T ∗ and T on the LHS and RHS, respectively, while all
the descending vertices contain non-∆-symbols only from T ∗ . By the concept
interpolation property of EL, there must be an interpolant ∆-concept for the
inclusion X ∗ v A (for the vertex v). In fact, this concept can be obtained from the
LHS of the axiom labeling the child vertex of v, i.e. D is the required interpolant
for v. To obtain the interpolant for the right-most vertex (corresponding to the
inclusion X ∗ v X of interest) it suffices to take the conjunction of interpolants
for the child vertices preceding the one labeled with A u B v X. We have B ∈ ∆
and the concept D u B is the required interpolant.
    The example shows that there are special intermediate derived concept in-
clusions in the proof for which an interpolant can be easily computed. It is either
the LHS of the side condition of the rule under which the concept inclusion is
obtained, or the concept from the RHS of one of the premises of the rule. In par-
ticular, such special concept inclusions are the very first formulas in branches of
the proof which are derived using axioms of both, a TBox T and its “copy”. For
the rest of the derived concept inclusions, the corresponding interpolants can
be obtained by combining interpolants for preceding concept inclusions under
conjunction and existential restriction. These are the main ideas presented in
the proof of Theorem 3 and the procedure of interpolant computation.
    Essentially, the proof demonstrates that in case an irregular formula ϕ is de-
rived from T1 ∪T2 by an inference rule R, then either there are only regular formu-
las in the premise of R and the interpolant for ϕ can be computed immediately,
or there are some irregular formulas in the premise for which the correspond-
ing interpolants must be computed first in order to obtain an interpolant for
ϕ. Practically, this means a recursive procedure of computing interpolants while
traversing the proof of formula ϕ (i.e. a submodel starting with the plus-vertex
corresponding to ϕ). Every proof starts from axioms which are regular formulas,
so the first irregular formula in a branch of the proof is obtained after application
of some rule R with regular premises. As noted in the proof of Theorem 3 (and
Example 2), an interpolant in this case can be computed immediately, since it
is either the concept from the LHS of the formula in the side condition of R, or
the concept ∃r.A1 from the RHS of the premise, if R = Ex. Thus, Theorem 3
(together with Theorem 1) evidences soundness of the recursive procedure given
below, while termination follows from the finiteness of the graph in an enriched
model and the usage of flags for visited vertices. The procedure is defined in
words instead of pseudo-code for ease of exposition.
Procedure for computing an explicit definition as concept interpolant
Input:
  – an enriched model M∗ = (Γ = (Γ + , Γ − , E), δ, ρ, f ) of the union of normal
     TBoxes T1 ∪ T2 , with sig (T1 ) ∩ sig (T2 ) = ∆, for a signature ∆;
  – a concept inclusion A1 v A2 , with Ai ∈ (sig (Ti ) ∩ Nc ) \ ∆, for i = 1, 2.
Output: Empty iff T1 ∪ T2 6|= A1 v A2 ; otherwise, a concept interpolant D such
that sig (D) ⊆ ∆ and T1 ∪ T2 |= {A1 v D, D v A2 }.
The procedure operates on the graph Γ from M∗ and incorporates a subroo-
tine of graph traversal.
     1. If there is no vertex v + ∈ Γ + such that v + ∈ f (A2 ) and ρ(v + ) = A1 ,
then the procedure terminates with the empty output. Otherwise, initiate graph
traversal starting from the vertex v + as follows.
     2. Initialize two maps vis : Γ + ∪Γ − → {0, 1} and int : Γ + ∪Γ − → T erm(∆)∪
{null} for all vertices v of Γ as: vis(v) = 0, int(v) = null. The map vis will be
used to mark visited vertices and int will mark vertices with their corresponding
interpolants. The value null will mean that there is no interpolant and vis(v) = 0
will imply int(v) = null. The task is compute the value of int(v + ), which will
not be equal to null, however it can be the case for other vertices in Γ (which
are related to proofs inside TBoxes T1 , T2 , but not from their union, i.e. to the
proofs consisting of regular formulas).
     3. The recursive subrootine of graph traversal and interpolant computation
is as follows. Let v be the currently visited vertex from Γ .
     3.1. Let v ∈ Γ + . Then we visit an arbitrary vertex w such that (w, v) ∈ E,
i.e. a child minus-vertex:
  • If vis(w) = 1, we set int(v) = int(w);
  • If vis(w) = 0, then we initiate recursively graph traversal starting from the
     vertex w. When the traversal is complete, we set int(v) = int(w).
     After these steps, we set vis(v) = 1, i.e. visiting the vertex v is finished and
we exit the recursion to one level up.
    3.2. Let v ∈ Γ − . Then we immediately set vis(v) = 1. Let δ(v) = F1 v F2 .
If for some i = 1, 2, we have sig (Fi ) ⊆ ∆ , then we set int(v) = Fi and exit the
recursion to one level up.
    Otherwise, initiate graph traversals recursively starting from every irregular
(plus-vertex) w, with (w, v) ∈ E. If there are no such vertices, then exit the
recursion to one level up. Otherwise, when the traversal is complete for all such
vertices w, we have int(w) 6= null and proceed as follows.
    3.3. Consider different cases for the formula δ(v).
  • If δ(v) = A v B, or δ(v) = A v ∃r.B, for some concept names A, B, then
     there is a single child plus-vertex w for v and we set int(v) = int(w).
  • If δ(v) = A1 u...uAn v B, n > 1, then let w1 , .., wn be the child plus-vertices
     for v, with wi ∈ f (Ai ) and A = ρ(wi ), for i = 1, ..n and a concept name A.
     Take the maximal subset I ⊆ {1, .., n} such that for every i ∈ I, the vertex
     wi is irregular and set int(v) = i∈I int(wi ) u j ∈{1,..,n}\I Aj .
                                      F               F
    • If δ(v) = ∃r.A2 v B, then there is a child plus-vertex w such that w ∈ f (r)
      and ρ(v) = hA, A1 i for some A and A1 . If w is irregular, then set int(v) =
      int(w). Otherwise there is an irregular child vertex w, with w ∈ f (A2 ),
      ρ(w) = A1 , and we set int(v) = ∃r.int(w).
  This finishes visiting the vertex v ∈ Γ − .
The definition of the procedure is complete.
     Let v be the minus-vertex from point 3.3 of the procedure and w be one
of its child plus-vertices with an interpolant int(w) = D for some concept D.
From the consideration of the formula δ(v), one can see that either the concept
int(v) is the same as int(w), or can be of length increased by a constant (when
δ(v) = ∃r.A2 v B), or in the worst case, int(v) can be n-times longer than int(w)
(if δ(v) = A1 u...uAn v B, n > 1, and each corresponding child plus-vertex wi is
irregular). This means that the size of the interpolant for the inclusion A1 v A2
is at worst O(nm ), where m is the number of axioms in the normal TBox T1 ∪ T2
and n is the maximal number of conjuncts in an axiom of T1 ∪ T2 . Note that due
to arbitrary selection of the minus-vertex in point 3.1, the procedure computes
an arbitrary chosen interpolant out of existing ones for the concept inclusion
A1 v A2 . By Theorem 1, it should be clear that point 3.1 can be easily modified
to decide concept definability wrt a subset S of the TBox: it suffices to consider
only those child minus-vertices which are labeled with formulas corresponding
to axioms of S.

5      Partial Concept Reformulation with Enriched Models
    The problem of partial reformulation of concepts under a TBox T wrt a sig-
nature ∆ considered in the introduction implies several instances of the concept
definability problem for different extensions of ∆. At worst, the number of tests
for definability to be performed is equal to the number of signature symbols in
a concept C of interest. Moreover, in the decomposition algorithm for EL [6],
each non-primitive concept being the LHS of an axiom of T has to be tested for
partial reformulation. Each instance of the definability problem wrt an extension
∆0 ⊇ ∆ requires checking entailment from the union of TBoxes (T \S)∪(T 0 \S 0 ),
where T 0 is a “copy” of T , with all non-∆0 -symbols injectively renamed into
fresh symbols, and S, S 0 are subsets of axioms of T and T 0 , respectively. Since
the concept C and the sets (∆0 \ ∆), S, S 0 are the changing parameters in the
problem of partial reformulation, but the TBox T and the initial signature ∆
are fixed, it makes sense to use a prebuilt extended representation of T to be
able to solve numerous instances of the concept definability problem efficiently.
In the following we argue that an enriched model precomputed for a given TBox
T and a signature ∆ can be used as a template for solving different instances of
the concept definability problem wrt the changing parameters.
    We assume w.l.o.g. that the input TBox T is in the enriched normal form.
Given T and ∆, first an enriched model M for T is built and then its “copy”
M∗ is generated wrt an injective renaming of all non-∆-symbols from T . The
pair M, M∗ is then kept as a precomputed template. As soon as an extension ∆0
of the signature ∆ is given, the graphs of these two models are expanded (with
the iterative process of model construction described in the Appendix) wrt a set
of formulas generated from T and the set ∆0 \ ∆. The idea of constructing this
set of formulas is described in Example 3 and Theorem 2.
    In the example below, we assume that we are given TBoxes T and T ∗ sharing
a signature ∆ (initially taken to be empty for simplicity) and then ∆ is expanded
as the result of substituting (or “gluing”) some symbols of T ∗ with that of T .
Example 3 Consider normal TBoxes T = {A v ∃r1 .A1 } and T ∗ = {∃s1 .B1 v
B}, we have ∆ = sig (T ) ∩ sig (T ∗ ) = ∅. Let T 0 = {∃r1 .A1 v B} be the
TBox obtained from T ∗ by substituting B1 with A1 and s1 with r1 ; we have
∆0 = sig (T ) ∩ sig (T 0 ) = {A1 , r1 } ⊇ ∆. Clearly, T ∩ T 0 |= A v B.
   Now consider the TBox Q = T ∪T ∗ ∪{A1 ≡ B1 }∪T ∗ (s/r), where T ∗ (s/r) =
{∃r1 .B1 v B}, i.e. this TBox consists of the axiom of T ∗ with the role s1
substituted with r1 . Then Q |= A v B and in fact, it will be shown in Theorem
2 that T ∪ T 0 and the theory Q constructed for the union T ∪ T ∗ in the way
above entail the same normal formulas in signature sig (T ∪ T ∗ ). Therefore,
an enriched model for checking entailment from the union of theories T ∪ T 0
(sharing the signature ∆0 ⊇ ∆) can be obtained by expanding prebuilt models for
T and T ∗ wrt the additional formulas {A1 ≡ B1 } ∪ T ∗ (s/r).
    We note that a much more general model-theoretic result implying Theorem
2 can be proved, but here we restrict ourselves only to the case important in the
context of this paper.
Theorem 2 (Formulas imitate substitutions of signature symbols). Let
T1 and T2 be normal TBoxes, ∆ = sig (T1 ) ∩ sig (T2 ), and let σ1 = {Ai }i∈I ∪
{rj }j∈J ⊆ sig (T1 ) \ ∆, σ2 = {Bi }i∈I ∪ {sj }j∈J ⊆ sig (T2 ) \ ∆ be signatures,
where all Ai , Bi ∈ Nc and rj , sj ∈ Nr . Let T20 be the result of substituting every
Bi with Ai and every sj with rj in the axioms of T2 .
    Consider the TBox Q = T1 ∪ T2 ∪ {Ai ≡ Bi }i∈I ∪ T2 (s̄/r̄), where T2 (s̄/r̄)
is the set of formulas of the form ∃rj .X v Y , or X v ∃rj .Y , j ∈ J such that
∃sj .X v Y , or X v ∃sj .Y , respectively, is an axiom of T2 . Then for any normal
formula ϕ in signature sig (T1 ∪ T20 ) we have T1 ∪ T20 |= ϕ iff Q |= ϕ.
   It should be clear that precomputing the models M, M∗ does make sense,
when there is a large number of queries for partial concept reformulation to
be executed and answering all these queries would potentially involve all the
axioms of the TBox. In paticular, this is the case for the decomposition algorithm
from [6]. On the other hand, an expansion of M, M∗ wrt the auxiliary axioms
from Theorem 2 may be computed only to the extent needed for checking the
entailment C 0 v C for a concept C of interest (and its copy C 0 after a renaming
wrt ∆0 ⊇ ∆).

6   Discussion
    In the paper, we did not consider questions of efficient implementation of
enriched models and the algorithm of their construction. The iterative procedure
based on rules from Section 3 (including the initialization step) is mentioned to
simplify the exposition and is by no means proposed to be used in applications;
this topic is left out of the scope of this paper to separate the main concepts from
their implementation details. We note that the technique of enriched models can
be easily adopted for solving the problem of concept reformulation by using the
inference systems from the consequence-driven procedures for EL (see e.g. [4]
or [5]). Moreover, we argue that the basic ideas described in this paper can be
implemented in the context of any reasoning system for EL which allows for
proof tracing and incremental reasoning (see a short overview in Section 2 of
[5]). The first feature is necessary for constructing concept interpolants from
proofs, while the second one allows for using a prebuilt model of a TBox and
computing its local changes after the TBox is updated. For assume we are given
a TBox T , a signature ∆ ⊆ T , and we would like to be able to solve different
instances of the partial concept reformulation problem under T wrt ∆. Let T ∗
be a copy of T , with all non-∆-symbols injectively renamed, and let M be a
prebuilt model for T ∪ T ∗ . Let A∗ and S ∗ be the corresponding copies of A
and S in T ∗ . Now assume a signature ∆0 ⊇ ∆ and a set S ⊆ T are given; take
here ∆0 \ ∆ = {A}, A ∈ Nc , for simplicity. Then building a model for the TBox
(T \ S) ∪ (T 0 \ S 0 ) (where T 0 and S 0 are copies of T and S under an injective
renaming of all non-∆0 -symbols) is equivalent to incrementally updating M after
removing the axioms of S ∪ S ∗ from T ∪ T ∗ and adding the axiom A ≡ A∗ .
    Since an enriched model of a TBox contains proofs of primitive concept in-
clusions, it can be directly used for computing concept interpolants and hence,
explicit definitions of concepts. For every proof of a concept inclusion C ∗ v C
from T ∪ T ∗ (where C ∗ and T ∗ are appropriate copies of C and T ), there is
a corresponding explicit definition of C. It follows from the example on the
non-succinctness of uniform interpolants from Section 7 of [7] that, given a EL-
TBox T and a primitive concept inclusion ϕ, there can be exponentially many
proofs of ϕ from T and this holds already for the case, when sig (T ) ⊆ Nc .
Consider the signature ∆ = {Bij | Bij ∈ Nc , 1 6 i, j 6 n} and the TBox
T = {A v Bij | 1 6 i, j 6 n} ∪ {Bij v Bi | 1 6 i, j 6 n} ∪ {B1 u ... u Bn v A}.
Then the concept A has the longest explicit ∆-definition under T which is
F 16i,j6n Bij , as well as exponentially many shortest ∆-definitions. The pro-
cedure of compuiting concept interpolants given in Section 4 implies traversing
only a single proof out of the available ones and the choice of the proof is not
regulated by any strategies. We believe that the development of strategies of
concept reformulation deserves attention not only in the Description Logic EL,
but also in more expressive languages.
Acknowledgments. The first author was supported by the German Research Founda-
tion within the Transregional Collaborative Research Centre SFB/TRR 62 “Companion-
Technology for Cognitive Technical Systems”, the Russian Academy of Sciences (Grant
No. 15/10), and the Siberian Division of the Russian Academy of Sciences (Integration
Project No. 3). The second author was supported by the Russian Foundation for Basic
Research (Grant No. 12-01-00460-a).


References
1. Baader F., Brand S., and Lutz C.: Pushing the EL envelope. In Proc. Nineteenth
   International Joint Conference on Artificial Intelligence (IJCAI 2005).
2. ten Cate B., Conradie W., Marx M., Venema Y.: Definitorially complete Descrip-
   tion Logics. In Proc. Tenth International Conference on Principles of Knowledge
   Representation and Reasoning (KR 2006).
3. ten Cate B., Franconi E., Seylan I.: Beth definability in expressive Description Log-
   ics. In Proc. Twenty-second International Joint Conference on Artificial Intelligence
   (IJCAI 2011).
4. Kazakov Y., Krötzsch M., and Simancik F. Concurrent classification of EL ontolo-
   gies. In: Proc. Tenth International Semantic Web Conference (ISWC 2011).
5. Kazakov Y. and Klinov P. Incremental Reasoning in OWL EL without Book-
   keeping. Technical report, University of Ulm, May 2013. Available from:
   http://code.google.com/p/elk-reasoner/
6. Konev B., Lutz C., Ponomaryov D., Wolter F.: Decomposing Description Logic on-
   tologies. In: Proc. Twelfth International Conference on the Principles of Knowledge
   Representation and Reasoning (KR 2010).
7. Konev B., Walther D., and Wolter F. The logical difference problem for description
   logic terminologies. In: Proc. Fourth International Joint Conference on Automated
   Reasoning (IJCAR 2008).
8. Seylan I., Franconi E., de Bruijn J.: Effective query rewriting with ontologies over
   Dboxes. In: Proc. Twenty-first International Joint Conference on Artificial Intelli-
   gence (IJCAI 2009).
Appendix
1. Definition of enriched model
The definition consists of two parts: the description of an enriched model as an
algebraic structure and the description of the process of its construction. For
convenience, we repeat here the algebraic part of the definition.
    Let T be a normal TBox. An enriched model of T is a structure M∗ =
((Γ + , Γ − , E), δ, ρ, f ), where (Γ + , Γ − , E) is a directed bipartite graph with the
set E of edges and the set Γ + ∪ Γ − of vertices augmented with maps:

 – δ : Γ − → T ∪ T aut(Nc T ),
 – ρ : Γ + → N>               >       >
                 c (T ) ∪ (Nc (T ) × Nc (T )),
                     T       +
 – f : N>c (T ) ∪ Nr → 2
                           Γ


    such that the union of sets in the image of f is exactly Γ + and for all distinct
                    T
x, y ∈ N>
        c (T ) ∪ Nr   it holds that f (x) ∩ f (y) = ∅.
    By using the last restriction, we will sometimes abuse the notation and write
f −1 (v) to denote the preimage of the set V ⊆ Γ + with v ∈ V .
    Informally, the plus-vertices of the graph (i.e. the elements of Γ + ) will cor-
respond to concept and role names and will be mapped by ρ to concept names
and pairs of concept names, respectively. The minus-vertices will correspond to
axioms of T and tautologies. An enriched model M∗ = ((Γ + , Γ − , E), δ, ρ, f ) of
a normal TBox T is built iteratively in accordance with a set of graph expansion
rules which correspond to the inference rules from Figure 2. For every A ∈ Nc T ,
the graph is first initialized with connected pairs of vertices hw1 , y1 i and hw2 , y2 i
with w1 , w2 ∈ Γ − and y1 , y2 ∈ Γ + , such that y1 ∈ f (A), ρ(y1 ) = A, δ(w1 ) is a
tautology of the form A v A and y2 ∈ f (>), ρ(y2 ) = A, and δ(w2 ) is A v >.
Besides, a pair hw, yi is initialized, with y ∈ f (>), ρ(y) = >, and δ(w) = > v >.
This is the initialization step for the bipartite graph and the mappings δ, ρ, f .
To formulate the iterative construction process of M∗ , we present it as a set
of graph expansion rules, which describe how the bipartite graph should be
expanded (and the mappings updated) based on the previously built vertices
and an axiom from T encountered at an iteration step. Each rule has the form
(m > 2):

                           •(U1 , V1 ), ..., •(Um−1 , Vm−1 )
                                                             ϕ
                                      •(Um , Vm )
                                           T
where for i = 1, .., m, Ui ∈ N>c (T ) ∪ Nr   and Vi is either a symbol from N>
                                                                             c (T )
                             >
or a pair of symbols from Nc (T ), and ϕ is an axiom of T .
The rule is read as follows:
 – if there is an axiom ϕ in T
 – and for each (Ui , Vi ), i = 1, .., m − 1, there is a vertex xi ∈ Γ + such that
   xi ∈ f (Vi ) and ρ(xi ) = Ui
 – if there does not exist a vertex y ∈ Γ + such that y ∈ f (Vm ) and ρ(y) = Um ,
   then initialize such vertex and
 – if there does not already exist a vertex w ∈ Γ − , with δ(w) = ϕ, such that
   hxi , wi ∈ E for all i = 1, .., m − 1 and hw, yi ∈ E, then initialize such vertex.
The list of the graph expansion rules is presented in the figure below (n > 2 in
the rule AndR• ). Each rule is augmented with an illustration of the constructed
part of the graph with the above mentioned vertices xi , w, and y, i = 1, .., m − 1.


                       •(A,B)
           (T rans•1 ) •(A,C) B v C




                     •(A,B)
       (T rans•2 ) •(r,hA,Ci) B v ∃r.C




            •(A,A1 ),...,•(A,An )
(AndR• )         •(A,B)           A1 u ... u An v B




           •(r,hA,A1 i), •(A1 ,A2 )
  (Ex• )          •(A,B)            ∃r.A2 v B




    Figure 3. Iterative construction of an enriched model: graph expansion rules

The iterative procedure of model construction finishes when no rule results in
an expansion of the graph, i.e. when no vertex or edge can be added.
Definition 4 (Enriched model of a normal TBox). Let T be a normal
TBox. The structure M∗ = ((Γ + , Γ − , E), δ, ρ, f ) built for T as above is called
enriched model of T .


2. Proofs for Section 3
Let us formulate three auxiliary lemmas needed for the proof of Theorem 1. Note
that in general, the bipartite graph (Γ + , Γ − , E) may not have a tree-shape due
to cycles caused e.g. by equalities between concept names (recall Example 1).
However, there always exist submodels which are trees and these submodels are
in one-to-one correspondence with proofs in the deductive system of Figure 2.

Lemma 1 (Deterministic tree submodels correspond to proofs) Let
M∗ = (Γ = (Γ + , Γ − , E), δ, ρ, f ) be an enriched model of a TBox T . For any ver-
tex v + ∈ Γ + , there exists a deterministic submodel M∗s = (Γs = (Γs+ , Γs− , Es ), δs ,
ρs , fs ), where Γs is a tree with the root v + and each leaf is a vertex v − ∈ Γ −
with δ(v − ) ∈ T aut(Nc T ). Let S = T \ {δ(v − ) | v − ∈ Γs− }.
     The submodel M∗s corresponds to an inference from T omitting S of the
concept inclusion ρ(v + ) v f −1 (v + ) (in case ρ(v + ) ∈ Nc ) and of the inclusion
A v ∃f −1 (v + ).B (in case ρ(v + ) = hA, Bi).

Proof. The claim is proved by induction on the number of iteration steps in the
construction process of M∗ for T which are needed to initialize the vertex v + .
The base case is trivial by the initialization step for M∗ : in this case the required
tree is just a two-element chain corresponding to an inference of a tautology from
T aut(Nc T ). This inference trivially omits S. The induction step is simple as well,
since at each iteration step of the graph expansion, by the induction hypothesis,
for every plus-vertex in the premise of an expansion rule, there is a corresponding
deterministic submodel, the part of the graph to be constructed has the form
of a tree (Figure 3) and the intermediate vertex v − to be constructed satisfies
δ(v − ) ∈ T . If the expansion rule applied is T rans•i for some i = 1, 2, then the
corresponding inference of the concept inclusion is by the rule T rans with the
axiom in the side condition equal to δ(v − ); the rest of the rules in Figure 3 are
in direct correspondence to the rules from Figure 2. 
   Let us call proof submodel a submodel with the properties from Lemma 1 for
some plus-vertex v + . As expected, the converse of the lemma holds as well.

Lemma 2 (Proofs correspond to deterministic tree submodels) Let T
be a normal TBox, M∗ be an enriched model of T , and let ϕ be a formula of
either of the forms A v B, A v ∃r.B, for A, B ∈ Nc . If there is an inference of
ϕ from T which omits a subset S ⊆ T , then there is a proof submodel of M∗
with the root vertex v + corresponding to ϕ and each non-leaf minus-vertex v −
satisfying δ(v − ) ∈ (T \ S).

    Thus, in the notations of the lemma, there is an inference of ϕ from T iff there
is a plus-vertex v + ∈ Γ + in the enriched model of T such that v + ∈ f (B) and
ρ(v + ) = A, or v + ∈ f (r) and ρ(v + ) = hA, Bi, depending on the form of ϕ. By
the construction of the enriched model, the vertex v + in the graph is unique for
ϕ, so the number of plus-vertices in the model is at worst O(n2 ), where n in the
size of the input TBox T . Every path of length 3 in the graph starting and ending
in a plus-vertex has a minus-vertex in the middle (the graph is bipartite). Every
minus-vertex is labeled with an axiom of T and, by the construction, the number
of minus-vertices between two plus-vertices does not exceed the number of axioms
in T (there can not exist two such minus-vertices v1 , v2 with δ(v1 ) = δ(v2 )).
Thus, the size of the enriched model is polynomial in the size of T . Note that
besides the information about all the entailed primitive concept inclusions, the
enriched model contains information about all their possible inferences from
axioms of T and tautologies T aut(Nc T ), recall also the graphical representation
of independent inferences from Example 1.

Lemma 3 (Checking for entailment with a normal TBox) Let T be a
TBox, S ⊆ T be a subset of axioms, and ϕ ∈ T be an axiom. Let T ep =
hTp , τ 0 i be an extended primitivization of T and T en = hT norm , τ i be an extended
normalization of T ep . Denote S norm = {ψ ∈ T norm | τ (ψ) ∈ S} and ϕnorm =
{ψ ∈ T norm | τ (ψ) = ϕ}. Then we have T \ S |= ϕ iff T norm \ S norm |= ϕnorm .

Proof. For brevity we consider only the special case, when S = {ϕ} and ϕ is of
the form C1 ≡ C2 for some concepts C1 , C2 6∈ N>    c . It follows that the set ϕ
                                                                                  norm

is not a singleton. Other cases, when S is an arbitrary subset of axioms, or ϕ is
of the form C1 v C2 , or Ci ∈ N>   c , for some i = 1, 2, are considered analogously.
Let ϕp = A1 ≡ A2 be the primitive axiom of Tp such that τ 0 (ϕp ) = ϕ and Ai ≡
Ci ∈ Tp for i = 1, 2. Then, by the definition of extended normalization, ϕnorm =
{A1 v A2 , A2 v A1 } (with each of the concept inclusions mappedSby τ to ϕ).
Consider the pair h(T \ {ϕ})p , τ̃ )i, where (T \ {ϕ})p = Tp \ ({ϕp } ∪ i=1,2 {Ai ≡
Ci | Ci is not the LHS or RHS of an axiom of T \ {ϕ}}) and τ̃ is the restriction
of the map τ 0 onto the set Tp \ {ϕp }. Clearly, h(T \ {ϕ})p , τ̃ )i is an enriched
primitivization of T \ {ϕ} and hence, its conservative extension. Moreover, any
model of (T \ {ϕ})p expands to a model of Tp \ {ϕp } and thus, Tp \ {ϕp } is a
conservative extension of (T \ {ϕ})p . Now consider the set T norm \ ϕnorm . It is
easy to see that this set is a normalization of Tp \ {ϕp } and thus, its conservative
extension.
    (⇒) : If T \ {ϕ} |= ϕ, then (T \ {ϕ})p |= C1 ≡ C2 and Tp \ {ϕp } |= {C1 ≡
C2 , A1 ≡ C2 , A2 ≡ C2 }. But then T norm \ ϕnorm entails this set of formulas too
and hence, T norm \ ϕnorm |= ϕnorm .
    (⇐) : Assume that ϕnorm is entailed by T norm \ϕnorm . Then it is entailed by
Tp \ {ϕp } and, since Tp \ {ϕp } |= {A1 ≡ C2 , A2 ≡ C2 }, we have Tp \ {ϕp } |= ϕ. As
Tp \ {ϕp } is a conservative extension of T \ {ϕ}, we conclude that T \ {ϕ} |= ϕ.


Theorem 1 (Entailment from subset of TBox in enriched model). Let T
be a TBox and T ep = hTp , τ 0 i be an extended primitivization of T . Let hT norm , τ i
be an extended normalization of T ep and M∗ be an enriched model of T norm .
Let C1 , C2 be concepts appearing on the LHS or RHS of an axiom of T and for
i = 1, 2, let Ai ∈ sig (Tp ) be a concept name which is either Ci (if Ci ∈ Nc ),
or such that Ai ∈ sig (Tp ) \ sig (T ) and Ai ≡ Ci ∈ Tp . Then, for a subset
S ⊆ T of axioms, it holds T \ S |= C1 v C2 iff there is a proof submodel
Ms = (Γs = (Γs+ , Γs− , Es ), δs , ρs , fs ) of M∗ and a vertex v + ∈ Γs+ such that:
 – v + ∈ fs (A2 ) and ρs (v + ) = A1 ,
 – for any non-leaf vertex v − ∈ Γs− , τ (δs (v − )) 6∈ S.
Proof. Let us denote ϕ = C1 v C2 . If ϕ ∈ T then, by the definitions of
extended primitivization and normalization, A1 v A2 ∈ T norm , τ (A1 v A2 ) =
C1 v C2 and thus, by Lemma 3, the condition T \ S |= C1 v C2 is equivalent to
T norm \ S norm |= A1 v A2 , where S norm = {ψ ∈ T norm | τ (ψ) ∈ S}. If ϕ 6∈ T
                                                                               0
then consider the TBox Q = T ∪ {ϕ}. Clearly, Qep = hTp ∪ {A1 v A2 }, τQ          i is
                                            norm
an extended primitivization of Q and hT          ∪ {A1 v A2 }, τQ i is an extended
                                                     0                       0
normalization of Qep , where τQ (A1 v A2 ) = τQ        (A1 v A2 ) = ϕ and τQ    , τQ
are equal to τ and τ 0 respectively, on the rest of their domains T norm and Tp .
Observe that T \ S |= ϕ is the same as Q \ (S ∪ {ϕ}) |= ϕ, which by Lemma
3 and the above noted is equivalent to (T norm ∪ {A1 v A2 }) \ P |= A1 v A2 ,
where P = {ψ ∈ T norm ∪ {A1 v A2 } | τ (ψ) ∈ S ∪ {ϕ}}. We have A1 v A2 ∈ P,
so the last entailment is the same as T norm \ S norm |= A1 v A2 . Therefore, in
each of the two considered cases, the condition T \ S |= C1 v C2 is equivalent
to T norm \ S norm |= A1 v A2 .
    (⇒) : If we have the entailment above, then there is an inference of A1 v A2
from T norm which omits S norm . Then, by Lemma 2, there are the required
proof submodel Ms and the vertex v + ∈ Γs+ such that for all non-leaf vertices
v − ∈ Γ − we have δs (v − ) 6∈ S norm and hence, τ (δs (v − )) 6∈ S.
    (⇐) : Suppose there is a proof submodel Ms of M∗ satisfying the conditions
of the theorem. Then, by the definition of S norm and by Lemma 1, there is an
inference of A1 v A2 from T norm which omits S norm \ T aut(Nc T ), but then
T norm \ (S norm \ T aut(Nc T )) |= A1 v A2 and (T norm \ S norm ) ∪ (T norm ∩
T aut(Nc T )) |= A1 v A2 , hence T norm \S norm |= A1 v A2 and we have (T norm ∪
{A1 v A2 }) \ P |= A1 v A2 . By Lemma 3, this entailment is equivalent to
Q \ (S ∪ {C1 v C2 }) |= C1 v C2 which is the same as T \ S |= C1 v C2 . 


3. Proofs for Section 4
    Example 2 shows that there are special intermediate derived concept inclu-
sions in the proof for which an interpolant can be easily computed. It is either
the LHS of the side condition of the rule under which the concept inclusion is
obtained, or the concept from the RHS of one of the premises of the rule. In par-
ticular, such special concept inclusions are the very first formulas in branches of
the proof which are derived using axioms of both, a TBox T and its “copy”. For
the rest of the derived concept inclusions, the corresponding interpolants can be
obtained by combining interpolants for preceding concept inclusions under con-
junction and existential restriction. These are the main ideas presented in the
proof of Theorem 3 and the procedure of interpolant computation from Section
4. Motivated by the observation on these “special” concept inclusions in proofs,
we first introduce the following auxiliary notion.

Definition 5 (Irregular formula, irregular vertex). Let T1 and T2 be nor-
mal TBoxes, with sig (T1 ) ∩ sig (T2 ) = ∆ for a signature ∆. A formula ϕ is
called regular (wrt T1 , T2 ) if sig (ϕ) ⊆ sig (Ti ) for some i = 1, 2. Otherwise, it
is called irregular. Let ((Γ + , Γ − , E), δ, ρ, f ) be an enriched model of T1 ∪ T2 .
A vertex v + ∈ Γ + is called irregular if
 – either ρ(v) = A, v ∈ f (B), for concept names A, B, and A v B is an
   irregular formula,
 – or ρ(v) = hA, Bi, v ∈ f (r), for concept names A, B, and A v ∃r.B is an
   irregular formula.
    The property of concept interpolation shown in the following theorem is well-
known for EL, but the proof is given to justify soundness of the procedure for
interpolant computation described in Section 4.

Theorem 3 (Structural proof of concept interpolation). Let T1 and T2
be normal TBoxes, with sig (T1 ) ∩ sig (T2 ) = ∆ for a signature ∆, and let
A v F be a normal, but irregular formula such that A ∈ sig (T1 ). If there is an
inference of A v F from T1 ∪ T2 , then there is a concept interpolant D such that
T1 ∪ T2 |= {A v D, D v F } and sig (D) ⊆ ∆.

Proof. We apply induction on the length of the inference of A v F from T1 ∪ T2 .
Observe that by the form of the rules in Figure 1, we must have F ∈ sub (T1 ∪
T2 ) and thus, sig (F ) ⊆ sig (T2 ), since A v F is irregular and A ∈ sig (T1 ).
Induction step (the base case is trivial): let R be the corresponding derivation
rule. Note that the formula in the side condition of R is regular and assume that
all the formulas in the premise are regular too. If R is T rans, then we conclude
that B is the required interpolant; if R is AndR, then A1 u ... u An is; if R is Ex,
then either {A, r, A1 , A2 } ⊆ sig (T1 ) and thus, ∃r.A2 is the required interpolant,
or {A1 , r, A2 , B} ⊆ sig (T2 ), and hence, ∃r.A1 is.
    Now assume there is an irregular formula in the premise of R. By the in-
duction hypothesis, there is a corresponding interpolant for this formula. If R
is T rans, then it is the required interpolant for A v F . If R is AndR, then
consider the maximal subset I ⊆ {1, .., n} such that for every i ∈ I the for-
mula A v Ai is irregular and Di is the corresponding interpolant. Since A v F
is irregular, we conclude that Aj ∈ ∆ for all j ∈ {1, .., n} \ I and therefore,
F i∈I Di u  F   j ∈{1,..,n}\I Aj is the required interpolant. Finally, if R is Ex then, in
case A v ∃r.A1 is irregular, its interpolant is the required interpolant for A v F .
Now assume that A v ∃r.A1 is a regular formula and A1 v A2 is irregular with
an interpolant D. Then r ∈ ∆ and ∃r.D is the required interpolant for A v F .



4. Proof for Section 5

Theorem 2 (Formulas imitate substitutions of signature symbols) Let
T1 and T2 be normal TBoxes, ∆ = sig (T1 ) ∩ sig (T2 ), and let σ1 = {Ai }i∈I ∪
{rj }j∈J ⊆ sig (T1 ) \ ∆, σ2 = {Bi }i∈I ∪ {sj }j∈J ⊆ sig (T2 ) \ ∆ be signatures,
where all Ai , Bi ∈ Nc and rj , sj ∈ Nr . Let T20 be the result of substituting every
Bi with Ai and every sj with rj in the axioms of T2 .
    Consider the TBox Q = T1 ∪ T2 ∪ {Ai ≡ Bi }i∈I ∪ T2 (s̄/r̄), where T2 (s̄/r̄)
is the set of formulas of the form ∃rj .X v Y , or X v ∃rj .Y , j ∈ J such that
∃sj .X v Y , or X v ∃sj .Y , respectively, is an axiom of T2 . Then for any normal
formula ϕ in signature sig (T1 ∪ T20 ) we have T1 ∪ T20 |= ϕ iff Q |= ϕ.
Proof. (⇒): We show that Q is a conservative extension of T1 ∪ T20 . By the
definition of Q it suffices to verify that Q |= T20 \ T2 . Let ψ ∈ T20 \ T2 , this is a
formula obtained from an axiom ξ ∈ T2 by substituting Bi with Ai and sj with
rj for i ∈ I, j ∈ J. Let {Ak , . . . , Al } = sig (ψ) ∩ σ1 ∩ Nc (a possibly empty set).
Consider several possible cases for the signature of ψ. If sig (ψ) ∩ {rj }j∈J = ∅,
then ξ ∪ {Ai ≡ Bi }i∈{k,...,l} |= ψ, so we have Q |= ψ, because ξ ∈ T2 ⊆ Q.
If there is a role rj ∈ sig (ψ) for some j ∈ J, then there must be a formula
γ ∈ T2 (s̄/r̄) such that either γ |= ψ (in case sig (ψ)∩σ1 ∩Nc = ∅), or {γ}∪{Ai ≡
Bi }i∈{k,...,l} |= ψ. In each of the cases we have Q |= ψ.
    (⇐): Let Q0 be the TBox obtained from T1 ∪T2 ∪T2 (s̄/r̄) by substituting every
occurrence of symbol Bi with Ai , for i ∈ I. We have Q |= ϕ iff Q0 |= ϕ, since
Q |= Q0 , every model of Q0 expands to a model of Q, and sig (ϕ) ⊆ sig (Q0 ).
Observe that any formula ∃rj .X v Y , or X v ∃rj .Y from a “copy” of T2 (s̄/r̄)
obtained after this substitution, is a formula of T20 . Therefore, T1 ∪ T20 ⊆ Q0 and
the set difference D = Q0 \(T1 ∪T20 ) consists of formulas of the form ∃sj .X v Y , or
X v ∃sj .Y , with X, Y ∈ sig (T20 ), j ∈ J, such that there exists a corresponding
axiom ∃rj .X v Y , or X v ∃rj .Y in T20 . We now prove by induction that for an
inference of a (normal) formula ϕ from Q0 , with sig (ϕ) ⊆ sig (T1 ∪ T20 ), there
is a corresponding inference of ϕ from T1 ∪ T20 . Induction step (the base case
is trivial): if ϕ is obtained by AndR or T rans, then by the form of formulas
in D, the axiom in the side condition can not belong to D and thus there is
nothing to prove. If the rule is Ex with an axiom ∃sj .Y 0 v Z ∈ D in the side
condition, then there are formulas of the form X v ∃sj .Y ∈ D and Y v Y 0 in
the premise. The formula Y v Y 0 is normal and Y, Y 0 ∈ sig (T20 ), thus by the
induction hypothesis, there exists an inference of Y v Y 0 from T1 ∪ T20 . Since
there are corresponding formulas ∃rj .Y 0 v Z and X v ∃rj .Y in T20 , we conclude
that there is an inference of ϕ from T1 ∪ T20 in which ϕ is obtained by Ex from
formulas X v ∃rj .Y , Y v Y 0 and X v ∃rj .Y is obtained by T rans from the
tautology X v X.