=Paper=
{{Paper
|id=None
|storemode=property
|title=Concept Definability and Interpolation in Enriched Models of EL-TBoxes
|pdfUrl=https://ceur-ws.org/Vol-1014/paper_66.pdf
|volume=Vol-1014
|dblpUrl=https://dblp.org/rec/conf/dlog/PonomaryovV13
}}
==Concept Definability and Interpolation in Enriched Models of EL-TBoxes==
Concept definability and interpolation in enriched models of EL–TBoxes? Denis Ponomaryov1 and Dmitry Vlasov2 1 Institute of Artificial Intelligence, University of Ulm, Germany; Institute of Informatics Systems, Novosibirsk, Russia 2 Institute of Mathematics, Novosibirsk, Russia ponom@iis.nsk.su, vlasov@academ.org Abstract. It is known that in Description Logics explicit concept defin- ability is directly related to concept interpolation. The problem to decide whether a concept is definable under a TBox wrt a signature usually reduces to entailment in the underlying logic. If an explicit definition ex- ists, then it can be found as a concept interpolant for a concept inclusion entailed by an appropriately chosen TBox. In fact, it can be extracted from a corresponding proof of the concept inclusion. We describe a graph structure called enriched model that represents proofs in normalized EL- TBoxes and show that, built once for a normalization of a given TBox T , it can be used for deciding the existence or direct computation of ex- plicit definitions of concepts under arbitrary subsets of axioms of T and wrt different subsignatures of T . Solving this computational problem has applications in collaborative ontology engineering and is an important part of the recently proposed algorithms for ontology decomposition. 1 Introduction and Motivation In the paper, we consider the Description Logic EL with the standard set of constructors for concepts, i.e. the conjunction, existential restriction, and the distinguished concept symbol >. Consider a scenario in which several experts are working on the same ontology T and each expert has a vocabulary (a set of concept and role names) she is using for defining complex concepts in the ontology. Vocabularies need not be distinct between experts. If at some point an expert encounters an axiom in the ontology and a concept which is built using foreign vocabularies, she wants to figure out whether it still can be defined in her own vocabulary. Or at least, whether the encountered concept can be partially reformulated to use less symbols from foreign vocabularies (and more symbols from her own one). If the answer is “yes”, then the expert has an option to see the actual reformulation of the concept and can replace it with the obtained reformulation (if she is allowed to do so under a collaborative ontology change policy). To illustrate this idea, consider the following simple example. Consider the TBox T = {ϕ, A v B, D v A}, where ϕ = A u B v D. Suppose an expert is interested in a reformulation of the concept A u B. Independently of what the expert’s vocabulary is, this concept can be reformulated as A, since T |= A u B ≡ A, and it can be replaced with A giving a TBox equivalent to T . Now assume that the expert’s vocabulary is {D}. We have T |= AuB ≡ D, but if ? The full paper is available at http://persons.iis.nsk.su/en/person/ponom/papers we replace the concept AuB with D, we obtain the TBox which is not equivalent to the initial one. In other words, even though the required reformulation of the concept A u B exists, it can not be used instead of A u B without further modifications of the TBox (e.g. adding the inclusion A v D as an additional axiom to T ). The reason is that T \{ϕ} 6|= AuB ≡ D, i.e. the axiom ϕ is essential for proving this equivalence. Thus, the given example demonstrates the need to check for definability of concepts under certain subsets of a TBox. Now consider the TBox S = {ϕ, A v D, ∃r.D v ∃r.A}, where ϕ = ∃r.A v B, and assume that the expert’s vocabulary is {D}. We have S \ {ϕ} |= ∃r.A ≡ ∃r.D, i.e. the concept ∃r.A can be partially reformulated as ∃r.D. Note that the subsignature which can not be changed under any reformulation wrt the expert’s vocabulary (here, the single symbol r) may not be known in advance and has to be guessed. It is observed in the results of [6] however, that the mentioned problem of partial concept reformulation is not harder than entailment in the underlying logic. Given a TBox T , a concept C is said to be explicitly definable under T wrt a signature ∆ if T |= C ≡ E, where E is a concept over ∆. The problem to decide whether C is equivalent in T to a concept containing less symbols from sig (C) \ ∆ (thus, possibly having more ∆-symbols than C does) is called the problem of partial concept reformulation (under T wrt ∆). Explicit definability was studied in Description Logics in the context of query rewriting (Sect. 3 in [8]), acyclic representation of TBoxes [2] and is intimately related to modularity properties of ontologies via the notion of uniform interpolation. It is known that the Description Logic EL enjoys the concept interpolation property in the following sense (cf. Definition 3.1 in [3]): if the signatures of some TBoxes T1 , T2 intersect by a set ∆, as well as signatures of some concepts C1 , C2 , then T1 ∪T2 |= C1 v C2 yields the existence of a concept D in the signature ∆ such that T1 ∪ T2 |= {C1 v D, D v C2 } holds. It follows from this property that a concept C is explicitly definable under a TBox T wrt a subsignature ∆ ⊆ sig (T ) iff T ∪ T ∗ |= C ∗ v C, where sig (T ) ∩ sig (T ∗ ) = ∆, sig (C ∗ ) ∩ sig (C) ⊆ ∆, and T ∗ , C ∗ are “copies” obtained from T and C, respectively, by replacing all non- ∆-symbols simultaneously with “fresh” ones, not present in sig (C) ∪ sig (T ). Thus, deciding concept definability is polynomially reduced to entailment in EL, although the explicit definition itself can be of exponential length in the size of T and C (cf. Example 28 in [6]). Note that the above construction of copies for T and C assumes that the signature ∆ is known and it allows to check whether C is equivalent to a concept having symbols only from ∆ (not necessarily all of them). In fact, the same idea can be used to check whether C can be partially reformulated wrt ∆. For instance, the concept ∃r.A from the TBox S above allows for the partial reformulation wrt ∆ = {D} and the entailment (T \ {ϕ}) ∪ (T ∗ \ {ϕ∗ }) |= ∃r.A∗ v ∃r.A certifies this, where ϕ∗ = ∃r.A∗ v B and T ∗ and ∃r.A∗ are copies constructed as described above, with sig (T )∩sig (T ∗ ) = ∆∪{r}. Observe that we had to extend the initial signature ∆ with the symbol r. In fact, if we extend an initial signature ∆ with the set sig (C)\{x} for a symbol x ∈ sig (C)\∆, then the entailment as above holds iff x can be removed in some partial reformulation of the concept C wrt ∆. Thus, in order to check for partial reformulations of concepts under a given TBox T wrt different signatures ∆ ⊆ sig (T ), one needs to have an efficient procedure for checking the entailment (T ∪ f ∗ (T )) \ S |= f ∗ (C) v C (∗) for subsets S ⊆ T ∪ f (T ), different concepts C, and different functions f ∗ ∗ renaming signature symbols into fresh ones injectively and being the identity function on ∆. If the entailment holds, then the corresponding explicit definition of the concept C can be extracted from the proof of the inclusion f ∗ (C) v C from the TBox above. We will consider the case of this entailment problem, where C is a concept from the LHS or RHS of an axiom of T , although the constructions proposed for solving the problem can be easily extended to the case of arbitrary concepts. Since the concept C, the set S, and function f ∗ are the changing parameters in the problem (∗), but the TBox T and the initial signature ∆ are fixed, it makes sense to use a precomputed extended representation of T to be able to solve numerous instances of this problem efficiently. The aim of the paper is to present the idea that the standard (canonical) model of an EL-TBox with the closure operators S and R (in the sense of [1]) can be extended to become a structure (called enriched model ) which, built once for a TBox T , can be used as a precomputed template for solving problem (∗) with the changing input parameters (C, S, and f ∗ ). An enriched model is an extended representation of a TBox used in the implementation of the decomposi- tion algorithm for EL–ontologies [6] which relies on solving numerous instances of the partial concept reformulation problem. Our presentation is divided in three parts. In Section 3, we introduce the notion of enriched model and show that it can be used for checking the entailment of concept inclusions from different subsets of a TBox. In Section 4, we describe the process of computing explicit definitions of concepts in enriched models, which is conceptually similar to the idea of computing concept interpolants from tableaux (Sect. 4 in [8] and Sect. 3 in [3]). These results constitute the solution for entailment problem (∗) wrt the changing parameters which is described in Section 5. We note that the main idea behind the enriched models is the observation applicable to any system with non-trivial search problems: do a preprocessing of information in order to handle numerous standard calls to it faster. 2 Basic Notations Let Nc and Nr stand, as usual, for the countably infinite sets of concept and role names in the alphabet of the language EL. Denote N> c = Nc ∪ {>} and let ./ be the symbol for either v or ≡. A formula is an expression of the form C1 ./ C2 , where C1 , C2 are concepts, and a TBox is a finite set of formulas which does not contain C1 ≡ C2 and C1 v C2 simultaneously for some concepts C1 , C2 . If ∆ is a signature, then T erm(∆) will stand for the set of all concepts in the signature ∆ and if ∆ ⊆ Nc , then T aut(∆) will denote the set of tautologies of the form > v >, A v >, A v A, for each A ∈ ∆. Let T be a TBox. The notations sub (T ), Nc T , and Nr T will stand for the sets of all concepts, concept names, and role names respectively, occurring in axioms of T , while N> c (T ) will be the shorthand for Nc T ∪ {>}. The notation sig (T ) will mean the signature of T and will be slightly abused to denote the signature of a formula or a concept. If C ∈ sub (T ) is a concept and A ∈ Nc , then T (C/A) will stand for the TBox obtained from T by substituting each occurrence of C in the LHS or RHS of axioms of T with the concept name A. We call a Tbox T 0 conservative extension of T if sig (T ) ⊆ sig (T 0 ) and for any formula ϕ, with sig (ϕ) ⊆ sig (T ), we have T |= ϕ iff T 0 |= ϕ. 3 Proofs in Enriched Models of EL-TBoxes Informally, an enriched model is a graph structure for representing proofs in a deductive system for normal TBoxes. It is built iteratively in accordance with the inference rules and normal axioms obtained from an input TBox by normalization. In addition, an enriched model stores a map from normal to the original axioms to allow for checking the entailment of concept inclusions from different subsets of the input TBox. To illustrate this idea briefly, let us consider an example of eliminating a redundant axiom from a TBox. Example 1 Consider the TBox S = {ϕ, A v B, A v D}, where ϕ = A u B v D and A, B, D ∈ Nc . Note that either the first or the last axiom can be eliminated from S giving an equivalent TBox. Let T = {ϕ0 , A v B, A v D, AuB v X, X v A, X v B} be the conservative extension of S, where ϕ0 = X v D and X is the “pseudonym” for the concept A u B. Note that S \ {ϕ} |= ϕ is equivalent to T \ {ϕ0 } |= ϕ0 . Consider the following graph, where every vertex has one or two labels depicted by the containment relation ∈ f (.) and an arrow marked ρ with δ or ρ. In the figure, if a vertex is marked with labels f (X) and 7→ Y , then X, Y ∈ Nc T and T |= Y v X. The vertices marked with δ correspond to axioms of T witnessing these entailments. Essentially, the graph represents possible proofs of primitive concept inclusions inside T , note the two disconnected components of the graph depicting independent proofs: Figure 1. Example of enriched model: proofs as paths in the graph The right-most vertex of the second component corresponds to the entailed con- cept inclusion ϕ0 = X v D. Moreover, there is a subgraph which witnesses this entailment and does not contain vertices labelled by ϕ0 . This corresponds to the fact that ϕ0 is entailed by T \{ϕ0 }and thus, ϕ is redundant in S. The graph considered in the illustration generally corresponds to what we call further enriched model of a TBox. The rest of the section is devoted to a formal justification of the idea of checking for entailment of concept inclusions from subsets of a TBox by means of such structures. Definition 1 (Primitivization). A formula is called primitive if it is of the form A v B for A, B ∈ N> c . A TBox Tp is called primitivization of a Tbox T if every axiom ϕ of T is equivalent in Tp to a primitive axiom ϕp ∈ Tp , i.e. if (Tp \ {ϕp }) ∪ {ϕ} |= ϕp and Tp |= ϕ. An extended primitivization of a TBox T is a pair hTp , τ 0 i, where Tp is a primitivization of T and τ 0 : Tp → T ∪ {null} is a map, which are both obtained after the application of the following rewriting rule to each axiom of T : • if C1 ./ C2 ∈ T then T := T (Ci /Ai )∪{Ai ≡ Ci }, for each Ci 6∈ N> c , i = 1, 2, where Ai is a fresh concept name, • τ 0 (Ai ≡ Ci ) = null and if ϕp is the image of C1 ./ C2 under the applied substitutions, then τ 0 (ϕp ) = C1 ./ C2 . Definition 2 (Normalization). A formula is normal if it has one of the fol- lowing forms: A v ∃r.B, ∃r.B v A, A1 u ... u An v B, for n > 1, with all A, B, Ai , i = 1, .., n, being concept names. ATBox is normal if each of its axiom is. A TBox T n is called normalization of a TBox T if it is a conservative ex- tension of T and each axiom of T n is normal. Let Tep = hTp , τ 0 i be an extended primitivization of a TBox T . An extended normalization of Tep is a pair hT n , τ i, where T n is a normalization of Tp and τ : T n → T is a map such that: • if A v B ∈ Tp , A, B ∈ N> c , then A v B ∈ T nm and τ (A v B) = τ 0 (A v B); • if A ≡ B ∈ Tp , A, B ∈ N> c , then {A v B, B v A} ∈ T nm and τ (A v B) = τ (B v A) = τ 0 (A ≡ B); • τ (ϕ) = null for the rest of the axioms ϕ ∈ T nm . It follows from the standard decision procedure for EL (see e.g. [1]) that the following set of inference rules (under the notion of inference defined below) is complete wrt the semantic entailment of primitive formulas A v B from normal TBoxes (in the rules, A, B, A1 , ..., An , n > 2, stand for symbols from N> c and C is either a symbol from N> c or an expression ∃r.X, with X ∈ Nc ): > AvB (Ax) (AxTop) (Trans) BvC∈T AvA Av> AvC A v A1 , ..., A v An (AndR) A1 u ... u An v B ∈ T AvB A v ∃r.A1 , A1 v A2 (Ex) ∃r.A2 v B ∈ T AvB Figure 2. A deductive system for entailment of normal formulas If T is a normal TBox and ϕ is a normal formula, then an inference (a proof ) of ϕ from T is a tree,where ϕ is the root and every node is either a tautology A v A, A v >, for A ∈ N> c , or a formula obtained from the child nodes by one of the rules T rans, AndR, or Ex with the formulas in the side conditions being axioms of T . If additionally, these formulas are from a subset T \ S, for some S ⊆ T , then we say that the inference omits S. We now introduce the notion of enriched model of a normal TBox. The defi- nition consists of two parts: the description of an enriched model as an algebraic structure and the description of the process of its construction. Let T be a normal TBox. An enriched model of T is a structure M∗ = ((Γ + , Γ − , E), δ, ρ, f ), where (Γ + , Γ − , E) is a directed bipartite graph with the set E of edges and the set Γ + ∪ Γ − of vertices augmented with maps: – δ : Γ − → T ∪ T aut(Nc T ), – ρ : Γ + → N> > > c (T ) ∪ (Nc (T ) × Nc (T )), > T Γ+ – f : Nc (T ) ∪ Nr → 2 The process of construction of an enriched model for a normal TBox is given in the Appendix. Definition 3 (Submodel of an enriched model). Let M∗ = (Γ = (Γ + , Γ − , E), δ, ρ, f ) be an enriched model. The structure M∗s = (Γs = (Γs+ , Γs− , Es ), δs , ρs , fs ) is called a submodel of M∗ if Γs is a subgraph of Γ , each δs , ρs , fs is the restriction of the corresponding map from M∗ onto the vertices of Γs and the following holds: – for every v − ∈ Γs− , all vertices v + ∈ Γ + with (v + , v − ) ∈ E are in Γs+ and connected with v − in Γs ; – for every v + ∈ Γs+ , there exists v − ∈ Γs− such that (v − , v + ) ∈ Es ; if such v − is unique, then the submodel is called deterministic. In general, the bipartite graph (Γ + , Γ − , E) may not have a tree-shape due to cycles caused e.g. by equalities between concept names (recall Example 1). How- ever, there always exist submodels which are trees and these submodels are in one-to-one correspondence with proofs in the deductive system of Figure 2. Theorem 1 (Entailment from subset of TBox in enriched model). Let T be a TBox and T ep = hTp , τ 0 i be an extended primitivization of T . Let hT norm , τ i be an extended normalization of T ep and M∗ be an enriched model of T norm . Let C1 , C2 be concepts appearing on the LHS or RHS of an axiom of T and for i = 1, 2, let Ai ∈ sig (Tp ) be a concept name which is either Ci (if Ci ∈ Nc ), or such that Ai ∈ sig (Tp ) \ sig (T ) and Ai ≡ Ci ∈ Tp . Then, for a subset S ⊆ T of axioms, it holds T \ S |= C1 v C2 iff there is a deterministic submodel Ms = (Γs = (Γs+ , Γs− , Es ), δs , ρs , fs ) of M∗ and a vertex v + ∈ Γs+ such that: – v + ∈ fs (A2 ) and ρs (v + ) = A1 , – for any non-leaf vertex v − ∈ Γs− , τ (δs (v − )) 6∈ S. Therefore, in order to check the entailment T \ S |= C1 v C2 , it suffices to find first a plus-vertex v + witnessing the entailment T norm |= A1 v A2 in the enriched model M∗ (the vertex must satisfy the conditions v + ∈ fs (A2 ) and ρs (v + ) = A1 ) and if it exists, find an inference of A1 v A2 , which does not refer via the map τ ◦δ to formulas of S, as a deterministic submodel of M∗ containing v + . The procedure of finding this submodel reduces to a traversal of the graph (Γ + , Γ − , E) of M∗ starting from the vertex v + . The procedure is conceptually similar to the one presented in the context of computing concept interpolants in Section 4 and is therefore skipped. 4 Computing Explicit Definitions with Enriched Models Let T1 be a TBox, C1 be a concept from the LHS or RHS of an axiom of T , S1 ⊆ T1 be a subset of axioms, and ∆ be a signature. Suppose we want to check whether C1 is ∆–definable under T1 \ S1 . Let T2 = f ∗ (T1 ) and S2 = f ∗ (S1 ) ⊆ T2 be the corresponding “copy”-TBoxes (with sig (T1 )∩sig (T2 ) = ∆) and let C2 = f ∗ (C1 ) be the corresponding “copy”-concept obtained by a renaming function f ∗ . We may assume that f ∗ (T1 \S1 ) = T2 \S2 and thus, by concept interpolation, definability is equivalent to the condition (T1 \ S1 ) ∪ (T2 \ S2 ) |= C1 v C2 . The observations from Section 3 imply that it suffices to consider the case when the TBox T1 is normal and Ci is a concept name from sig (Ti ) \ ∆, for i = 1, 2. Then (T1 \ S1 ) ∪ (T2 \ S2 ) |= C1 v C2 holds iff there is a submodel Ms in an enriched model M∗ = ((Γ + , Γ − , E), δ, ρ, f ) of T1 ∪ T2 and a vertex v + ∈ Γ + certifying this entailment. Let us show how a concept interpolant for the inclusion C1 v C2 (i.e. an explicit ∆-definition of C1 under T \ S) can be computed while searching for this submodel in M∗ . The submodel Ms must be in direct correspondence with a proof of C1 v C2 from (T1 \S1 )∪(T2 \S2 ). We have sig (T1 )∩sig (T2 ) = ∆ and Ci ∈ sig (Ti ) \ ∆, for i = 1, 2, so it suffices to understand how a concept interpolant between C1 and C2 can be extracted from a proof of C1 v C2 in this situation, which is the purpose of Theorem 3 given in the Appendix. The property of concept interpolation shown in the theorem is well-known for EL, but the proof is given to justify soundness of the procedure for interpolant computation described at the end of this section. Let us consider an example illustrating the main idea. Example 2 Consider the TBox S = {A u B v C, A ≡ D}, with A, B, C, D ∈ Nc , the corresponding primitivization Tp = {X v C, A ≡ D, X ≡ A u B}, and a normalization T of Tp . Denote ∆ = {D, B} and let T ∗ be the “copy” of T under the injective renaming of all non-∆-symbols into fresh ones such that X ∗ and C ∗ are the “copies” of X and C in T ∗ , respectively. Then the entailment S \ {A u B v C} |= A u B ≡ D u B is equivalent to (T \ {X v C}) ∪ (T ∗ \ {X ∗ v C ∗ }) |= X ∗ v X and there is a vertex and a subgraph in the enriched model of T ∪ T ∗ certifying this entailment (note the right-most vertex): Figure 3. Constructing concept interpolant from a subgraph in enriched model Consider the vertex labelled with f (A) and X ∗ in the graph, denote it as v. It corresponds to the entailed inclusion X ∗ v A and is special, because it contains non-∆-symbols from T ∗ and T on the LHS and RHS, respectively, while all the descending vertices contain non-∆-symbols only from T ∗ . By the concept interpolation property of EL, there must be an interpolant ∆-concept for the inclusion X ∗ v A (for the vertex v). In fact, this concept can be obtained from the LHS of the axiom labeling the child vertex of v, i.e. D is the required interpolant for v. To obtain the interpolant for the right-most vertex (corresponding to the inclusion X ∗ v X of interest) it suffices to take the conjunction of interpolants for the child vertices preceding the one labeled with A u B v X. We have B ∈ ∆ and the concept D u B is the required interpolant. The example shows that there are special intermediate derived concept in- clusions in the proof for which an interpolant can be easily computed. It is either the LHS of the side condition of the rule under which the concept inclusion is obtained, or the concept from the RHS of one of the premises of the rule. In par- ticular, such special concept inclusions are the very first formulas in branches of the proof which are derived using axioms of both, a TBox T and its “copy”. For the rest of the derived concept inclusions, the corresponding interpolants can be obtained by combining interpolants for preceding concept inclusions under conjunction and existential restriction. These are the main ideas presented in the proof of Theorem 3 and the procedure of interpolant computation. Essentially, the proof demonstrates that in case an irregular formula ϕ is de- rived from T1 ∪T2 by an inference rule R, then either there are only regular formu- las in the premise of R and the interpolant for ϕ can be computed immediately, or there are some irregular formulas in the premise for which the correspond- ing interpolants must be computed first in order to obtain an interpolant for ϕ. Practically, this means a recursive procedure of computing interpolants while traversing the proof of formula ϕ (i.e. a submodel starting with the plus-vertex corresponding to ϕ). Every proof starts from axioms which are regular formulas, so the first irregular formula in a branch of the proof is obtained after application of some rule R with regular premises. As noted in the proof of Theorem 3 (and Example 2), an interpolant in this case can be computed immediately, since it is either the concept from the LHS of the formula in the side condition of R, or the concept ∃r.A1 from the RHS of the premise, if R = Ex. Thus, Theorem 3 (together with Theorem 1) evidences soundness of the recursive procedure given below, while termination follows from the finiteness of the graph in an enriched model and the usage of flags for visited vertices. The procedure is defined in words instead of pseudo-code for ease of exposition. Procedure for computing an explicit definition as concept interpolant Input: – an enriched model M∗ = (Γ = (Γ + , Γ − , E), δ, ρ, f ) of the union of normal TBoxes T1 ∪ T2 , with sig (T1 ) ∩ sig (T2 ) = ∆, for a signature ∆; – a concept inclusion A1 v A2 , with Ai ∈ (sig (Ti ) ∩ Nc ) \ ∆, for i = 1, 2. Output: Empty iff T1 ∪ T2 6|= A1 v A2 ; otherwise, a concept interpolant D such that sig (D) ⊆ ∆ and T1 ∪ T2 |= {A1 v D, D v A2 }. The procedure operates on the graph Γ from M∗ and incorporates a subroo- tine of graph traversal. 1. If there is no vertex v + ∈ Γ + such that v + ∈ f (A2 ) and ρ(v + ) = A1 , then the procedure terminates with the empty output. Otherwise, initiate graph traversal starting from the vertex v + as follows. 2. Initialize two maps vis : Γ + ∪Γ − → {0, 1} and int : Γ + ∪Γ − → T erm(∆)∪ {null} for all vertices v of Γ as: vis(v) = 0, int(v) = null. The map vis will be used to mark visited vertices and int will mark vertices with their corresponding interpolants. The value null will mean that there is no interpolant and vis(v) = 0 will imply int(v) = null. The task is compute the value of int(v + ), which will not be equal to null, however it can be the case for other vertices in Γ (which are related to proofs inside TBoxes T1 , T2 , but not from their union, i.e. to the proofs consisting of regular formulas). 3. The recursive subrootine of graph traversal and interpolant computation is as follows. Let v be the currently visited vertex from Γ . 3.1. Let v ∈ Γ + . Then we visit an arbitrary vertex w such that (w, v) ∈ E, i.e. a child minus-vertex: • If vis(w) = 1, we set int(v) = int(w); • If vis(w) = 0, then we initiate recursively graph traversal starting from the vertex w. When the traversal is complete, we set int(v) = int(w). After these steps, we set vis(v) = 1, i.e. visiting the vertex v is finished and we exit the recursion to one level up. 3.2. Let v ∈ Γ − . Then we immediately set vis(v) = 1. Let δ(v) = F1 v F2 . If for some i = 1, 2, we have sig (Fi ) ⊆ ∆ , then we set int(v) = Fi and exit the recursion to one level up. Otherwise, initiate graph traversals recursively starting from every irregular (plus-vertex) w, with (w, v) ∈ E. If there are no such vertices, then exit the recursion to one level up. Otherwise, when the traversal is complete for all such vertices w, we have int(w) 6= null and proceed as follows. 3.3. Consider different cases for the formula δ(v). • If δ(v) = A v B, or δ(v) = A v ∃r.B, for some concept names A, B, then there is a single child plus-vertex w for v and we set int(v) = int(w). • If δ(v) = A1 u...uAn v B, n > 1, then let w1 , .., wn be the child plus-vertices for v, with wi ∈ f (Ai ) and A = ρ(wi ), for i = 1, ..n and a concept name A. Take the maximal subset I ⊆ {1, .., n} such that for every i ∈ I, the vertex wi is irregular and set int(v) = i∈I int(wi ) u j ∈{1,..,n}\I Aj . F F • If δ(v) = ∃r.A2 v B, then there is a child plus-vertex w such that w ∈ f (r) and ρ(v) = hA, A1 i for some A and A1 . If w is irregular, then set int(v) = int(w). Otherwise there is an irregular child vertex w, with w ∈ f (A2 ), ρ(w) = A1 , and we set int(v) = ∃r.int(w). This finishes visiting the vertex v ∈ Γ − . The definition of the procedure is complete. Let v be the minus-vertex from point 3.3 of the procedure and w be one of its child plus-vertices with an interpolant int(w) = D for some concept D. From the consideration of the formula δ(v), one can see that either the concept int(v) is the same as int(w), or can be of length increased by a constant (when δ(v) = ∃r.A2 v B), or in the worst case, int(v) can be n-times longer than int(w) (if δ(v) = A1 u...uAn v B, n > 1, and each corresponding child plus-vertex wi is irregular). This means that the size of the interpolant for the inclusion A1 v A2 is at worst O(nm ), where m is the number of axioms in the normal TBox T1 ∪ T2 and n is the maximal number of conjuncts in an axiom of T1 ∪ T2 . Note that due to arbitrary selection of the minus-vertex in point 3.1, the procedure computes an arbitrary chosen interpolant out of existing ones for the concept inclusion A1 v A2 . By Theorem 1, it should be clear that point 3.1 can be easily modified to decide concept definability wrt a subset S of the TBox: it suffices to consider only those child minus-vertices which are labeled with formulas corresponding to axioms of S. 5 Partial Concept Reformulation with Enriched Models The problem of partial reformulation of concepts under a TBox T wrt a sig- nature ∆ considered in the introduction implies several instances of the concept definability problem for different extensions of ∆. At worst, the number of tests for definability to be performed is equal to the number of signature symbols in a concept C of interest. Moreover, in the decomposition algorithm for EL [6], each non-primitive concept being the LHS of an axiom of T has to be tested for partial reformulation. Each instance of the definability problem wrt an extension ∆0 ⊇ ∆ requires checking entailment from the union of TBoxes (T \S)∪(T 0 \S 0 ), where T 0 is a “copy” of T , with all non-∆0 -symbols injectively renamed into fresh symbols, and S, S 0 are subsets of axioms of T and T 0 , respectively. Since the concept C and the sets (∆0 \ ∆), S, S 0 are the changing parameters in the problem of partial reformulation, but the TBox T and the initial signature ∆ are fixed, it makes sense to use a prebuilt extended representation of T to be able to solve numerous instances of the concept definability problem efficiently. In the following we argue that an enriched model precomputed for a given TBox T and a signature ∆ can be used as a template for solving different instances of the concept definability problem wrt the changing parameters. We assume w.l.o.g. that the input TBox T is in the enriched normal form. Given T and ∆, first an enriched model M for T is built and then its “copy” M∗ is generated wrt an injective renaming of all non-∆-symbols from T . The pair M, M∗ is then kept as a precomputed template. As soon as an extension ∆0 of the signature ∆ is given, the graphs of these two models are expanded (with the iterative process of model construction described in the Appendix) wrt a set of formulas generated from T and the set ∆0 \ ∆. The idea of constructing this set of formulas is described in Example 3 and Theorem 2. In the example below, we assume that we are given TBoxes T and T ∗ sharing a signature ∆ (initially taken to be empty for simplicity) and then ∆ is expanded as the result of substituting (or “gluing”) some symbols of T ∗ with that of T . Example 3 Consider normal TBoxes T = {A v ∃r1 .A1 } and T ∗ = {∃s1 .B1 v B}, we have ∆ = sig (T ) ∩ sig (T ∗ ) = ∅. Let T 0 = {∃r1 .A1 v B} be the TBox obtained from T ∗ by substituting B1 with A1 and s1 with r1 ; we have ∆0 = sig (T ) ∩ sig (T 0 ) = {A1 , r1 } ⊇ ∆. Clearly, T ∩ T 0 |= A v B. Now consider the TBox Q = T ∪T ∗ ∪{A1 ≡ B1 }∪T ∗ (s/r), where T ∗ (s/r) = {∃r1 .B1 v B}, i.e. this TBox consists of the axiom of T ∗ with the role s1 substituted with r1 . Then Q |= A v B and in fact, it will be shown in Theorem 2 that T ∪ T 0 and the theory Q constructed for the union T ∪ T ∗ in the way above entail the same normal formulas in signature sig (T ∪ T ∗ ). Therefore, an enriched model for checking entailment from the union of theories T ∪ T 0 (sharing the signature ∆0 ⊇ ∆) can be obtained by expanding prebuilt models for T and T ∗ wrt the additional formulas {A1 ≡ B1 } ∪ T ∗ (s/r). We note that a much more general model-theoretic result implying Theorem 2 can be proved, but here we restrict ourselves only to the case important in the context of this paper. Theorem 2 (Formulas imitate substitutions of signature symbols). Let T1 and T2 be normal TBoxes, ∆ = sig (T1 ) ∩ sig (T2 ), and let σ1 = {Ai }i∈I ∪ {rj }j∈J ⊆ sig (T1 ) \ ∆, σ2 = {Bi }i∈I ∪ {sj }j∈J ⊆ sig (T2 ) \ ∆ be signatures, where all Ai , Bi ∈ Nc and rj , sj ∈ Nr . Let T20 be the result of substituting every Bi with Ai and every sj with rj in the axioms of T2 . Consider the TBox Q = T1 ∪ T2 ∪ {Ai ≡ Bi }i∈I ∪ T2 (s̄/r̄), where T2 (s̄/r̄) is the set of formulas of the form ∃rj .X v Y , or X v ∃rj .Y , j ∈ J such that ∃sj .X v Y , or X v ∃sj .Y , respectively, is an axiom of T2 . Then for any normal formula ϕ in signature sig (T1 ∪ T20 ) we have T1 ∪ T20 |= ϕ iff Q |= ϕ. It should be clear that precomputing the models M, M∗ does make sense, when there is a large number of queries for partial concept reformulation to be executed and answering all these queries would potentially involve all the axioms of the TBox. In paticular, this is the case for the decomposition algorithm from [6]. On the other hand, an expansion of M, M∗ wrt the auxiliary axioms from Theorem 2 may be computed only to the extent needed for checking the entailment C 0 v C for a concept C of interest (and its copy C 0 after a renaming wrt ∆0 ⊇ ∆). 6 Discussion In the paper, we did not consider questions of efficient implementation of enriched models and the algorithm of their construction. The iterative procedure based on rules from Section 3 (including the initialization step) is mentioned to simplify the exposition and is by no means proposed to be used in applications; this topic is left out of the scope of this paper to separate the main concepts from their implementation details. We note that the technique of enriched models can be easily adopted for solving the problem of concept reformulation by using the inference systems from the consequence-driven procedures for EL (see e.g. [4] or [5]). Moreover, we argue that the basic ideas described in this paper can be implemented in the context of any reasoning system for EL which allows for proof tracing and incremental reasoning (see a short overview in Section 2 of [5]). The first feature is necessary for constructing concept interpolants from proofs, while the second one allows for using a prebuilt model of a TBox and computing its local changes after the TBox is updated. For assume we are given a TBox T , a signature ∆ ⊆ T , and we would like to be able to solve different instances of the partial concept reformulation problem under T wrt ∆. Let T ∗ be a copy of T , with all non-∆-symbols injectively renamed, and let M be a prebuilt model for T ∪ T ∗ . Let A∗ and S ∗ be the corresponding copies of A and S in T ∗ . Now assume a signature ∆0 ⊇ ∆ and a set S ⊆ T are given; take here ∆0 \ ∆ = {A}, A ∈ Nc , for simplicity. Then building a model for the TBox (T \ S) ∪ (T 0 \ S 0 ) (where T 0 and S 0 are copies of T and S under an injective renaming of all non-∆0 -symbols) is equivalent to incrementally updating M after removing the axioms of S ∪ S ∗ from T ∪ T ∗ and adding the axiom A ≡ A∗ . Since an enriched model of a TBox contains proofs of primitive concept in- clusions, it can be directly used for computing concept interpolants and hence, explicit definitions of concepts. For every proof of a concept inclusion C ∗ v C from T ∪ T ∗ (where C ∗ and T ∗ are appropriate copies of C and T ), there is a corresponding explicit definition of C. It follows from the example on the non-succinctness of uniform interpolants from Section 7 of [7] that, given a EL- TBox T and a primitive concept inclusion ϕ, there can be exponentially many proofs of ϕ from T and this holds already for the case, when sig (T ) ⊆ Nc . Consider the signature ∆ = {Bij | Bij ∈ Nc , 1 6 i, j 6 n} and the TBox T = {A v Bij | 1 6 i, j 6 n} ∪ {Bij v Bi | 1 6 i, j 6 n} ∪ {B1 u ... u Bn v A}. Then the concept A has the longest explicit ∆-definition under T which is F 16i,j6n Bij , as well as exponentially many shortest ∆-definitions. The pro- cedure of compuiting concept interpolants given in Section 4 implies traversing only a single proof out of the available ones and the choice of the proof is not regulated by any strategies. We believe that the development of strategies of concept reformulation deserves attention not only in the Description Logic EL, but also in more expressive languages. Acknowledgments. The first author was supported by the German Research Founda- tion within the Transregional Collaborative Research Centre SFB/TRR 62 “Companion- Technology for Cognitive Technical Systems”, the Russian Academy of Sciences (Grant No. 15/10), and the Siberian Division of the Russian Academy of Sciences (Integration Project No. 3). The second author was supported by the Russian Foundation for Basic Research (Grant No. 12-01-00460-a). References 1. Baader F., Brand S., and Lutz C.: Pushing the EL envelope. In Proc. Nineteenth International Joint Conference on Artificial Intelligence (IJCAI 2005). 2. ten Cate B., Conradie W., Marx M., Venema Y.: Definitorially complete Descrip- tion Logics. In Proc. Tenth International Conference on Principles of Knowledge Representation and Reasoning (KR 2006). 3. ten Cate B., Franconi E., Seylan I.: Beth definability in expressive Description Log- ics. In Proc. Twenty-second International Joint Conference on Artificial Intelligence (IJCAI 2011). 4. Kazakov Y., Krötzsch M., and Simancik F. Concurrent classification of EL ontolo- gies. In: Proc. Tenth International Semantic Web Conference (ISWC 2011). 5. Kazakov Y. and Klinov P. Incremental Reasoning in OWL EL without Book- keeping. Technical report, University of Ulm, May 2013. Available from: http://code.google.com/p/elk-reasoner/ 6. Konev B., Lutz C., Ponomaryov D., Wolter F.: Decomposing Description Logic on- tologies. In: Proc. Twelfth International Conference on the Principles of Knowledge Representation and Reasoning (KR 2010). 7. Konev B., Walther D., and Wolter F. The logical difference problem for description logic terminologies. In: Proc. Fourth International Joint Conference on Automated Reasoning (IJCAR 2008). 8. Seylan I., Franconi E., de Bruijn J.: Effective query rewriting with ontologies over Dboxes. In: Proc. Twenty-first International Joint Conference on Artificial Intelli- gence (IJCAI 2009). Appendix 1. Definition of enriched model The definition consists of two parts: the description of an enriched model as an algebraic structure and the description of the process of its construction. For convenience, we repeat here the algebraic part of the definition. Let T be a normal TBox. An enriched model of T is a structure M∗ = ((Γ + , Γ − , E), δ, ρ, f ), where (Γ + , Γ − , E) is a directed bipartite graph with the set E of edges and the set Γ + ∪ Γ − of vertices augmented with maps: – δ : Γ − → T ∪ T aut(Nc T ), – ρ : Γ + → N> > > c (T ) ∪ (Nc (T ) × Nc (T )), T + – f : N>c (T ) ∪ Nr → 2 Γ such that the union of sets in the image of f is exactly Γ + and for all distinct T x, y ∈ N> c (T ) ∪ Nr it holds that f (x) ∩ f (y) = ∅. By using the last restriction, we will sometimes abuse the notation and write f −1 (v) to denote the preimage of the set V ⊆ Γ + with v ∈ V . Informally, the plus-vertices of the graph (i.e. the elements of Γ + ) will cor- respond to concept and role names and will be mapped by ρ to concept names and pairs of concept names, respectively. The minus-vertices will correspond to axioms of T and tautologies. An enriched model M∗ = ((Γ + , Γ − , E), δ, ρ, f ) of a normal TBox T is built iteratively in accordance with a set of graph expansion rules which correspond to the inference rules from Figure 2. For every A ∈ Nc T , the graph is first initialized with connected pairs of vertices hw1 , y1 i and hw2 , y2 i with w1 , w2 ∈ Γ − and y1 , y2 ∈ Γ + , such that y1 ∈ f (A), ρ(y1 ) = A, δ(w1 ) is a tautology of the form A v A and y2 ∈ f (>), ρ(y2 ) = A, and δ(w2 ) is A v >. Besides, a pair hw, yi is initialized, with y ∈ f (>), ρ(y) = >, and δ(w) = > v >. This is the initialization step for the bipartite graph and the mappings δ, ρ, f . To formulate the iterative construction process of M∗ , we present it as a set of graph expansion rules, which describe how the bipartite graph should be expanded (and the mappings updated) based on the previously built vertices and an axiom from T encountered at an iteration step. Each rule has the form (m > 2): •(U1 , V1 ), ..., •(Um−1 , Vm−1 ) ϕ •(Um , Vm ) T where for i = 1, .., m, Ui ∈ N>c (T ) ∪ Nr and Vi is either a symbol from N> c (T ) > or a pair of symbols from Nc (T ), and ϕ is an axiom of T . The rule is read as follows: – if there is an axiom ϕ in T – and for each (Ui , Vi ), i = 1, .., m − 1, there is a vertex xi ∈ Γ + such that xi ∈ f (Vi ) and ρ(xi ) = Ui – if there does not exist a vertex y ∈ Γ + such that y ∈ f (Vm ) and ρ(y) = Um , then initialize such vertex and – if there does not already exist a vertex w ∈ Γ − , with δ(w) = ϕ, such that hxi , wi ∈ E for all i = 1, .., m − 1 and hw, yi ∈ E, then initialize such vertex. The list of the graph expansion rules is presented in the figure below (n > 2 in the rule AndR• ). Each rule is augmented with an illustration of the constructed part of the graph with the above mentioned vertices xi , w, and y, i = 1, .., m − 1. •(A,B) (T rans•1 ) •(A,C) B v C •(A,B) (T rans•2 ) •(r,hA,Ci) B v ∃r.C •(A,A1 ),...,•(A,An ) (AndR• ) •(A,B) A1 u ... u An v B •(r,hA,A1 i), •(A1 ,A2 ) (Ex• ) •(A,B) ∃r.A2 v B Figure 3. Iterative construction of an enriched model: graph expansion rules The iterative procedure of model construction finishes when no rule results in an expansion of the graph, i.e. when no vertex or edge can be added. Definition 4 (Enriched model of a normal TBox). Let T be a normal TBox. The structure M∗ = ((Γ + , Γ − , E), δ, ρ, f ) built for T as above is called enriched model of T . 2. Proofs for Section 3 Let us formulate three auxiliary lemmas needed for the proof of Theorem 1. Note that in general, the bipartite graph (Γ + , Γ − , E) may not have a tree-shape due to cycles caused e.g. by equalities between concept names (recall Example 1). However, there always exist submodels which are trees and these submodels are in one-to-one correspondence with proofs in the deductive system of Figure 2. Lemma 1 (Deterministic tree submodels correspond to proofs) Let M∗ = (Γ = (Γ + , Γ − , E), δ, ρ, f ) be an enriched model of a TBox T . For any ver- tex v + ∈ Γ + , there exists a deterministic submodel M∗s = (Γs = (Γs+ , Γs− , Es ), δs , ρs , fs ), where Γs is a tree with the root v + and each leaf is a vertex v − ∈ Γ − with δ(v − ) ∈ T aut(Nc T ). Let S = T \ {δ(v − ) | v − ∈ Γs− }. The submodel M∗s corresponds to an inference from T omitting S of the concept inclusion ρ(v + ) v f −1 (v + ) (in case ρ(v + ) ∈ Nc ) and of the inclusion A v ∃f −1 (v + ).B (in case ρ(v + ) = hA, Bi). Proof. The claim is proved by induction on the number of iteration steps in the construction process of M∗ for T which are needed to initialize the vertex v + . The base case is trivial by the initialization step for M∗ : in this case the required tree is just a two-element chain corresponding to an inference of a tautology from T aut(Nc T ). This inference trivially omits S. The induction step is simple as well, since at each iteration step of the graph expansion, by the induction hypothesis, for every plus-vertex in the premise of an expansion rule, there is a corresponding deterministic submodel, the part of the graph to be constructed has the form of a tree (Figure 3) and the intermediate vertex v − to be constructed satisfies δ(v − ) ∈ T . If the expansion rule applied is T rans•i for some i = 1, 2, then the corresponding inference of the concept inclusion is by the rule T rans with the axiom in the side condition equal to δ(v − ); the rest of the rules in Figure 3 are in direct correspondence to the rules from Figure 2. Let us call proof submodel a submodel with the properties from Lemma 1 for some plus-vertex v + . As expected, the converse of the lemma holds as well. Lemma 2 (Proofs correspond to deterministic tree submodels) Let T be a normal TBox, M∗ be an enriched model of T , and let ϕ be a formula of either of the forms A v B, A v ∃r.B, for A, B ∈ Nc . If there is an inference of ϕ from T which omits a subset S ⊆ T , then there is a proof submodel of M∗ with the root vertex v + corresponding to ϕ and each non-leaf minus-vertex v − satisfying δ(v − ) ∈ (T \ S). Thus, in the notations of the lemma, there is an inference of ϕ from T iff there is a plus-vertex v + ∈ Γ + in the enriched model of T such that v + ∈ f (B) and ρ(v + ) = A, or v + ∈ f (r) and ρ(v + ) = hA, Bi, depending on the form of ϕ. By the construction of the enriched model, the vertex v + in the graph is unique for ϕ, so the number of plus-vertices in the model is at worst O(n2 ), where n in the size of the input TBox T . Every path of length 3 in the graph starting and ending in a plus-vertex has a minus-vertex in the middle (the graph is bipartite). Every minus-vertex is labeled with an axiom of T and, by the construction, the number of minus-vertices between two plus-vertices does not exceed the number of axioms in T (there can not exist two such minus-vertices v1 , v2 with δ(v1 ) = δ(v2 )). Thus, the size of the enriched model is polynomial in the size of T . Note that besides the information about all the entailed primitive concept inclusions, the enriched model contains information about all their possible inferences from axioms of T and tautologies T aut(Nc T ), recall also the graphical representation of independent inferences from Example 1. Lemma 3 (Checking for entailment with a normal TBox) Let T be a TBox, S ⊆ T be a subset of axioms, and ϕ ∈ T be an axiom. Let T ep = hTp , τ 0 i be an extended primitivization of T and T en = hT norm , τ i be an extended normalization of T ep . Denote S norm = {ψ ∈ T norm | τ (ψ) ∈ S} and ϕnorm = {ψ ∈ T norm | τ (ψ) = ϕ}. Then we have T \ S |= ϕ iff T norm \ S norm |= ϕnorm . Proof. For brevity we consider only the special case, when S = {ϕ} and ϕ is of the form C1 ≡ C2 for some concepts C1 , C2 6∈ N> c . It follows that the set ϕ norm is not a singleton. Other cases, when S is an arbitrary subset of axioms, or ϕ is of the form C1 v C2 , or Ci ∈ N> c , for some i = 1, 2, are considered analogously. Let ϕp = A1 ≡ A2 be the primitive axiom of Tp such that τ 0 (ϕp ) = ϕ and Ai ≡ Ci ∈ Tp for i = 1, 2. Then, by the definition of extended normalization, ϕnorm = {A1 v A2 , A2 v A1 } (with each of the concept inclusions mappedSby τ to ϕ). Consider the pair h(T \ {ϕ})p , τ̃ )i, where (T \ {ϕ})p = Tp \ ({ϕp } ∪ i=1,2 {Ai ≡ Ci | Ci is not the LHS or RHS of an axiom of T \ {ϕ}}) and τ̃ is the restriction of the map τ 0 onto the set Tp \ {ϕp }. Clearly, h(T \ {ϕ})p , τ̃ )i is an enriched primitivization of T \ {ϕ} and hence, its conservative extension. Moreover, any model of (T \ {ϕ})p expands to a model of Tp \ {ϕp } and thus, Tp \ {ϕp } is a conservative extension of (T \ {ϕ})p . Now consider the set T norm \ ϕnorm . It is easy to see that this set is a normalization of Tp \ {ϕp } and thus, its conservative extension. (⇒) : If T \ {ϕ} |= ϕ, then (T \ {ϕ})p |= C1 ≡ C2 and Tp \ {ϕp } |= {C1 ≡ C2 , A1 ≡ C2 , A2 ≡ C2 }. But then T norm \ ϕnorm entails this set of formulas too and hence, T norm \ ϕnorm |= ϕnorm . (⇐) : Assume that ϕnorm is entailed by T norm \ϕnorm . Then it is entailed by Tp \ {ϕp } and, since Tp \ {ϕp } |= {A1 ≡ C2 , A2 ≡ C2 }, we have Tp \ {ϕp } |= ϕ. As Tp \ {ϕp } is a conservative extension of T \ {ϕ}, we conclude that T \ {ϕ} |= ϕ. Theorem 1 (Entailment from subset of TBox in enriched model). Let T be a TBox and T ep = hTp , τ 0 i be an extended primitivization of T . Let hT norm , τ i be an extended normalization of T ep and M∗ be an enriched model of T norm . Let C1 , C2 be concepts appearing on the LHS or RHS of an axiom of T and for i = 1, 2, let Ai ∈ sig (Tp ) be a concept name which is either Ci (if Ci ∈ Nc ), or such that Ai ∈ sig (Tp ) \ sig (T ) and Ai ≡ Ci ∈ Tp . Then, for a subset S ⊆ T of axioms, it holds T \ S |= C1 v C2 iff there is a proof submodel Ms = (Γs = (Γs+ , Γs− , Es ), δs , ρs , fs ) of M∗ and a vertex v + ∈ Γs+ such that: – v + ∈ fs (A2 ) and ρs (v + ) = A1 , – for any non-leaf vertex v − ∈ Γs− , τ (δs (v − )) 6∈ S. Proof. Let us denote ϕ = C1 v C2 . If ϕ ∈ T then, by the definitions of extended primitivization and normalization, A1 v A2 ∈ T norm , τ (A1 v A2 ) = C1 v C2 and thus, by Lemma 3, the condition T \ S |= C1 v C2 is equivalent to T norm \ S norm |= A1 v A2 , where S norm = {ψ ∈ T norm | τ (ψ) ∈ S}. If ϕ 6∈ T 0 then consider the TBox Q = T ∪ {ϕ}. Clearly, Qep = hTp ∪ {A1 v A2 }, τQ i is norm an extended primitivization of Q and hT ∪ {A1 v A2 }, τQ i is an extended 0 0 normalization of Qep , where τQ (A1 v A2 ) = τQ (A1 v A2 ) = ϕ and τQ , τQ are equal to τ and τ 0 respectively, on the rest of their domains T norm and Tp . Observe that T \ S |= ϕ is the same as Q \ (S ∪ {ϕ}) |= ϕ, which by Lemma 3 and the above noted is equivalent to (T norm ∪ {A1 v A2 }) \ P |= A1 v A2 , where P = {ψ ∈ T norm ∪ {A1 v A2 } | τ (ψ) ∈ S ∪ {ϕ}}. We have A1 v A2 ∈ P, so the last entailment is the same as T norm \ S norm |= A1 v A2 . Therefore, in each of the two considered cases, the condition T \ S |= C1 v C2 is equivalent to T norm \ S norm |= A1 v A2 . (⇒) : If we have the entailment above, then there is an inference of A1 v A2 from T norm which omits S norm . Then, by Lemma 2, there are the required proof submodel Ms and the vertex v + ∈ Γs+ such that for all non-leaf vertices v − ∈ Γ − we have δs (v − ) 6∈ S norm and hence, τ (δs (v − )) 6∈ S. (⇐) : Suppose there is a proof submodel Ms of M∗ satisfying the conditions of the theorem. Then, by the definition of S norm and by Lemma 1, there is an inference of A1 v A2 from T norm which omits S norm \ T aut(Nc T ), but then T norm \ (S norm \ T aut(Nc T )) |= A1 v A2 and (T norm \ S norm ) ∪ (T norm ∩ T aut(Nc T )) |= A1 v A2 , hence T norm \S norm |= A1 v A2 and we have (T norm ∪ {A1 v A2 }) \ P |= A1 v A2 . By Lemma 3, this entailment is equivalent to Q \ (S ∪ {C1 v C2 }) |= C1 v C2 which is the same as T \ S |= C1 v C2 . 3. Proofs for Section 4 Example 2 shows that there are special intermediate derived concept inclu- sions in the proof for which an interpolant can be easily computed. It is either the LHS of the side condition of the rule under which the concept inclusion is obtained, or the concept from the RHS of one of the premises of the rule. In par- ticular, such special concept inclusions are the very first formulas in branches of the proof which are derived using axioms of both, a TBox T and its “copy”. For the rest of the derived concept inclusions, the corresponding interpolants can be obtained by combining interpolants for preceding concept inclusions under con- junction and existential restriction. These are the main ideas presented in the proof of Theorem 3 and the procedure of interpolant computation from Section 4. Motivated by the observation on these “special” concept inclusions in proofs, we first introduce the following auxiliary notion. Definition 5 (Irregular formula, irregular vertex). Let T1 and T2 be nor- mal TBoxes, with sig (T1 ) ∩ sig (T2 ) = ∆ for a signature ∆. A formula ϕ is called regular (wrt T1 , T2 ) if sig (ϕ) ⊆ sig (Ti ) for some i = 1, 2. Otherwise, it is called irregular. Let ((Γ + , Γ − , E), δ, ρ, f ) be an enriched model of T1 ∪ T2 . A vertex v + ∈ Γ + is called irregular if – either ρ(v) = A, v ∈ f (B), for concept names A, B, and A v B is an irregular formula, – or ρ(v) = hA, Bi, v ∈ f (r), for concept names A, B, and A v ∃r.B is an irregular formula. The property of concept interpolation shown in the following theorem is well- known for EL, but the proof is given to justify soundness of the procedure for interpolant computation described in Section 4. Theorem 3 (Structural proof of concept interpolation). Let T1 and T2 be normal TBoxes, with sig (T1 ) ∩ sig (T2 ) = ∆ for a signature ∆, and let A v F be a normal, but irregular formula such that A ∈ sig (T1 ). If there is an inference of A v F from T1 ∪ T2 , then there is a concept interpolant D such that T1 ∪ T2 |= {A v D, D v F } and sig (D) ⊆ ∆. Proof. We apply induction on the length of the inference of A v F from T1 ∪ T2 . Observe that by the form of the rules in Figure 1, we must have F ∈ sub (T1 ∪ T2 ) and thus, sig (F ) ⊆ sig (T2 ), since A v F is irregular and A ∈ sig (T1 ). Induction step (the base case is trivial): let R be the corresponding derivation rule. Note that the formula in the side condition of R is regular and assume that all the formulas in the premise are regular too. If R is T rans, then we conclude that B is the required interpolant; if R is AndR, then A1 u ... u An is; if R is Ex, then either {A, r, A1 , A2 } ⊆ sig (T1 ) and thus, ∃r.A2 is the required interpolant, or {A1 , r, A2 , B} ⊆ sig (T2 ), and hence, ∃r.A1 is. Now assume there is an irregular formula in the premise of R. By the in- duction hypothesis, there is a corresponding interpolant for this formula. If R is T rans, then it is the required interpolant for A v F . If R is AndR, then consider the maximal subset I ⊆ {1, .., n} such that for every i ∈ I the for- mula A v Ai is irregular and Di is the corresponding interpolant. Since A v F is irregular, we conclude that Aj ∈ ∆ for all j ∈ {1, .., n} \ I and therefore, F i∈I Di u F j ∈{1,..,n}\I Aj is the required interpolant. Finally, if R is Ex then, in case A v ∃r.A1 is irregular, its interpolant is the required interpolant for A v F . Now assume that A v ∃r.A1 is a regular formula and A1 v A2 is irregular with an interpolant D. Then r ∈ ∆ and ∃r.D is the required interpolant for A v F . 4. Proof for Section 5 Theorem 2 (Formulas imitate substitutions of signature symbols) Let T1 and T2 be normal TBoxes, ∆ = sig (T1 ) ∩ sig (T2 ), and let σ1 = {Ai }i∈I ∪ {rj }j∈J ⊆ sig (T1 ) \ ∆, σ2 = {Bi }i∈I ∪ {sj }j∈J ⊆ sig (T2 ) \ ∆ be signatures, where all Ai , Bi ∈ Nc and rj , sj ∈ Nr . Let T20 be the result of substituting every Bi with Ai and every sj with rj in the axioms of T2 . Consider the TBox Q = T1 ∪ T2 ∪ {Ai ≡ Bi }i∈I ∪ T2 (s̄/r̄), where T2 (s̄/r̄) is the set of formulas of the form ∃rj .X v Y , or X v ∃rj .Y , j ∈ J such that ∃sj .X v Y , or X v ∃sj .Y , respectively, is an axiom of T2 . Then for any normal formula ϕ in signature sig (T1 ∪ T20 ) we have T1 ∪ T20 |= ϕ iff Q |= ϕ. Proof. (⇒): We show that Q is a conservative extension of T1 ∪ T20 . By the definition of Q it suffices to verify that Q |= T20 \ T2 . Let ψ ∈ T20 \ T2 , this is a formula obtained from an axiom ξ ∈ T2 by substituting Bi with Ai and sj with rj for i ∈ I, j ∈ J. Let {Ak , . . . , Al } = sig (ψ) ∩ σ1 ∩ Nc (a possibly empty set). Consider several possible cases for the signature of ψ. If sig (ψ) ∩ {rj }j∈J = ∅, then ξ ∪ {Ai ≡ Bi }i∈{k,...,l} |= ψ, so we have Q |= ψ, because ξ ∈ T2 ⊆ Q. If there is a role rj ∈ sig (ψ) for some j ∈ J, then there must be a formula γ ∈ T2 (s̄/r̄) such that either γ |= ψ (in case sig (ψ)∩σ1 ∩Nc = ∅), or {γ}∪{Ai ≡ Bi }i∈{k,...,l} |= ψ. In each of the cases we have Q |= ψ. (⇐): Let Q0 be the TBox obtained from T1 ∪T2 ∪T2 (s̄/r̄) by substituting every occurrence of symbol Bi with Ai , for i ∈ I. We have Q |= ϕ iff Q0 |= ϕ, since Q |= Q0 , every model of Q0 expands to a model of Q, and sig (ϕ) ⊆ sig (Q0 ). Observe that any formula ∃rj .X v Y , or X v ∃rj .Y from a “copy” of T2 (s̄/r̄) obtained after this substitution, is a formula of T20 . Therefore, T1 ∪ T20 ⊆ Q0 and the set difference D = Q0 \(T1 ∪T20 ) consists of formulas of the form ∃sj .X v Y , or X v ∃sj .Y , with X, Y ∈ sig (T20 ), j ∈ J, such that there exists a corresponding axiom ∃rj .X v Y , or X v ∃rj .Y in T20 . We now prove by induction that for an inference of a (normal) formula ϕ from Q0 , with sig (ϕ) ⊆ sig (T1 ∪ T20 ), there is a corresponding inference of ϕ from T1 ∪ T20 . Induction step (the base case is trivial): if ϕ is obtained by AndR or T rans, then by the form of formulas in D, the axiom in the side condition can not belong to D and thus there is nothing to prove. If the rule is Ex with an axiom ∃sj .Y 0 v Z ∈ D in the side condition, then there are formulas of the form X v ∃sj .Y ∈ D and Y v Y 0 in the premise. The formula Y v Y 0 is normal and Y, Y 0 ∈ sig (T20 ), thus by the induction hypothesis, there exists an inference of Y v Y 0 from T1 ∪ T20 . Since there are corresponding formulas ∃rj .Y 0 v Z and X v ∃rj .Y in T20 , we conclude that there is an inference of ϕ from T1 ∪ T20 in which ϕ is obtained by Ex from formulas X v ∃rj .Y , Y v Y 0 and X v ∃rj .Y is obtained by T rans from the tautology X v X.