Computing Minimal Equivalent Acyclic EL Ontologies Patrick Koopmann and Nadeschda Nikitina Department of Computer Science, University of Oxford, UK Abstract. Computing equivalent EL-ontologies of minimal size is use- ful for various reasoning tasks such as uniform interpolation, ontology learning, rewriting ontologies into simpler DLs, abduction and knowl- edge revision. The corresponding tool support for minimising concepts and ontologies can also provide great help to ontology developers and end-users. We present a method for computing equivalent acyclic EL- ontologies of minimal size, where size is measured by number of occur- rences of predicates. We applied our minimisation method to both known and generated ontologies with promising results. 1 Introduction Logical languages allow to represent the same facts in different equivalent ways, whose complexity can vary significantly. Description logics are used to model ontologies describing 100,000s of terms. Understanding ontologies of this size is further hindered by unnecessary verbosity. For example, the following are two simplified versions of axioms found in the Galen ontology: TrueCavity ≡ BodyCavity u ∃isDefinedBy.∃hasTopology.∃hasState.trulyHollow (1) TruelyHollowBodyStructure ≡ ∃hasTopology.∃hasState.trulyHollow (2) Axiom 1 seems unnecessarily verbose, especially since Galen already contains a definition for the concept TruelyHollowBodyStructure. Using this concept, we can reformulate Axiom 1, obtaining a much more concise and accessible definition for TrueCavity, while preserving all logical consequences of the ontology: TrueCavity ≡ BodyCavity u ∃isDefinedBy.TruelyHollowBodyStructure (3) Arguably, Axiom 3 is more accessible than Axiom 1, as the user has to parse less information when reading it. Furthermore, Axiom 3 makes more use of the structure provided by the ontology, and contains less redundant information. In this paper, we investigate the task of computing equivalent ontologies of minimal size, where size is defined as the number of occurrences of concept and role symbols in the ontology, or as the sum of its axiom sizes. The correspond- ing decision problem is NP-complete. Automatic simplification of ontologies can benefit a range of areas, of which we give some examples. Optimise existing ontologies. Simplifying large ontologies by hand can be cumbersome, as the de- veloper needs full knowledge of all concepts available in the ontology and has to 2 consider numerous alternatives. A tool that automatically detects and removes redundancy and simplifies concept expressions could provide great help to on- tology developers and end-users. (Semi-)Automated Ontology Generation. There is a variety of techniques to generate ontologies automatically or semi- automatically from different data sources such as tables or text documents [27, 5]. Ontologies generated this way usually require additional manual work, and the quality of the presentation will be worse than that of hand-crafted ontologies. Reducing redundancy and verbosity automatically has the potential to signif- icantly improve the process of generating ontologies this way. Non-Standard Reasoning Services. There is an increasing number of non-standard reasoning services that generate sets of axioms. Examples include methods for uniform in- terpolation [16, 21, 18], ontology learning [13, 17], rewriting ontologies into sim- pler DLs [3, 19], abduction [6, 12] and knowledge revision [7, 24]. Usually, the corresponding tools use heuristics for removing redundancies. In contrast, using our method as a post-processing task in these procedures results in an optimal representation of the results. While there exist solutions for certain sub-problems such as removing redun- dant axioms [8] or minimising EL concepts [2], the only existing algorithm for computing ontologies of minimal size only supports EL-ontologies that do not use conjunctions [22]. We present a method that applies to a more general class of EL-ontologies, based on a semantic acyclity condition defined in Section 4. If the ontology is acyclic, minimised ontologies can be computed in an incremen- tal manner. For this, the ontology is divided into partitions that are minimised one after another. The minimisation of partitions involves the computation of minimal equivalent EL-concepts—a problem which on its own is already NP- complete, and for which we provide a practical method based on regular tree grammars for generating subsumers introduced in [21]. An implementation of our method is available online.1 Full proofs of lemmas and theorems are pro- vided in the long version of the paper [14]. 2 Preliminaries 2.1 The Description Logic EL We recall the description logic EL used in this paper [1]. Let Nc and Nr be count- ably infinite and mutually disjoint sets of concept symbols and role symbols. EL concepts C are defined as C ::= > | A | ∃r.C | C u C, where A ∈ Nc and r ∈ Nr . For a set C = {C1d , . . . , Cn }, we may abbreviate the corresponding conjunction C1 u . . . u Cn with C. If a concept is of the form A, A ∈ Nc , it is atomic. Otherwise it is complex. An EL ontology or TBox consists of concept inclusion axioms C v D and equivalence axioms C1 ≡ · · · ≡ Cn used 1 http://users.ox.ac.uk/~coml0607/el_minimiser/ 3 as a shorthand for Ci v Cj , i, j ≤ n. Equivalence axioms like this correspond to axioms of type OWLEquivalentClasses defined in the standard ontology lan- guage OWL [10]. For a set C = {C1 , . . . , Cn }, we may abbreviate the axiom C1 ≡ . . . ≡ Cn with ≡(C). We denote by sub(C) and sub(T ) the concepts occurring syntactically in C and T . The semantics used is standard (see for example [1]). In particular, for an ontology T and an axiom α, we use the notation T |= α to express that α is entailed by T , that is α is true in every model of T , and T1 ≡ T2 to express that T1 and T2 are equivalent, that is they have the same models. 2.2 Regular Tree Grammars A regular tree grammar is a tuple G = hns , N , F, Ri, composed of a start sym- bol ns , a set N of non-terminal symbols such that ns ∈ N , a ranked alphabet F of terminal symbols such that N ∩ F = ∅, and a set R of derivation rules of the form n →R β, where n ∈ N and β is a term over N ∪ F. A context C[X] is a term in which one subterm is replaced by a variable X. Given a regular tree grammar G = hns , N , F, Ri, the derivation relation →G is a relation on terms over N ∪ F such that t1 →G t2 iff there is a derivation rule n →R β ∈ R and a context C[X] such that t1 = C[X 7→ n] and t2 = C[X 7→ β]. We denote by →+ G the transitive closure of →G . The language generated by G, denoted by L(G), is the set of ground terms t over F such that ns →+ G t. 3 Minimising EL Concepts Before we study the problem of minimising EL-ontologies, we focus on min- imising EL-concepts, which plays a central role in our approach. We define the ∫ -size of an EL concept inductively by ∫ (>) = ∫ (A) = 1 for all A ∈ Nc , ∫ (∃r.C) = ∫ (C) + 1 and ∫ (C1 u C2 ) = ∫ (C1 ) + ∫ (C2 ). Informally, the size of a concept corresponds to the number occurrences of concept and role symbols, as well as of the > concept. The notion of minimal concepts is captured in the following definition: Definition 1. Given an EL ontology T and an EL concept C, C is minimal in T if there is no concept C 0 with T |= C ≡ C 0 and ∫ (C 0 ) < ∫ (C). For an ontology T and a concept C, we denote by minc (C, T ) the set of minimal concepts C 0 with T |= C ≡ C 0 . For an ontology T and two EL concepts C, Cc , we denote by minc (C, T , Cc ) the set of conditioned minimal concepts C 0 for which T |= Cc u C ≡ Cc u C 0 and for which there is no concept C 00 with ∫ (C 00 ) < ∫ (C 0 ) and T |= Cc u C ≡ Cc u C 00 . Intuitively, minimising an ontology involves minimising the concepts occurring in it. Whereas minc (C, T ) contains all minimal concepts equivalent to C, the set minc (C, T , Cc ) of conditioned minimal concepts contains concepts that are minimal in conjunction with a fixed concept Cc . Conditioned minimal concepts can be used to determine minimal concept inclusion axioms. Take as example 4 the axiom ∃r.A v B u ∃r.>. The concept B u ∃r.> is minimal with respect to the empty ontology. However, since ∃r.A v ∃r.> already follows from the empty ontology, the axiom ∃r.A v B is equivalent and uses only B on the right hand side. By fixing ∃r.A, as a conjunct, we obtain the desired concept for the axiom, since |= ∃r.A u (∃r.> u B) ≡ ∃r.A u B, and minc (B u ∃r.>, ∅, ∃r.A) = {B}. The decision problem corresponding to minimising EL-concepts—is there a concept C2 for a given T , C1 and k such that T |= C1 ≡ C2 and ∫ (C1 ) ≤ k— is NP-complete [2]. Since equivalence between general EL ontologies can be decided in polynomial time, from this follows also the NP-completeness of the corresponding decision problems for minimising EL concepts with arbitrary EL ontologies, as well as for minimising cyclic or acyclic EL ontologies. The upper- bound follows since we can non-deterministically guess a solution of size k and test for equivalence in polynomial time (see also [22]). Our method for computing minimal or conditioned minimal equivalent EL concepts makes use of regular tree grammars. Despite the exponential worst-case complexity of this approach, practicality can be achieved by using best-first search and further optimisations, which is briefly discussed in the evaluation section. We define F EL as the symbols constituting the logic EL, that is, F EL = Nc ∪ Nr ∪ {u, ∃}. For a given ontology T and concept Cs , we define the set N T ,Cs = {nC | C ∈ sub(T ) ∪ sub(Cs ) ∪ {>}} which contains a non-terminal symbol nC for every concept C occurring in T or Cs . Given an ontology T and a concept Cs , the grammar Gv (T , Cs ) is then given by hnCs , N T ,Cs , F EL , Ri, where R contains the following derivation rules: (R1): nC →R C for all nC ∈ N T ,Cs (R2): nC →R ∃r.nD for all nC , nD ∈ N T ,Cs such that C = ∃r.D (R3): nC →R nC1 u . . . u nCn for all nC ∈ N T ,Cs and sets {nC1 , . . . , nCn } ⊆ N T ,Cs such that T |= C v Ci , 1 ≤ i ≤ n. In practice, instances of rule (R3) can be determined by flattening and classi- fying the ontology using any standard description logic reasoner. The generation of the other rules is trivial. We extend the notion of ∫ -size to terms generated by Gv (T , Cs ) by setting ∫ (nC ) = 1 for all nC ∈ N T ,Cs . Furthermore, we de- note by Con(t, G) the result of saturating the term t with derivation rules of type (R1), that is, by replacing every non-terminal with the corresponding con- cept. Note that there is exactly one rule of type (R1) for every non-terminal, so that Con(t, G) is always uniquely defined. Example 1. Consider the following ontology T1 : B u ∃r.∃s.A1 v ∃r.∃t.A2 ∃t.A2 v A3 A3 v ∃s.A1 The set N T1 contains non-terminals for the concepts B u ∃r.∃s.A1 , B, ∃r.∃s.A1 , ∃s.A1 , A1 , ∃r.∃t.A2 , ∃t.A2 , A2 , A3 and >. We show an example derivation in the grammar Gv (T1 , B u ∃r.∃s.A1 ). (R3) (R1) nBu∃r.∃s.A1 −−−→G nB u n∃r.∃t.A2 −−−→G B u n∃r.∃t.A2 (R2) (R3) (R1) −−−→G B u ∃r.n∃t.A2 −−−→G B u ∃r.nA3 −−−→G B u ∃r.A3 5 The applied rules of type (R3) are due to the entailments T |= B u ∃r.∃s.A1 v B, T |= B u ∃r.∃s.A1 v ∃r.∃t.A2 and T |= ∃t.A2 v A3 . One can show that T |= Bu∃r.∃s.A1 ≡ Bu∃r.A3 . In fact, one cannot derive a smaller equivalent concept. It is shown in [21] that for every pair of EL concepts C, D such that no concept occurs twice in a conjunction in D, T |= C v D iff D is generated by Gv (T , C). Therefore, Gv (T , C) generates all subsumers of C that are candidates for mini- mal equivalent concepts of C in T . In order to compute an element of minc (C, T ) or minc (C, T , Cc ), we search the space of concepts generated by Gv (T , C). In order to limit the search space we make use of the following properties, which can be shown by inspection of the derivation rules. Lemma 1. Let T be an acyclic EL ontology, C an EL concept and G = Gv (C, T ). Further, let t1 , t2 be terms over N T ∪ F T . – If t1 →G t2 , then T |= Con(t1 ) v Con(t2 ). (v-monotonicity) – If t1 →G t2 , then ∫ (t1 ) ≤ ∫ (t2 ). (∫ -monotonicity) Due to the ∫ -monotonicity, we do not have to follow derivations from con- cepts C 0 for which ∫ (C 0 ) > ∫ (C m ), where C m is the currently known smallest equivalent concept of C. Note that this way, we have to check at most exponen- tially many derivations. Due to the v-monotonicity, we do not have to follow derivations of concepts C 0 such that T 6|= C 0 v C, where C is the concept to minimise. 4 Minimising Acyclic Ontologies We describe our method for minimising acyclic EL ontologies. The notion of ∫ - size is extended as follows to EL axioms and TBoxes: ∫ (C v D) = ∫ (C) + ∫ (D), ∫ (C1 ≡ . . . ≡ Cn ) = Σ1≤i≤n Ci , ∫ (T ) = Σα∈T ∫ (α). An EL ontology T m is minimal if there is no ontology T such that T ≡ T m and ∫ (T ) < ∫ (T m ). We focus on computing minimal equivalent ontologies for a class of EL on- tologies that is characterized by the following definition. Definition 2. Let T be an EL ontology. T is acyclic iff there are no EL concepts C, D such that C ∈ sub(D) and T |= C v ∃r.D. Note that since the definition of acyclicity is defined purely semantically, acyclicity is robust under logical equivalence: given two equivalent ontologies T1 and T2 , T1 is acyclic if and only if T2 is acyclic. Acyclic ontologies have the following property, which facilitates the minimi- sation of ontologies compared to cyclic ontologies. Lemma 2. Let T m be an ∫ -minimal acyclic ontology. Further, let α ∈ T m be of the form C1 v C2 or C1 ≡ . . . ≡ Cn . Then, every equivalent EL ontology T con- tains an axiom β of the form C10 v C20 or C10 ≡ . . . ≡ Cm 0 such that T |= C1 ≡ C10 . Due to this lemma, it is sufficient to consider axioms for the minimised on- tology whose left-hand side is equivalent to the left-hand side of axioms in the input ontology. This allows to partition the axioms in the ontology based on their left-hand side concepts, and minimise the partitions one after the other. 6 4.1 Structuring the input ontology To formalise this idea, we first group equivalent concepts in the ontology. For a concept C, we denote by [C]T the equivalence class of C, the concepts that syntactically occur in T and are equivalent to C: [C]T = {C 0 | C ∈ sub(T ), T |= C ≡ C 0 }. For each equivalence class [C]T , we define the before-mentioned par- titions as T[C] = {α | α = C1 v C2 or α = C1 ≡ . . . ≡ Cn ,, C1 ∈ [C]T }. m By Lemma 2, each non-empty partition T[C] in a minimal ontology T m has a corresponding non-empty partition T[C] in any equivalent ontology T . Note that in Example 1, if we replace the concept B u ∃r.∃s.A1 with its minimal equivalent B u ∃r.A3 , we obtain an ontology that is not equivalent. In order to minimise a partition, we can only take into account entailments from axioms outside of that partition. For example, all elements in [C]T are equivalent to the same concept C m minimal with respect to T , but if we replace equivalence axioms in T[C] by the tautological axiom C m ≡ C m , we do not obtain an equivalent ontology. To determine which axioms of the ontology have to be considered when minimising a partition T[C] , we structure the ontology based on an implicability relation ;T between axioms and equivalence classes. Intuitively, if we have α ;T β, for some α ∈ T[C] , α affects the meaning of β, and we should not take β into account when minimising T[C] . We define the relation ;T formally. For two EL axioms α, β, α ;T β iff there is a TBox T 0 with T |= T 0 , such that α ∈ T 0 , T 0 |= β and T 0 \ {α} 6|= β. Example 2. Take the ontology T1 used in the last example, and the two axioms α = B u∃r.A1 v ∃r.∃t.A2 and β = B u∃r.∃s.A1 ≡ B u∃r.A3 . We observed in the example that T1 |= β. Set T 0 = T1 . We have T1 |= T 0 , T 0 |= β and T 0 \ {α} 6|= β. Therefore, α ;T1 β. In the same way, we can establish ∃t.A2 v A3 ;T1 β and A3 v ∃s.A1 ;T1 β. In acyclic EL ontologies, the only cycles in the implicability relation are between axioms of the same partition. Lemma 3. Let T be an ∃-acyclic EL ontology and let α, β be entailments of T with α = Cα v Dα and β = Cβ v Dβ . Further, let β ;T α and α ;T β. Then, T |= Cα ≡ Cβ . We extend ;T to equivalence classes. [C]T ;T [D]T iff there are axioms αC = C1 v C2 and αD = D1 v D2 , T |= C1 ≡ C, D1 ≡ D, such that αC ;T αD . As a consequence of Lemma 3, the only cycles in the implicabil- ity relation between S equivalence classes are due to reflexivity of the relation. in The ontology T[C] = {T[D] | [D]T ;T [C]T , [D]T 6= [C]T } contains all axioms in T that may have an impact on the minimised version of T[C] . Lemma 4. Let T be an EL ontology and C, D ∈ sub(T ) be two EL concepts such that [C]T ;T [D]T . Then, one of the following is true. 1. T |= D v C. 2. T |= D ≡ ∃r.C. 7 Algorithm 1: Algorithm for computing minimal equivalent ontologies. Data: T : ∃-acyclic EL ontology Result: T m : ∫ -minimal equivalent ontology m 1 T := ∅; 2 Ptodo := {[C]T | C v D ∈ T }; 3 while Ptodo 6= ∅ do 4 Choose [C]T ∈ Ptodo with [D]T 6∈ Ptodo for all [D]T ;T [C]T ,[D]T 6= [C]T ; 5 T m := T m ∪ minimise(T[C] , T m ); 6 Ptodo := Ptodo \ {[C]}; 7 return T m ; 3. There is a concept C2 such that [C2 ]T 6∈ {[C]T , [D]T }, [C]T ;T [C2 ]T and [C2 ]T ;T [D]T . Lemma 4 allows to compute a super-relation of the implicability relation that is sufficient for our purposes. Condition 1 can be checked by flattening and classifying the ontology, Condition 2 is syntactical, and Condition 3 corresponds to the transitive closure. Example 3. There are three non-empty partitions in T1 , corresponding to the equivalence classes [B u ∃r.∃s.A1 ]T1 , [∃t.A2 ]T1 and [A3 ]T1 . Based on the obser- vations in Example 2, we have [∃t.A1 ]T1 ;T1 [B u ∃r.∃s.A1 ]T1 and [A3 ]T1 ;T1 [B u ∃r.∃s.A1 ]T1 . Using Lemma 4 and T1 |= ∃t.A1 v A3 , we can further es- tablish [A3 ]T1 ;T1 [∃t.A1 ]T1 . The ontology T1[Bu∃r.∃s.A in 1] contains the union of T1[∃t.A1 ] and T1[A3 ] , that is, the last two axioms of the ontology. Therefore, in order to minimise partition T1[Bu∃r.A3 ] , we are only allowed to take into account entailments from the last two axioms, which means the equivalence T1 |= B u ∃r.∃s.A1 ≡ B u ∃r.A3 , which depends on all axioms, cannot be used. Theorem 1. Let T 1 and T 2 be two acyclic EL ontologies s.t. T 1 ≡ T 2 , and C be any EL concept. Then, the following statements are true: 1. (T 1 )in 2 in [C] ≡ (T )[C] 2. (T 1 )in 1 1 in 2 [C] ∪ T[C] ≡ (T )[C] ∪ T[C] Note that Theorem 1 also holds for any minimal equivalent ontology T 2 . We can therefore construct a minimal equivalent ontology by first computing a min- imal ontology equivalent to (T 1 )in [C] , and then a minimal extension equivalent to (T 1 )in [C] ∪T 1 [C] . By starting with the concepts C for which (T 1 )in [C] is empty, we can compute minimal equivalent ontologies in an incremental way. In each step, we extend the current minimal ontology with the next partition, until all partitions are processed. An overview of the corresponding top-level procedure is shown in Algorithm 1. The algorithm makes use of a procedure minimise(T[C] , T m ) that minimises partitions T[C] against the already constructed ontology T m . This is described in the next subsection. 8 4.2 Computing Minimal Partitions The concepts occurring in a minimised partition can be computed solely based on the concepts that occur in the original partition T[C] , making use of logical in relations that follow from T[C] . Note that [C]T contains all concepts that occur in on the left-hand side of an axiom in T[C] . We further define the set S(C)T = {Cs | C1 v Cs ∈ T[C] or C1 ≡ . . . ≡ Cs ≡ . . . ≡ Cn ∈ T[C] }, which contains the corresponding concepts on the right-hand sides. [C]T and S(C)T encode all information din T[C] : we obtain an equivalent ontology if we replace T[C] by {≡([C]T ), C v S(C)T )}. In the remainder of the section, we specify how to remove all redundancy from these sets, and how to determine the shape of the minimised partition. We first specify a minimal subset of the concepts in [C]T whose equivalence has to be expressed in the partition of any equivalent ontology. Definition 3. A set C of concepts is equivalence-reduced against T if there are no distinct concepts C1 , C2 ∈ C such that T |= C1 ≡ C2 . A set [C]≡ T is a minimal set of required equivalent concepts in T[C] if [C]T ≡ in is a maximal, equivalence-reduced subset of [C]T against T[C] such that there are no concepts C1 ∈ [C]≡ in in T , C2 ∈ [C]T with T[C] 6|= C1 ≡ C2 and T[C] |= C1 v C2 . First, we only have to consider sets of concepts that are equivalence-reduced in in against T[C] , since all remaining equivalences already follow from T[C] . Second, we exclude all concepts from [C] whose equivalence can be expressed by a single concept inclusion axiom. Example 4. Let T2 extend T1 with the following axioms: B2 u B v ∃r.∃s.A1 B3 ≡ B4 u A3 ≡ B4 u ∃t.A2 ≡ B4 u ∃s.A1 We have S(B2 u B)T2 = {∃r.∃s.A1 } and S(B3 )T2 = [B3 ]T2 = {B3 , B4 u A3 , B4 u ∃t.A2 , B4 u ∃s.A1 }. We determine a minimal set of required equivalent concepts in T2[B3 ] . We have [B u∃r.∃s.A1 ] ;T [B2 uB] and [∃t.A2 ] ;T [B3 ], and therefore in T2[B 3] = T1 ∪{B2 v ∃r.∃s.A1 }. The minimal set of required equivalent concepts in T2[B3 ] is [B3 ]≡ T2 = {B3 , B4 u∃s.A1 }. B4 uA3 and B4 u∃t.A2 are not included in this in in set, because T2[B 2] |= B4 uA3 v B4 u∃s.A1 and T2[B 2] |= B4 u∃t.A2 v B4 u∃s.A1 . In order to determine a minimised partition for T[C] , we have to distinguish cases based on S(C)T and any minimal set of required equivalent concepts in T[C] . We first give the definition of minimised partitions, and then explain it in detail. Definition 4. Let T be an acyclic EL ontology and C an EL concept. Then, m T[C] = {αC } is a minimised partition for C in T iff αC is as follows, where [C]≡ d= {C1 , . . . , Cn } is a set of minimally required concepts of C in T and Cs = S(C)T : 1. If |[C]≡ | ≤ 1 and T[C] in |= C v Cs : 9 – αC = ∅ 2. If |[C]≡ | ≤ 1 and T[C] in 6|= C v Cs : – α = C v Cs , where C m ∈ minc (C, T[C] m m in ) and Csm ∈ minc (Cs , T[C] in , C m ). 3. If |[C]≡ | > 1 and T[C] in ∪ {C1 ≡ . . . ≡ Cn } |= C1 v Cs : m – αC = C1 ≡ . . . ≡ Cnm , where Cim ∈ minc (Ci , T[C] in ) ≡ in 4. If |[C] | > 1 and T[C] ∪ {C1 ≡ . . . ≡ Cn } 6|= C1 v Cs : – αC = C1m ≡ . . . ≡ Cnm ≡ Csm , where Cim ∈ minc (Ci , T[C] in ) and Csm ∈ in minc (C1 u Cs , T[C] ∪ {C1 ≡ . . . ≡ Cn }). The minimised partition only contains an equivalence axiom if there is more than one required equivalent concept (Condition 3 and 4). Otherwise, whether we need a concept inclusion axiom depends on whether all conceptd inclusions already in follow from T[C] or not (Condition 1 and 2). Note that Cs = S(C)T contains all concept inclusion information for C. A minimal concept inclusion axiom is determined as discussed in Section 3. Assume we have more than one required in equivalent concept and C v Cs does not follow solely from T[C] and the required equivalences (Condition 4). In this case, we might just add a concept inclusion axiom as for Condition 2. However, this way we might miss the minimal solution. Observe that the axioms C1 ≡ C2 , C1 v Cs are equivalent to C1 ≡ C2 ≡ C1 uCs . C1 u Cs has the same size as C1 v Cs , but it can be equivalent to a concept that is smaller to any concept inclusion axiom for C. For simplicity, we therefore always encode concept inclusions into the equivalence axiom if Condition 4 of the definition is fulfilled. Theorem 2. Let T be an acyclic EL ontology, T m a minimal equivalent on- m tology, C an EL concept and T[C] be a minimised partition for C in T . De- m2 note by T the result of replacing T[C] in T m by T[C] m . Then, T ≡ T m2 and ∫ (T m ) = ∫ (T m2 ). The result of the method minimise(T[C] , T m ) used in Algorithm 1 is calcu- lated by checking the cases in Definition 4. Together with Theorem 2, we can establish the correctness of our method. Theorem 3. For any acyclic EL ontology T , Algorithm 1 terminates and re- turns a minimal ontology T m such that T ≡ T m . Example 5. We continue on the running example. As it turns out, (T2 )in [B2 uB] = T1 is already minimal. To minimise T[B2 uB] , we note that Case 2 applies, since [B2 u B]T contains only one element. Based on the minimisation result in Ex- ample 1, we obtain the minimised partition {B2 u B v ∃r.A3 }. For [B3 ]T , the minimal set of required equivalent concepts is {B3 , B4 u ∃s.A1 }. Case 4 applies, which means we have to encode remaining concept inclusions from S(B3 )T2 into the equivalence axiom. The resulting minimised partition is {B3 ≡ B4 u ∃s.A1 ≡ B4 u ∃t.A2 }. As a result, we obtain the following minimal ontology: B u ∃r.∃s.A1 v ∃r.∃t.A2 ∃t.A2 v A3 A3 v ∃s.A1 B2 u B v ∃r.A3 B3 ≡ B4 u ∃s.A1 ≡ B4 u ∃t.A2 10 5 Evaluation We implemented the method in Java, using the OWL API [9]. We used the latest version of ELK [11] for reasoning, since it supports incremental reasoning, a feature required for a fast retrieval of subsumption relations for the incrementally in built minimised ontologies T[C] . ELK was further used to verify the equivalence of the minimised ontologies. The implementation is available online. The computation of minimal equivalent concepts was the most expensive part of the minimisation. Note that for each concept, there are exponentially many rules of type (R3) in the subsumer grammar, which makes an exhaustive search impossible. We therefore used several optimisations. (1) We determine the order in which rules of type (R3) are tried using a best-first strategy, where we evaluate rules based on (a) the conjunction length, (b) the size of the concepts corresponding to the non-terminals in the conjunction, and (c) the size of con- cepts subsuming these non-terminals. This way, we could reduce the number of entailment tests to 1 or 2 in most cases.Without a strategy like this, we were not able to compute minimal equivalent concepts in almost every case. (2) For large conjunctions, we tested whether certain conjuncts have to be included in every equivalent concept. For small concepts, this test is more expensive than just using a best-first search as described above, but it enabled us to minimise large conjunctions, which is why we only used this optimisation for large concepts. We evaluated our method on ontologies from the NCBO BioPortal reposi- tory [23]. From this repository, we selected all ontologies that (1) could be parsed by the OWL API, (2) contained at least 75% EL axioms, as defined in the pre- liminaries of this paper, (3) contained at least one existential role restriction and one conjunction in the EL axioms. The resulting set contained 55 ontolo- gies. We further included the versions of Galen [25] and NCI [26] from the Tones repository [20] and SNOMED [4]. To get an idea on how our method performs on generated ontologies, we generated 360 uniform interpolants of Galen with a signature size of 50 using the tool Lethe [15]. The syntactical structure of these interpolants was completely determined by the tool. For more information on uniform interpolation, we refer to [16]. Table 5 shows the sizes, the percentage of EL-axioms, and the percentage of equivalence classes in the acyclic part of the input ontologies. The percentage of the size reduced by our method is shown in the column labelled MSize1. To com- pare against simple syntactic transformations, we modified the input ontologies by exhaustively applying the transformation C1 v C2 , C1 v C3 ⇒ C1 v C2 uC3 . The difference in size against these ontologies is shown in the column labelled MSize2. We were especially interested in how the amount of complex concepts changed. We therefore computed the sum of the sizes of complex concepts, as well as the sum of the sizes of existential role restrictions in each ontology. The reductions with respect to these measures are shown in the columns respectively labelled CSize and ∃Size. The running times per ontology are shown in the last column. We see a significant reduction for all measures, especially for complex concepts and existential restrictions, whose accumulative size was reduced by respectively 14.17% and 25.82% on average in the BioPortal repository. 11 Ontologies Size EL acyclic MSize1 MSize2 CSize ∃Size Duration BioPortal 58,943.8 92.1% 78.5% 12.96% 7.64% 14.17% 25.82% 443 s. Interpolants 2,192.4 79.8% 98.5% 31.50% 25.37% 48.78% 48.62% 1.3 s. Galen 13,625 95.72% 90.53% 19.05% 13.41% 21.11% 31.15% 9.4 s. Gene Ext. 162,950 100.0% 100.0% 31.65% 16.81% 19.68% 50.71% 483 s. NCI 267,916 94.7% 100.0% 13.45% 3.36% 7.24% 8.71% 1,432 s. SNOMED 444,473 100.0% 73.7% 23.11% 20.59% 26.04% 27.51% 3,954 s. Table 1. Evaluation results. 6 Related Work The problem of minimising EL concepts was first studied in [2], for the special case of acyclic terminologies. The presented method is based on unfolding, which can easy be implemented for acyclic terminologies. Due to the exponential search space of this method, they also provide a greedy version of the algorithm that has polynomial worst-case complexity, but does not guarantee optimal results. The first technique for simplifying EL ontologies was presented in [22]. Whereas our method explores the full space of equivalent concepts for minimisation, and determines the exact shape of minimised axioms, this method minimises axioms by replacing subconcepts based on known equivalences. Redundant axioms are removed in a last step. While this method only runs in polynomial time, it only guarantees optimality if the ontology does not use conjunctions. Comparing the results of our evaluations, we see that our method provides a significant improvement: for example, using their method, the size of NCI and Galen are respectively reduced by 6% and 9%, whereas our method provides a reduction by respectively 13% and 19%. 7 Summary We presented a method for minimising acyclic EL ontologies, which might pro- vide great help in improving existing ontologies as well as for tools that generate ontology content. Key to our method is to structure the axioms in the ontology into partitions that can be minimised one after another following a implicability relation. Minimal axioms for each partition are computed by analysing inclusion relations between concepts that syntactically occur in the ontology, and mak- ing use of a method for minimising EL concepts with respect to an ontology. EL concepts are minimised using a technique based on regular tree grammars. An evaluation on realistic and generated ontologies showed that our method re- duced the overall size as well as the complexity of ontologies significantly. An open question is how to deal with cyclic ontologies. Whereas our approach could be used to minimise existing partitions in cyclic ontologies, the main challenge for cyclic ontologies is that we have to determine partitions not present in the original ontology. Apart from cyclic EL ontologies, we are currently investigating methods for minimising concepts in more expressive description logics. 12 References 1. Baader, F., Calvanese, D., McGuinness, D., Nardi, D., Patel-Schneider, P.: The De- scription Logic Handbook: Theory, Implementation and Applications. Cambridge University Press (2003) 2. Baader, F., Küsters, R., Molitor, R.: Rewriting concepts using terminologies. In: Proceedings of KR 2000. pp. 297–308 (2000) 3. Carral, D., Feier, C., Grau, B.C., Hitzler, P., Horrocks, I.: EL-ifying ontologies. In: Proceedings of IJCAR 2014 (2014) 4. Cornet, R., de Keizer, N.: Forty years of SNOMED: a literature review. BMC Med. Inf. & Decision Making 8 (2008) 5. Ding, Y., Foo, S.: Ontology research and development. part 1—a review of ontology generation. Journal of information science 28(2), 123–136 (2002) 6. Du, J., Wang, K., Shen, Y.: Towards tractable and practical ABox abduction over inconsistent description logic ontologies. In: Proceedings of AAAI 2015 (2015) 7. Grau, B.C., Kharlamov, E., Zheleznyakov, D.: How to contract ontologies. In: Proceedings of OWLED 2012 (2012) 8. Grimm, S., Wissmann, J.: Elimination of redundancy in ontologies. In: Proceedings of ESWC 2011. pp. 260–274 (2011) 9. Horridge, M., Bechhofer, S.: The OWL API: A Java API for OWL ontologies. IOS Press (2011) 10. Horrocks, I., Patel-Schneider, P., van Harmelen, F.: From SHIQ and RDF to OWL: The making of a web ontology language. Web Semantics: Science, Services and Agents on the World Wide Web 1(1), pp.7–26 (2003) 11. Kazakov, Y., Klinov, P.: Incremental reasoning in OWL EL without bookkeeping. In: Proceedings of ISWC 2013, pp. 232–247. Springer (2013) 12. Klarman, S., Endriss, U., Schlobach, S.: ABox abduction in the description logic ALC. Journal of Automated Reasoning 46(1), 43–80 (2011) 13. Konev, B., Lutz, C., Ozaki, A., Wolter, F.: Exact learning of lightweight description logic ontologies. In: Proceedings of KR 2013. AAAI Press (2013) 14. Koopmann, P., Nikitina, N.: Computing minimal equivalence acyclic EL ontologies—extended version. Tech. rep., University of Oxford (2016), http: //users.ox.ac.uk/~coml0607/DL2016.pdf 15. Koopmann, P., Schmidt, R.A.: Lethe: Saturation-based reasoning for non- standard reasoning tasks. In: Proceedings of ORE 2015. pp. 23–30. CEUR-WS.org (2015) 16. Koopmann, P., Schmidt, R.A.: Uniform interpolation and forgetting for ALC on- tologies with ABoxes. In: Proceedings of AAAI 2015 (2015) 17. Lehmann, J., Hitzler, P.: Concept learning in description logics using refinement operators. Machine Learning 78(1-2), 203–250 (2010) 18. Ludwig, M., Konev, B.: Practical uniform interpolation and forgetting for ALC TBoxes with applications to logical difference. In: Proceedings of KR 2014. AAAI Press (2014) 19. Lutz, C., Piro, R., Wolter, F.: Description logic TBoxes: Model-theoretic charac- terizations and rewritability. In: Proceedings of IJCAI 2011. pp. 983–988 (2011) 20. Matentzoglu, N., Bail, S., Parsia, B.: A corpus of OWL DL ontologies. In: Pro- ceedings of DL 2013. vol. 1014, pp. 829–841 (2013) 21. Nikitina, N., Rudolph, S.: (Non-)succinctness of uniform interpolants of general terminologies in the description logic EL. Artificial Intelligence 215(0), 120–140 (2014) 13 22. Nikitina, N., Schewe, S.: Simplifying description logic ontologies. In: Proceedings of ISWC 2013. LNCS, vol. 8219, pp. 411–426. Springer (2013) 23. Noy, N.F., Shah, N.H., Whetzel, P.L., Dai, B., Dorf, M., Griffith, N., Jonquet, C., Rubin, D.L., Storey, M.A., Chute, C.G., Musen, M.A.: BioPortal: ontologies and integrated data resources at the click of a mouse. Nucleic Acids Research (2009) 24. Qi, G., Liu, W., Bell, D.A.: Knowledge base revision in description logics. In: Logics in Artificial Intelligence, pp. 386–398. Springer (2006) 25. Rector, A., Gangemi, A., Galeazzi, E., Glowinski, A., Rossi-Mori, A.: The GALEN CORE model schemata for anatomy: Towards a re-usable application-independent model of medical concepts. In: Proceedings of MIE 2094. pp. 229–233 (1994) 26. Sioutos, N., de Coronado, S., Haber, M.W., Hartel, F.W., Shaiu, W.L., Wright, L.W.: NCI Thesaurus: A semantic model integrating cancer-related clinical and molecular information. Journal of Biomedical Informatics 40(1), 30–43 (2007) 27. Wächter, T., Schroeder, M.: Semi-automated ontology generation within OBO- Edit. Bioinformatics 26(12), 88–96 (2010)