Computing Minimal Equivalent Acyclic
                 EL Ontologies

                 Patrick Koopmann and Nadeschda Nikitina

            Department of Computer Science, University of Oxford, UK


      Abstract. Computing equivalent EL-ontologies of minimal size is use-
      ful for various reasoning tasks such as uniform interpolation, ontology
      learning, rewriting ontologies into simpler DLs, abduction and knowl-
      edge revision. The corresponding tool support for minimising concepts
      and ontologies can also provide great help to ontology developers and
      end-users. We present a method for computing equivalent acyclic EL-
      ontologies of minimal size, where size is measured by number of occur-
      rences of predicates. We applied our minimisation method to both known
      and generated ontologies with promising results.


1   Introduction
Logical languages allow to represent the same facts in different equivalent ways,
whose complexity can vary significantly. Description logics are used to model
ontologies describing 100,000s of terms. Understanding ontologies of this size is
further hindered by unnecessary verbosity. For example, the following are two
simplified versions of axioms found in the Galen ontology:
      TrueCavity ≡ BodyCavity
                u ∃isDefinedBy.∃hasTopology.∃hasState.trulyHollow               (1)
      TruelyHollowBodyStructure ≡ ∃hasTopology.∃hasState.trulyHollow            (2)

 Axiom 1 seems unnecessarily verbose, especially since Galen already contains a
definition for the concept TruelyHollowBodyStructure. Using this concept, we
can reformulate Axiom 1, obtaining a much more concise and accessible definition
for TrueCavity, while preserving all logical consequences of the ontology:
      TrueCavity ≡ BodyCavity u ∃isDefinedBy.TruelyHollowBodyStructure          (3)

 Arguably, Axiom 3 is more accessible than Axiom 1, as the user has to parse
less information when reading it. Furthermore, Axiom 3 makes more use of the
structure provided by the ontology, and contains less redundant information.
    In this paper, we investigate the task of computing equivalent ontologies of
minimal size, where size is defined as the number of occurrences of concept and
role symbols in the ontology, or as the sum of its axiom sizes. The correspond-
ing decision problem is NP-complete. Automatic simplification of ontologies can
benefit a range of areas, of which we give some examples. Optimise existing
ontologies. Simplifying large ontologies by hand can be cumbersome, as the de-
veloper needs full knowledge of all concepts available in the ontology and has to
2

consider numerous alternatives. A tool that automatically detects and removes
redundancy and simplifies concept expressions could provide great help to on-
tology developers and end-users. (Semi-)Automated Ontology Generation.
There is a variety of techniques to generate ontologies automatically or semi-
automatically from different data sources such as tables or text documents [27,
5]. Ontologies generated this way usually require additional manual work, and
the quality of the presentation will be worse than that of hand-crafted ontologies.
Reducing redundancy and verbosity automatically has the potential to signif-
icantly improve the process of generating ontologies this way. Non-Standard
Reasoning Services. There is an increasing number of non-standard reasoning
services that generate sets of axioms. Examples include methods for uniform in-
terpolation [16, 21, 18], ontology learning [13, 17], rewriting ontologies into sim-
pler DLs [3, 19], abduction [6, 12] and knowledge revision [7, 24]. Usually, the
corresponding tools use heuristics for removing redundancies. In contrast, using
our method as a post-processing task in these procedures results in an optimal
representation of the results.
    While there exist solutions for certain sub-problems such as removing redun-
dant axioms [8] or minimising EL concepts [2], the only existing algorithm for
computing ontologies of minimal size only supports EL-ontologies that do not
use conjunctions [22]. We present a method that applies to a more general class
of EL-ontologies, based on a semantic acyclity condition defined in Section 4. If
the ontology is acyclic, minimised ontologies can be computed in an incremen-
tal manner. For this, the ontology is divided into partitions that are minimised
one after another. The minimisation of partitions involves the computation of
minimal equivalent EL-concepts—a problem which on its own is already NP-
complete, and for which we provide a practical method based on regular tree
grammars for generating subsumers introduced in [21]. An implementation of
our method is available online.1 Full proofs of lemmas and theorems are pro-
vided in the long version of the paper [14].


2     Preliminaries

2.1    The Description Logic EL

We recall the description logic EL used in this paper [1]. Let Nc and Nr be count-
ably infinite and mutually disjoint sets of concept symbols and role symbols. EL
concepts C are defined as

                            C ::= > | A | ∃r.C | C u C,

where A ∈ Nc and r ∈ Nr . For a set C = {C1d    , . . . , Cn }, we may abbreviate the
corresponding conjunction C1 u . . . u Cn with C. If a concept is of the form A,
A ∈ Nc , it is atomic. Otherwise it is complex. An EL ontology or TBox consists
of concept inclusion axioms C v D and equivalence axioms C1 ≡ · · · ≡ Cn used
1
    http://users.ox.ac.uk/~coml0607/el_minimiser/
                                                                                  3

as a shorthand for Ci v Cj , i, j ≤ n. Equivalence axioms like this correspond
to axioms of type OWLEquivalentClasses defined in the standard ontology lan-
guage OWL [10]. For a set C = {C1 , . . . , Cn }, we may abbreviate the axiom
C1 ≡ . . . ≡ Cn with ≡(C).
    We denote by sub(C) and sub(T ) the concepts occurring syntactically in C
and T . The semantics used is standard (see for example [1]). In particular, for
an ontology T and an axiom α, we use the notation T |= α to express that α
is entailed by T , that is α is true in every model of T , and T1 ≡ T2 to express
that T1 and T2 are equivalent, that is they have the same models.

2.2   Regular Tree Grammars
A regular tree grammar is a tuple G = hns , N , F, Ri, composed of a start sym-
bol ns , a set N of non-terminal symbols such that ns ∈ N , a ranked alphabet F
of terminal symbols such that N ∩ F = ∅, and a set R of derivation rules of
the form n →R β, where n ∈ N and β is a term over N ∪ F. A context C[X] is
a term in which one subterm is replaced by a variable X. Given a regular tree
grammar G = hns , N , F, Ri, the derivation relation →G is a relation on terms
over N ∪ F such that t1 →G t2 iff there is a derivation rule n →R β ∈ R and a
context C[X] such that t1 = C[X 7→ n] and t2 = C[X 7→ β]. We denote by →+     G
the transitive closure of →G . The language generated by G, denoted by L(G), is
the set of ground terms t over F such that ns →+ G t.


3     Minimising EL Concepts
Before we study the problem of minimising EL-ontologies, we focus on min-
imising EL-concepts, which plays a central role in our approach. We define
the ∫ -size of an EL concept inductively by ∫ (>) = ∫ (A) = 1 for all A ∈ Nc ,
∫ (∃r.C) = ∫ (C) + 1 and ∫ (C1 u C2 ) = ∫ (C1 ) + ∫ (C2 ). Informally, the size of a
concept corresponds to the number occurrences of concept and role symbols, as
well as of the > concept.
    The notion of minimal concepts is captured in the following definition:
Definition 1. Given an EL ontology T and an EL concept C, C is minimal in T
if there is no concept C 0 with T |= C ≡ C 0 and ∫ (C 0 ) < ∫ (C). For an ontology T
and a concept C, we denote by minc (C, T ) the set of minimal concepts C 0 with
T |= C ≡ C 0 . For an ontology T and two EL concepts C, Cc , we denote by
minc (C, T , Cc ) the set of conditioned minimal concepts C 0 for which T |= Cc u
C ≡ Cc u C 0 and for which there is no concept C 00 with ∫ (C 00 ) < ∫ (C 0 ) and
T |= Cc u C ≡ Cc u C 00 .
Intuitively, minimising an ontology involves minimising the concepts occurring
in it. Whereas minc (C, T ) contains all minimal concepts equivalent to C, the
set minc (C, T , Cc ) of conditioned minimal concepts contains concepts that are
minimal in conjunction with a fixed concept Cc . Conditioned minimal concepts
can be used to determine minimal concept inclusion axioms. Take as example
4

the axiom ∃r.A v B u ∃r.>. The concept B u ∃r.> is minimal with respect to
the empty ontology. However, since ∃r.A v ∃r.> already follows from the empty
ontology, the axiom ∃r.A v B is equivalent and uses only B on the right hand
side. By fixing ∃r.A, as a conjunct, we obtain the desired concept for the axiom,
since |= ∃r.A u (∃r.> u B) ≡ ∃r.A u B, and minc (B u ∃r.>, ∅, ∃r.A) = {B}.
    The decision problem corresponding to minimising EL-concepts—is there a
concept C2 for a given T , C1 and k such that T |= C1 ≡ C2 and ∫ (C1 ) ≤ k—
is NP-complete [2]. Since equivalence between general EL ontologies can be
decided in polynomial time, from this follows also the NP-completeness of the
corresponding decision problems for minimising EL concepts with arbitrary EL
ontologies, as well as for minimising cyclic or acyclic EL ontologies. The upper-
bound follows since we can non-deterministically guess a solution of size k and
test for equivalence in polynomial time (see also [22]). Our method for computing
minimal or conditioned minimal equivalent EL concepts makes use of regular
tree grammars. Despite the exponential worst-case complexity of this approach,
practicality can be achieved by using best-first search and further optimisations,
which is briefly discussed in the evaluation section.
    We define F EL as the symbols constituting the logic EL, that is, F EL =
Nc ∪ Nr ∪ {u, ∃}. For a given ontology T and concept Cs , we define the set
N T ,Cs = {nC | C ∈ sub(T ) ∪ sub(Cs ) ∪ {>}} which contains a non-terminal
symbol nC for every concept C occurring in T or Cs . Given an ontology T and
a concept Cs , the grammar Gv (T , Cs ) is then given by hnCs , N T ,Cs , F EL , Ri,
where R contains the following derivation rules:
(R1): nC →R C for all nC ∈ N T ,Cs
(R2): nC →R ∃r.nD for all nC , nD ∈ N T ,Cs such that C = ∃r.D
(R3): nC →R nC1 u . . . u nCn for all nC ∈ N T ,Cs and sets {nC1 , . . . , nCn } ⊆
   N T ,Cs such that T |= C v Ci , 1 ≤ i ≤ n.
    In practice, instances of rule (R3) can be determined by flattening and classi-
fying the ontology using any standard description logic reasoner. The generation
of the other rules is trivial. We extend the notion of ∫ -size to terms generated
by Gv (T , Cs ) by setting ∫ (nC ) = 1 for all nC ∈ N T ,Cs . Furthermore, we de-
note by Con(t, G) the result of saturating the term t with derivation rules of
type (R1), that is, by replacing every non-terminal with the corresponding con-
cept. Note that there is exactly one rule of type (R1) for every non-terminal, so
that Con(t, G) is always uniquely defined.
Example 1. Consider the following ontology T1 :
            B u ∃r.∃s.A1 v ∃r.∃t.A2       ∃t.A2 v A3       A3 v ∃s.A1
The set N T1 contains non-terminals for the concepts B u ∃r.∃s.A1 , B, ∃r.∃s.A1 ,
∃s.A1 , A1 , ∃r.∃t.A2 , ∃t.A2 , A2 , A3 and >. We show an example derivation in
the grammar Gv (T1 , B u ∃r.∃s.A1 ).
                  (R3)                   (R1)
     nBu∃r.∃s.A1 −−−→G nB u n∃r.∃t.A2 −−−→G B u n∃r.∃t.A2
                  (R2)                  (R3)                (R1)
                 −−−→G B u ∃r.n∃t.A2 −−−→G B u ∃r.nA3 −−−→G B u ∃r.A3
                                                                                  5

The applied rules of type (R3) are due to the entailments T |= B u ∃r.∃s.A1 v B,
T |= B u ∃r.∃s.A1 v ∃r.∃t.A2 and T |= ∃t.A2 v A3 . One can show that T |=
Bu∃r.∃s.A1 ≡ Bu∃r.A3 . In fact, one cannot derive a smaller equivalent concept.
   It is shown in [21] that for every pair of EL concepts C, D such that no concept
occurs twice in a conjunction in D, T |= C v D iff D is generated by Gv (T , C).
Therefore, Gv (T , C) generates all subsumers of C that are candidates for mini-
mal equivalent concepts of C in T . In order to compute an element of minc (C, T )
or minc (C, T , Cc ), we search the space of concepts generated by Gv (T , C). In
order to limit the search space we make use of the following properties, which
can be shown by inspection of the derivation rules.
Lemma 1. Let T be an acyclic EL ontology, C an EL concept and G = Gv (C, T ).
Further, let t1 , t2 be terms over N T ∪ F T .
    – If t1 →G t2 , then T |= Con(t1 ) v Con(t2 ). (v-monotonicity)
    – If t1 →G t2 , then ∫ (t1 ) ≤ ∫ (t2 ). (∫ -monotonicity)
    Due to the ∫ -monotonicity, we do not have to follow derivations from con-
cepts C 0 for which ∫ (C 0 ) > ∫ (C m ), where C m is the currently known smallest
equivalent concept of C. Note that this way, we have to check at most exponen-
tially many derivations. Due to the v-monotonicity, we do not have to follow
derivations of concepts C 0 such that T 6|= C 0 v C, where C is the concept to
minimise.

4      Minimising Acyclic Ontologies
We describe our method for minimising acyclic EL ontologies. The notion of ∫ -
size is extended as follows to EL axioms and TBoxes: ∫ (C v D) = ∫ (C) + ∫ (D),
∫ (C1 ≡ . . . ≡ Cn ) = Σ1≤i≤n Ci , ∫ (T ) = Σα∈T ∫ (α). An EL ontology T m is
minimal if there is no ontology T such that T ≡ T m and ∫ (T ) < ∫ (T m ).
    We focus on computing minimal equivalent ontologies for a class of EL on-
tologies that is characterized by the following definition.
Definition 2. Let T be an EL ontology. T is acyclic iff there are no EL concepts
C, D such that C ∈ sub(D) and T |= C v ∃r.D.
    Note that since the definition of acyclicity is defined purely semantically,
acyclicity is robust under logical equivalence: given two equivalent ontologies T1
and T2 , T1 is acyclic if and only if T2 is acyclic.
    Acyclic ontologies have the following property, which facilitates the minimi-
sation of ontologies compared to cyclic ontologies.
Lemma 2. Let T m be an ∫ -minimal acyclic ontology. Further, let α ∈ T m be of
the form C1 v C2 or C1 ≡ . . . ≡ Cn . Then, every equivalent EL ontology T con-
tains an axiom β of the form C10 v C20 or C10 ≡ . . . ≡ Cm
                                                         0
                                                           such that T |= C1 ≡ C10 .
    Due to this lemma, it is sufficient to consider axioms for the minimised on-
tology whose left-hand side is equivalent to the left-hand side of axioms in the
input ontology. This allows to partition the axioms in the ontology based on
their left-hand side concepts, and minimise the partitions one after the other.
6

4.1   Structuring the input ontology
To formalise this idea, we first group equivalent concepts in the ontology. For
a concept C, we denote by [C]T the equivalence class of C, the concepts that
syntactically occur in T and are equivalent to C: [C]T = {C 0 | C ∈ sub(T ), T |=
C ≡ C 0 }. For each equivalence class [C]T , we define the before-mentioned par-
titions as T[C] = {α | α = C1 v C2 or α = C1 ≡ . . . ≡ Cn ,, C1 ∈ [C]T }.
                                             m
By Lemma 2, each non-empty partition T[C]       in a minimal ontology T m has a
corresponding non-empty partition T[C] in any equivalent ontology T .
    Note that in Example 1, if we replace the concept B u ∃r.∃s.A1 with its
minimal equivalent B u ∃r.A3 , we obtain an ontology that is not equivalent.
In order to minimise a partition, we can only take into account entailments
from axioms outside of that partition. For example, all elements in [C]T are
equivalent to the same concept C m minimal with respect to T , but if we replace
equivalence axioms in T[C] by the tautological axiom C m ≡ C m , we do not
obtain an equivalent ontology. To determine which axioms of the ontology have
to be considered when minimising a partition T[C] , we structure the ontology
based on an implicability relation ;T between axioms and equivalence classes.
Intuitively, if we have α ;T β, for some α ∈ T[C] , α affects the meaning of β,
and we should not take β into account when minimising T[C] .
    We define the relation ;T formally. For two EL axioms α, β, α ;T β iff
there is a TBox T 0 with T |= T 0 , such that α ∈ T 0 , T 0 |= β and T 0 \ {α} 6|= β.

Example 2. Take the ontology T1 used in the last example, and the two axioms
α = B u∃r.A1 v ∃r.∃t.A2 and β = B u∃r.∃s.A1 ≡ B u∃r.A3 . We observed in the
example that T1 |= β. Set T 0 = T1 . We have T1 |= T 0 , T 0 |= β and T 0 \ {α} 6|= β.
Therefore, α ;T1 β. In the same way, we can establish ∃t.A2 v A3 ;T1 β and
A3 v ∃s.A1 ;T1 β.

In acyclic EL ontologies, the only cycles in the implicability relation are between
axioms of the same partition.

Lemma 3. Let T be an ∃-acyclic EL ontology and let α, β be entailments of T
with α = Cα v Dα and β = Cβ v Dβ . Further, let β ;T α and α ;T β. Then,
T |= Cα ≡ Cβ .

    We extend ;T to equivalence classes. [C]T ;T [D]T iff there are axioms
αC = C1 v C2 and αD = D1 v D2 , T |= C1 ≡ C, D1 ≡ D, such that
αC ;T αD . As a consequence of Lemma 3, the only cycles in the implicabil-
ity relation between S
                     equivalence classes are due to reflexivity of the relation.
                in
The ontology T[C]  = {T[D] | [D]T ;T [C]T , [D]T 6= [C]T } contains all axioms
in T that may have an impact on the minimised version of T[C] .

Lemma 4. Let T be an EL ontology and C, D ∈ sub(T ) be two EL concepts
such that [C]T ;T [D]T . Then, one of the following is true.
1. T |= D v C.
2. T |= D ≡ ∃r.C.
                                                                                     7


 Algorithm 1: Algorithm for computing minimal equivalent ontologies.
   Data: T : ∃-acyclic EL ontology
   Result: T m : ∫ -minimal equivalent ontology
     m
 1 T   := ∅;
 2 Ptodo := {[C]T | C v D ∈ T };
 3 while Ptodo 6= ∅ do
 4     Choose [C]T ∈ Ptodo with [D]T 6∈ Ptodo for all [D]T ;T [C]T ,[D]T 6= [C]T ;
 5     T m := T m ∪ minimise(T[C] , T m );
 6     Ptodo := Ptodo \ {[C]};
 7   return T m ;


3. There is a concept C2 such that [C2 ]T 6∈ {[C]T , [D]T }, [C]T ;T [C2 ]T and
   [C2 ]T ;T [D]T .

    Lemma 4 allows to compute a super-relation of the implicability relation
that is sufficient for our purposes. Condition 1 can be checked by flattening and
classifying the ontology, Condition 2 is syntactical, and Condition 3 corresponds
to the transitive closure.
Example 3. There are three non-empty partitions in T1 , corresponding to the
equivalence classes [B u ∃r.∃s.A1 ]T1 , [∃t.A2 ]T1 and [A3 ]T1 . Based on the obser-
vations in Example 2, we have [∃t.A1 ]T1 ;T1 [B u ∃r.∃s.A1 ]T1 and [A3 ]T1 ;T1
[B u ∃r.∃s.A1 ]T1 . Using Lemma 4 and T1 |= ∃t.A1 v A3 , we can further es-
tablish [A3 ]T1 ;T1 [∃t.A1 ]T1 . The ontology T1[Bu∃r.∃s.A
                                                     in
                                                              1]
                                                                  contains the union
of T1[∃t.A1 ] and T1[A3 ] , that is, the last two axioms of the ontology. Therefore,
in order to minimise partition T1[Bu∃r.A3 ] , we are only allowed to take into
account entailments from the last two axioms, which means the equivalence
T1 |= B u ∃r.∃s.A1 ≡ B u ∃r.A3 , which depends on all axioms, cannot be used.

Theorem 1. Let T 1 and T 2 be two acyclic EL ontologies s.t. T 1 ≡ T 2 , and C
be any EL concept. Then, the following statements are true:

1. (T 1 )in        2 in
         [C] ≡ (T )[C]
2. (T 1 )in      1      1 in      2
         [C] ∪ T[C] ≡ (T )[C] ∪ T[C]

    Note that Theorem 1 also holds for any minimal equivalent ontology T 2 . We
can therefore construct a minimal equivalent ontology by first computing a min-
imal ontology equivalent to (T 1 )in [C] , and then a minimal extension equivalent to
(T 1 )in
      [C] ∪T 1
            [C] . By starting with the concepts C for which (T 1 )in
                                                                  [C] is empty, we can
compute minimal equivalent ontologies in an incremental way. In each step, we
extend the current minimal ontology with the next partition, until all partitions
are processed. An overview of the corresponding top-level procedure is shown in
Algorithm 1. The algorithm makes use of a procedure minimise(T[C] , T m ) that
minimises partitions T[C] against the already constructed ontology T m . This is
described in the next subsection.
8

4.2   Computing Minimal Partitions
The concepts occurring in a minimised partition can be computed solely based
on the concepts that occur in the original partition T[C] , making use of logical
                               in
relations that follow from T[C]   . Note that [C]T contains all concepts that occur
                                           in
on the left-hand side of an axiom in T[C]     . We further define the set S(C)T =
{Cs | C1 v Cs ∈ T[C] or C1 ≡ . . . ≡ Cs ≡ . . . ≡ Cn ∈ T[C] }, which contains
the corresponding concepts on the right-hand sides. [C]T and S(C)T encode
all information din T[C] : we obtain an equivalent ontology if we replace T[C] by
{≡([C]T ), C v S(C)T )}. In the remainder of the section, we specify how to
remove all redundancy from these sets, and how to determine the shape of the
minimised partition.
    We first specify a minimal subset of the concepts in [C]T whose equivalence
has to be expressed in the partition of any equivalent ontology.

Definition 3. A set C of concepts is equivalence-reduced against T if there are
no distinct concepts C1 , C2 ∈ C such that T |= C1 ≡ C2 .
    A set [C]≡
             T is a minimal set of required equivalent concepts in T[C] if [C]T
                                                                               ≡
                                                           in
is a maximal, equivalence-reduced subset of [C]T against T[C] such that there are
no concepts C1 ∈ [C]≡                      in                  in
                     T , C2 ∈ [C]T with T[C] 6|= C1 ≡ C2 and T[C] |= C1 v C2 .

   First, we only have to consider sets of concepts that are equivalence-reduced
          in                                                          in
against T[C] , since all remaining equivalences already follow from T[C] . Second,
we exclude all concepts from [C] whose equivalence can be expressed by a single
concept inclusion axiom.
Example 4. Let T2 extend T1 with the following axioms:

       B2 u B v ∃r.∃s.A1        B3 ≡ B4 u A3 ≡ B4 u ∃t.A2 ≡ B4 u ∃s.A1

We have S(B2 u B)T2 = {∃r.∃s.A1 } and S(B3 )T2 = [B3 ]T2 = {B3 , B4 u A3 , B4 u
∃t.A2 , B4 u ∃s.A1 }. We determine a minimal set of required equivalent concepts
in T2[B3 ] . We have [B u∃r.∃s.A1 ] ;T [B2 uB] and [∃t.A2 ] ;T [B3 ], and therefore
  in
T2[B 3]
        = T1 ∪{B2 v ∃r.∃s.A1 }. The minimal set of required equivalent concepts in
T2[B3 ] is [B3 ]≡
                T2 = {B3 , B4 u∃s.A1 }. B4 uA3 and B4 u∃t.A2 are not included in this
                  in                                in
set, because T2[B    2]
                        |= B4 uA3 v B4 u∃s.A1 and T2[B 2]
                                                          |= B4 u∃t.A2 v B4 u∃s.A1 .
    In order to determine a minimised partition for T[C] , we have to distinguish
cases based on S(C)T and any minimal set of required equivalent concepts
in T[C] . We first give the definition of minimised partitions, and then explain
it in detail.

Definition 4. Let T be an acyclic EL ontology and C an EL concept. Then,
  m
T[C] = {αC } is a minimised partition for C in T iff αC is as follows, where
[C]≡ d= {C1 , . . . , Cn } is a set of minimally required concepts of C in T and
Cs = S(C)T :
1. If |[C]≡ | ≤ 1 and T[C]
                        in
                           |= C v Cs :
                                                                                         9

     – αC = ∅
2. If |[C]≡ | ≤ 1 and T[C]
                         in
                             6|= C v Cs :
     – α = C v Cs , where C m ∈ minc (C, T[C]
               m     m                              in
                                                       ) and Csm ∈ minc (Cs , T[C]
                                                                                in
                                                                                   , C m ).
3. If |[C]≡ | > 1 and T[C]
                         in
                             ∪ {C1 ≡ . . . ≡ Cn } |= C1 v Cs :
                m
     – αC = C1 ≡ . . . ≡ Cnm , where Cim ∈ minc (Ci , T[C]   in
                                                                )
          ≡              in
4. If |[C] | > 1 and T[C] ∪ {C1 ≡ . . . ≡ Cn } 6|= C1 v Cs :
     – αC = C1m ≡ . . . ≡ Cnm ≡ Csm , where Cim ∈ minc (Ci , T[C]      in
                                                                          ) and Csm ∈
                          in
        minc (C1 u Cs , T[C] ∪ {C1 ≡ . . . ≡ Cn }).
    The minimised partition only contains an equivalence axiom if there is more
than one required equivalent concept (Condition 3 and 4). Otherwise, whether we
need a concept inclusion axiom depends on whether all conceptd inclusions already
               in
follow from T[C]  or not (Condition 1 and 2). Note that Cs = S(C)T contains
all concept inclusion information for C. A minimal concept inclusion axiom is
determined as discussed in Section 3. Assume we have more than one required
                                                             in
equivalent concept and C v Cs does not follow solely from T[C]  and the required
equivalences (Condition 4). In this case, we might just add a concept inclusion
axiom as for Condition 2. However, this way we might miss the minimal solution.
Observe that the axioms C1 ≡ C2 , C1 v Cs are equivalent to C1 ≡ C2 ≡ C1 uCs .
C1 u Cs has the same size as C1 v Cs , but it can be equivalent to a concept
that is smaller to any concept inclusion axiom for C. For simplicity, we therefore
always encode concept inclusions into the equivalence axiom if Condition 4 of
the definition is fulfilled.
Theorem 2. Let T be an acyclic EL ontology, T m a minimal equivalent on-
                                    m
tology, C an EL concept and T[C]       be a minimised partition for C in T . De-
            m2
note by T       the result of replacing T[C] in T m by T[C]
                                                         m
                                                            . Then, T ≡ T m2 and
∫ (T m ) = ∫ (T m2 ).
    The result of the method minimise(T[C] , T m ) used in Algorithm 1 is calcu-
lated by checking the cases in Definition 4. Together with Theorem 2, we can
establish the correctness of our method.
Theorem 3. For any acyclic EL ontology T , Algorithm 1 terminates and re-
turns a minimal ontology T m such that T ≡ T m .
Example 5. We continue on the running example. As it turns out, (T2 )in  [B2 uB] =
T1 is already minimal. To minimise T[B2 uB] , we note that Case 2 applies, since
[B2 u B]T contains only one element. Based on the minimisation result in Ex-
ample 1, we obtain the minimised partition {B2 u B v ∃r.A3 }. For [B3 ]T , the
minimal set of required equivalent concepts is {B3 , B4 u ∃s.A1 }. Case 4 applies,
which means we have to encode remaining concept inclusions from S(B3 )T2 into
the equivalence axiom. The resulting minimised partition is {B3 ≡ B4 u ∃s.A1 ≡
B4 u ∃t.A2 }. As a result, we obtain the following minimal ontology:
            B u ∃r.∃s.A1 v ∃r.∃t.A2       ∃t.A2 v A3      A3 v ∃s.A1
               B2 u B v ∃r.A3         B3 ≡ B4 u ∃s.A1 ≡ B4 u ∃t.A2
10

5    Evaluation
We implemented the method in Java, using the OWL API [9]. We used the latest
version of ELK [11] for reasoning, since it supports incremental reasoning, a
feature required for a fast retrieval of subsumption relations for the incrementally
                                in
built minimised ontologies T[C]    . ELK was further used to verify the equivalence
of the minimised ontologies. The implementation is available online.
    The computation of minimal equivalent concepts was the most expensive
part of the minimisation. Note that for each concept, there are exponentially
many rules of type (R3) in the subsumer grammar, which makes an exhaustive
search impossible. We therefore used several optimisations. (1) We determine the
order in which rules of type (R3) are tried using a best-first strategy, where we
evaluate rules based on (a) the conjunction length, (b) the size of the concepts
corresponding to the non-terminals in the conjunction, and (c) the size of con-
cepts subsuming these non-terminals. This way, we could reduce the number of
entailment tests to 1 or 2 in most cases.Without a strategy like this, we were not
able to compute minimal equivalent concepts in almost every case. (2) For large
conjunctions, we tested whether certain conjuncts have to be included in every
equivalent concept. For small concepts, this test is more expensive than just
using a best-first search as described above, but it enabled us to minimise large
conjunctions, which is why we only used this optimisation for large concepts.
    We evaluated our method on ontologies from the NCBO BioPortal reposi-
tory [23]. From this repository, we selected all ontologies that (1) could be parsed
by the OWL API, (2) contained at least 75% EL axioms, as defined in the pre-
liminaries of this paper, (3) contained at least one existential role restriction
and one conjunction in the EL axioms. The resulting set contained 55 ontolo-
gies. We further included the versions of Galen [25] and NCI [26] from the Tones
repository [20] and SNOMED [4]. To get an idea on how our method performs
on generated ontologies, we generated 360 uniform interpolants of Galen with a
signature size of 50 using the tool Lethe [15]. The syntactical structure of these
interpolants was completely determined by the tool. For more information on
uniform interpolation, we refer to [16].
    Table 5 shows the sizes, the percentage of EL-axioms, and the percentage of
equivalence classes in the acyclic part of the input ontologies. The percentage of
the size reduced by our method is shown in the column labelled MSize1. To com-
pare against simple syntactic transformations, we modified the input ontologies
by exhaustively applying the transformation C1 v C2 , C1 v C3 ⇒ C1 v C2 uC3 .
The difference in size against these ontologies is shown in the column labelled
MSize2. We were especially interested in how the amount of complex concepts
changed. We therefore computed the sum of the sizes of complex concepts, as
well as the sum of the sizes of existential role restrictions in each ontology. The
reductions with respect to these measures are shown in the columns respectively
labelled CSize and ∃Size. The running times per ontology are shown in the last
column. We see a significant reduction for all measures, especially for complex
concepts and existential restrictions, whose accumulative size was reduced by
respectively 14.17% and 25.82% on average in the BioPortal repository.
                                                                                    11

    Ontologies     Size     EL acyclic MSize1 MSize2      CSize   ∃Size Duration
    BioPortal 58,943.8 92.1% 78.5% 12.96% 7.64% 14.17% 25.82%              443 s.
    Interpolants 2,192.4 79.8% 98.5% 31.50% 25.37% 48.78% 48.62%           1.3 s.
    Galen         13,625 95.72% 90.53% 19.05% 13.41% 21.11% 31.15%         9.4 s.
    Gene Ext.    162,950 100.0% 100.0% 31.65% 16.81% 19.68% 50.71%         483 s.
    NCI          267,916 94.7% 100.0% 13.45% 3.36% 7.24% 8.71%           1,432 s.
    SNOMED       444,473 100.0% 73.7% 23.11% 20.59% 26.04% 27.51%        3,954 s.

                           Table 1. Evaluation results.


6    Related Work
The problem of minimising EL concepts was first studied in [2], for the special
case of acyclic terminologies. The presented method is based on unfolding, which
can easy be implemented for acyclic terminologies. Due to the exponential search
space of this method, they also provide a greedy version of the algorithm that
has polynomial worst-case complexity, but does not guarantee optimal results.
The first technique for simplifying EL ontologies was presented in [22]. Whereas
our method explores the full space of equivalent concepts for minimisation, and
determines the exact shape of minimised axioms, this method minimises axioms
by replacing subconcepts based on known equivalences. Redundant axioms are
removed in a last step. While this method only runs in polynomial time, it
only guarantees optimality if the ontology does not use conjunctions. Comparing
the results of our evaluations, we see that our method provides a significant
improvement: for example, using their method, the size of NCI and Galen are
respectively reduced by 6% and 9%, whereas our method provides a reduction
by respectively 13% and 19%.


7    Summary
We presented a method for minimising acyclic EL ontologies, which might pro-
vide great help in improving existing ontologies as well as for tools that generate
ontology content. Key to our method is to structure the axioms in the ontology
into partitions that can be minimised one after another following a implicability
relation. Minimal axioms for each partition are computed by analysing inclusion
relations between concepts that syntactically occur in the ontology, and mak-
ing use of a method for minimising EL concepts with respect to an ontology.
EL concepts are minimised using a technique based on regular tree grammars.
An evaluation on realistic and generated ontologies showed that our method re-
duced the overall size as well as the complexity of ontologies significantly. An
open question is how to deal with cyclic ontologies. Whereas our approach could
be used to minimise existing partitions in cyclic ontologies, the main challenge
for cyclic ontologies is that we have to determine partitions not present in the
original ontology. Apart from cyclic EL ontologies, we are currently investigating
methods for minimising concepts in more expressive description logics.
12

References
 1. Baader, F., Calvanese, D., McGuinness, D., Nardi, D., Patel-Schneider, P.: The De-
    scription Logic Handbook: Theory, Implementation and Applications. Cambridge
    University Press (2003)
 2. Baader, F., Küsters, R., Molitor, R.: Rewriting concepts using terminologies. In:
    Proceedings of KR 2000. pp. 297–308 (2000)
 3. Carral, D., Feier, C., Grau, B.C., Hitzler, P., Horrocks, I.: EL-ifying ontologies. In:
    Proceedings of IJCAR 2014 (2014)
 4. Cornet, R., de Keizer, N.: Forty years of SNOMED: a literature review. BMC Med.
    Inf. & Decision Making 8 (2008)
 5. Ding, Y., Foo, S.: Ontology research and development. part 1—a review of ontology
    generation. Journal of information science 28(2), 123–136 (2002)
 6. Du, J., Wang, K., Shen, Y.: Towards tractable and practical ABox abduction over
    inconsistent description logic ontologies. In: Proceedings of AAAI 2015 (2015)
 7. Grau, B.C., Kharlamov, E., Zheleznyakov, D.: How to contract ontologies. In:
    Proceedings of OWLED 2012 (2012)
 8. Grimm, S., Wissmann, J.: Elimination of redundancy in ontologies. In: Proceedings
    of ESWC 2011. pp. 260–274 (2011)
 9. Horridge, M., Bechhofer, S.: The OWL API: A Java API for OWL ontologies. IOS
    Press (2011)
10. Horrocks, I., Patel-Schneider, P., van Harmelen, F.: From SHIQ and RDF to
    OWL: The making of a web ontology language. Web Semantics: Science, Services
    and Agents on the World Wide Web 1(1), pp.7–26 (2003)
11. Kazakov, Y., Klinov, P.: Incremental reasoning in OWL EL without bookkeeping.
    In: Proceedings of ISWC 2013, pp. 232–247. Springer (2013)
12. Klarman, S., Endriss, U., Schlobach, S.: ABox abduction in the description logic
    ALC. Journal of Automated Reasoning 46(1), 43–80 (2011)
13. Konev, B., Lutz, C., Ozaki, A., Wolter, F.: Exact learning of lightweight description
    logic ontologies. In: Proceedings of KR 2013. AAAI Press (2013)
14. Koopmann, P., Nikitina, N.: Computing minimal equivalence acyclic EL
    ontologies—extended version. Tech. rep., University of Oxford (2016), http:
    //users.ox.ac.uk/~coml0607/DL2016.pdf
15. Koopmann, P., Schmidt, R.A.: Lethe: Saturation-based reasoning for non-
    standard reasoning tasks. In: Proceedings of ORE 2015. pp. 23–30. CEUR-WS.org
    (2015)
16. Koopmann, P., Schmidt, R.A.: Uniform interpolation and forgetting for ALC on-
    tologies with ABoxes. In: Proceedings of AAAI 2015 (2015)
17. Lehmann, J., Hitzler, P.: Concept learning in description logics using refinement
    operators. Machine Learning 78(1-2), 203–250 (2010)
18. Ludwig, M., Konev, B.: Practical uniform interpolation and forgetting for ALC
    TBoxes with applications to logical difference. In: Proceedings of KR 2014. AAAI
    Press (2014)
19. Lutz, C., Piro, R., Wolter, F.: Description logic TBoxes: Model-theoretic charac-
    terizations and rewritability. In: Proceedings of IJCAI 2011. pp. 983–988 (2011)
20. Matentzoglu, N., Bail, S., Parsia, B.: A corpus of OWL DL ontologies. In: Pro-
    ceedings of DL 2013. vol. 1014, pp. 829–841 (2013)
21. Nikitina, N., Rudolph, S.: (Non-)succinctness of uniform interpolants of general
    terminologies in the description logic EL. Artificial Intelligence 215(0), 120–140
    (2014)
                                                                                       13

22. Nikitina, N., Schewe, S.: Simplifying description logic ontologies. In: Proceedings
    of ISWC 2013. LNCS, vol. 8219, pp. 411–426. Springer (2013)
23. Noy, N.F., Shah, N.H., Whetzel, P.L., Dai, B., Dorf, M., Griffith, N., Jonquet, C.,
    Rubin, D.L., Storey, M.A., Chute, C.G., Musen, M.A.: BioPortal: ontologies and
    integrated data resources at the click of a mouse. Nucleic Acids Research (2009)
24. Qi, G., Liu, W., Bell, D.A.: Knowledge base revision in description logics. In: Logics
    in Artificial Intelligence, pp. 386–398. Springer (2006)
25. Rector, A., Gangemi, A., Galeazzi, E., Glowinski, A., Rossi-Mori, A.: The GALEN
    CORE model schemata for anatomy: Towards a re-usable application-independent
    model of medical concepts. In: Proceedings of MIE 2094. pp. 229–233 (1994)
26. Sioutos, N., de Coronado, S., Haber, M.W., Hartel, F.W., Shaiu, W.L., Wright,
    L.W.: NCI Thesaurus: A semantic model integrating cancer-related clinical and
    molecular information. Journal of Biomedical Informatics 40(1), 30–43 (2007)
27. Wächter, T., Schroeder, M.: Semi-automated ontology generation within OBO-
    Edit. Bioinformatics 26(12), 88–96 (2010)