<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>More is Sometimes Less: Succinctness in E L</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Nadeschda Nikitina</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sven Schewe</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Liverpool</institution>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Oxford</institution>
          ,
          <country country="UK">UK</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In logics, there are many ways to represent same facts. With respect to both reasoning and cognitive complexity, some representations are significantly less efficient than others. In this paper, we investigate different means of improving the succinctness of TBoxes expressed in the lightweight description logic E L that forms a basis of some large ontologies used in practice. As a measure of size, we consider the number of references to signature elements. We investigate the problem of finding minimal equivalent representations and show that this task is NP-complete. A significant (up to triple-exponential) further improvement can be achieved by the introduction of auxiliary concept symbols. Thus, we additionally investigate the task of finding minimal representations for an ontology by extending its signature. Since arbitrary extension of the ontology with concept symbols can make the ontology unreadable, we only allow for auxiliary concepts acting as shortcuts for other concepts (E L concepts and disjunctions thereof) expressed by means of terms of the original ontology. We show that this task is also NP-complete if shortcuts represent only E L concepts, and between NP and 2P , otherwise.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>It is well-known that same facts can be represented in many ways, and that the size
of these representations can vary significantly. Determining and increasing the degree
of succinctness of a particular syntactic representation is an important, but also a very
difficult task: for the average ontology, it is almost impossible to obtain the minimal
representation without tool support. Thus, automated methods that help to assess the
current succinctness of an ontology and generate suggestions on how to increase it
would be highly valued by ontology engineers.</p>
      <p>
        In description logics [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], only few results in this direction were obtained so far.
Baader, Ku¨ ster, and Molitor [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] investigate rewriting concepts using terminologies in
the narrow sense (sets of equivalence axioms where each defined atomic concept has
exactly one definition). The investigated problem is a special case of minimizing a
knowledge base by computing a minimal equivalent knowledge base. Grimm et al. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]
propose an algorithm for eliminating semantically redundant axioms from ontologies.
In the above approach, axioms are considered as atoms that cannot be split into parts
or changed in any other way. Bienvenu [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] proposes a normal form called prime
implicates normal form for ALC ontologies, which enables fast reasoning. However, as
a side-effect of this transformation, a doubly-exponential blowup in concept size can
occur.
      </p>
      <p>
        In this paper, we investigate the succinctness for the lightweight description logic
E L [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], which is the logical underpinning of one of the tractable sub-languages (the
so-called profiles [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]) of the W3C-specified OWL Web Ontology Language [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>First, we consider the problem of finding a minimal equivalent E L representation
for a given ontology. We show that the related decision problem (is there an equivalent
ontology of size k?) is NP-complete.</p>
      <p>
        Inspired by recent results on uniform interpolation in E L [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], we additionally
consider an extended version of the problem. The above results imply that, even for the
minimal equivalent representation of an ontology, an up to triple-exponentially more
succinct representation can be obtained by extending its signature. Auxiliary concept
symbols are therefore important contributors towards the succinctness of ontologies. It
is easy to envision scenarios that demonstrate the usefulness of auxiliary concept
symbols for improving succinctness. For instance, when a complex concept C is frequently
used in the axioms of an ontology, the ontology will diminish in size when all
occurrences of C are replaced by a fresh atomic concept AC , and an axiom AC C is added
to the ontology. However, an arbitrary extension of the ontology with concept symbols
whose meaning is not obvious can certainly make the ontology unreadable. In order to
preserve comprehensiveness, we only allow for auxiliary concepts acting as shortcuts
– concepts that are defined using only terms of the original ontology. Presented with
such a shortcut concept, an ontology engineer could find an appropriate comprehensive
name for it. Otherwise, the ontology engineer has to guess the meaning of an auxiliary
concept and the chance that he approves the extension suggested by the tool would be
low.
      </p>
      <p>We demonstrate that auxiliary concept symbols acting as shortcuts for E L concepts
expressed only by means of original ontology terms can lead to an exponential
improvement of succinctness and that the corresponding decision problem (is there such a
representation of size k?) is NP-complete.</p>
      <p>Further, we show that, if we additionally allow for auxiliary concept symbols that
act as shortcuts for disjunctions of E L concepts on the left-hand side of axioms
(encodable in E L using several axioms), we can reduce the size of the representation by
a further exponent, thereby obtaining doubly-exponentially more succinct
representations.We show that the corresponding decision problem (is there such a representation
of size k?) is NP-hard and included in 2P .</p>
      <p>
        The paper is organized as follows: In Section 2, we recall the necessary
preliminaries on description logics. Section 3 demonstrates the potential of auxiliary concept
symbols acting as shortcuts for achieving a higher succinctness. In the same section, we
also introduce the basic definitions of the size of ontologies as well as the investigated
notions of equivalents with and without signature extension. In Sections 4,5, we derive
the complexity bounds for the corresponding decision problems. Finally, we conclude
and outline future work in Section 6. Further details and proofs can be found in the
extended version [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] of this paper.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Preliminaries</title>
      <p>
        We recall the basic notions in description logics [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] required in this paper. Let NC
and NR be countably infinite and mutually disjoint sets of concept symbols and role
symbols. An E L concept C is defined as
      </p>
      <p>C ::= Aj&gt;jC u Cj9r:C;
where A and r range over NC and NR, respectively. In the following, we use symbols
A; B to denote atomic concepts and C; D; E to denote arbitrary concepts. A
terminology or TBox consists of concept inclusion axioms C v D and concept equivalence
axioms C D used as a shorthand for C v D and D v C. The signature of an E L
concept C or an axiom , denoted by sig(C) or sig( ), respectively, is the set of
concept and role symbols occurring in it. To distinguish between the set of concept symbols
and the set of role symbols, we use sigC (C) and sigR(C), respectively. The signature of
a TBox T , in symbols sig(T ) (correspondingly, sigC (T ) and sigR(T )), is defined
analogously. Next, we recall the semantics of the above introduced DL constructs, which
is defined by means of interpretations. An interpretation I is given by the domain I
and a function I assigning each concept A 2 NC a subset AI of I and each role
r 2 NR a subset rI of I I . The interpretation of &gt; is fixed to I . The
interpretation of an arbitrary E L concept is defined inductively, i.e., (C u D)I = CI \ DI and
(9r:C)I = fx j (x; y) 2 rI ; y 2 CI g. An interpretation I satisfies an axiom C v D
if CI DI . I is a model of a TBox, if it satisfies all of its axioms. We say that a TBox
T entails an axiom (in symbols, T j= ), if is satisfied by all models of T . A TBox
T entails another TBox T 0, in symbols T j= T 0, if T j= for all 2 T 0. T T 0 is a
shortcut for T j= T 0 and T 0 j= T .</p>
      <p>In addition to E L, we will use disjunction on the left-hand side of axioms to obtain
more succinct representations of E L TBoxes. Note that this extension is of a notational
nature, i.e., does not give us the expressive power to represent more TBoxes than
standard E L. We define an E LD concept C as</p>
      <p>C ::= Aj&gt;jC u CjC t Cj9r:C;
where A and r range over NC and NR, respectively. The interpretation of an arbitrary
E LD concept is defined analogously to the interpretation of E L concepts with the
extension (C t D)I = CI [ DI . An E LD TBox consists of axioms that are either E L
axioms or have the form C v D, where C is an E LD concept and D is an E L concept.
Note that equivalence axioms (C D) do not contain E LD concepts, since they are a
shortcut for C v D and D v C.
3</p>
      <p>Achieving Succinctness in E L
The size of a TBox is often measured by the number of axioms contained in it. This is,
however, a very simplified view of the size in terms of both, cognitive complexity and
reasoning. In this paper, we measure the size of a concept, an axiom, or a TBox by the
number of references to signature elements.</p>
      <sec id="sec-2-1">
        <title>Definition 1. The size of an E L concept D is defined as follows:</title>
        <p>– for D 2 sig(T ), s (D) = 1;
– for D = 9r:C, s (D) = s (C) + 1 where r 2 sigR(T ) and C is an arbitrary
concept;
– for D = C1 u C2, s (D) = s (C1) + s (C2) where C1; C2 are arbitrary concepts;</p>
      </sec>
      <sec id="sec-2-2">
        <title>The size of an E L axiom or a TBox is accordingly defined as follows:</title>
        <p>– s (C1 v C2) = s (C1) + s (C2) for concepts C1; C2;
– s (C1 C2) = s (C1) + s (C2) for concepts C1; C2.</p>
        <p>– s (T ) = P 2T s ( ) for a TBox T .</p>
        <p>In practice, the suitable means that can be used to obtain a compact representation
can differ depending on the scenario. To address cases, in which a signature extension
is not feasible, we first consider the problem of finding the minimal equivalent E L
representation for a given TBox among representations that use the same signature.
Popular examples for avoidable non-succinctness are axioms that follow from other
axioms and sub-concepts that can be removed from axioms without losing any logical
consequences. While non-succinctness is easy to detect in these simple cases,
nonsuccinctness can occur in many other forms. The ontology T = fC v 9r:C; 9r:C v
9r:D; 9r:D v Dg, for instance, does neither contain any axioms that are entailed by
the remainder of the ontology, nor are there any sub-expressions that can be removed.
However, there exists a smaller representation fC v 9r:C; C v D; 9r:D v Dg of
T . The general version of the corresponding decision problem can be formulated as
follows:</p>
      </sec>
      <sec id="sec-2-3">
        <title>Definition 2 (P1). Given an E L TBox T and a natural number k, is there an E L TBox T 0 with s (T 0) k such that T 0 T .</title>
        <p>We denote the set fT 0 j T 0 T g by [T ]. We will show that this decision problem,
which does not involve any signature extensions, is already NP-complete.</p>
        <sec id="sec-2-3-1">
          <title>Extending the Signature</title>
          <p>From the user’s point of view as well as with respect to reasoning, it sometimes makes
sense to introduce fresh concept symbols, for instance, used as shortcuts for complex
concepts that occur frequently in the ontology. It can be a tedious task for an
ontology engineer to do it in an advantageous way, since, as we will show later on, the
corresponding decision problem is NP-hard. To account for scenarios, in which an
introduction of auxiliary concept symbols is desirable, in addition to the decision problem
introduced above we consider the problem of finding succinct representations
containing shortcuts. We demonstrate by means of the following example the theoretical
potential of such an extension of the signature with shortcuts: we show that it can lead to
a doubly-exponentially more succinct representation of TBoxes.</p>
          <p>Example 1. Let the sets Ci of concept descriptions be inductively defined by C0 =
fA1; A2g, Ci+1 = f9r:C1 u 9s:C2 j C1; C2 2 Cig. For a natural number n, consider
the TBox Tn = fC v B j C 2 Cn 1g.</p>
          <p>Intuitively, the sets Ci of concepts have the shape of binary trees with exponentially
many leaves, each of which can be A1 or A2. Clearly, the concepts grow exponentially
with i. Further, it holds that jCi+1j = jCij2 and consequently jCij = 2(2i). Thus, Tn
contains doubly exponentially many axioms, each of which has exponential size. While
there is no smaller equivalent representation of Tn, this TBox can easily be represented
in a more compact way using auxiliary concept symbols as shortcuts for complex E L
or E LD concept expressions.</p>
          <p>First, combining several axioms into a single axiom with a disjunction on the
lefthand side would allow us to reduce the size of Tn from double-exponential to
singleexponential: we can define C0 = fA1tA2g and thus express all elements of the set Cn 1
by means of a single concept Cn 1 that has the shape of a binary tree with the concept
A1 t A2 as leaves. The corresponding E L TBox Tn0 can be obtained by introducing the
concept B0 that represents the disjunction A1 t A2 by means of the axioms A1 v B0
and A2 v B0.</p>
          <p>Second, by using fresh concept symbols as shortcuts for complex E L concepts, Tn0
can be reduced by a further exponential as follows: we introduce concept symbols Bi
with i 2 f1; :::; n 1g to represent each Ci and obtain the following TBox Tn00:
(1)
(2)
(3)
(4)
5.</p>
          <p>Bi+1</p>
          <p>A1 v B0</p>
          <p>A2 v B0
9r:Bi u 9s:Bi</p>
          <p>
            Bn 1 v B
i &lt; n
1
As a result, the binary tree contracts into a chain of n+3 axioms j with s ( j )
In general, an extension of the signature has to be meaning-preserving in the sense
that the logical consequences expressed using only the originally given signature remain
unchanged. Formally, the corresponding “equivalence” between TBoxes with different
signatures is captured by the notion of inseparability as investigated by various authors
[
            <xref ref-type="bibr" rid="ref10 ref11 ref12 ref13 ref14 ref15">10–15</xref>
            ] in different variations. We base this work on the deductive notion of
inseparability for E L. Two E L TBoxes, T1 and T2, are inseparable w.r.t. a signature if they
have the same E L consequences whose signature is a subset of :
          </p>
        </sec>
      </sec>
      <sec id="sec-2-4">
        <title>Definition 3. Let T1 and T2 be two general E L TBoxes and a signature. T1 and T2 are -inseparable, in symbols T1 EL T2, if for all E L concepts C; D with sig(C) [ sig(D) it holds that T1 j= C v D, iff T2 j= C v D.</title>
        <p>Thus, the formal requirement for any TBox T 0 obtained from T by means of a
signature extension is that it remains -inseparable from T , where = sig(T ). We
take this into account in the subsequent definitions.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>EL-Shortcuts</title>
      <p>We now consider the problem of finding small TBoxes that are -inseparable from T
(with = sig(T )) and use explicitly defined E L shortcuts. From Example 1, we can
observe that a significantly higher effect can be achieved if shortcuts are introduced
gradually such that previously introduced shortcuts can be used to define new ones.
The definition below allows for a hierarchy of shortcuts. To ensure that shortcuts form
a hierarchy, we impose an acyclicity condition on the syntactic references within the
definitions of shortcuts.</p>
      <p>Definition 4 (E L-Shortcuts). Let T be an E L TBox with sig(T ) =
TBox T 0 is an equivalent with E L-shortcuts, in symbols T 0 2 [T ]EL, iff
. Then an E L
1. T 0 EL T ;
2. sigR(T 0) = sigR(T );
3. for all Ai 2 fA1; : : : ; Ang = sigC (T 0) n sigC (T ) there exists exactly one concept</p>
      <sec id="sec-3-1">
        <title>Ci (called definition of Ai) such that Ai Ci 2 T 0;</title>
        <p>4. for all i 2 f1; : : : ; ng it holds that sig(Ci) sig(T ) [
fAj j j &lt; ig.</p>
        <p>The introduction of E L-shortcuts corresponds to the second transformation of the
TBox given in Example 1. The corresponding decision problem can be stated as follows:</p>
      </sec>
      <sec id="sec-3-2">
        <title>Definition 5 (P2). Given an E L TBox T and a natural number k, is there an E L TBox T 0 with s (T 0) k such that T 0 2 [T ]EL.</title>
        <p>It can be shown that the equivalence relation between T and its equivalent given
in Definition 4 is stronger than deductive inseparability. It is called emulation and is
defined as follows:
Definition 6. Let T1 and T2 be two E L TBoxes. T2 emulates T1, in symbols T2 j=em T1,
iff T2 j= T1 and every model of T1 can be extended into a model of T2.
Clearly, T2 j=em T1 implies T2 EL T1 with
establishes the role of E L-shortcuts within TBoxes:
= sig(T1). The following lemma
Lemma 1. Let T ; T 0 be two E L TBoxes such that T 0 2 [T ]EL and fA1; : : : ; Ang =
sigC (T 0) n sigC (T ). Further, let Ci be the corresponding definition of Ai. Then for the
TBox Text = T [ fAi Ci j i 2 f1; : : : ; ngg it holds that Text j=em T .
Proof Sketch. Clearly, the interpretation of each Ai 62 is completely determined
by the interpretations of symbols in (due to acyclicity condition on the syntactic
references within the definitions of the shortcuts). Thus, we can extend each model of
T by assigning AiI = CiI and obtain a model of Text. Additionally, Text j= T , since
T</p>
        <p>Text.
t-Shortcuts
The second important contribution of additional vocabulary elements to succinctness
of E L TBoxes is their ability to act as a replacement for disjunction on the left-hand
side of axioms. We can obtain a corresponding E L TBox T 0 from an E LD TBox T by
replacing each disjunction C1 t ::: t Cn occurring in T by a fresh concept symbol A
and extending T with axioms C1 v A; :::; Cn v A, called definitions of A. We denote
such an E L representation of T by TEL(T ).
tu</p>
      </sec>
      <sec id="sec-3-3">
        <title>Definition 7 (t-Shortcuts). Let T be an E L TBox. Then an E L TBox T 0 is an equivalent with t-shortcuts, in symbols T 0 2 [T ]t, iff there is an E LD TBox T 00 such that T 00 T ; sig(T 00) sig(T ) and TEL(T 00) = T 0.</title>
        <p>Introduction of t-shortcuts corresponds to the first transformation in Example 1. The
corresponding decision problem is as follows:</p>
      </sec>
      <sec id="sec-3-4">
        <title>Definition 8 (P3). Given an E L TBox T and a natural number k, is there an E L TBox T 0 with s (T 0) k such that T 0 2 [T ]t.</title>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>ELD-Shortcuts</title>
      <p>If we simultaneously allow for both types of shortcuts (note that these roles can never
be played by a single concept at the same time!), we obtain the following definition of
equivalents:</p>
      <sec id="sec-4-1">
        <title>Definition 9 (E LD-Shortcuts). Let T be an E L TBox. Then an E L TBox T 0 is an equivalent with E LD-shortcuts, in symbols T 0 2 [T ]ELD, iff there is an E LD TBox T 00 such that Conditions 1-4 of Definition 4 are true for T 00 and T 0 = TEL(T 00).</title>
        <p>The corresponding decision problem is stated as follows:</p>
      </sec>
      <sec id="sec-4-2">
        <title>Definition 10 (P4). Given an E L TBox T and a natural number k, is there an E L TBox T 0 with s (T 0) k such that T 0 2 [T ]ELD.</title>
        <p>The following inclusion relations between the above introduced notions hold:
[T ]
[T ]
[T ]EL
[T ]t
[T ]ELD
[T ]ELD</p>
        <p>In the following, we show that problems P1-P2 are NP-complete, while the two
problems involving t- and E LD-shortcuts (P3-P4) are between NP and 2P .</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4 Inclusion in NP resp.</title>
      <p>P
2
In this section, we investigate the upper complexity bound for the problems P1 -P4 and
show that P1 -P2 are in NP and P3 -P4 in 2P . In case of P1, showing the upper bound
is simple:</p>
      <sec id="sec-5-1">
        <title>Theorem 1. P1 is in NP.</title>
        <p>
          Proof. We ask the non-deterministic algorithm to guess such an equivalent TBox T 0
T of size k. Then, we check T 0 T in PTIME [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. tu
        </p>
        <p>
          The inclusion of P2 in NP (and of P3 and P4 in 2P ) is less straightforward, since
deciding inseparability of E L TBoxes is known to be EXPTIME-complete and
emulation is even undecidable [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ]. For the inclusion of P3 and P4 in 2P , we make use of
the following simple lemma:
Lemma 2. Let T be an E LD TBox. TEL(T ) j=em T .
        </p>
        <p>Proof Sketch. We can transform each model of T into a model of T 0 = TEL(T ) by
successively adding AI = Sn</p>
        <p>i=1 CiI for each concept A that is introduced to replace
the disjunction Fn</p>
        <p>i=1 Ci. Additionally, it can be show that TEL(T ) j= T holds, since
fFin=1 Ci v Ag fCi v A j i 2 f1; : : : ; ngg and “v” is transitive. tu</p>
      </sec>
      <sec id="sec-5-2">
        <title>Theorem 2. P2 is in NP and P3, P4 are in</title>
        <p>2P .</p>
        <p>Proof. Let T 0 the corresponding equivalent of an E L TBox T returned by the
nondeterministic algorithm of size k. Now we consider how to verify that T 0 indeed
fulfills the requirements stated in Definitions 4,7,9.</p>
        <p>For P2, we have to verify Conditions 2-4 of Definition 4, which clearly can be done
in polynomial time. In order to verify Condition 1 (T 0 EL T ), it is sufficient to insert
the shortcut definitions into T and then test the equivalence of this extended TBox Text
and T 0 for the following reasons: By Lemma 1, Text j=em T . Due to transitivity of</p>
        <p>EL, Text T 0 implies T 0 EL T . It remains to show that Text T 0 only if T 0 EL
T . Let us assume for contradiction that there exists an inclusion axiom C v D 2 T 0
such that Text 6j= C v D. Then we can obtain concepts C0; D0 with sig(C0)[sig(D0)
by recursively replacing shortcuts by their definitions such that Text 6j= C0 v D0 and
Text j= C0 v D0. With, T EL Text we can conclude T 6 EL T 0.</p>
        <p>For P3, we need to show that there exists an E LD TBox T 00 such that sig(T 00)
sig(T ); T 0 = TEL(T 00) and T 00 T . In case there exists such T 00, we can obtain it
from T 0 by replacing the introduced concept symbols by the corresponding disjunctions
of definitions. As T 0 and T 00 are -inseparable with = sig(T 00) by Lemma 2, it
suffices to show T 0 j= T and T j= T 00. The first is standard reasoning in E L and
can clearly be performed in polynomial time. The refutation of the latter, i.e., showing
T 6j= T 00, can be done in NP: if T 6j= T 00, then, for some concretization C v D of some
axiom of T 0 (where a concretization is simply the replacement of each disjunction by
one of its disjuncts) T 6j= C v D holds. A non-deterministic machine can simply guess
the axiom and its concretization. Consequently, testing T j= T 00 is in CONP and P3
thus in 2P . (Note that it suffices to call the oracle once at the end.)</p>
        <p>For P4, we can simply combine these tests. tu
Clearly, the results of this section also apply to tractable extensions of E L.
5</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>NP-Hardness of P1- P4</title>
      <p>In this section, we show the NP-hardness of problems P1 through P4 by a reduction
from the set cover problem, which is one of the standard NP-complete problems. For
a given set S = fS1; S2; : : : ; Sng with carrier set S = Sin=1 Si, a cover C S is a
subset of S , such that the union of the sets in C covers S, i.e., S = SC2C C.</p>
      <p>The set cover problem is the problem to determine, for a given set S =
fS1; S2; : : : ; Sng and a given integer k, if there is a cover C of S with at most k jCj
elements.</p>
      <p>We will use a restricted version of the set cover problem, which we call the dense
set cover problem (DSCP). In the dense set cover problem, we require that
– neither the carrier set S nor the empty set is in S,
– all singleton subsets (sets with exactly one element) of S are in S, and
– if a non-singleton set S is in S, so is some subset S0 S, which contains only one
element less than S (jS r S0j = 1).</p>
      <p>Lemma 3. The dense set cover problem is NP-complete.</p>
      <p>Proof. Inclusion in NP is inherited from the set cover problem, of which it is a special
instance.</p>
      <p>We now reduce solving the set cover problem to solving the dense set cover
problem. We start with a set cover problem for a given S and k, and first check if the carrier
set S is contained in S (if so, the problem is solved). If it is not the case, we identify
the size l of the largest set in S, initialise S0 to S and extend S0 using the following
algorithm:
– while l &gt; 1 do
for all S 2 S0, choose an s 2 S and join S0 with S r fsg
decrement l by one.</p>
      <p>After this, we join S with ffsg j s 2 Sg, and remove the empty set from S if applicable.
Note that S0 can easily be constructed in polynomial time. Now we show that there is
a cover C of size k of S exactly if there is a cover C0 of size k of S0. W.l.o.g., we
can assume that ; 62 C, since we always obtain a cover from any cover C by removing
; from it. Since S S0 [ f;g, any cover of S is a cover of S0. Let C0 be a cover of
size k of S0. We can construct a cover C of S by replacing each S0 2 C0 by the
corresponding superset S 2 S. tu</p>
      <p>Given the above NP-completeness result, we show that the size of minimal
equivalents specified in P1 through P4 is a linear function of the size of the minimal cover. To
this end, we use the lemma below to obtain a lower bound on the size of equivalents.
Intuitively, it states that for each entailed non-trivial equivalence C A, the TBox
must contain at least one axiom that is at least as large as C0 A for some C0 with
T j= C C0:
Lemma 4. Let T be an E L TBox, A 2 sig(T ) and C; D E L concepts such that T j=
C A, T j= A v D (the latter is required for induction). Then, one of the following
is true:
1. A is a conjunct of C (including the case C = A);
2. there exists an E L concept C0 such that T j= C C0 and C0 ./ A 2 T or</p>
      <sec id="sec-6-1">
        <title>C0 ./ A u D0 2 T for some ./2 f ; vg and some concept D0.</title>
        <p>
          Proof Sketch. For the full version of the proof, see extended version of the paper. We
use the sound and complete proof system for general subsumption in E L terminologies
introduced in [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] and prove the lemma by induction on the depth of the derivation of
C v A u D. We assume that the proof has minimal depth and consider the possible
rules that could have been applied last to derive C v A u D. In each case the lemma
holds. tu
The encoding of the dense set cover problem as P1-P4 is as follows.
        </p>
        <p>Consider an instance of the dense set cover problem with the carrier set A =
fB1; : : : ; Bng, the set S = fA1; : : : ; Am; fB1g; : : : ; fBngg of subsets that can be
used to form a cover. By interpreting the set and element names as atomic concepts, we
can construct TSbase as follows:</p>
        <p>TSbase = fA00</p>
        <p>A0 u B j A00; A0 2 S; B 2 A; A00 = A0 [ fBg; A00 6= A0g:</p>
        <p>Observe that the size odfTSbase is at least 3m. Clearly, TSbase j= Ai dB2Ai B.
Let TS = TSbase [ fA B2A Bg. We establish the connection between the size of
TS equivalents and the size of the cover of S as follows:</p>
      </sec>
      <sec id="sec-6-2">
        <title>Lemma 5. TS has an equivalent (as specified in P1-P4) of size s (TSbase) + k + 1 if, and only if, S has a cover of size k.</title>
        <p>Proof. For the if-direction, assume that S has a cover of size k. We construct TS0 of
size s (TSbase) + k + 1 as follows: TS0 = TSbase [ fA dA02C A0g. Clearly, TS0 TS .
Note that TS0 2 [TS ] and, therefore, also TS0 2 [TS ]t; [TS ]EL; [TS ]ELD.</p>
        <p>For the only-if-direction, we assume that k is minimal and argue that no equivalent
T 0 2 [TS ]ELD of size s (TSbase) + k can exist. Assume that T is a minimal TBox
with T 2 [TS ]ELD. With the observation, that the m + n atomic concepts that
represent elements of S are pairwise not equivalent with each other or the concept A that
represents the carrier set, we can conclude that no two atomic concepts are equivalent.
From Lemma 4 it follows that, for each Ai with i 2 f1; : : : ; mg, there is an axiom
Ci Ci0 2 T or Ci v Ci0 2 T such that T j= Ci Ai and Ai is a conjunct of Ci0
or Ai = C0. Since there are no equivalent atomic concepts and Ci 6= Ai due to the
i
minimality of T , the size of each such axiom is at least 3 and none of these axioms
coincide. We will later make use of two obvious properties (*) of these axioms:
1. since TS 6j= Ai v A, A cannot occur as a conjunct of Ci or as a conjunct of C0;
i
2. these axioms cannot be (parts of) the definitions of atomic concepts representing
disjunctions (as Ai is a conjunct of Ci0) or shortcuts (T j= Ci Ai).</p>
        <p>Finally, we estimate the size of the remaining axioms and show that their cumulative
size is &gt; k. It also follows from Lemma 4 that there exists an axiom C C0 2 T or
C v C0 2 T such that T j= C A and A is a conjunct of C0 or A = C0. It holds
that T j= C dB2A B. We also know that for no proper subset S0 ( A it holds that
T j= dB2S0 B v C.</p>
        <p>If C does not contain any shortcuts or disjunction replacements, then we have found
a cover of S and the size of the axiom must be k +1. Assume that it contains auxiliary
shortcut and disjunction concepts and let C0 be the concept obtained by replacing all
these concepts recursively in C until sig(C0) sig(TS ). It is clear that the cumulative
size of the corresponding definitions for these auxiliary concept symbols cannot be
smaller than the size of C0, which does not contain any concept symbols twice. Since
T j= C0 C, we have once more found a cover of S and the size of this axiom plus the
size of definition axioms must be k + 1. From the two properties (*) of the axioms
definition Ai we can conclude that none of these axioms can coincide. Thus, the overall
size of T must be s (TSbase) + k + 1. tu
Theorem 4. P1 and P2 are NP-complete.
6</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>Summary and Outlook</title>
      <p>tu
Theorem 3. P1 through P4 are NP-hard.</p>
      <p>Proof. The theorem is an immediate consequence of Lemma 5. It establishes that all
four problems can be used to solve the dense set cover problem, which is NP-complete
according to Lemma 3. tu</p>
      <p>Thus, we establish completeness of the first two problems:
In this paper, we have considered the problem of finding minimal equivalent
representations for ontologies expressed in the lightweight description logic E L that forms a basis
of some large ontologies used in practice. We have shown that the task of finding such
a representation (or rather: its related decision problem) is NP-complete.</p>
      <p>In addition to studying the problem of computing minimal equivalent TBoxes, we
investigated the task of finding minimal representations for ontologies under signature
extension. We considered scenarios, where auxiliary concepts are allowed to be used
as shortcuts for complex E L concepts. We showed that this task is also NP-complete.
For the corresponding decision problem with auxiliary concepts acting as shortcuts for
a disjunction of E L concepts, we have established NP-hardness and inclusion in 2P .
The same bounds hold for the combination of the two ways of extending the signature.</p>
      <p>
        There are various natural extensions of this work. The results obtained within this
paper can easily be transferred to the context of ontology reuse, where a sub-signature
becomes obsolete in a new context and a compact representation of the facts about
the remaining terms is sought-after. Recent results on ontology reuse show that neither
uniform interpolation nor standard module extraction guarantee the optimality of the
extracted ontology [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ].
      </p>
      <p>Further, a question that naturally arises is that of tight complexity bounds when
shortcuts for disjunctions are allowed for. Another target would be the complexity of
identifying minimal TBoxes by the means of an arbitrary inseparable TBox, where we
waive the requirement of explicitly defining the meaning of new concepts. An
EXPTIME upper bound for this problem is implied from the fact that the set of candidate
TBoxes is exponential, and so is the general test for inseparability in E L.</p>
      <p>Minimizing representations is, of course, an interesting problem for all logics, and
similar questions can (and should) be asked for more expressive ontology languages.</p>
      <p>While the concern of this paper is the complexity of the above problems, a natural
follow-up task would be to develop efficient algorithms and tools that support ontology
engineers in the development of succinct representations of their ontologies. Natural
targets would be good heuristics and efficient approximations. For the latter, our proofs
contain the bad news that there is no linear approximation scheme, as the set cover
problem has no logarithmic approximations unless P equals NP.</p>
      <p>Finally, from practical point of view, it would be very interesting to investigate the
potential improvement of succinctness in existing medical ontologies. Such a case study
can be carried out after the corresponding tool support becomes available.
Acknowledgments This work is supported by the EPSRC grant EP/H046623/1 ’Synthesis and
Verification in Markov Game Structures’ and the University of Oxford.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Baader</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Calvanese</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McGuinness</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nardi</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Patel-Schneider</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <source>The Description Logic Handbook: Theory, Implementation and Applications</source>
          . Cambridge University Press (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Baader</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          , Ku¨sters, R.,
          <string-name>
            <surname>Molitor</surname>
          </string-name>
          , R.:
          <article-title>Rewriting concepts using terminologies</article-title>
          .
          <source>In: Proceedings of the Seventh International Conference on Principles of Knowledge Representation and Reasoning (KR</source>
          <year>2000</year>
          ).
          <article-title>(</article-title>
          <year>2000</year>
          )
          <fpage>297</fpage>
          -
          <lpage>308</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Grimm</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wissmann</surname>
          </string-name>
          , J.:
          <article-title>Elimination of redundancy in ontologies</article-title>
          .
          <source>In: Proceedings of the 8th Extended Semantic Web Conference (ESWC</source>
          <year>2011</year>
          ).
          <article-title>(</article-title>
          <year>2011</year>
          )
          <fpage>260</fpage>
          -
          <lpage>274</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Bienvenu</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Prime implicates and prime implicants: From propositional to modal logic</article-title>
          .
          <source>Journal of Artificial Intelligence Research (JAIR) 36</source>
          (
          <year>2009</year>
          )
          <fpage>71</fpage>
          -
          <lpage>128</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Baader</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brandt</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lutz</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Pushing the E L envelope</article-title>
          .
          <source>In: Proceedings of the 19th International Joint Conference on Artificial Intelligence (IJCAI</source>
          <year>2005</year>
          ).
          <article-title>(</article-title>
          <year>2005</year>
          )
          <fpage>364</fpage>
          -
          <lpage>369</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Motik</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cuenca Grau</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Horrocks</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fokoue</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lutz</surname>
          </string-name>
          , C., eds.
          <source>: OWL 2 Web Ontology Language: Profiles. W3C Recommendation</source>
          (
          <issue>27 October 2009</issue>
          ) Available at http: //www.w3.org/TR/owl2-profiles/.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7. OWL Working Group, W.:
          <article-title>OWL 2 Web Ontology Language: Document Overview</article-title>
          . W3C
          <source>Recommendation</source>
          (
          <issue>27 October 2009</issue>
          ) Available at http://www.w3.org/TR/owl2-overview/.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Nikitina</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rudolph</surname>
            ,
            <given-names>S.:</given-names>
          </string-name>
          <article-title>ExpExpExplosion: Uniform interpolation in general EL terminologies</article-title>
          .
          <source>In: Proceedings of the 20th European Conference on Artificial Intelligence (ECAI</source>
          <year>2012</year>
          ).
          <article-title>(</article-title>
          <year>2012</year>
          )
          <fpage>618</fpage>
          -
          <lpage>623</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Nikitina</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schewe</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          : More is Sometimes Less: Succinctness in E L.
          <string-name>
            <surname>Techreport</surname>
          </string-name>
          , Department of Computer Science, University of Oxford, Oxford (
          <year>Mai 2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Ghilardi</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lutz</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wolter</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Did I Damage my Ontology? A Case for Conservative Extensions in Description Logics</article-title>
          .
          <source>In: Proceedings of the 10th International Conference on the Principles of Knowledge Representation and Reasoning (KR</source>
          <year>2006</year>
          ).
          <article-title>(</article-title>
          <year>2006</year>
          )
          <fpage>187</fpage>
          -
          <lpage>197</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Lutz</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Walther</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wolter</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Conservative extensions in expressive description logics</article-title>
          .
          <source>In: Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI</source>
          <year>2007</year>
          ).
          <article-title>(</article-title>
          <year>2007</year>
          )
          <fpage>453</fpage>
          -
          <lpage>458</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Konev</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lutz</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Walther</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wolter</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Semantic modularity and module extraction in description logics</article-title>
          .
          <source>In: Proceedings of the 18th European Conference on Artificial Intelligence (ECAI</source>
          <year>2008</year>
          ).
          <article-title>(</article-title>
          <year>2008</year>
          )
          <fpage>55</fpage>
          -
          <lpage>59</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Konev</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lutz</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Walther</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wolter</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Formal properties of modularisation</article-title>
          . In Stuckenschmidt, H.,
          <string-name>
            <surname>Parent</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Spaccapietra</surname>
          </string-name>
          , S., eds.: Modular Ontologies. Springer-Verlag (
          <year>2009</year>
          )
          <fpage>25</fpage>
          -
          <lpage>66</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Lutz</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wolter</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Deciding inseparability and conservative extensions in the description logic E L</article-title>
          .
          <source>Journal of Symbolic Computation</source>
          <volume>45</volume>
          (
          <issue>2</issue>
          ) (
          <year>2010</year>
          ) pp.
          <fpage>194</fpage>
          -
          <lpage>228</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Kontchakov</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wolter</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zakharyaschev</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Can you tell the difference between dl-lite ontologies?</article-title>
          <source>In: Proceedings of the 11th International Conference on Principles of Knowledge Representation and Reasoning (KR</source>
          <year>2008</year>
          ).
          <article-title>(</article-title>
          <year>2008</year>
          )
          <fpage>285</fpage>
          -
          <lpage>295</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Nikitina</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Glimm</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Hitting the sweetspot: Economic rewriting of knowledge bases</article-title>
          .
          <source>In: Proceedings of the 11th International Semantic Web Conference (ISWC</source>
          <year>2012</year>
          ).
          <article-title>(</article-title>
          <year>2012</year>
          )
          <fpage>394</fpage>
          -
          <lpage>409</lpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>