<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>UEL: Unification Solver for E L</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Franz Baader</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Stefan Borgwardt</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Julian Mendez</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Barbara Morawska?</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>baader</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>stefborg</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>mendez</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>morawska}@tcs.inf.tu-dresden.de</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Theoretical Computer Science</institution>
          ,
          <addr-line>TU Dresden</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>UEL is a system that computes unifiers for unification problems formulated in the description logic EL. EL is a description logic with restricted expressivity, but which is still expressive enough for the formal representation of biomedical ontologies, such as the large medical ontology SNOMED CT. We propose to use UEL as a tool to detect redundancies in such ontologies by computing unifiers of two formal concepts suspected of expressing the same concept of the application domain. UEL provides access to two different unification algorithms and can be used as a plug-in of the popular ontology editor Protégé, or stand-alone.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Motivation</title>
      <p>
        The description logic (DL) E L, which offers the concept constructors conjunction
(u), existential restriction (9r:C), and the top concept (&gt;), has recently drawn
considerable attention since, on the one hand, important inference problems such
as the subsumption problem are polynomial in E L [
        <xref ref-type="bibr" rid="ref1 ref10 ref4">1,10,4</xref>
        ]. On the other hand,
though quite inexpressive, E L can be used to define biomedical ontologies, such
as the large medical ontology SNOMED CT.1
      </p>
      <p>
        Unification in DLs has been proposed in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] as a novel inference service that
can, for instance, be used to detect redundancies in ontologies. For example,
assume that one developer of a medical ontology defines the concept of a patient
with severe head injury as
      </p>
      <sec id="sec-1-1">
        <title>Patient u 9 nding:(Head_injury u 9severity:Severe);</title>
        <p>whereas another one represents it as</p>
      </sec>
      <sec id="sec-1-2">
        <title>Patient u 9 nding:(Severe_injury u 9 nding_site:Head):</title>
        <p>These two concept descriptions are not equivalent, but they are nevertheless
meant to represent the same concept. They can obviously be made equivalent
by treating the concept names Head_injury and Severe_injury as variables, and
substituting the first one by Injury u 9 nding_site:Head and the second one by
Injury u 9severity:Severe. In this case, we say that the descriptions are unifiable,
and call the substitution that makes them equivalent a unifier. Intuitively, such
? Supported by DFG under grant BA 1122/14-1
1 see http://www.ihtsdo.org/snomed-ct/
(1)
(2)
Name</p>
        <p>Syntax
concept name A AI
role name r rI I I
top &gt; &gt;I = I
conjunction C u D (C u D)I = CI \ DI
existential restriction 9r:C (9r:C)I = fx j 9y : (x; y) 2 rI ^ y 2 CI g
concept definition</p>
        <p>A</p>
        <p>C</p>
        <p>AI = CI
Semantics</p>
        <p>I
a unifier proposes definitions for the concept names that are used as variables: in
our example, we know that, if we define Head_injury as Injuryu9 nding_site:Head
and Severe_injury as Injury u 9severity:Severe, then the two concept descriptions
(1) and (2) are equivalent w.r.t. these definitions. Of course, this example was
constructed such that the unifier actually provides sensible definitions for the
concept names used as variables. In general, the existence of a unifier only says
that there is a structural similarity between the two concepts. The developer that
uses unification as a tool for finding redundancies in an ontology or between two
different ontologies needs to inspect the unifier(s) to see whether the suggested
definitions really make sense.</p>
        <p>
          In [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] it was shown that unification in E L is an NP-complete problem.
Basically, this problem is in NP since every solvable unification problem has a “local”
unifier, i.e., one built from parts of the unification problem. The NP algorithm
introduced in [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] is a brutal “guess and then test” algorithm, which guesses a
local substitution and then checks whether it is a unifier. In [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ], a more practical
rule-based E L-unification algorithm was introduced, which tries to transform
the given unification problems into a solved form, and makes nondeterministic
decisions only if triggered by the problem. Finally, the paper [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] proposes a third
algorithm, which encodes the unification problem into a set of propositional
clauses and then solves it using an existing highly optimized SAT solver.
        </p>
        <p>
          Version 1.0.0 of our system UEL2 used only the SAT translation to solve
unification problems. This approach allowed us to get a fast unification algorithm—
the unification problem only has to be translated into a propositional formula
and the SAT solver is used to actually solve the problem. In contrast, for the
implementation of the rule-based algorithm from [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] we had to find efficient
methods to deal with the nondeterminism ourselves. As of version 1.2.0, UEL
includes both implementations, as well as a variant of the SAT translation that
tries to compute only small unifiers.
2
        </p>
        <p>E L and Unification in E L
In order to explain what the algorithms implemented in UEL actually compute,
we need to recall the relevant definitions and results for unification in E L.
2 All versions of this system are available at http://uel.sourceforge.net.</p>
        <p>Starting with a finite set NC of concept names and a finite set NR of role
names, E L-concept descriptions are built from concept names using the
constructors conjunction (C u D), existential restriction (9r:C for every r 2 NR),
and top (&gt;). On the semantic side, concept descriptions are interpreted as sets.
To be more precise, an interpretation I = ( I ; I ) consists of a non-empty
domain I and an interpretation function I that maps concept names to subsets
of I and role names to binary relations over I . This function is extended to
concept descriptions as shown in the semantics column of Table 1.</p>
        <p>A concept definition is of the form A C for a concept name A and a
concept description C. A TBox T is a finite set of concept definitions such that
no concept name occurs more than once on the left-hand side of a definition in
T . The TBox T is called acyclic if there are no cyclic dependencies between its
concept definitions. Given a TBox T , we call a concept name A a defined concept
if it occurs as the left-side of a concept definition A C in T . All other concept
names are called primitive concepts. An interpretation I is a model of a TBox
T if AI = CI holds for all definitions A C in T .</p>
        <p>Subsumption asks whether a given concept description C is a subconcept of
another concept description D: C is subsumed by D w.r.t. T (C vT D) if every
model I of T satisfies CI DI . We say that C is equivalent to D w.r.t. T
(C T D) if C vT D and D vT C. For the empty TBox, we write C v D and
C D instead of C v; D and C ; D, and simply talk about subsumption and
equivalence (without saying “w.r.t. ;”).</p>
        <p>In order to define unification, we partition the set NC of concept names into
a set Nv of concept variables (which may be replaced by substitutions) and
a set Nc of concept constants (which must not be replaced by substitutions).
Intuitively, Nv are the concept names that have possibly been given another
name or been specified in more detail in another concept description describing
the same notion. A substitution maps every variable to a concept description.
It can be extended to concept descriptions in the usual way.</p>
        <p>
          Unification in E L was first considered w.r.t. the empty TBox [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. In this
setting, an E L-unification problem is a finite set = fC1 ? D1; : : : ; Cn ? Dng
of equations. A substitution is a unifier of if solves all the equations in
, i.e., if (C1) (D1); : : : ; (Cn) (Dn). We say that is solvable if it
has a unifier. Without loss of generality, we can assume that the unification
problem is flat, i.e., that all concept descriptions occurring in it are conjunctions
of concept names or existential restrictions of the form 9r:A with A 2 NC . If a
unification problem is not flat, we can transform it into a flat unification problem
by introducing auxiliary variables.
        </p>
        <p>As mentioned before, the main reason for solvability of unification in E L to be
in NP is that any solvable unification problem has a local unifier. Basically, any
unification problem determines a polynomial number of so-called non-variable
atoms, which are concept constants or existential restrictions of the form 9r:A
for a role name r and a concept constant or variable A. An assignment S maps
every concept variable X to a subset SX of the set Atnv of non-variable atoms
of . Such an assignment induces a relation &gt;S on Nv, which is the transitive
closure of f(X; Y ) 2 Nv Nv j Y occurs in an element of SX g: We call the
assignment S acyclic if &gt;S is irreflexive (and thus a strict partial order). Any
acyclic assignment S induces a unique substitution S, which can be defined by
induction along &gt;S:
– If X 2 Nv is minimal w.r.t. &gt;S, then we define S(X) := dD2SX D.
– Assume that S(Y ) is already defined for all Y such that X &gt;S Y . Then we
define S(X) := dD2SX S(D):
We call a substitution local if it is of this form, i.e., if there is an acyclic
assignment S such that = S. Consequently, one can enumerate (or guess, in
a nondeterministic machine) all acyclic assignments and then check whether any
of them induces a substitution that is a unifier. Using this brute-force approach,
in general many local substitutions will be generated that only in the subsequent
check turn out not to be unifiers.</p>
        <p>
          The fact that any solvable unification problem has a local unifier was shown
in [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] by proving that such a problem always has a minimal unifier and that
every minimal unifier is equivalent to a local unifier. Minimality and equivalence
of unifiers are determined by the following order on substitutions. For two
substitutions and , we define iff (X) v (X) holds for all variables
X. We say that a unifier of a unification problem is minimal if there is no
unifier of such that and . Two unifiers , are equivalent iff
and . We introduce a similar order on assignments and write
S S0 iff SX SX0 holds for all X 2 Nv. We call an assignment S minimal
if S is a unifier of and there is no assignment S0 different from S such that
S0 is a unifier of and S S0. The following is easy to show: if S S0, then
S S0 . As shown in [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ], this implies that all minimal unifiers are induced
by minimal assignments, but the opposite need not hold. Computing only the
minimal assignments is thus a first step towards computing only minimal unifiers.
Computig only minimal unifiers is desirable since they are those among the local
unifiers that substitute the variables by smaller concept descriptions and their
corresponding assignments do not contain “irrelevant” non-variable atoms.
        </p>
        <p>
          In [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ], unification w.r.t. an acyclic TBox T was introduced. In this setting,
the concept variables are a subset of the primitive concepts of T , and
substitutions are applied both to the concept descriptions in the unification problem
and to the right-hand sides of the definitions in T . To deal with such unification
problems, one does not need to develop a new algorithm. In fact, by viewing the
defined concepts of T as variables, one can turn T into a unification problem,
which one simply adds to the given unification problem . As shown in [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ], there
is a 1–1-correspondence between the unifiers of w.r.t. T and the unifiers of
this extended unification problem.
        </p>
        <p>We will now describe the two unification algorithms implemented in UEL
1.2.0. Both algorithms have in common that they generate acyclic assignments,
and thus output only local unifiers. They follow two different approaches to
reduce the amount of blind guessing of the brute-force approach.</p>
        <sec id="sec-1-2-1">
          <title>The SAT Translation</title>
          <p>
            Instead of blindly generating all local substitutions, the reduction to the
propositional satisfiability problem introduced in [
            <xref ref-type="bibr" rid="ref7">7</xref>
            ] ensures that only assignments
that induce unifiers are generated. The set of propositional clauses C( )
generated by the reduction contains two kinds of propositional letters: [A v B]
for A; B 2 Atnv3 and [X &gt; Y ] for concept variables X; Y . Intuitively, setting
[A v B] = 1 means that the local substitution S induced by the corresponding
assignment S satisfies S (A) v S (B), and setting [X &gt; Y ] = 1 means that
X &gt;S Y . The clauses in C( ) are such that has a unifier iff C( ) is
satisfiable. In particular, any propositional valuation satisfying C( ) defines an
assignment S with SX := fA j ([X v A]) = 1; A 2 Atnvg; which induces a
local unifier of . Conversely, any local unifier of can be obtained in this way.
Thus, by generating all propositional valuations satisfying C( ) we can generate
all local unifiers of . The main advantage of this algorithm is the speed of the
used SAT solver. However, the number of generated clauses is in general cubic in
the size of the unification problem. Thus, the translation will generate huge SAT
instances even from moderately sized unification problems, which might lead to
memory problems even before the SAT solver is applied.
          </p>
        </sec>
        <sec id="sec-1-2-2">
          <title>The Rule-Based Algorithm</title>
          <p>
            The second unification algorithm implemented in UEL is based on the rule-based
algorithm from [
            <xref ref-type="bibr" rid="ref8">8</xref>
            ]. However, internally it uses subsumptions of the form C v? D
instead of equivalences. This is without loss of generality since any equivalence
C ? D can be expressed by the two subsumptions C v? D and D v? C. This
variant of the algorithm has been described in more detail in [
            <xref ref-type="bibr" rid="ref2">2</xref>
            ].
          </p>
          <p>The algorithm generates local unifiers by maintaining a set of current
subsumptions and a current acyclic assignment S, both of which are extended by
applying certain rules. Initially, all subsumptions are marked as unsolved and
rules apply only to unsolved subsumptions and mark them as solved. In the
process, new subsumptions may be generated and the current assignment may be
extended by adding non-variable atoms to some of the sets SX . Once all
subsumptions are solved, the substitution S induced by the current assignment is
a unifier of the unification problem.</p>
          <p>Some of the rules are don’t-know nondeterministic, i.e., they might apply in
different ways to the same subsumption, but we do not know beforehand which
application is the correct one. Also the choice between several applicable
nondeterministic rules is don’t-know nondeterministic. The algorithm additionally
employs several eager rules that are always applied first and leave no choice in
their application. They are mainly there to reduce the number of
nondeterministic choices the algorithm has to make.</p>
          <p>
            In contrast to the SAT reduction, this rule-based algorithm does not generate
all local unifiers. A non-variable atom D will only be put in the set SX if there
3 The reduction in [
            <xref ref-type="bibr" rid="ref7">7</xref>
            ] actually uses variables [A 6v B], but it turned out that the
reduction using non-negated subsumptions behaves better in practice.
is a reason to do so in the unification problem, although there may be a local
unifier whose assignment S0 contains D in SX0 . However, it was shown in [
            <xref ref-type="bibr" rid="ref8">8</xref>
            ] that
it can generate all minimal unifiers (up to equivalence). The converse is not true,
i.e., it might also generate local unifiers that are not minimal.
          </p>
          <p>The main advantage of this algorithm is that non-variable atoms are only
added to the assignment if this is required by the unification problem, and thus
fewer unifiers of relatively small size are generated. The space requirements are
also quite low, since the current set of subsumptions and the current assignment
are of size at most quadratic in the input size. The downside of this algorithm
is that, without any of the optimizations implemented in modern SAT solvers,
the algorithm naively traverses the search space. However, implementing such
optimizations to improve its efficiency requires a huge effort.
3</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>Stuff not Mentioned in the Theoretical Papers</title>
      <p>When implementing UEL, we had to deal with several issues that are abstracted
away in the theoretical papers describing unification algorithms for EL. Most of
them are not specific to the used unification algorithm.</p>
      <p>Primitive definitions In addition to concept definitions, as introduced above,
biomedical ontologies often contain so-called primitive definitions A v C where
A is a concept name and C is a concept description. Models I of A v C need
to satisfy AI CI . Thus, primitive definitions formulate necessary conditions
for concept membership, but these conditions are not sufficient. SNOMED CT
contains about 350,000 primitive definitions and only 40,000 concept definitions.</p>
      <p>
        By using a trick first introduced by Nebel [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], primitive definitions A v C
can be turned into concept definitions A C u A_UNDEF, where A_UNDEF
is a new concept name that stands for the undefined part of the definition of A.
In the resulting acyclic TBox, these new concept names are primitive concepts,
and thus can be declared to be variables. In this case, a unifier suggest how to
complete the definition of A by providing the concept description (A_UNDEF).
Unifiers as acyclic TBoxes Given an acyclic assignment S computed by one
of the unification algorithms, our system UEL actually does not produce the
corresponding local unifier S as output, but rather the acyclic TBox TS :=
fX dD2SX D j X 2 Nvg. This TBox solves the input unification problem
w.r.t. T in the sense that C T [TS D holds for all equations C ? D in .
This is actually what the developer that employs unification wants to know: how
must the concept variables be defined such that the concept descriptions in the
equations become equivalent? Another advantage of this representation of the
output is that the size of S and thus of TS is polynomial in the size of the input
and T , while the size of the concept descriptions S(X) may be exponential
in this size. In the following, we will also call the TBoxes TS unifiers.
Internal variables As mentioned before, the unification algorithms for EL
assume that the unification problem is first transformed into a flat form. This form
can easily be generated by introducing auxiliary variables. These new variables
have system-generated names, which do not make sense to the user. Thus, they
should not show up in the output acyclic TBox TS . By replacing such auxiliary
defined concepts in TS by their definitions as long as auxiliary names occur, we
can transform TS into an acyclic TBox that satisfies this requirement, actually
without causing an exponential blow-up of the size of the TBox.
Reachable subontology As mentioned above, acyclic TBoxes are treated by
viewing them as part of the unification problem. For very large TBoxes like
SNOMED CT, adding the whole TBox to the unification problem is neither
viable nor necessary. In fact, it is sufficient to add the reachable part of the
TBox, i.e., the definitions on which the concept descriptions in the unification
problem depend. This reachable part is usually rather small, even for very large
ontologies.
      </p>
      <p>Enumeration of unifiers While computing a single unifier is usually quite
fast, computing all of them can take much longer. We alleviate this problem by
enabling the user to compute and then inspect one unifier at a time. If this unifier
makes sense, i.e., suggests reasonable definitions for the variables, then the user
can stop. Otherwise, the computation of the next unifier can be initiated.</p>
      <p>For the rule-based algorithm, we output all produced unifiers by a depth-first
traversal of the search space. This means that we apply applicable
nondeterministic rules as long as this is possible. If there are no more unsolved subsumptions,
we return the corresponding unifier. If no rule is applicable, we backtrack and
apply the last nondeterministic rule in a different way or apply another rule.</p>
      <p>For the SAT reduction, the approach is different. If the SAT solver has
provided a satisfying propositional valuation, we can add a clause to the SAT
problem that prevents the re-computation of this assignment, and call the SAT solver
with this new SAT instance. If the SAT solver determines that the current set
of clauses is unsatisfiable, then there are no more unifiers.</p>
      <p>Computing only minimal assignments The satisfying valuations of the
propositional formula generated by the SAT translation yield all local unifiers
of the unification problem. Depending on how many concept names are turned
into variables, there can be many local unifiers. To address this problem, we
implemented a variant of the SAT reduction that computes only the local unifiers
induced by minimal assignments w.r.t. , which are often significantly fewer.</p>
      <p>
        This approach is still complete in the sense that it computes all minimal
unifiers (see Section 2). It works by transforming the SAT problem into a special
case of a partial MAX-SAT problem [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], where in addition to the clauses that
are to be satisfied one can specify a subset Vmin of the propositional variables.
The goal is to minimize the number of variables from the set Vmin that are set
to 1. By setting Vmin = f[X v A] j X 2 Nv; A 2 Atnvg, we ensure that the first
valuation returned by the MAX-SAT solver induces a minimal assignment S.4
4 If f(X; A) j X 2 Nv; A 2 SX g has minimal cardinality among all assignments that
induce a unifier of the unification problem, then it is also minimal w.r.t. set inclusion.
To ensure that subsequent calls to the MAX-SAT solver return no assignments
larger than S, we add a clause to the problem instance that requires that at
least one of the variables of the form [X v A] with A 2 SX should be set to 0.
      </p>
      <p>Of course, the reformulation of the problem as a MAX-SAT instance adds
an overhead to the computation time. However, this approach guarantees that
no “superfluous,” i.e., non-minimal, assignments are presented to the user.
4</p>
    </sec>
    <sec id="sec-3">
      <title>The User Interface</title>
      <p>UEL was implemented in Java 1.6 and is compatible with Java 1.7. It uses the
OWL API 3.2.45 to read ontologies. It has a visual interface that can be used
as a Protégé 4.1 plug-in, or as a standalone application. For the SAT-based
unification algorithm, we currently use SAT4J6 as (MAX-)SAT solver, which
is implemented in Java. However, this configuration can easily be changed to
any solver that accepts the popular DIMACS CNF7 (or WCNF8) format as
input and returns the computed satisfying propositional valuation. For the
rulebased algorithm, we have implemented everything from scratch, and thus have
no external dependencies.</p>
      <p>After opening UEL’s visual interface, the first step is to open one or two
ontologies. The latter enables unification of concepts defined in different
ontologies. Additionally, the user can choose between the three unification algorithms
by selecting the “SAT-based algorithm [(minimal assignments)]” or the
“Rulebased algorithm”. The user can then choose two concepts to be unified. This is
done by choosing two concept names that occur on the left-hand sides of concept
definitions or primitive definitions (see Figure 1). UEL then computes the
subontologies reachable from these concept names, and turns the primitive definitions
in these subontologies into concept definitions.</p>
      <p>After choosing the concepts to be unified, pressing the button opens
a dialog window in which the user is presented with the primitive concepts
contained in these subontologies (including the ones with ending _UNDEF).
The user can then decide which of these primitive concepts should be viewed as
variables in the unification problem.</p>
      <p>Once the user has chosen the variables, UEL computes the unification
problem defined this way and opens a dialog window with control buttons. By pressing
the button , the user triggers the computation of the first (or next) unifier.
Each computed unifier is shown as an acyclic TBox in KRSS format. The button
can be used to go back to the previously computed unifier. The button
can be used to trigger the computation of all remaining unifiers, and the button
allows to jump back to the first unifier (see Figure 2). Computed unifiers
are stored, and thus need not be recomputed during navigation. Each unifier
(i.e., the acyclic TBox representing it) can be saved using the RDF/OWL or the
5 http://owlapi.sourceforge.net
6 http://www.sat4j.org
7 http://www.satcompetition.org/2011/format-benchmarks2011.html
8 http://www.maxsat.udl.cat/11/requirements/index.html</p>
      <p>KRSS format by pressing the button . The file format is determined by the
filename extension given by the user (.owl or .krss).</p>
      <p>The user can use the button to retrieve internal details about the
computation process. The unification problem created internally by UEL is then shown
in KRSS format in a separate dialog. Additionally, the number of all concept
variables (those chosen by the user and internal variables) is given. Depending
on the chosen algorithm, several other internal statistics can be viewed. If the
SAT reduction is used, the number of propositional letters and the number of
propositional clauses are listed. For the rule-based algorithm, the number of
initial subsumptions, the maximal number of generated subsumptions (over all
nondeterministic choices), and the size of the search tree (i.e., the number of
nondeterministic choices) are shown, as well as the number of “dead ends” where
the algorithm had to backtrack. These numbers always reflect the current status
and might increase if more unifiers are computed.
5</p>
    </sec>
    <sec id="sec-4">
      <title>Some Examples</title>
      <p>To illustrate the behavior of the three approaches, we consider several
example problems. The size of these problems as well as the size of the generated
data structures (propositional clauses or subsumptions) is depicted in Table 2,
together with the runtimes of the three algorithms one these problems.</p>
      <p>The first example is a modified version of our example from the beginning,
where the TBox gives (1) as definition for Patient_with_severe_head_injury and
(2) as definition for Patient_with_severe_injury_at_head. In addition, the TBox
contains two primitive definitions, saying that Head_injury and Severe_injury</p>
      <p>problem size SAT MAX-SAT Rule
# equ. var. clauses prop. subs. (max.) dead ends time unif. time unif. time unif.
1 7 8 3,976 320 20 (52) 0 0.249 128 0.047 1 0.001 1
2 23 12 17,971 820 47 (217) 63,523 0.500 0 0.510 0 1.920 0
3 34 14 28,071 1,096 62 (320) 1,126,286 1.086 15 1.162 15 26.308 30
are subconcepts of Injury. If we choose Patient_with_severe_head_injury and
Patient_with_severe_injury_at_head as the concepts to be unified, the system
offers us the primitive concepts Patient, Severe, Head, Head_injury_UNDEF, and
Severe_injury_UNDEF as possible variables, of which we choose only the latter
two.</p>
      <p>The SAT translation generates a large SAT problem (4,000 clauses are created
from only 7 equations) and first computes the following unifier:
fHead_injury_UNDEF 7! 9 nding_site:Head;</p>
      <p>Severe_injury_UNDEF 7! 9severity:Severeg:
This substitution completes the primitive definitions of the concepts Head_injury
and Severe_injury to concept definitions Head_injury Injuryu nding_site:Head
and Severe_injury Injury u 9severity:Severe.</p>
      <p>However, the unification problem has 127 additional local unifiers. Some of
them are similar to the first one, but contain “redundant” conjuncts. Others do
not make much sense in the application (e.g., ones where Patient occurs in the
images of the variables). In contrast, the MAX-SAT variant of the translation
has to deal with an even larger problem since it has to consider the additional
set Vmin. However, it is much faster since it computes only the unifier shown
above, which is the only minimal unifier of this unification problem. The
rulebased algorithm shows an even better performance since the data structures it
creates are orders of magnitude smaller than the SAT problem and the
unification problem is very deterministic—in fact, no backtracking is necessary. It also
returns only the minimal unifier.</p>
      <p>
        The second unification problem was constructed starting with a “hard” SAT
instance (according to the approach in [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]), which was translated into a
unification problem using the construction in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Even though a large number of
propositional clauses are constructed in the reduction back to a SAT problem,
the SAT solver takes little time to determine that the problem has no solution.
The overhead incurred by the use of a MAX-SAT problem is negligible. In
contrast, the rule-based algorithm works on a much smaller data structure (at most
217 subsumptions during the whole run), but lacks the optimizations of the SAT
solver and naively traverses a large search tree looking for a unifier.
      </p>
      <p>The last example shows even more extreme differences. Again, we started
from a simple SAT problem (“Choose exactly 2 out of 6 variables.”) that has 15
models which correspond to 15 minimal unifiers. Both the SAT-based and the
MAX-SAT-based approach compute these unifiers quite fast, but the rule-based
algorithm takes much longer and even computes each of the solutions twice.
6</p>
    </sec>
    <sec id="sec-5">
      <title>Conclusions</title>
      <p>Our system UEL enables ontology engineers to test ontologies for redundancies
and allows them to choose between different algorithms with different strengths.
The rule-based algorithm has a big advantage over the SAT translation since it
needs only small data structures, but currently lacks important search
optimizations, which make this implementation unsuitable for large problems.</p>
      <p>In the current version 1.2.0, UEL only supports checking two pre-selected
concept names for similarities. In future versions, we plan to implement an
automatic search feature that can scan (a part of) an ontology for concept names
that unify. For this we also need to find an intelligent way to automatically select
the variables from a set of primitive concept names.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Baader</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Terminological cycles in a description logic with existential restrictions</article-title>
          .
          <source>In: Proc. IJCAI'03</source>
          . pp.
          <fpage>325</fpage>
          -
          <lpage>330</lpage>
          . Morgan Kaufmann (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Baader</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Borgwardt</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Morawska</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Unification in the description logic EL w</article-title>
          .r.t. cycle
          <article-title>-restricted TBoxes</article-title>
          .
          <source>LTCS-Report 11-05</source>
          , Chair of Automata Theory, TU Dresden, Germany (
          <year>2011</year>
          ), see http://lat.inf.tu-dresden.de/research/reports.html.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Baader</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Borgwardt</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Morawska</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Computing minimal EL-unifiers is hard</article-title>
          .
          <source>LTCS-Report 12-03</source>
          ,
          <article-title>Chair for Automata Theory</article-title>
          , TU Dresden (
          <year>2012</year>
          ), see http://lat.inf.tu-dresden.de/research/reports.html.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Baader</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brandt</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lutz</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Pushing the EL envelope</article-title>
          .
          <source>In: Proc. IJCAI'05</source>
          . pp.
          <fpage>364</fpage>
          -
          <lpage>369</lpage>
          . Professional Book Center (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Baader</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Küsters</surname>
          </string-name>
          , R.:
          <article-title>Matching concept descriptions with existential restrictions</article-title>
          .
          <source>In: Proc. KR'00</source>
          . pp.
          <fpage>261</fpage>
          -
          <lpage>272</lpage>
          . Morgan Kaufmann (
          <year>2000</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Baader</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Morawska</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Unification in the description logic EL</article-title>
          .
          <source>In: Proc. RTA'09. LNCS</source>
          , vol.
          <volume>5595</volume>
          , pp.
          <fpage>350</fpage>
          -
          <lpage>364</lpage>
          . Springer (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Baader</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Morawska</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>SAT encoding of unification in EL</article-title>
          .
          <source>In: Proc. LPAR'10. LNCS</source>
          , vol.
          <volume>6397</volume>
          , pp.
          <fpage>97</fpage>
          -
          <lpage>111</lpage>
          . Springer (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Baader</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Morawska</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Unification in the description logic EL</article-title>
          .
          <source>Log. Meth. Comput. Sci. 6</source>
          (
          <issue>3</issue>
          ) (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Baader</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Narendran</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Unification of concept terms in description logics</article-title>
          .
          <source>J. Symb. Comput</source>
          .
          <volume>31</volume>
          (
          <issue>3</issue>
          ),
          <fpage>277</fpage>
          -
          <lpage>305</lpage>
          (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Brandt</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Polynomial time reasoning in a description logic with existential restrictions, GCI axioms, and-what else?</article-title>
          <source>In: Proc. ECAI'04</source>
          . pp.
          <fpage>298</fpage>
          -
          <lpage>302</lpage>
          (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>C.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Manyà</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Maxsat, hard and soft constraints</article-title>
          .
          <source>In: Handbook of Satisfiability, chap. 19</source>
          , pp.
          <fpage>613</fpage>
          -
          <lpage>631</lpage>
          . IOS Press (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Markström</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Locality and hard SAT-instances</article-title>
          .
          <source>JSAT</source>
          <volume>2</volume>
          (
          <issue>1-4</issue>
          ),
          <fpage>221</fpage>
          -
          <lpage>227</lpage>
          (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Nebel</surname>
          </string-name>
          , B.:
          <article-title>Reasoning and Revision in Hybrid Representation Systems</article-title>
          , LNAI, vol.
          <volume>422</volume>
          . Springer (
          <year>1990</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>