A Goal-Oriented Algorithm for Unification in EL
        w.r.t. Cycle-Restricted TBoxes*

            Franz Baader, Stefan Borgwardt, and Barbara Morawska
            {baader,stefborg,morawska}@tcs.inf.tu-dresden.de

               Theoretical Computer Science, TU Dresden, Germany


1     Introduction
Unification in DLs has been proposed in [7] (for the DL FL0, which offers the
constructors conjunction (⊓), value restriction (∀r.C), and the top concept (⊤))
as a novel inference service that can, for instance, be used to detect redundancies
in ontologies. For example, assume that one developer of a medical ontology
defines the concept of a patient with severe head injury as

               Patient ⊓ ∃finding.(Head_injury ⊓ ∃severity.Severe),             (1)

whereas another one represents it as

        Patient ⊓ ∃finding.(Severe_finding ⊓ Injury ⊓ ∃finding_site.Head).      (2)

These two concept descriptions are not equivalent, but they are nevertheless
meant to represent the same concept. They can obviously be made equivalent
by treating the concept names Head_injury and Severe_finding as variables, and
substituting the first one by Injury ⊓ ∃finding_site.Head and the second one by
∃severity.Severe. In this case, we say that the descriptions are unifiable, and call
the substitution that makes them equivalent a unifier. Intuitively, such a unifier
proposes definitions for the concept names that are used as variables: in our
example, we know that, if we define Head_injury as Injury ⊓ ∃finding_site.Head
and Severe_finding as ∃severity.Severe, then the two concept descriptions (1)
and (2) are equivalent w.r.t. these definitions. Here equivalence holds without
additional GCIs.
   To motivate our interest in unification w.r.t. GCIs, assume that the second
developer uses the description

        Patient ⊓ ∃status.Emergency ⊓
            ∃finding.(Severe_finding ⊓ Injury ⊓ ∃finding_site.Head)             (3)

instead of (2). The descriptions (1) and (3) are not unifiable without additional
GCIs, but they are unifiable, with the same unifier as above, if the GCI

                  ∃finding.∃severity.Severe ⊑ ∃status.Emergency

is present in a background ontology.

* Supported by DFG under grant BA 1122/14-1

    In [4], we were able to show that unification in the DL EL (which differs from
FL0 by offering existential restrictions (∃r.C) in place of value restrictions) is of
considerably lower complexity than in FL0: the decision problem is NP-complete
in EL, whereas it is ExpTime-complete in FL0. In addition to a brute-force
“guess and then test” NP-algorithm [4], we have developed a goal-oriented uni-
fication algorithm for EL, in which nondeterministic decisions are only made if
they are triggered by “unsolved parts” of the unification problem [6], and an algo-
rithm that is based on a reduction to satisfiability in propositional logic (SAT)
[5], which enables the use of highly-optimized SAT solvers. In [6] it was also
shown that the approaches for unification of EL-concept descriptions (without
any background ontology) can easily be extended to the case of an acyclic TBox
as background ontology without really changing the algorithms or increasing
their complexity. Basically, by viewing defined concepts as variables, an acyclic
TBox can be turned into a unification problem that has as its unique unifier
the substitution that replaces the defined concepts by unfolded versions of their
definitions. For GCIs, this simple trick is not possible.
    In [2], we extended the brute-force “guess and then test” NP-algorithm from
[4] to the case of GCIs, which required the development of a new characterization
of subsumption w.r.t. GCIs in EL. Unfortunately, the algorithm is complete only
for general TBoxes (i.e., finite sets of GCIs) that satisfy a certain restriction
on cycles, which, however, does not prevent all cycles. For example, the cyclic
GCI ∃child.Human ⊑ Human satisfies this restriction, whereas the cyclic GCI
Human ⊑ ∃parent.Human does not.
    In the present paper, we describe a goal-oriented algorithm for unification in
EL w.r.t. cycle-restricted general TBoxes, which extends the one from [6] and
reduces the amount of nondeterministic guesses considerably. Full proofs of the
presented results can be found in [1].


2    The Description Logic EL

Syntax and semantics of EL are defined in the usual way (see, e.g., [9]). Here,
we just recall that EL-concept descriptions are built from a finite set NC of
concept names and a finite set NR of role names using the concept constructors
top-concept (⊤), conjunction (C ⊓ D), and existential restriction (∃r.C for every
r ∈ NR). Nested existential restrictions ∃r1.∃r2. · · · ∃rn.C will sometimes also be
written as ∃r1 r2 . . . rn.C, where r1 r2 . . . rn is viewed as a word over the alphabet
of role names, i.e., an element of NR*. As usual, concepts C are interpreted as sets
C^I over some domain such that the semantics of the constructors is respected.
    A general concept inclusion (GCI) is of the form C ⊑ D for concept
descriptions C, D, and a general TBox is a finite set of GCIs. An interpretation I
satisfies such a GCI if C^I ⊆ D^I, and it is a model of the general TBox T if it
satisfies all GCIs in T. Subsumption asks whether a given GCI C ⊑ D follows
from a general TBox T, i.e. whether every model of T satisfies C ⊑ D. In this
case we say C is subsumed by D w.r.t. T and write C ⊑T D. Subsumption in
EL w.r.t. a general TBox is known to be decidable in polynomial time [9].
    An EL-concept description is an atom if it is an existential restriction or a
concept name. The atoms of an EL-concept description C are the subdescriptions
of C that are atoms, and the top-level atoms of C are the atoms occurring in
the top-level conjunction of C. Obviously, any EL-concept description is the
conjunction of its top-level atoms, where the empty conjunction corresponds to
⊤. The atoms of a general TBox T are the atoms of all the concept descriptions
occurring in T .
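
The notions of atoms and top-level atoms can be illustrated with a minimal Python sketch. The encoding is an assumption made only for this illustration and is not taken from the paper: a concept description is a frozenset of its top-level atoms (the empty set standing for the top concept), and an atom is either a concept name (a string) or a pair (role, description). The function names are hypothetical.

```python
# Illustrative encoding (an assumption of this sketch, not part of the paper):
#   concept description = frozenset of top-level atoms; frozenset() is the top concept
#   atom = concept name (str) or existential restriction (role, description)

def top_level_atoms(concept):
    """Atoms occurring in the top-level conjunction of `concept`."""
    return set(concept)

def atoms(concept):
    """All subdescriptions of `concept` that are atoms."""
    found = set()
    for atom in concept:
        found.add(atom)
        if not isinstance(atom, str):      # existential restriction (role, filler)
            _, filler = atom
            found |= atoms(filler)
    return found

# Description (1): Patient AND EXISTS finding.(Head_injury AND EXISTS severity.Severe)
severe = ("severity", frozenset({"Severe"}))
finding = ("finding", frozenset({"Head_injury", severe}))
description_1 = frozenset({"Patient", finding})
```

Under this encoding, description (1) has two top-level atoms (Patient and the restriction on finding) and five atoms in total.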
    We say that a subsumption between two atoms is structural if their top-level
structure is compatible. To be more precise, we define structural subsumption
between atoms as follows: the atom C is structurally subsumed by the atom D
w.r.t. T (C ⊑sT D, i.e., ⊑ with superscript s and subscript T) iff either
(i) C = D is a concept name, or (ii) C = ∃r.C′, D = ∃r.D′, and C′ ⊑T D′.
It is easy to see that subsumption w.r.t. ∅ between two atoms implies structural
subsumption w.r.t. T, which in turn implies subsumption w.r.t. T.
The unification algorithm in [2] and the one presented below
crucially depend on the following characterization of subsumption:
Lemma 1. Let T be a general TBox and C1, . . . , Cn, D1, . . . , Dm atoms. Then
C1 ⊓ · · · ⊓ Cn ⊑T D1 ⊓ · · · ⊓ Dm iff for every j ∈ {1, . . . , m}
1. there is an index i ∈ {1, . . . , n} such that Ci ⊑sT Dj, or
2. there are atoms A1, . . . , Ak, B of T (k ≥ 0) such that
   a) A1 ⊓ · · · ⊓ Ak ⊑T B,
   b) for every η ∈ {1, . . . , k} there is i ∈ {1, . . . , n} with Ci ⊑sT Aη, and
   c) B ⊑sT Dj.
    Our proof of this lemma in [1] is based on a new Gentzen-style proof calculus
for subsumption w.r.t. a general TBox, which is similar to the one developed in
[10] for subsumption w.r.t. cyclic and general TBoxes.
    As mentioned in the introduction, our unification algorithm is complete only
for general TBoxes that satisfy a certain restriction on cycles.
Definition 2. The general TBox T is called cycle-restricted iff there is no
nonempty word w ∈ NR+ and EL-concept description C such that C ⊑T ∃w.C.
    In [1] we show that a given general TBox can easily be tested for cycle-
restrictedness. The main idea is that it is sufficient to consider the cases where
C is a concept name or ⊤.
Lemma 3. Let T be a general TBox. It can be decided in time polynomial in
the size of T whether T is cycle-restricted or not.

3    Unification in EL w.r.t. General TBoxes
We partition the set NC into a set Nv of concept variables (which may be
replaced by substitutions) and a set Nc of concept constants (which must not be
replaced by substitutions). A substitution σ maps every concept variable to an
EL-concept description. It is extended to concept descriptions in the usual way:
 – σ(A) := A for all A ∈ Nc ∪ {⊤},
 – σ(C ⊓ D) := σ(C) ⊓ σ(D) and σ(∃r.C) := ∃r.σ(C).
An EL-concept description C is ground if it does not contain variables. Obvi-
ously, a ground concept description is not modified by applying a substitution.
A general TBox is ground if it does not contain variables.
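
As a minimal sketch of how a substitution is applied, the following Python fragment replays the example from the introduction. The encoding is an assumption of this illustration only: a concept description is a frozenset of its top-level atoms (the empty set being the top concept), an atom is a concept name or a pair (role, description), and the variables are exactly the keys of the dictionary `sigma`. All helper names are hypothetical.

```python
# Illustrative encoding (an assumption of this sketch):
#   concept description = frozenset of top-level atoms; frozenset() is the top concept
#   atom = concept name (str) or existential restriction (role, description)

def name(a):
    return frozenset({a})

def exists(role, filler):
    return frozenset({(role, filler)})

def conj(*parts):
    """Conjunction: union of the top-level atoms of all parts."""
    result = set()
    for part in parts:
        result |= part
    return frozenset(result)

def apply_substitution(sigma, concept):
    """sigma maps variable names to descriptions; all other names are constants."""
    result = set()
    for atom in concept:
        if isinstance(atom, str):
            result |= sigma.get(atom, {atom})      # constants stay unchanged
        else:
            role, filler = atom
            result.add((role, apply_substitution(sigma, filler)))
    return frozenset(result)

# Descriptions (1) and (2) from the introduction.
description_1 = conj(name("Patient"),
                     exists("finding", conj(name("Head_injury"),
                                            exists("severity", name("Severe")))))
description_2 = conj(name("Patient"),
                     exists("finding", conj(name("Severe_finding"), name("Injury"),
                                            exists("finding_site", name("Head")))))

sigma = {
    "Head_injury": conj(name("Injury"), exists("finding_site", name("Head"))),
    "Severe_finding": exists("severity", name("Severe")),
}
```

Because conjunctions are stored as sets, two descriptions that differ only in the order or repetition of conjuncts get the same encoding. Equal encodings therefore imply equivalence, but the converse does not hold in general, so this comparison is not a full equivalence test.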
Definition 4. Let T be a general TBox that is ground. An EL-unification problem
w.r.t. T is a finite set Γ = {C1 ⊑? D1, . . . , Cn ⊑? Dn} of subsumptions
between EL-concept descriptions. A substitution σ is a unifier of Γ w.r.t. T if σ
solves all the subsumptions in Γ, i.e. if σ(C1) ⊑T σ(D1), . . . , σ(Cn) ⊑T σ(Dn).
We say that Γ is unifiable w.r.t. T if it has a unifier.
   Note that we have restricted the background general TBox T to be ground.
This is not without loss of generality. If T contained variables, then we would
need to apply the substitution also to its GCIs, and instead of requiring σ(Ci) ⊑T
σ(Di) we would thus need to require σ(Ci) ⊑σ(T) σ(Di), i.e., subsumption
w.r.t. the substituted TBox σ(T), which would change
the nature of the problem considerably (see [1] for a more detailed discussion).

Preprocessing To simplify the description of the algorithm, it is convenient to
first normalize the TBox and the unification problem appropriately. An atom is
called flat if it is a concept name or an existential restriction of the form ∃r.A
for a concept name A. The general TBox T is called flat if it contains only GCIs
of the form A ⊓ B ⊑ C, where A, B are flat atoms or ⊤ and C is a flat atom.
The unification problem Γ is called flat if it contains only flat subsumptions of
the form C1 ⊓ · · · ⊓ Cn ⊑? D, where n ≥ 0 and C1, . . . , Cn, D are flat atoms
(see footnote 1).
    Let Γ be a unification problem and T a general TBox. By introducing auxil-
iary variables and concept names, respectively, Γ and T can be transformed in
polynomial time into a flat unification problem Γ′ and a flat general TBox T′
such that the unifiability status remains unchanged, i.e., Γ has a unifier w.r.t.
T iff Γ′ has a unifier w.r.t. T′. In addition, if T was cycle-restricted, then so is
T′ (see [1] for details). Thus, we can assume without loss of generality that the
input unification problem and general TBox are flat.

Local Unifiers The main idea underlying the “in NP” results in [4,2] is to show
that any EL-unification problem that is unifiable has a so-called local unifier. Let
T be a flat cycle-restricted TBox and Γ a flat unification problem. The atoms
of Γ are the atoms of all the concept descriptions occurring in Γ . We define
                     At := {C | C is an atom of T or of Γ } and
                   Atnv := At \ Nv    (non-variable atoms).
Every assignment S of subsets SX of Atnv to the variables X in Nv induces the
following relation >S on Nv : >S is the transitive closure of
               {(X, Y ) ∈ Nv × Nv | Y occurs in an element of SX }.
We call the assignment S acyclic if >S is irreflexive (and thus a strict partial
order). Any acyclic assignment S induces a unique substitution σS, which can
be defined by induction along >S:
 – If X ∈ Nv is minimal w.r.t. >S, then we define σS(X) := ⨅_{D ∈ SX} D.
 – Assume that σS(Y) is already defined for all Y such that X >S Y. Then we
    define σS(X) := ⨅_{D ∈ SX} σS(D).
We call a substitution σ local if it is of this form, i.e., if there is an acyclic assign-
ment S such that σ = σS. If the unifier σ of Γ w.r.t. T is a local substitution,
then we call it a local unifier of Γ w.r.t. T.

Footnote 1: If n = 0, then we have an empty conjunction on the left-hand side,
which as usual stands for ⊤.
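
The construction of σS from an acyclic assignment can be sketched in Python as follows. This is only an illustration under assumed conventions, not the authors' implementation: an assignment is a dictionary from variable names to sets of non-variable atoms, a concept description is a frozenset of atoms (the empty set being the top concept), and an atom is a concept name or a pair (role, description). The names `is_acyclic` and `induced_substitution` are hypothetical.

```python
def occurring_variables(atom, variables):
    """Variables that occur in an atom."""
    if isinstance(atom, str):
        return {atom} & variables
    _, filler = atom
    result = set()
    for inner in filler:
        result |= occurring_variables(inner, variables)
    return result

def is_acyclic(assignment):
    """True iff the dependency relation induced by the assignment has no cycle."""
    variables = set(assignment)
    depends_on = {
        x: set().union(*(occurring_variables(a, variables) for a in atoms_x))
        for x, atoms_x in assignment.items()
    }
    state = {}

    def visit(x):                       # depth-first search for a cycle
        if state.get(x) == "done":
            return True
        if state.get(x) == "open":      # x is reachable from itself
            return False
        state[x] = "open"
        ok = all(visit(y) for y in depends_on[x])
        state[x] = "done"
        return ok

    return all(visit(x) for x in variables)

def induced_substitution(assignment):
    """Substitution induced by an acyclic assignment, computed along the dependencies."""
    if not is_acyclic(assignment):
        raise ValueError("assignment is cyclic")
    sigma = {}

    def sigma_of(x):
        if x not in sigma:
            sigma[x] = apply(frozenset(assignment[x]))
        return sigma[x]

    def apply(concept):
        result = set()
        for atom in concept:
            if isinstance(atom, str):
                result |= sigma_of(atom) if atom in assignment else {atom}
            else:
                role, filler = atom
                result.add((role, apply(filler)))
        return frozenset(result)

    for x in assignment:
        sigma_of(x)
    return sigma

# X depends on Y through the atom EXISTS r.Y; Z is mapped to the top concept.
S = {"X": {"A", ("r", frozenset({"Y"}))}, "Y": {"B"}, "Z": set()}
sigma_S = induced_substitution(S)
```

A variable with an empty set is mapped to the empty conjunction, i.e., the top concept. The brute-force algorithm would additionally have to guess the assignment and test the induced substitution against Γ, which this sketch does not do.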
    The main technical result shown in [2] is that any unifiable EL-unification
problem w.r.t. a cycle-restricted TBox has a local unifier. This yields the follow-
ing brute-force unification algorithm for EL w.r.t. cycle-restricted TBoxes: first
guess an acyclic assignment S, and then check whether the induced local sub-
stitution σS solves Γ . As shown in [2], this algorithm runs in nondeterministic
polynomial time. NP-hardness follows from the fact that already unification in
EL w.r.t. the empty TBox is NP-hard [4].

4    A Goal-Oriented Unification Algorithm
The brute-force algorithm is not practical since it blindly guesses an acyclic
assignment and only afterwards checks whether the guessed assignment induces
a unifier. We now introduce a more goal-oriented unification algorithm, in which
nondeterministic decisions are only made if they are triggered by “unsolved parts”
of the unification problem. In addition, failure due to wrong guesses can be
detected early. Any non-failing run of the algorithm produces a unifier, i.e.,
there is no need for checking whether the assignment computed by this run
really produces a unifier. This goal-oriented algorithm generalizes the algorithm
for unification in EL w.r.t. the empty TBox introduced in [6], though the rules
look quite different because in the present paper we consider unification problems
that consist of subsumptions whereas in [6] we considered equivalences.
    We assume without loss of generality that the cycle-restricted TBox T and
the unification problem Γ0 are flat. Given T and Γ0 , the sets At and Atnv are
defined as above. Starting with Γ0 , the algorithm maintains a current unification
problem Γ and a current acyclic assignment S, which initially assigns the empty
set to all variables. In addition, for each subsumption in Γ it maintains the in-
formation on whether it is solved or not. Initially, all subsumptions are unsolved,
except those with a variable on the right-hand side. Rules are applied only to
unsolved subsumptions. A (non-failing) rule application does the following:
 – it solves exactly one unsolved subsumption,
 – it may extend the current assignment S, and
 – it may introduce new flat subsumptions built from elements of At.
Each rule application that extends SX additionally expands Γ w.r.t. X as follows:
every subsumption s ∈ Γ of the form C1 ⊓ · · · ⊓ Cn ⊑? X is expanded by adding
the subsumption C1 ⊓ · · · ⊓ Cn ⊑? A to Γ for every A ∈ SX.
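
The expansion step can be sketched as a small Python function. The data layout is an assumption of this illustration: a flat atom is a concept name or a pair (role, name), a subsumption is a pair (left-hand side as a frozenset of flat atoms, right-hand atom), Γ is a set of such pairs, and the assignment is a dictionary from variables to sets of non-variable atoms. The function name `expand` is hypothetical.

```python
def expand(gamma, assignment, x):
    """Expand gamma w.r.t. the variable x.

    For every subsumption (lhs, x) in gamma and every atom a in S_x, the
    subsumption (lhs, a) is added unless it is already present. The list of
    newly added subsumptions is returned.
    """
    added = []
    for lhs, rhs in list(gamma):        # iterate over a copy, gamma is modified
        if rhs != x:
            continue
        for a in assignment[x]:
            new = (lhs, a)
            if new not in gamma:
                gamma.add(new)
                added.append(new)
    return added

lhs = frozenset({"A", ("r", "B")})
gamma = {(lhs, "X"), (frozenset({"A"}), "Y")}
assignment = {"X": {"C", ("s", "D")}, "Y": set()}
added = expand(gamma, assignment, "X")
```

Since the elements of SX are non-variable atoms, every subsumption returned here has a non-variable right-hand side.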
 Eager Ground Solving:
    Condition: This rule applies to s = C1 ⊓ · · · ⊓ Cn ⊑? D if it is ground.
    Action: If C1 ⊓ · · · ⊓ Cn ⊑T D does not hold, the rule application fails.
    Otherwise, s is marked as solved.
 Eager Solving:
    Condition: This rule applies to s = C1 ⊓ · · · ⊓ Cn ⊑? D if either
     – there is i ∈ {1, . . . , n} such that Ci = D or Ci = X ∈ Nv and D ∈ SX, or
     – D is ground and ⨅G ⊑T D holds, where ⨅G is the conjunction of all
       elements of G and G is the set of all ground atoms
       in {C1, . . . , Cn} ∪ ⋃_{X ∈ {C1,...,Cn} ∩ Nv} SX.
    Action: Its application marks s as solved.
 Eager Extension:
    Condition: This rule applies to s = C1 ⊓ · · · ⊓ Cn ⊑? D if there is i ∈ {1, . . . , n}
    with Ci = X ∈ Nv and {C1, . . . , Cn} \ {X} ⊆ SX.
    Action: Its application adds D to SX. If this makes S cyclic, the rule
    application fails. Otherwise, Γ is expanded w.r.t. X and s is marked as solved.

                 Fig. 1. The eager rules of the unification algorithm.

    Subsumptions are only added if they are not already present in Γ . If a new
subsumption is added to Γ , either by a rule application or by expansion of Γ ,
then it is initially designated unsolved, except if it has a variable on the right-
hand side. Once a subsumption is in Γ , it will not be removed. Likewise, if a
subsumption in Γ is marked as solved, then it will not become unsolved later.
    If a subsumption is marked as solved, this does not mean that it is already
solved by the substitution induced by the current assignment. It may be the
case that the task of satisfying the subsumption was deferred to solving other
subsumptions which are “smaller” than the given subsumption in a well-defined
sense. The task of solving a subsumption whose right-hand side is a variable is
deferred to solving the subsumptions introduced by expansion.
    The rules of the algorithm consist of the three eager rules Eager Ground
Solving, Eager Solving, and Eager Extension (see Figure 1), and several nonde-
terministic rules (see Figures 2 and 3). Eager rules are applied with higher pri-
ority than nondeterministic rules. Among the eager rules, Eager Ground Solving
has the highest priority, then comes Eager Solving, and then Eager Extension.
Algorithm 5. Let Γ0 be a flat EL-unification problem. We set Γ := Γ0 and
SX := ∅ for all X ∈ Nv . While Γ contains an unsolved subsumption, apply the
steps (1) and (2). Once all subsumptions are solved, return the substitution σ
induced by the current assignment.
(1) Eager rule application: If some eager rules apply to an unsolved sub-
    sumption s in Γ , apply one of highest priority. If the rule application fails,
    then return “not unifiable”.
(2) Nondeterministic rule application: If no eager rule is applicable, let s be
    an unsolved subsumption in Γ . If one of the nondeterministic rules applies
    to s, nondeterministically choose one of these rules and apply it. If none of
    these rules apply to s or the rule application fails, then return “not unifiable”.
 Decomposition:
    Condition: This rule applies to s = C1 ⊓ · · · ⊓ Cn ⊑? ∃s.D′ if there is at least
    one index i ∈ {1, . . . , n} with Ci = ∃s.C′.
    Action: Its application chooses such an index i, adds the subsumption C′ ⊑?
    D′ to Γ, expands it w.r.t. D′ if D′ is a variable, and marks s as solved.
 Extension:
    Condition: This rule applies to s = C1 ⊓ · · · ⊓ Cn ⊑? D if there is at least one
    i ∈ {1, . . . , n} with Ci ∈ Nv.
    Action: Its application chooses such an i and adds D to SCi. If this makes S
    cyclic, the rule application fails. Otherwise, Γ is expanded w.r.t. Ci and s is
    marked as solved.
          Fig. 2. The nondeterministic rules Decomposition and Extension.
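
The choices offered by the two rules of Figure 2 can be enumerated with a short Python sketch. The encoding is an assumption of this illustration: flat atoms are concept names or pairs (role, name), the left-hand side is a tuple of flat atoms, and `variables` is the set Nv. The function names are hypothetical, and the sketch only lists the candidates among which a nondeterministic run would have to guess.

```python
def decomposition_choices(lhs, rhs):
    """New subsumptions (C', D') that Decomposition may add for lhs and rhs = (role, D')."""
    if isinstance(rhs, str):            # right-hand side is not an existential restriction
        return []
    role, d_prime = rhs
    return [
        (c[1], d_prime)
        for c in lhs
        if not isinstance(c, str) and c[0] == role
    ]

def extension_choices(lhs, variables):
    """Variables on the left-hand side whose set S_X may be extended by the right-hand side."""
    return [c for c in lhs if c in variables]

variables = {"X", "Y", "Z"}
lhs = ("X", ("r", "A"), ("r", "Y"), ("s", "B"))
rhs = ("r", "Z")
```

For this input, Decomposition can choose between the restrictions ∃r.A and ∃r.Y on the left-hand side, while Extension can only pick the variable X.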


    In step (2), the choice which unsolved subsumption to consider next is don’t
care nondeterministic. However, choosing which rule to apply to the chosen sub-
sumption is don’t know nondeterministic. Additionally, the application of non-
deterministic rules requires don’t know nondeterministic guessing.
    The eager rules are mainly there for optimization purposes, i.e., to avoid
nondeterministic choices if a deterministic decision can be made. For example,
a ground subsumption, as considered in the Eager Ground Solving rule, either
follows from the TBox, in which case any substitution solves it, or it does not,
in which case it does not have a solution. This condition can be checked in poly-
nomial time using the polynomial time subsumption algorithm for EL [9]. In
the case considered in the Eager Solving rule, the substitution induced by the
current assignment already solves the subsumption. In fact, if the first (second)
condition of the rule is satisfied, then the first (second) condition of Lemma 1
applies. The Eager Extension rule solves a subsumption that contains only a
variable X and some elements of SX on the left-hand side. The rule is
motivated by the following observation: for any assignment S′ extending the current
assignment, the induced substitution σ′ satisfies σ′(X) ≡ σ′(C1) ⊓ . . . ⊓ σ′(Cn).
Thus, if S′X contains D, then σ′(X) ⊑T σ′(D), and σ′ solves the subsumption.
Conversely, if σ′ solves the subsumption, then σ′(X) ⊑T σ′(D), and thus adding
D to S′X yields an equivalent induced substitution.
    The nondeterministic rules only come into play if no eager rules can be
applied. In order to solve an unsolved subsumption s = C1 ⊓ · · · ⊓ Cn ⊑? D, we
consider the two conditions of Lemma 1. Regarding the first condition, which is
addressed by the rules Decomposition and Extension, assume that γ is induced
by an acyclic assignment S. To satisfy the first condition of the lemma with γ,
the atom γ(D) must subsume a top-level atom in γ(C1) ⊓ · · · ⊓ γ(Cn). This atom
can either be of the form γ(Ci ) for an atom Ci , or it can be of the form γ(C) for
an atom C ∈ SCi and a variable Ci . In the second case, the atom C can either
already be in SCi or it can be put into SCi by an application of the Extension rule.
The Mutation rules cover the second condition in Lemma 1. For example, let us
analyze in detail how Mutation 1 ensures that all the requirements of the second
condition of Lemma 1 are satisfied. Whenever this condition requires a structural
subsumption γ(E) ⊑sT γ(F) to hold for a (hypothetical) unifier γ of Γ, the rule
creates the new subsumption E ⊑? F, which has to be solved later on. This way,
the rule ensures that the substitution built by the algorithm actually satisfies
the conditions of the lemma. To check the subsumption A1 ⊓ · · · ⊓ Ak ⊑T B, the
rule again employs a polynomial-time subsumption algorithm.

 Mutation 1:
   Condition: This rule applies to s = C1 ⊓ · · · ⊓ Cn ⊑? D if n > 1 and there are
   atoms A1, . . . , Ak, B of T such that A1 ⊓ · · · ⊓ Ak ⊑T B holds.
   Action: Its application chooses such atoms, marks s as solved, and generates
   the following subsumptions:
     – it chooses for each η ∈ {1, . . . , k} an i ∈ {1, . . . , n} and adds the new
       subsumption Ci ⊑? Aη to Γ,
     – it adds B ⊑? D to Γ.
 Mutation 2:
   Condition: This rule applies to s = ∃r.X ⊑? D if X is a variable, D is ground,
   and there are atoms ∃r.A1, . . . , ∃r.Ak of T such that ∃r.A1 ⊓ · · · ⊓ ∃r.Ak ⊑T D
   holds.
   Action: Its application chooses such atoms, adds A1, . . . , Ak to SX, expands
   Γ w.r.t. X, and marks s as solved.
 Mutation 3:
   Condition: This rule applies to s = ∃r.X ⊑? ∃s.Y if X and Y are variables,
   and there are atoms ∃r.A1, . . . , ∃r.Ak, ∃s.B of T with ∃r.A1 ⊓ · · · ⊓ ∃r.Ak ⊑T
   ∃s.B.
   Action: Its application chooses such atoms, marks s as solved, and generates
   the following subsumptions:
     – it adds A1, . . . , Ak to SX and expands Γ w.r.t. X,
     – it adds the subsumption B ⊑? Y to Γ and expands it w.r.t. Y.
 Mutation 4:
   Condition: This rule applies to s = C ⊑? ∃s.Y if C is a ground atom or ⊤,
   Y is a variable, and there is an atom ∃s.B of T such that C ⊑T ∃s.B holds.
   Action: Its application chooses such an atom, adds the new subsumption B ⊑?
   Y to Γ, expands this subsumption w.r.t. Y, and marks s as solved.

      Fig. 3. The nondeterministic Mutation rules of the unification algorithm.

    The other mutation rules follow the same idea, but they implicitly apply
one or more Decomposition or Eager Extension rules after mutation. This en-
sures that the generated subsumptions are “smaller” than the subsumption that
triggers their introduction.


Soundness We will show that, if Algorithm 5 returns a substitution σ on input
Γ0 , then σ is a unifier of Γ0 w.r.t. T . In the following, let S be the final acyclic
assignment computed by a non-failing run of Algorithm 5 on input Γ0 , and σ
the substitution induced by S. By Γ̂ we denote the final set of subsumptions
computed by this run, i.e., the original subsumptions of Γ0 together with the
new ones generated by rule applications. To show that σ solves all subsumptions
in Γ̂, we use well-founded induction [8] on the well-founded order ≻ on Γ̂:

Definition 6. Let s = C1 ⊓ · · · ⊓ Cn ⊑? Cn+1 ∈ Γ̂.
 – s is small if n = 1 and at least one of C1, Cn+1 is ground.
 – We define m(s) := (m1(s), m2(s), m3(s)), where
     • m1(s) := 0 if s is small, and m1(s) := 1 otherwise;
     • m2(s) := X if Cn+1 = X or Cn+1 = ∃r.X for a variable X and some
       r ∈ NR, and m2(s) := ⊥ otherwise;
     • m3(s) := max{rd(σ(Ci)) | i ∈ {1, . . . , n + 1}} where rd yields the role
       depth of a concept description, i.e., the maximal nesting of existential
       restrictions.
 – The strict partial order ≻ on such triples is the lexicographic order, where
   the first and the third component are compared w.r.t. the normal order > on
   natural numbers. The variables in the second component are compared w.r.t.
   the relation >S induced by S, and ⊥ is smaller than any variable.
 – We extend ≻ to Γ̂ by setting s1 ≻ s2 iff m(s1) ≻ m(s2).
   As the lexicographic product of well-founded strict partial orders is again
well-founded [8], ≻ is a well-founded strict partial order on Γ̂.
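
To make the measure concrete, the following Python sketch computes the triple m(s) for a flat subsumption. The encoding is an assumption of this illustration: flat atoms are concept names or pairs (role, name), the substitution maps each variable to a ground description given as a frozenset of atoms (with nested restrictions encoded as (role, description)), and ⊥ is represented by None. All function names are hypothetical. The sketch only computes the triples; comparing the second components would additionally require the relation >S.

```python
def role_depth(concept):
    """Maximal nesting of existential restrictions in a description."""
    return max(
        (1 + role_depth(a[1]) for a in concept if not isinstance(a, str)),
        default=0,
    )

def substitute_flat_atom(sigma, atom):
    """Description obtained by applying sigma to a flat atom."""
    if isinstance(atom, str):
        return sigma.get(atom, frozenset({atom}))
    role, filler = atom
    return frozenset({(role, sigma.get(filler, frozenset({filler})))})

def is_ground(atom, variables):
    filler = atom if isinstance(atom, str) else atom[1]
    return filler not in variables

def measure(lhs, rhs, sigma, variables):
    """Triple (m1, m2, m3) for the subsumption with left-hand atoms lhs and right-hand atom rhs."""
    small = len(lhs) == 1 and (is_ground(lhs[0], variables) or is_ground(rhs, variables))
    m1 = 0 if small else 1
    rhs_name = rhs if isinstance(rhs, str) else rhs[1]
    m2 = rhs_name if rhs_name in variables else None     # None plays the role of the bottom symbol
    m3 = max(role_depth(substitute_flat_atom(sigma, a)) for a in list(lhs) + [rhs])
    return (m1, m2, m3)

variables = {"X", "Y"}
sigma = {"X": frozenset({("r", frozenset({"A"}))}), "Y": frozenset({"B"})}

# s = EXISTS s.X subsumed-by? EXISTS s.Y, and s' = X subsumed-by? Y obtained by Decomposition
m_s = measure([("s", "X")], ("s", "Y"), sigma, variables)
m_s_prime = measure(["X"], "Y", sigma, variables)
```

In this example the first two components agree and only the role depth decreases, which matches the non-small case discussed for the Decomposition rule.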

Lemma 7. σ is an EL-unifier of Γ̂ w.r.t. T, and thus also of its subset Γ0.
Proof. Let s ∈ Γ̂ and assume that σ solves all subsumptions s′ ∈ Γ̂ with s′ ≺ s.
 – If s has a non-variable atom as its right-hand side, then it was initially
   marked as unsolved and must have been marked solved by a successful rule
   application. As an example, we consider the application of the Decomposition
   rule (the other rules can be treated similarly [1]). Then s is of the form
   C1 ⊓ · · · ⊓ Cn ⊑? ∃s.D′ with Ci = ∃s.C′ for some i ∈ {1, . . . , n} and we
   have s′ = C′ ⊑? D′ ∈ Γ̂. We will show that s ≻ s′ holds. By induction, this
   implies that σ solves s′, and by Lemma 1 thus also s.
   Observe first that m2(s) = m2(s′) since either ∃s.D′ and D′ contain the
   same variable or are both ground. We now make a case distinction based on
   m1(s′). If s′ is small, then s is either non-small, i.e. m1(s) > m1(s′), or small
   and of the form ∃s.C′ ⊑? ∃s.D′. In the second case, we have m1(s) = m1(s′)
   and m3(s) > m3(s′). If s′ is non-small, then both C′ and D′ are variables,
   and thus s is also non-small, which yields m1(s) = m1(s′). Furthermore, the
   maximal role depth obviously decreases when going from s to s′, and thus
   m3(s) > m3(s′). In all cases we have shown m(s) ≻ m(s′), i.e., s ≻ s′.
 – If s has a variable as its right-hand side, it is of the form C1 ⊓ · · · ⊓ Cn ⊑? X
   and for every A ∈ SX there is a subsumption sA = C1 ⊓ · · · ⊓ Cn ⊑? A in Γ̂.
   If s is small, then n = 1 and C1 is ground, and thus the subsumptions sA are
   also small. Thus, we have m1(s) ≥ m1(sA) for every A ∈ SX. Furthermore,
   we have m2(s) > m2(sA) since A is ground or contains a variable on which
   X depends. This yields s ≻ sA, and thus by induction σ(C1) ⊓ · · · ⊓ σ(Cn) ⊑T
   σ(A) for every A ∈ SX, which implies that σ(C1) ⊓ · · · ⊓ σ(Cn) ⊑T σ(X) by
   the definition of σ.                                                                □

Completeness Assume that Γ0 is unifiable w.r.t. T and let γ be a ground unifier
of Γ0 w.r.t. T . We use this unifier to guide the application of the nondeterministic
rules such that Algorithm 5 does not fail. The following invariants for Γ and S
will be maintained:
  (I) γ is a unifier of Γ .
 (II) For all B ∈ SX we have γ(X) ⊑T γ(B).
Since SX is initialized to ∅ for all variables X ∈ Nv and Γ is initialized to Γ0 ,
these invariants are satisfied after the initialization of the algorithm.
   The invariants immediately rule out one cause of failure for the algorithm,
namely that the current assignment becomes cyclic. This is the only place in the
whole proof where our assumption on cycle-restrictedness of T is needed.
Lemma 8. If invariant (II) is satisfied, then the current assignment S is acyclic.
   The proofs of this and of the next lemma can be found in [1].
Lemma 9. Assume that the current set of subsumptions Γ and the current as-
signment S satisfy the invariants (I) and (II), and let s ∈ Γ be unsolved.
1. If an eager rule applies to s, then its application does not fail and the resulting
   set Γ′ and assignment S′ also satisfy the invariants (I) and (II).
2. If no eager rule applies to s, then there is a nondeterministic rule that can
   successfully be applied to s such that the resulting set Γ′ and assignment S′
   also satisfy the invariants (I) and (II).
    An immediate consequence of this lemma is that, if Γ0 is unifiable, then there
is a non-failing run of Algorithm 5 on Γ0 during which the invariants (I) and (II)
are satisfied. Together with the fact that every run of the algorithm terminates
(see below), this shows completeness, i.e., whenever Γ0 has a unifier w.r.t. T ,
the algorithm computes one.

Termination Consider a run of Algorithm 5. It is easy to show that any sub-
sumption encountered during this run falls into one of the following categories:
 1. subsumptions from Γ0 ;
 2. subsumptions created by expansion from Γ0: these are of the form C1 ⊓ · · · ⊓
    Cn ⊑? A for a subsumption C1 ⊓ · · · ⊓ Cn ⊑? X ∈ Γ0 and A ∈ Atnv;
 3. subsumptions of the form C ⊑? D for C, D ∈ At.
Since the cardinality of At is polynomially bounded by the size of Γ0 and T ,
there are only polynomially many subsumptions of this form. Rules are only
applicable to subsumptions that are marked unsolved, and the application of a
rule marks at least one subsumption as solved. Thus, only polynomially many
rules can be applied during the run. In addition, each rule application takes
only polynomial time. This shows that every run of the algorithm terminates in
polynomial time.
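
The counting argument can be made explicit with a rough bound read off from the three categories above. The concrete expression is an illustration and is not stated in this form in the paper; here \widehat{\Gamma} denotes the set of all subsumptions encountered during the run.

```latex
\[
  |\widehat{\Gamma}|
  \;\le\;
  \underbrace{|\Gamma_0|}_{\text{category 1}}
  \;+\;
  \underbrace{|\Gamma_0| \cdot |\mathit{At}_{nv}|}_{\text{category 2}}
  \;+\;
  \underbrace{|\mathit{At}|^{2}}_{\text{category 3}}
\]
```

Since every rule application marks at least one subsumption as solved and solved subsumptions never become unsolved again, the number of rule applications in a run is bounded by the same expression.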
Theorem 10. Algorithm 5 is an NP-decision procedure for unifiability in EL
w.r.t. cycle-restricted TBoxes.

5    Conclusions

We have presented a goal-oriented NP-algorithm for unification in EL w.r.t.
cycle-restricted TBoxes. In [3], we developed a reduction of this problem to a
propositional satisfiability problem (SAT), which is based on a characterization
of subsumption different from the one in Lemma 1. Though clearly better than
the brute-force algorithm introduced in [2], both algorithms suffer from a high
degree of nondeterminism introduced by having to guess atoms and GCIs from
the underlying cycle-restricted TBox. We have to find optimizations to tackle
this problem before an implementation becomes feasible.
    On the theoretical side, the main topic for future research is to consider
unification w.r.t. unrestricted general TBoxes. In order to generalize the brute-
force algorithm in this direction, we need to find a more general notion of locality.
Starting with the goal-oriented algorithm, the idea would be not to fail when
a cyclic assignment is generated, but rather to add rules that can break such
cycles, similar to what is done in procedures for general E-unification [11].


References
 1. Baader, F., Borgwardt, S., Morawska, B.: Unification in the description logic EL
    w.r.t. cycle-restricted TBoxes. LTCS-Report 11-05, Chair of Automata Theory, TU
    Dresden, Germany (2011), see http://lat.inf.tu-dresden.de/research/reports.html.
 2. Baader, F., Borgwardt, S., Morawska, B.: Extending unification in EL towards
    general TBoxes. In: Proc. KR’12. AAAI Press (2012), short paper. To appear.
 3. Baader, F., Borgwardt, S., Morawska, B.: SAT encoding of unification in ELHR+
    w.r.t. cycle-restricted ontologies. LTCS-Report 12-02, Chair for Automata The-
    ory, TU Dresden (2012), see http://lat.inf.tu-dresden.de/research/reports.html. A
    short version of this report has been submitted to a conference.
 4. Baader, F., Morawska, B.: Unification in the description logic EL. In: Proc. RTA’09.
    LNCS, vol. 5595, pp. 350–364. Springer (2009)
 5. Baader, F., Morawska, B.: SAT encoding of unification in EL. In: Proc. LPAR’10.
    LNCS, vol. 6397, pp. 97–111. Springer (2010)
 6. Baader, F., Morawska, B.: Unification in the description logic EL. Log. Meth.
    Comput. Sci. 6(3) (2010)
 7. Baader, F., Narendran, P.: Unification of concept terms in description logics. J.
    Symb. Comput. 31(3), 277–305 (2001)
 8. Baader, F., Nipkow, T.: Term Rewriting and All That. Cambridge University Press,
    United Kingdom (1998)
 9. Brandt, S.: Polynomial time reasoning in a description logic with existential re-
    strictions, GCI axioms, and—what else? In: Proc. ECAI’04. pp. 298–302. IOS Press
    (2004)
10. Hofmann, M.: Proof-theoretic approach to description-logic. In: Proc. LICS’05. pp.
    229–237. IEEE Press (2005)
11. Morawska, B.: General E-unification with eager variable elimination and a nice
    cycle rule. J. Autom. Reasoning 39(1), 77–106 (2007)