=Paper=
{{Paper
|id=None
|storemode=property
|title=Towards Practical Uniform Interpolation and Forgetting for ALC TBoxes
|pdfUrl=https://ceur-ws.org/Vol-1014/paper_61.pdf
|volume=Vol-1014
|dblpUrl=https://dblp.org/rec/conf/dlog/LudwigK13
}}
==Towards Practical Uniform Interpolation and Forgetting for ALC TBoxes==
<pdf width="1500px">https://ceur-ws.org/Vol-1014/paper_61.pdf</pdf>
<pre>
Towards Practical Uniform Interpolation and Forgetting
                   for ALC TBoxes

                              Michel Ludwig1,2 and Boris Konev1
         1
             Department of Computer Science, University of Liverpool, United Kingdom
                               Konev@liverpool.ac.uk

               2
                   Institute for Theoretical Computer Science, TU Dresden, Germany
                               Center for Advancing Electronics Dresden
                               michel@tcs.inf.tu-dresden.de


        Abstract We develop a clausal resolution-based approach for computing uni-
        form interpolants of TBoxes formulated in the description logic ALC when such
        uniform interpolants exist. We also present an experimental evaluation of our
        approach and its applications to concept forgetting, ontology obfuscation and lo-
        gical difference on real-life ALC ontologies. Our results indicate that in many
        practical cases a uniform interpolant exists and can be computed with the presen-
        ted algorithm.


1     Introduction
Ontologies or TBoxes expressed in Description Logics (DL) provide a common vocabu-
lary for a domain of interest together with a description of the meaning of the terms built
from the vocabulary and of the relationships between them. Modern applications of on-
tologies, especially in the biological, medical, or healthcare domain, often demand large
and complex ontologies; for example, the National Cancer Institute ontology (NCI) con-
sists of more than 60 000 term definitions. For developing, maintaining, and deploying
such large-scale ontologies it can be advantageous for ontology engineers to concen-
trate on specific parts of an ontology and ignore or forget the rest, for example, in the
following scenarios. Exhibiting hidden relations: in addition to the explicitly stated re-
lations between terms, additional relations can also be derived from ontologies with the
help of reasoners. Such inferred relations are often harder to understand or debug. By
forgetting everything but a handful of terms of interest, it then becomes possible to ex-
hibit inferred relations that are hidden initially, which can simplify the understanding of
the ontology structure. Ontology obfuscation: in software engineering, obfuscation [4]
transforms a given program into a functionally equivalent one that is more difficult to
read and understand for humans for the purpose of preventing reverse engineering. For-
getting can provide a similar function in the context of ontology engineering. Terms
are often defined with the help of auxiliary terms which help to give structure to TBox
inclusions but such a structure might be considered proprietary knowledge that should
not be exposed, or it could simply be of little interest for ontology users. By forgetting
    This research was supported by EPSRC grant EP/E065279/1.
these intermediate auxiliary terms, we obtain an ontology that is functionally equival-
ent, yet harder to read, understand, and modify by humans. Logical difference: when
modifying an ontology, an ontology engineer usually wants to ensure that her changes
do not interfere with the meaning of the terms outside the fragment under considera-
tion. Such a correctness guarantee can be achieved by checking the equivalence of the
ontologies resulting from forgetting the terms under consideration before and after the
changes occurred. Further applications of forgetting can be found in [11, 12].
    Ignoring parts of an ontology can be formalised with the help of predicate forgetting
and its dual uniform interpolation, which have both been extensively studied in the AI
and DL literature [3,6,8,11,13,16–18]. However, to the best of our knowledge, previous
research in this area in the setting of DL mainly concentrates on theoretical foundations
of forgetting and, except for practical work on lightweight description logics [11], we
are not aware of any attempts to compute uniform interpolants for real-life ontologies
in practice. This could be partly explained by the high computational complexity of this
task and by the fact that uniform interpolants do not always exist.
    In this paper we develop an algorithm based on clausal resolution for computing
uniform interpolants of TBoxes formulated in the description logic ALC which can pre-
serve all the consequences that do not make use of some given concept names. We also
present an experimental evaluation of our approach which demonstrates that in many
practical cases uniform interpolants exist and that they can be computed with our al-
gorithm. All missing proofs can be found in the full version of this paper, which is avail-
able from http://lat.inf.tu-dresden.de/˜michel/publications/.


2   Preliminaries
We start with introducing the description logic ALC. Let NC and NR be countably
infinite and mutually disjoint sets of concept names and role names. ALC-concepts are
built according to the following syntax rule

                  C ::=     A |     >    |   ¬C    |   ∃r.C    | C u D,
where A ∈ NC and r ∈ NR . As usual, other ALC concept constructors are introduced
as abbreviations: ⊥ stands for ¬>, C t D stands for ¬(¬C u ¬D) and ∀r.C stands
for ¬∃r.¬C. An ALC-TBox T is a finite set of ALC-inclusions of the form C v D,
where C and D are ALC-concepts. A concept equation C ≡ D is an abbreviation for
the two inclusions C v D and D v C. An ALC-TBox is acyclic if all its inclusions
are of the form A v C and A ≡ C, where A ∈ NC and C is an ALC-concept, such that
no concept name occurs more than once on the left-hand side and T contains no cycle
in its definitions, that is, it does not contain inclusions A1 ./ C1 ,. . . , Ak ./ Ck , where
./ ∈ {v, ≡}, such that Ai+1 occurs in Ci , for i = 1, . . . , k − 1 and Ak = A1 .
     A signature Σ is a finite subset of NC ∪NR . The signature of a concept C, denoted by
sig(C), is the set of concept and role names that occur in C. If sig(C) ⊆ Σ, we call C
a Σ-concept. We assume that the two previous definitions also apply to concept inclu-
sions/equations C ./ D with ./ ∈ {v, ≡} and to TBoxes T . The size of a concept C is
the length of the string that represents it, where concept names and role names are con-
sidered to be of length one. The size of an inclusion/equation C ./ D with ./ ∈ {v, ≡}
is the sum of the sizes of C and D plus one. The size of a TBox T is the sum of the
sizes of its inclusions.
     The semantics of ALC is given by interpretations I = (∆I , ·I ), where the domain
∆ is a non-empty set, and ·I is a function mapping each concept name A to a subset
   I

AI of ∆I , each role name r to a binary relation rI ⊆ ∆I × ∆I . The extension C I of
a concept C is defined by induction as follows:
     >I    := ∆I                 (∃r.C)I := { d ∈ ∆I | ∃e ∈ C I : (d, e) ∈ rI }
     (¬C) := ∆I \ C I
         I
                                 (C u D)I := C I ∩ DI .

Then I satisfies a concept inclusion C v D, in symbols I |= C v D, if C I ⊆ DI .
    We say that an interpretation I is a model of a TBox T if I |= C v D for all
C v D ∈ T . An ALC-inclusion C v D follows from (or is entailed by) a TBox T if
every model of T is a model of C v D, in symbols T |= C v D. We use |= C v D to
denote that C v D follows from the empty TBox. Finally, a TBox T 0 follows from (or
is entailed by) a TBox T if every model of T is a model of T 0 , in symbols T |= T 0 .
    We now introduce the main notion that we study in this paper.
Definition 1. Let T be an ALC-TBox and let Σ ⊆ sig(T ) be a signature. We say that
an ALC-TBox TΣ is a Σ-uniform interpolant of the TBox T iff sig(TΣ ) ⊆ Σ, T |= TΣ ,
and for every ALC Σ-concept inclusion C v D with T |= C v D it holds that
TΣ |= C v D.
Uniform interpolation can be seen as the dual notion of forgetting: a TBox TΥ is the
result of forgetting about a signature Υ in a TBox T iff TΥ is a uniform interpolant
of T w.r.t. Σ = sig(T ) \ Υ . As the following example shows, uniform interpolants of
ALC-TBoxes do not always exist.
Example 2. Let T = {A v B, B v C u ∃r.B} and Σ = {A, C, r}. Then there
does not exist a Σ-uniform interpolant of T as (in particular) the infinite number of
consequences of the form A v ∃r.C, A v ∃r.∃r.C, . . . cannot be captured by an
ALC-TBox T 0 with sig(T 0 ) ⊆ Σ. On the other hand, for T 0 = {A v B, B v
C u ∃r.B, D ≡ B} and Σ 0 = {A, C, D, r}, a Σ 0 -uniform interpolant of T 0 is {A v
D, D v C u ∃r.D}.
     A 2-E XP T IME-complete bound on the complexity for deciding the existence of a
Σ-uniform interpolant in ALC and a worst-case triple-exponential procedure for com-
puting a Σ-uniform interpolant if it exists, have been given in [13]. The proof of the
upper bound proceeds by ‘internalisation’ of the TBox: if a Σ-uniform interpolant ex-
ists, then there exists a concept CT of size double exponential in the size of T having the
property that for any Σ-inclusion C v D it holds that T |= C v D iff |= C u CT v D.
Then the authors compute C 0 , a Σ-concept uniform interpolant [3] of CT , and show
that > v C 0 is a Σ-uniform interpolant of T .


3   Computing Uniform Interpolants by ALC-Resolution
The aim of our work is to investigate a practical approach for computing uniform inter-
polants when they exist. Note that the procedure given in [13] is inherently inefficient
as it requires one to explicitly construct the double-exponential size internalisation CT
of a given TBox T .
    Our approach is to introduce a resolution-like calculus for ALC that derives con-
sequences of a TBox T such that a concept inclusion C v D is entailed by T iff a con-
tradiction can be derived from T and C u ¬D. Similarly to [8], we then show that any
derivation can be restructured in such a way that inferences on selected concept names
always precede inferences on other concept names. Then, if the signature Σ is such that
sig(T ) \ Σ only contains concept names, we generate a set of Σ-consequences T 0 of T
by applying the inference rules in a forward chaining manner such that for an arbitrary
Σ-inclusion C v D a contradiction can be derived from T and C u ¬D iff a con-
tradiction can be derived from T 0 and C u ¬D. Thus, if the forward-chaining process
terminates, T 0 is a Σ-uniform-interpolant for T .

ALC-Resolution. ALC-resolution operates on ALC formulae in conjunctive normal
form defined according to the following grammar (this is similar to [8]):
                       Literal ::= A | ¬A | ∀r.Clause | ∃r.CNF
                       Clause ::= Literal | Clause t Clause | ⊥
                        CNF ::= > | Clause | Clause u CNF
To simplify the presentation, we assume that clauses are sets of literals and that CNF
expressions are sets of clauses. Then ⊥ corresponds to the empty clause and > to the
empty set of clauses. In the following, the calligraphic letters C, D, E symbolise clauses
and F, G represent sets of clauses. Similarly to first-order formulae, every ALC concept
can be transformed into an equivalent set of ALC clauses. The depth of a clause C,
Depth(C), is defined to be the maximal nesting depth of the quantifiers contained in C.
    We additionally assume that every clause is assigned a type. Clauses obtained from
the clausification of TBox inclusions are of the type universal, and clauses resulting
from the clausification of inclusions to be tested for entailment by the TBox are of the
type initial. The type of a derived clause is determined by the types of the clauses from
which it is derived and by the derivation rule that is used.
Example 3. The clausification of T from Example 2 produces three universal clauses:
¬A t B, ¬B t C, ¬B t ∃r.B.
We now introduce the two resolution calculi T and Tu . The former calculus assumes
the TBox to be empty, whereas the latter takes TBox inclusions into account. Thus, T
derives the empty clause from the set of initial clauses stemming from the clausification
of an inclusion C v D iff |= C v D; and Tu derives the empty clause from the
universal clauses stemming from the clausification of a TBox T and the initial clauses
stemming from the clausification of an inclusion C v D iff T |= C v D.
    The calculus T is defined with the help of the relation ⇒α given in Fig. 1. For every
α ∈ NC ∪{⊥}, the relation ⇒α associates with a set of clauses N a new clause C which
can be ‘derived’ from the set N by ‘resolving’ on α. T now consists of the following
two inference rules.
                 C (if C ⇒ E)                    C D (if C, D ⇒ E),
                            α                                     α
                 E                                 E
where C, D, and E are initial clauses.
           (rule ⊥)     C10 t ∀r.⊥, C20 t ∃r.F =⇒⊥ C10 t C20
           (rule A)     C10 t A, C20 t ¬A =⇒A C10 t C20
          (rule ∀∃)     C10 t ∀r.C1 , C20 t ∃r.(C2 , F) =⇒α C10 t C20 t ∃r.(C2 , F, C3 )
                                                                        if C1 , C2 =⇒α C3
          (rule ∀∀)     C10 t ∀r.C1 , C20 t ∀r.C2 =⇒α C10 t C20 t ∀r.C3
                                                                        if C1 , C2 =⇒α C3
          (rule ∃1 )    C 0 t ∃r.(C1 , F) =⇒α C 0 t ∃r.(C1 , F, C2 )
                                                                            if C1 =⇒α C2
          (rule ∃2 )    C 0 t ∃r.(C1 , C2 , F) =⇒α C 0 t ∃r.(C1 , C2 , F, C3 )
                                                                        if C1 , C2 =⇒α C3
           (rule ∀)     C 0 t ∀r.C1 =⇒α C 0 t ∀r.C2
                                                                            if C1 =⇒α C2

                                    Figure 1. Rules of =⇒α .


    The calculus Tu operates initial and universal clauses and also consists of two rules:
                     C (if C ⇒ E)                       C 0 D (if C 0 , D ⇒u E 0 ),
                               α                                           α
                     E                                     E0
where C, C 0 , D are initial or universal clauses, and C 0 , D ⇒uα E 0 holds iff either C 0 , D ⇒α
E 0 , or D is a universal clause and there exist role names r1 , . . . , rn ∈ NR (n ≥ 1) such
that C 0 , ∀r1 . . . . ∀rn .D ⇒α E 0 . (Intuitively, the calculus Tu allows for inferences with
universal clauses at arbitrary nesting levels of quantifiers, which the calculus T does
not.) Then E is a universal clause if C is a universal clause, and an initial clause other-
wise. Similarly, E 0 is a universal clause if both C 0 and D are universal clauses, and an
initial clause otherwise.

Example 4. For the universal clauses from Example 3, we have for instance,
                  ¬A t B, ¬B t ∃r.B ⇒B ¬A t ∃r.B by (rule A).
So, the universal clause ¬A t ∃r.B is derivable by Tu from ¬A t B and ¬B t ∃r.B.
As ¬B t C is a universal clause and
              ¬B t ∃r.B, ∀r.¬B t C ⇒B ¬B t ∃r.(B, C) by (rule ∀∃),
the universal clause ¬B t ∃r.(B, C) is derivable by Tu from ¬B t ∃r.B and ¬B t C.
By applying the inference rules to old and newly generated clauses, one can conclude
that the universal clauses ¬A t ∃r.(B, C) and ¬A t ∃r.(B, ∃r.B) are also derivable by
Tu from N = {¬A t B, ¬B t C, ¬B t ∃r.B}.

    For x ∈ {T, Tu }, a x-derivation (tree) ∆ built from a set of clauses N is a finite
binary tree where each leaf is labelled with a clause from N and each non-leaf node n
is labelled with a clause C such that C results from an x-inference on the parent(s)
of n in ∆. We say that ∆ is a derivation of a clause C if the root of ∆ is labelled
with C. A derivation of the empty clause is called a refutation. Every path n1 , . . . , nm
of nodes in ∆ where n1 is a leaf node and nm is the root node induces an inference
path α2 , . . . , αm where αi ∈ NC ∪ {⊥} (2 ≤ i ≤ m) denotes the concept name, or ⊥,
which has been resolved upon to obtain the clause that is the label of the node ni . For
a signature Υ ⊆ NC and a strict total order  ⊆ Υ × Υ , a derivation ∆ is a (x, Υ, )-
derivation if for every inference path α1 , . . . , αn of ∆ (with αi ∈ NC ∪ {⊥} for every
1 ≤ i ≤ n) there exists 0 ≤ k ≤ n such that {α1 , . . . , αk } ⊆ Υ , αj  αj+1 or
αj = αj+1 for every 1 ≤ j < k, and αj 6∈ Υ for every k < j ≤ n.
    The calculus T is a direct adaption of the calculus for the modal logic K introduced
in [7] to ALC, and so |= > v C iff the empty clause can be derived from the clausifica-
tion of ¬C. However, as we show in the full version of the paper, the proof given in [7]
of the fact that any derivation in T can be reordered so that inferences on concept names
from Υ always precede inferences on other concept names, or ⊥, appears to have some
gaps. We prove the following theorem which extends the results stated in [8].

Theorem 5 (T-Completeness). Let Υ ⊆ NC , let  ⊆ Υ × Υ be a strict total order
on Υ and let C be an ALC concept. Then it holds that |= C v D iff there exists a
(T, Υ, )-derivation of the empty clause from the initial clauses Cls(C u ¬D).

To prove completeness for Tu , we observe the following link between derivations in T
and Tu . Let N be a set of clauses and let
                                                    [
 Univ0 (N ) = N ; Univi+1 (N ) = Univi (N ) ∪               { ∀r.D | D ∈ Univi (N ) }
                S                              r∈NR ∩sig(N )
and Univ(N ) = i≥0 Univi (N ).

Theorem 6. Let M be a set of initial clauses and let N be a set of universal clauses.
Additionally, let ∆ be a (T, Υ, )-refutation from M ∪ Univ(N ) such that there exists
n ∈ N with Depth(C) ≤ n for every C ∈ Clauses(∆). Then there exists a (Tu , Υ, )-
derivation ∆u of the empty clause from M ∪ N such that Depth(C) ≤ n for every
C ∈ Clauses(∆u ).

We then use Theorem 5 and the fact that every ALC-TBox can be internalised. Notice
that the actual TBox internalisation CT does not have to be computed as it is only used
for the proof of completeness.

Corollary 7 (Tu -Completeness). Let T be an ALC-TBox, let Υ ⊆ NC , let  ⊆ Υ × Υ
be a strict total order on Υ and let C be an ALC concept. Then it holds that T |= C v
D iff there exists a (Tu , Υ, )-derivation of the empty clause from the universal clauses
Cls(T ) and the initial clauses Cls(C u ¬D).


Computing Uniform Interpolants. The procedure U NIFORM I NTERPOLANT depicted
in Algorithm 1 takes as input an ALC-TBox T , a signature Σ ⊆ sig(T ) such that
Σ ∩ NR = sig(T ) ∩ NR and a strict total order  ⊆ Υ × Υ over Υ = sig(T ) \ Σ.
Following the outline of [8], after the clausification of T , the procedure iterates over
the concept names contained in Υ in descending order according to the relation . In
each iteration all possible Tu -inferences on the current concept name A ∈ Υ from the
clause set N obtained in the previous iteration are computed. Finally, after iterating
over all the concept names from Υ = sig(T ) \ Σ, the operator ‘Supp’ is applied on the
resulting clauses, which replaces all occurrences of Υ concept names in clauses with >
and then simplifies the resulting CNF.
Algorithm 1
 1: procedure U NIFORM I NTERPOLANT(T , Σ, )
 2:    Υ := sig(T ) \ Σ
 3:    N := Cls(T )
 4:    while Υ 6= ∅ do
 5:        A := max (Υ )
 6:        N := Res∞ Tu ,{A} (N )
 7:        Υ := Υ \ {A}
 8:    end while
 9:    return FΣ (T ) = Supp(sig(T ) \ Σ, N )
10: end procedure


Example 8. For the clauses derived in Example 4, Supp({B}, ¬A t C) = A t C,
Supp({B}, ¬A t ∃r.B) = ¬A t ∃r.>, Supp({B}, ¬A t ∃r.(B, C)) = ¬A t ∃r.C.
     One can show that if Algorithm 1 terminates, for all ALC Σ-concepts C, D such
that there exists a (Tu , Υ, )-refutation ∆u from the universal clauses Cls(T ) and the
initial clauses Cls(C u ¬D) it holds that FΣ (T ) |= C v D. Thus, it follows from
Corollary 7 that if Algorithm 1 terminates, it computes a Σ-uniform interpolant of T .
However, Algorithm 1 does not terminate if a uniform interpolant does not exist. For
example, when applied to T from Example 2, Algorithm 1 can generate, among others,
the infinite sequence of universal clauses ¬A t ∃r.C, ¬A t ∃r.(C, ∃r.C), . . . and so it
does not terminate. Moreover, as T from Example 2 is a subset of T 0 , and so Cls(T ) ⊆
Cls(T 0 ), Algorithm 1 will derive, among others, the same clauses when it is applied
on T 0 . Thus, in some cases Algorithm 1 does not terminate even though a uniform
interpolant exists.
     To guarantee termination on all inputs, we focus on the notion of depth-bounded
uniform interpolation (related to the notion of ‘bounded forgetting’ [19]). Let T be
an ALC-TBox and let Σ ⊆ sig(T ) be a signature. We say that an ALC-TBox TΣ
is a depth n-bounded uniform interpolant of the TBox T w.r.t. Σ iff sig(TΣ ) ⊆ Σ,
T |= TΣ , and for every ALC Σ-concept inclusion C v D with T |= C v D and
max{Depth(C), Depth(D)} ≤ n it holds that TΣ |= C v D. Let FΣ,m (T ) be the
outcome of Algorithm 1 where in Step 6 only clauses up to depth m are generated. The
following example shows that it might be necessary to consider intermediate clauses of
a depth m > n in order to preserve all the Σ-consequences of depth n entailed by T .
Example 9. Let T = {¬At∃r.C, ¬C t∃s.>, B t∀s.⊥}, Σ = {A, B, r, s}, Υ = {C}
and = ∅. Then every (Tu , Υ, )-refutation from the universal clauses Cls(T ) and the
initial clauses {A, ∀r.¬B} derives the clause ¬A t ∃r.(C, ∃s.>).
We establish, however, that by choosing the maximal depth of derived clauses appro-
priately, the procedure depicted in Algorithm 1 computes uniform interpolants that pre-
serve consequences up to a specified depth n.
Theorem 10. Let T be an ALC-TBox, Σ ⊆ sig(T ) a signature such that Σ ∩ NR =
sig(T ) ∩ NR , and let n ≥ 0. Set m = n + 2|sub(Cls(T ))|+1 + max{ Depth(C) | C ∈
Cls(T ) }. Then it holds that FΣ,m (T ) is a depth n-bounded uniform interpolant of the
TBox T w.r.t. Σ.
We can combine this result with the bound on the size of uniform interpolants in [13]:
for any ALC-TBox T and signature Σ (with Σ ∩ NR = sig(T ) ∩ NR ), if a Σ-uniform
interpolant of T exists, there exists m, which can be computed based on the bound in
Theorem 10 and the results of [13], such that FΣ,m (T ) is a Σ-uniform interpolant of T .
    The bound in Theorem 10 can be significantly improved if the TBox is acyclic.
For an acyclic ALC-TBox T we define ExpansionDepth(T ) = max{ Depth(A[T ]) |
A ∈ sig(T ) }, where A[T ] denotes the concept obtained by exhaustively replacing
every concept B with CB if B v CB ∈ T or B ≡ CB ∈ T .
Theorem 11. Let T be an acyclic ALC TBox, Σ ⊆ sig(T ) a signature such that Σ ∩
NR = sig(T ) ∩ NR , and let n ≥ 0. Set m = ExpansionDepth(T ) + n. Then it holds
that FΣ,m (T ) is a uniform interpolant limited to consequence depth n of the TBox T
w.r.t. Σ.
Note that in the description logic EL (i.e. the fragment of ALC that does not allow ⊥,
negation, disjunction, or universal quantification) the acyclicity of a TBox guarantees
the existence of uniform interpolants [11] for any signature Σ. Interestingly, this is
not so in the case of ALC. Moreover, as the following example shows, there exists an
acyclic EL-TBox T and a signature Σ for which no ALC Σ-uniform interpolant exists.
Example 12. Consider Σ = {A, A0 , A1 , A2 , E, r} and T = {A v ∃r.B, A0 v
∃r.(A1 u B), E ≡ A1 u B u ∃r.(A2 u B)}. Then for every n ≥ 0
                   ln
         T |= A0 u    ∀r. . . ∀r.}(A u ¬E u (A1 t A2 )) v |∃r. .{z
                      | .{z                                      . . ∃r.} A1 .
                       i=1      i                                       n
This infinite sequence of ALC consequences of T cannot be captured by any ALC Σ-
TBox T 0 , which can be proved formally using the criterion (∗m ) of Theorem 9 in [13].

4   Case Study
We have implemented a prototype of an inference computation architecture using the
calculus Tu and the inference relation ⇒α in Java. It has turned out that our initial
implementation of Algorithm 1 did not perform well in practice. This is in particular
due to the fact that clauses can contain sets F of other clauses in existential literals ∃r.F,
which renders all the possible inferences on clauses from F ‘explicit’. For example, if
we resolve the universal clause which just consists of the existential literal ∃r.(A) with
the universal clauses ¬A t B1 , . . . , ¬A t Bn on the concept name A, then not only the
clauses ∃r.(A, B1 ), ∃r.(A, B2 ), ∃r.(A, B3 ),. . . could be derived but all clauses of the
form ∃r.(A, G), where G is a subset of {B1 , . . . , Bn }.
    A common technique to reduce the number of inferences that have to be made is to
use forward- and backward deletion of subsumed clauses [2]. However, it is known [1]
that the subsumption lemma (stating that if a clause E results from an inference in-
volving two clauses C and D, and if there exist clauses C 0 , D0 such that C 0 subsumes C
and D0 subsumes D, then either E is subsumed by one of C 0 , D0 , or a clause E 0 can be de-
rived from C 0 and D0 such that E 0 subsumes E) does not hold even in the modal logic K
for the standard minimal subsumption relation ≤s and ⇒α . To be able to prove that one
can safely discard subsumed clauses, we have modified the inference relation ⇒α by
introducing the following additional rule
          (rule ∃f )     C1 t ∀r.D, C2 t ∃r.F =⇒∃f C1 t C2 t ∃r.(F, D).
We will denote the resulting inference relation by ⇒fα with α ∈ NC ∪ {⊥, ∃f }. One can
then prove that a variant of the subsumption lemma holds for the relations ≤s and ⇒fα ,
which allows us to employ forward- and backward deletion of subsumed clauses in our
implementation.
    In order to further speed up computations, we first extract the locality-based >⊥∗
Σ-module [5, 14] for a given TBox T . The locality-based module entails the same Σ-
inclusions as the TBox T but it is often considerably smaller in size. We also rely on
ontologies to have structure: if a concept name occurs in several inclusions, it is likely
that it occurs in the same syntactic pattern. Thus,
 1. If the clause set contains some clauses C1 t DΥ , . . . , Cm t DΥ such that for every
    1 ≤ i ≤ m we have sig(Ci ) ∩ Υ = ∅, we rewrite them into X t DΥ , where
    X ≡ C1 u . . . u Cm , perform forgetting on Υ symbols and then replace X with its
    definition.
 2. If the clause set contains a clause C t ∃r.(FΥ , G1 ) t . . . t ∃r.(FΥ , Gm ) where
    sig(Gi ) ∩ Υ = ∅ for every 1 ≤ i ≤ m, we rewrite it into C t ∃r.(FΥ , Y ), where
    Y ≡ G1 t . . . t Gm , perform forgetting on Υ and then replace Y with its definition.

Experimental setting. All experiments were conducted on PCs equipped with an Intel
Core i5-2500K CPU running at 3.30GHz. 15 GiB of RAM were allocated to the Java
VM and an execution timeout of 60 CPU minutes was imposed on each problem. We
only considered ALC-fragments of ontologies, i.e. for a given ALC-ontology T we re-
moved all the axioms which contain non-ALC concept (or role) constructors to obtain
its ALC-fragment. We used Algorithm 1 to forget concept names one by one (i.e. for
sig(T ) \ Σ = {A1 , . . . , An }, Algorithm 1 was applied iteratively on A1 , . . . , An ), and
we did not impose a bound on the depth of clauses. After each run of Algorithm 1 we
further simplified the resulting clause set N by removing clauses D for which there
exists C ∈ N with C ≤s D. Thus, in all the experiments reported on in this section we
computed true Σ-uniform interpolants (i.e. not a depth-bounded variant). The correct-
ness of our extensions to Algorithm 1 can be shown by model-theoretic arguments.
Exhibiting Hidden Relations. We applied our uniform interpolation tool to compute
uniform interpolants w.r.t. small signatures Σ ⊆ sig(T ) with sig(T ) ∩ NR = Σ ∩ NR
for a fragment of version 1.4 of the Drug Interaction Knowledge Base3 (DIKB) and for
a fragment of version 08.10e of the National Cancer Institute Thesaurus4 (NCI) that are
of expansion depth 3 (that is, we removed all the axioms from both ontologies that led
to an expansion depth greater than 3). The resulting DIKB fragment is a small acyclic
terminology that contains 120 concept names, 27 roles names, and 127 axioms. The
NCI fragment is also an acyclic terminology with 53571 concept names, 78 role names
and 62494 axioms (of which 2362 are of the form A ≡ C). For each considered sample
size x and terminology T we generated 100 signatures Σ by randomly choosing x
concept names from sig(T ) and by adding all the role names from sig(T ) to Σ.
 3
     Available from http://bioportal.bioontology.org/ontologies/1672
 4
     Available from http://evs.nci.nih.gov/ftp1/NCI_Thesaurus
                                  Successful    Average Nr        Average Maximal
        Ontology     |Σ ∩ NC |
                                 Computations   of Axioms           Axiom Size
                         5           85             7.482              19.176
       DIKB v1.4        10           60           14.033               24.933
                        15           44           25.114               26.568
                        50           65            21.369              30.538
       NCI v08.10e     100           56            41.089              62.839
                       150           41            63.146             104.756

 Table 1. Computing Uniform Interpolants of DIKB and of NCI Limited to Expansion Depth 3


    The results that we obtained for the two ontologies are shown in Table 1. The third
column gives the number of signatures for which a uniform interpolant could be com-
puted within the given time limit, and the average number of axioms and the average of
the maximal size of the axioms contained in each uniform interpolant are shown in the
subsequent columns. Most uniform interpolants could be computed within 60 seconds.
In a few cases, however, the computations took up to 3487 seconds.
    One can see that the number of successful computations decreased with increas-
ing size of Σ ∩ NC , which seems to be due to the fact that the >⊥∗ Σ-modules then
contain more symbols that lead to a large number of inferences. Most uniform inter-
polants that we have obtained are relatively small and contain a lot of expressions of
the form ∃r1 . . . ∃rn .>. In some cases the process of forgetting certain intermediate
concept names generated a few hundred clauses that were simplified or deleted in the
remaining computation steps.
Ontology Obfuscation. As a proof of concept, we applied our uniform interpolation
tool on (a fragment of) the Lipid Ontology5 (LiPrO) to forget 45 concept names which
are intermediate concept names in the ontology’s induced concept hierarchy, i.e. those
concept names group certain subconcepts together to give structure to the ontology.
LiPrO is an acyclic terminology with 593 axioms, 574 concept names and one role
name. The maximal size of an axiom is 50.
    It then took 192 CPU seconds to compute the uniform interpolant, which contains
3415 axioms with a maximal size of 283. The uniform interpolant that we computed
thus approximately contains 6 times more axioms than the original ontology and the
maximal axiom size has increased by a factor of 6 as well. Notice that most of the ori-
ginal structure of the ontology has been destroyed while preserving all the consequences
entailed by the retained concept names.
Logical Difference. We also used our implementation for the computation of the logical
difference [9] between two versions of the NCI thesaurus on various signatures. The
Σ-logical difference between ALC-TBoxes T1 and T2 is the set Diff Σ (T1 , T2 ) of all
ALC-concept inclusions C v D such that sig(C v D) ⊆ Σ, T1 |= C v D, and
T2 6|= C v D. Notice that if Diff Σ (T1 , T2 ) is nonempty, it is infinite already.
                                                                           (Σ)      (Σ)
     It is easy to see that Diff Σ (T1 , T2 ) = ∅ if, and only if, T2 |= T1 where T1 is a
                                                          (Σ)                           (Σ)
Σ-uniform interpolant of T1 . Moreover, if T2 6|= T1 , every inclusion C v D ∈ T1
 5
     Available from http://bioportal.bioontology.org/ontologies/1183
         |(sig(T ) \ Σ)    Successful       Average Nr        Average Maximal
              ∩NC |       Computations      of Witnesses        Witness Size
                50            100            1976.600              69.380
               100             99            1975.939              74.969
               150             95            1974.874              72.284

    Table 2. Computing Uniform Interpolants of NCI-08.10e Limited to Expansion Depth 3

with T2 6|= C v D is said to be a witness of Diff Σ (T1 , T2 ). In our experiments we
used the reasoner FaCT++ v1.6.2 [15] to determine whether any axiom C v D ∈
  (Σ)
T1     is a witness of Diff Σ (T1 , T2 ). To make the experiments more challenging for
the reasoner, we focused on comparing (a fragment of) NCI v08.10e (as T1 ) with (a
fragment of) NCI v.08.09d (as T2 ), both limited to an expansion depth of 3, on large
signatures Σ with Σ∩NR = sig(T )∩NR . In that way the computed uniform interpolants
remain rather large as well. For each sample size x ∈ {50, 100, 150} we again generated
100 signatures by randomly choosing |sig(T )∩NC |−x concept names from sig(T ) and
by including all the role names from sig(T ). The results that we obtained are now shown
in Table 2. Note that these results are not directly comparable with the logical difference
for the description logics of the EL family [9, 10] as illustrated by Example 12.
    One can observe that as size of sig(T ) \ Σ increases, i.e. more symbols have to
be forgotten from the >⊥∗ Σ-modules, the success rate dropped slightly. Overall, the
average number of witnesses and the average maximal size of the witnesses remains
comparable throughout the different sample sizes. Also, the axioms generated by the
computation of the uniform interpolant did not pose a problem for FaCT++ as comput-
ing the logical difference for a given signature never took more than 20 seconds in our
experiments.

5   Conclusion
In this paper we presented an approach based on clausal resolution for computing uni-
form interpolants of ALC-TBoxes T w.r.t. signatures Σ ⊆ sig(T ) that contain all the
role names present in T . We proved that whenever the saturation process under ALC-
resolution terminates, the algorithm computes a uniform interpolant. To guarantee ter-
mination on all inputs, we introduced a depth-bounded version of our algorithm. We
showed that by choosing an appropriate bound on the depth of clauses, one can axio-
matise all Σ-inclusions implied by the given TBox up to a specified depth. Combined
with a known bound on the size of uniform interpolants, our depth-bounded procedure
always computes a uniform interpolant if it exists.
     In the second part of this paper we investigated how often our unrestricted resolution-
based algorithm terminates with a uniform interpolant by applying our prototype im-
plementation on a number of case studies. Our findings suggest that despite a high com-
putational complexity uniform interpolants can be computed in many practical cases.
The computation procedure could further benefit from better redundancy elimination
techniques, which, together with extending our approach to forgetting role names, con-
stitutes future work. It would also be interesting to explore proof strategies for our
resolution calculi that guarantee termination when uniform interpolants exist.
References
 1. Auffray, Y., Enjalbert, P., Hébrard, J.J.: Strategies for modal resolution: Results and prob-
    lems. Journal of Automated Reasoning 6(1), 1–38 (1990)
 2. Bachmair, L., Ganzinger, H.: Resolution theorem proving. In: Handbook of automated reas-
    oning, vol. 1, chap. 2, pp. 19–99. Elsevier (2001)
 3. ten Cate, B., Conradie, W., Marx, M., Venema, Y.: Definitorially complete description logics.
    In: Proceedings of the Tenth International Conference on Principles of Knowledge Repres-
    entation and Reasoning (KR 2006). pp. 79–89. AAAI Press (2006)
 4. Collberg, C.S., Thomborson, C.D., Low, D.: Manufacturing cheap, resilient, and stealthy
    opaque constructs. In: Proceedings of the 25th ACM SIGPLAN-SIGACT Symposium on
    Principles of Programming Languages (POPL ’98). pp. 184–196. ACM (1998)
 5. Cuenca Grau, B., Horrocks, I., Kazakov, Y., Sattler, U.: Modular reuse of ontologies: theory
    and practice. Journal of Artificial Intelligence Research (JAIR) 31, 273–318 (2008)
 6. Eiter, T., Ianni, G., Schindlauer, R., Tompits, H., Wang, K.: Forgetting in managing rules
    and ontologies. In: Proceedings of the 2006 IEEE / WIC / ACM International Conference on
    Web Intelligence (WI 2006). pp. 411–419. IEEE Computer Society (2006)
 7. Enjalbert, P., del Cerro, L.F.: Modal resolution in clausal form. Theoretical Computer Sci-
    ence 65(1), 1–33 (1989)
 8. Herzig, A., Mengin, J.: Uniform interpolation by resolution in modal logic. In: Proceedings
    of the 11th European Conference on Logics in Artificial Intelligence (JELIA 2008). Lecture
    Notes in Computer Science, vol. 5293, pp. 219–231. Springer (2008)
 9. Konev, B., Walther, D., Wolter, F.: The logical difference problem for description logic ter-
    minologies. In: Proceedings of the 4th International Joint Conference on Automated Reas-
    oning (IJCAR 2008). Lecture Notes in Computer Science, vol. 5195, pp. 259–274. Springer
    (2008)
10. Konev, B., Ludwig, M., Walther, D., Wolter, F.: The logical difference for the lightweight
    description logic EL. Journal of Artificial Intelligence Research (JAIR) 44, 633–708 (2012)
11. Konev, B., Walther, D., Wolter, F.: Forgetting and uniform interpolation in large-scale de-
    scription logic terminologies. In: Proceedings of the 21st International Joint Conference on
    Artificial Intelligence (IJCAI 2009). pp. 830–835 (2009)
12. Lutz, C., Seylan, I., Wolter, F.: An automata-theoretic approach to uniform interpolation and
    approximation in the description logic EL. In: Proceedings of the Thirteenth International
    Conference on Principles of Knowledge Representation and Reasoning (KR 2012). AAAI
    Press (2012)
13. Lutz, C., Wolter, F.: Foundations for uniform interpolation and forgetting in expressive de-
    scription logics. In: Walsh, T. (ed.) Proceedings of the 22nd International Joint Conference
    on Artificial Intelligence (IJCAI 2011). pp. 989–995 (2011)
14. Sattler, U., Schneider, T., Zakharyaschev, M.: Which kind of module should I extract? In:
    Proceedings of the 22nd International Workshop on Description Logics (DL 2009). CEUR
    Workshop Proceedings, vol. 477. CEUR-WS.org (2009)
15. Tsarkov, D., Horrocks, I.: FaCT++ Description Logic reasoner: System description. In:
    Proceedings of the Third International Joint Conference on Automated Reasoning (IJCAR
    2006). Lecture Notes in Computer Science, vol. 4130, pp. 292–297. Springer (2006)
16. Wang, K., Wang, Z., Topor, R., Pan, J.Z., Antoniou, G.: Eliminating concepts and roles
    from ontologies in expressive descriptive logics. Computational Intelligence (Accepted for
    publication), DOI: 10.1111/j.1467-8640.2012.00442.x
17. Wang, Z., Wang, K., Topor, R., Pan, J.Z.: Forgetting Concepts in DL-Lite. In: Proceedings
    of the 5th European Semantic Web Conference (ESWC2008). Lecture Notes in Computer
    Science, vol. 5021, pp. 245–257. Springer (2008)
18. Wang, Z., Wang, K., Topor, R.W., Zhang, X.: Tableau-based forgetting in ALC ontologies.
    In: Proceedings of the 19th European Conference on Artificial Intelligence (ECAI 2010).
    Frontiers in Artificial Intelligence and Applications, vol. 215, pp. 47–52 (2010)
19. Zhou, Y., Zhang, Y.: Bounded forgetting. In: Proceedings of the Twenty-Fifth AAAI Confer-
    ence on Artificial Intelligence (AAAI 2011). AAAI Press (2011)

</pre>