=Paper=
{{Paper
|id=Vol-477/paper-6
|storemode=property
|title=A Tableaux-based Method for Computing Least Common Subsumers for Expressive Description Logics
|pdfUrl=https://ceur-ws.org/Vol-477/paper_22.pdf
|volume=Vol-477
|dblpUrl=https://dblp.org/rec/conf/dlog/DoniniCNS09
}}
==A Tableaux-based Method for Computing Least Common Subsumers for Expressive Description Logics==
<pdf width="1500px">https://ceur-ws.org/Vol-477/paper_22.pdf</pdf>
<pre>
   A Tableaux-based Method for Computing Least
Common Subsumers for Expressive Description Logics?

     Francesco M. Donini2 , Simona Colucci1 , Tommaso Di Noia1 , and Eugenio Di
                                     Sciascio1
                           1
                               SisInfLab–Politecnico di Bari, Bari, Italy
                               2
                                 Università della Tuscia, Viterbo, Italy


1 Motivation

Least Common Subsumers (LCSs) in Description Logics (DLs) have been introduced
by Cohen and Hirsh [2] to denote the most specific concept descriptions subsuming all
of the elements of a given collection of concepts.
    Since its introduction, LCS has been usefully exploited in several application fields
where the search for commonalities in a collection is needed, despite the complexity
of its computation even for inexpressive DLs. In fact, Baader et al. [3] investigated
LCS computation in EL, FLE and ALE, showing how the complexity of the related
problem arises to exponential size even for the small DL EL, and can neither be reduced
by introducing TBoxes to shorten possible repetitions [4]. So far, the most expressive
investigated DL is ALEN , for which a double exponential time algorithm has been
proposed [5].
    Our contribution introduces a novel general tableau-based calculus for computing
LCS with reference to a DL more expressive than ALEN , namely ALEHIN R+ . The
approach uses substitutions on concept terms containing concept variables.
    Several application fields may take advantage from the enrichment of representa-
tion expressiveness related to LCS computational problem. For example, Lutz et al. [6]
recently underlined the need to have LCS available also for the more expressive lan-
guages currently used in ontology design, one of the best known applications of LCS
[7–9]. LCS has been also used in semantic-based information retrieval, for the definition
of measures for concept similarity [10, 11].
    Hacid et al. [12] studied the use of LCS for defining and computing the best cover-
ing. Such a computation approach has been applied to semantic Web services discovery
and composition [13] and, more recently, to the composition of learning resources [14].
LCS was also employed for schema extraction from semistructured data [15].
    A further application field is in inductive learning algorithms, to find a least general
concept consistent with a set of positive example, used as basis for learning [16]. Fi-
nally, Colucci et al. [17] used the LCS to evaluate the Core Competence of a company,
in the context of DL-based Knowledge Management, modeled through an ontology in
ELHN , and introduced variants of LCS needed in that context.
?
    This paper is a reduced version of the one[1] accepted for publication at the Twenty-first In-
    ternational Joint Conference on Artificial Intelligence (IJCAI-09)
    The variety of approaches recalled so far witnesses the importance of LCS compu-
tation in knowledge representation and reasoning literature, justifies the need for LCS
in expressive DLs, and motivates our contribution.
    The rest of the paper is organized as follows: in the next section we introduce the
DL used in the paper . We then present a novel generalized definition of LCS in Section
3. Based on such definition and on the tableux rules for ALEHIN R+ presented in
Section 4, we propose a tableaux-based method for LCS computation in Section 5.
Conclusions and future research directions close the paper.


2 The Language ALEHIN R+
To show the generality of our calculus, we choose the DL ALEHIN R+ , which is
SHIN without constructs that introduce concept disjunction, namely, t and ¬. In lan-
guages including disjunction, the simplest LCS would be just C1 tC2 —or equivalently,
¬(¬C1 u ¬C2 ) with full negation—making the LCS problem trivial.
     Let Nr be a set of role names. A general role R can be either a role name P ∈ Nr , or
its inverse P − . We admit a set of role axioms, formed by: (1) a role hierarchy H, which
is a set of role inclusions of the form R1 v R2 , and (2) a set of transitivity axioms
for roles, denoted by trans(R). We denote by v∗ the transitive closure of H ∪ {R− v
S − | S v R ∈ H}. A role S is simple if it is not transitive, and for no R such that
R v∗ S, R is transitive.
     In the following syntax for concepts, let A be any concept name in a set Nc of
concept names, let R be a general role, and S be a simple role.

                      At −→ > | ⊥ | A | ¬A | (> n S) | (6 n S)                        (1)
                      C −→ At | C1 u C2 | ∃R.C1 | ∀R.C1                               (2)

 We call At an atomic concept, while C simply concept. We consider (6 R 0) as an
abbreviation of ∀R.⊥. We extend the inverse role constructor to general roles by letting
R− = P − if R = P , and R− = P if R = P − . Moreover, we denote by ∼C the
negation normal form of ¬C (see [18, Ch.2] for a definition).
   We denote by C v D subsumption between two concepts C and D, and by C @ D
we mean strict subsumption. Moreover, we make use of the following property.
Proposition 1. For every four concepts C1 , C2 , D1 , and D2 : if C1 v D1 and C2 v
D2 , then C1 u C2 v D1 u D2 .
    As for concept axioms, we do not consider General Concept Inclusions (GCI) in
this paper, since Baader et al. [19] showed that even for the simple DL ALE LCS may
not exist, when GCIs interpreted with descriptive semantics are used. We could admit
simple concept inclusions—e.g., acyclic definitions of concept names—but since they
do not add expressivity to the language, for sake of simplicity we do not consider them.
    Regarding Qualified Number Restrictions (Q), observe that (6 0 P.(∼C1 u ∼C2 )) ≡
∀P .(C1 t C2 ). Hence, qualified at-most number restrictions would implicitly introduce
t in concepts—although inside quantifications—so we want to exclude them. Since in
DLs the at-most restriction comes always paired with the at-least restriction, we exclude
them both, only for uniformity.
3 A logical definition of the LCS

In this section we denote by DL a generic Description Logic.

Definition 1 (LCS as a set). Let C1 , C2 ∈ DL be two concepts. By LCS(C1 , C2 ) we
mean a set of equivalent concepts in DL, such that for every L ∈ LCS(C1 , C2 ), (a) L
is a common subsumer of C1 and C2 —in formulas, C1 v L, C2 v L—and (b) there
does not exist D ∈ DL such that C1 v D, C2 v D and D @ L.

In order to write a second-order formula defining LCS, we need an alphabet Nx =
{X0 , X1 , X2 , . . .} of concept variables, which we can quantify over.

Theorem 1. L ∈ LCS(C1 , C2 ) iff C1 v L, C2 v L, and the formula below is false:

                    ∃X. {X ∈ DL, C1 v X, C2 v X, X @ L}                                 (3)

where commas in (3) denote conjunction. The proof is straightforward, since (3) is just
a formal rewriting of (b) in Def.1. Observe that (3) does not belong to the Monadic
Second-Order fragment proved decidable by Rabin [20], since the formula X ∈ DL,
if explicitly defined, would force to write a least fixpoint to logically define DL. We
prefer instead to keep X ∈ DL as a constraint on the possible assignments for X,
which restricts the possible substitutions for X to be defined later on. Intuitively, we are
using general semantics [21] for interpreting X.
    To introduce a more computation-oriented version of (3), we define the decoration
of a concept. Intuitively, given a concept L, we put a new concept variable in con-
junction with the filler of every universal and existential role quantification in L, plus
one concept variable in the outermost level of L. Since we would like to have all vari-
ables consecutively numbered, starting at 0, we define the decoration in Algorithm 1
by means of a recursive procedure, plus a global counter i that keeps track of the last
index used in whatever recursive call. Although maybe not mathematically elegant, we
believe that Algorithm 1 presents this idea in the most precise and intuitive way.

Definition 2. Let C be a concept in DL, and Nx be an alphabet of concept variables.
We denote by CX the decoration of C, defined by Algorithm 1.

Algorithm 1: Decoration of a concept C
     input concept C ∈ ALEHIN R+ ;
     var i := 0;
                  .
     return CX = X0 u dec(C);
     function dec(C)
     case C is of the form
           atomic: return C
           C1 u C2 : return dec(C1 ) u dec(C2 )
           ∃R.C1 : i := i+1; return ∃R.(Xi u dec(C1 ))
           ∀R.C1 : i := i+1; return ∀R.(Xi u dec(C1 ))
     end function
For example, if C is the concept A u ∃P .(> 2 Q) u ∀P .∀Q.B, then CX = A u X0 u
∃P .(X1 u (> 2 Q)) u ∀P .(X2 u ∀Q.(X3 u B)). Note that CX is a (particular) concept
term [22], i.e., a concept formed according to the rules in (1) and (2), with the addi-
tion to (1) of the rule At −→ X, for every X in Nx . Our decorations are particular
concept terms, in that every variable occurs only once in the term, and—by inspecting
Algorithm 1—one can verify that variables are one-one with quantifications, plus the
outermost variable. Observe that we do not count number restrictions as quantifiers,
although their logical definition would contain a quantifier.
Definition 3 (Substitutions). A substitution σ is a set of pairs {Xi1 7→ Di1 , . . . , Xik 7→
Dik }, where indexes i1 , . . . , ik are all different, and for every j = 1, . . . , k each Xij is
a concept variable and each Dij is a concept term. A substitution is ground if no Dij
contains any variables, i.e., Dij ∈ DL for every j = 1, . . . , k.
    For a decorated concept CX , we inductively define σ(CX ) as σ(Xi ) = Di , σ(¬Xi ) =
∼Di , σ(C) = C if C is atomic, σ(C1 u C2 ) = σ(C1 ) u σ(C2 ), σ(∃R.C) = ∃R.σ(C),
σ(∀R.C) = ∀R.σ(C).
    The cardinality of a substitution is its cardinality as a set.
Let σ>,n denote the substitution {Xi 7→ >}i=0,...,n ; since variables in decorations
appear always in some conjunction, then for every concept C ∈ DL containing n
quantifications, σ>,n (CX ) ≡ C.
     We also need renaming of variables, as bijective functions on the indices of vari-
ables. We denote a renaming by ρ. A variable renaming can also be applied to a sub-
stitution, ρ(σ), meaning that the index of each variable changes according to ρ; e.g.,
if σ = {X0 7→ A, X1 7→ B}, and ρ is the renaming {0 7→ 1, 1 7→ 2}, then ρ(σ) =
{X1 7→ A, X2 7→ B}.
     We can now reformulate Thm.1 (for DL =ALEHIN R+ ) in a way that leads to a
direct computational method.
Theorem 2. Let C1 , C2 , L ∈ ALEHIN R+ . Then L ∈ LCS(C1 , C2 ) iff C1 v L, C2 v
L, and the formula below is false:

             ∃σ { σ is ground, C1 v σ(LX ), C2 v σ(LX ), L 6v σ(LX )}                       (4)

Proof. Since Thm.1 has the same premises of Thm.2, the latter amounts to an equiva-
lence between formulas (3) and (4), which we prove below. Let 0, . . . , n be the indexes
of concept variables in LX .
(3)⇒(4) If (3) holds, then there is a concept L0 ∈ DL such that all three conditions:
C1 v L0 , C2 v L0 , and L0 @ L hold. Then let σ be the substitution {X0 7→
L0 } ∪ {Xi 7→ >}i=1,...,n . Applying σ to LX we obtain L0 u σ>,n (dec(L)), which
is equivalent to L0 since σ>,n (dec(L)) ≡ L and L0 @ L. Since all conditions for
σ(LX ) hold, the claim follows.
(4)⇒(3) If (4) holds, then we use σ(LX ) as a witness for ∃X in (3), proving all four
conditions of (3). First condition is met because σ(LX ) ∈ DL since σ is ground.
Second and third conditions hold by hypothesis. Regarding fourth condition (proving
σ(LX ) @ L), we prove subsumption separately (to ease readability) in Lemma 1 below,
while fourth condition in (4) implies that subsumption holds strictly.
Lemma 1. For every given concept C ∈ ALEHIN R+ , and every ground substitution
σ, it holds σ(CX ) v C.

Proof. By induction on the structure of C. Base cases: when C is atomic (no structure),
CX = X0 u C contains just one concept variable. Then σ(X0 u C) = σ(X0 ) u C =
D0 u C v C by definition of u, whatever D0 , and hence for every σ.
    Inductive cases: we now show that the claim holds for ∃R.C1 , given that it holds for
C1 which is structurally simpler. Suppose C1 has n ≥ 0 quantifiers, hence (C1 )X has
n + 1 variables X0 , . . . , Xn . Observe that by letting ρ = {i 7→ i + 1}i=0,...,n , it holds
(∃R.C1 )X = X0 u ∃R.ρ((C1 )X ), where the renaming is necessary because in (C1 )X
variables are numbered from 0. Observe that ρ is a bijection, hence its inverse ρ−1
is well defined. Then, for every substitution σ over n + 2 variables, σ((∃R.C1 )X ) =
σ(X0 ) u ∃R.σ(ρ((C1 )X )). Let σ 0 be σ without X0 7→ D0 , and let σ1 = ρ−1 (σ 0 ). Then
σ(ρ((C1 )X )) = σ1 ((C1 )X ), which is subsumed by C1 by inductive hypothesis. Hence
∃R.σ(ρ((C1 )X )) v ∃R.C1 , and the subsumption holds also if (whatever concept) D0
is conjoined on the left-hand side. Since we made no restrictions on σ, for every σ the
claim σ((∃R.C1 )X ) v ∃R.C1 holds. A similar proof can be laid out for ∀R.C1 .
    Regarding C1 u C2 , the inductive hypothesis is that the claim holds for C1 and
C2 separately, which are both structurally simpler. Again, suppose that both C1 and
C2 have at most m, n ≥ 0 quantifiers, respectively, and let ρ be now the renaming
{i 7→ i + m}i≥1 . For this renaming, (C1 u C2 )X ≡ (C1 )X u ρ((C2 )X ), where both
decorations introduce a variable X0 (which ρ does not rename, since it applies to i ≥ 1),
and equivalence holds since for every ground substitution σ, σ(X0 u X0 ) ≡ σ(X0 ).
Then, for every σ on m + n + 1 variables, σ((C1 u C2 )X ) ≡ σ1 ((C1 )X ) u σ2 ((C2 )X ),
where σ1 is just σ restricted to the first m + 1 variables, while σ2 is built as follows:
let σ 0 be σ restricted to the last n variables; let ρ0 = {i + m 7→ i}i=m+1,...,n . Then,
σ2 = {X0 7→ D0 } ∪ ρ0 (σ 0 ). By inductive hypothesis, both σ1 ((C1 )X ) v C1 and
σ2 ((C2 )X ) v C2 hold. By Prop.1, the claim is obtained.
   Observe that Thm. 2 refers to ALEHIN R+ only because both Algorithm 1 and
Lemma 1 refer to ALEHIN R+ .


4 Tableaux Rules for ALEHIN R+

We first give an intuition of the way our calculus proceeds. To prove or disprove For-
mula (4), we expand three tableaux, one for each of the three conditions: T1 for C1 v
LX , T2 for C2 v LX , and T3 for L 6v LX . The tableaux are first expanded using
tableaux rules (T-rules), treating concept variables as concept names. Then, by using
substitution rules (S-rules), we try to find a substitution σ satisfying (4), i.e., closing
T1 and T2 and leaving T3 open. The substitution might make applicable some other
T-rule, and so on, till no rule is applicable. If all branches of T1 and T2 close, and
at least one branch of T3 is open, we found a substitution σ validating (4), otherwise,
we prove that no such σ exist, disproving (4). We warn the reader that the calculus we
present in this section does not compute an LCS; it just tries to prove Formula (4), by
exhibiting a common subsumer of C1 , C2 which is “better” than L. In the next section,
we use such a calculus to incrementally compute a finite LCS (if one exists).
    Rules for constructing tableaux in ALEHIN R+ (Fig. 1) are a subset of the ones
for SHIQ, and have been proved [23] sound and complete. We summarize them here
for sake of completeness, remarking that we just inherit them from past research. Any
inaccuracy is due to our rephrasing. In such rules, blocking is pair-wise blocking as de-
fined by, e.g., Tobies [23, p.125]; our only addition is that concept variables are treated
as concept names for what regards blocking.
    We recall that an individual y is an S-successor of x in Li , (for i = 1, 2, 3) if for
some role R, both R ∈ Li (x, y) and R v∗ S. Conversely, y is an S-predecessor of x
if x is an S-successor of y. An individual y is an R-neighbor of x if either y is an R-
successor of x, or x is an R− -successor of y. The definitions of successor, predecessor,
and neighbor allow us to treat roles and inverse roles in a uniform way, both in T-rules
(Fig. 1) and in subsequent S-rules (Fig. 2).


All rules are applicable only if x is not blocked. For each i = 1, 2, 3, Li is a branch in Ti .

u-rule : if C u D ∈ Li (x), then add both C and D to Li (x)
t-rule : if C t D ∈ Li (x), then add either C or D to Li (x)
∃-rule : if ∃R.C ∈ Li (x), and x has no R-successor y with C ∈ Li (y), then pick up a new
     individual y, add R to L(x, y), and let Li (y) := {C}
∀-rule : if ∀R.C ∈ Li (x), and there exists an individual y such that y is an R-successor of x,
     then add C to Li (y).
∀+ -rule : if ∀S.C ∈ Li (x), with trans(R) and R v∗ S, there exists an individual y such that
     y is an R-successor of x, and ∀R.C 6∈ Li (y), then add ∀R.C to Li (y)
>-rule : if (> n S) ∈ Li (x), and x has not n S-neighbors y1 , . . . , yn with y` 6= yj for 1 ≤ ` <
     j ≤ n, then create n new successors y1 , . . . , yn of x with Li (x, y` ) = {S}, and y` 6= yj ,
     for 1 ≤ ` < j ≤ n
6-rule : if (6 n S) ∈ Li (x) with n ≥ 1, and there are more than n S-neighbors of x, and there
     are two S-neighbours y, z of x, y is an S-successor of x, and not y 6= z then (1) add Li (y)
     to Li (z), (2) for every R ∈ Li (x, y) if z is a predecessor of x then add R− to Li (z, x) else
     add R to Li (x, z), (3) let Li (x, y) = ∅, and (4) for all u with u 6= y, set u 6= z


    Fig. 1. Tableaux rules (T-rules) for ALEHIN R+ (rephrased from Tobies [23, p.128])


    Differently from Tobies [23], we say that T-rules construct a branch L, while we
call tableau the set of all different branches that can be constructed applying T-rules.
Branches are different because of the nondeterminism present in t-rule and 6-rule (we
ignore differences due to possible renaming of new individuals in >-rule). A branch L
is closed if for some individual x, either (a) ⊥ ∈ L(x), or (b) {A, ¬A} ⊆ L(x) for
some A ∈ Nc , or (c) (6 n R) ∈ L(x) and there are n + 1 R-neighbors y1 , . . . , yn+1 of
x such that yi 6= yj ∈ L for 1 ≤ i < j ≤ n + 1. A branch is T-complete if no T-rule
is applicable. A tableaux is closed if all its T-complete branches are closed, it is open if
there exists at least one T-complete branch which is not closed.
    We are now able to present our original part of this section, namely, substitution
rules (S-rules) dealing with concept variables, in Fig. 2. There is a substitution rule
for every syntax rule in (1)–(2), except conjunction. This is because substituting, say,
X1 7→ X2 u X3 , and repeatedly X2 7→ X4 u X5 , etc., would yield infinite branching in
our substitution calculus, which should be carefully dealt with by some restrictions on
substitution applications. Instead, we prefer to deal with conjunctions in an incremental
fashion, as explained in the next section.
    Our tableaux contain concept variables; we denote by σ(T) the application of the
substitution σ to every concept in every constraint of T. Since we operate on a system of
three tableaux, we denote it globally as hT1 , T2 , T3 i. For such a system, σhT1 , T2 , T3 i
denotes hσ(T1 ), σ(T2 ), σ(T3 )i. When both a T-rule and an S-rule is applicable to
hT1 , T2 , T3 i, T-rules have always precedence over S-rules.


All rules are applicable only if L ∈ T1 ∪ T2 , L is open, and the substitution is not σ-blocked.
Rules above the separating line have precedence over rules below it.

σ>-rule : if ¬X ∈ L(x), then apply σ = {X 7→ >} to hT1 , T2 , T3 i
σN-rule : if {¬X, A} ⊆ L(x) for some A ∈ Nc , then apply σ = {X 7→ A} to hT1 , T2 , T3 i
σ¬N-rule : if {¬X, ¬A} ∈ L(x) for some A ∈ Nc , then apply σ = {X 7→ ¬A}, to
    hT1 , T2 , T3 i
σ>-rule : if ¬X ∈ L(x) and there are exactly n R-neighbors of x, then apply σ = {X 7→
    (> m S)}, where m is between 0 and n, and R v∗ S
σ6-rule : if {¬X, (6 n S)} ⊆ L(x), then apply σ = {X 7→ (6 n R)} to hT1 , T2 , T3 i, for
    some role R such that R v∗ S
    .
σ∀-rule : if {¬X, ∀S.C} ⊆ L(x), then apply σ = {X 7→ ∀R.Y } to hT1 , T2 , T3 i, where Y
    denotes a concept variable not appearing in hT1 , T2 , T3 i, and R v∗ S
σ∃-rule : if {¬X, ∃R.C} ⊆ L(x), then apply σ = {X 7→ ∃S.Y } to hT1 , T2 , T3 i, where Y
    denotes a concept variable not appearing in hT1 , T2 , T3 i, and R v∗ S


                     Fig. 2. Substitution rules (S-rules) for ALEHIN R+


    When the application of a rule to hT1 , T2 , T3 i yields hT01 , T02 , T03 i, we say that
hT01 , T02 , T03 i directly derives from hT1 , T2 , T3 i. Then derives is just the transitive
closure of “directly derives”. In what follows, we assume that T- and S-rules are applied
to three tableaux that always start as follows:
                             T1 = {L1 (a) = {C1 , ∼(LX )}}                                  (5)
                             T2 = {L2 (a) = {C2 , ∼(LX )}}                                  (6)
                             T3 = {L3 (a) = {L, ∼(LX )}}                                    (7)
For such tableaux, the following properties can be isolated.
Lemma 2. Let hT1 , T2 , T3 i start as above. Then, (1) every concept variable occurs al-
ways with a negation in front in every constraint of every tableaux; (2) every T-complete
branch Li of every Ti contains at most one concept variable X such that ¬X ∈ Li (x),
for some individual x.
Proof. Property (1) can be proved by induction on T- and S-rule applications. Base: in
the initial tableaux (5)–(7), concept variables appear only in ∼(LX ), and since variables
occur positively in LX , negation normal form puts a “¬” in front of every variable.
Induction: suppose the claim holds for hT1 , T2 , T3 i. By inspection, T-rules never in-
troduce nor delete negations in concepts, so the claim holds also after a T-rule has been
applied. S-rules which introduce new concept variables (σ∀ and σ∃) apply a substitu-
tion X 7→ D (where D is either ∀R.Y or ∃S.Y ) to an existing negated variable ¬X (by
induction hypothesis). By Def.3, ¬X is substituted with ∼D, so also newly introduced
concept variables satisfy the claim.
We now turn to prove (2). At start, concept variables appear only in ∼ (LX ), and
S-rules never increase the number of concept variables: all S-rules reduce by 1 the
number of concept variables, but for σ∀ and σ∃ that introduce a new variable Y ,
but remove X. So, we can base our induction on the number n of variables of LX .
If L contains no quantifications (i.e., L is a conjunction of atomic concepts), then
∼(LX ) = ∼(X0 u L) = ¬X0 t ∼L, so the claim holds for one variable X = X0
and x = a. Suppose the claim holds for concepts with n variables. If L contains n + 1
variables, then it must contain at least one quantification, and LX can be decomposed
as X0 u L1 , where L1 is a concept term that contains at least one quantification, so
another variable, and at most n variables in total. Hence, ∼(LX ) = ¬X0 t ∼(L1 ), so
from ∼(LX ) ∈ Li (a) T-rules can obtain two branches, say, L0i and L00i , the former with
¬X0 ∈ L0i (a), and the latter with ∼(L1 ) ∈ L00i (a). For L0i the claim holds directly, while
for L00i it holds by inductive hypothesis since it contains n variables.
     The above lemma justifies the absence of S-rules acting for constraints of the form
X ∈ Li (x), since starting from hT1 , T2 , T3 i as in (5)–(7), such constraints never
appear.
     A branch is S-complete if no S-rule is applicable. We call a branch complete when
it is both T-complete and S-complete.

Theorem 3 (Soundness and completeness).
Let C1 , C2 , L as in Thm.2, and let hT1 , T2 , T3 i be defined as in (5)–(7). Then For-
mula (4) is true if and only if there is a way of applying S-rules, obtaining a global
substitution σ, such that both σ (T1 ) and σ (T2 ) close, and σ (T3 ) is open.
Proof. (Only if.) Validity of Formula (4) requires a ground substitution σ 0 . Now σ 0
may not be used to prove directly the claim, since it may contain u in some substitution
Xi 7→ Di , and u is not reconstructed by S-rules. So let σ be a substitution obtained
from σ 0 by choosing only one conjunct in the outermost u of each Di , and if such a
conjunct contains an u (also inside quantifications), recursively choosing one conjunct,
till one obtains a Di0 without us; the choice is made by inspecting how T-rules build the
branches of the (variable-free) tableau σ 0 (T3 ). Observe that σ 0 (∼(LX )) = σ 0 (¬X0 ) t
σ 0 (∼(L1 )) for a suitable concept term L1 , and suppose that σ 0 contains the substitution
X0 7→ E1 uE2 . Therefore, t-rule applied to σ 0 (∼(LX )) ∈ L3 (a) yields three branches,
¬E1 ∈ L03 (a), ¬E2 ∈ L003 (a), and σ 0 (∼(L1 )) ∈ L000               0
                                                       3 (a). Since σ validates (4), at least
               0    00  000
one among L3 , L3 , L3 can be turned by T-rules into a T-complete, open branch. If such
a branch is L03 , choose X0 7→ E1 for σ, if it is L003 choose E2 , while if the open branch
stems from L000  3 then choose whatever E1 , E2 , indifferently. Clearly if Ei is chosen
and Ei still contains conjunctions, the choice is recursively repeated. Soundness of T-
rules ensures that the choice of a u-free substitution σ can always be made in such a
way that, finally, T-rules can obtain from σ(T3 ) a T-complete, open branch. Observe
also that σ 0 (T1 ), σ 0 (T2 ) must close by completeness of T-rules, that is, every branch
stemming from them must close. Now branches from σ(T1 ) and σ(T2 ) are a subset of
the branches from σ 0 (T1 ), σ 0 (T2 ), hence all of them must close too. It remains to show
that σ can be reconstructed by repeated application of S-rules, and which can be proved
by induction on the quantifications of each Di in Xi 7→ Di ∈ σ.
     (If.) If S-rules (intertwined with T-rules) can construct a ground substitution σ such
that both σ (T1 ) and σ (T2 ) close, and σ (T3 ) is open, then by soundness of T-rules, σ
is a witness validating Formula (4).

     The above theorem does not exclude that, when Formula (4) is false, the calcu-
lus runs forever. In fact, the reader could verify that in the example trans(P ), C1 =
∃P u ∀P .∃P .(A u C), C2 = ∃P u ∀P .∃P .(B u C), and L = ∃P u ∀P .∃P .C T- and
S-rules together run indefinitely. Intuitively, the calculus can go astray when already
L ∈ LCS(C1 , C2 ), and transitive roles produce new individuals and concepts that, in
turn, trigger the application of an S-rule, which may add new concepts to old individu-
als, and such concepts can propagate to new individuals, destroying pairwise blocking.
Therefore, although S-rules require some other constraints to be already present in the
branch, they also need a blocking condition, to prevent their infinite application.
     A substitution X 7→ ∃R.Y is S-blocked for ¬X ∈ Li (x) in hT1 , T2 , T3 i if
hT1 , T2 , T3 i derives from some hT01 , T02 , T03 i, in which there is some individual x0
such that:

  (i) ¬X 0 ∈ L0i (x0 ),
 (ii) Li (x) = L0i (x0 ),
(iii) for every R-successor y of x in Li , there exists an R-successor y 0 of x0 in L0i such
      that Li (y) = L0i (y 0 ),
(iv) for every S, the number of different S-neighbors of x in Li is the same as the
      number of different S-neighbors of x0 in L0i , and
 (v) the σ∃-rule has been applied to hT01 , T02 , T03 i, with the substitution X 0 7→ ∃R.Y 0 .

The S-blocking of a substitution X 7→ ∀R.Y is defined analogously. Observe that we
compare Li (x) (the concepts attached to x in hT1 , T2 , T3 i) with L0i (x0 ), that is, the
concepts attached to x0 in the old state hT01 , T02 , T03 i. Also, note that Lemma 2 allows
us to define blocking only for constraints of the form ¬X ∈ Li (x).
    Rule σ∃ is S-blocked for ¬X ∈ Li (x) in hT1 , T2 , T3 i if either it is simply not
applicable, or every possible substitution for X allowed by σ∃ is S-blocked, and anal-
ogously for Rule σ∀. Note that if {¬X, ∃R.C} ⊆ L(x), then for each role S such that
R v∗ S, Rule σ∃ allows the substitution X 7→ ∃S.Y , so S-blocking prevents all these
substitutions. Finally, we say that a branch is S-blocked if it is not closed and contains
a constraint ¬X ∈ Li (x) for which both Rule σ∃ and Rule σ∀ are S-blocked, and
hT1 , T2 , T3 i is S-blocked if either T1 or T2 contain an S-blocked branch.
    Now we modify the completeness of a branch by saying that a branch is S-complete
if no S-rule is applicable, taking also S-blocking into account. Clearly, when we stop
the calculus because of S-blocking, we have to prove that no substitution was ever to be
found.
Theorem 4. If hT1 , T2 , T3 i is S-blocked, no hT01 , T02 , T03 i such that T01 and T02 are
closed, and T03 is open, can be derived from it.
Proof. (Sketch) We intuitively view our calculus as a game between Player T, whose
moves are T-rules, and player S, whose moves are S-rules. Faithfully to our precedences,
T moves whenever she can, S moves only if T cannot move, and S should use a move
above the line in Fig. 2 whenever he can. S wins if he can reach a state hT1 , T2 , T3 i
in which both T1 and T2 are closed, and T3 is open, while T wins in every other case
(including infinite runs). Intuitively, S tries to build some finite proof of (4), while T re-
sponds by constructing a (possibly infinite) model that would serve as a counterexample
for that proof. In this setting, the claim is proved if T has a winning strategy whenever
hT1 , T2 , T3 i is S-blocked.
    In fact, suppose that in such a case S tries anyway a substitution X 7→ D for
¬X ∈ Li (x). Then, T can respond in the same way she did in hT01 , T02 , T03 i when
S played X 0 7→ D0 for ¬X 0 ∈ L0i (x0 ). Conditions (ii)–(iv) ensure that after S plays
X 7→ D in hT1 , T2 , T3 i, T can play on x the same rules she played on x0 after S played
X 0 7→ D0 in hT01 , T02 , T03 i. S will not succeed in closing Li , otherwise by precedence
of Rules σ>, σN, σ¬N, σ >, σ 6, over Rules σ∀ and σ∃, S would have closed L0i in
hT01 , T02 , T03 i before deriving hT1 , T2 , T3 i.

Theorem 5 (Termination). Let T-rules and S-rules be applied according to blocking
conditions, giving always precedence to T-rules. Then there is no infinite sequence of
applications of T- and S-rules starting from hT1 , T2 , T3 i as in (5)–(7).

Proof. (Sketch) Termination of T-rules alone was proved by Tobies [23, Lemma 6.35].
Termination of T- and S-rules together stems from S-blocking, which eventually occurs
since S-rules add concepts that are in the syntactic closure of C1 , C2 , L and H. In fact,
observe that, e.g., σ∃ substitutes X with ∃S.Y only if ∃R.C ∈ Li (x), and ∃R.C does
not come from another substitution in the same branch, because of Lemma 2.
    In conclusion, checking whether L ∈ LCS(C1 , C2 ) is a decidable problem for
C1 , C2 , L ∈ ALEHIN R+ .


5 Computing the LCS

The previous section set up a calculus for the LCS decision problem. We now set an
iterative algorithm that computes the LCS by repeatedly solving Formula (4) for in-
creasingly better Ls.

Algorithm 2: Computing an LCS of C1 , C2
input concepts C1 , C2
var concept L := >, concept L1
  repeat (*)
    T1 := {C1 ∈ L1 (a), ∼(LX ) ∈ L1 (a)};
    T2 := {C2 ∈ L2 (a), ∼(LX ) ∈ L2 (a)};
    T3 := {L ∈ L3 (a), ∼(LX ) ∈ L3 (a)};
    apply T-rules and S-rules to hT1 , T2 , T3 i
   if a substitution σ s.t. (T1 , T2 close and T3 is open) is found
      then L1 := σ(LX ); L := L1
      else L1 := nil;
 until (L1 = nil)
 return L;
Termination of the above algorithm implies the existence of a finite LCS for every pair
of concepts in ALEHIN R+ . This problem is out of the scope of this paper, hence we
give a weaker termination proof.
Theorem 6. Let C1 , C2 ∈ ALEHIN R+ . If LCS(C1 , C2 ) contains a finite concept
expression, then Algorithm 2 terminates with L ∈ LCS(C1 , C2 ).
Proof. If there exists a finite concept L̂ ∈ LCS(C1 , C2 ), it must have a finite number of
non-redundant conjuncts. In each iteration (*), in order for T3 to be open, L1 = σ(LX )
must have at least one non-redundant conjunct more than L. Hence after |L̂| iterations,
the until condition is reached.
    We remark that for DLs for which a finite LCS always exists, the above theorem
implies that Algorithm 2 always terminates. For instance, in ALEN there always exist
a finite LCS(C1 , C2 ), whose size is exponential in the sizes of C1 , C2 [5]. Hence for
C1 , C2 ∈ ALEN , Algorithm 2 iterates (*) a number of times exponential in |C1 |+|C2 |.
    Also, we remark that if iterations (*) are stopped before the until condition is true,
an invariant of (*) is that the current value of L is a common subsumer of C1 , C2
(although not the least one). In this sense, Algorithm 2 can be turned into an anytime
approximation algorithm for LCS.

6 Conclusion and Perspective
Although Least Common Subsumer is one of the most interesting and usefully exploited
non-standard inference service for Description Logics, its computation for expressive
DLs is still an open challenge.
    In this paper we formulated the problem of evaluating LCS in terms of second-order
formulas where variables represent general DL concepts. Based on this formulation we
also proposed a novel general tableau-based calculus to compute a solution to a LCS
problem and presented the whole calculus for an expressive DL, namely ALEHIN R+ .
    Having a calculus based on well-founded analytic tableaux surely presents many
advantages both from a practical point of view and from a theoretical one. First, our
approach may ease implementing the computation of LCS in state-of-the-art tableaux-
based reasoners (Pellet, FaCT++, RacerPro), also exploiting well known optimization
techniques for tableaux in DLs [18, Ch.9]. Secondly, the analysis of soundness and com-
pleteness for a tableau-based algorithm is less tricky than the one based on structural
algorithms—as the ones proposed so far for LCS. Finally, but even more importantly,
for DLs in which a finite LCS may not always exist [19], a terminating algorithm for
computing LCS cannot exist, while a sound and complete calculus can be devised along
the lines we showed—analogously to sound and complete calculi for full First-Order
Logic.
    In perspective, our formulation and computation of LCS paves the way to further
results for computing other useful non-standard reasoning services in DLs.
Acknowledgments

We thank Franz Baader for useful discussions on LCS and pointers to his relevant liter-
ature. We acknowledge partial support of Apulia region Strategic Projects PS 092 and
PS 121.


References

 1. Donini, F.M., Colucci, S., Di Noia, T., Di Sciascio, E.: A tableaux-based method for com-
    puting least common subsumers for expressive description logics. In: Proc. of IJCAI 2009,
    AAAI (2009) to appear.
 2. Cohen, W., Borgida, A., Hirsh, H.: Computing least common subsumers in description log-
    ics. In Rosenbloom, P., Szolovits, P., eds.: Proc. of AAAI’92, Menlo Park, California, AAAI
    Press (1992) 754–761
 3. Baader, F., Küsters, R., Molitor, R.: Computing least common subsumers in description
    logics with existential restrictions. In: Proc. of IJCAI’99, Morgan Kaufmann (1999) 96–101
 4. Baader, F., Turhan, A.Y.: On the problem of computing small representations of least com-
    mon subsumers. In: Proc. of KI 2002, Aachen, Germany, Springer (2002)
 5. Küsters, R., Molitor, R.: Structural Subsumption and Least Common Subsumers in a De-
    scription Logic with Existential and Number Restrictions. Studia Logica 81 (2005) 227–259
 6. Lutz, C., Baader, F., Franconi, E., Lembo, D., Möller, R., Rosati, R., Sattler, U., Suntisri-
    varaporn, B., Tessaris, S.: Reasoning support for ontology design. In Grau, B.C., Hitzler, P.,
    Shankey, C., Wallace, E., eds.: Proc. of OWLED 2006. (2006)
 7. Baader, F., Küsters, R.: Computing the least common subsumer and the most specific concept
    in the presence of cyclic ALN -concept descriptions. In: Proc. of KI’98, Bremen, Germany,
    Springer (1998) 129–140
 8. Kopena, J.B., Regli, W.C.: Design repositories on the semantic web with description-logic
    enabled services. In: Proc. of VLDB 2003. (2003)
 9. Ovchinnikova, E., Wandmacher, T., Kuehnberger, K.U.: Solving terminological inconsis-
    tency problems in ontology design. Interoperability in Business Information Systems 2
    (2007) 65–80
10. Möller, R., Haarslev, V., Neumann, B.: Semantics-based information retrieval. In: Proc. of
    IT&KNOWS-98, Vienna, Budapest (1998)
11. Janowicz, K., Wilkes, M., Lutz, M.: Similarity-based information retrieval and its role within
    spatial data infrastructures. In: Proc. of GIScience ’08, Springer (2008) 151–167
12. Hacid, M.S., Leger, A., Rey, C.: Computing concept covers: A preliminary report. In: Proc.
    of DL 2002. (2002)
13. Benatallah, B., Hacid, M.S., Leger, A., Rey, C., Toumani, F.: On automating web services
    discovery. VLDB Journal 14(1) (March 2005) 84–96
14. Karam, N., Linckels, S., Meinel, C.: Semantic composition of lecture subparts for a person-
    alized e-learning. In: Proc. of ESWC 2007, Springer (2007) 716–728
15. Hacid, M.S., Soualmia, F., Toumani, F.: Schema Extraction for Semistructured Data. In:
    Proc. of DL 2000. (August 2000) 133–142
16. Cohen, W.W., Hirsh, H.: Learning the CLASSIC description logics: Theoretical and ex-
    perimental results. In Doyle, J., Sandewall, E., Torasso, P., eds.: Proc. of KR’94. (1994)
    121–133
17. Colucci, S., Di Sciascio, E., Donini, F.M., Tinelli, E.: Finding informative commonalities in
    concept collections. In: Proc. of CIKM 2008, ACM Press (2008) 807–816
18. Baader, F., Calvanese, D., Mc Guinness, D., Nardi, D., Patel-Schneider, P., eds.: The De-
    scription Logic Handbook. Cambridge University Press (2003)
19. Baader, F., Sertkaya, B., Turhan, A.Y.: Computing the least common subsumer w.r.t. a back-
    ground terminology. J. of Appl. Log. 5(3) (2007) 392–420
20. Rabin, M.O.: Decidability of second-order theories and automata on infinite trees. Trans. of
    the Am. Math. Soc. 141 (1969) 1–35
21. Henkin, L.: Completeness in the theory of types. J. Symb. Log. 15(2) (1950) 81–91
22. Baader, F., Narendran, P.: Unification of concept terms in description logics. J. of Symbolic
    Computation 31 (2001) 277–305
23. Tobies, S.: Complexity Results and Practical Algorithms for Logics in Knowledge Repre-
    sentation. PhD thesis, RWTH Aachen (2001)

</pre>