Translations
   Translations between
                between Concept
                        Concept Hierarchies
                                  Hierarchies over
                                              over
              Different Attribute Sets
              Different Attribute Sets

                                    Rainer Osswald
                                    Rainer Osswald
                             Department of Computer Science
                  IntelligentDepartment
                              Informationofand
                                            Computer   Science Systems
                                               Communication
                  Intelligent Information  and Communication
                                 FernUniversität in Hagen     Systems
                                 FernUniversität in Hagen
                                  58084 Hagen, Germany
                                  58084 Hagen, Germany
                         http://pi7.fernuni-hagen.de/osswald
                         http://pi7.fernuni-hagen.de/osswald


        Abstract. There are (at least) two different approaches to constructing
        concept lattices – one as the concept lattice of a given formal context,
        the other as a distributive lattice defined as the Lindenbaum algebra of
        a given set of (conjunctive or disjunctive) rules. It has been pointed out,
        first, that these two approaches are systematically related via Birkhoff’s
        duality theorem between (finite) ordered sets and distributive lattices,
        and, second, that the concept lattices of Formal Concept Analysis are lat-
        tices only due to the restriction to conjunctive implications. The present
        paper shows how this duality can be naturally extended to describe the
        relation between concept hierarchies with differing sets of underlying at-
        tributes. To this end, an appropriate notion of attribute translation is
        introduced. Particular emphasis is given to the special case of exten-
        sions by attributes and statements and their effect on the corresponding
        concept hierarchies.


 1    Introduction
 It is well-known that within Formal Concept Analysis (FCA), the formal con-
 cepts defined by a formal context are uniquely determined by those attribute
 sets that are closed under all conjunctive implicational statements holding in
 the formal context. This systematic relationship between the statements holding
 in a formal context and the conceptual hierarchies consisting of closed attribute
 sets is not restricted to conjunctive implications and concept lattices but can
 be straightforwardly generalized to implications whose premise and conclusion
 are arbitrary affirmative terms, i.e., terms that may also contain disjunction,
 truth, and falsity [6, 14, 7, 2, 13]. The resulting hierarchies, which are not neces-
 sarily lattices any more, are also known as information domains [5]; they are
 directed-complete ordered sets and include all finite partial orders.
     Other approaches towards constructing conceptual hierarchies based on a
 set of statements employ the distributive lattice of affirmative terms modulo
 the equivalence induced by the given statements [8–10], that is, the Lindenbaum
 algebra determined by the theory. Interestingly, both [10] and [9] deny any close
 connection of their approaches to FCA. As it has been pointed out in [12],


Radim Bělohlávek, Václav Snášel (Eds.): CLA 2005, pp. 92–106, ISBN 80–248–0863–3.
      Translations between Concept Hierarchies over Different Attribute Sets    93


however, there is a close connection to the information domain and thus to the
FCA approach: The concept lattice of FCA can be identified with the information
domain of the conjunctive theory determined by the formal context, whereas
the conceptual lattices studied in [8–10] are the Lindenbaum algebras of the
respective theories.
   In the present paper, the systemic correspondence between statements, infor-
mation domains, and distributive lattices is extended to the case of translating
between different base vocabulary. To this end, we introduce an appropriate
notion of translation between statement sets over different vocabularies. Partic-
ular emphasis will be given to the special case of theory extensions. Proofs are
omitted and can be found in [11].


2     Theories and Information Domains

2.1   Theories and Models

Suppose Σ is a set of (primitive) attributes that can be employed to classify
the elements of a certain domain of discourse U . Let  be the corresponding
satisfaction relation from U to Σ. For convenience, let us introduce two special
attributes V and Λ which satisfy respectively everything and nothing in any
universe of discourse. We allow to combine attributes by the standard Boolean
connectives ∧, ∨, and ¬. The term algebra of the Boolean attributes inductively
defined that way will be denoted by B[Σ]. As usual, φ → ψ stands for ¬φ ∨ ψ
and φ ↔ ψ for φ → ψ ∧ ψ → φ. The satisfaction relation  can be inductively
extended to a relation from U to B[Σ] in the obvious way: x ∈ U satisfies φ ∧ ψ
iff x satisfies φ and ψ; similarly, x satisfies ¬φ iff x does not satisfy φ; etc. A
compound attribute φ ∈ B[Σ] is called affirmative (or positive) just in case ¬
does not occur in φ. The term algebra of affirmative terms over Σ will be denoted
by T [Σ].
    The notions introduced in the preceding paragraph can be most easily re-
formulated within a standard logical setting by treating attributes as monadic
predicates (cf. also [13]). Recall that within first-order predicate logic, monadic
predicates are interpreted by subsets of a universe U . A satisfaction relation
 from U to Σ is thus essentially the same as an interpretation function M
from Σ to ℘(U ), with M (p) = {x ∈ U | x  p}. Such an interpretation func-
tion M can be inductively extended to a function M̂ from B[Σ] to ℘(U ), with
M̂ (φ) = {x ∈ U | x  φ}. The set M̂ (φ) will be referred to as the extent of φ.
    Making use of the (compound) attributes in statements that hold or are
true with respect to an interpretation calls for quantifying over these attributes.
The framework presented in this paper, which covers the approaches mentioned
in the introduction, only employs universal quantification. That is, we restrict
ourselves to universal statements of the form ∀x(φx), or ∀φ, with φ ∈ B[Σ].
A theory over Σ is then defined as a set of universal statements of this form.
First-order predicate logic gives us the following standard notions of truth and
model: A statement ∀φ is true with respect to an interpretation if φ is satisfied
94      Rainer Osswald


by every element of the universe. An interpretation is a model of a theory Γ if
every statement of Γ is true with respect to that interpretation.
    Suppose Γ and Γ 0 are theories over Σ. Then Γ is said to entail Γ 0 if every
model of Γ is also a model of Γ 0 . Notice that entailment in this “semantic” sense
coincides with entailment by any sound and complete inference calculus for first-
order predicate logic.1 Keeping this in mind, we write Γ ` Γ 0 if Γ entails Γ 0 . If
two theories entail each other, they are said to be equivalent.
    A universal statement of the form ∀(φ → ψ) (or ∀(φ ↔ ψ)), with φ and ψ
affirmative, is said to have conditional (or biconditional ) form. In the following,
we also write φ  ψ and φ ≡ ψ for ∀(φ → ψ) and ∀(φ ↔ ψ), respectively.
A conditional form is called normal, if φ is purely conjunctive or V , and ψ is
purely disjunctive or Λ. The normal form is called reduced if φ and ψ do not share
any primitive attributes. A theory is said to have conditional (reduced normal)
or biconditional form if each of its statements has this form. It is a standard
exercise in elementary logic to check that every theory is equivalent to a theory
in conditional (reduced normal) form as well as to a theory in biconditional
form. Without restriction of generality, we can thus assume that a theory has
conditional or biconditional normal form when appropriate.

2.2   Information Domains
Every theory Γ over Σ has a canonical “Henkin-style” model. Let the canonical
interpretation of Σ in ℘(Σ) take p ∈ Σ to {X ⊆ Σ |p ∈ X}, i.e., X  p iff p ∈ X.
The canonical model of Γ is then defined by eliminating all elements of ℘(Σ)
that are not compatible with the statements of Γ :
Definition 1 (Canonical Universe/Model). The canonical universe C(Γ )
of a theory Γ is the set of all subsets of Σ which, under the canonical interpre-
tation, satisfy φ for every statement ∀φ of Γ ; that is, C(Γ ) = {X ⊆ Σ | X 
φ for every (∀φ) ∈ Γ }. The canonical model of Γ has the universe C(Γ ) and
takes each p ∈ Σ to {X ∈ C(Γ ) | p ∈ X}.
    It is not difficult to verify the following universal property the canonical
model: a statement is entailed by Γ iff it is true with respect to the canonical
model of Γ . The elements of C(Γ ) will be referred to as the consistently Γ -closed
subsets of Σ. The canonical universe is naturally ordered by set inclusion. We
say that any ordered set which is order-isomorphic to C(Γ ) “is” or represents the
information domain of Γ , thereby adapting the terminology introduced in [5].2
Example 1. Let Γ be the theory consisting of the single statement human 
¬feathered ∧ biped, which is equivalent to the theory {human ∧ feathered 
Λ, human  biped} in conditional normal form, and let Γ 0 consist of the state-
ment human  featherless ∧ biped. The information domains of Γ and Γ 0 are
1
  It is not difficult to devise a much simpler sound and complete inference calculus
  since we are working in a small fragment of first-order logic; see e.g. [11, Sect. 6.3].
2
  The canonical universe of Γ is the free extent of Γ in the sense of [6]. See [13] for a
  more detailed comparison of several terminologies.
       Translations between Concept Hierarchies over Different Attribute Sets         95


                                                                      human


           feathered biped      human                         featherless biped


      feathered         biped                           featherless           biped


                  ⊥                                                    ⊥


                       Fig. 1. Feathered and featherless bipeds


depicted on the left and the right of Figure 1, respectively, with subsets replaced
by appropriate labels.


2.3    The Lindenbaum Algebra of Affirmative Terms

The Lindenbaum construction is the key step towards algebraizing logic. Its
basic idea is to abstract away from syntactical differences between terms that
are equivalent with respect to a given theory.

Definition 2 (Lindenbaum Algebra). Let Γ be a theory over Σ. The Lin-
denbaum algebra L(Γ ) (of affirmative terms) of Γ is the quotient T [Σ]/'Γ ,
where φ 'Γ ψ iff Γ entails φ ≡ ψ.

    The Lindenbaum algebra of affirmative terms is a distributive lattice with zero
and unit, i.e., an algebra of type h2, 2, 0, 0i, with [φ] ∧ [ψ] = [φ ∧ ψ], [φ] ∨ [ψ] =
[φ ∨ ψ], 0 = [Λ], and 1 = [V ]. It should be emphasized that the restriction to
affirmative terms is essential in our definition of the Lindenbaum algebra because
replacing T [Σ] by B[Σ] would give rise to a Boolean lattice (see also Section 4.2).
    It follows by definition that two theories are equivalent if and only if they
have the same Lindenbaum algebra. In [10], the elements of the Lindenbaum
algebra L(Γ ) are referred to as the semantic concepts determined by Γ . Under
this perspective, two affirmative terms “mean” the same (with respect to the
given theory) just in case they represent the same semantic concept.

Example 2. The (non-isomorphic) Lindenbaum algebras of the theories Γ and
Γ 0 of Example 1 are respectively depicted on the left and the right of Figure 2.
The Lindenbaum algebra of Γ , for instance, is the quotient of the distributive
lattice with zero and unit freely generated over the set {human, feathered, biped}
by the congruence relation generated by the pairs hhuman ∧ feathered, Λi and
hhuman, biped ∧ humani.

    In the rest of the paper, we speak of distributive lattices with zero and unit
briefly as algebras.
96         Rainer Osswald


                          [V ]                                             [V ]

                             [fthed ∨ biped]                                   [fless ∨ biped]

[fthed ∨ human]
                                      [biped]                 [fless]                  [biped]

      [fthed]                    [(fthed ∧ biped) ∨ human]
                                                             [fless ∧ biped]
                                      [human]                                     [human]
     [fthed ∧ biped]

                          [Λ]                                              [Λ]


                Fig. 2. Lindenbaum algebras of feathered and featherless bipeds


2.4      Birkhoff Duality
In the finite case, there is an intimate connection between the Lindenbaum alge-
bra and the information domain of a theory in that they determine each other
uniquely up to isomorphism. This correspondence is just an immediate conse-
quence of a classical result of Birkhoff, according to which there is a categorical
equivalence between the finite ordered sets and the finite distributive lattices
with zero and unit (cf. [1, 4]).
    A standard formulation of Birkhoff’s duality is as follows: The (bounded)
distributive lattice associated with a finite ordered set P is given by the lattice
of upwards closed subsets of P . Conversely, the ordered set associated with a
finite (bounded) distributive lattice D is given by the ordered set of ∨-irreducible
elements of D. The only difference to the present situation is that we have to
reverse the order of the ∨-irreducibles. For example, the ∨-irreducible elements
in the lattice diagrams of Figure 2, which are marked by shaded circles, stand in
an order-reversing one-to-one correspondence to the elements of the respective
information domains; see Figure 1. (The ∨-irreducibles are characterized by the
property of having precisely one element immediately below them.)
    For a more explicit characterization of the Lindenbaum algebra of a theory
in terms of its information domain we can employ the one-to-one correspondence
between the upwards closed subsets of an ordered set and its antichains, where
the antichain associated with an upwards closed set is the set of minimal elements
of that set.3
Proposition 1. In the finite case, there is a one-to-one correspondence between
                                            C(Γ ), where an antichain S in C(Γ )
the elements of L(Γ ) and the antichainsWin V
corresponds to the equivalence class of { X | X ∈ S} in L(Γ ).
3
     An antichain in an ordered set is a subset whose elements are pairwise incomparable
     with respect to the ordering relation.
      Translations between Concept Hierarchies over Different Attribute Sets        97


3     Translations of Theories
3.1    Translations and Equivalences
Suppose two cognitive agents want to decide whether their theories about a
certain universe of discourse are equivalent. If both agree that they use the same
vocabulary in the same way, equivalence means that both theories entail each
other, which is the case if every statement of one theory is entailed by the other
and vice versa. This is precisely the definition of equivalent theories introduced
in Section 2.1.
    Consider now the case of two theories Γ and Γ 0 over different sets Σ and Σ 0
of primitives. Assume that both agents do not hinge on their chosen primitive
terms but are willing to express them in terms of those of the other. Within the
present framework this means to define a translation function µ from Σ to T [Σ 0 ],
which then can be naturally extended to a function µ̂ from T [Σ] to T [Σ 0 ]. Since
varying the base vocabulary makes it necessary to keep track of it, we consider
theories from now on as pairs hΣ, Γ i where Γ is a theory over Σ in the sense
introduced in Section 2.1. Keeping this in mind, we often write Γ instead of
hΣ, Γ i, if Σ is clear from the context or irrelevant.
Definition 3 (Theory Translation). A translation of theories from hΣ, Γ i to
hΣ 0 , Γ 0 i is a function µ from Σ to T [Σ 0 ] such that Γ 0 ` µ̂(Γ ). The translation
µ is called primitive if µ(Σ) ⊆ Σ 0 .
   The composite of two theory translations µ from Γ to Γ 0 and µ0 from Γ 0
to Γ 00 is the composite function µb0 ◦ µ from Σ to T [Σ 00 ]. It is straightforward
to verify that µb0 ◦ µ is indeed a theory translation from Γ to Γ 00 . In order to
characterize the equivalence of theories, the notion of an invertible translation is
too restrictive since, for instance, the empty theory over {a} clearly should count
as equivalent to the theory {b ≡ c} over {b, c} under any sensible definition of
equivalence. We can overcome this problem by relaxing the notion of an invertible
translation to that of a quasi-invertible one. Let ıΓ be the identity translation
on Γ that is given by the canonical inclusion of Σ into T [Σ].
Definition 4 (Equivalence of Translations). Let µ and ν be theory trans-
lations from hΣ, Γ i to hΣ 0 , Γ 0 i. Then µ is equivalent to ν, notation: µ ∼ ν, iff
Γ 0 ` µ(p) ≡ ν(p) for every p ∈ Σ.
Definition 5 (Quasi-Inverse). Let µ be a theory translation from Γ to Γ 0 and
let ν be a translation from Γ 0 to Γ . Then ν is quasi-inverse to µ iff ν ◦ µ ∼ ıΓ
and µ ◦ ν ∼ ıΓ 0 .
   We are now able to formulate an appropriate notion of equivalence between
theories with different base vocabularies: Γ and Γ 0 are said to be equivalent if
there is a quasi-invertible translation from Γ and Γ 0 ; notation: Γ ∼ Γ 0 . Quasi-
invertible translations can be characterized by the following two conditions:
Definition 6 (Conservative Translation). A translation µ from Γ to Γ 0 is
conservative iff, for all statements α over Σ, Γ 0 ` µ̂(α) only if Γ ` α.
98      Rainer Osswald


Definition 7 (Essentially Surjective Translation). A translation µ from Γ
to Γ 0 is essentially surjective iff for every p ∈ Σ 0 there is a φ ∈ T [Σ] such that
Γ 0 ` p ≡ µ̂(φ).

Proposition 2. A theory translation µ has a quasi-inverse if and only if µ is
conservative and essentially surjective.

Example 3. Let Γ be the theory over Σ = {a0 , a1 , . . . ak } ∪ {b0 , b1 , . . . bk }, with
k finite, that consists of the statements
          an ≡ an+1 ∨ bn+1           and         an ∧ bn ≡ Λ      (0 6 n < k).
The information domain C(Γ ) of Γ consists of the sets ∅, {a0 , a1 , . . . , ak },
and {a0 , a1 , . . . , an−1 , bn } for all n 6 k. Since the elements of C(Γ ) \ {∅}
are pairwise incomparable with respect to set inclusion, it follows that C(Γ )
is flat and hence order-isomorphic to the information domain of the theory
Γ 0 = {cm ∧ cn ≡ Λ | m 6= n} over Σ 0 = {c0 , c1 , . . . , ck+1 }. Let µ be the func-
tion from Σ to T [Σ 0 ] with
      µ(an ) = cn+1 ∨ . . . ∨ ck+1         and      µ(bn ) = cn       (0 6 n 6 k).
Clearly Γ 0 entails µ̂(Γ ), since (cn+1 ∨ . . . ∨ ck+1 ) ∧ cn ≡ (cn+1 ∧ cn ) ∨ . . . ∨
(ck+1 ∧ cn ) ≡ Λ for all n 6 k. So µ is a translation from Γ to Γ 0 . Now consider
the function ν from Σ 0 to T [Σ] such that ν(ck+1 ) = ak and ν(cn ) = bn for
every n 6 k. We want to show that ν is a translation from Γ 0 to Γ . To this
end, observe that Γ entails bm  an and hence bm ∧ bn  Λ for all m > n.
In addition, Γ entails am  an if m > n, from which it follows that am ∧
bn ≡ am ∧ an ∧ bn ≡ Λ whenever m > n. All in all, this proves that Γ entails
ν̂(Γ 0 ). It remains to check that ν is quasi-inverse to µ. The only nontrivial claim
is that Γ entails ν(µ(an )) ≡ an , i.e., that Γ ` an ≡ bn+1 ∨ . . . ∨ bk ∨ ak ,
which follows easily by induction. Figure 3 illustrates the situation for k = 1
in terms of Lindenbaum algebras and information domains; in addition, the
figure depicts the extents of the primitives. The shaded circles in the diagrams
of L(Γ ) and L(Γ 0 ) correspond to join-irreducible elements, which stand in an
(order-reversing) one-to-one correspondence to the elements of C(Γ ) and C(Γ 0 ).
Clearly, the equivalence µ from Γ to Γ 0 induces isomorphisms between L(Γ ) and
L(Γ 0 ) and between C(Γ ) and C(Γ 0 ).

    Every theory translation µ from Γ to Γ 0 canonically gives rise to an algebra
homomorphism L(µ) from L(Γ ) to L(Γ 0 ) as well as to an order-preserving func-
tion C(µ) from C(Γ 0 ) to C(Γ ) which preserves suprema of directed sets (cf. e.g.
[11, Sect. 8.2]).

Proposition 3. Suppose µ and ν are theory translations from Γ to Γ 0 . Then
L(µ) = L(ν) iff µ ∼ ν. In particular, L(Γ ) ' L(Γ 0 ) iff Γ ∼ Γ 0 .

Proposition 4. A translation µ of theories is conservative iff C(µ) is onto; µ
is essentially surjective iff C(µ) is an order embedding.
       Translations between Concept Hierarchies over Different Attribute Sets               99


                          1                                            1
            L(Γ )                                      L(Γ 0 )
                               b0 ∨ a 0

                                     a0 = a 1 ∨ b 1                              c1 ∨ c 2

                    b0   b1             a1                       c0   c1         c2

                               0                                            0
                                                          0
          C (Γ )                                      C (Γ )
                   b0     b1       a1                            c0    c1       c2


                                          a0


    Fig. 3. Lindenbaum algebras and informations domains of Example 3 for k = 1


3.2     Example: Minimal Representations


Let P be a finite ordered set. Then P can be represented by a subset system
over P , where x ∈ P corresponds to ↓x = {y ∈ P | y 6 x}. Moreover, every
subset system over a finite set Σ is the information domain of a theory over Σ.
Of course, P may have a representation by a subset system over a set Σ with
lower cardinality than P . For instance, if P is bounded and distributive, the set
of join-irreducible elements of P can be chosen for Σ. Another type of example
is given by the fact that a full binary tree P of height k + 1, which has 2k+1 − 1
nodes and 2k leaves, can be embedded in ℘(Σ) with |Σ| = 2k.
    The representation problem arising here has the following general form: Given
a finite ordered set P , find the least number n such that P can be represented
as a subset system over a set of cardinality n. The solution of this problem is of
practical importance because representations of ordered sets as subset systems
can be easily implemented on a computer via bit-vector encoding. Unfortunately,
the representation problem is NP-complete (see e.g. [3]).
    In terms of theories and translations the representation problem can be
rephrased as follows: Given a theory Γ over a finite set Σ, find a set Σ 0 with
                                                              0       0
minimal cardinality such that Γ is equivalent to   a theory
                                                         Γ over Σ . Notice that
                                                     n
if |Σ 0 | = n then the width of C(Γ 0 ) is at most  n  , by Sperner’s Lemma (cf.
                                                               2
e.g. [15]), which gives us a lower bound for the cardinality of Σ 0 .4

4
    The width of an ordered set P is the cardinality of a maximal antichain of P .
100     Rainer Osswald


              C (Γ )                                 {a, b, c}   C (Γ 0 )
                             {a, b}
                                                      {b, c}

                       {a}            {b}
                                                       {b}

                               ∅
                                                        ∅


           Fig. 4. Extension ε where C(ε) is neither one-to-one nor onto


4     Theory Extensions

A translation of theories from hΣ, Γ i to hΣ 0 , Γ 0 i is called an extension if its
underlying function is an inclusion of Σ into Σ 0 , and Γ ⊆ Γ 0 . We then say that
hΣ 0 , Γ 0 i is an extension of hΣ, Γ i.

Proposition 5. If ε is an extension of theories from hΣ, Γ i to hΣ 0 , Γ 0 i then
C(ε)(Y ) = Y ∩ Σ. In particular, C(ε) is an order embedding if Σ = Σ 0 .

Example 4. Let Γ be the empty theory over {a, b} and let ε be its extension to
Γ 0 by a single primitive c and the statements a  c and c  b. The induced
function C(ε) from C(Γ 0 ) to C(Γ ) is depicted by Figure 4. Notice that C(ε) is
neither one-to-one nor onto.


4.1   Conservative Extensions and Rule Extensions

Suppose Γ and Γ 0 are theories over Σ such that Γ ⊆ Γ 0 . Then Γ 0 is called a
rule extension of Γ . The corresponding extension from Γ to Γ 0 is the identity
function on Σ. A rule extension is trivially essentially surjective and the induced
function of canonical universes is an inclusion, by Proposition 5; in particular,
C(Γ 0 ) ⊆ C(Γ ). The definition of the canonical universe of a theory hΣ, Γ i given
in Section 2.2 can be seen as an example of this fact: C(Γ ) is included in the
powerset ℘(Σ) of Σ, which is the canonical universe of the empty theory over Σ.
    An extension hΣ 0 , Γ 0 i of hΣ, Γ i is called conservative if the corresponding
extension translation is conservative, that is, if Γ 0 ` α just in case Γ ` α, for
every statement α over Σ. In other words, conservative extensions do not entail
additional statements over the base vocabulary. It follows by Propositions 5
and 4 that C(Γ ) = {Y ∩ Σ | Y ∈ C(Γ 0 )} in this case.

Proposition 6. Let µ be a theory translation from Γ to Γ 0 . (i) If µ is conser-
vative, Γ 0 is equivalent to a conservative extension of Γ . (ii) If µ is essentially
surjective, Γ 0 is equivalent to a rule extension of Γ .
       Translations between Concept Hierarchies over Different Attribute Sets           101


         {a, c}         {b, c}                                 C (Γ )

                  {c}            {−a, b, c}   {a, −b, c}   {−a, −b, c}   {−a, −b, −c}
      C (Γ )

                  ∅


           Fig. 5. Function of canonical universes induced by Booleanization


4.2    Booleanization
Let Γ be a theory over Σ. The Booleanization Γ of Γ is defined as follows:
Σ is extended by a disjoint copy {−p | p ∈ Σ} of Σ, and Γ is extended by all
statements p ∧ −p  Λ and V  p ∨ −p, with p ∈ Σ, i.e., by all instantiations of
the laws of contradiction and excluded middle for primitives.
    Observe that no element of C(Γ ) is a proper subset of another element of
C(Γ ), because each element of C(Γ ) contains either p or −p, but not both, for
all p ∈ Σ.

Proposition 7. The information domain of the Booleanization of a theory is
an antichain.

    Let ε be the extension from Γ to Γ . By Proposition 5, C(ε) takes Y ∈ C(Γ )
to Y ∩Σ. Observe that C(ε) is onto and one-to-one; in particular, Booleanization
is conservative, by Proposition 4.

Example 5. Consider the theory Γ = {a ∧ b  Λ, a ∨ b  c} over Σ = {a, b, c}.
Then C(Γ ) consists of ∅, {c}, {a, c}, and {b, c}, whereas C(Γ ) consists of
{−a, −b, −c}, {−a, −b, c}, {a, −b, c}, and {−a, b, c}. Figure 5 depicts the induced
function C(ε) from C(Γ ) to C(Γ ).

    The Lindenbaum algebra of Γ is easily seen to be complemented, i.e., L(Γ )
is a Boolean lattice. Since C(ε) is onto, L(ε) is an embedding of L(Γ ) into
L(Γ ). It is not difficult to prove the following universal property of L(ε): For
every homomorphism h from L(Γ ) to a complemented algebra A, there exists
a unique homomorphism h0 from L(Γ ) to A such that h = h0 ◦ L(ε). This
universal characterization of L(Γ ) provides an additional justification of the
term ‘Booleanization’. Moreover, the Boolean lattice L(Γ ) is isomorphic to the
Lindenbaum algebra B[Σ]/'Γ of Boolean terms of Γ .

4.3    Rule Completion
Besides Booleanization there is a another possible way to extend a theory hΣ, Γ i
to a theory whose information domain is an antichain. The idea is to find an
extension of Γ whose information domain is the set of maximal elements of the
information domain of Γ . (Since every information domain is directed-complete,
102     Rainer Osswald


it has maximal elements by Zorn’s Lemma.) Now observe that if such an ex-
tension of Γ exists at all, it can be realized by a rule extension, according to
Propositions 4 and 6. We speak of a rule completion of Γ in this case. A possible
rule completion of the theory Γ of Example 5 is given by Γ 0 = Γ ∪ {V  a ∨ b}.
Then C(Γ 0 ) consists of {a, c} and {b, c}.
    Rule completion, however, is not always possible. A simple counterexample
is provided by the full binary exclusion theory Γ = {p ∧ q  Λ | p 6= q} over an
infinite set Σ. Then C(Γ ) = {∅} ∪ {{p} | p ∈ Σ}. It can be shown (see [13, p.
27]) that there is no theory Γ 0 over Σ such that C(Γ 0 ) = {{p} | p ∈ Σ}. Hence,
there Wis no rule completion of Γ . Intuitively, what is needed here is the statement
V  Σ, that is, an infinite disjunction.

4.4   Direct Sums
Consider the task of combining two theories hΣ, Γ i and hΣ 0 , Γ 0 i. Let us assume
that Σ and Σ 0 and hence Γ and Γ 0 are disjoint. The most obvious way to combine
the two theories into one is to take their disjoint union

                       hΣ, Γ i ] hΣ 0 , Γ 0 i = hΣ ∪ Σ 0 , Γ ∪ Γ 0 i,

which is also known as their direct sum. Correspondingly, one can define the
direct sum of an arbitrary family hhΣi , Γi iii∈I of theories, given that the Σi ’s
are pairwise disjoint:
                          U                   S        S
                            i∈I hΣi , Γi i = h i∈I Σi , i∈I Γi i.
                                                                U
    It is not difficult to describe the canonical universe of i Γi in terms
                                                                          S     ofSthose
of the Γi ’s: Let εi be the extension from hΣi , Γi i to the direct
                                                              U     sum  h   Σ
                                                                            i i ,  i Γi i.
According to Proposition 5, the function C(εi ) from C(     S i i Γ ) to C(Γ
                                                                        U i   ) takes X
to X ∩ Σi . Moreover, if Xi ∈ C(Γi ) for every i, then i Xi ∈ C( i Γi ); hence
                             U              S
                          C( i Γi ) = { i Xi | Xi ∈ C(Γi )}.
                                               U
It follows thatQthe information domain of i Γi is order-isomorphic to the Carte-
sian product i C(Γi ) ordered by coordinatewise inclusion:
                                     U          Q
                                 C( i Γi ) '      i C(Γi ).

Example 6. Let Γ be a theory over Σ and let Σ 0 be a set disjoint to Σ. We
call hΣ ∪ Σ 0 , Γ i the extension of hΣ, Γ i by primitives Σ 0 . Since hΣ ∪ Σ 0 , Γ i is
identical to hΣ, Γ i]hΣ 0 , ∅i, it follows that C(hΣ ∪ Σ 0 , Γ i) ' C(hΣ, Γ i)×℘(Σ 0 ).
                                                       0
In particular, |C(hΣ ∪ Σ 0 , Γ i)| = |C(hΣ, Γ i)| · 2|Σ | .
Example 7. Let Γ be a theory over Σ. Call a primitive p ∈ Σ free with respect
to Γ if p does not occur in any statement of Γ . Let σ(Γ ) be the set of primitives
occurring in at least one of the statements of Γ , that is, Σ 0 = Σ \ σ(Γ ) is the
set of free primitives. Then hΣ, Γ i is the direct sum of hσ(Γ ), Γ i and hΣ 0 , ∅i
(and hΣ, Γ i is the extension of hσ(Γ ), Γ i by primitives Σ 0 ). Hence C(hΣ, Γ i) '
C(hσ(Γ ), Γ i) × ℘(Σ 0 ).
      Translations between Concept Hierarchies over Different Attribute Sets                          103


                                                    {a, c}        {b, c}       {a, c}   {b, c}
      {a}         {b}       {c}
                        ×         '   {a}         {b}        {c}           ⊇          {c}       {b}
            ∅               ∅
                                            ∅                                               ∅

                                                        {a, b, c}


                                                                                        =
         {a, c}                         {a, c}           {b, c}                {a, c}       {b, c}

            {c}     ×       {b}   '         {c}           {b}          ⊇        {c}          {b}

            ∅               ∅               ∅                                    ∅


             Fig. 6. Information domain of extension as subset of product


4.5   Extensions Decomposed

An extension of theories from hΣ, Γ i to hΣ 0 , Γ 0 i can be decomposed into an
extension of primitives from hΣ, Γ i to hΣ 0 , Γ i followed by a rule extension from
hΣ 0 , Γ i to hΣ 0 , Γ 0 i. Recall from Example 6 that

C(hΣ 0 , Γ i) = {X ∪ Y | X ∈ C(hΣ, Γ i), Y ⊆ Σ \ Σ 0 } ' C(hΣ, Γ i) × ℘(Σ \ Σ 0 ).

Moreover, C(hΣ 0 , Γ 0 i) consists of all elements of C(hΣ 0 , Γ i) that are consistently
closed with respect to Γ 0 \ Γ . So we can construct C(hΣ 0 , Γ 0 i) from C(hΣ, Γ i)
by taking first the product of C(hΣ, Γ i) and ℘(Σ \ Σ 0 ) and then deleting those
elements that are not consistently closed with respect to Γ 0 \ Γ .

Example 8. Let Γ 0 be the theory {a ∧ b  Λ, a  c} over Σ 0 = {a, b, c}. Viewed
as an extension of the theory {a ∧ b  Λ} over {a, b}, the construction of C(Γ 0 )
by product and deletion is as depicted by the upper row of Figure 6. The lower
row of the figure shows the construction if Γ 0 is viewed as an extension of the
theory {a  c} over {a, c}. The shaded elements are subject to deletion because
they are not consistently closed with respect to Γ 0 \ Γ .


4.6   A Simple Algorithmic Construction Scheme

Let Γ be a theory over a finite set Σ. Choose a strictly increasing sequence
Σ0 , Σ1 , . . . , Σn of sets, with Σ0 = ∅ and Σn = Σ, and let Γi = Γ |Σi be the
set of all statements of Γ with primitives in Σi . Then C(Γ ) can be constructed
from C(Γ0 ) by applying the two-step method described in Section 4.5 iteratively
to the extension from Γi−1 to Γi , for every i. As we have seen, C(Γi ) can be
constructed from C(Γi−1 ) by taking all sets of the form X ∪Y , with X ∈ C(Γi−1 )
and Y ⊆ Σi \Σi−1 , such that X ∪Y is consistently closed with respect to Γi \Γi−1 .
104     Rainer Osswald


              function C (Σ: set; Γ : theory): system of sets;
              begin
                 if not cc? (∅, Γ |∅ ) then
                     C := ∅
                 else begin
                     C := {∅};
                     Σ 0 := ∅;
                     while Σ 6= ∅ and C 6= ∅ do begin
                          F := any nonempty subset of Σ;
                          Σ 0 := Σ 0 ∪ F ;
                          Γ 0 := Γ |Σ 0 ;
                          C 0 := ∅;
                          foreach X ∈ C, Y ⊆ F do begin
                               X 0 := X ∪ Y ;
                               if cc? (X 0 , Γ 0 ) then C 0 := C 0 ∪ {X 0 }
                          end;
                          Σ := Σ \ F ;
                          Γ := Γ \ Γ 0 ;
                          C := C 0
                     end
                 end
              end;

              function cc? (X: set; Γ : theory): boolean;
              { true if X is consistently Γ -closed, false otherwise }
              begin
                  foreach (φ  ψ) ∈ Γ do
                       if X  φ and X 2 ψ then return (false);
                  return (true)
              end;


      Fig. 7. A generic algorithmic scheme for information domain construction


    An algorithmic formulation of this iteration scheme is shown in Figure 7,
where the variables F and Σ 0 take the place of Σi \ Σi−1 and Σi , respec-
tively. (Notice that the algorithm yields C = ∅ for inconsistent theories like
{V  a, a  Λ}.) Calculating the information domain of the i-th extension re-
quires to check |C(Γi−1 )| · 2ki sets against |Γi \ Γi−1 | statements, with ki =
|Σi \ Σi−1 |. Consequently, C(Γi ) should be of low cardinality (more in the order
of |Σi | than of 2|Σi | ) and ki should be near to one. In other words, it is important
to choose the partition of Γ \ Γ0 into the sets Γi \ Γi−1 (1 6 i 6 n) in such a
way that keeps |C(Γi )| small during the construction process. How to do this in
a systemic way is a topic for future research.
   Notice that a nonredundant theory is not necessarily the best choice. For
though less rules reduce the number of tests a single candidate set of primitives
has to undergo (in the subroutine cc? of Figure 7), the total number of sets to
      Translations between Concept Hierarchies over Different Attribute Sets       105


be tested during an extension step can increase because additional (redundant)
statements would have pruned C(Γi ) at an earlier stage of the construction.
    Suppose Γ has reduced normal form. Then every statement of Γ can be
represented by a sequent, i.e., a pair hP, Qi of disjoint finite subsets of Σ, where φ
is the conjunction of the elements of P and ψ is the disjunction of the elements of
Q (with V and Λ arising respectively by conjunction and disjunction of nothing).
In terms of the sequent representation of Γ , the if -statement of the subroutine
cc? becomes an elementary condition on sets:

              if P ⊆ X and X ∩ Q = ∅ then return (false);


5    Conclusion

We have presented a framework for translating between classifications, taken as
theories, over different base vocabulary and studied their effect on the associated
information domains, which can be regarded as conceptual hierarchies. In partic-
ular, we have studied extensions of theories, with Booleanization as a special case.
Moreover, it has been shown how theory extensions can be straightforwardly em-
ployed for constructing conceptual hierarchies in a step-by-step fashion.
    Though the presentation has abstained from using the language of category
theory, some readers surely will have noticed that categorical concepts are at
least implicit in our treatment of theory translations. Readers interested in a
more explicit categorical setting are invited to consult [11].


References

 1. Garrett Birkhoff. Lattice Theory. AMS Colloquium Publications, Vol. 25. AMS,
    Providence, RI, 3rd edition, 1967.
 2. Radim Bělohlávek and Vladimı́r Sklenář. Formal concept analysis constrained by
    attribute-dependency formulas. In Bernhard Ganter and Robert Godin, editors,
    Proceedings of ICFCA 2005, Lecture Notes in Artificial Intelligence 3403, pages
    176–191, Berlin, 2005. Springer.
 3. Yves Caseau, Michel Habib, Lhouari Nourine, and Olivier Raynoud. Encoding
    of multiple inheritance hierarchies and partial orders. Computational Intelligence,
    15(1):50–62, 1999.
 4. Brian A. Davey and Hilary A. Priestley. Introduction to Lattices and Order. Cam-
    bridge University Press, Cambridge, 2nd edition, 2002.
 5. Manfred Droste and Rüdiger Göbel. Non-deterministic information systems and
    their domains. Theoretical Computer Science, 75:289–309, 1990.
 6. Bernhard Ganter and Rudolf Wille. Contextual attribute logic. In William Tepfen-
    hart and Walling Cyre, editors, Proceedings of ICCS 1999, Lecture Notes in Arti-
    ficial Intelligence 1640, pages 377–388, Berlin, 1999. Springer.
 7. Hele-Mai Haav. A semi-automatic method to ontology design by using FCA. In
    Vaclav Snášel and Radim Bělohlávek, editors, Proceedings of CLA 2004. VŠB –
    Technical University of Ostrava, Department of Computer Science, 2004.
106     Rainer Osswald


 8. Jørgen Fischer Nilsson. A logico-algebraic framework for ontologies. In P. Anker
    Jensen and P. Skadhauge, editors, Ontology-based Interpretation of Noun Phrases,
    Proceedings of the 1st International OntoQuery Workshop, pages 43–56, University
    of Southern Denmark – Kolding, 2001.
 9. Nikolaj Oldager. Taxonomies with lattice algebras. In K. I. Simov and A. Kiryakov,
    editors, Proceedings of the 1st International Workshop on Ontologies and Lexical
    Knowledge Bases (OntoLex), pages 106–117, 2000.
10. Frank J. Oles. An application of lattice theory to knowledge representation. The-
    oretical Computer Science, 249:163–196, 2000.
11. Rainer Osswald. A logic of classification – with applications to linguistic theory.
    Technical Report 303 – 9/2003, FernUniversität in Hagen, Department of Com-
    puter Science, 2003.
12. Rainer Osswald. Two views on building conceptual hierarchies. In Gabriele Kern-
    Isberner, Wilhelm Rödder, and Friedhelm Kulmann, editors, Proceedings of the 2nd
    Workshop on Conditionals, Information, and Inference, 2004. Ulm, September 21.
13. Rainer Osswald. Concept hierarchies from a logical point of view. In Supplemen-
    tary Proceedings of the 3rd International Conference on Formal Concept Analysis
    (ICFCA-2005), pages 15–30, Lens, France, 2005.
14. Rainer Osswald and Wiebke Petersen. A logical approach to data driven classifi-
    cation. In KI-2003: Advances in Artificial Intelligence, Lecture Notes in Artificial
    Intelligence 2821, pages 267–281, Berlin, 2003. Springer.
15. William T. Trotter. Partially ordered sets. In Ronald Graham, Martin Grötschel,
    and László Lovász, editors, Handbook of Combinatorics, pages 433–480. North-
    Holland, Amsterdam, 1995.