Translations Translations between between Concept Concept Hierarchies Hierarchies over over Different Attribute Sets Different Attribute Sets Rainer Osswald Rainer Osswald Department of Computer Science IntelligentDepartment Informationofand Computer Science Systems Communication Intelligent Information and Communication FernUniversität in Hagen Systems FernUniversität in Hagen 58084 Hagen, Germany 58084 Hagen, Germany http://pi7.fernuni-hagen.de/osswald http://pi7.fernuni-hagen.de/osswald Abstract. There are (at least) two different approaches to constructing concept lattices – one as the concept lattice of a given formal context, the other as a distributive lattice defined as the Lindenbaum algebra of a given set of (conjunctive or disjunctive) rules. It has been pointed out, first, that these two approaches are systematically related via Birkhoff’s duality theorem between (finite) ordered sets and distributive lattices, and, second, that the concept lattices of Formal Concept Analysis are lat- tices only due to the restriction to conjunctive implications. The present paper shows how this duality can be naturally extended to describe the relation between concept hierarchies with differing sets of underlying at- tributes. To this end, an appropriate notion of attribute translation is introduced. Particular emphasis is given to the special case of exten- sions by attributes and statements and their effect on the corresponding concept hierarchies. 1 Introduction It is well-known that within Formal Concept Analysis (FCA), the formal con- cepts defined by a formal context are uniquely determined by those attribute sets that are closed under all conjunctive implicational statements holding in the formal context. This systematic relationship between the statements holding in a formal context and the conceptual hierarchies consisting of closed attribute sets is not restricted to conjunctive implications and concept lattices but can be straightforwardly generalized to implications whose premise and conclusion are arbitrary affirmative terms, i.e., terms that may also contain disjunction, truth, and falsity [6, 14, 7, 2, 13]. The resulting hierarchies, which are not neces- sarily lattices any more, are also known as information domains [5]; they are directed-complete ordered sets and include all finite partial orders. Other approaches towards constructing conceptual hierarchies based on a set of statements employ the distributive lattice of affirmative terms modulo the equivalence induced by the given statements [8–10], that is, the Lindenbaum algebra determined by the theory. Interestingly, both [10] and [9] deny any close connection of their approaches to FCA. As it has been pointed out in [12], Radim Bělohlávek, Václav Snášel (Eds.): CLA 2005, pp. 92–106, ISBN 80–248–0863–3. Translations between Concept Hierarchies over Different Attribute Sets 93 however, there is a close connection to the information domain and thus to the FCA approach: The concept lattice of FCA can be identified with the information domain of the conjunctive theory determined by the formal context, whereas the conceptual lattices studied in [8–10] are the Lindenbaum algebras of the respective theories. In the present paper, the systemic correspondence between statements, infor- mation domains, and distributive lattices is extended to the case of translating between different base vocabulary. To this end, we introduce an appropriate notion of translation between statement sets over different vocabularies. Partic- ular emphasis will be given to the special case of theory extensions. Proofs are omitted and can be found in [11]. 2 Theories and Information Domains 2.1 Theories and Models Suppose Σ is a set of (primitive) attributes that can be employed to classify the elements of a certain domain of discourse U . Let  be the corresponding satisfaction relation from U to Σ. For convenience, let us introduce two special attributes V and Λ which satisfy respectively everything and nothing in any universe of discourse. We allow to combine attributes by the standard Boolean connectives ∧, ∨, and ¬. The term algebra of the Boolean attributes inductively defined that way will be denoted by B[Σ]. As usual, φ → ψ stands for ¬φ ∨ ψ and φ ↔ ψ for φ → ψ ∧ ψ → φ. The satisfaction relation  can be inductively extended to a relation from U to B[Σ] in the obvious way: x ∈ U satisfies φ ∧ ψ iff x satisfies φ and ψ; similarly, x satisfies ¬φ iff x does not satisfy φ; etc. A compound attribute φ ∈ B[Σ] is called affirmative (or positive) just in case ¬ does not occur in φ. The term algebra of affirmative terms over Σ will be denoted by T [Σ]. The notions introduced in the preceding paragraph can be most easily re- formulated within a standard logical setting by treating attributes as monadic predicates (cf. also [13]). Recall that within first-order predicate logic, monadic predicates are interpreted by subsets of a universe U . A satisfaction relation  from U to Σ is thus essentially the same as an interpretation function M from Σ to ℘(U ), with M (p) = {x ∈ U | x  p}. Such an interpretation func- tion M can be inductively extended to a function M̂ from B[Σ] to ℘(U ), with M̂ (φ) = {x ∈ U | x  φ}. The set M̂ (φ) will be referred to as the extent of φ. Making use of the (compound) attributes in statements that hold or are true with respect to an interpretation calls for quantifying over these attributes. The framework presented in this paper, which covers the approaches mentioned in the introduction, only employs universal quantification. That is, we restrict ourselves to universal statements of the form ∀x(φx), or ∀φ, with φ ∈ B[Σ]. A theory over Σ is then defined as a set of universal statements of this form. First-order predicate logic gives us the following standard notions of truth and model: A statement ∀φ is true with respect to an interpretation if φ is satisfied 94 Rainer Osswald by every element of the universe. An interpretation is a model of a theory Γ if every statement of Γ is true with respect to that interpretation. Suppose Γ and Γ 0 are theories over Σ. Then Γ is said to entail Γ 0 if every model of Γ is also a model of Γ 0 . Notice that entailment in this “semantic” sense coincides with entailment by any sound and complete inference calculus for first- order predicate logic.1 Keeping this in mind, we write Γ ` Γ 0 if Γ entails Γ 0 . If two theories entail each other, they are said to be equivalent. A universal statement of the form ∀(φ → ψ) (or ∀(φ ↔ ψ)), with φ and ψ affirmative, is said to have conditional (or biconditional ) form. In the following, we also write φ  ψ and φ ≡ ψ for ∀(φ → ψ) and ∀(φ ↔ ψ), respectively. A conditional form is called normal, if φ is purely conjunctive or V , and ψ is purely disjunctive or Λ. The normal form is called reduced if φ and ψ do not share any primitive attributes. A theory is said to have conditional (reduced normal) or biconditional form if each of its statements has this form. It is a standard exercise in elementary logic to check that every theory is equivalent to a theory in conditional (reduced normal) form as well as to a theory in biconditional form. Without restriction of generality, we can thus assume that a theory has conditional or biconditional normal form when appropriate. 2.2 Information Domains Every theory Γ over Σ has a canonical “Henkin-style” model. Let the canonical interpretation of Σ in ℘(Σ) take p ∈ Σ to {X ⊆ Σ |p ∈ X}, i.e., X  p iff p ∈ X. The canonical model of Γ is then defined by eliminating all elements of ℘(Σ) that are not compatible with the statements of Γ : Definition 1 (Canonical Universe/Model). The canonical universe C(Γ ) of a theory Γ is the set of all subsets of Σ which, under the canonical interpre- tation, satisfy φ for every statement ∀φ of Γ ; that is, C(Γ ) = {X ⊆ Σ | X  φ for every (∀φ) ∈ Γ }. The canonical model of Γ has the universe C(Γ ) and takes each p ∈ Σ to {X ∈ C(Γ ) | p ∈ X}. It is not difficult to verify the following universal property the canonical model: a statement is entailed by Γ iff it is true with respect to the canonical model of Γ . The elements of C(Γ ) will be referred to as the consistently Γ -closed subsets of Σ. The canonical universe is naturally ordered by set inclusion. We say that any ordered set which is order-isomorphic to C(Γ ) “is” or represents the information domain of Γ , thereby adapting the terminology introduced in [5].2 Example 1. Let Γ be the theory consisting of the single statement human  ¬feathered ∧ biped, which is equivalent to the theory {human ∧ feathered  Λ, human  biped} in conditional normal form, and let Γ 0 consist of the state- ment human  featherless ∧ biped. The information domains of Γ and Γ 0 are 1 It is not difficult to devise a much simpler sound and complete inference calculus since we are working in a small fragment of first-order logic; see e.g. [11, Sect. 6.3]. 2 The canonical universe of Γ is the free extent of Γ in the sense of [6]. See [13] for a more detailed comparison of several terminologies. Translations between Concept Hierarchies over Different Attribute Sets 95 human feathered biped human featherless biped feathered biped featherless biped ⊥ ⊥ Fig. 1. Feathered and featherless bipeds depicted on the left and the right of Figure 1, respectively, with subsets replaced by appropriate labels. 2.3 The Lindenbaum Algebra of Affirmative Terms The Lindenbaum construction is the key step towards algebraizing logic. Its basic idea is to abstract away from syntactical differences between terms that are equivalent with respect to a given theory. Definition 2 (Lindenbaum Algebra). Let Γ be a theory over Σ. The Lin- denbaum algebra L(Γ ) (of affirmative terms) of Γ is the quotient T [Σ]/'Γ , where φ 'Γ ψ iff Γ entails φ ≡ ψ. The Lindenbaum algebra of affirmative terms is a distributive lattice with zero and unit, i.e., an algebra of type h2, 2, 0, 0i, with [φ] ∧ [ψ] = [φ ∧ ψ], [φ] ∨ [ψ] = [φ ∨ ψ], 0 = [Λ], and 1 = [V ]. It should be emphasized that the restriction to affirmative terms is essential in our definition of the Lindenbaum algebra because replacing T [Σ] by B[Σ] would give rise to a Boolean lattice (see also Section 4.2). It follows by definition that two theories are equivalent if and only if they have the same Lindenbaum algebra. In [10], the elements of the Lindenbaum algebra L(Γ ) are referred to as the semantic concepts determined by Γ . Under this perspective, two affirmative terms “mean” the same (with respect to the given theory) just in case they represent the same semantic concept. Example 2. The (non-isomorphic) Lindenbaum algebras of the theories Γ and Γ 0 of Example 1 are respectively depicted on the left and the right of Figure 2. The Lindenbaum algebra of Γ , for instance, is the quotient of the distributive lattice with zero and unit freely generated over the set {human, feathered, biped} by the congruence relation generated by the pairs hhuman ∧ feathered, Λi and hhuman, biped ∧ humani. In the rest of the paper, we speak of distributive lattices with zero and unit briefly as algebras. 96 Rainer Osswald [V ] [V ] [fthed ∨ biped] [fless ∨ biped] [fthed ∨ human] [biped] [fless] [biped] [fthed] [(fthed ∧ biped) ∨ human] [fless ∧ biped] [human] [human] [fthed ∧ biped] [Λ] [Λ] Fig. 2. Lindenbaum algebras of feathered and featherless bipeds 2.4 Birkhoff Duality In the finite case, there is an intimate connection between the Lindenbaum alge- bra and the information domain of a theory in that they determine each other uniquely up to isomorphism. This correspondence is just an immediate conse- quence of a classical result of Birkhoff, according to which there is a categorical equivalence between the finite ordered sets and the finite distributive lattices with zero and unit (cf. [1, 4]). A standard formulation of Birkhoff’s duality is as follows: The (bounded) distributive lattice associated with a finite ordered set P is given by the lattice of upwards closed subsets of P . Conversely, the ordered set associated with a finite (bounded) distributive lattice D is given by the ordered set of ∨-irreducible elements of D. The only difference to the present situation is that we have to reverse the order of the ∨-irreducibles. For example, the ∨-irreducible elements in the lattice diagrams of Figure 2, which are marked by shaded circles, stand in an order-reversing one-to-one correspondence to the elements of the respective information domains; see Figure 1. (The ∨-irreducibles are characterized by the property of having precisely one element immediately below them.) For a more explicit characterization of the Lindenbaum algebra of a theory in terms of its information domain we can employ the one-to-one correspondence between the upwards closed subsets of an ordered set and its antichains, where the antichain associated with an upwards closed set is the set of minimal elements of that set.3 Proposition 1. In the finite case, there is a one-to-one correspondence between C(Γ ), where an antichain S in C(Γ ) the elements of L(Γ ) and the antichainsWin V corresponds to the equivalence class of { X | X ∈ S} in L(Γ ). 3 An antichain in an ordered set is a subset whose elements are pairwise incomparable with respect to the ordering relation. Translations between Concept Hierarchies over Different Attribute Sets 97 3 Translations of Theories 3.1 Translations and Equivalences Suppose two cognitive agents want to decide whether their theories about a certain universe of discourse are equivalent. If both agree that they use the same vocabulary in the same way, equivalence means that both theories entail each other, which is the case if every statement of one theory is entailed by the other and vice versa. This is precisely the definition of equivalent theories introduced in Section 2.1. Consider now the case of two theories Γ and Γ 0 over different sets Σ and Σ 0 of primitives. Assume that both agents do not hinge on their chosen primitive terms but are willing to express them in terms of those of the other. Within the present framework this means to define a translation function µ from Σ to T [Σ 0 ], which then can be naturally extended to a function µ̂ from T [Σ] to T [Σ 0 ]. Since varying the base vocabulary makes it necessary to keep track of it, we consider theories from now on as pairs hΣ, Γ i where Γ is a theory over Σ in the sense introduced in Section 2.1. Keeping this in mind, we often write Γ instead of hΣ, Γ i, if Σ is clear from the context or irrelevant. Definition 3 (Theory Translation). A translation of theories from hΣ, Γ i to hΣ 0 , Γ 0 i is a function µ from Σ to T [Σ 0 ] such that Γ 0 ` µ̂(Γ ). The translation µ is called primitive if µ(Σ) ⊆ Σ 0 . The composite of two theory translations µ from Γ to Γ 0 and µ0 from Γ 0 to Γ 00 is the composite function µb0 ◦ µ from Σ to T [Σ 00 ]. It is straightforward to verify that µb0 ◦ µ is indeed a theory translation from Γ to Γ 00 . In order to characterize the equivalence of theories, the notion of an invertible translation is too restrictive since, for instance, the empty theory over {a} clearly should count as equivalent to the theory {b ≡ c} over {b, c} under any sensible definition of equivalence. We can overcome this problem by relaxing the notion of an invertible translation to that of a quasi-invertible one. Let ıΓ be the identity translation on Γ that is given by the canonical inclusion of Σ into T [Σ]. Definition 4 (Equivalence of Translations). Let µ and ν be theory trans- lations from hΣ, Γ i to hΣ 0 , Γ 0 i. Then µ is equivalent to ν, notation: µ ∼ ν, iff Γ 0 ` µ(p) ≡ ν(p) for every p ∈ Σ. Definition 5 (Quasi-Inverse). Let µ be a theory translation from Γ to Γ 0 and let ν be a translation from Γ 0 to Γ . Then ν is quasi-inverse to µ iff ν ◦ µ ∼ ıΓ and µ ◦ ν ∼ ıΓ 0 . We are now able to formulate an appropriate notion of equivalence between theories with different base vocabularies: Γ and Γ 0 are said to be equivalent if there is a quasi-invertible translation from Γ and Γ 0 ; notation: Γ ∼ Γ 0 . Quasi- invertible translations can be characterized by the following two conditions: Definition 6 (Conservative Translation). A translation µ from Γ to Γ 0 is conservative iff, for all statements α over Σ, Γ 0 ` µ̂(α) only if Γ ` α. 98 Rainer Osswald Definition 7 (Essentially Surjective Translation). A translation µ from Γ to Γ 0 is essentially surjective iff for every p ∈ Σ 0 there is a φ ∈ T [Σ] such that Γ 0 ` p ≡ µ̂(φ). Proposition 2. A theory translation µ has a quasi-inverse if and only if µ is conservative and essentially surjective. Example 3. Let Γ be the theory over Σ = {a0 , a1 , . . . ak } ∪ {b0 , b1 , . . . bk }, with k finite, that consists of the statements an ≡ an+1 ∨ bn+1 and an ∧ bn ≡ Λ (0 6 n < k). The information domain C(Γ ) of Γ consists of the sets ∅, {a0 , a1 , . . . , ak }, and {a0 , a1 , . . . , an−1 , bn } for all n 6 k. Since the elements of C(Γ ) \ {∅} are pairwise incomparable with respect to set inclusion, it follows that C(Γ ) is flat and hence order-isomorphic to the information domain of the theory Γ 0 = {cm ∧ cn ≡ Λ | m 6= n} over Σ 0 = {c0 , c1 , . . . , ck+1 }. Let µ be the func- tion from Σ to T [Σ 0 ] with µ(an ) = cn+1 ∨ . . . ∨ ck+1 and µ(bn ) = cn (0 6 n 6 k). Clearly Γ 0 entails µ̂(Γ ), since (cn+1 ∨ . . . ∨ ck+1 ) ∧ cn ≡ (cn+1 ∧ cn ) ∨ . . . ∨ (ck+1 ∧ cn ) ≡ Λ for all n 6 k. So µ is a translation from Γ to Γ 0 . Now consider the function ν from Σ 0 to T [Σ] such that ν(ck+1 ) = ak and ν(cn ) = bn for every n 6 k. We want to show that ν is a translation from Γ 0 to Γ . To this end, observe that Γ entails bm  an and hence bm ∧ bn  Λ for all m > n. In addition, Γ entails am  an if m > n, from which it follows that am ∧ bn ≡ am ∧ an ∧ bn ≡ Λ whenever m > n. All in all, this proves that Γ entails ν̂(Γ 0 ). It remains to check that ν is quasi-inverse to µ. The only nontrivial claim is that Γ entails ν(µ(an )) ≡ an , i.e., that Γ ` an ≡ bn+1 ∨ . . . ∨ bk ∨ ak , which follows easily by induction. Figure 3 illustrates the situation for k = 1 in terms of Lindenbaum algebras and information domains; in addition, the figure depicts the extents of the primitives. The shaded circles in the diagrams of L(Γ ) and L(Γ 0 ) correspond to join-irreducible elements, which stand in an (order-reversing) one-to-one correspondence to the elements of C(Γ ) and C(Γ 0 ). Clearly, the equivalence µ from Γ to Γ 0 induces isomorphisms between L(Γ ) and L(Γ 0 ) and between C(Γ ) and C(Γ 0 ). Every theory translation µ from Γ to Γ 0 canonically gives rise to an algebra homomorphism L(µ) from L(Γ ) to L(Γ 0 ) as well as to an order-preserving func- tion C(µ) from C(Γ 0 ) to C(Γ ) which preserves suprema of directed sets (cf. e.g. [11, Sect. 8.2]). Proposition 3. Suppose µ and ν are theory translations from Γ to Γ 0 . Then L(µ) = L(ν) iff µ ∼ ν. In particular, L(Γ ) ' L(Γ 0 ) iff Γ ∼ Γ 0 . Proposition 4. A translation µ of theories is conservative iff C(µ) is onto; µ is essentially surjective iff C(µ) is an order embedding. Translations between Concept Hierarchies over Different Attribute Sets 99 1 1 L(Γ ) L(Γ 0 ) b0 ∨ a 0 a0 = a 1 ∨ b 1 c1 ∨ c 2 b0 b1 a1 c0 c1 c2 0 0 0 C (Γ ) C (Γ ) b0 b1 a1 c0 c1 c2 a0 Fig. 3. Lindenbaum algebras and informations domains of Example 3 for k = 1 3.2 Example: Minimal Representations Let P be a finite ordered set. Then P can be represented by a subset system over P , where x ∈ P corresponds to ↓x = {y ∈ P | y 6 x}. Moreover, every subset system over a finite set Σ is the information domain of a theory over Σ. Of course, P may have a representation by a subset system over a set Σ with lower cardinality than P . For instance, if P is bounded and distributive, the set of join-irreducible elements of P can be chosen for Σ. Another type of example is given by the fact that a full binary tree P of height k + 1, which has 2k+1 − 1 nodes and 2k leaves, can be embedded in ℘(Σ) with |Σ| = 2k. The representation problem arising here has the following general form: Given a finite ordered set P , find the least number n such that P can be represented as a subset system over a set of cardinality n. The solution of this problem is of practical importance because representations of ordered sets as subset systems can be easily implemented on a computer via bit-vector encoding. Unfortunately, the representation problem is NP-complete (see e.g. [3]). In terms of theories and translations the representation problem can be rephrased as follows: Given a theory Γ over a finite set Σ, find a set Σ 0 with 0 0 minimal cardinality such that Γ is equivalent to  a theory  Γ over Σ . Notice that n if |Σ 0 | = n then the width of C(Γ 0 ) is at most  n  , by Sperner’s Lemma (cf. 2 e.g. [15]), which gives us a lower bound for the cardinality of Σ 0 .4 4 The width of an ordered set P is the cardinality of a maximal antichain of P . 100 Rainer Osswald C (Γ ) {a, b, c} C (Γ 0 ) {a, b} {b, c} {a} {b} {b} ∅ ∅ Fig. 4. Extension ε where C(ε) is neither one-to-one nor onto 4 Theory Extensions A translation of theories from hΣ, Γ i to hΣ 0 , Γ 0 i is called an extension if its underlying function is an inclusion of Σ into Σ 0 , and Γ ⊆ Γ 0 . We then say that hΣ 0 , Γ 0 i is an extension of hΣ, Γ i. Proposition 5. If ε is an extension of theories from hΣ, Γ i to hΣ 0 , Γ 0 i then C(ε)(Y ) = Y ∩ Σ. In particular, C(ε) is an order embedding if Σ = Σ 0 . Example 4. Let Γ be the empty theory over {a, b} and let ε be its extension to Γ 0 by a single primitive c and the statements a  c and c  b. The induced function C(ε) from C(Γ 0 ) to C(Γ ) is depicted by Figure 4. Notice that C(ε) is neither one-to-one nor onto. 4.1 Conservative Extensions and Rule Extensions Suppose Γ and Γ 0 are theories over Σ such that Γ ⊆ Γ 0 . Then Γ 0 is called a rule extension of Γ . The corresponding extension from Γ to Γ 0 is the identity function on Σ. A rule extension is trivially essentially surjective and the induced function of canonical universes is an inclusion, by Proposition 5; in particular, C(Γ 0 ) ⊆ C(Γ ). The definition of the canonical universe of a theory hΣ, Γ i given in Section 2.2 can be seen as an example of this fact: C(Γ ) is included in the powerset ℘(Σ) of Σ, which is the canonical universe of the empty theory over Σ. An extension hΣ 0 , Γ 0 i of hΣ, Γ i is called conservative if the corresponding extension translation is conservative, that is, if Γ 0 ` α just in case Γ ` α, for every statement α over Σ. In other words, conservative extensions do not entail additional statements over the base vocabulary. It follows by Propositions 5 and 4 that C(Γ ) = {Y ∩ Σ | Y ∈ C(Γ 0 )} in this case. Proposition 6. Let µ be a theory translation from Γ to Γ 0 . (i) If µ is conser- vative, Γ 0 is equivalent to a conservative extension of Γ . (ii) If µ is essentially surjective, Γ 0 is equivalent to a rule extension of Γ . Translations between Concept Hierarchies over Different Attribute Sets 101 {a, c} {b, c} C (Γ ) {c} {−a, b, c} {a, −b, c} {−a, −b, c} {−a, −b, −c} C (Γ ) ∅ Fig. 5. Function of canonical universes induced by Booleanization 4.2 Booleanization Let Γ be a theory over Σ. The Booleanization Γ of Γ is defined as follows: Σ is extended by a disjoint copy {−p | p ∈ Σ} of Σ, and Γ is extended by all statements p ∧ −p  Λ and V  p ∨ −p, with p ∈ Σ, i.e., by all instantiations of the laws of contradiction and excluded middle for primitives. Observe that no element of C(Γ ) is a proper subset of another element of C(Γ ), because each element of C(Γ ) contains either p or −p, but not both, for all p ∈ Σ. Proposition 7. The information domain of the Booleanization of a theory is an antichain. Let ε be the extension from Γ to Γ . By Proposition 5, C(ε) takes Y ∈ C(Γ ) to Y ∩Σ. Observe that C(ε) is onto and one-to-one; in particular, Booleanization is conservative, by Proposition 4. Example 5. Consider the theory Γ = {a ∧ b  Λ, a ∨ b  c} over Σ = {a, b, c}. Then C(Γ ) consists of ∅, {c}, {a, c}, and {b, c}, whereas C(Γ ) consists of {−a, −b, −c}, {−a, −b, c}, {a, −b, c}, and {−a, b, c}. Figure 5 depicts the induced function C(ε) from C(Γ ) to C(Γ ). The Lindenbaum algebra of Γ is easily seen to be complemented, i.e., L(Γ ) is a Boolean lattice. Since C(ε) is onto, L(ε) is an embedding of L(Γ ) into L(Γ ). It is not difficult to prove the following universal property of L(ε): For every homomorphism h from L(Γ ) to a complemented algebra A, there exists a unique homomorphism h0 from L(Γ ) to A such that h = h0 ◦ L(ε). This universal characterization of L(Γ ) provides an additional justification of the term ‘Booleanization’. Moreover, the Boolean lattice L(Γ ) is isomorphic to the Lindenbaum algebra B[Σ]/'Γ of Boolean terms of Γ . 4.3 Rule Completion Besides Booleanization there is a another possible way to extend a theory hΣ, Γ i to a theory whose information domain is an antichain. The idea is to find an extension of Γ whose information domain is the set of maximal elements of the information domain of Γ . (Since every information domain is directed-complete, 102 Rainer Osswald it has maximal elements by Zorn’s Lemma.) Now observe that if such an ex- tension of Γ exists at all, it can be realized by a rule extension, according to Propositions 4 and 6. We speak of a rule completion of Γ in this case. A possible rule completion of the theory Γ of Example 5 is given by Γ 0 = Γ ∪ {V  a ∨ b}. Then C(Γ 0 ) consists of {a, c} and {b, c}. Rule completion, however, is not always possible. A simple counterexample is provided by the full binary exclusion theory Γ = {p ∧ q  Λ | p 6= q} over an infinite set Σ. Then C(Γ ) = {∅} ∪ {{p} | p ∈ Σ}. It can be shown (see [13, p. 27]) that there is no theory Γ 0 over Σ such that C(Γ 0 ) = {{p} | p ∈ Σ}. Hence, there Wis no rule completion of Γ . Intuitively, what is needed here is the statement V  Σ, that is, an infinite disjunction. 4.4 Direct Sums Consider the task of combining two theories hΣ, Γ i and hΣ 0 , Γ 0 i. Let us assume that Σ and Σ 0 and hence Γ and Γ 0 are disjoint. The most obvious way to combine the two theories into one is to take their disjoint union hΣ, Γ i ] hΣ 0 , Γ 0 i = hΣ ∪ Σ 0 , Γ ∪ Γ 0 i, which is also known as their direct sum. Correspondingly, one can define the direct sum of an arbitrary family hhΣi , Γi iii∈I of theories, given that the Σi ’s are pairwise disjoint: U S S i∈I hΣi , Γi i = h i∈I Σi , i∈I Γi i. U It is not difficult to describe the canonical universe of i Γi in terms S ofSthose of the Γi ’s: Let εi be the extension from hΣi , Γi i to the direct U sum h Σ i i , i Γi i. According to Proposition 5, the function C(εi ) from C( S i i Γ ) to C(Γ U i ) takes X to X ∩ Σi . Moreover, if Xi ∈ C(Γi ) for every i, then i Xi ∈ C( i Γi ); hence U S C( i Γi ) = { i Xi | Xi ∈ C(Γi )}. U It follows thatQthe information domain of i Γi is order-isomorphic to the Carte- sian product i C(Γi ) ordered by coordinatewise inclusion: U Q C( i Γi ) ' i C(Γi ). Example 6. Let Γ be a theory over Σ and let Σ 0 be a set disjoint to Σ. We call hΣ ∪ Σ 0 , Γ i the extension of hΣ, Γ i by primitives Σ 0 . Since hΣ ∪ Σ 0 , Γ i is identical to hΣ, Γ i]hΣ 0 , ∅i, it follows that C(hΣ ∪ Σ 0 , Γ i) ' C(hΣ, Γ i)×℘(Σ 0 ). 0 In particular, |C(hΣ ∪ Σ 0 , Γ i)| = |C(hΣ, Γ i)| · 2|Σ | . Example 7. Let Γ be a theory over Σ. Call a primitive p ∈ Σ free with respect to Γ if p does not occur in any statement of Γ . Let σ(Γ ) be the set of primitives occurring in at least one of the statements of Γ , that is, Σ 0 = Σ \ σ(Γ ) is the set of free primitives. Then hΣ, Γ i is the direct sum of hσ(Γ ), Γ i and hΣ 0 , ∅i (and hΣ, Γ i is the extension of hσ(Γ ), Γ i by primitives Σ 0 ). Hence C(hΣ, Γ i) ' C(hσ(Γ ), Γ i) × ℘(Σ 0 ). Translations between Concept Hierarchies over Different Attribute Sets 103 {a, c} {b, c} {a, c} {b, c} {a} {b} {c} × ' {a} {b} {c} ⊇ {c} {b} ∅ ∅ ∅ ∅ {a, b, c} = {a, c} {a, c} {b, c} {a, c} {b, c} {c} × {b} ' {c} {b} ⊇ {c} {b} ∅ ∅ ∅ ∅ Fig. 6. Information domain of extension as subset of product 4.5 Extensions Decomposed An extension of theories from hΣ, Γ i to hΣ 0 , Γ 0 i can be decomposed into an extension of primitives from hΣ, Γ i to hΣ 0 , Γ i followed by a rule extension from hΣ 0 , Γ i to hΣ 0 , Γ 0 i. Recall from Example 6 that C(hΣ 0 , Γ i) = {X ∪ Y | X ∈ C(hΣ, Γ i), Y ⊆ Σ \ Σ 0 } ' C(hΣ, Γ i) × ℘(Σ \ Σ 0 ). Moreover, C(hΣ 0 , Γ 0 i) consists of all elements of C(hΣ 0 , Γ i) that are consistently closed with respect to Γ 0 \ Γ . So we can construct C(hΣ 0 , Γ 0 i) from C(hΣ, Γ i) by taking first the product of C(hΣ, Γ i) and ℘(Σ \ Σ 0 ) and then deleting those elements that are not consistently closed with respect to Γ 0 \ Γ . Example 8. Let Γ 0 be the theory {a ∧ b  Λ, a  c} over Σ 0 = {a, b, c}. Viewed as an extension of the theory {a ∧ b  Λ} over {a, b}, the construction of C(Γ 0 ) by product and deletion is as depicted by the upper row of Figure 6. The lower row of the figure shows the construction if Γ 0 is viewed as an extension of the theory {a  c} over {a, c}. The shaded elements are subject to deletion because they are not consistently closed with respect to Γ 0 \ Γ . 4.6 A Simple Algorithmic Construction Scheme Let Γ be a theory over a finite set Σ. Choose a strictly increasing sequence Σ0 , Σ1 , . . . , Σn of sets, with Σ0 = ∅ and Σn = Σ, and let Γi = Γ |Σi be the set of all statements of Γ with primitives in Σi . Then C(Γ ) can be constructed from C(Γ0 ) by applying the two-step method described in Section 4.5 iteratively to the extension from Γi−1 to Γi , for every i. As we have seen, C(Γi ) can be constructed from C(Γi−1 ) by taking all sets of the form X ∪Y , with X ∈ C(Γi−1 ) and Y ⊆ Σi \Σi−1 , such that X ∪Y is consistently closed with respect to Γi \Γi−1 . 104 Rainer Osswald function C (Σ: set; Γ : theory): system of sets; begin if not cc? (∅, Γ |∅ ) then C := ∅ else begin C := {∅}; Σ 0 := ∅; while Σ 6= ∅ and C 6= ∅ do begin F := any nonempty subset of Σ; Σ 0 := Σ 0 ∪ F ; Γ 0 := Γ |Σ 0 ; C 0 := ∅; foreach X ∈ C, Y ⊆ F do begin X 0 := X ∪ Y ; if cc? (X 0 , Γ 0 ) then C 0 := C 0 ∪ {X 0 } end; Σ := Σ \ F ; Γ := Γ \ Γ 0 ; C := C 0 end end end; function cc? (X: set; Γ : theory): boolean; { true if X is consistently Γ -closed, false otherwise } begin foreach (φ  ψ) ∈ Γ do if X  φ and X 2 ψ then return (false); return (true) end; Fig. 7. A generic algorithmic scheme for information domain construction An algorithmic formulation of this iteration scheme is shown in Figure 7, where the variables F and Σ 0 take the place of Σi \ Σi−1 and Σi , respec- tively. (Notice that the algorithm yields C = ∅ for inconsistent theories like {V  a, a  Λ}.) Calculating the information domain of the i-th extension re- quires to check |C(Γi−1 )| · 2ki sets against |Γi \ Γi−1 | statements, with ki = |Σi \ Σi−1 |. Consequently, C(Γi ) should be of low cardinality (more in the order of |Σi | than of 2|Σi | ) and ki should be near to one. In other words, it is important to choose the partition of Γ \ Γ0 into the sets Γi \ Γi−1 (1 6 i 6 n) in such a way that keeps |C(Γi )| small during the construction process. How to do this in a systemic way is a topic for future research. Notice that a nonredundant theory is not necessarily the best choice. For though less rules reduce the number of tests a single candidate set of primitives has to undergo (in the subroutine cc? of Figure 7), the total number of sets to Translations between Concept Hierarchies over Different Attribute Sets 105 be tested during an extension step can increase because additional (redundant) statements would have pruned C(Γi ) at an earlier stage of the construction. Suppose Γ has reduced normal form. Then every statement of Γ can be represented by a sequent, i.e., a pair hP, Qi of disjoint finite subsets of Σ, where φ is the conjunction of the elements of P and ψ is the disjunction of the elements of Q (with V and Λ arising respectively by conjunction and disjunction of nothing). In terms of the sequent representation of Γ , the if -statement of the subroutine cc? becomes an elementary condition on sets: if P ⊆ X and X ∩ Q = ∅ then return (false); 5 Conclusion We have presented a framework for translating between classifications, taken as theories, over different base vocabulary and studied their effect on the associated information domains, which can be regarded as conceptual hierarchies. In partic- ular, we have studied extensions of theories, with Booleanization as a special case. Moreover, it has been shown how theory extensions can be straightforwardly em- ployed for constructing conceptual hierarchies in a step-by-step fashion. Though the presentation has abstained from using the language of category theory, some readers surely will have noticed that categorical concepts are at least implicit in our treatment of theory translations. Readers interested in a more explicit categorical setting are invited to consult [11]. References 1. Garrett Birkhoff. Lattice Theory. AMS Colloquium Publications, Vol. 25. AMS, Providence, RI, 3rd edition, 1967. 2. Radim Bělohlávek and Vladimı́r Sklenář. Formal concept analysis constrained by attribute-dependency formulas. In Bernhard Ganter and Robert Godin, editors, Proceedings of ICFCA 2005, Lecture Notes in Artificial Intelligence 3403, pages 176–191, Berlin, 2005. Springer. 3. Yves Caseau, Michel Habib, Lhouari Nourine, and Olivier Raynoud. Encoding of multiple inheritance hierarchies and partial orders. Computational Intelligence, 15(1):50–62, 1999. 4. Brian A. Davey and Hilary A. Priestley. Introduction to Lattices and Order. Cam- bridge University Press, Cambridge, 2nd edition, 2002. 5. Manfred Droste and Rüdiger Göbel. Non-deterministic information systems and their domains. Theoretical Computer Science, 75:289–309, 1990. 6. Bernhard Ganter and Rudolf Wille. Contextual attribute logic. In William Tepfen- hart and Walling Cyre, editors, Proceedings of ICCS 1999, Lecture Notes in Arti- ficial Intelligence 1640, pages 377–388, Berlin, 1999. Springer. 7. Hele-Mai Haav. A semi-automatic method to ontology design by using FCA. In Vaclav Snášel and Radim Bělohlávek, editors, Proceedings of CLA 2004. VŠB – Technical University of Ostrava, Department of Computer Science, 2004. 106 Rainer Osswald 8. Jørgen Fischer Nilsson. A logico-algebraic framework for ontologies. In P. Anker Jensen and P. Skadhauge, editors, Ontology-based Interpretation of Noun Phrases, Proceedings of the 1st International OntoQuery Workshop, pages 43–56, University of Southern Denmark – Kolding, 2001. 9. Nikolaj Oldager. Taxonomies with lattice algebras. In K. I. Simov and A. Kiryakov, editors, Proceedings of the 1st International Workshop on Ontologies and Lexical Knowledge Bases (OntoLex), pages 106–117, 2000. 10. Frank J. Oles. An application of lattice theory to knowledge representation. The- oretical Computer Science, 249:163–196, 2000. 11. Rainer Osswald. A logic of classification – with applications to linguistic theory. Technical Report 303 – 9/2003, FernUniversität in Hagen, Department of Com- puter Science, 2003. 12. Rainer Osswald. Two views on building conceptual hierarchies. In Gabriele Kern- Isberner, Wilhelm Rödder, and Friedhelm Kulmann, editors, Proceedings of the 2nd Workshop on Conditionals, Information, and Inference, 2004. Ulm, September 21. 13. Rainer Osswald. Concept hierarchies from a logical point of view. In Supplemen- tary Proceedings of the 3rd International Conference on Formal Concept Analysis (ICFCA-2005), pages 15–30, Lens, France, 2005. 14. Rainer Osswald and Wiebke Petersen. A logical approach to data driven classifi- cation. In KI-2003: Advances in Artificial Intelligence, Lecture Notes in Artificial Intelligence 2821, pages 267–281, Berlin, 2003. Springer. 15. William T. Trotter. Partially ordered sets. In Ronald Graham, Martin Grötschel, and László Lovász, editors, Handbook of Combinatorics, pages 433–480. North- Holland, Amsterdam, 1995.