A Similarity Measure to Generalize Attributes

       Rostand S. Kuitché (1), Romuald E. A. Temgoua (2), and Léonard Kwuida (3)

       (1) Université de Yaoundé I, Département des Mathematiques,
           BP 812 Yaoundé, Cameroun
           kuitcher@yahoo.com
       (2) Université de Yaoundé I, École Normale Supérieure,
           BP 47 Yaoundé, Cameroun
           retemgoua@gmail.com
       (3) Bern University of Applied Sciences,
           Brückenstrasse 73, 3005 Bern, Suisse
           leonard.kwuida@bfh.ch


         Abstract. Formal Concept Analysis (FCA) plays a crucial role in
         various domains, especially in qualitative data analysis. Here
         knowledge is extracted from an information system in the form of
         clusters (forming a concept lattice) or in the form of rules (an
         implication basis). The number of extracted pieces of information
         can grow very fast. To control the number of clusters, one
         possibility is to put some attributes together to get a new
         attribute called a generalized attribute. However, generalizing does
         not always lead to the expected results: the number of concepts can
         even increase exponentially after generalizing two attributes [7,8].
         A natural question is whether there is a similarity measure
         (possibly cheap and fast to compute) that is compatible with
         generalizing attributes, i.e. if $m_1, m_2$ are more similar than
         $m_3, m_4$, then putting $m_1, m_2$ together should not lead to more
         concepts than putting $m_3, m_4$ together. This paper is an attempt
         to answer this question.


  Keywords: Formal Concept Analysis; Generalizing Attributes; Similarity Measures.


  1    Introduction
  In Formal Concept Analysis (FCA), a formal context is a triple
  $(G, M, I)$ that models an elementary information system, where $G$ is the set of
  objects, $M$ the set of attributes and $I \subseteq G \times M$ the incidence relation. To extract
  knowledge from such an elementary information system, one possibility is to get
  clusters of objects and/or attributes by grouping together those sharing the
  same characteristics. These pairs, called concepts, were formalized by Rudolf
  Wille [16]. For $A \subseteq G$ and $B \subseteq M$ we set

  \[ A' = \{ m \in M \mid g \mathrel{I} m \text{ for all } g \in A \} \quad\text{and}\quad
     B' = \{ g \in G \mid g \mathrel{I} m \text{ for all } m \in B \}. \]


© paper author(s), 2018. Proceedings volume published and copyrighted by its editors.
  Paper published in Dmitry I. Ignatov, Lhouari Nourine (Eds.): CLA 2018, pp.
  141–152, Department of Computer Science, Palacký University Olomouc, 2018.
  Copying permitted only for private and academic purposes.


 A concept is a pair $(A, B)$ such that $A' = B$ and $B' = A$. $A$ is called the extent and
 $B$ the intent of the concept $(A, B)$. The set of concepts of a context $\mathbb{K} := (G, M, I)$
 is ordered by the relation $(A, B) \le (C, D) :\iff A \subseteq C$, and forms a lattice,
 denoted by $\mathfrak{B}(\mathbb{K})$ and called the concept lattice of $\mathbb{K}$. To control the size of concept
 lattices, many methods have been suggested: decomposition [18,19,17], iceberg
 lattices [14], α-Galois lattices [15], fault tolerant patterns [3], closure or kernel
 operators and/or approximation [6]. In [7] the authors consider putting together
 some attributes to get a generalized attribute. Doing this, one has to decide when
 an object satisfies a (new) generalized attribute. They discuss several scenarios,
 among which the following, called ∃-generalization:
       an object $g \in G$ satisfies a generalized attribute $s \subseteq M$ if $g$ satisfies at
       least one of the attributes in $s$, i.e. $s' = \bigcup \{ m' \mid m \in s \}$.
 In the rest of this contribution, we will simply say generalization to mean
 ∃-generalization. By generalizing (i.e. putting together some attributes) we reduce
 the number of attributes and hope to also reduce the size of the concept lattice.
 Unfortunately this is not always the case. In [8] the authors provide some examples
 where the size increases exponentially after generalizing two attributes and
 also give the maximal increase.
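
 The notions above (attribute extents, concept extents, ∃-generalization) can be sketched in a few lines of Python. The toy context with objects 1, 2, 3 and attributes p, q, r, as well as the helper names `extents` and `generalize`, are assumptions made for illustration only; they do not come from [7,8]. The sketch counts the concepts before and after merging two attributes, using the fact that the extents are exactly the intersections of attribute extents.

```python
# Toy context (hypothetical): each attribute is stored with its extent m'.
OBJECTS = frozenset({1, 2, 3})
ATTR_EXTENT = {
    "p": frozenset({1}),
    "q": frozenset({1, 2}),
    "r": frozenset({3}),
}


def extents(objects, attr_extents):
    """Return all concept extents, i.e. all intersections of attribute
    extents, including the empty intersection G."""
    result = {frozenset(objects)}
    for ext in attr_extents:
        # Intersect the new attribute extent with every extent found so far.
        result |= {ext & known for known in result}
    return result


def generalize(attr_extent, group, name="s"):
    """Existential generalization: replace the attributes in `group` by one
    attribute `name` whose extent is the union of their extents."""
    merged = frozenset().union(*(attr_extent[m] for m in group))
    reduced = {m: e for m, e in attr_extent.items() if m not in group}
    reduced[name] = merged
    return reduced


before = extents(OBJECTS, ATTR_EXTENT.values())
after = extents(OBJECTS, generalize(ATTR_EXTENT, {"p", "r"}).values())
print(len(before), len(after))  # number of concepts before / after
```

 In this toy case the lattice shrinks; the point of the paper is that this need not happen in general.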
     In [1,5], the authors discuss similarity measures on concepts, and even on
 lattices. For our purpose, we need a measure of similarity on attributes such
 that if $m_1, m_2$ are more similar than $m_3, m_4$, then generalizing $m_1, m_2$ should
 not lead to more concepts than generalizing $m_3, m_4$. We say that such a similarity
 measure is compatible with the generalization. Given a set $M$ of attributes,
 a similarity measure on $M$ is defined as a function $S : M \times M \to \mathbb{R}$ such that
 for all $m_1, m_2$ in $M$,
   (i) $S(m_1, m_2) \ge 0$,                                               positivity
  (ii) $S(m_1, m_2) = S(m_2, m_1)$,                                       symmetry
 (iii) $S(m_1, m_1) \ge S(m_1, m_2)$.                                     maximality
 If in addition $S(m_1, m_2) \le 1$, we say that $S$ is normalized. Similarity measures
 aim at quantifying to which extent two attributes resemble each other. A
 similarity measure compatible with the generalization would be a valuable tool
 in preprocessing and would warn the data analyst about a possible loss or gain when
 generalizing.
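
 The three axioms are easy to test mechanically on a finite attribute set. The following sketch is only an illustration: the function name `is_similarity`, the tolerance and the toy extents are assumptions, and the Jaccard coefficient serves merely as a familiar example of a normalized similarity measure.

```python
def is_similarity(S, attributes, normalized=False, tol=1e-12):
    """Check positivity, symmetry and maximality of S on all pairs."""
    for m1 in attributes:
        for m2 in attributes:
            value = S(m1, m2)
            if value < -tol:                     # (i) positivity
                return False
            if abs(value - S(m2, m1)) > tol:     # (ii) symmetry
                return False
            if S(m1, m1) < value - tol:          # (iii) maximality
                return False
            if normalized and value > 1 + tol:   # optional normalization
                return False
    return True


# Hypothetical attribute extents m' for three attributes.
EXTENT = {"p": {1, 2}, "q": {2, 3}, "r": {3}}


def jaccard(m1, m2):
    """|m1' ∩ m2'| / |m1' ∪ m2'| on the toy extents above."""
    return len(EXTENT[m1] & EXTENT[m2]) / len(EXTENT[m1] | EXTENT[m2])


print(is_similarity(jaccard, EXTENT, normalized=True))
```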
     The rest of the paper is organized as follows: In Section 2, we investigate
 the existing similarity measures that we found in the literature. In Section 3, we
 give a new similarity measure that characterizes the pairs of attributes which can
 increase the size of the concept lattice after generalizing. Section 4 presents an
 example on lexicographic data and Section 5 concludes the paper.


 2     Test of Existing Similarity Measures in ∃-Generalization
 Similarity and dissimilarity measures play a key role in pattern analysis problems
 such as classification, clustering, etc. Ever since Pearson proposed a coefficient
of correlation in 1896, numerous similarity measures and distances have been
proposed in various fields. These measures can be grouped into three main types,
depending on the data on which they are used:

Correlation coefficients: They are often used in data to compare variables
   with qualitative characters subdivided in more than two states.
Distance similarity coefficients: They are generally used in data with pure
   quantitative variables. In most cases, for quantitative data, the similarity
   between two taxa is expressed as a function of their distance in a dimensional
   space whose coordinates are the characters.
Coefficients of association: They are often used in data with presence-absence
   characters or in data with individuals having qualitative characters
   subdivided into two states.

There are two subsets of coefficients of association: those that only depend on
characteristics present in at least one of the taxa compared, but are independent
of the attributes absent in both taxa (denoted by type 1), and those that also
take into account the attributes absent in both taxa (denoted by type 2). Those
measures use

 – a as the number of cases where the two variables occur together in a sample,
 – d as the number of cases where neither of the two variables occurs in a sample,
 – b as the number of cases in which only the first variable occurs, and
 – c as the number of cases where only the second variable occurs.

One of the most important similarity measures of type 1 is the Jaccard measure
$\frac{a}{a+b+c}$, proposed in order to classify ecological species. Also in the ecological
field, the Dice coefficient of association $\frac{2a}{2a+b+c}$ aims at quantifying the
extent to which two different species are associated in a biotope; the Sorensen
coefficient of association $\frac{4a}{4a+b+c}$ and the Anderberg coefficient of
association $\frac{8a}{8a+b+c}$ are of the same type. The Sneath and Sokal 2 similarity
coefficient $\frac{\frac{1}{2}a}{\frac{1}{2}a+b+c}$, put in place in order to compare organisms in
numerical taxonomy, the Kulczynski similarity measure
$\frac{1}{2}\left(\frac{a}{a+b}+\frac{a}{a+c}\right)$ and the
Ochiai similarity measure $\frac{a}{\sqrt{(a+b)(a+c)}}$ are also of this first type.
   The most used similarity coefficient of the second type is the Sokal and
Michener coefficient of association $\frac{a+d}{a+d+b+c}$, also called the simple
matching coefficient, put in place to express the similarity between two species of
bees. Moreover, the Rogers and Tanimoto similarity measure
$\frac{\frac{1}{2}(a+d)}{\frac{1}{2}(a+d)+b+c}$,
whose aim was to compare species of plants in the ecological field, the Sokal and
Sneath 1 similarity coefficient $\frac{2(a+d)}{2(a+d)+b+c}$, defined to make comparisons in
numerical taxonomy, and the Russell and Rao similarity measure $\frac{a}{a+d+b+c}$,
put in place with the aim of showing resemblance between species of anopheline
larvae, are included in this type. So are the Yule and Kendall similarity
coefficients $\frac{ad}{ad+bc}$, often used in the statistical field. Some of the above
similarity measures can be found in [5].
    Regarding the definitions of the above kinds of similarity measures, only the
coefficients of association are suitable for formal contexts, since formal contexts are
data with presence-absence characters. We will investigate the behaviour of these
coefficients of association on a special pair of attributes in some formal contexts.
The objective is to show that these similarity measures are not helpful in deciding
whether the generalization of the pair increases the size of the lattice or not.
    Our first example is an arbitrary formal context $(G, M, I)$ containing two
attributes $x, y \in M$ such that $x' \subseteq y'$ and $|x' \cap y'| = 1$. Then $|x' \setminus y'| = 0$ and
the generalization of the attributes $x$ and $y$ does not increase the size of the
lattice. Choosing $|y' \setminus x'| = 20$ and $|G \setminus (x' \cup y')| = 1$ yields $a = |x' \cap y'| = 1$,
$b = |x' \setminus y'| = 0$, $c = |y' \setminus x'| = 20$ and $d = |G \setminus (x' \cup y')| = 1$. For the coefficients
of association of type 1, namely Jaccard (Jc), Dice (Di), Sorensen (So), Anderberg
(An), Sneath and Sokal 2 (SS2), Kulczynski (Ku) and Ochiai (Orch), and the
coefficients of association of type 2, namely Sokal and Michener (SM), Rogers and
Tanimoto (RT), Sneath and Sokal 1 (SS1) and Russell and Rao (RR), we get the
table below for $s(x, y)$:


  Jc       Di     So     An    SS2      Ku    Orch      SM     RT     SS1     RR
 0.05    0.09   0.17    0.29   0.02    0.52   0.22     0.09   0.05    0.17   0.05


   The table above shows that with almost all these measures, the similarity
measured between the attributes x and y is very low, despite the fact that their
generalization does not increase the size of the lattice.
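
   The values in the table can be reproduced from the four counts a, b, c, d. The sketch below encodes the coefficients as they are written in this section; the dictionary keys follow the abbreviations of the table, while the function name `association_coefficients` is only an illustrative choice. It assumes counts for which no denominator vanishes.

```python
from math import sqrt


def association_coefficients(a, b, c, d):
    """Coefficients of association of type 1 and type 2 for the counts
    a (both present), b (only first), c (only second), d (both absent)."""
    return {
        "Jc": a / (a + b + c),
        "Di": 2 * a / (2 * a + b + c),
        "So": 4 * a / (4 * a + b + c),
        "An": 8 * a / (8 * a + b + c),
        "SS2": 0.5 * a / (0.5 * a + b + c),
        "Ku": 0.5 * (a / (a + b) + a / (a + c)),
        "Orch": a / sqrt((a + b) * (a + c)),
        "SM": (a + d) / (a + b + c + d),
        "RT": 0.5 * (a + d) / (0.5 * (a + d) + b + c),
        "SS1": 2 * (a + d) / (2 * (a + d) + b + c),
        "RR": a / (a + b + c + d),
    }


# First example: |x' ∩ y'| = 1, |x' \ y'| = 0, |y' \ x'| = 20, one object outside.
first = association_coefficients(a=1, b=0, c=20, d=1)
for name, value in first.items():
    print(f"{name}: {value:.2f}")
```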
   Our second example is the formal context K6 := (S6 ∪ {g1 }, S6 ∪ {m1 , m2 }, I)
below, with S6 = {1, 2, 3, 4, 5, 6}.

                               K6  1  2  3  4  5  6  m1 m2
                               1      ×  ×  ×  ×  ×  ×
                               2   ×     ×  ×  ×  ×  ×  ×
                               3   ×  ×     ×  ×  ×  ×  ×
                               4   ×  ×  ×     ×  ×  ×  ×
                               5   ×  ×  ×  ×     ×  ×  ×
                               6   ×  ×  ×  ×  ×        ×
                               g1  ×  ×  ×  ×  ×  ×

We observe that $|m_1' \cap m_2'| = 4$, $|m_1' \setminus m_2'| = 1$ and $|m_2' \setminus m_1'| = 1$. Putting together
the attributes $m_1$ and $m_2$ by an ∃-generalization increases the size of the lattice
by 16. The following table shows the measures of type 1 and type 2 between
the attribute $m_1$ and any other attribute $i$.

              Jc      Di     So     An    SS2     Ku      Orch    SM     RT    SS1     RR
  i ∈ S5    0.57    0.80   0.89    0.94   0.50   0.80     0.80   0.71   0.56   0.83   0.57
   i = 6    0.83    0.91   0.95    0.97   0.71   0.92     0.91   0.75   0.75   0.92   0.71
  i = m2    0.67    0.80   0.89    0.94   0.50   0.80     0.80   0.71   0.56   0.83   0.57

All the similarity measures of the two types show that the attribute $m_1$ is more
similar to $m_2$ than to any other attribute $i \in S_6$ (apart from $i = 6$); but putting
$m_1$ and $m_2$ together increases the size of the lattice. We can conclude that these
similarity measures are not compatible with the ∃-generalization. We are actually
looking for a measure on attributes that will flag pairs of attributes as less similar
when putting them together increases the size of the concept lattice.
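
The figures quoted for $K_6$ can be checked by brute force. The sketch below assumes the following reading of the cross table: on $S_6$ we have i I j iff i ≠ j, the attribute m1 holds for the objects 1 to 5, m2 for the objects 2 to 6, and g1 has all attributes of $S_6$ but neither m1 nor m2. The helper name `count_concepts` is illustrative. The code counts the extents (hence the concepts) before and after replacing m1 and m2 by an attribute s with s' = m1' ∪ m2'.

```python
S6 = range(1, 7)
OBJECTS = frozenset(S6) | {"g1"}

# Attribute extents of K6 (assumed reading of the cross table).
EXTENT = {i: frozenset(j for j in S6 if j != i) | {"g1"} for i in S6}
EXTENT["m1"] = frozenset({1, 2, 3, 4, 5})
EXTENT["m2"] = frozenset({2, 3, 4, 5, 6})


def count_concepts(objects, attr_extents):
    """Number of concepts = number of distinct intersections of extents."""
    found = {frozenset(objects)}
    for ext in attr_extents:
        found |= {ext & known for known in found}
    return len(found)


m1, m2 = EXTENT["m1"], EXTENT["m2"]
a, b, c = len(m1 & m2), len(m1 - m2), len(m2 - m1)

before = count_concepts(OBJECTS, EXTENT.values())
merged = {m: e for m, e in EXTENT.items() if m not in ("m1", "m2")}
merged["s"] = m1 | m2  # extent of the generalized attribute
after = count_concepts(OBJECTS, merged.values())

print(a, b, c)         # sizes of m1' ∩ m2', m1' \ m2', m2' \ m1'
print(after - before)  # increase caused by the generalization
```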


3    A Similarity Measure Compatible with ∃-Generalization
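In this section we define a similarity measure on attributes which is compatible
with the existential generalization. This generalization means that from an
attribute reduced context $\mathbb{K} := (G, M, I)$, two attributes $a, b$ are removed and
replaced with an attribute $s$ defined by $s' = a' \cup b'$. We set $M_0 := M \setminus \{a, b\}$ and

\begin{align*}
\mathbb{K}_{00} &:= (G, M_0, I \cap (G \times M_0)), && \text{(removing $a, b$ from $\mathbb{K}$)}\\
\mathbb{K}_{0s} &:= (G, M_0 \mathbin{\dot\cup} \{s\}, I_{0s}), && \text{(adding $s$ to $\mathbb{K}_{00}$)}
\end{align*}

where $I_{0s} := (I \cap (G \times M_0)) \cup \{(g, s) \mid g \mathrel{I} b \text{ or } g \mathrel{I} a\}$. Furthermore we denote the
set of extents of $\mathbb{K}_{00}$ by $\mathrm{Ext}(\mathbb{K}_{00})$. We also set

\begin{align*}
H(a) &:= \{A \cap a' \mid A \in \mathrm{Ext}(\mathbb{K}_{00}) \text{ and } A \cap a' \notin \mathrm{Ext}(\mathbb{K}_{00})\},\\
H(b) &:= \{A \cap b' \mid A \in \mathrm{Ext}(\mathbb{K}_{00}) \text{ and } A \cap b' \notin \mathrm{Ext}(\mathbb{K}_{00})\},\\
H(a \cup b) &:= \{A \cap (a' \cup b') \mid A \in \mathrm{Ext}(\mathbb{K}_{00}) \text{ and } A \cap (a' \cup b') \notin \mathrm{Ext}(\mathbb{K}_{00})\},\\
H(a \cap b) &:= \{A \cap (a' \cap b') \mid A \in \mathrm{Ext}(\mathbb{K}_{00}) \text{ and } A \cap (a' \cap b') \notin \mathrm{Ext}(\mathbb{K}_{00})\}.
\end{align*}

We will often write $h(x)$ for $|H(x)|$, for any $x \in \{a, b, a \cap b, a \cup b\}$. Before we
start the construction, let us recall the following result partly proved in [8]: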
Theorem 1. Let $\mathbb{K} := (G, M, I)$ be an attribute reduced context with $|G| \ge 3$ and
$|M| > 3$. Let $a$ and $b$ be two attributes such that their existential generalization
$s = a \cup b$ increases the size of the concept lattice. Then
a) $|\mathfrak{B}(\mathbb{K})| = |\mathfrak{B}(\mathbb{K}_{00})| + |H(a, b)|$, with $|H(a, b)| = |H(a) \cup H(b) \cup H(a \cap b)|$.
b) The increase is $|H(a \cup b)| - |H(a, b)| \le 2^{|a'|+|b'|} - 2^{|a'|} - 2^{|b'|} + 1$.
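Proof. Let $\mathbb{K} := (G, M, I)$ be such a context and $a, b$ two attributes of $\mathbb{K}$. We
carry out the ∃-generalization of the attributes $a$ and $b$.

a) We set $\mathbb{K}_a = (G, M \setminus \{b\}, I)$. It holds:

   \[ |\mathfrak{B}(\mathbb{K})| = |\mathfrak{B}(\mathbb{K}_a)| + h^*(b) = |\mathfrak{B}(\mathbb{K}_{00})| + h(a) + h^*(b), \]

   where $h^*(b) = |\{B \cap b' \mid B \in \mathrm{Ext}(\mathbb{K}_a),\ B \cap b' \notin \mathrm{Ext}(\mathbb{K}_a)\}|$. Our aim is to
   express $h^*(b)$ as a function of $h(b)$ and $h(a \cap b)$. According to [8],
   $\mathrm{Ext}(\mathbb{K}_a) = \mathrm{Ext}(\mathbb{K}_{00}) \cup H(a)$. Hence,

   \begin{align*}
   H^*(b) &= \{B \cap b' \mid B \in \mathrm{Ext}(\mathbb{K}_a),\ B \cap b' \notin \mathrm{Ext}(\mathbb{K}_a)\}\\
          &= \{B \cap b' \mid B \in \mathrm{Ext}(\mathbb{K}_{00}) \text{ and } B \cap b' \notin \mathrm{Ext}(\mathbb{K}_a)\}\\
          &\quad \cup \{B \cap b' \mid B \in H(a) \text{ and } B \cap b' \notin \mathrm{Ext}(\mathbb{K}_a)\}.
   \end{align*}

   Replacing $\mathrm{Ext}(\mathbb{K}_a)$ by $\mathrm{Ext}(\mathbb{K}_{00}) \cup H(a)$, we get

   \[ \{B \cap b' \mid B \in \mathrm{Ext}(\mathbb{K}_{00}) \text{ and } B \cap b' \notin \mathrm{Ext}(\mathbb{K}_a)\} = H(b) \setminus H(a) \]

   and

   \[ \{B \cap b' \mid B \in H(a) \text{ and } B \cap b' \notin \mathrm{Ext}(\mathbb{K}_a)\} = H(a \cap b) \setminus (H(b) \cup H(a)). \]

   Thus,

   \begin{align*}
   h^*(b) &= h(b) + h(a \cap b) - |H(a) \cap H(b)| + |H(a \cap b) \cap H(a) \cap H(b)|\\
          &\quad - |H(a \cap b) \cap H(a)| - |H(a \cap b) \cap H(b)|.
   \end{align*}

   Hence,

   \begin{align*}
   |\mathfrak{B}(\mathbb{K})| &= |\mathfrak{B}(\mathbb{K}_{00})| + |H(a)| + |H(b)| + |H(a \cap b)| + |H(a \cap b) \cap H(a) \cap H(b)|\\
          &\quad - |H(a) \cap H(b)| - |H(a \cap b) \cap H(a)| - |H(a \cap b) \cap H(b)|\\
          &= |\mathfrak{B}(\mathbb{K}_{00})| + |H(a) \cup H(b) \cup H(a \cap b)|.
   \end{align*}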

b) Although b) was proved in [8], we can now get it from a). To maximize the
   increase, $a' \cap b'$ should be $\emptyset$; i.e. $|H(a \cap b)| \in \{0, 1\}$.
     • If $|H(a \cap b)| = 0$, then

       \begin{align*}
       |\mathfrak{B}(\mathbb{K})| &= |\mathfrak{B}(\mathbb{K}_{00})| + |H(a) \cup H(b) \cup H(a \cap b)|\\
              &= |\mathfrak{B}(\mathbb{K}_{00})| + |H(a)| + |H(b)|.
       \end{align*}

     • If $|H(a \cap b)| = 1$, then we consider two subcases:
        – The only element of $H(a \cap b)$ is not in $H(a) \cup H(b)$. Then,

          \[ |H(a) \cap H(b)| = |H(a \cap b) \cap H(a) \cap H(b)| = |H(a \cap b) \cap H(a)| = |H(a \cap b) \cap H(b)| = 0 \]

          and $|\mathfrak{B}(\mathbb{K})| = |\mathfrak{B}(\mathbb{K}_{00})| + |H(a)| + |H(b)| + |H(a \cap b)|$.
        – The only element of $H(a \cap b)$ is either in $H(a)$ or $H(b)$. Then

          \[ |H(a \cap b)| + |H(a \cap b) \cap H(a) \cap H(b)| - |H(a \cap b) \cap H(a)| - |H(a \cap b) \cap H(b)| \]

          is equal to zero and $|H(a) \cap H(b)| \in \{0, 1\}$. Thus

          \[ |\mathfrak{B}(\mathbb{K})| = |\mathfrak{B}(\mathbb{K}_{00})| + |H(a)| + |H(b)| + 1 - |H(a) \cap H(b)|. \]

    In all these subcases, considering that $|\mathfrak{B}(\mathbb{K}_{0s})| = |\mathfrak{B}(\mathbb{K}_{00})| + |H(a \cup b)|$, the
    increase after the generalization is

    \begin{align*}
    |\mathfrak{B}(\mathbb{K}_{0s})| - |\mathfrak{B}(\mathbb{K})| &= |H(a \cup b)| - |H(a, b)|\\
        &\le 2^{|a'|+|b'|} - 2^{|a'|} - 2^{|b'|} + (d_1 + d_2 - d_0)\\
        &\le 2^{|a'|+|b'|} - 2^{|a'|} - 2^{|b'|} + 1, \quad \text{since } d_1 + d_2 - d_0 \le 0,
    \end{align*}

    with $d_1 = |\{A \subseteq a' \mid A \in \mathrm{Ext}(\mathbb{K}_{00})\}|$, $d_2 = |\{A \subseteq b' \mid A \in \mathrm{Ext}(\mathbb{K}_{00})\}|$ and
    $d_0 = |\{A \subseteq a' \cup b' \mid A \in \mathrm{Ext}(\mathbb{K}_{00})\}|$.                                   □

   Now, we define the following gain function:

   \[ \psi : M \times M \to \mathbb{Z}, \qquad (a, b) \mapsto \psi(a, b) = |H(a \cup b)| - |H(a, b)|. \]

Note that $H(a \cup b) = H(b \cup a)$, and $H(a, b) = H(b, a)$ because the order of
adding the attributes $a$ and $b$ does not matter. Therefore $\psi(a, b) = \psi(b, a)$. By
definition, $\psi(a, a) = 0$. Further, we define the map $\delta$ as follows:

   \[ \delta : M \times M \to \mathbb{R}, \qquad
      (a, b) \mapsto \begin{cases} 1 & \text{if } \psi(a, b) \le 0,\\ 0 & \text{else.} \end{cases} \]

Since $\mathbb{K}$ is a finite context, there is a pair of attributes $a_0, b_0$ in $M$ such that

   \[ |a_0'| + |b_0'| = \max_{a, b \in M} (|a'| + |b'|). \]

We set $n_0 = 2^{|a_0'|+|b_0'|} - 2^{|a_0'|} - 2^{|b_0'|} + 1$. Then $n_0 \ge 2^{|a'|+|b'|} - 2^{|a'|} - 2^{|b'|} + 1$ for
all pairs $\{a, b\} \subseteq M$. With the function $\delta$, we construct the following map:

   \[ S_{gen} : M \times M \to \mathbb{R}, \qquad
      (a, b) \mapsto S_{gen}(a, b) = \frac{1 + \delta(a, b)}{2} - \frac{|\psi(a, b)|}{2 n_0}, \]

where $|\psi(a, b)|$ is the absolute value of $\psi(a, b)$. That leads to the following results.

Proposition 1. Let $(G, M, I)$ be a reduced context with $|G| \ge 3$ and $|M| > 3$.
Then $S_{gen}$ is a normalized similarity measure on $M$.

Proof. Let $a, b$ be two attributes of $(G, M, I)$. Since $|\psi(a, b)| \le n_0$ we can easily
check that $0 \le S_{gen}(a, b) = S_{gen}(b, a) \le S_{gen}(a, a) = 1$ holds.             □
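
The check amounts to a case distinction on the sign of $\psi(a, b)$. Written out, and using only $|\psi(a, b)| \le n_0$, $\psi(a, b) = \psi(b, a)$ and $\psi(a, a) = 0$ from the definitions above, it reads as follows.

```latex
\[
S_{gen}(a,b) =
\begin{cases}
1 - \dfrac{|\psi(a,b)|}{2n_0} \in \left[\tfrac{1}{2},\, 1\right]
  & \text{if } \psi(a,b) \le 0 \quad (\delta(a,b) = 1),\\[2ex]
\dfrac{1}{2} - \dfrac{\psi(a,b)}{2n_0} \in \left[0,\, \tfrac{1}{2}\right)
  & \text{if } \psi(a,b) > 0 \quad (\delta(a,b) = 0),
\end{cases}
\qquad
S_{gen}(a,a) = 1 - \frac{0}{2n_0} = 1 .
\]
```

Symmetry is inherited from $\psi$ and $\delta$, and maximality follows since every value is at most $1 = S_{gen}(a, a)$.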

   Sgen also has the following properties:

Proposition 2. Let $(G, M, I)$ be a reduced context with $|G| \ge 3$ and $|M| > 3$.
Let $a, b, c, d \in M$. It holds:
 a) $S_{gen}(a, b) \ge \frac{1}{2}$ if and only if $\psi(a, b) \le 0$.
 b) If $\psi(a, b) \le 0 < \psi(d, c)$ then $S_{gen}(d, c) < S_{gen}(a, b)$.
 c) If $0 < \psi(a, b) \le \psi(d, c)$ then $S_{gen}(d, c) \le S_{gen}(a, b)$.
 d) If $\psi(a, b) \le \psi(d, c) \le 0$ then $S_{gen}(a, b) \le S_{gen}(d, c)$.

Proof. Let $\mathbb{K} = (G, M, I)$ be such a context and $a, b, c, d \in M$.
 a) If $\psi(a, b) \le 0$ then $\delta(a, b) = 1$ and

    \[ S_{gen}(a, b) = \frac{1 + \delta(a, b)}{2} - \frac{|\psi(a, b)|}{2 n_0}
       = \frac{1}{2}\left(2 + \frac{\psi(a, b)}{n_0}\right) \ge \frac{1}{2}. \]

    Now, $S_{gen}(a, b) \ge \frac{1}{2}$ implies $\frac{1 + \delta(a, b)}{2} - \frac{|\psi(a, b)|}{2 n_0} \ge \frac{1}{2}$ and $|\psi(a, b)| \le n_0\, \delta(a, b)$.
    If $\delta(a, b) = 0$ then $|\psi(a, b)| = 0$. If $\delta(a, b) = 1$ then $\psi(a, b) \le 0$ by definition
    of $\delta$. Hence, $S_{gen}(a, b) \ge \frac{1}{2}$ if and only if $\psi(a, b) \le 0$.
 b) If $\psi(a, b) \le 0 < \psi(d, c)$ then $S_{gen}(d, c) < \frac{1}{2} \le S_{gen}(a, b)$.
 c) If $0 < \psi(a, b) \le \psi(d, c)$ then $\delta(a, b) = \delta(d, c) = 0$, and

    \[ S_{gen}(d, c) = \frac{1}{2} - \frac{\psi(d, c)}{2 n_0} \le \frac{1}{2} - \frac{\psi(a, b)}{2 n_0} = S_{gen}(a, b). \]

 d) If $\psi(a, b) \le \psi(d, c) \le 0$ then $\delta(a, b) = \delta(d, c) = 1$, and

    \[ S_{gen}(a, b) = 1 + \frac{\psi(a, b)}{2 n_0} \le 1 + \frac{\psi(d, c)}{2 n_0} = S_{gen}(d, c). \]
                                                                                          □

Proposition 3. Let $(G, M, I)$ be a reduced context and $a, b \in M$. The following
assertions are equivalent:
  (i) $\delta(a, b) = 1$.
 (ii) $\psi(a, b) \le 0$.
(iii) $S_{gen}(a, b) \ge \frac{1}{2}$.
 (iv) An ∃-generalization of $a$ and $b$ does not increase the size of the concept lattice.

Proof. (i) ⇐⇒ (ii) follows from the definition of $\delta$. (ii) ⇐⇒ (iii) is Proposition 2
a). (ii) ⇐⇒ (iv) follows from the fact that $\psi(a, b) = |H(a \cup b)| - |H(a, b)|$ is
actually the difference $|\mathfrak{B}(G, M \cup \{s\} \setminus \{a, b\}, I)| - |\mathfrak{B}(G, M, I)|$ between the
number of concepts after and before generalizing $a, b$ to $s$ with $s' = a' \cup b'$.  □

Therefore, generalizing two attributes $a, b$ in a reduced context $(G, M, I)$
increases the size of the lattice if and only if $S_{gen}(a, b) < \frac{1}{2}$. The threshold $\frac{1}{2}$ is
just a consequence of the way $S_{gen}$ has been defined.
    To test our results we have designed a naive algorithm (see Algorithm 1) that
computes $S_{gen}$ on all pairs of attributes $a, b$ of $\mathbb{K}$. If the set of attributes $M$ is
considered as a vector, then for any attribute $a \in M$, we denote by $T(a)$ the set of all
attributes coming before $a$ in $M$. The complexity of our algorithm is given by

   \[ \sum_{a \in M} \Bigl( 1 + \sum_{b \in M \setminus T(a)} \bigl( (q(a, b) + 4)[4(q(a, b) + 1) + 4] + 3 \bigr) \Bigr), \]

which is equal to

   \[ |M| + \sum_{a \in M} \sum_{b \in M \setminus T(a)} \bigl( 4 q^2(a, b) + 24 q(a, b) + 35 \bigr),
      \quad \text{with } q(a, b) = |\mathrm{Ext}(\mathbb{K}_{00})|. \]


 Algorithm 1: Computing a similarity measure
   Data: An attribute reduced context (G, M, I)
   Result: ψ and Sgen on M × M
   Choose x, y in M, x ≠ y, with |x'| + |y'| maximal;
   n0 ← 2^(|x'|+|y'|) − 2^|x'| − 2^|y'| + 1;
   T ← ∅;
   foreach a in M do
       T ← T ∪ {a};
       foreach b in M \ T do
           Ext0 ← Ext(G, M \ {a, b}, I);
           foreach x in {a, b, a ∪ b, a ∩ b} do H(x) ← ∅;
           foreach A in Ext0 do
               foreach x in {a, b, a ∪ b, a ∩ b} do
                   if A ∩ x' ∉ Ext0 then H(x) ← H(x) ∪ {A ∩ x'};
               end
           end
           ψ(a, b) ← |H(a ∪ b)| − |H(a) ∪ H(b) ∪ H(a ∩ b)|; ψ(b, a) ← ψ(a, b);
           if ψ(a, b) ≤ 0 then
               δ(a, b) ← 1
           else
               δ(a, b) ← 0
           end
           Sgen(a, b) ← (1 + δ(a, b))/2 − |ψ(a, b)|/(2 n0)
       end
   end
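
A direct transcription of Algorithm 1 into Python might look as follows. This is a sketch under assumptions, not the authors' implementation: the context is given as a dictionary of attribute extents, the helper names (`extents`, `psi`, `n_zero`, `s_gen`) are invented for illustration, and the data is the building context of Section 4 with objects f, o, h, l, with extents as read from its cross table.

```python
from itertools import combinations

OBJECTS = frozenset("fohl")
# Attribute extents m' (assumed reading of the building context, Section 4).
EXTENT = {m: frozenset(e) for m, e in {
    "building": "fol", "house": "h", "Haus": "ohl", "Gebäude": "f",
    "large": "fl", "business": "fo", "residential": "hl", "small": "oh",
}.items()}


def extents(objects, attr_extents):
    """All intersections of the given attribute extents (including G)."""
    found = {frozenset(objects)}
    for ext in attr_extents:
        found |= {ext & known for known in found}
    return found


def psi(a, b):
    """Gain |H(a ∪ b)| - |H(a) ∪ H(b) ∪ H(a ∩ b)| for attributes a, b."""
    ext0 = extents(OBJECTS, [e for m, e in EXTENT.items() if m not in (a, b)])

    def H(x):
        # Traces A ∩ x of extents of K00 that are not extents of K00.
        return {A & x for A in ext0} - ext0

    old = H(EXTENT[a]) | H(EXTENT[b]) | H(EXTENT[a] & EXTENT[b])
    return len(H(EXTENT[a] | EXTENT[b])) - len(old)


def n_zero():
    """n0 computed from a pair x != y with |x'| + |y'| maximal."""
    x, y = max(combinations(EXTENT, 2),
               key=lambda p: len(EXTENT[p[0]]) + len(EXTENT[p[1]]))
    i, j = len(EXTENT[x]), len(EXTENT[y])
    return 2 ** (i + j) - 2 ** i - 2 ** j + 1


def s_gen(a, b, n0=None):
    """S_gen(a, b) = (1 + delta(a, b)) / 2 - |psi(a, b)| / (2 * n0)."""
    n0 = n_zero() if n0 is None else n0
    gain = psi(a, b)
    delta = 1 if gain <= 0 else 0
    return (1 + delta) / 2 - abs(gain) / (2 * n0)


for a, b in combinations(EXTENT, 2):
    print(f"{a:12s}{b:12s} psi={psi(a, b):3d}  S_gen={s_gen(a, b):.2f}")
```

The set difference in `H` replaces the innermost membership test of Algorithm 1; recomputing the extents of the reduced context for every pair keeps the sketch as naive as the pseudocode.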


4   An Example from Lexicographic Data

Formal Concept Analysis has been applied to compare lexical databases. In [11]
Uta Priss proposes an example in which the information channel is "building".
With respect to this, the main difference between English and German is that in
English, the word "house" only refers to small residential buildings whereas in
German even small office buildings and large residential buildings can be called
"Haus", and only factories would normally not be called "Haus". Moreover,
"building" in English refers to either a factory, an office or even a big residential
house. But only a factory can be called "Gebäude" in German. She presented in
the figure below the information channel of the word "building" in the sense of
Barwise and Seligman [2] in both English and German.

With the above information channel we can construct a formal context as
follows: The objects are different kinds of buildings: small house ("h"), office ("o"),
factory ("f") and large residential house ("l"). The attributes are different names
of these objects in both languages, English and German. These are "building",
"house", "Haus", "Gebäude", "large building" (short: "large"), "business
building" (short: "business"), "residential house" (short: "residential"), and "small
house" (short: "small"). Thus $G = \{h, o, f, l\}$ and $M$ = {"building", "house",
"Haus", "Gebäude", "large", "business", "residential", "small"}. In the
following, a set of objects will be denoted as a concatenation of those objects. For
example we will write $ho$ or $oh$ for the set $\{h, o\}$. The English and German
classifications of the word "building" are then presented in the following formal
context:
                   building  house  Haus  Gebäude  large  business  residential  small
      factory (f)     ×                      ×       ×       ×
      office (o)      ×              ×                       ×                     ×
      house (h)                ×     ×                                   ×         ×
      large (l)       ×              ×               ×                   ×

For this formal context, $n_0 = 2^{3+3} - 2^3 - 2^3 + 1 = 49$. Let us consider the attributes
$a :=$ house and $b :=$ Gebäude. Then $a' \cup b' = \{f, h\}$ and $a' \cap b' = \emptyset$. We have

   \[ \mathrm{Ext}(\mathbb{K}_{00}) = \{fohl, fol, ohl, fo, fl, ol, oh, hl, f, o, h, l, \emptyset\}, \]

$H(a) = H(b) = H(a \cap b) = \emptyset$ and $H(a \cup b) = \{fh\}$. Therefore, $\psi(a, b) = 1$
and $S_{gen}(a, b) = \frac{1}{2} - \frac{1}{98} \approx 0.49$. Using our algorithm, we compute $\psi(a, b)$ and
$S_{gen}(a, b)$ for all pairs $a, b \in M$. The table below shows $\psi(a, b)$ below the diagonal,
and $S_{gen}(a, b)$ on the rest.

               building house Haus Gebäude large business residential small
   building      1.00    0.98 0.97   1.00    0.99   0.98      0.97      0.97
      house      −2      1.00 1.00   0.49    0.49   0.49      1.00      1.00
      Haus       −3        0  1.00   0.98    0.97   0.97      0.99      0.99
   Gebäude        0       1   −2    1.00    1.00   1.00      0.49      0.49
       large     −1        1   −3     0      1.00   0.98      0.49      0.97
   business      −2        1   −3     0       −2    1.00      0.98      0.49
 residential     −3        0   −1     1       1      −2       1.00      0.98
      small      −3        0   −1     1       −3     1         −2       1.00

    From the above table, the attributes "house" and "Gebäude" are less similar
($S_{gen} < \frac{1}{2}$). This reflects the fact that the words "Gebäude" (in German) and
"house" (in English) do not have the same meaning. It is also the case for the
attributes "house" and "business building" as well as "Gebäude" and "residential
house". Hence, putting together each of the above pairs of attributes will increase
the size of the lattice. On the contrary, the attributes "large" and "Haus", as well
as "building" and "Haus", are more similar with respect to $S_{gen}$. This is because
the word "Haus", which designates a house, a business office or simply a large
building in German, often coincides with the words "building" or "large building"
in English. For these pairs, the existential generalization will not increase the size
of the lattice.


5   Conclusion
We have constructed a similarity measure compatible with the change in the size
of the lattice after a generalization of a pair of attributes in a formal context.
That measure should send a warning when grouping two attributes. Also, it
enables us to characterize contexts where generalizing two attributes increases
the size of the concept lattice. Our next step is to look at the implications between
generalized attributes. We suspect that the number of implications decreases if
the number of concepts increases.


References
 1. Alqadah, F., Bhatnagar, R.: Similarity Measures in Formal Concept Analysis.
    AMAI – Springer (2009)
 2. Barwise, J., Seligman, J.: Information Flow: the logic of distributed systems.
    Cambridge University Press (1997)
 3. Besson, J., Pensa, R. G., Robardet, C., Boulicaut, J.: Constraint-Based Mining of
    Fault-Tolerant Patterns from Boolean Data. KDID 55–71 (2005)
 4. Dice, L. R.: Measures of the Amount of Ecologic Association Between Species. esa.
    Promoting the Science of Ecology Vol. 26, No 3, 297–302 (1945)


 5. Domenach, F.: Similarity Measures of concept lattices. In: Lausen B.,
    Krolak-Schwerdt S., Böhmer M. (eds) Data Science, Learning by Latent Structures, and
    Knowledge Discovery. Studies in Classification, Data Analysis, and Knowledge
    Organization. Springer, Berlin, Heidelberg
 6. Kwuida, L.: On concept lattice approximation. In Osamu Watanabe (Ed.)
    Foundations of Theoretical Computer Science: For New Computational View. RIMS No.
    1599, Proceedings of the LA Symposium (2008), January 28–30, Kyoto, Japan,
    42–50 (2008)
 7. Kwuida, L., Missaoui, R., Balamane A., Vaillancourt, J.: Generalized pattern
    extraction from concept lattices. AMAI – Springer 151–168 (2014)
 8. Kwuida, L., Kuitché, R. S., Temgoua, E. R. A.: On the Size of ∃-Generalized
    Concepts. ArXiv:1709.08060.
 9. Ganter, B., Wille, R.: Formal Concept Analysis. Mathematical Foundations.
    Springer (1999)
10. Jaccard, P.: Nouvelle recherche sur la distribution florale. Bulletin de la Société
    Vaudoise des Sciences Naturelles (1908)
11. Priss, U.: Linguistic Applications of Formal Concept Analysis. Formal Concept
    Analysis, 149–160 (2005)
12. Rogers, D. J., Tanimoto, T. T.: A Computer Program for classifying plants.
    Springer (1960)
13. Sneath, P. H. A.: The Application of Computers to Taxonomy J. gen Microbiol.
    17, 201–226 (1957)
14. Stumme, G., Taouil, R., Bastide, Y., Pasquier, N., Lakhal, L.: Computing iceberg
    concept lattices with Titanic. Data Knowl. Eng., 42(2): 189–222 (2002)
15. Ventos, V., Soldano, H., Lamadon, T.: Alpha Galois Lattices. ICDM, 555–558
    (2004)
16. Wille, R.: Restructuring lattice theory: an approach based on hierarchies of con-
    cepts. In I. Rival (Ed.) Ordered Sets. Reidel, 445–470 (1982)
17. Wille, R.: Lattices in data analysis: how to draw them with a computer. In I.
    Rival (Ed.) Algorithms and Order. Kluwer, 33–58 (1989)
18. Wille, R.: Tensorial decomposition of concept lattices. Order 2, 81–95 (1985)
19. Wille, R.: Subdirect product construction of concept lattices. Discrete Mathematics
    63, 305–313 (1987)