=Paper= {{Paper |id=Vol-1624/paper19 |storemode=property |title=On Stability of Triadic Concepts |pdfUrl=https://ceur-ws.org/Vol-1624/paper19.pdf |volume=Vol-1624 |authors=Sergei O. Kuznetsov,Tatyana P. Makhalova |dblpUrl=https://dblp.org/rec/conf/cla/KuznetsovM16 }} ==On Stability of Triadic Concepts== https://ceur-ws.org/Vol-1624/paper19.pdf
               On Stability of Triadic Concepts

                Sergei O. Kuznetsov1 and Tatiana P. Makhalova1,2
1
    National Research University Higher School of Economics, Kochnovsky pr. 3,
                             Moscow 125319, Russia
     2
       ISIMA, Complexe scientifique des Cézeaux, 63177 Aubière Cedex, France

                     skuznetsov@hse.ru,t.makhalova@gmail.com



        Abstract. Triadic formal concept analysis has become a popular re-
        search direction, since triadic relations give natural models of many data
        collections. In this paper we address the problem of selecting most inter-
        esting concepts by proposing triadic stability indices.


1      Introduction
Triadic formal concept analysis (3FCA) was introduced by Rudolf Wille and
Fritz Lehmann [1] to model hierarchies of classes and dependencies arising from
ternary relations. Recently, several algorithms for computing frequent tricon-
cepts were proposed [2, 3]. It is noticed that some infrequent concepts are still
interesting, since they represent extraordinary or uncommon data. In this paper
we propose triadic stability for selecting interesting triadic concepts. Together
with exact stability indices we suggest their efficient approximations analogous
to ∆-stability introduced in [4].


2      Main definitions
2.1     Formal Concept Analysis
First, we briefly recall some basic definitions of the Formal Concept Analysis
(FCA) [5]. A formal context is a triple (G, M, I). G and M are sets of objects
and attributes respectively, and I is an incidence relation. It is defined as the
Cartesian product G × M , i.e. (g, m) ∈ I if the object g ∈ G has the attribute
                                       0
m ∈ M . The derivation operators (·) are defined for A ⊆ G and B ⊆ M as
follows:

                            A0 = {m ∈ M | ∀g ∈ A : gIm}
                            B 0 = {g ∈ G | ∀m ∈ B : gIm}

A0 is the set of attributes common to all objects of A, and B 0 is the set of objects
                                                           0
sharing all attributes of B. The double application of (·) is a closure operator,
        00
i.e. (·) is extensive, idempotent and monotone. Subsets A ⊆ G, B ⊆ M such
that A = A00 and B = B 00 are called closed.
    A (formal) concept is a pair (A, B), where A ⊆ G, B ⊆ M and A0 = B,
  0
B = A. A is called the (formal) extent, and B is called the (formal) intent of
the concept (A, B).
    A concept lattice (or Galois lattice) is a partial ordered set of concepts,
the order 6 on the set of concepts is defined as follows: (A, B) ≤ (C, D) iff
A ⊆ C (D ⊆ B), a pair (A, B) is a subconcept of (C, D), while (C, D) is a
superconcept of (A, B). Each finite lattice has the highest element with A = G,
called the top element, and the lowest element with B = M , called the bottom
element.

2.2     Triadic Concept Analysis
In the case of a triadic relation one deals with a quadruple (G, M, B, Y ), called
a triadic context. G, M , B are sets and Y is a ternary relation between G, M
and B, i.e. Y ⊆ G × M × B; the elements of G, M and B are called objects,
attributes and conditions respectively, and (g, m, b) ∈ Y is read: object g has
attribute m under condition b.
    The dyadic derivation operators can be used to construct triadic concepts.
A triadic context can be represented as follows: K := (K1 , K2 , K3 , Y ), where K1
is a set of objects G, K2 is a set of attributes and K3 is a set of conditions, and
each element of Ki may be seen as an instance of Peirce’s i-th category [1]. For
every triadic context one defines
                                the following dyadic contexts:
    K1 := K1 , K2 × K3 , Y (1) with gY (1) (m, b) :⇔ (g, m, b) ∈ Y
                               
    K2 := K2 , K1 × K3 , Y (2) with mY (2) (g, b) :⇔ (g, m, b) ∈ Y
                               
    K3 := K3 , K1 × K2 , Y (3) with bY (3) (g, m) :⇔ (g, m, b) ∈ Y              
                                                                    (i,j)                (i,j)
      For {i, j, k} = {1, 2, 3} and Ak ⊆ Kk , one defines KAk := Ki , Kj , YAk                   ,
                    (i,j)
where (ai , aj ) ∈ YAk if and only if (ai , aj , ak ) ∈ Y
                                                    for all ak ∈ Ak .
                                  (i)
    Put differently, the context K is a flattened representation of the original
                         (i,j)
triadic context, while KAk corresponds to the relation between elements of Ki
and Kj that belong to Ak .

(i)-derivation operator For {i, j, k} = {1, 2, 3} with j < k and for X ⊆ Ki and
Z ⊆ Kj × Kk the (i)-derivation operators are defined by


        X 7−→ X (i) := {(aj , ak ) ∈ Kj × Kk | (ai , aj , ak ) ∈ Y for all ai ∈ X}

            Z 7−→ Z (i) := {ai ∈ Ki | (ai , aj , ak ) ∈ Y for all (aj , ak ) ∈ Z}

(i, j, Xk )-derivation operators For {i, j, k} = {1, 2, 3} and for Xi ⊆ Ki , Xj ⊆ Kj
and Ak ⊆ Kk the (i, j, Xk )-derivation operators are defined by
                (i,j,Ak )
      Xi 7−→ Xi             := {aj ∈ Kj | (ai , aj , ak ) ∈ Y for all (ai , ak ) ∈ Xi × Ak }
                (i,j,Ak )
      Xj 7−→ Xj             := {ai ∈ Ki | (ai , aj , ak ) ∈ Y for all (aj , ak ) ∈ Xj × Ak }
    A triadic concept (triconcept) of K := (K1 , K2 , K3 , Y ) is a triple (A1 , A2 , A3 )
                                                        (i)
with Ai ⊆ Ki for i ∈ {1, 2, 3} and Ai = (Aj × Ak ) for every {i, j, k} = {1, 2, 3}
with j < k. The sets A1 ,A2 , and A3 are called extent, intent and modus of the
triadic concept respectively. We let T (K) denote the set of all triadic concepts
of K.
    A triadic concept lattice has three maximal elements, namely ((K2 × K3 )(1) ,
K2 , K3 ), (K1 , (K1 × K3 )(2) , K3 ), and (K1 , K2 , (K1 × K2 )(3) ). For any two ele-
ments of a lattice one defines tree types of set inclusion/exclusion relations, which
satisfy the following antiordinal dependencies: (A1 , A2 , A3 ) G (B1 , B2 , B3 ) iff
A1 ⊆ B1 , A2 ⊇ B2 , A3 ⊇ B3 , (A1 , A2 , A3 ) M (B1 , B2 , B3 ) iff A1 ⊇ B1 , A2 ⊆
B2 , A3 ⊇ B3 or (A1 , A2 , A3 ) C (B1 , B2 , B3 ) iff A1 ⊇ B1 , A2 ⊇ B2 , A3 ⊆ B3 .


3    Stability Indices For Triadic Concepts

Stability indices for formal concepts were introduced in [6, 7] and modified in [8].
We define stability indices for the triadic case in a similar way. We describe two
types of stability that correspond to the derivation operators defined above.

(i)-stability For a triadic concept (A1 , A2 , A3 ) the (i)-stability is defined by:
                                       
                   (i)                | X ⊆ Ai |X (i) = (Aj × Ak ) |
              Stab (A1 , A2 , A3 ) :=
                                                      2|Ai |
    This index shows how much the binary relation on sets Xj and Xk is depen-
dent on particular elements of a subset Ai .

(i, j, Xk )-stability For a triadic concept (A1 , A2 , A3 ) the (i, j, Xk )-stability is
defined by:
                                                
                  (i,j,Xk )                    | X ⊆ (Ai × Aj ) |X (k) = Ak |
             Stab           (A1 , A2 , A3 ) :=
                                                        2|Ai |+|Aj |
     The (i, j, Xk )-stability allows us to estimate the dependence of a subset Ak
on elements of the (Xi , Xj )-relation.

Example Below we consider a small examples of computing stability indices for
a concept. The formal context is given in the table 1.


                               Table 1. Triadic context

                             α       β       γ       δ
                          a b c d a b c d a b c d a b c d
                        1×        ×     ×       ×
                        2×        ×××       ××      ××
                        3     ×××××         ××
                        4               ×
   Let us consider a triconcept C = ({2, 3} , {b, c} , {β, γ}) with (1) - stability
and (1, 3, X2 )-stability.
                                                              (1)
   Stab(1) (C) = 12 . Since the numerator is comprised by {3} = ({b, c} × {β, γ})
           (1)
and {2, 3} = ({b, c} × {β, γ}).


    Table 2. I ⊆ M × C corresponding to all possible subsets of the extent {2, 3}

               {∅} a b c d {2} a b c d {3} a b c d {2, 3} a b c d
                α ×××× α ×              α      ×× α
                β ×××× β ×××            β ×××        β ×××
                γ ×××× γ         ××     γ    ××      γ      ××
                δ ×××× δ         ××     δ            δ


   Stab(1,3,X2 ) (C) = 38 . To compute this value one needs to check 16 subsets
of X1 × X3 and corresponding subsets of X2 . The following sets occur in the
numerator: {∅, 2, 3, 23} × {∅, β, γ, βγ}.
                                                          (1,3,A2 )               (1,3,A2 )
    {b, c} = ({2, 3} , {β, γ})(1,3,A2 ) = ({2, 3} , {γ})              = ({3} , {γ})
                              (1,3,A2 )               (1,3,A2 )                       (1,3,A2 )
                ({3} , {β, γ})            = ({2} , {γ})           = ({2} , {β, γ})

4     Estimates of stability
The problem of computing stability is #P -complete [6, 7], therefore, in practice,
when one deals with a big context and with the huge amount of generated
concepts, it is very difficult to apply these indices. That’s why, estimates of the
stability for dyadic concepts have been proposed [9, 4].
    We have expanded the ∆-stability [4] for the case of triadic stability indices.
In this regard, it is important to note that the estimates derived from the di-
rect descendants of a triconcept can be useless owing to the defined quasiorders,
because the number of direct neighbors is usually small. In figure 1 the distri-
butions of the descendants number with respect to different inclusion/exclusion
relations are represented.




                     Fig. 1. The number of neighbors distributions


   Instead of considering the set difference between (i)-th components of a tri-
concept and each direct descendant, we consider the set difference between (i)-th
components of a triconcept c = (A1 , A2 , A3 ) and other, possibly, unclosed con-
cepts derived by adding new elements from Kj \ Aj , j 6= i or Kj × Kk \ Aj × Ak ,
j, k 6= i. Put differently, the lower and upper bounds estimates of stability index
look as follows:
                           X
                 −log2              2−∆(c,d) ≤ LStab (c) ≤ ∆min (c) ,
                         d∈Enl(c)

where ∆min (c) = mind∈Enl(c) ∆ (c, d),
                 n                                                    o
        Enl (c) = X | X = {Ak ∪ x} , x ∈ Kk \ Ak , X (k) ⊆ (Ai × Aj )

and ∆ (c, d) is the difference between |Aj | · |Ak | and the number of elements in
X (k) for estimates of (i, j, Xk )-stability.
               n                                                             o
    Enl (c) = X | X = {Aj × Ak ∪ x} , x ∈ Kj × Kk \ Aj × Ak , X (i) ⊆ Ai

and ∆ (c, d) is the difference between |Ai | and the number of elements in X (i)
for estimates of (i)-stability.

Example. Consider upper and lower bounds of stability estimates for C =
({2, 3} , {b, c} , {β, γ}) from the running example (Table 1). To get an estimate of
the (1)-stability we consider elements of the following set {a, b, c, d}×{α, β, γ, δ}\
{b, c}×{β, γ}. Subsets of A1 derived from those elements are {∅} and {2}, which
give us −log2 (7 · 2−2 + 2−1 ) and 1 for lower and upper bounds, respectively. To
get estimates of (1, 3, X2 )-stability one needs to expand the intent by elements
from {a, d}. Adding the first element a reduces the {2, 3}×{β, γ} to {2, 3}×{β},
while expanding the intent by d results in the empty set. Thus, the lower and
upper bounds take values 1.678 and 2, respectively.


5    Experimental Results

In this section we explore some empirical properties of the introduced indices
using synthetic data. We generated four groups of 100 random 10 × 10 × 10
contexts with densities 0.1, 0.2, 0.4, 0.6. The features of the data are given in
Figure 2.




        Fig. 2. Parameters of lattices constructed on 10 × 10 × 10 contexts.
    The choice of a subset of indices for data exploration can be motivated by
the following properties: the indices should be pairwise uncorrelated (to avoid
biased results when combining indices) and efficiently computable (if possible).
The density function of an index may be a multimodal mixture of two or more
distributions. In this case one needs a special justification for the choice of a
threshold value separating two distributions.
    We consider Pearson’s correlation between all pairs of stability indices and
cardinalities of sets that comprise a triadic concept (extent, intent, modus). In
Figure 3 the values of the coefficient are represented. The sizes of the extent,
intent and modus are denoted by |A1 |, |A2 |, |A3 |, respectively. The sizes of dyadic
subcontexts are denoted in a similar way. The corresponding stability estimates
are referred to by the log prefix.




Fig. 3. The Pearson’s correlation coefficient among 100 contexts with the density 0.4


    As can be seen from Figure 3, there is a correlation between (i)-stability and
|Aj | · |Ak |. The index of (i)-stability correlates less strongly with the estimates
of (j, k, Xi )-stability and the size of the set Ai . These types of correlation be-
come stronger as the density of a context increases. In fact, these indices can
be replaced by the size of a particular set in the case of a dense context. A cor-
relation is observed between (i, j, Xk )-stability and its estimates, a less strong
correlation is observed between (i, j, Xk )-stability and |Ai | · |Aj |. It is important
to note that the strong correlation between (i, j, Xk )-stability and its estimates,
as well as very small correlation between (i, j, Xk )-stability and estimates of
(k)-stability remains the same with different context densities. The pairwise cor-
relation between stability indices is weak, hence it is preferable to use these
indices together.
    For selecting interesting concepts based on values of an index it is important
to choose a correct threshold. This choice can be based on the distribution of
index values. Figure 4 shows that the distribution of values (2)-stability and
(1, 3, Xk )-stability (other (i)-stabilities and (i, j, Xk )-stabilities have similar dis-
tributions). The distribution of (i)-stability values allows us to identify a thresh-
old easily, since some picks exist in the distribution, while for (i, j, Xk )-stability
the distribution varies from density to density, in case of a dense context it mo-
tivates further study of the index and the way one selects thresholds for it. For
values of the lower bound of stability estimates the modes of the distribution
become less distinct or the distribution becomes unimodal (Figures 5,6). The
upper bound for (i)-stability estimate (or (i, j, Xk )) in most cases corresponds
to |Ai | (or |Ai | · |Aj |), since the closure of a superset of Aj × Ak (or Ak ) results
in the empty set.




       Fig. 4. The distribution of values of (2)-stability and (1, 3, X2 )-stability




               Fig. 5. The values distribution of (2)-stability estimates


    The lower bound of (i)-stability is also strongly correlated with the size of
set i. This is due to the fact that a big size of the set i leads to larger difference
between the sizes of Ai and (Xj ×Xk )(i) , where Xj ×Xk is a superset of Aj ×Ak ,
and the sum under logarithm. The estimate of (i, j, Xk )-stability are correlated
            Fig. 6. The values distribution of (i, j, Xk )-stability estimates


with the corresponding indices. There is roughly the same correlation between
the estimate and the value |Ai | · |Aj |, which results from the bigger difference
between |Ai | · |Aj | and |Xi | · |Xj |, which correspond to a superset of Ak . The
upper and lower estimates of (i, j, Xk )-stability also correlate, in this case, the
correlation could be related to set-difference between the set Ai × Aj and the
volume of the rectangular subarea of Xi × Xk for the corresponding superset of
Ak .
    It is noteworthy that the calculation of stability estimates in practice could
take more time then the stability calculation itself. It is typical for (i)-stability,
where the number 2|Ai | is lower then the number of all possible subsets obtained
by adding elements from Kj × Kk \ Aj × Ak .


6    Conclusion

In this paper we have introduced two stability indices for triadic concepts, based
on two derivation operators, and studied their empirical behavior. We have pro-
posed to compute stability using two derivation operators. We have studied
correlation of stability indices and their distributions, which is important in
practical data analysis. As it was shown, the introduced stability indices are not
pairwise correlated and therefore can be used in some combinations for selecting
interesting concepts. Moreover, (i)-stability correlates with |Ai |(for dense con-
texts) and |Aj ||Ak |, and hence these indices should not be combined together.
    The values of (i)-stability for all concepts are characterized by the presence
of groups of values with high frequency, which facilitates selection of interesting
concepts based on threshold values, while the distribution of (i, j, Xk )-stability
does not give clearly defined groups of interesting concepts.
    We have also introduced the estimates of stability indices, which correlate
both with the corresponding stability indices and some of stability estimates.
This is due to the fact that the estimates of (i)-stability (or (i, j, Xk )-stability)
are based on the elements from Kj × Kk \ Aj × Ak (or Kk \ Ak ). Hence, the
choice between stability and its estimates must be guided by the comparison
of the sizes of sets involved in calculation, e.g. in the case of (i)-stability the
number of subsets 2|Ai | , most probably, will be less then the number of elements
in Kj × Kk \ Aj × Ak .
    The proposed indices characterize triconcepts differently, in general they do
not agree in the top-n selected concepts, which allow us to use either their
combination to set up the strictest selection criteria, or to take some of them
depending on the meaning behind a stability index.


References
1. Lehmann, F., Wille, R.: A triadic approach to formal concept analysis. In: Con-
   ceptual Structures: Applications, Implementation and Theory: Third International
   Conference on Conceptual Structures, ICCS ’95 Santa Cruz, CA, USA, August 14–
   18, 1995 Proceedings. Springer Berlin Heidelberg, Berlin, Heidelberg (1995) 32–43
2. Jäschke, R., Hotho, A., Schmitz, C., Ganter, B., Stumme, G.: Trias–an algorithm
   for mining iceberg tri-lattices. In: Proceedings of the Sixth International Conference
   on Data Mining. ICDM ’06, Washington, DC, USA, IEEE Computer Society (2006)
   907–911
3. Ji, L., Tan, K.L., Tung, A.K.H.: Mining frequent closed cubes in 3d datasets. In:
   Proceedings of the 32Nd International Conference on Very Large Data Bases. VLDB
   ’06, VLDB Endowment (2006) 811–822
4. Buzmakov, A., Kuznetsov, S.O., Napoli, A.: Scalable estimates of concept stability.
   In Glodeanu, C., Kaytoue, M., Sacarea, C., eds.: Formal Concept Analysis. Lecture
   Notes in Computer Science, Springer International Publishing (ICFCA, 2014) 157–
   172
5. Ganter, B., Wille, R.: Contextual attribute logic. In Tepfenhart, W., Cyre, W., eds.:
   Conceptual Structures: Standards and Practices. Volume 1640 of Lecture Notes in
   Computer Science. Springer Berlin Heidelberg (1999) 377–388
6. Kuznetsov, S.O.: Stability as an estimate of degree of substantiation of hypotheses
   derived on the basis of operational similarity. Nauchn. Tekh. Inf., Ser. 2 (12) (1990)
   21–29
7. Kuznetsov, S.O.: On stability of a formal concept. Annals of Mathematics and
   Artificial Intelligence 49(1-4) (2007) 101–115
8. Kuznetsov, S.O., Obiedkov, S., Roth, C.: Reducing the representation complexity
   of lattice-based taxonomies. In: Conceptual Structures: Knowledge Architectures
   for Smart Applications. Springer Berlin Heidelberg (2007) 241–254
9. Babin, M.A., Kuznetsov, S.O.: Approximating concept stability. In Domenach, F.,
   Ignatov, D., Poelmans, J., eds.: Formal Concept Analysis. Volume 7278 of Lecture
   Notes in Computer Science., Springer Berlin Heidelberg (2012) 7–15