Triadic Factor Analysis

                                Cynthia Vera Glodeanu

                           Technische Universität Dresden,
                              01062 Dresden, Germany
                   Cynthia_Vera.Glodeanu@mailbox.tu-dresden.de


        Abstract. This article is an extension of work which suggests using
        formal concepts as optimal factors of Factor Analysis. They discussed
        a method for decomposing a p × q binary matrix W into the Boolean
        matrix product P ◦ Q of a p × n binary matrix P and a n × q binary
        matrix Q with n as small as possible.
        We have generalised this factorization problem to the triadic case, looking
        for a decomposition of a p × q × r Boolean 3d-matrix B into the Boolean
        3d-matrix product P ◦ Q ◦ R for p × n, q × n and r × n binary matrices
        P, Q and R with n as small as possible.
        The motivation is given by the increasing interest in Triadic Concept
        Analysis due to Web 2.0 applications.

        Keywords: Triadic Concept Analysis, Factor Analysis, Web 2.0


1     Introduction and Previous works

The interest in decomposition of triadic data in psychology goes back at least to
the beginning of the 90’s [2]. However at that time the methodologies were not
as developed as nowadays. The interest in triadic data is also increasing due to
the structure of Web 2.0 data [4].
   This paper contains a new method for factorization of triadic data which
could be developed as a generalisation of the dyadic case presented in [1].1


2     Triadic Concept Analysis

For an introduction to Formal Concept Analysis we refer the reader to [5]. The
triadic approach to Formal Concept Analysis was introduced by Rudolf Wille
and Fritz Lehmann in [6]. It is based upon C.S. Pierce’s pragmatic philosophy
with his three universal categories. We just give a compressed introduction to
this field and refer the interested reader to [6, 7].

Definition 1. A triadic context is defined as a quadruple (K1 , K2 , K3 , Y )
where K1 , K2 and K3 are sets and Y is a ternary relation between K1 , K2 and
1
    During the revision period it turned out there is yet unpublished work of R. Be-
    lohlavek and V. Vychodil dealing with the same subject [3]
128       Cynthia Vera Glodeanu

K3 , i.e. Y ⊆ K1 ×K2 ×K3 . The elements of K1 , K2 and K3 are called (formal)
objects, attributes and conditions, respectively, and (g, m, b) ∈ Y is read: the
object g has the attribute m under the condition b.

   Small triadic contexts can be represented through three-dimensional cross
tables. Examples are given in Table 1 and 2 on page 6 and 11.

Definition 2. A triadic concept (shortly triconcept) of a triadic context
(K1 , K2 , K3 , Y ) is defined as a triple (A1 , A2 , A3 ) with A1 ⊆ K1 , A2 ⊆ K2
and A3 ⊆ K3 that is maximal with respect to component-wise set inclusion in
satisfying A1 × A2 × A3 ⊆ Y , i.e. for X1 ⊆ K1 , X2 ⊆ K2 and X3 ⊆ K3 with
X1 × X2 × X3 ⊆ Y , the containments A1 ⊆ X1 , A2 ⊆ X2 , and A3 ⊆ X3 always
imply (A1 , A2 , A3 ) = (X1 , X2 , X3 ).
    For a triconcept (A1 , A2 , A3 ), the components A1 , A2 and A3 are called the
extent, the intent, and the modus of (A1 , A2 , A3 ) respectively.

    Pictorially, a triconcept (A1 , A2 , A3 ) is a rectangular box full of crosses in
the three-dimensional cross table representation of (K1 , K2 , K3 , Y ), where ‘box’
is maximal under proper permutation of rows, columns and layers of the cross
table.

Definition 3. For {i, j, k} = {1, 2, 3} with j < k and for X ⊆ Ki and Z ⊆
Kj × Kk , the (i)-derivation operators are defined by:

         X 7→ X (i) := {(aj , ak ) ∈ Kj × Kk | (ai , aj , ak ) ∈ Y for all ai ∈ X},
         Z 7→ Z (i) := {ai ∈ Ki | (ai , aj , ak ) ∈ Y for all (aj , ak ) ∈ Z}.

     These derivation operators correspond to the derivation operators of the
dyadic contexts defined by K(1) := (K1 , K2 × K3 , Y (1) ), K(2) := (K2 , K1 ×
K3 , Y (2) ) and K(3) := (K3 , K1 × K2 , Y (3) ) where gY (1) (m, b) :⇔ mY (2) (g, b) :⇔
bY (3) (g, m) :⇔ (g, m, b) ∈ Y . Due to the structures of triadic contexts fur-
ther derivation operators are necessary for the computation of triconcepts. For
{i, j, k} = {1, 2, 3} and for Xi ⊆ Ki , Xj ⊆ Kj and Xk ⊆ Kk the (i, j, Xk )-
derivation operators are defined by
               (i,j,Xk )
      Xi 7→ Xi             := {aj ∈ Kj | (ai , aj , ak ) ∈ Y for all (ai , ak ) ∈ Xi × Xk },
             (i,j,Xk )
      Xj 7→ Xj         := {ai ∈ Ki | (ai , aj , ak ) ∈ Y    for all (aj , ak ) ∈ Xj × Xk }.

These derivation operators correspond to the derivation operators of the dyadic
contexts defined by Kij                       ij                       ij
                            Xk := (Ki , Kj , YXk ) where (ai , aj ) ∈ YXk if and only if
(ai , aj , ak ) ∈ Y for all ak ∈ Xk .
     The structure on the set of all triconcepts T(K) is the set inclusion in each
component of the triconcept. There is for each i ∈ {1, 2, 3} a quasiorder .i and
its corresponding equivalence relation ∼i defined by

                 (A1 , A2 , A3 ) .i (B1 , B2 , B3 ) :⇐⇒ Ai ⊆ Bi and
                 (A1 , A2 , A3 ) ∼i (B1 , B2 , B3 ) :⇐⇒ Ai = Bi (i = 1, 2, 3).
                                                                Triadic Factor Analysis    129

    These quasiorders satisfy the antiordinal dependences: for {i, j, k} = {1, 2, 3},
(A1 , A2 , A3 ) .i (B1 , B2 , B3 ) and (A1 , A2 , A3 ) .j (B1 , B2 , B3 ) imply (A1 , A2 , A3 )
&k (B1 , B2 , B3 ) for all triconcepts (A1 , A2 , A3 ) and (B1 , B2 , B3 ) of K. For i 6= j
the relation ∼i ∩ ∼j is the identity on T(K), i.e. if we have two components
of the triconcept the third one is uniquely determined by them. [(A1 , A2 , A3 )]i
denotes the equivalence class of ∼i containing the triconcept (A1 , A2 , A3 ). The
quasiorder .i induces an order ≤i on the factor set T(K)/ ∼i of all equivalence
classes of ∼i which is given by [(A1 , A2 , A3 )]i ≤i [(B1 , B2 , B3 )]i ⇔ Ai ⊆ Bi .
(T(K)/ ∼1 , ≤1 ), (T(K)/ ∼2 , ≤2 ) and (T(K)/ ∼3 , ≤3 ) can be identified with the
ordered set of all extents, intents and modi of K respectively. Generally every
ordered set with smallest and greatest element is isomorphic to the ordered set
of all extents, intents respectively modi of some triadic context. This means that
the extents, intents respectively modi do not form a closure system in general
as in the dyadic case.
    An analogous structure to the concept lattice in the dyadic case is given
by T(K) := (T(K), .1 , .2 , .3S      ). Let {i, j, k} = {1, 2, 3}, Xi and Xk beSsets of
triconcepts of K and Xi := {Ai | (A1 , A2 , A3 ) ∈ Xi } and Xk := {Ak |
(A1 , A2 , A3 ) ∈ Xk }. The ik-join of (Xi , Xk ) is defined to be the triconcept

                   ∇(Xi , Xk ) := (B1 , B2 , B3 ) with
                                       (i,j,Xk )(i,j,Xk )
                            Bi := Xi                        ,
                                       (i,j,Xk )
                            Bj :=    Xi          ,
                                        (i,j,Xk )(i,j,Xk )    (i,j,Xk ) (k)
                            Bk :=    (Xi                   × Xi        ) .

   The Basic Theorem of triconcept Analysis [7] proves that T(K) of a triadic
context K is a complete trilattice with its ik-joins being exactly the ik-joins
obtained by using the derivation operators. Further, every complete trilattice
L := (L, .1 , .2 , .3 ) is isomorphic to a concept trilattice of a suitable triadic
context for which one can choose (L, L, L, YL ) with YL := {(x1 , x2 , x3 ) ∈ L3 |
there exists an u ∈ L with u .i xi for i = 1, 2, 3}.


3    Factor Analysis through Formal Concept Analysis

This section contains the main results from [1] and the translation of the problem
to Formal Concept Analysis. In [1] an approach to Factor Analysis is presented:
a p×q binary matrix W is decomposed into the Boolean matrix product P ◦Q of
a p × n binary matrix P and a n × q binary matrix Q with n W   as small as possible.
                                                                 n
The
W     Boolean matrix product  P ◦ Q is defined as (P  ◦ Q)ij =   l=1 Pil · Qlj , where
   is the maximum and · is the product. Through the Boolean matrix product a
non-linear relationship between objects, factors and attributes is given.
    The matrices W , P and Q represent an object-attribute, object-factor and
factor-attribute relationship respectively. W = P ◦ Q means that object i is
incident with attribute j if and only if there is a factor l which applies to i and
contains j.
130    Cynthia Vera Glodeanu

   The method developed in [1] uses formal concepts as factors which produce
decompositions with smallest number of factors possible. Contexts can be seen
as Boolean matrices by replacing in the cross table the crosses by 1’s and the
blanks by 0’s.

    In the language of Formal Concept Analysis we give the following definition
for factors:
Definition 4. A subset of formal concepts F ⊆ B(G, M, I) such that I =
S
  (A,B)∈F A × B is called factorization. If F is minimal with respect to its car-
dinality then it is called optimal factorization. The elements of F are called
(optimal) factors.
   For subset F = {(A1 , B1 ), . . . , (An , Bn )} ⊆ B(G, M, I) of formal concepts
we construct binary matrices AF and BF as follows:
                     ½                                       ½
                       1 if i ∈ Al                             1 if j ∈ Bl
           (AF )il =                ,              (BF )lj =
                       0 if i ∈
                              / Al                             0 if j ∈
                                                                      / Bl

We consider a decomposition of the Boolean matrix W associated to (G, M, I)
into the Boolean matrix product AF ◦ BF .
Theorem 1. Universality of concepts as factors [1]
For every W there is F ⊆ B(G, M, I) such that W = AF ◦ BF .
Theorem 2. Optimality of concepts as factors [1]
Let W = P ◦ Q for p × n and n × q binary matrices P and Q. Then there exists
a set F ⊆ B(G, M, I) of formal concepts of W with |F| ≤ n such that for the
p × |F| and |F| × q binary matrices AF and BF we have W = AF ◦ BF .
                                                                             W
   The proofs are based on the fact that a binary matrix can be seen as a -
superposition of rectangles full of crosses. Each such rectangle in contained in
a maximal rectangle. Every maximal rectangle full of crosses corresponds to a
formal concept.
Theorem 3. Mandatory factors [1]
If W = AF ◦ BF for some subset F ⊆ B(G, M, I) of formal concepts then
O(G, M, I) ∩ A(G, M, I) ⊆ F, where O(G, M, I) and A(G, M, I) are the sets of
object and attribute concepts respectively.
   The proof is based on the fact that if one considers a formal concept (A, B) ∈
O(G, M, I) ∩ A(G, M, I), then (A, B) = ({g}pp , {g}p ) = ({m}p , {m}pp ) for some
g ∈ G and m ∈ M . The formal concept (A, B) is the only one which contains
the tuple (g, m).
Theorem 4. [1, 8]
The problem to find a decomposition W = P ◦ Q of an p × q binary matrix W
into an p × n binary matrix P and a n × q binary matrix Q with n as small as
possible is NP-hard and the corresponding decision problem is NP-complete.
                                                                  Triadic Factor Analysis   131

4    Triadic Factor Analysis
In this section K is a triadic context.
Definition 5. A subset F ⊆ T(K) such that
                               [
                        Y =          (A × B × C)
                                       (A,B,C)∈F

is called a factorization of the triadic context K. If F is minimal with respect
to its cardinality then it is called optimal factorization. The elements of F
are called (optimal) factors.
    In the dyadic case we have worked with Boolean matrices. For the triadic
case we need the notion of a Boolean 3d-matrix (shortly 3d-matrix) which
is a rectangular box Bp×q×r such that bijk ∈ {0, 1} for all i ∈ {1, . . . , p}, j ∈
{1, . . . , q}, k ∈ {1, . . . , r}. For a 3d-matrix B we write B = B1 | . . . | Br where
the Bk with k ∈ {1, . . . , r} are p × q binary matrices called layers.
    For a triadic context (K1 , K2 , K3 , Y ) with |K1 | = p, |K2 | = q, |K3 | = r we
obtain the corresponding 3d-matrix by replacing in the three-dimensional cross
table the crosses by 1’s and the blanks by 0’s.
    In order to define the 3d Boolean matrix product for a factorization F =
{(A1 , B1 , C1 ), . . . , (An , Bn , Cn )} of a triadic context K we first have to build
matrices AF , BF , CF : AF is defined as in the dyadic case and
                            ½                                     ½
                                1 if j ∈ Bl                         1 if k ∈ Cl
                (BF )jl =                   ,           (CF )kl =
                                0 if j ∈
                                       / Bl                         0 if k ∈
                                                                           / Cl
for all l ∈ {1, . . . , n}.
    AF , BF , CF represent relationships between objects and factors, attributes
and factors, conditions and factors respectively, i.e (AF )il = 1 means that ob-
ject i can be described through factor l, (BF )jl = 1 means that attribute j is
contained in factor l, and (CF )kl = 1 means that the factor l exists under the
condition k.
Definition 6. For p×n, q×n and r×n binary matrices P, Q and R the Boolean
3d-matrix product (shortly 3d-product) is defined as a ternary operation:
                                                     n
                                                     _
                             (P ◦ Q ◦ R)ijk :=             Pil · Qjl · Rkl
                                                     l=1

with i ∈ {1, . . . , p}, j ∈ {1, . . . , q} and k ∈ {1, . . . , r}.
   For the 3d-product defined above we consider two equivalent expressions.
The first one is given by:
                                              n
                                              _
                       (P ∗ Q ◦ R)ijk :=            (P ∗ Q)(ij)l · Rkl with
                                              l=1
                           (P ∗ Q)(ij)l := Pil · Qjl
132      Cynthia Vera Glodeanu

where P ∗ Q is a p × q × n 3d-matrix and its n layers result from the Boolean
multiplication of the l-th column P l of P and the l-th column Q l of Q. If we
replace P and Q by AF and BF respectively then a layer l with l ∈ {1, . . . , n}
of the 3d-matrix describes the relationship between the objects and attributes
of the l-th factor. Furthermore we have to check which factors exist under which
conditions. Therefore we take R = CF which describes the relationship between
conditions and factors. The value of (P ∗ Q ◦ R)ijk is the maximum over the
product between (P ∗ Q)(ij)l and Rkl for each l ∈ {1, . . . , n}. Finally P ∗ Q ◦ R
corresponds to the 3d-matrix we wanted to decompose.
    This 3d-product mimics the best the situation of the dyadic case, since P ∗ Q
describes the relation between objects and attributes. Because in the triadic case
these tuples belong to different layers, we have to check under which conditions
the tuples exist.

   Consider the triadic context K = ({1, 2, 3}, {a, b, c}, {b1 , b2 }, Y ) given by the
three-dimensional cross table in Table 1. and the corresponding optimal factor-


                                          b1    b2
                                        a b c a b c
                                      1 ××      ×
                                      2 ××      ×
                                      3 × × × ×
                           Table 1. Example of triadic context


ization F = {({1, 2}, {a, b}, b1 ), (3, {a, c}, {b1 , b2 }), ({1, 2}, b, {b1 , b2 })}. Then the
3d-product is given by:
                                             
                          101            110          µ        ¶
                                                         111
    AF ∗ BF ◦ CF =  1 0 1  ∗  1 0 1  ◦                        =
                                                         011
                          010            010
                                                
                          110|000|010                 µ        ¶
                                                         111
                   = 1 1 0 | 0 0 0 | 0 1 0 ·                     =
                                                         011
                          000|101|000
                                                                                    
                          110|000 _ 000|000 _ 010|010
                   = 1 1 0 | 0 0 0 0 0 0 | 0 0 0 0 1 0 | 0 1 0
                          000|000                 101|101                  000|000

      Another Boolean 3d-matrix product is given by:
                                           n
                                           _
                      (P ◦ Q ∗ R)ijk :=          Pil · (Q ∗ R)(jk)l with
                                           l=1
                         (Q ∗ R)(jk)l := Qjl · Rkl
                                                            Triadic Factor Analysis         133

This 3d matrix product is very similar to the above one. However the differ-
ence is in the interpretation. The operations ◦, ∗ and · are defined as above but
carried out on different matrices. If we replace P, Q and R by AF , BF and CF
respectively then the lth layer of BF ∗ CF describes the relationship between
attributes and conditions for the l-th factors with l ∈ {1, . . . , n}. P l · (Q ∗ R)( )l
describes the relation between the objects and pairs of attributes and conditions
for the l-th factor.
    For the triadic context from above we obtain the following 3d-product:
                                               
                          101        10|11|00
    AF ◦ BF ∗ CF =  1 0 1  ·  1 0 | 0 0 | 1 1  =
                          010        00|11|00
                                                                              
                          110|000 _ 000|000 _ 010|010
                     = 1 1 0 | 0 0 0 0 0 0 | 0 0 0 0 1 0 | 0 1 0
                          000|000            101|101                 000|000

   We give the triadic versions of the theorems presented in [1] showing that
an optimal factorization of a triadic context can be obtained by triconcepts.
The proofs are also very similar to the dyadic case, the difference consists in
considering a different matrix product.

Theorem 5. For every 3d-matrix B there is F ⊆ T(K) such that:

                                    B = AF ◦ BF ◦ CF .

Proof. Bijk = 1 iff there is a triconcept (A, B, C) ∈ T(K) such that i ∈ A,
j ∈ B and k ∈ C. If we choose for F the set of all triconcepts T(K) then
(AF ◦ BF ◦ CF )ijk = 1 iff there is l such that (AF )il = 1, (BF )jl = 1 and
(CF )kl = 1 iff there is (Al , Bl , Cl ) ∈ T(K) with i ∈ Al , j ∈ Bl and k ∈ Cl iff
Bijk = 1.                                                                        ⊓
                                                                                 ⊔

    Using triconcepts as factors we obtain a factorization with smallest number
of factors possible as stated by the next theorem:

Theorem 6. Let K = (K1 , K2 , K3 , Y ) be a triadic context and B the corre-
sponding 3d-matrix such that B = P ◦ Q ◦ R for p × n, q × n and r × n binary
matrices P, Q and R. Then there exists a set F ⊆ T(K) of triconcepts with
|F| ≤ n such that for the p × |F|, q × |F| and r × |F| binary matrices AF , BF
and CF we have B = AF ◦ BF ◦ CF .

Proof. Recall that triconcepts are maximal rectangular boxes full of 1’s which
are maximal w.r.t. the component wise set inclusion.
      Through B = P ◦ W Q ◦ R for p × n, q × n and r × n binary matrices P, Q and
R, B is written as a -superposition, as a union of n rectangular boxes full of
1’s, but these rectangular boxes do not have to be maximal. B = Pi1 · Qj1 · Rk1 ∨
· · · ∨ Pin · Qjn · Rkn with i ∈ {1, . . . , p}, j ∈ {1, . . . , q} and k ∈ {1, . . . , r}. B is
the union of rectangular boxes Bl = P l ◦ Q l ◦ R l . B = P ◦ Q ◦ R for p × n, q × n
134       Cynthia Vera Glodeanu

and r × n binary matrices P, Q and R with corresponding rectangular boxes
B1 , . . . , Bn . Each such rectangular box Bl with l ∈ {1, . . . , n} is contained in a
maximal rectangular box B̃l of B, i.e. (Bl )ijk ≤ (B̃l )ijk for all i, j, k.
Every maximal rectangular box B̃l corresponds to a triconcept (Al , Bl , Cl ) in
that (B̃l )ijk = 1 iff i ∈ Al , j ∈ Bl and k ∈ Cl . Since
                                   n
                                   _                  n
                                                      _
                          Bijk =         (Bl )ijk ≤         (B̃l )ijk ≤ Bijk
                                   l=1                l=1
 W
a -superposition of maximal rectangular boxes B̃l yields B. Putting therefore
F = {(A1 , B1 , C1 ), . . . , (An , Bn , Cn )} we get B = AF ◦ BF ◦ CF . Further |F| © n
may happen because two distinct rectangular boxes can be contained in the same
maximal rectangular box.                                                               ⊓
                                                                                       ⊔
   The next definition contains the notion of d-cut which is actually a special
case of Kij               ij
         Xk = (Ki , Kj , YXk ) for Xk ⊆ Kk and |Xk | = 1. Each d-cut is itself a
dyadic context.
Definition 7. For a triadic context K = (K1 , K2 , K3 , Y ) a dyadic-cut
(d-cut) is defined as:
                             ciα := (Kj , Kk , Yαjk )
where {i, j, k} = {1, 2, 3} and α ∈ Ki .
      For every triadic context there are three families of d-cuts:

                          c1 := {c1g := (K2 , K3 , Yg23 )}g∈K1                      (1)
                          c2 := {c2m := (K1 , K3 , Ym13 )}m∈K2                      (2)
                          c3 := {c3b := (K1 , K2 , Yb12 )}b∈K3                      (3)

Hence, (3) represents horizontal cuts in K for each condition b ∈ K3 . The family
{c3b }b∈K3 of d-cuts contains (at most) |K3 | d-cuts. Such a d-cut is itself a dyadic
context, namely (K1 , K2 , Yb12 ), and represents the relations between the object
and attribute set of K under the condition b ∈ K3 . (1) represents vertical cuts in
K for each object g ∈ K1 . Such a d-cut contains the relationship between all the
attributes and conditions for the object which generated the d-cut. (2) represents
also vertical cuts in K but this time one cuts for every attribute m ∈ K2 . Such
a d-cut contains the relationship between all the objects and conditions for the
attribute which generated the d-cut.
     Obviously one can reconstruct the triadic context K from the d-cuts by ’glu-
ing’ them together. For the d-cut families of a triadic context the following
equations hold:
                   [                  [                  [
             Y =       {g} × Yg23 =       {m} × Ym13 =       {b} × Yb12 .
                   g∈K1                    m∈K2                       b∈K3

    Since each d-cut is a dyadic context, we can compute its concepts. For a d-cut
ciα = (Kj , Kk , Yαjk ) its concepts are the pairs (Aj , Ak ) such that for all {j, k} ⊆
                                                                        Triadic Factor Analysis              135

                                                                                                       (j,k,α)
{1, 2, 3} with j < k and Aj ⊆ Kj , Ak ⊆ Kk , α ∈ Ai we have Aj 7→ Aj
                (j,k,α)
and Ak 7→ Ak         with the derivation operators given in Section 3. From this
concept of the d-cut we can compute the corresponding triconcept (A1 , A2 , A3 )
                                           (i)
with (Aj , Ak ) given as above and Ai 7→ Ai for all {i, j, k} = {1, 2, 3}.
   For different d-cuts of a d-cut family we may have identical dyadic contexts.
This happens whenever the relationship between Ki and Kj are the same for
some elements from Kk with {i, j, k} = {1, 2, 3}.

Definition 8. A factor (A1 , A2 , A3 ) is called mandatory if and only if there
exists a tuple (g, m, b) ∈ Y such that (A1 , A2 , A3 ) is the only triconcept satisfying:
(g, m, b) ∈ A1 × A2 × A3 .

Theorem 7. A factor (A1 , A2 , A3 ) of a factorization B = AF ◦ BF ◦ CF for a
subset F ⊆ T(K) of triconcepts is mandatory if and only if it is of the form
                             (1,2,x3 )(1,2,x3 )     (1,2,x3 )      (1,2,x3 )(1,2,x3 )      (1,2,x3 ) (3)
     (A1 , A2 , A3 ) = (x1                        , x1          , (x1                   × x1       )       ) (4)
                            (1,2,x3 )    (1,2,x3 )(1,2,x3 )      (1,2,x3 )     (1,2,x3 )(1,2,x3 ) (3)
                      =   (x2         , x2                  , (x2          × x2                   ) )        (5)
                            (1,3,x2 )(1,3,x2 )     1,3,x2 )(1,3,x2 )       (1,2,x3 ) (2)     (1,3,x2 )
                      =   (x1                  , (x1                  × x1           ) , x1            )     (6)
                            (1,3,x2 )     (1,3,x2 )      (1,2,x3 )(1,3,x2 ) (2)      (1,3,x2 )(1,3,x2 )
                      =   (x3         , (x3         × x3                    ) , x3                      )    (7)
                             (2,3,x1 )(2,3,x1 )       (2,3,x1 ) (1)     (2,3,x1 )(2,3,x1 )    (2,3,x1 )
                      =   ((x2                   × x2           ) , x2                     , x2         )    (8)
                             (2,3,x1 )       (2,3,x1 )(2,3,x1 ) (1)     (2,3,x1 )    (2,3,x1 )(2,3,x1 )
                      =   ((x3          × x3                    ) , x3            , x3                  )    (9)

for fixed xι ∈ Kι with ι ∈ {1, 2, 3}.

Remark 1. We know from the dyadic case that the mandatory factors are those
which are object and attribute concepts simultaneously. In the triadic case we
have for the d-cut family c1 attribute and condition triconcepts ((4), (5)), for
c2 object and condition triconcepts ((6), (7)) and for c3 object and attribute
triconcepts ((8), (9)). A factor is mandatory in the factorization of a triadic
context if and only if it is a mandatory factor for at least one d-cut factorization
of each d-cut family.

Proof. (A1 , A2 , A3 ) as written in (4) and (5) is the only triconcept such that
xι ∈ Aι with ι ∈ {1, 2, 3} in the d-cut c3x3 .
    Suppose there is another triconcept (B1 , B2 , B3 ) 6= (A1 , A2 , A3 ) from c3x3 . If
x1 ∈ B1 then due to the properties of the triadic derivation operators we obtain:
        (1,2,x3 )(1,2,x3 )    (1,2,x3 )(1,2,x3 )
A1 = x1                    ⊆ B1                  = B1 . Since (A1 , A2 , A3 ) and (B1 , B2 , B3 )
                                           (1,2,x3 )        (1,2,x3 )
are triconcepts we have A2 = A1                      ⊇ B1             = B2 . On the other hand
                                      (1,2,x3 )(1,2,x3 )        (1,2,x3 )(1,2,x3 )
from x2 ∈ B2 follows A2 = x2                             ⊆ B2                      = B2 . Altogether
                                         (1,2,x3 )       (1,2,x3 )
we obtain B2 = A2 and A1 = A2                      = B2            = B1 . Since two components
of the triconcept uniquely determine the third one, we also have B3 = A3 . This
means that (A1 , A2 , A3 ) = (B1 , B2 , B3 ). The triconcept (A1 , A2 , A3 ) is the only
one which contains the tuple (x1 , x2 , x3 ) in the d-cut c3x3 .
136      Cynthia Vera Glodeanu

    Analogously we consider the cases (6), (7) and (8), (9). If a triconcept (A1 , A2 ,
A3 ) can be written in the form of (4) − (9) then is the only triconcept from T(K)
which contains the tuple (x1 , x2 , x3 ).                                           ⊓
                                                                                    ⊔

    A possibility to compute the factorization of a triadic context is through
the d-cuts. One determines the optimal factorization of every d-cut of a d-cut
family and takes their disjoint union as the factorization of the triadic context.
However this method is naive and laborious since it may happen that we compute
the same triconcept several times. Finally we also have to check the disjointnes
of the dyadic factorizations. However Algorithm 1 eliminates this unnecessary
verification.
    The number of triconcepts even for a small triadic context can be quite big
and searching for factors in T(K) can get cumbersome. On the other hand the ik-
joins obtained by triconcepts from different d-cuts usually increase the number
of factors since they cover just partially the incidence of the ik-join components.
    It follows trivially from the dyadic case that finding a factorization of a 3d-
matrix B into the 3d product P ◦ Q ◦ R for p × n, q × n and r × n binary matrices
P, Q and R is NP-hard as well.
    Algorithm 1 is the adoptions to the triadic case of the greedy approximation
algorithm presented in [1]:2


Algorithm 1
Input: B (Boolean 3d-matrix)
Output: F (set of factor concepts)
U := {(i, j, k) | Bi,j,k = 1}
for k := 1 to |K3 |
    Uk := {(i, j, k) | Bi,j,k = 1}
F := ∅
for k := 1 to |K3 |
    while U 6= ∅ or Uk 6= ∅:
     B := ∅
     V := 0
     while there is j ∈ / B such that |B ⊕ j| ª V :
        do select j ∈ / B that maximizes B ⊕ j:
            B := (B ∪ j)(1,2,k)(1,2,k)
            V := |(B (1,2,k) × B) ∩ U |
     A := B (1,2,k)
     C := (A × B)(3)
     add (A, B, C) to F
     for each (i, j, k) ∈ A × B × C:
        remove (i, j, k) from U and from Uk
return F

2
    see also [3] for another algorithm and test data
                                                            Triadic Factor Analysis         137

     Thereby B ⊕ j = ((B ∪ j)(1,2,k) × (B ∪ j)(1,2,k)(1,2,k) ) ∩ U. In this algorithm
we consider the d-cut family of the conditions. Alternatively one can choose
also another d-cut family. As presented in [1] for the dyadic S        case if (A, B) is a
concept of a d-cut then the intent can be written as B = j∈B j (1,2,k)(1,2,k) .
Further if j ∈   / B then ((B ∪ {j})(1,2,k) , (B ∪ {j})(1,2,k)(1,2,k) ) is a concept with
B ⊂ (B ∪ {j})(1,2,k)(1,2,k) . This algorithm selects the triconcepts which cover
as much of the incidence relation as possible. Finally we compute the third
component of the triconcept. The sets U and Uk with k ∈ {1, . . . , |K3 |} assure
that we do not compute unnecessary triconcepts.
     It may happen that for different d-cut families we obtain different factoriza-
tion from which some are optimal and others not. It may happen that there exists
aStriconcept (A1 , A2 , A3 ) in the factorization of a d-cut such that A1 ×A2 ×A3 ⊆
   (B1 ,B2 ,B3 )∈F \(A1 ,A2 ,A3 ) B1 × B2 × B3 . An improvement of Algorithm 1 can be
 done by searching and removing such factors from the factorization.

    The example we give is a small one and serves as an illustration of the meth-
ods developed in this paper. It is however a paradigmatic one since it reflects the
usual setting in Web 2.0 applications. The object set contains the hostels {Nuevo
Suizo, Samay, Oasis Backpacker, One, Ole Backpacker, Garden Backpacker},
the attribute set is given by the services {character, safety, location, staff, fun,
cleanliness}. Since we have chosen the hostels with best ratings, the attribute
values are - (good) or , (excellent) and are considered as tags. In the triadic
context we make a cross in the corresponding line of object, attribute and condi-
tion if the service was rated as excellent. The conditions are given by the overall
rating of users from { [9],[10],[11] }. We number consecutively the elements from
K1 , K2 and K3 , i.e Ki := {0, . . . , |Ki |} with i ∈ {1, 2, 3}.


                            0           1           2
                       0 1 2 3 4 5 0 1 2 3 4 5 0 1 2 3 4 5
                     0     ×           ××          ××
                     1   ××× ×       ××× × ×××× ×
                     2 ×××× × ××××××             ×××××
                     3 ×× × × ×××××× ××××××
                     4 ×× × × ×××××× ××××××
                     5     ×× × ×××××× ×××× ×
                            Table 2. Triadic context of hostels


     This triadic context has 18 triconcepts. To obtain the optimal factorization
of the triadic context we consider the optimal factorizations of every d-cut from
{c3b }b∈K3 . The factorization of the d-cuts are given by F(c11 ) = {({0, 1, 2, 5}, 2,
K3 ), ({2, 3, 4}, {0, 1, 3, 5}, {0, 1}), ({1, 2, 5}, {2, 3, 5}, K3 ), ({1, 2, 3, 4}, {1, 3, 5},
K3 )}, F(c12 ) = {(K1 , {2, 3}, {1, 2}), ({2, 3, 4, 5}, K2 , 1), ({1, 2, 3, 4, 5}, {1, 2, 3, 5},
{1, 2})} and F(c13 ) = {(K1 , {2, 3}, {1, 2}), ({2, 3, 4}, {1, 2, 3, 4, 5}, {1, 2}), ({1, 3, 4,
138     Cynthia Vera Glodeanu

5}, {0, 1, 2, 3, 5}, 2)}. Obviously F(c12 ) and F(c13 ) have a common triconcept. We
can reduce the number of factors even more, since the incidence covered by
the factor ({1, 2, 3, 4, 5}, {1, 2, 3, 5}, {1, 2}) ∈ F(c12 ) is already covered by the
factors (K1 , {2, 3}, {1, 2}) ∈ F(c12 ) and ({1, 2, 3, 4}, {1, 3, 5}, K3 ) ∈ F(c11 ). Algo-
rithm 1 avoids including these ’unnecessary’ triconcepts into the factorization.
The factorization of the triadic context is given by F = F(c11 )∪F(c      ˙   1 ˙      1
                                                                              2 )∪F(c3 ) \
{({1, 2, 3, 4, 5}, {1, 2, 3, 5}, {1, 2})}.
    The factors have also a verbal description, as for example the first one stand
for very safe since the users from all platforms have this point of view, the
hostels from the extent of ({2, 3, 4, 5}, K2 , 1) are the best deals according to
the users from the second platform. On the other hand all users agreed that
the hostels 1, 2, 5 are excellent concerning safety, location and fun and that the
hostels 1, 2, 3, 4 are excellent concerning character, location and fun.
    The factorization offers the possibility to describe the hostels through 8 fac-
tors while in the triadic context they are described through 6 attributes under
3 conditions.
    The 3d-product can be written in a similar way like in the above examples.
We omit the details due to lack of space.

5     Conclusion
We have presented a factorization method of triadic data such that the number
of factors is as small as possible.
    In this paper we have also illustrated the developed mathematical and algo-
rithmic techniques through a paradigmatic example.

References
 1. Belohlávek, R., Vychodil, V.: Discovery of optimal factors in binary data via a
    novel method of matrix decomposition. J. Comput. Syst. Sci. 76(1) (2010) 3–20
 2. Krolak-Schwerdt, S., Orlik, P., Ganter, B.: TRIPAT: a model for analyzing three-
    mode binary data. In Bock, H.H., Lenski, W., Richter, M.M., eds.: Studies in
    Classification, Data Analysis, and Knowledge Organization. Volume 4 of Informa-
    tion systems and data analysis. Springer, Berlin (1994) 298–307
 3. Belohlávek, R., Vychodil, V.: Optimal factorization of three-way binary data. IEEE
    Symposium on Foundations and Practice of Data Mining in GrC 2010 (2010) 61–66
 4. Bibsonomy: http://www.bibsonomy.org/
 5. Ganter, B., Wille, R.: Formale Begriffsanalyse: Mathematische Grundlagen. (1996)
 6. Lehmann, F., Wille, R.: A triadic approach to formal concept analysis. In Ellis,
    G., Levinson, R., Rich, W., Sowa, J.F., eds.: ICCS. Volume 954 of Lecture Notes
    in Computer Science., Springer (1995) 32–43
 7. Wille, R.: The basic theorem of triadic concept analysis. Order 12 (1995) 149–158
 8. Markowsky, G., Nau, D., Woodbury, M.A., Amos, D.: A mathematical analysis of
    human leukocyte antigen serology. Math. Biosciences 40 (1978)
 9. Hostelworld: http://www1.hostelworld.com
10. Hostels: http://www.hostels.com
11. HostelBookers.com Ltd.: http://www.hostelbookers.com