<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Markov Chain Approach to Random Generation of Formal Concepts</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Dmitry V. Vinogradov</string-name>
          <email>vin@viniti.ru</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>All-Russia Institute for Scientific and Technical Information (VINITI), Intelligent Information Systems Laboratory</institution>
          ,
          <addr-line>Moscow 125190</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Russian State University for Humanities, Intelligent Robotics Laboratory</institution>
          ,
          <addr-line>Moscow 125993</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
      </contrib-group>
      <fpage>127</fpage>
      <lpage>133</lpage>
      <abstract>
        <p>A Monte Carlo approach using Markov chains for the random generation of concepts of a finite context is proposed. An algorithm similar to CbO is used. We discuss three Markov chains: a non-monotonic, a monotonic, and a coupling one. The coupling algorithm terminates with probability 1. These algorithms can be used for solving various information retrieval tasks on large datasets.</p>
      </abstract>
      <kwd-group>
        <kwd>formal context</kwd>
        <kwd>formal concept</kwd>
        <kwd>Markov chain</kwd>
        <kwd>termination with probability 1</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>In our opinion, the main drawback of the 'old-fashioned' JSM method is the computational complexity of JSM algorithms, especially at the induction step. Paper [7] presents the results of a comparison between various (partially improved by the survey's authors) variants of well-known deterministic FCA algorithms. Paper [6] provides theoretical bounds on the computational complexity of various JSM tasks.</p>
      <p>The development of the JSM method has resulted in intelligent systems of JSM type that have been applied in various domains such as sociology, pharmacology, medicine, information retrieval, etc. In practice there have been situations where a JSM system generated more than 10,000 formal concepts (JSM similarities) from a context with about 100 objects. In our opinion the importance of all the generated concepts is doubtful, because when experts manually select important JSM causes they reject the majority of the generated JSM similarities.</p>
      <p>In this paper we propose Monte Carlo algorithms using the Markov chain approach for the random generation of concepts of a finite context. In other words, we replace the lattice of all concepts by a small number of its random elements.</p>
    </sec>
    <sec id="sec-2">
      <title>Background</title>
      <p>Here we recall some basic definitions and facts of Formal Concept Analysis (FCA) [3].</p>
      <p>A (finite) context is a triple (G, M, I), where G and M are finite sets and I ⊆ G × M. The elements of G and M are called objects and attributes, respectively. As usual, we write gIm instead of ⟨g, m⟩ ∈ I to denote that object g has attribute m.</p>
      <p>For A ⊆ G and B ⊆ M, define</p>
      <p>A′ = {m ∈ M | ∀g ∈ A (gIm)},</p>
      <p>B′ = {g ∈ G | ∀m ∈ B (gIm)},
so A′ is the set of attributes common to all the objects in A and B′ is the set of objects possessing all the attributes in B. The maps ( )′ : A ↦ A′ and ( )′ : B ↦ B′ are called derivation operators (polars) of the context (G, M, I).</p>
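      <p>As a concrete illustration, the two polars can be computed directly from the incidence relation. The following Python sketch is ours, not from the paper; the encoding of I as a set of object-attribute pairs and the helper names derive_attrs/derive_objs are assumptions made for illustration.</p>

```python
def derive_attrs(A, M, I):
    """A' = {m in M | gIm for all g in A}: attributes common to all objects in A."""
    return frozenset(m for m in M if all((g, m) in I for g in A))

def derive_objs(B, G, I):
    """B' = {g in G | gIm for all m in B}: objects possessing all attributes in B."""
    return frozenset(g for g in G if all((g, m) in I for m in B))

# Tiny context: object 1 has attributes {a, b}, object 2 has {a}.
G = {1, 2}
M = {"a", "b"}
I = {(1, "a"), (1, "b"), (2, "a")}
common = derive_attrs({1, 2}, M, I)  # attributes shared by both objects
```

      <p>Note that the empty set of objects derives to all of M, in agreement with the definition above.</p>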
      <p>A concept of the context (G, M, I) is defined to be a pair (A, B), where A ⊆ G, B ⊆ M, A′ = B, and B′ = A. The first component A of the concept (A, B) is called the extent of the concept, and the second component B is called its intent. The set of all concepts of the context (G, M, I) is denoted by B(G, M, I).</p>
      <p>Example 1 (Boolean cube with n atoms). Consider the context (G, M, I), where
G = {g_1, …, g_n}, M = {m_1, …, m_n}, and</p>
      <p>g_j I m_k ⇔ j ≠ k.
Then ({g_{j_1}, …, g_{j_k}})′ = M \ {m_{j_1}, …, m_{j_k}}, ({m_{j_1}, …, m_{j_k}})′ = G \ {g_{j_1}, …, g_{j_k}}, A″ = A for all A ⊆ G, and B″ = B for all B ⊆ M. Hence B(G, M, I) has the element ({g_{j_1}, …, g_{j_k}}, M \ {m_{j_1}, …, m_{j_k}}) for every {g_{j_1}, …, g_{j_k}} ⊆ G.
      </p>
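      <p>Example 1 can be verified mechanically for a small n. The sketch below is our own (the helper names aprime/bprime are assumptions); it builds the context g_j I m_k ⇔ j ≠ k for n = 3 and checks that every subset of G is a closed extent whose intent is its complement.</p>

```python
from itertools import combinations

n = 3
G = set(range(n))
M = set(range(n))
I = {(j, k) for j in G for k in M if j != k}  # g_j I m_k iff j != k

def aprime(A):
    """A' : attributes common to all objects in A."""
    return {m for m in M if all((g, m) in I for g in A)}

def bprime(B):
    """B' : objects possessing all attributes in B."""
    return {g for g in G if all((g, m) in I for m in B)}

# Every subset A of G is an extent: A'' = A and A' = M \ A.
subsets = [set(c) for r in range(n + 1) for c in combinations(sorted(G), r)]
assert all(bprime(aprime(A)) == A for A in subsets)
assert all(aprime(A) == M - A for A in subsets)
```

      <p>So for this context B(G, M, I) has 2^n elements, one per subset of G.</p>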
      <p>Let (G, M, I) be a context. For concepts (A_1, B_1) and (A_2, B_2) in B(G, M, I) we write (A_1, B_1) ≤ (A_2, B_2) if A_1 ⊆ A_2. The relation ≤ is a partial order on B(G, M, I).</p>
      <p>A subset A ⊆ G is the extent of some concept if and only if A″ = A, in which case the unique concept of which A is the extent is (A, A′). Similarly, a subset B of M is the intent of some concept if and only if B″ = B, and then the unique concept with intent B is (B′, B).</p>
      <p>It is easy to check that A_1 ⊆ A_2 implies A_2′ ⊆ A_1′, and for concepts (A_1, A_1′) and (A_2, A_2′) the reverse implication is valid too, because A_1 = A_1″ ⊆ A_2″ = A_2. Hence, for (A_1, B_1) and (A_2, B_2) in B(G, M, I) we have (A_1, B_1) ≤ (A_2, B_2) ⇔ A_1 ⊆ A_2 ⇔ B_2 ⊆ B_1.</p>
      <p>Proposition 1. [3] Let (G, M, I) be a context. Then (B(G, M, I), ≤) is a lattice with join and meet given by
(A_1, B_1) ∨ (A_2, B_2) = ((A_1 ∪ A_2)″, B_1 ∩ B_2),
(A_1, B_1) ∧ (A_2, B_2) = (A_1 ∩ A_2, (B_1 ∪ B_2)″).</p>
      <p>Corollary 1. For a context (G, M, I) the lattice (B(G, M, I), ≤) has (M′, M) as the bottom element and (G, G′) as the top element. In other words, for all (A, B) ∈ B(G, M, I) the following inequalities hold: (M′, M) ≤ (A, B) ≤ (G, G′).</p>
      <p>With the help of the expressions for the infimum and the supremum operations in B(G, M, I) given by Proposition 1 we can introduce the local steps of our Markov chains.</p>
      <p>Definition 1. For (A, B) ∈ B(G, M, I), g ∈ G, and m ∈ M define
CbO((A, B), g) = ((A ∪ {g})″, B ∩ {g}′),
CbO((A, B), m) = (A ∩ {m}′, (B ∪ {m})″),
so CbO((A, B), g) is equal to (A, B) ∨ ({g}″, {g}′) and CbO((A, B), m) is equal to (A, B) ∧ ({m}′, {m}″).</p>
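      <p>Definition 1 translates directly into code: since (A ∪ {g})′ = B ∩ {g}′ for a concept (A, B), each CbO step needs only one closure computation. A minimal Python sketch (ours, not from the paper; the helper names are assumptions):</p>

```python
def aprime(A, M, I):
    """Derivation operator on object sets."""
    return frozenset(m for m in M if all((g, m) in I for g in A))

def bprime(B, G, I):
    """Derivation operator on attribute sets."""
    return frozenset(g for g in G if all((g, m) in I for m in B))

def cbo_obj(A, B, g, G, M, I):
    """CbO((A, B), g): join of (A, B) with the object concept of g."""
    new_B = B & aprime({g}, M, I)      # B ∩ {g}'
    return bprime(new_B, G, I), new_B  # ((A ∪ {g})'', B ∩ {g}')

def cbo_attr(A, B, m, G, M, I):
    """CbO((A, B), m): meet of (A, B) with the attribute concept of m."""
    new_A = A & bprime({m}, G, I)      # A ∩ {m}'
    return new_A, aprime(new_A, M, I)  # (A ∩ {m}', (B ∪ {m})'')

# Boolean-cube context of Example 1 with n = 2.
G, M = {0, 1}, {0, 1}
I = {(g, m) for g in G for m in M if g != m}
bottom = (bprime(M, G, I), frozenset(M))  # bottom concept (M', M)
c = cbo_obj(*bottom, 0, G, M, I)          # climb to the object concept of 0
```

      <p>Applying cbo_obj again with the same object returns c unchanged, as Lemma 2 below states for the case g ∈ A.</p>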
      <p>We call these operations CbO because the first one is used in the Close-by-One (CbO) algorithm to generate all the elements of B(G, M, I); see [4] for details.</p>
      <p>Lemma 2. Assume that (G, M, I) is a context and let (A, B) ∈ B(G, M, I), g ∈ G, and m ∈ M. Then</p>
      <p>g ∈ A ⇒ CbO((A, B), g) = (A, B),
m ∈ B ⇒ CbO((A, B), m) = (A, B),</p>
      <p>g ∉ A ⇒ (A, B) &lt; CbO((A, B), g),
m ∉ B ⇒ CbO((A, B), m) &lt; (A, B).
Proof. If g ∉ A, then A ⊊ A ∪ {g} ⊆ (A ∪ {g})″ by extensivity of the operator ( )″. By the definition of CbO((A, B), g) and of the order between concepts, this strict inclusion implies (A, B) &lt; CbO((A, B), g). The relation for m ∉ B is proved in the same way; the rest is obvious.
      </p>
      <p>Lemma 3. Assume that (G, M, I) is a context and let (A_1, B_1), (A_2, B_2) ∈ B(G, M, I), g ∈ G, and m ∈ M. Then
(A_1, B_1) ≤ (A_2, B_2) ⇒ CbO((A_1, B_1), g) ≤ CbO((A_2, B_2), g),
(A_1, B_1) ≤ (A_2, B_2) ⇒ CbO((A_1, B_1), m) ≤ CbO((A_2, B_2), m).
Proof. If A_1 ⊆ A_2, then A_1 ∪ {g} ⊆ A_2 ∪ {g}. Hence, by antitonicity of the derivation operators, (A_2 ∪ {g})′ ⊆ (A_1 ∪ {g})′, and applying antitonicity once more gives (A_1 ∪ {g})″ ⊆ (A_2 ∪ {g})″. By the definition of the order between concepts this is the first claim. The second claim is proved in the same way.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Markov Chain Algorithms</title>
      <p>We now present Markov chain algorithms for the random generation of formal concepts.</p>
      <p>Data: context (G, M, I), external function CbO( , )
Result: random concept (A, B) ∈ B(G, M, I)
t := 0; (A, B) := (M′, M);
while (t &lt; T) do
select random element x ∈ (G \ A) ⊔ (M \ B);
(A, B) := CbO((A, B), x);
t := t + 1;
end</p>
      <p>Algorithm 1: Non-monotonic Markov chain</p>
      <p>Example 2 (Random walk on a Boolean cube). Consider the context (G, M, I) of Example 1. Then the non-monotonic Markov chain corresponds to a random walk on the Boolean cube of all the concepts in B(G, M, I).</p>
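      <p>Algorithm 1 can be transcribed into Python almost literally. The sketch below is our own (the function names and the pair-set encoding of I are assumptions); each step chooses x uniformly from the disjoint union of G minus A and M minus B.</p>

```python
import random

def aprime(A, M, I):
    return frozenset(m for m in M if all((g, m) in I for g in A))

def bprime(B, G, I):
    return frozenset(g for g in G if all((g, m) in I for m in B))

def cbo(state, kind, x, G, M, I):
    """One CbO step: join with an object concept or meet with an attribute concept."""
    A, B = state
    if kind == "g":
        new_B = B & aprime({x}, M, I)
        return bprime(new_B, G, I), new_B
    new_A = A & bprime({x}, G, I)
    return new_A, aprime(new_A, M, I)

def nonmonotonic_chain(G, M, I, T, rng=random):
    """Algorithm 1: T random CbO steps from the bottom concept (M', M)."""
    A, B = bprime(M, G, I), frozenset(M)
    for _ in range(T):
        candidates = [("g", g) for g in G - A] + [("m", m) for m in M - B]
        if not candidates:  # only possible in a degenerate one-concept lattice
            break
        kind, x = rng.choice(candidates)
        A, B = cbo((A, B), kind, x, G, M, I)
    return A, B

# Demo on the Boolean-cube context of Example 1 with n = 2.
G, M = {0, 1}, {0, 1}
I = {(g, m) for g in G for m in M if g != m}
A, B = nonmonotonic_chain(G, M, I, T=5, rng=random.Random(42))
```

      <p>Whatever the random choices, the state stays a formal concept, since each CbO step produces a closed pair.</p>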
      <p>Definition 2. An (order) ideal of a partially ordered set (poset) (S, ≤) is a subset J of S such that
∀s ∈ S ∀r ∈ J [s ≤ r ⇒ s ∈ J].
A Markov chain S_t with values in a poset (S, ≤) is called monotonic if for every pair of start states a ≤ b (a, b ∈ S) and every order ideal J ⊆ S
P[S_1 ∈ J | S_0 = a] ≥ P[S_1 ∈ J | S_0 = b].</p>
      <p>Proposition 2. There exists a context (G, M, I) such that the non-monotonic Markov chain for (B(G, M, I), ≤) is not monotonic.</p>
      <p>See [8] for the proof of Proposition 2. The following Markov chain is always monotonic.</p>
      <p>Data: context (G, M, I), external function CbO( , )
Result: random concept (A_T, B_T) ∈ B(G, M, I)
t := 0; X := G ⊔ M; (A, B) := (M′, M);
while (t &lt; T) do
select random element x ∈ X;
(A, B) := CbO((A, B), x);
t := t + 1;
end</p>
      <p>Algorithm 2: Monotonic Markov chain</p>
      <p>Proposition 3. For every context (G, M, I) the monotonic Markov chain for (B(G, M, I), ≤) is monotonic.</p>
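      <p>The monotonic variant differs from Algorithm 1 only in drawing x from the static set X = G ⊔ M, so steps that leave the current concept fixed are allowed. A self-contained sketch (ours, with the same assumed encoding and helper names):</p>

```python
import random

def aprime(A, M, I):
    return frozenset(m for m in M if all((g, m) in I for g in A))

def bprime(B, G, I):
    return frozenset(g for g in G if all((g, m) in I for m in B))

def monotonic_chain(G, M, I, T, rng=random):
    """Algorithm 2: T random CbO steps, x drawn from the static set X = G ⊔ M."""
    X = [("g", g) for g in G] + [("m", m) for m in M]  # tagged disjoint union
    A, B = bprime(M, G, I), frozenset(M)               # bottom concept (M', M)
    for _ in range(T):
        kind, x = rng.choice(X)
        if kind == "g":   # join with the object concept of x
            B = B & aprime({x}, M, I)
            A = bprime(B, G, I)
        else:             # meet with the attribute concept of x
            A = A & bprime({x}, G, I)
            B = aprime(A, M, I)
    return A, B

# Demo on the Boolean-cube context of Example 1 with n = 2.
G, M = {0, 1}, {0, 1}
I = {(g, m) for g in G for m in M if g != m}
A, B = monotonic_chain(G, M, I, T=6, rng=random.Random(7))
```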
      <p>The proof of Proposition 3 can be found in [8]. The monotonic Markov chain algorithm has another advantage: the random selection of elements from the static set X := G ⊔ M. However, both previous algorithms share a common drawback: the value T for the termination time of the computation is unknown. In Markov chain Monte Carlo (MCMC) theory this value corresponds to the mixing time of the chain. For some special Markov chains (for instance, the chain of Example 2) the mixing time can be estimated by sophisticated methods; in the general case it is an open problem. The following algorithm does not have this problem at all.</p>
      <p>Data: context (G, M, I), external function CbO( , )
Result: random concept (A, B) ∈ B(G, M, I)
X := G ⊔ M; (A, B) := (M′, M); (C, D) := (G, G′);
while ((A ≠ C) ∨ (B ≠ D)) do
select random element x ∈ X;
(A, B) := CbO((A, B), x); (C, D) := CbO((C, D), x);
end</p>
      <p>Algorithm 3: Coupling Markov chain</p>
      <p>The intermediate value of the quadruple (A, B) ≤ (C, D) at step t corresponds to the Markov chain state Y_t; A_t, B_t, C_t, and D_t are its first, second, third, and fourth components, respectively.</p>
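      <p>Algorithm 3 runs the lower chain from (M′, M) and the upper chain from (G, G′) on the same random choices until they meet, so no horizon T is needed. A runnable sketch (our own transcription, with the same assumed helpers):</p>

```python
import random

def aprime(A, M, I):
    return frozenset(m for m in M if all((g, m) in I for g in A))

def bprime(B, G, I):
    return frozenset(g for g in G if all((g, m) in I for m in B))

def cbo(state, kind, x, G, M, I):
    A, B = state
    if kind == "g":
        new_B = B & aprime({x}, M, I)
        return bprime(new_B, G, I), new_B
    new_A = A & bprime({x}, G, I)
    return new_A, aprime(new_A, M, I)

def coupling_chain(G, M, I, rng=random):
    """Algorithm 3: both chains receive identical random CbO steps until A = C and B = D."""
    X = [("g", g) for g in G] + [("m", m) for m in M]
    lower = (bprime(M, G, I), frozenset(M))  # bottom concept (M', M)
    upper = (frozenset(G), aprime(G, M, I))  # top concept (G, G')
    while lower != upper:
        kind, x = rng.choice(X)
        lower = cbo(lower, kind, x, G, M, I)
        upper = cbo(upper, kind, x, G, M, I)
    return lower

# Demo on the Boolean-cube context of Example 1 with n = 2.
G, M = {0, 1}, {0, 1}
I = {(g, m) for g in G for m in M if g != m}
A, B = coupling_chain(G, M, I, rng=random.Random(3))
```

      <p>By Lemma 3 the two chains stay ordered throughout the run, so equality of all four components is exactly the coupling event of Theorem 1.</p>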
      <p>Definition 3. The coupling length for a context (G, M, I) is defined by L = min(|G|, |M|). The probability of choosing a fixed object or attribute of the context (G, M, I) is equal to p = 1/(|G| + |M|).</p>
      <p>Lemma 4. If |G| &lt; |M|, then for every integer r and every pair of start states (A, B) ≤ (C, D) ((A, B), (C, D) ∈ B(G, M, I))
P[A_r = A_{r+L} = C_{r+L} and B_r = B_{r+L} = D_{r+L} | Y_r = (A, B) ≤ (C, D)] ≥ p^L.
If |G| ≥ |M|, then for every integer r and every pair of start states (A, B) ≤ (C, D) ((A, B), (C, D) ∈ B(G, M, I))
P[A_{r+L} = C_{r+L} = C_r and B_{r+L} = D_{r+L} = D_r | Y_r = (A, B) ≤ (C, D)] ≥ p^L.</p>
      <p>Proof. For coupling to the state (A, B) ≤ (A, B) it suffices to select, at each step on which the chains still differ, an attribute m ∈ B \ D: by Lemma 2 such a choice leaves (A, B) fixed, and since m ∉ C′ it strictly shrinks the extent of the upper concept, so at most |C \ A| ≤ |G| = L such selections (each of probability p) are needed. The second relation is proved in a similar way, by raising the lower chain with objects g ∈ C \ A; here each selection strictly shrinks the intent of the lower concept, so at most |B \ D| ≤ |M| = L selections are needed. □</p>
      <p>Theorem 1. For the coupling Markov chain, the probability of coupling (termination) within n steps tends to 1 as n → ∞.</p>
      <p>Proof. Lemma 3 implies that (A_{(k−1)·L}, B_{(k−1)·L}) ≤ (C_{(k−1)·L}, D_{(k−1)·L}). Let r = (k−1)·L. Then Lemma 4 implies that P[A_{k·L} ≠ C_{k·L} ∨ B_{k·L} ≠ D_{k·L} | Y_r = (A_r, B_r) ≤ (C_r, D_r)] ≤ 1 − p^L. After k independent repetitions we have P[A_{k·L} ≠ C_{k·L} ∨ B_{k·L} ≠ D_{k·L} | Y_0 = (M′, M) ≤ (G, G′)] ≤ (1 − p^L)^k. But (1 − p^L)^k → 0 as k → ∞. □</p>
    </sec>
    <sec id="sec-4">
      <title>Conclusions</title>
      <p>In this paper we have described a Monte Carlo approach using Markov chains for the random generation of concepts of a finite context. The basic steps of the proposed Markov chains are similar to those of the CbO algorithm. We discussed three Markov chains: a non-monotonic, a monotonic, and a coupling one. The coupling algorithm terminates with probability 1.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgements</title>
      <p>The author would like to thank Prof. Victor K. Finn and Tatyana A. Volkova
for helpful discussions. The research was supported by Russian Foundation for
Basic Research (project 11-07-00618a) and Presidium of the Russian Academy
of Science (Fundamental Research Program 2012-2013).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>V.K.</given-names>
            <surname>Finn</surname>
          </string-name>
          ,
          <article-title>About Machine-Oriented Formalization of Plausible Reasonings in F. Bacon-J.S. Mill Style</article-title>
          .
          <source>Semiotika i Informatika</source>
          ,
          <volume>20</volume>
          ,
          <fpage>35</fpage>
          -
          <lpage>101</lpage>
          (
          <year>1983</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>V.K.</given-names>
            <surname>Finn</surname>
          </string-name>
          ,
          <article-title>The Synthesis of Cognitive Procedures and the Problem of Induction</article-title>
          . <source>Autom. Doc. Math. Linguist.</source>,
          <volume>43</volume>
          ,
          <fpage>149</fpage>
          -
          <lpage>195</lpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>B.</given-names>
            <surname>Ganter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Wille</surname>
          </string-name>
          ,
          <source>Formal Concept Analysis: Mathematical Foundations</source>, Springer-Verlag
          ,
          <year>1999</year>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>S.O.</given-names>
            <surname>Kuznetsov</surname>
          </string-name>
          ,
          <article-title>A Fast Algorithm for Computing All Intersections of Objects in a Finite Semi-Lattice</article-title>
          ,
          <source>Automatic Documentation and Mathematical Linguistics</source>
          , vol.
          <volume>27</volume>
          , no.
          <issue>5</issue>
          ,
          <fpage>11</fpage>
          -
          <lpage>21</lpage>
          ,
          <year>1993</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>S.O.</given-names>
            <surname>Kuznetsov</surname>
          </string-name>
          ,
          <article-title>Mathematical aspects of concept analysis</article-title>
          .
          <source>Journal of Mathematical Sciences</source>
          , Vol.
          <volume>80</volume>
          , Issue 2, pp.
          <fpage>1654</fpage>
          -
          <lpage>1698</lpage>
          ,
          <year>1996</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>S.O.</given-names>
            <surname>Kuznetsov</surname>
          </string-name>
          ,
          <article-title>Complexity of Learning in Concept Lattices from Positive and Negative Examples</article-title>
          .
          <source>Discrete Applied Mathematics</source>
          ,
          <year>2004</year>
          , no.
          <issue>142</issue>
          (
          <issue>1-3</issue>
          ), pp.
          <fpage>111</fpage>
          -
          <lpage>125</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>S.O.</given-names>
            <surname>Kuznetsov</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.A.</given-names>
            <surname>Obiedkov</surname>
          </string-name>
          ,
          <article-title>Comparing Performance of Algorithms for Generating Concept Lattices</article-title>
          .
          <source>Journal of Experimental and Theoretical Artificial Intelligence</source>
          , Vol.
          <volume>14</volume>
          , no.
          <issue>2-3</issue>
          , pp.
          <fpage>189</fpage>
          -
          <lpage>216</lpage>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>D.V.</given-names>
            <surname>Vinogradov</surname>
          </string-name>
          ,
          <article-title>Random generation of hypotheses in the JSM method using simple Markov chains</article-title>
          .
          <source>Autom. Doc. Math. Linguist.</source>
          ,
          <volume>46</volume>
          ,
          <fpage>221</fpage>
          -
          <lpage>228</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>