<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Accidental formal concepts in the presence of counterexamples</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Dmitry V. Vinogradov</string-name>
          <email>vinogradov.d.w@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Federal Research Center for Computer Science and Control, Russian Academy of Science</institution>
          ,
          <addr-line>Moscow 119333</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Russian State University for Humanities, Intelligent Robotics Laboratory</institution>
          ,
          <addr-line>Moscow 125993</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>An accidental formal concept corresponds to a subset of attributes that accidentally appear together in several objects (of the concept extent) while every such object belongs to a different "real" formal concept extent. There are two standard techniques to forbid such concepts: counterexamples to exclude the concepts involved, and a lower bound on the size of the extent. Both are insufficient. Here we define a random formal context with attributes (different from elements of "real" formal concepts) generated by independent Bernoulli variables. The main result has asymptotic form: if the number n of the random attributes tends to infinity, the success probability equals p = √(a/n), and there are m = b√n counterexamples, then the probability of an accidental formal concept with a 2-element extent is 1 − e^{−a} − a·e^{−a}·[1 − e^{−b√a}] in the limit.</p>
      </abstract>
      <kwd-group>
        <kwd>formal context</kwd>
        <kwd>formal concept</kwd>
        <kwd>Bernoulli variable</kwd>
        <kwd>counterexample</kwd>
        <kwd>overfitting</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Let a set of damaged planes be described by the set</p>
      <p>F = {f1 = empennage damage, f2 = motor damage, f3 = curse scratch}
of attributes by means of a formal context.</p>
    </sec>
    <sec id="sec-2">
      <title>Formal concepts "empennage damage is a cause" and</title>
      <p>"motor damage is a cause" are plausible ones, but</p>
      <p>⟨{o2, o3}, {f3}⟩,
i.e. "curse scratch is a cause of damage", is doubtful. The last one occurs because the
attribute f3 accidentally appears in the 2 training objects o2 and o3, which have different
"real" causes.</p>
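The doubtful concept above can be reproduced mechanically. The following sketch (a minimal illustration, not from the paper) enumerates all formal concepts of a small context via the standard derivation operators; since the original context table is not reproduced here, the object descriptions o1 = {f1}, o2 = {f1, f3}, o3 = {f2, f3} are an assumed incidence consistent with the running example.

```python
from itertools import combinations

# Hypothetical incidence relation, assumed for illustration only:
# o2 and o3 share the accidental attribute f3 while having
# different "real" causes f1 and f2.
context = {
    "o1": {"f1"},
    "o2": {"f1", "f3"},
    "o3": {"f2", "f3"},
}
attributes = {"f1", "f2", "f3"}

def extent(B):
    # B' : objects possessing all attributes in B
    return {g for g, atts in context.items() if B <= atts}

def intent(A):
    # A' : attributes common to all objects in A
    common = set(attributes)
    for g in A:
        common &= context[g]
    return common

# A pair (A, B) is a formal concept iff A' = B and B' = A.
concepts = set()
for r in range(len(context) + 1):
    for A in combinations(sorted(context), r):
        A = set(A)
        B = intent(A)
        if extent(B) == A:
            concepts.add((frozenset(A), frozenset(B)))

for A, B in sorted(concepts, key=lambda c: sorted(c[0])):
    print(sorted(A), sorted(B))
```

The accidental concept ⟨{o2, o3}, {f3}⟩ appears in the output alongside the plausible ones.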
      <p>Of course, the separation of formal concepts into "accidental" and "real" ones
heavily depends on experts' intuition about the research domain. In the previous
example we know enough to do this ourselves. However, in more complex domains
there is no information to reject "accidental" concepts automatically.</p>
      <p>Counterexamples give a means to delete such "accidental" formal concepts by
the so-called "counterexample forbidding condition" procedure. A
counterexample is an object (described by combinations of the same attributes as the training
objects) without the target property. The target property (or goal attribute) is a
binary attribute of the objects. We suppose that a formal concept contains training
objects with the target property and that there exists an additional list of
counterexamples without this property. This assumption converts FCA into a machine
learning method (see, for example, [3]). In the previous example "to be
damaged" was the target property of the training examples.</p>
      <p>We develop a probabilistic model for the (restriction of a) formal context (to the
subset of unessential attributes). An attribute is essential if it occurs as part of the
intent of some "real" formal concept. The goal is to estimate the probability that a
subset of unessential attributes appears as a formal concept intent
without inclusion into some counterexample. A counterexample intent (or description)
corresponds to a random subset of unessential attributes. Inclusion of the intent
of a formal concept into some counterexample intent is suspicious because the
cause (this formal concept) is present while the target property is absent.</p>
      <p>The main result of the paper states the insufficiency of the counterexample
forbidding condition. If the number n of the random (unessential) attributes tends to
infinity, the success probability equals p = √(a/n), and there are m = c√n
counterexamples, then the probability of an accidental formal concept with a 2-element
extent and without any counterexample is 1 − e^{−a} − a·e^{−a}·[1 − e^{−c√a}]
in the limit.</p>
      <p>It is clear that the last expression is greater than 0. This means that there is a
positive probability of the appearance of a formal concept whose intent consists only of
inessential attributes and which is not included into any counterexample.
"Accidental" formal concepts correspond to the overfitting phenomenon: on one hand
we try to use all available information, while on the other hand "accidental" formal
concepts do not correspond to any "real" cause and hence worsen the prediction of the target
property on test examples.</p>
      <p>Basic definitions and facts of FCA.
Here we recall some basic definitions and facts of Formal Concept Analysis
(FCA) [2].</p>
      <p>A (finite) context is a triple (G, M, I) where G and M are finite sets and
I ⊆ G × M. The elements of G and M are called objects and attributes,
respectively. As usual, we write gIm instead of ⟨g, m⟩ ∈ I to denote that object
g has attribute m.</p>
      <p>For A ⊆ G and B ⊆ M, define</p>
      <p>A′ = {m ∈ M | ∀g ∈ A (gIm)},</p>
      <p>B′ = {g ∈ G | ∀m ∈ B (gIm)},
so A′ is the set of attributes common to all the objects in A and B′ is the set of
objects possessing all the attributes in B. The maps (·)′ : A ↦ A′ and (·)′ : B ↦
B′ are called derivation operators (polars) of the context (G, M, I).</p>
      <p>A concept of the context (G, M, I) is defined to be a pair (A, B), where
A ⊆ G, B ⊆ M, A′ = B, and B′ = A. The first component A of the concept
(A, B) is called the extent of the concept, and the second component B is
called its intent. The set of all concepts of the context (G, M, I) is denoted by
B(G, M, I).</p>
      <p>Let (G, M, I) be a context. For concepts (A1, B1) and (A2, B2) in B(G, M, I)
we write (A1, B1) ≤ (A2, B2) if A1 ⊆ A2. The relation ≤ is a partial order on
B(G, M, I).</p>
      <p>Fix a context (G, M, I). In the following, let J be an index set. We assume
that Aj ⊆ G and Bj ⊆ M for all j ∈ J.</p>
      <p>Lemma 1. [2] Assume that (G, M, I) is a context, let A ⊆ G and B ⊆ M, and let
Aj ⊆ G and Bj ⊆ M for all j ∈ J. Then</p>
      <p>A ⊆ A″ and B ⊆ B″, (1)</p>
      <p>A1 ⊆ A2 ⇒ A′1 ⊇ A′2 and B1 ⊆ B2 ⇒ B′1 ⊇ B′2. (2)</p>
      <p>An attribute is called essential if it occurs in some "real" cause. Assume that
any 2 "real" causes have no common attributes. Other attributes are called
accompanying.</p>
      <p>Essential attributes are absent from counterexamples, and the essential attributes
of training objects correspond to some "real" cause in such a way that each of the
2 "real" causes appears in some training object.</p>
      <p>Denote the number of training objects by k, the number of counterexamples
by m, and the number of accompanying attributes by n. It is clear that the
accompanying attributes of the training objects and counterexamples form a (k + m) × n
binary matrix. It contains N = (k + m)·n bits. The accompanying attributes are
generated by a Bernoulli series of N tests.</p>
      <p>Definition 1. A Bernoulli series of N tests is the probability distribution on {0, 1}^N
with</p>
      <p>P(x1 = ε1, …, xN = εN) = ∏_{j=1}^{N} p^{εj}·(1 − p)^{1−εj},
where 0 &lt; p &lt; 1. The number p &gt; 0 is called the success probability (of xj = 1 in test j).</p>
      <p>Lemma 2. An accidental formal concept with an extent of cardinality k can be replaced
by a Bernoulli series of n tests ⟨a1, …, an⟩ with success probability p^k of aj = 1,
appended to the string of 0s corresponding to the essential attributes.</p>
      <p>Proposition 2. The probability of an accidental formal concept with an extent of
size k exceeds ε &gt; 0 if the success probability p ≥ (−ln(1 − ε)/n)^{1/k}. □</p>
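Definition 1 can be illustrated by a small simulation. The sketch below (the function name and parameter values are illustrative, not from the paper) estimates by the Monte Carlo method the probability that at least one of the n accompanying attributes occurs in all k training objects, i.e. that an accidental concept with a k-element extent appears; under the Bernoulli model this probability equals 1 − (1 − p^k)^n, in line with Proposition 2.

```python
import random

# Monte Carlo illustration of the Bernoulli model (cf. Lemma 2 and
# Proposition 2): an accidental concept with a k-element extent appears
# iff at least one of the n accompanying attributes occurs in all k
# training objects; each such event has probability p^k, so the total
# probability is 1 - (1 - p^k)^n.
def accidental_concept_prob(k, n, p, trials=50_000, seed=1):
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # attribute j is shared by all k objects with probability p^k
        if any(all(rng.random() < p for _ in range(k)) for _ in range(n)):
            hits += 1
    return hits / trials

k, n, p = 2, 100, 0.1
estimate = accidental_concept_prob(k, n, p)
exact = 1 - (1 - p**k) ** n
print(estimate, exact)  # both close to 0.634
```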
      <p>In the analysis of the counterexample forbidding rejection procedure we restrict
ourselves to k = 2.</p>
      <p>Lemma 3. The probability of the appearance of an accidental formal concept (with a
2-element extent) surviving m random counterexamples is</p>
      <p>∑_{l=0}^{n} C(n, l)·(p²)^l·(1 − p²)^{n−l}·(1 − p^l)^m = ∑_{j=0}^{m} C(m, j)·(−1)^j·(1 − p² + p^{2+j})^n,</p>
      <p>since p^l is the probability of deleting this formal concept by some fixed
counterexample, hence (1 − p^l)^m is the probability that no counterexample (of the m)
deletes the formal concept, and C(n, l)·(p²)^l·(1 − p²)^{n−l} is the probability
of the appearance of an accidental formal concept with an intent of cardinality l;
the second form follows by expanding (1 − p^l)^m by the binomial formula and
exchanging the sums.</p>
      <p>Also the proof of Lemma 3 can be performed by direct application of the
"inclusion-exclusion" principle.</p>
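The two forms of the sum in Lemma 3 can be verified numerically for small n and m (a quick consistency check; the function names are illustrative, not from the paper):

```python
from math import comb

# Numerical consistency check of the two forms in Lemma 3: the sum over
# intent cardinalities l equals the inclusion-exclusion sum over j.
def lemma3_lhs(n, m, p):
    return sum(comb(n, l) * (p**2) ** l * (1 - p**2) ** (n - l) * (1 - p**l) ** m
               for l in range(n + 1))

def lemma3_rhs(n, m, p):
    return sum(comb(m, j) * (-1) ** j * (1 - p**2 + p ** (2 + j)) ** n
               for j in range(m + 1))

print(lemma3_lhs(8, 5, 0.3), lemma3_rhs(8, 5, 0.3))  # equal up to rounding
```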
      <sec id="sec-2-1">
        <title>Main result</title>
        <p>Lemma 4. For any constant c the following holds:</p>
        <p>∑_{j=0}^{m} C(m, j)·(−1)^j·c = 0.</p>
        <p>This is proved by Newton's binomial formula. □</p>
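Lemma 4 is easy to check numerically (an illustrative sketch; the function name is ours):

```python
from math import comb

# Sanity check of Lemma 4: sum_{j=0}^{m} C(m, j) (-1)^j c = 0, because
# sum_j C(m, j) (-1)^j = (1 - 1)^m = 0 by Newton's binomial formula.
def alternating_binomial_sum(m, c):
    return sum(comb(m, j) * (-1) ** j * c for j in range(m + 1))

print(alternating_binomial_sum(7, 3.5))  # → 0.0
```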
        <p>Lemma 5. For j &gt; 0 the formula</p>
        <p>(1 − a/n + (a/n)^{1+j/2})^n − (1 − a/n)^n = a·e^{−a}·(a/n)^{j/2} + o(n^{−j/2})</p>
        <p>holds as n → ∞.</p>
        <p>Proof. Write (1 − a/n + (a/n)^{1+j/2})^n = (1 − a/n)^n·(1 + (a/n)^{1+j/2}/(1 − a/n))^n.
Since n·(a/n)^{1+j/2} = a^{1+j/2}·n^{−j/2} → 0 and (1 − a/n)^n → e^{−a}, the difference
equals a^{1+j/2}·e^{−a}·n^{−j/2} + o(n^{−j/2}) = a·e^{−a}·(a/n)^{j/2} + o(n^{−j/2}). □</p>
        <p>Lemma 6. For m = b√n and every fixed j the formula
C(m, j)·(a/n)^{j/2} = ((b√a)^j/j!)·(1 + o(1))
holds as n → ∞.</p>
        <p>Theorem 1. If the number n of the random attributes tends to infinity, the
probability of success equals p = √(a/n), and there are m = b√n counterexamples,
then the probability of an accidental formal concept with a 2-element extent and
without any counterexample is 1 − e^{−a} − a·e^{−a}·[1 − e^{−b√a}] in the limit.</p>
        <p>Proof. Lemma 3 reduces the theorem to the estimation of</p>
        <p>∑_{j=0}^{m} C(m, j)·(−1)^j·(1 − p² + p^{2+j})^n.</p>
        <p>Substitute p = √(a/n); then each summand becomes
C(m, j)·(−1)^j·(1 − a/n + (a/n)^{1+j/2})^n, and Lemma 4 (applied with the constant
c = (1 − a/n)^n) implies</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Then</title>
      <p>∑_{j=0}^{m} C(m, j)·(−1)^j·(1 − a/n + (a/n)^{1+j/2})^n =
1 − (1 − a/n)^n + ∑_{j=1}^{m} C(m, j)·(−1)^j·[a·e^{−a}·(a/n)^{j/2} + o(n^{−j/2})]</p>
      <p>by Lemma 5 (the j = 0 summand equals 1). Since m = b√n, the last sum equals</p>
      <p>a·e^{−a}·(∑_{j=1}^{m} (−b√a)^j/j!)·(1 + o(1)) = a·e^{−a}·[e^{−b√a} − 1]·(1 + o(1)),</p>
      <p>which, together with (1 − a/n)^n → e^{−a}, finishes the proof, since the last
equation holds by Lemma 6. □</p>
      <p>To prove that</p>
      <p>1 − e^{−a} − a·e^{−a}·[1 − e^{−b√a}] &gt; 0</p>
      <p>for any a &gt; 0 and b &gt; 0, use the following reasoning:</p>
      <p>1 − e^{−a} − a·e^{−a}·[1 − e^{−b√a}] ≥ 1 − e^{−a} − a·e^{−a} = e^{−a}·(e^a − 1 − a) ≥ e^{−a}·a²/2! &gt; 0.</p>
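The rate of convergence to the limit of Theorem 1 can be inspected numerically. The sketch below (the function names are ours, and a = b = 1 is an arbitrary illustrative choice) evaluates the finite-n probability of Lemma 3 summed over intent cardinalities l, with p = √(a/n) and m = b√n, and compares it with the limit 1 − e^{−a} − a·e^{−a}·[1 − e^{−b√a}].

```python
from math import exp, lgamma, log, sqrt

# Finite-n probability from Lemma 3 (sum over intent cardinalities l),
# with p = sqrt(a/n) and m = round(b*sqrt(n)); the binomial probability
# of l shared attributes is evaluated in log space to avoid overflow.
def survival_prob(n, a, b):
    p = sqrt(a / n)
    m = round(b * sqrt(n))
    total = 0.0
    for l in range(1, n + 1):
        log_pmf = (lgamma(n + 1) - lgamma(l + 1) - lgamma(n - l + 1)
                   + l * log(p * p) + (n - l) * log(1 - p * p))
        total += exp(log_pmf) * (1 - p**l) ** m
    return total

# Limit from Theorem 1: 1 - e^{-a} - a e^{-a} [1 - e^{-b sqrt(a)}]
def limit_prob(a, b):
    return 1 - exp(-a) - a * exp(-a) * (1 - exp(-b * sqrt(a)))

a, b = 1.0, 1.0
print(survival_prob(10_000, a, b), limit_prob(a, b))  # close for large n
```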
      <p>Our model of random counterexamples is incorrect if the number m of
counterexamples is comparable with 2^n, since in this case many pairs of
counterexamples coincide. But this situation is also impractical from the data analyst's point
of view: when real data are coded by binary attributes, the majority of attributes
are absent, hence the formal context has a small fraction of 1s.</p>
      <sec id="sec-3-1">
        <title>Conclusions</title>
        <p>In this paper we demonstrate that Formal Concept Analysis has the overfitting
shortcoming similar to other machine learning methods. The JSM method [1] has
the same drawback. The author [5] developed a probabilistic approach to JSM,
FCA, and related formalisms. The inspiration for this study was the overfitting
phenomenon of classical methods.</p>
        <p>Recently Tatiana Makhalova and Prof. Sergey O. Kuznetsov [4] developed an
interesting approach to comparing the overfitting phenomena of different
machine learning algorithms by means of FCA. They suppose that a formal context
contains the errors of prediction of the target property of objects by different
algorithms (the attributes). Hence Formal Concept Analysis is a means to estimate the
overfitting of algorithms when the set of all objects is randomly separated into two
parts (one to learn and one to test). There are some theoretical results based on the "weak
probability axiomatization" proposed by Prof. Konstantin V. Vorontsov [6].</p>
        <p>In the present paper we discuss the overfitting of FCA itself, considered as a
machine learning method. We regard the appearance of "accidental" formal concepts
as overfitting because it occurs during the computation of all formal concepts (in an
attempt to use all available information while ignoring the possibility of "accidental"
coincidence between unessential attributes).</p>
        <p>Acknowledgements. The author would like to thank Prof. Victor K. Finn and
his colleagues at Federal Research Center for Computer Science and Control for
support and helpful discussions.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Finn</surname>
            ,
            <given-names>V.K.</given-names>
          </string-name>
          <article-title>The Synthesis of Cognitive Procedures and the Problem of Induction</article-title>
          .
          <source>Automatic Documentation and Mathematical Linguistics</source>
          , Vol.
          <volume>43</volume>
          . –
          <year>2009</year>
          . – pp.
          <volume>149</volume>
          –
          <fpage>195</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Ganter</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Wille</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          .
          <source>Formal Concept Analysis: Mathematical Foundations</source>
          , Springer-Verlag,
          <year>1999</year>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Kuznetsov</surname>
            ,
            <given-names>S.O.</given-names>
          </string-name>
          <article-title>Complexity of Learning in Concept Lattices from Positive and Negative Examples</article-title>
          .
          <source>Discrete Applied Mathematics</source>
          , no.
          <issue>142</issue>
          (
          <issue>1</issue>
          -
          <fpage>3</fpage>
          ). –
          <year>2004</year>
          . – pp.
          <volume>111</volume>
          –
          <fpage>125</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Makhalova</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Kuznetsov</surname>
            ,
            <given-names>S.O.</given-names>
          </string-name>
          .
          <source>On Overfitting of Classifiers Making a Lattice. Proc. 14th International Conference on Formal Concept Analysis (ICFCA 2017), Lecture Notes in Artificial Intelligence (LNAI)</source>
          , vol.
          <volume>10308</volume>
          . – Springer. –
          <year>2017</year>
          . – pp.
          <volume>184</volume>
          –
          <fpage>197</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Vinogradov</surname>
            ,
            <given-names>D.V.</given-names>
          </string-name>
          <article-title>VKF-method of Hypotheses Generation</article-title>
          .
          <source>Communications in Computer and Information Science</source>
          , Vol.
          <volume>436</volume>
          . –
          <year>2014</year>
          . – pp.
          <volume>237</volume>
          –
          <fpage>248</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Vorontsov</surname>
            ,
            <given-names>K.V.</given-names>
          </string-name>
          <article-title>Combinatorial Probability and Generalization Bounds Tightness</article-title>
          .
          <source>Pattern Recognition and Image Analysis</source>
          , Vol.
          <volume>18</volume>
          . – no.
          <issue>2</issue>
          . –
          <year>2008</year>
          . – pp.
          <volume>243</volume>
          –
          <fpage>259</fpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>