<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Accidental formal concepts in the presence of counterexamples</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Dmitry V. Vinogradov</string-name>
          <email>vinogradov.d.w@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Federal Research Center for Computer Science and Control, Russian Academy of Science</institution>
          ,
          <addr-line>Moscow 119333</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Russian State University for Humanities, Intelligent Robotics Laboratory</institution>
          ,
          <addr-line>Moscow 125993</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>An accidental formal concept corresponds to a subset of attributes that accidentally appear together in several objects (of the concept extent) while every such object belongs to a different "real" formal concept extent. There are two standard techniques to forbid such concepts: counterexamples to exclude the concepts involved, and a lower bound on the size of the extent. Both are insufficient. Here we define a random formal context with attributes (different from elements of "real" formal concepts) generated by independent Bernoulli variables. The main result has asymptotic form: if the number n of the random attributes tends to infinity, the success probability equals p = √(a/n), and there are m = b√n counterexamples, then the probability of an accidental formal concept with a 2-element extent is 1 − e^{−a} − a·e^{−a}·[1 − e^{−b√a}] in the limit.</p>
      </abstract>
      <kwd-group>
        <kwd>formal context</kwd>
        <kwd>formal concept</kwd>
        <kwd>Bernoulli variable</kwd>
        <kwd>counterexample</kwd>
        <kwd>overfitting</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Let a set of damaged planes be described by the set</p>
      <p>F = {f1 = empennage damage, f2 = motor damage, f3 = curse scratch}
of attributes by means of a formal context.</p>
    </sec>
    <sec id="sec-2">
      <title>Formal concepts "empennage damage is a cause" and</title>
      <p>"motor damage is a cause" are plausible ones, but</p>
      <p>⟨{o2, o3}, {f3}⟩,
i.e. "curse scratch is a cause of damage", is doubtful. The last one occurs because the
attribute f3 accidentally appears in the 2 training objects o2 and o3, which have different
"real" causes.</p>
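The doubtful concept above can be reproduced mechanically. The following sketch (a minimal illustration, not from the paper) enumerates all formal concepts of a small context via the standard derivation operators; since the original context table is not reproduced here, the object descriptions o1 = {f1}, o2 = {f1, f3}, o3 = {f2, f3} are an assumed incidence consistent with the running example.

```python
from itertools import combinations

# Hypothetical incidence relation, assumed for illustration only:
# o2 and o3 share the accidental attribute f3 while having
# different "real" causes f1 and f2.
context = {
    "o1": {"f1"},
    "o2": {"f1", "f3"},
    "o3": {"f2", "f3"},
}
attributes = {"f1", "f2", "f3"}

def extent(B):
    # B' : objects possessing all attributes in B
    return {g for g, atts in context.items() if B <= atts}

def intent(A):
    # A' : attributes common to all objects in A
    common = set(attributes)
    for g in A:
        common &= context[g]
    return common

# A pair (A, B) is a formal concept iff A' = B and B' = A.
concepts = set()
for r in range(len(context) + 1):
    for A in combinations(sorted(context), r):
        A = set(A)
        B = intent(A)
        if extent(B) == A:
            concepts.add((frozenset(A), frozenset(B)))

for A, B in sorted(concepts, key=lambda c: sorted(c[0])):
    print(sorted(A), sorted(B))
```

The accidental concept ⟨{o2, o3}, {f3}⟩ appears in the output alongside the plausible ones.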
      <p>Of course, the separation of formal concepts into "accidental" and "real" ones
heavily depends on experts' intuition about the research domain. In the previous
example we know enough to do this ourselves. However, in more complex domains
there is no information to reject "accidental" concepts automatically.</p>
      <p>Counterexamples give a means to delete such "accidental" formal concepts by
the so-called "counterexample forbidding condition" procedure. A
counterexample is an object (described by combinations of the same attributes as the training
objects) without the target property. The target property (or goal attribute) is a
binary attribute of the objects. We suppose that a formal concept contains training
objects with the target property and that there exists an additional list of
counterexamples without this property. This assumption converts FCA into a machine
learning method (see, for example, [3]). In the previous example "to be
damaged" was the target property of the training examples.</p>
      <p>We develop a probabilistic model for the (restriction of a) formal context (to the
subset of unessential attributes). An attribute is essential if it occurs as part of the
intent of some "real" formal concept. The goal is to estimate the probability that a
subset of unessential attributes appears as a formal concept intent
without inclusion into some counterexample. A counterexample intent (or description)
corresponds to a random subset of unessential attributes. Inclusion of the intent
of a formal concept into some counterexample intent is suspicious because the
cause (this formal concept) is present while the target property is absent.</p>
      <p>The main result of the paper states the insufficiency of the counterexample
forbidding condition. If the number n of the random (unessential) attributes tends to
infinity, the success probability equals p = √(a/n), and there are m = c√n
counterexamples, then the probability of an accidental formal concept with a 2-element
extent and without any counterexample is 1 − e^{−a} − a·e^{−a}·[1 − e^{−c√a}]
in the limit.</p>
      <p>It is clear that the last expression is greater than 0. This means that there is a
positive probability of the appearance of a formal concept whose intent consists only of
inessential attributes and which is not included into any counterexample.
"Accidental" formal concepts correspond to the overfitting phenomenon: on one hand
we try to use all available information, while on the other hand "accidental" formal
concepts do not correspond to any "real" cause and hence worsen the prediction of the target
property on test examples.</p>
      <p>Basic definitions and facts of FCA.
Here we recall some basic definitions and facts of Formal Concept Analysis
(FCA) [2].</p>
      <p>A (finite) context is a triple (G, M, I) where G and M are finite sets and
I ⊆ G × M. The elements of G and M are called objects and attributes,
respectively. As usual, we write gIm instead of ⟨g, m⟩ ∈ I to denote that object
g has attribute m.</p>
      <p>For A ⊆ G and B ⊆ M, define</p>
      <p>A′ = {m ∈ M | ∀g ∈ A (gIm)},</p>
      <p>B′ = {g ∈ G | ∀m ∈ B (gIm)},
so A′ is the set of attributes common to all the objects in A and B′ is the set of
objects possessing all the attributes in B. The maps (·)′ : A ↦ A′ and (·)′ : B ↦
B′ are called derivation operators (polars) of the context (G, M, I).</p>
      <p>A concept of the context (G, M, I) is defined to be a pair (A, B), where
A ⊆ G, B ⊆ M, A′ = B, and B′ = A. The first component A of the concept
(A, B) is called the extent of the concept, and the second component B is
called its intent. The set of all concepts of the context (G, M, I) is denoted by
B(G, M, I).</p>
      <p>Let (G, M, I) be a context. For concepts (A1, B1) and (A2, B2) in B(G, M, I)
we write (A1, B1) ≤ (A2, B2) if A1 ⊆ A2. The relation ≤ is a partial order on
B(G, M, I).</p>
      <p>Fix a context (G, M, I). In the following, let J be an index set. We assume
that Aj ⊆ G and Bj ⊆ M for all j ∈ J.</p>
      <p>Lemma 1. [2] Assume that (G, M, I) is a context, let A ⊆ G and B ⊆ M, and let
Aj ⊆ G and Bj ⊆ M for all j ∈ J. Then</p>
      <p>A ⊆ A″ and B ⊆ B″, (1)</p>
      <p>A1 ⊆ A2 ⇒ A′1 ⊇ A′2 and B1 ⊆ B2 ⇒ B′1 ⊇ B′2. (2)</p>
      <p>An attribute is called essential if it occurs in some "real" cause. Assume that
any 2 "real" causes have no common attributes. Other attributes are called
accompanying.</p>
      <p>Essential attributes are absent from counterexamples, and the essential attributes
of training objects correspond to some "real" cause in such a way that each of the
2 "real" causes appears in some training object.</p>
      <p>Denote the number of training objects by k, the number of counterexamples
by m, and the number of accompanying attributes by n. It is clear that the
accompanying attributes of the training objects and counterexamples form a (k + m) × n
binary matrix. It contains N = (k + m)·n bits. The accompanying attributes are
generated by a Bernoulli series of N tests.</p>
      <p>Definition 1. A Bernoulli series of N tests is the probability distribution on {0, 1}^N
with</p>
      <p>P(x1 = ε1, …, xN = εN) = ∏_{j=1}^{N} p^{εj}·(1 − p)^{1−εj},
where 0 &lt; p &lt; 1. The number p &gt; 0 is called the success probability (of xj = 1 in test j).</p>
      <p>Lemma 2. An accidental formal concept with an extent of cardinality k can be replaced
by a Bernoulli series of n tests ⟨a1, …, an⟩ with success probability p^k of aj = 1,
appended to the string of 0s corresponding to the essential attributes.</p>
      <p>Proposition 2. The probability of an accidental formal concept with an extent of
size k exceeds ε &gt; 0 if the success probability p ≥ (−ln(1 − ε)/n)^{1/k}. □</p>
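Definition 1 can be illustrated by a small simulation. The sketch below (the function name and parameter values are illustrative, not from the paper) estimates by the Monte Carlo method the probability that at least one of the n accompanying attributes occurs in all k training objects, i.e. that an accidental concept with a k-element extent appears; under the Bernoulli model this probability equals 1 − (1 − p^k)^n, in line with Proposition 2.

```python
import random

# Monte Carlo illustration of the Bernoulli model (cf. Lemma 2 and
# Proposition 2): an accidental concept with a k-element extent appears
# iff at least one of the n accompanying attributes occurs in all k
# training objects; each such event has probability p^k, so the total
# probability is 1 - (1 - p^k)^n.
def accidental_concept_prob(k, n, p, trials=50_000, seed=1):
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # attribute j is shared by all k objects with probability p^k
        if any(all(rng.random() < p for _ in range(k)) for _ in range(n)):
            hits += 1
    return hits / trials

k, n, p = 2, 100, 0.1
estimate = accidental_concept_prob(k, n, p)
exact = 1 - (1 - p**k) ** n
print(estimate, exact)  # both close to 0.634
```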
      <p>In the analysis of the counterexample forbidding rejection procedure we restrict
ourselves to k = 2.</p>
      <p>Lemma 3. The probability of the appearance of an accidental formal concept (with a
2-element extent) surviving m random counterexamples is</p>
      <p>∑_{l=0}^{n} C(n, l)·(p²)^l·(1 − p²)^{n−l}·(1 − p^l)^m = ∑_{j=0}^{m} C(m, j)·(−1)^j·(1 − p² + p^{2+j})^n,</p>
      <p>since p^l is the probability of deleting this formal concept by some fixed
counterexample, hence (1 − p^l)^m is the probability that no counterexample (of the m)
deletes the formal concept, and C(n, l)·(p²)^l·(1 − p²)^{n−l} is the probability
of the appearance of an accidental formal concept with an intent of cardinality l;
the second form follows by expanding (1 − p^l)^m by the binomial formula and
exchanging the sums.</p>
      <p>Also the proof of Lemma 3 can be performed by direct application of the
"inclusion-exclusion" principle.</p>
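The two forms of the sum in Lemma 3 can be verified numerically for small n and m (a quick consistency check; the function names are illustrative, not from the paper):

```python
from math import comb

# Numerical consistency check of the two forms in Lemma 3: the sum over
# intent cardinalities l equals the inclusion-exclusion sum over j.
def lemma3_lhs(n, m, p):
    return sum(comb(n, l) * (p**2) ** l * (1 - p**2) ** (n - l) * (1 - p**l) ** m
               for l in range(n + 1))

def lemma3_rhs(n, m, p):
    return sum(comb(m, j) * (-1) ** j * (1 - p**2 + p ** (2 + j)) ** n
               for j in range(m + 1))

print(lemma3_lhs(8, 5, 0.3), lemma3_rhs(8, 5, 0.3))  # equal up to rounding
```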
      <sec id="sec-2-1">
        <title>Main result</title>
        <p>Lemma 4. For any constant c the following holds:</p>
        <p>∑_{j=0}^{m} C(m, j)·(−1)^j·c = 0.</p>
        <p>This is proved by Newton's binomial formula. □</p>
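Lemma 4 is easy to check numerically (an illustrative sketch; the function name is ours):

```python
from math import comb

# Sanity check of Lemma 4: sum_{j=0}^{m} C(m, j) (-1)^j c = 0, because
# sum_j C(m, j) (-1)^j = (1 - 1)^m = 0 by Newton's binomial formula.
def alternating_binomial_sum(m, c):
    return sum(comb(m, j) * (-1) ** j * c for j in range(m + 1))

print(alternating_binomial_sum(7, 3.5))  # → 0.0
```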
        <p>Lemma 5. For j &gt; 0 the formula</p>
        <p>(1 − a/n + (a/n)^{1+j/2})^n − (1 − a/n)^n = a·e^{−a}·(a/n)^{j/2} + o(n^{−j/2})</p>
        <p>holds as n → ∞.</p>
        <p>Proof. Write (1 − a/n + (a/n)^{1+j/2})^n = (1 − a/n)^n·(1 + (a/n)^{1+j/2}/(1 − a/n))^n.
Since n·(a/n)^{1+j/2} = a^{1+j/2}·n^{−j/2} → 0 and (1 − a/n)^n → e^{−a}, the difference
equals a^{1+j/2}·e^{−a}·n^{−j/2} + o(n^{−j/2}) = a·e^{−a}·(a/n)^{j/2} + o(n^{−j/2}). □</p>
        <p>Lemma 6. For m = b√n and every fixed j the formula
C(m, j)·(a/n)^{j/2} = ((b√a)^j/j!)·(1 + o(1))
holds as n → ∞.</p>
        <p>Theorem 1. If the number n of the random attributes tends to infinity, the
probability of success equals p = √(a/n), and there are m = b√n counterexamples,
then the probability of an accidental formal concept with a 2-element extent and
without any counterexample is 1 − e^{−a} − a·e^{−a}·[1 − e^{−b√a}] in the limit.</p>
        <p>Proof. Lemma 3 reduces the theorem to the estimation of</p>
        <p>∑_{j=0}^{m} C(m, j)·(−1)^j·(1 − p² + p^{2+j})^n.</p>
        <p>Substitute p = √(a/n); then each summand becomes
C(m, j)·(−1)^j·(1 − a/n + (a/n)^{1+j/2})^n, and Lemma 4 (applied with the constant
c = (1 − a/n)^n) implies</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Then</title>
      <p>∑_{j=0}^{m} C(m, j)·(−1)^j·(1 − a/n + (a/n)^{1+j/2})^n =
1 − (1 − a/n)^n + ∑_{j=1}^{m} C(m, j)·(−1)^j·[a·e^{−a}·(a/n)^{j/2} + o(n^{−j/2})]</p>
      <p>by Lemma 5 (the j = 0 summand equals 1). Since m = b√n, the last sum equals</p>
      <p>a·e^{−a}·(∑_{j=1}^{m} (−b√a)^j/j!)·(1 + o(1)) = a·e^{−a}·[e^{−b√a} − 1]·(1 + o(1)),</p>
      <p>which, together with (1 − a/n)^n → e^{−a}, finishes the proof, since the last
equation holds by Lemma 6. □</p>
      <p>To prove that</p>
      <p>1 − e^{−a} − a·e^{−a}·[1 − e^{−b√a}] &gt; 0</p>
      <p>for any a &gt; 0 and b &gt; 0, use the following reasoning:</p>
      <p>1 − e^{−a} − a·e^{−a}·[1 − e^{−b√a}] ≥ 1 − e^{−a} − a·e^{−a} = e^{−a}·(e^a − 1 − a) ≥ e^{−a}·a²/2! &gt; 0.</p>
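The rate of convergence to the limit of Theorem 1 can be inspected numerically. The sketch below (the function names are ours, and a = b = 1 is an arbitrary illustrative choice) evaluates the finite-n probability of Lemma 3 summed over intent cardinalities l, with p = √(a/n) and m = b√n, and compares it with the limit 1 − e^{−a} − a·e^{−a}·[1 − e^{−b√a}].

```python
from math import exp, lgamma, log, sqrt

# Finite-n probability from Lemma 3 (sum over intent cardinalities l),
# with p = sqrt(a/n) and m = round(b*sqrt(n)); the binomial probability
# of l shared attributes is evaluated in log space to avoid overflow.
def survival_prob(n, a, b):
    p = sqrt(a / n)
    m = round(b * sqrt(n))
    total = 0.0
    for l in range(1, n + 1):
        log_pmf = (lgamma(n + 1) - lgamma(l + 1) - lgamma(n - l + 1)
                   + l * log(p * p) + (n - l) * log(1 - p * p))
        total += exp(log_pmf) * (1 - p**l) ** m
    return total

# Limit from Theorem 1: 1 - e^{-a} - a e^{-a} [1 - e^{-b sqrt(a)}]
def limit_prob(a, b):
    return 1 - exp(-a) - a * exp(-a) * (1 - exp(-b * sqrt(a)))

a, b = 1.0, 1.0
print(survival_prob(10_000, a, b), limit_prob(a, b))  # close for large n
```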
      <p>Our model of random counterexamples is incorrect if the number m of
counterexamples is comparable with 2^n, since in this case many pairs of
counterexamples coincide. But this situation is also impractical from the data analyst's point
of view: when real data are coded by binary attributes, the majority of attributes
are absent, hence the formal context has a small fraction of 1s.</p>
      <sec id="sec-3-1">
        <title>Conclusions</title>
        <p>In this paper we demonstrate that Formal Concept Analysis has the overfitting
shortcoming similar to other machine learning methods. The JSM method [1] has
the same drawback. The author [5] developed a probabilistic approach to JSM,
FCA, and related formalisms. The inspiration for this study was the overfitting
phenomenon of classical methods.</p>
        <p>Recently Tatiana Makhalova and Prof. Sergey O. Kuznetsov [4] developed an
interesting approach to comparing the overfitting phenomena of different
machine learning algorithms by means of FCA. They suppose that a formal context
contains the errors of prediction of the target property of objects by different
algorithms (the attributes). Hence Formal Concept Analysis is a means to estimate the
overfitting of algorithms when the set of all objects is randomly separated into two
parts (one to learn and one to test). There are some theoretical results based on the "weak
probability axiomatization" proposed by Prof. Konstantin V. Vorontsov [6].</p>
        <p>In the present paper we discuss the overfitting of FCA itself, considered as a
machine learning method. We regard the appearance of "accidental" formal concepts
as overfitting because it occurs during the computation of all formal concepts (in an
attempt to use all available information while ignoring the possibility of "accidental"
coincidence between unessential attributes).</p>
        <p>Acknowledgements. The author would like to thank Prof. Victor K. Finn and
his colleagues at Federal Research Center for Computer Science and Control for
support and helpful discussions.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Finn</surname>
            ,
            <given-names>V.K.</given-names>
          </string-name>
          <article-title>The Synthesis of Cognitive Procedures and the Problem of Induction</article-title>
          .
          <source>Automatic Documentation and Mathematical Linguistics</source>
          , Vol.
          <volume>43</volume>
          . –
          <year>2009</year>
          . – pp.
          <volume>149</volume>
          –
          <fpage>195</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Ganter</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Wille</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          .
          <source>Formal Concept Analysis: Mathematical Foundations</source>
          , Springer-Verlag,
          <year>1999</year>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Kuznetsov</surname>
            ,
            <given-names>S.O.</given-names>
          </string-name>
          <article-title>Complexity of Learning in Concept Lattices from Positive and Negative Examples</article-title>
          .
          <source>Discrete Applied Mathematics</source>
          , no.
          <issue>142</issue>
          (
          <issue>1</issue>
          -
          <fpage>3</fpage>
          ). –
          <year>2004</year>
          . – pp.
          <volume>111</volume>
          –
          <fpage>125</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Makhalova</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Kuznetsov</surname>
            ,
            <given-names>S.O.</given-names>
          </string-name>
          .
          <source>On Overfitting of Classifiers Making a Lattice. Proc. 14th International Conference on Formal Concept Analysis (ICFCA 2017), Lecture Notes in Artificial Intelligence (LNAI)</source>
          , vol.
          <volume>10308</volume>
          . – Springer. –
          <year>2017</year>
          . – pp.
          <volume>184</volume>
          –
          <fpage>197</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Vinogradov</surname>
            ,
            <given-names>D.V.</given-names>
          </string-name>
          <article-title>VKF-method of Hypotheses Generation</article-title>
          .
          <source>Communications in Computer and Information Science</source>
          , Vol.
          <volume>436</volume>
          . –
          <year>2014</year>
          . – pp.
          <volume>237</volume>
          –
          <fpage>248</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Vorontsov</surname>
            ,
            <given-names>K.V.</given-names>
          </string-name>
          <article-title>Combinatorial Probability and Generalization Bounds Tightness</article-title>
          .
          <source>Pattern Recognition and Image Analysis</source>
          , Vol.
          <volume>18</volume>
          . – no.
          <issue>2</issue>
          . –
          <year>2008</year>
          . – pp.
          <volume>243</volume>
          –
          <fpage>259</fpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>