Recovering Noisy Contexts with Probabilistic Formal Concepts⋆

Vitaliy V. Martynovich¹ and Euvgeniy E. Vityaev¹,²

¹ Novosibirsk State University, Novosibirsk, Russia, vilco@ya.ru
² Sobolev Institute of Mathematics, Novosibirsk, Russia, evgenii.vityaev@math.nsc.ru

⋆ The work is financially supported by the Russian Foundation for Basic Research, grant 15-07-03410-a.

Abstract. Uncertainty in the environment typically generates noisy concept alternatives and leads to an overpopulated concept lattice. From a computational point of view, straightforward filtering of the noisy concept lattice suffers from an exponential-size blow-up; from a semantic point of view, it runs into numerous ambiguities due to overfitting. We bypass the filtering problem by applying a probabilistic approach: a probabilistic generalization of formal concepts which seems to avoid the monstrous combinatorial complexity of constructing the complete concept lattice. The theoretical basis of the method is described, together with a ready-to-use noise-resistant algorithm. The algorithm has been tested and shows moderate precision and recall on several datasets, including a toy one exposed to 3, 4, or 5% random noise.

Keywords: formal concept analysis, concept lattice, inductive learning, data mining, association rules, classification task

1 Introduction

Formal concepts may be successfully used as classification units [1, 2]. However, treating the concept lattice as a plain graph, as Formal Concept Analysis (FCA) does, works well only until the data become uncertain: the lattice can then grow prohibitively large even on small datasets.

There have been attempts to get rid of noise in data by concept selection or filtering. For example, measures of concept stability have been shown to pick out the most reliable formal concepts [4, 5]. It was demonstrated that the stability index is relevant to data mining tasks and possesses several attractive properties [9]. Nevertheless, this is still not enough for uncertain environments [4]. The overload caused by noisy clones makes the computation intractable even on small datasets: formally speaking, a zero-populated context superimposed with random Bernoulli noise is expected to produce lattices of exponential size [8].

Approaches based on hypothesis making have also been analyzed: the performance of such models still suffers in practical settings [7]. The closest related research is probably fuzzy concept analysis, as in [18].

In this paper we reconsider the problem of handling possible noise in data by means of probability and logic. The origin of the probabilistic pattern for formal concepts lies in cognitive science, where such patterns are closely related to the "natural classes" [16]. We focus on developing a context recovery method with the following key capabilities in mind:

1. stability of the recovered concept lattice with respect to a possible minor noise;
2. computational tractability, avoiding filtering of the whole concept lattice;
3. handling the prediction ambiguity problem;
4. a relationship with the theory of category formation [17, 16].

The first step was made in [12], where a probabilistic generalization of formal concepts was introduced. The next step is to equip concepts with attribute negations and to develop a logical language for probabilistic reasoning on a context.
We will also prove some technical facts about the consistency of such probabilistic reasoning.

2 Formal concept analysis foundations

This section gives a brief overview of the Formal Concept Analysis framework [1, 2, 13] used in the paper. A dataset is represented by an object–attribute cross-table. Formally speaking:

Definition 1. A formal context is a triple (G, M, I), where G and M are sets of arbitrary nature and I ⊆ G × M is a binary relation.

On a formal context (or simply context) the derivation operator ′ is defined:

Definition 2. Let A ⊆ G, B ⊆ M. Then
1. A′ = {m ∈ M | ∀g ∈ A, (g, m) ∈ I};
2. B′ = {g ∈ G | ∀m ∈ B, (g, m) ∈ I};
3. the pair (A, B) is called a formal concept if A′ = B and B′ = A.

Generally speaking, Formal Concept Analysis concentrates on the concept lattice that arises on concept extents from the natural subset order. We, however, aim to avoid considering the concept lattice and to exploit intrinsic properties of the data instead. The core notion here is the implication.

Definition 3. An implication is a pair (B, C), B, C ⊆ M, which we write as B → C. The implication B → C is true on K = (G, M, I) if for every g ∈ G either B ⊈ g′ or C ⊆ g′. We denote the set of all true implications by Imp(K).

Implications not only form a conceptual bridge from FCA to logical structures; they are also an essential means of machine reasoning, and of prediction in particular.

Definition 4. For any set of implications L we construct the operator of direct inference f_L, which adds all conclusions of applicable implications:

    f_L(X) = X ∪ {C | B ⊆ X, (B → C) ∈ L}.

The following theorem characterizes concepts by means of fixed points.

Theorem 1 (see [2]). For any set B ⊆ M: f_Imp(K)(B) = B ⇔ B′′ = B.

The application of the theorem may be illustrated on a simple formal context.

    m1 m2 m3
     1  1  0
     1  1  1
     0  1  1
     0  0  0

Table 1: A very simple formal context K0.

The concept lattice construction task can be reformulated by means of implications and the inference operator. It is easy to see from Table 1 that attributes m1 and m3 determine the class of an object. In fact, m1 implies m2, and so does m3; this is written as m1 → m2 and m3 → m2. The sets {m1, m2}, {m2, m3} and {m1, m2, m3} are intents of formal concepts and, accordingly, fixed points of the direct inference operator. For example,

    {m1} ↦ {m1, m2} ↦ {m1, m2}   (two applications of f_Imp(K0)),

so {m1, m2} is indeed a fixed point and a formal concept intent at the same time.
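To make the derivation operators and the fixed-point characterization of Theorem 1 concrete, here is a minimal Python sketch (our own illustration, not code from the paper) run on the toy context K0 of Table 1. The dictionary representation and all function names are assumptions of the sketch, and only singleton-premise implications are enumerated.

```python
# A minimal sketch (not from the paper): derivation operators and the
# direct-inference closure on the toy context K0 of Table 1.
# The representation (dict of attribute sets) is an assumption of this sketch.

K0 = {                      # object -> set of attributes (Table 1)
    "g1": {"m1", "m2"},
    "g2": {"m1", "m2", "m3"},
    "g3": {"m2", "m3"},
    "g4": set(),
}

def extent(B):
    """B' : objects having every attribute in B."""
    return {g for g, attrs in K0.items() if set(B) <= attrs}

def intent(A):
    """A' : attributes shared by every object in A."""
    attrs = set.union(*K0.values())
    return {m for m in attrs if all(m in K0[g] for g in A)}

def implications():
    """True implications B -> C of the context (here: singleton premises only)."""
    attrs = set.union(*K0.values())
    imps = []
    for m in attrs:
        concl = intent(extent({m})) - {m}   # {m}'' minus m itself
        if concl:
            imps.append(({m}, concl))
    return imps

def direct_inference(X, imps):
    """f_L(X): add conclusions of applicable implications until a fixed point."""
    X = set(X)
    changed = True
    while changed:
        changed = False
        for B, C in imps:
            if B <= X and not C <= X:
                X |= C
                changed = True
    return X

imps = implications()
print(direct_inference({"m1"}, imps))   # {'m1', 'm2'} -- a fixed point
print(intent(extent({"m1"})))           # {m1}'' = {'m1', 'm2'}, as in Theorem 1
```

On K0 this reproduces the closure {m1}′′ = {m1, m2} discussed above.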
3 Probabilistic logic on a formal context

Let us add some noise to K0. We also extend K0 by adding redundant duplicates of objects: this keeps the noise level low enough to make concept recovery practically possible. Every single alteration changes the concept lattice considerably. The first context (Table 2) is equivalent to K0 above and has the same concept lattice. The second one (Table 3), however, generates a lot of side concepts provoked by noise: the sets {m2} and {m1, m3} also become formal concepts. The number of side concepts grows as more noise comes in, and the dependency tends to be asymptotically exponential [8].

Stability may be achieved in various ways. The most obvious one is to compute some stability index in order to evaluate whether a concept of the noisy context is good enough or is merely produced by noise [4, 5]. Another one rests on a different idea: instead of measuring the stability of concepts, the stability of implications is measured. An essential way to do this is to exploit the likelihood of attribute implications; we will, moreover, generalize implications to logical formulas.

    m1 m2 m3
     1  1  0
     1  1  1
     0  1  1
     1  1  0
     1  1  1
     0  1  1
     1  1  0
     1  1  1
     0  1  1
     0  0  0

Table 2: K0 populated with duplicates.

    m1 m2 m3
     0  1  0
     1  1  1
     0  1  1
     1  1  0
     1  0  1
     0  1  1
     1  1  0
     1  1  1
     0  1  1
     0  0  0

Table 3: Knoise, the same context with a little noise.

Definition 5. For a formal context K = (G, M, I) we introduce the classical logical notions:
– L_K is the set of letters; it includes every m ∈ M as well as the negations ¬m;
– Φ_K is the set of formulas, defined inductively: every letter is a formula, and for any φ, ψ ∈ Φ_K the expressions φ ∧ ψ, φ ∨ ψ, φ → ψ and ¬φ are formulas, too.

Remark 1. For brevity, we write ⋀L = ⋀_{P ∈ L} P (and ⋀L = 1 if L = ∅). Similarly, ¬L = {¬P | P ∈ L}.

For every object g, a logical model K_g of the object is defined. We say that the object g respects a formula φ ∈ Φ_K if the formula is true in the model K_g; we write this as g ⊨ φ ⇔ K_g ⊨ φ. The set G_φ = {g ∈ G | g ⊨ φ} is called the support of φ.

Definition 6. Let µ be an arbitrary probability measure on the set G, i.e. a finite, countably additive measure. Then the contextual probability measure is defined by

    ν : Φ_K → [0, 1],   ν(φ) = µ({g | g ⊨ φ}).

The most common understanding of formula probability can be linked to the well-known confidence index for context implications,

    conf(X → Y) = |supp(X ∪ Y)| / |supp(X)|.

The formula probability expresses exactly the same if we keep things simple and take µ to be the uniform counting measure, µ({g}) = 1/|G|. For practical applications, here and further on we assume that G is finite and contains no objects of zero measure, i.e. µ({g}) ≠ 0 for every g ∈ G.

Definition 7. A set of attributes (letters) L is compatible if it is satisfied by at least one object, i.e. G_⋀L ≠ ∅. The same may be expressed as ν(⋀L) > 0.

Now consider a set of letters L = {m1, ..., mk, m} ⊆ L_K. The formula m1 ∧ m2 ∧ ... ∧ mk → m looks like the classical context implication ({m1, ..., mk}, {m}), except that negations of attributes may now appear, as in m1 ∧ ¬m2 ∧ ... ∧ mk → ¬m. The notion of an implication as a formula is captured by the definition of a rule:

Definition 8. Let C, H_i ∈ L_K, C ∉ {H1, H2, ..., Hk}, k ≥ 0. Then:
1. the rule R = (H1, H2, ..., Hk → C) is the implication (H1 ∧ H2 ∧ ... ∧ Hk → C);
2. the premise R← of the rule R is the set of letters {H1, H2, ..., Hk};
3. the conclusion is R→ = C;
4. if R1← = R2← and R1→ = R2→, then R1 = R2.

Definition 9. The probability of a rule R is the conditional probability

    η(R) = ν(R→ | R←) = ν(⋀R← ∧ R→) / ν(⋀R←).

If ν(⋀R←) is zero, the probability of the rule remains undefined.

Keeping an eye on K0, let us see what happens on Knoise with the implications m1 → m2 and m3 → m2. They are no longer contextual tautologies, but we can still stick to the corresponding rules with reasonable likelihoods: η(m1 → m2) = 4/5 and η(m3 → m2) = 5/6.

The core idea of the approach is to exploit Theorem 1: the direct inference operator can easily be adapted to employ probabilistic rules instead of formal context implications.

Definition 10. The prediction operator Π on a set of rules R works as follows:

    Π_R(L) = L ∪ {C | ∃R ∈ R : R← ⊆ L, R→ = C}.

Definition 11. The closure L̄ of a set of letters L is the smallest fixed point of the prediction operator: L̄ = Π_R^∞(L).
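The contextual probability ν, the rule probability η of Definition 9, and the prediction closure of Definitions 10–11 are easy to compute directly on the noisy context of Table 3. The following sketch (again our own, with negations encoded as strings prefixed by "~" and a uniform counting measure) reproduces the values quoted above.

```python
from fractions import Fraction

# Sketch only: rule probability eta and the prediction closure Pi on the noisy
# context of Table 3. The "~m" encoding of negations and all names are our
# assumptions, not notation from the paper.

K_noise = [
    {"m2"}, {"m1", "m2", "m3"}, {"m2", "m3"},
    {"m1", "m2"}, {"m1", "m3"}, {"m2", "m3"},
    {"m1", "m2"}, {"m1", "m2", "m3"}, {"m2", "m3"},
    set(),
]

def holds(letter, obj):
    """A letter is either an attribute m or its negation '~m'."""
    return (letter[1:] not in obj) if letter.startswith("~") else (letter in obj)

def nu(letters):
    """Contextual probability of a conjunction of letters (uniform measure)."""
    sat = sum(all(holds(l, o) for l in letters) for o in K_noise)
    return Fraction(sat, len(K_noise))

def eta(premise, conclusion):
    """Conditional probability of the rule premise -> conclusion (Definition 9)."""
    p = nu(premise)
    return None if p == 0 else nu(set(premise) | {conclusion}) / p

def prediction_closure(letters, rules):
    """Pi_R iterated up to its smallest fixed point (Definitions 10-11)."""
    L = set(letters)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            if premise <= L and conclusion not in L:
                L.add(conclusion)
                changed = True
    return L

print(eta({"m1"}, "m2"))                    # 4/5, as in the text
print(eta({"m3"}, "m2"))                    # 5/6
print(eta({"~m1", "m3"}, "m2"))             # 1
rules = [({"m1"}, "m2"), ({"m3"}, "m2")]
print(prediction_closure({"m1"}, rules))    # {'m1', 'm2'}
```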
4 Rule classes

Note that Definition 10 accepts an arbitrary set of rules. To produce a relevant and consistent set of generalized concepts, additional restrictions on this set are needed. Following [6], we will prove a compatibility theorem and ensure the correctness property of the prediction closure operator.

Definition 12 (subrule). R1 ⊏ R2 if R1← ⊂ R2← and R1→ = R2→.

Definition 13 (refinement). R1 ≻ R2 if R2 ⊏ R1 and η(R1) > η(R2).

For example, the rule m1 → m2 of Knoise has the only unconditional subrule (∅ → m2) ⊏ (m1 → m2). This is not a refinement, since η(∅ → m2) = 8/10 = 4/5 = η(m1 → m2). However, (m3 → m2) ≻ (∅ → m2), because η(∅ → m2) = 8/10 < 5/6 = η(m3 → m2).

The class M1 requires a rule to have a conditional probability greater than the unconditional probability of C, i.e. the rule is guaranteed to be useful in reasoning:

Definition 14. R ∈ M1(C) ⇔ η(R) > ν(R→) and R→ = C.

The class M2 requires a rule to be specific: its probability cannot be improved by refining the rule.

Definition 15. R ∈ M2(C) ⇔ R ∈ M1(C) and [R ⊏ R̃ ⇒ η(R̃) ≤ η(R)].

The rule (m3 → m2) can be refined up to (¬m1 ∧ m3 → m2) due to the inequality η(m3 → m2) = 5/6 < 1 = 3/3 = η(¬m1 ∧ m3 → m2). The latter rule satisfies all the M2 conditions, and thus (¬m1 ∧ m3 → m2) ∈ M2(m2).

The class Imp contains all exact implications, in particular every contextual tautology:

Definition 16. R ∈ Imp(C) ⇔ R→ = C and η(R) = 1.

We also consider compound classes over the entire set of letters:

Definition 17. M1 = ⋃_{C ∈ L_K} M1(C).

Remark 2. M2 and Imp are defined similarly.

All exact implications are indeed necessary to ensure the completeness property of the prediction operator. In turn, a set of rules must consist only of M2 rules in order to obtain the consistency property. A set of letters L is called consistent if it does not contain an atom C together with its negation ¬C.

Definition 18. If Imp ⊂ R, then the set of rules R is called complete.

Definition 19. Any set of rules R ⊆ M2 is called a system of rules.

5 Prediction consistency

Definition 20. A set of letters L is consistent if C ∈ L ⇒ ¬C ∉ L.

Π_R must avoid inconsistent inferences [3]. The following theorem is the main theoretical result of the paper: it shows that predictions are consistent and compatible (see Definition 7). For the proof and technical details, see [6].

Theorem 2 (Compatibility). If L is compatible, then Π_R(L) is also compatible and consistent for any system of rules R.

Somewhat more difficult, but still solvable, is the question of the inconsistency of prediction closures. Let R be a complete set of rules and Π_R the corresponding prediction operator. It is important to note that rule systems containing M2 are always complete.

Theorem 3. If L is incompatible, then Π_R(L) is inconsistent and incompatible.
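Under the same assumptions as the previous sketch, one can test the M1 condition of Definition 14 and approximate the M2 condition of Definition 15 by brute force. The helper names, the bound on premise length, and the restriction to one-letter refinements are simplifications of ours, feasible only on toy contexts.

```python
from fractions import Fraction
from itertools import combinations

# Sketch only: the M1 test and an approximate search for most specific rules
# on the noisy context of Table 3 (same encoding as the previous sketch).

K_noise = [
    {"m2"}, {"m1", "m2", "m3"}, {"m2", "m3"},
    {"m1", "m2"}, {"m1", "m3"}, {"m2", "m3"},
    {"m1", "m2"}, {"m1", "m2", "m3"}, {"m2", "m3"},
    set(),
]
LETTERS = ["m1", "~m1", "m2", "~m2", "m3", "~m3"]

def holds(letter, obj):
    return (letter[1:] not in obj) if letter.startswith("~") else (letter in obj)

def nu(letters):
    sat = sum(all(holds(l, o) for l in letters) for o in K_noise)
    return Fraction(sat, len(K_noise))

def eta(premise, conclusion):
    p = nu(premise)
    return None if p == 0 else nu(set(premise) | {conclusion}) / p

def in_M1(premise, conclusion):
    """M1: the rule must raise the probability of its conclusion (Definition 14)."""
    e = eta(premise, conclusion)
    return e is not None and e > nu({conclusion})

def most_specific(conclusion, max_premise=2):
    """Brute-force search for M1 rules that no one-letter refinement improves --
    an approximation of the M2 condition of Definition 15, toy contexts only."""
    neg = "~" + conclusion if not conclusion.startswith("~") else conclusion[1:]
    candidates = [l for l in LETTERS if l not in (conclusion, neg)]
    best = []
    for k in range(max_premise + 1):
        for prem in combinations(candidates, k):
            prem = set(prem)
            if not in_M1(prem, conclusion):
                continue
            e = eta(prem, conclusion)
            refinements = [eta(prem | {l}, conclusion)
                           for l in candidates if l not in prem]
            if all(r is None or r <= e for r in refinements):
                best.append((prem, conclusion, e))
    return best

print(in_M1({"m1"}, "m2"))      # False: 4/5 is not above nu(m2) = 8/10
print(in_M1({"m3"}, "m2"))      # True: 5/6 > 8/10
for rule in most_specific("m2"):
    print(rule)                  # the rule ~m1, m3 -> m2 with probability 1 appears here
```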
6 Probabilistic formal concepts

The fixed points of the prediction operator are natural candidates for concept intents. What about concept extents? The principles proposed in [5, 9] give us a cue. The idea is to take all attribute sets that are closure preimages of B, i.e. all C with Π(C) = B, and to compose their supports into a derived concept extent A. This allows the actual concept to be restored by applying the prediction operator, and gathers all objects of the same class into a joint extent.

For example, let Ksquares be a context depicted as two disjoint squares (which are two independent formal concepts). To bring in extra complexity, we also alter some entries:

    G   m1 m2 m3 m4 m5 m6 m7 m8
    g1   1  1  1  1  1  0  0  0
    g2   1  1  1  1  0  0  0  0
    g3   1  1  1  1  0  0  0  0
    g4   1  1  1  1  0  0  0  0
    g5   0  0  0  0  1  1  1  1
    g6   0  0  0  0  1  1  0  1
    g7   0  0  0  0  1  1  1  1
    g8   0  0  0  0  1  1  1  1

Table 4: A two-concept context Ksquares with a minor noise.

Note that the most specific rules belonging to M2 are (m_i → ¬m_j) for i = 1...4 and j = 5...8, whereas the rule m5 → ¬m_i, i = 1...4, is not: there is a more specific rule refining it, since η(m5 ∧ m6 → ¬m1) = 1 > 4/5 = η(m5 → ¬m1). This is how noise is handled, being encapsulated in probability and refinement.

To pick up the first object of Ksquares, the prediction operator first computes a closure, Π(g1′) = {m1, m2, m3, m4, ¬m5, ¬m6, ¬m7, ¬m8}; then all objects with the same closure are composed into a concept with the extent {g1, g2, g3, g4}.

Definition 21. By a probabilistic formal concept on K = (G, M, I) we mean any pair (A, B) satisfying

    Π(B) = B,   A = ⋃_{C ⊂ B, Π(C) = B} G_C.

Our choice is also justified by the following statement, relating probabilistic and ordinary formal concepts on the same context.

Theorem 4 (Ordinary concepts inclusion [12]). Let K be a formal context.
1. If (A, B) is an ordinary concept on K, then there is a probabilistic concept (N, M) such that A ⊆ N and B ⊆ M.
2. If (N, M) is a probabilistic concept on K, then there is a set of ordinary concepts C such that ∀(A, B) ∈ C, Π(B) = M, and N = ⋃_{(A,B)∈C} A.

7 Probabilistic concepts discovery

For practical applications, a computational problem has to be solved: the task is still exponentially hard if a full enumeration of M2 is required. Semantic probabilistic inference as an enumeration procedure has been described in detail in [15]. The idea is to perform a kind of greedy search combined with a branch-and-bound search on the inference tree, aiming to obtain a subset of M2 which is sufficient for most practical tasks.

Definition 22. R is a probabilistic law if for any R̃, (R̃ ⊏ R) ⇒ (R ≻ R̃).

Definition 23. The rule R̃ is semantically probabilistically inferred from the rule R, written R ⊳ R̃, if R and R̃ are probabilistic laws and R̃ ≻ R.

Definition 24. A probabilistic law R is strongest, written R ∈ SPL, if there is no other probabilistic law R̃ such that R̃ ≻ R.

Proposition 1. All strongest probabilistic laws belong to M2.

The rule extraction routine is based on the Semantic Probabilistic Inference (SPI) approach. It requires each path in the inference graph to be a sequence of semantic inferences:

Definition 25. An SPI is a sequence of rules R0 ⊳ R1 ⊳ R2 ⊳ ... ⊳ Rm such that R0← = ∅ and Rm is a strongest probabilistic law.

Now let us assume that some system of rules R on a context K has already been discovered by semantic probabilistic inference. The definition of a probabilistic concept implies the following closure-search procedure (a sketch of the complete loop in code is given at the end of this section).

1. Set the step counter k = 0 and generate the set C^(0) = {Π_R(R←) | R ∈ R}. In fact, this may be an arbitrary family of letter sets to be extended up to their probabilistic concept closures. The set C^(0) is almost always redundant, but it should be enough to cover all statistically significant attribute sets.
2. At step k, if C^(k) = ∅, the algorithm finishes execution and outputs the list of detected probabilistic concepts.
3. At step k, the set A = {g ∈ G | Π_R(g′ ∩ B) = B} is computed for each B ∈ C^(k). If A ≠ ∅, the pair (A, B) is added to the list of found concepts. This corresponds to a join operation on the concept lattice and subsequently climbs to superordinate levels of the lattice.
4. The set C^(k+1) = {Π_R(B ∪ C) | B, C ∈ C^(k)} \ C^(k) is generated. In fact, the actual prediction closures are computed at this step.
5. Let k := k + 1 and go to step 2.

The algorithm can be applied to the context recovery task as well as to a wide variety of data mining problems, such as classification and clustering. In the final section we focus on handling noise in a toy, yet rather hard, context recovery task.
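The following sketch shows one possible reading of the closure-search procedure on the Ksquares context of Table 4. The rule system is hand-picked rather than obtained by the SPI routine, and the "~m" encoding of negated attributes is our assumption; on this toy context the loop recovers the two expected concepts ({g1, ..., g4}, {m1, ..., m4}) and ({g5, ..., g8}, {m5, ..., m8}) despite the two noisy entries.

```python
# Sketch only (our own reading of the closure-search procedure, not the authors'
# implementation): discovering probabilistic concepts of K_squares (Table 4)
# from a hand-picked rule system; "~m" encodes the negation of attribute m.

ATTRS = ["m1", "m2", "m3", "m4", "m5", "m6", "m7", "m8"]
ROWS = {          # Table 4
    "g1": "11111000", "g2": "11110000", "g3": "11110000", "g4": "11110000",
    "g5": "00001111", "g6": "00001101", "g7": "00001111", "g8": "00001111",
}

def intent(g):
    """Full logical intent of an object: its attributes plus negations of the rest."""
    row = ROWS[g]
    return frozenset(m if row[i] == "1" else "~" + m for i, m in enumerate(ATTRS))

LEFT, RIGHT = ATTRS[:4], ATTRS[4:]
RULES = (                        # a hand-picked M2-style rule system (assumption)
    [({a}, b) for a in LEFT for b in LEFT if a != b]
    + [({a}, "~" + b) for a in LEFT for b in RIGHT]
    + [({"m5", "m6"}, b) for b in RIGHT[2:]]
    + [({"m5", "m6"}, "~" + b) for b in LEFT]
    + [({"m7"}, b) for b in ("m5", "m6", "m8")]
)

def closure(letters):
    """Smallest fixed point of the prediction operator containing `letters`."""
    L, changed = set(letters), True
    while changed:
        changed = False
        for premise, conclusion in RULES:
            if premise <= L and conclusion not in L:
                L.add(conclusion)
                changed = True
    return frozenset(L)

# Step 0: candidate intents are the closures of rule premises.
candidates = {closure(p) for p, _ in RULES}
concepts, seen = {}, set()
while candidates:                                    # steps 2-5
    for B in candidates:                             # step 3: collect the extent
        A = frozenset(g for g in ROWS if closure(intent(g) & B) == B)
        if A:
            concepts[B] = A
    seen |= candidates                               # step 4: joins of candidates
    candidates = {closure(B | C) for B in candidates for C in candidates} - seen

for B, A in sorted(concepts.items(), key=lambda kv: sorted(kv[1])):
    print(sorted(A), "<->", sorted(m for m in B if not m.startswith("~")))
```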
8 An example

Fig. 1: Initial context.

The Ksquares context considered earlier is simple but illustrative. A more sophisticated example should contain more interactions between concepts, both in their extents and intents, and should carry more noise.

To measure performance, we set up several modifications of a single context. The modifications differ in the noise level, and a number of object duplicates may be added when more data is necessary. The initial context is composed of rectangular blocks that are easy to recognize as formal concepts (we call them "solid" concepts). The set of experiments is based on:

1. Kexp – the initial context, depicted in Fig. 1;
2. Kx3 – the same as Kexp, except that it contains 3 duplicates of each Kexp object;
3. Kx3.n05 = Kx3 + randomly inflicted binary noise, Bernoulli distributed with p = 0.05;
4. Kx3.n04 = Kx3 + noise, p = 0.04;
5. Kx3.n03 = Kx3 + noise, p = 0.03.

The primary characteristics of the datasets are presented in Table 5.

    Context   |G|  |M|  # Concepts  # Solid  # Logical  Noise
    Kexp       61    8   5 + 4 + 2        5      6 + 4      0
    Kx3       183    8   5 + 4 + 2        5      6 + 4      0
    Kx3.n05   183    8   5 + 4 + 2        5      6 + 4   0.05
    Kx3.n04   183    8   5 + 4 + 2        5      6 + 4   0.04
    Kx3.n03   183    8   5 + 4 + 2        5      6 + 4   0.03

Table 5: Experimental data summary. |G| and |M| stand for the numbers of objects and attributes in a context. The number of concepts is a sum over three types of concepts: solid concepts, which appear in Fig. 1 as solid blocks of ones; join-concepts, which are made of two such blocks; and meet-concepts, which are produced by the intersection of two solid concepts. Note that the logical approach is slightly specific: the model excludes meet-concepts from consideration and accepts attribute negations, while respecting the empty concept. Thus the # Logical column sums solid and join-concepts, adding the empty concept to the solid ones.

The first stage of the closure-search procedure is the rule extraction routine. Following the method discussed in Section 7, a computer program was implemented to perform semantic probabilistic inference. For each context a set of rules was obtained and eventually used in the closure-search procedure.

    Context   # Rules  # Rules (p > 0.5)  # Rules (p > 0.9)
    Kx3.n05      1712               1358                682
    Kx3.n04      1193                967                497
    Kx3.n03      1103                916                532

Table 6: Rule extraction summary. # Rules is the total number of rules discovered via semantic probabilistic inference; the next two columns give the numbers of rules whose conditional probability exceeds the thresholds 0.5 and 0.9. As the noise level increases, the context becomes less and less clear and requires more and more rules to describe its attribute associations; a minor noise produces an insignificant effect and allows the problem to be solved almost exactly.
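The experimental variants above are straightforward to regenerate. The sketch below shows one plausible way to build the Kx3 and Kx3.n0p contexts (object duplication followed by independent Bernoulli bit flips); since Kexp itself is only shown schematically in Fig. 1, the base matrix here is a small stand-in, not the actual experimental data.

```python
import random

# Sketch only: building the experimental variants of Section 8 -- three duplicates
# of every object (K_x3) plus independent Bernoulli bit flips (K_x3.n0p).
# The actual K_exp of Fig. 1 is not reproduced here; `base` is a stand-in example.

def duplicate(context, times=3):
    """K_x3: every object row repeated `times` times."""
    return [row[:] for row in context for _ in range(times)]

def add_noise(context, p, seed=0):
    """K_x3.n0p: flip every cell independently with probability p."""
    rng = random.Random(seed)
    return [[1 - cell if rng.random() < p else cell for cell in row]
            for row in context]

base = [            # stand-in for K_exp: two rectangular "solid" concepts
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]

k_x3 = duplicate(base, 3)
k_x3_n05 = add_noise(k_x3, 0.05)
print(len(k_x3), "objects;", sum(map(sum, k_x3_n05)), "ones after 5% noise")
```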
Following [10], we compare the original concept lattice O with the predicted one E and measure the method's performance by two ratios:

    Precision = |O ∩ E| / |E|,   Recall = |O ∩ E| / |O|.

The experimental results are presented in Table 7. In addition to the performance indexes, the data are given separately for the solid concepts and the join-concepts (the first and the second summand in each cell, respectively).

    Context   |O|      |O ∩ E|  |O \ E|  |E \ O|  Precision    Recall
    Kx3       6 + 4    6 + 4    0 + 0    0 + 0    1.0 + 1.0    1.0 + 1.0
    Kx3.n03   6 + 4    6 + 4    0 + 0    0 + 0    1.0 + 1.0    1.0 + 1.0
    Kx3.n04   6 + 4    6 + 1    0 + 3    0 + 0    1.0 + 1.0    1.0 + 0.25
    Kx3.n05   6 + 4    6 + 3    0 + 1    0 + 1    1.0 + 0.75   1.0 + 0.75

Table 7: Concept recovery summary on the same contexts. |O| is the total number of expected concepts, |O ∩ E| the number of correctly predicted concepts, |O \ E| the number of lost concepts, and |E \ O| the number of incorrectly predicted concepts.

It was rather easy for the closure-search procedure to determine all formal concepts in the absence of noise. However, even on the noisy contexts the algorithm was able to restore the original set of concepts with moderate accuracy. All probabilistic concepts encountered by the algorithm can essentially be associated with their original images among the ordinary concepts, although some non-primal concepts leaked away. Nevertheless, probabilistic formal concepts appear to perform more accurately than the stability approach [4].

Indeed, the main advantage may not even be the accuracy of the method: probabilistic formal concepts are able to discover concepts on large data frames. The estimated computational complexity of SPI is |M|^{d+2} · |G|, and that of the closure search appears to be |Π| · |M|^c · |G|, where 3 < c < 4 (the estimate is empirical and still needs to be verified). Noise induces extra complexity, but a 2-core 2.4 GHz Pentium 4 computer is enough to solve a 321×26 context with 10% noise in about 10 minutes, while a 3% noise task completes in 5 minutes.

9 Conclusion

The introduced method has been shown, both experimentally and theoretically, to be correct and accurate. Some additional experiments were proposed in earlier works [12, 20]. Probabilistic formal concepts are also very attractive because they can serve to construct exact concept lattices directly from real, noisy raw data, instead of filtering an overpopulated and possibly exponentially sized concept lattice. Further work includes a theoretical evaluation of the computational complexity as well as more sophisticated experiments on big datasets. We also plan to compare our results with well-known classification methods in terms of prediction accuracy and speed.

References

1. Ganter, B.: Formal Concept Analysis: Methods and Applications in Computer Science. TU Dresden (2003)
2. Ganter, B., Wille, R.: Formal Concept Analysis – Mathematical Foundations. Springer, Berlin-Heidelberg-New York (1999)
3. Hempel, C.G.: Inductive Inconsistencies. Synthese 12, 439-469 (1960)
4. Kuznetsov, S.O., Makhalova, T.P.: Concept interestingness measures: a comparative study. In: Proc. 12th International Conference on Concept Lattices and Their Applications (CLA 2015), Clermont-Ferrand, France, October 13-16, 2015. CEUR Workshop Proceedings, vol. 1466, pp. 59-72 (2015)
5. Kuznetsov, S.O.: On Stability of a Formal Concept. Annals of Mathematics and Artificial Intelligence 49, 101-115 (2007)
6. Vityaev, E.E., Martynovich, V.V.: Probabilistic Formal Concepts with Negation. In: Voronkov, A., Virbitskaite, I. (eds.) Perspectives of System Informatics. LNCS, vol. 8974, pp. 385-399 (2015)
7. Prokasheva, O., Onishchenko, A., Gurov, S.: Classification based on formal concept analysis and biclustering: possibilities of the approach. Computational Mathematics and Modeling 23(3) (2012)
8. Emilion, R., Levy, G.: Size of random Galois lattices. Discrete Applied Mathematics 157, 2945-2957 (2009)
9. Buzmakov, A., Kuznetsov, S.O., Napoli, A.: Concept Stability as a Tool for Pattern Selection. CEUR Workshop Proceedings, vol. 1257 (ECAI 2014), pp. 51-58 (2014)
10. Piskova, L., Pero, S., Horvath, T., Krajci, S.: Mining Concepts from Incomplete Datasets Utilizing Matrix Factorization. In: Szathmary, L., Priss, U. (eds.) Proc. CLA 2012. CEUR Workshop Proceedings, vol. 972, pp. 33-44 (2012)
11. Kashnitsky, Y., Ignatov, D.I.: Can FCA-based Recommender System Suggest a Proper Classifier? CEUR Workshop Proceedings, vol. 1257 (ECAI 2014), pp. 17-26 (2014)
12. Vityaev, E.E., Demin, A.V., Ponomaryov, D.K.: Probabilistic Generalization of Formal Concepts. Programming and Computer Software 38(5), 219-230 (2012)
13. Ganter, B., Obiedkov, S.: Implications in Triadic Formal Contexts. TU Dresden, Springer (2004)
14. Missaoui, R., Kwuida, L.: Implications in Triadic Formal Contexts. In: 9th International Conference, ICFCA 2011, Nicosia, Cyprus. Springer (2011)
15. Kovalerchuk, B., Vityaev, E.: Data Mining in Finance: Advances in Relational and Hybrid Methods. Kluwer Academic Publishers (2000)
16. Rehder, B.: Categorization as causal reasoning. Cognitive Science 27(5), 709-748 (2003)
17. Rosch, E., Lloyd, B.B.: Principles of categorization. In: Cognition and Categorization. Lawrence Erlbaum Associates (1978)
18. Quan, T.T., Hui, S.C., Cao, T.H.: A Fuzzy FCA-based Approach to Conceptual Clustering for Automatic Generation of Concept Hierarchy on Uncertainty Data. In: Belohlavek, R., Snasel, V. (eds.) Proc. CLA 2004. CEUR Workshop Proceedings, vol. 110 (2004)
19. Speransky, S.O.: Logic of probability and probability logic. PhD thesis, Novosibirsk State University, Novosibirsk (2013) (in Russian)
20. Vityaev, E.E., Neupokoev, N.V.: Formal model of perception and pattern as fixed point of anticipations. In: Approaches to the Thinking Modeling, pp. 155-172. URSS Editorial, Moscow (2014) (in Russian)