Recovering Noisy Contexts with Probabilistic Formal Concepts⋆

Vitaliy V. Martynovich¹ and Euvgeniy E. Vityaev¹,²

¹ Novosibirsk State University, Novosibirsk, Russia, vilco@ya.ru
² Sobolev Institute of Mathematics, Novosibirsk, Russia, evgenii.vityaev@math.nsc.ru

⋆ The work is financially supported by the Russian Foundation for Basic Research, grant 15-07-03410-a.

Abstract. Uncertainty in the environment typically generates noisy concept alternatives and leads to an overpopulated concept lattice. From a computational point of view, straightforward filtering of the noisy concept lattice suffers from an exponential-size blow-up; from a semantic point of view, it runs into numerous ambiguities due to overfitting. We bypass the filtering problem by applying a probabilistic approach: a probabilistic generalization of formal concepts which seems to avoid the monstrous combinatorial complexity of constructing the complete concept lattice. The theoretical basis of the method is described, together with a ready-to-use noise-resistant algorithm. The algorithm has been tested and shows moderate precision and recall on several datasets, including a toy one exposed to 3, 4, or 5% random noise.

Keywords: formal concept analysis, concept lattice, inductive learning, data mining, association rules, classification task

1 Introduction

Formal concepts may be successfully used as classification units [1, 2]. However, treating the concept lattice as a plain graph, as Formal Concept Analysis (FCA) does, works well only until the data become uncertain: the lattice can then grow prohibitively large even on small datasets.

There have been attempts to get rid of noise in data by concept selection or filtering. For example, measures of concept stability have been shown to pick out the most reliable formal concepts [4, 5]. It was demonstrated that the stability index is relevant to data mining tasks and possesses several attractive properties [9]. Nevertheless, this is still not enough for uncertain environments [4]. The overload caused by noisy clones makes the computation intractable even on small datasets: formally speaking, a zero-populated context superimposed with random Bernoulli noise is expected to produce lattices of exponential size [8].

Approaches based on hypothesis making have also been analyzed: the performance of such models still suffers in practical settings [7]. The closest related research is probably fuzzy concept analysis, as in [18].

In this paper we reconsider the problem of handling possible noise in data by means of probability and logic. The origin of the probabilistic pattern for formal concepts lies in cognitive science, where such patterns are closely related to the "natural classes" [16]. We focus on developing a context recovery method with the following key capabilities in mind:

1. stability of the recovered concept lattice with respect to a possible minor noise;
2. computational tractability, avoiding filtering of the whole concept lattice;
3. handling the prediction ambiguity problem;
4. a relationship with the theory of category formation [17, 16].

The first step was made in [12], where a probabilistic generalization of formal concepts was introduced. The next step is to equip concepts with attribute negations and to develop a logical language for probabilistic reasoning on a context.
We will also prove some technical facts about the consistency of such probabilistic reasoning.

2 Formal concept analysis foundations

This section gives a brief overview of the Formal Concept Analysis framework [1, 2, 13] used in the paper. A dataset is represented by an object–attribute cross-table. Formally speaking:

Definition 1. A formal context is a triple (G, M, I), where G and M are sets of arbitrary nature and I ⊆ G × M is a binary relation.

On a formal context (or simply context) the derivation operator ′ is defined:

Definition 2. Let A ⊆ G, B ⊆ M. Then
1. A′ = {m ∈ M | ∀g ∈ A, (g, m) ∈ I};
2. B′ = {g ∈ G | ∀m ∈ B, (g, m) ∈ I};
3. the pair (A, B) is called a formal concept if A′ = B and B′ = A.

Generally speaking, Formal Concept Analysis concentrates on the concept lattice that arises on concept extents from the natural subset order. We, however, aim to avoid considering the concept lattice and to exploit intrinsic properties of the data instead. The core notion here is the implication.

Definition 3. An implication is a pair (B, C), B, C ⊆ M, which we write as B → C. The implication B → C is true on K = (G, M, I) if for every g ∈ G either B ⊈ g′ or C ⊆ g′. We denote the set of all true implications by Imp(K).

Implications not only form a conceptual bridge from FCA to logical structures; they are also an essential means of machine reasoning, and of prediction in particular.

Definition 4. For any set of implications L we construct the operator of direct inference f_L, which adds all conclusions of applicable implications:

    f_L(X) = X ∪ {C | B ⊆ X, (B → C) ∈ L}.

The following theorem characterizes concepts by means of fixed points.

Theorem 1 (see [2]). For any set B ⊆ M: f_Imp(K)(B) = B ⇔ B′′ = B.

The application of the theorem may be illustrated on a simple formal context.

    m1 m2 m3
     1  1  0
     1  1  1
     0  1  1
     0  0  0

Table 1: A very simple formal context K0.

The concept lattice construction task can be reformulated by means of implications and the inference operator. It is easy to see from Table 1 that attributes m1 and m3 determine the class of an object. In fact, m1 implies m2, and so does m3; this is written as m1 → m2 and m3 → m2. The sets {m1, m2}, {m2, m3} and {m1, m2, m3} are intents of formal concepts and, accordingly, fixed points of the direct inference operator. For example,

    {m1} ↦ {m1, m2} ↦ {m1, m2}   (two applications of f_Imp(K0)),

so {m1, m2} is indeed a fixed point and a formal concept intent at the same time.
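To make the derivation operators and the fixed-point characterization of Theorem 1 concrete, here is a minimal Python sketch (our own illustration, not code from the paper) run on the toy context K0 of Table 1. The dictionary representation and all function names are assumptions of the sketch, and only singleton-premise implications are enumerated.

```python
# A minimal sketch (not from the paper): derivation operators and the
# direct-inference closure on the toy context K0 of Table 1.
# The representation (dict of attribute sets) is an assumption of this sketch.

K0 = {                      # object -> set of attributes (Table 1)
    "g1": {"m1", "m2"},
    "g2": {"m1", "m2", "m3"},
    "g3": {"m2", "m3"},
    "g4": set(),
}

def extent(B):
    """B' : objects having every attribute in B."""
    return {g for g, attrs in K0.items() if set(B) <= attrs}

def intent(A):
    """A' : attributes shared by every object in A."""
    attrs = set.union(*K0.values())
    return {m for m in attrs if all(m in K0[g] for g in A)}

def implications():
    """True implications B -> C of the context (here: singleton premises only)."""
    attrs = set.union(*K0.values())
    imps = []
    for m in attrs:
        concl = intent(extent({m})) - {m}   # {m}'' minus m itself
        if concl:
            imps.append(({m}, concl))
    return imps

def direct_inference(X, imps):
    """f_L(X): add conclusions of applicable implications until a fixed point."""
    X = set(X)
    changed = True
    while changed:
        changed = False
        for B, C in imps:
            if B <= X and not C <= X:
                X |= C
                changed = True
    return X

imps = implications()
print(direct_inference({"m1"}, imps))   # {'m1', 'm2'} -- a fixed point
print(intent(extent({"m1"})))           # {m1}'' = {'m1', 'm2'}, as in Theorem 1
```

On K0 this reproduces the closure {m1}′′ = {m1, m2} discussed above.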
3 Probabilistic logic on a formal context

Let us add some noise to K0. We also extend K0 by adding redundant duplicates of objects: this keeps the noise level low enough to make concept recovery practically possible. Every single alteration changes the concept lattice considerably. The first context (Table 2) is equivalent to K0 above and has the same concept lattice. The second one (Table 3), however, generates a lot of side concepts provoked by noise: the sets {m2} and {m1, m3} also become formal concepts. The number of side concepts grows as more noise comes in, and the dependency tends to be asymptotically exponential [8].

Stability may be achieved in various ways. The most obvious one is to compute some stability index in order to evaluate whether a concept of the noisy context is good enough or is merely produced by noise [4, 5]. Another one rests on a different idea: instead of measuring the stability of concepts, the stability of implications is measured. An essential way to do this is to exploit the likelihood of attribute implications; we will, moreover, generalize implications to logical formulas.

    m1 m2 m3
     1  1  0
     1  1  1
     0  1  1
     1  1  0
     1  1  1
     0  1  1
     1  1  0
     1  1  1
     0  1  1
     0  0  0

Table 2: K0 populated with duplicates.

    m1 m2 m3
     0  1  0
     1  1  1
     0  1  1
     1  1  0
     1  0  1
     0  1  1
     1  1  0
     1  1  1
     0  1  1
     0  0  0

Table 3: Knoise, the same context with a little noise.

Definition 5. For a formal context K = (G, M, I) we introduce the classical logical notions:
– L_K is the set of letters; it includes every m ∈ M as well as the negations ¬m;
– Φ_K is the set of formulas, defined inductively: every letter is a formula, and for any φ, ψ ∈ Φ_K the expressions φ ∧ ψ, φ ∨ ψ, φ → ψ and ¬φ are formulas, too.

Remark 1. For brevity, we write ⋀L = ⋀_{P ∈ L} P (and ⋀L = 1 if L = ∅). Similarly, ¬L = {¬P | P ∈ L}.

For every object g, a logical model K_g of the object is defined. We say that the object g respects a formula φ ∈ Φ_K if the formula is true in the model K_g; we write this as g ⊨ φ ⇔ K_g ⊨ φ. The set G_φ = {g ∈ G | g ⊨ φ} is called the support of φ.

Definition 6. Let µ be an arbitrary probability measure on the set G, i.e. a finite, countably additive measure. Then the contextual probability measure is defined by

    ν : Φ_K → [0, 1],   ν(φ) = µ({g | g ⊨ φ}).

The most common understanding of formula probability can be linked to the well-known confidence index for context implications,

    conf(X → Y) = |supp(X ∪ Y)| / |supp(X)|.

The formula probability expresses exactly the same if we keep things simple and take µ to be the uniform counting measure, µ({g}) = 1/|G|. For practical applications, here and further on we assume that G is finite and contains no objects of zero measure, i.e. µ({g}) ≠ 0 for every g ∈ G.

Definition 7. A set of attributes (letters) L is compatible if it is satisfied by at least one object, i.e. G_⋀L ≠ ∅. The same may be expressed as ν(⋀L) > 0.

Now consider a set of letters L = {m1, ..., mk, m} ⊆ L_K. The formula m1 ∧ m2 ∧ ... ∧ mk → m looks like the classical context implication ({m1, ..., mk}, {m}), except that negations of attributes may now appear, as in m1 ∧ ¬m2 ∧ ... ∧ mk → ¬m. The notion of an implication as a formula is captured by the definition of a rule:

Definition 8. Let C, H_i ∈ L_K, C ∉ {H1, H2, ..., Hk}, k ≥ 0. Then:
1. the rule R = (H1, H2, ..., Hk → C) is the implication (H1 ∧ H2 ∧ ... ∧ Hk → C);
2. the premise R← of the rule R is the set of letters {H1, H2, ..., Hk};
3. the conclusion is R→ = C;
4. if R1← = R2← and R1→ = R2→, then R1 = R2.

Definition 9. The probability of a rule R is the conditional probability

    η(R) = ν(R→ | R←) = ν(⋀R← ∧ R→) / ν(⋀R←).

If ν(⋀R←) is zero, the probability of the rule remains undefined.

Keeping an eye on K0, let us see what happens on Knoise with the implications m1 → m2 and m3 → m2. They are no longer contextual tautologies, but we can still stick to the corresponding rules with reasonable likelihoods: η(m1 → m2) = 4/5 and η(m3 → m2) = 5/6.

The core idea of the approach is to exploit Theorem 1: the direct inference operator can easily be adapted to employ probabilistic rules instead of formal context implications.

Definition 10. The prediction operator Π on a set of rules R works as follows:

    Π_R(L) = L ∪ {C | ∃R ∈ R : R← ⊆ L, R→ = C}.

Definition 11. The closure L̄ of a set of letters L is the smallest fixed point of the prediction operator: L̄ = Π_R^∞(L).
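The contextual probability ν, the rule probability η of Definition 9, and the prediction closure of Definitions 10–11 are easy to compute directly on the noisy context of Table 3. The following sketch (again our own, with negations encoded as strings prefixed by "~" and a uniform counting measure) reproduces the values quoted above.

```python
from fractions import Fraction

# Sketch only: rule probability eta and the prediction closure Pi on the noisy
# context of Table 3. The "~m" encoding of negations and all names are our
# assumptions, not notation from the paper.

K_noise = [
    {"m2"}, {"m1", "m2", "m3"}, {"m2", "m3"},
    {"m1", "m2"}, {"m1", "m3"}, {"m2", "m3"},
    {"m1", "m2"}, {"m1", "m2", "m3"}, {"m2", "m3"},
    set(),
]

def holds(letter, obj):
    """A letter is either an attribute m or its negation '~m'."""
    return (letter[1:] not in obj) if letter.startswith("~") else (letter in obj)

def nu(letters):
    """Contextual probability of a conjunction of letters (uniform measure)."""
    sat = sum(all(holds(l, o) for l in letters) for o in K_noise)
    return Fraction(sat, len(K_noise))

def eta(premise, conclusion):
    """Conditional probability of the rule premise -> conclusion (Definition 9)."""
    p = nu(premise)
    return None if p == 0 else nu(set(premise) | {conclusion}) / p

def prediction_closure(letters, rules):
    """Pi_R iterated up to its smallest fixed point (Definitions 10-11)."""
    L = set(letters)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            if premise <= L and conclusion not in L:
                L.add(conclusion)
                changed = True
    return L

print(eta({"m1"}, "m2"))                    # 4/5, as in the text
print(eta({"m3"}, "m2"))                    # 5/6
print(eta({"~m1", "m3"}, "m2"))             # 1
rules = [({"m1"}, "m2"), ({"m3"}, "m2")]
print(prediction_closure({"m1"}, rules))    # {'m1', 'm2'}
```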
4 Rule classes

Note that Definition 10 accepts an arbitrary set of rules. To produce a relevant and consistent set of generalized concepts, additional restrictions on this set are needed. Following [6], we will prove a compatibility theorem and ensure the correctness property of the prediction closure operator.

Definition 12 (subrule). R1 ⊏ R2 if R1← ⊂ R2← and R1→ = R2→.

Definition 13 (refinement). R1 ≻ R2 if R2 ⊏ R1 and η(R1) > η(R2).

For example, the rule m1 → m2 of Knoise has the only unconditional subrule (∅ → m2) ⊏ (m1 → m2). This is not a refinement, since η(∅ → m2) = 8/10 = 4/5 = η(m1 → m2). However, (m3 → m2) ≻ (∅ → m2), because η(∅ → m2) = 8/10 < 5/6 = η(m3 → m2).

The class M1 requires a rule to have a conditional probability greater than the unconditional probability of C, i.e. the rule is guaranteed to be useful in reasoning:

Definition 14. R ∈ M1(C) ⇔ η(R) > ν(R→) and R→ = C.

The class M2 requires a rule to be specific: its probability cannot be improved by refining the rule.

Definition 15. R ∈ M2(C) ⇔ R ∈ M1(C) and [R ⊏ R̃ ⇒ η(R̃) ≤ η(R)].

The rule (m3 → m2) can be refined up to (¬m1 ∧ m3 → m2) due to the inequality η(m3 → m2) = 5/6 < 1 = 3/3 = η(¬m1 ∧ m3 → m2). The latter rule satisfies all the M2 conditions, and thus (¬m1 ∧ m3 → m2) ∈ M2(m2).

The class Imp contains all exact implications, in particular every contextual tautology:

Definition 16. R ∈ Imp(C) ⇔ R→ = C and η(R) = 1.

We also consider compound classes over the entire set of letters:

Definition 17. M1 = ⋃_{C ∈ L_K} M1(C).

Remark 2. M2 and Imp are defined similarly.

All exact implications are indeed necessary to ensure the completeness property of the prediction operator. In turn, a set of rules must consist only of M2 rules in order to obtain the consistency property. A set of letters L is called consistent if it does not contain an atom C together with its negation ¬C.

Definition 18. If Imp ⊂ R, then the set of rules R is called complete.

Definition 19. Any set of rules R ⊆ M2 is called a system of rules.

5 Prediction consistency

Definition 20. A set of letters L is consistent if C ∈ L ⇒ ¬C ∉ L.

Π_R must avoid inconsistent inferences [3]. The following theorem is the main theoretical result of the paper: it shows that predictions are consistent and compatible (see Definition 7). For the proof and technical details, see [6].

Theorem 2 (Compatibility). If L is compatible, then Π_R(L) is also compatible and consistent for any system of rules R.

Somewhat more difficult, but still solvable, is the question of the inconsistency of prediction closures. Let R be a complete set of rules and Π_R the corresponding prediction operator. It is important to note that rule systems containing M2 are always complete.

Theorem 3. If L is incompatible, then Π_R(L) is inconsistent and incompatible.
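Under the same assumptions as the previous sketch, one can test the M1 condition of Definition 14 and approximate the M2 condition of Definition 15 by brute force. The helper names, the bound on premise length, and the restriction to one-letter refinements are simplifications of ours, feasible only on toy contexts.

```python
from fractions import Fraction
from itertools import combinations

# Sketch only: the M1 test and an approximate search for most specific rules
# on the noisy context of Table 3 (same encoding as the previous sketch).

K_noise = [
    {"m2"}, {"m1", "m2", "m3"}, {"m2", "m3"},
    {"m1", "m2"}, {"m1", "m3"}, {"m2", "m3"},
    {"m1", "m2"}, {"m1", "m2", "m3"}, {"m2", "m3"},
    set(),
]
LETTERS = ["m1", "~m1", "m2", "~m2", "m3", "~m3"]

def holds(letter, obj):
    return (letter[1:] not in obj) if letter.startswith("~") else (letter in obj)

def nu(letters):
    sat = sum(all(holds(l, o) for l in letters) for o in K_noise)
    return Fraction(sat, len(K_noise))

def eta(premise, conclusion):
    p = nu(premise)
    return None if p == 0 else nu(set(premise) | {conclusion}) / p

def in_M1(premise, conclusion):
    """M1: the rule must raise the probability of its conclusion (Definition 14)."""
    e = eta(premise, conclusion)
    return e is not None and e > nu({conclusion})

def most_specific(conclusion, max_premise=2):
    """Brute-force search for M1 rules that no one-letter refinement improves --
    an approximation of the M2 condition of Definition 15, toy contexts only."""
    neg = "~" + conclusion if not conclusion.startswith("~") else conclusion[1:]
    candidates = [l for l in LETTERS if l not in (conclusion, neg)]
    best = []
    for k in range(max_premise + 1):
        for prem in combinations(candidates, k):
            prem = set(prem)
            if not in_M1(prem, conclusion):
                continue
            e = eta(prem, conclusion)
            refinements = [eta(prem | {l}, conclusion)
                           for l in candidates if l not in prem]
            if all(r is None or r <= e for r in refinements):
                best.append((prem, conclusion, e))
    return best

print(in_M1({"m1"}, "m2"))      # False: 4/5 is not above nu(m2) = 8/10
print(in_M1({"m3"}, "m2"))      # True: 5/6 > 8/10
for rule in most_specific("m2"):
    print(rule)                  # the rule ~m1, m3 -> m2 with probability 1 appears here
```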
6 Probabilistic formal concepts

The fixed points of the prediction operator are natural candidates for concept intents. What about concept extents? The principles proposed in [5, 9] give us a cue. The idea is to take all attribute sets that are closure preimages of B, i.e. all C with Π(C) = B, and to compose their supports into a derived concept extent A. This allows the actual concept to be restored by applying the prediction operator, and gathers all objects of the same class into a joint extent.

For example, let Ksquares be a context depicted as two disjoint squares (which are two independent formal concepts). To bring in extra complexity, we also alter some entries:

    G   m1 m2 m3 m4 m5 m6 m7 m8
    g1   1  1  1  1  1  0  0  0
    g2   1  1  1  1  0  0  0  0
    g3   1  1  1  1  0  0  0  0
    g4   1  1  1  1  0  0  0  0
    g5   0  0  0  0  1  1  1  1
    g6   0  0  0  0  1  1  0  1
    g7   0  0  0  0  1  1  1  1
    g8   0  0  0  0  1  1  1  1

Table 4: A two-concept context Ksquares with a minor noise.

Note that the most specific rules belonging to M2 are (m_i → ¬m_j) for i = 1...4 and j = 5...8, whereas the rule m5 → ¬m_i, i = 1...4, is not: there is a more specific rule refining it, since η(m5 ∧ m6 → ¬m1) = 1 > 4/5 = η(m5 → ¬m1). This is how noise is handled, being encapsulated in probability and refinement.

To pick up the first object of Ksquares, the prediction operator first computes a closure, Π(g1′) = {m1, m2, m3, m4, ¬m5, ¬m6, ¬m7, ¬m8}; then all objects with the same closure are composed into a concept with the extent {g1, g2, g3, g4}.

Definition 21. By a probabilistic formal concept on K = (G, M, I) we mean any pair (A, B) satisfying

    Π(B) = B,   A = ⋃_{C ⊂ B, Π(C) = B} G_C.

Our choice is also justified by the following statement, relating probabilistic and ordinary formal concepts on the same context.

Theorem 4 (Ordinary concepts inclusion [12]). Let K be a formal context.
1. If (A, B) is an ordinary concept on K, then there is a probabilistic concept (N, M) such that A ⊆ N and B ⊆ M.
2. If (N, M) is a probabilistic concept on K, then there is a set of ordinary concepts C such that ∀(A, B) ∈ C, Π(B) = M, and N = ⋃_{(A,B)∈C} A.

7 Probabilistic concepts discovery

For practical applications, a computational problem has to be solved: the task is still exponentially hard if a full enumeration of M2 is required. Semantic probabilistic inference as an enumeration procedure has been described in detail in [15]. The idea is to perform a kind of greedy search combined with a branch-and-bound search on the inference tree, aiming to obtain a subset of M2 which is sufficient for most practical tasks.

Definition 22. R is a probabilistic law if for any R̃, (R̃ ⊏ R) ⇒ (R ≻ R̃).

Definition 23. The rule R̃ is semantically probabilistically inferred from the rule R, written R ⊳ R̃, if R and R̃ are probabilistic laws and R̃ ≻ R.

Definition 24. A probabilistic law R is strongest, written R ∈ SPL, if there is no other probabilistic law R̃ such that R̃ ≻ R.

Proposition 1. All strongest probabilistic laws belong to M2.

The rule extraction routine is based on the Semantic Probabilistic Inference (SPI) approach. It requires each path in the inference graph to be a sequence of semantic inferences:

Definition 25. An SPI is a sequence of rules R0 ⊳ R1 ⊳ R2 ⊳ ... ⊳ Rm such that R0← = ∅ and Rm is a strongest probabilistic law.

Now let us assume that some system of rules R on a context K has already been discovered by semantic probabilistic inference. The definition of a probabilistic concept implies the following closure-search procedure (a sketch of the complete loop in code is given at the end of this section).

1. Set the step counter k = 0 and generate the set C^(0) = {Π_R(R←) | R ∈ R}. In fact, this may be an arbitrary family of letter sets to be extended up to their probabilistic concept closures. The set C^(0) is almost always redundant, but it should be enough to cover all statistically significant attribute sets.
2. At step k, if C^(k) = ∅, the algorithm finishes execution and outputs the list of detected probabilistic concepts.
3. At step k, the set A = {g ∈ G | Π_R(g′ ∩ B) = B} is computed for each B ∈ C^(k). If A ≠ ∅, the pair (A, B) is added to the list of found concepts. This corresponds to a join operation on the concept lattice and subsequently climbs to superordinate levels of the lattice.
4. The set C^(k+1) = {Π_R(B ∪ C) | B, C ∈ C^(k)} \ C^(k) is generated. In fact, the actual prediction closures are computed at this step.
5. Let k := k + 1 and go to step 2.

The algorithm can be applied to the context recovery task as well as to a wide variety of data mining problems, such as classification and clustering. In the final section we focus on handling noise in a toy, yet rather hard, context recovery task.
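The following sketch shows one possible reading of the closure-search procedure on the Ksquares context of Table 4. The rule system is hand-picked rather than obtained by the SPI routine, and the "~m" encoding of negated attributes is our assumption; on this toy context the loop recovers the two expected concepts ({g1, ..., g4}, {m1, ..., m4}) and ({g5, ..., g8}, {m5, ..., m8}) despite the two noisy entries.

```python
# Sketch only (our own reading of the closure-search procedure, not the authors'
# implementation): discovering probabilistic concepts of K_squares (Table 4)
# from a hand-picked rule system; "~m" encodes the negation of attribute m.

ATTRS = ["m1", "m2", "m3", "m4", "m5", "m6", "m7", "m8"]
ROWS = {          # Table 4
    "g1": "11111000", "g2": "11110000", "g3": "11110000", "g4": "11110000",
    "g5": "00001111", "g6": "00001101", "g7": "00001111", "g8": "00001111",
}

def intent(g):
    """Full logical intent of an object: its attributes plus negations of the rest."""
    row = ROWS[g]
    return frozenset(m if row[i] == "1" else "~" + m for i, m in enumerate(ATTRS))

LEFT, RIGHT = ATTRS[:4], ATTRS[4:]
RULES = (                        # a hand-picked M2-style rule system (assumption)
    [({a}, b) for a in LEFT for b in LEFT if a != b]
    + [({a}, "~" + b) for a in LEFT for b in RIGHT]
    + [({"m5", "m6"}, b) for b in RIGHT[2:]]
    + [({"m5", "m6"}, "~" + b) for b in LEFT]
    + [({"m7"}, b) for b in ("m5", "m6", "m8")]
)

def closure(letters):
    """Smallest fixed point of the prediction operator containing `letters`."""
    L, changed = set(letters), True
    while changed:
        changed = False
        for premise, conclusion in RULES:
            if premise <= L and conclusion not in L:
                L.add(conclusion)
                changed = True
    return frozenset(L)

# Step 0: candidate intents are the closures of rule premises.
candidates = {closure(p) for p, _ in RULES}
concepts, seen = {}, set()
while candidates:                                    # steps 2-5
    for B in candidates:                             # step 3: collect the extent
        A = frozenset(g for g in ROWS if closure(intent(g) & B) == B)
        if A:
            concepts[B] = A
    seen |= candidates                               # step 4: joins of candidates
    candidates = {closure(B | C) for B in candidates for C in candidates} - seen

for B, A in sorted(concepts.items(), key=lambda kv: sorted(kv[1])):
    print(sorted(A), "<->", sorted(m for m in B if not m.startswith("~")))
```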
8 An example

Fig. 1: Initial context.

The Ksquares context considered earlier is simple but illustrative. A more sophisticated example should contain more interactions between concepts, both in their extents and intents, and should carry more noise.

To measure performance, we set up several modifications of a single context. The modifications differ in the noise level, and a number of object duplicates may be added when more data is necessary. The initial context is composed of rectangular blocks that are easy to recognize as formal concepts (we call them "solid" concepts). The set of experiments is based on:

1. Kexp – the initial context, depicted in Fig. 1;
2. Kx3 – the same as Kexp, except that it contains 3 duplicates of each Kexp object;
3. Kx3.n05 = Kx3 + randomly inflicted binary noise, Bernoulli distributed with p = 0.05;
4. Kx3.n04 = Kx3 + noise, p = 0.04;
5. Kx3.n03 = Kx3 + noise, p = 0.03.

The primary characteristics of the datasets are presented in Table 5.

    Context   |G|  |M|  # Concepts  # Solid  # Logical  Noise
    Kexp       61    8   5 + 4 + 2        5      6 + 4      0
    Kx3       183    8   5 + 4 + 2        5      6 + 4      0
    Kx3.n05   183    8   5 + 4 + 2        5      6 + 4   0.05
    Kx3.n04   183    8   5 + 4 + 2        5      6 + 4   0.04
    Kx3.n03   183    8   5 + 4 + 2        5      6 + 4   0.03

Table 5: Experimental data summary. |G| and |M| stand for the numbers of objects and attributes in a context. The number of concepts is a sum over three types of concepts: solid concepts, which appear in Fig. 1 as solid blocks of ones; join-concepts, which are made of two such blocks; and meet-concepts, which are produced by the intersection of two solid concepts. Note that the logical approach is slightly specific: the model excludes meet-concepts from consideration and accepts attribute negations, while respecting the empty concept. Thus the # Logical column sums solid and join-concepts, adding the empty concept to the solid ones.

The first stage of the closure-search procedure is the rule extraction routine. Following the method discussed in Section 7, a computer program was implemented to perform semantic probabilistic inference. For each context a set of rules was obtained and eventually used in the closure-search procedure.

    Context   # Rules  # Rules (p > 0.5)  # Rules (p > 0.9)
    Kx3.n05      1712               1358                682
    Kx3.n04      1193                967                497
    Kx3.n03      1103                916                532

Table 6: Rule extraction summary. # Rules is the total number of rules discovered via semantic probabilistic inference; the next two columns give the numbers of rules whose conditional probability exceeds the thresholds 0.5 and 0.9. As the noise level increases, the context becomes less and less clear and requires more and more rules to describe its attribute associations; a minor noise produces an insignificant effect and allows the problem to be solved almost exactly.
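The experimental variants above are straightforward to regenerate. The sketch below shows one plausible way to build the Kx3 and Kx3.n0p contexts (object duplication followed by independent Bernoulli bit flips); since Kexp itself is only shown schematically in Fig. 1, the base matrix here is a small stand-in, not the actual experimental data.

```python
import random

# Sketch only: building the experimental variants of Section 8 -- three duplicates
# of every object (K_x3) plus independent Bernoulli bit flips (K_x3.n0p).
# The actual K_exp of Fig. 1 is not reproduced here; `base` is a stand-in example.

def duplicate(context, times=3):
    """K_x3: every object row repeated `times` times."""
    return [row[:] for row in context for _ in range(times)]

def add_noise(context, p, seed=0):
    """K_x3.n0p: flip every cell independently with probability p."""
    rng = random.Random(seed)
    return [[1 - cell if rng.random() < p else cell for cell in row]
            for row in context]

base = [            # stand-in for K_exp: two rectangular "solid" concepts
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]

k_x3 = duplicate(base, 3)
k_x3_n05 = add_noise(k_x3, 0.05)
print(len(k_x3), "objects;", sum(map(sum, k_x3_n05)), "ones after 5% noise")
```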
Following [10], we compare the original concept lattice O with the predicted one E and measure the method's performance by two ratios:

    Precision = |O ∩ E| / |E|,   Recall = |O ∩ E| / |O|.

The experimental results are presented in Table 7. In addition to the performance indexes, the data are given separately for the solid concepts and the join-concepts (the first and the second summand in each cell, respectively).

    Context   |O|      |O ∩ E|  |O \ E|  |E \ O|  Precision    Recall
    Kx3       6 + 4    6 + 4    0 + 0    0 + 0    1.0 + 1.0    1.0 + 1.0
    Kx3.n03   6 + 4    6 + 4    0 + 0    0 + 0    1.0 + 1.0    1.0 + 1.0
    Kx3.n04   6 + 4    6 + 1    0 + 3    0 + 0    1.0 + 1.0    1.0 + 0.25
    Kx3.n05   6 + 4    6 + 3    0 + 1    0 + 1    1.0 + 0.75   1.0 + 0.75

Table 7: Concept recovery summary on the same contexts. |O| is the total number of expected concepts, |O ∩ E| the number of correctly predicted concepts, |O \ E| the number of lost concepts, and |E \ O| the number of incorrectly predicted concepts.

It was rather easy for the closure-search procedure to determine all formal concepts in the absence of noise. However, even on the noisy contexts the algorithm was able to restore the original set of concepts with moderate accuracy. All probabilistic concepts encountered by the algorithm can essentially be associated with their original images among the ordinary concepts, although some non-primal concepts leaked away. Nevertheless, probabilistic formal concepts appear to perform more accurately than the stability approach [4].

Indeed, the main advantage may not even be the accuracy of the method: probabilistic formal concepts are able to discover concepts on large data frames. The estimated computational complexity of SPI is |M|^{d+2} · |G|, and that of the closure search appears to be |Π| · |M|^c · |G|, where 3 < c < 4 (the estimate is empirical and still needs to be verified). Noise induces extra complexity, but a 2-core 2.4 GHz Pentium 4 computer is enough to solve a 321×26 context with 10% noise in about 10 minutes, while a 3% noise task completes in 5 minutes.

9 Conclusion

The introduced method has been shown, both experimentally and theoretically, to be correct and accurate. Some additional experiments were proposed in earlier works [12, 20]. Probabilistic formal concepts are also very attractive because they can serve to construct exact concept lattices directly from real, noisy raw data, instead of filtering an overpopulated and possibly exponentially sized concept lattice. Further work includes a theoretical evaluation of the computational complexity as well as more sophisticated experiments on big datasets. We also plan to compare our results with well-known classification methods in terms of prediction accuracy and speed.

References

1. Ganter, B.: Formal Concept Analysis: Methods and Applications in Computer Science. TU Dresden (2003)
2. Ganter, B., Wille, R.: Formal Concept Analysis – Mathematical Foundations. Springer, Berlin-Heidelberg-New York (1999)
3. Hempel, C.G.: Inductive Inconsistencies. Synthese 12, 439-469 (1960)
4. Kuznetsov, S.O., Makhalova, T.P.: Concept interestingness measures: a comparative study. In: Proc. 12th International Conference on Concept Lattices and Their Applications (CLA 2015), Clermont-Ferrand, France, October 13-16, 2015. CEUR Workshop Proceedings, vol. 1466, pp. 59-72 (2015)
5. Kuznetsov, S.O.: On Stability of a Formal Concept. Annals of Mathematics and Artificial Intelligence 49, 101-115 (2007)
6. Vityaev, E.E., Martynovich, V.V.: Probabilistic Formal Concepts with Negation. In: Voronkov, A., Virbitskaite, I. (eds.) Perspectives of System Informatics. LNCS, vol. 8974, pp. 385-399 (2015)
7. Prokasheva, O., Onishchenko, A., Gurov, S.: Classification based on formal concept analysis and biclustering: possibilities of the approach. Computational Mathematics and Modeling 23(3) (2012)
8. Emilion, R., Levy, G.: Size of random Galois lattices. Discrete Applied Mathematics 157, 2945-2957 (2009)
9. Buzmakov, A., Kuznetsov, S.O., Napoli, A.: Concept Stability as a Tool for Pattern Selection. CEUR Workshop Proceedings, vol. 1257 (ECAI 2014), pp. 51-58 (2014)
10. Piskova, L., Pero, S., Horvath, T., Krajci, S.: Mining Concepts from Incomplete Datasets Utilizing Matrix Factorization. In: Szathmary, L., Priss, U. (eds.) Proc. CLA 2012. CEUR Workshop Proceedings, vol. 972, pp. 33-44 (2012)
11. Kashnitsky, Y., Ignatov, D.I.: Can FCA-based Recommender System Suggest a Proper Classifier? CEUR Workshop Proceedings, vol. 1257 (ECAI 2014), pp. 17-26 (2014)
12. Vityaev, E.E., Demin, A.V., Ponomaryov, D.K.: Probabilistic Generalization of Formal Concepts. Programming and Computer Software 38(5), 219-230 (2012)
13. Ganter, B., Obiedkov, S.: Implications in Triadic Formal Contexts. TU Dresden, Springer (2004)
14. Missaoui, R., Kwuida, L.: Implications in Triadic Formal Contexts. In: 9th International Conference, ICFCA 2011, Nicosia, Cyprus. Springer (2011)
15. Kovalerchuk, B., Vityaev, E.: Data Mining in Finance: Advances in Relational and Hybrid Methods. Kluwer Academic Publishers (2000)
16. Rehder, B.: Categorization as causal reasoning. Cognitive Science 27(5), 709-748 (2003)
17. Rosch, E., Lloyd, B.B.: Principles of categorization. In: Cognition and Categorization. Lawrence Erlbaum Associates (1978)
18. Quan, T.T., Hui, S.C., Cao, T.H.: A Fuzzy FCA-based Approach to Conceptual Clustering for Automatic Generation of Concept Hierarchy on Uncertainty Data. In: Belohlavek, R., Snasel, V. (eds.) Proc. CLA 2004. CEUR Workshop Proceedings, vol. 110 (2004)
19. Speransky, S.O.: Logic of probability and probability logic. PhD thesis, Novosibirsk State University, Novosibirsk (2013) (in Russian)
20. Vityaev, E.E., Neupokoev, N.V.: Formal model of perception and pattern as fixed point of anticipations. In: Approaches to the Thinking Modeling, pp. 155-172. URSS Editorial, Moscow (2014) (in Russian)