Rough Inclusion Functions and Similarity Indices

Anna Gomolińska¹ and Marcin Wolski²

¹ Bialystok University, Computer Science Institute, Sosnowa 64, 15-887 Bialystok, Poland, anna.gom@math.uwb.edu.pl
² Maria Curie-Sklodowska University, Dept. of Logic and Philosophy of Science, Pl. Marii Curie-Sklodowskiej 4, 20-031 Lublin, Poland, marcin.wolski@umcs.lublin.pl

Abstract. Rough inclusion functions are mappings considered in rough set theory with which one can measure the degree of inclusion of a set in a set (in particular, the degree of inclusion of an information granule in an information granule) in line with rough mereology. Similarity indices, in turn, are mappings used in cluster analysis to compare clusterings and clustering methods with respect to their similarity. In this article we investigate the relationships between rough inclusion functions and similarity indices.

Keywords: rough inclusion function, rough mereology, similarity index, cluster analysis, granular computing.

Acknowledgments. Many thanks to the anonymous referees for interesting comments on the paper. All errors left are our sole responsibility.

1 Introduction

In 1994, L. Polkowski and A. Skowron introduced the formal notion of a rough inclusion, making it a fundamental concept of rough mereology (see, e.g. [1–4]); it is worth noting that some ideas on rough inclusion had already been presented by Z. Pawlak in [5]. A rough inclusion may be interpreted as a ternary relation with which one can express the fact that a set of objects is included, to some degree, in the same or another set of objects. Rough mereology is a theory extending Leśniewski's mereology [6, 7] from a theory of being-a-part to a theory of being-a-part-to-degree. Rough inclusion functions (RIFs) are mappings with which one can measure the degree of inclusion of sets in sets and which comply with the axioms of rough inclusion. Since, according to L. A. Zadeh's definition [8], an information granule is a clump of objects drawn together on the basis of indistinguishability, similarity, or functionality, RIFs can in particular be used to measure the degree of inclusion of information granules in information granules. Hence, the concept of a RIF is fundamental not only for rough set theory [5, 9] but also for the foundations and the development of granular computing [10, 11].

RIFs can be useful in rough set theory and, more generally, in granular computing in many ways. First, they can be applied to compare sets (and information granules) with respect to inclusion. Secondly, they can be used to define rough membership functions [12] and various approximation operators, such as those in the Skowron–Stepaniuk approach (see, e.g. [13, 14] and other papers by the same authors), in the Ziarko variable-precision rough set model (see, e.g. [15, 16] and more recent papers), or in the decision-theoretic rough set model [17, 18]. RIFs can also be used to estimate the confidence (also known as accuracy) and the coverage of decision rules and association rules (see, e.g. [19]). Another application of RIFs is the graded semantics of formulas (see, e.g. [20]). An important application of RIFs is obviously their usage to compute the degree of similarity (nearness, closeness) between sets of objects and, in particular, between information granules. Some steps in this direction have already been made (see, e.g. [21, 4, 14]).
The similarity indices we are going to speak about are used in cluster analysis [22–24] to compare clusterings, and clustering methods, with respect to how similar to (or dissimilar from) one another they are. Many of these similarity indices were originally designed to compare species with respect to their mutual similarity, given information about the presence and/or absence of some features. A. N. Albatineh, M. Niewiadomska-Bugaj, and D. Mihalko thoroughly examined 28 similarity indices known from the literature on classification and cluster analysis, of which 22 turned out to be distinct (some similarity indices had been introduced more than once, under different names). The results of their research on correction for chance agreement for similarity indices can be found, e.g., in [25].

In the present article we continue our earlier works [26, 27], where, among other things, three similarity indices out of those 22 were derived from RIFs. Our actual goal is to show that all 22 similarity indices investigated in [25] can be obtained starting with the RIFs κ£, κ₁, and κ₂ only. This reveals one more connection between rough set theory and cluster analysis.

The rest of the paper is organized as follows. In Sect. 2 we recall the notion of a rough inclusion function and the three particular RIFs mentioned above. In Sect. 3 we present the 22 similarity indices known from the literature and discussed in [25], and we characterize them one by one by means of the standard RIF κ£ or two other RIFs, viz. κ₁ and κ₂. The last section contains final remarks.

2 Rough Inclusion Functions

Rough inclusion functions (RIFs for short) are supposed to be mappings which measure the degree of inclusion of sets in sets and which comply with the axioms of rough inclusion. In detail, a rough inclusion function upon a non-empty set of objects U (in short, a RIF upon U or simply a RIF) is a mapping κ : ℘U × ℘U → [0, 1], assigning to any pair (X, Y) of sets of elements of U a number κ(X, Y) from the unit interval [0, 1], interpreted as the degree to which X is included in Y, and such that the conditions rif₁(κ) and rif₂*(κ) are satisfied, where

  rif₁(κ) ⇔_def ∀X, Y ⊆ U. (κ(X, Y) = 1 ⇔ X ⊆ Y),
  rif₂*(κ) ⇔_def ∀X, Y, Z ⊆ U. (κ(Y, Z) = 1 ⇒ κ(X, Y) ≤ κ(X, Z)).

Condition rif₁(κ) expresses the fact that set-theoretical inclusion is the most perfect case of rough inclusion. When rif₁(κ) holds, condition rif₂*(κ) is equivalent to condition rif₂(κ) below:

  rif₂(κ) ⇔_def ∀X, Y, Z ⊆ U. (Y ⊆ Z ⇒ κ(X, Y) ≤ κ(X, Z)),

expressing monotonicity of κ in the second variable. In the literature, weaker versions of RIFs are considered as well, where rif₁(κ) is replaced by "a half of it", i.e. by only one implication of the equivalence. In that case, rif₂*(κ) and rif₂(κ) define different classes of inclusion mappings (see, e.g. [28]). Both axioms lend themselves to a brute-force check on a small universe, as sketched below.
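The following Python sketch (ours, added for illustration and not part of the original formal development) exhaustively verifies rif₁ and rif₂* for the standard RIF κ£ defined by (2) in the sequel; the universe size and all names are illustrative assumptions.

```python
# Brute-force check of the RIF axioms rif_1 and rif_2^* on a small universe,
# using the standard RIF kappa_pound of Eq. (2) below as the test subject.
from itertools import chain, combinations

U = frozenset(range(4))  # a small finite universe, size chosen arbitrarily

def subsets(s):
    # All subsets of s, as frozensets.
    s = list(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(s, r) for r in range(len(s) + 1))]

def kappa_std(X, Y):
    # Standard RIF of Eq. (2): #(X & Y)/#X for nonempty X, and 1 otherwise.
    return len(X & Y) / len(X) if X else 1.0

def rif1(kappa):
    # kappa(X, Y) = 1 if and only if X is a subset of Y.
    return all((kappa(X, Y) == 1.0) == (X <= Y)
               for X in subsets(U) for Y in subsets(U))

def rif2_star(kappa):
    # kappa(Y, Z) = 1 implies kappa(X, Y) <= kappa(X, Z).
    return all(kappa(X, Y) <= kappa(X, Z)
               for X in subsets(U) for Y in subsets(U) for Z in subsets(U)
               if kappa(Y, Z) == 1.0)

print(rif1(kappa_std), rif2_star(kappa_std))  # expected output: True True
```

The same checker can be fed any candidate mapping κ, which makes the difference between rif₂*(κ) and rif₂(κ) easy to explore experimentally for weak RIFs.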
In summary, any RIF κ upon U should satisfy rif₁(κ) and rif₂*(κ) or, equivalently, rif₁(κ) and rif₂(κ). Among RIFs, various subclasses of mappings can be distinguished by adding new postulates to be satisfied. These can be, for instance:

  rif₃(κ) ⇔_def ∀∅ ≠ X ⊆ U. κ(X, ∅) = 0,
  rif₄(κ) ⇔_def ∀X, Y ⊆ U. (κ(X, Y) = 0 ⇒ X ∩ Y = ∅),
  rif₄⁻¹(κ) ⇔_def ∀∅ ≠ X ⊆ U. ∀Y ⊆ U. (X ∩ Y = ∅ ⇒ κ(X, Y) = 0),
  rif₅(κ) ⇔_def ∀∅ ≠ X ⊆ U. ∀Y ⊆ U. (κ(X, Y) = 0 ⇔ X ∩ Y = ∅),
  rif₆(κ) ⇔_def ∀∅ ≠ X ⊆ U. ∀Y ⊆ U. κ(X, Y) + κ(X, Yᶜ) = 1,
  rif₇(κ) ⇔_def ∀X, Y, Z ⊆ U. (Z ⊆ Y ⊆ X ⇒ κ(X, Z) ≤ κ(Y, Z)),

where Yᶜ denotes the set-theoretical complement of Y. (The last condition was mentioned in [29, 30]; there, rough inclusion is understood in a different way than in our paper.) Obviously, rif₅(κ) holds if and only if both rif₄(κ) and rif₄⁻¹(κ) do. Apart from that,

  rif₄⁻¹(κ) ⇒ rif₃(κ),
  rif₁(κ) & rif₆(κ) ⇒ rif₅(κ).   (1)

The standard RIF, denoted by κ£ here, is the most famous one and the one most frequently used by the rough set community. The idea underlying this notion is closely related to conditional probability. In logic, J. Łukasiewicz was the first to employ this idea, when calculating the probability of truth of implicative formulas [31, 32]. Let us recall that κ£ is defined only for a finite U, by putting

  κ£(X, Y) =_def #(X ∩ Y)/#X if X ≠ ∅, and 1 otherwise,   (2)

where X, Y are any subsets of U and #X denotes the number of elements of X. In words, the standard RIF measures the fraction of the elements having the property described by the second argument (Y) among the elements with the property described by the first argument (X). Apart from being a true RIF, κ£ has a number of interesting properties, recalled, e.g., in [27]. For instance, it satisfies rifᵢ(κ) (i = 3, ..., 7) and rif₄⁻¹(κ).

Examples of other RIFs are the mappings κ₁ and κ₂ such that for any X, Y ⊆ U,

  κ₁(X, Y) =_def #Y/#(X ∪ Y) if X ∪ Y ≠ ∅, and 1 otherwise,
  κ₂(X, Y) =_def #(Xᶜ ∪ Y)/#U.   (3)

Also in this case, U has to be finite. While κ₁ was introduced in [26], κ₂ had already been mentioned in [33]. Both RIFs were investigated in detail in [27]. The RIFs κ£, κ₁, and κ₂ are pairwise different. Below we recall a few other properties of these mappings.

Proposition 1. For any X, Y ⊆ U, we have:
(i) X ≠ ∅ ⇒ (κ₁(X, Y) = 0 ⇔ Y = ∅),
(ii) κ₂(X, Y) = 0 ⇔ (X = U & Y = ∅),
(iii) rif₄(κ₁) & rif₄(κ₂),
(iv) κ£(X, Y) ≤ κ₁(X, Y) ≤ κ₂(X, Y),
(v) κ₁(X, Y) = κ£(X ∪ Y, Y) & κ£(X, Y) = κ₁(X, X ∩ Y),
(vi) κ₂(X, Y) = κ£(U, Xᶜ ∪ Y).

Let us also note that, due to (i), rif₃(κ₁) holds. The same cannot, however, be said about κ₂ (compare (ii)). The three RIFs, and a spot-check of Proposition 1(iv), are implemented in the sketch below.
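A minimal sketch may help fix the definitions: the code below implements (2) and (3) directly and checks Proposition 1(iv) by exhaustive enumeration. The 4-element universe and all names are our illustrative choices, not part of the paper's formal apparatus.

```python
# The three RIFs of Sect. 2 on a finite universe, with a brute-force
# verification of Proposition 1(iv) over all pairs of subsets.
from itertools import chain, combinations

U = frozenset(range(4))  # a small finite universe, size chosen arbitrarily

def subsets(s):
    # All subsets of s, as frozensets.
    s = list(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(s, r) for r in range(len(s) + 1))]

def k_std(X, Y):
    # kappa_pound, Eq. (2)
    return len(X & Y) / len(X) if X else 1.0

def k1(X, Y):
    # kappa_1, Eq. (3)
    return len(Y) / len(X | Y) if (X | Y) else 1.0

def k2(X, Y):
    # kappa_2, Eq. (3); U - X is the complement of X in U
    return len((U - X) | Y) / len(U)

# Proposition 1(iv): kappa_pound <= kappa_1 <= kappa_2 on every pair.
assert all(k_std(X, Y) <= k1(X, Y) <= k2(X, Y)
           for X in subsets(U) for Y in subsets(U))
print("Proposition 1(iv) holds on all", len(subsets(U)) ** 2, "pairs")
```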
3 Similarity Indices in Terms of RIFs

In this section we reformulate the similarity indices studied in [25] in terms of the RIFs κ£, κ₁, or κ₂. The proofs that the indices can really be expressed in this way will be given in the full version of this paper.

Consider a set U₀ of m > 0 data points to be grouped by some clustering methods A₁ and A₂. Let U (our universe) be the set of all unordered pairs {x, y} ⊆ U₀ of data points, to be compared in order to assess the clusterings, i.e. the partitions of U₀ generated by A₁ and by A₂, denoted by C₁ and C₂ here. Thus, #U = M = C(m, 2) = m(m − 1)/2. The similarity between the clusterings C₁ and C₂ (and the clustering methods A₁ and A₂) is usually assessed on the basis of the number of pairs of data points that are put into the same cluster, or into different clusters, by each of the grouping methods considered. For i = 1, 2, let us define

  Xᵢ = {{x, y} ∈ U | x, y are put into the same cluster by Aᵢ}.   (4)

Additionally, let

  a = #(X₁ ∩ X₂), b = #(X₁ ∩ X₂ᶜ), c = #(X₁ᶜ ∩ X₂), d = #(X₁ᶜ ∩ X₂ᶜ).   (5)

In words, a is the number of pairs of data points {x, y} such that x and y are placed in the same cluster by both A₁ and A₂; b (respectively, c) is the number of pairs {x, y} such that x and y are placed in the same cluster by A₁ (resp., A₂) but in different clusters by A₂ (resp., A₁); finally, d is the number of pairs {x, y} such that x and y are placed in different clusters by both A₁ and A₂. We also have #X₁ = a + b, #X₂ = a + c, #X₁ᶜ = c + d, #X₂ᶜ = b + d, and #U = a + b + c + d = M. For simplicity, assume that a, b, c, d > 0. Then we have

  κ£(X₁, X₂) = a/(a + b),
  κ₁(X₁, X₂) = (a + c)/(a + b + c),
  κ₂(X₁, X₂) = (a + c + d)/M.   (6)

In what follows, we present the similarity indices one by one, together with their new formulations in terms of κ£, κ₁, or κ₂. A toy computation of the quantities involved is sketched below.
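To make the bookkeeping concrete, the following hedged sketch computes a, b, c, d of (5) from two toy label vectors and numerically checks the three equalities of (6); the data and all variable names are invented for illustration.

```python
# Counts a, b, c, d of Eq. (5) from two clusterings given as label vectors,
# with a numeric check of the RIF values of Eq. (6).
from itertools import combinations
from math import isclose

labels1 = [0, 0, 0, 1, 1, 2]   # toy clustering C1 of m = 6 data points
labels2 = [0, 0, 1, 1, 1, 2]   # toy clustering C2 of the same points

U = list(combinations(range(len(labels1)), 2))        # unordered pairs, #U = M
X1 = {p for p in U if labels1[p[0]] == labels1[p[1]]}  # same cluster under A1
X2 = {p for p in U if labels2[p[0]] == labels2[p[1]]}  # same cluster under A2
M = len(U)

a = len(X1 & X2)
b = len(X1 - X2)
c = len(X2 - X1)
d = M - a - b - c

# Eq. (6), assuming a, b, c, d > 0:
assert isclose(len(X1 & X2) / len(X1), a / (a + b))            # kappa_pound(X1, X2)
assert isclose(len(X2) / len(X1 | X2), (a + c) / (a + b + c))  # kappa_1(X1, X2)
assert isclose((M - len(X1 - X2)) / M, (a + c + d) / M)        # kappa_2(X1, X2)
print(a, b, c, d, M)   # -> 2 2 2 9 15 for this toy example
```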
Wallace (1983). The similarity indices W₁, W₂ with range [0, 1] were introduced by D. L. Wallace:

  W₁(C₁, C₂) =_def a/(a + b),
  W₂(C₁, C₂) =_def a/(a + c).   (7)

It is easy to see that

  W₁(C₁, C₂) = κ£(X₁, X₂), W₂(C₁, C₂) = κ£(X₂, X₁).   (8)

Kulczyński (1927). The similarity index K with range [0, 1] was proposed by S. Kulczyński in 1927:

  K(C₁, C₂) =_def (1/2)(a/(a + b) + a/(a + c)).   (9)

K can be rewritten in the following form:

  K(C₁, C₂) = (1/2)(κ£(X₁, X₂) + κ£(X₂, X₁)).   (10)

In words, K(C₁, C₂) is the arithmetical mean of κ£(X₁, X₂) and κ£(X₂, X₁).

McConnaughey (1964). The similarity index MC with range [−1, 1] goes back to B. H. McConnaughey:

  MC(C₁, C₂) =_def (a² − bc)/((a + b)(a + c)).   (11)

This index can be expressed by the following equation:

  MC(C₁, C₂) = κ£(X₁, X₂) + κ£(X₂, X₁) − 1.   (12)

Peirce (1884). The similarity index PE with range [−1, 1] is attributed to C. S. Peirce:

  PE(C₁, C₂) =_def (ad − bc)/((a + c)(b + d)).   (13)

The index PE can be characterized as follows:

  PE(C₁, C₂) = (1/2)(κ£(X₂, X₁) + κ£(X₂ᶜ, X₁ᶜ) − κ£(X₂, X₁ᶜ) − κ£(X₂ᶜ, X₁)).   (14)

The Gamma index. The similarity index Γ with range [−1, 1] is given by

  Γ(C₁, C₂) =_def (ad − bc)/√((a + b)(a + c)(b + d)(c + d)).   (15)

In this case, the following characterization can be obtained:

  Γ(C₁, C₂) = √((1/2)(κ£(X₂, X₁) + κ£(X₂ᶜ, X₁ᶜ) − κ£(X₂, X₁ᶜ) − κ£(X₂ᶜ, X₁))) · √(κ£(X₁, X₂) − κ£(X₁ᶜ, X₂)).   (16)

Ochiai (1957), Fowlkes and Mallows (1983). The similarity index OFM ranges over [0, 1]. It was introduced by A. Ochiai in 1957 and again by E. B. Fowlkes and C. L. Mallows in 1983:

  OFM(C₁, C₂) =_def a/√((a + b)(a + c)).   (17)

After rewriting we get

  OFM(C₁, C₂) = √(κ£(X₁, X₂) κ£(X₂, X₁)).   (18)

That is, OFM(C₁, C₂) is the geometrical mean of κ£(X₁, X₂) and κ£(X₂, X₁).

The Pearson index. The similarity index P, named after K. Pearson, ranges over [−1, 1]. It is given by

  P(C₁, C₂) =_def (ad − bc)/((a + b)(a + c)(b + d)(c + d)).   (19)

The index P can be expressed in the following ways:

  P(C₁, C₂) = (ad − bc)⁻¹ · Γ²(C₁, C₂)
            = (κ£(X₁, X₂) − κ£(X₁ᶜ, X₂)) · κ£(X₂, {u}) · κ£(X₂ᶜ, {u′})   (20)

for arbitrary u ∈ X₂ and u′ ∉ X₂.

Sokal and Sneath (1963). The similarity indices SS₁, SS₂, SS₃ with range [0, 1] were introduced by R. R. Sokal and P. H. Sneath in 1963. The third index is also attributed to A. Ochiai (1957):

  SS₁(C₁, C₂) =_def (1/4)(a/(a + b) + a/(a + c) + d/(b + d) + d/(c + d)),
  SS₂(C₁, C₂) =_def a/(a + 2(b + c)),
  SS₃(C₁, C₂) =_def ad/√((a + b)(a + c)(b + d)(c + d)).   (21)

One can prove the following:

  SS₁(C₁, C₂) = (1/4)(κ£(X₁, X₂) + κ£(X₂, X₁) + κ£(X₁ᶜ, X₂ᶜ) + κ£(X₂ᶜ, X₁ᶜ)),
  SS₂(C₁, C₂) = (κ₁(X₁, X₂) + κ₁(X₂, X₁) − 1)/(3 − (κ₁(X₁, X₂) + κ₁(X₂, X₁))),
  SS₃(C₁, C₂) = √(κ£(X₁, X₂) κ£(X₂, X₁) κ£(X₁ᶜ, X₂ᶜ) κ£(X₂ᶜ, X₁ᶜ)).   (22)

Thus, SS₁(C₁, C₂) (resp., SS₃(C₁, C₂)) is the arithmetical (geometrical) mean of κ£(X₁, X₂), κ£(X₂, X₁), κ£(X₁ᶜ, X₂ᶜ), and κ£(X₂ᶜ, X₁ᶜ).

Jaccard (1908). The similarity index J with range [0, 1] goes back to P. Jaccard:

  J(C₁, C₂) =_def a/(a + b + c).   (23)

It can be shown that

  J(C₁, C₂) = κ₁(X₁, X₂) + κ₁(X₂, X₁) − 1.   (24)

Sokal and Michener (1958), Rand (1971). The similarity index R with range [0, 1] was introduced by R. R. Sokal and C. D. Michener, and later independently by W. Rand:

  R(C₁, C₂) =_def (a + d)/M.   (25)

The index R can be rewritten as

  R(C₁, C₂) = κ₂(X₁, X₂) + κ₂(X₂, X₁) − 1.   (26)

Hamann (1961), Hubert (1977). The similarity index H, ranging over [−1, 1], was proposed by U. Hamann and independently by L. J. Hubert:

  H(C₁, C₂) =_def ((a + d) − (b + c))/M.   (27)

By certain transformations we obtain

  H(C₁, C₂) = 2(κ₂(X₁, X₂) + κ₂(X₂, X₁)) − 3.   (28)

Czekanowski (1932), Dice (1945), Gower and Legendre (1986). The similarity index CZ ranges over [0, 1]. It was proposed by J. Czekanowski in 1932, by L. R. Dice in 1945, and by J. C. Gower and P. Legendre in 1986:

  CZ(C₁, C₂) =_def 2a/(2a + b + c).   (29)

One can prove the following:

  CZ(C₁, C₂) = 2(κ₁(X₁, X₂) + κ₁(X₂, X₁) − 1)/(κ₁(X₁, X₂) + κ₁(X₂, X₁)).   (30)

Russel and Rao (1940). The similarity index RR ranges over [0, 1] and is attributed to P. F. Russel and T. R. Rao:

  RR(C₁, C₂) =_def a/M.   (31)

In this case we obtain

  RR(C₁, C₂) = κ£(U, X₁ ∩ X₂) = κ₂(U, X₁ ∩ X₂).   (32)

Fager and McGowan (1963). The similarity index FMG with range [−1/2, 1) goes back to E. W. Fager and J. A. McGowan:

  FMG(C₁, C₂) =_def a/√((a + b)(a + c)) − 1/(2√(a + b)).   (33)

The above formula can be expressed in the following way:

  FMG(C₁, C₂) = √(κ£(X₁, X₂) κ£(X₂, X₁)) − (1/2)√(κ£(X₁, {u}))   (34)

for an arbitrary u ∈ X₁.

Sokal and Sneath (1963), Gower and Legendre (1986). The similarity index GL with range [0, 1] was introduced by R. R. Sokal and P. H. Sneath in 1963, and again by J. C. Gower and P. Legendre in 1986:

  GL(C₁, C₂) =_def (a + d)/(a + (1/2)(b + c) + d).   (35)

A characterization of GL in terms of κ₂ is the following:

  GL(C₁, C₂) = 2(κ₂(X₁, X₂) + κ₂(X₂, X₁) − 1)/(κ₂(X₁, X₂) + κ₂(X₂, X₁)).   (36)

Rogers and Tanimoto (1960). The similarity index RT with range [0, 1] is attributed to D. J. Rogers and T. T. Tanimoto:

  RT(C₁, C₂) =_def (a + d)/(a + 2(b + c) + d).   (37)

This index can be rewritten in the following form:

  RT(C₁, C₂) = (κ₂(X₁, X₂) + κ₂(X₂, X₁) − 1)/(3 − (κ₂(X₁, X₂) + κ₂(X₂, X₁))).   (38)

Yule (1927), Goodman and Kruskal (1954). The similarity index GK ranges over [−1, 1]. It was proposed by G. U. Yule in 1927, and again by L. A. Goodman and W. H. Kruskal in 1954:

  GK(C₁, C₂) =_def (ad − bc)/(ad + bc).   (39)

This index can be expressed in terms of the standard RIF as follows:

  GK(C₁, C₂) = (κ£(X₂, X₁) κ£(X₂ᶜ, X₁ᶜ) − κ£(X₂, X₁ᶜ) κ£(X₂ᶜ, X₁)) / (κ£(X₂, X₁) κ£(X₂ᶜ, X₁ᶜ) + κ£(X₂, X₁ᶜ) κ£(X₂ᶜ, X₁)).   (40)

Several of these characterizations are spot-checked numerically in the sketch below.
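The following sketch numerically confirms the characterizations (24), (26), (38), and (40) on the toy counts a = b = c = 2, d = 9 from the previous sketch; it is an illustration under those assumptions, not a proof.

```python
# Spot-check of the kappa_1-, kappa_2-, and kappa_pound-based
# characterizations (24), (26), (38), and (40) on toy counts.
from math import isclose

a, b, c, d = 2, 2, 2, 9        # toy counts from the sketch above
M = a + b + c + d

k1_12, k1_21 = (a + c) / (a + b + c), (a + b) / (a + b + c)   # kappa_1 values
k2_12, k2_21 = (a + c + d) / M, (a + b + d) / M               # kappa_2 values

assert isclose(a / (a + b + c), k1_12 + k1_21 - 1)            # Jaccard, Eq. (24)
assert isclose((a + d) / M, k2_12 + k2_21 - 1)                # Rand, Eq. (26)
s = k2_12 + k2_21
assert isclose((a + d) / (a + 2 * (b + c) + d), (s - 1) / (3 - s))  # RT, Eq. (38)

# Goodman-Kruskal, Eq. (40), via standard-RIF values:
kA, kB = a / (a + c), d / (b + d)   # kappa_pound(X2, X1), kappa_pound(X2^c, X1^c)
kC, kD = c / (a + c), b / (b + d)   # kappa_pound(X2, X1^c), kappa_pound(X2^c, X1)
assert isclose((a * d - b * c) / (a * d + b * c),
               (kA * kB - kC * kD) / (kA * kB + kC * kD))
print("(24), (26), (38), (40) confirmed on the toy counts")
```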
Baulieu (1989). The similarity indices B₁ and B₂ range over [0, 1] and [−1/4, 1/4], respectively. They were introduced by F. B. Baulieu in 1989:

  B₁(C₁, C₂) =_def (M² − M(b + c) + (b − c)²)/M²,
  B₂(C₁, C₂) =_def (ad − bc)/M².   (41)

As in all the previous cases, a RIF (precisely, κ₂ here) underlies the definitions of these similarity indices, viz.

  B₁(C₁, C₂) = κ₂(X₁, X₂) + κ₂(X₂, X₁) − 1 + (κ₂(U, X₁) − κ₂(U, X₂))²,
  B₂(C₁, C₂) = (1 − κ₂(X₁, X₂ᶜ)) κ₂(U, X₁ᶜ) − (1 − κ₂(X₁ᶜ, X₂ᶜ)) κ₂(U, X₁).   (42)

4 Final Remarks

The main goal realized in this paper was to show that a considerable number of similarity indices known from the literature can be formulated in terms of some rough inclusion functions. Rough inclusion functions (RIFs) are mappings, inspired by the notion of a rough inclusion introduced by L. Polkowski and A. Skowron as a basic concept of rough mereology, by means of which one can measure the degree of inclusion of a set of objects in a set of objects. Since information granules can be viewed as particular sets of objects, RIFs are important not only for rough set theory but also for granular computing.

Starting with the standard RIF κ£ and two other RIFs of a similar origin, denoted by κ₁ and κ₂, we have obtained all 22 similarity indices discussed in [25]. In that paper it is proved that the indices K and MC are equivalent after the correction known as the correction for agreement due to chance, and that the same holds for R, H, and CZ. We have not referred to this question because we are interested in other aspects of similarity indices. For example, we envisage the usage of similarity indices in granular computing to calculate the degree of similarity between compound information granules, such as indistinguishability relations and tolerance relations on a set of elementary objects under consideration. Let us note that similarity indices can also be used in granular computing in a more general setting, viz. to compute the degree of similarity between arbitrary sets of objects. In the full version of this article we will give an illustrative example and proofs of the formulas characterizing the similarity indices considered. In future research we will generalize our results, viz. we will propose general schemata for the generation of similarity indices from an arbitrary RIF. Another question, also suggested by the referee, is the discovery of relationships between RIFs and quality measures for clusters.

References

1. Polkowski, L.: Reasoning by Parts: An Outline of Rough Mereology. Warszawa (2011)
2. Polkowski, L., Skowron, A.: Rough mereology. Lecture Notes in Artificial Intelligence 869 (1994) 85–94
3. Polkowski, L., Skowron, A.: Rough mereology: A new paradigm for approximate reasoning. Int. J. Approximate Reasoning 15(4) (1996) 333–365
4. Polkowski, L., Skowron, A.: Rough mereological calculi of granules: A rough set approach to computation. Computational Intelligence 17(3) (2001) 472–492
5. Pawlak, Z.: Rough Sets. Theoretical Aspects of Reasoning About Data. Kluwer, Dordrecht (1991)
6. Leśniewski, S.: Foundations of the General Set Theory 1 (in Polish). Volume 2 of Works of the Polish Scientific Circle, Moscow (1916). Also in [7], pages 128–173
7. Surma, S.J., Srzednicki, J.T., Barnett, J.D., eds.: Stanislaw Leśniewski Collected Works. Kluwer/Polish Scientific Publ., Dordrecht/Warsaw (1992)
8. Zadeh, L.A.: Outline of a new approach to the analysis of complex systems and decision processes. IEEE Trans. on Systems, Man, and Cybernetics 3 (1973) 28–44
9. Pawlak, Z., Skowron, A.: Rudiments of rough sets. Information Sciences 177(1) (2007) 3–27
10. Pedrycz, W., Skowron, A., Kreinovich, V., eds.: Handbook of Granular Computing. John Wiley & Sons, Chichester (2008)
11. Stepaniuk, J.: Rough-Granular Computing in Knowledge Discovery and Data Mining. Springer-Verlag, Berlin Heidelberg (2008)
12. Pawlak, Z., Skowron, A.: Rough membership functions. In Fedrizzi, M., Kacprzyk, J., Yager, R.R., eds.: Fuzzy Logic for the Management of Uncertainty. John Wiley & Sons, New York (1994) 251–271
13. Skowron, A., Stepaniuk, J.: Tolerance approximation spaces. Fundamenta Informaticae 27(2–3) (1996) 245–253
14. Stepaniuk, J.: Knowledge discovery by application of rough set models. In: [34]. (2001) 137–233
15. Ziarko, W.: Variable precision rough set model. J. Computer and System Sciences 46(1) (1993) 39–59
16. Ziarko, W.: Probabilistic decision tables in the variable precision rough set model. Computational Intelligence 17(3) (2001) 593–603
17. Yao, Y.Y.: Decision-theoretic rough set models. Lecture Notes in Artificial Intelligence 4481 (2007) 1–12
18. Yao, Y.Y., Wong, S.K.M.: A decision theoretic framework for approximating concepts. Int. J. of Man–Machine Studies 37(6) (1992) 793–809
19. Tsumoto, S.: Modelling medical diagnostic rules based on rough sets. Lecture Notes in Artificial Intelligence 1424 (1998) 475–482
20. Gomolińska, A.: Satisfiability and meaning of formulas and sets of formulas in approximation spaces. Fundamenta Informaticae 67(1–3) (2005) 77–92
21. Nguyen, H.S., Skowron, A., Stepaniuk, J.: Granular computing: A rough set approach. Computational Intelligence 17(3) (2001) 514–544
22. Cios, K.J., Pedrycz, W., Swiniarski, R.W., Kurgan, L.A.: Data Mining: A Knowledge Discovery Approach. Springer Science+Business Media, LLC, New York (2007)
23. Jain, A.K., Murty, M.N., Flynn, P.J.: Data clustering: A review. ACM Computing Surveys 31(3) (1999) 264–323
24. Kaufman, L., Rousseeuw, P.J.: Finding Groups in Data: An Introduction to Cluster Analysis. John Wiley & Sons, Chichester (1990)
25. Albatineh, A.N., Niewiadomska-Bugaj, M., Mihalko, D.: On similarity indices and correction for chance agreement. J. of Classification 23 (2006) 301–313
26. Gomolińska, A.: On three closely related rough inclusion functions. Lecture Notes in Artificial Intelligence 4585 (2007) 142–151
27. Gomolińska, A.: On certain rough inclusion functions. Transactions on Rough Sets IX: journal subline of LNCS 5390 (2008) 35–55
28. Gomolińska, A.: Rough approximation based on weak q-RIFs. Transactions on Rough Sets X: journal subline of LNCS 5656 (2009) 117–135
29. Xu, Z.B., Liang, J.Y., Dang, C.Y., Chin, K.S.: Inclusion degree: A perspective on measures for rough set data analysis. Information Sciences 141 (2002) 227–236
30. Zhang, W.X., Leung, Y.: Theory of including degrees and its applications to uncertainty inference. In: Proc. of 1996 Asian Fuzzy System Symposium. (1996) 496–501
31. Borkowski, L., ed.: Jan Łukasiewicz – Selected Works. North Holland/Polish Scientific Publ., Amsterdam/Warsaw (1970)
32. Łukasiewicz, J.: Die logischen Grundlagen der Wahrscheinlichkeitsrechnung. Kraków (1913). English translation in [31], pages 16–63
33. Drwal, G., Mrózek, A.: System RClass – software implementation of a rough classifier. In Kłopotek, M.A., Michalewicz, M., Raś, Z.W., eds.: Proc. 7th Int. Symp. Intelligent Information Systems (IIS'1998), Malbork, Poland, June 1998. (1998) 392–395
34. Polkowski, L., Tsumoto, S., Lin, T.Y., eds.: Rough Set Methods and Applications: New Developments in Knowledge Discovery in Information Systems. Physica-Verlag, Heidelberg New York (2001)