Rough Inclusion Functions and Similarity Indices

Anna Gomolińska¹ and Marcin Wolski²

¹ Bialystok University, Computer Science Institute, Sosnowa 64, 15-887 Bialystok, Poland, anna.gom@math.uwb.edu.pl
² Maria Curie-Sklodowska University, Dept. of Logic and Philosophy of Science, Pl. Marii Curie-Sklodowskiej 4, 20-031 Lublin, Poland, marcin.wolski@umcs.lublin.pl

Abstract. Rough inclusion functions are mappings considered in rough set theory with which one can measure the degree of inclusion of a set in a set (in particular, the degree of inclusion of an information granule in an information granule) in line with rough mereology. Similarity indices, in turn, are mappings used in cluster analysis to compare clusterings and clustering methods with respect to their similarity. In this article we investigate the relationships between rough inclusion functions and similarity indices.

Keywords: rough inclusion function, rough mereology, similarity index, cluster analysis, granular computing.

Acknowledgments. Many thanks to the anonymous referees for interesting comments on the paper. All errors left are our sole responsibility.

1 Introduction

In 1994, L. Polkowski and A. Skowron introduced the formal notion of a rough inclusion, making it a fundamental concept of rough mereology (see, e.g. [1–4]); it is worth noting that some ideas on rough inclusion had already been presented by Z. Pawlak in [5]. A rough inclusion may be interpreted as a ternary relation with which one can express the fact that a set of objects is included, to some degree, in the same or another set of objects. Rough mereology is a theory extending Leśniewski's mereology [6, 7] from a theory of being-a-part to a theory of being-a-part-to-degree. Rough inclusion functions (RIFs) are mappings with which one can measure the degree of inclusion of sets in sets and which comply with the axioms of rough inclusion. Since, according to L. A. Zadeh's definition [8], an information granule is a clump of objects drawn together on the basis of indistinguishability, similarity, or functionality, RIFs can in particular be used to measure the degree of inclusion of information granules in information granules. Hence, the concept of a RIF is fundamental not only for rough set theory [5, 9] but also for the foundations and the development of granular computing [10, 11].

RIFs can be useful in rough set theory and, more generally, in granular computing in many ways. First, they can be applied to compare sets (and information granules) with respect to inclusion. Secondly, they can be used to define rough membership functions [12] and various approximation operators, such as those in the Skowron–Stepaniuk approach (see, e.g. [13, 14] and other papers by the same authors), in the Ziarko variable-precision rough set model (see, e.g. [15, 16] and more recent papers), or in the decision-theoretic rough set model [17, 18]. RIFs can also be used to estimate the confidence (also known as accuracy) and the coverage of decision rules and association rules (see, e.g. [19]). Another application of RIFs is the graded semantics of formulas (see, e.g. [20]). An important application of RIFs is obviously their usage to compute the degree of similarity (nearness, closeness) between sets of objects and, in particular, between information granules. Some steps in this direction have already been made (see, e.g. [21, 4, 14]).
The similarity indices we are going to speak about are used in cluster analysis [22–24] to compare clusterings, and clustering methods, with respect to how similar to (or dissimilar from) one another they are. Many of these similarity indices were originally designed to compare species with respect to their mutual similarity, given information about the presence and/or absence of some features. A. N. Albatineh, M. Niewiadomska-Bugaj, and D. Mihalko thoroughly examined 28 similarity indices known from the literature on classification and cluster analysis, of which 22 turned out to be distinct (some similarity indices had been introduced more than once, under different names). The results of their research on correction for chance agreement for similarity indices can be found, e.g., in [25].

In the present article we continue our earlier works [26, 27], where, among other things, three similarity indices out of those 22 were derived from RIFs. Our actual goal is to show that all 22 similarity indices investigated in [25] can be obtained starting with the RIFs κ£, κ₁, and κ₂ only. This reveals one more connection between rough set theory and cluster analysis.

The rest of the paper is organized as follows. In Sect. 2 we recall the notion of a rough inclusion function and the three particular RIFs mentioned above. In Sect. 3 we present the 22 similarity indices known from the literature and discussed in [25], and we characterize them one by one by means of the standard RIF κ£ or two other RIFs, viz. κ₁ and κ₂. The last section contains final remarks.

2 Rough Inclusion Functions

Rough inclusion functions (RIFs for short) are supposed to be mappings which measure the degree of inclusion of sets in sets and which comply with the axioms of rough inclusion. In detail, a rough inclusion function upon a non-empty set of objects U (in short, a RIF upon U or simply a RIF) is a mapping κ : ℘U × ℘U → [0, 1], assigning to any pair (X, Y) of sets of elements of U a number κ(X, Y) from the unit interval [0, 1], interpreted as the degree to which X is included in Y, and such that the conditions rif₁(κ) and rif₂*(κ) are satisfied, where

  rif₁(κ) ⇔_def ∀X, Y ⊆ U. (κ(X, Y) = 1 ⇔ X ⊆ Y),
  rif₂*(κ) ⇔_def ∀X, Y, Z ⊆ U. (κ(Y, Z) = 1 ⇒ κ(X, Y) ≤ κ(X, Z)).

Condition rif₁(κ) expresses the fact that set-theoretical inclusion is the most perfect case of rough inclusion. When rif₁(κ) holds, condition rif₂*(κ) is equivalent to condition rif₂(κ) below:

  rif₂(κ) ⇔_def ∀X, Y, Z ⊆ U. (Y ⊆ Z ⇒ κ(X, Y) ≤ κ(X, Z)),

expressing monotonicity of κ in the second variable. In the literature, weaker versions of RIFs are considered as well, where rif₁(κ) is replaced by "a half of it", i.e. by only one implication of the equivalence. In that case, rif₂*(κ) and rif₂(κ) define different classes of inclusion mappings (see, e.g. [28]). Both axioms lend themselves to a brute-force check on a small universe, as sketched below.
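The following Python sketch (ours, added for illustration and not part of the original formal development) exhaustively verifies rif₁ and rif₂* for the standard RIF κ£ defined by (2) in the sequel; the universe size and all names are illustrative assumptions.

```python
# Brute-force check of the RIF axioms rif_1 and rif_2^* on a small universe,
# using the standard RIF kappa_pound of Eq. (2) below as the test subject.
from itertools import chain, combinations

U = frozenset(range(4))  # a small finite universe, size chosen arbitrarily

def subsets(s):
    # All subsets of s, as frozensets.
    s = list(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(s, r) for r in range(len(s) + 1))]

def kappa_std(X, Y):
    # Standard RIF of Eq. (2): #(X & Y)/#X for nonempty X, and 1 otherwise.
    return len(X & Y) / len(X) if X else 1.0

def rif1(kappa):
    # kappa(X, Y) = 1 if and only if X is a subset of Y.
    return all((kappa(X, Y) == 1.0) == (X <= Y)
               for X in subsets(U) for Y in subsets(U))

def rif2_star(kappa):
    # kappa(Y, Z) = 1 implies kappa(X, Y) <= kappa(X, Z).
    return all(kappa(X, Y) <= kappa(X, Z)
               for X in subsets(U) for Y in subsets(U) for Z in subsets(U)
               if kappa(Y, Z) == 1.0)

print(rif1(kappa_std), rif2_star(kappa_std))  # expected output: True True
```

The same checker can be fed any candidate mapping κ, which makes the difference between rif₂*(κ) and rif₂(κ) easy to explore experimentally for weak RIFs.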
In summary, any RIF κ upon U should satisfy rif₁(κ) and rif₂*(κ) or, equivalently, rif₁(κ) and rif₂(κ). Among RIFs, various subclasses of mappings can be distinguished by adding new postulates to be satisfied. These can be, for instance:

  rif₃(κ) ⇔_def ∀∅ ≠ X ⊆ U. κ(X, ∅) = 0,
  rif₄(κ) ⇔_def ∀X, Y ⊆ U. (κ(X, Y) = 0 ⇒ X ∩ Y = ∅),
  rif₄⁻¹(κ) ⇔_def ∀∅ ≠ X ⊆ U. ∀Y ⊆ U. (X ∩ Y = ∅ ⇒ κ(X, Y) = 0),
  rif₅(κ) ⇔_def ∀∅ ≠ X ⊆ U. ∀Y ⊆ U. (κ(X, Y) = 0 ⇔ X ∩ Y = ∅),
  rif₆(κ) ⇔_def ∀∅ ≠ X ⊆ U. ∀Y ⊆ U. κ(X, Y) + κ(X, Yᶜ) = 1,
  rif₇(κ) ⇔_def ∀X, Y, Z ⊆ U. (Z ⊆ Y ⊆ X ⇒ κ(X, Z) ≤ κ(Y, Z)),

where Yᶜ denotes the set-theoretical complement of Y. (The last condition was mentioned in [29, 30]; there, rough inclusion is understood in a different way than in our paper.) Obviously, rif₅(κ) holds if and only if both rif₄(κ) and rif₄⁻¹(κ) do. Apart from that,

  rif₄⁻¹(κ) ⇒ rif₃(κ),
  rif₁(κ) & rif₆(κ) ⇒ rif₅(κ).   (1)

The standard RIF, denoted by κ£ here, is the most famous one and the one most frequently used by the rough set community. The idea underlying this notion is closely related to conditional probability. In logic, J. Łukasiewicz was the first to employ this idea, when calculating the probability of truth of implicative formulas [31, 32]. Let us recall that κ£ is defined only for a finite U, by putting

  κ£(X, Y) =_def #(X ∩ Y)/#X if X ≠ ∅, and 1 otherwise,   (2)

where X, Y are any subsets of U and #X denotes the number of elements of X. In words, the standard RIF measures the fraction of the elements having the property described by the second argument (Y) among the elements with the property described by the first argument (X). Apart from being a true RIF, κ£ has a number of interesting properties, recalled, e.g., in [27]. For instance, it satisfies rifᵢ(κ) (i = 3, ..., 7) and rif₄⁻¹(κ).

Examples of other RIFs are the mappings κ₁ and κ₂ such that for any X, Y ⊆ U,

  κ₁(X, Y) =_def #Y/#(X ∪ Y) if X ∪ Y ≠ ∅, and 1 otherwise,
  κ₂(X, Y) =_def #(Xᶜ ∪ Y)/#U.   (3)

Also in this case, U has to be finite. While κ₁ was introduced in [26], κ₂ had already been mentioned in [33]. Both RIFs were investigated in detail in [27]. The RIFs κ£, κ₁, and κ₂ are pairwise different. Below we recall a few other properties of these mappings.

Proposition 1. For any X, Y ⊆ U, we have:
(i) X ≠ ∅ ⇒ (κ₁(X, Y) = 0 ⇔ Y = ∅),
(ii) κ₂(X, Y) = 0 ⇔ (X = U & Y = ∅),
(iii) rif₄(κ₁) & rif₄(κ₂),
(iv) κ£(X, Y) ≤ κ₁(X, Y) ≤ κ₂(X, Y),
(v) κ₁(X, Y) = κ£(X ∪ Y, Y) & κ£(X, Y) = κ₁(X, X ∩ Y),
(vi) κ₂(X, Y) = κ£(U, Xᶜ ∪ Y).

Let us also note that, due to (i), rif₃(κ₁) holds. The same cannot, however, be said about κ₂ (compare (ii)). The three RIFs, and a spot-check of Proposition 1(iv), are implemented in the sketch below.
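A minimal sketch may help fix the definitions: the code below implements (2) and (3) directly and checks Proposition 1(iv) by exhaustive enumeration. The 4-element universe and all names are our illustrative choices, not part of the paper's formal apparatus.

```python
# The three RIFs of Sect. 2 on a finite universe, with a brute-force
# verification of Proposition 1(iv) over all pairs of subsets.
from itertools import chain, combinations

U = frozenset(range(4))  # a small finite universe, size chosen arbitrarily

def subsets(s):
    # All subsets of s, as frozensets.
    s = list(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(s, r) for r in range(len(s) + 1))]

def k_std(X, Y):
    # kappa_pound, Eq. (2)
    return len(X & Y) / len(X) if X else 1.0

def k1(X, Y):
    # kappa_1, Eq. (3)
    return len(Y) / len(X | Y) if (X | Y) else 1.0

def k2(X, Y):
    # kappa_2, Eq. (3); U - X is the complement of X in U
    return len((U - X) | Y) / len(U)

# Proposition 1(iv): kappa_pound <= kappa_1 <= kappa_2 on every pair.
assert all(k_std(X, Y) <= k1(X, Y) <= k2(X, Y)
           for X in subsets(U) for Y in subsets(U))
print("Proposition 1(iv) holds on all", len(subsets(U)) ** 2, "pairs")
```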
3 Similarity Indices in Terms of RIFs

In this section we reformulate the similarity indices studied in [25] in terms of the RIFs κ£, κ₁, or κ₂. The proofs that the indices can really be expressed in this way will be given in the full version of this paper.

Consider a set U₀ of m > 0 data points to be grouped by some clustering methods A₁ and A₂. Let U (our universe) be the set of all unordered pairs {x, y} ⊆ U₀ of data points, to be compared in order to assess the clusterings, i.e. the partitions of U₀ generated by A₁ and by A₂, denoted by C₁ and C₂ here. Thus, #U = M = C(m, 2) = m(m − 1)/2. The similarity between the clusterings C₁ and C₂ (and the clustering methods A₁ and A₂) is usually assessed on the basis of the number of pairs of data points that are put into the same cluster, or into different clusters, by each of the grouping methods considered. For i = 1, 2, let us define

  Xᵢ = {{x, y} ∈ U | x, y are put into the same cluster by Aᵢ}.   (4)

Additionally, let

  a = #(X₁ ∩ X₂), b = #(X₁ ∩ X₂ᶜ), c = #(X₁ᶜ ∩ X₂), d = #(X₁ᶜ ∩ X₂ᶜ).   (5)

In words, a is the number of pairs of data points {x, y} such that x and y are placed in the same cluster by both A₁ and A₂; b (respectively, c) is the number of pairs {x, y} such that x and y are placed in the same cluster by A₁ (resp., A₂) but in different clusters by A₂ (resp., A₁); finally, d is the number of pairs {x, y} such that x and y are placed in different clusters by both A₁ and A₂. We also have #X₁ = a + b, #X₂ = a + c, #X₁ᶜ = c + d, #X₂ᶜ = b + d, and #U = a + b + c + d = M. For simplicity, assume that a, b, c, d > 0. Then we have

  κ£(X₁, X₂) = a/(a + b),
  κ₁(X₁, X₂) = (a + c)/(a + b + c),
  κ₂(X₁, X₂) = (a + c + d)/M.   (6)

In what follows, we present the similarity indices one by one, together with their new formulations in terms of κ£, κ₁, or κ₂. A toy computation of the quantities involved is sketched below.
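To make the bookkeeping concrete, the following hedged sketch computes a, b, c, d of (5) from two toy label vectors and numerically checks the three equalities of (6); the data and all variable names are invented for illustration.

```python
# Counts a, b, c, d of Eq. (5) from two clusterings given as label vectors,
# with a numeric check of the RIF values of Eq. (6).
from itertools import combinations
from math import isclose

labels1 = [0, 0, 0, 1, 1, 2]   # toy clustering C1 of m = 6 data points
labels2 = [0, 0, 1, 1, 1, 2]   # toy clustering C2 of the same points

U = list(combinations(range(len(labels1)), 2))        # unordered pairs, #U = M
X1 = {p for p in U if labels1[p[0]] == labels1[p[1]]}  # same cluster under A1
X2 = {p for p in U if labels2[p[0]] == labels2[p[1]]}  # same cluster under A2
M = len(U)

a = len(X1 & X2)
b = len(X1 - X2)
c = len(X2 - X1)
d = M - a - b - c

# Eq. (6), assuming a, b, c, d > 0:
assert isclose(len(X1 & X2) / len(X1), a / (a + b))            # kappa_pound(X1, X2)
assert isclose(len(X2) / len(X1 | X2), (a + c) / (a + b + c))  # kappa_1(X1, X2)
assert isclose((M - len(X1 - X2)) / M, (a + c + d) / M)        # kappa_2(X1, X2)
print(a, b, c, d, M)   # -> 2 2 2 9 15 for this toy example
```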
Wallace (1983). The similarity indices W₁, W₂ with range [0, 1] were introduced by D. L. Wallace:

  W₁(C₁, C₂) =_def a/(a + b),
  W₂(C₁, C₂) =_def a/(a + c).   (7)

It is easy to see that

  W₁(C₁, C₂) = κ£(X₁, X₂), W₂(C₁, C₂) = κ£(X₂, X₁).   (8)

Kulczyński (1927). The similarity index K with range [0, 1] was proposed by S. Kulczyński in 1927:

  K(C₁, C₂) =_def (1/2)(a/(a + b) + a/(a + c)).   (9)

K can be rewritten in the following form:

  K(C₁, C₂) = (1/2)(κ£(X₁, X₂) + κ£(X₂, X₁)).   (10)

In words, K(C₁, C₂) is the arithmetical mean of κ£(X₁, X₂) and κ£(X₂, X₁).

McConnaughey (1964). The similarity index MC with range [−1, 1] goes back to B. H. McConnaughey:

  MC(C₁, C₂) =_def (a² − bc)/((a + b)(a + c)).   (11)

This index can be expressed by the following equation:

  MC(C₁, C₂) = κ£(X₁, X₂) + κ£(X₂, X₁) − 1.   (12)

Peirce (1884). The similarity index PE with range [−1, 1] is attributed to C. S. Peirce:

  PE(C₁, C₂) =_def (ad − bc)/((a + c)(b + d)).   (13)

The index PE can be characterized as follows:

  PE(C₁, C₂) = (1/2)(κ£(X₂, X₁) + κ£(X₂ᶜ, X₁ᶜ) − κ£(X₂, X₁ᶜ) − κ£(X₂ᶜ, X₁)).   (14)

The Gamma index. The similarity index Γ with range [−1, 1] is given by

  Γ(C₁, C₂) =_def (ad − bc)/√((a + b)(a + c)(b + d)(c + d)).   (15)

In this case, the following characterization can be obtained:

  Γ(C₁, C₂) = √((1/2)(κ£(X₂, X₁) + κ£(X₂ᶜ, X₁ᶜ) − κ£(X₂, X₁ᶜ) − κ£(X₂ᶜ, X₁))) · √(κ£(X₁, X₂) − κ£(X₁ᶜ, X₂)).   (16)

Ochiai (1957), Fowlkes and Mallows (1983). The similarity index OFM ranges over [0, 1]. It was introduced by A. Ochiai in 1957 and again by E. B. Fowlkes and C. L. Mallows in 1983:

  OFM(C₁, C₂) =_def a/√((a + b)(a + c)).   (17)

After rewriting we get

  OFM(C₁, C₂) = √(κ£(X₁, X₂) κ£(X₂, X₁)).   (18)

That is, OFM(C₁, C₂) is the geometrical mean of κ£(X₁, X₂) and κ£(X₂, X₁).

The Pearson index. The similarity index P, named after K. Pearson, ranges over [−1, 1]. It is given by

  P(C₁, C₂) =_def (ad − bc)/((a + b)(a + c)(b + d)(c + d)).   (19)

The index P can be expressed in the following ways:

  P(C₁, C₂) = (ad − bc)⁻¹ · Γ²(C₁, C₂)
            = (κ£(X₁, X₂) − κ£(X₁ᶜ, X₂)) · κ£(X₂, {u}) · κ£(X₂ᶜ, {u′})   (20)

for arbitrary u ∈ X₂ and u′ ∉ X₂.

Sokal and Sneath (1963). The similarity indices SS₁, SS₂, SS₃ with range [0, 1] were introduced by R. R. Sokal and P. H. Sneath in 1963. The third index is also attributed to A. Ochiai (1957):

  SS₁(C₁, C₂) =_def (1/4)(a/(a + b) + a/(a + c) + d/(b + d) + d/(c + d)),
  SS₂(C₁, C₂) =_def a/(a + 2(b + c)),
  SS₃(C₁, C₂) =_def ad/√((a + b)(a + c)(b + d)(c + d)).   (21)

One can prove the following:

  SS₁(C₁, C₂) = (1/4)(κ£(X₁, X₂) + κ£(X₂, X₁) + κ£(X₁ᶜ, X₂ᶜ) + κ£(X₂ᶜ, X₁ᶜ)),
  SS₂(C₁, C₂) = (κ₁(X₁, X₂) + κ₁(X₂, X₁) − 1)/(3 − (κ₁(X₁, X₂) + κ₁(X₂, X₁))),
  SS₃(C₁, C₂) = √(κ£(X₁, X₂) κ£(X₂, X₁) κ£(X₁ᶜ, X₂ᶜ) κ£(X₂ᶜ, X₁ᶜ)).   (22)

Thus, SS₁(C₁, C₂) (resp., SS₃(C₁, C₂)) is the arithmetical (geometrical) mean of κ£(X₁, X₂), κ£(X₂, X₁), κ£(X₁ᶜ, X₂ᶜ), and κ£(X₂ᶜ, X₁ᶜ).

Jaccard (1908). The similarity index J with range [0, 1] goes back to P. Jaccard:

  J(C₁, C₂) =_def a/(a + b + c).   (23)

It can be shown that

  J(C₁, C₂) = κ₁(X₁, X₂) + κ₁(X₂, X₁) − 1.   (24)

Sokal and Michener (1958), Rand (1971). The similarity index R with range [0, 1] was introduced by R. R. Sokal and C. D. Michener, and later independently by W. Rand:

  R(C₁, C₂) =_def (a + d)/M.   (25)

The index R can be rewritten as

  R(C₁, C₂) = κ₂(X₁, X₂) + κ₂(X₂, X₁) − 1.   (26)

Hamann (1961), Hubert (1977). The similarity index H, ranging over [−1, 1], was proposed by U. Hamann and independently by L. J. Hubert:

  H(C₁, C₂) =_def ((a + d) − (b + c))/M.   (27)

By certain transformations we obtain

  H(C₁, C₂) = 2(κ₂(X₁, X₂) + κ₂(X₂, X₁)) − 3.   (28)

Czekanowski (1932), Dice (1945), Gower and Legendre (1986). The similarity index CZ ranges over [0, 1]. It was proposed by J. Czekanowski in 1932, by L. R. Dice in 1945, and by J. C. Gower and P. Legendre in 1986:

  CZ(C₁, C₂) =_def 2a/(2a + b + c).   (29)

One can prove the following:

  CZ(C₁, C₂) = 2(κ₁(X₁, X₂) + κ₁(X₂, X₁) − 1)/(κ₁(X₁, X₂) + κ₁(X₂, X₁)).   (30)

Russel and Rao (1940). The similarity index RR ranges over [0, 1] and is attributed to P. F. Russel and T. R. Rao:

  RR(C₁, C₂) =_def a/M.   (31)

In this case we obtain

  RR(C₁, C₂) = κ£(U, X₁ ∩ X₂) = κ₂(U, X₁ ∩ X₂).   (32)

Fager and McGowan (1963). The similarity index FMG with range [−1/2, 1) goes back to E. W. Fager and J. A. McGowan:

  FMG(C₁, C₂) =_def a/√((a + b)(a + c)) − 1/(2√(a + b)).   (33)

The above formula can be expressed in the following way:

  FMG(C₁, C₂) = √(κ£(X₁, X₂) κ£(X₂, X₁)) − (1/2)√(κ£(X₁, {u}))   (34)

for an arbitrary u ∈ X₁.

Sokal and Sneath (1963), Gower and Legendre (1986). The similarity index GL with range [0, 1] was introduced by R. R. Sokal and P. H. Sneath in 1963, and again by J. C. Gower and P. Legendre in 1986:

  GL(C₁, C₂) =_def (a + d)/(a + (1/2)(b + c) + d).   (35)

A characterization of GL in terms of κ₂ is the following:

  GL(C₁, C₂) = 2(κ₂(X₁, X₂) + κ₂(X₂, X₁) − 1)/(κ₂(X₁, X₂) + κ₂(X₂, X₁)).   (36)

Rogers and Tanimoto (1960). The similarity index RT with range [0, 1] is attributed to D. J. Rogers and T. T. Tanimoto:

  RT(C₁, C₂) =_def (a + d)/(a + 2(b + c) + d).   (37)

This index can be rewritten in the following form:

  RT(C₁, C₂) = (κ₂(X₁, X₂) + κ₂(X₂, X₁) − 1)/(3 − (κ₂(X₁, X₂) + κ₂(X₂, X₁))).   (38)

Yule (1927), Goodman and Kruskal (1954). The similarity index GK ranges over [−1, 1]. It was proposed by G. U. Yule in 1927, and again by L. A. Goodman and W. H. Kruskal in 1954:

  GK(C₁, C₂) =_def (ad − bc)/(ad + bc).   (39)

This index can be expressed in terms of the standard RIF as follows:

  GK(C₁, C₂) = (κ£(X₂, X₁) κ£(X₂ᶜ, X₁ᶜ) − κ£(X₂, X₁ᶜ) κ£(X₂ᶜ, X₁)) / (κ£(X₂, X₁) κ£(X₂ᶜ, X₁ᶜ) + κ£(X₂, X₁ᶜ) κ£(X₂ᶜ, X₁)).   (40)

Several of these characterizations are spot-checked numerically in the sketch below.
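The following sketch numerically confirms the characterizations (24), (26), (38), and (40) on the toy counts a = b = c = 2, d = 9 from the previous sketch; it is an illustration under those assumptions, not a proof.

```python
# Spot-check of the kappa_1-, kappa_2-, and kappa_pound-based
# characterizations (24), (26), (38), and (40) on toy counts.
from math import isclose

a, b, c, d = 2, 2, 2, 9        # toy counts from the sketch above
M = a + b + c + d

k1_12, k1_21 = (a + c) / (a + b + c), (a + b) / (a + b + c)   # kappa_1 values
k2_12, k2_21 = (a + c + d) / M, (a + b + d) / M               # kappa_2 values

assert isclose(a / (a + b + c), k1_12 + k1_21 - 1)            # Jaccard, Eq. (24)
assert isclose((a + d) / M, k2_12 + k2_21 - 1)                # Rand, Eq. (26)
s = k2_12 + k2_21
assert isclose((a + d) / (a + 2 * (b + c) + d), (s - 1) / (3 - s))  # RT, Eq. (38)

# Goodman-Kruskal, Eq. (40), via standard-RIF values:
kA, kB = a / (a + c), d / (b + d)   # kappa_pound(X2, X1), kappa_pound(X2^c, X1^c)
kC, kD = c / (a + c), b / (b + d)   # kappa_pound(X2, X1^c), kappa_pound(X2^c, X1)
assert isclose((a * d - b * c) / (a * d + b * c),
               (kA * kB - kC * kD) / (kA * kB + kC * kD))
print("(24), (26), (38), (40) confirmed on the toy counts")
```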
Baulieu (1989). The similarity indices B₁ and B₂ range over [0, 1] and [−1/4, 1/4], respectively. They were introduced by F. B. Baulieu in 1989:

  B₁(C₁, C₂) =_def (M² − M(b + c) + (b − c)²)/M²,
  B₂(C₁, C₂) =_def (ad − bc)/M².   (41)

As in all the previous cases, a RIF (precisely, κ₂ here) underlies the definitions of these similarity indices, viz.

  B₁(C₁, C₂) = κ₂(X₁, X₂) + κ₂(X₂, X₁) − 1 + (κ₂(U, X₁) − κ₂(U, X₂))²,
  B₂(C₁, C₂) = (1 − κ₂(X₁, X₂ᶜ)) κ₂(U, X₁ᶜ) − (1 − κ₂(X₁ᶜ, X₂ᶜ)) κ₂(U, X₁).   (42)

4 Final Remarks

The main goal realized in this paper was to show that a considerable number of similarity indices known from the literature can be formulated in terms of some rough inclusion functions. Rough inclusion functions (RIFs) are mappings, inspired by the notion of a rough inclusion introduced by L. Polkowski and A. Skowron as a basic concept of rough mereology, by means of which one can measure the degree of inclusion of a set of objects in a set of objects. Since information granules can be viewed as particular sets of objects, RIFs are important not only for rough set theory but also for granular computing.

Starting with the standard RIF κ£ and two other RIFs of a similar origin, denoted by κ₁ and κ₂, we have obtained all 22 similarity indices discussed in [25]. In that paper it is proved that the indices K and MC are equivalent after the correction known as the correction for agreement due to chance, and that the same holds for R, H, and CZ. We have not referred to this question because we are interested in other aspects of similarity indices. For example, we envisage the usage of similarity indices in granular computing to calculate the degree of similarity between compound information granules, such as indistinguishability relations and tolerance relations on a set of elementary objects under consideration. Let us note that similarity indices can also be used in granular computing in a more general setting, viz. to compute the degree of similarity between arbitrary sets of objects. In the full version of this article we will give an illustrative example and proofs of the formulas characterizing the similarity indices considered. In future research we will generalize our results, viz. we will propose general schemata for the generation of similarity indices from an arbitrary RIF. Another question, also suggested by the referee, is the discovery of relationships between RIFs and quality measures for clusters.

References

1. Polkowski, L.: Reasoning by Parts: An Outline of Rough Mereology. Warszawa (2011)
2. Polkowski, L., Skowron, A.: Rough mereology. Lecture Notes in Artificial Intelligence 869 (1994) 85–94
3. Polkowski, L., Skowron, A.: Rough mereology: A new paradigm for approximate reasoning. Int. J. Approximate Reasoning 15(4) (1996) 333–365
4. Polkowski, L., Skowron, A.: Rough mereological calculi of granules: A rough set approach to computation. Computational Intelligence 17(3) (2001) 472–492
5. Pawlak, Z.: Rough Sets. Theoretical Aspects of Reasoning About Data. Kluwer, Dordrecht (1991)
6. Leśniewski, S.: Foundations of the General Set Theory 1 (in Polish). Volume 2 of Works of the Polish Scientific Circle, Moscow (1916). Also in [7], pages 128–173
7. Surma, S.J., Srzednicki, J.T., Barnett, J.D., eds.: Stanislaw Leśniewski Collected Works. Kluwer/Polish Scientific Publ., Dordrecht/Warsaw (1992)
8. Zadeh, L.A.: Outline of a new approach to the analysis of complex systems and decision processes. IEEE Trans. on Systems, Man, and Cybernetics 3 (1973) 28–44
9. Pawlak, Z., Skowron, A.: Rudiments of rough sets. Information Sciences 177(1) (2007) 3–27
10. Pedrycz, W., Skowron, A., Kreinovich, V., eds.: Handbook of Granular Computing. John Wiley & Sons, Chichester (2008)
11. Stepaniuk, J.: Rough-Granular Computing in Knowledge Discovery and Data Mining. Springer-Verlag, Berlin Heidelberg (2008)
12. Pawlak, Z., Skowron, A.: Rough membership functions. In Fedrizzi, M., Kacprzyk, J., Yager, R.R., eds.: Fuzzy Logic for the Management of Uncertainty. John Wiley & Sons, New York (1994) 251–271
13. Skowron, A., Stepaniuk, J.: Tolerance approximation spaces. Fundamenta Informaticae 27(2–3) (1996) 245–253
14. Stepaniuk, J.: Knowledge discovery by application of rough set models. In: [34]. (2001) 137–233
15. Ziarko, W.: Variable precision rough set model. J. Computer and System Sciences 46(1) (1993) 39–59
16. Ziarko, W.: Probabilistic decision tables in the variable precision rough set model. Computational Intelligence 17(3) (2001) 593–603
17. Yao, Y.Y.: Decision-theoretic rough set models. Lecture Notes in Artificial Intelligence 4481 (2007) 1–12
18. Yao, Y.Y., Wong, S.K.M.: A decision theoretic framework for approximating concepts. Int. J. of Man–Machine Studies 37(6) (1992) 793–809
19. Tsumoto, S.: Modelling medical diagnostic rules based on rough sets. Lecture Notes in Artificial Intelligence 1424 (1998) 475–482
20. Gomolińska, A.: Satisfiability and meaning of formulas and sets of formulas in approximation spaces. Fundamenta Informaticae 67(1–3) (2005) 77–92
21. Nguyen, H.S., Skowron, A., Stepaniuk, J.: Granular computing: A rough set approach. Computational Intelligence 17(3) (2001) 514–544
22. Cios, K.J., Pedrycz, W., Swiniarski, R.W., Kurgan, L.A.: Data Mining: A Knowledge Discovery Approach. Springer Science+Business Media, LLC, New York (2007)
23. Jain, A.K., Murty, M.N., Flynn, P.J.: Data clustering: A review. ACM Computing Surveys 31(3) (1999) 264–323
24. Kaufman, L., Rousseeuw, P.J.: Finding Groups in Data: An Introduction to Cluster Analysis. John Wiley & Sons, Chichester (1990)
25. Albatineh, A.N., Niewiadomska-Bugaj, M., Mihalko, D.: On similarity indices and correction for chance agreement. J. of Classification 23 (2006) 301–313
26. Gomolińska, A.: On three closely related rough inclusion functions. Lecture Notes in Artificial Intelligence 4585 (2007) 142–151
27. Gomolińska, A.: On certain rough inclusion functions. Transactions on Rough Sets IX: journal subline of LNCS 5390 (2008) 35–55
28. Gomolińska, A.: Rough approximation based on weak q-RIFs. Transactions on Rough Sets X: journal subline of LNCS 5656 (2009) 117–135
29. Xu, Z.B., Liang, J.Y., Dang, C.Y., Chin, K.S.: Inclusion degree: A perspective on measures for rough set data analysis. Information Sciences 141 (2002) 227–236
30. Zhang, W.X., Leung, Y.: Theory of including degrees and its applications to uncertainty inference. In: Proc. of 1996 Asian Fuzzy System Symposium. (1996) 496–501
31. Borkowski, L., ed.: Jan Łukasiewicz – Selected Works. North Holland/Polish Scientific Publ., Amsterdam/Warsaw (1970)
32. Łukasiewicz, J.: Die logischen Grundlagen der Wahrscheinlichkeitsrechnung. Kraków (1913). English translation in [31], pages 16–63
33. Drwal, G., Mrózek, A.: System RClass – software implementation of a rough classifier. In Kłopotek, M.A., Michalewicz, M., Raś, Z.W., eds.: Proc. 7th Int. Symp. Intelligent Information Systems (IIS'1998), Malbork, Poland, June 1998. (1998) 392–395
34. Polkowski, L., Tsumoto, S., Lin, T.Y., eds.: Rough Set Methods and Applications: New Developments in Knowledge Discovery in Information Systems. Physica-Verlag, Heidelberg New York (2001)