Semantic Rendering of Data Tables: Multivalued Information Systems Revisited

Marcin Wolski¹ and Anna Gomolińska²

¹ Maria Curie-Skłodowska University, Department of Logic and Cognitive Science, Pl. Marii Curie-Skłodowskiej 4, 20-031 Lublin, Poland, marcin.wolski@umcs.lublin.pl
² University of Białystok, Faculty of Mathematics and Informatics, Konstantego Ciołkowskiego 1M, 15-245 Białystok, Poland, anna.gom@math.uwb.edu.pl

Abstract. Data tables provide a convenient means of representing descriptive information about objects. They also serve as standard input for data analysis tools and theories. In this paper we focus on a special class of data tables, namely multivalued information systems, introduced by Z. Pawlak and E. Orłowska in the early 1980s. The main idea presented in the paper is to interpret multivalued information systems as semantically processed single-valued data tables. This interpretation allows us to describe classical rough set theory, dominance-based rough set theory, and formal concept analysis within the framework of multivalued information systems.

Keywords: information system, rough set, dominance relation, formal concept

1 Introduction

Data tables provide a simple and effective means of representing collected pieces of information about a given set of objects. They also serve as standard input for data analysis tools and theories, the output of which is often referred to as knowledge. In the present paper we focus on a special class of data tables, namely multivalued/approximate/nondeterministic information systems, introduced by Z. Pawlak and E. Orłowska in the early 1980s [3, 4]. These systems generalise the standard data tables/information systems to the case in which, for an object \(x\) and an attribute \(A\), the table entry is a set \(V_A\) of attribute values instead of a single value.
The formal definitions of multivalued and approximate information systems are actually the same (see [4]); it is the interpretation/semantics of the table entries that makes the difference. The first interpretation reads: the object \(x\) has all values from \(V_A\) for the attribute \(A\) (multivalued systems). The second reads: the object \(x\) has a single value from the set \(V_A\) for the attribute \(A\) (approximate systems). Systems equipped with a generalised semantics (which reads: for the object \(x\) and the attribute \(A\), the set \(V_A\) provides some possible values) are called nondeterministic information systems by Z. Pawlak and E. Orłowska [3]. Of course, the semantics of the entries in the table determines how information is further processed; in other words, it determines which relations between objects are used to construct the information granules that are the building blocks of knowledge. The main concern of [4] is the informational indiscernibility between objects (an equivalence relation), whereas the main focus of [3] is information inclusion and informational connection (a preorder and a tolerance relation, respectively).

The main novelty of the present paper consists in taking multivalued information systems as semantically enriched single-valued data tables; this idea can be summarised by the following equation:

data table + semantics = multivalued information system.

Thus multivalued information systems represent semantically processed data. We start our study with some standard data tables used in the leading theories of data analysis: single-valued information systems from rough set theory (RST) [4–6], single-valued information systems enriched by dominance relations, taken from dominance-based rough set theory (DRS) [7, 8], and formal contexts from formal concept analysis (FCA) [1, 9]. Then we provide these structures with some specific interpretation/semantics.
To this end we use scales from FCA, which are tools for converting a multivalued formal context (which is actually a standard information system) into a (single-valued) formal context. We are going to employ these scales in order to obtain multivalued information systems. Then we focus on the informational relations of indiscernibility and inclusion. These steps allow us to consider RST, DRS, and FCA within a single conceptual framework of multivalued information systems and to emphasise how these theories differ semantically. Finally, we shall discuss different interpretations of RST, DRS, and FCA based operators in the context of John Stuart Mill's inductive reasoning [2].

2 Data Tables and Semantics

In the present section we discuss different forms of data tables considered in the leading theories of data analysis: rough set theory (RST) [4–6], dominance-based rough sets (DRS) [7, 8], and formal concept analysis (FCA) [1, 9]. Of course, apart from data tables (input), each theory provides special tools to process the tables and produce some meaningful output. However, in the present section we shall discuss only the tables, whereas the ways they may be further processed will be presented in the next section.

Definition 1 (Formal Context [1, 9]). A formal context is a triple \((U, \mathit{Att}, R)\), where \(U\) is a set of objects, \(\mathit{Att}\) is a set of binary attributes, and \(R \subseteq U \times \mathit{Att}\).

If an object \(x\) stands in the relation \(R\) to \(A\), then we mark it in the data table by ×. The table in Fig. 1 presents a very simple context; Bob is a good mathematician whereas Agnes is not; on the bright side, she is rich.

|        | good mathematician | rich |
|--------|--------------------|------|
| Steven | ×                  |      |
| Bob    | ×                  | ×    |
| Agnes  |                    | ×    |

Fig. 1. A simple formal context

Definition 2 (Information System [3, 4]). A quadruple \(I = (U, \mathit{Att}, \mathit{Val}, f)\) is called an information system, where:
– \(U\) is a nonempty finite set of objects,
– \(\mathit{Att}\) is a nonempty finite set of attributes,
– \(\mathit{Val} = \bigcup_{A \in \mathit{Att}} \mathit{Val}_A\), where \(\mathit{Val}_A\) is the value domain of the attribute \(A\), and \(\mathit{Val}_A \cap \mathit{Val}_B = \emptyset\) for all distinct \(A, B \in \mathit{Att}\) (the last condition is not necessary, yet it is mathematically convenient),
– \(f : U \times \mathit{Att} \to \mathit{Val}\) is an information function such that \(f(x, A) \in \mathit{Val}_A\) for all \(A \in \mathit{Att}\) and \(x \in U\).

If \(f\) is a partial function, then the information system \(I\) is called incomplete. If the codomain of \(f\) is the powerset of \(\mathit{Val}\), then the system is called multivalued. In what follows we shall confine our attention to complete information systems (a simple example is depicted in Fig. 2) and to their multivalued versions obtained by scaling.

|        | mathematician (1-6) | rich         |
|--------|---------------------|--------------|
| Steven | 5                   | $0.1 million |
| Bob    | 4                   | $0.8 million |
| Agnes  | 2                   | $1 million   |

Fig. 2. An information system

Definition 3 (Dominance-Based Data Table [7, 8]). A dominance-based data table is a system \(I = (U, \mathit{Att}, \mathit{Val}, f, \leq_D)\), where \((U, \mathit{Att}, \mathit{Val}, f)\) is an information system such that all attribute values are real numbers. Let us define: \(x \leq_A y\) iff \(f(x, A) \leq f(y, A)\), for \(x, y \in U\). Then \(x \leq_D y\) iff \(x \leq_A y\) for all \(A \in \mathit{Att}\).

Thus, information systems allow one to be more specific about the meaning of attributes. In the case of dominance-based data tables, we additionally assume that the attribute values are comparable with respect to some complete order. If \(f(x, A) \leq f(y, A)\), then we say that \(y\) is at least as good as \(x\) with respect to \(A\). If \(y\) is at least as good as \(x\) with respect to all attributes, then we say that \(y\) dominates \(x\).

Of course, any information system may be converted into a formal context. The easiest way of doing this is called nominal scaling in FCA. Formally, a scale for \(A\) is a formal context \((\mathit{Val}_A, \mathit{Val}_A, R_A)\) having \(\mathit{Val}_A\) as both the set of objects and the set of attributes. We also assume that the identity \(\mathrm{id}_{\mathit{Val}_A}\) is included in \(R_A\).
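Definition 3 can be made concrete with a few lines of code. The following is only a minimal sketch: it encodes the table of Fig. 2 as a dictionary (amounts in millions of dollars), and the helper names `leq_A` and `leq_D` are ours, introduced for illustration.

```python
# Fig. 2 as a complete information system: f[(x, A)] is a single value.
U = ["Steven", "Bob", "Agnes"]
Att = ["mathematician", "rich"]
f = {
    ("Steven", "mathematician"): 5, ("Steven", "rich"): 0.1,
    ("Bob", "mathematician"): 4, ("Bob", "rich"): 0.8,
    ("Agnes", "mathematician"): 2, ("Agnes", "rich"): 1.0,
}

def leq_A(x, y, A):
    """x <=_A y: y is at least as good as x with respect to attribute A."""
    return f[(x, A)] <= f[(y, A)]

def leq_D(x, y):
    """x <=_D y: y dominates x, i.e. x <=_A y holds for every attribute A."""
    return all(leq_A(x, y, A) for A in Att)

# The dominance relation as a set of ordered pairs.
dominance = {(x, y) for x in U for y in U if leq_D(x, y)}
print(sorted(dominance))
```

On these values only attribute-wise comparisons such as Steven \(\leq_{\mathit{rich}}\) Bob hold; the full dominance relation contains just the reflexive pairs, because the two attributes rank the three people in opposite orders.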
After scaling, each attribute-value pair \((A, v)\) is regarded as a separate attribute of the new context \(C_I = (U, \{(A, v) : A \in \mathit{Att},\ v \in \mathit{Val}_A\}, R)\). Since the fundamental relation in rough set theory is indiscernibility, the values in \((\mathit{Val}_A, \mathit{Val}_A, R_A)\) are compared by the equality relation: \(R_A\) is \(=\), for every \(A\); in consequence, 1 is interpreted as 1, 2 as 2, and so on. That is why in the scales depicted in Fig. 3 only the diagonal is marked by ×. This type of scaling is called nominal.

Scale for mathematician:

|   | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| 1 | × |   |   |   |   |   |
| 2 |   | × |   |   |   |   |
| 3 |   |   | × |   |   |   |
| 4 |   |   |   | × |   |   |
| 5 |   |   |   |   | × |   |
| 6 |   |   |   |   |   | × |

Scale for rich (values in millions):

|      | $0.1 | $0.8 | $1 |
|------|------|------|----|
| $0.1 | ×    |      |    |
| $0.8 |      | ×    |    |
| $1   |      |      | ×  |

Resulting context (columns 1-6: mathematician; columns $0.1-$1: rich):

|        | 1 | 2 | 3 | 4 | 5 | 6 | $0.1 | $0.8 | $1 |
|--------|---|---|---|---|---|---|------|------|----|
| Steven |   |   |   |   | × |   | ×    |      |    |
| Bob    |   |   |   | × |   |   |      | ×    |    |
| Agnes  |   | × |   |   |   |   |      |      | ×  |

Fig. 3. Nominal scaling: the common ground of FCA and RST

However, since all values in our exemplary data table are real numbers, we can compare them with respect to \(\leq\) instead of \(=\). Actually, when we order these values according to \(\leq\) we do change the scaling, or better still, the semantics (interpretation) of attribute values. Now, e.g., Bob's score 4 and Agnes's score 2 mean that Bob is a better mathematician than Agnes. In other words, whatever Agnes can solve, Bob can solve as well. Following DRS, we may say that Bob is at least as good as Agnes with respect to the attribute mathematician. Formally, the higher value (e.g., 4) with respect to \(\leq\) implies the lower value (e.g., 2). Such scales are called ordinal scales. Thus this time the mark × on 4 may mean that Bob has solved at least 4 problems. Under this reading Bob has solved at least 3 problems too. Therefore the scaling representing this interpretation (semantics) may be defined as depicted in Fig. 4. Please note that the values $0.1 or $0.8 are interpreted in the same fashion. Thus × on $0.8 means that Bob has at least $0.8 million in his bank account, and in consequence he has at least $0.1 million too. Following DRS, we may say that Bob is at least as good as Steven with respect to the attribute rich (Bob does not dominate Steven outright, since Steven's mathematician score in Fig. 2 is higher).
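The difference between the two readings can be sketched in code by treating a scale as a plain relation on the value domain. This is an illustrative sketch only: the helper `interpret` is a hypothetical name, and the orientation of the pairs (a value \(v\) is related to every \(v_i\) it implies) is our assumption, chosen to agree with the ordinal scale the paper uses for the decision attribute.

```python
values = [1, 2, 3, 4, 5, 6]  # value domain of the attribute "mathematician"

# A scale is a relation R_A on Val_A x Val_A; a pair (v, vi) in R_A reads
# "whoever has the value v is also credited with vi".
nominal = {(v, vi) for v in values for vi in values if v == vi}
ordinal = {(v, vi) for v in values for vi in values if vi <= v}

def interpret(v, scale):
    """Set of values that v stands for under the given scale."""
    return {vi for (u, vi) in scale if u == v}

print(interpret(4, nominal))  # Bob's score read nominally
print(interpret(4, ordinal))  # Bob's score read ordinally

# Both scales contain the identity relation, as required of a scale.
identity = {(v, v) for v in values}
print(identity <= nominal and identity <= ordinal)
```

Under the nominal scale Bob's score 4 stands only for itself, while under the ordinal scale it stands for all of 1, 2, 3, 4, which matches the Bob row of the ordinal table in Fig. 6.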
In consequence, both RST and DRS start with the same data table but use different scales (interpretations). Of course, there are also (many) other possibilities.

Scale for mathematician:

|   | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| 1 | × |   |   |   |   |   |
| 2 | × | × |   |   |   |   |
| 3 | × | × | × |   |   |   |
| 4 | × | × | × | × |   |   |
| 5 | × | × | × | × | × |   |
| 6 | × | × | × | × | × | × |

Scale for rich (values in millions):

|      | $0.1 | $0.8 | $1 |
|------|------|------|----|
| $0.1 | ×    |      |    |
| $0.8 | ×    | ×    |    |
| $1   | ×    | ×    | ×  |

Resulting context (columns 1-6: mathematician; columns $0.1-$1: rich):

|        | 1 | 2 | 3 | 4 | 5 | 6 | $0.1 | $0.8 | $1 |
|--------|---|---|---|---|---|---|------|------|----|
| Steven | × | × | × | × | × |   | ×    |      |    |
| Bob    | × | × | × | × |   |   | ×    | ×    |    |
| Agnes  | × | × |   |   |   |   | ×    | ×    | ×  |

Fig. 4. DRS in terms of formal contexts

We conclude this part with a nontrivial interpretation offered by B. Ganter during his seminar at Warsaw University (a few years ago). He suggested reading the exam scores in natural language and taking the "natural" scale. Under this reading, someone who has done a very good job has also done a good job. The job, of course, is also satisfactory. However, someone who has done a bad job has also done an unsatisfactory job, but not vice versa. In consequence we obtain yet another scale, as depicted in Fig. 5. In this scale we use two orderings, and this type of scaling is therefore called bi-ordinal.

| score | may be interpreted as | 1 | 2 | 3 | 4 | 5 | 6 |
|-------|-----------------------|---|---|---|---|---|---|
| 1     | bad                   | × | × |   |   |   |   |
| 2     | unsatisfactory        |   | × |   |   |   |   |
| 3     | satisfactory          |   |   | × |   |   |   |
| 4     | good                  |   |   | × | × |   |   |
| 5     | very good             |   |   | × | × | × |   |
| 6     | excellent             |   |   | × | × | × | × |

Fig. 5. B. Ganter's semantics of exam scores and scaling

Thus, starting from the information system depicted in Fig. 2 we can obtain a number of different multivalued information systems, depending on which scale is applied to the original system (see Fig. 6).

Nominal scaling ⇒

|        | mathematician (1-6) | rich           |
|--------|---------------------|----------------|
| Steven | {5}                 | {$0.1 million} |
| Bob    | {4}                 | {$0.8 million} |
| Agnes  | {2}                 | {$1 million}   |

Ordinal scaling ⇒

|        | mathematician (1-6) | rich                                     |
|--------|---------------------|------------------------------------------|
| Steven | {1, 2, 3, 4, 5}     | {$0.1 million}                           |
| Bob    | {1, 2, 3, 4}        | {$0.1 million, $0.8 million}             |
| Agnes  | {1, 2}              | {$0.1 million, $0.8 million, $1 million} |

B. Ganter's scaling ⇒

|        | mathematician (1-6) | rich                                     |
|--------|---------------------|------------------------------------------|
| Steven | {3, 4, 5}           | {$0.1 million}                           |
| Bob    | {3, 4}              | {$0.1 million, $0.8 million}             |
| Agnes  | {2}                 | {$0.1 million, $0.8 million, $1 million} |

Fig. 6. Multivalued information systems as the results of scaling, i.e. providing data tables with interpretation/semantics

More formally, different interpretations/semantics of attribute values lead to different formal contexts. When one starts with a multivalued context (an information system) \(I = (U, \mathit{Att}, \mathit{Val}, f)\) and a set of scales \(S = \{(\mathit{Val}_A, \mathit{Val}_A, R_A) : A \in \mathit{Att}\}\) (as discussed above), then every pair \((A, v)\), where \(A \in \mathit{Att}\) and \(v \in \mathit{Val}_A\), is regarded as an attribute in the induced formal context \(C_I = (U, \{(A, v) : A \in \mathit{Att},\ v \in \mathit{Val}_A\}, R)\), where \(R\) is defined by

\[ R = \{(x, (A, v)) : v \in f_S(x, A)\}, \qquad f_S(x, A) = \{v_i \in \mathit{Val}_A : f(x, A) = v \ \&\ (v, v_i) \in R_A\}. \]

Of course, every (multivalued) context \(I = (U, \mathit{Att}, \mathit{Val}, f)\) may also be converted into a multivalued information system \(I_S = (U, \mathit{Att}, \mathit{Val}, f_S)\). The whole process of providing \(I\) with the semantics given by \(S\) is depicted in Fig. 6.

3 Information Processing: Approximation Operators

In the previous section we have discussed information systems (collected pieces of data about objects), which must be further processed to give some meaningful output (knowledge). Information processing in rough set theory [4–6] and dominance-based rough set theory (DRS) [7, 8] is done by means of binary relations between objects. Formal concept analysis (FCA) [1, 9], however, is based upon a relation between objects and attributes; in some special cases this relation may be reduced to a relation between objects, but this is not always possible. In multivalued information systems, in contrast to classical information systems, where a single relation is considered, there are three important relations between objects [3, 4].

Definition 4.
Let \((U, \mathit{Att}, \mathit{Val}, f)\) be a multivalued information system; then, for \(x, y \in U\), one can define:
– Informational Indiscernibility: \(x\ \mathit{Ind}\ y\) iff \(f(x, A) = f(y, A)\) for all \(A \in \mathit{Att}\),
– Informational Connectivity (Similarity): \(x\ \mathit{Sim}\ y\) iff \(f(x, A) \cap f(y, A) \neq \emptyset\) for all \(A \in \mathit{Att}\),
– Informational Inclusion: \(x\ \mathit{Incl}\ y\) iff \(f(x, A) \subseteq f(y, A)\) for all \(A \in \mathit{Att}\).

Usually, the output of data analysis is related to some aspect of reality represented by a decision attribute. For example, we could take the subjective quality of life as the decision attribute (Fig. 7). Then our aim would be to "express" the subjective quality of life in terms of the attributes mathematician and rich, which are (in this case) called conditional attributes (we would like to obtain knowledge of how the subjective quality of life "depends" on wealth and education in mathematics). Information systems having a single attribute distinguished as a decision attribute are called decision tables. In such a case, all informational relations (indiscernibility, connectivity, and inclusion) are defined with respect to conditional attributes only.

|        | mathematician (1-6) | rich         | sub. quality of life (1-3) |
|--------|---------------------|--------------|----------------------------|
| Steven | 5                   | $0.1 million | 3                          |
| Bob    | 4                   | $0.8 million | 3                          |
| Agnes  | 2                   | $1 million   | 1                          |

Fig. 7. A decision table

As earlier, we shall start the discussion in this section with FCA (and its operators).

Definition 5 (Derivation Operators). For a formal context \(C = (U, \mathit{Att}, R)\), define:

\[ R'(X) = \{A \in \mathit{Att} : \forall x \in X\ ((x, A) \in R)\}, \qquad R'(\mathcal{A}) = \{x \in U : \forall A \in \mathcal{A}\ ((x, A) \in R)\}, \]

for all \(X \subseteq U\) and \(\mathcal{A} \subseteq \mathit{Att}\).

Definition 6 (Formal Concept). A formal concept is a pair \((X, \mathcal{A})\) such that \(X = R'(\mathcal{A})\) and \(\mathcal{A} = R'(X)\). \(X\) is called the extension and \(\mathcal{A}\) the intension of this concept.

Let us start with a complete information system \(I = (U, \mathit{Att}, \mathit{Val}, f)\) and a semantics \(S\), that is, a family of scales \(R_A\), for all \(A \in \mathit{Att}\).
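Before continuing, Definitions 5 and 6 can be sketched on the small context of Fig. 1. This is a minimal illustration, not part of the original development: the function names `intent`, `extent`, and `is_concept` are ours, and the Steven row follows our reading of Fig. 1 (Steven is marked as a good mathematician only).

```python
U = {"Steven", "Bob", "Agnes"}
Att = {"good mathematician", "rich"}
# The relation R of Fig. 1 (the Steven row is our reading of the figure).
R = {
    ("Steven", "good mathematician"),
    ("Bob", "good mathematician"),
    ("Bob", "rich"),
    ("Agnes", "rich"),
}

def intent(X):
    """R'(X): the attributes shared by all objects in X."""
    return {A for A in Att if all((x, A) in R for x in X)}

def extent(B):
    """R'(B): the objects that have every attribute in B."""
    return {x for x in U if all((x, A) in R for A in B)}

def is_concept(X, B):
    """(X, B) is a formal concept iff X = R'(B) and B = R'(X)."""
    return extent(B) == set(X) and intent(X) == set(B)

print(extent(intent({"Steven"})))  # closure of {Steven}
print(is_concept({"Bob"}, Att))
```

The closure of {Steven} also contains Bob, since Bob has every attribute Steven has; this is the object-level inclusion that the text goes on to express with the relation \(\mathit{Incl}\).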
As one can easily observe, for every object \(x\) in \(C_I = (U, \{(A, v) : A \in \mathit{Att},\ v \in \mathit{Val}_A\}, R)\), the pair \((R'(R'(\{x\})), R'(\{x\}))\) is a concept, and \(y \in R'(R'(\{x\}))\) provided that \(x\ \mathit{Incl}\ y\) in the corresponding multivalued information system \(I_S = (U, \mathit{Att}, \mathit{Val}, f_S)\):

\[ R'(R'(\{x\})) = \{y \in U : x\ \mathit{Incl}\ y\}. \]

However, it usually happens that

\[ R'(R'(X)) \neq \{y \in U : \exists x\ (x\ \mathit{Incl}\ y \ \&\ x \in X)\}. \]

But it always holds that

\[ X \subseteq \{y \in U : \exists x\ (x\ \mathit{Incl}\ y \ \&\ x \in X)\} \subseteq R'(R'(X)). \]

Moreover, after the conversion of an information system \(I = (U, \mathit{Att}, \mathit{Val}, f)\) to the formal context \(C_I\) we lose contact with the original attributes (we shall discuss this issue in detail later in the paper).

Definition 7 (Lower and Upper Approximations). A pair \((U, E)\), where \(E\) is an equivalence relation on \(U\), is called an approximation space. Define after Z. Pawlak:

\[ \mathit{Low}_E(X) = \{x \in U : [x]_E \subseteq X\}, \qquad \mathit{Upp}_E(X) = \{x \in U : [x]_E \cap X \neq \emptyset\}. \]

\(\mathit{Low}_E(X)\) is called the lower approximation of \(X\), whereas \(\mathit{Upp}_E(X)\) is called the upper approximation of \(X\).

Coming back to information systems, every set of attributes \(\mathcal{A} \subseteq \mathit{Att}\) of an information system \(I = (U, \mathit{Att}, \mathit{Val}, f)\) induces an approximation space \((U, E_{\mathcal{A}})\), where

\[ E_{\mathcal{A}} = \{(x, y) : f(x, A) = f(y, A) \text{ for all } A \in \mathcal{A}\}. \]

In order to simplify the notation, we shall write \(\mathit{Low}_{\mathcal{A}}(X)\) and \(\mathit{Upp}_{\mathcal{A}}(X)\) for \(\mathit{Low}_{E_{\mathcal{A}}}(X)\) and \(\mathit{Upp}_{E_{\mathcal{A}}}(X)\), respectively. In the case when \(\mathcal{A} = \mathit{Att}\), we shall leave \(E\) without any subscript.

Every information system \(I = (U, \mathit{Att}, \mathit{Val}, f)\) (together with a family of scales \(S\)) also induces a multivalued information system \(I_S = (U, \mathit{Att}, \mathit{Val}, f_S)\) and another approximation space \((U, \mathit{Ind})\). Due to scaling, \(\mathit{Ind}\) and \(E\) may be two different relations. For any \(\mathcal{A} \subseteq \mathit{Att}\) of \((U, \mathit{Att}, \mathit{Val}, f_S)\) the corresponding indiscernibility relation will be denoted by \(\mathit{Ind}_{\mathcal{A}}\). This notational convention will also be used for other relations. As usual, we can generalise \(E\) to any reflexive relation \(P\) (e.g. \(\mathit{Sim}\) or \(\mathit{Incl}\)) and obtain generalised approximation operators.
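The Pawlak approximations of Definition 7 can be sketched as follows. The four-object table is a hypothetical toy example of ours (it does not come from the paper), and the function names are illustrative; the classes of \(E_{\mathcal{A}}\) are obtained by grouping objects with equal value tuples.

```python
def classes(U, f, attrs):
    """Equivalence classes of E_A: x E y iff f(x, A) == f(y, A) for all A in attrs."""
    blocks = {}
    for x in U:
        key = tuple(f[(x, A)] for A in attrs)
        blocks.setdefault(key, set()).add(x)
    return list(blocks.values())

def lower(U, f, attrs, X):
    """Low(X): union of the classes entirely contained in X."""
    return set().union(*[b for b in classes(U, f, attrs) if b <= X])

def upper(U, f, attrs, X):
    """Upp(X): union of the classes that intersect X."""
    return set().union(*[b for b in classes(U, f, attrs) if b & X])

# Hypothetical single-attribute table: x1 and x2 are indiscernible.
U = ["x1", "x2", "x3", "x4"]
f = {("x1", "a"): 1, ("x2", "a"): 1, ("x3", "a"): 2, ("x4", "a"): 3}
X = {"x1", "x3"}

print(lower(U, f, ["a"], X))
print(upper(U, f, ["a"], X))
```

Here x1 cannot be separated from x2, so x1 falls out of the lower approximation while x2 is pulled into the upper one; the difference between the two sets is the boundary region.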
Let \([x]_P = \{y \in U : (x, y) \in P\}\) and define:

\[ \mathit{Low}_P(X) = \{x \in U : [x]_P \subseteq X\}, \qquad \mathit{Upp}_P(X) = \{x \in U : [x]_P \cap X \neq \emptyset\}. \]

As one can note, it holds that

\[ \mathit{Upp}_{\mathit{Incl}^{-1}}(\{x\}) = R'(R'(\{x\})) = \{y \in U : x\ \mathit{Incl}\ y\}, \qquad \mathit{Upp}_{\mathit{Incl}^{-1}}(X) = \{y \in U : \exists x\ (x\ \mathit{Incl}\ y \ \&\ x \in X)\}, \]

where the inverse relation \(\mathit{Incl}^{-1}\) appears because \([y]_{\mathit{Incl}^{-1}} = \{x \in U : x\ \mathit{Incl}\ y\}\).

However, \(R'(R'(X))\) is much more complex in the setting of information systems than it might seem at first sight. It is worth noting that \(R'(X)\), for \(X \subseteq U\), is a set consisting of pairs \((A, v)\). So, in order to go back to the level of information systems we need a method of retrieving the original attributes from this set, so that it can act as a set \(\mathcal{A} \subseteq \mathit{Att}\). Let \(\mathit{Atex}\) (attribute extraction) be defined by

\[ \mathit{Atex}(H) = \{A : (A, v) \in H\} \quad \text{for } H \subseteq \mathit{Att} \times \mathit{Val}. \]

Obviously, this is a projection onto the first coordinate, and it makes sense only for a family of regular scales. Consider the following example. Let \(\mathit{Val} = \{1, 2, 3, 4, 5, 6\}\) be the set of values of some attribute \(A\). Assume that a scaling converts 1 to \(\{1, 2\}\), 2 to \(\{2, 3\}\), and all other values to \(\{3, 4, 5, 6\}\). So after scaling \(A\) has three value sets: \(\{1, 2\}\), \(\{2, 3\}\), \(\{3, 4, 5, 6\}\). Let \(f_S(x, A) = \{1, 2\}\) and \(f_S(y, A) = \{2, 3\}\). Now, let us start with \((U, \{A\}, \mathit{Val}, f)\), then go to the corresponding \(C_I\), and compute \(R'(\{x, y\})\), which is \(\{(A, 2)\}\). But this item does not make sense in our semantics \(S\): \(\{2\}\) is the meaning of no element of \(\mathit{Val}\). Thus, using set intersection \(\cap\) we may produce a new non-empty value set which is not present in \(I_S = (U, \mathit{Att}, \mathit{Val}, f_S)\). A scale is regular if that is not possible. Nominal and ordinal scales are regular.

Only for regular scales are we able to define the concepts of the formal context on the level of attributes of information systems. Let \(I_S = (U, \mathit{Att}, \mathit{Val}, f_S)\) be a multivalued information system obtained from an information system \(I = (U, \mathit{Att}, \mathit{Val}, f)\) by means of regular scales \(S\); then

\[ R'(R'(X)) = \{y \in U : \forall A \in \mathit{Atex}(R'(X))\ \exists x \in X\ (x\ \mathit{Incl}_A\ y)\}. \]
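The non-regular example above can be replayed mechanically. The sketch below assumes, as the example suggests, that the original values of \(x\) and \(y\) are 1 and 2; the dictionary `scale` and the helper names are ours and purely illustrative.

```python
# The scaling from the example: 1 -> {1, 2}, 2 -> {2, 3}, others -> {3, 4, 5, 6}.
rest = {3, 4, 5, 6}
scale = {1: {1, 2}, 2: {2, 3}, 3: rest, 4: rest, 5: rest, 6: rest}

f = {("x", "A"): 1, ("y", "A"): 2}            # original single values (assumed)
fs = {key: scale[v] for key, v in f.items()}  # scaled value sets

def intent(X):
    """R'(X) in the induced context: pairs (A, v) shared by all objects in X."""
    return {("A", v) for v in range(1, 7) if all(v in fs[(x, "A")] for x in X)}

def atex(H):
    """Attribute extraction: projection of pairs (A, v) onto the first coordinate."""
    return {A for (A, _v) in H}

common = intent({"x", "y"})
shared_values = {v for (_A, v) in common}

print(common)                            # the single pair (A, 2)
print(atex(common))                      # the attribute A is still retrieved
print(shared_values in scale.values())   # is {2} the meaning of some value?
```

The intersection produces the value set {2}, which no original value is interpreted as; this is exactly what makes the scale non-regular, even though `atex` still returns the attribute name.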
If the scale is not regular, then only the following inclusion holds:

\[ R''(X) = \{y \in U : \forall A \in \mathit{Atex}(R'(X))\ \exists x \in X\ (x\ \mathit{Incl}_A\ y)\} \subseteq R'(R'(X)). \]

Therefore, in such a case we need a new name, \(R''\), for this operator.

Let us now consider a decision table \(I_G = (U, \mathit{Att}, \mathit{Val}, G, f)\), that is, an information system \(I = (U, \mathit{Att}, \mathit{Val}, f)\) equipped with a decision (goal) attribute \(G \notin \mathit{Att}\), with \(f\) defined on \(\mathit{Att} \cup \{G\}\). The semantics \(S\) for \(I_G\) now needs to include a scale \(R_G = (\mathit{Val}_G, \mathit{Val}_G, R_G)\) for the decision attribute. The main goal is to approximate a given pair \((G, [v]_{R_G})\), where \(v \in \mathit{Val}_G\) is a specific distinguished value. More precisely, we want to approximate the set \(X = \{x \in U : (v, f(x, G)) \in R_G\}\). Let us take two scaling methods for the decision attribute subjective quality of life:
– nominal scale \(\mathit{Nom}\): \(R_G = \{(i, j) : i, j \in \{1, 2, 3\} \ \&\ i = j\}\);
– ordinal scale \(\mathit{Ord}\): \(R_G = \{(i, j) : i, j \in \{1, 2, 3\} \ \&\ j \leq i\}\).

The nominal scale \(\mathit{Nom}_G\) interprets 1 as low quality of life, 2 as average quality of life, and 3 as high quality of life. As expected, the ordinal scale \(\mathit{Ord}_G\) interprets 1, 2, and 3 as: at least low quality of life, at least average quality of life, and at least high quality of life, respectively. Now, let us take the value 3 as the distinguished value of the decision attribute. Then under \(\mathit{Nom}_G\) we are going to approximate the set {Bob, Steven}, whereas under \(\mathit{Ord}_G\) the set to be approximated is {Agnes, Bob, Steven}.

The dominance-based rough set approach (DRS) [7, 8] is actually a kind of rough set theory rendered according to the above ideas and the ordinal scaling method. An information system is equipped with a dominance relation \(\leq_D\), that is, we consider the system \(I = (U, \mathit{Att}, \mathit{Val}, f, \leq_D)\) (see Section 2). This system induces a multivalued information system \(I_S = (U, \mathit{Att}, \mathit{Val}, f_S)\), where \(S\) consists of ordinal scales \((\mathit{Val}_A, \mathit{Val}_A, R_A)\) for every \(A \in \mathit{Att}\).
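The effect of the two decision scales on the set to be approximated can be checked directly from the decision column of Fig. 7. The snippet is a small sketch that follows the formula \(X = \{x \in U : (v, f(x, G)) \in R_G\}\) literally; the function name `target` is ours.

```python
# Decision attribute "sub. quality of life" from Fig. 7.
quality = {"Steven": 3, "Bob": 3, "Agnes": 1}
levels = [1, 2, 3]

# The two scales R_G for the decision attribute, as sets of pairs (i, j).
Nom = {(i, j) for i in levels for j in levels if i == j}
Ord = {(i, j) for i in levels for j in levels if j <= i}

def target(v, scale):
    """X = {x in U : (v, f(x, G)) in R_G} for the distinguished value v."""
    return {x for x, q in quality.items() if (v, q) in scale}

print(target(3, Nom))
print(target(3, Ord))
```

With the distinguished value 3, the nominal scale selects Bob and Steven only, while the ordinal scale selects all three people, in agreement with the sets named in the text.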
Before we recall the definitions of the approximation operators, we need a few auxiliary concepts:
– \(D^{+}(x) = \{y \in U : x \leq_D y\}\) (the set of objects dominating \(x\), or better than \(x\));
– \(D^{-}(x) = \{y \in U : y \leq_D x\}\) (the set of objects dominated by \(x\), or worse than \(x\));
– for the decision attribute \(G\) with \(V_G = T\): \(\mathit{Cl}_t = \{x \in U : f(x, G) = t\}\),

\[ \mathit{Cl}_t^{\leq} = \bigcup_{s \leq_G t} \mathit{Cl}_s \quad \text{and} \quad \mathit{Cl}_t^{\geq} = \bigcup_{t \leq_G s} \mathit{Cl}_s, \]

where \(t, s \in T\). It is additionally assumed that

\[ \mathit{Cl}_s \cap \mathit{Cl}_t = \emptyset \ \text{ for } s \neq t \quad \text{and} \quad \bigcup_{s \in T} \mathit{Cl}_s = U. \]

Classification patterns to be discovered are functions representing the granules \(\mathit{Cl}_t^{\leq}\) and \(\mathit{Cl}_t^{\geq}\) by means of the granules \(D^{+}(x)\) and \(D^{-}(x)\). It is worth emphasising that, due to the preference order, the sets to be approximated are not the particular classes \(\mathit{Cl}_t\) (for some \(t \in V_G\)), but their upward and downward unions.

As said in the previous section, DRS may be represented in terms of ordinal scaling. In what follows we would like to make this scaling explicit in DRS and use the relations \(\mathit{Ind}\) and \(\mathit{Incl}\) (from multivalued information systems) rather than the dominance relation \(\leq_D\). Let us consider a multivalued information system \(I_S = (U, \mathit{Att}, \mathit{Val}, f_S)\) obtained from \(I = (U, \mathit{Att}, \mathit{Val}, f, \leq_D)\) by means of a scaling set \(S\). Please note that, due to the ordinal scaling of all attributes of \(I\), it holds that:

\[
\begin{aligned}
D^{+}(x) &= \{y \in U : x\ \mathit{Incl}\ y\} = R'(R'(\{x\})), \\
D^{-}(x) &= \{y \in U : x\ \mathit{Incl}^{-1}\ y\} = \{y \in U : y\ \mathit{Incl}\ x\}, \\
\mathit{Cl}_t^{\geq} &= \{y \in U : \exists x\ (x\ \mathit{Incl}_{\{G\}}\ y \ \&\ x \in \mathit{Cl}_t)\}, \\
\mathit{Cl}_t^{\leq} &= \{y \in U : \exists x\ (x\ \mathit{Incl}_{\{G\}}^{-1}\ y \ \&\ x \in \mathit{Cl}_t)\}.
\end{aligned}
\]

The specific interpretation of ordinal scaling in DRS makes a new type of inconsistency in data tables possible:
(a) an object \(x\) belongs to \(\mathit{Cl}_t^{\geq}\) (that is, it belongs to \(\mathit{Cl}_t\) or to a class better than \(\mathit{Cl}_t\)), but it is dominated by some object \(y \notin \mathit{Cl}_t^{\geq}\) (it is dominated by some object from a worse class),
(b) an object \(x\) belongs to the class \(\mathit{Cl}_t^{\leq}\) (that is, it belongs to \(\mathit{Cl}_t\) or to a class worse than \(\mathit{Cl}_t\)), but it dominates some object \(y \notin \mathit{Cl}_t^{\leq}\) (it dominates some object from a better class).
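These objects are regarded as borderline cases: they might or might not belong to a given class. In consequence, in DRS we consider the following lower (underlined) and upper (overlined) approximations:

\[
\begin{aligned}
\underline{\mathit{Cl}}_t^{\leq} &= \{x \in U : D^{-}(x) \subseteq \mathit{Cl}_t^{\leq}\}, &
\overline{\mathit{Cl}}_t^{\leq} &= \{x \in U : D^{+}(x) \cap \mathit{Cl}_t^{\leq} \neq \emptyset\}, \\
\underline{\mathit{Cl}}_t^{\geq} &= \{x \in U : D^{+}(x) \subseteq \mathit{Cl}_t^{\geq}\}, &
\overline{\mathit{Cl}}_t^{\geq} &= \{x \in U : D^{-}(x) \cap \mathit{Cl}_t^{\geq} \neq \emptyset\}.
\end{aligned}
\]

As earlier, our aim is to express these approximation operators by means of \(\mathit{Incl}\). Thus, let an information system \(I = (U, \mathit{Att}, \mathit{Val}, f, \leq_D)\) be given, together with its corresponding multivalued information system \((U, \mathit{Att}, \mathit{Val}, f_S)\), obtained by means of the ordinal scaling method \(\mathit{Ord}\). Then

\[
\begin{aligned}
\underline{\mathit{Cl}}_t^{\leq} &= \bigcup_{D^{-}(x) \subseteq \mathit{Cl}_t^{\leq}} D^{-}(x) = \{y \in U : \forall x \in U\ (x\ \mathit{Incl}\ y \Rightarrow x \in \mathit{Cl}_t^{\leq})\}, \\
\overline{\mathit{Cl}}_t^{\leq} &= \bigcup_{x \in \mathit{Cl}_t^{\leq}} D^{-}(x) = \{y \in U : \exists x \in U\ (x\ \mathit{Incl}^{-1}\ y \ \&\ x \in \mathit{Cl}_t^{\leq})\}, \\
\underline{\mathit{Cl}}_t^{\geq} &= \bigcup_{D^{+}(x) \subseteq \mathit{Cl}_t^{\geq}} D^{+}(x) = \{y \in U : \forall x \in U\ (x\ \mathit{Incl}^{-1}\ y \Rightarrow x \in \mathit{Cl}_t^{\geq})\}, \\
\overline{\mathit{Cl}}_t^{\geq} &= \bigcup_{x \in \mathit{Cl}_t^{\geq}} D^{+}(x) = \{y \in U : \exists x \in U\ (x\ \mathit{Incl}\ y \ \&\ x \in \mathit{Cl}_t^{\geq})\}.
\end{aligned}
\]

So, we are able to transfer FCA, RST, and DRS, along with an explicitly given semantics \(S\) (a family of scales \(R_A\) for every attribute \(A \in \mathit{Att}\)), into the framework of multivalued information systems. The connections between the operators discussed above are as follows:

\[
\begin{aligned}
\underline{\mathit{Cl}}_t^{\leq} &= \mathit{Low}_{\mathit{Incl}^{-1}}(\mathit{Cl}_t^{\leq}), &
\overline{\mathit{Cl}}_t^{\leq} &= \mathit{Upp}_{\mathit{Incl}}(\mathit{Cl}_t^{\leq}), \\
\underline{\mathit{Cl}}_t^{\geq} &= \mathit{Low}_{\mathit{Incl}}(\mathit{Cl}_t^{\geq}), &
\overline{\mathit{Cl}}_t^{\geq} &= \mathit{Upp}_{\mathit{Incl}^{-1}}(\mathit{Cl}_t^{\geq}) \subseteq R'(R'(\mathit{Cl}_t^{\geq})).
\end{aligned}
\]

Two important comments are needed. First, as long as we regard all the above operators as approximation operators, the RST and DRS based results are better than those coming from FCA. However, when we change the context, the FCA operator may be preferable. Secondly, in DRS, \(x\ \mathit{Incl}\ y\) is read as: \(y\) is better than \(x\). However, other readings are possible, e.g., we have more pieces of information about \(y\) than about \(x\).

Let us consider an example which brings new meanings to the relations and operators we have discussed so far. Let \(w\) be a serious disease against which we have an effective antibiotic.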
Let us assume that we have a test checking whether someone is ill, but the antibiotic should be given to a patient before the disease develops. So, our aim is to select the people who will be given the medicine. One very expensive solution is to give the medicine to all people (that is, to all elements of the universe \(U\)). At the other extreme, we could give the medicine only to people who test positive for \(w\). Of course, neither solution is good, and we need to find another method of selection. We are going to employ John Stuart Mill's inductive reasoning [2] here, which was designed to solve problems of this type, but under complete knowledge. We shall focus our attention on the most basic canon, namely the direct method of agreement (Fig. 8). We are not going to use the pure form of this canon, but rather its rough set based rendering.

A B C D occur together with w, v
A E F G occur together with w, z
Therefore A is the cause of w

Fig. 8. Direct method of agreement: If two or more instances of the phenomenon under investigation have only one circumstance in common, the circumstance in which alone all the instances agree, is the cause (or effect) of the given phenomenon [2].

Let an information system \(I = (U, \mathit{Att}, \mathit{Val}, f)\) representing our (medical) knowledge about people be given. At first, we employ a semantics \(S\) consisting only of ordinal scales such that for each attribute \(A\) of \(I_S = (U, \mathit{Att}, \mathit{Val}, f_S)\), the relation \(x\ \mathit{Incl}_A\ y\) means that \(y\) is in a worse medical condition than \(x\) with respect to \(A\) and \(w\). In our setting, Mill's inductive reasoning is modelled in the following way. The concept \(w\) is actually the set \(\{x \in U : f(x, w) = \top\}\), where \(\mathit{Val}_w = \{\top, \bot\}\) (true and false, respectively). The scale \(R_w\) is given by \(\{(\top, \top), (\top, \bot), (\bot, \bot)\}\), so if \(f(x, w) = \top\), then \(f_S(x, w) = \{\top, \bot\}\), and if \(f(x, w) = \bot\), then \(f_S(x, w) = \{\bot\}\). Thus, as required, if \(x\ \mathit{Incl}_w\ y\), then \(y\) is in the same or a worse medical condition than \(x\), so if \(x\) is ill, then \(y\) must be ill as well.
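The scale \(R_w\) and the induced inclusion on test results can be sketched as follows. The string constants standing for \(\top\) and \(\bot\) and the helper names `fs_w` and `incl_w` are hypothetical choices made only for this illustration.

```python
TOP, BOT = "T", "F"  # stand-ins for the test results "true" and "false"

# The scale R_w from the text, as a set of pairs (v, vi).
R_w = {(TOP, TOP), (TOP, BOT), (BOT, BOT)}

def fs_w(v):
    """Scaled value set f_S(x, w) for a patient whose test result is v."""
    return {vi for (u, vi) in R_w if u == v}

def incl_w(vx, vy):
    """x Incl_w y, expressed on the test results vx = f(x, w) and vy = f(y, w)."""
    return fs_w(vx) <= fs_w(vy)

print(fs_w(TOP), fs_w(BOT))
print(incl_w(BOT, TOP), incl_w(TOP, BOT))
```

A healthy patient is included in an ill one but not the other way round, so an ill \(x\) with \(x\ \mathit{Incl}_w\ y\) forces \(y\) to be ill, as the text requires.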
Let \(\mathit{Cl}_\top\) be the set of positive examples of \(w\) (direct method of agreement). The term "in common" (Fig. 8) is beyond the expressive power of pure RST and DRS. However, in our case we can retrieve the common attributes by means of \(\mathit{Atex}(R'(\mathit{Cl}_\top))\). Now we can compute possible solutions to our problem:

\[ \mathit{Cl}_\top \subseteq \mathit{Upp}_{\mathit{Ind}}(\mathit{Cl}_\top) \subseteq \mathit{Upp}_{\mathit{Incl}^{-1}}(\mathit{Cl}_\top) = \overline{\mathit{Cl}}_\top^{\geq} \subseteq R'(R'(\mathit{Cl}_\top^{\geq})). \]

As said, \(\mathit{Cl}_\top\) is an extreme solution. Another one may be given by \(\mathit{Upp}_{\mathit{Ind}}(\mathit{Cl}_\top)\) (that is, we give the antibiotic to all people with exactly the same medical description, in terms of conditional attributes, as some patient who tested positive for \(w\)). This seems reasonable; however, the next solution, \(\mathit{Upp}_{\mathit{Incl}^{-1}}(\mathit{Cl}_\top) = \overline{\mathit{Cl}}_\top^{\geq}\), is much better: we give the medicine to all people in the same or a worse medical condition than some patient who tested positive for \(w\). Better still, we may apply the direct method of agreement and give the medicine to all people in \(R'(R'(\mathit{Cl}_\top^{\geq}))\), that is, to those having medically worse results only for the attributes which seem to be relevant to \(w\). It is worth emphasising that, regarded as an approximation of \(\mathit{Cl}_\top^{\geq}\), the set \(R'(R'(\mathit{Cl}_\top^{\geq}))\) is the worst candidate, but in this very setting it is the best solution to the problem at issue.

Consider a non-regular scale now, and assume that all people who tested positive for \(w\) have problems with blood pressure \(A\). So the scaling of \(A\) shows how unstable the pressure is. Some patients may have the value {normal, high}, some {low, high}, and some {low, normal, high}, but none of them has {normal}. This time \(R'(R'(\mathit{Cl}_\top^{\geq}))\) is a very bad solution, because it may include people with normal blood pressure. However, we can still use the direct method of agreement in a modified version, and take \(R''(\mathit{Cl}_\top^{\geq})\) as the solution.
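The chain of candidate solutions can be traced on a toy example. Everything below is hypothetical: the six patients, the two conditional attributes with severity levels 0 to 2, and all function names are our own illustration, with ordinal scaling assumed for both conditional attributes.

```python
# Hypothetical medical table: severity levels 0..2, w is the test result.
patients = {
    "p1": {"fever": 2, "cough": 1, "w": True},
    "p2": {"fever": 1, "cough": 2, "w": True},
    "p3": {"fever": 2, "cough": 1, "w": False},
    "p4": {"fever": 2, "cough": 2, "w": False},
    "p5": {"fever": 0, "cough": 0, "w": False},
    "p6": {"fever": 1, "cough": 1, "w": False},
}
cond = ["fever", "cough"]  # conditional attributes

def fs(x, A):
    """Ordinal scaling: a level stands for itself and every lower level."""
    return frozenset(range(patients[x][A] + 1))

def ind(x, y):
    return all(fs(x, A) == fs(y, A) for A in cond)

def incl(x, y):
    return all(fs(x, A) <= fs(y, A) for A in cond)

def scaled(x):
    """Attributes (A, v) that x has in the induced formal context."""
    return {(A, v) for A in cond for v in fs(x, A)}

positive = {x for x in patients if patients[x]["w"]}
# Same description as some positive patient.
upp_ind = {y for y in patients if any(ind(x, y) for x in positive)}
# Same or worse condition than some positive patient.
upp_incl_inv = {y for y in patients if any(incl(x, y) for x in positive)}
# R'(R'(positive)): everyone sharing what all positive patients have in common.
common = set.intersection(*(scaled(x) for x in positive))
closure = {y for y in patients if common <= scaled(y)}

for name, s in [("positive", positive), ("upp_ind", upp_ind),
                ("upp_incl_inv", upp_incl_inv), ("closure", closure)]:
    print(name, sorted(s))
```

In this toy table each step strictly enlarges the selection: p3 enters through indiscernibility, p4 through inclusion, and p6 only through the closure, because p6 shares what the positive patients have in common without being at least as ill as any single one of them.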
4 Conclusions

In the paper we have investigated the implicit semantics used in some leading theories of data analysis: rough set theory (RST) [4–6], dominance-based rough set theory (DRS) [7, 8], and formal concept analysis (FCA) [1, 9]. We have presented all these theories within the unifying framework of multivalued information systems [3, 4], enriched with the scaling methods from FCA. We have also discussed the relations between the operators coming from these theories, and presented their different interpretations in the context of John Stuart Mill's inductive reasoning [2].

References

1. Ganter, B., Wille, R.: Formal Concept Analysis. Mathematical Foundations. Springer-Verlag (1999)
2. Mill, J.S.: A System of Logic. Vol. 1. London (1843)
3. Orłowska, E., Pawlak, Z.: Representation of nondeterministic information. Theoretical Computer Science 29, 27–39 (1984)
4. Pawlak, Z.: Systemy informacyjne. Podstawy teoretyczne. WNT, Warszawa (1983)
5. Pawlak, Z.: Rough Sets: Theoretical Aspects of Reasoning about Data. Kluwer Academic Publishers (1991)
6. Pawlak, Z.: Wiedza z perspektywy zbiorów przybliżonych. Institute of Computer Science Report 23 (1992)
7. Słowiński, R., Greco, S., Matarazzo, B.: Dominance-based rough set approach to reasoning about ordinal data. Lecture Notes in Artificial Intelligence 4585, 5–11 (2007)
8. Słowiński, R., Greco, S., Matarazzo, B.: Rough sets in decision making. In: R.A. Meyers (Ed.), Encyclopedia of Complexity and Systems Science, Springer, NY, 7753–7786 (2009)
9. Wille, R.: Concept lattices and conceptual knowledge systems. Computers & Mathematics with Applications 23, 493–515 (1992)