Exploration of Knowledge Bases Inspired by Rough Set Theory

Agnieszka Nowak-Brzezinska and Alicja Wakulicz-Deja
Department of Computer Science, Institute of Computer Science, Silesian University, Poland
http://ii.us.edu.pl

Abstract. In this paper, we discuss some issues related to exploration of knowledge bases inspired by the rough set theory, which emerged 30 years ago and is nowadays a rapidly developing branch of artificial intelligence. The partition of rules approach allows us to divide a large set of rules into smaller subsets that are easier to manage. Optimisation relies on reducing the number of rules searched in each run of inference. The paper presents the definition of the knowledge base model based on partition of rules and a modification of the forward inference algorithm for groups of rules generated by the partition strategy. It also contains a simple case study with an example of the partition of rules and the inference process for a simple knowledge base, as well as experimental results.

Key words: knowledge base, inference process, goal driven, data driven, decision support system

1 Introduction

For the last twenty years, there has been an enormous interest in integrating database and knowledge-based system technologies to create an infrastructure for modern advanced applications. The result is a knowledge base (KB) system, which consists of a database system extended with some kind of knowledge, usually expressed in the form of rules(1): logical statements (implications) of the type "if condition_1 & ... & condition_n then conclusion". The knowledge of experts, expressed in such a natural way, makes rules easily understood by people not involved in building the expert system. KBs are constantly increasing in volume, thus the knowledge stored as a set of rules is getting progressively more complex, and searching such sets is an important data-mining task. When rules are not organized into any structure, the system is inefficient. There is a growing research interest in methods that manage large sets of rules. Most of them use the clustering approach as well as joining and reducing rules [1, 2, 10, 11]. This paper presents a different idea, in which rules are divided into a number of groups based on similar conditions and/or conclusions. The process is called partition of rules(2) and is conducted by so-called partition strategies. In the authors' opinion, partition of rules leads to the improvement of the inference process efficiency [5]. If we assume that the user wants to get an answer from the system as soon as possible, the proposed modification of the KB structure reduces the time of inference. Instead of searching within the whole set of rules (as in the case of traditional inference processes), only representatives of groups are compared with the set of facts and/or the hypothesis to be proven. The most relevant group of rules is selected and exhaustive searching is done only within that group. We show how exploration of complex KBs can be carried out based on the partition of rules.

(1) Rules have been extensively used in knowledge representation and reasoning. They are very space efficient: only a relatively small number of facts needs to be stored in the KB and the rest can be derived by the inference rules.
(2) The idea is new, but it is based on the authors' previous research, where the ideas of clustering rules and of creating so-called decision units were introduced [9, 5].

Our motivation is twofold.
First, we are concerned with how to reason effectively over KBs with a large number of rules. Second, we are concerned with improving the efficiency of reasoning over a set of rules by partitioning the set with respect to some strategy. To this end, we provide algorithms for partitioning and for reasoning with such a partition of rules. The idea of partition of rules is implemented in the kbExplorer system(3), which is not limited to the inference optimization. The practical goal of the project is to create an expert system shell that allows for flexible switching between different inference methods based on the knowledge engineer's preferences(4).

The rest of the paper is organized as follows. In Section 2, the research background is introduced, together with a very general description of the proposed concept. Section 3 presents the basic definitions of the partition of rules and partition strategies, as well as the specification of a partition's representative. It also explains the inspiration drawn from the rough set theory. Section 4 describes the forward inference algorithm for the partition of rules, which requires a modification of the classical algorithm. The results of experiments are given in Section 5, together with a simple case study. Finally, Section 6 concludes the paper.

(3) http://kbexplorer.ii.us.edu.pl/
(4) The user has the possibility of creating KBs using a special creator or by importing a KB from a given data source. The format of the KBs enables working with a rule set generated automatically based on the rough set theory as well as with rules given a priori by the domain expert. The KB can have one of the following file formats: XML, RSES, TXT. It is possible to define attributes of any type: nominal, discrete or continuous. There are no limits on the number of rules, attributes, facts or the length of a rule.

2 Related Work and the Proposed Idea

There is a number of studies related to knowledge matching and rules modularization. Interesting and effective approaches to the problem of managing a large set of rules and the necessity of its partitioning can be found in [1-3, 10, 11], where known systems like CHIRA, XTT2 or D3RJ are described. In [2] the authors present CHIRA, an algorithm performing decision rules aggregation. Rules are joined if their conditional parts are built from the same conditional attributes or if the conditional attributes set of one rule is a subset of the conditional attributes set of the second one. Another joining algorithm, which operates on rules determined by the dominance-based rough set model, was proposed in [3]. Rules that lie close to each other are joined if their join causes no deterioration of accuracy of the obtained rule. XTT2 provides a modularized rule base, where rules working together are placed in one context corresponding to a single decision table [10]. D3RJ is a method which produces more general rules, where each descriptor can cover a subset of values [11]. In most of these tools, a global set of rules is partitioned by the system designer into several parts in an arbitrary way.

The idea of partition of rules, presented in this paper, is based on the authors' previous research concerning rules clusters and decision units [8, 9, 5]. The proposed algorithm is based on the same idea as the classical inference algorithm, but it uses the groups' representatives (playing the role of general rules) instead of each single rule. If a representative matches the searched data, it means that its group probably contains a rule or rules that we are looking for.
In this context it is necessary to modify the pattern matching algorithm in one point. Instead of finding rules that exactly match a given set of facts, we compare the facts with the rules' representatives and the most similar group of rules is selected. Further searching is performed only within this group. The advantage is as follows: having n rules divided into k groups, only the k groups' representatives are searched, while in the classical version of the inference process all n rules have to be analysed.

3 Partition of Rules Idea

The proposed concept assumes division of a KB into coherent subgroups of rules. Therefore, this section presents the basic concepts and notations of KB partitioning. We assume that a KB is a single, unordered set R = {r1, . . . , ri, . . . , rn} of n rules. Each rule ri ∈ R is stored as a Horn clause defined as: ri: p1 ∧ p2 ∧ . . . ∧ pm → c, where ps is the s-th literal, i.e. a pair (a, v) consisting of an attribute and its value (s = 1, 2, . . . , m). An attribute a ∈ A may appear in the conclusion of rule ri as well as in its premises. For every KB with n rules, the number of possible subsets is 2^n. Any arbitrarily created collection of subsets of rules drawn from 2^R is called a partition of rules (PR) and it can be generated by one of the many possible partition strategies (PS).

3.1 Definition of Partition of Rules

The partition of rules PR = {R1, R2, . . . , Rk}, where k is the number of groups of rules in partition PR, Rj is the j-th group of rules for j = 1, . . . , k and PR ⊆ 2^R, is generated by one of many possible partition strategies. We need a function mc: R × PR → [0, 1] to decide whether rule ri belongs to group Rj or not. It is defined individually for every PS. If there is no doubt that rule ri belongs to group Rj, then mc(ri, Rj) = 1; if mc(ri, Rj) = 0, rule ri definitely does not belong to group Rj. Values between 0 and 1 denote partial membership. In the mathematical sense, a partition of rules is a collection of subsets $\{R_j\}_{j \in J}$ of R such that $R = \bigcup_{j \in J} R_j$ and if $j, l \in J$ and $j \neq l$ then $R_j \cap R_l = \emptyset$. The subsets of rules are non-empty and every rule is included in one and only one of the subsets(5).

(5) However, we also consider the case in which a given rule belongs to more than one group. In our future work we are going to extend the definition of the partition in this context.

3.2 Rough Set Theory and Indiscernibility Relation as Inspiration for the Partition of Rules Idea

One of the tools for exploration of large rule sets is analysing the similarities between rules, which leads naturally to their grouping and integration. The similarities of objects are examined by the indiscernibility relations used in the rough set theory (RST) [12, 13]. If R is the set of all rules in the KB, and ri, rf are arbitrary rules, they are said to be indiscernible by PR if and only if ri and rf have the same value on all elements in PR: $IND(PR) = \{(r_i, r_f) \in R \times R : \forall_{a \in PR}\ a(r_i) = a(r_f)\}$. As such, it induces a partition of R generated by PR, denoted PR*(6).

(6) PR does not necessarily contain a subset of attributes. It may also include a criterion for creating groups of rules, i.e. groups of rules with at least m premises, or groups of rules with a premise containing a specific attribute or a particular pair (attribute, value).
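Before turning to the example, the following minimal Python sketch (our own illustration, not code from the paper; all names are assumptions) shows one possible encoding of a rule as a Horn clause, i.e. a set of premise literals and a conclusion literal, together with a crisp membership function mc in the spirit of the definition above.

from typing import FrozenSet, Tuple, Set

Literal = Tuple[str, object]                # an (attribute, value) pair
Rule = Tuple[FrozenSet[Literal], Literal]   # (premises, conclusion)

def make_rule(premises, conclusion) -> Rule:
    """Build a rule p_1 and ... and p_m -> c as (frozenset of literals, literal)."""
    return (frozenset(premises), conclusion)

def mc(rule: Rule, group: Set[Rule]) -> float:
    """Crisp membership: 1.0 if the rule is in the group, 0.0 otherwise.
    A concrete partition strategy may return intermediate values instead."""
    return 1.0 if rule in group else 0.0

# Example: r1 : (a,1) -> (b,1)
r1 = make_rule([("a", 1)], ("b", 1))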
Let us assume that the KB contains the following rules:

r1: (a, 1) → (b, 1)
r2: (a, 1) → (c, 1)
r3: (d, 1) → (e, 1)
r4: (d, 1) → (f, 1)
r5: (b, 1) → (g, 1)
r6: (c, 1) → (g, 1)
r7: (e, 1) → (h, 1)
r8: (f, 1) → (h, 1)
r9: (g, 1) → (i, 1)
r10: (i, 1) → (k, 1)
r11: (i, 1) → (k, 1)
r12: (h, 1) → (j, 1)
r13: (j, 1) → (l, 1)
r14: (a, 1) ∧ (j, 1) → (b, 1)
r15: (a, 1) ∧ (j, 1) ∧ (c, 2) → (b, 1)
r16: (a, 1) ∧ (j, 1) ∧ (c, 2) ∧ (e, 2) → (b, 1)
r17: (a, 1) ∧ (j, 1) ∧ (c, 2) ∧ (e, 2) → (b, 1)

For this KB it is possible to create different partitions according to different criteria. The equivalence relation on PR defined by the premises of rules in R yields:

{PR}* = {{r17, r16, r15, r14, r13, r2, r1}, {r5}, {r6}, {r7}, {r8}, {r9}, {r12}, {r4, r3}, {r11, r10}}

while the equivalence relation on PR defined by the conclusions of rules produces the partition:

{PR}* = {{r17, r16, r15, r14, r1}, {r2}, {r3}, {r4}, {r5, r6}, {r7, r8}, {r9}, {r11, r10}, {r12}, {r13}}

whereas the equivalence relation on PR for PR = (b, 1) in conclusions is as follows:

{PR}* = {{r17, r16, r15, r14, r1}, {r2, r3, r4, r5, r6, r7, r8, r9, r10, r11, r12, r13}}

This last example partition needs an additional comment. Usually, when the partition strategy divides rules into the groups that match a given condition (in this case the literal (b, 1)), the final partition is given by a so-called selection, in which the first part of the partition is the group that matches the given condition and the second group does not meet this condition(7).

(7) {r17, r16, r15, r14, r1} contains literal (b, 1) in the conclusion part, and {r2, r3, r4, r5, r6, r7, r8, r9, r10, r11, r12, r13} does not contain this literal.
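As an illustration (our own sketch, not code from the paper), the partition induced by indiscernibility of conclusions can be reproduced by grouping the example rules by their conclusion literal:

from collections import defaultdict

# Example KB from Section 3.2: each rule is (frozenset of premise literals, conclusion literal).
kb = {
    "r1":  (frozenset({("a", 1)}), ("b", 1)),
    "r2":  (frozenset({("a", 1)}), ("c", 1)),
    "r3":  (frozenset({("d", 1)}), ("e", 1)),
    "r4":  (frozenset({("d", 1)}), ("f", 1)),
    "r5":  (frozenset({("b", 1)}), ("g", 1)),
    "r6":  (frozenset({("c", 1)}), ("g", 1)),
    "r7":  (frozenset({("e", 1)}), ("h", 1)),
    "r8":  (frozenset({("f", 1)}), ("h", 1)),
    "r9":  (frozenset({("g", 1)}), ("i", 1)),
    "r10": (frozenset({("i", 1)}), ("k", 1)),
    "r11": (frozenset({("i", 1)}), ("k", 1)),
    "r12": (frozenset({("h", 1)}), ("j", 1)),
    "r13": (frozenset({("j", 1)}), ("l", 1)),
    "r14": (frozenset({("a", 1), ("j", 1)}), ("b", 1)),
    "r15": (frozenset({("a", 1), ("j", 1), ("c", 2)}), ("b", 1)),
    "r16": (frozenset({("a", 1), ("j", 1), ("c", 2), ("e", 2)}), ("b", 1)),
    "r17": (frozenset({("a", 1), ("j", 1), ("c", 2), ("e", 2)}), ("b", 1)),
}

def partition_by(kb, key):
    """Group rule names into equivalence classes of the given key function."""
    classes = defaultdict(set)
    for name, rule in kb.items():
        classes[key(rule)].add(name)
    return list(classes.values())

# Partition induced by indiscernibility of conclusions:
# e.g. {r1, r14, r15, r16, r17}, {r5, r6}, {r7, r8}, {r10, r11}, ...
by_conclusion = partition_by(kb, key=lambda rule: rule[1])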
3.3 Partition Strategies

Partition strategy is a general concept; a number of different approaches for creating groups of rules can be pointed out. The most general division of partition strategies distinguishes two types: simple and complex. Simple strategies(8) allocate every rule ri to the proper group Rj according to the value of the function mc(ri, Rj). An example is the strategy of finding a pair of rules most similar in a given context, e.g. the premises of rules. Complex strategies usually do not generate the final partition in a single step. A complex strategy is defined by a sequence of simple strategies, a combination of them, or an iteration of a single simple strategy. An example is the strategy of creating a similarity-based partition, which stems from the methods of cluster analysis used for rules clustering. It applies a simple strategy that finds the pair of most similar rules many times. The process terminates when the similarity is no longer at least T(9). In effect, we get k groups of rules R1, R2, . . . , Rl, . . . , Rk such that $\bigwedge_{r_i, r_f \in R_l} sim(r_i, r_f) \geq T$, i.e. each group Rl contains rules whose mutual similarity is at least equal to T. It is possible to obtain many different rules partitions in this way. In this strategy, rules that form the same group are said to have the same or similar premises. The strategy uses a similarity function sim: R × R → [0, 1], which can be defined in a variety of ways, e.g. based on the conditional parts of the rules as

$sim(r_i, r_f) = \frac{|cond(r_i) \cap cond(r_f)|}{|cond(r_i) \cup cond(r_f)|}$

where cond(ri) denotes the conditional part of rule ri and concl(ri) its conclusion, respectively. The value of sim(ri, rf) is equal to 1 if the rules are formed by the same literals and 0 when they do not have any common literals. The more similar the rules are, the closer the value of sim(ri, rf) is to 1.

(8) A simple strategy allows for partitioning the rules using an algorithm with time complexity not higher than O(nk), where n = |R| and k = |PR|. Simple strategies create the final partition PR by a single search of the rule set R according to the value of the mc(ri, Rj) function described above. For complex strategies the time complexity is higher than for any simple partition strategy.
(9) Let us assume that a threshold value 0 ≤ T ≤ 1 exists.

3.4 Similarity-Based Partition of Rules

The similarity-based partition strategy, described briefly above, allows for improving the efficiency of the inference process (see Alg01). In the first step of the algorithm (createSingletonGroups), a specific partition in which every rule forms a separate group is created: for every R ∈ PR we have |R| = 1, |∪{R : R ∈ PR}| = n, and Ri ∩ Rj = ∅ for i ≠ j. Then the algorithm iteratively finds a pair of the most similar rules (or groups) and joins them into one group. Similarity is determined using the function sim based on the similarity of the conditional parts of rules (or groups). This strategy is used to build the partition of the 17 rules presented in Section 3.2. At the beginning, each rule creates a single group: R1, . . . , R17. Next, the following groups are created: R18 = {R17, R16}, R19 = {R18, R15}, R20 = {R19, R14}, R21 = {R1, R2}, R22 = {R3, R4}, R23 = {R10, R11}, R24 = {R20, R13}, R25 = {R24, R21}. This algorithm is based on the classical AHC algorithm discussed in [7, 9, 5]. The partition of rules achieved in this way has a hierarchical structure, thus it is necessary to search the tree of groups of rules(10).

(10) The time complexity of this operation is O(log n), where n is the number of elements in the tree.

Alg01: Similarity-based partition
Require: R, sim, T;
Ensure: PR = {R1, R2, . . . , Rk};
procedure createPartitions(R, var PR, sim, T)
var Ri, Rj;
begin
  PR = createSingletonGroups(R);
  Ri = Rj = {};
  while findTwoMostSimilarGroups(sim, Ri, Rj, PR) > T do
    PR = rearrangeGroups(Ri, Rj, PR);
  end while
end procedure
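The following Python sketch mirrors the structure of Alg01 (it is our illustration, not the kbExplorer implementation; the group-level similarity, here the Jaccard measure of the unions of conditional parts, and the flat rather than hierarchical output are simplifying assumptions). It starts from singleton groups and repeatedly merges the most similar pair of groups as long as their similarity exceeds the threshold T.

from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity of two sets of literals; 0.0 for two empty sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def group_premises(group):
    """Union of the conditional parts of all rules in a group."""
    prem = set()
    for premises, _conclusion in group:
        prem |= premises
    return prem

def similarity_based_partition(rules, T):
    """Alg01-style partition: start from singletons, then merge the most similar
    pair of groups while their similarity is still greater than T."""
    partition = [[rule] for rule in rules]          # createSingletonGroups
    while len(partition) > 1:
        # findTwoMostSimilarGroups
        (i, j), best = max(
            ((pair, jaccard(group_premises(partition[pair[0]]),
                            group_premises(partition[pair[1]])))
             for pair in combinations(range(len(partition)), 2)),
            key=lambda item: item[1])
        if best <= T:                               # no pair is similar enough
            break
        merged = partition[i] + partition[j]        # rearrangeGroups
        partition = [g for k, g in enumerate(partition) if k not in (i, j)]
        partition.append(merged)
    return partition

Applied to a rule set such as the 17-rule example (e.g. similarity_based_partition(list(kb.values()), T=0.3)), the sketch yields groups of rules with overlapping premises; the exact merge order may differ from the R18-R25 sequence above, since that depends on the particular group similarity used.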
3.5 Partition's Representatives - Profiles

The efficiency of the partition of rules and its usage in the inference process depend on the quality of the created partition. This quality is based both on the cohesion and separation of the created groups of rules and on the representatives of the groups (profiles). There are many different approaches to determining representatives for groups of rules and many possible forms of Profile(R)(11). We propose an approach inspired by the RST notions of lower and upper approximation(12). It allows for defining two representatives for every group Rj: the lower profile $\underline{Profile}(R_j)$ and the upper profile $\overline{Profile}(R_j)$. The lower approximation of the profile of group Rj is the set of literals from the premises of rules which certainly appear in every rule of Rj, while the upper approximation is the union of the literals that have a non-empty intersection with the subset of interest. As mentioned above, a rule ri: p1 ∧ p2 ∧ . . . ∧ pm → c has a conjunction of m literals in its conditional part, and each literal pc ∈ cond(ri) is a premise of rule ri. Thus,

$\underline{Profile}(R_j) = \bigcap_{r_i \in R_j} cond(r_i)$

denotes the set of all literals pc that belong to the intersection of the conditional parts of all rules ri in group Rj, whereas

$\overline{Profile}(R_j) = \bigcup_{r_i \in R_j} \{ cond(r_i) : cond(r_i) \cap cond(r_f) \neq \emptyset \text{ for some } r_f \in R_j,\ f \neq i \}$

contains all premises of the rules creating group Rj which have a non-empty intersection with the premises of other rules from the same group. For example, if in the KB presented above group R22 contains the two rules r3 and r4 (groups R3, R4), then $\underline{Profile}(R_{22}) = \{(d, 1)\}$ whereas $\overline{Profile}(R_{22}) = \{(d, 1), (b, 2)\}$, and the boundary is $BN(Profile(R_{22})) = \overline{Profile}(R_{22}) - \underline{Profile}(R_{22}) = \{(b, 2)\}$. It means that literal (d, 1) appears in every rule in this group, while (b, 2) makes rule r3 distinct from r4. In other words, (d, 1) certainly describes every rule in group R22, while (b, 2) only possibly describes this group (there is at least one rule which contains this literal in the conditional part). Lower and upper approximations of the profile of a given group are very convenient in further searching. When we want to find a rule which certainly contains some literal, we have to match it against the lower approximation of the profile of every group. If we look for a rule that possibly contains some literal, we only have to match it against the upper approximations of the profiles of the groups.

(11) It may be (i) the set of all premises of rules that form group R, (ii) the conjunction of selected premises of all rules included in a given group R, or (iii) the conjunction of the premises of all rules included in a given group R.
(12) If X denotes a subset of the universe U (X ⊂ U), then the lower approximation of X with respect to B (B ⊆ A), denoted $\underline{B}X$, is defined as the union of all elementary sets which are contained in X: $\underline{B}X = \{x_i \in U \mid [x_i]_{IND(B)} \subset X\}$. The upper approximation $\overline{B}X$ is the union of the elementary sets which have a non-empty intersection with X: $\overline{B}X = \{x_i \in U \mid [x_i]_{IND(B)} \cap X \neq \emptyset\}$.
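A minimal Python sketch of the two profiles defined above follows (our illustration; the two-rule group at the end is hypothetical and merely analogous to the R22 example in the text).

def lower_profile(group):
    """Literals present in the conditional part of every rule in the group."""
    premises = [set(p) for p, _c in group]
    return set.intersection(*premises) if premises else set()

def upper_profile(group):
    """Union of the conditional parts that share at least one literal
    with the conditional part of some other rule in the group."""
    profile = set()
    for i, (p_i, _) in enumerate(group):
        for f, (p_f, _) in enumerate(group):
            if f != i and p_i & p_f:
                profile |= p_i
                break
    return profile

def boundary(group):
    """Literals that only possibly describe the group."""
    return upper_profile(group) - lower_profile(group)

# Hypothetical two-rule group analogous to R22 in the text:
# r3': (d,1) and (b,2) -> (e,1),  r4': (d,1) -> (f,1)
g = [(frozenset({("d", 1), ("b", 2)}), ("e", 1)),
     (frozenset({("d", 1)}), ("f", 1))]
# lower_profile(g) == {("d", 1)}; upper_profile(g) == {("d", 1), ("b", 2)}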
4 Inference Algorithm

In the authors' opinion, both the proposed idea of the partition of rules and high quality representatives of the groups lead to an improvement of the inference process efficiency. There are two inference algorithms: forward and backward(13). Due to the limited size of the paper, only the forward inference algorithm is discussed. The goal of the forward inference process is to find the rules whose premise parts are covered by the given data. The shorter the time needed to perform such a process, the better. During the inference process, the premises of each and every rule are analysed to match them with the current set of facts (F). Then a subset of the conflict set is chosen, consisting of the rules that are actually activated. Finally, the previously selected rules are analysed and the set of facts is updated. The most time consuming procedure searches the rules and finds those relevant to the given data (goal/facts) [4]. The aim is to shorten this procedure as much as possible. The modification of the classical inference algorithm proposed by the authors is based on a change in the structure of the rule set. It is no longer a list of all rules; it now forms a partition of rules with representatives. Thanks to this, the searching procedure is optimized: less time is needed to find the group, and then the rule, that matches a given set of facts.

(13) In the case of backward inference we look for rules with a given hypothesis as a conclusion [6]. Examples of applying the inference in rule KBs are described in [8], where a modification of the backward inference algorithm for rule KBs is presented.

4.1 Forward Inference Algorithm Based on Partition of Rules

The forward inference algorithm based on the partition of rules (see Alg02) reduces the search space by choosing only rules from the particular group of rules which matches the current fact set. Thanks to the similarity-based partition strategy, the result is a set of groups of similar rules. In every step, the inference engine matches the facts against the groups' representatives and finds the group with the greatest similarity value. To find a relevant group of rules during the inference process, the maximal value of similarity between the set of facts F (and/or the hypothesis to be proven) and the representative Profile(Rj) of each group of rules is searched for(14). The more facts (with the hypothesis) are included in the profile of group Rj, the greater the value of similarity. Exhaustive searching of rules is done only within the selected group. At the end, rules from the most promising group are activated. The inputs are: PR, the groups of rules with their representatives, and F, the set of facts. The output is F, the set of facts, including possible new facts obtained through the inference. The algorithm uses a temporary variable R, which is the set of rules resulting from the previous selection.

(14) $sim(F, R_j) = \frac{|F \cap Profile(R_j)|}{|F \cup Profile(R_j)|}$. The value of sim(F, Profile(Rj)) is equal to 0 if no fact fi (or hypothesis to prove) is included in the representative of group Rj. It is equal to 1 if all facts (and/or the hypothesis) are included in Profile(Rj) of group Rj.

Alg02: Modified forward inference algorithm
Require: PR, F;
Ensure: F;
procedure forwardInference(PR, var F)
var R, r;
begin
  R := selectBestFactMatchingGroup(PR, F);
  while R ≠ ∅ do
    r := selectRule(R, strategy);
    activeRule(r);
    R := selectBestFactMatchingGroup(PR, F);
  end while
end procedure

selectBestFactMatchingGroup is the procedure responsible for finding the most relevant group of rules. If the result of the selection is not empty (R ≠ ∅), the selectRule procedure is invoked. It finds a rule whose premise part is fully covered by the facts. If there is more than one such rule, the conflict set strategy plays a significant role in the final selection of one rule. The activeRule procedure adds a new fact (the conclusion of the activated rule) to the KB. Of course, it is necessary to remove the activated rule from further searching. Afterwards, the selectBestFactMatchingGroup procedure is called again. The whole process is repeated in a while loop until the selection becomes empty. Among the advantages of the proposed modification of the classical forward inference process are: reducing the time necessary to search the whole KB in order to find rules to activate, and obtaining additional information about the fraction of rules that match the input knowledge (the set of facts). It is worth mentioning that such a modification allows firing not only certain rules but also approximate rules, for which the similarity with the given set of facts is at least equal to T or greater than 0.
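To make the control flow of Alg02 concrete, here is a minimal Python sketch (our own illustration, not the kbExplorer code; the stopping condition is simplified and the LAST_RULE conflict-set strategy is assumed):

def sim_to_facts(facts, profile):
    """Jaccard similarity between the fact set and a group profile."""
    union = facts | profile
    return len(facts & profile) / len(union) if union else 0.0

def forward_inference(groups, facts):
    """Alg02-style loop: pick the best fact-matching group, fire one applicable
    rule from it, add its conclusion to the facts, and repeat.
    `groups` maps a group name to (profile, list of (premises, conclusion))."""
    facts = set(facts)
    fired = set()
    while True:
        # selectBestFactMatchingGroup
        name, (profile, rules) = max(groups.items(),
                                     key=lambda kv: sim_to_facts(facts, kv[1][0]))
        if sim_to_facts(facts, profile) == 0.0:
            break
        # selectRule: rules whose premises are fully covered by the facts
        applicable = [r for r in rules
                      if r[0] <= facts and (name, r) not in fired]
        if not applicable:          # simplification: stop when the best group is exhausted
            break
        rule = applicable[-1]       # LAST_RULE conflict-set strategy
        fired.add((name, rule))
        facts.add(rule[1])          # activeRule: the conclusion becomes a new fact
    return facts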
5 Experiments

The main goal of the research is to show that the inference process for a KB with the partition of rules is carried out in a shorter time than the classical inference process for a KB with a plain list of rules. To do this, it is necessary to compare the following parameters: the inference time for both inference algorithms, the size of the KB (the number of rules or groups of rules), and the percentage of the KB that is really analysed. The results of the inference process on the two different structures of the KB (rules and partition of rules) should not differ: facts derived from the rules should be the same as facts derived from the partition of rules. Checking this became an additional goal of the experiments. For a better understanding of the inference process for the partition of rules, we first show a simple case study below, and then the results of the experiments with some interpretation.

5.1 A Simple Case Study

To illustrate the idea of knowledge exploration with the partition of rules, let us consider the example KB presented in Section 3.2(15). The classical inference algorithm requires searching each of the 17 rules in this set. Using the similarity-based strategy presented in Section 3.4, only the representatives of the 9 created groups of rules are analysed. The more rules, the greater the profit from using this approach; usually k << n. The rule partitions intended for the forward inference are represented by the following profiles:

R5 = {r5}, Profile: {(b, 1)}
R6 = {r6}, Profile: {(c, 1)}
R7 = {r7}, Profile: {(e, 1)}
R8 = {r8}, Profile: {(f, 1)}
R9 = {r9}, Profile: {(g, 1)}
R12 = {r12}, Profile: {(h, 1)}
R22 = {r4, r3}, Profile: {(d, 1)}
R23 = {r11, r10}, Profile: {(i, 1)}
R25 = {r17, r16, r15, r14, r13, r2, r1}, Profile: {(a, 1), (j, 1), (c, 2), (e, 2)}

(15) Due to the limitation of size, the example is very simple and it is aimed at presenting the conception described above, useful for improving the classical inference algorithm.

Assuming that there is no given goal of the inference and F = {(a, 1), (j, 1), (c, 2), (e, 2)}, we need to find a group of rules that matches the set F:

sim(F, Profile(R5)) = 0, sim(F, Profile(R6)) = 0, sim(F, Profile(R7)) = 0,
sim(F, Profile(R8)) = 0, sim(F, Profile(R9)) = 0, sim(F, Profile(R12)) = 0,
sim(F, Profile(R22)) = 0, sim(F, Profile(R23)) = 0, sim(F, Profile(R25)) = 1.

If there is more than one relevant group, it is necessary to choose one for further analysis. In our case, there is only one such group: R25 = {r17, r16, r15, r14, r13, r2, r1}. We need to search for rules to execute within this group:

sim(F, Profile(R17)) = 1, sim(F, Profile(R16)) = 1, sim(F, Profile(R15)) = 1,
sim(F, Profile(R14)) = 1, sim(F, Profile(R13)) = 1, sim(F, Profile(R2)) = 1,
sim(F, Profile(R1)) = 1.

Every rule can be executed because it matches the whole set of facts. It is necessary to use a so-called conflict set strategy(16), which allows for selecting one rule. Using the LAST_RULE strategy, rule r17 is chosen for execution. In effect, the conclusion of rule r17, (b, 1), is added as a new fact to F (F = F ∪ {(b, 1)}). Next, according to the user's preferences, the algorithm runs again until it executes all relevant rules, until a given number of iterations has been done, or until a given goal of the inference is included in the set of facts.

(16) There are many possible conflict set strategies: LAST_RULE, FIRST_RULE, LONGEST_RULE, SHORTEST_RULE.
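For completeness, the group-selection step of this case study can be reproduced directly from the listed profiles with a few lines of Python (our illustration; the profiles and facts are taken verbatim from the case study, the code itself is not from the paper):

profiles = {
    "R5":  {("b", 1)}, "R6":  {("c", 1)}, "R7":  {("e", 1)},
    "R8":  {("f", 1)}, "R9":  {("g", 1)}, "R12": {("h", 1)},
    "R22": {("d", 1)}, "R23": {("i", 1)},
    "R25": {("a", 1), ("j", 1), ("c", 2), ("e", 2)},
}
F = {("a", 1), ("j", 1), ("c", 2), ("e", 2)}

def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0

scores = {name: jaccard(F, profile) for name, profile in profiles.items()}
best = max(scores, key=scores.get)   # "R25" with score 1.0; all other groups score 0.0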
5.2 Results of Experiments

We are aware that experiments should be done on real KBs. The results presented in the article are derived from both real and artificial rule sets; unfortunately, access to large real rule sets is not easy. It is worth mentioning that we are currently obtaining two real KBs, with over one thousand and over four thousand rules respectively, and we expect that during the next two months it will be possible to present the results of experiments on them. Table 1(17) presents the results of the experiments carried out on 12 different cases of KBs named 1A . . . 4C. Even if the number of rules is the same for some sets, they differ in the sets of facts given as the input knowledge; thus the KBs are considered as different sets.

Table 1. The results of the experiments (columns (1)-(4): rule-based inference; columns (5)-(10A): partition of rules-based inference)

KB | (1) | (2) | (3)  | (3A)  | (4)   | (5) | (6) | (7)  | (7A)  | (8)    | (9)    | (10)  | (10A)
1A | 10  | 10  | 10   |       | 2827  | 11  | 11  | 10   |       | 95,8   | 2939   | 100%  |
1B | 10  | 50  | 9,97 | 9,98  | 2691  | 11  | 11  | 16,6 | 12,8  | 50,3   | 2544   | 100%  | 100%
1C | 10  | 10  | 9,97 |       | 2713  | 11  | 11  | 11,8 |       | 67     | 2593   | 100%  |
2A | 17  | 17  | 10   |       | 3260  | 25  | 14  | 10,2 |       | 236    | 3111   | 56%   |
2B | 17  | 62  | 20   | 13,33 | 3190  | 25  | 19  | 10   | 10,07 | 149    | 3076   | 76%   | 56%
2C | 17  | 17  | 10   |       | 3147  | 25  | 9   | 10   |       | 98,8   | 3116   | 36%   |
3A | 199 | 199 | 30,4 |       | 16065 | 389 | 11  | 9,93 |       | 63026  | 15610  | 2,8%  |
3B | 199 | 199 | 42,9 | 39,47 | 16006 | 389 | 75  | 61,4 | 31,84 | 59865  | 23024  | 19,3% | 8,1%
3C | 199 | 199 | 45,1 |       | 15694 | 389 | 9   | 24,2 |       | 62078  | 16228  | 2,3%  |
4A | 416 | 416 | 51,9 |       | 29431 | 819 | 13  | 21,7 |       | 410940 | 31044  | 1,6%  |
4B | 416 | 416 | 51,8 | 56,73 | 30230 | 819 | 13  | 19,8 | 20,5  | 421159 | 44445  | 1,6%  | 12,5%
4C | 416 | 416 | 66,5 |       | 29410 | 819 | 280 | 20   |       | 448260 | 404121 | 34,2% |

(17) The following information is presented in the table: the number of rules (1), the number of rules analysed during the inference (2), the time of the inference process for rules [ms] (3) and the average time of the inference [ms] (3A), the time of loading the KB for rules [ms] (4), the number of groups (5) and the number of groups analysed during the inference (6), the time of the inference process for groups of rules [ms] (7) and the average time of the inference for the partition of rules [ms] (7A), the time to build the partition of rules [ms] (8), the time of loading the KB [ms] (9), as well as the percentage of the KB searched in contrast to the classical forward inference process (10) with the average percentage of the KB analysed in contrast to the classical version (10A).

As can be seen in the table, in the case of the classical forward inference process (rule-based), the more rules in the KB, the longer the time of inference. The reason is the necessity of matching every rule with a given set of facts. In contrast, for the partition-based inference the time grows much more slowly, so the more rules there are, the greater the advantage of the modified forward inference algorithm over the classical deduction. The differences in the inference times for the two structures of the KB, rules and the partition of rules, grow as the number of rules increases.

We have observed that the time of rules partitioning is significantly long in comparison to the time of inference, and the difference increases as the number of rules gets higher. That is why we suggest storing the partition of rules as a static structure (assuming that it is rebuilt whenever the KB is modified). Being aware of this, we are interested only in reducing the time of the inference as much as possible.

6 Conclusion

The article presents the idea of a rule-based KB decomposed into groups of rules, called partition of rules.
The KB in such a form allows for improving the efficiency of inference algorithms. In this article, only the strategy based on rules clustering is presented; however, there are many possible strategies for partitioning rules. Searching the representatives of the partition of rules, instead of every single rule, enables reducing the time of inference significantly. The algorithm for the forward inference on the partition of rules (proposed by the authors) has been compared to the classical inference algorithm (for rules). The results presented in Table 1 are promising. We can assume that for large rule sets (consisting of hundreds of rules or more) it is better to partition them into groups of similar rules and to carry out the inference process on the representatives of such groups. The partition of rules is a very useful property which makes management of large rule sets easier and speeds up the inference process. The concept of rules' partitions is a new proposition, based on the previous research concerning rules clusters [9] and decision units [8]. The proposed modified inference algorithm is a continuation of the previous research on optimisation of the inference in knowledge-based systems [7]. In future work, the authors plan to perform more detailed comparative experiments on real-world KBs consisting of over 4000 rules.

Mining knowledge bases is becoming an important task for many data mining applications. Thus, in future work, in order to reduce the time complexity of the algorithm for rules clustering and of the inference on the created groups of rules, the authors are going to present modifications of clustering algorithms based on incomplete knowledge.

Acknowledgements

This work is a part of the project "Exploration of rule knowledge bases" funded by the Polish National Science Centre (NCN: 2011/03/D/ST6/03027).

References

1. Toivonen H., Klemettinen M., Ronkainen P., Hatonen K., Mannila H.: Pruning and Grouping Discovered Association Rules, 1995.
2. Sikora M., Gudyś A.: CHIRA - Convex Hull Based Iterative Algorithm of Rules Aggregation, Fundam. Inf., Vol. 123 (2), p. 143-170, IOS Press, 2013.
3. Pindur R., Susmaga R., Stefanowski J.: Hyperplane Aggregation of Dominance Decision Rules, Fundam. Inf., Vol. 61 (2), p. 117-137, IOS Press, 2004.
4. Pedrycz W., Cholewa W.: Expert Systems [in Polish], Silesian University of Technology, Poland, Section of Scientific Publications, 1987.
5. Nowak-Brzezińska A., Simiński R.: Knowledge Mining Approach for Optimization of Inference Processes in Rule Knowledge Bases, Lecture Notes in Computer Science, Vol. 7567, p. 534-537, Springer Berlin Heidelberg, 2012.
6. Akerkar R., Sajja P.: Knowledge-Based Systems, Jones & Bartlett Learning, 2010.
7. Nowak-Brzezińska A., Simiński R., Xieski T., Jach T.: Towards a Practical Approach to Discover Internal Dependencies in Rule-Based Knowledge Bases, LNCS, Vol. 6954, p. 232-237, Springer Verlag, 2011.
8. Simiński R.: Extraction of Rules Dependencies for Optimization of Backward Inference Algorithm, Beyond Databases, Architectures, and Structures, Communications in Computer and Information Science, Vol. 424, p. 191-200, Springer International Publishing, 2014.
9. Nowak-Brzezińska A., Wakulicz-Deja A.: The Way of Rules Representation in Composited Knowledge Bases, Advances in Intelligent and Soft Computing, Man-Machine Interactions, p. 175-182, Springer-Verlag, 2009.
10. Nalepa G., Ligeza A., Kaczor K.: Overview of Knowledge Formalization with XTT2 Rules, Rule-Based Reasoning, Programming, and Applications, LNCS 6826, p. 329-336, Springer Verlag, 2011.
11. Latkowski R., Mikołajczyk M.: Data Decomposition and Decision Rule Joining for Classification of Data with Missing Values, Lecture Notes in Artificial Intelligence, LNAI 3066, p. 254-263, Springer Verlag, 2004.
12. Pawlak Z.: On Learning - A Rough Set Approach, In: Lecture Notes in Computer Science (A. Skowron, Ed.), Vol. 28, p. 197-227, Springer, Berlin, 1985.
13. Skowron A., Pawlak Z., Komorowski J., Polkowski L.: Rough Sets Perspective on Data and Knowledge, in: W. Kloesgen, J. Zytkow (Eds.), Handbook of Data Mining and Knowledge Discovery, p. 134-149, Oxford University Press, Oxford, 2002.