New inference algorithms based on rules partition

Agnieszka Nowak-Brzezińska and Roman Simiński

Department of Computer Science, Institute of Computer Science, Silesian University, Poland
http://ii.us.edu.pl

Abstract. This paper presents new inference algorithms based on rules partition. The optimisation relies on reducing the number of rules searched in each run of inference, reducing the number of unnecessary recursive calls, and avoiding the extraction of too many new facts. The first part of the paper presents the definition of the knowledge base model based on rules partition. The second part describes the modifications of both the forward and the backward inference algorithm using a knowledge base with rules assigned to the groups given by the partition strategy. Finally, the paper contains an example knowledge base together with a description of the inference processes for this particular knowledge base.

Keywords: knowledge base, inference process, goal driven, data driven, rules partition

1 Introduction

Knowledge bases are still one of the most popular methods of knowledge representation. They are constantly increasing in volume, so the knowledge stored as a set of rules is becoming progressively more complex. The ability to obtain large rule sets in an easy and automatic way has an enormous negative impact on the time of decision making: inference in large rule sets is a time-consuming process, and its results are in many cases hard to understand [1, 11, 7]. The effectiveness of both forward and backward inference algorithms is a significant problem for large and complex rule bases, especially for inference used in systems which work in real-time environments and do not rely on a dialogue with the user during inference.

This paper presents a new idea of partitioning a rule knowledge base and the application of this idea to the task of inference optimisation. We introduce an approach which assumes that the rule knowledge base is decomposed into groups of rules, called rules partitions. A knowledge base in the form of rules partitions allows the efficiency of inference algorithms to be improved: the groups of rules into which the knowledge base is partitioned reduce the search space of the inference algorithms and decrease the time needed to browse the rules. Thus we may say that a knowledge base in the form of a rules partition optimises the inference processes. In this work we introduce two types of strategies for partitioning a knowledge base. The first divides the rule knowledge base into groups of rules with similar premises and is called the strategy based on similar premises; the second divides the rules into groups with the same conclusion and is known as the decision strategy. Both strategies will be used as the basis for the proposed modifications of the inference algorithms.

The introduced partition strategies correspond to the research carried out so far by the authors on cluster analysis for rules in knowledge bases [8-10] and on the application of the decision unit concept [12]. The proposed modification of the knowledge base model is a kind of decomposition of the knowledge base. Every created group of rules has its representative, and during inference the representatives of the groups are browsed instead of every rule one by one. The improvement in the efficiency of forward inference is significant if the quality of the groups' representatives is high enough.
The modification of the classical backward inference algorithm is based on extracting information about internal rule dependencies. This information makes it possible to perform only promising recursive calls of the backward inference algorithm. The optimisation relies on reducing the number of rules searched in each run of inference and reducing the number of unnecessary recursive calls. The first part of the work presents the rules partitioning approach, the second part describes the modifications of the inference algorithms, and the final part contains an example knowledge base with a description of the inference processes for this particular knowledge base.

2 Decomposition of rule knowledge bases

The proposed concept assumes a division of the knowledge base into coherent subgroups of rules. Because this conception is relatively new and not widely known, this section presents the basic concepts and notation of knowledge base partitioning, and a significant part of this work is a detailed description of the proposed approach.

2.1 Preliminary information

The rule-based knowledge base R consists of n rules r1, ..., ri, ..., rn and is an unordered set. Each rule ri ∈ R is stored as a Horn clause: ri: p1 ∧ p2 ∧ ... ∧ pm → c, where m is the number of literals¹ in the conditional part of rule ri (m ≥ 0, i = 1, ..., n) and c is the literal of the conclusion part of rule ri. An attribute a ∈ A may appear in the conclusion of a rule ri as well as among its premises. For each rule ri ∈ R, the function concl(ri) returns the conclusion literal of ri, whereas the function cond(ri) returns the set of conditional literals of ri. The set of facts is denoted by F = {f1, f2, ..., fs}; the facts are also stored as literals. The knowledge base KB is defined as the pair KB = (R, F).

¹ Literals are attribute-value pairs (a, v_a) (equivalently written a = v_a). Let A be a non-empty finite set of conditional and conclusion (decision) attributes. For every attribute a ∈ A, the set V_a is called the set of values of attribute a.

2.2 Partition strategies

For every rule base R with n rules the number of possible subsets is 2^n. Any arbitrarily created subset of rules R' ∈ 2^R is called a group of rules. A partition of rules PR is created by a partition strategy, denoted PS, which generates groups of rules: PS(R) ⇒ PR. A partition strategy PS for the rule set R generates the partition PR ⊆ 2^R, PR = {R1, R2, ..., Rk}, where k is the number of groups of rules formed by the partition PR and Rj ∈ 2^R is the j-th group of rules, j = 1, ..., k. The partition strategy PS is a general concept: many different approaches to creating groups of rules are possible, and each of them can be realised by a different partition strategy. Every simple partition strategy requires the definition of a membership criterion mc: R × PR → [0, 1]. The function mc can be defined in a variety of ways; in particular we can consider a binary form of this function (mc: R × PR → {0, 1}), for example:

  mc(ri, Rj) = 1 if sim(ri, Rj) ≥ T, and 0 in the opposite case,  (1)

where sim(ri, Rj) is defined in eq. (3). In general, the value 1 denotes the situation when rule ri ∈ R is definitely included in the group Rj, and 0 the opposite case; a value between 0 and 1 denotes the degree of membership of rule ri in the group Rj. It is possible to achieve many different partitions of rules in this way.
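As an illustration, the binary criterion (1) can be sketched in Python as follows. Rules and group representatives are modelled here as frozensets of (attribute, value) literals; the function names are illustrative assumptions rather than the paper's notation, and jaccard anticipates the similarity measure sim formalised later as eq. (3).

def jaccard(a, b):
    # Jaccard similarity of two literal sets; stands in for sim of eq. (3)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def mc(rule_cond, group_repr, T):
    # binary membership criterion of eq. (1): 1 iff sim(r_i, R_j) >= T
    return 1 if jaccard(rule_cond, group_repr) >= T else 0

# usage: a rule with premises {(a,1), (b,1)} against a group representative
print(mc(frozenset({("a", 1), ("b", 1)}),
         frozenset({("a", 1), ("b", 3)}), T=0.3))   # prints 1, since sim = 1/3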
More generally, taking op to be any operator from the set {>, ≥, <, ≤, =, ≠}, we can define the partition PR as follows: PR = {Rj : Rj ∈ 2^R ∧ ∀ri∈Rj mc(ri, Rj) op T}, where the value mc(ri, Rj) describes the membership of rule ri in the group Rj, which can be greater than, greater than or equal to, equal to, not equal to, less than, or less than or equal to the threshold value T (0 ≤ T ≤ 1).

An exceptional case of a simple strategy is the specific partition called the selection in what follows. The selection divides the set of rules R into two subsets, R' and R − R', in such a way that all rules from R' meet the membership criterion of some partition strategy PS and all the remaining rules do not. The partition PR = {R', R − R'}, which is the result of the selection, is important in practical terms. During inference the knowledge base is normally searched rule by rule (which is time consuming) in order to find the set of relevant rules. The selection serves the same role, because it provides the rules to be activated during the inference process, but in a more efficient way.

There are two types of partition strategies: simple and complex. A simple strategy creates the final partition PR in a single pass over the rule set R, assigning each rule ri to a group Rj according to the value of the membership function mc(ri, Rj). Examples of simple strategies are the selection of the pair of most similar rules, or the selection of the group of rules to which a given rule is most similar according to the threshold T and the membership function mc. The strategy of creating groups of similar rules is an example of a complex strategy, because it typically runs a simple strategy (finding the two most similar rules or groups of rules) many times. Complex strategies usually do not generate the final partition in a single step: they begin with a preliminary partition, which is then modified in accordance with the partitioning method. A complex strategy is defined by a combination of a few simple strategies, by an iteration of one of them, or by a combination of both.

The partition strategy based on finding groups of rules with similar premises is an example of a complex strategy which repeatedly uses the simple strategy of choosing the pair of most similar rules. Finding the pair of most similar objects is a crucial step in every clustering algorithm; thus the partition strategy which divides rules into groups with similar premises gives a result similar to the Agglomerative Hierarchical Clustering (AHC) algorithm, except that the hierarchy does not need to be saved [8, 9]: only the final partition matters. The process can be terminated when the similarity between merged clusters drops below a given threshold (this variant was called the mAHC algorithm in [6]). The result of partitioning the set of all rules from R using the mAHC algorithm is k groups of rules R1, ..., Rl, ..., Rk:

  PR = {R1, ..., Rl, ..., Rk : ∀ri,rj∈Rl sim(ri, rj) ≥ T}.  (2)

Each group Rl consists of rules whose mutual similarity is at least T. To achieve this, at the beginning a simple strategy forms the groups R1, ..., Rn in such a way that every rule belongs to exactly one group. This strategy is used only once and is defined as: R = R1 ∪ ... ∪ Rn, where |Ri| = 1 for every group and Ri ∩ Rj = ∅ for all Ri ≠ Rj. In the next step the strategy based on finding the most similar rules or groups of rules is executed repeatedly; a sketch of this merging loop is given below.
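In Python, the merging loop might look as follows. This is a minimal sketch under two assumptions: the profile of a group is taken as the set of condition literals common to all its rules (one of several possible choices), and rule similarity is the Jaccard measure formalised as eq. (3) in the next subsection; all identifiers are illustrative.

from itertools import combinations

def jaccard(a, b):
    return len(a & b) / len(a | b) if (a or b) else 0.0

def profile(group):
    # literals common to all condition sets in the group
    common = group[0]
    for cond in group[1:]:
        common = common & cond
    return common

def mahc(rule_conds, T):
    # rule_conds: list of frozensets of (attribute, value) premises
    groups = [[c] for c in rule_conds]          # singleton start partition
    while len(groups) > 1:
        # simple strategy, run repeatedly: find the two most similar groups
        (i, j), best = max(
            ((pair, jaccard(profile(groups[pair[0]]), profile(groups[pair[1]])))
             for pair in combinations(range(len(groups)), 2)),
            key=lambda item: item[1])
        if best < T:                            # similarity dropped below T: stop
            break
        groups[i].extend(groups[j])             # merge the most similar pair
        del groups[j]
    return groups

With T = 1 only rules with identical condition sets end up merged; lowering T yields coarser partitions.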
To determine the similarity between rules, the authors use the function sim: R × R → [0, 1] defined as:

  sim(ri, rj) = |cond(ri) ∩ cond(rj)| / |cond(ri) ∪ cond(rj)|.  (3)

The value of sim(ri, rj) is equal to 1 if the rules are formed by the same literals and 0 when they have no common literals; the more similar the rules are, the closer sim(ri, rj) is to 1. This measure uses the similarity of the conditional parts of the rules; however, information about the conclusions of the rules may also be used when measuring rule similarity. The measure works equally well for rules, groups of rules, representatives of groups and sets of facts.

It is worth mentioning that each group in the created rules partition receives a representative, called its Profile. Instead of searching the full structure of rules, only the representatives of the groups are compared. A profile is usually given by the set of literals common to all the rules forming a group, the set of the most frequent literals, or the set of distinctive ones. To find the relevant group of rules during inference, the maximal value of similarity (see eq. (3)) between the set of facts F (and/or the hypothesis to be proven) and the representative Profile(Rj) of each group is sought. The value of sim(F, Profile(Rj)) is equal to 0 if no fact fi (nor the hypothesis to prove) is included in the representative of group Rj, and it is equal to 1 if all facts (and/or the hypothesis) are included in Profile(Rj); the more facts and hypotheses included in the profile of group Rj, the greater the value of similarity. An exhaustive search of rules is done only within the chosen group. Details of the inference process are given in Section 3.

The partition strategy for rules with the same conclusions creates groups of rules from R by grouping rules with the same conclusion literal; it can also be called the decision-oriented partition strategy. The membership criterion for rule ri and group Rj is given by the function mc defined in eq. (1). When the simple partition algorithm (described in the next section as Alg01: createPartitions) is used with the mc function defined in this way, we obtain decision-oriented partitions. Each group of rules generated by this algorithm has the form R' = {r ∈ R : ∀ri∈R' concl(ri) = concl(r)}. The number of groups in the partition, k (1 ≤ k ≤ n), depends on the number of different decisions appearing in the conclusions of the rules: when we distinguish decisions by the different conclusions appearing in the rules, we get one group for each conclusion. This partition strategy corresponds to the concept of decision units presented in [9, 10]; within the more general conception of rules partitions presented here, decision units are just one of many possible partitioning strategies. Every group contains rules with the same conclusion, a fixed (a, v) pair, and all rules grouped within such a rule set take part in an inference process confirming the goal described by that particular attribute-value pair: for each R' ∈ PR, the conclusion set satisfies |Concl(R')| = 1. An attribute in a knowledge base usually represents a particular concept or property of the real world, and the value of the attribute expresses the determined meaning of the concept or the state of the property. A minimal sketch of the decision-oriented grouping is given below.
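The decision-oriented strategy is straightforward to sketch in Python; the representation of a rule as a (premises, conclusion) pair and all names are illustrative assumptions.

from collections import defaultdict

def decision_partition(rules):
    # one group per conclusion literal, so |Concl(R')| = 1 for every group
    groups = defaultdict(list)
    for cond, concl in rules:
        groups[concl].append((cond, concl))
    return dict(groups)

# usage: two rules concluding (d, 4) land in the same group
rules = [(frozenset({("a", 1)}), ("d", 4)),
         (frozenset({("b", 2)}), ("d", 4))]
assert len(decision_partition(rules)) == 1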
If we consider a specified (a, v) pair, we think about a particular kind of concept or about a particular state of a property. Similarly, in the decision tables DT = (U, A ∪ {d}) considered in rough set theory, the values of the decision attribute d define a partition of the objects from U into the decision classes CLASS_A(d) = {X_A^1, X_A^2, ..., X_A^{r(d)}}, where r(d) is the cardinality of the value set of the attribute d and X_A^i is the i-th decision class of A [13]. Each group of rules from the elementary partition is similar to the i-th decision class X_A^i [14].

2.3 The partition algorithms

The input parameters of the rules partition algorithm based on a simple strategy are the knowledge base R, the mc function and the value of the threshold T; the output is the partition PR. The time complexity of this algorithm is O(n · k), where n = |R| and k = |PR|. For each rule r ∈ R we check whether the goal partition PR already contains a group R' that accepts the rule r (the value of the mc function has to be at least equal to T: mc(r, R') ≥ T). If no such group exists, the rule r becomes the seed of a new group, which is added to the created partition PR.

Alg01: The simple partition algorithm
Require: R, mc, T;
Ensure: PR = {R1, R2, ..., Rk};
procedure createPartitions( R, var PR, mc, T )
var R', r;
begin
  PR = {};
  for all r ∈ R do
    if ∃ R' ∈ PR : mc(r, R') ≥ T then
      R' = R' ∪ {r};
    else
      R' = {r};
      PR = PR ∪ {R'};
    end if
  end for
end procedure

The algorithm creating the specific partition called the selection is presented below. The input parameters are the knowledge base R, the function mc which defines the membership criterion, and the threshold value T. The output is the single group of rules R' consisting of those rules from R that fulfil the mc condition. In fact, each selection generates a two-element partition, but the second group is defined implicitly as R − R', so the final partition PR for the selection strategy is created as PR = {R', R − R'}. The time complexity of this algorithm is O(n), where n = |R|. Let us notice that the presented algorithm has no knowledge about the specifics of a given group of rules; it uses only the values of the membership function mc. If the parameter R' is an empty set, the function mc determines whether the rule r fulfils the condition given by the membership function for the empty group R'.

Alg02: Selection strategy algorithm
Require: R, mc, T;
Ensure: R' = {r1, r2, ..., rl};
procedure createSelection( R, var R', mc, T )
var r;
begin
  for all r ∈ R do
    if mc(r, R') ≥ T then
      R' = R' ∪ {r};
    end if
  end for
end procedure

An example of a partition algorithm different from the selection strategy algorithm is the algorithm based on clustering rules into groups of similar rules. For each rule it checks whether the rule should be included in a given group; the result is a partition of k groups, PR = {R1, R2, ..., Rk}. This algorithm corresponds to the AHC and mAHC algorithms and is presented below.

Alg03: Partition algorithm given by clustering
Require: R, mc, T;
Ensure: PR = {R1, R2, ..., Rk};
procedure clustering( R, var R', mc, T )
begin
  repeat
    pOR = findTwoMostSimilarGroupsOfRules( R );
    createNewGroup( pOR, R );
  until max{sim(Ri, Rj)} < T
end procedure

where R' denotes the created rules partition. findTwoMostSimilarGroupsOfRules( R ) is the procedure that compares the groups' representatives and finds the two most similar groups.
The createNewGroup( pOR, R ) function adds the newly created group to the structure and reduces the size of the structure by removing the elements which formed the new group. The partitions generated by the algorithms Alg01, Alg02 and Alg03 are unique and complete: each rule r ∈ R is considered and assigned to exactly one group R' that matches the mc criterion. The presented modification of backward inference will utilise these features; for this reason, algorithms which generate incomplete or non-unique partitions are not presented in this work.

3 Inference in knowledge bases with partition of rules

The main goal of decision support systems (DSSs) is to infer new knowledge from the data stored in the knowledge base. Inference is the process of proving (or falsifying) a hypothesis, knowing that the facts used in the process are true [1, 2]. The more rules in the knowledge base, the greater the time needed to search it; much also depends on the number of facts given as input data. Choosing a proper inference method is crucial and should be considered with respect to the goal of the given DSS user. If the user wants to prove some hypothesis, the best choice is a goal-driven inference process; when the user does not know what to prove, but has some facts given as input knowledge, the proper way is to use a data-driven inference process.

3.1 Forward inference

This type of inference is also called data driven or fact driven. Each rule is analysed to determine whether all its premises are satisfied; if that is the case, the rule is activated and its conclusion is added to the set of facts. The algorithm stops either when there are no more rules which can be activated or when the starting hypothesis is added to the set of facts [5-8]. Forward inference leads to a massive growth of new facts, but most of them are nonessential for proving the starting hypothesis and can slow down the inference process significantly. Unfortunately, when the DSS consists of more than 100 rules, the vast majority of the time (over 90%) is consumed by finding the rules which can be activated [2-4].

The inference process can be divided into three stages: matching, choosing and execution. First, the premises of every rule are analysed and matched against the current set of facts. Then a subset of the conflict set is chosen, consisting of the rules which are actually to be activated. Finally, the selected rules are executed and the set of facts is updated. The matching phase usually takes the largest share of the processing time of inference. To reduce the time necessary for the matching phase, the authors propose to partition the rules into groups that match the fact set with a minimal covering threshold. It is not even necessary to build a full partition; it suffices to create the selection of all rules relevant to the set of facts with some minimal covering factor. Given the structure with the rules partition, forward inference works as follows: in each iteration the representatives of the groups are browsed, and only the group with the highest similarity to the given set of facts is chosen for further analysis. Only this group is then searched: each rule belonging to it is analysed, and when a rule's premises are fully covered by the set of facts, the rule is activated and its conclusion is added to the set of facts.
The pseudocode of this inference process is as follows:

Require: R, F;
Ensure: F;
procedure forwardInference( R, F )
var R';
begin
  createFactsMatchingSelection( R, F, R' );
  while R' ≠ ∅ do
    applyRules( R', F );
    excludeRules( R, R' );
    createFactsMatchingSelection( R, F, R' );
  end while
end procedure

The algorithm uses the temporary variable R', the set of rules resulting from the previous selection. The procedure responsible for finding the group most relevant to the given set of facts F, and thus the rules to activate, is createFactsMatchingSelection: it creates the selection of rules that match the condition defined by mc, where the membership function mc checks whether sim(ri, Rj) is greater than or equal to T. Using the createFactsMatchingSelection procedure significantly reduces the time necessary to find the relevant rules: the original time complexity O(n) is reduced to O(k), where k << n and k denotes the number of created groups. If the result of the selection is not empty (R' ≠ ∅), the procedure applyRules is executed; it adds the conclusions of all activated rules from the set R' to the set of facts. The excludeRules procedure temporarily removes the activated rules from further searching. Afterwards createFactsMatchingSelection is called again, and the whole process is repeated in the while loop until the selection becomes empty. The most commonly used way of running the mc function here is to calculate the similarity between the given set of facts and the rules of the previously selected set R'.

Among the advantages of the proposed modification of the classical forward inference process are: reducing the time necessary to search the whole knowledge base in order to find the rules to activate, and obtaining additional information about the fraction of rules that match the input knowledge (the set of facts).
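For illustration, a minimal runnable sketch of this procedure follows. It assumes that a group profile is the union of all literals occurring in the group's rules (one of the profile variants mentioned in Sect. 2.2), that rules are (premises, conclusion) pairs, and that all identifiers are illustrative rather than the paper's implementation; unlike the strict best-group-only description above, the sketch falls back to the next-best group when the best one contains no fireable rule.

def jaccard(a, b):
    return len(a & b) / len(a | b) if (a or b) else 0.0

def group_profile(group):
    # union of all literals occurring in the group's rules
    lits = set()
    for cond, concl in group:
        lits |= cond
        lits.add(concl)
    return lits

def forward_inference(partition, facts):
    # partition: list of groups, each a list of (frozenset, literal) rules
    facts = set(facts)
    groups = [list(g) for g in partition]     # working copies (excludeRules)
    progress = True
    while progress:                           # until the selection is empty
        progress = False
        # matching phase: compare facts against group profiles, best first
        ranked = sorted((g for g in groups if g),
                        key=lambda g: jaccard(facts, group_profile(g)),
                        reverse=True)
        for g in ranked:
            fired = [r for r in g if r[0] <= facts]   # premises fully covered
            if fired:
                for cond, concl in fired:
                    facts.add(concl)          # applyRules: add the conclusion
                    g.remove((cond, concl))   # excludeRules: block fired rule
                progress = True
                break                         # re-match with the grown facts
    return facts

For the example base of Sect. 3.3, partitioned into the groups listed there, forward_inference derives (f, 1) from the facts (g, 1) and (d, 4), in line with the walkthrough given in that section.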
3.2 Backward inference

In this section we present the backward inference algorithm, which uses the conception of the decision partitions PR created by the partition algorithm Alg01. The proposed modification utilises two ideas: (i) the reduction of the search space by choosing only the rules from the particular rule group, according to the current structure of the decision-oriented rules partition, and (ii) the estimation of the usefulness of each recursive call for sub-goals, so that only promising recursive calls of the classical backward algorithm are made. The modified backward inference algorithm uses the following input data: PR, the decision partition; F, the set of facts; and g, the goal of the inference. The output of the algorithm is the set of facts F, including the possible new facts obtained through inference, and a boolean result: true if the goal g is in the set of facts (g ∈ F), false otherwise.

Alg04: Modified backward inference algorithm
function backwardInference( PR, g, var F ) : boolean
begin
  if g ∈ F then
    return true
  else
    A ← ∅, truePremise ← false
    select R' ∈ PR where g ∈ Concl(R')
    while ¬truePremise ∧ (R' − A) ≠ ∅ do
      select r ∈ (R' − A) according to the current selection strategy
      for all w ∈ cond(r) do
        truePremise ← ( w ∈ F )
        if ¬truePremise ∧ w ∈ In_C(R') then
          truePremise ← backwardInference( PR, w, F )
        elseif ¬truePremise then
          truePremise ← environmentConfirmsFact( w )
        endif
        if ¬truePremise then
          break
        endif
      endfor
      if ¬truePremise then
        A = A ∪ {r}
      endif
    endwhile
  endif
  if truePremise then
    F = F ∪ {g}
  endif
  return truePremise
end function

The proposed modification of the backward inference algorithm utilises the information from the decision partitions of rules to reduce the number of unnecessary recursive calls and the number of rules searched in each run of the inference. Only promising groups of rules are selected for further processing (select R' ∈ PR where g ∈ Concl(R')), where Concl(R') is the set of conclusions of the rule group R', containing the literals appearing in the conclusion parts of the rules r ∈ R'. Only the subset (R' − A) of all the rules is processed in each iteration. Finally, only promising recursive calls are made (w ∈ In_C(R')). In_C(R') denotes the connected inputs of the rule group, defined in the following way: In_C(R') = {(a, v) ∈ Cond(R') : ∃r∈R (a, v) = concl(r)}, where Cond(R') is the set of conditions of the rule group R', containing the literals appearing in the conditional parts of the rules r ∈ R'. In each iteration the set R' contains only the rules matching the currently considered goal; this completely eliminates the need to search the whole rule set R for rules whose conclusion matches the inference goal, because this information is simply stored in the decision units and does not have to be generated.
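A minimal runnable sketch of this algorithm follows. It assumes an acyclic rule base, models the decision partition as a dict keyed by conclusion literal (the shape produced by the decision_partition sketch in Sect. 2.2), and uses a callback in place of environmentConfirmsFact; as in Alg04, a recursive call is attempted only for connected inputs, i.e. premises that some rule in the base can conclude. All identifiers are illustrative assumptions.

def backward_inference(pr, goal, facts, confirms=lambda w: False):
    # pr: dict {conclusion literal: list of (premises, conclusion) rules}
    # facts: mutable set of literals, extended with every confirmed goal
    if goal in facts:
        return True
    group = pr.get(goal)              # select R' in PR where goal in Concl(R')
    if group is None:
        return False                  # no promising group exists
    in_c = set(pr)                    # literals that some rule can conclude
    for cond, _ in group:             # rules of the selected group only
        confirmed = True
        for w in cond:
            if w in facts:
                continue
            if w in in_c:             # promising recursive call only
                if backward_inference(pr, w, facts, confirms):
                    continue
            elif confirms(w):         # otherwise ask the environment
                facts.add(w)
                continue
            confirmed = False
            break                     # this rule cannot fire; try the next one
        if confirmed:
            facts.add(goal)           # the goal becomes a new fact
            return True
    return False

For the nine-rule example base of the next subsection, backward_inference(pr, ("f", 1), {("a", 1)}) confirms the goal and adds (d, 4) and (f, 1) to the facts, mirroring the goal-driven walkthrough given there.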
3.3 An example of a knowledge base and inference processes

To illustrate the conception of partitioning and exploring the knowledge base, let us consider an example knowledge base R which consists of the following rules:

r1: (a, 1) ∧ (b, 1) → (c, 1)
r2: (a, 1) ∧ (b, 2) → (c, 2)
r3: (a, 1) ∧ (b, 3) → (c, 1)
r4: (b, 3) ∧ (d, 3) → (e, 1)
r5: (b, 3) ∧ (d, 2) → (e, 1)
r6: (b, 3) → (e, 2)
r7: (d, 4) → (f, 1)
r8: (d, 4) ∧ (g, 1) → (f, 1)
r9: (a, 1) → (d, 4)

Given the knowledge base presented above and the set of facts F = {(a, 1)}, the goal-driven inference algorithm confirms the main goal of the inference, described by the literal (f, 1), as follows. It is necessary to find the rules matching the goal and then, if such rules exist, to select the rule to be activated. In our simple example we select the group R1 with the rules r7 and r8, from which we choose r7 as the rule with the higher similarity value. Now we have to confirm the truth of the premise (d, 4). Because it does not appear in the fact set, it becomes a new sub-goal of the inference and the inference algorithm runs recursively. The classic version of the algorithm does not know whether the call is promising, i.e. it is unknown whether any rule proves the new inference goal. For the goal (d, 4) we use a recursive call for (a, 1); it is a fact, and once it is confirmed, the sub-goal (d, 4) is confirmed as well, followed by the main hypothesis (f, 1).

According to the rules partition based on finding similar rules and the set of facts F = {(g, 1), (d, 4)}, the data-driven inference algorithm proceeds as follows. Assuming that the chosen partition strategy returns the following structure of the knowledge base:

R1 = {r7, r8}, R2 = {r1, r3}, R3 = {r4, r5}, R4 = {r2}, R5 = {r6}, R6 = {r9},

we check the similarity² between the set of facts F and the representative of each group: sim(F, Profile(R1)) = 2/3, sim(F, Profile(R2)) = 0, sim(F, Profile(R3)) = 0, sim(F, Profile(R4)) = 0, sim(F, Profile(R5)) = 0, sim(F, Profile(R6)) = 0. Next we choose the group most similar to the set of facts (which is R1) and we check the similarity of every rule in this group: sim(r7, F) = 1/3, sim(r8, F) = 2/3. We activate the most similar rule whose premises are all known (here rule r8) and include the new fact, the conclusion of the activated rule r8, in F: F = F ∪ {(f, 1)}. Of course, we block the rule r8 to make sure that it will not be analysed again, because that would only increase the time of the inference process without gaining additional knowledge; we also block a whole group when all its rules are already blocked. The result of the inference process is eventually the set of facts F = {(g, 1), (d, 4), (f, 1)}.

² The value for the pair F and Profile(R1) is equal to 2/3 because there are two literals, (g, 1) and (d, 4), common to the set of facts and the representative of the group, Profile(R1) = {(g, 1), (d, 4), (f, 1)}.

If the rules partition is the result of the partition strategy based on a clustering algorithm, it is necessary to search the groups of rules using the similarity measure applied to the profiles of the groups. Any rule clustering algorithm may be used, but the most efficient one is, in the authors' opinion, the AHC or mAHC algorithm presented in [8, 9]. It is then quite easy to find a group suitable for the inference process: the time complexity of this operation is O(log n), where n is the number of elements grouped during the clustering.

4 Summary

We have introduced an approach which assumes that the rule knowledge base is decomposed into groups of rules, called rules partitions. In our opinion every knowledge base contains enough information to allow the efficiency of the classic inference algorithms to be improved, and we introduce partitions of rules as the source of such information. The modification of the classical inference algorithms is based on information extracted from the groups of rules generated by the two proposed partitioning strategies. The proposed modification of the forward algorithm reduces the time necessary for the matching phase of inference by choosing only the rules from the particular rule group which matches the fact set with a minimal covering threshold. The proposed modification of the backward algorithm reduces the search space by choosing only the rules from the particular rule group, according to the current structure of the decision-oriented rules partition, and estimates the usefulness of each recursive call for sub-goals. Therefore only the necessary rules are searched, and only promising recursive calls of the classical backward algorithm are made. The concept of rules partitions is a new proposition, based on the authors' previous research concerning rule clusters [9] and decision units [12].
The proposed modified inference algorithms are a continuation of the previous research on the optimisation of inference in knowledge-based systems [8].

Acknowledgements

This work is a part of the project "Exploration of rule knowledge bases" funded by the Polish National Science Centre (NCN: 2011/03/D/ST6/03027).

References

1. R. Akerkar, P. Sajja: Knowledge-Based Systems, Jones & Bartlett Learning, (2010)
2. D. Batory: The LEAPS Algorithms, http://reports-archive.adm.cs.cmu.edu/anon/1995/CMU-CS-95-113.pdf, (1995)
3. C. L. Forgy: On the Efficient Implementation of Production Systems, Carnegie Mellon University, (1979)
4. C. L. Forgy: Rete: A Fast Algorithm for the Many Pattern/Many Object Pattern Match Problem, Artificial Intelligence, (1982)
5. J. P. Ignizio: An Introduction to Expert Systems, McGraw-Hill, (1991)
6. D. P. Miranker: TREAT: A Better Match Algorithm for AI Production Systems, Department of Computer Sciences, University of Texas at Austin, (1987)
7. D. Heckerman: An Empirical Comparison of Three Inference Methods, CoRR, http://arxiv.org/abs/1304.2357, (2013)
8. A. Nowak-Brzezińska, T. Jach, R. Simiński, T. Xieski: Towards a practical approach to discover internal dependencies in rule-based knowledge bases, Lecture Notes in Computer Science, vol. 6954, Springer, 232-237, (2011)
9. A. Nowak-Brzezińska, A. Wakulicz-Deja: The way of rules representation in composited knowledge bases, Advances in Intelligent and Soft Computing, Man-Machine Interactions, Springer-Verlag, 175-182, (2009)
10. A. Nowak-Brzezińska, R. Simiński: Knowledge mining approach for optimization of inference processes in rule knowledge bases, Lecture Notes in Computer Science, vol. 7567, Springer Berlin Heidelberg, 534-537, (2012)
11. W. Pedrycz: Knowledge-Based Clustering: From Data to Information Granules, Wiley, (2005)
12. R. Simiński: Extraction of Rules Dependencies for Optimization of Backward Inference Algorithm, Beyond Databases, Architectures, and Structures, Communications in Computer and Information Science, vol. 424, Springer International Publishing, 191-200, (2014)
13. A. Skowron, J. Komorowski, Z. Pawlak, L. Polkowski: Handbook of Data Mining and Knowledge Discovery, Oxford University Press, (2002)
14. A. Skowron, Z. Suraj: Rough Sets and Intelligent Systems: Professor Zdzisław Pawlak in Memoriam, Intelligent Systems Reference Library, vol. 42, Springer, (2013)