Improving Michigan-style fuzzy-rule base classification generation using a Choquet-like Copula-based aggregation function

Edward Hinojosa-Cardenas1, Edgar Sarmiento-Calisaya1, Heloisa A. Camargo2 and Jose Antonio Sanz3

1 Universidad Nacional de San Agustin de Arequipa, Arequipa, Peru
2 Federal University of São Carlos, São Carlos, São Paulo, Brazil
3 Universidad Publica de Navarra, Pamplona, Spain

WILF'21: The 13th International Workshop on Fuzzy Logic and Applications, Dec. 20–22, 2021, Vietri sul Mare, Italy
Emails: ehinojosa@unsa.edu.pe (E. Hinojosa-Cardenas); esarmientoca@unsa.edu.pe (E. Sarmiento-Calisaya); heloisa@dc.ufscar.br (H. A. Camargo); joseantonio.sanz@unavarra.es (J. A. Sanz)
ORCID: 0000-0003-0307-7567 (E. Hinojosa-Cardenas); 0000-0002-0956-3091 (E. Sarmiento-Calisaya); 0000-0002-5489-7306 (H. A. Camargo); 0000-0002-1427-9909 (J. A. Sanz)

Abstract
This paper presents a modification of a Michigan-style fuzzy rule-based classifier obtained by applying the Choquet-like Copula-based aggregation function, which is based on the minimum t-norm and satisfies all the conditions required for an aggregation function. The proposed new version of the algorithm aims at improving the accuracy with respect to the original algorithm and involves two main modifications: replacing the fuzzy reasoning method of the winning rule by one based on the Choquet-like Copula-based aggregation function, and changing the calculation of the fitness of each fuzzy rule. The proposed modification, like the original algorithm, uses a (1+1) evolutionary strategy for learning the fuzzy rule base, and it shows promising results in terms of accuracy, compared to the original algorithm, over ten classification datasets with different sizes and different numbers of variables and classes.

Keywords
Michigan-style algorithm, fuzzy rule-based classification systems, Choquet-like Copula-based aggregation function, evolutionary strategy

1. Introduction

Classification problems arise in a wide range of real-world applications and research areas, for example, cancer classification [1], text classification [2], emotion classification [3], and so on. Many researchers have proposed machine learning-based techniques to solve the classification task, for instance, decision trees [4], neural networks [5], deep learning [6] and Fuzzy Rule-Based Classification Systems (FRBCSs) [7].

FRBCSs, a type of Fuzzy Rule-Based Systems (FRBSs), have proven to be an effective technique to tackle classification problems [8]. Additionally, a FRBCS contains if-then fuzzy rules with linguistic labels (represented by fuzzy sets) that model natural language and provide high interpretability, offering the possibility of understanding in detail how the system works [9].

Evolutionary Computation has been studied since the beginning of the 1990s to automatically learn or tune all the components of FRBSs and FRBCSs. This hybridization is named Evolutionary Fuzzy Systems (EFSs) [10]. The algorithm proposed in this paper is an EFS that automatically learns the Rule Base (RB), that is, the set of fuzzy rules composing the FRBCS. There are four main approaches to learning a RB in EFSs: Pittsburgh [11], Michigan [12], iterative rule learning [13] and genetic cooperative-competitive learning [14].
This paper focuses on the Michigan approach, where an individual represents a single fuzzy rule; thus, the RB is defined by all the individuals of the population in the evolutionary optimization process.

Another important component of a FRBCS is the Fuzzy Reasoning Method (FRM), which is responsible for classifying new examples based on the RB and the Data Base (DB), which specifies the definitions of the fuzzy sets for the variables. The classification of new instances made by the FRM relies on an aggregation function. Several FRBCSs use the Winning Rule (WR) as aggregation function, where the class assigned to a new instance is determined by the fired fuzzy rule with the maximum compatibility with that instance, i.e., the WR uses the maximum as aggregation function, which is an averaging operator. Thus, the information provided by the other fired fuzzy rules is ignored. To avoid this problem, other FRBCSs use the Additive Combination as aggregation function in the FRM, where the information of all the fired fuzzy rules for each class is taken into account in the aggregation step to determine the class of a new instance. This aggregation operator is non-averaging. In order to combine the characteristics of the two aggregation functions mentioned above and to improve the results of FRMs, Barrenechea et al. [15] introduced the usage of an averaging operator named the Choquet integral. Different generalizations of the Choquet integral are proposed in [16, 17]. The use and extension of Choquet integral-based operators to improve the performance of FRBCSs is a field of current interest for researchers and a future research direction for aggregation operators [18].

The main contribution of this paper is to use the generalization called Choquet-like Copula-based aggregation function (CC-integral) [16] in a Michigan-style fuzzy rule generation algorithm [19] to improve the classification rate of the learned fuzzy rules.

The remainder of this paper is outlined as follows: Section 2 presents the concept of FRBCSs and the details of each stage of a FRM. In Section 3, the proposed algorithm is detailed step by step. Then, Section 4 shows the experimental results used to evaluate the accuracy of the proposed algorithm. Finally, in Section 5, the conclusions are drawn and some future work is proposed.

2. Fuzzy Rule-Based Classification Systems

Any classification problem considers a set of examples $E = \{e_1, e_2, \ldots, e_p\}$ and a set of classes $Class = \{Class_1, Class_2, \ldots, Class_m\}$, and the objective is to assign a class $Class_j \in Class$ to each example $e_b \in E$. Each $e_b$ is defined by a set of features $e_b = (e_{b1}, e_{b2}, \ldots, e_{bn})$ and, in the fuzzy rules, each feature is described by a linguistic term $a$. In a FRBCS, the FRM uses a set of $L$ fuzzy rules in the RB and the fuzzy sets in the DB to assign a class to each example. Usually, the fuzzy rules follow the format:

$R_i$: IF $e_{b1}$ IS $a_{i1}$ AND $e_{b2}$ IS $a_{i2}$ AND ... AND $e_{bn}$ IS $a_{in}$ THEN Class = $Class_{j_i}$ WITH $RW_i$

The FRM follows four stages [20]:

1. Matching degree ($M$): For each fuzzy rule $i$ in the RB, the if-part is compared with the example to be classified, $e_b$, using a t-norm ($T$) as conjunction operator over all the membership degrees ($\mu$) obtained.

$$M_i(e_b) = T\bigl(\mu_{a_{i1}}(e_{b1}), \ldots, \mu_{a_{in}}(e_{bn})\bigr) \qquad (1)$$

2. Association degree ($A$): For each fuzzy rule $i$ in the RB, $M_i$ is weighted by its rule weight according to its class $Class_{j_i}$.

$$A_i^{Class_{j_i}}(e_b) = M_i(e_b) \cdot RW_i \qquad (2)$$
3. Example classification soundness degree for all classes ($S$): At this point, for each class $Class_j$, the positive information, $A_i^{Class_j}(e_b) > 0$, given by the fuzzy rules fired in the previous step is aggregated by an aggregation function $\mathbf{A}$.

$$S_{Class_j}(e_b) = \mathbf{A}\bigl(A_1^{Class_j}(e_b), \ldots, A_L^{Class_j}(e_b)\bigr) \qquad (3)$$

The key point in the FRM is how the information given by the fired fuzzy rules is aggregated. In the following, three different aggregation functions are presented:

a) Winning Rule (WR): For each class, it only considers the rule having the maximum compatibility with the example.

$$S_{Class_j}(e_b) = \max_{i=1,\ldots,L}\bigl\{A_i^{Class_j}(e_b)\bigr\} \qquad (4)$$

b) Additive Combination (AC): It aggregates all the fired rules, for each class $Class_j$, by using the normalized sum.

$$S_{Class_j}(e_b) = \frac{\sum_{i=1}^{L} A_i^{Class_j}(e_b)}{\max_{j=1,\ldots,m} \sum_{i=1}^{L} A_i^{Class_j}(e_b)} \qquad (5)$$

c) Choquet-like Copula-based aggregation function (CC-integral): It is an aggregation function supported by solid theory, proposed and detailed in [16].

$$S_{Class_j}(e_b) = \mathfrak{C}^{C}_{\mathfrak{m}_j}\bigl(A_1^{Class_j}(e_b), \ldots, A_L^{Class_j}(e_b)\bigr) \qquad (6)$$

$\mathfrak{C}^{C}_{\mathfrak{m}_j}$ is the CC-integral constructed from the copula $C: [0,1]^2 \to [0,1]$ and the fuzzy measure $\mathfrak{m}_j$:

$$\mathfrak{C}^{C}_{\mathfrak{m}_j}(\vec{x}) = \sum_{i=1}^{L} \Bigl( \min\bigl\{x_{(i)}, \mathfrak{m}_j(A_{(i)})\bigr\} - \min\bigl\{x_{(i-1)}, \mathfrak{m}_j(A_{(i)})\bigr\} \Bigr) \qquad (7)$$

$$\mathfrak{m}_j(X) = \left(\frac{|X|}{n}\right)^{q_j}, \quad \text{with } q_j > 0 \qquad (8)$$

where $\vec{x} = \bigl(A_1^{Class_j}(e_b), \ldots, A_L^{Class_j}(e_b)\bigr)$, $(\cdot)$ denotes an increasing permutation of the inputs with $x_{(0)} = 0$, $A_{(i)} = \{(i), \ldots, (L)\}$ and $X \subseteq N$. In the proposed algorithm, the CC-integral is constructed using the minimum as the copula ($C$) and the cardinality-based power measure as the fuzzy measure ($\mathfrak{m}_j$).

4. Classification: The final decision is made in this step. To do so, a function $F: [0,1]^m \to \{1, \ldots, m\}$ is applied over the example classification soundness degrees of all the classes:

$$F\bigl(S_{Class_1}(e_b), \ldots, S_{Class_m}(e_b)\bigr) = \arg\max_{j=1,\ldots,m} S_{Class_j}(e_b) \qquad (9)$$

An example of the behavior of the three aggregation functions mentioned above is presented in [16].

3. Proposed Algorithm

The modified algorithm proposed in this paper is based on the algorithm introduced in [19] (referred to in this paper as Michigan_EE). Basically, we propose an algorithm that modifies the calculation of the fitness of each fuzzy rule, bases the calculation of the classification rate of the RB on the CC-integral aggregation function (explained in the previous section) and adds the evolutionary optimization of the values of the exponents $q$. We call the proposed algorithm Michigan_EE_CC and it is detailed in Algorithm 1.
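Before presenting Algorithm 1, the following Python sketch makes the reasoning process of Section 2 concrete: it classifies an example with the FRM using the CC-integral of Eqs. (7)-(8), with the minimum as copula and the cardinality-based power measure. It is only an illustration, not the authors' implementation: the FuzzyRule structure, the triangular membership functions and the fuzzy_sets/q_exponents layouts are assumptions, and the n of Eq. (8) is taken as the number of aggregated association degrees, following [16].

from dataclasses import dataclass
from typing import List

@dataclass
class FuzzyRule:
    antecedent: List[int]   # index of the fuzzy set per attribute, 0 = "don't care"
    class_id: int           # consequent class
    weight: float           # rule weight RW_i

def triangular(x, a, b, c):
    """Triangular membership function with support [a, c] and core b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def matching_degree(rule, example, fuzzy_sets):
    """Eq. (1): minimum t-norm over the membership degrees of the if-part."""
    degree = 1.0
    for attr, fs_index in enumerate(rule.antecedent):
        if fs_index == 0:            # "don't care" attribute
            continue
        a, b, c = fuzzy_sets[attr][fs_index - 1]
        degree = min(degree, triangular(example[attr], a, b, c))
    return degree

def cc_integral(values, q):
    """Eqs. (7)-(8): CC-integral with the minimum as copula and the power
    measure m(X) = (|X|/n)^q, n being the number of aggregated values."""
    x = sorted(values)                    # increasing permutation, x_(1) <= ... <= x_(n)
    n = len(x)
    prev, total = 0.0, 0.0                # x_(0) = 0
    for k, xk in enumerate(x, start=1):
        m = ((n - k + 1) / n) ** q        # measure of A_(k) = {(k), ..., (n)}
        total += min(xk, m) - min(prev, m)
        prev = xk
    return total

def classify(example, rules, fuzzy_sets, q_exponents):
    """Stages 1-4 of the FRM with the CC-integral as aggregation function."""
    soundness = {}
    for j in {r.class_id for r in rules}:
        # Eq. (2): association degrees of the fired rules of class j
        assoc = [matching_degree(r, example, fuzzy_sets) * r.weight
                 for r in rules if r.class_id == j]
        assoc = [a for a in assoc if a > 0.0]   # positive information only
        soundness[j] = cc_integral(assoc, q_exponents[j])
    return max(soundness, key=soundness.get)    # Eq. (9)

Setting every exponent to 1.0 recovers the classical cardinality measure, which is the starting point of the proposed algorithm (line 4 of Algorithm 1).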
Algorithm 1: Proposed algorithm Michigan_EE_CC
Output: P_best and q_best
1: Create the DB
2: Generate N_rule fuzzy rules by the MPB to make an initial population P_0
3: Calculate the fitness of each rule R_i in P_0
4: Generate an encoded individual q^0 and set q_best = q^0
5: Calculate the Classification Rate (CR) of P_0 using q_best and set CR_best = CR
6: for t = 1 to TQ do
7:     Randomly generate q^t
8:     Calculate the CR of P_0 using q^t
9:     if (CR > CR_best) then
10:        CR_best = CR; q_best = q^t
11:    end if
12: end for
13: P_best = P_0
14: for iter = 0 to Iter do
15:    P_iter = P_best
16:    Generate N_replace/2 fuzzy rules by genetic operations on P_iter and N_replace/2 fuzzy rules by the MPB
17:    Replace the worst N_replace fuzzy rules in P_iter with the newly generated N_replace fuzzy rules to make a new population P_iter
18:    Calculate the fitness of each rule R_i in P_iter
19:    Calculate the CR of P_iter using q_best
20:    for t = 1 to TQ do
21:        Randomly generate q^t
22:        Calculate the CR of P_iter using q^t
23:        if (CR > CR_best) then
24:            CR_best = CR; q_best = q^t; P_best = P_iter
25:        end if
26:    end for
27: end for

In line 1, for each attribute, the minimum and maximum values are obtained. Afterwards, nFS triangular fuzzy sets are defined, uniformly distributed over the attribute domain, i.e., each fuzzy set has a support of the same width and together they cover the whole range between the minimum and maximum values.

In line 2, N_rule fuzzy rules are generated following the format presented in Section 2 and inserted into the population P_0. Each fuzzy rule is encoded as a chromosome with three parts: the first part represents the antecedent, where each gene stores the index of a fuzzy set (or linguistic term) for the corresponding attribute (the value zero represents a don't care condition, meaning that the respective attribute does not appear in the rule); the second part (a single gene) represents the class, or consequent, of the fuzzy rule; finally, the third part (a single gene) represents the rule weight. Figure 1 illustrates the representation used in this step.

Figure 1: Encoding a Fuzzy Rule

Each fuzzy rule in P_0 is generated by Multi-Pattern-Based rule generation (MPB), where, for a single rule generation, one base example and (H - 1) support examples of the same class as the base example are randomly selected from the training data. More details on the MPB can be found in [19].

One of the proposed modifications is performed in line 3. In the Michigan_EE algorithm, the fitness of each fuzzy rule is the number of training patterns correctly classified by that rule, which is more appropriate when the WR is used. In the proposed algorithm, which uses the CC-integral, the fitness of each rule R_i (with class C_{R_i}) is calculated as the difference between the average positive (> 0) association degree over the examples of the class C+ such that C_{R_i} = C+, and twice the average positive (> 0) association degree over the examples of the classes C− such that C_{R_i} ≠ C−. This difference follows the idea of obtaining rules that cover the maximum number of examples (completeness degree) with the minimum number of negative examples (consistency degree), as proposed in [21]. The following equation shows this difference:

$$fitness(R_i) = \frac{\sum_{b=1}^{p} A_i^{Class_{j_i}}(e_b)^{+}}{\bigl|A_i^{Class_{j_i}}(e_b)^{+}\bigr|} - 2 \times \frac{\sum_{b=1}^{p} A_i^{Class_{j_i}}(e_b)^{-}}{\bigl|A_i^{Class_{j_i}}(e_b)^{-}\bigr|} \qquad (10)$$

where the superscripts + and − denote the positive association degrees computed over the examples whose class coincides with and differs from the class of R_i, respectively, and |·| denotes the number of such degrees.
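A minimal sketch of how the fitness of Eq. (10) could be computed is shown below. It is our illustration (not the authors' code), reusing the hypothetical matching_degree helper and FuzzyRule structure from the previous sketch and assuming that the training examples come with their class labels.

def rule_fitness(rule, examples, labels, fuzzy_sets):
    """Eq. (10): average positive association degree on the rule's class
    minus twice the average positive association degree on the other classes."""
    pos, neg = [], []
    for example, label in zip(examples, labels):
        a = matching_degree(rule, example, fuzzy_sets) * rule.weight  # Eq. (2)
        if a > 0.0:
            (pos if label == rule.class_id else neg).append(a)
    pos_term = sum(pos) / len(pos) if pos else 0.0
    neg_term = sum(neg) / len(neg) if neg else 0.0
    return pos_term - 2.0 * neg_term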
In line 4, an encoded individual q^0 is generated, which contains the values of the exponents for each class used in the CC-integral aggregation function (see stage 3c in Section 2), and it is stored as q_best. The value 1.00 is assigned to each exponent, so that the classical cardinality measure is represented. Figure 2 illustrates the representation used in these steps.

Figure 2: Encoding the values of the exponents used in the CC-integral aggregation function

Another modification of the proposed algorithm is performed in line 5. In the Michigan_EE algorithm, the classification rate (CR) of all the fuzzy rules in the population (or RB) is calculated, for each training example, with a FRM based on the WR aggregation function. In the proposed algorithm, the CR of the RB is computed, for each training example, with a FRM based on the CC-integral aggregation function, using q_best. After that, the CR is stored as CR_best.

In lines 6-12, better values for each exponent are searched for, using a small value of TQ because the calculation of the CR is computationally expensive. In line 7, a new q^t is randomly generated. The values of the exponents q_j are generated in the range [0.01, 1.99]. However, according to [15], the suggested final values of the exponents lie in the range [0.01, 100]; therefore, the values used in the calculation of the CR are adapted as:

$$q_j = \begin{cases} q_j & \text{if } 0.00 < q_j \leq 1.00 \\ \dfrac{1}{2 - q_j} & \text{if } 1.00 < q_j < 2.00 \end{cases} \qquad (11)$$

In line 8, q^t is used in the calculation of the CR of P_0. After that, if this CR is better than CR_best, the new q^t is stored as q_best and the CR is stored as CR_best.

In lines 14-27, the Michigan approach is performed to evolutionarily learn the RB. In each iteration, the worst N_replace (= N_rule/2) fuzzy rules in the population P_iter (a copy of P_best, the population with the best CR) are replaced by fuzzy rules created genetically or by the MPB, in order to find a better population (i.e., a population with a better CR). In line 16, the first N_replace/2 fuzzy rules are generated using a parent selection operator, a crossover operator (with probability crossProb) and a mutation operator (with probability mutaProb, and a random replacement of each membership function with probability mutaProbMF), based on the population P_iter. For the remaining N_replace/2 fuzzy rules, the MPB is performed, where, for a single rule generation, the base example is randomly selected from the examples misclassified by P_iter. If there are no misclassified examples, the base examples are selected from the whole training data. In line 17, the N_replace worst fuzzy rules in P_iter are replaced by the newly generated N_replace fuzzy rules. Then, the fitness of all the fuzzy rules in P_iter is calculated (using the proposed fitness function) in line 18, and the CR of P_iter using q_best is calculated in line 19. Finally, better values of each exponent are searched for in lines 20-26 (similarly to lines 6-12), where a randomly generated q^t and P_iter are stored as q_best and P_best, respectively, if the CR computed with them is better than the previous one. The final outputs, P_best and q_best, are the best population and the best values of the exponents found during the evolutionary process.
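To illustrate the random exponent search of lines 6-12 and 20-26 together with the adaptation of Eq. (11), a short sketch is given below. It is again our illustration under assumed data structures: classify comes from the earlier FRM sketch, and classification_rate is a hypothetical helper that measures the fraction of correctly classified training examples.

import random

def adapt_exponent(qj):
    """Eq. (11): map a gene in (0, 2) to a final exponent in (0, 100]."""
    return qj if qj <= 1.0 else 1.0 / (2.0 - qj)

def classification_rate(population, examples, labels, fuzzy_sets, q_adapted):
    """CR: fraction of training examples correctly classified by the RB."""
    hits = sum(1 for e, y in zip(examples, labels)
               if classify(e, population, fuzzy_sets, q_adapted) == y)
    return hits / len(examples)

def search_exponents(population, examples, labels, fuzzy_sets, classes,
                     q_best, cr_best, TQ=5):
    """Random (1+1)-style search over the per-class exponents (lines 6-12 / 20-26)."""
    for _ in range(TQ):
        q_t = {j: random.uniform(0.01, 1.99) for j in classes}    # raw genes
        adapted = {j: adapt_exponent(v) for j, v in q_t.items()}  # Eq. (11)
        cr = classification_rate(population, examples, labels, fuzzy_sets, adapted)
        if cr > cr_best:
            cr_best, q_best = cr, q_t   # keep the raw genes, as in Algorithm 1
    return q_best, cr_best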
4. Experiments

In this section, we present a computational experiment aimed at assessing the performance of the proposed Michigan_EE_CC algorithm when it is applied to ten datasets with varied numbers of examples, attributes and classes. Table 1 shows the datasets used in this paper, which are available in the KEEL dataset repository [22].

Table 1: Datasets used in this study

Dataset        Examples   Attributes   Classes
appendicitis   106        7            2
bupa           345        6            2
glass          214        9            7
hayes-roth     160        4            3
heart          270        13           2
newthyroid     215        5            3
pima           768        8            2
segment        2310       19           7
tae            151        5            3
wine           178        13           3

The parameters and genetic operators of the proposed Michigan_EE_CC algorithm used in this paper are listed in Table 2.

Table 2: Parameters and genetic operators used in this study

Parameter                                                 Value
Population size (N_rule)                                  30
Number of replaced rules (N_replace)                      6
Number of fuzzy sets (nFS)                                5
MPB (H)                                                   2
Parent selection                                          Binary tournament selection
Crossover                                                 Uniform crossover
Crossover probability (crossProb)                         0.9
Mutation probability (mutaProb)                           1/n (n: number of attributes)
Membership function mutation probability (mutaProbFS)     0.1
TQ                                                        5
Iter                                                      100000
Number of runs                                            50

Table 3 shows the results achieved by the proposed Michigan_EE_CC algorithm in training and testing; each line reports the mean accuracy obtained over 50 runs (10-fold cross-validation × five times) and the standard deviation in brackets. In order to show the quality of the proposed Michigan_EE_CC algorithm, we compare it with the Michigan_EE algorithm. The parameters used in the Michigan_EE algorithm are the same as those of the Michigan_EE_CC algorithm, except for the TQ parameter, which is not used. The results obtained by the Michigan_EE algorithm are also shown in Table 3.

Table 3: Accuracy rate in training and testing for the proposed Michigan_EE_CC algorithm vs. the baseline Michigan_EE algorithm

                   Michigan_EE                          Michigan_EE_CC
Dataset            Training          Testing            Training          Testing
appendicitis       0.9287 (0.0109)   0.8024 (0.0473)    0.9382 (0.0104)   0.8209 (0.0801)
bupa               0.7307 (0.0217)   0.5943 (0.0631)    0.7264 (0.0119)   0.6425 (0.0478)
glass              0.7498 (0.0250)   0.6281 (0.0662)    0.7155 (0.0149)   0.6133 (0.0899)
hayes-roth         0.8163 (0.0151)   0.7063 (0.0557)    0.8832 (0.0062)   0.7825 (0.0812)
heart              0.8915 (0.0116)   0.7756 (0.0627)    0.8697 (0.0106)   0.7763 (0.0467)
newthyroid         0.9008 (0.0318)   0.8230 (0.0376)    0.9775 (0.0050)   0.9370 (0.0291)
pima               0.7515 (0.0075)   0.6941 (0.0333)    0.7795 (0.0033)   0.7363 (0.0276)
segment            0.8555 (0.0305)   0.8428 (0.0366)    0.8570 (0.0104)   0.8480 (0.0173)
tae                0.6248 (0.0295)   0.5114 (0.0802)    0.6574 (0.0123)   0.5161 (0.1004)
wine               0.9889 (0.0106)   0.8805 (0.0368)    0.9938 (0.0035)   0.8895 (0.0472)
AVG                0.8239            0.7259             0.8398            0.7562

Table 3 shows that the proposed Michigan_EE_CC algorithm obtains better results than the Michigan_EE algorithm on seven out of the ten datasets in training and on nine out of the ten datasets in testing. We also apply the Wilcoxon test [23] in order to perform a pair-wise comparison of the two algorithms on the test results. Table 4 shows that the null hypothesis of the Wilcoxon test is rejected (p-value ≤ α), so we may conclude that the proposed Michigan_EE_CC algorithm presents better results than the previous version.

Table 4: Wilcoxon's test (α = 0.05)

Comparison                        R+       R−       Hypothesis   p-value
Michigan_EE_CC vs. Michigan_EE    0.3187   0.0148   Rejected     0.025

The source code of the proposed Michigan_EE_CC algorithm and our implementation of the Michigan_EE algorithm are available on GitHub.
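As a pointer for reproducing the statistical comparison, a pair-wise Wilcoxon test can be run, for instance, with SciPy on the per-dataset mean test accuracies of Table 3. This sketch is ours and only assumes that those ten paired accuracies are used as samples, which may not match the authors' exact statistical setup.

from scipy.stats import wilcoxon

# Mean test accuracies from Table 3, one value per dataset, in the same order
michigan_ee    = [0.8024, 0.5943, 0.6281, 0.7063, 0.7756,
                  0.8230, 0.6941, 0.8428, 0.5114, 0.8805]
michigan_ee_cc = [0.8209, 0.6425, 0.6133, 0.7825, 0.7763,
                  0.9370, 0.7363, 0.8480, 0.5161, 0.8895]

# Paired Wilcoxon signed-rank test on the per-dataset differences
stat, p_value = wilcoxon(michigan_ee_cc, michigan_ee)
print(f"statistic = {stat:.4f}, p-value = {p_value:.4f}")
# The null hypothesis of equal performance is rejected when p_value <= 0.05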
5. Conclusions

In this paper, we proposed the Michigan_EE_CC algorithm, which is a modification of the Michigan-style fuzzy rule generation algorithm proposed in [19] that uses a Choquet-like Copula-based aggregation function. The Michigan_EE_CC algorithm was applied to ten standard classification datasets and compared to the original algorithm, named Michigan_EE, which uses the winning rule aggregation function in the fuzzy reasoning method. The experimental results showed that the Michigan_EE_CC algorithm is able to increase the accuracy over both the training and the testing datasets.

We foresee different avenues for future work, including: 1) using other generalizations of the Choquet integral and 2) evaluating the performance on more challenging datasets, i.e., imbalanced and high-dimensional datasets.

Acknowledgments

This work was supported by the Universidad Nacional de San Agustin de Arequipa under Project IBAIB-06-2019-UNSA, in part by the Spanish Ministry of Economy and Competitiveness through the Spanish National Research Agency (project PID2019-108392GB-I00 / AEI / 10.13039/501100011033) and by the Public University of Navarre under project PJUPNA1926.

References

[1] S. D. Bharathi, S. Sudha, A survey on gene selection for microarray cancer classification based on soft computing techniques, in: 2018 International Conference on Inventive Research in Computing Applications (ICIRCA), 2018, pp. 304–309.
[2] N. Arunachalam, S. J. Sneka, G. MadhuMathi, A survey on text classification techniques for sentiment polarity detection, in: 2017 Innovations in Power and Advanced Computing Technologies (i-PACT), 2017, pp. 1–5.
[3] H. P. Ünal, G. Gökmen, M. Yumurtacı, Emotion classification with DEAP dataset: a survey, in: Innovations in Intelligent Systems and Applications Conference, 2020, pp. 1–6.
[4] S. Pathak, I. Mishra, A. Swetapadma, An assessment of decision tree based classification and regression algorithms, in: 2018 3rd International Conference on Inventive Computation Technologies (ICICT), 2018, pp. 92–95.
[5] G. Algan, I. Ulusoy, Image classification with deep learning in the presence of noisy labels: A survey, Knowledge-Based Systems 215 (2021) 106771.
[6] S. Dong, P. Wang, K. Abbas, A survey on deep learning and its applications, Computer Science Review 40 (2021) 100379.
[7] M. Elkano, M. Galar, J. Sanz, H. Bustince, CHI-BD: A fuzzy rule-based classification system for big data classification problems, Fuzzy Sets and Systems 348 (2018) 75–101.
[8] H. Ishibuchi, T. Nakashima, M. Nii, Classification and Modeling with Linguistic Information Granules: Advanced Approaches to Linguistic Data Mining (Advanced Information Processing), Springer-Verlag, Berlin, Heidelberg, 2004.
[9] M. Gacto, R. Alcalá, F. Herrera, Interpretability of linguistic fuzzy rule-based systems: An overview of interpretability measures, Information Sciences 181 (2011) 4340–4360. Special issue on interpretable fuzzy systems.
[10] A. Fernández, V. López, M. J. del Jesus, F. Herrera, Revisiting evolutionary fuzzy systems: Taxonomy, applications, new trends and challenges, Knowledge-Based Systems 80 (2015) 109–121.
[11] C. H. Tan, K. S. Yap, S. Y. Wong, M. T. Au, C. T. Yaw, H. J. Yap, Genetic rules induction fuzzy inference system for classification and regression application in energy industry, International Journal of Engineering and Advanced Technology (IJEAT) 9 (2019) 4154–4160.
[12] A. Orriols-Puig, J. Casillas, E. Bernado-Mansilla, Fuzzy-UCS: A Michigan-style learning fuzzy-classifier system for supervised learning, IEEE Transactions on Evolutionary Computation 13 (2009) 260–283.
[13] E. H. Cárdenas, H. A. Camargo, Y. J. Túpac, Imbalanced datasets in the generation of fuzzy classification systems - an investigation using a multiobjective evolutionary algorithm based on decomposition, in: IEEE International Conference on Fuzzy Systems, 2016, pp. 1445–1452.
[14] F. J. Berlanga, M. J. del Jesus, F. Herrera, A novel genetic cooperative-competitive fuzzy rule based learning method using genetic programming for high dimensional problems, in: 2008 3rd International Workshop on Genetic and Evolving Systems, 2008, pp. 101–106.
[15] E. Barrenechea, H. Bustince, J. Fernandez, D. Paternain, J. A. Sanz, Using the Choquet integral in the fuzzy reasoning method of fuzzy rule-based classification systems, Axioms 2 (2013) 208–223.
[16] G. Lucca, J. A. Sanz, G. P. Dimuro, B. R. C. Bedregal, M. J. Asiain, M. Elkano, H. Bustince, CC-integrals: Choquet-like copula-based aggregation functions and its application in fuzzy rule-based classification systems, Knowledge-Based Systems 119 (2017) 32–43.
[17] G. P. Dimuro, G. Lucca, B. R. C. Bedregal, R. Mesiar, J. A. Sanz, C. Lin, H. Bustince, Generalized CF1F2-integrals: From Choquet-like aggregation to ordered directionally monotone functions, Fuzzy Sets and Systems 378 (2020) 44–67.
[18] L. Sun, H. Dong, A. X. Liu, Aggregation functions considering criteria interrelationships in fuzzy multi-criteria decision making: State-of-the-art, IEEE Access 6 (2018) 68104–68136.
[19] Y. Nojima, S. Takemura, K. Watanabe, H. Ishibuchi, Michigan-style fuzzy GBML with (1+1)-ES generation update and multi-pattern rule generation, in: Joint 17th World Congress of International Fuzzy Systems Association and 9th International Conference on Soft Computing and Intelligent Systems, IEEE, 2017, pp. 1–6.
[20] O. Cordón, M. J. del Jesus, F. Herrera, A proposal on reasoning methods in fuzzy rule-based classification systems, International Journal of Approximate Reasoning 20 (1999) 21–45.
[21] A. Gonzalez, R. Perez, SLAVE: a genetic learning system based on an iterative approach, IEEE Transactions on Fuzzy Systems 7 (1999) 176–191.
[22] J. Alcalá-Fdez, A. Fernández, J. Luengo, J. Derrac, S. García, F. Herrera, KEEL data-mining software tool: Data set repository, integration of algorithms and experimental analysis framework, Journal of Multiple-Valued Logic and Soft Computing 17 (2011) 255–287.
[23] F. Wilcoxon, Individual comparisons by ranking methods, Biometrics Bulletin 1 (1945) 80–83.