<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Fine tuning run parameter values in rule-based machine learning</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sotiris Moschoyiannis</string-name>
          <email>s.moschoyiannis@surrey.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vasily Shcherbinin</string-name>
          <email>v.shcherbinin@surrey.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, University of Surrey</institution>,
          <addr-line>Guildford, Surrey, GU2 7XH</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>Rule-based machine learning focuses on learning or evolving a set of rules that represents the knowledge captured by the system. Due to its inherent complexity, a certain amount of fine tuning is required before it can be applied to a particular problem. However, there is limited information available to researchers when it comes to setting the corresponding run parameter values. In this paper, we investigate the run parameters of Learning Classifier Systems (LCSs) as applied to single-step problems. In particular, we study two LCS variants, XCS for reinforcement learning and UCS for supervised learning, and examine the effect that different parameter values have on enhancing the model prediction, increasing accuracy and reducing the resulting rule set size.</p>
      </abstract>
      <kwd-group>
        <kwd>condition-action rules</kwd>
        <kwd>learning classifier systems</kwd>
        <kwd>reinforcement learning</kwd>
        <kwd>supervised learning</kwd>
        <kwd>fine tuning LCS</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        The key feature behind rule-based machine learning (RBML) is that it evolves
around rules and therefore the outcome is interpretable by both computers
and humans. RBML approaches include learning classifier systems (LCSs),
association rule learning and artificial immune systems. The focus here is on LCSs,
which have been used in a variety of settings, from games [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] to industrial control
[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] to policy [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] to transport [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] to controlling Boolean networks [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>However, rule-based approaches involve a number of parameters and the
quality of the outcome largely relies on assigning them appropriate values. There is
a distinct lack of guidance on how the run parameters should be set in order
to build a good LCS, for both reinforcement learning and supervised learning
problems. Here, the investigation is aimed at single-step problems; that is,
learning problems where the reward from effecting the action on the environment is
known immediately rather than after several steps, as in multi-step problems.</p>
      <p>In this paper we study the interplay of the LCS with the data being learned
in benchmark (single-step) problems. In particular, we focus the investigation
on XCS, the most popular LCS variant for reinforcement learning, and UCS,
the classic variant for supervised learning problems. UCS was specifically built
for knowledge extraction (learning patterns in complex datasets) and provides a
contrast to the reinforcement learning XCS, which allows us to see the effects of
various parameter values from a different LCS angle.</p>
      <p>
        We focus on the run parameters suggested in seminal work in the field of LCS,
such as [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ],[
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] and use the values therein as a starting point. Neither work gives
information on the source of the values. We aim to fill this gap in this paper by
providing explicit guidelines on how to set the values of these key parameters of
LCS for both reinforcement and supervised learning. We test both algorithms
against balanced and unbalanced, generalisable problems.
      </p>
      <p>The remainder of the paper is structured as follows. Section 2 provides a
brief overview of LCS, XCS and UCS. Section 3 introduces the two benchmark
problems used in the study while Section 4 outlines the methodology and
experiments we carried out. In Section 5 we discuss the results for both learning
algorithms and provide explicit guidelines for setting their run parameters. We
conclude and consider future work in Section 6.
</p>
      <p>Rule-based Machine Learning: an overview of LCSs</p>
      <p>Learning classifier systems (LCSs) evolve around rules - matching, predicting,
updating, discovering. A rule follows an IF : THEN expression and comprises
two parts: the condition (or state) and the action (or class or phenotype).</p>
      <p>A rule is associated with a number of parameters: (i) the reward prediction
p estimates the reward if this rule's action is eventually effected on the
environment, (ii) the prediction error ε, and (iii) a fitness F which reflects the quality of
the rule. These parameters are updated in every iteration of the learning cycle
(cf Sections 2.1, 2.2). A classifier is a rule with associated parameters.</p>
      <p>The condition of a rule is expressed in a ternary alphabet; an attribute can
have a value of 0 or 1, but also a # which stands for the 'don't care' value, i.e.,
it can match both 0 and 1. This is what allows rules to be generalised and made
applicable to more than one instance - this in turn allows useful data
relationships to be captured without overfitting the rules to the training data. For example,
the rules 110:1 and 111:1 can be generalised to 11#:1 (e.g., cloudy and windy points
to taking an umbrella irrespective of evening or morning). The action part of a
rule represents the predicted or recommended action.</p>
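To make the ternary representation concrete, the matching and generalisation just described can be sketched as follows; the helper names are ours (hypothetical), not those of any particular LCS codebase:

```python
# Sketch of ternary rule matching: '#' ("don't care") matches either bit value.

def matches(condition: str, state: str) -> bool:
    """True if every specified attribute of the condition equals the input bit."""
    return all(c == '#' or c == s for c, s in zip(condition, state))

def generalise(cond_a: str, cond_b: str) -> str:
    """Merge two conditions by replacing differing attributes with '#'."""
    return ''.join(a if a == b else '#' for a, b in zip(cond_a, cond_b))

# 110:1 and 111:1 generalise to 11#:1, which matches both inputs.
rule = generalise('110', '111')
assert rule == '11#'
assert matches(rule, '110') and matches(rule, '111')
assert not matches(rule, '100')
```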
      <p>The LCS functional cycle, shown in Figure 1, comprises three components:
the environment, the learning machine and, optionally, rule compression
(post-processing). The environment provides as input a series of sensory situations -
typically abstracted to a dataset containing so-called training instances.</p>
      <p>The first step of the learning cycle is to receive input from the environment
(1). The next step is to identify and choose all rules in the rule population [P]
whose condition part matches that of the training instance (environment input)
(2). Matching compares the current input to all rules in the population [P]. If
all specified attributes match, the rule is moved to the match set [M] (3).</p>
      <p>The next step produces the action set [A] (4) which comprises all actions
associated with rules whose condition matched the input. This set forms the
input used in the GA and also specifies which rules are updated after each
iteration. We will have more to say on this in Sections 2.1 and 2.2.</p>
      <p>In addition to the parameters p, ε, F discussed earlier, rules in [P] are
associated with further parameters: age - the number of iterations that occurred after
the rule was created; exp - the number of times the rule belonged to the action set
[A]; actionSetSize - the size of the action set to which the rule belonged; n -
the number of copies of a rule appearing in the population, called numerosity.</p>
      <p>If the action set is empty, or the number of actions in it does not exceed the
pre-set maximum number of actions, mna, the Covering mechanism is applied
(5) - this allows new rules to be introduced to the population [P] (rule discovery).
During the process, some of the attributes in the rule's condition and action
are generated using values taken from the current environment input; the rest
are set to #, with probability P#. Since LCS rule populations start off empty,
covering is also used for population initialisation. If there is at least one action
in the action set, the next step is to update the rule parameters for the classifiers
that have made it into the action set [A] (6) (cf Sections 2.1 and 2.2).</p>
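The covering step can be sketched as below, assuming a binary-string input; the function name and the illustrative P# value are ours, not taken from any specific implementation:

```python
# Sketch of the covering mechanism (step 5): build a new rule from the current
# input, turning each attribute into '#' with probability P#.
import random

def cover(state: str, actions: list, p_hash: float = 0.33) -> tuple:
    """Return a (condition, action) rule guaranteed to match `state`."""
    condition = ''.join('#' if random.random() < p_hash else bit
                        for bit in state)
    return condition, random.choice(actions)

cond, act = cover('101101', actions=[0, 1])
# Every specified (non-'#') attribute is copied from the input,
# so the new rule necessarily matches the covering input.
assert all(c == '#' or c == s for c, s in zip(cond, '101101'))
```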
      <p>Additionally, a mechanism called Subsumption is often applied (7), in which
pairs of rules are examined and whenever a rule (the subsumer) matches all attributes
of the other rule, is more general and just as accurate, it absorbs this other, more
specific, rule. An error threshold ε₀ determines the maximum error a rule may
have and still be able to subsume another.</p>
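The subsumption criteria just described can be sketched as follows; the rule representation and the ε₀ value are illustrative assumptions:

```python
# Sketch of the subsumption check (step 7): a rule subsumes another when it
# advocates the same action, is at least as general, and its error is below
# the threshold epsilon_0. Rules are (condition, action, error) triples here.

def is_more_general(subsumer: str, other: str) -> bool:
    """The subsumer may only specify attributes the other rule also specifies."""
    return all(a == '#' or a == b for a, b in zip(subsumer, other))

def can_subsume(subsumer, other, epsilon_0=0.01):
    cond_a, act_a, err_a = subsumer
    cond_b, act_b, _ = other
    return act_a == act_b and err_a < epsilon_0 and is_more_general(cond_a, cond_b)

assert can_subsume(('11#', 1, 0.001), ('110', 1, 0.02))
assert not can_subsume(('110', 1, 0.001), ('11#', 1, 0.02))  # less general
```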
      <p>Next, a genetic algorithm (GA) is applied (8). It operates on the action set
[A], with a frequency determined by θ_GA, to produce new rules by crossover and
mutation. When activated, the GA runs just once. Parent rules are selected
to produce two child rules. With probability χ a two-point crossover operates
on these offspring rules. Each attribute is then mutated with a probability μ.
Several selection mechanisms exist - recent literature suggests a preference towards
tournament selection, where a portion of classifiers is randomly selected from [A]
and the classifier with the highest fitness is chosen to be the parent.</p>
      <p>The last step involves a deletion mechanism which is applied to cap the
population at N rules (9). Whenever the population size, which is the sum of
all rule numerosities, exceeds N, rules are deleted from the population until
the size limit is met. Classifier experience exp must exceed a threshold θ_del for
fitness to impact deletion likelihood. The population cap, combined with the
selective deletion of rules, results in an overall drive to improve the quality of
the population.</p>
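The deletion step can be sketched as a roulette wheel over deletion "votes"; the vote formula below follows the commonly used XCS scheme (experienced, below-average-fitness rules attract larger votes), with assumed θ_del and δ values that may differ from a given implementation:

```python
# Sketch of the deletion mechanism (step 9): while the population size (the sum
# of numerosities) exceeds N, remove single rule copies by weighted roulette.
import random

def deletion_vote(cl, avg_fitness, theta_del=20, delta=0.1):
    """Vote is proportional to action-set-size estimate times numerosity,
    boosted for experienced rules with well-below-average per-copy fitness."""
    vote = cl['actionSetSize'] * cl['n']
    if cl['exp'] > theta_del and cl['F'] / cl['n'] < delta * avg_fitness:
        vote *= avg_fitness / (cl['F'] / cl['n'])
    return vote

def delete_from(population, N):
    """Delete single rule copies until the summed numerosity is at most N."""
    while sum(cl['n'] for cl in population) > N:
        total_n = sum(cl['n'] for cl in population)
        avg_F = sum(cl['F'] for cl in population) / total_n
        votes = [deletion_vote(cl, avg_F) for cl in population]
        chosen = random.choices(population, weights=votes)[0]
        chosen['n'] -= 1                 # remove one copy first
        if chosen['n'] == 0:
            population.remove(chosen)
    return population
```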
      <p>Optionally, a post-processing step is applied to the rule population after
training. This is known as rule compaction (see Fig. 1) and it aims to remove
rules that are inexperienced, inaccurate, or have a low fitness but whose error
does not exceed the specified limit ε₀.</p>
      <p>
        eXtended Classifier System (XCS)
LCS can function by receiving only the utility of the action effected on the
environment (reinforcement learning). The eXtended Classifier System (XCS)
is an accuracy-based LCS developed originally by [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] to target reinforcement
learning problems. The key differentiating characteristics of XCS follow.
      </p>
      <p>Firstly, once the match set [M] has been created, a prediction array is
created, which supplies the fitness-weighted predicted rewards for each of the actions
within [M]. This is used in action selection - in explore mode, with probability
p_explr, the system randomly selects from the available actions in [M]; otherwise,
in exploit mode it chooses the action with the highest predicted reward p as
indicated by the array. Certain rules within the match set (possibly more than one)
contain the action that was selected. These rules form the action set [A] (Fig 1).</p>
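The prediction array and the explore/exploit choice can be sketched as below, assuming dictionary-based classifiers (assumed field names 'action', 'p', 'F'):

```python
# Sketch of the XCS prediction array: fitness-weighted average of predicted
# rewards per action in [M], followed by explore/exploit action selection.
import random

def prediction_array(match_set):
    """match_set: list of dicts with 'action', 'p' (prediction), 'F' (fitness)."""
    sums, weights = {}, {}
    for cl in match_set:
        a = cl['action']
        sums[a] = sums.get(a, 0.0) + cl['p'] * cl['F']
        weights[a] = weights.get(a, 0.0) + cl['F']
    return {a: sums[a] / weights[a] for a in sums}

def select_action(pa, p_explr=0.5):
    if random.random() < p_explr:        # explore: random action
        return random.choice(list(pa))
    return max(pa, key=pa.get)           # exploit: highest predicted reward

pa = prediction_array([
    {'action': 0, 'p': 1000.0, 'F': 0.9},
    {'action': 0, 'p': 0.0,    'F': 0.1},
    {'action': 1, 'p': 200.0,  'F': 0.5},
])
assert select_action(pa, p_explr=0.0) == 0   # exploit picks the higher p
```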
      <p>Secondly, there is a feedback mechanism which detects the effect of the
selected action when executed on the environment and grants a particular reward
or payoff. On a single-step problem, such as those addressed in this paper, this
reward is used directly in updating the rule parameters.</p>
      <p>
        The updating can only happen in [A]. Parameter updates follow the
time-weighted recency average (TWRA) principle [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], which can be described as:
AverageNew = AverageCurrent + β (ValueCurrent − AverageCurrent)   (1)
This is directly inspired by the Widrow-Hoff update, which is commonly used in
artificial neural networks (ANNs) for updating the weights of the inputs:
ω_{i+1} = ω_i + β (u_i − ω_i)   (2)
When the value u_i is received, the equation above is used to record the value of
the variable ω at each iteration i. The value of ω_{i+1} is thus a combination of
the old value of ω and the newly received value u_i. The split balance between
using old and new values is controlled by β, which is called the learning rate
parameter. If β is 0, then no learning takes place; if β = 1, then only the new
value is stored and there is no recollection of past values. The optimal value
for β therefore lies in the range [0, 1]. The reward prediction p is updated by:
p ← p + β (r − p)   (3)
where r is the reward received from the environment. The error ε between the
reward received from the environment and the predicted reward is updated by:
ε ← ε + (|r − p| − ε) / exp,  if exp &lt; 1/β;   ε ← ε + β (|r − p| − ε),  otherwise   (4)
following the same concept of split balance described earlier for p, but this time
applied to ε (cf Fig. 6). Fitness of a rule is based on accuracy [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], given by:
κ = 1,  if ε &lt; ε₀;   κ = α (ε / ε₀)^(−ν),  otherwise   (5)
where ε₀ is a predefined error threshold, as discussed before. If the error
associated with this rule (the classifier's error) is greater than the error threshold ε₀,
its accuracy decreases, based on α and ν. When α is set to a low value
(approaching 0), there is an acute decrease in the accuracy value; when α is set
to a higher value, ν drives the rate of change in accuracy. If ν is set to a high
value, the accuracy will decrease fast; if ν is set to a low value, accuracy will
drop slowly. A form of fitness-sharing is applied in XCS, given by:
κ' = κ / Σ_{cl∈[A]} κ_cl   (6)
where cl denotes a classifier in the action set [A], κ_cl the accuracy associated
with cl, and κ the accuracy of the present classifier. The fitness of the
classifier (cf Fig. 6) can thus be calculated and updated using:
F ← F + β (κ' − F)   (7)
      </p>
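The XCS update equations can be collected into a single routine; the sketch below assumes dictionary-based classifiers and illustrative default parameter values, and uses one of several reasonable update orders (the order is implementation-dependent, cf. Fig. 6):

```python
# Sketch of the XCS parameter update for a single-step reward r: reward
# prediction and error via TWRA/MAM, accuracy from the error threshold,
# then fitness sharing across the action set.

def update_action_set(action_set, r, beta=0.2, alpha=0.1, nu=5, epsilon_0=0.01):
    for cl in action_set:
        cl['exp'] += 1
        if cl['exp'] < 1 / beta:          # young rule: plain running average
            cl['p'] += (r - cl['p']) / cl['exp']
            cl['eps'] += (abs(r - cl['p']) - cl['eps']) / cl['exp']
        else:                             # experienced: recency-weighted update
            cl['p'] += beta * (r - cl['p'])
            cl['eps'] += beta * (abs(r - cl['p']) - cl['eps'])
        # accuracy: 1 below the error threshold, power-law decay above it
        cl['kappa'] = (1.0 if cl['eps'] < epsilon_0
                       else alpha * (cl['eps'] / epsilon_0) ** (-nu))
    total = sum(cl['kappa'] for cl in action_set)
    for cl in action_set:
        rel = cl['kappa'] / total         # fitness sharing
        cl['F'] += beta * (rel - cl['F'])

cl = {'exp': 0, 'p': 0.0, 'eps': 0.0, 'F': 0.0, 'kappa': 0.0}
update_action_set([cl], r=1000.0)
assert cl['p'] == 1000.0                  # first update stores the full reward
```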
      <p>
        sUpervised Classifier System (UCS)
The sUpervised Classifier System (UCS) is an accuracy-based LCS developed by
[
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] specifically for supervised learning problems. In supervised learning, the
correct action is available from the environment. This allows UCS to focus on data
mining and knowledge discovery in database-type problems, e.g., the performance
analysis of CBA in [10]. UCS and XCS share key principles, such as a niche GA
and defining fitness based on accuracy, but also have key differences.
      </p>
      <p>Firstly, since every environment input comes with the correct action in
supervised scenarios, the match set [M] is split into two sets - a correct set [C] and an
incorrect set [I] (recall step (4) in Fig. 1) - depending on whether the population
rule's action also matches the current input's action. If the current action is not in
[M], covering is triggered to create a rule that matches the current action. The
ability to cover without guessing a suitable action is also beneficial in UCS.</p>
      <p>
        Secondly, the fact that the correct action is known simplifies the reward function.
The measure of accuracy is simplified, as the performance of each rule is
explicitly available. Each classifier in UCS has an additional parameter, correctTrack,
which is incremented every time a classifier is part of the correct set [C]. The
classifier accuracy replaces the reward prediction p in XCS and is calculated as:
accuracy = correctTrack / exp
Fitness in UCS is simpler than in XCS [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] and is given by: fitness = accuracy^ν,
where ν determines the pressure on the rules to be accurate (recall Sec. 2.1).
      </p>
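The UCS accuracy and fitness computation above amounts to a one-line formula; a minimal sketch with assumed parameter names:

```python
# Sketch of UCS fitness: accuracy is the fraction of trials in which the rule
# was in the correct set [C], raised to the fitness-pressure exponent nu.

def ucs_fitness(correct_track: int, exp: int, nu: int = 10) -> float:
    accuracy = correct_track / exp
    return accuracy ** nu

# A rule correct in 90 of 100 trials, with nu = 10:
f = ucs_fitness(90, 100)
assert abs(f - 0.9 ** 10) < 1e-12
```

Raising accuracy to ν strongly separates near-perfect rules from merely good ones, which is the "pressure to be accurate" described above.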
    </sec>
    <sec id="sec-2">
      <title>Benchmark problems</title>
      <p>In this section we describe the benchmark problems used in this research.</p>
      <sec id="sec-2-1">
        <title>Multiplexer</title>
        <p>The Boolean n-bit Multiplexer defines a set of learning problems that
conceptually describe the behaviour of an electronic multiplexer (MUX), a device that
can combine multiple analog or digital input signals and forward them into a
single output signal [11]. The 6-bit multiplexer is shown in Figure 2.</p>
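The n-bit multiplexer output can be computed as follows; the first k bits are the address selecting one of the 2^k data bits (n = k + 2^k, e.g. k = 2 for the 6-bit and k = 3 for the 11-bit problem):

```python
# Sketch of the Boolean n-bit multiplexer benchmark's labelling function.

def multiplexer(bits: str) -> int:
    k = 0
    while k + 2 ** k < len(bits):    # recover the address length k from n
        k += 1
    address = int(bits[:k], 2)       # address bits select one data bit
    return int(bits[k + address])

assert multiplexer('100010') == 1    # 6-bit: address '10' = 2 -> data bit '1'
assert multiplexer('000110') == 0    # address '00' = 0 -> first data bit '0'
```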
        <p>
          Multiplexer problems exhibit epistasis (the class is determined via interaction
between multiple features, rather than a single one) and also heterogeneous
behaviour (for different instances, the class value is determined by a different
subset of features) [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. This elevates the complexity of the problem due to the
non-linear association between the condition and the corresponding action.
        </p>
        <p>In addition, multiplexer problems contain knowledge-based patterns. Figure
3 shows that the entire domain can be separated into 8 maximally general rules -
theoretically this is the minimum (optimal) number of rules that the LCS must
generate to completely cover the problem. Figure 4 shows the class symmetry,
which is characteristic of a balanced problem. The multiplexer benchmark
problem was chosen for use in this project as an example of a balanced, generalisable
problem. Specifically, the 11-bit Multiplexer was used.</p>
      </sec>
      <sec id="sec-2-2">
        <title>Position</title>
        <p>The position problem benchmark is an example of an unbalanced multi-class
problem where, given a binary string of length l as input, the output is
determined by the value of the index position of the left-most one-valued bit. Although
it is a generalisable problem, the position benchmark differs from other
benchmarks as it is unbalanced - this is clearly seen in Figure 5, where there are
many more instances of class 0 than of class 2, for example. The 9-bit Position
benchmark was used in this paper.</p>
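The position benchmark's labelling function can be sketched as below; note that the class assigned to the all-zeros input is our assumption, as the convention for that input is not stated above:

```python
# Sketch of the position benchmark: the class is the index of the left-most
# one-valued bit. Class 0 covers half of all inputs, class 1 a quarter, and so
# on, which produces the class imbalance described in the text.

def position(bits: str) -> int:
    for i, b in enumerate(bits):
        if b == '1':
            return i
    return len(bits)                 # all-zeros input: assumed extra class

assert position('001010100') == 2    # 9-bit: left-most '1' at index 2
assert position('100000000') == 0
```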
        <p>
          Experiments: methodology and implementation
We started with the "recommended" parameter values suggested by [
          <xref ref-type="bibr" rid="ref6 ref7">7,6</xref>
          ] and
then altered every parameter consecutively, changing its value from the smallest to
the highest recommended value whilst keeping the other parameters constant. For
every experiment, the resulting model cross-validation accuracy, number of rules
in [P] and run time were recorded. Each experiment was repeated 10 times to
minimise the LCS stochastic effect, and the averaged values were recorded.
        </p>
        <p>k-Fold Cross-Validation was used to evaluate the XCS and UCS models using
a set of training data. The most common values for the number of folds k are 5
and 10 [12], as they have been shown "empirically to yield test error rate
estimates that suffer neither from excessively high bias nor from very high variance"
[13]. In our case, k = 5 was chosen, as it allowed us to generate sufficiently large
training and testing data sets, representative of the data being learned.</p>
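The evaluation procedure can be sketched as below; `train` and `evaluate` are illustrative stand-ins for the LCS training and scoring calls, not functions from the codebase used here:

```python
# Sketch of k-fold cross-validation (k = 5): split the data into k folds,
# train on k-1 folds, test on the held-out fold, and average the accuracies.

def k_fold_indices(n: int, k: int = 5):
    """Yield (train_idx, test_idx) pairs over n instances."""
    fold = n // k
    for i in range(k):
        test = list(range(i * fold, (i + 1) * fold if i < k - 1 else n))
        test_set = set(test)
        train = [j for j in range(n) if j not in test_set]
        yield train, test

def cross_validate(data, train, evaluate, k=5):
    scores = []
    for tr, te in k_fold_indices(len(data), k):
        model = train([data[i] for i in tr])
        scores.append(evaluate(model, [data[i] for i in te]))
    return sum(scores) / k

folds = list(k_fold_indices(10, 5))
assert len(folds) == 5 and all(len(te) == 2 for _, te in folds)
```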
        <p>
          The XCS algorithm implementation was created following closely [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ], which,
since its publication, has become the standard to follow when implementing
XCS. Minor modifications in our implementation are:
1. The order of updating rule parameters was changed for convenience - see Fig. 6.
2. The calculation of accuracy is done via a power law function, as proposed in [14],
rather than via the exponential function used in the original.
3. Covering occurs based on classifiers in [M], rather than on the mean prediction
of the population suggested in [15], to increase program execution speed.
4. The genetic algorithm is applied to the action set, as described in [16].
(The codebase used for our experiments is available on Gitlab by pointing your
browser to: https://gitlab.eps.surrey.ac.uk/vs00162/lcs-suite.)
For UCS, the implementation follows [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ].
The run parameters investigated in this work can be seen in Figure 7.
        </p>
        <p>Results - Guidelines for run parameter values</p>
      </sec>
      <sec id="sec-2-3">
        <title>Parameter I - number of iterations.</title>
        <p>The iteration parameter (I) specifies the maximum number of learning iterations
before the evaluation of the algorithm commences. During a single iteration, an
input is received from the environment and the LCS performs a learning cycle
over it. The learning continues until either a specific number of iterations is
reached or the training accuracy reaches a certain threshold. Here, we opted for
the former as this allows for a fairer test and comparison.</p>
        <p>Results shown in Figures 8 and 9 demonstrate the relationship between the
number of iterations and accuracy, rule set size, and time for both benchmark
problems, for both XCS (reinforcement learning) and UCS (supervised learning). It
can be seen that the number of iterations and the time it takes to complete the
learning cycle are directly proportional. This is true for both benchmark
problems and is not affected by the algorithm type. An increase in the number of
iterations causes a decrease in the number of rules generated, indicating that the
population is becoming more optimised and generalised. It also has a positive
effect on the accuracy of the algorithm.</p>
        <p>
          Guideline. For problems where the input has a length of 20 attributes or less,
acceptable results may be achieved with [10,000, 50,000] iterations. For larger
and more complex problems, a larger number of iterations should be considered.
        </p>
      </sec>
      <sec id="sec-2-4">
        <title>Parameter N - Rule population size.</title>
        <p>The maximum population size parameter (N) specifies the allowed maximum
number of rules in the population. Once N is exceeded, the deletion mechanism
is activated. It is usually one of the first parameters to be tweaked in an LCS
and has an important influence on the result obtained by the system.</p>
        <p>Results shown in Figures 10 and 11 demonstrate the relationship between the
population size N and accuracy, rule set size, and time for both benchmark problems.</p>
        <p>It can be seen that an increase in the population size has yielded a dramatic
increase in accuracy, following a logarithmic pattern. In the position problem it
is almost constant (see Fig. 11), while a sharp rise is observed when N is smaller
than 500. The results allow us to draw several conclusions:
(1) Parameter N plays a vital role in a learning classifier system and it is crucial
to get the value correct for the system to yield adequate results.
(2) If N is set too low, the LCS will not be able to maintain good classifiers and
generate a good model, resulting in low accuracy of prediction on new instances.
(3) If N is set too high, unnecessary computational resources will be consumed
without any benefit. This is not only inefficient but can also lead to a longer time
for the algorithm to converge to an optimal solution.</p>
        <p>The challenge therefore is to find a value of N that is as low as possible,
but at the same time provides adequate accuracy.</p>
        <p>Guideline. N in the range of [1000, 4000] will yield adequate accuracy for
most classification tasks, with smaller problems requiring the lower-end values
and larger and more complex problems the higher-end. If rule compression is
applied, the rule population size will need to be set on the generous side of the
proposed range.</p>
      </sec>
      <sec id="sec-2-5">
        <title>Parameter P # - "don't care" probability.</title>
        <p>
          The parameter P# specifies the probability of inserting a "don't care" symbol,
denoted by #, into a rule that has been generated during the covering
stage. Results shown in Figures 12 and 13 demonstrate the relationship between
P#, algorithm accuracy, rule set size and time elapsed - without compression.
The challenge is to set P# to a value which will enable the system to generate
rules in the population that are not overly general (such classifiers may maintain
high fitness and succeed in certain cases, but may lead to degradation of overall
performance [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]), but at the same time not overly specific, since this will require
more rules to cover the entire problem map, which in turn
might require an increase of N, leading to an unnecessary increase in running time.
        </p>
        <p>Guideline. For problems that are not evidently generalisable, P# must be
set low - in the range of [0, 0.3], with more complex situations requiring P# to
be set as low as 0. For problems with cleanly generalisable patterns or where high
generality is needed, P# should be set at the higher end of the range, i.e., [0.7,
0.99]. In situations where determination of P# is too difficult, the rule specificity
limit (RSL) [17] is recommended.</p>
      </sec>
      <sec id="sec-2-6">
        <title>Parameter ν - Fitness exponent.</title>
        <p>The fitness exponent parameter (ν) is central to updating the fitness of a rule.
Results shown in Figures 14 and 15 demonstrate that the optimal values of ν for
XCS and UCS differ. For XCS, the optimal value appears to be ν = 5; at this value,
the best balance between accuracy, rules generated and run time can be achieved.
For UCS, results demonstrate that as the fitness exponent increases, algorithm
accuracy increases, whilst the number of devised rules decreases even when rule
compaction has not been switched on.</p>
        <p>Guideline. Optimal values of ν for XCS and UCS differ. For XCS, the optimal
value appears to be ν = 5, whilst for UCS working on cleanly generalisable
problems and data, ν must be set high, closer to ν = 10.</p>
      </sec>
      <sec id="sec-2-7">
        <title>Parameter χ - Crossover probability.</title>
        <p>The crossover probability parameter (χ) determines the probability of applying
crossover to the selected classifiers during activation of the genetic algorithm.
Results shown in Figures 16 and 17 demonstrate that an increase in χ results in
a considerable decrease of the algorithm accuracy - best results are achieved when
χ = 0.7. The number of rules also decreases. This can be explained by the
fact that by increasing χ the algorithm is encouraged to generate rules that
are more generalised, so more rules will be subsumed during the compression
stage. The caveat is that more general rules do not necessarily guarantee a better
resulting algorithm accuracy.</p>
        <p>Guideline. If rule compression is not used, values of χ in the range [0.7, 0.9]
will yield appropriate results; otherwise, the recommended value for χ is 0.7.</p>
      </sec>
      <sec id="sec-2-8">
        <title>Parameter μ - Mutation probability.</title>
        <p>
          The mutation probability parameter (μ) determines the probability of mutation
for every attribute of a classifier generated by the genetic algorithm. A range
of values from [0.2, 0.5] was investigated for this parameter, as per
recommendations in [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. Results shown in Figures 18 and 19 demonstrate that as the
value of μ increases, the accuracy decreases, the rule count grows and the
computation time increases. Also, note that for the multiplexer problem (with or
without compression), the optimum value for μ appears to be 0.05, resulting in
a prediction accuracy of 100%, which is a stronger result than when μ was set
in the [0.2, 0.5] range. A similar trend was observed for the position problem.
        </p>
        <p>Guideline. μ proved to be domain specific, hence no single value can be
recommended; rather, μ must be calculated using the formula μ = 1/l, as suggested
in [18], where l is the number of attributes in the condition of the rules.</p>
        <p>Guidelines for other parameters. β = 0.1 must be used for problems
where noise is present, data is unbalanced, or complexity is high, whilst β = 0.2
must be used for clean, straightforward problems. The parameter α must be
set to 0.1. For clean problems, ε₀ must be set low, in the range [0, 0.01] - for
complex problems and problems that are expected to yield a high amount of
error for each classifier, ε₀ must be set higher.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Concluding remarks and future work</title>
      <p>
        The reader may notice an alignment of our guidelines for certain parameters with
suggestions for XCS in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] and [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], which effectively comprise the only closely
related work. Therefore, it is worth noting that i) in addition to XCS,
supervised learning (UCS) has also been addressed here, and ii) different benchmark
problems have been considered, so the alignment reinforces our guidelines. In
addition, we report on how rule-based machine learning algorithms, such as XCS
and UCS, react to a problem with in-built class imbalance (recall Fig. 5).
      </p>
      <p>
        Previous work has been concerned with the application of rule-based machine
learning to learning individual passengers' preferences and making personalised
recommendations for their onward journeys, in the event of major disruption
[
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], [19]. In this context, XCS was adapted to work with integer-valued rather
than Boolean parameters, resulting in the so-called XCSI, as most variables
in the problem domain were not binary. XCS and reinforcement learning have
been applied to the problem of controllability in complex networks, and more
specifically to the control of random Boolean networks (RBNs), in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. In brief,
XCS can be successfully used to evolve rule sets that can direct the network
from any state to a target attractor state, or switch between attractor states.
The network is understood to operate effectively at attractor states, hence this
notion of controllability in complex networks. The rule-based machine learning
algorithm in this case learns to develop a series of interventions, at most one at
each time step, to successfully control the underlying network in what is a multi-step
problem. The work in [20] then investigates whether the machine learning
algorithm produces optimal paths to the specified attractor state.
      </p>
      <p>There was little guidance in the literature on setting the run parameter values
for XCSI [21] in the one-step problem (of making the correct recommendation to
a passenger), as well as for XCS in the multi-step problem of directing a random
Boolean network to an attractor state. This was in part the motivation for the
work presented in this paper. The lack of guidance on setting the run parameter
values can be a barrier to adopting a rule-based machine learning approach, and
hence to capitalising on the fact that the outcome in this form of machine
learning is interpretable.</p>
      <p>Recent developments in applications of rule-based machine learning include
ExSTraCS [22], a streamlined rule-based machine learning system for
supervised learning problems, as well as attempts to control dynamical
systems such as predator-prey models or the well-known S-shaped growth
population model [23]. Such systems involve real-valued rather than Boolean
parameters and, in an important sense, control becomes continuous rather
than discrete. In this context, an adaptation of XCS is necessary so that it works
with real-valued parameters, the so-called XCSR [14]. The challenge for XCSR
is to identify the initialisation of the model which leads to a desired state. This
is possible after determining the control or driver nodes [24] and the series of
interventions needed for steering the network to that state.</p>
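      <p>The essential change in XCSR is the condition representation: following Wilson's interval encoding [14], each condition allele becomes a (centre, spread) pair over the reals, and matching tests containment in every interval. A minimal sketch (the attribute values below are hypothetical):</p>

```python
# Interval-based matching as in XCSR [14]: a condition is a list of
# (centre, spread) pairs, and a real-valued input vector matches when
# every component falls inside its interval [centre - spread, centre + spread].

def matches(condition, inputs):
    """condition: list of (centre, spread) pairs; inputs: list of floats."""
    return all(c - s <= x <= c + s for (c, s), x in zip(condition, inputs))

cond = [(0.5, 0.2), (0.8, 0.1)]     # hypothetical two-attribute condition
print(matches(cond, [0.45, 0.85]))  # True: both inputs inside their intervals
print(matches(cond, [0.45, 0.95]))  # False: 0.95 lies outside [0.7, 0.9]
```

      <p>Mutation and crossover in XCSR then operate on the centres and spreads directly, rather than on ternary alleles.</p>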
      <p>Preliminary investigations in [25] have shown that XCSR can control
nonlinear systems of order 2 by appealing to adaptive action mapping [26]. Further
work is required to overcome issues of scaling up and the action-chain learning
problem, which is intrinsic to reinforcement learning in multi-step problems.</p>
      <p>The RuleML hub for interoperating structured knowledge [27] considers
condition-action rules, similar to those found in rule-based machine learning
and LCS. This opens up the possibility of `web-based AI' types of
architectures and applications.</p>
      <p>Another possible direction for future work concerns evolving DMN rules using
XCS. DMN-based rules [28] are expressive but may become extremely complex
for a human to construct. Using XCS to evolve DMN rules could produce complex
and adaptive rule sets without the need for human intervention in the system.
</p>
      <p>10. J. Filip and T. Kliegr, "Classification based on associations (CBA) - A performance
analysis," in Proceedings of the RuleML+RR 2018 Challenge, 2018.
11. T. Dean, Network+ Guide to Networks. Course Technology Cengage Learning, 2013.
12. M. Kuhn and K. Johnson, Applied Predictive Modeling. Springer, NY, USA, 2016.
13. G. James, D. Witten, T. Hastie, and R. Tibshirani, An Introduction to Statistical
Learning: with Applications in R. Springer, New York, NY, USA, 2015.
14. S. W. Wilson, "Get real! XCS with continuous-valued inputs," in Learning
Classifier Systems, pp. 209-219, Springer, 2000.
15. S. W. Wilson, "Classifier fitness based on accuracy," Evolutionary Computation,
vol. 3, pp. 149-175, June 1995.
16. S. W. Wilson, "Generalization in the XCS classifier system," in 3rd Annual Genetic
Programming Conference (GP-98), 1998.
17. R. J. Urbanowicz and J. H. Moore, "ExSTraCS 2.0: description and evaluation of
a scalable learning classifier system," Evolutionary Intelligence, vol. 8, no. 2-3,
pp. 89-116, 2015.
18. K. Deb, "Multi-objective optimisation using evolutionary algorithms: An
introduction," in Multi-objective Evolutionary Optimisation for Product Design and
Manufacturing, pp. 3-34, Springer, 2011.
19. M. R. Karlsen and S. Moschoyiannis, "Customer Segmentation of Wireless
Trajectory Data," Technical Report, University of Surrey, 2019.
20. M. R. Karlsen and S. K. Moschoyiannis, "Optimal control rules for random Boolean
networks," in International Workshop on Complex Networks and their
Applications, pp. 828-840, Springer, 2018.
21. S. W. Wilson, "Compact rulesets from XCSI," in International Workshop on
Learning Classifier Systems, pp. 197-208, Springer, 2001.
22. R. J. Urbanowicz and J. H. Moore, "ExSTraCS 2.0: description and evaluation of
a scalable learning classifier system," Evolutionary Intelligence, vol. 8, no. 2-3,
pp. 89-116, 2015.
23. J. D. Sterman, Business Dynamics: Systems Thinking and Modeling for a Complex
World. McGraw-Hill Higher Education, 2000.
24. S. Moschoyiannis, N. Elia, A. Penn, D. J. B. Lloyd, and C. Knight, "A
web-based tool for identifying strategic intervention points in complex systems," in
Proceedings of the Games for the Synthesis of Complex Systems (CASSTING'16
@ ETAPS 2016), vol. 220 of EPTCS, pp. 39-52, 2016.
25. V. B. Georgiev, M. R. Karlsen, L. Schoenenberger, and S. Moschoyiannis, "On
controlling non-linear systems with XCSR," in 13th SICC Workshop - Complexity
and the City, 2018.
26. M. Nakata, P. L. Lanzi, and K. Takadama, "Selection strategy for XCS with adaptive action mapping,"
in 15th Annual Conference on Genetic and Evolutionary Computation, 2016.
27. H. Boley, "The RuleML Knowledge-Interoperation Hub," in RuleML 2016, LNCS
9718, pp. 19-33, Springer, 2016.
28. D. Calvanese, M. Dumas, F. M. Maggi, and M. Montali, "Semantic DMN:
Formalizing decision models with domain knowledge," in RuleML+RR 2017, LNCS
10364, pp. 70-86, Springer, 2017.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>K.</given-names>
            <surname>Sha</surname>
          </string-name>
          and
          <string-name>
            <given-names>H.</given-names>
            <surname>Abbass</surname>
          </string-name>
          , \
          <article-title>A survey of learning classifier systems in games,"</article-title>
          <source>IEEE Computational Intelligence Magazine</source>
          , vol.
          <volume>12</volume>
          , pp.
          <volume>42</volume>
          {
          <issue>55</issue>
          , 02
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>W. N. L.</given-names>
            <surname>Browne</surname>
          </string-name>
          , \
          <article-title>The development of an industrial learning classifier system for data-mining in a steel hot strip mill,"</article-title>
          <source>in Applications of Learning Classifier Systems</source>
          , pp.
          <volume>223</volume>
          {
          <issue>259</issue>
          , Springer Berlin,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>P.</given-names>
            <surname>Krause</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Razavi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Moschoyiannis</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Marinos</surname>
          </string-name>
          , \
          <article-title>Stability and complexity in digital ecosystems,"</article-title>
          <source>in IEEE Digital Ecosystems and Technologies (IEEE DEST)</source>
          , pp.
          <volume>85</volume>
          {
          <issue>90</issue>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>M. R.</given-names>
            <surname>Karlsen</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Moschoyiannis</surname>
          </string-name>
          , \
          <article-title>Learning condition-action rules for personalised journey recommendations,"</article-title>
          <source>in RuleML + RR</source>
          <year>2018</year>
          ,
          <article-title>LNCS 11092</article-title>
          , pp.
          <volume>293</volume>
          {
          <issue>301</issue>
          , Springer,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>M. R.</given-names>
            <surname>Karlsen</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Moschoyiannis</surname>
          </string-name>
          , \
          <article-title>Evolution of control with learning classifier systems,"</article-title>
          <source>Applied Network Science</source>
          , vol.
          <volume>3</volume>
          , pp.
          <volume>3</volume>
          {
          <issue>30</issue>
          ,
          Aug
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>M. V.</given-names>
            <surname>Butz</surname>
          </string-name>
          and S. W. Wilson, \
          <article-title>An algorithmic description of XCS,"</article-title>
          <source>Soft Computing</source>
          , vol.
          <volume>6</volume>
          , no.
          <issue>3</issue>
          , pp.
          <volume>144</volume>
          {
          <issue>153</issue>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>R. J.</given-names>
            <surname>Urbanowicz</surname>
          </string-name>
          and
          <string-name>
            <given-names>W. N.</given-names>
            <surname>Browne</surname>
          </string-name>
          ,
          <article-title>Introduction to learning classifier systems</article-title>
          . Berlin, Germany. Springer,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>M. V.</given-names>
            <surname>Butz</surname>
          </string-name>
          and S. W. Wilson, \
          <article-title>An algorithmic description of XCS,"</article-title>
          <source>in International Workshop on Learning Classifier Systems</source>
          , pp.
          <volume>253</volume>
          {
          <issue>272</issue>
          , Springer,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>E.</given-names>
            <surname>Bernado-Mansilla</surname>
          </string-name>
          and
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Garrell-Guiu</surname>
          </string-name>
          , \
          <article-title>Accuracy-based learning classifier systems: Models, analysis and applications to classification tasks,"</article-title>
          <source>Evolutionary Computation</source>
          , vol.
          <volume>11</volume>
          , no.
          <issue>3</issue>
          , pp.
          <volume>209</volume>
          {
          <issue>238</issue>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>