<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Fine tuning run parameter values in rule-based machine learning</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sotiris Moschoyiannis</string-name>
          <email>s.moschoyiannis@surrey.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vasily Shcherbinin</string-name>
          <email>v.shcherbinin@surrey.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, University of Surrey</institution>,
          <addr-line>Guildford, Surrey, GU2 7XH</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>Rule-based machine learning focuses on learning or evolving a set of rules that represents the knowledge captured by the system. Due to its inherent complexity, a certain amount of fine tuning is required before it can be applied to a particular problem. However, there is limited information available to researchers when it comes to setting the corresponding run parameter values. In this paper, we investigate the run parameters of Learning Classifier Systems (LCSs) as applied to single-step problems. In particular, we study two LCS variants, XCS for reinforcement learning and UCS for supervised learning, and examine the effect that different parameter values have on enhancing the model prediction, increasing accuracy and reducing the resulting rule set size.</p>
      </abstract>
      <kwd-group>
        <kwd>condition-action rules</kwd>
        <kwd>learning classifier systems</kwd>
        <kwd>reinforcement learning</kwd>
        <kwd>supervised learning</kwd>
        <kwd>fine tuning LCS</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        The key feature behind rule-based machine learning (RBML) is that it evolves
around rules and therefore the outcome is interpretable by both computers
and humans. RBML approaches include learning classifier systems (LCSs),
association rule learning and artificial immune systems. The focus here is on LCSs,
which have been used in a variety of settings, from games [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] to industrial control
[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] to policy [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] to transport [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] to controlling Boolean networks [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>However, rule-based approaches involve a number of parameters and the
quality of the outcome largely relies on assigning them appropriate values. There is
a distinct lack of guidance on how the run parameters should be set in order
to build a good LCS, for both reinforcement learning and supervised learning
problems. Here, the investigation is aimed at single-step problems; that is,
learning problems where the reward from effecting the action on the environment is
known immediately rather than after several steps, as in multi-step problems.</p>
      <p>In this paper we study the interplay of the LCS with the data being learned
in benchmark (single-step) problems. In particular, we focus the investigation
on XCS, the most popular LCS variant for reinforcement learning, and UCS,
the classic variant for supervised learning problems. UCS was specifically built
for knowledge extraction (learning patterns in complex datasets) and provides a
contrast to the reinforcement learning XCS, which allows us to see the effects of
various parameter values from a different LCS angle.</p>
      <p>
        We focus on the run parameters suggested in seminal work in the field of LCS,
such as [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ],[
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] and use the values therein as a starting point. Neither work gives
information on the source of the values. We aim to fill this gap in this paper by
providing explicit guidelines on how to set the values of these key parameters of
LCS for both reinforcement and supervised learning. We test both algorithms
against balanced and unbalanced, generalisable problems.
      </p>
      <p>The remainder of the paper is structured as follows. Section 2 provides a
brief overview of LCS, XCS and UCS. Section 3 introduces the two benchmark
problems used in the study while Section 4 outlines the methodology and
experiments we carried out. In Section 5 we discuss the results for both learning
algorithms and provide explicit guidelines for setting their run parameters. We
conclude and consider future work in Section 6.
</p>
      <p>Rule-based Machine Learning: an overview of LCSs</p>
      <p>Learning classifier systems (LCSs) evolve around rules - matching, predicting,
updating, discovering. A rule follows an IF : THEN expression and comprises
two parts: the condition (or state) and the action (or class or phenotype).</p>
      <p>A rule is associated with a number of parameters: (i) the reward prediction
p estimates the reward if this rule's action is eventually effected on the
environment, (ii) the prediction error ε, and (iii) a fitness F which reflects the quality of
the rule. These parameters are updated in every iteration of the learning cycle
(cf Sections 2.1, 2.2). A classifier is a rule with associated parameters.</p>
      <p>The condition of a rule is expressed in a ternary alphabet; an attribute can
have a value of 0 or 1, but also a # which stands for the 'don't care' value, i.e.,
it can match both 0 and 1. This is what allows rules to be generalised and made
applicable to more than one instance - this in turn allows useful data
relationships to be captured without overfitting the rules to the training data. For example,
the rules 110:1 and 111:1 can be generalised to 11#:1 (e.g., cloudy and windy points
to taking an umbrella irrespective of evening or morning). The action part of a
rule represents the predicted or recommended action.</p>
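To make the ternary representation concrete, the matching and generalisation just described can be sketched as follows; the helper names are ours (hypothetical), not those of any particular LCS codebase:

```python
# Sketch of ternary rule matching: '#' ("don't care") matches either bit value.

def matches(condition: str, state: str) -> bool:
    """True if every specified attribute of the condition equals the input bit."""
    return all(c == '#' or c == s for c, s in zip(condition, state))

def generalise(cond_a: str, cond_b: str) -> str:
    """Merge two conditions by replacing differing attributes with '#'."""
    return ''.join(a if a == b else '#' for a, b in zip(cond_a, cond_b))

# 110:1 and 111:1 generalise to 11#:1, which matches both inputs.
rule = generalise('110', '111')
assert rule == '11#'
assert matches(rule, '110') and matches(rule, '111')
assert not matches(rule, '100')
```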
      <p>The LCS functional cycle, shown in Figure 1, comprises three components:
the environment, the learning machine and, optionally, rule compression
(post-processing). The environment provides as input a series of sensory situations -
typically abstracted to a dataset containing so-called training instances.</p>
      <p>The first step of the learning cycle is to receive input from the environment
(1). The next step is to identify and choose all rules in the rule population [P]
whose condition part matches that of the training instance (environment input)
(2). Matching compares the current input to all rules in the population [P]. If
all specified attributes match, the rule is moved to the match set [M] (3).</p>
      <p>The next step produces the action set [A] (4) which comprises all actions
associated with rules whose condition matched the input. This set forms the
input used in the GA and also specifies which rules are updated after each
iteration. We will have more to say on this in Sections 2.1 and 2.2.</p>
      <p>In addition to the parameters p, ε, F discussed earlier, rules in [P] are
associated with further parameters: age - the number of iterations that occurred after
the rule was created; exp - the number of times the rule belonged to the action set
[A]; actionSetSize - the size of the action set to which the rule belonged; n -
the number of copies of a rule appearing in the population, called numerosity.</p>
      <p>If the action set is empty, or the number of actions in it does not exceed the
pre-set maximum number of actions, mna, the Covering mechanism is applied
(5) - this allows new rules to be introduced to the population [P] (rule discovery).
During the process, some of the attributes in the rule's condition and action
are generated using values taken from the current environment input; the rest
are set to #, with probability P#. Since LCS rule populations start off empty,
covering is also used for population initialisation. If there is at least one action
in the action set, the next step is to update the rule parameters for the classifiers
that have made it into the action set [A] (6) (cf Sections 2.1 and 2.2).</p>
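The covering step can be sketched as below, assuming a binary-string input; the function name and the illustrative P# value are ours, not taken from any specific implementation:

```python
# Sketch of the covering mechanism (step 5): build a new rule from the current
# input, turning each attribute into '#' with probability P#.
import random

def cover(state: str, actions: list, p_hash: float = 0.33) -> tuple:
    """Return a (condition, action) rule guaranteed to match `state`."""
    condition = ''.join('#' if random.random() < p_hash else bit
                        for bit in state)
    return condition, random.choice(actions)

cond, act = cover('101101', actions=[0, 1])
# Every specified (non-'#') attribute is copied from the input,
# so the new rule necessarily matches the covering input.
assert all(c == '#' or c == s for c, s in zip(cond, '101101'))
```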
      <p>Additionally, a mechanism called Subsumption is often applied (7), in which
pairs of rules are examined and whenever a rule (the subsumer) matches all attributes
of the other rule, is more general and just as accurate, it absorbs this other, more
specific, rule. An error threshold ε₀ determines the maximum error a rule may
have and still be able to subsume another.</p>
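The subsumption criteria just described can be sketched as follows; the rule representation and the ε₀ value are illustrative assumptions:

```python
# Sketch of the subsumption check (step 7): a rule subsumes another when it
# advocates the same action, is at least as general, and its error is below
# the threshold epsilon_0. Rules are (condition, action, error) triples here.

def is_more_general(subsumer: str, other: str) -> bool:
    """The subsumer may only specify attributes the other rule also specifies."""
    return all(a == '#' or a == b for a, b in zip(subsumer, other))

def can_subsume(subsumer, other, epsilon_0=0.01):
    cond_a, act_a, err_a = subsumer
    cond_b, act_b, _ = other
    return act_a == act_b and err_a < epsilon_0 and is_more_general(cond_a, cond_b)

assert can_subsume(('11#', 1, 0.001), ('110', 1, 0.02))
assert not can_subsume(('110', 1, 0.001), ('11#', 1, 0.02))  # less general
```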
      <p>Next, a genetic algorithm (GA) is applied (8). It operates on the action set
[A], with a frequency determined by θ_GA, to produce new rules by crossover and
mutation. When activated, the GA runs just once. Parent rules are selected
to produce two child rules. With probability χ a two-point crossover operates
on these offspring rules. Each attribute is then mutated with a probability μ.
Several selection mechanisms exist - recent literature suggests a preference towards
tournament selection, where a portion of classifiers is randomly selected from [A]
and the classifier with the highest fitness is chosen to be the parent.</p>
      <p>The last step involves a deletion mechanism which is applied to cap the
population at N rules (9). Whenever the population size, which is the sum of
all rule numerosities, exceeds N, rules are deleted from the population until
the size limit is met. Classifier experience exp must exceed a threshold θ_del for
fitness to impact deletion likelihood. The population cap, combined with the
selective deletion of rules, results in an overall drive to improve the quality of
the population.</p>
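The deletion step can be sketched as a roulette wheel over deletion "votes"; the vote formula below follows the commonly used XCS scheme (experienced, below-average-fitness rules attract larger votes), with assumed θ_del and δ values that may differ from a given implementation:

```python
# Sketch of the deletion mechanism (step 9): while the population size (the sum
# of numerosities) exceeds N, remove single rule copies by weighted roulette.
import random

def deletion_vote(cl, avg_fitness, theta_del=20, delta=0.1):
    """Vote is proportional to action-set-size estimate times numerosity,
    boosted for experienced rules with well-below-average per-copy fitness."""
    vote = cl['actionSetSize'] * cl['n']
    if cl['exp'] > theta_del and cl['F'] / cl['n'] < delta * avg_fitness:
        vote *= avg_fitness / (cl['F'] / cl['n'])
    return vote

def delete_from(population, N):
    """Delete single rule copies until the summed numerosity is at most N."""
    while sum(cl['n'] for cl in population) > N:
        total_n = sum(cl['n'] for cl in population)
        avg_F = sum(cl['F'] for cl in population) / total_n
        votes = [deletion_vote(cl, avg_F) for cl in population]
        chosen = random.choices(population, weights=votes)[0]
        chosen['n'] -= 1                 # remove one copy first
        if chosen['n'] == 0:
            population.remove(chosen)
    return population
```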
      <p>Optionally, a post-processing step is applied to the rule population after
training. This is known as rule compaction (see Fig. 1) and it aims to remove
rules that are inexperienced, inaccurate, or have a low fitness but whose error
does not exceed the specified limit ε₀.</p>
      <p>
        eXtended Classifier System (XCS)
LCS can function by receiving only the utility of the action effected on the
environment (reinforcement learning). The eXtended Classifier System (XCS)
is an accuracy-based LCS developed originally by [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] to target reinforcement
learning problems. The key differentiating characteristics of XCS follow.
      </p>
      <p>Firstly, once the match set [M] has been created, a prediction array is
created, which supplies the fitness-weighted predicted rewards for each of the actions
within [M]. This is used in action selection - in explore mode, with probability
p_explr, the system randomly selects from the available actions in [M]; otherwise,
in exploit mode it chooses the action with the highest predicted reward p as
indicated by the array. Certain rules within the match set (possibly more than one)
contain the action that was selected. These rules form the action set [A] (Fig 1).</p>
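The prediction array and the explore/exploit choice can be sketched as below, assuming dictionary-based classifiers (assumed field names 'action', 'p', 'F'):

```python
# Sketch of the XCS prediction array: fitness-weighted average of predicted
# rewards per action in [M], followed by explore/exploit action selection.
import random

def prediction_array(match_set):
    """match_set: list of dicts with 'action', 'p' (prediction), 'F' (fitness)."""
    sums, weights = {}, {}
    for cl in match_set:
        a = cl['action']
        sums[a] = sums.get(a, 0.0) + cl['p'] * cl['F']
        weights[a] = weights.get(a, 0.0) + cl['F']
    return {a: sums[a] / weights[a] for a in sums}

def select_action(pa, p_explr=0.5):
    if random.random() < p_explr:        # explore: random action
        return random.choice(list(pa))
    return max(pa, key=pa.get)           # exploit: highest predicted reward

pa = prediction_array([
    {'action': 0, 'p': 1000.0, 'F': 0.9},
    {'action': 0, 'p': 0.0,    'F': 0.1},
    {'action': 1, 'p': 200.0,  'F': 0.5},
])
assert select_action(pa, p_explr=0.0) == 0   # exploit picks the higher p
```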
      <p>Secondly, there is a feedback mechanism which detects the effect of the
selected action when executed on the environment and grants a particular reward
or payoff. On a single-step problem, such as those addressed in this paper, this
reward is used directly in updating the rule parameters.</p>
      <p>
        The updating can only happen in [A]. Parameter updates follow the
time-weighted recency average (TWRA) principle [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], which can be described as:
AverageNew = AverageCurrent + β (ValueCurrent − AverageCurrent)   (1)
This is directly inspired by the Widrow-Hoff update, which is commonly used in
artificial neural networks (ANNs) for updating the weights of the inputs:
ω_{i+1} = ω_i + β (u_i − ω_i)   (2)
When the value u_i is received, the equation above is used to record the value of
the variable ω at each iteration i. The value of ω_{i+1} is thus a combination of
the old value of ω and the newly received value u_i. The split balance between
using old and new values is controlled by β, which is called the learning rate
parameter. If β is 0, then no learning takes place; if β = 1, then only the new
value is stored and there is no recollection of past values. The optimal value
for β therefore lies in the range [0, 1]. The reward prediction p is updated by:
p ← p + β (r − p)   (3)
where r is the reward received from the environment. The error ε between the
reward received from the environment and the predicted reward is updated by:
ε ← ε + (|r − p| − ε) / exp,  if exp &lt; 1/β;   ε ← ε + β (|r − p| − ε),  otherwise   (4)
following the same concept of split balance described earlier for p, but this time
applied to ε (cf Fig. 6). Fitness of a rule is based on accuracy [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], given by:
κ = 1,  if ε &lt; ε₀;   κ = α (ε / ε₀)^(−ν),  otherwise   (5)
where ε₀ is a predefined error threshold, as discussed before. If the error
associated with this rule (the classifier's error) is greater than the error threshold ε₀,
its accuracy decreases, based on α and ν. When α is set to a low value
(approaching 0), there is an acute decrease in the accuracy value; when α is set
to a higher value, ν drives the rate of change in accuracy. If ν is set to a high
value, the accuracy will decrease fast; if ν is set to a low value, accuracy will
drop slowly. A form of fitness-sharing is applied in XCS, given by:
κ' = κ / Σ_{cl∈[A]} κ_cl   (6)
where cl denotes a classifier in the action set [A], κ_cl the accuracy associated
with cl, and κ the accuracy of the present classifier. The fitness of the
classifier (cf Fig. 6) can thus be calculated and updated using:
F ← F + β (κ' − F)   (7)
      </p>
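The XCS update equations can be collected into a single routine; the sketch below assumes dictionary-based classifiers and illustrative default parameter values, and uses one of several reasonable update orders (the order is implementation-dependent, cf. Fig. 6):

```python
# Sketch of the XCS parameter update for a single-step reward r: reward
# prediction and error via TWRA/MAM, accuracy from the error threshold,
# then fitness sharing across the action set.

def update_action_set(action_set, r, beta=0.2, alpha=0.1, nu=5, epsilon_0=0.01):
    for cl in action_set:
        cl['exp'] += 1
        if cl['exp'] < 1 / beta:          # young rule: plain running average
            cl['p'] += (r - cl['p']) / cl['exp']
            cl['eps'] += (abs(r - cl['p']) - cl['eps']) / cl['exp']
        else:                             # experienced: recency-weighted update
            cl['p'] += beta * (r - cl['p'])
            cl['eps'] += beta * (abs(r - cl['p']) - cl['eps'])
        # accuracy: 1 below the error threshold, power-law decay above it
        cl['kappa'] = (1.0 if cl['eps'] < epsilon_0
                       else alpha * (cl['eps'] / epsilon_0) ** (-nu))
    total = sum(cl['kappa'] for cl in action_set)
    for cl in action_set:
        rel = cl['kappa'] / total         # fitness sharing
        cl['F'] += beta * (rel - cl['F'])

cl = {'exp': 0, 'p': 0.0, 'eps': 0.0, 'F': 0.0, 'kappa': 0.0}
update_action_set([cl], r=1000.0)
assert cl['p'] == 1000.0                  # first update stores the full reward
```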
      <p>
        sUpervised Classifier System (UCS)
The sUpervised Classifier System (UCS) is an accuracy-based LCS developed by
[
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] specifically for supervised learning problems. In supervised learning, the
correct action is available from the environment. This allows UCS to focus on data
mining and knowledge discovery in database-type problems, e.g., the performance
analysis of CBA in [10]. UCS and XCS share key principles, such as a niche GA
and defining fitness based on accuracy, but also have key differences.
      </p>
      <p>Firstly, since every environment input comes with the correct action in
supervised scenarios, the match set [M] is split into two sets - a correct set [C] and an
incorrect set [I] (recall step (4) in Fig. 1) - depending on whether the population
rule's action also matches the current input's action. If the current action is not in
[M], covering is triggered to create a rule that matches the current action. The
ability to cover without guessing a suitable action is also beneficial in UCS.</p>
      <p>
        Secondly, the fact that the correct action is known simplifies the reward function.
The measure of accuracy is simplified, as the performance of each rule is
explicitly available. Each classifier in UCS has an additional parameter, correctTrack,
which is incremented every time a classifier is part of the correct set [C]. The
classifier accuracy replaces the reward prediction p in XCS and is calculated as:
accuracy = correctTrack / exp
Fitness in UCS is simpler than in XCS [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] and is given by: fitness = accuracy^ν,
where ν determines the pressure on the rules to be accurate (recall Sec. 2.1).
      </p>
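The UCS accuracy and fitness computation above amounts to a one-line formula; a minimal sketch with assumed parameter names:

```python
# Sketch of UCS fitness: accuracy is the fraction of trials in which the rule
# was in the correct set [C], raised to the fitness-pressure exponent nu.

def ucs_fitness(correct_track: int, exp: int, nu: int = 10) -> float:
    accuracy = correct_track / exp
    return accuracy ** nu

# A rule correct in 90 of 100 trials, with nu = 10:
f = ucs_fitness(90, 100)
assert abs(f - 0.9 ** 10) < 1e-12
```

Raising accuracy to ν strongly separates near-perfect rules from merely good ones, which is the "pressure to be accurate" described above.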
    </sec>
    <sec id="sec-2">
      <title>Benchmark problems</title>
      <p>In this section we describe the benchmark problems used in this research.</p>
      <sec id="sec-2-1">
        <title>Multiplexer</title>
        <p>The Boolean n-bit Multiplexer defines a set of learning problems that
conceptually describe the behaviour of an electronic multiplexer (MUX), a device that
can combine multiple analog or digital input signals and forward them into a
single output signal [11]. The 6-bit multiplexer is shown in Figure 2.</p>
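The n-bit multiplexer output can be computed as follows; the first k bits are the address selecting one of the 2^k data bits (n = k + 2^k, e.g. k = 2 for the 6-bit and k = 3 for the 11-bit problem):

```python
# Sketch of the Boolean n-bit multiplexer benchmark's labelling function.

def multiplexer(bits: str) -> int:
    k = 0
    while k + 2 ** k < len(bits):    # recover the address length k from n
        k += 1
    address = int(bits[:k], 2)       # address bits select one data bit
    return int(bits[k + address])

assert multiplexer('100010') == 1    # 6-bit: address '10' = 2 -> data bit '1'
assert multiplexer('000110') == 0    # address '00' = 0 -> first data bit '0'
```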
        <p>
          Multiplexer problems exhibit epistasis (the class is determined via interaction
between multiple features, rather than a single one) and also heterogeneous
behaviour (for different instances, the class value is determined by a different
subset of features) [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. This elevates the complexity of the problem due to the
non-linear association between the condition and the corresponding action.
        </p>
        <p>In addition, multiplexer problems contain knowledge-based patterns. Figure
3 shows that the entire domain can be separated into 8 maximally general rules -
theoretically this is the minimum (optimal) number of rules that the LCS must
generate to completely cover the problem. Figure 4 shows the class symmetry,
which is characteristic of a balanced problem. The multiplexer benchmark
problem was chosen for use in this project as an example of a balanced, generalisable
problem. Specifically, the 11-bit Multiplexer was used.</p>
      </sec>
      <sec id="sec-2-2">
        <title>Position</title>
        <p>The position problem benchmark is an example of an unbalanced multi-class
problem where, given a binary string of length l as input, the output is
determined by the value of the index position of the left-most one-valued bit. Although
it is a generalisable problem, the position benchmark differs from other
benchmarks as it is unbalanced - this is clearly seen in Figure 5, where there are
many more instances of class 0 than of class 2, for example. The 9-bit Position
benchmark was used in this paper.</p>
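The position benchmark's labelling function can be sketched as below; note that the class assigned to the all-zeros input is our assumption, as the convention for that input is not stated above:

```python
# Sketch of the position benchmark: the class is the index of the left-most
# one-valued bit. Class 0 covers half of all inputs, class 1 a quarter, and so
# on, which produces the class imbalance described in the text.

def position(bits: str) -> int:
    for i, b in enumerate(bits):
        if b == '1':
            return i
    return len(bits)                 # all-zeros input: assumed extra class

assert position('001010100') == 2    # 9-bit: left-most '1' at index 2
assert position('100000000') == 0
```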
        <p>
          Experiments: methodology and implementation
We started with the "recommended" parameter values suggested by [
          <xref ref-type="bibr" rid="ref6 ref7">7,6</xref>
          ] and
then altered every parameter consecutively, changing its value from the smallest to
the highest recommended value whilst keeping the other parameters constant. For
every experiment, the resulting model cross-validation accuracy, number of rules
in [P] and run time were recorded. Each experiment was repeated 10 times to
minimise the LCS stochastic effect, and the averaged values were recorded.
        </p>
        <p>k-Fold Cross-Validation was used to evaluate the XCS and UCS models using
a set of training data. The most common values for the number of folds k are 5
and 10 [12], as they have been shown "empirically to yield test error rate
estimates that suffer neither from excessively high bias nor from very high variance"
[13]. In our case, k = 5 was chosen, as it allowed us to generate sufficiently large
training and testing data sets, representative of the data being learned.</p>
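The evaluation procedure can be sketched as below; `train` and `evaluate` are illustrative stand-ins for the LCS training and scoring calls, not functions from the codebase used here:

```python
# Sketch of k-fold cross-validation (k = 5): split the data into k folds,
# train on k-1 folds, test on the held-out fold, and average the accuracies.

def k_fold_indices(n: int, k: int = 5):
    """Yield (train_idx, test_idx) pairs over n instances."""
    fold = n // k
    for i in range(k):
        test = list(range(i * fold, (i + 1) * fold if i < k - 1 else n))
        test_set = set(test)
        train = [j for j in range(n) if j not in test_set]
        yield train, test

def cross_validate(data, train, evaluate, k=5):
    scores = []
    for tr, te in k_fold_indices(len(data), k):
        model = train([data[i] for i in tr])
        scores.append(evaluate(model, [data[i] for i in te]))
    return sum(scores) / k

folds = list(k_fold_indices(10, 5))
assert len(folds) == 5 and all(len(te) == 2 for _, te in folds)
```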
        <p>
          The XCS algorithm implementation was created following closely [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ], which,
since its publication, has become the standard to follow when implementing
XCS. Minor modifications in our implementation are:
1. The order of updating rule parameters was changed for convenience - see Fig. 6.
2. The calculation of accuracy is done via a power law function, as proposed in [14],
rather than via the exponential function used in the original.
3. Covering occurs based on classifiers in [M], rather than on the mean prediction
of the population suggested in [15], to increase program execution speed.
4. The genetic algorithm is applied to the action set, as described in [16].
(The codebase used for our experiments is available on Gitlab by pointing your
browser to: https://gitlab.eps.surrey.ac.uk/vs00162/lcs-suite.)
For UCS, the implementation follows [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ].
The run parameters investigated in this work can be seen in Figure 7.
        </p>
        <p>Results - Guidelines for run parameter values</p>
      </sec>
      <sec id="sec-2-3">
        <title>Parameter I - number of iterations.</title>
        <p>The iteration parameter (I) specifies the maximum number of learning iterations
before the evaluation of the algorithm commences. During a single iteration, an
input is received from the environment and the LCS performs a learning cycle
over it. The learning continues until either a specific number of iterations is
reached or the training accuracy reaches a certain threshold. Here, we opted for
the former as this allows for a fairer test and comparison.</p>
        <p>Results shown in Figures 8 and 9 demonstrate the relationship between the
number of iterations and accuracy, rule set size, and time for both benchmark
problems, for both XCS (reinforcement learning) and UCS (supervised learning). It
can be seen that the number of iterations and the time it takes to complete the
learning cycle are directly proportional. This is true for both benchmark
problems and is not affected by the algorithm type. An increase in the number of
iterations causes a decrease in the number of rules generated, indicating that the
population is becoming more optimised and generalised. It also has a positive
effect on the accuracy of the algorithm.</p>
        <p>
          Guideline. For problems where the input has a length of 20 attributes or less,
acceptable results may be achieved with [10,000, 50,000] iterations. For larger
and more complex problems, a larger number of iterations should be considered.
        </p>
      </sec>
      <sec id="sec-2-4">
        <title>Parameter N - Rule population size.</title>
        <p>The maximum population size parameter (N) specifies the allowed maximum
number of rules in the population. Once N is exceeded, the deletion mechanism
is activated. It is usually one of the first parameters to be tweaked in an LCS
and has an important influence on the result obtained by the system.</p>
        <p>Results shown in Figures 10 and 11 demonstrate the relationship between the
population size N and accuracy, rule set size, and time for both benchmark problems.</p>
        <p>It can be seen that an increase in the population size has yielded a dramatic
increase in accuracy, following a logarithmic pattern. In the position problem it
is almost constant (see Fig. 11), while a sharp rise is observed when N is smaller
than 500. The results allow us to draw several conclusions:
(1) Parameter N plays a vital role in a learning classifier system and it is crucial
to get the value correct for the system to yield adequate results.
(2) If N is set too low, the LCS will not be able to maintain good classifiers and
generate a good model, resulting in low accuracy of prediction on new instances.
(3) If N is set too high, unnecessary computational resources will be consumed
without any benefit. This is not only inefficient but can also lead to a longer time
for the algorithm to converge to an optimal solution.</p>
        <p>The challenge therefore is to find a value of N that is as low as possible,
but at the same time provides adequate accuracy.</p>
        <p>Guideline. N in the range of [1000, 4000] will yield adequate accuracy for
most classification tasks, with smaller problems requiring the lower-end values
and larger and more complex problems the higher-end. If rule compression is
applied, the rule population size will need to be set on the generous side of the
proposed range.</p>
      </sec>
      <sec id="sec-2-5">
        <title>Parameter P # - "don't care" probability.</title>
        <p>
          The parameter P# specifies the probability of inserting a "don't care" symbol,
denoted by #, into a rule that has been generated during the covering
stage. Results shown in Figures 12 and 13 demonstrate the relationship between
P#, algorithm accuracy, rule set size and time elapsed - without compression.
The challenge is to set P# to a value which will enable the system to generate
rules in the population that are not overly general (such classifiers may maintain
high fitness and succeed in certain cases, but may lead to degradation of overall
performance [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]), but at the same time not overly specific, since this will require
more rules to cover the entire problem map, which in turn
might require an increase of N, leading to an unnecessary increase in running time.
        </p>
        <p>Guideline. For problems that are not evidently generalisable, P# must be
set low - in the range of [0, 0.3], with more complex situations requiring P# to
be set as low as 0. For problems with cleanly generalisable patterns or where high
generality is needed, P# should be set at the higher end of the range, i.e., [0.7,
0.99]. In situations where determination of P# is too difficult, the rule specificity
limit (RSL) [17] is recommended.</p>
      </sec>
      <sec id="sec-2-6">
        <title>Parameter ν - Fitness exponent.</title>
        <p>The fitness exponent parameter (ν) is central to updating the fitness of a rule.
Results shown in Figures 14 and 15 demonstrate that the optimal values of ν for
XCS and UCS differ. For XCS, the optimal value appears to be ν = 5; at this value,
the best balance between accuracy, rules generated and run time can be achieved.
For UCS, results demonstrate that as the fitness exponent increases, algorithm
accuracy increases, whilst the number of devised rules decreases even when rule
compaction has not been switched on.</p>
        <p>Guideline. Optimal values of ν for XCS and UCS differ. For XCS, the optimal
value appears to be ν = 5, whilst for UCS working on cleanly generalisable
problems and data, ν must be set high, closer to ν = 10.</p>
      </sec>
      <sec id="sec-2-7">
        <title>Parameter χ - Crossover probability.</title>
        <p>The crossover probability parameter (χ) determines the probability of applying
crossover to the selected classifiers during activation of the genetic algorithm.
Results shown in Figures 16 and 17 demonstrate that an increase in χ results in
a considerable decrease of the algorithm accuracy - best results are achieved when
χ = 0.7. The number of rules also decreases. This can be explained by the
fact that by increasing χ the algorithm is encouraged to generate rules that
are more generalised, so more rules will be subsumed during the compression
stage. The caveat is that more general rules do not necessarily guarantee a better
resulting algorithm accuracy.</p>
        <p>Guideline. If rule compression is not used, values of χ in the range [0.7, 0.9]
will yield appropriate results; otherwise, the recommended value for χ is 0.7.</p>
      </sec>
      <sec id="sec-2-8">
        <title>Parameter μ - Mutation probability.</title>
        <p>
          The mutation probability parameter (μ) determines the probability of mutation
for every attribute of a classifier generated by the genetic algorithm. A range
of values from [0.2, 0.5] was investigated for this parameter, as per
recommendations in [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. Results shown in Figures 18 and 19 demonstrate that as the
value of μ increases, the accuracy decreases, the rule count grows and the
computation time increases. Also, note that for the multiplexer problem (with or
without compression), the optimum value for μ appears to be 0.05, resulting in
a prediction accuracy of 100%, which is a stronger result than when μ was set
in the [0.2, 0.5] range. A similar trend was observed for the position problem.
        </p>
        <p>Guideline. μ proved to be domain specific, hence no single value can be
recommended; rather, μ must be calculated using the formula μ = 1/l, as suggested
in [18], where l is the number of attributes in the condition of the rules.</p>
        <p>Guidelines for other parameters. β = 0.1 must be used for problems
where noise is present, data is unbalanced, or complexity is high, whilst β = 0.2
must be used for clean, straightforward problems. The parameter α must be
set to 0.1. For clean problems, ε₀ must be set low, in the range [0, 0.01] - for
complex problems and problems that are expected to yield a high amount of
error for each classifier, ε₀ must be set higher.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Concluding remarks and future work</title>
      <p>
        The reader may notice an alignment of our guidelines for certain parameters with
suggestions for XCS in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] and [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], which effectively comprise the only closely
related work. Therefore, it is worth noting that i) in addition to XCS,
supervised learning (UCS) has also been addressed here, and ii) different benchmark
problems have been considered, so the alignment reinforces our guidelines. In
addition, we report on how rule-based machine learning algorithms, such as XCS
and UCS, react to a problem with in-built class imbalance (recall Fig. 5).
      </p>
      <p>
        Previous work has been concerned with the application of rule-based machine
learning to learning individual passengers' preferences and making personalised
recommendations for their onward journeys, in the event of major disruption
[
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], [19]. In this context, XCS was adapted to work with integer-valued rather
than Boolean parameters, resulting in the so-called XCSI, as most variables
in the problem domain were not binary. XCS and reinforcement learning have
been applied to the problem of controllability in complex networks, and more
specifically to the control of random Boolean networks (RBNs), in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. In brief,
XCS can be successfully used to evolve rule sets that can direct the network
from any state to a target attractor state, or switch between attractor states.
The network is understood to operate effectively at attractor states, hence this
notion of controllability in complex networks. The rule-based machine learning
algorithm in this case learns to develop a series of interventions, at most one at
each time step, to successfully control the underlying network in what is a multi-step
problem. The work in [20] then investigates whether the machine learning
algorithm produces optimal paths to the specified attractor state.
      </p>
      <p>There was little guidance in the literature on setting the run parameter values
for XCSI [21] in the one-step problem (of making the correct recommendation to
a passenger), as well as for XCS in the multi-step problem of directing a random
Boolean network to an attractor state. This was in part the motivation for the
work presented in this paper. The lack of guidance on setting the run parameter
values can be a barrier to adopting a rule-based machine learning approach, and
hence to capitalising on the fact that the outcome in this form of machine
learning is interpretable.</p>
      <p>Recent developments in applications of rule-based machine learning include
ExSTraCS [22], a streamlined rule-based machine learning system for
supervised learning problems, as well as attempts to control dynamical
systems such as predator-prey models or the well-known S-shaped growth
population model [23]. Such systems involve real-valued rather than Boolean
parameters and, in an important sense, control becomes continuous rather
than discrete. In this context, an adaptation of XCS is necessary so that it works
with real-valued parameters, the so-called XCSR [14]. The challenge for XCSR
is to identify the initialisation of the model which leads to a desired state. This
is possible after determining the control or driver nodes [24] and the series of
interventions needed for steering the network to that state.</p>
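      <p>The essential change in XCSR is the condition representation: following Wilson's interval encoding [14], each condition allele becomes a (centre, spread) pair over the reals, and matching tests containment in every interval. A minimal sketch (the attribute values below are hypothetical):</p>

```python
# Interval-based matching as in XCSR [14]: a condition is a list of
# (centre, spread) pairs, and a real-valued input vector matches when
# every component falls inside its interval [centre - spread, centre + spread].

def matches(condition, inputs):
    """condition: list of (centre, spread) pairs; inputs: list of floats."""
    return all(c - s <= x <= c + s for (c, s), x in zip(condition, inputs))

cond = [(0.5, 0.2), (0.8, 0.1)]     # hypothetical two-attribute condition
print(matches(cond, [0.45, 0.85]))  # True: both inputs inside their intervals
print(matches(cond, [0.45, 0.95]))  # False: 0.95 lies outside [0.7, 0.9]
```

      <p>Mutation and crossover in XCSR then operate on the centres and spreads directly, rather than on ternary alleles.</p>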
      <p>Preliminary investigations in [25] have shown that XCSR can control
nonlinear systems of order 2 by appealing to adaptive action mapping [26]. Further
work is required to overcome issues of scaling up and the action-chain learning
problem, which is intrinsic to reinforcement learning in multi-step problems.</p>
      <p>The RuleML hub for interoperating structured knowledge [27] considers
condition-action rules, similar to those found in rule-based machine learning
and LCS. This opens up the possibility of `web-based AI' types of
architectures and applications.</p>
      <p>Another possible direction for future work concerns evolving DMN rules using
XCS. DMN-based rules [28] are expressive but may become extremely complex
for a human to construct. Using XCS to evolve DMN rules could produce complex
and adaptive rule sets without the need for human intervention in the system.
</p>
      <p>10. J. Filip and T. Kliegr, "Classification based on associations (CBA) - A performance
analysis," in Proceedings of the RuleML+RR 2018 Challenge, 2018.
11. T. Dean, Network+ Guide to Networks. Course Technology Cengage Learning, 2013.
12. M. Kuhn and K. Johnson, Applied Predictive Modeling. Springer, NY, USA, 2016.
13. G. James, D. Witten, T. Hastie, and R. Tibshirani, An Introduction to Statistical
Learning: with Applications in R. Springer, New York, NY, USA, 2015.
14. S. W. Wilson, "Get real! XCS with continuous-valued inputs," in Learning
Classifier Systems, pp. 209-219, Springer, 2000.
15. S. W. Wilson, "Classifier fitness based on accuracy," Evolutionary Computation,
vol. 3, pp. 149-175, June 1995.
16. S. W. Wilson, "Generalization in the XCS classifier system," in 3rd Annual Genetic
Programming Conference (GP-98), 1998.
17. R. J. Urbanowicz and J. H. Moore, "ExSTraCS 2.0: description and evaluation of
a scalable learning classifier system," Evolutionary Intelligence, vol. 8, no. 2-3,
pp. 89-116, 2015.
18. K. Deb, "Multi-objective optimisation using evolutionary algorithms: An
introduction," in Multi-objective Evolutionary Optimisation for Product Design and
Manufacturing, pp. 3-34, Springer, 2011.
19. M. R. Karlsen and S. Moschoyiannis, "Customer Segmentation of Wireless
Trajectory Data," Technical Report, University of Surrey, 2019.
20. M. R. Karlsen and S. K. Moschoyiannis, "Optimal control rules for random Boolean
networks," in International Workshop on Complex Networks and their
Applications, pp. 828-840, Springer, 2018.
21. S. W. Wilson, "Compact rulesets from XCSI," in International Workshop on
Learning Classifier Systems, pp. 197-208, Springer, 2001.
22. R. J. Urbanowicz and J. H. Moore, "ExSTraCS 2.0: description and evaluation of
a scalable learning classifier system," Evolutionary Intelligence, vol. 8, no. 2-3,
pp. 89-116, 2015.
23. J. D. Sterman, Business Dynamics: Systems Thinking and Modeling for a Complex
World. McGraw-Hill Higher Education, 2000.
24. S. Moschoyiannis, N. Elia, A. Penn, D. J. B. Lloyd, and C. Knight, "A
web-based tool for identifying strategic intervention points in complex systems," in
Proceedings of the Games for the Synthesis of Complex Systems (CASSTING'16
@ ETAPS 2016), vol. 220 of EPTCS, pp. 39-52, 2016.
25. V. B. Georgiev, M. R. Karlsen, L. Schoenenberger, and S. Moschoyiannis, "On
controlling non-linear systems with XCSR," in 13th SICC Workshop - Complexity
and the City, 2018.
26. M. Nakata, P. L. Lanzi, and K. Takadama, "Selection strategy for XCS with adaptive action mapping,"
in 15th Annual Conference on Genetic and Evolutionary Computation, 2016.
27. H. Boley, "The RuleML Knowledge-Interoperation Hub," in RuleML 2016, LNCS
9718, pp. 19-33, Springer, 2016.
28. D. Calvanese, M. Dumas, F. M. Maggi, and M. Montali, "Semantic DMN:
Formalizing decision models with domain knowledge," in RuleML+RR 2017, LNCS
10364, pp. 70-86, Springer, 2017.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>K.</given-names>
            <surname>Sha</surname>
          </string-name>
          and
          <string-name>
            <given-names>H.</given-names>
            <surname>Abbass</surname>
          </string-name>
          , \
          <article-title>A survey of learning classifier systems in games,"</article-title>
          <source>IEEE Computational Intelligence Magazine</source>
          , vol.
          <volume>12</volume>
          , pp.
          <volume>42</volume>
          {
          <issue>55</issue>
          , 02
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>W. N. L.</given-names>
            <surname>Browne</surname>
          </string-name>
          , \
          <article-title>The development of an industrial learning classifier system for data-mining in a steel hot strip mill,"</article-title>
          <source>in Applications of Learning Classifier Systems</source>
          , pp.
          <volume>223</volume>
          {
          <issue>259</issue>
          , Springer Berlin,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>P.</given-names>
            <surname>Krause</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Razavi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Moschoyiannis</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Marinos</surname>
          </string-name>
          , \
          <article-title>Stability and complexity in digital ecosystems,"</article-title>
          <source>in IEEE Digital Ecosystems and Technologies (IEEE DEST)</source>
          , pp.
          <volume>85</volume>
          {
          <issue>90</issue>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>M. R.</given-names>
            <surname>Karlsen</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Moschoyiannis</surname>
          </string-name>
          , \
          <article-title>Learning condition-action rules for personalised journey recommendations,"</article-title>
          <source>in RuleML + RR</source>
          <year>2018</year>
          ,
          <article-title>LNCS 11092</article-title>
          , pp.
          <volume>293</volume>
          {
          <issue>301</issue>
          , Springer,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>M. R.</given-names>
            <surname>Karlsen</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Moschoyiannis</surname>
          </string-name>
          , \
          <article-title>Evolution of control with learning classifier systems,"</article-title>
          <source>Applied Network Science</source>
          , vol.
          <volume>3</volume>
          , pp.
          <volume>3</volume>
          {
          <issue>30</issue>
          ,
          Aug
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>M. V.</given-names>
            <surname>Butz</surname>
          </string-name>
          and S. W. Wilson, \
          <article-title>An algorithmic description of XCS,"</article-title>
          <source>Soft Computing</source>
          , vol.
          <volume>6</volume>
          , no.
          <issue>3</issue>
          , pp.
          <volume>144</volume>
          {
          <issue>153</issue>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>R. J.</given-names>
            <surname>Urbanowicz</surname>
          </string-name>
          and
          <string-name>
            <given-names>W. N.</given-names>
            <surname>Browne</surname>
          </string-name>
          ,
          <article-title>Introduction to learning classifier systems</article-title>
          . Berlin, Germany. Springer,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>M. V.</given-names>
            <surname>Butz</surname>
          </string-name>
          and S. W. Wilson, \
          <article-title>An algorithmic description of XCS,"</article-title>
          <source>in International Workshop on Learning Classifier Systems</source>
          , pp.
          <volume>253</volume>
          {
          <issue>272</issue>
          , Springer,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>E.</given-names>
            <surname>Bernado-Mansilla</surname>
          </string-name>
          and
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Garrell-Guiu</surname>
          </string-name>
          , \
          <article-title>Accuracy-based learning classifier systems: Models, analysis and applications to classification tasks,"</article-title>
          <source>Evolutionary Computation</source>
          , vol.
          <volume>11</volume>
          , no.
          <issue>3</issue>
          , pp.
          <volume>209</volume>
          {
          <issue>238</issue>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>