<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Self-Adaptive Ensemble Classifier for Handling Complex Concept Drift</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Imen Khamassi</string-name>
          <email>imen.khamassi@isg.rnu.tn</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Moamar Sayed-Mouchaweh</string-name>
          <email>moamar.sayed-mouchaweh@mines-douai.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Ecole des Mines Douai</institution>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Universite de Tunis, Institut Superieur de Gestion de Tunis</institution>
          ,
          <country country="TN">Tunisia</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In an increasing number of real-world applications, data are presented as streams that may evolve over time; this is known as concept drift. Handling concept drift through ensemble classifiers has received great interest in recent decades. The success of these ensemble methods relies on their diversity. Accordingly, various diversity techniques can be used, such as block-based data, weighting-data or filtering-data. Each of these diversity techniques is efficient at handling certain characteristics of drift. However, when the drift is complex, they fail to handle it efficiently. Complex drifts may present a mixture of several characteristics (speed, severity, influence zones in the feature space, etc.) which may vary over time. In this case, drift handling is more complicated and requires new detection and updating tools. For this purpose, a new ensemble approach, namely EnsembleEDIST2, is presented. It combines the three diversity techniques in order to benefit from their advantages and overcome their limits. Additionally, it makes use of EDIST2, as a drift detection mechanism, in order to monitor the ensemble's performance and detect changes. EnsembleEDIST2 was tested through different scenarios of complex drift generated from synthetic and real datasets. This diversity combination allows EnsembleEDIST2 to outperform similar ensemble approaches in terms of accuracy rate, and to present stable behavior in handling different scenarios of complex drift.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Learning from evolving data streams has received great attention. It
addresses the non-stationarity of data over time, which is known as concept drift.
The term concept refers to the data distribution, represented by the joint distribution
p(x, y), where x represents the n-dimensional feature vector and y represents
its class label. The term concept drift refers to a change in the underlying
distribution of new incoming data. For example, in an intrusion detection application,
the behavior of an intruder may evolve in order to confuse the system protection
rules. Hence, it is essential to consider these changes when updating the system in
order to preserve its performance.</p>
      <p>
        Ensemble classifiers appear to be promising approaches for tracking evolving
data streams. The success of ensemble methods, compared to a single classifier,
relies on their diversity [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ] [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]. Diversity can be achieved according to three
main strategies [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]: block-based data, weighting-data or filtering-data. In
block-based ensembles [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ], the training set is presented as blocks or chunks
of data at a time. Generally, these blocks are of equal size and the evaluation
of base learners is done when all instances from a new block are available. In
weighting-data ensembles [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], the instances are weighted according
to some weighting process. For example, in Online Bagging [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ], the weighting
process is based on re-using instances for training individual learners. Finally,
filtering-data ensembles [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] are based on selecting data from the training set
according to a specific criterion, for example similarity in the feature space.
      </p>
      <p>In many real-life applications, the concept drift may be complex in the sense
that it presents time-varying characteristics. For instance, a drift can present
different characteristics according to its speed (abrupt or gradual), nature
(continuous or probabilistic) and severity (local or global). Accordingly, a complex drift
can present a mixture of all these characteristics over time. It is worth
underlining that each characteristic presents its own challenges. A mixture
of these different characteristics may therefore accentuate these challenges and
complicate the drift handling.</p>
      <p>
        In this paper, the goal is to underline the complementarity of the diversity
techniques (block-based data, weighting-data and filtering-data) for handling
different scenarios of complex drift. For this purpose, a new ensemble approach,
namely EnsembleEDIST2, is proposed. The intuition is to combine these three
diversity techniques in order to efficiently handle different scenarios of complex
drift. Firstly, EnsembleEDIST2 defines a data-block with variable size for
updating the ensemble's members; thus it avoids the problem of tuning the size
of the data-block. Secondly, it defines a new filtering criterion for selecting the
most representative data of the new concept. Thirdly, it applies a new
weighting process in order to create diversified ensemble members. Finally, it makes
use of EDIST2 [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], as a drift detection mechanism, in order to monitor the
ensemble's performance and detect changes.
      </p>
      <p>EnsembleEDIST2 has been tested through different scenarios of complex
drift generated from synthetic and real datasets. This diversity combination
allows EnsembleEDIST2 to outperform similar ensemble approaches in terms of
accuracy rate, and to present stable behavior in handling different scenarios of
complex drift.</p>
      <p>The remainder of the paper is organized as follows. In Section II, the
challenges of complex concept drift are presented. In Section III, the advantages and
the limits of each diversity technique are studied. In Section IV, the proposed
approach, namely EnsembleEDIST2, is detailed. In Section V, the experimental setup
and the obtained results are presented. Finally, in Section VI, the conclusion and
some future research directions are given.</p>
    </sec>
    <sec id="sec-2">
      <title>Complex Concept Drift</title>
      <p>In many real-life applications, the concept drift may be complex in the sense
that it presents time-varying characteristics. Let us take the example of a drift
with three different characteristics according to its speed (gradual or abrupt),
nature (continuous or probabilistic) and severity (local or global). It is worth
underlining that each characteristic presents its own challenges. Accordingly,
a mixture of these different characteristics may accentuate these challenges
and complicate the drift handling.</p>
      <p>For instance, we can consider the drift depicted in Fig. 1 as a complex drift, as
it simulates a Gradual Continuous Local Drift, in the sense that the hyperplane
class boundary is gradually rotating during the drifting phase and continuously
presenting changes with each instance in local regions. The time until
this complex drift is detected can be arbitrarily long. This is due to the scarcity
of data representing the drift, which in turn makes it difficult to confirm
the presence of drift. Moreover, in some cases, this drift can be mistaken for
noise, which makes the model unstable. Hence, to overcome the
instability, the model has to (i) effectively differentiate between local changes
and noise, and (ii) deal with the scarcity of instances that represent the drift
in order to effectively update the learner.</p>
      <p>Another interesting complex drift is the Gradual Continuous Global
Drift (see Fig. 2). During this drift, the concept is gradually changing and
continuously presenting modifications with each instance. Namely, during the
transition phase, the drift evolves and presents several intermediate concepts until the
emergence of the final concept (see Fig. 2.b). Hence, the challenge is to
efficiently decide the end time of the old concept and detect the start time of the
new concept. The objective is to update the learner with the data that represent
the final concept (see Fig. 2.c) and not with data collected during the concept
evolution (see Fig. 2.b). Moreover, this drift is considered global because it
affects all the instances of the drifting class. Handling this complex
drift is also challenging because the decrease in the learner's performance is
more pronounced than for the other types of drift.</p>
      <p>
        The diversity [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] among the ensemble can be achieved by applying various
techniques, such as block-based data, weighting-data or filtering-data, in order
to train the base learners differently (see Fig. 3). Accordingly, the objective in this
investigation is to highlight the advantages and drawbacks of each diversity
technique in handling complex drift (see Table 1).
      </p>
      <p>
        According to the block-based technique, the training set is presented as blocks
or chunks of data at a time. Generally, these blocks are of equal size and the
construction, evaluation, or updating of base learners is done when all instances
from a new block are available. Very often, ensemble learners periodically
evaluate their components and substitute the weakest one with a new (candidate)
learner after each data block [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. This technique preserves the
adaptability of the ensemble in such a way that learners trained on recent
blocks are the most suitable for representing the current concept.
      </p>
      <p>The block-based ensembles are suitable for handling gradual drifts. Generally,
during these drifts, the change between consecutive data blocks is not very
pronounced; thus, it is only noticeable over a long period. The interesting point
in the block-based ensembles is that they can enclose different learners that are
trained in different periods of time. Hence, by aggregating the outputs of these
base classifiers, the ensemble can offer accurate reactions to such gradual drifts.</p>
      <p>In contrast, the main drawback of block-based ensembles is the difficulty of
tuning the block size to offer a compromise between fast reactions to drifts
and high accuracy. If the block size is too large, they may react slowly to abrupt
drift, whereas a small block size can damage the performance of the ensemble in stable
periods.</p>
      <p>[Fig. 3: diversity techniques among the ensemble: (a) Block-based, (b) Weighting-data, (c) Filtering-data. A data stream feeds the base classifiers C1, C2 and C3 of the ensemble classifier.]</p>
      <p>
        In the weighting-data technique, the base learners are trained according to weighted
instances from the training set. A popular instance weighting process is presented
in the Online Bagging ensemble [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. For ease of understanding, the weighting
process can be viewed as re-using instances for training individual classifiers. Namely,
if we consider that each base classifier Ci is trained from a subset Mi of the
global training set, then each instance is presented k times in Mi, where
the weight k is drawn from a Poisson(1) distribution.
      </p>
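      <p>The Poisson(1) re-use of instances described above can be illustrated with a minimal sketch. It is not the Online Bagging implementation of the cited work: the helper names <monospace>poisson_sample</monospace> and <monospace>online_bagging_subsets</monospace>, the fixed seed and the integer stand-ins for instances are assumptions made only for illustration.</p>
      <preformat><![CDATA[
```python
import math
import random


def poisson_sample(rng, lam=1.0):
    """Draw one Poisson(lam) sample with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, product = 0, rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k


def online_bagging_subsets(stream, n_learners, seed=0):
    """Build one training multiset Mi per base classifier Ci.

    Each incoming instance is copied k times into Mi, with k ~ Poisson(1),
    so every base classifier sees a differently weighted view of the stream.
    """
    rng = random.Random(seed)
    subsets = [[] for _ in range(n_learners)]
    for instance in stream:
        for subset in subsets:
            k = poisson_sample(rng)
            subset.extend([instance] * k)
    return subsets


# Hypothetical stream: the integers 0..9999 stand in for labeled instances.
subsets = online_bagging_subsets(range(10000), n_learners=3)
sizes = [len(m) for m in subsets]
```
]]></preformat>
      <p>Because the mean of a Poisson(1) variable is 1, each subset ends up with roughly as many entries as the stream has instances, while the individual repetition counts differ from one base classifier to another.</p>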
      <p>
        Online Bagging has inspired many researchers in the field of drift tracking
[
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. This approach can be of great interest for:
      </p>
      <list list-type="bullet">
        <list-item><p>Class imbalance: where some classes are severely underrepresented in the dataset.</p></list-item>
        <list-item><p>Local drift: where changes occur in only some regions of the instance space.</p></list-item>
      </list>
      <p>Generally, the weighting process intensifies the re-use of underrepresented class
data and helps to deal with the scarcity of instances that represent the local
drift. However, the instance duplication may impair the ability of the ensemble
to handle global drift. During global drift, the change affects a large amount of
data; thus, when data are re-used for constructing base classifiers, the decrease in
performance is accentuated and the recovery from the drift may be delayed.
      </p>
      <sec id="sec-2-1">
        <title>Filtering-data Technique</title>
        <p>This technique is based on selecting data from the training set according to
a specific criterion, for example similarity in the feature space. Such a technique
makes it possible to select subsets of attributes that provide partitions of the training set
containing maximally similar instances, i.e., instances belonging to the same
regions of the feature space. Thanks to this technique, base learners are trained
on different subspaces in order to benefit from different characteristics of
the overall feature space.</p>
        <p>In contrast with conventional approaches, which detect drift in the overall
distribution without specifying which feature has changed, ensemble learners
based on filtered data can specify exactly which feature is drifting. This is a desirable
property for detecting novel class emergence or existing class fusion in unlabeled
data. However, these approaches may have difficulty in handling local drifts
if they do not define an efficient filtering criterion. It is worth underlining that
during local drift, only some regions of the feature space are affected by the
drift. Hence, only the base classifier trained on the changing region is
accurate enough to handle the drift. However, when the decision
of this classifier is aggregated with those of the remaining classifiers, trained on unchanged regions,
the performance recovery may be delayed.</p>
        <p>The intuition behind EnsembleEDIST2 is to combine the three diversity
techniques (block-based, weighting-data and filtering-data) in order to benefit
from their advantages and avoid their drawbacks.</p>
        <p>
          The contributions of EnsembleEDIST2 for efficiently handling complex
concept drifts are as follows. It:
        </p>
        <list list-type="bullet">
          <list-item>
            <p>
              Explicitly handles drift through the drift detection method EDIST2 [
              <xref ref-type="bibr" rid="ref14">14</xref>
              ] (Subsection 4.1)
            </p>
          </list-item>
          <list-item><p>Makes use of a data-block with variable size for updating the ensemble's members (Subsection 4.2)</p></list-item>
          <list-item><p>Defines a new filtering criterion for selecting the most representative data of the new concept (Subsection 4.3)</p></list-item>
          <list-item><p>Applies a new weighting process in order to create diversified ensemble members (Subsection 4.4)</p></list-item>
        </list>
        <p>
          EnsembleEDIST2 is an ensemble classifier designed to explicitly handle drifts.
It makes use of EDIST2 [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ], as a drift detection mechanism, in order to monitor
the ensemble's performance and detect changes (see Fig. 4).
        </p>
        <p>EDIST2 monitors the prediction feedback provided by the ensemble. More
precisely, EDIST2 studies the distance between two consecutive classification
errors, where the distance is the number of instances
between the two errors. Accordingly, when the data
distribution becomes non-stationary, the ensemble commits many more errors
and the distance between these errors decreases.</p>
        <p>In EDIST2, the concept drift is tracked through two data windows, a 'global'
one and a 'current' one. The global window WG is a self-adaptive window which
is continuously incremented if no drift occurs and decremented otherwise. The
current window W0 represents the batch of currently collected instances.</p>
        <p>In EDIST2, we want to estimate the error distance distributions of WG and W0
and compare the averages of these distributions
in order to check for a difference. As stated before, a significant decrease in the error
distance implies a change in the data distribution and suggests that the learning
model is no longer appropriate.</p>
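        <p>As an illustration of the quantity being monitored, the following minimal sketch computes error distances from a prediction feedback sequence and compares the averages of two windows. It assumes that the distance is taken as the difference between the positions of two consecutive errors; the function names and the two synthetic feedback sequences are hypothetical, and the statistical hypothesis test used by EDIST2 is not reproduced here.</p>
        <preformat><![CDATA[
```python
def error_distances(is_error):
    """Distances (in instances) between consecutive classification errors."""
    positions = [i for i, wrong in enumerate(is_error) if wrong]
    return [b - a for a, b in zip(positions, positions[1:])]


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Hypothetical prediction feedback: True marks a misclassified instance.
stable = [i % 10 == 9 for i in range(100)]   # one error every 10 instances
drifting = [i % 3 == 2 for i in range(30)]   # one error every 3 instances

mu_global = mean(error_distances(stable))     # plays the role of WG
mu_current = mean(error_distances(drifting))  # plays the role of W0
mu_d = mu_global - mu_current                 # large value: errors got closer
```
]]></preformat>
        <p>In this toy case the average error distance drops from 10 to 3, which is the kind of decrease that the comparison between WG and W0 is meant to expose.</p>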
        <p>
          EDIST2 makes use of a statistical hypothesis test in order to compare the WG and
W0 error distance distributions and check whether their averages differ by more
than the threshold ε. It is worth underlining that there is no a priori definition of
the threshold ε, in the sense that it does not require any a priori adjustment related
to the expected speed or severity of the change. Instead, ε is autonomously adapted
according to a statistical hypothesis test (for more details, please refer to [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ]).
        </p>
        <p>The intuition behind EDIST2 is to monitor μd, which represents the difference
between the WG and W0 averages. Accordingly, three levels are defined:</p>
        <list list-type="bullet">
          <list-item><p>In-control level: μd ≤ ε. Within this level, we confirm that there is no change
between the two distributions, so we enlarge WG by adding the instances of W0.
Accordingly, all the ensemble members are incrementally updated with the data
samples in WG and W0.</p></list-item>
          <list-item><p>Warning level: μd &gt; ε. Within this level, the instances are stored in a
warning chunk Wwarning. Accordingly, all the ensemble members are incrementally updated
with weighted data from Wwarning. (The weighting process is
explained in Subsection 4.4.)</p></list-item>
          <list-item><p>Drift level: μd &gt; ε + σd. Within this level, the drift is confirmed and WG
is decremented so that it only contains the instances stored since the warning
level, i.e., those in Wwarning. Additionally, a new base classifier is created from
scratch and trained on the data samples in Wwarning; then the oldest
classifier is removed from the ensemble.</p></list-item>
        </list>
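        <p>The three levels can be summarized as a small decision function. This is only a sketch under stated assumptions: <monospace>mu_d</monospace> stands for the monitored difference between the WG and W0 averages, <monospace>epsilon</monospace> for the adaptive threshold and <monospace>sigma_d</monospace> for the extra margin that separates the warning level from the drift level. All three are taken as given inputs here; how EDIST2 derives them is described in [14] and is not reproduced. The function name and the action strings are hypothetical.</p>
        <preformat><![CDATA[
```python
def edist2_level(mu_d, epsilon, sigma_d):
    """Map the monitored difference to one of the three levels."""
    if mu_d > epsilon + sigma_d:
        return "drift"
    if mu_d > epsilon:
        return "warning"
    return "in-control"


# Illustrative summary of the update applied at each level.
ACTIONS = {
    "in-control": "add W0 to WG and update all members incrementally",
    "warning": "store instances in Wwarning and update members with weighted data",
    "drift": "shrink WG to Wwarning, train a new member, remove the oldest one",
}

level = edist2_level(mu_d=7.0, epsilon=2.0, sigma_d=1.5)
action = ACTIONS[level]
```
]]></preformat>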
      </sec>
      <sec id="sec-2-2">
        <title>EnsembleEDIST2's diversity by variable-sized block technique</title>
        <p>In EnsembleEDIST2, the size of the data-block is not defined according to the
number of instances, as is the case in conventional block-based ensembles, but
according to the number of errors committed during the learning process. More
precisely, the data-block W0 in EnsembleEDIST2 is constructed by collecting
the instances that exist between N0 errors.</p>
        <p>As depicted in Fig. 5, when the drift is abrupt, the ensemble commits N0
errors in a short drifting time. However, when the drift is gradual, the ensemble
commits N0 errors in a relatively longer drifting time. Hence, according to this
strategy, the block size is variable and adjusted according to the drift characteristics.</p>
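        <p>A minimal sketch of this error-driven block construction is given below. It assumes that a block is closed as soon as its N0-th error is observed; the function name and the two synthetic feedback sequences (dense errors for an abrupt drift, sparse errors for a gradual drift) are hypothetical.</p>
        <preformat><![CDATA[
```python
def error_driven_blocks(is_error, n0):
    """Split a stream into blocks, each closed when its n0-th error occurs.

    Returns the list of block sizes (number of instances per block).
    """
    blocks, size, errors = [], 0, 0
    for wrong in is_error:
        size += 1
        if wrong:
            errors += 1
        if errors == n0:
            blocks.append(size)
            size, errors = 0, 0
    return blocks


# Hypothetical feedback: True marks a misclassified instance.
abrupt = [(i + 1) % 2 == 0 for i in range(12)]    # one error every 2 instances
gradual = [(i + 1) % 5 == 0 for i in range(30)]   # one error every 5 instances

abrupt_blocks = error_driven_blocks(abrupt, n0=3)    # short blocks
gradual_blocks = error_driven_blocks(gradual, n0=3)  # longer blocks
```
]]></preformat>
        <p>With the same N0, the dense-error stream yields blocks of 6 instances and the sparse-error stream yields blocks of 15 instances, so the block size follows the error rate without a manually tuned size.</p>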
        <p>It is worth underlining that EnsembleEDIST2 can offer a compromise
between fast reaction to abrupt drift and stable behavior regarding gradual drift.
This is a desirable property for handling complex drift, which may present
different characteristics at the same time. Accordingly, EnsembleEDIST2
avoids the problem of tuning the size of the data-block, which affects most
block-based approaches.</p>
        <p>[Fig. 5 panels: (a) Abrupt drift; (b) Gradual drift.]</p>
        <p>Differently from conventional filtering-data ensembles, which filter data
according to similarity in the feature space, EnsembleEDIST2 defines a new
filtering criterion: it filters the instances that trigger the warning level. More precisely,
each time the ensemble reaches the warning level, the instances are gathered in
a warning chunk Wwarning in order to re-use them for training the ensemble's
members (see Fig. 6.a). This is an interesting point when dealing with local drift
because drifting data are scarce and not continuously provided. It is possible
that a certain amount of drifting data can be found in zones (1), (2), (3) and
(4), but not enough to reach the drift level. Accordingly, by considering
these data for updating the ensemble's members, EnsembleEDIST2 can ensure
a rapid recovery from local drift.</p>
        <p>In contrast, conventional filtering-data ensembles are unable to define in
which zone the drift has occurred; thus, they may update the ensemble's
members with data filtered from unchanged regions of the feature space, which in turn may delay
the performance recovery.</p>
      </sec>
      <sec id="sec-2-3">
        <title>EnsembleEDIST2's diversity by new weighting-data process</title>
        <p>
          The focus in EnsembleEDIST2 is to maximize the use of the data present in
Wwarning for accurately updating the ensemble. More precisely, the data in
Wwarning are weighted according to the same weighting process used in
Online Bagging [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ]. Namely, each instance from Wwarning is re-used k times for
training the base classifier Ci, where the weight k is drawn from a Poisson(1)
distribution (see Appendix 7).
        </p>
        <p>Generally, the weighting process in EnsembleEDIST2 offers two
advantages. First, it intensifies the re-use of underrepresented class data and helps to
deal with the scarcity of instances that represent the local drift. Second, it permits
faster recovery from global drift than conventional weighting-data ensembles. As
is known, during global drift, the change affects a large amount of data. Hence,
differently from conventional weighting-data ensembles, which apply the
weighting process to all the data, EnsembleEDIST2 only weights the instances
present in Wwarning (see Fig. 6.b). Accordingly, it avoids accentuating the
decrease of the ensemble's performance during global drift, and ensures a fast
recovery.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Experiments and performance analysis</title>
      <sec id="sec-3-1">
        <title>Experimental evaluation</title>
        <p>
          Synthetic Datasets. In this investigation, we study six different
scenarios of complex concept drift, as depicted in Table 2. All synthetic datasets
contain 100,000 instances and one concept drift whose starting and
ending times are predefined. For gradual drift, the drifting time lasts 30,000 instances
(it begins at tstart = 40,000 and ends at tend = 70,000). For abrupt drift, the drift
occurs at t = 50,000.
        </p>
        <p>
          Electricity Dataset (48,312 instances, 8 attributes, 2 classes) is a real-world
dataset from the Australian New South Wales Electricity Market [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]. In this
electricity market, the prices are not fixed and may be affected by demand and
supply. The dataset covers a period of two years and the instances are recorded
every half hour. The classification task is to predict a rise (UP) or a fall
(DOWN) in the electricity price. Three numerical features are used to define the
feature space: the electricity demand in the current region, the electricity demand
in the adjacent regions and the schedule of electricity transfer between the two
regions.
        </p>
        <p>This dataset may present several scenarios of complex drift. For instance, a
gradual continuous drift may occur when the users progressively change their
consumption habits over a long time period. Likewise, an abrupt drift may
occur when the electricity prices suddenly increase due to unexpected events
(e.g., political crises or natural disasters). Moreover, the drift can be local if
it impacts only one feature (e.g., the electricity demand in the current region), or
global if it impacts all the features.</p>
        <p>
          Spam Dataset (9,324 instances, 500 attributes, 2 classes) is a real-world dataset
containing email messages from the Spam Assassin Collection Project [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. The
classification task is to predict whether a mail is spam or legitimate. The dataset
contains 20% spam messages. The feature space is defined by a set of numerical
features such as the number of recipients, textual attributes describing the mail
content, sender characteristics, etc.
        </p>
        <p>This dataset may present several scenarios of complex drift. For instance,
a gradual drift may occur when the user progressively changes their preferences,
whereas an abrupt drift may occur when the spammer rapidly changes the mail
content to trick the spam filter rules. It is worth underlining that the drift can
also be continuous, when the spammer starts to change the spam content but
the filter continues to detect it correctly. On the other hand, the drift can be
probabilistic, when the spammer starts to change the spam content and the filter
fails to detect some of it.</p>
        <p>
          Evaluation criteria. When dealing with evolving data streams, the objective
is to study the evolution of the EnsembleEDIST2 performance over time and see
how quickly it adapts to drift. According to Gama et al. [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ], the prequential
accuracy is a suitable metric to evaluate the learner performance in the presence of
concept drift. It proceeds as follows: each instance is first used for testing, then
for training. Hence, the accuracy is incrementally updated using the maximum
available data, and the model is continuously tested on instances that it has not
already seen (for more details, please refer to [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]).
        </p>
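        <p>The test-then-train procedure can be sketched as follows. This is an illustration only and not the MOA evaluator used in the experiments: the toy <monospace>MajorityClassLearner</monospace>, the function name and the four-label stream are assumptions, and the accuracy is a plain running average without any fading factor.</p>
        <preformat><![CDATA[
```python
from collections import Counter


class MajorityClassLearner:
    """Toy incremental learner: predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        if not self.counts:
            return None  # nothing learned yet
        return self.counts.most_common(1)[0][0]

    def learn(self, x, y):
        self.counts[y] += 1


def prequential_accuracy(stream, learner):
    """Test on each instance first, then train on it; return running accuracy."""
    correct, history = 0, []
    for seen, (x, y) in enumerate(stream, start=1):
        if learner.predict(x) == y:
            correct += 1
        learner.learn(x, y)
        history.append(correct / seen)
    return history


# Hypothetical stream of (features, label) pairs; features are unused here.
labels = [1, 1, 0, 1]
history = prequential_accuracy([(None, y) for y in labels], MajorityClassLearner())
```
]]></preformat>
        <p>Each instance contributes to the accuracy before the learner has seen its label, so the running value reflects performance on unseen data only.</p>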
        <p>
          Parameter Settings. All the tested approaches were implemented in the Java
programming language by extending the Massive Online Analysis (MOA)
software [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. MOA is an online learning framework for evolving data streams and
supports a collection of machine learning methods.
        </p>
        <p>
          For comparison, we have selected well-known ensemble approaches from
each category:
        </p>
        <list list-type="bullet">
          <list-item>
            <p>
              Block-based ensembles: AUE (Accuracy Updated Ensemble) [
              <xref ref-type="bibr" rid="ref5">5</xref>
              ], AWE (Accuracy Weighted Ensemble) [
              <xref ref-type="bibr" rid="ref16">16</xref>
              ] and LearnNSE [
              <xref ref-type="bibr" rid="ref20">20</xref>
              ], with block size equal to 500 instances.
            </p>
          </list-item>
          <list-item>
            <p>
              Weighting-data ensembles: LeveragingBag [
              <xref ref-type="bibr" rid="ref3">3</xref>
              ] and OzaBag [
              <xref ref-type="bibr" rid="ref19">19</xref>
              ].
            </p>
          </list-item>
          <list-item>
            <p>
              Filtering-data ensemble: LimAttClass [
              <xref ref-type="bibr" rid="ref1">1</xref>
              ].
            </p>
          </list-item>
        </list>
        <p>
          For all these approaches, the ensemble size was fixed to 10 and the Hoeffding
Tree (HT) [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] was used as the base learning algorithm.
        </p>
        <p>It is worth noting that EnsembleEDIST2 makes use of two parameters: N0,
which is the number of errors in W0, and m, which is the number of base classifiers
in the ensemble. In this investigation, we set N0 = 30 and
m = 3, respectively, according to the empirical studies reported in Subsection 5.2.</p>
      </sec>
      <sec id="sec-3-2">
        <title>Comparative study and interpretation</title>
      </sec>
      <sec id="sec-3-3">
        <title>Impact of N0 on EnsembleEDIST2 performance</title>
        <p>EnsembleEDIST2 makes use of the parameter N0 in order to define the minimum number of errors occurring
in W0. Recall that W0 represents the batch of currently collected instances. This
batch is constructed by collecting the instances that exist between N0 errors.</p>
        <p>It is interesting to study the impact of N0 on the accuracy according to
different scenarios of complex drift. For this purpose, we carried out the following
experiment: for each scenario of complex drift, the accuracy of EnsembleEDIST2
is reported for varying values of N0 (see Table 3).</p>
        <p>Based on these results, we can conclude that the performance of
EnsembleEDIST2 in handling different scenarios of complex drift is weakly sensitive
to N0. Hence, we have decided to use N0 = 30, as it achieved the best
accuracy rate in most cases.</p>
      </sec>
      <sec id="sec-3-4">
        <title>Impact of ensemble size on EnsembleEDIST2 performance</title>
        <p>EnsembleEDIST2 makes use of the parameter m in order to define the number of
classifiers in the ensemble. Accordingly, it is interesting to study the impact of
m on the ensemble's performance according to different scenarios of complex drift.</p>
        <p>According to Table 4, it is noticeable that the size of EnsembleEDIST2 does
not significantly impact the performance in handling different scenarios of
complex drift. Hence, we have decided to use m = 3, as it achieved the best accuracy
rate in most cases and it limits the computational complexity of the
ensemble.</p>
        <p>Accuracy of EnsembleEDIST2 vs. other ensembles. Table 5 summarizes
the average prequential accuracy during the drifting phase. The objective of
this experiment is to study the ensemble performance in the presence of different
scenarios of complex drift. Firstly, it is noticeable that EnsembleEDIST2
achieved better results than block-based ensembles in handling different types
of abrupt drift. During abrupt drift (whether local or global),
the change is rapid; thus AUE, AWE and LearnNSE have difficulty in tuning
the block size to offer a compromise between fast reaction to drift and high
accuracy. In contrast, EnsembleEDIST2 is able to autonomously train ensemble
members with a variable amount of data at each step; thus it can efficiently
handle abrupt drift.</p>
        <p>Secondly, it is noticeable that EnsembleEDIST2 outperforms weighting-data
ensembles in handling different categories of global drift. During global drift
(whether continuous, probabilistic or abrupt), the change affects a large amount
of data; thus, when LeveragingBag and OzaBag intensify the re-use of data for
training ensemble members, the decrease in performance is accentuated. In
contrast, EnsembleEDIST2 duplicates only a set of filtered instances for training
the ensemble members, which is why it is more accurate in handling global drift.</p>
        <p>Thirdly, it is noticeable that EnsembleEDIST2 outperforms the filtering-data
ensembles in handling different categories of local drift. During local drift (whether
continuous, probabilistic or abrupt), the change affects a small amount of data;
thus the choice of the filtering criterion is an essential point for efficiently handling
local drift. EnsembleEDIST2 defines a new filtering criterion, which is based on
selecting the data that triggered the warning level. These data are the most
representative of the new concept; thus, training the ensemble's members
on them makes the ensemble more efficient in handling local drift.</p>
        <p>EnsembleEDIST2 has also been tested on real-world data sets that
represent different scenarios of drift. It is worth underlining that these data
sets are relatively small compared to the synthetic ones. Despite the different
characteristics of each real data set, the results are encouraging:
EnsembleEDIST2 has achieved the best accuracy on all the data sets (see
Table 6).</p>
        <p>To sum up, the combination of the three diversity techniques in
EnsembleEDIST2 is beneficial for handling different scenarios of complex drift
at the same time.</p>
        <p>In this paper, we have presented a new study of the role of diversity
within an ensemble. More precisely, we have highlighted the advantages and the
limits of three widely used diversity techniques (block-based data,
weighting-data and filtering-data) in handling complex drift.</p>
        <p>Additionally, we have presented a new ensemble approach, namely
EnsembleEDIST2, which combines these three diversity techniques. The intuition
behind this approach is to handle drifts explicitly by using the drift detection
mechanism EDIST2. Accordingly, the ensemble performance is monitored through a
self-adaptive window. Hence, firstly, EnsembleEDIST2 avoids the problem of tuning
the size of the data batch, a problem faced by most block-based ensemble
approaches, which is a desirable property for handling abrupt drifts. Secondly, it
defines a new filtering criterion, which is based on selecting the data that trigger
the warning level. Thanks to this property, EnsembleEDIST2 is more efficient in
handling local drifts than conventional filtering-data ensembles, which only
filter data according to similarity in the feature space. Thirdly, unlike
the conventional weighting-data ensembles, which apply the weighting process to
the whole data stream, EnsembleEDIST2 only intensifies the re-use of the data
most representative of the new concept, which is a desirable property for
handling global drifts.</p>
        <p>EnsembleEDIST2 has been tested on different scenarios of complex drift.
The results are encouraging: compared to similar approaches, EnsembleEDIST2
has achieved the best accuracy rate on all data sets and presented a stable
behavior in handling different scenarios of complex drift.</p>
        <p>It is worth underlining that, in the present investigation, the ensemble
size, i.e., the number of ensemble members, was fixed. Hence it would be
interesting, in future work, to develop a strategy for dynamically adapting the
ensemble size. The idea is that the ensemble size is kept fixed during stable
periods, whereas it is autonomously adapted during the drifting phase. This may
improve the performance and reduce the computational cost of the ensemble.</p>
        <p>Acknowledgements. The second author acknowledges the support of the
Regional project REPAR, funded by the French Region Hauts-de-France.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>EnsembleEDIST2 pseudo code</title>
      <sec id="sec-4-1">
        <title>Algorithm EnsembleEDIST2</title>
        <preformat>Input:  (x, y): data stream
        N0: number of errors used to construct the window
        m: number of base classifiers
Output: trained ensemble classifier E

for each base classifier Ci from E
    InitializeClassifier(Ci)
end for
WG ← CollectInstances(E, N0)
Wwarning ← ∅
repeat
    W0 ← CollectInstances(E, N0)
    Level ← DetectedLevel(WG, W0)
    switch (Level)
        case 1: In-control
            WG ← WG ∪ W0
            UpdateParameters(WG, W0)
            Increment all ensemble's members of E according to instances in WG
        end case 1
        case 2: Warning
            Wwarning ← Wwarning ∪ W0
            UpdateParameters(Wwarning, W0)
            WeightingDataProcess(E, Wwarning)
        end case 2
        case 3: Drift
            Create a new base classifier Cnew trained on instances in Wwarning
            E ← E ∪ Cnew
            Remove the oldest classifier from E</preformat>
      </sec>
      <sec id="sec-4-2">
        <title>Algorithm DetectedLevel(WG, W0)</title>
        <preformat>Input:  WG: global data window characterized by:
            NG: error number
            μG: error distance mean
            σG: error distance standard deviation
        W0: current data window characterized by:
            N0: error number
            μ0: error distance mean
            σ0: error distance standard deviation
Output: Level: detection level
1.</preformat>
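        <p>The three-way switch that Algorithm EnsembleEDIST2 performs on the level returned by DetectedLevel can be sketched in Python as follows. This is an illustrative sketch under assumptions, not the authors' implementation: handle_level, CountingLearner, the learn method, and the new_classifier and reweight callables are hypothetical names; windows are plain lists of (x, y) pairs; the statistics update (UpdateParameters) is omitted; and the windows are left untouched after a drift because the listing does not show that step.</p>
        <preformat>
```python
from collections import deque

IN_CONTROL, WARNING, DRIFT = 1, 2, 3  # level codes used in the switch


class CountingLearner:
    """Hypothetical stand-in for a base classifier: only counts updates."""

    def __init__(self):
        self.seen = 0

    def learn(self, x, y):
        self.seen += 1


def handle_level(level, ensemble, w_global, w_warning, w_current,
                 new_classifier, reweight):
    """Apply one iteration of the switch, mutating ensemble and windows.

    ensemble is a deque ordered from oldest to newest member.
    UpdateParameters is deliberately left out of this sketch.
    """
    if level == IN_CONTROL:
        w_global.extend(w_current)          # WG = WG U W0
        for clf in ensemble:                # incremental update on WG,
            for x, y in w_global:           # as written in the listing
                clf.learn(x, y)
    elif level == WARNING:
        w_warning.extend(w_current)         # Wwarning = Wwarning U W0
        reweight(ensemble, w_warning)       # WeightingDataProcess
    elif level == DRIFT:
        fresh = new_classifier()            # Cnew trained on Wwarning
        for x, y in w_warning:
            fresh.learn(x, y)
        ensemble.append(fresh)              # E = E U Cnew
        ensemble.popleft()                  # remove the oldest member
```
</preformat>
        <p>Keeping the ensemble in a deque makes "remove the oldest classifier" a constant-time operation and keeps the ensemble size fixed across drifts, which matches the fixed m used in the experiments.</p>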
      </sec>
      <sec id="sec-4-3">
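        <title>Algorithm UpdateParameters(WG, W0)</title>
        <p>Input: WG: Global data window characterized by:
NG: error number</p>
        <disp-formula><tex-math>\mu_G \leftarrow \frac{1}{N_G + N_0}\left(N_G\,\mu_G + N_0\,\mu_0\right)</tex-math></disp-formula>
        <disp-formula><tex-math>N_G \leftarrow N_G + N_0</tex-math></disp-formula>
        <disp-formula><tex-math>\sigma_G \leftarrow \sqrt{\frac{N_G\,\sigma_G^2 + N_0\,\sigma_0^2}{N_G + N_0} + \frac{N_G\,N_0}{(N_G + N_0)^2}\left(\mu_G - \mu_0\right)^2}</tex-math></disp-formula>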
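        <p>The update rules above amount to pooling the mean and standard deviation of two samples. A small Python sketch follows, assuming population (divide-by-N) statistics and assuming that every right-hand side uses the values held before the update. The function merge_window_stats and its argument names are hypothetical.</p>
        <preformat>
```python
import math
import statistics


def merge_window_stats(n_g, mu_g, sigma_g, n_0, mu_0, sigma_0):
    """Pool (count, mean, std) of the global window with the current window.

    All right-hand sides use the pre-update values (an assumption).
    """
    n = n_g + n_0
    mu = (n_g * mu_g + n_0 * mu_0) / n
    var = ((n_g * sigma_g ** 2 + n_0 * sigma_0 ** 2) / n
           + (n_g * n_0) / n ** 2 * (mu_g - mu_0) ** 2)
    return n, mu, math.sqrt(var)


# Usage: the pooled statistics match those of the concatenated samples.
a = [2.0, 4.0, 4.0, 6.0]   # illustrative error distances, global window
b = [1.0, 3.0]             # illustrative error distances, current window
n, mu, sigma = merge_window_stats(
    len(a), statistics.mean(a), statistics.pstdev(a),
    len(b), statistics.mean(b), statistics.pstdev(b),
)
```
</preformat>
        <p>Because only three numbers per window are needed, the global window can grow without storing the individual error distances.</p>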
      </sec>
      <sec id="sec-4-4">
        <title>Algorithm WeightingDataProcess(E, Wwarning)</title>
        <preformat>Input:  E: ensemble classifier
        Wwarning: window of data
Output: E: updated ensemble classifier

for each instance xi from Wwarning
    for each base classifier Ci from E
        k ← poisson(1)
        do k times
            TrainClassifier(Ci, xi)
        end do
    end for
end for</preformat>
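        <p>The weighting step can be illustrated with a short Python sketch in which each member sees each warning-window instance k times, with k drawn from a Poisson distribution of mean 1. The sampler uses Knuth's multiplication method so that only the standard library is needed. The names poisson, weighting_data_process, CountingLearner and learn are hypothetical, and instances are assumed to be (x, y) pairs.</p>
        <preformat>
```python
import math
import random


def poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p > threshold:
            k += 1
        else:
            return k


class CountingLearner:
    """Hypothetical stand-in for a base classifier: only counts updates."""

    def __init__(self):
        self.seen = 0

    def learn(self, x, y):
        self.seen += 1


def weighting_data_process(ensemble, warning_window, rng):
    """Train every member k ~ Poisson(1) times on every warning instance.

    Returns the total number of training calls, for inspection only.
    """
    total = 0
    for x, y in warning_window:
        for clf in ensemble:
            k = poisson(1.0, rng)
            for _ in range(k):
                clf.learn(x, y)
            total += k
    return total


rng = random.Random(0)
ensemble = [CountingLearner() for _ in range(3)]
window = [(i, i % 2) for i in range(2000)]
total = weighting_data_process(ensemble, window, rng)
```
</preformat>
        <p>Since k is drawn independently for each member, the members receive different multisets of the same warning instances, which is where the diversity of this step comes from.</p>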
      </sec>
    </sec>
  </body>
  <back>
  </back>
</article>