<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title></journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Explaining Concept Drift via Neuro-Symbolic Rules</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Pietro Basci</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Salvatore Greco</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Francesco Manigrasso</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tania Cerquitelli</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lia Morra</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Control and Computer Engineering</institution>
          ,
          <addr-line>Politecnico di Torino</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <volume>000</volume>
      <fpage>0</fpage>
      <lpage>0001</lpage>
      <abstract>
        <p>Concept drift in machine learning refers to changes in the underlying data distribution over time, which can lead to a degradation in the performance of predictive models. Although many methods have been proposed to detect and adapt to concept drift, effective methods to explain it in a human-understandable manner remain lacking. To address this, we propose the use of neuro-symbolic rules to explain the reason for drift. We applied recent rule extraction methods to convolutional neural networks (CNNs) to shed light on the model's internal behavior and promote interpretability of the outputs, while also proposing two novel automated approaches for semantic kernel labeling. We conducted preliminary experiments to assess the applicability and effectiveness of these rules in explaining concept drift, and the efficacy of the kernel labeling strategies. Under the optimality assumption, our method was able to extract rules that can facilitate the identification of the causes of drift, through either rule inspection or analysis of antecedent activation frequencies. Moreover, the proposed strategies for kernel labeling offer more reliable and scalable alternatives to the state-of-the-art solutions.</p>
      </abstract>
      <kwd-group>
        <kwd>Neuro-Symbolic AI</kwd>
        <kwd>Explainable AI</kwd>
        <kwd>Concept Drift</kwd>
        <kwd>Data Drift</kwd>
        <kwd>Explainable Concept Drift</kwd>
        <kwd>Trustworthy AI</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>1. Methodology (Section 3): We propose the use of neuro-symbolic rules to explain the reasons for
concept drift. By applying rule extraction methods to Convolutional Neural Networks (CNNs),
we derive an interpretable set of rules that can be leveraged to formulate hypotheses regarding
the causes of drift. We also introduce two scalable approaches for semantic kernel labeling, which
aim to mitigate the key limitations of current systems.</p>
      <p>2. Experiment (Section 4): We conducted a preliminary experiment using a deep learning classifier
trained to predict the gender of individuals from input images. To simulate drift, we removed
samples of male individuals wearing earrings from the training data and then introduced such
images during inference. This drift led to a noticeable decrease in classifier performance on
the drifted samples (accuracy drop = 0.09). Our method successfully extracts human-readable
explanations in the form of logical rules, which explain the cause of the classifier’s drift and its
performance degradation (e.g., Wearing Earrings ∧ ¬Wearing Lipstick ∧ ¬No Beard → Drift).<sup>1</sup></p>
      <p>We conclude this work by discussing the limitations of our current approach and experiments, as
well as outlining directions for future research (Section 5).</p>
    </sec>
    <sec id="sec-2">
      <title>2. Background and Related Work</title>
      <p>In this section, we first provide background on concept drift and review prior work in the field, with a
particular focus on drift explanation (Section 2.1). We then discuss how to extract rules from neural
networks, as we propose the use of these rules to explain concept drift (Section 2.2).</p>
      <sec id="sec-2-1">
        <title>2.1. Concept Drift</title>
        <p>
          Concept drift in machine learning refers to a change in the underlying data distribution over time,
which can lead to a degradation in model performance [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]. Formally, concept drift can be defined as
a change in the joint distribution over time, and occurs when $P_t(X, y) \neq P_{t+w}(X, y)$, where $X$ are
the feature vectors, $y$ is the target variable, $t$ is a given time point, and $w$ is the time window over
which the distribution shift takes place [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]. Several sub-terms have been defined under concept drift,
such as real drift (changes in $P(y \mid X)$), and virtual or data drift (changes in $P(X)$) [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. Note that,
although concept drift is related to out-of-distribution (OOD) detection [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ], there is a key distinction:
OOD detection is primarily concerned with identifying individual samples that do not conform to the
training feature distribution $P(X)$. In contrast, concept drift operates at the distribution level, often
over a temporal window, and is not strictly limited to changes in $P(X)$. In this paper, however, we
adopt the general term concept drift as a collective term encompassing all such cases.
        </p>
        <p>
          Concept drift detection. Drift detectors aim to identify whether (and when) drift occurs, and quantify its
severity [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. Drift detection techniques can be categorized into two macro-categories: (i) supervised [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]
and (ii) unsupervised [
          <xref ref-type="bibr" rid="ref14 ref15">14, 15</xref>
          ]. Supervised drift detection techniques assume the availability of
ground-truth labels in the production data stream. They usually compute error rate-based measures or use
ensemble models to monitor and detect performance decreases over time, such as an accuracy drop
(e.g., [
          <xref ref-type="bibr" rid="ref16 ref17 ref18 ref19">16, 17, 18, 19</xref>
          ]). However, these techniques have limited applicability in real-world scenarios
since ground-truth labels are usually unavailable. In contrast, unsupervised drift detection techniques
do not require ground-truth labels to detect drifts. They usually apply statistical tests between
two distributions [20, 21, 22, 23], exploit model loss functions [24, 25, 26, 27], or train virtual
classifiers [28, 29, 30] to detect drift. These techniques are generally more widely applicable, as newly
processed data often lack ground-truth labels. However, they tend to be more resource-intensive, since
they involve complex statistical tests and the training of additional models.
        </p>
        <p>
          Concept drift localization. Drift localization or segmentation techniques aim to identify the drifting
data points in the data space (where), that is, whether a given data point is affected by drift [
          <xref ref-type="bibr" rid="ref6">6, 31</xref>
          ]. This is
usually obtained by quantifying the amount of drift in some regions of the data space or in each single
data point by performing drift detection on a local scale, or by training virtual classifiers to distinguish
between samples with or without drift.
        </p>
        <p><sup>1</sup> The code repository is available at: https://github.com/grecosalvatore/neurosymbolic-explainable-concept-drift</p>
        <p>
          Concept drift explanation. Some recent efforts have aimed to explain in human-readable terms the
reasons for concept drift [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. Some works provide a simpler form of drift visualization as feature-wise
change intensity or change in correlation [32, 33, 34, 35, 36], or cause-effect analysis [35]. However, all
these techniques provide more visualization than explanation of drift. Moreover, they usually struggle
with high-dimensional data or non-semantic features, such as those in unstructured data like texts and
images. In contrast, only a few works have investigated the use of more advanced and readable Explainable
Artificial Intelligence (XAI) [
          <xref ref-type="bibr" rid="ref10 ref7 ref8 ref9">7, 8, 9, 10</xref>
          ] techniques for explaining concept drift. [37, 38] propose the use
of SHAP [39] to characterise data drift. [40] proposes the identification of relevant prototype examples
to explain the reasons for drift. Finally, [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] proposes training a classifier to differentiate between
samples with and without drift, and then applying XAI techniques (such as feature
importance or counterfactual generation) to the classifier to explain the nature of the drift.
        </p>
        <p>
          Concept drift adaptation. Some techniques also propose methods to adapt to concept drift or make
incremental changes to the model [41, 42, 43]. However, automatic drift adaptation remains particularly
challenging [
          <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
          ], especially in data streams with large volumes of unstructured data, such as images,
that often lack ground-truth labels. In such cases, effective drift explanation and characterization can
assist experts in annotating new samples to better adapt to drift.
        </p>
        <p>In this paper, we focus on drift explanation only, assuming detection and localization have been
completed. We target concept drift in CNNs for image classification, aiming to provide effective,
human-readable explanations by extracting rules that explain the reasons for changes.</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Rule extraction from neural networks</title>
        <p>Deep convolutional neural networks (CNNs) have achieved remarkable performance in computer
vision tasks, yet their internal decision-making remains largely opaque. To enhance transparency
and accountability, research on explainable AI seeks methods that translate complex model behavior
into human-understandable forms. One promising direction is rule extraction, which approximates
network logic with a set of global, interpretable if–then rules. In contrast to local attribution
methods that highlight individual/groups of pixels or neurons [44, 45, 46, 47], rule extraction provides a
holistic description of the features and conditions driving model predictions. These rules serve as an
interpretable surrogate for the original CNN, enabling practitioners to assess whether the model relies
on semantically meaningful patterns or spurious correlations [48]. Early rule-extraction frameworks
introduced taxonomies such as pedagogical (treating the network as a black box) and decompositional
(leveraging internal structures), along with surrogate algorithms like the C4.5 decision tree [49]. More
recent approaches extend these ideas to deep CNNs by training decision trees on the network’s logits
or feature activations to derive high-fidelity logical rules. For example, Padalkar et al. [50] present
NeSyFOLD, a neurosymbolic architecture that replaces the final CNN layers with an answer set program of
interpretable rules. Similarly, symbolic rule extraction has been applied to Vision Transformers (ViTs),
where sparse attention maps and concept-level features yield human-readable rule sets [51].</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <p>In this section, we describe the proposed concept drift explanation method, tailored to CNNs for image
classification. We consider the original CNN model as-is and build upon the most recent advances in
the field of neuro-symbolic rule extraction from neural networks [48, 50]. We assume that concept drift
has already been detected and that the samples responsible for it have been identified; drift detection
and localization are considered completed. Our objective is to extract neuro-symbolic rules that explain
why such drifted samples differ from the normal data distribution, causing a drift.</p>
      <sec id="sec-3-1">
        <title>3.1. Problem formulation</title>
        <p>Consider an image $x \in \mathcal{X}$ with $\mathcal{X} \subset \mathbb{R}^{3 \times H \times W}$ and a label $y \in \{0, 1\}$ for a binary classification
problem. We assume that $y$, which represents the main concept $C$, is a composition of multiple concepts
$c_i$: $\{c_i\}_{i=1}^{N} \to C$. For instance, in a problem of gender classification, some concepts that may imply
the gender Male are the presence of beard, absence of makeup or lipstick (see Section 4). From these
concepts, it is possible to derive classification rules that are easily interpretable by humans, such as:
Beard ∧ ¬Makeup ∧ ¬Lipstick → Male
Similarly, the approach extends to other tasks, such as scene classification, where a given class (e.g.,
camping) is generally correlated to specific concept presence (e.g., tent, lawn, woods, mountains, etc.).</p>
        <p>[Figure 1. Labels recovered from the figure: CNN Encoder, Feature maps, Concepts, BCE Loss.]</p>
        <p>Pre-trained models, such as CNNs, may demonstrate unpredictable behavior when exposed to
out-of-distribution inputs and data drift, leading to significant performance degradation. From this perspective,
we focus on identifying the rule set $\{r_j\}_{j=1}^{M}$ that mimics the behavior of the model and can highlight,
in a more interpretable way, changes in predictive patterns caused by concept or data drift.</p>
        <p>Each rule $r_j : \bigwedge_{l=1}^{L_j} a_l \to \hat{y}$ is satisfied if all antecedents, expressed as positive $a_l$ or negative $\neg a_l$,
evaluate to true. In our setting, antecedents represent the presence (or absence) of a specific concept $c_i$ in
the image, while the consequent $\hat{y}$ is the final predicted class. Therefore, given a pre-trained model
$f$, we aim to extract an interpretable rule-based model $\hat{f} : \{r_j\}_{j=1}^{M} \to \hat{y}$, with $\hat{f} \approx f$. This enables the
identification of the satisfied rule that guides the model’s final decision, and facilitates the detection of
potentially failing antecedents, highlighting the corresponding parts of the network.</p>
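        <p>As an illustration of the rule semantics described above, the following minimal Python sketch evaluates a conjunctive rule on one row of a binarized concept table. The names Literal, Rule, satisfied, and predict are hypothetical and are not taken from the paper's code; the example rule mirrors the Beard, Makeup, Lipstick rule used in this section.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    concept: str    # concept name, e.g. "Beard"
    positive: bool  # True for "concept present", False for "concept absent"


@dataclass(frozen=True)
class Rule:
    antecedents: tuple  # tuple of Literal objects, read as a conjunction
    consequent: str     # class predicted when the rule is satisfied


def satisfied(rule, concepts):
    """Return True if every antecedent of the rule holds for the concept row."""
    return all(concepts[lit.concept] == lit.positive for lit in rule.antecedents)


def predict(rules, concepts, default=None):
    """Return the consequent of the first satisfied rule, or default if none fires."""
    for rule in rules:
        if satisfied(rule, concepts):
            return rule.consequent
    return default


# Beard AND NOT Makeup AND NOT Lipstick -> Male
male_rule = Rule(
    antecedents=(
        Literal("Beard", True),
        Literal("Makeup", False),
        Literal("Lipstick", False),
    ),
    consequent="Male",
)
row = {"Beard": True, "Makeup": False, "Lipstick": False}
print(predict([male_rule], row))  # Male
```

        <p>Because the rules extracted from a decision tree correspond to disjoint leaves, at most one rule fires per sample, so returning the first match is sufficient in this sketch.</p>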
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Model architecture</title>
        <p>As stated above, the method considers the original CNN as-is and, starting from the feature maps from
the last convolutional layer, it tries to infer the presence or absence of predefined concepts $c_i$ within
the original image, producing a binarized concept presence table. Then, a rule extractor algorithm uses
the sparse information of this table, pre-computed for the whole training dataset, to derive the rules
that approximate the original model behavior. The complete architecture is shown in Figure 1a.</p>
        <p>Concept extraction and labeling. The extraction of concepts is performed using the feature maps from
the last convolutional layer of the original CNN. We select this layer because it yields a more semantically
rich representation. Typical approaches consist of binarizing the feature maps by computing the norms
for all input samples and then applying a specific threshold for each kernel, defined as the mean [48] or
the weighted sum of the mean and standard deviation [50] of the kernel norms distribution. A label is
ultimately attributed to each kernel reflecting the most prominent concept among the images it responds
to, by means of visual inspection [48] or by employing a semantic segmentation model [50] to detect the
most relevant concept in the image. We propose two alternative approaches for concept extraction and
labeling that leverage an external source of knowledge, an Oracle, defining the presence of predefined
concepts in the training images. The Oracle can be manually annotated by humans or synthetically
generated through pre-trained task-specific models for object detection, semantic segmentation, Visual
Question Answering (VQA), or Vision Language Models (VLMs).</p>
        <p>The first method (Figure 1b) is inspired by existing approaches, but introduces two major variations in
the choice of thresholds for binarization and in the kernel labeling procedure. Specifically, starting from
the feature maps $\{F_k\}_{k=1}^{K}$, we extract the kernel activations using the 1-norm $\{n_k\}_{k=1}^{K} = \{\lVert F_k \rVert_1\}_{k=1}^{K}$,
where $K$ is the number of kernels. We then compute the point-biserial [52] correlation coefficient
$\rho_{ik} = r_{pb}(c_i, n_k)$ between the dichotomous variable $c_i$ from the Oracle and the continuous variable
$n_k$ representing the kernel norms, pre-computed for the whole training set. We repeat the process for
all combinations and then select the pairs of concepts-kernels that exhibit the highest correlation:
$$\{(c_i, k_i)\}_{i=1}^{N} = \{\arg\max(\rho_{ik})\}_{i=1,k=1}^{N,K},$$
where $N$ is the number of concepts in the Oracle. In this way, we find the most representative kernel for
each concept while, at the same time, achieving the kernel labeling. Binarization was finally achieved
using per-concept percentiles computed on the Oracle to take into account the concept imbalance in the
training data, and then the thresholds $\{\tau_i\}_{i=1}^{N}$ were selected to segment the bimodal norm distributions
in a way that preserves the original cardinalities of each split.</p>
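        <p>The correlation-based labeling above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the dictionary layout of the Oracle and of the kernel norms, and the toy values are all hypothetical, and the threshold rule (activate as many images as the Oracle marks positive) is one simple reading of the per-concept percentile strategy.</p>

```python
import math


def point_biserial(binary, values):
    """Point-biserial correlation between a 0/1 variable and a continuous one.

    Uses r = (M1 - M0) / s * sqrt(n1 * n0 / n^2), where s is the population
    standard deviation of the continuous values.
    """
    n = len(values)
    ones = [v for b, v in zip(binary, values) if b == 1]
    zeros = [v for b, v in zip(binary, values) if b == 0]
    if not ones or not zeros:
        return 0.0  # undefined for a constant binary variable; treated as 0 here
    mean_all = sum(values) / n
    std = math.sqrt(sum((v - mean_all) ** 2 for v in values) / n)
    if std == 0.0:
        return 0.0
    m1 = sum(ones) / len(ones)
    m0 = sum(zeros) / len(zeros)
    return (m1 - m0) / std * math.sqrt(len(ones) * len(zeros) / n ** 2)


def label_kernels(oracle, norms):
    """For each Oracle concept, select the kernel with the highest correlation.

    oracle: dict mapping concept name to a list of 0/1 annotations (one per image)
    norms:  dict mapping kernel index to a list of 1-norms (one per image)
    """
    pairs = {}
    for concept, annotations in oracle.items():
        scores = {k: point_biserial(annotations, v) for k, v in norms.items()}
        pairs[concept] = max(scores, key=scores.get)
    return pairs


def prevalence_threshold(annotations, values):
    """Threshold that activates as many images as the Oracle marks positive."""
    n_positive = sum(annotations)
    if n_positive == 0:
        return math.inf
    return sorted(values, reverse=True)[n_positive - 1]


# Toy data (hypothetical): 6 images, 2 concepts, 2 kernels.
oracle = {"Beard": [1, 1, 0, 0, 1, 0], "Earrings": [0, 1, 0, 1, 0, 0]}
norms = {
    0: [9.0, 8.5, 1.0, 1.5, 7.0, 2.0],  # mostly responds to Beard
    1: [0.5, 6.0, 1.0, 5.5, 0.8, 0.9],  # mostly responds to Earrings
}
pairs = label_kernels(oracle, norms)
thresholds = {c: prevalence_threshold(oracle[c], norms[k]) for c, k in pairs.items()}
binarized = {c: [int(v >= thresholds[c]) for v in norms[pairs[c]]] for c in pairs}
print(pairs, thresholds)
```

        <p>With tied norm values the threshold may activate slightly more images than the Oracle prevalence, so the cardinality is preserved exactly only when the values around the cut are distinct.</p>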
        <p>The second method (Figure 1c) employs an MLP to infer the presence of concepts from feature maps,
in a purely data-driven approach. Specifically, the results of the global average pooling were fed to the
new concept head, which outputs binary values, one for each of the predefined concepts. The network
was trained to match the Oracle using a binary cross-entropy (BCE) objective. Although the correlation-based
extractor appears to be a viable solution, the MLP-based one is generally more accurate.</p>
        <p>Rule extraction. Once the binary concept presence table is available, the set of predicted concepts $\hat{c}_i$
that define the antecedents can be used to derive the prediction rules. As in [48], we employ a
tree-based extraction algorithm, close to C4.5 [49], to derive rules that outline the conditions driving the
model decisions. Specifically, to obtain an approximation $\hat{f}$ of the original model $f$, the algorithm
was fitted to follow its predictions $\hat{y} = f(x)$. Although, as proposed in [48], the rule extraction could
be expanded to multiple layers, we limited the analysis to the final classifier of the CNN, which only
exploits the high-level features of the final convolutional layer, as in [50]. In addition to the original
model, we explored generating drift explanations by extracting rules that explain the behavior of a drift
localization model, which differentiates between drifted and non-drifted samples (see Section 4.2).</p>
        <p>Inference. Inference is straightforward, as it consists of extracting the feature maps using the original
CNN, identifying the concepts using the Concept extractor module, and finding the satisfied rule based
on the combination of activated antecedents.</p>
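        <p>To make the rule extraction step more concrete, the sketch below grows a small decision tree on a binary concept table and reads one rule per leaf. It is only an approximation of the described procedure: it uses plain information gain (ID3-style) rather than the gain ratio of C4.5, all function names are hypothetical, and the toy targets are synthetic labels standing in for the predictions of the model being approximated.</p>

```python
import math
from collections import Counter
from itertools import product


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_split(rows, labels, concepts):
    """Return the concept with the highest information gain, or None if no gain."""
    base, best, best_gain = entropy(labels), None, 0.0
    for concept in concepts:
        on = [y for r, y in zip(rows, labels) if r[concept]]
        off = [y for r, y in zip(rows, labels) if not r[concept]]
        if not on or not off:
            continue  # this concept does not separate the samples
        gain = base - (len(on) * entropy(on) + len(off) * entropy(off)) / len(labels)
        if gain > best_gain:
            best, best_gain = concept, gain
    return best


def extract_rules(rows, labels, concepts, max_depth, path=()):
    """Grow a tree on boolean concepts and return one rule per leaf.

    Each rule is (antecedents, consequent, confidence); antecedents is a tuple
    of (concept, required_value) pairs collected along the path to the leaf.
    """
    split = best_split(rows, labels, concepts) if len(path) != max_depth else None
    if split is None:
        label, count = Counter(labels).most_common(1)[0]
        return [(path, label, count / len(labels))]
    rules = []
    for value in (False, True):
        idx = [i for i, r in enumerate(rows) if r[split] == value]
        rules += extract_rules(
            [rows[i] for i in idx],
            [labels[i] for i in idx],
            concepts,
            max_depth,
            path + ((split, value),),
        )
    return rules


def format_rule(rule):
    antecedents, consequent, conf = rule
    body = " AND ".join(c if v else "NOT " + c for c, v in antecedents)
    return f"{body} -> {consequent} (conf={conf:.2f})"


# Toy concept table (hypothetical): every combination of three concepts.
concepts = ["Wearing_Earrings", "Wearing_Lipstick", "No_Beard"]
rows = [dict(zip(concepts, values)) for values in product([False, True], repeat=3)]
# Synthetic targets playing the role of the predictions to be mimicked.
targets = [
    "Drift" if r["Wearing_Earrings"] and not r["Wearing_Lipstick"] else "NoDrift"
    for r in rows
]
rules = extract_rules(rows, targets, concepts, max_depth=5)
for rule in rules:
    print(format_rule(rule))
```

        <p>Limiting max_depth trades fidelity for shorter rules, which is the same trade-off made when a depth of 5 is chosen for visualization purposes.</p>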
        <p>Listing 1: Rules extracted to approximate the gender classifier (optimal case). For visualization purposes,
the maximum depth of the decision tree was set to 5, ensuring a fidelity of about 90%.</p>
        <p>Listing 2: Rules extracted to approximate the drift classifier (optimal case). For visualization purposes,
the maximum depth of the decision tree was set to 5, ensuring a fidelity of about 97%.
1 ¬Wearing_Earrings → ¬Drift (conf=1.00)
2 Wearing_Earrings &amp; ¬Wearing_Lipstick &amp; ¬No_Beard → Drift (conf=1.00)
3 Wearing_Earrings &amp; ¬Wearing_Lipstick &amp; No_Beard &amp; ¬Brown_Hair &amp; ¬Bangs → Drift (conf=0.97)
4 Wearing_Earrings &amp; ¬Wearing_Lipstick &amp; No_Beard &amp; ¬Brown_Hair &amp; Bangs → Drift (conf=0.81)
5 Wearing_Earrings &amp; ¬Wearing_Lipstick &amp; No_Beard &amp; Brown_Hair &amp; ¬Smiling → Drift (conf=0.91)
6 Wearing_Earrings &amp; ¬Wearing_Lipstick &amp; No_Beard &amp; Brown_Hair &amp; Smiling → Drift (conf=0.56)
7 Wearing_Earrings &amp; Wearing_Lipstick &amp; ¬Bushy_Eyebrows &amp; ¬Pointy_Nose &amp; ¬Attractive → ¬Drift (conf=0.78)
8 Wearing_Earrings &amp; Wearing_Lipstick &amp; ¬Bushy_Eyebrows &amp; ¬Pointy_Nose &amp; Attractive → ¬Drift (conf=0.96)
9 Wearing_Earrings &amp; Wearing_Lipstick &amp; ¬Bushy_Eyebrows &amp; Pointy_Nose → ¬Drift (conf=1.00)
10 Wearing_Earrings &amp; Wearing_Lipstick &amp; Bushy_Eyebrows &amp; ¬Pointy_Nose &amp; ¬Wavy_Hair → Drift (conf=0.80)
11 Wearing_Earrings &amp; Wearing_Lipstick &amp; Bushy_Eyebrows &amp; ¬Pointy_Nose &amp; Wavy_Hair → ¬Drift (conf=1.00)
12 Wearing_Earrings &amp; Wearing_Lipstick &amp; Bushy_Eyebrows &amp; Pointy_Nose &amp; ¬Oval_Face → ¬Drift (conf=0.50)
13 Wearing_Earrings &amp; Wearing_Lipstick &amp; Bushy_Eyebrows &amp; Pointy_Nose &amp; Oval_Face → ¬Drift (conf=1.00)</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experiments</title>
      <p>In this section, we first introduce the experimental setting (Section 4.1). Then, we discuss the drift
explanation results obtained by rule extraction (Section 4.2).</p>
      <sec id="sec-4-1">
        <title>4.1. Experimental settings</title>
        <p>Dataset. We conduct our experiments using the CelebA dataset [53], which consists of about 202k
images with human-annotated labels for 40 facial attributes that can serve as the concept
Oracle in the proposed framework.</p>
        <p>Task definition and drift simulation. We define the main task as a gender classification problem by
choosing the attribute Male as the target. We simulate a drift by isolating all instances from the class
Male that wear earrings and then injecting them into the data stream. In particular, the dataset was
divided into 4 splits preserving the original distribution: Historical train of about 156k samples (used
for training the CNN), Historical test of about 19k samples (used for testing the CNN), Datastream of […]</p>
        <p>[Figure 3. (a): Attribute correlation in the dataset [53]. (b): Correlation patterns between top-1 binarized kernels and attributes.]</p>
        <p>[Tables 2 and 3, caption fragment: […] samples (2), and data stream with 100% of drift samples (3).]</p>
        <p>[Table 4. Per-concept Acc ↑ by method: Corr1, Corr2, Corr3, MLP1, MLP2, MLP3.]</p>
        <p>[…] then evaluate the performance of the two proposed methods for Concept extraction and labeling to get
an estimate of the gap with respect to the optimal case.</p>
        <p>
          Starting from the optimal Concept presence table, we extract the rules considering two different settings:
(i) we fit the tree-based extraction algorithm to match the original model predictions and obtain an
approximation of its behavior; (ii) we proceed with the assumption that drift has been successfully
detected and localized (e.g., through established drift detectors [
          <xref ref-type="bibr" rid="ref1 ref5 ref6">1, 5, 6</xref>
          ]) and fit the extraction algorithm to
predict whether individual samples are affected by drift. In both cases, we assume a perfect classification.
        </p>
        <p>We aim to provide two complementary perspectives on model behavior: (i) Human interpretability:
We extract interpretable rules that enable direct inspection to identify problematic associations that
degrade model performance on concept drift and out-of-distribution samples (e.g., the model incorrectly
relies on the presence of Earrings to predict that gender is not Male), and identify the specific responsible
network component (kernel); and (ii) Automated diagnosis: We analyze antecedent activation
frequencies to systematically identify faulty antecedents driving incorrect predictions, under the assumption
that the most frequently activated antecedent is most likely responsible for the observed drift.</p>
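        <p>The automated diagnosis perspective can be illustrated with a short sketch that counts how often each antecedent appears in the rule that fires for every sample of a data stream. This is one possible reading of "antecedent activation frequency"; the rule set, the stream contents, and the function names below are hypothetical and serve only as an example.</p>

```python
from collections import Counter


def fired_rule(rules, row):
    """Return the first rule whose antecedents all hold for the row, else None."""
    for antecedents, consequent in rules:
        if all(row[concept] == value for concept, value in antecedents):
            return antecedents, consequent
    return None


def antecedent_frequencies(rules, stream):
    """Fraction of fired rules in which each (concept, value) antecedent appears."""
    counts, fired = Counter(), 0
    for row in stream:
        match = fired_rule(rules, row)
        if match is None:
            continue  # no rule covers this sample
        fired += 1
        counts.update(match[0])
    return {lit: n / fired for lit, n in counts.items()} if fired else {}


# Hypothetical rule set: each rule is (antecedents, consequent).
rules = [
    ((("Wearing_Earrings", True),), "NotMale"),
    ((("Wearing_Earrings", False), ("No_Beard", False)), "Male"),
    ((("Wearing_Earrings", False), ("No_Beard", True)), "NotMale"),
]
# Hypothetical streams: one without drift, one made only of drifted samples.
clean_stream = [
    {"Wearing_Earrings": False, "No_Beard": False},
    {"Wearing_Earrings": False, "No_Beard": False},
    {"Wearing_Earrings": False, "No_Beard": True},
    {"Wearing_Earrings": True, "No_Beard": True},
]
drift_stream = [
    {"Wearing_Earrings": True, "No_Beard": False},
    {"Wearing_Earrings": True, "No_Beard": False},
    {"Wearing_Earrings": True, "No_Beard": True},
]
clean = antecedent_frequencies(rules, clean_stream)
drifted = antecedent_frequencies(rules, drift_stream)
print(clean)
print(drifted)
```

        <p>Comparing the two dictionaries shows which antecedent becomes dominant on the drifted stream, which is the signal the automated diagnosis relies on.</p>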
        <p>Listings 1 and 2 show the rules extracted in the two considered settings under the optimality
assumption. Their interpretability facilitates an understanding of the antecedents primarily influencing
the decision process. For instance, in Listing 1, two of the most critical rules are Rules 5 and 10, which
state that the presence of earrings, regardless of the presence of a beard, is sufficient to predict that the
gender is not Male. In Listing 2, instead, one of the most informative is Rule 1, which states that in
the absence of earrings no drift is detected, while otherwise the presence of drift depends on other
factors that correlate with the gender class. For instance, Rule 2 states that the presence of
earrings and a beard, along with the absence of lipstick, is a sufficient condition to identify a drifted subset.
Figure 2 illustrates how the extracted rules approximate the original models for varying tree depths
in the evaluated settings under the optimality assumption. In both cases, the rule fidelity remains high,
ensuring a close representation of the original model’s behavior with minimal degradation. Tables 2
and 3 present the activation frequencies of antecedents on data streams containing 0% and 100% drift
samples, in both experimental settings. The results indicate that the satisfied rule leading to the final
prediction consistently includes the antecedent Wearing Earrings, the concept used to simulate the drift.</p>
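        <p>Rule fidelity, as discussed above, is commonly measured as the fraction of samples on which the rule-based surrogate and the original model agree. The sketch below assumes this standard definition; the function name and the toy predictions are hypothetical.</p>

```python
def fidelity(rule_predictions, model_predictions):
    """Fraction of samples where the surrogate rules match the original model."""
    if len(rule_predictions) != len(model_predictions):
        raise ValueError("prediction lists must have the same length")
    if not model_predictions:
        raise ValueError("at least one sample is required")
    agree = sum(r == m for r, m in zip(rule_predictions, model_predictions))
    return agree / len(model_predictions)


# Toy example (hypothetical): the surrogate disagrees on one of ten samples.
model_out = ["Male", "Male", "NotMale", "Male", "NotMale",
             "NotMale", "Male", "Male", "NotMale", "Male"]
rules_out = ["Male", "Male", "NotMale", "Male", "NotMale",
             "NotMale", "Male", "NotMale", "NotMale", "Male"]
print(fidelity(rules_out, model_out))  # 0.9
```

        <p>Note that fidelity is computed against the model's predictions rather than the ground-truth labels, so a high value says the rules mimic the model, not that the model is correct.</p>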
        <p>Finally, we evaluate the effectiveness of the Concept Extractor with respect to both concept sensitivity
and concept labeling. Figure 3 shows the correlation in the CelebA dataset (a) and the correlation levels
between Oracle concepts and kernels (top-1 pairs) in the correlation-based solution (b). The heatmap in
Figure 3b highlights the problem of the polysemantic nature of kernel activations, which, in general, are
sensitive to different correlated concepts. We attribute the lower performance of the
correlation-based solution, compared with the MLP-based solution, to this problem: the MLP can leverage all the kernels
and learn to weight them effectively to further reduce the error. Table 4, instead, shows the accuracies
of the two proposed methods. Although both methods generally appear to be more reliable on larger
concepts, the results highlight the difficulties with very small concepts represented by few pixels, such as
male earrings, which are typically much smaller than their female counterparts. Such fine details can
easily be lost during convolutional operations.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Discussion</title>
      <p>Our preliminary results show the potential relationship between model interpretability and concept
drift explanation in the context of trustworthy AI systems. Through neuro-symbolic rule extraction, the
proposed approach clarifies the model decision pathway and helps pinpoint the origin of mispredictions,
with traceability down to the responsible network component (kernel). Under the optimality assumption,
the extracted rules appear highly informative, making it easier to formulate hypotheses about the
possible causes of drift. However, in a real scenario, the interpretability of those rules may be affected by
the concept extraction and naming procedure due to the inherent flaws still present in these systems. We
proposed two effective automated approaches to address the problem of concept labeling, which is one
of the most challenging and still open problems of rule extraction systems, and provided an estimate
of their reliability. By considering the annotations (Oracle) previously extracted on the whole
training set, they not only avoid the search for the top-k images to which each kernel reacts
most and the visual inspection phase of the labeling process, but also enlarge the set of
samples considered for the kernel-concept association, resulting in a more reliable and scalable solution.</p>
      <p>Limitations and future work. The efficacy of the method is inherently constrained by the a priori
selection of concepts, which may obscure the true cause of the drift. These systems can be susceptible to
errors in the kernel labeling procedure, which can produce erroneously named antecedents that make the
rule unreasonable, even when the final classification is correct. The issue becomes more pronounced when
small concepts are present in the images, as kernels in the final layers may lack sufficient sensitivity to
detect them. Moreover, since the encoder was originally trained for a different objective, individual
feature maps may not necessarily be sensitive to a single concept, a challenge only partially mitigated
by the proposed naming strategies. In future work, we plan to (1) enhance the robustness of the
rule extraction process for drift explanation, including the exploration of automatic concept labeling
methods, (2) extend the methodology to additional data modalities (e.g., text, audio), and (3) conduct a
more comprehensive experimental evaluation across diverse drift scenarios, models, and data types.</p>
    </sec>
    <sec id="sec-6">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the author(s) used ChatGPT and Grammarly in order to: Grammar
and spelling check.</p>
      <p>Transactions on Knowledge and Data Engineering (2015). doi:10.1109/TKDE.2014.2345382.
[20] S. Rabanser, S. Günnemann, Z. C. Lipton, Failing loudly: An empirical study of methods
for detecting dataset shift, in: Neural Information Processing Systems, 2018. URL: https:
//api.semanticscholar.org/CorpusID:53096511.
[21] S. Greco, T. Cerquitelli, Drift lens: Real-time unsupervised concept drift detection by evaluating
per-label embedding distributions, in: 2021 International Conference on Data Mining Workshops
(ICDMW), 2021, pp. 341–349. doi:10.1109/ICDMW53433.2021.00049.
[22] L. Bu, C. Alippi, D. Zhao, A pdf-free change detection test based on density diference estimation,
IEEE Transactions on Neural Networks and Learning Systems 29 (2018) 324–334. doi:10.1109/
TNNLS.2016.2619909.
[23] S. Greco, B. Vacchetti, D. Apiletti, T. Cerquitelli, Driftlens: A concept drift detection tool, in:
Proceedings 27th International Conference on Extending Database Technology, EDBT 2024,
Paestum, Italy, March 25 - March 28, OpenProceedings.org, 2024, pp. 806–809. URL: https:
//doi.org/10.48786/edbt.2024.75. doi:10.48786/edbt.2024.75.
[24] E. Lughofer, E. Weigl, W. Heidl, C. Eitzinger, T. Radauer, Drift detection in data stream classification
without fully labelled instances, in: 2015 IEEE International Conference on Evolving and Adaptive
Intelligent Systems (EAIS), 2015, pp. 1–8. doi:10.1109/EAIS.2015.7368802.
[25] A. Suprem, J. Arulraj, C. Pu, J. Ferreira, Odin: automated drift detection and recovery in video
analytics, Proc. VLDB Endow. (2020). URL: https://doi.org/10.14778/3407790.3407837. doi:10.
14778/3407790.3407837.
[26] M. Hushchyn, A. Ustyuzhanin, Generalization of change-point detection in time series data based
on direct density ratio estimation, CoRR abs/2001.06386 (2020). URL: https://arxiv.org/abs/2001.
06386. arXiv:2001.06386.
[27] K. Yamanishi, J.-i. Takeuchi, A unifying framework for detecting outliers and change points
from non-stationary time series data, in: Proceedings of the Eighth ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining, KDD ’02, 2002.
URL: https://doi.org/10.1145/775047.775148. doi:10.1145/775047.775148.
[28] O. Gözüaçık, A. Büyükçakır, H. Bonab, F. Can, Unsupervised concept drift detection with a
discriminative classifier, in: Proceedings of the 28th ACM International Conference on Information
and Knowledge Management, CIKM ’19, Association for Computing Machinery, New York, NY,
USA, 2019, pp. 2365–2368. URL: https://doi.org/10.1145/3357384.3358144.
doi:10.1145/3357384.3358144.
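[29] A. Liu, Y. Song, G. Zhang, J. Lu, Regional concept drift detection and density synchronized drift
adaptation, in: Proceedings of the Twenty-Sixth International Joint Conference on Artificial
Intelligence, IJCAI-17, 2017, pp. 2280–2286. doi:10.24963/ijcai.2017/317.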
[30] S. Hido, T. Idé, H. Kashima, H. Kubo, H. Matsuzawa, Unsupervised change analysis using supervised
learning, in: Advances in Knowledge Discovery and Data Mining, Springer Berlin Heidelberg,
Berlin, Heidelberg, 2008, pp. 148–159.
[31] F. Hinder, V. Vaquet, J. Brinkrolf, A. Artelt, B. Hammer, Localization of concept drift: Identifying
the drifting datapoints, in: 2022 International Joint Conference on Neural Networks (IJCNN), 2022,
pp. 1–9. doi:10.1109/IJCNN55064.2022.9892374.
[32] G. I. Webb, R. Hyde, H. Cao, H. L. Nguyen, F. Petitjean, Characterizing concept drift, Data Mining
and Knowledge Discovery 30 (2016) 964–994. URL: https://doi.org/10.1007/s10618-015-0448-4.
doi:10.1007/s10618-015-0448-4.
[33] X. Wang, W. Chen, J. Xia, Z. Chen, D. Xu, X. Wu, M. Xu, T. Schreck, Conceptexplorer: Visual
analysis of concept drifts in multi-source time-series data, in: 2020 IEEE Conference on Visual Analytics
Science and Technology (VAST), 2020, pp. 1–11. doi:10.1109/VAST50239.2020.00006.
[34] K. B. Pratt, G. Tschapek, Visualizing concept drift, in: Proceedings of the Ninth ACM SIGKDD
International Conference on Knowledge Discovery and Data Mining, KDD ’03, Association for
Computing Machinery, New York, NY, USA, 2003, pp. 735–740.
URL: https://doi.org/10.1145/956750.956849. doi:10.1145/956750.956849.
[35] J. N. Adams, S. J. van Zelst, L. Quack, K. Hausmann, W. M. P. van der Aalst, T. Rose, A framework
for explainable concept drift detection in process mining, in: Business Process Management,
Springer International Publishing, Cham, 2021, pp. 400–416.
[36] F. Hinder, V. Vaquet, B. Hammer, Feature-based analyses of concept drift, Neurocomputing
600 (2024) 127968. URL: https://www.sciencedirect.com/science/article/pii/S0925231224007392.
doi:10.1016/j.neucom.2024.127968.
[37] C. Duckworth, F. P. Chmiel, D. K. Burns, Z. D. Zlatev, N. M. White, T. W. V. Daniels, M. Kiuber,
M. J. Boniface, Using explainable machine learning to characterise data drift and detect emergent
health risks for emergency department admissions during covid-19, Scientific Reports 11 (2021)
23017. URL: https://doi.org/10.1038/s41598-021-02481-y. doi:10.1038/s41598-021-02481-y.
[38] T. Susnjak, P. Maddigan, Forecasting patient flows with pandemic induced concept drift
using explainable machine learning, EPJ Data Sci. 12 (2023) 11.
URL: https://doi.org/10.1140/epjds/s13688-023-00387-5. doi:10.1140/epjds/s13688-023-00387-5.
[39] S. M. Lundberg, S.-I. Lee, A unified approach to interpreting model predictions, in: Proceedings of
the 31st International Conference on Neural Information Processing Systems, NIPS’17, Curran
Associates Inc., Red Hook, NY, USA, 2017, p. 4768–4777.
[40] S. Greco, B. Vacchetti, D. Apiletti, T. Cerquitelli, Unsupervised concept drift detection from deep
learning representations in real-time, IEEE Transactions on Knowledge and Data Engineering 37
(2025) 6232–6245. doi:10.1109/TKDE.2025.3593123.
[41] J. Gama, I. Žliobaitė, A. Bifet, M. Pechenizkiy, A. Bouchachia, A survey on concept
drift adaptation, ACM Comput. Surv. 46 (2014). doi:10.1145/2523813.
[42] L. Yuan, H. Li, B. Xia, C. Gao, M. Liu, W. Yuan, X. You, Recent advances in concept drift adaptation
methods for deep learning, in: L. D. Raedt (Ed.), Proceedings of the Thirty-First International
Joint Conference on Artificial Intelligence, IJCAI-22, International Joint Conferences on Artificial
Intelligence Organization, 2022, pp. 5654–5661. URL: https://doi.org/10.24963/ijcai.2022/788.
doi:10.24963/ijcai.2022/788, Survey Track.
[43] Q. Xiang, L. Zi, X. Cong, Y. Wang, Concept drift adaptation methods under the deep learning
framework: A literature review, Applied Sciences 13 (2023).
URL: https://www.mdpi.com/2076-3417/13/11/6515. doi:10.3390/app13116515.
[44] M. Sundararajan, A. Taly, Q. Yan, Axiomatic attribution for deep networks, 2017.
URL: https://arxiv.org/abs/1703.01365. arXiv:1703.01365.
[45] K. Simonyan, A. Vedaldi, A. Zisserman, Deep inside convolutional networks: Visualising
image classification models and saliency maps, 2014. URL: https://arxiv.org/abs/1312.6034.
arXiv:1312.6034.
[46] F. Ventura, S. Greco, D. Apiletti, T. Cerquitelli, Explaining deep convolutional models by measuring
the influence of interpretable features in image classification, Data Mining and Knowledge
Discovery 38 (2024) 3169–3226. URL: https://doi.org/10.1007/s10618-023-00915-x.
doi:10.1007/s10618-023-00915-x.
[47] R. R. Selvaraju, A. Das, R. Vedantam, M. Cogswell, D. Parikh, D. Batra, Grad-cam: Why did you
say that?, 2017. URL: https://arxiv.org/abs/1611.07450. arXiv:1611.07450.
[48] J. Townsend, T. Kasioumis, H. Inakoshi, Eric: Extracting relations inferred from convolutions, in:
Proceedings of the Asian Conference on Computer Vision, 2020.
[49] J. R. Quinlan, C4.5: Programs for machine learning, Elsevier, 2014.
[50] P. Padalkar, H. Wang, G. Gupta, Nesyfold: a framework for interpretable image classification, in:
Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, 2024, pp. 4378–4387.
[51] P. Padalkar, G. Gupta, Symbolic rule extraction from attention-guided sparse representations in
vision transformers, arXiv preprint arXiv:2505.06745 (2025).
[52] R. F. Tate, Correlation between a discrete and a continuous variable. Point-biserial correlation,
The Annals of Mathematical Statistics 25 (1954) 603–607.
[53] Z. Liu, P. Luo, X. Wang, X. Tang, Deep learning face attributes in the wild, in: Proceedings of
International Conference on Computer Vision (ICCV), 2015.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Dong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Gu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gama</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <article-title>Learning under concept drift: A review</article-title>
          ,
          <source>IEEE Transactions on Knowledge and Data Engineering</source>
          <volume>31</volume>
          (
          <year>2019</year>
          )
          <fpage>2346</fpage>
          -
          <lpage>2363</lpage>
          . doi:10.1109/TKDE.2018.2876857.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>F.</given-names>
            <surname>Bayram</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. S.</given-names>
            <surname>Ahmed</surname>
          </string-name>
          ,
          <article-title>Towards trustworthy machine learning in production: An overview of the robustness in mlops approach</article-title>
          ,
          <source>ACM Comput. Surv.</source>
          <volume>57</volume>
          (
          <year>2025</year>
          ). URL: https://doi.org/10.1145/3708497. doi:10.1145/3708497.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>B.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Qi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Di</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Pei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Yi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <article-title>Trustworthy ai: From principles to practices</article-title>
          ,
          <source>ACM Comput. Surv.</source>
          <volume>55</volume>
          (
          <year>2023</year>
          ). URL: https://doi.org/10.1145/3555803. doi:10.1145/3555803.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J.</given-names>
            <surname>Klaise</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. V.</given-names>
            <surname>Looveren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Cox</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Vacanti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Coca</surname>
          </string-name>
          ,
          <article-title>Monitoring and explainability of models in production</article-title>
          ,
          <year>2020</year>
          . URL: https://arxiv.org/abs/2007.06299. arXiv:2007.06299.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>F.</given-names>
            <surname>Hinder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Vaquet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Hammer</surname>
          </string-name>
          ,
          <article-title>One or two things we know about concept drift-a survey on monitoring in evolving environments. part a: detecting concept drift</article-title>
          ,
          <source>Frontiers in Artificial Intelligence</source>
          <volume>7</volume>
          (
          <year>2024</year>
          ). URL: https://www.frontiersin.org/journals/artificial-intelligence/articles/10.3389/frai.2024.1330257. doi:10.3389/frai.2024.1330257.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>F.</given-names>
            <surname>Hinder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Vaquet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Hammer</surname>
          </string-name>
          ,
          <article-title>One or two things we know about concept drift-a survey on monitoring in evolving environments. part b: locating and explaining concept drift</article-title>
          ,
          <source>Frontiers in Artificial Intelligence</source>
          <volume>7</volume>
          (
          <year>2024</year>
          ). URL: https://www.frontiersin.org/journals/artificial-intelligence/articles/10.3389/frai.2024.1330258. doi:10.3389/frai.2024.1330258.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>R.</given-names>
            <surname>Guidotti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Monreale</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ruggieri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Turini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Giannotti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Pedreschi</surname>
          </string-name>
          ,
          <article-title>A survey of methods for explaining black box models</article-title>
          ,
          <source>ACM Comput. Surv.</source>
          <volume>51</volume>
          (
          <year>2018</year>
          ). URL: https://doi.org/10.1145/3236009. doi:10.1145/3236009.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>F.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Uszkoreit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Du</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Fan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <article-title>Explainable ai: A brief survey on history, research areas, approaches and challenges</article-title>
          ,
          in:
          <source>Natural Language Processing and Chinese Computing</source>
          , Springer International Publishing, Cham,
          <year>2019</year>
          , pp.
          <fpage>563</fpage>
          -
          <lpage>574</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>R.</given-names>
            <surname>Dwivedi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dave</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Naik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Singhal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Omer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Patel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Qian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Shah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Morgan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ranjan</surname>
          </string-name>
          ,
          <article-title>Explainable ai (xai): Core ideas, techniques, and solutions</article-title>
          ,
          <source>ACM Comput. Surv.</source>
          <volume>55</volume>
          (
          <year>2023</year>
          ). URL: https://doi.org/10.1145/3561048. doi:10.1145/3561048.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>D.</given-names>
            <surname>Minh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y. F.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. N.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          ,
          <article-title>Explainable artificial intelligence: a comprehensive review</article-title>
          ,
          <source>Artificial Intelligence Review</source>
          <volume>55</volume>
          (
          <year>2022</year>
          )
          <fpage>3503</fpage>
          -
          <lpage>3568</lpage>
          . URL: https://doi.org/10.1007/s10462-021-10088-y. doi:10.1007/s10462-021-10088-y.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>F.</given-names>
            <surname>Hinder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Vaquet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Brinkrolf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Hammer</surname>
          </string-name>
          ,
          <article-title>Model-based explanations of concept drift</article-title>
          ,
          <source>Neurocomputing</source>
          <volume>555</volume>
          (
          <year>2023</year>
          )
          <fpage>126640</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>F.</given-names>
            <surname>Bayram</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. S.</given-names>
            <surname>Ahmed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kassler</surname>
          </string-name>
          ,
          <article-title>From concept drift to model degradation: An overview on performance-aware drift detectors</article-title>
          ,
          <source>Knowledge-Based Systems</source>
          <volume>245</volume>
          (
          <year>2022</year>
          )
          <fpage>108632</fpage>
          . doi:10.1016/j.knosys.2022.108632.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>S.</given-names>
            <surname>Farquhar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Gal</surname>
          </string-name>
          ,
          <article-title>What 'out-of-distribution' is and is not</article-title>
          , in:
          <source>NeurIPS ML Safety Workshop</source>
          ,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>R. N.</given-names>
            <surname>Gemaque</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. F. J.</given-names>
            <surname>Costa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Giusti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. M.</given-names>
            <surname>dos Santos</surname>
          </string-name>
          ,
          <article-title>An overview of unsupervised drift detection methods</article-title>
          ,
          <source>WIREs Data Mining and Knowledge Discovery</source>
          <volume>10</volume>
          (
          <year>2020</year>
          )
          <fpage>e1381</fpage>
          . doi:10.1002/widm.1381.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>P.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Ming</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <article-title>Unsupervised concept drift detectors: A survey</article-title>
          , in:
          <source>Advances in Natural Computation, Fuzzy Systems and Knowledge Discovery</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>1117</fpage>
          -
          <lpage>1124</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>J.</given-names>
            <surname>Gama</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Medas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Castillo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rodrigues</surname>
          </string-name>
          ,
          <article-title>Learning with drift detection</article-title>
          ,
          in:
          <source>Advances in Artificial Intelligence</source>
          ,
          <year>2004</year>
          , pp.
          <fpage>286</fpage>
          -
          <lpage>295</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>M.</given-names>
            <surname>Baena-García</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>del Campo-Ávila</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Fidalgo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bifet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Gavaldà</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Morales-Bueno</surname>
          </string-name>
          ,
          <article-title>Early drift detection method</article-title>
          ,
          <source>Fourth international workshop on knowledge discovery from data streams</source>
          (
          <year>2006</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>J.</given-names>
            <surname>Gama</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Castillo</surname>
          </string-name>
          ,
          <article-title>Learning with local drift detection</article-title>
          , in:
          <source>Advanced Data Mining and Applications</source>
          , Springer,
          <year>2006</year>
          , pp.
          <fpage>42</fpage>
          -
          <lpage>55</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>I.</given-names>
            <surname>Frías-Blanco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>del Campo-Ávila</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Ramos-Jiménez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Morales-Bueno</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ortiz-Díaz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Caballero-Mota</surname>
          </string-name>
          ,
          <article-title>Online and non-parametric drift detection methods based on Hoeffding's bounds</article-title>
          , IEEE
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>