<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Drifter: Efficient Online Feature Monitoring for Improved Data Integrity in Large-Scale Recommendation Systems</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Blaž Škrlj</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nir Ki-Tov</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lee Edelist</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Natalia Silberstein</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hila Weisman-Zohar</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Blaž Mramor</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Davorin Kopič</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Naama Ziporin</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Netanya</institution>
          ,
          <country country="IL">Israel</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Outbrain Inc.</institution>
          ,
          <addr-line>Ljubljana</addr-line>
          ,
          <country country="SI">Slovenia</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Outbrain Inc.</institution>
          ,
          <addr-line>Netanya</addr-line>
          ,
          <country country="IL">Israel</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Real-world production systems often grapple with maintaining data quality in large-scale, dynamic streams. We introduce Drifter, an efficient and lightweight system for online feature monitoring and verification in recommendation use cases. Drifter addresses limitations of existing methods by delivering agile, responsive, and adaptable data quality monitoring, enabling real-time root cause analysis, drift detection and insights into problematic production events. Integrating state-of-the-art ideas from online feature ranking for sparse data and anomaly detection, Drifter is highly scalable and resource-efficient, requiring only two threads and less than a gigabyte of RAM per production deployment that handles millions of instances per minute (model training). Drifter's effectiveness in alerting and mitigating data quality issues was demonstrated on a real-life system that handles up to a billion predictions per second.</p>
      </abstract>
      <kwd-group>
        <kwd>feature monitoring</kwd>
        <kwd>recommender systems</kwd>
        <kwd>online learning</kwd>
        <kwd>online advertising</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Designing and developing online machine learning systems is a complex endeavour, where data
quality and integrity play a crucial role in models’ online performance. This is particularly
the case for contemporary recommender systems (see [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]), which often rely on frequent model
updates (every few minutes or even less) and are thus subject to especially rapid negative impact
from data quality degradation. To address potential data quality issues, such recommender
systems can considerably benefit from supporting data-monitoring systems that enable fast
alerting/responsiveness when data-related issues are present. Systems like Greykite [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] enable
automated forecasting and facilitate profiling of emerging issues related to internal system
behaviour. The study of online features’ behavior is commonly referred to as online feature
selection. This branch of methods attempts to distill relevant features from irrelevant ones
in an online learning setting [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Approaches focusing on mining larger (online) data sets
must be computationally efficient and easily interpretable [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Online feature selection
and ranking has also been a lively research endeavour for building real-time recommender
systems. For example, [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] demonstrated that feature groups are a possible way of achieving efficient
online feature selection since similar features tend to behave similarly in time. Furthermore,
large amounts of data that require processing can already represent a substantial computational
burden as processing and transformation of instances can be expensive. It was shown that
higher-dimensional feature spaces in online settings require specialized approaches that are
versatile enough and can scale with real-life data sizes [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. The field of online feature ranking is
a vibrant area of research and development; however, it needs to extend beyond the algorithmic
aspects typically associated with it – in addition to the algorithms, there is a growing need
for systems that can effectively monitor features in real-time. The design of such systems also
encompasses implementing and deploying mechanisms that enable online inspection of feature
scores and other related metrics (e.g., features’ cardinalities and coverages). In order to optimize
utility and effectiveness, these monitoring systems cannot exist in isolation. They must be
coupled with a visibility layer incorporating alerting mechanisms and visualizations. This
integration allows for an accessible and user-friendly interface, enabling the study of granular
details of the data consumed by real-time models. By providing such visibility, users can gain
insights into the underlying factors influencing the performance and behaviour of these models,
and make informed decisions and analysis based on them.
      </p>
      <p>The work presented in this paper builds upon recent ideas and advances in both online
feature ranking and real-time signal analysis. Drawing on these domains, we developed a
system that has been deployed in a large-scale online cloud environment. This system
handles real-life data streams for various use cases, including click-through rate prediction,
conversion rate prediction, and item viewability prediction. Furthermore, by operating in
real-world and real-time settings, the system is actively used when handling complex, dynamic
data monitoring scenarios. Overall, the paper emphasizes the importance of online feature
monitoring and highlights the value of integrating this functionality with a comprehensive
visibility layer. Through this integration, the presented system offers a powerful tool for
inspecting and analyzing the most granular levels of data used in real-time models, ultimately
contributing to improved decision-making and performance optimization in various application
domains.</p>
      <p>The remainder of this paper is structured as follows. We begin by discussing the existing
use cases where online feature monitoring and verification were used to facilitate deployment,
monitoring and understanding of online recommender systems. Second, we describe Drifter,
the system used for online feature monitoring and verification, its implementation, and the
metrics implemented for measuring feature drifts and anomalies online. Third, we describe a
use case where Drifter helped profile and identify features that were subject to drift. Finally, we
discuss the implications and lessons learned when deploying and designing Drifter.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Online feature monitoring overview</title>
      <p>We describe the main use cases where online feature monitoring is a suitable approach for
understanding, mitigating and improving online learning processes.</p>
      <p>Introduction of new features. A typical process in many online learning workflows involves
the introduction of new signals or features. However, due to the dynamic nature of online
learning, incorporating new features often requires substantial engineering efforts that may
span multiple teams. Unfortunately, various factors, such as miscommunications, logging bugs,
or other issues can unintentionally impact these features’ distribution, coverage, or relative
importance. To mitigate these challenges, it is crucial to have a mechanism in place to monitor
and automatically raise alerts based on predefined conditions that indicate problematic change
in feature values. By proactively detecting and addressing such issues as early as possible in the
data pipeline, we are able to prevent the deployment of models that could negatively affect the
business. Furthermore, understanding how existing features vary in time provides an additional
perspective on their behaviour. This knowledge can be valuable for prioritizing testing of new
features and their transformations, used by the predictive model.</p>
      <p>
        Measuring quality drops of existing features. As soon as an online feature monitoring
system is capable of running in real-time, it can (and does) serve as a "ghost mode" for a
given data stream (consumed by, e.g., click-through rate or conversion rate prediction models).
Furthermore, by systematically measuring features’ properties such as cardinality, coverage,
and statistics such as quantiles and histograms, the monitoring system can, in a matter of
milliseconds, alert relevant model stakeholders that a change in the distribution of an existing
feature has occurred. Such events can occur due to multiple reasons; examples include changes
in a component that participates in the construction or final transformation of the feature, a
drop of data quality due to an external event and feature drifts – gradual changes in a feature’s
distribution, eventually resulting in problematic behaviour [
        <xref ref-type="bibr" rid="ref7 ref8">7, 8</xref>
        ].
      </p>
      <p>Debugging online models. When working with data streams in online learning systems,
explainability can become a challenge – the use of deep factorization machine-based models or
similar variants has made it increasingly difficult and time-consuming to inspect problematic
models directly. However, by establishing a connection between a model’s behaviour and
detectable shifts in feature distributions, it becomes possible to conduct more systematic
posthoc evaluations of the model itself. One such example includes perturbation-based analysis,
which focuses on identifying the effects of the distribution shifts of single or multiple features.
Observing such shifts automatically online makes it easier to link them to the model’s behaviour
and gain insights into its performance and potential issues. Further, understanding temporal
behaviour facilitates the design of follow-up offline experiments that help identify potentially
useful transformations of existing features. Understanding feature distribution shifts thus
enables a more structured evaluation of the model, facilitating the identification and resolution
of problems that may arise during online learning.</p>
      <p>Understanding of temporal dynamics of features. The endeavour to study the
behaviour of multiple features online simultaneously is often not undertaken due to its
time-consuming nature. However, by being able to observe whole feature spaces’ dynamics
in time, patterns related to the complementary nature of features can arise, deepening the
data scientists’ understanding of which features fluctuate together; understanding temporal
relations can help with the creation of new features that account for these dynamics, or facilitate
exploration of alternative features that would otherwise be ignored. For example, separate
teams can introduce features in isolation, not being aware of their complementary nature –
by visualizing the joint space, such patterns can be studied and can further simplify existing
models.</p>
      <p>Online ranking of features’ contributions. Productization of new signals (features) can
be expensive and time-consuming. By simulating a feature’s behaviour with the target space of
interest, prioritization of new features can be facilitated, saving valuable resources that would
otherwise be spent on multiple deployments, running A/B tests and other costly procedures
that were previously inevitable for testing a new feature’s contribution to the target space. We
proceed with an overview of related work.</p>
      <p>
        We continue the discussion with an overview of existing feature monitoring/inspection
systems and how they compare to Drifter. An overview of how Drifter compares to existing
products and tools is summarized in Table 1. The selected tools include existing, well-established
solutions for online feature store construction and subsequent machine learning, as well as
up-and-coming solutions. The three possible table marks represent full support of a
feature/capability, lack thereof, or partial support (–). The categories of comparison were selected
so as to capture different properties of this type of system, from extendability and compliance
with sparse data formats to less known functionalities such as drift detection and support
for computation of streaming statistics (sketching algorithms). The more extensive solutions
include feathr2, Vertex AI3, SAGEMAKER Store4, Tecton5 and Databricks feature store6. Other
considered systems include ByteHub7, butterfree8, RasgoQL9 and hopsworks10. The capabilities
were assessed based on products’ websites and readily available documentation. We observed
that most end-to-end tools could, with varying degrees of effort, be extended to fulfil
missing or partially missing fields of comparison (we evaluated out-of-the-box capabilities that require
minimum engineering overhead). Finally, systems that do not employ their own engines are
mostly based around Scikit-learn library [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
2https://github.com/feathr-ai/feathr
3https://cloud.google.com/vertex-ai/docs/featurestore
4https://aws.amazon.com/sagemaker/feature-store
5https://www.tecton.ai/
6https://docs.gcp.databricks.com/machine-learning/feature-store/index.html
7https://github.com/bytehub-ai/bytehub
8https://github.com/quintoandar/butterfree
9https://github.com/rasgointelligence/RasgoQL
10https://github.com/logicalclocks/hopsworks
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Drifter - an overview</title>
      <p>Having established the motivating use cases which led us to build Drifter, we continue with
its overview, design and implementation choices, user interaction with the service and future
applications.</p>
      <sec id="sec-3-1">
        <title>3.1. Overall architecture and implementation</title>
        <p>Drifter was built as a component of the microservice architecture. It was built to operate
with a distributed data source and a metrics endpoint of choice; in the presented work, however,
it is a stand-alone service that receives the data via Hive11, and outputs the metrics to a metrics
service (Prometheus12). This design choice was undertaken so that each use case (e.g., a team
owning a CTR prediction or a viewability prediction model) has ownership over the relevant
Drifter instance(s), and can modify their queries or data regime according to their preferences.
Each metrics endpoint is aware of a particular deployment, meaning that Grafana dashboards
can be built with specific Drifter instance(s) in mind. This way, isolation of metrics is possible,
but at the same time, teams can use other teams’ metrics and information when defining their
visualizations and alerts. The service itself is deployed to an in-house cloud platform, where
each Drifter instance is monitored by default (resource-wise), offering users insights into the
amount of resources required per each Drifter pod – this functionality is useful for profiling
changes in data regimes and their impact on the resources (which are finite for each use case). A
single Drifter instance is summarized in Figure 1a. Each Drifter is a self-contained unit that can
be deployed on demand in the in-house cloud platform. Although multiple different
Drifters are simultaneously deployed (for different use cases), their metrics are aggregated in a
joint endpoint (Prometheus), enabling a global overview of features being monitored and their
states. This overview is shown in Figure 1b.</p>
        <p>
          We continue with a more detailed overview of each of the components that constitute Drifter
(service). Drifter is implemented as a Python-based service, utilizing an in-house library that,
out-of-the-box, enables reporting of Prometheus-based metrics per deployed pod. Each Drifter has its
internal scheduler, enabling flexibility in terms of time zones. In the initial phase of development,
we observed two main computational bottlenecks to running Drifter instances online at scale:
Consuming production-level data volumes comprised of up to millions of instances per ten
minutes, and computing scores between features of interest. The computationally expensive
parts of feature ranking related to score computing are written in Numba [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] (an LLVM-based
Python JIT compiler). Dockerized Drifter instances communicate with Prometheus endpoints
(metrics). A Grafana-based dashboard enables the inspection of these metrics in real-time. The
service is implemented with ease of on-boarding in mind; when a new use case needs to be
accommodated (e.g., a novel model), a template Drifter pod is cloned and configured to operate
with a dedicated data stream specific to a given use case. This way, use cases are separate and do
not interfere with one another. Further, as they jointly push metrics to the common endpoint,
visibility at the level of all active Drifter pods is possible and facilitates monitoring of their
health by the infrastructure team.
11https://hive.apache.org/
12https://prometheus.io/
        </p>
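        <p>As an illustration of the basic per-batch signals discussed in this paper (features’ coverage and cardinality), the following minimal Python sketch computes them for one mini-batch. The function name and the dict-based instance format are illustrative assumptions, not Drifter’s actual API, and exact sets stand in for the probabilistic counters used in production.</p>

```python
from collections import defaultdict

def batch_feature_stats(batch):
    """Per-feature coverage and cardinality for one mini-batch.

    `batch` is a list of dicts mapping feature names to values; a missing
    key (or a None value) means the feature is absent in that instance.
    Illustrative sketch: exact sets instead of probabilistic counting.
    """
    n = len(batch)
    present = defaultdict(int)  # instances in which the feature appears
    values = defaultdict(set)   # distinct values observed per feature
    for instance in batch:
        for name, value in instance.items():
            if value is None:
                continue
            present[name] += 1
            values[name].add(value)
    return {
        name: {
            "coverage": present[name] / n,     # fraction of instances covered
            "cardinality": len(values[name]),  # number of distinct values
        }
        for name in present
    }

batch = [
    {"country": "SI", "device": "mobile"},
    {"country": "IL", "device": None},
    {"country": "SI"},
]
stats = batch_feature_stats(batch)
# "country" appears in all three instances (coverage 1.0, cardinality 2);
# "device" appears once (coverage 1/3, cardinality 1).
```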
      <p>[Figure 1: Drifter instances consume data batches (different use cases) through a job scheduler, ranking engine and Hive/metrics bindings.]</p>
        <p>In order for Drifter to be available to as many use cases as possible, we optimized the service
to a point it requires less than 1GB of RAM and only two threads. This was achieved by
inspecting and optimizing Hive queries and the ranking engine itself. Optimizations that enable
such low footprint include mini-batch feature ranking, probabilistic estimation of cardinalities
(Hyperloglog-based counting), hashing trick for clipping values to a fixed (integer) range and
randomized estimation of feature interactions – for each mini-batch, the number of interactions
computed is upper-bounded by a fixed maximum number, ensuring consistent performance.
An overview of a live benchmark of the service on a week of production data is shown in Figure 2.
The memory limit (last plot in (a)) was set to 1GB. It can be observed that, on average, less than a
single CPU is used. Memory spikes observed around, e.g., the 3rd of July correlate with variability
in the data quantity received by the service, showing the resilience of Drifters to traffic spikes and
similar events. Further, it can be observed that fluctuations in the data have an impact that stays
within the resource constraints of the deployed pod. Each Drifter instance, as soon as it is
deployed, produces metrics for resource consumption (apart from the ones related to the feature
space). Subfigure (b) demonstrates the performance of the in-built ranking engine, in particular the
Numba-based re-implementation of mutual information. The algorithm was further extended
to skip computations that are redundant due to data sparsity – the benchmark shows evaluations
of input vectors of different sizes with varying degrees of built-in sparseness (from 1% to 50%
present values). For very sparse inputs, sparsity-aware mutual information can be substantially
faster (on average it is comparable to the baseline).</p>
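        <p>The idea behind the sparsity-aware mutual information mentioned above can be illustrated in plain Python (the production version is a Numba-compiled re-implementation; this sketch only demonstrates skipping the all-zero positions and inferring their joint count instead of iterating over them):</p>

```python
import math
from collections import Counter

def mutual_information(x, y):
    """Plain mutual information between two discrete sequences (natural log)."""
    n = len(x)
    joint = Counter(zip(x, y))
    px, py = Counter(x), Counter(y)
    return sum(
        (c / n) * math.log((c / n) / ((px[a] / n) * (py[b] / n)))
        for (a, b), c in joint.items()
    )

def mutual_information_sparse(x, y):
    """Sparsity-aware variant: touch only positions where x or y is nonzero;
    the count of (0, 0) pairs is inferred rather than iterated over."""
    n = len(x)
    nnz = [i for i in range(n) if x[i] != 0 or y[i] != 0]
    joint = Counter((x[i], y[i]) for i in nnz)
    joint[(0, 0)] += n - len(nnz)  # every remaining position is a (0, 0) pair
    px = Counter(x[i] for i in nnz)
    px[0] += n - len(nnz)
    py = Counter(y[i] for i in nnz)
    py[0] += n - len(nnz)
    return sum(
        (c / n) * math.log((c / n) / ((px[a] / n) * (py[b] / n)))
        for (a, b), c in joint.items() if c
    )

# Mostly-zero inputs: the sparse variant inspects only three positions.
x = [0, 0, 1, 0, 2, 0, 0, 1, 0, 0]
y = [0, 0, 1, 0, 2, 0, 0, 0, 0, 0]
assert abs(mutual_information(x, y) - mutual_information_sparse(x, y)) < 1e-12
```

Both functions compute the same quantity; the sparse variant simply spends work proportional to the number of non-zero entries rather than to the vector length.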
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Visualization layer</title>
        <p>Vital components of each Drifter instance are its resulting visualizations. The design choice of
metric-based visualizations (Prometheus and Grafana) enabled us to generalize metric retrieval
and, at the same time, facilitate the creation of custom metrics and alerts based on them. Further,
as Prometheus comes equipped with a collection of aggregation functions, users can create
complex queries based on many existing examples while also sharing the knowledge – resulting
PromQL queries can easily be shared and tested across use cases. Similarly, Grafana-based
visualizations are a direct extension of the metrics. Out-of-the-box capabilities suffice for most
use cases, are flexible and customizable, and are easy to maintain or change if needed. An
example visualization of feature coverage in time is shown in Figure 3a. A similar view showing
features’ cardinalities is shown in Figure 3b.</p>
        <p>A more complex example includes the computation of drifts – changes in the feature’s value
within a given (parametrized) time frame. Parameterization of this aspect was necessary, as
different use cases adhere to different temporal dynamics. An example is shown in Figure 4.</p>
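        <p>The drift computation described above – the change of a feature’s metric within a given (parametrized) time frame – can be sketched as follows; the function names, the offset/threshold parametrization and the sample series are illustrative, not Drifter’s implementation.</p>

```python
def relative_drift(series, offset):
    """Relative change (%) of the latest value of a metric series against the
    value `offset` steps earlier – a simplified analogue of a PromQL `offset`
    comparison. Returns None when not enough history is available."""
    if len(series) <= offset:
        return None
    past, current = series[-1 - offset], series[-1]
    if past == 0:
        return None  # no usable baseline; avoid division by zero
    return (current - past) / past * 100.0

def drift_alerts(series, offset, threshold_pct):
    """Flag the time steps at which the relative drift exceeds the threshold."""
    return [
        t for t in range(len(series))
        if (d := relative_drift(series[: t + 1], offset)) is not None
        and abs(d) > threshold_pct
    ]

# A feature's coverage drops sharply at t=4; with an offset of two steps and a
# 20% threshold, both post-drop points are flagged.
coverage = [0.95, 0.96, 0.95, 0.94, 0.60, 0.58]
alerts = drift_alerts(coverage, offset=2, threshold_pct=20.0)
```

Different use cases would choose different `offset` and `threshold_pct` values, mirroring the parametrized time frames mentioned above.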
        <p>The visualizations above are possible with out-of-the-box Grafana capabilities. However,
more custom visualizations are also possible and are implemented at the pod level. An example
includes interactive hierarchical clusters of time series of features’ cardinalities.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Use case example: Anomaly detection</title>
      <p>Having discussed the overall Drifter architecture, we proceed with a collection of production
use cases where Drifter was used to facilitate and enable data-related monitoring.</p>
      <p>Models responsible for click-through and conversion rate prediction can be comprised of
hundreds of features. By monitoring the distribution of each feature in production and its
shifts, Drifter instances alert the users when, e.g., a feature’s coverage or cardinality score
changes beyond expectations.</p>
      <p>[Figure 2: (a) Overview of resource utilization of a Drifter instance (seven-day period) that monitors CTR-related traffic. The Drifter pod is stable even during traffic peaks (last sub-plot); CPU utilization is minimal (first plot). (b) Performance of the Numba-based mutual information implementation against the established Scikit-learn one (different configurations, each run ten times); series: SKlearn-MI, MI-Numba and MI-Numba (sparse). Green samples represent a realistic scenario where only 30% or fewer of the values are present.]</p>
      <p>We describe a use case where a feature whose productization required effort from
multiple teams needed additional inspection because it misbehaved online – being
one of the more relevant features, its impact on model quality could be detected (with a slight delay).
The visualization layer of Drifter (Grafana-based visualizations of PromQL-based queries),
apart from monitoring of the feature’s distribution, also enables comparisons to previous time
points. If the difference in a feature’s coverage is beyond a pre-defined, acceptable threshold,
Drifter logs it as an anomaly (visible in a designated dashboard panel), and can trigger the
related alert. Each visualization considers metrics derived from the main signals outputted by
each Drifter instance. PromQL query examples are summarized in Table 2. The examples are
straightforward to implement and are easily extensible with Prometheus’s in-built capabilities
related to computation of derivatives, linear extrapolation and similar. However, we observed
that simple "deltas" of a quantity within a designated time frame already offer sufficient
information for a better understanding of the issue at hand. Note that this setup does not require
any external anomaly detection services (e.g., ‘prometheus-anomaly-detector‘13), even though
it can be extended to offer such functionalities should the need arise.</p>
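      <p>As a hedged illustration, one alerting condition of this kind – the stddev-versus-mean rule summarized in Table 2 – can be mimicked in plain Python; this is an analogue of the PromQL query, not the production setup.</p>

```python
import statistics

def is_anomalous(window):
    """Anomaly heuristic mirroring the stddev rule from Table 2: a feature's
    metric is flagged when its standard deviation over the window exceeds
    half of its average value over the same window."""
    if len(window) < 2:
        return False  # not enough points for a sample standard deviation
    return statistics.stdev(window) > 0.5 * statistics.fmean(window)

stable = [10.0, 10.5, 9.8, 10.2, 10.1]   # stdev ~0.26 vs half-mean ~5.06
erratic = [10.0, 2.0, 18.0, 1.0, 19.0]   # stdev ~8.5  vs half-mean  5.0
```

In production the same condition is expressed directly in PromQL and evaluated by the metrics backend, so no per-pod code is required.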
    </sec>
    <sec id="sec-5">
      <title>5. Lessons learned and conclusions</title>
      <p>Initially, Drifter was conceived as a stand-alone service with its front-end interface, enabling
users to explore "personalized" visualizations independently of others. Even though part of this
implementation remains available for each Drifter pod, we realized that the users’ needs to
define novel metrics of interest over longer time ranges could not be mimicked elegantly at the
level of a custom in-house front-end solution. Furthermore, the architectural change that came
with one UI per pod included extended data retention (per pod), causing additional disk overhead,
even though raw outputs are no longer needed as soon as the metrics are computed. By adopting
the metric-push approach, we substantially reduced the computational resources required per
Drifter pod and, at the same time, facilitated visualizations.
13https://github.com/AICoE/prometheus-anomaly-detector</p>
      <p>[Figure captions: (a) One week of coverage information for a given feature space (click-through rate model). (b) Standard deviation of mutual information – a score of features’ importance. (a) An example visualization of a metric associated with feature drifts; periodic and anomalous behavior are considered. (b) Apart from basic statistics such as coverage and cardinality, Drifter also performs online feature ranking. (b) Coverage drift events in real-time.]</p>
      <p>[Table 2: PromQL query examples.
(avg by(feature_name)() offset interval) / (avg by(feature_name)()) * 100 &gt; threshold
(avg by(feature_name)() offset interval) − (avg by(feature_name)()) * 100 &gt; threshold
stddev by(feature_name)() &gt; 1/2 * avg by(feature_name)()]</p>
      <p>Drifter was initially designed to
incorporate all required algorithms for fast and accurate ranking. However, after realizing that
our in-house, already optimized feature ranking solution enables fast-enough ranking
out-of-the-box, we proceeded by creating an interface to it rather than writing a Drifter-specific
solution. This way, algorithmic improvements already present in the existing feature ranking
engine could be further optimized and modified for Drifter. Furthermore, should the need arise,
switching the ranking engine is straightforward – a matter of changing the interface. We finally
discuss the design choice of using the same data pipeline/processing steps as most machine
learning flows. As Drifter can operate with raw Hive dumps directly, there was initially
no need to be entirely aligned with the processing steps undertaken to prepare data for, e.g.,
CTR prediction. However, we concluded that approximating the existing flows (data-wise) is
a mandatory capability, as Drifters are, in most use cases, utilized to help explain anomalies
or drifts of data that is fed to the prediction engine(s). This way, alignment up to the level of
input (Vowpal Wabbit format - compressed) enabled us to approximate and enable the study
of the very inputs that are used for downstream machine learning (e.g., CTR, CVR). Finally,
we discussed the design choices made along the way to guide similar projects in avoiding the
repetition of mistakes that were successfully addressed through the implementation of Drifter.
The feature ranking and verification engine used in Drifter will be made public soon as an open
source project.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S. C.</given-names>
            <surname>Hoi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Sahoo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <article-title>Online learning: A comprehensive survey</article-title>
          ,
          <source>Neurocomputing</source>
          <volume>459</volume>
          (
          <year>2021</year>
          )
          <fpage>249</fpage>
          -
          <lpage>289</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>R.</given-names>
            <surname>Hosseini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Patra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Su</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Arora</surname>
          </string-name>
          ,
          <article-title>Greykite: a flexible, intuitive and fast forecasting library</article-title>
          ,
          <year>2021</year>
          . URL: https://github.com/linkedin/greykite.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.</given-names>
            <surname>Haug</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Pawelczyk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Broelemann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Kasneci</surname>
          </string-name>
          ,
          <article-title>Leveraging model inherent variable importance for stable online feature selection</article-title>
          ,
          <source>in: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery &amp; Data Mining</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>1478</fpage>
          -
          <lpage>1502</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S. C.</given-names>
            <surname>Hoi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Jin</surname>
          </string-name>
          ,
          <article-title>Online feature selection for mining big data</article-title>
          ,
          <source>in: Proceedings of the 1st international workshop on big data, streams and heterogeneous source mining: Algorithms, systems, programming models and applications</source>
          ,
          <year>2012</year>
          , pp.
          <fpage>93</fpage>
          -
          <lpage>100</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <article-title>Online feature selection with group structure analysis</article-title>
          ,
          <source>IEEE Transactions on Knowledge and Data Engineering</source>
          <volume>27</volume>
          (
          <year>2015</year>
          )
          <fpage>3029</fpage>
          -
          <lpage>3041</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>G.</given-names>
            <surname>Manikandan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Abirami</surname>
          </string-name>
          ,
          <article-title>Feature selection is important: state-of-the-art methods and application domains of feature selection on high-dimensional data</article-title>
          ,
          <source>Applications in Ubiquitous Computing</source>
          (
          <year>2021</year>
          )
          <fpage>177</fpage>
          -
          <lpage>196</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Barddal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. M.</given-names>
            <surname>Gomes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Enembreck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Pfahringer</surname>
          </string-name>
          ,
          <article-title>A survey on feature drift adaptation: Definition, benchmark, challenges and future directions</article-title>
          ,
          <source>Journal of Systems and Software</source>
          <volume>127</volume>
          (
          <year>2017</year>
          )
          <fpage>278</fpage>
          -
          <lpage>294</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Barddal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. M.</given-names>
            <surname>Gomes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Enembreck</surname>
          </string-name>
          ,
          <article-title>Analyzing the impact of feature drifts in streaming learning</article-title>
          ,
          <source>in: Neural Information Processing: 22nd International Conference, ICONIP 2015, Istanbul, Turkey, November 9-12, 2015, Proceedings, Part I</source>
          , Springer,
          <year>2015</year>
          , pp.
          <fpage>21</fpage>
          -
          <lpage>28</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>F.</given-names>
            <surname>Pedregosa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Varoquaux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gramfort</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Michel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Thirion</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Grisel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Blondel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Prettenhofer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Weiss</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Dubourg</surname>
          </string-name>
          , et al.,
          <article-title>Scikit-learn: Machine learning in Python</article-title>
          ,
          <source>Journal of Machine Learning Research</source>
          <volume>12</volume>
          (
          <year>2011</year>
          )
          <fpage>2825</fpage>
          -
          <lpage>2830</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>S. K.</given-names>
            <surname>Lam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Pitrou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Seibert</surname>
          </string-name>
          ,
          <article-title>Numba: A LLVM-based Python JIT compiler</article-title>
          ,
          <source>in: Proceedings of the Second Workshop on the LLVM Compiler Infrastructure in HPC</source>
          ,
          <year>2015</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>