<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Combining Supervised and Unsupervised Learning for Efficient, Explainable, and Interoperable Anomaly Detection</article-title>
      </title-group>
      <contrib-group>
        <aff>Israel</aff>
        <contrib contrib-type="author">
          <string-name>Grisha Weintraub</string-name>
          <email>grisha.weintraub@ibm.com</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Leonid Rise</string-name>
          <email>leonid.rise@ibm.com</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Doron Hillman</string-name>
          <email>doron.hillman@ibm.com</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Paula Ta-Shma</string-name>
          <email>paula@il.ibm.com</email>
        </contrib>
      </contrib-group>
      <pub-date>
        <year>2026</year>
      </pub-date>
      <abstract>
        <p>Anomaly detection is a widely used technique in various domains. While much of the research on anomaly detection focuses on improving detection accuracy and optimizing the training process, our approach emphasizes practical considerations, specifically addressing explainability, interoperability, and efficiency in real-world applications. Explainability refers to how clearly users can understand the reasoning behind why a specific point is labeled as an anomaly, interoperability pertains to how easily the anomaly detection module can be integrated into an application, and efficiency relates to the overhead associated with using the anomaly detection module in an application. We propose a practical approach that combines both supervised and unsupervised learning to enhance all three of these aspects. Our solution has been deployed in a real-world application for six months, and in this paper, we share the lessons learned during this period.</p>
      </abstract>
      <kwd-group>
        <kwd>Anomaly Detection</kwd>
        <kwd>Machine Learning</kwd>
        <kwd>Isolation Forest</kwd>
        <kwd>Decision Tree</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Anomaly detection (AD) plays a vital role across various real-life applications. It is extensively used in
the financial sector to identify fraudulent activity [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], in cybersecurity to detect potential threats [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ],
in medicine to recognize pathological processes [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], and in numerous other fields [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. The high-level
architecture of such applications is shown in Figure 1:
1. A user submits a request to the application, such as a credit card transaction or the results
of a specific medical test.
2. The application receives the request, extracts critical data from it, and stores this information in a
database. For instance, important data for a credit card transaction may include location, time,
and amount. In the context of a medical test, relevant data could consist of various metrics such as
blood count.
3. A periodic offline process trains on the data in the database to build an anomaly detection model.
4. The model is then made accessible to the application.
5. The application uses the model to determine in real time whether a particular user request is
an anomaly.
6. Based on the model’s response, the application makes a business decision, such as blocking a
credit card transaction or sending an email to a patient or their doctor.
      </p>
      <p>This architecture, while simple, has several practical limitations that lead to the following research
questions addressed in this paper:
• Q1 (Explainability): The AD model operates as a black box, making it difficult for users to
understand why a specific point was identified as an anomaly. Can we enhance its explainability?
• Q2 (Efficiency): The software that runs the AD model needs to operate very quickly (think
about the urgency of a credit card transaction), where every nanosecond matters. Can we reduce
the inference time?
• Q3 (Interoperability): Integrating the AD model into an application can be challenging
due to various practical considerations, such as where the model is stored and how it is accessed.
Can we simplify the integration process with the application?</p>
      <p>In this paper, we explore these questions. Our main contributions can be summarized as follows:
• We analyze the practical limitations of AD models in real-world applications and propose solutions
to address these challenges.
• We implement our proposed solution in a real-world application, evaluate its effectiveness, and
assess the associated trade-offs.</p>
      <p>The paper is organized as follows: In Section 2, we provide a formal definition of the problem. Section
3 outlines our approach to addressing this problem, discussing the rationale behind why our solution is
expected to answer research questions Q1-Q3, as well as the potential trade-offs involved. In Section 4,
we present our experimental evaluation based on the implementation of our solution in a real-world
application. Section 5 reviews related work, and Section 6 concludes the paper.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Problem Statement</title>
      <p>We consider the architecture illustrated in Figure 1 and assume that the data extracted from user requests
(step 2) is in a relational format. Specifically, we assume that each user request $r$ is transformed into a
tuple $t$ by a function $f$. The tuple $t = f(r)$ consists of pairs $\{(c_i, v_i) \mid c_i \in C, v_i \in \mathbb{R}, i \in \{1, \dots, n\}\}$,
where $C$ represents a set of $n$ column names and $\mathbb{R}$ denotes the set of real numbers. These tuples are
stored in a database $D$, which can be a relational database or a data lake.</p>
      <p>
        In step 3, we apply an unsupervised AD algorithm (e.g., Local Outlier Factor [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] or Isolation Forest
[
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]) to a subset of columns in the database to build an AD model, denoted as $M$, that will be used
in the application (step 5). It is possible to create multiple models during this step, such as distinct
models for each customer or separate models for different features. Different features are defined by
different subsets of columns. For example, we can have model $M_1$ trained on columns $c_1, c_3, c_7$ and
model $M_2$ trained on columns $c_2, c_7, c_8, c_9$. For each AD model $M$, we assign a projection function
$\pi_M$ that takes the tuple $f(r)$ and retains only the relevant columns for the model $M$. A model $M$ is a
function that takes as an argument $\pi_M(f(r))$ and returns a boolean flag indicating whether the given
request is considered an anomaly.
      </p>
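      <p>The relationship between a tuple, a projection function, and a model can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column names, the example values, and the threshold_model stand-in are hypothetical, and a real model would be an unsupervised detector such as Isolation Forest.</p>
      <preformat><![CDATA[
```python
from typing import Callable, Dict, Iterable

Row = Dict[str, float]  # a tuple t = f(r): column name -> real value


def make_projection(columns: Iterable[str]) -> Callable[[Row], Row]:
    """Build a projection function: keep only the columns a model was trained on."""
    wanted = tuple(columns)

    def project(row: Row) -> Row:
        return {name: row[name] for name in wanted}

    return project


def threshold_model(row: Row) -> bool:
    """Hypothetical stand-in for a model M; not a real anomaly detector."""
    return row["c1"] > 90.0


# Assumed example tuple produced by f(r); names and values are illustrative.
t = {"c1": 95.0, "c2": 3.0, "c3": 12.5, "c7": 0.4}
pi_m1 = make_projection(["c1", "c3", "c7"])

projected = pi_m1(t)                     # only the columns used by this model
is_anomaly = threshold_model(projected)  # boolean flag returned by the model
```
]]></preformat>
      <p>Keeping the projection separate from the model lets several models with different column subsets share the same stored tuples.</p>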
      <p>
        The models $M$ are made accessible to the application (step 4), either by deploying them on a dedicated
service (e.g., AWS SageMaker [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]) or by keeping them in the application’s memory. For each user
request $r$ and model $M$, the application determines whether the request is an anomaly based on the
response from $M(\pi_M(f(r)))$ and executes the corresponding business operation (step 6).
      </p>
      <p>The goal of this paper is to address research questions Q1-Q3 within the framework of the presented
system model.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Our Approach</title>
      <p>Our approach enhances the flow shown in Figure 1 by expanding it to the one presented in Figure 2.
Steps 1 to 4 remain unchanged. However, in step 5, instead of directly exposing the AD model (Model 1)
to the application, we use it to label our training set (which is the same training set used to train Model
1 in step 3 of Figure 1). For each tuple $t$, we add an additional boolean column labeled "is-anomaly,"
which indicates whether the specific tuple is an anomaly.</p>
      <p>Formally, for each model $M$ and each tuple $t \in D$, we build a new training set $D'$, such that for each
$t' \in D'$, the following holds:</p>
      <p>$t' = t \cup (\text{"is-anomaly"}, M(\pi_M(t)))$.</p>
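      <p>The labeling step can be illustrated with a short Python sketch. It assumes tuples are held as dictionaries; stand_in_model and keep_c1 are hypothetical placeholders for the trained detector and its projection function, not the components used in the deployed system.</p>
      <preformat><![CDATA[
```python
from typing import Callable, Dict, List, Union

Row = Dict[str, float]
LabeledRow = Dict[str, Union[float, bool]]


def label_training_set(
    rows: List[Row],
    model: Callable[[Row], bool],
    projection: Callable[[Row], Row],
) -> List[LabeledRow]:
    """Extend every tuple with an "is-anomaly" flag computed by the model."""
    labeled: List[LabeledRow] = []
    for row in rows:
        new_row: LabeledRow = dict(row)  # copy, so the original set is untouched
        new_row["is-anomaly"] = model(projection(row))
        labeled.append(new_row)
    return labeled


# Hypothetical stand-ins: a real pipeline would call the trained detector here.
def stand_in_model(row: Row) -> bool:
    return row["c1"] > 90.0


def keep_c1(row: Row) -> Row:
    return {"c1": row["c1"]}


D = [{"c1": 10.0, "c2": 1.0}, {"c1": 95.0, "c2": 2.0}]
D_prime = label_training_set(D, stand_in_model, keep_c1)
```
]]></preformat>
      <p>The resulting labeled set has the same rows as the original plus one boolean column, which is the form a supervised learner expects.</p>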
      <p>In step 6, we perform supervised learning using a Decision Tree on the labeled training set $D'$ to
create a model (Model 2) that predicts whether a given instance is an anomaly. As a result, in step 7, we
have a decision tree model that makes anomaly predictions similar to those of Model 1, but in a more
interpretable format.</p>
      <p>In step 8, we apply Algorithm 1 to extract rules from Model 2. These rules are predicates in
Disjunctive Normal Form (DNF), which look like $(a_{11} \land a_{12} \land \dots) \lor (a_{21} \land a_{22} \land \dots) \lor \dots \lor (a_{k1} \land a_{k2} \land \dots)$.
A term $a_{ij}$ is a condition of type &lt;column op value&gt; (e.g., age &gt; 40), where columns are taken from
$C$, values from $\mathbb{R}$, and $op \in \{\le, &gt;\}$.</p>
      <p>The rules are then stored in the application’s memory in step 9. In step 10, similar to step 6 in Figure
1, the application can use these rules to assess user requests and determine whether a specific request is
an anomaly.</p>
      <p>Algorithm 1 Extracting DNF Rules from a Decision Tree</p>
      <preformat>procedure ExtractRules(root)
    rules ← {}
    FindPaths(root, “”, rules)
    return ⋁ p for all p ∈ rules
end procedure

procedure FindPaths(node, path, rules)
    if node is a leaf then
        if node.label = true then
            rules ← rules ∪ {path}
        end if
        return
    end if
    leftCond ← node.column ≤ node.threshold
    leftPath ← Append(path, leftCond)
    FindPaths(node.left, leftPath, rules)
    rightCond ← node.column &gt; node.threshold
    rightPath ← Append(path, rightCond)
    FindPaths(node.right, rightPath, rules)
end procedure

procedure Append(path, cond)
    if path is empty then
        return cond
    else
        return path ∧ cond
    end if
end procedure</preformat>
      <p>Algorithm 1 traverses the decision tree from the root to the leaves. At each internal node, it records
the split condition (using ≤ for the left branch and &gt; for the right branch) while building the current
path. When it reaches a leaf labeled "true," which indicates an anomaly, it adds the conjunction of
conditions for that path to the rule set.</p>
      <p>The final output consists of all these paths combined with OR, resulting in a DNF formula. The
running time of this algorithm is $O(L \times d)$, where $L$ represents the number of leaves and $d$ denotes the
tree depth.</p>
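      <p>Algorithm 1 translates almost directly into Python. The sketch below is a minimal illustration, not the production implementation: the Node class, its field names, and the example tree are assumptions made for the example, and the DNF formula is represented as a list of conjunctions, each a list of (column, operator, threshold) triples.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Condition = Tuple[str, str, float]  # (column, operator, threshold)


@dataclass
class Node:
    # Leaf nodes carry only `label`; internal nodes carry a split and children.
    label: Optional[bool] = None
    column: Optional[str] = None
    threshold: Optional[float] = None
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def extract_rules(root: Node) -> List[List[Condition]]:
    """Return a DNF formula as a list (OR) of conjunctions (AND)."""
    rules: List[List[Condition]] = []
    find_paths(root, [], rules)
    return rules


def find_paths(node: Node, path: List[Condition], rules: List[List[Condition]]) -> None:
    if node.label is not None:  # leaf
        if node.label:          # only anomaly leaves contribute a rule
            rules.append(path)
        return
    # `path + [...]` builds a new list, so sibling branches never share state.
    find_paths(node.left, path + [(node.column, "<=", node.threshold)], rules)
    find_paths(node.right, path + [(node.column, ">", node.threshold)], rules)


# Hypothetical tree: anomalies are c1 > 90, or c1 <= 90 together with c2 <= 5.
tree = Node(
    column="c1",
    threshold=90.0,
    left=Node(
        column="c2",
        threshold=5.0,
        left=Node(label=True),
        right=Node(label=False),
    ),
    right=Node(label=True),
)

rules = extract_rules(tree)
```
]]></preformat>
      <p>Each root-to-leaf path is copied at most once per level, which matches the bound of leaves times depth stated above.</p>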
      <p>We will now briefly review research questions Q1-Q3 and discuss how our approach addresses them.
• Q1 (Explainability): Unlike the baseline approach, where the model functions as a black
box, our method provides clear, human-readable logic for anomaly detection decisions (e.g., age &gt;
40 AND 2 ≤ 76). As a result, the explainability of our method is significantly enhanced.
• Q2 (Efficiency): We will present an efficiency analysis based on real-life tests in the following
section. The intuition behind why our approach may be faster lies in the fact that, instead of
complex model inference, we perform only basic operations, mainly number comparisons and
boolean expression evaluations.
• Q3 (Interoperability): Our approach simplifies integration with applications compared to
the baseline. First, unlike the machine learning (ML) model, our rules occupy minimal memory
space, making them easy to store and manage. Second, our rules can be interpreted in
any programming language, including even esoteric ones, since they rely on fundamental
primitives and do not require a special format. In contrast, accessing ML models typically requires
mainstream programming languages.</p>
      <p>The advantages mentioned above come with certain drawbacks. Two main issues with our approach
compared to the baseline are as follows:
1. There is an overhead in steps 5-8 of Figure 2 due to the processes involved in labeling the
training set, supervised learning, and rule extraction. However, since this overhead occurs only
occasionally (during retraining), its impact is not significant.
2. Our final model (Model 2) is designed to predict the outcomes of the original model (Model 1).
The prediction accuracy is unlikely to reach 100%, meaning our rules may not be as
precise as the original model. This trade-off is the primary drawback of our approach, as
we prioritize explainability, efficiency, and interoperability over accuracy.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Real-life Evaluation</title>
      <p>We have implemented our approach in a real-life application that detects anomalous data points. The
overall system architecture follows the flow in Figure 2. This application has been running in production
mode for over six months. In this section, we will present our evaluation results while keeping internal
application details confidential.</p>
      <p>
        Our system follows the steps presented in Figure 2, with the following nuances:
• Step 1: The application receives billions of requests containing hundreds of different attributes.
• Step 2: Each request is stored in a relational format within Parquet files located in the S3 data
lake.
• Step 3: A Spark MLlib job running on an AWS EMR cluster processes the data in the data lake
to create Isolation Forest models. Five types of models are generated (identified by the letters A
to E), differing in both the records and columns used for training.
• Step 4: The models created in the previous step are stored in ONNX format [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] in S3.
• Step 5: Models from the previous step are used to add "is-anomaly" labels to the training set
for each model.
• Step 6: Another Spark MLlib job running on the AWS EMR cluster creates a Decision Tree
model from the labeled training sets.
• Step 7: The Decision Tree model from the previous step is kept in the memory of the Spark
cluster.
• Step 8: The same Spark cluster runs Algorithm 1 to process the Decision Tree models and
generate DNF rules.
• Step 9: The rules are stored in S3 in JSON format and exposed to the application. A real-life
example of the rules, using anonymized column names, is shown in Listing 1.
      </p>
      <p>Listing 1: DNF rules example</p>
      <preformat>"or": [
  {
    {
    }
  },
  {"and": [{"field": "c1", "value": 90, "operator": "&gt;"}]}
]</preformat>
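      <p>To illustrate why such rules are easy to consume from any language, the following Python sketch evaluates a rule document of this shape against a single record. It assumes the rules are wrapped in a top-level JSON object with an "or" key; the second clause is taken from Listing 1, while the first clause and the test records are made up for the example.</p>
      <preformat><![CDATA[
```python
import json
import operator
from typing import Any, Dict

# Only the two operators that a term may use according to Section 3.
OPERATORS = {"<=": operator.le, ">": operator.gt}


def is_anomaly(rules: Dict[str, Any], record: Dict[str, float]) -> bool:
    """Evaluate {"or": [{"and": [term, ...]}, ...]} against one record."""
    return any(
        all(
            OPERATORS[term["operator"]](record[term["field"]], term["value"])
            for term in clause["and"]
        )
        for clause in rules["or"]
    )


# The second clause comes from Listing 1; the first clause is hypothetical.
rules_json = """
{"or": [
  {"and": [{"field": "c1", "value": 10, "operator": "<="},
           {"field": "c2", "value": 5, "operator": ">"}]},
  {"and": [{"field": "c1", "value": 90, "operator": ">"}]}
]}
"""
rules = json.loads(rules_json)

flag_high = is_anomaly(rules, {"c1": 95, "c2": 0})  # matches the second clause
flag_mid = is_anomaly(rules, {"c1": 50, "c2": 9})   # matches no clause
flag_low = is_anomaly(rules, {"c1": 3, "c2": 9})    # matches the first clause
```
]]></preformat>
      <p>The evaluator needs nothing beyond a JSON parser and two numeric comparisons, which is the property the interoperability argument relies on.</p>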
      <p>We will now compare our approach with the baseline described in Figure 1 across the following
dimensions:
• Storage: Comparing the storage size of the rules (in JSON format) with that of the Isolation
Forest model (iForest) in ONNX format.
• Performance: Comparing the inference performance of the rules against the Isolation Forest
model.
• Accuracy: Analyzing the accuracy degradation of our rules in comparison to the Isolation Forest
model.</p>
      <sec id="sec-4-1">
        <title>4.1. Storage</title>
        <p>The results of the storage evaluation are presented in Table 1. For each model, we report the storage
size and memory loading time for each case (rules in JSON format versus iForest in ONNX format).</p>
        <p>Both the storage size and loading time were an order of magnitude smaller with our approach (using
rules) compared to the baseline (using iForest) across all models.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Performance</title>
        <p>The performance evaluation is summarized in Table 2. We assessed two key metrics for each model
(iForest vs. Rules): offline processing overhead and inference time.</p>
        <p>For the offline processing measurement, we recorded the time required to train the iForest model and
compared it with the time taken by our approach, which includes labeling, decision tree training, and
rule extraction combined.</p>
        <p>For the inference evaluation, we used real traffic data, and the number of records for each model is
indicated in the table under "Records Num." The inference time is calculated as the average time taken
to make a prediction for a single record.</p>
        <p>The results show that our rules inference method outperformed iForest inference across all
models, achieving an overall average advantage of 33%. While offline processing overhead increased by
approximately 50%, it is important to note that this overhead occurs only occasionally during retraining,
making it relatively insignificant.</p>
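        <table-wrap>
          <label>Table 2</label>
          <table>
            <thead>
              <tr><th>Model Type</th><th>Records Num</th><th>Rules Inference (ns)</th><th>iForest Inference (ns)</th><th>Rules Extraction (sec)</th><th>iForest Training (sec)</th></tr>
            </thead>
            <tbody>
              <tr><td>A</td><td>2,301,358</td><td>7,555</td><td>9,112</td><td>52</td><td>132</td></tr>
              <tr><td>B</td><td>32,677,265</td><td>6,916</td><td>9,700</td><td>82</td><td>150</td></tr>
              <tr><td>C</td><td>791,083</td><td>7,702</td><td>8,566</td><td>63</td><td>121</td></tr>
              <tr><td>D</td><td>36,453,915</td><td>2,686</td><td>8,938</td><td>48</td><td>117</td></tr>
              <tr><td>E</td><td>5,472,485</td><td>9,398</td><td>9,619</td><td>83</td><td>172</td></tr>
            </tbody>
          </table>
        </table-wrap>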
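        <p>The per-model figures behind this comparison can be derived from the Table 2 values with a few lines of Python. The sketch below is illustrative only: the two ratio definitions (relative reduction in inference time, and extra offline time relative to iForest training time) are assumptions about how such numbers might be computed, and no aggregate across models is calculated.</p>
        <preformat><![CDATA[
```python
# Values copied from Table 2:
# (rules inference ns, iForest inference ns, rules extraction s, iForest training s)
TABLE_2 = {
    "A": (7555, 9112, 52, 132),
    "B": (6916, 9700, 82, 150),
    "C": (7702, 8566, 63, 121),
    "D": (2686, 8938, 48, 117),
    "E": (9398, 9619, 83, 172),
}


def inference_reduction(rules_ns: float, iforest_ns: float) -> float:
    """Fraction of iForest inference time saved by evaluating rules instead."""
    return 1.0 - rules_ns / iforest_ns


def offline_overhead(extraction_s: float, training_s: float) -> float:
    """Extra offline time, expressed relative to the iForest training time."""
    return extraction_s / training_s


reductions = {m: inference_reduction(r, i) for m, (r, i, _, _) in TABLE_2.items()}
overheads = {m: offline_overhead(e, t) for m, (_, _, e, t) in TABLE_2.items()}

for model in TABLE_2:
    print(f"{model}: inference -{reductions[model]:.1%}, offline +{overheads[model]:.1%}")
```
]]></preformat>
        <p>Under these definitions the gain varies considerably between models, with model D benefiting the most and model E the least.</p>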
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Accuracy</title>
        <p>We evaluate the accuracy of our approach by comparing the prediction results of the rules generated
by our method to the results produced by the iForest model. We use the same validation sets as those
used in the inference evaluation in the previous section (4.2). Our analysis focuses on three key metrics:
Precision (True Positives / (True Positives + False Positives)), Recall (True Positives / (True Positives
+ False Negatives)), and the F1 score (a balanced measure of Precision and Recall). The findings are
summarized in Table 3.</p>
        <p>The accuracy of the models varies, with an average F1 score of around 77%. However, it is important
to note that many of the samples misclassified by the rules had iForest scores that were only a small margin
away from the decision threshold. This indicates that the practical accuracy may be better than what is
shown in Table 3. For example, we found that approximately 87% of all misclassified points were within
just 0.05 of the iForest threshold (the score that distinguishes between anomalies and valid points).</p>
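        <table-wrap>
          <label>Table 3</label>
          <table>
            <thead>
              <tr><th>Model Type</th><th>Precision</th><th>Recall</th></tr>
            </thead>
            <tbody>
              <tr><td>A</td><td>84.73%</td><td>66.87%</td></tr>
              <tr><td>B</td><td>69.47%</td><td>69.58%</td></tr>
              <tr><td>C</td><td>91.09%</td><td>72.92%</td></tr>
              <tr><td>D</td><td>77.83%</td><td>77.58%</td></tr>
              <tr><td>E</td><td>72.84%</td><td>93.08%</td></tr>
            </tbody>
          </table>
        </table-wrap>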
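        <p>As a worked example of the F1 metric, the sketch below computes it as the harmonic mean of the Precision and Recall values listed above. The per-model F1 values are derived here from those two columns rather than copied from the paper, so small rounding differences are possible.</p>
        <preformat><![CDATA[
```python
# Precision and Recall copied from Table 3, expressed as fractions.
TABLE_3 = {
    "A": (0.8473, 0.6687),
    "B": (0.6947, 0.6958),
    "C": (0.9109, 0.7292),
    "D": (0.7783, 0.7758),
    "E": (0.7284, 0.9308),
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


f1_by_model = {m: f1_score(p, r) for m, (p, r) in TABLE_3.items()}
average_f1 = sum(f1_by_model.values()) / len(f1_by_model)

for model, f1 in f1_by_model.items():
    print(f"{model}: F1 = {f1:.2%}")
print(f"average F1 = {average_f1:.2%}")
```
]]></preformat>
        <p>The unweighted mean of the derived values is consistent with the average F1 score of around 77% reported in the text.</p>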
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Related Work</title>
      <p>
        AD has been an active area of research for decades, with foundational studies beginning as far back as
the 1960s [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. Today, numerous AD algorithms exist, including Isolation Forest [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], Local Outlier Factor
[
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], One-Class Support Vector Machine [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], and K-Means-based methods [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. These algorithms are applied
across various domains such as finance [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], cybersecurity [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], manufacturing [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], and healthcare [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
Comprehensive reviews of AD techniques can be found in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], and [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. We do not compare our
system with existing AD algorithms as our primary contribution is not a new detection algorithm but a
deployment pipeline. This pipeline can wrap any unsupervised detector, including LOF, iForest, or deep
models, with equal effect.
      </p>
      <p>
        Explainability in Anomaly Detection. Carletti et al. [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] propose DIFFI, a depth-based feature
importance measure designed specifically for iForest. It ranks each feature’s contribution to the
anomaly score without retraining the model. Chawla et al. [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] introduce a cluster-based autoencoder
for network anomaly detection and integrate SHAP values [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] to provide feature-level explanations
for each prediction in real-world telecom systems. While these approaches explain individual anomaly
decisions effectively, they still rely on the original model during inference. This means that inference
latency, storage requirements, and the dependency on an ML runtime remain unchanged. Our method is
fundamentally different. Instead of adding explanations to a black-box model, we replace the model
entirely with globally interpretable DNF rules. These rules are self-explanatory by design, language-agnostic,
and require only simple arithmetic comparisons to evaluate.
      </p>
      <p>
        Hybrid Supervised and Unsupervised Approaches. Several recent studies combine unsupervised
and supervised learning for anomaly detection. Shanaa and Abdallah [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] propose a hybrid framework
that uses an autoencoder trained on normal transactions together with an XGBoost classifier. Their
method achieves state-of-the-art F1 scores on a public credit card fraud benchmark. The design in [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]
is similar to ours because it also uses an unsupervised component to generate training signals for a
supervised model. However, there are three main differences. First, [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] focuses primarily on detection
accuracy, while we prioritize explainability, efficiency, and interoperability. Second, in [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], both the
autoencoder and the classifier must run during serving, whereas in our approach everything is reduced
to a compact set of rules. Third, we evaluated our method on a real production system with billions of
actual requests rather than only on a public benchmark dataset, which makes our results more realistic
and practical.
      </p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>In this paper, we addressed three critical research questions, namely explainability (Q1), efficiency (Q2),
and interoperability (Q3), that often make the practical application of AD in real-world systems
challenging. By combining unsupervised and supervised learning, we transformed "black box" iForest
models into interpretable DNF rules.</p>
      <p>Our real-life evaluation demonstrated significant practical benefits:
• Efficiency: The rule-based inference method outperformed the baseline iForest inference by
an average of 33%, while storage requirements and memory loading times were reduced by an
order of magnitude.
• Interoperability: By storing rules in JSON format, we eliminated the need for specialized
ML environments, allowing the logic to be interpreted across any programming language using
fundamental primitives.
• Explainability: The transition from complex ML models to human-readable predicates
provides users with clear reasoning for anomaly detection decisions.</p>
      <p>While we observed an accuracy trade-off, with an average F1 score of approximately 77%, the majority
of misclassified points were within a narrow margin (0.05) of the original model’s threshold. We believe
this trade-off is justified for applications where speed and transparency are prioritized over absolute
precision. Future work could focus on further narrowing this accuracy gap while maintaining the
efficiency of the rule-extraction framework.</p>
    </sec>
    <sec id="sec-7">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors used Grammarly to improve writing style and to
check grammar and spelling. After using this tool, the authors reviewed and edited the content as
needed and take full responsibility for the publication’s content.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>E. L.</given-names>
            <surname>Paula</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ladeira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. N.</given-names>
            <surname>Carvalho</surname>
          </string-name>
          , T. Marzagao,
          <article-title>Deep learning anomaly detection as support fraud investigation in Brazilian exports and anti-money laundering</article-title>
          ,
          <source>in: 2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA)</source>
          , IEEE,
          <year>2016</year>
          , pp.
          <fpage>954</fpage>
          -
          <lpage>960</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A. K.</given-names>
            <surname>Ghosh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wanken</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Charron</surname>
          </string-name>
          ,
          <article-title>Detecting anomalous and unknown intrusions against programs</article-title>
          ,
          <source>in: Proceedings 14th annual computer security applications conference (Cat. No. 98Ex217)</source>
          , IEEE,
          <year>1998</year>
          , pp.
          <fpage>259</fpage>
          -
          <lpage>267</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>W.-K.</given-names>
            <surname>Wong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. W.</given-names>
            <surname>Moore</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. F.</given-names>
            <surname>Cooper</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. M.</given-names>
            <surname>Wagner</surname>
          </string-name>
          ,
          <article-title>Bayesian network anomaly pattern detection for disease outbreaks</article-title>
          ,
          <source>in: Proceedings of the 20th international conference on machine learning (ICML-03)</source>
          ,
          <year>2003</year>
          , pp.
          <fpage>808</fpage>
          -
          <lpage>815</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>V.</given-names>
            <surname>Chandola</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Banerjee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <article-title>Anomaly detection: A survey</article-title>
          ,
          <source>ACM Computing Surveys (CSUR)</source>
          <volume>41</volume>
          (
          <year>2009</year>
          )
          <fpage>1</fpage>
          -
          <lpage>58</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>M. M.</given-names>
            <surname>Breunig</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.-P.</given-names>
            <surname>Kriegel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. T.</given-names>
            <surname>Ng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Sander</surname>
          </string-name>
          ,
          <article-title>LOF: Identifying density-based local outliers</article-title>
          ,
          <source>in: Proceedings of the 2000 ACM SIGMOD international conference on Management of data</source>
          ,
          <year>2000</year>
          , pp.
          <fpage>93</fpage>
          -
          <lpage>104</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>F. T.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. M.</given-names>
            <surname>Ting</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.-H.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <article-title>Isolation-based anomaly detection</article-title>
          ,
          <source>ACM Transactions on Knowledge Discovery from Data (TKDD)</source>
          <volume>6</volume>
          (
          <year>2012</year>
          )
          <fpage>1</fpage>
          -
          <lpage>39</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>E.</given-names>
            <surname>Liberty</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Karnin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Xiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Rouesnel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Coskun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Nallapati</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Delgado</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sadoughi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Astashonok</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Das</surname>
          </string-name>
          , et al.,
          <article-title>Elastic machine learning algorithms in Amazon SageMaker</article-title>
          ,
          <source>in: Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>731</fpage>
          -
          <lpage>737</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>ONNX</surname>
          </string-name>
          , Open neural network exchange,
          <year>2025</year>
          . URL: https://github.com/onnx/onnx.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>F. E.</given-names>
            <surname>Grubbs</surname>
          </string-name>
          ,
          <article-title>Procedures for detecting outlying observations in samples</article-title>
          ,
          <source>Technometrics</source>
          <volume>11</volume>
          (
          <year>1969</year>
          )
          <fpage>1</fpage>
          -
          <lpage>21</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>M.</given-names>
            <surname>Hejazi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y. P.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <article-title>One-class support vector machines approach to anomaly detection</article-title>
          ,
          <source>Applied Artificial Intelligence</source>
          <volume>27</volume>
          (
          <year>2013</year>
          )
          <fpage>351</fpage>
          -
          <lpage>366</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M.</given-names>
            <surname>Elahi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Nisar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Lv</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>Efficient clustering-based outlier detection algorithm for dynamic data stream</article-title>
          ,
          <source>in: 2008 Fifth International Conference on Fuzzy Systems and Knowledge Discovery</source>
          , volume
          <volume>5</volume>
          , IEEE,
          <year>2008</year>
          , pp.
          <fpage>298</fpage>
          -
          <lpage>304</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>E.</given-names>
            <surname>Keogh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.-H.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. V.</given-names>
            <surname>Herle</surname>
          </string-name>
          ,
          <article-title>Finding the most unusual time series subsequence: algorithms and applications</article-title>
          ,
          <source>Knowledge and Information Systems</source>
          <volume>11</volume>
          (
          <year>2007</year>
          )
          <fpage>1</fpage>
          -
          <lpage>27</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>A.</given-names>
            <surname>Boukerche</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Alfandi</surname>
          </string-name>
          ,
          <article-title>Outlier detection: Methods, models, and classification</article-title>
          ,
          <source>ACM Computing Surveys (CSUR)</source>
          <volume>53</volume>
          (
          <year>2020</year>
          )
          <fpage>1</fpage>
          -
          <lpage>37</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>G.</given-names>
            <surname>Pang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Cao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. V. D.</given-names>
            <surname>Hengel</surname>
          </string-name>
          ,
          <article-title>Deep learning for anomaly detection: A review</article-title>
          ,
          <source>ACM Computing Surveys (CSUR)</source>
          <volume>54</volume>
          (
          <year>2021</year>
          )
          <fpage>1</fpage>
          -
          <lpage>38</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>M.</given-names>
            <surname>Carletti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Terzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. A.</given-names>
            <surname>Susto</surname>
          </string-name>
          ,
          <article-title>Interpretable anomaly detection with DIFFI: depth-based feature importance of Isolation Forest</article-title>
          ,
          <source>Engineering Applications of Artificial Intelligence</source>
          <volume>116</volume>
          (
          <year>2023</year>
          )
          <fpage>105730</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>A.</given-names>
            <surname>Chawla</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Jacob</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Farrell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Aumayr</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Fallon</surname>
          </string-name>
          ,
          <article-title>Towards interpretable anomaly detection: unsupervised deep neural network approach using feedback loop</article-title>
          ,
          <source>in: IEEE/IFIP Network Operations and Management Symposium (NOMS)</source>
          , IEEE,
          <year>2022</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>9</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>M.</given-names>
            <surname>Shanaa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Abdallah</surname>
          </string-name>
          ,
          <article-title>A hybrid anomaly detection framework combining supervised and unsupervised learning for credit card fraud detection</article-title>
          ,
          <source>F1000Research</source>
          <volume>14</volume>
          (
          <year>2025</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>