<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>IUI Workshops’19, Los Angeles, USA, March 2019</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>eX2: a framework for interactive anomaly detection</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ignacio Arnaldo</string-name>
          <email>iarnaldo@patternex.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mei Lam</string-name>
          <email>mei@patternex.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kalyan Veeramachaneni</string-name>
          <email>kalyanv@mit.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>LIDS, MIT</institution>
          ,
          <addr-line>Cambridge, MA</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>PatternEx</institution>
          ,
          <addr-line>San Jose, CA</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2019</year>
      </pub-date>
      <volume>20</volume>
      <issue>2019</issue>
      <abstract>
        <p>We introduce eX2 (coined after explain and explore), a framework based on explainable outlier analysis and interactive recommendations that enables cybersecurity researchers to efficiently search for new attacks. We demonstrate the framework with both publicly available and real-world cybersecurity datasets, showing that eX2 improves the detection capability of stand-alone outlier analysis methods, thereby improving the efficiency of so-called threat hunting activities.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>CCS CONCEPTS</title>
      <p>• Security and privacy → Intrusion/anomaly detection and
malware mitigation; • Human-centered computing → User
interface management systems; • Information systems →
Recommender systems;</p>
    </sec>
    <sec id="sec-2">
      <title>1 INTRODUCTION</title>
      <p>
        The cybersecurity community is embracing machine learning to
transition from a reactive to a predictive strategy for threat
detection. At the same time, most research works at the intersection
of cybersecurity and machine learning focus on building complex
models for a specific detection problem [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], but rarely translate
into real-world solutions. Arguably, one of the biggest weak spots
of these works is the use of datasets that lack generality, realism,
and representativeness [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>To break out of this situation, the first step is to devise efficient
strategies to obtain representative datasets. To that end, intelligent
tools and interfaces are needed to enable security researchers to
carry out threat hunting activities, i.e., to search for attacks in
real-world cybersecurity datasets. Threat hunting solutions remain
vastly unexplored in the research community, and present open
challenges in combining the fields of outlier analysis, explainable
machine learning, and recommendation systems.</p>
      <p>In this paper, we introduce eX2, a threat hunting framework
based on interactive anomaly detection. The detection relies on
outlier analysis, given that new attacks are expected to be rare and
to exhibit distinctive features. At the same time, special attention is
dedicated to providing interpretable, actionable results for analyst
consumption. Finally, the framework exploits human-data interactions
to recommend the exploration of regions of the data deemed
problematic by the analyst.</p>
      <p>IUI Workshops’19, March 20, 2019, Los Angeles, USA. Copyright © 2019 for
the individual papers by the papers’ authors. Copying permitted for private
and academic purposes. This volume is published and copyrighted by its
editors.</p>
    </sec>
    <sec id="sec-3">
      <title>2 RELATED WORK</title>
      <p>
        Anomaly detection methods have been extensively studied in the
machine learning community [
        <xref ref-type="bibr" rid="ref1 ref10 ref6">1, 6, 10</xref>
        ]. The strategy based on
Principal Component Analysis used in this work is inspired by [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ],
while the method introduced to retrieve feature contributions based
on the analysis of feature projections into the principal components
is closely related to [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>
        Given the changing nature of cyber-attacks, many researchers
resort to anomaly detection for threat detection. The majority of these
works focus on building sophisticated models [
        <xref ref-type="bibr" rid="ref13 ref15">13, 15</xref>
        ], but do not
exploit analyst interactions with the data to improve detection rates.
Recent works explore a human-in-the-loop detection paradigm by
leveraging a combination of outlier analysis, used to identify new
threats, and supervised learning to improve detection rates over
time [
        <xref ref-type="bibr" rid="ref16 ref2 ref8">2, 8, 16</xref>
        ]. However, these works do not consider two critical
aspects of cybersecurity. First, they do not provide explanations
for the anomalies (note that [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] provides predefined visualizations
based on prior attack knowledge, but does not account for new
attacks exhibiting unique patterns). Second, none of these works
exploits interactive strategies upon the confirmation of a new attack
by an analyst, therefore missing an opportunity to improve
detection recall and the label acquisition process.
      </p>
    </sec>
    <sec id="sec-4">
      <title>3 FINDING ANOMALIES</title>
      <p>
        We leverage Principal Component Analysis (PCA) to find cases that
violate the correlation structure of the main bulk of the data. To
detect these rare cases, we analyze the projection from original
variables to the principal components’ space, followed by the
inverse projection (or reconstruction) from principal components to
the original variables. If only the first principal components (the
components that explain most of the variance in the data) are used
for projection and reconstruction, we ensure that the reconstruction
error will be low for the majority of the examples, while remaining
high for outliers. This is because the first principal components
explain the variance of normal cases, while last principal components
explain outlier variance [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>Let X be a p-dimensional dataset. Its covariance matrix Σ can be
decomposed as Σ = P × D × P^T, where P is an orthonormal matrix
whose columns are the eigenvectors of Σ, and D is the diagonal
matrix containing the corresponding eigenvalues λ1 . . . λp. The
eigenvectors and their corresponding eigenvalues are sorted in
decreasing order of significance (the first eigenvector accounts for
the most variance, etc.).</p>
      <p>The projection of the dataset into the principal component space
is given by Y = X × P. Note that this projection can be performed with
a reduced number of principal components. Let Y^j be the projected
dataset using the top j principal components: Y^j = X × P^j. In the
same way, the reverse projection (from principal component space
to the original space) is given by R^j = (P^j × (Y^j)^T)^T, where R^j is the
reconstructed dataset using the top j principal components.</p>
      <p>We define the outlier score of point Xi = [xi1 . . . xip] as:
score(Xi) = Σ_{j=1}^{p} |Xi − Ri^j| × ev(j),  where  ev(j) = (Σ_{k=1}^{j} λk) / (Σ_{k=1}^{p} λk)    (1)</p>
      <p>Note that ev(j) represents the cumulative percentage of variance
explained with the top j principal components. This means that the
higher j is, the more variance is accounted for within the components
from 1 to j. With this score definition, large deviations in the top
principal components are not heavily weighted, while deviations in
the last principal components are. Outliers present large deviations
in the last principal components, and thus receive high scores.</p>
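      <p>As an illustration only (not the authors’ implementation), the scoring procedure above can be sketched in a few lines of NumPy; the function name pca_outlier_scores and the use of the per-feature absolute deviation are our assumptions:</p>
      <preformat>
```python
import numpy as np

def pca_outlier_scores(X):
    """Score each row of X by its ev(j)-weighted PCA reconstruction error (Eq. 1)."""
    Xc = X - X.mean(axis=0)                    # center the data
    cov = np.cov(Xc, rowvar=False)             # covariance matrix (p x p)
    eigvals, P = np.linalg.eigh(cov)           # eigendecomposition
    order = np.argsort(eigvals)[::-1]          # sort by decreasing eigenvalue
    eigvals, P = eigvals[order], P[:, order]
    ev = np.cumsum(eigvals) / np.sum(eigvals)  # ev(j): cumulative explained variance
    scores = np.zeros(len(Xc))
    p = Xc.shape[1]
    for j in range(1, p + 1):
        Pj = P[:, :j]                          # top-j principal components
        Rj = Xc @ Pj @ Pj.T                    # rank-j reconstruction of each point
        # deviation from the rank-j reconstruction, weighted by ev(j)
        scores += np.abs(Xc - Rj).sum(axis=1) * ev[j - 1]
    return scores
```
      </preformat>
      <p>Outliers keep a large reconstruction error even for large j, where ev(j) approaches 1, and therefore accumulate high scores.</p>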
      <p>
        Normalizing outlier scores: As shown in Figure 1, the outlier
detection method assigns a low score to most examples, and the
distribution presents a long right tail. At the same time, the range
of the scores depends on the dataset, which limits the method’s
interpretability. To overcome this, we project all scores into a
common space, in such a way that scores can be interpreted as
probabilities. To that end, we model PCA-based outlier scores with
a Weibull distribution (overlaid in the figures in red). Note that
the Weibull distribution is flexible and can model a wide variety of
shapes. For a given score S, its outlier probability corresponds to
the cumulative distribution function evaluated at S: F(S) = P(X ≤ S).
The resulting normalized scores follow a long right-tailed distribution
in the [0, 1] domain, and can be interpreted as the probability that a
randomly picked example will present a lower or equal score.
      </p>
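      <p>A minimal sketch of this normalization with SciPy (our code, not the framework’s; we fix the Weibull location at zero under the assumption that raw scores are non-negative):</p>
      <preformat>
```python
import numpy as np
from scipy.stats import weibull_min

def normalize_scores(scores):
    """Map raw outlier scores to [0, 1] via a fitted Weibull CDF."""
    # Fit a Weibull distribution to the empirical score distribution.
    shape, loc, scale = weibull_min.fit(scores, floc=0)
    # F(S) = P(X ≤ S): probability that a random example scores lower or equal.
    return weibull_min.cdf(scores, shape, loc=loc, scale=scale)
```
      </preformat>
      <p>Because the CDF is monotone, the ranking of examples is preserved; only the scale becomes comparable across datasets.</p>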
    </sec>
    <sec id="sec-5">
      <title>4 EXPLAINING AND EXPLORING ANOMALIES</title>
      <p>
        Interpretability in machine learning can be achieved by
explaining the model that generates the results, or by explaining each
model outcome [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. In this paper, we focus on the latter, given that
the goal is to provide explanations for each individual anomaly.
More formally, we consider an anomaly detection strategy given by
b(X p ) = S where b is a black-box detector, X p is a dataset with p
features, and S is the space of scores generated by the detector. The
goal is to find an explanation e ∈ ϵ for each x ∈ X p , where ϵ
represents the domain of interpretable explanations. We approach this
problem as finding a function f such that, for each vector x ∈ X^p, the
corresponding explanation is given by e = f(x, b).
      </p>
      <p>In this paper, we introduce a procedure f tailored to PCA that
generates explanations e = {C, V }, where C contains the
contribution of each feature to the score, and V is a set of visualizations
that highlight the difference between the analyzed example and the
bulk of the population.</p>
      <p>Retrieving feature contributions: In this first step, we retrieve
the contribution of each feature of the dataset to the final outlier
score via model inspection. Note that we leverage matrix operations
to simultaneously retrieve the feature contributions for all the
examples; we proceed as follows:
(1) Project one feature at a time using all principal components.
For feature i, the projected data is given by Yi = Xi × P,
where the matrix P contains all p eigenvectors.
(2) Compute the feature contribution Ci of feature i as:
Ci = Σ_{j=1}^{p} Yi^j × ev(j)    (2)
where Yi^j is the projected value of the i-th feature on the j-th
principal component, and ev(j) is the cumulative percentage
of variance explained with the top j principal components
given in Equation 1. In other words, the higher the absolute
values projected with the last principal components, the
higher the contribution of the feature to the outlier score.
(3) In a last step, we normalize the feature contributions to
obtain a unit vector C for each sample:
Ci = Ci / ‖C‖    (3)
This way, for each outlier, we obtain a contribution score in the
[0, 1] domain for each feature in the dataset. To illustrate this step,
we show in Figure 2a the feature contributions to the score of
an outlier of the ATO dataset; we can see that num_cred_cards
contributed the most to the example’s score (58.29%), followed
by num_address_chg and addr_verify_fail (17.94% and 11.40%
respectively).</p>
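      <p>A minimal sketch of the contribution computation (our code and naming; it assumes the centered data matrix Xc, the eigenvector matrix P sorted by decreasing eigenvalue, and the cumulative explained-variance vector ev introduced in the previous section, and uses absolute projected values as the text describes):</p>
      <preformat>
```python
import numpy as np

def feature_contributions(Xc, P, ev):
    """Per-sample, per-feature contributions to the outlier score (Eqs. 2-3)."""
    n, p = Xc.shape
    C = np.zeros((n, p))
    for i in range(p):
        # Project feature i alone using all principal components:
        # Yi has one row per sample and one column per component.
        Xi = np.zeros_like(Xc)
        Xi[:, i] = Xc[:, i]
        Yi = Xi @ P
        # Weight each component's projection by ev(j) and accumulate (Eq. 2).
        C[:, i] = np.abs(Yi) @ ev
    # Normalize to a unit vector per sample (Eq. 3).
    return C / np.linalg.norm(C, axis=1, keepdims=True)
```
      </preformat>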
      <p>Visualizing anomalies: Once the feature contributions are
extracted, the system generates a series of visualizations to show each
outlier in relation to the rest of the population. For ease of
interpretation, these visualizations are generated in low dimensional
feature subspaces as follows:
(1) Retrieve the top-m features ranked by contribution score
(2) For each pair of features (xi , xj ) in the top-m, display the
joint distribution of the population in a 2D-scatter plot as
shown in Figure 2b. Note that in the example m = 2 and
that the analyzed outlier is highlighted in red. In cases of
large datasets, the visualizations are limited to 10K randomly
picked samples.</p>
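      <p>The visualization step can be sketched as follows (an illustrative matplotlib snippet, not the framework’s interface; the function and parameter names are ours):</p>
      <preformat>
```python
import itertools
import numpy as np
import matplotlib
matplotlib.use("Agg")                 # headless backend for illustration
import matplotlib.pyplot as plt

def plot_outlier_subspaces(X, outlier_idx, contributions, m=2, max_points=10000):
    """Scatter the population in each pair of top-m contributing features,
    highlighting the analyzed outlier in red."""
    top = np.argsort(contributions)[::-1][:m]   # top-m features by contribution
    # For large datasets, limit the background to a random sample.
    rng = np.random.default_rng(0)
    bg = rng.choice(len(X), size=min(len(X), max_points), replace=False)
    figs = []
    for fi, fj in itertools.combinations(top, 2):
        fig, ax = plt.subplots()
        ax.scatter(X[bg, fi], X[bg, fj], s=5, color="tab:blue", alpha=0.3)
        ax.scatter(X[outlier_idx, fi], X[outlier_idx, fj], color="red", s=40)
        ax.set_xlabel(f"feature {fi}")
        ax.set_ylabel(f"feature {fj}")
        figs.append(fig)
    return figs
```
      </preformat>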
      <p>With this approach, we obtain intuitive visualizations in low-dimensional
subspaces of the original features, in such a way that outliers are
likely to stand out with respect to the rest of the population.</p>
      <p>Exploring via recommendations in feature subspaces: As the
analyst interacts with the visualizations and confirms relevant
findings, the framework recommends the investigation of entities with
similar characteristics. These recommendations are interactive and
correspond to searching for the top-k nearest neighbors in the feature
subspaces used to visualize the data (as opposed to using all the
features for distance computation). As shown in Figure 2c, the
recommendations highlighted in green help narrow down the search
for further anomalies.</p>
      <p>This strategy, recommending based on similarities computed
in feature subsets, exploits user interactions with the data. The
intuition is that, upon confirmation of the relevance of an outlier
with the provided visualizations, the user identifies discriminant
feature sets that are not known a priori. Thus, points close to the
identified anomaly in the resulting subspaces are likely to be
relevant in turn.</p>
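      <p>The subspace-based recommendation can be sketched with scikit-learn (our code and naming, not the framework’s; subspace holds the indices of the features used in the confirmed visualization):</p>
      <preformat>
```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def recommend_similar(X, confirmed_idx, subspace, k=10):
    """Return the indices of the k nearest neighbors of a confirmed anomaly,
    with distances computed only in the given feature subspace."""
    Xs = X[:, subspace]                          # restrict to the visualized features
    nn = NearestNeighbors(n_neighbors=k + 1).fit(Xs)
    # Query the confirmed point; the first neighbor is the point itself.
    _, idx = nn.kneighbors(Xs[confirmed_idx].reshape(1, -1))
    return [i for i in idx[0] if i != confirmed_idx][:k]
```
      </preformat>
      <p>Restricting the distance computation to the subspace the analyst actually inspected is what encodes their implicit feature selection.</p>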
    </sec>
    <sec id="sec-7">
      <title>5 EXPERIMENTAL WORK</title>
      <p>
        Datasets: We evaluate the framework’s capability to find, explain,
and explore anomalies with four outlier detection datasets, out of
which three are publicly available (WDBC, KDDCup, and Credit
Card) and one is a real-world dataset built with logs generated by
an online application:
- WDBC dataset: this dataset is composed of 367 rows, 30
numerical features, and includes 10 anomalies. We consider the version
available at [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] introduced by Campos et al. [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Note that this is
not a cybersecurity dataset, but has been included to cover a wider
range of scenarios.
- KDDCup 99 data (KDD): We consider the pre-processed version
introduced in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] in which categorical values are one-hot encoded
and duplicates are eliminated. This version is composed of 48113
rows, 79 features, and contains 200 malicious anomalies.
- Credit card dataset (CC): used in a Kaggle competition [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ],
the dataset is composed of 284807 rows, 29 numerical features, and
contains 492 anomalies.
- Account takeover dataset (ATO): this real-world dataset was
built using web logs from an online application over a three-month
period. Each row corresponds to the summarized activity of a user
during a 24-hour time window (midnight to midnight). It is composed
of 317163 rows, 25 numerical features, and contains 318 identified
anomalies.1
      </p>
      <p>Detection rates and analysis of top outliers: Table 1 shows
the detection metrics of the PCA-based method and Local Outlier
Factor (LOF), a standard outlier analysis baseline, on each of the
datasets. The detection performance of LOF is superior for the
smaller dataset, WDBC. However, PCA-based outlier analysis
outperforms LOF in the three cybersecurity datasets (KDD, CC, and
ATO). This observation validates the choice of PCA, given that it
not only outperforms LOF but also provides interpretability, as
explained in Section 4.</p>
      <p>Despite improving on the results of LOF in the cybersecurity datasets,
we can see that the precision and recall metrics of the PCA-based
method remain low. For instance, when looking at the top 100 outliers,
the precision of our method (noted as P@100 in the table) is
0.210, 0.480, and 0.020 for KDDCup, CC, and ATO respectively. This
observation indicates that not all outliers are malicious, and justifies
the effort dedicated to providing interactive exploration of the data
to increase anomaly detection rates.
1As with most real-world datasets, ATO is not fully labeled; therefore the metrics
presented in the following need to be interpreted accordingly.
[Table 1: detection metrics of LOF and PCA on the WDBC, KDDCup, Credit card,
and Account takeover datasets.]</p>
      <p>Explain and explore: We show in Figure 3 the visualizations and
recommendations generated for the top ATO outlier. The framework
appropriately selects feature subsets such that the analyzed
outlier (shown in red) stands out with respect to the population
(blue), i.e., outliers fall in sparse regions of the selected subspaces.
The top 3 contributing features retrieved by the framework are the
number of address changes (num_addresschg), the number of credit
cards used (num_cred_cards), and whether the user failed the
address verification (addr_verify_fail). In the first plot (num_addresschg
vs num_cred_cards), we can clearly see why the highlighted user is
suspicious: they used four credit cards, and changed the delivery
address more than 90 times. The plot also shows five additional
users recommended by the system upon confirmation of the threat
by an analyst. The recommended users present an elevated number
of address changes, and used one or more credit cards.
[Figure 4: detection rates of stand-alone (AD) versus interactive (IAD)
anomaly detection on each dataset, as a function of the investigation budget.]</p>
      <p>
        To further evaluate the exploratory strategy based on
recommendations, Figure 4 shows the detection rate obtained with PCA
alone, versus the metrics obtained with the combination of PCA
and recommendations. To obtain the latter metrics, we simulate
investigations for the top-m (m ∈ [10, 25, 50, 100, 200, 500]) outliers
(i.e., we reveal the ground truth) and consider the top-10
recommended entries for the confirmed threats. In all cases, interactive
anomaly detection improves the precision. In particular, we can see
a significant precision improvement for the KDD and CC datasets
for investigation budgets in the 50-200 range.
      </p>
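      <p>The simulated investigation protocol can be sketched as follows (a hypothetical harness with our naming; recommend stands for any subspace-based recommender returning entity indices for a confirmed threat):</p>
      <preformat>
```python
import numpy as np

def interactive_precision(scores, labels, recommend, m, k=10):
    """Precision of interactive anomaly detection with investigation budget m:
    investigate the top-m outliers, and for each confirmed threat also
    investigate its top-k recommended entries."""
    investigated = list(np.argsort(scores)[::-1][:m])   # top-m outliers
    for i in list(investigated):                        # snapshot of the top-m
        if labels[i]:                                   # analyst confirms a threat
            for r in recommend(i, k):                   # follow its recommendations
                if r not in investigated:
                    investigated.append(r)
    hits = sum(labels[i] for i in investigated)
    return hits / len(investigated)
```
      </preformat>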
    </sec>
    <sec id="sec-8">
      <title>6 CONCLUSION</title>
      <p>We have introduced the eX2 framework for threat hunting activities.
The framework leverages principal component analysis to detect and
explain anomalies, and exploits analyst-data interaction to
recommend the exploration of problematic regions of the data. The
results presented in this work with three cybersecurity datasets
show that eX2 outperforms detection strategies based on stand-alone
outlier analysis.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Charu C.</given-names>
            <surname>Aggarwal</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <source>Outlier Analysis</source>
          . Springer. https://doi.org/10.1007/978-1-4614-6396-2
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Anaël</given-names>
            <surname>Beaugnon</surname>
          </string-name>
          , Pierre Chifflier, and
          <string-name>
            <given-names>Francis</given-names>
            <surname>Bach</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>ILAB: An Interactive Labelling Strategy for Intrusion Detection</article-title>
          . In RAID 2017:
          <article-title>Research in Attacks, Intrusions and Defenses</article-title>
          . Atlanta, United States. https://hal.archives-ouvertes.fr/ hal-01636299
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>E.</given-names>
            <surname>Biglar Beigi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. Hadian</given-names>
            <surname>Jazi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Stakhanova</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A. A.</given-names>
            <surname>Ghorbani</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Towards effective feature selection in machine learning-based botnet detection approaches</article-title>
          .
          <source>In 2014 IEEE Conference on Communications and Network Security</source>
          .
          <fpage>247</fpage>
          -
          <lpage>255</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Guilherme O.</given-names>
            <surname>Campos</surname>
          </string-name>
          , Arthur Zimek, Jörg Sander, Ricardo J. G. B. Campello, Barbora
          Micenková, Erich Schubert, Ira Assent, and
          <string-name>
            <given-names>Michael E.</given-names>
            <surname>Houle</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>On the evaluation of unsupervised outlier detection: measures, datasets, and an empirical study</article-title>
          .
          <source>Data Mining and Knowledge Discovery</source>
          <volume>30</volume>
          ,
          <issue>4</issue>
          (
          <year>Jul 2016</year>
          ),
          <fpage>891</fpage>
          -
          <lpage>927</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Guilherme O.</given-names>
            <surname>Campos</surname>
          </string-name>
          , Arthur Zimek, Jörg Sander, Ricardo J. G. B. Campello, Barbora
          Micenková, Erich Schubert, Ira Assent, and
          <string-name>
            <given-names>Michael E.</given-names>
            <surname>Houle</surname>
          </string-name>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Datasets for the evaluation of unsupervised outlier detection</article-title>
          . www.dbs.ifi.lmu.de/research/outlier-evaluation/DAMI/
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Varun</given-names>
            <surname>Chandola</surname>
          </string-name>
          , Arindam Banerjee, and
          <string-name>
            <given-names>Vipin</given-names>
            <surname>Kumar</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>Anomaly Detection: A Survey</article-title>
          .
          <source>ACM Comput. Surv</source>
          .
          <volume>41</volume>
          ,
          <issue>3</issue>
          ,
          Article 15
          (
          <year>July 2009</year>
          ),
          <volume>58</volume>
          pages. https://doi.org/10.1145/1541880.1541882
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>XuanHong</given-names>
            <surname>Dang</surname>
          </string-name>
          , Barbora Micenková, Ira Assent, and Raymond T. Ng.
          <year>2013</year>
          .
          <article-title>Local Outlier Detection with Interpretation</article-title>
          .
          <source>In Machine Learning and Knowledge Discovery in Databases</source>
          , Hendrik Blockeel, Kristian Kersting, Siegfried Nijssen, and
          <string-name>
            <given-names>Filip</given-names>
            <surname>Železný</surname>
          </string-name>
          (Eds.).
          <source>Lecture Notes in Computer Science</source>
          , Vol.
          <volume>8190</volume>
          . Springer Berlin Heidelberg,
          <fpage>304</fpage>
          -
          <lpage>320</lpage>
          . https://doi.org/10.1007/978-3-642-40994-3_20
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S.</given-names>
            <surname>Das</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Wong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Dietterich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Fern</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Emmott</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Incorporating Expert Feedback into Active Anomaly Discovery</article-title>
          .
          <source>In 2016 IEEE 16th International Conference on Data Mining (ICDM)</source>
          .
          <fpage>853</fpage>
          -
          <lpage>858</lpage>
          . https://doi.org/10.1109/ICDM.2016.0102
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Riccardo</given-names>
            <surname>Guidotti</surname>
          </string-name>
          , Anna Monreale, Salvatore Ruggieri, Franco Turini, Fosca Giannotti, and
          <string-name>
            <given-names>Dino</given-names>
            <surname>Pedreschi</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>A Survey of Methods for Explaining Black Box Models</article-title>
          .
          <source>ACM Comput. Surv</source>
          .
          <volume>51</volume>
          ,
          <issue>5</issue>
          ,
          Article 93
          (
          <year>Aug. 2018</year>
          ),
          <volume>42</volume>
          pages. https://doi.org/10.1145/3236009
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Victoria</given-names>
            <surname>Hodge</surname>
          </string-name>
          and
          <string-name>
            <given-names>Jim</given-names>
            <surname>Austin</surname>
          </string-name>
          .
          <year>2004</year>
          .
          <article-title>A Survey of Outlier Detection Methodologies</article-title>
          .
          <source>Artif. Intell. Rev</source>
          .
          <volume>22</volume>
          ,
          <issue>2</issue>
          (Oct.
          <year>2004</year>
          ),
          <fpage>85</fpage>
          -
          <lpage>126</lpage>
          . https://doi.org/10.1023/B:AIRE.0000045502.10941.a9
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Heju</given-names>
            <surname>Jiang</surname>
          </string-name>
          , Jasvir Nagra, and
          <string-name>
            <given-names>Parvez</given-names>
            <surname>Ahammad</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>SoK: Applying Machine Learning in Security-A Survey</article-title>
          .
          <source>arXiv preprint arXiv:1611.03186</source>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Kaggle</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Credit Card Fraud Detection Dataset</article-title>
          . www.kaggle.com/isaikumar/ creditcardfraud
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Benjamin</surname>
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Radford</surname>
          </string-name>
          ,
          <string-name>
            <surname>Leonardo M. Apolonio</surname>
          </string-name>
          , Antonio J.
          <string-name>
            <surname>Trias</surname>
          </string-name>
          , and
          <string-name>
            <surname>Jim</surname>
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Simpson</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Network Traffic Anomaly Detection Using Recurrent Neural Networks</article-title>
          .
          <source>CoRR abs/1803.10769</source>
          (
          <year>2018</year>
          ). arXiv:1803.10769 http://arxiv.org/abs/1803.10769
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Mei-Ling</given-names>
            <surname>Shyu</surname>
          </string-name>
          , Shu-Ching Chen, Kanoksri Sarinnapakorn, and
          <string-name>
            <given-names>Liwu</given-names>
            <surname>Chang</surname>
          </string-name>
          .
          <year>2003</year>
          .
          <article-title>A novel anomaly detection scheme based on principal component classifier</article-title>
          .
          <source>In Proceedings of the IEEE Foundations and New Directions of Data Mining Workshop, in conjunction with the Third IEEE International Conference on Data Mining (ICDM’03)</source>
          .
          <fpage>172</fpage>
          -
          <lpage>179</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <surname>Aaron</surname>
            <given-names>Tuor</given-names>
          </string-name>
          , Samuel Kaplan, Brian Hutchinson, Nicole Nichols, and
          <string-name>
            <given-names>Sean</given-names>
            <surname>Robinson</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Deep Learning for Unsupervised Insider Threat Detection in Structured Cybersecurity Data Streams</article-title>
          .
          <source>CoRR abs/1710.00811</source>
          (
          <year>2017</year>
          ). arXiv:1710.00811 http://arxiv.org/abs/1710.00811
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>K.</given-names>
            <surname>Veeramachaneni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Arnaldo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Korrapati</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bassias</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K.</given-names>
            <surname>Li</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>AI2: Training a Big Data Machine to Defend</article-title>
          . In
          <source>2016 IEEE 2nd International Conference on Big Data Security on Cloud (BigDataSecurity)</source>
          .
          <fpage>49</fpage>
          -
          <lpage>54</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>