<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Outlier (Anomaly) Detection Modelling in PMML</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jaroslav Kuchar</string-name>
          <email>jaroslav.kuchar@fit.cvut.cz</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Adam Ashenfelter</string-name>
          <email>ashenfelter@bigml.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tomas Kliegr</string-name>
          <email>tomas.kliegr@vse.cz</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>BigML Inc.</institution>
          <addr-line>Corvallis, Oregon</addr-line>
          <country country="US">United States</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Information and Knowledge Engineering, Faculty of Informatics and Statistics, University of Economics Prague</institution>
          <country country="CZ">Czech Republic</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Web Intelligence Research Group, Faculty of Information Technology, Czech Technical University in Prague</institution>
          <country country="CZ">Czech Republic</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>PMML is an industry-standard, XML-based open format for representing statistical and data mining models. Since PMML does not yet support outlier (anomaly) detection, in this paper we propose a new outlier detection model to foster interoperability in this emerging field. Our proposal is included in the PMML RoadMap for PMML 4.4. We demonstrate the proposed format on one supervised and two unsupervised outlier detection approaches: the association rule-based classifier CBA, the frequent-pattern-based method FPOF, and isolation forests.</p>
      </abstract>
      <kwd-group>
        <kwd>outlier detection</kwd>
        <kwd>anomaly detection</kwd>
        <kwd>PMML</kwd>
        <kwd>frequent pattern mining</kwd>
        <kwd>rule-based classifiers</kwd>
        <kwd>isolation forests</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Outliers (also called anomalies) are observations that differ from other
observations to the extent that they arouse suspicion that they were generated by
a different mechanism than the rest of the data. Algorithms that can detect
outliers have a growing list of applications, including fraud detection, intrusion
detection, medical diagnosis and sensor events [
        <xref ref-type="bibr" rid="ref1 ref3">1, 3</xref>
        ].
      </p>
      <p>There are many existing approaches that can be used to detect outliers.
The choice of method depends on the character of the input data and the
goals, the level of supervision, the dimensionality of the input data, the algorithmic approach
(e.g. proximity-based or clustering-based techniques), and the type of outliers detected
(point, contextual or collective outliers). Despite this variety, all approaches
generally provide an output value for each input instance that represents its degree of
anomaly. This is either a class label (usually a binary flag) or a numerical score.</p>
      <p>Despite the growing need for a standard approach to handling outlier
detection models generated by different approaches and the software tools implementing
them, there has so far been little standardization effort that would foster
interoperability between the individual components handling these models in the
analytics tool chain.</p>
      <p>
        PMML<sup>4</sup> is an XML-based open standard for representing statistical and data
mining models. It supports many existing models including association rules,
classification, regression or clustering models and also neural networks. Many
existing tools and data mining solutions support this standard [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Since PMML
does not yet support outlier (anomaly) detection, in this paper we propose a
new outlier detection model to foster interoperability in this emerging field. Our
proposal is included in the PMML RoadMap for PMML 4.4.
      </p>
      <p>The paper is organized as follows: In Section 2 we use XML schema fragments
to describe the proposed PMML extension. Section 3 demonstrates the versatility
of the proposed specification on three different types of models. In Section 4 we
compare the proposed specification with another proposed PMML extension.
Finally, the conclusions present a brief summary and outlook.</p>
    </sec>
    <sec id="sec-2">
      <title>Specification</title>
      <p>Since PMML is an XML-based standard, the proposed specification for the
outlier detection model is in the form of an XML Schema model. Figure 1 depicts
the main structure of PMML. Our extension adds a new model to the list of
available models.</p>
      <p><sup>4</sup> http://dmg.org/pmml/pmml-v4-3.html</p>
      <sec id="sec-2-1">
        <title>Outlier Detection Model</title>
        <p>Listing 1.1 shows the main element of the proposed model: OutlierDetectionModel.
It contains required standard elements from the PMML specification: Extension
and MiningSchema.</p>
        <p>Listing 1.1. Outlier Detection Model
&lt;xs:element name="OutlierDetectionModel"&gt;
  &lt;xs:complexType&gt;
    &lt;xs:sequence&gt;
      &lt;xs:element ref="pmml:Extension" minOccurs="0" maxOccurs="unbounded"/&gt;
      &lt;xs:element ref="pmml:MiningSchema" minOccurs="0"/&gt;
      &lt;xs:element ref="ParameterList" minOccurs="0"/&gt;
      &lt;xs:choice&gt;
        &lt;xs:element ref="pmml:AssociationModel" minOccurs="0"/&gt;
        &lt;xs:element ref="pmml:Segmentation" minOccurs="0"/&gt;
        &lt;!-- The remaining possible models are omitted for demonstration purposes --&gt;
      &lt;/xs:choice&gt;
      &lt;xs:element ref="LabeledInstances" minOccurs="0"/&gt;
    &lt;/xs:sequence&gt;
    &lt;xs:attribute name="modelName" type="xs:string"/&gt;
    &lt;xs:attribute name="algorithmName" type="ALGORITHM-TYPE" use="required"/&gt;
    &lt;xs:attribute name="typeOfOutliers" type="OUTLIERS-TYPE" use="required"/&gt;
    &lt;xs:attribute name="numberOfOutliers" type="xs:positiveInteger" /&gt;
    &lt;xs:attribute name="output" type="OUTLIERS-OUTPUT-TYPE" use="required"/&gt;
  &lt;/xs:complexType&gt;
&lt;/xs:element&gt;</p>
        <p>The remaining elements and attributes are newly added. ParameterList is an
optional structure containing specific parameters for each supported
algorithm/approach. PMML currently supports a variety of existing machine learning
algorithms, such as decision trees or regression, which can serve as a basis for outlier
detection algorithms. However, these existing PMML models cannot be directly
reused, because adapting an existing generic machine learning model for
outlier detection typically implies the introduction of new parameters and/or
amendments of the existing ones. We therefore decided that ParameterList will be a
generic structure offering a configurable list of key-value pairs as parameters (cf.
Listing 1.2). The list can contain generic parameters of the underlying model or
any proprietary configuration of each algorithm that is important for
computing the output value.</p>
        <p>The model specifies a set of attributes that describe the type of the
outlier detection model. As in other PMML models, there is an optional
modelName attribute and the following required attributes:
- algorithmName: specification of the algorithm type. The currently supported and
demonstrated algorithms are isolation forests, frequent pattern mining
outliers and a rule-based classifier. The list of allowed algorithm names is
extensible and currently defined as the ALGORITHM-TYPE in Listing 1.3.
- output: defines the output of the outlier detection algorithm. Supported
options are label or numeric score (see Listing 1.3).</p>
        <p>In addition, there are the following attributes specific to outlier
detection (typeOfOutliers is required, whereas Listing 1.1 declares numberOfOutliers without use="required", so it is optional):
- typeOfOutliers: defines the type of outlier the model is able to handle.
Supported types are point, collective and contextual (see Listing 1.3).
- numberOfOutliers: specifies the number of outliers that
should be returned as the output of the task.</p>
        <p>Listing 1.3. Model Types
&lt;xs:simpleType name="OUTLIERS-TYPE"&gt;
  &lt;xs:restriction base="xs:string"&gt;
    &lt;xs:enumeration value="point"/&gt;
    &lt;xs:enumeration value="collective"/&gt;
    &lt;xs:enumeration value="contextual"/&gt;
  &lt;/xs:restriction&gt;
&lt;/xs:simpleType&gt;
&lt;xs:simpleType name="OUTLIERS-OUTPUT-TYPE"&gt;
  &lt;xs:restriction base="xs:string"&gt;
    &lt;xs:enumeration value="score"/&gt;
    &lt;xs:enumeration value="label"/&gt;
  &lt;/xs:restriction&gt;
&lt;/xs:simpleType&gt;</p>
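        <p>As an illustration of how a consumer of the format might check these attributes, the following Python sketch validates an OutlierDetectionModel element against the enumerations of Listing 1.3. It is not part of the proposal: the function name check_model_attributes is hypothetical, the set of algorithm names (fpof, iforest, cba) is an assumption taken from the examples in Section 3, and the positive-integer test is a simplification of xs:positiveInteger.</p>
        <preformat><![CDATA[
```python
import xml.etree.ElementTree as ET

# Enumerations copied from Listing 1.3.
OUTLIER_TYPES = {"point", "collective", "contextual"}
OUTPUT_TYPES = {"score", "label"}
# Assumption: algorithm names as they appear in the examples of Section 3.
ALGORITHMS = {"fpof", "iforest", "cba"}


def check_model_attributes(element):
    """Return a list of problems found in the element's attributes."""
    problems = []
    required = {
        "algorithmName": ALGORITHMS,
        "typeOfOutliers": OUTLIER_TYPES,
        "output": OUTPUT_TYPES,
    }
    for name, allowed in required.items():
        value = element.get(name)
        if value is None:
            problems.append("missing required attribute " + name)
        elif value not in allowed:
            problems.append("unexpected value for " + name + ": " + value)
    # numberOfOutliers is typed xs:positiveInteger in Listing 1.1
    # (simplified check: plain decimal digits, greater than zero).
    count = element.get("numberOfOutliers")
    if count is not None and not (count.isdigit() and int(count) > 0):
        problems.append("numberOfOutliers must be a positive integer")
    return problems


good = ET.fromstring(
    '<OutlierDetectionModel algorithmName="fpof" typeOfOutliers="point" '
    'numberOfOutliers="10" output="score"/>'
)
bad = ET.fromstring(
    '<OutlierDetectionModel algorithmName="fpof" typeOfOutliers="global" '
    'numberOfOutliers="0"/>'
)
print(check_model_attributes(good))
print(check_model_attributes(bad))
```
]]></preformat>
        <p>In practice, validation against the XML Schema itself would be preferable; the sketch only makes the attribute constraints explicit.</p>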
        <p>This specification also allows a detailed description of the detected
outliers to be provided (cf. Listing 1.4). The output takes the form of a set of top labeled instances.
The specification is similar to the training instances of the KNN model in PMML<sup>5</sup>. The
format is a table in which each row contains elements with the original data. The
required attributes id and output represent the original id of the row in the data
and the output value (label or score) assigned by the algorithm, respectively.</p>
        <p><sup>5</sup> http://dmg.org/pmml/v4-3/KNN.html#xsdElement_TrainingInstances</p>
        <p>Listing 1.4. Labelled instances
&lt;xs:element name="LabeledInstances"&gt;
  &lt;xs:complexType&gt;
    &lt;xs:sequence&gt;
      &lt;xs:element ref="pmml:Extension" minOccurs="0" maxOccurs="unbounded"/&gt;
      &lt;xs:element ref="pmml:InstanceFields"/&gt;
      &lt;xs:element ref="InlineTable"/&gt;
    &lt;/xs:sequence&gt;
    &lt;xs:attribute name="recordCount" type="xs:positiveInteger" use="optional"/&gt;
    &lt;xs:attribute name="fieldCount" type="xs:positiveInteger" use="optional"/&gt;
  &lt;/xs:complexType&gt;
&lt;/xs:element&gt;
&lt;xs:element name="InlineTable"&gt;
  &lt;xs:complexType&gt;
    &lt;xs:sequence&gt;
      &lt;xs:element ref="pmml:Extension" minOccurs="0" maxOccurs="unbounded"/&gt;
      &lt;xs:element ref="Row" minOccurs="0" maxOccurs="unbounded"/&gt;
    &lt;/xs:sequence&gt;
  &lt;/xs:complexType&gt;
&lt;/xs:element&gt;
&lt;xs:element name="Row"&gt;
  &lt;xs:complexType&gt;
    &lt;xs:complexContent mixed="true"&gt;
      &lt;xs:restriction base="xs:anyType"&gt;
        &lt;xs:sequence&gt;
          &lt;xs:any processContents="skip" minOccurs="1" maxOccurs="unbounded"/&gt;
        &lt;/xs:sequence&gt;
        &lt;xs:attribute name="id" type="xs:string" use="required"/&gt;
        &lt;xs:attribute name="output" type="xs:string" use="required"/&gt;
      &lt;/xs:restriction&gt;
    &lt;/xs:complexContent&gt;
  &lt;/xs:complexType&gt;
&lt;/xs:element&gt;</p>
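        <p>To illustrate how the labeled instances could be consumed, the following Python sketch reads the rows of a LabeledInstances fragment into dictionaries. The fragment reuses the row shown in the examples of Section 3; the function name read_labeled_instances is hypothetical, and XML namespaces are omitted for brevity, which is an assumption that would not hold for a complete PMML document.</p>
        <preformat><![CDATA[
```python
import xml.etree.ElementTree as ET

# A small fragment shaped like the LabeledInstances element of Listing 1.4.
# Namespaces are left out to keep the sketch short.
FRAGMENT = """
<LabeledInstances>
  <InlineTable>
    <Row id="5" output="0.52717">
      <Age.Range>Young</Age.Range>
      <Car>Sports</Car>
      <Salary.Level>High</Salary.Level>
    </Row>
  </InlineTable>
</LabeledInstances>
"""


def read_labeled_instances(xml_text):
    """Return one dict per Row: its id, its output and its field values."""
    root = ET.fromstring(xml_text)
    rows = []
    for row in root.iter("Row"):
        rows.append({
            "id": row.get("id"),          # original row id in the data
            "output": row.get("output"),  # label or score, kept as a string
            "fields": {child.tag: child.text for child in row},
        })
    return rows


instances = read_labeled_instances(FRAGMENT)
print(instances[0]["id"], instances[0]["output"], instances[0]["fields"])
```
]]></preformat>
        <p>The output attribute is kept as a string because, according to the schema, it may hold either a label or a numeric score; a caller would convert it depending on the output attribute of the enclosing model.</p>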
        <p>An alternative would be to use the Output element already defined
in PMML instead of introducing a new structure consisting of multiple rows.
While using the Output element would be more in line with common practice
for existing PMML models, there are two important limitations. First, this would
imply that the existing set of features/operations defined in PMML is sufficient
to describe how the output for a specific anomaly detection model is obtained.
Second, this would not support the use case in which the output for only a limited number of
detected outliers should be returned.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Examples</title>
      <p>As examples we use three algorithms: a frequent-pattern mining algorithm, isolation forest and a rule-based classifier. The first two approaches are unsupervised; the rule-based classifier is an example of a standard supervised approach.
- Outlier detection based on frequent pattern mining: the FPOF (Frequent Pattern Outlier Factor) method [<xref ref-type="bibr" rid="ref4">4</xref>]. A reference implementation is available as an R package<sup>6</sup>. This package already exports to the proposed PMML extension.
- Isolation forest: a well-known algorithm with a good quality/complexity ratio, represented as an ensemble of trees. A reference implementation is provided by the BigML API<sup>7</sup> or scikit-learn (v 0.18)<sup>8</sup>.
- Rule-based classifier: a standard supervised classification algorithm based on rules. We use the reference implementation of the Classification Based on Associations (CBA) algorithm, which is available as an R package<sup>9</sup>.</p>
      <p><sup>6</sup> https://github.com/jaroslav-kuchar/fpmoutliers</p>
      <p>Listing 1.5 describes an example of the output of the frequent-pattern-based unsupervised method. The algorithm detects point outliers and provides scores as the output (line 9). Since the method is built on top of frequent patterns, the association model is included<sup>10</sup>. The final score is computed proportionally to the number of matching frequent itemsets and their support (cf. [<xref ref-type="bibr" rid="ref4">4</xref>] for details).</p>
      <p>To represent outliers based on frequent itemsets, our proposal reuses the
complete AssociationModel. What is actually needed is a way to express frequent
itemsets, which are only a part of it. An alternative, more complex version of the
schema, which we considered, would introduce a FrequentItemset model as a
standalone model and then reuse it in the OutlierDetectionModel.</p>
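      <p>The way a score can be derived from frequent itemsets and their supports can be sketched in Python as follows. The sketch follows our reading of the FPOF definition in [4]: the supports of all frequent itemsets contained in a transaction are summed and divided by the total number of frequent itemsets. The function name fpof and the itemsets with their supports are hypothetical and serve illustration only; the reference R package may normalise or rank scores differently.</p>
      <preformat><![CDATA[
```python
def fpof(transaction, frequent_itemsets):
    """Frequent Pattern Outlier Factor of one transaction.

    frequent_itemsets maps each frequent itemset (a frozenset of items)
    to its support. The score is the summed support of the itemsets
    contained in the transaction, divided by the number of itemsets.
    Lower values indicate more outlying transactions.
    """
    items = set(transaction)
    matched = sum(
        support
        for itemset, support in frequent_itemsets.items()
        if itemset.issubset(items)
    )
    return matched / len(frequent_itemsets)


# Hypothetical frequent itemsets and supports, for illustration only.
patterns = {
    frozenset({"Car=Sedan"}): 0.6,
    frozenset({"Salary.Level=Low"}): 0.5,
    frozenset({"Car=Sedan", "Salary.Level=Low"}): 0.4,
    frozenset({"Car=Sports"}): 0.4,
}

common = ["Age.Range=Middle", "Car=Sedan", "Salary.Level=Low"]
rare = ["Age.Range=Young", "Car=Sports", "Salary.Level=High"]
print(fpof(common, patterns))
print(fpof(rare, patterns))
```
]]></preformat>
      <p>In this toy setting the transaction that matches few frequent itemsets receives the lower score, which is why only the itemsets and their supports, rather than the association rules, are needed to reproduce the output.</p>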
      <p>Listing 1.5. Frequent-pattern mining example
&lt;?xml version="1.0"?&gt;
&lt;PMML version="4.3" xmlns="http://www.dmg.org/PMML-4_3" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:od="http://www.example.com/od" xsi:schemaLocation="http://www.dmg.org/PMML-4_3 pmml-4-3+od-0-1.xsd"&gt;
  &lt;Header copyright="" description=""&gt;
    &lt;Timestamp&gt;2017-04-30 07:01:05&lt;/Timestamp&gt;
  &lt;/Header&gt;
  &lt;DataDictionary&gt;
    &lt;!-- Data Dictionary is skipped for the demonstration purpose --&gt;
  &lt;/DataDictionary&gt;
  &lt;od:OutlierDetectionModel xmlns="http://www.example.com/od" algorithmName="fpof" modelName="FPI OD model" typeOfOutliers="point" numberOfOutliers="10" output="score"&gt;
    &lt;!-- ... --&gt;</p>
      <p><sup>7</sup> https://bigml.com/api/anomalies</p>
      <p><sup>8</sup> http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.IsolationForest.html</p>
      <p><sup>9</sup> https://cran.r-project.org/web/packages/rCBA/index.html</p>
      <p><sup>10</sup> http://dmg.org/pmml/v4-3/AssociationRules.html</p>
      <p>As a second example of representing the output of an unsupervised method we selected
isolation forests as implemented in bigml.com. Since isolation forests are built
from several trees, the model uses the Segmentation<sup>11</sup> specification to combine
multiple models into the final model.</p>
      <p>Listing 1.6 describes an example of an isolation forest as implemented in bigml.com. There is only one required parameter, specifying the number of trees that should be composed: two trees in this example (line 22). The algorithm also detects point outliers and provides scores as the output (line 13). The final output score for each instance (e.g. as on line 59) is derived from the combination of available trees and the depth of the relevant branches/predicates matching the instance (cf. [<xref ref-type="bibr" rid="ref5">5</xref>] for details).</p>
      <p><sup>11</sup> http://dmg.org/pmml/v4-3/MultipleModels.html</p>
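      <p>How a score can be derived from the depths reached in the individual trees can be sketched in Python as follows. The sketch uses the anomaly score of the original isolation forest paper [5], s(x, n) = 2^(-E[h(x)] / c(n)), where E[h(x)] is the mean path length over the trees and c(n) normalises by the sample size n. The function names, the path lengths and the sample size of 256 are illustrative assumptions, and the BigML implementation may compute its scores differently.</p>
      <preformat><![CDATA[
```python
import math

EULER_GAMMA = 0.5772156649


def average_path_length(n):
    """c(n): average path length of an unsuccessful search in a binary
    search tree built from n points, used to normalise tree depths."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + EULER_GAMMA  # approximation of H(n - 1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def anomaly_score(path_lengths, sample_size):
    """s(x, n) = 2 ** (-E[h(x)] / c(n)); sample_size must be at least 2."""
    mean_depth = sum(path_lengths) / len(path_lengths)
    return 2.0 ** (-mean_depth / average_path_length(sample_size))


# Hypothetical depths at which an instance is isolated in two trees.
shallow = anomaly_score([2, 3], sample_size=256)
deep = anomaly_score([10, 12], sample_size=256)
print(shallow, deep)
```
]]></preformat>
      <p>Instances isolated close to the root obtain scores nearer to 1, whereas instances whose mean depth equals c(n) obtain a score of 0.5. This is why the exported model needs to carry the tree structure, from which the depths can be recomputed.</p>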
      <p>Listing 1.6. Isolation forest example
&lt;?xml version="1.0"?&gt;
&lt;PMML version="4.3" xmlns="http://www.dmg.org/PMML-4_3"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:od="http://www.example.com/od"
    xsi:schemaLocation="http://www.dmg.org/PMML-4_3 pmml-4-3+od-0-1.xsd"&gt;
  &lt;Header copyright="" description=""&gt;
    &lt;Timestamp&gt;2017-04-30 09:02:01&lt;/Timestamp&gt;
  &lt;/Header&gt;
  &lt;DataDictionary&gt;
    &lt;!-- Data Dictionary is skipped for the demonstration purpose --&gt;
  &lt;/DataDictionary&gt;
  &lt;od:OutlierDetectionModel xmlns="http://www.example.com/od" algorithmName="iforest"
      modelName="BigML Isolation Forests"
      typeOfOutliers="point" numberOfOutliers="10" output="score"&gt;
    &lt;!-- ... --&gt;
        &lt;Row id="5" output="0.52717"&gt;
          &lt;Age.Range&gt;Young&lt;/Age.Range&gt;
          &lt;Car&gt;Sports&lt;/Car&gt;
          &lt;Salary.Level&gt;High&lt;/Salary.Level&gt;
        &lt;/Row&gt;
        &lt;!-- The rest of the table is skipped for the demonstration purpose --&gt;
      &lt;/InlineTable&gt;
    &lt;/LabeledInstances&gt;
  &lt;/od:OutlierDetectionModel&gt;
&lt;/PMML&gt;</p>
      <sec id="sec-3-1">
        <title>Rule-based classifier</title>
        <p>The rule-based classifier is a representative of a supervised method: a standard
classification algorithm applied to the outlier detection problem. Let us assume
that the fifth instance is annotated as an outlier using the Class attribute
(see Table 1). The rule-based classifier (here CBA) can learn rules that label
specific instances as outliers and the rest as normal instances.</p>
        <p>A simplified output of the rule-based classifier can look as follows:
- {} → {Class=Normal}
- {Car=Sports &amp; Salary-Level=High} → {Class=Outlier}</p>
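        <p>How such a rule list labels instances can be sketched in Python as follows. The sketch assumes the usual CBA convention that rules are tried in order and the first matching rule determines the class, with the rule that has an empty antecedent acting as the default; the function name classify and the two instances are hypothetical.</p>
        <preformat><![CDATA[
```python
# Ordered rule list mirroring the simplified output above: the specific
# rule is tried first and the empty-antecedent rule acts as the default.
RULES = [
    ({"Car": "Sports", "Salary-Level": "High"}, "Outlier"),
    ({}, "Normal"),
]


def classify(instance, rules):
    """Return the class of the first rule whose antecedent matches."""
    for antecedent, label in rules:
        if all(instance.get(key) == value for key, value in antecedent.items()):
            return label
    return None  # no rule matched and no default rule was given


fifth = {"Age-Range": "Young", "Car": "Sports", "Salary-Level": "High"}
other = {"Age-Range": "Middle", "Car": "Sedan", "Salary-Level": "Low"}
print(classify(fifth, RULES))
print(classify(other, RULES))
```
]]></preformat>
        <p>Because the outcome depends on the order of the rules, a consumer of the exported model needs the rules together with whatever information determines their precedence, such as their confidence and support.</p>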
        <p>The structure of the model in PMML (Listing 1.7) is similar to the
unsupervised frequent-pattern-based model. The difference is in the setting of the algorithm
name (line 9), the output type and the parameters (starting from line 18). The model
also reuses AssociationModel to represent rules.</p>
        <p>Listing 1.7. Rule-based classifier example
&lt;?xml version="1.0"?&gt;
&lt;PMML version="4.3" xmlns="http://www.dmg.org/PMML-4_3" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:od="http://www.example.com/od" xsi:schemaLocation="http://www.dmg.org/PMML-4_3 pmml-4-3+od-0-1.xsd"&gt;
  &lt;Header copyright="" description=""&gt;
    &lt;Timestamp&gt;2017-04-30 11:39:17&lt;/Timestamp&gt;
  &lt;/Header&gt;
  &lt;DataDictionary&gt;
    &lt;!-- Data Dictionary is skipped for the demonstration purpose --&gt;
  &lt;/DataDictionary&gt;
  &lt;od:OutlierDetectionModel xmlns="http://www.example.com/od" algorithmName="cba" modelName="CBA OD model" typeOfOutliers="point" numberOfOutliers="10" output="label"&gt;
    &lt;MiningSchema xmlns="http://www.dmg.org/PMML-4_3"&gt;
      &lt;MiningField name="Age.Range"/&gt;
      &lt;MiningField name="Car"/&gt;
      &lt;MiningField name="Salary.Level"/&gt;
      &lt;MiningField name="Class"/&gt;
    &lt;/MiningSchema&gt;
    &lt;ParameterList&gt;
      &lt;Parameter name="minSupport" value="0.1"/&gt;
      &lt;Parameter name="minConfidence" value="0.1"/&gt;
      &lt;Parameter name="label" value="Class"/&gt;
    &lt;/ParameterList&gt;
    &lt;AssociationModel functionName="associationRules" numberOfItems="6" minimumSupport="0.1" minimumConfidence="0.1" numberOfItemsets="29" numberOfRules="12" xmlns="http://www.dmg.org/PMML-4_3"&gt;
      &lt;MiningSchema&gt;
        &lt;MiningField name="transaction" usageType="group"/&gt;
        &lt;MiningField name="item" usageType="active"/&gt;
      &lt;/MiningSchema&gt;
      &lt;Item id="1" value="Age.Range=Middle"/&gt;
      &lt;Item id="2" value="Age.Range=Young"/&gt;
      &lt;Item id="3" value="Car=Sedan"/&gt;
      &lt;Item id="4" value="Car=Sports"/&gt;
      &lt;Item id="5" value="Salary.Level=High"/&gt;
      &lt;Item id="6" value="Salary.Level=Low"/&gt;
      &lt;Item id="7" value="Class=Normal"/&gt;
      &lt;Item id="8" value="Class=Outlier"/&gt;
      &lt;Itemset id="1" numberOfItems="1" support="0.4"&gt;
        &lt;ItemRef itemRef="4"/&gt;
      &lt;/Itemset&gt;
      &lt;AssociationRule support="0.4" confidence="0.7" antecedent="7" consequent="10"/&gt;
      &lt;!-- The rest of the association model is skipped for the demonstration purpose --&gt;
    &lt;/AssociationModel&gt;
    &lt;LabeledInstances&gt;
      &lt;InlineTable&gt;
        &lt;Row id="5" output="Outlier"&gt;
          &lt;Age.Range&gt;Young&lt;/Age.Range&gt;
          &lt;Car&gt;Sports&lt;/Car&gt;
          &lt;Salary.Level&gt;High&lt;/Salary.Level&gt;
        &lt;/Row&gt;
        &lt;!-- The rest of the table is skipped for the demonstration purpose --&gt;
      &lt;/InlineTable&gt;
    &lt;/LabeledInstances&gt;
  &lt;/od:OutlierDetectionModel&gt;
&lt;/PMML&gt;</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Related Work</title>
      <p>We have identified one existing approach to representing outlier detection models as
an extension of PMML, which is used by the R-based implementation of isolation
forests<sup>12</sup>. This specification, implemented as part of the jpmml package<sup>13</sup>, is based
on the regression mining function of underlying models from PMML.</p>
      <p>Basing the model on regression implies supervised learning. Isolation forests do
produce numeric scores, but they are generally considered an unsupervised
model. Furthermore, the regression framework is not suitable for other types of
outlier detection algorithms.</p>
      <p>Our model proposal fundamentally differs from jpmml in that it is not based
on a particular existing PMML model, but fosters the reuse of fragments from
the AssociationModel and Segmentation PMML models, which, as we
demonstrated, allows support for a broader range of outlier detection algorithms,
including isolation forests.</p>
    </sec>
    <sec id="sec-5">
      <title>Conclusions</title>
      <p>Designing an anomaly detection model for PMML is particularly hard because,
in principle, nearly all data mining models can produce information about
outliers. The goal of our work was to design a modular solution that would support
a broader range of anomaly detection algorithms. We demonstrated the proposed
format on three algorithms. A reference implementation of the export is available
as an R package for frequent pattern mining outlier detection<sup>14</sup>. Including the
OutlierDetection model is on the roadmap for the next release of the PMML
specification.</p>
      <p><sup>12</sup> https://r-forge.r-project.org/R/?group_id=479</p>
      <p><sup>13</sup> https://github.com/jpmml/r2pmml</p>
      <p>Acknowledgements. The authors would like to thank the anonymous
reviewers for their insightful comments. This research was supported by the European
Union's H2020 EU research and innovation programme via the OpenBudgets.eu
project (under grant agreement No 645833). Tomas Kliegr was supported by
long-term institutional support of research activities by the Faculty of Informatics
and Statistics, University of Economics, Prague.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Aggarwal</surname>
            ,
            <given-names>C.C.</given-names>
          </string-name>
          :
          <article-title>An Introduction to Outlier Analysis</article-title>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>34</lpage>
          . Springer International Publishing, Cham (
          <year>2017</year>
          ), http://dx.doi.org/10.1007/978-3-319-47578-3_1
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Guazzelli</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zeller</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>W.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Williams</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>PMML: An Open Standard for Sharing Models</article-title>
          .
          <source>The R Journal</source>
          <volume>1</volume>
          (
          <issue>1</issue>
          ),
          <fpage>60</fpage>
          -
          <lpage>65</lpage>
          (
          <year>2009</year>
          ), https://journal.r-project.org/archive/2009/RJ-2009-010/index.html
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Hawkins</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Identification of Outliers. Monographs on applied probability and statistics</article-title>
          , Chapman and Hall (
          <year>1980</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>He</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xu</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Huang</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Deng</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>FP-outlier: Frequent pattern based outlier detection</article-title>
          .
          <source>Computer Science and Information Systems/ComSIS</source>
          <volume>2</volume>
          (
          <issue>1</issue>
          ),
          <fpage>103</fpage>
          -
          <lpage>118</lpage>
          (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>F.T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ting</surname>
            ,
            <given-names>K.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhou</surname>
            ,
            <given-names>Z.H.</given-names>
          </string-name>
          :
          <article-title>Isolation forest</article-title>
          .
          In:
          <source>Proceedings of the 8th IEEE International Conference on Data Mining (ICDM'08)</source>
          . pp.
          <fpage>413</fpage>
          -
          <lpage>422</lpage>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>