<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Evaluation of Association Rules Extracted during Anomaly Explanation</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Martin</forename><surname>Kopp</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Faculty of Information Technology</orgName>
								<orgName type="institution">Czech Technical University in Prague</orgName>
								<address>
									<addrLine>Thákurova 9</addrLine>
									<postCode>160 00</postCode>
									<settlement>Prague</settlement>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="department">Cognitive Research Team</orgName>
								<orgName type="institution">Cisco Systems</orgName>
								<address>
									<settlement>Prague</settlement>
								</address>
							</affiliation>
							<affiliation key="aff2">
								<orgName type="department">Institute of Computer Science</orgName>
								<orgName type="institution">Academy of Sciences of the Czech Republic</orgName>
								<address>
									<addrLine>Pod Vodárenskou věží</addrLine>
									<postCode>182 07</postCode>
									<settlement>Prague</settlement>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Martin</forename><surname>Holeňa</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Faculty of Information Technology</orgName>
								<orgName type="institution">Czech Technical University in Prague</orgName>
								<address>
									<addrLine>Thákurova 9</addrLine>
									<postCode>160 00</postCode>
									<settlement>Prague</settlement>
								</address>
							</affiliation>
							<affiliation key="aff2">
								<orgName type="department">Institute of Computer Science</orgName>
								<orgName type="institution">Academy of Sciences of the Czech Republic</orgName>
								<address>
									<addrLine>Pod Vodárenskou věží</addrLine>
									<postCode>182 07</postCode>
									<settlement>Prague</settlement>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Evaluation of Association Rules Extracted during Anomaly Explanation</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">BC4E4916E4E831BCD5C43539FD436A61</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T16:09+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Anomaly detection</term>
					<term>anomaly interpretation</term>
					<term>association rules</term>
					<term>confidence boost</term>
					<term>random forest</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Discovering anomalies within data is nowadays very important, because it helps to uncover interesting events. Consequently, a considerable number of anomaly detection algorithms has been proposed in the last few years. However, only a few papers about anomaly detection have at least mentioned why some samples were labelled as anomalous. Therefore, we proposed a method that allows extracting rules explaining an anomaly from an ensemble of specifically trained decision trees, called a sapling random forest.</p><p>Our method is able to interpret the output of an arbitrary anomaly detector. The explanation is given as conjunctions of atomic conditions, which can be viewed as antecedents of association rules. In this work we focus on the selection, post-processing and evaluation of those rules. The main goal is to present a small number of the most important rules. To achieve this, we use quality measures such as lift and confidence boost. The resulting sets of rules are experimentally and empirically evaluated on two artificial datasets and one real-world dataset.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>According to IBM research <ref type="bibr" target="#b12">[13]</ref>, there were 2.7 zettabytes of data in the digital universe in April 2012, and this amount is doubling approximately every 40 months.</p><p>Not only is it almost impossible to process such huge amounts of data, we are actually not interested in the raw data, but rather in the salient knowledge and interesting patterns contained in them. This is the reason why anomaly detection, especially unsupervised anomaly detection, becomes more and more important <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b22">23]</ref>. Although it can be formalised as binary classification, it entails different issues and challenges than supervised classification. For example, anomalous events often adapt to appear normal, and even normal behaviour evolves over time. Furthermore, defining a normal region is very difficult, especially when the boundary between normal and anomalous is not precise.</p><p>For the purposes of this paper, consider anomalies to be equal to outliers as defined by Hawkins <ref type="bibr" target="#b10">[11]</ref>: "An outlier is an observation which deviates so much from the other observations as to arouse suspicions that it was generated by a different mechanism."</p><p>A more formal definition would necessarily reduce the number of plausible anomaly detectors and/or application domains. This is in conflict with our goal to provide a solution as general as possible.</p><p>Even though anomaly detection techniques are aimed at only a minority of samples, the importance of and demand for them grows rapidly. 
Real-world applications range from network security <ref type="bibr" target="#b9">[10]</ref> and bioinformatics <ref type="bibr" target="#b23">[24]</ref> or financial fraud detection <ref type="bibr" target="#b21">[22]</ref> to astronomy and space exploration <ref type="bibr" target="#b8">[9]</ref>.</p><p>The identification of anomalies is only half of the whole task. The second, equally important half is their interpretation. In high-dimensional domains, such as network security or bioinformatics, where hundreds or even thousands of features are common, proper interpretation is crucial. Therefore, anomalies have to be interpreted clearly, as a feature subset that explains their deviation from ordinary data, or even better as a set of association rules.</p><p>In <ref type="bibr" target="#b20">[21]</ref> we proposed a method of anomaly explanation based upon specifically trained ensembles of decision trees called sapling random forests (SRF). The main idea behind it is to view the explanation as a feature selection and classification problem. Specifically, the goal is to find features in which the margin between the anomalous sample and the normal samples is maximised. Therefore, SRF returns a subset of features, or more precisely rules on these features, describing why a sample has been identified as an anomaly.</p><p>The main drawback of direct rule extraction from our sapling random forests is the large number of rules, some of them introduced by an unfortunate training set selection. These issues can partially be solved by confidence and/or support thresholds. But for our ultimate goal, to present the minimal number of rules containing the maximal amount of useful information, such a simple approach is insufficient. Therefore, in this paper we focus on the proper selection, post-processing and evaluation of rule sets extracted from sapling random forests during anomaly explanation. 
We tested association rule quality measures such as lift and confidence boost. This paper is work in progress, and we would like to extend the number of tested quality measures with subjective measures such as novelty.</p><p>The rest of this paper is organised as follows. The next section briefly reviews related work. Section 3 describes the SRF principles and its training, followed by the rule extraction process in Section 4. The selected quality measures of association rules are presented in Section 5. Experimental evaluation is described in Section 6, and Section 7 concludes the paper.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Related Work</head><p>For more information about anomaly detection we refer to the recent book by Aggarwal <ref type="bibr" target="#b0">[1]</ref>. This book provides an exhaustive listing of anomaly detection algorithms and their applications in different domains. Another source may be <ref type="bibr" target="#b4">[5]</ref>, which is briefer but very well written. To the best of our knowledge, there have been only a few works addressing not only the identification of anomalies, but also their explanation.</p><p>Knorr et al. <ref type="bibr" target="#b14">[15]</ref> focused on what kind of knowledge should be extracted and provided to the user. Strong and weak outliers were defined and searched for within data by distance-based algorithms described in detail in <ref type="bibr" target="#b13">[14]</ref>.</p><p>Dang et al. <ref type="bibr" target="#b6">[7]</ref> presented an algorithm identifying and explaining anomalies. The algorithm starts by selecting a set of neighbouring samples based on quadratic entropy, which are presented to a Fisher linear discriminant classifier to seek an optimal half-space in which a detected anomaly is well separated. The process of interpretation is entangled with the presented method of identification of anomalies. The difference to our work is that SRF can be used after an arbitrary anomaly detection algorithm to interpret its results.</p><p>The most similar to our approach, and the most recent, is <ref type="bibr" target="#b19">[20]</ref>. Their approach, as well as ours, can interpret the output of an arbitrary anomaly detector as a subset of features. They use classification accuracy for outlier ranking. The main drawback of this approach is that it needs balanced training sets, which are created by sampling artificial samples around the anomalous point. 
In contrast to this work, our approach can handle unbalanced training sets easily and returns not only feature subsets but feature subsets with rules on them, providing even more information about the anomaly. Furthermore, we simplify the analysis by clustering, which enables interpreting similar anomalies at once <ref type="bibr" target="#b15">[16]</ref>.</p><p>On the other hand, there are many papers about association rules. This paper was inspired mainly by <ref type="bibr" target="#b2">[3]</ref>, which is about measuring the redundancy and information quality of sets of association rules. The author presents a measure called confidence boost and an algorithm to produce a small set of association rules using this measure. A really extensive list of interestingness measures can be found in <ref type="bibr" target="#b11">[12]</ref>. There is a lot of inspiration there for our future work.</p><p>An alternative approach, well described in <ref type="bibr" target="#b5">[6]</ref>, may be so-called subjective measures. A typical example is the novelty, sometimes called unexpectedness, of a rule with respect to user-provided domain knowledge or with respect to another rule set. Because these terms are ambiguous, there are multiple approaches to measuring them. The approach in <ref type="bibr" target="#b17">[18]</ref> inspired us for our future work.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Sapling Random Forest</head><p>This section outlines the principles of sapling random forests. SRF is a method able to explain the output of an arbitrary anomaly detector, proposed by us in <ref type="bibr" target="#b20">[21]</ref>. It is a random forest of specifically trained decision trees. Because the produced trees are small, they are called saplings rather than trees. The produced explanations show the features in which the inspected samples differ the most from the rest of the data. These features are used to produce association rules, which are more informative than a mere set of features. An outline of the whole method is given in Algorithm 1.</p><p>Algorithm 1 Algorithm summary
y ← anomalyDetector(data)
for all data(y == anomaly) do
	T ← createTrainingSet(size, method)
	t ← trainTree(T)
	SRF ← SRF ∪ {t}
end for
extractRules(SRF)</p></div>
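The loop of Algorithm 1 can be sketched in a few lines of Python. This is our own illustrative reconstruction, not the authors' implementation: the detector (scikit-learn's IsolationForest), the training-set size of 50 and the depth limit of 3 are assumptions made for the sketch.

```python
# Hedged sketch of Algorithm 1: detect anomalies, then grow one small
# "sapling" (shallow decision tree) per anomaly against sampled normal data.
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
data = rng.normal(size=(500, 2))
data[:5] += 6  # plant a few obvious anomalies

y = IsolationForest(random_state=0).fit_predict(data)  # -1 marks anomalies

srf = []  # the forest: one sapling per detected anomaly
normals = data[y == 1]
for x_a in data[y == -1]:
    # training set T: the anomaly as one class, k random normal samples as the other
    T = normals[rng.choice(len(normals), size=50, replace=False)]
    X = np.vstack([T, x_a])
    labels = np.array([0] * len(T) + [1])
    t = DecisionTreeClassifier(max_depth=3, criterion="entropy").fit(X, labels)
    srf.append(t)

print(len(srf))  # number of saplings equals the number of detected anomalies
```

The rule extraction step (extractRules in Algorithm 1) then walks each sapling's root-to-anomaly path, as described in Section 4.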
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Training Set Selection</head><p>A dataset X = {x 1 , x 2 , . . . , x l }, where x ∈ R d , can be split into two disjoint sets: X a , containing anomalous samples, and X n , containing normal samples. Then a training set T contains the anomaly x a as one class and a subset of X n as the other. The first strategy for creating training sets is to select the k nearest neighbours of x a from X n . This strategy is sensible for algorithms detecting local anomalies, as according to <ref type="bibr" target="#b7">[8]</ref> they are more general than algorithms detecting global anomalies. The drawback of this strategy is its computational complexity.</p><p>The second strategy is to select k samples randomly from X n with uniform probability. The advantage of this approach is the possibility to generate more than one training set per anomaly by repeating the sampling process. More training sets lead to more saplings per anomaly and to a more robust explanation, but at the expense of a more complicated aggregation of the rules extracted from them (see Section 4). A comparison of both approaches can be found in <ref type="bibr" target="#b20">[21]</ref>.</p></div>
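The two selection strategies can be sketched as follows; this is a minimal illustration, and the function names and the value of k are ours, not the paper's.

```python
# Two ways to pick the normal-class half of a training set T for anomaly x_a:
# the k-nearest-neighbour strategy (local) and uniform random sampling.
import numpy as np

def knn_training_set(x_a, X_n, k):
    """k nearest normal neighbours of the anomaly (local strategy)."""
    d = np.linalg.norm(X_n - x_a, axis=1)
    return X_n[np.argsort(d)[:k]]

def random_training_set(x_a, X_n, k, rng):
    """k normal samples drawn uniformly; resampling yields more training sets."""
    return X_n[rng.choice(len(X_n), size=k, replace=False)]

rng = np.random.default_rng(0)
X_n = rng.normal(size=(100, 2))   # normal samples
x_a = np.array([5.0, 5.0])        # one anomaly

print(knn_training_set(x_a, X_n, 3).shape)          # (3, 2)
print(random_training_set(x_a, X_n, 3, rng).shape)  # (3, 2)
```

The kNN variant costs a distance computation against all of X n per anomaly, which is the computational drawback mentioned above; the random variant is cheap and repeatable.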
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Training a Sapling</head><p>For simplicity, consider a sapling to be a binary decision tree with a typical height of 1-3. In the SRF method, there are always two leaves at the maximal depth, one of which contains only the anomaly x a and the other only normal samples. The small height of the saplings has two reasons. First, the training sets are relatively small. Second, according to the anomaly isolation approach <ref type="bibr" target="#b18">[19]</ref>, if the analysed sample is an anomaly, it should be easily separated from the rest of the data, resulting again in small trees. Therefore, if the height of a sapling is greater than expected, it should be taken into consideration that the explained sample may not be an anomaly.</p><p>The standard procedure to find the splitting function h for a new internal node is maximising the information gain over the space of all possible splitting functions H as</p><formula xml:id="formula_0">arg max h∈H − ∑ b∈{L,R} |S b (h)| |S| H(S b (h)),<label>(1)</label></formula><p>where S is the subset of the training set T reaching the leaf being split, S L (h) = {x ∈ S|h(x) = +1}, S R (h) = {x ∈ S|h(x) = −1} and H(S) is the entropy of S.</p><p>The second commonly used approach minimises the Gini impurity:</p><formula xml:id="formula_1">arg min h∈H ∑ b∈{L,R} 1 − (|S b (h) ∩ X a |/|S b (h)|) 2 − (|S b (h) ∩ X n |/|S b (h)|) 2 .<label>(2)</label></formula><p>For the experiments presented in this paper we used information gain.</p></div>
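The entropy criterion in (1) can be illustrated on a tiny example: maximising the information gain is the same as minimising the weighted entropy of the two children, computed below. The data and the candidate thresholds are made up.

```python
# Toy illustration of split criterion (1): the best split minimises the
# weighted entropy of the children S_L and S_R.
import numpy as np

def entropy(labels):
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def weighted_child_entropy(labels, mask):
    """Weighted entropy of the two children induced by the boolean `mask`."""
    n = len(labels)
    score = 0.0
    for side in (mask, ~mask):
        if side.any():
            score += side.sum() / n * entropy(labels[side])
    return score

labels = np.array([0, 0, 0, 0, 1])       # four normal samples, one anomaly
x = np.array([1.0, 1.2, 0.8, 1.1, 4.0])  # a single feature

isolating = weighted_child_entropy(labels, x > 2.5)  # isolates the anomaly
mixed = weighted_child_entropy(labels, x > 1.0)      # mixes the classes
print(isolating, mixed)
print(mixed > isolating)  # True: the isolating threshold is preferred
```

The isolating split reaches zero child entropy, which is why a genuine anomaly yields a very short root-to-leaf path, in line with the isolation argument above.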
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Extraction and Evaluation of Rules</head><p>Once a sapling is grown, it is used to explain the anomaly x a . Let h j 1 ,θ 1 , . . . , h j d ,θ d be the set of splitting functions, with features j 1 , . . . , j d and thresholds θ 1 , . . . , θ d , used in the inner nodes on the path from the root to the leaf with the anomaly x a . Then x a is explained as a conjunction of atomic conditions:</p><formula xml:id="formula_2">c = (x j 1 &gt; θ 1 ) ∧ (x j 2 &gt; θ 2 ) ∧ . . . ∧ (x j d &gt; θ d ),<label>(3)</label></formula><p>which is the output of the algorithm. This conjunction can be read as "the sample is anomalous because it is greater than threshold θ 1 in feature j 1 and greater than θ 2 in feature j 2 and . . . than the majority of (or the nearest neighbouring) samples." Because the resulting trees are very small, the explanation is compact.</p><p>The situation is more difficult when more saplings per anomaly have been grown, as each sapling provides one conjunction of type (<ref type="formula" target="#formula_2">3</ref>). Using more than one sapling per anomaly improves robustness for training sets created by uniform sampling. The problem is that returning the set of all conjunctions C is undesirable, as the primary objective, the explanation of the anomaly to a human, would not be met. Hence, the algorithm needs to aggregate the conjunctions in C.</p><p>To simplify the following notation, consider 2d items, such that 2 items are assigned to each feature: one for "&lt;" rules, the other for "&gt;" rules. Denote this set of 2d items F . Then we can group the rules into rule sets R f according to the item set f ⊆ F they share.</p><p>Based on |R f |, the algorithm discards groups of low importance by sorting them in descending order and then using only the first k groups such that their cumulative frequency is greater than a threshold τ, which we recommend to be 0.90 or 0.95. 
Using the adopted notation, k is determined as</p><formula xml:id="formula_3">k = min { k | ( ∑ i=1..k |R f i | ) / ( ∑ f ∈F |R f | ) &gt; τ },<label>(4)</label></formula><p>where it is assumed, to simplify the notation, that the rule sets R f are sorted in descending order of their size. We have also investigated the complementary approach, where groups are selected if they were used with a frequency higher than a specified threshold. But the presented strategy, based on the cumulative frequency, showed more consistent results in our experiments.</p><p>Once the set of groups with decision rules is selected, we create one rule r f for every rule set R f . The threshold for each item f j is calculated as the average of all thresholds within the rule set R f :</p><formula xml:id="formula_4">θ j = ( 1 / |R f | ) ∑ i=1..|R f | θ i, j<label>(5)</label></formula><p>By this approach we obtain one representative rule for each feature set f as:</p><formula xml:id="formula_5">c f = x j 1 &gt; θ1 ∧ x j 2 &gt; θ2 ∧ . . . ∧ x j t &gt; θt .<label>(6)</label></formula></div>
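The grouping by shared item sets, the cumulative-frequency cut-off of (4) and the threshold averaging of (5) can be sketched as follows. The data structures and the example rules are our own illustration, not the paper's code; items are (feature, direction) pairs with "gt"/"lt" standing for the two directions.

```python
# Sketch of the rule aggregation from Section 4: group conjunctions by item
# set, keep the largest groups up to cumulative frequency tau, average thresholds.
from collections import defaultdict

# each extracted rule maps an item (feature, direction) to its threshold
rules = [
    {("x1", "gt"): 2.3}, {("x1", "gt"): 2.5}, {("x1", "gt"): 2.1},
    {("x2", "lt"): -2.4}, {("x2", "lt"): -2.2},
    {("x1", "gt"): 0.1, ("x2", "gt"): 0.2},
]

# group rules into rule sets R_f by the item set f they share
groups = defaultdict(list)
for r in rules:
    groups[frozenset(r)].append(r)

# keep the largest groups until their cumulative frequency exceeds tau, as in (4)
tau, total, cum, kept = 0.9, len(rules), 0, []
for f, R_f in sorted(groups.items(), key=lambda g: -len(g[1])):
    kept.append((f, R_f))
    cum += len(R_f)
    if cum / total > tau:
        break

# one representative rule per surviving group: average the thresholds, as in (5)
representatives = [
    {item: sum(r[item] for r in R_f) / len(R_f) for item in f} for f, R_f in kept
]
print(representatives[0])  # averaged thresholds of the largest group
```

With tau = 0.9 all three groups survive here; with a smaller tau the singleton group of the mixed rule would be dropped first.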
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Measuring Quality of Rules</head><p>This section reviews selected quality measures of association rules. Typical association rules are of the form A → Y, where A, Y are item sets. In the rules extracted from SRF, the items are atomic conditions and Y always means "is anomalous". Therefore, our rules are of the form:</p><formula xml:id="formula_6">r f = c f → y,<label>(7)</label></formula><p>where c f is a conjunction of atomic conditions like (3), y = x ∈ X a and f ⊆ F . The rule r f in its full form then looks as:</p><formula xml:id="formula_7">r f (x) = x f 1 &gt; θ f 1 ∧ x f 2 &gt; θ f 2 ∧ . . . ∧ x f n &gt; θ f n → x ∈ X a ,<label>(8)</label></formula><p>where n is the maximal index in the item set f .</p><p>For this kind of rules, support <ref type="bibr" target="#b1">[2]</ref> is calculated as:</p><formula xml:id="formula_9">supp(c f ) = |{x ∈ X | c f (x)}| / |X |,<label>(9)</label></formula><formula xml:id="formula_10">supp(y) = |X a | / |X |,<label>(10)</label></formula><p>and gives the proportion of data points which satisfy the antecedent c f , respectively the consequent y. It is used to measure the importance of a rule or as a frequency constraint. The disadvantage of support is that infrequent rules are often discarded. This is a much bigger problem than it might seem, because we are generating rules for anomalies, which are rare by definition.</p><p>Another frequently used measure is confidence <ref type="bibr" target="#b1">[2]</ref>:</p><formula xml:id="formula_12">conf(c f → y) = supp(c f → y) / supp(c f ).<label>(11)</label></formula><p>It estimates the conditional probability of the consequent being true given that the antecedent is true. The trouble with confidence is caused by its sensitivity to the frequency of y. 
Because all rules extracted from SRF have the same consequent, the rule rankings produced by lift and confidence are the same.</p><p>The third measure we used is lift <ref type="bibr" target="#b3">[4]</ref>:</p><formula xml:id="formula_14">lift(c f → y) = conf(c f → y) / supp(y),<label>(12)</label></formula><p>which measures how many times more often the antecedent c f and the consequent y occur together than would be expected if they were statistically independent. Lift does not suffer from the rare-items problem. Because in our experiments the consequent always has the same frequency, there is no need to measure both lift and confidence.</p><p>Finally, confidence boost, introduced by Balcázar <ref type="bibr" target="#b2">[3]</ref>, is calculated as:</p><formula xml:id="formula_15">β (r f ) = conf(r f ) / max{conf(r f ′ ) | supp(r f ′ ) &gt; σ , r f ′ ≢ r f , f ′ ⊆ f },<label>(13)</label></formula><p>where σ is a support threshold and r f ′ ≢ r f denotes the inequivalence of the rules r f ′ and r f , which in our simple case, where all consequents are the same, means that f ′ ≠ f . From (<ref type="formula" target="#formula_15">13</ref>) it is then evident that f ′ ⊂ f .</p><p>If the set of confidences in the denominator is empty, the confidence boost is by convention set to infinity.</p></div>
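The measures of this section can be reproduced on a toy dataset. The rules and samples below are made up, and the single sub-rule used for the confidence boost ratio only illustrates the denominator of (13), which in general maximises over all valid sub-rules.

```python
# Toy computation of support, confidence and lift, eqs. (9)-(12), plus an
# illustrative confidence boost ratio in the spirit of eq. (13).
def supp(pred, X):
    return sum(1 for x in X if pred(x)) / len(X)

def conf(ant, cons, X):
    covered = [x for x in X if ant(x)]
    return sum(1 for x in covered if cons(x)) / len(covered)

def lift(ant, cons, X):
    return conf(ant, cons, X) / supp(cons, X)

# samples are (feature value, is_anomalous) pairs
X = [(0.1, False), (0.2, False), (0.3, False), (2.5, True), (3.0, True)]
is_anom = lambda s: s[1]
r = lambda s: s[0] > 2.0     # antecedent of the rule: x greater than 2.0

print(supp(r, X))            # 0.4
print(conf(r, is_anom, X))   # 1.0
print(lift(r, is_anom, X))   # 2.5

# a weaker rule on a subset of r's items bounds r's confidence boost from above
r_sub = lambda s: s[0] > 0.25
print(round(conf(r, is_anom, X) / conf(r_sub, is_anom, X), 2))  # 1.5
```

Note that with the fixed consequent, lift is confidence divided by the constant supp(y) = 0.4, which is exactly why lift and confidence rank these rules identically.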
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6">Experiments</head><p>For the experimental evaluation we used the synthetic three layer donut dataset, the well-known Fisher's iris dataset, and the Letter recognition dataset from the UCI repository <ref type="bibr" target="#b16">[17]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.1">Three Layer Donut</head><p>The three layer donut dataset contains 1000 normal samples forming a two-dimensional toroid (donut). There are 200 anomalies, one half inside the toroid and the other half outside of it. For this dataset we created 10 rules per anomaly using SRF, resulting in 2000 rules. After the simple aggregation described in Section 4, only 8 rules were left. All of them are printed in Table <ref type="table">1</ref>, sorted by their respective supports. All rules before aggregation were used to calculate the confidence boost; otherwise, many rules would have a confidence boost equal to infinity, because there are no rules defined on any subset of their item sets f . Rules r 5 − r 8 have quite small support but, according to the other measures and our intuition, they are very important. The small support is due to the small number of data points explained; let us recall that anomalies are only one sixth of all data points in this dataset. Both lift and confidence boost mostly reflect our subjective expectations.</p><p>rule supp lift β
r 1 = x 1 &gt; −0.33 ∧ x 2 &gt; −0.39 0.38 1.64 0.27
r 2 = x 1 &gt; −0.33 ∧ x 2 &lt; 0.3 0.37 1.60 0.27
r 3 = x 1 &lt; 0.4 ∧ x 2 &lt; 0.34 0.37 1.49 0.25
r 4 = x 1 &lt; 0.37 ∧ x 2 &gt; −0.37 0.37 1.57 0.26
r 5 = x 2 &gt; 2.2 0.02 6.00 1.00
r 6 = x 1 &gt; 2.3 0.02 6.00 1.00
r 7 = x 2 &lt; −2.4 0.01 6.00 1.00
r 8 = x 1 &lt; −2.4 0.01 6.00 1.00</p><p>Table <ref type="table">1</ref>: Aggregated rules with their quality measures for the three layer donut dataset, sorted by their respective supports.</p><p>All rules are depicted in Figure <ref type="figure" target="#fig_0">1</ref>. It is evident that the presented rules cannot separate anomalies from normal samples perfectly. Especially difficult are the anomalies inside the donut. 
To separate those inner anomalies perfectly, it would be necessary to combine several rules, for example r 1 and r 3 or r 2 and r 4 .</p></div>
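For readers who want to reproduce a similar setting, a dataset in the spirit of the three layer donut can be generated as follows. The paper does not give the exact generator, so the radii and the inside/outside split are assumptions.

```python
# Hedged sketch of a "three layer donut"-like dataset: a 2-D ring of normal
# samples with anomalies planted inside the hole and outside the ring.
import numpy as np

rng = np.random.default_rng(0)

phi = rng.uniform(0, 2 * np.pi, 1000)
radius = rng.uniform(1.0, 2.0, 1000)               # the normal "donut" ring
normal = np.c_[radius * np.cos(phi), radius * np.sin(phi)]

inner = rng.uniform(-0.3, 0.3, (100, 2))           # anomalies inside the hole
phi_o = rng.uniform(0, 2 * np.pi, 100)
radius_o = rng.uniform(2.5, 3.0, 100)              # anomalies outside the ring
outer = np.c_[radius_o * np.cos(phi_o), radius_o * np.sin(phi_o)]
anomalies = np.vstack([inner, outer])

print(normal.shape, anomalies.shape)  # (1000, 2) (200, 2)
```

On such data, single half-plane rules can cut off the outer anomalies (like r 5 − r 8 above), while the inner ones need a conjunction of several box-like rules, matching the observation in the text.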
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.2">Iris</head><p>The virginica species was selected as the anomalous class for the iris dataset. Five rules per anomaly were produced using SRF, resulting in 250 rules. After aggregation we obtained 6 rules. They are written in Table <ref type="table">2</ref> with their respective quality measures. The confidence boost was calculated using all 250 rules.</p><p>rule supp lift β
r 1 = x 1 &gt; 6 ∧ x 4 &gt; 1.7 0.25 3.00 1.00
r 2 = x 4 &gt; 2 0.19 3.00 1.00
r 3 = x 3 &gt; 5.5 0.17 3.00 1.00
r 4 = x 2 &lt; 2.8 ∧ x 4 &gt; 1.6 0.06 3.00 1.00
r 5 = x 1 &gt; 7.3 0.05 3.00 1.00
r 6 = x 2 &lt; 2.2 0.03 0.75 0.25</p><p>Table <ref type="table">2</ref>: Aggregated rules with their quality measures for the iris dataset, sorted by their respective supports.</p><p>Figure <ref type="figure">2</ref>: The iris dataset with r 1 plotted as a filled rectangle, and rules r 2 and r 5 as half-planes delimited by solid lines.</p><p>The main problem with all those rules is that almost every one of them can sufficiently separate anomalies from normal samples. None of the presented measures could help in selecting the most informative, yet as small as possible, set of rules, because the presented rules are seen as informationally equivalent by all quality measures. This does not say much about the differences between the quality measures, but it justifies the rule extraction process, because all generated rules have high scores.</p><p>Figure <ref type="figure">2</ref> shows the iris dataset with rules r 1 , r 2 and r 5 .</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.3">Letter Recognition</head><p>This dataset was created as a classification problem with 26 classes, one class for each letter of the English alphabet. The characters were obtained from 20 different fonts and randomly distorted to produce 20,000 unique samples, presented as 16-dimensional numerical vectors. The letter X was selected as the anomaly class. SRF produced more than 15,000 rules, which were reduced by aggregation to 1955. The aggregated rules with support higher than 0.10 are presented in Table <ref type="table">3</ref>. The ranking of those rules is plotted in Figure <ref type="figure" target="#fig_2">3</ref>. It is evident that the rankings given by lift and confidence boost differ substantially. It is nearly impossible to evaluate all rules. Therefore, we have selected only those with a confidence boost higher than one (202 rules) and those with a lift higher than one (446 rules). Confidence boost selected the smaller rule set, in which almost all rules looked plausible. On the other hand, it missed some really interesting rules rated most highly by lift. Confidence boost tends to choose shorter, more similar rules, whereas lift prefers richer and more heterogeneous rules. Therefore, from our point of view, the best selection strategy is choosing the top k rules according to the lift ranking. The top 10 rules chosen from the whole set by lift and by confidence boost, regardless of their support, are in Table <ref type="table" target="#tab_0">4</ref>.</p><p>Still, there are too many rules to draw conclusions. In our future work we are going to investigate more measures of interestingness and novelty, which will hopefully help us to reduce the number of extracted rules even more.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7">Conclusion</head><p>In this paper, we presented a novel approach for explaining the output of an arbitrary anomaly detector using sapling random forests. The explanation is given as conjunctions of atomic conditions, which can be viewed as antecedents of association rules. Due to the extraction method, the individual rules are short and comprehensible. The main drawback was that the rule sets for the bigger dataset were large and redundant. Therefore, we applied multiple quality measures to evaluate them and to select the rules with the desired properties. The performed experiments showed that none of the presented measures fully reflects our expectations. Of the considered measures, lift looks the most promising. But this paper is work in progress, and we do not view this observation as a final conclusion.</p><p>For our future work, we would like to have a measure that rates the novelty of a rule with respect to the set of previously selected rules. The first idea is to choose those rules that describe anomalies not covered by the already selected rules. The second idea is to select rules which may describe already covered anomalies, but using a completely different set of features. The last thing we would like to work on is finding a way of concatenating mined rules to make smaller yet precise rule sets.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Three layer donut dataset with plotted rules r 1 − r 8 . Rules r 1 − r 4 are plotted as filled squares and rules r 5 − r 8 as half-planes delimited by solid lines.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: Ranking (a higher number means a better ranking) of the rules from Table 3 by their lift and confidence boost.

rule supp lift β
x 2 &gt; 14 0.29 1.44 0.29
x 11 &lt; 0.62 0.24 0.27 1.82
x 9 &gt; 8.8 ∧ x 10 &lt; 5.2 0.21 1.67 0.96
x 3 &gt; 10 0.18 0.89 0.35
x 1 &lt; 0.38 0.17 0.19 0.14
x 13 &gt; 6.2 ∧ x 15 &gt; 8.6 0.16 0.21 0.17
x 6 &gt; 9.8 ∧ x 8 &lt; 1.2 0.15 2.39 1.91
x 1 &gt; 9.5 ∧ x 2 &gt; 14 0.15 1.10 0.22
x 2 &gt; 13 ∧ x 10 &gt; 12 0.15 0.44 0.04
x 4 &gt; 7.8 ∧ x 8 &lt; 1.2 ∧ x 9 &gt; 7.2 0.15 5.75 0.71
x 8 &lt; 1.2 ∧ x 16 &lt; 5.2 0.14 1.62 0.39
x 2 &gt; 11 ∧ x 4 &gt; 7.8 ∧ x 9 &gt; 6.4 0.13 5.32 0.88
x 8 &lt; 1.2 ∧ x 9 &gt; 8.8 0.13 8.06 1.80
x 9 &gt; 8 ∧ x 10 &lt; 5.2 ∧ x 15 &gt; 6.8 0.13 2.20 1.31
x 1 &gt; 9.5 ∧ x 3 &gt; 9 0.12 1.04 0.06
x 9 &gt; 8.8 ∧ x 12 &lt; 5.5 0.12 0.78 0.22
x 3 &gt; 6.1 ∧ x 14 &lt; 6.2 ∧ x 15 &gt; 7.2 0.12 2.91 1.09
x 1 &gt; 4.2 ∧ x 8 &lt; 1.2 ∧ x 12 &lt; 6.8 0.12 0.55 0.05
x 1 &gt; 8.5 ∧ x 10 &gt; 12 0.12 0.84 0.10
x 1 &gt; 5.5 ∧ x 7 &gt; 7.8 ∧ x 8 &lt; 1.2 0.11 3.39 0.22
x 2 &lt; 1.5 ∧ x 9 &gt; 7.5 ∧ x 15 &gt; 5.5 0.11 3.43 0.84
x 1 &gt; 7.8 ∧ x 6 &gt; 8.8 ∧ x 7 &gt; 6 0.11 0.29 0.02
x 6 &gt; 8.2 ∧ x 15 &gt; 7.9 ∧ x 16 &lt; 6.2 0.11 0.29 0.24
x 2 &gt; 11 ∧ x 3 &gt; 7.6 ∧ x 10 &lt; 5.2 0.11 1.21 0.05
x 3 &gt; 10 ∧ x 5 &gt; 6.2 0.11 0.61 0.24

Table 3: Rules extracted from Letter recognition by SRF with support higher than 0.10, with their quality measures, sorted by their respective supports.</figDesc><graphic coords="5,44.63,216.38,242.95,182.26" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 4 :</head><label>4</label><figDesc>Table 4: Comparison of the top 10 rules extracted from the Letter recognition dataset by SRF, selected by lift and by confidence boost.

top 10 lift rules | top 10 β rules
x 8 &lt; 1.8 ∧ x 15 &gt; 7.8 | x 4 &lt; 1.2 ∧ x 9 &gt; 8
x 2 &lt; 1.8 ∧ x 3 &gt; 4.5 ∧ x 9 &gt; 8.8 | x 2 &lt; 1.5 ∧ x 9 &gt; 8.8
x 5 &gt; 6.8 ∧ x 8 &lt; 1.8 ∧ x 15 &gt; 7 | x 8 &lt; 1.2 ∧ x 9 &gt; 8.8
x 5 &gt; 5 ∧ x 9 &gt; 8.2 ∧ x 10 &lt; 5.2 | x 9 &gt; 8 ∧ x 14 &lt; 5.2
x 4 &lt; 2.2 ∧ x 6 &gt; 8.2 ∧ x 9 &gt; 7.8 | x 11 &gt; 12 ∧ x 16 &lt; 4.5
x 2 &lt; 5.6 ∧ x 3 &gt; 7.2 ∧ x 8 &lt; 1.2 | x 6 &gt; 9.8 ∧ x 8 &lt; 1.2
x 1 &gt; 4.5 ∧ x 8 &lt; 1.8 ∧ x 15 &gt; 7.8 | x 4 &gt; 8.8 ∧ x 14 &lt; 5.2
x 7 &lt; 5.8 ∧ x 8 &lt; 2.5 ∧ x 15 &gt; 7.8 | x 5 &gt; 8.5 ∧ x 14 &lt; 5.2
x 2 &lt; 4.8 ∧ x 9 &gt; 8 ∧ x 11 &gt; 9.8 | x 8 &lt; 1.2 ∧ x 16 &gt; 9.9
x 11 &gt; 9 ∧ x 14 &gt; 13 | x 12 &gt; 10 ∧ x 16 &lt; 5.2</figDesc><table /></figure>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgement</head><p>The research reported in this paper has been supported by the Czech Science Foundation (GA ČR) grant 13-17187S.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<title level="m" type="main">Outlier analysis</title>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">C</forename><surname>Aggarwal</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2013">2013</date>
			<publisher>Springer</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Mining association rules between sets of items in large databases</title>
		<author>
			<persName><forename type="first">R</forename><surname>Agrawal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Imieliński</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Swami</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM SIGMOD Record</title>
		<imprint>
			<biblScope unit="volume">22</biblScope>
			<biblScope unit="page" from="207" to="216" />
			<date type="published" when="1993">1993</date>
			<publisher>ACM</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Formal and computational properties of the confidence boost of association rules</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">L</forename><surname>Balcázar</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Transactions on Knowledge Discovery from Data (TKDD)</title>
		<imprint>
			<biblScope unit="volume">7</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page">19</biblScope>
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Dynamic itemset counting and implication rules for market basket data</title>
		<author>
			<persName><forename type="first">S</forename><surname>Brin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Motwani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">D</forename><surname>Ullman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Tsur</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings ACM SIGMOD International Conference on Management of Data</title>
				<meeting>ACM SIGMOD International Conference on Management of Data<address><addrLine>Tucson, Arizona, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="1997-05">May 1997</date>
			<biblScope unit="page" from="255" to="264" />
		</imprint>
	</monogr>
	<note>SIGMOD 1997</note>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Anomaly detection: a survey</title>
		<author>
			<persName><forename type="first">V</forename><surname>Chandola</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Banerjee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Kumar</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Computing Surveys (CSUR)</title>
		<imprint>
			<biblScope unit="volume">41</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page">15</biblScope>
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Quality and complexity measures for data linkage and deduplication</title>
		<author>
			<persName><forename type="first">P</forename><surname>Christen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Goiser</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Quality Measures in Data Mining</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2007">2007</date>
			<biblScope unit="page" from="127" to="151" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Local outlier detection with interpretation</title>
		<author>
			<persName><forename type="first">X. -H</forename><surname>Dang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Micenková</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Assent</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">T</forename><surname>Ng</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD 2013)</title>
				<imprint>
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Finding local anomalies in very high dimensional space</title>
		<author>
			<persName><forename type="first">T</forename><surname>De Vries</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Chawla</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">E</forename><surname>Houle</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE 10th International Conference on Data Mining (ICDM 2010)</title>
				<imprint>
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">An approach to spacecraft anomaly detection problem using kernel feature space</title>
		<author>
			<persName><forename type="first">R</forename><surname>Fujimaki</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Yairi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Machida</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery in Data Mining</title>
				<meeting>the Eleventh ACM SIGKDD International Conference on Knowledge Discovery in Data Mining</meeting>
		<imprint>
			<date type="published" when="2005">2005</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Anomaly-based network intrusion detection: techniques, systems and challenges</title>
		<author>
			<persName><forename type="first">P</forename><surname>Garcia-Teodoro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Diaz-Verdejo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Maciá-Fernández</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Vázquez</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Computers &amp; Security</title>
		<imprint>
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<title level="m" type="main">Identification of outliers</title>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">M</forename><surname>Hawkins</surname></persName>
		</author>
		<imprint>
			<date type="published" when="1980">1980</date>
			<publisher>Springer</publisher>
			<biblScope unit="volume">11</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">A study on interestingness measures for associative classifiers</title>
		<author>
			<persName><forename type="first">M</forename><surname>Jalali-Heravi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><forename type="middle">R</forename><surname>Zaïane</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2010 ACM Symposium on Applied Computing</title>
				<meeting>the 2010 ACM Symposium on Applied Computing</meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2010">2010</date>
			<biblScope unit="page" from="1039" to="1046" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<monogr>
		<title level="m" type="main">Big data brings marketing big numbers</title>
		<author>
			<persName><forename type="first">D</forename><surname>Karr</surname></persName>
		</author>
		<ptr target="https://www.marketingtechblog.com/ibm-big-data-marketing/" />
		<imprint>
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
	<note>Accessed 19 June 2015</note>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Algorithms for mining distance-based outliers in large datasets</title>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">M</forename><surname>Knorr</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">T</forename><surname>Ng</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the International Conference on Very Large Data Bases</title>
				<meeting>the International Conference on Very Large Data Bases</meeting>
		<imprint>
			<date type="published" when="1998">1998</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Finding intensional knowledge of distance-based outliers</title>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">M</forename><surname>Knorr</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">T</forename><surname>Ng</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">VLDB</title>
				<imprint>
			<date type="published" when="1999">1999</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Interpreting and clustering outliers with sapling random forests</title>
		<author>
			<persName><forename type="first">M</forename><surname>Kopp</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Pevný</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Holeňa</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Information Technologies - Applications and Theory Workshops, Posters, and Tutorials</title>
				<meeting><address><addrLine>ITAT</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<monogr>
		<title level="m" type="main">UCI machine learning repository</title>
		<author>
			<persName><forename type="first">M</forename><surname>Lichman</surname></persName>
		</author>
		<ptr target="http://archive.ics.uci.edu/ml/" />
		<imprint>
			<date type="published" when="2013">2013</date>
		</imprint>
		<respStmt>
			<orgName>University of California, Irvine, School of Information and Computer Sciences</orgName>
		</respStmt>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Using general impressions to analyze discovered classification rules</title>
		<author>
			<persName><forename type="first">B</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Hsu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Chen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">KDD</title>
		<imprint>
			<biblScope unit="page" from="31" to="36" />
			<date type="published" when="1997">1997</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Isolation forest</title>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">T</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">M</forename><surname>Ting</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z. -H</forename><surname>Zhou</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Eighth IEEE International Conference on Data Mining (ICDM 2008)</title>
				<imprint>
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Explaining outliers by subspace separability</title>
		<author>
			<persName><forename type="first">B</forename><surname>Micenková</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">T</forename><surname>Ng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X. -H</forename><surname>Dang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Assent</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE 13th International Conference on Data Mining (ICDM 2013)</title>
				<imprint>
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Explaining anomalies with sapling random forests</title>
		<author>
			<persName><forename type="first">T</forename><surname>Pevný</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Kopp</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Information Technologies - Applications and Theory Workshops, Posters, and Tutorials</title>
				<meeting><address><addrLine>ITAT</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">A survey on outlier detection in financial transactions</title>
		<author>
			<persName><forename type="first">K</forename><surname>Pradnya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">K</forename><surname>Khanuja</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">International Journal of Computer Applications</title>
		<imprint>
			<biblScope unit="volume">108</biblScope>
			<biblScope unit="issue">17</biblScope>
			<biblScope unit="page" from="23" to="25" />
			<date type="published" when="2014-12">December 2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<monogr>
		<title level="m" type="main">Robust regression and outlier detection</title>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">J</forename><surname>Rousseeuw</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">M</forename><surname>Leroy</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2005">2005</date>
			<publisher>John Wiley &amp; Sons</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Outlier sums for differential gene expression analysis</title>
		<author>
			<persName><forename type="first">R</forename><surname>Tibshirani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Hastie</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Biostatistics</title>
		<imprint>
			<date type="published" when="2007">2007</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
