=Paper=
{{Paper
|id=Vol-1297/020-25_paper-5
|storemode=property
|title=Методы выявления аномалий: обзор
(Methods for Anomaly Detection: a Survey)
|pdfUrl=https://ceur-ws.org/Vol-1297/020-25_paper-5.pdf
|volume=Vol-1297
|dblpUrl=https://dblp.org/rec/conf/rcdl/KalinichenkoST14
}}
==Методы выявления аномалий: обзор (Methods for Anomaly Detection: a Survey)==
Methods for Anomaly Detection: a Survey
© Leonid Kalinichenko © Ivan Shanin © Ilia Taraban
Institute of Informatics Problems of RAS
Moscow
leonidandk@gmail.com ivan_shanin@mail.ru tarabanil@gmail.com
Proceedings of the 16th All-Russian Conference "Digital Libraries: Advanced Methods and Technologies, Digital Collections" ― RCDL-2014, Dubna, Russia, October 13–16, 2014.

Abstract

In this article we review different approaches to anomaly detection problems, their applications and specific features. We classify the methods according to the data specificity and discuss their applicability in different cases.

1 Introduction

Anomalies (or outliers, deviant objects, exceptions, rare events, peculiar objects) are an important concept of data analysis. A data object is considered to be an outlier if it deviates significantly from the regular pattern of the common data behaviour in a specific domain. Generally this means that the data object is "dissimilar" to the other observations in the dataset. It is very important to detect such objects during data analysis in order to treat them differently from the rest of the data. For instance, anomaly detection methods are widely used for the following purposes:

• Credit card (and mobile phone) fraud detection [1, 2];
• Suspicious Web site detection [3];
• Whole-genome DNA matching [4, 5];
• ECG-signal filtering [6];
• Suspicious transaction detection [7];
• Analysis of digital sky surveys [8, 9].

The anomaly detection problem has become a recognized, rapidly developing topic of data analysis, and many surveys and studies are devoted to it [1, 3, 4, 5, 10, 11]. The main purpose of this review is to reveal the specific features of widely known statistical and machine learning methods that are used to detect anomalies. All considered methods are categorized by the data form they are applied to.

The paper is organized as follows. In Section 2 we introduce three generic data representations that are most commonly used in anomaly detection problems: Metric Data, Evolving Data and Multistructured Data. In Sections 3, 4 and 5 these data forms are discussed in detail; each form is related to a certain class of problems and appropriate methods, which are presented together with application examples. In Section 6 we discuss the specific features of the anomaly detection problem that have a strong impact on the methods used in this area. Section 7 contains the conclusions of this review.

2 Data forms

The precise definition of an outlier depends on the specific problem and its data representation. In this survey we establish a correspondence between concrete data representation forms and suitable anomaly detection methods. We assume that the data are usually presented in one of three forms: Metric Data, Evolving Data and Multistructured Data. Metric Data are the most common form of data representation, in which every object in a dataset has a certain set of attributes that allows operating with the notions of "distance" and "proximity". Evolving Data are presented as well-studied objects: Discrete Sequences, Time Series and Multidimensional Data Streams. The third form is Multistructured Data; under this term we understand data presented in unstructured, semi-structured or structured form. This data form may not have a rigid structure, and yet it can contain various data dependencies. The most usual task with this type of data is to extract attributes that would allow using the metric data oriented methods of outlier analysis. In our survey Multistructured Data are specialized as Graph Data or Text Data.

3 Metric Data Oriented Methods

In this section we consider methods that use the concept of "metric" data: the distance between objects, the correlation between them, and the distribution of the data. We assume that in this case the data represent objects in a space, so-called points. The task is then to determine regular and irregular points, depending on the specific metric distance between objects in the space, or on the correlation, or on the spatial distribution of the points. Here we consider a structured data type, i.e., objects that do not depend on time (time series are discussed in Section 4). The metric data form is the most widely used one, mainly because almost any entity can be represented as a structured object, a set of attributes, and thus as a point in a particular space [12]. These methods are therefore used in various applications, e.g., in medicine and astronomy.
We subdivide the methods into those based on the notion of distance, those based on correlations, those based on data distributions, and finally those intended for high-dimensional data and data with categorical attributes. We now turn to a more detailed review of these types of methods.

3.1 Distance-Based Data

The basic set of methods that use the notion of distance includes clustering methods, k-nearest neighbors and their derivatives. Clustering methods use the distance defined in the space to separate the data into homogeneous and dense groups (clusters). If a point is not included in any large cluster, it is classified as an anomaly. We can also assume that small clusters may consist of anomalous objects, because anomalies may themselves have a similar structure, i.e., be clustered. The k-nearest neighbors method [13] is based on the concept of proximity: we consider the k nearest points and, on the basis of certain rules, decide whether the object is abnormal or not. A simple example of such a rule uses the distance between objects: the farther an object is from its neighbors, the more likely it is to be abnormal. There are various kinds of rules, ranging from distance-based ones to rules based on the distribution of the neighbors. For example, LOF (Local Outlier Factor) [14] is based on the density of objects in a neighborhood. Examples of clustering methods for anomaly detection in astronomy can be found in [15, 16, 17]. Besides classic clustering methods, many machine learning techniques can be used, e.g., modified neural network methods such as SOM (Self-Organizing Maps) [18, 19].

As an example, consider [27]. The authors propose their own clustering algorithm that also classifies anomalies. The main task in this case is to find erroneous values and interesting events in sensor data. Using the Intel Berkeley Research lab dataset (2.3 million readings from 54 sensors) and a synthetic dataset, their algorithm reached a detection rate of 100% with false alarm rates of 0.10% and 0.09%, respectively. These experimental results show that their approach can detect dangerous events (such as forest fire, air pollution, etc.) as well as erroneous or noisy data.
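To make the distance-based idea concrete, here is a minimal sketch (our own illustration with toy data, not the algorithm of [27]): every point is scored by the distance to its k-th nearest neighbor, and the points with the largest scores are treated as outlier candidates.

<pre>
import numpy as np

def knn_outlier_scores(X, k=5):
    """Score each point by the distance to its k-th nearest neighbor.

    Larger scores mean the point lies farther from its neighbors and is
    therefore a more likely outlier.
    """
    diffs = X[:, None, :] - X[None, :, :]          # pairwise differences
    dists = np.sqrt((diffs ** 2).sum(axis=-1))     # Euclidean distance matrix
    np.fill_diagonal(dists, np.inf)                # ignore distance to itself
    return np.sort(dists, axis=1)[:, k - 1]        # k-th smallest distance per point

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, size=(200, 2)),    # dense "normal" cluster
               np.array([[8.0, 8.0]])])            # one isolated point
scores = knn_outlier_scores(X, k=5)
print("most anomalous point:", X[np.argmax(scores)])
</pre>

LOF [14] refines this kind of score by comparing the local density around a point with the densities around its neighbors, which makes it less sensitive to clusters of different density.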
3.2 Correlated Dimension Data

The idea of these methods is based on the concept of correlation between data attributes. This situation is often found in real data, because different attributes can be generated by the same processes, and this effect allows the use of linear models and methods based on them. A simple example of such a method is linear regression: we fit a plane that describes the data and then pick as anomalous the objects that lie far away from this plane. PCA (Principal Component Analysis) [21] is also often used, aiming at reducing the dimensionality of the data; for this reason PCA is sometimes applied at the preprocessing stage, as in [15]. But it can also be used directly to separate anomalies: the basic idea is that in the new dimensions it is easier to distinguish normal objects from abnormal ones [22].
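A minimal sketch of the PCA-based variant (our own illustration with synthetic data, not the method of [22]): project the data onto the first principal components and use the reconstruction error as an outlier score.

<pre>
import numpy as np

def pca_reconstruction_scores(X, n_components=1):
    """Outlier score = squared error of reconstructing each point
    from its projection onto the top principal components."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)   # principal directions
    V = Vt[:n_components]                                # (n_components, d)
    reconstruction = Xc @ V.T @ V                        # project and map back
    return ((Xc - reconstruction) ** 2).sum(axis=1)

rng = np.random.default_rng(1)
t = rng.normal(size=300)
X = np.column_stack([t, 2 * t + rng.normal(scale=0.1, size=300)])  # almost linear data
X = np.vstack([X, [[0.0, 5.0]]])                         # a point far from the fitted line
scores = pca_reconstruction_scores(X, n_components=1)
print("most anomalous point:", X[np.argmax(scores)])
</pre>

In the same spirit, the residuals of a fitted linear regression can serve as outlier scores when one attribute is predicted from the others.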
3.3 Probabilistically Distributed Data

In probabilistic methods, the main approach is to assume that the data satisfy some distribution law. Anomalous objects can then be defined as objects that do not satisfy this basic rule. A classic example of such methods is EM [23, 24], an iterative algorithm based on the maximum likelihood method. Each iteration consists of an expectation step and a maximization step: the expectation step computes the likelihood function, and the maximization step finds the parameters that maximize it. There are also methods based on the statistics of the data distribution; these include the tail analysis of distributions (e.g., the normal distribution) and the use of the Markov, Chebyshev and Chernoff inequalities.

An example of finding anomalies in sensors of rotating machinery is considered in [20]. In this task, rolling element bearing failures are treated as anomalies; in practice such failures are one of the foremost causes of breakdowns in rotating mechanical systems. In contrast to other, SVM-based approaches, the authors fit a Gaussian distribution: after estimating the parameters of the distribution and choosing a threshold, the anomalies are found. For testing they use vibration data from the NSF I/UCR Center for Intelligent Maintenance Systems (IMS, www.imscenter.net) and reach 97% accuracy.

Other examples of applications of these methods can be found in [25, 26].
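A minimal sketch of the tail-analysis idea described above (our own illustration with synthetic data, not the procedure of [20]): fit a Gaussian to one-dimensional data and flag the observations that fall outside a chosen number of standard deviations.

<pre>
import numpy as np

def gaussian_tail_outliers(x, n_sigma=3.0):
    """Fit a univariate Gaussian and flag points lying in its tails.

    Returns a boolean mask: True where |x - mean| exceeds n_sigma
    standard deviations (the usual "three-sigma" rule).
    """
    mu, sigma = x.mean(), x.std(ddof=1)
    z = np.abs(x - mu) / sigma
    return z > n_sigma

rng = np.random.default_rng(2)
x = np.concatenate([rng.normal(0.0, 1.0, 1000),    # "normal" observations
                    np.array([7.5, -6.0])])         # two injected anomalies
mask = gaussian_tail_outliers(x, n_sigma=3.0)
print("flagged values:", x[mask])
</pre>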
3.4 Categorical Data

Most of the anomaly detection methods considered so far operate with continuous data; thus, one approach is to translate categorical attributes into continuous ones. For example, categorical data can be represented as a set of binary attributes. Certainly, this kind of transformation may increase the dimensionality of the data, but that problem can be addressed with dimensionality reduction methods. Different probabilistic approaches can also be used for processing categorical data. These approaches are not the only ones that can work with categorical data: some methods may be partially modified for categorical data types, e.g., the notions of distance and proximity can be extended to categorical attributes.
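A minimal sketch of the binary encoding mentioned above (function names and toy data are ours): a categorical attribute is turned into a set of 0/1 attributes, after which the metric methods of this section become applicable.

<pre>
import numpy as np

def one_hot(values):
    """Encode a list of categorical values as a matrix of binary attributes."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    encoded = np.zeros((len(values), len(categories)), dtype=int)
    for row, v in enumerate(values):
        encoded[row, index[v]] = 1
    return categories, encoded

colors = ["red", "green", "red", "blue", "green"]
categories, X = one_hot(colors)
print(categories)   # ['blue', 'green', 'red']
print(X)            # one binary column per category
</pre>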
3.5 High-Dimensional Data

In various applications the problem of a large number of attributes often arises. It brings extra, irrelevant attributes, makes the usual notions of distance between objects less meaningful and complicates the methods. For example, correlated-dimension methods work much worse with a large number of attributes. The main way of addressing these problems is the search for subspaces of attributes. We have already mentioned PCA, which is most commonly used for this task. But when selecting a small number of attributes, other problems are encountered: by changing the number of attributes we lose information, and, because of the small samples of anomalies or the emergence of new types of anomalies, previously "anomalous" attributes can be lost.
A more subtle approach to this problem is the Sparse Cube method [28]. This technique is based on the analysis of the density distributions of projections of the data: a grid discretization is performed (the data form a sparse hypercube at this point) and an evolutionary algorithm is employed to find an appropriate lower-dimensional subspace.

Many applications are confronted with the problem of high dimensionality; [29] can be taken as an example. Here the authors searched for images characterized by low quality, low illumination intensity or some collisions. They compare the PCA-based approach with the proposed one, which is based on random projections: after the projection, LOF works with a neighborhood taken from the source space. Both approaches show good results, but the second one is much faster at large dimensions than PCA with LOF.
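A minimal sketch of this idea (our own simplified illustration, not the algorithm of [29]): randomly project the high-dimensional data into a low-dimensional space and compute a neighborhood-based outlier score there.

<pre>
import numpy as np

def random_projection_scores(X, target_dim=5, k=10, seed=0):
    """Project X to a random low-dimensional subspace and score each
    point by the distance to its k-th nearest neighbor there."""
    rng = np.random.default_rng(seed)
    R = rng.normal(size=(X.shape[1], target_dim)) / np.sqrt(target_dim)
    Z = X @ R                                      # random projection
    diffs = Z[:, None, :] - Z[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    np.fill_diagonal(dists, np.inf)
    return np.sort(dists, axis=1)[:, k - 1]

rng = np.random.default_rng(3)
X = rng.normal(size=(500, 200))                    # 200 attributes
X[0] += 6.0                                        # shift one object far away
scores = random_projection_scores(X, target_dim=5, k=10)
print("index of most anomalous object:", int(np.argmax(scores)))
</pre>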
4 Evolving Data

It is very common that data are given in a temporal (or just consecutive) representation, usually because of the origin of the data. The temporal feature can be discrete or continuous, so the data can be presented as sequences or as time series. The methods that we review in this section can be applied to various common problems in medicine, economy, earth science, etc. We also review methods suitable for "on-line" outlier analysis in data streams.

4.1 Discrete Sequences Data

There are many problems that need outlier detection in discrete sequences (web log analysis, DNA analysis, etc. [3, 4]). There are several ways to define an outlier in data presented as a discrete sequence: we can analyze the values at specific positions or test whether the whole sequence is deviant. Three models are used to measure deviation in these problems: distance-based, frequency-based and the Hidden Markov Model [10]. In the survey [30] the methods are divided into three groups: sequence-based, contiguous subsequence-based and pattern-based. The first group includes kernel-based, window-based and Markovian techniques; the contiguous subsequence methods include window scoring and segmentation-based techniques; the pattern-based methods include substring matching, subsequence matching and permutation matching techniques [30].

In [32] the classic host-based anomaly intrusion detection problem is solved. The study is devoted to Windows Native API systems (a specific Windows NT API that is used mostly during system boot), while most other works consider UNIX-based systems. The authors analyse system calls in order to detect the abnormal behaviour that indicates an attack or intrusion. To solve this problem, they use a sliding window method to establish a database of "normal patterns"; then the SVM method is used for anomaly detection, and in addition several window-based features are used to construct a detection rule. The method was tested on real data from Win2K and WinXP systems (including logs of important system processes such as svchost, Lsass and Inetinfo) and showed good results. Another practical example is given in [31].
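A minimal sketch of the window-based idea (our own illustration with a made-up call trace, not the detector of [32]): slide a window over a symbol sequence, remember which windows occur in "normal" training traces, and score a test trace by the fraction of its windows that were never seen during training.

<pre>
from collections import Counter

def window_counts(sequence, w):
    """Count all contiguous windows of length w in a symbol sequence."""
    return Counter(tuple(sequence[i:i + w]) for i in range(len(sequence) - w + 1))

def score_sequence(train_counts, sequence, w):
    """Fraction of windows in `sequence` that never occur in the training data."""
    windows = [tuple(sequence[i:i + w]) for i in range(len(sequence) - w + 1)]
    unseen = sum(1 for win in windows if train_counts[win] == 0)
    return unseen / len(windows)

normal = "open read read write close " * 50
train_counts = window_counts(normal.split(), w=3)

ok_trace  = "open read read write close".split()
bad_trace = "open exec exec delete close".split()
print(score_sequence(train_counts, ok_trace, w=3))   # low score: all windows known
print(score_sequence(train_counts, bad_trace, w=3))  # high score: unseen windows
</pre>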
4.2 Time Series Data

If the data strongly depend on time, we face the need to predict the forthcoming data and to analyze the current trends. The most common way to define an outlier here is a surprising change of trend. The methods considered are based on the well-developed apparatus of time series analysis, including Kalman filtering, autoregressive modeling, the detection of unusual shapes with the Haar transform and various statistical techniques. Historically, the first approach to finding this sort of outliers used an idea from immunology [33].
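A minimal sketch of the prediction-residual idea common to these techniques (our own illustration with synthetic data, not a method from the cited works): predict each value from a moving average of the preceding window and flag the points whose prediction error is unusually large.

<pre>
import numpy as np

def moving_average_outliers(x, window=10, n_sigma=4.0):
    """Flag points whose deviation from the moving-average prediction
    is larger than n_sigma times the standard deviation of the residuals."""
    x = np.asarray(x, dtype=float)
    preds = np.convolve(x, np.ones(window) / window, mode="valid")[:-1]
    residuals = x[window:] - preds                 # prediction error at each step
    threshold = n_sigma * residuals.std(ddof=1)
    return np.where(np.abs(residuals) > threshold)[0] + window

rng = np.random.default_rng(4)
series = np.sin(np.linspace(0, 20, 400)) + rng.normal(scale=0.1, size=400)
series[250] += 3.0                                 # inject a sudden spike
print("anomalous time steps:", moving_average_outliers(series))
</pre>

A Kalman filter or an autoregressive model can play the same role as the moving average here, supplying the prediction whose residual is thresholded.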
5 Multistructured Data

Sometimes the data are presented in a more complex form than a numerical "attribute / value" table. In this case it is important to understand what an outlier is and to use an appropriate method of analysis. We will review two cases that need specific analysis: textual data (e.g., poll answers) and data presented as a graph (e.g., social network data).

5.1 Text Data

With the development of communications and the World Wide Web, and especially with the advent of social networks, interest in the analysis of texts on the Internet has greatly increased. Considering text analytics and anomaly detection together, several major tasks can be distinguished: searching for abnormal texts, such as spam detection, and searching for non-standard texts, i.e., novelty detection. When solving these problems, the main difficulty is to represent texts as metric data, so that the previously described methods can be used. A simple way is to use standard metrics for texts, such as tf-idf. Extraction of entities from texts is also widespread. Using natural language processing techniques such as LSA (Latent Semantic Analysis) [34], it is possible to group texts and to integrate this representation with the standard anomaly detection methods. Due to the large number of texts, the learning often has a supervised character.

The study in [36] is focused on spam detection. Using the tf-idf measure, the algorithm computes distances between messages and constructs a "normal" area from the training set; a threshold on this area then determines whether an e-mail is spam. LingSpam (2412 ham, 480 spam), SpamAssassin (4150 ham, 1896 spam) and TREC (7368 ham, 14937 spam) were selected as the experimental data sets. The spam detector shows high accuracy and a low false positive rate for each dataset.
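A minimal sketch of this representation step (our own illustration with made-up messages, not the detector of [36]): compute tf-idf vectors for a small corpus, build a "normal" centroid from training messages, and score a new message by its cosine distance to that centroid.

<pre>
import math
from collections import Counter

def tfidf_vectors(documents):
    """Very small tf-idf: each document is a list of tokens."""
    n = len(documents)
    df = Counter(term for doc in documents for term in set(doc))
    vocab = sorted(df)
    idf = {t: math.log(n / df[t]) + 1.0 for t in vocab}
    vectors = []
    for doc in documents:
        tf = Counter(doc)
        vectors.append([tf[t] / len(doc) * idf[t] for t in vocab])
    return vocab, vectors

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb + 1e-12)

normal_mail = [
    "meeting tomorrow at ten please confirm".split(),
    "please send the report before the meeting".split(),
    "lunch after the project meeting tomorrow".split(),
]
test_mail = "win money now click here to win money".split()

vocab, vecs = tfidf_vectors(normal_mail + [test_mail])
centroid = [sum(col) / len(normal_mail) for col in zip(*vecs[:-1])]
print("distance of test message to normal centroid:",
      round(cosine_distance(vecs[-1], centroid), 3))
</pre>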
5.2 Graph Data

In this section we review how the methods of data analysis depend on the graph structure.
The main difference is that the graph can be large and complex or, on the contrary, consist of many smaller and simpler graphs. The main problem here is to extract appropriate attributes from nodes, edges and subgraphs that allow applying the methods considered in Section 3. In the first case we review methods that extract numerical attributes from smaller graphs and treat them like data objects, using the algorithms from Section 3. In the case of a large and complex graph we may be interested in node outliers, linkage outliers and subgraph outliers. Methods that analyze node outliers usually extract attributes from the given node and its neighborhood, but in the case of linkage outlier detection the concept of an outlier itself becomes very complex [10, 3]. We will consider an edge to be an outlier if it connects nodes from different dense clusters of nodes. The most popular methods are based on random graph theory, matrix factorization and spectral analysis techniques [10]. Another problem in this area is to detect subgraphs with deviant behavior and to determine their structure and attribute extraction [37].

The concrete definition of an outlier node or edge can differ according to the specific problem. For example, in [38] several types of anomaly are considered: near-star, near-clique, heavy-vicinity and dominant edge. Anomalous subgraphs are often detected using the Minimal Description Length principle [39, 40, 41]. One of the most important applications today is Social Network Data; many popular modern techniques are used in this area: Bayesian models [42], Markov Random Fields and the Ising model [43], the EM algorithm [44] as well as LOF [45].

In [44] the authors apply anomaly detection methods to social networks. The social network contains information about its members and their meetings; the problem is to find an abnormal meeting and to measure its degree of abnormality. The specificity of the problem is that the number of meetings is very small compared to the number of members, which makes it challenging to use common statistical methods. To solve the problem the authors use the notion of a hypergraph: the vertices of the hypergraph are the members of the social network and the edges are the meetings of the members (each edge of a hypergraph connects some set of vertices together). The anomalies are detected through density estimation over a p-dimensional hypercube (the EM algorithm tunes a two-component mixture). The method is tested on synthetic data and shows a relatively low estimation error. It is also considered to be a scalable method, which makes it very valuable for large social networks.
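A minimal sketch of the attribute-extraction idea for node outliers (our own illustration on a toy graph, loosely inspired by the egonet-style features of [38] but not a reimplementation of it): compute simple neighborhood features for every node and flag nodes whose features deviate strongly from the rest.

<pre>
import numpy as np

def egonet_features(adjacency):
    """For each node: (degree, number of edges among its neighbors).

    `adjacency` is a dict: node -> set of neighbor nodes (undirected graph).
    """
    feats = {}
    for node, neigh in adjacency.items():
        degree = len(neigh)
        # Count edges inside the neighborhood (each counted twice, so halve).
        inner = sum(len(adjacency[u] & neigh) for u in neigh) // 2
        feats[node] = (degree, inner)
    return feats

def feature_outliers(feats, n_sigma=2.0):
    nodes = list(feats)
    X = np.array([feats[n] for n in nodes], dtype=float)
    z = np.abs(X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)
    return [n for n, row in zip(nodes, z) if row.max() > n_sigma]

# A small graph: a sparse chain plus one node connected to everyone (near-clique).
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)] + [(9, i) for i in range(6)]
adjacency = {}
for a, b in edges:
    adjacency.setdefault(a, set()).add(b)
    adjacency.setdefault(b, set()).add(a)

print("suspicious nodes:", feature_outliers(egonet_features(adjacency)))
</pre>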
6 Specific features of the anomaly detection methods compared to general machine learning and statistical methods

In this article we have shown how various data mining methods that re-use general machine learning and statistical algorithms can be applied to anomaly detection. The anomaly detection problem has its own specific features that make it possible to tune the appropriate general algorithms properly, turning them into more efficient ones.

Let us consider one of the basic concepts of machine learning, the classification problem. The anomaly detection problem can be considered as a classification problem; in that case the data are assumed to contain a class of anomalies. Most of the methods that solve classification problems assume that the data classes have some sort of inner predictable structure. But the only prediction that can be made about anomalies is that these objects do not resemble the non-outlier, "normal" data. In this case, modeling the outlier class in order to solve the anomaly detection problem can be senseless and unproductive. Instead, one should pay attention to the structure of the normal data and its laws of distribution.

The machine learning methods can be divided into three groups: supervised, semi-supervised and unsupervised. The first group is the most studied. It requires a labeled "training" dataset, and this is exactly the situation described above: the information about the outlier class is used to tune a model of it in order to predict its structure, which often has a very complex or random nature. The semi-supervised methods use information only about the "normal" class, so they are better suited to the anomaly detection problem, as are the unsupervised methods, which do not use any information besides the structure and configuration of the unlabeled data.

Another important specific feature of the anomaly detection problem is that abnormal objects are usually significantly rare (compared to the non-outlier objects). This makes it hard to construct a reliable training dataset for supervised methods. Moreover, if this effect is not present in the data, most of the known methods will suffer from high alarm rates [47, 48].
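A minimal sketch of the semi-supervised setting described above (our own illustration, under the simplifying assumption that the normal data are roughly Gaussian per attribute): both the model and the decision threshold are fitted on "normal" examples only.

<pre>
import numpy as np

def fit_normal_model(train_normal):
    """Estimate the per-feature mean and spread of the normal data only."""
    mu = train_normal.mean(axis=0)
    sigma = train_normal.std(axis=0, ddof=1)
    return mu, sigma

def anomaly_scores(X, mu, sigma):
    """Score = largest per-feature deviation, in units of the normal spread."""
    return (np.abs(X - mu) / sigma).max(axis=1)

rng = np.random.default_rng(5)
train_normal = rng.normal(0.0, 1.0, size=(1000, 3))       # only "normal" examples
mu, sigma = fit_normal_model(train_normal)
threshold = np.quantile(anomaly_scores(train_normal, mu, sigma), 0.99)

test = np.vstack([rng.normal(0.0, 1.0, size=(5, 3)),       # normal-looking points
                  [[6.0, 0.0, 0.0]]])                       # one abnormal point
print(anomaly_scores(test, mu, sigma) > threshold)
</pre>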
7 Conclusion

In this paper we introduced an approach to classifying different anomaly detection problems according to the way the data are presented. We reviewed different applications of outlier analysis in various cases. Finally, we summarized the specific features of the methods suitable for the outlier analysis problem. Our future plans include preparing a university master-level course focused on anomaly detection, as well as working on anomaly detection in various fields (e.g., finding peculiar objects in massive digital sky astronomy surveys).

References

[1] Chandola, V., Banerjee, A., & Kumar, V. (2009). Anomaly detection: A survey. ACM Computing Surveys, 41(3), 1–58. doi:10.1145/1541880.1541882
[2] Kou, Y., Lu, C., Sinvongwattana, S., & Huang, Y.-P. (2004). Survey of Fraud Detection Techniques, 749–754.
[3] Pan Y., Ding X. Anomaly based web phishing page detection // Computer Security Applications Conference, 2006 (ACSAC'06), 22nd Annual. IEEE, 2006, pp. 381–392.
[4] Tzeng, J.-Y., Byerley, W., Devlin, B., Roeder, K., & Wasserman, L. (2003). Outlier Detection and False Discovery Rates for Whole-Genome DNA Matching. Journal of the American Statistical Association, 98(461), 236–246. doi:10.1198/016214503388619256
[5] Wu, B. (2007). Cancer outlier differential gene expression detection. Biostatistics (Oxford, England), 8(3), 566–575. doi:10.1093/biostatistics/kxl029
[6] Lourenço A. et al. Outlier detection in non-intrusive ECG biometric system // Image Analysis and Recognition. Springer Berlin Heidelberg, 2013, pp. 43–52.
[7] Liu, F. T., Ting, K. M., & Zhou, Z.-H. (2012). Isolation-Based Anomaly Detection. ACM Transactions on Knowledge Discovery from Data, 6(1), 1–39. doi:10.1145/2133360.2133363
[8] Djorgovski, S. G., Brunner, R. J., Mahabal, A. A., & Odewahn, S. C. (2001). Exploration of Large Digital Sky Surveys. Observatory, 1–18.
[9] Djorgovski, S. G., Mahabal, A. A., Brunner, R. J., Gal, R. R., Castro, S., Observatory, P., Carvalho, R. R. De, et al. (2001a). Searches for Rare and New Types of Objects, 225, 52–63.
[10] Aggarwal, C. C. (2013). Outlier Analysis (introduction). doi:10.1007/978-1-4614-6396-2
[11] Chandola, V., Banerjee, A., & Kumar, V. (2009). Anomaly detection: A survey. ACM Computing Surveys, 41(3), 1–58. doi:10.1145/1541880.1541882
[12] Berti-Équille, L. (2009). Data Quality Mining: New Research Directions. Current.
[13] Stevens, K. N., Cover, T. M., & Hart, P. E. (1967). Nearest Neighbor Pattern Classification. IEEE Transactions on Information Theory, 13(1), 21–27.
[14] Breunig, M. M., Kriegel, H., Ng, R. T., & Sander, J. (2000). LOF: Identifying Density-Based Local Outliers, 1–12.
[15] Borne, K., & Vedachalam, A. (2010). Effective Outlier Detection in Science Data Streams. ReCALL, 1–15.
[16] Borne, K. (n.d.). Surprise Detection in Multivariate Astronomical Data.
[17] Henrion, M., Hand, D. J., Gandy, A., & Mortlock, D. J. (2013). CASOS: a Subspace Method for Anomaly Detection in High Dimensional Astronomical Databases. Statistical Analysis and Data Mining, 6(1), 1–89.
[18] Networks, K. (n.d.). Data Mining Self-Organizing Maps, 1–20.
[19] Manikantan Ramadas, Shawn Ostermann, Brett Tjaden (2003). Detecting Anomalous Network Traffic with Self-organizing Maps. Recent Advances in Intrusion Detection, Lecture Notes in Computer Science, Vol. 2820, 36–54.
[20] Purarjomandlangrudi A., Ghapanchi A. H., Esmalifalak M. A Data Mining Approach for Fault Diagnosis: An Application of Anomaly Detection Algorithm // Measurement. 2014.
[21] Abdi, H., & Williams, L. J. (2010). Principal component analysis. Wiley Interdisciplinary Reviews: Computational Statistics, 2(4), 433–459. doi:10.1002/wics.101
[22] Dutta H. et al. Distributed Top-K Outlier Detection from Astronomy Catalogs using the DEMAC System // SDM. 2007.
[23] Cansado, A., & Soto, A. (2008). Unsupervised Anomaly Detection in Large Databases Using Bayesian Networks. Network, 1–37.
[24] Zhu, X. (2007). CS838-1 Advanced NLP: The EM Algorithm K-means Clustering, (6), 1–6.
[25] Spence, C., Parra, L., & Sajda, P. (2001). Detection, Synthesis and Compression in Mammographic Image Analysis with a Hierarchical Image Probability Model, 3–10.
[26] Pelleg, D., & Moore, A. (n.d.). Active Learning for Anomaly and Rare-Category Detection.
[27] Fawzy A., Mokhtar H. M. O., Hegazy O. Outliers detection and classification in wireless sensor networks // Egyptian Informatics Journal. 2013. Vol. 14, No. 2, pp. 157–164.
[28] Aggarwal C. C., Philip S. Y. An effective and efficient algorithm for high-dimensional outlier detection // The VLDB Journal. 2005. Vol. 14, No. 2, pp. 211–221.
[29] De Vries, T., Chawla, S., & Houle, M. E. (2010). Finding Local Anomalies in Very High Dimensional Space. 2010 IEEE International Conference on Data Mining, 128–137. doi:10.1109/ICDM.2010.151
[30] Chandola V., Banerjee A., Kumar V. Anomaly detection for discrete sequences: A survey // IEEE Transactions on Knowledge and Data Engineering. 2012. Vol. 24, No. 5, pp. 823–839.
[31] Budalakoti S., Srivastava A. N., Otey M. E. Anomaly detection and diagnosis algorithms for discrete symbol sequences with applications to airline safety // IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews. 2009. Vol. 39, No. 1, pp. 101–113.
[32] Wang M., Zhang C., Yu J. Native API based windows anomaly intrusion detection method using SVM // Sensor Networks, Ubiquitous, and Trustworthy Computing, 2006 IEEE International Conference on. IEEE, 2006. Vol. 1, p. 6.
[33] Dasgupta D., Forrest S. Novelty detection in time series data using ideas from immunology // Proceedings of the International Conference on Intelligent Systems. 1996, pp. 82–87.
[34] Dumais, S. T. (2005). Latent Semantic Analysis. Annual Review of Information Science and Technology, 38, 188. doi:10.1002/aris.1440380105
[35] Allan, J., Papka, R., & Lavrenko, V. (1998). On-line New Event Detection and Tracking.
[36] Laorden C. et al. Study on the effectiveness of anomaly detection for spam filtering // Information Sciences. 2014. Vol. 277, pp. 421–444.
[37] Kil, H., Oh, S.-C., Elmacioglu, E., Nam, W., & Lee, D. (2009). Graph Theoretic Topological Analysis of Web Service Networks. World Wide Web, 12(3), 321–343. doi:10.1007/s11280-009-0064-6
[38] Akoglu L., McGlohon M., Faloutsos C. Oddball: Spotting anomalies in weighted graphs // Advances in Knowledge Discovery and Data Mining. Springer Berlin Heidelberg, 2010, pp. 410–421.
[39] Noble C. C., Cook D. J. Graph-based anomaly detection // Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2003, pp. 631–636.
[40] Eberle W., Holder L. Discovering structural anomalies in graph-based data // Data Mining Workshops, 2007 (ICDM Workshops 2007), Seventh IEEE International Conference on. IEEE, 2007, pp. 393–398.
[41] Chakrabarti D. AutoPart: Parameter-free graph partitioning and outlier detection // Knowledge Discovery in Databases: PKDD 2004. Springer Berlin Heidelberg, 2004, pp. 112–124.
[42] Heard N. A. et al. Bayesian anomaly detection methods for social networks // The Annals of Applied Statistics. 2010. Vol. 4, No. 2, pp. 645–662.
[43] Horn C., Willett R. Online anomaly detection with expert system feedback in social networks // Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on. IEEE, 2011, pp. 1936–1939.
[44] Silva J., Willett R. Detection of anomalous meetings in a social network // Information Sciences and Systems, 2008 (CISS 2008), 42nd Annual Conference on. IEEE, 2008, pp. 636–641.
[45] Bhuyan M., Bhattacharyya D., Kalita J. Network anomaly detection: methods, systems and tools. 2013.
[46] Portnoy L., Eskin E., Stolfo S. Intrusion Detection with Unlabeled Data Using Clustering (2001) // ACM Workshop on Data Mining Applied to Security (DMSA 01).
[47] Laorden C. et al. Study on the effectiveness of anomaly detection for spam filtering // Information Sciences. 2014. Vol. 277, pp. 421–444.
[48] Fawzy A., Mokhtar H. M. O., Hegazy O. Outliers detection and classification in wireless sensor networks // Egyptian Informatics Journal. 2013. Vol. 14, No. 2, pp. 157–164.
[49] Yu M. A nonparametric adaptive CUSUM method and its application in network anomaly detection // International Journal of Advancements in Computing Technology. 2012. Vol. 4, No. 1, pp. 280–288.
[50] Muniyandi A. P., Rajeswari R., Rajaram R. Network anomaly detection by cascading k-Means clustering and C4.5 decision tree algorithm // Procedia Engineering. 2012. Vol. 30, pp. 174–182.
[51] Muda Z. et al. A K-Means and Naive Bayes learning approach for better intrusion detection // Information Technology Journal. 2011. Vol. 10, No. 3, pp. 648–655.
[52] Kavuri V. C., Liu H. Hierarchical clustering method to improve transrectal ultrasound-guided diffuse optical tomography for prostate cancer imaging // Academic Radiology. 2014. Vol. 21, No. 2, pp. 250–262.
[53] Li S., Tung W. L., Ng W. K. A novelty detection machine and its application to bank failure prediction // Neurocomputing. 2014. Vol. 130, pp. 63–72.
[54] Cogranne R., Retraint F. Statistical detection of defects in radiographic images using an adaptive parametric model // Signal Processing. 2014. Vol. 96, pp. 173–189.
[55] Daneshpazouh A., Sami A. Entropy-Based Outlier Detection Using Semi-Supervised Approach with Few Positive Examples // Pattern Recognition Letters. 2014.
[56] Rahmani A. et al. Graph-based approach for outlier detection in sequential data and its application on stock market and weather data // Knowledge-Based Systems. 2014. Vol. 61, pp. 89–97.