Methods for Anomaly Detection: A Survey

© Leonid Kalinichenko   © Ivan Shanin   © Ilia Taraban
Institute of Informatics Problems of RAS, Moscow
leonidandk@gmail.com   ivan_shanin@mail.ru   tarabanil@gmail.com

Abstract

In this article we review different approaches to anomaly detection problems, their applications and specific features. We classify the methods according to the specificity of the data and discuss their applicability in different cases.

1 Introduction

Anomalies (also called outliers, deviant objects, exceptions, rare events, or peculiar objects) are an important concept of data analysis. A data object is considered an outlier if it deviates significantly from the regular pattern of common data behaviour in a specific domain. Generally this means that the object is "dissimilar" to the other observations in the dataset. It is very important to detect such objects during data analysis in order to treat them differently from the rest of the data. For instance, anomaly detection methods are widely used for the following purposes:

• credit card (and mobile phone) fraud detection [1, 2];
• suspicious Web site detection [3];
• whole-genome DNA matching [4, 5];
• ECG-signal filtering [6];
• suspicious transaction detection [7];
• analysis of digital sky surveys [8, 9].

Anomaly detection has become a recognized, rapidly developing topic of data analysis, and many surveys and studies are devoted to this problem [1, 3, 4, 5, 10, 11]. The main purpose of this review is to reveal the specific features of widely known statistical and machine learning methods that are used to detect anomalies. All considered methods are categorized by the form of data they are applied to.

The paper is organized as follows. In Section 2 we introduce three generic data representations that are most commonly used in anomaly detection problems: Metric Data, Evolving Data and Multistructured Data. In Sections 3, 4 and 5 these data forms are discussed in detail; each form is related to a certain class of problems and appropriate methods, which are presented together with application examples. In Section 6 we discuss specific features of the anomaly detection problem that strongly affect the methods used in this area. Section 7 contains the conclusions of this review.

2 Data forms

The precise definition of an outlier depends on the specific problem and its data representation. In this survey we establish a correspondence between concrete data representation forms and suitable anomaly detection methods. We assume that the data are usually presented in one of three forms: Metric Data, Evolving Data and Multistructured Data. Metric Data are the most common form of data representation: every object in a dataset has a certain set of attributes, which allows one to operate with notions of "distance" and "proximity". Evolving Data are presented as well-studied objects: Discrete Sequences, Time Series and Multidimensional Data Streams. The third form is Multistructured Data; under this term we understand data presented in unstructured, semi-structured or structured form. This data form may not have a rigid structure, yet it can contain various data dependencies. The most usual task with this type of data is to extract attributes that would allow using metric data oriented methods of outlier analysis. In our survey, Multistructured Data are specialized as Graph Data or Text Data.
3 Metric Data Oriented Methods

In this section we consider methods that use the concept of "metric" data: the distance between objects, the correlation between them, and the distribution of the data. We assume that in this case the data represent objects in a space, so-called points. The task is then to determine regular and irregular points, depending on the metric distance between objects in the space, the correlation, or the spatial distribution of the points. Here we consider a structured data type, i.e., objects that do not depend on time (time series are discussed in Section 4). The metric data form is the most widely used, mainly because almost any entity can be represented as a structured object, a set of attributes, and thus as a point in a particular space [12]. These methods are therefore used in various applications, e.g., in medicine and astronomy. We subdivide the methods into those based on the notion of distance, those based on correlations and data distributions, and finally those related to data with high dimension and categorical attributes. We now turn to a more detailed review of these types of methods.

Proceedings of the 16th All-Russian Conference "Digital Libraries: Advanced Methods and Technologies, Digital Collections" ― RCDL-2014, Dubna, Russia, October 13–16, 2014.

3.1 Distance-Based Data

The basic set of methods using the notion of distance includes clustering methods, k nearest neighbors and their derivatives. Clustering methods use the distance defined in the space to separate the data into homogeneous and dense groups (clusters). If a point is not included in any large cluster, it is classified as an anomaly. Small clusters may themselves consist of anomalous objects, because anomalies can also have a similar structure, i.e., be clustered. The k-nearest-neighbors method [13] is based on the concept of proximity: considering the k nearest points, a certain rule decides whether the object is abnormal or not. A simple example of such a rule uses the distance between objects: the farther an object lies from its neighbors, the more likely it is abnormal. There are various kinds of rules, from distance-based rules to rules based on the neighbor distribution. For example, LOF (Local Outlier Factor) [14] is based on the density of objects in a neighborhood. Examples of clustering methods for anomaly detection in astronomy can be found in [15, 16, 17]. Besides classic clustering methods, many machine learning techniques can be used, e.g., modified neural network methods such as SOM (Self-Organizing Maps) [18, 19].

As an example, consider [20]. The authors propose their own clustering algorithm that also classifies anomalies. The main task is to find erroneous values and interesting events in sensor data. Using the Intel Berkeley Research lab dataset (2.3 million readings from 54 sensors) and a synthetic dataset, their algorithm reached a detection rate of 100% with false alarm rates of 0.10% and 0.09% respectively. These experimental results show that their approach can detect dangerous events (such as forest fire, air pollution, etc.) as well as erroneous or noisy data.
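The simple distance-based rule above (the farther an object lies from its neighbors, the more likely it is abnormal) can be sketched in a few lines. This is an illustrative toy implementation, not the algorithm of [13] or [20]; the function name and the choice k = 2 are our own.

```python
import math

def knn_outlier_scores(points, k=2):
    """Score each point by its distance to its k-th nearest neighbor:
    the larger the score, the more isolated (and likely anomalous)
    the point is."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(
            math.dist(p, q) for j, q in enumerate(points) if j != i
        )
        scores.append(dists[k - 1])  # distance to the k-th nearest neighbor
    return scores

# A dense cluster near the origin plus one far-away point.
data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
scores = knn_outlier_scores(data, k=2)
top = max(range(len(data)), key=lambda i: scores[i])  # 4: the isolated point
```

A clustering-based variant would instead flag points that end up outside all large clusters; the neighbor-distance score here is the simplest member of the same family.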
3.2 Correlated Dimension Data

The idea of these methods is based on the concept of correlation between data attributes. This situation is often found in real data because different attributes can be generated by the same processes. This effect allows using linear models and methods based on them. A simple example is linear regression: we fit a plane that describes the data and pick as anomalous those objects that lie far away from this plane. PCA (Principal Component Analysis) [21] is also often used, aiming at reducing the dimensionality of the data; for this reason PCA is sometimes applied in data preprocessing, as in [15]. But it can also be used directly to separate anomalies: the basic idea is that in the new dimensions it is easier to distinguish normal objects from abnormal ones [22].

3.3 Probabilistically Distributed Data

The main approach of probabilistic methods is to assume that the data satisfy some distribution law. Anomalous objects can then be defined as objects that do not satisfy this basic rule. A classic example of these methods is the EM algorithm [23, 24], an iterative algorithm based on the maximum likelihood method. Each iteration consists of an expectation step and a maximization step: the expectation step computes the likelihood function, and the maximization step finds the parameters that maximize it. There are also methods based on statistics and data distributions. These include tail analysis of distributions (e.g., the normal distribution) and the use of the Markov, Chebyshev and Chernoff inequalities.

An example of finding anomalies in sensors of rotating machinery is considered in [27]. In this task, rolling element bearing failures are treated as anomalies; in practice, such frequent errors are one of the foremost causes of failures in rotating mechanical systems. In contrast to other, SVM-based approaches, the authors apply a Gaussian distribution: after choosing a threshold and estimating the distribution parameters, the anomalies are found. For testing they use vibration data from the NSF I/UCR Center for Intelligent Maintenance Systems (IMS, www.imscenter.net) and reach 97% accuracy.

Other examples of applications of these methods can be found in [25, 26].
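The tail-analysis idea above can be sketched minimally: estimate the mean and standard deviation of the data and flag values whose z-score exceeds a chosen threshold. The threshold of 2 standard deviations is an illustrative choice, not taken from the cited papers; note in the example how a single large outlier inflates the naive estimates, which is why robust estimators are often preferred in practice.

```python
import math

def gaussian_tail_outliers(values, threshold=2.0):
    """Fit a normal distribution to the data and flag values lying in
    its far tails (|z-score| above the threshold)."""
    mu = sum(values) / len(values)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / len(values))
    return [i for i, v in enumerate(values) if abs(v - mu) > threshold * sigma]

# Six ordinary sensor readings and one extreme value.
readings = [10.1, 9.9, 10.0, 10.2, 9.8, 10.1, 25.0]
outliers = gaussian_tail_outliers(readings)  # [6]: the reading 25.0
```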
3.4 Categorical Data

The appropriate anomaly detection methods operate with continuous data, so one approach is to translate categorical attributes into continuous ones. As an example, categorical data can be represented as a set of binary attributes. Certainly this kind of transformation may increase the dimension of the data, but this problem can be solved with methods of dimensionality reduction. Different probabilistic approaches can also be used for processing categorical data. Clearly, these approaches are not the only ones that can work with categorical data: some methods may be partially modified for categorical data types, since the concepts of distance and proximity can be extended to categorical data.
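The translation of categorical attributes into binary ones mentioned above can be sketched with a toy encoder (names and data are illustrative):

```python
def one_hot_encode(records):
    """Translate categorical records into binary vectors: one column
    per (attribute position, observed value) pair."""
    columns = sorted({(i, v) for r in records for i, v in enumerate(r)})
    return [[1 if r[i] == v else 0 for (i, v) in columns] for r in records]

data = [("red", "small"), ("blue", "small"), ("red", "large")]
vectors = one_hot_encode(data)
# Columns: (0,'blue'), (0,'red'), (1,'large'), (1,'small'),
# so ("red", "small") becomes [0, 1, 0, 1].
```

The dimension grows with the number of distinct values, which is exactly the effect the text warns about, and why dimensionality reduction is often applied afterwards.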
3.5 High-Dimensional Data

In various applications the problem of a large number of attributes often arises. It implies redundant attributes, breakdown of the concept of distance between objects, and increased sophistication of the methods. For example, correlated dimension methods work much worse on a large number of attributes. The main way of addressing these problems is the search for subspaces of attributes. We mentioned earlier the PCA, which is most commonly used for this task. But when selecting a small number of attributes, other problems are encountered: by changing the number of attributes, we lose information. Because of the small samples of anomalies, or the emergence of new types of anomalies, previously "abnormal" attributes can be lost.

A more subtle approach to this problem is the Sparse Cube Method [28]. This technique is based on the analysis of the density distributions of projections of the data: a grid discretization is performed (the data form a sparse hypercube at this point) and an evolutionary algorithm is employed to find an appropriate lower-dimensional subspace.

Many applications are confronted with the problem of high dimension; take [29] as an example. The authors searched for images characterized by low quality, low illumination intensity or some collisions. They compare a PCA-based approach with the proposed one, which is based on random projections. After the projection, LOF works with a neighborhood taken from the source space. Both approaches show good results, but the second is much faster at large dimensions than PCA with LOF.
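Projecting high-dimensional points into a small random subspace, in the spirit of the random-projection approach of [29], can be sketched as follows. This is not their algorithm, only an illustration of the preprocessing step; a distance-based detector such as LOF would then be run on the projected points.

```python
import random

def random_projection(points, out_dim, seed=0):
    """Project points onto out_dim random Gaussian directions.
    Pairwise distances are approximately preserved, so a distance-based
    detector can work in the cheap low-dimensional space."""
    rng = random.Random(seed)
    in_dim = len(points[0])
    directions = [[rng.gauss(0, 1) for _ in range(in_dim)]
                  for _ in range(out_dim)]
    return [[sum(x * w for x, w in zip(p, d)) for d in directions]
            for p in points]

# Five 100-dimensional points projected down to 3 dimensions.
high_dim = [[float(i == j) for i in range(100)] for j in range(5)]
low_dim = random_projection(high_dim, out_dim=3)
```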
4 Evolving Data

It is very common that data are given in a temporal (or just consecutive) representation, usually due to the origin of the data. The temporal feature can be discrete or continuous, so the data can be presented as sequences or as time series. The methods reviewed in this section can be applied to various common problems in medicine, economy, earth science, etc. We also review methods suitable for "on-line" outlier analysis in data streams.

4.1 Discrete Sequences Data

Many problems require outlier detection in discrete sequences (web log analysis, DNA analysis, etc. [3, 4]). There are several ways to define an outlier in data presented as a discrete sequence: we can analyze values at specific positions, or test whether the whole sequence is deviant. Three models are used to measure deviation in these problems: distance-based, frequency-based and Hidden Markov Models [10]. In the survey [30] the methods are divided into three groups: sequence-based, contiguous subsequence-based and pattern-based. The first group includes kernel-based, window-based and Markovian techniques; contiguous subsequence methods include window scoring and segmentation-based techniques; pattern-based methods include substring matching, subsequence matching and permutation matching techniques [30].

In [32] the classic host-based anomaly intrusion detection problem is solved. The study is devoted to the Windows Native API (a specific Windows NT API used mostly during system boot), while most other works consider UNIX-based systems. The authors analyse system calls in order to detect abnormal behaviour that indicates an attack or intrusion. To solve this problem they use a sliding window method to establish a database of "normal patterns"; the SVM method is then used for anomaly detection, and in addition several window-based features are used to construct a detection rule. The method was tested on real data from Win2K and WinXP systems (including logs of important system processes such as svchost, Lsass, Inetinfo) and showed good results. Another practical example is given in [31].
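The sliding-window construction of a "normal patterns" database described above can be illustrated with a toy variant that replaces the SVM classifier of [32] with a simple set-membership test (function names and the window width are our own):

```python
def window_profile(sequence, w=3):
    """Slide a window over a training sequence and record every
    pattern seen, building a database of "normal patterns"."""
    return {tuple(sequence[i:i + w]) for i in range(len(sequence) - w + 1)}

def anomalous_windows(sequence, normal, w=3):
    """Return start positions of test windows absent from the profile."""
    return [i for i in range(len(sequence) - w + 1)
            if tuple(sequence[i:i + w]) not in normal]

normal = window_profile(list("abcabcabcabc"), w=3)
hits = anomalous_windows(list("abcabxabc"), normal, w=3)
# hits == [3, 4, 5]: the windows covering the unexpected symbol "x".
```

In a real system the window patterns (or features derived from them) would feed a trained classifier rather than an exact-match lookup.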
4.2 Time Series Data

If the data strongly depend on time, we face the need to predict the forthcoming data and analyze the current trends. The most common way to define an outlier is a surprising change of trend. The considered methods are based on the well-developed apparatus of time series analysis, including Kalman filtering, autoregressive modeling, detection of unusual shapes with the Haar transform, and various statistical techniques. Historically, the first approach to finding this sort of outliers used an idea from immunology [33].

5 Multistructured Data

Sometimes the data are presented in a more complex form than a numerical "attribute / value" table. In this case it is important to understand what an outlier is by using an appropriate method of analysis. We review two cases that need specific analysis: textual data (e.g., poll answers) and data presented as a graph (e.g., social network data).

5.1 Text Data

With the development of communications, the World Wide Web, and especially with the advent of social networks, interest in the analysis of texts on the Internet has greatly increased. Considering text analytics and anomaly detection, several major tasks can be distinguished: searching for abnormal texts, such as spam detection, and searching for non-standard texts, i.e., novelty detection. When solving these problems, the main difficulty is to represent texts as metric data, so that the previously described methods can be used. A simple way is to use standard metrics for texts, such as tf-idf. Extraction of entities from texts is also widespread. Using natural language processing techniques such as LSA (Latent Semantic Analysis) [34], it is possible to group texts, integrating this with standard anomaly detection methods. Due to the large number of texts, the learning often has a supervised character.

In [36] the study is focused on spam detection. Using the tf-idf measure, the algorithm computes distances between messages and constructs a "normal" area from a training set; the area's threshold then determines whether an email is spam. LingSpam (2412 ham, 480 spam), SpamAssassin (4150 ham, 1896 spam) and TREC (7368 ham, 14937 spam) were selected as experimental datasets. The spam detector shows high accuracy and a low false positive rate on each dataset.
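Representing texts in a metric space with tf-idf, as described above, can be sketched as follows. This is a toy weighting plus cosine similarity, not the detector of [36]; the documents are invented for illustration.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Turn tokenized documents into sparse tf-idf vectors."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))  # document frequency
    vecs = []
    for d in docs:
        tf = Counter(d)
        vecs.append({t: tf[t] / len(d) * math.log(n / df[t]) for t in tf})
    return vecs

def cosine(a, b):
    """Cosine similarity of two sparse vectors (dicts term -> weight)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

docs = [["meeting", "at", "noon"],
        ["meeting", "tomorrow", "at", "noon"],
        ["cheap", "pills", "online"]]
v = tfidf_vectors(docs)
# The third message shares no vocabulary with the "normal" ones,
# so its similarity to them is zero.
```

With such vectors, a distance to a "normal" region (e.g., the centroid of ham messages) plus a threshold yields a minimal spam detector of the kind the text outlines.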
5.2 Graph Data

In this section we review how the methods of data analysis depend on the graph structure. The main distinction is whether the graph is single, large and complex, or, on the contrary, the data consist of many smaller and simpler graphs. The main problem here is to extract appropriate attributes from nodes, edges and subgraphs that allow using the methods considered in Section 3. In the case of many smaller graphs, numerical attributes are extracted from each graph, and the graphs are then treated as data objects using algorithms from Section 3. In the case of a large and complex graph we may be interested in node outliers, linkage outliers and subgraph outliers. Methods that analyze node outliers usually extract attributes from the given node and its neighborhood, while in linkage outlier detection the concept of an outlier itself becomes very complex [10, 3]; we consider an edge to be an outlier if it connects nodes from different dense clusters of nodes. The most popular methods are based on random graph theory, matrix factorization and spectral analysis techniques [10]. Another problem in this area is to detect subgraphs with deviant behavior, which requires determining their structure and extracting attributes [37].

The concrete definition of an outlier node or edge can differ according to the specific problem. For example, in [38] several types of anomalies are considered: near-star, near-clique, heavy vicinity and dominant edge. Anomalous subgraphs are often detected using the Minimal Description Length principle [39, 40, 41]. One of the most important applications today is Social Network Data; many popular modern techniques are used in this area: Bayesian models [42], Markov Random Fields and the Ising model [43], the EM algorithm [44], as well as LOF [45].

In [44] the authors apply anomaly detection methods to social networks. The social network contains information about its members and their meetings; the problem is to find abnormal meetings and to measure their degree of abnormality. The specificity of the problem is that the number of meetings is very small compared to the number of members, which makes it challenging to use common statistical methods. To solve the problem the authors use the notion of a hypergraph: the vertices of the hypergraph are the members of the social network, and the edges are the meetings (each edge of a hypergraph connects some set of vertices together). The anomalies are detected through density estimation over a p-dimensional hypercube (the EM algorithm fits a two-component mixture). The method is tested on synthetic data and shows a relatively low estimation error. It is also considered a scalable method, which makes it valuable for use on large social networks.
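Extracting attributes from a node and its neighborhood, as described above, can be illustrated with egonet features similar in spirit to the ones used to spot near-stars and near-cliques in [38]. This is a simplified sketch, not the OddBall algorithm; the graph and names are invented.

```python
def egonet_features(adj):
    """For every node, extract (number of neighbors, number of edges
    inside its egonet). A star hub has exactly as many egonet edges
    as neighbors; a clique member has close to n * (n + 1) / 2."""
    feats = {}
    for v, nbrs in adj.items():
        ego = set(nbrs) | {v}
        # Count each undirected edge within the egonet once.
        edges = sum(1 for a in ego for b in adj[a] if b in ego) // 2
        feats[v] = (len(nbrs), edges)
    return feats

# A hub (node 0) whose neighbors are not connected to each other,
# plus a small triangle 5-6-7.
graph = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0],
         5: [6, 7], 6: [5, 7], 7: [5, 6]}
feats = egonet_features(graph)
# feats[0] == (4, 4): near-star; feats[5] == (2, 3): clique member.
```

Such (neighbors, egonet edges) pairs become points in a metric space, to which the distance- and density-based detectors of Section 3 can be applied.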
6 Specific features of anomaly detection methods compared to general machine learning and statistics methods

In this article we show how various data mining methods that re-use general machine learning and statistical algorithms can be applied to anomaly detection. The anomaly detection problem has its own specific features that make it possible to tune the appropriate general algorithms, turning them into more efficient ones.

Consider one of the basic concepts of machine learning, the classification problem. Anomaly detection can be treated as a classification problem if the data are assumed to contain a class of anomalies. Most methods that solve classification problems assume that the data classes have some sort of inner predictable structure. But the only prediction that can be made about anomalies is that these objects do not resemble the non-outlier, "normal" data. In this case, modeling the outlier class in order to solve the anomaly detection problem can be senseless and unproductive. Instead, one should pay attention to the structure of the normal data and its distribution laws.

Machine learning methods can be divided into three groups: supervised, semi-supervised and unsupervised. The first group is the most studied. It requires a labeled training dataset, and this is exactly the situation described above: the information about the outlier class is used to tune a model of it in order to predict its structure, which often has a very complex or random nature. Semi-supervised methods use information only about the "normal" class, so they suit the anomaly detection problem better, as do unsupervised methods, which use no information besides the structure and configuration of the unlabeled data.

Another important specific feature of the anomaly detection problem is that abnormal objects are usually significantly rare compared to the non-outlier objects. This effect makes it hard to construct a reliable training dataset for supervised methods. Moreover, if this effect is not present in the data, most known methods will suffer from high alarm rates [47, 48].

7 Conclusion

In this paper we introduced an approach to classifying different anomaly detection problems according to the way the data are presented. We reviewed different applications of outlier analysis in various cases. Finally, we summarized specific features of the methods suitable for the outlier analysis problem. Our future plans include preparing a university master-level course focused on anomaly detection, as well as working on anomaly detection in various fields (e.g., finding peculiar objects in massive digital sky astronomy surveys).
References

[1] Chandola, V., Banerjee, A., Kumar, V. Anomaly detection: A survey. ACM Computing Surveys, 2009, 41(3), 1–58. doi:10.1145/1541880.1541882
[2] Kou, Y., Lu, C.-T., Sirirwongwattana, S., Huang, Y.-P. Survey of fraud detection techniques. 2004, 749–754.
[3] Pan, Y., Ding, X. Anomaly based web phishing page detection. Proc. 22nd Annual Computer Security Applications Conference (ACSAC'06), IEEE, 2006, 381–392.
[4] Tzeng, J.-Y., Byerley, W., Devlin, B., Roeder, K., Wasserman, L. Outlier detection and false discovery rates for whole-genome DNA matching. Journal of the American Statistical Association, 2003, 98(461), 236–246. doi:10.1198/016214503388619256
[5] Wu, B. Cancer outlier differential gene expression detection. Biostatistics, 2007, 8(3), 566–575. doi:10.1093/biostatistics/kxl029
[6] Lourenço, A., et al. Outlier detection in non-intrusive ECG biometric system. Image Analysis and Recognition, Springer Berlin Heidelberg, 2013, 43–52.
[7] Liu, F. T., Ting, K. M., Zhou, Z.-H. Isolation-based anomaly detection. ACM Transactions on Knowledge Discovery from Data, 2012, 6(1), 1–39. doi:10.1145/2133360.2133363
[8] Djorgovski, S. G., Brunner, R. J., Mahabal, A. A., Odewahn, S. C. Exploration of large digital sky surveys. 2001, 1–18.
[9] Djorgovski, S. G., Mahabal, A. A., Brunner, R. J., Gal, R. R., Castro, S., de Carvalho, R. R., et al. Searches for rare and new types of objects. 2001, 225, 52–63.
[10] Aggarwal, C. C. Outlier Analysis. Springer, 2013. doi:10.1007/978-1-4614-6396-2
[11] Chandola, V., Banerjee, A., Kumar, V. Anomaly detection: A survey. ACM Computing Surveys, 2009, 41(3), 1–58. doi:10.1145/1541880.1541882
[12] Berti-Équille, L. Data quality mining: New research directions. 2009.
[13] Cover, T. M., Hart, P. E. Nearest neighbor pattern classification. IEEE Transactions on Information Theory, 1967, 13(1), 21–27.
[14] Breunig, M. M., Kriegel, H.-P., Ng, R. T., Sander, J. LOF: Identifying density-based local outliers. 2000, 1–12.
[15] Borne, K., Vedachalam, A. Effective outlier detection in science data streams. 2010, 1–15.
[16] Borne, K. Surprise detection in multivariate astronomical data.
[17] Henrion, M., Hand, D. J., Gandy, A., Mortlock, D. J. CASOS: a subspace method for anomaly detection in high dimensional astronomical databases. Statistical Analysis and Data Mining, 2013, 6(1), 1–89.
[18] Networks, K. Data mining: Self-organizing maps, 1–20.
[19] Ramadas, M., Ostermann, S., Tjaden, B. Detecting anomalous network traffic with self-organizing maps. Recent Advances in Intrusion Detection, Lecture Notes in Computer Science, vol. 2820, 2003, 36–54.
[20] Purarjomandlangrudi, A., Ghapanchi, A. H., Esmalifalak, M. A data mining approach for fault diagnosis: An application of anomaly detection algorithm. Measurement, 2014.
[21] Abdi, H., Williams, L. J. Principal component analysis. Wiley Interdisciplinary Reviews: Computational Statistics, 2010, 2(4), 433–459. doi:10.1002/wics.101
[22] Dutta, H., et al. Distributed top-K outlier detection from astronomy catalogs using the DEMAC system. SDM, 2007.
[23] Cansado, A., Soto, A. Unsupervised anomaly detection in large databases using Bayesian networks. 2008, 1–37.
[24] Zhu, X. CS838-1 Advanced NLP: The EM algorithm; K-means clustering. 2007, (6), 1–6.
[25] Spence, C., Parra, L., Sajda, P. Detection, synthesis and compression in mammographic image analysis with a hierarchical image probability model. 2001, 3–10.
[26] Pelleg, D., Moore, A. Active learning for anomaly and rare-category detection.
[27] Fawzy, A., Mokhtar, H. M. O., Hegazy, O. Outliers detection and classification in wireless sensor networks. Egyptian Informatics Journal, 2013, 14(2), 157–164.
[28] Aggarwal, C. C., Yu, P. S. An effective and efficient algorithm for high-dimensional outlier detection. The VLDB Journal, 2005, 14(2), 211–221.
[29] De Vries, T., Chawla, S., Houle, M. E. Finding local anomalies in very high dimensional space. 2010 IEEE International Conference on Data Mining, 128–137. doi:10.1109/ICDM.2010.151
[30] Chandola, V., Banerjee, A., Kumar, V. Anomaly detection for discrete sequences: A survey. IEEE Transactions on Knowledge and Data Engineering, 2012, 24(5), 823–839.
[31] Budalakoti, S., Srivastava, A. N., Otey, M. E. Anomaly detection and diagnosis algorithms for discrete symbol sequences with applications to airline safety. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 2009, 39(1), 101–113.
[32] Wang, M., Zhang, C., Yu, J. Native API based Windows anomaly intrusion detection method using SVM. Proc. IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing (SUTC 2006), vol. 1, 6 pp.
[33] Dasgupta, D., Forrest, S. Novelty detection in time series data using ideas from immunology. Proc. International Conference on Intelligent Systems, 1996, 82–87.
[34] Dumais, S. T. Latent semantic analysis. Annual Review of Information Science and Technology, 2005, 38, 188. doi:10.1002/aris.1440380105
[35] Allan, J., Papka, R., Lavrenko, V. On-line new event detection and tracking. 1998.
[36] Laorden, C., et al. Study on the effectiveness of anomaly detection for spam filtering. Information Sciences, 2014, 277, 421–444.
[37] Kil, H., Oh, S.-C., Elmacioglu, E., Nam, W., Lee, D. Graph theoretic topological analysis of web service networks. World Wide Web, 2009, 12(3), 321–343. doi:10.1007/s11280-009-0064-6
[38] Akoglu, L., McGlohon, M., Faloutsos, C. OddBall: Spotting anomalies in weighted graphs. Advances in Knowledge Discovery and Data Mining, Springer Berlin Heidelberg, 2010, 410–421.
[39] Noble, C. C., Cook, D. J. Graph-based anomaly detection. Proc. 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, 2003, 631–636.
[40] Eberle, W., Holder, L. Discovering structural anomalies in graph-based data. ICDM Workshops 2007, IEEE, 393–398.
[41] Chakrabarti, D. AutoPart: Parameter-free graph partitioning and outlier detection. Knowledge Discovery in Databases: PKDD 2004, Springer Berlin Heidelberg, 112–124.
[42] Heard, N. A., et al. Bayesian anomaly detection methods for social networks. The Annals of Applied Statistics, 2010, 4(2), 645–662.
[43] Horn, C., Willett, R. Online anomaly detection with expert system feedback in social networks. Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2011), 1936–1939.
[44] Silva, J., Willett, R. Detection of anomalous meetings in a social network. Proc. 42nd Annual Conference on Information Sciences and Systems (CISS 2008), IEEE, 636–641.
[45] Bhuyan, M., Bhattacharyya, D., Kalita, J. Network anomaly detection: methods, systems and tools. 2013.
[46] Portnoy, L., Eskin, E., Stolfo, S. Intrusion detection with unlabeled data using clustering. Proc. ACM Workshop on Data Mining Applied to Security (DMSA 2001).
[47] Laorden, C., et al. Study on the effectiveness of anomaly detection for spam filtering. Information Sciences, 2014, 277, 421–444.
[48] Fawzy, A., Mokhtar, H. M. O., Hegazy, O. Outliers detection and classification in wireless sensor networks. Egyptian Informatics Journal, 2013, 14(2), 157–164.
[49] Yu, M. A nonparametric adaptive CUSUM method and its application in network anomaly detection. International Journal of Advancements in Computing Technology, 2012, 4(1), 280–288.
[50] Muniyandi, A. P., Rajeswari, R., Rajaram, R. Network anomaly detection by cascading k-Means clustering and C4.5 decision tree algorithm. Procedia Engineering, 2012, 30, 174–182.
[51] Muda, Z., et al. A K-Means and Naive Bayes learning approach for better intrusion detection. Information Technology Journal, 2011, 10(3), 648–655.
[52] Kavuri, V. C., Liu, H. Hierarchical clustering method to improve transrectal ultrasound-guided diffuse optical tomography for prostate cancer imaging. Academic Radiology, 2014, 21(2), 250–262.
[53] Li, S., Tung, W. L., Ng, W. K. A novelty detection machine and its application to bank failure prediction. Neurocomputing, 2014, 130, 63–72.
[54] Cogranne, R., Retraint, F. Statistical detection of defects in radiographic images using an adaptive parametric model. Signal Processing, 2014, 96, 173–189.
[55] Daneshpazouh, A., Sami, A. Entropy-based outlier detection using semi-supervised approach with few positive examples. Pattern Recognition Letters, 2014.
[56] Rahmani, A., et al. Graph-based approach for outlier detection in sequential data and its application on stock market and weather data. Knowledge-Based Systems, 2014, 61, 89–97.