<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Anomaly Detection for System Logs Literature Overview</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author role="corresp">
							<persName><forename type="first">Justas</forename><surname>Juknys</surname></persName>
							<email>juknys@vdu.lt</email>
							<affiliation key="aff0">
								<orgName type="institution">Vytautas Magnus University</orgName>
								<address>
									<addrLine>Universiteto str. 10-202</addrLine>
									<postCode>53361</postCode>
									<settlement>Akademija, Kaunas</settlement>
									<country key="LT">Lithuania</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="department">IVUS2024: Information Society</orgName>
								<orgName type="institution">University Studies</orgName>
								<address>
									<addrLine>2024, May 17</addrLine>
									<settlement>Kaunas</settlement>
									<country key="LT">Lithuania</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Anomaly Detection for System Logs Literature Overview</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">9CC4E0385FF15B8336AB09AFDD073406</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T16:27+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Deep learning</term>
					<term>Neural networks</term>
					<term>Machine Learning</term>
					<term>Log messages</term>
					<term>Literature Review</term>
					<term>Anomaly Detection</term>
					<term>Cyber Security</term>
					<term>Classification</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>This paper presents the results of an analysis of system log anomaly detection literature from the 2018-2023 period. The literature was found using the keywords "log anomaly", "machine learning", and "neural network". A total of 80 different scientific papers have been analyzed. It has been determined that the most popular neural networks are LSTM/BiLSTM; the most common datasets are HDFS, BGL and Thunderbird; and the most popular evaluation metrics include F1 score, precision and accuracy. Most of the research sought to improve model detection accuracy, lower system resource use, and make models more suitable for real-time detection.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>As time goes by, the complexity and scale of software systems are rapidly increasing, which creates the need for new anomaly detection methods <ref type="bibr" target="#b0">[1]</ref>. As the amount of data that needs to be analyzed increases, so does the need to fully automate the detection process <ref type="bibr" target="#b1">[2]</ref>. As threats to system security become more sophisticated, the number of data points that need to be analyzed keeps increasing as well, which makes it harder to use a supervised training approach and to properly interpret the received data <ref type="bibr" target="#b2">[3]</ref>. Another major issue is the prevalence of zero-day exploits, which are usually impossible to predict in advance <ref type="bibr" target="#b3">[4]</ref>. All of the aforementioned issues are normally addressed through the use of anomaly detection methods.</p><p>To limit the scope of the research, it was chosen to focus on the specific keywords "log", "anomaly detection", "machine learning", and "neural networks". The arxiv.org [5] and sciencedirect.com <ref type="bibr">[6]</ref> databases have been used for research paper collection. A total of 117 different research papers have been analyzed. All papers were written during the 2000-2023 period, with 78 of them coming from the 2018-2023 period. Only research and conference papers have been analyzed.</p><p>For the purposes of this paper, the following questions were chosen for analysis:</p><p>1. Which neural network and machine learning approaches are being used? 2. What metrics have been used to evaluate the suggested approaches, and how do different approaches compare to each other? 3. Which datasets are being used to train models? 4. What problems in anomaly detection have been identified? 5. What findings/conclusions have been made?</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Key definitions</head><p>1) Anomaly detection: an approach that seeks to identify unusual events by comparing them to a standard situation. An anomalous event is something that cannot be fully anticipated in advance and, as a result, cannot be detected via traditional pattern-based detection methods. To declare an anomaly, an outlier needs to be found. This outlier can appear in various contexts, such as a statistical outlier, a situation/sequence outlier, or a timing outlier. It is usually assumed that anomalous data is much less numerous than normal data. The most popular approach to solving anomaly detection problems is semi-supervised training, where models are trained exclusively on normal data <ref type="bibr" target="#b2">[3]</ref>.</p><p>2) Log data: information gathered in sequential order and presented in lines. Each log entry contains all the information necessary to identify various system states at given moments in time. Data is usually saved as string or numerical values in easily readable text files. By following log entries it should be possible to reconstruct how the system functioned in the past, so if the system deviates from expected behavior, log analysis should identify the moment of the malfunction.</p><p>Log data can be used to determine in advance whether there are any risks of system failure, and also to detect possible intrusions. To achieve this, multiple data entries need to be analyzed at once in order to identify any abnormal patterns <ref type="bibr" target="#b2">[3]</ref>.</p><p>3) Neural networks are a subset of Artificial Intelligence (AI) research. They are algorithms inspired by neuroscience that seek to replicate the function of the human brain. These networks consist of many input units arranged in sets of layers. Initially, preprocessed data is fed to the first layer and, after the initial data transformations are performed, the layer results are passed to subsequent layers <ref type="bibr" target="#b4">[7]</ref>. Over time, a neural network discovers patterns within its data and can then use them to classify data into various categories.</p><p>4) Machine Learning (ML) is a subset of AI research that seeks to imitate human intellect through self-learning algorithms. First it is provided with preprocessed data; then a chosen model is applied to discover any meaningful patterns within the given data <ref type="bibr" target="#b5">[8]</ref>. The given data can be labeled to enhance model accuracy, which is called supervised learning. In the case of unsupervised training, the provided data is unlabeled and patterns need to be discovered using statistical methods.</p><p>Compared to neural networks, classical, or "non-deep", machine learning is more dependent on human intervention to learn. Human experts determine the set of features used to understand the differences between data inputs, usually requiring more structured data to learn <ref type="bibr" target="#b6">[9]</ref>. Traditional machine learning methods include Isolation Forest, SVM, kNN, Naive Bayes, polynomial/linear regression, PCA and other methods.</p></div>
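The semi-supervised setting described above (training on normal data only) can be illustrated with a deliberately simple statistical sketch: learn the mean and standard deviation of an event count from normal log windows, then flag any window whose z-score exceeds a threshold. This is an illustrative toy, not a method from the surveyed papers; the per-window counts and the threshold of 3 are invented for the example.

```python
from statistics import mean, stdev

def fit_profile(normal_counts):
    """Learn a profile from NORMAL data only (the semi-supervised setting)."""
    return mean(normal_counts), stdev(normal_counts)

def is_anomalous(count, profile, threshold=3.0):
    """Flag a window whose event count is a statistical outlier."""
    mu, sigma = profile
    return abs(count - mu) / sigma > threshold

# Events-per-minute observed while the system behaved normally (made-up numbers).
normal = [98, 102, 101, 97, 100, 103, 99, 101, 100, 99]
profile = fit_profile(normal)

print(is_anomalous(100, profile))  # typical window -> False
print(is_anomalous(250, profile))  # burst of log events -> True
```

Real systems replace the single count with richer features (event sequences, template frequencies), but the principle is the same: only normal behavior is modeled, and anything sufficiently far from it is declared anomalous.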
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Survey Results</head><p>Table <ref type="table" target="#tab_0">1</ref> shows the number of publications released during recent years. The publication amount is the exact number of research papers released during that year. Papers which also include research into neural network use are counted as well. It can be said that during the last five years the anomaly detection field has received increased attention from the research community. Over the last three years the majority of the literature covers neural network methods, while standard machine learning methods (such as kNN, decision trees and SVM) are becoming less popular. Table <ref type="table" target="#tab_3">3</ref> provides the list of all commonly used machine learning methods. Any method which has been used only once within the reviewed literature is included in the "other" category. It has been determined that SVM is the most frequently used machine learning method. Its primary advantage over neural networks is its significantly faster computational speed, which is important when it is necessary to detect system anomalies as soon as possible. Other notable advantages include the ability to handle high-dimensional data and a low risk of overfitting <ref type="bibr" target="#b8">[11]</ref>. HDFS is a key component of Hadoop; it offers reliable storage through data replication, integrates with big data frameworks and supports batch processing <ref type="bibr" target="#b9">[12]</ref>. Within the reviewed literature it appeared most frequently and was often used together with BGL and Thunderbird <ref type="bibr" target="#b10">[13]</ref>, both of which are popular supercomputer log datasets. Table <ref type="table">5</ref> shows all frequently used evaluation metrics. Any metric which has been used only once across all research papers is included in the "other" category. It has been determined that the F1 score (Formula 1) was the most commonly used evaluation metric. This metric is calculated from the Recall (Formula 2) and Precision (Formula 3) metrics, so in most research papers all three metrics have been used simultaneously: F1 = 2 × Precision × Recall / (Precision + Recall) (1), Recall = True Positive / (True Positive + False Negative) (2). Within these formulas, True Positive stands for all correctly identified elements, False Negative stands for all elements which have been incorrectly labeled as false, and False Positive stands for all elements incorrectly labeled as true.</p><p>Precision is a good way of determining the reliability of individual results, which helps to minimize the risk of spending unnecessary resources on handling false alarms. Recall is useful for determining how much of an impact false negatives might have, which is very important, as a single missed anomaly is enough to cause massive system damage. As both Precision and Recall are important, the F1 score ensures that both can be represented by a single metric <ref type="bibr" target="#b11">[14]</ref>.</p></div>
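The three metrics discussed above follow directly from the confusion counts. The short sketch below computes them as in Formulas 1-3; the counts (8 detected anomalies, 2 false alarms, 4 misses) are hypothetical values chosen only for the illustration.

```python
def precision(tp, fp):
    """Share of raised alarms that were real anomalies (Formula 3)."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Share of real anomalies that were actually caught (Formula 2)."""
    return tp / (tp + fn)

def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall (Formula 1)."""
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r)

# Hypothetical confusion counts: 8 anomalies found, 2 false alarms, 4 missed.
print(round(precision(8, 2), 3))    # 0.8
print(round(recall(8, 4), 3))       # 0.667
print(round(f1_score(8, 2, 4), 3))  # 0.727
```

Note how the F1 score sits between the two component metrics and is pulled toward the lower one, which is why it is preferred when a model must balance false alarms against missed anomalies.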
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Precision</head><p>Precision = True Positive / (True Positive + False Positive) (3)</p><p>Table 6 lists the most common anomaly detection problems described within the research papers. Any problem which has been mentioned only once has been assigned to the "other" category. The largest concern specified in the research literature is that, due to the increase in the amount of data, the extent to which log data analysis is automated should increase as well <ref type="bibr" target="#b12">[15]</ref>[16] <ref type="bibr" target="#b14">[17]</ref>. Another major issue being brought up is that log data by itself does not include a sufficient amount of information to effectively detect new threats <ref type="bibr" target="#b15">[18]</ref>. Often, when relying on log data, only the time context is established and additional data context is ignored <ref type="bibr">[19]</ref>. Further issues can also be introduced while parsing log data, which can further degrade anomaly detection accuracy <ref type="bibr" target="#b16">[20]</ref>. One of the main requirements for successful anomaly detection is the timely discovery of new threats. In order to comply with it and provide near real-time detection, some compromises need to be made; for example, this often means relying only on the simplest log data analysis and ignoring additional system analysis tools <ref type="bibr">[21][22]</ref>. Furthermore, state-of-the-art anomaly detection methods with the highest detection accuracy are usually unfit for time-sensitive issue detection <ref type="bibr" target="#b19">[23]</ref>. Another concern is that, due to the amount of information that needs to be processed, cloud computing becomes necessary, which introduces issues of data transfer speeds <ref type="bibr" target="#b20">[24]</ref>[25]. On top of that, due to software updates, models designed for previous software versions might severely degrade in accuracy <ref type="bibr" target="#b22">[26]</ref>.</p><p>Additional issues brought up in the literature include: difficulty performing simultaneous parallel analysis when each input is part of a time series and requires proper understanding of its context <ref type="bibr" target="#b23">[27]</ref>; not all problems might be reflected within logs, and issues of the software program itself might be overlooked <ref type="bibr" target="#b24">[28]</ref>; anomaly detection methods are not sufficiently compared to each other <ref type="bibr" target="#b25">[29]</ref>; traditional machine learning methods such as SVM are unable to perform sufficiently accurate analysis of the temporal information of discrete log messages <ref type="bibr" target="#b27">[30]</ref>; certain anomaly detection models have not been sufficiently tested in real-life applications <ref type="bibr" target="#b28">[31]</ref>; and models based on statistical methods might be insensitive to the importance of log entry order <ref type="bibr" target="#b29">[32]</ref>.</p></div>
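The log-parsing concern raised above ([20]) can be made concrete with a naive template extractor that masks variable fields before detection. The regex rules and the sample HDFS-style line below are assumptions chosen for illustration; real parsers such as Drain are considerably more sophisticated.

```python
import re

def to_template(line):
    """Naively mask variable fields (IPs, hex IDs, numbers) to recover a log template."""
    line = re.sub(r"\b\d{1,3}(\.\d{1,3}){3}\b", "<IP>", line)  # IPv4 addresses first
    line = re.sub(r"0x[0-9a-fA-F]+", "<HEX>", line)            # hex identifiers
    line = re.sub(r"\d+", "<NUM>", line)                       # remaining numbers
    return line

print(to_template("Received block blk_3587508140051953248 of size 67108864 from 10.251.42.84"))
# Received block blk_<NUM> of size <NUM> from <IP>
```

Two lines differing only in block ID, size, or source address map to the same template; but a mask that is too aggressive (merging distinct events) or too lax (splitting one event into many templates) distorts the event vocabulary, which is exactly how parsing errors propagate into detection accuracy.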
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Primary Findings</head><p>The following were the main findings of the analyzed literature:</p><p>1. Embedding multi-core point-by-point convolution and global average pooling achieves significant advantages in terms of arithmetic power, memory and high availability, while ensuring detection accuracy <ref type="bibr" target="#b19">[23]</ref>.</p><p>2. The Gumbel Noise Score Matching model demonstrated the capability of score matching for anomaly detection on categorical data in both tabular and image datasets. It also provided a unified framework for modeling mixed data types via score matching <ref type="bibr" target="#b30">[33]</ref>.</p><p>3. In transformer-based models, adapter-based tuning consistently outperforms training and fine-tuning models <ref type="bibr" target="#b13">[16]</ref>.</p><p>4. Dividing log events into dependent and independent types is an effective way to boost model accuracy <ref type="bibr" target="#b14">[17]</ref>.</p><p>5. Taking a character-based approach to processing log events (lines) contributes to higher performance, as the model may take advantage of characters deleted in word-based approaches, such as numbers and punctuation. Merging the parser, vectorizer, and classifier components into one deep neural network allows the model to learn log data at the language level <ref type="bibr" target="#b31">[34]</ref>.</p><p>6. Models trained on multi-project datasets are not only more accurate in standard tests but also more robust to sequence evolution and more accurate in ahead-of-time anomaly predictions <ref type="bibr" target="#b31">[34]</ref>.</p><p>7. Though the presence of critical logs often indicates problems, their absence does not necessarily imply a healthy system status. An important reason is that it is sometimes difficult to determine where and how to place an informative log statement. In some cases, faults do not affect metrics, while in other cases metrics exhibit unusual patterns (e.g., jitter) even when the system is experiencing minor performance fluctuations rather than faults. Hence, simply identifying anomalous metric patterns is insufficient <ref type="bibr" target="#b0">[1]</ref>.</p><p>8. Faults can cause unexpected behaviors involving logs, metrics, or both, so the two data sources should be analyzed comprehensively to reveal the actual anomalies <ref type="bibr" target="#b0">[1]</ref>.</p><p>9. The intrinsic structure of host-based logs, as captured by persistence images and the spectra of graph and hypergraph Laplacians, contains discriminative information about whether or not the logs are anomalous <ref type="bibr" target="#b32">[35]</ref>.</p><p>10. Data augmentation can simulate deviations in log data that occur from service updates over time, which contributes to successful anomaly detection <ref type="bibr" target="#b21">[25]</ref>.</p><p>11. A multimodal approach can improve anomaly detection scores for multiple modalities in comparison to the single modalities of logs and traces <ref type="bibr" target="#b33">[36]</ref>.</p><p>12. Filtering out common log entries can noticeably improve anomaly detection accuracy <ref type="bibr" target="#b34">[37]</ref>.</p></div>
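Finding 12 (filtering out common log entries [37]) can be sketched as a simple frequency filter over event types: entries whose type dominates the stream are dropped before detection. The 30% threshold and the toy event stream below are assumptions for this illustration, not values from the cited paper.

```python
from collections import Counter

def filter_common(events, max_share=0.3):
    """Drop event types that make up more than max_share of the stream,
    keeping the rarer entries that carry most of the anomaly signal."""
    counts = Counter(events)
    total = len(events)
    keep = {e for e, c in counts.items() if c / total <= max_share}
    return [e for e in events if e in keep]

# Routine heartbeats dominate this made-up stream and are filtered away.
stream = ["heartbeat"] * 8 + ["block_received", "checksum_error", "block_received"]
print(filter_common(stream))
# ['block_received', 'checksum_error', 'block_received']
```

Removing high-frequency routine events shrinks the input and keeps rare entries, at the risk of hiding anomalies that manifest as an unusual *rate* of a common event, so the threshold needs tuning per system.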
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Conclusions</head><p>During this survey it has been determined that the popularity of this topic has been increasing over recent years. The problems identified within the research papers still need to be addressed, and no universal solution has been discovered that would allow anomaly detection methods to keep up with the ever-increasing amount of generated log data and the growing complexity of system software. It has also been determined that neural networks are continuously increasing in popularity, while traditional machine learning methods are becoming less popular. The most popular neural network model is LSTM/BiLSTM, the most commonly used dataset is HDFS, and the most frequently used evaluation metric is the F1 score.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 Publications per year</head><label>1</label><figDesc></figDesc><table><row><cell>Year</cell><cell>Total Publication Amount</cell><cell>Neural Network Papers</cell></row><row><cell>2023 (first half)</cell><cell>6</cell><cell>5</cell></row><row><cell>2022</cell><cell>15</cell><cell>10</cell></row><row><cell>2021</cell><cell>26</cell><cell>17</cell></row><row><cell>2020</cell><cell>13</cell><cell>10</cell></row><row><cell>2019</cell><cell>10</cell><cell>4</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2</head><label>2</label><figDesc>lists all the different neural networks which have been mentioned in at least 2 separate research papers. All the remaining methods are included in the "other" category. By far the most popular neural network models were LSTM and BiLSTM. The primary reason for this is that log data is normally represented as a time series, where previous log entries usually have influence over later entries <ref type="bibr" target="#b7">[10]</ref>.</figDesc><table /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 2</head><label>2</label><figDesc></figDesc><table><row><cell>Most common Neural Network approaches</cell><cell></cell></row><row><cell>Neural Network</cell><cell>Amount</cell></row><row><cell>LSTM/BiLSTM</cell><cell>11</cell></row><row><cell>Autoencoder</cell><cell>7</cell></row><row><cell>CNN/TCN</cell><cell>6</cell></row><row><cell>Deeplog/LogAnomaly/LogRobust</cell><cell>6</cell></row><row><cell>Transformer</cell><cell>6</cell></row><row><cell>GNN/eGNN/eGFC</cell><cell>5</cell></row><row><cell>RNN</cell><cell>4</cell></row><row><cell>MLP</cell><cell>3</cell></row><row><cell>Siamese Neural Network</cell><cell>2</cell></row><row><cell>Other</cell><cell>13</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 3</head><label>3</label><figDesc></figDesc><table><row><cell>Most Common Machine Learning Approaches</cell><cell></cell></row><row><cell>Method Name</cell><cell>Amount</cell></row><row><cell>SVM</cell><cell>14</cell></row><row><cell>Isolation Forest</cell><cell>10</cell></row><row><cell>Logistic/Linear Regression</cell><cell>6</cell></row><row><cell>PCA</cell><cell>6</cell></row><row><cell>Word2Vec</cell><cell>6</cell></row><row><cell>Bayesian</cell><cell>5</cell></row><row><cell>kNN</cell><cell>4</cell></row><row><cell>Decision Tree</cell><cell>3</cell></row><row><cell>Drain Algorithm</cell><cell>2</cell></row><row><cell>Other</cell><cell>14</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_4"><head>Table 4</head><label>4</label><figDesc>contains the counts of the most commonly used datasets. The Industrial category refers to unnamed datasets which used specific industrial process log data. The Private category includes all datasets which cannot be disclosed due to a non-disclosure agreement. The Generated category includes all synthetic datasets which were generated specifically for the research study. Any dataset which did not fall into the previous three categories and was mentioned only once within all research papers has been included in the "other" category.</figDesc><table /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_5"><head>Table 4</head><label>4</label><figDesc></figDesc><table><row><cell>Most Common Datasets</cell><cell></cell></row><row><cell>Dataset Name</cell><cell>Amount</cell></row><row><cell>HDFS</cell><cell>20</cell></row><row><cell>BGL</cell><cell>17</cell></row><row><cell>Thunderbird</cell><cell>10</cell></row><row><cell>OpenStack</cell><cell>8</cell></row><row><cell>Spirit</cell><cell>6</cell></row><row><cell>NSL-KDD</cell><cell>4</cell></row><row><cell>DARPA</cell><cell>3</cell></row><row><cell>Hadoop</cell><cell>3</cell></row><row><cell>LANL</cell><cell>3</cell></row><row><cell>MNIST</cell><cell>3</cell></row><row><cell>CIFAR</cell><cell>2</cell></row><row><cell>Huawei Cloud</cell><cell>2</cell></row><row><cell>KDD CUP 99</cell><cell>2</cell></row><row><cell>Industrial</cell><cell>4</cell></row><row><cell>Private</cell><cell>11</cell></row><row><cell>Generated</cell><cell>4</cell></row><row><cell>Other</cell><cell>79</cell></row></table></figure>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<title level="m" type="main">Heterogeneous Anomaly Detection for Software Systems via Semi-supervised Cross-modal Attention</title>
		<author>
			<persName><forename type="first">Cheryl</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Tianyi</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Zhuangbin</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Yuxin</forename><surname>Su</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Yongqiang</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Michael</forename><forename type="middle">R</forename><surname>Lyu</surname></persName>
		</author>
		<ptr target="https://arxiv.org/pdf/2302.06914.pdf" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<title level="m" type="main">PULL: Reactive Log Anomaly Detection Based on Iterative PU Learning</title>
		<author>
			<persName><forename type="first">Thorsten</forename><surname>Wittkopp</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Dominik</forename><surname>Scheinert</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Philipp</forename><surname>Wiesner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Alexander</forename><surname>Acker</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Odej</forename><surname>Kao</surname></persName>
		</author>
		<ptr target="https://arxiv.org/pdf/2301.10681.pdf" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<title level="m" type="main">Deep Learning for Anomaly Detection in Log Data: A Survey</title>
		<author>
			<persName><forename type="first">Max</forename><surname>Landauer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Sebastian</forename><surname>Onder</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Florian</forename><surname>Skopik</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Markus</forename><surname>Wurzenberger</surname></persName>
		</author>
		<ptr target="https://arxiv.org/pdf/2207.03820.pdf" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<title level="m" type="main">Zero-day attack detection: a systematic literature review</title>
		<author>
			<persName><forename type="first">Rasheed</forename><surname>Ahmad</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Izzat</forename><surname>Alsmadi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Wasim</forename><surname>Alhamdani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Lo'ai</forename><surname>Tawalbeh</surname></persName>
		</author>
		<idno type="DOI">10.1007/s10462-023-10437-z</idno>
		<ptr target="https://link.springer.com/article/10.1007/s10462-023-10437-z" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<title level="m" type="main">Neural networks</title>
		<author>
			<persName><forename type="first">Chris</forename><surname>Woodford</surname></persName>
		</author>
		<ptr target="https://www.explainthatstuff.com/introduction-to-neural-networks.html" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<monogr>
		<title level="m" type="main">Machine learning, explained</title>
		<author>
			<persName><forename type="first">Sara</forename><surname>Brown</surname></persName>
		</author>
		<ptr target="https://mitsloan.mit.edu/ideas-made-to-matter/machine-learning-explained" />
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<title level="m" type="main">What is machine learning?</title>
		<ptr target="https://www.ibm.com/topics/machine-learning" />
		<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
		<respStmt>
			<orgName>IBM</orgName>
		</respStmt>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<title level="m" type="main">Log Message Anomaly Detection and Classification Using Auto-B/LSTM and Auto-GRU</title>
		<author>
			<persName><forename type="first">Amir</forename><surname>Farzad</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">Aaron</forename><surname>Gulliver</surname></persName>
		</author>
		<ptr target="https://arxiv.org/pdf/1911.08744.pdf" />
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<monogr>
		<author>
			<persName><forename type="first">Suthar</forename><surname>Mudra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Bhavikkmuar</forename></persName>
		</author>
		<ptr target="https://iq.opengenus.org/advantages-of-svm/" />
		<title level="m">Advantages of Support Vector Machines (SVM)</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<title level="m" type="main">The Ultimate Guide to HDFS for Big Data Processing</title>
		<author>
			<persName><forename type="first">Donal</forename><surname>Tobin</surname></persName>
		</author>
		<ptr target="https://www.integrate.io/blog/guide-to-hdfs-for-big-data-processing/" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<title level="m" type="main">What Supercomputers Say: A Study of Five System Logs</title>
		<author>
			<persName><forename type="first">Adam</forename><surname>Oliner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jon</forename><surname>Stearley</surname></persName>
		</author>
		<ptr target="https://ieeexplore.ieee.org/document/4273008/" />
		<imprint>
			<date type="published" when="2007">2007</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<monogr>
		<title level="m" type="main">F1 Score in Machine Learning</title>
		<author>
			<persName><forename type="first">Nikolaj</forename><surname>Buhl</surname></persName>
		</author>
		<ptr target="https://encord.com/blog/f1-score-in-machine-learning/" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<monogr>
		<title level="m" type="main">Leveraging Log Instructions in Log-based Anomaly Detection</title>
		<author>
			<persName><forename type="first">Jasmin</forename><surname>Bogatinovski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Gjorgji</forename><surname>Madjarov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Sasho</forename><surname>Nedelkoski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jorge</forename><surname>Cardoso</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Odej</forename><surname>Kao</surname></persName>
		</author>
		<ptr target="https://arxiv.org/pdf/2207.03206.pdf" />
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<monogr>
		<title level="m" type="main">A Unified Transformer-based Framework for Log Anomaly Detection</title>
		<author>
			<persName><forename type="first">Hongcheng</forename><surname>Guo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Xingyu</forename><surname>Lin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jian</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Yi</forename><surname>Zhuang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jiaqi</forename><surname>Bai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Liangfan</forename><surname>Tieqiaozheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Weichao</forename><surname>Zheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Bo</forename><surname>Hou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Zhoujun</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><surname>Li</surname></persName>
		</author>
		<ptr target="https://arxiv.org/pdf/2201.00016.pdf" />
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
	<note>TRANSLOG</note>
</biblStruct>

<biblStruct xml:id="b14">
	<monogr>
		<title level="m" type="main">LogDP: Combining Dependency and Proximity for Log-based Anomaly Detection</title>
		<author>
			<persName><forename type="first">Yongzheng</forename><surname>Xie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Hongyu</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Bo</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Muhammad</forename><surname>Ali Babar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Sha</forename><surname>Lu</surname></persName>
		</author>
		<ptr target="https://arxiv.org/pdf/2110.01927.pdf" />
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Fine-grained Anomaly Detection in Sequential Data via Counterfactual Explanations</title>
		<author>
			<persName><forename type="first">He</forename><surname>Cheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Depeng</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Shuhan</forename><surname>Yuan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Xintao</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Saswati</forename><surname>Ray</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Sana</forename><surname>Lakdawala</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Mononito</forename><surname>Goswami</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Chufan</forename><surname>Gao</surname></persName>
		</author>
		<ptr target="https://arxiv.org/pdf/2111.08082v1.pdf" />
	</analytic>
	<monogr>
		<title level="m">Learning Probabilistic Graph Neural Networks for Multivariate Time Series Anomaly Detection</title>
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<monogr>
		<title level="m" type="main">Log-based Anomaly Detection Without Log Parsing</title>
		<author>
			<persName><forename type="first">Van-Hoang</forename><surname>Le</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Hongyu</forename><surname>Zhang</surname></persName>
		</author>
		<ptr target="https://arxiv.org/pdf/2108.01955.pdf" />
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<monogr>
		<title level="m" type="main">Learning What to Monitor for Efficient Anomaly Detection</title>
		<author>
			<persName><forename type="first">Davide</forename><surname>Sanvito</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Giuseppe</forename><surname>Siracusano</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Sharan</forename><surname>Santhanam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Roberto</forename><surname>Gonzalez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Roberto</forename><surname>Bifulco</surname></persName>
		</author>
		<ptr target="https://arxiv.org/pdf/2203.15324.pdf" />
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
	<note>syslrn</note>
</biblStruct>

<biblStruct xml:id="b18">
	<monogr>
		<title level="m" type="main">An Anomaly Event Detection Method Based on GNN Algorithm for Multi-data Sources</title>
		<author>
			<persName><forename type="first">Yipeng</forename><surname>Ji</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jingyi</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Shaoning</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Yangyang</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Shenwen</forename><surname>Lin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Xiong</forename><surname>Li</surname></persName>
		</author>
		<ptr target="https://arxiv.org/pdf/2104.08761.pdf" />
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<monogr>
		<title level="m" type="main">LightLog: A lightweight temporal convolutional network for log anomaly detection on the edge</title>
		<author>
			<persName><forename type="first">Zumin</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jiyu</forename><surname>Tian</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Hui</forename><surname>Fang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Liming</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jing</forename><surname>Qin</surname></persName>
		</author>
		<ptr target="https://www.sciencedirect.com/science/article/abs/pii/S1389128621005119" />
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<monogr>
		<title level="m" type="main">DeCorus: Hierarchical Multivariate Anomaly Detection at Cloud-Scale</title>
		<author>
			<persName><forename type="first">Bruno</forename><surname>Wassermann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">David</forename><surname>Ohana</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ronen</forename><surname>Schaffer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Robert</forename><surname>Shahla</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Elliot</forename><forename type="middle">K</forename><surname>Kolodner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Eran</forename><surname>Raichstein</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Michal</forename><surname>Malka</surname></persName>
		</author>
		<ptr target="https://arxiv.org/pdf/2202.06892.pdf" />
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<monogr>
		<title level="m" type="main">A2Log: Attentive Augmented Log Anomaly Detection</title>
		<author>
			<persName><forename type="first">Thorsten</forename><surname>Wittkopp</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Alexander</forename><surname>Acker</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Sasho</forename><surname>Nedelkoski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jasmin</forename><surname>Bogatinovski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Dominik</forename><surname>Scheinert</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Wu</forename><surname>Fan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Odej</forename><surname>Kao</surname></persName>
		</author>
		<ptr target="https://arxiv.org/pdf/2109.09537.pdf" />
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<monogr>
		<title level="m" type="main">Robust and Transferable Anomaly Detection in Log Data using Pre-Trained Language Models</title>
		<author>
			<persName><forename type="first">Harold</forename><surname>Ott</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jasmin</forename><surname>Bogatinovski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Alexander</forename><surname>Acker</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Sasho</forename><surname>Nedelkoski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Odej</forename><surname>Kao</surname></persName>
		</author>
		<ptr target="https://arxiv.org/pdf/2102.11570.pdf" />
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<monogr>
		<title level="m" type="main">Distributed Anomaly Detection in Edge Streams using Frequency based Sketch Datastructures</title>
		<author>
			<persName><forename type="first">Prateek</forename><surname>Chanda</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Malay</forename><surname>Bhattacharya</surname></persName>
		</author>
		<ptr target="https://arxiv.org/pdf/2111.13949.pdf" />
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<monogr>
		<title level="m" type="main">System Log Anomaly Detection based on BERT Masked Language Model</title>
		<author>
			<persName><forename type="first">Yukyung</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jina</forename><surname>Kim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Pilsung</forename><surname>Kang</surname></persName>
		</author>
		<ptr target="https://arxiv.org/pdf/2111.09564.pdf" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note>LAnoBERT</note>
</biblStruct>

<biblStruct xml:id="b25">
	<monogr>
		<title/>
		<author>
			<persName><forename type="first">Zhuangbin</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jinyang</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Wenwei</forename><surname>Gu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Yuxin</forename><surname>Su</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jieming</forename><surname>Zhu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Yongqiang</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Michael</forename><forename type="middle">R</forename><surname>Lyu</surname></persName>
		</author>
		<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<monogr>
		<title level="m" type="main">Experience Report: Deep Learning-based System Log Analysis for Anomaly Detection</title>
		<author>
			<persName><forename type="first">Michael</forename><forename type="middle">R</forename><surname>Lyu</surname></persName>
		</author>
		<ptr target="https://arxiv.org/pdf/2107.05908.pdf" />
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<monogr>
		<title level="m" type="main">Log Anomaly Detection via BERT</title>
		<author>
			<persName><forename type="first">Haixuan</forename><surname>Guo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Shuhan</forename><surname>Yuan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Xintao</forename><surname>Wu</surname></persName>
		</author>
		<ptr target="https://arxiv.org/pdf/2103.04475.pdf" />
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
	<note>LogBERT</note>
</biblStruct>

<biblStruct xml:id="b28">
	<monogr>
		<title level="m" type="main">Online anomaly detection using statistical leverage for streaming business process events</title>
		<author>
			<persName><forename type="first">Jonghyeon</forename><surname>Ko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Marco</forename><surname>Comuzzi</surname></persName>
		</author>
		<ptr target="https://arxiv.org/pdf/2103.00831.pdf" />
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b29">
	<monogr>
		<author>
			<persName><forename type="first">Yicheng</forename><surname>Guo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Yujin</forename><surname>Wen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Congwei</forename><surname>Jian</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Yixin</forename><surname>Lian</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Yi</forename><surname>Wan</surname></persName>
		</author>
		<ptr target="https://arxiv.org/pdf/2101.02392.pdf" />
		<title level="m">Detecting Log Anomalies with Multi-Head Attention</title>
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
	<note>LAMA</note>
</biblStruct>

<biblStruct xml:id="b30">
	<monogr>
		<title level="m" type="main">Anomaly Detection via Gumbel Noise Score Matching</title>
		<author>
			<persName><forename type="first">Ahsan</forename><surname>Mahmood</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Junier</forename><surname>Oliva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Martin</forename><surname>Styner</surname></persName>
		</author>
		<ptr target="https://arxiv.org/pdf/2304.03220.pdf" />
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b31">
	<monogr>
		<title level="m" type="main">OneLog: Towards End-to-End Training in Software Log Anomaly Detection</title>
		<author>
			<persName><forename type="first">Shayan</forename><surname>Hashemi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Mika</forename><surname>Mäntylä</surname></persName>
		</author>
		<ptr target="https://arxiv.org/pdf/2104.07324v1.pdf" />
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b32">
	<monogr>
		<title level="m" type="main">Topological Data Analysis for Anomaly Detection in Host-Based Logs</title>
		<author>
			<persName><forename type="first">Thomas</forename><surname>Davies</surname></persName>
		</author>
		<ptr target="https://arxiv.org/pdf/2204.12919.pdf" />
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b33">
	<monogr>
		<title level="m" type="main">Multi-Source Anomaly Detection in Distributed IT Systems</title>
		<author>
			<persName><forename type="first">Jasmin</forename><surname>Bogatinovski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Sasho</forename><surname>Nedelkoski</surname></persName>
		</author>
		<ptr target="https://arxiv.org/pdf/2101.04977.pdf" />
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b34">
	<monogr>
		<author>
			<persName><forename type="first">Siavash</forename><surname>Ghiasvand</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Florina</forename><forename type="middle">M</forename><surname>Ciorba</surname></persName>
		</author>
		<ptr target="https://arxiv.org/pdf/1906.04550.pdf" />
		<title level="m">Anomaly Detection in High Performance Computers: A Vicinity Perspective</title>
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
