<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Anomaly Detection of Command Shell Sessions based on DistilBERT: Unsupervised and Supervised Approaches</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author role="corresp">
							<persName><forename type="first">Zefang</forename><surname>Liu</surname></persName>
							<email>zefang.liu@jpmchase.com</email>
							<affiliation key="aff0">
								<orgName type="institution">JPMorgan Chase</orgName>
								<address>
									<addrLine>3223 Hanover St</addrLine>
									<postCode>94304</postCode>
									<settlement>Palo Alto</settlement>
									<region>CA</region>
									<country key="US">USA</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">John</forename><forename type="middle">F</forename><surname>Buford</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">JPMorgan Chase</orgName>
								<address>
									<addrLine>3223 Hanover St</addrLine>
									<postCode>94304</postCode>
									<settlement>Palo Alto</settlement>
									<region>CA</region>
									<country key="US">USA</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Anomaly Detection of Command Shell Sessions based on DistilBERT: Unsupervised and Supervised Approaches</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">FE62F06B136442CBDE6B3233627F5E81</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T19:58+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>anomaly detection</term>
					<term>keystroke data</term>
					<term>command line</term>
					<term>Unix shell</term>
					<term>DistilBERT</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Anomaly detection in command shell sessions is a critical aspect of computer security. Recent advances in deep learning and natural language processing, particularly transformer-based models, have shown great promise for addressing complex security challenges. In this paper, we implement a comprehensive approach to detect anomalies in Unix shell sessions using a pretrained DistilBERT model, leveraging both unsupervised and supervised learning techniques to identify anomalous activity while minimizing data labeling. The unsupervised method captures the underlying structure and syntax of Unix shell commands, enabling the detection of session deviations from normal behavior. Experiments on a large-scale enterprise dataset collected from production systems demonstrate the effectiveness of our approach in detecting anomalous behavior in Unix shell sessions. This work highlights the potential of leveraging recent advances in transformers to address important computer security challenges.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>The complexity of modern computer systems and networks has led to an increasing demand for efficient and reliable security solutions. Interactive command shells, especially Unix shells, which provide a powerful interface for system administration, development, and maintenance tasks, are an essential aspect of many computing environments. However, they can also be exploited by attackers to gain unauthorized access, escalate privileges, avoid defense detection, collect sensitive data, and manipulate systems. As a result, anomaly detection in command shells has become a crucial component of computer security.</p><p>Previous studies have utilized various techniques for anomaly detection in command shell sessions, ranging from simple rule-based methods to more complex machine learning algorithms. However, most of these approaches rely heavily on predefined features or labeled data from security experts for training supervised models. Assembling a large, well-labeled dataset can be time-consuming and labor-intensive, often resulting in a limited scope of detection capabilities due to the inherent biases in the labeling process.</p><p>Recent advances in deep learning and natural language processing (NLP) have enabled new opportunities for addressing complex security challenges. In particular, transformer-based models, such as BERT (Bidirectional Encoder Representations from Transformers) <ref type="bibr" target="#b0">[1]</ref> and GPT (Generative Pretrained Transformer) <ref type="bibr" target="#b1">[2]</ref>, have achieved state-of-the-art performance across various NLP tasks. 
These models have the potential to enhance computer security by enabling more effective and adaptable anomaly detection systems that can learn from large-scale, diverse data sources.</p><p>In enterprise production environments, access to command shells is treated as a privileged activity because of the potential for misuse of system commands. Commands with the potential for misuse are well known. Specific commands may be disabled a priori. Attack techniques have been compiled, for example, in the MITRE ATT&amp;CK ® framework. Enterprises can implement rule-based detection using these resources. Consequently, the benefit of the anomaly detection model is that it automatically identifies command patterns that are outliers with respect to the overall set of sessions and that would not be detected by the rule-based approach. Due to the volume, length, and complexity of shell sessions, manual detection of outliers is impractical. An automatic process is needed to assign anomaly scores to sessions, so that sessions with high anomaly scores can be prioritized for further investigation. In this paper, we apply a transformer-based model for anomaly detection in Unix shell sessions with a pretrained DistilBERT model. Our method employs both unsupervised and supervised learning techniques, aiming to deliver a robust and flexible solution for identifying anomalous activity while reducing the reliance on manual labels from experts.</p><p>DistilBERT <ref type="bibr" target="#b2">[3]</ref>, a lighter and more efficient version of BERT <ref type="bibr" target="#b0">[1]</ref>, has demonstrated exceptional performance across a wide range of NLP tasks. By pretraining a DistilBERT model on a large dataset of Unix shell sessions, we capture the underlying structure and syntax of Unix shell commands and allow the model to identify deviations of shell sessions from normal activity. 
The unsupervised method uses an ensemble model to calculate anomaly scores, detecting potential security threats without requiring labeled data. We further experimented with applying the unsupervised model to specific command subshells, such as HDFS, SQL, Spark, and Python, which are notable for having their own command syntaxes. To further enhance the precision of our anomaly detection system, we implement a supervised approach by fine-tuning the pretrained DistilBERT model on a small set of Unix shell sessions labeled with suspicious keywords, which allows the model to learn from session labels and distinguish normal and anomalous activity more effectively. The overall pipeline is shown in Figure <ref type="figure" target="#fig_0">1</ref> for both unsupervised and supervised methods.</p><p>The main contributions of this paper are as follows:</p><p>1. We apply a comprehensive anomaly detection framework for Unix shell sessions based on the pretrained DistilBERT model and ensemble anomaly detectors, addressing an important problem in computer security.</p><p>2. We conduct experiments and demonstrate the effectiveness of the unsupervised approach, which uses an ensemble method to compute anomaly scores for a large-scale enterprise dataset, enabling the identification of suspicious activities without extensive manual labeling.</p><p>3. We evaluate the performance of supervised fine-tuned models on a few-shot set of labeled sessions, highlighting the adaptability and accuracy of our supervised approach.</p><p>
The remainder of this paper is organized as follows: Section 2 provides related work in command shell anomaly detection; Section 3 presents the data, including dataset description, differences from previous datasets, data quality issues, and data cleaning procedures; Section 4 details our methodology, including the unsupervised and supervised approaches; Section 5 presents the experimental results and examples of suspicious activities; and Section 6 concludes the paper and outlines possible future work.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related Work</head><p>In this section, we discuss the existing literature related to detecting anomalies in Unix shell commands. We first review research in log anomaly detection and then masquerade detection. We also highlight the gaps in previous research that our proposed approach aims to address.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.">Masquerade Detection</head><p>Masquerade detection <ref type="bibr" target="#b11">[12]</ref> is a specific type of anomaly detection that focuses on identifying unauthorized users who have gained access to legitimate users' accounts or privileges and are attempting to impersonate them. The goal is to detect differences in user behavior between sessions that may indicate the presence of an attacker. In the context of Unix shell sessions, masquerade detection aims to distinguish between the normal activities of the genuine user and the suspicious actions of the masquerader. Early approaches to masquerade detection relied on traditional machine learning techniques, such as Naive Bayes <ref type="bibr" target="#b12">[13,</ref><ref type="bibr" target="#b13">14,</ref><ref type="bibr" target="#b14">15]</ref>, Support Vector Machines (SVMs) <ref type="bibr" target="#b14">[15,</ref><ref type="bibr" target="#b15">16]</ref>, and Hidden Markov Models (HMMs) <ref type="bibr" target="#b16">[17]</ref>. Deep learning techniques <ref type="bibr" target="#b17">[18,</ref><ref type="bibr" target="#b18">19]</ref>, including Convolutional Neural Networks <ref type="bibr" target="#b19">[20]</ref>, Temporal Convolutional Networks <ref type="bibr" target="#b20">[21]</ref>, and LSTM <ref type="bibr" target="#b19">[20]</ref>, have also been applied to masquerade detection, leading to improved detection accuracy.</p><p>However, these masquerade detection methods are not well-suited for detecting suspicious activities in Unix shell sessions. The goal of masquerade detection is to find imitators, while command shell anomaly detection searches for suspicious or exploitable command patterns. 
In addition, the supervised methods used in previous research can only detect anomalous sessions based on predefined rules and features from experts, which limits their flexibility and adaptability and makes it challenging to identify new or unknown threats in command shell sessions.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Data</head><p>In this section, we describe the data used for our study, including the data description and data preprocessing. Important steps for extracting and cleaning commands from the raw keystroke data are highlighted. We also discuss the characteristics of the data that make it different from previous Unix shell datasets.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Data Description</head><p>Previous datasets for Unix shell commands include the SEA dataset <ref type="bibr" target="#b21">[22]</ref>, Greenberg dataset <ref type="bibr" target="#b22">[23]</ref>, PU dataset <ref type="bibr" target="#b23">[24]</ref>, and NL2Bash <ref type="bibr" target="#b24">[25]</ref>. The SEA dataset, introduced by Schonlau et al. <ref type="bibr" target="#b21">[22]</ref>, is a widely recognized benchmark, consisting of Unix commands from 50 users, with potential masquerade attacks seeded. The Greenberg dataset, collected by Greenberg et al. <ref type="bibr" target="#b22">[23]</ref>, contains Unix commands from 168 different users of the Unix C shell, and has been used to study user behavior and evaluate masquerade detection models. The PU dataset, developed by Lane et al. <ref type="bibr" target="#b23">[24]</ref>, contains 9 sets of sanitized user data drawn from the command histories of 8 Purdue University users over 2 years. The NL2Bash dataset, collected by Lin et al. <ref type="bibr" target="#b24">[25]</ref>, contains around 10,000 pairs of English sentences and bash commands. These datasets have contributed significantly to the development and evaluation of various Unix shell anomaly detection techniques, especially in the masquerade detection area. While each dataset offers unique insights, they also have limitations, such as being outdated, containing only truncated commands without command options or subshells, lacking diversity of command usage, or not providing sufficient data for certain types of real exploits or attacks. 
Consequently, our study aims to leverage a large-scale, unlabeled dataset of Unix shell commands from real operating system users to explore novel anomaly detection approaches and address the limitations of previous datasets.</p><p>The raw data used in this research include 90 days of Unix keystroke sessions from over 15,000 users, comprising about 3 million activity objects. Among these activities, around 2.4 million objects are non-empty interactive sessions. However, the raw data have several characteristics that complicate analysis: shell prompts, command inputs, and command outputs are mixed together; shell prompts vary across sessions and within a session; long command lines are truncated at varying line lengths; command aliases vary across sessions; background process outputs are mixed with prompts and inputs; and backspaces and tab keys are missing. In order to prepare this dataset for detecting anomalies in the next step, we developed heuristics to extract and clean commands from the raw data.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Data Preprocessing</head><p>The anomaly detector for shell commands needs clean command sessions to avoid introducing excessive noise into the model. However, the raw keystroke log dataset is a mixture of commands entered by users and responses output by systems. In order to increase anomaly detection accuracy and decrease computing time, we extract user command inputs from the raw data and clean these commands. A heuristic algorithm is developed for this data preprocessing function, which is introduced briefly as follows.</p><p>In order to extract commands from the raw data, we first need to find the shell prompts. One conventional way is to use regular expressions. However, in practice, different sessions can have different shell prompts, and even within one session, the shell prompt can vary with the current working directory or subshell. Handcrafting regular expressions for each session is tedious and does not adapt to new sessions. To overcome these drawbacks, we create a list of 140 common Unix commands and a list of prompt terminal symbols ($, #, &gt;). More terminal symbols were tested, but the probability of mismatching increased. For each input line, the first occurring prompt terminal symbol is located, and the following word is tested against the common command set. If this word is a known common command, the prompt is saved; otherwise it is skipped. To avoid mismatching prompts, several rules are applied for fixing corner cases, such as removing time prefixes, checking for balanced brackets in each prompt, and excluding environment variables.</p><p>After extracting session prompts, we then extract commands from the raw data by searching for the known prompts of the session and extracting the command line after each prompt. Additional steps are applied for handling several special cases, such as removing text editor buffers and concatenating wrapped multiple-line commands. 
Some metadata are also collected for downstream use, including the numbers of output lines and error messages. After extracting commands and dropping duplicates, we obtain 1.15 million sessions.</p><p>The last step is the command cleaning process. The main goal of this step is to reduce the data noise, so the anomaly detection model can give more precise results. We apply several filters for cleaning the extracted shell commands, including removing command lines with error messages, dropping command editing buffers and shell completions, deleting long consecutive spaces and over-repeated characters, filtering command names with regular expressions, masking numbers and special words, and cleaning cyclic commands usually generated by loops from shell scripts.</p><p>The cleaned command shell sessions are then used in the next stage for both unsupervised and supervised approaches.</p></div>
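<div xmlns="http://www.tei-c.org/ns/1.0"><p>The prompt and command extraction heuristic described in this section can be sketched as follows. This is only a minimal illustration under stated assumptions, not our production implementation: the command set is a hypothetical, abbreviated stand-in for the list of 140 common commands, the function names are invented for the example, and the corner-case rules (time prefixes, balanced brackets, environment variables, wrapped lines) are omitted.</p>
<p><![CDATA[
```python
# Hypothetical, abbreviated stand-in for the list of 140 common Unix commands.
COMMON_COMMANDS = {"ls", "cd", "cat", "grep", "exit", "bash", "sudo", "ps"}
# Prompt terminal symbols named in the text.
PROMPT_TERMINALS = "$#>"


def find_prompt(line):
    """Return the prompt prefix of a line, or None if no prompt is recognized."""
    # Locate the first occurrence of any prompt terminal symbol.
    positions = [line.find(s) for s in PROMPT_TERMINALS if s in line]
    if not positions:
        return None
    end = min(positions) + 1
    words = line[end:].split()
    # Save the prompt only when the following word is a known common command.
    if words and words[0] in COMMON_COMMANDS:
        return line[:end]
    return None


def extract_commands(lines):
    """Collect the session's prompts, then return the text after any known prompt."""
    prompts = {p for p in map(find_prompt, lines) if p is not None}
    # Try longer prompts first so the most specific prompt wins.
    ordered = sorted(prompts, key=len, reverse=True)
    commands = []
    for line in lines:
        for prompt in ordered:
            if line.startswith(prompt):
                commands.append(line[len(prompt):].strip())
                break
    return commands


session = [
    "user@host:~$ ls -la",
    "total 0",
    "user@host:~$ mytool --run",
    "root@host:/# cat /etc/hosts",
]
commands = extract_commands(session)
```
]]></p>
<p>In this sketch, the second pass also recovers commands that are not in the common command set (here the hypothetical "mytool"), because their prompt was already learned from another line of the same session, while output lines such as "total 0" are dropped.</p></div>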
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Methodology</head><p>In this section, we outline the methodology of our proposed anomaly detection approach for Unix shell sessions. Our approach employs both unsupervised and supervised learning techniques. We provide a detailed description of the unsupervised ensemble anomaly detector based on the pretrained DistilBERT model and also the supervised fine-tuning of the DistilBERT model using a few labeled data.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Unsupervised Approach</head><p>The unsupervised approach of our research involves pretraining a DistilBERT <ref type="bibr" target="#b2">[3]</ref> model from Hugging Face <ref type="bibr" target="#b25">[26]</ref> on Unix shell commands and constructing an ensemble anomaly detector based on the session embeddings from the pretrained DistilBERT. This method was first proposed by CrowdStrike <ref type="bibr" target="#b26">[27,</ref><ref type="bibr" target="#b27">28]</ref> for command lines from various platforms. The unsupervised model discovers new anomaly patterns for manual review.</p><p>Since Unix shell commands differ from human languages, we pretrain a language model from scratch on Unix shell commands instead of using an existing pretrained model. BERT <ref type="bibr" target="#b0">[1]</ref> and its lighter-weight variant DistilBERT <ref type="bibr" target="#b2">[3]</ref> are state-of-the-art encoder-based transformer models that have shown remarkable performance in various natural language processing tasks, especially in understanding context and capturing complex language patterns. DistilBERT <ref type="bibr" target="#b2">[3]</ref> is selected in this research due to its balance of performance and efficiency. A WordPiece <ref type="bibr" target="#b0">[1]</ref> tokenizer, the default sub-word tokenizer for DistilBERT, is trained with a dictionary size of 30,000 for tokenizing the Unix sessions; several other dictionary sizes were also tried. The tokens are then fed into the DistilBERT model, and the model is pretrained on the masked language modeling (MLM) task to capture the inherent structure and dependencies within command sequences. The cased DistilBERT model is selected since the Unix shell is case-sensitive. This unsupervised pretraining allows the model to learn general representations of command sequences without relying on labeled data. 
Once the DistilBERT model has been pretrained, the last hidden states are used as the embeddings of the Unix shell sessions. At the end of the pretraining process, we have one contextual embedding for each command session, which represents the higher-level features of the command sequences.</p><p>To detect anomalies of Unix sessions in an unsupervised approach without fine-tuning a classification layer, four outlier detectors from PyOD <ref type="bibr" target="#b28">[29]</ref> are applied, namely principal component analysis (PCA) <ref type="bibr" target="#b29">[30,</ref><ref type="bibr" target="#b30">31]</ref>, isolation forest (IF) <ref type="bibr" target="#b31">[32,</ref><ref type="bibr" target="#b32">33]</ref>, copula-based outlier detection (COPOD) <ref type="bibr" target="#b33">[34]</ref>, and autoencoders (AE) <ref type="bibr" target="#b29">[30]</ref>, following CrowdStrike's framework <ref type="bibr" target="#b26">[27,</ref><ref type="bibr" target="#b27">28]</ref>. These four outlier detection models are trained with the session embeddings, and their decision scores are normalized for each outlier detector. For each session, all four decision scores are averaged to get the final anomaly score of that session. The anomaly score represents how far a command session deviates from the overall collection of sessions. Sessions with high anomaly scores are considered outliers, which may contain unusual command syntaxes or patterns.</p></div>
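<div xmlns="http://www.tei-c.org/ns/1.0"><p>The score combination step can be illustrated with a short sketch. It assumes that each detector has already produced one decision score per session (for example, as exposed by a fitted PyOD detector), that normalization means z-score standardization, and that the scores below are made-up values for four sessions; the function names are hypothetical.</p>
<p><![CDATA[
```python
from statistics import mean, pstdev


def standardize(scores):
    """Z-score one detector's decision scores (zero mean, unit variance)."""
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        # A constant detector carries no ranking information.
        return [0.0 for _ in scores]
    return [(s - mu) / sigma for s in scores]


def ensemble_scores(detector_scores):
    """Average the standardized scores of all detectors for each session."""
    normalized = [standardize(scores) for scores in detector_scores.values()]
    return [mean(per_session) for per_session in zip(*normalized)]


# Made-up decision scores for four sessions from the four detectors.
raw = {
    "pca": [0.1, 0.2, 0.1, 0.9],
    "if": [10.0, 12.0, 11.0, 30.0],
    "copod": [1.0, 1.5, 1.2, 4.0],
    "ae": [0.3, 0.3, 0.4, 2.0],
}
final = ensemble_scores(raw)
```
]]></p>
<p>Standardizing before averaging keeps a detector with a large raw score range, such as the hypothetical isolation forest scores above, from dominating the ensemble.</p></div>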
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Supervised Approach</head><p>The supervised part of our approach involves fine-tuning the pretrained DistilBERT model with labeled data to improve its performance in distinguishing between normal and suspicious command sequences as a binary classifier. We fine-tune the pretrained DistilBERT with SetFit (Sentence Transformer Fine-tuning) <ref type="bibr" target="#b34">[35]</ref>, which is an efficient and prompt-free framework for few-shot fine-tuning of sentence transformers. In SetFit, the transformer can be fine-tuned on a small number of text pairs in a contrastive Siamese manner with high accuracy. The results of the model fine-tuned by SetFit are compared with the conventionally fine-tuned DistilBERT and a logistic regressor trained on fixed session embeddings.</p><p>In order to fine-tune the pretrained model, examples of labeled sessions are required. Instead of labeling sessions manually, we create a table of suspicious keywords developed based on Uptycs's work <ref type="bibr" target="#b35">[36]</ref> to cover MITRE ATT&amp;CK ® techniques <ref type="bibr" target="#b36">[37,</ref><ref type="bibr" target="#b37">38]</ref> commonly used by attackers. These suspicious keywords are presented in Table <ref type="table" target="#tab_1">1</ref> with their corresponding technique IDs and names. The suspicious keywords are searched for in each Unix shell session, and sessions whose number of unique suspicious keywords reaches the threshold are considered anomalies. The setting of the labeled dataset is discussed further in the experimental results. Besides the suspicious keywords, we also created regular expressions to tag sessions with more ATT&amp;CK techniques <ref type="bibr" target="#b36">[37,</ref><ref type="bibr" target="#b37">38]</ref>. 
Those tags are used for the session annotation and analysis.</p><p>Upon completing the supervised fine-tuning phase, we evaluate the performance of our anomaly detection approach using the testing data. We assess the model's effectiveness in detecting normal and suspicious command sequences by calculating various performance metrics, including precision, recall, and F1 score. The evaluations are discussed in the next section.</p></div>
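<div xmlns="http://www.tei-c.org/ns/1.0"><p>For reference, the three metrics can be computed from true and predicted labels as in the sketch below. Treating the abnormal class as the positive class (label 1) is an assumption made for illustration, and the labels are made-up values.</p>
<p><![CDATA[
```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against empty denominators when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up labels: 1 marks an abnormal session, 0 a normal one.
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```
]]></p></div>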
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Experimental Results</head><p>In this section, we present the experimental results for both unsupervised and supervised anomaly detection methods applied to Unix shell commands. We first evaluate the unsupervised model with the pretrained DistilBERT embedding and the ensemble anomaly detector on the unlabeled data and then evaluate performance of the supervised model with labeled sessions.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1.">Unsupervised Approach Results</head><p>In order to evaluate the unsupervised model and understand its performance, several analyses are done, including visualizing distributions of anomaly scores and embedding vectors, investigating relations between the anomaly scores and numbers of tokens and command lines, and comparing anomaly scores of the common shell commands.</p><p>The distribution of anomaly scores is shown in Figure <ref type="figure">2a</ref>. Since the anomaly scores have already been standardized, the mean and standard deviation of the distribution are 0 and 1 respectively. The distribution of anomaly scores is close to a normal distribution, where most sessions are observed around the mean, while some outliers have higher anomaly scores than most sessions. In addition, the anomaly scores from the four anomaly detectors for the top 100 anomalies are shown in Figure <ref type="figure">2b</ref>, where COPOD usually has the highest anomaly scores, while IF tends to be on the lower side with a higher variance. For most of these sessions, the four anomaly detectors behave consistently and assign high anomaly scores.</p><p>To further understand the behavior of the unsupervised model, the anomaly scores are presented against the number of tokens and the number of command lines in Figure <ref type="figure" target="#fig_6">3a</ref> and Figure <ref type="figure" target="#fig_6">3b</ref>. Generally speaking, a session with more tokens and more command lines tends to have a higher anomaly score. 
This is because shorter sessions usually contain only simple syntax for straightforward, repetitive daily usage, while longer sessions can contain long command sequences that perform complicated and uncommon tasks, which the unsupervised model scores highly because of their unusual command structure and syntax.</p><p>At the end of the unsupervised model analysis, we show the anomaly scores for the top 50 common commands in Figure <ref type="figure" target="#fig_2">4</ref>. Each of these scores is a weighted average of the anomaly scores of the sessions in which the command appears. Most common commands, such as "ls", "exit", "bash", and so on, have lower anomaly scores, while "alias" and "l" have higher anomaly scores. In most cases, there is no clear explanation of the relation between the command names and their anomaly scores, since those anomaly scores are averaged over their sessions and can be affected by the session structures. But in general, infrequent commands have higher anomaly scores.</p><p>In summary, a session with a high anomaly score does not always contain suspicious activity. However, anomaly scores can be used for prioritizing command sessions for expert analysis and can also help monitoring experts discover new suspicious patterns. The unclear relations and uncertainties of the unsupervised model results motivate us to build and evaluate supervised models, which are discussed next. Further investigation of the relations between anomaly scores and suspicious activities, and of the language structure of shell commands, is left for future research.</p><p>In addition to the Unix shell, similar analyses are also done for subshell commands. During the command cleaning, we removed subshells that have different prompts from the Unix shells, such as HDFS, Spark, SQL, and Python. Those subshells are extracted separately, and an unsupervised anomaly detector with the same structure is applied to each subshell. 
Anomaly scores are assigned to the subshell sessions, which are also scanned for specific exploits. Analyzing the experiment results from subshell anomaly detection is beyond the scope of this paper.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2.">Supervised Approach Results</head><p>To evaluate the supervised models, we label the command sessions by the number of suspicious keywords as described in the methodology. If a session has at least three unique suspicious keywords, it is considered an abnormal session. If a session has zero suspicious keywords, it is labeled as a normal session. All other sessions are labeled as abstained sessions and are removed from model evaluations, since there is no strong criterion to classify them into either class. The labeled dataset is split into training and testing sets by 90:10, and the numbers of sessions in each class are shown in Table <ref type="table" target="#tab_2">2</ref>. During experiments, we use the same number of normal and abnormal sessions from the training data and combine them into a few-shot training set. Since the evaluation results from a small training set are unstable, we run 5 experiments for each model and each number of samples per class. For models fine-tuned with SetFit <ref type="bibr" target="#b34">[35]</ref>, we use the batch size 16, learning rate 1e-5, number of iterations 20 (number of text pairs), and train each model for 1 epoch. For fine-tuned DistilBERT models, we use the learning rate 1e-5, and each model is trained for 5 epochs. The averaged precisions, recalls, and F1 scores are reported in Figure <ref type="figure" target="#fig_4">5</ref> and Table <ref type="table" target="#tab_3">3</ref>. The fine-tuned SetFit model with 2048 samples per class shows the best result, outperforming the fine-tuned DistilBERT with the same training data size. The fixed DistilBERT embedding with logistic regression gives the lowest result. This observation shows the advantage of SetFit for fine-tuning pretrained models when the labeled data are limited. Also, the model performance increases as the number of samples per class increases. 
The experimental results of the supervised models show the feasibility of creating a small set of manually labeled command sessions, fine-tuning a pretrained model with SetFit, and then using it for classifying more sessions automatically.</p></div>
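<div xmlns="http://www.tei-c.org/ns/1.0"><p>The labeling rule used in this section can be sketched as follows. The keyword set is a hypothetical, abbreviated placeholder for Table 1, and matching keywords against whitespace-separated tokens is an assumption made for the example rather than a description of the actual matching procedure.</p>
<p><![CDATA[
```python
# Hypothetical placeholder keywords; the actual list is given in Table 1.
SUSPICIOUS_KEYWORDS = {"nc", "base64", "chattr", "/etc/shadow", "wget"}
# At least three unique suspicious keywords mark a session as abnormal.
ABNORMAL_THRESHOLD = 3


def label_session(commands):
    """Label a session as 'abnormal', 'normal', or 'abstain'."""
    tokens = set()
    for command in commands:
        tokens.update(command.split())
    # Count unique keywords only, so repeating one keyword does not add up.
    hits = tokens.intersection(SUSPICIOUS_KEYWORDS)
    if len(hits) >= ABNORMAL_THRESHOLD:
        return "abnormal"
    if not hits:
        return "normal"
    # One or two keywords: no strong criterion, so the session is excluded.
    return "abstain"


labels = [
    label_session(["ls -la", "cd /tmp"]),
    label_session(["wget http://example.invalid/p", "wget http://example.invalid/q"]),
    label_session(["wget http://example.invalid/p", "base64 -d p", "cat /etc/shadow"]),
]
```
]]></p>
<p>Sessions labeled "abstain" would be dropped before the 90:10 split, leaving only the two classes for training and evaluation.</p></div>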
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.3.">Session Annotations and Examples</head><p>Besides experiments and evaluations of unsupervised and supervised models, we also annotated sessions with MITRE ATT&amp;CK ® techniques in addition to previously mentioned suspicious keywords and anomaly scores. These annotations can help cybersecurity experts recognize and analyze suspicious activity.</p><p>During the annotation process, Unix shell sessions are labeled by searching for 58 MITRE ATT&amp;CK ® techniques with corresponding regular expressions. For each technique, we search for specific command usages and file accesses. The distributions of techniques are shown in Figure <ref type="figure">6</ref>, and the tactics are shown in Figure <ref type="figure">7</ref>. The most common techniques are T1057 Process Discovery, T1082 System Information Discovery, and T1105 Ingress Tool Transfer, although sessions with less common techniques are more interesting to analyze for anomaly detection. Three session examples with high anomaly scores are selected and presented in Tables <ref type="table" target="#tab_5">4</ref>-6, where ATT&amp;CK techniques are highlighted in blue and suspicious keywords in red. The first example in Table <ref type="table" target="#tab_5">4</ref> shows remote command execution of a transient web server with potential for data exfiltration. The second example in Table <ref type="table">5</ref> shows potential data exfiltration and credential exposure subject to discovery via process discovery. The last example in Table <ref type="table" target="#tab_6">6</ref> illustrates disk clearing and boot load configuration changes.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Table 5</head><p>An example of potential data exfiltration and credential exposure subject to discovery via process discovery. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Conclusions</head><p>Anomaly detection for interactive command shells is a complex problem. Detection of anomalies is needed as a cybersecurity safeguard because privileged access at the shell level provides the opportunity for a range of attacks that threaten critical enterprise infrastructure, data, and services. On the other hand, preventing such threats by locking down system access also blocks important operational activities such as upgrades, change management, and outage investigation and remediation.</p><p>Prior research has been limited by available datasets. We presented the first published results on keystroke anomaly detection using an enterprise-scale dataset captured from production systems over a 90-day period. The extent of the dataset, 1.15 million sessions captured from over 15,000 users, demonstrates the need for automated anomaly detection. The dataset came with significant data extraction and cleaning issues but provides a rich cross-section of enterprise operations activities. Notably, the monitored infrastructure in the dataset excludes network appliances and specialized embedded systems and is otherwise representative of widely used information technology.</p><p>Past research has also been limited by available models. We presented the first experimental results of using a transformer model, specifically DistilBERT, for keystroke log anomaly detection of Unix shells, with both unsupervised and supervised approaches. Although the dataset is unlabeled, we tagged each session using two existing schemes: MITRE ATT&amp;CK ® techniques and suspicious keywords. Unix shell sessions with high anomaly scores were then cross-checked against the tags as part of validating the utility of the anomaly model for operations use. Model output was also compared with rule-based log analysis scripts used by operations teams. 
The results of the cross-check show that the outliers found by the model contain significant cases found by neither the tagging nor the existing analysis scripts. Future research could design tokenizers specific to shell commands, study the implicit relations between anomaly scores and suspicious activities, and analyze subshell command anomalies.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Pipeline of the command shell session anomaly detection with both unsupervised and supervised methods.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Distributions of averaged anomaly scores and all anomaly scores from four anomaly detection models.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: Averaged anomaly score for common command names.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 5 :</head><label>5</label><figDesc>Figure 5: F1 scores of three supervised models with different training sizes.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_5"><head>4 - 6 -</head><label>46</label><figDesc>Activity id = *1e1BD9. Anomaly score = 1.8919. Suspicious keywords = [kill: 3, wget: 21] 1 &lt;lines removed&gt; 2 salt "WH" cmd.run "python -m SimpleHTTPServer # --directory /sqldata/ms_backups/" bg=trues/WH_test_db_FU 3 salt "WH" cmd.run "ps aux | grep '[S]impleHTTPServer #' | awk '{print $#}' |xargs kill -9 "/WH_test_db_FUWH: &gt; [T1057: Process Discovery, T1489: Service Stop]5salt "WH" cmd.run "cd /sqldata/dbmigration;wget http://&lt;host:port&gt;//sqldata/ms_backups/WH_test_db_FU launch transient web server on remote host. Line 3: terminate the server. Line 4 and 6: ATT&amp;CK tags inserted by processing pipeline. Line 5: transfer data from web server using wget.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_6"><head>3 -</head><label>3</label><figDesc>Activity id = *1c01C8. Anomaly score = 1.9754. Suspicious keywords = [curl: 12] 1 &lt;lines removed&gt; 2 curl -T server_support.tar.gz -u&lt;username&gt;:&lt;plaintext_credentials&gt; &lt;externalhost&gt; /dropzone/uploads</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_7"><head></head><label></label><figDesc>Activity id = *b41A0E. Anomaly score = 3.1271. Suspicious keywords = [chmod: 2, df: 1, wget: 1] 1 &lt;lines removed&gt; 2 ansible all -i &lt;INVENTORY&gt; -m shell -a "uptime;grep Start /etc/INSTALL_CLASS;cat /etc/redhat-release" -o&gt; 3 -&gt; [T1082: System Information Discovery]&gt; 4 ansible all -i &lt;INVENTORY&gt; -m shell -a "cd /root;chmod HFF diskwipe.sh;./diskwipe.sh" -b&gt; 5 -&gt; [T1222.002: File and Directory Permissions Modification -Linux and Mac File and Directory Permissions Mod]&gt; 6 ansible all -i &lt;INVENTORY&gt; -m shell -a "/sbin/service ambari-agent restart" -become -b&gt; 7 &lt;lines removed&gt;&gt; 8 ansible all -i &lt;INVENTORY&gt; -m shell -a "cd /boot/grub#;cp -p grub.cfg grub.cfg.bkp" -b&gt; 9 ansible all -i &lt;INVENTORY&gt; -m shell -a "/sbin/grubby --args=transparent_hugepage=never --update-kernel=ALL " -b&gt; 10 &lt;lines removed&gt; Details: Lines 1, 7, and 10 omitted for brevity. Lines 3 and 5 are automatic annotations added by the pipeline. Line 2: remote command to check system details. Line 4: remote command to clear disk prior to install. Line 6: restart Hadoop monitoring agent. Lines 8 and 9: modify boot loader.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 1</head><label>1</label><figDesc>Suspicious keywords and MITRE ATT&amp;CK ® techniques.</figDesc><table><row><cell>ATT&amp;CK Technique ID</cell><cell>ATT&amp;CK Technique Name</cell><cell>Suspicious Keywords</cell></row><row><cell>T1018</cell><cell>Remote System Discovery</cell><cell>arp, ping, ip, hosts</cell></row><row><cell>T1033</cell><cell>System Owner/User Discovery</cell><cell>whoami, who, w, users, USER</cell></row><row><cell>T1049</cell><cell>System Network Connections Discovery</cell><cell>netstat, lsof, who, w</cell></row><row><cell>T1016</cell><cell>System Network Configuration Discovery</cell><cell>arp, ipconfig, ifconfig, nbtstat, netstat, route, ping, ip</cell></row><row><cell>T1082</cell><cell>System Information Discovery</cell><cell>df, uname, hostname, env, lspci, lscpu, lsmod, dmidecode, systeminfo</cell></row><row><cell>T1087</cell><cell>Account Discovery</cell><cell>id, groups, lastlog, ldapsearch</cell></row><row><cell>T1069</cell><cell>Permission Groups Discovery</cell><cell>groups, id, ldapsearch</cell></row><row><cell>T1040</cell><cell>Network Sniffing</cell><cell>tcpdump, tshark</cell></row><row><cell>T1574.006</cell><cell>Hijack Execution Flow: Dynamic Linker Hijacking</cell><cell>ld.so.preload, LD_PRELOAD</cell></row><row><cell>T1547.006</cell><cell>Boot or Logon Autostart Execution: Kernel Modules and Extensions</cell><cell>modprobe, insmod, lsmod, rmmod, modinfo</cell></row><row><cell>T1136</cell><cell>Create Account</cell><cell>useradd, adduser</cell></row><row><cell>T1053.003</cell><cell>Scheduled Task/Job: Cron</cell><cell>crontab, cron</cell></row><row><cell>T1489</cell><cell>Service Stop</cell><cell>kill, pkill</cell></row><row><cell>T1562.001</cell><cell>Impair Defenses: Disable or Modify Tools</cell><cell>systemctl</cell></row><row><cell>T1105</cell><cell>Ingress Tool Transfer</cell><cell>curl, scp, sftp, tftp, rsync, finger, wget</cell></row><row><cell>T1222.002</cell><cell>File and Directory Permissions Modification: Linux and Mac File and Directory Permissions Modification</cell><cell>chown, chmod, chgrp, chattr</cell></row><row><cell>T1003.008</cell><cell>OS Credential Dumping: /etc/passwd and /etc/shadow</cell><cell>passwd, shadow</cell></row><row><cell>T1070.003</cell><cell>Indicator Removal: Clear Command History</cell><cell>.bash_history, HISTFILE, HISTFILESIZE</cell></row><row><cell>T1548.003</cell><cell>Abuse Elevation Control Mechanism: Sudo and Sudo Caching</cell><cell>sudo, sudoers</cell></row><row><cell>T1546.004</cell><cell>Event Triggered Execution: Unix Shell Configuration Modification</cell><cell>profile, profile.d, .profile, .bash_profile, .bash_login, .bashrc, .bash_logout</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 2</head><label>2</label><figDesc>Number of sessions in the normal, abnormal, and abstained classes.</figDesc><table><row><cell>Class</cell><cell>Number of Unique Suspicious Keywords</cell><cell>Number of Samples</cell><cell>Training Set (90%)</cell><cell>Testing Set (10%)</cell></row><row><cell>Normal</cell><cell>= 0</cell><cell>790,363</cell><cell>711,327</cell><cell>79,036</cell></row><row><cell>Abnormal</cell><cell>&gt;= 3</cell><cell>28,413</cell><cell>25,571</cell><cell>2,842</cell></row><row><cell>Abstained (no label)</cell><cell>In between</cell><cell>335,322</cell><cell>-</cell><cell>-</cell></row><row><cell>Total</cell><cell>-</cell><cell>1,154,098</cell><cell>736,898</cell><cell>81,878</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 3</head><label>3</label><figDesc>Evaluation results of three supervised models with different training sizes.</figDesc><table><row><cell>Model</cell><cell cols="3">Logistic Regression</cell><cell cols="3">Fine-tuned DistilBERT</cell><cell cols="3">Fine-tuned DistilBERT with SetFit</cell></row><row><cell>Number of Samples per Class</cell><cell>Precision</cell><cell>Recall</cell><cell>F1 Score</cell><cell>Precision</cell><cell>Recall</cell><cell>F1 Score</cell><cell>Precision</cell><cell>Recall</cell><cell>F1 Score</cell></row><row><cell>16</cell><cell>0.1464</cell><cell>0.7860</cell><cell>0.2454</cell><cell>0.1632</cell><cell>0.5578</cell><cell>0.2513</cell><cell>0.1569</cell><cell>0.8287</cell><cell>0.2622</cell></row><row><cell>32</cell><cell>0.1625</cell><cell>0.8248</cell><cell>0.2711</cell><cell>0.1995</cell><cell>0.6977</cell><cell>0.3036</cell><cell>0.2059</cell><cell>0.8930</cell><cell>0.3331</cell></row><row><cell>64</cell><cell>0.1713</cell><cell>0.8754</cell><cell>0.2862</cell><cell>0.1625</cell><cell>0.8418</cell><cell>0.2716</cell><cell>0.2712</cell><cell>0.9484</cell><cell>0.4210</cell></row><row><cell>128</cell><cell>0.1922</cell><cell>0.8849</cell><cell>0.3155</cell><cell>0.1703</cell><cell>0.9098</cell><cell>0.2864</cell><cell>0.3909</cell><cell>0.9758</cell><cell>0.5563</cell></row><row><cell>256</cell><cell>0.2070</cell><cell>0.8890</cell><cell>0.3356</cell><cell>0.3230</cell><cell>0.9663</cell><cell>0.4840</cell><cell>0.4819</cell><cell>0.9850</cell><cell>0.6459</cell></row><row><cell>512</cell><cell>0.2308</cell><cell>0.9027</cell><cell>0.3676</cell><cell>0.4900</cell><cell>0.9774</cell><cell>0.6524</cell><cell>0.5845</cell><cell>0.9866</cell><cell>0.7337</cell></row><row><cell>1024</cell><cell>0.2631</cell><cell>0.9188</cell><cell>0.4090</cell><cell>0.6483</cell><cell>0.9854</cell><cell>0.7819</cell><cell>0.7134</cell><cell>0.9900</cell><cell>0.8290</cell></row><row><cell>2048</cell><cell>0.2944</cell><cell>0.9267</cell><cell>0.4467</cell><cell>0.7534</cell><cell>0.9899</cell><cell>0.8555</cell><cell>0.7934</cell><cell>0.9894</cell><cell>0.8802</cell></row></table></figure>
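<div xmlns="http://www.tei-c.org/ns/1.0"><p>As a reading aid for Table 3, the sketch below shows how precision, recall, and F1 score can be computed for one run and then averaged across runs. It assumes that the abnormal class is the positive class and that each metric is averaged independently over the 5 runs; the function names are introduced here for illustration and are not taken from the paper.</p>
<p>
```python
from statistics import mean


def precision_recall_f1(y_true, y_pred, positive="abnormal"):
    """Compute precision, recall, and F1 score for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def average_over_runs(run_metrics):
    """Average each metric independently across runs.

    run_metrics is a list of (precision, recall, f1) tuples, one per run.
    """
    return tuple(mean(values) for values in zip(*run_metrics))
```
</p>
<p>Because each metric is averaged separately, the mean F1 score of several runs generally differs from the F1 score computed from the mean precision and mean recall, so the F1 columns of Table 3 need not be exactly recoverable from the other two columns. The test set is also imbalanced (79,036 normal versus 2,842 abnormal sessions), which is consistent with high recall appearing alongside low precision at small training sizes.</p></div>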
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_4"><head></head><label></label><figDesc>Number of sessions for different MITRE ATT&amp;CK ® tactics.</figDesc><table><row><cell>Figure 6: Number of sessions for different MITRE ATT&amp;CK ® techniques.</cell></row><row><cell>Axes: Technique; Number of Sessions (ticks at 0, 100000, 200000).</cell></row><row><cell>Techniques in plotted order: T1057, T1082, T1105, T1222.002, T1083, T1070.004, T1489, T1485, T1018, T1049, T1087.001, T1560.001, T1033, T1069.001, T1016, T1543.002, T1546.004, T1552.004, T1053.003, T1003.007, T1529, T1486, T1003.008, T1053.002, T1053.006, T1098.004, T1553.004, T1113, T1562.001, T1562.003, T1614.001, T1552.001, T1552.003, T1040, T1548.001, T1070.003, T1007, T1547.006, T1218, T1201, T1562.006, T1087.002, T1069.002, T1136.001, T1070.002, T1070.007, T1046, T1548.003, T1562, T1574.006, T1546.005, T1037.004, T1115, T1562.004, T1135, T1547.013, T1546.016, T1558</cell></row><row><cell>Figure 7: Number of sessions for different MITRE ATT&amp;CK ® tactics.</cell></row><row><cell>Axes: Tactic; Count (ticks at 0, 100000, 200000, 300000, 400000).</cell></row><row><cell>Tactics in plotted order: Impact, Exfiltration, Command and Control, Collection, Lateral Movement, Discovery, Credential Access, Defense Evasion, Privilege Escalation, Persistence, Execution, Initial Access, Resource Development, Reconnaissance</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_5"><head>Table 4</head><label>4</label><figDesc>An example of remote command execution of a transient web server with potential for data exfiltration.</figDesc><table /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_6"><head>Table 6</head><label>6</label><figDesc>An example of disk clearing and boot loader configuration changes.</figDesc><table /></figure>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<author>
			<persName><forename type="first">J</forename><surname>Devlin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M.-W</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Toutanova</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1810.04805</idno>
		<title level="m">BERT: Pre-training of deep bidirectional transformers for language understanding</title>
				<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<title level="m" type="main">Improving language understanding by generative pre-training</title>
		<author>
			<persName><forename type="first">A</forename><surname>Radford</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Narasimhan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Salimans</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Sutskever</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<author>
			<persName><forename type="first">V</forename><surname>Sanh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Debut</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Chaumond</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Wolf</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1910.01108</idno>
		<title level="m">DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">A survey on log anomaly detection using deep learning</title>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">B</forename><surname>Yadav</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">S</forename><surname>Kumar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">V</forename><surname>Dhavale</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">2020 8th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions)(ICRITO)</title>
				<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="1215" to="1220" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Log-based anomaly detection with deep learning: How far are we?</title>
		<author>
			<persName><forename type="first">V.-H</forename><surname>Le</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Zhang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 44th international conference on software engineering</title>
				<meeting>the 44th international conference on software engineering</meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="1356" to="1367" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Long short-term memory</title>
		<author>
			<persName><forename type="first">S</forename><surname>Hochreiter</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Schmidhuber</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Neural computation</title>
		<imprint>
			<biblScope unit="volume">9</biblScope>
			<biblScope unit="page" from="1735" to="1780" />
			<date type="published" when="1997">1997</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Attention is all you need</title>
		<author>
			<persName><forename type="first">A</forename><surname>Vaswani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Shazeer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Parmar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Uszkoreit</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Jones</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">N</forename><surname>Gomez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ł</forename><surname>Kaiser</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Polosukhin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Advances in neural information processing systems</title>
		<imprint>
			<biblScope unit="volume">30</biblScope>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">DeepLog: Anomaly detection and diagnosis from system logs through deep learning</title>
		<author>
			<persName><forename type="first">M</forename><surname>Du</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Zheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Srikumar</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2017 ACM SIGSAC conference on computer and communications security</title>
				<meeting>the 2017 ACM SIGSAC conference on computer and communications security</meeting>
		<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="1285" to="1298" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Robust log-based anomaly detection on unstable log data</title>
		<author>
			<persName><forename type="first">X</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Lin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Qiao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Dang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Xie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Cheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Li</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering</title>
				<meeting>the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="807" to="817" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">LogBERT: Log anomaly detection via BERT</title>
		<author>
			<persName><forename type="first">H</forename><surname>Guo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Yuan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Wu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">2021 international joint conference on neural networks (IJCNN), IEEE</title>
				<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="1" to="8" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Log-based anomaly detection without log parsing</title>
		<author>
			<persName><forename type="first">V.-H</forename><surname>Le</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Zhang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">36th IEEE/ACM International Conference on Automated Software Engineering (ASE), IEEE</title>
				<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="492" to="504" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">A survey on masquerader detection approaches</title>
		<author>
			<persName><forename type="first">M</forename><surname>Bertacchini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Fierens</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of V Congreso Iberoamericano de Seguridad Informática</title>
				<meeting>V Congreso Iberoamericano de Seguridad Informática</meeting>
		<imprint>
			<publisher>Universidad de la República de Uruguay</publisher>
			<date type="published" when="2008">2008</date>
			<biblScope unit="page" from="46" to="60" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Masquerade detection using truncated command lines</title>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">A</forename><surname>Maxion</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">N</forename><surname>Townsend</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings international conference on dependable systems and networks</title>
				<meeting>international conference on dependable systems and networks</meeting>
		<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2002">2002</date>
			<biblScope unit="page" from="219" to="228" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Masquerade detection using enriched command lines</title>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">A</forename><surname>Maxion</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">2003 International Conference on Dependable Systems and Networks</title>
				<imprint>
			<date type="published" when="2003">2003</date>
			<biblScope unit="page" from="5" to="5" />
		</imprint>
	</monogr>
	<note>Proceedings., IEEE Computer Society</note>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">One-class training for masquerade detection</title>
		<author>
			<persName><forename type="first">K</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">J</forename><surname>Stolfo</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Workshop on Data Mining for Computer Security</title>
				<imprint>
			<date type="published" when="2003">2003</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Empirical evaluation of SVM-based masquerade detection using Unix commands</title>
		<author>
			<persName><forename type="first">H.-S</forename><surname>Kim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S.-D</forename><surname>Cha</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Computers &amp; Security</title>
		<imprint>
			<biblScope unit="volume">24</biblScope>
			<biblScope unit="page" from="160" to="168" />
			<date type="published" when="2005">2005</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">HMMs based masquerade detection for network security on with parallel computing</title>
		<author>
			<persName><forename type="first">J</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Duan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Tian</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Computer Communications</title>
		<imprint>
			<biblScope unit="volume">156</biblScope>
			<biblScope unit="page" from="168" to="173" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Deep learning approaches for predictive masquerade detection</title>
		<author>
			<persName><forename type="first">W</forename><surname>Elmasry</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Akbulut</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">H</forename><surname>Zaim</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Security and Communication Networks</title>
		<imprint>
			<biblScope unit="volume">2018</biblScope>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Deep learning for insider threat detection: Review, challenges and opportunities</title>
		<author>
			<persName><forename type="first">S</forename><surname>Yuan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Wu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Computers &amp; Security</title>
		<imprint>
			<biblScope unit="volume">104</biblScope>
			<biblScope unit="page">102221</biblScope>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">A conceptual hybrid model of deep convolutional neural network (DCNN) and long short-term memory (LSTM) for masquerade attack detection</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">A</forename><surname>Azeezat</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><forename type="middle">S</forename><surname>Adebukola</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A.-A</forename><surname>Adebayo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><forename type="middle">B</forename><surname>Olushola</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Information and Communication Technology and Applications: Third International Conference, ICTA 2020</title>
				<meeting><address><addrLine>Minna, Nigeria</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2020">November 24-27, 2020. 2021</date>
			<biblScope unit="volume">3</biblScope>
			<biblScope unit="page" from="170" to="184" />
		</imprint>
	</monogr>
	<note>Revised Selected Papers</note>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Masquerade detection based on temporal convolutional network</title>
		<author>
			<persName><forename type="first">H</forename><surname>Zhai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Zou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Zheng</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE 25th International Conference on Computer Supported Cooperative Work in Design (CSCWD)</title>
				<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="305" to="310" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<author>
			<persName><forename type="first">M</forename><surname>Schonlau</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Dumouchel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W.-H</forename><surname>Ju</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">F</forename><surname>Karr</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Theus</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Vardi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Computer intrusion: Detecting masquerades</title>
				<imprint>
			<date type="published" when="2001">2001</date>
			<biblScope unit="page" from="58" to="74" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<monogr>
		<title level="m" type="main">Using unix: Collected traces of 168 users</title>
		<author>
			<persName><forename type="first">S</forename><surname>Greenberg</surname></persName>
		</author>
		<idno>88/333/45</idno>
		<imprint>
			<date type="published" when="1988">1988</date>
			<pubPlace>Calgary, Alberta</pubPlace>
		</imprint>
		<respStmt>
			<orgName>Department of Computer Science, University of Calgary</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">Research Report</note>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">An application of machine learning to anomaly detection</title>
		<author>
			<persName><forename type="first">T</forename><surname>Lane</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">E</forename><surname>Brodley</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 20th national information systems security conference</title>
				<meeting>the 20th national information systems security conference<address><addrLine>Baltimore, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="1997">1997</date>
			<biblScope unit="volume">377</biblScope>
			<biblScope unit="page" from="366" to="380" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Nl2bash: A corpus and semantic parser for natural language interface to the linux operating system</title>
		<author>
			<persName><forename type="first">X</forename><forename type="middle">V</forename><surname>Lin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Zettlemoyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">D</forename><surname>Ernst</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC)</title>
				<meeting>the Eleventh International Conference on Language Resources and Evaluation (LREC)</meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">Transformers: State-of-the-art natural language processing</title>
		<author>
			<persName><forename type="first">T</forename><surname>Wolf</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Debut</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Sanh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Chaumond</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Delangue</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Moi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Cistac</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Rault</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Louf</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Funtowicz</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2020 conference on empirical methods in natural language processing: system demonstrations</title>
				<meeting>the 2020 conference on empirical methods in natural language processing: system demonstrations</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="38" to="45" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<monogr>
		<author>
			<persName><forename type="first">S.-B</forename><surname>Cocea</surname></persName>
		</author>
		<ptr target="https://www.crowdstrike.com/blog/bert-embeddings-new-approach-for-command-line-anomaly-detection/" />
		<title level="m">BERT embeddings: A modern machine-learning approach for detecting malware from command lines (part 1 of 2)</title>
				<imprint>
			<date type="published" when="2022-06-01">2022. 2022-06-01</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<monogr>
		<author>
			<persName><forename type="first">C</forename><surname>Popa</surname></persName>
		</author>
		<title level="m">BERT embeddings: A modern machine-learning approach for detecting malware from command lines (part 2 of 2)</title>
				<imprint>
			<date type="published" when="2022-06-01">2022. 2022-06-01</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b28">
	<monogr>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Nasrullah</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Li</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1901.01588</idno>
		<title level="m">Pyod: A python toolbox for scalable outlier detection</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b29">
	<monogr>
		<title level="m" type="main">Outlier Analysis</title>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">C</forename><surname>Aggarwal</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2016">2016</date>
			<publisher>Springer Publishing Company, Incorporated</publisher>
		</imprint>
	</monogr>
	<note>2nd ed</note>
</biblStruct>

<biblStruct xml:id="b30">
	<analytic>
		<title level="a" type="main">A novel anomaly detection scheme based on principal component classifier</title>
		<author>
			<persName><forename type="first">M.-L</forename><surname>Shyu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S.-C</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Sarinnapakorn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Chang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the IEEE foundations and new directions of data mining workshop</title>
				<meeting>the IEEE foundations and new directions of data mining workshop</meeting>
		<imprint>
			<publisher>IEEE Press</publisher>
			<date type="published" when="2003">2003</date>
			<biblScope unit="page" from="172" to="179" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b31">
	<analytic>
		<title level="a" type="main">Isolation forest</title>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">T</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">M</forename><surname>Ting</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z.-H</forename><surname>Zhou</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">2008 eighth IEEE international conference on data mining</title>
				<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2008">2008</date>
			<biblScope unit="page" from="413" to="422" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b32">
	<analytic>
		<title level="a" type="main">Isolation-based anomaly detection</title>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">T</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">M</forename><surname>Ting</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z.-H</forename><surname>Zhou</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Transactions on Knowledge Discovery from Data (TKDD)</title>
		<imprint>
			<biblScope unit="volume">6</biblScope>
			<biblScope unit="page" from="1" to="39" />
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b33">
	<analytic>
		<title level="a" type="main">Copod: copula-based outlier detection</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Botta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Ionescu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Hu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">2020 IEEE international conference on data mining (ICDM)</title>
				<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="1118" to="1123" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b34">
	<monogr>
		<title level="m" type="main">Efficient few-shot learning without prompts</title>
		<author>
			<persName><forename type="first">L</forename><surname>Tunstall</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Reimers</surname></persName>
		</author>
		<author>
			<persName><forename type="first">U</forename><forename type="middle">E S</forename><surname>Jo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Bates</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Korat</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Wasserblat</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Pereg</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2209.11055</idno>
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b35">
	<monogr>
		<author>
			<persName><forename type="first">P</forename><surname>Salunkhe</surname></persName>
		</author>
		<ptr target="https://www.uptycs.com/blog/linux-commands-and-utilities-commonly-used-by-attackers" />
		<title level="m">Linux commands &amp; utilities commonly used by attackers</title>
				<imprint>
			<date type="published" when="2021">2021. 2022-10-01</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b36">
	<monogr>
		<author>
			<orgName>The MITRE Corporation</orgName>
		</author>
		<ptr target="https://attack.mitre.org/techniques/enterprise" />
		<title level="m">MITRE ATT&amp;CK® enterprise techniques</title>
				<imprint>
			<date type="published" when="2023-03-01">2023. 2023-03-01</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b37">
	<monogr>
		<author>
			<orgName>Red Canary</orgName>
		</author>
		<ptr target="https://github.com/redcanaryco/atomic-red-team" />
		<title level="m">Atomic Red Team™</title>
				<imprint>
			<date type="published" when="2023-03-01">2023. 2023-03-01</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
