Semantic Models for Network Intrusion Detection

Peter Bednar, Martin Sarnovsky, Pavol Halas
Department of Artificial Intelligence and Cybernetics
Technical University of Kosice
Kosice, Slovakia
{name.surname}@tuke.sk

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Abstract - This paper describes the design and validation of a hierarchical intrusion detection system (IDS) that combines a machine learning approach with knowledge-based methods. As the knowledge model, we propose an ontology of network attacks, which allows us to decompose the detection and classification of existing attack types and to formalize detection rules for new types. The designed IDS was evaluated on the widely used KDD 99 dataset and compared to similar approaches.

Keywords - ontologies, network security incidents, machine learning

I. INTRODUCTION

With the extensive use of information and communication technologies (ICT), the number and variety of security attacks grow. This is also reflected in the growing budgets that companies and public institutions invest in security. To cope with this situation, new and innovative techniques are being applied to automate security management [1].

Two main approaches to ICT security can currently be observed. The first is data-oriented and is based on the application of machine learning techniques to proactively achieve the best possible prediction of new attacks [2][3][4][5]. The second is more user-centric and is based on the application of knowledge modelling techniques to model user behavior and the ICT environment [8][9][10].

This paper tries to combine these two approaches into a single system, in which domain knowledge about the types, effects and severity of attacks is used to decompose the intrusion detection task into classification subtasks that can be handled more efficiently and with less training data. The design of the proposed intrusion detection system is symmetrical in the sense that both approaches (machine learning and knowledge-based) are equal and mutually contribute to addressing the challenges of detecting and preventing security threats.

The rest of this paper is organized as follows. Section II presents the hierarchical knowledge model in the form of an ontology, which is used to decompose the detection problem and which provides additional contextual information. Its final part describes the implemented machine learning models and how these models are combined with the knowledge in the ontology. Section III then presents the experimental evaluation of the proposed combined approach: we first define quantitative evaluation metrics and then summarize the performance of the system on the standard benchmark dataset from the KDD Cup competition.

II. HIERARCHICAL KNOWLEDGE-BASED INTRUSION DETECTION SYSTEM

A. Overall system architecture

The main objective of the proposed architecture is to hierarchically decompose the detection and classification of intrusions according to the types of attacks. For this decomposition we propose the Network Intrusion Ontology, the main part of which is formalized as a taxonomy of attack types. The ontology makes it possible to capture all knowledge related to the known attack types, including descriptions of rare cases that are difficult to detect using machine learning methods.

The detection and classification process is decomposed into the following phases:

1. Coarse attack/normal classification - this phase is implemented using a machine learning algorithm that distinguishes normal traffic from attacks. If a network connection is labelled as normal, no alarm is raised. Otherwise, the suspicious connection is processed by a set of models that determine the class of attack in phase 2.

2. Attack class and type prediction - this phase is guided by the taxonomy of attacks from the Network Intrusion Ontology. The system processes the taxonomy hierarchically and selects the appropriate model to classify the instance at a particular level of the class hierarchy. The model can be a machine learning model statistically inferred from the training data, or a rule-based model formalized using the classes and relations from the ontology.

3. When a class of attack is predicted, the ontology is queried for all relevant sub-types of that attack class and for a suitable model to predict the particular sub-type. The knowledge model can also be used to extract specific domain-related information as a new attribute, which can be used either to improve the classifier's performance or to provide contextual, domain-specific information that complements the predictive model.

The details of the predictive models and their evaluation are presented in the subsequent section.

B. Network Intrusion Ontology

The proposed knowledge model captures all essential concepts required to describe network intrusion systems. We designed our semantic model according to the methodology proposed by Grüninger and Fox, with some extensions from Methontology.

The main concepts and relations of the ontology are shown in Fig. 1.

The central part of the proposed semantic model is the taxonomy of Attacks, which is summarized in Fig. 2. The taxonomy was extracted from the attack types described in the KDD 99 dataset. Attacks are divided into four main groups: DoS, R2L, U2R and Probe. These main types are further specified at an additional level of the hierarchy.
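As a rough illustration of how the two-level taxonomy can drive the phased decomposition, the following Python sketch encodes the taxonomy as a nested mapping and routes a record through stand-in models. The attack type names follow the KDD 99 labels; `ATTACK_TAXONOMY`, `attack_class_of`, `classify_hierarchically` and the three model arguments are hypothetical names introduced only for this example. The described system keeps the taxonomy in the ontology rather than in a dictionary, so this is a sketch of the control flow, not of the implementation.

```python
from typing import Callable, Dict, Mapping, Optional, Tuple

# Two-level attack taxonomy (class -> types), using the KDD 99 label names.
ATTACK_TAXONOMY: Dict[str, Tuple[str, ...]] = {
    "DoS": ("back", "land", "neptune", "pod", "smurf", "teardrop"),
    "Probe": ("satan", "ipsweep", "nmap", "portsweep"),
    "R2L": ("guess_passwd", "ftp_write", "imap", "phf", "multihop",
            "warezmaster", "warezclient", "spy"),
    "U2R": ("buffer_overflow", "loadmodule", "perl", "rootkit"),
}

Record = Mapping[str, object]


def attack_class_of(attack_type: str) -> str:
    """Return the main attack class that contains the given attack type."""
    for attack_class, types in ATTACK_TAXONOMY.items():
        if attack_type in types:
            return attack_class
    raise KeyError(f"unknown attack type: {attack_type}")


def classify_hierarchically(
    record: Record,
    is_attack: Callable[[Record], bool],
    predict_class: Callable[[Record], str],
    type_models: Mapping[str, Callable[[Record], str]],
) -> Dict[str, Optional[str]]:
    """Walk the taxonomy top-down; all three model arguments are hypothetical."""
    # Phase 1: coarse attack/normal decision; normal traffic raises no alarm.
    if not is_attack(record):
        return {"label": "normal", "attack_class": None, "attack_type": None}
    # Phase 2: predict the main attack class.
    attack_class = predict_class(record)
    # Phase 3: refine to a sub-type only if a model exists for that class.
    type_model = type_models.get(attack_class)
    attack_type = type_model(record) if type_model else None
    return {"label": "attack", "attack_class": attack_class,
            "attack_type": attack_type}
```

In this sketch the `type_models` lookup plays the role that the ontology query plays in the described system: if no sub-type model is registered for the predicted class, the record is still reported at the class level.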
The designed ontology is formalized using the OWL 2 RL profile, which supports common constructs such as multiple hierarchies and at the same time provides compatibility with rule languages for automatic reasoning. As the objective of the knowledge model was to use it in data analytical tasks, the concepts and properties map directly to the data used in the process. Moreover, the ontology was extended with concepts related to the classification models, in order to relate each classifier to the level of the target attribute hierarchy on which it can be used. The main classes of the ontology include:

• Connections - This class represents the status of each connection record. It specifies whether the connection is an attack or normal traffic. Attack connections are further conceptualized using the Attack hierarchy (Fig. 2).
• Effects - This class contains subclasses that represent all possible consequences of individual attacks (e.g., slows down server response, executes commands as root).
• Mechanisms - The subclasses represent all possible causes of individual attacks (poor environment sanitation, misconfiguration, etc.).
• Flags - The subclasses represent normal or error states of individual connections (established, responder aborted, connection attempt rejected, etc.). Each of these subclasses has one equivalent instance.
• Protocols - The class contains subclasses that represent the types of communication protocol on which the connection is running (TCP, UDP and ICMP).
• Services - The subclasses represent each type of connection service (http, telnet, etc.). Each of these subclasses has one equivalent instance.
• Severities - This class represents the severity of the attack; its subclasses represent the severity level (weak, medium and high).
• Targets - The subclasses represent possible targets of a given type of attack (user, network).
• Models - This concept covers the classification models used to predict the given target attribute.

The instances of the specified classes represent the network connections (e.g., connection records from the data set). Trained and serialized classification models are instantiated as instances of the Models class. The models are represented as web resources and can be accessed through their URI property, which points to the location where the model is serialized in the system.

Fig. 1. The main concepts of the proposed semantic model.

Fig. 2. The hierarchy of Attacks.

C. Machine learning models

To evaluate the proposed approach, we used the KDD Cup 1999 competition dataset, which is a commonly accepted benchmark for the intrusion detection task. The dataset consists of records from device logs in a LAN collected over nine weeks. For the evaluation, we used the 10% sample with 494,021 records in total. Each record is labeled either as normal communication or with a major attack class and a specific attack type. There are 22 different attack types, which correspond to the classes in the proposed ontology.

A common problem with diagnostic tasks such as intrusion detection is that the target attribute (in our case the type of attack) is highly unbalanced, with a majority of normal communication. Table I presents the taxonomy of attack types together with the number of cases in the dataset. Some attack classes such as Probe are more balanced, but in general each attack class contains some minor types with only a few training examples. The lack of cases is problematic not only for the training of statistical models but also for their evaluation. On the other hand, rare cases can still be very critical and can have a big overall impact on the security of the system.

TABLE I. ATTACK TYPES AND NUMBER OF SAMPLES

Attack class   Attack type       # of samples
DoS            back              2,203
DoS            land              21
DoS            neptune           107,201
DoS            pod               264
DoS            smurf             280,790
DoS            teardrop          979
Probe          satan             1,589
Probe          ipsweep           1,247
Probe          nmap              231
Probe          portsweep         1,040
R2L            guess_passwd      53
R2L            ftp_write         8
R2L            imap              12
R2L            phf               4
R2L            multihop          7
R2L            warezmaster       20
R2L            warezclient       1,020
R2L            spy               2
U2R            buffer_overflow   30
U2R            loadmodule        9
U2R            perl              3
U2R            rootkit           10
normal         Normal            97,227

The records for each connection are described by a set of features, which are represented in the ontology as data attributes. The features can be divided into basic features, content features and traffic features. Overall there are 32 features. The first group describes the type of communication protocol, the duration of the connection, the service on the destination network node and other standard attributes describing the TCP connection. Content features are attributes that can be linked to domain-specific knowledge depending on the applications and environment in which the communication occurs. The last group (traffic features) describes the communication attributes captured during a 2-second time window, e.g. the number of hosts communicating with the target host. For data preprocessing, we selected only the features most relevant for the classification, as identified in [4]. The final list of features includes: service, src_bytes, dst_bytes, logged_in, num_file_creations, srv_diff_host_rate, dst_host_count, dst_host_diff_srv_rate, dst_host_srv_diff_host_rate, srv_count, serror_rate, rerror_rate.

Since the data in diagnostic tasks are commonly highly unbalanced towards the normal cases, the proposed approach is based on the decomposition of the diagnostic classification task into a hierarchy of classifiers. At the top level of the class hierarchy, an attack detection model distinguishes between attack connections and normal traffic. The classifier on this level was trained on the whole dataset, with the target attribute transformed to a binary indicator. The main goal of this top-level classifier is to reliably separate normal connections from attack connections.

If the top-level model detects an attack connection, the case is further classified by an ensemble model into one of the four attack classes on the second level of the taxonomy (DoS, R2L, U2R, Probe). On this level, we use an ensemble classifier with a voting scheme, trained on all attack instances (i.e. without the normal communication cases). We found the proposed ensemble model to be more efficient in the case of unbalanced target classes. The standard machine learning models proposed in previous works achieved good accuracy, but mostly on the dominant class (on the KDD 99 dataset, the most common DoS attacks). The simple models struggled to predict minor classes such as U2R, which can be even more serious from the point of view of network security. For example, a trained decision tree model performed very well on the DoS and R2L classes but missed a significant number of Probe attacks and was not able to detect the U2R class at all.

The proposed weighting schema is based on the idea of complementary classifiers: each model is weighted according to its performance on a particular class. The weighting schema is presented in Table II. The wi,j terms represent the weight associated with the i-th class and the j-th model.

TABLE II. WEIGHTING SCHEME OF THE ENSEMBLE MODEL

Model     DoS    R2L    U2R    Probe
model 1   w1,1   w2,1   w3,1   w4,1
model 2   w1,2   w2,2   w3,2   w4,2
model 3   w1,3   w2,3   w3,3   w4,3
...       ...    ...    ...    ...

After the binary classification and the classification of the attack class by the weighted ensemble classifier, we trained dedicated models to further classify the specific type of attack on the most specific level of the taxonomy. Four different models were trained, each using only the records of one attack class (i.e. models for DoS, R2L, U2R and Probe). The most problematic was the minority U2R class, as the dataset contains very few records of that type. The final implemented classification schema is presented in Fig. 3. All models were implemented in the Python environment using the standard pandas and scikit-learn stack. The predictive models were then persistently stored and their URIs (Uniform Resource Identifiers) were added as data properties to the knowledge model.

Fig. 3. The implemented hierarchical classification schema.

The main role of the semantic model in the proposed detection system is to navigate through the target class taxonomy and to decompose the classification problem into sub-problems handled by the particular models for specific types of attack. The system is implemented in the Python language with the RDFlib package, which provides integration with the ontology through the SPARQL query interface. When classifying an unknown connection, the system queries the ontology using SPARQL and retrieves the corresponding model for the particular class of attacks according to the URL stored in the hasTargetAttribute property. Once the main type has been classified, the system checks in the ontology whether there is a classifier able to process the record further and detect the subtype of the attack.

Besides the hierarchical decomposition of the detection process, the knowledge model also provides additional context that can be leveraged during classification to improve the detection of minor classes. We have mainly extended the context with the potential effect of the attack. Additionally, if the models are not reliable enough to predict the concrete attack sub-type, the system can classify attacks at least according to severity, which is retrieved from the knowledge model for the particular main class of the attack. This can serve as a supporting source of information that complements the attack type classification.

III. EVALUATION

For the evaluation, we used the most common metrics employed in classification tasks, recall and precision. We also computed a confusion matrix for the particular classes of attacks. The confusion matrices were especially informative since they record the number of correctly and incorrectly classified examples as well as the types of error. For the binary classification on the top level of the taxonomy hierarchy we used the standard evaluation metrics:

• Precision: P = TP / (TP + FP)
• Recall: R = TP / (TP + FN)

where TP, TN, FP and FN are the numbers of true positive, true negative, false positive and false negative records (e.g. a true positive is a record where the predicted attack was in fact an attack, a false positive is a record where the predicted attack was in fact normal traffic, a false negative is a record where the predicted normal traffic was in fact an attack, etc.). The entire system was also evaluated in terms of the number of missed attacks and raised false alarms, using the FAR metric (False Alarm Rate), which corresponds to the number of false positive records divided by the total number of normal traffic records (true negative + false positive).

For the evaluation of the binary classification on the top level of the taxonomy, we used the precision and recall metrics directly. In the subsequent stages, on the more specific levels of the taxonomy, we computed precision and recall for each class and used macro-averaging for the overall evaluation. Additionally, we computed a multi-class confusion matrix to further investigate the types of errors produced by the system.

A. Training and evaluation

For the binary classification used for attack detection, we used a decision tree classifier. The dataset included all records, and the target attribute was transformed to a binary attack/normal traffic indicator. The classifier was trained without a limit on the maximum depth, with default pruning settings and the Gini index as the splitting criterion. We split the dataset randomly in a 70/30 training/testing ratio. The testing data were also used for the overall evaluation of the entire system. The binary classification model achieved an accuracy of 0.9997. The detailed confusion matrix is presented in Table III.

TABLE III. PERFORMANCE OF THE BINARY ATTACK CLASSIFICATION

         Normal   Attack    Precision   Recall
Normal   29,095   11        0.999       0.999
Attack   35       119,066

For the training of the ensemble classifier, we selected only the attack records from the training set. As base classifiers we used various configurations of the Naive Bayes and Decision Tree models. The experiments showed that the Decision Tree classifier performed well on the Probe, DoS and R2L attacks. For the U2R class, on the other hand, the model produced many false alarms or (depending on pruning) was not able to detect U2R attacks at all. For this reason, we trained a one-vs-all model just to separate the U2R class. We then combined both types of models into the ensemble classifier. The weights of the base classifiers were computed according to the accuracies of the models on the training data. For the evaluation we used the same 70/30 dataset split as for the binary classification, but further selected only the attack records (since normal communication is already filtered out by the binary classifier). In total, the models were trained on 396,743 records. The confusion matrix of the ensemble classifier is presented in Table IV.

TABLE IV. PERFORMANCE OF THE ENSEMBLE ATTACK CLASSIFICATION

        Probe   U2R   DoS       R2L   Prec.   Rec.
Probe   1,279   0     1         0     0.992   0.992
U2R     0       15    0         0     1       0.882
DoS     6       0     117,385   0     0.999   0.999
R2L     4       2     0         331   0.982   1

The severity classifier, applied as a complement to the ensemble model for the detection of the attack class, achieved an overall precision and recall of 0.999, with very good accuracy for the high and low severity levels.
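The class-dependent weighting of the ensemble can be illustrated with a short sketch. It assumes that each base model casts one vote for its predicted class and that the vote counts with the weight assigned to that model for that class. The text does not spell out the exact combination rule, so this rule, the function name `weighted_vote`, the model names and all weight values are illustrative assumptions only.

```python
from collections import defaultdict
from typing import Dict, Mapping

CLASSES = ("DoS", "R2L", "U2R", "Probe")


def weighted_vote(
    predictions: Mapping[str, str],
    weights: Mapping[str, Mapping[str, float]],
) -> str:
    """Combine base-model predictions using per-model, per-class weights.

    predictions: model name -> predicted class
    weights:     model name -> {class -> weight}
    """
    scores: Dict[str, float] = defaultdict(float)
    for model, predicted in predictions.items():
        # A model's vote counts as much as it is trusted on the class it predicts.
        scores[predicted] += weights[model][predicted]
    # Ties are broken deterministically by the order of CLASSES.
    return max(CLASSES, key=lambda c: scores.get(c, 0.0))


# Hypothetical weights: a tree that is weak on U2R, a Naive Bayes model,
# and a one-vs-all detector that is only trusted on U2R.
example_weights = {
    "decision_tree": {"DoS": 0.99, "R2L": 0.98, "U2R": 0.10, "Probe": 0.95},
    "naive_bayes": {"DoS": 0.90, "R2L": 0.70, "U2R": 0.30, "Probe": 0.80},
    "u2r_one_vs_all": {"DoS": 0.00, "R2L": 0.00, "U2R": 0.95, "Probe": 0.00},
}
example_predictions = {
    "decision_tree": "R2L",
    "naive_bayes": "R2L",
    "u2r_one_vs_all": "U2R",
}
winner = weighted_vote(example_predictions, example_weights)
```

A one-vs-all model that does not recognize its class can simply be left out of `predictions`, so that it abstains instead of voting for another class.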
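As a worked check of the metric definitions, the sketch below recomputes accuracy, precision, recall and FAR from the counts in Table III. It assumes that the rows of the table are the actual labels, the columns are the predicted labels, and "attack" is the positive class. The table does not state its orientation, so this reading is an assumption, and the helper name `binary_metrics` is introduced only for this example.

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Precision, recall, false alarm rate and accuracy from raw counts."""
    return {
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "far": fp / (tn + fp),  # false alarms over all normal records
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Counts from Table III, read with rows as actual and columns as predicted
# labels (an assumption) and "attack" as the positive class.
metrics = binary_metrics(tp=119_066, fp=11, fn=35, tn=29_095)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Accuracy does not depend on the assumed orientation of the table, and it rounds to the reported value of 0.9997. Swapping the roles of rows and columns only exchanges the counts of 11 and 35 between the false positives and the false negatives.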
On the most specific level of the taxonomy, each major attack class has one dedicated model for the further classification of subtypes. The performance of each model was evaluated using precision and recall macro-averaged over the subtypes. The overall performance of the models is summarized in Table V.

TABLE V. PERFORMANCE OF THE SUBTYPE CLASSIFICATION

            Probe   U2R     DoS     R2L
Accuracy    0.991   0.937   0.999   0.989
Precision   0.989   0.927   0.999   0.879
Recall      0.989   0.875   0.999   0.833

The overall system with the hierarchical classification was evaluated using the standard precision, recall, F-measure and FAR (False Alarm Rate) metrics. A comparison of the proposed system with models published in previous works [4][6][7][11] is presented in Table VI.

TABLE VI. OVERALL PERFORMANCE OF THE SYSTEM

Classifier       Acc.    Prec.   F1      FAR
C4.5             0.969   0.947   0.970   0.005
Random forests   0.964   0.998   0.986   0.025
Forest PA        0.975   0.998   0.998   0.002
Ensemble model   0.976   0.998   0.998   0.001
Our approach     0.998   0.998   0.998   0.001

Additionally, we computed a confusion matrix, which summarizes the performance for each attack class. The confusion matrix is presented in Table VII.

TABLE VII. CONFUSION MATRIX FOR THE OVERALL PERFORMANCE OF THE SYSTEM

         Probe   U2R   DoS       R2L   Normal
Probe    1,176   0     5         0     7
U2R      0       15    0         0     5
DoS      4       0     117,547   0     1
R2L      3       1     0         346   7
Normal   1       0     3         1     48,454

Besides the classification of attack types, we also implemented and evaluated the classification of attack severity. To train the severity detector we used the 10% sample of the KDD 99 dataset with a 70/30 training/testing ratio. The severity classifier was applied as a complement to the ensemble model for the detection of the attack class. Table VIII presents the confusion matrix for the severity detection, broken down by attack class.

TABLE VIII. CONFUSION MATRIX FOR THE SEVERITY DETECTION

        High      Low   Medium
DoS     117,695   0     0
Probe   443       0     779
R2L     0         346   6
U2R     0         0     20
Overall: Prec. 0.999, Recall 0.999

Medium severity was biased by our model towards high severity, which has an effect similar to a higher false positive rate. Further details about the designed model were published in [9].

IV. CONCLUSION AND FUTURE WORK

In this paper we have proposed an approach based on the combination of knowledge-based and machine learning methods for intrusion detection. The proposed knowledge model, in the form of an ontology, is used for the hierarchical decomposition of the detection process according to the types of attack. This decomposition makes it possible to overcome the problems with unbalanced training data that are typical for diagnostic machine learning tasks. By leveraging domain knowledge, our combined approach also provides additional context, which includes, for example, the effects and severity of the attacks.

The performance of the proposed IDS is 0.998 in terms of both precision and recall and 0.001 in terms of the FAR metric, which on the standard benchmark dataset outperforms other state-of-the-art methods. Moreover, the proposed method also has the potential to partially detect new emerging types of attacks by means of the contextual information stored in the knowledge model.

In future work we plan to extend the role of the knowledge model by introducing a rule-based classifier based on declarative rules and the application of automatic reasoning techniques and logic programming. We hope that this will further improve accuracy for minor classes with a low number of training examples. Additionally, the extended knowledge model will make it possible to create a formalized knowledge base of existing cases.

ACKNOWLEDGMENT

This work was supported by the Slovak Research and Development Agency under contract No. APVV-16-0213 and by the VEGA project under grant No. 1/0493/16.

REFERENCES

[1] Park, J. Advances in Future Internet and the Industrial Internet of Things. Symmetry 2019, 11, 244.
[2] Javaid, A.; Niyaz, Q.; Sun, W.; Alam, M. A Deep Learning Approach for Network Intrusion Detection System. In Proceedings of the 9th EAI International Conference on Bio-inspired Information and Communications Technologies (formerly BIONETICS), New York, NY, USA, 3–5 December 2016.
[3] Khan, M.A.; Karim, M.d.R.; Kim, Y. A Scalable and Hybrid Intrusion Detection System Based on the Convolutional-LSTM Network. Symmetry 2019, 11, 583.
[4] Zhou, Y.; Cheng, G.; Jiang, S.; Dai, M. An efficient detection system based on feature selection and ensemble classifier. arXiv 2019, arXiv:1904.01352.
[5] Aljawarneh, S.; Aldwairi, M.; Yassein, M.B. Anomaly-based intrusion detection system through feature selection analysis and building hybrid efficient model. J. Comput. Sci. 2018, 25, 152–160.
[6] Sharma, N.; Mukherjee, S. A Novel Multi-Classifier Layered Approach to Improve Minority Attack Detection in IDS. Procedia Technol. 2012, 6, 913–921.
[7] Ahmim, A.; Ghoualmi Zine, N. A new hierarchical intrusion detection system based on a binary tree of classifiers. Inf. Comput. Secur. 2015, 23, 31–57.
[8] Abdoli, F.; Kahani, M. Ontology-based distributed intrusion detection system. In Proceedings of the 2009 14th International CSI Computer Conference, Tehran, Iran, 20–21 October 2009; pp. 65–70.
[9] Sarnovsky, M.; Paralic, J. Hierarchical Intrusion Detection Using Machine Learning and Knowledge Model. Symmetry 2020, 12, 203.
[10] More, S.; Matthews, M.; Joshi, A.; Finin, T. A Knowledge-Based Approach to Intrusion Detection Modeling. In Proceedings of the 2012 IEEE Symposium on Security and Privacy Workshops, San Francisco, CA, USA, 24–25 May 2012; pp. 75–81.
[11] Özgür, A.; Erdem, H. A review of KDD99 dataset usage in intrusion detection and machine learning between 2010 and 2015. PeerJ Preprints 2016, 4, e1954v1.