Adaptive Ensemble Learning for Intrusion Detection Systems

Vincenzo Agate

vincenzo.agate@unipa.it

Federico Concone

federico.concone@unipa.it

Alessandra De Paola

alessandra.depaola@unipa.it

Pierluca Ferraro

pierluca.ferraro@unipa.it

Salvatore Gaglio

salvatore.gaglio@unipa.it

Giuseppe Lo Re

giuseppe.lore@unipa.it

Marco Morana

marco.morana@unipa.it

Università degli Studi di Palermo, Dipartimento di Ingegneria, Palermo, Italy

For years, the European Commission has highlighted the need to invest in cybersecurity as a means of protecting institutions and citizens from the many threats in cyberspace. Attacks perpetrated through the network are extremely dangerous, in part because their mitigation is complex, which makes it difficult to ensure an adequate level of security. Intrusion Detection Systems (IDSs) are one of the crucial elements in building an overall system of protection against network-based cyber attacks; their goal is to detect and identify such attacks and misuse of computer networks in a timely manner. Nowadays, the most effective IDSs are based on Machine Learning (ML) and are able to combine and analyze information from heterogeneous sources, such as network traffic, user activity patterns, and data extracted from system logs. However, these tools commonly exploit specific classifiers, whose performance is highly dependent on the attacks being considered, and do not generalize well enough to be applied in different contexts. The research laboratories of Networking and Distributed Systems and Artificial Intelligence at the University of Palermo are carrying out research activities to address these issues, with the main goal of designing a new generation of IDSs that, by dynamically and adaptively combining multiple classifiers, are able to overcome the limitations of state-of-the-art solutions.

Keywords: Cybersecurity, Artificial Intelligence, Intrusion Detection Systems

1. Introduction

Today, with the increasingly pervasive use of ICT technologies, cyber attacks pose a serious risk to the infrastructural, productive and economic aspects of our society. Among the most critical threats to today’s hyper-connected world are attacks that come from the network. In fact, all social and productive realities are closely dependent on the ability to exchange data through the network. This dependence can be exploited by malicious parties to gain unauthorized access to the resources of institutions and organizations. Among the most effective solutions to such attacks are Intrusion Detection Systems (IDSs), whose main goal is to detect and identify misuse of resources early enough to enable timely responses that stop any malicious behavior and ensure normal operation of systems.

Currently, the most promising approach to designing IDSs capable of dealing with the threats our systems will face in the near future is the adoption of Machine Learning (ML) and, more generally, Artificial Intelligence (AI) methods.

However, a thorough study of the literature shows that the adoption of machine learning methods to design IDSs involves several critical issues. One of the most noticeable concerns is that, due to the high heterogeneity of network traffic generated by different attacks, specific classifiers are characterized by performance that is highly dependent on the attacks considered. This means that there is no single universal ML approach that can detect any kind of attack in different scenarios. In addition, different classes of ML approaches have very different capabilities: for example, supervised methods can achieve excellent performance but are unable to handle unknown attacks, while unsupervised methods can detect anomalies and unknown attacks but generally achieve poor performance with already known intrusions [1].

The adoption of ensemble machine learning techniques, which leverage multiple machine learning algorithms, promises to be a very effective approach to achieve higher overall performance than single methods.

However, in the current literature, the ensemble of classifiers is often designed through trial-and-error procedures, and there is no evidence that an approach suitable for a specific scenario can be general enough to be adopted in different scenarios.

Our research group, through scientific activities funded by various projects, seeks to contribute to this [...]tive solutions aiming to improve the robustness of existing approaches in the field of AI- and ML-based intrusion detection systems (IDSs).

The remainder of this paper introduces the current state of the art of IDSs and discusses the main limitations of current solutions, followed by a summary of our research group’s contribution. Finally, it describes the challenges and goals we intend to address in the near future.

2. Related Work

In the dynamic domain of cybersecurity, the arms race between intrusion detection mechanisms and cyber-attack methodologies has accelerated, highlighting an urgent need for innovative detection techniques. Several IDSs have been proposed in the literature, exploiting both signature-based and anomaly-based approaches [2, 3]. The former are reliable in recognizing known attacks but are ineffective against those not previously seen. Conversely, the latter show a more flexible behavior and are better suited to detect constantly evolving attacks, especially by using Machine Learning (ML) techniques.

Nevertheless, the design of ML-based IDSs faces several challenges, such as the difficulty of ensuring fast responses when dealing with high-dimensional data, as in the case of network traffic, or providing consistently good performance for all types of intrusions. Moreover, in modern network environments with heterogeneous devices, the input data distributions are subject to unpredictable fluctuations over time. This phenomenon, referred to as concept drift, poses a significant challenge in the fields of machine learning and cybersecurity, as noted in [4]. One of the most promising directions to achieve overall good performance is the adoption of ensemble learning techniques [5], which exploit multiple ML algorithms to obtain better results than those of individual methods.

The IDS presented in [6], for instance, combines a two-stage meta classifier ensemble (i.e., rotation forest and bagging) with hybrid feature selection (particle swarm optimization, ant colony algorithm, and genetic algorithm) to better distinguish regular and anomalous traffic. However, such a solution is tailored to single attack instances and is not suitable for dealing with multi-class problems. The IDS introduced in [7] adopts an ensemble approach that combines decision trees, Random Forest, and Forest by Penalizing Attributes algorithms, and a voting technique to combine their probability distributions. Although the system achieves good performance with popular attacks, its performance drops in the case of rare ones. Multi-class intrusion detection is also addressed in [8], where an ensemble approach is designed to detect different attacks. This IDS also exploits a hybrid feature selection method and a ranking technique that evaluates the ability of different base classifiers to detect different attacks. Results are promising, but only for a subset of the considered attack classes. The authors of [9] propose a model based on sustainable ensemble learning and on incremental learning. Such a system exploits multiclass regression models so that the ensemble is adapted to recognize different types of attacks; moreover, by means of an iterative update method, the parameters and the decision results of the historical model are included in the training process of the final ensemble model.

The performance of the solutions described above, as well as of many other existing ensemble frameworks, is severely limited because many different classes of attacks can occur. Moreover, the combination of multiple ML-based classifiers generally increases the computational load, thus limiting the IDS’s ability to operate in a timely manner. This issue is particularly critical, given the need to promptly identify incoming threats and immediately apply appropriate countermeasures.

3. Research Contribution

In this perspective, a first contribution of our research unit is discussed in [10], where we introduced a system which addresses critical limitations in existing frameworks, achieving the right trade-off between number of recognized classes and prediction speed, in contrast to other multi-class IDSs in the literature.

In particular, we presented a multi-layered architecture for a behavior-based Intrusion Detection System that uses machine learning and ensemble learning techniques to distinguish between benign and malicious traffic and categorize detected malicious activities into one of nine possible attack classes. The architecture of the system is shown in Figure 1.

[Figure 1. System architecture. Recoverable labels: Original features; Feature Selection 1; Feature Selection 2; First Layer (Decision Tree; Normal traffic / Abnormal traffic); Second Layer (Random Forest, Decision Tree, Neural Network); Soft Voting Model; Output.]

The experimental evaluation was performed on the CIC-IDS2017 public dataset, showing that the proposed IDS exhibits good performance in detecting all attack classes according to well-established metrics.

A key aspect of our proposed system is its two-layer architecture. To prevent the system from being overloaded with all the network traffic, and consequently to prevent delayed detections, traffic filtering is preliminarily performed in order to distinguish “normal” and “abnormal” traffic, ensuring that only potentially malicious traffic is advanced to the next stage for further analysis. This layer thus acts as a filter, improving the efficiency of the whole system. Accurate classification at this stage is crucial, as traffic deemed benign is not subject to subsequent scrutiny, highlighting the importance of minimizing false negatives to safeguard network integrity. For the design of the first layer, we decided to adopt a Decision Tree (DT), since experimental evaluation showed its better performance for binary classification, compared to Neural Networks, Random Forest, and Gaussian Naive Bayes.

In the second layer, a detailed analysis of malicious traffic is performed so that the system generates alerts more accurately. These alerts provide network administrators with the information they need to respond to threats [11], allowing them to neutralize ongoing attacks quickly and efficiently.

Our solution proposes the adoption of ensemble learning techniques, incorporating a combination of different learning models, such as Neural Networks (NNs), Random Forests (RFs), and additional DTs as weak learners.

The results of the predictions of the single models are aggregated using appropriate ensemble techniques that yield better classification performance than that of the single weak learners. Specifically, we adopt a weighted voting technique that assigns higher weights to the predictions of classifiers with low uncertainty in order to determine the ensemble’s final verdict.

The adoption of this weighted voting strategy for aggregating classifier outputs, integrating the confidence values from neural network predictions with those of Decision Trees and Random Forests, notably improves the performance of the whole IDS. Finally, it is worth noticing that our system’s architecture facilitates parallelization in the training and testing of weak learners, thereby enhancing efficiency in both training and prediction phases, a critical feature for IDSs, where timely threat detection is paramount.

This work is partially funded by the European Union FESR o FSE, PON Ricerca e Innovazione 2014-2020 - DM 1062/2021.

4. Preliminary Evaluation

To conduct a preliminary evaluation of the proposed solution, the CIC-IDS2017 dataset was used [12]. This dataset fits the goals of our study as it includes various attacks encompassing SQL-Injection, Brute Force, XSS, DoS GoldenEye, DoS Hulk, DoS Slowhttptest, and DoS Slowloris. These attacks were grouped under two categories, i.e., Web and DoS Attacks, to streamline computation while maintaining detailed and accurate identification of malicious events.

All tests have been performed on off-the-shelf laptops equipped with Intel 3805U 1.9GHz CPU and 4GB RAM. Moreover, all the models that constitute the proposed IDS have been run 1000 times using different train and test sets at every execution.
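As a rough illustration of the two-layer flow and the uncertainty-weighted vote described in Section 3, the following Python sketch may help. It is not the implementation from [10]: the uncertainty measure (one minus the normalized Shannon entropy of each model's class probabilities), the function names, the toy models, and the `requests_per_s` feature are all assumptions introduced here for illustration only.

```python
import math
from typing import Callable, Dict, Sequence

Probs = Dict[str, float]  # class label -> predicted probability


def confidence(probs: Probs) -> float:
    """Assumed uncertainty measure: 1 - normalized Shannon entropy.

    Returns 1.0 for a one-hot prediction and 0.0 for a uniform one.
    """
    if len(probs) < 2:
        return 1.0
    entropy = -sum(p * math.log(p) for p in probs.values() if p > 0.0)
    return max(0.0, 1.0 - entropy / math.log(len(probs)))


def weighted_soft_vote(predictions: Sequence[Probs]) -> str:
    """Sum class probabilities across models, weighting each model by its
    confidence, and return the label with the highest weighted score."""
    scores: Dict[str, float] = {}
    for probs in predictions:
        weight = confidence(probs)
        for label, p in probs.items():
            scores[label] = scores.get(label, 0.0) + weight * p
    return max(scores, key=lambda label: scores[label])


def classify(sample: dict,
             is_abnormal: Callable[[dict], bool],
             second_layer: Sequence[Callable[[dict], Probs]]) -> str:
    """First layer filters traffic; only abnormal traffic reaches the ensemble."""
    if not is_abnormal(sample):
        return "benign"
    return weighted_soft_vote([model(sample) for model in second_layer])


# Hypothetical stand-ins for trained models (fixed outputs, for illustration).
def toy_filter(sample: dict) -> bool:
    return sample["requests_per_s"] > 100


def toy_nn(sample: dict) -> Probs:
    return {"DoS": 0.8, "Web": 0.2}    # fairly confident


def toy_rf(sample: dict) -> Probs:
    return {"DoS": 0.35, "Web": 0.65}  # rather uncertain


def toy_dt(sample: dict) -> Probs:
    return {"DoS": 0.3, "Web": 0.7}    # rather uncertain


flow = {"requests_per_s": 500}
print(classify(flow, toy_filter, [toy_nn, toy_rf, toy_dt]))  # DoS
```

In this toy case an unweighted average of the three probability vectors would favor "Web", whereas the confidence weighting lets the more certain model prevail. Other uncertainty measures (for example, the maximum class probability) would fit the same structure.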

The numerous tests performed on the system have demonstrated its reliability and accuracy in detecting malicious traffic, as well as its time efficiency. The IDS is able to recognize and identify 9 different types of attack in real-time, promptly alerting administrators to minimize serious consequences. In fact, on average, the system misses attacks in very small percentages (close to 1%), while it requires extremely low execution time for both the first and second levels; a slight difference is observed depending on the model used in the ensemble.

Besides the good performance achieved, numerous improvements are needed to address other important limitations that are common to many IDSs in the literature.

First of all, the solutions proposed in the literature (as well as [10]) select the set of classifiers to be adopted through a trial-and-error process and lack a formalized methodology that can drive the design process in different scenarios. Moreover, many of the existing solutions have been designed ignoring the outbreak of unknown attacks. Such a “closed-world” approach makes IDSs unsuitable for recognizing special types of attacks known as “zero-day”.

5. Challenges and Goals

The main goal of the research unit is the design and development of a novel class of IDSs based on the combination of several dynamically orchestrated classifiers (both supervised and unsupervised), with the aim of recognizing a large set of different threats, also detecting the occurrence of zero-day attacks.

Given the strong characterization of the many application scenarios in which IDSs are needed, the design of the system architecture will be guided by a formalized, rigorous, and replicable approach that can steer the realization of specific IDS instances. The goal is to design a scalable and modular architecture, capable of maintaining a low computing load while guaranteeing high detection performance and responsiveness, even in the presence of huge amounts of data.

The main challenge will be the definition of adaptive orchestration techniques, which will be crucial for the design of IDSs capable of dynamically adjusting their ensemble strategies based on the observed context. This will include the integration of both supervised and unsupervised learning approaches, allowing an adaptive response to emerging threats.

To reach this ambitious goal, the system will also have to address the phenomenon of concept drift, which is the continuous shift of the statistical distribution of network data over time. This poses a big challenge for current IDSs, often necessitating manual retraining of their machine learning models. Indeed, ignoring the phenomenon of concept drift, as many current IDSs do, inevitably leads to performance degradation over time.

Our future approach will try to overcome these challenges by orchestrating supervised and unsupervised systems to exploit the benefits of both approaches. The detection of unknown attacks can rely on online unsupervised anomaly detection systems that are adept at recognizing signs of zero-day attacks, all the while automatically adapting to concept drift without the constant need for manual intervention. This, in turn, can also reduce the frequency of model re-training and enhance system efficiency. Such systems will be used in conjunction with supervised ones to improve the overall accuracy for known attacks.

The efficacy of our methodologies will be validated through extensive experimental evaluation, showcasing our system’s capability of real-time threat detection compared to traditional models. This will provide the research community with valuable insights into the effectiveness of different ML methods and ensemble strategies against a wide range of security attacks.

Looking forward, we envision further enriching our IDS framework to improve its resilience against unknown attacks and concept drift, offering robust defenses against the ever-evolving landscape of cyber threats.

6. Research Unit

The Networks and Distributed Systems and Artificial Intelligence research laboratories at the University of Palermo, directed by Prof. Giuseppe Lo Re and Salvatore Gaglio, have experience in several research fields such as distributed systems, cybersecurity, artificial intelligence, and machine learning. In particular, the research unit has developed deep expertise in several topics related to the cybersecurity domain that mainly concern the adoption of artificial intelligence to assist the detection and identification of potential threats in cyberspace. The identified methodologies and proposed solutions have been applied in different scenarios, such as intrusion detection systems [10], malware detection systems [13, 14], social network security [15, 16], privacy-preserving distributed systems [17, 18], adversarial machine learning [19] and secure crowdsensing [20].

Furthermore, it is worth noting that the research group’s experience in applying artificial intelligence approaches and methods to distributed systems and cybersecurity challenges has been leveraged in several funded research projects, such as FRASI - FRamework for Agent-based Semantic-aware Interoperability (FAR MIUR D.M. 8 agosto 2000), Bigger Data (D.D. MIUR n. 2690 dell’11.12.2013, Piano di Azione e Coesione), SeNSori - SEnsor Node as a Service for hOme and buildings [...]

[...] and Computer Applications 191 (2021) 103165.

URL: https://www.sciencedirect.com/science/article/pii/S1084804521001776. doi:10.1016/j.jnca.2021.103165.

[19] S. Gaglio, A. Giammanco, G. Lo Re, M. Morana, Adversarial machine learning in e-health: attacking a smart prescription system, in: International Conference of the Italian Association for Artificial Intelligence (2021 AI*IA), Milan, Italy, 2021.

[20] F. Concone, G. Lo Re, M. Morana, Smcp: a secure mobile crowdsensing protocol for fog-based applications, Human-centric Computing and Information Sciences 10 (2020) 1–23. URL: https://doi.org/10.1186/s13673-020-00232-y. doi:10.1186/s13673-020-00232-y.