<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Pairing an Autoencoder and a SF-SOINN for Implementing an Intrusion Detection System⋆</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Mathematics, Computer Science and Physics, University of Udine</institution>
          ,
          <addr-line>Via delle Scienze 206, Udine, 33100</addr-line>
          ,
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2020</year>
      </pub-date>
      <abstract>
        <p>Intrusion Detection Systems are systems aiming to detect intrusions within individual computers or networks. These systems are of fundamental importance nowadays, as the number of attacks on networks is ever increasing. In this paper, a prototype of a new Intrusion Detection System is presented. The key novelty is the architecture of this system, pairing an Autoencoder and a Soft-Forgetting Self-Organizing Incremental Neural Network. A fusion scheme is applied to exploit the classification capabilities of the two approaches. The proposed system, tested in different conditions using the NSL-KDD dataset, has achieved excellent performance in detecting attacks, demonstrating its ability to evolve its knowledge and to recognize attacks never seen before.</p>
      </abstract>
      <kwd-group>
        <kwd>Cybersecurity</kwd>
        <kwd>intrusion detection systems</kwd>
        <kwd>anomaly detection</kwd>
        <kwd>continuous learning</kwd>
        <kwd>NSL-KDD</kwd>
        <kwd>Autoencoder</kwd>
        <kwd>SF-SOINN</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Ital-IA 2023: 3rd National Conference on Artificial Intelligence, organized by CINI, May 29–31, 2023, Pisa, Italy.
⋆ Work partially supported by the Department Strategic Project of the University of Udine within the InterDepartment Project on Artificial Intelligence (2020-25).
voltan.gabriele@spes.uniud.it (G. Voltan); gianluca.foresti@uniud.it (G. L. Foresti); marino.miculan@uniud.it (M. Miculan)</p>
      <p>© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org, ISSN 1613-0073).</p>
      <p>In this paper we present a novel system aiming to integrate continuous learning with anomaly detection. More precisely, we integrate SF-SOINN with an Autoencoder. The SF-SOINN is a neural network, previously presented in [8], capable of implementing continuous learning in an efficient way. The Autoencoder is a neural network with an excellent ability to detect anomalies. The combination of these two approaches makes the system well suited for use in a real, modern scenario.</p>
      <p>Figure 2: Logical architecture of the Autoencoder used by the ASFSOINN.</p>
      <p>The rest of the paper is structured as follows. In Section 2 we present ASFSOINN, our solution composed of an Autoencoder and an SF-SOINN. In Section 3 we describe some experimental results of the application of ASFSOINN on the NSL-KDD dataset. Finally, some conclusions and directions for further work are given in Section 4.</p>
      <p>The solution proposed in this paper is a hybrid system, called ASFSOINN, composed of an Autoencoder and a Soft-Forgetting Self-Organizing Incremental Neural Network. The two subsystems work together (Fig. 1), both contributing to the decision of the final label to be assigned to the input data.</p>
      <p>SF-SOINN (Fig. 3), proposed by Foresti and Martina [8], is an evolution of the model called Enhanced Self-Organizing Incremental Neural Network [10]. The main features of this neural network are the capability to evolve its knowledge (continuous learning) and the speed in producing the output, made possible because the model forgets what is no longer relevant [8].</p>
      <p>This model takes as input a given x, with the respective label y, and, using a distance function, finds the two nodes closest to the received input. Next, it computes a threshold to decide whether or not to add a node for the received data. After this phase, the model goes through an update phase, in which it eliminates any nodes and edges that are no longer considered useful. How the SF-SOINN model works is explained in detail in [8].</p>
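      <p>The insert/update cycle just described can be illustrated with a toy incremental prototype learner. This is a deliberate simplification for illustration only: the class name, the fixed distance threshold and the 0.5 adaptation rate are our assumptions, not the actual SF-SOINN of [8] (which also maintains edges and prunes stale nodes).</p>

```python
import numpy as np

class ToyIncrementalNet:
    """Greatly simplified sketch of an SF-SOINN-style learner: keeps
    labeled prototype nodes, adds a node when the input falls farther
    than a threshold from its nearest node, otherwise adapts that node."""

    def __init__(self, threshold=1.0):
        self.threshold = threshold
        self.nodes = []    # prototype vectors
        self.labels = []   # their class labels

    def fit(self, x, y):
        x = np.asarray(x, dtype=float)
        if not self.nodes:
            self.nodes.append(x)
            self.labels.append(y)
            return
        dists = [np.linalg.norm(x - n) for n in self.nodes]
        i = int(np.argmin(dists))
        if dists[i] > self.threshold:
            # novel region of the input space: grow the network
            self.nodes.append(x)
            self.labels.append(y)
        else:
            # known region: move the winning node toward the input
            self.nodes[i] += 0.5 * (x - self.nodes[i])

    def predict(self, x):
        x = np.asarray(x, dtype=float)
        dists = [np.linalg.norm(x - n) for n in self.nodes]
        return self.labels[int(np.argmin(dists))]
```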
      <p>Autoencoders are a particular type of artificial neural network that aims to reconstruct the input after an encoding phase. During this process, the input is compressed by a series of layers, called the Encoder, and then decompressed by a series of layers, called the Decoder [9].</p>
      <p>Fig. 2 shows the logical architecture of the Autoencoder used in ASFSOINN. Unlike many others present in the literature, this architecture is very simple, to allow the system to work in real time. This feature allows the network to be fast in producing output data, making it usable in real scenarios.</p>
      <sec id="sec-1-1">
        <title>2.3. Operating modes</title>
        <p>The proposed ASFSOINN has three operating modes: training mode, necessary to train the AE and the SF-SOINN; testing mode, which allows the system to classify the input data; and live mode, which allows an external operator to manually classify a specific data item.</p>
        <p>The algorithms are shown below.</p>
        <sec id="sec-1-1-1">
          <title>Algorithm 2 Test phase</title>
          <p>Require: AE, SFSOINN, test_set
for each x ∈ test_set do
    label_AE ← AE.predict(x) ◁ AE prediction
    label_SF ← SFSOINN.predict(x) ◁ SF-SOINN prediction
    if label_AE = “normal” ∧ label_SF ≠ “normal” then
        label ← label_SF
    else if label_SF = “normal” ∧ label_AE ≠ “normal” then
        label ← label_AE
    else if label_AE = “normal” ∧ label_SF = “normal” then
        label ← “normal”
    else
        label ← label_SF
    end if
end for</p>
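          <p>The fusion scheme of the test phase can be restated as a small function. The label convention (the literal "normal" for benign traffic) and the tie-breaking choice of keeping the SF-SOINN class label when both models flag an attack are assumptions of ours:</p>

```python
def fuse_predictions(ae_label: str, sf_label: str) -> str:
    """Combine the AE and SF-SOINN predictions, trusting whichever
    model raises an alarm (assumed convention: "normal" = benign)."""
    if ae_label == "normal" and sf_label != "normal":
        return sf_label          # only SF-SOINN flags an attack
    if sf_label == "normal" and ae_label != "normal":
        return ae_label          # only the AE flags an attack
    if ae_label == "normal" and sf_label == "normal":
        return "normal"          # both agree on benign traffic
    return sf_label              # both flag an attack: keep the class label
```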
          <p>The Autoencoder only needs to be trained with good data, as its operation is based on learning the normal behavior and then labeling anything that is different as a possible “attack”.</p>
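          <p>A minimal sketch of this "learn normal, flag deviations" idea follows. Here a linear projection stands in for the trained Autoencoder; the PCA-style reconstruction and the 0.99 quantile threshold are our assumptions for illustration, not the actual architecture of Fig. 2:</p>

```python
import numpy as np

class ReconstructionDetector:
    """Toy anomaly detector in the spirit of the AE: learn only normal
    data, then flag inputs whose reconstruction error exceeds a
    threshold. 'Reconstruction' here is projection onto the top
    principal direction of the normal data -- a linear stand-in."""

    def fit(self, X_normal, quantile=0.99):
        X = np.asarray(X_normal, dtype=float)
        self.mean = X.mean(axis=0)
        # top principal direction of the centered normal data via SVD
        _, _, vt = np.linalg.svd(X - self.mean, full_matrices=False)
        self.direction = vt[0]
        # threshold = high quantile of the training reconstruction errors
        self.threshold = np.quantile(self._errors(X), quantile)

    def _errors(self, X):
        Xc = np.asarray(X, dtype=float) - self.mean
        proj = np.outer(Xc @ self.direction, self.direction)
        return np.linalg.norm(Xc - proj, axis=1)

    def predict(self, x):
        err = self._errors(np.asarray(x, dtype=float)[None, :])[0]
        return "normal" if err <= self.threshold else "attack"
```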
          <p>Algorithm 1 Training phase
Require: AE_params, train_normal, train_data, train_labels
AE ← create(AE_params) ◁ Creation of AE
AE.fit(train_normal) ◁ AE training
for each x, y ∈ train_data, train_labels do
    SFSOINN.fit(x, y) ◁ Input of the data x with label y
end for</p>
          <p>Live mode is very useful, as it allows cybersecurity experts to transmit their knowledge to the IDS: an external operator can manually label data and use it to train the system.</p>
        </sec>
        <sec id="sec-1-1-2">
          <title>Algorithm 3 Live phase</title>
          <p>Require: SFSOINN, X, labels
for each x, y ∈ X, labels do
    SFSOINN.fit(x, y) ◁ Data x with label y
end for</p>
        </sec>
      </sec>
      <sec id="sec-1-4">
        <title>3. Experimental results</title>
        <p>The dataset chosen to conduct the experiments with ASFSOINN is the NSL-KDD [11]. It contains five data classes, one normal and four attacks (DoS, U2R, R2L, Probe). It is composed of 160367 elements, divided into a train set of 125973 elements, a test set of 22544 elements and a second test set, which we called test_n21, of 11850 elements. In addition to these two test sets, we created another one, which we called test_21, which contains only the test set items that have a difficulty of 21, i.e. 10694 elements.</p>
        <p>Fig. 4 shows the distribution of the training set, while Fig. 5 shows the distribution of the test set.</p>
        <p>Figure 4: Distribution of the training set (normal, DoS, U2R, R2L, Probe).</p>
        <sec id="sec-1-4-1">
          <title>3.1. Metrics</title>
          <p>This subsection lists the metrics used to evaluate the performance of the ASFSOINN system in the various experiments conducted. The metrics used are the following.</p>
          <p>Accuracy: this metric measures how many correct classifications were made out of the total number of predictions made. The formula is: Accuracy = (TP + TN) / (TP + TN + FP + FN), where TP is the number of true positive predictions, TN is the number of true negative predictions, FP is the number of false positive predictions, and FN is the number of false negative predictions.</p>
          <p>Detection Rate: this metric measures how many correct positive classifications were made out of the total number of positive cases. The formula is: DR = TP / (TP + FN), where TP and FN are the same as described above.</p>
          <p>False Positive Rate: this metric measures the probability that a negative case will be classified as a positive. The formula is: FPR = FP / (FP + TN), where FP and TN are the same as described above.</p>
          <p>False Negative Rate: this metric measures the likelihood that a positive case will go undetected. The formula is: FNR = FN / (FN + TP), where FN and TP are the same as described above.</p>
        </sec>
        <sec id="sec-1-4-2">
          <title>3.2. Training with the NSL-KDD</title>
          <p>In the first test performed, ASFSOINN is trained using the training set. Specifically, the AE is trained with the good data of the training set (67343 elements), while the SF-SOINN is trained with the whole set (125973 elements). After training, ASFSOINN is tested using the test set, test_n21 and test_21, i.e. the one containing the items with difficulty equal to 21. Unlike the test set, test_n21 contains many attacks that are not in the training set. For this reason, test_n21 is considered the hardest test set.</p>
          <p>Below we show the table of the results obtained by the IDS in the testing phase (Table 1). As we can see from the results shown in Table 1, the system has an excellent ability to recognize attacks. Furthermore, it was able to obtain acceptable accuracy even in the more difficult test cases, as in the case of test_n21.</p>
          <p>Table 1: Results of the testing phase on the classes DoS, U2R, R2L and Probe.</p>
          <p>Table 2: Results on the Test Set and on the Train 1–13 subsets, with their mean.</p>
        </sec>
      </sec>
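      <p>The four metrics above translate directly into code; a minimal helper (the function name is ours):</p>

```python
def ids_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Accuracy, Detection Rate, False Positive Rate and False
    Negative Rate from the four confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "detection_rate": tp / (tp + fn),
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (fn + tp),
    }
```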
      <sec id="sec-1-2">
        <title>3.3. Training set vs Test set</title>
        <p>The second test performed is a test proposed by Constantinides [12]. The peculiarity of this test is that the system is trained with the test set and tested with the train set, dividing the latter into 12 sets of 10000 elements and a set with the remaining elements.</p>
        <p>From the results shown in Table 2 we can see how this system, despite the particularity of the experiment, still obtains excellent results, both in terms of Detection Rate and in terms of Accuracy.</p>
      </sec>
      <sec id="sec-1-3">
        <title>3.4. Incremental Training</title>
        <p>In this experiment, proposed by Li-Ye [13], the system is trained incrementally. The train set is divided into 5 sets, each containing all the data of one category (train_normal, train_dos, train_r2l, train_u2r, train_probe), which are passed to the system one at a time.</p>
        <p>After each incremental training phase, the system is tested on each class, using the test set divided into 5 classes. The original experiment starts by training the system with train_normal and train_dos, then dividing the experiment into four phases. In this work instead, the experiment is divided into five phases, training the system in the first phase only with train_normal. This choice was made to show how the proposed system can detect attacks even without having seen one before.</p>
        <p>After the first training phase, the SF-SOINN configuration is the one shown in Fig. 6. In Fig. 7, instead, we can observe the state of the model after training on all categories of attacks. As we can see from these two images, the configuration of the SF-SOINN model has evolved. This highlights the continuous learning of this model.</p>
        <p>Figure 7: SF-SOINN after training with the whole train set.</p>
        <p>This experiment is the most important, as it simulates the worst scenario that could happen, i.e. the one in which the IDS receives as input attacks it does not know. This type of test allows us to highlight the capabilities of this innovative IDS, which is able to recognize attacks never seen before, thanks to the anomaly detection work carried out by the AE. The results of this experiment are shown in Table 3. As we can see from the results, ASFSOINN recognizes a high number of attacks even after being trained with only good data (no attacks).</p>
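        <p>The per-class incremental protocol can be sketched as a loop that feeds one class subset at a time and evaluates after each phase. The split names follow the text; the `train`/`evaluate` methods are hypothetical stand-ins for the system's interface, not the actual ASFSOINN code:</p>

```python
def incremental_experiment(system, splits, test_sets):
    """Feed class subsets one at a time (train_normal first), testing
    the system on every test class after each incremental phase."""
    results = []
    for phase, (name, subset) in enumerate(splits, start=1):
        for x, y in subset:
            system.train(x, y)  # incremental update only, no retraining
        # evaluate on each per-class test set after this phase
        scores = {cls: system.evaluate(data) for cls, data in test_sets.items()}
        results.append((phase, name, scores))
    return results
```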
      </sec>
    </sec>
    <sec id="sec-2">
      <title>4. Conclusions and Future works</title>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>