AI in Cybersecurity: Activities of the CINI-AIIS Lab at University of Naples Federico II

Antonino Ferraro, University of Naples Federico II

Via Claudio 21 80125 Naples Italy

Antonio Galli (antonio.galli@unina.it), University of Naples Federico II

Via Claudio 21 80125 Naples Italy

Valerio La Gatta, University of Naples Federico II

Via Claudio 21 80125 Naples Italy

Department of Computer Science McCormick School of Engineering and Applied Science Northwestern University

2233 Tech Dr 60208 Evanston IL United States

Lidia Marassi, University of Naples Federico II

Via Claudio 21 80125 Naples Italy

Stefano Marrone, University of Naples Federico II

Via Claudio 21 80125 Naples Italy

Vincenzo Moscato, University of Naples Federico II

Via Claudio 21 80125 Naples Italy

Marco Postiglione, University of Naples Federico II

Via Claudio 21 80125 Naples Italy

Department of Computer Science McCormick School of Engineering and Applied Science Northwestern University

2233 Tech Dr 60208 Evanston IL United States

Carlo Sansone, University of Naples Federico II

Via Claudio 21 80125 Naples Italy

Giancarlo Sperli, University of Naples Federico II

Via Claudio 21 80125 Naples Italy

ISSN 1613-0073

Keywords: Artificial Intelligence, Cybersecurity, Deep Learning, Machine Learning

Artificial intelligence (AI) is revolutionizing various industries, including cybersecurity, by emulating human intelligence to address complex threats. In the cybersecurity domain, AI offers significant potential, bolstering defense mechanisms, optimizing threat detection, and advancing incident response capabilities. AI-powered systems can analyze vast datasets to identify anomalies, predict cyberattacks, and enhance overall security posture. Machine Learning (ML), a subset of AI, enables systems to learn from data and make informed decisions, such as predicting optimal security measures based on threat intelligence and operational context. Deep Learning (DL), another ML subset, harnesses Artificial Neural Networks (ANNs) to process intricate data patterns and provide accurate threat assessments. DL, especially through Convolutional Neural Networks (CNNs), is transforming cybersecurity by extracting meaningful features from network traffic and log data for anomaly detection and threat hunting. Moreover, DL integrated with Natural Language Processing (NLP) streamlines tasks like threat intelligence analysis and incident response coordination. The versatility of AI underscores its pivotal role in cybersecurity, driving resilience enhancements and fostering proactive defense strategies. In this paper, we highlight AI projects in the cybersecurity sector from the University of Naples Federico II node of the CINI-AIIS Lab, showcasing their innovative contributions to cyber defense.

Introduction

Artificial intelligence (AI) is a transformative force across various industries, providing a paradigm shift in cybersecurity practices. Within the cybersecurity domain, AI is heralding significant advancements, redefining defensive strategies, amplifying threat detection capabilities, and refining incident response mechanisms. By harnessing AI technologies, organizations can fortify their defensive postures, anticipate and mitigate cyber threats proactively, and elevate overall security resilience.

At the core of AI's impact on cybersecurity lies its capacity to analyze vast and diverse datasets, enabling the identification of anomalies, prediction of emerging threats, and optimization of security measures. Machine Learning (ML), a pivotal subset of AI, equips systems with the ability to learn from data, thereby enhancing decision-making processes based on evolving threat landscapes and operational contexts. Deep Learning (DL), a subset of ML, leverages Artificial Neural Networks (ANNs) to discern intricate patterns within data, furnishing precise threat assessments and actionable insights. Particularly through Convolutional Neural Networks (CNNs), DL revolutionizes cybersecurity by extracting salient features from network traffic and log data, facilitating anomaly detection, threat prediction, and forensic analysis.

Moreover, the fusion of DL with Natural Language Processing (NLP) streamlines critical cybersecurity tasks, such as threat intelligence analysis, malware detection, and incident response coordination. By comprehensively analyzing textual data, NLP-powered systems augment analysts' capabilities, enabling rapid threat identification and proactive response measures.

The adaptable and multifaceted nature of AI positions it as a cornerstone of cybersecurity, driving innovation, resilience, and agility in the face of evolving threats. In this paper, we present a comprehensive overview of AI initiatives in cybersecurity, drawing from projects conducted at the University of Naples Federico II node of the CINI-AIIS Lab. Through these endeavors, we showcase the transformative potential of AI in bolstering cyber defense strategies and safeguarding digital ecosystems against emerging threats.

[...] formation, driven largely by the widespread adoption of Internet of Things (IoT) devices and Cloud Computing technologies. This proliferation has provided cybercriminals with a fertile ground for launching a multitude of attacks, ranging from the insertion of unwanted advertisements into websites to the clandestine exfiltration of sensitive data for illicit financial gains. At the forefront of these attacks are various forms of malicious software, collectively referred to as malware, which pose significant challenges to the security and integrity of digital systems. Examples of such malware include trojans, backdoors, spyware, and worms, each designed with the explicit purpose of exploiting vulnerabilities in target systems ([1]).

The detection of malware represents a formidable research endeavor, compounded by the ever-evolving sophistication of cyber threats. As Cyber Security (CS) researchers develop new detection techniques, malware authors respond in kind, continually refining their strategies to evade detection ([2, 3]). In this perpetual arms race, traditional antivirus software programs, reliant on signature-based detection mechanisms, have struggled to keep pace with the rapidly evolving threat landscape. Signature-based detection relies on identifying known patterns or signatures of malicious code within a database, often leading to a cat-and-mouse game where malware authors employ advanced evasion techniques such as code obfuscation to circumvent detection ([4, 5]).

To address the shortcomings of signature-based detection, researchers have explored alternative approaches that analyze the characteristics and behavior of malware rather than matching known code signatures. These approaches can be broadly categorized into Static Malware Detection (SMD) and Behavioral Malware Detection (BMD). SMD techniques analyze the static characteristics of malware, such as its byte-code structure, while BMD approaches monitor the dynamic behavior of malware at runtime, particularly the sequence of Application Programming Interface (API) calls made by the software to the underlying operating system ([6]). This behavioral analysis provides valuable insights into the actions performed by malware, offering a more comprehensive understanding of its capabilities and intentions.

However, the complexity and variability of modern malware present significant challenges to both SMD and BMD approaches. Static analysis techniques are vulnerable to evasion tactics such as dynamic code linking and encryption, while behavioral analysis can be computationally intensive and time-consuming ([7, 8]). In response to these challenges, researchers have turned to advanced Machine Learning (ML) and Deep Learning (DL) techniques to enhance the effectiveness of malware detection systems ([7, 9, 10]). These approaches leverage the power of neural networks to automatically learn complex patterns and features from raw data, offering promising avenues for improving detection accuracy and efficiency.

Despite their impressive performance, ML and DL-based detection systems often lack transparency and interpretability, raising concerns about their trustworthiness and reliability in real-world applications. To address these concerns, researchers have begun exploring the field of eXplainable Artificial Intelligence (XAI), which focuses on developing models and techniques that can provide human-understandable explanations for AI-driven decisions ([11]). In the context of malware detection, XAI methodologies aim to elucidate the underlying reasoning behind classification decisions, offering valuable insights into the features and patterns driving the detection process.

While XAI approaches have shown promise in enhancing the explainability of malware detection systems, their application to Behavioral Malware Detection (BMD) remains relatively unexplored, particularly in the context of deep sequential neural networks. This gap in research underscores the need for comprehensive investigations into the explainability of BMD systems, especially as they become increasingly reliant on advanced DL techniques. In our research, we present a novel XAI framework for BMD, leveraging a range of state-of-the-art techniques to provide transparent and interpretable explanations for classification decisions. Through extensive experimentation on publicly available datasets, we evaluate the effectiveness and robustness of our framework, shedding light on its utility and potential limitations in real-world cybersecurity applications.

In more detail, our methodology builds on a three-step pipeline: the sequence preprocessing module standardizes the data format; the model is a classifier that exploits the sequential structure of the input data to perform the classification; and the explainer generates the explanation supporting the model's prediction. Our methodological workflow is summarized in Fig. 1.
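The three steps can be illustrated with a minimal, dependency-free sketch. Everything in it is an assumption made for illustration only: the vocabulary, the fixed sequence length, the additive scorer standing in for the sequence classifier, and the choice of occlusion (leave-one-out masking) as the explainer. The text does not name the specific XAI methods evaluated, so this should not be read as the authors' implementation.

```python
from typing import Callable, Dict, List

PAD = 0  # hypothetical padding token id
UNK = 1  # hypothetical id for API calls missing from the vocabulary


def preprocess(calls: List[str], vocab: Dict[str, int], max_len: int) -> List[int]:
    """Step 1: map API-call names to integer ids, then truncate or pad."""
    ids = [vocab.get(name, UNK) for name in calls][:max_len]
    return ids + [PAD] * (max_len - len(ids))


def toy_model(ids: List[int], weights: Dict[int, float]) -> float:
    """Step 2 (stand-in): a malware score clipped to [0, 1].

    A real system would use a sequence model such as an LSTM; this
    additive scorer only keeps the sketch free of dependencies.
    """
    raw = sum(weights.get(i, 0.0) for i in ids if i != PAD)
    return max(0.0, min(1.0, 0.5 + raw))


def occlusion_explainer(ids: List[int], score: Callable[[List[int]], float]) -> List[float]:
    """Step 3: relevance of a position = score drop when it is masked."""
    base = score(ids)
    relevance = []
    for pos in range(len(ids)):
        masked = ids[:pos] + [PAD] + ids[pos + 1:]
        relevance.append(base - score(masked))
    return relevance


# Hypothetical vocabulary and per-token weights, for illustration only.
vocab = {"CreateFileW": 2, "WriteProcessMemory": 3, "RegSetValueExW": 4}
weights = {2: -0.1, 3: 0.4, 4: 0.1}

sequence = ["CreateFileW", "WriteProcessMemory", "RegSetValueExW"]
ids = preprocess(sequence, vocab, max_len=5)
relevance = occlusion_explainer(ids, lambda x: toy_model(x, weights))
top_position = max(range(len(ids)), key=lambda p: relevance[p])
```

In this toy run the explainer assigns the highest relevance to the second call, because masking it lowers the score the most. This is a local explanation for one sample, the kind of explanation the study focuses on.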

To sum up, we introduced an Explainable Artificial Intelligence (XAI) framework for behavioral malware detection. We aimed to assess the effectiveness of four XAI methods within a sequence-based deep learning model and their relevance in contemporary cybersecurity applications.

Our experiments demonstrated the feasibility of various XAI techniques in explaining the decisions of LSTM-based classifiers, considering both explanation quality and computational efficiency. While our focus was on local explanations for individual samples, global explanations were not addressed.

However, limitations exist, particularly regarding the lack of qualitative metrics to directly evaluate XAI effectiveness and the potential influence of domain-specific factors on our findings. Future research will explore additional XAI methods and assess the robustness of our framework against adversarial attacks. We also plan to investigate whether explanations can enhance classification performance and assist in identifying systematic errors in predictive models. Real-world scenarios will be considered to evaluate the practical utility of explanations in aiding expert analysts.


Autoencoder-Based Deep Learning Pipeline for Network Anomaly Detection

In recent years, the rapid expansion of interconnected devices, like those found in IoT and Cloud networks, has highlighted the urgent need for strong network security assessments. One crucial aspect of addressing this challenge is detecting network anomalies, which serve as important indicators of network intrusions, privacy breaches, system damage, and fraudulent activities. Deep neural networks, known for their ability to learn intricate anomaly patterns from data, have become increasingly popular in this field. However, their effectiveness can be hampered by the unique characteristics of network traffic data, which is sparse, noisy, and often imbalanced due to the multitude of devices and internet applications generating it. Anomalies typically occur in only a small fraction of instances, ranging from 0.001% to 1%.

In our research, we tackle these challenges with a two-stage approach. First, we use an autoencoder (AE) to identify instances of anomalous behavior. Then, an attack classifier assigns each detected anomaly to its specific type. We have tested our framework on a large-scale dataset consisting of real-world network traffic data, yielding promising results.

Our proposed framework, depicted in Figure 2, processes the session description attributes $s_i$ (such as port number and bytes transferred) and determines whether the input is benign or represents an attack. In the case of an attack, the output $y_i$ identifies the specific type of attack (e.g., DDoS, sweep).

Denoising Autoencoder (DAE): The DAE module processes the $i$-th session $s_i \in \mathbb{R}^n$ and outputs its latent representation $\tilde{x}_i \in \mathbb{R}^k$ and the reconstructed instance $\tilde{s}_i \in \mathbb{R}^n$. The latent representation can be considered as the DAE features, while the reconstructed instance represents how the input session might be generated from the latent space.
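To make the shapes concrete, the following is a minimal sketch of a single-hidden-layer denoising autoencoder forward pass in NumPy. The dimensions, the Gaussian corruption, the tanh encoder, the linear decoder, and the random weights are all assumptions for illustration; the text does not specify the architecture of the DAE.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 8, 3  # hypothetical input and latent dimensions

# Randomly initialised weights; a trained DAE would learn these.
W_enc, b_enc = rng.normal(scale=0.1, size=(n, k)), np.zeros(k)
W_dec, b_dec = rng.normal(scale=0.1, size=(k, n)), np.zeros(n)


def dae_forward(s, noise_std=0.0):
    """Return (latent x_tilde in R^k, reconstruction s_tilde in R^n)."""
    if noise_std > 0:
        # Denoising corruption, applied only during training.
        s = s + rng.normal(scale=noise_std, size=s.shape)
    x_tilde = np.tanh(s @ W_enc + b_enc)  # encoder: R^n -> R^k
    s_tilde = x_tilde @ W_dec + b_dec     # linear decoder: R^k -> R^n
    return x_tilde, s_tilde


s_i = rng.random(n)                      # one session with n attributes
x_tilde_i, s_tilde_i = dae_forward(s_i)  # no corruption at inference time

# Mean squared error between input and reconstruction; MSE is the
# quantity monitored on the validation set during DAE training.
mse = float(np.mean((s_i - s_tilde_i) ** 2))
```

During training one would call `dae_forward(s, noise_std=...)` with a positive noise level and minimise the MSE against the clean input, which is what makes the autoencoder "denoising".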

Reconstruction Error (RE) Module: The RE module uses the DAE output $\tilde{s}_i$ to calculate the reconstruction error $e_i \in \mathbb{R}$. This error indicates how well the autoencoder represents the input session: a higher error suggests a poorer representation. The RE module assesses the similarity between $s_i$ and $\tilde{s}_i$ using a metric $m(\cdot)$, such as cosine similarity or dot product; empirically, cosine similarity gave better results.
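As a small worked example, the sketch below derives a reconstruction error from cosine similarity. The mapping from similarity to error, $e_i = 1 - \cos(s_i, \tilde{s}_i)$, is an assumption: the text names the similarity metric but does not state how the error is obtained from it. The example vectors are made up.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


def reconstruction_error(s: Sequence[float], s_tilde: Sequence[float]) -> float:
    """Assumed mapping from similarity to error: e = 1 - cos(s, s_tilde)."""
    return 1.0 - cosine_similarity(s, s_tilde)


s_i = [1.0, 2.0, 3.0]
good = [1.1, 1.9, 3.2]   # close reconstruction
poor = [3.0, -1.0, 0.5]  # distant reconstruction
e_good = reconstruction_error(s_i, good)
e_poor = reconstruction_error(s_i, poor)
```

With this convention a faithful reconstruction yields an error near 0 and a poor one yields a larger value, matching the interpretation that a higher error suggests a poorer representation.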

Threshold Module (TRH): The TRH module concatenates the reconstruction error $e_i$ with the latent representation $\tilde{x}_i$, forming a comprehensive feature vector for the input instance. It functions as a binary classifier within a multilayer perceptron architecture, discerning if the DAE has recognized $s_i$ as akin to the benign instances it was trained on:

$$f : \tilde{x}_i \in \mathbb{R}^k \to \{0, 1\} \tag{1}$$

Here, an output of 0 indicates a benign session, while an output of 1 signals an attack, the specifics of which are determined by the AC module.

Attack Classifier (AC): In tandem with the TRH computation, the AC module also receives the concatenated vector of $e_i$ and $\tilde{x}_i$. The AC module employs a multiclass tabular classifier (such as a random forest or support vector machine) that can be trained using standard supervised machine learning methods. It assigns the attack typology to the input instance, with the choice of classification algorithm impacting overall performance, as detailed in the experimental section.

The final decision of the framework is derived by considering the outputs of both the TRH and AC modules. If the TRH output is zero, indicating successful reconstruction by the DAE, the input instance is classified as benign. If not, the input instance is classified according to the attack type predicted by the AC module. This approach leverages the DAE's ability to recognize benign sessions, a capability honed through extensive training on numerous instances, while the AC module provides the specificity in attack typology classification when an attack is presumed.
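The combination rule for the two modules can be written in a few lines. In the sketch below, `toy_trh` and `toy_ac` are hypothetical stand-ins for the trained multilayer perceptron and the trained multiclass classifier; their thresholds and labels are made up, and only `final_decision` reflects the rule stated in the text.

```python
from typing import Callable, List

BENIGN = "benign"


def final_decision(
    features: List[float],
    trh: Callable[[List[float]], int],
    ac: Callable[[List[float]], str],
) -> str:
    """Combine the TRH and AC outputs.

    TRH output 0 -> benign; otherwise return the attack type from AC.
    """
    if trh(features) == 0:
        return BENIGN
    return ac(features)


# Hypothetical stand-ins for the trained modules (illustration only).
def toy_trh(features: List[float]) -> int:
    reconstruction_error = features[0]
    return 1 if reconstruction_error > 0.5 else 0


def toy_ac(features: List[float]) -> str:
    return "DDoS" if features[1] > 0.0 else "Port sweep"


# features = [e_i] + latent representation x_tilde_i
print(final_decision([0.02, 0.3, -0.1], toy_trh, toy_ac))  # benign
print(final_decision([0.90, 0.3, -0.1], toy_trh, toy_ac))  # DDoS
print(final_decision([0.90, -0.4, 0.2], toy_trh, toy_ac))  # Port sweep
```

One consequence of this design is that errors compound: a benign session wrongly flagged by the TRH is always assigned some attack type, since the AC has no benign class to fall back on.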

Our dataset comes from the NAD2021 challenge [12], in which participants are provided with traffic records from three specific dates, each labeled as either normal traffic or a specific type of network attack. The challenge focuses on two primary types of attacks: (1) probing attacks, which involve attempts to extract data from a targeted network, and (2) DDoS-Smurf attacks, which are characterized by the use of numerous ICMP flows aimed at overwhelming and halting traffic to a specific destination IP address.

The DAE module was trained using an early stopping mechanism, halting after three epochs without MSE improvement on the validation set. Figure 3 shows that training stops at 69 epochs and the model easily learns to reconstruct input samples.

The TRH model, integrating latent features from the DAE and its reconstruction error, was trained to classify samples as Normal (0) or Anomalous (1), using a similar early stopping strategy set at 10 epochs. Figure 4 shows that training stops at epoch 202 with a training accuracy $Acc_{train} = 0.9697$ and a validation accuracy $Acc_{val} = 0.9698$. These results indicate the model's proficiency in differentiating between anomalous and normal samples.
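The early stopping criterion can be sketched as a patience counter over validation losses. The function name and the loss values are illustrative, and reading "set at 10 epochs" as a patience of 10 for the TRH is an assumption; the DAE patience of three epochs is stated in the text.

```python
from typing import Iterable, Tuple


def early_stopping_epoch(val_losses: Iterable[float], patience: int) -> Tuple[int, float]:
    """Return (epoch at which training stops, best validation loss).

    Training halts once `patience` consecutive epochs bring no
    improvement over the best loss seen so far. Epochs are 1-indexed.
    """
    best = float("inf")
    stale = 0
    epoch = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, stale = loss, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return epoch, best


# Made-up validation losses: improvement stalls after epoch 3.
losses = [0.9, 0.5, 0.4, 0.41, 0.42, 0.40, 0.39]
stop_epoch, best_loss = early_stopping_epoch(losses, patience=3)
```

With a patience of 3 this toy run stops at epoch 6 and never sees the slightly better loss at epoch 7, which illustrates the trade-off behind choosing a larger patience for a noisier training curve.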

The AC module, tasked with classifying attack samples identified by the TRH, was trained using a RandomForest classifier. Performance metrics (Precision, Recall, and F1 scores) are reported in Table 1, and the confusion matrix in Table 2 provides further insights into the classifier's performance across different attack types.

The final test assessed the combined performance of the DAE, TRH, and AC modules on the test set. Given the unbalanced nature of the data, Precision and Recall were key metrics for evaluating the DAE+TRH's ability to distinguish between normal and anomalous samples. While these modules demonstrated high quality in differentiating negatives from positives, there were limitations in identifying all anomalies. The cumulative errors from the DAE+TRH and AC modules are reflected in the overall system performance. The aggregated $F_{\alpha\beta}$ score, evaluating the system across all classes, was recorded as 0.577, indicating areas for improvement in the pipeline's ability to accurately classify various types of network activities.

In conclusion, we introduced a streamlined and effective framework for Network Anomaly Detection (NAD). Our approach involves two main phases: (1) identifying anomalies using latent features generated by a Deep Denoising Autoencoder, and (2) classifying these anomalies with a multiclass classifier. Despite potential error propagation within the pipeline, our approach has shown promising results. However, we observed a limitation in the performance of the Threshold module (TRH), particularly in detecting attack samples, due to dataset imbalance. Future research will focus on implementing class-balancing techniques to improve the TRH module's effectiveness and enhance the overall system performance.

AI Act and Biometrics

As AI becomes more integrated into daily life, cybersecurity emerges as a critical concern. The AI Act, the first global law on AI usage, serves as a key regulatory framework within the European Union, emphasizing ethical considerations in cybersecurity. This law seeks a balance between technological innovation and the protection of core ethical values, ensuring AI is used responsibly. Particularly important within the AI Act is the role of cybersecurity for high-risk AI systems, which requires a comprehensive security approach. One significant challenge addressed by the AI Act is the management of biometrics, acknowledging their sensitive nature and the privacy and security implications for individuals. The act is particularly concerned with the ethical use of biometric data, such as fingerprints and facial and vocal traits, because of the personal data protection such data requires. To regulate the deployment of facial and biometric recognition technologies in public spaces, the AI Act sets strict rules, allowing exceptions only in well-defined scenarios like locating missing persons or preventing serious crimes [13].

While the AI Act represents a significant step forward in balancing the benefits of artificial intelligence with the protection of fundamental rights, it also further complicates the landscape of challenges that remain. On one hand, stringent regulations are essential for managing the risks associated with AI technologies and ensuring they adhere to ethical standards. On the other hand, continuous research in the field of AI and biometrics is critical. The need for advancing research in biometrics is recognized globally, to the extent that numerous international competitions have been established to challenge researchers in identifying fake biometrics. Over the years, the Naples node of the CINI-AIIS Lab has made significant contributions to the field of fake fingerprint detection. It has actively participated in several editions of LIVDET¹, an international competition that challenges researchers to distinguish between live and fake fingerprints created through diverse techniques and spoofing materials. Our team has achieved notable success in the last two editions, securing first place in one and second place in the other. These accomplishments were made possible through our innovative use of adversarial learning techniques, which allowed us to perform a synthetic data augmentation that improves the overall performance of a liveness detector [14], achieving an accuracy over 90% on two datasets. More recently, drawing on the experience gained over the years, we also developed a new fake fingerprint crafting strategy that can be used to physically cast a fake fingerprint able to bypass AI-based liveness detectors [15].

These results not only anticipate future cybersecurity threats but also aid in formulating effective defence mechanisms. To address this need while also protecting people from misuse, we argue that one of the major challenges in the field of AI is education: promoting a deeper understanding of the risks and ethical implications of AI, and enabling people to participate in an informed and conscious manner in public debate and decision-making regarding the use and regulation of these technologies. In pursuing a balance between technological innovation and the protection of fundamental rights, it seems necessary to promote an open and inclusive dialogue involving both developers and civil society stakeholders [16].

Figure 1: Methodological workflow. The pre-processing step aims to standardize the data format. The model classifies the input sequence as malware/goodware, and the explainer generates the explanation. The models are then evaluated in terms of classification performance, efficiency and explanation quality.

Figure 2: Overview of proposed NAD pipeline.

Figure 3: DAE reconstruction error on training and validation splits. On the x axis we report the increasing number of epochs, while MSE values are reported on the y axis.

Figure 4: TRH accuracy on training and validation splits. On the x axis we report the increasing number of epochs, while accuracy values are reported on the y axis.

DAE training (see Figure 3): the final MSE scores were 1.2944e-5 for training and 1.2402e-5 for validation. Additionally, further training for five epochs using both training and validation data reduced the training MSE to 1.1759e-5.

Table 1: Attacks classifier, validation performance (Precision, Recall and F1 scores).

Anomaly       Precision   Recall   F1
DDoS          0.99        1.00     0.99
IP sweep      1.00        1.00     1.00
Nmap sweep    0.98        0.87     0.92
Port sweep    0.99        0.99     0.99

Table 2: Attacks classifier, validation confusion matrix.

              DDoS   IP sweep   Nmap sweep   Port sweep
DDoS          374    1          0            0
IP sweep      2      38310      0            172
Nmap sweep    1      4          116          12
Port sweep    2      109        2            12253

Table 3: Test performance of DAE+TRH modules distinguishing anomalous and normal samples.

Class         Precision   Recall   F1
Normal        1.00        0.96     0.98
Anomaly       0.47        0.98     0.63

Table 4: Test performance of the entire DAE+TRH+AC pipeline.

Class         Precision   Recall   F1
DDoS          0.11        0.52     0.19
Normal        1.00        0.96     0.98
IP sweep      0.53        0.99     0.69
Nmap sweep    0.96        0.83     0.89
Port sweep    0.34        0.95     0.50

¹ https://sites.unica.it/livdet/

Acknowledgments

This work was supported in part by the Piano Nazionale Ripresa Resilienza (PNRR) Ministero dell'Università e della Ricerca (MUR) Project under Grant PE0000013-FAIR.

References

[1] S. Yan, J. Ren, W. Wang, L. Sun, W. Zhang, Q. Yu, A survey of adversarial attack and defense methods for malware classification in cyber security, IEEE Communications Surveys & Tutorials 25 (2023). doi:10.1109/COMST.2022.3225137.

[2] N. Galloro, M. Polino, M. Carminati, A. Continella, S. Zanero, A systematical and longitudinal study of evasive behaviors in windows malware, Computers & Security 113 (2022) 102550. doi:10.1016/j.cose.2021.102550.

[3] F. Zhong, X. Cheng, D. Yu, B. Gong, S. Song, J. Yu, MalFox: Camouflaged adversarial malware example generation based on Conv-GANs against black-box detectors, IEEE Transactions on Computers (2023). doi:10.1109/TC.2023.3236901.

[4] Z. Bazrafshan, H. Hashemi, S. M. H. Fard, A. Hamzeh, A survey on heuristic malware detection techniques, in: The 5th Conference on Information and Knowledge Technology, IEEE, 2013. doi:10.1109/IKT.2013.6620049.

[5] B. Cheng, J. Ming, E. A. Leal, H. Zhang, J. Fu, G. Peng, J.-Y. Marion, Obfuscation-resilient executable payload extraction from packed malware, in: 30th USENIX Security Symposium (USENIX Security 21), 2021.

[6] M. Alazab, R. Layton, S. Venkatraman, P. Watters, Malware detection based on structural and behavioural features of API calls, in: International Cyber Resilience Conference (1st: 2010), Edith Cowan University, 2010.

[7] M. G. Gaber, M. Ahmed, H. Janicke, Malware detection with artificial intelligence: A systematic literature review, ACM Computing Surveys (2023). doi:10.1145/3638552.

[8] A. Damodaran, F. Di Troia, C. A. Visaggio, T. H. Austin, M. Stamp, A comparison of static, dynamic, and hybrid analysis for malware detection, Journal of Computer Virology and Hacking Techniques 13 (2017). doi:10.1007/s11416-015-0261-z.

[9] F. O. Catak, A. F. Yazı, O. Elezaj, J. Ahmed, Deep learning based sequential model for malware analysis using windows exe api calls, PeerJ Computer Science 6 (2020) e285. doi:10.7717/peerj-cs.285.

[10] G. M, S. C. Sethuraman, A comprehensive survey on deep learning based malware detection techniques, Computer Science Review 47 (2023) 100529. doi:10.1016/j.cosrev.2022.100529.

[11] S. Ali, T. Abuhmed, S. El-Sappagh, K. Muhammad, J. M. Alonso-Moral, R. Confalonieri, R. Guidotti, J. Del Ser, N. Díaz-Rodríguez, F. Herrera, Explainable Artificial Intelligence (XAI): What we know and what is left to attain Trustworthy Artificial Intelligence, Information Fusion 99 (2023) 101805. doi:10.1016/j.inffus.2023.101805.

[12] L. Chen, S.-E. Weng, C.-J. Peng, H.-H. Shuai, W.-H. Cheng, ZYELL-NCTU NetTraffic-1.0: A large-scale dataset for real-world network anomaly detection, 2021. doi:10.48550/ARXIV.2103.05767.

[13] T. Madiega, Artificial intelligence act, European Parliament: European Parliamentary Research Service, 2021.

[14] A. Galli, M. Gravina, S. Marrone, D. Mattiello, C. Sansone, Adversarial liveness detector: Leveraging adversarial perturbations in fingerprint liveness detection, IET Biometrics 12 (2023).

[15] R. Casula, G. Orrù, S. Marrone, U. Gagliardini, G. L. Marcialis, C. Sansone, Realistic fingerprint presentation attacks based on an adversarial approach, IEEE Transactions on Information Forensics and Security (2023).

[16] J. Borenstein, A. Howard, Emerging challenges in AI and the need for AI ethics education, AI and Ethics 1 (2021).