Securing Intelligent Autonomous Systems Through Artificial Intelligence

Ganapathy Mani (a), Bharat Bhargava (a), Jason Kobes (b), Justin King (b), James MacDonald (b)

(a) Purdue University, West Lafayette, Indiana, USA
(b) Northrop Grumman Corporation, McLean, Virginia, USA

Abstract

Intelligent Autonomous Systems (IAS) reconstruct their perception through adaptive learning and meet mission objectives. IAS are highly cognitive, rich in knowledge discovery, reflective through rapid adaptation, and provide security assurance. It is paramount for IAS to have effective reasoning, decision-making, and understanding of operational context, since they are exposed to advanced multi-stage attacks during training and inference time. Advanced malware types such as file-less malware with a benign initial execution phase can mislead IAS into accepting them as normal processes and then execute malicious code later. IAS are also exposed to adaptive poisoning attacks, where an adversary injects malicious data into the training/testing set to manipulate learning. Hence it is vital to monitor IAS activities and interactions to conduct forensics. This project will advance the science of security in IAS through multifaceted advanced analytics, cognitive and adversarial machine learning, and cyber attribution, based on the following approaches.

(a) Implement deep learning-based application profiling to categorize adaptive cyber-attacks and poison attacks on machine learning models using contextual information about the origin, trust, and transformation of data.
(b) Use HW/OS/SW data to develop perception algorithms using LSTM deep neural networks for detecting malware/anomalies and classifying dynamic attack contexts.
(c) Facilitate cyber attribution for forensics through a privacy-preserving provenance structure for knowledge representation and perform intrusion detection sampling on HW/OS/SW data.
(d) Employ advanced data analytics to aid ontological and semantic reasoning models to enhance decision-making, attack adaptiveness, and self-healing.

Keywords: autonomy, machine learning, deep learning, cybersecurity, LSTM

International Semantic Intelligence Conference (ISIC 2021), Feb 25-27, 2021, New Delhi, India
EMAIL: bbshail@purdue.edu (A. 2)
© 2020 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org)

1. Solution Overview

Our focus is on constraints, barriers, and challenges such as poorly understood attack surfaces, training data set availability and biases, processing latency, human understanding of AI results, AI/ML countermeasures, human-machine disparity, and measurement of effects. We propose novel approaches for privacy-preserving cyber attribution, intrusion detection, adversarial machine learning, malware/anomaly detection, reasoning, and decision-making. Cyber attribution involves extracting software, hardware, and operating system data to perform intrusion detection sampling (fixed or dynamic sampling), generating an efficient provenance structure that is populated with the specific data required for a particular analysis or learning task, and labeling and tagging to properly represent the information obtained. The processed data is distributed to the cognitive module, where it is checked for the presence of malicious data through a poison attack filter. The filtered data is transmitted to the cognitive computing module and the knowledge discovery module, where it is fed into supervised, unsupervised, and LSTM models to perform learning and advanced analytics. Based on multifaceted dimensions of data analytics, the reasoning and decision-making ability of IAS is enhanced. The overall architecture of the proposed model, secure intelligent autonomous systems with cyber attribution, is shown in Figure 1.

Figure 1: Comprehensive Architecture of Secure Intelligent Autonomous Systems with Cyber Attribution

General characteristics of the proposed unified architecture are given as follows:

• Intelligent autonomous systems receive large amounts of diverse data from various data sources. In addition, they operate in a dynamic operational context and interact with numerous entities such as other IAS, UAVs, satellites, sensors, cloud systems, analysts, malicious actors, and compromised systems.
• The cyber attribution module constitutes a stream data processor where data streams are labeled/tagged on-the-fly for better knowledge representation and categorization. This data is stored as monitored or provenance data with its origin and historical information. To preserve privacy, detailed provenance data is reduced in scope to include only the data necessary for a particular analysis or learning task. This module uses the Provenance Ontology (PROV-O) structure (elaborated in a later section) to obscure unnecessary or privacy-compromising data. Furthermore, the attribution model monitors data generated by software (application parameters), hardware (memory bytes and instructions), and the operating system (system calls). This data is used to conduct periodic sampling to identify signatures of intrusion activities.
• Once the data is processed, it goes through the adversarial machine learning model. Attackers can insert malicious data into training and testing datasets to influence machine learning models. To isolate poisonous data, the poison data filter applies methods such as classification of verified and unverified data as well as outlier extraction. Once the poisonous data is removed, the data (raw or provenance data) is sent to the cognitive computing module and the knowledge discovery module.
• In the cognitive computing module, depending on the data and the efficiency of the machine learning methods, malware/anomaly detection is performed through either deep learning methodologies such as Long Short-Term Memory (LSTM) Recurrent Neural Networks (RNN) or Convolutional Neural Networks (CNN), or light-weight yet powerful machine learning methods such as Support Vector Machines (SVM), Random Forests (RF), and K-Nearest Neighbors (KNN). In addition, the cognitive computing module contains a reasoning engine, which is driven by rule sets and by semantic and ontological reasoning. Both the anomaly detection module and the reasoning engine influence the attack adaptiveness (reflexivity) and self-healing of IAS, where decisions obtained through reasoning and learning are turned into actions. With these extensive cognitive computing modules, the final response from IAS to other interacting entities will be a secure and trusted one.
• The knowledge discovery module facilitates multi-faceted dimensions of advanced data analytics including regression analysis, supervised learning, unsupervised learning, and pattern recognition. Discovered knowledge is shared with the cognitive computing module for further learning. The proposed structure provides robust cyber resilience and autonomous operation of the system.

2. Background on Cognitive Autonomy

Cognitive computing is a vital part of security in autonomous systems. In particular, malware and anomaly detection has become one of the biggest challenges with the increasing sophistication of attacks such as file-less malware [1] and ransomware [2]. A behavior-based malware detection system (pBMDS) was proposed in [3]. The technique observes unique behaviors of applications as well as users and leverages a Hidden Markov Model (HMM) to learn application and user behaviors based on two features: process state transitions and user operational patterns. One of the drawbacks of the HMM is that it has very limited memory and is therefore poorly suited to sequential data with long-range dependencies. In this project, we leverage hardware, software, and operating system data and apply long short-term memory units to identify anomalous behavior. We will also profile applications and malware using HW data (memory bytes and instruction sequences) to whitelist benign processes and blacklist malicious processes. To enable better results for LSTM deep learning methodologies, knowledge discovery and representation are important. We proposed a metadata labeling scheme, BFC, for information tagging and clustering by reversing the error correction coding technique known as Golay coding [4][8]. The scheme utilizes 2^23 binary vectors of size 23 bits to profile features and cluster the data items. Since the method is built on an error correction scheme, it exhibits fault tolerance to wrongly labeled data. Similarly, we perform privacy-preserving knowledge discovery through perturbed aggregation in an untrusted cloud [5]. In this project, we will use advanced data analytics to enable the reasoning module to assist attack adaptation and reflexivity of the system.

3. Cognitive Autonomy for Cybersecurity in Autonomous Systems

Decentralized machine learning is a promising emerging paradigm in view of global challenges of data ownership and privacy. We consider learning of linear classification and regression models in the setting where the training data is decentralized over many user devices and the learning algorithm must run on-device, on an arbitrary communication network, without a central coordinator. We plan to utilize and advance COLA, a new decentralized training algorithm [23] with strong theoretical guarantees and superior practical performance. This framework overcomes many limitations of existing methods and achieves communication efficiency, scalability, and elasticity as well as resilience to changes in data and participating devices. We will consider fault tolerance to dropped nodes, to nodes oscillating between connected and disconnected states, and to attacks on the nodes. The learning framework has to be communication-efficient, decentralized, and free of parameter tuning. COLA offers full adaptivity to heterogeneous distributed systems on arbitrary network topologies, is adaptive to changes in network size and data, and offers fault tolerance and elasticity. IAS should have a clear understanding of its operational context, its own processes, and its interactions with neighboring entities. In this project, the cognitive computing module consists of three major components: (1) malware/anomaly detection module, (2) reasoning engine, and (3) reflexivity engine. Cyber attribution data (system monitoring data or provenance data) is sent to the cognitive computing engine for analysis, where the system profiles the applications based on machine learning models. In this paper, we focus on the cognitive autonomy property of autonomous systems.

4. Malware and Anomalous Application Behavior Profiling with Deep Learning Model

Figure 2: Recurrent Neural Network (RNN) model for application behavior profiling

We use instruction sequences executed in memory by an application to understand the behavior of each application.

Input: n-gram sequences of instructions from memory
Output: Binary classification of benign or malicious
• Step 1: Define a finite set I of instructions {i1, i2, ..., in} in the system. Instructions are executed based on time epochs, i.e., they form time-series data.
• Step 2: Given an observed sequence {i1, i2, ..., in}, we find the set N of the top P sequences to be executed at time t. The size of the set N varies in each prediction and is determined by the n-gram input as well as the clusters in the output of the model.
• Step 3: At time t, the observed sequence {i1, i2, ..., in} is benign if it is among the top P predicted sequences in N; otherwise it is malicious.

Algorithm 1: Application Behavioral Profiling Algorithm

5. Malware and Anomaly Detection with Light-weight Machine Learning Models

Figure 3: Malware/anomaly Detection with Light-weight Machine Learning Methods

Advanced malware such as ransomware encrypts IAS data without authorization. Since it neither alters system configurations nor leaves a footprint, it is difficult to detect. However, applications can be profiled based on the executed instruction sequences and the constants (also known as magic constants) used by the encryption mechanism during malware execution. First, we will sample the address spots every 1,000,000 instructions (fixed sampling). After a fixed period of time, we will calculate the frequently occurring addresses and their relevant process ids. A threshold T will be set for data extraction. For example, extract memory bytes and instructions from the top T = 10% of the global list of sampled addresses (sorted in descending order of frequency of occurrence). Once the opcode and memory byte data is collected, we will extract features such as n-gram, bigram, and unigram features, the magic constants feature, cosine similarity with instruction occurrences, and standard deviation. The cosine similarity metric is one of the most efficient methods to learn from large datasets [20]. It plays a crucial role in understanding the similarity between two feature vectors when the magnitude of the vectors is large or unspecified, i.e., the features can be unigram, bigram, or n-gram features. Given two feature vectors V1 = {f11, f12, ...} and V2 = {f21, f22, ...}, where f11, f21, ... are values of a particular feature, the cosine similarity is given as

$$\cos\theta = \frac{V_1 \cdot V_2}{\lVert V_1 \rVert \, \lVert V_2 \rVert} = \frac{\sum_k f_{1k} f_{2k}}{\sqrt{\sum_k f_{1k}^2} \, \sqrt{\sum_k f_{2k}^2}}$$

For non-negative feature values such as occurrence counts, the cosine similarity lies between 0 and 1. If the orientation of the two feature vectors is the same, the similarity between them is cos 0° = 1, i.e., there is zero angle between them. When the orientation of the feature vectors differs by an angle of 90°, the similarity is cos 90° = 0. The similarity score therefore varies within [0, 1]. Once the features are extracted, we will implement RF, SVM, and KNN learning models. KNN is one of the simplest yet most powerful classifiers, with high computational efficiency as well as accuracy [6].

6. Conclusion

We presented two approaches for detecting evasive malware applications through profiling. We use both light-weight machine learning models and deep learning models to profile and understand the behavior of autonomous systems. This multi-model approach is advantageous when it comes to computational resources in mission-critical systems. Based on the data and sample size, an appropriate model can be selected for analysis. In particular, light-weight machine learning models use fewer computational resources and have considerably lower time complexity. On the other hand, the LSTM model can provide robust classification with fundamental data, which enables IAS to understand evasive malware at a basic level.

7. Acknowledgements

This research is funded by Northrop Grumman Corporation.

8. References

[1] Hopkins, Michael, and Ali Dehghantanha. "Exploit Kits: The production line of the Cybercrime economy?" In Information Security and Cyber Forensics (InfoSec), 2015 Second International Conference on, pp. 23-27. IEEE, 2015.
[2] Kharraz, Amin, William Robertson, Davide Balzarotti, Leyla Bilge, and Engin Kirda. "Cutting the gordian knot: A look under the hood of ransomware attacks." In International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment, pp. 3-24. Springer, Cham, 2015.
[3] Xie, Liang, Xinwen Zhang, Jean-Pierre Seifert, and Sencun Zhu. "pBMDS: a behavior-based malware detection system for cellphone devices." In Proceedings of the third ACM conference on Wireless network security, pp. 37-48. ACM, 2010.
[4] Mani, Ganapathy, Bharat Bhargava, and Jason Kobes. "Scalable Deep Learning Through Fuzzy-based Clustering in Autonomous Systems." In IEEE International Conference on Artificial Intelligence and Knowledge Engineering (AIKE), pp. IEEE. 2018. http://www.cs.purdue.edu/homes/bb/aike2.pdf
[5] Mani, Ganapathy, Denis Ulybyshev, Bharat Bhargava, Jason Kobes, and Puneet Goyal. "Autonomous Aggregate Data Analytics in Untrusted Cloud." In IEEE International Conference on Artificial Intelligence and Knowledge Engineering (AIKE), pp. IEEE. 2018. http://www.cs.purdue.edu/homes/bb/aikel.pdf
[6] Prasath, V. B., Haneen Arafat Abu Alfeilat, Omar Lasassmeh, and Ahmad Hassanat. "Distance and Similarity Measures Effect on the Performance of K-Nearest Neighbor Classifier - A Review." arXiv preprint arXiv:1708.04321 (2017).
[7] Bholowalia, Purnima, and Arvind Kumar. "EBK-means: A clustering technique based on elbow method and k-means in WSN." International Journal of Computer Applications 105, no. 9 (2014).
[8] Mani, Ganapathy, Nima Bari, Duoduo Liao, and Simon Berkovich. "Organization of knowledge extraction from big data systems." In 2014 Fifth International Conference on Computing for Geospatial Research and Application, pp. 63-69. IEEE, 2014.
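
As an informal illustration of the feature extraction described in Section 5, the following Python sketch selects the top T fraction of sampled addresses by frequency of occurrence and computes the cosine similarity between n-gram count vectors of two opcode sequences. The function names (`top_addresses`, `ngram_counts`, `cosine_similarity`) are hypothetical and chosen only for illustration; this is a minimal sketch under those assumptions, not the authors' implementation.

```python
import math
from collections import Counter


def top_addresses(sampled, fraction=0.10):
    """Return the most frequent sampled addresses, keeping the top `fraction`.

    `sampled` is a list of addresses collected by fixed sampling
    (hypothetical input format, assumed for illustration).
    """
    counts = Counter(sampled)
    # most_common() sorts in descending order of frequency.
    ranked = [addr for addr, _ in counts.most_common()]
    keep = max(1, math.ceil(len(ranked) * fraction))
    return ranked[:keep]


def ngram_counts(opcodes, n=2):
    """Count the n-grams (as tuples) in a sequence of opcodes."""
    return Counter(tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention assumed here for empty vectors
    return dot / (norm_a * norm_b)
```

Because n-gram counts are non-negative, the score returned by `cosine_similarity` stays within [0, 1], matching the range discussed in Section 5. The resulting scores could then be used as one feature among others for the RF, SVM, or KNN classifiers.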