Forensic Artifacts' Analysis using Graph Theory

Sophia Petra Krišáková1,*,†, Pavol Sokol1,† and Rastislav Krivoš-Belluš1,†

1 Institute of Computer Science, Faculty of Science, Pavol Jozef Šafárik University in Košice, Jesenná 5, 040 01 Košice, Slovakia

ITAT 2024 Information Technologies – Applications and Theory 2024, September 20–24, 2024, Drienica, Slovakia
* Corresponding author.
† These authors contributed equally.
sophia.petra.krisakova@upjs.sk (S. P. Krišáková); pavol.sokol@upjs.sk (P. Sokol); rastislav.krivos-bellus@upjs.sk (R. Krivoš-Belluš)
ORCID: 0000-0002-1967-8802 (P. Sokol)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073

Abstract

The number of cyber-attacks is constantly growing, and their sophistication is increasing due to new techniques and strategies of attackers. Organisations must continuously improve their methods of detecting and responding to these attacks to protect their networks and information systems. The time between the occurrence of a security incident and its identification averages 100 to 200 days, and the response time of organisations is between 50 and 70 days. Our work aims to reduce this time so that organisations can respond to security incidents more quickly. In this work, we use graph theory for forensic analysis in the Windows operating system. The main objective of the work is to identify digital evidence and the relationships between them. For this purpose, we work with datasets from various Capture the Flag (CTF) competitions. We describe the processing stages of the digital evidence and their transformation into graphs, and then identify anomalies and cycles in the graphs in order to provide readers with a deeper insight.

Keywords

graph theory, graph algorithms, forensic analysis, artifact, cybersecurity

1. Introduction

In the digital world, data and network security is a key concern for organisations of all sizes and industries. With the rise of cyber-attacks and their ever-changing nature, organisations must constantly adapt to protect their assets and ensure the security of their information. Cyber attackers are constantly developing new ways to penetrate systems and gain unauthorised access to sensitive data. As these attacks become more sophisticated, the challenge for organisations is to identify attacks as quickly as possible and respond appropriately.

One of the main issues in the response to security attacks is the time between the occurrence of a security incident, its identification, and the subsequent response. This time can be non-trivial, often measured in hundreds of days, giving attackers ample time to cause damage without being detected. In addition, even after a security incident is identified, an organisation needs time to resolve it and restore normal operations.

Our research focuses on the security incident response process, including digital forensics. The main aim is to reduce the time interval required to resolve a security incident and provide organisations with a way to respond more quickly and effectively to security incidents. In this article, we focus on how graph theory, applied to individual forensic artefacts available in the Windows operating system and the NTFS file system, can help us. Graph analysis allows us to identify the relationships between different digital evidence and their attributes, thereby better understanding the nature of the attack.

To achieve this objective, we pose the following partial research questions:

• What attributes of forensic artefacts are best suited for graph representation?
• How can specific properties of graphs help identify key forensic artefacts and relationships in digital forensics?

This paper is divided into six sections. Section 2 discusses papers relevant to this research. Section 3 specifies the methods employed in this paper, including the collection and processing of the digital evidence. Section 4 outlines the graph theory applied to digital evidence and graph generation options. Section 5 discusses the lessons learned from applying graph properties to digital evidence. Section 6 provides a summary, including our suggestions for future research.

2. Related works

We often think of data analysis and machine learning as elements of artificial intelligence, but data analysis is broader than that. We must also visualise the data, preprocess it, get basic statistics, etc. It is in the visualisation that graphs can help us. Several works have already been done in forensic artefact analysis using graphs. Many of these have focused on network communication, which is different from the form of data we use in this work. Nevertheless, we can take inspiration from them as they offer insight into the representation of forensic artefacts using graphs and their connections. We have selected a few of these papers to get a basic overview of the current state of the art in the given field.

2.1. Security Data Analysis

Cybercrime risks have escalated with the digitization of data (books, videos, images, medical and genetic information) via laptops, tablets, smartphones, and wearables. Digital forensics recovers lost or deleted files but requires more efficient investigation resources. Current processes rely heavily on human input, slowing responses to rapid cybercrimes. Machine learning can automate digital investigations, aiding digital investigators [1].

Costantini, Gasperis, and Olivieri explored artificial intelligence and computational logic, particularly answer set programming, to automate evidence analysis in digital forensics. They demonstrated how complex investigations could be optimized and automated to assist in generating hypotheses for court cases using graph theory-based algorithms [2].

Easttom highlighted the use of graph theory in criminal investigations, describing how mathematical modelling helps understand relationships among suspects, victims, and systems [3]. Palmer, Campbell, and Gelfand further discussed graph theory's role in forensic analysis, noting the visualization benefits for investigators and its potential to support the investigation process [4].

A study on distributed graph analysis of large-scale email datasets showcased improved efficiency and accuracy in digital evidence analysis using centrality algorithms [5]. Binwal, Devi, and Singh developed algorithms for fingerprint graph representation and isomorphism testing, applicable to broader forensic analysis despite differing input data [6].

Additionally, attack graphs, used to identify potential attack paths and vulnerabilities, are proposed for practical forensic analysis, including antiforensic scenarios. These graphs help understand complex attack paths and missing evidence, demonstrated through a database attack case study [7].

Our contribution is to integrate these methods to enhance digital forensics. We focus on automating digital evidence analysis with graph theory to elucidate relationships among digital evidence and entities.

2.2. Forensic Analysis in Network Communication Using Graph Theory

While our primary focus is on NTFS file system data from Windows, we also review digital forensics in network communication, linking file system data with network communication data.

Due to the proliferation of smart devices, detecting and mitigating faults in computer networks is crucial. Anomalies, whether from security breaches, component failures, or environmental factors, must be promptly addressed. Recent studies on anomaly detection in computer networks categorize solutions and highlight trends and shortcomings, especially regarding malware in smartphone networks.

Paper [8] introduces a graph-based approach to network forensics, using a graph model of digital evidence for evidence presentation and automated reasoning. The proposed hierarchical reasoning framework infers network entity states and identifies critical entities. An interactive hypothesis testing framework aids in detecting attack activities. Experimental results show the prototype's effectiveness in extracting attack scenarios with minimal expert knowledge.

As Internet traffic grows, so do cyber crimes, necessitating advanced network forensics. One method combines vulnerability reasoning with an evidence graph to reconstruct attack scenarios and identify multi-stage attacks, confirmed by experimental results [9].

Spectral graph theory helps understand network malware propagation, essential as device connectivity increases. Using various Laplacian matrices to track network pattern changes, one study [10] offers insights into malware spread, aiding in faster infection detection.

3. Methodology

This section covers how we acquired, preprocessed, and described the data so that it can later be combined into graphs, filtered, and analysed. It also describes the creation of the super timeline and its subsequent transformation into the other two datasets (FILE and EVT).

3.1. Data acquisition and description

We selected fictitious cases from CTF competitions focused on forensics, incident response, and threat detection, which together give seven datasets, and we used disk images from these cases.

The first case comes from the DFIR Madness portal and is called Case001 - The case of the Stolen Szechuan sauce [11]. The main goal in this case was to find out how CITADEL's recipe got on the dark web. The company requested forensic analysis, identification of unwanted applications installed on the system, and detection of the location and time of installation. The case also offers information as to whether any content was changed, modified, or deleted, or whether data was leaked. We worked with artefacts from the company's domain controller (hereafter called the "DC server") and from the Desktop, and therefore we count this case as two datasets.

The other three cases, Magnet CTF 2019 [12], Magnet CTF 2020 [13] and Magnet CTF 2022 [14], were from the Magnet Forensics CTF (Capture the Flag) competition. These were not classical forensic analyses, but rather consisted of answering questions like "when did we get the disk image" or "when was the software installed".

The last two cases, NIST Data Leakage Case [15] and NIST Hacking Case [16], are used to learn about different forms of data leakage and to improve techniques for investigating them. We focused on investigating a data leak case where the key is to uncover evidence of illegal activities and obtain any information generated by the suspect.

3.2. Data preprocessing

For data preprocessing, we followed the same steps for all seven datasets, so we describe the process in detail for only one case. We worked with the data according to the procedure described in the paper [17]. We created the timeline using the Log2timeline [18] tool and its plugins. We modified the resulting timeline with the psort.py tool and with Python and the pandas library. Our dataset contained records from 11 different data sources, with FILE, EVT, and REG records being the most prominently represented, accounting for 87% of all records. We divided the extracted attributes into seven categories.

We narrowed the dataset to the time of the security incident and manually identified relevant digital evidence, including the files with inodes 84630, 84880, 84987, 86966, 86967, 86968, 86970, 86971, 86975, 87059, 87060, 87064, 87111, 87112, 87137, and files with the names 'coreupdater.exe', 'FILESH 1', 'Secret', 'BETH_S 1.TXT', 'Beth-_Secret.lnk', 'SECRET 1.TXT', 'SECRET_beth.lnk', 'Szechuan', 'SZECHU1.TXT', 'Secret.lnk', 'NoJerry.lnk', 'No-Jerry.txt', 'f01b4d95cf55d32a.automaticDestinations-ms', 'SECRET_beth.txt', 'Beth_Secret.txt', 'Secret.zip', 'coreupdater.exe.2424 urv.partial'. Next, we analysed the inodes and filenames, excluding some inodes and filenames. Finally, we used aggregation functions and created attribute combinations to analyse the data.

So far, we have manually identified relevant inodes and file names only for the Stolen Szechuan Sauce case; for the other datasets, this identification remains to be done.

Each of the seven datasets is a super timeline with 17 attributes (columns), and its rows are records (events). The 17 attributes and their descriptions, following [19], are:

• Date : the date when the event occurred
• Time : time the event occurred
• Timezone : time zone
• MACB : timestamps (Modification, Access, Changed, Birth)
• Source : source name abbreviation (e.g. REG - registry records)
• Sourcetype : description of the source
• Type : timestamp type (e.g. last entry)
• User : the user name (if any) that is associated with the event
• Host : host name (if any) that is associated with the event
• Short : contains a short description field in which the text is stored
• Desc : an array that contains most of the parsed information
• Version : version number of the timestamp
• Filename : the name of the file that is associated with the event
• Inode : inode number of the file being analysed
• Notes : a place to store additional information
• Format : the input module that was used to parse
• Extra : field with parsed information that is linked and stored here

In addition to the basic seven datasets, we created 6 additional CSV files containing records from FILE and 7 CSV files containing records from EVT. The FILE datasets have 44 attributes, and the EVT datasets have 40 attributes. These files were created by extracting data from the original super timeline. For example, the MACB column of the original dataset was replaced in the new (EVT or FILE) dataset by four new columns, 'M', 'A', 'C' and 'B', and the original column was deleted. If there was a '.ACB' record in the original data, we wrote 1 in the 'A', 'C' and 'B' columns and 0 in the 'M' column. In this way, we converted part of the data into binary columns. Not all attributes could be converted this way, so columns like 'date' or 'time' were left unchanged. The exact procedure for creating the datasets is explained in [19].

For later analysis and graphing needs, we had to create additional columns in the FILE datasets (MACB, file, dir, and NTFS) by concatenating some of the binary columns:

• MACB : the split columns 'M', 'A', 'C', 'B' merged into one again
• file : 'file_executable', 'file_graphic', 'file_documents', 'file_ps' and 'file_other'
• dir : 'dir_appdata', 'dir_win', 'dir_user' and 'dir_other'
• NTFS : 'file_stat', 'NTFS_file_stat', 'file_entry_shell_item' and 'NTFS_USN_change'

The analysis of the selection of the attributes mentioned above, as well as the analysis of various combinations of attributes for anomaly detection, is presented in the paper [17].
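The MACB transformation described in Section 3.2 can be sketched in pandas as follows. This is a minimal illustration under stated assumptions, not the procedure from [19]: the column name `MACB`, the four-character notation with '.' for an unset flag, the helper names `split_macb` and `merge_macb`, and the sample rows are all hypothetical.

```python
import pandas as pd

FLAGS = ["M", "A", "C", "B"]  # Modification, Access, Changed, Birth


def split_macb(df: pd.DataFrame, column: str = "MACB") -> pd.DataFrame:
    """Replace a MACB string column (e.g. '.ACB') with four 0/1 columns."""
    out = df.drop(columns=[column])
    for position, flag in enumerate(FLAGS):
        # 1 if the flag letter is present at its position, otherwise 0
        out[flag] = (df[column].str[position] == flag).astype(int)
    return out


def merge_macb(df: pd.DataFrame) -> pd.Series:
    """Rebuild the MACB string from the four binary columns."""
    parts = [df[flag].map({1: flag, 0: "."}) for flag in FLAGS]
    return parts[0] + parts[1] + parts[2] + parts[3]


# Hypothetical timeline rows, for illustration only
timeline = pd.DataFrame(
    {"MACB": [".ACB", "M...", "MACB"], "inode": [101, 102, 103]}
)
binary = split_macb(timeline)        # columns: inode, M, A, C, B
binary["MACB"] = merge_macb(binary)  # merged column used later for graphs
```

Keeping both the binary columns and the merged string makes it possible to use the binary form for aggregation and the merged form as a single categorical vertex label.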
4. Graph Theory

We use standard notation for graph theory [20]. In computer and network security, graph theory models have often been used in the last decades. Some graph problems have also arisen from security, e.g. k-Path Vertex Cover [21]. We will focus on modelling graphs from artefact data.

4.1. Background

A graph $G = (V, E)$ is a pair consisting of a finite set of $n$ vertices $V = \{v_0, v_1, \dots, v_{n-1}\}$ and a set of $m$ edges $E \subseteq \{\{v_i, v_j\} \mid 0 \le i, j < n,\ i \ne j\}$. A directed graph has oriented edges (directed arcs), i.e. the order of vertices is important: $(v_i, v_j)$. An (edge-)weighted graph assigns a weight to each edge, so the edges are of the form $(v_i, v_j, w_{ij})$.

A bipartite graph is a special type of graph whose set of vertices can be divided into two disjoint sets (partitions) such that each edge has exactly one vertex in each partition.

We also use several further definitions and parameters, such as path and eccentricity, which are introduced below.

The degree of the vertex $v$ in a graph $G$, denoted by $\deg_G(v)$ or $d_G(v)$, is the number of edges incident to $v$ in $G$. The maximum degree of a graph $G$, denoted by $\Delta(G)$, is the maximum value among the degrees of all vertices of the graph $G$. By analogy, $\delta(G)$ denotes the minimum degree of a graph $G$.

A walk in the graph $G$ is an alternating sequence of incident vertices and edges, starting and ending with a vertex. A walk with mutually different edges is called a trail. A walk in which all vertices differ is called a path. In other words, a path in a graph is a sequence of vertices in which every two consecutive vertices are joined by an edge of the graph and no vertex (and hence no edge) is repeated. A closed trail, i.e. one that starts and ends at the same vertex, in which all other vertices are distinct is called a cycle.

A graph is connected if there is a path between every pair of its vertices. A component of a graph is a connected subgraph that is not part of any larger connected subgraph.

An edge of a graph is called a bridge if the number of components increases when it is removed. A vertex of a graph is called a cut vertex (articulation point) if the number of components of the graph increases when it is removed.

The distance between two vertices is the length of a shortest path between them. The eccentricity $ecc(v)$ of a vertex $v$ is the distance from $v$ to the vertex farthest from it in the graph $G$. The radius of the graph, $rad(G)$, is the minimum eccentricity of a vertex in the graph $G$, and the diameter, $diam(G)$, is the maximum eccentricity of a vertex in the graph $G$. A vertex $v$ in $G$ is called central if $ecc(v) = rad(G)$. The center of a graph, $C(G)$, is the set of all vertices in the graph $G$ with minimum eccentricity, i.e. the set of all central vertices:

$$C(G) = \{\, v \in V \mid ecc(v) = \min_{u \in V} ecc(u) \,\}$$

An eccentric vertex of a vertex $v$ is a vertex that is the furthest from it. A vertex $v$ is called peripheral if $ecc(v) = diam(G)$. The periphery $Per(G)$ of a graph $G$ is the set of all peripheral vertices in the graph $G$.

4.2. Graph Generation from Forensic Artifacts

Typical graph generation in the security area creates nodes of just one type and connects them depending on communication [22], or looks for an attack vector in a directed graph [23]. Standard computer network-based cybersecurity applications cover traffic, security policies and vulnerabilities/threats [24].

For the artefacts, one can create nodes from any attribute (column) or any combination of attributes. We found the most interesting results for bipartite graphs, i.e. graphs using two node types. Depending on the dataset, we focused on different pairs of node types.

4.3. Graph from supertimeline

For the super timeline, we have chosen attributes such as user, source, MACB, sourcetype and inode, and always used two of them as vertex types for the generated bipartite graph. Other attributes could also be represented by vertices in the graph, e.g. host, type, filename. Some of these attributes are categorical, so their values can be used directly as vertices. An edge between two vertices means that the data contains a row (artefact) in which both values occur. An example for the NIST Data Leakage case is shown in Fig. 2.

Figure 2: NIST Data Leakage Case - super timeline - user, source

Figure 3: Magnet CTF 2022 - super timeline - user, sourcetype

4.4. Graph from EVT artifacts

In the EVT dataset, we created pairwise combinations of three attributes: event_id, user_sid, and execution_process, because selecting other attributes did not reveal any vital information about the relationships between the attributes. We could also consider attributes like inode, computer_name, and source_name, since we assume they take a finite number of non-binary values, but that probably would not add any value for us. An example of a generated graph from the EVT dataset is shown in Fig. 1.

Figure 1: NIST Data Leakage Case - EVT - user_sid, execution_process_id

4.5. Graph from FILE artefacts

In a similar way to the super timeline, we also created graphs in the FILE dataset, but with different attributes, such as MACB, dir, file, and NTFS, the creation of which we explained in Section 3.2. In the FILE datasets, it was challenging to find other attributes suitable for graphing because they were in binary form. This is why we merged some attributes, although this was not possible for all of them. We also included the inode attribute in the graph creation process, although it has many unique values. Examples of these graphs are in Fig. 4 and Fig. 5.
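The bipartite construction described in Sections 4.3 to 4.5 can be sketched with the networkx library as follows. This is only an illustrative sketch: the paper does not state which library was used, and the function name `bipartite_from_rows`, the row format (a list of dictionaries), and the sample rows are assumptions made for the example.

```python
import networkx as nx


def bipartite_from_rows(rows, left, right):
    """Build an unweighted bipartite graph from two attributes of each row."""
    graph = nx.Graph()
    for row in rows:
        if row.get(left) is None or row.get(right) is None:
            continue  # skip rows where one of the two attributes is missing
        # Tag each vertex with its attribute name so that equal values in
        # different columns do not collapse into a single vertex.
        u = (left, row[left])
        v = (right, row[right])
        graph.add_node(u, bipartite=0)
        graph.add_node(v, bipartite=1)
        graph.add_edge(u, v)  # repeated rows map to the same single edge
    return graph


# Hypothetical super timeline rows, for illustration only
rows = [
    {"user": "informant", "source": "REG"},
    {"user": "informant", "source": "EVT"},
    {"user": "admin11", "source": "REG"},
    {"user": "informant", "source": "REG"},  # duplicate row, no new edge
    {"user": "-", "source": "FILE"},
]
g = bipartite_from_rows(rows, "user", "source")
```

Because an edge only records that at least one row contains both values, the graph stays unweighted; counting the rows per pair would be one way to obtain the weighted variant.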
5. Lessons learned from graph properties

From the generated graphs, we have focused on some graph-based metrics [22] and found some anomalies. In the current research, we have used only unweighted graphs.

5.1. Eccentricity

We can exploit eccentricity in particular in graphs of the super timeline dataset created from the user and source or the user and source type attributes. For example, for the NIST Data Leakage Case in Fig. 2, we created a graph from the user and source attributes and then computed the eccentricity of all vertices. The lowest eccentricity, 3, was found at the user vertices '-', 'informant', 'admin11' and 'temporary' and at the source vertices 'REG' and 'EVT'. These vertices form the centre of the graph.

We also created graphs from the user and source type attributes and searched for the graph centre. We can illustrate this with all three Magnet CTF cases 2019, 2020 and 2022. Fig. 3 shows the graph from the Magnet CTF 2022 case. The lowest eccentricities in this graph were in the vertices 'WinEVTX' (source type attribute) and 'Patrick' (user attribute). We observed the same behaviour in the other Magnet CTF cases: the graph centre always consisted of the 'WinEVTX' vertex and one of the users.

5.2. Degree of vertices

We exploited the vertex degree in the datasets created from the FILE source of the super timeline. We created three types of graphs: MACB and file, MACB and dir, and MACB and NTFS. The most interesting graphs arose when the MACB timestamps and the NTFS attribute were combined, as shown in Fig. 4. At first glance, it might seem that we should focus on the NTFS_USN_change vertex, because it is associated with timestamp C (Changed), or on the file_stat and NTFS_file_stat vertices, which have the highest vertex degree, 14. However, none of these is the most significant. The most significant vertex in this graph is file_entry_shell_item, with a vertex degree of 6: if we look further at the records containing this NTFS attribute value, we find "only" 19 unique inode values in 163 records, and 9 of these inodes were identified as relevant to the case.

Figure 4: Szechuan Sauce - FILE - MACB, NTFS

Figure 5: Szechuan Sauce - FILE - MACB, inode
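The eccentricity and degree analysis of Sections 5.1 and 5.2 can be reproduced on a small example with networkx. The sketch below is hedged: the edge list is a hypothetical toy graph that merely reuses vertex names from the text, not the NIST dataset, and restricting the computation to the largest connected component is an assumption, since eccentricity is only defined within a connected graph.

```python
import networkx as nx

# Hypothetical user/source graph, for illustration only
graph = nx.Graph()
graph.add_edges_from([
    (("user", "informant"), ("source", "REG")),
    (("user", "informant"), ("source", "EVT")),
    (("user", "admin11"), ("source", "REG")),
    (("user", "temporary"), ("source", "EVT")),
    (("user", "-"), ("source", "FILE")),
    (("user", "-"), ("source", "REG")),
])

# Work on the largest connected component so that all distances are finite.
largest = max(nx.connected_components(graph), key=len)
component = graph.subgraph(largest)

eccentricity = nx.eccentricity(component)  # dict: vertex -> ecc(v)
centre = nx.center(component)              # vertices with ecc(v) = rad(G)
periphery = nx.periphery(component)        # vertices with ecc(v) = diam(G)

# Vertices ordered from highest to lowest degree
by_degree = sorted(component.degree, key=lambda item: item[1], reverse=True)
```

In this toy graph the centre consists of one user vertex and one source vertex, which mirrors the kind of result reported for the Magnet CTF cases; on real data the centre and the highest-degree vertices are candidates for closer manual inspection rather than conclusions in themselves.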
5.3. Cycles in graphs

After creating the graph from the Stolen Szechuan Sauce FILE dataset in Fig. 5, it was not apparent what specifically to focus on, so we had to apply the properties of the graph. We looked for the base cycles in the graph. We found 43 of these, and seven of them contained inodes that had been manually identified as relevant to the case. In the same way, we analysed the graph created from the MACB and filename attributes, where the base cycles also contained file names marked as relevant to the case by manual analysis. In this analysis, however, we must consider that the two attributes are not equivalent: the number of MACB timestamp combinations is always at most 16, whereas the number of inodes is different and often much higher.
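The search for base cycles in Section 5.3 can be sketched as follows, assuming that "base cycles" correspond to a cycle basis of the undirected graph as computed by `networkx.cycle_basis`. The helper name `cycles_touching`, the toy edges, and the inode numbers are hypothetical and serve only as an illustration.

```python
import networkx as nx


def cycles_touching(graph, relevant):
    """Return a cycle basis and the basis cycles containing a relevant vertex."""
    relevant = set(relevant)
    basis = nx.cycle_basis(graph)  # list of cycles, each a list of vertices
    hits = [cycle for cycle in basis if relevant.intersection(cycle)]
    return basis, hits


# Hypothetical MACB/inode graph, for illustration only
graph = nx.Graph()
graph.add_edges_from([
    (("MACB", "M..."), ("inode", 1)),
    (("MACB", ".A.."), ("inode", 1)),
    (("MACB", "M..."), ("inode", 2)),
    (("MACB", ".A.."), ("inode", 2)),
    (("MACB", "...B"), ("inode", 3)),
])

# Suppose inode 2 was manually marked as relevant to the case.
basis, hits = cycles_touching(graph, {("inode", 2)})
```

Since the graph is bipartite, every cycle alternates between the two vertex types and therefore has even length; a cycle through an inode means that the inode shares at least two MACB combinations with another inode.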
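6. Results and future works

This paper focuses on the automation of response to security incidents, including digital forensic analysis within the Windows operating system and NTFS file system. For this purpose, we used the graphs' structure and properties to better understand the relationships between the analysed digital evidence. The paper demonstrated the possibilities of generating graphs, particularly from super timeline formats, event records, and the Master File Table. We showed that it is essential to carefully select the attributes of artefacts that can be used as vertices and to determine the corresponding edges of the graphs. At the same time, we identified several properties of graphs whose analysis can help better understand the relationships and identify interesting or relevant digital evidence. In the future, we will enhance our models with weighted graphs.

Acknowledgment

This paper was supported by the Slovak Research and Development Agency under contract No. APVV-23-0137 and contract No. APVV-21-0336.

References

[1] S. Iqbal, S. A. Alharbi, Advancing automation in digital forensic investigations using machine learning forensics, Digital Forensic Science. IntechOpen (2019).
[2] S. Costantini, G. D. Gasperis, R. Olivieri, Digital forensics and investigations meet artificial intelligence, Annals of Mathematics and Artificial Intelligence (2019).
[3] C. Easttom, Utilizing graph theory to model forensic examination, International Journal of Innovative Research in Information Security (IJIRIS) 4 (2017).
[4] I. Palmer, R. Campbell, B. Gelfand, Exploring digital evidence with graph theory, in: ADFSL Conference on Digital Forensics, Security and Law, 2017.
[5] S. Ozcan, M. Astekin, N. K. Shashidhar, B. Zhou, Centrality and scalability analysis on distributed graph of large-scale e-mail dataset for digital forensics, in: IEEE International Conference on Big Data (Big Data), 2020.
[6] J. Binwal, R. Devi, B. Singh, Mathematical modelling and simulation of fingerprint analysis using graph isomorphism, domination, and graph pebbling, Advances and Applications in Discrete Mathematics (2023).
[7] C. Liu, A. Singhal, D. Wijesekera, Using attack graphs in forensic examinations, in: Seventh International Conference on Availability, Reliability and Security, 2012.
[8] W. Wang, T. E. Daniels, Building evidence graphs for network forensics analysis, in: 21st Annual Computer Security Applications Conference (ACSAC'05), 2005.
[9] J. He, C. Chang, P. He, M. S. Pathan, Network forensics method based on evidence graph and vulnerability reasoning, Future Internet, MDPI (2016).
[10] C. McGee, J. Guo, Z. Wang, The application of the graph laplacian in network forensics, in: IEEE 12th Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON), 2021.
[11] Case 001 – the stolen szechuan sauce, 2024. URL: https://dfirmadness.com/the-stolen-szechuan-sauce/, online, [cit. 2024-02-18].
[12] Magnet ctf 2019 windows desktop, 2019. URL: https://digitalcorpora.s3.amazonaws.com/corpora/scenarios/magnet/2019%20CTF%20-%20Windows-Desktop.zip, online, [cit. 2024-02-18].
[13] Magnet ctf 2020 windows, 2020. URL: https://digitalcorpora.s3.amazonaws.com/corpora/scenarios/magnet/2020%20CTF%20-%20Windows.zip, online, [cit. 2024-02-18].
[14] Magnet ctf 2022 windows, 2022. URL: https://digitalcorpora.s3.amazonaws.com/corpora/scenarios/magnet/2022%20CTF%20-%20Windows.zip, online, [cit. 2024-02-18].
[15] Data leakage case, 2024. URL: https://cfreds-archive.nist.gov/data_leakage_case/data-leakage-case.html, online, [cit. 2024-02-18].
[16] Hacking case, 2024. URL: https://cfreds-archive.nist.gov/Hacking_Case.html, online, [cit. 2024-02-18].
[17] E. Marková, P. Sokol, K. Kováčová, Detection of relevant digital evidence in the forensic timelines, in: International Conference on Electronics, Computers and Artificial Intelligence (ECAI), IEEE, 2022, pp. 1–7.
[18] Plaso, 2024. URL: https://plaso.readthedocs.io/en/latest, online, [cit. 2024-04-13].
[19] E. Marková, P. Sokol, S. P. Krišáková, K. Kováčová, Dataset of windows operating system forensics artefacts, Data in Brief (2024) 110693.
[20] D. B. West, Introduction to graph theory, Prentice Hall, 2001.
[21] B. Brešar, F. Kardoš, J. Katrenič, G. Semanišin, Minimum k-path vertex cover, Discrete Applied Mathematics 159 (2011) 1189–1195.
[22] G. Zonneveld, L. Principi, M. Baldi, Using graph theory for improving machine learning-based detection of cyber attacks, 2024. URL: https://arxiv.org/pdf/2402.07878.
[23] T. Mézešová, P. Sokol, T. Bajtoš, Evaluation of attackers' skill levels in multi-stage attacks, Information 11 (2020) 537.
[24] V. P. Janeja, Data Analytics for Cybersecurity, Chapter 9: Cybersecurity through Network and Graph Data, 2022. URL: https://doi.org/10.1017/9781108231954.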