
A Natural Language Processing-based Approach for Cyber Risk Assessment in the Healthcare Ecosystems

Stefano Silvestri

Giuseppe Tricomi

0 1 2

Giuseppe Felice Russo

Mario Ciampi

1

0 CINI - Consorzio Interuniversitario Nazionale per l'Informatica, Via Ariosto 25, Roma, 00185, Italy
1 Institute for High Performance Computing and Networking, National Research Council of Italy (ICAR-CNR), via Pietro Castellino 111, Naples, 80131, Italy
2 Università degli Studi di Messina, Contrada di Dio 1, Messina, 98166, Italy

The cyber risk in the healthcare sector is constantly increasing, due to the large adoption of digital services formed by a complex interconnection of different systems and technologies, which offer a larger attack surface to attackers. Therefore, the risk assessment of the assets involved in these services is crucial to prevent and mitigate possible critical consequences, which could also affect the health of the patients. A large source of constantly updated information about threats and vulnerabilities of the assets of healthcare ecosystems is available in natural language text on the Internet (cyber security news, forums, social media, etc.), but it is not easy to fully exploit it in a risk assessment process, due to the complexity of natural language. This paper proposes an AI-based approach for the individual risk assessment of the assets of digital healthcare systems, based on the use of NLP and Knowledge Bases, which exploits the information extracted from natural language news from the web. The methodology has been developed within the activities of the EC-funded H2020 AI4HEALTHSEC project, where it has also been successfully tested in real-world scenarios. Moreover, the datasets collected have been made publicly available on the SoBigData research infrastructure.

Keywords: Natural Language Processing, Large Language Models, Cyber Threats, Cyber Vulnerabilities, Impact Assessment, Cyber Risk Assessment

1. Introduction

The healthcare ecosystem is rapidly adopting a growing number of recent technologies, such as Internet of Things (IoT), wearable and implantable devices, Picture Archiving and Communication Systems (PACS), Electronic Health Records (EHRs), DICOM images, and others, interconnected to realise and offer innovative digital healthcare services. While their adoption and use improve the quality of service to patients, and support and ease the work of physicians and medical professionals, this complex and dynamic interconnection of several systems offers a larger attack surface to the threat actors interested in attacking the system by exploiting the existing vulnerabilities [1]. The low level of awareness of cyber risks among healthcare personnel [2] must also be taken into account, and attacks often cause dramatic impacts on the healthcare ecosystem [3]. For example, a cyber-attack on an insecure PACS server could lead to the web exposure of sensitive patient information, and an attack on the remote monitoring software of a medical device could damage the equipment of the hospital or change the configuration of the device [4]. This sector has recently suffered several serious cyber attacks: for example, in 2017 and 2021 there were ransomware attacks on the U.K. National Health Service (NHS) and on Ireland's Department of Health and Health Service Executive, respectively [5]. Furthermore, inherent vulnerabilities have been found in some medical devices, such as B. Braun's infusion pump and Medtronic's insulin pump [3]. Finally, approximately 90% of healthcare organisations experienced a data breach in 2018 [6]. For these reasons, it is necessary to study the most frequent attacks in healthcare to make the services offered more secure and resilient [4, 7]. Given the complexity of healthcare ecosystems, performing an effective cyber risk assessment can help to limit and prevent cyber security incidents [8].
The cyber risk assessment process has the purpose of identifying, evaluating, and prioritising security risks to the assets of an organisation, making it possible to take the most appropriate actions to mitigate the risks and the vulnerabilities.

The Internet is a constantly updated source of threat, incident, and vulnerability-related information for healthcare ecosystem assets, in the form of unstructured Natural Language (NL) text within blogs, specialized Cyber-Security (CS) websites, social media, Knowledge Bases (KBs) and others. Although these sources contain crucial information for risk management and assessment, it is difficult to fully leverage them, due to the inherent complexity of NL (polysemy, irony, long and complex sentences, non-standardized abbreviations, acronyms). Therefore, extracting relevant information from this mass of data becomes a demanding task [9]. Information extraction from NL text is currently addressed in the literature by adopting AI-based Natural Language Processing (NLP) models, usually implementing Named Entity Recognition (NER) systems [10, 11, 12, 13] using Large Language Models (LLMs) and CS KBs. However, there is a lack of focus in the literature on analyzing and prioritizing the most frequent threats and vulnerabilities in healthcare. In this context, this paper extends the ideas previously presented in [14, 15, 16], combining NLP-based threat and vulnerability approaches to define an impact and risk assessment for healthcare ecosystems, evaluated by exploiting CS textual sources available on the Internet. It presents the final NLP cyber risk assessment methodology developed within the activities of the EC-funded H2020 AI4HEALTHSEC research project, as well as the collection of a textual CS dataset related to the “SoBigData.it” research project.

The paper is organized as follows: Section 2 outlines the most recent related studies in the literature; Section 3 describes the details of the proposed approach; Section 4 shows the implementation of the proposed solution, a description of the datasets used and the research project where the approach was tested in real-world scenarios. Finally, Section 5 provides conclusions and future works.

Ital-IA 2024: 4th National Conference on Artificial Intelligence, organized by CINI, May 29-30, 2024, Naples, Italy
* Corresponding author.
stefano.silvestri@icar.cnr.it (S. Silvestri); giuseppe.tricomi@icar.cnr.it (G. Tricomi); giuseppefelice.russo@icar.cnr.it (G. F. Russo); mario.ciampi@icar.cnr.it (M. Ciampi)
ORCID: 0000-0002-7286-6212 (M. Ciampi)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

2. Related Works

There are several recent works in the literature dealing with risk assessment and CS information extraction from NL documents. The authors of [8] reviewed and compared different generic cyber risk assessment frameworks in the healthcare field, discussing their assessment methodologies and the limitations associated with them. A threat and mitigation model tailored for IoT health devices is presented in [17], combining the STRIDE and DREAD models: threats are identified using the STRIDE model on the device access points, and then ranked using DREAD. This approach is suitable for both the designers and the users of health IoT devices.

The security and privacy challenges in Medical Cyber-Physical Systems (MCPS) are discussed in [18], highlighting that trust and threat models usually consider MCPS stakeholders, including healthcare practitioners, system administrators and non-medical staff, with incorrect levels of trust. Also, in [2], the issues related to the CS awareness of healthcare personnel are underlined, reviewing the existing gaps in the CS strategies adopted by healthcare organizations and the risk assessment methodologies adopted. The authors demonstrated that in this domain there is often a lack of adequate training for healthcare workers and a lack of specialized figures, such as a chief information officer, highlighting the need to keep security protocols updated to the latest standards.

AI-based information extraction from CS textual documents has also been recently developed and presented in the literature. SecureBERT, presented in [13], is a Bidirectional Encoder Representations from Transformers (BERT) model trained on large CS-domain NL corpora, which outperforms other similar models in NLP tasks in the CS domain. The authors of [10] collected a large corpus of labeled sequences from Industrial Control Systems device documentation to pre-train and fine-tune a BERT language model, named CyBERT. Also, [12] proposed another interesting CS NER system, which exploits an architecture based on BERT, an LSTM, Iterated Dilated Convolutional Neural Networks (ID-CNNs), and a Conditional Random Field to improve the obtained performance.

The main innovation of the proposed approach is the use of CS information extracted from NL texts to calculate the threat, vulnerability, and impact levels, allowing the risk assessment for the various assets involved in digital healthcare services to be finally obtained.

3. Methodology

The proposed risk assessment methodology is composed of the following five steps: i) Healthcare Ecosystem Assets Identification and Categorisation; ii) Threat Identification and Assessment; iii) Vulnerability Assessment; iv) Impact Assessment; and v) Risk Assessment.

3.1. Healthcare Ecosystem Assets Identification and Categorisation

The preliminary step of the methodology provides a list of the assets of the considered complex digital healthcare system by identifying the services involved and their assets, with the final purpose of measuring their criticality within the healthcare system. For instance, the assets of a remote patient consultation service could include a Database, a Linux Server, communication software, and a web server. After their identification, the assets are also categorized, using the Common Platform Enumeration (CPE) catalogue to map them to the corresponding area (based on their type) and category (depending on their functionalities), as shown in Table 1. This step allows us to understand the importance of each asset within the ecosystem and to provide a list of the assets that require risk assessment.
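
As an illustration of the categorisation step, the following minimal Python sketch builds an asset record from a CPE 2.3 formatted string. The `Asset` class, the `asset_from_cpe` helper, the example CPE strings, and the choice of using the CPE part code as the asset area are assumptions made for this example only; the actual areas and categories of the methodology are those of Table 1.

```python
from dataclasses import dataclass

# Part codes defined by the CPE 2.3 naming specification.
CPE_PART_NAMES = {"a": "application", "o": "operating system", "h": "hardware"}


@dataclass(frozen=True)
class Asset:
    """Hypothetical asset record; field names are illustrative."""
    name: str      # human-readable asset name
    cpe: str       # CPE 2.3 formatted string
    area: str      # assumption: derived from the CPE part code
    vendor: str
    product: str


def asset_from_cpe(name: str, cpe: str) -> Asset:
    """Parse a CPE 2.3 formatted string into an Asset.

    Simplification: values containing escaped colons are not handled.
    """
    fields = cpe.split(":")
    # A formatted string has 13 components: "cpe", "2.3" and 11 attributes.
    if len(fields) != 13 or fields[0] != "cpe" or fields[1] != "2.3":
        raise ValueError(f"not a CPE 2.3 formatted string: {cpe!r}")
    part, vendor, product = fields[2], fields[3], fields[4]
    area = CPE_PART_NAMES.get(part, "unknown")
    return Asset(name=name, cpe=cpe, area=area, vendor=vendor, product=product)


# Example assets of a remote patient consultation service (illustrative values).
service_assets = [
    asset_from_cpe("Database", "cpe:2.3:a:postgresql:postgresql:13.0:*:*:*:*:*:*:*"),
    asset_from_cpe("Linux Server", "cpe:2.3:o:linux:linux_kernel:5.10:*:*:*:*:*:*:*"),
]
for asset in service_assets:
    print(asset.name, "->", asset.area)
```

Keeping the original CPE string in the record makes it possible to look up the same asset later in other knowledge bases that index their entries by CPE.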

These classifications are used to evaluate the criticality of each asset of the healthcare system, by measuring the dependency level that an asset has with other system components. We defined four dependency levels:

• Independent assets have a distinct operation and exhibit no dependency on other assets. If the asset fails, no cascading events occur.
• Incoming dependency, if, syntactically, another asset uses its data or functionality. If such an asset fails, the operation of all related assets that use its data or functionality may be disrupted.
• Outgoing dependency, if, syntactically, it uses data or functionality of another asset. Therefore, if the latter asset fails, the operation of the former asset will be affected as well.
• Coupling relationship reveals that two assets have both incoming and outgoing dependencies. Thereupon, failures in one of the assets will affect the functionality of the other.

Thus, the criticality level of an asset can be determined by the number of services and relevant business flows it participates in. Specifically, the General Asset Criticality level based on running services (GAC) is calculated as the weighted summation of its interdependencies, normalized by the total number of services in the examined healthcare ecosystem. Thereupon, the Asset Criticality for a specific service (ACS) is equal to its GAC value divided by the number of relevant/redundant assets that co-exist in the service. Finally, based on the ACS range values, it is possible to assign a criticality level to each asset, as shown in Table 2.

3.2. Threat Identification and Assessment

Once the assets have been identified, the next step aims to assess the threats that could affect those assets, following the approach previously described in [14, 15, 16]. Firstly, a threat identification phase is performed by exploiting the Common Attack Pattern Enumeration and Classification (CAPEC, https://capec.mitre.org), which also provides a detailed set of the characteristics of the threats, such as Likelihood of Attack, Related Attack Patterns, Execution Flow, Prerequisites and others. In this way, we obtain the list of the threats for each asset that operates in the considered healthcare service/system (identified in the previous step). Each threat also includes the CAPEC ID, a CAPEC category that will be used to rate the threat, and the corresponding characteristics.

Then, it is possible to assess the threats, assigning them a severity level. Our methodology exploits the NL history of reported incidents related to those threats, extracted from large CS domain collections available online, such as forums, social media, news, and others, using an AI-based NLP approach. In detail, we use a Named Entity Recognition (NER) architecture based on SecureBERT [13], a BERT model pre-trained on a very large CS domain text collection (more than 2.2 million documents), preprocessed with a CS customized tokenizer, and fine-tuned for the NER task, to extract the mentions of the threat and asset pairs found in each sentence of the NL source. In this case, we produced a custom training set, annotated with the entity types of interest (Asset and Threat) using the semi-supervised approach described in [19]. Then, the threat level is calculated based on the percentage of occurrence of the mentions of that threat within the considered dataset, following the ranges shown in Table 3. The assessment is finally performed through a mapping between the assets of the services of the healthcare system and the asset and threat pairs with the corresponding threat level.

3.3. Vulnerability Assessment

The next step has the purpose of building a vulnerability exploit prediction scoring system specifically tailored for the healthcare domain. To this end, we adopted the NLP and Machine Learning (ML) approach described in [15], which leverages CS domain textual data sources to train a supervised ML classification model able to predict the vulnerability score, obtaining in this way the vulnerability assessment. In summary, this method uses the textual data included in the Common Vulnerabilities and Exposures (CVE) KB (the Report column of this KB) and the corresponding exploitability and impact metrics, namely the attack vector, attack complexity, privileges required, user interaction, scope, confidentiality impact, integrity impact and availability impact, to obtain a vector representation with the corresponding labels related to the exploitability and impact metrics. These are used to train a set of ML XGBoost classifiers, which are able to predict the labels of the Attack Vector (Network, Adjacent, Local, Physical) and of the exploitability and impact metrics summarised in Table 4.

Table 4
Exploitability and impact metrics and corresponding labels.

Exploitability and Impact metrics    Labels
Attack Complexity                    Low, High
Privileges Required                  None, Low, High
User Interaction                     None, Required
Scope                                Unchanged, Changed
Confidentiality                      None, Low, High
Integrity                            None, Low, High
Availability                         None, Low, High

Then, an extension of the CVE Exploit Prediction Scoring System (EPSS) is adopted [20], defining a Common Vulnerability Scoring System (CVSS)-like score using the labels predicted by the trained ML models on the NL texts, and following the specifications provided by [21]. The vulnerability level is based on the ranges of the computed CVSS-like score, as shown in Table 5.

Table 5

CVSS-like Score Range    Vulnerability Level
8.0, 10                  Very High
6.0, 8.0                 High
4.0, 6.0                 Medium
2.0, 4.0                 Low
0.0, 2.0                 Very Low

3.4. Impact Assessment

The next step of the proposed methodology is the Individual Impact Assessment, where the impact level is calculated to measure the effect that can be expected as the result of the successful exploitation of a vulnerability that resides in a critical asset. In this case, the methodology leverages the CVE KB, used in conjunction with the same NER module used for the Threat Assessment (see Section 3.2), fine-tuned to extract the asset and vulnerability entity types. This methodology exploits an additional set of adjectives related to the vulnerabilities and belonging to a predefined dictionary. These adjectives, such as severe, serious, dangerous, etc., tend to indicate, via a weight coefficient, the severity level of the vulnerability.

In detail, this dictionary is the result of the processed features evaluated with two different classifiers that output scores to predict relevancy and severity, following the approach described in [22]. Each adjective is associated with a coefficient, calculated by taking the log-odds ratio, then computing the exponential function of the log-odds, and converting the odds to a probability, using the formula: $\mathit{probability} = \mathit{odds} / (1 + \mathit{odds})$. In this way, it is possible to associate the vulnerability with a scale of Low, Medium, and High, where Low corresponds to [0, 33) (meaning that there is a 0-33% impact assessment probability), Medium corresponds to [33, 66), and High corresponds to [66, 100].

For vulnerabilities expressed in CVSS (obtained in the previous step), the three security criteria Confidentiality (C), Integrity (I), and Availability (A) are rated on a three-tier scale: None, Low, and High (see Table 4). We can define a mapping from this three-tier scale onto a five-tier scale ranging from Very Low (VL) to Very High (VH) by combining these characteristics, as shown in Table 6, providing in this way an initial impact level of a specific asset/vulnerability combination.

Then, the final impact level per asset is obtained by combining the initial impact with the asset criticality level (see Table 2), with the previous scale related to the adjectives and the corresponding vulnerabilities extracted by the NER module, as stated in Table 7.

3.5. Risk Assessment

Finally, the risk assessment is obtained by combining the Threat, Vulnerability, and Impact levels obtained in the previous steps, calculating the individual risk level for each asset following Table 8.

4. Implementation and Experiments

To implement the Threat and Impact assessment methods, we first needed a large and updated CS domain textual document collection. To this end, we collected the news published by The Hacker News website (https://thehackernews.com), a CS news platform that attracts over 8 million readers monthly and is updated daily with attacks, threats, vulnerabilities, and other CS news. A Python web crawler and scraper for this website has been specifically developed to retrieve, extract, collect, and normalise the text of each posted news item. The scraping task is performed bi-weekly, keeping this dataset constantly updated and increasing its size. Moreover, this corpus has also been made publicly available on the SoBigData research infrastructure (available at https://data.d4science.org/ctlg/ResourceCatalogue/the_hackernews_dataset).

The NER module is based on SecureBERT [13], a BERT model pre-trained on a very large CS domain text collection (more than 2.2 million documents), preprocessed with a CS customised tokenizer to improve its performance. This model has been fine-tuned for the NER task, to extract the mentions of the threat and asset pairs found in each corpus sentence for the threat assessment, and the mentions of vulnerabilities, the corresponding adjectives, and the assets for the impact assessment. To this end, we created two custom training sets, annotated with the entity types of interest (Asset and Threat in the first case, and Asset, Vulnerability and Adjectives in the latter case) using the semi-supervised approach described in [19]. The implementation of this module is based on the Huggingface Transformers Python library. The vulnerability assessment ML classifiers have been implemented using the Dmlc XGBoost library, a distributed gradient boosting library designed to be highly efficient and flexible.

The proposed methodology has been developed and implemented within the activities of the EC-funded H2020 project “AI4HEALTHSEC - A Dynamic and Self-Organised Artificial Swarm Intelligence Solution for Security and Privacy Threats in Healthcare ICT Infrastructures”. In this project, the proposed approach has been tested in real-world pilot scenarios provided by the Fraunhofer Institute for Biomedical Engineering (IBMT), a partner of the project. The pilots tested three different complex healthcare system scenarios, namely Implantable Medical Devices, Wearables, and Biobank. The results of the tests, reported in [14, 15, 16], confirmed the effectiveness and the applicability of our method.

5. Conclusion and Future Works

The paper proposes an AI-based approach for the individual risk assessment of the assets of digital healthcare systems. The approach, after the classification of the criticality of the assets using CS KBs, leverages NER and ML systems to extract and classify relevant information from textual CS sources, making it possible to calculate the threat, vulnerability and impact levels, which are finally combined to obtain the risk level of each asset. The methodology was successfully tested in real-world pilot scenarios of the EC-funded H2020 AI4HEALTHSEC project, demonstrating its applicability and effectiveness. Moreover, the datasets, which are constantly updated, are made publicly available on the SoBigData research infrastructure.

Acknowledgments

This work is supported by the European Union - NextGenerationEU - National Recovery and Resilience Plan (Piano Nazionale di Ripresa e Resilienza, PNRR) - Project: “SoBigData.it - Strengthening the Italian RI for Social Mining and Big Data Analytics” - Prot. IR0000013 - Avviso n. 3264 del 28/12/2021.

We thank Simona Sada and Giuseppe Trerotola for the administrative and technical support provided.

References

[1] P. Ribino, M. Ciampi, S. Islam, S. Papastergiou, Swarm intelligence model for securing healthcare ecosystem, Procedia Computer Science 210 (2022) 149–156. doi:10.1016/j.procs.2022.10.131.
[2] S. Nifakos, K. Chandramouli, C. K. Nikolaou, P. Papachristou, S. Koch, E. Panaousis, S. Bonacina, Influence of human factors on cyber security within healthcare organisations: A systematic review, Sensors 21 (2021). doi:10.3390/s21155119.
[3] D. McKee, P. Laulheret, McAfee Enterprise ATR uncovers vulnerabilities in globally used B. Braun infusion pump, 2021. URL: https://www.trellix.com/blogs/research/mcafee-enterprise-atr-uncovers-vulnerabilities-in-globally-used-b-braun-infusion-pump/.
[4] S. Islam, S. Papastergiou, H. Mouratidis, A dynamic cyber security situational awareness framework for healthcare ICT infrastructures, in: Proceedings of the 25th Pan-Hellenic Conference on Informatics, PCI '21, ACM, Volos, Greece, 2022, pp. 334–339. doi:10.1145/3503823.3503885.
[5] D. Rees, Cyber attacks in healthcare: the position across europe, 2021. URL: https://www.pinsentmasons.com/out-law/analysis/cyber-attacks-healthcare-europe.
[6] Sixth annual benchmark study on privacy & security of healthcare data, 2016. Ponemon Institute.
[7] K. S. Bhosale, M. Nenova, G. Iliev, A study of cyber attacks: In the healthcare sector, in: 2021 Sixth Junior Conference on Lighting (Lighting), 2021, pp. 1–6. doi:10.1109/Lighting49406.2021.9598947.
[8] S. Memon, S. Memon, L. Das, B. R. Memon, Cyber security risk assessment methods for smart healthcare, in: 2024 IEEE 1st Karachi Section Humanitarian Technology Conference (KHI-HTC), 2024, pp. 1–6. doi:10.1109/KHI-HTC60760.2024.10481961.
[9] M. Tikhomirov, N. Loukachevitch, A. Sirotina, B. Dobrov, Using BERT and augmentation in named entity recognition for cybersecurity domain, in: 25th International Conference on Applications of Natural Language Processing and Information Systems, Springer, Saarbrücken, Germany, 2020, pp. 16–24.
[10] K. Ameri, M. Hempel, H. Sharif, J. Lopez Jr., K. Perumalla, Cybert: Cybersecurity claim classification by fine-tuning the bert language model, Journal of Cybersecurity and Privacy 1 (2021) 615–637. URL: https://www.mdpi.com/2624-800X/1/4/31. doi:10.3390/jcp1040031.
[11] S. Zhou, J. Liu, X. Zhong, W. Zhao, Named entity recognition using bert with whole world masking in cybersecurity domain, in: 2021 IEEE 6th International Conference on Big Data Analytics (ICBDA), volume 26, IEEE, Xiamen, China, 2021, pp. 316–320. doi:10.1109/ICBDA51983.2021.9403180.
[12] Y. Chen, J. Ding, D. Li, Z. Chen, Joint bert model based cybersecurity named entity recognition, in: 2021 The 4th International Conference on Software Engineering and Information Management, ICSIM, Yokohama, Japan, 2021, pp. 236–242. doi:10.1145/3451471.3451508.
[13] E. Aghaei, X. Niu, W. Shadid, E. Al-Shaer, SecureBERT: A domain-specific language model for cybersecurity, in: Security and Privacy in Communication Networks, Springer, Cham, 2023, pp. 39–56.
[14] S. Islam, S. Papastergiou, S. Silvestri, Cyber threat analysis using natural language processing for a secure healthcare system, in: 2022 IEEE Symposium on Computers and Communications (ISCC), 2022, pp. 1–7. doi:10.1109/ISCC55528.2022.9912768.
[15] S. Silvestri, S. Islam, S. Papastergiou, C. Tzagkarakis, M. Ciampi, A machine learning approach for the nlp-based analysis of cyber threats and vulnerabilities of the healthcare ecosystem, Sensors 23 (2023). doi:10.3390/s23020651.
[16] S. Silvestri, S. Islam, D. Amelin, G. Weiler, S. Papastergiou, M. Ciampi, Cyber threat assessment and management for securing healthcare ecosystems using natural language processing, International Journal of Information Security 23 (2024) 31–50. doi:10.1007/s10207-023-00769-w.
[17] A. Omotosho, B. A. Haruna, O. M. Olaniyi, Threat modeling of internet of things health devices, Journal of Applied Security Research 14 (2019) 106–121. doi:10.1080/19361610.2019.1545278.
[18] H. Almohri, L. Cheng, D. Yao, H. Alemzadeh, On threat modeling and mitigation of medical cyber-physical systems, in: 2017 IEEE/ACM International Conference on Connected Health: Applications, Systems and Engineering Technologies (CHASE), 2017, pp. 114–119. doi:10.1109/CHASE.2017.69.
[19] G. Aracri, A. Folino, S. Silvestri, Integrated use of KOS and deep learning for data set annotation in tourism domain, Journal of Documentation 79 (2023) 1440–1458. doi:10.1108/JD-02-2023-0019.
[20] J. Jacobs, S. Romanosky, B. Edwards, I. Adjerid, M. Roytman, Exploit prediction scoring system (EPSS), Digital Threats 2 (2021). doi:10.1145/3436242.
[21] A.A.V.V., Common Vulnerability Scoring System version 3.1 Specification Document, Technical Report, FIRST.Org, 2019. URL: https://www.first.org/cvss/v3-1/cvss-v31-specification_r1.pdf.
[22] L. Breiman, Random forests, Machine learning 45 (2001) 5–32.