Model for forecasting the development of information threats in the cyberspace of Ukraine ⋆

Mariia Nazarkevych1,2,†, Victoria Vysotska1,3,*,†, Yurii Myshkovskyi1,†, Nazar Nakonechnyi1,† and Andrii Nazarkevych1,†

1 Lviv Polytechnic National University, 12 Stepana Bandera str., 79013 Lviv, Ukraine
2 Ivan Franko National University of Lviv, 1 Universitetska str., 79000 Lviv, Ukraine
3 Osnabrück University, 29 Neuer Graben, 49074 Osnabrück, Germany

CPITS-II 2024: Workshop on Cybersecurity Providing in Information and Telecommunication Systems II, October 26, 2024, Kyiv, Ukraine
∗ Corresponding author.
† These authors contributed equally.
mariia.a.nazarkevych@lpnu.ua (M. Nazarkevych); victoria.a.vysotska@lpnu.ua (V. Vysotska); yurii.myshkovskyi@lpnu.ua (Y. Myshkovskyi); nazar.i.nakonechnyi@lpnu.ua (N. Nakonechnyi); andrii.nazarkevych.ri.2023@lpnu.ua (A. Nazarkevych)
0000-0002-6528-9867 (M. Nazarkevych); 0000-0001-6417-3689 (V. Vysotska); 0009-0004-0051-026X (Y. Myshkovskyi); 0009-0000-2456-3498 (N. Nakonechnyi); 0009-0007-2078-8447 (A. Nazarkevych)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings, ceur-ws.org, ISSN 1613-0073

Abstract
Approaches to building models for forecasting the development of information threats in cyberspace have been developed. This is an urgent task at a time when fake news and information manipulation can affect public sentiment, politics, and the economy. The program uses machine learning and Natural Language Processing (NLP) techniques to detect fakes in a dataset. In the developed method, we train the model on a dataset in which true and fake news, or any other types of information, are already labeled. The model can then be used to classify new data. The dataset contains news that the average Ukrainian saw on the Internet during the war, on social networks such as Telegram, Facebook, and Twitter and on news sites. The language of the messages, which were in Ukrainian and Russian, is stored in a separate field. Another field records how many people liked and how many people shared each message. The dataset contains both fake news and real news. The F1 score is 0.98 for both classes (0: fake, 1: not fake). Such good results can be explained by the "laboratory" quality of the dataset. In further experiments, we will test the model on real-time news.

Keywords
information threats, cyberspace, fake messages, machine learning

1. Introduction

Today, society increasingly faces various types of cyberattacks: failures in the provision of electronic services, blocking of the work of state bodies, phishing attacks by e-mail, cybercrime, violations of data integrity and confidentiality, information and psychological pressure on the population, cyberterrorism, cyberespionage, information expansion into the national information space of the country, and the blocking or destruction of enterprises that are strategically important for the economy and security of the state, of life support systems, and of high-risk facilities [1, 2].

2. The main types of cyber-attacks

Malware is a type of program that can perform various malicious tasks. Some types of malware are designed to create persistent network access, some are designed to spy on a user to obtain credentials or other valuable information, and some are simply designed to disrupt operations. Some types of malware are designed to extort money from the victim. Probably the best-known form of malicious software is ransomware, which is designed to encrypt the victim's files and then demand a ransom for the decryption key [3].

Cyberspace, along with other territories, is recognized as one of the potential theaters of war, so the state's ability to protect its national interests there is considered an important component of cyber security.

2.1. Distributed attacks

Criminals actively search for vulnerabilities in assets (management systems) and develop tools with unique characteristics for this purpose: universal malicious software, encryption viruses, botnets that perform distributed attacks (DDoS) on operating networks and on production systems that use cloud services, as well as supply chain attacks. Given the progress of artificial intelligence technologies over the next 5–10 years, the scope and consequences of such interventions will grow. The expanding use of cyberspace by terrorist organizations (cyberterrorism) is becoming a global trend [4].

The new resolution of the Government of Ukraine will allow timely response and planning of cyber protection measures. This is the Resolution of the Cabinet of Ministers of Ukraine dated 04.04.23 No. 299 "Some issues of response of cyber security entities to various types of events in cyberspace" [5].

2.2. Ransomware or blackmailer

The following categories are distinguished. Malware is an attack on a wide audience, in particular on the Internet; "Ransomware or Blackmailer" is a special case of it. Distributed denial-of-service (DDoS) attacks are aimed at blocking the operation of a specific network resource. The attack can be implemented by three mechanisms, one of which is overflow of the communication channel. A denial-of-service (DoS) attack is a hacker attack on a computer system intended to bring it to failure, that is, to create conditions under which legitimate users cannot access the provided system resources (servers) or this access is impeded. Failure of the "enemy" system can also be a step towards taking over the system if, in an abnormal situation, the software releases some critical information, for example the version or part of the program code. DoS is a simplified variant of a DDoS attack. Its distinctive feature is that the moment of attack is clearly visible.

Table 1
Types of attacks and software that detects them

Attack type                            | ESET Internet Security | Avast Premium Security | Bitdefender Total Security | Avira Internet Security
Malicious software                     | +                      | +                      | +                          | +
Distributed denial of service (DDoS)   | +                      | +                      | +                          | +
Denial of service (DoS)                | +                      | +                      | +                          | +
Phishing (social engineering)          | +                      | +                      | +                          | +
Using SQL injections                   | +                      | +                      | +                          | +
Cross-Site Scripting (XSS)             | +                      | +                      | +                          | +
Botnets                                | +                      | +                      | +                          | +
Brute force attack                     | +                      | +                      | +                          | +
Drive-By Download                      | +                      | +                      | –                          | –
The Man in the Middle                  | +                      | +                      | –                          | –
Ransomware or blackmailer              | +                      | +                      | –                          | –
Unsuccessful authorization attempts    | +                      | +                      | +                          | +
Attempts to exploit vulnerabilities    | +                      | +                      | +                          | +
Publication of fraudulent information  | +                      | +                      | –                          | –
Network scanning                       | +                      | +                      | +                          | +

2.3. DoS attacks

DoS (denial of service) vulnerabilities stand apart among security threats (Fig. 1). As a rule, this class of attacks includes events described in the news as "Hackers attacked site X, disrupting its operation. The site was down for Y hours". The server receives requests that it cannot process (or does not have time to process); as a result, it has no time left for the requests of ordinary visitors and appears to them as not working. These attacks are not intended to steal data from the database but can help launch other types of attacks, that is, clear the path for them. For example, some programs can raise exceptional situations because of errors in their code.

Figure 1: DoS attacks

It is impossible to protect against DoS attacks completely, but it is possible to limit the number of login attempts from the same IP address within a certain amount of time, for example to no more than 5 in 10 minutes. When the limit is exhausted, the system shows a "wait" message or offers to enter a CAPTCHA. Some systems ask for a CAPTCHA at every login attempt [6].

2.4. Phishing attacks

Phishing is the practice of sending emails that appear to be from trusted sources in order to obtain personal information or to influence users to do something. It combines social engineering and technical techniques. It could be an email attachment that downloads malware to your computer. It can also be a link to an illegitimate website that tricks you into downloading malware or handing over your data. Spear phishing is a very targeted type of phishing. Attackers spend time researching targets and crafting messages that are personal and relevant. Therefore, spear phishing is very difficult to recognize and even more difficult to protect against. One of the easiest ways a hacker can conduct a spear phishing attack is email spoofing, where the information in the "From" section of an email is faked to make it look as if the email comes from someone you know, such as your management or a partner company. Another trick scammers use to give their story credibility is website cloning: they copy legitimate websites to trick you into entering personal information or login credentials [7].

2.5. Cross-site scripting attack

A cross-site scripting (XSS) attack occurs when a site has a vulnerability that allows the introduction of scripts (Fig. 2). Attackers use such vulnerabilities to introduce malicious JS scripts into the site's database. When a user subsequently requests this data, the user's web browser executes the malicious JS script. This allows an attacker to steal browser cookies and hijack the session. Hackers can then use the session information to exploit additional vulnerabilities, possibly gain network information, and control the user's computer. This is especially important in an enterprise environment, as a single XSS attack (Fig. 2) can compromise an entire network [8].

To avoid becoming a victim of an XSS attack, the following security rules should be observed:

• Filtering. All nested structures must be filtered.
• Encoding. When creating a filter, the risk of encoded attacks must be taken into account. There are many encoder programs that can encode an attack so that no filter will "see" it.
• Tags. There is a vulnerability related to the tags url, bb, and img, which have many parameters, including lowsrc and dynsrc, that can contain javascript. These tags should be filtered [3].

Figure 2: XSS attack

2.6. Brute force attack

A brute force attack, sometimes called a password attack, is one of the simplest forms of web attacks. The hacker simply tries different combinations of usernames and passwords over and over again until he gets into the user's account. Of course, one computer would need years to go through all the combinations. But when hackers gain control over several computers or develop a powerful software computing engine, things can become very simple. Brute force is one of the most popular methods of cracking passwords of online bank accounts, payment systems, or websites. But as the length of the password grows, this method becomes impractical because of the time it takes to go through all possible options [9].

3. Model for forecasting the development of information threats

One of the most common cyber threats is the penetration of false information into the information space of Ukraine. False news, also called fake news, occupies an important place in it. The information space of Ukraine needs new protection systems, because an uncontrolled process leads to the penetration of false information, which users then spread in every way. The development of methods and tools for monitoring and detecting misinformation on the Internet is an urgent task in the modern digital age, when fake news and manipulation of information can affect public sentiment, politics, and the economy.

NLP is a rapidly developing technology that helps businesses get the most out of artificial intelligence. Analytical research predicts an increase in the global NLP market from USD 20.98 billion in 2021 to USD 127.26 billion in 2027, with a compound annual growth rate (CAGR) [10] of 29.4%. Today, NLP methods are used to analyze text messages and search for signs of manipulation or fake information. For example, artificial intelligence can detect suspicious speech patterns that are typical of disinformation.

Automated fact-checking is also practiced: an automated fact-checking system can quickly compare information with reliable sources and determine whether it is reliable. For this, databases of verified information and algorithms for its analysis are used. In social networks, users' behavior is monitored to identify the disseminators of disinformation and to detect networks engaged in the manipulation of mass consciousness. Blockchain technology is used to ensure transparency of information, which can reduce the amount of fake news, as all information is transparently tracked. Crowdsourced fact-checking platforms, where users verify information themselves and share their results, also need to be developed, since they contribute effectively to the detection of disinformation.

Effective development of disinformation detection methods requires a combination of technological innovations with international cooperation, regulatory measures, and an increase in the digital literacy of users. Several methods and technologies based on artificial intelligence (AI) [11–13], machine learning [14, 15], and NLP are used to classify information as true or false [16, 17]. These methods make it possible to automate the verification of the authenticity of information and to identify disinformation quickly.

NLP is becoming an important part of modern systems. It is used intensively in search engines, language interfaces, document processors, etc. Computers are very good at dealing with structured data, but free-form text poses difficult tasks for them. The goal of NLP is to develop algorithms that allow computers to recognize free text and understand live speech. The amount of possible variation is one of the biggest challenges in NLP. Context is of great importance for understanding the meaning of individual sentences. People are very good at this because they learn to understand content over many years. We apply our knowledge to understand the context and know exactly what the other person is talking about. To overcome this problem, researchers in the field of NLP have begun to develop various applications using machine learning-based approaches. To develop such applications, huge arrays of text have to be collected and an algorithm trained to perform various tasks, such as text categorization, sentiment analysis, or topic modeling. In the process, the algorithms learn to detect patterns that repeat in the input text and to extract the content embedded in it.

Natural language has syntactic ambiguity, as shown by the proverb "Time is not a horse, you can't drive it and you can't stop it". For NLP, it is unclear whether the sentence is about a horse or about time. The Ukrainian language also has case ambiguity: in the phrases "Everyone was excited before the concert" and "It's not necessary to give before!" the word "before" refers to time or to place, which completely changes the meaning of the phrase. There is also referential ambiguity: in the phrase "Open the shelf and take out the wet umbrella, I want to dry it", a human reader resolves the pronoun "it" to the wet umbrella from its meaning, but for a machine, which has no understanding of reality, the pronoun may refer either to the shelf or to the umbrella. Another challenge in NLP is the presence of synonyms, because of which one concept can be expressed by several different words. As a result, documents that use synonyms may not be identified by the system. The influence of these phenomena is especially noticeable in machine translation systems. The problem lies in the difficulty of establishing a concrete mapping of the valid semantic-syntactic structure of a sentence into its internal logical representation, which is generated automatically by the system.

3.1. General norms for the formation of messages

Postulates are not explicitly stated in the editing literature, although they are always used when processing messages. We think that stating them makes the features of editing easier to understand. The postulates that, in our opinion, should be adopted in editing are the following:

• The message must contain new information for the recipient.
• The message must have a defined modality.
• The message must be adapted to the time, place, and situation in which it will be perceived by the recipient.
• The author must use language and word meanings known to the recipients.
• The message must be adapted to the recipient's thesaurus.
• The message should implement mechanisms only for the perception of information by the recipient.
• The message must implement means that force the recipient to perceive it.
• The message must be protected from noise.
• The message must comply with the norms adopted at a specific time in a specific society.

In addition to these postulates, which follow directly from the editing axiom, one more should be added: any general norm (postulate) or specific norm can be violated if this leads to the set goal.

The ambiguities described above can be resolved by introducing additional values that increase the program's knowledge of a particular domain. Today, there are no programs that "understand" all types of ambiguities across a wide range of domains, but there are programs that respond correctly to ambiguities in very narrow areas.

Classification of fakes

A fake (forgery) is false [18], often sensational information distributed under the guise of news, that is, fake news. Fakes are created to form attitudes gradually, step by step, that is, to shape reactions toward a certain social group. The biggest danger of fakes is their cumulative effect. Fakes distort reality and undermine trust in the media. Scientists from the University of Western Ontario distinguish five types of fakes:

• intentionally created fakes
• jokes perceived as truth
• large-scale hoaxes
• intentionally one-sided coverage of events
• stories in which the "truth" is contradictory (for example, a terrorist for some is a freedom fighter for others).

Fakes were first mentioned in 1981, when journalist Janet Cooke won a Pulitzer Prize for her story "Jimmy's World", written for The Washington Post and later exposed as fabricated. Stephen Glass worked for the Washington magazine The New Republic (TNR) from 1995 to 1998; he did not hunt for sensations but simply invented them, and half of his articles in TNR were fabricated. According to the observations of David Peterson [19], the editor of the Viralgranskaren project (Sweden), fakes are created by:

• viral sites, which seek an instant response from the audience
• pranksters, who want to tease the audience and target intellectuals
• scammers, who hook the reader and lead them to the scammer's goal
• people driven by ideological and political views, who are almost impossible to persuade otherwise
• foreign players
• and finally, ordinary people, who do not create fakes but distribute them.

4. The method of detecting fake messages

We use the Python Natural Language Toolkit (NLTK) [20] package to build the corresponding applications. The package must be installed first by entering the following command in a terminal window:

    $ pip3 install nltk

The use of neural networks and machine learning relies on labeled data: neural networks are trained on a large number of examples of true and false information. During training, the model analyzes various characteristics of the text: vocabulary, syntax, presentation style, and sources of information. The model then learns to distinguish between true and false information based on these features. Classification algorithms based on support vector machines, decision trees, and deep neural networks are used to build models that classify texts as true or false from statistical features.

NLP works by analyzing linguistic features. The system analyzes the text for emotional coloring, level of bias, and degree of confidence or uncertainty in the presentation of facts. For example, fake information often contains sensational or emotionally charged headlines and phrases. Search by keywords and phrases is also used: NLP technologies help find patterns or keywords often used in fake news, including elements of conspiracy theories or exaggerated claims.

Fact-checking is performed by checking published sources. Machine learning can automatically find links to information sources and verify their credibility using databases of trusted news organizations or official sources. A message can also be compared with other sources: algorithms can compare information with other available facts and detect inconsistencies.
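As a rough illustration of the linguistic-feature and keyword checks described in this section, the following Python sketch counts a few surface cues in a message: the share of sensational keywords, the number of exclamation marks, and the number of words written entirely in capitals. The cue list, the thresholds, and the function names `linguistic_features` and `looks_suspicious` are assumptions made for this example and do not come from the paper; a practical system would learn such cues from labeled data rather than hard-code them.

```python
import re

# Hypothetical cue list for illustration only (not taken from the paper).
SENSATIONAL_CUES = {"shocking", "urgent", "secret", "exposed", "miracle"}


def linguistic_features(text):
    """Compute simple surface features of a message."""
    tokens = re.findall(r"\w+", text.lower())   # lowercased word tokens
    words = re.findall(r"[^\W\d_]+", text)      # alphabetic words, original case
    total = max(len(tokens), 1)                 # avoid division by zero
    cue_hits = sum(1 for token in tokens if token in SENSATIONAL_CUES)
    shouted = sum(1 for word in words if len(word) > 2 and word.isupper())
    return {
        "cue_ratio": cue_hits / total,          # share of sensational keywords
        "exclamations": text.count("!"),        # emotional punctuation
        "shouted_words": shouted,               # words written in capitals
    }


def looks_suspicious(text, cue_threshold=0.2, min_signals=2):
    """Flag a message when at least `min_signals` cues fire (assumed rule)."""
    features = linguistic_features(text)
    signals = [
        features["cue_ratio"] >= cue_threshold,
        features["exclamations"] >= 2,
        features["shouted_words"] >= 1,
    ]
    return sum(signals) >= min_signals


if __name__ == "__main__":
    for message in (
        "SHOCKING secret exposed!!! Share now",
        "The ministry published its quarterly energy report.",
    ):
        print(looks_suspicious(message), linguistic_features(message))
```

Features of this kind would normally be passed to a trained classifier together with other signals rather than used as a decision rule on their own.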
Comparison with other sources is especially useful for checking news that is shared on social media.

Metadata analysis can be performed by establishing the publication time and examining the change history. Models can use metadata (time of creation, geographic location) to detect suspicious material. For example, fast-spreading news from new or unknown accounts can be filtered out as potentially fake. Some systems use AI to analyze writing style and identify possible signs of automated content generation or bot use.

Basic techniques used in NLP

Tokenization

Also called word segmentation, tokenization is one of the simplest and most important techniques (see Fig. 3). It is an important preprocessing step in which a long string of text is broken into smaller units called tokens. Tokens include words, symbols, and sub-words. They are the building blocks of NLP, and most NLP models process raw text at the token level. The most common tokenization process is whitespace (unigram) tokenization, in which the entire text is broken into words by splitting it at spaces.

Figure 3: Tokenization

The same term is formed differently in different languages. In German, Verarbeitung natürlicher Sprache: capital letters for nouns are mandatory and nouns are declined in 4 cases; here the change occurs only in the ending of the adjective natürlicher, where the ending -er marks the genitive case of the feminine gender in agreement with Sprache, so the literal German equivalent is "processing of natural language". In Polish, Przetwarzanie języka naturalnego (natural language processing). In French, Traitement du langage naturel. In Russian the term is likewise "processing of natural language". In Ukrainian it is "processing of natural language" in the nominative case, and there are 5 more cases, so several variants of this stable keyword combination are possible ("processing of natural language", "natural language processing", etc.). For tokenization, however, all of them should correspond to "natural language processing". This is the problem of changing endings: without a preliminary morphological analysis based on the modified Porter algorithm, it is not possible to tokenize and lemmatize Ukrainian-language texts correctly, or to determine the set of keywords in messages and news (see Fig. 4).

Figure 4: Classification of the main methods of natural language processing

A token is an atomic meaningful object formed from a sequence of [1, N] characters. Tokens are identified on the basis of regular expressions and of position in the character sequence, sentence, and context. This is not graphemic analysis, which merely separates groups of characters between punctuation marks. Tokens are identified by the rules of the lexer, taking into account the grammatical features already obtained at the previous step of morphological analysis (MA), according to the natural language of the input text, in particular:

• Marking up the set of incoming text characters into a set of tokens.
• Identification of a separate token as a logical linguistic unit of the text (word, mathematical sign, number, punctuation mark, etc.).
• Establishing a relationship between a token and its lexeme, the specific text of the token ("for", "1979", "+", "variable", ".", "р.", ";", etc.).
• Identification of additional token attributes (for example, a period as a sentence boundary or as part of a contraction).
• Forming a tuple of tokens as input information for syntactic analysis (SA).

The lexical analyzer does not check the correctness of the links in the tuple of tokens. It recognizes parentheses, punctuation marks, and math symbols as characters, but does not check that each "(" is matched by a ")" or that each math symbol stands between two specific numbers.

4.1. Stemming and lemmatization

After tokenization, the next preprocessing step is stemming or lemmatization (Fig. 5). These methods derive a root word from the various existing variants of the word. Stemming and lemmatization [21, 22] are two different ways of trying to identify a root word. Stemming works by removing the end of a word. This NLP technique may or may not work depending on the word. For example, it works on "sticks" but not on an irregular form such as "stuck". Lemmatization is a more sophisticated technique that uses morphological analysis to find the base form of a word, also called a lemma.

Figure 5: Stemming and lemmatization

4.2. Morphological segmentation

Morphological segmentation is the process of dividing words into the morphemes that make them up. A morpheme is the smallest unit of language that carries meaning. Some words, such as "table" and "lamp", contain only one morpheme. Other words can contain several morphemes. For example, the word "energy-saving" contains two morphemes: "energy" and "saving". Like stemming and lemmatization, morphological segmentation can help preprocess the input text [23, 24].

4.3. Morphological analysis

There are two types of part-of-speech (POS) taggers: rule-based and stochastic.

Rule-based POS tagger: for words with ambiguous meaning, a rule-based approach that uses context information is applied [25]. This is done by checking or analyzing the meaning of the previous or next word, so information is drawn from the word's environment. Words are therefore tagged using the grammatical rules of a particular language, such as the use of capital letters and punctuation marks.

Stochastic POS tagger: if a word is most often marked with a certain tag in the training set, then it is assigned this tag in the test sentence. This method is not always accurate. Another way is to calculate the probability of a certain tag appearing in a sentence; the final tag is then the one that maximizes the probability of the word with a given tag.

4.4. Sentiment analysis

Sentiment analysis, also known as emotion AI or opinion mining, is the process of analyzing text to determine whether it is generally positive, negative, or neutral. As one of the most important NLP techniques for text classification, sentiment analysis is commonly used for applications such as the analysis of user-generated content. It can be used for a variety of text types, including reviews, comments, tweets, and articles [26, 27].

For example, the analysis and identification of the psychological effects embedded by the author in textual content depend on the availability of a personalized dictionary of the author and a sentiment dictionary for the region concerned. Words do not carry the same emotional coloring in different languages, in different regions, or even for different people, so a simple translation will not yield a real description of a person's psychological state. Statistical methods are used in content analysis to identify the state of social consciousness or emotional coloring in order to promote relevant political and/or commercial advertising in social networks. In linguistic monitoring, in addition to the methods listed, regular expressions and a bag of words are used to study the functioning of language in a specific scientific, political, or mass media discourse. The purpose of monitoring also includes recognition of fakes, propaganda, and disinformation in the case of information threats, and identification of foreign language borrowings, plagiarism or rewriting, grammatical and stylistic errors, vocabulary of emotions and feelings, thematic, spatial, and temporal vocabulary, etc.

5. Processes of machine learning

Machine learning methods have set new accuracy records in fields such as NLP [28, 29]. This success was made possible by large amounts of training data and by the availability of huge capacities for parallel computation on modern graphics processors. Each search query in Google triggers several AI models at once, such as text recognition and personalization of the search results. The spam detection system in Gmail works in the same way, identifying fraudulent messages (Fig. 6) [30, 31]. The method of detecting fake news is shown in Fig. 7.

Figure 6: Processes of machine learning (flowchart: Begin, Loading data, Pre-processing of data, Model training, Assessment of accuracy, End)

Figure 7: The method of detecting fake news (blocks: Dataset news, Preprocessing the data, Feature extractor, Training the classifier, Content classification, Truthful content / Fake content)

6. Experiments

For this study, a dataset was formed that includes more than a thousand fake and real news items. The dataset format is shown in Fig. 8. The dataset contains news that the average Ukrainian saw on the Internet during the war, on social networks such as Telegram, Facebook, and Twitter and on news sites. A separate field records the language of the messages, which were in Ukrainian and Russian. Another field records how many people liked and how many people shared the message. The dataset contains both fake news and true news. For clarity, the next fields hold the author of the message and the web address of the site where the news was read [32–34].

Figure 8: Format dataset

Figure 9: Analysis of fake and real news

Figure 10: Analysis of Ukrainian, Russian, and English news

Figure 11: Classification Fake, True in the Telegram

Figure 12: Classification Fake, True in the WWW

A bag-of-words (BOW) representation and logistic regression were used for the forecast model. The results of the model are shown in Fig. 13.
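The paper does not list its code or name the library behind the BOW and logistic regression functions, so the following is only a minimal from-scratch sketch of that pipeline in Python, using the standard library alone. It builds a bag-of-words vocabulary, trains a logistic regression classifier with stochastic gradient descent, and computes the F1 score per class. The six toy texts, the learning rate, the number of epochs, and all function names are assumptions made for this example; the labels follow the paper's convention (0: fake, 1: not fake).

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def build_vocabulary(texts):
    """Map every token seen in the training texts to a column index."""
    vocab = {}
    for text in texts:
        for token in tokenize(text):
            vocab.setdefault(token, len(vocab))
    return vocab


def bow_vector(text, vocab):
    """Bag of words: count how often each vocabulary token occurs."""
    vector = [0.0] * len(vocab)
    for token, count in Counter(tokenize(text)).items():
        if token in vocab:  # tokens unseen during training are ignored
            vector[vocab[token]] = float(count)
    return vector


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic_regression(X, y, learning_rate=0.5, epochs=300):
    """Fit weights and bias with plain stochastic gradient descent."""
    weights, bias = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for features, label in zip(X, y):
            z = sum(w * x for w, x in zip(weights, features)) + bias
            error = sigmoid(z) - label  # gradient of the log-loss w.r.t. z
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


def predict(text, vocab, weights, bias):
    """Return 1 (not fake) or 0 (fake) for a single text."""
    features = bow_vector(text, vocab)
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if sigmoid(z) >= 0.5 else 0


def f1_score(y_true, y_pred, positive):
    """F1 is the harmonic mean of precision and recall for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Invented toy examples (not from the paper's dataset); 0 = fake, 1 = not fake.
texts = [
    "Shocking secret cure that doctors hide from you",
    "Urgent, share now: secret plot revealed",
    "City council approves new budget for schools",
    "Ministry publishes official report on energy prices",
    "Shocking plot revealed, share before it is deleted",
    "Official statistics show stable energy prices",
]
labels = [0, 0, 1, 1, 0, 1]

vocab = build_vocabulary(texts)
X = [bow_vector(text, vocab) for text in texts]
weights, bias = train_logistic_regression(X, labels)
predictions = [predict(text, vocab, weights, bias) for text in texts]
print("F1 (fake):", f1_score(labels, predictions, positive=0))
print("F1 (not fake):", f1_score(labels, predictions, positive=1))
```

Here the model is scored on the same toy texts it was trained on, so the printed F1 values only confirm that the pipeline runs; they say nothing about the 0.98 reported in the paper, which would require a held-out test split of the real dataset.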
Figure 13: Model results

The F1 score is 0.98 for both classes (0: fake, 1: not fake). Such good results can be explained by the "laboratory" quality of the dataset. In further experiments, we want to focus on validating the model on real-time news.

7. Conclusions

An analysis of attacks in the cyberspace of Ukraine was carried out. For each attack it is necessary to form a countermeasure, which takes the form of new software, new hardware, etc.

One of the most common threats is the penetration of false information into social networks and chatbots, so it is necessary to detect fakes and remove this type of news in every possible way.

A dataset of fake and real news has been created.

A machine learning program was developed that makes it possible to evaluate current news as real or fake.

Acknowledgments

The research was carried out with the grant support of the National Research Fund of Ukraine "Information system development for automatic detection of misinformation sources and inauthentic behaviour of chat users", project registration number 187/0012 from 1/8/2024 (2023.04/0012). We would also like to thank the reviewers for their precise and concise recommendations, which improved the presentation of the results obtained.

References

[1] O. Trofymenko, Monitoring the State of Cyber Security in Ukraine, Legal Life of Modern Ukraine: Mater. International Science and Practice Conference, 1 (2019) 642–646.
[2] O. Trofymenko, et al., Cybersecurity of Ukraine: Analysis of the Current State, Ukrainian Inf. Secur. Res. J. 21(3) (2019) 150–157.
[3] V. I. Yashchuk, The Role and Place of the Cyber Security Strategy of Ukraine in Ensuring the Information Security of the State (2024).
[4] Some Issues of Response by Cyber Security Entities to Various Types of Events in Cyberspace: Resolution of the Cabinet of Ministers of Ukraine dated 04.04.2023 No. 299.
[5] E. Altulaihan, M. A. Almaiah, A. Aljughaiman, Anomaly Detection IDS for Detecting DoS Attacks in IoT Networks Based on Machine Learning Algorithms, Sensors, 24(2) (2024) 713.
[6] M. A. Tamal, et al., Unveiling Suspicious Phishing Attacks: Enhancing Detection with an Optimal Feature Vectorization Algorithm and Supervised Machine Learning, Frontiers in Computer Science, 6 (2024) 1428013.
[7] A. Hannousse, S. Yahiouche, M. C. Nait-Hamoud, Twenty-Two Years Since Revealing Cross-Site Scripting Attacks: A Systematic Mapping and a Comprehensive Survey, Computer Science Review, 52 (2024) 100634.
[8] R. Alhamyani, M. Alshammari, Machine Learning-Driven Detection of Cross-Site Scripting Attacks, Information, 15(7) (2024) 420.
[9] R. A. Febrian, Y. Muhyidin, D. Singasatia, Analisis Penyerangan Bruteforce Terhadap Secure Shell (Ssh) Menggunakan Metode Penetration Testing, Scientica: Jurnal Ilmiah Sains dan Teknologi, 2(11) (2024) 151–162.
[10] M. A. Paranjape, S. Sathe, M. A. A. Abkari, Study On Awareness and Perceptions of Individual Investors Towards Cagr On Equity Shares, J. Econom. 17 (2024).
[11] O. Mykhaylova, et al., Person-of-Interest Detection on Mobile Forensics Data: AI-Driven Roadmap, in: Workshop on Cybersecurity Providing in Information and Telecommunication Systems, CPITS, vol. 3654 (2024) 239–251.
[12] V. Buhas, et al., Cybersecurity Role in AI-Powered Digital Marketing, in: Workshop on Digital Economy Concepts and Technologies Workshop, DECaT, vol. 3665 (2024) 1–11.
[13] V. Buhas, et al., AI-Driven Sentiment Analysis in Social Media Content, in: Workshop on Digital Economy Concepts and Technologies Workshop, DECaT, vol. 3665 (2024) 12–21.
[14] V. Zhebka, et al., Optimization of Machine Learning Method to Improve the Management Efficiency of Heterogeneous Telecommunication Network, in: Workshop on Cybersecurity Providing in Information and Telecommunication Systems, vol. 3288 (2022) 149–155.
[15] V. Zhebka, et al., Methodology for Predicting Failures in a Smart Home based on Machine Learning Methods, in: Workshop on Cybersecurity Providing in Information and Telecommunication Systems, CPITS, vol. 3654 (2024) 322–332.
[16] O. Romanovskyi, et al., Prototyping Methodology of End-to-End Speech Analytics Software, in: 4th International Workshop on Modern Machine Learning Technologies and Data Science, vol. 3312 (2022) 76–86.
[17] I. Iosifov, O. Iosifova, V. Sokolov, Sentence Segmentation from Unformatted Text using Language Modeling and Sequence Labeling Approaches, in: IEEE 7th International Scientific and Practical Conference Problems of Infocommunications. Science and Technology (2020) 335–337. doi: 10.1109/PICST51311.2020.9468084.
[18] A. Ghai, P. Kumar, S. Gupta, A Deep-Learning-based Image Forgery Detection Framework for Controlling the Spread of Misinformation, Information Technology & People, 37(2) (2024) 966–997.
[19] E. R. Peterson, et al., The Impact from Galaxy Groups on Cosmological Measurements with Type Ia Supernovae, arXiv preprint arXiv:2408.14560 (2024).
[20] J. Shen, et al., Citekit: A Modular Toolkit for Large Language Model Citation Generation, arXiv (2024). doi: 10.48550/arXiv.2408.04662.
[21] O. Toporkov, R. Agerri, Evaluating Shortest Edit Script Methods for Contextual Lemmatization, arXiv (2024). doi: 10.48550/arXiv.2403.16968.
[22] M. Medykovskyy, Methods of Protection Document Formed from Latent Element Located by Fractals, in: 10th Int. In Scient. and Techn. Conf. Comp. Sci. and Infor. Techn. (CSIT) (2015) 70–72.
[33] N. Pasieka, et al., Lego Technology as a Means of Enhancing the Learning Activities of Junior High School Students in the Conditions of the New Ukrainian School, International Conference on Interactive Collaborative Learning (2022) 530–541.
[34] P. Skladannyi, et al., Improving the Security Policy of the Distance Learning System based on the Zero Trust Concept, in: Cybersecurity Providing in Information and Telecommunication Systems, vol. 3421 (2023) 97–106.
doi: 10.1109/STC- CSIT.2015.7325434. [23] R. Groenendijk, L. Dorst, T. Gevers, HaarNet: Large- Scale Linear-Morphological Hybrid Network for RGB- D Semantic Segmentation, International Conference on Discrete Geometry and Mathematical Morphology (2024) 242–254. [24] M. Nazarkevych, et al., Evaluation of the Effectiveness of Different Image Skeletonization Methods in Biometric Security Systems, Int. J. Sensors Wireless Commun. Control, 11(5) (2021) 542–552. [25] V. Vysotska, et al., NLP Tool for Extracting Relevant Information from Criminal Reports or Fakes/Propaganda Content, in: IEEE 17th International Conference on Computer Sciences and Information Technologies (CSIT) (2022) 93–98. [26] J. O. Krugmann, J. Hartmann, Sentiment Analysis in the Age of Generative AI, Customer Needs and Solutions, 11(1) (2024) 3. [27] K. Alieksieieva, A. Berko, V. Vysotska, Technology of Commercial Web-Resource Processing, in: 13th International Conference: The Experience of Designing and Application of CAD Systems in Microelectronics, CADSM (2015). [28] V. Hrytsyk, M. Nazarkevych, Real-Time Sensing, Reasoning and Adaptation for Computer Vision Systems, International Scientific Conference Intellectual Systems of Decision-making and Problems of Computational Intelligence, Proceedings (2022) 573–585. [29] I. Tsmots, et al., Basic Components of Neuronetworks with Parallel Vertical Group Data Real-Time Processing, Advances in Intelligent Systems and Computing II: Selected Papers from the International Conference on Computer Science and Information Technologies, CSIT (2018) 558–576. [30] I. Khomytska, V. Teslyuk, The Multifactor Method Applied for Authorship Attribution on the Phonological Level, In COLINS (2020) 189–198. [31] I. Tsmots, et al., The Method and Simulation Model of Element Base Selection for Protection System Synthesis and Data Transmission, Int. J. Sensors Wireless Commun. Control, 11(5) (2021) 518–530. [32] N. 
Pasieka, et al., Harmful Effects of Fake Social Media Accounts and Learning Platforms, in: Cybersecurity Providing in Information and Telecommunication Systems, vol. 2923 (2021) 258–271. 250