Advanced Communication Model with the Voice Control and the Increased Security Level

Serhii Kulibaba1, Svitlana Popereshnyak2, Yurii Shcheblanin1, Oleg Kurchenko1, and Nataliia Mazur3

1 Taras Shevchenko National University of Kyiv, 60 Volodymyrska str., Kyiv, 01601, Ukraine
2 State University of Telecommunication, 7 Solomyanska str., Kyiv, 03680, Ukraine
3 Borys Grinchenko Kyiv University, 18/2 Bulvarno-Kudriavska str., Kyiv, 04053, Ukraine

Abstract
Voice assistant technologies are actively used in modern solutions. They are implemented as software that can perform tasks or provide services in response to voice commands by processing and interpreting human speech. Most communication tools are easy to use, but offer only typical, limited functionality for commercial use, in particular with regard to security services. The paper proposes a model of a communication tool with a voice assistant and an increased level of security, which uses an advanced encryption method and a face recognition algorithm to provide security services. To build this model, the internal logic of performing functions and placing objects in systems with voice assistant functions was considered. The proposed solution uses a client-server architecture. As an additional direction of commercial use, the paper considers extending the communication tool with auxiliary functionality in the form of a currency exchange module, as well as decentralization of the system.

Keywords
Software, encryption, face recognition, decentralization, currency exchange.

1. Introduction

Modern information technologies provide users with a wide range of services. Means of communication are actively used for their stable functioning. They can take the form of a service, an application, etc. [1–3].

There is a significant number of similar tools on the market; they differ in their purpose, supported services, and technological solutions. Developers of communication tools are trying to expand their capabilities in the direction of providing commercial services. However, existing implementations in this direction do not meet the needs of users.

Users prefer means of communication with a convenient, functional, and understandable interface, available visual effects, fast information exchange, reliable storage of information, confidentiality, opportunities for commercial use, etc. [4, 5]. Commercial products and development companies try to implement a number of these requirements, but they do not always demonstrate the capabilities they have implemented.

The relevance of the work lies in the high interest of users in communication tools with a voice assistant, an inclusive interface, an increased level of security, and opportunities for commercial use.

The novelty of the work lies in improving existing approaches to recognizing named entities in program text by taking contextual information into account. The work also proposes ways to increase the security of the communication tool by improving the encryption method and the user's face recognition algorithm, and to extend functionality that is limited in most commercial products.

CPITS-2022: Cybersecurity Providing in Information and Telecommunication Systems, October 13, 2022, Kyiv, Ukraine
EMAIL: kulibseryyy@gmail.com (S. Kulibaba); spopereshnyak@gmail.com (S. Popereshnyak); sheblanin@ukr.net (Y. Shcheblanin); kurol@ukr.net (O. Kurchenko); n.mazur@kubg.edu.ua (N. Mazur)
ORCID: 0000-0002-7316-1214 (S. Kulibaba); 0000-0002-0531-9809 (S. Popereshnyak); 0000-0002-3231-6750 (Y. Shcheblanin); 0000-0002-3507-2392 (O. Kurchenko); 0000-0001-7671-8287 (N. Mazur)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org)

2. Analysis of Publications, the Status of the Issue and the Statement of the Problem

2.1. Analysis of Research and Publications

Recent research in the fields of machine learning and natural language processing has made it possible to create impressive things that seemed impossible before [6]. Recognition of images and voice commands, recommendation systems, self-driving cars, exoplanet searches, prediction of fraudulent bank transactions, and military technology are just a short list of what has become possible thanks to the rapid growth of available information and the development of the aforementioned fields of science. One such technology is the voice assistant: software that can perform tasks or provide services for the user based on given voice commands, that is, by processing and interpreting human speech [7].

Voice control is an important emerging feature that is changing the way people live. Voice assistants are commonly used in smartphones and laptops. AI-powered voice assistants are systems that can recognize the human voice and respond with integrated voices. In [8], a voice assistant collects audio data using a microphone, converts it into text, and sends it to users.

AI-powered voice assistants such as Alexa and Siri are increasingly replacing search engines as users actively use them to solve a variety of everyday tasks. Technology providers, as well as marketers, are increasingly working to attract customers to communication tools with voice assistant functions. The work [9] analyzes the possibilities and options of using communication tools with voice assistants.

In [10], a voice control system based on an artificial intelligence (AI) assistant is proposed. An AI assistant system that uses Google Assistant, a representative artificial intelligence service with an open API, and a conditional automatic launch system were developed. The proposed technology is expected to be applied to various control systems based on voice recognition.

In [11], a means of communication with a voice assistant for mobile phones running the Android operating system is considered. The tool allows voice control. A native interface is provided for a large continuous speech recognition system based on the open-source Kaldi speech recognition toolkit.

Thanks to advances in voice recognition, users can easily control any device in a smart home by simply speaking a voice command. Based on this idea, a new group of smart devices called voice assistants has been developed and released. However, voice itself is not secure and can be attacked in many ways. To protect against voice attacks, the voice liveness detection system of [12] can be used. It relies on the fact that mouth movements during speech change the size of the ear canal space, which in turn changes the air pressure in the ear canals. This approach allows the user's ear canal pressure data to be compared and matched with voices to verify and identify the source of the voice.

2.2. Analysis of the State of the Issue in the Applied Field

Currently, there is a significant number of means of communication, but not all of them are reliable and convenient. The proposed model of a means of communication with an increased level of security takes into account the shortcomings of existing solutions and makes commercial use possible.

The proposed model combines a number of solutions that are not available from competitors, which is intended to ensure its commercial success [13].

2.3. Analysis of Existing Means of Communication

A number of software communication tools have succeeded and taken a confident position in the market. Highly successful solutions include the following means of communication:
1. Viber (Viber Media, Inc.) [14]: exchange of messages; calls; creation of custom bots; convenient to use.
2. WhatsApp (Facebook) [15]: exchange of messages; calls; convenient to use.
3. Messenger (Facebook) [16]: exchange of messages; calls; convenient to use.
4. Telegram (Telegram FZ-LLC) [17]: exchange of messages; calls; creation of custom bots; convenient to use.
5. Instagram (Facebook) [18]: exchange of messages; calls; display of information about the user if the profile is not closed; information management; convenient to use.

2.4. Formulation of the Problem

Many means of communication are easy to use, but most of them lack the ability to expand functionality. Not everyone can use such tools without the help of others. With modifications such as a voice assistant, an increased level of security, an inclusive interface, and the provision of commercial services, the share of users of these tools may increase significantly.

3. Study Purpose and Objectives

The purpose of the research is to develop a model of an improved means of communication that provides an increased level of security and commercial opportunities. The feasibility of integrating a voice assistant into such tools, making them inclusive, is also considered. Currently, this technology is being added to most commercial products, but it is absent in means of communication. Therefore, this modification is considered the newest among the software available on the market.

Adding the ability to track exchange rates and to exchange and transfer currency will expand the possibilities of using this tool. There are comparable means of communication on the market that let users exchange their own funds with other users, but only a minority provide material assistance. Not everyone is able to personally exchange or transfer their own funds to others. Therefore, a modification in the form of a voice assistant in the communication system will be useful.

The integration of the cryptographic data encryption module and the facial recognition function into the means of communication increases the level of security and reliability.

The set goal is achieved by solving the following tasks:
1. Analyzing the capabilities of the means of communication present on the market.
2. Developing a model of a means of communication.
3. Improving the encryption method and the face recognition method.
4. Integrating the commercial capabilities module into a communication tool with support for voice assistant functions.

4. Development of a Model of an Improved Means of Communication with a Voice Assistant

4.1. Organization of the Process of Interaction with the System

User interaction with the system is carried out using visual means. With these tools, the user can work with the system and make full use of its potential. Visual tools in such systems are called widgets [19].

For any user to be able to interact with the system, its functionality must be expanded: authorization, registration, creating a dialogue with other users, etc. All or most of these functions should be built into the voice assistant. Fig. 1 shows a use case diagram that demonstrates the main functionality of the system.

Figure 1: Use case (precedent) diagram

Voice assistant. A complete set of algorithms that automatically performs functions. This feature will be added and will be available to all users of the system.

4.2. Analysis of the Function of Means of Communication in Relation to Data Saving

To save data and maintain communication with other people through a means of communication, equipment is used that makes it possible to store and display data in software (Fig. 2). Data is stored in a database [20].

Figure 2: Cooperation diagram of data processing

4.3. Analysis of the Use of Cryptographic Protection in Means of Communication

Cryptographic encryption is the replacement of the data structure with symbols and the creation of a key that maps the symbols back to the original structure [21]. Encryption is used for three purposes:
1. Confidentiality.
2. Immutability.
3. Confirmation of the source.

Confidentiality. Thanks to cryptographic protection methods, information can be made inaccessible to persons trying to steal it from the outside.

Immutability. Encrypted information cannot be tampered with during transmission or storage.

Confirmation of the source. Encrypted information carries information about the sender.

Data transfer from one user to another occurs as follows:
1. The original text, image, or video is encrypted using an encryption algorithm, and the recipient has a special key to decrypt this data.
2. The encrypted message is sent to the recipient.
3. The recipient decrypts the message using the special key.

4.4. Increasing the Level of Data Protection

In this work, the Pattern Reverse Subtraction (PRS) encryption method is used. This method processes the input data by converting the binary value of the current symbol to decimal, after which the key is applied to form the encrypted data. Encryption proceeds as follows:
1. Creating a key.
2. Creating a reverse key.
3. Receiving bytes from the input data.
4. Subtraction of the current byte until the condition 0 ≤ L is met, where L is the length of the key.
5. This number is taken as an index into the reverse key. The output number is taken as the index of the value in the reverse key.
6. Adding a number after each operation on the byte, which reflects the number of subtractions performed before a valid value was obtained.

Decryption uses the following formula:

\[ \sum_{i=1}^{n} K + L, \tag{1} \]

where n is the number saved in the encrypted file;
K is the symbol index in the reverse key;
L is the decimal number.

Compression. To reduce the physical memory required on the equipment, the following approach to data storage is used (Fig. 3) [22].

Figure 3: Data compression

Data compression is divided into two types:
1. With losses.
2. Without losses.

With losses. Bits of data considered unnecessary are removed, so decompression recovers the raw data only approximately.

Without losses. The information is compressed without discarding any symbols of the input data, so decompression restores the original data exactly.

When studying this encryption method, it was decided to use compression with losses for large amounts of data. The study showed that this type of compression did not introduce errors or inaccuracies into the original data.

4.5. Algorithm of Recognition of Characteristic Points on the Face

One of the methods of maintaining confidentiality and preserving the reliability of the program is authentication using characteristic points on the face, that is, face recognition (Fig. 4).

Face recognition is a means of recognizing the characteristic points of a person's face and verifying them. Characteristic points are determined on the face and make it possible to recognize the corresponding person [23].

Figure 4: Cooperation diagram of person verification

This technology is needed to increase the level of data security of registered users.

4.6. Performance Criteria

Functional completeness means performing the main functions of effective management and providing a convenient interface for the user [24]. The functional completeness is as follows:
1. Authorization. Inclusive authorization.
2. Registration. Inclusive registration.
3. Creating a chat for conversations with the possibility of using a voice assistant.
4. Creating group chats for conversations with the possibility of using a voice assistant.
5. Exchange of messages with the possibility of using a voice assistant.
6. Sending photos and videos.
7. Viewing videos and photos using a voice assistant.
8. Sending voice messages.
9. Converting voice messages into text.
10. Changing profile settings.

Portability is a property that indicates how easy it is to transfer or install the system on a given device. Current software development tools can build an executable program for various desktop and mobile operating systems [25].

Reliability and fault tolerance. The system will have a number of functional capabilities whose processing of data and clients can be corrected in the future, so reliability and fault tolerance are at a sufficient level [26].

Security. In any case, the data will be transferred over the network between the server devices. Therefore, the proposed PRS data encryption method with E2E (end-to-end) data transmission is used [27]. For authorization in the system, it will be possible to add a further authentication factor: facial recognition.

Extensibility is the possibility of adding to or modifying the software product. The diagram in Fig. 5 shows the structure of the future software product.

Productivity. Well-written software makes a high level of productivity achievable. High performance also requires equipment that ensures fast processing of input and output data, in particular voice requests from users [28].

5. Improving the Means of Communication by Adding Commercial Opportunities

5.1. Cryptocurrency and Other Currencies Exchange Module

The user audience is planned to grow through the addition of financial transactions. This extension will make the software product more successful. A minimal increase in the commission for currency transfers will increase the profit earned over the corresponding period of time.
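The effect of a small commission increase can be illustrated with a short calculation. The sketch below is not part of the proposed system: the transfer amounts, the two commission rates, and the helper names `transfer_fee` and `period_revenue` are hypothetical values chosen only to show how per-transfer fees add up over a period.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def transfer_fee(amount: Decimal, rate: Decimal) -> Decimal:
    """Commission charged on a single transfer, rounded to cents."""
    return (amount * rate).quantize(CENT, rounding=ROUND_HALF_UP)


def period_revenue(transfers: list[Decimal], rate: Decimal) -> Decimal:
    """Total commission collected over all transfers in a period."""
    return sum((transfer_fee(a, rate) for a in transfers), Decimal("0"))


# Hypothetical transfers made during one period.
transfers = [Decimal("100.00"), Decimal("250.00"), Decimal("40.00")]

# Hypothetical commission rates: 1.0% before and 1.2% after the increase.
base = period_revenue(transfers, Decimal("0.010"))
raised = period_revenue(transfers, Decimal("0.012"))

print(f"revenue at 1.0%: {base}")
print(f"revenue at 1.2%: {raised}")
print(f"additional revenue: {raised - base}")
```

`Decimal` is used instead of floating point so that monetary amounts are rounded predictably; a real deployment would take its rates and rounding rules from the partner's terms.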
Financial institutions are interested in cooperating with such tools, because they can expand their own user base and increase their profit. Cooperation will take place through the API (Application Programming Interface) of the financial institutions, which lets their services interact with the developed software and the users of the system [29].

An account will be created for the financial support of the relevant users. It will receive a percentage of the funds transferred by other users, and the collected funds will be sent to the relevant rehabilitation institutions.

5.2. Decentralization

Decentralization is the process of distributing people, rules, computation, and stored data away from a single location or central governing body [30].

The internal structure of the decentralized system consists of blocks and chains of a single profile, which are connected to each other and are called a blockchain. Within this structure, blocks hold general information, and a chain is the connection between blocks. The content of the information may depend on the conditions specified in the system software: time of sending funds, message, comment, etc.

When applying blockchain technology, it is necessary to define rules and distribute them between users and computing devices, and to achieve consensus, in particular by connecting other users who verify the authenticity of data against their stored copies [31, 32].

This improvement will be useful for people who do not want to lose their personal information and who want to use the application without entering information about themselves [33].

Figure 5: Components diagram

5.3. Analysis of the Success of Improvements

The implementation of points 5.1 and 5.2 will raise the commercial attractiveness of the improved means of communication. Currency exchange in a closed system can be of interest to a large number of users.

The following formula can be used to estimate the probability that improving the means of communication will succeed:

\[ C = A \times B \times D, \tag{2} \]

where C is the probability that the changes will be successful;
A is dissatisfaction with the existing situation;
B is a clear formulation of the goal of the changes;
D is specific first steps to achieve the goals.

6. Conclusion

The paper examines how means of communication called "messengers" function. To build a model of an improved means of communication with a voice assistant, the internal logic of performing functions and placing objects in similar tools was considered. The work proposes improved principles for increasing the security of products of this type by improving the encryption method and the face recognition algorithm, as well as adding functionality that is missing in most commercial products. The tool is built using a client-server architecture. This security model and method make it possible to protect data at a sufficient level. To make the system convenient for users, adding a voice assistant was considered. This modification will be useful for all users of the system. The work also proposes expanding the system with auxiliary functionality in the form of a currency exchange module, which is intended for a certain category of users interested in this system.

The system is built on a microservice approach, which ensures scalability and low resource consumption when adding new functionality or replacing existing implementations. The created voice assistant for integrated development environments increases the productivity of developers and also makes it possible to support the standards and conventions of inclusive interfaces.

Based on the results of the research, the logic of the communication tools was determined. This makes it possible to design a software product that differs from similar products on the market.

Because the system operation scheme was considered in detail, these systems can be expanded not only externally but also internally. Several examples have been given of how the functionality of the product can be extended.

7. References

[1] Y. Nakamura, et al., Design and Evaluation of In-Situ Resource Provisioning Method for Regional IoT Services, 2018 IEEE/ACM 26th International Symposium on Quality of Service (IWQoS), 2018, pp. 1–2. doi: 10.1109/IWQoS.2018.8624127.
[2] O. Iosifova, et al., Techniques Comparison for Natural Language Processing, in Proceedings of the Modern Machine Learning Technologies and Data Science Workshop, vol. 2631, 2020, pp. 57–67.
[3] I. Iosifov, O. Iosifova, V. Sokolov, Sentence Segmentation from Unformatted Text Using Language Modeling and Sequence Labeling Approaches, in Proceedings of the 2020 IEEE International Scientific and Practical Conference Problems of Infocommunications. Science and Technology, 2020, pp. 335–337. doi: 10.1109/picst51311.2020.9468084.
[4] Z. Shi, Y. Liang, X. Wang, Analysis of Demand Side Energy IoT Communication Channel Requirements of Integrated Stations, Equipment, and Users, in 7th Asia Conference on Power and Electrical Engineering, 2022, pp. 793–797. doi: 10.1109/acpee53904.2022.9784018.
[5] V. Buriachok, V. Sokolov, P. Skladannyi, Security Rating Metrics for Distributed Wireless Systems, in Workshop of the 8th International Conference on “Mathematics. Information Technologies. Education:” Modern Machine Learning Technologies and Data Science (MoMLeT and DS), vol. 2386, 2019, pp. 222–233.
[6] A. S. Subramanian, et al., Far-Field Location Guided Target Speech Extraction Using End-to-End Speech Recognition Objectives, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020, pp. 7299–7303. doi: 10.1109/ICASSP40776.2020.9053692.
[7] C. Fan, et al., Gated Recurrent Fusion With Joint Training Framework for Robust End-to-End Speech Recognition, in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 29, 2021, pp. 198–209. doi: 10.1109/TASLP.2020.3039600.
[8] S. Subhash, et al., Artificial Intelligence-based Voice Assistant, 2020 Fourth World Conference on Smart Trends in Systems, Security and Sustainability (WorldS4), 2020, pp. 593–596. doi: 10.1109/WorldS450073.2020.9210344.
[9] S. Malodia, et al., Why Do People Use Artificial Intelligence (AI)-Enabled Voice Assistants, in IEEE Transactions on Engineering Management, 2021. doi: 10.1109/TEM.2021.3117884.
[10] T.-K. Kim, Short Research on Voice Control System Based on Artificial Intelligence Assistant, 2020 International Conference on Electronics, Information, and Communication (ICEIC), 2020, pp. 1–2. doi: 10.1109/ICEIC49074.2020.9051160.
[11] B. Popović, et al., Voice assistant application for the Serbian language, 2015 23rd Telecommunications Forum Telfor (TELFOR), 2015, pp. 858–861. doi: 10.1109/TELFOR.2015.7377600.
[12] J. Shang, J. Wu, Voice Liveness Detection for Voice Assistants using Ear Canal Pressure, 2020 IEEE 17th International Conference on Mobile Ad Hoc and Sensor Systems (MASS), 2020, pp. 693–701. doi: 10.1109/MASS50613.2020.00089.
[13] A. Asaul, et al., The Latest Information Systems in the Enterprise Management and Trends in their Development, 2019 9th International Conference on Advanced Computer Information Technologies (ACIT), 2019, pp. 409–412. doi: 10.1109/ACITT.2019.8779874.
[14] A. Vasilaras, et al., Android Device Incident Response: Viber Analysis, 2022 IEEE International Conference on Cyber Security and Resilience (CSR), 2022, pp. 138–142. doi: 10.1109/CSR54599.2022.9850300.
[15] R. Wahaz, et al., Is WhatsApp Plus Malicious? A Review Using Static Analysis, 2021 6th International Workshop on Big Data and Information Security (IWBIS), 2021, pp. 91–96. doi: 10.1109/IWBIS53353.2021.9631860.
[16] Y. Mei, et al., Turbine: Facebook’s Service Management Platform for Stream Processing, 2020 IEEE 36th International Conference on Data Engineering (ICDE), 2020, pp. 1591–1602. doi: 10.1109/ICDE48307.2020.00141.
[17] C. Huda, F. A. Bachtiar, A. A. Supianto, Reporting Sleepy Driver into Channel Telegram via Telegram Bot, 2019 International Conference on Sustainable Information Engineering and Technology (SIET), 2019, pp. 251–256. doi: 10.1109/SIET48054.2019.8986000.
[18] W. Uriawan, et al., Pearson Correlation Method and Web Scraping for Analysis of Islamic Content on Instagram Videos, 2020 6th International Conference on Wireless and Telematics (ICWT), 2020, pp. 1–6. doi: 10.1109/ICWT50448.2020.9243626.
[19] E. Zhang, S. Peng, Y. Zhai, Design and Application Development of the Camps Navigation System Based on ArcGIS Runtime SDK for Android: Taking the Yunnan Normal University as an Example, 2019 IEEE 4th Advanced Information Technology, Electronic and Automation Control Conference (IAEAC), 2019, pp. 1262–1266. doi: 10.1109/IAEAC47372.2019.8997730.
[20] P. Seda, et al., Performance testing of NoSQL and RDBMS for storing big data in e-applications, 2018 3rd International Conference on Intelligent Green Building and Smart Grid (IGBSG), 2018, pp. 1–4. doi: 10.1109/IGBSG.2018.8393559.
[21] S. Kulibaba, O. Kurchenko, Cryptographic Method of Pattern Reverse Multiplication Data Encryption, Cyber Security: Education, Science, Technology, vol. 3, iss. 15, 2022, pp. 216–223.
[22] S. Yamagiwa, R. Morita, K. Marumo, Bank Select Method for Reducing Symbol Search Operations on Stream-Based Lossless Data Compression, 2019 Data Compression Conference (DCC), 2019, pp. 611. doi: 10.1109/DCC.2019.00123.
[23] R. He, et al., Adversarial Cross-Spectral Face Completion for NIR-VIS Face Recognition, in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 42, no. 5, 2020, pp. 1025–1037. doi: 10.1109/TPAMI.2019.2961900.
[24] A. Kolodenkova, E. Khalikova, S. Vereshchagina, Data Fusion and Industrial Equipment Diagnostics Based on Information Technology, 2019 International Multi-Conference on Industrial Engineering and Modern Technologies (FarEastCon), 2019, pp. 1–5. doi: 10.1109/FarEastCon.2019.8934322.
[25] T. Deakin, et al., Performance Portability across Diverse Computer Architectures, 2019 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), 2019, pp. 1–13. doi: 10.1109/P3HPC49587.2019.00006.
[26] S. Talwani, I. Chana, Fault tolerance techniques for scientific applications in cloud, 2017 2nd International Conference on Telecommunication and Networks (TEL-NET), 2017, pp. 1–5. doi: 10.1109/TEL-NET.2017.8343578.
[27] Y. Kai, H. Qiang, M. Yixuan, Construction of Network Security Perception System Using Elman Neural Network, 2021 2nd International Conference on Computer Communication and Network Security (CCNS), 2021, pp. 187–190. doi: 10.1109/CCNS53852.2021.00042.
[28] P. Y. Tilak, et al., A platform for enhancing application developer productivity using microservices and micro-frontends, 2020 IEEE-HYDCON, 2020, pp. 1–4. doi: 10.1109/HYDCON48903.2020.9242913.
[29] IEEE Standard for Learning Technology - ECMAScript Application Programming Interface for Content to Runtime Services Communication - Redline, in IEEE Std. 1484.11.2-2020, 2021, pp. 1–60.
[30] E. Işık, M. Birim, E. Karaarslan, Chainex Decentralized Application Development & Test Workbench, 2021 15th Turkish National Software Engineering Symposium (UYMS), 2021, pp. 1–4. doi: 10.1109/UYMS54260.2021.9659637.
[31] J. Jayabalan, et al., A Study on Distributed Consensus Protocols and Algorithms: The Backbone of Blockchain Networks, 2021 International Conference on Computer Communication and Informatics (ICCCI), 2021, pp. 1–10. doi: 10.1109/ICCCI50826.2021.9402318.
[32] T. Salman, R. Jain, L. Gupta, A Reputation Management Framework for Knowledge-Based and Probabilistic Blockchains, 2019 IEEE International Conference on Blockchain, 2019, pp. 520–527. doi: 10.1109/Blockchain.2019.00078.
[33] Y. Shu, Y. J. Gu, J. Chen, Dynamic Authentication with Sensory Information for the Access Control Systems, in IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 2, 2014, pp. 427–436. doi: 10.1109/TPDS.2013.153.