<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Implementation of reinforcement learning strategies in the synthesis of neuromodels to solve medical diagnostics tasks</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Serhii</forename><surname>Leoshchenko</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">National university &quot;Zaporizhzhia polytechnic&quot;</orgName>
								<address>
									<addrLine>Zhukovskogo street 64</addrLine>
									<postCode>69063</postCode>
									<settlement>Zaporizhzhia</settlement>
									<country key="UA">Ukraine</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Andrii</forename><surname>Oliinyk</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">National university &quot;Zaporizhzhia polytechnic&quot;</orgName>
								<address>
									<addrLine>Zhukovskogo street 64</addrLine>
									<postCode>69063</postCode>
									<settlement>Zaporizhzhia</settlement>
									<country key="UA">Ukraine</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Sergey</forename><surname>Subbotin</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">National university &quot;Zaporizhzhia polytechnic&quot;</orgName>
								<address>
									<addrLine>Zhukovskogo street 64</addrLine>
									<postCode>69063</postCode>
									<settlement>Zaporizhzhia</settlement>
									<country key="UA">Ukraine</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Viktor</forename><surname>Lytvyn</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">National university &quot;Zaporizhzhia polytechnic&quot;</orgName>
								<address>
									<addrLine>Zhukovskogo street 64</addrLine>
									<postCode>69063</postCode>
									<settlement>Zaporizhzhia</settlement>
									<country key="UA">Ukraine</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Oleksandr</forename><surname>Korniienko</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">National university &quot;Zaporizhzhia polytechnic&quot;</orgName>
								<address>
									<addrLine>Zhukovskogo street 64</addrLine>
									<postCode>69063</postCode>
									<settlement>Zaporizhzhia</settlement>
									<country key="UA">Ukraine</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Implementation of reinforcement learning strategies in the synthesis of neuromodels to solve medical diagnostics tasks</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">FEBBCC27F50B58BC5D840305B2A82D9E</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-25T01:38+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>medical diagnostics</term>
					<term>neuromodel</term>
					<term>synthesis</term>
					<term>reinforcement learning</term>
					<term>penalty and reward</term>
					<term>duel</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>The high accuracy of artificial neural network (ANN) diagnostic models reported in the literature indicates that ANNs are promising for the diagnosis and forecasting of diseases in various fields of medicine. Introducing diagnostic neuromodels into clinical practice can support medical decision-making, improve the accuracy of diagnosis, and speed up the examination of the patient. ANNs can also serve as models of the subject area under consideration: by changing the input data of the neural network model and observing the behavior of the output signals, it is possible to study the subject area and to identify and investigate the medical patterns that the ANN extracted during training. However, medical tasks are steadily becoming more complicated: the nature of clinical data about the patient changes, the data are constantly updated, and both the volume of data and the number of hidden connections in the data increase. An additional challenge is the increased requirements for the adaptability and sensitivity of the neuromodel to a particular patient or disease. Reinforcement learning demonstrates good training results on incomplete data and in highly specific areas. The paper investigates the possibility of using reinforcement learning strategies for the synthesis of high-precision neuromodels for subsequent use in medical diagnostics.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>ANNs are used more widely in intelligent medical systems every year. Possible use cases include <ref type="bibr" target="#b0">[1]</ref>, <ref type="bibr" target="#b1">[2]</ref>: methods that look for deviations in MRI images, mammograms, and X-rays (before the pandemic, developers often created such programs to help doctors diagnose cancer; since the beginning of the pandemic, they have been adapted to the diagnosis of COVID-19); analysis of medical records and patient complaints (the doctor enters data about the patient into the database, such as test results and examination data, and the program suggests treatment tactics; this is how one of the best-known programs in this field, Watson from IBM, works); and control of medical staff (it is extremely important for the head of a clinic to understand whether doctors prescribe procedures and treatment correctly; the patient's medical history records the complaint on admission, the tests prescribed, and the treatment, and the method looks for anomalies and flags histories in which the treatment or the number of procedures is excessive, or lower than in similar cases).</p><p>IDDM-2021: 4th International Conference on Informatics &amp; Data-Driven Medicine, November 19-21, 2021, Valencia, Spain EMAIL: sergleo.zntu@gmail.com (S. Leoshchenko); olejnikaa@gmail.com (A. Oliinyk); subbotin@zntu.edu.ua (S. Subbotin); lytvynviktor.a@gmail.com (V. Lytvyn); al.korn95@gmail.com (O. Korniienko) ORCID: 0000-0001-5099-5518 (S. Leoshchenko); 0000-0002-6740-6078 (A. Oliinyk); 0000-0001-5814-8268 (S. Subbotin); 0000-0003-4061-4755 (V. Lytvyn); 0000-0003-4812-5382 (O. 
Korniienko)</p><p>Moreover, researchers often have to work on more specific tasks that are less common in mass practice <ref type="bibr" target="#b2">[3]</ref><ref type="bibr" target="#b3">[4]</ref><ref type="bibr" target="#b4">[5]</ref><ref type="bibr" target="#b5">[6]</ref>: methods for detecting signs of early-stage Alzheimer's disease on MRI images; a method that looks for anomalies in X-ray images; methods for the control of bedridden patients (cameras in the wards are connected to a program that can recognize a specific situation, such as a patient falling out of bed, and automatically notify the nurses); methods for monitoring the workload of operating tables (the program determines how evenly the load is distributed among medical teams in different operating rooms); and a medical reference book with artificial intelligence (the doctor enters data about the patient, and the program suggests a solution).</p><p>However, all these tasks share similar problems when ANNs are implemented. One example is obtaining data. A neuromodel designed for any task must be trained on data. To teach it to see an anomaly on an X-ray, or to determine that the anomaly is cancer and not pneumonia, it must be shown a large number of such images (thousands, hundreds of thousands, or millions). Every image must be labeled with the correct diagnosis, otherwise the program will make more mistakes <ref type="bibr" target="#b6">[7]</ref><ref type="bibr" target="#b7">[8]</ref><ref type="bibr" target="#b8">[9]</ref><ref type="bibr" target="#b9">[10]</ref>.</p><p>Thus, many researchers agree that the main difficulty for developers is the lack of homogeneous, high-quality data. A developer cannot simply come to a hospital and take medical data about patients. 
This holds even when the data are depersonalized, for example X-rays without a first and last name <ref type="bibr" target="#b6">[7]</ref><ref type="bibr" target="#b7">[8]</ref><ref type="bibr" target="#b8">[9]</ref><ref type="bibr" target="#b9">[10]</ref>. These data are protected by several laws at once: on medical secrecy, on personal data, etc. Large Western universities often provide developers with data sets so that a model can be trained, but then a problem of data compatibility arises. For example, suppose the developers received a database of postoperative X-rays, i.e. control images taken after surgery with the patient in the supine position. Screening images, however, are mostly taken with the patient standing, so a system trained on supine images cannot be applied to the analysis of screening studies: X-rays of a patient lying down and standing are two very different pictures. There are also always doubts about the reliability and accuracy of other people's data. It is difficult to train models that prompt a doctor to make a decision based on text data, because approaches to the treatment of certain diseases may differ from country to country <ref type="bibr" target="#b10">[11]</ref><ref type="bibr" target="#b11">[12]</ref><ref type="bibr" target="#b12">[13]</ref>.</p><p>Reinforcement learning is a machine learning approach in which the model being trained has no prior information about the system but can perform actions in it. Each action moves the system to a new state, and the model receives a reward from the system. Since this approach demonstrates good training results on incomplete data, such a strategy can be useful for solving medical problems.</p><p>The main goal of this work is therefore to develop a new method of neuroevolutionary synthesis of neuromodels for medical diagnostics that borrows strategies and mechanisms from reinforcement learning methods. 
This approach will eliminate most of the disadvantages of neuroevolutionary methods.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related Works</head><p>In recent years, reinforcement learning technologies have improved significantly. Whereas this approach initially demonstrated good results in game tasks, neuromodels trained with reinforcement learning methods are now actively used for pattern recognition, agent control in robotics, and decision-making in continuous tasks <ref type="bibr" target="#b13">[14]</ref><ref type="bibr" target="#b14">[15]</ref><ref type="bibr" target="#b15">[16]</ref><ref type="bibr" target="#b16">[17]</ref><ref type="bibr" target="#b17">[18]</ref><ref type="bibr" target="#b18">[19]</ref><ref type="bibr" target="#b19">[20]</ref>.</p><p>Reinforcement learning is sometimes treated not as a separate strategy but as an offshoot of supervised learning, because in the problem statement a real or virtual environment acts as the teacher. However, this classification is misleading: the environment reacts to the agent dynamically, and its reaction may be different each time. 
For example, in a maze task the agent receives information from the external environment about where there is no exit; in this way it explores the surrounding world and learns to find a way out <ref type="bibr" target="#b13">[14]</ref><ref type="bibr" target="#b14">[15]</ref><ref type="bibr" target="#b15">[16]</ref><ref type="bibr" target="#b16">[17]</ref><ref type="bibr" target="#b17">[18]</ref><ref type="bibr" target="#b18">[19]</ref><ref type="bibr" target="#b19">[20]</ref>.</p><p>Several factors have driven the rapid development of the reinforcement learning approach <ref type="bibr" target="#b13">[14]</ref><ref type="bibr" target="#b14">[15]</ref><ref type="bibr" target="#b15">[16]</ref><ref type="bibr" target="#b16">[17]</ref><ref type="bibr" target="#b17">[18]</ref><ref type="bibr" target="#b18">[19]</ref><ref type="bibr" target="#b19">[20]</ref>: increased computing speed (powerful distributed and parallel computing systems and the many lightweight threads of modern GPUs); a significant increase in the amount of data suitable for training models in open repositories (for example, ImageNet); the spread of new ANN topologies (CNN, LSTM, GRU); and the expansion and spread of computing infrastructure (Linux, TCP/IP, Git, ROS, PR2, AWS, AMT, TensorFlow, etc.). In general, it can be concluded that the main impetus for recent progress is not new ideas and methods but more intensive computing, sufficient data, and mature infrastructure. 
Despite significant practical results, the theoretical basis of these methods remains simple <ref type="bibr" target="#b15">[16]</ref><ref type="bibr" target="#b16">[17]</ref><ref type="bibr" target="#b17">[18]</ref><ref type="bibr" target="#b18">[19]</ref><ref type="bibr" target="#b19">[20]</ref>.</p><p>The most common and best-studied reinforcement learning method is Policy Gradient (PG) <ref type="bibr" target="#b20">[21]</ref><ref type="bibr" target="#b21">[22]</ref><ref type="bibr" target="#b22">[23]</ref><ref type="bibr" target="#b23">[24]</ref><ref type="bibr" target="#b24">[25]</ref><ref type="bibr" target="#b25">[26]</ref><ref type="bibr" target="#b26">[27]</ref><ref type="bibr" target="#b27">[28]</ref>. The popularity of this method is explained by its theoretically supported rules for optimizing the expected reward: a clear policy and transparent rules. In general, PG can be represented by the diagram in Fig. <ref type="figure" target="#fig_0">1</ref>. The method then consists of four basic steps <ref type="bibr" target="#b20">[21]</ref><ref type="bibr" target="#b21">[22]</ref><ref type="bibr" target="#b22">[23]</ref><ref type="bibr" target="#b23">[24]</ref><ref type="bibr" target="#b24">[25]</ref><ref type="bibr" target="#b25">[26]</ref><ref type="bibr" target="#b26">[27]</ref><ref type="bibr" target="#b27">[28]</ref> </p><formula xml:id="formula_0">\nabla_\theta J(\theta) \approx \frac{1}{N} \sum_{i=1}^{N} \left( \sum_{t=1}^{T} \nabla_\theta \log \pi_\theta(a_t^i \mid s_t^i) \right) \left( \sum_{t=1}^{T} r(s_t^i, a_t^i) \right)</formula><p>, where $J(\theta)$ is the mathematical expectation of the sum of the agent's rewards under the policy $\pi_\theta$, which is to be maximized, and</p><formula xml:id="formula_1">\nabla_\theta J(\theta)</formula><p>is the gradient of this function. 
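</p><p>The gradient estimate of $J(\theta)$ given above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the linear softmax policy, the trajectory format (a list of state, action, reward triples) and all function names are hypothetical and are not taken from the paper.</p>
<p><code lang="python">
```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D array of action scores."""
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()


def grad_log_policy(theta, state, action):
    """Gradient of log pi_theta(action | state) for a linear softmax policy.

    Assumed shapes: theta is (n_features, n_actions), state is (n_features,).
    """
    probs = softmax(state @ theta)
    one_hot = np.zeros_like(probs)
    one_hot[action] = 1.0
    # For a linear softmax policy the score is outer(state, one_hot - probs).
    return np.outer(state, one_hot - probs)


def policy_gradient_estimate(theta, trajectories):
    """Monte Carlo estimate of grad J(theta) from N sampled trajectories.

    Each trajectory is a list of (state, action, reward) triples.
    """
    total = np.zeros_like(theta)
    for trajectory in trajectories:
        # Sum of score functions along the trajectory.
        score = sum(grad_log_policy(theta, s, a) for s, a, _ in trajectory)
        # Sum of rewards along the same trajectory.
        ret = sum(r for _, _, r in trajectory)
        total += score * ret
    return total / len(trajectories)
```
</code></p>
<p>A gradient ascent step would then be theta + learning_rate * policy_gradient_estimate(theta, trajectories), where learning_rate is again a hypothetical parameter.</p><p>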
Further research on reinforcement learning methods led to the more complete and advanced Q-learning method <ref type="bibr" target="#b20">[21]</ref><ref type="bibr" target="#b21">[22]</ref><ref type="bibr" target="#b22">[23]</ref><ref type="bibr" target="#b23">[24]</ref><ref type="bibr" target="#b24">[25]</ref><ref type="bibr" target="#b25">[26]</ref><ref type="bibr" target="#b26">[27]</ref><ref type="bibr" target="#b27">[28]</ref>. Q-learning learns the values of a special table that measures how good it is to perform a certain action in a given state (the quality is a simple scalar value: the larger the value, the better the action). The values stored in the table are called "Q-values". They are estimates of the sum of future rewards: in other words, they estimate how much more reward can be obtained before the end of the game by being in state $s_i$ and performing action $a_i$. This method makes it possible to obtain more information about the environment at every step. 
This information is used to update the values in the table <ref type="bibr" target="#b20">[21]</ref><ref type="bibr" target="#b21">[22]</ref><ref type="bibr" target="#b22">[23]</ref><ref type="bibr" target="#b23">[24]</ref><ref type="bibr" target="#b24">[25]</ref><ref type="bibr" target="#b25">[26]</ref><ref type="bibr" target="#b26">[27]</ref><ref type="bibr" target="#b27">[28]</ref>.</p><p>The basic concept of Q-learning is based on the Bellman equation:</p><formula xml:id="formula_2">Q(s, a) = r + \gamma \max_{a'} Q(s', a'),<label>(1)</label></formula><p>where $Q(s, a)$ is the Q-value of performing action $a$ in state $s$; $s_i$ is the sequence of agent states; $a_i$ is the sequence of agent actions; $s'$ and $a'$ are the next state and an action available in it; $r$ is the reward received for the transition to the new state; $\gamma$ is the discount factor that devalues future rewards.</p><p>The equation states that the value of Q for a certain state-action pair is the reward received when moving to a new state (by performing this action), added to the value of the best action in the next state. To account for the hypothesis that receiving a reward right now is more valuable than receiving it in the future, a number $\gamma$ from 0 to 1 (usually from 0.9 to 0.99) is used, by which the future reward is multiplied, thus devaluing future rewards <ref type="bibr" target="#b20">[21]</ref><ref type="bibr" target="#b21">[22]</ref><ref type="bibr" target="#b22">[23]</ref><ref type="bibr" target="#b23">[24]</ref><ref type="bibr" target="#b24">[25]</ref><ref type="bibr" target="#b25">[26]</ref><ref type="bibr" target="#b26">[27]</ref><ref type="bibr" target="#b27">[28]</ref>. In deep Q-learning, the Q-table is replaced by a neural network $Q(s, a; \theta)$ with the parameter $\theta$. 
To train this network, the following sequence of loss functions is minimized at iteration $i$:</p><formula xml:id="formula_4">L_i(\theta_i) = \mathbb{E}_{s, a, r, s'} \left[ \left( y_i^{DQN} - Q(s, a; \theta_i) \right)^2 \right]; \quad y_i^{DQN} = r + \gamma \max_{a'} Q(s', a'; \theta^-)</formula><p>, where $\theta^-$ denotes the parameters of the target network, and the parameters are updated by gradient descent such that <ref type="bibr" target="#b20">[21]</ref><ref type="bibr" target="#b21">[22]</ref><ref type="bibr" target="#b22">[23]</ref><ref type="bibr" target="#b23">[24]</ref><ref type="bibr" target="#b24">[25]</ref><ref type="bibr" target="#b25">[26]</ref><ref type="bibr" target="#b26">[27]</ref><ref type="bibr" target="#b27">[28]</ref></p><formula xml:id="formula_5">\nabla_{\theta_i} L_i(\theta_i) = \mathbb{E}_{s, a, r, s'} \left[ \left( y_i^{DQN} - Q(s, a; \theta_i) \right) \nabla_{\theta_i} Q(s, a; \theta_i) \right]</formula><p>Dueling DDQN is a state-of-the-art deep Q-learning algorithm built on a dueling architecture that splits the deep Q-network into separate value and advantage streams in order to determine the value of the next state. Prioritized experience replay, i.e. sampling mini-batches of experience that have a large expected impact on learning, further increases efficiency <ref type="bibr" target="#b20">[21]</ref><ref type="bibr" target="#b21">[22]</ref><ref type="bibr" target="#b22">[23]</ref><ref type="bibr" target="#b23">[24]</ref><ref type="bibr" target="#b24">[25]</ref><ref type="bibr" target="#b25">[26]</ref><ref type="bibr" target="#b26">[27]</ref><ref type="bibr" target="#b27">[28]</ref>.</p><p>Figure <ref type="figure">3</ref>: General architecture of the Q-learning method (blocks: DNN, Flatten, FC; streams V(s) and A(s, a_1), A(s, a_2), A(s, a_3); Aggregation Layer with outputs Q(s, a_1), Q(s, a_2), Q(s, a_3))</p></div>
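<div xmlns="http://www.tei-c.org/ns/1.0"><p>The Bellman target of equation (1) and the aggregation of the value and advantage streams can be illustrated with a small NumPy sketch. The learning rate alpha, the terminal flag, the function names and the mean-subtracted aggregation (the form commonly used in dueling networks) are assumptions made for illustration and are not taken from the paper.</p>
<p><code lang="python">
```python
import numpy as np


def q_learning_update(q_table, s, a, r, s_next,
                      alpha=0.1, gamma=0.95, terminal=False):
    """One tabular Q-learning step toward r + gamma * max_a' Q(s', a')."""
    # No future reward is available after a terminal transition.
    best_next = 0.0 if terminal else np.max(q_table[s_next])
    target = r + gamma * best_next
    # Move the stored Q-value a fraction alpha toward the Bellman target.
    q_table[s, a] += alpha * (target - q_table[s, a])
    return q_table


def dueling_q_values(value, advantages):
    """Combine V(s) and A(s, a) into Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    advantages = np.asarray(advantages, dtype=float)
    return value + advantages - advantages.mean()
```
</code></p>
<p>In the tabular case q_table has one row per state and one column per action; in the deep variant the table lookup is replaced by the output of the network.</p></div>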
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Proposed method</head><p>As shown in the previous section, reinforcement learning methods are promising for solving problems that are poorly formalized, have incomplete data, or involve a dynamic environment. In this work, we propose a method based on the strategies of reinforcement learning methods <ref type="bibr" target="#b28">[29]</ref>.</p><p>The following is proposed: 1. the neuroevolutionary synthesis of neuromodels is taken as a basis, with the addition of two separate neural networks: a network for evaluating and monitoring the environment ($NN_{glob\_crit}$) and a network for duplicating the parameters of the best agent at the next step; 3. during the synthesis and modification of the structure of the individuals, which are considered as agents in the environment, all information is forwarded to the global network $NN_{glob\_crit}$, whose task is to compare the current results of the agents with the reference results of the training data and to adjust the penalty or reward for each agent; 4. at the same time, after the actions of all agents have been evaluated, the agent with the best results at the iteration is selected. The main goal of this step is to compare the results of the next iteration with the previous best ones.</p><p>This synthesis approach also assumes an additional identifier: the evaluation of the reward growth step $Q_{mark}^{lev}$ <ref type="bibr" target="#b29">[30]</ref><ref type="bibr" target="#b30">[31]</ref><ref type="bibr" target="#b31">[32]</ref><ref type="bibr" target="#b32">[33]</ref><ref type="bibr" target="#b33">[34]</ref>. Such an identifier helps to avoid areas of local extrema: if the reward growth falls below the specified value, the best agent in the population can be changed promptly.</p><p>The general progress of the method is shown in Fig. <ref type="figure">4</ref>.</p></div>
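<div xmlns="http://www.tei-c.org/ns/1.0"><p>The interaction between the population of agents and the global critic can be illustrated as follows. This is only a sketch under stated assumptions: the paper does not specify how the critic computes the reward, so the error rate against the reference labels, the comparison with the previous best agent, the tolerance tol, the threshold min_gain and all function names are hypothetical. Only the reward values -1, 0 and 1 follow the metaparameters in Table 2.</p>
<p><code lang="python">
```python
import numpy as np


def error_rate(predictions, reference):
    """Fraction of predictions that differ from the reference labels."""
    predictions = np.asarray(predictions)
    reference = np.asarray(reference)
    return float(np.mean(predictions != reference))


def critic_reward(agent_error, best_prev_error, tol=1e-9):
    """Reward in {-1, 0, 1} relative to the previous best agent (assumed rule)."""
    diff = best_prev_error - agent_error
    if diff > tol:
        return 1      # the agent beats the previous best result
    if -diff > tol:
        return -1     # the agent is worse than the previous best result
    return 0          # no measurable difference


def critic_step(agent_predictions, reference, best_prev_error):
    """Score every agent; return (rewards, index of best agent, its error)."""
    errors = [error_rate(p, reference) for p in agent_predictions]
    rewards = [critic_reward(e, best_prev_error) for e in errors]
    best = int(np.argmin(errors))
    return rewards, best, errors[best]


def reward_growth_stalled(best_errors, min_gain):
    """True when the best error improved by less than min_gain in the last step."""
    if 2 > len(best_errors):
        return False
    return min_gain > best_errors[-2] - best_errors[-1]
```
</code></p>
<p>When reward_growth_stalled returns True, a surrounding evolutionary loop could replace the current best agent, which corresponds to the role the text assigns to the reward growth step identifier.</p></div>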
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Results and Discussions</head><p>A data set based on the characteristics of patients with pneumonia, recently presented by M.-A. Kim, J. Seok Park, C. W. Lee, and W.-I. Choi <ref type="bibr" target="#b34">[35]</ref>, was selected for testing. The total sample size is 77490 values. Table <ref type="table">1</ref> shows the characteristics of the data set.</p><p>For this task, neuromodels will make it much easier to determine a person's diagnosis after data on their well-being have been collected. Given that pneumonia is one of the most important signs and complications of COVID-19 <ref type="bibr" target="#b35">[36]</ref>, <ref type="bibr" target="#b36">[37]</ref>, after additional training on advanced data this model can be used to diagnose patients or to predict the further dynamics of the disease. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Number of instances: 1435</head><p>The proposed reinforcement learning method (RLNE) is compared with the modified genetic algorithm neuroevolution method (MGA), which is used to synthesize RNN and DNN models <ref type="bibr" target="#b13">[14]</ref>, <ref type="bibr" target="#b14">[15]</ref>, <ref type="bibr" target="#b33">[34]</ref>. The compared methods use the metaparameters listed in Table <ref type="table" target="#tab_2">2</ref>. The types of mutation are: deleting a connection between neurons; removing a neuron (hidden layer); adding a connection between neurons; adding a neuron (hidden layer); changing the type of activation function.</p><p>The test results for the synthesis task are shown in Table <ref type="table" target="#tab_3">3</ref>. The proposed method achieved a shorter synthesis time than MGA for DNN synthesis (7352 s versus 8031 s). This is because the topologically synthesized neuromodels were simpler and their modification required less effort. However, the method is slower than MGA for RNN synthesis (7352 s versus 7173 s). A possible explanation is that during RNN synthesis there was no need to clone the best individuals to compare the results, since recurrent connections make this process easier.</p><p>Another important characteristic is the accuracy of the synthesized solutions. The solutions obtained by RLNE were more accurate than those of MGA RNN on both the training data (0.021 versus 0.03) and the test data (0.147 versus 0.157), although the difference is not large. The results of MGA DNN were even better (0.019 and 0.134). It is likely that deep networks encode hidden connections between data more accurately.</p><p>The second stage of the study focused on resource consumption during the synthesis of solutions. Special attention was paid to measuring the load on the CPU and RAM <ref type="bibr">[38]</ref>. 
Such monitoring makes it possible to determine the load distribution at different iterations of the method more accurately. The CPU and RAM load graphs are shown in Fig. <ref type="figure">5 and 6</ref>, respectively. With MGA, in both cases the load on the CPU and RAM changed more abruptly but did not exceed 81-82% on average. With RLNE, the load was distributed more evenly but often reached 100%. These indicators are important when designing a parallel implementation of the synthesis methods: the relatively low load makes it possible to implement MGA on high-performance GPUs, whereas the high resource consumption of RLNE limits this possibility.</p></div>
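<div xmlns="http://www.tei-c.org/ns/1.0"><p>The relative differences between the methods follow directly from the values in Table 3. A minimal Python check is given below; the names relative_reduction and results are hypothetical, and the choice of baselines (MGA DNN for the synthesis time, MGA RNN for the test error) is an assumption inferred from the numbers quoted in the text.</p>
<p><code lang="python">
```python
def relative_reduction(baseline, value):
    """Relative reduction of value with respect to baseline, in percent."""
    return 100.0 * (baseline - value) / baseline


# Values copied from Table 3 (synthesis time in seconds, errors as fractions).
results = {
    "RLNE": {"time_s": 7352, "train_error": 0.021, "test_error": 0.147},
    "MGA RNN": {"time_s": 7173, "train_error": 0.03, "test_error": 0.157},
    "MGA DNN": {"time_s": 8031, "train_error": 0.019, "test_error": 0.134},
}

# Synthesis time of RLNE relative to MGA DNN.
time_vs_dnn = relative_reduction(results["MGA DNN"]["time_s"],
                                 results["RLNE"]["time_s"])
# Test error of RLNE relative to MGA RNN.
error_vs_rnn = relative_reduction(results["MGA RNN"]["test_error"],
                                  results["RLNE"]["test_error"])

print(round(time_vs_dnn, 1), round(error_vs_rnn, 1))
```
</code></p>
<p>The two values are approximately 8.5% and 6.4%. The same function applied to the synthesis time of MGA RNN gives a negative value, which reflects that RLNE was slower than MGA for RNN synthesis.</p></div>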
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Conclusion</head><p>The proposed strategies and method demonstrated an acceptable level of performance. The test error of the resulting solution was reduced by 6.4% relative to MGA RNN (from 0.157 to 0.147). The synthesis time was also reduced, by 8.5% relative to MGA DNN (from 8031 s to 7352 s). However, the high level of resource consumption limits the parallelization of the method, which in turn can significantly limit the genetic diversity of individuals. In the future, the main strategies of the proposed method can be incorporated into parallel implementations of neuroevolutionary methods for the intelligent maintenance and control of populations of solutions.</p><p>Another important direction for further research is to simplify the proposed strategy by removing the clone of the best result at each iteration and using individual agents with recurrent connections instead, while tightening the control of the import of the barrier from the external global critic network. This approach would also allow the critic network to focus on the external data of the environment.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: General scheme of the PG method (diagram labels: Fit; Generate samples, i.e. run the policy). Steps of the method: [...] step T at which the transition to the terminal state occurred; 3. if the result does not reach the extremum, repeat from point 1.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: General scheme of the Q-learning method</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head></head><label></label><figDesc>rest of the population will be a set of different individual agents (</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 5 :Figure 6 :</head><label>56</label><figDesc>Figure 5: Load on the CPU during synthesis. Figure 6: Load on the RAM during synthesis</figDesc><graphic coords="7,72.00,428.75,453.25,144.75" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 2</head><label>2</label><figDesc></figDesc><table><row><cell>Metaparameters for methods</cell><cell></cell></row><row><cell>Metaparameter of the method</cell><cell>Characteristics of the metaparameter</cell></row><row><cell>Number of individuals in the population (size)</cell><cell>100</cell></row><row><cell>Elite size of population</cell><cell>5%</cell></row><row><cell>Neuron activation function</cell><cell>hyperbolic tangent</cell></row><row><cell>Probability of the mutation (for the MGA)</cell><cell>25%</cell></row><row><cell>Type of the crossover</cell><cell>two-point</cell></row><row><cell>Reward</cell><cell>[-1;0;1]</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 3</head><label>3</label><figDesc></figDesc><table><row><cell>General results on the data set</cell><cell></cell><cell></cell><cell></cell></row><row><cell>Method of synthesis</cell><cell>Synthesis time, s</cell><cell>Error on the training sample</cell><cell>Error on the test sample</cell></row><row><cell>RLNE</cell><cell>7352</cell><cell>0.021</cell><cell>0.147</cell></row><row><cell>MGA RNN</cell><cell>7173</cell><cell>0.03</cell><cell>0.157</cell></row><row><cell>MGA DNN</cell><cell>8031</cell><cell>0.019</cell><cell>0.134</cell></row></table></figure>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Acknowledgements</head><p>The work was carried out with the support of the state budget research projects of the National University "Zaporizhzhia Polytechnic": "Intelligent methods and software for diagnostics and non-destructive quality control of military and civilian applications" (state registration number 0119U100360) and "Development of methods and tools for analysis and prediction of dynamic behavior of nonlinear objects" (state registration number 0121U107499).</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">An optimized artificial neural network model for the prediction of rate of hazardous chemical and healthcare waste generation at the national level</title>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">M</forename><surname>Adamović</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">Z</forename><surname>Antanasijević</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">Đ</forename><surname>Ristić</surname></persName>
		</author>
		<idno type="DOI">10.1007/s10163-018-0741-6</idno>
	</analytic>
	<monogr>
		<title level="j">J Mater Cycles Waste Manag</title>
		<imprint>
			<biblScope unit="volume">20</biblScope>
			<biblScope unit="page" from="1736" to="1750" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Artificial intelligence in healthcare operations to enhance treatment outcomes: a framework to predict lung cancer prognosis</title>
		<author>
			<persName><forename type="first">M</forename><surname>Johnson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Albizri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Simsek</surname></persName>
		</author>
		<idno type="DOI">10.1007/s10479-020-03872-6</idno>
	</analytic>
	<monogr>
		<title level="j">Ann Oper Res</title>
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Resource efficient for hybrid fiber-wireless communications links in access networks with multi response optimization algorithm</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">W Y</forename><surname>Khang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">A J</forename><surname>Alsayaydeh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">M</forename><surname>Idrus</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">A B M</forename><surname>Gani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">A</forename><surname>Indra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">B</forename><surname>Pusppanathan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ARPN Journal of Engineering and Applied Sciences</title>
		<imprint>
			<biblScope unit="volume">16</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="45" to="50" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<title level="m" type="main">Development of programmable home security using GSM system for early prevention</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">A J</forename><surname>Alsayaydeh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Aziz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">I A</forename><surname>Rahman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">N S</forename><surname>Salim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Zainon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><forename type="middle">A</forename><surname>Baharudin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">I</forename><surname>Abbasi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">W Y</forename><surname>Khang</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="volume">16</biblScope>
			<biblScope unit="page" from="88" to="97" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Development of vehicle door security using smart tag and fingerprint system</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">A J</forename><surname>Alsayaydeh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">W Y</forename><surname>Khang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">A</forename><surname>Indra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">B</forename><surname>Pusppanathan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Shkarupylo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">K M</forename><surname>Zakir Hossain</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Saravanan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ARPN Journal of Engineering and Applied Sciences</title>
		<imprint>
			<biblScope unit="volume">9</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="3108" to="3114" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Development of smart dustbin by using apps</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">A J</forename><surname>Alsayaydeh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">W Y</forename><surname>Khang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">A</forename><surname>Indra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Shkarupylo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Jayasundar</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ARPN Journal of Engineering and Applied Sciences</title>
		<imprint>
			<biblScope unit="volume">14</biblScope>
			<biblScope unit="issue">21</biblScope>
			<biblScope unit="page" from="3703" to="3711" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Risk management for nuclear medical department using reinforcement learning algorithms</title>
		<author>
			<persName><forename type="first">G</forename><surname>Paragliola</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Naeem</surname></persName>
		</author>
		<idno type="DOI">10.1007/s40860-019-00084-z</idno>
	</analytic>
	<monogr>
		<title level="j">J Reliable Intell Environ</title>
		<imprint>
			<biblScope unit="volume">5</biblScope>
			<biblScope unit="page" from="105" to="113" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Multi-step medical image segmentation based on reinforcement learning</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Tian</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Si</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Zheng</surname></persName>
		</author>
		<idno type="DOI">10.1007/s12652-020-01905-3</idno>
	</analytic>
	<monogr>
		<title level="j">J Ambient Intell Human Comput</title>
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Reinforcement learning for medical information processing over heterogeneous networks</title>
		<author>
			<persName><forename type="first">A</forename><surname>Kishor</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Chakraborty</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Jeberson</surname></persName>
		</author>
		<idno type="DOI">10.1007/s11042-021-10840-0</idno>
	</analytic>
	<monogr>
		<title level="j">Multimed Tools Appl</title>
		<imprint>
			<biblScope unit="volume">80</biblScope>
			<biblScope unit="page" from="23983" to="24004" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Software Agent with Reinforcement Learning Approach for Medical Image Segmentation</title>
		<author>
			<persName><forename type="first">M</forename><surname>Chitsaz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">Seng</forename><surname>Woo</surname></persName>
		</author>
		<idno type="DOI">10.1007/s11390-011-9431-8</idno>
	</analytic>
	<monogr>
		<title level="j">J. Comput. Sci. Technol</title>
		<imprint>
			<biblScope unit="volume">26</biblScope>
			<biblScope unit="page" from="247" to="255" />
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">An approach towards missing data management using improved GRNN-SGTM ensemble method</title>
		<author>
			<persName><forename type="first">I</forename><surname>Izonin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Tkachenko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Verhun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Zub</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.jestch.2020.10.005</idno>
	</analytic>
	<monogr>
		<title level="j">Engineering Science and Technology, an International Journal</title>
		<imprint>
			<biblScope unit="volume">24</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="749" to="759" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Predictive modeling based on small data in clinical medicine: RBF-based additive input-doubling method</title>
		<author>
			<persName><forename type="first">I</forename><surname>Izonin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Tkachenko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Dronyuk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Tkachenko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Gregus</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Rashkevych</surname></persName>
		</author>
		<idno type="DOI">10.3934/mbe.2021132</idno>
		<idno type="PMID">33892562</idno>
	</analytic>
	<monogr>
		<title level="j">Math Biosci Eng</title>
		<imprint>
			<biblScope unit="volume">18</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="2599" to="2613" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Neuro-Fuzzy Diagnostics Systems Based on SGTM Neural-Like Structure and T-Controller</title>
		<author>
			<persName><forename type="first">R</forename><surname>Tkachenko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Izonin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Tkachenko</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-030-82014-5_47</idno>
	</analytic>
	<monogr>
		<title level="m">Lecture Notes in Computational Intelligence and Decision Making. ISDMCI 2021</title>
		<title level="s">Lecture Notes on Data Engineering and Communications Technologies</title>
		<editor>
			<persName><forename type="first">S</forename><surname>Babichev</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">V</forename><surname>Lytvynenko</surname></persName>
		</editor>
		<meeting><address><addrLine>Cham</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2022">2022</date>
			<biblScope unit="volume">77</biblScope>
			<biblScope unit="page" from="685" to="695" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Ensemble application of bidirectional LSTM and GRU for aspect category detection with imbalanced data</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">A</forename><surname>Kumar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Abirami</surname></persName>
		</author>
		<idno type="DOI">10.1007/s00521-021-06100-9</idno>
	</analytic>
	<monogr>
		<title level="j">Neural Comput &amp; Applic</title>
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">RNN-LSTM-GRU based language transformation</title>
		<author>
			<persName><forename type="first">A</forename><surname>Khan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Sarfaraz</surname></persName>
		</author>
		<idno type="DOI">10.1007/s00500-019-04281-z</idno>
	</analytic>
	<monogr>
		<title level="j">Soft Comput</title>
		<imprint>
			<biblScope unit="volume">23</biblScope>
			<biblScope unit="page" from="13007" to="13024" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Is Deep Reinforcement Learning Ready for Practical Applications in Healthcare? A Sensitivity Analysis of Duel-DDQN for Hemodynamic Management in Sepsis Patients</title>
		<author>
			<persName><forename type="first">M</forename><surname>Lu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Shahn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Sow</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Doshi-Velez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">H</forename><surname>Lehman</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the AMIA Annual Symposium</title>
				<meeting>the AMIA Annual Symposium<address><addrLine>Rockville</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="773" to="782" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Deep Recurrent Q-Learning for Partially Observable MDPs</title>
		<author>
			<persName><forename type="first">M</forename><surname>Hausknecht</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Stone</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">AAAI Fall Symposia</title>
				<imprint>
			<date type="published" when="2015">2015</date>
			<biblScope unit="page" from="1" to="7" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Continuous control with deep reinforcement learning</title>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">P</forename><surname>Lillicrap</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">J</forename><surname>Hunt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Pritzel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">M</forename><surname>Heess</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Erez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Tassa</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Silver</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Wierstra</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">CoRR</title>
		<imprint>
			<biblScope unit="page" from="1" to="14" />
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Learning the Dynamic Treatment Regimes from Medical Registry Data through Deep Q-network</title>
		<author>
			<persName><forename type="first">N</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Logan</surname></persName>
		</author>
		<idno type="DOI">10.1038/s41598-018-37142-0</idno>
	</analytic>
	<monogr>
		<title level="j">Sci Rep</title>
		<imprint>
			<biblScope unit="volume">9</biblScope>
			<biblScope unit="page">1495</biblScope>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Dueling Deep-Q-Network Based Delay-Aware Cache Update Policy for Mobile Users in Fog Radio Access Networks</title>
		<author>
			<persName><forename type="first">B</forename><surname>Guo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Sheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Yang</surname></persName>
		</author>
		<idno type="DOI">10.1109/ACCESS.2020.2964258</idno>
	</analytic>
	<monogr>
		<title level="j">IEEE Access</title>
		<imprint>
			<biblScope unit="volume">8</biblScope>
			<biblScope unit="page" from="7131" to="7141" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<monogr>
		<title level="m" type="main">A pair of interrelated neural networks in Deep Q-Network</title>
		<ptr target="https://towardsdatascience.com/a-pair-of-interrelated-neural-networks-in-dqn-f0f58e09b3c4" />
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<monogr>
		<title level="m" type="main">Mixing policy gradient and Q-learning</title>
		<author>
			<persName><forename type="first">G</forename><surname>Delétang</surname></persName>
		</author>
		<ptr target="https://towardsdatascience.com/mixing-policy-gradient-and-q-learning-5819d9c69074" />
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<monogr>
		<title level="m" type="main">Deep Reinforcement Learning: Pong from Pixels</title>
		<author>
			<persName><surname>Karpathy</surname></persName>
		</author>
		<ptr target="http://karpathy.github.io/2016/05/31/rl/" />
		<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<monogr>
		<title level="m" type="main">Catch me if you can: A simple english explanation of GANs or Dueling neural-nets</title>
		<author>
			<persName><forename type="first">G</forename><surname>Kesari</surname></persName>
		</author>
		<ptr target="https://towardsdatascience.com/catch-me-if-you-can-a-simple-english-explanation-of-gans-or-dueling-neural-nets-319a273434db" />
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Dueling Deep Q Networks</title>
		<author>
			<persName><forename type="first">C</forename><surname>Yoon</surname></persName>
		</author>
		<ptr target="https://towardsdatascience.com/dueling-deep-q-networks-81ffab672751" />
	</analytic>
	<monogr>
		<title level="m">Dueling Network Architectures for Deep Reinforcement Learning</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<monogr>
		<ptr target="https://www.freecodecamp.org/news/improvements-in-deep-q-learning-dueling-double-dqn-prioritized-experience-replay-and-fixed-58b130cc5682/" />
		<title level="m">Improvements in Deep Q Learning: Dueling Double DQN, Prioritized Experience Replay, and fixed</title>
				<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<monogr>
		<title level="m" type="main">Self Learning AI-Agents Part II: Deep Q-Learning</title>
		<author>
			<persName><surname>Oppermann</surname></persName>
		</author>
		<ptr target="https://towardsdatascience.com/self-learning-ai-agents-part-ii-deep-q-learning-b5ac60c3f47" />
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<monogr>
		<title level="m" type="main">Understanding Q-Learning, the Cliff Walking problem</title>
		<author>
			<persName><forename type="first">L</forename><surname>Vazquez</surname></persName>
		</author>
		<ptr target="https://medium.com/@lgvaz/understanding-q-learning-the-cliff-walking-problem-80198921abbc" />
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b28">
	<analytic>
		<title level="a" type="main">Using the Actor-Critic method for population diversity in neuroevolutionary synthesis</title>
		<author>
			<persName><forename type="first">S</forename><surname>Leoshchenko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Oliinyk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Subbotin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Shkarupylo</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2nd International Workshop on Intelligent Information Technologies and Systems of Information Security (IntelITSIS&apos;2021)</title>
				<meeting>the 2nd International Workshop on Intelligent Information Technologies and Systems of Information Security (IntelITSIS&apos;2021)</meeting>
		<imprint>
			<publisher>CEUR-WS</publisher>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="99" to="107" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b29">
	<analytic>
		<title level="a" type="main">Combinatorial optimization problems solving based on evolutionary approach</title>
		<author>
			<persName><forename type="first">A</forename><surname>Oliinyk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Fedorchenko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Stepanenko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Rud</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Goncharenko</surname></persName>
		</author>
		<idno type="DOI">10.1109/CADSM.2019.8779290</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 15th International Conference on the Experience of Designing and Application of CAD Systems</title>
				<meeting>the 15th International Conference on the Experience of Designing and Application of CAD Systems<address><addrLine>Lviv, Ukraine</addrLine></address></meeting>
		<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="41" to="45" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b30">
	<analytic>
		<title level="a" type="main">Development of genetic methods for predicting the incidence of volumes of emissions of pollutants in air</title>
		<author>
			<persName><forename type="first">A</forename><surname>Oliinyk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Fedorchenko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Stepanenko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Katschan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Fedorchenko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Kharchenko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Goncharenko</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2nd International Workshop on Informatics &amp; Data-Driven Medicine, IDDM 2019</title>
		<title level="s">CEUR-WS</title>
		<meeting>the 2nd International Workshop on Informatics &amp; Data-Driven Medicine, IDDM 2019</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="340" to="353" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b31">
	<analytic>
		<title level="a" type="main">Using modern architectures of recurrent neural networks for technical diagnosis of complex systems</title>
		<author>
			<persName><forename type="first">S</forename><surname>Leoshchenko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Oliinyk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Subbotin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Zaiko</surname></persName>
		</author>
		<idno type="DOI">10.1109/INFOCOMMST.2018.8632015</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 5th International Scientific-Practical Conference Problems of Infocommunications. Science and Technology (PICST2018)</title>
				<meeting>the 5th International Scientific-Practical Conference Problems of Infocommunications. Science and Technology (PICST2018)<address><addrLine>Kharkiv, Ukraine</addrLine></address></meeting>
		<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="411" to="416" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b32">
	<analytic>
		<title level="a" type="main">Synthesis of artificial neural networks using a modified genetic algorithm</title>
		<author>
			<persName><forename type="first">S</forename><surname>Leoshchenko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Oliinyk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Subbotin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Gorobii</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Zaiko</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 1st International Workshop on Informatics &amp; Data-Driven Medicine, IDDM 2018</title>
				<meeting>the 1st International Workshop on Informatics &amp; Data-Driven Medicine, IDDM 2018</meeting>
		<imprint>
			<publisher>CEUR-WS</publisher>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="1" to="13" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b33">
	<analytic>
		<title level="a" type="main">Using Recurrent Neural Networks for Data-Centric Business</title>
		<author>
			<persName><forename type="first">S</forename><surname>Leoshchenko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Oliinyk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Subbotin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Zaiko</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-030-35649-1_4</idno>
	</analytic>
	<monogr>
		<title level="m">Data-Centric Business and Applications</title>
		<title level="s">Lecture Notes on Data Engineering and Communications Technologies</title>
		<editor>
			<persName><forename type="first">D</forename><surname>Ageyev</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">T</forename><surname>Radivilova</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Kryvinska</surname></persName>
		</editor>
		<meeting><address><addrLine>Cham</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<biblScope unit="volume">42</biblScope>
			<biblScope unit="page" from="73" to="91" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b34">
	<analytic>
		<title level="a" type="main">Pneumonia severity index in viral community acquired pneumonia in adults</title>
		<author>
			<persName><forename type="first">M.-A</forename><surname>Kim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Seok</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">W</forename><surname>Park</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W.-I</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><surname>Choi</surname></persName>
		</author>
		<idno type="DOI">10.1371/journal.pone.0210102</idno>
	</analytic>
	<monogr>
		<title level="j">PLoS One</title>
		<imprint>
			<biblScope unit="volume">14</biblScope>
			<biblScope unit="issue">3</biblScope>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b35">
	<analytic>
		<title level="a" type="main">COVID-19 interstitial pneumonia: monitoring the clinical course in survivors</title>
		<author>
			<persName><forename type="first">G</forename><surname>Raghu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Wilson</surname></persName>
		</author>
		<idno type="DOI">10.1016/S2213-2600(20)30349-0</idno>
	</analytic>
	<monogr>
		<title level="j">The Lancet Respiratory Medicine</title>
		<imprint>
			<biblScope unit="volume">8</biblScope>
			<biblScope unit="issue">9</biblScope>
			<biblScope unit="page" from="839" to="842" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b36">
	<analytic>
		<title level="a" type="main">COVID-19 pneumonia: A review of typical CT findings and differential diagnosis</title>
		<author>
			<persName><forename type="first">C</forename><surname>Hani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">H</forename><surname>Trieu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Saab</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Dangeard</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Bennani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Chassagnon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M.-P</forename><surname>Revel</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.diii.2020.03.014</idno>
	</analytic>
	<monogr>
		<title level="j">Diagnostic and Interventional Imaging</title>
		<imprint>
			<biblScope unit="volume">101</biblScope>
			<biblScope unit="issue">5</biblScope>
			<biblScope unit="page" from="263" to="268" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
