<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Transfer training tools and methods for diagnostic tasks</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Serhii Leoshchenko</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sergey Subbotin</string-name>
          <email>subbotin.csit@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Artem Borovikov</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yevgeny Gofman</string-name>
          <email>gofmanjenek@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>National University “Zaporizhzhia Polytechnic”</institution>
          ,
          <addr-line>Zhukovskogo street 64, Zaporizhzhia, 69011</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Research on implementing transfer learning in medical diagnostics is important because it makes it possible to evaluate how well already trained neural networks adapt to specific medical data. This helps to understand which architectures work best, how to improve diagnostic accuracy, and how to reduce the risk of false positives. In addition, such research contributes to the development of more reliable and interpretable models, which is critical for physician confidence and for implementing AI in real-world clinical practice.</p>
      </abstract>
      <kwd-group>
        <kwd>Machine learning</kwd>
        <kwd>transfer training</kwd>
        <kwd>medical diagnosis</kwd>
        <kwd>artificial neural network</kwd>
        <kwd>accuracy</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Transfer machine learning (TML) is useful for training analytical diagnostic models as a basis for medical diagnostics because it reuses already pre-trained models (models that have completed the parametric synthesis stage) on new, similar tasks, reducing the need for large amounts of data. Even in modern medicine there is often a lack of large annotated data sets – typically for rare diseases, for viral (less often bacterial) infections that have undergone seasonal or qualitative mutation, or for diseases at the start of an epidemic (as was the case with COVID-19) [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]–[
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. That is why the ability to adapt knowledge from other domains or similar tasks is very valuable. For example, models trained on a large general set of medical images can be further trained on smaller, diagnosis-specific data sets, which improves accuracy and reduces development time. This is especially important in radiology, where the analysis of CT, MRI, or X-ray images can be improved using models that have already learned to recognize common pathologies. It also reduces the risk of overfitting, since the basic characteristics of images or signals (for example, tissue features or anomaly patterns) have already been learned by the model. Unlike the neuroevolution approach, which usually requires a large data set to synthesize a more universal model, the principle of TML is to adapt an existing model to a specific, narrower task. In addition, TML can contribute to better generalization, allowing models to work on different patient populations even if these differ in demographic or technical parameters [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>
        TML shows good results precisely with deep neural networks (DNNs) because of their ability to automatically extract and generalize complex multi-level data features. DNNs consist of many layers, most of them hidden, each of which learns to recognize certain patterns – from the simplest (edges, textures, normal indicators) in the lower layers to more complex ones (shapes, objects, spikes, pathologies) in the higher layers. This makes it possible to reuse already trained layers without training from scratch, which is crucial for tasks where access to large amounts of annotated medical data is limited [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Moreover, retraining or completely resynthesizing a DNN after receiving only a small amount of new data can be extremely complex and resource-intensive for an insufficiently optimized computing system [<xref ref-type="bibr" rid="ref2">2</xref>].
      </p>
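      <p>To make the layer-reuse idea concrete, here is a minimal, framework-agnostic sketch in Python (an illustration, not part of the study's code). A "model" is represented simply as an ordered list of layer names, and freezing a layer means marking it non-trainable; in PyTorch this would correspond to setting requires_grad to False on the layer's parameters, in Keras to layer.trainable = False. The layer names are purely hypothetical.</p>

```python
def freeze_early_layers(layers, n_trainable):
    """Mark all but the last n_trainable layers as frozen (reused as-is)."""
    cut = len(layers) - n_trainable
    return [{"name": name, "trainable": i >= cut}
            for i, name in enumerate(layers)]

# A pre-trained network: early layers detect edges and textures,
# later layers detect task-specific patterns, so only the tail is retrained.
pretrained = ["conv1", "conv2", "conv3", "conv4", "classifier"]
adapted = freeze_early_layers(pretrained, n_trainable=2)

for layer in adapted:
    print(layer["name"], "trainable" if layer["trainable"] else "frozen")
```

      <p>Only the last two layers remain trainable; the generic feature extractors are kept as learned on the large source data set.</p>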
      <p>This is particularly effective in areas such as medical image analysis (CT, MRI, X-ray), where the first layers of a DNN trained on large shared datasets (such as ImageNet) can be used to recognize basic visual patterns, while only the last few layers are adapted to a specific task. This significantly reduces the need for computing resources and training time. In addition, this strategy helps to avoid overfitting on small sets of medical data, since the initial layers already contain generalized characteristics that transfer well between similar tasks; the dataset may contain updated data either on individual patients or on a specific pathology that requires confirmation or refutation.</p>
      <p>0000-0001-5099-5518 (S. Leoshchenko); 0000-0001-5814-8268 (S. Subbotin); 0000-0003-1429-5930 (A. Borovikov); 0009-0001-2885-4185 (Y. Gofman). © 2025 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</p>
      <p>
        Another reason for this effectiveness is the ability of DNNs to work with nonlinear and complex relationships in data, which is important in medical diagnostic tasks, where pathologies can have complex and variable manifestations. With TML, high accuracy can be achieved even with relatively small data sets, making this approach practical and effective in real-world medical applications [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]–[
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>
        However, which DNN topologies to choose, which methods can help better train a structurally synthesized model, how to adjust the metaparameters of those methods, and, ultimately, whether such an approach is really well suited for medical diagnostics are the questions considered in this paper [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]–[
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Works</title>
      <p>
        Automation of medical diagnostics is the use of technologies, in particular data processing algorithms, machine learning, and artificial intelligence, to partially or completely perform the process of detecting diseases and making diagnoses. This may include analyzing medical images (MRI, CT, X-rays), interpreting laboratory tests, recognizing symptoms from electronic medical records, and even predicting disease risks [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]–[
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <p>
        It should be noted that work on automating decision-making in medical diagnostics has been underway for quite a long time, driven by a number of important current needs:
 speeding up the diagnostic process: automated systems can significantly speed up the analysis of a patient's set of clinical indicators, which is critical in acute conditions (for example, stroke or heart attack), especially if online transmission of clinical indicators to the general system (for example, immediately after the actual analysis) is correctly configured [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]–[
        <xref ref-type="bibr" rid="ref7">7</xref>
        ];
 improved accuracy: artificial intelligence (AI) techniques can detect patterns and non-obvious connections that a person may miss, even in an ultra-large data stream, reducing the risk of false or missed diagnoses;
 reduced burden on doctors: automation reduces the share of routine work, giving doctors more time for complex clinical cases and communication with patients;
 increased access to health care: when qualified specialists are scarce in some regions, or when medical care must reach dangerous, restricted locations, automated systems can help compensate for this shortage by providing high-quality preliminary diagnostics and signaling the real need to involve qualified specialists in extraordinary cases [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ];
 standardization of diagnostic decisions: reducing the influence of the human factor minimizes the variability in diagnosis between different doctors.
      </p>
      <p>
        Therefore, it is necessary to clearly distinguish the role of AI in such automation processes. AI is a key component of automation, since it is able to:
 process and analyze large amounts of medical data: images, tests, medical histories;
 recognize complex patterns and correlations that are difficult to detect even for experienced doctors [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]–[
        <xref ref-type="bibr" rid="ref8">8</xref>
        ];
 learn from previous cases, constantly improving the accuracy of predictions and diagnoses;
 perform routine tasks, such as sorting cases by risk level or automatically collecting patient data.
      </p>
      <p>In general, AI does not replace doctors, but acts as a tool that enhances their capabilities, helping
them make informed decisions and improve the quality of medical services.</p>
      <p>The TML approach has a number of key advantages over classical machine learning methods, which is especially important in medical diagnostics. First, TML has less need for large data sets: classical ML models require a large amount of annotated medical data to learn from scratch. Since collecting and labeling such data in medicine is complex and resource-intensive, transfer training allows already trained models to be reused and adapted to a specific task.</p>
      <p>
        Second, it adapts faster to new tasks. Learning from scratch (especially for DNNs) requires a lot of time and computing resources. TML shortens this process by reusing the basic characteristics that have already been learned [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <p>In addition, the resulting neuromodels generalize better: DNNs trained on large shared data sets already contain knowledge of common features of images or signals, which makes them more resistant to changes in data than models trained only on specific medical sets.</p>
      <p>A derived advantage of using TML is a reduced risk of overfitting: in classical approaches, when training on small medical datasets, the model can memorize the features of a specific set (a specific group of patients, a specific disease, or even a specific patient) rather than learn general patterns. Thanks to transfer training, the basic levels of the network already contain generalized knowledge, which makes the adapted model more robust to variations in medical data. As already noted, learning a DNN from scratch requires powerful hardware; TML reduces the need for long-term training and allows high accuracy to be achieved even on less powerful systems, thereby increasing the efficiency of computing resources. The TML approach is also versatile: the same approach can be used for various medical tasks, such as analysis of X-rays, CT, MRI, electrocardiograms, or histological images.</p>
      <p>Thus, TML is significantly more efficient than classical methods, as it adapts existing models to medical diagnostic tasks faster, more accurately, and at a lower resource cost.</p>
      <p>To demonstrate this clearly, Table 1 compares different ML approaches for our problem of automating medical diagnostics.</p>
      <p>Transfer training is more effective for medical diagnostics when access to large amounts of data is
limited and deployment speed is critical. Classical machine learning is useful when it is possible to
build a large, high-quality dataset and train the model for a specific task.</p>
      <p>
        To date, a number of independent and professional studies [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]–[
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] have already been conducted on introducing TML technologies into medical diagnostics. TML involves using the knowledge gained by a model when solving one problem to improve results on another, often similar, problem. This approach is particularly useful in medicine, where the amount of data available for training models is limited [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>After analyzing a set of studies, we can summarize the general advantages of using TML in medicine that the researchers identified:
 resource savings: models pre-trained on large shared data sets can be adapted to specific medical tasks with less time and data;
 improved accuracy: adapting models to medical data can lead to higher diagnostic accuracy, even with a limited amount of specific medical data.
However, the general disadvantages and risks associated with implementing TML in medical diagnostics should also be noted:
 risk of transferring inappropriate characteristics: if the baseline model was trained on data that differs significantly from medical data, inappropriate or undesirable characteristics may be transferred, worsening the quality of diagnosis;
 interpretation problems: machine learning models, including those that use TML, can be black boxes, making it difficult to understand the reasons for certain decisions, which is crucial in medical practice;
 need for thorough validation: adapted models must be carefully tested on medical data to ensure their reliability and accuracy before implementing them in clinical practice.</p>
      <p>Overall, while TML offers significant benefits for medical diagnosis, it is important to consider
potential risks and limitations while ensuring that models are thoroughly validated and adapted to the
specifics of medical data.</p>
      <p>As already noted, within this work it is extremely important to address a number of issues related to implementing transfer training in medical diagnostics, namely: which DNN topologies to choose; which methods can help to better train the structurally synthesized model; and how to configure the metaparameters of those methods. This structure of the research is justified below.</p>
      <p>Choosing a deep neural network (DNN) topology for transfer learning, configuring metaparameters, and selecting learning methods are critical when applying machine learning technologies to medical diagnostics. The following points explain why this is so important and how results can be improved.</p>
      <sec id="sec-2-1">
        <title>2.1.1. Selecting the DNN topology for transfer training.</title>
        <p>
          The network topology (or architecture) is crucial because it determines how the neuromodel will process data. For medical tasks, such as image diagnostics or analysis of medical records, architectures well suited to the data type are most commonly used [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]:
• Convolutional Neural Networks (CNN);
• Recurrent Neural Networks (RNN);
• Transformer-based models (most often for sequential data).
        </p>
        <p>
          The choice of topology affects:
 model performance: an incorrect topology may prevent the model from learning or processing data efficiently;
 generalization capability and quality: it is important that the network can transfer the acquired knowledge to new medical tasks without losing accuracy;
 model complexity: for small medical data sets, simpler models can be more efficient than complex ones that require huge amounts of data [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ].
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>2.1.2. Methods for improving the training of a structurally synthesized model</title>
        <p>
          To train the model more effectively, the following methods are used:
• Fine-tuning: the process in which a network pre-trained on a large amount of shared data is adapted to a specific medical task. This method preserves the knowledge gained at the previous training stage and only partially retrains the model on new data [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ];
• Data augmentation: when medical data is limited, augmentation techniques can artificially enlarge the data set by creating new examples from the original ones through transformations (rotation, shifting, image scaling, etc.);
• Regularization: techniques such as Dropout or L2 regularization help prevent overfitting, which is especially important when there is not enough training data [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ].
        </p>
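        <p>The augmentation idea above can be sketched in a few lines of stdlib-only Python (an illustration, not the paper's pipeline; real projects would typically use library tools such as torchvision transforms or Keras preprocessing layers). A tiny 2-D "image" is a list of rows, and each transformation preserves the diagnostic label while producing a new training example.</p>

```python
import random

def augment(image, rng):
    """Return a randomly chosen label-preserving transformation of a 2-D image."""
    ops = [
        lambda im: [row[:] for row in im],                 # identity (copy)
        lambda im: [row[::-1] for row in im],              # horizontal flip
        lambda im: [list(col) for col in zip(*im[::-1])],  # 90-degree rotation
    ]
    return rng.choice(ops)(image)

rng = random.Random(0)
image = [[0, 1],
         [2, 3]]
# Artificially enlarge the data set: the original plus transformed copies.
augmented = [image] + [augment(image, rng) for _ in range(3)]
```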
      </sec>
      <sec id="sec-2-3">
        <title>2.1.3. Configuring metaparameters</title>
        <p>
          Among the metaparameters (or hyperparameters, for some methods) of the methods for training structurally synthesized models are [
• learning rate: it is important to adjust the learning rate correctly so that the model does not
get stuck in local lows or learn too slowly [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ];
• batch size: the batch size determines how many examples are processed before the weights are updated; this can affect the stability and speed of learning;
• number of epochs: the number of training iterations (epochs) in which the network adapts to the data is an important factor for achieving optimal results.
        </p>
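        <p>Where these three metaparameters enter a training loop can be shown with a toy, stdlib-only gradient-descent sketch (illustrative only; the loss, data, and default values are hypothetical, not those of the experiments in this paper). It minimises the squared error to a target value over a small "dataset".</p>

```python
def train(data, learning_rate=0.1, batch_size=4, epochs=20):
    """Mini-batch gradient descent on f(w) = mean (w - x)^2 over the data."""
    w = 0.0
    for _ in range(epochs):                        # number of epochs
        for i in range(0, len(data), batch_size):  # batch size: examples per update
            batch = data[i:i + batch_size]
            grad = sum(2 * (w - x) for x in batch) / len(batch)
            w -= learning_rate * grad              # learning rate scales each step
    return w

w = train([3.0] * 16)
print(round(w, 3))  # converges toward the target 3.0
```

        <p>With a much smaller learning rate (for example 0.001) the same loop converges far more slowly, which is exactly the trade-off the list above describes.</p>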
        <p>
          Transfer training can be a good (if not the best) approach to implementing ML in medical diagnostics, given the frequent problem of data limitation: the medical field often lacks data sets large enough to train models from scratch. TML reuses already trained models, which significantly reduces the need for data and time. Moreover, since decisions in medicine often need to be made quickly, using models pre-trained on large general data sets yields results faster, so solutions based on the TML approach stand out in speed and efficiency. Transfer training also makes it possible to effectively adapt general models to specific tasks related, for example, to rare diseases or specific medical images. Thus, TML helps increase the adaptability of neuromodels [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ].
        </p>
        <p>However, this approach also has its own risks, especially if the adaptation of the model to new medical data has not been properly performed. Validation must be thorough and must take into account the specifics of the medical data in question; otherwise there is a risk of incorrect diagnoses.</p>
        <p>
          Transfer training has great potential for medical diagnostics due to its ability to use limited data effectively and to reduce training time. However, it is important to choose the network architecture carefully, configure the metaparameters, and take into account the specifics of medical data to achieve optimal results [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ].
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Materials and methods</title>
      <p>
        As noted earlier, DNNs are most often used in the TML approach. Medical clinical data quite often include visualized test results, for example X-ray, MRI, or CT images. The diagnostic task then becomes a more complex computer-vision task: image recognition. That is why, among all possible DNN topologies, we choose those that perform best on images, namely CNN, DenseNet, VGG16, ResNet, and InceptionNet
[
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]–[20]. For clarity, all the considered topologies are compared in Table 2.
      </p>
      <p>The CNN architecture is one of the most common architectures for image processing. It consists of several layer types:
• convolutional layers: the key components for identifying image features such as contours, textures, etc.;
• pooling layers: reduce the image size while preserving important features;
• fully connected layers: the final stage, used for classification or regression.</p>
      <p>Overall, CNNs are highly efficient in image recognition due to their ability to process spatial
structures.</p>
      <p>
        DenseNet (Densely Connected Convolutional Networks) is an improved version of CNN in which each layer has direct connections to all previous layers [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. This gives the model more context and makes better use of information from earlier stages. Compared to conventional CNNs, the most frequently noted gains are increased learning efficiency, due to a reduced vanishing-gradient problem, and improved accuracy, due to the stronger exchange of information between layers.
      </p>
      <p>VGG16 (Visual Geometry Group 16) is a deep CNN with 16 layers. It uses small filters (3x3) and many layers for more accurate feature detection. This architecture is easy to implement and train thanks to the use of the same 3x3 filters in all layers [19].</p>
      <p>The ResNet architecture uses skip connections, which allow certain layers to be bypassed and thereby avoid the vanishing-gradient problem when training deep networks. It works effectively with very deep networks (up to several hundred layers), and this structural feature improves trainability by using shortcut connections that bypass multiple layers without losing important information [20].</p>
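      <p>The skip-connection mechanism can be sketched in a few lines of plain Python (a conceptual illustration, not a real ResNet implementation): the block's output is the learned transform plus the untouched input, so even a useless transform leaves the identity path intact for gradients to flow through.</p>

```python
def residual_block(x, transform):
    """Residual block: output = transform(x) + x; the '+ x' is the skip connection."""
    return [t + s for t, s in zip(transform(x), x)]

# Even if the learned transform collapses to zero, the input survives unchanged,
# which is what lets very deep stacks of such blocks remain trainable.
identity_preserved = residual_block([1.0, 2.0], lambda x: [0.0] * len(x))
print(identity_preserved)  # [1.0, 2.0]
```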
      <sec id="sec-3-20">
        <title>Comparison of the considered topologies</title>
        <p>[Table 2, flattened during text extraction, compared the topologies along these lines: VGG16 is a deep CNN with 16 layers, simple and fast to train thanks to its uniform architecture, and works well with small data, but its large volume of parameters can be a problem for memory. ResNet can reach hundreds of layers thanks to skip connections, but can be difficult to train because of its many parameters. InceptionNet uses Inception blocks with different filters, offers many depth options depending on the configuration, and achieves higher efficiency thanks to the varied filters, but requires optimization to reduce the parameter count. Typical applications in medicine range from high-precision analysis of medical images with small details to wide real-time image analysis.]</p>
        <p>InceptionNet (GoogLeNet) is an architecture built around Inception blocks, in which filters of different sizes (1x1, 3x3, 5x5) are applied at each layer in order to preserve a variety of features. Its distinguishing property is increased efficiency with a reduced number of parameters, achieved by combining filters of different sizes. It is well suited for real-time use [19].</p>
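        <p>The block structure can be sketched conceptually in plain Python (an illustration only; the toy "branches" merely stand in for convolutions of different sizes): each branch processes the same input, and their outputs are concatenated so that features at several scales are preserved side by side.</p>

```python
def inception_block(x, branches):
    """Apply each branch (standing in for filters of different sizes)
    to the same input and concatenate the results, as in an Inception block."""
    out = []
    for branch in branches:
        out.extend(branch(x))
    return out

# Toy stand-ins for 1x1, 3x3, and 5x5 convolutions over a 1-D signal:
branch_1x1 = lambda x: [v for v in x]                                    # pointwise
branch_3x3 = lambda x: [sum(x[max(0, i - 1):i + 2]) for i in range(len(x))]  # local sums
branch_5x5 = lambda x: [sum(x) / len(x)] * len(x)                        # wide smoothing

features = inception_block([1.0, 2.0, 3.0], [branch_1x1, branch_3x3, branch_5x5])
# The concatenated output keeps all three "scales" of features.
```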
        <p>
          We will use dropout as the basis of transfer training. Dropout is a regularization technique used in DNNs to prevent overfitting: it randomly switches off a certain percentage of neurons during training, which forces the model not to depend on individual neurons and to process information more universally. In the context of transfer training, dropout can be used to improve the efficiency and stability of the model [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ].
        </p>
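        <p>The mechanism just described can be sketched in stdlib-only Python (an illustration, not this study's implementation). This is the common "inverted dropout" formulation, in which surviving activations are scaled up during training so that the expected sum is unchanged and no scaling is needed at inference time.</p>

```python
import random

def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero a fraction `rate` of activations during training,
    scaling survivors by 1/(1 - rate) so the expected sum is unchanged."""
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

# During training, roughly half the activations are zeroed at rate=0.5:
out = dropout([1.0, 1.0, 1.0, 1.0], rate=0.5, rng=random.Random(0))
```

        <p>At inference (training=False) the layer is a no-op, which matches how framework implementations such as torch.nn.Dropout behave in eval mode.</p>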
        <p>
          In TML we often have a model pre-trained on a large set of general data, which is then adapted to specific data (for example, medical images). Enabling dropout during adaptation reduces the risk of overfitting on the new data, especially if the amount of data is limited [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ].
        </p>
        <p>During fine-tuning, when an already trained model is adapted to a specific task (for example, classification of medical images), dropout helps to avoid overfitting on a small data set. This ensures that the model does not memorize specific features of the training data but can generalize to new examples [20].</p>
        <p>TML often involves models trained on large shared data sets and then adapted to a narrow, specific task (such as detecting specific diseases in medical images). Because the new data may be less representative or contain fewer examples, dropout reduces the likelihood that the model will memorize insignificant or noisy data that could cause diagnostic errors [19].</p>
        <p>In TML, different dropout values are often tried (for example, 0.3–0.5), depending on the task and data availability. Too high a dropout rate can make learning difficult, while too low a rate will not give the desired regularization effect.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experiment</title>
      <p>A sample of data on patients with pneumonia from the Mayo Clinic's article was selected for the experiment [21].</p>
      <p>The images from the entire sample are redistributed for the experiment as shown in Table 3.</p>
      <p>For all topologies, we define the training metaparameters given in Table 4.</p>
      <p>The accuracy of all solutions is shown in Table 5. Figures 5 and 6 show the dynamics of changes in diagnostic accuracy.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Analysis of results</title>
      <p>The analysis of the results should begin by noting the striking difference in accuracy between classical CNNs and all the other DNN types considered; the difference is on the order of 10–15%.</p>
      <p>There are several possible explanations for why CNNs performed better in X-ray image
classification compared to more advanced architectures such as DenseNet, VGG16, ResNet, and
InceptionNet.</p>
      <p>Firstly, there is the simplicity of structure and the absence of parameter overload. More modern architectures, such as ResNet or DenseNet, contain a large number of parameters and complex mechanisms optimized for very deep and complex image collections, such as ImageNet. X-ray images typically have fewer high-level texture features, so simpler CNNs can learn more efficiently without the risk of overfitting.</p>
      <p>Secondly, there is the limited variability of X-ray images. Unlike natural images (with huge variations in textures, colors, and objects), X-ray images have a similar structure and fewer unique features to extract. Conventional CNNs can quickly learn to extract the necessary medical features without complex mechanisms like ResNet's residual connections or DenseNet's dense layer connectivity.</p>
      <p>Moreover, there are overfitting and data requirements. Deep networks like ResNet or InceptionNet require very large amounts of data to train effectively. If the X-ray dataset is not large enough, deeper architectures may not reach their maximum efficiency and may overfit.</p>
      <p>Further, there are artifacts and noise in medical images. Deeper architectures may be more sensitive to artifacts, noise, or contrast variations in X-ray images. Conventional CNNs, due to their simplicity, can learn to ignore unnecessary details and focus only on key patterns.</p>
      <p>Finally, there is model optimization and adaptation. Some modern architectures are optimized for color or more variable images, while X-rays are usually black and white (grayscale). This can lead to inefficient use of many filters in large networks. A related factor is the limitation of hardware resources.</p>
      <p>More complex networks require significantly more computing resources for inference. If the system used for training and testing had limited capabilities (e.g., weaker GPUs or limited memory), this could affect the performance of the complex architectures.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>For image-based medical diagnoses, each of these architectures has its advantages. CNN is a classic
and efficient method, suitable for basic tasks. DenseNet and ResNet provide better deep network
processing capability and reduce training problems, so they are suitable for more complex medical
images. VGG16 is a great option for simple but accurate tasks. InceptionNet is optimal for reducing the
number of parameters and improving efficiency, which is important for real-world medical
applications.</p>
      <p>The benefits of using transfer learning with dropout include, in particular:</p>
      <p>Reduced overfitting: The model becomes less prone to overfitting on new data, which is especially
important when working with small medical datasets.</p>
      <p>Improved generalization: Thanks to regularization, the model can better generalize knowledge and
transfer it to new, previously unknown examples.</p>
      <p>Improved learning stability: Combined with fine-tuning techniques, dropout helps the model
consistently achieve optimal results without large fluctuations in performance on validation data.</p>
      <p>Dropout is a useful technique for transfer learning, especially when adapting models to specific
tasks with limited data, such as medical diagnosis. By using dropout during fine-tuning, you can
effectively reduce the risks of overfitting and improve the model's ability to generalize to new
examples.</p>
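      <p>The behaviour described above can be made concrete with a minimal pure-Python sketch of inverted dropout (the variant used by modern frameworks); the dropout function here is illustrative, not a library call. During training each activation is zeroed with probability p and the survivors are scaled by 1/(1-p), keeping the expected activation unchanged, while at inference the layer is the identity.</p>
      <preformat>
```python
import random

def dropout(x, p, training=True, rng=random):
    """Inverted dropout on a list of activations."""
    if not training or p == 0.0:
        return list(x)          # identity at inference time
    keep = 1.0 - p
    # Zero each unit with probability p; rescale survivors by 1/keep
    # so the expected value of every activation is preserved.
    return [(v / keep) if rng.random() >= p else 0.0 for v in x]

acts = [0.5, 1.2, -0.3, 2.0]
print(dropout(acts, p=0.5, training=False) == acts)  # → True
```
      </preformat>
      <p>When fine-tuning a pretrained backbone, such a layer is typically placed only in the newly added classifier head, so the frozen pretrained features are regularized without being altered.</p>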
      <p>Transfer learning has a promising future in medical diagnostics, as it makes it possible to use
knowledge from large datasets effectively to analyze X-ray, CT, or MRI images, even when annotated
medical data is limited. This significantly reduces training time and improves the quality of predictions,
especially when the models are adapted to the specifics of medical images.</p>
      <p>However, it is important to keep in mind that standard architectures trained on ImageNet are not
always optimal for medical tasks, so they should be modified to take into account specific data
features. In general, transfer learning is a promising approach that has already demonstrated success
in clinical practice, but requires careful validation and adaptation to specific medical cases.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgements</title>
      <p>The work was carried out with the support of the state budget research projects of
the National University "Zaporizhzhia Polytechnic" “Intelligent information processing methods and
tools for decision-making in the military and civilian industries” (state registration number
0124U000250) and “Artificial intelligence tools for control and management of technical and social
systems under martial law” (state registration number 0125U000854).</p>
    </sec>
    <sec id="sec-8">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors used Grammarly for grammar and spelling
checking. After using this tool, the authors reviewed and edited the content as needed and take full
responsibility for the publication’s content.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Cosby</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Medical Decision Making. Diagnosis</article-title>
          . pp.
          <fpage>13</fpage>
          -
          <lpage>39</lpage>
          . CRC Press, Boca Raton: Taylor &amp; Francis (
          <year>2017</year>
          ). https://doi.org/10.1201/9781315116334-2.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Williams</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          , et al.:
          <article-title>Domesticating AI in medical diagnosis</article-title>
          .
          <source>Technol. Soc</source>
          .
          <volume>102469</volume>
          (
          <year>2024</year>
          ). https://doi.org/10.1016/j.techsoc.2024.102469.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Göndöcs</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dörfler</surname>
          </string-name>
          , V.:
          <article-title>AI in medical diagnosis: AI prediction &amp; human judgment</article-title>
          .
          <source>Artif. Intell. Med</source>
          .
          <volume>149</volume>
          ,
          <issue>102769</issue>
          (
          <year>2024</year>
          ). https://doi.org/10.1016/j.artmed.2024.102769.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <article-title>Medical Diagnosis and Treatment Record Coding with AI</article-title>
          .
          <source>Working with AI</source>
          . pp.
          <fpage>53</fpage>
          -
          <lpage>58</lpage>
          . The MIT Press (
          <year>2022</year>
          ). https://doi.org/10.7551/mitpress/14453.003.0013.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Wall</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
          </string-name>
          , L.:
          <article-title>Deep Learning-based Respiratory Anomaly and COVID Diagnosis Using Audio and CT Scan Imagery</article-title>
          .
          <source>Recent Advances in AI-enabled Automated Medical Diagnosis</source>
          . pp.
          <fpage>29</fpage>
          -
          <lpage>40</lpage>
          . CRC Press, New York (
          <year>2022</year>
          ). https://doi.org/10.1201/9781003176121-3.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <article-title>Instance-Based Transfer Learning</article-title>
          .
          <source>Transfer Learning</source>
          . pp.
          <fpage>23</fpage>
          -
          <lpage>33</lpage>
          . Cambridge University Press (
          <year>2020</year>
          ). https://doi.org/10.1017/9781139061773.004.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Haskell</surname>
            ,
            <given-names>R.E.</given-names>
          </string-name>
          :
          <article-title>The Similarity-Based Brain. Transfer of Learning</article-title>
          . pp.
          <fpage>189</fpage>
          -
          <lpage>204</lpage>
          . Elsevier (
          <year>2001</year>
          ). https://doi.org/10.1016/b978-012330595-4/50012-3.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Yang</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          , et al.:
          <article-title>Transfer Learning</article-title>
          . Cambridge University Press (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <article-title>Evaluation of the efficiency of the use of composite reinforcement for building structures</article-title>
          (
          <year>2025</year>
          ). URL: https://www.eoss-conf.com/wp-content/uploads/2025/03/Naples_Italy_10.03.25.pdf
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <article-title>Transfer learning for medical image classification: a literature review</article-title>
          . URL: https://bmcmedimaging.biomedcentral.com/articles/10.1186/s12880-022-00793-7
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <article-title>Transfer learning techniques for medical image analysis: A review</article-title>
          . URL: https://www.sciencedirect.com/science/article/abs/pii/S0208521621001297
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <article-title>A Study of CNN and Transfer Learning in Medical Imaging: Advantages, Challenges, Future Scope</article-title>
          . URL: https://www.mdpi.com/2071-1050/15/7/5930
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <article-title>What Makes Transfer Learning Work For Medical Images: Feature Reuse</article-title>
          &amp;
          <article-title>Other Factors</article-title>
          . URL: https://arxiv.org/abs/2203.01825
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <article-title>Multistage transfer learning for medical images</article-title>
          . URL: https://link.springer.com/article/10.1007/s10462-024-10855-7
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <article-title>Optimizing Deep Learning RNN Topologies on Intel Architecture</article-title>
          .
          <source>Supercomput. Front. Innov</source>
          .
          <volume>6</volume>
          (
          <issue>3</issue>
          ) (
          <year>2019</year>
          ). https://doi.org/10.14529/jsfi190304.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <surname>Singh</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lone</surname>
            ,
            <given-names>Y.A.</given-names>
          </string-name>
          :
          <article-title>Introduction to Machine Learning</article-title>
          .
          <source>Deep Neuro-Fuzzy Systems with Python</source>
          . pp.
          <fpage>129</fpage>
          -
          <lpage>156</lpage>
          . Apress, Berkeley, CA (
          <year>2019</year>
          ). https://doi.org/10.1007/978-1-4842-5361-8_4.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <surname>Lobbous</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kim</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nabors</surname>
            ,
            <given-names>L.B.</given-names>
          </string-name>
          :
          <article-title>Diencephalic and other deep brain tumours</article-title>
          .
          <source>Handbook of Neuro-Oncology Neuroimaging</source>
          . pp.
          <fpage>661</fpage>
          -
          <lpage>680</lpage>
          . Elsevier (
          <year>2022</year>
          ). https://doi.org/10.1016/b978-0-12-822835-7.00024-x.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <surname>Graziani</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xibilia</surname>
            ,
            <given-names>M.G.</given-names>
          </string-name>
          :
          <article-title>Innovative Topologies and Algorithms for Neural Networks</article-title>
          .
          <source>Future Internet</source>
          .
          <volume>12</volume>
          (
          <issue>7</issue>
          ),
          <fpage>117</fpage>
          (
          <year>2020</year>
          ). https://doi.org/10.3390/fi12070117.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <surname>Freitag</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , et al.:
          <article-title>Reliability-based optimization of structural topologies using artificial neural networks</article-title>
          .
          <source>Probabilistic Eng. Mech</source>
          .
          <volume>103356</volume>
          (
          <year>2022</year>
          ). https://doi.org/10.1016/j.probengmech.2022.103356.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <surname>Kaviani</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sohn</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Application of complex systems topologies in artificial neural networks optimization: An overview</article-title>
          .
          <source>Expert Syst. With Appl</source>
          .
          <volume>180</volume>
          ,
          <issue>115073</issue>
          (
          <year>2021</year>
          ). https://doi.org/10.1016/j.eswa.2021.115073.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <article-title>Identifying Medical Diagnoses and Treatable Diseases by Image-Based Deep Learning</article-title>
          . URL: https://www.cell.com/cell/fulltext/S0092-8674(18)30154-5
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>