<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Modern Data Science Technologies Doctoral Consortium, June</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Intelligent systems for recognizing artistic styles: a Deep Learning approach</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Nataliya Boyko</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yaroslav Borys</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Lviv Polytechnic National University, the Department of Artificial Intelligence Systems</institution>
          ,
          <addr-line>Lviv, 79013</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <volume>15</volume>
      <issue>2025</issue>
      <fpage>0000</fpage>
      <lpage>0002</lpage>
      <abstract>
        <p>The study presents a machine learning-based approach for artistic style recognition in images, examining its practical value, feasibility, and potential applications. Existing research on the topic is analyzed, comparing different approaches and highlighting their strengths and limitations. The proposed method utilizes convolutional neural networks (CNNs) for style classification, trained on the WikiArt dataset containing over 100,000 high-quality images. The study details the data preparation process for training, provides a general overview of neural networks, and offers an in-depth analysis of the proposed CNN architecture. Finally, the experimental results are reviewed, identifying the model's limitations and discussing possible enhancements to improve accuracy and overall performance.</p>
      </abstract>
      <kwd-group>
        <kwd>machine learning</kwd>
        <kwd>image classification</kwd>
        <kwd>artistic style recognition</kwd>
        <kwd>convolutional neural networks (CNNs)</kwd>
        <kwd>data preprocessing</kwd>
        <kwd>WikiArt dataset</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>In the modern era of digital transformation, the automation of visual content analysis processes is
becoming increasingly important in various fields, from art history to information technology. In
particular, the recognition of artistic styles based on images has become one of the promising
research areas, uniting the interests of researchers in the field of artificial intelligence, cultural
heritage and computer vision. Traditionally, the identification of artistic style required in-depth
expert assessment by art historians, which made the process subjective and small-scale. However,
the development of machine learning, in particular convolutional neural networks (CNN), allows
us to automate this process and increase its objectivity and efficiency [1, 5].</p>
      <p>To date, there are a number of studies devoted to the automatic classification of artistic styles, in
particular using pre-trained models such as AlexNet, ResNet, VGG and Inception. At the same time,
most of them focus on a limited number of styles or demonstrate reduced accuracy in recognizing
similar stylistic directions. In addition, some of the solutions require significant computational
resources or large amounts of training data, which limits their practical use.</p>
      <p>Among the unresolved problems, it is worth highlighting the low classification accuracy when
detecting styles with similar visual features, such as abstract expressionism and color field, as well
as the difficulty of scaling models without losing the quality of the result. There is also a
contradiction between the accuracy of models and their ability to generalize, which is especially
noticeable when using models on new, unfamiliar data [3].</p>
      <p>The goal of this research is to develop an effective model for automatically identifying artistic
styles of images using convolutional neural networks, which provides high classification accuracy
at moderate computational costs. The task is to create our own CNN architecture, train it on a
subset of the large WikiArt set, implement preprocessing and data augmentation methods, as well
as analyze the results and potential areas for improvement.</p>
      <p>Thus, the research is relevant and aimed at overcoming the limitations of existing approaches,
using the advantages of modern information technologies in the field of machine learning to solve
interdisciplinary problems.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Problem statement</title>
      <p>Classification of artistic style is a non-trivial task since styles often have overlapping visual
characteristics, which complicates their automatic identification. Moreover, stylistic features can
manifest themselves in compositional details, color palette, brushstrokes, or general aesthetics,
which are not always amenable to clear formalization. Thus, there is a need to build a model that
can effectively generalize complex visual patterns and distinguish even close painting styles [4].</p>
      <p>From a mathematical point of view, the problem of classifying artistic styles is formalized as a
multi-class classification problem:</p>
      <p>Given a set of images X = {x_1, x_2, …, x_n}, where each x_i ∈ R^(h×w×c) is a three-dimensional tensor representing an image of size h × w with c color channels (usually c = 3 for RGB images).</p>
      <p>Let Y = {y_1, y_2, …, y_n}, where each y_i ∈ {1, 2, …, K} is a class label corresponding to one of K art styles.</p>
      <p>The goal is to find a function f : R^(h×w×c) → {1, 2, …, K} that approximates the correspondence between an input image x_i and its style y_i with maximum accuracy: ŷ_i = f(x_i; θ), where θ are the model parameters.</p>
      <p>
        To construct such a function, a convolutional neural network (CNN) is used, which consists of a composition of nonlinear functions with parameters θ = {W_1, b_1, …, W_l, b_l} that are learned by optimizing the objective function (
        <xref ref-type="bibr" rid="ref1">1</xref>
        ):
      </p>
      <p>
        L(θ) = (1/n) ∑_{i=1}^{n} L_CE(y_i, f(x_i; θ)), (
        <xref ref-type="bibr" rid="ref1">1</xref>
        )
where L_CE is the cross-entropy function for multiclass classification, which measures the difference between the predicted distribution and the true class.
      </p>
      <p>Thus, the solution to the problem is to construct a model f that minimizes L( θ ) on the training
sample, while ensuring good generalization ability on new images.</p>
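      <p>For concreteness, the objective in (1) can be computed directly; the following is a minimal pure-Python sketch of the averaged cross-entropy loss (an illustration only, not the authors' implementation, which operates on network outputs):</p>

```python
import math

def cross_entropy(true_class, probs):
    # L_CE = -log(p_true): large when the predicted probability
    # of the true class is small, zero when it equals 1.
    return -math.log(probs[true_class])

def objective(labels, predictions):
    # L(theta) = (1/n) * sum_i L_CE(y_i, f(x_i; theta))  -- Equation (1)
    n = len(labels)
    return sum(cross_entropy(y, p) for y, p in zip(labels, predictions)) / n

# One perfect prediction and one 50/50 prediction:
print(round(objective([0, 1], [[1.0, 0.0], [0.5, 0.5]]), 4))   # 0.3466
```

      <p>Minimizing this quantity over θ is exactly the training problem stated above.</p>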
    </sec>
    <sec id="sec-3">
      <title>3. Analysis of the latest research and publications</title>
      <p>In scientific research on the automatic identification of artistic styles using machine learning,
various approaches are considered that allow classifying paintings by stylistic features. One such
method is UFLK (Unsupervised Feature Learning), proposed by Eren Gultepe and other authors.
This approach consists of using unsupervised learning to highlight stylistic characteristics of
paintings, after which they are classified according to stylistic similarity. The results of the study
show that this method can be useful for classification and clustering, but its accuracy does not
always correspond to the level of more complex approaches [2].</p>
      <p>Another approach, using the support vector machine (SVM) method, is described in the work of
Alexander Blessing. In his study, works of art are classified by artists, and the model trained on a
sample of 750 paintings achieved an accuracy of about 78.53%. However, when trying to use hidden
features for classification, the accuracy of the model decreased, indicating a problem of
overfitting when processing complex features [6].</p>
      <p>A study by Adrian Lecoutra and colleagues examines the use of deep neural networks, including
the AlexNet and ResNet models, to automatically recognize artistic style based on 25 categories.
The authors note that the accuracy of the model increased when adding additional layers to the
pre-trained networks, although the overall accuracy remained at 62%. Further improvements in the
results are possible by using the bootstrap aggregation method [11].</p>
      <p>Saqid Imran and his colleagues propose an interesting two-stage approach to classifying
painting styles. The first stage involves dividing the image into five parts, each of which is
classified by a separate convolutional neural network. The second stage processes and combines
the probability vectors obtained from the first stage. This approach allows for a significant
improvement in classification accuracy, reaching 90.7% accuracy, and with properly tuned
hyperparameters even up to 96.5%. However, this method requires a large amount of data for
training and pre-tuning the models.</p>
      <p>A study by Maftuhah Rum and Arda Priscilla on the classification of naturalism and realism
styles showed that the use of pre-trained MobileNetV3 models provides high classification
accuracy, reaching 95%. This result demonstrates the effectiveness of using lightweight pre-trained
models for specific tasks, although for a wider range of styles such models may have limitations [7,
15].</p>
      <p>A study by Jacqueline Valencia and Gerradina Pineda provides an overview of trends in the use
of machine learning for predicting artistic styles. They highlight that most existing research
focuses on historical styles, while contemporary art remains understudied. This opens up
significant opportunities for further developments in this area.</p>
      <p>Finally, an analysis of existing tools for classifying painting styles, such as Art Style Identifier
AI and Analyzer-Art Style Identification, shows that most of them require a subscription to access
full functionality. This indicates a lack of accessible and effective tools for widespread use in
research and education, which highlights the importance of developing alternative solutions [8, 9].</p>
      <p>A brief comparison of existing publications, outlining their advantages and disadvantages, is
presented in Table 1. This comparison helps identify the strengths and limitations of current
research, providing insight into areas where further improvements and developments are needed.</p>
      <p>[Table 1: a comparison of existing approaches to artistic style classification, summarizing each method, its reported accuracy (e.g. the two-stage deep-and-shallow system reaching 90.7%), and its limitations, such as the need for large training datasets.]</p>
      <p>Thus, existing research indicates progress in the application of machine learning to classify
artistic styles, but also reveals a number of problems, such as limited classification accuracy when
working with similar styles and the need for larger datasets for the models to work effectively.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Materials and Methods</title>
      <p>The main method for recognizing artistic styles is the use of convolutional neural networks (CNNs), which can automatically extract important features from images and detect complex visual patterns. CNNs consist of several types of layers [11, 13]:</p>
      <p>Convolutional layer - the main layer of the network. It contains a set of filters (kernels) whose parameters are learned during training. There are usually several layers of this type, and each subsequent layer typically learns a larger number of filters (powers of two, e.g. 32, 64, 128, are commonly used). In most cases the filters are smaller than the image; each filter produces an activation map when convolved with the image.</p>
      <p>Pooling layer - an aggregation layer that divides the input data (activation maps) into small
regions over which aggregation operations (e.g., average, maximum, or minimum) are
performed. This operation allows for compression of activation maps without significant
cost; often 2x2 regions are used.</p>
      <p>Fully connected (dense) layer - a layer where every input “neuron” is connected to every
output “neuron”.</p>
      <p>For this study, a custom CNN architecture was developed, consisting of 10 layers, including:
4 convolutional layers for image feature extraction;
5 subsampling layers (Max Pooling) to reduce the data size and the number of parameters;</p>
      <p>1 fully connected layer for classification of results.</p>
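      <p>A sketch of such a stack in Keras (assuming TensorFlow; the filter counts, kernel sizes, and the placement of the extra pooling layer are illustrative assumptions, since the paper specifies only the layer counts):</p>

```python
import tensorflow as tf

# Illustrative 4-conv / 5-pool / 1-dense stack; hyperparameters are assumed.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(265, 265, 3)),
    tf.keras.layers.MaxPooling2D(2),          # initial downsampling pool
    tf.keras.layers.Conv2D(32, 3, activation="relu", padding="same"),
    tf.keras.layers.MaxPooling2D(2),
    tf.keras.layers.Conv2D(64, 3, activation="relu", padding="same"),
    tf.keras.layers.MaxPooling2D(2),
    tf.keras.layers.Conv2D(128, 3, activation="relu", padding="same"),
    tf.keras.layers.MaxPooling2D(2),
    tf.keras.layers.Conv2D(256, 3, activation="relu", padding="same"),
    tf.keras.layers.MaxPooling2D(2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(5, activation="softmax"),  # 5 style classes
])
```

      <p>The dense softmax head produces the per-style probability vector discussed later.</p>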
      <p>The ReLU (Rectified Linear Unit) function was used to activate neurons; it is computationally cheap, mitigates vanishing gradients, and speeds up the learning process (although it can suffer from "dying" neurons, which variants such as Leaky ReLU address).</p>
      <p>The model is trained using the Backpropagation algorithm, which allows optimizing network
weights by minimizing the loss function. For this task, cross-entropy was used as the loss function
for multi-class classification (Equation 1).</p>
      <p>The Adam algorithm was used as an optimizer for training, which allows for effective tuning of
network parameters.</p>
      <p>Kernel size - specifies the size of the convolution window.
Filters - the number of filters in the convolution, which determines the dimensionality of the output space.
Stride - the step length of the convolution (the number of pixels the kernel moves each step); values greater than 1 may conflict with other arguments.
Padding - when set to 'same', pads the input evenly on all sides so the output keeps the input's spatial size.</p>
      <p>Activation function - determines whether a neuron should be activated based on its input.</p>
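      <p>To make the interplay of kernel size, stride, and padding concrete, the following small sketch (an illustration, not the authors' code) computes the output spatial size of a convolution under Keras-style 'valid' and 'same' padding:</p>

```python
import math

def conv_output_size(n, kernel, stride=1, padding="valid"):
    # 'same' pads the input so the output size depends only on the stride;
    # 'valid' slides the kernel only over positions it fully covers.
    if padding == "same":
        return math.ceil(n / stride)
    return (n - kernel) // stride + 1

# A 265-pixel side stays 265 after a 3x3 'same' convolution with stride 1,
# and shrinks to 132 after a 2x2 pooling step with stride 2:
print(conv_output_size(265, 3, 1, "same"), conv_output_size(265, 2, 2))   # 265 132
```
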
      <p>A few more words on activation functions. An activation function introduces non-linearity, allowing the network to learn complex patterns beyond simple linear relationships. Each activation function is best suited to a particular scenario, as described below in Table 2 [10, 14].</p>
      <p>Table 2. Activation functions and their properties:
ReLU: f(x) = max(0, x). Simple and prevents vanishing gradients.
Leaky ReLU: f(x) = x if x &gt; 0, else 0.01x. Solves the “dying ReLU” problem.
Sigmoid: f(x) = 1 / (1 + e^(−x)). Used in binary classification; suffers from vanishing gradients.
Tanh: f(x) = (e^x − e^(−x)) / (e^x + e^(−x)). Similar to sigmoid but centred at 0, reducing bias.
Softmax: f(x_i) = e^(x_i) / ∑_j e^(x_j). Converts outputs into probability distributions; best suited for multi-class classification.</p>
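      <p>The activation functions of Table 2 can be written directly as code; a small self-contained sketch (scalar versions, for illustration only):</p>

```python
import math

def relu(x):
    return max(0.0, x)

def leaky_relu(x):
    return x if x > 0 else 0.01 * x

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))

def softmax(xs):
    # Converts raw scores into a probability distribution over classes.
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

print(relu(-2.0), leaky_relu(-2.0), sigmoid(0.0))   # 0.0 -0.02 0.5
```
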
      <sec id="sec-4-5">
        <title>Conv2D parameters and the WikiArt dataset</title>
        <p>The main arguments of the Keras layers.Conv2D class [14], which was used in our research, are those described above: kernel size, filters, stride, padding, and the activation function.</p>
        <p>In this research, the training will be done on the WikiArt dataset [12]. The WikiArt [12] dataset
is a large and diverse dataset containing images of works of art collected from the eponymous
WikiArt.org, an online resource for studying artworks.</p>
        <p>In total, it contains more than 100,000 high-quality images of various artistic styles and authors:
27 styles are available for research, including Baroque, Renaissance, Impressionism, Realism, Pop
Art, etc.; the authors include Claude Monet, Paul Cezanne, Gustav Klimt, Leonardo da Vinci, and others (see Fig. 1).</p>
        <p>The dataset is structured in three main categories:</p>
      </sec>
      <sec id="sec-4-7">
        <title>Data preparation and model formulation</title>
        <p>The three main categories are: by artistic style; by author; by subject.</p>
        <p>This organization makes it ideal for research in the following areas:
classification of artistic styles;
analysis of image authorship;
training models to generate new images based on the provided ones.</p>
        <p>An analysis of the distribution of paintings by artistic styles shows that the largest number of
works belongs to impressionism - more than 13,000 works, while the smallest number, only 98, is
represented in the style of action painting.</p>
        <p>A similar analysis was performed to classify images by resolution. Most images are between 500
and 2000 pixels in size, which requires additional adjustments before feeding them to the model.</p>
        <p>In this research, only 5,000 images in total will be used for training, covering 5 classes (styles): Abstract Expressionism, Color Field Painting, Mannerism - Late Renaissance, Naive Art / Primitivism, and Post-Impressionism.</p>
        <p>Let's discuss how the data is read, transformed, and presented as input to the model. The data is stored in folders whose names correspond to specific classes. Then, using the Keras image_dataset_from_directory function [12], images are read and stored alongside an alphabetically sorted array of class names. During this operation, images are resized to a specified size, namely 265 by 265 pixels.</p>
        <p>To allow the model to learn faster data normalization is used. Each pixel in image channels (R,
G, B) is divided by its maximum value, 255. This also improves generalization and prevents bias
towards bright or dark images.</p>
        <p>To improve model robustness and reduce potential overfitting, data augmentation was introduced. Several techniques were used, namely:</p>
        <p>Random flip.</p>
        <p>Random rotation.</p>
        <p>Random contrast.</p>
        <p>Random brightness.</p>
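        <p>The normalization and random-flip steps can be illustrated in pure Python (a toy sketch over nested lists; the actual pipeline operates on Keras image datasets):</p>

```python
import random

def normalize(image):
    # Scale 8-bit channel values into [0, 1] by dividing by the maximum, 255.
    return [[[v / 255.0 for v in pixel] for pixel in row] for row in image]

def random_horizontal_flip(image, rng):
    # Reverse each row with probability 0.5, as in random-flip augmentation.
    return [row[::-1] for row in image] if rng.random() > 0.5 else image

img = [[[255, 0, 0], [0, 255, 0]]]          # one row, two RGB pixels
print(normalize(img))                        # [[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]]
```
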
        <p>The algorithm for image processing and analysis using convolutional neural networks (CNN)
consists of several stages that can be described mathematically. The main goal of this algorithm is
to transform an image into a set of features that allow for classification by artistic styles.</p>
        <p>We have an input image I, represented as a tensor of size h × w × c, where:
h - image height (number of pixels vertically),
w - image width (number of pixels horizontally),
c - number of color channels (usually c = 3 for RGB images, where each pixel contains three values: red, green, and blue).</p>
        <p>Here I_{i,j,k} is the value of the pixel at position (i, j) for color channel k.</p>
        <p>
          The first step is image normalization, which scales pixel values to a range from 0 to 1. This is
done by dividing each pixel by its maximum value (255 for 8-bit images) (
          <xref ref-type="bibr" rid="ref2">2</xref>
          ):
        </p>
        <p>
          I′_{i,j,k} = I_{i,j,k} / 255. (
          <xref ref-type="bibr" rid="ref2">2</xref>
          )
        </p>
        <p>Now all pixel values I′_{i,j,k} are within the range [0, 1].</p>
        <p>In addition, to improve the generalization ability of the model, data augmentation can be
applied, which includes operations such as random rotations, reflections, changes in contrast or
brightness. This helps to avoid overfitting and increase the diversity of the data.</p>
        <p>A neural network processes an image using convolutional operations, which are applied to each
layer of the CNN. Convolution is an operation in which a filter (kernel) K of size f × f × c is slid
over the image and creates a new feature matrix (activation).</p>
        <p>
          This can be written as (
          <xref ref-type="bibr" rid="ref3">3</xref>
          ):
        </p>
        <p>
          A_{i,j,k} = ∑_{p=1}^{f} ∑_{q=1}^{f} ∑_{r=1}^{c} I′_{i+p, j+q, r} · K_{p,q,r}, (
          <xref ref-type="bibr" rid="ref3">3</xref>
          )
        </p>
        <p>where A_{i,j,k} is the activation for pixel (i, j) on the k-th channel, K is the filter, f is the filter size, and c is the number of channels in the image.</p>
        <p>This process allows us to extract local features such as edges, textures, and colors, which will
then be used for classification.</p>
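        <p>Equation (3) can be made concrete with a tiny single-channel example (stride 1, 'valid' padding; an illustrative sketch, not an efficient implementation):</p>

```python
def convolve_single(image, kernel):
    # Activation A[i][j] = sum_p sum_q image[i+p][j+q] * kernel[p][q]
    # (single channel, stride 1, 'valid' padding), as in Equation (3).
    f = len(kernel)
    h, w = len(image), len(image[0])
    out = []
    for i in range(h - f + 1):
        row = []
        for j in range(w - f + 1):
            s = 0.0
            for p in range(f):
                for q in range(f):
                    s += image[i + p][j + q] * kernel[p][q]
            row.append(s)
        out.append(row)
    return out

# A vertical-edge kernel responds strongly where the bright right column begins:
print(convolve_single([[0, 0, 1], [0, 0, 1], [0, 0, 1]],
                      [[-1, 1], [-1, 1]]))   # [[0.0, 2.0], [0.0, 2.0]]
```
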
        <p>
          After each convolution, pooling is usually applied to reduce the size of the activations and the
number of parameters. Typically, Max Pooling is used, where the maximum value is selected for
each small region. Mathematically, this looks like this (
          <xref ref-type="bibr" rid="ref4">4</xref>
          ):
        </p>
        <p>
          P_{i,j,k} = max_{p,q} A_{i+p, j+q, k}, (
          <xref ref-type="bibr" rid="ref4">4</xref>
          )
where P_{i,j,k} are the values after subsampling for pixel (i, j) on the k-th channel, and p and q range over the subsampling window.
        </p>
        <p>This process allows you to reduce the dimensionality of the image and preserve important
features.</p>
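        <p>A corresponding sketch of Max Pooling (Equation 4) with a 2×2 window and stride 2, again for illustration only:</p>

```python
def max_pool(activation, size=2):
    # P[i][j] = max over a size x size window of A (Equation 4), stride = size.
    h, w = len(activation), len(activation[0])
    return [[max(activation[i + p][j + q]
                 for p in range(size) for q in range(size))
             for j in range(0, w - size + 1, size)]
            for i in range(0, h - size + 1, size)]

print(max_pool([[1, 3, 2, 0],
                [4, 2, 1, 1],
                [0, 0, 5, 6],
                [0, 0, 7, 8]]))   # [[4, 2], [0, 8]]
```
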
        <p>After several convolutional and subsampling layers, the image is passed to the fully connected
layers, which perform the classification. These are a set of neurons, each of which is connected to
all the outputs of the previous layers [13].</p>
        <p>
          Let x be the vector of activations after the last convolutional and subsampling layer. These activations are fed to the input of the fully connected layers, where each neuron of the j-th layer has the output (
          <xref ref-type="bibr" rid="ref5">5</xref>
          ):
y_j = σ(∑_i W_{ij} x_i + b_j), (
          <xref ref-type="bibr" rid="ref5">5</xref>
          )
        </p>
        <p>where y_j is the output of the j-th neuron, W_{ij} is the weight between the i-th input and the j-th output, b_j is the bias of the j-th neuron, and σ is the activation function (usually ReLU, or Softmax for multiclass classification).</p>
        <p>
          At the output of the last fully connected layer, we obtain a probability vector over the classes (art styles). The probability vector ŷ is calculated using the Softmax function (
          <xref ref-type="bibr" rid="ref6">6</xref>
          ):
ŷ_k = exp(y_k) / ∑_{i=1}^{K} exp(y_i), (
          <xref ref-type="bibr" rid="ref6">6</xref>
          )
        </p>
        <p>where ŷ_k is the probability that the image belongs to class k, and K is the number of classes (styles).</p>
        <p>
          The loss function for multiclass classification is usually calculated using crossentropy (
          <xref ref-type="bibr" rid="ref7">7</xref>
          ):
        </p>
        <p>
          L = − ∑_{k=1}^{K} y_k log(ŷ_k), (
          <xref ref-type="bibr" rid="ref7">7</xref>
          )
        </p>
        <p>where y_k is the true class label, and ŷ_k is the predicted probability for class k.</p>
        <p>To optimize the weights and biases of the network, the backpropagation algorithm and the Adam optimizer are used, which minimize the loss function by iteratively updating the model parameters.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Experiments</title>
      <p>Several experiments were conducted to evaluate the performance of the designed convolutional
neural network for image style recognition. The final version of the network consists of 10 hidden
layers, including 4 convolutional layers, 5 pooling layers, and 1 dense layer. The full network
structure is illustrated in Fig. 2. During the experiments, various hyperparameters, data
augmentation techniques, and optimization strategies were tested to enhance model performance
and ensure robust classification.</p>
      <p>To optimize computational efficiency, early stopping was implemented. This algorithm continuously monitors a specified model metric - validation loss in this research - and halts the training process when the metric shows a consistent increase. This helps the model avoid potential overfitting and also reduces training time.</p>
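      <p>The stopping rule can be sketched in a few lines (an illustrative reimplementation of the idea; the patience value is a hypothetical parameter, and the study presumably relied on the framework's built-in callback):</p>

```python
def should_stop(val_losses, patience=3):
    # Stop when the last `patience` epochs all failed to improve on the
    # best validation loss observed before them ("consistent increase").
    if len(val_losses) > patience:
        best_before = min(val_losses[:-patience])
        return all(v >= best_before for v in val_losses[-patience:])
    return False

history = [0.90, 0.80, 0.75, 0.78, 0.79, 0.81]
print(should_stop(history, patience=3))   # True: three epochs without improving on 0.75
```
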
      <p>The training results are presented in Fig. 3 and Fig. 4. The highest accuracy achieved was 0.7 for
the training data and 0.63 for the validation data. Analyzing the graphs, it can be concluded that
the model lacked sufficient complexity for the given task. This suggests that a deeper architecture,
additional feature extraction mechanisms, or further hyperparameter tuning may be required to
improve performance.</p>
      <p>Currently, two possible solutions are being considered to improve model performance:</p>
      <p>Reducing the number of classes, which effectively increases the model's relative capacity by allowing it to focus on fewer categories. This also opens the possibility of using ensemble methods to detect a larger number of classes.</p>
      <p>Using pre-trained models such as VGG16 or Xception for transfer learning, which can
significantly enhance feature extraction.</p>
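      <p>As an illustration of the second option, a transfer-learning setup might look like this in Keras (assuming TensorFlow; the head layers and hyperparameters are assumptions, not the authors' configuration):</p>

```python
import tensorflow as tf

# Frozen VGG16 convolutional base as a feature extractor (transfer learning).
base = tf.keras.applications.VGG16(weights="imagenet", include_top=False,
                                   input_shape=(265, 265, 3))
base.trainable = False

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(5, activation="softmax"),  # 5 style classes
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

      <p>Only the small classification head is trained, so far fewer labeled images are needed than for training a network from scratch.</p>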
      <p>To further analyze the model’s performance in style detection, it was tested on a subset of 250
images spanning five different styles. As shown in Fig. 4, the model demonstrates high accuracy in
detecting the Color Field Painting style (labeled as 1), while achieving moderate accuracy for
Mannerism - Late Renaissance (labeled as 2) and Post-Impressionism (labeled as 4).</p>
      <p>However, the model struggles to correctly classify Abstract Expressionism (labeled as 0) and
Naive Art / Primitivism (labeled as 3).</p>
      <p>The model also exhibits confusion between similar styles, particularly:</p>
      <p>
        Abstract Expressionism (0) and Color Field Painting (
        <xref ref-type="bibr" rid="ref1">1</xref>
        ) – since both are substyles of Abstract Art, they share overlapping visual characteristics, making it challenging for the model to distinguish between them accurately.
      </p>
      <p>
        Mannerism – Late Renaissance (
        <xref ref-type="bibr" rid="ref2">2</xref>
        ) and Post-Impressionism (
        <xref ref-type="bibr" rid="ref4">4</xref>
        ) – these styles also show some degree of misclassification, likely due to similarities in color palettes, brushwork, or composition techniques.
      </p>
      <p>To reduce misclassifications, a more refined feature extraction process is necessary, as was
already discussed earlier.</p>
      <p>The result, shown in Fig. 5, is a prototype interface demonstrating the practical application of the solution. Built using Gradio [10], the interface consists of three main elements:</p>
      <p>Input field – allows users to select an image for analysis.
Model selection field – enables choosing between different trained models.</p>
      <p>Output field – displays the predicted artistic style based on the selected image and model.</p>
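      <p>The three elements above map naturally onto Gradio's high-level Interface API; a sketch of how such a prototype might be wired up (the classify stub is hypothetical, and the real function would load the selected model and run the preprocessing pipeline):</p>

```python
import gradio as gr

def classify(image, model_name):
    # Hypothetical stub: the real routine would load `model_name`,
    # preprocess `image`, and return {style: probability} from the model.
    return {"Post-Impressionism": 0.5, "Color Field Painting": 0.5}

demo = gr.Interface(
    fn=classify,
    inputs=[gr.Image(type="pil"),
            gr.Dropdown(["CNN v1", "CNN v2"], label="Model")],
    outputs=gr.Label(label="Predicted style"),
)
# demo.launch()  # starts the local web interface
```
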
    </sec>
    <sec id="sec-6">
      <title>6. Conclusions</title>
      <p>Image style recognition remains an underexplored yet highly promising field. Most existing
research focuses on theoretical aspects, with only a few practical solutions available – many of
which can classify only 2 to 3 styles or require large datasets for training.</p>
      <p>One of the key stages is image preprocessing, which includes pixel normalization and data augmentation to improve the generalization ability of the model. These measures reduce the probability of overfitting and ensure better adaptation of the model to new, unseen data.</p>
      <p>An important element is the use of convolutional neural networks, which automatically
highlight important features of images, in particular, different textures, edges, colors and other
stylistic characteristics, which is necessary for the classification of artistic styles. The use of
multiple layers of convolution and subsampling allows you to effectively reduce the dimensionality
of the input data and increase the recognition accuracy.</p>
      <p>Of particular note is the role of fully connected layers, which perform the final classification
based on activations from previous layers, allowing for accurate image style identification. To
improve the results, the ReLU activation function and the Adam optimization method were used,
which provide fast learning and accurate predictions.</p>
      <p>The study presented a working method for recognizing the artistic style of images across 5 styles, achieving an accuracy of 0.63. An in-depth analysis of the model's architecture was conducted, and various approaches to preparing the data were discussed to enhance the model's performance.</p>
      <p>Additionally, the training and validation results were thoroughly discussed, highlighting the
model's performance on both training and validation datasets. The study also displayed the
practical application of the solution, demonstrating how the model can be used in real-world
scenarios for artistic style recognition.</p>
      <p>While the proposed solution is not perfect, it represents a functional approach that can be
further optimized for specific applications. Additionally, this paper outlines several strategies to
enhance model accuracy. Future work can build on these foundations to develop more robust and
efficient style recognition systems.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgements</title>
      <p>The study was carried out within the research topic "Methods and means of artificial intelligence to prevent the spread of tuberculosis in war-time" (№0124U000660) at the Department of Artificial Intelligence Systems of the Institute of Computer Science and Information Technologies of Lviv Polytechnic National University.</p>
    </sec>
    <sec id="sec-8">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors used ChatGPT and DeepL for grammar and spelling checking and for translation. After using these tools, the authors reviewed and edited the content as needed and take full responsibility for the publication’s content.</p>
    </sec>
    <sec id="sec-9">
      <title>References [9]-[15]</title>
      <p>[9] K. He, X. Zhang, S. Ren, J. Sun, Deep Residual Learning for Image Recognition, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 770-778.</p>
      <p>[10] J. Johnson, A. Alahi, F.F. Li, Perceptual Losses for Real-Time Style Transfer and Super-Resolution, in: European Conference on Computer Vision (ECCV 2016), 2016, pp. 694-711. doi: 10.1007/978-3-319-46475-6_43</p>
      <p>[11] D. P. Kingma, J. Ba, Adam: A Method for Stochastic Optimization, in: 3rd International Conference on Learning Representations, San Diego, 2015. URL: https://arxiv.org/abs/1412.6980</p>
      <p>[12] WikiArt.org - Visual Art Encyclopedia, 2025. URL: https://www.wikiart.org/. Last visited 02/04/2025.</p>
      <p>[13] I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, Y. Bengio, Generative Adversarial Nets, in: Advances in Neural Information Processing Systems 27 (NeurIPS), 2014. URL: arXiv:1406.2661</p>
      <p>[14] P. Isola, J.-Y. Zhu, T. Zhou, A. A. Efros, Image-to-Image Translation with Conditional Adversarial Networks, in: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017. doi: 10.1109/CVPR.2017.632</p>
      <p>[15] P. Isola, J.-Y. Zhu, T. Zhou, A. A. Efros, Image-to-Image Translation with Conditional Adversarial Networks, in: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017. doi: 10.1109/CVPR.2017.632</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] C. Li, M. Wand, Precomputed Real-Time Texture Synthesis with Markovian Generative Adversarial Networks, in: European Conference on Computer Vision (ECCV 2016), 2016. doi: 10.1007/978-3-319-46487-9_43</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          Gradio Team, Gradio,
          <year>2025</year>
          . URL: https://www.gradio.app/. Last visited 02/04/2025.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>L. A.</given-names>
            <surname>Gatys</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. S.</given-names>
            <surname>Ecker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bethge</surname>
          </string-name>
          ,
          <article-title>A Neural Algorithm of Artistic Style</article-title>
          ,
          <source>Journal of Vision, Vision Sciences Society Annual Meeting Abstract</source>
          , Vol.
          <volume>16</volume>
          , p.
          <fpage>326</fpage>
          ,
          <year>2016</year>
          . doi: 10.1167/16.12.326
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>P.</given-names>
            <surname>Isola</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-Y.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Zhou</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A. A.</given-names>
            <surname>Efros</surname>
          </string-name>
          ,
          <article-title>Image-to-image translation with conditional adversarial networks</article-title>
          ,
          <source>in: IEEE Conference on Computer Vision and Pattern Recognition (CVPR)</source>
          ,
          <year>2017</year>
          . doi: 10.1109/CVPR.2017.632
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>N.</given-names>
            <surname>Boyko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bronetskyi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Shakhovska</surname>
          </string-name>
          ,
          <article-title>Application of Artificial Intelligence Algorithms for Image Processing</article-title>
          ,
          <source>in: Workshop Proceedings of the 8th International Conference on “Mathematics. Information Technologies. Education”, MoMLeT&amp;DS-2019</source>
          , Vol-
          <volume>2386</volume>
          ,
          <year>2019</year>
          , pp.
          <fpage>194</fpage>
          -
          <lpage>211</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>N.</given-names>
            <surname>Boyko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Mandych</surname>
          </string-name>
          ,
          <article-title>Technologies of Object Recognition in Space for Visually Impaired People</article-title>
          ,
          <source>in: The 3rd International Conference on Informatics &amp; Data-Driven Medicine (IDDM 2020), Växjö, Sweden, November 19-21, CEUR</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>338</fpage>
          -
          <lpage>347</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>N.</given-names>
            <surname>Boyko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Tkachuk</surname>
          </string-name>
          ,
          <article-title>Processing of Medical Different Types of Data Using Hadoop and Java MapReduce</article-title>
          ,
          <source>in: The 3rd International Conference on Informatics &amp; Data-Driven Medicine (IDDM 2020)</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>405</fpage>
          -
          <lpage>414</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S.</given-names>
            <surname>Ioffe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Ch.</given-names>
            <surname>Szegedy</surname>
          </string-name>
          ,
          <article-title>Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift</article-title>
          ,
          <source>in: Proceedings of the 32nd International Conference on Machine Learning</source>
          , Lille, France, JMLR: W&amp;CP, Vol.
          <volume>37</volume>
          ,
          <year>2015</year>
          , pp.
          <fpage>448</fpage>
          -
          <lpage>456</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>