<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Handwritten Ukrainian Character Recognition using Convolutional Neural Networks and a Synthetic Dataset</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Olha</forename><surname>Zinchenko</surname></persName>
							<email>zinchenkoov@gmail.com</email>
							<affiliation key="aff0">
								<orgName type="institution">State University of Telecommunications</orgName>
								<address>
									<addrLine>Solomenska street, 7</addrLine>
									<postCode>03110</postCode>
									<settlement>Kyiv</settlement>
									<country key="UA">Ukraine</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Yevhen</forename><surname>Chychkarov</surname></persName>
							<email>chychksrovea@gmail.com</email>
							<affiliation key="aff0">
								<orgName type="institution">State University of Telecommunications</orgName>
								<address>
									<addrLine>Solomenska street, 7</addrLine>
									<postCode>03110</postCode>
									<settlement>Kyiv</settlement>
									<country key="UA">Ukraine</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Handwritten Ukrainian Character Recognition using Convolutional Neural Networks and a Synthetic Dataset</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">AC18928A4CB302C358C7F37617408188</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-12-29T06:04+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>handwriting recognition</term>
					<term>recognition of Ukrainian characters</term>
					<term>convolutional neural networks (CNN)</term>
					<term>digit recognition</term>
					<term>deep learning</term>
					<term>image processing</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>This paper considers several convolutional neural network architectures for the recognition of isolated handwritten Ukrainian characters and digits, trained on a synthetic dataset built from a set of handwritten and cursive fonts. A comparison of recognition results for several variants of images containing handwritten letters and digits, obtained with neural networks of different architectures, showed that increasing the number of convolutional layers reduces the frequency of erroneous character recognition. The size of the training dataset significantly affects the reliability of character recognition. The datasets used in this work contained from 192 to 2304 samples per class; the upper end of this range is close to the point beyond which recognition accuracy no longer improves. Reducing the number of samples per class leads to a significant decrease in recognition accuracy (from 90% accuracy on elements of real inscriptions to 40-60% with a 4-fold reduction in sample size).</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Optical character recognition (OCR) is a widely used technology. At its core is the process of classifying images of symbols, extracted from the original digital image, against the corresponding samples <ref type="bibr" target="#b0">[1]</ref>.</p><p>Information technologies based on optical recognition solve a wide range of practical tasks: identifying vehicle registration numbers from license plate images to help control traffic <ref type="bibr" target="#b1">[2]</ref>, converting printed academic records into text for storage in an electronic database, decoding ancient inscriptions and texts, and automatic data entry by optical scanning of cards or bank checks.</p><p>In most cases, modern optical recognition systems are based on deep learning neural networks <ref type="bibr" target="#b2">[3,</ref><ref type="bibr" target="#b3">4]</ref>. Convolutional neural networks (CNNs) are widely used for image processing; they are among the most popular types of deep neural network and can effectively recognize the characters present in an image <ref type="bibr" target="#b4">[5]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Literature review</head><p>Convolutional neural networks are widely used to solve optical recognition problems. They are able to automatically extract characteristic features from the input data. These properties make them a very convenient tool for computer vision problems, in particular for recognizing images of letters or numbers.</p><p>Initially, most research focused on recognition of Latin alphabet letters, but in recent years other alphabets have begun to attract attention: Arabic, Russian, Kazakh, Chinese, Indian, etc. <ref type="bibr" target="#b5">[6]</ref><ref type="bibr" target="#b6">[7]</ref><ref type="bibr" target="#b7">[8]</ref><ref type="bibr" target="#b8">[9]</ref><ref type="bibr" target="#b9">[10]</ref><ref type="bibr" target="#b10">[11]</ref>. For research on handwritten Latin alphabet recognition, the EMNIST dataset became the de facto standard <ref type="bibr" target="#b11">[12]</ref>. Many variants of neural network architectures have been proposed to classify the images of this set.</p><p>One of the first successful attempts to use deep learning for character recognition was the creation of the LeNet-5 architecture <ref type="bibr" target="#b12">[13]</ref>. This architecture showed the highest accuracy of handwritten digit classification among the solutions available at the time <ref type="bibr">(1998)</ref>.</p><p>Similar solutions are still widely used in relatively recent works on low-power computers. For example, the ConvNet architecture proposed in <ref type="bibr" target="#b13">[14]</ref> consists of two convolutional layers with 5x5 kernels, each followed by a ReLU activation and a MaxPooling layer, and two fully connected layers: a hidden layer of 500 neurons and an output layer over 26 classes. 
Such a neural network has only 60,000 learning parameters. This number of parameters is much smaller than the AlexNet network (60 million training parameters and 650,000 neurons) <ref type="bibr" target="#b14">[15]</ref> or the GoogleNet network (6.8 million training parameters) <ref type="bibr" target="#b15">[16]</ref>.</p><p>The best results for training handwritten digit recognition models using the EMNIST Digits (or MNIST) datasets were achieved using convolutional neural networks (see <ref type="bibr" target="#b17">[17]</ref> for a review).</p><p>One way to improve the accuracy of letter or number image recognition is to use models with a more complex architecture than AlexNet or LeNet. For example, good results in recognition accuracy were achieved due to the use of capsular layers <ref type="bibr" target="#b18">[18]</ref>. The authors of <ref type="bibr" target="#b19">[19]</ref> proposed a convolutional neural network that contains 14 convolutional layers to represent character characteristics, two MaxPooling layers to reduce the number of features or highlight strong features, one softmax layer, and one classification layer for isolated character recognition.</p><p>Pre-training using ImageNet accelerates convergence, especially at the beginning of training. However, for models with random initialization, the results achieved do not differ from models with pretraining for a comparable number of epochs <ref type="bibr" target="#b20">[20]</ref>.</p><p>According to <ref type="bibr" target="#b22">[21]</ref>, models created from scratch, as a rule, give better results compared to pre-trained models in the recognition of handwritten characters of the Arabic language. Regarding the complexity of the CNN architectures used, according to <ref type="bibr" target="#b22">[21]</ref>, less complex CNN models are less accurate, but have higher classification and learning rates (and vice versa). 
Based on the obtained results, the authors of <ref type="bibr" target="#b22">[21]</ref> suggested that training all of the models used from scratch can improve classification accuracy and the speed of obtaining results, regardless of model complexity.</p><p>Numerous studies on handwritten symbol recognition report experience with fairly complex neural network architectures. For example, in <ref type="bibr" target="#b23">[22]</ref>, modern pre-trained CNN architectures were used to classify 231 different Bangla handwritten characters from the CMATERdb dataset. The images were first converted to black-and-white form with a white foreground. Images were resized to 28×28 pixels and used as input to the CNN architectures under study. The learning rate was set to 0.001, and categorical cross-entropy was used as the loss function. After 50 epochs, InceptionResNetV2 achieved the best accuracy (96.99%). DenseNet121 and InceptionNetV3 also demonstrated excellent recognition accuracy (96.55% and 96.20%, respectively). The authors of <ref type="bibr" target="#b23">[22]</ref> also considered a combination of the pre-trained architectures InceptionResNetV2, InceptionNetV3 and DenseNet121, which provided even better recognition accuracy (97.69%) than the individual CNN architectures, but concluded that it requires large computing power and memory and is therefore hard to use in practice. The models were tested on cases where character recognition appears difficult to a human, and all architectures showed the same ability to reliably recognize such images. 
According to <ref type="bibr" target="#b23">[22]</ref>, the InceptionResNetV2 architecture can be called the most efficient model, taking into account its computational complexity, memory footprint, and ability to recognize distorted symbols.</p><p>Studies of various neural network architectures without prior training on ImageNet are also known. For example, in <ref type="bibr" target="#b24">[23]</ref>, two variants of convolutional neural networks with different architectures, varying the depth, width, and number of network parameters, were tested for recognition of Devanagari characters.</p><p>The first model consisted of three convolutional layers and one fully connected layer. The second model came from the LeNet family and consisted of two convolutional layers followed by two fully connected layers. The best recognition accuracy (over 98%) was obtained with the model with more convolutional layers.</p><p>A similar result was obtained in <ref type="bibr" target="#b25">[24]</ref>. The authors investigated three CNN architecture variants: LeNet-5, a modified variant of LeNet, and AlexNet. With the last of these, a Devanagari character recognition accuracy of 99% was achieved.</p><p>Numerous experiments with several convolutional neural networks (basic CNN, VGG-16, and ResNet) were conducted in <ref type="bibr" target="#b26">[25]</ref> using regularization approaches such as filtering and data augmentation. The VGG and ResNet architectures gave close recognition accuracy: the ResNet architecture achieved the best result, with a recognition rate of 98.57%, while the VGG-16 architecture achieved 97.14%.</p><p>The work <ref type="bibr" target="#b27">[26]</ref> also noted the higher recognition accuracy achieved when using a deeper CNN architecture. 
But increasing recognition accuracy is achieved only by using input data augmentation. In <ref type="bibr" target="#b28">[27]</ref>, different CNN architectures were investigated for recognizing the EMNIST dataset. According to <ref type="bibr" target="#b28">[27]</ref>, using the GoogleNet architecture always gives higher accuracy compared to ResNet18, but requires 2.5-2.9 times more time to train the model.</p><p>Neural network architectures that use prior learning have been created to classify color images of different sizes. Therefore, for many datasets (e.g., EMNIST Letters 28×28), single-channel images must be converted to three-channel to use existing libraries and pre-training capabilities <ref type="bibr" target="#b28">[27]</ref>. In particular, the ResNet module from the tensorflow package requires an input image with a size of at least 32×32×3.</p><p>When using a modified CNN architecture and training models without loading the weights of the pre-trained model, the input data may contain single-channel images. When comparing variants of color and monochrome image recognition <ref type="bibr" target="#b28">[27]</ref>, it is indicated that variants with an input image size of 40×40 pixels (for the resized EMNIST data set) in monochrome versions with rotation and shift augmentation have the highest results in the models studied by the authors (ResNet18 and GoogleNet).</p><p>For the recognition of Cyrillic characters, similar studies are quite few. There is experience in using the MobileNet architecture, which included 30 layers <ref type="bibr" target="#b29">[28]</ref> for character recognition of the Kazakh and Russian languages.</p><p>Some results of Cyrillic character recognition are also presented in <ref type="bibr" target="#b31">[29]</ref><ref type="bibr" target="#b32">[30]</ref>.</p><p>Regarding the data set for the recognition of Ukrainian letters, individual works in this direction are known. 
According to <ref type="bibr" target="#b33">[31]</ref>, when creating a data set for model training, it is necessary to distinguish between uppercase and lowercase letters, as well as take into account the possibility of different spellings of the same letter. The authors <ref type="bibr" target="#b33">[31]</ref> identified more than 70 classes that form a complete set of symbols of the Ukrainian language (for example, different spellings of the lowercase letter "a" were taken into account).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Experimental setup and proposed approach</head><p>There are quite a few studies of handwriting recognition technologies that are based on the use of the EMNIST data set <ref type="bibr" target="#b17">[17]</ref> (at least for English). There is known experience of using various classifiers and neural network technologies to recognize Cyrillic alphabet symbols, but comparative studies of recognition technologies for them are fragmentary. Also, there are no EMNIST-like datasets for the Ukrainian alphabet.</p><p>This article is devoted to researching the possibilities of recognizing Cyrillic (mainly Ukrainian) handwritten letters using convolutional neural networks and analyzing the influence of the selected neural network architecture on the accuracy and reliability of recognition. In addition, the possibility of using a synthetic data set and the effect of augmentation of the original data set on the recognition results were investigated.</p><p>The goals of this study:</p><p>• Analysis of the influence of the architecture of convolutional neural networks on the accuracy of recognition of handwritten numbers and letters of the Ukrainian alphabet.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>•</head><p>Analysis of the peculiarities of the recognition of Ukrainian symbols under the conditions of learning convolutional neural networks using a synthetic data set with various options for increasing the training sample.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Building a dataset for model training</head><p>The dataset used for training the models was built using a set of handwritten and italic fonts (a total of 48 font variants with Ukrainian glyphs were selected). All images of letters and numbers were divided into 76 classes (33 lowercase letters, 33 uppercase letters, and 10 digits) or 43 classes (33 letters and 10 digits). All images of the dataset were centered, and datasets were created from them with each image sized 28x28, 32x32, 64x64, or 128x128 pixels. The Pillow library was used to create and transform images with letters or numbers (including conversion from one-channel to three-channel).</p><p>The test dataset was built using the same fonts. Specific fonts and augmentation options were chosen randomly. The volume of the test dataset was about 10% of that of the training one.</p><p>Because only a small number of suitable fonts with Ukrainian glyphs were available, augmentation was required to form the necessary dataset. We used the ImageDataGenerator from the tensorflow package to perform three kinds of character image transformation: random rotation, shift, and scaling.</p><p>The number of generated images varied from 2 to 48 per symbol. At 32 images per symbol per font, the total volume of the dataset was 116,736 samples. This sample volume is quite comparable to the EMNIST Letters dataset <ref type="bibr" target="#b11">[12,</ref><ref type="bibr" target="#b17">17]</ref>, which contains mixed lowercase and uppercase letters (26 classes and a total of 145,600 samples).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Preprocessing of images for recognition</head><p>Tools from the OpenCV library were used to select the image regions containing letters or numbers, which were then recognized. The findContours function or the maximally stable extremal regions (MSER) algorithm was used to select the contours of recognized symbols.</p><p>The algorithm for preprocessing the image and selecting the areas containing letters or numbers included the following stages:</p><p>1. Image filtering to reduce the noise level (a Gaussian filter was used: function cv2.GaussianBlur);</p><p>2. Binarization of the image to cut off noise (the cv2.threshold function was used, with its parameters chosen for reliable selection of character contours);</p><p>3. Morphological transformation (dilation, function cv2.dilate; several iterations were used);</p><p>4. Selection of contours and their sorting (contours were extracted using the cv2.findContours function);</p><p>5. Image segmentation, i.e. selection of recognition areas as a set of rectangles containing the contours of letters and numbers (the cv2.boundingRect function was used).</p><p>For recognition itself, the selected regions of interest were cut from the original image and binarized again, after which the resulting images of individual symbols (without dilation or other distortions) were scaled to the image size of the dataset. Each pixel value was in the range 0 to 255, so the pixel values were normalized by dividing by 255 so that all values in the array describing the image lay in the range 0 to 1.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.">Proposed CNN Architectures</head><p>At the first stage of the research, the models were trained using single-channel images sized 28x28 pixels. The simplest variants of the architecture of LeNet-type convolutional neural networks for character image recognition are presented in Table <ref type="table" target="#tab_0">1</ref>.</p><p>More complex neural network architectures are presented in Table <ref type="table" target="#tab_1">2</ref>. Architectures 4 and 5 are implementations of the AlexNet architecture for single-channel images. Architecture 6 included thirteen convolutional layers and three dense layers, as well as MaxPooling and Dropout layers. This variant is the most complex and reproduces the VGG-16 architecture for single-channel images. It turned out to be the best in terms of accuracy and reliability of recognizing both the test sample and real inscriptions. Architecture 1 included an input layer, one convolutional block of two layers, a MaxPooling subsampling layer, a Dropout regularization layer, a Flatten dimensionality transformation layer, a dense layer, another regularization layer, and an output layer.</p><p>Two more variants of convolutional neural network architecture with an increased number of convolutional blocks are also presented in Table <ref type="table" target="#tab_0">1</ref> (architecture 2 and architecture 3). They differ from the simplest variant in the number of two-layer convolutional blocks (two blocks in architecture 2, three blocks in architecture 3).</p><p>At the second stage of the research, given that recognition errors remained even with the best models, several more complex neural network architectures were considered. 
Research has been done with the VGG16 and VGG19 <ref type="bibr" target="#b34">[32]</ref>, ResNet <ref type="bibr" target="#b35">[33]</ref> or ResNetV2 <ref type="bibr" target="#b36">[34]</ref>, MobileNet or MobileNetV2 <ref type="bibr" target="#b37">[35,</ref><ref type="bibr" target="#b38">36]</ref>, and InceptionResNetV2 <ref type="bibr" target="#b39">[37]</ref> architectures.</p><p>Several variants of the implemented architectures for the ResNetV2 family are shown in Figure <ref type="figure">1</ref>. An increase in the number of neural network parameters due to a deeper architecture leads to an increase in recognition accuracy. The computation time during neural network training grows with the number of adjustable parameters (when comparing architectures 1 and 6, by approximately an order of magnitude).</p><p>However, when trying to recognize images with real inscriptions that do not belong to the training or test sample, a significant difference in the behavior of the studied architecture variants was found with respect to reliable character recognition.</p><p>A typical example of recognizing an inscription containing letters is given in Table <ref type="table">3</ref>. As can be seen from the obtained results, 100% recognition accuracy is provided only by the most complex architecture variant (variant 6).</p><p>An attempt to recognize an inscription containing only numbers showed an even more pronounced difference in accuracy; see Table <ref type="table">4</ref>. 
Similar results were obtained for many other variants of inscriptions, including those containing letters and numbers at the same time: acceptable recognition accuracy was obtained with the more complex architecture variants.</p><p>Recognition errors still occurred on some inscription samples even when using deep architectures.</p><p>Neural networks of all architectures were trained using the Adam optimizer with a learning rate of 0.0001 for 50 epochs.</p><p>The size of the training sample strongly affects the reliability of character recognition. The generation of 1,536 images per letter or number (32 images for each character across 48 font types) is effectively the lower limit for acceptable recognition accuracy. Reducing the sample size leads to a significant decrease in recognition accuracy (from 100% accuracy to 40-60% when the sample size is reduced by a factor of 4). Increasing the sample size leads to a noticeable increase in model training time.</p><p>The use of ResNet or MobileNet architectures required forming the training dataset from three-channel images. It was established that reliable recognition of various alphanumeric inscriptions for all model architecture variants was achieved using a training set of sufficient size.</p><p>Training a model on three-channel images, especially as the resolution of the training sample increases, is a very resource-intensive process. Therefore, the authors reduced the number of recognized classes to 43, abandoning the distinction between lowercase and uppercase letters.</p><p>An example of the recognition result for alphabetic and digital inscriptions is shown in Figure <ref type="figure" target="#fig_3">4</ref>. Comparing different model architectures, all the options considered showed test-set recognition accuracy in the range of 99.2-99.6% when trained on a dataset of sufficient volume. 
An increase in the number of samples in the training dataset led to an increase in recognition accuracy for all the considered architectures. An example of the experimental results for the model with the MobileNet architecture is shown in Figure <ref type="figure" target="#fig_4">5</ref>.</p><p>Recognition accuracy of 80-90% on real inscriptions was achieved with a training sample of at least 700, and preferably more than 1500, images per class. An example of the experimental results for the model with the MobileNetV2 architecture is shown in Figure <ref type="figure" target="#fig_5">6</ref>.</p><p>Variation of the parameters of the transformations used for augmentation also has a noticeable effect on the recognition results: deforming or rotating the image by more than 10-15% increases the frequency of errors.</p><p>Increasing the resolution of the training sample images had little effect on the results due to saturation.</p><p>For example, when training a model with the MobileNetV2 architecture on a 32x32 dataset, the test-set recognition accuracy was 98%; on a 64x64 dataset, 99%; and on a 128x128 dataset, 99.5% (an example is shown in Figure <ref type="figure" target="#fig_6">7(a)</ref>). 
However, for other architectures, the effect of increasing the resolution was much less pronounced.</p><p>The number of errors in recognizing elements of real inscriptions changed little: for the model with the ResNet152V2 architecture, increasing the resolution of the training images reduced the proportion of erroneous recognitions from 18.0% to 11.4% (Figure <ref type="figure" target="#fig_6">7</ref> (b)), while for models with the MobileNet or MobileNetV2 architecture it practically did not change. However, with an increase in the resolution of the training sample, the time spent on training increased quite significantly (by more than an order of magnitude).</p><p>When using deep neural networks to recognize letters or numbers, the reliability of recognizing elements of real inscriptions depended primarily on the size of the training dataset.</p><p>The recognition accuracy on the test dataset after training all model variants was quite high (97-98% and above). However, small training datasets (300-500 images per class) practically did not provide reliable recognition.</p><p>The use of a model with the InceptionResNetV2 architecture, which requires a training image resolution of at least 75x75x3 (in fact, the model was trained on 128x128x3 images), did not lead to a noticeable increase in recognition accuracy.</p><p>In general, when comparing the achieved accuracy of real-image recognition and the speed of model training, the best performance was provided by models of the ResNetV2 or MobileNetV2 family. </p></div>
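For reference, the simplest of the evaluated configurations (Architecture 1 from Table 1), together with the training settings stated above (Adam optimizer, learning rate 0.0001, 50 epochs), can be sketched in Keras. The 3x3 kernels, ReLU activations, and dropout rates are assumptions, since the paper lists only the layer types and filter counts.

```python
# A minimal Keras sketch of "Architecture 1" from Table 1.
# Kernel sizes, activations, and dropout rates are assumed, not stated in the paper.
from tensorflow.keras import layers, models, optimizers

def build_architecture_1(num_classes=76):
    model = models.Sequential([
        layers.Input(shape=(28, 28, 1)),                     # single-channel 28x28
        layers.Conv2D(128, (3, 3), activation="relu", padding="same"),
        layers.Conv2D(128, (3, 3), activation="relu", padding="same"),
        layers.MaxPooling2D((2, 2)),
        layers.Dropout(0.25),
        layers.Flatten(),
        layers.Dense(256, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation="softmax"),     # 76 classes
    ])
    # Training settings stated in the text: Adam with learning rate 0.0001.
    model.compile(optimizer=optimizers.Adam(learning_rate=1e-4),
                  loss="categorical_crossentropy", metrics=["accuracy"])
    return model
```

Training would then follow the stated settings, e.g. model.fit(x_train, y_train, epochs=50, validation_data=(x_test, y_test)); Architectures 2 and 3 add further two-layer convolutional blocks in the same way.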
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Conclusions</head><p>In this work, several variants of convolutional neural network architecture for the recognition of isolated handwritten digits and Ukrainian letters are considered.</p><p>The results of recognizing various images containing letters and numbers were compared across models with different architectures. It was established that when training a model on a set of single-channel 28x28 images, increasing the number of convolutional layers in most cases improves the reliability of recognition. Among the options considered, the best accuracy and reliability of recognition was provided by a model with a VGG16-type architecture, which included 13 convolutional and three dense layers.</p><p>The feasibility of training convolutional neural networks on a synthetic dataset built from handwritten or cursive fonts is demonstrated. The size of the training dataset significantly affects the reliability of character recognition. The datasets used in this work contained from 192 to 2304 samples per class.</p><p>The lower limit of the sample size that provides acceptable recognition accuracy was 1536 samples per class. Reducing the number of samples per class leads to a significant decrease in recognition accuracy (from 90% recognition accuracy on elements of real inscriptions to 40-60% with a 4-fold decrease in sample size). 
An increase in the volume of the training dataset did not improve the accuracy or reliability of recognition, but led to a significant increase in model training time. An increase in the image resolution of the training dataset from 32x32x3 to 128x128x3 in most cases did not improve the reliability of real-image recognition.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Examples of implementation of models with architectures of the ResNetV2 family</figDesc><graphic coords="6,375.85,122.60,147.43,149.40" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: Results of learning the VGG-16-type model. (Table 3: A sample of the results of recognizing an inscription with letters.)</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 4 .</head><label>4</label><figDesc>The figure shows the selected areas of interest and recognition results.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: An example of recognition results using the VGG16 neural network (in this case, all letters and numbers are recognized accurately)</figDesc><graphic coords="8,103.25,371.16,387.95,96.99" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 5 :</head><label>5</label><figDesc>Figure 5: An example of the influence of the size of the training dataset on the achieved recognition accuracy (MobileNet architecture).</figDesc><graphic coords="9,210.24,415.68,232.74,118.98" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_5"><head>Figure 6 :</head><label>6</label><figDesc>Figure 6: Recognition errors of real inscriptions depending on the size of the training dataset (MobileNetV2 architecture, 32x32x3 dataset images).</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_6"><head>Figure 7 :</head><label>7</label><figDesc>Figure 7: An example of the influence of the training dataset images resolution on the achieved recognition accuracy (MobileNetV2 and ResNet152v2 architecture).</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>The simplest variants of convolutional neural network architecture</figDesc><table><row><cell>Architecture 1</cell><cell>Architecture 2</cell><cell>Architecture 3</cell></row><row><cell>Input (28x28x1)</cell><cell>Input (28x28x1)</cell><cell>Input (28x28x1)</cell></row><row><cell>conv2d, 128 filters</cell><cell>conv2d, 64 filters</cell><cell>conv2d, 128 filters</cell></row><row><cell>conv2d, 128 filters</cell><cell>conv2d, 64 filters</cell><cell>conv2d, 128 filters</cell></row><row><cell>MaxPooling2D</cell><cell>MaxPooling2D</cell><cell>MaxPooling2D</cell></row><row><cell>Dropout</cell><cell>Dropout</cell><cell>Dropout</cell></row><row><cell>Flatten</cell><cell>conv2d, 128 filters</cell><cell>conv2d, 256 filters</cell></row><row><cell>Dense, 256 filters</cell><cell>conv2d, 128 filters</cell><cell>conv2d, 256 filters</cell></row><row><cell>Dropout</cell><cell>MaxPooling2D</cell><cell>MaxPooling2D</cell></row><row><cell>Dense ( output -76 classes)</cell><cell>Dropout</cell><cell>Dropout</cell></row><row><cell></cell><cell>Flatten</cell><cell>conv2d, 512 filters</cell></row><row><cell></cell><cell>Dense, 256 filters</cell><cell>conv2d, 512 filters</cell></row><row><cell></cell><cell>Dropout</cell><cell>MaxPooling2D</cell></row><row><cell></cell><cell>Dense ( output -76 classes)</cell><cell>Dropout</cell></row><row><cell></cell><cell></cell><cell>Flatten</cell></row><row><cell></cell><cell></cell><cell>Dense, 1024 filters</cell></row><row><cell></cell><cell></cell><cell>Dropout</cell></row><row><cell></cell><cell></cell><cell>Dense ( output -76 classes)</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2</head><label>2</label><figDesc>Variants of the architecture of convolutional neural networks such as AlexNet and VGG 16</figDesc><table><row><cell>Architecture 4</cell><cell>Architecture 5</cell><cell>Architecture 6</cell></row><row><cell>Input (28x28x1)</cell><cell>Input (28x28x1)</cell><cell>Input (28x28x1)</cell></row><row><cell>conv2d, 128 filters</cell><cell>conv2d, 64 filters</cell><cell>conv2d, 128 filters</cell></row><row><cell>conv2d, 128 filters</cell><cell>conv2d, 64 filters</cell><cell>conv2d, 128 filters</cell></row><row><cell>MaxPooling2D</cell><cell>MaxPooling2D</cell><cell>MaxPooling2D</cell></row><row><cell>Dropout</cell><cell>Dropout</cell><cell>Dropout</cell></row><row><cell>Flatten</cell><cell>conv2d, 128 filters</cell><cell>conv2d, 256 filters</cell></row><row><cell>Dense, 256 filters</cell><cell>conv2d, 128 filters</cell><cell>conv2d, 256 filters</cell></row><row><cell>Dropout</cell><cell>MaxPooling2D</cell><cell>MaxPooling2D</cell></row><row><cell>Dense</cell><cell>Dropout</cell><cell>Dropout</cell></row><row><cell>(output -76 classes)</cell><cell>Flatten</cell><cell>conv2d, 512 filters</cell></row><row><cell></cell><cell>Dense, 256 filters</cell><cell>conv2d, 512 filters</cell></row><row><cell></cell><cell>Dropout</cell><cell>MaxPooling2D</cell></row><row><cell></cell><cell>Dense (output -76 classes)</cell><cell>Dropout</cell></row><row><cell></cell><cell></cell><cell>Flatten</cell></row><row><cell></cell><cell></cell><cell>Dense, 1024 filters</cell></row><row><cell></cell><cell></cell><cell>Dropout</cell></row><row><cell></cell><cell></cell><cell>Dense (output -76 classes)</cell></row></table></figure>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Optical Character Recognition Systems for Different Languages with Soft Computing</title>
		<author>
			<persName><forename type="first">A</forename><surname>Chaudhuri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Mandaviya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">K</forename><surname>Ghosh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Badelia</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-319-50252-6</idno>
	</analytic>
	<monogr>
		<title level="m">Studies in fuzziness and soft computing</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2017">2017</date>
			<biblScope unit="volume">352</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Toward end-to-end car license plate detection and recognition with deep neural networks</title>
		<author>
			<persName><forename type="first">H</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Shen</surname></persName>
		</author>
		<idno type="DOI">10.1109/TITS.2018.2847291</idno>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Intelligent Transportation Systems</title>
		<imprint>
			<biblScope unit="volume">20</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="1126" to="1136" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">A neural network approach to character recognition</title>
		<author>
			<persName><forename type="first">A</forename><surname>Rajavelu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">T</forename><surname>Musavi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">V</forename><surname>Shirvaikar</surname></persName>
		</author>
		<idno type="DOI">10.1016/0893-6080(89)90023-3</idno>
	</analytic>
	<monogr>
		<title level="j">Neural Networks</title>
		<imprint>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="issue">5</biblScope>
			<biblScope unit="page" from="387" to="393" />
			<date type="published" when="1989">1989</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Image character recognition using deep convolutional neural network learned from different languages</title>
		<author>
			<persName><forename type="first">J</forename><surname>Bai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Feng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Xu</surname></persName>
		</author>
		<idno type="DOI">10.1109/ICIP.2014.7025518</idno>
	</analytic>
	<monogr>
		<title level="m">IEEE International Conference on Image Processing (ICIP)</title>
				<meeting><address><addrLine>Paris, France</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2014">2014</date>
			<biblScope unit="page" from="2560" to="2564" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">CNN based common approach to handwritten character recognition of multiple scripts</title>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">S</forename><surname>Maitra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">U</forename><surname>Bhattacharya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">K</forename><surname>Parui</surname></persName>
		</author>
		<idno type="DOI">10.1109/ICDAR.2015.7333916</idno>
	</analytic>
	<monogr>
		<title level="m">13th International Conference on Document Analysis and Recognition (ICDAR)</title>
				<imprint>
			<date type="published" when="2015">2015</date>
			<biblScope unit="page" from="1021" to="1025" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Online Turkish Handwriting Recognition Using Synthetic Data</title>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">F</forename><surname>Bilgin Taşdemir</surname></persName>
		</author>
		<idno type="DOI">10.31590/ejosat.1039846</idno>
	</analytic>
	<monogr>
		<title level="j">Avrupa Bilim ve Teknoloji Dergisi</title>
		<imprint>
			<biblScope unit="volume">32</biblScope>
			<biblScope unit="page" from="649" to="656" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Handwritten Kazakh and Russian (HKR) database for text recognition</title>
		<author>
			<persName><forename type="first">D</forename><surname>Nurseitov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Bostanbekov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Kurmankhojayev</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Alimova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Abdallah</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Tolegenov</surname></persName>
		</author>
		<idno type="DOI">10.1007/s11042-021-11399-6</idno>
	</analytic>
	<monogr>
		<title level="j">Multimedia Tools Appl</title>
		<imprint>
			<biblScope unit="volume">80</biblScope>
			<biblScope unit="page" from="33075" to="33097" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Attention-Based Fully Gated CNN-BGRU for Russian Handwritten Text</title>
		<author>
			<persName><forename type="first">A</forename><surname>Abdallah</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Hamada</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Nurseitov</surname></persName>
		</author>
		<idno type="DOI">10.3390/jimaging6120141</idno>
	</analytic>
	<monogr>
		<title level="j">Journal of Imaging</title>
		<imprint>
			<biblScope unit="volume">6</biblScope>
			<biblScope unit="issue">12</biblScope>
			<biblScope unit="page">141</biblScope>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">An intelligent approach for Arabic handwritten letter recognition using convolutional neural network</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Ullah</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Jamjoom</surname></persName>
		</author>
		<idno type="DOI">10.7717/peerj-cs.995</idno>
	</analytic>
	<monogr>
		<title level="j">PeerJ Computer Science</title>
		<imprint>
			<biblScope unit="volume">8</biblScope>
			<biblScope unit="page">e995</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Handwritten Letter Recognition using Artificial Intelligence</title>
		<author>
			<persName><forename type="first">D</forename><surname>Jeevitha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Muthu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Nila</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Santhoshi</surname></persName>
		</author>
		<idno type="DOI">10.22214/ijraset.2022.42949</idno>
	</analytic>
	<monogr>
		<title level="j">International Journal for Research in Applied Science and Engineering Technology</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="page" from="2752" to="2758" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">An exploratory study on the handwritten allographic features of multi-ethnic population with different educational backgrounds</title>
		<author>
			<persName><forename type="first">L</forename><surname>Gannetion</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">Y</forename><surname>Wong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">Y</forename><surname>Lim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">H</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">F L</forename><surname>Abdullah</surname></persName>
		</author>
		<idno type="DOI">10.1371/journal.pone.0268756</idno>
	</analytic>
	<monogr>
		<title level="j">PloS one</title>
		<imprint>
			<biblScope unit="volume">17</biblScope>
			<biblScope unit="issue">10</biblScope>
			<biblScope unit="page">e0268756</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">EMNIST: Extending MNIST to handwritten letters</title>
		<author>
			<persName><forename type="first">G</forename><surname>Cohen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Afshar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Tapson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Van Schaik</surname></persName>
		</author>
		<idno type="DOI">10.48550/arxiv.1702.05373</idno>
	</analytic>
	<monogr>
		<title level="m">2017 international joint conference on neural networks (IJCNN)</title>
				<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="2921" to="2926" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Backpropagation Applied to Handwritten Zip Code Recognition</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Lecun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">E</forename><surname>Boser</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">S</forename><surname>Denker</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Henderson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">E</forename><surname>Howard</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">E</forename><surname>Hubbard</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">D</forename><surname>Jackel</surname></persName>
		</author>
		<idno type="DOI">10.1162/neco.1989.1.4.541</idno>
	</analytic>
	<monogr>
		<title level="j">Neural Computation</title>
		<imprint>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="541" to="551" />
			<date type="published" when="1989">1989</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Real-Time Handwritten Letters Recognition on an Embedded Computer Using ConvNets</title>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">Núñez</forename><surname>Fernández</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Hosseini</surname></persName>
		</author>
		<idno type="DOI">10.1109/SHIRCON.2018.8592981</idno>
	</analytic>
	<monogr>
		<title level="m">IEEE Sciences and Humanities International Research Conference (SHIRCON)</title>
				<meeting><address><addrLine>Lima, Peru</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="1" to="4" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">ImageNet classification with deep convolutional neural networks</title>
		<author>
			<persName><forename type="first">Alex</forename><surname>Krizhevsky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ilya</forename><surname>Sutskever</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Geoffrey</forename><forename type="middle">E</forename><surname>Hinton</surname></persName>
		</author>
		<idno type="DOI">10.1145/3065386</idno>
	</analytic>
	<monogr>
		<title level="j">Commun. ACM</title>
		<imprint>
			<biblScope unit="volume">60</biblScope>
			<biblScope unit="page" from="84" to="90" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<monogr>
		<title/>
		<author>
			<persName><forename type="first">C</forename><surname>Szegedy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Jia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Sermanet</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">E</forename><surname>Reed</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Anguelov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Erhan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Vanhoucke</surname></persName>
		</author>
		<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Going deeper with convolutions</title>
		<author>
			<persName><forename type="first">A</forename><surname>Rabinovich</surname></persName>
		</author>
		<idno type="DOI">10.1109/CVPR.2015.7298594</idno>
	</analytic>
	<monogr>
		<title level="m">IEEE Conference on Computer Vision and Pattern Recognition (CVPR)</title>
				<imprint>
			<date type="published" when="2015">2015</date>
			<biblScope unit="page" from="1" to="9" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">A Survey of Handwritten Character Recognition with MNIST and EMNIST</title>
		<author>
			<persName><forename type="first">A</forename><surname>Baldominos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Saez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Isasi</surname></persName>
		</author>
		<idno type="DOI">10.3390/app9153169</idno>
	</analytic>
	<monogr>
		<title level="j">Appl. Sci</title>
		<imprint>
			<biblScope unit="volume">9</biblScope>
			<biblScope unit="issue">15</biblScope>
			<biblScope unit="page">3169</biblScope>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Handwritten Indic Character Recognition using Capsule Networks</title>
		<author>
			<persName><forename type="first">B</forename><surname>Mandal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Dubey</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ghosh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Sarkhel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Das</surname></persName>
		</author>
		<idno type="DOI">10.1109/ASPCON.2018.8748550</idno>
	</analytic>
	<monogr>
		<title level="m">IEEE Applied Signal Processing Conference (ASPCON)</title>
				<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="304" to="308" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Recognition of isolated characters across different input interfaces using 2D DCNN</title>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">S</forename><surname>Yadav</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Monsley</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">A</forename><surname>Barlaskar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Ahmad</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">H</forename><surname>Laskar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">K</forename><surname>Bhuyan</surname></persName>
		</author>
		<idno type="DOI">10.1109/TENCON54134.2021.9707451</idno>
	</analytic>
	<monogr>
		<title level="m">TENCON 2021-2021 IEEE Region 10 Conference (TENCON)</title>
				<meeting><address><addrLine>Auckland, New Zealand</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="504" to="509" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<monogr>
		<author>
			<persName><forename type="first">K</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">B</forename><surname>Girshick</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Dollár</surname></persName>
		</author>
		<title level="m">Rethinking ImageNet Pre-Training</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<monogr>
		<idno type="DOI">10.1109/ICCV.2019.00502</idno>
		<title level="m">IEEE/CVF International Conference on Computer Vision (ICCV)</title>
				<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="4917" to="4926" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">Intelligent Arabic Handwriting Recognition Using Different Standalone and Hybrid CNN Architectures</title>
		<author>
			<persName><forename type="first">W</forename><surname>Albattah</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Albahli</surname></persName>
		</author>
		<idno type="DOI">10.3390/app121910155</idno>
	</analytic>
	<monogr>
		<title level="j">Appl. Sci</title>
		<imprint>
			<biblScope unit="volume">12</biblScope>
			<biblScope unit="page">10155</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Performance Analysis of State of the Art Convolutional Neural Network Architectures in Bangla Handwritten Character Recognition</title>
		<author>
			<persName><forename type="first">Tapotosh</forename><surname>Ghosh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Min-Ha-Zul</forename><surname>Abedin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Hasan</forename><surname>Al Banna</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Nasirul</forename><surname>Mumenin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Mohammad Abu</forename><surname>Yousuf</surname></persName>
		</author>
		<idno type="DOI">10.1134/S1054661821010089</idno>
	</analytic>
	<monogr>
		<title level="j">Pattern Recognit. Image Anal</title>
		<imprint>
			<biblScope unit="volume">31</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="60" to="71" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Handwritten devanagari character recognition using deep learning -convolutional neural network (cnn) model</title>
		<author>
			<persName><forename type="first">A</forename><surname>Bhardwaj</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Singh</surname></persName>
		</author>
		<ptr target="https://archives.palarch.nl/index.php/jae/article/view/2203" />
	</analytic>
	<monogr>
		<title level="j">PalArch&apos;s Journal of Archaeology of Egypt/Egyptology</title>
		<imprint>
			<biblScope unit="volume">17</biblScope>
			<biblScope unit="issue">6</biblScope>
			<biblScope unit="page" from="7965" to="7984" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">Handwritten Devanagari Character Recognition Using Modified Lenet and Alexnet Convolution Neural Networks</title>
		<author>
			<persName><forename type="first">Duddela</forename><surname>Sai Prashanth</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Vasanth Kumar Mehta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Kadiyala</forename><surname>Ramana</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Vidhyacharan</forename><surname>Bhaskar</surname></persName>
		</author>
		<idno type="DOI">10.1007/s11277-021-08903-4</idno>
	</analytic>
	<monogr>
		<title level="j">Wirel. Pers. Commun</title>
		<imprint>
			<biblScope unit="volume">122</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="349" to="378" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">Recognizing Arabic Handwritten Literal Amount Using Convolutional Neural Networks</title>
		<author>
			<persName><forename type="first">Aicha</forename><surname>Korichi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Slatnia</forename><surname>Sihem</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Tagougui</forename><surname>Najiba</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Zouari</forename><surname>Ramzi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Aiadi</forename><surname>Oussama</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-030-96311-8_15</idno>
	</analytic>
	<monogr>
		<title level="m">Artificial Intelligence and Its Applications</title>
				<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="153" to="165" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<analytic>
		<title level="a" type="main">A new Arabic handwritten character recognition deep learning system (AHCR-DLS)</title>
		<author>
			<persName><forename type="first">Hossam Magdy</forename><surname>Balaha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Hesham Arafat</forename><surname>Ali</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Mohamed</forename><surname>Saraya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Mahmoud</forename><surname>Badawy</surname></persName>
		</author>
		<idno type="DOI">10.1007/s00521-020-05397-2</idno>
	</analytic>
	<monogr>
		<title level="j">Neural Comput. Appl</title>
		<imprint>
			<biblScope unit="volume">33</biblScope>
			<biblScope unit="issue">11</biblScope>
			<biblScope unit="page" from="6325" to="6367" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b28">
	<analytic>
		<title level="a" type="main">An Optimized Deep Residual Network with a Depth Concatenated Block for Handwritten Characters Classification</title>
		<author>
			<persName><forename type="first">Gibrael</forename><surname>Abosamra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Hadi</forename><surname>Oqaibi</surname></persName>
		</author>
		<idno type="DOI">10.32604/cmc.2021.015318</idno>
	</analytic>
	<monogr>
		<title level="j">Computers, Materials &amp; Continua</title>
		<imprint>
			<biblScope unit="volume">68</biblScope>
			<biblScope unit="page" from="1" to="28" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b29">
	<monogr>
		<title/>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">B</forename><surname>Nurseitov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Bostanbekov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Kanatov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Alimova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Abdallah</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Abdimanap</surname></persName>
		</author>
		<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b30">
	<monogr>
		<author>
			<persName><surname>Abdimanap</surname></persName>
		</author>
		<idno type="DOI">10.25046/aj0505114</idno>
		<idno type="arXiv">arXiv:2102.04816</idno>
		<title level="m">Classification of Handwritten Names of Cities and Handwritten Text Recognition using Various Deep Learning Models</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b31">
	<monogr>
		<title level="m" type="main">Recognition of Handwritten Cyrillic Letters using PCA</title>
		<author>
			<persName><forename type="first">O</forename><surname>Vovchuk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Kyrychenko</surname></persName>
		</author>
		<ptr target="https://www.researchgate.net/publication/336987544_Recognition_of_Handwritten_Cyrillic_" />
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b32">
	<monogr>
		<ptr target="https://github.com/GregVial/CoMNIST" />
		<title level="m">Cyrillic-oriented MNIST</title>
				<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b33">
	<analytic>
		<title level="a" type="main">Economic efficiency of innovative projects of CNN modified architecture application</title>
		<author>
			<persName><forename type="first">V</forename><surname>Khavalko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Mykhailyshyn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Zhelizniak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Kovtyk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Mazur</surname></persName>
		</author>
		<ptr target="https://ceur-ws.org/Vol-2654/paper14.pdf" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the International workshop on cyber hygiene (CybHyg-2019) co-located with 1st International conference on cyber hygiene and conflict management in global information networks (CyberConf 2019)</title>
				<meeting>the International workshop on cyber hygiene (CybHyg-2019) co-located with 1st International conference on cyber hygiene and conflict management in global information networks (CyberConf 2019)<address><addrLine>Kyiv, Ukraine</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2019">November 30, 2019</date>
			<biblScope unit="volume">2654</biblScope>
			<biblScope unit="page" from="182" to="193" />
		</imprint>
	</monogr>
	<note>CEUR Workshop Proceedings</note>
</biblStruct>

<biblStruct xml:id="b34">
	<monogr>
		<title level="m" type="main">Very Deep Convolutional Networks for Large-Scale Image Recognition</title>
		<author>
			<persName><forename type="first">K</forename><surname>Simonyan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Zisserman</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.1409.1556</idno>
		<imprint>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b35">
	<analytic>
		<title level="a" type="main">Deep Residual Learning for Image Recognition</title>
		<author>
			<persName><forename type="first">K</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Sun</surname></persName>
		</author>
		<idno type="DOI">10.1109/cvpr.2016.90</idno>
	</analytic>
	<monogr>
		<title level="m">IEEE Conference on Computer Vision and Pattern Recognition (CVPR)</title>
				<imprint>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="770" to="778" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b36">
	<analytic>
		<title level="a" type="main">Identity Mappings in Deep Residual Networks</title>
		<author>
			<persName><forename type="first">Kaiming</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Xiangyu</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Shaoqing</forename><surname>Ren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jian</forename><surname>Sun</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-319-46493-0_38</idno>
	</analytic>
	<monogr>
		<title level="m">European Conference on Computer Vision (ECCV) 2016</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="630" to="645" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b37">
	<monogr>
		<title level="m" type="main">MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications</title>
		<author>
			<persName><forename type="first">Andrew</forename><forename type="middle">G</forename><surname>Howard</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Menglong</forename><surname>Zhu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Bo</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Dmitry</forename><surname>Kalenichenko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Weijun</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Tobias</forename><surname>Weyand</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Marco</forename><surname>Andreetto</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Hartwig</forename><surname>Adam</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.1704.04861</idno>
		<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b38">
	<analytic>
		<title level="a" type="main">MobileNetV2: Inverted Residuals and Linear Bottlenecks</title>
		<author>
			<persName><forename type="first">Mark</forename><surname>Sandler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Andrew</forename><forename type="middle">G</forename><surname>Howard</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Menglong</forename><surname>Zhu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Andrey</forename><surname>Zhmoginov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Liang-Chieh</forename><surname>Chen</surname></persName>
		</author>
		<idno type="DOI">10.1109/CVPR.2018.00474</idno>
	</analytic>
	<monogr>
		<title level="m">IEEE/CVF Conference on Computer Vision and Pattern Recognition</title>
				<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="4510" to="4520" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b39">
	<monogr>
		<author>
			<persName><forename type="first">C</forename><surname>Szegedy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ioffe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Vanhoucke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">A</forename><surname>Alemi</surname></persName>
		</author>
		<idno type="DOI">10.1609/aaai.v31i1.11231</idno>
		<title level="m">Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning</title>
				<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
