<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Handwritten Ukrainian Character Recognition using Convolutional Neural Networks and a Synthetic Dataset</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Olha</forename><surname>Zinchenko</surname></persName>
							<email>zinchenkoov@gmail.com</email>
							<affiliation key="aff0">
								<orgName type="institution">State University of Telecommunications</orgName>
								<address>
									<addrLine>Solomenska street, 7</addrLine>
									<postCode>03110</postCode>
									<settlement>Kyiv</settlement>
									<country key="UA">Ukraine</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Yevhen</forename><surname>Chychkarov</surname></persName>
							<email>chychksrovea@gmail.com</email>
							<affiliation key="aff0">
								<orgName type="institution">State University of Telecommunications</orgName>
								<address>
									<addrLine>Solomenska street, 7</addrLine>
									<postCode>03110</postCode>
									<settlement>Kyiv</settlement>
									<country key="UA">Ukraine</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Handwritten Ukrainian Character Recognition using Convolutional Neural Networks and a Synthetic Dataset</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">AC18928A4CB302C358C7F37617408188</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-12-29T06:04+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>handwriting recognition</term>
					<term>recognition of Ukrainian characters</term>
					<term>convolutional neural networks (CNN)</term>
					<term>digit recognition</term>
					<term>deep learning</term>
					<term>image processing</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>This paper considers several convolutional neural network architectures for the recognition of isolated handwritten Ukrainian characters and digits, trained on a synthetic dataset built from a set of handwritten and cursive fonts. A comparison of recognition results for several variants of images containing handwritten letters and digits, obtained with neural networks of different architectures, showed that increasing the number of convolutional layers reduces the frequency of erroneous character recognition. The size of the training dataset significantly affects the reliability of character recognition. The datasets used in this work contained from 192 to 2304 samples per class; the upper end of this range is close to the point beyond which recognition accuracy no longer improves. Reducing the number of samples per class leads to a significant decrease in recognition accuracy (from 90% accuracy on elements of real inscriptions to 40-60% with a 4-fold reduction in sample size).</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Optical character recognition (OCR) is a widely used technology. At its core is the process of classifying images of symbols, extracted from the original digital image, against the corresponding samples <ref type="bibr" target="#b0">[1]</ref>.</p><p>Information technologies based on optical recognition solve a wide range of practical tasks: identifying vehicle registration numbers from license plate images to help control traffic <ref type="bibr" target="#b1">[2]</ref>, converting printed academic records into text for storage in an electronic database, decoding ancient inscriptions and texts, and automatic data entry by optical scanning of cards or bank checks.</p><p>In most cases, modern optical recognition systems are based on deep learning neural networks <ref type="bibr" target="#b2">[3,</ref><ref type="bibr" target="#b3">4]</ref>. Convolutional neural networks (CNNs) are widely used for image processing; they are among the most popular types of deep neural network and can effectively recognize the characters present in an image <ref type="bibr" target="#b4">[5]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Literature review</head><p>Convolutional neural networks are widely used to solve optical recognition problems. They are able to automatically extract characteristic features from the input data. These properties make them a very convenient tool for computer vision problems, in particular for recognizing images of letters or numbers.</p><p>Initially, most research focused on recognition of Latin alphabet letters, but in recent years other alphabets have begun to attract attention: Arabic, Russian, Kazakh, Chinese, Indian, etc. <ref type="bibr" target="#b5">[6]</ref><ref type="bibr" target="#b6">[7]</ref><ref type="bibr" target="#b7">[8]</ref><ref type="bibr" target="#b8">[9]</ref><ref type="bibr" target="#b9">[10]</ref><ref type="bibr" target="#b10">[11]</ref>. For research on handwritten Latin alphabet recognition, the EMNIST dataset became the de facto standard <ref type="bibr" target="#b11">[12]</ref>. Many variants of neural network architectures have been proposed to classify the images of this set.</p><p>One of the first successful attempts to use deep learning for character recognition was the creation of the LeNet-5 architecture <ref type="bibr" target="#b12">[13]</ref>. This architecture showed the highest accuracy of handwritten digit classification among the solutions available at the time <ref type="bibr">(1998)</ref>.</p><p>Similar solutions are still widely used in relatively recent works on low-power computers. For example, the ConvNet architecture proposed in <ref type="bibr" target="#b13">[14]</ref> consists of two convolutional layers with 5x5 kernels, each followed by a ReLU activation and a MaxPooling layer, and two fully connected layers: a hidden layer of 500 neurons and an output layer over 26 classes. 
Such a neural network has only 60,000 learning parameters. This number of parameters is much smaller than the AlexNet network (60 million training parameters and 650,000 neurons) <ref type="bibr" target="#b14">[15]</ref> or the GoogleNet network (6.8 million training parameters) <ref type="bibr" target="#b15">[16]</ref>.</p><p>The best results for training handwritten digit recognition models using the EMNIST Digits (or MNIST) datasets were achieved using convolutional neural networks (see <ref type="bibr" target="#b17">[17]</ref> for a review).</p><p>One way to improve the accuracy of letter or number image recognition is to use models with a more complex architecture than AlexNet or LeNet. For example, good results in recognition accuracy were achieved due to the use of capsular layers <ref type="bibr" target="#b18">[18]</ref>. The authors of <ref type="bibr" target="#b19">[19]</ref> proposed a convolutional neural network that contains 14 convolutional layers to represent character characteristics, two MaxPooling layers to reduce the number of features or highlight strong features, one softmax layer, and one classification layer for isolated character recognition.</p><p>Pre-training using ImageNet accelerates convergence, especially at the beginning of training. However, for models with random initialization, the results achieved do not differ from models with pretraining for a comparable number of epochs <ref type="bibr" target="#b20">[20]</ref>.</p><p>According to <ref type="bibr" target="#b22">[21]</ref>, models created from scratch, as a rule, give better results compared to pre-trained models in the recognition of handwritten characters of the Arabic language. Regarding the complexity of the CNN architectures used, according to <ref type="bibr" target="#b22">[21]</ref>, less complex CNN models are less accurate, but have higher classification and learning rates (and vice versa). 
Based on the obtained results, the authors of <ref type="bibr" target="#b22">[21]</ref> suggested that training all of the models used from scratch can improve classification accuracy and the speed of obtaining results, regardless of model complexity.</p><p>Numerous studies on handwritten symbol recognition report experience with fairly complex neural network architectures. For example, in <ref type="bibr" target="#b23">[22]</ref>, modern pre-trained CNN architectures were used to classify 231 different Bangla handwritten characters from the CMATERdb dataset. The images were first converted to black-and-white form with a white foreground. Images were resized to 28×28 pixels and used as input to the CNN architectures under study. The learning rate was set to 0.001, and categorical cross-entropy was used as the loss function. After 50 epochs, InceptionResNetV2 achieved the best accuracy (96.99%). DenseNet121 and InceptionNetV3 also demonstrated excellent recognition accuracy (96.55% and 96.20%, respectively). The authors of <ref type="bibr" target="#b23">[22]</ref> also considered a combination of the pre-trained architectures InceptionResNetV2, InceptionNetV3 and DenseNet121, which provided even better recognition accuracy (97.69%) than the individual CNN architectures, but concluded that it requires large computing power and memory and is therefore hard to use in practice. The models were tested on cases where character recognition appears difficult to a human, and all architectures showed the same ability to reliably recognize such images. 
According to <ref type="bibr" target="#b23">[22]</ref>, the InceptionResNetV2 architecture can be called the most efficient model, taking into account its computational complexity, memory footprint, and ability to recognize distorted symbols.</p><p>Studies of various neural network architectures without prior training on ImageNet are also known. For example, in <ref type="bibr" target="#b24">[23]</ref>, two variants of convolutional neural networks with different architectures, varying the depth, width, and number of network parameters, were tested for recognition of Devanagari characters.</p><p>The first model consisted of three convolutional layers and one fully connected layer. The second model came from the LeNet family and consisted of two convolutional layers followed by two fully connected layers. The best recognition accuracy (over 98%) was obtained with the model with more convolutional layers.</p><p>A similar result was obtained in <ref type="bibr" target="#b25">[24]</ref>. The authors investigated three CNN architecture variants: LeNet-5, a modified variant of LeNet, and AlexNet. With the last of these, a Devanagari character recognition accuracy of 99% was achieved.</p><p>Numerous experiments with several convolutional neural networks (basic CNN, VGG-16, and ResNet) were conducted in <ref type="bibr" target="#b26">[25]</ref> using regularization approaches such as filtering and data augmentation. The VGG and ResNet architectures gave close recognition accuracy: the ResNet architecture achieved the best result, with a recognition rate of 98.57%, while the VGG-16 architecture achieved 97.14%.</p><p>The work <ref type="bibr" target="#b27">[26]</ref> also noted the higher recognition accuracy achieved when using a deeper CNN architecture. 
But increasing recognition accuracy is achieved only by using input data augmentation. In <ref type="bibr" target="#b28">[27]</ref>, different CNN architectures were investigated for recognizing the EMNIST dataset. According to <ref type="bibr" target="#b28">[27]</ref>, using the GoogleNet architecture always gives higher accuracy compared to ResNet18, but requires 2.5-2.9 times more time to train the model.</p><p>Neural network architectures that use prior learning have been created to classify color images of different sizes. Therefore, for many datasets (e.g., EMNIST Letters 28×28), single-channel images must be converted to three-channel to use existing libraries and pre-training capabilities <ref type="bibr" target="#b28">[27]</ref>. In particular, the ResNet module from the tensorflow package requires an input image with a size of at least 32×32×3.</p><p>When using a modified CNN architecture and training models without loading the weights of the pre-trained model, the input data may contain single-channel images. When comparing variants of color and monochrome image recognition <ref type="bibr" target="#b28">[27]</ref>, it is indicated that variants with an input image size of 40×40 pixels (for the resized EMNIST data set) in monochrome versions with rotation and shift augmentation have the highest results in the models studied by the authors (ResNet18 and GoogleNet).</p><p>For the recognition of Cyrillic characters, similar studies are quite few. There is experience in using the MobileNet architecture, which included 30 layers <ref type="bibr" target="#b29">[28]</ref> for character recognition of the Kazakh and Russian languages.</p><p>Some results of Cyrillic character recognition are also presented in <ref type="bibr" target="#b31">[29]</ref><ref type="bibr" target="#b32">[30]</ref>.</p><p>Regarding the data set for the recognition of Ukrainian letters, individual works in this direction are known. 
According to <ref type="bibr" target="#b33">[31]</ref>, when creating a data set for model training, it is necessary to distinguish between uppercase and lowercase letters, as well as take into account the possibility of different spellings of the same letter. The authors <ref type="bibr" target="#b33">[31]</ref> identified more than 70 classes that form a complete set of symbols of the Ukrainian language (for example, different spellings of the lowercase letter "a" were taken into account).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Experimental setup and proposed approach</head><p>There are quite a few studies of handwriting recognition technologies that are based on the use of the EMNIST data set <ref type="bibr" target="#b17">[17]</ref> (at least for English). There is known experience of using various classifiers and neural network technologies to recognize Cyrillic alphabet symbols, but comparative studies of recognition technologies for them are fragmentary. Also, there are no EMNIST-like datasets for the Ukrainian alphabet.</p><p>This article is devoted to researching the possibilities of recognizing Cyrillic (mainly Ukrainian) handwritten letters using convolutional neural networks and analyzing the influence of the selected neural network architecture on the accuracy and reliability of recognition. In addition, the possibility of using a synthetic data set and the effect of augmentation of the original data set on the recognition results were investigated.</p><p>The goals of this study:</p><p>• Analysis of the influence of the architecture of convolutional neural networks on the accuracy of recognition of handwritten numbers and letters of the Ukrainian alphabet.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>•</head><p>Analysis of the peculiarities of the recognition of Ukrainian symbols under the conditions of learning convolutional neural networks using a synthetic data set with various options for increasing the training sample.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Building a dataset for model training</head><p>The dataset used for training the models was built using a set of handwritten and italic fonts (a total of 48 font variants with Ukrainian glyphs were selected). All images of letters and numbers were divided into 76 classes (33 lowercase letters, 33 uppercase letters, and 10 digits) or 43 classes (33 letters and 10 digits). All images of the dataset were centered, and datasets were created from them with each image sized 28x28, 32x32, 64x64, or 128x128 pixels. The Pillow library was used to create and transform images with letters or numbers (including conversion from one-channel to three-channel).</p><p>The test dataset was built using the same fonts. Specific fonts and augmentation options were chosen randomly. The volume of the test dataset was about 10% of that of the training one.</p><p>Because only a small number of suitable fonts with Ukrainian glyphs were available, augmentation was required to form the necessary dataset. We used the ImageDataGenerator from the tensorflow package to perform three kinds of character image transformation: random rotation, shift, and scaling.</p><p>The number of generated images varied from 2 to 48 per symbol. At 32 images per symbol per font, the total volume of the dataset was 116,736 samples. This sample volume is quite comparable to the EMNIST Letters dataset <ref type="bibr" target="#b11">[12,</ref><ref type="bibr" target="#b17">17]</ref>, which contains mixed lowercase and uppercase letters (26 classes and a total of 145,600 samples).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Preprocessing of images for recognition</head><p>Tools from the OpenCV library were used to select the image regions containing letters or numbers, which were then recognized. The findContours function or the maximally stable extremal regions (MSER) algorithm was used to select the contours of recognized symbols.</p><p>The algorithm for preprocessing the image and selecting the areas containing letters or numbers included the following stages:</p><p>1. Image filtering to reduce the noise level (a Gaussian filter was used: function cv2.GaussianBlur);</p><p>2. Binarization of the image to cut off noise (the cv2.threshold function was used, with its parameters chosen for reliable selection of character contours);</p><p>3. Morphological transformation (dilation, function cv2.dilate; several iterations were used);</p><p>4. Selection of contours and their sorting (contours were extracted using the cv2.findContours function);</p><p>5. Image segmentation, i.e. selection of recognition areas as a set of rectangles containing the contours of letters and numbers (the cv2.boundingRect function was used).</p><p>For recognition itself, the selected regions of interest were cut from the original image and binarized again, after which the resulting images of individual symbols (without dilation or other distortions) were scaled to the image size of the dataset. Each pixel value was in the range 0 to 255, so the pixel values were normalized by dividing by 255 so that all values in the array describing the image lay in the range 0 to 1.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.">Proposed CNN Architectures</head><p>At the first stage of the research, the models were trained using single-channel images sized 28x28 pixels. The simplest variants of the architecture of LeNet-type convolutional neural networks for character image recognition are presented in Table <ref type="table" target="#tab_0">1</ref>.</p><p>More complex neural network architectures are presented in Table <ref type="table" target="#tab_1">2</ref>. Architectures 4 and 5 are implementations of the AlexNet architecture for single-channel images. Architecture 6 included thirteen convolutional layers and three dense layers, as well as MaxPooling and Dropout layers. This variant is the most complex and reproduces the VGG-16 architecture for single-channel images. It turned out to be the best in terms of accuracy and reliability of recognizing both the test sample and real inscriptions. Architecture 1 included an input layer, one convolutional block of two layers, a MaxPooling subsampling layer, a Dropout regularization layer, a Flatten dimensionality transformation layer, a dense layer, another regularization layer, and an output layer.</p><p>Two more variants of convolutional neural network architecture with an increased number of convolutional blocks are also presented in Table <ref type="table" target="#tab_0">1</ref> (architecture 2 and architecture 3). They differ from the simplest variant in the number of two-layer convolutional blocks (two blocks in architecture 2, three blocks in architecture 3).</p><p>At the second stage of the research, given that recognition errors remained even with the best models, several more complex neural network architectures were considered. 
Research has been done with the VGG16 and VGG19 <ref type="bibr" target="#b34">[32]</ref>, ResNet <ref type="bibr" target="#b35">[33]</ref> or ResNetV2 <ref type="bibr" target="#b36">[34]</ref>, MobileNet or MobileNetV2 <ref type="bibr" target="#b37">[35,</ref><ref type="bibr" target="#b38">36]</ref>, and InceptionResNetV2 <ref type="bibr" target="#b39">[37]</ref> architectures.</p><p>Several variants of the implemented architectures for the ResNetV2 family are shown in Figure <ref type="figure">1</ref>. An increase in the number of neural network parameters due to a deeper architecture leads to an increase in recognition accuracy. The computation time during neural network training grows with the number of adjustable parameters (when comparing architectures 1 and 6, by approximately an order of magnitude).</p><p>However, when trying to recognize images with real inscriptions that do not belong to the training or test sample, a significant difference in the behavior of the studied architecture variants was found with respect to reliable character recognition.</p><p>A typical example of recognizing an inscription containing letters is given in Table <ref type="table">3</ref>. As can be seen from the obtained results, 100% recognition accuracy is provided only by the most complex architecture variant (variant 6).</p><p>An attempt to recognize an inscription containing only numbers showed an even more pronounced difference in accuracy; see Table <ref type="table">4</ref>. 
Similar results were obtained for many other variants of inscriptions, including those containing letters and numbers at the same time: acceptable recognition accuracy was obtained with the more complex architecture variants.</p><p>Recognition errors still occurred on some inscription samples even when using deep architectures.</p><p>Neural networks of all architectures were trained using the Adam optimizer with a learning rate of 0.0001 for 50 epochs.</p><p>The size of the training sample strongly affects the reliability of character recognition. The generation of 1,536 images per letter or number (32 images for each character across 48 font types) is effectively the lower limit for acceptable recognition accuracy. Reducing the sample size leads to a significant decrease in recognition accuracy (from 100% accuracy to 40-60% when the sample size is reduced by a factor of 4). Increasing the sample size leads to a noticeable increase in model training time.</p><p>The use of ResNet or MobileNet architectures required forming the training dataset from three-channel images. It was established that reliable recognition of various alphanumeric inscriptions for all model architecture variants was achieved using a training set of sufficient size.</p><p>Training a model on three-channel images, especially as the resolution of the training sample increases, is a very resource-intensive process. Therefore, the authors reduced the number of recognized classes to 43, abandoning the distinction between lowercase and uppercase letters.</p><p>An example of the recognition result for alphabetic and digital inscriptions is shown in Figure <ref type="figure" target="#fig_3">4</ref>. Comparing different model architectures, all the options considered showed test-set recognition accuracy in the range of 99.2-99.6% when trained on a dataset of sufficient volume. 
An increase in the number of samples in the training dataset led to an increase in recognition accuracy for all the considered architectures. An example of the experimental results for the model with the MobileNet architecture is shown in Figure <ref type="figure" target="#fig_4">5</ref>.</p><p>Recognition accuracy of 80-90% on real inscriptions was achieved with a training sample of at least 700, and preferably more than 1500, images per class. An example of the experimental results for the model with the MobileNetV2 architecture is shown in Figure <ref type="figure" target="#fig_5">6</ref>.</p><p>Variation of the parameters of the transformations used for augmentation also has a noticeable effect on the recognition results: deforming or rotating the image by more than 10-15% increases the frequency of errors.</p><p>Increasing the resolution of the training sample images had little effect on the results due to saturation.</p><p>For example, when training a model with the MobileNetV2 architecture on a 32x32 dataset, the test-set recognition accuracy was 98%; on a 64x64 dataset, 99%; and on a 128x128 dataset, 99.5% (an example is shown in Figure <ref type="figure" target="#fig_6">7(a)</ref>). 
However, for other architectures, the effect of increasing the resolution was much less pronounced.</p><p>The number of errors in recognizing elements of real inscriptions changed little: for the model with the ResNet152V2 architecture, increasing the resolution of the training images reduced the proportion of erroneous recognitions from 18.0% to 11.4% (Figure <ref type="figure" target="#fig_6">7</ref> (b)), while for models with the MobileNet or MobileNetV2 architecture it practically did not change. However, with an increase in the resolution of the training sample, the time spent on training increased quite significantly (by more than an order of magnitude).</p><p>When using deep neural networks to recognize letters or numbers, the reliability of recognizing elements of real inscriptions depended primarily on the size of the training dataset.</p><p>The recognition accuracy on the test dataset after training all model variants was quite high (97-98% and above). However, small training datasets (300-500 images per class) practically did not provide reliable recognition.</p><p>The use of a model with the InceptionResNetV2 architecture, which requires a training image resolution of at least 75x75x3 (in fact, the model was trained on 128x128x3 images), did not lead to a noticeable increase in recognition accuracy.</p><p>In general, when comparing the achieved accuracy of real-image recognition and the speed of model training, the best performance was provided by models of the ResNetV2 or MobileNetV2 family. </p></div>
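For reference, the simplest of the evaluated configurations (Architecture 1 from Table 1), together with the training settings stated above (Adam optimizer, learning rate 0.0001, 50 epochs), can be sketched in Keras. The 3x3 kernels, ReLU activations, and dropout rates are assumptions, since the paper lists only the layer types and filter counts.

```python
# A minimal Keras sketch of "Architecture 1" from Table 1.
# Kernel sizes, activations, and dropout rates are assumed, not stated in the paper.
from tensorflow.keras import layers, models, optimizers

def build_architecture_1(num_classes=76):
    model = models.Sequential([
        layers.Input(shape=(28, 28, 1)),                     # single-channel 28x28
        layers.Conv2D(128, (3, 3), activation="relu", padding="same"),
        layers.Conv2D(128, (3, 3), activation="relu", padding="same"),
        layers.MaxPooling2D((2, 2)),
        layers.Dropout(0.25),
        layers.Flatten(),
        layers.Dense(256, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation="softmax"),     # 76 classes
    ])
    # Training settings stated in the text: Adam with learning rate 0.0001.
    model.compile(optimizer=optimizers.Adam(learning_rate=1e-4),
                  loss="categorical_crossentropy", metrics=["accuracy"])
    return model
```

Training would then follow the stated settings, e.g. model.fit(x_train, y_train, epochs=50, validation_data=(x_test, y_test)); Architectures 2 and 3 add further two-layer convolutional blocks in the same way.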
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Conclusions</head><p>In this work, several variants of convolutional neural network architecture for the recognition of isolated handwritten digits and Ukrainian letters are considered.</p><p>The results of recognizing various images containing letters and numbers were compared across models with different architectures. It was established that when training a model on a set of single-channel 28x28 images, increasing the number of convolutional layers in most cases improves the reliability of recognition. Among the options considered, the best accuracy and reliability of recognition was provided by a model with a VGG16-type architecture, which included 13 convolutional and three dense layers.</p><p>The feasibility of training convolutional neural networks on a synthetic dataset built from handwritten or cursive fonts is demonstrated. The size of the training dataset significantly affects the reliability of character recognition. The datasets used in this work contained from 192 to 2304 samples per class.</p><p>The lower limit of the sample size that provides acceptable recognition accuracy was 1536 samples per class. Reducing the number of samples per class leads to a significant decrease in recognition accuracy (from 90% recognition accuracy on elements of real inscriptions to 40-60% with a 4-fold decrease in sample size). 
An increase in the volume of the training dataset did not improve the accuracy or reliability of recognition, but led to a significant increase in model training time. An increase in the image resolution of the training dataset from 32x32x3 to 128x128x3 in most cases did not improve the reliability of real-image recognition.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Examples of implementation of models with architectures of the ResNetV2 family</figDesc><graphic coords="6,375.85,122.60,147.43,149.40" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: Results of learning the VGG-16-type model. (Table 3: A sample of the results of recognizing an inscription with letters.)</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 4 .</head><label>4</label><figDesc>The figure shows the selected areas of interest and recognition results.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: An example of recognition results using the VGG16 neural network (in this case, all letters and numbers are recognized accurately)</figDesc><graphic coords="8,103.25,371.16,387.95,96.99" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 5 :</head><label>5</label><figDesc>Figure 5: An example of the influence of the size of the training dataset on the achieved recognition accuracy (MobileNet architecture).</figDesc><graphic coords="9,210.24,415.68,232.74,118.98" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_5"><head>Figure 6 :</head><label>6</label><figDesc>Figure 6: Recognition errors of real inscriptions depending on the size of the training dataset (MobileNetV2 architecture, 32x32x3 dataset images).</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_6"><head>Figure 7 :</head><label>7</label><figDesc>Figure 7: An example of the influence of the training dataset images resolution on the achieved recognition accuracy (MobileNetV2 and ResNet152v2 architecture).</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>The simplest variants of convolutional neural network architecture</figDesc><table><row><cell>Architecture 1</cell><cell>Architecture 2</cell><cell>Architecture 3</cell></row><row><cell>Input (28x28x1)</cell><cell>Input (28x28x1)</cell><cell>Input (28x28x1)</cell></row><row><cell>conv2d, 128 filters</cell><cell>conv2d, 64 filters</cell><cell>conv2d, 128 filters</cell></row><row><cell>conv2d, 128 filters</cell><cell>conv2d, 64 filters</cell><cell>conv2d, 128 filters</cell></row><row><cell>MaxPooling2D</cell><cell>MaxPooling2D</cell><cell>MaxPooling2D</cell></row><row><cell>Dropout</cell><cell>Dropout</cell><cell>Dropout</cell></row><row><cell>Flatten</cell><cell>conv2d, 128 filters</cell><cell>conv2d, 256 filters</cell></row><row><cell>Dense, 256 filters</cell><cell>conv2d, 128 filters</cell><cell>conv2d, 256 filters</cell></row><row><cell>Dropout</cell><cell>MaxPooling2D</cell><cell>MaxPooling2D</cell></row><row><cell>Dense ( output -76 classes)</cell><cell>Dropout</cell><cell>Dropout</cell></row><row><cell></cell><cell>Flatten</cell><cell>conv2d, 512 filters</cell></row><row><cell></cell><cell>Dense, 256 filters</cell><cell>conv2d, 512 filters</cell></row><row><cell></cell><cell>Dropout</cell><cell>MaxPooling2D</cell></row><row><cell></cell><cell>Dense ( output -76 classes)</cell><cell>Dropout</cell></row><row><cell></cell><cell></cell><cell>Flatten</cell></row><row><cell></cell><cell></cell><cell>Dense, 1024 filters</cell></row><row><cell></cell><cell></cell><cell>Dropout</cell></row><row><cell></cell><cell></cell><cell>Dense ( output -76 classes)</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2</head><label>2</label><figDesc>Variants of the architecture of convolutional neural networks such as AlexNet and VGG 16</figDesc><table><row><cell>Architecture 4</cell><cell>Architecture 5</cell><cell>Architecture 6</cell></row><row><cell>Input (28x28x1)</cell><cell>Input (28x28x1)</cell><cell>Input (28x28x1)</cell></row><row><cell>conv2d, 128 filters</cell><cell>conv2d, 64 filters</cell><cell>conv2d, 128 filters</cell></row><row><cell>conv2d, 128 filters</cell><cell>conv2d, 64 filters</cell><cell>conv2d, 128 filters</cell></row><row><cell>MaxPooling2D</cell><cell>MaxPooling2D</cell><cell>MaxPooling2D</cell></row><row><cell>Dropout</cell><cell>Dropout</cell><cell>Dropout</cell></row><row><cell>Flatten</cell><cell>conv2d, 128 filters</cell><cell>conv2d, 256 filters</cell></row><row><cell>Dense, 256 filters</cell><cell>conv2d, 128 filters</cell><cell>conv2d, 256 filters</cell></row><row><cell>Dropout</cell><cell>MaxPooling2D</cell><cell>MaxPooling2D</cell></row><row><cell>Dense</cell><cell>Dropout</cell><cell>Dropout</cell></row><row><cell>(output -76 classes)</cell><cell>Flatten</cell><cell>conv2d, 512 filters</cell></row><row><cell></cell><cell>Dense, 256 filters</cell><cell>conv2d, 512 filters</cell></row><row><cell></cell><cell>Dropout</cell><cell>MaxPooling2D</cell></row><row><cell></cell><cell>Dense (output -76 classes)</cell><cell>Dropout</cell></row><row><cell></cell><cell></cell><cell>Flatten</cell></row><row><cell></cell><cell></cell><cell>Dense, 1024 filters</cell></row><row><cell></cell><cell></cell><cell>Dropout</cell></row><row><cell></cell><cell></cell><cell>Dense (output -76 classes)</cell></row></table></figure>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Optical Character Recognition Systems for Different Languages with Soft Computing</title>
		<author>
			<persName><forename type="first">A</forename><surname>Chaudhuri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Mandaviya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">K</forename><surname>Ghosh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Badelia</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-319-50252-6</idno>
	</analytic>
	<monogr>
		<title level="m">Studies in fuzziness and soft computing</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2017">2017</date>
			<biblScope unit="volume">352</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Toward end-to-end car license plate detection and recognition with deep neural networks</title>
		<author>
			<persName><forename type="first">H</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Shen</surname></persName>
		</author>
		<idno type="DOI">10.1109/TITS.2018.2847291</idno>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Intelligent Transportation Systems</title>
		<imprint>
			<biblScope unit="volume">20</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="1126" to="1136" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">A neural network approach to character recognition</title>
		<author>
			<persName><forename type="first">A</forename><surname>Rajavelu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">T</forename><surname>Musavi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">V</forename><surname>Shirvaikar</surname></persName>
		</author>
		<idno type="DOI">10.1016/0893-6080(89)90023-3</idno>
	</analytic>
	<monogr>
		<title level="j">Neural Networks</title>
		<imprint>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="issue">5</biblScope>
			<biblScope unit="page" from="387" to="393" />
			<date type="published" when="1989">1989</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Image character recognition using deep convolutional neural network learned from different languages</title>
		<author>
			<persName><forename type="first">J</forename><surname>Bai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Feng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Xu</surname></persName>
		</author>
		<idno type="DOI">10.1109/ICIP.2014.7025518</idno>
	</analytic>
	<monogr>
		<title level="m">IEEE International Conference on Image Processing (ICIP)</title>
				<meeting><address><addrLine>Paris, France</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2014">2014</date>
			<biblScope unit="page" from="2560" to="2564" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">CNN based common approach to handwritten character recognition of multiple scripts</title>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">S</forename><surname>Maitra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">U</forename><surname>Bhattacharya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">K</forename><surname>Parui</surname></persName>
		</author>
		<idno type="DOI">10.1109/ICDAR.2015.7333916</idno>
	</analytic>
	<monogr>
		<title level="m">13th International Conference on Document Analysis and Recognition (ICDAR)</title>
				<imprint>
			<date type="published" when="2015">2015</date>
			<biblScope unit="page" from="1021" to="1025" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Online Turkish Handwriting Recognition Using Synthetic Data</title>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">F</forename><surname>Bilgin Taşdemir</surname></persName>
		</author>
		<idno type="DOI">10.31590/ejosat.1039846</idno>
	</analytic>
	<monogr>
		<title level="j">Avrupa Bilim ve Teknoloji Dergisi</title>
		<imprint>
			<biblScope unit="volume">32</biblScope>
			<biblScope unit="page" from="649" to="656" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Handwritten Kazakh and Russian (HKR) database for text recognition</title>
		<author>
			<persName><forename type="first">D</forename><surname>Nurseitov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Bostanbekov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Kurmankhojayev</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Alimova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Abdallah</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Tolegenov</surname></persName>
		</author>
		<idno type="DOI">10.1007/s11042-021-11399-6</idno>
	</analytic>
	<monogr>
		<title level="j">Multimedia Tools Appl</title>
		<imprint>
			<biblScope unit="volume">80</biblScope>
			<biblScope unit="page" from="33075" to="33097" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Attention-Based Fully Gated CNN-BGRU for Russian Handwritten Text</title>
		<author>
			<persName><forename type="first">A</forename><surname>Abdallah</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Hamada</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Nurseitov</surname></persName>
		</author>
		<idno type="DOI">10.3390/jimaging6120141</idno>
	</analytic>
	<monogr>
		<title level="j">Journal of Imaging</title>
		<imprint>
			<biblScope unit="volume">6</biblScope>
			<biblScope unit="issue">12</biblScope>
			<biblScope unit="page">141</biblScope>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">An intelligent approach for Arabic handwritten letter recognition using convolutional neural network</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Ullah</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Jamjoom</surname></persName>
		</author>
		<idno type="DOI">10.7717/peerj-cs.995</idno>
	</analytic>
	<monogr>
		<title level="j">PeerJ Computer Science</title>
		<imprint>
			<biblScope unit="volume">8</biblScope>
			<biblScope unit="page">e995</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Handwritten Letter Recognition using Artificial Intelligence</title>
		<author>
			<persName><forename type="first">D</forename><surname>Jeevitha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Muthu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Nila</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Santhoshi</surname></persName>
		</author>
		<idno type="DOI">10.22214/ijraset.2022.42949</idno>
	</analytic>
	<monogr>
		<title level="j">International Journal for Research in Applied Science and Engineering Technology</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="page" from="2752" to="2758" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">An exploratory study on the handwritten allographic features of multi-ethnic population with different educational backgrounds</title>
		<author>
			<persName><forename type="first">L</forename><surname>Gannetion</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">Y</forename><surname>Wong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">Y</forename><surname>Lim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">H</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">F L</forename><surname>Abdullah</surname></persName>
		</author>
		<idno type="DOI">10.1371/journal.pone.0268756</idno>
	</analytic>
	<monogr>
		<title level="j">PloS one</title>
		<imprint>
			<biblScope unit="volume">17</biblScope>
			<biblScope unit="issue">10</biblScope>
			<biblScope unit="page">e0268756</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">EMNIST: Extending MNIST to handwritten letters</title>
		<author>
			<persName><forename type="first">G</forename><surname>Cohen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Afshar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Tapson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Van Schaik</surname></persName>
		</author>
		<idno type="DOI">10.48550/arxiv.1702.05373</idno>
	</analytic>
	<monogr>
		<title level="m">2017 international joint conference on neural networks (IJCNN)</title>
				<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="2921" to="2926" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Backpropagation Applied to Handwritten Zip Code Recognition</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Lecun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">E</forename><surname>Boser</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">S</forename><surname>Denker</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Henderson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">E</forename><surname>Howard</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">E</forename><surname>Hubbard</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">D</forename><surname>Jackel</surname></persName>
		</author>
		<idno type="DOI">10.1162/neco.1989.1.4.541</idno>
	</analytic>
	<monogr>
		<title level="j">Neural Computation</title>
		<imprint>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="541" to="551" />
			<date type="published" when="1989">1989</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Real-Time Handwritten Letters Recognition on an Embedded Computer Using ConvNets</title>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">Núñez</forename><surname>Fernández</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Hosseini</surname></persName>
		</author>
		<idno type="DOI">10.1109/SHIRCON.2018.8592981</idno>
	</analytic>
	<monogr>
		<title level="m">IEEE Sciences and Humanities International Research Conference (SHIRCON)</title>
				<meeting><address><addrLine>Lima, Peru</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="1" to="4" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">ImageNet classification with deep convolutional neural networks</title>
		<author>
			<persName><forename type="first">Alex</forename><surname>Krizhevsky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ilya</forename><surname>Sutskever</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Geoffrey</forename><forename type="middle">E</forename><surname>Hinton</surname></persName>
		</author>
		<idno type="DOI">10.1145/3065386</idno>
	</analytic>
	<monogr>
		<title level="j">Commun. ACM</title>
		<imprint>
			<biblScope unit="volume">60</biblScope>
			<biblScope unit="page" from="84" to="90" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<monogr>
		<title/>
		<author>
			<persName><forename type="first">C</forename><surname>Szegedy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Jia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Sermanet</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">E</forename><surname>Reed</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Anguelov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Erhan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Vanhoucke</surname></persName>
		</author>
		<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Going deeper with convolutions</title>
		<author>
			<persName><forename type="first">A</forename><surname>Rabinovich</surname></persName>
		</author>
		<idno type="DOI">10.1109/CVPR.2015.7298594</idno>
	</analytic>
	<monogr>
		<title level="m">IEEE Conference on Computer Vision and Pattern Recognition (CVPR)</title>
				<imprint>
			<date type="published" when="2015">2015</date>
			<biblScope unit="page" from="1" to="9" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">A Survey of Handwritten Character Recognition with MNIST and EMNIST</title>
		<author>
			<persName><forename type="first">A</forename><surname>Baldominos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Saez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Isasi</surname></persName>
		</author>
		<idno type="DOI">10.3390/app9153169</idno>
	</analytic>
	<monogr>
		<title level="j">Appl. Sci</title>
		<imprint>
			<biblScope unit="volume">9</biblScope>
			<biblScope unit="issue">15</biblScope>
			<biblScope unit="page">3169</biblScope>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Handwritten Indic Character Recognition using Capsule Networks</title>
		<author>
			<persName><forename type="first">B</forename><surname>Mandal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Dubey</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ghosh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Sarkhel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Das</surname></persName>
		</author>
		<idno type="DOI">10.1109/ASPCON.2018.8748550</idno>
	</analytic>
	<monogr>
		<title level="m">IEEE Applied Signal Processing Conference (ASPCON)</title>
				<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="304" to="308" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Recognition of isolated characters across different input interfaces using 2D DCNN</title>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">S</forename><surname>Yadav</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Monsley</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">A</forename><surname>Barlaskar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Ahmad</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">H</forename><surname>Laskar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">K</forename><surname>Bhuyan</surname></persName>
		</author>
		<idno type="DOI">10.1109/TENCON54134.2021.9707451</idno>
	</analytic>
	<monogr>
		<title level="m">TENCON 2021-2021 IEEE Region 10 Conference (TENCON)</title>
				<meeting><address><addrLine>Auckland, New Zealand</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="504" to="509" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<monogr>
		<author>
			<persName><forename type="first">K</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">B</forename><surname>Girshick</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Dollár</surname></persName>
		</author>
		<title level="m">Rethinking ImageNet Pre-Training</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<monogr>
		<idno type="DOI">10.1109/ICCV.2019.00502</idno>
		<title level="m">IEEE/CVF International Conference on Computer Vision (ICCV)</title>
				<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="4917" to="4926" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">Intelligent Arabic Handwriting Recognition Using Different Standalone and Hybrid CNN Architectures</title>
		<author>
			<persName><forename type="first">W</forename><surname>Albattah</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Albahli</surname></persName>
		</author>
		<idno type="DOI">10.3390/app121910155</idno>
	</analytic>
	<monogr>
		<title level="j">Appl. Sci</title>
		<imprint>
			<biblScope unit="volume">12</biblScope>
			<biblScope unit="page">10155</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Performance Analysis of State of the Art Convolutional Neural Network Architectures in Bangla Handwritten Character Recognition</title>
		<author>
			<persName><forename type="first">Tapotosh</forename><surname>Ghosh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Min-Ha-Zul</forename><surname>Abedin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Hasan</forename><surname>Al Banna</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Nasirul</forename><surname>Mumenin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Mohammad Abu</forename><surname>Yousuf</surname></persName>
		</author>
		<idno type="DOI">10.1134/S1054661821010089</idno>
	</analytic>
	<monogr>
		<title level="j">Pattern Recognit. Image Anal</title>
		<imprint>
			<biblScope unit="volume">31</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="60" to="71" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Handwritten devanagari character recognition using deep learning -convolutional neural network (cnn) model</title>
		<author>
			<persName><forename type="first">A</forename><surname>Bhardwaj</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Singh</surname></persName>
		</author>
		<ptr target="https://archives.palarch.nl/index.php/jae/article/view/2203" />
	</analytic>
	<monogr>
		<title level="j">PalArch&apos;s Journal of Archaeology of Egypt/Egyptology</title>
		<imprint>
			<biblScope unit="volume">17</biblScope>
			<biblScope unit="issue">6</biblScope>
			<biblScope unit="page" from="7965" to="7984" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">Handwritten Devanagari Character Recognition Using Modified Lenet and Alexnet Convolution Neural Networks</title>
		<author>
			<persName><forename type="first">Duddela</forename><surname>Sai Prashanth</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Vasanth Kumar Mehta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Kadiyala</forename><surname>Ramana</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Vidhyacharan</forename><surname>Bhaskar</surname></persName>
		</author>
		<idno type="DOI">10.1007/s11277-021-08903-4</idno>
	</analytic>
	<monogr>
		<title level="j">Wirel. Pers. Commun</title>
		<imprint>
			<biblScope unit="volume">122</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="349" to="378" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">Recognizing Arabic Handwritten Literal Amount Using Convolutional Neural Networks</title>
		<author>
			<persName><forename type="first">Aicha</forename><surname>Korichi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Slatnia</forename><surname>Sihem</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Tagougui</forename><surname>Najiba</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Zouari</forename><surname>Ramzi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Aiadi</forename><surname>Oussama</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-030-96311-8_15</idno>
	</analytic>
	<monogr>
		<title level="m">Artificial Intelligence and Its Applications</title>
				<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="153" to="165" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<analytic>
		<title level="a" type="main">A new Arabic handwritten character recognition deep learning system (AHCR-DLS)</title>
		<author>
			<persName><forename type="first">Hossam Magdy</forename><surname>Balaha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Hesham Arafat</forename><surname>Ali</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Mohamed</forename><surname>Saraya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Mahmoud</forename><surname>Badawy</surname></persName>
		</author>
		<idno type="DOI">10.1007/s00521-020-05397-2</idno>
	</analytic>
	<monogr>
		<title level="j">Neural Comput. Appl</title>
		<imprint>
			<biblScope unit="volume">33</biblScope>
			<biblScope unit="issue">11</biblScope>
			<biblScope unit="page" from="6325" to="6367" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b28">
	<analytic>
		<title level="a" type="main">An Optimized Deep Residual Network with a Depth Concatenated Block for Handwritten Characters Classification</title>
		<author>
			<persName><forename type="first">Gibrael</forename><surname>Abosamra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Hadi</forename><surname>Oqaibi</surname></persName>
		</author>
		<idno type="DOI">10.32604/cmc.2021.015318</idno>
	</analytic>
	<monogr>
		<title level="j">Computers, Materials &amp; Continua</title>
		<imprint>
			<biblScope unit="volume">68</biblScope>
			<biblScope unit="page" from="1" to="28" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b29">
	<monogr>
		<title/>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">B</forename><surname>Nurseitov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Bostanbekov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Kanatov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Alimova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Abdallah</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Abdimanap</surname></persName>
		</author>
		<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b30">
	<monogr>
		<author>
			<persName><surname>Abdimanap</surname></persName>
		</author>
		<idno type="DOI">10.25046/aj0505114</idno>
		<idno type="arXiv">arXiv:2102.04816</idno>
		<title level="m">Classification of Handwritten Names of Cities and Handwritten Text Recognition using Various Deep Learning Models</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b31">
	<monogr>
		<title level="m" type="main">Recognition of Handwritten Cyrillic Letters using PCA</title>
		<author>
			<persName><forename type="first">O</forename><surname>Vovchuk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Kyrychenko</surname></persName>
		</author>
		<ptr target="https://www.researchgate.net/publication/336987544_Recognition_of_Handwritten_Cyrillic_" />
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b32">
	<monogr>
		<ptr target="https://github.com/GregVial/CoMNIST" />
		<title level="m">Cyrillic-oriented MNIST</title>
				<imprint/>
	</monogr>
</biblStruct>

<biblStruct xml:id="b33">
	<analytic>
		<title level="a" type="main">Economic efficiency of innovative projects of CNN modified architecture application</title>
		<author>
			<persName><forename type="first">V</forename><surname>Khavalko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Mykhailyshyn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Zhelizniak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Kovtyk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Mazur</surname></persName>
		</author>
		<ptr target="https://ceur-ws.org/Vol-2654/paper14.pdf" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the International workshop on cyber hygiene (CybHyg-2019) co-located with 1st International conference on cyber hygiene and conflict management in global information networks (CyberConf 2019)</title>
				<meeting>the International workshop on cyber hygiene (CybHyg-2019) co-located with 1st International conference on cyber hygiene and conflict management in global information networks (CyberConf 2019)<address><addrLine>Kyiv, Ukraine</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2019">November 30, 2019</date>
			<biblScope unit="volume">2654</biblScope>
			<biblScope unit="page" from="182" to="193" />
		</imprint>
	</monogr>
	<note>CEUR Workshop Proceedings</note>
</biblStruct>

<biblStruct xml:id="b34">
	<monogr>
		<title level="m" type="main">Very Deep Convolutional Networks for Large-Scale Image Recognition</title>
		<author>
			<persName><forename type="first">K</forename><surname>Simonyan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Zisserman</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.1409.1556</idno>
		<imprint>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b35">
	<analytic>
		<title level="a" type="main">Deep Residual Learning for Image Recognition</title>
		<author>
			<persName><forename type="first">K</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Sun</surname></persName>
		</author>
		<idno type="DOI">10.1109/cvpr.2016.90</idno>
	</analytic>
	<monogr>
		<title level="m">IEEE Conference on Computer Vision and Pattern Recognition (CVPR)</title>
				<imprint>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="770" to="778" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b36">
	<analytic>
		<title level="a" type="main">Identity Mappings in Deep Residual Networks</title>
		<author>
			<persName><forename type="first">Kaiming</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Xiangyu</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Shaoqing</forename><surname>Ren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jian</forename><surname>Sun</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-319-46493-0_38</idno>
	</analytic>
	<monogr>
		<title level="m">European Conference on Computer Vision (ECCV) 2016</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="630" to="645" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b37">
	<monogr>
		<title level="m" type="main">MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications</title>
		<author>
			<persName><forename type="first">Andrew</forename><forename type="middle">G</forename><surname>Howard</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Menglong</forename><surname>Zhu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Bo</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Dmitry</forename><surname>Kalenichenko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Weijun</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Tobias</forename><surname>Weyand</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Marco</forename><surname>Andreetto</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Hartwig</forename><surname>Adam</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.1704.04861</idno>
		<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b38">
	<analytic>
		<title level="a" type="main">MobileNetV2: Inverted Residuals and Linear Bottlenecks</title>
		<author>
			<persName><forename type="first">Mark</forename><surname>Sandler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Andrew</forename><forename type="middle">G</forename><surname>Howard</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Menglong</forename><surname>Zhu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Andrey</forename><surname>Zhmoginov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Liang-Chieh</forename><surname>Chen</surname></persName>
		</author>
		<idno type="DOI">10.1109/CVPR.2018.00474</idno>
	</analytic>
	<monogr>
		<title level="m">IEEE/CVF Conference on Computer Vision and Pattern Recognition</title>
				<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="4510" to="4520" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b39">
	<monogr>
		<author>
			<persName><forename type="first">C</forename><surname>Szegedy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ioffe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Vanhoucke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">A</forename><surname>Alemi</surname></persName>
		</author>
		<idno type="DOI">10.1609/aaai.v31i1.11231</idno>
		<title level="m">Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning</title>
				<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
