Efficiency of Stochastic Gradient Identification of Similar Shape Objects in Binary and Grayscale Images

Radik Magdeev, LLC “Telecom.ru”, Ulyanovsk, Russia, radiktkd2@yandex.ru
Aleksander Tashlinsky, Radio Engineering Department, Ulyanovsk State Technical University, Ulyanovsk, Russia, tag@ulstu.ru
Galina Safina, National Research Moscow State University of Civil Engineering, Moscow, Russia, safinagl@mgsu.ru

Abstract: A comparative analysis is carried out of the efficiency of the pattern-based stochastic gradient identification method for objects of similar shape, using their grayscale and binary images. Object identification is understood as finding the image of the object in the studied image and estimating its spatial parameters relative to the reference image. Two types of similarly shaped objects are investigated on the basis of the COIL-20 halftone (grayscale) images and their binary versions. For objects of the first type, the curvature of the lines describing their contours differs in character; for objects of the second type, the curvature characteristics of the contour lines are close.

Keywords: binary image, grayscale image, object recognition, pattern recognition, stochastic gradient identification, parameter estimation, convergence

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). VI International Conference on "Information Technology and Nanotechnology" (ITNT-2020), Image Processing and Earth Remote Sensing.

I. INTRODUCTION

The problem of pattern recognition, both in separate images and in video sequences, arises in a variety of areas, from military affairs and security systems to the digitization of analog signals. Automating its solution remains relevant from the point of view of both theory and technical implementation [1-3]. Pattern recognition is usually understood as assigning an object in the image, on the basis of the initial data, to a certain class (or group of classes) by comparing selected essential features that characterize this class. The main difficulty is to establish the correspondence between the object highlighted in the studied image and the given patterns (images of the object's standards) from a finite set of properties and attributes. Note that pattern recognition covers several areas:

– recognition of many predefined objects, or classes of objects, in the image;
– object detection, implemented by checking the image or a part of it for compliance with certain conditions;
– identification of an object in the image, with estimation of its parameters and decision making.

In [4, 5] it is shown that identifying images of objects by a pattern can be reduced to searching for a spatial transformation that minimizes the distance between the desired image and the pattern in a given metric space. The same works propose a stochastic gradient identification method (SGIM) for objects in binary images, which showed good efficiency in comparison with the correlation-extreme method [6] and the contour analysis method [7]. This article examines the effectiveness of SGIM for grayscale images in comparison with its use for binarized images.

For concreteness, we assume that the possible deformations of the identified object with respect to the pattern can be reduced to a similarity model [8, 9]. That is, the pattern and the image of the object can differ in scale factor $\kappa$, orientation angle $\varphi$, and shift $h = (h_x, h_y)^T$ along the basic axes $Ox$ and $Oy$; in addition, additive noise is present. We used the COIL-20 halftone images, which comprise 1440 images of 20 objects [10]. A binary version was obtained for each halftone image. Several examples of halftone images and their binary versions are shown in Fig. 1.

II. IDENTIFICATION METHOD DESCRIPTION

In SGIM the identification parameters $\hat{\alpha}$, on the basis of which the decision is made, are searched recursively [11]:

$$\hat{\alpha}_t = \hat{\alpha}_{t-1} - \Lambda_t \beta_t,$$

where $\beta_t$ is the stochastic gradient of the cost function of identification quality, which depends on $\hat{\alpha}_{t-1}$ and on the iteration number $t = 0, \dots, T$; $\Lambda_t$ is the gain matrix [12]; $T$ is the number of iterations. It was shown in [11, 13] that it is advisable to use the brightness correlation coefficient (BCC) or the mean square of the brightness difference (MSBD) of the pattern and the studied image as the cost function; both were used in this work. Hereinafter, a pattern refers to a reference image of an object. At each iteration, the next estimate of the parameter vector is found from a two-dimensional local sample of corresponding samples of the pattern and the studied image. As a rule, this local sample is small [14].

The effective working range of the estimated parameters of the SGIM (the range in which the estimates, for a given number of iterations, do not go beyond the required confidence interval) is limited. If it does not cover the domain of the parameters, coverage is provided by specifying several patterns with different initial approximations of the parameters. It was also shown in [4, 5] that, to increase the convergence rate of the estimates and to expand the working range for binary images, it is advisable to use low-pass filtering, for example Gaussian, as preprocessing. The optimal size of the Gaussian filter mask for binary images is 10 % of the size of the identified object.

Studies using halftone images from the COIL-20 base have also shown the appropriateness of low-pass filtering. In this case, the optimal size of the Gaussian filter mask, which expands the operating range of the SGIM while maintaining identification accuracy, is from 3 % to 10 % of the object size in the image. We also note that an approximate implementation of the Gaussian filter, proposed in [15, 16] and based on an infinite impulse response, is used. The computational complexity of this approach does not depend on the size of the filter mask and is approximately $16 L_x L_y$ elementary operations, where $L_x$ and $L_y$ are the image sizes.

Fig. 1. Example of halftone patterns (a) and their binary versions (b).

The computational complexity of the stochastic gradient parameter estimation procedure that underlies the SGIM was studied in [17]. In particular, for the parameters of the similarity model it ranges from $(22\mu + 25)T$ to $(52\mu + 20)T$ elementary operations when using the MSBD (depending on the chosen method of finding the pseudo-gradient of the objective function), and from $(51\mu + 91)T$ to $(69\mu + 48)T$ elementary operations when using the BCC, where $\mu$ is the local sample size at each iteration.

As a characteristic of the SGIM efficiency for binary and grayscale images, we use the convergence of the standard deviation (SD) $\hat{\sigma}_t$ of the brightness differences of the modified pattern and the studied image, which is calculated at each $t$-th iteration, $t = 0, \dots, T$, from a local sample of samples of the identified image and the pattern. An example of $\hat{\sigma}_t$ convergence graphs for the left object of Fig. 1 (car) is shown in Fig. 2, where graph (a) corresponds to a halftone image and graph (b) to a binary image. The mismatch parameters of the pattern and the studied object are $\kappa = 0.85$, $\varphi = 35^\circ$, and a shift $h = (h_x, h_y)^T$ with components of magnitude 6 along each axis. The studied images and the corresponding patterns are shown in Fig. 3, and the convergence graphs of the estimates of the individual identification parameters are shown in Fig. 4, where the solid line corresponds to the grayscale images and the dashed line to the binary images. The image sizes are 128x128 elements, and the local sample size is $\mu = 15$.

Fig. 2. Convergence of brightness differences SD of the modified pattern and the studied image for grayscale (a) and binary (b) images.

Fig. 3. Example of studied halftone (a) and binary (b) images and their corresponding patterns.

Fig. 4. Identification parameters convergence.

It can be seen from the plots that for this object the estimates $\hat{\alpha}_t = (\hat{\kappa}_t, \hat{\varphi}_t, \hat{h}_t)^T$ of the identification parameters converge more slowly when processing halftone images and patterns (in about 400 iterations) than when processing their binarized versions (in about 200 iterations). This is explained by the large size of the low-pass filter used during image preprocessing. Thus, both the rate of convergence of the estimates and the effective working range can differ between grayscale and binarized images. This is especially true for images of objects having a similar shape.

III. IDENTIFICATION OF OBJECTS WITH A SIMILAR SHAPE

Using the objects of the COIL-20 images, we consider two types of objects that are similar in shape: objects whose contour lines have curvature of a different nature (the objects shown in Fig. 5a can serve as an example), and objects whose contour lines have similar curvature characteristics (an example of such objects is shown in Fig. 5b). The figures also show binary versions of the images of these objects. Clearly, these types of objects are the critical case when processing binary images.

Fig. 5. Examples of similarly shaped images having different (a) and close (b) characteristics of the contour lines curvature.

In the experiment, the identification method proposed in [18] was applied. It is based on three criteria. One of them uses the correlation coefficient $R$ between the studied (deformed) image of the object and the patterns transformed using SGIM; we will conventionally call this the main criterion. The two other criteria (additional criteria) use convergence characteristics of the identification parameters. One characteristic is the estimate of the mean value of the standard deviation of the brightness differences of the modified pattern and the studied image in the steady state of the process of estimating the SGIM identification parameters. This characteristic is denoted $m_{\hat{\sigma}_i}$, where $i$ indexes the iterations of the steady state. The other characteristic is the standard deviation $\sigma_{\hat{\sigma}_i}$ of the values $\hat{\sigma}_i$, also over the iterations of the steady state.

The steady state of the identification process is clearly illustrated in Fig. 2. The decision on identification is made if all three criteria are fulfilled:

$$R > R_t, \quad m_{\hat{\sigma}_i} < m^t_{\hat{\sigma}_i}, \quad \sigma_{\hat{\sigma}_i} < \sigma^t_{\hat{\sigma}_i},$$

where $R_t$, $m^t_{\hat{\sigma}_i}$ and $\sigma^t_{\hat{\sigma}_i}$ are threshold values. The threshold values of the identification criteria for the image database used were determined by the method of [18]:

$$R_t = 0.92, \quad m^t_{\hat{\sigma}_i} = 9.16, \quad \sigma^t_{\hat{\sigma}_i} = 4.63.$$

The following results are obtained for the first type of objects. For the binarized images, the correlation coefficient between the image of the object and the “correct” pattern is $R = 0.99$ and exceeds the threshold value. For this pair the additional criteria are also fulfilled:

$$m_{\hat{\sigma}_i} = 1.11 < m^t_{\hat{\sigma}_i}, \quad \sigma_{\hat{\sigma}_i} = 0.69 < \sigma^t_{\hat{\sigma}_i}.$$

However, the correlation coefficient between the image of the object and similar patterns transformed by the SGIM also exceeds the identification threshold ($R = 0.94$). At the same time, the auxiliary characteristics do not meet their thresholds, although they are fairly close to them ($m_{\hat{\sigma}_i} = 11$, $\sigma_{\hat{\sigma}_i} = 7.3$). For grayscale images, the correlation coefficient between the image of the object and the “correct” pattern is also 0.99 and exceeds the threshold value, while the correlation coefficient with similar patterns ($R = 0.7$) is significantly lower than the threshold. The additional criteria for the “correct” pattern are also fulfilled:

$$m_{\hat{\sigma}_i} = 1.21, \quad \sigma_{\hat{\sigma}_i} = 0.27,$$

and for the similar pattern the values of the additional characteristics significantly exceed the thresholds:

$$m_{\hat{\sigma}_i} = 17 > m^t_{\hat{\sigma}_i}, \quad \sigma_{\hat{\sigma}_i} = 15 > \sigma^t_{\hat{\sigma}_i}.$$

Thus, for this type of object, the decision on identification from binarized images requires the use of the additional criteria. For grayscale images, a decision on identification is possible using only the main criterion on the correlation coefficient, and the additional criteria can be used to assess the reliability of the identification.

An analysis of the use of SGIM for binary images of similarly shaped objects of the second type showed that all identification criteria are satisfied both for the “correct” pattern and for similar ones. For the “correct” pattern the values are:

$$R = 0.99, \quad m_{\hat{\sigma}_i} = 1.31 < m^t_{\hat{\sigma}_i}, \quad \sigma_{\hat{\sigma}_i} = 0.89 < \sigma^t_{\hat{\sigma}_i},$$

and for the similar pattern:

$$R = 0.97, \quad m_{\hat{\sigma}_i} = 7.2 < m^t_{\hat{\sigma}_i}, \quad \sigma_{\hat{\sigma}_i} = 1.4 < \sigma^t_{\hat{\sigma}_i}.$$

For grayscale images, the correlation coefficient between the image of the object and the “correct” pattern exceeds the threshold, but is lower than in the other cases considered ($R = 0.96$). The values of the additional characteristics are significantly lower than the thresholds:

$$m_{\hat{\sigma}_i} = 1.81, \quad \sigma_{\hat{\sigma}_i} = 1.74.$$

For the similar pattern, the criterion on the correlation coefficient is not satisfied ($R = 0.83$), and the values of the auxiliary characteristics significantly exceed the thresholds ($m_{\hat{\sigma}_i} = 23.3$, $\sigma_{\hat{\sigma}_i} = 12.3$).

Thus, for objects of similar shape with similar characteristics of contour lines curvature, their identification by the pattern from binarized images is ineffective.
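The three-criteria decision used in this section can be illustrated with a short Python sketch. It is not the authors' implementation: the names `Thresholds`, `steady_state_stats`, and `identify` are hypothetical, the strict inequalities and the use of the population standard deviation are assumptions, and the index at which the steady state begins is left to the caller. Only the threshold values and the example inputs come from the text.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Thresholds:
    """Threshold values reported in the text for the COIL-20 experiments."""
    r_t: float = 0.92   # threshold for the correlation coefficient R
    m_t: float = 9.16   # threshold for the steady-state mean of the SD
    s_t: float = 4.63   # threshold for the steady-state spread of the SD


def steady_state_stats(sd_history: Sequence[float],
                       steady_start: int) -> Tuple[float, float]:
    """Mean and standard deviation of the SD estimates over steady-state iterations.

    steady_start is a hypothetical parameter: the first iteration index
    treated as steady state (how to choose it is not specified here).
    """
    tail = list(sd_history[steady_start:])
    if not tail:
        raise ValueError("no steady-state iterations available")
    # Assumption: population standard deviation over the steady-state values.
    return mean(tail), pstdev(tail)


def identify(r: float, m: float, s: float,
             thr: Thresholds = Thresholds()) -> Tuple[bool, bool, bool]:
    """Return (main criterion met, additional criteria met, final decision)."""
    main_ok = r > thr.r_t                          # main (correlation) criterion
    additional_ok = m < thr.m_t and s < thr.s_t    # two convergence criteria
    return main_ok, additional_ok, main_ok and additional_ok


if __name__ == "__main__":
    # (R, m, sigma) triples as reported in the text for similar patterns.
    cases = {
        "type 1, binary, similar pattern": (0.94, 11.0, 7.3),
        "type 2, binary, similar pattern": (0.97, 7.2, 1.4),
        "type 2, grayscale, similar pattern": (0.83, 23.3, 12.3),
    }
    for name, (r, m, s) in cases.items():
        print(name, identify(r, m, s))
```

With the reported values, the similar binary pattern of the second type passes all three criteria, which is the false identification discussed above, whereas the similar binary pattern of the first type passes only the main criterion.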
When identifying this type of objects by their grayscale images, it is advisable to use the additional criteria applied in this work.

It should be noted that the effective operating range of SGIM for images of this type is significantly reduced. When the similarity model parameters are chosen as the identification parameters, the range for the images considered is $\kappa = 0.8 \dots 1.1$; $\varphi = -10^\circ \dots +10^\circ$; $h = -5 \dots +5$ pixels. This is because such images differ mainly in texture, and the preliminary low-pass filtering smooths the texture, so the size of the preprocessing filter mask does not exceed 3 % of the size of the object.

IV. CONCLUSION

A comparative analysis showed that extending the SGIM to grayscale images does not impair its performance in terms of computational complexity, but slightly reduces the effective operating range and the convergence rate of the identification parameters. This is because, for the same reliability of object identification, grayscale images allow a smaller low-pass filter during preprocessing.

A study on the COIL-20 images of identifying similarly shaped objects with different curvature characteristics of the contour lines showed that, when the curvature characteristics of the contour lines are close, identification by binarized images is ineffective. When identifying by grayscale images, it is advisable, in order to increase reliability, to use additional criteria based on the characteristics of the convergence of the identification parameters alongside the correlation criterion. For objects of similar shape with different characteristics of contour lines curvature, the decision on identification from binarized images also requires the use of the additional criteria. For grayscale images, a decision is possible using only the correlation criterion, and the additional criteria can be used to assess the reliability of identification.

We also note that, for identifying objects by a pattern, the correlation-based criterion referred to in this paper as the main (“basic”) criterion is in many cases not decisive for identification.

ACKNOWLEDGMENT

The reported study was funded by RFBR and the Government of the Ulyanovsk region according to the research projects № 19-29-09048 and № 19-47-730004.

REFERENCES

[1] S.M. Borzov, M.A. Guryanov and O.I. Potaturkin, “Study of the classification efficiency of difficult-to-distinguish vegetation types using hyperspectral data,” Computer Optics, vol. 43, no. 3, pp. 464-473, 2019. DOI: 10.18287/2412-6179-2019-43-3-464-473.
[2] A.V. Poltavskii and A.V. Grinshkun, “Basics of pattern recognition using computer,” Dvoinie Tehnologii, vol. 2, pp. 55-66, 2017.
[3] S.V. Kurochkin, “Detection of the homotopy type of an object using differential invariants of an approximating map,” Computer Optics, vol. 43, no. 4, pp. 611-617, 2019. DOI: 10.18287/2412-6179-2019-43-4-611-617.
[4] R.G. Magdeev and A.G. Tashlinskii, “A comparative analysis of the efficiency of the stochastic gradient approach to the identification of objects in binary images,” Pat. Rec. and Im. Anal., vol. 24, no. 4, pp. 535-541, 2014. DOI: 10.1134/S1054661814040130.
[5] R.G. Magdeev and A.G. Tashlinskii, “Efficiency of object identification for binary,” Computer Optics, vol. 43, no. 2, pp. 277-281, 2019. DOI: 10.18287/2412-6179-2019-43-2-277-281.
[6] V.A. Kovalevsky, “Methods of optimal solutions in image recognition,” Moscow: Nauka Publisher, 2003.
[7] Ya.A. Furman, A.V. Krevetsky, A.K. Peredeyev, A.A. Rozhentsov, R.G. Khafizov, I.L. Egoshina and A.L. Leukhin, “Introduction to contour analysis and its applications to image and signal processing,” Moscow: Fizmatlit Publisher, 2003.
[8] R. Gonzalez and R. Woods, “Digital image processing,” Upper Saddle River, New Jersey: Prentice Hall, 2012.
[9] A.G. Tashlinskii, “Estimation of the parameters of spatial deformations of image sequences,” Ulyanovsk: UlSTU Publisher, 2000.
[10] S.A. Nene, S.K. Nayar and Y. Murase, “Columbia Object Image Library,” COIL-20 [Online]. URL: http://www.cs.columbia.edu/CAVE/software/softlib/coil-20.php.
[11] A.G. Tashlinskii, “Pseudogradient Estimation of Digital Images Interframe Geometrical Deformations,” Vision Systems: Segmentation & Pattern Recognition, pp. 465-494, 2007. DOI: 10.5772/4975.
[12] Ya.Z. Tsypkin, “Information theory of identification,” Moscow: Fizmatlit Publisher, 1995.
[13] A.G. Tashlinskii, “The specifics of pseudogradient estimation of geometric deformations in image sequences,” Pattern Recognition and Image Analysis, vol. 18, no. 4, pp. 706-711, 2008. DOI: 10.1134/S1054661808040287.
[14] A.G. Tashlinskii, G.V. Dikarina, G.L. Minkina and A.N. Repin, “Pseudogradient optimization in the estimation of geometric interframe image deformations,” Pattern Recognition and Image Analysis, vol. 18, no. 4, pp. 535-541, 2008. DOI: 10.1134/S1054661814040130.
[15] L.J. van Vliet, I.T. Young and P.W. Verbeek, “Van Vliet Recursive Gaussian Derivative Filters,” Proc. 14th Int. Conference on Pattern Recognition, pp. 509-514, 1998. DOI: 10.1109/ICPR.1998.711192.
[16] D. Hale, “Recursive Gaussian filters” [Online]. URL: https://inside.mines.edu/~dhale/papers/Hale06RecursiveGaussianFilters.pdf.
[17] G.L. Fadeeva, “Optimization of the pseudo-gradient of the objective function in the estimation of inter-frame geometric deformations of images,” dis. cand. tech. Sciences: 05.13.18, Ulyanovsk, 2007.
[18] A.G. Tashlinskii and R.G. Magdeev, “Increasing objects identification accuracy for binary images,” Informatsionno-izmeritel'nyye i upravlyayushchiye sistemy, vol. 12, pp. 24-30, 2017.