Development of parallel implementation of the informative areas generation method in the spatial spectrum domain

N. Kravtsova1, R. Paringer1,2, A. Kupriyanov1,2

1 Samara National Research University, 34 Moskovskoe Shosse, 443086, Samara, Russia
2 Image Processing Systems Institute – Branch of the Federal Scientific Research Centre “Crystallography and Photonics” of Russian Academy of Sciences, 151 Molodogvardeyskaya st., 443001, Samara, Russia

Abstract

This paper proposes a parallel implementation of a method for extracting informative segments from images. The images are segmented in the spatial spectrum domain, and the median energy in each selected segment is treated as an area. To save time, a parallel implementation was developed for the areas calculation phase. The software implementation was tested on a high-performance multicore computing system.

Keywords: diagnostic crystallogram; spatial spectrum; discriminant analysis; k-NN classification; parallel implementation

1. Introduction

Computer processing of medical diagnostic images is currently one of the vital research tools and a way to improve the efficiency of early detection of various diseases. A change in the composition of body fluids is one of the informative indicators of health condition. Metabolic changes caused by pathological conditions affect the fluid composition: there are numerous changes in the molecular composition of tissues and body fluids. Converting the fluid from one phase state to another is one way to detect such changes, and crystallization is one of the most convenient methods of changing the fluid phase. Changes in the physical and chemical properties of a body fluid modify the properties of the resulting crystals. Investigating these properties is the crucial problem of crystal analysis [1]. In medicine, the crystallograms studied are structures formed by salt crystallization as a body fluid dries. In clinical practice, crystallogram analysis is based on images of these structures.
It is not always possible to identify visually the changes in key crystallogram parameters, such as predominant bar direction and bar density, that contribute to major pathological signs. Quantitative analysis and objectivity are among the advantages of computer analysis. The information contained in the image is structurally redundant. It is known that if parallel bars of a certain direction predominate in the original image, bars of the same direction also dominate its Fourier transform. This property can be used to analyze crystallograms [2, 3] and other images with a branched structure. The developed method, based on the discriminant analysis algorithm, generates the set of informative areas that is used in this paper to identify the characteristics of the initial crystallogram images. In this article, we propose a parallel implementation of the method to speed up computations.

2. Informative areas generation method

2.1. Description of the areas used

In this work, the areas (features) are derived by calculating the total energy over selected ranges of the spectrum image. Most of the spectrum does not contain information suitable for identifying the characteristics of an original image. If $F(u, v)$ is the Fourier transform of the image function, then the magnitude $|F(u, v)|^2$ defines the energy spectrum of the image. The energy spectrum can be analyzed directly, either as a whole or in part. The spectrum image in the domain of interest was segmented using the formula

$$C_{r_1 r_2 \theta_1 \theta_2} = \sum_{r = r_1}^{r_2} \sum_{\theta = \theta_1}^{\theta_2} |F(r, \theta)|^2,$$

where $\theta_1$ and $\theta_2$ are the bounding angles of the sector and $r_1$ and $r_2$ are the bounding radii of the ring. Since the spectrum image is symmetric about its center, only half of the image is used to form the areas [4, 5].

2.2. The technique of building an efficient set of areas for image discrimination

This paper describes a method for extracting informative segments from spectrum images (shown in Figure 1 as the smart area analysis stage). The informative value of a segment was estimated using the separability criteria of the discriminant analysis algorithm.

3rd International conference “Information Technology and Nanotechnology 2017” 51 High-Performance Computing / N. Kravtsova, R. Paringer, A. Kupriyanov

Methods based on the discriminant analysis algorithm have proved to be a good way to form new problem-specific areas [6]. These methods improve the reliability of data classification [7, 8]. Discriminant analysis is used to eliminate the correlation between areas and consequently to reduce the size of the area set. On the one hand, this preserves the informativeness of the feature set for classification; on the other hand, the reduced number of areas makes it possible to apply less complicated classification methods and to lower the classification error.

An individual separability criterion was calculated for each area from the equation

$$J = \mathrm{tr}(T^{-1} B),$$

where $T = B + W$, $B$ is the between-group scatter matrix, and $W$ is the within-group scatter matrix.

The informative set of areas is then generated as follows:

1. The areas are ranked in decreasing order of their individual separability criterion values.
2. The initial set consists of the area with the largest criterion value. Classification is carried out.
3. The area with the next-largest separability criterion value is added to the set. Classification is carried out on the new set of areas.
4. Step 3 is repeated until all areas are included in the set.

The informative set of areas is the one whose classification results yield the minimum error value.
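The selection procedure above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the leave-one-out error estimate, and the squared Euclidean distance used by the k-NN classifier are assumptions made for the example. For a single scalar area, the criterion $J = \mathrm{tr}(T^{-1} B)$ reduces to $B / (B + W)$, which is what the sketch computes.

```python
from collections import Counter


def separability(values, labels):
    """Individual criterion for one scalar area: J = B / (B + W)."""
    overall_mean = sum(values) / len(values)
    between = within = 0.0
    for cls in set(labels):
        group = [v for v, y in zip(values, labels) if y == cls]
        mean = sum(group) / len(group)
        between += len(group) * (mean - overall_mean) ** 2
        within += sum((v - mean) ** 2 for v in group)
    total = between + within
    return between / total if total > 0 else 0.0


def knn_error(samples, labels, k=1):
    """Leave-one-out k-NN error in percent (assumed evaluation scheme)."""
    errors = 0
    for i, x in enumerate(samples):
        neighbours = sorted(
            ((sum((a - b) ** 2 for a, b in zip(x, other)), labels[j])
             for j, other in enumerate(samples) if j != i),
            key=lambda pair: pair[0],
        )
        votes = Counter(label for _, label in neighbours[:k])
        if votes.most_common(1)[0][0] != labels[i]:
            errors += 1
    return 100.0 * errors / len(samples)


def select_areas(samples, labels, k=1):
    """Rank areas by J, grow the set one area at a time, keep the best set."""
    n_areas = len(samples[0])
    ranked = sorted(
        range(n_areas),
        key=lambda j: separability([s[j] for s in samples], labels),
        reverse=True,
    )
    best_set, best_error = [], None
    for size in range(1, n_areas + 1):
        subset = ranked[:size]
        reduced = [[s[j] for j in subset] for s in samples]
        error = knn_error(reduced, labels, k)
        if best_error is None or error < best_error:
            best_set, best_error = list(subset), error
    return best_set, best_error
```

Here `samples` is a list of per-image area vectors and `labels` holds the class of each image. `select_areas` returns the indices of the chosen areas and the corresponding error, and ties in the error are resolved in favour of the smaller set.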
The classification error, which measures the share of cases in which the classifier returns an incorrect value, is calculated from the equation

$$\varepsilon = \frac{m}{n} \cdot 100\%,$$

where $m$ is the number of classification errors and $n$ is the total number of images tested.

3. Parallel implementation of the method to generate informative areas

The entire algorithm of informative areas generation is shown as a diagram in Figure 1. Analysis of the algorithm structure showed that the first stage, computing the areas, has the greatest computational complexity and accounts for most of the run time. This stage includes pre-processing of the learning sample, generation of the spatial spectrum for each learning image, and calculation of the area values. The number of calculations can be reduced by taking into account that the spatial spectrum image is symmetric about its center, so the areas need to be calculated on only half of the spatial spectrum image.

To reduce the run time, this paper applies task division. MPI is not used; instead, tasks are distributed among threads, so the way the image is split into tasks is not important. During sample pre-processing and spatial spectrum generation, a single image is sent to each thread. At the areas calculation stage, each thread receives a separate element: an image segment obtained from the proposed segmentation. The next task is sent to a thread as soon as it finishes handling its current element. This task distribution pattern saves substantial run time at the first stage of the algorithm: during this resource-intensive stage, distributing tasks among threads achieved a three-fold acceleration when four threads were used. The curve in Figure 2 shows the relation between acceleration and the number of areas.

Fig. 1. Informative areas formation algorithm.

Fig. 2. Speedup graph.

4. Classification results

After classification for all possible segmentations, areas were selected using the developed area selection algorithm. Figure 3 shows the relation between classification error and the number of areas, taken in descending order of their separability criterion, for classification with the 2-nearest-neighbors method and a segmentation into 4 sectors and 8 rings.

Fig. 3. Relation between classification error and the number of areas.

Table 1 shows the classification error after selecting the informative area set. The way the error value included in the table is chosen is described above.

Table 1. Classification error following area selection, %.

Number of    Number of rings
sectors       1    2    3    4    5    6    7    8
1             9    9    9    9    9    9   10   10
2             8    8    7    7    7    7    7    7
3             8    7    6    6    6    6    6    7
4             7    7    6    5    5    5    4    4
5             7    7    6    6    6    6    7    7
6             7    7    7    7    7    7    7    7
7             7    7    7    7    7    7    7    7
8             7    7    7    7    7    7    7    7

5. Conclusion

This paper presented a method for generating a set of informative local areas of a spatial spectrum in order to classify medical crystallogram images, together with a parallel implementation of this method. Experimental testing of the developed software implementation demonstrated that the parallel implementation provides an almost three-fold acceleration at the areas calculation stage. The next step is to implement parallel computing at the smart area analysis stage in order to speed up calculation of the informative area set.
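As an illustration of the areas calculation stage discussed in Section 3, the following Python sketch sums the energy over ring-sector segments of the upper half of a spectrum and distributes the segments among threads. It is a sketch under stated assumptions, not the authors' code: it takes a precomputed, centred energy spectrum $|F|^2$ as a list of rows, uses equal-width rings and equal-angle sectors, and all function names are hypothetical. The thread pool hands the next (image, ring, sector) task to a worker as soon as that worker is free. In CPython, threads running pure-Python arithmetic do not execute in parallel, so the sketch shows the scheduling pattern only and is not expected to reproduce the reported speedup.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def sector_energy(energy, r1, r2, theta1, theta2):
    """Sum energy over pixels with r1 <= r < r2 and theta1 <= theta < theta2.

    Only the upper half of the centred spectrum is scanned, because the
    spectrum is symmetric about its centre.
    """
    rows, cols = len(energy), len(energy[0])
    cy, cx = rows // 2, cols // 2
    total = 0.0
    for y in range(cy + 1):              # rows at or above the centre row
        for x in range(cols):
            dy, dx = cy - y, x - cx
            r = math.hypot(dx, dy)
            theta = math.atan2(dy, dx)   # lies in [0, pi] because dy >= 0
            if r1 <= r < r2 and theta1 <= theta < theta2:
                total += energy[y][x]
    return total


def compute_areas(spectra, n_rings, n_sectors, n_threads=4):
    """Return one list of ring-sector energies per spectrum."""
    tasks = [
        (img, ring, sector)
        for img in range(len(spectra))
        for ring in range(n_rings)
        for sector in range(n_sectors)
    ]

    def run(task):
        img, ring, sector = task
        energy = spectra[img]
        r_max = min(len(energy), len(energy[0])) // 2
        dr = r_max / n_rings             # assumed equal ring width
        dt = math.pi / n_sectors         # assumed equal sector angle
        return sector_energy(
            energy, ring * dr, (ring + 1) * dr, sector * dt, (sector + 1) * dt
        )

    # The executor queue gives each idle thread the next pending task.
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        values = list(pool.map(run, tasks))

    per_image = n_rings * n_sectors
    return [
        values[i * per_image:(i + 1) * per_image] for i in range(len(spectra))
    ]
```

Because `pool.map` preserves task order, the result does not depend on the number of threads; only the scheduling changes.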
Acknowledgements

This work was partially supported by the Ministry of Education and Science of the Russian Federation in the framework of the implementation of the Program of increasing the competitiveness of SSAU among the world’s leading scientific and educational centers for 2013–2020; by the Russian Foundation for Basic Research grants (# 15-29-03823, # 15-29-07077, # 16-41-630761, # 16-29-11698, # 17-01-00972); and by the ONIT RAS program # 6 “Bioinformatics, modern information technologies and mathematical methods in medicine” 2017.

References

[1] Shirokanev AS, Kirsh DV, Kupriyanov AV. Researching of a crystal lattice parameter identification algorithm based on the gradient steepest descent method. Computer Optics 2017; 41(3): 453–460. DOI: 10.18287/2412-6179-2017-41-3-453-460.
[2] Paringer RA, Kupriyanov AV. The Method for Effective Clustering the Dendrite Crystallogram Images. 9th Open German-Russian Workshop on Pattern Recognition and Image Understanding (OGRW 2014). Electronic on-site Proceedings, University of Koblenz-Landau, 2014.
[3] Paringer RA, Kupriyanov AV. Research methods for classification of the crystallogramms images. Proceedings of the 12th international conference PRIP'2014. Minsk, Belarus, 2014; 1: 231–234.
[4] Kravtsova N, Paringer R, Kupriyanov A. Development of methods for crystallogramms images classification based on technique of detection informative areas in the spectral space. CEUR Workshop Proceedings 2016; 1638: 357–363. DOI: 10.18287/1613-0073-2016-1638-357-36.
[5] Gaidel AV, Krasheninnikov VR. Feature selection for diagnozing the osteoporosis by femoral neck X-ray images. Computer Optics 2016; 40(6): 939–946. DOI: 10.18287/2412-6179-2016-40-6-939-946.
[6] Fukunaga K. Introduction to statistical pattern recognition. San Diego: Academic Press, 1990; 592 p.
[7] Ilyasova NYu, Kupriyanov AV, Paringer RA. Formation features for improving the quality of medical diagnosis based on the discriminant analysis methods. Computer Optics 2014; 38(4): 851–855.
[8] Biryukova E, Paringer R, Kupriyanov A. Development of the effective set of features construction technology for texture image classes discrimination. CEUR Workshop Proceedings 2016; 1638: 263–269. DOI: 10.18287/1613-0073-2016-1638-357-363.