=Paper=
{{Paper
|id=Vol-2534/13_short_paper
|storemode=property
|title=Algorithms of Multispectral Aerospace Image Sequential Analysis Based on the Use of Structural-Statistical Approach for Natural Object Decoding
|pdfUrl=https://ceur-ws.org/Vol-2534/13_short_paper.pdf
|volume=Vol-2534
|authors=Aleksander P. Guk,Maxim A. Altyntsev,Larisa G. Evstratova,Marina A. Altyntseva
}}
==Algorithms of Multispectral Aerospace Image Sequential Analysis Based on the Use of Structural-Statistical Approach for Natural Object Decoding==
Aleksander P. Guk (1), Maxim A. Altyntsev (1), Larisa G. Evstratova (2), Marina A. Altyntseva (1)

(1) Siberian State University of Geosystems and Technologies, Novosibirsk

(2) State University of Land Use Planning, Moscow

Abstract. A method for decoding multispectral aerospace images based on a non-parametric approach is considered. It is proposed to analyze the objects to be recognized using cumulative distribution functions and probability density functions constructed from source images and from images transformed by various algorithms. A way of increasing decoding reliability through sequential application of transformation algorithms and the use of a large number of test samples is discussed.

Keywords: decoding, non-parametric approach, cumulative distribution function, probability density function, decision rule.

===1 Introduction===
Decoding natural objects in multispectral aerospace images is a central task in remote sensing. Each band of a multispectral aerospace image is a two-dimensional array whose values store the spectral intensity of the image elements. This stored spectral intensity is the main source of information for decoding the objects in an image. Objects can be decoded, and their quality characterized, by means of image classification. Classification reliability is affected by many factors, such as the type and resolution of the surveying system, its orientation at the time of surveying, the state of the atmosphere, cloud cover, and the susceptibility of the spectral reflection coefficients of various objects to significant change. Consequently, direct characterization of object quality is impossible. For this reason, feature vectors capable of identifying objects uniquely have to be modeled [1].
Feature vectors are usually modeled by building a model that links the features to measurements carried out in a model space. The features are chosen so that they define the object. The simplest models of this kind are the clustering models used for decoding aerospace images. Various statistical models also belong to this group: the Mahalanobis distance, maximum likelihood, etc. Most of these models assume the normal distribution. Statistical models based on the normal distribution are called parametric [2]. Parametric models rely on quantitative features and are applied to simplify the solution of pattern recognition tasks. If the distribution turns out to differ from the normal one, the validity of object recognition in images with parametric models is low. In this case it is necessary to use non-parametric models based on qualitative features.

===2 Methods of analysis===
The essence of the non-parametric approach proposed in [3, 4] is that reference features in the form of probability density functions are generated from sufficiently large samples for all object classes to be recognized. Measurements are performed on images of reference objects. The probability density function f(x) is the derivative of the cumulative distribution function F(x) and describes the density with which the values of a random variable are distributed around a given point. The cumulative distribution function F(x) gives the probability that, in a trial, the random variable X takes a value less than x. Values of the cumulative distribution function belong to the interval [0, 1]. Before either the probability density function or the cumulative distribution function is assessed, the distribution must be checked for normality.
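As an illustration of the two functions defined above, the following minimal sketch estimates a discrete probability density and the corresponding cumulative distribution from the pixel brightness values of one image site. The function names, the integer brightness scale and the sample values are assumptions made for this example and do not come from the paper.

```python
from itertools import accumulate


def histogram_pdf(pixels, b_max):
    """Normalized histogram over integer brightness levels 0..b_max.

    With unit-width bins this is a discrete estimate of the density f(x).
    """
    counts = [0] * (b_max + 1)
    for value in pixels:
        counts[value] += 1
    total = len(pixels)
    return [count / total for count in counts]


def cdf_from_pdf(pdf):
    """Running sum of the density: cdf[k] estimates P(X <= k).

    For integer brightness this equals P(X < k + 1), and the density is
    recovered as the difference cdf[k] - cdf[k - 1].
    """
    return list(accumulate(pdf))


# Hypothetical brightness values of one image site in one band.
pixels = [3, 5, 5, 7, 7, 7, 9, 10]
pdf = histogram_pdf(pixels, b_max=10)
cdf = cdf_from_pdf(pdf)

print(pdf[7])   # share of pixels with brightness 7
print(cdf[7])   # share of pixels with brightness <= 7
print(cdf[-1])  # the cumulative distribution ends at 1
```

The histogram plays the role of f(x) and its running sum the role of F(x), so the values of `cdf` stay within [0, 1] as stated in the text.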
The normality check is carried out on the basis of the ω² criterion [3]:

:<math>\omega_m^2 = \int \Psi^2\left[F(X_m, \mu)\right] \left[F_m(X_m) - F_m(X_m, \mu)\right]^2 \, dF(X_m, \mu)</math>

where <math>F_m(X_m)</math> is the empirical distribution function of the sample <math>\{x_i\}_m</math>; <math>F_m(X_m, \mu)</math> is the normal distribution function with parameters <math>\mu</math>; and <math>\Psi^2[F]</math> is the weight function.

Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Non-parametric models are appropriate when the distribution differs from the normal one. Reference functions also differ between surveying systems. For this reason, the non-parametric approach requires a database of probability density and cumulative distribution functions for each surveying system and for all classes to be recognized. The reference function database is built with the help of cartographic materials: reference functions are created for image sites that correspond to a certain class on a map. Then, after any other area has been decoded with a segmentation method, the probability density and cumulative distribution functions are created for each image site. The created functions are compared with the reference ones using a given decision rule. Based on an earlier study, the value of Pearson's correlation coefficient calculated between two functions was chosen as the decision rule for assessing probability density functions. A special case of Kolmogorov's criterion proposed by the authors was chosen as the decision rule for assessing cumulative distribution functions.

To compare cumulative distribution functions using the special case of Kolmogorov's criterion, the greatest brightness value Bmax among all compared image sites is first determined in each spectral band. Some of these image sites are the ones being decoded; the others are reference sites.
The cumulative distribution function is then calculated over the range [0, Bmax] for the corresponding site in each band. Next, the brightness values B of the site under test are determined at the cumulative distribution function values that are multiples of 0.1 in the range [0, 1]. From these brightness values, a brightness vector f of size 1×10, corresponding to the cumulative distribution function values that are multiples of 0.1, is formed for each spectral band. The vectors f<sub>i</sub> for the cumulative distribution functions of the reference image sites are calculated in the same manner. At the next stage, the distance r between the vector f and each of the vectors f<sub>i</sub> is calculated:

:<math>r = \frac{\sum_{j=1}^{10} \left| f[j] - f_i[j] \right|}{10} \qquad (1)</math>

The distances between the cumulative distribution functions of the decoded site and of each reference site are compared band by band. The decoded site is assigned to the reference class for which the distance calculated by formula (1) is smallest. The total distance over the functions of all image bands can also be calculated.

The results of such an analysis can have varying degrees of reliability for certain object types in different spectral bands. For example, a water area can be recognized correctly in various spectral bands by comparing either cumulative distribution functions or probability density functions, while the recognition reliability for forest species is significantly lower. A certain forest species may be recognized correctly only with one of the function types calculated for a certain spectral band [6].

Recognition of forest species is the most difficult task. It can happen that forest species are not recognized in any band. In this case, instead of the multispectral source image, an image transformed in accordance with an a priori specified probability model of multispectral measurements can be used as the source feature space. Suitable transformation algorithms include principal component analysis, independent component analysis, Tasseled Cap, and vegetation indices.
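The distance rule of formula (1) can be illustrated with a minimal sketch. It assumes integer brightness values, takes the ten levels to be 0.1, 0.2, ..., 1.0, and defines the brightness at a level as the smallest B whose cumulative share of pixels reaches that level. The function names and all sample values are hypothetical and serve only to show the mechanics.

```python
def brightness_vector(pixels, b_max):
    """Brightness values at which the CDF reaches 0.1, 0.2, ..., 1.0.

    For each level the smallest brightness B in [0, b_max] whose cumulative
    share of pixels is at least that level is taken (an assumption).
    """
    counts = [0] * (b_max + 1)
    for value in pixels:
        counts[value] += 1
    total = len(pixels)
    vector = []
    cumulative = 0
    for brightness in range(b_max + 1):
        cumulative += counts[brightness]
        # Integer comparison avoids rounding problems with 0.1 steps:
        # cumulative / total >= level / 10.
        while len(vector) < 10 and cumulative * 10 >= (len(vector) + 1) * total:
            vector.append(brightness)
    return vector


def distance(f, f_i):
    """Mean absolute difference of two 10-element brightness vectors."""
    return sum(abs(a - b) for a, b in zip(f, f_i)) / 10


def classify(test_pixels, references):
    """Assign the test site to the reference class with the smallest distance."""
    b_max = max(max(test_pixels), *(max(p) for p in references.values()))
    f = brightness_vector(test_pixels, b_max)
    distances = {
        name: distance(f, brightness_vector(pixels, b_max))
        for name, pixels in references.items()
    }
    return min(distances, key=distances.get), distances


# Hypothetical single-band samples; real sites contain far more pixels.
references = {
    "birch": [40, 42, 44, 46, 48, 50, 52, 54, 56, 58],
    "pine": [30, 32, 34, 36, 38, 40, 42, 44, 46, 48],
    "water": [5, 6, 7, 8, 9, 10, 11, 12, 13, 14],
}
test_site = [41, 43, 45, 47, 49, 51, 53, 55, 57, 59]

label, distances = classify(test_site, references)
print(label, distances)
```

For several bands, the same comparison would be repeated per band, and the per-band distances could then be compared separately or combined into a total distance.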
The final reliability of recognizing various object classes can be increased significantly by calculating the considered functions on transformed images and then sequentially analyzing the results, comparing their similarity with one of the proposed decision rules [7, 8]. The most appropriate method for such sequential analysis is the decision tree, a multi-step algorithm. Decision trees describe the rules for dividing data as a sequential, hierarchical structure in which each object corresponds to exactly one node that gives the decision.

===3 Results===
To analyze the results of natural object decoding based on the structural-statistical approach and on algorithms for transforming source multispectral images, a four-band Ikonos space image of an area close to Akademgorodok in Novosibirsk was chosen. The resolution of each band is 3.2 m. Samples were created from this image on the basis of a thematic map of forest species composition (Fig. 1). The sites with the largest area were chosen as reference samples. Figure 2 shows an example of a reference sample, outlined in red, that corresponds to pine forest. In total, the following object classes were chosen as samples: birch forest, pine forest, aspen forest, ground, and water. The area of each reference sample was at least 3 ha. To estimate decoding reliability with the probability density and cumulative distribution functions, test sites were also chosen according to the thematic map.

Figure 1. Ikonos space image and thematic map of the Akademgorodok area.

Figure 2. Reference sample of pine forest.
Probability density and cumulative distribution functions were calculated for each band of the multispectral image, for the image transformed with a vegetation index, and for each component obtained by transforming the image with principal component analysis. Figure 3 shows an example of the cumulative distribution functions calculated in the red band of the source image for the reference samples and for one of the decoded samples. A birch forest site was chosen as the decoded sample. The figure also shows the distance from the test sample to each reference sample, calculated by formula (1). The minimum distance was obtained between the cumulative distribution function of the birch forest test sample and that of the reference sample of the same forest type. This means that the sample was decoded correctly.

Figure 3. Example of cumulative distribution function calculation for the red band.

Figure 4 shows an example of the probability density functions calculated in the red band for the same reference samples and the same test sample.

Figure 4. Example of probability density function calculation for the red band.

The correlation coefficients shown in Figure 4 were calculated between the probability density function of the test sample and those of the reference samples. The highest correlation coefficient was obtained between the probability density function of the birch forest test sample and that of the reference sample for this forest type. Thus, the test sample was decoded correctly in the red band using both the cumulative distribution function and the probability density function.

As noted above, the described functions can be calculated not only for source multispectral images but also for images transformed with a certain algorithm. Transformed images can increase the decoding reliability for a given test sample.
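The correlation-based decision rule used for the probability density functions can be sketched as follows. This is a minimal illustration, assuming that each density is given as a list of histogram values over the same brightness bins; the function names and the five-bin densities are hypothetical and are not taken from the paper's data.

```python
import math


def pearson(u, v):
    """Pearson's correlation coefficient between two equally long sequences."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    covariance = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / (norm_u * norm_v)


def best_match(test_pdf, reference_pdfs):
    """Pick the reference class whose density correlates most with the test one."""
    scores = {name: pearson(test_pdf, pdf) for name, pdf in reference_pdfs.items()}
    return max(scores, key=scores.get), scores


# Hypothetical densities over five brightness bins of one band.
reference_pdfs = {
    "birch": [0.10, 0.25, 0.35, 0.20, 0.10],
    "water": [0.50, 0.30, 0.10, 0.05, 0.05],
}
test_pdf = [0.10, 0.20, 0.40, 0.20, 0.10]

match, scores = best_match(test_pdf, reference_pdfs)
print(match, scores)
```

Unlike the distance rule for cumulative distribution functions, where the smallest value wins, here the class with the largest coefficient is selected.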
Table 1 gives the results of comparing the considered test sample with the reference samples for each spectral band separately, for the four-dimensional space of the image, for all components of the image transformed with principal component analysis, and for the index image obtained by calculating the normalized difference vegetation index (NDVI). These results demonstrate that decoding reliability differs significantly depending on the data used to calculate the cumulative distribution function and the probability density function. To estimate objectively which image transformation algorithms provide the greatest decoding reliability for a given object class, the functions must be calculated for a larger number of test samples and compared with the reference ones. The algorithm and the function that provide the greatest separation of classes should be chosen as the basis. To achieve higher reliability, it is also possible to apply several transformation algorithms sequentially by means of a decision tree.

===4 Conclusion===
Applying the non-parametric approach to the decoding of aerospace images can significantly improve the recognition of various object classes. The proposed decision rules make it possible to estimate the differences between cumulative distribution functions and between probability density functions calculated for source images and for images transformed with various algorithms. By comparing these functions calculated for a large number of samples of various types, it is possible to choose the function and to select the bands of source and transformed images that give higher reliability. Moreover, sequentially combining several transformation algorithms can provide the highest reliability.
Further study will be directed at collecting more statistical information in order to find stable statistical characteristics of the brightness distribution of various object classes in source and transformed multispectral images, at determining the sequence in which transformation algorithms should be applied, and at choosing the typical part of a function that defines the greatest separation of classes.

{| class="wikitable"
|+ Table 1. The results of the test sample decoding
! Test sample class !! Spectral band !! Reference sample class !! Distance between the cumulative distribution function of the test sample and that of the reference one !! Correlation coefficient between the probability density function of the test sample and that of the reference one
|-
| rowspan="50" | Birch || rowspan="5" | Red || Pine || 9.0 || 0.7673
|-
| Aspen || 19.2 || 0.6868
|-
| Birch || 6.3 || 0.8681
|-
| Water || 24.0 || 0.0187
|-
| Ground || 104.5 || -0.2122
|-
| rowspan="5" | Blue || Pine || 2.9 || 0.8660
|-
| Aspen || 7.6 || 0.7098
|-
| Birch || 0.1 || 0.9893
|-
| Water || 46.0 || -0.0401
|-
| Ground || 39.4 || -0.0747
|-
| rowspan="5" | Green || Pine || 9.6 || 0.7393
|-
| Aspen || 11.9 || 0.8241
|-
| Birch || 6.2 || 0.8915
|-
| Water || 68.5 || -0.0600
|-
| Ground || 63.1 || 0.1426
|-
| rowspan="5" | Infrared || Pine || 23.8 || 0.7612
|-
| Aspen || 22.6 || 0.6350
|-
| Birch || 47.0 || 0.6471
|-
| Water || 273.9 || -0.0821
|-
| Ground || 16.8 || 0.6952
|-
| rowspan="5" | Four-dimensional space || Pine || 27.882 || -
|-
| Aspen || 33.791 || -
|-
| Birch || 47.856 || -
|-
| Water || 288.356 || -
|-
| Ground || 130.424 || -
|-
| rowspan="5" | The first component || Pine || 22.4 || 0.7547
|-
| Aspen || 21.9 || 0.6391
|-
| Birch || 45.5 || 0.6491
|-
| Water || 281.6 || -0.0898
|-
| Ground || 15.1 || 0.6772
|-
| rowspan="5" | The second component || Pine || 11.7 || 0.7387
|-
| Aspen || 19.0 || 0.7267
|-
| Birch || 11.1 || 0.8317
|-
| Water || 62.0 || -0.1005
|-
| Ground || 111.2 || -0.4171
|-
| rowspan="5" | The third component || Pine || 3.1 || 0.8810
|-
| Aspen || 9.5 || 0.7451
|-
| Birch || 2.0 || 0.9269
|-
| Water || 5.8 || 0.3535
|-
| Ground || 57.4 || -0.1217
|-
| rowspan="5" | The fourth component || Pine || 3.1 || 0.8743
|-
| Aspen || 2.7 || 0.9244
|-
| Birch || 4.7 || 0.8576
|-
| Water || 3.6 || 0.7447
|-
| Ground || 12.0 || 0.3227
|-
| rowspan="5" | NDVI || Pine || 0.006 || 0.8738
|-
| Aspen || 0.032 || 0.7242
|-
| Birch || 0.038 || 0.7621
|-
| Water || 0.678 || -0.0967
|-
| Ground || 0.192 || -0.0920
|}

===References===
[1] Guk A.P., Evstratova L.G., Khlebnikova E.P., Arbuzov S.A., Altyntsev M.A., Gordienko A.S., Guk A.A., Simonov D.P. Development of techniques for automated decoding of aerospace images.
Object picture interpretive features on multispectral satellite images // Geodesy and Cartography. 2013. Vol. 7. P. 31-40.

[2] Guk A.P., Evstratova L.G. Study of the efficiency criteria for estimating statistical non-parametric methods for forest decoding // Regional Problems of Earth Remote Sensing: Proceedings of the V International Scientific Conference, September, 11-14, 2016, Krasnoyarsk: Siberian Federal University. 2018. P. 12-15.

[3] Guk A.P., Evstratova L.G. New statistical approach of forest image recognition // Regional Problems of Earth Remote Sensing: Proceedings of the III International Scientific Conference, September, 13-16, 2016, Krasnoyarsk: Siberian Federal University. 2016. P. 14-17.

[4] Guk A.P. Automation of photo interpretation. Theoretical aspects of statistic recognition of images // Izvestia vuzov. Geodesy and aerophotography. 2015. Vol. 5/C. P. 166-170.

[5] Fukunaga K. Introduction to statistical pattern recognition. London: Academic Press; 2 Edition, 1990. 592 p.

[6] Guk A.P., Shlyakhova M.M. Study of statistical characteristics of multispectral forest space images // Regional Problems of Earth Remote Sensing: Proceedings of the V International Scientific Conference, September, 11-14, 2016, Krasnoyarsk: Siberian Federal University. 2018. P. 105-108.

[7] Guk A.P., Shlyakhova M.M. The efficiency analysis of the principal component analysis application when using non-parametric statistical approach to image decoding // Regional Problems of Earth Remote Sensing: Proceedings of the IV International Scientific Conference, September, 12-15, 2017, Krasnoyarsk: Siberian Federal University. 2017. P. 89-94.

[8] Guk A.P., Evstratova L.G. The main direction improve of automatic classification of forest land, using multi spectral aero spase imageries // J. Sib. Fed. Univ. Eng. technol., 2018, 11(8), 892-901. DOI: 10.17516/1999-494X-0111.