=Paper= {{Paper |id=Vol-2534/13_short_paper |storemode=property |title=Algorithms of Multispectral Aerospace Image Sequential Analysis Based on the Use of Structural-Statistical Approach for Natural Object Decoding |pdfUrl=https://ceur-ws.org/Vol-2534/13_short_paper.pdf |volume=Vol-2534 |authors=Aleksander P. Guk,Maxim A. Altyntsev,Larisa G. Evstratova,Marina A. Altyntseva }} ==Algorithms of Multispectral Aerospace Image Sequential Analysis Based on the Use of Structural-Statistical Approach for Natural Object Decoding== https://ceur-ws.org/Vol-2534/13_short_paper.pdf
    Algorithms of Multispectral Aerospace Image Sequential Analysis
           Based on the Use of Structural-Statistical Approach
                      for Natural Object Decoding

            Aleksander P. Guk (1), Maxim A. Altyntsev (1), Larisa G. Evstratova (2), Marina A. Altyntseva (1)
                      (1) Siberian State University of Geosystems and Technologies, Novosibirsk
                      (2) State University of Land Use Planning, Moscow




               Abstract. A method for decoding multispectral aerospace images based on a non-parametric
               approach is considered. It is proposed to analyze the objects to be recognized using the
               cumulative distribution function and the probability density function constructed from
               source images and from images transformed by various algorithms. A way of increasing
               decoding reliability by applying image transformation algorithms sequentially and by
               using a large number of test samples is discussed.

               Keywords: decoding, non-parametric approach, cumulative distribution function,
               probability density function, decision rule.

1        Introduction
    Decoding natural objects in multispectral aerospace images is a central task of remote sensing. Each band of a
multispectral aerospace image is a two-dimensional array whose values store the spectral intensity of the image
elements. This stored intensity represents the spectral brightness of the objects and is the main source of
information for decoding them. Objects can be decoded, and their quality characterized, on the basis of image
classification. Classification reliability is affected by many factors, such as the type and resolution of the
surveying system, its orientation at the time of surveying, the state of the atmosphere, cloud cover, and the
susceptibility of the spectral reflection coefficients of various objects to significant changes. Consequently, the
quality of objects cannot be characterized directly. For this reason it is necessary to model feature vectors that are
able to identify objects uniquely [1].
    Feature vectors are mostly modeled by building a model that links the features with measurements carried out in a
model space. The features are chosen so that an object is defined by them. The simplest models of this type are the
clustering models used for decoding aerospace images. Various statistical models also belong to this group: the
Mahalanobis distance, the maximum likelihood method, etc. Most of these models assume the normal distribution.
Statistical models that rely on the normal distribution are called parametric [2].
    Parametric models are based on quantitative features and are applied to simplify the solution of the pattern
recognition task. If the distribution differs from the normal one, the reliability of object recognition in images
with a parametric model is low. In this case it is necessary to use non-parametric models based on qualitative
features.

2        Methods of analysis
    The essence of the non-parametric approach proposed in [3, 4] is that reference features in the form of
probability density functions are generated from sufficiently large samples for all object classes to be recognized.
The measurements are taken from images of reference objects.
    The probability density function f(x) is the derivative of the cumulative distribution function F(x) and describes
the density with which the values of a random variable are distributed at a given point. The cumulative distribution
function gives the probability that, as the result of a trial, the random variable X takes a value less than x. Values
of the cumulative distribution function belong to the interval [0, 1].
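    The two functions can be estimated for one image site as in the following minimal sketch. It assumes integer
brightness values in the range [0, b_max]; the function name empirical_pdf_cdf and the sample data are hypothetical.
For simplicity cdf[b] is the share of pixels with brightness less than or equal to b, which differs from the strict
"less than x" definition only by a shift of one brightness level.

```python
def empirical_pdf_cdf(pixels, b_max):
    """Estimate the PDF and CDF of integer brightness values in [0, b_max]."""
    counts = [0] * (b_max + 1)
    for value in pixels:
        counts[value] += 1

    total = len(pixels)
    pdf = [c / total for c in counts]      # share of pixels at each brightness

    cdf = []
    running = 0
    for c in counts:
        running += c                       # cumulative pixel count
        cdf.append(running / total)        # share of pixels with brightness <= b
    return pdf, cdf


# Hypothetical brightness values of one image site in one band.
site_pixels = [0, 1, 1, 2, 2, 2, 3, 3]
pdf, cdf = empirical_pdf_cdf(site_pixels, b_max=3)
```

The last value of cdf is always 1, in agreement with the interval [0, 1] given above.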
    Before estimating the probability density function and the cumulative distribution function it is necessary to
check the distribution for normality. The normality check is carried out on the basis of the ω² criterion [3]:

    _______________________________________________

Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0
International (CC BY 4.0).
    $$ \omega_m^2 = m \int_{-\infty}^{\infty} \Psi^2[F(X_m, \mu)] \, [F_m(X_m) - F_m(X_m, \mu)]^2 \, dF(X_m, \mu) $$

where $F_m(X_m)$ is the empirical distribution function of the sample $\{x_i\}_m$; $F_m(X_m, \mu)$ is the normal
distribution function with parameters $\mu$; and $\Psi^2[F]$ is the weight function.
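    A possible way to evaluate this statistic numerically is sketched below. The sketch makes several assumptions
that are not stated in the text: the weight function is taken as unity, the parameters of the normal distribution
are estimated from the sample itself (sample mean and standard deviation), and the standard computational form of
the statistic for an ordered sample is used. The critical value for accepting normality is not specified here and
is left to the user. The names normal_cdf and omega_squared are hypothetical.

```python
import math


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of the normal law N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def omega_squared(sample):
    """m * omega^2 statistic with unit weight (assumption: Psi^2 = 1)."""
    m = len(sample)
    if m < 2:
        raise ValueError("at least two observations are required")
    mu = sum(sample) / m
    sigma = math.sqrt(sum((x - mu) ** 2 for x in sample) / (m - 1))
    if sigma == 0.0:
        raise ValueError("sample has zero variance")

    # Computational form for an ordered sample:
    # 1/(12m) + sum_i (F(x_(i)) - (2i - 1) / (2m))^2
    stat = 1.0 / (12.0 * m)
    for i, x in enumerate(sorted(sample), start=1):
        stat += (normal_cdf(x, mu, sigma) - (2 * i - 1) / (2.0 * m)) ** 2
    return stat


# Hypothetical samples: a roughly symmetric one and a strongly skewed one.
symmetric = [-2, -1, -1, 0, 0, 0, 1, 1, 2]
skewed = [1, 1, 1, 1, 1, 1, 1, 1, 50]
stat_symmetric = omega_squared(symmetric)
stat_skewed = omega_squared(skewed)
```

A larger value of the statistic indicates a stronger departure from the normal distribution, so the skewed sample
yields a noticeably larger value than the symmetric one.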
    Non-parametric models are appropriate when the distribution differs from the normal one.
    Reference functions also differ between surveying systems. For this reason, the non-parametric approach requires
a database of probability density and cumulative distribution functions for each surveying system and for all classes
to be recognized. The reference function database is built with the help of cartographic materials: reference
functions are created for image sites that correspond to a certain class on a map. Then, after an image of any other
area has been divided into sites by a segmentation method, the probability density and cumulative distribution
functions are created for each image site. The created functions are compared with the reference ones using a given
decision rule.
    Based on an earlier study, the value of Pearson's correlation coefficient calculated between the functions of two
image sites was chosen as the decision rule for comparing probability density functions. A special case of
Kolmogorov's criterion proposed by the authors was chosen as the decision rule for comparing cumulative distribution
functions.
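    The decision rule for probability density functions can be illustrated by the following sketch. It assumes
that all functions are sampled on the same brightness grid; the function names and the sample values are
hypothetical and serve only as an illustration.

```python
import math


def pearson(u, v):
    """Pearson's correlation coefficient between two equally long sequences."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    if var_u == 0.0 or var_v == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_u * var_v)


def most_correlated(test_pdf, reference_pdfs):
    """Return the reference class whose PDF correlates best with the test PDF."""
    scores = {name: pearson(test_pdf, pdf) for name, pdf in reference_pdfs.items()}
    return max(scores, key=scores.get), scores


# Hypothetical probability density functions on a four-level brightness grid.
test_pdf = [0.1, 0.2, 0.4, 0.3]
reference_pdfs = {
    "class_a": [0.1, 0.3, 0.4, 0.2],
    "class_b": [0.3, 0.4, 0.2, 0.1],
}
best_class, scores = most_correlated(test_pdf, reference_pdfs)
```

The test site is assigned to the class with the highest correlation coefficient, here class_a.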
    To compare cumulative distribution functions using the special case of Kolmogorov's criterion, the greatest
brightness value Bmax among all compared image sites is first determined in each spectral band. Some of these image
sites are the ones being decoded, the others are reference sites. The cumulative distribution function is calculated
in the range [0, Bmax] for each site in each band. Then the brightness values B of the site under test are determined
at the values of the cumulative distribution function that are multiples of 0.1 in the range [0, 1]. From these
brightness values a brightness vector f of size 1x10 is formed for each spectral band. The vectors fi for the
cumulative distribution functions of the reference image sites are calculated in the same manner. At the next stage
the distance r between the vector f and each of the vectors fi is calculated:

    $$ r = \frac{\sum_{j=1}^{10} \left| f[j] - f_i[j] \right|}{10} $$                                              (1)

    The distances between the cumulative distribution functions calculated for the bands of each reference site and
of the decoded site are compared with each other. The decoded site is assigned to the reference class for which the
distance calculated by formula (1) is the smallest. The total distance over the functions of all image bands can also
be calculated.
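    The comparison of cumulative distribution functions described above can be sketched as follows. The sketch
assumes that the ten levels are 0.1, 0.2, ..., 1.0, that the brightness for a level is the smallest brightness at
which the function reaches that level, and that the distance is the mean absolute difference of the brightness
vectors. All names and sample values are hypothetical.

```python
def quantile_vector(cdf, levels=10):
    """Brightness values at which the CDF first reaches 0.1, 0.2, ..., 1.0."""
    vector = []
    for k in range(1, levels + 1):
        target = k / levels
        for brightness, value in enumerate(cdf):
            if value >= target - 1e-12:    # small tolerance for rounding
                vector.append(brightness)
                break
        else:
            vector.append(len(cdf) - 1)    # fall back to the greatest brightness
    return vector


def distance(f, f_i):
    """Mean absolute difference between two brightness vectors."""
    return sum(abs(a - b) for a, b in zip(f, f_i)) / len(f)


def nearest_reference(test_cdf, reference_cdfs):
    """Assign the test site to the reference class with the smallest distance."""
    f = quantile_vector(test_cdf)
    distances = {
        name: distance(f, quantile_vector(cdf))
        for name, cdf in reference_cdfs.items()
    }
    return min(distances, key=distances.get), distances


# Hypothetical CDFs sampled at brightness 0..4 (Bmax = 4) for one band.
test_cdf = [0.0, 0.2, 0.5, 0.9, 1.0]
reference_cdfs = {
    "class_a": [0.0, 0.1, 0.6, 0.9, 1.0],
    "class_b": [0.5, 0.8, 1.0, 1.0, 1.0],
}
label, distances = nearest_reference(test_cdf, reference_cdfs)
```

Because the distance is expressed in brightness units, its magnitude depends on the brightness range of the band or
transformed image for which it is calculated.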
    The reliability of such analysis can differ between object types and spectral bands. For example, a water area
can be recognized correctly in various spectral bands by comparing both cumulative distribution functions and
probability density functions, while the recognition reliability of forest species is significantly lower. It can
happen that a certain forest species is recognized correctly only with one of the function types calculated for a
certain spectral band [6].
    Recognition of forest species is the most difficult task. It can happen that forest species are not recognized in
any band. In this case, instead of the source multispectral image, an image transformed in accordance with an a
priori specified probability model of multispectral measurements can be used as the source feature space. The
transformation can be one of such algorithms as principal component analysis, independent component analysis,
Tasseled Cap, or vegetation indices. The final recognition reliability for various object classes can be increased
significantly by calculating the considered functions from transformed images and by sequentially analyzing the
results, comparing their similarity with one of the proposed decision rules [7, 8].
    The most appropriate algorithm for the sequential analysis is the decision tree. The decision tree is a
multi-step algorithm. Decision trees describe the rules for dividing data as a sequential, hierarchical structure in
which each object corresponds to a single node that gives the decision.
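    The structure of the decision tree is not specified in this paper, so the following is only an illustrative
sketch of one possible sequential scheme, not the authors' algorithm. It assumes that each stage supplies distances
from the test site to every class in one feature space, and that classes whose distance exceeds a chosen multiple of
the smallest distance are rejected. The function name, the ratio parameter and all values are hypothetical.

```python
def sequential_decode(stages, classes, ratio=1.5):
    """Narrow down candidate classes stage by stage (hypothetical scheme).

    stages  -- list of dicts mapping class name to a distance in one feature space
    classes -- initial list of candidate class names
    ratio   -- a class is kept if its distance <= ratio * smallest distance
    """
    candidates = list(classes)
    for distances in stages:
        best = min(distances[c] for c in candidates)
        candidates = [c for c in candidates if distances[c] <= ratio * best]
        if len(candidates) == 1:
            break                          # a single class remains: decision made
    return candidates


# Hypothetical distances for two consecutive feature spaces.
stage_1 = {"forest_a": 2.0, "forest_b": 2.5, "water": 40.0}
stage_2 = {"forest_a": 6.0, "forest_b": 1.0, "water": 0.5}
decision = sequential_decode([stage_1, stage_2], ["forest_a", "forest_b", "water"])
```

In this example the first stage rejects water and the second stage separates the two forest classes, so a class
rejected early is not reconsidered even if a later feature space would favor it.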

3        Results
    To analyze the results of natural object decoding based on the structural-statistical approach and on algorithms
for transforming source multispectral images, a four-band Ikonos space image of an area close to Akademgorodok in
Novosibirsk was chosen. The resolution of each band is 3.2 m. Samples were created from this image on the basis of a
thematic map of forest species composition (Fig. 1). The sites with the largest area were chosen as reference
samples.
    Figure 2 shows an example of a reference sample, outlined by a red contour, that corresponds to pine forest. In
total, the following object classes were chosen as samples: birch forest, pine forest, aspen forest, ground, and
water. The area of each reference sample was at least 3 ha. To estimate the reliability of decoding with the
probability density and cumulative distribution functions, test sites were also chosen according to the thematic map.
                     Figure 1. Ikonos space image and thematic map of the Akademgorodok area.




                                     Figure 2. Reference sample of pine forest.


    Probability density and cumulative distribution functions were calculated for each band of the multispectral
image, for the image transformed with a vegetation index, and for each component obtained by transforming the image
with principal component analysis. Figure 3 shows an example of cumulative distribution functions calculated in the
red band of the source image for the reference samples and one of the decoded samples. A birch forest site was chosen
as the decoded sample. The figure also shows the distance from the test sample to each of the reference samples
calculated by formula (1). The minimum distance was obtained between the cumulative distribution functions of the
birch forest test sample and the reference sample of this forest type. This means that the sample was decoded
correctly.




                Figure 3. Example of cumulative distribution function calculation for the red band.

    Figure 4 shows an example of probability density functions calculated in the red band for the same reference
samples and the same test sample.
                   Figure 4. Example of probability density function calculation for the red band.

    The correlation coefficients shown in Figure 4 were calculated between the probability density function of the
test sample and those of the reference samples. The highest correlation coefficient was obtained between the
probability density function of the birch forest test sample and that of the reference sample for this forest type.
    Thus, the test sample was decoded correctly in the red band using both the cumulative distribution function and
the probability density function.
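    Both decision rules can be checked against the red band values reported in Table 1. The sketch below only
restates those values and applies the two rules; the variable names are hypothetical.

```python
# Red band rows of Table 1: reference class -> (distance, correlation coefficient).
red_band = {
    "Pine": (9.0, 0.7673),
    "Aspen": (19.2, 0.6868),
    "Birch": (6.3, 0.8681),
    "Water": (24.0, 0.0187),
    "Ground": (104.5, -0.2122),
}

# Rule for cumulative distribution functions: smallest distance wins.
by_distance = min(red_band, key=lambda name: red_band[name][0])

# Rule for probability density functions: highest correlation wins.
by_correlation = max(red_band, key=lambda name: red_band[name][1])
```

Both rules select Birch, which is the true class of the test sample.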
    As noted above, the described functions can be calculated not only for source multispectral images but also for
images transformed with a certain algorithm. Transformed images can increase the reliability of decoding a certain
test sample. Table 1 gives the results of comparing the considered test sample with the reference ones for each
spectral band separately, for the four-dimensional space of the image, for all components of the image transformed
with principal component analysis, and for the index image obtained with the normalized difference vegetation index
(NDVI). These results demonstrate that the decoding reliability differs significantly depending on the data used to
calculate the cumulative distribution function and the probability density function. For example, in the infrared
band the smallest distance (16.8) is to the ground reference sample and the highest correlation coefficient (0.7612)
is to pine, so the birch test sample is decoded incorrectly by both rules, whereas in the blue band both rules select
birch (distance 0.1, correlation coefficient 0.9893).
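    For reference, the NDVI of a pixel is the difference between its near-infrared and red brightness divided by
their sum. A minimal sketch is given below; it assumes that the two bands are supplied as equally long sequences of
non-negative brightness values and that pixels with a zero denominator are assigned the value 0. The function name
and the sample values are hypothetical.

```python
def ndvi(red, nir):
    """Normalized difference vegetation index, (NIR - Red) / (NIR + Red), per pixel."""
    result = []
    for r, n in zip(red, nir):
        denominator = n + r
        # Assumption: pixels with no signal in both bands are set to 0.
        result.append((n - r) / denominator if denominator != 0 else 0.0)
    return result


# Hypothetical brightness values of three pixels in the red and infrared bands.
red_values = [50, 100, 0]
nir_values = [150, 100, 0]
index_values = ndvi(red_values, nir_values)
```

The index takes values in the range [-1, 1], which explains why the distances reported for NDVI in Table 1 are much
smaller in magnitude than those for the source bands.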
    To estimate objectively which image transformation algorithms provide the greatest decoding reliability for a
certain object class, it is necessary to calculate the functions for a larger number of test samples and to compare
them with the reference ones. The algorithm and the function that provide the greatest separation of classes should
be chosen as the basis. To achieve higher reliability it is also possible to apply several transformation algorithms
sequentially by means of the decision tree.

4        Conclusion
    Applying the non-parametric approach to the decoding of aerospace images can significantly improve the
recognition of various object classes. The proposed decision rules make it possible to estimate the differences
between cumulative distribution functions and between probability density functions calculated for source images and
for images transformed with various algorithms. By comparing these functions calculated for a large number of samples
of various types, it is possible to choose the function and the bands of source and transformed images that provide
higher reliability. Moreover, sequentially combining several transformation algorithms can provide the highest
reliability.
    Further study will be directed at collecting more statistical information in order to find stable statistical
characteristics of the brightness distribution of various object classes in source and transformed multispectral
images, as well as at determining the sequence in which transformation algorithms are applied and at choosing the
characteristic section of the function that gives the greatest separation of classes.
                           Table 1. The results of the test sample decoding

 Test         Spectral band               Reference     Distance between the     Correlation coefficient
sample                                     sample       cumulative distribution  between the probability
 class                                      class       functions of the test    density functions of the
                                                        sample and the           test sample and the
                                                        reference sample         reference sample

Birch             Red                       Pine                 9.0                     0.7673
                                           Aspen                19.2                     0.6868
                                            Birch                6.3                     0.8681
                                           Water                24.0                     0.0187
                                           Ground              104.5                    -0.2122
                  Blue                      Pine                 2.9                     0.8660
                                           Aspen                 7.6                     0.7098
                                            Birch                0.1                     0.9893
                                           Water                46.0                    -0.0401
                                           Ground               39.4                    -0.0747
                 Green                      Pine                 9.6                     0.7393
                                           Aspen                11.9                     0.8241
                                            Birch                6.2                     0.8915
                                           Water                68.5                    -0.0600
                                           Ground               63.1                     0.1426
                Infrared                    Pine                23.8                     0.7612
                                           Aspen                22.6                     0.6350
                                            Birch               47.0                     0.6471
                                           Water               273.9                    -0.0821
                                           Ground               16.8                     0.6952
         Four-dimensional space             Pine               27.882                       -
                                           Aspen               33.791                       -
                                            Birch              47.856                       -
                                           Water              288.356                       -
                                           Ground             130.424                       -
           The first component              Pine                22.4                     0.7547
                                           Aspen                21.9                     0.6391
                                            Birch               45.5                     0.6491
                                           Water               281.6                    -0.0898
                                           Ground               15.1                     0.6772
          The second component              Pine                11.7                     0.7387
                                           Aspen                19.0                     0.7267
                                            Birch               11.1                     0.8317
                                           Water                62.0                    -0.1005
                                           Ground              111.2                    -0.4171
           The third component              Pine                 3.1                     0.8810
                                           Aspen                 9.5                     0.7451
                                            Birch                2.0                     0.9269
                                           Water                 5.8                     0.3535
                                           Ground               57.4                    -0.1217
          The fourth component              Pine                 3.1                     0.8743
                                           Aspen                 2.7                     0.9244
                                            Birch                4.7                     0.8576
                                           Water                 3.6                     0.7447
                                           Ground               12.0                     0.3227
                 NDVI                       Pine               0.006                     0.8738
                                           Aspen               0.032                     0.7242
                                            Birch              0.038                     0.7621
                                           Water               0.678                    -0.0967
                                           Ground              0.192                    -0.0920
References
[1] Guk A.P., Evstratova L.G., Khlebnikova E.P., Arbuzov S.A., Altyntsev M.A., Gordienko A.S., Guk A.A.,
    Simonov D.P. Development of techniques for automated decoding of aerospace images. Object picture
    interpretive features on multispectral satellite images // Geodesy and Cartography. 2013. Vol. 7. P. 31-40.
[2] Guk A.P., Evstratova L.G. Study of the efficiency criteria for estimating statistical non-parametric methods for
    forest decoding // Regional Problems of Earth Remote Sensing: Proceedings of the V International Scientific
    Conference, September, 11-14, 2016, Krasnoyarsk: Siberian Federal University. 2018. P. 12-15.
[3] Guk A.P., Evstratova L.G. New statistical approach of forest image recognition // Regional Problems of Earth
    Remote Sensing: Proceedings of the III International Scientific Conference, September, 13-16, 2016,
    Krasnoyarsk: Siberian Federal University. 2016. P. 14-17.
[4] Guk A.P. Automation of photo interpretation. Theoretical aspects of statistic recognition of images // Izvestia
    vuzov. Geodesy and aerophotography. 2015. Vol. 5/C. P. 166-170.
[5] Fukunaga, K. Introduction to statistical pattern recognition. London: Academic Press; 2 Edition, 1990. 592 p.
[6] Guk A.P., Shlyakhova M.M. Study of statistical characteristics of multispectral forest space images // Regional
    Problems of Earth Remote Sensing: Proceedings of the V International Scientific Conference, September, 11-14,
    2016, Krasnoyarsk: Siberian Federal University. 2018. P. 105-108.
[7] Guk A.P., Shlyakhova M.M. The efficiency analysis of the principal component analysis application when using
    non-parametric statistical approach to image decoding // Regional Problems of Earth Remote Sensing:
    Proceedings of the IV International Scientific Conference, September, 12-15, 2017, Krasnoyarsk: Siberian
    Federal University. 2017. P. 89-94.
[8] Guk A.P., Evstratova L.G. The main direction improve of automatic classification of forest land, using multi
    spectral aero spase imageries // J. Sib. Fed. Univ. Eng. technol., 2018, 11(8), 892-901. DOI: 10.17516/1999-
    494X-0111.