=Paper=
{{Paper
|id=Vol-2744/short23
|storemode=property
|title=Optimization of the Composition of Standards in Recognition and Compression Algorithms of Hyperspectral Images (short paper)
|pdfUrl=https://ceur-ws.org/Vol-2744/short23.pdf
|volume=Vol-2744
|authors=Leonid Lebedev
}}
==Optimization of the Composition of Standards in Recognition and Compression Algorithms of Hyperspectral Images (short paper)==
<pdf width="1500px">https://ceur-ws.org/Vol-2744/short23.pdf</pdf>
<pre>
       Optimization of the Composition of Standards in
        Recognition and Compression Algorithms of
                   Hyperspectral Images*

                                      Leonid Lebedev

                     Lobachevsky State University of Nizhny Novgorod,
                               Nizhny Novgorod, Russia
                               lebedev@pmk.unn.ru


       Abstract. The paper proposes a solution to the problem of minimizing the
       number of standards in order to increase both the compression coefficient of
       hyperspectral images (HSI) and the speed of correlation extreme compression
       methods (CEM). As modifications of the CEM, randomized and difference
       compression algorithms are proposed. Both algorithms are based on the
       hypothesis of spatial compactness of pixels located in local regions of the
       image matrix. This means that when a new standard is formed from an
       unrecognized pixel, there is a high probability of using a pixel that lies
       near the boundaries of the coverage areas of the existing standards, which
       leads to an increase in their number. In order to reduce the influence of
       spatial compactness of pixels on the formation of standards, a methodology
       based on changing the sequence of recognized pixels is proposed. In the
       randomized algorithm, a row of the matrix is chosen at random, and the
       sequence of pixels to be recognized within it is produced by a random column
       generator. In the difference compression algorithm, the row number of the
       matrix is determined by the rule for finding the members of an arithmetic
       progression with a given difference. For the selected row, the sequence of
       pixels to be recognized is formed on the same principle. It should be noted
       that line-by-line pixel recognition in the self-learning mode allows
       compressing HSI of almost any volume. The effectiveness of the created
       algorithms is demonstrated on two fragments of real HSI. A comparative
       analysis of all three compression algorithms in terms of the number of
       standards obtained is presented.

       Keywords: Hyperspectral Image, Correlation Extreme Methods, Similarity
       Estimates, Recognition, Compression, Random Pixel Samples.


Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License
Attribution 4.0 International (CC BY 4.0).

* This work was supported by the Russian Science Foundation, project No. 16-11-00068-P.

1      Introduction

In [1], the basics of correlation extreme methods of recognition and compression in
self-learning modes are presented; these methods are invariant to a given type of
transformation and take hyperspectral images as the source information. The HSI is
represented as a three-dimensional cube, each pixel of which is described by the
response values in the corresponding spectral channels. Representing HSI pixels as
vectors (points) $y = (y_1, y_2, \ldots, y_n)$ in a multidimensional linear space
$\mathbb{R}^n$, where $n$ is the number of channels of the spectrometer, allows us to
obtain similarity estimates for recognition and compression methods that are
invariant, respectively, to the identity transformation, $\varepsilon_{mI}$ and
$\hat{\varepsilon}_{mI}$; to the similarity transformation, $\varepsilon_{mS}$ and
$\hat{\varepsilon}_{mS}$; to the offset, $\varepsilon_{mD}$ and $\hat{\varepsilon}_{mD}$;
and to the scaling and offset, $\varepsilon_{mT}$ and $\hat{\varepsilon}_{mT}$. The
difference between the similarity estimates for recognition and for compression is
that, during recognition, the conversion operators bring the source pixel $y$ to the
reference pixel $y^e = (y_1^e, y_2^e, \ldots, y_n^e)$, whereas during compression the
restored value of $y$ is obtained by the corresponding conversion of the reference
pixel $y^e$. Note that for any pair of pixels $y$ and $y^e$, the similarity estimates
for the recognition and compression methods coincide for the identity transformation
and for the offset.
   Two ways to increase the compression ratio of the HSI can be readily identified:
first, the choice of compression method, and second, a decrease in the number of
standards. When compression is based on recognition with self-learning, reducing the
number of standards also increases the speed of the compression algorithms. In [1], to
reduce the number of standards, an algorithm is proposed that is based on the idea of
solid stacking of the entire set of pixels by the coverage zones of the standards. For
the identity transformation, a new standard is formed from an unrecognized pixel
according to formula (1):

      $$y_\nu^e = y + \sum_{i \in Q} \vec{l}_i, \qquad \vec{l}_i = (y - y_i^e) \cdot \left( \lambda \cdot \frac{\delta}{\sqrt{\hat{\varepsilon}_{m,i}}} - 1 \right)$$      (1)


where $Q$ is the set of standards for which the pixel lies in the region of their solid
stacking; $\delta$ is the compression threshold; and $\lambda$ is a parameter whose
value depends on the space dimension $n$ and is responsible for the implementation of
the regions of solid stacking.
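
As a rough illustration of formula (1), the following Python sketch forms a new
standard from an unrecognized pixel. It assumes that the estimate for the identity
transformation is the squared Euclidean distance between the pixel and the standard
(the definition is given in [1], not here); the function name and the sample values
are hypothetical.

```python
import math


def new_standard(y, standards_q, delta, lam):
    """Apply formula (1): shift pixel y away from every standard in Q.

    y           -- unrecognized pixel (sequence of floats)
    standards_q -- standards whose solid-stacking region contains y
    delta       -- compression threshold
    lam         -- stacking parameter that depends on the dimension n
    """
    result = list(y)
    for ye in standards_q:
        # Assumption: the similarity estimate is the squared Euclidean distance.
        eps = sum((a - b) ** 2 for a, b in zip(y, ye))
        # eps is positive here because y was not recognized by this standard.
        factor = lam * delta / math.sqrt(eps) - 1.0
        for k, (a, b) in enumerate(zip(y, ye)):
            result[k] += (a - b) * factor
    return result


# Hypothetical two-dimensional example with a single standard at the origin.
print(new_standard((1.5, 0.0), [(0.0, 0.0)], delta=1.0, lam=math.sqrt(3.0)))
```

Under this assumption, and with a single standard in Q, the new standard lies on the
ray from that standard through the pixel, at distance lam * delta from the standard.
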
   In a space of dimension $n = 2$, the formation of a new standard $y_3^e$, the
location of which ensures optimal solid stacking in accordance with formula (1), is
illustrated in Fig. 1. Using this method to optimize the number of standards in
compression algorithms reduces their number by up to 5%. However, its practical
application is difficult because of the need to choose the value of the parameter h
that implements the optimal solid stacking. Obviously, a violation of the principle of
solid stacking or excessive intersection of the coverage areas of the standards
reduces the efficiency of the optimization method. In turn, the analysis of the
formation of standards shows that when pixels are enumerated sequentially (television
scan), the intersection of the coverage areas of the standards is a significant factor
in their excess replenishment.


Fig. 1. Illustration of the method of minimizing the number of standards in the
compression algorithm with an identity estimate

   Therefore, in order to focus on pixels that are more distant in space from the
reference signatures, it is proposed to select pixels for compression at random. The
random choice of pixels can thus provide continuous coverage with fewer standards. A
similar idea, applied to other problems, can be found in [2-4].


2       Methods of minimizing the number of standards

Consider two algorithms, randomized and difference, that minimize the number of
standards relative to the original HSI compression algorithm, implemented on the basis
of the correlation-extreme recognition method with self-learning.
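
To make the role of the pixel order concrete, here is a minimal Python sketch of
compression by recognition with self-learning. It is not the implementation from [1]:
it assumes the identity estimate is the Euclidean distance, takes the threshold as a
fraction sigma of the pixel norm, and uses the unrecognized pixel itself as the new
standard. All names are hypothetical.

```python
import math


def compress(pixels, order, sigma):
    """Visit pixels in the given order; return (standards, labels).

    A pixel is recognized if its nearest standard is within sigma * ||pixel||;
    otherwise the pixel itself becomes a new standard (self-learning).
    """
    standards = []
    labels = {}
    for idx in order:
        y = pixels[idx]
        norm = math.sqrt(sum(v * v for v in y))
        best, best_dist = None, None
        for j, ye in enumerate(standards):
            dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(y, ye)))
            if best_dist is None or dist < best_dist:
                best, best_dist = j, dist
        if best is not None and best_dist <= sigma * norm:
            labels[idx] = best            # recognized by an existing standard
        else:
            standards.append(y)           # unrecognized: add a new standard
            labels[idx] = len(standards) - 1
    return standards, labels


# Hypothetical one-channel pixels: the visiting order changes the result.
pixels = [(1.0,), (1.03,), (1.06,)]
print(len(compress(pixels, [0, 1, 2], sigma=0.04)[0]))  # 2 standards
print(len(compress(pixels, [1, 0, 2], sigma=0.04)[0]))  # 1 standard
```

In this toy example the sequential order needs two standards while starting from the
middle pixel needs one, which is the effect the two algorithms below try to exploit.
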
2.1    The randomized compression algorithm for hyperspectral images
The main difficulty in forming a random pixel recognition sequence is the large volume
of the HSI, which in many programming systems does not allow all the data to be held
in RAM. Therefore, in the proposed randomized compression algorithm, a pixel is
selected at random in two stages. First, a row of the image matrix is chosen by a
random number generator, and then a random sequence of the pixels of this row is
generated. At each stage, the random numbers are checked so that no value is repeated.
Since at the second stage pixels are selected within a single row of the matrix, the
dimensions of the HSI are not critical for this implementation of the algorithm.
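
The two-stage selection can be sketched in Python as follows. The sketch uses a
shuffle to obtain non-repeating random numbers, which is a substitute for the explicit
coincidence check described above; the function name and the seed parameter are
hypothetical.

```python
import random


def randomized_order(n_rows, n_cols, seed=None):
    """Yield (row, column) pairs: random rows, random columns within each row."""
    rng = random.Random(seed)
    rows = list(range(n_rows))
    rng.shuffle(rows)              # stage 1: each row is chosen exactly once
    for r in rows:
        cols = list(range(n_cols))
        rng.shuffle(cols)          # stage 2: each pixel of row r exactly once
        for c in cols:
            yield r, c


# Only one row of column indices is held in memory at a time.
for row, col in randomized_order(2, 3, seed=0):
    print(row, col)
```

Because the generator finishes one row before moving to the next, only the current
row of the image has to be loaded, regardless of the total size of the HSI.
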
2.2    The difference algorithm of compression for hyperspectral images
An alternative to the above is the difference compression algorithm for the HSI. In
this algorithm, the row is selected as a member of an arithmetic progression with a
given difference $d$. The row numbers for compression are given by the formula

      $$N_i = (i \cdot d) \bmod K_r + \operatorname{ent}(i \cdot d / K_r), \qquad i \in \{0, 1, \ldots, K_r - 1\}$$      (2)

where $K_r$ is the number of rows of the HSI matrix. To determine the sequence of pixel
numbers in a row, one can use the same formula (2), replacing $K_r$ by $K_s$, the
number of columns, and possibly changing the difference $d$. Thus, by changing the
sequence of rows and of the pixels within them in accordance with the proposed
algorithm, it is possible to reduce the correlation between consecutively processed
pixels, which in turn implies a decrease in the number of standards needed for
continuous coverage.
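
Formula (2) can be checked with a few lines of Python. In the sketch below, the
integer part ent() is written as floor division, and the function names are
hypothetical. When d divides the number of rows (for example 300 rows and d = 10),
every row number appears exactly once; for other values of d repetitions can occur,
so a permutation check is included.

```python
def difference_order(k, d):
    """Row (or column) numbers N_i from formula (2) for i = 0, ..., k - 1."""
    return [(i * d) % k + (i * d) // k for i in range(k)]


def is_permutation(seq):
    """True if seq contains each of 0, ..., len(seq) - 1 exactly once."""
    return sorted(seq) == list(range(len(seq)))


rows = difference_order(300, 10)
print(rows[:4], rows[30:33])                   # [0, 10, 20, 30] [1, 11, 21]
print(is_permutation(rows))                    # True
print(is_permutation(difference_order(7, 3)))  # False: 3 does not divide 7
```

This may be one reason to change the difference when the formula is reused for the
columns, whose number need not be divisible by the value of d chosen for the rows.
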


3      Experimental research

Experimental studies on minimizing the number of standards with the proposed
randomized and difference algorithms were carried out on two fragments of HSI with
significantly different compositions of objects.


3.1    Experiments on the HSI fragment f100520t01p00r12
The first fragment consisted of lines 251 through 550 of the f100520t01p00r12 HSI file
acquired by the AVIRIS spectrometer with 224 spectral channels. The fragment matrix
had 813 columns. The spatial resolution was 17.3 m. The image of this fragment is
shown in Fig. 2. The fragment was compressed under various conditions for all four
types of similarity estimates. The original compression algorithm was tested with
different lines taken as the start line; in most runs, the start line was changed in
steps of 10. After the start line, the subsequent lines were processed sequentially.
As a result of the experiments, for each similarity estimate a set was obtained whose
elements are the numbers of standards formed during compression of the fragment. For
this set, the boundary values, the average number of standards, and the standard
deviation were found.

               Fig. 2. Colored image of the f100520t01p00r12 HSI fragment

For the randomized algorithm, similar characteristics were obtained. For the difference
algorithm, the number of generated standards was obtained with the difference value
$d = 10$. The results are summarized in Table 1.

         Table 1. The results of experiments on the f100520t01p00r12 HSI fragment

   Adaptive compression algorithm, σ = 2%; columns give the similarity estimate used.

   Algorithm               Characteristic                  $\hat{\varepsilon}_{mI}$   $\hat{\varepsilon}_{mS}$   $\hat{\varepsilon}_{mD}$   $\hat{\varepsilon}_{mT}$
   Initial algorithm       Interval [min, max]             [544, 559]                 [538, 546]                 [421, 430]                 [195, 209]
                           Average number of standards     551.53                     242.5625                   426.3125                   200.1875
                           Standard deviation              4.13                       2.5487                     2.1424                     3.8278
   Randomized algorithm    Interval [min, max]             [521, 552]                 [228, 246]                 [407, 421]                 [191, 204]
                           Average number of standards     536.69                     236.7333                   413.8125                   196.5625
                           Standard deviation              7.98                       4.3660                     5.2585                     3.3721
   Difference algorithm,   Number of standards             525                        238                        409                        195
   d = 10

    From the data in Table 1, it follows that the randomized and difference algorithms
are superior to the original compression algorithm in terms of minimizing the number
of standards needed to restore a fragment of the HSI with an error not exceeding 2% of
the pixel norm. In turn, the difference algorithm is more efficient than the
randomized algorithm by the same criterion, although mainly for the similarity
estimate that is invariant to the identity transformation. However, as follows from
the obtained boundaries of the number of standards in the experiments, in some cases
the randomized algorithm is superior to the difference algorithm, although on average
it can be inferior to it.
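
The size of the gain can be read off Table 1 directly. The short Python sketch below
computes the relative reduction in the average number of standards with respect to
the initial algorithm; the dictionary keys I, S, D, T are shorthand for the four
similarity estimates, and the function name is hypothetical.

```python
# Average numbers of standards taken from Table 1 (difference algorithm: d = 10).
initial = {"I": 551.53, "S": 242.5625, "D": 426.3125, "T": 200.1875}
randomized = {"I": 536.69, "S": 236.7333, "D": 413.8125, "T": 196.5625}
difference = {"I": 525, "S": 238, "D": 409, "T": 195}


def reduction_percent(before, after):
    """Relative decrease of the number of standards, in percent."""
    return 100.0 * (before - after) / before


for key in initial:
    r = reduction_percent(initial[key], randomized[key])
    d = reduction_percent(initial[key], difference[key])
    print(f"{key}: randomized {r:.2f}%, difference {d:.2f}%")
```

For the identity estimate, for example, this gives a reduction of about 2.7% for the
randomized algorithm and about 4.8% for the difference algorithm.
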

3.2    Experiments on the HSI fragment MoffettField

The second fragment was formed from lines 101 to 600 of the MoffettField HSI. The
matrix has 753 columns, and the number of channels of the AVIRIS spectrometer was 224.
This fragment is shown in Fig. 3. It differs from the previous fragment in the greater
diversity of the underlying surface and, therefore, in the need to form a much larger
number of standards.


            Fig. 3. Image of the MoffettField HSI fragment (lines from 101 to 600)

   The experiments on this fragment were carried out according to the same scheme.
The results are shown in Table 2. As follows from the results in both tables, when
compressing the HSI using a similarity estimate that is invariant to the identity
transformation, $\hat{\varepsilon}_{mI}$, or to the offset, $\hat{\varepsilon}_{mD}$,
preference should be given to the difference algorithm. For the similarity estimates
that are invariant to the similarity transformation, $\hat{\varepsilon}_{mS}$, or to
the similarity transformation with offset, $\hat{\varepsilon}_{mT}$, both compression
algorithms showed almost identical results in minimizing the number of standards.
   However, when choosing a compression algorithm for HSI in this case, the randomized
algorithm should be preferred because it has no parameters to tune. In the difference
algorithm, the value of the parameter $d$ should be estimated mainly from the spatial
resolution of the spectrometer used, with the aim of choosing a sequence of less
correlated pixels.

             Table 2. The results of experiments on the MoffettField HSI fragment

   Adaptive compression algorithm, σ = 5%; columns give the similarity estimate used.

   Algorithm               Characteristic                  $\hat{\varepsilon}_{mI}$   $\hat{\varepsilon}_{mS}$   $\hat{\varepsilon}_{mD}$   $\hat{\varepsilon}_{mT}$
   Initial algorithm       Interval [min, max]             [9270, 9551]               [3975, 4037]               [6242, 6312]               [3171, 3240]
                           Average number of standards     9362.23                    4008.10                    6271.00                    3199.00
                           Standard deviation              85.1121                    21.6492                    24.8193                    21.46
   Randomized algorithm    Interval [min, max]             [9082, 9246]               [3889, 3983]               [6152, 6226]               [3147, 3178]
                           Average number of standards     9174.28                    3954.6333                  6177.500                   3159.3
                           Standard deviation              34.1614                    23.7982                    24.5937                    11.63
   Difference algorithm,   Number of standards             9184                       3956                       6105                       3149
   d = 10


4       Conclusion

The hypothesis that the number of standards can be reduced by changing the sequence of
pixels during compression of the HSI was confirmed by experiments. As a result, two
algorithms based on the correlation extreme method were created, the randomized and
the difference compression algorithms, which reduce the number of standards by 2-5%
and thereby increase the compression coefficient of the HSI as well as the speed of
the procedures.


References
 1. Lebedev, L.I.: Geometrical aspects of correlation-extreme methods of object recognition
    and HSI compression. In: Proceedings of the 6th International Conference on Information
    Technology and Nanotechnology (ITNT-2020), pp. 229-238. Samara National University,
    Samara (2020).
 2. Svirnov, S.I., Mikhailov, V.V., Ostrikov, V.N.: Application randomized method of principal
    components for hyperspectral data compression. J. Modern problems of Earth remote
    sensing from cosmos, 11(2), 9-17 (2014).
 3. Borzov, S.M., Guryanov, M.A., Potaturkin, O.I.: Study of the classification efficiency of
    difficult-to-distinguish vegetation types using hyperspectral data. J. Computer Optics,
    43(3), 464-473 (2019).
 4. Bibikov, S.A., Kazanskiy, N.L., Fursov, V.A.: Vegetation type recognition in hyperspectral
    images using a conjugacy indicator. J. Computer Optics, 42(5), 846-854 (2018).

</pre>