
This work was supported by the Russian Science Foundation, project No.

Optimization of the Composition of Standards in Recognition and Compression Algorithms of Hyperspectral Images*

Leonid Lebedev

lebedev@pmk.unn.ru, Lobachevsky State University of Nizhny Novgorod, Nizhny Novgorod, Russia


The paper proposes a solution to the problem of minimizing the number of standards in order to increase both the compression coefficient of hyperspectral images (HSI) and the speed of correlation extreme methods (CEM) of compression. Two modifications of the CEM are proposed: a randomized and a difference compression algorithm. Both are based on the hypothesis of spatial compactness of pixels located in local regions of the image matrix. Under this hypothesis, when a new standard is formed from an unrecognized pixel, that pixel is very likely to lie near the boundaries of the coverage areas of the existing standards, which increases the number of standards. To reduce the influence of spatial compactness on the formation of standards, a methodology based on changing the order in which pixels are recognized is proposed. In the randomized algorithm, a row of the matrix is chosen at random, and the sequence of recognized pixels in that row is produced by a random column generator. In the difference algorithm, the row number is determined by the rule for finding the members of an arithmetic progression with a given difference, and the sequence of recognized pixels in the selected row is formed on the same principle. Line-by-line pixel recognition in the self-learning mode makes it possible to compress HSI of almost any volume. The effectiveness of the algorithms is demonstrated on two fragments of real HSI. A comparative analysis of all three compression algorithms in terms of the number of standards obtained is presented.

Keywords: Hyperspectral Image, Correlation Extreme Methods, Similarity Estimates, Recognition, Compression, Random Pixel Samples

1 Introduction

In [1], the basics of correlation extreme methods of recognition and compression in self-learning modes were presented; these methods are invariant to a given type of transformation and take hyperspectral images as the source information. The HSI is represented as a three-dimensional cube, each pixel of which is described by the response values in the corresponding spectral channels. Representing HSI pixels as vectors (points) $y = (y_1, y_2, \ldots, y_n)$ in the multidimensional linear space $\mathbb{R}^n$, where $n$ is the number of channels of the spectrometer, makes it possible to obtain similarity estimates for the recognition and compression methods that are invariant, respectively, to the identity transformation ($I_m$ and $\hat{I}_m$), to the similarity transformation ($S_m$ and $\hat{S}_m$), to the offset ($D_m$ and $\hat{D}_m$), and to scaling with offset ($T_m$ and $\hat{T}_m$). The estimates for recognition and for compression differ in the direction of the conversion: during recognition the conversion operators bring the source pixel $y$ to the reference pixel $y^e = (y^e_1, y^e_2, \ldots, y^e_n)$, whereas compression, on the contrary, returns the restored value of $y$ obtained by the corresponding conversion of the reference pixel $y^e$. Note that for any pair of pixels $y$ and $y^e$, the similarity estimates for the recognition and compression methods coincide for the identity transformation and for the offset. The compression ratio of the HSI can be increased in two ways: first, by the choice of compression method, and second, by reducing the number of standards. When compression is based on recognition with self-learning, reducing the number of standards also increases the speed of the compression algorithms. In [1], the number of standards is reduced by an algorithm based on the idea of solid stacking, in which the coverage zones of the standards are packed so as to cover the entire set of pixels.
For the identity transformation, a new standard is formed from an unrecognized pixel $y$ according to the formula

$$y^e_{\mathrm{new}} = y + \sum_{q \in Q} \vec{v}_q, \qquad \vec{v}_q = (y - y^e_q) \cdot \left( h \cdot \sqrt{\varepsilon / \hat{I}_m(y, y^e_q)} - 1 \right), \tag{1}$$

where $Q$ is the set of standards for which the pixel lies in the region of their solid stacking; $\hat{I}_m(y, y^e_q)$ is the similarity estimate for the identity transformation; $\varepsilon$ is the compression threshold; and $h$ is a parameter whose value depends on the space dimension $n$ and is responsible for the implementation of the regions of solid stacking.
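The geometric idea behind solid stacking can be sketched in a few lines of Python. This is only one possible reading, not the implementation from [1]. It assumes that similarity is the plain Euclidean distance, that each standard covers a ball of radius `radius`, and that a standard takes part in the correction when the pixel lies between `radius` and `h * radius` from it. The names `place_new_standard` and `radius`, and the value `h = sqrt(3)` used for n = 2, are illustrative assumptions.

```python
import math


def place_new_standard(pixel, standards, radius, h):
    """Shift an unrecognized pixel away from nearby standards.

    Assumption: a standard is 'nearby' when the pixel is outside its
    coverage ball (distance > radius) but closer than h * radius.
    """
    new_standard = list(pixel)
    for standard in standards:
        dist = math.dist(pixel, standard)
        if radius < dist < h * radius:
            # Factor that would move the pixel to distance h * radius
            # from this standard along the line joining the two points.
            scale = h * radius / dist - 1.0
            for k in range(len(pixel)):
                new_standard[k] += (pixel[k] - standard[k]) * scale
    return new_standard


if __name__ == "__main__":
    h = math.sqrt(3.0)  # illustrative value for n = 2 (assumption)
    candidate = place_new_standard([1.5, 0.0], [[0.0, 0.0]], 1.0, h)
    print(candidate, math.dist(candidate, [0.0, 0.0]))
```

With a single neighbouring standard the new standard ends up at distance `h * radius` from it. With several neighbours the shifts are summed, so the resulting distances are only approximate, which is one way to see why the choice of the parameter matters.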

In a space of dimension $n = 2$, the formation of a new standard $y^e_3$, whose location ensures optimal solid stacking in accordance with formula (1), is illustrated in Fig. 1. Using this approach in compression algorithms reduces the number of standards by up to 5%. However, its practical application is difficult, because the value of the parameter $h$ that yields optimal solid stacking has to be chosen. Obviously, a violation of the principle of solid stacking, or excessive intersection of the coverage areas of the standards, reduces the efficiency of the optimization method. In turn, the analysis of the formation of standards shows that, when pixels are enumerated sequentially (television scan), the intersection of the coverage areas of the standards is a significant factor in the excessive growth of their number.

Therefore, in order to favor pixels that lie farther from the existing standards (reference signatures), it is proposed to select pixels for compression at random. Random selection of pixels can thus provide continuous coverage with fewer standards. A similar idea is used for other problems in [2-4].

2 Methods of minimizing the number of standards

Consider two algorithms, randomized and difference, that reduce the number of standards relative to the original HSI compression algorithm, which is based on the correlation extreme recognition method with self-learning.
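For orientation, compression by recognition with self-learning can be sketched as follows. This is a minimal Python sketch, not the author's implementation. It assumes the identity-transformation case, uses the Euclidean error relative to the pixel norm as the similarity estimate, and takes the pixels in whatever order the caller supplies. The function names and the threshold value `0.02` (mirroring the 2% error bound used in the experiments) are illustrative.

```python
import math


def relative_error(pixel, standard):
    """Euclidean distance between pixel and standard, relative to the pixel norm."""
    diff = math.sqrt(sum((p - s) ** 2 for p, s in zip(pixel, standard)))
    norm = math.sqrt(sum(p * p for p in pixel))
    return diff / norm if norm > 0 else diff


def compress(pixels, threshold):
    """Return (standards, labels): labels[i] is the standard index of pixels[i]."""
    standards = []
    labels = []
    for pixel in pixels:
        # Recognition: find the best-matching standard (the extremum).
        best_index, best_error = -1, float("inf")
        for index, standard in enumerate(standards):
            error = relative_error(pixel, standard)
            if error < best_error:
                best_index, best_error = index, error
        if best_error <= threshold:
            labels.append(best_index)
        else:
            # Self-learning: the unrecognized pixel becomes a new standard.
            standards.append(list(pixel))
            labels.append(len(standards) - 1)
    return standards, labels


if __name__ == "__main__":
    demo = [[1.0, 0.0], [1.0, 0.01], [0.0, 1.0]]
    found, assigned = compress(demo, 0.02)
    print(len(found), assigned)
```

Because a standard is created whenever the current pixel is not recognized, the resulting set of standards depends on the order in which pixels are visited. The two algorithms considered in this section change exactly that order.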

2.1 The randomized compression algorithm for hyperspectral images

The main difficulty in implementing the compression algorithm with a random pixel recognition sequence is the large size of the HSI, which in many programming systems does not allow all of the data to be held in RAM. Therefore, in the proposed randomized compression algorithm a pixel is selected at random in two stages. First, a row of the image matrix is chosen by a random number generator; then a random sequence of the pixels of this row is generated. At each stage, the generated random numbers are checked so that no value is repeated. Since the second stage selects pixels within a single row of the matrix, the dimensions of the HSI are not critical for this implementation of the algorithm.
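The two-stage selection can be sketched in Python as a generator of pixel coordinates. This is a hedged illustration: instead of drawing random numbers and rejecting repeats, it shuffles the list of row indices and, for each row, the list of column indices, which also guarantees that no pixel is visited twice. The function name `randomized_scan` and the `seed` parameter are assumptions made for the example.

```python
import random


def randomized_scan(num_rows, num_cols, seed=None):
    """Yield (row, col) pairs: rows in random order, and columns in
    random order inside each row, every pixel exactly once."""
    rng = random.Random(seed)
    rows = list(range(num_rows))
    rng.shuffle(rows)                 # stage 1: random row order
    for row in rows:
        cols = list(range(num_cols))
        rng.shuffle(cols)             # stage 2: random order inside the row
        for col in cols:
            yield row, col


if __name__ == "__main__":
    for row, col in randomized_scan(2, 3, seed=0):
        print(row, col)
```

Only the index list of the current row is needed at any moment, so a caller can read one image row from disk, process its pixels in the generated order, and discard it before moving on.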

2.2 The difference algorithm of compression for hyperspectral images

An alternative to the randomized algorithm is the difference algorithm of HSI compression. Here the rows are selected as members of an arithmetic progression with a given difference $d$. The row numbers are generated by the formula

$$n_i = (i \cdot d) \,\%\, K_r + \lfloor i \cdot d / K_r \rfloor, \qquad i \in \{0, 1, \ldots, K_r - 1\}, \tag{2}$$

where $K_r$ is the number of rows of the HSI matrix, % denotes the remainder, and the division is integer. The sequence of pixel numbers within a row can be obtained from the same formula (2) by replacing $K_r$ with the number of columns $K_s$ and, possibly, changing the difference $d$. By changing the order of rows, and of pixels within them, in accordance with the proposed algorithm, the correlation between consecutively processed pixels can be reduced, which in turn implies fewer standards for continuous coverage.
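The index rule of formula (2) can be illustrated with a short Python sketch. It assumes the reading "remainder of i·d plus the integer quotient of i·d by the number of rows"; the function names `difference_order` and `difference_scan` are hypothetical.

```python
def difference_order(count, d):
    """Indices (i * d) % count + (i * d) // count for i = 0 .. count - 1."""
    return [(i * d) % count + (i * d) // count for i in range(count)]


def difference_scan(num_rows, num_cols, d_rows, d_cols):
    """Yield (row, col) pairs using the same rule for rows and for columns."""
    for row in difference_order(num_rows, d_rows):
        for col in difference_order(num_cols, d_cols):
            yield row, col


if __name__ == "__main__":
    print(difference_order(10, 5))  # [0, 5, 1, 6, 2, 7, 3, 8, 4, 9]
```

Under this reading the sequence visits every index exactly once when `d` divides `count`, as for 300 rows and `d = 10`. When `d` does not divide `count`, indices can repeat (for example `count = 10`, `d = 3` yields the index 3 twice), so the difference has to be chosen with the row or column count in mind.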

3 Experimental research

Experimental studies of the two proposed algorithms, randomized and difference, were carried out on two fragments of HSI with significantly different compositions of objects.

3.1 Experiments on the HSI fragment f100520t01p00r12

The first fragment consisted of lines 251 through 550 of the f100520t01p00r12 HSI file of the AVIRIS spectrometer, which has 224 spectral channels. The fragment matrix had 813 columns, and the spatial resolution was 17.3 m. The image of this fragment is shown in Fig. 2. The fragment was compressed under various conditions for all four types of similarity estimates. The original compression algorithm was tested with various lines of the fragment taken as the start line; the start line was changed mostly in steps of 10, and after the start line the subsequent lines were processed sequentially. As a result of the experiments, a set was obtained for each estimate whose elements are the numbers of standards formed during compression of the fragment. The boundary values, the average number of standards, and the variance were found for this set.
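The bookkeeping of this experiment can be sketched in Python. The sketch makes two assumptions that the text does not state: after the start line the scan wraps around to the beginning of the fragment, and the variance is the population variance. The function names are hypothetical, and the numbers in the demonstration are placeholders, not measured results.

```python
from statistics import mean, pvariance


def sequential_order(num_rows, start):
    """Row order for a sequential scan that begins at `start`
    and wraps around to row 0 (wrap-around is an assumption)."""
    return [(start + i) % num_rows for i in range(num_rows)]


def summarize(counts):
    """Boundary values, mean and population variance of the
    numbers of standards collected over several runs."""
    return {
        "min": min(counts),
        "max": max(counts),
        "mean": mean(counts),
        "variance": pvariance(counts),
    }


if __name__ == "__main__":
    start_rows = list(range(0, 300, 10))  # start lines in steps of 10
    print(start_rows[:3], sequential_order(5, 3))
    print(summarize([10, 12, 14]))        # placeholder counts only
```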

For the randomized algorithm, similar characteristics were obtained. For the difference algorithm, the number of generated standards was obtained for the difference value $d = 10$. The results are summarized in Table 1.

[Table 1. Columns: with similarity estimate $\hat{I}_m$; with similarity estimate $\hat{S}_m$; with similarity estimate $\hat{D}_m$; with similarity estimate $\hat{T}_m$.]

From the data given in the table, it follows that the randomized and difference algorithms are superior to the original compression algorithm in terms of minimizing the number of standards needed to restore a fragment of the HSI with an error not exceeding 2% of the pixel norm. In turn, the difference algorithm is more efficient than the randomized algorithm by the same criterion, although this holds mainly for the similarity estimate that is invariant to the identity transformation. However, as follows from the boundary values of the number of standards obtained in the experiments, in some runs the randomized algorithm is superior to the difference algorithm, although on average it can be inferior to it.

3.2 Experiments on the HSI fragment MoffettField

The second fragment was formed from lines 101 to 600 of the MoffettField HSI. The fragment matrix has 753 columns, and the number of channels of the AVIRIS spectrometer is 224. This fragment is shown in Fig. 3. It differs from the previous fragment in the greater diversity of the underlying surface and, therefore, requires a much larger number of standards.

The experiments on this fragment were carried out according to the same scheme. The results are shown in Table 2. As follows from the results in the tables, when the HSI is compressed using a similarity estimate that is invariant to the identity transformation ($\hat{I}_m$) or to the offset ($\hat{D}_m$), preference should be given to the difference algorithm. For the similarity estimates that are invariant to the similarity transformation ($\hat{S}_m$) or to scaling with offset ($\hat{T}_m$), both compression algorithms showed almost identical results in minimizing the number of standards.

However, when choosing a compression algorithm for HSI in this case, the randomized algorithm should be preferred because it has no parameters to tune. In the difference algorithm, the value of the parameter $d$ should be estimated mainly from the spatial resolution of the spectrometer used, with the aim of obtaining a sequence of less correlated pixels.

[Table 2. Labels: Initial algorithm; Randomized algorithm; Adaptive compression algorithm, σ = 5%. Columns: with similarity estimate $\hat{I}_m$; with similarity estimate $\hat{S}_m$; with similarity estimate $\hat{D}_m$; with similarity estimate $\hat{T}_m$.]

The hypothesis that the number of standards can be reduced by changing the order of pixels during compression of the HSI was confirmed by the experiments. As a result, two algorithms based on the correlation extreme method were created, the randomized and the difference compression algorithms, which reduce the number of standards by 2-5% and thereby increase the compression coefficient of the HSI as well as the speed of the procedures.

1. Lebedev, L.I.: Geometrical aspects of correlation-extreme methods of object recognition and HSI compression. In: Proceedings of the 6th International Conference on Information Technology and Nanotechnology (ITNT-2020), pp. 229-238. Samara National University, Samara (2020).

2. Svirnov, S.I., Mikhailov, V.V., Ostrikov, V.N.: Application randomized method of principal components for hyperspectral data compression. J. Modern problems of Earth remote sensing from cosmos, 11(2), 9-17 (2014).

3. Borzov, S.M., Guryanov, M.A., Potaturkin, O.I.: Study of the classification efficiency of difficult-to-distinguish vegetation types using hyperspectral data. J. Computer Optics, 43(3), 464-473 (2019).

4. Bibikov, S.A., Kazanskiy, N.L., Fursov, V.A.: Vegetation type recognition in hyperspectral images using a conjugacy indicator. J. Computer Optics, 42(5), 846-854 (2018).