=Paper= {{Paper |id=Vol-3052/short2 |storemode=property |title=Hyperspectral Anomaly Detection based on Low-rank Structure Exploration |pdfUrl=https://ceur-ws.org/Vol-3052/short2.pdf |volume=Vol-3052 |authors=Shizhen Chang,,Pedram Ghamisi |dblpUrl=https://dblp.org/rec/conf/cikm/ChangG21 }} ==Hyperspectral Anomaly Detection based on Low-rank Structure Exploration== https://ceur-ws.org/Vol-3052/short2.pdf
Hyperspectral Anomaly Detection Based on Low-Rank
Structure Exploration
Shizhen Chang1 , Pedram Ghamisi1,2
1
 Institute of Advanced Research in Artificial Intelligence (IARAI), LandstraรŸer HauptstraรŸe 5, 1030 Vienna, Austria.
2
 Helmholtz-Zentrum Dresden-Rossendorf, Helmholtz Institute Freiberg for Resource Technology, Machine Learning Group, Chemnitzer Str. 40,
D-09599 Freiberg, Germany


                                             Abstract
                                             As one of the typical research area in unsupervised hyperspectral image learning, anomaly detection needs to accomplish the
                                             abnormal pixels separation process without prior spectral knowledge. Recently, the representation-based detectors which can
                                             find the spectral similarity between pixels under no statistical distribution assumption have attracted extensive attention and
                                             been frequently used. To this end, low-rank regularization methods can approximately decompose the hyperspectral data into
                                             a low-rank background part and a sparse anomaly part. Based on the theory of representation and self-representation, this
                                             paper proposed a double low-rank regularization (DLRR) model for hyperspectral anomaly detection. To further explore the
                                             reconstructed structure differences between the original data and the assumed background, the residual of their corresponding
                                             low-rank coefficient matrices are computed and utilized as a part of the detection output together with the column-wise โ„“2
                                             norm of the sparse matrix. Experiments carried out on two real-world hyperspectral datasets show promising performances
                                             compared with other state-of-the-art detectors.

                                             Keywords
                                             Hyperspectral imagery, anomaly detection, low-rank representation, sketched-subspace clustering



1. Introduction                                                                                                       limitations in practical applications.
                                                                                                                         To overcome the insufficient accuracy happened in
With continuous and redundant spectral bands, hyper-                                                                  the distribution-based models, the representation-based
spectral images (HSIs) carry a wealth of spectral and                                                                 detectors have been proposed and shown intended per-
spatial information of land-covers [1, 2]. This promotes                                                              formances. Representative methods are the collaborative
military and civilian applications utilizing the spectral                                                             representation detector (CRD) [11], the background joint
characteristics of different materials. And lots of research                                                          sparse representation detector (BJSRD) [12], etc. A dual
works have been conducted, such as feature extraction                                                                 concentrate window is utilized to extract the possible
[3], noise reduction [4], unmixing [5], classification [6],                                                           background information as the confidence dictionaries
and detection [7], etc. As a special branch of HSIs re-                                                               of the background at each test pixel. And the detection
searches, anomaly detection aims to extract potential                                                                 result is approximately derived by calculating the repre-
abnormal pixels without any prior knowledge [8]. There-                                                               sentation residual of the pixel. Nowadays, the low-rank
fore, suitable methods need to be designed.                                                                           representation is widely used in hyperspectral anomaly
   Traditionally, classic anomaly detectors have been de-                                                             detection which takes advantage of the repeatability of
veloped mostly based on the assumption of data sta-                                                                   the background spectrum and decomposes the original
tistical distributions. The benchmark RX detector, the                                                                data matrix. Considering that the anomalies are usually
cluster-based anomaly detection (CBAD) [7] algorithm,                                                                 rare and sparse, Chen et. al. [13] first utilized the low-
the blocked adaptive computationally efficient outlier                                                                rank decomposition model for anomaly detection. Later,
nominators (BACON) [9] and the random selection-based                                                                 many types of research have been carried out based on
anomaly detector (RSAD) [10] assume that the data fol-                                                                the low-rankness of the background subspace and the
lows the Gaussian or Gaussian mixture distributions,                                                                  sparsity of the anomaly subspace [14, 15]. However, after
then they implement the detection task according to the                                                               doing the low-rank decomposition, the detection decision
Mahalanobis distance between the pixel-under-test and                                                                 of these methods either focus on analyzing the sparse ma-
the background. However, this hypothesis has obvious                                                                  trix or back to the statistical estimation, a better combina-
                                                                                                                      tion of the assumed background component and anomaly
CDCEO 2021: 1st Workshop on Complex Data Challenges in Earth                                                          component may let the detection more reasonable.
Observation, November 1, 2021, Virtual Event, QLD, Australia.
                                                                                                                         As is well known, subspace clustering (SC) is gradu-
" szchang@whu.edu.cn (S. Chang); pedram.ghamisi@iarai.ac.at
(P. Ghamisi)                                                                                                          ally developed for unsupervised HSIs interpretation [16].
 0000-0001-2345-6789 (S. Chang); 0000-0003-1203-741X                                                                 It can learn the similarity between pixels through self-
(P. Ghamisi)                                                                                                          dictionary learning. Inspired by the sketched-SC and
                                       ยฉ 2021 Copyright for this paper by its authors. Use permitted under Creative
                                       Commons License Attribution 4.0 International (CC BY 4.0).                     representation theory, we propose a double low-rank
    CEUR
    Workshop
    Proceedings
                  http://ceur-ws.org
                  ISSN 1613-0073
                                       CEUR Workshop Proceedings (CEUR-WS.org)
regularization model for hyperspectral anomaly detec-         where ๐›ฝ and ๐œ† are the regularization parameters, ๐ด is the
tion. The proposed method is solved via the alternat-         self-representation coefficient matrix, ๐‘ is the assumed
ing direction method of multipliers (ADMMs) method            background coefficient matrix, and E denotes the sparse
[17] and the anomalies are finally detected by computing      part indicating the anomalies.
a background-related residual matrix and an anomaly-             Considering that the difference of ๐ด and ๐‘ may reflect
related distance matrix.                                      the abnormal information of the image, then the residual
                                                              of these two matrices can be utilized as the reference of
                                                              detection output when ๐‘ก = ๐‘›. And the final detection
2. Proposed method                                            result is formulated by the sum of the column-wise โ„“2
                                                              norm of the low-rank coefficient matrices residual and
2.1. Related works                                            the sparse coefficient matrix:
Let X = {x๐‘– }๐‘ ๐‘–=1 โˆˆ R
                       ๐ฟร—๐‘
                            denotes the data collections,
where ๐ฟ is the number of bands and ๐‘ is the total pixel                ๐ท๐ท๐ฟ๐‘…๐‘… (x๐‘– ) = ||๐ด:,๐‘– โˆ’ ๐‘:,๐‘– ||2 + ||๐ธ:,๐‘– ||2 .  (4)
number of the image. Then, the low-rank representation
model aims to decompose the data into a lowest-rank 2.3. Problem optimization
matrix L โˆˆ R๐ฟร—๐‘ and a sparse matrix E โˆˆ R๐ฟร—๐‘ . The
                                                               To solve the proposed DLRR model, the ADMM method
optimization problem is given by:
                                                               is employed and the detailed optimization process is de-
          min ||L||โ‹† + ๐›ผ||E||0 ๐‘ .๐‘ก. X = L + E,             (1) scribed as follows.
           L,E                                                    First, ๐ต and ๐ป are introduced as the auxiliary variables
                                                               for the coefficient matrix ๐ด and ๐‘, respectively:
where ๐›ผ is the regularization parameter, || ยท ||โ‹† and || ยท ||0
represents the nuclear norm and the โ„“0 norm, respec-                        min ||๐ต||โ‹† + ๐›ฝ||๐ป||โ‹† + ๐œ†||E||2,1
tively.                                                                  ๐ต,๐ด,๐ป,๐‘,E

   For a given dictionary ๐ท, the low-rank matrix L can                       ๐‘ .๐‘ก. X = Xฬƒ๐ด, X = ๐ท๐‘ + E,                 (5)
be rewritten as a linear combination of the ๐ท and its                              ๐ด = ๐ต, ๐‘ = ๐ป.
corresponding coefficient matrix ๐‘. And the NP-hard
problem eq. (1) can be represented as:                            Then, the augmented Lagrangian function of (5) can
                                                               be constructed:
        min ||๐‘||โ‹† + ๐›ผ||E||2,1 ๐‘ .๐‘ก. X = ๐ท๐‘ + E, (2)
        ๐‘,E
                                                                          min          ||๐ต||โ‹† + ๐›ฝ||๐ป||โ‹† + ๐œ†||E||2,1
                                                              ๐ต,๐ด,๐ป,๐‘,E,๐‘Œ1 ,๐‘Œ2 ,๐‘Œ3 ,๐‘Œ4
where ๐ท โˆˆ R๐ฟร—๐‘› has ๐‘› dictionary samples, ๐‘ โˆˆ R๐‘›ร—๐‘ ,            ๐œŒ                          ๐œŒ
                                                              +  ||X โˆ’ Xฬƒ๐ด + ๐‘Œ1 /๐œŒ||2๐น + ||X โˆ’ ๐ท๐‘ โˆ’ ๐ธ + ๐‘Œ2 /๐œŒ||2๐น
and || ยท ||2,1 is the โ„“2,1 norm of the matrix. โ„“2,1 norm       2                          2
can be regarded as the โ„“1 norm of the โ„“2 norm of matrix        ๐œŒ                        ๐œŒ
                                                              + ||๐ด โˆ’ ๐ต + ๐‘Œ3 /๐œŒ||๐น + ||๐‘ โˆ’ ๐ป + ๐‘Œ4 /๐œŒ||2๐น ,
                                                                                  2
columns.                                                       2                        2
                                                                                                            (6)
2.2. Problem formulation                                      where ๐‘Œ1 โˆˆ R๐ฟร—๐‘ , ๐‘Œ2 โˆˆ R๐ฟร—๐‘ , ๐‘Œ3 โˆˆ R๐‘กร—๐‘ , and
Assume that the data can be self-represented:                 ๐‘Œ4 โˆˆ R๐‘›ร—๐‘ are the Lagrangian multipliers, and ๐œŒ > 0
                                                              is the penalty parameter.
                        X = X๐ถ,                                  Then the equation (6) can be divided into five optimiza-
                                                              tion problems and be updated one by one with iterative
where ๐ถ is the coefficient matrix. Then for a sketched        procedures. The updating rules of these variables are:
data Xฬƒ = X๐‘‡ , we also have:
                                                                  1) ๐ต step with fixed ๐ด and ๐‘Œ3 :
                        X = Xฬƒ๐ด,
                                                                                            ๐œŒ
                                                                            min ||๐ต||โ‹† +      ||๐ด โˆ’ ๐ต + ๐‘Œ3 /๐œŒ||2๐น .   (7)
where ๐‘‡ โˆˆ R๐‘ ร—๐‘ก is defined as a random projection ma-                         ๐ต             2
trix to compress X while preserving its main information,         2) ๐ป step with fixed ๐‘ and ๐‘Œ4 :
and ๐ด โˆˆ R๐‘กร—๐‘ .
   By means of the sketched data Xฬƒ, the proposed double                             ๐œŒ
                                                                        min ๐›ฝ||๐ป||โ‹† + ||๐‘ โˆ’ ๐ป (๐‘˜+1) + ๐‘Œ4 /๐œŒ||2๐น . (8)
low-rank regularization (DLRR) model based on sketched-                  ๐ป           2
SC can be formulated as:
                                                                  3) E step with fixed ๐‘ and ๐‘Œ2 :
            min ||๐ด||โ‹† + ๐›ฝ||๐‘||โ‹† + ๐œ†||E||2,1                                           ๐œŒ
           ๐ด,๐‘,E
                                                        (3)             min ๐œ†||E||2,1 + ||Xโˆ’๐ท๐‘ โˆ’๐ธ +๐‘Œ2 /๐œŒ||2๐น . (9)
                                                                         E             2
            ๐‘ .๐‘ก. X = Xฬƒ๐ด, X = ๐ท๐‘ + E,
      4) ๐ด step with fixed ๐ต, ๐‘Œ1 and ๐‘Œ3 :
              ๐œŒ                      ๐œŒ
           min  ||Xโˆ’Xฬƒ๐ด+๐‘Œ1 /๐œŒ||2๐น + ||๐ดโˆ’๐ต+๐‘Œ3 /๐œŒ||2๐น .
             ๐ด2                      2
                                               (10)
      5) ๐‘ step with fixed ๐ป, E, ๐‘Œ2 and ๐‘Œ4 :
              ๐œŒ                         ๐œŒ
           min  ||Xโˆ’๐ท๐‘โˆ’๐ธ+๐‘Œ2 /๐œŒ||2๐น + ||๐‘โˆ’๐ป+๐‘Œ4 /๐œŒ||2๐น .
             ๐‘2                         2                              (a)                          (b)
                                                   (11)
      6) The Lagrangian multipliers and the penalty pa- Figure 1: The San  Diego dataset. (a) Image scene. (b) Ground-
         rameter are updated as:                        truth.

                             ๐‘Œ1 = ๐‘Œ1 + ๐œŒ(X โˆ’ Xฬƒ๐ด),                            (12)
                             ๐‘Œ2 + ๐œŒ(X โˆ’ ๐ท๐‘ โˆ’ ๐ธ),                              (13)
                             ๐‘Œ3 = ๐‘Œ3 + ๐œŒ(๐ด โˆ’ ๐ต),                              (14)
                             ๐‘Œ4 = ๐‘Œ4 + ๐œŒ(๐‘ โˆ’ ๐ป),                              (15)
                             ๐œŒ = min{1.1๐œŒ, ๐œŒ๐‘š๐‘Ž๐‘ฅ }.                            (16)
                                                                                                                                   (a)                                                                   (b)
   The solutions of (7) and (8) are calculated by ๐ต =
ฮ˜(1/๐œŒ) (๐ด + ๐‘Œ3 /๐œŒ) and ๐ป = ฮ˜(๐›ฝ/๐œŒ) (๐‘ + ๐‘Œ4 /๐œŒ), respec- Figure 2: The Urban dataset. (a) Image scene. (b) Ground-
                                                           truth.
tively, where ฮ˜ is the singular value thresholding (SVT)
operator. Then E is updated by ๐’ฎ(๐œ†/๐œŒ) (X โˆ’ ๐ท๐‘ + ๐‘Œ2 /๐œŒ)
where ๐’ฎ is a โ„“2,1 -min thresholding operator [18]. ๐ด and
๐‘ are respectively solved by finding the partial deriva-          and 189 bands are utilized for the detection task
tive and setting it to zero. Their optimized solutions are        after eliminating the noisy bands. It records the
             โŠค          โŠค       โŠค
๐ด = (๐ผ + Xฬƒ Xฬƒ) (Xฬƒ X + Xฬƒ ๐‘Œ1 /๐œŒ + ๐ต โˆ’ ๐‘Œ3 /๐œŒ) and
                  โˆ’1                                              area of the San Diego airport, CA, USA in 100 ร—
๐‘ = (๐ผ + ๐ทโŠค ๐ท)โˆ’1 (๐ทโŠค X โˆ’ ๐ทโŠค E + ๐ทโŠค ๐‘Œ2 /๐œŒ + ๐ป โˆ’                    100 pixels, three aircrafts including 58 pixels are
๐‘Œ4 /๐œŒ).                                                           selected as the anomaly target. The visualized
   The initial settings of this optimization process are:         2-D image scene and the ground-truth map of
๐ด0 = ๐‘0 = ๐ป0 = 0, E0 = 0, ๐‘Œ0 = ๐‘Œ1 = 0, ๐‘Œ3 =                       this dataset are shown in Figure 1.
๐‘Œ4 = 0, ๐œŒ0 = 0.01, ๐œŒ๐‘š๐‘Ž๐‘ฅ = 10 . And the convergence
                                  6                            2) Urban dataset: This dataset was collected by the
conditions are ||๐‘‹ โˆ’ Xฬƒ๐ด||๐น < ๐œ–, ||๐‘‹ โˆ’ ๐ท๐‘ + ๐ธ||๐น <                HYDICE airborne sensor, which has a spatial res-
๐œ–, ||๐ด โˆ’ ๐ต||๐น < ๐œ–, ||๐‘ โˆ’ ๐ป||๐น < ๐œ–, or the iteration               olution of 1 m and a spectral resolution of 10 nm.
times exceeds the predefined upper limit. Empirically,            After removing low quality bands, 162 bands are
the predefined value of the error tolerance is ๐œ– = 10โˆ’6           left for anomaly detection. This image scene con-
and the maximum iteration time is 100.                            tains 80 ร— 100 pixels, and 17 small objects are
                                                                  considered as anomaly targets. The 2-D visual-
                                                                  ization map and the ground-truth of this dataset
3. Experiments                                                    are shown in Figure 2.

In this section, the performance of the proposed DLRR
method is assessed on two real-world HSI scenes: the          1 .0
                                                                                                                               1 .2                                                     A n o m a ly (1 0 % ~ 9 0 % )          M in ~ M a x
                                                                                                                                         N o r m a liz e d S ta tis tic a l R a n g e
                                                                                            P r o b a b ility o f d e te c tio n




                                                                                                                                                                                         B a c k g ro u n d (1 0 % ~ 9 0 % )    M e d ia n L in e

San Diego dataset and the Urban dataset. Four classi-          0 .8                                                             1 .0


cal anomaly detection methods, which are RX, BACON,             0 .6                                                             0 .8


Kernel-RX (KRX) [19], and the low probability anomaly
                                                                                                                                  0 .6
                                                                 0 .4                                        R X
                                                                                                              B A C O N
                                                                                                                                   0 .4
detector (LPAD) [20], respectively, are applied for com-          0 .2                                         K     R X
                                                                                                                 L P A D
                                                                                                                                    0 .2
parable analysis. The regularization parameters ๐›ฝ and ๐œ†
                                                                                                                  D L R R
                                                                   0 .0
                                                                        0 .0 0 .2         0 .4     0 .6    0 .8           1 .0       0 .0

are set as 5 and 10, respectively. The background dictio-                         F a ls e A la r m R a te                                R X B A C O N K R X L P A D D L R R


naries are collected by the mean vector of the K-means                                   (a)                                                         (b)
clusters, and the dictionary number ๐‘› is set as 400.      Figure 3: The San Diego dataset. (a) ROC curves. (b) Normal-
                                                                                        ized background-anomaly statistical range.
      1) San Diego dataset: This dataset was captured by
         the AVIRIS sensor, which has a spatial resolution
         of 3.5 m and a spectral resolution of 10 nm. This                                The proposed DLRR model together with other com-
         dataset has 224 original spectral bands in total,                              parable algorithms are conducted in the aforementioned
San Diego and Urban datasets, and the detection perfor-                                                                                                                                                                                        agery: Overview and application, Remote Sens. 10
mances evaluated by the ROC curves and the normalized                                                                                                                                                                                          (2018) 482.
background-anomaly statistical range are shown in Fig-                                                                                                                                                                                     [5] R. Heylen, M. Parente, P. Gader, A review of non-
ure 3 and Figure 4, respectively. It can be seen that in                                                                                                                                                                                       linear hyperspectral unmixing methods, IEEE J. Sel.
the San Diego dataset, the proposed DLRR model has the                                                                                                                                                                                         Topics Appl. Earth Observa. Remote Sens. 7 (2014)
smallest false alarm rate when the detection probability                                                                                                                                                                                       1844โ€“1868.
reaches 1. And for the Urban dataset, our method has                                                                                                                                                                                       [6] Y. Xu, B. Du, L. Zhang, Beyond the patchwise clas-
the largest separation range between the backgrounds                                                                                                                                                                                           sification: Spectral-spatial fully convolutional net-
and the anomalies. For further evaluation, the area under                                                                                                                                                                                      works for hyperspectral image classification, IEEE
ROC curve (AUC) values are computed and the results                                                                                                                                                                                            Trans. Big Data. 6 (2020) 492โ€“506.
are shown in Table 1. The results show that the proposed                                                                                                                                                                                   [7] M. J. Carlotto, A cluster-based approach for de-
DLRR method has the largest AUC values compared with                                                                                                                                                                                           tecting man-made objects and changes in imagery,
other methods in both two datasets.                                                                                                                                                                                                            IEEE Trans. Geosci. Remote Sens. 43 (2005) 374โ€“387.
                                         1 .0
                                                                                                                                                                                                                                           [8] S. Chang, B. Du, L. Zhang, A subspace selection-
                                                                                                                                                            1 .2         A n o m a ly (1 0 % ~ 9 0 % )               M in ~ M a x
                                                                                                                                                                                                                                               based discriminative forest method for hyperspec-
  P r o b a b ility o f d e te c tio n




                                                                                                             N o r m a liz e d S ta tis tic a l R a n g e




                                                                                                                                                                          B a c k g ro u n d (1 0 % ~ 9 0 % )         M e d ia n L in e
                                         0 .8
                                                                                                                                                                                                                                               tral anomaly detection, IEEE Trans. Geosci. Remote
                                                                                                                                                            1 .0

                                                                                                                                                            0 .8
                                                                                                                                                                                                                                               Sens. 58 (2020) 4033โ€“4046.
                                         0 .6

                                                                                                                                                            0 .6
                                                                                                                                                                                                                                           [9] N. Billor, A. S. Hadi, P. F. Velleman, BACON:
                                         0 .4                                               R X
                                                                                            B A C O N
                                                                                                                                                            0 .4
                                                                                                                                                                                                                                               blocked adaptive computationally efficient outlier
                                         0 .2                                                K R X
                                                                                              L P A D
                                                                                               D L R R
                                                                                                                                                            0 .2
                                         0 .0
                                                  0 .0 0 1            0 .0 1         0 .1                1                                                  0 .0                                                                               nominators, Comput. Statist. Data Anal. 34 (2000)
                                                                                                                                                                                                                                               279โ€“298.
                                                             F a ls e A la r m   R a te                                                                            R X          B A C O N         K R X         L P A D      D L R R


                                                                   (a)                                                                                                                    (b)
                                                                                                                                                                                                                                          [10] B. Du, L. Zhang, Random-selection-based anomaly
Figure 4: The Urban dataset. (a) ROC curves. (b) Normalized                                                                                                                                                                                    detector for hyperspectral imagery, IEEE Trans.
background-anomaly statistical range.                                                                                                                                                                                                          Geosci. Remote Sens. 49 (2010) 1578โ€“1589.
                                                                                                                                                                                                                                          [11] W. Li, Q. Du, Collaborative representation for hy-
                                                                                                                                                                                                                                               perspectral anomaly detection, IEEE Trans. Geosci.
                                                                                                                                                                                                                                               Remote Sens. 53 (2014) 1463โ€“1474.
Table 1                                                                                                                                                                                                                                   [12] J. Li, H. Zhang, L. Zhang, L. Ma, Hyperspectral
The AUC Values of Five Algorithms in Two Datasets
                                                                                                                                                                                                                                               anomaly detection by the use of background joint
                                            Methods                            San Diego Dataset                                                                                  Urban Dataset                                                sparse representation, IEEE J. Sel. Topics Appl.
                                                                                                                                                                                                                                               Earth Observ. Remote Sens. 8 (2015) 2523โ€“2533.
                                                  RX                                         0.8742                                                                                             0.9919
                                                BACON                                        0.8768                                                                                             0.9720
                                                                                                                                                                                                                                          [13] S.-Y. Chen, S. Yang, K. Kalpakis, C.-I. Chang, Low-
                                                  KRX                                        0.7490                                                                                             0.7836                                         rank decomposition-based anomaly detection, in:
                                                 LPAD                                        0.8973                                                                                             0.8137                                         Proc. SPIE, volume 8743, Art, 2013, p. 87430N.
                                                 DLRR                                        0.9261                                                                                             0.9927                                    [14] Y. Zhang, B. Du, L. Zhang, S. Wang, A low-rank and
                                                                                                                                                                                                                                               sparse matrix decomposition-based mahalanobis
                                                                                                                                                                                                                                               distance method for hyperspectral anomaly detec-
                                                                                                                                                                                                                                               tion, IEEE Trans. Geosci. Remote Sens. 54 (2015)
References                                                                                                                                                                                                                                     1376โ€“1389.
                                                                                                                                                                                                                                          [15] L. Li, W. Li, Q. Du, R. Tao, Low-rank and sparse
 [1] P. Ghamisi, M. Dalla Mura, J. A. Benediktsson, A                                                                                                                                                                                          decomposition with mixture of gaussian for hyper-
     survey on spectralโ€“spatial classification techniques                                                                                                                                                                                      spectral anomaly detection, IEEE Trans. Cybern.
     based on attribute profiles, IEEE Trans. Geosci.                                                                                                                                                                                          (2020).
     Remote Sens. 53 (2015) 2335โ€“2353.                                                                                                                                                                                                    [16] H. Zhai, H. Zhang, L. Zhang, P. Li, Nonlocal means
 [2] D. Hong, L. Gao, J. Yao, B. Zhang, A. Plaza,                                                                                                                                                                                              regularized sketched reweighted sparse and low-
     J. Chanussot, Graph convolutional networks for hy-                                                                                                                                                                                        rank subspace clustering for large hyperspectral
     perspectral image classification, IEEE Trans. Geosci.                                                                                                                                                                                     images, IEEE Trans. Geosci. Remote Sens. 59 (2020)
     Remote Sens. (2020).                                                                                                                                                                                                                      4164โ€“4178.
 [3] Y. Xu, L. Zhang, B. Du, F. Zhang, Spectralโ€“spatial                                                                                                                                                                                   [17] S. Boyd, N. Parikh, E. Chu, Distributed optimization
     unified networks for hyperspectral image classifi-                                                                                                                                                                                        and statistical learning via the alternating direction
     cation, IEEE Trans. Geosci. Remote Sens. 56 (2018)                                                                                                                                                                                        method of multipliers, Now Publishers Inc, 2011.
     5893โ€“5909.                                                                                                                                                                                                                           [18] L. Zhang, L. Peng, T. Zhang, S. Cao, Z. Peng, In-
 [4] B. Rasti, P. Scheunders, P. Ghamisi, G. Licciardi,                                                                                                                                                                                        frared small target detection via non-convex rank
     J. Chanussot, Noise reduction in hyperspectral im-                                                                                                                                                                                        approximation minimization joint โ„“2,1 norm, Re-
                                                                                                                                                                                                                                               mote Sens. 10 (2018) 1821.
[19] H. Kwon, N. M. Nasrabadi, Kernel RX-algorithm:
     A nonlinear anomaly detector for hyperspectral
     imagery, IEEE Trans. on Geosci. Remote Sens. 43
     (2005) 388โ€“397.
[20] Z. Li, L. Wang, S. Zheng, Applied low dimension
     linear manifold in hyperspectral imagery anomaly
     detection, in: Proc. SPIE, volume 9142, Art, 2014, p.
     91421P.