=Paper= {{Paper |id=Vol-3349/paper3 |storemode=property |title=Impact of Learned Domain Specific Compression on Satellite Image Object Classification |pdfUrl=https://ceur-ws.org/Vol-3349/paper3.pdf |volume=Vol-3349 |authors=Alexander Bayerl,Manuel Keglevic,Matthias Woedlinger,Robert Sablatnig |dblpUrl=https://dblp.org/rec/conf/cvww/BayerlKWS23 }} ==Impact of Learned Domain Specific Compression on Satellite Image Object Classification== https://ceur-ws.org/Vol-3349/paper3.pdf
Impact of Learned Domain Specific Compression on
Satellite Image Object Classification
Alexander Bayerl1 , Manuel Keglevic1 , Matthias Wödlinger1 and Robert Sablatnig1
¹ Computer Vision Lab, TU Wien, Favoritenstraße 9/193-1, Vienna, Austria


Abstract
This paper proposes a methodology for learned compression of satellite imagery. The proposed method utilizes an image patching and stitching approach to address the high resolution of satellite images. We present rate-distortion metrics showing that this methodology outperforms JPEG2000, currently used on satellites. In addition, we demonstrate that using satellite images to train the compression model leads to superior performance compared to using non-domain-specific data. Furthermore, a detailed evaluation of the compression algorithm in a downstream classification task is conducted. The results demonstrate that 77.83% classification accuracy is still achievable for highly compressed images with a bitrate of 0.02 BPP when the classification model is trained on images from the same compression model. The downstream classification task evaluation highlights that the performance of the classification model is highly dependent on the type of compression applied to the training data. When trained with learned-compression images, the model can only classify images with an acceptable level of accuracy (>77%) if they have also undergone learned compression. Likewise, a model trained with JPEG images can only classify JPEG images with acceptable accuracy (>89%).

Keywords
Learned Image Compression, Satellite Imagery, Remote Sensing, Image Classification, Machine Learning



1. Introduction

As remote sensing technology develops, satellites take photos with increasing spatial, temporal, and spectral resolution. This leads to an increasing amount of produced data per day, which is a challenge for data storage [1]. In addition to data storage, transferring satellite images from satellites to terrestrial nodes is a bottleneck in this process as well. Compression algorithms specialized for the satellite image domain have been developed to alleviate this problem [2, 3, 4, 5].

Since image compression is a ubiquitous and fundamental operation, it is a well-studied topic. Improvements in image compression enable faster image data transfer and reduced storage costs. The invention of the discrete cosine transformation in 1972 by Nasir Ahmed et al. [6] led to the definition of the JPEG format in 1992, which is still dominant. Ballé et al. [7] showed in 2016 that compression models based on artificial neural networks can outperform traditional image compression algorithms such as JPEG's discrete cosine transform coding in terms of image quality and bitrate.

For a specific image domain, further enhancements in learned compression can be achieved by limiting the training data to images from this domain. For example, Tsai et al. [8] show that using domain-specific training data can significantly enhance the compression performance of video game images. Similarly, Wödlinger et al. [9] demonstrate superior performance in stereo image compression compared to other approaches by designing a custom-built architecture and training it using domain-specific data.

For satellite images, the following difficulty must be taken into account: currently, 27 satellites with a spatial resolution of less than 10 m per pixel are active, 19 of which have been launched in the last 20 years [10]. This results in increasing file sizes per satellite image [11], which has to be considered when processing such images on neural network hardware accelerators. Even though a simple method for handling this is dividing the image into processable patches and compressing each patch independently, this leads to stitching artifacts on the borders between patches in the decompressed image.

This work examines learned image compression in the context of satellite photography:

• We propose a methodology to alleviate border artifacts when stitching patches of compressed images.
• The proposed method is evaluated on a classification downstream task (see Figure 1) using the "Functional Map of the World" satellite image data set published in 2017 by the Johns Hopkins University Applied Physics Laboratory [12].
• Furthermore, we investigate the influence of domain-specific training data on the rate-distortion metric and the classification downstream task.
• We show that even with bitrates as low as 0.02 BPP, a classification accuracy of 77.83% can be achieved as long as domain-specific data is utilized for training.

26th Computer Vision Winter Workshop, Robert Sablatnig and Florian Kleber (eds.), Krems, Lower Austria, Austria, Feb. 15-17, 2023
alexanderbayerl95@gmail.com (A. Bayerl); keglevic@cvl.tuwien.ac.at (M. Keglevic); mwoedlinger@cvl.tuwien.ac.at (M. Wödlinger); sab@cvl.tuwien.ac.at (R. Sablatnig)
ORCID: 0000-0002-4644-2723 (M. Keglevic); 0000-0002-3872-7470 (M. Wödlinger); 0000-0003-4195-1593 (R. Sablatnig)
© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org, ISSN 1613-0073)
Alexander Bayerl et al., CEUR Workshop Proceedings, 1–8

Figure 1: Separation of the dataset used in this work: a subset of fMoW is used to train the compression model; another subset of fMoW is compressed, and the compressed images are themselves separated into a training set and a validation set for classification training.

2. State of the art

Lossy image compression is the process of reducing the size of digital image data while accepting some loss of information. This differs from lossless image compression, which does not permit any information loss during the compression process.

2.1. Traditional Image Compression

A. J. Hussain et al. [13] conducted an exhaustive survey on the subject of lossy image compression. The authors separate the compression approaches into predictive coding, transform coding, vector quantization, and neural network approaches.

JPEG, the most popular lossy image codec, is based on transform coding, which uses the Discrete Cosine Transformation to convert an image from pixel space to frequency space [6]. The method utilizes the fact that the human visual system is less susceptible to variations in high-frequency components. By applying wavelet transformations to the image, JPEG2000 improves on this to achieve better rate-distortion metrics [14].

More recently, Fabrice Bellard developed the BPG format (Better Portable Graphics), which outperforms JPEG and JPEG2000 in terms of rate and distortion [15]. This format relies on the intraframe encoding of HEVC [15].

2.2. Learned Image Compression

Recently, image compression models based on artificial neural networks have outperformed traditional compression methods in terms of rate and distortion. Jamil et al. [16] provide a survey on the subject. According to the findings of this survey, autoencoders are the most common learning-driven lossy image compression architectures. These models utilize an encoder to transform image data into a low-dimensional latent space. A decoder is then employed to reconstruct the original image from this encoding. The seminal work of this approach is from Ballé et al. [7]. They learn a probability distribution of the latent space jointly with the encoder and decoder networks trained to reconstruct the original image. Subsequent works employ hyperpriors and auto-regressive context models to decorrelate the spatial information in the latent space [17].

Similarly, Toderici et al. [18] show that Recurrent Neural Network (RNN) architectures can be used for learned image compression. Their model leverages feedback loops to iteratively compress an image to the desired bit rate.

Furthermore, Generative Adversarial Networks (GANs) have also been used in image compression. According to Jamil et al. [16], GAN compression outperforms traditional image compression algorithms in terms of visual quality, albeit with the disadvantage of higher deployment costs.

2.3. Satellite image compression

Indradjad et al. [19] compare four different transform-coding approaches for satellite image compression: a wavelet approach by Delaunay et al. [20], bandelets [21], JPEG 2000 [14], and a discrete wavelet transformation method by the CCSDS (Consultative Committee for Space



Data Systems) [2]. Of these approaches, JPEG 2000 yields the highest peak signal-to-noise ratio (PSNR) as well as the second-shortest compression and decompression times.

More recently, de Oliveira et al. [4] investigated neural networks for the compression of satellite images. An autoencoder with a learned hyperprior is utilized to learn compression models for satellite imagery. The proposed method outperforms the CCSDS wavelet compression [2] currently used on French satellites in terms of rate and distortion.

Bacchus et al. [3] investigate the use of learned methods for onboard satellite image compression to address the high memory and complexity constraints in this domain. The authors also employ a hyperprior-based architecture and incorporate data augmentations as a preprocessing step. Their method performs better than JPEG2000, and the authors conclude that its relatively low inference time makes it well-suited for use on satellites.

3. Methodology

This section provides an overview of the methodology proposed in this work. It begins with a brief introduction to learned image compression, followed by an explanation of how the technique is adapted to suit high-resolution satellite images.

3.1. Learned Image Compression

This work is based on the compression model by Ballé et al. [17]. Figure 2 shows an overview of the architecture. The model has an autoencoder structure, and the distribution of the quantized latent p_ŷ is modeled using a learned hyperprior g_h and a context model g_cm that predicts the parameters of a Gaussian distribution 𝒩(μ, σ). The autoregressive component utilizes already decoded pixels for decoding further pixels. This yields superior rate-distortion results, with the disadvantage that decoding has to be done iteratively and not in parallel.

We directly train the model with the trade-off between the distortion D of the original image and the compression rate R:

    L = D + λ · R    (1)

Here λ controls the trade-off between rate and distortion. For the distortion D the Mean Squared Error (MSE) is used, which computes the averaged pixel-wise quadratic difference between the original image and the distorted image:

    D = E_{x∼p_x} ||x − x̂||₂²    (2)

The compression rate R is estimated by the cross-entropy between the entropy model distribution p_ŷ and the actual marginal distribution m(y), where y denotes the latent encoding. Similarly, the rate of the hyperprior z is calculated, which leads to the following definition for the rate loss R:

    R = E_{x∼p_x}[−log₂ p_ŷ(ŷ)] + E_{x∼p_x}[−log₂ p_ẑ(ẑ)]    (3)

where the first term is the rate of the latents and the second the rate of the hyper-latents.

3.2. Stitching

As discussed in the introduction, a limitation of satellite imagery is that image samples have resolutions of up to 14798 × 14802 pixels, which causes issues for training and inference on neural network hardware accelerators such as GPUs. Since dividing the input into patches and processing the patches independently of each other leads to visible artifacts on the borders between the patches in the stitched images, our approach resolves this issue by compressing overlapping patches. For the stitched image, the average value of both patches (or four patches in corners) is used for the overlapping regions. Figure 3 illustrates the overlapping regions of a 1496 × 1496 image with a patch size of 256 × 256 pixels. A step size of 248 pixels in either the X or Y dimension is employed, resulting in an overlapping region of 8 pixels. A disadvantage of this method is that the pixels in the overlapping regions are compressed multiple times, i.e., 5.14% of the total pixels in the previous example.

In Figure 4 the influence of this blending process can be seen: the boundaries of each patch are less visible in the blended image on the right.

4. Evaluation

This section provides an overview of the evaluation process and presents the results of this work. Firstly, the utilized data set and its use in this work are described in detail. Subsequently, the results of the proposed compression algorithm on the data set are highlighted and discussed. Finally, the results of the downstream classification task on the compressed images are presented.

4.1. Dataset

The dataset used in this work is the Functional Map of the World (fMoW). It was created at the Johns Hopkins University Applied Physics Laboratory in Laurel, Maryland (United States) and is publicly available at https://github.com/fMoW/dataset [12]. This dataset was compiled to facilitate research in computer vision for



    Component                Symbol
    Input Image              x
    Encoder                  f(x; θ_e)
    Latents                  y
    Latents (quantized)      ŷ
    Decoder                  g(ŷ; θ_d)
    Hyper Encoder            f_h(y; θ_he)
    Hyper-latents            z
    Hyper-latents (quant.)   ẑ
    Hyper Decoder            g_h(ẑ; θ_hd)
    Context Model            g_cm(y_<i; θ_cm)
    Entropy Parameters       g_ep(·; θ_ep)
    Reconstruction           x̂

Figure 2: Compression architecture used in this work [17].
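The training objective from Equations 1–3 combines the MSE distortion with the rate estimated from the entropy models of the latents and hyper-latents. The following is an illustrative NumPy sketch, not the authors' implementation; the function name, the per-pixel rate normalization, and the assumption that the entropy models supply per-element likelihoods are ours:

```python
import numpy as np

def rate_distortion_loss(x, x_hat, y_likelihoods, z_likelihoods, lam=0.01):
    """L = D + lam * R (Equation 1), sketched for illustration.

    x, x_hat       : original and reconstructed images (H x W arrays)
    y_likelihoods  : entropy-model probabilities p_y_hat(y_hat), one per latent element
    z_likelihoods  : factorized-prior probabilities p_z_hat(z_hat), one per hyper-latent
    """
    # Distortion D: averaged pixel-wise squared error (Equation 2)
    distortion = np.mean((x - x_hat) ** 2)
    # Rate R: cross-entropy of latents plus hyper-latents (Equation 3),
    # normalized here to bits per pixel (a common convention, assumed by us)
    num_pixels = x.shape[-2] * x.shape[-1]
    bits = -np.log2(y_likelihoods).sum() - np.log2(z_likelihoods).sum()
    rate = bits / num_pixels
    return distortion + lam * rate
```

In an actual training loop the likelihoods would come from the hyperprior and context model of Figure 2, and quantization would be replaced by additive uniform noise to keep the objective differentiable.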



remote sensing applications. It includes over 1 million images of objects taken from satellites, categorized into 63 categories, such as airports, tunnel openings, zoos, and towers. Christie et al. [12] highlight the importance of obtaining a geographically distributed data set to minimize geographical bias.

Overall the dataset contains about 628,000 training images and about 100,730 images for validation. The photographs are provided as compressed JPEG and lossless TIFF color images. Each object has been photographed in a variety of environmental settings (weather, time, season). Since this work explicitly focuses on high-resolution satellite images, only images with a resolution of at least 1024 × 1024 pixels are considered.

Figure 3: 1496 × 1496 image divided into 36 patches (256 × 256 pixels) with an overlapping region of 8 × 256 pixels between two patches.

The partitioning of the data set used in this work is shown in Figure 1. For compression training, 1,289 images from the fMoW train set, uniformly distributed over all 63 categories, are used (train-compress). These 1,289 images are from 1,038 objects; as such, for some objects there are multiple images taken under different environmental conditions.

Another set, denoted val-compress, consists of 1,929 images from 1,038 objects from the fMoW validation set. The val-compress set serves two purposes: evaluating the compression and evaluating the downstream classification task. For the latter, val-compress is split again into 1,551 images for classification training (train-class) and 378 images for classification validation (val-class).

4.2. Compression Evaluation

With the parameter λ in Equation 1 the trade-off between rate and distortion can be controlled, i.e., increasing λ leads to a smaller MSE but therefore more BPP. To evaluate our model for different bitrates, we train the model with different values of λ. In Figure 5 the compression results with bitrates ranging from 0.003 BPP to 0.68 BPP are shown for an example image. The BPP of the compressed image is calculated directly by dividing the file size of the encoded image by the number of pixels in the respective image.

The peak signal-to-noise ratio (PSNR) metric is used to evaluate the distortion. The distortion is calculated using the MSE between the compressed and the corresponding uncompressed images. The PSNR is defined as:

    PSNR = 10 · log₁₀ (255² / MSE)    (4)

As depicted on the rate-distortion curve in Figure 6, the results indicate that the proposed learned compression methodology outperforms JPEG and is also superior




Figure 4: In the left image, simple patching without blending is shown; the connection line between two patches can be seen. In the right image the patches are blended. The connection lines are denoted with arrows in the images.
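The overlapping-patch stitching from Section 3.2, whose effect Figure 4 shows, can be sketched as follows. This is a simplified NumPy sketch under two stated assumptions: the image dimensions fit the patch grid exactly (as in the 1496 × 1496 example with 256-pixel patches and a 248-pixel step), and `codec` is a placeholder for a compress-then-decompress round trip:

```python
import numpy as np

def stitch_overlapping(image, codec, patch=256, step=248):
    """Process overlapping patches and blend them by averaging.

    `codec` stands in for a compression/decompression round trip.
    Overlapping regions (patch - step = 8 px wide for the defaults)
    receive contributions from 2 patches (4 in corners) and are averaged.
    Assumes (h - patch) and (w - patch) are multiples of `step`.
    """
    h, w = image.shape
    out = np.zeros((h, w), dtype=np.float64)
    weight = np.zeros((h, w), dtype=np.float64)
    for y0 in range(0, h - patch + 1, step):
        for x0 in range(0, w - patch + 1, step):
            block = codec(image[y0:y0 + patch, x0:x0 + patch])
            out[y0:y0 + patch, x0:x0 + patch] += block
            weight[y0:y0 + patch, x0:x0 + patch] += 1.0
    # Dividing by the per-pixel contribution count averages the overlaps
    return out / weight
```

With an identity `codec` the output reproduces the input exactly; with a lossy codec, the averaging suppresses the visible seams between neighboring patches, at the cost of compressing the overlapping pixels more than once.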



Table 1
Results of domain-specific and non-domain-specific training of the compression model

                     PSNR     BPP
  ImageNet trained   32.8     0.84
  FMoW trained       32.92    0.67

to the JPEG2000 compression format, which is frequently used in satellite applications.
   To verify that training the compression model with domain-specific satellite images improves the downstream classification task, another compression model was trained on 1,749 non-domain-specific samples from the ImageNet data set.
   The results in Table 1 show that the domain-specific compression model trained with satellite images outperforms the model trained with ImageNet samples: for a PSNR of approximately 33, it yields a lower bitrate of 0.67 BPP compared to the 0.84 BPP achieved by the domain-agnostic model.


5. Classification Evaluation

In addition to the evaluation in terms of image quality, in this section compression quality is represented by the accuracy of a downstream classification task, i.e., identifying objects in satellite images. As mentioned in Section 4.1, compressed and uncompressed versions of the val-compress set are used, with 1,554 images used for training the classification model (train-class) and 375 images used to validate it (val-class).
   A dual path network [22] is utilized for this evaluation¹. For the evaluation of the classification downstream task, classification models have been trained with the following four training sets:

   • train-class compressed by the learned compression model with 0.02 BPP
   • train-class compressed by the learned compression model with 0.67 BPP
   • train-class compressed by the learned model that was trained with 1,749 non-domain-specific images (ImageNet [23]) with 0.84 BPP
   • train-class compressed in JPEG format with 0.77 BPP

   Each of these classification models has been used to validate data sets in different compression scenarios: JPEG data sets (0.31 BPP, 0.77 BPP, 1.55 BPP), learned-compression (LC) data sets (0.02 BPP, 0.67 BPP, 1.07 BPP), a data set produced by the ImageNet-trained learned compression model (0.84 BPP), and one without compression. The results of these classification validations are shown in Table 2. The columns denote the data set the classifier was trained on; the rows denote the data set that was classified during validation.
   The results show that a classification model works best when classifying images that were compressed with the same algorithm (JPEG or learned compression) as the images on which it was trained: the JPEG-trained classifier classifies JPEG images with accuracies over 89%, but achieves an accuracy of at most 35.21% on images compressed by learned compression. Similarly, the accuracy of the LC-trained classifiers was at least 77% when classifying LC images (except for the very low bitrate of 0.02 BPP) and no more than 39.38% when classifying JPEG-compressed images.

¹ https://github.com/fMoW/first_place_solution








Figure 5: Differences between the original image (a, 24.00 BPP), the results of our learned compression models (LC): (b) 0.003 BPP, (c) 0.12 BPP, (d) 0.35 BPP, (e) 0.63 BPP, and JPEG-compressed versions of this image: (f) 0.06 BPP, (g) 0.14 BPP, (h) 0.44 BPP, (i) 0.68 BPP.


Table 2
Classification accuracy for data sets with varying compression distortion; the columns denote classifiers trained on images compressed with various formats; the rows denote the data set that was classified during validation; LC = Learned Compression

             Validated                                Classifier trained on images compressed with:
             with:                      LC 0.02 BPP      LC 0.67 BPP      LC ImageNet 0.84 BPP   JPEG 0.77 BPP
             LC 0.02 BPP                  77.83%            27.12%               10.58%             12.91%
             LC 0.67 BPP                  25.57%            80.67%               15.67%             32.89%
             LC 1.07 BPP                   26.9%            78.81%               16.63%             35.21%
             LC ImageNet 0.84 BPP         14.94%            14.67%               78.92%             11.22%
             JPEG 0.31 BPP                14.41%            35.81%               11.22%             89.81%
             JPEG 0.77 BPP                16.59%            38.07%               11.85%             91.55%
             JPEG 1.55 BPP                15.34%            39.38%               14.32%             91.55%
             Uncompressed                   15%             39.52%               12.97%             91.29%
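The cross-evaluation underlying Table 2 boils down to scoring every trained classifier on every compressed validation set. The following is a minimal, framework-agnostic sketch of that loop; the toy "classifiers" and data are placeholders of our own, not the dual-path-network pipeline used in the paper:

```python
def accuracy(preds, labels):
    """Fraction of predictions that match the ground-truth labels."""
    return sum(p == t for p, t in zip(preds, labels)) / len(labels)

def cross_accuracy_matrix(classifiers, val_sets):
    """Score every classifier on every validation set.

    classifiers: dict mapping a name to a callable image -> class id
    val_sets: dict mapping a name to a list of (image, label) pairs
    Returns a dict (val_set_name, classifier_name) -> accuracy,
    i.e. one cell of a Table-2-style matrix per pair.
    """
    matrix = {}
    for clf_name, clf in classifiers.items():
        for val_name, samples in val_sets.items():
            preds = [clf(img) for img, _ in samples]
            labels = [lbl for _, lbl in samples]
            matrix[(val_name, clf_name)] = accuracy(preds, labels)
    return matrix

# Toy example with stand-in "classifiers" (plain functions on integers).
clfs = {"always_zero": lambda x: 0, "identity": lambda x: x}
vals = {"toy_set": [(0, 0), (1, 1), (2, 0)]}
result = cross_accuracy_matrix(clfs, vals)
```

In the actual experiments, each entry of `classifiers` would be a network trained on one compressed variant of train-class, and each entry of `val_sets` a compressed variant of val-class.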



   The classifier trained on images from the ImageNet-trained compression model demonstrates that the compression training data is also crucial for the downstream classification task. It classified 78.92% of the images created by that same compression model correctly, but failed on all other data sets, including the LC-compressed ones produced by the model trained with satellite images: even at high bitrates, it could not attain an accuracy greater than 16.63%. This suggests that traditional compression methods lead to a more versatile encoding that is not as dependent on the specific domain.







Figure 6: Rate-distortion graph for JPEG, JPEG2000, and the proposed learned compression methodology, evaluated on satellite images (val-compress data set).
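Each point on such a rate-distortion graph pairs a bitrate with a distortion value. The two standard measurements can be sketched as follows, assuming 8-bit images (peak value 255):

```python
import numpy as np

def psnr(original, reconstructed, max_val=255.0):
    """Peak signal-to-noise ratio in dB between two images."""
    diff = original.astype(np.float64) - reconstructed.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

def bits_per_pixel(compressed_num_bytes, height, width):
    """Bitrate of a compressed representation in bits per pixel (BPP)."""
    return compressed_num_bytes * 8.0 / (height * width)
```

For a compressed file on disk, `compressed_num_bytes` is simply its size in bytes; e.g. a 1,024-byte encoding of a 64 × 64 image corresponds to 2.0 BPP.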



6. Conclusion

In this work, we propose a satellite image compression methodology that outperforms traditional methods (JPEG, JPEG2000) in terms of rate and PSNR. We show that images that exceed the memory of typical neural network hardware accelerators can be compressed by feeding the image to the model patch-wise. To remove artifacts at the connection line between two patches, the connection region is smoothed by compressing overlapping patches and combining the pixels in these regions. The proposed methodology offers superior performance compared to JPEG and JPEG2000, which are commonly used for satellite imaging. We assess the effects of compression on the performance of an object classification downstream task and demonstrate that a classification model can learn to classify images with an accuracy of 77.83% even for images compressed with a bitrate as low as 0.02 BPP. Furthermore, we show that using differently encoded images for training and inference can deteriorate classification accuracy significantly: classification models trained with JPEG images only achieve acceptable results when tested on JPEG images, and, similarly, classification models trained with images compressed by learned compression models fail when tested on JPEG images.


Acknowledgments

This project has received funding from the European Union's Horizon 2020 research and innovation program under grant agreement No 965502.


References

 [1] H. Guo, Z. Liu, H. Jiang, C. Wang, J. Liu, D. Liang, Big earth data: A new challenge and opportunity for digital earth's development, International Journal of Digital Earth 10 (2017) 1–12.
 [2] P.-S. Yeh, P. Armbruster, A. Kiely, B. Masschelein, G. Moury, C. Schaefer, C. Thiebaut, The new CCSDS image compression recommendation, 2005, pp. 4138–4145. doi:10.1109/AERO.2005.1559719.
 [3] P. Bacchus, R. Fraisse, A. Roumy, C. Guillemot, Quasi lossless satellite image compression, IGARSS 2022 - 2022 IEEE International Geoscience and Remote Sensing Symposium (2022) 1532–1535.
 [4] V. A. de Oliveira, M. Chabert, T. Oberlin, C. Poulliat, M. Bruno, C. Latry, M. Carlavan, S. Henrot, F. Falzon, R. Camarero, Satellite image compression and denoising with neural networks, IEEE Geoscience and Remote Sensing Letters 19 (2022) 1–5.
 [5] F. E. Hassan, G. I. Salama, M. S. Ibrahim, R. M. Bahy, Investigation of on-board compression techniques for remote sensing satellite imagery, in: International Conference on Aerospace Sciences and Aviation Technology, volume 11, The Military Technical College, 2011, pp. 937–946.
 [6] N. Ahmed, T. Natarajan, K. Rao, Discrete cosine



     transform, IEEE Transactions on Computers C-23 (1974) 90–93. doi:10.1109/T-C.1974.223784.
 [7] J. Ballé, V. Laparra, E. P. Simoncelli, End-to-end optimized image compression, in: 5th International Conference on Learning Representations, ICLR 2017, 2017.
 [8] Y.-H. Tsai, M.-Y. Liu, D. Sun, M.-H. Yang, J. Kautz, Learning binary residual representations for domain-specific video streaming, Proceedings of the AAAI Conference on Artificial Intelligence 32 (2018). URL: https://ojs.aaai.org/index.php/AAAI/article/view/12259. doi:10.1609/aaai.v32i1.12259.
 [9] M. Wödlinger, J. Kotera, J. Xu, R. Sablatnig, SASIC: Stereo image compression with latent shifts and stereo attention, in: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 651–660. doi:10.1109/CVPR52688.2022.00074.
[10] Y. Lai, J. Zhang, Y. Song, Surface water information extraction based on high-resolution image, IOP Conference Series: Earth and Environmental Science 330 (2019) 032013. doi:10.1088/1755-1315/330/3/032013.
[11] Q. Zhao, L. Yu, Z. Du, D. Peng, P. Hao, Y. Zhang, P. Gong, An overview of the applications of earth observation satellite data: Impacts and future trends, Remote Sensing 14 (2022) 1863. doi:10.3390/rs14081863.
[12] G. Christie, N. Fendley, J. Wilson, R. Mukherjee, Functional map of the world, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 6172–6180.
[13] A. Hussain, A. Al-Fayadh, N. Radi, Image compression techniques: A survey in lossless and lossy algorithms, Neurocomputing 300 (2018) 44–69. doi:10.1016/j.neucom.2018.02.094.
[14] A. Skodras, C. Christopoulos, T. Ebrahimi, The JPEG 2000 still image compression standard, IEEE Signal Processing Magazine 18 (2001) 36–58. doi:10.1109/79.952804.
[15] G. J. Sullivan, J.-R. Ohm, W.-J. Han, T. Wiegand, Overview of the high efficiency video coding (HEVC) standard, IEEE Transactions on Circuits and Systems for Video Technology 22 (2012) 1649–1668. doi:10.1109/TCSVT.2012.2221191.
[16] S. Jamil, M. J. Piran, MuhibUrRahman, Learning-driven lossy image compression: A comprehensive survey, 2022. URL: https://arxiv.org/abs/2201.09240. doi:10.48550/ARXIV.2201.09240.
[17] D. Minnen, J. Ballé, G. D. Toderici, Joint autoregressive and hierarchical priors for learned image compression, Advances in Neural Information Processing Systems 31 (2018).
[18] G. Toderici, D. Vincent, N. Johnston, S. Jin Hwang, D. Minnen, J. Shor, M. Covell, Full resolution image compression with recurrent neural networks, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 5306–5314.
[19] A. Indradjad, A. S. Nasution, H. Gunawan, A. Widipaminto, A comparison of satellite image compression methods in the wavelet domain, IOP Conference Series: Earth and Environmental Science 280 (2019) 012031. doi:10.1088/1755-1315/280/1/012031.
[20] X. Delaunay, M. Chabert, V. Charvillat, G. Morin, Satellite image compression by post-transforms in the wavelet domain, Signal Processing 90 (2010) 599–610. doi:10.1016/j.sigpro.2009.07.024.
[21] S. Mallat, G. Peyré, A review of bandlet methods for geometrical image representation, Numerical Algorithms 44 (2007) 205–234. URL: https://hal.archives-ouvertes.fr/hal-00359744.
[22] J. L. Yuppen Chen, International Journal of Computer Applications (2017).
[23] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, L. Fei-Fei, ImageNet: A large-scale hierarchical image database, in: 2009 IEEE Conference on Computer Vision and Pattern Recognition, IEEE, 2009, pp. 248–255.


