Proceedings of the 9th International Conference "Distributed Computing and Grid Technologies in Science and
                           Education" (GRID'2021), Dubna, Russia, July 5-9, 2021


                DEEP LEARNING APPLICATION FOR IMAGE
                           ENHANCEMENT
                         A. Elaraby1,a, I. Elansary2, A. Nechaevskiy3
   1 Department of Computer Science, Faculty of Computers and Information, South Valley
                               University, Qena, 83523, Egypt
   2 Modern Academy for Computer Science and Management Technology, Cairo, Egypt
   3 Meshcheryakov Laboratory of Information Technologies, Joint Institute for Nuclear
                         Research, 141980, Joliot-Curie 6, Dubna, Russia

                                    Email: a ahmed.elaraby@svu.edu.eg


Recently, deep learning has taken a central position in the automation of daily life and has delivered
considerable improvements over traditional machine learning algorithms. Enhancing image quality is a
fundamental image processing task. A high-quality image is expected in many vision tasks, and
degradations such as noise, blur, and low resolution must be removed. Deep learning approaches can
substantially boost performance compared with classical ones. Imaging is one of the main research
areas where deep learning can make a major impact. This work presents a survey of deep learning for
image enhancement and describes its potential for future research.


Keywords: Machine Learning, Deep Learning, Image Enhancement.


                                                    Ahmed Elaraby, Ismail Elansary, Andrey Nechaevskiy


                                                               Copyright © 2021 for this paper by its authors.
                      Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).


1. Introduction
        Image quality enhancement is a fundamental problem in image processing that has received great
attention over several decades. A high-quality image is expected in many vision tasks, and
degradations such as low resolution, blur, and noise must be removed. While classical techniques for
this task have achieved great progress, deep learning approaches, the recent top performers, can
substantially boost performance compared with classical ones. The advantages that enable deep
learning techniques to achieve such success are their high representational capacity and strong
nonlinearity [1].
         Image enhancement is the process of adjusting digital images so that the results are more
suitable for display or further image analysis [2-5]. Image enhancement methods fall into two main
categories: “spatial domain” approaches and “frequency domain” approaches [1]. The term spatial
domain refers to the image plane itself, and methods in this category are based on direct processing
of image pixels. Spatial domain enhancement methods are further divided into two categories: spatial
filtering and transformation. The former operates on pixel neighborhoods and the latter on
individual pixels. The spatial domain transformation approaches in common use are based on histogram
processing and gray-level transformation. The frequency domain approaches are based on modifying the
Fourier transform of an image. Let 𝑓(𝑎, 𝑏) be the original image and 𝐹(𝑥, 𝑦) its Fourier transform,
and let ℎ(𝑎, 𝑏) be a filter and 𝐻(𝑥, 𝑦) its Fourier transform. Then 𝑓(𝑎, 𝑏) is transformed into
𝑔(𝑎, 𝑏) by convolution with ℎ(𝑎, 𝑏), and 𝐺(𝑥, 𝑦) is the Fourier transform of 𝑔(𝑎, 𝑏).
        The procedure can be written in the spatial domain as:
                                    𝑔(𝑎, 𝑏) = 𝑓(𝑎, 𝑏) ∗ ℎ(𝑎, 𝑏)                                        (1)
        and in the frequency domain as:
                                    𝐺(𝑥, 𝑦) = 𝐹(𝑥, 𝑦) ∙ 𝐻(𝑥, 𝑦)                                        (2)
         where “∗” denotes the convolution operator and “∙” denotes pointwise multiplication. It is
clear from Eqs. (1) and (2) that if ℎ(𝑎, 𝑏) is chosen appropriately, the image 𝑓(𝑎, 𝑏) will be
effectively enhanced.
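        The equivalence of Eqs. (1) and (2) can be checked numerically. The following is a minimal
sketch, assuming NumPy and a discrete, periodic (circular) convolution, which is the form of
convolution that corresponds to multiplication of discrete Fourier transforms. The 3x3 averaging
filter and the random test image are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def circular_convolve2d(f, h):
    """Direct circular convolution: g(a, b) = sum_{m, n} f(m, n) * h(a - m, b - n)."""
    rows, cols = f.shape
    g = np.zeros(f.shape, dtype=float)
    for m in range(rows):
        for n in range(cols):
            # np.roll shifts h so that entry (a, b) holds h(a - m, b - n) with wrap-around.
            g += f[m, n] * np.roll(np.roll(h, m, axis=0), n, axis=1)
    return g

rng = np.random.default_rng(0)
f = rng.random((8, 8))            # illustrative "image"

h = np.zeros((8, 8))              # filter zero-padded to the image size
h[:3, :3] = 1.0 / 9.0             # illustrative 3x3 averaging (smoothing) filter

# Eq. (1): convolution in the spatial domain.
g_spatial = circular_convolve2d(f, h)

# Eq. (2): pointwise multiplication in the frequency domain, then inverse transform.
G = np.fft.fft2(f) * np.fft.fft2(h)
g_frequency = np.fft.ifft2(G).real

print(np.allclose(g_spatial, g_frequency))
```

        Both routes give the same filtered image up to floating-point error, so the choice between
them is a matter of computational convenience.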
       The rest of this paper explores the development of deep learning approaches for image
enhancement by examining several fundamental tasks and their motivations.

2. Deep Learning Applications on Image Enhancement
        In image denoising, the basic idea of learning is to use training data to refine a component
of the designed model. Dictionary learning is an example of this: image patches are represented by a
sequence of coefficients over a collection of bases, which together form a redundant dictionary. The
dictionary is expected to capture the general structures of natural images, so that clean image
patches are well approximated on it for the denoising task. Consequently, as in KSVD [6], LSSC [7],
and CSR [8], the dictionary is the essential component and is learned either from a collection of
high-quality training data or from the degraded image to be processed. Deep learning is another
example, in which the goal is to learn a discriminative restoration function. The first attempts to
use convolutional neural networks (CNNs) and stacked auto-encoders to denoise natural images [9] were
promising, demonstrating that these deep models can perform as well as or better than traditional
wavelet-based or Markov random field-based denoising approaches.
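        The idea of representing a patch by a few coefficients over a dictionary can be sketched as
follows. This is a simplified illustration under stated assumptions: the dictionary is a fixed
orthonormal DCT basis rather than one learned from data as in the methods cited above, the "patch" is
a one-dimensional signal, and the threshold of three noise standard deviations is an illustrative
choice.

```python
import numpy as np

def dct_dictionary(n):
    """Orthonormal DCT-II basis; column k is the k-th atom (assumed, not learned)."""
    i = np.arange(n)[:, None]
    k = np.arange(n)[None, :]
    atoms = np.cos(np.pi * (i + 0.5) * k / n)
    atoms[:, 0] /= np.sqrt(2.0)
    return atoms * np.sqrt(2.0 / n)

def denoise_by_thresholding(noisy, dictionary, threshold):
    """Keep only the coefficients whose magnitude exceeds the threshold."""
    coeffs = dictionary.T @ noisy            # analysis step (valid for an orthonormal dictionary)
    coeffs[np.abs(coeffs) < threshold] = 0.0
    return dictionary @ coeffs               # synthesis from the sparse code

rng = np.random.default_rng(0)
D = dct_dictionary(64)

# A clean patch that is exactly sparse on the dictionary (two active atoms).
clean = 3.0 * D[:, 2] + 2.0 * D[:, 5]
sigma = 0.1
noisy = clean + sigma * rng.standard_normal(64)

denoised = denoise_by_thresholding(noisy, D, threshold=3.0 * sigma)

mse_noisy = np.mean((noisy - clean) ** 2)
mse_denoised = np.mean((denoised - clean) ** 2)
print(mse_noisy, mse_denoised)
```

        Learned dictionaries replace the fixed basis with atoms adapted to the training patches,
but the denoising principle of keeping a sparse set of coefficients is the same.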
        Agostinelli et al. [10] investigated multi-column stacked auto-encoders for dealing with
different forms of noise. DnCNN [11] used residual learning to train deep CNN models and achieved
state-of-the-art results. These studies show that deep architectures, effective learning, and high
representational capacity can significantly improve image denoising performance. An example of an
image denoising result obtained with deep learning is shown in Figure 1.


                 Figure 1. Image denoising applying neural network-based models [9]
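        The residual learning strategy mentioned above for DnCNN can be summarized in a short
formulation. The notation is an assumption introduced here for illustration and follows common usage
rather than this paper: 𝑦 is the noisy observation, 𝑥 the clean image, 𝑣 the noise, 𝑅(·; Θ) the
network with parameters Θ, and 𝑁 the number of training pairs. Instead of predicting the clean image
directly, the network is trained to predict the noise, which is then subtracted from the input.

```latex
\begin{aligned}
y &= x + v, \\
\hat{x} &= y - R(y; \Theta), \\
\ell(\Theta) &= \frac{1}{2N} \sum_{i=1}^{N}
    \left\| R(y_i; \Theta) - (y_i - x_i) \right\|_F^2 .
\end{aligned}
```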
        In image deblurring, the most critical issue is ill-posedness. Even in the non-blind case, the
observed blurry image does not stably and uniquely determine the sharp image, owing to the
ill-conditioned nature of the blur operator [11]. To deal with ill-posedness, researchers have
concentrated on creating new strategies and models, as well as on improving the efficiency of
optimization techniques. Learning-based approaches have also been proposed for deblurring, and their
motivations can be divided into two groups. In the first group, the proposed methods [11-13] aim to
learn a subspace in which the sharp image can be found. The subspace can be built by extracting local
patterns from multiple sharp images that share similar content with the target image, allowing for an
accurate representation of details in the target image. In the second group, the aim is to learn a
restoration function that transforms a blurry image into a sharp one [13]. Pairs of sharp images and
their blurry counterparts are widely used to train the parameters of the restoration function in this
setting. Schmidt et al. [14] used the Regression Tree Field (RTF) to model a nonlinear regressor
that determines the parameters of local deblurring. An example of an image deblurring result obtained
with deep learning is shown in Figure 2.


            Figure 2. Estimation of image deblurring utilizing explicit deconvolution [15]
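        The ill-posedness of non-blind deblurring described above can be illustrated numerically.
The following minimal sketch assumes NumPy, a Gaussian blur applied through Eq. (2), and a small
amount of additive noise. It compares direct inversion of the blur with a classical
Tikhonov-regularized inverse filter; this is a hand-crafted baseline chosen for illustration, not one
of the learning-based methods surveyed here. The parameters `sigma` and `lam` and the test image are
assumptions.

```python
import numpy as np

def gaussian_transfer_function(shape, sigma):
    """Frequency response H(x, y) of a Gaussian blur with standard deviation sigma (pixels)."""
    fy = np.fft.fftfreq(shape[0])[:, None]
    fx = np.fft.fftfreq(shape[1])[None, :]
    return np.exp(-2.0 * (np.pi * sigma) ** 2 * (fx ** 2 + fy ** 2))

def naive_inverse(g, H):
    """Direct inversion F = G / H; amplifies noise wherever H is tiny."""
    return np.fft.ifft2(np.fft.fft2(g) / H).real

def regularized_inverse(g, H, lam):
    """Tikhonov-regularized inversion F = conj(H) G / (|H|^2 + lam)."""
    G = np.fft.fft2(g)
    return np.fft.ifft2(np.conj(H) * G / (np.abs(H) ** 2 + lam)).real

rng = np.random.default_rng(0)

f = np.zeros((64, 64))            # illustrative sharp image: a bright square
f[16:48, 16:48] = 1.0

H = gaussian_transfer_function(f.shape, sigma=2.0)
blurred = np.fft.ifft2(np.fft.fft2(f) * H).real          # Eq. (2)
g = blurred + 0.01 * rng.standard_normal(f.shape)        # noisy observation

err_naive = np.mean((naive_inverse(g, H) - f) ** 2)
err_regularized = np.mean((regularized_inverse(g, H, lam=1e-2) - f) ** 2)
print(err_naive, err_regularized)
```

        Because the blur suppresses high frequencies almost completely, dividing by 𝐻(𝑥, 𝑦)
amplifies even very small noise, whereas the regularized estimate stays bounded. Learning-based
methods can be viewed as replacing the simple regularizer with information extracted from training
images.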
        In image super-resolution, the aim is to create a high-resolution image from one or more
low-resolution images, recovering high-frequency information that was lost during the imaging
process. One learning-based technique is the belief network, which can be expressed as a Markov
network. The images are represented as patches and analyzed using a Markov network [16]. An
observation function connects each low-resolution patch with its corresponding high-resolution patch,
defining how well one fits the other. A transition function connects neighboring patches in the
reconstructed image. After the parameters of these functions have been trained, the model uses the
belief propagation algorithm to restore the high-resolution image.
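        The roles of the observation and transition functions can be sketched on a simplified
problem. The example below assumes NumPy and restricts the Markov network to a one-dimensional chain
of patches, for which belief propagation in its min-sum form (max-product in the log domain) is
exact. The cost tables are random stand-ins for negative log observation and transition values; in a
real system they would be computed from candidate high-resolution patches. All names are
hypothetical.

```python
import itertools
import numpy as np

def min_sum_chain(unary, pairwise):
    """Lowest-cost labelling of a chain by min-sum message passing.

    unary[n, k]       : observation cost of candidate k at patch n
    pairwise[n, j, k] : transition cost between candidate j at patch n
                        and candidate k at patch n + 1
    """
    n_nodes, n_states = unary.shape
    cost = unary[0].copy()
    backptr = np.zeros((n_nodes, n_states), dtype=int)
    for n in range(1, n_nodes):
        total = cost[:, None] + pairwise[n - 1]   # rows: previous patch, columns: current patch
        backptr[n] = np.argmin(total, axis=0)
        cost = total.min(axis=0) + unary[n]
    labels = np.zeros(n_nodes, dtype=int)
    labels[-1] = int(np.argmin(cost))
    for n in range(n_nodes - 1, 0, -1):           # trace the best path backwards
        labels[n - 1] = backptr[n, labels[n]]
    return labels

def total_cost(labels, unary, pairwise):
    """Sum of observation and transition costs for one labelling."""
    cost = sum(unary[n, k] for n, k in enumerate(labels))
    cost += sum(pairwise[n, labels[n], labels[n + 1]] for n in range(len(labels) - 1))
    return float(cost)

rng = np.random.default_rng(0)
unary = rng.random((5, 3))        # 5 patches, 3 candidate HR patches each (stand-in costs)
pairwise = rng.random((4, 3, 3))  # compatibility of neighboring candidates (stand-in costs)

best = min_sum_chain(unary, pairwise)

# Exhaustive search over all 3**5 labellings, used only to check the result.
brute = min(itertools.product(range(3), repeat=5),
            key=lambda labels: total_cost(labels, unary, pairwise))
print(best, total_cost(best, unary, pairwise), total_cost(brute, unary, pairwise))
```

        On a two-dimensional grid of patches the network contains loops, and the same message
passing is applied iteratively as an approximation.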


         According to manifold learning-based approaches [17], high-resolution images form a manifold
with local geometry similar to that of the manifold formed by low-resolution images. For restoration,
the relationship between points on the low-resolution manifold can therefore be applied directly to
the high-resolution manifold. However, the assumption of identical manifolds is overly restrictive
and is not fulfilled in many situations. To address this issue, it has been proposed to learn two
explicit mapping functions that project low- and high-resolution images onto a common manifold [18].
The idea behind sparse coding-based approaches [19-20] is that a signal can be represented by a
sparse code over an over-complete dictionary. In the super-resolution task, the linear relationship
between the sparse codes of low- and high-resolution images can be recovered, giving encouraging
restoration results. By dividing images into patches, a high-resolution image can be obtained from
low-resolution patches by means of a mapping function. Mapping functions such as simple functions,
support vector regression, and anchored neighborhood regression [21] are shallow models with limited
representational capacity. It has been shown that stacking shallow models into a deep one, which is
the idea underlying deep learning, can significantly improve super-resolution performance. Kim et al.
investigated very deep convolutional networks for single-image super-resolution (SISR) [22], using
residual learning to train a 20-layer CNN model, and achieved high accuracy. They also presented a
deeply-recursive convolutional network [23] that uses a limited number of model parameters while
capturing long-range pixel dependencies. For the super-resolution (SR) task, self-examples at
different scales have been used to fine-tune a pre-trained convolutional auto-encoder [24]. To speed
up SR computation, Shi et al. and Dong et al. [25-26] constructed networks in which most of the
computation occurs in the low-resolution (LR) space and up-scaling occurs only in the last layer.
An example of an image super-resolution result obtained with deep learning is shown in Figure 3.


                            Figure 3. Obtaining a high-resolution image [26]
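        The idea of keeping the computation in the LR space and up-scaling only at the end can be
illustrated with the rearrangement step used by sub-pixel convolution, as in the title of [25]. The
sketch below assumes NumPy and a channel-first feature layout; it shows only the final rearrangement
of r² LR feature channels into one image that is r times larger in each dimension, not a trained
network. The function name and the test tensor are illustrative.

```python
import numpy as np

def pixel_shuffle(features, r):
    """Rearrange (C * r * r, H, W) LR feature maps into a (C, H * r, W * r) image.

    Output pixel (c, h * r + i, w * r + j) is taken from input channel
    c * r * r + i * r + j at LR position (h, w).
    """
    channels_r2, height, width = features.shape
    if channels_r2 % (r * r) != 0:
        raise ValueError("number of channels must be divisible by r * r")
    channels = channels_r2 // (r * r)
    x = features.reshape(channels, r, r, height, width)   # axes: c, i, j, h, w
    x = x.transpose(0, 3, 1, 4, 2)                        # axes: c, h, i, w, j
    return x.reshape(channels, height * r, width * r)

# Four LR feature maps of size 2x3 become one 4x6 output for scale factor r = 2.
features = np.arange(4 * 2 * 3).reshape(4, 2, 3)
upscaled = pixel_shuffle(features, r=2)
print(upscaled.shape)
```

        Since all convolutions before this step operate on H x W feature maps rather than on the
enlarged image, the cost of the network grows with the LR size only.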
         Besides the above CNN-related works, other feed-forward neural networks have been studied,
such as auto-encoders [27-28] and sparse coding-based networks [29-30]. An LR image patch can be
generated from various high-resolution (HR) image patches that reside on a low-dimensional natural
image manifold. The MSE-based solution is a pixel-wise average of the possible HR patches on the
manifold, and thus exhibits blurry or over-smoothed results that lack high-frequency details. To
address this problem, Johnson et al. [31] presented a perceptual loss that measures the MSE between
the VGG feature maps of the SR result and the ground truth. In [32], by combining this loss with an
adversarial loss, Ledig et al. developed a generative adversarial network that can recover
photo-realistic textures. Using similar ideas, inference models can be applied to the statistics of
CNN feature maps, so that the statistics of SR solutions and those of natural images are as close as
possible [32].
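        The statement that the MSE-based solution is a pixel-wise average, and is therefore blurry,
can be verified on a toy example. The sketch below assumes NumPy and a hypothetical set of four
equally plausible one-dimensional HR patches, each a sharp step edge at a different position. The
helper names are illustrative.

```python
import numpy as np

# Hypothetical candidate HR patches: step edges located at positions 2, 3, 4 and 5.
candidates = np.array([[0.0] * k + [1.0] * (8 - k) for k in range(2, 6)])

def expected_mse(estimate, candidates):
    """Mean squared error of one estimate, averaged over all candidate patches."""
    return float(np.mean((candidates - estimate) ** 2))

def sharpness(patch):
    """Largest jump between neighboring pixels (1.0 for an ideal step edge)."""
    return float(np.abs(np.diff(patch)).max())

# The estimate that minimizes the expected MSE is the pixel-wise mean of the candidates.
mse_solution = candidates.mean(axis=0)

print("MSE of the mean:      ", expected_mse(mse_solution, candidates))
print("MSE of each candidate:", [expected_mse(c, candidates) for c in candidates])
print("sharpness of the mean:", sharpness(mse_solution))
```

        The mean has a lower expected MSE than any single sharp candidate, yet its edge is spread
over several pixels. Losses defined on feature maps or on an adversarial criterion are intended to
avoid this averaging effect.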


3. Conclusion
        In this paper we highlighted several aspects of image enhancement, providing a compendium of
the advances made in the new and exciting sub-field of deep learning for image enhancement. The paper
discussed the categories of image enhancement in the spatial domain and the frequency domain. Deep
learning technology and its background were also discussed. Furthermore, applications of deep
learning to image enhancement were analyzed for the most important tasks: image super-resolution,
deblurring, and denoising.

References
[1] R. C. Gonzalez and R. E. Woods, Digital Image Processing, (3rd Edition). Upper Saddle River,
NJ, USA: Prentice-Hall, Inc., 2006.
[2] A. Elaraby; D.Moratal: A Generalized Entropy-Based Two-Phase Threshold Algorithm for
Noisy Medical Image Edge Detection, Scientia Iranica, Vol. 24; 6, 2017.
[3] A. Elaraby; et al: New Algorithm For Edge Detection in Medical Images Based on Minimum
Cross Entropy Thresholding, IJCSI, Vol. 11; 2, 2014.
[4] A. Elaraby; et al: New Algorithm For Edge Detection Based on Exponential Entropy, (ISCYR),
29-30 April 2014, Assiut University, Egypt.
[5] A. Elaraby, A. Nechaevskiy: An effective segmentation approach for liver computed tomography
scans using fuzzy exponential entropy // Computer Research and Modeling, 2021, 13(1), pp. 195–202.
[6] W. Dong, X. Li, L. Zhang, and G. Shi, “Sparsity-based image denoising via dictionary learning
and structural clustering,” in Proceedings of IEEE Conference on Computer Vision and Pattern
Recognition, 2011, pp. 457–464.
[7] V. Jain and S. Seung, “Natural image denoising with convolutional networks,” in NIPS, 2009, pp.
769–776.
[8] J. Xie, L. Xu, and E. Chen, “Image denoising and inpainting with deep neural networks,” in
NIPS, 2012, pp. 341–349.
[9] H. C. Burger, C. J. Schuler, and S. Harmeling, “Image denoising: Can plain neural networks
compete with bm3d?” in CVPR, 2012, pp. 2392–2399.
[10] H. C. Burger, C. Schuler, and S. Harmeling, “Learning how to combine internal and external
denoising methods,” in GCPR, vol. 8142, 2013, p. 121.
[11] F. Agostinelli, M. R. Anderson, and H. Lee, “Adaptive multi-column deep neural networks with
application to robust image denoising,” in NIPS, 2013, pp. 1493–1501.
[12] K. Zhang, W. Zuo, Y. Chen, D. Meng, and L. Zhang, “Beyond a gaussian denoiser: Residual
learning of deep cnn for image denoising,” IEEE Trans. Image Process, 2017.
[13] N. Joshi, W. Matusik, E. H. Adelson, and D. J. Kriegman, “Personal photo enhancement using
example images,” ACM Transactions on Graph, vol. 29, no. 2, p. 12, 2010.
[14] J. Ni, P. Turaga, V. M. Patel, and R. Chellappa, “Example-driven manifold priors for image
deconvolution,” IEEE Trans. Image Process, vol. 20, no. 11, pp. 3086–3096, 2011.
[15] C. J. Schuler, H. C. Burger, S. Harmeling, and B. Schölkopf, “A machine learning approach for
non-blind image deconvolution,” in Proceedings of IEEE Conference on Computer Vision and Pattern
Recognition, 2013, pp. 1067–1074.
[16] W. T. Freeman and E. C. Pasztor, “Learning to estimate scenes from images,” Proceedings of
Advances in Neural Information Processing Systems, pp. 775–781, 1999.
[17] M. Bevilacqua, A. Roumy, C. Guillemot, and M.-L. A. Morel, “Neighbor embedding based
single-image super-resolution using semi-nonnegative matrix factorization,” in Proceedings of IEEE
International Conference on Acoustics, Speech and Signal Processing. IEEE, 2012, pp. 1289–1292.


[18] B. Li, H. Chang, S. Shan, and X. Chen, “Low-resolution face recognition via coupled locality
preserving mappings,” IEEE Signal Process. Lett., vol. 17, no. 1, pp. 20–23, 2010.
[19] X. Gao, K. Zhang, D. Tao, and X. Li, “Image super-resolution with sparse neighbor embedding,”
IEEE Trans. Image Process, vol. 21, no. 7, pp. 3194–3205, 2012.
[20] L. He, H. Qi, and R. Zaretzki, “Beta process joint dictionary learning for coupled feature spaces
with application to single image super-resolution,” in CVPR, 2013, pp. 345–352.
[21] C. Dong, C. C. Loy, K. He, and X. Tang, “Learning a deep convolutional network for image
super-resolution,” in ECCV, 2014, pp. 184–199.
[22] J. Kim, J. K. Lee, and K. M. Lee, “Accurate image super-resolution using very deep
convolutional networks,” in CVPR, 2016.
[23] J. Kim, J. K. Lee, and K. M. Lee “Deeply-recursive convolutional network for image super-
resolution,” in CVPR, 2016.
[24] Z. Wang, Y. Yang, Z. Wang, S. Chang, W. Han, J. Yang, and T. Huang, “Self-tuned deep super
resolution,” in CVPR Workshops, 2015, pp. 1–8.
[25] W. Shi, J. Caballero, F. Huszár, J. Totz, A. P. Aitken, R. Bishop, D. Rueckert, and Z. Wang,
“Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural
network,” in CVPR, 2016, pp. 1874–1883.
[26] C. Dong, C. C. Loy, and X. Tang, “Accelerating the super-resolution convolutional neural
network,” in ECCV. Springer, 2016, pp. 391–407.
[27] Z. Cui, H. Chang, S. Shan, B. Zhong, and X. Chen, “Deep network cascade for image super-
resolution,” in ECCV, 2014, pp. 49–64.
[28] R. Wang and D. Tao, “Non-local auto-encoder with collaborative stabilization for image
restoration,” IEEE Trans. Image Process, vol. 25, no. 5, pp. 2117–2129, 2016.
[29] K. Zeng, J. Yu, R. Wang, C. Li, and D. Tao, “Coupled deep auto-encoder for single image super-
resolution,” IEEE Trans. Cybern., no. 99, pp. 1–11, 2016.
[30] Z. Wang, D. Liu, J. Yang, W. Han, and T. Huang, “Deep networks for image super-resolution
with sparse prior,” in ICCV, 2015, pp. 370–378.
[31] J. Johnson, A. Alahi, and L. Fei-Fei, “Perceptual losses for real-time style transfer and super-
resolution,” ECCV, 2016.
[32] J. Bruna, P. Sprechmann, and Y. LeCun, “Super-resolution with deep convolutional sufficient
statistics,” ICLR, 2016.

