The regression model for the procedure of correction of
photos damaged by backlighting

                A V Goncharova1, I V Safonov1 and I A Romanov1


                1 National Research Nuclear University MEPhI (Moscow Engineering Physics Institute),
                Kashirskoe Shosse, 31, Moscow, Russia, 115409


                e-mail: alen.gon4arowa@gmail.com


                Abstract. We propose an approach for selecting a correction parameter for images
                damaged by backlighting. We consider photos containing areas that are underexposed due to
                backlit conditions. Such areas are dark and have poorly discernible details. The correction
                parameter controls the level of amplification of local contrast in shadow tones. In addition, the
                correction parameter can be considered a quality estimation factor for such photos. For
                automatic selection of the correction parameter, we apply regression by supervised machine
                learning. We propose new features calculated from the co-occurrence matrix for training
                the regression model. We compare the performance of the following techniques: the least
                squares method, support vector machine, CART, random forest, two shallow
                neural networks, as well as blending and stacking of several models. We apply a two-stage
                approach to collect a large training dataset: an initial model is trained on a manually
                labeled dataset containing about two hundred photos, and then we use the initial model to
                search for photos damaged by backlighting in social networks that have a public API. This approach
                allowed us to collect about 1000 photos together with their preliminary quality assessments,
                which were corrected by experts where necessary. In addition, we investigate the application of
                several well-known blind quality metrics to the assessment of photos affected by backlighting.


1. Introduction
Many photos are affected by various defects and need to be enhanced automatically to be
more pleasant for observers. The most noticeable defects are the following: various issues with
brightness and contrast, color imbalance, blurring and shaking, compression artifacts, high noise
level, red eyes and other artifacts due to flash, color fringing, and geometrical distortions [1]. There
are numerous methods for noise suppression (e.g. [2]), red-eye correction (e.g. [3]), and image
sharpening (e.g. [4]), but there are just a few publications devoted to the correction of photos damaged by
backlighting. Photos taken in backlighting conditions have high global contrast, but local contrast in areas of
shadow tones is quite low. Figure 1 shows a photo affected by backlighting. One can see
poorly distinguishable details in shadows. It is important to develop a method for enhancement of such
images.
    Paper [5] describes a technique for the correction of photos damaged by backlighting. That method is
based on contrast stretching and alpha-blending of the brightness of the initial image and an estimated
reflectance. In the majority of cases, the technique provides good visual outcomes; nevertheless, it has

                    V International Conference on "Information Technology and Nanotechnology" (ITNT-2019)
Image Processing and Earth Remote Sensing
A V Goncharova, I V Safonov and I A Romanov


shortcomings. The most important parameter of the method is the factor $k_s$, which controls the
amplification of local contrast in shadows. This factor is calculated by means of a decision tree based on
features derived from the brightness histogram. The decision tree yields just five
discrete values of $k_s$. As a result, insignificant alterations in an image sometimes lead to significant
changes in correction strength. Such an effect is undesirable. Also, that decision tree was created based on
several heuristic assumptions and was not verified on a large number of sample photos, so it is unclear
whether that solution generalizes to the wide variety of photos affected by backlighting.
    The aim of our paper is to overcome the disadvantages of the method from [5] by developing
a regression model for the estimation of $k_s$ based on machine learning techniques. It is worth
noting that the factor $k_s$ can also be treated as a blind metric for assessing the visual quality of photos
damaged by backlighting. In the paper, we discuss three subjects: an approach to collecting a
representative dataset; the selection of a method for creating the regression model; and algorithms for
calculating and selecting informative features for the model.


                                 Figure 1. Example of a photo damaged by backlighting.


                                  Figure 2. The scheme for dataset collection.

2. Collection of a dataset
To the best of our knowledge, there is no publicly available dataset containing photos deteriorated by
backlighting. Moreover, collecting a large dataset is a tiresome task because relatively few
such photos are available on the Internet and in social networks, so a manual search requires much effort
and a long time. We employ the concept of semi-supervised learning and self-training [6] for the
collection of the representative dataset. An initial dataset containing about two hundred photos was
collected and labeled manually. We trained the initial regression model on the initial dataset using the
random forest technique [7]. Nowadays many social networks provide an application programming interface for
downloading photos. We made a software tool for downloading photos from flickr.com and vk.com
[8]. The initial regression model estimates $k_s$ for the downloaded photos, and the tool collects photos having the
required $k_s$ to provide a uniform distribution of images over the correction factor. Evidently, the initial model is
not ideal, so an expert checks the outcome and re-labels photos if necessary. We then add the
validated photos to the main dataset. Photos from the initial dataset are added to the main dataset as
well. In this way, we collected about 1000 labeled images. The final model was trained on the main
dataset. Figure 2 illustrates our approach to the dataset collection.
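
The selection step that keeps the distribution of images over $k_s$ uniform can be illustrated with a minimal Python sketch. It assumes that the predictions of the initial model are already available as an array. The function name, the range of $k_s$, the number of bins, and the per-bin quota are hypothetical and are not specified in the paper.

```python
import numpy as np

def select_for_uniform_coverage(pred_ks, ks_min, ks_max, n_bins, per_bin):
    """Return indices of candidate photos, keeping at most per_bin photos per k_s bin."""
    pred_ks = np.asarray(pred_ks, dtype=float)
    edges = np.linspace(ks_min, ks_max, n_bins + 1)
    # Using only the inner edges maps every value to a bin index 0..n_bins-1;
    # predictions outside [ks_min, ks_max] fall into the first or last bin.
    bins = np.digitize(pred_ks, edges[1:-1])
    counts = np.zeros(n_bins, dtype=int)
    kept = []
    for idx, b in enumerate(bins):
        if counts[b] < per_bin:  # this bin still needs photos
            counts[b] += 1
            kept.append(idx)
    return kept
```

Photos accepted in this way would still be passed to an expert for validation and possible re-labeling, as described above.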

3. Analysis of methods for creation of a regression model
We analyzed the following methods for creating a regression model:
    • linear least squares (LLS);
    • support vector machine for regression (SVR) [9];
    • k-nearest neighbors for regression (k-NN) [10];
    • classification and regression tree (CART) [11];
    • random forest [7];
    • feedforward neural network [12];
    • single-layer neural network [13];
    • averaging of the outcomes of the methods listed above;
    • blending [14];
    • averaging of several outcomes of blending;
    • stacking [15].
   There are several measures to evaluate the performance of the regression models. We calculated
the following five measures that evaluate conformance of outcomes of the regression model with
experts’ judgments.
   1. Mean absolute error (MAE):
$$\mathrm{MAE}(y, f) = \frac{1}{N} \sum_{i=1}^{N} |y_i - f_i| ; \tag{1}$$
where $N$ is the number of elements, $f_i$ is the predicted value, and $y_i$ is the true value.
   2. Mean squared error (MSE):
$$\mathrm{MSE}(y, f) = \frac{1}{N} \sum_{i=1}^{N} (y_i - f_i)^2 . \tag{2}$$
   3. Median absolute error (MedAE):
$$\mathrm{MedAE}(y, f) = \mathrm{median}(|y_1 - f_1|, \ldots, |y_N - f_N|). \tag{3}$$
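   4. Pearson correlation coefficient (r):
$$r(y, f) = \frac{\sum_{i=1}^{N} (y_i - \bar{y})(f_i - \bar{f})}{\sqrt{\sum_{i=1}^{N} (y_i - \bar{y})^2} \, \sqrt{\sum_{i=1}^{N} (f_i - \bar{f})^2}} , \tag{4}$$
where $\bar{y} = \frac{1}{N} \sum_{i=1}^{N} y_i$ and $\bar{f} = \frac{1}{N} \sum_{i=1}^{N} f_i$.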
    5. Normalized area under regression error characteristic curve (AUC REC) [16].
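
The first four measures follow directly from their definitions and can be computed as in the sketch below. The function name and the dictionary layout are illustrative assumptions. AUC REC is omitted because its normalization is not detailed here.

```python
import numpy as np

def regression_measures(y, f):
    """MAE, MSE, MedAE and Pearson r between true values y and predictions f."""
    y = np.asarray(y, dtype=float)
    f = np.asarray(f, dtype=float)
    abs_err = np.abs(y - f)
    dy = y - y.mean()  # deviations of the true values from their mean
    df = f - f.mean()  # deviations of the predictions from their mean
    r = np.sum(dy * df) / (np.sqrt(np.sum(dy ** 2)) * np.sqrt(np.sum(df ** 2)))
    return {
        "MAE": float(abs_err.mean()),
        "MSE": float(np.mean((y - f) ** 2)),
        "MedAE": float(np.median(abs_err)),
        "r": float(r),
    }
```

Note that r is undefined when either y or f is constant, since the denominator is then zero.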
    We used 5-fold cross-validation with stratification. All models were trained using the features from
[5]. Table 1 contains the performance measures of the different regression models. For MAE, MSE and
MedAE, smaller is better. For r and AUC REC, larger is better. Random forest and SVR have the
best performance measures. However, the distribution of residuals of the random forest
model is closer to normal than that of the SVR model. In addition, random forest has a
relatively small number of parameters for model adjustment. Thus, we selected random forest for
the creation of the regression model for the estimation of the amplification factor $k_s$.
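
The paper does not state how stratification was performed for a continuous target. One possible way, shown here only as an assumption, is to sort the samples by $k_s$ and deal every consecutive group of five samples out to the five folds, so that each fold covers the whole range of $k_s$. The function name and the random seed are illustrative.

```python
import numpy as np

def stratified_regression_folds(y, n_folds=5, seed=0):
    """Assign each sample a fold index so that every fold spans the whole range of y."""
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    order = np.argsort(y, kind="stable")  # sample indices sorted by target value
    folds = np.empty(len(y), dtype=int)
    # Take the sorted samples in groups of n_folds and give each member of a
    # group to a different fold, in random order.
    for start in range(0, len(y), n_folds):
        group = order[start:start + n_folds]
        folds[group] = rng.permutation(n_folds)[:len(group)]
    return folds
```

Fold `k` then serves as the validation set while the remaining folds are used for training, and the measures are averaged over the five runs.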


                     Table 1. Performance measures of different regression models.
                                               MAE       MSE      MedAE    r         normAUC
           LLS                                 15.2      350      13.3     0.59      0.694
           SVR                                 15.1      377      12.1     0.54      0.727
           k-NN                                16.7      453      14.0     0.38      0.709
           CART                                18.5      559      15.3     0.41      0.723
           Random forest                       14.8      334      13.1     0.61      0.711
           Feedforward neural network          15.0      350      12.9     0.59      0.706
           Single-layer neural network         15.0      344      12.9     0.59      0.692
           Averaging of methods                14.8      340      12.7     0.59      0.708
           Blending                            15.0      350      13.2     0.58      0.721
           Averaging of several blendings      14.9      340      13.2     0.60      0.718
           Stacking                            14.9      339      13.4     0.60      0.714

4. Features for the description of photos damaged by backlit
Paper [5] describes a feature set for the estimation of the amplification factor $k_s$. Those features are
calculated from the brightness histogram H. A typical histogram of a photo damaged by backlighting has
relatively high peaks in shadows and/or highlights, but a gap in the middle tones. Besides, the histogram in
dark tones is asymmetric. To characterize such a histogram shape, the following features were
proposed: the fractions of tones in the shadows and in the middle tones; the fractions of tones in the first and
second halves of the dark tones; the ratios of the histogram maxima in shadows, middle tones, and highlights to
the global histogram maximum; and the locations of the histogram maxima in shadows and highlights. The
entire dynamic range was divided uniformly into shadows, middle tones, and highlights.
    However, there are high-quality images whose values of those features are close to the values for photos
affected by backlighting. Sometimes normal images have a histogram with peaks near the boundaries of the
dynamic range and a valley between them. Figure 3 shows an example of such a pristine photo and its
brightness histogram. The method from [5] makes the dark areas in this photo lighter. It is necessary
to prevent the modification of dark tones in normal images. To avoid the undesirable correction of
high-quality photos, we propose to use another set of features. We modify the histogram-based features
from [5] and introduce new features derived from the co-occurrence matrix.
    In figure 3 one can see that the peaks in the left and right parts of the histogram are shifted towards the
middle tones. We propose to treat the leftmost quarter of the dynamic range as shadow tones
instead of one third. Highlight tones are the rightmost quarter of the dynamic range. Middle
tones occupy half of the range. Solid lines in the histogram in figure 3 show this division into dark,
middle and light tones. Dashed lines show the corresponding division in [5]. This simple alteration of the
sub-ranges for shadows, highlights, and middle tones considerably decreases the number of
falsely corrected pristine images, even when features similar to those of [5] are used.
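
The effect of moving the sub-range boundaries can be illustrated with a short sketch that computes the fractions of pixels in shadows, middle tones, and highlights. It assumes 8-bit brightness values. The function name and the exact integer boundaries (64 and 192 for the quarter division, 85 and 171 for the division into thirds) are assumptions made for illustration.

```python
import numpy as np

def tone_fractions(gray, shadow_end, highlight_start):
    """Fractions of pixels in shadows [0, shadow_end), middle tones, and
    highlights [highlight_start, 256) of an 8-bit brightness image."""
    hist = np.bincount(np.asarray(gray).ravel(), minlength=256)
    total = hist.sum()
    shadows = hist[:shadow_end].sum() / total
    highlights = hist[highlight_start:].sum() / total
    return shadows, 1.0 - shadows - highlights, highlights

# Tiny example image: a pixel with brightness 70 counts as a shadow under
# the division into thirds but as a middle tone under the quarter division.
gray = np.array([[10, 70, 130], [200, 250, 60]], dtype=np.uint8)
quarters = tone_fractions(gray, 64, 192)
thirds = tone_fractions(gray, 85, 171)
```

With the narrower shadow sub-range, peaks that are shifted towards the middle tones no longer contribute to the shadow features.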


    Figure 3. A high-quality photo and its brightness histogram, which looks like the
                           histogram of a photo damaged by backlighting.


    Another option is to divide the range into more than three sub-ranges and to calculate the
features for each sub-range. We propose to employ nine evenly distributed sub-ranges.
    We argue that extracting features from the co-occurrence matrix is a promising approach to
the description of photos affected by backlighting. Neighboring pixels of natural images have close values,
so the non-zero elements of the co-occurrence matrix are, as a rule, located along its main diagonal. For
photos damaged by backlighting, the co-occurrence matrix has the highest concentrations of elements
in the top-left or bottom-right corners as well as gaps in the central part of the main diagonal. Figure 4
shows the co-occurrence matrices for an undistorted photo and a photo damaged by backlighting.


           Figure 4. Co-occurrence matrices for undistorted (left) and damaged (right) photos.
   We propose to divide the co-occurrence matrix into three parts orthogonally to the main diagonal,
similarly to the shadow, middle-tone and highlight sub-ranges of a histogram. It is possible to divide the
main diagonal into three equal parts (see dashed lines in figure 4), but it is preferable to make the range of
middle tones twice as large as the ranges of dark and bright tones (see solid lines in figure 4). The following
features are calculated from the co-occurrence matrix G, where pairs of pixels at a distance of 3 pixels in a
column are considered. Fractions of G in dark, middle, and light tones:
$$S_1 = \sum_{i=0}^{127} \sum_{j=0}^{127-i} G(i, j) / (M \times N), \tag{5}$$
$$S_2 = 1 - (S_1 + S_3), \tag{6}$$
$$S_3 = \sum_{i=128}^{255} \sum_{j=383-i}^{255} G(i, j) / (M \times N), \tag{7}$$
where $M$ is the number of rows and $N$ is the number of columns of the image.
  Fractions of G in sub-regions of dark tones:
$$S_{11} = \sum_{i=0}^{63} \sum_{j=0}^{63-i} G(i, j) / (M \times N), \tag{8}$$
$$S_{12} = S_1 - S_{11}. \tag{9}$$
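
Equations (5)-(9) can be illustrated with the following sketch. It assumes an 8-bit brightness image and interprets the pixel pairs as pixels of the same row whose column indices differ by 3, which agrees with the total of $M \times (N - 3)$ pairs used in the threshold $G_0$ later in this section. This reading and the function names are assumptions. The triangular summation limits are expressed through the equivalent conditions $i + j \le 127$, $i + j \ge 383$, and $i + j \le 63$.

```python
import numpy as np

def cooccurrence(gray, offset=3):
    """256x256 co-occurrence matrix of pixel pairs (r, c) and (r, c + offset)."""
    gray = np.asarray(gray)
    a = gray[:, :-offset].ravel().astype(np.intp)  # first pixel of each pair
    b = gray[:, offset:].ravel().astype(np.intp)   # second pixel of each pair
    G = np.zeros((256, 256), dtype=np.int64)
    np.add.at(G, (a, b), 1)  # unbuffered add, so repeated pairs are all counted
    return G

def fraction_features(G, M, N):
    """S1, S2, S3, S11 and S12 according to equations (5)-(9)."""
    i, j = np.indices(G.shape)
    s = i + j  # constant along lines orthogonal to the main diagonal
    S1 = G[s <= 127].sum() / (M * N)
    S3 = G[s >= 383].sum() / (M * N)
    S11 = G[s <= 63].sum() / (M * N)
    return {"S1": S1, "S2": 1.0 - (S1 + S3), "S3": S3, "S11": S11, "S12": S1 - S11}
```

For an image whose upper half is black and lower half is white, all pairs fall into the two corners of G, so S1 and S3 are large and the middle part of the diagonal is empty.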
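   Ratios of maxima in dark tones and maxima in bright tones to global maxima:
$$M_1 = \max_{\substack{i=0 \ldots 127, \\ j=0 \ldots 127-i}} G(i, j) \Big/ \max_{\substack{i=0 \ldots 255, \\ j=0 \ldots 255}} G(i, j), \tag{10}$$
$$M_3 = \max_{\substack{i=128 \ldots 255, \\ j=383-i \ldots 255}} G(i, j) \Big/ \max_{\substack{i=0 \ldots 255, \\ j=0 \ldots 255}} G(i, j). \tag{11}$$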
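   Locations of the maxima of the matrix G in the left and in the right parts:
$$P_1 = \operatorname*{arg\,max}_{\substack{i=0 \ldots 127, \\ j=0 \ldots 127-i}} G(i, j), \tag{12}$$
$$P_3 = \operatorname*{arg\,max}_{\substack{i=128 \ldots 255, \\ j=383-i \ldots 255}} G(i, j). \tag{13}$$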
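
The peak features can be sketched as follows. The sketch reads $P_1$ and $P_3$ as the index pairs $(i, j)$ at which the regional maxima are attained. How a location is encoded as a numeric feature is not specified here, so returning a tuple is an assumption, as is the function name. The matrix G is assumed to contain at least one non-zero element.

```python
import numpy as np

def peak_features(G):
    """Ratios M1, M3 of regional maxima to the global maximum of G and the
    locations P1, P3 of the regional maxima."""
    G = np.asarray(G)
    i, j = np.indices(G.shape)
    s = i + j
    g_max = G.max()

    def region_peak(mask):
        # Counts are non-negative, so -1 never wins outside the region.
        masked = np.where(mask, G, -1)
        loc = np.unravel_index(int(masked.argmax()), G.shape)
        return float(G[loc] / g_max), (int(loc[0]), int(loc[1]))

    M1, P1 = region_peak(s <= 127)  # dark-tone corner
    M3, P3 = region_peak(s >= 383)  # bright-tone corner
    return {"M1": M1, "M3": M3, "P1": P1, "P3": P3}
```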


   The following features are the numbers of elements of the co-occurrence matrix G that are
not less than a threshold equal to the average value of G:
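$$\mathrm{thres}(y, G_0) = \begin{cases} 1, & y \ge G_0, \\ 0, & y < G_0, \end{cases} \tag{14}$$
$$G_0 = M \times (N - 3) / 256^2, \tag{15}$$
$$A_1 = \sum_{i=0}^{127} \sum_{j=0}^{127-i} \mathrm{thres}(G(i, j), G_0), \tag{16}$$
$$A_2 = \sum_{i=0}^{127} \sum_{j=127-i}^{255} \mathrm{thres}(G(i, j), G_0) + \sum_{i=128}^{255} \sum_{j=0}^{383-i} \mathrm{thres}(G(i, j), G_0), \tag{17}$$
$$A_3 = \sum_{i=128}^{255} \sum_{j=383-i}^{255} \mathrm{thres}(G(i, j), G_0), \tag{18}$$
where $M$ is the number of rows and $N$ is the number of columns of the image.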
   Also, we calculate the ratios of the last three features:
$$R_1 = A_2 / A_1, \tag{19}$$
$$R_2 = A_2 / A_3, \tag{20}$$
$$R_3 = A_1 / A_3. \tag{21}$$
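
A possible implementation of the counting features and their ratios is sketched below. Two details are assumptions of the sketch rather than statements of the paper: the middle region is taken as the complement of the two corner regions, so elements on the boundary lines are counted once, and a ratio with a zero denominator is returned as NaN. The function name is illustrative.

```python
import numpy as np

def count_features(G, M, N, offset=3):
    """A1, A2, A3 (numbers of elements of G not below the average G0) and
    their ratios R1, R2, R3."""
    G = np.asarray(G)
    G0 = M * (N - offset) / 256 ** 2  # average value of G, equation (15)
    above = G >= G0
    i, j = np.indices(G.shape)
    s = i + j
    A1 = int(above[s <= 127].sum())   # dark-tone corner
    A3 = int(above[s >= 383].sum())   # bright-tone corner
    A2 = int(above.sum()) - A1 - A3   # remaining middle band (assumption)

    def ratio(a, b):
        return a / b if b else float("nan")

    return {"A1": A1, "A2": A2, "A3": A3,
            "R1": ratio(A2, A1), "R2": ratio(A2, A3), "R3": ratio(A1, A3)}
```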
   Additionally, the same features (5)-(21) can be extracted from large fragments of the image. We divide a
photo into 2 equal parts vertically and 3 parts horizontally. In total, we calculate about two hundred
features from the histogram and the co-occurrence matrix for the entire image and its fragments.
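
The division into fragments can be sketched with a small helper. The text does not specify whether the two divisions are applied separately or combined into a grid, nor the exact orientation of the cuts, so the helper below simply splits an image into a grid of the requested size. The function name and both readings shown in the comments are assumptions.

```python
import numpy as np

def split_grid(img, n_rows, n_cols):
    """Split an image into n_rows x n_cols fragments, listed row by row."""
    img = np.asarray(img)
    tiles = []
    for band in np.array_split(img, n_rows, axis=0):
        tiles.extend(np.array_split(band, n_cols, axis=1))
    return tiles

# Two possible readings of the division described above:
#   separate splits: split_grid(img, 1, 2) + split_grid(img, 3, 1)  -> 5 fragments
#   combined grid:   split_grid(img, 3, 2)                          -> 6 fragments
```

The features are then computed for each fragment in the same way as for the entire image.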

5. Enrichment of feature set
To enrich our feature set, we combined features and then applied feature selection.
The idea of combining features is based on the assumption that some combinations, for
example, the sum, product, or ratio of different features and their squares, can improve
the regression model. Therefore, it is advisable to try as many such combinations as possible.
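
Such combinations can be generated mechanically, as in the sketch below. The function name, the naming scheme of the new features, and the small constant that guards against division by zero are illustrative assumptions. The sketch also assumes non-negative features, which holds for the fractions and counts described above.

```python
import numpy as np
from itertools import combinations

def combine_features(X, names, eps=1e-12):
    """Append squares of all features and sums, products and ratios of all pairs.

    X is an array of shape (n_samples, n_features); names holds the feature names.
    """
    X = np.asarray(X, dtype=float)
    cols, out_names = [], []
    for k, name in enumerate(names):
        cols.append(X[:, k] ** 2)
        out_names.append(f"{name}^2")
    for a, b in combinations(range(X.shape[1]), 2):
        xa, xb = X[:, a], X[:, b]
        cols += [xa + xb, xa * xb, xa / (xb + eps)]
        out_names += [f"{names[a]}+{names[b]}", f"{names[a]}*{names[b]}",
                      f"{names[a]}/{names[b]}"]
    return np.column_stack(cols), out_names
```

The number of combinations grows quadratically with the number of features, which is why a selection step is needed afterwards.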
    Ideally, features should be informative, that is, they should have a high correlation with the
ground-truth amplification factor $k_s$. Obviously, the informativeness of various features differs
greatly. The use of non-informative features can worsen the regression model. It is
necessary to select “good” features and drop “bad” ones. For feature selection, we employed greedy
addition of features to the random forest regression model. If adding a feature to the model
decreases the MAE, then we keep the feature; otherwise, we drop it.
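
The greedy procedure can be sketched as a single pass over the candidate features. The callback `mae_of` is a hypothetical placeholder: it is assumed to train the random forest on the given feature indices and to return the resulting error, for example estimated by cross-validation. The order in which the features are tried is also an assumption.

```python
def greedy_select(n_features, mae_of):
    """Keep a feature only if adding it lowers the error returned by mae_of.

    mae_of(indices) is a user-supplied function that trains a model on the
    listed feature indices and returns its error (lower is better).
    """
    selected = []
    best = float("inf")
    for k in range(n_features):
        trial = selected + [k]
        score = mae_of(trial)
        if score < best:  # the feature helps, so it stays in the model
            selected, best = trial, score
    return selected, best
```

A feature that leaves the error unchanged is dropped, which keeps the selected set small.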

6. Results
We tested and compared all the feature sets described above. We do not show the
intermediate outcomes in the paper because of their large volume. The best result is achieved by the model that uses the
features $S_3$, $S_{11}$, $M_1$, $M_3$, $P_1$, $P_3$, $A_2$, $A_3$, $R_1$ and the ratio $S_{12}/M_1$, where the features are
calculated for the whole image as well as for its fragments.
    As mentioned above, the amplification factor for dark tones $k_s$ can be treated as a blind quality factor
for images damaged by backlighting. A considerable number of blind image quality metrics have
been proposed to date. Such metrics make it possible to evaluate quality based on the image alone, without
a reference. Some of the metrics are designed to assess a single factor affecting quality, for
example, the blurriness level [17]. Others claim to be universal. We compared the correlation
coefficient r between the experts’ judgments about $k_s$ and several measures for quality assessment.
    The following existing algorithms for no-reference quality assessment were analyzed. The Blind
Image Quality Index (BIQI) [18] implements a two-step approach to assessing the quality of photographs.
This method is based on features derived from natural scene statistics (NSS) in the wavelet
domain and on the assumptions that photos of natural scenes have definite statistical characteristics, that these
characteristics change due to distortions, and that the type and strength of a distortion can be predicted. The
first stage of BIQI is the classification of the type of defect. The second one is a numerical quality assessment
by means of a regression model. A support vector machine (SVM) is applied for training the classification
and regression models. The Blind/Referenceless Image Spatial Quality Evaluator (BRISQUE) [19] uses
NSS features in the spatial domain. One regression model for all distortions is trained by SVM.
The Oriented Gradients Image Quality Assessment (OG-IQA) method [22] analyzes the correlation of
oriented gradients in the spatial domain. It is assumed that the orientations of local gradients change
predictably for distorted images of natural scenes. One regression model for all distortions is trained
via Adaptive Boosting for decision trees. The Natural Image Quality Evaluator (NIQE) [20] does not use
distorted images for training. In this method, a multivariate Gaussian (MVG) model based on NSS
features in the spatial domain is calculated for pristine photos only. The quality of the assessed image
is estimated as the distance between its MVG model and the pre-calculated MVG model of several undistorted images.
The Integrated Local Natural Image Quality Evaluator (IL-NIQE) [21] exploits the same idea as
NIQE, but IL-NIQE operates on the color channels of the photo in salient local patches.

  Table 2. Correlation coefficient between experts’ judgments and measures for quality assessment.
                              Decision tree [5]    BIQI     BRISQUE    NIQE    IL-NIQE    OG-IQA    Proposed
Correlation coefficient, r    0.47                 -0.22    0.05       0.13    0.23       0.13      0.69
    Table 2 contains the correlation coefficients between the ground truth and the measures for quality
evaluation. The amplification factor $k_s$ serves as the quality metric for the algorithm from [5] and for the proposed
regression model. Our method provides the best conformance to the experts’ judgments. Well-known
universal image quality measures have a low correlation with the ground truth and cannot be used for
characterizing the quality of photos damaged by backlighting.
    Table 3 contains the outcomes of a comparison of three methods for the estimation of the amplification
factor $k_s$. We analyzed the decision tree from [5] as a baseline, a random forest regression model
trained on the features from [5], and a random forest model trained on the proposed features. According
to all performance measures, our technique outperforms the considered alternatives.

                            Table 3. Comparison of methods for estimation of 𝑘𝑠 .
                                                MAE          MSE           MedAE             r       normAUC
 Decision tree from [5]                         21.5         800            17.0           0.47        0.700
 Random forest with features from [5]           14.8         334            13.1           0.61        0.711
 Random forest with proposed features           13.5         282            11.7           0.69        0.724

7. Conclusion
Photos taken in backlighting conditions need to be enhanced to make them more pleasant to view. There is a
technique for the correction of dark tones, and that method needs automatic adjustment of the parameter that
controls the amplification of dark tones. For this purpose, we propose a feature set
extracted from the co-occurrence matrix and a random forest regression model. To collect the
training dataset, we employed the semi-supervised paradigm for obtaining the required photos from social
networks. The developed regression model outperforms the baseline method. In addition, our model
produces smooth continuous values of the amplification factor rather than the step-wise discrete values of the
existing method. The proposed algorithm is intended for photo enhancement software.

8. References
[1] Safonov I V, Kurilin I V, Rychagov M N and Tolstaya E V 2018 Adaptive Image Processing
      Algorithms for Printing (Singapore: Springer Nature)
[2] Thang P C and Kopylov A V 2018 Tree-serial parametric dynamic programming with flexible
      prior model for image denoising Computer Optics 42(5) 838-845 DOI: 10.18287/2412-6179-2018-42-5-838-845
[3] Safonov I V, Rychagov M N, Kang K and Kim S H 2008 Automatic red eye correction and its
      quality metric Proc. of SPIE 6807 68070W


[4]    Safonov I V, Rychagov M N, Kang K and Kim S H 2008 Adaptive sharpening of photos Proc.
       of SPIE 6807 68070U
[5]    Safonov I V 2006 Automatic correction of amateur photos damaged by backlighting Proc. of
       GraphiCon 80-89
[6]    Triguero I, García S and Herrera F 2015 Self-labeled techniques for semi-supervised learning:
       taxonomy, software and empirical study Knowledge and Information systems 42(2) 245-284
[7]    Breiman L 2001 Random forests Machine learning 45(1) 5-32
[8]    Goncharova A V 2018 No-reference metrics for the quality of images damaged by backlighting
       Proc. of DSPA 2 568-572
[9]    Drucker H, Burges C J, Kaufman L, Smola A J and Vapnik V 1997 Support vector regression
       machines Proc. of NIPS 155-161
[10]   Altman N S 1992 An introduction to kernel and nearest-neighbor nonparametric regression The
       American Statistician 46(3) 175-185
[11]   Breiman L, Friedman J, Stone C J and Olshen R A 1984 Classification and regression trees
       (Boca Raton: CRC Press)
[12]   MATLAB feedforwardnet URL: https://www.mathworks.com/help/deeplearning/ref/feedforwardnet.html
       (01.04.2019)
[13]   MATLAB linearlayer URL: https://www.mathworks.com/help/deeplearning/ref/linearlayer.html
       (01.04.2019)
[14]   Jahrer M, Töscher A and Legenstein R 2010 Combining predictions for accurate recommender
       systems Proc. of ACM SIGKDD 693-702
[15]   Wolpert D H 1992 Stacked generalization Neural networks 5(2) 241-259
[16]   Bi J and Bennett K P 2003 Regression error characteristic curves Proc. of ICML 43-50
[17]   Asatryan D G 2017 Image blur estimation using gradient field analysis Computer Optics 41(6)
       957-962 DOI: 10.18287/2412-6179-2017-41-6-957-962
[18]   Moorthy A K and Bovik A C 2010 A two-step framework for constructing blind image quality
       indices IEEE Signal processing letters 17(5) 513-516
[19]   Mittal A, Moorthy A K and Bovik A C 2012 No-reference image quality assessment in the
       spatial domain IEEE Transactions on Image Processing 21(12) 4695-4708
[20]   Mittal A, Soundararajan R and Bovik A C 2013 Making a “completely blind” image quality
       analyzer IEEE Signal Processing Letters 20(3) 209-212
[21]   Zhang L, Zhang L and Bovik A C 2015 A feature-enriched completely blind image quality
       evaluator IEEE Transactions on Image Processing 24(8) 2579-2591
[22]   Liu L, Hua Y, Zhao Q, Huang H and Bovik A C 2016 Blind image quality assessment by
       relative gradient statistics and adaboosting neural network Signal Processing: Image
       Communication 40 1-15

