=Paper=
{{Paper
|id=Vol-2353/paper52
|storemode=property
|title=Two-Stage Method for Adaptive Binarization of Raster Engineering Drawings
|pdfUrl=https://ceur-ws.org/Vol-2353/paper51.pdf
|volume=Vol-2353
|authors=Vera Molchanova,Dmitry Mironenko
|dblpUrl=https://dblp.org/rec/conf/cmis/MolchanovaM19
}}
==Two-Stage Method for Adaptive Binarization of Raster Engineering Drawings==
<pdf width="1500px">https://ceur-ws.org/Vol-2353/paper51.pdf</pdf>
<pre>
     Two-stage method for adaptive binarization of
            raster engineering drawings
      Vera Molchanova [0000-0001-5646-1410], Dmitry Mironenko [0000-0001-5646-1410]

    Pryazovskyi State Technical University, str. Universytets’ka 7, Mariupol 87555, Ukraine

                   molchanova_vs@i.ua, mironenko_ds@ukr.net


        Abstract. The paper describes an original model and a two-stage method for
        adaptive binarization of raster engineering drawings. The method allows
        binarization of images whose brightness histograms are not bimodal, while
        preserving thin contour lines in the resulting images. The method adapts to the
        brightness of the image background. A comparative analysis of test results for
        the developed method and several well-known methods for binarization of raster
        images confirms that the proposed solutions are effective. The method was
        tested on a representative set of 40 raster engineering drawings of varying
        quality.


        Keywords: drawing, binarization, brightness histogram, threshold, contour,
        background, local area.


1       Introduction

The task of raster-to-vector transformation of engineering drawing images requires
ensuring homotopy between the original and resulting images at all stages of image
processing. The binarization stage solves the problem of dividing a bitmap into a
contour and a background. Binary image processing requires fewer computational
resources and is theoretically simpler than gray-scale image processing, which
explains why binarization is expedient.
   A literature review has shown that there are many threshold methods for image
processing [1-5]. They describe binarization of generalized images, but they do not
take into account the specifics of brightness distribution and contrast in engineering
drawings.
   The GMT (Global Multi-stage Thresholding) [6] methodology is the most promising
binarization technology for this subject domain. It provides multistage binarization
of images. The research on three-section brightness histograms [7, 8] is also of
interest for further development.
2        Formal problem statement

A bitmap drawing is a two-dimensional matrix of raster points R^(*) (1):

    R^{(*)} = \{ r_{ij} \}_{i=0,\,j=0}^{M-1,\,N-1}, \quad r_{ij} \in C = \{ c_k \}_{k=0}^{K-1}        (1)

where r_ij is the point (i, j) of the raster drawing, C is a color palette of K colors,
and M, N are the width and height of the image.
  The color palette of a color image R^(CF) consists of K colors \{ c_k \}_{k=0}^{K-1}.
The color palette of a grayscale image R^(M) consists of 256 shades of gray. The color
palette of a binary image R^(B) has two colors, \{ c_k \}_{k=0}^{1} = \{0, 1\}. Here
white, coded as «0», is the background color, and black, coded as «1», is the
contour-line color.
    The purpose of binarizing a color image R^(CF) is its homotopic mapping to a binary
image, R^{(CF)} \to R^{(B)}, within a permissible error. The colored points
\{ r_{ij} \}_{i=0,\,j=0}^{M-1,\,N-1} are transformed into two-color points by
classifying each of them as background or contour.
   As a binarization quality criterion we selected the minimal Hamming distance
between the binarized image R^(B) and the reference image R^(0) (2) [9, 10]:

    \rho\left( R^{(B)}, R^{(0)} \right) = \sum_{i=1}^{M} \sum_{j=1}^{N} \left( r_{ij}^{(B)} \oplus r_{ij}^{(0)} \right) \to 0        (2)

   The image quality metrics (MSE, PSNR, CNR, SSIM) serve as measures of this
distance.
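
   The Hamming distance used as the quality criterion can be illustrated with a
minimal sketch. It assumes both images are given as nested lists of 0/1 values of the
same size; the function name and the sample data are illustrative and do not come from
the paper.

```python
def hamming_distance(binary, reference):
    """Count the points at which two equally sized 0/1 images differ."""
    if len(binary) != len(reference) or any(
        len(a) != len(b) for a, b in zip(binary, reference)
    ):
        raise ValueError("images must have the same shape")
    # Each mismatching pair of points contributes 1 to the distance.
    return sum(
        p != q
        for row_b, row_r in zip(binary, reference)
        for p, q in zip(row_b, row_r)
    )


result = [[0, 1, 1], [0, 0, 1]]
reference = [[0, 1, 0], [0, 1, 1]]
print(hamming_distance(result, reference))  # 2
```

   A distance of 0 means the binarized image coincides with the reference point by
point.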


3        Literature review

A generalized n-stage scheme for binarization of raster images was proposed in [5–7].
Images of engineering drawings are a typical example of contour-line images. A
good-quality binarization result for contour-line images can be obtained even with
n = 2 [6]. Therefore, we will consider only two-stage binarization.
   Suppose a color drawing image contains a dark, contrasting contour and an uneven
chromatic background (for example, a blueprint or tracing paper). Brightness
histograms of such images usually have three sections (Fig. 1). As a rule, most points
in such histograms have brightness in the vicinity of two prevailing values:
r_max^(B) (for points which belong to the contour) and r_max^(T) (for points which
belong to the background).
   In this case, if the shadow zone is limited by a bottom global binarization threshold
T(B), the points of this zone with brightness rij < T(B) certainly belong to the contour.
If the light zone is limited by a top global binarization threshold T(T), the points of
this zone with brightness rij > T(T) certainly belong to the background. Points with
brightness T(B) ≤ rij ≤ T(T) belong to the uncertainty interval [T(B); T(T)]. They need
additional classification as contour or background, based on the brightness estimates
of their local neighborhoods.
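
   The three-way split described above can be sketched as follows. This is only an
illustration of the decision rule; the label strings, function names and threshold
values are assumptions made for the example.

```python
CONTOUR, BACKGROUND, UNCERTAIN = "contour", "background", "uncertain"


def classify_point(brightness, t_low, t_high):
    """Assign a point to a zone using the two global thresholds T(B) and T(T)."""
    if brightness < t_low:
        return CONTOUR          # shadow zone: certainly contour
    if brightness > t_high:
        return BACKGROUND       # light zone: certainly background
    return UNCERTAIN            # needs local analysis of the neighborhood


def classify_image(gray, t_low, t_high):
    """Classify every point of a grayscale image given as nested lists."""
    return [[classify_point(v, t_low, t_high) for v in row] for row in gray]


print(classify_image([[10, 120, 250]], t_low=60, t_high=200))
# [['contour', 'uncertain', 'background']]
```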


   Fig. 1. Partitioning of the brightness histogram for a gray-scale contour-line image

   In [7, 8] a two-stage method for estimating global thresholds is proposed for
three-sectioned brightness histograms (Fig. 1).
   At the first stage, the lower and upper binarization thresholds are estimated. To do
this, the minimum brightness values are calculated for the points which belong to the
shadow zone or to the light zone of the brightness histogram. A solution to this task
is proposed in [6], where the QIR method is described. This method finds the extremum
of a quadratic polynomial which approximates the brightness histogram on an
information interval.
   At the second stage, the final threshold value is selected as the average of the
lower and upper global thresholds. However, this approach is unacceptable here because
the brightness histograms of images with a non-uniform background (for example,
drawings on tracing paper and blueprints) contain many local extrema. In addition,
this approach does not take into account the mutual influence of the shadow zone and
the light zone when the estimates of the global thresholds are calculated. This leads
to incomplete removal of the chromatic background from the images and to the
appearance of binary noise near the contour.
4      The proposed two-stage method for contour-line images
       binarization

In view of the foregoing, we propose an adaptive two-stage binarization method for
images with three-sectioned brightness histograms. This method takes into
consideration the brightness characteristics of contour-line images with a non-uniform
background (tracing paper, blueprint). It includes:
   ─     estimation of the global binarization thresholds and a search for the image
points that definitely belong to the contour or to the background;
   ─     classification of the points in the uncertainty interval as contour or
background, based on estimates of the local binarization thresholds.
    The proposed method adapts to the image brightness characteristics of different
carrier types (drawing paper, tracing paper, blueprint) by introducing an additional
adaptive coefficient α into the formulas for calculating the global and local
binarization thresholds.


5      The method for global adaptive binarization thresholds
       estimation

The proposed method for calculating the two global binarization thresholds relies on a
statistical analysis of the brightness distribution of the image points. The obtained
result is then adjusted depending on the drawing carrier type (drawing paper, tracing
paper, blueprint). It includes the following steps:
   1.     The color drawing image is converted to a grayscale image. A brightness
histogram H(n, rij) is calculated for it [5] and approximated by a cubic B-spline. This
removes minor fluctuations in the brightness histogram of the image [11].
   2.     The brightness distribution of the image is analyzed in terms of the sizes of
the shadow zone and the light zone within its brightness range. The brightness
characteristics of the left part (marked «B») and the right part (marked «T») of the
histogram are analyzed separately (3-6). This eliminates the mutual influence of the
shadow and light zones when the upper and lower binarization thresholds are
calculated.


    \lim_{i \to 0} h_i \to L_{\min}^{(B)},\ h_i \neq 0; \quad
    \lim_{i \to 127} h_i \to L_{\max}^{(B)},\ h_i \neq 0; \quad
    \bar{L}^{(B)} = \frac{1}{n^{(B)}} \sum_{i=0}^{127} i \cdot h_i                          (3)

    \beta^{(B)} = \frac{\bar{L}^{(B)} - L_{\min}^{(B)}}{L_{\max}^{(B)} - L_{\min}^{(B)}}, \quad
    \gamma^{(B)} = 1 - \beta^{(B)}                                                           (4)

    \lim_{i \to 128} h_i \to L_{\min}^{(T)},\ h_i \neq 0; \quad
    \lim_{i \to 255} h_i \to L_{\max}^{(T)},\ h_i \neq 0; \quad
    \bar{L}^{(T)} = \frac{1}{n^{(T)}} \sum_{i=128}^{255} i \cdot h_i                        (5)

    \beta^{(T)} = \frac{\bar{L}^{(T)} - L_{\min}^{(T)}}{L_{\max}^{(T)} - L_{\min}^{(T)}}, \quad
    \gamma^{(T)} = 1 - \beta^{(T)}                                                           (6)

where h_i is the bin of colour «i» in the brightness histogram of the image; n(B), n(T)
are the numbers of image points which belong to the left and to the right part of the
histogram respectively; [L_min^(B); L_max^(B)] and [L_min^(T); L_max^(T)] are the
brightness ranges (bounded by the lowest and highest brightness levels with non-empty
bins); L̄(B), L̄(T) are the average brightness values in the left and the right part of
the brightness histogram respectively; β(B), β(T) are the estimates of the relative
size of the shadow zone; γ(B), γ(T) are the estimates of the relative size of the light
zone.
   3.    The estimates of the lower (T(B)) and the upper (T(T)) binarization thresholds
can be calculated using formulas (7-8):

    T^{(B)} = \alpha^{(B)} \left( \beta^{(B)} \bar{L}^{(B)} + \gamma^{(B)} L_{\min}^{(B)} \right), \quad
    \alpha^{(B)} = 1 - k \cdot [\ldots]^{(B)}                                                (7)

    T^{(T)} = \alpha^{(T)} \left( \beta^{(T)} \bar{L}^{(T)} + \gamma^{(T)} L_{\min}^{(T)} \right), \quad
    \alpha^{(T)} = 1 - k \cdot [\ldots]^{(T)}                                                (8)

where α(B), α(T) and k are the tuning coefficients.
   4.    The sensitivity of the method to the background tonality of the image on three
equiprobable carriers (blueprint, tracing paper, drawing paper) can be changed using
the tuning coefficients α(B), α(T). Their values should be selected from the range
[0.95; 1.00].
   As a result, the points of a drawing image {rij} are classified into:
   ─     points of the binary contour, r_ij^(b) = 0, whose brightness is less than the
lower global binarization threshold (i.e. rij < T(B));
   ─     points of the binary background of the image, r_ij^(b) = 1, whose brightness is
greater than the upper global binarization threshold (i.e. rij > T(T)).
  The remaining points of the image stay in the interval of uncertainty.
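
   The steps of this section can be sketched in code under several loudly stated
assumptions. The threshold is assumed to have the form T = α(β·L̄ + γ·L_min), which
is one reading of formulas (7-8); the coefficient α is passed in directly (from the
recommended range [0.95; 1.00]) instead of being derived from k; the B-spline smoothing
of the histogram is skipped; all function names are illustrative.

```python
def half_statistics(hist, start, stop):
    """Return (L_min, L_max, L_mean) over the non-empty bins start..stop-1."""
    levels = [i for i in range(start, stop) if hist[i] > 0]
    if not levels:
        raise ValueError("this part of the histogram is empty")
    n = sum(hist[i] for i in levels)               # number of points in this part
    l_mean = sum(i * hist[i] for i in levels) / n  # average brightness
    return levels[0], levels[-1], l_mean


def global_threshold(hist, start, stop, alpha=1.0):
    """Assumed form: T = alpha * (beta * L_mean + gamma * L_min)."""
    l_min, l_max, l_mean = half_statistics(hist, start, stop)
    if l_max == l_min:          # single brightness level: no spread to weigh
        return alpha * l_mean
    beta = (l_mean - l_min) / (l_max - l_min)
    gamma = 1.0 - beta
    return alpha * (beta * l_mean + gamma * l_min)


def global_thresholds(hist, alpha_low=1.0, alpha_high=1.0):
    """Lower threshold from levels 0..127, upper threshold from levels 128..255."""
    return (global_threshold(hist, 0, 128, alpha_low),
            global_threshold(hist, 128, 256, alpha_high))


# Synthetic 256-bin histogram used only for illustration.
hist = [0] * 256
hist[20], hist[60] = 100, 100
hist[180], hist[240] = 100, 300
t_low, t_high = global_thresholds(hist)
print(t_low, t_high)  # 30.0 213.75
```

   Processing the two halves of the histogram independently mirrors the stated goal of
removing the mutual influence of the shadow and light zones.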


6          The method for local adaptive binarization thresholds
           estimation

    As noted above, in the brightness histograms of drawing images the shadow zone is
smaller than the light zone, and the bimodality of those histograms is weakly
expressed. However, bimodality in local areas of an image is more distinct. Therefore,
the points of the drawing whose brightness lies in the interval of uncertainty are
divided into background and contour points by the local threshold t_m = f(ω_m, L_m).
The formula used to calculate the local binarization threshold takes into account the
influence of the uneven distribution L_m of the background brightness and the type of
image carrier. It allows the size dω to vary for each local area. Local binarization
includes the following steps:
   1.    Local areas ωm are built around each point rij whose brightness belongs to the
uncertainty interval. The points of an area are chosen to be equidistant from and
centered on the point rij. The brightness histograms of those areas are quasi-bimodal
(Fig. 2).


                Fig. 2. The local ωm area for the point rij with the radius dω

   In accordance with the Gibbs principle [11], it suffices to consider only the image
points which are located along the base vectors. Their brightness has the greatest
effect on the result of binarizing the point rij.
   It has been established experimentally [12] that thin contour lines can be binarized
successfully when dω ∈ [1; 9]. They are usually deformed when dω > 9. This is due to
the way smoothed raster contour lines are constructed by the Bresenham algorithm [5]:
the lines are drawn not only in black, and some of their points are grey. The
brightness of such points can exceed the local binarization threshold tm.
   The initial value of dω is calculated according to expressions (9-10).
                                 rij  L max      
                        d  9                 1                               (9)
                                 L max  L min    

    \lim_{i \to 0} h_i \to L_{\min},\ h_i \neq 0; \quad
    \lim_{i \to 255} h_i \to L_{\max},\ h_i \neq 0; \quad
    \bar{L} = \frac{1}{n} \sum_{i=0}^{255} i \cdot h_i                                      (10)

   So, the value of dω is inversely proportional to the brightness of the analyzed
point. This preserves thin contour lines after binarization.
   Next, we calculate the minimal L_min^(ω), maximal L_max^(ω) and average L̄^(ω)
estimates of the brightness of the points inside the local area ωm:

    L_{\min}^{(\omega)} = \min \begin{cases}
        \min r_{k+i,\,l},   & k \in [-d_\omega, d_\omega],\ l = 0, \\
        \min r_{k,\,l+j},   & l \in [-d_\omega, d_\omega],\ k = 0, \\
        \min r_{k+i,\,l+j}, & k, l \in [-d_\omega, d_\omega]
    \end{cases}                                                                              (11)

    L_{\max}^{(\omega)} = \max \begin{cases}
        \max r_{k+i,\,l},   & k \in [-d_\omega, d_\omega],\ l = 0, \\
        \max r_{k,\,l+j},   & l \in [-d_\omega, d_\omega],\ k = 0, \\
        \max r_{k+i,\,l+j}, & k, l \in [-d_\omega, d_\omega]
    \end{cases}                                                                              (12)

    \bar{L}^{(\omega)} = \frac{1}{9 \cdot d_\omega} \left(
        \sum_{k=-d_\omega}^{d_\omega} r_{k+i,\,l} +
        \sum_{k=-d_\omega}^{d_\omega} r_{k,\,l+j} +
        \sum_{k=-d_\omega}^{d_\omega} r_{k+i,\,l+j} \right)                                  (13)
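   After that, we estimate the sizes of the shadow zone (β(ω)) and of the light zone
(γ(ω)) in the local area ωm of the point rij (14):

    \beta^{(\omega)} = \frac{\bar{L}^{(\omega)} - L_{\min}^{(\omega)}}{L_{\max}^{(\omega)} - L_{\min}^{(\omega)}}, \quad
    \gamma^{(\omega)} = \frac{L_{\max}^{(\omega)} - \bar{L}^{(\omega)}}{L_{\max}^{(\omega)} - L_{\min}^{(\omega)}} = 1 - \beta^{(\omega)}        (14)
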
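  The local binarization threshold can be calculated by formula (15):

    t_m = \alpha^{(\omega)} \left( \beta^{(\omega)} \bar{L}^{(\omega)} + \gamma^{(\omega)} L_{\min}^{(\omega)} \right), \quad
    \alpha^{(\omega)} = 1 - k \cdot [\ldots]^{(\omega)}                                      (15)
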
   So, the proposed method for calculating the local binarization threshold adapts to
the brightness of the central point. This is ensured by choosing an optimal diameter of
the local area for each point whose brightness lies in the uncertainty zone. In
addition, the method takes into account the effect of the background tonality on the
binarization result for different types of drawing carriers. As a result, the binary
image contains a minimal number of artifacts.
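
   A rough sketch of the local stage is given below; it is not the authors'
implementation. It assumes that the "base vectors" are the horizontal, the vertical
and the two diagonal directions through the point, that the average is the plain
arithmetic mean of the sampled points, that the threshold has the form
t = α(β·L̄ + γ·L_min), and that a point darker than its local threshold is contour
(coded 1, with background coded 0). The radius d and the coefficient alpha are passed
in by the caller.

```python
def local_threshold(gray, i, j, d, alpha=1.0):
    """Assumed local threshold for point (i, j) with neighborhood radius d."""
    rows, cols = len(gray), len(gray[0])
    offsets = set()
    for k in range(-d, d + 1):
        # Horizontal, vertical and both diagonal directions (an assumption).
        offsets.update({(k, 0), (0, k), (k, k), (k, -k)})
    # Sample only the points that fall inside the image.
    values = [
        gray[i + di][j + dj]
        for di, dj in offsets
        if 0 <= i + di < rows and 0 <= j + dj < cols
    ]
    l_min, l_max = min(values), max(values)
    l_mean = sum(values) / len(values)
    if l_max == l_min:          # flat neighborhood: nothing to separate
        return alpha * l_mean
    beta = (l_mean - l_min) / (l_max - l_min)   # relative size of the shadow zone
    gamma = 1.0 - beta                          # relative size of the light zone
    return alpha * (beta * l_mean + gamma * l_min)


def binarize_uncertain_point(gray, i, j, d, alpha=1.0):
    """Return 1 (contour) or 0 (background) for a point in the uncertainty interval."""
    return 1 if gray[i][j] < local_threshold(gray, i, j, d, alpha) else 0


gray = [
    [200, 200, 200],
    [200, 100, 200],
    [200, 200, 200],
]
print(round(local_threshold(gray, 1, 1, 1), 2))  # 179.01
print(binarize_uncertain_point(gray, 1, 1, 1))   # 1
print(binarize_uncertain_point(gray, 0, 0, 1))   # 0
```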
   We have developed an information system that shows the test results of the developed
method and of some of the most popular alternative binarization methods. The proposed
method preserves the integrity of contour lines, and all artifacts near the contour are
removed (Fig. 3, a). Alternative binarization methods lead to contour line doubling
(Fig. 3, b).

Fig. 3. The result of color image binarization: a) using the developed method;
b) using alternative binarization methods

7       Experiments and results

Analysis of the test results of the method proposed in this work confirms that the
binarization quality is sufficient to preserve the topological features of the
contours. We used a random sample of 40 color images of various sizes to test the
method. Such a sample guarantees representative results with a 95% probability when
n > 23 [15] (16):

    n = \frac{p \, (100 - p) \, t^2}{\Delta^2} = \frac{15 \cdot 85 \cdot 2^2}{15^2} \approx 23        (16)

where n is the minimum sample size when the value of the general concurrence is
unknown, p is the average error (p = 15%), t is the confidence coefficient, which shows
the probability that the error will not exceed a limit value (when t = 2, the
probability of an accurate prediction is 95%), and Δ is the boundary error (Δ = 15%).
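
   The arithmetic behind the minimum sample size can be checked with a short sketch.
The function and parameter names are illustrative; `delta` stands for the boundary
error, and rounding up to a whole image is an assumption about how the value 23 was
obtained.

```python
import math


def minimum_sample_size(p, t, delta):
    """n = p * (100 - p) * t^2 / delta^2, rounded up to a whole number of images."""
    return math.ceil(p * (100 - p) * t ** 2 / delta ** 2)


print(minimum_sample_size(p=15, t=2, delta=15))  # 23
```

   The unrounded value is 5100 / 225 ≈ 22.67, so a sample of 40 images exceeds the
required minimum.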
   The reliability and efficiency analysis of the developed method is based on a
comparison of its results with those of several well-known alternative methods
(Bernsen, Niblack, Otsu, Sauvola, Phansalkar). These methods are commonly used for
similar tasks and are generally accepted as effective.
   We chose several generally accepted metrics to quantify the results:
PRECISION, RECALL, F-MEASURE, ACCURACY [16]. For each image from the test sample, we
repeated the experiment 5 times and then averaged the results. The relative improvement
in the values of the basic metrics for the proposed method over several alternative
methods is summarized in Table 1.

    Table 1. The relative improvement of the values of the basic metrics for the proposed
                         method over several alternative methods

                                          The alternative method
    The basic metric           Bernsen   Niblack   Otsu     Sauvola   Phansalkar   Average by methods
    PRECISION                  1.70      0.07      -0.08    8.17      0.22         2.01
    RECALL                     11.02     81.67     44.42    3.57      0.45         28.22
    F-MEASURE                  7.58      71.03     36.60    8.05      0.45         24.74
    ACCURACY                   2.11      49.34     19.45    2.15      0.01         14.61
    Average performance (%)    5.60      50.53     25.10    5.48      0.28         17.40

  So, the developed method improves the values of the basic image quality metrics by
17.4% on average relative to the alternative methods. This allows us to draw a positive
conclusion about its reliability and effectiveness.
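
   The four metrics used in this section can be computed from a binarized image and a
reference image as in the sketch below. It assumes that images are nested lists of 0/1
values and that contour points (coded 1) form the positive class; both the function
names and the sample data are illustrative.

```python
def confusion_counts(result, reference):
    """Return (tp, fp, fn, tn) with contour points (value 1) as the positive class."""
    tp = fp = fn = tn = 0
    for row_res, row_ref in zip(result, reference):
        for p, q in zip(row_res, row_ref):
            if p == 1 and q == 1:
                tp += 1
            elif p == 1 and q == 0:
                fp += 1
            elif p == 0 and q == 1:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def quality_metrics(result, reference):
    """Compute precision, recall, F-measure and accuracy for two binary images."""
    tp, fp, fn, tn = confusion_counts(result, reference)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall,
            "f_measure": f_measure, "accuracy": accuracy}


result = [[1, 1, 0], [0, 0, 1]]
reference = [[1, 0, 0], [0, 1, 1]]
metrics = quality_metrics(result, reference)
print(metrics)
```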
8      Conclusion

It has been established that the homotopic mapping of a color drawing image to a
binary drawing image can be performed by converting the brightness of its points to a
two-color palette, classifying each point as background or contour. The proposed method
takes into account the influence of the brightness characteristics of images made on
different carriers and the need to preserve the topological features of the contour
(for example, small primitives and thin lines).
   The proposed method significantly increases the quantitative quality indicators of
converting color drawing images to binary ones. It includes:
   ─      the method for global adaptive binarization thresholds estimation;
   ─      the method for local adaptive binarization thresholds estimation.
   The first method is based on an analysis of the shapes and the ratio of the light
and shadow zones in the brightness histograms. This allows us to take into account the
global brightness heterogeneity of the drawing images.
   The second method is based on an analysis of the brightness balance in areas with
quasi-bimodal brightness histograms. This allows us to take into account the local
brightness features of the drawing images by adapting to different types of carriers.


References
 1. Otsu, N.: A threshold selection method from gray-level histograms. IEEE Transactions
    on Systems, Man, and Cybernetics. 9(1): 62-66 (1979) doi: 10.1109/TSMC.1979.4310076
 2. Ablamejko, S. V.: An Introduction to Interpretation of Graphic Images. SPIE Press,
    Washington (1997)
 3. Nixon, M. S.: Feature Extraction and Image Processing for Computer Vision. Academic
    Press, New York (2013)
 4. Garg, N. K.: Binarization Techniques used for Grey Scale Images. International
    Journal of Computer Applications. 71(1): 8-11 (2013) doi: 10.5120/12320-8533
 5. Gonzalez, R. C., Woods, R. E.: Digital Image Processing. Second edition. Prentice
    Hall, New Jersey (2006)
 6. Solihin, Y., Leedham, C. G.: Integral Ratio: A New Class of Global Thresholding
    Techniques for Handwriting Images. IEEE Transactions on Pattern Analysis and Machine
    Intelligence. 21(8): 761-768 (1999) doi: 10.1109/34.784289
 7. Dou, J., Zhang, W.: A Fast Thresholding Technique in Image Binarization for Embedded
    System. Indonesian Journal of Electrical Engineering and Computer Science.
    12(1): 592-598 (2014) doi: 10.11591/telkomnika.v12i1.3359
 8. Muñoz, I., Rubio-Celorio, M.: Computer image analysis as a tool for classifying
    marbling: A case study in dry-cured ham. Journal of Food Engineering. 166: 148-155
    (2015) doi: 10.1016/j.jfoodeng.2015.06.004
 9. Blahut, R.: Theory and Practice of Error Control Codes. Addison–Wesley Press, Boston
    (1983)
10. Chandler, D. M.: Seven Challenges in Image Quality Assessment: Past, Present, and
    Future Research. ISRN Signal Processing. 2013(8): 1-55 (2013)
    doi: 10.1155/2013/905685
11. Myasnikov, V. V., Popov, S. P., Sergeyev, V. V., Soifer, V. A.: Computer Image
    Processing, Part I: Basic concepts and theory. Edited by Victor A. Soifer. VDM
    Verlag, Saarbrücken (2009)
12. Molchanova, V. S.: The adaptive threshold binarization method of raster images of
    technical drawings. Radio Electronics, Computer Science, Control. 33(2): 62-70 (2015)
    doi: 10.15588/1607-3274-2015-2-8
13. Molchanova, V. S.: Development of a hybrid adaptive noise suppression method in a
    raster image. Eastern-European Journal of Enterprise Technologies. 74(4): 35-43
    (2015) doi: 10.15587/1729-4061.2015.47415
14. Parker, J. R.: Algorithms for Image Processing and Computer Vision. Second edition.
    Wiley Publishing, Hoboken (2011)
15. Lindgren, B. W.: Statistical Theory. MacMillan Publishing Company, London (1976)
16. Wang, Z., Bovik, A. C.: A Universal Image Quality Index. IEEE Signal Processing
    Letters. 9(3): 81-84 (2002) doi: 10.1109/97.995823

</pre>