=Paper= {{Paper |id=Vol-2744/short19 |storemode=property |title=Data Balancing Method for Training Segmentation Neural Networks (short paper) |pdfUrl=https://ceur-ws.org/Vol-2744/short19.pdf |volume=Vol-2744 |authors=Alexey Kochkarev,Alexander Khvostikov,Dmitry Korshunov,Andrey Krylov,Mikhail Boguslavskiy }} ==Data Balancing Method for Training Segmentation Neural Networks (short paper)== https://ceur-ws.org/Vol-2744/short19.pdf
     Data Balancing Method for Training Segmentation
                    Neural Networks*

        Alexey Kochkarev1[0000−0003−4630−2832] , Alexander Khvostikov1[0000−0002−4217−7141] ,
        Dmitry Korshunov2[0000−0002−8500−7193] , Andrey Krylov1[0000−0001−9910−4501] ,
        and Mikhail Boguslavskiy2[0000−0002−1582−8033]

           1
             Faculty of Computational Mathematics and Cybernetics,
           2
             Faculty of Geology,
           Lomonosov Moscow State University, Moscow, Russia
           alexey.kochkarev@gmail.com, khvostikov@cs.msu.ru, Dmit0korsh@gmail.com,
           kryl@cs.msu.ru, mikhail@geol.msu.ru



        Abstract. Data imbalance is a common problem in machine learning and image
        processing. The lack of training data for the rarest classes can impair learning
        and degrade the quality of segmentation. In this paper we focus on the problem
        of data balancing for the task of image segmentation. We review the major
        trends in handling unbalanced data and propose a new data balancing method
        based on the Distance Transform. The method is designed for use with
        segmentation convolutional neural networks (CNNs), but it is universal and can
        be used with any patch-based segmentation machine learning model. The
        proposed method is evaluated on two datasets. The first is the medical LiTS
        dataset, containing CT images of livers with tumor abnormalities. The second is
        a geological dataset containing photographs of polished sections of different
        ores. The proposed algorithm enhances the balance between classes and
        improves the overall performance of the CNN model.

        Keywords: Image Segmentation, Data Balancing, Convolutional Neural Net-
        works, Liver, Tumor, Geology.


1     Introduction

Data imbalance is a common issue in image segmentation [1]. If pixels corresponding
to a particular “majority” class are far more numerous than pixels of one or more “mi-
nority” classes, the rarity of the “minority” classes in the training data makes the training
process less effective and worsens the final results, as the learned model will tend to
classify most pixels as members of the “majority” classes.
    The problem of data imbalance is very common in medical applications and, in par-
ticular, in detecting liver tumors. One such problem is the segmentation of CT images,
since the volumes and areas of different organs and abnormalities differ greatly. In a typical CT
    Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons Li-
    cense Attribution 4.0 International (CC BY 4.0).
*
    The work was funded by RFBR, CNPq and MOST according to the research project 19-57-
    80014 (BRICS2019-394)

image of a liver tumor, the volume of healthy liver tissue is significantly greater than
the volume of cancerous tissue [2].
     The data imbalance problem also occurs in geological image segmentation. Some min-
erals are found in nature much less often than others. For example, photographs of
polished sections taken from lead-zinc deposits contain considerably more Sphalerite
(about 30%) than Galena (about 7%).
     The existing methods used to overcome class imbalance can be divided into
two main categories [3].
     The first category comprises algorithm-based methods. One such approach is
cost-sensitive learning [4], whose idea is to assign different costs to classification
mistakes on different classes. A common scheme assigns each class a cost equal to the
inverse of its proportion in the dataset, so that the model is penalized most heavily for
mistakes on the rarest classes.
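As an illustration, the inverse-proportion weighting can be computed as follows (a minimal NumPy sketch; the function name is ours, not taken from [4]):

```python
import numpy as np

def inverse_frequency_weights(labels: np.ndarray) -> np.ndarray:
    """Cost per class = inverse of the class's proportion in the dataset."""
    counts = np.bincount(labels.ravel())          # pixels (or samples) per class
    proportions = counts / counts.sum()
    return 1.0 / proportions

# 80% of pixels are class 0, 20% are class 1:
labels = np.array([0] * 8 + [1] * 2)
weights = inverse_frequency_weights(labels)       # [1.25, 5.0]
```

The resulting weights would typically be passed to the loss function, so a mistake on the rare class costs four times more than one on the frequent class.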
     The second category comprises so-called data-based methods, which use sampling
techniques to rebalance the class distribution during preprocessing. This involves over-
sampling instances of the minority class, undersampling instances of the majority
class, or both [5]. One generalization of these approaches is the Synthetic Minority
Over-Sampling Technique (SMOTE) [6], which generates synthetic samples of the
minority class to train on, reducing the class imbalance by artificially inflating the
minority class. It should be noted, however, that SMOTE and the methods derived
from it [7][8] are designed for machine learning classification problems, and either
cannot be applied to the segmentation problem at all or demonstrate mediocre results.
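The core SMOTE idea, interpolating between a minority sample and a nearby minority neighbour, can be sketched as follows (a simplified single-nearest-neighbour illustration under our own naming, not the reference implementation of [6]):

```python
import numpy as np

def smote_like_oversample(X_min: np.ndarray, n_new: int, rng=None) -> np.ndarray:
    """Generate n_new synthetic minority samples by linear interpolation
    between a random minority sample and its nearest minority neighbour
    (the basic SMOTE idea; real SMOTE picks among k nearest neighbours)."""
    rng = np.random.default_rng(rng)
    n = len(X_min)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(n)
        # nearest minority neighbour of X_min[i], excluding itself
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        d[i] = np.inf
        j = np.argmin(d)
        gap = rng.random()                       # random point on the segment
        synthetic.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.array(synthetic)
```

Every synthetic sample lies on a segment between two real minority samples, which is exactly why the technique suits per-sample classification but transfers poorly to dense per-pixel segmentation masks.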
     In this paper we propose a data balancing method that modifies the class distribu-
tion of the dataset using only data from the original set, without replicating additional
minority samples. The method is designed specifically for segmentation problems and
has a wide range of applications.


2   Proposed method

Segmentation is defined as finding a mapping of the source color image I ∈ Rh×w×3
of height h and width w to the annotation S ∈ Zh×w , containing labels (or classes). Let
us consider a set of classes C = {c0 , c1 , ..., cN }.
    For convenience we internally store the annotation as a 2-D array of integers
(S = S(x, y)), each value of which is the label of the corresponding class: S(x, y) ∈ C.
The source images are stored as 3-D arrays of floating-point numbers: I = I(x, y), I(x, y) ∈ R3 .
    The proposed method is applied to the data fed into the neural network during
training. On every step of each training epoch the neural network receives a batch of
square patches taken from the dataset images {I1 (x, y), ..., Im (x, y)}. Conventional
random patch selection can only aggravate the existing data imbalance: an object may
be smaller than a patch, so however the patch covers such an object, the ratio of object
to non-object pixels inside the patch cannot be increased. The proposed method instead
selects patches that contain the largest number of pixels of a chosen class, which keeps
the classes balanced in the images fed into the network. The proposed method consists
of three main steps:

 1. choose the class that has so far been fed to the model most rarely;
 2. choose an image according to the number of pixels of the chosen class it contains;
 3. crop the patch of the selected image that contains the greatest number of pixels
    of the chosen class.
All these stages are considered in detail below.
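The three steps above can be combined into one sampling routine, sketched here in NumPy (names are ours; step 3 is done by brute force for clarity, whereas the paper accelerates it with the Distance Transform and summed-area tables described below):

```python
import numpy as np

def pick_training_patch(images, annotations, fed_counts, patch_size, rng=None):
    """One sampling step of the proposed scheme (illustrative sketch):
    1. pick the class fed to the model most rarely so far,
    2. pick an image with probability proportional to that class's pixel count,
    3. crop the patch containing the most pixels of that class."""
    rng = np.random.default_rng(rng)
    k = int(np.argmin(fed_counts))                         # step 1: rarest class
    w = np.array([(ann == k).sum() for ann in annotations], dtype=float)
    i = int(rng.choice(len(images), p=w / w.sum()))        # step 2: weighted choice
    ann = annotations[i]
    # step 3: brute-force scan over all patch positions
    best, best_yx = -1, (0, 0)
    for y in range(ann.shape[0] - patch_size + 1):
        for x in range(ann.shape[1] - patch_size + 1):
            c = (ann[y:y + patch_size, x:x + patch_size] == k).sum()
            if c > best:
                best, best_yx = c, (y, x)
    y, x = best_yx
    return images[i][y:y + patch_size, x:x + patch_size], k
```

The brute-force scan is O(h·w·p²) per image; the distance-transform and summed-area-table machinery below exists precisely to avoid this cost.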

2.1   Class choice
A square area of an image Ii (x, y) is called a patch and is denoted P (x, y). The corre-
sponding area of the annotation Si (x, y) is denoted PAnn (x, y). After each patch is
chosen, we count the number of its pixels belonging to each class:
                   sj = |(x, y) : PAnn (x, y) = cj |, j = 0, 1, ..., N.                 (1)
At this step we choose the class ck for which the accumulated amount of pixels is the smallest:
                               sk = min{s0 , s1 , ..., sN }.                            (2)
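A minimal sketch of this bookkeeping, assuming a running per-class total of pixels fed so far (names are ours, not from the paper):

```python
import numpy as np

def update_and_choose(patch_ann: np.ndarray, fed_pixels: np.ndarray) -> int:
    """Add the patch's per-class pixel counts (Eq. 1) to the running totals
    and return the class with the smallest total (Eq. 2)."""
    fed_pixels += np.bincount(patch_ann.ravel(), minlength=len(fed_pixels))
    return int(np.argmin(fed_pixels))

# running totals for N + 1 = 3 classes
fed_pixels = np.zeros(3, dtype=np.int64)
```

Each call both updates the statistics and names the class the next patch should serve.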

2.2   Image choice
For every image Ii in the dataset we compute the weights {w0 , w1 , ..., wN }, where wj
is the number of pixels of class cj in the image:
                     wj = |(x, y) : Si (x, y) = cj |, j = 0, 1, ..., N.                 (3)
An image is then drawn from the dataset {I1 , ..., Im } with probability proportional to
its weight wk for the chosen class ck .

2.3   Patch choice
At this step we select the patch that contains the greatest number of pixels of the chosen
class ck . To perform this choice, we first calculate a probability map whose value at each
pixel is the probability of placing the upper-left corner of the patch at that pixel. First,
a distance-to-class map is built. Let the annotation S(x, y) consist of two classes,
Object (c1 ) and Background (c0 ):
                                   S(x, y) ∈ {c0 , c1 }.                                (4)
At first, we apply the Distance Transform [9] and obtain a map Sd (x, y), where each
pixel holds the distance to the nearest object pixel of the annotation S(x, y):

        Sd (x, y) = 0                                                 if S(x, y) = c1 ,
        Sd (x, y) = min{ ‖(x − x0 , y − y0 )‖L2 : S(x0 , y0 ) = c1 }  if S(x, y) = c0 ,  (5)

                         where ‖(x, y)‖L2 = √(x² + y²).                                  (6)
Consider a patch P (x, y) of size p × p in the annotation S(x, y). The smaller the sum of
values over the corresponding area of Sd (x, y), the more pixels of the chosen class that
area contains. Let us define:

                     S̃d (x, y) = 1 − Sd (x, y) / max (Sd (x, y)).                       (7)

The larger the sum over the corresponding area of S̃d (x, y), the more pixels of the
chosen class that area contains. Summing the pixels of S̃d (x, y) over every p × p area
of the image, we obtain the desired probability map P r(x, y). The higher the value at a
pixel of P r(x, y), the more pixels of the chosen class the patch will contain if its upper
left corner is placed at that pixel.
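Eqs. (5) and (7) can be illustrated with a brute-force distance map (in practice an efficient routine such as scipy.ndimage.distance_transform_edt would be used; this naive O(n²) version, with names of our own, is only for clarity):

```python
import numpy as np

def distance_to_class_map(ann: np.ndarray, cls: int) -> np.ndarray:
    """Brute-force Euclidean distance transform (Eq. 5): distance from every
    pixel to the nearest pixel of class `cls`. Assumes `cls` occurs in ann."""
    ys, xs = np.nonzero(ann == cls)
    obj = np.stack([ys, xs], axis=1)                  # object pixel coordinates
    yy, xx = np.indices(ann.shape)
    pts = np.stack([yy.ravel(), xx.ravel()], axis=1)  # all pixel coordinates
    d = np.sqrt(((pts[:, None, :] - obj[None, :, :]) ** 2).sum(-1)).min(1)
    return d.reshape(ann.shape)

def normalized_map(ann: np.ndarray, cls: int) -> np.ndarray:
    """S̃_d of Eq. (7): 1 at object pixels, decreasing toward 0 far from them."""
    sd = distance_to_class_map(ann, cls)
    return 1.0 - sd / sd.max()
```

Summing normalized_map over each p × p window (with the summed-area table below) then yields the unnormalized Pr(x, y).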
The sum of pixels over a patch has to be calculated many times. Quick and efficient
summation can be performed using a summed-area table [10]. The value at any point
(x, y) of the summed-area table is the sum of all pixels above and to the left of (x, y),
inclusive:

                      Sint (x, y) = Σ_{x0 ≤ x, y0 ≤ y} S (x0 , y0 ) .                    (8)



The summed-area table can be computed efficiently in a single pass over the image, as
the value in the summed-area table at (x, y) is just:

    Sint (x, y) = S(x, y) + Sint (x, y − 1) + Sint (x − 1, y) − Sint (x − 1, y − 1).    (9)
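The recurrence (9) can be sketched directly, and coincides with a 2-D cumulative sum (illustrative code, not the authors'):

```python
import numpy as np

def summed_area_table(S: np.ndarray) -> np.ndarray:
    """Single-pass construction of the summed-area table, Eq. (8)/(9)."""
    T = np.zeros_like(S, dtype=np.int64)
    h, w = S.shape
    for y in range(h):
        for x in range(w):
            T[y, x] = (S[y, x]
                       + (T[y - 1, x] if y > 0 else 0)       # Sint(x, y-1)
                       + (T[y, x - 1] if x > 0 else 0)       # Sint(x-1, y)
                       - (T[y - 1, x - 1] if y > 0 and x > 0 else 0))
    return T

# the recurrence is equivalent to NumPy's vectorized form:
# summed_area_table(S) == S.cumsum(0).cumsum(1)
```

The explicit loop mirrors Eq. (9) term by term; in production code the cumsum form would be preferred.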

Once the summed-area table is computed, evaluating the sum of intensities over any
rectangular area requires exactly four array references regardless of the area size:
          Σ_{x0 < x ≤ x1 , y0 < y ≤ y1 } S(x, y) = Sint (D) + Sint (A) − Sint (B) − Sint (C),  (10)

where A = (x0 , y0 ), B = (x1 , y0 ), C = (x0 , y1 ) and D = (x1 , y1 ) are the corners of the
rectangular area.
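The four-reference lookup of Eq. (10) can be sketched as follows (an illustrative helper of our own; bounds are inclusive and references falling outside the table are treated as zero):

```python
import numpy as np

def rect_sum(T: np.ndarray, y0: int, x0: int, y1: int, x1: int) -> int:
    """Sum of S over rows y0..y1 and columns x0..x1 (inclusive), using four
    references into the summed-area table T (Eq. 10)."""
    total = int(T[y1, x1])                # Sint(D), bottom-right corner
    if y0 > 0:
        total -= int(T[y0 - 1, x1])       # Sint(B), above the rectangle
    if x0 > 0:
        total -= int(T[y1, x0 - 1])       # Sint(C), left of the rectangle
    if y0 > 0 and x0 > 0:
        total += int(T[y0 - 1, x0 - 1])   # Sint(A), added back once
    return total
```

With T built once per image, scoring every candidate patch position costs four lookups instead of a p × p summation.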