=Paper= {{Paper |id=Vol-3304/paper35 |storemode=property |title=A PCB Defect Detection Algorithm with Improved Faster R-CNN |pdfUrl=https://ceur-ws.org/Vol-3304/paper35.pdf |volume=Vol-3304 |authors=Junhao Niu,Jin Huang,Lili Cui,Benxin Zhang,Aijun Zhu }} ==A PCB Defect Detection Algorithm with Improved Faster R-CNN== https://ceur-ws.org/Vol-3304/paper35.pdf
A PCB Defect Detection Algorithm with Improved Faster R-CNN
Junhao Niu 1,2, Jin Huang 1,2, Lili Cui 3,*, Benxin Zhang 1,2, Aijun Zhu 1,2
1 School of Electronic Engineering and Automation, Guilin University of Electronic Technology, Guangxi, Guilin, 541004, China
2 Guangxi Key Laboratory of Automatic Detecting Technology and Instruments, Guangxi, Guilin, 541004, China
3 School of Art and Design, Guilin University of Electronic Technology, Guangxi, Guilin, 541004, China

                 Abstract
                 This paper proposes an improved printed circuit board (PCB) defect detection algorithm
                 based on the faster region-based convolutional neural network (Faster R-CNN) to address
                 the low mean average precision (mAP), poor detection of tiny defect targets and high
                 missed-detection rate in PCB tiny-defect detection. First, a genetic algorithm is added to
                 the K-means++ clustering algorithm to generate initial anchors that match the data set in
                 this paper. The standard convolution in the ResNet50 network is then replaced by depthwise
                 separable convolution, and the resulting network serves as the backbone to reduce the
                 number of parameters. The extracted multilayer deep features are fed into the improved
                 feature pyramid network to train the model, effectively combining the geometric detail
                 information of the bottom layers with the semantic contour information of the top layers
                 to support subsequent classification and localization. The experimental results show that
                 the mAP of this algorithm is 95.6% and the detection time is 0.125 s. The mAP is 9.2
                 percentage points higher than that of the original Faster R-CNN, and the algorithm detects
                 tiny defect objects more accurately.

                 Keywords
                 printed circuit board defect detection, K-means++ clustering, genetic algorithm, anchor box,
                 Faster R-CNN, Resnet50, depth separable convolution, feature pyramid network

1. Introduction

    As an indispensable part of electronic products, the PCB directly determines whether a product
works normally, and its quality depends on every stage of production. As hardware continues to advance,
PCB design is moving towards multi-layer, surface-mount and high-density boards. In addition, PCB
production consists of many stages. For example, the production process of a single-sided board includes
cutting, drilling, copper deposition, etching, solder mask application, hot air leveling, legend printing
and electrical testing [1]. A problem in any of these stages may cause the final product to malfunction
and thus increase production costs. An effective way to ensure PCB quality is to add PCB defect detection
to the production process. Compared with electrical testing, such detection is non-contact and
nondestructive, so it better protects the PCB from damage during production. Research on PCB defect
detection algorithms is therefore necessary.
    Depending on whether components have been mounted, PCB defect detection is divided into PCB bare
board detection and PCB component detection. Reference [2] uses the inherent multi-scale pyramid
structure of deep convolutional networks to build a feature pyramid that detects tiny defects on bare
PCBs. Reference [3] proposes an improved YOLOv4 that locates and identifies components on the circuit
board and can recognize missing, wrongly installed, offset, skewed and reverse-polarity components.
At present,
the research on PCB bare board defect detection is divided into two directions: automatic optical

ICBASE2022@3rd International Conference on Big Data & Artificial Intelligence & Software Engineering, October 21-
23, 2022, Guangzhou, China
EMAIL: 30189252@qq.com (Junhao Niu); 364817579@qq.com (Jin Huang); 45105703@qq.com (Lili Cui)
ORCID: 0000-0001-6876-4566 (Junhao Niu); 0000-0002-7436-577X (Jin Huang); 0000-0002-9454-4679 (Lili Cui)
              © 2022 Copyright for this paper by its authors.
              Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
              CEUR Workshop Proceedings (CEUR-WS.org)



detection and machine vision defect detection based on deep learning. Automatic optical detection is
affected by changes in the testing environment, circuit board type, operator experience and other
factors, so it is not robust across PCB defect types, and its labor cost is high. PCBs on the market
follow different line spacing rules and have small line widths, which makes such complex and diverse
boards difficult for automatic optical detection, and its false detection rate and missed detection rate
remain high. Machine vision defect detection based on deep learning not only saves labor cost but also
greatly reduces the false detection rate and missed detection rate.
    With the rapid development of convolutional neural networks in recent years, some excellent PCB
defect detection algorithms based on deep learning have emerged. Reference [4] redesigned the anchor
clustering method for YOLOv3 and added an attention mechanism to improve the detection speed of the
algorithm. Reference [5] redesigned the anchor clustering method for YOLOv4 and raised the detection
speed to 37.09 FPS. However, most PCB defects are small. To improve the detection accuracy for small
defects, this paper combines the K-means++ clustering algorithm with a genetic algorithm to generate
initial anchors suited to small targets, and uses them with an improved Faster R-CNN detection algorithm
to detect PCB defects. This paper studies defects of bare PCBs, and the detected defects comprise six
common types: missing hole, mouse bite, open, short, spur, and spurious copper.

2. Design of improved K-means++ clustering algorithm

    The anchors of the original Faster R-CNN are designed manually for detecting large targets such as
pedestrians, vehicles and tools in the PASCAL VOC2007 data set [6]. They are not suitable for the small
defect targets in this paper, and anchors designed manually from experience are not robust. It is
therefore necessary to obtain anchors suited to tiny-defect detection by clustering and to use the anchor
sizes that fit the data set as training parameters, so that the network learns faster and yields a better
detector. This makes it easier for the network to fine-tune the anchors and improves the final
recognition accuracy and speed.
    This paper therefore uses a genetic algorithm to optimize the anchors obtained by the K-means++
algorithm. The idea is to first use the K-means++ algorithm to select suitable initial cluster centers,
then run the K-means algorithm to obtain the clustering result, and finally optimize that result with the
genetic algorithm. Three problems must be solved: how to select the initial cluster centers, which
distance measure to use between samples, and how the genetic algorithm optimizes the final result. The
following three subsections address them in turn.

2.1.    Selecting an appropriate initial cluster center

   Because the standard K-means algorithm selects its initial cluster centers randomly from all training
samples, it can easily end up with anchor sizes that do not match the data set in this paper [7].
Moreover, the number of iterations and the final clustering quality depend closely on the choice of the
initial cluster centers. To obtain better anchors, this paper uses the K-means++ algorithm to select the
K initial cluster centers one by one. Five feature layers are generated in the end and each layer is
assigned three anchors of different proportions in turn, so K is set to 15. Although the K-means++
algorithm spends more time selecting the initial cluster centers, it greatly speeds up overall
convergence and avoids unsuitable anchors, so it improves the performance of the clustering algorithm as
a whole. The selected initial cluster centers are shown in Figure 1.
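The one-by-one selection of initial centers can be illustrated with a minimal sketch of K-means++ seeding. This is not the authors' code: the function name `kmeanspp_seeds`, the sample boxes and the use of Euclidean distance in the usage lines are assumptions made for illustration, and any other distance function can be passed in instead.

```python
import math
import random


def kmeanspp_seeds(boxes, k, dist, rng=None):
    """Pick k initial centers from boxes with K-means++ (D^2 weighting)."""
    rng = rng or random.Random(0)
    centers = [rng.choice(boxes)]  # first center: uniform random sample
    while len(centers) < k:
        # Squared distance from every box to its nearest chosen center.
        d2 = [min(dist(b, c) for c in centers) ** 2 for b in boxes]
        total = sum(d2)
        if total == 0:  # every box coincides with a center already
            centers.append(rng.choice(boxes))
            continue
        # Sample the next center with probability proportional to d2.
        r = rng.random() * total
        acc = 0.0
        for box, weight in zip(boxes, d2):
            acc += weight
            if acc > r:
                centers.append(box)
                break
        else:
            # Guard against floating-point rounding: take the farthest box.
            centers.append(boxes[d2.index(max(d2))])
    return centers


# Hypothetical (width, height) samples, for illustration only.
boxes = [(9, 10), (14, 13), (40, 19), (47, 53), (23, 44), (16, 25)]
seeds = kmeanspp_seeds(boxes, 3, math.dist)
print(seeds)
```

Because boxes that are already centers have zero weight, far-away boxes are favored, which is what spreads the initial centers across the range of box sizes.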




Figure 1: Initial cluster center

2.2.    Calculating the distance between samples

    The standard K-means algorithm uses the Euclidean distance between a training sample and a cluster
center, under which larger anchors produce larger errors than smaller ones. The defect detection problem
studied in this paper requires the anchors to be as close as possible to the ground truth box sizes.
Therefore, this paper uses the distance defined in formulas (1) and (2), where B denotes a training
sample and C denotes a cluster center. The larger the intersection over union (IOU), the smaller the
distance and the more likely the two belong to the same class. This distance describes the relationship
between training samples and cluster centers more accurately and thereby improves the quality of the
final anchors.

$$ d(B, C) = 1 - \mathrm{IOU}(B, C) \tag{1} $$

$$ \mathrm{IOU}(B, C) = \frac{\mathrm{area}(B) \cap \mathrm{area}(C)}{\mathrm{area}(B) \cup \mathrm{area}(C)} \tag{2} $$

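The distance of formulas (1) and (2) can be sketched as follows. The sketch assumes that boxes are compared by shape only, that is, each box is a (width, height) pair and both boxes are aligned at a common corner; this is a common convention in anchor clustering, but the paper does not state it. The function names are illustrative.

```python
def wh_iou(box, center):
    """IOU of two (width, height) boxes aligned at a common corner."""
    inter = min(box[0], center[0]) * min(box[1], center[1])
    union = box[0] * box[1] + center[0] * center[1] - inter
    return inter / union


def iou_distance(box, center):
    """Formula (1): d(B, C) = 1 - IOU(B, C)."""
    return 1.0 - wh_iou(box, center)


# The same relative mismatch gives the same distance at any scale,
# whereas the Euclidean distance would grow tenfold for the larger pair.
small = iou_distance((10, 10), (20, 20))
large = iou_distance((100, 100), (200, 200))
print(small, large)
```

This scale independence is the reason the IOU-based distance treats small defect boxes and large boxes evenly during clustering.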

2.3.    Genetic algorithm

    The genetic algorithm draws on the idea of biological evolution and introduces the concept of
survival of the fittest into the clustering algorithm to optimize the clustering result. Its data set
optimization, flexible probabilistic search and parallel computing characteristics make up for the
deficiencies of the K-means algorithm [8]. The cluster centers obtained by the K-means algorithm are
passed to the genetic algorithm, which proceeds as follows:
    1. The widths and heights of the 15 anchors are encoded as chromosomes.
    2. The degree of overlap between the ground truth boxes and the anchors is defined as the fitness
        function. The closer the anchor sizes are to the ground truth box sizes, the higher the
        fitness, so the fitness reflects the quality of the anchors.
    3. The mutation probability is set to 90%, the number of iterations to 1000, and the fitness
        threshold to 0.25.
    4. Reproduction and mutation are applied to generate the next generation. If the fitness of the
        next generation is greater than that of the previous generation, the population is updated to
        the next generation.
    5. The algorithm ends when the new best solution equals the previous best solution or the number
        of iterations is reached.
    Figure 2 shows the final clustering results.
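The steps above can be sketched in simplified form. The sketch is an assumption-laden illustration rather than the authors' implementation: it keeps a single candidate instead of a population, uses mutation only, stops on the iteration count only, takes the mean best IOU as the fitness with matches below the threshold counted as zero, and introduces a hypothetical mutation spread `sigma`. The sample boxes and anchors are made up.

```python
import random


def wh_iou(a, b):
    """IOU of two (width, height) boxes aligned at a common corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)


def fitness(anchors, boxes, thr=0.25):
    """Mean best IOU per ground truth box; matches below thr count as 0."""
    total = 0.0
    for box in boxes:
        best = max(wh_iou(box, a) for a in anchors)
        total += best if best > thr else 0.0
    return total / len(boxes)


def evolve(anchors, boxes, generations=1000, mutation_prob=0.9,
           sigma=0.1, seed=0):
    """Mutate anchor sizes and keep a child only if its fitness improves."""
    rng = random.Random(seed)
    best = [tuple(a) for a in anchors]
    best_fit = fitness(best, boxes)
    for _ in range(generations):
        child = []
        for w, h in best:
            # Each anchor (gene pair) mutates with probability mutation_prob.
            if rng.random() < mutation_prob:
                w = max(1.0, w * rng.gauss(1.0, sigma))
                h = max(1.0, h * rng.gauss(1.0, sigma))
            child.append((w, h))
        child_fit = fitness(child, boxes)
        if child_fit > best_fit:  # survival of the fittest
            best, best_fit = child, child_fit
    return best, best_fit


# Hypothetical ground truth sizes and initial anchors, for illustration.
gt_boxes = [(10, 12), (20, 18), (30, 40), (15, 15), (25, 30)]
initial = [(12, 12), (28, 28)]
f0 = fitness(initial, gt_boxes)
evolved, f1 = evolve(initial, gt_boxes, generations=50)
print(f0, f1)
```

Since a child replaces its parent only when the fitness increases, the returned fitness can never be lower than that of the clustering result passed in.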




Figure 2: Final clustering results

3. Design of improved Faster R-CNN algorithm

    In short, the object detection task takes an image as network input and outputs the image annotated
with object bounding boxes and category scores. Driven by successive algorithm iterations and the pursuit
of speed and accuracy, object detection algorithms have developed into one-stage and two-stage
algorithms. Relatively successful one-stage algorithms include SSD [9] and the YOLO [10] series, which
are fast but perform poorly on dense and small targets. Because the defects on the circuit boards in this
paper are dense and small, one-stage algorithms are not considered. Two-stage algorithms with good
results include R-CNN [11], Fast R-CNN [12], Faster R-CNN [6] and Mask R-CNN [13], which have the
advantage of high detection accuracy. Because industrial PCB defect detection must accurately detect all
kinds of small PCB defects, this paper takes the two-stage Faster R-CNN algorithm as the basis and
improves its detection of small PCB defect targets. The Faster R-CNN pipeline is described first, before
the improvements.

3.1.      Faster R-CNN algorithm

     The overall process of the Faster R-CNN algorithm is as follows, and its framework is shown in
Figure 3:
     1. The shortest side of the input image is scaled to 600 pixels, and the other side is scaled by
        the same factor to preserve the original aspect ratio. In this way, the image is not distorted
        and the original information in the image is retained.
     2. The scaled image is fed into the backbone feature extraction network, and a feature map is
        obtained after a series of convolution operations.
     3. The feature map is fed into the region proposal network (RPN), which processes it with a
        sliding window and obtains the proposal boxes and the probability that each contains a target
        through category prediction and position prediction.
     4. The feature map and the proposal boxes are fed into the ROI Pooling layer. Because the proposal
        boxes differ in size, the local feature maps obtained by mapping them onto the feature map also
        differ in size, which prevents uniform processing. The ROI Pooling layer therefore pools these
        local feature maps to the same size and then concatenates them.
     5. The local feature maps are flattened and fed into the classifier and the regressor, which
        output the predicted bounding boxes and the confidence scores.
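The scaling in step 1 amounts to applying one factor to both sides. A minimal sketch follows; the function name `scaled_size` and the example image sizes are illustrative, and rounding to whole pixels is an assumption.

```python
def scaled_size(width, height, min_side=600):
    """Scale so the shortest side equals min_side, keeping aspect ratio."""
    scale = min_side / min(width, height)
    return round(width * scale), round(height * scale)


print(scaled_size(1000, 800))   # shortest side 800 is scaled to 600
print(scaled_size(600, 1200))   # already 600 on the shortest side
```

Using a single scale factor is what keeps the image undistorted, since width and height change in the same proportion.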




Figure 3: Faster R-CNN framework diagram

3.2.    Improved feature extraction network

    Feature extraction in object detection extracts image details such as color, contour, texture and
edge information. Using these features as the input of the detection algorithm reduces its time and
space overhead. Many excellent feature extraction networks have been derived by improving and stacking
modules such as convolutional layers, fully connected layers, activation functions and pooling layers.
In 1998, LeNet's excellent performance in handwritten digit recognition drew attention to the power of
convolution [14]. In 2013, ZFNet used visualization techniques to explain how a convolutional neural
network works, showing the function of the intermediate feature layers and the operation of the
classifier [15]. This network is also the feature extraction network used by the original Faster R-CNN.
In 2014, VGGNet uniformly adopted a 3×3 convolution kernel to simplify the choice of hyperparameters in
the convolution layers, which simplified the model design, reduced the number of network parameters and
allowed deeper networks, although the depth at that time reached only 19 layers [16]. In 2015, ResNet
was proposed to solve the vanishing gradient problem caused by stacking network layers, allowing much
deeper networks [17]. In this paper, ResNet50 is selected as the basis of the backbone feature
extraction network and depthwise separable convolution is introduced [18]. We call the improved feature
extraction network DS-ResNet50.
    As shown in Figure 4, depthwise separable convolution is divided into two stages: depthwise
convolution and pointwise convolution. The former applies three single-channel convolution kernels, one
per input channel, as two-dimensional convolutions with a stride of 1 and padding of 1 to extract spatial
information along the height and width of each feature layer. The latter takes the output of the
depthwise convolution and applies N three-channel 1×1 kernels as three-dimensional convolutions with a
stride of 1 and padding of 0 to recover the cross-channel information that the former discards. The
number of parameters and the amount of computation are therefore the sums over the two stages, as shown
in Equations (3) and (4).

$$ Params = 9N + 3N \tag{3} $$

$$ FLOP = 9H \cdot W \cdot N + 3H \cdot W \cdot N \tag{4} $$

    The number of parameters and the amount of computation of standard convolution are given in
Equations (5) and (6).

$$ Params = 9N \cdot N \tag{5} $$

$$ FLOP = H \cdot W \cdot Params \tag{6} $$

   Comparing the two operations shows that when the number of channels N in the feature extraction
network is large, depthwise separable convolution effectively compresses the trained model and reduces
the redundancy of the algorithm compared with standard convolution.
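The saving can be checked numerically with a short sketch. It uses the general textbook counts with explicit input and output channel numbers (`c_in`, `c_out`) and kernel size `k`, which is an assumption about notation rather than a transcription of Equations (3) to (6). Biases are ignored, and the computation is counted as multiply-accumulate operations at stride 1 with the output kept at H×W.

```python
def standard_conv_params(c_in, c_out, k=3):
    """Weights of a standard k x k convolution (no bias)."""
    return k * k * c_in * c_out


def separable_conv_params(c_in, c_out, k=3):
    """Weights of a depthwise separable convolution (no bias)."""
    depthwise = k * k * c_in   # one k x k kernel per input channel
    pointwise = c_in * c_out   # c_out kernels of size 1 x 1 x c_in
    return depthwise + pointwise


def mac_count(params, h, w):
    """Multiply-accumulates when each weight is used once per position."""
    return h * w * params


for c_in, c_out in [(3, 256), (256, 256)]:
    std = standard_conv_params(c_in, c_out)
    sep = separable_conv_params(c_in, c_out)
    print(c_in, c_out, std, sep, round(std / sep, 2))
```

For a 3×3 kernel the ratio approaches 9 as the number of output channels grows, which is why the saving matters most in the wide layers of the backbone.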




Figure 4: Schematic diagram of depth-separable convolution

    We input images containing missing hole defects into the improved feature extraction network. The
five feature layers of different sizes form the inherent multi-scale pyramid structure of the deep
convolutional network, and the extracted features can be inspected by visualizing the different feature
layers. Figure 5 shows that the defect features are clearly distinguished, which helps subsequent defect
detection. However, as data passes through the backbone feature extraction network, the small-target
details of the bottom feature layers are gradually replaced by global information through the pooling
layers, which makes our PCB defect detection task more difficult. We therefore improve this in
Section 3.3.




Figure 5: Original and depth feature maps

3.3.    Improved multi-scale feature fusion

    Observing the same object at different distances yields results at multiple scales, and in deep
learning the feature layers produced at each stage of a network are likewise called multi-scale. Because
PCB defects vary in size, we use a feature pyramid network to accurately detect PCB defects of various
sizes.
    The feature pyramid network fuses feature maps from different layers through lateral and top-down
connections [19], as shown in Figure 6, and is mainly used to overcome the weakness of PCB defect
detection algorithms under multi-scale changes. In Faster R-CNN, detection is based on the last feature
layer only. An obvious drawback of this approach is that it handles small targets poorly. The low-level
feature maps have high resolution, small receptive fields and a strong ability to represent geometric
details, which helps the localization part of the detection task. The high-level feature maps have low
resolution, large receptive fields and a strong ability to represent semantic contour information, which
helps the classification part. If only the last layer is used for detection, the low-level geometric
details are not fully used, and small objects are detected poorly.
    The bottom-up path is the forward pass of the backbone, and the feature maps generated by the
DS-ResNet50 network are divided by size into four pyramid levels C2, C3, C4 and C5. Conv1 is not
included in the pyramid because its high resolution and large number of parameters occupy a large amount
of memory and affect the real-time performance of the algorithm.
    The top-down path enlarges the smaller feature map to the width and height of the neighboring
feature map by 2× up-sampling. The lateral connection borrows the idea of the residual network: the
feature map up-sampled from the layer above is added to the feature map of the current layer after its
number of channels has been adjusted. Starting from the top layer, the fused information is passed down
until it is merged with the bottom layer C2. The final multi-scale feature layers P1, P2, P3, P4 and P5
thus combine the bottom-level geometric detail information with the top-level semantic contour
information. Finally, the generated multi-scale feature layers are fed into the ROI Pooling layer to
produce a series of local feature maps of the same size, 7×7. After flattening, a vector is obtained,
and the final detection result is produced by the classifier and the regressor.
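The top-down fusion can be illustrated with a toy sketch on single-channel maps stored as nested lists. It assumes nearest-neighbor up-sampling and omits the 1×1 convolutions that adjust the channel count as well as any smoothing convolution, so it shows only the up-sample-and-add pattern, not the authors' network. All names and values are illustrative.

```python
def upsample2x(fmap):
    """Nearest-neighbor 2x up-sampling of a 2D list."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each column
        out.append(wide)
        out.append(list(wide))                     # repeat each row
    return out


def add_maps(a, b):
    """Element-wise sum of two equally sized 2D lists."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def top_down(laterals):
    """Fuse maps ordered from finest to coarsest; returns the same order."""
    fused = [laterals[-1]]                 # the coarsest map is kept as is
    for lateral in reversed(laterals[:-1]):
        fused.append(add_maps(lateral, upsample2x(fused[-1])))
    return fused[::-1]


# Toy maps: each level has half the resolution of the one before it.
fine = [[100] * 4 for _ in range(4)]
mid = [[10] * 2 for _ in range(2)]
coarse = [[1]]
pyramid = top_down([fine, mid, coarse])
print(pyramid[0][0], pyramid[1], pyramid[2])
```

Every value in the finest output carries a contribution from all coarser levels, which mirrors how semantic information from the top reaches the high-resolution bottom layer.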




Figure 6: Feature pyramid structure

4. Experiment and Analysis
4.1. Experimental environment

   Considering the maturity of the deep learning framework and the ability of the environment to
schedule hardware resources, the program is written in Python using PyCharm, with PyTorch 1.7.1 as the
deep learning framework. The running environment is Ubuntu 16.04 with CUDA 10.0.130 and cuDNN 7.6.5 on
a Quadro P5000 GPU with 16 GB of video memory. The CPU is an Intel Xeon E5-2699.

4.2.    The experimental data

   The PCB data set used in this paper is from Peking University [2]. After data augmentation, the
data set contains a total of 10668 images covering six kinds of bare PCB defects: missing hole, mouse
bite, open, short, spur, and spurious copper.

4.3.    Model training

    Fifteen anchor boxes were obtained by applying the improved K-means++ clustering algorithm of this
paper to the ground truth boxes in the PCB data set. Their sizes are [9,10], [14,13], [14,18], [19,19],
[27,13], [16,25], [27,19], [22,24], [16,37], [23,31], [40,19], [30,28], [23,44], [36,36] and [47,53].
Figure 7 shows the 15 final anchors. Each color corresponds to a feature layer, and each color has three
anchors of different proportions to fit detection targets with different aspect ratios.
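The grouping of the 15 anchors into five feature layers can be sketched as below. The sketch assumes that the anchors are assigned three at a time in the order listed above, with the first group going to the finest layer; the paper does not spell out this mapping, and the level numbering is illustrative.

```python
ANCHORS = [(9, 10), (14, 13), (14, 18), (19, 19), (27, 13),
           (16, 25), (27, 19), (22, 24), (16, 37), (23, 31),
           (40, 19), (30, 28), (23, 44), (36, 36), (47, 53)]


def group_anchors(anchors, per_level=3):
    """Split the anchor list into consecutive groups, one per layer."""
    return [anchors[i:i + per_level]
            for i in range(0, len(anchors), per_level)]


levels = group_anchors(ANCHORS)
for idx, group in enumerate(levels, start=1):
    ratios = [round(w / h, 2) for w, h in group]
    print(f"level {idx}: {group} width/height ratios {ratios}")
```

Printing the width-to-height ratios makes it easy to see that each layer receives anchors of clearly different proportions.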




Figure 7: The resulting anchor


   The hyperparameter settings used in the code of this paper are shown in Table 1.

Table 1
Hyper parameter setting during training
                    Hyper parameter Name                          Value
                             Epoch                                  15
                         Learning rate                             0.01
                          Momentum                                  0.9
                         Weight decay                             0.0001
                           Batch size                                8
                            Gamma                                  0.33
                       Optimization Type                           SGD
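Table 1 lists Gamma alongside SGD but does not state how it is applied. One common reading is a step decay, in which the learning rate is multiplied by gamma every few epochs. The sketch below assumes that reading; `step_size=3` is a hypothetical value that does not appear in the paper, and the dictionary simply restates Table 1.

```python
HYPERPARAMS = {
    "epochs": 15,
    "learning_rate": 0.01,
    "momentum": 0.9,
    "weight_decay": 0.0001,
    "batch_size": 8,
    "gamma": 0.33,
    "optimizer": "SGD",
}


def lr_at_epoch(epoch, base_lr=0.01, gamma=0.33, step_size=3):
    """Step decay: multiply the rate by gamma every step_size epochs.

    step_size is a hypothetical value chosen for illustration.
    """
    return base_lr * gamma ** (epoch // step_size)


for epoch in range(HYPERPARAMS["epochs"]):
    print(epoch, lr_at_epoch(epoch))
```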

4.4.    The evaluation index

    The evaluation indexes used in this paper are accuracy (A), precision (P), recall (R), frames per
second (FPS), the harmonic mean of precision and recall (F1), and mean average precision (mAP), as
defined in formulas (7)-(11).

$$ A = \frac{N_{TP} + N_{TN}}{N_{TP} + N_{FP} + N_{TN} + N_{FN}} \tag{7} $$

$$ P = \frac{N_{TP}}{N_{TP} + N_{FP}} \tag{8} $$

$$ R = \frac{N_{TP}}{N_{TP} + N_{FN}} \tag{9} $$

$$ F_1 = \frac{2PR}{P + R} \tag{10} $$

$$ mAP = \frac{\sum_{i=1}^{n} AP(i)}{n} \tag{11} $$

    Where: $N_{TP}$ is the number of samples predicted positive that are actually positive. $N_{FP}$ is
the number of samples predicted positive that are actually negative. $N_{TN}$ is the number of samples
predicted negative that are actually negative. $N_{FN}$ is the number of samples predicted negative that
are actually positive. $AP(i)$ is the average precision of category $i$ and $n$ is the number of
categories. FPS is the number of images processed per second, and F1 is the harmonic mean of P and R.
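Formulas (7)-(11) translate directly into code. The counts and per-category AP values in the usage lines are made-up numbers for illustration and are not results from the paper.

```python
def accuracy(tp, fp, tn, fn):
    """Formula (7)."""
    return (tp + tn) / (tp + fp + tn + fn)


def precision(tp, fp):
    """Formula (8)."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Formula (9)."""
    return tp / (tp + fn)


def f1_score(p, r):
    """Formula (10): harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


def mean_average_precision(ap_per_class):
    """Formula (11): mean of the per-category average precisions."""
    return sum(ap_per_class) / len(ap_per_class)


# Hypothetical counts, for illustration only.
tp, fp, tn, fn = 90, 10, 70, 30
a = accuracy(tp, fp, tn, fn)
p = precision(tp, fp)
r = recall(tp, fn)
f1 = f1_score(p, r)
m = mean_average_precision([0.9, 0.8, 1.0])
print(a, p, r, f1, m)
```

Because F1 is a harmonic mean, it lies between P and R and is pulled towards the lower of the two.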

4.5.    Results analysis

    Experiment 1 verifies the effect of the anchors obtained by the improved K-means++ clustering
algorithm on the PCB defect detection algorithm. We compared the detection speed and accuracy obtained
with anchors from different clustering algorithms, using the same detection algorithm and the same data
set, as shown in Table 2.

Table 2
Detection speed and accuracy with different clustering methods
            Model              Clustering method            mAP (%)                     Time (s)
         Faster R-CNN          Standard clustering           86.4                        0.189
         Faster R-CNN          Proposed clustering           90.7                        0.131
         Proposed model        Standard clustering           91.8                        0.129
         Proposed model        Proposed clustering           95.6                        0.125

  As the data in Table 2 show, both the clustering method of this paper and the improved detection
algorithm improve accuracy and speed. Compared with Faster R-CNN with standard clustering, the proposed
model with the proposed clustering raises the mAP by 9.2 percentage points (from 86.4% to 95.6%) and
shortens the detection time by 0.064 s (from 0.189 s to 0.125 s). This is due to the design of suitable
anchors and the improvement of the detection algorithm.
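The two differences quoted from Table 2 can be reproduced with a few lines of arithmetic. The frames-per-second figure additionally assumes that the reported time is the detection time for a single image, which the table does not state explicitly.

```python
baseline = {"map": 86.4, "time_s": 0.189}   # Faster R-CNN, standard clustering
proposed = {"map": 95.6, "time_s": 0.125}   # proposed model and clustering

map_gain = round(proposed["map"] - baseline["map"], 1)
time_saved = round(baseline["time_s"] - proposed["time_s"], 3)
fps = 1 / proposed["time_s"]  # assumes the time is per image

print(map_gain, time_saved, fps)
```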
    In Experiment 2, the original Faster R-CNN algorithm was compared with the proposed algorithm on
the same PCB data set and under the same hardware conditions, changing a single variable at a time.
Figure 8 shows the detection results of the original Faster R-CNN algorithm. It missed 2 defects in the
mouse bite image, 1 in the open image, 1 in the short image and 3 in the spurious copper image. The
original Faster R-CNN algorithm therefore does not detect small defect targets well and cannot
accurately locate the defects in the images.




Figure 8: Traditional Faster R-CNN detection results

   Figure 9 shows the detection results of the algorithm in this paper. The improved algorithm marks
all six types of defects contained in the figure. Compared with the standard Faster R-CNN algorithm, it
is better suited to small target detection, with a markedly lower missed detection rate and higher
confidence.




Figure 9: The detection results of the proposed algorithm

5. Acknowledgements

    I would like to thank the National Natural Science Foundation of Guilin University of Electronic
Technology (11901137) and Guangxi Young Teachers' Basic Scientific Research Ability Improvement
(2019KY0232) and Graduate Education Innovation Program of Guilin University of Electronic
Technology (2019YCXS093) for the financial support. I would like to thank the members of Guangxi
Key Laboratory of Automatic Detection Technology and Instruments for their help. Finally, I would
like to thank Aleksandr Ometov for providing the writing template.




6. References

[1] Jifeng Li, Research and implementation of PCB surface defect detection technology based on deep
     neural network, Master's thesis, Foshan Institute of Science and Technology, Guangdong, China,
     2020.
[2] D. Runwei, D. Linhui, L. Guangpeng. "TDD-net: a tiny defect detection network for printed circuit
     boards." CAAI Transactions on Intelligence Technology (2019): 110-116.
[3] Li Xie, Xiaofang Yuan, Baixin Yin. "Defect detection of circuit board components based on
     improved YOLOv4 network." Measurement and control technology (2022): 19-27.
[4] Wen Li, Xiaochun Li, Haolei Yan. "PCB defect detection based on improved YOLOv3." Electro
     optic and Control (2022): 106-111.
[5] Xianyu Zhu, Jie Xiong, Ningsha Wang. "Research on Defect detection method of PCB Bare Board
     based on improved YOLOv4." Industrial control computer (2021): 39-40.
[6] S. Ren, K. He, R. Girshick. "Faster R-CNN: Towards Real-Time Object Detection with Region
     Proposal Networks." IEEE Transactions on Pattern Analysis and Machine Intelligence (2017):
     1137-1149.
[7] Yuxia Lai, Jianping Liu, Guoxing Yang. "K-means clustering analysis based on genetic
     algorithm." Computer Engineering (2008): 200-202.
[8] Yongliang Feng, Hao Li. "Research on K-means clustering improvement based on genetic
     algorithm." Computer and Digital Engineering (2020): 1831-1834.
[9] W. Liu, D. Anguelov, D. Erhan. "SSD: Single shot multi box detector." European conference on
     computer vision (2016): 21-37.
[10] J. Redmon, S. Divvala, R. Girshick. "You only look once: Unified, real-time object detection."
     Proceedings of the IEEE conference on computer vision and pattern recognition (2016): 779-788.
[11] R. Girshick, J. Donahue, T. Darrell. "Rich feature hierarchies for accurate object detection and
     semantic segmentation." Proceedings of the IEEE conference on computer vision and pattern
     recognition (2014): 580-587.
[12] R. Girshick. "Fast R-CNN." Proceedings of the IEEE international conference on computer vision
     (2015): 1440-1448.
[13] K. He, G. Gkioxari, P. Dollár. "Mask R-CNN." Proceedings of the IEEE international conference
     on computer vision (2017): 2961-2969.
[14] Y. LeCun, L. Bottou, Y. Bengio. "Gradient-based learning applied to document recognition."
     Proceedings of the IEEE (1998): 2278-2324.
[15] M.D. Zeiler, R. Fergus. "Visualizing and understanding convolutional networks." European
     conference on computer vision (2014): 818-833.
[16] K. Simonyan, A. Zisserman. "Very deep convolutional networks for large-scale image
     recognition." arXiv preprint arXiv:1409.1556 (2014).
[17] K. He, X. Zhang, S. Ren. "Deep residual learning for image recognition." Proceedings of the IEEE
     conference on computer vision and pattern recognition (2016): 770-778.
[18] F. Chollet. "Deep learning with depthwise separable convolutions." Proceedings of the IEEE
     conference on computer vision and pattern recognition (2017): 1251-1258.
[19] T.Y. Lin, P. Dollár, R. Girshick. "Feature pyramid networks for object detection." Proceedings of
     the IEEE conference on computer vision and pattern recognition (2017): 2117-2125.



