<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Real-time Twist Rebar Detection System exploiting GAN-based Data Augmentation technique</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jong Chan Park</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gun-Woo Kim</string-name>
          <email>gunwoo.kim@gun.ac.kr</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of AI Convergence Engineering, Gyeongsang National University</institution>
          ,
          <addr-line>Jinju</addr-line>
          ,
          <country country="KR">Korea</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>School of Computer Science / Department of AI Convergence Engineering, Gyeongsang National University</institution>
          ,
          <addr-line>Jinju</addr-line>
          ,
          <country country="KR">Korea</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>AI-based image analysis is currently being studied for the automated cutting, bending, and loading systems that form the main facilities of rebar processing factories. Automating these systems requires varied datasets captured with machine vision cameras. However, environmental factors make data collection in the production process difficult, and collecting datasets is costly. To address these problems, we propose a real-time twist rebar detection system based on a GAN (generative adversarial network), using a real rebar dataset collected from 20 rebar videos. We generate additional data with a deep image generation network and detect rebar endpoints with YOLO (You Only Look Once) v4, a deep-learning object detection model. In the experiments, we generated normal and abnormal rebar images and measured the quality of the synthetic rebar dataset against the real rebar dataset using FID (Frechet Inception Distance). The FID was 79.363 for the normal synthetic rebar dataset and 113.973 for the abnormal synthetic rebar dataset. Training YOLO v4 on the combination of the GAN-generated synthetic rebar dataset and the real rebar dataset yielded a mean Average Precision (mAP) of 100% and a misdetection rate of 5%; compared with training on the real rebar dataset, the mAP increased by 0.6 percentage points and the misdetection rate decreased by 10 percentage points. Overall, these results show that GAN-based data augmentation improves both rebar twist detection accuracy and the misdetection rate.</p>
      </abstract>
      <kwd-group>
        <kwd>data augmentation</kwd>
        <kwd>data imbalance</kwd>
        <kwd>rebar factory</kwd>
        <kwd>image generation</kwd>
        <kwd>object detection</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Recently, advances in the artificial intelligence and robotics
industries have accelerated unmanned operations, and the generative
capability of artificial intelligence is driving innovation in the
manufacturing industry. Rebar processing requires an automated,
intelligent production system that minimizes the loss rate, for
example through automatic correction and optimization of rebars.
However, as shown in Figure 1, the time and accuracy of calibration
work in a rebar processing factory still depend on the worker's skill
level. In addition, quality problems and safety accidents occur during
the machining process. Therefore, an unmanned system is needed that
detects the endpoints of the processed rebar and predicts errors in the
correction value.</p>
      <p>First, detecting rebar endpoints requires datasets of normal and
abnormal rebars collected with a machine vision camera. However,
environmental factors make data collection difficult, and collecting
datasets is costly. These problems can be addressed by using an image
generation model to augment a small existing image dataset with varied,
high-quality images. In addition, rebar twist detection performance can
be improved by training a deep learning detection model on both real
and synthetic rebar data.</p>
      <p>
        In this paper, we make three contributions to the rebar
detection system.
1. We build a real rebar dataset by extracting 1,000 normal and 1,000
abnormal rebar images from rebar videos collected in the field with a
machine vision camera.
2. To obtain rebar images covering various situations, we generate 500
normal and 500 abnormal rebar images from the real rebar dataset
through GAN[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
3. We improve rebar detection performance and reduce the misdetection
rate by training YOLO v4[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]
on the real rebar dataset combined with the dataset generated through GAN.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Background</title>
      <sec id="sec-2-2">
        <title>2.1 Start Point Detection for Tracing the Injection Path of Steel Rebars</title>
        <p>
          A previous study proposed a starting-point detection method for
rebars that uses the change in average brightness in images from a
high-speed infrared (IR) camera to reduce errors caused by the
environment [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. The process of
the proposed method is shown in Figure 2.
        </p>
        <p>The method measured the average value of the pixel matrix within a
standby window of a specific size at the rebar injection point, using
848x848 grayscale images captured at 90 fps with an Intel RealSense D435
IR camera, and achieved a maximum detection accuracy of about 81%.</p>
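        <p>The idea of watching the average brightness inside a standby window can be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation of the cited study: the function names, the comparison against the first (empty-scene) frame, and the threshold value are all hypothetical, and frames are modeled as plain lists of grayscale rows.</p>
        <preformat><![CDATA[
```python
from statistics import mean


def window_mean(frame, top, left, height, width):
    """Average grayscale value inside a rectangular standby window."""
    rows = frame[top:top + height]
    return mean(pixel for row in rows for pixel in row[left:left + width])


def detect_start_frame(frames, window, threshold):
    """Return the index of the first frame whose window brightness differs
    from the first frame (assumed to show an empty scene) by more than
    `threshold`, or None if no such frame exists."""
    top, left, height, width = window
    baseline = window_mean(frames[0], top, left, height, width)
    for index, frame in enumerate(frames[1:], start=1):
        current = window_mean(frame, top, left, height, width)
        if abs(current - baseline) > threshold:
            return index
    return None


# Toy 4x4 grayscale frames: the scene stays dark until a bright rebar enters.
empty = [[10] * 4 for _ in range(4)]
rebar = [
    [10, 10, 10, 10],
    [10, 200, 200, 10],
    [10, 200, 200, 10],
    [10, 10, 10, 10],
]
start = detect_start_frame([empty, empty, rebar], window=(1, 1, 2, 2), threshold=50)
print(start)
```
]]></preformat>
        <p>A fixed threshold of this kind is sensitive to lighting changes, which is consistent with the flicker-related failures reported for this approach.</p>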
        <p>However, automating the rebar injection system requires a rebar
detection accuracy above 90%, and this method failed in cases with the
flicker phenomenon, as shown in Figure 3.</p>
        <p>
          Another study used a feature point matching algorithm to
determine the twisting of the machined rebar[
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]. The
method first designates the region of interest (ROI) at the rebar
injection part and detects two straight lines using the Hough line
transform, as shown in Figure 4. After that, as shown in Figure 5,
normal and abnormal rebars are compared with the rebars in the camera
images using the Oriented FAST and Rotated BRIEF (ORB) feature
detection algorithm.
        </p>
        <p>This study detected twists with an average accuracy of 96.5%.
However, the amount of computation increased significantly during
real-time detection, reducing the frame rate to 10-20 fps, and the
twist detection accuracy decreased significantly when the method was
tested in a new environment.</p>
      </sec>
      <sec id="sec-2-4">
        <title>2.3 Prediction Model of Rebar Endpoints Based on YOLO v3 with Non-linear Regression</title>
        <p>
          A further study proposed a real-time system that detects and
tracks rebar endpoints in camera images using YOLO v3 and predicts the
endpoint positions in advance by non-linear regression on the obtained
coordinates [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. The results of this
study confirm that the predicted endpoint position 10 frames ahead is
marked with a red dot. However, predicting the position 20 to 40 frames
ahead requires high endpoint detection accuracy, whereas the detection
accuracy was only about 70 to 80%, and the prediction accuracy was poor
because of the high rate of rebar detection errors.
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Approach</title>
      <p>In this paper, we build a real rebar dataset and a synthetic rebar
dataset to improve rebar twist detection performance. As shown in
Figure 10, the proposed system determines the presence or absence of
rebar twist in real time by combining GAN-generated images with the
real rebar dataset and training YOLO v4, a deep learning model widely
used for real-time detection, on these datasets.</p>
      <sec id="sec-3-2">
        <title>3.1 Machine Vision Camera / Create a Rebar Dataset</title>
        <p>The HIKVISION MV-CAD13-20GM machine vision camera was selected
for collecting a dataset of processed rebars and detecting rebar twists
in real time, as shown in Figure 6. To reduce the flicker phenomenon and
the differences between bright and dark lighting in various
environments, we set the working distance to 2000 mm, the focal length
to 50 mm, the lighting to 90 Hz with an LED lamp, and the exposure time
to 500 ms.</p>
        <p>A total of 20 rebar injection videos were collected, with an
average frame width of 1280, frame height of 720, and frame rate of
88.38 fps. Because rebar injection is very fast, the videos are about
3 seconds long on average. From each video, 351 to 443 images were
extracted, and blurred or broken images were removed. The final rebar
dataset consists of 2,000 images, classified into 1,000 normal rebar
images and 1,000 abnormal rebar images, as shown in Figure 7.</p>
        <p>In this paper, we use the basic GAN, a well-known image
generation model. As shown in Figure 8, the generated synthetic rebar
images are similar to the real rebar images extracted from the rebar
injection videos. Finally, we combined the extracted rebar dataset with
the synthetic rebar images.</p>
        <p>Finally, in our study, YOLO v4, a well-known real-time detection
model, was used to detect rebar twists. For performance comparison, the
model trained on the real rebar dataset and the model trained on the
dataset combined with the synthetic rebar images generated through GAN
were compared through the average detection accuracy and error
detection rate. Figure 9 shows the twist detection test conducted after
training on the real rebar dataset and the twist detection test
conducted after training on the combined dataset that includes the
rebar images generated through GAN.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experiments</title>
      <sec id="sec-4-2">
        <title>4.1 Training Environments</title>
        <p>First, for rebar image generation, the GAN was trained for
20,000 epochs with a learning rate of 0.0002, a latent dimension of
1,000, and an image size of 416x416. The interval was set to 100.</p>
        <p>Second, for the classification of twisted rebar, YOLO v4 was
trained for 2,000 to 3,000 epochs with a learning rate of 0.0013, a
saturation of 1.5, and an exposure of 1.5.</p>
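        <p>For reference, the reported training settings can be collected in one place, for example as plain Python dictionaries. This is only an organizational sketch: the key names are hypothetical and do not correspond to any particular framework's configuration format, while the values are the ones stated in this section.</p>
        <preformat><![CDATA[
```python
# Reported GAN training settings (key names are illustrative assumptions).
gan_config = {
    "epochs": 20000,
    "learning_rate": 0.0002,
    "latent_dim": 1000,
    "image_size": (416, 416),
    "interval": 100,
}

# Reported YOLO v4 training settings (key names are illustrative assumptions).
yolo_config = {
    "epochs": (2000, 3000),  # reported as a range
    "learning_rate": 0.0013,
    "saturation": 1.5,
    "exposure": 1.5,
}

for name, config in (("GAN", gan_config), ("YOLO v4", yolo_config)):
    print(name, config)
```
]]></preformat>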
        <p>In addition, when YOLO v4 was trained with data augmentation,
random vertical and horizontal flipping was applied. The twist
detection test results were compared across a total of three training
runs.</p>
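        <p>Flipping an image for a detection model also requires mirroring its bounding boxes. The sketch below shows one way this could be done, assuming YOLO-style labels (class id, x center, y center, width, height, all normalized to the range 0 to 1) and images stored as lists of pixel rows. The function names and the flip probability of 0.5 are assumptions made for illustration, not settings reported in this paper.</p>
        <preformat><![CDATA[
```python
import random


def flip_box(box, horizontal=False, vertical=False):
    """Mirror a YOLO-style box (class_id, x_center, y_center, width, height)
    whose coordinates are normalized to [0, 1]."""
    class_id, x_center, y_center, width, height = box
    if horizontal:
        x_center = 1.0 - x_center
    if vertical:
        y_center = 1.0 - y_center
    return (class_id, x_center, y_center, width, height)


def flip_image(image, horizontal=False, vertical=False):
    """Mirror an image given as a list of pixel rows."""
    if horizontal:
        image = [row[::-1] for row in image]
    if vertical:
        image = image[::-1]
    return image


def random_flip(image, boxes, rng=random):
    """Apply each flip independently with probability 0.5 (assumed value)."""
    horizontal = rng.random() < 0.5
    vertical = rng.random() < 0.5
    flipped_image = flip_image(image, horizontal, vertical)
    flipped_boxes = [flip_box(box, horizontal, vertical) for box in boxes]
    return flipped_image, flipped_boxes


image = [[1, 2], [3, 4]]
boxes = [(0, 0.25, 0.5, 0.1, 0.2)]
print(flip_image(image, horizontal=True), flip_box(boxes[0], horizontal=True))
```
]]></preformat>
        <p>Only the box center moves under a flip; the width and height are unchanged.</p>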
      </sec>
      <sec id="sec-4-3">
        <title>4.2 Model Performance</title>
        <p>To measure the quality of the rebar images generated through
GAN, the average FID was measured. As shown in Table 1, the FID was
79.3638476 for the synthetic normal rebar images and 113.9733602 for
the synthetic abnormal rebar images.</p>
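        <p>FID compares the statistics of Inception-network features of real and generated images, and lower values indicate closer distributions. With feature means mu_r, mu_g and covariances Sigma_r, Sigma_g, the standard definition is FID = ||mu_r - mu_g||^2 + Tr(Sigma_r + Sigma_g - 2 (Sigma_r Sigma_g)^(1/2)). The sketch below evaluates this distance under a simplifying assumption of diagonal covariances, using population variances and feature vectors that are assumed to be already extracted. It is not the full FID computation used for Table 1, and the function name is hypothetical.</p>
        <preformat><![CDATA[
```python
from math import sqrt
from statistics import mean, pvariance


def frechet_distance_diagonal(real_features, generated_features):
    """Frechet distance between two Gaussians fitted to feature vectors,
    under the simplifying assumption of diagonal covariance matrices."""
    distance = 0.0
    # zip(*features) iterates over feature dimensions instead of samples.
    for real_dim, generated_dim in zip(zip(*real_features), zip(*generated_features)):
        mu_r, mu_g = mean(real_dim), mean(generated_dim)
        var_r, var_g = pvariance(real_dim), pvariance(generated_dim)
        # Per dimension: (mu_r - mu_g)^2 + var_r + var_g - 2 * sqrt(var_r * var_g)
        distance += (mu_r - mu_g) ** 2 + var_r + var_g - 2.0 * sqrt(var_r * var_g)
    return distance


# Toy two-dimensional "features"; real FID uses Inception activations.
real = [[0.0, 1.0], [2.0, 3.0]]
shifted = [[1.0, 1.0], [3.0, 3.0]]  # first dimension shifted by 1
print(frechet_distance_diagonal(real, real))
print(frechet_distance_diagonal(real, shifted))
```
]]></preformat>
        <p>Identical feature sets give a distance of zero, and shifting the mean of one dimension by 1 while keeping the variances equal gives a distance of 1.</p>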
        <p>For normal/abnormal rebar classification, Figure 11 shows that
training YOLO v4 on the real rebar dataset gave an average mAP of
99.4%. When the epoch reached 2600, the AP showed a 69% drop. An mAP of
99.4% was also obtained when data augmentation was applied.</p>
        <p>The result with data augmentation was therefore the same as the
result without it. Finally, when the real rebar dataset was combined
with the synthetic (generated) rebar images from GAN, training achieved
100% mAP, as shown in Figure 12.</p>
        <p>Finally, a misdetection test was performed on 20 rebar injection
videos to confirm the twist detection performance. Table 2 shows that
three misdetections occurred when data augmentation was applied, a
misdetection rate of 15%. When the GAN-generated dataset was combined
with the real rebar dataset, one misdetection occurred, a misdetection
rate of 5%.</p>
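        <p>The reported rates follow directly from the misdetection counts over the 20 test videos. The short sketch below reproduces that arithmetic; the function and variable names are illustrative only.</p>
        <preformat><![CDATA[
```python
def misdetection_rate(misdetections, total_videos):
    """Misdetection rate in percent."""
    return 100.0 * misdetections / total_videos


TOTAL_VIDEOS = 20
rate_augmented = misdetection_rate(3, TOTAL_VIDEOS)  # data augmentation applied
rate_combined = misdetection_rate(1, TOTAL_VIDEOS)   # real + GAN-generated data
reduction_points = rate_augmented - rate_combined    # in percentage points

print(rate_augmented, rate_combined, reduction_points)
```
]]></preformat>
        <p>The difference between the two rates is a reduction of 10 percentage points.</p>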
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>In this paper, we introduced an AI image analysis-based system for
improving productivity in rebar processing. First, rebar injection
videos were collected with a machine vision camera. Images of real
rebars were then extracted to create a rebar dataset and classified
into normal and abnormal rebars. However, the real rebar images alone
limit how far the detection accuracy and classification performance can
be improved, and collecting additional datasets is costly.</p>
      <p>To solve this problem, we used GAN to generate various types of
rebar images and improve the performance of the real-time twist
detection system. The detection accuracy and misdetection rate were
then tested with YOLO v4 by combining the synthetic rebar images with
the extracted real rebar images. In the experiments, the FID was 79.363
for the normal synthetic rebar dataset and 113.973 for the abnormal
synthetic rebar dataset. Training YOLO v4 on the combined dataset
yielded an mAP of 100% and a misdetection rate of 5%; compared with
training on the real rebar dataset, the mAP increased by 0.6 percentage
points and the misdetection rate decreased by 10 percentage points.
Overall, these results show that GAN-based data augmentation improves
both rebar twist detection accuracy and the misdetection rate.</p>
      <p>Building on this study, we will use various regression prediction
models to improve the accuracy of predicting rebar endpoints and
recognizing rebar shapes.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Acknowledgment</title>
      <p>This research was supported by the Basic Science
Research Program through the National Research
Foundation of Korea (NRF), funded by the
Ministry of Education, Science and Technology
(NRF-2021R1G1A1006381).</p>
    </sec>
    <sec id="sec-7">
      <title>7. References</title>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>I.</given-names>
            <surname>Goodfellow</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Pouget-Abadie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mirza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Warde-Farley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ozair</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Courville</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Bengio</surname>
          </string-name>
          ,
          <article-title>"Generative adversarial nets"</article-title>
          ,
          <source>NIPS</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Bochkovskiy</surname>
          </string-name>
          , CY. Wang, and HYM. Liao,
          <article-title>"YOLOv4: Optimal speed and accuracy of object detection"</article-title>
          ,
          <source>arXiv preprint arXiv:2004.10934</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>JM.</given-names>
            <surname>Lee</surname>
          </string-name>
          , DS. Kang,
          <article-title>"Start Point Detection Method for Tracing the Injection Path of Steel Rebars"</article-title>
          ,
          <source>KIIT</source>
          , vol.
          <volume>17</volume>
          , no.
          <issue>6</issue>
          , pp.
          <fpage>9</fpage>
          -
          <lpage>16</lpage>
          ,
          <year>Jun 2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>JC.</given-names>
            <surname>Park</surname>
          </string-name>
          , DS. Kang,
          <article-title>"An Algorithm for the Determination of Twisted Rebar using Feature Matching"</article-title>
          ,
          <source>KIIT</source>
          , vol.
          <volume>19</volume>
          , no.
          <issue>2</issue>
          , pp.
          <fpage>21</fpage>
          -
          <lpage>28</lpage>
          ,
          <year>Feb 2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J.</given-names>
            <surname>Han</surname>
          </string-name>
          , DS. Kang,
          <article-title>"OPPDet: Object Position Prediction Detection Model for Predicting Endpoints of Rebar"</article-title>
          ,
          <source>Proceedings of KIIT Conference</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>JC.</given-names>
            <surname>Park</surname>
          </string-name>
          , DS. Kang,
          <article-title>"Predicting Rebar Endpoints using Sin Exponential Regression Model"</article-title>
          ,
          <source>arXiv preprint arXiv:2110.08955</source>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>SW.</given-names>
            <surname>Huang</surname>
          </string-name>
          , CT. Lin, SP. Chen, YY. Wu, PH. Hsu,
          <string-name>
            <given-names>SH.</given-names>
            <surname>Lai</surname>
          </string-name>
          ,
          <article-title>"AugGAN: Cross Domain Adaptation with GAN-based Data Augmentation"</article-title>
          ,
          <source>ECCV</source>
          , pp.
          <fpage>718</fpage>
          -
          <lpage>731</lpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S.</given-names>
            <surname>Sundaram</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Hulkund</surname>
          </string-name>
          ,
          <article-title>"Gan-based data augmentation for chest X-ray classification"</article-title>
          ,
          <source>arXiv preprint arXiv:2107.02970</source>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>