<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Pix2Pix-Based Depth Estimation from Monocular Images for Dynamic Path Planning of Multirotor on AirSim</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Tomoyasu Shimada</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hiroki Nishikawa</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Xiangbo Kong</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hiroyuki Tomiyama</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Graduate School of Science and Engineering, Ritsumeikan University</institution>
          ,
          <addr-line>Shiga</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>JSPS Research Fellow</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>Recently, autonomous flight for multirotors has been actively researched. For autonomous flight, it is essential to obtain information about the real world from sensors. The sensors used for autonomous flight include cameras with a depth sensor and stereo cameras, which generate depth maps. The distance that a depth camera can measure depends on the performance of the camera. However, high-performance cameras are too heavy and expensive to mount on a multirotor. Therefore, this paper proposes a method to generate depth maps from a monocular camera, which is light and affordable. In addition, there are many legal limitations on actual flights of a multirotor, and thus autonomous flight is usually tested in virtual environments in the early phase of development. Accordingly, this paper proposes a dynamic path planning method with collision avoidance using only a monocular camera and evaluates it in simulation using AirSim. Experimental results show that the proposed method perfectly avoids collisions.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        In recent years, multirotors have been expected to play a variety of roles due to their convenience, and
a large amount of research is being conducted on them. Examples of these roles include sports
photography, aerial photography, infrastructure inspection, lifesaving, agriculture, and package
delivery. Multirotors can also be used to monitor and search areas affected by fire, flood, and
earthquake, where the risks are too high for manned aircraft. Unlike cars, multirotors can fly,
making it possible to reach a destination in the shortest possible time, even when the route
on the ground takes a long time. Multirotors are also smaller than other aircraft such as helicopters and
airplanes, allowing them to pass through narrow spaces. Therefore, they are considered suitable
for carrying light loads, aerial photography in narrow alleys, and filming sports broadcasts
while the players keep moving. In order to take advantage of this convenience, research on
autonomous multirotor flight is being actively conducted. For autonomous flight, it is essential
to obtain information from sensors. A variety of sensors are used in the study of autonomous
flight. For example, LiDAR and automatic dependent surveillance-broadcast (ADS-B) are
used [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ][
        <xref ref-type="bibr" rid="ref2">2</xref>
        ][
        <xref ref-type="bibr" rid="ref3">3</xref>
        ][
        <xref ref-type="bibr" rid="ref4">4</xref>
        ][
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. However, installing a large number of sensors, or heavy high-performance sensors,
increases energy consumption. With the limited energy of a battery,
long-distance flight becomes impossible due to the increased weight of the sensors. Moreover, LiDAR
and ADS-B are expensive, so the cost of experiments increases. In addition, in recent years,
the laws and regulations for multirotors have become stricter, restricting where and
how fast they may fly and making it impossible to conduct experiments fully. In this paper, we
use a flight simulator called AirSim in order to conduct experiments without being bound by
laws and regulations. Cameras with a depth sensor (depth cameras) are more affordable than
LiDAR and ADS-B. In this paper, autonomous flight is performed using only depth maps.
      </p>
      <p>
        A lightweight depth camera that can be mounted on a multirotor can measure distances of
up to 10 meters. In contrast, AirSim, a flight simulator built on Unreal Engine 4 (UE4),
can obtain the distance between the multirotor and an object directly from UE4. It can accurately
measure distances of more than 200 meters and generate the corresponding depth map. There has been some research
on deep learning of depth maps obtained by AirSim and images obtained from monocular
cameras [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ][
        <xref ref-type="bibr" rid="ref7">7</xref>
        ][
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ][
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. This paper proposes a dynamic path planning method using Pix2Pix.
Pix2Pix is trained on paired images so that, from a monocular image, it generates an image close to the depth map
that AirSim can produce; the generated depth map is then used for path planning. Subsection 3.1 describes
Pix2Pix in detail.
      </p>
      <p>
        One of the vision-based methods is optical flow [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ][
        <xref ref-type="bibr" rid="ref12">12</xref>
        ][
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. Optical flow-based methods
detect 2D motion vectors between frames. Therefore, they are effective in avoiding collisions in the lateral
direction, but since they acquire only two-dimensional vectors, the multirotor may not be
able to avoid collisions with obstacles flying in front of it. On the other hand, there are some
methods using depth maps [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ][
        <xref ref-type="bibr" rid="ref15">15</xref>
        ][
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. Depth map-based methods detect depth and
thus avoid obstacles that face the multirotor head-on. However, the depth cameras that can
be mounted on real-world multirotors accurately acquire distances only up to 10 meters. The
proposed method performs collision avoidance for a multirotor by using a Pix2Pix model to generate,
from a monocular image, a depth map that approximates the one obtained from AirSim.
      </p>
      <p>The rest of this paper is organized as follows. A method of path planning for collision
avoidance is introduced in Section 2. Section 3 describes the overview of Pix2Pix and the
details of algorithms of depth map generation from the monocular image. Section 4 shows the
experimental results, and Section 5 concludes this paper.</p>
    </sec>
    <sec id="sec-7">
      <title>Path Planning for Collision Avoidance Using Depth Maps</title>
      <p>
        To realize safe flight, a multirotor must plan its path, that is, select a
direction in which it can fly without colliding with objects. In this section, we describe
a method for planning a collision-free path based on the state-of-the-art methods in
[
        <xref ref-type="bibr" rid="ref14">14</xref>
        ][
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. The works in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] and [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] introduce a method that divides a depth map into sections.
In [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], the presented algorithm divides a depth map into five sections and selects the section
whose objects are the most distant, as shown in Figure 1.
      </p>
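      <p>The five-section selection can be sketched as follows. This is a minimal illustration, not the implementation of the cited work: it assumes the five sections are vertical strips of equal width and that "most distant" means the largest mean depth. The function name and the toy depth map are hypothetical.</p>
      <preformat>
```python
from statistics import mean


def select_farthest_strip(depth_map, n_sections=5):
    """Return (index, mean_depth) of the vertical strip with the largest mean depth.

    depth_map is a list of rows; each row is a list of distances in meters.
    Splitting into vertical strips is an assumption made for illustration.
    """
    width = len(depth_map[0])
    best_index, best_depth = 0, float("-inf")
    for k in range(n_sections):
        # Column range [start, stop) of the k-th strip.
        start = k * width // n_sections
        stop = (k + 1) * width // n_sections
        strip_depth = mean(v for row in depth_map for v in row[start:stop])
        if strip_depth > best_depth:
            best_index, best_depth = k, strip_depth
    return best_index, best_depth


# Toy 2 x 10 depth map: the fourth strip (index 3) is the most open.
toy = [
    [5, 5, 8, 8, 12, 12, 40, 40, 9, 9],
    [5, 5, 8, 8, 12, 12, 36, 36, 9, 9],
]
index, depth = select_farthest_strip(toy)
print(index, depth)
```
      </preformat>
      <p>With strips of this kind the multirotor can only steer left or right, which matches a selection rule that has no vertical component.</p>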
      <p>
        The presented method in [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], on the other hand, divides a depth map into 289 overlapping
sections (17 rows and 17 columns), as shown in Figure 2 (a), and selects the most distant section
in the same way as the method in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. The darkest place in the bounding box in Figure 2 (b) represents
the farthest distance on the depth map.
Figure 2 panels: (a) Overlapped section; (b) Section selection.
      </p>
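      <p>The overlapped 17 x 17 layout can be sketched as below. The window size and stride are not given in the text, so this sketch assumes that each window spans two cells of an 18 x 18 grid and moves by one cell, which yields 17 positions per axis. The function names and the toy depth map are hypothetical.</p>
      <preformat>
```python
def overlapped_sections(height, width, rows=17, cols=17):
    """Yield (row, col, top, left, bottom, right) for overlapping windows.

    Each window spans two grid cells and the stride is one cell, so
    neighbouring windows overlap by half. This layout is an assumption.
    """
    cell_h = height // (rows + 1)
    cell_w = width // (cols + 1)
    for r in range(rows):
        for c in range(cols):
            top, left = r * cell_h, c * cell_w
            yield r, c, top, left, top + 2 * cell_h, left + 2 * cell_w


def select_farthest_section(depth_map):
    """Return (row, col) of the window with the largest mean depth."""
    height, width = len(depth_map), len(depth_map[0])
    best, best_depth = None, float("-inf")
    for r, c, top, left, bottom, right in overlapped_sections(height, width):
        values = [v for row in depth_map[top:bottom] for v in row[left:right]]
        section_depth = sum(values) / len(values)
        if section_depth > best_depth:
            best, best_depth = (r, c), section_depth
    return best


# Toy 36 x 36 depth map: 10 m everywhere except a distant 4 x 4 patch.
toy = [[10.0] * 36 for _ in range(36)]
for y in range(8, 12):
    for x in range(20, 24):
        toy[y][x] = 200.0
sections = list(overlapped_sections(36, 36))
print(len(sections), select_farthest_section(toy))
```
      </preformat>
      <p>Because the grid has both rows and columns, the selected section carries a vertical as well as a horizontal offset from the image center.</p>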
      <p>In the real world, however, common depth cameras can hardly measure distances beyond
approximately 20 meters. Such a depth map may be useless for collision avoidance, since
multirotors can fly at up to approximately 20 meters per second and it is difficult for a multirotor
to dodge objects suddenly or to slow down. Therefore, depth maps need to
cover distances of more than 20 meters for safe flight.</p>
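      <p>The argument above is simple arithmetic: the time available to react is the sensing range divided by the flight speed. The short sketch below works this out for the ranges mentioned in this paper at the stated maximum speed of 20 m/s; the function name is hypothetical, and braking distance and processing delay are ignored.</p>
      <preformat>
```python
def time_to_obstacle(sensing_range_m, speed_m_per_s):
    """Seconds between first seeing an obstacle at maximum range and reaching it."""
    return sensing_range_m / speed_m_per_s


for sensing_range in (10, 20, 200):
    print(sensing_range, "m ->", time_to_obstacle(sensing_range, 20), "s")
```
      </preformat>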
    </sec>
    <sec id="sec-11">
      <title>Pix2Pix-Based Depth Map Generation from Monocular Image</title>
      <p>To overcome the issue presented in the previous section, this section introduces a method that generates a
depth map from a monocular image using Pix2Pix so that distant places are covered.</p>
      <sec id="sec-11-1">
        <title>Introduction of Pix2Pix</title>
        <p>
          Pix2Pix is an image generator based on a cGAN (conditional generative adversarial network), introduced
in [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]. Figure 3 shows an overview of the cGAN in Pix2Pix. The Generator in the cGAN generates
an image from the input image and noise. The GAN consists of two networks,
the Generator and the Discriminator, as shown in Figure 3. The Generator learns to produce
images that the Discriminator cannot identify as generated, and the Discriminator
learns to distinguish the generated data from the training data. Through this
adversarial training, the Generator finally generates images
similar to the training data. In addition, Pix2Pix uses U-Net [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ] as the Generator and PatchGAN as the
Discriminator.
        </p>
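        <p>The adversarial training described above can be illustrated with a small numerical sketch of the two losses. It assumes the formulation of the original Pix2Pix paper [17] (a binary cross-entropy adversarial term plus an L1 term weighted by 100, with the Discriminator loss halved); these values are not stated in this paper. The patch scores and pixel values are hypothetical, and the lists stand in for the grid of per-patch outputs that a PatchGAN Discriminator produces.</p>
        <preformat>
```python
import math


def bce(prob, target):
    """Binary cross-entropy for one probability in (0, 1)."""
    return -(target * math.log(prob) + (1 - target) * math.log(1 - prob))


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def discriminator_loss(d_real, d_fake):
    """Real pairs should score 1 and generated pairs 0, averaged over patches."""
    return 0.5 * (mean(bce(p, 1) for p in d_real) + mean(bce(p, 0) for p in d_fake))


def generator_loss(d_fake, generated, target, l1_weight=100.0):
    """Adversarial term (fool the Discriminator) plus a weighted L1 term."""
    adversarial = mean(bce(p, 1) for p in d_fake)
    l1 = mean(abs(g - t) for g, t in zip(generated, target))
    return adversarial + l1_weight * l1


# Hypothetical patch scores and flattened pixel values in [0, 1].
d_real = [0.9, 0.8, 0.95, 0.85]
d_fake = [0.2, 0.1, 0.3, 0.15]
generated = [0.50, 0.52, 0.10, 0.90]
target = [0.50, 0.50, 0.10, 1.00]
print(round(discriminator_loss(d_real, d_fake), 4))
print(round(generator_loss(d_fake, generated, target), 4))
```
        </preformat>
        <p>The L1 term pulls the generated depth map toward the ground-truth depth map pixel by pixel, while the adversarial term penalizes outputs that the Discriminator recognizes as generated.</p>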
      </sec>
      <sec id="sec-11-3">
        <title>Depth Map Generation from Monocular Image</title>
        <p>In the real world, accurate depth maps can be generated only up to 10 to 20 meters at most. This
means that collision avoidance using depth maps is inaccurate and slow flight is unavoidable.
However, AirSim, the simulator used in this research, can obtain accurate depth maps up to
200 meters or more because it obtains the information from UE4. Therefore, the proposed method
generates depth maps from monocular images by training Pix2Pix on pairs of monocular
images and depth maps acquired by AirSim. Figure 4 shows a monocular image, a depth
map up to 10 meters and a depth map up to 200 meters taken at the same location, and the image
generated by Pix2Pix.</p>
        <p>Figure 4 panels: (a) Depth (10 meters); (b) Depth (up to 200 meters); (c) Monocular; (d) Pix2Pix.</p>
        <p>To output an image like (d) from the input (c) in Figure 4, a Generator trained as part of the cGAN
is used. The Generator uses a network called U-Net, which has an encoder-decoder structure.
In the encoder, features are extracted by convolutional and pooling layers; in the decoder,
convolutional and up-sampling layers preserve the features while restoring the image
to a larger size. By feeding a monocular image into the Generator,
an image like (d) in Figure 4 is obtained. In this way, a depth map is generated from the
monocular image and combined with the collision avoidance algorithm described in Section
2 to achieve highly accurate collision avoidance. The next section describes experiments that
evaluate the accuracy of the images generated by Pix2Pix and the collision avoidance algorithm.</p>
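        <p>The encoder-decoder structure can be made concrete by tracing feature-map sizes through a small U-Net-like network. The sketch below only tracks shapes and trains nothing; the input size of 256, the 64 base channels, and the depth of 4 are hypothetical values chosen for illustration, not the configuration used in this paper.</p>
        <preformat>
```python
def unet_shapes(size=256, base_channels=64, depth=4):
    """Trace (stage, spatial size, channels) through a small U-Net-like network."""
    trace, skips = [], []
    channels = base_channels
    # Encoder: each step keeps a skip feature map, then halves the resolution.
    for level in range(depth):
        trace.append((f"enc{level}", size, channels))
        skips.append(channels)
        size //= 2
        channels *= 2
    trace.append(("bottleneck", size, channels))
    # Decoder: each step doubles the resolution and concatenates the skip,
    # so the reported channel count is up-sampled channels plus skip channels.
    for level in reversed(range(depth)):
        size *= 2
        channels //= 2
        trace.append((f"dec{level}", size, channels + skips[level]))
    return trace


for stage, size, channels in unet_shapes():
    print(f"{stage:10s} {size:4d} x {size:4d}  channels={channels}")
```
        </preformat>
        <p>The decoder ends at the same spatial size as the input, which is what allows the generated depth map to align pixel by pixel with the monocular image.</p>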
      </sec>
    </sec>
    <sec id="sec-12">
      <title>Experiment</title>
      <p>This section presents the Pix2Pix training results and the experiments conducted to verify the
performance of the images produced by Pix2Pix.</p>
      <sec id="sec-12-1">
        <title>Learning on Pix2Pix for Generation of Depth Map and Comparison of Similarity of Images</title>
        <p>This section describes the learning environment of Pix2Pix, the loss function during learning,
and the result of comparing the similarity of images. The specifications of the computer
used for training are shown in Table 1. The learning conditions are shown in Table
2.</p>
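        <table-wrap>
          <label>Table 1</label>
          <table>
            <tbody>
              <tr><td>OS</td><td>Windows 10 Pro</td></tr>
              <tr><td>RAM</td><td>32 GB</td></tr>
              <tr><td>CPU</td><td>Intel Core i7-9700K</td></tr>
              <tr><td>GPU</td><td>NVIDIA GeForce RTX 2070 Super</td></tr>
              <tr><td>Location</td><td>Blocks</td></tr>
            </tbody>
          </table>
        </table-wrap>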
        <p>The computer used for the study runs Windows 10 Pro with 32 GB of RAM, an Intel Core i7-9700K
CPU, and an NVIDIA GeForce RTX 2070 Super GPU, and the map used is Blocks, provided with a binary release of
AirSim. Figure 5 shows an overhead view of Blocks. The training conditions are as follows.
The number of epochs is 500, since the loss curve became smoother once the number of epochs exceeded
400. The number of pairs of monocular images and depth maps obtained by AirSim is
10,000. The batch size is 1 because, in general, a small batch size gives a higher SSIM (structural
similarity).</p>
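        <p>As a quick check of what these conditions imply, the sketch below derives the number of optimization steps from the stated epochs, image pairs, and batch size. The class and property names are hypothetical; only the three numbers come from the text.</p>
        <preformat>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    epochs: int = 500          # from the text
    image_pairs: int = 10_000  # monocular image / depth map pairs
    batch_size: int = 1

    @property
    def steps_per_epoch(self) -> int:
        # Ceiling division so a partial final batch still counts as a step.
        return -(-self.image_pairs // self.batch_size)

    @property
    def total_steps(self) -> int:
        return self.epochs * self.steps_per_epoch


config = TrainingConfig()
print(config.steps_per_epoch, config.total_steps)
```
        </preformat>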
        <p>This experiment compares the similarity between the images generated by Pix2Pix
and the depth maps. The location where the image was taken, the depth map obtained,
and the image generated by Pix2Pix are shown in Figure 5.</p>
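        <p>In order to quantify the similarity of these images, PSNR (peak signal-to-noise ratio) and
SSIM are used. PSNR is obtained by the following equation, where MAX denotes the maximum possible pixel value.
<disp-formula><tex-math>\mathrm{PSNR} = 10 \log_{10} \frac{\mathrm{MAX}^2}{\mathrm{MSE}} \qquad (1)</tex-math></disp-formula>
MSE (mean squared error) is obtained by the following equation, where I and K denote the two images of m by n pixels being compared.
<disp-formula><tex-math>\mathrm{MSE} = \frac{1}{mn} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} \left( I(i,j) - K(i,j) \right)^2</tex-math></disp-formula></p>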
        <p>PSNR indicates how much the pixel brightness value at the same location has changed.
A PSNR of 30 dB or more is generally considered acceptable. Therefore, in this paper, we
use 30 dB or more as the standard. SSIM is a measure that uses the mean, variance, and
covariance of surrounding pixels to compare brightness, contrast, and structure, and thus incorporates
correlations with surrounding pixels as well as individual pixels. SSIM is obtained by the
following equation.</p>
        <p><disp-formula><tex-math>\mathrm{SSIM}(x, y) = \frac{(2\mu_x\mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)}</tex-math></disp-formula></p>
        <p>Figure panels: (a) Depth map; (b) Image generated by Pix2Pix.</p>
        <p>Pix2Pix is 48 meters. Therefore, the error is about 1 m, which is not a problem for collision
avoidance. This will be confirmed in the next experiment.</p>
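        <p>The two similarity measures used in this section can be computed as in the following sketch. It is a simplified illustration, not the evaluation code of this paper: SSIM is computed once over the whole image rather than over local windows, and the constants c1 and c2 use the commonly chosen values (0.01 MAX)^2 and (0.03 MAX)^2, which are assumptions here. The function names and the tiny 2 x 2 images are hypothetical.</p>
        <preformat>
```python
import math


def mse(a, b):
    """Mean squared error between two equally sized grayscale images."""
    pairs = [(p, q) for ra, rb in zip(a, b) for p, q in zip(ra, rb)]
    return sum((p - q) ** 2 for p, q in pairs) / len(pairs)


def psnr(a, b, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    error = mse(a, b)
    return math.inf if error == 0 else 10 * math.log10(max_value ** 2 / error)


def ssim_global(a, b, max_value=255.0):
    """SSIM computed once over the whole image instead of per local window."""
    x = [p for row in a for p in row]
    y = [q for row in b for q in row]
    n = len(x)
    mu_x, mu_y = sum(x) / n, sum(y) / n
    var_x = sum((p - mu_x) ** 2 for p in x) / n
    var_y = sum((q - mu_y) ** 2 for q in y) / n
    cov = sum((p - mu_x) * (q - mu_y) for p, q in zip(x, y)) / n
    c1, c2 = (0.01 * max_value) ** 2, (0.03 * max_value) ** 2
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    )


reference = [[0, 64], [128, 255]]
generated = [[10, 64], [128, 245]]
print(round(psnr(reference, generated), 2), round(ssim_global(reference, generated), 4))
```
        </preformat>
        <p>A generated depth map would meet the standard used in this paper when the PSNR value is 30 dB or more.</p>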
      </sec>
      <sec id="sec-12-2">
        <title>Comparison of Collision Avoidance by Each Method with Pix2Pix</title>
        <p>
          This section presents the results of a comparison between two collision avoidance methods
and three image acquisition methods, for a total of six different methods. As collision avoidance
methods, the Ma method [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] and the Perez method [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ] are used, and three image acquisition methods
are used: a realistic 10-meter depth map, a 200-meter depth map obtained by AirSim, and
images generated by Pix2Pix. Figure 7 shows the processing flow of each method.
        </p>
        <p>First, the system acquires a depth map. Second, it divides the depth map and
selects the best section. Finally, it controls the velocity of the multirotor so that
the selected section is in the center. Each method is implemented by changing the section
selection method and the depth map acquisition method.</p>
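        <p>The final step, steering so that the selected section moves to the center, can be sketched with a simple proportional rule. The control law used in the experiments is not given in the text, so the gain, the 17 x 17 grid, and the (forward, right, down) axis convention below are assumptions; the forward speed of 3 m/s is the value used in the experiment. Sending the command to the simulator is left out.</p>
        <preformat>
```python
def velocity_command(row, col, rows=17, cols=17, forward_speed=3.0, gain=1.0):
    """Map a selected grid section to a (forward, right, down) velocity in m/s.

    The offset of the section from the grid center, scaled to [-1, 1], drives
    the lateral and vertical velocity (a simple proportional rule).
    """
    center_row, center_col = (rows - 1) / 2, (cols - 1) / 2
    right = gain * (col - center_col) / center_col
    # Image rows grow downward, so a section above the center yields a negative down value.
    down = gain * (row - center_row) / center_row
    return forward_speed, right, down


print(velocity_command(8, 8))    # centered section: fly straight ahead
print(velocity_command(4, 10))   # section above and to the right of the center
```
        </preformat>
        <p>Swapping the section selection function or the source of the depth map, while keeping this last step fixed, gives the six combinations compared in this section.</p>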
        <p>The experiment is conducted in Blocks, and the speed in the direction of travel is
set to 3 m/s. The results of 100 flights to random coordinates 100 meters from the starting
point are summarized. The evaluation criteria are the collision rate and the processing time.
The processing time is the time from image acquisition to velocity control in
the flowchart. The experimental results are shown in Table 5.</p>
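        <p>The evaluation protocol can be sketched as follows. The text does not say how the random coordinates are drawn, so the sketch assumes goals sampled uniformly on a horizontal circle of radius 100 meters around the start. The outcome list is hypothetical data for illustration only and is not a result from this paper.</p>
        <preformat>
```python
import math
import random


def random_goal(distance_m=100.0, rng=random):
    """Sample a goal on a circle of the given radius around the start point."""
    angle = rng.uniform(0.0, 2.0 * math.pi)
    return distance_m * math.cos(angle), distance_m * math.sin(angle)


def collision_rate(collided_flags):
    """Fraction of flights that ended in a collision."""
    flags = list(collided_flags)
    return sum(flags) / len(flags)


rng = random.Random(0)
goals = [random_goal(rng=rng) for _ in range(100)]
# Hypothetical outcomes for illustration only: 3 collisions in 100 flights.
outcomes = [True] * 3 + [False] * 97
print(len(goals), collision_rate(outcomes))
```
        </preformat>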
        <p>Table 5 rows: 10m-Ma, 10m-Perez, 200m-Ma, 200m-Perez, Pix2Pix-Ma, Pix2Pix-Perez.</p>
        <p>The experimental results show that the Ma method has an overall shorter processing time than
the Perez method but a higher collision rate. The Ma method does not
avoid obstacles upward, even though the multirotor is able to do so. As a result,
there are many cases in which the multirotor fails to avoid obstacles. On the other hand, the
Perez method has a longer processing time than the Ma method but a lower collision rate.
In particular, the collision rate of Pix2Pix-Perez is lower than that of 200m-Perez. Because
it takes time for the image to be updated when using Pix2Pix, the multirotor can avoid collisions
by flying high enough to reach a height where there are no obstacles.</p>
      </sec>
    </sec>
    <sec id="sec-13">
      <title>Conclusion</title>
      <p>This paper presents the use of Pix2Pix to obtain highly accurate depth maps for multirotor
collision avoidance. The collision rate of the proposed method is 0, outperforming the related works.
Even when Pix2Pix is used, the results show that there are few collisions. To
implement the system on a real multirotor, it is necessary to install a high-performance computer.
Our future work is to study and experiment on how to increase the speed of the system so that
it can be used on actual multirotors. The investigation of generalization performance is also a
future task.</p>
    </sec>
    <sec id="sec-14">
      <title>Acknowledgment</title>
      <p>This work is partly supported by KAKENHI 20K23333 and 20J21208.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1] Florent Martel, Richard Schultz, Ziming Wang, Mariusz Czarnomski, and William Semke, "<article-title>Unmanned Aircraft Systems Sense and Avoid Avionics Utilizing ADS-B Transceiver</article-title>," in <source>Aerospace Conference and AIAA Unmanned... Unlimited Conference</source>, <year>2009</year>.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2] Subodh Bhandari, Nicole Curtis-Brown, Isaac Guzman, Tristan Sherman, Joshua Tellez, and Edward Gomez, "<article-title>UAV Collision Detection and Avoidance using ADS-B Sensor and Custom ADSB Like Solution</article-title>," in <source>AIAA Information Systems-AIAA Infotech@ Aerospace</source>, <year>2017</year>.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3] Joshua Redding, Jayesh Amin, Jovan Boskovic, Yeonsik Kang, Karl Hedrick, Jason Howlett, and Scott Poll, "<article-title>A Real-Time Obstacle Detection and Reactive Path Planning System for Autonomous Small-Scale Helicopters</article-title>," in <source>AIAA Guidance, Navigation and Control Conference and Exhibit</source>, <year>2007</year>.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4] Andrew Moffatt, Eric Platt, Brandon Mondragon, Aaron Kwok, Dennis Uryeu, and Subodh Bhandari, "<article-title>Obstacle Detection and Avoidance System for Small UAVs Using A LiDAR</article-title>," in <source>International Conference on Unmanned Aircraft Systems</source>, <year>2020</year>.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5] Yawei Hou, Zhenling Zhang, Chao Wang, Shouhu Cheng, and Demao Ye, "<article-title>Research on Vehicle Identification Method and Vehicle Speed Measurement Method Based on Multi-rotor UAV Equipped with LiDAR</article-title>," in <source>International Conference on Advanced Electronic Materials, Computers and Software Engineering</source>, <year>2020</year>.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6] Shaoyong Zhang, Na Li, Chenchen Qiu, Zhibin Yu, Haiyong Zheng, and Bing Zheng, "<article-title>Depth Map Prediction from A Single Image with Generative Adversarial Nets</article-title>," <source>Multimedia Tools and Applications</source>, vol. <volume>79</volume>, no. <issue>21</issue>, pp. <fpage>14357</fpage>–<lpage>14374</lpage>, <year>2020</year>.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7] L. Madhuanand, F. Nex, and M. Yang, "<article-title>Deep Learning for Monocular Depth Estimation from UAV Images</article-title>," <source>ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences</source>, pp. <fpage>451</fpage>–<lpage>458</lpage>, <year>2020</year>.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8] F. Liu, C. Shen, G. Lin, and I. Reid, "<article-title>Learning Depth from Single Monocular Images Using Deep Convolutional Neural Fields</article-title>," <source>IEEE Transactions on Pattern Analysis and Machine Intelligence</source>, vol. <volume>38</volume>, no. <issue>10</issue>, pp. <fpage>2024</fpage>–<lpage>2039</lpage>, <year>2016</year>.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9] M. Mancini, G. Costante, P. Valigi, and T. A. Ciarfuglia, "<article-title>J-MOD2: Joint Monocular Obstacle Detection and Depth Estimation</article-title>," <source>IEEE Robotics and Automation Letters</source>, vol. <volume>3</volume>, no. <issue>3</issue>, pp. <fpage>1490</fpage>–<lpage>1497</lpage>, <year>2018</year>.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10] Kyle Hatch, John Mern, and Mykel Kochenderfer, "<article-title>Obstacle Avoidance Using a Monocular Camera</article-title>," in <source>AIAA Scitech Forum</source>, <year>2021</year>.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11] Zhi Hou, Juntong Qi, and Mingming Wang, "<article-title>Fusing Optical Flow and Inertial Data for UAV Motion Estimation in GPS-denied Environment</article-title>," in <source>Chinese Control Conference</source>, <year>2019</year>.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12] Dong-Wan Yoo, Dae-Yeon Won, and Min-Jea Tahk, "<article-title>Optical Flow Based Collision Avoidance of Multi-rotor UAVs in Urban Environments</article-title>," <source>International Journal of Aeronautical and Space Sciences</source>, vol. <volume>12</volume>, no. <issue>3</issue>, pp. <fpage>252</fpage>–<lpage>259</lpage>, <year>2011</year>.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13] Haiyang Chao, Yu Gu, and Marcello Napolitano, "<article-title>A Survey of Optical Flow Techniques for UAV Navigation Applications</article-title>," in <source>International Conference on Unmanned Aircraft Systems</source>, <year>2013</year>.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14] Chenxiang Ma, You Zhou, and Zhiqiang Li, "<article-title>A New Simulation Environment Based on AirSim, ROS, and PX4 for Quadcopter Aircrafts</article-title>," in <source>International Conference on Control, Automation and Robotics</source>, <year>2020</year>.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15] Johann Borenstein and Yoram Koren, "<article-title>The Vector Field Histogram-Fast Obstacle Avoidance for Mobile Robots</article-title>," <source>IEEE Transactions on Robotics and Automation</source>, vol. <volume>7</volume>, no. <issue>3</issue>, pp. <fpage>278</fpage>–<lpage>288</lpage>, <year>1991</year>.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16] Erwin Perez, Alexander Winger, Alexander Tran, Carlos Garcia-Paredes, Niran Run, Nick Keti, Subodh Bhandari, and Amar Raheja, "<article-title>Autonomous Collision Avoidance System for a Multicopter using Stereoscopic Vision</article-title>," in <source>International Conference on Unmanned Aircraft Systems</source>, <year>2018</year>.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17] Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A Efros, "<article-title>Image-to-Image Translation with Conditional Adversarial Networks</article-title>," in <source>IEEE Conference on Computer Vision and Pattern Recognition</source>, <year>2017</year>.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18] Olaf Ronneberger, Philipp Fischer, and Thomas Brox, "<article-title>U-net: Convolutional Networks for Biomedical Image Segmentation</article-title>," in <source>International Conference on Medical Image Computing and Computer-Assisted Intervention</source>. Springer, <year>2015</year>.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>