<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Visualizing Motion of Natural Objects by Deep Learning Optical Flow Estimation in an Omnidirectional Image for Virtual Sightseeing</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Motoki Kakuho</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Norihiko Kawai</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Graduate School of Information Science, Osaka Institute of Technology</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>Services using omnidirectional images have become increasingly popular. For example, Google Street View enables users to view the scenery of a location online without physically visiting it. However, the use of still images limits the sense of presence. This study proposes a method that focuses on natural elements such as water, sky, and trees within a single omnidirectional image and utilizes deep learning to reproduce their motion in 3D space, generating omnidirectional videos. Experiments demonstrate the effectiveness of the proposed method by comparing results with conventional methods.</p>
      </abstract>
      <kwd-group>
        <kwd>Omnidirectional Image</kwd>
        <kwd>Video Generation</kwd>
        <kwd>Motion Reproduction</kwd>
        <kwd>Virtual Sightseeing</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Virtual sightseeing services using omnidirectional images
have been increasing. For example, TOWNWARP [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and AirPano [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] enable users to enjoy the scenery of famous
tourist spots and cities as videos online without physically
visiting them. Additionally, there are studies that
combine virtual tourism with education by synthesizing
virtual objects into omnidirectional images. For example,
CoSpaces [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] provides functions to place virtual objects
such as information boards, explanatory text, and human
avatars in virtual environments created with omnidirectional
images, which can support various types of learning.
For ruins tourism, an application has been developed
that allows users to learn and enjoy the scenery of the
past of a historical site not only as VR at arbitrary locations
but also as Indirect AR at the site by synthesizing
virtual buildings that existed in the past into omnidirectional
images and presenting them to the user [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. While such services allow users to virtually experience and
learn various places around the world, users cannot view
locations other than famous tourist spots chosen by the
content creators.
      </p>
      <p>
        In contrast, Google Street View [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] is an example of services that allow users to explore any location
worldwide. However, since this service presents still images,
it lacks the sense of presence. One solution to this
issue is to record videos from fixed points while traveling
around the world. However, this method would require
a significant amount of time for collecting video data.
      </p>
      <p>
        To solve this problem, we propose a method that
focuses on natural objects such as water, sky, and trees
within a single omnidirectional image and reproduces
their motion to generate omnidirectional videos for
highly realistic virtual sightseeing at arbitrary locations.
In the proposed method, for water surface and sky regions,
a part of the target omnidirectional image is converted
into a perspective projection image, and the optical
flow of the water surface and sky is estimated by a deep
learning-based approach. The optical flow is then transformed
into the motion in 3D space and projected back
onto the omnidirectional image, reproducing the motion
of the water and sky in the omnidirectional image. For
trees, the optical flow is obtained from a reference video
in the perspective projection, converted into the motion
on vertical 3D planes, and then applied to the omnidirectional
image. Semantic segmentation [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] is also used to
clearly separate the sky, water surface, and tree regions.
This process generates an omnidirectional video where
motion is reproduced only in the regions of water, sky,
and trees.
      </p>
    </sec>
    <sec id="sec-related">
      <title>2. Related Work</title>
      <p>
        Various studies have been conducted on converting still
images into videos by moving objects in them. Among
these, there are studies [
        <xref ref-type="bibr" rid="ref10 ref7 ref8 ref9">7, 8, 9, 10</xref>
        ] that focus on the
movement of natural objects. For instance, Creating Fluid
Animation from a Single Image using Video Database [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]
generates high-quality animations by efficiently
assigning target images using a Markov Random Field (MRF)
and leveraging a fluid video database. Another example
uses machine learning with neural networks. However,
these methods deal with images in perspective projection.
      </p>
      <p>
        Therefore, for example, if the method [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] is applied to
omnidirectional images in equirectangular projection,
the generated motion appears unnatural because the
model is trained on perspective projection images. It also
suffers from parameter dependency, causing motion in
regions where no motion should occur. In addition, even
though the left and right edges of the omnidirectional
image are connected, the conventional methods do not
take this into account. Therefore, when looking around
the omnidirectional image as a perspective projection
image, we can observe a misaligned border in the texture
at the edges.
      </p>
      <p>
        To address these problems, in our previous study [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], we
reproduced the motion of the sky and water surface in
omnidirectional images by assuming that the sky and
water surface could be expressed by straight-line motion
on a plane. The method proposed in this study is an
extended version of that work, reproducing more natural motion
by using optical flow estimated by deep learning and
also reproducing the motion of trees.
      </p>
    </sec>
    <sec id="sec-2">
      <title>3. Proposed Method</title>
      <sec id="sec-2-1">
        <title>3.1. Overview</title>
        <p>
          The flow of the proposed method is as follows. First,
(1) we input an omnidirectional landscape image
containing sky, water, or trees, as shown in Figure
1(a). This study assumes that omnidirectional images
are generated by equirectangular projection so that the
bottom pixel is in the direction of gravity obtained from
the accelerometer in the camera. Next, (2) we apply the
semantic segmentation [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] to the image to divide it into
regions such as water surface, sky, trees, and others as
shown in Figure 1(b). From the segmented image, we
generate a mask image that masks all objects above the
horizon except the sky area, as shown in Figure 1(c). Here,
due to inaccuracies near the boundaries of the semantic
segmentation, the mask regions are expanded to fully
include objects except the sky area. Next, (3) using the
generated mask image, we generate an image in which all
areas above the horizon have sky textures by inpainting
[
          <xref ref-type="bibr" rid="ref12">12</xref>
          ], as shown in Figure 1(d).
        </p>
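        <p>To make steps (2) and (3) concrete, the following Python sketch builds such a mask from a per-pixel label map and expands it before inpainting. The label map, the sky class id, and the inpaint_sky wrapper are placeholders standing in for the semantic segmentation [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] and the inpainting network [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]; it is an illustrative sketch under those assumptions, not the authors' implementation.</p>
        <preformat>
import cv2
import numpy as np

def build_sky_mask(labels: np.ndarray, sky_id: int, dilate_px: int = 15) -> np.ndarray:
    """Mask every non-sky pixel above the horizon of an equirectangular image.

    labels : (H, W) per-pixel class ids from semantic segmentation (assumed).
    sky_id : class id that the segmentation model assigns to "sky" (assumed).
    """
    h, w = labels.shape
    mask = np.zeros((h, w), dtype=np.uint8)
    upper = labels[: h // 2, :]                 # rows above the horizon
    mask[: h // 2, :][upper != sky_id] = 255    # mask everything that is not sky
    # Expand the mask to absorb inaccuracies near segmentation boundaries.
    kernel = np.ones((dilate_px, dilate_px), np.uint8)
    return cv2.dilate(mask, kernel)

# Hypothetical wrapper around the inpainting network; plug in the actual model.
def inpaint_sky(image: np.ndarray, mask: np.ndarray) -> np.ndarray:
    raise NotImplementedError
        </preformat>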
        <p>
          Next, (4) we generate the motion of the water surface,
sky and trees by copying pixel values using calculated
optical flows. The motions of the water surface and sky are
calculated by estimating the motion in 3D space based on
the deep learning-based optical flow estimation [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. The
tree motion is calculated by acquiring the motion from a
perspective projection video of the trees and reproducing
the motion in 3D space.
        </p>
        <p>Figure 1: (a) Example of input image; (b) semantic segmentation; (c) mask image; (d) inpainting result.</p>
        <p>Finally, (5) we combine the videos of each region
generated in (4) with the input image, using the segmented
image shown in Figure 1(b), to generate a video in
which only the water surface, sky, and trees move. Here,
alpha blending is performed at the boundary of the mask
to reduce the unnaturalness at the boundary between
the moving and static regions, as sketched below. The following sections
describe the details of motion generation in Step (4).</p>
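        <p>As an illustration of this alpha blending, the following sketch feathers the region mask and blends each animated frame with the static input image; the feathering width is an assumed value, not one reported in the paper.</p>
        <preformat>
import cv2
import numpy as np

def composite(static_img: np.ndarray, moving_img: np.ndarray,
              region_mask: np.ndarray, feather_px: int = 21) -> np.ndarray:
    """Blend an animated region (water, sky, or trees) into the input image.

    region_mask is 255 inside the moving region and 0 elsewhere; its border
    is feathered so the transition between moving and static pixels is smooth.
    """
    alpha = region_mask.astype(np.float32) / 255.0
    alpha = cv2.GaussianBlur(alpha, (feather_px, feather_px), 0)   # soft boundary
    alpha = alpha[..., None]                                       # (H, W, 1)
    out = alpha * moving_img.astype(np.float32) \
        + (1.0 - alpha) * static_img.astype(np.float32)
    return out.astype(np.uint8)
        </preformat>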
      </sec>
      <sec id="sec-2-2">
        <title>3.2. Generation of Water Motion</title>
        <p>For the motion of water, we assume that the water is
moving along a planar surface in 3D space. This 3D
motion on a plane is represented as a 2D optical flow on
the omnidirectional image.</p>
        <p>Specifically, we first define the coordinate system for
the omnidirectional image and the plane. As shown in
Figure 2, the position (X, Y, Z) on the water surface
corresponding to pixel (u1, v1) of the omnidirectional image
is determined in a coordinate system whose origin is the
center of the sphere corresponding to the omnidirectional
image, as follows:</p>
        <disp-formula id="eq1">
          <label>(1)</label>
          <tex-math>\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = \begin{bmatrix} Z \tan\dfrac{\pi v_1}{h} \cos\dfrac{2\pi u_1}{w} \\ Z \tan\dfrac{\pi v_1}{h} \sin\dfrac{2\pi u_1}{w} \\ Z \end{bmatrix},</tex-math>
        </disp-formula>
        <p>where w and h are the width and height of the
omnidirectional image, and Z is a negative constant representing
the height of the water surface.</p>
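        <p>A minimal sketch of equation (1) follows; the angle and sign conventions are those of the reconstruction above and should be treated as assumptions.</p>
        <preformat>
import numpy as np

def equirect_pixel_to_plane(u1: float, v1: float, w: int, h: int, Z: float):
    """Project pixel (u1, v1) of a w x h equirectangular image onto the
    horizontal plane at height Z (negative for the water surface), eq. (1)."""
    theta = 2.0 * np.pi * u1 / w      # azimuth around the vertical axis
    r = Z * np.tan(np.pi * v1 / h)    # positive for pixels below the horizon
    return np.array([r * np.cos(theta), r * np.sin(theta), Z])
        </preformat>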
        <p>
          In this coordinate system, we compute the flow (fx, fy)
at the water surface. First, as shown in Figure 3, a part of
the omnidirectional image is extracted as a perspective
projection image so that its horizon is at the center of
the image height. The optical flow is then estimated by
the deep learning-based method in [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. As illustrated
in Figure 3, both the original pixel and the pixel after
moving based on the flow are projected onto the plane
at height Z using the focal lengths fu, fv and the image
center cu, cv of the perspective projection image. The
3D coordinates after projecting the pixel (u, v) onto the
plane are calculated as follows:
          <disp-formula id="eq2">
            <label>(2)</label>
            <tex-math>\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = \begin{bmatrix} \dfrac{Z f_v (u - c_u)}{f_u (v - c_v)} \\ \dfrac{Z f_v}{v - c_v} \\ Z \end{bmatrix}.</tex-math>
          </disp-formula>
        </p>
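        <p>The back-projection of equation (2) and the resulting flow on the plane can be sketched as follows; the axis conventions follow the reconstruction above and are assumptions rather than the authors' exact formulation.</p>
        <preformat>
import numpy as np

def perspective_pixel_to_plane(u, v, fu, fv, cu, cv, Z):
    """Back-project pixel (u, v) of the perspective projection image onto the
    horizontal plane at height Z, following equation (2) as reconstructed."""
    X = Z * fv * (u - cu) / (fu * (v - cv))
    Y = Z * fv / (v - cv)
    return np.array([X, Y, Z])

def plane_flow(u, v, du, dv, fu, fv, cu, cv, Z):
    """3D flow on the plane: difference between the projections of the pixel
    before and after displacement by the estimated optical flow (du, dv)."""
    p0 = perspective_pixel_to_plane(u, v, fu, fv, cu, cv, Z)
    p1 = perspective_pixel_to_plane(u + du, v + dv, fu, fv, cu, cv, Z)
    return p1 - p0    # (f_x, f_y, 0): the vertical component cancels
        </preformat>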
        <p>Next, the flow map on the plane is determined from
the differences between the respective projected 3D
coordinates. This process is performed on the pixels in the
lower half of the perspective projection image. However,
the motion calculated on the plane here only corresponds
to a part of the lower part of the omnidirectional image.
To handle the entire water area in the omnidirectional
image, this study assumes that the motion of the water
at any given location is similar. Since the region projected
from the perspective projection image is trapezoidal,
as shown by the red outline in Figure 3, the
flow map of the region is extracted, scaled, interpolated,
and shifted to align with the square region projected from
the lower half of the omnidirectional image, as shown in
Figure 4.</p>
        <p>Next, as shown in Figure 2, the (X, Y) coordinates on
the plane obtained by equation (1) are shifted by the flow
(fx, fy) and projected onto the surface of the sphere. The pixel
(u2, v2) in the omnidirectional image corresponding to
the shifted coordinate (X + fx, Y + fy, Z) on the horizontal
plane is determined as follows:
          <disp-formula id="eq3">
            <label>(3)</label>
            <tex-math>\begin{bmatrix} u_2 \\ v_2 \end{bmatrix} = \begin{bmatrix} \dfrac{w}{2\pi} \tan^{-1}\dfrac{Y + f_y}{X + f_x} \\ \dfrac{h}{\pi} \cos^{-1}\dfrac{Z}{\sqrt{(X + f_x)^2 + (Y + f_y)^2 + Z^2}} \end{bmatrix}.</tex-math>
          </disp-formula>
        </p>
        <p>Finally, the difference between the transformed pixel
(u2, v2) and the original pixel (u1, v1) is calculated as the
optical flow on the omnidirectional image. By
performing this process for all pixels, the optical flows
for the entire water surface on the omnidirectional image
are obtained. Based on these optical flows, pixel values are
copied to generate an image in which the water surface has
moved. This process is repeated for each frame, and a
video is generated by combining all the frames.</p>
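        <p>Putting equations (1) and (3) together, the transfer of the plane flow onto the omnidirectional image can be sketched per pixel as below, again under the reconstructed conventions; fx and fy stand for values read from the aligned flow map of Figure 4.</p>
        <preformat>
import numpy as np

def omnidirectional_flow(u1, v1, fx, fy, w, h, Z):
    """2D flow at pixel (u1, v1) of the equirectangular image: shift the
    plane point of eq. (1) by the plane flow (fx, fy) and re-project it
    onto the image with eq. (3), both as reconstructed above."""
    # Equation (1): pixel to point (X, Y, Z) on the water plane.
    theta = 2.0 * np.pi * u1 / w
    r = Z * np.tan(np.pi * v1 / h)
    X, Y = r * np.cos(theta), r * np.sin(theta)
    # Shift by the plane flow and re-project onto the sphere, eq. (3).
    Xs, Ys = X + fx, Y + fy
    u2 = w / (2.0 * np.pi) * (np.arctan2(Ys, Xs) % (2.0 * np.pi))
    v2 = h / np.pi * np.arccos(Z / np.sqrt(Xs**2 + Ys**2 + Z**2))
    # A full implementation would also wrap u2 - u1 across the left/right seam.
    return u2 - u1, v2 - v1
        </preformat>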
      </sec>
      <sec id="sec-2-3">
        <title>3.3. Generation of Sky Motion</title>
        <p>
          For the motion of the sky, assuming that clouds in the sky
move on a plane in 3D space above the scene, the motion
is estimated in the same manner as for the water. The
optical flows in the upper part of the perspective projection
image in Figure 3 are estimated by the deep learning-based
method [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ], the motion is projected onto the plane,
and the motion of the upper part of the
omnidirectional image is finally determined by re-projecting the
motion on the plane onto the sphere representing the
omnidirectional image.
        </p>
        <p>Note that, as described in Section 3.1, inpainting replaces all
areas above the horizon other than the sky with plausible
sky texture. Even when the flow comes from behind buildings,
the generated texture is copied, so the motion of the sky
can be reproduced.</p>
      </sec>
      <sec id="sec-2-4">
        <title>3.4. Generation of Tree Motion</title>
        <p>For the motion of trees, rather than assuming a single
plane as for the water and sky, we assume that, as shown in
Figure 5, they move, for each image column, on a vertical
plane perpendicular to the radial line from the center of
the sphere to the sphere surface at height 0. As for the sky
and water, this 3D motion is expressed as a 2D optical
flow on the omnidirectional image.</p>
        <p>
          Specifically, a reference video is first input and the
optical flows are estimated by the Farnebäck method [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ].
The flow map is resized to match the tree region. Next,
the 2D coordinates of the input image in the mask region
for trees are converted into 3D coordinates as follows:
          <disp-formula id="eq4">
            <label>(4)</label>
            <tex-math>\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = \begin{bmatrix} \cos\dfrac{2\pi u_1}{w} \\ \sin\dfrac{2\pi u_1}{w} \\ \dfrac{1}{\tan\dfrac{\pi v_1}{h}} \end{bmatrix}.</tex-math>
          </disp-formula>
        </p>
        <p>The 3D coordinate is shifted on the vertical plane based
on the flow map, and the shifted 3D coordinate is
re-projected onto the sphere. The flow on the omnidirectional
image is determined from the original and the re-projected
pixels. By repeating this process for the number of frames
in the reference video, a video with the tree motion is
generated.</p>
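        <p>A minimal sketch of the flow estimation from the reference tree video, using OpenCV's implementation of the Farnebäck method [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ], is shown below; the parameter values are common defaults, not values reported in the paper.</p>
        <preformat>
import cv2

def reference_tree_flows(video_path: str):
    """Dense optical flow between consecutive frames of the reference video."""
    cap = cv2.VideoCapture(video_path)
    ok, prev = cap.read()
    prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
    flows = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        flows.append(flow)    # (H, W, 2): per-pixel (du, dv)
        prev_gray = gray
    cap.release()
    return flows
        </preformat>
        <p>Each flow map would then be resized to the tree mask region and applied on the vertical plane via equation (4).</p>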
      </sec>
    </sec>
    <sec id="sec-3">
      <title>4. Experiments and Discussions</title>
      <sec id="sec-3-1">
        <title>4.1. Experimental Settings</title>
        <p>
          We conducted experiments to generate a video from a
single omnidirectional image. As input, we used an
image captured with the 360° camera RICOH THETA Z1
and an image obtained from Google Street View, which
were resized to a resolution of 1600 × 800. We used the
image captured with the 360° camera as Case 1, and the
image obtained from Google Street View as Case 2. In the
experiments, we set the height of the planes representing
the sky and water along the Z-axis to 2 and -2,
respectively. We set the focal lengths fu, fv and the image center
cu, cv of the perspective projection image, which has a
resolution of 384 × 384, to 192. We obtained the motion of the
trees from the reference video as shown in Figure 6. The
generated video consisted of 199 frames. Additionally,
in Case 2, we compared the results with those obtained
by directly applying the conventional method [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] to the
equirectangular omnidirectional image. The following
sections describe the experiments for Cases 1 and 2 in
turn.
        </p>
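        <p>For reference, the settings above can be collected into a small configuration block; the variable names are illustrative, while the values are those reported in this section.</p>
        <preformat>
# Experimental settings of Section 4.1 (variable names are illustrative).
OMNI_SIZE = (1600, 800)       # equirectangular resolution (width, height)
Z_SKY, Z_WATER = 2.0, -2.0    # plane heights along the Z-axis
PERSP_SIZE = (384, 384)       # perspective projection image resolution
FU = FV = CU = CV = 192.0     # focal lengths and image center
NUM_FRAMES = 199              # length of the generated video
        </preformat>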
      </sec>
      <sec id="sec-3-2">
        <title>4.2. Experimental Results</title>
        <sec id="sec-3-2-1">
          <title>4.2.1. Result of Case 1</title>
          <p>Figure 7 shows the omnidirectional images (the 60th and 120th
frames) generated by the proposed method in Case 1. Figures 8
and 9 show the results of converting these frames into
perspective projection images in a specific direction. In
this experiment, we converted the omnidirectional image
into the perspective projection image as shown in
Figure 10(a). Figure 10(b) shows the calculated optical
flow at the 30th frame. From this flow map, we generated
the flow maps of the water and sky planes as shown in
Figures 10(c) and (d). In these figures, the angle of motion
is represented by hue, the relative magnitude of the
motion is represented by brightness, and the saturation
is fixed at 1.</p>
          <p>Figure 10: (a) Perspective projection image; (b) flow of the perspective projection image; (c) flow of the water plane; (d) flow of the sky plane.</p>
          <p>From these experimental results, we can observe that
the sky moves naturally in the sky region, and we can
also feel perspective because the clouds just above us
move faster than those in the distance. As for the water,
we can see that the water surface moves in various
directions, successfully representing waves. In the flow
map of the water plane in Figure 10(c), we can observe
various hues between green and yellow, and the brightness
also varies, indicating that the complex motion of the
water is well represented. In contrast, the flow of the sky
shows less variation in hue compared to the water,
confirming that it moves in a mostly consistent direction.
Regarding the trees, although the motion of the trees in
the reference video is reflected in the omnidirectional
image, the tree regions still show some unnatural motion.</p>
          <p>One remaining issue is illustrated in Figure 11,
where the sun is also segmented as part of the
sky, making it move in the same way as the clouds. One
solution is to extract the sun from the sky by developing
a new semantic segmentation method and then to keep
the sun in its original position.</p>
          <p>Figure 11: (a) Omnidirectional image; (b) perspective projection.</p>
          <p>Furthermore, while this study focuses on animating
natural objects, many tourist spots also have moving
man-made objects such as cars and flags. If these objects
are not properly animated, the realism of the video is
reduced. We should develop a method for animating
man-made objects, further enhancing the realism of the
video.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>5. Conclusion</title>
      <p>In this study, we proposed a method for generating videos
with motion of natural objects from a single omnidirectional
image by combining deep learning-based optical flow
estimation with motion modeling in 3D space for virtual
sightseeing. Through experiments, we confirmed that the
proposed method is effective. However, while the water and
sky regions moved naturally, the tree regions still showed
some unnatural motion. In future work, we will introduce
deep learning for the motion of trees as well.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgment</title>
      <p>This research was partially supported by JSPS KAKENHI
JP23K21689.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>V. T.</given-names>
            <surname>Consortium</surname>
          </string-name>
          , TOWNWARP,
          <year>2024</year>
          . URL: https:// townwarp.net/,
          <source>last accessed: September</source>
          <volume>25</volume>
          ,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2] AirPano, Airpano,
          <year>2024</year>
          . URL: https://www.airpano. com/,
          <source>last accessed: September</source>
          <volume>25</volume>
          ,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>C.</given-names>
            <surname>Valero-Franco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Berns</surname>
          </string-name>
          ,
          <article-title>A virtual reality app created with cospaces: Student perceptions and attitudes, in: Ethical Considerations of Virtual Reality in the College Classroom</article-title>
          , 1st ed.,
          <source>Routledge</source>
          ,
          <year>2023</year>
          , p.
          <fpage>16</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Suganuma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Oda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Nakayama</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Nishikawa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hata</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Paul</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Wada</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Kawai</surname>
          </string-name>
          ,
          <article-title>Integrated system of augmented and virtual reality for ruins tourism</article-title>
          ,
          <source>in: Proceedings of NICOGRAPH International</source>
          <year>2023</year>
          ,
          <year>2023</year>
          , p.
          <fpage>85</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Google</surname>
          </string-name>
          , Google street view,
          <year>2024</year>
          . URL: https://www. google.co.jp/maps, last accessed:
          <source>July</source>
          <volume>17</volume>
          ,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J.</given-names>
            <surname>Lambert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Sener</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hays</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Koltun</surname>
          </string-name>
          ,
          <article-title>MSeg: A composite dataset for multi-domain semantic segmentation</article-title>
          ,
          <source>in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Y.-Y.</given-names>
            <surname>Chuang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. B.</given-names>
            <surname>Goldman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. C.</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Curless</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Salesin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Szeliski</surname>
          </string-name>
          ,
          <article-title>Animating pictures with stochastic motion textures</article-title>
          ,
          <source>ACM Transactions on Graphics</source>
          <volume>24</volume>
          (
          <year>2005</year>
          )
          <fpage>853</fpage>
          -
          <lpage>860</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M.</given-names>
            <surname>Okabe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Anjyo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Igarashi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.-P.</given-names>
            <surname>Seidel</surname>
          </string-name>
          ,
          <article-title>Animating pictures of fluid using video examples</article-title>
          ,
          <source>Computer Graphics Forum</source>
          <volume>28</volume>
          (
          <year>2009</year>
          )
          <fpage>677</fpage>
          -
          <lpage>686</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M.</given-names>
            <surname>Okabe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Anjyo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Onai</surname>
          </string-name>
          ,
          <article-title>Creating fluid animation from a single image using video database</article-title>
          ,
          <source>Computer Graphics Forum</source>
          <volume>30</volume>
          (
          <year>2011</year>
          )
          <fpage>1973</fpage>
          -
          <lpage>1982</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Endo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Kanamori</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kuriyama</surname>
          </string-name>
          ,
          <article-title>Animating landscape: Self-supervised learning of decoupled motion and appearance for single-image video synthesis</article-title>
          ,
          <source>ACM Transactions on Graphics</source>
          <volume>38</volume>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M.</given-names>
            <surname>Kakuho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Ikebayashi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Kawai</surname>
          </string-name>
          ,
          <article-title>Motion reproduction of sky and water surface from an omnidirectional still image</article-title>
          ,
          <source>in: Proceedings of IEEE Global Conference on Consumer Electronics</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>150</fpage>
          -
          <lpage>151</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>J.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. S.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <article-title>Free-form image inpainting with gated convolution</article-title>
          ,
          <source>in: Proceedings of IEEE International Conference on Computer Vision</source>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>G.</given-names>
            <surname>Farnebäck</surname>
          </string-name>
          ,
          <article-title>Two-frame motion estimation based on polynomial expansion</article-title>
          ,
          <source>in: Proceedings of Scandinavian Conference on Image Analysis (SCIA</source>
          <year>2003</year>
          ), volume
          <volume>2749</volume>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>