<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Point cloud change detection in indoor environments</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Tomoya Matsubara</string-name>
          <email>tomoya.matsubara@hvrl.ics.keio.ac.jp</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hideo Saito</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Keio University</institution>
          ,
          <addr-line>Yokohama</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>3D change detection plays a crucial role in a wide range of applications, including disaster management, as well as in robotics for search, rescue, security, and surveillance purposes. Although previous works exist, most of them are limited to detecting a few specific targets or are restricted to 2D images. Additionally, some assume prior knowledge of the object positions of interest. This paper presents a novel change detection algorithm that combines panoptic segmentation and k-NN, enabling the detection of changes without relying on positional information about the objects of interest. Experimental evaluations on indoor point clouds demonstrate the algorithm's capability to detect the removal of densely and closely placed objects, an aspect overlooked by previous approaches due to their inherent limitations. Despite variations in settings and datasets, our algorithm achieves a recall improvement of 0.06 for the removed class, surpassing the performance of existing related works.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Keywords</title>
      <p>metaverse, point cloud, change detection, panoptic segmentation, k-Nearest Neighbor</p>
      <p>© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License</p>
    </sec>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>Point clouds, which accurately capture the 3D geometry
of scenes, have extensive applications in scene
understanding and robotics [1], encompassing tasks such as 3D
shape classification [2, 3], 3D object detection [4, 5, 6, 7],
and point cloud segmentation [8]. In the realm of the
metaverse, point clouds frequently serve as
representations of scenes within virtual worlds.</p>
      <p>The metaverse is built upon immersive user
experiences, necessitating an interaction layer that effectively
bridges the physical and virtual worlds [9]. Digital twins
[10] serve as a critical component within this layer,
facilitating the transmission and synchronization of data
and information between the virtual and physical worlds.
However, the constant scanning of the entire scene to
update the digital twin is impractical due to the vast
amount of data involved. Thus, the selective updating of
the digital twin in areas where changes have occurred
becomes paramount, highlighting the essential role of
change detection.</p>
      <p>2D change detection techniques, which primarily focus
on comparing two input images, have been proposed
primarily for remote sensing applications [11, 12]. However,
these approaches are constrained by the requirement of
image alignment between the two inputs. In contrast, 3D
change detection has garnered attention for its ability to
overcome this limitation. Although some research has
been conducted on 3D change detection in domains such
as disaster management [13], security patrols [14], and</p>
      <p>APMAR’23: The 15th Asia-Pacific Workshop on Mixed and Augmented</p>
    </sec>
    <sec id="sec-3">
      <title>2. Related Work</title>
      <p>3D change detection has been studied by various
methods; however, these methods often suffer from limitations
in their applicability. Some approaches necessitate prior
knowledge of the object positions of interest [16], while
others are confined to detecting a limited number of
targets [17, 7]. Additionally, certain techniques are only
suitable for change detection in 2D images [18]. The
process of collecting positional information can be
labor-intensive, particularly in the case of 3D data, as it involves
annotation. The range and diversity of detection targets
directly impact the algorithm’s applicability.</p>
      <p>S. Nikoohemat et al. [17] concentrate on changes in building geometry and propose a change detection algorithm that combines 2D and 3D nearest points. However, their algorithm’s applicability is limited as it only considers vertical planes as potential objects of interest that may undergo changes. M. Voelse et al. [7] introduce change detection as a means to distinguish between static and temporal objects when creating updated 3D models of environments. The algorithm begins by segmenting point clouds using region growing, with a set distance threshold of 10 cm. Additionally, segments with a height below 20 cm are excluded (i.e., discarded) to mitigate misclassifications. These defined thresholds make it challenging to apply the algorithm in environments where objects are closely positioned, such as indoor scenarios.</p>
      <p>T. Ku et al. [16] propose three distinct algorithms, namely PoChaDeHH, HGI-CD, and SiamGCN, for change detection on a street-scene dataset consisting of point clouds. The dataset encompasses various street furniture objects, including road signs, advertisements, statues, and garbage bins, with the positions of each object of interest provided, facilitating the extraction of these objects from the point cloud. PoChaDeHH initially eliminates outliers and noisy objects from the extracted point cloud, then employs clustering techniques to separate the remaining objects. The change is estimated based on the mean distance between points in the registered point clouds. HGI-CD utilizes statistical techniques to remove outliers, constructs color and geometric change graphs using the k-NN algorithm, and estimates change using Siamese graph convolutional networks (GCNs) with Fast Point Feature Histograms (FPFH) [19] as the node features. SiamGCN also employs GCNs with graphs constructed through k-NN, but does not include a cleaning step for the extracted point cloud.</p>
      <p>As a change detection algorithm that works without prior information of the positions of objects of interest, K. Sakurada et al. [18] propose two deep learning models, namely CSCDNet and SSCDNet. CSCDNet, a Siamese network based on ResNet-18 [20], estimates the probability mask of change from two input images. SSCDNet, on the other hand, is a U-Net-based network that utilizes the input images and the output of CSCDNet to predict semantic change labels for each pixel. While the networks can be trained with semantic labels from non-aligned images, two aligned images are required during inference. If the input images are not aligned during inference, the models would erroneously detect changes because many pixels in the input images represent different objects that did not undergo change but appear different. Similarly, W. G. C. Bandara and V. M. Patel [11] propose ChangeFormer for 2D change detection, which has demonstrated state-of-the-art performance on LEVIR-CD [21] and DSIFN-CD [22]. However, it faces the same challenge mentioned above, where the alignment of input images during inference is crucial to avoid misinterpretation of changes.</p>
      <sec id="sec-3-8">
        <title>3. Methods</title>
        <p>Let P_t and P_t′ denote a pair of already registered point clouds captured at different times t and t′ (where t ≠ t′), respectively, of the same scene. Given that an object instance exists in P_t but not in P_t′, we define this scenario as the instance being removed when t &lt; t′, or added when t &gt; t′. Therefore, the addition and removal of an object instance can be treated equivalently by interchanging the time values t and t′. Accordingly, we formulate the task of change detection as a binary classification problem, distinguishing between no change and removed, with a particular focus on identifying the specific object instance that underwent the change in the point cloud P_t.</p>
        <p>First, 3D point clouds of the scene are reconstructed from captured images. Subsequently, panoptic segmentation is applied to assign an object instance label to each pixel in the images, thereby associating them with the corresponding points in the point clouds. In the next step, partial point clouds containing the same object instance are extracted from P_t, while their bounding volumes are utilized to extract the corresponding point clouds from P_t′. Finally, the pair of extracted point clouds undergo classification using the k-NN algorithm, determining whether there is no change or the instance has been removed. An overview of the proposed change detection algorithm is depicted in Figure 1. The details are described in this section.</p>
        <p>[Figure 1: Overview of the proposed change detection algorithm. Labeled elements: P_t (point cloud at t), P_t′ (point cloud at t′), bounding volume, Keyboard 1 at t, and Keyboard 1 at t′. If the cloud extracted at t′ has fewer than n_change points, the result is Removed; otherwise k-NN classification yields No Change or Removed.]</p>
        <sec id="sec-3-1">
          <title>3.1. Point Cloud Reconstruction</title>
          <p>The proposed algorithm for change detection relies on the comparison of two point clouds. These point clouds are reconstructed using RGB-D images captured by an iPhone running ARKit, along with the corresponding confidence maps and camera parameters (i.e., intrinsic and extrinsic).</p>
          <p>The confidence maps, provided by ARKit, are 2D arrays with the same dimensions as the depth map. They take three values: low, medium, and high, which indicate the accuracy of the depth values. To optimize computational efficiency and enhance performance, pixels with low or medium confidence values are excluded and not utilized in the reconstruction process.</p>
          <p>By utilizing the camera parameters and depth map, the position of each pixel in the 3D world coordinate system can be uniquely computed. The RGB images are employed to extract color information, while the instance labels are obtained through segmentation, as discussed in Subsection 3.2.</p>
        </sec>
        <sec id="sec-3-2">
          <title>3.2. Panoptic Segmentation</title>
          <p>To perform panoptic segmentation on the RGB images, we employ Detectron2 [23], a pre-trained deep learning model that assigns an instance label to each pixel. The instance label takes the form of object_id (e.g., keyboard_10), where object is a character string indicating the type of object represented by the pixel, and id denotes the instance’s unique identifier.</p>
          <p>It is important to note that the uniqueness of instance labels extends not only within each image but also across all images used for point cloud reconstruction. Consequently, different instance labels in two separate images may correspond to the same underlying object (as illustrated in Figure 2). Conversely, distinct instance labels within a single image necessarily represent different objects. At this stage, we establish an unpaired set, U, comprising pairs of instance labels that must represent distinct instances.</p>
          <p>Figure 2: Instance labels obtained from panoptic segmentation. chair_1 (resp. chair_2) of the left image is the same instance as chair_4 (resp. chair_3) in the right image.</p>
          <p>By projecting pixels with instance labels onto the 3D space, each point within the reconstructed point cloud is assigned an instance label. Consequently, each point of the point cloud is represented as a 7-dimensional vector, comprising the 3D position, RGB color, and instance label.</p>
          <p>To consolidate different instance labels corresponding to the same object, we employ a k-NN-based algorithm. Initially, for each point, we compute the k_merge nearest points and their corresponding distances. Subsequently, instance labels are merged using the algorithm depicted in Figure 3. This algorithm verifies whether a point p1 and its neighboring point p2 possess the same object label and whether their distance falls below the predefined threshold d_merge. Moreover, it examines the set U to determine if p1 and p2 should be considered as the same instance.</p>
          <p>It is important to note that the condition (l1, l2) ∉ U, where l1 and l2 are the instance labels of p1 and p2, alone is insufficient; p1 may share the same instance label with other points p_i, where (l_i, l2) ∈ U, and likewise for (l1, l_i′), where p_i′ represents a point within the group sharing the instance label with p2. If all three conditions are satisfied, the instance labels are merged. The Union-Find data structure is employed in the algorithm to manage and merge instance labels.</p>
          <p>[Figure 3: Label merge algorithm. Data: P: point cloud; N: k_merge nearest neighbors; D: k_merge nearest neighbors’ distances; U: unpaired set; d_merge: distance threshold; a Union-Find instance equipped with a method that merges instance labels and a method that returns the group members of the given point. Result: instance labels of the same instance are merged into one single label. The procedure iterates over every point p1 in P and over each of its neighbors p2 in N.]</p>
        </sec>
        <sec id="sec-3-3">
          <title>3.3. Point Cloud Extraction</title>
          <p>Following the reconstruction and panoptic segmentation steps, the point cloud P_t is partitioned into partial point clouds based on their instance labels. This division ensures that each resulting partial point cloud contains precisely one instance label, and no other partial point clouds share the same label. However, due to imperfections in the panoptic segmentation performed by Detectron2, erroneous instance labels may occasionally arise. These incorrectly labeled partial point clouds often consist of only a few points, as they do not correspond to any actual instances in the scene. To address this issue, we introduce a threshold n_discard and discard any partial point clouds containing fewer than n_discard points.</p>
          <p>Let P_t(l) denote the partial point cloud associated with the instance label l obtained from P_t. In contrast to the extraction of partial point clouds from P_t, the extraction of corresponding partial point clouds from P_t′ is based on the bounding volume of P_t(l). Specifically, if P_t(l) resides within the range [x_min, x_max] × [y_min, y_max] × [z_min, z_max], then the corresponding point cloud P_t′(l) consists of points from P_t′ that also fall within the same range [x_min, x_max] × [y_min, y_max] × [z_min, z_max]. It is important to note that P_t′(l) may contain multiple instance labels, unlike P_t(l), which has only a single instance label.</p>
        </sec>
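        <p>The extraction step of Subsection 3.3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names split_by_instance, bounding_volume, and extract_in_volume are hypothetical, points are plain (x, y, z) tuples, and the bounding volume is assumed to be axis-aligned, as the range notation in the text suggests.</p>
        <preformat><![CDATA[
```python
from collections import defaultdict


def split_by_instance(points, labels, n_discard):
    """Partition the cloud at time t by instance label.

    Groups with fewer than n_discard points are dropped, since they
    are likely to stem from segmentation errors.
    """
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        groups[label].append(point)
    return {label: pts for label, pts in groups.items() if len(pts) >= n_discard}


def bounding_volume(points):
    """Axis-aligned bounds (assumption): ((x_min, y_min, z_min), (x_max, y_max, z_max))."""
    xs, ys, zs = zip(*points)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


def extract_in_volume(points, lower, upper):
    """Keep the points of the cloud at time t' that fall inside the bounds."""
    return [
        p for p in points
        if all(lo <= c <= hi for c, lo, hi in zip(p, lower, upper))
    ]


# Toy data (hypothetical): two points of cup_1 and one stray book_2 point.
p_t = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (5.0, 5.0, 5.0)]
labels_t = ["cup_1", "cup_1", "book_2"]
p_t_prime = [(0.5, 0.5, 0.5), (2.0, 0.5, 0.5)]

partial = split_by_instance(p_t, labels_t, n_discard=2)
lower, upper = bounding_volume(partial["cup_1"])
corresponding = extract_in_volume(p_t_prime, lower, upper)
```
]]></preformat>
        <p>In this toy example the single book_2 point is discarded, and only the point of the second cloud that lies inside the bounding volume of cup_1 is kept. Labels of the second cloud are not consulted at this stage, which is why the extracted cloud may mix several instance labels.</p>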
        <sec id="sec-3-8-1">
          <title>3.4. Change Detection</title>
          <p>Our proposed algorithm focuses on detecting changes
between two point clouds, P_t(l) and P_t′(l). Initially, the
algorithm counts the number of points in P_t′(l), denoted
as n, and promptly identifies the change as removed if n
is less than a predefined threshold n_change. This decision
is based on the observation that instances are less likely
to be represented by a small number of points, indicating
the absence of the instance in P_t′(l).</p>
          <p>In cases where n is not less than n_change, the algorithm
proceeds with the change classification process, as
outlined in Figure 4, which counts the nearest-neighbor pairs whose labels match.</p>
          <p>[Figure 4: Change classification algorithm. Data: P1, P2: point clouds to compare; N: k_change nearest neighbors; D: k_change nearest neighbors’ distances; r_change: ratio threshold. Result: detection result, no change or removed. Procedure: a match counter is initialized to 0; for each point p1 in P1 and each of its neighbors p2 in N, the counter is incremented when the labels of p1 and p2 are equal; if the counter divided by (k_change × the number of points in P1) is less than r_change, the result is removed, otherwise it is no change.]</p>
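          <p>The two-stage decision described in this subsection can be sketched in Python as follows. This is a hedged illustration rather than the authors' code: the names classify_change, k_change, n_change, and r_change are chosen here for readability, a brute-force search stands in for whatever k-NN implementation was actually used, and the labels being compared are assumed to be object-category labels that are comparable across the two capture times.</p>
          <preformat><![CDATA[
```python
import math


def knn(query, points, k):
    """Indices of the k points nearest to query (brute force, for illustration)."""
    order = sorted(range(len(points)), key=lambda i: math.dist(query, points[i]))
    return order[:k]


def classify_change(p1, labels1, p2, labels2, k_change, n_change, r_change):
    """Return "removed" or "no change" for one instance.

    p1, labels1: partial cloud at time t and its per-point labels (non-empty).
    p2, labels2: cloud extracted at time t' from the same bounding volume.
    """
    # Stage 1: too few points at t' means the instance is absent.
    if len(p2) < n_change:
        return "removed"

    # Stage 2: count neighbor pairs whose labels match.
    n_match = 0
    for point, label in zip(p1, labels1):
        for j in knn(point, p2, k_change):
            if labels2[j] == label:
                n_match += 1

    ratio = n_match / (k_change * len(p1))
    return "removed" if ratio < r_change else "no change"


# Toy data (hypothetical): a chair at time t and three candidate clouds at t'.
chair_t = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0)]
chair_labels = ["chair", "chair"]
nearby = [(0.0, 0.0, 0.01), (0.1, 0.0, 0.01), (0.05, 0.0, 0.01)]

still_there = classify_change(chair_t, chair_labels, nearby, ["chair"] * 3,
                              k_change=2, n_change=2, r_change=0.5)
replaced = classify_change(chair_t, chair_labels, nearby, ["table"] * 3,
                           k_change=2, n_change=2, r_change=0.5)
sparse = classify_change(chair_t, chair_labels, nearby[:1], ["chair"],
                         k_change=2, n_change=2, r_change=0.5)
```
]]></preformat>
          <p>The early exit on the point count also explains the false positives discussed later: a sparsely scanned instance at the second capture time is reported as removed before any label comparison takes place.</p>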
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experiments</title>
      <sec id="sec-4-1">
        <title>4.1. Dataset</title>
        <p>We utilized an iPhone equipped with ARKit to capture
a dataset (see footnote 1) comprising RGB-D images, confidence maps,
and camera parameters within a room containing various
objects, including chairs, computers, books, cell phones,
and keyboards. The dataset consists of a collection of
frames captured at two different time instances, t and t′.
The specific number of frames captured at each time
instance is presented in Table 1. It is important to highlight
that individual partial point clouds within the dataset do
not necessarily correspond to distinct object instances;
some partial point clouds may represent different parts
of the same object instance.</p>
        <p>Footnote 1: Our source code is available on GitHub: https://github.com/Tomoya-Matsubara/RGB-D-Scan-with-ARKit</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Implementation Details</title>
        <p>Figure 5: Instance labels of tv and chair before and after merging. Each instance label is colored uniquely. In the left images (a) and (c), each instance (tv and chair) has many colors, which shows that many labels are assigned to the same instance before merging. In the right images (b) and (d), in contrast, each of them has only a few colors, which indicates that those labels are merged correctly. Panel titles: (c) Chair before merging; (d) Chair after merging.</p>
        <p>Table 2 provides an overview of the parameter settings
employed in our implementation (see footnote 2). During the point
cloud reconstruction phase, we applied random sampling
and selected n_sample pixels from each frame to
optimize processing time for subsequent operations.</p>
        <p>Although Detectron2 offers support for various object
instances, we focused on extracting point clouds associated
with specific labels, namely book, bottle, cup, chair,
keyboard, laptop, and cell phone.</p>
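        <p>The reconstruction step with confidence filtering and random sampling can be sketched as follows. This is only an illustration under several assumptions that the text does not confirm: a standard pinhole camera model, a 4×4 camera-to-world matrix given as nested lists, the confidence encoding low = 0, medium = 1, high = 2, and sampling applied after the confidence filter. The function names are hypothetical, and axis conventions of a real ARKit capture may differ.</p>
        <preformat><![CDATA[
```python
import random

HIGH_CONFIDENCE = 2  # assumed encoding: 0 = low, 1 = medium, 2 = high


def backproject(u, v, depth, fx, fy, cx, cy):
    """Pixel (u, v) with metric depth -> camera-space point (pinhole model)."""
    return ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)


def to_world(point, cam_to_world):
    """Apply a 4x4 camera-to-world transform (nested lists) to a 3D point."""
    x, y, z = point
    return tuple(r[0] * x + r[1] * y + r[2] * z + r[3] for r in cam_to_world[:3])


def reconstruct_frame(depth_map, conf_map, intrinsics, cam_to_world, n_sample, rng):
    """World-space points for up to n_sample high-confidence pixels of one frame."""
    fx, fy, cx, cy = intrinsics
    candidates = [
        (u, v)
        for v, row in enumerate(conf_map)
        for u, conf in enumerate(row)
        if conf == HIGH_CONFIDENCE
    ]
    chosen = rng.sample(candidates, min(n_sample, len(candidates)))
    return [
        to_world(backproject(u, v, depth_map[v][u], fx, fy, cx, cy), cam_to_world)
        for u, v in chosen
    ]


# Toy 2x2 frame (hypothetical values): only two pixels have high confidence.
depth_map = [[2.0, 2.0], [2.0, 2.0]]
conf_map = [[2, 0], [1, 2]]
identity = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
intrinsics = (1.0, 1.0, 0.0, 0.0)  # fx, fy, cx, cy

points = reconstruct_frame(depth_map, conf_map, intrinsics, identity,
                           n_sample=10, rng=random.Random(0))
one_point = reconstruct_frame(depth_map, conf_map, intrinsics, identity,
                              n_sample=1, rng=random.Random(0))
```
]]></preformat>
        <p>Color and instance label would be attached to each returned point in the same loop, giving the 7-dimensional representation described in Subsection 3.2.</p>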
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Annotation</title>
        <p>The proposed algorithm extracted 88 partial point clouds
from the dataset. Table 3 shows the details of the extracted
point clouds.</p>
        <p>We performed manual annotation to assign labels (i.e., no
change and removed) to the extracted partial point clouds.
During the annotation process, we carefully examined
the origin of each partial point cloud in P_t, determining
the corresponding object instance it belonged to, and
verified the presence of the same object instance in P_t′.
The presence of the object instance indicated no change,
while its absence indicated that the object instance had
been removed.</p>
        <p>Footnote 2: Our implementation is available on GitHub: https://github.com/Tomoya-Matsubara/point-cloud-change-detection</p>
      </sec>
      <sec>
        <title>5. Results</title>
      </sec>
      <sec id="sec-4-4">
        <title>5.1. Label Merge</title>
        <p>Figure 5 demonstrates the successful merging of instance
labels belonging to the object categories tv and chair.
It can be observed that not only the labels of planar tv
instances but also those of chair instances with more
complex shapes were effectively merged. This can be
attributed to the fact that the merge operation solely relied
on distance information, without making any
assumptions about the shapes of the instances.
Although some instances still retained multiple labels,
the overall number of instance labels was significantly
reduced by approximately 40% (from 4,534 to 2,729). After
discarding partial point clouds with a point count below
the threshold n_discard, the remaining labels were further
reduced to a final count of 88.</p>
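        <p>The distance-based merge rule evaluated here (Subsection 3.2) can be sketched as follows. This is a simplified illustration and not the authors' implementation: the class and function names are hypothetical, neighbors are found by brute force, each point is a tuple of position, object name, and instance label, and the unpaired set is assumed to hold pairs of instance labels that appeared together in one image.</p>
        <preformat><![CDATA[
```python
import math
from itertools import product


class UnionFind:
    """Minimal Union-Find over instance labels."""

    def __init__(self, items):
        self.parent = {item: item for item in items}

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)

    def members(self, x):
        root = self.find(x)
        return [y for y in self.parent if self.find(y) == root]


def merge_labels(points, unpaired, k_merge, d_merge):
    """points: list of (xyz, object_name, instance_label). Returns merged labels."""
    uf = UnionFind({label for _, _, label in points})
    for i, (xyz1, obj1, lab1) in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(xyz1, points[j][0]))
        for j in others[:k_merge]:
            xyz2, obj2, lab2 = points[j]
            # Conditions 1 and 2: same object type and close enough.
            if obj1 != obj2 or math.dist(xyz1, xyz2) >= d_merge:
                continue
            if uf.find(lab1) == uf.find(lab2):
                continue  # already merged
            # Condition 3: no member pair of the two groups is unpaired.
            conflict = any(
                (a, b) in unpaired or (b, a) in unpaired
                for a, b in product(uf.members(lab1), uf.members(lab2))
            )
            if not conflict:
                uf.union(lab1, lab2)
    return [uf.find(label) for _, _, label in points]


# Toy data (hypothetical): chair_1 and chair_2 were seen in the same image,
# so they must stay distinct; chair_4 comes from another image.
points = [
    ((0.00, 0.0, 0.0), "chair", "chair_1"),
    ((0.01, 0.0, 0.0), "chair", "chair_4"),
    ((0.02, 0.0, 0.0), "chair", "chair_2"),
    ((3.00, 0.0, 0.0), "tv", "tv_7"),
]
unpaired = {("chair_1", "chair_2")}
merged = merge_labels(points, unpaired, k_merge=2, d_merge=0.05)
```
]]></preformat>
        <p>Checking the unpaired set against every member of both groups, rather than only the two labels at hand, is what keeps chair_2 from being absorbed through chair_4 once chair_4 has joined chair_1.</p>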
      </sec>
      <sec id="sec-4-5">
        <title>5.2. Change Detection</title>
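        <p>The result of the change detection is shown in Figure
6. As the absence of false negatives (bottom left corner of the
matrix) indicates, the proposed algorithm missed none of the
removed instances.</p>
        <p>[Figure 6: Matrix of change detection results, with Ground Truth (0, 1) on the vertical axis and Prediction on the horizontal axis; the color scale ranges from 0 to 40.]</p>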
        <p>However, there were 17 cases of false positives, where
the algorithm incorrectly predicted that an object
instance was removed when it was actually present. This
can be attributed to the limited number of partial point
clouds in P_t′. Figure 7 provides a visual representation
of the change detection results, with P_t and P_t′ captured
from the same angle. False positive cases are highlighted
in green, particularly noticeable in the central chair in
Figure 7 (a). Upon closer examination of the same chair
in Figure 7 (b) and (c), it becomes evident that P_t contains
a substantial number of points accurately representing
the chair’s shape. Conversely, P_t′ only consists of a few
points, failing to capture the chair adequately.
Consequently, due to the algorithm’s tendency to predict
removal in cases with limited point coverage, these false
positives were triggered.</p>
        <p>Since there is no false negative, as explained above, no
red-colored point can be seen in Figure 7 (a).</p>
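        <p>As a worked reading of these counts, and treating removed as the positive class, the per-class recalls follow from the standard definitions. Only FN = 0 and FP = 17 are stated in the text; the true positive and true negative counts TP and TN are left symbolic here because they are not given.</p>
        <preformat><![CDATA[
```latex
\mathrm{Recall}_{\text{removed}}
  = \frac{TP}{TP + FN}
  = \frac{TP}{TP + 0}
  = 1,
\qquad
\mathrm{Recall}_{\text{no change}}
  = \frac{TN}{TN + FP}
  = \frac{TN}{TN + 17}
```
]]></preformat>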
        <p>Figure 8 provides a visualization from a different
angle, showcasing objects on a table, such as a laptop and
a smartphone. These objects are typically ignored as
change detection targets in previous works due to their
close proximity to each other, often just a few centimeters
apart. However, in the proposed algorithm, we
successfully detected the removal of such objects by leveraging
pixel-level object segmentation rather than relying solely
on distance-based criteria. This approach allowed us to
accurately identify and classify the removal of objects,
even in challenging scenarios where objects are spatially
close to each other.</p>
      </sec>
      <sec id="sec-4-7">
        <title>5.3. Comparison with 2D Change Detection</title>
        <p>Figure 9 presents the change detection results obtained
using the pre-trained ChangeFormer [11] model. This
figure showcases the same scene as shown in Figure 8,
with a focus on the successful detection of the laptop
removal. Notably, Figure 9 (a) and (b) exhibit misalignment
because they were recorded by humans, as opposed to
robots whose movements can be pre-defined and
controlled.</p>
        <p>[Figure 7 panel titles: (a) Change detection result; (b) P_t: Point cloud at t; (c) P_t′: Point cloud at t′.]</p>
        <p>In Figure 9 (c), the ChangeFormer model trained on
DSIFN-CD detects changes in the top left corner, although
no actual changes occurred in that region. Additionally,
while the pixels at the center seemingly detect the
removal of the laptop, their size is significantly smaller
compared to the actual change. On the other hand, Figure
9 (e) and (f) depict images captured manually after
aligning the two point clouds. In this particular case, Figure
9 (g) successfully detects the removal.</p>
        <p>[Figure 9 panel titles: (a) Image from t; (b) Image from t′; (c) ChangeFormer (DSIFN); (d) ChangeFormer (LEVIR); (e) Point cloud captured at t; (f) Point cloud captured at t′; (g) ChangeFormer (DSIFN); (h) ChangeFormer (LEVIR).]</p>
        <p>It is worth noting that these results are not surprising,
considering that ChangeFormer was not specifically
designed or trained to detect changes in unaligned images.</p>
        <p>However, in metaverse applications, it is expected that
both robots and humans contribute to data collection
(e.g., image capture) for immediate updates to the virtual
world. Consequently, captured images are not always
perfectly aligned, and cases resembling Figure 9 (e) and
(f) are less likely to arise, especially when the detection
target instance is not pre-determined. From this
perspective, our proposed algorithm demonstrates its ability to
detect changes in object instances, even when captured
from different angles or under misalignment conditions.</p>
      </sec>
      <sec id="sec-4-9">
        <title>5.4. Comparison with Related Work of 3D Change Detection</title>
        <p>For reference, we compared our change
detection results with the performance of the PoChaDeHH,
HGI-CD, and SiamGCN [16] algorithms on their street
scene dataset, as presented in Table 4. It should be noted
that direct comparison between our algorithm and the
reference algorithms is challenging due to the following
reasons:</p>
        <p>• Classification Differences: The reference
algorithms are designed for five-class classification,
namely no change, removed,
added, change, and color change. In contrast, our
algorithm focuses on detecting the removal of
object instances.</p>
        <p>• Dataset Variation: The reference algorithms
utilize a different dataset consisting of street scenes,
which may introduce variations in terms of scene
composition, object types, and background
elements.</p>
        <p>• Known Object Positions: The positions of the
objects of interest are provided in the reference
algorithms, whereas our algorithm operates
without this prior knowledge.</p>
        <p>Despite these differences, our algorithm demonstrates
superior performance in terms of recall for the removed
class compared to the reference algorithms. Even when
considering the added class as equivalent to the removed
class, our algorithm still exhibits a slight performance
advantage of 0.06. However, the recall for the no change
class is comparatively lower than that of PoChaDeHH
and HGI-CD; according to [16], these algorithms tend to
predict no change, but this specialization comes at the
expense of generalization performance for other classes.
Conversely, SiamGCN, which showcases the best
generalization performance among the reference algorithms,
exhibits a recall rate similar to ours.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>6. Conclusion</title>
      <p>In this study, we have presented a change detection
algorithm that relies on panoptic segmentation and k-NN,
operating without the need for positional information
about the object of interest.</p>
      <p>Our label merge algorithm effectively combines
different instance labels that may correspond to the same
object instance, resulting in a reduced number of labels.
We have demonstrated its success in merging labels for
instances with complex shapes, such as chairs.</p>
      <p>Through experiments conducted on an indoor point
cloud dataset, our change detection algorithm has proven
its ability to detect the removal of closely situated objects.
Unlike 2D change detection techniques, our algorithm
surpasses the limitations of capturing changes from a
single angle and showcases its capability to detect changes
in objects captured from different angles. Furthermore,
our algorithm has been compared with a
state-of-the-art algorithm, revealing its competitive performance in
terms of recall, particularly for the removed class.</p>
      <p>In future research, we propose exploring techniques to
assess the quality of input images. Blurred images caused
by camera shake can adversely impact the segmentation
performance, and addressing this issue would enhance
the overall accuracy of our algorithm. Additionally, as
multiple frames may capture the same scene with
minimal differences, removing duplicates could be considered
to reduce the number of frames for processing, ultimately
improving computational efficiency.</p>
      <p>Robots and Systems, 2007, pp. 3429–3435. doi:10.1109/IROS.2007.4399381.</p>
      <p>[15] U. Katsura, K. Matsumoto, A. Kawamura, T. Ishigami, T. Okada, R. Kurazume, Spatial change detection using voxel classification by normal distributions transform, in: 2019 International Conference on Robotics and Automation (ICRA), 2019, pp. 2953–2959. doi:10.1109/ICRA.2019.8794173.</p>
      <p>[16] T. Ku, S. Galanakis, B. Boom, R. C. Veltkamp, D. Bangera, S. Gangisetty, N. Stagakis, G. Arvanitis, K. Moustakas, Shrec 2021: 3d point cloud change detection for street scenes, Computers &amp; Graphics 99 (2021) 192–200. URL: https://www.sciencedirect.com/science/article/pii/S0097849321001369. doi:10.1016/j.cag.2021.07.004.</p>
      <p>[17] S. Nikoohemat, M. Koeva, S. Oude Elberink, C. Lemmen, Change detection from point clouds to support indoor 3d cadastre, The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences 42 (2018) 451–457.</p>
      <p>[18] K. Sakurada, M. Shibuya, W. Wang, Weakly supervised silhouette-based semantic scene change detection, in: 2020 IEEE International Conference on Robotics and Automation (ICRA), IEEE, 2020, pp. 6861–6867.</p>
      <p>[19] R. B. Rusu, N. Blodow, M. Beetz, Fast point feature histograms (fpfh) for 3d registration, in: 2009 IEEE International Conference on Robotics and Automation, IEEE, 2009, pp. 3212–3217.</p>
      <p>[20] K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.</p>
      <p>[21] H. Chen, Z. Shi, A spatial-temporal attention-based method and a new dataset for remote sensing image change detection, Remote Sensing 12 (2020) 1662.</p>
      <p>[22] C. Zhang, P. Yue, D. Tapete, L. Jiang, B. Shangguan, L. Huang, G. Liu, A deeply supervised image fusion network for change detection in high resolution bi-temporal remote sensing images, ISPRS Journal of Photogrammetry and Remote Sensing 166 (2020) 183–200.</p>
      <p>[23] Y. Wu, A. Kirillov, F. Massa, W.-Y. Lo, R. Girshick, Detectron2, https://github.com/facebookresearch/detectron2, 2019.</p>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>