=Paper= {{Paper |id=Vol-3919/short26 |storemode=property |title=A Review of Topological Map Construction Methods for Indoor Robot Localization and Navigation |pdfUrl=https://ceur-ws.org/Vol-3919/short26.pdf |volume=Vol-3919 |authors=Wen Liu,Ran Li,Zhongliang Deng |dblpUrl=https://dblp.org/rec/conf/ipin/LiuLD24 }} ==A Review of Topological Map Construction Methods for Indoor Robot Localization and Navigation== https://ceur-ws.org/Vol-3919/short26.pdf
                                A Review of Topological Map Construction Methods for
                                Indoor Robot Localization and Navigation
                                Wen Liu*, Ran Li* and Zhongliang Deng
                                School of Electronic Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China



                                                   Abstract
                                                   In indoor scenes, robots typically rely on prior maps to obtain environmental information for localization
                                                   and navigation. The review first compares the characteristics of metric maps, semantic maps, and
                                                   topological maps, which are commonly used in indoor scenes, emphasizing the advantages and potential of
                                                   topological maps as a high-level representation of environmental structure in the application of indoor
                                                   robot localization and navigation. It introduces methods for constructing topological maps based on
                                                   Voronoi diagrams, graph partitioning, landmark features, and graph appearance, briefly analyzing their
                                                   respective advantages and challenges. The review further explores multi-level expression methods that
                                                   combine topological maps with other environmental information, focusing on the multi-dimensional
                                                   information representation that combines topological structure with semantic mapping, and looks forward
                                                   to its future development direction of intelligent interactive applications for indoor robots.

                                                   Keywords
                                                   Indoor positioning and navigation, topological maps, semantic information, multi-level map



                                1. Introduction
                                    Indoor positioning and navigation for robots can be summarized in three key questions that the
                                robot must answer: "Where am I?", "Where do I want to go?", and "How do I get there?" [1]. Answering
                                them involves how the robot perceives, explores, acquires, and understands a model of its
                                surrounding environment, as well as how it plans and executes motion strategies. First and
                                foremost, this requires constructing an indoor map as an a priori reference, enabling the robot
                                to complete complex tasks effectively.
                                    Initially, researchers used metric maps to model the environment. Over time, however, issues
                                such as high computational demand and difficulty of maintenance became increasingly apparent,
                                making metric maps insufficient for intelligent indoor robot applications. Finding more advanced
                                environmental representations and developing navigation behaviors closer to those of humans has
                                therefore become a new research direction. Drawing on graph theory, topological maps describe
                                the environment through its topological structure, focusing only on the connectivity between
                                nodes rather than on precise geographical coordinates [2]. This eliminates the need for
                                fine-grained map construction, so topological maps are more lightweight while still providing
                                high-level positioning and navigation information for indoor robots.
                                    This review compares the commonly used types of indoor maps to highlight the unique
                                advantages of topological maps in robot applications. After summarizing methods for constructing
                                indoor topological maps, it explores multi-level map representations centered on topological
                                maps, aiming to provide better map services that let indoor robots localize, navigate, and
                                perform complex, advanced tasks.



                                Proceedings of the Work-in-Progress Papers at the 14th International Conference on Indoor Positioning and Indoor
                                Navigation (IPIN-WiP 2024), October 14-17, 2024, Hong Kong, China
                                * Corresponding author.
                                   liuwen@bupt.edu.cn (W. Liu); liran@bupt.edu.cn (R. Li); dengzhl@bupt.edu.cn (Z. Deng)
                                    0000-0002-6450-1969 (W. Liu); 0009-0005-2934-9358 (R. Li)
                                              © 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).


CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073
2. Types of Indoor Map
    For indoor scenes, different mapping methods can be used to construct various types of maps. Fig.
1 illustrates the mapping relationship between environmental information and the three map types:
metric, semantic, and topological. This section introduces and compares them.


[Figure 1 diagram: Environment Information -> Metric Mapping -> Metric Map; Environment Information ->
Semantic Mapping -> Semantic Map; Environment Information -> Topological Mapping -> Topological Map]

Figure 1: Environment Information and the Mapping Relationship with Maps
   Metric maps describe the environment through metric mapping, representing it with real physical
dimensions. They can be constructed from point clouds [5][6] or, more typically, by discretizing the
environment into grid cells [3][4] that encode metrics such as occupancy or distance to obstacles.
Metric maps have two main drawbacks. First, because rich information is needed to represent the
environment accurately, their construction consumes significant computational resources and storage
space, and for most indoor robot tasks much of this information is unnecessary [22]. Second, because
their construction demands fine granularity, their quality is strongly affected by sensor precision
and noise, and they are difficult to update and maintain in dynamic environments.
   Semantic maps record the semantic information of objects, areas, and other elements of the
environment, giving robots a deeper basis for understanding it. They provide an environmental
representation with elements of high-level abstraction [36], extending the logical representation of
the environment. This means that robots can infer new information through additional knowledge of
global elements [3], greatly enhancing their capabilities. However, because the construction of
semantic maps relies on computer vision algorithms [7][8][9][10], the accuracy and robustness of the
algorithms and models used directly affect the quality of the maps. In addition, in real indoor
scenes, changes in the position and state of objects can make semantic maps difficult to update and
maintain [40].
   Topological maps use topological mapping to represent spatial relationships. They simplify the
detailed description of the actual geographical environment and focus on the relative positions of,
and connections between, nodes in the space. A topological map can generally be expressed as a graph:

                                           G = (V, E),                                           (1)

where G refers to the topological map, V refers to the set of vertices (nodes) of the graph, and E
refers to the set of edges, which may be stored as a matrix. In practical applications, nodes can
represent key positions in the environment, such as rooms, doors, corridors, or other feature points,
and edges represent the connection relationships between nodes, such as adjacency, direction, and
accessibility.
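
As an illustration of Eq. (1), the following minimal Python sketch stores a topological map as a vertex set and an edge set and answers a connectivity query with breadth-first search. The layout, the node names, and the helper functions `build_adjacency` and `shortest_route` are hypothetical and chosen only for this example; they are not taken from any of the reviewed methods.

```python
from collections import deque

# Hypothetical indoor layout: nodes are places, edges are traversable connections.
V = {"room_a", "door_a", "corridor", "door_b", "room_b"}
E = {
    ("room_a", "door_a"),
    ("door_a", "corridor"),
    ("corridor", "door_b"),
    ("door_b", "room_b"),
}


def build_adjacency(vertices, edges):
    """Turn G = (V, E) into an undirected adjacency list."""
    adjacency = {v: set() for v in vertices}
    for u, w in edges:
        adjacency[u].add(w)
        adjacency[w].add(u)
    return adjacency


def shortest_route(adjacency, start, goal):
    """Breadth-first search: return the route with the fewest edges, or None."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            route = []
            while node is not None:
                route.append(node)
                node = parents[node]
            return route[::-1]
        for neighbour in sorted(adjacency[node]):
            if neighbour not in parents:
                parents[neighbour] = node
                queue.append(neighbour)
    return None


adjacency = build_adjacency(V, E)
route = shortest_route(adjacency, "room_a", "room_b")
print(route)
```

The query uses only connectivity; no coordinates are stored, which is the property that keeps this representation lightweight.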
    In indoor robot positioning and navigation, topological maps are a coarse-grained structural
representation that reduces map complexity by abstracting the environment, thereby lowering storage
and computational costs. Topology is the branch of mathematics that studies shapes and spaces,
especially the properties that remain unchanged under continuous deformation. Because a topological
map encodes such properties, it is flexible and adaptable and naturally tolerates dynamic
environments to some degree. Even when the environment changes or dynamic obstacles appear, a
topological map is comparatively easy to maintain and update.
3. Construction Methods of Indoor Topological Map
   Constructing a topological map begins with acquiring data from the environment using various
sensors, followed by applying algorithms that extract the topological structure from the data and
represent it as a graph or network. This section introduces several commonly used construction
methods.

3.1. Constructing Topological Map Based on Voronoi Diagram
    The Voronoi diagram is a representation based on geometric distance. It captures spatial
structure and relationships by dividing space into regions, each associated with one of a set of
input points. The boundaries of Voronoi regions consist of segments of the perpendicular bisectors
between adjacent input points. The Voronoi diagram provides positional information about points in
space and their relationships with their nearest neighbors [12][13], making it suitable for robot
motion planning and obstacle avoidance and supporting efficient nearest-neighbor searches. As in
reference [12], when constructing a topological map, the junction points or endpoints of pruned
Voronoi edges can be used as topological nodes, with the Voronoi paths between nodes treated as
edges, thus converting the Voronoi diagram into a topological map. In three-dimensional space,
reference [15] uses the three-dimensional generalized Voronoi diagram (GVD) extracted from a
Euclidean signed distance field [5] to represent the topological structure of the environment and
generate a thin skeleton graph [14]. The resulting sparse topological map is robust to noise and
changes in resolution and is suitable for indoor navigation by micro aerial robots. Such methods
made the Voronoi diagram a powerful tool in robot navigation from early on.
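
The idea of turning Voronoi edges into topological nodes can be sketched on a small occupancy grid. The code below is a simplified, discretized approximation and not the procedure of [12] or [15]: it assumes a hypothetical grid in which each obstacle carries its own letter, measures 4-connected distances by breadth-first search, marks free cells that are equidistant from their two nearest obstacles as the skeleton, and takes skeleton cells that are not simple pass-through cells (endpoints and junctions) as nodes. No pruning is performed.

```python
from collections import deque

# Hypothetical corridor: wall "A" on top, wall "B" at the bottom, "." is free space.
GRID = [
    "AAAAAAA",
    ".......",
    ".......",
    ".......",
    "BBBBBBB",
]
NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def distance_from(grid, label):
    """4-connected BFS distance from one obstacle to every reachable free cell."""
    rows, cols = len(grid), len(grid[0])
    dist = {}
    queue = deque()
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == label:
                dist[(r, c)] = 0
                queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        for dr, dc in NEIGHBOURS:
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == "." and (nr, nc) not in dist:
                dist[(nr, nc)] = dist[(r, c)] + 1
                queue.append((nr, nc))
    return dist


def voronoi_skeleton(grid, tolerance=0):
    """Free cells whose two nearest obstacles are (almost) equally far away."""
    labels = sorted({ch for row in grid for ch in row if ch != "."})
    fields = [distance_from(grid, label) for label in labels]
    skeleton = set()
    for r, row in enumerate(grid):
        for c, ch in enumerate(row):
            if ch != ".":
                continue
            nearest = sorted(f[(r, c)] for f in fields if (r, c) in f)
            if len(nearest) >= 2 and nearest[1] - nearest[0] <= tolerance:
                skeleton.add((r, c))
    return skeleton


def topological_nodes(skeleton):
    """Endpoints and junctions: skeleton cells without exactly two skeleton neighbours."""
    nodes = set()
    for r, c in skeleton:
        degree = sum((r + dr, c + dc) in skeleton for dr, dc in NEIGHBOURS)
        if degree != 2:
            nodes.add((r, c))
    return nodes


skeleton = voronoi_skeleton(GRID)
nodes = topological_nodes(skeleton)
print(sorted(skeleton), sorted(nodes))
```

For this corridor the skeleton is the centre row and its two ends become nodes. Corridors with an even number of free rows have no exactly equidistant cells, so a `tolerance` of 1 would be needed there.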

3.2. Constructing Topological Map Based on Graph Partitioning
   Graph-partitioning methods typically divide acquired RGB images or metric maps into subgraphs
that serve as nodes, with the connections between subgraphs serving as edges. The aim is to maximize
connectivity within each group while minimizing connectivity between different groups. Spectral
clustering algorithms [16][17][18] are commonly used for graph partitioning and share the
characteristic of taking an affinity matrix as input [17]. This matrix typically uses Euclidean
distance or another metric to describe the similarity between data points. Spectral clustering has
inherent disadvantages for generating topological maps [19], such as high computational cost when
the input affinity matrix is large, an excessive number of nodes, and non-repeatable results. In
reference [21], the environment is modeled with "appearance graphs" that use visual camera poses as
low-level map nodes. The normalized cut criterion [20], a graph partitioning method, is then used to
cluster the nodes and construct higher-level maps. This approach can be considered a precursor to
constructing topological maps based on graph appearance.
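
To make the partitioning objective concrete, the sketch below evaluates the normalized cut criterion, Ncut(A, B) = cut(A, B) / assoc(A, V) + cut(A, B) / assoc(B, V), on a toy affinity graph and finds the best two-way split by brute force. The affinity values are invented for illustration, and the exhaustive search is only a stand-in: practical methods such as [20] relax the problem to an eigenvector computation, because the number of bipartitions grows exponentially with the number of nodes.

```python
from itertools import combinations

# Hypothetical affinities between six low-level nodes (e.g. images or map cells).
AFFINITY = {
    (0, 1): 1.0, (0, 2): 1.0, (1, 2): 1.0,  # first tightly connected group
    (3, 4): 1.0, (3, 5): 1.0, (4, 5): 1.0,  # second tightly connected group
    (2, 3): 0.1,                            # weak link, e.g. through a doorway
}
NODES = [0, 1, 2, 3, 4, 5]


def weight(u, v):
    """Symmetric affinity lookup; absent pairs have zero similarity."""
    return AFFINITY.get((u, v), AFFINITY.get((v, u), 0.0))


def assoc(group, others):
    """Total affinity from nodes in `group` to nodes in `others`."""
    return sum(weight(u, v) for u in group for v in others)


def normalized_cut(a, b):
    """Ncut(A, B) = cut(A, B) / assoc(A, V) + cut(A, B) / assoc(B, V)."""
    everything = a | b
    cut_ab = assoc(a, b)
    return cut_ab / assoc(a, everything) + cut_ab / assoc(b, everything)


def best_bipartition(nodes):
    """Brute-force search; the first node is pinned to side A to skip mirror images."""
    first, rest = nodes[0], nodes[1:]
    best = None
    for size in range(len(rest)):
        for extra in combinations(rest, size):
            a = frozenset((first,) + extra)
            b = frozenset(nodes) - a
            value = normalized_cut(a, b)
            if best is None or value < best[0]:
                best = (value, a, b)
    return best


value, side_a, side_b = best_bipartition(NODES)
print(sorted(side_a), sorted(side_b), round(value, 4))
```

The minimum separates the two triangles at the weak link, so each side would become one higher-level node joined by a single edge.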
   Methods based on Voronoi diagrams and graph partitioning produce indoor topological maps that
are more structured and hierarchical than metric maps, but the results are still not concise enough,
construction is subject to many constraints, and the range of applicable environments is relatively
fixed. In subsequent research on topological map construction, they therefore serve mostly as
auxiliary tools.

3.3. Constructing Topological Map Based on Landmark Features
   Methods based on landmark features use feature points in the environment, such as corners,
doors, and rooms, as nodes, and the distance or direction between feature points as edges. Feature
points can be obtained from the environment through sensors. For example, reference [23] uses a 3D
sensor to acquire depth information and develops a progressive Bayesian classifier that directly
identifies different types of corridor structures (such as dead ends, T-junctions, and crossroads),
integrating information from multiple observations to extract features. It abstracts the
environment into a topological map with rooms or intersections as nodes and corridors as edges.
Reference [26] obtains stable visual landmarks (such as doors, fire extinguishers, and elevators)
from videos as nodes and constructs a topological map using continuous sequences as connectivity
information. Feature points can also be extracted from prior environmental information. Reference
[24] builds on a 3D indoor map model from Building Information Modeling (BIM), extracting elements
such as doors, windows, facilities, and rooms as nodes, with edges comprising corridors and
connection relationships. It also proposes step nodes to assist indoor positioning and navigation,
making it adaptable to complex and open indoor environments. Reference [25] extracts elements from
CAD drawings and analyzes their topological relationships to construct an object-oriented
topological structure.
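
A minimal sketch of the landmark idea follows. It assumes a hypothetical observation log in which each entry holds a recognized landmark name and an odometer reading in metres. Distinct landmarks become nodes, and landmarks observed consecutively are linked by an edge weighted with the mean travelled distance. The log format and the function `build_landmark_map` are assumptions for illustration, not the pipeline of any cited work.

```python
# Hypothetical observation log: (landmark name, odometer reading in metres).
OBSERVATIONS = [
    ("door_101", 0.0),
    ("fire_extinguisher", 4.0),
    ("elevator", 9.5),
    ("fire_extinguisher", 15.0),
    ("door_101", 19.0),
]


def build_landmark_map(observations):
    """Nodes are distinct landmarks; edges link landmarks seen consecutively."""
    nodes = []
    for name, _ in observations:
        if name not in nodes:
            nodes.append(name)

    # Collect every travelled distance measured between each pair of landmarks.
    samples = {}
    for (prev, prev_odo), (curr, curr_odo) in zip(observations, observations[1:]):
        if prev == curr:
            continue  # the same landmark seen twice in a row adds no edge
        pair = frozenset((prev, curr))
        samples.setdefault(pair, []).append(abs(curr_odo - prev_odo))

    # Edge weight: mean of the measured distances for that pair.
    edges = {pair: sum(d) / len(d) for pair, d in samples.items()}
    return nodes, edges


nodes, edges = build_landmark_map(OBSERVATIONS)
for pair, distance in edges.items():
    print(sorted(pair), distance)
```

Revisiting a landmark, as on the return trip in this log, adds a second distance sample to an existing edge instead of creating a duplicate node.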

3.4. Constructing Topological Map Based on Graph Appearance
    Methods based on graph appearance primarily use visual information to construct topological
maps. One such method represents the robot's environment as a collection of linked waypoint images:
images serve as nodes, edges are created between consecutive images, and image matching is used for
localization and navigation. Reference [27] establishes nodes based on positional visibility and
assigns edges spatial distance information and the navigability probability between two nodes,
allowing intelligent agents to form long-term plans and navigate in new environments without prior
knowledge of them.
    Another, more typical, approach is planning based on topological memory
[29][30][31][33][34][35]. Topological memory is a memory map in which each node corresponds to a
past observation of the robot. SPTM [28], a representative topological memory method, establishes
nodes by interacting with the environment at discrete time steps and builds a dense topological map
using image similarity as a measure of accessibility. Reference [36] learns a dedicated
accessibility estimator to predict the probability of reachability and sparsifies dense trajectories
into sequences of anchor observations, using the anchor observations as nodes and assigning edge
weights based on reachability probability to construct a sparser topological map. Furthermore,
reference [37] proposes a graph maintenance strategy that improves lifelong navigation performance
by eliminating incorrect edges and expanding the graph as needed. Reference [48] merges multiple
sparse trajectories into a single topological map suitable for localization and navigation
planning, using RGB-D panoramic images as nodes and attaching rough geometric information to the
directed edges of the map, enhancing the robot's global navigation capabilities.
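
The construction of a topological memory can be sketched as follows. The example assumes that each observation has already been reduced to a small feature vector and uses cosine similarity as a stand-in for the learned networks of [28] and [36]; the vectors, the thresholds, and the greedy anchor rule are hypothetical simplifications. Consecutive observations receive temporal edges, visually similar but temporally distant observations receive shortcut edges, and a greedy pass keeps only observations that differ enough from the last kept anchor.

```python
import math

# Hypothetical 2-D appearance features for five consecutive observations.
# The robot leaves a place (0, 1), visits another (2, 3), and returns (4).
FEATURES = [(1.0, 0.0), (1.0, 0.1), (0.0, 1.0), (0.1, 1.0), (1.0, 0.05)]


def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def build_memory_graph(features, shortcut_threshold=0.95, min_gap=2):
    """Dense memory graph: temporal edges plus similarity-based shortcut edges."""
    n = len(features)
    edges = {(i, i + 1) for i in range(n - 1)}  # temporal edges
    for i in range(n):
        for j in range(i + min_gap, n):
            if cosine(features[i], features[j]) >= shortcut_threshold:
                edges.add((i, j))  # shortcut between similar-looking places
    return edges


def select_anchors(features, keep_threshold=0.95):
    """Greedy sparsification: keep an observation only if it differs from the last anchor."""
    anchors = [0]
    for i in range(1, len(features)):
        if cosine(features[anchors[-1]], features[i]) < keep_threshold:
            anchors.append(i)
    return anchors


edges = build_memory_graph(FEATURES)
anchors = select_anchors(FEATURES)
print(sorted(edges), anchors)
```

The shortcut edges close the loop when the robot returns to its starting area, while the anchor list shows how a dense trajectory shrinks to a few representative nodes.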

Table 1
Comparison of Indoor Topological Map Construction Methods

 Method                     Features                                     Applicable Scenarios
 3.1 Voronoi diagram        Emphasizes geometric distance and            Static environments /
                            spatial division                             obstacle avoidance tasks
 3.2 Graph partitioning     Focuses on maximizing regional               Existing foundational maps
                            connectivity
 3.3 Landmark features      Concentrates on prominent feature            Structured scenarios
                            points in the environment
 3.4 Graph appearance       Utilizes visual information, capable of      Relying on visual information /
                            real-time exploration and map construction   real-time construction

   Table 1 compares the four methods of topological map construction for indoor scenes introduced
above. They use different sensor data and algorithms, each with its own characteristics and suitable
application scenarios, and together form a diversified framework for constructing topological maps
of indoor scenes. Although topological maps have unique advantages among the three types of indoor
maps introduced in Section 2, with the rapid development of artificial intelligence a topological
map on its own still falls short of the demands of intelligent, real-time interactive tasks. Because
topological maps are built from nodes and edges, additional constraints can be attached to them
[38], using other environmental mapping information as an auxiliary source to provide multi-level
map information for indoor robot localization and navigation.

4. Multi-level Map Representation Methods
    "Multi-level" refers to using multiple sensors to perceive the environment and mapping it with
various mathematical representations to obtain multi-dimensional information, with the aim of
better serving future intelligent interactive applications for indoor robots [11]. As shown in Fig.
2, this section focuses on topological maps combined with other environmental information and
examines indoor multi-level map representation methods.

[Figure 2 diagram: other information is added to the topological structure. Added to nodes:
visibility, semantic, text, and so on. Added to edges: distance, probability, weight, and so on.]

Figure 2: Adding additional information to the topological structure
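
The structure in Fig. 2 can be sketched as a graph whose nodes and edges carry attribute dictionaries. In the hypothetical example below, nodes hold a semantic label, edges hold a distance and a traversal probability, and the same planner returns different routes depending on which edge attribute is used as cost. All names and values are assumptions made for illustration; the search is a standard Dijkstra search that stops at the first node carrying the requested label.

```python
import heapq
import math

# Hypothetical multi-level map: attributes attached to nodes and edges.
NODES = {
    "n0": {"semantic": "office"},
    "n1": {"semantic": "corridor"},
    "n2": {"semantic": "corridor"},
    "n3": {"semantic": "kitchen"},
}
EDGES = {
    ("n0", "n1"): {"distance": 3.0, "probability": 0.9},
    ("n1", "n3"): {"distance": 4.0, "probability": 0.5},  # short but often blocked
    ("n1", "n2"): {"distance": 5.0, "probability": 0.9},
    ("n2", "n3"): {"distance": 5.0, "probability": 0.9},
}


def by_distance(attrs):
    """Edge cost for the shortest route."""
    return attrs["distance"]


def by_reliability(attrs):
    """Edge cost for the most reliable route: minimising the sum of
    -log(p) maximises the product of traversal probabilities."""
    return -math.log(attrs["probability"])


def plan(start, goal_label, edge_cost):
    """Dijkstra search to the cheapest node whose semantic label matches."""
    adjacency = {name: [] for name in NODES}
    for (u, v), attrs in EDGES.items():
        cost = edge_cost(attrs)
        adjacency[u].append((v, cost))
        adjacency[v].append((u, cost))
    frontier = [(0.0, start, [start])]
    visited = set()
    while frontier:
        cost, node, route = heapq.heappop(frontier)
        if node in visited:
            continue
        visited.add(node)
        if NODES[node]["semantic"] == goal_label:
            return route
        for neighbour, step in adjacency[node]:
            if neighbour not in visited:
                heapq.heappush(frontier, (cost + step, neighbour, route + [neighbour]))
    return None


shortest = plan("n0", "kitchen", by_distance)
most_reliable = plan("n0", "kitchen", by_reliability)
print(shortest, most_reliable)
```

The goal is specified by a semantic label instead of a node identifier, which is the kind of query that the extra node information makes possible.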

4.1. Metrical-Topological Methods
   Metrical-topological methods combine metric mapping with topological mapping to construct
multi-layered topological maps. They integrate basic geometric information about the space into the
topological structure to provide a multi-dimensional representation of the environment. The Spatial
Semantic Hierarchy (SSH) proposed by Kuipers [41] describes knowledge of large-scale space using
four dimensions: metric, topological, causal, and control. It was a meaningful pioneering attempt
to describe the environment by integrating multi-dimensional information. Subsequent research [42]
extended the basic SSH, using metric mapping to create and store local perceptual maps of place
neighborhoods as a small-scale space ontology and mapping them into the large-scale space ontology
of the cognitive map. The resulting global topological map enables robots to perform global
topological inference and local motion planning effectively. Within metrical-topological methods,
some researchers study how to extract topological maps from metric maps and represent the
environment with both [43][44][45], while others study how to enrich the topological structure by
adding metric information to topological maps. These attempts have significantly inspired the
development of indoor topological map construction. In the process, many researchers have
recognized the importance of semantic concepts for the positioning and navigation of indoor robots,
making semantic-topological methods a new research hotspot.

4.2. Semantic-Topological Methods
   Semantic-topological methods combine semantic mapping with topological mapping to construct
multi-layered topological maps. They integrate semantic features such as objects and locations into
the topological structure, helping robots understand their surroundings in terms of human spatial
concepts. Some researchers combine semantic mapping with topological mapping layer by layer
[15][52][53][54], but these approaches are "multi-layered" only in a literal sense: they do not
deeply integrate information across dimensions and are not lightweight enough, containing too much
redundant information.
    Other approaches integrate semantic information with the topological structure at the
structural level, constructing multi-layered maps better suited to the positioning and navigation
tasks of indoor robots. For example, building on SLAM, reference [32] proposed a
semantic-topological method based on ORB-SLAM2 that uses the YOLOv5 network for object detection to
obtain semantic features and constructs a topological map from the spatial positions of static
objects. Building on the topological memory method SPTM [28], reference [47] proposed Topological
Semantic Graph Memory (TSGM), in which image nodes represent different locations and object nodes
point to unique semantic objects using their visual representations. Object nodes within a
neighborhood are connected to the corresponding image nodes as contextual auxiliary information
according to visual rules, to disambiguate similar but distinct objects. Other studies
[11][48][49][50] take a modular approach, using cross-modal encoders to fuse topological maps with
natural language instructions, effectively integrating semantic information to generate navigation
plans and enabling robots to better perform vision-and-language navigation tasks. Recent research
[51] combines topological mapping with large language models: it captures the spatial structure and
connectivity of the environment to build topological maps online and convert them into text
prompts, and uses visual models to convert the visual information of the scene into semantically
rich features. These semantic-topological methods combine the global navigation advantages of
topological structures with the rich semantic information of visual observations, helping robots
understand their navigation paths and enabling effective global exploration and robust, efficient
navigation in complex environments.
    The combination of topological maps with semantic information shows great application
potential and development prospects for indoor robots. Future research can explore how to integrate
rich semantic information more deeply into topological maps, covering not only the recognition of
objects, landmarks, and other points of interest, but also the understanding of functional areas and
the cognition of dynamic changes, thereby enhancing generalization and enabling robots to adapt to
indoor scenes of various scales and complexities. One example is developing advanced cross-modal
learning algorithms and combining them with large language models to integrate visual information,
language information, and topological structure effectively, providing flexible and reliable
support for human-computer interaction and intelligent decision-making.

5. Conclusion
   This review emphasizes the advantages of topological maps as an abstract representation of
environmental structure: they provide high-level navigation information, handle dynamic
environments, and reduce computation and storage. It points out their potential for the positioning
and navigation of indoor robots and compares and outlines several methods for constructing indoor
topological maps. On this basis, the review further proposes the concept of multi-level map
representation, which uses the structural flexibility and scalability of topological maps, combined
with other environmental information, to construct multi-level maps that enrich the robot's
understanding of the environment and thereby better support indoor positioning and navigation. In
particular, semantic-topological methods aim to enhance the autonomy and intelligence of robots by
combining advanced semantic understanding with topological representation. The future development
of this field is expected to be multifaceted, and continued advances in semantic-topological
research are anticipated to bring breakthroughs in indoor robotics.

Acknowledgements
   This work was financially supported by the National Natural Science Foundation of China under
Grant No. 62372049.
References
[1] J. J. Leonard, H. F. Durrant-Whyte, Mobile robot localization by tracking geometric beacons,
     IEEE Transactions on Robotics and Automation 7 (1991) 376-382. doi:10.1109/70.88147.
[2] X. Meng, N. Ratliff, Y. Xiang, and D. Fox, Scaling local control to large-scale topological
     navigation, in: Proc. IEEE Int. Conf. Robot. Autom. (ICRA), Paris, France, 2020, pp. 672-678.
     doi:10.1109/ICRA40945.2020.9196644.
[3] P. Racinskis, J. Arents, M. Greitans, Constructing Maps for Autonomous Robotics: An
     Introductory Conceptual Overview, Electronics 12 (2023). doi:10.3390/electronics12132925.
[4] T. Collins, J. J. Collins and D. Ryan, Occupancy grid mapping: An empirical evaluation, in:
     Mediterranean Conference on Control & Automation, Athens, Greece, 2007, pp. 1-6.
     doi:10.1109/MED.2007.4433772.
[5] H. Oleynikova, Z. Taylor, et al., Voxblox: Incremental 3D Euclidean signed distance fields for
     on-board MAV planning, in: Proc. IEEE/RSJ Int. Conf. Intell. Robots Syst. (IROS), Vancouver,
     BC, Canada, 2017, pp. 1366–1373. doi:10.1109/IROS.2017.8202315.
[6] V. Reijgwart, A. Millane, H. Oleynikova, R. Siegwart, C. Cadena, and J. Nieto, Voxgraph:
     Globally consistent, volumetric mapping using signed distance function submaps, IEEE Robot.
     Autom. Lett. 5 (2020) 227–234. doi:10.1109/LRA.2019.2953859.
[7] C. Case, B. Suresh, A. Coates and A. Y. Ng, Autonomous sign reading for semantic mapping, in:
     Proc. IEEE Int. Conf. Robot. Autom. (ICRA), Shanghai, China, 2011, pp. 3297–3303.
     doi:10.1109/ICRA.2011.5980523.
[8] N. Sünderhauf, F. Dayoub, et al., Place categorization and semantic mapping on a mobile robot,
     in: Proc. IEEE Int. Conf. Robot. Autom. (ICRA), Stockholm, Sweden, 2016, pp. 5729-5736.
     doi:10.1109/ICRA.2016.7487796.
[9] G. Narita, T. Seno, T. Ishikawa and Y. Kaji, PanopticFusion: Online Volumetric Semantic
     Mapping at the Level of Stuff and Things, in: Proc. IEEE/RSJ Int. Conf. Intell. Robots Syst. (IROS),
     Macau, China, 2019, pp. 4205-4212. doi:10.1109/IROS40897.2019.8967890.
[10] Qi, Xianyu et al., Building semantic grid maps for domestic robot navigation, Int. J. Adv. Robot.
     Syst. 17 (2020). doi:10.1177/1729881419900066.
[11] S. Chen, P. -L. Guhur, M. Tapaswi, C. Schmid and I. Laptev, Think Global, Act Local: Dual-scale
     Graph Transformer for Vision-and-language Navigation, in: Proceedings of the IEEE/CVF
     Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 2022,
     pp. 16516-16526. doi:10.1109/CVPR52688.2022.01604.
[12] V. Setalaphruk, A. Ueno, I. Kume, Y. Kono and M. Kidode, Robot navigation in corridor
     environments using a sketch floor map, in: Proceedings 2003 IEEE International Symposium on
     Computational Intelligence in Robotics and Automation, Kobe, Japan, 2003, pp. 552-557.
     doi:10.1109/CIRA.2003.1222240.
[13] S. Friedman, H. Pasula, D. Fox, Voronoi Random Fields: Extracting the Topological Structure of
     Indoor Environments via Place Labeling, in: Proceedings of the 20th International Joint
     Conference on Artificial Intelligence, IJCAI'07, San Francisco, CA, USA, 2006, pp. 2109-2114.
     doi:10.5555/1625275.1625616.
[14] P. Beeson, N. K. Jong and B. Kuipers, Towards Autonomous Topological Place Detection Using
     the Extended Voronoi Graph, in: Proc. IEEE Int. Conf. Robot. Autom., Barcelona, Spain, 2005,
     pp. 4373-4379. doi:10.1109/ROBOT.2005.1570793.
[15] H. Oleynikova, Z. Taylor, R. Siegwart, J. Nieto, 3D Topological Graphs for Micro-Aerial Vehicle
     Planning, in: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS),
     Madrid, Spain, 2018, pp. 1-9. doi:10.1109/IROS.2018.8594152.
[16] U. Luxburg, A tutorial on spectral clustering, Statistics and computing 17 (2007) 395-416.
     doi:10.1007/s11222-007-9033-z.
[17] C. Valgren, T. Duckett, A. Lilienthal, Incremental Spectral Clustering and Its Application to
     Topological Mapping, in: Proc. IEEE Int. Conf. Robot. Autom., Rome, Italy, 2007, pp. 4283-4288.
     doi:10.1109/ROBOT.2007.364138.
[18] E. Brunskill, T. Kollar and N. Roy, Topological mapping using spectral clustering and
     classification, in: 2007 IEEE/RSJ International Conference on Intelligent Robots and Systems,
     San Diego, CA, USA, 2007, pp. 3491-3496. doi:10.1109/IROS.2007.4399611.
[19] M. Liu, F. Colas and R. Siegwart, Regional topological segmentation based on mutual
     information graphs, in: Proc. IEEE Int. Conf. Robot. Autom., Shanghai, China, 2011, pp. 3269-3274.
     doi:10.1109/ICRA.2011.5979672.
[20] J. Shi, J. Malik, Normalized cuts and image segmentation, IEEE Transactions on
     Pattern Analysis and Machine Intelligence 22 (2000) 888-905. doi:10.1109/34.868688.
[21] Z. Zivkovic, B. Bakker, B. Krose, Hierarchical map building using visual landmarks and
     geometric constraints, in: IEEE/RSJ International Conference on Intelligent Robots and Systems,
     Edmonton, AB, Canada, 2005, pp. 2480-2485. doi:10.1109/IROS.2005.1544951.
[22] M. Liu, F. Colas, L. Oth, R. Siegwart, Incremental topological segmentation for semi-structured
     environments using discretized GVG, Autonomous Robots 38 (2015) 143-160.
     doi:10.1007/s10514-014-9398-8.
[23] H. Cheng, H. Chen and Y. Liu, Topological Indoor Localization and Navigation for Autonomous
     Mobile Robot, IEEE Transactions on Automation Science and Engineering 12 (2015) 729-738.
     doi:10.1109/TASE.2014.2351814.
[24] J. Liu, J. Luo, J. Hou, D. Wen, G. Feng, X. Zhang, A BIM Based Hybrid 3D Indoor Map Model for
     Indoor Positioning and Navigation, ISPRS Int. J. Geo-Inf. 9 (2020). doi:10.3390/ijgi9120747
[25] Z. Lin, C. Xiu, W. Yang, D. Yang, A Graph-Based Topological Maps Generation Method for
     Indoor Localization, in: Ubiquitous Positioning, Indoor Navigation and Location-Based Services
     (UPINLBS), Wuhan, China, 2018, pp. 1-8. doi:10.1109/UPINLBS.2018.8559830.
[26] J. Zhu, Q. Li, R. Cao, K. Sun, T. Liu, J.M. Garibaldi et al., Indoor Topological Localization Using
     a Visual Landmark Sequence, Remote Sensing 11 (2019). doi:10.3390/rs11010073
[27] E. Beeching, J. Dibangoye, O. Simonin, C. Wolf, Learning to plan with uncertain topological
     maps, in: A. Vedaldi, H. Bischof, T. Brox, J.-M. Frahm (Eds.), Computer Vision – ECCV 2020,
     volume 12348 of Lecture Notes in Computer Science, Springer, Cham, 2020, pp. 473-490.
     doi:10.1007/978-3-030-58580-8_28.
[28] N. Savinov, A. Dosovitskiy, V. Koltun, Semi-parametric Topological Memory for Navigation, in:
     International Conference on Learning Representations, Vancouver, Canada, 2018.
[29] K. Chen, J. P. de Vicente, G. Sepulveda, F. Xia, A. Soto, M. Vázquez, S. Savarese, A behavioral
     approach to visual navigation with graph localization networks, Robotics: Science and Systems
     2 (2019).
[30] Z. Huang, F. Liu, and H. Su, Mapping state space using landmarks for universal goal reaching,
     in: Proceedings of the 33rd International Conference on Neural Information Processing Systems,
     Curran Associates Inc., Red Hook, NY, USA, 2019, pp. 1942–1952.
[31] M. Laskin, S. Emmons, A. Jain, T. Kurutach, P. Abbeel, and D. Pathak, Sparse graphical memory
     for robust planning, Advances in Neural Information Processing Systems 33 (2020).
[32] Y. Wang, Y. Zhang, L. Hu, W. Wang, C. Ge, S. Tan, A Semantic Topology Graph to Detect
     Re-Localization and Loop Closure of the Visual Simultaneous Localization and Mapping System
     in a Dynamic Environment, Sensors 23 (2023). doi:10.3390/s23208445.
[33] K. Liu, T. Kurutach, C. Tung, P. Abbeel, A. Tamar, Hallucinative topological memory for
     Zero-Shot visual planning, in: Proceedings of the 37th International Conference on Machine
     Learning, ICML’20, 2020, pp. 6259–6270.
[34] T.-H. Wang, H.-J. Huang, J.-T. Lin, C.-W. Hu, K.-H. Zeng, M. Sun, Omnidirectional CNN
     for visual place recognition and navigation, in: Proc. IEEE Int. Conf. Robot. Autom. (ICRA),
     Brisbane, QLD, Australia, 2018, pp. 2341-2348. doi:10.1109/ICRA.2018.8463173
[35] A. Taniguchi, F. Sasaki and R. Yamashina, Pose Invariant Topological Memory for Visual
     Navigation, in: Proc. IEEE Int. Conf. Comput. Vis. (ICCV), Montreal, QC, Canada, 2021, pp.
     15364-15373. doi:10.1109/ICCV48922.2021.01510.
[36] X. Meng, N. Ratliff, Y. Xiang and D. Fox, Scaling Local Control to Large-Scale Topological
     Navigation, in: Proc. IEEE Int. Conf. Robot. Autom. (ICRA), Paris, France, 2020, pp. 672-678.
     doi:10.1109/ICRA40945.2020.9196644.
[37] R. R. Wiyatno, A. Xu and L. Paull, Lifelong Topological Visual Navigation, IEEE Robotics and
     Automation Letters 7 (2022) 9271-9278. doi:10.1109/LRA.2022.3189164.
[38] F. Fraundorfer, C. Engels and D. Nister, Topological mapping, localization and navigation using
     image collections, in: IEEE/RSJ International Conference on Intelligent Robots and Systems, San
     Diego, CA, USA, 2007, pp. 3872-3877. doi:10.1109/IROS.2007.4399123.
[39] J. Crespo, J. C. Castillo, O. M. Mozos, R. Barber, Semantic Information for Robot Navigation: A
     Survey, Applied Sciences 10 (2020). doi:10.3390/app10020497.
[40] X. Han, S. Li, X. Wang, W. Zhou, Semantic Mapping for Mobile Robots in Indoor Scenes: A
     survey, Information 12 (2021). doi:10.3390/info12020092
[41] B. J. Kuipers, The Spatial Semantic Hierarchy, Artificial Intelligence 119 (2000) 191-233.
     doi:10.1016/S0004-3702(00)00017-5.
[42] B. Kuipers, J. Modayil, P. Beeson, M. MacMahon and F. Savelli, Local metrical and global
     topological maps in the hybrid spatial semantic hierarchy, in: Proc. IEEE Int. Conf. Robot. Autom.
     (ICRA), New Orleans, LA, USA, 2004, pp. 4845-4851. doi:10.1109/ROBOT.2004.1302485.
[43] B. Kaleci, O. Parlaktuna, U. Gürel, A comparative study for topological map construction
     methods from metric map, in: Proceedings of the 26th Signal Processing and Communications
     Applications Conference (SIU), Izmir, Turkey, 2018, pp. 1–4. doi:10.1109/SIU.2018.8404845.
[44] F. Blochliger, M. Fehr, M. Dymczyk, T. Schneider and R. Siegwart, Topomap: Topological
     Mapping and Navigation Based on Visual SLAM Maps, in: Proc. IEEE Int. Conf. Robot. Autom.
     (ICRA), Brisbane, QLD, Australia, 2018, pp. 1–9. doi:10.1109/ICRA.2018.8460641.
[45] F. Wang, Y. Liu, C. Wu, H. Chu, Topological Map Construction Based on Region Dynamic
     Growing and Map Representation Method, Applied Sciences 9 (2019). doi:10.3390/app9050816
[46] K. He, G. Gkioxari, P. Dollár, and R. Girshick, Mask R-CNN, in: Proc. IEEE Int. Conf. Comput.
     Vis. (ICCV), Venice, Italy, 2017, pp. 2961–2969. doi:10.1109/ICCV.2017.322.
[47] N. Kim, O. Kwon, H. Yoo, Y. Choi, J. Park, S. Oh, Topological Semantic Graph Memory for
     Image-Goal Navigation, in: 6th Annual Conference on Robot Learning, Auckland, New Zealand, 2022.
[48] K. Chen, J. K. Chen, J. Chuang, M. Vázquez, S. Savarese, Topological
     Planning with Transformers for Vision-and-Language Navigation, in: Proceedings of the
     IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021,
     pp. 11276-11286.
[49] D. An, H. Wang, W. Wang, Z. Wang, Y. Huang, K. He, L. Wang, ETPNav: Evolving Topological
     Planning for Vision-Language Navigation in Continuous Environments, IEEE Transactions on
     Pattern Analysis and Machine Intelligence (2024). arXiv:2304.03047.
[50] M. Hwang, J. Jeong, M. Kim, Y. Oh, S. Oh, Meta-Explore: Exploratory Hierarchical
     Vision-and-Language Navigation Using Scene Object Spectrum Grounding, in: Proceedings of the
     IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, Canada, 2023,
     pp. 6683-6693. arXiv:2303.04077.
[51] J. Chen, B. Lin, R. Xu, Z. Chai, X. Liang, K. Wong, MapGPT: Map-Guided Prompting with
     Adaptive Path Planning for Vision-and-Language Navigation, in: Proceedings of the 62nd
     Annual Meeting of the Association for Computational Linguistics, Bangkok, Thailand, 2024.
[52] D. An, Y. Qi, et al., BEVBert: Multimodal Map Pre-training for Language-guided Navigation, in:
     Proc. IEEE Int. Conf. Comput. Vis. (ICCV), Paris, France, 2023. arXiv:2212.04385.
[53] A. Rosinol, A. Gupta, M. Abate, J. Shi, L. Carlone, 3D Dynamic Scene Graphs: Actionable Spatial
     Perception with Places, Objects, and Humans, Robotics: Science and Systems (RSS) (2020).
     arXiv:2002.06289.
[54] N. Hughes, Y. Chang, L. Carlone, Hydra: A Real-time Spatial Perception System for 3D Scene
     Graph Construction and Optimization, Robotics: Science and Systems (RSS) (2022).