
Artificial Spatial Cognition for Robotics and Mobile Systems: Brief Survey and Current Open Challenges

Paloma de la Puente, M. Guadalupe Sánchez-Escribano

Universidad Politécnica de Madrid (UPM), Madrid, Spain

2016


Abstract: Remarkable advances in perception, mapping and navigation for artificial mobile systems have been made in recent decades. However, important limitations remain in the spatial cognition capabilities of existing implementations and in the practical functionality of high-level cognitive models [1, 2]. For enhanced robustness and flexibility in different kinds of real-world scenarios, a deeper understanding of the environment, the system, and their interactions, in general terms, is desired. This long abstract outlines connections between recent contributions in the above-mentioned areas and research in cognitive architectures and biological systems. We try to summarize, integrate and update previous reviews, highlighting the main open issues and the aspects not yet unified or integrated in a common architectural framework.

Keywords: spatial cognition, surveys, perception, navigation

I. BRIEF SURVEY

A. Initial models for spatial knowledge representation and main missing elements

Focusing on spatial knowledge representation and management, the first contributions inspired by the human cognitive map combined local metric maps, as an Absolute Space Representation (ASR), with topological graphs [3]. As a related approach, the Spatial Semantic Hierarchy (SSH) [4] was the first fundamental cognitive model for large-scale space. It evolved into the Hybrid SSH [5], which also included knowledge about small-scale space. This fundamental work was undoubtedly groundbreaking, but it did not go beyond basic levels of information abstraction and conceptualization [6]. Moreover, the well-motivated dependencies among different types of knowledge (both declarative and procedural) were not further considered for general problem solving [7]. The SSH model was considered suitable for the popular "three-layer architecture" schema, without explicitly dealing with processes such as attention or forgetting. This lack of principled forgetting mechanisms has been identified by the Simultaneous Localization and Mapping (SLAM) robotics community as a key missing feature of most existing mapping approaches [8, 9].
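As a rough illustration of the hybrid idea described above (local metric maps attached to the nodes of a topological graph), the following minimal Python sketch may help. It is not the ASR or SSH formalism: the names `Place`, `HybridMap`, `landmarks` and `route` are hypothetical, and route planning is reduced to a breadth-first search over the topological layer.

```python
from __future__ import annotations

from collections import deque
from dataclasses import dataclass, field


@dataclass
class Place:
    """Topological node carrying its own local metric map (illustrative only)."""
    name: str
    # Landmark positions in this place's local frame, in metres (assumed layout).
    landmarks: dict[str, tuple[float, float]] = field(default_factory=dict)


class HybridMap:
    """Topological graph of places; each place keeps its own metric detail."""

    def __init__(self) -> None:
        self.places: dict[str, Place] = {}
        self.edges: dict[str, set[str]] = {}

    def add_place(self, place: Place) -> None:
        self.places[place.name] = place
        self.edges.setdefault(place.name, set())

    def connect(self, a: str, b: str) -> None:
        # Undirected traversability link between two known places.
        self.edges[a].add(b)
        self.edges[b].add(a)

    def route(self, start: str, goal: str) -> list[str] | None:
        """Breadth-first search that uses only the topological layer."""
        parents: dict[str, str | None] = {start: None}
        queue = deque([start])
        while queue:
            current = queue.popleft()
            if current == goal:
                path: list[str] = []
                node: str | None = current
                while node is not None:
                    path.append(node)
                    node = parents[node]
                return path[::-1]
            for neighbour in sorted(self.edges[current]):
                if neighbour not in parents:
                    parents[neighbour] = current
                    queue.append(neighbour)
        return None  # goal not reachable from start
```

In this sketch, global planning never touches metric coordinates; the local `landmarks` would only be consulted once the system is inside a given place, which is one simple way to read the division between large-scale and small-scale space.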

B. The role of cognitive architectures and their relation to other works in the robotics community

Cognitive architectures provide a solid approach for modeling general intelligent agents, and their main commitments support the ambitious requirements of high-level behavior in arbitrary situations for robotics [10]. A more recent model of spatial knowledge, the Spatial/Visual System (SVS) [11], designed as an extension of the Soar cognitive architecture, proposed a different set of multiple representations: symbolic, quantitative spatial, and visual depictive. The spatial scene is a hierarchy tree of entities and their constituent parts, with intermediate nodes defining the transformation relations between parts and objects. Other works in robotics employ similar internal representation ideas [12-14], and others include the possibility of hypothesizing geometric environment structure in order to build consistent maps [15]. While a complete implementation of this approach for all kinds of objects requires solving the corresponding segmentation and recognition problems in a domain-independent manner (which is far beyond the state of the art), keeping the perceptual-level representations within the architecture enhances functionality. A very active research community addresses these difficult challenges.
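The scene hierarchy mentioned above can be pictured with a small Python sketch. This is a simplification under stated assumptions, not the SVS implementation: instead of separate intermediate transformation nodes, each hypothetical `SceneNode` stores its 2D pose relative to its parent, and `world_pose` composes those transforms from the root down.

```python
from __future__ import annotations

import math
from dataclasses import dataclass, field


@dataclass
class SceneNode:
    """Entity or part in a scene tree (illustrative, 2D poses only)."""
    name: str
    # Pose relative to the parent: translation in metres, rotation in radians.
    x: float = 0.0
    y: float = 0.0
    theta: float = 0.0
    # Excluded from repr/eq to avoid infinite recursion through the tree.
    parent: SceneNode | None = field(default=None, repr=False, compare=False)
    children: list[SceneNode] = field(default_factory=list, repr=False, compare=False)

    def add_child(self, child: SceneNode) -> SceneNode:
        child.parent = self
        self.children.append(child)
        return child

    def world_pose(self) -> tuple[float, float, float]:
        """Compose parent-relative transforms from the root to this node."""
        if self.parent is None:
            return (self.x, self.y, self.theta)
        px, py, ptheta = self.parent.world_pose()
        wx = px + math.cos(ptheta) * self.x - math.sin(ptheta) * self.y
        wy = py + math.sin(ptheta) * self.x + math.cos(ptheta) * self.y
        return (wx, wy, ptheta + self.theta)


# Usage: a table rotated by 90 degrees in the room, with a leg 0.5 m along its local x axis.
room = SceneNode("room")
table = room.add_child(SceneNode("table", x=2.0, y=1.0, theta=math.pi / 2))
leg = table.add_child(SceneNode("leg", x=0.5))
print(leg.world_pose())
```

Moving the table node alone changes the world pose of every part below it, which is the practical benefit of storing relations between parts and objects rather than absolute coordinates.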

The recognition process should use not only visual, spatial and motion data from the Perceptual long-term memory (LTM) but also conceptual context information [7, 16] and episodic memories of remembered places [17] from the Symbolic LTM. This should also apply to the navigation techniques for different situations [18, 19]. Motion models for the objects can improve navigation in dynamic environments, which is one of the main problems in real-world robotic applications [20, 21].
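To make the role of object motion models more concrete, here is a minimal Python sketch assuming the simplest possible case: a constant-velocity model and a robot plan given as time-stamped waypoints. The names `MovingObject`, `first_conflict`, `step` and `safety_radius` are hypothetical and are not taken from the cited works.

```python
import math
from dataclasses import dataclass


@dataclass
class MovingObject:
    """Tracked object with an assumed constant-velocity motion model."""
    x: float   # position in metres
    y: float
    vx: float  # velocity in metres per second
    vy: float

    def predict(self, dt: float) -> tuple[float, float]:
        """Predicted position after dt seconds."""
        return (self.x + self.vx * dt, self.y + self.vy * dt)


def first_conflict(waypoints, obj, step, safety_radius):
    """Return the index of the first waypoint that comes too close to the object.

    waypoints[i] is the planned robot position at time i * step (seconds).
    Returns None if the plan stays clear of the predicted object positions.
    """
    for i, (rx, ry) in enumerate(waypoints):
        ox, oy = obj.predict(i * step)
        if math.hypot(rx - ox, ry - oy) < safety_radius:
            return i
    return None


# Usage: the robot drives along the x axis at 1 m/s while a person crosses its path.
plan = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
person = MovingObject(x=2.0, y=2.0, vx=0.0, vy=-1.0)
print(first_conflict(plan, person, step=1.0, safety_radius=0.5))
```

A planner that treated the person as static would consider this plan safe, whereas the prediction flags the waypoint where the two trajectories meet.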

A novel cognitive architecture specifically designed for spatial knowledge processing is Casimir [22], which offers rich modeling capabilities aimed at human-like behavior. Navigation, however, has not been addressed, and this work has scarcely been discussed in the robotics domain.

One of the latest spatial models is the NavModel [23], designed and implemented for the ACT-R architecture. Besides considering multi-level representations, this model presents three navigation strategies with varying cognitive cost. The first implementation assumes known topological localization at room level, while a subsequent implementation incorporates a mental rotation model. This work focuses on cognitive load and does not deal with lower-level issues.

To point out how topics are addressed by the respective communities, we compiled Table I as a comparison. The contrast regarding memory management and uncertainty seems to be relevant. The lack of approaches combining both allocentric and egocentric representations is also remarkable. To conclude, Table II shows a summary of surveys.

TABLE I. COMPARISON OF TOPICS ADDRESSED BY THE COGNITIVE ARCHITECTURES AND ROBOTICS COMMUNITIES

Topics compared: egocentric spatial models; allocentric spatial models; explicit motion models / dynamic information about the environment; memory management and forgetting mechanisms; object-based / semantic representations; uncertainty considerations.

Entries listed for the cognitive architectures community: Casimir, LIDA, SOAR-SVS; Extended LIDA [29].

II. CURRENT OPEN CHALLENGES

The main challenge is closing the gap between high-level models and actual implementations in artificial mobile systems. To reduce this gap, we identify three main goals:

Combination of allocentric and egocentric models using different levels of features/objects + topology/semantics.

Acquisition and integration of motion models and dynamic information for the elements/objects.

Integration of global mapping and loop closure capabilities with extensive declarative knowledge about feature relevance and forgetting mechanisms with episodic memory.

ACKNOWLEDGMENT

The authors want to thank the EUCog community for fostering interdisciplinary research in Artificial Cognitive Systems and organizing inspiring meetings and events.

REFERENCES

[1] M. Jefferies and W.K. Yeap. Robotics and cognitive approaches to spatial mapping. Springer, 2008.

[2] Madl, Chen, Montaldi and Trappl. Computational cognitive models of spatial memory in navigation space: A review. Neural Networks, 2015.

[3] W.K. Yeap. Towards a computational theory of cognitive maps. Journal of Artificial Intelligence, 1988.

[4] Kuipers. The spatial semantic hierarchy. Artificial Intelligence, 2000.

[5] Kuipers, J. Modayil, P. Beeson, M. MacMahon and Savelli. Local metrical and global topological maps in the hybrid Spatial Semantic Hierarchy. ICRA, 2004.

[6] Pronobis and Jensfelt. Large-scale semantic mapping and reasoning with heterogeneous modalities. ICRA, 2012.

[7] S.D. Lathrop. Extending cognitive architectures with spatial and visual imagery mechanisms. PhD Thesis, 2008.

[8] J.A. Fernandez-Madrigal and J.L. Blanco. Simultaneous localization and mapping for mobile robots: Introduction and methods. IGI, 2012.

[9] Cadena et al. Past, present, and future of simultaneous localization and mapping: towards the robust-perception age. T-RO, 2016.

[10] Kurup and Lebiere. What can cognitive architectures do for robotics? Biologically Inspired Cognitive Architectures, 2012.

[11] S.D. Lathrop. Exploring the functional advantages of spatial and visual cognition from an architectural perspective. TopiCS, 2011.

[12] R.F. Salas-Moreno, R.A. Newcombe, Strasdat, P.H.J. Kelly and A.J. Davison. SLAM++: Simultaneous localisation and mapping at the level of objects. CVPR, 2013.

[13] Eslami and Williams. A generative model for parts-based object segmentation. Advances in Neural Information Processing Systems, 2012.

[14] Uckermann, Eibrechter, Haschke and Ritter. Real time hierarchical scene segmentation and classification. Humanoids, 2014.

[15] P. de la Puente and D. Rodriguez-Losada. Feature based graph SLAM in structured environments. Autonomous Robots, 2014.

[16] Kunze et al. Combining top-down spatial reasoning and bottom-up object class recognition for scene understanding. IROS, 2014.

[17] M.B. Moser and E.I. Moser. The brain's GPS. Scientific American, 2016.

[18] Gunzelmann and Lyon. Mechanisms for human spatial competence. Spatial Cognition, LNAI-Springer, 2007.

[19] Dayoub, G. Cielniak and Duckett. Eight weeks of episodic visual navigation inside a non-stationary environment using adaptive spherical views. FSR, 2013.

[20] Hawes et al. The STRANDS project: long-term autonomy in everyday environments. Robotics and Automation Magazine, in press.

[21] P. de la Puente et al. Experiences with RGB-D navigation in real home robotic trials. ARW, 2016.

[22] Schultheis and Barkowsky. Casimir: an architecture for mental spatial knowledge processing. TopiCS, 2011.

[23] Zhao. Understanding human spatial navigation behaviors: A cognitive modeling. PhD Thesis, 2016.

[24] Drouilly, Rives and Morisset. Semantic representation for navigation in large-scale environments. ICRA, 2015.

[25] L.F. Posada, Hoffmann and Bertram. Visual semantic robot navigation in indoor environments. ISR, 2014.

[26] Richardson and Olson. Iterative path optimization for practical robot planning. IROS, 2011.

[27] Ambrus, Bore, Folkesson and Jensfelt. Meta-rooms: building and maintaining long term spatial models in a dynamic world. IROS, 2014.

[28] D. M. Rosen, J. Mason and J. J. Leonard. Towards lifelong feature-based mapping in semi-static environments. ICRA, 2016.

[29] Madl, Franklin, Chen, Montaldi and Trappl. Towards real-world capable spatial memory in the LIDA cognitive architecture. BICA, 2016.

[30] J. J. DiCarlo, D. Zoccolan and N. C. Rust. How does the brain solve visual object recognition? Neuron, 2012.