Some Connections between Qualitative Spatial Reasoning and Machine Learning

Anthony G. Cohn (a.g.cohn@leeds.ac.uk)
School of Computer Science, University of Leeds, LS2 9JT, UK
The Alan Turing Institute

ISSN 1613-0073

As has been remarked on before, Space is Special [1,2]. Tobler's First Law of Geography [3] captures the notion that all things are related, but close things are more related. Tversky [2] eloquently argues for the special place of spatial representations, and in particular that (living) things must move and act in space to survive, that all thought begins as spatial thought, and that spatial thinking comes from and is shaped by perceiving the world and acting in it, be it through learning or through evolution. Artificial Intelligence has thus naturally sought to endow artificial agents with spatial representations and ways of reasoning about space. Amongst these, I will focus on qualitative spatial representations and reasoning mechanisms (henceforth QSR, where the 'R' may stand for representation or reasoning or both, depending on the context). Many calculi have been developed for representing and reasoning about space in qualitative ways, covering aspects such as (mereo)topology, orientation/direction, size, distance and shape [4,5]. Whilst QSR has primarily been concerned with deductive reasoning, there have long been connections between QSR and machine learning, and their number is increasing. In this talk I will discuss a number of such connections, ranging from the use of qualitative spatial representations in an inductive logic programming system to learn event classes occurring in video data, to the question of whether large language models (LLMs) are able to make inferences reliably about qualitative spatial relations, and whether they can be supported by symbolic reasoners.

Learning rules for video interpretation:

Dubba et al. [6] show how Inductive Logic Programming can be used to learn a set of rules for recognising event class instances in videos that have been abstracted to a set of qualitative spatio-temporal relations. The method is demonstrated in two domains, one of which involves recognising the events needed to service an aircraft during its turnaround at an airport. The resulting rules are relatively simple, so it might be wondered whether a hand-written rule set could be produced easily and be just as effective. In a comparison with such a manually written rule set, however, the learned model proves more effective: the hand-written rules do not take account of noise in the video data, whereas the learned model was trained on noisy data and is thus more robust to noise at classification time. The paper also shows how the inductive process can be interleaved with abduction, using an embedded spatial theory to improve the learned model in the face of noisy training data.
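To make the idea of abstracting video to qualitative spatio-temporal relations concrete, the following is a minimal sketch and not the pipeline of [6]. It assumes that a tracker supplies one axis-aligned bounding box per object per frame; the `Box` type, the four relation labels and the helper names are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box with x1 < x2 and y1 < y2 (assumed tracker output)."""
    x1: float
    y1: float
    x2: float
    y2: float


def relation(a: Box, b: Box) -> str:
    """Coarse topological relation of box a with respect to box b (illustrative labels)."""
    # No shared area; boxes that merely touch count as disconnected in this sketch.
    if a.x2 <= b.x1 or b.x2 <= a.x1 or a.y2 <= b.y1 or b.y2 <= a.y1:
        return "disconnected"
    if b.x1 <= a.x1 and a.x2 <= b.x2 and b.y1 <= a.y1 and a.y2 <= b.y2:
        return "inside"
    if a.x1 <= b.x1 and b.x2 <= a.x2 and a.y1 <= b.y1 and b.y2 <= a.y2:
        return "contains"
    return "overlaps"


def abstract_track(track_a, track_b):
    """Compress per-frame relations into (relation, first_frame, last_frame) intervals."""
    labels = [relation(a, b) for a, b in zip(track_a, track_b)]
    intervals = []
    frame = 0
    for label, run in groupby(labels):
        length = len(list(run))
        intervals.append((label, frame, frame + length - 1))
        frame += length
    return intervals


# Hypothetical example: a vehicle driving into a stationary service zone.
zone = Box(10, 0, 20, 10)
vehicle = [Box(0, 2, 4, 6), Box(4, 2, 8, 6), Box(8, 2, 12, 6), Box(12, 2, 16, 6)]
print(abstract_track(vehicle, [zone] * len(vehicle)))
# [('disconnected', 0, 1), ('overlaps', 2, 2), ('inside', 3, 3)]
```

Interval facts of this kind are the sort of relational input over which a rule learner could search for event definitions. With real detections the per-frame labels would flicker, which is the noise problem described above.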

Learning groundings for spatial representations:

A key question for QSR is how the relations in a calculus correspond to their use in language and to the real world. Whilst relations are usually given plausible names in a relational calculus, there is no guarantee that these correspond to naturally occurring instances. Indeed, McDermott [7] notes the dangers of "wishful naming". Alomari et al. [8] present a system, named OLAV, which addresses the problem of bootstrapping knowledge in language and vision for autonomous robots. OLAV is able, for the first time, to (1) learn to form discrete concepts from sensory data; (2) ground language (n-grams) to these concepts (which include not only spatial relations, but also object attributes and actions); (3) induce a grammar for the language being used to describe the perceptual world; and moreover to do all this incrementally, without storing all previous data. The resulting grammar can then be used to parse novel commands for downstream action in a robotic system.

Analysing polysemy in spatial prepositions:

One challenge in assigning meanings to spatial prepositions is that they are frequently polysemous, i.e. they have multiple related senses (the polysemes). Because the senses of polysemous terms are so closely intertwined, the theoretical and computational treatment of polysemy presents a difficult challenge for semantic models. To give an example: compare "book on a table", "balloon on the ceiling" and "picture on the wall". Richard-Bollans et al. [9] discuss this problem and show how a model can be built in which these senses can be distinguished using data from human subjects.

Can Large Language Models perform qualitative spatial reasoning reliably?

Many claims (e.g. [10,11,12]) have been made since the emergence of Large Language Models (LLMs) as to their ability to reason. Spatial reasoning is of particular interest, not only because it underlies a human's ability to operate in the physical world, but also because LLMs are not embodied; so the question arises, have they nonetheless acquired an ability to reason about situations which might occur in the real physical world? I will present the results of a number of experiments in which this ability is tested: for (cardinal) directions [13,14], for relational composition and conceptual neighbourhood construction [15], and for other notions in spatial reasoning [16]. One challenge for evaluating LLMs in the domain of spatial reasoning (and commonsense more generally [17]) is the paucity of good benchmarks. I will discuss this issue and briefly present a new benchmark which is based on a synthetic generator, able to provide arbitrarily many examples of automatically labelled indoor virtual scenes [18].

Using LLMs as a natural language interface to symbolic spatial reasoners:

Given the deficiencies in the robustness of LLMs in performing qualitative spatial reasoning, it is worth asking whether an LLM and a more traditional symbolic reasoner in combination could be more effective than either on its own. An LLM has strengths in analysing language, but not so much in more complex reasoning, whilst a symbolic reasoner on its own has no ability to comprehend natural language. The combination of the two can be particularly effective, as demonstrated, for example, on the StepGame benchmark [19,14].
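The division of labour between a language model and a symbolic reasoner can be illustrated with a small sketch. It is not the method of [14]: it assumes that the language-analysis step has already produced triples `(a, relation, b)` meaning "a is <relation> of b", and it makes the simplifying assumption that each relation corresponds to a unit offset on a grid. The relation vocabulary, the `OFFSETS` table and the function name `infer_relation` are hypothetical.

```python
from collections import deque

# Assumed relation vocabulary, each mapped to a unit grid offset (illustrative only).
OFFSETS = {
    "left": (-1, 0), "right": (1, 0), "above": (0, 1), "below": (0, -1),
    "upper-left": (-1, 1), "upper-right": (1, 1),
    "lower-left": (-1, -1), "lower-right": (1, -1),
}


def infer_relation(facts, source, target):
    """Relation of `source` with respect to `target`, or None if they are not connected."""
    graph = {}
    for a, rel, b in facts:
        dx, dy = OFFSETS[rel]
        # a is displaced by (dx, dy) from b, so b is displaced by (-dx, -dy) from a.
        graph.setdefault(b, []).append((a, dx, dy))
        graph.setdefault(a, []).append((b, -dx, -dy))

    # Breadth-first search from the target, accumulating displacements along the chain.
    position = {target: (0, 0)}
    queue = deque([target])
    while queue:
        node = queue.popleft()
        x, y = position[node]
        for neighbour, dx, dy in graph.get(node, []):
            if neighbour not in position:
                position[neighbour] = (x + dx, y + dy)
                queue.append(neighbour)

    if source not in position:
        return None
    x, y = position[source]
    horizontal = "left" if x < 0 else "right" if x > 0 else ""
    vertical = "upper" if y > 0 else "lower" if y < 0 else ""
    if horizontal and vertical:
        return f"{vertical}-{horizontal}"
    if horizontal:
        return horizontal
    if vertical:
        return "above" if y > 0 else "below"
    return "overlap"


# Triples as a language model might extract them from
# "A is to the left of B. B is above C." (written by hand here).
facts = [("A", "left", "B"), ("B", "above", "C")]
print(infer_relation(facts, "A", "C"))  # upper-left
print(infer_relation(facts, "C", "A"))  # lower-right
```

The multi-hop chaining is done deterministically by the search, so the only step that depends on a language model is the extraction of the triples. The unit-offset assumption is a strong one and would not be appropriate for calculi in which composition yields disjunctions of relations.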

Acknowledgements:

This work was supported by: the Fundamental Research priority area of The Alan Turing Institute; the Microsoft Research Accelerating Foundation Models Research program; and the Economic and Social Research Council (ESRC) under grant ES/W003473/1. I also wish to give heartfelt thanks to all my co-authors of the papers [6,8,9,14,16,18] I will discuss in the talk, with whom it has been such a pleasure to interact.

[1] M. F. Goodchild, Challenges in geographical information science, Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences 467 (2011).
[2] B. Tversky, Mind in Motion: How Action Shapes Thought, Basic Books, 2019.
[3] H. J. Miller, Tobler's first law and spatial analysis, Annals of the Association of American Geographers 94 (2004).
[4] A. G. Cohn, J. Renz, Qualitative spatial representation and reasoning, in: F. van Harmelen, V. Lifschitz, B. Porter (Eds.), Handbook of Knowledge Representation, Elsevier, 2008.
[5] J. Chen, A. G. Cohn, D. Liu, S. Wang, J. Ouyang, Q. Yu, A survey of qualitative spatial representations, The Knowledge Engineering Review 30 (2015).
[6] K. S. Dubba, A. G. Cohn, D. C. Hogg, M. Bhatt, F. Dylla, Learning relational event models from video, Journal of Artificial Intelligence Research 53 (2015).
[7] D. McDermott, Artificial intelligence meets natural stupidity, ACM SIGART Bulletin (1976).
[8] M. Alomari, F. Li, D. C. Hogg, A. G. Cohn, Online perceptual learning and natural language acquisition for autonomous robots, Artificial Intelligence 303 (2022) 103637.
[9] A. Richard-Bollans, L. G. Álvarez, A. G. Cohn, Identifying and modelling polysemous senses of spatial prepositions in referring expressions, Cognitive Systems Research 77 (2023).
[10] A. Creswell, M. Shanahan, Faithful reasoning using large language models, arXiv:2208.14271, 2022.
[11] J. Huang, K. C.-C. Chang, Towards reasoning in large language models: A survey, arXiv:2212.10403, 2023.
[12] T. Kojima, S. S. Gu, M. Reid, Y. Matsuo, Y. Iwasawa, Large language models are zero-shot reasoners, Advances in Neural Information Processing Systems 35 (2022). arXiv:2205.11916.
[13] A. G. Cohn, R. E. Blackwell, Evaluating the ability of large language models to reason about cardinal directions, in: Proc. COSIT-24 (to appear), 2024. arXiv:2406.16528.
[14] F. Li, D. C. Hogg, A. G. Cohn, Advancing spatial reasoning in large language models: An in-depth evaluation and enhancement using the StepGame benchmark, in: Proc. AAAI, 2024.
[15] A. G. Cohn, An evaluation of ChatGPT-4's qualitative spatial reasoning capabilities in RCC-8, 2023. Appears in Working Notes of QR-23.
[16] A. G. Cohn, J. Hernandez-Orallo, Dialectical language model evaluation: An initial appraisal of the commonsense spatial reasoning abilities of LLMs, arXiv preprint arXiv:2304.11164, 2023.
[17] E. Davis, Benchmarks for automated commonsense reasoning: A survey, ACM Computing Surveys 56 (2023).
[18] F. Li, D. C. Hogg, A. G. Cohn, Reframing spatial reasoning evaluation in language models: A real-world simulation benchmark for qualitative reasoning, in: Proc. IJCAI, 2024.
[19] Z. Shi, Q. Zhang, A. Lipani, StepGame: A new benchmark for robust multi-hop spatial reasoning in texts, in: Proc. AAAI, volume 36, 2022.