<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>On Integrated Reasoning and Learning about Space and Motion in Embodied Multimodal Interaction</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Mehul Bhatt</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Cognitive vision, Knowledge representation and reasoning (KR), Machine Learning, Integration of reasoning &amp; learning</institution>
          ,
          <addr-line>Commonsense reasoning, Declarative spatial reasoning, Relational Learning, Computational cognitive modelling, Human-Centred AI, Responsible AI</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>School of Science and Technology, Örebro University</institution>
          <country country="SE">Sweden</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2013</year>
      </pub-date>
      <volume>32</volume>
      <fpage>4</fpage>
      <lpage>6</lpage>
      <abstract>
        <p>We present recent and emerging advances in computational cognitive vision addressing artificial visual and spatial intelligence at the interface of (spatial) language, (spatial) logic and (spatial) cognition research. With a primary focus on explainable sensemaking of dynamic visuospatial imagery, we highlight the (systematic and modular) integration of methods from knowledge representation and reasoning, computer vision, spatial informatics, and computational cognitive modelling. A key emphasis here is on generalised (declarative) neurosymbolic reasoning &amp; learning about space, motion, actions, and events relevant to embodied multimodal interaction under ecologically valid naturalistic settings in everyday life. Practically, this translates to general-purpose mechanisms for computational visual commonsense encompassing capabilities such as (neurosymbolic) semantic question-answering, relational spatio-temporal learning, visual abduction etc. The presented work is motivated by and demonstrated in the applied backdrop of areas as diverse as autonomous driving, cognitive robotics, design of digital visuoauditory media, and behavioural visual perception research in cognitive psychology and neuroscience. More broadly, our emerging work is driven by an interdisciplinary research mindset addressing human-centred responsible AI through a methodological confluence of AI, Vision, Psychology, and (human-factors centred) Interaction Design.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Motivation</title>
      <p>
        Multimodality in embodied interaction is an inherent aspect
of human activity, be it in social, professional, or
everyday mundane contexts. Next-generation human-centred
AI technologies, operating in such contextualised
everyday settings, will require an inherent foundational capacity
to “make sense” of —e.g., perceive, understand, explain,
anticipate— everyday, naturalistic interactional
multimodality. This is essential for successfully achieving
technology-mediated (“human-in-the-loop”) collaborative
assistance, as well as for ensuring compliance with emerging
human-centred ethical and legal requirements, performance
benchmarks, and inclusive usability expectations. It is
therefore crucial that the foundational building blocks of such
next-generation systems be semantically aligned with the
descriptive, analytical, and explanatory characteristics and
complexity of human task conceptualisation and
performance. Against this
backdrop, we define artificial visual intelligence [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] as:
» The computational capability to
semantically process and interpret diverse forms
of visual stimuli (typically, but not
necessarily) emanating from sensing embodied
multimodal interactions of / amongst humans
and other artefacts in diverse naturalistic
situations of everyday life and work.
      </p>
      <p>Within the scope of artificial visual intelligence is a
wide spectrum of high-level human-centred sensemaking
capabilities. These encompass operational functions
such as:
• Visuospatial conception formation,
commonsense/qualitative generalisation, analogical
inference;
• Hypothetical reasoning, argumentation,
explanation, counterfactual reasoning;
• Event based episodic maintenance &amp; retrieval for
perceptual narrativisation.</p>
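      <p>
        As a toy illustration of commonsense/qualitative generalisation, quantitative regions can be abstracted into qualitative topological relations in the spirit of RCC. The following is a hypothetical Python sketch (the function name and the grid-cell simplification are ours, not the cited framework's):

```python
# Hypothetical sketch: qualitative abstraction of regions into
# RCC-style topological relations. Regions are simplified to
# sets of occupied grid cells; real systems operate on geometry.

def topo_relation(a, b):
    """Return a coarse RCC-style relation between cell sets a and b."""
    if a == b:
        return "EQ"    # identical regions
    if a.isdisjoint(b):
        return "DC"    # disconnected
    if a.issubset(b):
        return "PP"    # a is a proper part of b
    if b.issubset(a):
        return "PPi"   # b is a proper part of a
    return "PO"        # partial overlap

car    = {(3, 1), (4, 1), (5, 1)}
person = {(5, 1), (6, 1)}
print(topo_relation(car, person))  # PO
```

        Whatever the underlying geometry, the point is that declarative reasoning is then carried out over such a qualitative vocabulary (DC, PO, PP, EQ) rather than raw coordinates.
      </p>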
      <p>The afore enumeration is by no means exhaustive: in
essence, in scope of artificial visual intelligence are diverse
high-level cognitive visuospatial sensemaking
capabilities —be it mundane, analytical, or creative— that humans
acquire developmentally or through specialised training,
and are routinely adept at performing seamlessly in their
everyday life and work (e.g., driving a vehicle, tracking
moving objects, navigating a crowded urban environment,
engaging in sports, interpreting subtle cues in everyday
people-communication from visual / gestural and auditory
signals).</p>
      <p>
        Our central focus is on the development of general,
domain-independent methods that may be seamlessly
integrated as part of hybrid computational cognitive systems,
or even within computational cognitive models / cognitive
architectures [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. We also contextualise and demonstrate our methods
against the backdrop of applications in autonomous driving,
cognitive robotics, visuoauditory media design, and cognitive
psychology (e.g. [
        <xref ref-type="bibr" rid="ref3 ref4 ref6 ref7">3, 4, 5, 6</xref>
        ], [
        <xref ref-type="bibr" rid="ref8 ref9">7, 8</xref>
        ] ). Through applied
case studies, we provide a systematic model and general
methodology showcasing the integration of diverse, multi-faceted
AI methods pertaining to Knowledge Representation and
Reasoning, Computer Vision, Machine Learning, and Visual
Perception towards realising practical, human-centred,
computational visual intelligence.
      </p>
      <p>
        Foundationally, this builds on work in (dynamic)
spatial systems [
        <xref ref-type="bibr" rid="ref11">10</xref>
        ] where integrated reasoning about
action and change [
        <xref ref-type="bibr" rid="ref12 ref13">11, 12</xref>
        ] is involved.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Neurosymbolic Visual Commonsense: Integrated Reasoning and Learning about Space, Motion, and Inter(A)ction</title>
      <p>In the present status quo, our research in (computational)
neurosymbolic visual commonsense categorically addresses
three key questions:</p>
      <p>
        I. What kind of (relational) abstraction mechanisms
are needed to computationally “make sense” of
embodied multimodal interaction?
II. How can (and why should) abstraction mechanisms
(such as in I) be founded on behaviourally
established cognitive human-factors emanating from
naturalistic empirical observation in real-world applied
contexts?
III. How can behaviourally established
abstraction mechanisms, preferences, etc., be articulated as formal
declarative models suited for computational modelling
aimed at operational “sensemaking” (encompassing
capabilities such as abduction, relational learning,
counterfactual inference)?
Present work is particularly aimed at developing general
methods for the semantic interpretation of (multimodal)
dynamic visuospatial imagery with an emphasis on the ability
to neurosymbolically perform abstraction, reasoning, and
learning with cognitively rooted structured
characterisations of commonsense knowledge pertaining to space and
motion. Here, we specifically emphasise:
• General foundational commonsense abstractions of
space, time, and motion needed for representation
mediated (grounded) reasoning and learning with
dynamic visuospatial stimuli (e.g., emanating from
multimodal human behavioural signals in
modalities such as RGB(D), video, audio, eye-tracking and
possibly even bio signals [
        <xref ref-type="bibr" rid="ref10">9</xref>
        ]);
• Deep (visuospatial) semantics, entailing
systematically formalised declarative (neurosymbolic)
reasoning and learning with aspects pertaining to
space, space-time, motion, actions &amp; events,
spatiolinguistic conceptual knowledge. Here, it is of the
essence that an expressive ontology consisting of,
for instance, space, time, space-time motion
primitives as first-class ‘neurosymbolic’ objects is
accessible within the (declarative) programming paradigm
under consideration; and
• Explainable models of computational visuospatial
commonsense based on a systematic integration of
symbolic/relational methods on the one hand, and
neural techniques aimed at low level quantitative
(e.g., visual) data processing on the other;
At a higher level of abstraction, deep (visuospatial)
semantics (or deep semantics for short) entails inherent
support for tackling a range of challenges concerning
epistemological and phenomenological aspects relevant to
dynamic visuospatial imagery:
• interpolation and projection of missing
information, e.g., what could be hypothesised about missing
information (e.g., moments of occlusion [
        <xref ref-type="bibr" rid="ref14">13</xref>
        ]); how
can this hypothesis support planning an immediate
next step?
• object identity maintenance at a semantic level,
e.g., in the presence of occlusions, missing and noisy
quantitative data, errors in detection and tracking;
• ability to make default assumptions, e.g.,
pertaining to persistence of objects and/or object attributes;
• maintaining consistent beliefs respecting
(domain-neutral) commonsense criteria, e.g., related to
compositionality &amp; indirect effects, space-time
continuity, positional changes resulting from motion;
• inferring / computing counterfactuals [
        <xref ref-type="bibr" rid="ref15">14</xref>
        ], in a
manner akin to the human cognitive ability to perform
mental simulation for purposes of introspection
about the past or anticipation of the future, or
performing “what-if” reasoning tasks, etc.
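A caricature of the default-assumption ability above, given as a
hypothetical Python sketch (the track representation and all names are
illustrative; the cited works realise such persistence reasoning
declaratively, e.g. via ASP-based abduction):

```python
# Hypothetical sketch: default persistence under occlusion.
# If an object is observed before and after a gap in its track,
# hypothesise that it persisted (e.g. occluded) in between.

def interpolate_track(track):
    """track maps frame number to an observed position, or None."""
    frames = sorted(track)
    completed = dict(track)
    for i, f in enumerate(frames):
        if track[f] is None:
            seen_before = [g for g in frames[:i] if track[g] is not None]
            seen_after  = [g for g in frames[i + 1:] if track[g] is not None]
            if seen_before and seen_after:
                # default assumption: object persists at its
                # last observed position during the gap
                completed[f] = ("occluded", track[seen_before[-1]])
    return completed

track = {1: (10, 4), 2: None, 3: (14, 4)}
print(interpolate_track(track)[2])  # ('occluded', (10, 4))
```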
We particularly emphasise the abilities to abstract, learn,
and reason with cognitively rooted structured
characterisations of commonsense knowledge about space and motion,
encompassing visuospatial question-answering, abduction,
and relational learning:
I. Visuospatial Question-Answering. Focus is on a
computational framework for semantic question-answering
with video and eye-tracking data, founded in constraint logic
programming; we also demonstrate an application in
cognitive film &amp; media studies, where human perception of films
vis-à-vis cinematographic devices is of interest.
» [
        <xref ref-type="bibr" rid="ref4 ref7 ref8 ref9">4, 6, 7, 8</xref>
        ]
II. Visuospatial Abduction. Focus is on a hybrid
architecture for systematically computing robust visual
explanation(s) encompassing hypothesis formation, belief revision,
and default reasoning with video data (for active vision
for autonomous driving, as well as for offline processing).
The architecture supports visual abduction with space-time
histories as native entities, founded in (functional)
answer set programming based spatial reasoning.
» [
        <xref ref-type="bibr" rid="ref14 ref16 ref3">3, 13, 15</xref>
        ][
        <xref ref-type="bibr" rid="ref17 ref18">16, 17</xref>
        ]
III. Relational Visuospatial Learning. Focus is on a
general framework and pipeline for: relational spatio-temporal
(inductive) learning with an elaborate ontology supporting
a range of space-time features; and generating semantic,
(declaratively) explainable interpretation models in a
neurosymbolic pipeline demonstrated for the case of analysing
visuospatial symmetry in visual art.
» [
        <xref ref-type="bibr" rid="ref19">18</xref>
        ][
        <xref ref-type="bibr" rid="ref6">5</xref>
        ][
        <xref ref-type="bibr" rid="ref20">19</xref>
        ]
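To hint at the flavour of relational hypothesis search, consider the
following hypothetical toy in Python (all names are illustrative;
actual ILP-based relational learning in the cited works operates over
logical clauses, not a scoring stub like this):

```python
# Hypothetical toy: pick the qualitative relation that best
# separates positive from negative scene examples, a crude
# stand-in for relational (ILP-style) hypothesis search.

def covers(relation, scene):
    return relation in scene["relations"]

def best_hypothesis(candidates, positives, negatives):
    def score(rel):
        hits = sum(covers(rel, s) for s in positives)
        misses = sum(covers(rel, s) for s in negatives)
        return hits - misses
    return max(candidates, key=score)

positives = [{"relations": {"touching", "above"}},
             {"relations": {"touching", "left_of"}}]
negatives = [{"relations": {"above"}}]
print(best_hypothesis({"touching", "above", "left_of"},
                      positives, negatives))  # touching
```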
Formal semantics and computational models of deep
semantics manifest themselves as neurosymbolic spatio-temporal
extensions of established declarative AI frameworks such
as Constraint Logic Programming (CLP) [
        <xref ref-type="bibr" rid="ref21">20</xref>
        ], Inductive
Logic Programming (ILP) [
        <xref ref-type="bibr" rid="ref22">21</xref>
        ], and Answer Set
Programming (ASP) [
        <xref ref-type="bibr" rid="ref23">22</xref>
        ]. The more foundational aspects pertaining to
declarative spatial reasoning (built on top of CLP, ILP, ASP),
independent of its relationship to cognitive vision research,
may be consulted in [
        <xref ref-type="bibr" rid="ref24">23</xref>
        ], [
        <xref ref-type="bibr" rid="ref17 ref25">16, 24</xref>
        ], [
        <xref ref-type="bibr" rid="ref19">18</xref>
        ].
      </p>
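      <p>
        The declarative flavour of such frameworks can be hinted at with a toy fixpoint computation (a hypothetical Python caricature; CLP/ASP systems express this directly as logic programs): deriving the transitive closure of a qualitative relation such as left_of.

```python
# Hypothetical caricature of declarative spatial Q/A:
# derive the transitive closure of a qualitative relation
# (left_of) by naive fixpoint iteration, Datalog-style.

facts = {("cup", "book"), ("book", "laptop")}  # left_of/2 base facts

def closure(rel):
    derived = set(rel)
    while True:
        new = {(a, d) for (a, b) in derived for (c, d) in derived if b == c}
        if new.issubset(derived):
            return derived
        derived |= new

left_of = closure(facts)
# query: is the cup left of the laptop?
print(("cup", "laptop") in left_of)  # True
```

        In an actual CLP or ASP encoding, this closure is a two-line rule; the point here is only the declarative reading: the answers to a query are whatever follows from the stated facts and rules.
      </p>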
    </sec>
    <sec id="sec-6">
      <title>3. Discussion</title>
      <p>The vision that drives our scientific methodology is:
» To shape the nature and character of
(machine-based) artificial visual intelligence
with respect to human-centred cognitive
considerations, demonstrating an exemplar
for developing, applying, and disseminating
such methods in socio-technologically
relevant application areas where:
(a) embodied (multimodal) human
interaction is inherent;
(b) human-in-the-loop collaborative work is
of the essence; and
(c) normative ethico-legal compliance based
on regulatory requirements and
human-factors driven inclusive or universal design
criteria is to be ensured.</p>
      <p>Towards realising this vision, we adopt an interdisciplinary
approach –at the confluence of Cognition, AI, Interaction,
and Design– which we deem necessary to better
appreciate the complexity and spectrum of varied human-centred
challenges for the design and (usable) implementation of
(explainable) artificial visual intelligence solutions in diverse
human-system interaction contexts.</p>
      <p>
        One of the key technical driving forces in our work is that of
“representation mediated multimodal sensemaking”.
In essence, we consider (neurosymbolic) representation
mediated grounding as being significant in semiotic
construction, e.g., enabling high-level meaning-making. This view
stems from the long-established value of “grounding” in
Artificial Intelligence and related disciplines [
        <xref ref-type="bibr" rid="ref26">25</xref>
        ]. Our
research advances the theoretical, methodological, and
applied understanding of “grounded representation” mediated
multimodal sensemaking of embodied human interaction
at the interface of spatial language, spatial logic, and spatial
cognition. In our view, the significance of this form of
(neurosymbolic) grounding must now be reiterated, re-asserted
even, in view of recent advances in neural machine learning
and the well-recognised “explainability” and
“interpretability” requirements from the viewpoint of human-centred AI
[
        <xref ref-type="bibr" rid="ref27 ref28 ref29">26, 27, 28</xref>
        ]. We believe that research in knowledge
representation and reasoning (KR) has, since its inception, concerned
itself with the “hard” problem of semantics, emphasising
explainability, formal verification and diagnosis,
elaboration tolerance amongst other things. Research in KR, and
more broadly in symbolic AI and semantics, and their role
and contribution towards large-scale hybrid
“human-in-the-loop” intelligence is of even greater significance now than
ever before, given the tremendous synergistic opportunities
afforded by the widely demonstrated power of deep learning
driven techniques in computer vision (and beyond). The
onus now, we posit, is on KR research to drive itself towards
developing methods that can seamlessly integrate (and be
“usable”) with other kinds of AI methods, be it data-centric
neural learning techniques or otherwise.
      </p>
      <p>In this invited position statement, we have attempted to
summarise our mindset and ongoing work in the CoDesign
Lab towards:
» Establishing a human-centric foundation
and roadmap for the development of
neurosymbolically grounded inference about
embodied multimodal interaction as
identifiable in a range of real-world application
contexts.</p>
      <p>
        This summary is not meant to be a comprehensive literature
review; this may be obtained through the cited works. For
key technical details and to obtain a summary of open
directions, we direct interested readers to select publications
as follows: a compact starting point may be obtained via
the comprehensive summary in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], or through the
shorter/focussed components in [
        <xref ref-type="bibr" rid="ref14 ref16 ref3 ref4 ref6">15, 5, 4, 13, 3</xref>
        ]. Longer summaries
in the form of (recent) doctoral dissertations are available
in [
        <xref ref-type="bibr" rid="ref30">29</xref>
        ] and [
        <xref ref-type="bibr" rid="ref31 ref32">30, 31</xref>
        ].
      </p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>We acknowledge funding by the Swedish Research Council
(VR - Vetenskapsrådet) - https://www.vr.se, and the Swedish
Foundation for Strategic Research (SSF – Stiftelsen för
Strategisk Forskning) - https://strategiska.se. Previously,
this research has been supported by the German Research
Foundation (DFG – Deutsche Forschungsgemeinschaft)
https://www.dfg.de.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M.</given-names>
            <surname>Bhatt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Suchan</surname>
          </string-name>
          ,
          <article-title>Artificial visual intelligence: Perceptual commonsense for human-centred cognitive technologies</article-title>
          ,
          <source>in: Human-Centered Artificial Intelligence: Advanced Lectures</source>
          , Springer-Verlag, Berlin, Heidelberg,
          <year>2023</year>
          , pp.
          <fpage>216</fpage>
          -
          <lpage>242</lpage>
          . URL: https://doi.org/10.1007/978-3-031-24349-3_12. doi:10.1007/978-3-031-24349-3_12.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>S.</given-names>
            <surname>Jones</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Laird</surname>
          </string-name>
          ,
          <article-title>Anticipatory thinking in cognitive architectures with event cognition mechanisms</article-title>
          , in:
          <string-name>
            <given-names>A.</given-names>
            <surname>Amos-Binks</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dannenhauer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. E.</given-names>
            <surname>Cardona-Rivera</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. A.</given-names>
            <surname>Brewer</surname>
          </string-name>
          (Eds.),
          <source>Short Paper Proc. of Workshop on Cognitive Systems for Anticipatory Thinking (COGSAT</source>
          <year>2019</year>
          ), AAAI Fall Symp., volume
          <volume>2558</volume>
          ,
          <year>2019</year>
          . URL: http://ceur-ws.org/Vol-2558/short1.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.</given-names>
            <surname>Suchan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bhatt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Varadarajan</surname>
          </string-name>
          ,
          <article-title>Commonsense visual sensemaking for autonomous driving - on generalised neurosymbolic online abduction integrating vision and semantics</article-title>
          , Artif. Intell.
          <volume>299</volume>
          (
          <year>2021</year>
          )
          <fpage>103522</fpage>
          . URL: https://doi.org/10.1016/j.artint.2021.103522. doi:10.1016/j.artint.2021.103522.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J.</given-names>
            <surname>Suchan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bhatt</surname>
          </string-name>
          ,
          <article-title>Semantic question-answering with video and eye-tracking data: AI foundations for human visual perception driven cognitive film studies</article-title>
          , in: S. Kambhampati (Ed.),
          <source>Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, IJCAI</source>
          <year>2016</year>
          , New York, NY, USA,
          9-15 July 2016
          , IJCAI/AAAI Press,
          <year>2016</year>
          , pp.
          <fpage>2633</fpage>
          -
          <lpage>2639</lpage>
          . URL: http://www.ijcai.org/Abstract/16/374.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J.</given-names>
            <surname>Suchan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bhatt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Varadarajan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. A.</given-names>
            <surname>Amirshahi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <article-title>Semantic Analysis of (Reflectional) Visual Symmetry: A Human-Centred Computational Model for Declarative Explainability</article-title>
          ,
          <source>Advances in Cognitive Systems</source>
          <volume>6</volume>
          (
          <year>2018</year>
          )
          <fpage>65</fpage>
          -
          <lpage>84</lpage>
          . URL: http://www.cogsys.org/journal.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J.</given-names>
            <surname>Suchan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bhatt</surname>
          </string-name>
          ,
          <article-title>The geometry of a scene: On deep semantics for visual perception driven cognitive film studies</article-title>
          , in: 2016
          <source>IEEE Winter Conference on Applications of Computer Vision</source>
          , WACV 2016, Lake Placid, NY, USA, March 7-10,
          <year>2016</year>
          , IEEE Computer Society,
          <year>2016</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>9</lpage>
          . URL: https://doi.org/10.1109/WACV.2016.7477712. doi:10.1109/WACV.2016.7477712.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.</given-names>
            <surname>Suchan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bhatt</surname>
          </string-name>
          ,
          <article-title>Deep Semantic Abstractions of Everyday Human Activities: On Commonsense Representations of Human Interactions</article-title>
          , in: ROBOT 2017: Third Iberian Robotics Conference,
          <source>Advances in Intelligent Systems and Computing 693</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M.</given-names>
            <surname>Spranger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Suchan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bhatt</surname>
          </string-name>
          ,
          <article-title>Robust Natural Language Processing - Combining Reasoning, Cognitive Semantics and Construction Grammar for Spatial Language</article-title>
          ,
          <source>in: IJCAI 2016: 25th International Joint Conference on Artificial Intelligence</source>
          , AAAI Press,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M.</given-names>
            <surname>Bhatt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kersting</surname>
          </string-name>
          ,
          <article-title>Semantic interpretation of multimodal human-behaviour data - making sense of events, activities, processes</article-title>
          , Künstliche Intell.
          <volume>31</volume>
          (
          <year>2017</year>
          )
          <fpage>317</fpage>
          -
          <lpage>320</lpage>
          . URL: https://doi.org/10.1007/s13218-017-0511-y. doi:10.1007/s13218-017-0511-y.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>M.</given-names>
            <surname>Bhatt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. W.</given-names>
            <surname>Loke</surname>
          </string-name>
          ,
          <article-title>Modelling dynamic spatial systems in the situation calculus</article-title>
          ,
          <source>Spatial Cognition &amp; Computation</source>
          <volume>8</volume>
          (
          <year>2008</year>
          )
          <fpage>86</fpage>
          -
          <lpage>130</lpage>
          . URL: https://doi.org/10.1080/13875860801926884. doi:10.1080/13875860801926884.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M.</given-names>
            <surname>Bhatt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. W.</given-names>
            <surname>Guesgen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Wölfl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Hazarika</surname>
          </string-name>
          ,
          <article-title>Qualitative spatial and temporal reasoning: Emerging applications, trends, and directions</article-title>
          ,
          <source>Spatial Cognition &amp; Computation</source>
          <volume>11</volume>
          (
          <year>2011</year>
          )
          <fpage>1</fpage>
          -
          <lpage>14</lpage>
          . URL: https://doi.org/10.1080/13875868.2010.548568. doi:10.1080/13875868.2010.548568.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>M.</given-names>
            <surname>Bhatt</surname>
          </string-name>
          ,
          <article-title>Reasoning about space, actions and change: A paradigm for applications of spatial reasoning</article-title>
          ,
          <source>in: Qualitative Spatial Representation and Reasoning: Trends and Future Directions</source>
          , IGI Global, USA,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>J.</given-names>
            <surname>Suchan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bhatt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Varadarajan</surname>
          </string-name>
          ,
          <article-title>Out of sight but not out of mind: An answer set programming based online abduction framework for visual sensemaking in autonomous driving</article-title>
          , in: S. Kraus (Ed.),
          <source>Proc. of 28th Intnl. Joint Conference on Artificial Intelligence, IJCAI</source>
          <year>2019</year>
          , pp.
          <fpage>1879</fpage>
          -
          <lpage>1885</lpage>
          . doi:10.24963/ijcai.2019/260.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>R.</given-names>
            <surname>Byrne</surname>
          </string-name>
          ,
          <article-title>Counterfactual thought</article-title>
          ,
          <source>Annual Review of Psychology</source>
          <volume>67</volume>
          (
          <year>2016</year>
          )
          <fpage>135</fpage>
          -
          <lpage>157</lpage>
          . URL: https: //doi.org/10.1146/annurev-psych-
          <volume>122414</volume>
          -
          <fpage>033249</fpage>
          . doi:
          <volume>10</volume>
          .1146/annurev-psych-
          <volume>122414</volume>
          -
          <fpage>033249</fpage>
          . pMID:
          <fpage>26393873</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>[15] <string-name><given-names>J.</given-names> <surname>Suchan</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Bhatt</surname></string-name>, <string-name><given-names>P. A.</given-names> <surname>Walega</surname></string-name>, <string-name><given-names>C. P. L.</given-names> <surname>Schultz</surname></string-name>, <article-title>Visual explanation by high-level abduction: On answer set programming driven reasoning about moving objects</article-title>, in: <source>32nd AAAI Conference on Artificial Intelligence (AAAI-18)</source>, AAAI Press, USA, <year>2018</year>, pp. <fpage>1965</fpage>-<lpage>1972</lpage>.</mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>[16] <string-name><given-names>P. A.</given-names> <surname>Walega</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Bhatt</surname></string-name>, <string-name><given-names>C. P. L.</given-names> <surname>Schultz</surname></string-name>, <article-title>ASPMT(QS): Non-monotonic spatial reasoning with answer set programming modulo theories</article-title>, in: F. Calimeri, G. Ianni, M. Truszczynski (Eds.), <source>Logic Programming and Nonmonotonic Reasoning - 13th International Conference, LPNMR 2015</source>, Lexington, KY, USA, September 27-30, 2015, Proceedings, volume <volume>9345</volume> of Lecture Notes in Computer Science, Springer, <year>2015</year>, pp. <fpage>488</fpage>-<lpage>501</lpage>. URL: https://doi.org/10.1007/978-3-319-23264-5_41. doi:10.1007/978-3-319-23264-5_41.</mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>[17] <string-name><given-names>P. A.</given-names> <surname>Walega</surname></string-name>, <string-name><given-names>C. P. L.</given-names> <surname>Schultz</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Bhatt</surname></string-name>, <article-title>Nonmonotonic spatial reasoning with answer set programming modulo theories</article-title>, <source>Theory Pract. Log. Program.</source> <volume>17</volume> (<year>2017</year>) <fpage>205</fpage>-<lpage>225</lpage>. URL: https://doi.org/10.1017/S1471068416000193. doi:10.1017/S1471068416000193.</mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>[18] <string-name><given-names>J.</given-names> <surname>Suchan</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Bhatt</surname></string-name>, <string-name><given-names>C. P. L.</given-names> <surname>Schultz</surname></string-name>, <article-title>Deeply semantic inductive spatio-temporal learning</article-title>, in: J. Cussens, A. Russo (Eds.), <source>Proceedings of the 26th International Conference on Inductive Logic Programming (Short Papers)</source>, London, UK, volume <volume>1865</volume>, CEUR-WS.org, <year>2016</year>, pp. <fpage>73</fpage>-<lpage>80</lpage>.</mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>[19] <string-name><given-names>K. S. R.</given-names> <surname>Dubba</surname></string-name>, <string-name><given-names>A. G.</given-names> <surname>Cohn</surname></string-name>, <string-name><given-names>D. C.</given-names> <surname>Hogg</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Bhatt</surname></string-name>, <string-name><given-names>F.</given-names> <surname>Dylla</surname></string-name>, <article-title>Learning Relational Event Models from Video</article-title>, <source>J. Artif. Intell. Res. (JAIR)</source> <volume>53</volume> (<year>2015</year>) <fpage>41</fpage>-<lpage>90</lpage>. URL: http://dx.doi.org/10.1613/jair.4395. doi:10.1613/jair.4395.</mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>[20] <string-name><given-names>J.</given-names> <surname>Jaffar</surname></string-name>, <string-name><given-names>M. J.</given-names> <surname>Maher</surname></string-name>, <article-title>Constraint logic programming: A survey</article-title>, <source>The Journal of Logic Programming</source> <volume>19</volume> (<year>1994</year>) <fpage>503</fpage>-<lpage>581</lpage>.</mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>[21] <string-name><given-names>S.</given-names> <surname>Muggleton</surname></string-name>, <string-name><given-names>L.</given-names> <surname>De Raedt</surname></string-name>, <article-title>Inductive logic programming: Theory and methods</article-title>, <source>Journal of Logic Programming</source> <volume>19</volume> (<year>1994</year>) <fpage>629</fpage>-<lpage>679</lpage>.</mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>[22] <string-name><given-names>G.</given-names> <surname>Brewka</surname></string-name>, <string-name><given-names>T.</given-names> <surname>Eiter</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Truszczyński</surname></string-name>, <article-title>Answer set programming at a glance</article-title>, <source>Commun. ACM</source> <volume>54</volume> (<year>2011</year>) <fpage>92</fpage>-<lpage>103</lpage>. doi:10.1145/2043174.2043195.</mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>[23] <string-name><given-names>M.</given-names> <surname>Bhatt</surname></string-name>, <string-name><given-names>J. H.</given-names> <surname>Lee</surname></string-name>, <string-name><given-names>C. P. L.</given-names> <surname>Schultz</surname></string-name>, <article-title>CLP(QS): A declarative spatial reasoning framework</article-title>, in: M. J. Egenhofer, N. A. Giudice, R. Moratz, M. F. Worboys (Eds.), <source>Spatial Information Theory - 10th International Conference, COSIT 2011</source>, Belfast, ME, USA, September 12-16, 2011, Proceedings, volume <volume>6899</volume> of Lecture Notes in Computer Science, Springer, <year>2011</year>, pp. <fpage>210</fpage>-<lpage>230</lpage>. URL: https://doi.org/10.1007/978-3-642-23196-4_12. doi:10.1007/978-3-642-23196-4_12.</mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>[24] <string-name><given-names>C. P. L.</given-names> <surname>Schultz</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Bhatt</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Suchan</surname></string-name>, <string-name><given-names>P. A.</given-names> <surname>Walega</surname></string-name>, <article-title>Answer Set Programming Modulo Space-Time</article-title>, in: C. Benzmüller, F. Ricca, X. Parent, D. Roman (Eds.), <source>Rules and Reasoning - Second International Joint Conference, RuleML+RR 2018</source>, Luxembourg, September 18-21, 2018, Proceedings, volume <volume>11092</volume> of Lecture Notes in Computer Science, Springer, <year>2018</year>, pp. <fpage>318</fpage>-<lpage>326</lpage>. URL: https://doi.org/10.1007/978-3-319-99906-7_24. doi:10.1007/978-3-319-99906-7_24.</mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>[25] <string-name><given-names>S.</given-names> <surname>Harnad</surname></string-name>, <article-title>The symbol grounding problem</article-title>, <source>Physica D</source> <volume>42</volume> (<year>1990</year>) <fpage>335</fpage>-<lpage>346</lpage>.</mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>[26] AI HLEG, <article-title>High-Level Expert Group on Artificial Intelligence: Ethics Guidelines for Trustworthy AI</article-title>, <year>2019</year>. URL: https://www.aepd.es/sites/default/files/2019-12/ai-ethics-guidelines.pdf.</mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>[27] EU Commission, <source>Communication: Building Trust in Human-Centric Artificial Intelligence</source>, <year>2019</year>.</mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>[28] EU Commission, <article-title>Proposal for a Regulation of the European Parliament and of the Council laying down harmonised rules on artificial intelligence (Artificial Intelligence Act) and amending certain Union legislative acts</article-title>, <year>2021</year>. URL: https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:52021PC0206.</mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>[29] <string-name><given-names>J.</given-names> <surname>Suchan</surname></string-name>, <source>Declarative Reasoning about Space and Motion in Visual Imagery - Theoretical Foundations and Applications</source>, Ph.D. thesis, Universität Bremen, <year>2022</year>. URL: https://elib.dlr.de/188919/.</mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>[30] <string-name><given-names>V.</given-names> <surname>Kondyli</surname></string-name>, <source>Behavioural Principles for the Design of Human-Centred Cognitive Technologies: The Case of Visuo-Locomotive Experience</source>, Ph.D. thesis, Örebro University, School of Science and Technology, <year>2023</year>.</mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>[31] <string-name><given-names>V.</given-names> <surname>Nair</surname></string-name>, <source>The Observer Lens: Characterizing Visuospatial Features in Multimodal Interactions</source>, Ph.D. thesis, School of Informatics, Informatics Research Environment, <year>2024</year>.</mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>[32] <string-name><given-names>M.</given-names> <surname>Bhatt</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Suchan</surname></string-name>, <article-title>Cognitive vision and perception</article-title>, in: G. D. Giacomo, A. Catalá, B. Dilkina, M. Milano, S. Barro, A. Bugarín, J. Lang (Eds.), <source>ECAI 2020 - 24th European Conference on Artificial Intelligence</source>, Santiago de Compostela, Spain, August 29 - September 8, 2020, Including 10th Conference on Prestigious Applications of Artificial Intelligence (PAIS 2020), volume <volume>325</volume> of Frontiers in Artificial Intelligence and Applications, IOS Press, <year>2020</year>, pp. <fpage>2881</fpage>-<lpage>2882</lpage>. URL: https://doi.org/10.3233/FAIA200434. doi:10.3233/FAIA200434.</mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>[33] <string-name><given-names>V.</given-names> <surname>Kondyli</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Bhatt</surname></string-name>, <string-name><given-names>D.</given-names> <surname>Levin</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Suchan</surname></string-name>, <article-title>How do drivers mitigate the effects of naturalistic visual complexity? On attentional strategies and their implications under a change blindness protocol</article-title>, <source>Cognitive Research: Principles and Implications</source> <volume>8</volume> (<year>2023</year>). doi:10.1186/s41235-023-00501-1.</mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>[34] <string-name><given-names>K. S. R.</given-names> <surname>Dubba</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Bhatt</surname></string-name>, <string-name><given-names>F.</given-names> <surname>Dylla</surname></string-name>, <string-name><given-names>D. C.</given-names> <surname>Hogg</surname></string-name>, <string-name><given-names>A. G.</given-names> <surname>Cohn</surname></string-name>, <article-title>Interleaved inductive-abductive reasoning for learning complex event models</article-title>, in: S. H. Muggleton, A. Tamaddoni-Nezhad, F. A. Lisi (Eds.), <source>Inductive Logic Programming - 21st International Conference, ILP 2011</source>, Windsor Great Park, UK, July 31 - August 3, 2011, Revised Selected Papers, volume <volume>7207</volume> of Lecture Notes in Computer Science, Springer, <year>2011</year>, pp. <fpage>113</fpage>-<lpage>129</lpage>. URL: https://doi.org/10.1007/978-3-642-31951-8_14. doi:10.1007/978-3-642-31951-8_14.</mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>[35] <string-name><given-names>M.</given-names> <surname>Bhatt</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Suchan</surname></string-name>, <string-name><given-names>C.</given-names> <surname>Schultz</surname></string-name>, <article-title>Cognitive interpretation of everyday activities - toward perceptual narrative based visuo-spatial scene interpretation</article-title>, in: M. A. Finlayson, B. Fisseni, B. Löwe, J. C. Meister</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>