<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Wheels Within Wheels: A Causal Treatment of Image Schemas in An Embodied Storytelling System</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Philipp WICKE</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tony VEALE</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University College Dublin</institution>
          ,
          <addr-line>Belfield, Dublin 4</addr-line>
          ,
          <country country="IE">Ireland</country>
        </aff>
      </contrib-group>
      <abstract>
<p>This paper outlines a novel approach to grounding a computational storytelling system in an embodied robotic agent that is capable of making expressive gestures, and argues for why image schemas make the ideal glue for linking the causal structures of plot generation to the gestures of bodily expressiveness.</p>
      </abstract>
      <kwd-group>
        <kwd>image schema</kwd>
        <kwd>embodied cognition</kwd>
        <kwd>computational creativity</kwd>
        <kwd>storytelling</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
It has been argued that image schemas are a fundamental basis of how humans express
themselves, whether speaking of concrete or abstract ideas. Image schemas are
recurring cognitive structures that are shaped by our bodily interaction with the physical
environment. This view has been championed most prominently by
Johnson [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and Lakoff [
        <xref ref-type="bibr" rid="ref2">2</xref>
]. These definitions presuppose the theory of embodied
cognition, in which cognitive processes are understood as processes not only of the mind
and brain but also of the body and its environment, challenging the doctrine of pure
computationalism [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. As image schemas impose meaningful structures on our bodily
movements through space and time, they can be observed not just in our spoken language
but in our body language too, i.e. gestures [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ][
        <xref ref-type="bibr" rid="ref5">5</xref>
        ][
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>
This extended abstract proposes an approach to utilizing image schemas in an
embodied story-generation and story-telling system. As such, the domains of Cognitive
Linguistics/Semantics and Computational Creativity overlap in a framework that aims to lift
the creative processes of stories out of their textual container and into an embodied NAO
robot. Building on our interactive storytelling framework [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ][
        <xref ref-type="bibr" rid="ref8">8</xref>
] we investigate the use
of image schemas as a cognitive framework for a storytelling robot. For this
investigation we have correlated a set of nine prominent image schemas with more than 800 story
actions (plot verbs) from the storytelling system. Prior to an empirical evaluation of
gestures based on the image schemas incorporated into the system, this paper presents a
preliminary study that examines the database created to combine gestures,
schemas and story actions. This examination lays the groundwork for a consideration of
future work and expected progress on the proposed framework.
      </p>
    </sec>
    <sec id="sec-2">
<title>2. Image Schemas for Robotic Agents</title>
      <p>
Our goal is to build an engaging robotic story-teller that exploits its physical presence
to lend embodiment to the symbolic narratives of an automated story-telling system. We
consider here the use of image schemas to enrich the representation of story actions and
their corresponding gestures from earlier versions of the work [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ][
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. By giving the system
an image-schematic understanding of its own actions and gestures, we hypothesize that
it will be better positioned to choose apt gestures in context, or to mold new stories
around these gestures (as opposed to molding gestures around stories). Image schemas
are powerful abstractions that capture the spatial/temporal logic of an action, as shown
in the following instantiations of the outward motion:
      </p>
      <p>
        John went out of the room. Pump out the air. Let out your anger. Pick out the best
theory. Drown out the music.
(Selected examples from Johnson [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] p. 32 - Figure 4)
      </p>
      <p>We choose our core set of image schemas to suit the gestural capabilities of our
robot, a humanoid NAO robot from the French company Aldebaran Robotics (recently
acquired by SoftBank Robotics). This bipedal and bimanual robot, released in 2008, has
25 degrees of freedom and can move its arms in ways that accommodate many human
gestures. We do not need many image schemas to express a wide array of different
actions, so we choose only those that match the capabilities of the robot. Consequently,
we propose the selection of schemas outlined in Table 1.</p>
      <table-wrap id="tab1">
        <label>Table 1</label>
        <caption>
          <p>The core set of image schemas and their associated gestures.</p>
        </caption>
        <table>
          <thead>
            <tr>
              <th>Image Schema</th>
              <th>(Direction) Gesture</th>
              <th>(Inverse Direction) Gesture</th>
            </tr>
          </thead>
          <tbody>
            <tr><td>UP-DOWN</td><td>(UP) Both arms up</td><td>(DOWN) Both arms/torso lower</td></tr>
            <tr><td>FRONT-BACK</td><td>(FRONT) Both arms stretching to front</td><td>(BACK) Both arms moving back</td></tr>
            <tr><td>CENTRE-PERIPHERY</td><td>(CENTER) Hands unite from sides</td><td>(PERIP.) Hands depart from centre</td></tr>
            <tr><td>CONTACT</td><td>(SAME) Both hands touch in front</td><td>(SAME) Both hands touch in front</td></tr>
            <tr><td>IN-OUT</td><td>(IN) arm-1 encloses, arm-2 reaches in</td><td>(OUT) arm-1 encloses, arm-2 reaches out</td></tr>
            <tr><td>SURFACE</td><td>(SAME) Both hands circle horizontally</td><td>(SAME) Both hands circle horizontally</td></tr>
            <tr><td>CYCLE</td><td>(SAME) Both hands circle vertically</td><td>(SAME) Both hands circle vertically</td></tr>
            <tr><td>S-P-G</td><td>(SAME) arm-1 moves vertically 3 times</td><td>(SAME) arm-1 moves vertically 3 times</td></tr>
            <tr><td>NEAR-FAR</td><td>(NEAR) arm-1 fixed front, arm-2 unites</td><td>(FAR) arm-1 fixed front, arm-2 departs</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>
Table 1 provides an overview of nine image schemas. Of these, five are sensitive
to direction, meaning that a corresponding inverse gesture along the opposite direction
also exists. For example, for the UP-DOWN schema both UP and DOWN have their own
distinct gestures, though each can be seen as a directional inflection of the other. In all,
then, the nine schemas give rise to 14 distinct gestures. The association between gesture
and image schema builds on the evidence provided in other studies [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ][
        <xref ref-type="bibr" rid="ref5">5</xref>
        ][
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and its
validity for our framework will be the subject of further investigation. See Figure 2 for an
overview of the proposed improved framework. To connect these image schemas to the
storytelling system, we next consider the database of actions on which the storytelling
system is built.
      </p>
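<p>As a concrete illustration, the schema-to-gesture mapping of Table 1 can be encoded as a lookup in which direction-neutral schemas reuse one gesture for both poles; all identifiers below are hypothetical sketches, not the system's actual gesture names.</p>

```python
# Hypothetical encoding of Table 1: each schema maps to a (gesture,
# inverse_gesture) pair; direction-neutral schemas reuse the same gesture.
SCHEMA_GESTURES = {
    "UP-DOWN":          ("both_arms_up", "both_arms_lower"),
    "FRONT-BACK":       ("arms_stretch_front", "arms_move_back"),
    "CENTRE-PERIPHERY": ("hands_unite", "hands_depart"),
    "CONTACT":          ("hands_touch_front", "hands_touch_front"),
    "IN-OUT":           ("arm_reaches_in", "arm_reaches_out"),
    "SURFACE":          ("hands_circle_horizontal", "hands_circle_horizontal"),
    "CYCLE":            ("hands_circle_vertical", "hands_circle_vertical"),
    "S-P-G":            ("arm_moves_vertical_3x", "arm_moves_vertical_3x"),
    "NEAR-FAR":         ("arm2_unites", "arm2_departs"),
}

# Five direction-sensitive pairs plus four shared gestures: 14 in total.
distinct_gestures = {g for pair in SCHEMA_GESTURES.values() for g in pair}
assert len(SCHEMA_GESTURES) == 9 and len(distinct_gestures) == 14
```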
    </sec>
    <sec id="sec-3">
      <title>3. Image Schemas in Creative Story-Generation and Story-telling</title>
      <p>
        We tie these schemas and their gestures to specific actions in the repertoire of the
story-generation system. We build upon the Scéalextric system of [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], since its approach is
explicitly action-centred. Each Scéalextric story is composed by chaining together
actions (chosen from a pool of 800 plot verbs) into a causal path (an explanation of this causal
chaining can be found in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]). Actions connect to subsequent actions via causal
connectives, such as so, then and but. A key part of Scéalextric is its large set of causal tuples,
which indicate how each of its 800 actions might result in another. In our previous work
[
        <xref ref-type="bibr" rid="ref8">8</xref>
], we tied the movements of the NAO gesture library to the actions of the Scéalextric
causal graph, so that a path through the latter could be physically embellished (or, in
some cases, pantomimed) by the robot. This mapping of gestures to actions was done in
the absence of any mediating abstraction, and was thus opportunistically ad-hoc.
However, we can now use image schemas as the motivating abstraction to tie gestures to story
actions. Given the causal underpinnings of Scéalextric, this will subsequently allow us
to look for recurring causal patterns between the image schemas themselves.
      </p>
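<p>The causal chaining described above can be sketched as a walk over a tuple store; the tuples and function below are illustrative stand-ins, not Scéalextric's actual data or API.</p>

```python
import random

# Toy stand-ins for Scéalextric's causal tuples: each entry says that one
# plot verb can lead to another via a causal connective (all names here are
# illustrative, not the system's actual knowledge base).
CAUSAL_TUPLES = [
    ("fell_in_love_with", "but later", "betrayed"),
    ("betrayed", "so", "was_shunned_by"),
    ("stood_up_for", "so", "was_impressed_by"),
    ("challenged", "and", "enchanted"),
]

def chain_story(start, length=3):
    """Greedily walk the causal graph, chaining actions via connectives."""
    story, action = [start], start
    for _ in range(length - 1):
        options = [(conn, nxt) for a, conn, nxt in CAUSAL_TUPLES if a == action]
        if not options:
            break  # dead end in the causal graph
        conn, action = random.choice(options)
        story += [conn, action]
    return story
```

A path such as `chain_story("fell_in_love_with")` then yields an action sequence interleaved with its connectives, which the robot can embellish gesture by gesture.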
      <p>Scéalextric uses the causal connectives but, but later, yet, until, and, then, because,
though and so to connect two successive story actions. It is self-evident that but and but
later form a group of connectives (hereafter the BUT-Group) reflecting how a successor
action happens contrary to its predecessor, e.g. "Faust fell in love with Gretchen, but later
he betrayed her." From the group of connectives then, because, though and so (hereafter the
SO-Group) we expect a strong causal continuation, e.g. "Mephisto stood up for Faust,
so Faust was impressed by Mephisto." With the last group of connectives yet, until and and
(hereafter the AND-Group) we neither expect nor are surprised by a given continuation,
e.g. "God challenged Mephisto the Devil and Mephisto the Devil enchanted Faust."
We can now analyze our mappings of image schemas to actions using the following
hypothesis:</p>
      <p>The continuity of story actions is reflected in the interconnection of image schemas
over the course of a story.</p>
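<p>The three connective groups described above can be encoded as a simple lookup; this is a sketch, and the storytelling system's own representation may differ.</p>

```python
# The BUT-, SO- and AND-Groups of causal connectives, as a lookup table.
CONNECTIVE_GROUPS = {
    "BUT": {"but", "but later"},
    "SO":  {"then", "because", "though", "so"},
    "AND": {"yet", "until", "and"},
}

def group_of(connective):
    """Return the group name (BUT, SO or AND) for a causal connective."""
    for name, members in CONNECTIVE_GROUPS.items():
        if connective in members:
            return name
    raise ValueError(f"unknown connective: {connective!r}")
```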
    </sec>
    <sec id="sec-4">
      <title>4. Analysis of Database</title>
      <p>
The first analysis asks how often the image schema continues across successive
actions from the storytelling system. For that, we investigate the continuation of the
directionality of the image schema through the course of a story, e.g. the UP schema repeatedly
following schemas related to direction away from the robot (FAR, PERIPHERY). We
expect this continuation of direction in stories of the SO-Group. The storytelling
system draws its causal connectives from a database that connects the 800+ actions in
more than 3,700 combinations with causal connectives, e.g. "trusted by BUT stalked by".
The first result divides the connectives into the aforementioned three groups and tests
whether the continuation holds for the image schema, e.g.: "Faust fell in love with Gretchen,
but later he betrayed her." Here, fell in love is mapped onto the schema "IN", because
falling in love is conventionally construed, via conceptual metaphor, as entering the abstract image
schema CONTAINMENT [
        <xref ref-type="bibr" rid="ref2">2</xref>
]. betrayed is mapped onto the schema "BACK" (following
the metaphorical stab in the back, or being attacked from behind). Importantly, this first
investigation simply looks for the continuation of the image schema irrespective of
direction (e.g., BACK and FRONT treated as one image schema). The difference between the
BUT-Group and the SO-Group is 5.84%, i.e. in the SO-Group just 5.84% more actions
show causal continuity in their use of the same image schemas.
      </p>
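<p>This first, direction-blind continuity measure can be sketched as follows, assuming a hypothetical normalisation from directional poles to their parent schemas.</p>

```python
# Hypothetical normalisation of directional poles to their parent schema;
# direction-neutral schemas (CONTACT, CYCLE, ...) pass through unchanged.
POLE_TO_SCHEMA = {
    "UP": "UP-DOWN", "DOWN": "UP-DOWN",
    "FRONT": "FRONT-BACK", "BACK": "FRONT-BACK",
    "CENTER": "CENTRE-PERIPHERY", "PERIPHERY": "CENTRE-PERIPHERY",
    "IN": "IN-OUT", "OUT": "IN-OUT",
    "NEAR": "NEAR-FAR", "FAR": "NEAR-FAR",
}

def continuity_rate(pairs):
    """Fraction of (predecessor, successor) pole pairs whose schema persists,
    ignoring direction (e.g. BACK followed by FRONT counts as continuous)."""
    same = sum(POLE_TO_SCHEMA.get(a, a) == POLE_TO_SCHEMA.get(b, b)
               for a, b in pairs)
    return same / len(pairs)
```

Computing this rate separately over the BUT-Group and SO-Group pairs and subtracting gives the 5.84% difference reported above.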
      <p>In Figure 1, we see the relative frequency of image-schema pairs connected by
SO-Group connectives (left) or BUT-Group connectives (right). Each row shows, for
a given image schema, the percentage with which each schema follows it.</p>
      <p>In the second analysis, we investigate the image schemas with respect to their
direction. We group the image schemas according to their spatial relation: the first group
(Group A) comprises image schemas related to movement towards the robot: BACK,
CENTER, IN, NEAR. The second group (Group B) comprises image schemas related to
movement away from the robot: FRONT, PERIPHERY, OUT, and FAR. The last group
(Group C) comprises all other image schemas: UP, DOWN, CONTACT, SURFACE,
CYCLE, and SOURCE-PATH-GOAL. This second investigation allows us to tighten the
association between spatial domain and image schema. The difference between the
BUT-Group and the SO-Group is 7.84%, i.e. in the SO-Group 7.84% more actions are
continuous with regard to their image-schema group. While this is a marginal difference,
this preliminary trend neither confirms nor contradicts our hypothesis.</p>
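<p>A sketch of this second analysis, with the A/B/C spatial grouping encoded as a lookup (all names hypothetical):</p>

```python
# Hypothetical A/B/C spatial grouping of image schemas from the second analysis.
SPATIAL_GROUP = {}
SPATIAL_GROUP.update(dict.fromkeys(
    ["BACK", "CENTER", "IN", "NEAR"], "A"))        # movement towards the robot
SPATIAL_GROUP.update(dict.fromkeys(
    ["FRONT", "PERIPHERY", "OUT", "FAR"], "B"))    # movement away from the robot
SPATIAL_GROUP.update(dict.fromkeys(
    ["UP", "DOWN", "CONTACT", "SURFACE", "CYCLE", "S-P-G"], "C"))  # all others

def group_continuity_rate(pairs):
    """Fraction of (predecessor, successor) schema pairs that stay within
    the same spatial group."""
    same = sum(SPATIAL_GROUP[a] == SPATIAL_GROUP[b] for a, b in pairs)
    return same / len(pairs)
```

As before, the 7.84% figure reported above corresponds to the difference between this rate over SO-Group pairs and over BUT-Group pairs.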
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion and Future Work</title>
      <p>
This research presents a novel way of investigating image schemas in the realm of
computational embodied storytelling, and sketches a framework that utilizes a storytelling
system to explore the causal connections between image schemas. While the initial results
are modest at best, these investigations point towards easily extensible further studies
and improvements. The current mapping between image schemas and actions is a definite
weak point and a potential explanation for the weak results. In future research, we plan to
refine the mapping between image schemas and actions in order to test the proposed
hypothesis in general terms. In Figure 2 we can observe the two crucial links of the framework:
between story actions and image schemas, and between image schemas and gestures. The
evidence from previous research discussed above justifies this investigation. Consequently, we
want to investigate the benefits of grounding actions and gestures in a common set of
image schemas, in contrast to the previous ad-hoc mappings. This common set of image
schemas will draw on the idea of Conceptual Scaffolding [
        <xref ref-type="bibr" rid="ref11">11</xref>
], which provides a simple
calculus for reasoning about image schemas. This approach will allow us to define a set
of gestural expressions for these inherently spatial schemas. The implementation of these
schemas will be guided by the recent approach of [
        <xref ref-type="bibr" rid="ref12">12</xref>
], who use image schemas for
conceptual blending and thereby provide a formalized logical language for image schemas.
Strengthening the links of this framework will help to provide the data necessary for
a proper empirical evaluation.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name><given-names>M.</given-names> <surname>Johnson</surname></string-name>
          ,
          <source>The Body in the Mind: The Bodily Basis of Meaning, Imagination, and Reason</source>
          , University of Chicago Press, (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name><given-names>G.</given-names> <surname>Lakoff</surname></string-name>
          ,
          <source>Women, Fire, and Dangerous Things</source>
          , University of Chicago Press, (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name><given-names>A.</given-names> <surname>Chemero</surname></string-name>
          ,
          <source>Radical Embodied Cognitive Science</source>
          , MIT Press, (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name><given-names>A.</given-names> <surname>Cienki</surname></string-name>
          ,
          <article-title>Image schemas and gesture</article-title>
          . In:
          <source>From Perception to Meaning: Image Schemas in Cognitive Linguistics</source>
          , De Gruyter,
          <fpage>421</fpage>
          -
          <lpage>442</lpage>
          , (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name><given-names>I.</given-names> <surname>Mittelberg</surname></string-name>
          ,
          <article-title>Gestures as image schemas and force gestalts: A dynamic-systems approach augmented with motion-capture data analyses</article-title>
          .
          <source>Cognitive Semiotics</source>
          , Vol.
          <volume>11</volume>
          , De Gruyter, (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name><given-names>B.</given-names> <surname>Ravenet</surname></string-name>
          ,
          <string-name><given-names>C.</given-names> <surname>Clavel</surname></string-name>
          and
          <string-name><given-names>C.</given-names> <surname>Pelachaud</surname></string-name>
          ,
          <article-title>Automatic Nonverbal Behavior Generation from Image Schemas</article-title>
          .
          <source>Proceedings of the International Conference on Autonomous Agents and Multi-agent Systems</source>
          , (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>P.</given-names>
            <surname>Wicke</surname>
          </string-name>
          and
          <string-name>
            <given-names>T.</given-names>
            <surname>Veale</surname>
          </string-name>
          <article-title>Interview with the Robot: Question-Guided Collaboration in a Storytelling System</article-title>
          <source>Proceedings of the International Conference on Computational Creativity</source>
          , (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>P.</given-names>
            <surname>Wicke</surname>
          </string-name>
          and
          <string-name>
            <given-names>T.</given-names>
            <surname>Veale</surname>
          </string-name>
          <article-title>Storytelling by a Show of Hands: A framework for interactive embodied storytelling in robotic agents</article-title>
          <source>Proceedings of the Conference on Artificial Intelligence and Simulated Behavior</source>
          , (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>T.</given-names>
            <surname>Veale</surname>
          </string-name>
          <article-title>A Rap on the Knuckles and a Twist in the Tale: From Tweeting Affective Metaphors to Generating Stories with a Moral</article-title>
          .
          <source>Proceedings of the 2016 AAAI Spring Symposium Series</source>
          , (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name><given-names>T.</given-names> <surname>Veale</surname></string-name>
          ,
          <article-title>Déjà Vu All Over Again</article-title>
          .
          <source>Proceedings of the International Conference on Computational Creativity</source>
          , (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name><given-names>T.</given-names> <surname>Veale</surname></string-name>
          and
          <string-name><given-names>M.</given-names> <surname>Keane</surname></string-name>
          ,
          <article-title>Conceptual Scaffolding: A spatially founded meaning representation for metaphor comprehension</article-title>
          .
          <source>Computational Intelligence</source>
          <volume>8</volume>
          (3):
          <fpage>494</fpage>
          -
          <lpage>519</lpage>
          , (
          <year>1992</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name><given-names>M.M.</given-names> <surname>Hedblom</surname></string-name>
          ,
          <string-name><given-names>O.</given-names> <surname>Kutz</surname></string-name>
          and
          <string-name><given-names>F.</given-names> <surname>Neuhaus</surname></string-name>
          ,
          <article-title>Image schemas in computational conceptual blending</article-title>
          .
          <source>Cognitive Systems Research</source>
          <volume>39</volume>
          :
          <fpage>42</fpage>
          -
          <lpage>57</lpage>
          , (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>