<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Image Schemas and Conceptual Dependency Primitives: A Comparison</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jamie C. MACBETH</string-name>
          <email>jmacbeth@fair</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dagmar GROMANN</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Maria M. HEDBLOM</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Artificial Intelligence Research Institute (IIIA-CSIC)</institution>
          ,
          <addr-line>Bellaterra</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Electrical and Computer Systems Engineering, Fairfield University</institution>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Research Center for Knowledge and Data, Free University of Bozen-Bolzano</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>A major challenge in natural language understanding research in artificial intelligence (AI) has been and still is the grounding of symbols in a representation that allows for rich semantic interpretation, inference, and deduction. Across cognitive linguistics and other disciplines, a number of principled methods for meaning representation of natural language have been proposed that aim to emulate capacities of human cognition. However, little cross-fertilization among those methods has taken place. A joint effort of human-level meaning representation from AI research and from cognitive linguistics holds the potential of contributing new insights to this profound challenge. To this end, this paper presents a first comparison of image schemas to an AI meaning representation system called Conceptual Dependency (CD). Restricting our study to the domain of physical and spatial conceptual primitives, we find connections and mappings from a set of action primitives in CD to a remarkably similar set of image schemas. We also discuss important implications of this connection, from formalizing image schemas to improving meaning representation systems in AI.</p>
      </abstract>
      <kwd-group>
        <kwd>natural language understanding</kwd>
        <kwd>human cognition</kwd>
        <kwd>conceptual primitives</kwd>
        <kwd>Conceptual Dependency</kwd>
        <kwd>image schemas</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Building systems to understand natural language has been a significant, decades-long
challenge to Artificial Intelligence (AI) research. Whether these systems are composed
through machine learning (see e.g. [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]), deep learning over large datasets (see e.g. [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ]),
or are built by hand, a major component of this challenge is the question of meaning
representation behind language, also known as the symbol grounding problem. If AI
systems are to achieve human-like performance on natural language understanding tasks,
it stands to reason that the meaning representations they use should also be human-like.
Work in cognitive AI in the area of natural language understanding will therefore benefit from
stronger links to cognitive linguistics.
      </p>
      <p>
        There is compelling evidence from cognitive linguistics that the semantic structure
of natural language reflects conceptual structure that arises from our sensorimotor
experience with the world [
        <xref ref-type="bibr" rid="ref23 ref32 ref4 ref6">32,4,23,6</xref>
        ]. One theory that aims to make these sensorimotor
experiences concrete is the theory of image schemas, in which abstract patterns serve as
conceptual building blocks for higher-level cognition, such as language and analogical reasoning
[
        <xref ref-type="bibr" rid="ref13 ref9">13,9</xref>
        ]. This paper presents an investigation that establishes connections between image
schemas and Conceptual Dependency, a theory and meaning representation system from
“classical” AI research in natural language understanding. Conceptual Dependency (CD)
attempts to decompose meanings into complex structures whose elements are conceptual
primitives that are very similar to the building blocks of image schemas. But CD theory
also presents a formalized model of a semantics of natural language that reflects human
cognition and memory.
      </p>
      <p>
        For CD and other AI research in natural language understanding systems, drawing
connections between CD and image schemas serves several purposes. Firstly, insofar as
image schemas are supported by empirical evidence on how humans represent thought [
        <xref ref-type="bibr" rid="ref14 ref22">14,22</xref>
        ],
it will serve to transitively ground and validate CD’s conceptual primitives in
empirical data. Secondly, the methods of investigation that have been developed for image
schemas may be extended to investigations into the cognitive plausibility of CD. Finally,
the connection could guide processes for improving CD as a representation system,
for example by providing supporting theory and evidence for efforts to generalize CD
primitives to more situations, add or remove primitives, or modify the connectives used
to build CD structures. Few approaches to the formal representation of image schemas
exist, and where they do, they generally focus on specific image schemas, such as
CONTAINMENT [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] or SOURCE PATH GOAL [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], or specific problems that blend
various schemas [
        <xref ref-type="bibr" rid="ref30">30</xref>
        ]. A comparison to CD holds the promise of contributing to their
formalization and enabling their potential use in AI and cognitive systems. CD provides a set of
action primitives that have been applied to low-level sensory data [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] and thus may help
to bridge the gap between the highly abstract quality of image schemas and low-level
data and object manipulation instructions.
      </p>
      <p>In this paper we present a rich collection of correspondences
between CD and image schemas. In particular, by narrowing the scope of our
investigation to structures representing physical and spatial concepts, we find a strong mapping
between six action primitives and “picture-producing structures” of CD and a
corresponding set of image schemas and the spatial primitives that are theorized to compose
them. We support our conclusions with linguistic evidence and evidence from a
human-subjects study connecting conceptual primitives to language comprehension. Although
this comparison did not result in an ideal one-to-one mapping between the two systems,
the links we found have a broad range of implications, from formalizing image schemas
to engineering meaning representation systems that are grounded in cognitive linguistics.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Background</title>
      <p>Before comparing the two, we introduce CD and image schemas separately.</p>
      <sec id="sec-2-1">
        <title>2.1. CD Primitives</title>
        <p>
          Conceptual Dependency (CD) theory is a meaning representation theory developed for
computer-based natural language understanding systems as an alternative to meaning
representation theories based on formal, mainstream linguistics [
          <xref ref-type="bibr" rid="ref18 ref26 ref27 ref28">26,27,28,18</xref>
          ]. CD belongs to
a unique class of AI systems intended as non-linguistic meaning representations that
give systems an in-depth, human-like understanding of language, one that mirrors
human cognition and the imagery it generates in the understanding process. CD was applied
successfully over several decades of natural language understanding systems research
[
          <xref ref-type="bibr" rid="ref18 ref28 ref29">28,29,18</xref>
          ].
        </p>
        <p>
          CD decomposes meaning into primitive physical acts in a way that renders it
particularly apt for action-based representations as favored by image schemas. As such, it is
suitable as a representation system to correspond to image schemas, since image schemas are
suggested to be generalizations made from sensorimotor experiences, and thus are more
abstract than lexical concepts. Although extant, hand-built CD systems are currently rare
following a turn in natural language processing research towards big data,
cognitively-inspired meaning representation systems like CD continue to be relevant for machine
learning in relation to in-depth natural language understanding tasks [
          <xref ref-type="bibr" rid="ref10 ref16 ref17 ref20">17,16,10,20</xref>
          ].
        </p>
        <p>
          CD has a number of primitives used to represent thought, perception, social
interaction and communication (see [
          <xref ref-type="bibr" rid="ref27">27</xref>
          ]). In this paper, however, we narrow our focus to
CD’s physical, spatial, and object-defining primitives, whose names and descriptions are
given in Table 1. Although many of the names of the primitives are chosen from the
English lexicon, the concepts that they identify differ from the typical meanings
corresponding to their English names, and are usually broader and more abstract. For example,
although the English word “ingest” is typically used to describe events where animate
beings consume food or drink, the INGEST primitive of CD is broader and is intended
for use in a broad variety of acts where a substance or object enters the body of an
animate being, such as a person breathing air, a person injecting themselves or someone
else with a substance using a hypodermic needle, or a single-cell organism absorbing
a molecule through its cell wall. Some of the primitives have abbreviated names; for
example, PTRANS is short for “Physical TRANSfer”<sup>2</sup>.
        </p>
        <p>
          Researchers in CD purposefully endeavored to keep the number of primitives in the
system small, and to develop the system to represent a large number of surface
linguistic expressions by decomposing each into a unique but complex structure of primitive
events, actors, objects, and connectives between events (such as causality). Having a
small number of primitives was intended to make the representations as unambiguous as
possible, by reducing cases where more than one primitive or more than one combination
of primitives represented the same meaning [
          <xref ref-type="bibr" rid="ref29">29</xref>
          ]. Reducing ambiguity in the meaning
representation is of paramount importance to natural language understanding and story
understanding systems in AI, since much of the understanding process involves
inference and deduction operations performed on a meaning instance, or using the meaning
instance to search for knowledge in a knowledge base [
          <xref ref-type="bibr" rid="ref3 ref33">33,3</xref>
          ].
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Image Schemas and Spatial Primitives</title>
        <p>
          Image schemas were first introduced by Lakoff [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] and Johnson [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] as abstract
patterns that, most often, can be described as spatio-temporal relations arising from bodily
experience. Within the framework of embodied cognition theory [
          <xref ref-type="bibr" rid="ref31">31</xref>
          ], the
sensorimotor experiences humans have while interacting with the world, even in early infancy, are
mentally represented as collections of perceptual states [
          <xref ref-type="bibr" rid="ref24 ref4">4,24</xref>
          ]. For instance, if a child
learns that a glass can contain water, it can, with enough exposure, infer that a cup can
contain any liquid. Through further exposure and generalization of similar situations of
‘objects contained in other objects’ the cognitive understanding represented as the image
schema CONTAINMENT emerges as part of the child’s conceptual world. In contrast to
purely perceptual information, image schemas are stored in an explicit and accessible
format [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ]. Cognitive linguistics has provided compelling evidence that image schemas
underlie lexical concepts externalized in specific lexical forms [
          <xref ref-type="bibr" rid="ref2 ref32">32,2</xref>
          ].
        </p>
        <p>
          Mandler and Pagán Cánovas [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ] study the development of image schemas in
infancy and differentiate three degrees of complexity in embodied cognitive structures:
spatial primitives, image schemas, and schematic integrations. Spatial primitives are
initial conceptual building blocks that allow us to understand what we perceive, image
schemas are combined to simple spatial events using those primitives, and schematic
integrations are built from the first two and represent conceptual blends structured by
image schemas. For instance, they suggest PATH, MOVE, START PATH, and END PATH
among others as primitives that structure the image schema SOURCE PATH GOAL.
Findings from linguistic analyses support the claim that image schemas are structured by
spatial primitives that can be combined in different ways to make up an image schema. For
instance, English children and adults were found to more frequently use PATH GOAL
(PATH, MOVE, END PATH) over SOURCE PATH (PATH, MOVE, START PATH) in
language [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ].
        </p>
        <p><sup>2</sup> The abbreviation is likely due to the primitive names being invented in resource-limited computing
environments of the 1960s and 1970s where implementations abbreviated symbol names to conserve memory.</p>
        <table-wrap id="tab-2">
          <label>Table 2</label>
          <caption>
            <p>Image schemas and their spatial primitives.</p>
          </caption>
          <table>
            <thead>
              <tr><th>Image schema</th><th>Spatial primitives</th></tr>
            </thead>
            <tbody>
              <tr><td>CONTAINMENT</td><td>IN, OUT, BOUNDARY, CONTAINER</td></tr>
              <tr><td>SOURCE PATH GOAL</td><td>SOURCE, GOAL, PATH, LINK, MOVE, DIRECTION</td></tr>
              <tr><td>PART-WHOLE</td><td>PARTS, WHOLE, CONFIGURATION</td></tr>
              <tr><td>SUPPORT</td><td>CONTACT</td></tr>
              <tr><td>FORCE</td><td>SOURCE, GOAL, PATH, DIRECTION, MOVE, SCALE</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <p>Some of the most common image schemas that are relevant for an initial mapping to
primitives of CD are presented in Table 2, which specifies their spatial primitives and a
description derived from the indicated existing literature. The distinction between image
schemas and spatial primitives is not yet well defined. For instance, Johnson considers
LINK an image schema in its own right, since the type of connection between two
OBJECTs can be of temporal, causal or functional nature, whereas its use as a primitive here
only relates to its spatial connection between OBJECTs.</p>
        <p>
          The formalization of image schemas in algebraic and logical theories (e.g. [
          <xref ref-type="bibr" rid="ref1 ref11 ref8">11,1,8</xref>
          ]) has been a concern across disciplines. Any formal representation of image schemas
needs to not only represent static relationships, but also temporal change and movement,
while simultaneously following logical transitivity and the gestalt laws identified with a
particular image schema. From a logical perspective, Bennett and Cialone [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] formally
specified different cases of CONTAINMENT as found in natural language text. Galton
[
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] formalized some transitive aspects of CONTAINMENT in a study on how to formally
approach affordances. Hedblom, Kutz and Neuhaus [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] proposed a formalized family
of micro-theories for SOURCE PATH GOAL. In [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] the authors introduce a method to
logically represent image schemas based on a combination of different calculi and linear
temporal logic inspired by previous formalizations, which allows for the representation
of events and complex image schemas. However, no complete formalization of all image
schemas has yet been achieved.
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Case Study Evidence</title>
      <p>
        To facilitate a discussion comparing and contrasting image schemas and CD theory, we
present qualitative evidence from data collected in a recent human subjects study on the
coherence of CD primitives [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. In this CD-focused study, 50 subjects were presented
with descriptions of the six physical primitive acts of CD as well as 12 short, simple
“target” sentences which described human characters performing simple acts. The
descriptions of the primitives and the 12 sentences that participants were presented with are
listed in Table 1. The sentences were designed so that the major act in each sentence
was a match for the broad category of acts represented by one of the conceptual
primitives, and among the 12 sentences there were two sentences corresponding to each of
the six primitives. For each sentence, subjects were asked to freely choose the primitive
which was the best match for the “main action” in their understanding and
conceptualization of the sentence. They were also asked to give brief, one-sentence explanations of
their selections. The CD primitives were identified by numbers instead of their
English-word names so that subjects would not be biased by the typical meanings attached to
their names.
      </p>
      <p>The study found that the subjects’ answers occasionally disagreed with the expected
matching between primitives and sentences, and that these answers and explanations
were informative of ways that the primitive representation could be improved, either
for applying a primitive to new situations, or for modifying the way the primitives, as
a complementary set, cover the space of conceptualizations. Throughout this paper we
use particular cases from this study as evidence to support our arguments, views, and
reflections regarding connections between CD and image schemas.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Contrasting CD and Image Schemas</title>
      <p>The comparison presented in this section proceeds element-wise, matching each CD
primitive against image schemas and their spatial primitives. To support the resulting
mapping (shown in Table 3), we provide natural language example sentences as well as
their interpretation by participants in the case study described in Section 3.</p>
      <sec id="sec-4-1">
        <title>4.1. Motion and PATHs</title>
        <p>A majority of CD physical primitives incorporate some sort of motion through space.
While PTRANS is typically used to build conceptualizations corresponding to an object
or actor in its entirety changing location from one place to another by whatever means,
the MOVE primitive is reserved for situations in which an animate actor moves a part of
their (or its) body, while the rest of their body remains at rest. This distinction is useful
for story understanding systems in AI, because they are often required to keep track of
the locations of story characters and objects as a story progresses, while at the same time
processing movements of actors’ body parts that do not cause a change in their location,
such as moving a foot to kick a ball or moving a hand to grasp a glass of liquid.</p>
        <p>However, moving a body part does also imply a change of location of that body part;
if a character is near a door and MOVEs their left hand to the location of the doorknob,
the story understanding system must be able to track or represent that the character’s left
hand has changed its location so that it is now at the doorknob, and not, say, in the
character’s pants pocket. A further examination indicates that, for all practical purposes, the
only difference between PTRANS and MOVE is that MOVE represents a change in
location of a body part, while PTRANS represents a change in the location of a character’s
entire body, as well as any other kind of object.</p>
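        <p>The bookkeeping described above can be sketched in a few lines of Python. The sketch is illustrative only and is not taken from any CD implementation: the names Act and apply_act, the dictionary-based state, and the rule that body parts travel with their owner under PTRANS are all assumptions made for this example.</p>
        <preformat>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Act:
    """A hypothetical, minimal stand-in for a CD conceptualization."""
    primitive: str  # "PTRANS" or "MOVE"
    actor: str
    obj: str        # the entity whose location changes
    to: str         # destination of the change


def apply_act(locations: dict, parts: dict, act: Act) -> dict:
    """Return a new location table after one act.

    locations maps an entity name to a place;
    parts maps a body part to the entity that owns it.
    """
    updated = dict(locations)
    if act.primitive == "MOVE":
        # MOVE requires the object to be a body part of the actor.
        if parts.get(act.obj) != act.actor:
            raise ValueError("MOVE needs a body part of the actor")
        updated[act.obj] = act.to
    elif act.primitive == "PTRANS":
        updated[act.obj] = act.to
        # Assumption: body parts are carried along with their owner.
        for part, owner in parts.items():
            if owner == act.obj:
                updated[part] = act.to
    else:
        raise ValueError("unsupported primitive")
    return updated


parts = {"left_hand": "character"}
locations = {"character": "near_door", "left_hand": "pants_pocket"}
locations = apply_act(
    locations, parts, Act("MOVE", "character", "left_hand", "doorknob")
)
```
        </preformat>
        <p>After the MOVE, the hand is recorded at the doorknob while the character stays near the door. The two branches differ only in which entities are relocated, which is the observation made in the text.</p>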
        <p>The CD primitives PTRANS and MOVE most resemble the image schema
SOURCE PATH GOAL and could also be interpreted image-schematically in terms of
the spatial primitives found in SOURCE PATH GOAL. One of the spatial primitives of
SOURCE PATH GOAL is actually called MOVE (see Table 2), but it is not restricted
to the movement of body parts as CD’s MOVE is. Both PTRANS
and MOVE in CD describe the movement of an entity along a PATH from one
location, a SOURCE, to another location, a GOAL, passing a number of contiguous
locations in between those two points. One fundamental difference is that the image schema
SOURCE PATH GOAL is not limited to physical locations but also includes abstract,
metaphorical relocations, such as “We want to continue along the successful path to
growth”. Continuing the previous example on MOVE, one could consider the “hand”
as an OBJECT image schema that moves along a PATH towards a final destination, the
GOAL. When we consider a movement to the doorknob, the doorknob is its
GOAL. This gives us a PATH GOAL image schema for the movement of the hand
to the doorknob, which is similar to MOVE.</p>
        <p>Generally, the comparison between CD and image schemas suggests that a modified
version of CD could combine PTRANS and MOVE into a single primitive
representing change in location for both objects and body parts of animate actors. There is
evidence from studies of human subjects conceiving CD primitive acts in the meanings
of simple sentences in support of combining PTRANS and MOVE in a way that is
similar to the single SOURCE PATH GOAL image schema. When subjects were asked which
CD primitive was the best match for the main act in the sentence “Kevin crossed his
arms”, two subjects responded that PTRANS was the best match, in spite of the fact that
“arms” should probably be conceived as a part of the actor’s body, implying the MOVE
CD primitive act. Their explanations were also consistent with viewing the main act as a
PTRANS:
          <disp-quote><p>“The location of his arms changed.”</p></disp-quote>
          <disp-quote><p>“Physical position of his arms changed.”</p></disp-quote>
For the sentence “Joe swung his fist at David,” we again expected subjects to respond
with the MOVE primitive, but one subject answered PTRANS. This subject explained:
          <disp-quote><p>“Physical position of his fist changed and no force is implied.”</p></disp-quote></p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Motion and CONTAINMENT</title>
        <p>
          The act of one object going into another object as well as the act of an object exiting
from another object are represented in the CD system through the pair of conceptual
primitives, INGEST and EXPEL. In representing meaning in AI systems based on CD,
the INGEST primitive was typically used to represent some object or substance, either
animate or inanimate, going inside of an animate object. For example, INGEST was
used to describe humans or other animate beings consuming food or drink, breathing,
or inhaling smoke [
          <xref ref-type="bibr" rid="ref27">27</xref>
          ]. Likewise the EXPEL primitive was used to represent animate
beings performing acts like expelling or vomiting, sweating, bleeding from a wound, or
exhaling air or smoke.
        </p>
        <p>The image schema CONTAINMENT is structured in terms of an inside, an outside,
and a border that separates the two. Compared to the definition of INGEST in Table 1,
going into an object (or region) corresponds to crossing the border and entering a
CONTAINER. Since this CD primitive is closely tied to motion, the CONTAINER represents the GOAL
of a SOURCE PATH GOAL, and, based on the definition of INGEST, FORCE is applied
to the motion of an OBJECT. Mapping from the CD primitive INGEST therefore requires the three image
schemas CONTAINMENT, FORCE, and SOURCE PATH GOAL. If the type of force
happens to be passive, that is, something is forced rather than forces, we could specialize
the FORCE image schema to COMPULSION in the conceptual blend or schematic
integration, where COMPULSION stands for one entity being moved by an external force with a
certain magnitude and a certain direction along a path. For EXPEL, only the DIRECTION
primitive of the SOURCE PATH GOAL image schema changes, but not the conceptual
blend as such, which means we can map EXPEL and INGEST to the same conceptual
blend. The image schemas also encourage extending the application of CD’s EXPEL and
INGEST to inanimate objects performing these acts.</p>
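        <p>The claim that INGEST and EXPEL share one conceptual blend and differ only in DIRECTION can be made concrete with a small sketch. Everything named here is hypothetical: the Blend record, the blend_for function, and the encoding of DIRECTION as "IN" or "OUT" are assumptions for illustration. Treating the CONTAINER as the SOURCE for EXPEL is likewise our reading of the text rather than something it states for this primitive.</p>
        <preformat>
```python
from dataclasses import dataclass

# The three image schemas the text names for the INGEST/EXPEL blend.
BLEND_SCHEMAS = frozenset({"CONTAINMENT", "FORCE", "SOURCE_PATH_GOAL"})


@dataclass(frozen=True)
class Blend:
    """Hypothetical record of a conceptual blend and its free parameters."""
    schemas: frozenset
    direction: str       # assumed encoding: "IN" or "OUT" of the CONTAINER
    container_role: str  # role of the CONTAINER on the path


def blend_for(primitive: str) -> Blend:
    """Map a CD primitive to the blend sketched in the text."""
    if primitive == "INGEST":
        return Blend(BLEND_SCHEMAS, "IN", "GOAL")
    if primitive == "EXPEL":
        # Assumption: reversing DIRECTION makes the CONTAINER the SOURCE.
        return Blend(BLEND_SCHEMAS, "OUT", "SOURCE")
    raise ValueError(f"no blend defined for {primitive}")


ingest = blend_for("INGEST")
expel = blend_for("EXPEL")
same_blend = ingest.schemas == expel.schemas
```
        </preformat>
        <p>Here same_blend is true, while the two records differ in their direction field, mirroring the single-blend reading given above.</p>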
      </sec>
      <sec id="sec-4-3">
        <title>4.3. GRASP and CONTACT</title>
        <p>While the CD primitive GRASP does not appear to be about motion, it may be used
in stories where actors grasp some object and then move or change location while they
are still grasping that object. This is important for story understanding because story
characters may grab an object with their hand, and then MOVE their hand. The story
understanding system needs to make the inference that the object, grasped in the actor’s
hand, also changed location to wherever the hand moved. GRASP does not have to be
performed with human hands; it could be a person pinching a newspaper between their
elbow and the side of their body, a dog holding something in its teeth, or an animal using
a prehensile tail. For animate objects performing a GRASP, GRASP may be generalized
to also contain the typical meaning of the verb “to attach”.</p>
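        <p>The inference that a grasped object travels with whatever grasps it can be sketched as follows. This is a minimal illustration under our own assumptions, not a reconstruction of any CD system: the function name move_with_grasp and the two dictionaries are hypothetical, and grasping is treated as transitive so that chains of grasped objects move together.</p>
        <preformat>
```python
def move_with_grasp(locations: dict, grasps: dict, mover: str, destination: str) -> dict:
    """Relocate mover and, transitively, everything it is grasping.

    locations maps an entity to a place;
    grasps maps a grasping entity to the entities it currently holds.
    """
    updated = dict(locations)
    pending = [mover]
    seen = set()
    while pending:
        entity = pending.pop()
        if entity in seen:
            continue  # guard against cyclic grasp relations
        seen.add(entity)
        updated[entity] = destination
        pending.extend(grasps.get(entity, ()))
    return updated


locations = {"hand": "table", "glass": "table", "plate": "table"}
grasps = {"hand": ["glass"]}
after = move_with_grasp(locations, grasps, "hand", "mouth")
```
        </preformat>
        <p>In this example the glass ends up at the mouth together with the hand, while the plate, which is not grasped, stays on the table.</p>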
        <p>From an image-schematic perspective, the two different situations described by
GRASP, namely the attachment on the one hand and the active grabbing on the other
hand, need to be distinguished. While both represent a CONTACT between two OBJECTs,
the attachment corresponds to a unidirectional ATTRACTION. In other words, one
OBJECT applies force to achieve a continuous CONTACT with the other OBJECT. This can
be seen particularly well from the example sentence for GRASP in Table 1, which is
“The gecko stuck to the wall”. The case of active grabbing is more complex and requires
conceptual blends in most situations. In the example “Jim held on to the railing” a
mapping to SUPPORT could be argued if we accept a non-vertical dimension of the entity
being supported as a possibility for this schema. However, if one OBJECT encloses the
other OBJECT, we would consider it a CONTAINER. Should the situation involve motion,
it could be a blend with SOURCE PATH GOAL.</p>
      </sec>
      <sec id="sec-4-4">
        <title>4.4. PROPEL and Impact</title>
        <p>
In examining the CD primitive act PROPEL, the general description of an entity
applying force to another entity can be seen as corresponding to the general definition of the
FORCE schema, which implies a physical interaction between two entities. Based on
previous research, we believe it is reasonable to take a broad stance on FORCE, as it
appears to be part of many image schematic structures. FORCE is an especially complicated
case: some argue it is a non-spatial conceptual structure that impacts image schemas and
therefore is not an image schema in its own right [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ], while others utilize FORCE as a
conceptual primitive [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ].
        </p>
        <p>
          Still others argue that force is available as a group of image schemas and does not
appear as an individual case [
          <xref ref-type="bibr" rid="ref13 ref9">9,13</xref>
          ], and because PROPEL encompasses different types
of FORCEs and impacts of one OBJECT on another, it can be generally mapped to the
FORCE schema group. In the case of one entity striking or kicking another, a mapping to
COMPULSION—one entity being moved by external force with a certain magnitude and
a certain direction along a path—could be achieved. In the event “Lisa kicked the ball,”
the “ball” will be moved by the external force of Lisa’s kick, unless it is made of steel or
lead and cannot be moved by the impact of the kick; this would result in BLOCKAGE of
the movement—one entity encountering another entity or obstacle that blocks or resists
the movement of the first—and probably pain or an injury for Lisa.
        </p>
      </sec>
      <sec id="sec-4-5">
        <title>4.5. Picture Producers and OBJECTs</title>
        <p>
          Recognition of objects and parts of objects is essential to all forms of conceptualization
of physical acts. While much discussion of CD focuses on the primitive acts [
          <xref ref-type="bibr" rid="ref27">27</xref>
          ], CD
conceptualizations also have dependent cases and sub-conceptualizations that are
objects. For example, a simple CD conceptualization is specified by one of
the primitive acts, together with an actor that performs the act and, often, an object of
the act. CD calls these actor and object “cases” of the conceptualization “picture
producers,” and specifies that they, unlike nouns in surface language expressions, must always
represent physical objects. Picture producers (PPs) correspond strongly to OBJECTs in
image schemas.
        </p>
        <p>Additionally, CD retains a primitive called PART that is used to specify that one
object (or PP) is a part or sub-object of another object (or PP). An expression such as hand
&lt;=&gt; PART(John) in a CD conceptualization modifies the PP hand to state that it is an
inalienable possession or body part of the PP John. PART corresponds strongly to the
image schema PART-WHOLE, and is particularly important in specifying
conceptualizations involving MOVE, which requires that the object of a MOVE is a body part of the
actor of a MOVE. The CD primitive CONTAIN is used to specify containment relations
for PPs in CD conceptualizations: an expression such as frog &lt;=&gt; CONTAIN(box)
represents the state of affairs in which the PP frog is contained in the PP box. CONTAIN,
both in its name and in its function, has strong correspondence with the image schema
CONTAINMENT.</p>
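        <p>A possible data-structure reading of picture producers with PART and CONTAIN modifiers is sketched below. The class name PP follows the text, but its fields and the valid_move check are hypothetical names introduced for illustration; the sketch only encodes the constraint stated above, namely that the object of a MOVE must be a body part of its actor.</p>
        <preformat>
```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PP:
    """A picture producer: a physical object in a CD conceptualization."""
    name: str
    part_of: Optional["PP"] = None       # models the PART modifier
    contained_in: Optional["PP"] = None  # models the CONTAIN modifier


def valid_move(actor: PP, obj: PP) -> bool:
    """MOVE requires that the moved object is a body part of the actor."""
    return obj.part_of == actor


john = PP("John")
hand = PP("hand", part_of=john)      # the hand is a PART of John
box = PP("box")
frog = PP("frog", contained_in=box)  # the frog is contained in the box
```
        </preformat>
        <p>With these definitions, valid_move(john, hand) holds, whereas valid_move(john, frog) does not, since the frog is not a part of John.</p>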
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion and Future Work</title>
      <p>In comparing and contrasting image schemas with CD, we examined the linguistic
expressions that constructs in each system are designed to represent, as well as evidence
from studies in which human subjects connected physical primitives to their
comprehension of sentences. We generally have found strong links between CD primitive acts and
both image schemas and the spatial primitives that are theorized as building blocks for
composing them.</p>
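      <p>The links discussed in Section 4 can be collected in a simple lookup table. The dictionary below is our own summary of the prose in Sections 4.1 to 4.5 and is not a reproduction of Table 3; the names CD_TO_IMAGE_SCHEMAS and primitives_sharing are hypothetical. GRASP and PROPEL are listed with their most general counterparts only, leaving out the case-specific schemas (ATTRACTION, SUPPORT, COMPULSION, BLOCKAGE) discussed in the text.</p>
      <preformat>
```python
# Summary of the correspondences argued for in Section 4 (an assumption-laden
# reading of the prose, not the published mapping table).
CD_TO_IMAGE_SCHEMAS = {
    "PTRANS": ("SOURCE_PATH_GOAL",),
    "MOVE": ("SOURCE_PATH_GOAL",),
    "INGEST": ("CONTAINMENT", "FORCE", "SOURCE_PATH_GOAL"),
    "EXPEL": ("CONTAINMENT", "FORCE", "SOURCE_PATH_GOAL"),
    "GRASP": ("CONTACT",),
    "PROPEL": ("FORCE",),
    "PP": ("OBJECT",),
    "PART": ("PART-WHOLE",),
    "CONTAIN": ("CONTAINMENT",),
}


def primitives_sharing(schema: str) -> list:
    """Return the CD constructs whose mapping involves the given image schema."""
    return sorted(
        primitive
        for primitive, schemas in CD_TO_IMAGE_SCHEMAS.items()
        if schema in schemas
    )


path_users = primitives_sharing("SOURCE_PATH_GOAL")
```
      </preformat>
      <p>Querying the table by image schema makes the absence of a one-to-one mapping visible: several CD primitives share SOURCE PATH GOAL, and INGEST and EXPEL each draw on three schemas.</p>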
      <p>
        These comparisons suggest a number of modifications to the arrangement of CD
physical primitives. One of the original goals of CD (and one factor in its design that
made it controversial, see [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ]) was its designers’ insistence on keeping the number of
conceptual primitives small, making them abstract and general, and working to devise
representations of meaning which combined the primitives in complex ways to represent
human perception and cognition of events and situations. This allows natural language
understanding systems built from CD to establish relations between natural language
expressions and concepts by comparing the complex structures instead of relying on
“surface level” relations between words. A redesign that reduces the number of primitives
while allowing them to represent meanings through more complex combinations would allow
for richer relations between conceptual structures. Fewer primitives may also reduce
ambiguity, because they make it less likely that more than one decomposition represents a
particular meaning. The comparisons between CD and image schemas suggest that a
redesign of CD could combine the PTRANS and MOVE primitives, or simply
eliminate MOVE and use only PTRANS, in correspondence with the SOURCE PATH GOAL
image schema. They also suggest eliminating INGEST and EXPEL in favor of
PTRANS alone, with CD’s CONTAIN primitive specifying a containing object that is
the source or the goal of the PTRANS. This would correspond with the image schemas
SOURCE PATH GOAL and CONTAINMENT.
      </p>
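      <p>The suggested redesign can be illustrated with a short sketch that re-expresses MOVE, INGEST, and EXPEL using only PTRANS and CONTAIN. This is a minimal Python sketch under stated assumptions rather than a definitive proposal: the names PTrans, Contain, and rewrite are hypothetical, and we assume that for INGEST and EXPEL the containing object is the actor itself.</p>
      <preformat>
```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Contain:
    """Hypothetical marker: the location is the inside of a container."""
    container: str


@dataclass(frozen=True)
class PTrans:
    """PTRANS with optional SOURCE and GOAL (SOURCE PATH GOAL)."""
    actor: str
    obj: str
    source: Optional[object] = None
    goal: Optional[object] = None


def rewrite(act: str, actor: str, obj: str) -> PTrans:
    """Re-express a spatial act using only PTRANS (plus CONTAIN)."""
    if act == "INGEST":
        # Assumption: the goal is the inside of the actor.
        return PTrans(actor, obj, goal=Contain(actor))
    if act == "EXPEL":
        # Assumption: the source is the inside of the actor.
        return PTrans(actor, obj, source=Contain(actor))
    if act in ("MOVE", "PTRANS"):
        return PTrans(actor, obj)
    raise ValueError(f"act not handled by this sketch: {act}")


eat = rewrite("INGEST", "John", "apple")
spit = rewrite("EXPEL", "John", "seed")
```
      </preformat>
      <p>Under these assumptions, INGEST and EXPEL differ only in whether the CONTAIN location fills the goal or the source of the PTRANS, which mirrors the combination of SOURCE PATH GOAL and CONTAINMENT.</p>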
      <p>Findings from this comparison highlight the need to consider how animate and
inanimate OBJECTs are discriminated conceptually in image schema research as they are
in CD. Furthermore, the difference in granularity between the two theories’
representations of meaning can help in specifying complex and controversial
image schemas, especially FORCE. CD addresses some specific types of FORCE that are
not among the group described in the image schema literature, and contributes a
substantial range of examples from storytelling in which those types have been analyzed. In
general, the comparison showed the difficulty and the potential of bringing the highly
abstract image schemas to the level of physical motion and action. Additionally, while
we did find links between PROPEL and GRASP and image schemas such as SUPPORT
and FORCE, these connections did not immediately suggest improvements to the CD
primitives; further investigation of PROPEL and the family of FORCE schemas will be
necessary.</p>
      <p>
        Because this investigation focused on the physical and spatial primitives of CD,
extending the correspondence to CD primitives representing thought, memory, and
perception (such as MTRANS, “Mental TRANSfer”) is an obvious topic of future inquiry.
Further investigation will also be required to determine if these modifications to CD actually
improve CD as a meaning representation as it is used in natural language understanding
and story understanding systems in AI. To provide further supporting evidence, of greater
or lesser strength, for the validity of the mapping between CD and image schemas that
we have derived, a future study similar to that in [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] could have human subjects
perform crowdsourcing tasks with similar sentences, but categorize the main acts and events
as image schemas instead of CD primitives. Other future work could apply both CD
and image schemas to a larger corpus of natural language examples—automatically or
manually—to establish a larger body of evidence of the correspondence between them.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>B.</given-names>
            <surname>Bennett</surname>
          </string-name>
          and
          <string-name>
            <given-names>C.</given-names>
            <surname>Cialone</surname>
          </string-name>
          .
          <article-title>Corpus guided sense cluster analysis: a methodology for ontology development (with examples from the spatial domain)</article-title>
          .
          <source>In 8th Int. Conf. on Formal Ontology in Information Systems (FOIS)</source>
          , volume
          <volume>267</volume>
          , pages
          <fpage>213</fpage>
          -
          <lpage>226</lpage>
          . IOS Press,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Ellen</given-names>
            <surname>Dodge</surname>
          </string-name>
          and
          <string-name>
            <given-names>George</given-names>
            <surname>Lakoff</surname>
          </string-name>
          .
          <article-title>Image schemas: From linguistic analysis to neural grounding</article-title>
          . In
          <source>From perception to meaning: Image schemas in cognitive linguistics</source>
          , pages
          <fpage>57</fpage>
          -
          <lpage>91</lpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Michael G.</given-names>
            <surname>Dyer</surname>
          </string-name>
          .
          <article-title>In-Depth Understanding: A Computer Model of Integrated Processing for Narrative Comprehension</article-title>
          . MIT Press, Cambridge, MA,
          <year>1982</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Vyvyan</given-names>
            <surname>Evans</surname>
          </string-name>
          and
          <string-name>
            <given-names>Melanie</given-names>
            <surname>Green</surname>
          </string-name>
          .
          <article-title>Cognitive linguistics: an introduction</article-title>
          . Edinburgh University Press,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Antony</given-names>
            <surname>Galton</surname>
          </string-name>
          .
          <article-title>The Formalities of Affordance</article-title>
          . In Mehul Bhatt, Hans Guesgen, and Shyamanta Hazarika, editors,
          <source>Proc. of workshop Spatio-Temporal Dynamics</source>
          , pages
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Dagmar</given-names>
            <surname>Gromann</surname>
          </string-name>
          and
          <string-name>
            <given-names>Maria M.</given-names>
            <surname>Hedblom</surname>
          </string-name>
          .
          <article-title>Kinesthetic mind reader: A method to identify image schemas in natural language</article-title>
          .
          <source>In Proceedings of Advancements in Cognitive Systems</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Maria M.</given-names>
            <surname>Hedblom</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Oliver</given-names>
            <surname>Kutz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Till</given-names>
            <surname>Mossakowski</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Fabian</given-names>
            <surname>Neuhaus</surname>
          </string-name>
          .
          <article-title>Between contact and support: Introducing a logic for image schemas and directed movement</article-title>
          .
          <source>In Proceedings of AIXIA</source>
          ,
          <year>2017</year>
          . Forthcoming.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Maria M.</given-names>
            <surname>Hedblom</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Oliver</given-names>
            <surname>Kutz</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Fabian</given-names>
            <surname>Neuhaus</surname>
          </string-name>
          .
          <article-title>Choosing the Right Path: Image Schema Theory as a Foundation for Concept Invention</article-title>
          .
          <source>Journal of Artificial General Intelligence</source>
          ,
          <volume>6</volume>
          (
          <issue>1</issue>
          ):
          <fpage>22</fpage>
          -
          <lpage>54</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Mark</given-names>
            <surname>Johnson</surname>
          </string-name>
          .
          <article-title>The Body in the Mind: The Bodily Basis of Meaning, Imagination, and Reason</article-title>
          . The University of Chicago Press, Chicago and London,
          <year>1987</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Eun-Sol</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Kyoung-Woon</given-names>
            <surname>On</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Byoung-Tak</given-names>
            <surname>Zhang</surname>
          </string-name>
          , and Cognitive Robotics Artificial Intelligence Center.
          <article-title>Deepschema: Automatic schema acquisition from wearable sensor data in restaurant situations</article-title>
          .
          <source>In IJCAI</source>
          , pages
          <fpage>834</fpage>
          -
          <lpage>840</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Werner</given-names>
            <surname>Kuhn</surname>
          </string-name>
          .
          <article-title>An image-schematic account of spatial categories</article-title>
          . In Stephan Winter, Matt Duckham, Lars Kulik, and Ben Kuipers, editors,
          <source>Spatial information theory</source>
          , pages
          <fpage>152</fpage>
          -
          <lpage>168</lpage>
          . Springer Berlin Heidelberg,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Ankit</given-names>
            <surname>Kumar</surname>
          </string-name>
          , Ozan Irsoy,
          <string-name>
            <given-names>Peter</given-names>
            <surname>Ondruska</surname>
          </string-name>
          , Mohit Iyyer, James Bradbury, Ishaan Gulrajani, Victor Zhong, Romain Paulus, and Richard Socher.
          <article-title>Ask me anything: Dynamic memory networks for natural language processing</article-title>
          .
          <source>In International Conference on Machine Learning</source>
          , pages
          <fpage>1378</fpage>
          -
          <lpage>1387</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>George</given-names>
            <surname>Lakoff</surname>
          </string-name>
          .
          <article-title>Women, Fire, and Dangerous Things: What Categories Reveal about the Mind</article-title>
          . The University of Chicago Press,
          <year>1987</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>George</given-names>
            <surname>Lakoff</surname>
          </string-name>
          and
          <string-name>
            <given-names>Rafael</given-names>
            <surname>Núñez</surname>
          </string-name>
          .
          <source>Where Mathematics Comes from: How the Embodied Mind Brings Mathematics Into Being</source>
          . Basic Books
          , New York,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Laura</given-names>
            <surname>Lakusta</surname>
          </string-name>
          and
          <string-name>
            <given-names>Barbara</given-names>
            <surname>Landau</surname>
          </string-name>
          .
          <article-title>Starting at the end: the importance of goals in spatial language</article-title>
          .
          <source>Cognition</source>
          ,
          <volume>96</volume>
          (
          <issue>1</issue>
          ):
          <fpage>1</fpage>
          -
          <lpage>33</lpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Pat</given-names>
            <surname>Langley</surname>
          </string-name>
          .
          <article-title>Intelligent behavior in humans and machines</article-title>
          .
          <source>Advances in Cognitive Systems</source>
          ,
          <volume>2</volume>
          :
          <fpage>3</fpage>
          -
          <lpage>12</lpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>Hector J.</given-names>
            <surname>Levesque</surname>
          </string-name>
          .
          <article-title>On our best behaviour</article-title>
          .
          <source>Artificial Intelligence</source>
          ,
          <volume>212</volume>
          :
          <fpage>27</fpage>
          -
          <lpage>35</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Steven L.</given-names>
            <surname>Lytinen</surname>
          </string-name>
          .
          <article-title>Conceptual dependency and its descendants</article-title>
          .
          <source>Computers &amp; Mathematics with Applications</source>
          ,
          <volume>23</volume>
          (
          <issue>2</issue>
          ):
          <fpage>51</fpage>
          -
          <lpage>73</lpage>
          ,
          <year>1992</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Jamie C.</given-names>
            <surname>Macbeth</surname>
          </string-name>
          and
          <string-name>
            <given-names>Marydjina</given-names>
            <surname>Barionnette</surname>
          </string-name>
          .
          <article-title>The coherence of conceptual primitives</article-title>
          .
          <source>In Proceedings of the Fourth Annual Conference on Advances in Cognitive Systems. The Cognitive Systems Foundation</source>
          ,
          <year>June 2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Jamie C.</given-names>
            <surname>Macbeth</surname>
          </string-name>
          and
          <string-name>
            <given-names>Sandra</given-names>
            <surname>Grandic</surname>
          </string-name>
          .
          <article-title>Crowdsourcing a parallel corpus for conceptual analysis of natural language</article-title>
          .
          <source>In Proceedings of The Fifth AAAI Conference on Human Computation and Crowdsourcing. The Association for the Advancement of Artificial Intelligence</source>
          ,
          <year>October 2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>Jean M.</given-names>
            <surname>Mandler</surname>
          </string-name>
          .
          <article-title>How to build a baby: II. Conceptual primitives</article-title>
          .
          <source>Psychological review</source>
          ,
          <volume>99</volume>
          (
          <issue>4</issue>
          ):
          <fpage>587</fpage>
          ,
          <year>1992</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>Jean M.</given-names>
            <surname>Mandler</surname>
          </string-name>
          .
          <article-title>The spatial foundations of the conceptual system</article-title>
          .
          <source>Language and Cognition</source>
          ,
          <volume>2</volume>
          (
          <issue>1</issue>
          ):
          <fpage>21</fpage>
          -
          <lpage>44</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>Jean M.</given-names>
            <surname>Mandler</surname>
          </string-name>
          and
          <string-name>
            <given-names>Cristóbal</given-names>
            <surname>Pagán Cánovas</surname>
          </string-name>
          .
          <article-title>On defining image schemas</article-title>
          .
          <source>Language and Cognition</source>
          , pages
          <fpage>1</fpage>
          -
          <lpage>23</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>Jean Matter</given-names>
            <surname>Mandler</surname>
          </string-name>
          .
          <article-title>The foundations of mind: Origins of conceptual thought</article-title>
          . Oxford University Press,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>Christopher D.</given-names>
            <surname>Manning</surname>
          </string-name>
          .
          <article-title>Computational linguistics and deep learning</article-title>
          .
          <source>Computational Linguistics</source>
          ,
          <volume>41</volume>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>Roger C.</given-names>
            <surname>Schank</surname>
          </string-name>
          .
          <article-title>Conceptual dependency: A theory of natural language understanding</article-title>
          .
          <source>Cognitive Psychology</source>
          ,
          <volume>3</volume>
          (
          <issue>4</issue>
          ):
          <fpage>552</fpage>
          -
          <lpage>631</lpage>
          ,
          <year>1972</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>Roger C.</given-names>
            <surname>Schank</surname>
          </string-name>
          .
          <source>Conceptual Information Processing</source>
          . Elsevier
          , New York, NY,
          <year>1975</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>Roger C.</given-names>
            <surname>Schank</surname>
          </string-name>
          and
          <string-name>
            <given-names>Robert P.</given-names>
            <surname>Abelson</surname>
          </string-name>
          .
          <article-title>Scripts, plans, goals and understanding : an inquiry into human knowledge structures</article-title>
          . L. Erlbaum Associates, Hillsdale, N.J.,
          <year>1977</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>Roger C.</given-names>
            <surname>Schank</surname>
          </string-name>
          and
          <string-name>
            <given-names>Christopher K.</given-names>
            <surname>Riesbeck</surname>
          </string-name>
          .
          <source>Inside Computer Understanding: Five Programs Plus Miniatures</source>
          . L. Erlbaum Associates Inc., Hillsdale, NJ,
          <year>1982</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>Marco</given-names>
            <surname>Schorlemmer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Roberto</given-names>
            <surname>Confalonieri</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Enric</given-names>
            <surname>Plaza</surname>
          </string-name>
          .
          <article-title>The Yoneda path to the Buddhist monk blend</article-title>
          .
          <source>In Proceedings of the Joint Ontology Workshops 2016 Episode 2: The French Summer of Ontology co-located with the 9th International Conference on Formal Ontology in Information Systems (FOIS 2016)</source>
          , Annecy, France, July 6-9,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>Lawrence</given-names>
            <surname>Shapiro</surname>
          </string-name>
          .
          <article-title>Embodied cognition. New problems of philosophy</article-title>
          . Routledge, London and New York,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>Leonard</given-names>
            <surname>Talmy</surname>
          </string-name>
          .
          <article-title>The fundamental system of spatial schemas in language</article-title>
          . In Beate Hampe and Joseph E. Grady, editors,
          <source>From perception to meaning: Image schemas in cognitive linguistics</source>
          , volume
          <volume>29</volume>
          of Cognitive Linguistics Research
          , pages
          <fpage>199</fpage>
          -
          <lpage>234</lpage>
          . Walter de Gruyter,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>Patrick Henry</given-names>
            <surname>Winston</surname>
          </string-name>
          .
          <article-title>The Genesis story understanding and story telling system: A 21st century step toward artificial intelligence</article-title>
          .
          Technical report, Center for Brains, Minds and Machines (CBMM)
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>