Breaking Down Finance
 A method for concept simplification by identifying movement structures
                from the image schema PATH-following


                      Dagmar GROMANN a,1 , Maria M. HEDBLOM b
        a Artificial Intelligence Research Institute (IIIA-CSIC), Bellaterra, Spain
                          b Free University of Bozen-Bolzano, Italy


            Abstract. Image schemas provide preverbal conceptual structures and are sug-
            gested to be the conceptual building blocks from which cognitive phenomena such
            as language and reasoning are constructed. ‘Motion along a path’ is one of the first
            image schemas infants remember, making PATH-following one of the earliest cog-
            nitive building blocks. We are interested in the importance of this developmentally
            relevant image schema in abstract adult language. For this purpose, we propose
            a semi-automated method to extract image-schematic structures related to PATH-
            following from a multilingual financial terminology. Two major assumptions are
            that a linguistic mapping of image schemas facilitates the understanding of com-
            plex concepts and is persistent across languages. Our results show that complex
            textual representations can be made simpler to understand by extracting the under-
            lying image schemas and that they are persistent across languages. Another result
            includes the identification of novel specifications of predefined image-schematic
            structures.
            Keywords. Image schemas, information extraction, ontologies, lexico-syntactic
            patterns, terminological database, finance


1. Introduction

Image schemas provide a theory for concept formation based on sensory-motor expe-
riences. As such image schemas represent pre-linguistic structures of (usually) spatio-
temporal object relations. The common framework they provide for thought also mani-
fests itself in natural language [13]. They may be employed to establish rigorous defini-
tions that capture part of the meaning of natural language expressions. It is generally be-
lieved that analyzing natural language leads to a greater understanding of image schemas
[6, 15].
     Research on image schemas is performed in several disciplines: cognitive linguistics
[5], developmental psychology [14] and more formal areas (e.g. [10, 23, 1]). It has
been shown that while the conceptual notions of image schemas are mostly equivalent,
the linguistic expression can vary slightly across languages [15]. Bennett and Cialone
[1] strengthened the interrelationship between image schemas and natural language by
analyzing CONTAINMENT in a biological textbook corpus. In their investigation they
were able to identify eight different types of CONTAINMENT. Hedblom et al. [6] showed
how the image schema PATH-following represents a family of theories rather than an
individual schema and demonstrated how this can be used to ground abstract concepts.
With examples, they demonstrated how image schemas capture the information skeleton
in (some) linguistic metaphors. We are interested in the universal persistence of image
schemas in abstract adult communication, i.e., finance, and across languages, i.e.,
English, Swedish, German, and Italian.
  1 Corresponding Author: dgromann@iiia.csic.es
     The two main assumptions this paper addresses are, first, that image-schematic
structures persist across languages and domains, and second, that these early
developed image schemas shape abstract adult communication and conceptualization. To
address these assumptions, we semi-automatically identify representations of the image
schema family PATH-following in a multilingual financial terminology (extracted from
IATE, see footnote 2). We extract financial terminological entries where English, Swedish,
German, and Italian natural language descriptions are aligned. Thereby, we are able to see
whether the image schemas involved in abstract concepts are consistent across languages.
We also show how complex financial concepts can be simplified when broken
down to their image-schematic core. Furthermore, this experiment strengthens the link
between language and image schemas as well as their relation to formal ontologies. Our
results show that image-schematic structures occur with high consistency within the
same entries across all four languages, although at times with slight variations, e.g.
omission of the SOURCE in SOURCE PATH GOAL.
     Since image schemas are multidisciplinary, a clarification of their theoretical
foundation is introduced in the next section. We continue by detailing the employed
methodology, before we introduce the obtained results. The results are then compared to related
work, followed by a discussion section. Finally, we provide some concluding remarks
with a brief outlook to future work.


2. The theory of image schemas

The theory was introduced by Lakoff [11] and Johnson [8] in the late 1980s. Since then it
has become an important theory to ground higher cognitive phenomena, such as language
and reasoning, in the low-level sensations acquired from embodied experiences. Image
schemas are defined as the abstract patterns derived from sensory-motor experiences
found in embodied cognition [22]. Developed in early infancy, they are pre-linguistic
conceptualisations that allow infants to make predictions about their surroundings [14].
Classic examples of image schemas include CONTAINMENT, SUPPORT, VERTICALITY,
and SOURCE PATH GOAL.
     Some important characteristics of image schemas are that they exist both as static
and dynamic concepts [25], and in both simple and more complex forms [15]. Additionally,
there is no clear border at which one image schema becomes another, and in language
image schemas often appear in constellations of one or more image schemas combined
[21], e.g. CONTAINMENT and PATH combined form the conceptual structure in
expressions such as ‘get into trouble’.
     One use of image schemas is that they can act as an information skeleton in analogical
transfer [9]. In infancy this means that when a child has learned that ‘tables SUPPORT
plates’ they can infer that ‘desks SUPPORT books’. As cognitive abilities become more
complex and with the acquisition of the capacity for increasingly abstract and complex
thinking in the early teens [20], this analogical transfer can help build conceptualisations
of abstract concepts. An example is ‘to offer SUPPORT to a friend in need’.
  2 http://iate.europa.eu/
     While some words and concepts cannot be described using image schemas, other
abstract concepts can be. For instance, ‘transportation’ can be broken down into a
combination of PATH and either SUPPORT or CONTAINMENT. This kind of combination is
parallel in its constellation, but there are also combinations of image schemas that
alter the nature of the image schema. For example, a common conceptualisation of the
concept ‘marriage’ is as a LINKED PATH. Here the components of the image schemas
are merged rather than sequentially added. This illustrates the gestalt structure of image
schemas, meaning that no component can be removed or added without changing the
logic of the image schemas [12]. For example, it is not possible to remove the ‘border’
from the CONTAINMENT image schema, nor is it possible to speak of solely ‘an inside’
without at least implicitly considering a border and an outside as well. In natural
language many image-schematic components are implicit, yet for formal analyses of image
schemas these components need to be considered more directly.

2.1. Ontology of PATH-following

Aiming to take the above-mentioned aspects of image-schematic structures, components
and combinatorial possibilities into account, Hedblom et al. [6] took a closer look at what
in the literature is called the SOURCE PATH GOAL schema [12]. They presented a
hierarchical structure, the PATH-following family (see Figure 1), that grows more specialised
with the addition of spatial primitives found in developmental psychology [15].
     Their method took the image schema components into account and also considered
concept integrations by introducing a graphical and logical representation for how image
schemas occasionally ‘share’ components from different image-schematic notions. In the
figure, some preliminary participants of the PATH-family were introduced. Their method
also includes a more complete Common Logic formalisation for the graph, available in
an Ontohub repository (see footnote 3). Important for this paper is that each node in the
graph represents an individual image-schematic structure that can be mapped to natural
language expressions and conceptualisation.
     A CYCLE is an iterative temporal path. Hedblom et al. [6] argued that
MOVEMENT IN LOOPS is the physical representation of the temporal CYCLE. In this paper
we largely merge the temporal and the physical PATHs into one and treat the cycle as a
relative of the PATH-family. Consequently, a CYCLE is a specific manifestation in which
the SOURCE and the GOAL coincide, and it is related to CLOSED PATH MOVEMENT.


3. Method

Our method relies on the ontology of PATH-following introduced in Section 2.1, which
we utilize to identify linguistic manifestations of image-schematic structures. We extract
potential candidate entries for the PATH schema in English by means of lexico-syntactic
patterns and synonym sets. To retrieve the German, Swedish and Italian data we benefit
from the alignment of multilingual data in the terminological database. The resulting
corpus is manually analyzed by first language speakers to identify potential representations
of PATH schemas. For this manual analysis we followed the structure of the utilized
PATH-following ontology as well as a graphical representation method.
  3 https://ontohub.org/repositories/imageschemafamily/

                   Figure 1. Path family as introduced by Hedblom et al. (2015)

3.1. Financial terminological database

Concept-oriented terminological databases organize multilingual natural language data
into terminological entries, so-called ‘units of meaning’. A terminology seeks to miti-
gate ambiguity and polysemy of natural language by limiting its content to a specialized
domain of discourse. The use of a given term is specified by means of its salient features
and semantic type in a natural language definition. All natural language descriptions as-
sociated with the same entry are considered semantically equivalent. Such resources are
typically applied to computer-aided translation, information extraction, machine transla-
tion, corporate terminology management, and many more.
     Our data set for this experiment was extracted from the InterActive Terminology for
Europe (IATE, see footnote 4), which classifies its 1.3 million entries in up to 24 European
languages by domain and sub-domain. For this experiment we only considered entries in the
financial domain and its sub-domains and limited the extraction to entries with English,
Swedish, German, and Italian natural language definitions. Since our examples only
provide a small subset of the actual natural language descriptions associated with each term,
such as context or language usage, we provide the IATE identifier for each example so
that the full entry can be consulted online.
  4 http://iate.europa.eu/

  Pattern name               Content
  From-to (PP, TO)           % from % to
  Prepositions (PP)          around, across, through, behind, before, earlier
  Movement (NN, NNS)         movement, track, path, transportation, transit, mobility, steps, passage
  Process (NN, NNS)          process, operation, transfer, transferal
  Development (NN, NNS)      development, evolution, progress, progress, progression, chance, migration
  Cycle (NN, NNS)            cycle, course, chain, ring, rotation, circle, circuit, loop, sequel, orbit, wheel
  Move (VB, VBG, VBZ)        move, transfer, drift, migrate, walk, drive, fly, proceed, etc.
  Start (VB, VBG, VBZ)       start, commence, begin, etc.
  End (VB, VBG, VBZ)         end, target, arrive, etc.
              Table 1. Lexico-syntactic patterns and synonym sets for PATH-following
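
The entry selection described in this subsection can be illustrated with a minimal sketch. The entry layout used here (a dictionary with the keys "id", "domain" and "definitions", language codes as keys, and sub-domains written as "finance/<sub-domain>") is a hypothetical simplification for illustration and not the actual IATE export format.

```python
# Languages for which a natural language definition must be present.
REQUIRED_LANGUAGES = {"en", "sv", "de", "it"}


def select_entries(entries, domain="finance"):
    """Keep entries of the domain (or a sub-domain) defined in all four languages."""
    selected = []
    for entry in entries:
        # Assumed convention: sub-domains are written as "finance/<sub-domain>".
        in_domain = entry["domain"].split("/")[0] == domain
        # issubset() accepts any iterable, here the language keys of the dict.
        has_all_languages = REQUIRED_LANGUAGES.issubset(entry["definitions"])
        if in_domain and has_all_languages:
            selected.append(entry)
    return selected


# Hypothetical sample entries; identifiers and texts are placeholders.
sample = [
    {"id": "A", "domain": "finance/banking",
     "definitions": {"en": "...", "sv": "...", "de": "...", "it": "..."}},
    {"id": "B", "domain": "finance",
     "definitions": {"en": "...", "de": "..."}},
    {"id": "C", "domain": "law",
     "definitions": {"en": "...", "sv": "...", "de": "...", "it": "..."}},
]
print([entry["id"] for entry in select_entries(sample)])
```

Only entry "A" passes both conditions in this sample: "B" lacks Swedish and Italian definitions and "C" belongs to a different domain.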

3.2. Lexico-syntactic patterns and entry extraction

We draw from research in pattern-based ontology development [24, 16] and metaphor
identification [4] to align image schemas and their natural language representation.
A widespread methodology for detecting metaphors is the initial identification of
metaphoric expressions that are then automatically extracted and analyzed in their
context [4]. Thereby, blended domains can be detected that provide candidates for
metaphoric language. We adopt this idea and formulate linguistic expressions related to
start, end, and movement along a path as lexico-syntactic patterns [24].
     English lexico-syntactic patterns and synonym sets listed in Table 1 are employed to
extract terminological entries that potentially contain the PATH schema from our financial
terminological database. By extracting the entry based on the English definition only,
we automatically obtain the definitions in other languages aligned with the same entry in
the database. We assume that PATHs are internally structured in the sense that they have
the structure SOURCE PATH GOAL, which could in language be modelled by indicating
a trajectory ‘from’ a SOURCE ‘to’ a GOAL. This assumption is translated to the first
‘from-to’ pattern as shown in Table 1. For additional patterns we utilized linguistic
expressions related to PATH-following as defined by Hedblom et al. [6] and Mandler and
Cánovas [15] and their synonym sets to establish a set of recurring linguistic structures
as detailed in Table 1. A special case is a ‘cycle’ where start and end coincide and which
requires its own pattern. Since a movement can also be abstractly defined by a
‘development’ we included it as a synonym set in the extraction process. Finally, BLOCKAGE
is the hindering of a movement by an obstruction in the trajectory of the object from source
to target, which is mainly represented by prepositions in the patterns, e.g. ‘across’. All
patterns and the POS tags of the morphological variants we considered are provided in
Table 1.
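
How such patterns fire on a definition can be sketched as follows. This is a simplified illustration and not the extraction pipeline used in the experiment: it works on surface word forms with regular expressions, approximates the NN/NNS variants by an optional plural ‘s’, ignores POS tags altogether, and includes only a subset of the synonym sets from Table 1. The names SYNONYM_SETS, FROM_TO and matching_patterns are introduced here for illustration only.

```python
import re

# Subset of the synonym sets in Table 1 (illustrative, not the full lists).
SYNONYM_SETS = {
    "movement": ["movement", "track", "path", "transit", "passage"],
    "process": ["process", "operation", "transfer", "transferal"],
    "cycle": ["cycle", "chain", "circuit", "loop"],
    "prepositions": ["around", "across", "through"],
}

# The 'from-to' pattern: the word 'from', some material, then the word 'to'.
FROM_TO = re.compile(r"\bfrom\b.+?\bto\b", re.IGNORECASE)


def matching_patterns(definition):
    """Return the sorted names of all patterns that fire on a definition."""
    hits = set()
    if FROM_TO.search(definition):
        hits.add("from-to")
    for name, words in SYNONYM_SETS.items():
        # Optional plural 's' is a crude stand-in for the NN/NNS variants.
        regex = r"\b(?:" + "|".join(map(re.escape, words)) + r")s?\b"
        if re.search(regex, definition, re.IGNORECASE):
            hits.add(name)
    return sorted(hits)


print(matching_patterns(
    "Currencies have limited movement from the central rate "
    "according to the relevant band"
))
print(matching_patterns("movement of assets out of a country"))
```

The first definition triggers both ‘from-to’ and ‘movement’, the second only ‘movement’. A match marks an entry as a candidate only; whether it actually contains a PATH structure is decided in the manual analysis.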

3.3. Linguistic mapping of image-schematic structures

The manual mapping procedure was applied to the pattern-extracted entries per language.
For each language one (for German and Swedish) or two (for English and Italian)
native/fluent speakers identified image-schematic structures from the PATH-family presented
in [6] in the natural language definitions. Each candidate image schema was graphically
represented, that is, diagrams were created to draw links and the objects moving between
potential SOURCE and GOAL for each definition. We only considered them PATHs when
the links defined actual movements over time. While following the general structure of
the family, additional image-schematic components were considered in order not only to
strengthen the PATH-family notion, but also to improve the PATH-family through analysis
so that it matches natural language. This allowed for a freer interpretation of the terms
which better mapped the intended content. At the end a comparison of all identified schemas
allowed for an evaluation of their cross-linguistic persistence.
     As this paper focuses on the PATH-following image schema family, one important
aspect of the mapping method is to restrict the pattern-extracted entries to the terms that
could be identified as a form of movement. This means that all terms referring exclu-
sively to objects, both abstract and concrete (e.g. risks, credit cards), proper names (e.g.
financial institutions), numbers and measurements are omitted. Terms that depict things
like processes, events or changes over time are analysed further. In a financial termi-
nology, several entries refer to processes that do not refer to movement or development
over time, which were then not considered to represent PATH schemas. The SOURCE
and GOAL had to be explicitly expressed for the image schema structure to be defined as
SOURCE PATH GOAL. If these parts were omitted by, for instance, using passive voice
instead of active, the structure was reduced to SOURCE PATH or PATH GOAL, in order
to achieve an improved correspondence between linguistic content and schema.
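
The reduction rule just described can be written down as a small sketch. The function name and its boolean arguments are hypothetical. Returning a bare PATH when neither end point is explicit is an assumption made here, motivated by the separate PATH category reported in the results.

```python
def reduce_structure(source_explicit, goal_explicit):
    """Name the PATH structure warranted by the explicitly expressed parts."""
    parts = []
    if source_explicit:
        parts.append("SOURCE")
    parts.append("PATH")  # a movement over time is presupposed for all candidates
    if goal_explicit:
        parts.append("GOAL")
    return " ".join(parts)


print(reduce_structure(True, True))    # SOURCE PATH GOAL
print(reduce_structure(False, True))   # PATH GOAL
print(reduce_structure(True, False))   # SOURCE PATH
```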


4. Results

Our analysis targeted the identification of image-schematic structures of PATH-following
in natural language text across four natural languages. We were interested in the
(a)symmetries of such structures across languages as well as the coverage of the prede-
fined schematic structures (see Figure 1) within the domain of financial terminology. We
first present our general image-schematic candidates and results before we investigate
cross-linguistic divergences of the identified image-schematic structures.
      We base our analysis on natural language definitions. Not all IATE entries contain
natural language definitions of terms or definitions in the languages we desired. Limiting
our extraction to the domain of finance with definitions in English, Swedish, German, and
Italian resulted in 864 entries. The lexico-syntactic patterns and synonym sets from Ta-
ble 1 were applied to those 864 entries, which further reduced our corpus to 190 entries.
All 190 definitions for each language were analyzed manually by a first language/fluent
speaker to find PATH schemas. The precision of the English patterns was unexpectedly
low, with only 57 English entries containing PATH-following schematic structures, i.e.,
30% in total. Judging from the number of identified image schemas for each pattern,
nominal structures and prepositions returned most candidate entries.
     A total of 67% of the ‘cycle’ synonym set and 52% of the ‘process’ nouns returned
image-schematic structures, followed by ‘from-to’ with 29% of the 48 extracted entries.
The 37 extracted entries based on prepositions (across, through, around, etc.) and the 7
based on motion verbs resulted in image schema candidates in 30% of their cases.
The ‘end’ pattern with 8 entries contained one schema, while ‘start’ with three schema
candidates contained no actual schema at all. While the ‘movement’ and ‘development’
patterns extracted almost 20 entries each, only 19% in the former and 6% in the latter case
contained PATH-related structures.
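
The overall figures follow directly from the counts reported above. As a short worked example, with the helper name share chosen here for illustration:

```python
def share(part, whole):
    """Fraction of `whole` that `part` makes up."""
    return part / whole


# 864 finance entries with definitions in all four languages,
# 190 of them matched by the patterns, 57 of those with a PATH structure.
retained = share(190, 864)   # entries kept by the pattern-based extraction
precision = share(57, 190)   # extracted English entries that contain a PATH structure

print(f"retained by patterns: {retained:.0%}")   # 22%
print(f"precision: {precision:.0%}")             # 30%
```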
     We partially attribute the low precision to the fact that there are a lot of general state-
ments that do not relate to any movement in time or space. For instance, one part of the
definition of ‘central rate’ states that ‘Currencies have limited movement from the central
rate according to the relevant band’ (IATE:785015), which our ‘from-to’ and ‘movement’
patterns detected. However, neither the term nor the definition has any relation to PATH
image-schematic structures. In contrast, ‘capital outflow’ which is defined as ‘movement
of assets out of a country...’ (IATE:1104177) provides the kind of PATH-following we
intended to find. Thus, the linguistic surface structure alone is not a sufficient indicator
of movements along a path.
     The results separated by language and structure are depicted in Table 2 as cumulative
frequencies. Although one would expect there to be more PATHs because of transactions
in finance, a majority of extracted entries could be discarded as objects, institutions,
natural or legal persons, strategies, techniques, or measures, that is, not related to any
kind of PATH or movement over time. Events, processes, and actions provided excellent
candidates for these image-schematic structures.

      Image-Schematic Structure                English   Swedish    German   Italian     Total
      LINK                                           2         4         2         2        10
      PATH                                           3         7         6         7        23
      SOURCE PATH                                    7         6         7        11        31
      PATH GOAL                                      6         9        10         7        32
      SOURCE PATH VIA GOAL (SPVG)                    3         2         3         1         9
      PATH VIA GOAL                                  2         1                   2         5
      SOURCE PATH VIA                                1                                       1
      CAUSED MOVEMENT                                1                                       1
      CLOSED PATH MOVEMENT                           2                   1         1         4
      MOVEMENT IN LOOPS                              1         1         1         1         4
      PATH SWITCHING                                 1                   1         1         3
      JUMPING                                        1         1                             2
      BLOCKAGE AVOIDANCE                             1                   1         1         3
      PATH SPLITTING                                 4         3         4         3        14
      SPG AND SPG                                    1         1         1         1         4
      SPVG AND SPVG                                  1         1         1         1         4
      SPG OR PATH SPLITTING                                    1                             1
      SPG OR PATH                                                        1                   1
      SPG OR LINK                                    1         1         1         1         4
      Total                                         57        54        58        54       224
              Table 2. Metrics for identified image-schematic structures across languages


     All resulting image-schematic structures are ordered by approximated complexity
in Table 2. Financial entries in our data set most frequently (30% of all cases) feature
a regular SOURCE PATH GOAL schema followed by the similar, yet simpler, pattern
PATH GOAL. On occasion, specific textual references concurrently defined two
image-schematic structures that could equally be designated by the same given term. For such
cases we opted for a representation with the logical operator “OR”. For instance, an
‘interlinking mechanism’ (IATE:892281) can designate a cross-border payment procedure
‘OR’ a technical infrastructure, which we represent as SOURCE PATH GOAL ‘OR’
LINK.
     We employed a graphical representation technique to identify the movements of
objects between entities along PATHs for each definition in each language. It turned out
that some of the identified image-schematic structures were not present in the predefined
structures in Figure 1. From all languages four different scenarios depicted in Figure 3
could be identified by means of the graphical representation technique. Additionally,
image-schematic structures of a ‘double-way’ SOURCE PATH GOAL movement could
be observed in financial definitions. These movements were dependent on two variables:
the number of PATHs and the number of OBJECTs that are moved along them. The four
resulting image-schematic structures that are differentiated based on those two variables
are depicted in Figure 2.


                           Figure 2. The returning object(s) problem


     In a symmetric SOURCE PATH GOAL, one OBJECT moves or is being moved along
one path until it returns to its starting point, potentially also passing a distinguishing
point. For instance, taking out and repaying a loan is the transfer of money from the
creditor to the debtor where the same object (money) can be returned on the same path
(e.g. bank transfer) to the original source, that is, the creditor. Should the SOURCE and
the GOAL coincide, the schema matches the CLOSED PATH MOVEMENT introduced in
[6].
     It is also possible, however, that the returning path differs from the initial one, in
which case the schematic structure specifies two PATHs. If the same OBJECT moves
from the SOURCE and back again on a different PATH, we consider this a bidirectional
SOURCE PATH GOAL. In the event of SOURCE and GOAL being identical the PATH that
returns to the SOURCE can either be equivalent to the initial PATH (symmetric) or differ
from the original PATH (bidirectional). The latter would be considered a bidirectional
CLOSED PATH MOVEMENT. For instance, ‘painting the tape’ (IATE:927775) is an
example of several transactions (PATHs) being used in a CLOSED PATH MOVEMENT to
create the impression of price movement of a financial instrument. Since this is a
repeated cycle we even consider it a MOVEMENT IN LOOPS, adding a temporal component.
It could be argued that this image-schematic structure integrates other concepts or
image-schematic structures, such as CONTAINMENT; however, for the purpose of this
paper we are exclusively interested in variations and occurrences of SOURCE PATH.
     A second dimension we identified is whether the returning OBJECT is identical to
the first outgoing one. In finance, the returning object is often different from the one
initially moved along the PATH, basically capturing any kind of exchange or purchase.
The SOURCE for one object becomes the GOAL for the second object, and vice versa. We
refer to two different OBJECTs moving along the same path as poly-object symmetric
SOURCE PATH GOAL. If two OBJECTs move along two different PATHs, we call this a
poly-object bidirectional SOURCE PATH GOAL. A real life example is the exchange of
shares (the first OBJECT) from the stock market (the first PATH) and money (the second
OBJECT) from a bank transaction (the returning PATH) between a client and a broker.
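
The four returning-object structures are fully determined by the two variables named above, the number of PATHs and the number of OBJECTs. A minimal sketch of this two-by-two distinction, with a hypothetical function name, could read:

```python
def classify_return_movement(n_paths, n_objects):
    """Name the 'double-way' SOURCE PATH GOAL variant for one or two PATHs/OBJECTs."""
    if n_paths not in (1, 2) or n_objects not in (1, 2):
        raise ValueError("only one or two PATHs and OBJECTs are distinguished")
    # Same path out and back: symmetric; a different returning path: bidirectional.
    direction = "symmetric" if n_paths == 1 else "bidirectional"
    # A different returning OBJECT makes the structure poly-object.
    prefix = "poly-object " if n_objects == 2 else ""
    return f"{prefix}{direction} SOURCE PATH GOAL"


print(classify_return_movement(1, 1))  # symmetric SOURCE PATH GOAL
print(classify_return_movement(2, 2))  # poly-object bidirectional SOURCE PATH GOAL
```

In these terms, the loan example corresponds to one PATH and one OBJECT, and the exchange of shares for money between a client and a broker to two PATHs and two OBJECTs.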
     We encountered four PATH-related structures in our sample that could not be
explained by the predefined ones in Figure 1. To accommodate these structures with our
approach, we decided to extend the PATH family by adding four structures, namely
JUMPING, PATH SWITCHING, PATH SPLITTING, and BLOCKAGE AVOIDANCE, which are
depicted in Figure 3. The illustration of BLOCKAGE, itself an image schema, serves the
sole purpose of clarifying the movement involved in BLOCKAGE AVOIDANCE.


           Figure 3. Four kinds of complex PATH structures extracted from the financial domain


     First, JUMPING (see footnote 5) represents a temporal or spatial discontinuity of a
given PATH. For instance, ‘bond washing’ (IATE:3544441) is a method of obtaining tax-free
capital profits by selling the bond immediately before the coupon pays and buying it back
right thereafter to avoid tax payments. ‘Bond washing’ is a classical metaphor based on
the notion of ‘cleaning’, which indeed captures important aspects of the term. However,
the PATH-following family can also be used to explain the underlying process behind the
term. Considering ownership as the PATH from the initial acquisition of the bond
(SOURCE) to the gains it generates (GOAL), ‘bond washing’ leads to an interruption of
the PATH and can be seen as an example of JUMPING. While it may be argued that
JUMPING is simply a sequential combination of two disjoint SOURCE PATH GOAL
structures, JUMPING takes on its own logic as both paths are involved in one particular
movement as demonstrated in the conceptualisation example above. Therefore, we argue
that JUMPING can be justified as a complex image schema in its own right.
   5 Jumping is not to be confused with the motion verb to jump. It refers to a jump in time or space, much like
‘teleportation’ rather than a temporary elevation.
     Second, in the case of PATH SPLITTING one object is distributed along a path to several
GOALs. It could be argued that this represents merely a type of cardinality. However,
since the PATH can be asymmetrical or bidirectional, we consider it an
image-schematic structure in its own right. For instance, in all kinds of ‘tender procedures’
(e.g. IATE:887199) the identical piece of information (a call) is sent to several parties, who
return their individual pieces of information (the bids). Hence, this is an example of
bidirectional PATH SPLITTING. One example to account for this image-schematic structure
in sensory-motor experiences would be the distribution of auditory information to several
recipients with varying replies.
     Third, in PATH SWITCHING the expected PATH is fully discontinued and replaced
by a new PATH. For instance, the definition of ‘refinancing’ (IATE:786103) specifies the
extending of a new loan and a mutual agreement to discontinue the previous loan. Thus,
the original loan PATH is switched to a new loan PATH with altered conditions. It is
important to note that the definition clearly specifies the replacement of a debt obligation
with a new one and not merely altering the conditions of an existing loan. This explicit
switching of the agreed path is an excellent example of PATH SWITCHING.
     Finally, the active avoidance of a BLOCKAGE can be considered an image-schematic
construction that combines a number of pre-existing structures and schemas. The course
of the PATH is (intentionally) altered to prevent the discontinuation of the movement of
the object due to a BLOCKAGE. A ‘Paulian action’ (IATE:822870) allows a creditor to
take action to avoid potential fraudulent activities of an insolvent debtor, granting the
former rights to have a debtor’s transaction to that end reversed. Thus, the term as such
represents an example of BLOCKAGE AVOIDANCE. Here the connection to the physical
world is the actual obstruction of the trajectory of an object and its alteration of the path
to avoid any interruption of its course by the BLOCKAGE.
     A slight asymmetry in the distribution of image-schematic structures across languages
could be observed. In English and German definitions more structures could be
identified than in Swedish and Italian as shown in Table 2. However, those quantified
results fail to provide any insights into the differences across languages. In 55% of all
cases the same image schema detected in English could also be found in the definitions
of the other languages. In 27% of the cases where the schemas were not identical, the
differences arise from either an addition or omission of a SOURCE, GOAL, or VIA, while
the general structure is that of a SOURCE PATH GOAL. Differences that arise from other
sources can be pinned down to 10% of all entries. We could observe a slight preference
for GOAL usage in Swedish and German as opposed to a heightened use of SOURCE in
Italian in the reduced SOURCE PATH GOALs.
     Our method deliberately relied on explicitly described content only. This means
that omissions that arise from linguistic or grammatical differences across languages or
stylistic choices affected the extraction result. For instance, differences can arise from a
heightened use of passive constructions in one language, e.g. German, and an increased
utilization of active SOURCEs and GOALs due to grammatical choices in another. One of
the reasons for this choice was the intention to analyse linguistic consistency in relation
to schematic persistence across languages.
     We found in a final cross-linguistic analysis that most cross-linguistic differences in
the identification of schematic structures arise from unnecessarily complicated descrip-
tions, or even inconsistencies, in one language. Semantically identical entries resulted in
diverging image schemas for two major reasons: a) the difference in lexical or grammat-
ical choices (e.g. passive vs. active voice), and b) the omission of salient features. All
languages but English showed a heightened use of nominal constructions and passive
voice, which led to the frequent omission of SOURCE and GOAL. For instance,
‘selling ... by’ in English is juxtaposed to ‘Umwandlung von ...’ (transformation of) in
German and ‘operazione che ...’ (operation that) in Italian. When the passive voice was used
in English, it was frequently supplemented with a ‘by’ and the subject or object of the
sentence. Thus, the number of simple PATH schemas as opposed to the more complex
SOURCE PATH GOAL schemas is much lower in English than in the other languages.
The second set of differences refers to the features and differences in content. For
instance, the number of explicitly mentioned GOALs is much higher in Swedish and
German than in English and Italian, the latter of which focuses more on the SOURCE. For
automated methods, both differences lead to a certain degree of difficulty. Our method
could uncover inconsistencies across languages for both cases, which we consider an
added benefit of the linguistic mapping of image schemas.
     This approach equally uncovered conceptual inconsistencies across and within lan-
guages. For instance, ‘equity capital’ and ‘equity financing’ (IATE: 1119090) are mod-
elled as synonymous where in fact the former refers to equity of the company while
financing refers to the process of generating such capital. Thus, they should clearly be
separated into two entries, a claim that is supported by the fact that the entry’s definition
consists of two sentences that define both concepts. In view of potentially automating the
approach, we found, as can be expected, that a linguistic analysis of the specification’s
surface structure would definitely lead to misleading results. For instance ‘lifecycling’
(IATE: 3516328) describes a shift of a person’s investment approach at a specific
moment in life rather than a CYCLE as the term suggests. Furthermore, our manual approach
and cross-linguistic analysis revealed (unintentionally) repeated definitions and entries,
e.g. ‘fine-tuning operation’ (IATE: 111402 & 907147).


5. Related Work

From a top-down perspective, Kuhn [10] analyzed noun phrases in WordNet glosses and
connected them with spatial abstractions that model image-schematic affordances.
Particularly interesting is his analysis of nesting and combining image schemas in natural
language to represent more complex concepts, e.g. ‘transportation’ brings together
SUPPORT and PATH. One bottom-up approach that is very close to ours in methodology and
objective is that of Bennett and Cialone [1], who investigated the construction of spatial
ontologies from a biological textbook corpus by applying sense clusters. They exemplified their
approach by using the image schema of CONTAINMENT. Lakusta and Landau [13]
investigated the linguistic encoding of PATH in English-speaking children and adults and found
an asymmetrically higher frequency of PATH GOALs over SOURCE PATHs. Participants
were asked to verbalize visualizations, which also included finance-related events, such
as change of possessions necessitating a transaction between agents.
     Automated solutions to extracting spatial expressions from natural language corpora
rely on machine learning for annotating text. Handcrafted rules for each language help
to extract motion verbs across languages and named entities or predefined spatial ex-
pressions [16]. The extracted data are then qualitatively mapped to ontological formal-
izations. The idea of an embodied construction grammar [2] equally requires the manual
crafting of a lexicon. Thus, the central issue we are facing, namely the mapping of
identified spatial expressions to actual image schemas, persists in those approaches and no fully
automated solution has been provided. Additionally, the size and specialized type of our
data set rules out any machine learning approaches.
     It has been argued that the conceptual system underlying image schemas changes
in individual languages, even though the fundamental conceptual notions vary only marginally
cross-linguistically [15]. In Korean, CONTAINMENT can only be expressed by
differentiating whether it is tight or loose [17], a distinction that is not systematically
encoded in English and is thus optional there. Papafragou et al. [19] found that English
speakers are more likely to linguistically encode manner-of-motion information than Greek
speakers. This was generalized to cross-linguistic asymmetries and the authors differentiated
‘Manner languages’ (e.g. German, Russian, Chinese) from ‘Path languages’ (e.g. French, Spanish,
Turkish). Since SOURCE PATH GOAL schemas are not only spatial but also temporal,
time has frequently been considered an important aspect. Fuhrman et al. [3] found
that in Chinese a vertical representation of time is preferred over the English horizontal
one. Núñez and Sweetser [18] found that the spatial construal of time can vary in the
sense of whether the future is depicted as in front or behind the speaker.


6. Discussion

6.1. Method discussion

Lexico-syntactic patterns were applied to extract image-schematic candidates based on
the English definition of terms. Our initial patterns resulted in more than 3000 extracted
entries that at first analysis contained fewer image-schematic structures than we had
expected and desired. A repeated tweaking of the patterns reduced this number to 190
pattern-extracted entries with a precision of only one third. Given the issues with our
current approach discussed below, we abstained from creating a gold standard for this
specific data set. Thus, we do not provide any numbers on the potentially missed im-
age schemas here. However, we can definitely state that the start and end schemas were
the least successful ones. The approach of extracting from English definitions only,
however, returned good results from our database, since only two of the 57 resulting English
definitions contained an image schema in English and in no other language.
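     To make the notion of a surface-level lexico-syntactic pattern and the reported
precision concrete, the following minimal sketch may help. It does not reproduce the
patterns used in this study: the two regular expressions, the function names, and the
sample definition are invented for illustration. The precision computation further
assumes that the 57 resulting definitions are the correct ones among the 190
pattern-extracted entries, which is one reading of the numbers given above.

```python
import re

# Hypothetical surface patterns, NOT the pattern set used in the study.
PATTERNS = {
    "SOURCE": re.compile(r"\bfrom\s+(?:the\s+|an?\s+)?(\w+)", re.IGNORECASE),
    "GOAL": re.compile(r"\bto\s+(?:the\s+|an?\s+)?(\w+)", re.IGNORECASE),
}


def extract_candidates(definition: str) -> dict:
    """Return the first word matched for each role, if any."""
    found = {}
    for role, pattern in PATTERNS.items():
        match = pattern.search(definition)
        if match:
            found[role] = match.group(1)
    return found


def precision(correct: int, extracted: int) -> float:
    """Share of extracted entries that are correct (0.0 if nothing was extracted)."""
    return correct / extracted if extracted else 0.0


# Invented example definition.
definition = "Transfer of funds from the buyer to the seller"
print(extract_candidates(definition))

# Assumed reading: 57 correct entries among 190 extracted ones.
print(round(precision(57, 190), 2))
```

Such a pattern also matches infinitives, e.g. ‘to influence’, and would return
‘influence’ as a GOAL candidate, which illustrates why purely surface-based patterns
produce many false positives.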
     The low precision was mainly due to the chosen approach, which relied on the sur-
face structure of linguistic expressions without considering their meaning in context. Ad-
ditionally, the choice of patterns and linguistic expressions generally has a strong influ-
ence on the results [13]. Although in finance one would expect an abundance of PATH
schemas because transactions are central to the domain, a surprisingly high number of
abstract and concrete objects (e.g. bonds, debit cards), entities (e.g. institutions, agents),
abstract strategies (e.g. hedging), measurements (e.g. exchange rate) among others were
present in our data and identified by our patterns. Additionally, the type of transactions
we found was very different as was the nature of the PATH-schematic structures they re-
ferred to. For instance, a simple transaction of buying and selling is very different from,
e.g. ‘painting the tape’ (IATE: 927775), a market manipulation strategy that utilizes a
series of transactions, i.e., a MOVEMENT IN LOOPS, to influence price movements. We
consider the analysis of the exact PATH schemas in natural language as useful and also
identified new schematic structures presented above. However, for further experiments a
more refined approach to extracting image schemas that goes beyond the surface
structure is required. Alternatives to a pattern-based approach, such as a construction
grammar for image schemas [2] or deep natural language analysis, might yield improved
results. However, the size of the data set makes this scenario a poor candidate for
machine learning.
     Low numbers of human judges are a common issue in semantic annotation tasks of
any kind [16]. The low number of native speakers in our analysis might also have created
an unwanted bias. Although we did specify basic criteria for definitions qualifying as
image-schematic structures, the final decision might be subjectively biased due to the
low number of judges. We did, however, evaluate the quality of the schema identification
process by means of the final cross-linguistic comparison, which made us re-evaluate
each individual schema candidate in each language. In this comparison the number of
identical schemas detected across languages was rather high, at more than
50%, and among the non-identical ones the variation was frequently reduced to an omission
of SOURCE or GOAL. One way to reduce this bias is to have a larger
sample of analysts perform the image-schematic mapping. This should primarily serve as
a method to obtain a gold standard, since a stronger level of automation for
the actual method is needed at the same time.

6.2. Results discussion

A clear preference for the SOURCE PATH GOAL schema could be observed in all
languages. In contrast, [15] claimed that PATH GOAL is more important and in fact more
prevalent in the (pre-linguistic) usage of schemas by adults and children, an argument
that is supported by the findings of [13]. They presumed that children do not require
SOURCEs to conceptualize a PATH GOAL, which is why the SOURCE is often omitted in
cross-linguistic analyses of image schemas. Our experiment could not provide strong
evidence for or against this claim. Although there is a slight increase of PATH GOALs over
SOURCE PATHs, the predominant schema still explicitly contains the SOURCE. In fact,
in Italian a predominance of SOURCE PATH over PATH GOAL could be observed, but this
requires more extensive investigation.
      The definition adopted here [15] is that image schemas are not just gestalts but
conceptual structures. The omission and/or addition of a SOURCE or GOAL changes the
perspective of the schema [13]. It is important to differentiate whether the description
explicitly states that an agent transfers an OBJECT or that an OBJECT is being transferred to a
beneficiary [13]. Along the same line of argumentation we claim that the directionality of
the path as well as the number of paths and objects involved in a SOURCE PATH GOAL
schema influence the perspective of the conceptualization. These two influential
variables on the basic underlying schema as well as the four new image-schematic structures
we identified can be considered specifications of the overall MOVEMENT ALONG PATH
schema.
      Some of the terms were defined as combinations of image schemas. While we here
looked at only PATH-following, we noticed that many concepts would have been better
described as combinations of a member of the PATH-following family and additional
image schemas or image-schematic structures such as SCALING or CONTAINMENT,
so-called conceptual integrations [15]. Such integrations as well as conceptual blends [6, 10]
repeatedly surfaced in our analysis, as did different FORCES that might be exerted on a
schema. We consider this point important to investigate in future studies.
     Our analysis revealed differences across the four languages which could partially
be explained by grammatical decisions of the terminologists/experts, partially also by
inconsistencies across languages. While the sample in our experiment is considerably
too small for any generalized conclusions, the results hint at a high persistence of image
schemas across languages. The exact nature of movement along a path could
be analyzed in more detail, for instance by investigating whether financial descriptions
consider the manner of movement, as done by [19] for a more general corpus.
     Prepositions and verbs returned the most promising results in most bottom-up ap-
proaches [1, 7, 13], which we could not confirm in our experiment. Synonym sets of
nouns returned most image-schematic candidates here. However, this might be attributed
to our selection of prepositions and verbs rather than the domain and not represent a
contradiction to previous findings.


7. Conclusion and future work

The presented method illustrates how some essential aspects of complicated terms and
concepts can be described by using image schemas as a means for simplification.
Our analysis contributes two dimensions and four specifications to the most central
SOURCE PATH GOAL image-schematic structure. While in this study PATH-following
was the only image schema considered, in future work more image schemas should
be analyzed to better explain the concepts. In fact, conceptual blending and
image-schematic integrations, such as PATH and CONTAINMENT, repeatedly surfaced during the
analysis and could be structured as a paper on their own.
     For this first experiment, we exclusively focused on the natural language definitions
associated with entries in four languages. In future work it would be interesting to eval-
uate the image-schematic consistency between the definition and the term that it defines.
Additionally, the contrast of the definitions analyzed and the use of the terms in contexts
of texts provided by financial experts might provide further interesting insights into the
relation of natural language and image schemas. A comparison of our results to other
domains of discourse could further strengthen our claim of a domain- and language-
independent existence of image-schematic structures.
     This approach not only contributes to image schema research by showing that the
developmentally most relevant building blocks of our cognitive inventory are carried
to abstract adult communication, but also strengthens the idea that image schemas are
linguistically and cognitively universal since they exist across languages. The practical
use of this approach not only lies in the relation of image schemas and natural language,
but since the basis is provided by a formalized theory of PATH-following it also explores
the relation between lexical and model-theoretic semantics. In this sense, we believe
that this image-schematic method provides an interesting approach to learning spatial
ontologies from multilingual text to be explored further in future experiments. Since
manual ontology engineering is cumbersome and error prone, automated approaches are
required.
    We believe that the combination of linguistic and formal analysis of image-
schematic structures across languages can allow for their more specialized use in auto-
mated approaches and computational systems. Thus, future work will focus on the au-
tomation of image-schematic extractions from multilingual textual evidence based on
formalized theories. This also includes exploring interconnections of image schemas in
the form of integrations as well as conceptual blending.


Acknowledgments.

The project COINVENT acknowledges the financial support of the Future and Emerg-
ing Technologies (FET) programme within the Seventh Framework Programme for Re-
search of the European Commission, under FET-Open Grant number: 611553.

The IIIA part of this work has been funded by the European Community’s Seventh
Framework Programme (FP7/2007-2013) under grant agreement No. 567652
(ESSENCE: Evolution of Shared Semantics in Computational Environments).


References

 [1] B. Bennett and C. Cialone. Corpus guided sense cluster analysis: a methodology for
     ontology development (with examples from the spatial domain). In P. Garbacz and
     O. Kutz, editors, 8th International Conference on Formal Ontology in Information
     Systems (FOIS), volume 267 of Frontiers in Artificial Intelligence and Applications,
     pages 213–226. IOS Press, 2014.
 [2] B. Bergen and N. Chang. Embodied construction grammar in simulation-based lan-
     guage understanding. Construction grammars: Cognitive grounding and theoreti-
     cal extensions, 3:147–190, 2005.
 [3] O. Fuhrman, K. McCormick, E. Chen, H. Jiang, D. Shu, S. Mao, and L. Boroditsky.
     How linguistic and cultural forces shape conceptions of time: English and Mandarin
     time in 3D. Cognitive science, 35(7):1305–1328, 2011.
 [4] Pragglejaz Group. MIP: A method for identifying metaphorically used words in discourse.
     Metaphor and symbol, 22(1):1–39, 2007.
 [5] B. Hampe and J. E. Grady. From perception to meaning: Image schemas in cogni-
     tive linguistics, volume 29 of Cognitive Linguistics Research. Walter de Gruyter,
     Berlin, 2005.
 [6] M. M. Hedblom, O. Kutz, and F. Neuhaus. Choosing the right path: image schema
     theory as a foundation for concept invention. Journal of Artificial General Intelli-
     gence, 6(1):22–54, 2015.
 [7] M. Johanson and A. Papafragou. What does children’s spatial language reveal about
     spatial concepts? Evidence from the use of containment expressions. Cognitive
     science, 38(5):881–910, June 2014.
 [8] M. Johnson. The body in the mind: the bodily basis of meaning, imagination, and
     reason. The University of Chicago Press, Chicago and London, 1987.
 [9] Z. Kövecses. Metaphor: A Practical Introduction. Oxford University Press, USA,
     2010.
[10] W. Kuhn. An image-schematic account of spatial categories. In S. Winter, M. Duck-
     ham, L. Kulik, and B. Kuipers, editors, Spatial information theory, pages 152–168.
     Springer Berlin Heidelberg, 2007.
[11] G. Lakoff. Women, fire, and dangerous things: What categories reveal about the
     mind. The University of Chicago Press, 1987.
[12] G. Lakoff and R. Núñez. Where Mathematics Comes From: How the Embodied
     Mind Brings Mathematics into Being. Basic Books, 2000.
[13] L. Lakusta and B. Landau. Starting at the end: the importance of goals in spatial
     language. Cognition, 96(1):1–33, 2005.
[14] J. M. Mandler. The foundations of mind: origins of conceptual thought.
     Oxford University Press, New York, 2004.
[15] J. M. Mandler and C. Pagán Cánovas. On defining image schemas. Language and
     Cognition, 0:1–23, May 2014.
[16] I. Mani and J. Pustejovsky. Interpreting motion: Grounded representations for spa-
     tial language. Number 5 in Explorations in Language and Space. Oxford University
     Press, 2012.
[17] L. McDonough, S. Choi, and J. M. Mandler. Understanding spatial relations: Flex-
     ible infants, lexical adults. Cognitive Psychology, 46(3):229–259, 5 2003.
[18] R. E. Núñez and E. Sweetser. With the future behind them: Convergent evidence
     from Aymara language and gesture in the crosslinguistic comparison of spatial
     construals of time. Cognitive science, 30(3):401–450, 2006.
[19] A. Papafragou, C. Massey, and L. Gleitman. When English proposes what Greek
     presupposes: The cross-linguistic encoding of motion events. Cognition, 98(3):B75–
     B87, 2006.
[20] J. Piaget. The origins of intelligence in children. NY: International University Press,
     New York, 1952. Translated by Margaret Cook.
[21] F. Santibáñez. The object image-schema and other dependent schemas. Atlantis,
     24(2):183–201, 2002.
[22] L. Shapiro. Embodied cognition. New problems of philosophy. Routledge, London
     and New York, 2011.
[23] R. St. Amant, C. T. Morrison, Y.-H. Chang, P. R. Cohen, and C. Beal. An image
     schema language. In International Conference on Cognitive Modeling (ICCM),
     pages 292–297, 2006.
[24] R. Stevens and N. Aussenac-Gilles. Ontoenrich: A platform for the lexical analysis
     of ontologies. Knowledge Engineering and Knowledge Management: EKAW 2014
     Satellite Events, VISUAL, EKM1, and ARCOE-Logic, Linköping, Sweden, Novem-
     ber 24-28, 2014. Revised Selected Papers., 8982:172, 2015.
[25] M. Y. Tseng. Exploring image schemas as a critical concept: Toward a critical-
     cognitive linguistic account of image-schematic interactions. Journal of Literary
     Semantics, 36(2):135–157, 2007.