<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Journal of the ACM 38 (1991)
935-962.
[18] E. Muñoz</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.1016/j.artmed.2022.102486</article-id>
      <title-group>
        <article-title>A Post-Modern Approach to Automatic Metaphor Identification</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Dario Del Fante</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Federico Manzella</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Guido Sciavicco</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Eduard I. Stan</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Free University of Bozen-Bolzano</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Ferrara</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>13796</volume>
      <fpage>0000</fpage>
      <lpage>0002</lpage>
      <abstract>
        <p>This paper provides the theoretical bases for a symbolic approach to text classification, particularly metaphor identification, that generalizes the existing ones and is inspired by similar generalizations of symbolic approaches to learning models for non-text-related tasks.</p>
      </abstract>
      <kwd-group>
        <kwd>Automatic metaphor detection and interpretation</kwd>
        <kwd>Symbolic learning</kwd>
        <kwd>NLP</kwd>
        <kwd>Modal logic</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Much work has been devoted to discussing the metaphor identification and interpretation process, as in [6]. In this sense, a qualitative approach represents the safest methodology since metaphors regard an aspect of language that occasionally can be ambiguous. For example, two speakers from the same linguistic and cultural context can interpret the same metaphor differently. However, this approach is time-consuming and requires at least more than two human coders to be effectively reliable. Despite this phenomenon, it remains a computationally hard task given the many structural problems that make automatic identification not quickly effective [7]. Scholars between digital humanities and computational linguistics have developed different approaches to support automatic identification. Indeed, the recent improvements regarding artificial intelligence and machine learning might consistently impact metaphor research regarding the time and quantity of analyzed text [8].</p>
      <p>From a computational point of view, metaphor identification is a particular case of text classification. The recent literature on general text classification, particularly metaphor identification, is quite broad and includes both top-down approaches [9] and bottom-up ones. Top-down approaches start from a human-designed theory of the phenomenon, which is later digitalized to provide automatic identification. Bottom-up, or data-driven ones, on the other hand, aim to perform identification starting from a dataset of examples. Bottom-up strategies can be, in turn, separated into symbolic and sub-symbolic approaches. Sub-symbolic approaches, commonly realized via several types of neural networks, produce black-box models which in some cases can be very accurate [10]. Along with the application of pre-trained and large language models they currently are a de-facto standard for text-related learning tasks, and quite a lot of results exist even in the narrow field of metaphor identification (see, among many others, [11, 10, 12, 13, 14]). Conversely, the purpose of a symbolic approach is to provide an identification model and a statistically validated theory of the phenomenon, written in a suitable logical language. While symbolic systems are sometimes used for text-related tasks in general, their application to the case of metaphor identification needs to be addressed.</p>
      <p>Metaphors involve talking and, potentially, thinking of one thing in terms of another; the two things are different, but we can perceive sets of correspondences between them. In other words, a metaphor corresponds to moving a word or phrase from the context in which it is expected to occur to another context, where it is not expected to occur [1]. Metaphors are ubiquitous in language [2]: they cannot be considered a purely artistic ornament that pertains exclusively to literary discourse, since they are essential for the development of language and culture [3, 4, 5].</p>
      <p>[Figure 1: a sentence shown as a sequence of nine word positions, numbered 1 to 9.]</p>
      <p>In a rule such as ‘if flood of immigrants occurs then metaphor’, the phrase ‘flood of immigrants’ is a 2-gram (shown here before tokenization, stemming, and stop-word elimination), and the learned rule checks whether or not that particular 2-gram occurs. Towards an abstract representation, 2-grams can be encoded into propositional letters, which can represent not only their occurrence but also other interesting properties, such as the number of times that they occur. In the end, a text is represented as a model of propositional logic, and a (set of) propositional rule(s) can be statistically learned from a dataset of texts.</p>
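      <p>The encoding of a text as a propositional model can be illustrated with a minimal sketch, given here under explicit assumptions: the stop-word list, the letter names <italic>occurs</italic> and <italic>at_least</italic>, and the helper functions are hypothetical and introduced only for illustration, and stemming is omitted for brevity. The sketch builds the set of propositional letters that are true on a text and evaluates one hand-written rule on it; in the approach discussed here, such rules would be learned from data rather than written by hand.</p>
      <preformat><![CDATA[
```python
from collections import Counter

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"a", "an", "the", "of", "and", "in", "on", "as"}


def two_grams(text):
    """Return the 2-grams of a text after lowercasing and stop-word removal."""
    tokens = [t.strip(".,;:!?'\"") for t in text.lower().split()]
    tokens = [t for t in tokens if t and t not in STOP_WORDS]
    return list(zip(tokens, tokens[1:]))


def propositional_model(text, max_count=3):
    """Return the set of propositional letters that hold on the text.

    ("occurs", g) holds if the 2-gram g occurs at all;
    ("at_least", g, k) holds if g occurs k times or more,
    for k from 2 up to max_count.
    """
    counts = Counter(two_grams(text))
    letters = set()
    for gram, n in counts.items():
        letters.add(("occurs", gram))
        for k in range(2, min(n, max_count) + 1):
            letters.add(("at_least", gram, k))
    return letters


def rule_metaphor(model):
    """Hand-written rule: if 'flood of immigrants' occurs then metaphor."""
    return ("occurs", ("flood", "immigrants")) in model


text = "The city faced a flood of immigrants, and the flood of immigrants continued."
model = propositional_model(text)
print(rule_metaphor(model))
```
]]></preformat>
      <p>Representing the model as a set of true letters keeps the rule a plain membership test, which mirrors the propositional reading of the encoding.</p>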
    </sec>
    <sec id="sec-3">
      <title>2. A Logic-Based Post-Modern Approach</title>
      <p>Symbolic and sub-symbolic approaches to text-related tasks are different in spirit. In both cases, the key idea is to provide a representation of the text that is later used for learning. However, in the case of sub-symbolic strategies, such a representation, usually referred to as embedding, is numerical. The most famous examples of sub-symbolic representations are (all variants of) vectorizations of tokens (i.e., words, sentences, or paragraphs). Each token is mapped to a point of a high-dimensional space so that mathematical tools can be used to reason about texts, and a learned model, for example, for metaphor identification, takes the form of a mathematical function.</p>
      <p>In symbolic approaches, on the other hand, we encode a token (typically, an entire sentence or paragraph) as a logical model. In the simplest cases, following the so-called bag-of-words methodology, a text is encoded starting from a fixed (arbitrarily long) dictionary; it is translated into a binary vector of length N, where N is the size of the dictionary and the i-th component takes the value 1 if and only if the i-th word of the dictionary occurs in the text. Text-based encodings are easily generalized along two directions: bag-of-words become bag-of-n-grams, and vector components become counters, so that the i-th component takes value k if and only if the i-th n-gram of the fixed n-gram vocabulary occurs exactly k times in the text. (In this context, n-grams are not used in their canonical, probabilistic version, that is, to predict the n-th element from the previous n − 1 ones, but, instead, in their crisp one, that is, as a straightforward generalization of single words.) In most cases, experiments show that using 2-grams attains the best compromise between the computational complexity of the task and the performance of the learned models. The logical interpretation of symbolic encoding emerges by introducing propositional letters to represent the text by the presence of relevant n-grams. Simplifying, a symbolic classification model can be described by (sets of) rule(s) of the type:</p>
      <p>If ‘flood of immigrants’ occurs then metaphor.</p>
      <sec id="sec-3-1">
        <p>A further generalization of symbolic text-based encodings requires two steps: generalizing the concept of n-gram and increasing the expressive power of the logic that we use to describe texts. Both ideas are simple.</p>
        <p>Focusing on 2-grams, specifically, the most natural generalization consists of eliminating the constraint that two words must be next to each other to form a 2-gram. So a generalized 2-gram can be defined as any pair of words in which the first precedes the second, whether or not they are adjacent. Such a generalization has two main consequences: first, the label of a generalized 2-gram may be much richer than the label of a standard one, and second, the encoding of a text using generalized 2-grams can be much more expressive than the encoding of the same text using standard ones.</p>
        <p>Let us focus on labeling. As explained above, a standard 2-gram is logically labeled using (the number of times) that it occurs. A generalized 2-gram, on the other hand, can be labeled using the occurrences of the words in between. In Fig. 1, we see the abstract idea of a generalized 2-gram: the words at positions 3 and 7 form a generalized 2-gram (they are two, possibly non-consecutive, words) and, in the encoding, they are represented by a propositional letter. The meaning of such a propositional letter is no longer limited to depend on the occurrence of the two words, either separately or together. On the contrary, one can use the entire sentence between them to build it; examples may range from the topic of the sentence, to its length, to the semantic category of any word between the extremes of the generalized 2-gram, and so on.</p>
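        <p>As a minimal sketch of the generalized 2-grams and their labeling discussed in this section, the following fragment enumerates every pair of word positions in a tokenized sentence and assigns a truth value to one illustrative label. The function names, the toy lexicon <italic>FLUID_WORDS</italic>, and the threshold <italic>min_hits</italic> are assumptions made for illustration only; in particular, deciding a topic by counting lexicon hits is a stand-in for whatever labeling procedure is actually adopted.</p>
        <preformat><![CDATA[
```python
from itertools import combinations

# Hypothetical topic lexicon, for illustration only.
FLUID_WORDS = {"wave", "flow", "flood", "swirling", "stream"}


def generalized_two_grams(words):
    """Yield (i, j, in_between) for every pair of positions with i before j.

    Standard 2-grams are the special case in which in_between is empty.
    """
    for i, j in combinations(range(len(words)), 2):
        yield i, j, words[i + 1:j]


def has_topic(words, i, j, lexicon, min_hits=2):
    """Truth value of a hypothetical 'topic' letter on the span from i to j.

    The span includes both extremes; the letter holds when at least
    min_hits words of the span belong to the lexicon.
    """
    span = words[i:j + 1]
    return sum(w in lexicon for w in span) >= min_hits


words = "they arrive forming a continuous wave an endless flow".split()
grams = list(generalized_two_grams(words))
fluid_spans = [(i, j) for i, j, _ in grams if has_topic(words, i, j, FLUID_WORDS)]
print(len(grams), fluid_spans)
```
]]></preformat>
        <p>A sentence of m words yields m(m − 1)/2 generalized 2-grams, against m − 1 standard ones, which gives a concrete sense of the cost of the richer encoding.</p>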
      </sec>
      <sec id="sec-3-2">
        <p>Concerning the expressive power of the encoding, now observe that in the standard text-based approaches, the relative ordering of the original sentences is lost, while only the order of the constituents of each n-gram is preserved. Generalized 2-grams, instead, are naturally linked to a qualitative, more-than-propositional logic that allows one to preserve the ordering in a very expressive way. The key idea is that a sentence can be seen as a linearly ordered sequence of words, which entails, in turn, a temporal order, as also proposed in other models, such as BiLSTMs [15, 16]. Thus, a generalized 2-gram is an interval in such an order, and any two intervals on a linear order can be qualitatively related to each other in exactly one of thirteen ways. The family of logics that allow one to describe propositional properties of intervals on a linear order is called interval temporal logics, and they belong to the more general category of modal logics.</p>
        <p>Originally studied by Allen in the early 80s, interval temporal logics were formalized a few years later, and the most representative language for expressing propositional properties of intervals is the modal logic of time intervals, or HS [17]. In HS, each of the possible binary relations that may exist between two intervals becomes an accessibility relation; it can be immediately verified that they are, in fact, thirteen, namely the following six, their inverses, and equality: after (capturing an interval that starts at the end of the current one, usually denoted by ⟨A⟩), later (capturing an interval that starts past the end of the current one, ⟨L⟩), overlaps (capturing an interval that starts during the current one and ends after it, ⟨O⟩), during (capturing an interval that starts and ends within the current one, ⟨D⟩), begins (capturing an interval that starts at the start of the current one and ends before it, ⟨B⟩), and ends (capturing an interval that starts within the current one and ends with it, ⟨E⟩). Working with the relations/operators as they were originally introduced may not always be suitable; interval temporal logics such as HS have been simplified for specific tasks in several ways. Among them, the most relevant proposals include the so-called topological versions of interval temporal logic, in which the relations are, in fact, disjunctions of Allen’s relations. So, for example, in the case of HS3 [18], two intervals can just have at least one point in common or can be completely separated; languages such as this one may in fact be designed, and their expressive power modulated, depending on the task.</p>
        <p>Symbolic learning algorithms for interval temporal logic have been recently studied [19] and used for learning interval temporal properties in very different contexts, mostly, but not exclusively, in the medical sciences (see [20, 21], among others); in those cases, the objects being encoded are multi-variate temporal series, via a process that eventually produces interval temporal models from which rules are ultimately learned. It is worth noting that such diverse contexts, including text-related tasks, can in fact be approached with the same methodology. Continuing with the example in Fig. 1, the generalized 2-gram formed by the words at positions 4 and 6 is during the generalized 2-gram formed by the words at positions 3 and 7.</p>
        <p>[Figure 2, top text] These people. They arrive forming a continuous wave, an endless flow that changes societies at all levels, swirling together different and irreconcilable cultures. These are the migrants, often considered a problem.</p>
        <p>[Figure 2, bottom text] These people, the migrants, arrive on dilapidated boats at the mercy of the waves and flows. They risk their lives and when they arrive they are often rejected, because, it is believed, they risk changing societies at all levels, including cultural ones.</p>
        <p>[Figure 2, rules] ⟨⟩(topic ‘migrants’ ∧ ⟨⟩topic ‘fluid’) ⇒ metaphor; ⟨⟩(topic ‘migrants’ ∧ ¬[]topic ‘fluid’) ⇒ not a metaphor.</p>
        <p>In Fig. 2, we show how, in a text, relevant generalized 2-grams are identified; in both texts, two generalized 2-grams are identified. Focusing on the top paragraph, the first generalized 2-gram, in red color, is captured by the words people and migrants; the entire text in between (even ignoring the full stop, thus ignoring that they belong to two different sentences) is categorized as topic ‘migrants’, thus imitating a human reader who, reading the complete text, can identify when the writer starts referring to some category of persons, when he/she stops doing that, and which one this category is. The second generalized 2-gram, in blue color, is captured by the words wave and swirling, and the entire text in between is categorized as topic ‘fluid’ (observe the frequencies of words that refer to fluids, and water in particular, that occur in the blue-highlighted text). The bottom paragraph shows similar words in a similar but not identical order. Both topics are still present and identified in the same way. However, the two topics are in a different topological order. On the right-hand side of the figure, we propose a possible rule, written in propositional HS, that uses the topics’ topological order to distinguish between metaphoric and non-metaphoric text. Most interestingly, ChatGPT (version 3.5, consulted on the prompt in September 2023) classifies both texts as metaphoric, probably because metaphors linking fluids and migrants are statistically common.</p>
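        <p>The qualitative relations between intervals described in this section can be made concrete with a short sketch, given under stated assumptions. Intervals are pairs of word positions with the start strictly before the end; the relation names follow the reading used above, in which the relation describes where the second interval lies with respect to the current one. The rule <italic>fluid_during_migrants</italic> and the word positions in the two toy inputs are hypothetical: they are one possible reading of the example in Fig. 2, not the rule proposed there.</p>
        <preformat><![CDATA[
```python
def _forward(a, b):
    """Return one of the six basic relation names, or None if none applies."""
    (x, y), (z, t) = a, b
    if z == y:
        return "after"      # b starts at the end of a
    if z > y:
        return "later"      # b starts past the end of a
    if x < z < y < t:
        return "overlaps"   # b starts during a and ends after it
    if x < z and t < y:
        return "during"     # b starts and ends within a
    if z == x and t < y:
        return "begins"     # b starts with a and ends before it
    if x < z and t == y:
        return "ends"       # b starts within a and ends with it
    return None


def allen_relation(a, b):
    """Name the unique relation, out of thirteen, from interval a to interval b."""
    if a == b:
        return "equals"
    name = _forward(a, b)
    if name is not None:
        return name
    # Otherwise one of the six basic relations holds from b to a.
    return _forward(b, a) + "_inverse"


def fluid_during_migrants(labelled):
    """Hypothetical rule: some 'fluid' interval lies during a 'migrants' one."""
    return any(
        allen_relation(m, f) == "during"
        for m in labelled.get("migrants", [])
        for f in labelled.get("fluid", [])
    )


# Illustrative word positions only; the actual spans depend on tokenization.
top_text = {"migrants": [(1, 26)], "fluid": [(7, 17)]}
bottom_text = {"migrants": [(1, 3)], "fluid": [(13, 15)]}

print(allen_relation((3, 7), (4, 6)))
print(fluid_during_migrants(top_text), fluid_during_migrants(bottom_text))
```
]]></preformat>
        <p>A topological fragment such as HS3 would correspond, in this sketch, to grouping the thirteen names into a smaller number of coarser classes before evaluating a rule.</p>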
      </sec>
    </sec>
    <sec id="sec-4">
      <title>3. Conclusions</title>
      <sec id="sec-4-1">
        <p>We will further verify our hypotheses by conducting some tests on an annotated newspaper corpus, which was human-labeled as metaphor or non-metaphor, consisting of 13,000 tokens and 2,000 different words. The label pertains to the entire text, and the task will regard recognizing its metaphoric expressions.</p>
        <p>This work represents an initial attempt to approach symbolic learning for text-related tasks like metaphor detection. A symbolic approach can extract a theory from a specific linguistic phenomenon, which raises at least three problems: first, determining whether a theory of a phenomenon should exist and in what terms; second, finding the appropriate logic for the extraction process; and third, ensuring the existence of an automatic method for extracting the theory in that logic. In this work, we have attempted to address the first and second points, and we did so using a logical formalism for which a solution to the third one already exists. Should this approach be successful, it can be used to address other text-related challenges, such as all variants of text classification. Additionally, our generalized 2-gram encoding can be further generalized to partially benefit from well-known word-to-vec approaches without compromising its symbolic essence.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>