<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>is the B of C: (Semi)-Automatic Creation of Vossian Antonomasias</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Johanna Rockstroh</string-name>
          <email>rockstro@uni-bremen.de</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giada D'Ippolito</string-name>
          <email>giadadippolito30@gmail.com</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nicolas Lazzari</string-name>
          <email>nicolas.lazzari3@unibo.it</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anouk M. Oudshoorn</string-name>
          <email>anouk.oudshoorn@tuwien.ac.at</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Disha Purohit</string-name>
          <email>d.purohit@stud.uni-hannover.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ensiyeh Raoufi</string-name>
          <email>ensiyeh.raoufi@lirmm.fr</email>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sebastian Rudolph</string-name>
          <email>sebastian.rudolph@tu-dresden.de</email>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Leibniz University Hannover</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Technical University of Vienna</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Genova</institution>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>University of Montpellier</institution>
          ,
          <addr-line>LIRMM</addr-line>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <abstract>
<p>A Vossian Antonomasia (VA) is a stylistic device used to describe a person (or, more generally, an entity) in terms of a well-known person and a modifying context. For instance, the Norwegian chess world champion Magnus Carlsen was described as “the Mozart of chess” [1]. All VAs follow the pattern where a source (e.g., “Mozart”) is used to describe a target (e.g., “Magnus Carlsen”), and the transfer of meaning is “channeled” through the use of the modifier “of chess”. Although this rhetorical figure is well-known, there has not yet been a dedicated study of targeted automatic or semi-automatic methods to generate and judge the appropriateness of VAs using large Knowledge Graphs (KGs) such as Wikidata. In our work, we propose the use of vector space embeddings, both KG-based and text-based, for producing VAs. For comparison, we contrast our findings with a purely LLM-based approach, wherein VAs are obtained from ChatGPT using a reasonably engineered prompt. We provide a publicly available GitHub repository for the implementation of our method and a website that allows testing the proposed methods.</p>
      </abstract>
      <kwd-group>
        <kwd>Antonomasias</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-2">
      <title>1. Introduction</title>
<p>
        The question of whether computational methods can be used as creative devices can be traced back to the beginning of computers, when Ada Lovelace wondered about the endless possibilities of automatic calculators [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Even though Artificial Intelligence techniques have largely been used in creative applications [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], the evaluation of such creative outputs remains problematic [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. In this work, we propose the generation of Vossian Antonomasias (VAs) as a benchmark for exploring the creativity of AI methods. (The website is available at https://antonomasia.informatik.uni-bremen.de/. Note that, for efficiency reasons, a restricted set of entities is available to users.)
      </p>
<p>VAs are a popular stylistic device for describing one entity by referring to another, typically in a witty and resourceful manner. A VA consists of three parts: a target entity A, a source entity B, and a modifier C, and is generally expressed as</p>
      <p>A is the B of C.</p>
<p>A meaningful VA requires a non-trivial degree of creativity and extensive knowledge of the specifics of the target entity. One has to identify a set of salient characteristics of A that is similarly, or even more prominently, realised by B. It is fundamental, however, that A and B differ when compared using the modifier C. For instance, in the sentence “Nacho Figueras is the Brad Pitt of polo players”, Ignacio Figueras, among the most famous polo players in the world, is compared to the actor Brad Pitt due to his appearance.1</p>
<p>VAs are often used in many journalistic genres and frequently appear in headings, as they can be informative, enigmatic, and entertaining at the same time. In general, B is a well-known, widely recognized entity. Through its popularity, the writer encourages readers to classify A as similar to B, despite the difference C.</p>
<p>In this work, we present a method to automatically generate VAs by exploiting the latent semantic capabilities of vector space embeddings. We extract potential candidates for B by using SPARQL queries over Wikidata. We rely on a heuristic method to select entities that can be classified as popular. By relying on publicly available Knowledge Graph Embeddings (KGE) trained on Wikidata, we compute the vector representations for an arbitrary A, the identified set of B candidates, and a restricted set of C modifiers. We experiment with different operations between vectors to select the best B candidate.</p>
<p>
        In order to investigate the efficacy of each experiment, we compare the use of KGE with word embeddings obtained from large corpora of text [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Additionally, given the recent surge of Large Language Models (LLMs) used to mine creative analogies [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], we rely on ChatGPT [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] as an additional baseline. We evaluate each method through a user-evaluation study.
      </p>
<p>Our contributions can be summarised as follows:
1. Identification of a suitable pool of candidates that can serve as B elements;
2. Proposal of a novel method to automatically generate Vossian Antonomasias;
3. Evaluation of the proposed method using a user evaluation study.</p>
<p>The paper is organised as follows: in Section 2 we describe related work, which is followed by the presentation of the implemented method in Section 3. In Section 4, we discuss the results produced by the methods of Section 3 and present the outcomes of the user evaluation in Section 5. We finish by drawing conclusions and providing an outlook in Section 6. (1: This and many more examples of VAs extracted from a newspaper corpus can be found at https://vossanto.weltliteratur.net/emnlp-ijcnlp2019/vossantos.html.)</p>
    </sec>
    <sec id="sec-3">
      <title>2. Related Work</title>
<p>
        There has been limited research on automatically detecting Vossian Antonomasias in written text. The authors of [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] demonstrated that, by using Wikidata, they were able to overcome the shortcomings of available Named Entity Recognition (NER) tools, and confirmed that VA is a linguistic and cultural phenomenon. Through quantitative VA explorations, they were able to capture the phenomenon as a whole, encompassing the source, target, and, when available, modifier. Their approach involves searching for a network of individuals interconnected by diverse modifiers, where the nodes can function as either sources or targets. This network aids in understanding hidden patterns of role models, revealing how they vary across countries and languages. However, a limitation they acknowledged is their reliance on the most prevalent pattern of VA, namely “the ... of”, which resulted in the omission of numerous expressions extracted from the New York Times (NYT) corpus. For instance, notable phrases such as “the American Oscar Wilde” and “Harlem’s Mozart” were overlooked despite their significance. Another approach for the extraction of VAs is presented by [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. The focus there is on the extraction of the target by using coreference resolution and on visualising the connections between the source and target entities extracted in the VAs in the form of a web demo. The authors of [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] use neural networks for the end-to-end detection of VAs, resulting in two models: one for binary sentence classification and another for sequence tagging of all parts of the VA on the word level.
      </p>
<p>
        As opposed to the work described above, our approach focuses on the generation of VAs rather than their detection. Similarly to the approach of [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], our method only focuses on the pattern “the ... of”, exploiting the latent semantic space of Knowledge Graph Embeddings and word embedding methods. Knowledge Graph Embeddings compute a vectorial approximation of the originating Knowledge Graph through the use of various geometrical intuitions. In TransE [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], predicates and entities are modelled as translations in the vector space. Given a triple (h, r, t), the vector embeddings of the head entity h, the predicate (or relation) r, and the tail entity t are computed to minimise the quantity |h + r − t|, i.e., t should be close to h + r. Increasingly complex methods have been presented in the literature [11].
      </p>
<p>Building on the distributional hypothesis, word embedding methods [12] compute vectors from the distribution of words in large corpora of text. Examples include GloVe [13], where the representation is obtained using word co-occurrence statistics, and word2vec [14], where words with similar contextual distributions are mapped to similar vectors.</p>
    </sec>
    <sec id="sec-4">
      <title>3. Methodology</title>
      <p>This section details our proposed approach to automatically generate VAs, as outlined in
Section 1. Figure 1 provides a high-level summary of our approach.</p>
      <sec id="sec-4-1">
        <title>3.1. Wikidata as a Knowledge Resource</title>
<p>We rely on Wikidata to identify a set of candidates that can serve as B entities. This allows us to benefit from a large number of triples (according to Wikidata’s statistics, the knowledge graph currently contains 104,204,236 items [15]) with broad coverage of encyclopedic knowledge, which enables us to select a sufficient sample of candidates for every component of the Vossian Antonomasias. Additionally, Wikidata provides a more structured and consistent data model when compared to similar resources, such as DBpedia [16]. Moreover, we can leverage the language-independent design of Wikidata, as opposed to DBpedia [16], to ensure that the targets are widely popular.</p>
        <p>[Figure 1: overview of the approach. Target (A) is the source (B) of modifier (C). Extract: extracting B’s from Wikidata using SPARQL queries. Exploration: Knowledge Graph Embeddings (KGE) and text-based embeddings; use pre-trained models for mapping B’s into the embedding space. Evaluate: applying similarity measures to identify appropriate B’s.]</p>
        <p>We first extract the entities that will be used as  candidates by means of SPARQL queries.</p>
<p>We retrieve popular fictional characters and popular humans with the query of Listing 1. Given the large number of triples retrieved by both queries, they are executed on the Semantic Builders3 SPARQL endpoint rather than on the regular Wikidata endpoint4. This allows us to overcome the querying timeout imposed by Wikidata and extract 3815 entities. The extracted entities are used as the set of candidates for B in a VA. We rely on a heuristic method to compute the popularity of an entity: the number of worldwide available Wikipedia articles in distinct languages for an entity serves as a proxy for its popularity. Given n the number of translations of one entity, we found n ≥ 70 for real-world individuals and n &gt; 30 for fictional characters to be a good estimate.</p>
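<p>The popularity heuristic above can be sketched in Python (a minimal sketch; the function name, record layout and sitelink counts are ours, not taken from the paper’s implementation):</p>

```python
# Sketch of the popularity heuristic: an entity is kept as a candidate when
# its number of distinct-language Wikipedia articles n satisfies
# n >= 70 (real-world humans) or n > 30 (fictional characters).

def is_popular(entity_kind: str, n_sitelinks: int) -> bool:
    """Apply the type-specific sitelink threshold from the heuristic."""
    if entity_kind == "human":
        return n_sitelinks >= 70
    if entity_kind == "fictional":
        return n_sitelinks > 30
    raise ValueError(f"unknown entity kind: {entity_kind}")

# Toy candidate pool: (label, kind, sitelink count) - illustrative values only.
candidates = [
    ("Wolfgang Amadeus Mozart", "human", 188),
    ("Sherlock Holmes", "fictional", 105),
    ("Lesser-Known Person", "human", 12),
]
popular = [label for label, kind, n in candidates if is_popular(kind, n)]
```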
      </sec>
      <sec id="sec-4-2">
        <title>3.2. VA Generation using Vector Representations</title>
<p>We generate VA sentences by using geometrical transformations on the latent space provided by vector embeddings. Given an arbitrary A, we constrain the modifier C to be the occupation of the entity A. The underlying assumption of the proposed model is that, despite their different occupations, A and B need to be similar with respect to their salient features. For this reason, given a particular A, all those entities B ∈ ℬ that share the same modifier (i.e., the same occupation) are excluded from the pool of candidates. This brings us closer to ensuring the accurate selection of B in accordance with the conditions specified in Section 1. (3: https://semantic.builders/; 4: https://query.wikidata.org/)</p>
        <p>PREFIX wdt: &lt;http://www.wikidata.org/prop/direct/&gt;
PREFIX wd: &lt;http://www.wikidata.org/entity/&gt;
SELECT ?item ?itemLabel ?occupation ?sitelinks WHERE {
  ?item wdt:P31 &lt;type&gt;;
        wdt:P106 ?occupation;
        wikibase:sitelinks ?sitelinks .
  FILTER(&lt;threshold&gt; &lt; ?sitelinks).
  SERVICE wikibase:label {bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en".}
}
Listing 1: SPARQL query to extract candidate entities for B. &lt;type&gt; and &lt;threshold&gt; are replaced by wd:Q15632617 and 30 for fictional characters, and by wd:Q5 and 70 for humans.</p>
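<p>Instantiating the Listing 1 template for the two entity classes can be sketched as follows (a minimal sketch; the helper name and template string layout are ours, mirroring Listing 1 rather than reproducing the paper’s code):</p>

```python
# Substitute <type> and <threshold> from Listing 1 with the concrete values
# for humans (wd:Q5, 70) or fictional characters (wd:Q15632617, 30).

QUERY_TEMPLATE = """\
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wd: <http://www.wikidata.org/entity/>
SELECT ?item ?itemLabel ?occupation ?sitelinks WHERE {{
  ?item wdt:P31 {entity_class};
        wdt:P106 ?occupation;
        wikibase:sitelinks ?sitelinks .
  FILTER({threshold} < ?sitelinks).
  SERVICE wikibase:label {{bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en".}}
}}
"""

def build_query(entity_kind: str) -> str:
    """Return the concrete SPARQL query for 'human' or 'fictional'."""
    entity_class, threshold = {
        "human": ("wd:Q5", 70),
        "fictional": ("wd:Q15632617", 30),
    }[entity_kind]
    return QUERY_TEMPLATE.format(entity_class=entity_class, threshold=threshold)
```

<p>The resulting string can then be submitted to a SPARQL endpoint of choice.</p>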
<p>We propose two different methods for the selection of the best B candidate: a translation-based approach and a projection-based approach. The translation-based approach follows the intuition of TransE and word2vec: given A⃗, B⃗ the vector representations of two arbitrary entities, and given r⃗ the vector that represents a predicate holding between A⃗ and B⃗, it has been observed that A⃗ + r⃗ ≈ B⃗.</p>
<p>Given A⃗ = (a₁, …, aₙ) the embedding vector of an arbitrary target entity A, ℬ the set of embedding vectors of the candidate entities for B, and C⃗ = (c₁, …, cₙ) the embedding vector of the predicate that denotes the occupation of an entity (i.e., P106 in Wikidata), the translation-based method first disregards (i.e., “subtracts”) A’s occupation C, obtaining
A⃗′ = A⃗ − C⃗.   (1)
Then, we define the fitness (where smaller is better) of a candidate B⃗′ ∈ ℬ as the Euclidean distance between A⃗′ and B⃗′, i.e.,
f(A⃗′, B⃗′) = |A⃗′ − B⃗′| = √(∑ᵢ₌₁ⁿ (aᵢ′ − bᵢ′)²),   (2)
where n is the dimension of the vector embedding space.</p>
<p>The projection-based method relies on a different assumption. Informally, we would like to compute the fitness f(A⃗, B⃗′) on a subspace of the whole embedding space in which all information related to the occupation of the entities is ignored. We compute the projection of A⃗ and B⃗′ onto this subspace, which is a hyperplane perpendicular to C⃗, using
p(x⃗) = x⃗ − ((x⃗ ∘ C⃗) / (C⃗ ∘ C⃗)) C⃗,   (3)
where ∘ denotes the inner product and x⃗ is either A⃗ or B⃗′. The fitness function f is hence adjusted to the cosine distance between the projections of A⃗ and B⃗′, formally
f(A⃗, B⃗′) = (p(A⃗) ∘ p(B⃗′)) / (|p(A⃗)| |p(B⃗′)|).   (4)</p>
        <p>[Figure 2: (a) Equation 2 (translation-based) illustrated; (b) Equation 4 (projection-based) illustrated.]</p>
<p>The fitness functions in Equation (2) and Equation (4) can be seen as similarity functions between two entities. A suitable B can be extracted by taking the entity that minimises such a distance. Intuitively, in the translation-based approach, the vector representing A is supposed to be similar to the one of B after we “translate away” or “subtract” the characteristics pertaining to C using Equation (1), while in the projection-based method, by applying Equation (3), we “project away” such characteristics.</p>
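<p>For concreteness, the two fitness computations can be sketched in plain Python (a minimal sketch over toy lists of floats; the function names are ours, not those of the paper’s implementation):</p>

```python
import math

def sub(u, v):
    return [ui - vi for ui, vi in zip(u, v)]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def translation_fitness(a, b_prime, c):
    """Eqs. (1)-(2): subtract the occupation vector c from a, then take the
    Euclidean distance to the candidate b_prime (smaller is better)."""
    a_prime = sub(a, c)
    return norm(sub(a_prime, b_prime))

def project(x, c):
    """Eq. (3): project x onto the hyperplane perpendicular to c."""
    coeff = dot(x, c) / dot(c, c)
    return [xi - coeff * ci for xi, ci in zip(x, c)]

def projection_fitness(a, b_prime, c):
    """Eq. (4): cosine between the projections of a and b_prime."""
    pa, pb = project(a, c), project(b_prime, c)
    return dot(pa, pb) / (norm(pa) * norm(pb))
```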
<p>As addressed in Section 1, a good VA needs to be a creative sentence. While it is difficult to assess creativity in an objective manner, it has been argued that, among many characteristics, a creative output needs to display novelty when compared to others [17]. Given a set of entities B̂ that minimises the fitness function f, we propose to further rank such entities by using their ℓ1 norm. The intuition is that, among all the candidates in B̂, the ones with a greater distance from the origin of the vector space are the most “extremal” ones.</p>
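<p>The re-ranking step can be sketched as follows (a minimal sketch with invented candidate names and scores; the tolerance parameter is our own addition to group near-minimal candidates):</p>

```python
# Among the candidates minimising the fitness f, prefer the one whose
# embedding has the largest l1 norm, i.e. the most "extremal" vector.

def l1_norm(v):
    return sum(abs(x) for x in v)

def pick_most_extremal(candidates, fitness, tol=1e-9):
    """candidates: name -> embedding vector; fitness: name -> score
    (smaller is better). Return the near-minimal candidate with the
    largest l1 norm."""
    best = min(fitness.values())
    finalists = [name for name, score in fitness.items() if score <= best + tol]
    return max(finalists, key=lambda name: l1_norm(candidates[name]))

# Toy data: "x" and "y" tie on fitness; "x" is farther from the origin.
candidates = {"x": [3.0, 0.0], "y": [1.0, 1.0], "z": [0.0, 5.0]}
fitness = {"x": 0.10, "y": 0.10, "z": 0.70}
```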
<p>Figure 2 depicts a simplified illustration of Equation 2 and Equation 4. Using t-SNE [18], we reduce the dimensionality of the embedding vector A⃗ of a sample entity A, of the set ℬ of embedding vectors of the candidate entities for B, and of the vectors B⃗ ∈ B̂.</p>
<p>Given a particular entity A and the corresponding entity B, selected with one of the proposed methods, we generate an assertional VA following the template {A} {verb} the {B} of {C}. If the entity A has an entry in Wikidata that certifies its death, we set verb to was; otherwise we use is.</p>
      </sec>
      <sec id="sec-4-3">
<title>3.3. Purely LLM-based Baseline via ChatGPT</title>
<p>
          Finally, we use ChatGPT [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ], a Large Language Model, as a baseline for VA generation. Following the discussion of Section 1, we argue that an effective prompt for VA generation needs to reflect the following properties:
• B should not share characteristics with C;
• A and B should share at least one salient characteristic;
• A and B should be popular enough to draw the analogy in the context of C.
Through extensive experimentation, we found the following prompt to obtain the best results:
        </p>
        <p>Provide 10 Vossian Antonomasias for &lt;Name of A&gt;, where she is equated with another person. Each of the phrases should have the structure “&lt;Name of A&gt; is the [person name] of [profession]”, where [profession] must not characterize [person name]. Provide a very short justification for each example.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4. Experimental Setting</title>
<p>As briefly addressed in Section 1, we experiment with two different methods to obtain vector representations of an entity: Knowledge Graph Embeddings (KGE) and Word Embeddings (WE).</p>
<p>
        We employ TransE [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] as the KGE method. We directly reuse the publicly available model shared by GraphVite [19] and trained on the Wikidata5M dataset [20]. For the WE method, we employ word2vec [14] and GloVe [13] as provided by gensim [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
<p>Finally, we leverage meta-embedding techniques, i.e., combining different embedding methods together [21], to exploit the main advantages of both methods. We combine KGE and WE by means of concatenation and averaging. When averaging two vectors with different dimensionality, we apply zero-padding [21]. Note that, even though they are supposed to converge to similar semantics, the latent space produced by one method might differ drastically from that of another. To prevent a drastically higher influence of one method over the other, we normalise both vectors by their ℓ2 norm before combining them.</p>
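<p>The combination step can be sketched as follows (a minimal sketch; function names are ours, and the order of operations, normalise first, then pad and combine, is our reading of the description above):</p>

```python
import math

def l2_normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def combine_concat(u, v):
    """Meta-embedding by concatenation of the l2-normalised vectors."""
    return l2_normalize(u) + l2_normalize(v)

def combine_average(u, v):
    """Meta-embedding by averaging; the shorter normalised vector is
    zero-padded so that the dimensionalities match."""
    u, v = l2_normalize(u), l2_normalize(v)
    d = max(len(u), len(v))
    u = u + [0.0] * (d - len(u))
    v = v + [0.0] * (d - len(v))
    return [(ui + vi) / 2.0 for ui, vi in zip(u, v)]
```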
<p>To allow interested readers to try out the different methods themselves, we set up a demonstrator website, available at https://antonomasia.informatik.uni-bremen.de/.</p>
    </sec>
    <sec id="sec-6">
      <title>5. Results and Evaluation</title>
      <p>Due to the highly diverse nature of VAs, we decided to test the quality of the output with human
evaluation. A small selection of examples generated by the methods described in Section 3 is
presented in Table 2.</p>
<p>The selection in Table 2 shows that ChatGPT, while being very creative when it comes to the description of the domain, does not perform well in identifying a proper B that does not share characteristics with the modifier C, as in the sentence “Bill Gates is the Einstein of Societal Transformation”. This phenomenon occurs particularly with politicians and writers. Despite the explicit request in the prompt of Table 1, ChatGPT did not manage to adapt the chosen entity. The results generated by the KGE, WE and their combination mostly meet the mentioned criteria, even though some exceptions occur, such as the sentence “Angela Merkel is the Eva Braun of politics.”</p>
      <sec id="sec-6-1">
        <title>5.1. User Evaluation</title>
<p>The method described in Section 3 allows combining several different techniques to generate a VA. For the user evaluation, after manual experimentation, we restricted the set of techniques to the ones described in Table 3. The presented selection allows us to evaluate the importance of different assumptions, such as whether methods based on the distributional hypothesis can complement content-based methods. Moreover, we are able to assess whether the presented methods can overcome the issues of ChatGPT, namely the difficulty of selecting a B from a domain different from the one dictated by C.</p>
<p>
          We identify six individuals that will be used as A: Nelson Mandela, Angela Merkel, Mark Twain, Albert Einstein, Bill Gates and Ronald Reagan. Those individuals are both part of the entities extracted using the query in Listing 1 and of the real-world samples from the New York Times [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] and Der Umblätterer5 corpora. As mentioned in Section 3, we constrain C to the profession of the entity.
        </p>
<p>To recruit participants for our study, we distributed a flyer as an advertisement and shared it with colleagues and friends, who themselves distributed it further. The study could be completed online without any supervision. Due to the specificity of Vossian Antonomasia, we present a definition on the front page of the study:</p>
<p>Vossian antonomasias refer to someone by a special characteristic instead of their name. For example, calling Bill Gates “the Henry Ford of the computer age” highlights his influence as an entrepreneur and his effect on the development of technology. It is a way to describe someone by an important quality they possess.</p>
<p>The study was open for one week. We provide each participant with 21 sentences. We randomly sample 3 As from the set described above and provide 7 VAs for each, one for every generation method of Table 3. The selected VAs are randomly sampled from the top 10 sentences identified by the method used. The participant is asked to judge each VA on three aspects: how well the description fits, how understandable it is, and how original the VA is. These three aspects can be rated on a Likert scale ranging from 1 to 5. After that, the knowledge about the source and the target is elicited in the form of questions: we ask how well the participant knows the individuals. The possible answers are I know who that person is, I have heard of the name but I cannot relate it to anything, and I have never heard of that name before. This ensures a distinction between a negative rating caused by ignorance of the output’s components and one caused by the lack of a proper connection between the source, the target and the modifier.</p>
      </sec>
      <sec id="sec-6-2">
        <title>5.2. Results</title>
<p>Through the user-evaluation test, we obtain a set of 207 human evaluations on automatically generated VAs, provided by 29 unique annotators. The sentences presented to the participants are repeated to avoid a random evaluation. The inter-annotator agreement, computed using Cohen’s Kappa score [22], is 0.0491 on average. Such a low score highlights the difficulty in evaluating VAs, since they greatly depend on the reader’s knowledge, cultural references and degree of familiarity with the selected subject. (5: https://www.umblaetterer.de/datenzentrum/vossianische-antonomasien.html)</p>
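<p>For reference, the two-rater form of Cohen’s Kappa used above can be computed as follows (our own minimal implementation for illustration, not the paper’s evaluation code, which averages the statistic over annotator pairs):</p>

```python
# Cohen's kappa for two annotators over the same items (nominal labels):
# kappa = (p_o - p_e) / (1 - p_e), with p_o the observed agreement and
# p_e the agreement expected by chance from the label frequencies.

def cohens_kappa(r1, r2):
    assert len(r1) == len(r2) and r1
    n = len(r1)
    observed = sum(a == b for a, b in zip(r1, r2)) / n
    labels = set(r1) | set(r2)
    expected = sum((r1.count(l) / n) * (r2.count(l) / n) for l in labels)
    return (observed - expected) / (1.0 - expected)
```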
<p>Figure 5 reports the distribution of ratings among the methods listed in Table 5. Intuitively, the method whose distribution is skewed towards low ratings (represented in green) should be considered the best-performing one. The translation-based meta-embedding method outperforms all the other methods. Interestingly, the baseline provided by ChatGPT performs worse than any other method. While this might be attributed to a lack of explicit VA-related knowledge in the data the underlying LLM is based on, it can also be argued that the prompt we propose is not perfectly suited for this task. We further address this aspect in Section 6. At first glance, the best method turns out to be word2vec with the translation technique, which results in 74 VAs rated with a score of 1. It must be noted, however, that this method also yields a high variance in the results. Indeed, the same method is also the one that obtains the highest amount of low-rated VAs. [Table: average rating for each method; lower is better; the best result is represented in bold.]</p>
<p>When taking into account the whole distribution, the projection-based TransE method achieves the best performance: the lowest number of “bad” VAs is obtained with this method. Indeed, this is the method that achieves the best overall performance, as can be seen in Table 4. [Table 4: overview of the mean rating for each user question. The mean rating for each confidence rating (r1, r2, and r3), i.e., the set of sentences for which an evaluator expressed a specific confidence in the knowledge of A and B, is reported alongside the overall mean rate r1∪2∪3 for each question. The best results for each criterion are highlighted in bold. Lower values indicate better performance.] Those results are complementary to the ones of Figure 5. Interestingly, the best overall method, projection-based TransE, does not classify as the best in any specific user question. Depending on the target task, one model can be considered better than another, even though projection-based TransE guarantees the most consistent results.</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>6. Conclusions and Discussion</title>
      <p>We looked at generating Vossian Antononmasias by using embeddings and LLM as a way to
characterize the creativity of AI. Our approach has resulted in creative examples of VAs, which
proves that both methods are suitable for solving such tasks. Since the lack of a clear definition

of creativity prevents a quantitative evaluation of the results, we conducted a manual qualitative
analysis of the results which highlighted several diferent weaknesses. The information bias
that is inherent to Wikidata results in VAs that are mostly focused on Western culture. While
this might be tampered by penalising some entities, we argue it would only partially solve the
issue. A diferent approach, which takes into account the semantic representation of each entity,
can help overcome such issues, making the creation process transparent and explainable.</p>
<p>The human evaluation described in Section 5 provides meaningful insights into the effectiveness of our methods. Firstly, they generally perform better than the ChatGPT baseline, which fails to generate original and understandable VAs. Moreover, the results of Figure 5 and Table 5 show that the use of Knowledge Graph Embeddings is generally to be preferred over the other methods, such as word embeddings and meta-embeddings. However, the low inter-annotator agreement shows that rating Vossian Antonomasias is highly subjective and most probably depends not only on the knowledge of an entity but also on knowledge of the domain the entity refers to. This could be addressed by filtering the human annotators into groups according to their domain knowledge before rating the sentences.</p>
<p>The focus on the occupation as a similarity measure resulted in several complications, such as the use of semantically similar occupations, like television actor and actor, in the generated sentences. Additionally, since some entities hold multiple occupations, a more accurate estimation of their primary occupation needs to be investigated. A possible solution is to aggregate the representation vectors of all their occupations instead of selecting a single occupation. Similarly, fictional characters are sometimes compared to their real-life actors. A possible solution is to impose a minimum distance between vectors that are too close. An orthogonal solution is to consider other criteria when comparing entities, such as achievements or awards. Apart from famous people, famous locations or events, along with their appropriate modifiers, could be added to increase the sample size and achieve greater variability in the results.</p>
<p>The mentioned limitations of the evaluation and of the sampling of entities show that the generation of Vossian Antonomasias with an open-domain approach proves to be rather difficult. Instead, we suggest focusing on specific domains, using fine-tuning of the embeddings or different embedding methods to overcome the mentioned shortcomings.</p>
<p>Additionally, we envision integrating an LLM in the pre-evaluation step by evaluating the VAs that have been generated by combining our proposed methods. The idea is to list the salient similarities and differences between A and B for a given VA by looking at their characteristic properties. Following recent LLM-prompting studies [23, 24], we plan on performing additional manual or automatic prompt engineering [25, 26]. This can lead to more effective results, since any change in the prompts may significantly affect the quality of the output.</p>
<p>Moreover, an interesting approach is to perform knowledge injection [27, 28] into an LLM, following a neuro-symbolic approach to harness the language model’s potential while controlling for the criteria defining Vossian Antonomasia. The knowledge available to an LLM like ChatGPT, e.g., whether an entity has recently died or is a fictional character, is currently limited by the time at which it was trained. The injection of structured knowledge can help overcome this issue.</p>
    </sec>
    <sec id="sec-8">
      <title>Acknowledgements</title>
<p>The project leading to this application has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 101034440. This work was partially funded by the Klaus Tschira Foundation, grant number 40300928, and the French National Research Agency (ANR), DACE-DL project, grant number ANR-21-CE23-0019.</p>
      <p>[10] A. Bordes, N. Usunier, A. García-Durán, J. Weston, O. Yakhnenko, Translating Embeddings for Modeling Multi-relational Data, in: Advances in Neural Information Processing Systems 26, NIPS 2013, Lake Tahoe, Nevada, United States, 2013, pp. 2787–2795. URL: https://proceedings.neurips.cc/paper/2013/hash/1cecc7a77928ca8133fa24680a88d2f9-Abstract.html.
[11] S. Choudhary, T. Luthra, A. Mittal, R. Singh, A Survey of Knowledge Graph Embedding and
Their Applications, CoRR abs/2107.07842 (2021). URL: https://arxiv.org/abs/2107.07842.
arXiv:2107.07842.
[12] F. Almeida, G. Xexéo, Word Embeddings: A Survey, CoRR abs/1901.09069 (2019). URL:
http://arxiv.org/abs/1901.09069. arXiv:1901.09069.
[13] J. Pennington, R. Socher, C. D. Manning, GloVe: Global Vectors for Word Representation,
in: A. Moschitti, B. Pang, W. Daelemans (Eds.), Proceedings of the 2014 Conference on
Empirical Methods in Natural Language Processing, EMNLP 2014, October 25-29, 2014,
Doha, Qatar, A meeting of SIGDAT, a Special Interest Group of the ACL, ACL, 2014, pp.
1532–1543. URL: https://doi.org/10.3115/v1/d14-1162. doi:10.3115/v1/d14-1162.
[14] T. Mikolov, K. Chen, G. Corrado, J. Dean, Efficient Estimation of Word Representations
in Vector Space, in: Y. Bengio, Y. LeCun (Eds.), 1st International Conference on Learning
Representations, ICLR 2013, Scottsdale, Arizona, USA, May 2-4, 2013, Workshop Track
Proceedings, 2013. URL: http://arxiv.org/abs/1301.3781.
[15] Wikidata statistics, https://www.wikidata.org/wiki/Wikidata:Statistics, 2023. Accessed on
2023-06-16.
[16] D. Abián, F. Guerra, J. Martínez-Romanos, R. T. Lado, Wikidata and DBpedia: A
Comparative Study, in: International KEYSTONE Conference, 2017.
[17] G. Ritchie, Assessing Creativity, in: Proc. of AISB’01 Symposium, 2001.
[18] L. Van der Maaten, G. Hinton, Visualizing Data using t-SNE, Journal of Machine Learning
Research 9 (2008).
[19] Z. Zhu, S. Xu, M. Qu, J. Tang, GraphVite: A High-Performance CPU-GPU Hybrid System
for Node Embedding, in: The World Wide Web Conference, ACM, 2019, pp. 2494–2504.
[20] X. Wang, T. Gao, Z. Zhu, Z. Zhang, Z. Liu, J. Li, J. Tang, KEPLER: A Unified Model for
Knowledge Embedding and Pre-trained Language Representation, Trans. Assoc. Comput.
Linguistics 9 (2021) 176–194. URL: https://doi.org/10.1162/tacl_a_00360. doi:10.1162/tacl_a_00360.
[21] D. Bollegala, J. O’Neill, A Survey on Word Meta-Embedding Learning, in: L. D. Raedt (Ed.),
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence,
IJCAI 2022, Vienna, Austria, 23-29 July 2022, ijcai.org, 2022, pp. 5402–5409. URL:
https://doi.org/10.24963/ijcai.2022/758. doi:10.24963/ijcai.2022/758.
[22] J. Cohen, A Coefficient of Agreement for Nominal Scales, Educational
and Psychological Measurement 20 (1960) 37–46.
[23] X. Liu, Y. Zheng, Z. Du, M. Ding, Y. Qian, Z. Yang, J. Tang, GPT Understands, Too, arXiv
preprint arXiv:2103.10385 (2021).
[24] J. White, Q. Fu, S. Hays, M. Sandborn, C. Olea, H. Gilbert, A. Elnashar, J. Spencer-Smith,
D. C. Schmidt, A Prompt Pattern Catalog to Enhance Prompt Engineering with ChatGPT,
arXiv preprint arXiv:2302.11382 (2023).
[25] L. Reynolds, K. McDonell, Prompt Programming for Large Language Models: Beyond
the Few-Shot Paradigm, in: Extended Abstracts of the 2021 CHI Conference on Human
Factors in Computing Systems, 2021, pp. 1–7.
[26] Y. Zhou, A. I. Muresanu, Z. Han, K. Paster, S. Pitis, H. Chan, J. Ba, Large Language Models
Are Human-Level Prompt Engineers, arXiv preprint arXiv:2211.01910 (2022).
[27] S. Pan, L. Luo, Y. Wang, C. Chen, J. Wang, X. Wu, Unifying Large Language Models
and Knowledge Graphs: A Roadmap, 2023. URL: https://arxiv.org/abs/2306.08302. doi:10.
48550/ARXIV.2306.08302.
[28] L. Yang, H. Chen, Z. Li, X. Ding, X. Wu, ChatGPT is not Enough: Enhancing Large
Language Models with Knowledge Graphs for Fact-aware Language Modeling, 2023. URL:
https://arxiv.org/abs/2306.11489. doi:10.48550/ARXIV.2306.11489.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>F.</given-names>
            <surname>Fischer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Jäschke</surname>
          </string-name>
          , '
          <source>The Michael Jordan of Greatness'- Extracting Vossian Antonomasia from Two Decades of The New York Times</source>
          ,
          <fpage>1987</fpage>
          -
          <lpage>2007</lpage>
          ,
          <article-title>Digital Scholarship in the Humanities (</article-title>
          <year>2019</year>
          ). URL: https://doi.org/10.1093/llc/fqy087. doi:10.1093/llc/fqy087.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>L. F.</given-names>
            <surname>Menabrea</surname>
          </string-name>
          ,
          <article-title>Sketch of the Analytical Engine invented by Charles Babbage, Esq., in: Ada's Legacy: Cultures of Computing from the Victorian to the Digital Age</article-title>
          ,
          <year>1843</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>N.</given-names>
            <surname>Anantrasirichai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. R.</given-names>
            <surname>Bull</surname>
          </string-name>
          ,
          <article-title>Artificial Intelligence in the Creative Industries: A Review, Artif</article-title>
          .
          <source>Intell. Rev</source>
          .
          <volume>55</volume>
          (
          <year>2022</year>
          )
          <fpage>589</fpage>
          -
          <lpage>656</lpage>
          . URL: https://doi.org/10.1007/s10462-021-10039-7. doi:10.1007/s10462-021-10039-7.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>R.</given-names>
            <surname>Wingström</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hautala</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Lundman</surname>
          </string-name>
          ,
          <article-title>Redefining Creativity in the Era of AI? Perspectives of Computer Scientists</article-title>
          and New Media Artists, Creativity Research Journal (
          <year>2023</year>
          )
          <fpage>1</fpage>
          -
          <lpage>17</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>R.</given-names>
            <surname>Řehůřek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Sojka</surname>
          </string-name>
          ,
          <article-title>Software Framework for Topic Modelling with Large Corpora</article-title>
          ,
          <source>in: Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks</source>
          ,
          ELRA, Valletta, Malta,
          <year>2010</year>
          , pp.
          <fpage>45</fpage>
          -
          <lpage>50</lpage>
          . http://is.muni.cz/publication/884893/en.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>B.</given-names>
            <surname>Bhavya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Xiong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Zhai</surname>
          </string-name>
          ,
          <article-title>Cam: A Large Language Model-based Creative Analogy Mining Framework</article-title>
          ,
          <source>in: Proceedings of the ACM Web Conference</source>
          <year>2023</year>
          , WWW '23,
          Association for Computing Machinery, New York, NY, USA,
          <year>2023</year>
          , p.
          <fpage>3903</fpage>
          -
          <lpage>3914</lpage>
          . URL: https://doi.org/10.1145/3543507.3587431. doi:10.1145/3543507.3587431.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7] OpenAI, ChatGPT, https://openai.com,
          <year>2021</year>
          .
          <source>Version GPT-3</source>
          .5.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M.</given-names>
            <surname>Schwab</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Jäschke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Fischer</surname>
          </string-name>
          , ”
          <article-title>Who is the Madonna of Italian-American Literature?”: Extracting and Analyzing Target Entities of Vossian Antonomasia</article-title>
          ,
          <source>in: Proceedings of the 7th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage</source>
          ,
          <source>Social Sciences, Humanities and Literature</source>
          , Association for Computational Linguistics,
          <year>2023</year>
          , pp.
          <fpage>110</fpage>
          -
          <lpage>115</lpage>
          . URL: https://sighum.files.wordpress.com/2023/03/latech-clfl-2023-unofficial-proceedings.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M.</given-names>
            <surname>Schwab</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Jäschke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Fischer</surname>
          </string-name>
          , “
          <article-title>The Rodney Dangerfield of Stylistic Devices”: End-toEnd Detection and Extraction of Vossian Antonomasia Using Neural Networks</article-title>
          ,
          <source>Frontiers in Artificial Intelligence</source>
          <volume>5</volume>
          (
          <year>2022</year>
          ). URL: https://doi.org/10.3389/frai.2022.868249. doi:10.3389/frai.2022.868249.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>A.</given-names>
            <surname>Bordes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Usunier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>García-Durán</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Weston</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Yakhnenko</surname>
          </string-name>
          ,
          <article-title>Translating Embeddings for Modeling Multi-relational Data</article-title>
          , in:
          <string-name>
            <given-names>C. J. C.</given-names>
            <surname>Burges</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Bottou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Ghahramani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. Q.</given-names>
            <surname>Weinberger</surname>
          </string-name>
          (Eds.),
          <source>Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8</source>
          ,
          <year>2013</year>
          ,
          Lake Tahoe, Nevada,
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>