<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Tagging Narrative with Propp's Character Functions Using Large Language Models</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Pablo Gervás</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gonzalo Méndez</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Facultad de Informática, Universidad Complutense de Madrid</institution>
          ,
          <addr-line>Madrid, 28040</addr-line>
          ,
          <institution>Instituto de Tecnología del Conocimiento, Universidad Complutense de Madrid</institution>
          ,
          <addr-line>Madrid, 28223</addr-line>
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <author-notes>
        <corresp>Corresponding author. pgervas@ucm.es (P. Gervás); gmendez@ucm.es (G. Méndez); http://nil.fdi.ucm.es (P. Gervás); http://nil.fdi.ucm.es (G. Méndez)</corresp>
      </author-notes>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <abstract>
        <p>The character functions proposed by Vladimir Propp as an abstraction of plot structure for understanding Russian folk tales have long been popular as a means of analysing narrative. Several efforts have been carried out to annotate narratives in this way manually, or to apply machine learning techniques over texts previously annotated with syntactic and semantic information. The present paper explores the feasibility of annotating the bare text of story synopses directly by relying on large language models.</p>
      </abstract>
      <kwd-group>
        <kwd>automated tagging</kwd>
        <kwd>Propp's Morphology of the Folktale</kwd>
        <kwd>character functions</kwd>
        <kwd>large language models</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
    </sec>
    <sec id="sec-2">
      <title>2. Previous Work</title>
      <p>The work relevant to this paper comprises the representation of narrative using Propp’s
character functions, existing efforts to annotate narrative with them, and solutions for
annotating text with large language models.</p>
      <sec id="sec-2-1">
        <sec id="sec-2-1-1">
          <title>2.1. Narrative Structure and Propp’s Character Functions</title>
          <p>
            Based on a detailed study of 100 Russian folk tales from the anthology by Afanasiev, the Russian
formalist Vladimir Propp proposed a set of abstractions of plot-relevant functions played by
the characters of a tale, which he called character functions [
            <xref ref-type="bibr" rid="ref1">1</xref>
            ]. These character functions
represent basic contributions to the plot such as going on a journey, having a conflict with
another character, being sent on a mission, or being rewarded. The character functions proposed
by Propp are shown in Table 1.
          </p>
          <p>Certain character functions are associated with particular types of character, such as the
hero or the villain. Propp also postulated seven such types of character: the villain, the donor
(provider), the helper, the princess (or sought-for person), the dispatcher, the hero or victim and
the false hero.</p>
          <p>Early uptake of Propp’s character functions as a means of annotating a corpus of stories
relied on XML [5]. Propp’s character functions have also been re-represented as ontologies defined
in Description Logics [7]. The most relevant effort at annotating with character function
information the very stories for which Propp designed his character functions was carried
out by Finlayson as part of his PhD thesis [12], and later extended by Yarlott and Finlayson
[6]. They propose ProppML, “an annotation scheme designed to capture all the components
of a Proppian-style morphological analysis of narratives”. Using this scheme, they annotated
fifteen Russian folk tales from the original corpus, amounting to 18,862 words. The corpus
was annotated separately by two highly trained annotators, and the results were then merged
under the supervision of a third expert annotator.</p>
        </sec>
        <sec id="sec-2-1-2">
          <title>2.2. Automated Annotation of Propp Character Functions</title>
          <p>Finlayson’s efforts at deeply annotating a corpus of Russian folk tales at various levels (syntax,
semantics, and narrative structure) made it possible to develop a machine-learning algorithm
capable of extracting culturally-relevant plot patterns from sets of folktales. Such an algorithm
takes as input a story annotated as an event timeline and can predict for it abstract
narrative structures such as Villainy, Struggle-Victory, and Reward [14]. This is an instance of a
computational system learning a theory of narrative structure directly from a corpus of data.</p>
        </sec>
        <sec id="sec-2-1-3">
          <title>2.3. Text Annotation with Large Language Models</title>
          <p>The recent advances demonstrated by applying Large Language Models (LLMs) to various
Natural Language Processing (NLP) tasks [15] have opened for exploration a number of avenues
that appeared to be impossible just a few years ago. The main idea is that a model pre-trained on
very large amounts of textual data can be queried in very simple ways to carry out specific NLP
tasks, and that it performs surprisingly well either with no additional information (zero-shot learning)
or with a few simple examples of how the task should be solved inserted into the prompt
that poses the query (few-shot learning).</p>
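          <p>The difference between the two regimes amounts to how the prompt is built. The following sketch is our own illustration (the function and prompt wording are assumptions, not code from any specific LLM API): few-shot prompting simply prepends worked examples to the query.</p>

```python
# Illustrative sketch of zero-shot vs. few-shot prompting (our own wording,
# not tied to any specific LLM API): few-shot prepends worked examples.
def build_prompt(task, story, examples=None):
    """Return a zero-shot prompt, or a few-shot prompt when examples are given."""
    parts = [task]
    for example_story, example_annotation in (examples or []):
        parts.append(f"Story: {example_story}\nAnnotation: {example_annotation}")
    parts.append(f"Story: {story}\nAnnotation:")
    return "\n\n".join(parts)

TASK = "Annotate this tale using Propp's character functions."
zero_shot = build_prompt(TASK, "A girl called Vasilisa has a miraculous doll. ...")
few_shot = build_prompt(TASK, "A girl called Vasilisa has a miraculous doll. ...",
                        examples=[("The tsar does not have a wife.", "Lack")])
```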
          <p>In particular, tools such as ChatGPT have shown very high performance on text annotation
tasks [16]. ChatGPT is an AI language model developed as part of the GPT (Generative
Pre-trained Transformer) series of models and specifically designed for generating human-like
text responses in conversational contexts. This type of approach has been applied to tasks as
diverse as annotating legal text [17] or propagandistic text [18].</p>
          <p>Gemini [19] (previously Bard [20]) is a family of generative AI models that lets developers
generate content from both text and images as input. These models can generate text,
handle multi-turn conversations, and carry out zero-, one-, and few-shot tasks. They have
been trained on data existing up to the beginning of 2023. The Gemini Pro 1.0 version is
available for use through an open web interface or through a commercial web API, and it allows
for queries similar to those presented to ChatGPT [21].</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>3. Automated Tagging of Narrative Synopses with Proppian Character Functions</title>
      <p>We want to explore the relative performance of large language models when used to annotate
stories with Propp’s character functions via zero-shot learning and few-shot learning approaches.</p>
      <sec id="sec-4-1">
        <title>3.1. Zero-Shot Tagging Using ChatGPT</title>
        <p>We are going to try to get ChatGPT to annotate the synopses of the tales that Propp includes as
examples of annotations in his book. This should allow us to compare the annotation proposed
by ChatGPT with the one indicated by Propp himself.</p>
        <p>As a first step, we ask ChatGPT whether it knows about Propp’s functions. This is the reply:</p>
        <p>Vladimir Propp, a Russian folklorist and scholar, identified and analyzed common
narrative elements in Russian folktales. One of his most notable contributions is
his identification of the “morphology of the folktale,” which includes what he termed
“character functions”. Propp identified 31 character functions that frequently appear
in folktales. Here’s a brief overview of Propp’s character functions:</p>
        <p>It then provides a list with 31 items, containing 7 character types and 24 character functions,
some of which are Propp’s while others are not; some of Propp’s character functions are also
missing:
• Character types: hero, villain, donor, helper, princess/prize, false hero, dispatcher
• Propp’s character functions: interdiction, violation, reconnaissance, delivery, trickery,
complicity, mediation, counteraction, departure, first function of the donor, hero’s
reaction, receipt of a magical agent, struggle, branding, victory, return, pursuit, rescue,
unrecognized arrival
• Other functions: helper’s object, magical agent, villain’s lack, guidance, liquidation
(equivalent to victory)
• Missing Propp’s functions: absentation, villainy or lack, transfer, resolution of trigger,
unfounded claims, difficult task, solution, recognition, exposure, transfiguration,
punishment, wedding</p>
        <p>It is interesting to note that ChatGPT, although surprisingly knowledgeable in general,
mixes up the types of character identified by Propp with the character functions: it gets the
total number right, but misses out several functions because it lists the character types in
their place.</p>
        <p>To test the ability of the system to annotate a story with the character functions, the system
is given the following prompt:
Can you annotate this tale using Propp’s character functions?
&lt;story&gt;</p>
        <p>In response, the system outputs a text that describes its version of the annotation. For the
following example (tale 104 as described in Propp’s book):</p>
        <p>A girl called Vasilisa has a miraculous doll. The tsar does not have a wife. Vasilisa
goes to town. Vasilisa settles in an old woman’s house. The old woman buys
Vasilisa some flax. Vasilisa spins unusually fine yarn from the flax. The doll makes
a loom for Vasilisa in one night. Vasilisa weaves unusually fine linen on the loom.
The old woman takes the linen to the tsar. The tsar orders that shirts be sewn by
the one who has spun and woven the linen. The girl sews the shirts from the linen.</p>
        <p>The tsar sends for Vasilisa. Vasilisa marries the tsar.</p>
        <sec id="sec-4-1-1">
          <title>The assignment of character functions that Propp suggests is:</title>
          <p>A girl called Vasilisa has a miraculous doll (Acquisition of Magical Agent). The
tsar does not have a wife (Lack). Vasilisa goes to town. Vasilisa settles in an old
woman’s house (Unrecognised Arrival). The old woman buys Vasilisa some flax.
Vasilisa spins unusually fine yarn from the flax. The doll makes a loom for Vasilisa
in one night (Acquisition of Magical Agent). Vasilisa weaves unusually fine linen on
the loom. The old woman takes the linen to the tsar. The tsar orders that shirts
be sewn by the one who has spun and woven the linen (Difficult Task). The girl
sews the shirts from the linen (Task Resolved). The tsar sends for Vasilisa. Vasilisa
marries the tsar (Wedding).</p>
          <p>In contrast, the annotation that ChatGPT suggests as a zero-shot solution is:
Lack: The tsar lacks a wife.</p>
          <p>Dispatcher: Vasilisa goes to town.</p>
          <p>Helper: Vasilisa settles in the old woman’s house and receives aid from the miraculous doll.
Task: The old woman buys flax for Vasilisa, who spins unusually fine yarn and weaves linen.
Delivery: The old woman takes the linen to the tsar.</p>
          <p>Task: The tsar orders shirts to be sewn from the linen spun and woven by the same person.
Performance of task: Vasilisa sews the shirts.</p>
          <p>Arrival: The tsar sends for Vasilisa.</p>
          <p>Final union: Vasilisa marries the tsar.</p>
          <p>Zero-shot learning presents the difficulty of having the system provide the result in a format
of its own devising, which may differ significantly from the one being used as reference.</p>
        </sec>
      </sec>
      <sec id="sec-4-2">
        <title>3.2. Zero-Shot Tagging Using Gemini</title>
        <p>Similarly to what we did using ChatGPT, we prompted Gemini for what it knew about Propp’s
character types and character functions. It provided a list of the seven character types and
then another with 31 character functions: initial situation, absentation, interdiction, violation,
reconnaissance, delivery of information, trickery, complicity, mediation, beginning of the
counteraction, hero’s departure, first function of the donor, tests of the hero, acquisition of
magic agent, hero’s reaction, spatial translocation, struggle, branding, victory, liquidation of
the lack, return, pursuit, rescue, unrecognized arrival, false hero’s claims, recognition, reward,
difficult task, exposure of the false hero, transfiguration, wedding, punishment, and resolution.</p>
        <p>This response does in essence match Propp’s set of character functions (Gemini listed them
grouped by sphere of action, following Propp’s description), but it presents an issue that adds
complexity to the task. The system sometimes does not use the exact names that Propp assigned
to his character functions, but rather paraphrases some of them. This happens, for instance,
for the ones shown in Table 2.</p>
        <p>Zero-shot learning approaches to this task will likely suffer from having to identify an
appropriate mapping between the set of tags used by the system and the set of tags being used
as reference. This presents a particular problem because it makes it very difficult to carry out
automated processing of the tagging.</p>
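        <p>One pragmatic way around this mapping problem, sketched below under the assumption that paraphrased labels remain lexically close to the canonical names, is fuzzy string matching; the abbreviated label list and function names here are our own illustration.</p>

```python
# Sketch: map a model's paraphrased tag onto Propp's canonical function names
# using stdlib fuzzy string matching. The canonical list is abbreviated here.
import difflib

CANONICAL = ["Absentation", "Interdiction", "Violation", "Reconnaissance",
             "Delivery", "Trickery", "Complicity", "Villainy or lack",
             "Mediation", "Departure", "Struggle", "Victory", "Return",
             "Difficult task", "Transfiguration", "Wedding"]

def map_label(model_label, cutoff=0.5):
    """Return the closest canonical name, or None when nothing is close enough."""
    matches = difflib.get_close_matches(model_label, CANONICAL, n=1, cutoff=cutoff)
    return matches[0] if matches else None

map_label("Delivery of information")  # returns "Delivery"
map_label("zzzz")                     # returns None
```
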
        <p>To test the ability of the system to annotate a story with the character functions, the system
was given the same prompt as in the previous case. For the same example considered above (tale
104 as described in Propp’s book), the annotation that Gemini suggests as a zero-shot solution is:
Characters:</p>
        <p>Hero: Vasilisa
Dispatcher: None explicitly identified (opportunity arises naturally)
Donor: Doll (magical helper)
Prize/Princess: Tsar (implicitly)</p>
        <p>False Hero: None explicitly identified
Narrative Functions:</p>
        <p>Sphere of Action 1: Preparation</p>
        <p>Initial Situation: Vasilisa has a miraculous doll and lives without a father figure.</p>
        <p>Lack: Tsar lacks a wife. (Note: not directly affecting Vasilisa, but creates initial context)
Sphere of Action 2: Departure</p>
        <p>Beginning of the Counteraction: Vasilisa goes to town (implicitly seeking opportunity).
Sphere of Action 3: The Hero’s Actions</p>
        <p>Hero’s Departure: Vasilisa leaves her home environment.</p>
        <p>First Function of the Donor: Old woman provides flax (indirect aid).</p>
        <p>Tests of the Hero: Task of spinning fine yarn (challenge).</p>
        <p>Acquisition of Magic Agent: Doll magically creates a loom for Vasilisa.</p>
        <p>Hero’s Reaction: Uses the loom to weave fine linen.</p>
        <p>Spatial Translocation: Linen reaches the Tsar.</p>
        <p>Sphere of Action 4: Return</p>
        <p>Return: Tsar acknowledges the linen and initiates contact with Vasilisa.</p>
        <p>Return: Vasilisa arrives at the Tsar’s court.</p>
        <p>Sphere of Action 5: Recognition and Reward</p>
        <p>Unrecognized Arrival: Vasilisa implicitly demonstrates her skills.</p>
        <p>Difficult Task: Sewing the shirts (implicit test of skill).</p>
        <p>Exposure of the False Hero: N/A
Transfiguration: Vasilisa gains new status as Tsar’s wife.</p>
        <p>Wedding: Marriage to the Tsar (reward).</p>
        <p>Sphere of Action 6: Resolution</p>
        <p>Resolution: Lack of the Tsar is addressed (gains a wife), and Vasilisa finds a new life.</p>
        <p>It is interesting to note that, in the zero-shot setting, Gemini decides to provide information
about how the content of the story relates to the Spheres of Action that it considers Propp to
have defined.</p>
        <p>Gemini also draws a number of inferences about implicit content that is not mentioned in the
story but is indeed inferable from the text. Examples of this are the fact that Vasilisa has left
home (to go to town) or that Vasilisa gains new status as a result of having married the tsar.</p>
        <p>Another relevant detail is that Gemini assigns 11 character functions to this story, whereas
Propp himself had assigned only 7 character functions to this tale, and not the same ones as
ChatGPT proposed.</p>
      </sec>
      <sec id="sec-4-3">
        <title>3.3. Few-Shot Tagging Using ChatGPT</title>
        <p>Finally, we wanted to test the performance of the system on a few-shot learning task, so we
replicated the experiment above with an enriched prompt that explicitly included the list of
character functions and their definitions.</p>
        <p>I need to annotate some tales using Propp’s character types and narrative functions.
These are the tags and definitions I need to use:
For the character types:
+ Hero: The role model of a story.
+ Villain: The negative character who creates struggles for the hero.
+ Donor: The character who provides some magical object to the hero.
+ Helper: The character who helps the hero.
+ Princess: The character who becomes a companion of the hero.
+ Dispatcher: The character who illustrates the need for the hero’s quest and sends the hero off.
+ False Hero: The character who takes credit for the hero’s actions
For the narrative functions:
+ Absentation: The villain or a member of the family leaves the home
+ Branding: The hero is marked or identified
+ Interdiction: A prohibition is imposed on the hero
+ Victory: The hero defeats the villain
+ Interdiction violation: The interdiction is violated
+ Resolution of trigger: Villainy or lack resolved
+ Reconnaissance: The villain attempts to obtain information
+ Return: The hero returns
+ Delivery: The villain gains information or an object
+ Pursuit: The villain chases the hero
+ Trickery: The villain deceives the hero to gain an advantage
+ Rescue from pursuit: The hero is saved from pursuit or danger
+ Complicity: The victim submits to deception
+ Unrecognized arrival: The hero arrives unrecognized
+ Villainy or lack: The villain causes harm or injury or someone lacks something they want
+ Unfounded claims: A false hero claims hero status
+ Mediation: Misfortune is made known
+ Difficult task: The hero is given a difficult task
+ Counteraction: The hero reacts to the villain’s actions
+ Solution: The task is accomplished or resolved
+ Departure: The hero leaves home
+ Recognition: The hero is recognized or acknowledged
+ 1st donor function: The hero is tested by a potential donor
+ Exposure: The villain’s identity or deception is exposed
+ Hero’s reaction: The hero reacts to the test
+ Transfiguration: The hero undergoes a transformation
+ Receipt of agent: The hero acquires a magical agent
+ Punishment: The villain receives punishment or consequences
+ Transfer: The hero is directed or taken to a new location
+ Wedding: The hero marries or is rewarded
+ Struggle: The hero and villain engage in direct conflict
Annotate this tale: &lt;story&gt;</p>
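        <p>A prompt of this shape can be assembled programmatically from a table of tag definitions. The sketch below is our own illustration: the dictionary holds only a few of the definitions listed above, and the function name is an assumption.</p>

```python
# Sketch: assemble the enriched few-shot prompt from a dictionary of tag
# definitions (only a few of the definitions above are shown here).
FUNCTION_DEFS = {
    "Absentation": "The villain or a member of the family leaves the home",
    "Interdiction": "A prohibition is imposed on the hero",
    "Villainy or lack": "The villain causes harm or someone lacks something they want",
    "Wedding": "The hero marries or is rewarded",
}

def enriched_prompt(story, definitions):
    lines = ["I need to annotate some tales using Propp's narrative functions.",
             "These are the tags and definitions I need to use:"]
    lines += [f"+ {name}: {gloss}" for name, gloss in definitions.items()]
    lines.append(f"Annotate this tale: {story}")
    return "\n".join(lines)

prompt = enriched_prompt("story text", FUNCTION_DEFS)
```
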
        <p>We tested this with ChatGPT and for the previous example (tale 104 as described in Propp’s
book), the annotation produced is:</p>
        <p>An important point to note here is that the system response does not actually annotate the
text in the sense of respecting the text and assigning labels to specific segments of it. Instead,
it declares what labels might be applicable to the text in some sense, and for each one it
provides a description of the events in the text that justify the assignment. For instance, the 1st
donor function label is justified with the sentence “The miraculous doll aids Vasilisa in spinning
and weaving”, which can be understood as a condensed description of the various tasks that
Vasilisa carries out in the story, taken in the context that we have been told that the magical
doll is helping her. The summary is actually incorrect, or at least not altogether precise, as
the doll only provides the magical loom with which Vasilisa weaves, and it does not
actually help with the spinning. This presents a problem for how this type of response should be
evaluated. First, because it is difficult to establish whether the system’s assignment of the label
accurately matches Propp’s assignment: the boundaries of the segment of text considered to
justify the label are not defined. Second, because the assignment appears to be based on an
incorrect interpretation of the facts in the story.</p>
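        <p>Under the assumption that each response line pairs a label with a justification separated by a colon, as in the example outputs shown above, the responses can at least be split into (label, justification) pairs for further analysis; this minimal parser is our own sketch.</p>

```python
# Sketch: split response lines of the form "Label: justification" into pairs,
# assuming the colon-separated format seen in the example outputs above.
def parse_annotation(response):
    pairs = []
    for line in response.splitlines():
        line = line.strip()
        if ":" in line:
            label, _, justification = line.partition(":")
            pairs.append((label.strip(), justification.strip()))
    return pairs

reply = """Lack: The tsar lacks a wife.
Dispatcher: Vasilisa goes to town.
Final union: Vasilisa marries the tsar."""
pairs = parse_annotation(reply)  # [('Lack', 'The tsar lacks a wife.'), ...]
```
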
        <p>In this particular case, the 1st donor function label is incorrectly assigned, because Propp
defines the first action of the donor as a testing of the hero by the donor to decide whether the
hero is worthy of a gift or assistance. The system has also made an incorrect inference here. A
similar case can be observed for the inference supporting the Delivery tag: the system justifies
it by saying that Vasilisa delivers the linen to the tsar, when the story says the old woman does.</p>
        <p>A more refined option might be to include in the prompt an example of how the annotations
should be presented in combination with the input.</p>
      </sec>
      <sec id="sec-4-4">
        <title>3.4. Few-Shot Tagging Using Gemini</title>
        <p>We tested the same prompt, enriched with the explicit list of character functions to use, with
Gemini Pro, and for the previous example (tale 104 as described in Propp’s book), the annotation
produced is:
Annotating the Tale of Vasilisa using Propp’s Framework
Characters:
Hero: Vasilisa
Donor: The Doll (magical helper)
Dispatcher (indirect): The Tsar’s lack of wife creates an opportunity for Vasilisa.
Narrative Functions:
Absentation: The Tsar lacks a wife (lack).</p>
        <p>Hero’s possession of a magical agent: Vasilisa has the miraculous doll.</p>
        <p>Departure: Vasilisa goes to town (possibly seeking a new life).</p>
        <p>Helper: The old woman provides Vasilisa with flax.</p>
        <p>Difficult Task (indirect): Vasilisa spins unusually fine yarn, implying exceptional skill.
Receipt of Magical Agent (indirect): The doll creates a loom for Vasilisa, demonstrating its
magical abilities.</p>
        <p>Difficult Task: Vasilisa weaves unusually fine linen, again showcasing her skill.
Mediation: The old woman takes the linen to the Tsar, revealing Vasilisa’s talent.
Transfer: The Tsar sends for Vasilisa, bringing her closer to her potential reward.
Marriage: Vasilisa marries the Tsar (fulfills the lack and resolves the story).
Notes:
This tale is a simplified version and omits some of the more complex elements of Propp’s morphology.
The villain and some other functions (e.g., struggle, victory) are absent.</p>
        <p>The focus is on Vasilisa’s exceptional skills, aided by the magical doll, which ultimately lead to
her marriage with the Tsar.</p>
        <p>It is interesting to see that Gemini has some difficulty in identifying the difference between
character types and narrative functions as described in the prompt. This results in an annotation
that mixes both kinds of labels indistinctly.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4. Discussion</title>
      <p>Regardless of how impressive the replies provided by these large language models appear at
first glance, it is very easy for them to be incorrect to a surprising degree. This is especially true
for queries that require responses involving detailed recall of definitions. The examples reported
above of how the different LLMs recall the actual list of Propp’s character functions show both
the strengths and the weaknesses of these models.</p>
      <p>The very nature of these systems implies that the text of any response is dynamically
constructed on demand from a neural representation in an underlying latent space. For this reason,
responses are very likely to be paraphrases rather than verbatim transcriptions. This has the
advantage of allowing for a certain leeway in the interpretation of definitions: these systems
are more likely to identify instances of a particular character function even when they are
phrased in a significantly different way from the original definition. However, it also has the
disadvantage of relying on potentially divergent understandings of specific functions. The set of
definitions provided for the character functions when prompted suggests that this is the case at
least for ChatGPT. This suggests that the set of tags used by ChatGPT to annotate in a
zero-shot approach to the task is simply incorrect.</p>
      <p>To provide some quantitative measure of the degree of divergence, the set of tags assigned
to the sentences in the tale is compared with the assignment that Propp provides for it in his
book. In appendices II and III, Propp provides examples of how the plots of a number of tales
can be described in terms of his set of character functions. The tales so described are referred
to only by number, and the assignment of character functions is given over a synopsis of each
tale. These synopses have been used as input to the system. The tales considered correspond to
numbers 93, 104, 123, 127, 131, 133, 139, 155, 198, 244 and 247.</p>
      <p>The assignments of character functions to segments of the synopses required a detailed
analysis. The responses by the various models include some kind of label together with a
justification indicating which part of the story the label is assigned to, and both labels and
descriptions tend to be paraphrases rather than references to specific character function names
or specific segments of the text. This makes simple measures like precision and recall against
Propp’s original assignment difficult to compute and even more difficult to interpret. To
circumvent this problem, in all cases where the model suggested a label that could reasonably be
considered to refer to the one assigned by Propp, it has been considered valid. In the hope of
making the analysis more informative, four additional metrics have been computed. First, the
number of times that identifiable references to Propp’s character functions have been assigned
to segments of the story that they do not apply to, which measures the correct match between
label and text. Second, the number of times that labels have been used that cannot be related in
any way to the character functions defined by Propp, which measures familiarity with the set of
character functions. Third, the number of times that the labels assigned (whether identifiable as
Propp’s character functions or not) can be seen to relate to the corresponding story segment by
some valid inference, which measures the language understanding ability of the system. Finally,
we report a metric that counts the number of times that the system has attempted to assign a
label to segments of the text to which Propp had assigned one, which measures the ability of the
system to identify plot-relevant events, regardless of whether it can tag them correctly. The
results for this set of metrics are shown in Table 3.</p>
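      <p>The metrics just described can be sketched as follows, assuming each annotation has been reduced to (segment, label) pairs and that a judgement of which model labels are recognisable as Propp’s functions is available; all names and data structures here are our own, not the scoring code actually used.</p>

```python
# Sketch of the metrics described above, over (segment_id, label) pairs.
# 'known_labels' is the set of labels recognisable as Propp's functions.
def score(model_tags, propp_tags, known_labels):
    model, gold = dict(model_tags), dict(propp_tags)
    correct = sum(1 for seg, lab in model.items() if gold.get(seg) == lab)
    return {
        "precision": correct / len(model) if model else 0.0,
        "recall": correct / len(gold) if gold else 0.0,
        # Propp-like labels attached to the wrong segment
        "mismatched": sum(1 for seg, lab in model.items()
                          if lab in known_labels and gold.get(seg) != lab),
        # labels that cannot be related to Propp's set at all
        "unknown_labels": sum(1 for lab in model.values()
                              if lab not in known_labels),
        # plot-relevant segments for which the model attempted any label
        "segments_found": sum(1 for seg in gold if seg in model),
    }
```
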
      <p>The first observation to make is that the first four metrics show relatively low values. This
indicates poor performance overall.</p>
      <p>It is interesting to note that for several of the metrics the zero-shot approach performs better
than the few-shot approach. This happens to ChatGPT for all metrics save the identification
of plot-relevant segments, and to Gemini at least for the number of incorrect assignments (44
versus 16). This is consistent with observed behaviour of LLMs, with the quality of results
dropping as the size of prompts increases.</p>
      <p>The fact that ChatGPT zero-shot had a very low recall is not surprising given that it was not
really using the full set of character functions. The zero-shot approach using Gemini had both
higher precision and higher recall than the other options, which is consistent with the fact that
it is not confused about the character functions. Nevertheless, the scores are surprisingly low.</p>
      <p>The additional metrics reported provide some insight into the reasons for these low values. We
notice very poor performance (33%) for ChatGPT at the identification of plot-relevant segments,
and significant percentages of incorrect assignments of existing labels (26% for ChatGPT and
16% for Gemini). The number of labels assigned that do not correspond to Propp’s set is
surprisingly high (10%) for Gemini zero-shot given that it supposedly knows the correct set, but
it drops significantly (to 4%) when Gemini is prompted with the set of names. In contrast,
ChatGPT reacts badly to the enriched prompt, with higher percentages for both incorrect
assignments and invented labels.</p>
    </sec>
    <sec id="sec-6">
      <title>5. Conclusions</title>
      <p>The experiments reported suggest that the use of large language models for automated tagging
of stories with Propp’s character functions may yield very different results depending on the
level of familiarity of the particular model employed with the set of character functions in
question. They also suggest that, if the model is not sufficiently familiar with the definitions
of the character functions, a few-shot approach in which brief versions of the definitions are
provided may not be enough to compensate for this.</p>
      <p>Further experiments are needed, both in terms of exploring a broader range of variations in
the prompting and in the consideration of different models or versions of the models. Important
aspects to consider in this sense are the differences between public web interfaces and publicly
available sources for the models, and differences in model size.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>This paper has been partially supported by the CANTOR project (PID2019-108927RB-I00), funded
by the Spanish Ministry of Science and Innovation.</p>
    </sec>
    <sec id="sec-8">
      <title>References</title>
      <p>[5] S. A. Malec, Proppian structural analysis and XML modeling, Proc. of Computers, Literature and Philology (CLiP 2001) (2001).</p>
      <p>[6] W. V. H. Yarlott, M. A. Finlayson, ProppML: A complete annotation scheme for Proppian morphologies, in: 7th Workshop on Computational Models of Narrative (CMN 2016), Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2016.</p>
      <p>[7] F. Peinado, P. Gervás, B. Díaz-Agudo, A description logic ontology for fairy tale generation, in: Procs. of the Workshop on Language Resources for Linguistic Creativity, LREC, volume 4, 2004, pp. 56–61.</p>
      <p>[8] T. Declerck, A. Aman, M. Banzer, D. Macháček, L. Schäfer, N. Skachkova, Multilingual ontologies for the representation and processing of folktales, Proceedings of the LT4DHCSEE in conjunction with RANLP (2017) 20–23.</p>
      <p>[9] F. Pannach, An ontology-driven information system based on Vladimir Propp’s morphology of the folktale for southern African folktales, Master’s thesis, University of Göttingen, Germany, 2019.</p>
      <p>[10] M. A. Finlayson, Deriving narrative morphologies via analogical story merging, New Frontiers in Analogy Research (2009) 127–136.</p>
      <p>[11] M. Despontin, L. Sbattella, R. Tedesco, Natural language processing for storytelling and role playing: a training system based on the Propp model, in: ICERI2010 Proceedings, IATED, 2010, pp. 5036–5045.</p>
      <p>[12] M. A. Finlayson, Learning narrative structure from annotated folktales, Ph.D. thesis, Massachusetts Institute of Technology, 2012.</p>
      <p>[13] M. A. Finlayson, Inferring Propp’s functions from semantically annotated text, Journal of American Folklore 129 (2016) 55–77.</p>
      <p>[14] M. A. Finlayson, ProppLearner: Deeply annotating a corpus of Russian folktales to enable the machine learning of a Russian formalist theory, Digital Scholarship in the Humanities 32 (2017) 284–300.</p>
      <p>[15] B. Min, H. Ross, E. Sulem, A. P. B. Veyseh, T. H. Nguyen, O. Sainz, E. Agirre, I. Heintz, D. Roth, Recent advances in natural language processing via large pre-trained language models: A survey, ACM Computing Surveys 56 (2023) 1–40.</p>
      <p>[16] M. Alizadeh, M. Kubli, Z. Samei, S. Dehghani, J. D. Bermeo, M. Korobeynikova, F. Gilardi, Open-source large language models outperform crowd workers and approach ChatGPT in text-annotation tasks, arXiv preprint arXiv:2307.02179 (2023).</p>
      <p>[17] J. Savelka, K. D. Ashley, The unreasonable effectiveness of large language models in zero-shot semantic annotation of legal texts, Frontiers in Artificial Intelligence 6 (2023).</p>
      <p>[18] M. Hasanain, F. Ahmed, F. Alam, Large language models for propaganda span annotation, arXiv preprint arXiv:2311.09812 (2023).</p>
      <p>[19] Gemini Team, Gemini: A family of highly capable multimodal models, 2023. arXiv:2312.11805.</p>
      <p>[20] J. Manyika, S. Hsiao, An overview of Bard: an early experiment with generative AI, AI.Google Static Documents 2 (2023).</p>
      <p>[21] S. Mandvikar, Factors to consider when selecting a large language model: A comparative analysis, International Journal of Intelligent Automation and Computing 6 (2023) 37–40.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>V.</given-names>
            <surname>Propp</surname>
          </string-name>
          , Morphology of the Folktale, University of Texas Press,
          <year>1968</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>E. D.</given-names>
            <surname>Williams</surname>
          </string-name>
          ,
          <source>The Fairy Tales by Madame d'Aulnoy, Ph.D. thesis</source>
          , Rice University, Houston, Texas,
          <year>1982</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M. A.</given-names>
            <surname>McGowan</surname>
          </string-name>
          ,
          <article-title>An analysis of the fantasy plays of James M. Barrie utilizing Vladimir Propp's structural model of the fairy tale</article-title>
          ,
          <source>Ph.D. thesis</source>
          , New York University,
          <year>1984</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>D. J.</given-names>
            <surname>Latourette</surname>
          </string-name>
          ,
          <article-title>Doctor Who meets Vladimir Propp: A comparative narrative analysis of myth/folktale and the television science fiction genre</article-title>
          ,
          <source>Ph.D. thesis</source>
          , Northwestern University,
          <year>1990</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>