=Paper=
{{Paper
|id=Vol-2160/C3GI_2017_paper_10
|storemode=property
|title=Comparative Evaluation of Elementary Plot Generation Procedures
|pdfUrl=https://ceur-ws.org/Vol-2160/C3GI_2017_paper_10.pdf
|volume=Vol-2160
|authors=Pablo Gervas
}}
==Comparative Evaluation of Elementary Plot Generation Procedures==
<pdf width="1500px">https://ceur-ws.org/Vol-2160/C3GI_2017_paper_10.pdf</pdf>
<pre>
    Comparative Evaluation of Elementary Plot
             Generation Procedures

                                   Pablo Gervás

        Instituto de Tecnología del Conocimiento - Facultad de Informática,
                        Universidad Complutense de Madrid
                     Ciudad Universitaria, 28040 Madrid, Spain
                                   pgervas@ucm.es
                    WWW home page: http://nil.fdi.ucm.es/


      Abstract. There are many different abstractions as to what a ’story-
      telling’ mechanism might be, each based on a particular understand-
      ing of what makes stories come together as a whole. Examples may be:
      archetypical instances of plot, story grammars, or the famous canoni-
      cal sequence of character functions proposed by Vladimir Propp. From
      a computational point of view, each of these mechanisms can be used
      to construct or to validate new stories. The present paper carries out a
      comparative evaluation of a number of plot generation procedures, by
      grounding all of them on a basic reference vocabulary for the represen-
      tation of narrative units, and applying to all of them a set of metrics
      distilled from the same procedures. The resulting set of computational
      tools is used in combination for comparative evaluation.

      Keywords: computational creativity, narrative, story grammars, char-
      acter functions, metrics for narrative


1   Introduction

Humans have for a long time tried to understand how it is that they can come up
with stories that make sense and are enjoyable. In this endeavour, many different
abstractions as to what a ’story-telling’ mechanism might be have arisen. Each of
these mechanisms is based on a particular understanding of what makes stories
come together as a whole. Examples of how these different understandings of the
essence of storyness are captured may be: archetypical instances of plot, story
grammars, or the canonical sequence of character functions proposed by Vladimir
Propp. From a computational point of view, each of these ways of capturing what
makes a story work can be understood in two different ways: either as a procedure
for constructing new stories or as a procedure for determining whether a given
candidate sample is a valid story. Each of these mechanisms must be considered,
at most, one possible simplification of the much more complex problem of
story-telling.
    The present paper carries out a comparative evaluation of a number of plot
generation procedures, by grounding all of them on a basic reference vocabulary
for the representation of narrative units, and applying to all of them a set of
metrics distilled from the same procedures.


2     Previous Work

For the purposes of this paper we focus on an abstract view of stories that
concentrates on their overall narrative structure without considering details be-
yond a particular level of abstraction. The level of abstraction we select is that of
large-grain description of activities of characters that are relevant to the narrative
structure. We refer to this abstract view of a story as plot.
    Work on modelling the abstract structure of story has taken place both at
the theoretical level – models of story structure in abstract terms – and at the
computational level – computational implementations for story construction.


2.1   Theoretical Abstractions of Story Structure

Vladimir Propp [9] identified a set of regularities in a subset of the corpus of Rus-
sian folk tales and formulated them in terms of character functions, understood
as acts of the character defined from the point of view of their significance for
the course of the action. Character functions represent a certain contribution to
the development of the narrative by a given character. According to Propp, for
the given set of tales, the number of such functions was limited and the relative
order of appearance of these functions was noticeably stable. This led him to
postulate that all these tales could be considered instances of a single structure.
    Gervás et al. [6] reviewed existing work on the description of plot to propose
a set of schemas, compiled from various sources, and expressed in terms of
sequences of character functions. The character functions employed came from an
extended set, which used Propp’s basic 31 as a seed and added additional
elements where necessary to cover the features expressed in the reviewed
descriptions of plot.
    Gervás et al. [5] further extended this work and devised an extended set of
units of abstraction for narrative, equivalent to character functions, but defined
as a vocabulary for the annotation of a corpus of plots of musicals. 1
    George Lakoff attempted [7] a reformulation of Propp’s account of Russian
folk tales as a transformational grammar in Chomsky’s style,2 then very much
in vogue. The paper argues for the potential of different formal mechanisms to
capture different aspects of the complexity of stories as identified in Propp’s
account, but the grammar described is actually incomplete (with rules missing
for certain non-terminal symbols used). A simplified version3 is provided in
Table 1.

1
  Because the vocabulary was developed to help a large set of volunteer annotators,
  and the term “character function” was considered confusing, the term plot element
  was used instead.
2
  A number of grammar-based descriptions of story structure are reviewed in this
  paper. Each of them was originally formulated with a different notation. To make
  it easier to understand the differences and similarities across the different solutions
  reviewed, an attempt has been made to unify them into a single notation. This
  notation includes elements required to represent the peculiarities appearing over the
  complete set, but does not match a particular formalism. In the rules presented
  below, elements appearing in a sequence would appear in the same order in the
  presentation of a story; → is used to indicate that the single term to the left can be
  rewritten (built) as the sequence of terms to the right; | indicates disjunction (choice);
  simple brackets are used to indicate optional elements; and square brackets are used
  to indicate that one or many of the corresponding elements should be included.
3
  References to Proppian character functions are all transcribed in terms of a unified
  vocabulary (as defined in [3, 4]) and represented in typewriter font.

  Plot → ComplicatingSequence ResolvingSequence
  ResolvingSequence → (Episode) Resolution trigger resolved Reward
  ResolvingSequence → DonorSequence (Episode) Resolution trigger resolved Reward
  DonorSequence → test by donor hero reaction acquisition magical agent use magical agent
  Resolution → struggle victory | difficult task task resolved
  ComplicatingSequence → (HelplessnessSequence) Complication begin counteraction
  Complication → villainy | lack
  HelplessnessSequence → interdiction Violation
  Violation → WilfullViolation | DeceptionByVillain SubmissionOfHero
Table 1. Lakoff’s reinterpretation of Propp’s morphology for Russian Folk Tales as a
grammar


    Rumelhart [10] pioneered the study of the structure of stories in the form of
a grammar. Rumelhart suggests that the grammar he developed “accounts in
a reasonable way for the structure of a wide range of simple stories”. Rumelhart’s
grammar includes a set of syntactical rules that generate the constituent
structure of stories and a parallel set of semantic interpretation rules which
“determine the semantic representation of the story”. The semantic interpretation
rules have received less attention than the syntactical rules. Rumelhart’s syntactic
rules, and the associated semantic interpretation rules, are presented in Table 2.
    Thorndyke [11] carried out a set of experiments on the comprehension and
recall of narrative discourse, and used for this a simplified version of Rumelhart’s
grammar. A simple transcription of this grammar is given in Table 3.

Computational Approaches to Story Generation

Although story grammars were discredited as an actual model of human cognitive
processing of stories [1], they remained a popular technique with researchers in
story generation. The Joseph system [8] and the BRUTUS system [2] were based on
story grammars. Both were successful in producing a number of stories of high
quality.
     The Propper system [3] is a computational implementation of the procedure
for generating stories described by Propp [9]. It uses Propp’s canonical sequence
of character functions for Russian folk tales, selects character functions out of it
at random and places them in the same relative order in an output sequence. The
revised version presented in [4] describes extensions to the original constructive
 Story → Setting Episode                      ALLOW(Setting,Episode)
 Setting → [State]                            AND(States)
 Episode → [Event] Reaction                   (ALLOW(Event,Event) |
                                              CAUSE(Event,Event) ),
                                              INITIATE(Event,Reaction)
 Event → Change-of-state
 Event → Action
 Event → Episode
 Reaction → InternalResponse OvertResponse     MOTIVATE(InternalResponse,OvertResponse)
 InternalResponse → Emotion | Desire
 OvertResponse → Action | *Attempt
 Attempt → Plan Application                    MOTIVATE(Plan,Application)
 Application → (*Preaction) Action Consequence CAUSE(Action,Consequence) |
                                               INITIATE(Action,Consequence) |
                                               ALLOW(Action,Consequence)
 Preaction → Subgoal Attempt                   MOTIVATE(Subgoal,Attempt)
 Consequence → Reaction | Event
         Table 2. Rumelhart’s syntactical and semantic interpretation rules


             Story → Setting Theme Plot Resolution   Attempt → Event | Episode
             Setting → State                         Outcome → Event | State
             Theme → Goal | Event Goal               Resolution → Event | State
             Plot → Episode                          Subgoal → DesiredState
             Episode → Subgoal Attempt Outcome       Goal → DesiredState
                              Table 3. Thorndyke grammar


procedure that take into account the possibility of dependencies between
character functions – for instance, a kidnapping having to be resolved by
the release of the victim – and the need for the last character function in a
sequence to be a valid ending for the story. Metrics are proposed to evaluate
the validity of story candidates.
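
The basic constructive procedure attributed to Propp above (random selection
from the canonical sequence, preserving relative order) can be illustrated with
a minimal sketch. The sequence below is an abbreviated, illustrative excerpt
and not the full canonical sequence; the function name propp_baseline and the
parameter keep_prob are assumptions made for this example and do not come from
the Propper system.

```python
import random

# Abbreviated, illustrative excerpt of a canonical sequence of character
# functions (the actual sequence used by the Propper system is longer).
CANONICAL = [
    "absentation", "interdiction announced", "interdiction violated",
    "villainy", "departure", "test by donor", "hero reaction",
    "acquisition magical agent", "struggle", "victory",
    "return", "hero marries",
]

def propp_baseline(canonical, keep_prob=0.5, rng=None):
    """Keep each function with probability keep_prob (a hypothetical
    parameter); iterating in canonical order preserves relative order."""
    rng = rng or random.Random()
    return [function for function in canonical if rng.random() < keep_prob]

print(propp_baseline(CANONICAL, rng=random.Random(0)))
```

Because the selection is a filter over the canonical list, the output is always
a subsequence of it, which is the only property the description above requires.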


3     A Toolkit of Story Structure Abstractions as
      Constructors and Evaluators

To achieve a meaningful comparative evaluation of the various plot generation
procedures, we consider the following steps. First we establish a reference rep-
resentation format of abstract units of narrative, and we map it to the differ-
ent representations of narrative used by the construction procedures considered.
Second we identify how these construction procedures may be adapted to act
as validation procedures. Finally, we carry out a number of experiments that
combine the resulting resources.


3.1   Aligning Representations

In order to compare different story generation procedures with one another, it
is important that they generate outputs in a comparable representation format.
To avoid the problems associated with evaluating natural language renderings of
narrative [3], in this paper we opt for direct comparison of the various procedures
at the level of the sequence of abstract units of representation of narrative. This
requires the adoption of a common set of abstract units of representation of
narrative that can be aligned with the various elements of representation
employed by the different generative procedures.
    We adopt as the common set of abstract units of representation of narrative
the set of plot elements described in [5], which presents the following advantages.
First, it allows for a relatively straightforward correspondence between elements
in the set and their counterparts in the original sources. This is because it has
been constructed by combination of a number of prior existing sources, with
one of them being Propp’s set of character functions. Second, given that Rumel-
hart’s and Thorndyke’s grammars for stories are formulated at a higher level
of abstraction, establishing a correspondence between the terminal symbols of
those grammars and the set of plot elements may be resolved by classifying the
plot elements – which are more specific – as instances of the terminal symbols
of the grammars – which are more generic.
    The alignment between the set of plot elements and the various alternative
representations is described in Tables 4 and 5.
    It is important to note that the set of plot elements is more detailed and more
fine-grained than the set of Propp’s character functions. The correspondence
between them has been established by considering that: (1) certain character
functions represent more than one plot element – for instance, Abduction
and Imprisoned are both types of villainy – and (2) certain character functions
were phrased to encompass a range of options, and the finer granularity allows
for distinctions that were not available originally – such as Propp’s use of hero
marries to cover both Reward and Wedding.
    With respect to the terminal symbols in Rumelhart’s and Thorndyke’s gram-
mars, it must be noted that certain plot elements are slightly ambiguous with
respect to their classification. Examples of this are Cross-Dressing, which in
terms of Rumelhart’s grammar can be considered either an Action in itself or
as a Change-of-state that results from the action, or AnEnemyLoved, which in
terms of Thorndyke’s grammar can be considered an Event in itself – if one
focuses on the moment that it happens –, a State – if one focuses on the emotional
state of the protagonist – or a DesiredState – if one focuses on what the
protagonist hopes for. This suggests that the particular categories being used might
require careful refinement. We consider this task outside the scope of the present
paper. Although we might address it as further work, we opt at this stage for
accepting that certain plot elements might be classified under more than one of
the available categories.
    The existence of dependencies between character functions had been identi-
fied as a fundamental ingredient in the perception of the validity of a story [4].
The same happens for plot elements: Imprisoned calls for Rescue, Pursuit calls
for RescueFromPursuit. Such pairs are collected into a set of dependencies that
can be checked over a given sequence of plot elements. Whereas Proppian
character functions were easy to pair off (departure-return, struggle-victory), the
set of plot elements is more complex in two ways: some plot elements now have
two possible outcomes (Struggle calls for Victory or Defeat), and certain types of
                              Propp
                              character             Rumelhart               Thorndyke
Plot elements                 functions             terminals               terminals
InitialSituation              *                     State                   State
Summary                       *                     State                   State
Aspiration                    lack                  Desire, Plan            DesiredState
CallToAction                  hero dispatched       Action                  Event
Cross-Dressing                unrecognised arrival  Action, Change-of-state Event
Departure                     departure, transfer   Action                  Event
Deliverance                   delivery              Action                  Event
DisconnectedFromReality       *                     State                   State
Discovery                     *                     Action                  Event
Disguise                      unrecognised arrival  Change-of-state         Event
Epiphany                      *                     Change-of-state         Event
Escape                        trigger resolved      Action                  Event
Guidance                      *                     Action                  Event
HighStatusRevealed            *                     Change-of-state         Event
Maturation                    *                     Change-of-state         Event
Metamorphosis                 *                     Change-of-state         Event
Pursuit                       hero pursued          Action                  Event
Reconnaissance                reconnaissance        Action                  Event
Rescue                        trigger resolved      Action                  Event
RescueFromPursuit             rescue from pursuit   Action                  Event
Return                        return, transfer      Action                  Event
SomeoneLeaves                 absentation           Action                  Event
Transfiguration               transfiguration       Change-of-state         Event
Transformation                transfiguration       Change-of-state         Event
UnrecognizedArrival           unrecognised arrival  Action                  Event
Character’sReaction           hero reaction         Action                  Event
DecisionToTakeAction          begin counteraction   Action, Plan            Event
DeceptionToFitIn              *                     Action, Plan            Event
ErroneousJudgement            hero reaction         Action                  Event
Ill-fatedImprudence           *                     Action                  Event
MoralDilemmaTriumph           hero reaction         Action, Change-of-state Event
MoralDilemmaFailure           hero reaction         Action, Change-of-state Event
MistakenJealousy              *                     Action                  Event
SacrificeForAnIdeal           *                     Action                  Event
SacrificeForFamily            *                     Action                  Event
SacrificeForPassion           *                     Action                  Event
SacrificeOfLovedOnes          *                     Action                  Event
SucumbingToTemptation         hero reaction         Action                  Event
TemptationResisted            hero reaction         Action                  Event
Warning/ForbiddingDisregarded interdiction violated Action                  Event
CharacterFlaw                 *                     State                   State
BoyMeetsGirl                  *                     Action                  Event
BoyLoosesGirl                 *                     Action                  Event
Wedding                       hero marries          Action                  Event
ClassDifferences              lack                  State                   State
ForbiddenLove                 lack                  State, Action           State, Event
Inconstancy                   *                     Action                  Event
InvoluntaryCrimesOLove        *                     Action                  Event
Adultery                      villainy              Action                  Event
AnEnemyLoved                  *                     Emotion                 Event, State, DesiredState
CrimesOfLove                  *                     Action                  Event
LoveShift                     *                     Emotion                 Event
LoveTriangle                  *                     Action                  Event
MurderousAdultery             villainy              Action                  Event
One-sidedLove                 *                     Emotion                 Event, DesiredState
ObstaclesToLove               lack                  State, Action           State, Event
ParentConvinced               trigger resolved      Action, Change-of-state Event
RecoveryOfALostOne            trigger resolved      Action, Change-of-state Event
   Table 4. Alignment between abstract units of representation of narrative (1)
                             Propp
                             character                 Rumelhart                        Thorndyke
Plot elements                functions                 terminals                        terminals
CoupleWantsToMarry           lack                      Desire, Plan                     Event, DesiredState
Abduction                    villainy                  Action                           Event
Branding                     branding                  Action                           Event
Deception                    *                         Action, Plan                     Event
DifficultTask                difficult task            Action                           Event
Disaster                     *                         Action                           Event
ShameOfLovedOne              *                         Action, Change-of-state          Event
Exposure                     false hero exposed        Action, Change-of-state          Event
Forbidding/Warning           interdiction announced    Action                           Event
MistakenMurder               villainy                  Action                           Event
Lack                         lack                      State, Change-of-state, Action   Event
LossOfLovedOnes              lack                      Action, Change-of-state          Event
Madness                      *                         State, Change-of-state, Action   Event
Misfortune                   lack                      State, Change-of-state, Action   Event
Persuasion                   *                         Action                           Event
Poverty                      lack                      State, Change-of-state           Event
Punishment                   villain punished          Action                           Event
Recognition                  false hero exposed        Action, Change-of-state          Event
Remorse                      *                         Emotion                          Event
Tested                       test by donor             Action                           Event
TheEnigma                    *                         Action                           Event
Defeat                       *                         Action                           Event
Villainy                     villainy                  Action                           Event
Imprisoned                   villainy                  Action                           Event
LessonLearned                *                         Action, Change-of-state          Event
Ambition                     *                         Desire, Plan                     Event, DesiredState
IAmWhatIAam                  *                         State, Emotion                   Event
Complicity                   complicity                Action                           Event
ConflictWithAGod             *                         Action                           Event
Cross-RankRivalry            *                         Action, State, Emotion           Event
DaringEnterprise             *                         Action                           Event
HatredBetweenFriends         *                         Emotion, State                   State
Jealousy                     *                         Emotion, State                   State
MisunderstandingArises       difficult task            Action, Change-of-state          Event
Revenge                      villain punished          Action                           Event
Revolt                       *                         Action                           Event
Rivalry                      *                         Action, State, Emotion           Event
Struggle                     struggle                  Action                           Event
JudgementDeferredToAuthority *                         Action                           Event
Trickery                     trickery                  Action                           Event
Underdog                     *                         Action, State                    Event
UnfoundedClaims              unfounded claims          Action                           Event
UnrelentingGuardian          lack                      Action, State                    Event
Assistance                   *                         Action                           Event
BondStrengthened             *                         Change-of-state, Emotion         Event
UsefulInformation            *                         Change-of-state                  Event
LackFulfilled                trigger resolved          Action, Change-of-state          Event
AspirationAchieved           trigger resolved          Action, Change-of-state          Event
ProvisionOfMagicalAgent      acquisition magical agent Action                           Event
RepentanceRewarded           *                         Action                           Event
Reward                       hero marries              Action                           Event
Riches                       hero marries              Action, State, Change-of-state   Event
Victory                      victory                   Action                           Event
MisunderstandingCleared      task resolved             Action, Change-of-state          Event
Reconciliation               hero marries              Action, Change-of-state, Emotion Event
Repentance                   *                         Emotion, Change-of-state         Event
Solution                     task resolved             Action                           Event
   Table 5. Alignment between abstract units of representation of narrative (2)
action are represented at different levels of abstraction (Villainy is included but
also Abduction, Imprisoned, ...). This influences the values of the metrics applied
later in the paper.

3.2   Construction Procedures
We consider the following construction procedures:
SchemaBaseline random choice of one out of the set of schemas of narrative
   identified in [6] transcribed in terms of plot elements.
ProppBaseline an adapted version of Propp’s suggested method of selecting
   elements at random from the canonical sequence of character functions [3],
   re-formulated in terms of plot elements in the reference vocabulary (a random
   choice is made when one character function can correspond to more than one
   plot element).
ProppDependency an adapted version of the refinement proposed in [4] that
   extends ProppBaseline with restrictions that maximise satisfaction of de-
   pendencies between plot elements across the sequence and a preference for
   closing on plot elements used at the end of tales in the corpus.
PropperGrammar grammar-based generation using an instance of the gram-
   mar used by the Propper system [3] with a lexicon that associates plot ele-
   ments to Proppian character functions (as per Tables 4 and 5).
LakoffGrammar grammar-based generation using an instance of Lakoff’s gram-
   mar for Proppian tales [7] with a lexicon that associates plot elements to
   Proppian character functions (as per Tables 4 and 5).
RumelhartGrammar grammar-based generation using an instance of Rumel-
   hart’s grammar [10] with a lexicon that associates plot elements to grammar
   terminal symbols (as per Tables 4 and 5).
ThorndykeGrammar grammar-based generation using an instance of Thorndyke’s
   grammar [11] with a lexicon that associates plot elements to grammar ter-
   minal symbols (as per Tables 4 and 5).
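
As an illustration of the grammar-based procedures listed above, the following
minimal sketch expands the rules of Table 3 at random and realises each terminal
symbol through a lexicon of plot elements. The lexicon is a heavily abbreviated,
hypothetical excerpt of the Thorndyke column of Tables 4 and 5, and the function
generate with its max_depth cut-off is an assumption made for this example, not
a description of the implementation used in the paper.

```python
import random

# Rules transcribed from Table 3: each symbol maps to its alternative
# right-hand sides.
GRAMMAR = {
    "Story": [["Setting", "Theme", "Plot", "Resolution"]],
    "Setting": [["State"]],
    "Theme": [["Goal"], ["Event", "Goal"]],
    "Plot": [["Episode"]],
    "Episode": [["Subgoal", "Attempt", "Outcome"]],
    "Attempt": [["Event"], ["Episode"]],
    "Outcome": [["Event"], ["State"]],
    "Resolution": [["Event"], ["State"]],
    "Subgoal": [["DesiredState"]],
    "Goal": [["DesiredState"]],
}

# Hypothetical, abbreviated lexicon: terminal symbol -> plot elements.
LEXICON = {
    "State": ["InitialSituation", "CharacterFlaw", "Jealousy"],
    "Event": ["Departure", "Struggle", "Victory", "Rescue"],
    "DesiredState": ["Aspiration", "Ambition", "AnEnemyLoved"],
}

def generate(symbol, rng, depth=0, max_depth=4):
    """Expand symbol into a flat list of plot elements."""
    if symbol in LEXICON:  # terminal symbol: pick one plot element
        return [rng.choice(LEXICON[symbol])]
    options = GRAMMAR[symbol]
    # Past max_depth, take the first alternative; for Attempt this is Event,
    # which bounds the Attempt -> Episode recursion.
    rhs = options[0] if depth >= max_depth else rng.choice(options)
    sequence = []
    for child in rhs:
        sequence.extend(generate(child, rng, depth + 1, max_depth))
    return sequence

print(generate("Story", random.Random(7)))
```

The depth cut-off is needed only because Attempt can rewrite to Episode, which
contains Attempt again; any other way of bounding the recursion would serve.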

3.3   Evaluation Procedures
The different formalisms for story generation described in Section 3.2 can be
adapted to provide a diagnostic procedure that, given a sequence of plot
elements, returns a numerical score that represents some type of conformance
of that sequence to the view of narrative exemplified by the approach.
    Some of the construction procedures considered provide simple solutions to
achieve this. The following metrics are considered:

SS similarity between the candidate sequence and the most similar of the schemas
   in the set of schemas of narrative identified in [6].
PS conformance to Propp’s canonical sequence of character functions – as
   presented in [3] – applied to the adapted version of the canonical sequence
   used in ProppBaseline above, and corrected to deal with the differences
   in correspondence between character functions and plot elements.
DS ratio of satisfied dependencies over total number of dependencies present –
   adapted from the metric proposed in [4], relying on the set of dependencies
   identified for plot elements.
E(Cor ) considers whether the final plot element in a candidate sequence is
   valid as an ending, defined in terms of a corpus Cor – which in the present
   case is Propp’s original set of Russian tales (rt).
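
The DS and E(Cor) metrics above can be sketched as follows. This is a minimal
illustration under stated assumptions: the dependency list contains only the
pairs named in Section 3.1, each antecedent-consequent pair is counted as one
dependency, a sequence with no dependencies present scores 100, and the set of
valid endings passed to ending_score is hypothetical. The exact counting used
in [4] may differ.

```python
# Dependency pairs named in the text; the full set is larger.
DEPENDENCIES = [
    ("Imprisoned", "Rescue"),
    ("Pursuit", "RescueFromPursuit"),
    ("Struggle", "Victory"),
    ("Struggle", "Defeat"),
]

def ds(sequence):
    """Percentage of dependencies present in the sequence that are satisfied,
    i.e. whose consequent appears after the antecedent."""
    present = satisfied = 0
    for i, element in enumerate(sequence):
        for antecedent, consequent in DEPENDENCIES:
            if element == antecedent:
                present += 1
                if consequent in sequence[i + 1:]:
                    satisfied += 1
    # Assumption: a sequence with no dependencies present scores 100.
    return 100.0 * satisfied / present if present else 100.0

def ending_score(sequence, valid_endings):
    """100 if the final plot element is a valid ending, else 0."""
    return 100.0 if sequence and sequence[-1] in valid_endings else 0.0

print(ds(["Imprisoned", "Departure", "Rescue"]))
print(ds(["Struggle", "Victory"]))
print(ending_score(["Struggle", "Victory", "Wedding"], {"Wedding"}))
```

Under this pairwise counting, a plot element with two alternative outcomes, such
as Struggle, contributes two dependencies, of which a typical plot satisfies one.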

    A grammar provides a strict judgment on the validity of a given sequence,
classifying it as either valid or invalid. We require a metric that indicates the
degree of partial conformance of a candidate sequence to a given grammar. To
this end, for any parse of a candidate plot element sequence that does not result
in a valid parse, we build a shadow tree that uses empty place holder nodes for
the top part of the grammar, until the nodes that have been identified from the
input sequence can be linked to it. In any given shadow tree, there are a number
of empty nodes – which are simply place holders for non-terminal symbols of the
grammar for which no support has been found in the input sequence – and non-
empty nodes – which correspond to assignments of non-terminal symbols of the
grammar to parses of subsequences of elements found in the candidate sequence.
As an indicative first approximation, we consider the following metric with
respect to a grammar X :

GR(X ) ratio of non-empty nodes over the total number of nodes in the shadow
  tree.
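
A minimal sketch of how GR(X) could be computed over a shadow tree is given
below. The Node class and the example tree are hypothetical; in particular,
which nodes are included in the tree (for instance, whether terminal symbols
count as nodes) is an assumption of this sketch and is not specified above.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    symbol: str
    empty: bool = False  # True for place-holder nodes with no support
    children: List["Node"] = field(default_factory=list)

def gr(root):
    """Percentage of non-empty nodes over all nodes in the shadow tree."""
    total = non_empty = 0
    stack = [root]
    while stack:
        node = stack.pop()
        total += 1
        if not node.empty:
            non_empty += 1
        stack.extend(node.children)
    return 100.0 * non_empty / total

# Hypothetical shadow tree for a partial parse under the grammar of Table 3:
# only Setting and Plot (with its Episode) were supported by the input.
tree = Node("Story", empty=True, children=[
    Node("Setting"),
    Node("Theme", empty=True),
    Node("Plot", children=[Node("Episode")]),
    Node("Resolution", empty=True),
])
print(gr(tree))
```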


3.4   Combining Construction and Evaluation

The set of construction procedures is run 100 times to produce 100 different
candidate sequences. The set of metrics is applied to all the candidate sequences.
The average results for the various construction procedures over the basic metrics
are presented in Table 6, together with some basic data on sequence length. The
values of all metrics are normalised over 100 for ease of comparison.


                            minL  maxL  Len   PS   DS  E(rt)   SS  GR(R)  GR(T)
    SchemaBaseline             4    12    8   39   40     71  100     53     39
    ProppBaseline              8    18   12   74   17     64   40     63     39
    ProppDependency            7    22   14   80   49     86   46     66     39
    PropperGrammar             7    14   11   68   40    100   55     53     40
    LakoffGrammar              6    13    8   40   43    100   44     29     24
    RumelhartGrammar           4    17    8    2    7     16   26     89     40
    ThorndykeGrammar           6    19    7    0    4      5   22     50     91
  Table 6. Results for basic metrics obtained by different construction procedures


4   Discussion

The first columns of Table 6 refer to the length of the generated sequences of
plot elements. In each case, minimum length (minL), maximum length (maxL)
and average length (Len) are given. Differences in length across the sequences
arise from different factors, depending on the nature of the technique employed.
    The SchemaBaseline instantiates one of the available schemas, which are
of fixed length. ProppBaseline and ProppDependency are limited by the
size of Propp’s canonical sequence, and variations arise from the choices made
– random (ProppBaseline) and driven by satisfaction of dependencies across
the elements chosen (ProppDependency). The remaining solutions apply dif-
ferent grammars to generate sequences. Variations in length arise from different
choices over the available rules of the grammar. Both PropperGrammar and
LakoffGrammar use grammars intended to capture Propp’s account. Rumel-
hartGrammar and ThorndykeGrammar use grammars intended for more
generic concepts of story.
    With respect to the proposed metrics, the behaviour of the different con-
struction procedures differs widely, as expected.
    For PS, which captures conformance to Propp’s canonical sequence, the best
score is obtained by ProppDependency (80), with ProppBaseline a close second (74).
The metric fails to give top scores of 100 because it sometimes penalises the
non-appearance of optional elements of the sequence. Enforcement of
dependencies makes more of these optional elements appear, whenever their
antecedents have already been included. The relative performance of Prop-
perGrammar and LakoffGrammar can be interpreted as an indication that
their grammars are not entirely capturing the essence of the canonical sequence,
and that the grammar used by the Propper system does this slightly better
than Lakoff’s grammar. In this sense, SchemaBaseline performs reasonably
well (39), indicating that there is a close relation between the canonical se-
quence and the plot schemas used as reference. RumelhartGrammar and
ThorndykeGrammar are clearly not built to consider this aspect.
    For DS – degree of satisfaction of dependencies between plot elements – the
top performer is again ProppDependency (49), which incorporates this aspect
in its decision processes. The surprisingly low value in this case arises from the
frequent existence of multiple dependents for a given plot element, whereas in
a plot each appearance of an antecedent is normally resolved by a single conse-
quent. It is interesting to note that the procedures that rely on representation of
structure (SchemaBaseline, PropperGrammar and LakoffGrammar) also
perform reasonably well (values around 40), presumably because the structure
they use captures dependencies across the elements to a certain extent. Other
procedures perform poorly. The length of the plots is not an issue because the
metric is normalised over the number of potential dependencies appearing in the
plot.
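
    The effect of multiple dependents on DS can be illustrated with a small sketch.
This is one possible reading of the normalisation described above, not the
definition used in the paper: it assumes that every dependent of an element
appearing in the plot counts as one potential dependency, and that a dependency
is satisfied when the dependent appears later in the plot. The dependency table
and the function name ds are hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical dependency table: antecedent -> elements that may resolve it.
HYPOTHETICAL_DEPENDENTS: Dict[str, Set[str]] = {
    "Villainy": {"Punishment", "Revenge", "Victory"},
}


def ds(plot: List[str], dependents: Dict[str, Set[str]]) -> float:
    """Satisfied dependencies over potential dependencies, scaled to 0-100."""
    potential = 0
    satisfied = 0
    for i, element in enumerate(plot):
        later = set(plot[i + 1:])
        for consequent in dependents.get(element, set()):
            potential += 1
            if consequent in later:
                satisfied += 1
    if potential == 0:
        return 0.0  # assumption: a plot with no potential dependencies scores 0
    return 100.0 * satisfied / potential


# The antecedent is resolved once, yet only one of three dependents appears.
plot = ["Villainy", "Struggle", "Victory"]
print(round(ds(plot, HYPOTHETICAL_DEPENDENTS), 1))  # 33.3
```

Under this reading, a plot that resolves its antecedent with a single consequent
still scores well below 100, which is consistent with the low values observed.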
    With respect to endings as found in Russian folk tales, E(rt), it is clear that
PropperGrammar (100) and LakoffGrammar (100) by construction capture
perfectly the typical ending of a Russian folk tale from the corpus used by Propp.
ProppDependency (86) seems to have sometimes been forced to add trailing
consequents that spoil its performance. SchemaBaseline does reasonably well
(71) and ProppBaseline (64) succeeds often simply by picking elements from
the end of the canonical sequence. Additional metrics for endings need to be
considered.
    On similarity to existing schemas, SS, SchemaBaseline (100) shines, and
all the procedures based on the Proppian account fare quite well, again pointing
towards similarities between the canonical sequence and the schemas.
    On compliance with a given grammar, not surprisingly, RumelhartGrammar
and ThorndykeGrammar come out as top performers on their respective
grammars, with others far behind.
    The description of the results in terms of averages is useful to identify
differences between different procedures. However, it also obscures the differences
that may arise between plots generated by the same procedure. It is at this level
that valuable insights for further work may arise.


           SchemaBaseline 18            SchemaBaseline 03
PS      52 Villainy                  28 Imprisoned
DS      86 Pursuit                   40 Pursuit
E(rt)  100 RescueFromPursuit          0 RescueFromPursuit
SS     100 Struggle                 100 Struggle
GR(R)   53 Victory                   53 Victory
GR(T)   37 Revenge                   37 Maturation
Av    71.3 Riches                  43.0 RepentanceRewarded

           ProppBaseline 34             ProppBaseline 03              ProppDependency 23        ProppDependency 86
PS      85 TemptationResisted        53 Adultery                   86 Character’sReaction    71 Tested
DS      47 Adultery                   0 Tested                     67 CoupleWantsToMarry     20 ProvisionOfMagicalAgent
E(rt)  100 ProvisionOfMagicalAgent    0 Disguise                  100 Departure               0 ForbiddenLove
SS      46 Tested                    31 Branding                   49 UnfoundedClaims        39 Branding
GR(R)   67 MoralDilemmaFailure       53 Solution                   75 Struggle               53 Return
GR(T)   37 UnfoundedClaims           37 Return                     58 Victory                37 Recognition
Av    63.7 Branding                29.0 Pursuit                 72.27 Return               36.7 Transfiguration
           Victory                      Branding                      Pursuit
           Rescue                                                     RescueFromPursuit
           Pursuit                                                    Exposure
           RescueFromPursuit                                          Branding
           UnrecognizedArrival                                        Punishment
           Exposure                                                   Reward
           Transformation
           Punishment
           PropperGrammar 94            PropperGrammar 05             LakoffGrammar 80          LakoffGrammar 74
PS      85 ForbiddenLove             57 MistakenMurder             60 Poverty                16 Interdiction
DS      58 CallToAction              10 CallToAction               84 DecisionToTakeAction    0 DeceptionByVillain
E(rt)  100 DecisionToTakeAction     100 DecisionToTakeAction      100 Struggle              100 SubmissionOfHero
SS      40 Departure                 60 Departure                  46 Victory                35 Villainy
GR(R)   66 Tested                    41 UnrecognizedArrival        85 Reconciliation          0 DecisionToTakeAction
GR(T)   50 Character’sReaction       37 UnfoundedClaims            37 Reward                  0 DifficultTask
Av    66.5 ProvisionOfMagicalAgent 50.8 DifficultTask            68.7                      25.2 MisunderstandingCleared
           Struggle                     MisunderstandingCleared                                 Escape
           Branding                     hero recognised                                         Reward
           Victory                      Recognition
           Rescue                       Transformation
           Return                       Revenge
           Pursuit                      Wedding
           RescueFromPursuit
           RumelhartGrammar 52          RumelhartGrammar 31           ThorndykeGrammar 58       ThorndykeGrammar 01
PS      24 Madness                    0 CharacterFlaw               4 InitialSituation        0 Jealousy
DS      34 CharacterFlaw              9 DisconnectedFromReality     0 Trickery                0 Ambition
E(rt)  100 RecoveryOfALostOne         0 LackFulfilled             100 One-sidedLove           0 AnEnemyLoved
SS      37 Metamorphosis             17 LessonLearned              39 One-sidedLove          16 Aspiration
GR(R) 100 ErroneousJudgement         18 Madness                    58 DifficultTask          36 CoupleWantsToMarry
GR(T)   37 Reconciliation            37 Epiphany                  100 ClassDifferences       56 AnEnemyLoved
Av    55.3 LoveShift               13.5 Guidance                 50.2 Riches               18.0 Warning/ForbiddingDisregarded
           Revenge                      Ambition                                                Adultery
                                        Aspiration                                              Return
                                        SacrificeForPassion                                     Summary
                                        Persuasion                                              ClassDifferences
                                                                                                Jealousy
               Table 7. Examples of specific plots with values for metrics

    Examples of specific plots, together with their values for the given metrics, are
shown in Table 7. For each construction procedure, the best and worst performers
in terms of the average of all the metrics are shown. This can serve to show a
wide range of possible values without cherry-picking interesting outputs. For
each example, the first column indicates the values obtained by the story on the
metrics, and the second column indicates the sequence of plot elements produced.
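
    The selection of best and worst performers by average can be sketched as
follows. The metric values are copied from the two SchemaBaseline samples in
Table 7; the function names average and best_and_worst are hypothetical, and the
sketch assumes that the reported Av is the plain mean of the six metric values.

```python
from typing import Dict, Tuple

Scores = Dict[str, float]


def average(scores: Scores) -> float:
    """Plain mean of all metric values for one sample."""
    return sum(scores.values()) / len(scores)


def best_and_worst(samples: Dict[str, Scores]) -> Tuple[str, str]:
    """Return the identifiers of the highest and lowest averaging samples."""
    ranked = sorted(samples, key=lambda sample_id: average(samples[sample_id]))
    return ranked[-1], ranked[0]


# Metric values as listed in Table 7 for the two SchemaBaseline samples.
samples = {
    "SchemaBaseline 18": {"PS": 52, "DS": 86, "E(rt)": 100,
                          "SS": 100, "GR(R)": 53, "GR(T)": 37},
    "SchemaBaseline 03": {"PS": 28, "DS": 40, "E(rt)": 0,
                          "SS": 100, "GR(R)": 53, "GR(T)": 37},
}

best, worst = best_and_worst(samples)
print(best, round(average(samples[best]), 1))    # SchemaBaseline 18 71.3
print(worst, round(average(samples[worst]), 1))  # SchemaBaseline 03 43.0
```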
    The results in Table 7 offer a number of interesting insights. The basic
structure of Propp’s account is shared by many of these solutions, resulting in
frequent appearance of similar sub-sequences across samples that score highly.
The Rumelhart and Thorndyke grammars result in sequences that score very
low on all the other metrics, but which have more surprising plot elements,
combined in ways that differ from the basic sequence. Part of the problem here
is that only the syntactic part of these grammars has been considered. If the
semantic interpretation rules provided by Rumelhart were considered to inform
the process of selecting plot elements to instantiate the grammar, better results
might be obtained. This will be considered as further work.
    The differences in performance across the different approaches and the fact
that no example performs well under all of the metrics indicate that each
approach is focusing on a particular valuable aspect of stories. This suggests that
an ideal method for plot generation should strive to combine the different aspects
in a single constructive procedure. Alternatively, a simpler way of improving
results might be to use the metrics associated with some approaches to select the
best performers out of the set of results obtained by a different approach. This
is a particularly interesting insight that will be pursued in further work.
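
    The generate-and-filter idea outlined above could take roughly the following
shape. This is a sketch of one possible design and not a procedure evaluated in
the paper: candidates produced by one approach are ranked by the mean of metrics
borrowed from other approaches, and only the top ones are kept. The two toy
metrics and the name select_best are hypothetical.

```python
from typing import Callable, Dict, List

Plot = List[str]
Metric = Callable[[Plot], float]


def select_best(candidates: List[Plot],
                metrics: Dict[str, Metric],
                keep: int = 1) -> List[Plot]:
    """Rank candidates by the mean of the given metrics and keep the top ones."""
    def combined(plot: Plot) -> float:
        return sum(metric(plot) for metric in metrics.values()) / len(metrics)
    return sorted(candidates, key=combined, reverse=True)[:keep]


def toy_ending(plot: Plot) -> float:
    """Hypothetical ending check: 100 if the plot closes on a Reward or Wedding."""
    return 100.0 if plot and plot[-1] in {"Reward", "Wedding"} else 0.0


def toy_order(plot: Plot) -> float:
    """Hypothetical ordering check: 100 if a Struggle precedes a Victory."""
    if "Struggle" in plot and "Victory" in plot:
        return 100.0 if plot.index("Struggle") < plot.index("Victory") else 0.0
    return 0.0


candidates = [
    ["Madness", "Revenge"],
    ["Villainy", "Pursuit", "Reward"],
    ["Villainy", "Struggle", "Victory", "Reward"],
]
filters = {"ending": toy_ending, "order": toy_order}
print(select_best(candidates, filters, keep=1))
# [['Villainy', 'Struggle', 'Victory', 'Reward']]
```

Taking the mean gives every metric the same weight; a weighted combination or a
minimum threshold per metric would be equally plausible alternatives.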
    It would be interesting to extend this evaluation to plot generation solutions
beyond those developed by the author. Because the methodology involves
grounding the representation used on the reference vocabulary, this would re-
quire not just access to the source code of such solutions but also a relatively
detailed understanding of the particular solution, to avoid betrayals of its spirit.


5   Conclusions


The analysis of the comparative evaluation shows that each type of procedure for
the generation of stories focuses on features that may be necessary in a story. But
such features are generally not sufficient, in the sense that other attempts to for-
mulate the structure of stories with different tools may be capturing additional
features that are also relevant. The comparative evaluation has served to identify
a number of shortcomings in the various approaches when considered individu-
ally. Refinements of the approaches and consideration of additional approaches
are possible lines of future work. However, the most promising avenue of work
for short-term improvement of results would be joint use of a given generative
procedure and validation procedures based on different aspects of stories.


Acknowledgements

This paper has been partially supported by the IDiLyCo project (TIN2015-
66655-R) funded by the Spanish Ministry of Economy, Industry and Competi-
tiveness.


References
 1. J. B. Black and R. Wilensky. An evaluation of story grammars. Cognitive Science,
    3(3):213–230, 1979.
 2. S. Bringsjord and D.A. Ferrucci. Artificial Intelligence and Literary Creativity: In-
    side the Mind of BRUTUS, a Storytelling Machine. Lawrence Erlbaum Associates,
    1999.
 3. P. Gervás. Propp’s Morphology of the Folk Tale as a Grammar for Generation. In
    Workshop on Computational Models of Narrative, Universität Hamburg, Hamburg,
    Germany, 2013. Schloss Dagstuhl.
 4. P. Gervás. Reviewing Propp’s story generation procedure in the light of compu-
    tational creativity. In AISB Symposium on Computational Creativity, AISB-2014,
    Goldsmiths, London, UK, 2014.
 5. P. Gervás, R. Hervás, C. León, and C.V. Gale. Annotating musical theatre plots
    on narrative structure and emotional content. In Seventh International Workshop
    on Computational Models of Narrative, Kraków, Poland, 2016. OpenAccess Series
    in Informatics.
 6. P. Gervás, C. León, and G. Méndez. Schemas for narrative generation mined
    from existing descriptions of plot. In Computational Models of Narrative. Schloss
    Dagstuhl, 2015.
 7. G.P. Lakoff. Structural complexity in fairy tales. The Study of Man, 1:128–150,
    1972.
 8. R. Raymond Lang. A Formal Model for Simple Narratives. PhD thesis, Tulane
    University, 1997.
 9. V. Propp. Morphology of the Folk Tale. Akademija, Leningrad, 1928.
10. D. E. Rumelhart. Notes on a schema for stories. Representation and Understanding:
    Studies in Cognitive Science, pages 211–236, 1975.
11. P. W. Thorndyke. Cognitive structures in comprehension and memory of narrative
    discourse. Cognitive Psychology, 9:77–110, 1977.

</pre>