<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Albin Zehe</string-name>
          <email>zehe@informatik.uni-wuerzburg.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Evelyn Gius</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Equal Contribution</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Leonard Konle</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>TU Darmstadt</institution>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>University of Cologne</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2021</year>
      </pub-date>
      <abstract>
        <p>This paper describes the Shared Task on Scene Segmentation (STSS) at KONVENS 2021: The goal is to provide a model that can accurately segment literary narrative texts into scenes and non-scenes. To this end, participants were provided with a set of 20 contemporary dime novels annotated with scene information as training data. The evaluation of the task is split into two tracks: The test set for Track 1 consists of 4 in-domain texts (dime novels), while Track 2 tests the generalisation capabilities of the models on 2 out-of-domain texts (highbrow literature from the 19th century). 5 teams participated in the task and submitted a model for final evaluation as well as a system description paper, with the best-performing models reaching F1-scores of 37 % for Track 1 and 26 % for Track 2. The results show that the task of scene segmentation is very challenging, but also suggest that it is feasible in principle. A detailed evaluation of the predictions reveals that the best-performing model is able to pick up many signals for scene changes, but struggles with the level of granularity that actually constitutes a scene change.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>The objective of this shared task is to develop a model capable of solving the task of scene segmentation, as discussed by Gius et al. (2019) and formally introduced by Zehe et al. (2021). According to their definition, a scene can be understood as “a segment of a text where the story time and the discourse time are more or less equal, the narration focuses on one action and space and character constellations stay the same”. The task of scene segmentation is therefore a kind of text segmentation task applicable specifically to narrative texts (e.g., novels or biographies): These texts can be seen as a sequence of segments, where some of the segments are scenes and some are non-scenes. The goal of scene segmentation is to provide both the borders of the segments as well as the classification of each segment as a scene or non-scene. Solving this task advances the field of computational literary studies: the texts of interest in this field are often very long and can therefore not easily be processed with NLP methods. Breaking them down into narratologically motivated units of meaning like scenes would enable processing these units (semi-)individually and then aggregating the results over the entire text. In addition, a segmentation into scenes allows plot- and content-based analyses of the texts. The task website is available at https://go.uniwue.de/stss2021.</p>
    </sec>
    <sec id="sec-2">
      <title>Background: Scene Segmentation</title>
      <p>This section provides an overview of the task of
scene segmentation according to the definition by
Zehe et al. (2021). For the full motivation and
description, we refer to this paper.</p>
      <p>From a narratological point of view, a scene can
be defined by reference to a set of four
dimensions: time, space, action and character constellation.
Using these dimensions, a scene is a segment of the
discours (presentation) of a narrative which
presents a part of the histoire (connected events in the
narrated world) such that (1) time is equal in
discours and histoire, (2) place stays the same, (3) it
centers around a particular action, and (4) the
character constellation is equal. None of these conditions is absolute; rather, they are relative, that is, small changes in any of them do not necessarily lead to a scene change but can rather be seen as indicators.</p>
      <p>Casting this definition as a machine learning task,
we receive as input a (narrative) text and want to
develop a model that (a) splits the text into a
sequence of segments and (b) labels each of these
segments either as a scene or as a non-scene.
Depending on how the changes are realised, boundaries can be strong or weak. Segments separated by a weak boundary can be aggregated into one segment, while segments separated by a strong boundary need to be considered separately.</p>
    </sec>
    <sec id="sec-3">
      <title>Related Work</title>
      <p>The related work for scene segmentation has been
covered in much detail by Zehe et al. (2021). For
completeness, we reproduce their overview here
with only minor adaptation:</p>
      <p>
        Segmentation tasks have been discussed in NLP
for a while, mostly with the goal of identifying
regions of news or other non-fictional texts discussing
certain topics. The task of topic segmentation is
then to identify points in the text where the topic
under discussion changes. Early work to this end
uses similarity of adjacent text segments (such as
sentences or paragraphs) with a manually designed
similarity metric in order to produce the resulting
segments. One of the best-known systems of this kind is TextTiling
        <xref ref-type="bibr" rid="ref11">(Hearst, 1997)</xref>
        , which
was applied to science magazines. Similarity
based on common words
        <xref ref-type="bibr" rid="ref2 ref5">(Choi, 2000; Beeferman
et al., 1999)</xref>
was superseded by the introduction of Latent Dirichlet Allocation
        <xref ref-type="bibr" rid="ref3">(Blei et al., 2003)</xref>
, which made it possible to segment texts into coherent snippets with similar topic distributions
        <xref ref-type="bibr" rid="ref24 ref31">(Riedl and
Biemann, 2012; Misra et al., 2011)</xref>
        . This procedure
was extended by the integration of entity coherence
        <xref ref-type="bibr" rid="ref12">(John et al., 2016)</xref>
        and Wanzare et al. (2019) have
used it on (very short) narrative texts in an attempt
to extract scripts. Recently, many approaches
making use of neural architectures deal with the
detection and classification of local coherence
        <xref ref-type="bibr" rid="ref12 ref12 ref14 ref19 ref20 ref20 ref28 ref28">(e. g. Li
and Jurafsky, 2016; Pichotta and Mooney, 2016; Li
and Hovy, 2014)</xref>
, which is an important step towards high-quality text summarization
        <xref ref-type="bibr" rid="ref34">(Xu et al., 2019)</xref>
        .
Neural text segmentation has also been applied to Chinese texts, where recurrent neural networks were shown to predict the coherence of subsequent paragraphs with an accuracy of more than 80 %
        <xref ref-type="bibr" rid="ref25">(Pang et al., 2019)</xref>
. Lukasik et al. (2020) compare three BERT-based architectures for segmentation tasks: a cross-segment BERT following the next-sentence-prediction (NSP) pre-training task and fine-tuned on segmentation, a Bi-LSTM on top of BERT to keep track of larger context, and an adaptation of a hierarchical BERT network
        <xref ref-type="bibr" rid="ref36">(Zhang et al., 2019)</xref>
        .
      </p>
      <p>
        Some work has been done on segmenting
narrative texts, but aiming at identifying topical
segments – which, as we have pointed out above, is
different from scene segmentation. With a set of
hand-crafted features, Kauchak and Chen (2005)
achieve a WindowDiff score
        <xref ref-type="bibr" rid="ref27 ref4">(Pevzner and Hearst,
2002)</xref>
        of about 0.5, evaluated on two novels.
Kazantseva and Szpakowicz (2014) have annotated
the novel Moonstone with topical segments, and
presented a model to create a hierarchy of topic
segments. They report about 0.3 WindowDiff
score. Recently, Pethe et al. (2020) have introduced
the task of chapter segmentation, which is similar
to scene segmentation in that they both focus on
narrative texts. However, it aims at detecting
chapters, which are based on structural information like
headers, whereas scenes are defined by features of
the told story not directly connected to structural
information. Notably, our dataset contains some
scenes that cross chapter boundaries, since our
characteristics of scenes are entirely independent of
such formal markers. Most closely related to our
task are the papers by Reiter (2015), who
documents a number of annotation experiments, and
Kozima and Furugori (1994), who present lexical
cohesiveness based on the semantic network
Paradigme
        <xref ref-type="bibr" rid="ref15">(Kozima and Furugori, 1993)</xref>
        as an indicator
for scene boundaries and evaluates their approach
qualitatively on a single novel. However, neither of
them provide annotation guidelines, annotated data
or a formal definition of the task.
      </p>
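      <p>For reference, the WindowDiff measure used in several of the studies above can be sketched in a few lines (our own minimal illustration, not the evaluation code of any of the cited works): a window of fixed size slides over the text, and the score is the fraction of window positions where reference and hypothesis disagree on the number of boundaries they contain.</p>

```python
def window_diff(ref, hyp, k):
    """WindowDiff (Pevzner and Hearst, 2002): ref and hyp are lists of
    boundary indicators (1 = a segment ends after this position).
    k is typically set to half the average reference segment length.
    Lower is better; 0.0 means the segmentations agree everywhere."""
    assert len(ref) == len(hyp)
    n = len(ref)
    errors = sum(
        1
        for i in range(n - k)
        if sum(ref[i:i + k]) != sum(hyp[i:i + k])
    )
    return errors / (n - k)

ref = [0, 0, 1, 0, 0, 0, 1, 0, 0, 1]
hyp = [0, 1, 0, 0, 0, 0, 1, 0, 0, 1]
print(window_diff(ref, hyp, k=3))  # one window position in seven disagrees
```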
      <p>
        A related area of research is discourse
segmentation, where the goal is also to find segments that are
not necessarily defined by topic, and are also
assigned labels in addition to the segmentation. There
are annotated news corpora in this area featuring
fine-grained discourse relations between relatively
small text spans
        <xref ref-type="bibr" rid="ref29 ref4">(Carlson et al., 2002; Prasad et al.,
2008)</xref>
        . Although larger structures have been
discussed in literature
        <xref ref-type="bibr" rid="ref21">(Grosz and Sidner, 1986)</xref>
        , no
annotated corpora have been released.
      </p>
    </sec>
    <sec id="sec-5">
      <title>Shared Task on Scene Segmentation (STSS)</title>
      <p>The Shared Task on Scene Segmentation was organised as one of the shared tasks of KONVENS 2021 (https://konvens2021.phil.hhu.de/). There were a total of 8 registrations, out of which 5 teams submitted a model for the final evaluation as well as a system description paper. The task was split into two tracks, with the first one evaluating on in-domain data and the second one on out-of-domain data. The test data was kept back for the entire duration of the task, and trained models were submitted to the organisers as Docker images for the final evaluation.</p>
      <sec id="sec-5-1">
        <title>Data</title>
        <p>Trial Data A single text, “Der kleine
Chinesengott” by Pitt Strong (aka Elisabeth von Aspern)
was released as trial data before the actual training
set, in order to show the format of the dataset and
enable participants to start working on their
implementation as soon as possible.</p>
        <p>Training Data The training data consisted of
20 annotated dime novels, which include the
15 texts from Zehe et al. (2021) as well as 5 new
texts that were annotated according to the same
guidelines. The texts are given in the appendix in
Table 3, along with detailed dataset statistics
(Table 4). Since the texts are protected by copyright,
they could not be distributed directly. Instead,
participants were asked to register with a German ebook shop and received the books as a gift on this website, along with standoff annotations and a script to merge the epub files with the annotations.</p>
      </sec>
      <sec id="sec-5-2">
        <title>Evaluation Data</title>
        <p>Track 1 The first subset of the evaluation
data, used in Track 1 of the shared task, consisted
of 4 texts from the same domain as the training
set, that is, dime novels. Detailed statistics for this
dataset are available in Table 5.</p>
        <p>Track 2 The second evaluation set, used for
Track 2, consisted of out-of-domain data,
specifically 2 high-brow literature novels. This set, presented
in detail in Table 6, was chosen to investigate how
well the submitted approaches were able to deal
with texts that are assumed to differ strongly from
the training data in writing style.
</p>
      </sec>
      <sec id="sec-5-3">
        <title>Evaluation Metrics</title>
        <p>
          Evaluating scene segmentation is a somewhat
challenging problem in itself. Zehe et al. (2021) use
two evaluation metrics, F1-score and Mathet’s γ
          <xref ref-type="bibr" rid="ref23">(Mathet et al., 2015)</xref>
          , arguing that γ is the more
suitable measure for scene segmentation: F1-score
only counts a scene boundary as correct if it is
predicted at exactly the right position, while an
offset of one sentence would already count as a
complete miss. On the other hand, γ tries to align
the predicted boundaries with the gold boundaries
and score both the fit of the alignment as well
as the classification into scene and non-scene.
However, since the γ measure itself requires the
user to specify certain parameters and it is not
immediately obvious how to set these parameters
in our context, the main evaluation in this shared
task is based on the exact F1-score. More precisely,
we represent the segmentation produced by each
system as a list of boundary predictions: Each
sentence in the text is labelled as either NOBORDER,
SCENE-TO-SCENE, SCENE-TO-NONSCENE
or NONSCENE-TO-SCENE. For example, a
sentence that starts a new scene after a segment
that is classified as a non-scene would be labelled
as NONSCENE-TO-SCENE. This classification
can then directly be compared to the gold standard
annotations.
        </p>
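        <p>To illustrate, deriving these per-sentence labels from a labelled segmentation might look as follows (a sketch under our own segment representation, not the official task format):</p>

```python
def boundary_labels(segments):
    """segments: list of (n_sentences, is_scene) pairs in text order.
    Returns one boundary label per sentence, as in the main evaluation."""
    labels = []
    prev_is_scene = None
    for n_sentences, is_scene in segments:
        if prev_is_scene is None:
            first = "NOBORDER"  # the document start is not a boundary
        elif prev_is_scene and is_scene:
            first = "SCENE-TO-SCENE"
        elif prev_is_scene and not is_scene:
            first = "SCENE-TO-NONSCENE"
        elif not prev_is_scene and is_scene:
            first = "NONSCENE-TO-SCENE"
        else:
            raise ValueError("adjacent non-scenes are assumed to be merged")
        labels.append(first)
        labels.extend(["NOBORDER"] * (n_sentences - 1))
        prev_is_scene = is_scene
    return labels

# a scene of 3 sentences, a non-scene of 2, then a scene of 2 sentences
print(boundary_labels([(3, True), (2, False), (2, True)]))
```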
        <p>The classes in this scheme are highly
imbalanced, with NOBORDER making up the vast
majority of the labels. Therefore, for our main
evaluation, we exclude the label NOBORDER and build
micro-averaged scores between the other classes.
We chose micro-averaging despite the class imbalance since the minority classes are not more important to the classification; micro-averaged scores therefore give a better representation of the overall classification performance.</p>
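        <p>This micro-averaged scoring can be sketched as follows (our own illustration; the official scoring script may differ in details such as the treatment of mislabelled boundaries):</p>

```python
def micro_f1(gold, pred, ignore="NOBORDER"):
    """Micro-averaged precision/recall/F1 over boundary labels,
    ignoring the majority NOBORDER class."""
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        if g == ignore and p == ignore:
            continue  # true negatives are not counted
        if g == p:
            tp += 1
        else:
            if p != ignore:
                fp += 1  # predicted a spurious or mislabelled boundary
            if g != ignore:
                fn += 1  # missed or mislabelled a gold boundary
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return prec, rec, f1

gold = ["NOBORDER", "NONSCENE-TO-SCENE", "NOBORDER", "SCENE-TO-NONSCENE"]
pred = ["NOBORDER", "NONSCENE-TO-SCENE", "SCENE-TO-SCENE", "NOBORDER"]
print(micro_f1(gold, pred))  # (0.5, 0.5, 0.5)
```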
        <p>For reference, we also report the γ scores of the approaches.</p>
      </sec>
      <sec id="sec-5-4">
        <title>Submitted Systems</title>
        <p>This section provides an overview of the
approaches to scene segmentation submitted by the
participants of the Shared Task.</p>
        <p>Kurfali and Wirén (2021) apply the sequential
sentence classification system proposed by Cohan
et al. (2019) to the scene-segmentation task. This
system is based on BERT, but uses a customised
input format, where each sentence of the input
sequence is separated by BERT’s special token
“[SEP]”. After passing a sequence through BERT,
the output of those “[SEP]” tokens is fed into a
multi-layer perceptron to predict a label for its
preceding sentence. While the original system
utilises a mean-squared-error loss, Kurfali and Wirén (2021) implement weighted cross-entropy to deal
with the class imbalance in the scene dataset and
make use of the IOB2 scheme instead of simple
classification with categories.</p>
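        <p>The IOB2 reformulation can be illustrated as follows (a sketch; the exact tag inventory used in the submitted system is an assumption on our part): each sentence receives a B- tag if it opens a segment and an I- tag otherwise, with the segment type as suffix.</p>

```python
def to_iob2(segments):
    """Turn (n_sentences, segment_type) pairs into IOB2 tags,
    one per sentence (e.g. B-SCENE, I-SCENE, B-NONSCENE, ...)."""
    tags = []
    for n_sentences, seg_type in segments:
        tags.append(f"B-{seg_type}")               # segment-opening sentence
        tags.extend([f"I-{seg_type}"] * (n_sentences - 1))  # continuation
    return tags

print(to_iob2([(2, "SCENE"), (1, "NONSCENE"), (2, "SCENE")]))
# ['B-SCENE', 'I-SCENE', 'B-NONSCENE', 'B-SCENE', 'I-SCENE']
```

<p>Compared to directly predicting the boundary classes, such a tagging scheme lets every sentence carry a label, which can make the class distribution less extreme.</p>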
        <p>The system submitted by Gombert (2021) builds on the idea of using sentences that function as scene borders as feature vectors for the prediction of scene borders. For this purpose, a sentence embedding space is first learned in a twin BERT training setup, in which the model separates sentences functioning as scene borders from sentences within scenes. In a second step, a gradient-boosted decision tree ensemble is fed with feature vectors from the sentence embeddings generated by the model.</p>
        <p>The system submitted by Barth and Dönicke (2021) focuses on the manual design of vectors covering different sets of features for scene segmentation. The first set consists of general linguistic features like tense, POS tags, etc. The other sets focus on features crucial for the scene segmentation task, explicitly encoding temporal expressions as well as entity mentions. These feature vectors are then used as input to a random forest classifier.</p>
        <p>The system of Hatzel and Biemann (2021) casts the problem of scene segmentation as a kind of next-sentence prediction: It focuses on the “[SEP]” tokens which appear in between two subsequent sentences in the input representation for a BERT model, and uses their embedding representation from a German BERT model. In addition to the BERT embeddings, the authors add manual features capturing changes in the character constellation that are derived from a German adaptation of the coarse-to-fine co-reference architecture (Lee et al., 2018). This final representation is fed into a fully connected layer with a softmax activation function in order to detect scene changes. Since this approach predicts too many scenes in close proximity, they evaluate different ways to suppress neighbouring scenes for their final prediction. Specifically, they use a cost function which punishes very short scenes harshly.</p>
      </sec>
      <sec id="sec-5-5">
        <title>Evaluation of the Automatic Systems for Scene Segmentation</title>
        <p>In the following, we present and discuss the performance of the submitted systems in our shared task. All results are summarised in Table 1.</p>
        <p>The most successful system on Track 1 was the one proposed by Kurfali and Wirén (2021), reaching an F1-score of 37 % on the evaluation set for Track 1. For Track 2, their model was somewhat less successful, reaching an F1-score of 17 %, which still corresponds to the second place. On Track 2, the system proposed by Gombert (2021) performed best, with an F1-score of 26 % (16 % on Track 1). All results for both systems, with evaluation for all border classes on individual texts, can be found in the appendix in Tables 7 and 8. Overall, these results show that scene segmentation is a very challenging, but not impossible task. Especially the winning system is capable of finding 51 % of all annotated scene boundaries in the in-domain data, which is a promising score. The bigger issue of this system at the moment seems to be its precision (29 %), indicating that many of the boundaries the system predicts are wrong. We provide an analysis of what leads to these results in the next section. Interestingly, all systems except the one from Kurfali and Wirén (2021) actually performed better on the out-of-domain evaluation set of Track 2 than on the (in-domain) dime novels of Track 1. However, it must also be noted that the scores are overall somewhat low and the differences should therefore not be overinterpreted. We can also see that the ranking according to the γ measure would be rather similar to the F1-score ranking. However, there are also differences in the ranking; for example, the system submitted by Hatzel and Biemann (2021) would have been ranked higher in both tracks according to γ. This shows that the selection of a fitting evaluation measure for scene segmentation is indeed important.</p>
        <p>Schneider et al. (2021) present the
“Embedding Delta Signal” as a method for both
scene segmentation and topic segmentation. They
focus on context change in documents using a
sliding window method that compares cluster
assignments of word embeddings using the cosine
distance measure and detect scene changes by
searching for local maxima in the signal. In a further
step, they distinguish between different scene types
using a simple support vector machine approach
with hyper-parameter search. They use an
additional evaluation method, intersection over union
of predicted and actual scenes, arguing that this
measure is more suitable because it punishes
scene boundaries that are in the vicinity of the gold
annotations less severely than the F1-score.</p>
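        <p>The core of such an embedding-delta signal can be sketched as follows (a toy illustration with hand-made two-dimensional vectors; the submitted system works on cluster assignments of real word embeddings):</p>

```python
def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv) if nu and nv else 0.0

def delta_signal(sent_vecs, window=2):
    """Cosine distance between the mean vectors of the preceding and
    following window at every possible split point."""
    def mean(vecs):
        return [sum(col) / len(vecs) for col in zip(*vecs)]
    return [
        1.0 - cosine(mean(sent_vecs[i - window:i]),
                     mean(sent_vecs[i:i + window]))
        for i in range(window, len(sent_vecs) - window + 1)
    ]

def local_maxima(signal, threshold=0.1):
    """Indices whose value exceeds both neighbours and a threshold;
    these are the scene change candidates."""
    return [
        i for i in range(1, len(signal) - 1)
        if signal[i] > threshold
        and signal[i] > signal[i - 1]
        and signal[i] > signal[i + 1]
    ]

# toy sentence "embeddings" with a clear topic shift after sentence 3
vecs = [[1, 0], [0.9, 0.1], [1, 0.1], [0, 1], [0.1, 1], [0, 0.9]]
sig = delta_signal(vecs, window=2)
print(local_maxima(sig))  # the single peak marks the shift
```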
      </sec>
      <sec id="sec-5-6">
        <title>Additional Evaluation</title>
        <p>Addressing the fact that our F1-score is a very
unforgiving measure, since only exact matches are
counted as correct scene boundaries, we performed
some additional evaluation on the predictions by
the different systems.</p>
        <p>As a first step, we noticed that some of the
systems had a tendency to predict multiple short
scenes in the vicinity of a hand-annotated scene
change. Therefore, we conducted an additional evaluation where we merged scenes that were less than
5 sentences long to the preceding or following
scene, if this led to a correctly predicted scene (e.g., if
the beginning of the short scene was a gold scene
boundary and the end of the next scene was a gold
scene boundary, the two scenes were merged). This
improved some of the scores by up to 3 percentage
points in F1-score. Note that this is not a “valid”
evaluation, since the decision whether to merge to
the preceding or the following scene is made based
on the gold standard. However, it does show that
correct handling of short scenes would have some
positive influence on the results.</p>
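        <p>The merging step can be sketched as follows (our own simplified version that only merges a short segment with its following neighbour; as noted above, it consults the gold standard and is therefore diagnostic only, not a fair evaluation):</p>

```python
def merge_short_segments(pred, gold, min_len=5):
    """pred, gold: lists of (start, end) sentence spans in text order.
    A predicted segment shorter than min_len sentences is merged with
    its following segment if the merged span matches a gold segment."""
    gold_spans = set(gold)
    merged = []
    i = 0
    while i < len(pred):
        start, end = pred[i]
        if (end - start < min_len and i + 1 < len(pred)
                and (start, pred[i + 1][1]) in gold_spans):
            merged.append((start, pred[i + 1][1]))  # absorb spurious border
            i += 2
        else:
            merged.append((start, end))
            i += 1
    return merged

gold = [(0, 10), (10, 20)]
pred = [(0, 3), (3, 10), (10, 20)]  # a spurious boundary after sentence 3
print(merge_short_segments(pred, gold))  # [(0, 10), (10, 20)]
```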
        <p>Additionally, we analysed whether we could
determine especially “important” scene boundaries
more reliably. To this end, the existing annotations
of Track 1 were re-edited: Annotators were asked
to identify strong and weak boundaries between
the previously annotated scenes, depending on how
they judged the importance of each boundary. A
strong boundary is one that must be set in any
annotation, while a weak boundary is one that may
be omitted based on the desired level of granularity.
Note that we did not collect any additional scene
annotations, but only categorised the existing ones
further. We did not see significant changes in the
performance when considering only strong
boundaries. In particular, the recall was not consistently
higher than for all scene boundaries.
</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Manual Error Analysis</title>
      <p>
        In this section, we provide a deeper analysis of the
prediction errors that the best-performing system
on Track 1
        <xref ref-type="bibr" rid="ref1 ref10 ref17 ref32">(Kurfali and Wirén, 2021)</xref>
        makes. To
this end, we manually analyse the predictions and
potential error sources on two texts from Track 1:
• Hochzeit wider Willen (Wedding against Will, the text with the best γ score)
• Bomben für Dortmund (Bombs for Dortmund, the text with the second-worst γ score; we decided not to use the text Die Begegnung, which has the worst γ score, since it was a very hard text even for the annotators)
      </p>
      <p>Table 2 compares the manual to the automatic
annotations for these texts. The analysis reveals that
the following factors have a particular influence on
the predictions: (a) the length of the detected scenes, or the granularity of scene detection in general, (b) explicit markers of time and space changes, (c) changes in the character constellation (entrance and exit of characters, especially protagonists), (d) the naming and description of newly introduced characters (full name plus verb sequence), and (e) the end of dialogue passages. We provide a brief overview of these problematic factors here and refer to Appendix C for a detailed analysis with specific examples.</p>
      <sec id="sec-6-1">
        <title>Analysis of Markers</title>
        <p>First, we investigate how the markers used in our
definition of scenes influence the system’s
decisions regarding scene borders.</p>
        <p>Time Markers The system clearly seems to have
identified time markers as an important signal for
scene changes. Many false positives (scene borders
annotated by the system, but not by the human
annotators) start with temporal markers, especially
the word “as”. Overall, the system appears to have
overgeneralised the impact of temporal markers,
seeing every mention of time in the text as a strong
signal for a scene change.</p>
        <p>Location Markers A similar issue arises with
the presence of location markers: the system is
very sensitive to changes in action space, often
producing false positives at the mention of locations.
According to our annotation guidelines, only
significant location changes induce a scene border while,
for example, moving through rooms in a house is
not necessarily cause enough for a scene change.
</p>
        <p>Changes in Character Constellation Another marker that our scene definition takes into account is the character constellation. We find that the model is capable of identifying the introduction of a new character, often accompanied by the character’s full name as well as a short description, as a marker for a new scene. However, once again the system seems to struggle with judging the importance of character constellation changes, showing a tendency to start a new scene for every introduction. Dialogue Passages Dialogue passages are not explicitly part of our scene definition; however, it is reasonable to assume that they can be valuable markers for scenes: for one, dialogues appear almost exclusively in scenes and rarely in non-scenes. Additionally, a new scene usually does not start in the middle of a dialogue passage. The model seems to have picked up on this fact, since it has a tendency to predict scene changes at the end of dialogue passages. While this can be a valid marker, it again leads to false positives in the system’s output.</p>
      </sec>
      <sec id="sec-6-2">
        <title>General Issues of the Model’s Output</title>
        <p>Here, we attempt to extract a generalisation of the
specific issues described before. They can be
grouped into two major categories: issues with scene
length and issues with the granularity of markers.
Scene Length One of the most general problems
was that the system predicts very short scenes in
succession, often caused by the occurrence of
multiple markers within a few sentences. In our manual
annotations, very short passages are usually not
considered as separate scenes, but rather as part of
the preceding or following scene. The system does
not appear to have learned this and therefore often
predicts multiple very short scenes in succession.
Granularity of Markers An issue that was noticeable for all of the markers discussed above is the system’s apparent inability to infer the importance of a scene change marker. Many false positive predictions are caused by small changes in time, place or character constellation that were not considered significant enough for a scene change by the annotators. In some cases, the model’s decision to predict a scene change is perfectly reasonable and can be seen as a more fine-grained scene segmentation than the one agreed on in our annotations (cf. Section 6). In other cases, however, the oversensitivity of the system is clearer, as for example with the temporal marker “as” (see above).</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>Discussion</title>
      <p>In this section, we briefly discuss the results of the
shared task along with possible next steps towards
the improvement of automatic scene segmentation.</p>
      <p>
        The winning systems for both tracks
        <xref ref-type="bibr" rid="ref1 ref10 ref17 ref32 ref9">(Kurfali
and Wirén, 2021; Gombert, 2021)</xref>
        are based on
BERT variants, showing that, as for many other
NLP-tasks, pre-trained Transformer models are
very valuable for scene segmentation. However, the
results also reinforce our belief that scene
segmentation cannot be solved by BERT alone, but
requires a deeper understanding of the text. Some of
the submissions of the shared task explore
alternative ways of approaching scene segmentation,
either adapting methods from co-reference
resolution
        <xref ref-type="bibr" rid="ref1 ref10 ref17 ref32">(Hatzel and Biemann, 2021)</xref>
        , handcrafting
features that are assumed to be helpful for scene
segmentation
        <xref ref-type="bibr" rid="ref1 ref10 ref17 ref32">(Barth and Dönicke, 2021)</xref>
        , or using
differences in the text over time to derive scene
change candidates
        <xref ref-type="bibr" rid="ref32">(Schneider et al., 2021)</xref>
        .
      </p>
      <p>The two most consistent sources of errors in
the most successful model are the granularity of
scene change markers and the length of scenes.
Both of these problems should be – at least in part
– addressable by introducing additional constraints
or signals to the model. For the scene length, it
seems promising to make the model aware of the
length of the current scene, which could prevent it
from predicting many short scenes, or to use global
information about the scene boundaries. A possible
approach to this has been used by Pethe et al. (2020)
for the related task of chapter segmentation and
was also applied in this shared task by Hatzel and
Biemann (2021) with some success.</p>
      <p>
        For the problem of granularity, the model could
be given access to explicit information regarding
the scale of the markers. For example, information
from knowledge graphs about the scale of
temporal markers or location changes could be useful
(e.g., a minute is much less relevant than a month;
a different room is much less relevant than a
different country). Character changes appear to be more
challenging in this regard, since the model needs to
be able to judge the importance of a character for
the current scene. This might be achieved by
applying co-reference resolution to the texts and building
a local character network, representing how many
interactions each character has with others in the
neighbouring text, how often they are mentioned,
etc. Although a somewhat boring solution, using
more training data might also enable the model to
learn the granularity of markers, at least for
location and temporal markers. A possible step in this
direction is to use the related task of chapter
segmentation
        <xref ref-type="bibr" rid="ref26">(Pethe et al., 2020)</xref>
        , for which a large
amount of weakly labelled training data is available,
for pre-training and then fine-tuning the resulting
model for scene segmentation. While chapters and
scenes are different in principle (cf. Section 3), they
may be similar enough to make this pre-training
step promising. On the other hand, it might be
interesting to explore the scene segmentations provided
by the model further. Our annotations represent our
understanding of a scene, however other
applications may require a more fine- or coarse-grained
definition. To this end, it seems promising to
optimise a model for recall (i.e., detect as many
annotated scene borders as possible) in a first step and
then filter these candidates for the desired level of
granularity in a second step.
      </p>
      <p>One of the most surprising results of the shared
task is the fact that most models perform better
on the out-of-domain high-brow literature than the
in-domain dime novels. This is in stark contrast
to our previous intuition, for two reasons: First,
the training data consists of dime novels, which
should lead to a model that is better suited to this
type of text. Secondly, from a literary perspective,
we expected the high-brow literature to be more
challenging to understand and therefore the scene
segmentation to be more difficult. However, the
more implicit style of writing in high-brow literature
may actually be helpful for the models here. While
dime novels often present explicit references to
characters, locations or the passing of time, high-brow
literature may use these references much more
sparsely, making them more reliable markers of scene
changes. Although the number of data points is too
low to make a reliable statement, the higher
precision of predictions from Gombert (2021) on the
high-brow texts compared to the dime novels (cf.
Table 8) might point in a similar direction.</p>
      <p>Finally, we also see that the choice of evaluation
measure is important, as F1-score and γ lead to
different rankings in both tracks. For this shared
task, we have decided to use the exact F1-score
as the main measure; however, this decision is not
final. As discussed above, measures that
take into account the proximity of predicted to gold
standard scenes, like γ, are equally valid, albeit
more difficult to interpret. Schneider et al. (2021)
propose a third potentially useful measure,
intersection over union. While this measure would have
to be adapted to handle both scenes and
non-scenes, it is also a promising direction.</p>
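<p>To make the difference between the measures concrete, the exact F1-score and a simple segment-level intersection over union can be sketched as follows. This is our own minimal illustration with assumed input conventions (borders are sentence indices where a new segment starts, segments are half-open index spans), not the official evaluation script:</p>
<p>
```python
def exact_f1(gold_borders, pred_borders):
    """F1 where a predicted border only counts if it matches exactly."""
    gold, pred = set(gold_borders), set(pred_borders)
    if not gold or not pred:
        return 0.0
    tp = len(gold.intersection(pred))
    if tp == 0:
        return 0.0
    precision, recall = tp / len(pred), tp / len(gold)
    return 2 * precision * recall / (precision + recall)

def segment_iou(gold_segs, pred_segs):
    """Mean best-match IoU of predicted (start, end) spans against gold."""
    def iou(a, b):
        inter = max(0, min(a[1], b[1]) - max(a[0], b[0]))
        union = (a[1] - a[0]) + (b[1] - b[0]) - inter
        return inter / union if union else 0.0
    scores = [max(iou(p, g) for g in gold_segs) for p in pred_segs]
    return sum(scores) / len(scores) if scores else 0.0
```
</p>
<p>Unlike exact F1, the IoU variant rewards a border that misses the gold position by a single sentence almost as much as an exact hit, which illustrates why different measures can produce different rankings.</p>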
    </sec>
    <sec id="sec-8">
      <title>Conclusion</title>
      <p>In this paper, we have summarised the results of
the Shared Task on Scene Segmentation, where
the objective was to develop a method for
automatic scene segmentation in literary narrative texts.
To this end, we provided a training set of 20
dime novels and evaluated the submitted systems on
two tracks, one with in-domain data and one with
out-of-domain data in the form of high-brow
literature. Overall, our shared task has received five
submissions with very different approaches to
scene segmentation. While none of these systems
was capable of solving the task completely, the
best-performing systems for each track yielded
promising results, with F1-scores of 37 % on Track
1 and 26 % on Track 2, respectively. These results
show that scene segmentation remains challenging,
but also that it is not an impossible task. In manual
analysis, we discovered that the models are capable
of picking up many important markers for scene
boundaries, but sometimes still struggle to draw the
correct conclusions from these markers.</p>
    </sec>
    <sec id="sec-9">
      <title>Acknowledgements</title>
      <p>We would like to thank all participants for their
submissions. We are especially happy about the
wide range of completely different and orthogonal
approaches, opening great possibilities for future
work on this challenging task!</p>
      <p>Dataset Information</p>
      <p>[Table: Title, Author, Number of Segments, Percentage of Scenes, Number of Sentences, Number of Tokens and Avg. Scene Length (Tokens) for the 20 dime novels of the training set: Bezaubernde neue Mutti; Widerstand zwecklos; Tausend Pferde; Der Turm der 1000 Schrecken; Die Widows Connection; Ein sündiges Erbe; Immer wenn der Sturm kommt; Wechselhaft wie der April; Lass Blumen sprechen; Prophet der Apokalypse; Verschmäht; Der Sohn des Kometen; Ein Weihnachtslied für Dr. Bergen; Die hochmütigen Fellmann-Kinder; Als der Meister starb; Hetzjagd durch die Zeit; Wir schaffen es - auch ohne Mann; Griseldis; Deus Ex Machina; Die Abrechnung; and for the test texts Im Bann der Vampire, Bad Earth, Hochzeit wider Willen and Bomben für Dortmund.]</p>
      <p>[Tables: Per-text evaluation results (precision, recall and F1 with micro, macro and weighted averages) for (a) Im Bann der Vampire, (b) Die Begegnung, (c) Hochzeit wider Willen, (d) Bomben für Dortmund, (e) Aus guter Familie and (f) Effi Briest.]</p>
    </sec>
    <sec id="sec-10">
      <title>Detailed Manual Analysis</title>
      <sec id="sec-10-1">
        <title>C.1 Time markers</title>
        <p>
          As a starting point for the error analysis, the
actual markers for scene changes, as known from the
guidelines
          <xref ref-type="bibr" rid="ref8">(Gius et al., 2021)</xref>
          , were considered
separately: changes of narrated time, place, action
and character constellation. In doing so, an
overgeneralisation of time markers was detected in the
output of the winning system. It is noticeable that
many annotated scenes start with formulations like
“as”, “it was five over”, “at this moment”, “three
minutes elapsed”, indicating changes in the time
of the narrative. A particularly large number of falsely
annotated scene changes (false positives) begin with the
temporal indicator “as”. The following passage shows
an example of a wrong scene change indication
triggered by the temporal conjunction “as”,
marking a change in the narrated time. According to
the gold standard, this passage does not include a
scene change.
        </p>
        <p>’If there really was something to the call,
the colleagues in the radio patrol car
might still be able to catch the man who
had buzzed me out of my sleep. I got up
and went to take a shower. A cold one
would have been best now. But I wasn’t
brave enough to do that yet. It was five
over. Fat Peter Steiner, the owner of the
bar Steinkrug, had by all appearances put
not only rat poison but also a strong
sleeping pill in the grain. As I got dressed
and was about to leave the apartment, the
phone rang again. ’Mattek?’ ’Speaking.’
’Did you get the message through to the
alarm center?’ ’Yes.” (German original
text in Figure 1)</p>
        <p>In this example, the short reflective passage
containing the first-person narrator’s thoughts about
the night before interrupts the narrated action,
which is resumed with the words “As I got dressed
and was about to leave [...]”. The temporal
conjunction “as” could have caused the system to indicate
a scene change, whereas according to the gold
standard there is no scene change. This indication of a
scene change may have resulted from
overgeneralisation of the system. The use of temporal markers
as indicators of probable scene changes is often
successful, but risks an over-sensitive system.</p>
        <p>However, not only temporal conjunctions seem
to trigger the system to indicate scene changes, but
also multi-word expressions containing information
on the narrated time, as can be seen in the
following example, again from Bomben für Dortmund,
in which a new scene was indicated at a position
differing from the gold standard annotation. This
example raises the question of granularity, which will
be taken up in subsection C.6.
        <p>’There was no need to hurry. Regarding
the station, we had everything under
control. Sure, there were loopholes to
escape, but someone who didn’t even suspect
being expected had no reason to look for
them and use them. I simply assumed
that Jutta Speißer didn’t have the
faintest idea that we knew practically
everything about her. Two, maybe three
minutes passed. ’Can you hear me,
Hermann?’ I had hidden the walkie-talkie
under my leather jacket so that I could
talk into it if I lowered my head a little.
’Yes.” (German original text in Figure 2)</p>
        <p>Nevertheless, there are also many passages
containing temporal markers that the system correctly
indicated as new scenes. Another example from
Bomben für Dortmund shows how it detected the
scene change without requiring the temporal
marker to be at the beginning of the sentence.
’Maybe,’ I said, ’[...]. One devilish lady,
one big bastard, and the third bomb we
know about. We’re going back to the
station. Lampert and Blechmann will report
there.’ The feeling of being watched
faded when she left the train in Brackel
and the long, skinny man who had caught
her attention on the train was no longer
behind her. But she had quickly calmed
down.’ (German original text in Figure 3)</p>
<p>This correct detection of the scene
change could, however, also be related to the
simultaneous change of the action space, which
will be discussed in more detail in the next
section.</p>
      </sec>
      <sec id="sec-10-2">
        <title>C.2 Change of action space</title>
        <p>Another possible overgeneralisation of the system
could be its hypersensitivity to descriptions of the
action space, since many scene changes
annotated by the system happen to be accompanied by
references to changes in the action space at the
beginning of a new scene.</p>
        <p>The following passage is an example from
Bomben für Dortmund of a correctly annotated scene
change followed by an indication of a change in
the action space.</p>
        <p>’I nodded to DAB. ’Give me your
walkie-talkie, DAB. Get one from another
officer.’ He didn’t expect anything from
it, it was clear from his face, but he
gave me his walkie-talkie. I disappeared
from track one and walked through
the underpass to the stairs leading up
to track three. There was no sign of Tin
Man. Nor was there any sign of the
person he had described. That meant they
must already be upstairs. I stopped in the
middle of the stairs, lit a fresh
cigarette and waited.’ (German original text in
Figure 4)</p>
<p>In addition to the many true positive scene
changes that the system recognises, as in the previous
sample passage, there are also many false positives
that can be interpreted as the result of the system’s
overgeneralisation. One example can be found in
the following sample passage, in which the main
characters do not move but an action outside of the
scene setting is described that probably triggered
the annotation of a wrong scene change within that
scene. There is no scene change according to the
gold standard.</p>
        <p>”You’d make a great cop chick,’ I said,
’I used to be in the Scouts.’ Outside,
in the small reception hall, someone
pounded on the bell as I did. ’I don’t have
time now, have to take care of the guests
and sell rooms, or I’ll be out of a job. At
nine?’ ’You bet!” (German original text
in Figure 5)</p>
        <p>
Since the system generally tends towards
fine-grained scene segmentation, it is not surprising that
it often annotates too many scene changes in
addition to some actual scene changes. The following
passage shows an example of fine-grained scene
annotation by the system. In the passage, the main
characters move from the hotel reception to the
kitchen in the next room. For the manual annotation
process, this change of action space is a
prototypical example of the application of the container
principle defined in the annotation guidelines
          <xref ref-type="bibr" rid="ref8">(Gius
et al., 2021, 4)</xref>
          . This principle is used to summarise
short scenes without clear scene change indicators,
e.g., when the characters remain the same and the
change from one action space to another is
described, while the settings are close to each other and
often in the same building, as is the case in this
sample passage. Nonetheless, this scene change
could be reasonable if the goal were a more
fine-grained scene annotation. These considerations
have inspired us to look more closely at the
distinction between weak and strong boundaries, which
we analyse in subsection 4.5.
        </p>
        <p>”Free choice. You won first prize with
me.’ ’What would the second have
been?’ ’A washing machine.’ ’I’d rather
have the first, to be honest. I finish at
eleven.’ ’Then the choice of fine venues is
very limited.’ ’Pull strings,’ she said. I
followed her into the small, white-tiled
kitchen, where breakfast was also made
for the guests. The sight of her made me
look forward to the evening.’ (German
original text in Figure 7)</p>
        <p>Another recurring phenomenon that often
triggers a change of scene is a character entering or
exiting a scene. As in the following example from
Bomben für Dortmund, which contains a collection of
typical verbal phrases, the exiting and re-entering
of a character is introduced by indications of a
character’s movement from one location to another
via the phrases ’to leave’, ’to go back to’, ’to turn
into’, and ’to disappear into’. However, according to
the gold standard, the scene change should be
placed before the beginning of the sentence ’It was
still raining cats and dogs’ to mark the beginning of
the new scene outside the restaurant. Probably due
to oversensitivity, the winning system annotated
two scene changes instead of only one as in the
gold standard, also missing the actual position of
the scene change.</p>
        <p>’She flus hed the toilet as a cover, left the
cabin and washed her hands. Then, in
front of the large, clean mirror, she fixed
her frayed hair, which had nothing to be
fixed. Then she went back to the
restaurant. She drank the rest of the ouzo left
in the glass, smiled at Dimitri, the owner,
secretly wished him all known and
unknown venereal diseases, preferably all
at once, left an appropriate tip and left
the restaurant. It was still raining cats and
dogs. In the reflection of some lanterns,
the rain looked like many cords next to
each other, which did not tear and did
not come to an end. It was just before
seven when she turned into Karl Marx
Street, crossed it and disappeared into
Rubel Street. At the end of the street stood the
green Sierra.’ (German original text in
Figure 6)</p>
        <p>As has become clear in this subsection,
characters and their entrances and exits play a significant
role in automatic annotation as markers of a likely
scene change. In the following subsection, we will
discuss another phenomenon related to characters
that often coincides with scene change annotations,
namely changes in character constellation.</p>
      </sec>
      <sec id="sec-10-3">
        <title>C.3 Change in Character Constellation</title>
        <p>Another marker that frequently occurs at the
beginning of automatically detected scenes is the
introduction of a new character with the respective full
name as well as the accompanying description of
the character, its state or an action (presented as
a combination of full name plus verb sequence).
It can be concluded that the system has learned
that this combination occurs frequently at scene
beginnings. However, the following two examples
(German original texts in Figures 8 and 9)
show that this is not always the case.
<p>The first passage, from Bomben für Dortmund,
is an example of the winning system correctly
detecting a new scene that begins with the
introduction of a new character.</p>
        <p>’At nine I had an appointment with
Marlies. Lohmeyer couldn’t ruin it for me.
Jutta Speißer ate stifado and drank
Cypriot Aphrodite wine. For Dimitri, the
owner of the Greek restaurant
’Akropolis’ on Karl-Zahn-Straße, she was a new,
welcome guest.’ (German original text in
Figure 8)</p>
<p>The second passage is an example of a scene
annotation differing from the gold standard; it
contains the introduction and description of three new
characters.</p>
        <p>’Baldwein started the green Sierra. He
slowly steered the vehicle past the post
office and drove in the direction of Hoher
Wall. Although Police Sergeant Werner
Okker had not been drinking last night
because of the duties he now had to fulfil
towards his fiancée as her officially betrothed,
he looked bad. He
was sitting at the counter of the
Steinkrug. His angular, broad shoulders
slumped forward in a tired manner. He
seemed to be visibly struggling to lift his
beer glass. Susanne Steiner stood behind
the bar. Large, coarse-boned, Nordic. A
girl who had grown up in the pub
milieu. She had long, brunette hair and a
decidedly beautiful face with full,
sensual lips. Peter Steiner, her father, who
was standing next to her at the tap, was
not at all like her. He was around sixty.
A former tusker.’ (German original text
in Figure 9).</p>
        <p>According to the gold standard, there is only one
scene change in the text before ’Although Police
Sergeant Werner Okker [...]’, which was also
detected by the automatic system. In addition to this,
however, another scene change was indicated at
the introduction of the new character Peter Steiner.
It is noticeable that the constructions around the
introduction of the character Susanne Steiner and
the character Peter Steiner are similar in structure,
but the sentence introducing Susanne Steiner was
not recognised as the beginning of a new scene.</p>
        <p>Another example of an incorrectly marked new
scene which coincides with the introduction of a
new character can be found in Hochzeit wider
Willen. According to the gold standard, there is no
scene change in the following passage.</p>
        <p>’It was a warm morning at the beginning
of August, the sun was shining golden
in the breakfast room of the town palace.
Here the Hohenstein family had gathered
for the first meal of the day. Fürst
Heinrich, head of the family and chairman of
the Hohenstein Bank, a traditional house
in the Frankfurt financial center, was
talking lively with his elder son Bernhard.’
(German original text in Figure 10)</p>
<p>Since similar constructions occur in the text
Hochzeit wider Willen and can be found at the
beginning of scenes detected by the system (as in
Figure 10), this is not a singular phenomenon
that occurs specifically in
the text Bomben für Dortmund.
It is also noticeable that the end of an automatically
detected scene is often accompanied by the end of
a dialogue passage, which is then followed by a
descriptive passage that represents the beginning
of a new scene.</p>
        <p>The following example from Hochzeit wider
Willen shows a passage that the winning system
segmented into four different scenes, indicating a
scene change after every ending of a dialogue passage
followed by a descriptive passage without any
dialogues. However, according to the gold standard,
there is only one scene change in the passage
before ’Prince Frederik appeared in his office a little
later than usual that morning’.
        <p>’Frederik gazed pensively into his coffee.
’Well, someday I’ll get myself a lovely
wife and a few offspring, but I still
have a bit of a reprieve. Let’s say ten to
fifteen years . . . ’ ’You’ve got a lot of
nerve.’ The princess laughed and stood
up. ’You don’t really believe that.’ ’Oh
yes I do,’ he murmured and smiled
narrowly. ’I know that.’ Prince Frederik
appeared in his office a little later than
usual that morning. Carina Böttiger, his
secretary, was used to this and also knew
what state her boss was in on such days.
The petite blonde with the sky-blue eyes
had strong coffee and aspirin ready. She
brought both together with the signature
folder. [...]. ’You look lovely today,’
Frederik noted, glancing at her dress. He
eyed her rather thoughtfully for a
moment, and she pretended not to notice,
just thanking him artfully for the
compliment and asking if there was anything
else she could do for him. ’No, that was
all for the moment.’ He gave her back
the signature folder. ’When Herr von
Solm comes, send him right through.
I have something else to discuss with
him.’ He noticed her slightly smug
look, so he clarified: ’Something
business-related.’ Carina laughed slightly
and left the executive room. The fact
that Frederik had noticed her new dress
made her happy. Until now, she had
always believed that he hardly had an eye
for her. But she didn’t want to get any
ideas about that either. After all, it
seemed clear that this man was out of her
reach. And she was really too good for
a brief fling with the ladies’ man.’
(German original text in Figure 12).</p>
        <p>One possible interpretation of this regular
annotation of a scene change as a separation of dialogue
and descriptive passages could be that the system
recognises these passages as different writing
styles, leaving the actual reasons for scene changes
unconsidered.</p>
<p>C.5
However, the most common error, which was also
the easiest to spot, was the output of scenes
that are only one to three sentences long, as in the
following example from Hochzeit wider Willen:
’One could see from the mother’s face
that this was not necessarily the case. But
Hedwig sensed that she would not
receive any more information from
Carina. ’My little princess ...’ That was what
she had called Carina as a child. None
of them could have imagined that she
would ever become a real princess. And
if the young woman was honest, she still
couldn’t quite believe it now. A little
later, the bride and groom left for the
airport. Ewald Böttiger asked his
wife: ’What did you have to talk about for
so long? Everyone was waiting for you’.
’I’m not sure if Carina married the right
guy[...].’ (German original text in
Figure 11)</p>
        <p>In the manual scene annotation following the
guidelines by Gius et al. (2021), the decision was
made to append very short scenic passages to the
appropriate preceding or following scene in the
sense of the container principle. In the given example,
however, there is no scene change at all, because it
is only a description of the exit of the characters,
which takes place within the scene at the bride’s
parents’ house.</p>
</sec>
      <sec id="sec-10-5">
        <title>C.6 Granularity of Scenes</title>
        <p>With respect to the length of the individual
passages that should be detected as scenes, there is also
the question of how granular the segmentation into
scenes should be without becoming too small-scale.
The following passage from Hochzeit wider Willen
is an example of a small-scale, granular scene
segmentation choice by the winning system, in which
three scenes were indicated while there are only
two according to the gold standard.</p>
        <p>’Prince Frederik was quite pleased with
himself. Carina had swallowed his
excuse whole. She had thus given him a free
pass, so to speak, to finally go back to
living the way he liked. And he was
determined to do so immediately ... The very
next evening, Frederik called his wife to
let her know that it was getting late. He
was supposedly waiting for the
conclusion of a lucrative business deal. Carina
did not suspect anything - yet. When she
asked him the next morning when he had
come home, he did not tell her the truth.’
(German original text in Figure 13)
’Prince Frederik was quite pleased with
himself. Carina had swallowed his
excuse whole. She had thus given him a free
pass, so to speak, to finally go back to
living the way he liked. And he was
determined to do so immediately ... The very
next evening, Frederik called his wife to
let her know that it was getting late. He
was supposedly waiting for the
conclusion of a lucrative business deal. Carina
did not suspect anything - yet. When she
asked him the next morning when he had
come home, he did not tell her the truth.’
(German original text in Figure 14)</p>
<p>In this text passage, the automatic system’s
choice to recognise another scene is not
implausible. On the contrary, the system’s decision
can be justified, but such small-scale granularity of
scene annotation should be avoided in view of the
overall goal of the segmentation task, in which a
text is to be segmented into units of meaning in
terms of content, which should exceed a minimum
token length for their further use. Here, the system
was more fine-grained than the gold standard.</p>
<p>C.7 German original text of the sample
passages</p>
<p>
’Falls an dem Anruf wirklich etwas dran war,
konnten die Kollegen im Funkstreifenwagen den Mann
vielleicht noch stellen, der mich aus dem Schlaf
gebimmelt hatte. Ich stand auf und ging unter die
Dusche. Eine kalte wa¨re jetzt am besten gewesen.
Aber dazu war ich noch nicht mutig genug. Es war
fu¨nf vorbei. Der fette Peter Steiner, der Wirt vom
Steinkrug, hatte allem Anschein nach nicht nur
Rattengift, sondern auch ein starkes Schlafmittel in
den Korn gepanscht. Als ich mich angezogen hatte
und die Wohnung verlassen wollte, la¨utete das
Telefon erneut. ’Mattek?’ ’Am Apparat’. ’Haben Sie
die Meldung an die Alarmzentrale durchgegeben?’
’Ja.”
Figure 1: Example from Bomben fu¨r Dortmund of a
wrong scene change indication triggered by the
temporal conjunction ’als’ marking a change in the narrated
time.
’Es bestand kein Grund zur Eile. Was den
Bahnhof anging, so hatten wir alles unter Kontrolle.
Sicher gab es Schlupfl o¨cher zum Entkommen, aber
jemand, der nicht einmal ahnte, dass er erwartet
wurde, hatte auch keinen Grund, danach zu suchen
und sie zu benutzen. Ich ging einfach davon aus,
dass Jutta Speißer nicht den blassesten Schimmer
davon hatte, dass wir praktisch alles u¨ber sie
wussten. Zwei, vielleicht drei Minuten verstrichen.
’Kannst du mich ho¨ren, Hermann?’ Ich hatte das
Walkie-talkie so unter der Lederjacke verborgen,
dass ich hineinsprechen konnte, wenn ich den Kopf
etwas senkte. ’Ja.”
”Viel leicht’, sagte ich. ’[...]. Eine teuflische Lady,
einen großen Schweinehund und die dritte Bombe,
von der wir wissen. Wir fahren ins Revier zuru¨ck.
Lampert und Blechmann werden sich dort
melden.’ Das Gefu¨hl, beobachtet zu werden, schwand,
als sie in Brackel den Zug verließ und der lange,
du¨rre Mann nicht mehr hinter ihr war, auf den sie
im Zug aufmerksam geworden war. Aber sie hatte
sich schnell wieder beruhigt.’
’Ich nickte DAB zu. ’Gib mir dein Walkie-talkie,
DAB. Hol dir eins von einem anderen Beamten.’ Er
versprach sich nichts davon, das war ihm deutlich
anzusehen, aber er gab mir sein Walkie-talkie. Ich
verschwand von Gleis eins und lief durch die
Unterführung bis zur Treppe, die nach Gleis
drei hinaufführte. Von Blechmann war nichts zu
sehen. Von der Person, die er beschrieben hatte,
ebenfalls nicht. Das hieß, sie mussten schon oben
sein. Ich blieb mitten auf der Treppe stehen,
zu¨ndete mir frische Zigarette an und wartete.
”Du wa¨rst eine prima Polizistenbraut’, sagte ich.
’Ich war mal bei den Pfadfindern.’ Draußen, in
der kleinen Empfangshalle, ha¨mmerte jemand
auf die Glocke, wie ich es getan hatte. ’Ich
habe jetzt keine Zeit mehr, muss mich um die Ga¨ste
und Zimmer verkaufen, sonst bin ich meinen Job
los. Um neun?’ ’Worauf du dich verlassen kannst!”
’Sie spu¨lte zur Tarnung, verließ die Kabine und
wusch sich die Ha¨nde. Anschließend richtete sie
sich vor dem großen, sauberen Spiegel die
ausgefransten Haare, an denen es nichts zu richten gab.
Dann ging sie ins Restaurant. Sie trank den Rest
Ouzo, der sich noch im Glas befand, la¨chelte
Dimitri, den Besitzer, an, wu¨nschte ihm insgeheim
alle bekannten und unbekannten
Geschlechtskrankheiten, am liebsten auf einmal, ein angemessenes
Trinkgeld liegen und verließ das Restaurant. Es
regnete noch immer in Stro¨men. Im Widerschein
einiger Laternen sah der Regen aus wie viele sich
nebeneinanderbefindliche Bindfa¨den, die nicht
rissen und kein Ende nahmen. Es war kurz vor sieben,
als sie in die Karl-Marx-Straße einbog, sie
kreuzte und in der Rubelstraße verschwand. Ausgangs
stand der Gru¨ne Sierra.’
’Freie Auswahl. Du hast mit mir den ersten Preis
gewonnen.’ ’Was wa¨re der zweite gewesen?’
’Eine Waschmaschine.’ ’Der erste ist mir, ehrlich
gesagt, lieber. Ich mache um elf Schluss.’ ’Dann ist
die Auswahl der feinen Lokalita¨ten sehr begrenzt.’
’Lass deine Beziehungen spielen’, sagte sie. Ich
folgte ihr in die kleine, weiß gekachelte Ku¨che, in
der auch das Fru¨hstu¨ck die Ga¨ste gemacht wurde.
Ihr Anblick ließ mich auf den Abend hoffen.’
’Um neun hatte ich eine Verabredung mit
Marlies. Die konnte Lohmeyer mir nicht kaputtmachen.
Jutta Speißer aß Stifado und trank zypriotischen
Aphrodite-Wein. Fu¨r Dimitri, den Besitzer des
griechischen Restaurants ’Akropolis’ in der
Karl-ZahnStraße, war sie ein neuer, willkommener Gast.’
’Baldwein startete den gru¨nen Sierra. Langsam
lenkte er das Fahrzeug am Postgiroamt vorbei und
fuhr in Richtung Hoher Wall. Obgleich
Polizeimeister Werner Okker gestern Nacht nicht
getrunken hatte, wegen der Pflich ten, die er als nun
offiziell Verlobter seiner Verlobten gegenu¨ber zu erfu¨llen
hatte, sah er schlecht aus. Er saß am Tresen vom
Steinkrug. Die eckigen, breiten Schultern waren
mu¨de nach vorn abgefallen. Es schien ihm sichtlich
Mu¨he zu bereiten, sein Bierglas zu heben.
Susanne Steiner stand hinter der Theke. Groß,
grobknochig, nordisch. Ein Ma¨dchen, das im
Kneipenmilieu groß geworden war. Sie hatte langes, bru¨nettes
Haar und ein ausgesprochen scho¨nes Gesicht mit
vollen, sinnlichen Lippen. Peter Steiner, ihr Vater,
der neben ihr am Zapfhahn stand, war ihr u¨berhaupt
nicht a¨hnlich. Er war um die Sechzig herum. Ein
ehemaliger Hauer.’
’Es war ein warmer Morgen Anfang August, die
Sonne schien golden in das Fru¨hstu¨ckszimmer des
Stadtpalais. Hier hatte sich die Fu¨rstenfamilie
Hohenstein zur ersten gemeinsamen Mahlzeit des
Tages versammelt. Fu¨rst Heinrich,
Familienoberhaupt und Vorstand der Hohenstein-Bank, eines
traditionsreichen Hauses am Frankfurter
Finanzplatz, unterhielt sich angeregt mit seinem a¨lteren
Sohn Bernhard.’
’Man sah der Mutter an, dass dies nicht
unbedingt der Fall war. Doch Hedwig spu¨rte, sie wu¨rde
von Carina keine weiteren Ausku¨nfte erhalten. [...]
’Meine kleine Prinzessin So hatte sie Carina als
Kind genannt. Keiner von ihnen ha¨tte sich wohl
vorstellen ko¨nnen, dass sie jemals eine wirkliche
Prinzessin werden wu¨rde. Und wenn die junge Frau
ehrlich war, konnte sie es jetzt noch immer nicht
so ganz fassen. Wenig spa¨ter fuhr das Brautpaar
zum Flughafen. Ewald Bo¨ttiger fragte seine Frau:
’Was hattet ihr denn noch so lange zu bereden?
Alle haben auf euch gewartet.’ ’Ich bin mir nicht
sicher, ob Carina den Richtigen geheiratet hat.”
’Frederik blickte sinnend in seinen Kaffee. ’Na ja,
irgendwann werde ich mir eben ein liebes
Frauchen und ein paar Spro¨sslinge zulegen, aber ein
bisschen Galgenfrist bleibt mir ja noch. Sagen
wir mal zehn bis fu¨nfzehn Jahre ’Du hast
Nerven.’ Die Prinzessin musste lachen und erhob sich.
’Das glaubst du doch wohl nicht im Ernst.’ ’Oh
doch’, murmelte er und la¨chelte schmal. ’Das weiß
ich.’ Prinz Frederik erschien an diesem Morgen
etwas spa¨ter als sonst in seinem Bu¨ro. Carina
Bo¨ttiger, seine Sekreta¨rin, war das gewohnt und
wusste auch, in welchem Zustand ihr Chef an solchen
Tagen war. Die zierliche Blondine mit den
himmelblauen Augen hielt starken Kaffee und Aspirin
bereit. Beides brachte sie zusammen mit der
Unterschriftenmappe. [...] ’Sie sehen heute hu¨bsch
aus’ , stellte Frederik mit einem Blick auf ihr Kleid
fest. Er musterte sie einen Moment lang ziemlich
nachdenklich, und sie tat so, als merke sie es gar
nicht, bedankte sich nur artig fu¨r das Kompliment
und fragte, ob sie sonst noch etwas fu¨r ihn tun
ko¨nne. ’Nein, das war im Moment alles.’ Er gab
ihr die Unterschriftenmappe zuru¨ck. ’Wenn Herr
von Solm kommt, schicken Sie ihn gleich durch.
Ich habe noch etwas mit ihm zu besprechen.’ Er
bemerkte ihren leicht süffisanten Blick und
stellte deshalb klar: ’Etwas Gescha¨ftliches.’ Carina
la¨chelte leicht und verließ das Chefzimmer. Dass
Frederik ihr neues Kleid bemerkt hatte, machte sie
glu¨cklich. Bislang hatte sie immer geglaubt, dass
er kaum einen Blick fu¨r sie hatte. Doch sie
wollte sich darauf auch nichts einbilden. Schließ lich
schien es klar, dass dieser Mann außer halb ihrer
Reichweite war. Und fu¨r eine kurze Affa¨re mit dem
Frauenliebling war sie sich wirklich zu schade.’
’Prinz Frederik war ganz zufrieden mit sich selbst.
Carina hatte seine Ausrede glatt geschluckt. Damit
hatte sie ihm sozusagen selbst den Freifahrtschein
ausgestellt, um endlich wieder so zu leben, wie es
ihm gefiel.Und er war fest entschlossen, dies auch
umgehend zu tun Bereits am na¨chsten Abend
meldete Frederik sich telefonisch bei seiner Frau
und ließ sie wissen, dass es spa¨t wurde.
Angeblich wartete er auf den Abschluss eines lukrativen
Gescha¨fts. Carina scho¨pfte noch keinen Verdacht.</p>
      </sec>
      <sec id="sec-10-8">
        <title>Als sie ihn am na¨chsten Morgen fragte, wann er</title>
        <p>heimgekommen sei, sagte er ihr nicht die
Wahrheit.’
’Prinz Frederik war ganz zufrieden mit sich selbst.
Carina hatte seine Ausrede glatt geschluckt. Damit
hatte sie ihm sozusagen selbst den Freifahrtschein
ausgestellt, um endlich wieder so zu leben, wie es
ihm gefiel.Und er war fest entschlossen, dies auch
umgehend zu tun . . . Bereits am na¨chsten Abend
meldete Frederik sich telefonisch bei seiner Frau
und ließ sie wissen, dass es spa¨t wurde.
Angeblich wartete er auf den Abschluss eines lukrativen
Gescha¨fts. Carina scho¨pfte noch keinen Verdacht.</p>
      </sec>
      <sec id="sec-10-9">
        <title>Als sie ihn am na¨chsten Morgen fragte, wann er</title>
        <p>heimgekommen sei, sagte er ihr nicht die
Wahrheit.’</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <given-names>Florian</given-names>
            <surname>Barth</surname>
          </string-name>
          and Tillmann Dönicke.
          <year>2021</year>
          .
          <article-title>Participation in the konvens 2021 shared task on scene segmentation using temporal, spatial and entity feature vectors</article-title>
          .
          <source>In Shared Task on Scene Segmentation.</source>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <given-names>Doug</given-names>
            <surname>Beeferman</surname>
          </string-name>
          , Adam Berger, and
          <string-name>
            <given-names>John</given-names>
            <surname>Lafferty</surname>
          </string-name>
          .
          <year>1999</year>
          .
          <article-title>Statistical models for text segmentation</article-title>
          .
          <source>Machine Learning</source>
          ,
          <volume>34</volume>
          (
          <issue>1-3</issue>
          ):
          <fpage>177</fpage>
          -
          <lpage>210</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <given-names>David M.</given-names>
            <surname>Blei</surname>
          </string-name>
          , Andrew Y. Ng, and
          <string-name>
            <given-names>Michael I.</given-names>
            <surname>Jordan</surname>
          </string-name>
          .
          <year>2003</year>
          .
          <article-title>Latent Dirichlet allocation</article-title>
          .
          <source>Journal of Machine Learning Research</source>
          ,
          <volume>3</volume>
          (Jan):
          <fpage>993</fpage>
          -
          <lpage>1022</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <given-names>Lynn</given-names>
            <surname>Carlson</surname>
          </string-name>
          , Daniel Marcu, and Mary Ellen Okurowski.
          <year>2002</year>
          .
          <article-title>RST Discourse Treebank, LDC2002T07</article-title>
          .
          <source>Technical report</source>
          , Philadelphia: Linguistic Data Consortium.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <given-names>Freddy Y. Y.</given-names>
            <surname>Choi</surname>
          </string-name>
          .
          <year>2000</year>
          .
          <article-title>Advances in domain independent linear text segmentation</article-title>
          .
          <source>arXiv preprint cs/0003083.</source>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <given-names>Arman</given-names>
            <surname>Cohan</surname>
          </string-name>
          , Iz Beltagy, Daniel King, Bhavana Dalvi, and
          <string-name>
            <given-names>Dan</given-names>
            <surname>Weld</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Pretrained language models for sequential sentence classification</article-title>
          .
          <source>In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)</source>
          , pages
          <fpage>3693</fpage>
          -
          <lpage>3699</lpage>
          , Hong Kong, China. Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <given-names>Evelyn</given-names>
            <surname>Gius</surname>
          </string-name>
          , Fotis Jannidis, Markus Krug, Albin Zehe, Andreas Hotho, Frank Puppe, Jonathan Krebs, Nils Reiter, Nathalie Wiedmer, and
          <string-name>
            <given-names>Leonard</given-names>
            <surname>Konle</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Detection of scenes in fiction</article-title>
          .
          <source>In Proceedings of Digital Humanities 2019</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <given-names>Evelyn</given-names>
            <surname>Gius</surname>
          </string-name>
          , Carla Sökefeld, Lea Dümpelmann, Lucas Kaufmann, Annekea Schreiber, Svenja Guhr, Nathalie Wiedmer, and
          <string-name>
            <given-names>Fotis</given-names>
            <surname>Jannidis</surname>
          </string-name>
          .
          <year>2021</year>
          .
          <article-title>Guidelines for detection of scenes.</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <given-names>Sebastian</given-names>
            <surname>Gombert</surname>
          </string-name>
          .
          <year>2021</year>
          .
          <article-title>Twin BERT contextualized sentence embedding space learning and gradient-boosted decision tree ensembles for scene segmentation in German literature</article-title>
          .
          <source>In Shared Task on Scene Segmentation.</source>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <given-names>Hans Ole</given-names>
            <surname>Hatzel</surname>
          </string-name>
          and
          <string-name>
            <given-names>Chris</given-names>
            <surname>Biemann</surname>
          </string-name>
          .
          <year>2021</year>
          .
          <article-title>Applying coreference to literary scene segmentation</article-title>
          .
          <source>In Shared Task on Scene Segmentation.</source>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <given-names>Marti A</given-names>
            <surname>Hearst</surname>
          </string-name>
          .
          <year>1997</year>
          .
          <article-title>TextTiling: Segmenting text into multi-paragraph subtopic passages</article-title>
          .
          <source>Computational linguistics</source>
          ,
          <volume>23</volume>
          (
          <issue>1</issue>
          ):
          <fpage>33</fpage>
          -
          <lpage>64</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <given-names>Adebayo Kolawole</given-names>
            <surname>John</surname>
          </string-name>
          , Luigi Di Caro, and
          <string-name>
            <given-names>Guido</given-names>
            <surname>Boella</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Text segmentation with topic modeling and entity coherence</article-title>
          .
          <source>In International Conference on Hybrid Intelligent Systems</source>
          , pages
          <fpage>175</fpage>
          -
          <lpage>185</lpage>
          . Springer.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <given-names>David</given-names>
            <surname>Kauchak</surname>
          </string-name>
          and
          <string-name>
            <given-names>Francine</given-names>
            <surname>Chen</surname>
          </string-name>
          .
          <year>2005</year>
          .
          <article-title>Feature-based segmentation of narrative documents</article-title>
          .
          <source>In Proceedings of the ACL Workshop on Feature Engineering for Machine Learning in Natural Language Processing</source>
          , pages
          <fpage>32</fpage>
          -
          <lpage>39</lpage>
          , Ann Arbor, Michigan. Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <given-names>Anna</given-names>
            <surname>Kazantseva</surname>
          </string-name>
          and
          <string-name>
            <given-names>Stan</given-names>
            <surname>Szpakowicz</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Hierarchical topical segmentation with affinity propagation</article-title>
          .
          <source>In Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers</source>
          , pages
          <fpage>37</fpage>
          -
          <lpage>47</lpage>
          , Dublin, Ireland. Dublin City University and Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <given-names>Hideki</given-names>
            <surname>Kozima</surname>
          </string-name>
          and
          <string-name>
            <given-names>Teiji</given-names>
            <surname>Furugori</surname>
          </string-name>
          .
          <year>1993</year>
          .
          <article-title>Similarity between words computed by spreading activation on an English dictionary</article-title>
          .
          <source>In Proceedings of the European Association for Computational Linguistics.</source>
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <given-names>Hideki</given-names>
            <surname>Kozima</surname>
          </string-name>
          and
          <string-name>
            <given-names>Teiji</given-names>
            <surname>Furugori</surname>
          </string-name>
          .
          <year>1994</year>
          .
          <article-title>Segmenting narrative text into coherent scenes</article-title>
          .
          <source>Literary and Linguistic Computing</source>
          ,
          <volume>9</volume>
          (
          <issue>1</issue>
          ):
          <fpage>13</fpage>
          -
          <lpage>19</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <given-names>Murathan</given-names>
            <surname>Kurfali</surname>
          </string-name>
          and Mats Wirén.
          <year>2021</year>
          .
          <article-title>Breaking the narrative: Scene segmentation through sequential sentence classification</article-title>
          .
          <source>In Shared Task on Scene Segmentation.</source>
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <string-name>
            <given-names>Kenton</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Luheng</given-names>
            <surname>He</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Luke</given-names>
            <surname>Zettlemoyer</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Higher-order coreference resolution with coarse-to-fine inference</article-title>
          .
          <source>In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>
          , Volume
          <volume>2</volume>
          (
          <issue>Short Papers)</issue>
          , pages
          <fpage>687</fpage>
          -
          <lpage>692</lpage>
          , New Orleans, Louisiana. Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          <string-name>
            <given-names>Jiwei</given-names>
            <surname>Li</surname>
          </string-name>
          and
          <string-name>
            <given-names>Eduard</given-names>
            <surname>Hovy</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>A model of coherence based on distributed sentence representation</article-title>
          .
          <source>In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)</source>
          , pages
          <fpage>2039</fpage>
          -
          <lpage>2048</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          <string-name>
            <given-names>Jiwei</given-names>
            <surname>Li</surname>
          </string-name>
          and
          <string-name>
            <given-names>Dan</given-names>
            <surname>Jurafsky</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Neural net models for open-domain discourse coherence</article-title>
          .
          <source>arXiv preprint arXiv:1606.01545</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          <string-name>
            <given-names>Barbara J.</given-names>
            <surname>Grosz</surname>
          </string-name>
          and
          <string-name>
            <given-names>Candace L.</given-names>
            <surname>Sidner</surname>
          </string-name>
          .
          <year>1986</year>
          .
          <article-title>Attention, intentions, and the structure of discourse</article-title>
          .
          <source>Computational Linguistics</source>
          ,
          <volume>12</volume>
          (
          <issue>3</issue>
          ):
          <fpage>175</fpage>
          -
          <lpage>204</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          <string-name>
            <given-names>Michal</given-names>
            <surname>Lukasik</surname>
          </string-name>
          , Boris Dadachev, Gonçalo Simões, and
          <string-name>
            <given-names>Kishore</given-names>
            <surname>Papineni</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>Text segmentation by cross segment attention</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          <string-name>
            <given-names>Yann</given-names>
            <surname>Mathet</surname>
          </string-name>
          , Antoine Widlöcher, and
          <string-name>
            <given-names>Jean-Philippe</given-names>
            <surname>Métivier</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>The unified and holistic method gamma (γ) for inter-annotator agreement measure and alignment</article-title>
          .
          <source>Computational Linguistics</source>
          ,
          <volume>41</volume>
          (
          <issue>3</issue>
          ):
          <fpage>437</fpage>
          -
          <lpage>479</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          <string-name>
            <given-names>Hemant</given-names>
            <surname>Misra</surname>
          </string-name>
          , François Yvon, Olivier Cappé, and Joemon Jose.
          <year>2011</year>
          .
          <article-title>Text segmentation: A topic modeling perspective</article-title>
          .
          <source>Information Processing &amp; Management</source>
          ,
          <volume>47</volume>
          (
          <issue>4</issue>
          ):
          <fpage>528</fpage>
          -
          <lpage>544</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          <string-name>
            <given-names>Yihe</given-names>
            <surname>Pang</surname>
          </string-name>
          , Jie Liu,
          <string-name>
            <given-names>Jianshe</given-names>
            <surname>Zhou</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Kai</given-names>
            <surname>Zhang</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Paragraph coherence detection model based on recurrent neural networks</article-title>
          .
          <source>In International Conference on Swarm Intelligence</source>
          , pages
          <fpage>122</fpage>
          -
          <lpage>131</lpage>
          . Springer.
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          <string-name>
            <given-names>Charuta</given-names>
            <surname>Pethe</surname>
          </string-name>
          , Allen Kim, and
          <string-name>
            <given-names>Steve</given-names>
            <surname>Skiena</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>Chapter Captor: Text Segmentation in Novels</article-title>
          .
          <source>In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)</source>
          , pages
          <fpage>8373</fpage>
          -
          <lpage>8383</lpage>
          , Online. Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          <string-name>
            <given-names>Lev</given-names>
            <surname>Pevzner</surname>
          </string-name>
          and
          <string-name>
            <given-names>Marti A.</given-names>
            <surname>Hearst</surname>
          </string-name>
          .
          <year>2002</year>
          .
          <article-title>A critique and improvement of an evaluation metric for text segmentation</article-title>
          .
          <source>Computational Linguistics</source>
          ,
          <volume>28</volume>
          (
          <issue>1</issue>
          ):
          <fpage>19</fpage>
          -
          <lpage>36</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          <string-name>
            <given-names>Karl</given-names>
            <surname>Pichotta</surname>
          </string-name>
          and Raymond J. Mooney.
          <year>2016</year>
          .
          <article-title>Learning statistical scripts with LSTM recurrent neural networks</article-title>
          .
          <source>In Thirtieth AAAI Conference on Artificial Intelligence.</source>
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          <string-name>
            <given-names>Rashmi</given-names>
            <surname>Prasad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Alan</given-names>
            <surname>Lee</surname>
          </string-name>
          , Nikhil Dinesh, Eleni Miltsakaki, Geraud Campion, Aravind Joshi, and
          <string-name>
            <given-names>Bonnie</given-names>
            <surname>Webber</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <source>Penn Discourse Treebank Version 2.0 LDC2008T05</source>
          . Web download,
          <source>Linguistic Data Consortium</source>
          , Philadelphia.
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          <string-name>
            <given-names>Nils</given-names>
            <surname>Reiter</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Towards Annotating Narrative Segments</article-title>
          .
          <source>In Proceedings of the 9th SIGHUM Workshop on Language Technology for Cultural Heritage</source>
          ,
          <source>Social Sciences, and Humanities (LaTeCH)</source>
          , pages
          <fpage>34</fpage>
          -
          <lpage>38</lpage>
          , Beijing, China. Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          <string-name>
            <given-names>Martin</given-names>
            <surname>Riedl</surname>
          </string-name>
          and
          <string-name>
            <given-names>Chris</given-names>
            <surname>Biemann</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>TopicTiling: a text segmentation algorithm based on LDA</article-title>
          .
          <source>In Proceedings of ACL 2012 Student Research Workshop</source>
          , pages
          <fpage>37</fpage>
          -
          <lpage>42</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          <string-name>
            <given-names>Felix</given-names>
            <surname>Schneider</surname>
          </string-name>
          , Björn Barz, and
          <string-name>
            <given-names>Joachim</given-names>
            <surname>Denzler</surname>
          </string-name>
          .
          <year>2021</year>
          .
          <article-title>Detecting scenes in fiction using the embedding delta signal</article-title>
          .
          <source>In Shared Task on Scene Segmentation.</source>
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          <string-name>
            <given-names>Lilian Diana Awuor</given-names>
            <surname>Wanzare</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Michael</given-names>
            <surname>Roth</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Manfred</given-names>
            <surname>Pinkal</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Detecting everyday scenarios in narrative texts</article-title>
          .
          <source>In Proceedings of the Second Workshop on Storytelling</source>
          , pages
          <fpage>90</fpage>
          -
          <lpage>106</lpage>
          , Florence, Italy. Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          <string-name>
            <given-names>Jiacheng</given-names>
            <surname>Xu</surname>
          </string-name>
          , Zhe Gan, Yu Cheng, and Jingjing Liu.
          <year>2019</year>
          .
          <article-title>Discourse-aware neural extractive model for text summarization</article-title>
          .
          <source>arXiv preprint arXiv:1910.14142</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          <string-name>
            <given-names>Albin</given-names>
            <surname>Zehe</surname>
          </string-name>
          , Leonard Konle, Lea Katharina Dümpelmann, Evelyn Gius, Andreas Hotho, Fotis Jannidis, Lucas Kaufmann, Markus Krug, Frank Puppe, Nils Reiter, Annekea Schreiber, and
          <string-name>
            <given-names>Nathalie</given-names>
            <surname>Wiedmer</surname>
          </string-name>
          .
          <year>2021</year>
          .
          <article-title>Detecting scenes in fiction: A new segmentation task</article-title>
          .
          <source>In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume</source>
          , pages
          <fpage>3167</fpage>
          -
          <lpage>3177</lpage>
          , Online. Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          <string-name>
            <given-names>Xingxing</given-names>
            <surname>Zhang</surname>
          </string-name>
          , Furu Wei, and
          <string-name>
            <given-names>Ming</given-names>
            <surname>Zhou</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>HIBERT: Document level pre-training of hierarchical bidirectional transformers for document summarization</article-title>
          .
          <source>In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics</source>
          , pages
          <fpage>5059</fpage>
          -
          <lpage>5069</lpage>
          , Florence, Italy. Association for Computational Linguistics.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>