=Paper=
{{Paper
|id=Vol-3834/paper93
|storemode=property
|title=And then I saw it: Testing Hypotheses on Turning Points in a Corpus of UFO Sighting Reports
|pdfUrl=https://ceur-ws.org/Vol-3834/paper93.pdf
|volume=Vol-3834
|authors=Jan Langenhorst,Robert C. Schuppe,Yannick Frommherz
|dblpUrl=https://dblp.org/rec/conf/chr/LangenhorstSF24
}}
==And then I saw it: Testing Hypotheses on Turning Points in a Corpus of UFO Sighting Reports==
Jan Langenhorst1,∗,†, Robert C. Schuppe1,‡ and Yannick Frommherz1
1 TUD Dresden University of Technology
Abstract
As part of developing a Computational Narrative Understanding, modeling events within stories has
recently received significant attention within the digital humanities community. Most of the current
research aims at good performance when predicting events. By contrast, we explore a focused approach
based on qualitative observations. We attempt to trace the role of structural elements – more specifically,
temporal function words – that may be characteristic of a narrative’s turning point. We draw on a corpus
of UFO sighting reports in which authors employ a prototypical narrative structure that relies on a
turning point at which the extraordinary intrudes upon the ordinary. Using binary logistic regression, we can
identify structural properties which are indicative of turning points in our data, showcasing that a focus
on detail can fruitfully complement NLP models in gaining a quantitatively informed understanding of
narratives.
Keywords
turning points, events, computational literary studies, corpus linguistics, logistic regression
1. Introduction
(1) I was in my room of a paying guest flat, 5th floor, and was about to go for my bath and
then I suddenly noticed from my window an object glowing/flashing over a jungle area
more than 1 km away from my apartment. (Report 76519)
(2) As we drove north, 2 out of four of us saw a big bright blue ball of fire that looked as if
it got brighter the closer it got to the ground. (Report 65963)
(3) I was getting in my car, when all four of us – my grandson, my grandson’s tutor, my
granddaughter, and myself – noticed a low, slow-moving, sideways teardrop-shaped
object moving from north to south through the San Gabriel Mountains. (Report 4061)
When people tell the story of something extraordinary which has happened to them, they
use a particular kind of language. This is especially true for recounting moments when the
extraordinary intrudes upon the ordinary. The excerpts above are sentences taken from texts
about alleged UFO sightings which were collected online. Within these narratives, the first
appearance of something out of the ordinary marks an important turning point. When looking
at these sentences, a pattern emerges: While they might differ from other parts of the text
in content, they also tend to stand out structurally. More precisely, they typically include an
adverbial which temporally grounds the event in relation to other parts of the narrative.
CHR 2024: Computational Humanities Research Conference, December 4 – 6, 2024, Aarhus, Denmark
∗ Corresponding author.
† Author 1 and 2 contributed 40 % each to this work, author 3 contributed 20 %.
‡ This author is supported by the Foundation for Innovation in Higher Education (Stiftung Innovation in der Hochschullehre) as part of the virTUos project.
Email: jan.langenhorst@tu-dresden.de (J. Langenhorst); robert_cornelis.schuppe@tu-dresden.de (R. C. Schuppe); yannick.frommherz@tu-dresden.de (Y. Frommherz)
ORCID: 0000-0002-5620-8738 (J. Langenhorst); 0009-0008-0874-3681 (R. C. Schuppe); 0000-0002-3167-1670 (Y. Frommherz)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073
Concepts related to what we just introduced as turning points are, among others, Labov’s
most reportable event [6], the disruptive event in the narrative theory of Todorov [19] or Field’s
plot point [3]. Hühn [5] distinguishes between type-I-events and type-II-events: Whereas every
change of state in a story marks a type-I-event, a type-II-event is characterized by further
differentiating traits, such as its unpredictability and its deviation from the norm. We see a
turning point as a type-II-event which has a particular function and prototypical position in
narratives. Hühn argues that type-II-events can only be identified hermeneutically [5]. However,
he also notes, following Schmid [17, 16], that there are criteria which hint at the presence
of a type-II-event in a sentence, e.g., the non-iterativity of an action. In our context, this
should entail a higher frequency of temporal function words such as then, as and when in
(1) – (3) in sentences recounting a turning point compared to other sentences, as these words
typically hint at a singular event. In this short paper, we aim at testing whether there is a
systematic association between sentences containing a turning point and the use of certain
context-independent markers of temporality.
Following observations such as (1) – (3), we opted to focus on then_ADV, as_SCONJ, and
when_ADV as function words which frequently introduce temporal adverbials and seem to
characterize turning point sentences. We test our hypothesis by specifying a model that predicts
whether a sentence is a turning point or not, assessing whether the selected words are
associated with a higher probability. By using a limited number of linguistic factors as predictors
in our model, we aim to contribute to a better understanding of turning points, as simpler
models help to keep the impact of individual variables more transparent.
2. Related Work
The computational modeling of narratives – both their constituent elements (e.g., characters
or events) and their overall structure – is a vibrant field of research [1, 2, 14] and can be
seen as part of a project that strives to develop what Piper calls a Computational Narrative
Understanding [10]. Literary event detection is a key element in this enterprise [18, 10, 11].
NLP research on how to best predict events in a narrative has yielded models that have been
tested on various datasets with recent approaches reporting good performance [20, 7, 8]. Not
all of these studies try to measure the same theoretical construct, since event and related
concepts are not defined consistently [4]. Also, events can be measured on different levels, e.g.,
sentence vs. word. Nevertheless, all approaches have in common that they aim to extract those
parts of a narrative that are distinguished from other parts in the way they contribute to the
development of the story.
While these studies broadly investigate the same phenomenon, they differ from our approach
in that they are predominantly concerned with predicting events while we aim at identifying
certain characteristics of those events we consider turning points. While, e.g., approaches like
the one by Ouyang/McKeown [9] make extensive use of prior findings from linguistics and
literary studies when selecting features for their models, they are still aimed at good predictive
performance. This is typically achieved by including a myriad of different factors which, on
the downside, hampers disentangling variables that contribute to what constitutes a turning
point. In contrast, to be able to better interpret results, we aim to keep our model as simple as
possible when estimating turning point probabilities.
3. Data
The data stem from a larger corpus of approximately 110,000 reports of UFO sightings submitted
via the online platform UFO Stalker (https://ufostalker.com) and scraped by one of the
authors.1 Texts are mostly written in English, presumably by people from the U.S., even though
these metadata cannot be verified. The reports’ narrative shape is typically as follows: In a
short exposition or staging phase, authors describe the – usually mundane – situation they say
they were in and quite often who they were with at the time (cf. (4)). Then something happens:
Most often, authors report that they suddenly see a strange light moving in the sky, and
thus the ‘actual’ reporting of the sighting unfolds. This reporting is mostly an account of the
author’s cognitive processes. Reports typically end without the authors reaching a definitive conclusion
with regard to what it was that they saw. This puzzlement can be seen as the prototypical
resolution of these narratives.
(4) I was inside my house with my wife, brother and his daughter. i thought i’d go out into
my backyard. so i opened my door which faces east and walked out. i stopped about 6
feet from the door and felt like i needed to look up in the south direction. so i did and
then i saw it right there in front of a low cloud. it like came out and went down
about 25 feet then left about 25 feet then back then up and back again, then stopped and
sat there. i was yelling at my wife and brother to come out here fast now! my wife was
the first one yet could not see it cause she did not have her glasses on! then my brother
came out and before he could actually look at it, it went into another cloud next to it.
the funny thing is these clouds were kind of transparent so i do not know where it went
it just went into it and vanished. at the time i saw it, it was a circular object just like a
ufo to say. it was of a dark color yet you could see the sun hitting it. so it was there! but
the moves were just to quick. it went from a straight down to a left turn in just a split
second and did what i said above in the same time. but i did get to see it for the time
mentioned above. it makes me wonder if it intended for me to see it. as when i walked
out the door i had the urge to look in that direction. but who knows. this was what i
saw and was amazed at what it did. (Report 60500)
We sampled 496 texts from the larger corpus of reports. These texts were preprocessed
using Stanza (Version 1.8.1) [13] for tokenization, sentence segmentation and part-of-speech
tagging. Two of the authors annotated which of each report’s sentences marks the turning
point, which we operationalized as the one sentence where it becomes clear that the narrative
is about a UFO sighting, i.e., we only annotated one turning point per text. Inter-annotator
agreement was good (ICC(2,1) = 0.808, 95%-CI [0.766, 0.843]; ICC(3,1) = 0.81, 95%-CI [0.769,
0.845]). Disagreement was resolved via discussion. Reports that consisted of fewer than three
sentences were discarded, in line with the data preparation done by Ouyang/McKeown [9]
following Prince’s definition of a minimal narrative [12]. Also excluded were reports that were
written in languages other than English, described something other than a UFO sighting, were
a mere description of photos or videos that were provided along with the report, did not feature
any narrativity or did not include a discernible turning point. Finally, 352 reports consisting
of 5,346 sentences were included in the analysis. Texts contained up to 81 sentences (Median
= 12, IQR = 10).
1 A similar dataset (that encompasses a different timespan) is available at Kaggle.
Figure 1: Percentages of sentences including when_ADV, as_SCONJ and then_ADV for turning points
(TP) and non-turning points (left) as well as the ratios of these percentages (right).
Figure 2: Distribution of relative positions of turning points within texts as a percentage. Axes: Relative Position (x), Frequency (y).
Figure 3: Distribution of sentence lengths for turning point and non-turning point sentences. Outliers
not shown.
4. Modeling
To test the hypothesis laid out above, we fit binary logistic regression models with the probability
of a sentence being the turning point as the outcome variable. As predictor variables
we used dummy variables coding for whether the words when_ADV, then_ADV or as_SCONJ
occurred in a given sentence. Further, we opted to include two more structural variables. First,
we added the sentence’s relative position within the text (the sentence’s index divided by the
text’s length) as a percentage. Since we assumed a certain narrative structure, we knew that position
within the narrative would play a role: We observed beforehand that the turning point is
usually located toward the beginning of the story. Second, we included logged sentence length
as a predictor. Sap et al. found that what they call major events are usually expressed in longer
sentences [15] and a similar pattern has been observed by Ouyang/McKeown [9]. Importantly,
the context of sentences is not included in any way – no information on what was written in
the preceding or following sentence was used in the model, i.e., sentences were assumed independent.
Thus, we do not measure any kind of change from one sentence to the next (like, e.g.,
Ouyang/McKeown do [9]), but rather compare ‘global’ differences between turning points and
non-turning points. Note that even though sentences are naturally clustered at the text level,
a multilevel model is not warranted in our case since we decided to only select one sentence
per text as the turning point. Thus, the turning point probability does not vary between texts.
Figure 4: Predicted probabilities of turning points for all possible values of relative position and three
different sentence lengths (8, 16 and 32). Axes: Relative Position (x), Probability Turning Point (y).
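The predictors described in this section can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `sentence_features`, the 1-based sentence index, the lowercasing of words, the token-based length and the natural logarithm are all assumptions, and the tagged sentence is written by hand rather than produced by Stanza.

```python
import math

# Marker tokens named in the paper: lowercase word plus universal POS tag.
MARKERS = ("when_ADV", "then_ADV", "as_SCONJ")


def sentence_features(tagged_sentence, index, n_sentences):
    """Build the predictors for one sentence (hypothetical helper).

    tagged_sentence: list of (word, upos) pairs
    index: 1-based position of the sentence in its text (assumption)
    n_sentences: number of sentences in the text
    """
    tokens = {f"{word.lower()}_{upos}" for word, upos in tagged_sentence}
    # One 0/1 dummy per marker.
    features = {marker: int(marker in tokens) for marker in MARKERS}
    # Relative position as a percentage of the text's length.
    features["relative_position"] = 100.0 * index / n_sentences
    # Logged sentence length, counted in tokens (assumption).
    features["log_length"] = math.log(len(tagged_sentence))
    return features


# Hand-tagged toy sentence (hypothetical tags, not Stanza output).
example = [("As", "SCONJ"), ("we", "PRON"), ("drove", "VERB"), ("north", "ADV")]
row = sentence_features(example, index=2, n_sentences=8)
```

Each sentence yields one such row; no information from neighbouring sentences enters the features, which mirrors the independence assumption stated above.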
Looking at descriptive evidence, the three selected words do exhibit different occurrence distributions
depending on whether sentences are turning points or not (Fig. 1). The percentage of
sentences that include when_ADV is four times higher for turning points than for non-turning
points. The same tendency, though less pronounced, can be observed for the word as_SCONJ,
whereas then_ADV occurs in a similarly sized share in both subsets of the corpus. Turning
point sentences have a median relative position within the text of 16.7 % (IQR = 19.4), so they are
usually present in the earlier parts of the narrative (Fig. 2). Turning point sentences are also
longer than non-turning point sentences in our data (Median_TP = 25 vs. Median_non-TP = 17;
Fig. 3).
Figure 5: Predicted probabilities of turning points for all possible values of relative position and
when_ADV absent vs. present. Axes: Relative Position (x), Probability Turning Point (y).
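The ratio reported for Fig. 1 is the share of turning point sentences containing a marker divided by the corresponding share among non-turning point sentences. A small sketch of that computation follows. The helper `marker_share` and the toy data are invented for illustration and do not come from the corpus.

```python
def marker_share(sentences, marker, is_turning_point):
    """Percentage of sentences in one group that contain the marker."""
    group = [s for s in sentences if s["turning_point"] == is_turning_point]
    hits = sum(1 for s in group if marker in s["tokens"])
    return 100.0 * hits / len(group)


# Toy data (invented for illustration, not taken from the corpus).
toy = [
    {"tokens": {"when_ADV", "i_PRON"}, "turning_point": True},
    {"tokens": {"i_PRON"}, "turning_point": True},
    {"tokens": {"when_ADV"}, "turning_point": False},
    {"tokens": {"it_PRON"}, "turning_point": False},
    {"tokens": {"then_ADV"}, "turning_point": False},
    {"tokens": {"we_PRON"}, "turning_point": False},
]

share_tp = marker_share(toy, "when_ADV", True)        # 1 of 2 sentences
share_non_tp = marker_share(toy, "when_ADV", False)   # 1 of 4 sentences
ratio = share_tp / share_non_tp
```

In this toy sample the ratio is 2; the paper reports a ratio of about four for when_ADV in the real data.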
5. Results
We fit three separate models. In a first step, regressing the probability of a sentence being a
turning point on a sentence’s relative position within the text, we estimate a negative association.
This model explained a small amount of variance (Tjur’s R² = 0.113). In a second step, we
added the logged sentence length (Tjur’s R² = 0.183). Fig. 4 plots the predicted probabilities of
a sentence being a turning point against its relative position within the text (a percentage value
close to 0 for the very beginning and 100 for the end of a text) for different sentence lengths.
As can be seen, the model assigns very low probability to sentences after half of the narrative
has passed, whereas sentences that lie in the first quarter of the text are assigned probabilities
between around 0.15 and 0.04 for shorter sentences and between 0.54 and 0.20 for very long
sentences.
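The probabilities quoted for the second model can be approximately recomputed from the rounded coefficients printed in Table 1 (Appendix A). The sketch below assumes that "logged" means the natural logarithm and that sentence length is the raw count shown in Fig. 4. The function name is hypothetical, and small deviations from the published values are expected because the coefficients are rounded to two decimals.

```python
import math

# Rounded Model 2 coefficients as printed in Table 1.
INTERCEPT = -4.58
B_POSITION = -0.06    # per percentage point of relative position
B_LOG_LENGTH = 1.37   # per unit of log sentence length (natural log assumed)


def predicted_probability(relative_position, length):
    """Inverse logit of the linear predictor for one sentence."""
    eta = (INTERCEPT
           + B_POSITION * relative_position
           + B_LOG_LENGTH * math.log(length))
    return 1.0 / (1.0 + math.exp(-eta))


# Start and end of the first quarter for a short and a very long sentence.
for length in (8, 32):
    for position in (0, 25):
        p = predicted_probability(position, length)
        print(f"length={length:2d} position={position:2d}% p={p:.2f}")
```

Under these assumptions the four values land close to the ranges given in the text (around 0.15 to 0.04 for length 8 and 0.54 to 0.20 for length 32), which is consistent with reading the length predictor as a natural logarithm.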
Adding our main variables of interest, namely the occurrence of temporal markers, resulted
in improved model fit (Tjur’s R² = 0.213; for full model comparison, see Data Availability). The
estimated coefficients for as_SCONJ and then_ADV were not statistically significant, which is
in accordance with the descriptive patterns presented above. The word when_ADV, however,
was associated with an increased probability of a sentence being a turning point (Fig. 5). Again,
different sentence lengths predict different probabilities for a sentence being the turning point,
with longer sentences being associated with higher probabilities (Fig. 6).
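To make the size of the when_ADV effect more tangible, the rounded Model 3 coefficients from Table 1 can be turned into an odds ratio and into predicted probabilities with and without the marker. This is an illustrative sketch only: the natural logarithm, the function name `probability` and the example sentence (relative position 10 %, length 16) are assumptions.

```python
import math

# Rounded Model 3 coefficients as printed in Table 1.
COEF = {
    "intercept": -4.28,
    "relative_position": -0.06,
    "log_length": 1.22,
    "as_SCONJ": 0.28,
    "then_ADV": -0.27,
    "when_ADV": 1.31,
}


def probability(relative_position, length, when=0, then=0, as_=0):
    """Predicted turning point probability for one sentence (hypothetical helper)."""
    eta = (COEF["intercept"]
           + COEF["relative_position"] * relative_position
           + COEF["log_length"] * math.log(length)
           + COEF["when_ADV"] * when
           + COEF["then_ADV"] * then
           + COEF["as_SCONJ"] * as_)
    return 1.0 / (1.0 + math.exp(-eta))


# Odds ratio: multiplicative change in the odds when when_ADV is present.
odds_ratio_when = math.exp(COEF["when_ADV"])

# Same hypothetical sentence with and without when_ADV.
p_absent = probability(10, 16, when=0)
p_present = probability(10, 16, when=1)
```

With the rounded coefficient of 1.31, the odds of a sentence being the turning point are multiplied by roughly 3.7 when when_ADV is present, holding position, length and the other two markers fixed.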
It is important to note that adding content words which we know a priori to be discriminative
of turning points for our specific genre of text would also result in better model fit – e.g., adding
a binary variable that captures whether the word sky appears in a given sentence (which is
presumably typical of turning points in our texts since that is the locus of the extraordinary
event) improves model fit (Tjur’s R² = 0.242). While it is clear that including more or even all
word occurrences into the model will result in better model fit or predictive power, respectively,
it was not our aim to design a model that discriminates turning point sentences and non-turning
point sentences perfectly – i.e., solve a classification problem – but rather test the theoretical
question laid out above in a very specific exemplary genre of texts.
Figure 6: Predicted probabilities of turning points for sentences containing vs. not containing
when_ADV and three different sentence lengths (8, 16 and 32). Axes: Relative Position (x), Probability Turning Point (y).
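Tjur’s R², the fit measure used throughout this section, is the difference between the mean predicted probability of the observed positives and that of the observed negatives. A minimal sketch with invented numbers follows. The function name `tjur_r2` and the toy values are assumptions for illustration.

```python
def tjur_r2(outcomes, predicted):
    """Mean predicted probability of the 1s minus that of the 0s."""
    ones = [p for y, p in zip(outcomes, predicted) if y == 1]
    zeros = [p for y, p in zip(outcomes, predicted) if y == 0]
    return sum(ones) / len(ones) - sum(zeros) / len(zeros)


# Toy outcomes and predicted probabilities (invented for illustration).
y = [1, 0, 0, 1, 0]
p_hat = [0.6, 0.1, 0.3, 0.4, 0.2]
r2 = tjur_r2(y, p_hat)  # mean of the 1s is 0.5, mean of the 0s is 0.2
```

A value of 0 means the model assigns the same average probability to turning points and non-turning points, while 1 would mean perfect separation.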
6. Discussion
Our investigation of turning points in UFO sighting narratives was driven by a hypothesis on
the role of certain content-independent characteristics of turning points and had a relatively
narrow scope: Not only does our corpus consist of a very particular and, possibly, idiosyncratic
genre of texts. Also, our study only used a small hand-annotated sample and focused on few
variables that were situated at different levels: Position within texts and sentence length did
already account for some variation in the probability of sentences being turning points.
Regarding the role of temporal function words, we found that while when_ADV is predictive of
turning points, then_ADV and as_SCONJ are not. Thus, we were not able to identify a whole
group or class of words that are used to mark turning points, but we did corroborate that the
use of when_ADV is predictive of a turning point. This finding supports our general hypothesis
that turning points are characterized not only by their content, but also by structural properties
such as temporal adverbials. Whether this also holds true for other types of narratives remains
subject to further investigation.
Using state-of-the-art NLP methodology, there may be huge advances in the prediction of
event types in narrative texts over the next few years. Another question, however, is how well
these NLP models will serve us to understand what makes a turning point a turning point (or
an event an event, for that matter). On a theoretical level, one can think about approaches like
ours as modeling the reader, but also as modeling the author: What hints enable readers to
place the content of a given sentence within the greater narrative? What hints does the author
deem viable to trigger said interpretation? Do these cues vary between different genres that
feature different narrative structures or schemas? These and many more questions should be
addressed by future research from the vantage point of different disciplines – such as literary
studies, linguistics, and psychology. This will help us gain a quantitatively informed
understanding of (literary) narratives. We hope to have exemplified with this study how focusing
on individual linguistic characteristics can complement prediction-focused approaches, aiding
the development of a more thorough, corpus-based understanding of narrativity.
Data Availability
The data and code for our analysis are available at: https://osf.io/vd9pu/.
References
[1] A. Berhe, C. Guinaudeau, and C. Barras. “Survey on Narrative Structure: from Linguistic Theories to Automatic Extraction Approaches”. In: Traitement Automatique des Langues 63. Ed. by C. Fabre, E. Morin, S. Rosset, and P. Sébillot. France: ATALA (Association pour le Traitement Automatique des Langues), 2022, pp. 63–87. url: https://aclanthology.org/2022.tal-1.3.
[2] R. L. Boyd, K. G. Blackburn, and J. W. Pennebaker. “The Narrative Arc: Revealing Core Narrative Structures through Text Analysis”. In: Science Advances 6.32 (2020), pp. 1–9. doi: 10.1126/sciadv.aba2196.
[3] S. Field. Screenplay: The Foundations of Screenwriting. Revised. New York: Random House, 2005.
[4] E. Gius and M. Vauth. “Towards an Event Based Plot Model. A Computational Narratology Approach”. In: Journal of Computational Literary Studies 1.1 (2022), pp. 1–20. doi: 10.48694/jcls.110.
[5] P. Hühn. “Event and Eventfulness”. In: Handbook of Narratology. Ed. by P. Hühn, J. C. Meister, J. Pier, and W. Schmid. Berlin/Boston: De Gruyter, 2014, pp. 159–178. doi: 10.1515/9783110316469.159.
[6] W. Labov. Language in the Inner City. Philadelphia: University of Pennsylvania Press, 1972, pp. 354–396.
[7] V. D. Lai, T. N. Nguyen, and T. H. Nguyen. “Event Detection: Gate Diversity and Syntactic Importance Scores for Graph Convolution Neural Networks”. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2020, pp. 5405–5411. doi: 10.18653/v1/2020.emnlp-main.435.
[8] C. Liu, M. Last, and A. Shmilovici. “Identifying Turning Points in Animated Cartoons”. In: Expert Systems with Applications 123 (2019), pp. 246–255. doi: 10.1016/j.eswa.2019.01.003.
[9] J. Ouyang and K. McKeown. “Modeling Reportable Events as Turning Points in Narrative”. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Ed. by L. Màrquez, C. Callison-Burch, and J. Su. Lisbon, Portugal: Association for Computational Linguistics, 2015, pp. 2149–2158. doi: 10.18653/v1/D15-1257.
[10] A. Piper. “Computational Narrative Understanding: A Big Picture Analysis”. In: Proceedings of the Big Picture Workshop. Ed. by Y. Elazar, A. Ettinger, N. Kassner, S. Ruder, and N. A. Smith. Singapore: Association for Computational Linguistics, 2023, pp. 28–39. doi: 10.18653/v1/2023.bigpicture-1.3.
[11] A. Piper and S. Bagga. “Toward a Data-Driven Theory of Narrativity”. In: New Literary History 54.1 (2022), pp. 879–901. doi: 10.1353/nlh.2022.a898332.
[12] G. Prince. A Grammar of Stories: An Introduction. Vol. 13. De Proprietatibus Litterarum. Series Minor. The Hague: De Gruyter, 1973.
[13] P. Qi, Y. Zhang, Y. Zhang, J. Bolton, and C. D. Manning. “Stanza: A Python Natural Language Processing Toolkit for Many Human Languages”. In: arXiv preprint arXiv:2003.07082 (2020). doi: 10.48550/arXiv.2003.07082.
[14] A. J. Reagan, L. Mitchell, D. Kiley, C. M. Danforth, and P. S. Dodds. “The Emotional Arcs of Stories are Dominated by Six Basic Shapes”. In: EPJ Data Science 5.1 (2016), pp. 1–12. doi: 10.1140/epjds/s13688-016-0093-1.
[15] M. Sap, A. Jafarpour, Y. Choi, N. A. Smith, J. W. Pennebaker, and E. Horvitz. “Quantifying the Narrative Flow of Imagined versus Autobiographical Stories”. In: Proceedings of the National Academy of Sciences 119.45 (2022), pp. 1–12. doi: 10.1073/pnas.2211715119.
[16] W. Schmid. Elemente der Narratologie. 3rd, revised. De Gruyter Studium. Berlin/New York: De Gruyter, 2014. doi: 10.1515/9783110350975.
[17] W. Schmid. “Narrativity and Eventfulness”. In: What is Narratology? Questions and Answers Regarding the Status of a Theory. Ed. by T. Kindt and H.-H. Müller. Vol. 1. Narratologia. Berlin/New York: De Gruyter, 2003, pp. 239–275.
[18] M. Sims, J. H. Park, and D. Bamman. “Literary Event Detection”. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Florence, Italy: Association for Computational Linguistics, 2019, pp. 3623–3634. doi: 10.18653/v1/P19-1353.
[19] T. Todorov. “Die Grammatik der Erzählung”. In: Strukturalismus als interpretatives Verfahren. Ed. by H. Gallas. Vol. 2. Collection Alternative. Darmstadt/Neuwied: Luchterhand, 1972, pp. 57–71.
[20] Z. Wang, A. Jafarpour, and M. Sap. “Uncovering Surprising Event Boundaries in Narratives”. In: Proceedings of the 4th Workshop of Narrative Understanding (WNU2022). Seattle, United States: Association for Computational Linguistics, 2022, pp. 1–12. doi: 10.18653/v1/2022.wnu-1.1.
A. Model comparison
Table 1: Model Comparison
Dependent variable: TurningPoint
                     (1)                (2)                (3)
RelativePosition     −0.06***           −0.06***           −0.06***
                     (−0.07, −0.05)     (−0.07, −0.06)     (−0.07, −0.06)
log(Length)                             1.37***            1.22***
                                        (1.15, 1.59)       (0.98, 1.46)
as_SCONJ                                                   0.28
                                                           (−0.11, 0.66)
then_ADV                                                   −0.27
                                                           (−0.61, 0.07)
when_ADV                                                   1.31***
                                                           (0.85, 1.77)
Constant             −0.51***           −4.58***           −4.28***
                     (−0.70, −0.32)     (−5.27, −3.89)     (−5.01, −3.55)
Tjur’s R²            0.11               0.18               0.21
Observations         5,346              5,346              5,346
Akaike Inf. Crit.    2,046.34           1,877.00           1,826.43
Note: * p<0.1; ** p<0.05; *** p<0.01