<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>CEUR Workshop Proceedings</journal-title>
      </journal-title-group>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Approach of Dynamic Topic Models and Large Language Models</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Kai-Robin Lange</string-name>
          <email>kalange@statistik.tu-dortmund.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tobias Schmidt</string-name>
          <email>tobias3.schmidt@tu-dortmund.de</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Matthias Reccius</string-name>
          <email>Matthias.Reccius@ruhr-uni-bochum.de</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Henrik Müller</string-name>
          <email>henrik.mueller@tu-dortmund.de</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michael Roos</string-name>
          <email>michael.roos@ruhr-uni-bochum.de</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Carsten Jentsch</string-name>
          <email>jentsch@statistik.tu-dortmund.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Statistics, TU Dortmund University</institution>
          ,
          <addr-line>44221 Dortmund</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Faculty of Management and Economics, Ruhr University Bochum</institution>
          ,
          <addr-line>44780 Bochum</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Institute of Journalism, TU Dortmund University</institution>
          ,
          <addr-line>44221 Dortmund</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2009</year>
      </pub-date>
      <abstract>
        <p>With rapidly evolving media narratives, it has become increasingly critical not just to extract narratives from a given corpus but to investigate how they develop over time. While popular narrative extraction methods such as Large Language Models do well in capturing typical narrative elements or even the complex structure of a narrative, applying them to an entire corpus comes with obstacles, such as a high financial or computational cost. We propose a combination of the language understanding capabilities of Large Language Models with the large-scale applicability of topic models to dynamically model narrative shifts across time using the Narrative Policy Framework. We apply a topic model and a corresponding change point detection method to find changes that concern a specific topic of interest. Using this model, we filter our corpus for documents that are particularly representative of that change and feed them into a Large Language Model that interprets the change in an automated fashion and distinguishes between content and narrative shifts. We employ our pipeline on a corpus of The Wall Street Journal newspaper articles from 2009 to 2023. Our findings indicate that a Large Language Model can efficiently extract a narrative shift if one exists at a given point in time, but does not perform as well when having to decide whether a shift in content or a narrative shift took place.</p>
      </abstract>
      <kwd-group>
        <kwd>change point</kwd>
        <kwd>narrative</kwd>
        <kwd>story</kwd>
        <kwd>Large Language Models</kwd>
        <kwd>Latent Dirichlet Allocation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>With the rise of populism in western democracies, it has become increasingly important to evaluate
narratives in politics, economics and other areas of interest that shape our society. Broadly speaking,
narratives are linguistic constructs that boil complex connections between events down to an explainable
form. Such narratives do not have to fall in line with facts and have gained influence in
many areas of our society, as they can act as a substitute for fact-based decision making. Consequently,
methods to extract and analyze such narratives from large corpora have garnered the interest of many
researchers. With constantly changing and often conflicting narratives spread by entities such
as policy makers, news outlets or social media personalities, we consider it especially important to
observe how a narrative develops over time instead of just globally extracting narratives from a corpus
that spans a long time period.</p>
      <p>
        Along with the development and improvement of Natural Language Processing (NLP) methods,
narrative extraction methods have also improved. The latest big development in NLP has brought a
push towards more reliable human-like narrative extraction: the language understanding capabilities
of Large Language Models (LLMs) such as ChatGPT [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], Llama [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], and others show the ability to
detect narratives with increasingly complex definitions and thus blur the lines between qualitative and
quantitative analyses. Given billions of parameters in the neural networks these models are based on,
they can answer questions about almost any input document and the narratives contained within, but
this enormous size also comes with drawbacks. Similarly to how expensive it is to pay experts of a
certain field of interest to annotate texts, using LLMs for large scale corpora is not feasible for many
researchers due to their computational demand or the financial cost of commercial models. Additionally,
their size does not allow users to fully train the models on just a single corpus from scratch. Thus,
many methods that use training from scratch to model temporal developments in a corpus, such as
many dynamic topic models and diachronic word embedding models [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], cannot be directly transferred
to LLMs, leaving fewer options for temporal narrative change detection.
      </p>
      <p>
        We propose a pipeline that combines the best qualities of both approaches, using dynamic topic
models and LLMs. Leveraging the approach of [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], we use a bootstrap-based topical change detection
on the topics resulting from the dynamic topic model “RollingLDA” [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. While this model succeeds in
providing us with change points that are based on differences in the word count vectors of individual
topics, it does not give us any intricate details about the changes themselves except for the topic of
the change, the time chunk in which it happened and some key words that are responsible for the
detection of the change. Furthermore, when a change is detected, it is not always certain whether it
actually signifies a genuine shift in the narrative or in another dimension of the discourse, such as the
factual content or the contextual framework provided. We can however use this method to narrow a
large corpus down to a small curated number of documents that are suspected to contain information
about some discursive shift. We do this by filtering the documents within the time period in which the
change occurred, given the information we are provided about the change. We then use the language
understanding capabilities of an LLM by processing these documents to explain the topical change that
occurred and to decide whether the change signifies a narrative shift or not. We additionally provide
the LLM with information gathered by the topic model to put the change into context. To accurately
guide our identification of shifts in narratives, we use the Narrative Policy Framework (NPF) [
        <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
        ], an
analytical approach from political science. Our findings indicate that the LLM performs well when
explaining a narrative shift, if one exists, but hallucinates when judging whether a detected change is a
narrative shift or not, claiming most content shifts to also be narrative shifts.
      </p>
      <p>
        We evaluate our model on a corpus of news articles of The Wall Street Journal ranging from 2009 to
2023. As the documents are copyright-protected, we do not use a commercial LLM in the cloud, but
rather a local instance of the open source LLM Llama 3.1 8B [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
    </sec>
    <sec id="sec-3">
      <title>2. Related Work</title>
      <p>In our research, we focus on media narratives stemming from news articles with a business and finance
focus. Thus, we base our definition of narratives on the existing literature on narratives from research
in the field of economics and political economics.</p>
      <sec id="sec-3-1">
        <title>2.1. Defining Narratives</title>
        <p>
          The study of narratives has recently gained traction in economic research, though scholars have yet
to converge on a single, universally accepted definition of the concept. Shiller [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ], as an early and
influential example, characterizes economic narratives as “stories that offer interpretations of economic
events, or morals, or hints of theories about the economy”, thus providing a rather vague definition that
leaves ample room for interpretation. A more formal modeling strategy was pioneered by Eliaz and
Spiegler [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ], who adapt concepts from the literature on Bayesian networks [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. The authors highlight
the causal connections among events and economic variables as sufficient for shaping people’s economic
and political beliefs. While analytically appealing, this conceptualization sets aside value judgments, a
crucial aspect of narratives. We hold that these normative implications ground the impact of political
and economic narratives by motivating individuals and groups to act on their beliefs [
          <xref ref-type="bibr" rid="ref11 ref12 ref13">11, 12, 13</xref>
          ].
        </p>
        <p>
          To empirically capture narratives, including their ideological valence, the Narrative Policy Framework
(NPF) offers an alternative lens [
          <xref ref-type="bibr" rid="ref14 ref6 ref7">6, 7, 14</xref>
          ]. The NPF distinguishes narratives by their content and form,
systemizing the latter through four elements: a setting, certain characters, a plot and a moral of the
story. The setting describes the scenery in which the narrative takes effect, such as a presidential
election, a military conflict or a time of high inflation. The characters of a narrative can be persons or
organizations, but other entities, even non-sentient ones such as a spreading virus, can also take
actions in the narrative. The plot establishes relationships between the characters in space and time.
Lastly, the moral of the story acts as a “takeaway” that often includes implicit or explicit calls to action.
With these components, the NPF highlights both the role of ideological charge and the centrality of
causal attributions, particularly by emphasizing that identifying “who is to blame for the problem” is an
essential part of every narrative [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ].
        </p>
        <p>
          Similarly, Müller et al. [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ] propose a narrative definition that is based on six key elements. They build
their framework around the theory of media frames [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ], proposing that narratives provide dynamic,
evolving depictions of events. According to this view, a media narrative comprises one or more media
frames (all of which are built around four key elements, see [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]) combined with protagonists (e.g.,
individuals and institutions) and events arranged chronologically and often presented as causally linked.
        </p>
        <p>
          For a more comprehensive overview of narrative definitions across disciplines, see the overview
papers by Roos and Reccius [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] or Santana et al. [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ], among others.
        </p>
        <p>Building upon foundational works in narrative theory, our research focuses on detecting narrative
structures and their evolution in large text corpora. To achieve this, we draw on methodologies that
have proven effective in identifying thematic shifts or change points within textual data. These change
points often coincide with shifts in journalistic focus, and we hypothesize that such shifts frequently
reflect underlying narrative changes. By leveraging established methods that combine topic modeling
with change point detection, we aim to capture these transitions, providing valuable insights into how
narratives develop and evolve over time.</p>
      </sec>
      <sec id="sec-3-2">
        <title>2.2. Extracting Narratives from Text</title>
        <p>
          Following the identification of discursive change points, the second critical step in our approach involves
the automated analysis of the documents associated with these transitions. To date, there is no widely
established (language) model specifically optimized for narrative extraction in this context. Instead,
a diverse range of methods has emerged, each attempting to identify narratives or their components
through techniques ranging from word-count-based analyses [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ] to Large Language
Model-based methods [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ].
        </p>
        <p>
          A detailed review of existing NLP-methods to identify systematic parts of narratives is outlined by
Santana et al. [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ]. The authors focus on identifying key components such as events, participants, and
temporal and spatial data, and linking these components to form coherent narratives. The methodology
discussed includes part-of-speech tagging, event extraction, semantic role labeling, and entity linking,
among others. However, this approach using classical NLP-tools struggles with challenges such as
narrative complexity and cross-document narratives.
        </p>
        <p>
          A similar approach, focusing on political and economic narratives, is proposed by Ash et al. [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ].
Their methodology RELATIO employs semantic role labeling to identify key narrative components such
as agents, actions, and patients within sentences, which culminates in the production of interpretable
narrative statements. While their approach effectively identifies simple narrative building blocks, it falls
short in detecting and extracting more complex narratives that integrate causality and sense-making.
Lange et al. [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ] advanced this approach by enhancing the existing RELATIO method with additional
pre- and post-processing steps. By combining multiple RELATIO-extracted narrative blocks, the authors
were able to link related statements and extract complex narrative structures that better align with
causality-based definitions of narratives. At the same time, they emphasize that the increased complexity
of their pipeline can amplify error cascades, where even minor changes in a longer input sequence may
lead to significantly different results.
        </p>
        <p>
          A recent study leveraging large language models (LLMs) for narrative extraction is presented by
Gueta et al. [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ]. The authors explore whether LLMs can effectively capture macroeconomic narratives
from social media platforms like X (formerly Twitter). However, the study falls short of providing a
robust definition of narratives, focusing instead on sentiment analysis and RELATIO-style statements,
leaving key aspects of narrative complexity and causality underexplored.
        </p>
        <p>Collectively, these studies demonstrate the evolving landscape of narrative extraction methodologies,
highlighting the integration of advanced NLP techniques to unravel complex narrative structures.
By employing a hybrid approach, we seek to address the limitations identified in previous studies,
particularly concerning the temporal dynamics of narratives and their computational feasibility. Our
proposed pipeline, which combines dynamic topic models and LLMs, aims to provide a more
comprehensive understanding of narrative changes over time, thereby contributing to the broader discourse on
narrative extraction and analysis.</p>
        <p>
          The choice to utilize LLMs for annotating and categorizing economic narratives stems from their
demonstrated ability to excel at complex natural language processing tasks. Modern state-of-the-art
LLMs, such as OpenAI’s GPT-4 [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] or Anthropic’s Claude 3.5 Sonnet, have consistently outperformed
traditional NLP models in natural language understanding, classification tasks, and information retrieval.
These models enable efficient processing of large corpora, reducing the time needed to annotate
thousands of documents from weeks or months to just hours [
          <xref ref-type="bibr" rid="ref23 ref24 ref25">23, 24, 25</xref>
          ].
        </p>
        <p>
          Unlike traditional NLP tools, which often rely on predefined models like sentiment analysis or topic
modeling, LLMs can understand nuanced, contextual relationships in text. Previous approaches to
analyzing economic narratives frequently employed machine learning pipelines [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ], topic modeling
[
          <xref ref-type="bibr" rid="ref26">26</xref>
          ], or sentiment analysis [
          <xref ref-type="bibr" rid="ref27">27</xref>
          ]. While these methods are effective for identifying broad patterns, as
discussed earlier, they lack the depth to identify and categorize predefined, backward-looking narratives,
particularly when such narratives involve subtle linguistic cues or complex causal relationships. On the
other hand, studies in computational social sciences have shown that LLMs can match or even surpass
human coders in annotating political, social, and economic texts [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ], underscoring their potential for
content analysis. For example, Mellon et al. [
          <xref ref-type="bibr" rid="ref29">29</xref>
          ] reported that LLMs achieved 95% agreement with expert
annotators when analyzing British election statements. Similarly, Gilardi et al. [
          <xref ref-type="bibr" rid="ref30">30</xref>
          ] demonstrated that
LLMs like GPT-3.5 could classify tweet content, author stances, and narrative frames more accurately
than trained crowd workers. Building on this work, we explore the ability of LLMs to identify narratives
and narrative-like structures in text. Detailed prompting and the continuous involvement of humans in
the loop ensure that LLM annotations align as closely as possible with human intuition.
        </p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>3. Methodology</title>
      <p>
        To extract change points in our documents from our corpus, we use the Topical Changes method [
        <xref ref-type="bibr" rid="ref31 ref4">4, 31</xref>
        ],
which is based on the models RollingLDA [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] and LDAPrototype [
        <xref ref-type="bibr" rid="ref32">32</xref>
          ], which improve the reliability of
the classical Latent Dirichlet Allocation [
        <xref ref-type="bibr" rid="ref33">33</xref>
          ] and allow us to apply it to temporal data. After extracting
the change points, we further analyze our documents using the LLM Llama 3.1 8B [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. In this section,
we detail the models’ functionalities and advantages for the task at hand.
      </p>
      <sec id="sec-4-1">
        <title>3.1. LDAPrototype</title>
        <p>Because the modeling of LDAs is inherently non-deterministic due to its sampling and initialization,
there is no way of telling whether a single given run represents the corpus well or whether it creates
“bad” topics by chance. To avoid relying on randomness, we use LDAPrototype.</p>
        <p>
          In terms of language modeling, LDAPrototype follows an LDA [
          <xref ref-type="bibr" rid="ref33">33</xref>
          ], but instead of training just one
LDA, n LDAs are trained, from which one model is chosen as a prototypical LDA. To “represent” the
n models, the model with the highest average pairwise similarity to every other model is chosen as
the prototype. To do this, Rieger et al. [
          <xref ref-type="bibr" rid="ref32">32</xref>
          ] proposed a similarity measure to compare LDAs with. All
pair-wise combinations of the n LDA models are compared by clustering their topics based on the cosine
similarity of the topics’ top words into clusters of size two. The topic model similarity between models
M1 and M2 is then given by
(number of topics of M1 that are matched with a topic of M2) / K,
        </p>
        <p>with K as the number of topics in both models M1 and M2. This similarity can thus be interpreted as the
percentage of topics that are “more similar” to a topic of the other model than to a topic of
the same model, i.e., a form of topic overlap between two models. The chosen prototype will thus be
one of the models with the highest average topic overlap and, conversely, the fewest “unique” topics that
have not been generated by other LDAs; it can therefore be considered the most stable of the n
generated models.</p>
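        <p>The selection can be sketched in a few lines of Python. This is an illustration only, not the authors’ implementation: topics are represented here by top-word count vectors, and the full clustering of topics into pairs is replaced by a simplified one-sided matching.</p>
        <preformat>
```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two word-count vectors."""
    return float(u @ v) / float(np.linalg.norm(u) * np.linalg.norm(v))

def model_similarity(m1, m2):
    """Simplified stand-in for the LDAPrototype similarity: the share of
    topics in m1 whose closest topic, among all topics of both models,
    belongs to m2."""
    matched = 0
    for i, t in enumerate(m1):
        candidates = [(cosine(t, s), 1) for s in m2]
        candidates += [(cosine(t, s), 0) for j, s in enumerate(m1) if j != i]
        best_sim, from_m2 = max(candidates)
        matched += from_m2
    return matched / len(m1)

def choose_prototype(models):
    """Pick the model with the highest mean pairwise similarity."""
    scores = []
    for a, ma in enumerate(models):
        sims = [model_similarity(ma, mb) for b, mb in enumerate(models) if b != a]
        scores.append(sum(sims) / len(sims))
    return int(np.argmax(scores))
```
</preformat>
        <p>In this sketch, near-identical models score close to 1 against each other, while a model with topics no other run produced receives a low average score and is never selected.</p>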
      </sec>
      <sec id="sec-4-2">
        <title>3.2. RollingLDA</title>
        <p>
          RollingLDA [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] is a dynamic topic model based on LDA and can, by extension, also be used with
LDAPrototype as a back-end. The model uses a rolling window approach that does not train an LDA
from all time chunks at once, but rather begins by modeling the first w time chunks and then proceeds to
model the remaining time chunks based on the information and topic assignments of the last m time
chunks.
        </p>
        <p>As the training of RollingLDA is thus based on the initially trained model, we use w time chunks instead of just
one as a warm-up to ensure that the model that initializes RollingLDA is properly trained. This
parameter should be chosen sufficiently large to yield a proper topic model that covers the most important
repeating topics. For instance, choosing w = 12 when using monthly chunks enables the model to
initialize on the first 12 months of the data without a temporal component, ensuring its initial model
has been trained on topics that return in yearly trends.</p>
        <p>
          The memory parameter m creates the rolling window effect of the model. In each time chunk, we
provide our model with previous topic assignments from the last m time chunks, which are considered
while estimating the document-topic and word-topic distributions during the Gibbs sampling step.
This enables RollingLDA to efficiently model trends in temporal corpora, if the memory parameter
is tuned accordingly. For instance, when using monthly chunks, m = 4 or m = 3 can be chosen to
generate topics “remembering” quarterly trends, while “forgetting” older information from years prior.
Providing the model with only a limited amount of past information allows it to change flexibly while still forcing
it to keep coherent topics over time. This model is specifically designed to capture abrupt changes in
rapidly changing news media, which sets it apart from other dynamic topic models, such as the original
dynamic topic model [
          <xref ref-type="bibr" rid="ref34">34</xref>
          ] and other early iterations of the idea [
          <xref ref-type="bibr" rid="ref35 ref36 ref37">35, 36, 37</xref>
          ].
        </p>
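        <p>Schematically, the rolling window training reduces to the following loop. The callables fit and update are placeholders for the actual LDA training and the memory-aware Gibbs-sampling update; the parameter names mirror the warm-up and memory parameters described above.</p>
        <preformat>
```python
def rolling_fit(chunks, fit, update, w=12, m=4):
    """Schematic RollingLDA-style loop. fit and update are placeholders:
    fit trains the initial model on the first w chunks at once, update
    re-estimates topics for one new chunk, given the topic assignments
    of the last m chunks as memory."""
    model = fit(chunks[:w])            # warm-up period, no temporal component
    history = list(chunks[:w])
    for chunk in chunks[w:]:
        memory = history[-m:]          # only the last m chunks are "remembered"
        model = update(model, chunk, memory)
        history.append(chunk)
    return model
```
</preformat>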
      </sec>
      <sec id="sec-4-3">
        <title>3.3. Topical Changes</title>
        <p>
          The topical changes model [
          <xref ref-type="bibr" rid="ref31 ref4">4, 31</xref>
          ] detects change points within topics put out by RollingLDA by
comparing the development of word-topic vectors over time. For this, we observe both the current
and previous count vectors of word assignments and compare the resulting similarity scores. We then
perform a Bootstrap-based monitoring procedure that tests whether a change occurred.
        </p>
        <p>To construct these word count vectors, we count the number of occurrences of each word in that
topic over the last z time chunks (or since the last detected change, if it occurred less than z time chunks
ago). Thus, z is the maximum number of time chunks the change detection can “look back”, enabling
a rolling window based change detection similar to RollingLDA’s rolling window based topic modeling.
z can be tuned to, for instance, stop the detection from capturing repeatedly appearing trend effects by
setting it to the assumed length of the trend. As the model focuses on finding abrupt changes rather
than the slow natural development of language over time, a “mixture” parameter controls how much
language change the user expects from one time chunk to the next; it alters the look-back word
topic vectors by mixing them with the current word vector to a certain degree.</p>
        <p>The change detection is then performed by sequentially running a bootstrap-based test with a
significance level α for each time chunk in each topic. In this test, the cosine similarities between the
word topic distribution of the current time chunk and B bootstrap samples of the look-back word topic
vector are compared. If the cosine distance of the observed look-back word topic vector to the current
word topic vector is larger than that of B · (1 − α) of the bootstrap samples, a change is detected.</p>
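        <p>A minimal sketch of such a test, assuming multinomial resampling of the look-back word counts (the resampling scheme of the original method may differ in detail):</p>
        <preformat>
```python
import numpy as np

def cosine_distance(u, v):
    return 1.0 - float(u @ v) / float(np.linalg.norm(u) * np.linalg.norm(v))

def change_detected(current, lookback, B=500, alpha=0.01, rng=None):
    """Bootstrap test sketch: resample the look-back word counts from a
    multinomial, record each resample's cosine distance to the look-back
    vector, and flag a change if the observed distance of the current
    chunk exceeds the empirical (1 - alpha) quantile."""
    rng = rng or np.random.default_rng(0)
    n = int(lookback.sum())
    probs = lookback / lookback.sum()
    boot = [cosine_distance(rng.multinomial(n, probs), lookback) for _ in range(B)]
    threshold = np.quantile(boot, 1 - alpha)
    return cosine_distance(current, lookback) > threshold
```
</preformat>
        <p>A current chunk that merely fluctuates within sampling noise stays below the bootstrap threshold, while a chunk whose word distribution departs strongly from the look-back vector exceeds it.</p>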
        <p>
          This detection does not, however, indicate which tokens actually caused the change.
Rieger et al. [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] propose to use words with high leave-one-out word impacts, that is, words for which
the cosine distance is reduced the most when leaving the word out of both word-topic vectors that are
compared during the detection step. These words can thus be interpreted as the main causes of the
detected change, as they had the highest impact on the drop in similarity.
        </p>
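        <p>A sketch of the leave-one-out computation, written as an illustration of the definition above rather than the authors’ implementation:</p>
        <preformat>
```python
import numpy as np

def loo_word_impacts(current, lookback, top=10):
    """Leave-one-out word impacts: for each word, drop it from both
    word-topic vectors and measure how much the cosine distance shrinks.
    Words with the largest reduction are the main drivers of the change."""
    def dist(u, v):
        return 1.0 - float(u @ v) / float(np.linalg.norm(u) * np.linalg.norm(v))
    full = dist(current, lookback)
    impacts = []
    for w in range(len(current)):
        mask = np.ones(len(current), dtype=bool)
        mask[w] = False                    # leave word w out of both vectors
        impacts.append(full - dist(current[mask], lookback[mask]))
    order = np.argsort(impacts)[::-1]      # highest impact first
    return order[:top]
```
</preformat>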
      </sec>
      <sec id="sec-4-4">
        <title>3.4. Llama as a change interpreter</title>
        <p>
          With the topical changes model, we are able to identify not only points in time where a change point in
the topics of our corpus occurred, but also which topics are affected by this change. However, as topics
are abstract constructs consisting of word distributions, the model can only give us abstract information
about what changed. To gather more information, we further analyze our documents using a large
language instruction model with great language understanding and summarization abilities. As we are
handling copyrighted texts, we will not use a commercial model such as GPT [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ], but rather an open
source model that runs on a local machine. We use the Llama 3.1 8B model, an instruction model designed
by Meta with 8 billion parameters [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ].
        </p>
        <p>We first narrow down the number of potential documents to give our LLM as input. Given a
change point in topic k between the time chunks t and t − 1, we employ a filtering strategy that looks
for change-related documents to feed into our LLM.</p>
        <p>For this, we use the leave-one-out word impacts native to the Topical Changes model as a foundation.
We count the number of occurrences of the words found to be significant by the leave-one-out
measure in each document in time chunk t. We then select those 5 documents with the highest count
of these words and feed them into our model, letting it compare them to the topic of the previous
time chunk, judged from its 10 top words. We do not include documents from time chunk t − 1 here, as
preliminary experiments showed that this “confused” the model. Drawing on the Narrative Policy
Framework, we give our LLM the following instruction, alongside the 5 chosen documents:
## You are an expert journalist. You will be asked to explain why a topical change in a
corpus of news articles has been found and what the change consists of. To fulfill this
task, you will be provided information from other text analysis models such as parts of the
output of a RollingLDA topic model.
## Whenever you are asked to analyze a “narrative”, assume the definition of a narrative
that is laid out in the paper “The Narrative Policy Framework: A Traveler’s Guide to Policy
Stories”. Specifically, respect and apply the following definitory aspects of a narrative:
“The NPF posits that while the content of narratives may vary across contexts, structural
elements are generalizable. For example, the content of a story about fracking told by a
Scottish environmentalist is certainly different from the story told by a right-wing populist
who attacks a public agency in Switzerland. However, these stories share common structural
elements: They take place in a setting, contain characters, have a plot, and often champion a
moral.” Keep in mind that a moral must feature a value judgement. When asked to specify
the moral of a narrative, you must refer to this value judgement or note that there is no
moral and thus no narrative! A narrative change must satisfy the four structural criteria,
while a content change can simply be caused by an event that shifts the focus of the topic
without a clear narrative. Your goal is to determine if a narrative change occurred or if it
was a mere content change.
## Please explain an apparent change within a RollingLDA topic that has occurred in [date]
## The following topic top words might give you an idea of what the topic was about before
the change: [10 top words of the topic in chunk t − 1]
## The following topic top words might give you an idea of what the topic was about after
the change: [10 top words of the topic in chunk t]
## The following words were found to be significant to the detected change: [leave-one-out
word impacts]
## The following are those articles from the period that make the most use of the words
found to be significant to the detected change: [Filtered articles]
## Provide your output in a strict JSON format. First, summarize each article in one sentence:
{“summaries”: [{“article_1”: ...}, {“article_2”: ...}, ...]}. Then formulate what the topic was
about before and after the change based on the topic top words, emphasizing the changes
induced to the topic, judged by the articles and the change words: “topic_change”: ...
Explain how this change in topic indicates a shift in narrative. How did the narrative shift?
“narrative_before”: “Before the change, the narrative centered around ...”, “narrative_after”:
“After the change, the narrative centers around ...”. Finally, walk through the four structural
criteria that true narratives must satisfy according to the Narrative Policy Framework and
confirm or disconfirm their existence in the narrative after the break by briefly naming
what they are in the texts provided {“narrative_criteria”: [{“setting”: ...}, {“characters”: ...},
{“plot”: ...}, {“moral”: ...}]}. Make sure to specify the exact source of the moral judgement
that you may have found. Lastly, make a final judgement if there is a narrative shift to be
found with {“true narrative”: True/False}. Do not answer in anything but JSON.</p>
        <p>This filtering strategy enables us to specifically observe documents that the model found to be
significant to the change. However, it does not capture a larger picture of the topic itself and might lead
the model to focus too much on a few significant words. This can happen if many words are significant
to the change with similar intensity, not just those captured by the leave-one-out word impacts, which
might indicate an abrupt and broad topic change that shook the entire word-topic
distribution.</p>
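        <p>The document selection described above can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions: the function name and the token-list input format are hypothetical, not part of the ttta implementation.
```python
from collections import Counter

def select_candidate_docs(docs, change_words, n_docs=5):
    # docs: mapping of document id -> list of tokens in time chunk t
    # change_words: words flagged by the leave-one-out word impacts
    change_words = set(change_words)
    scored = []
    for doc_id, tokens in docs.items():
        counts = Counter(tokens)
        # score a document by how often it uses the flagged words
        score = sum(counts[w] for w in change_words)
        scored.append((score, doc_id))
    # keep the n_docs documents that use the flagged words the most
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:n_docs]]
```
The selected documents are then inserted into the prompt in place of [Filtered articles].</p>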
        <p>We also tested different filtering strategies, such as providing the LLM with documents that are
particularly representative of the topic (i.e., documents with the highest share of the affected topic) in both time
chunks t − 1 and t, but they generally yielded worse results. We aim to refine our prompting
to further optimize our pipeline in the future.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4. Evaluation</title>
      <p>
        We use Python 3.9 to run our scripts. We will publish the code corresponding to this paper as a part of
the Tools for Temporal Text Analysis (ttta) Python package[
        <xref ref-type="bibr" rid="ref38">38</xref>
        ].
      </p>
      <p>We evaluate our model on a Wall Street Journal data set containing 795,800 articles dating
from 01/01/2009 to 12/31/2023. This high document count allows us to use small time chunks for
our RollingLDA analysis. We choose monthly time chunks to enable a fine-grained analysis while
ensuring that the number of detected changes remains manageable for annotators. We set the
memory to the last four months, thus enabling RollingLDA to remember quarterly
trends in the data. To provide a stable initial LDA model and to incorporate all yearly trends into that
initial model, we choose a warm-up period of 12 months. As a trade-off between computational
efficiency and model reliability, we conduct 10 LDA runs to determine an LDAPrototype. We generated
the initial model multiple times with {20, 30, ..., 100} topics and settled on 50 topics. Our
Topical Changes model is then run with a look-back window of 4 months (again to capture
quarterly trends). We use a mixture parameter of 95% and evaluate the similarities at a significance
level of 0.01 to control the severity of changes that are flagged as major changes. We choose
500 bootstrap samples to generate the bootstrap percentiles. For our LLM, we set the temperature
parameter to 0 in the hope of minimizing hallucinations.</p>
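      <p>The detection step can be illustrated with a simplified sketch: compare a topic’s word-count vectors in consecutive chunks via cosine similarity and flag a change when the observed similarity falls below a low bootstrap percentile. This is a toy version for illustration only; the actual Topical Changes test additionally involves the mixture parameter and the look-back window, which we omit here.
```python
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def detect_change(counts_now, counts_prev, n_boot=500, alpha=0.01, seed=0):
    # counts_now / counts_prev: word-count vectors of one topic in
    # consecutive time chunks (same vocabulary order)
    rng = np.random.default_rng(seed)
    observed = cosine(counts_now, counts_prev)
    # bootstrap: resample chunks of the same size from the pooled word
    # distribution to see which similarities occur by chance alone
    pooled = counts_now + counts_prev
    p = pooled / pooled.sum()
    n = int(counts_now.sum())
    sims = [cosine(rng.multinomial(n, p), counts_prev) for _ in range(n_boot)]
    # the alpha-percentile of the bootstrap similarities is the dynamic threshold
    threshold = np.percentile(sims, 100 * alpha)
    return observed < threshold
```
In the plots discussed below, the observed similarity corresponds to the blue line and the bootstrap threshold to the orange line.</p>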
      <p>Our Topical Changes model found a total of 68 changes across 156 time chunks. In Figure 1, each
topic is represented by its overall top words across the entire corpus in its title for interpretability.
Each line plot includes a blue line, representing the word vector similarity from one chunk to
the next, while the orange line signals the dynamic threshold calculated using bootstrap samples. A
change is detected when the blue line falls below the orange line, marked by a vertical red line. At
these change points, we filter the documents for candidates to feed into our LLM to check
for a narrative shift.</p>
      <p>Three expert annotators discussed each change, annotated it according to the Narrative Policy
Framework, and afterwards classified the answers of the Large Language Model. They found 37 of the
68 changes to contain narrative rather than mere content shifts. To illustrate Llama’s performance, we show three
samples of detected changes along with the LLM’s narrative shift evaluations in the appendix. We
will release the list of annotated narratives in a GitHub repository upon publication of this paper.</p>
      <p>
        We split our evaluation into two parts: We see the binary classification between a narrative shift and
a content shift as a first step. In the second step, we check whether Llama correctly categorized aspects of the
existing narratives according to the NPF. The results show that the LLM does not perform particularly
well in classifying the changes into content shifts and narrative shifts, as it finds a narrative in 60 of
the 68 cases, i.e., all but eight. This results in an accuracy score of 57.35% and an F1 score of 0.7010. Since the task
involves Llama reproducing all four aspects of narratives laid out by the NPF, this misclassification is likely
caused by hallucinatory behavior, which is a well-known tendency of LLMs [
        <xref ref-type="bibr" rid="ref39">39</xref>
        ]. The LLM hallucinates
when trying to give a satisfactory output – i.e., provide the user with aspects of narratives according to the
NPF – at all times, resulting in a large false positive rate.
      </p>
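      <p>The reported scores can be reproduced from the confusion matrix implied by these aggregates (68 change points, 37 actual narrative shifts, 60 narrative predictions, accuracy 57.35%). Note that the per-cell counts below are derived from those aggregates rather than reported directly.
```python
# Confusion-matrix counts derived from the reported aggregates
# (68 change points, 37 actual narrative shifts, 60 predicted narratives,
# accuracy 57.35%); they are not stated directly in the text.
tp, fp = 34, 26  # predicted narrative shift: 60 cases in total
fn, tn = 3, 5    # predicted content shift: 8 cases in total

total = tp + fp + fn + tn            # 68 change points
accuracy = (tp + tn) / total
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy = {accuracy:.4f}")  # 0.5735
print(f"f1 = {f1:.4f}")              # 0.7010
```
</p>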
      <p>The model does, however, perform better at explaining an existing narrative shift: in those cases,
it accurately characterizes the narrative in 31 out of 37 cases (83.78%).</p>
      <p>Overall, the LLM performs well when a narrative shift exists for a given change point, accurately
applying the NPF definition to capture the narrative. However, it does not perform as well when a mere
content change occurs, stretching the defining aspects of the NPF and thus hallucinating narratives. An
improved prompt or an additional filtering step could help solve this issue in future research.</p>
    </sec>
    <sec id="sec-6">
      <title>5. Summary</title>
      <p>When interpreting political or economic events, people align the corresponding information with
their internal world view, combining the two into a narrative. Media narratives have become a major
research topic in recent years due to the rise of spreaders of “simple”, often populist narratives. While
recent advances in Natural Language Processing, namely the emergence of LLMs, have improved
narrative extraction from texts thanks to their language understanding capabilities, these models are
resource intensive. Thus, using them to label narratives in large corpora is often not feasible.</p>
      <p>We introduce a novel pipeline that combines the scalability of the dynamic topic model RollingLDA
and its extension Topical Changes with the language understanding of the LLM Llama 3.1 8B. The topic
model is used to detect changes within a corpus of The Wall Street Journal dating from 2009 to 2023,
modeled with 50 topics. We use two filtering strategies to identify documents that contain information about the
nature of the detected change or the before-and-after of the topic. These texts are then fed to Llama to
analyze the change according to the Narrative Policy Framework and to determine whether a mere shift in
content or a shift in narrative took place.</p>
      <p>We processed the articles in monthly time chunks and detected 68 changes within 13 years, as one
year served as a warm-up period. After manually labeling the changes, we find that 37 of those 68
changes show signs of narrative shifts. While our LLM managed to distinguish content shifts from
narrative shifts only 57.35% of the time due to hallucinatory behavior, it correctly explained a narrative
shift, if one exists, 83.78% of the time. The incorrect narrative shift detections stem from the model’s
preference to report a narrative shift for every input rather than considering the option of
a change that is not caused by a narrative shift. While this result might improve with more careful
prompting, it indicates that the language understanding capabilities of an LLM are sufficient
to properly fit an existing narrative shift into a given complex definition, yet the model may fail not because of
insufficient language understanding, but rather due to an inclination to oversatisfy the user’s prompt.</p>
      <p>While we treated narratives as purely binary cases in this paper, we plan to perform a more nuanced
evaluation of narrative extraction techniques in the future, accounting for the wide range of subjectivity
involved in this complex task.</p>
    </sec>
    <sec id="sec-7">
      <title>Limitations</title>
      <p>While we demonstrate an exemplary usage of our pipeline here, parameter choices such as the size
of the time chunks for RollingLDA and the mixture parameter for Topical Changes can impact the
number and granularity of the detected changes, which could also alter the results we observed in our
study. Furthermore, we did not observe cases in which the Topical Changes model incorrectly detected a
change; we therefore cannot tell how the LLM would react to such a case.</p>
    </sec>
    <sec id="sec-8">
      <title>Acknowledgments</title>
      <p>This study is part of a project of the Dortmund Center for data-based Media Analysis (DoCMA) at TU
Dortmund University. The work was also part of the Narrative Economic Alliance Ruhr (NEAR) project,
supported by the Mercator Research Center Ruhr (MERCUR) with project number Ko-2022-0015.</p>
    </sec>
    <sec id="sec-9">
      <title>Appendix</title>
      <sec id="sec-9-1">
        <title>Sample 1</title>
        <sec id="sec-9-1-1">
          <title>Topic: 29 (healthcare) Date: 10/2014 Leave-one-out word impacts: ebola, drug, africa, worker, dallas</title>
          <p>Narrative shift? : Yes
Hand-coded narrative shift : After a small Ebola outbreak in the U.S. in the month prior and an
appropriate response that condemned that outbreak, the focus of the topic shifts. Instead of the fear of
an outbreak in the U.S., the articles focus on drug tests that give hope and also shift the focus back to
the African continent.</p>
          <p>Response of the LLM:
{
summaries: [
{article_1: “Pharmaceutical companies and health officials are racing to develop and test treatments for
the Ebola virus, with several experimental drugs showing promise in early trials.”},
{article_2: “The US response to the Ebola outbreak has improved significantly since the first case in
Dallas, with officials learning from past mistakes and taking steps to contain the virus.”},
{article_3: “The number of people potentially exposed to Ebola in the US has dropped to 50 from 100,
and officials are working to reassure the public that the virus can be contained.”},
{article_4: “A nurse in Texas has tested positive for Ebola, highlighting the need for hospitals to be
more vigilant in their infection-control procedures and raising concerns about the preparedness of US
hospitals for the disease.”},
{article_5: “The Ebola outbreak in West Africa is being hindered by a lack of qualified staff, with many
health workers in Liberia and other affected countries lacking proper training and equipment to combat
the virus.”} ],
topic_change: “The topic shifted from a focus on the global response to the Ebola outbreak, including
the development of treatments and the US response, to a focus on the challenges of containing the
outbreak in West Africa, particularly the lack of qualified staff and the need for improved
infection-control procedures.”,
narrative_before: “Before the change, the narrative centered around the global response to the Ebola
outbreak, with a focus on the development of treatments and the US response, emphasizing the sense
of urgency and the need for action.”,
narrative_after : “After the change, the narrative centers around the challenges of containing the
outbreak in West Africa, highlighting the difficulties of providing adequate care and the need for
improved infection-control procedures, with a focus on the human cost of the outbreak and the need
for more effective solutions.”,
narrative_criteria: [
{setting: “The setting of the narrative is the Ebola outbreak in West Africa, particularly in Liberia, where
the lack of qualified staff and inadequate infection-control procedures are exacerbating the crisis.”},
{characters: “The characters in the narrative include health workers, patients, and officials, who are
struggling to contain the outbreak and provide adequate care.”},
{plot: “The plot of the narrative revolves around the challenges of containing the outbreak, including the
lack of qualified staff, inadequate infection-control procedures, and the need for improved solutions.”},
{moral: “The moral of the narrative is that the Ebola outbreak highlights the need for improved global
health infrastructure and the importance of prioritizing the health and safety of healthcare workers,
as emphasized by Dr. Frieden, the CDC Director, who stated that ‘preventing transmission requires
scrupulous attention to infection control’.”}
],
true_narrative: true }
LLM evaluation: The LLM correctly detected the narrative shift. It correctly identified that the
focus shifted back to the African continent and to condemning the disease there. It did not focus on the
emotional aspect of “hope vs. fear”, which might be the result of not having been provided documents
from the month prior to the change, resulting in a lack of context.</p>
        </sec>
      </sec>
      <sec id="sec-9-2">
        <title>Sample 2</title>
        <sec id="sec-9-2-1">
          <title>Topic: 36 (Elections and Polls) Date: 10/2014 Leave-one-out word impacts: scotland, scottish, independence, neves, scot</title>
          <p>Narrative shift? : No
Hand-coded narrative shift : The protagonists (and settings) in this topic have hardly anything in
common. Documents deal with Catalonia’s independence movement, Brazil’s presidential election, and
Scotland’s decision to remain in the U.K., among others. There is no such thing as a common narrative
here.</p>
          <p>Response of the LLM:
{ summaries: [
{article_1: “Catalonia’s independence movement remains strong despite a court injunction, with activists
planning to hold a nonbinding referendum on November 9.”},
{article_2: “Aécio Neves, the opposition candidate in Brazil’s presidential election, is gaining momentum
against incumbent President Dilma Rousseff, with polls showing him within striking distance of a
runoff.”},
{article_3: “Marina Silva, the third-place finisher in Brazil’s presidential election, is considering
endorsing Aécio Neves, the opposition candidate, in a bid to defeat President Dilma Rousseff in a runoff.”},
{article_4: “The Brazilian presidential election is too close to call, with Aécio Neves and Dilma Rousseff
neck and neck in the polls, with Neves focusing on the economy and Rousseff maintaining a loyal base
of support.”},
{article_5: “The Bank of England had planned to flood the financial system with cash and hold emergency
auctions to secure banks’ funding if Scotland had voted for independence in its referendum.”} ],
topic_change: “The topic shifted from a discussion of Scottish independence to a discussion of Brazilian
politics and the presidential election, with a focus on the opposition candidate Aécio Neves and his
chances of defeating incumbent President Dilma Rousseff.”,
narrative_before: “Before the change, the narrative centered around the Scottish independence
movement, with a focus on the potential consequences of a ’yes’ vote and the plans of the Bank of England
to maintain financial stability.”,
narrative_after : “After the change, the narrative centers around the Brazilian presidential election,
with a focus on the opposition candidate Aécio Neves and his chances of defeating incumbent President
Dilma Rousseff.”,
narrative_criteria: [
{setting: “The setting is the countries of Scotland and Brazil, with a focus on the political and economic
systems of each country.”},
{characters: “The main characters are the politicians and activists involved in the Scottish independence
movement and the Brazilian presidential election, including Aécio Neves, Dilma Rousseff, and Marina
Silva.”},
{plot: “The plot revolves around the potential consequences of a ’yes’ vote in the Scottish referendum
and the outcome of the Brazilian presidential election, with a focus on the opposition candidate Aécio
Neves and his chances of defeating incumbent President Dilma Rousseff.”},
{moral: “There is no clear moral judgment in the narrative, but the narrative does suggest that the
opposition candidate Aécio Neves is a more viable option for Brazil’s economy and that the incumbent
President Dilma Rousseff’s policies have been ineffective in addressing the country’s economic
problems.”}]
true_narrative: true }
LLM evaluation: The model correctly detected some of the main characters. However, it did not
come to the conclusion that, due to the different settings, the provided documents are more or less
incomparable. The suggested moral may fit a single document, but certainly does not consider all
provided texts adequately.</p>
        </sec>
      </sec>
      <sec id="sec-9-3">
        <title>Sample 3</title>
        <sec id="sec-9-3-1">
          <title>Topic: 8 (US Elections) Date: 01/2016 Leave-one-out word impacts: trump, house, iowa, cruz, sander</title>
          <p>Narrative shift? : Yes
Hand-coded narrative shift : Just before the start of the 2016 Republican primaries, Donald Trump’s
campaign has gained momentum, overtaking his more mainstream competitors through his charisma
and aggressive attacks on competitors, such as Ted Cruz and Marco Rubio, and despite questions
surrounding his character and previous business dealings.</p>
          <p>Response of the LLM:
{ summaries: [
{article_1: “The article discusses Donald Trump’s financial struggles in the 1990s, including his debt
restructuring and bankruptcy, and how he managed to come out of it by taking advantage of his creditors
and using his charisma to negotiate deals.”},
{article_2: “The article discusses the 2016 presidential election and how Donald Trump’s campaign is
gaining momentum, despite his high disapproval ratings and the objections of some voters, and how he
may be able to overcome these obstacles by confronting his opponents and using his charisma to win
over voters.”},
{article_3: “The article discusses the 2016 presidential election and how Ted Cruz and Donald Trump
are battling for the lead in Iowa, with Cruz’s campaign accusing Trump of being too egotistical and
Trump questioning Cruz’s eligibility to be president.”},
{article_4: “The article discusses the 2016 presidential election and how Donald Trump is targeting Ted
Cruz, questioning his eligibility to be president and his evangelical faith, and how Cruz is handling the
attacks with humor and poise.”},
{article_5: “The article discusses the final debate before the Iowa caucuses, where the candidates, minus
Donald Trump, sparred over policy differences and personal attacks, with Ted Cruz and Marco Rubio
engaging in a heated exchange over immigration policy.”} ],
topic_change: “The topic shifted from discussing Donald Trump’s financial struggles and personal
characteristics to focusing on his presidential campaign and the 2016 election, with a particular emphasis
on his rivalry with Ted Cruz and the other Republican candidates.”,
narrative_before: “Before the change, the narrative centered around Donald Trump’s personal
characteristics, such as his financial struggles, his ability to negotiate deals, and his charisma, and how these
traits might make him a successful businessman and president.”,
narrative_after : “After the change, the narrative centers around Trump’s presidential campaign and
his rivalry with other candidates, particularly Ted Cruz, and how he is using his charisma and tactics to
win over voters and gain momentum in the election.”,
narrative_criteria: [
{setting: “The setting of the narrative is the 2016 presidential election, with the context of the Republican
primary and the Iowa caucuses.”},
{characters: “The main characters in the narrative are Donald Trump, Ted Cruz, and the other
Republican candidates, who are portrayed as rivals and opponents in the election.”},
{plot: “The plot of the narrative is the struggle for the Republican nomination, with Trump and Cruz
engaging in a heated rivalry and the other candidates trying to position themselves as alternatives to
the two front-runners.”},
{moral: “There is no clear moral judgment in the narrative, but the story implies that Trump’s ability
to win over voters and gain momentum in the election is a testament to his charisma and tactics, rather
than any inherent qualities or values.”} ]
true_narrative: true }
LLM evaluation: The model correctly identified the main characters, the setting and the plot of
Donald Trump taking the leading role in the primary field. While the model somewhat overstated
the importance of the “moral” implications of this development, all aspects of the NPF were correctly
recognized and summarized.</p>
        </sec>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>OpenAI</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Achiam</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Adler</surname>
            , S. Agarwal, ...,
            <given-names>B.</given-names>
          </string-name>
          <string-name>
            <surname>Zoph</surname>
          </string-name>
          , GPT-4
          <source>Technical Report</source>
          ,
          <year>2024</year>
          . ArXiv:
          <volume>2303</volume>
          .08774 [cs].
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Dubey</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Jauhri</surname>
          </string-name>
          ,
          <string-name>
            <surname>A</surname>
          </string-name>
          . Pandey, ...,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <source>The Llama 3 Herd of Models</source>
          ,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Kutuzov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Øvrelid</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Szymanski</surname>
          </string-name>
          , E. Velldal,
          <article-title>Diachronic word embeddings and semantic shifts: a survey</article-title>
          ,
          <source>in: Proceedings of the 27th International Conference on Computational Linguistics</source>
          , Association for Computational Linguistics, Santa Fe, New Mexico, USA,
          <year>2018</year>
          , pp.
          <fpage>1384</fpage>
          -
          <lpage>1397</lpage>
          . URL: https://aclanthology.org/C18-1117.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J.</given-names>
            <surname>Rieger</surname>
          </string-name>
          ,
          <string-name>
            <surname>K.-R. Lange</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Flossdorf</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Jentsch</surname>
          </string-name>
          ,
          <article-title>Dynamic change detection in topics based on rolling LDAs</article-title>
          ,
          <source>Proceedings of the Text2Story'22 Workshop CEUR-WS</source>
          <volume>3117</volume>
          (
          <year>2022</year>
          )
          <fpage>5</fpage>
          -
          <lpage>13</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J.</given-names>
            <surname>Rieger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Jentsch</surname>
          </string-name>
          ,
          <string-name>
            <surname>J. Rahnenführer,</surname>
          </string-name>
          <article-title>RollingLDA: An Update Algorithm of Latent Dirichlet Allocation to Construct Consistent Time Series from Textual Data, in: Findings of the Association for Computational Linguistics: EMNLP 2021, Association for Computational Linguistics</article-title>
          , Punta Cana, Dominican Republic,
          <year>2021</year>
          , pp.
          <fpage>2337</fpage>
          -
          <lpage>2347</lpage>
          . doi:
          <volume>10</volume>
          .18653/v1/
          <year>2021</year>
          .findings-emnlp.
          <volume>201</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M. D.</given-names>
            <surname>Jones</surname>
          </string-name>
          ,
          <string-name>
            <surname>M. K. McBeth</surname>
          </string-name>
          ,
          <article-title>A narrative policy framework: Clear enough to be wrong?</article-title>
          ,
          <source>Policy studies journal 38</source>
          (
          <year>2010</year>
          )
          <fpage>329</fpage>
          -
          <lpage>353</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>E. A.</given-names>
            <surname>Shanahan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. D.</given-names>
            <surname>Jones</surname>
          </string-name>
          ,
          <string-name>
            <surname>M. K. McBeth</surname>
            ,
            <given-names>C. M.</given-names>
          </string-name>
          <string-name>
            <surname>Radaelli</surname>
          </string-name>
          ,
          <article-title>The narrative policy framework, in: Theories of the policy process</article-title>
          ,
          <source>Routledge</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>173</fpage>
          -
          <lpage>213</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>R. J.</given-names>
            <surname>Shiller</surname>
          </string-name>
          , Narrative economics,
          <source>American economic review 107</source>
          (
          <year>2017</year>
          )
          <fpage>967</fpage>
          -
          <lpage>1004</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>K.</given-names>
            <surname>Eliaz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Spiegler</surname>
          </string-name>
          ,
          <article-title>A model of competing narratives</article-title>
          ,
          <source>American Economic Review</source>
          <volume>110</volume>
          (
          <year>2020</year>
          )
          <fpage>3786</fpage>
          -
          <lpage>3816</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>J.</given-names>
            <surname>Pearl</surname>
          </string-name>
          , Causality, Cambridge University Press,
          <year>2009</year>
          . URL: https://www.ebook.de/de/product/ 20478918/judea_pearl_causality.html.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M.</given-names>
            <surname>Roos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Reccius</surname>
          </string-name>
          , Narratives in economics,
          <source>Journal of Economic Surveys</source>
          <volume>38</volume>
          (
          <year>2024</year>
          )
          <fpage>303</fpage>
          -
          <lpage>341</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>S. R.</given-names>
            <surname>Shenhav</surname>
          </string-name>
          ,
          <article-title>Political narratives and political reality</article-title>
          ,
          <source>International Political Science Review</source>
          <volume>27</volume>
          (
          <year>2006</year>
          )
          <fpage>245</fpage>
          -
          <lpage>262</lpage>
          . doi:
          <volume>10</volume>
          .1177/0192512106064474.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>K.</given-names>
            <surname>Eliaz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Galperti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Spiegler</surname>
          </string-name>
          ,
          <article-title>False narratives and political mobilization</article-title>
          (
          <year>2024</year>
          ). doi:10.1093/jeea/jvae047.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>C.</given-names>
            <surname>Schlaufer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Kuenzler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. D.</given-names>
            <surname>Jones</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. A.</given-names>
            <surname>Shanahan</surname>
          </string-name>
          ,
          <article-title>The narrative policy framework: a traveler's guide to policy stories</article-title>
          ,
          <source>Politische Vierteljahresschrift</source>
          <volume>63</volume>
          (
          <year>2022</year>
          )
          <fpage>249</fpage>
          -
          <lpage>273</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>D.</given-names>
            <surname>Crow</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Jones</surname>
          </string-name>
          ,
          <article-title>Narratives as tools for influencing policy change</article-title>
          ,
          <source>Policy &amp; Politics</source>
          <volume>46</volume>
          (
          <year>2018</year>
          )
          <fpage>217</fpage>
          -
          <lpage>234</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>H.</given-names>
            <surname>Müller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>von Nordheim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Boczek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Koppers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Rahnenführer</surname>
          </string-name>
          ,
          <article-title>Der Wert der Worte - wie digitale Methoden helfen, Kommunikations- und Wirtschaftswissenschaft zu verknüpfen</article-title>
          ,
          <source>Publizistik</source>
          <volume>63</volume>
          (
          <year>2018</year>
          )
          <fpage>557</fpage>
          -
          <lpage>582</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>R. M.</given-names>
            <surname>Entman</surname>
          </string-name>
          ,
          <article-title>Framing: Toward clarification of a fractured paradigm</article-title>
          ,
          <source>Journal of Communication</source>
          <volume>43</volume>
          (
          <year>1993</year>
          )
          <fpage>51</fpage>
          -
          <lpage>58</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>B.</given-names>
            <surname>Santana</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Campos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Amorim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Jorge</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Silvano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Nunes</surname>
          </string-name>
          ,
          <article-title>A survey on narrative extraction from textual data</article-title>
          ,
          <source>Artificial Intelligence Review</source>
          <volume>56</volume>
          (
          <year>2023</year>
          )
          <fpage>8393</fpage>
          -
          <lpage>8435</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>N.</given-names>
            <surname>Benner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.-R.</given-names>
            <surname>Lange</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Jentsch</surname>
          </string-name>
          ,
          <article-title>Named Entity Narratives</article-title>
          , Working Paper 962,
          <source>Ruhr Economic Papers</source>
          ,
          <year>2022</year>
          . doi:10.4419/96973126.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>A.</given-names>
            <surname>Gueta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Feder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Gekhman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Goldstein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Reichart</surname>
          </string-name>
          ,
          <article-title>Can LLMs learn macroeconomic narratives from social media?</article-title>
          ,
          <source>arXiv preprint arXiv:2406.12109</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>E.</given-names>
            <surname>Ash</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Gauthier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Widmer</surname>
          </string-name>
          ,
          <article-title>Relatio: Text semantics capture political and economic narratives</article-title>
          ,
          <source>Political Analysis</source>
          <volume>32</volume>
          (
          <year>2024</year>
          )
          <fpage>115</fpage>
          -
          <lpage>132</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>K.-R.</given-names>
            <surname>Lange</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Reccius</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Schmidt</surname>
          </string-name>
          ,
          <article-title>Towards Extracting Collective Economic Narratives from Texts</article-title>
          ,
          <source>Ruhr Economic Papers</source>
          (
          <year>2022</year>
          ). doi:10.4419/96973127.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Yi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          , et al.,
          <article-title>A survey on evaluation of large language models</article-title>
          ,
          <source>ACM Transactions on Intelligent Systems and Technology</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>W. X.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Hou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Min</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Dong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Du</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-Y.</given-names>
            <surname>Nie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-R.</given-names>
            <surname>Wen</surname>
          </string-name>
          ,
          <article-title>A survey of large language models</article-title>
          ,
          <source>arXiv preprint arXiv:2303.18223</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>S.</given-names>
            <surname>Bubeck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Chandrasekaran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Eldan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gehrke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Horvitz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Kamar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y. T.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Lundberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Nori</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Palangi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. T.</given-names>
            <surname>Ribeiro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <article-title>Sparks of artificial general intelligence: Early experiments with GPT-4</article-title>
          ,
          <source>arXiv preprint arXiv:2303.12712</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>A.</given-names>
            <surname>Macaulay</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <article-title>News media, inflation, and sentiment</article-title>
          ,
          <source>in: AEA Papers and Proceedings</source>
          , volume
          <volume>113</volume>
          , American Economic Association, Nashville, TN
          ,
          <year>2023</year>
          , pp.
          <fpage>172</fpage>
          -
          <lpage>176</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>S.</given-names>
            <surname>Tilly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Livan</surname>
          </string-name>
          ,
          <article-title>Predicting market inflation expectations with news topics and sentiment</article-title>
          ,
          <year>2021</year>
          . arXiv:2107.07155 [cs].
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>C.</given-names>
            <surname>Ziems</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Held</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Shaikh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <article-title>Can large language models transform computational social science?</article-title>
          ,
          <source>Computational Linguistics</source>
          <volume>50</volume>
          (
          <year>2024</year>
          )
          <fpage>237</fpage>
          -
          <lpage>291</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>J.</given-names>
            <surname>Mellon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bailey</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Scott</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Breckwoldt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Miori</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Schmedeman</surname>
          </string-name>
          ,
          <article-title>Do AIs know what the most important issue is? Using language models to code open-text social survey responses at scale</article-title>
          ,
          <source>Research &amp; Politics</source>
          <volume>11</volume>
          (
          <year>2024</year>
          )
          <fpage>20531680241231468</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>F.</given-names>
            <surname>Gilardi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Alizadeh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kubli</surname>
          </string-name>
          ,
          <article-title>Chatgpt outperforms crowd workers for text-annotation tasks</article-title>
          ,
          <source>Proceedings of the National Academy of Sciences</source>
          <volume>120</volume>
          (
          <year>2023</year>
          ). doi:10.1073/pnas.2305016120.
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>K.-R.</given-names>
            <surname>Lange</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Rieger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Benner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Jentsch</surname>
          </string-name>
          ,
          <article-title>Zeitenwenden: Detecting changes in the German political discourse</article-title>
          ,
          <source>Proceedings of the 2nd Workshop on Computational Linguistics for Political Text Analysis@KONVENS2022</source>
          (
          <year>2022</year>
          ). URL: https://old.gscl.org/media/pages/arbeitskreise/cpss/cpss-2022/workshop-proceedings-2022/254133848-1662996909/cpss-2022-proceedings.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>J.</given-names>
            <surname>Rieger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Jentsch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Rahnenführer</surname>
          </string-name>
          ,
          <article-title>LDAPrototype: A model selection algorithm to improve reliability of latent Dirichlet allocation</article-title>
          ,
          <source>PeerJ Computer Science</source>
          <volume>10</volume>
          (
          <year>2024</year>
          )
          <fpage>e2279</fpage>
          . doi:10.7717/peerj-cs.2279.
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>D. M.</given-names>
            <surname>Blei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. Y.</given-names>
            <surname>Ng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. I.</given-names>
            <surname>Jordan</surname>
          </string-name>
          ,
          <article-title>Latent Dirichlet allocation</article-title>
          ,
          <source>J. Mach. Learn. Res.</source>
          <volume>3</volume>
          (
          <year>2003</year>
          )
          <fpage>993</fpage>
          -
          <lpage>1022</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>D. M.</given-names>
            <surname>Blei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. D.</given-names>
            <surname>Lafferty</surname>
          </string-name>
          ,
          <article-title>Dynamic topic models</article-title>
          ,
          <source>in: Proceedings of the 23rd international conference on Machine learning, ICML '06</source>
          ,
          Association for Computing Machinery, New York, NY, USA,
          <year>2006</year>
          , pp.
          <fpage>113</fpage>
          -
          <lpage>120</lpage>
          . doi:10.1145/1143844.1143859.
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>X.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.-Y.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. L.</given-names>
            <surname>Tseng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-T.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <article-title>Modeling and predicting personal information dissemination behavior</article-title>
          ,
          <source>in: Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining, KDD '05</source>
          , Association for Computing Machinery, New York, NY, USA,
          <year>2005</year>
          , pp.
          <fpage>479</fpage>
          -
          <lpage>488</lpage>
          . doi:10.1145/1081870.1081925.
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>McCallum</surname>
          </string-name>
          ,
          <article-title>Topics over time: a non-Markov continuous-time model of topical trends</article-title>
          ,
          <source>in: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '06</source>
          , Association for Computing Machinery, New York, NY, USA,
          <year>2006</year>
          , pp.
          <fpage>424</fpage>
          -
          <lpage>433</lpage>
          . doi:10.1145/1150402.1150450.
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [37]
          <string-name>
            <given-names>C.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Blei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Heckerman</surname>
          </string-name>
          ,
          <article-title>Continuous time dynamic topic models</article-title>
          ,
          <source>in: Proceedings of the Twenty-Fourth Conference on Uncertainty in Artificial Intelligence</source>
          , UAI'08, AUAI Press, Arlington, Virginia, USA,
          <year>2008</year>
          , pp.
          <fpage>579</fpage>
          -
          <lpage>586</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          [38]
          <string-name>
            <given-names>K.-R.</given-names>
            <surname>Lange</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Benner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Grönberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hachcham</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Kolli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Rieger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Jentsch</surname>
          </string-name>
          ,
          <article-title>ttta: Tools for temporal text analysis</article-title>
          ,
          <year>2025</year>
          . URL: https://arxiv.org/abs/2503.02625. arXiv:2503.02625.
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          [39]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Ji</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Frieske</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Su</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Ishii</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y. J.</given-names>
            <surname>Bang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Madotto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Fung</surname>
          </string-name>
          ,
          <article-title>Survey of hallucination in natural language generation</article-title>
          ,
          <source>ACM Computing Surveys</source>
          <volume>55</volume>
          (
          <year>2023</year>
          )
          <fpage>1</fpage>
          -
          <lpage>38</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>