<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title></journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Context is Key(NMF): Modelling Topical Information Dynamics in Chinese Diaspora Media</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ross Deans Kristensen-McLachlan</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Rebecca M. M. Hicke</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Márton Kardos</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mette Thunø</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Center for Humanities Computing, Aarhus University</institution>
          ,
          <country country="DK">Denmark</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Computer Science, Cornell University</institution>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Department of Global Studies, Aarhus University</institution>
          ,
          <country country="DK">Denmark</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Department of Linguistics, Cognitive Science, and Semiotics, Aarhus University</institution>
          ,
          <country country="DK">Denmark</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <volume>000</volume>
      <fpage>0</fpage>
      <lpage>0001</lpage>
      <abstract>
        <p>Does the People's Republic of China (PRC) interfere with European elections through ethnic Chinese diaspora media? This question forms the basis of an ongoing research project exploring how PRC narratives about European elections are represented in Chinese diaspora media, and thus the objectives of PRC news media manipulation. In order to study diaspora media efficiently and at scale, it is necessary to use techniques derived from quantitative text analysis, such as topic modelling. In this paper, we present a pipeline for studying information dynamics in Chinese media. Firstly, we present KeyNMF, a new approach to static and dynamic topic modelling using transformer-based contextual embedding models. We provide benchmark evaluations to demonstrate that our approach is competitive on a number of Chinese datasets and metrics. Secondly, we integrate KeyNMF with existing methods for describing information dynamics in complex systems. We apply this pipeline to data from five news sites, focusing on the period of time leading up to the 2024 European parliamentary elections. Our methods and results demonstrate the effectiveness of KeyNMF for studying information dynamics in Chinese media and lay groundwork for further work addressing the broader research questions.</p>
      </abstract>
      <kwd-group>
        <kwd>novelty</kwd>
        <kwd>contextual topic models</kwd>
        <kwd>Chinese</kwd>
        <kwd>information dynamics</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Much digital ink is spilled on elections in Western media as the various electorates
determine their preferences before elections and digest the fallout afterwards. Moreover, a
significant part of this media coverage is fundamentally persuasive, aiming to convince voters
to bet on the candidate who most closely aligns with the social and economic ideology of the
media outlets and their owners [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. Likewise, coverage of these elections is not limited to
European media institutions, with media outlets around the world updating their readership
on how these elections impact them.
      </p>
      <p>
        In this context, one particular type of media stands out as especially interesting: ethnic
Chinese media targeting diaspora communities in Europe, a group which by some estimates
comprises around 1.5-3 million individuals. These media outlets are potentially invaluable
sources for understanding how the Chinese government and the Chinese Communist Party
(CCP) attempt to influence the diaspora. Furthermore, studying these outlets potentially
provides unique insights into how China views itself in relation to the West by showing how the
PRC presents itself to its diaspora groups. A growing body of literature has already begun
to address these questions in the context of social media [28, 29] or in terms of digital
infrastructure more generally [
        <xref ref-type="bibr" rid="ref10 ref11 ref16">10, 11</xref>
        ]. In ongoing research, our aim is to assess whether Chinese
diaspora news sources intend to impact opinions on elections in the West during 2024. We
attempt to understand the control of information flow in Chinese diaspora media and how this
control is used to set specific agendas during electoral periods: promoting certain political
parties or individual candidates, polarizing citizens, and attacking or promoting specific political
positions.
      </p>
      <p>To pursue this research, we design a pipeline for analyzing large amounts of
Chinese-language news data. First, we introduce KeyNMF, a novel approach to creating
context-sensitive topic models via transformer-based encoder models. KeyNMF can be trivially
applied across different languages and in data-scarce environments, and is shown here to create
coherent, human-interpretable outputs when working with Chinese language data. We then
integrate KeyNMF with existing techniques for describing the information dynamics of
complex systems which measure the novelty and resonance of information present in a system over
time. We use this pipeline to perform preliminary analysis on our dataset of Chinese diaspora
media, finding clear trends in the novelty and resonance signals which correlate with
significant political events. The results presented are thus intended to be both a proof of concept and
a stepping stone towards more meaningful understanding of the dynamics underlying Chinese
diaspora media.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <sec id="sec-2-1">
        <title>2.1. Information Dynamics</title>
        <p>The study of information dynamics in complex cultural systems has been a central aspect of
research in computational humanities and cultural analytics in recent years. One of the most
promising approaches to this problem was introduced in [3], which studied the shifting
debates which took place during the French Revolution. In this approach, divergence in content
between different time slices can be calculated using information-theoretic measures. These
measures can then be used to quantify two interrelated values: the novelty of the system, or
how much the new time slice diverges from preceding time slices; and the resonance of this
information, which describes how information persists over time.</p>
        <p>
          Novelty-resonance patterns have been studied in a number of different discourse domains.
[
          <xref ref-type="bibr" rid="ref22">21</xref>
          ] demonstrate their usefulness in identifying so-called trend reservoirs on Reddit. Similar
interaction patterns between novelty and resonance have been successfully employed to study
the manner in which online news media responded to catastrophic events [
          <xref ref-type="bibr" rid="ref20 ref21">18, 20, 19</xref>
          ]. In [31],
the same fundamental method of analysis demonstrates that novelty-resonance patterns clearly
track major social and historical events in the 20th century, using data taken from the front
page of Dutch newspapers.
        </p>
        <p>Calculating these underlying dynamics requires the creation of some kind of numerical
representation of the data. Specifically, the difference between individual windows is computed
by finding the windowed relative entropy, in this case calculated using Jensen-Shannon
Divergence (JSD). Since JSD computes the distance between probability distributions, the numerical
representations of the data are required to take that form. In [2], this was achieved by
calculating the probabilities of a pre-trained, BERT-based emotion classification model, where the
predicted probabilities for each label created a distribution over emotions for each document.
However, for most purposes, novelty and resonance are calculated based on distributions
generated by a probabilistic topic model.</p>
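        <p>As an illustration of the divergence measure underlying these calculations, the Jensen-Shannon divergence between two topic distributions can be computed with SciPy. This is a minimal sketch with invented toy distributions, not code from the study:</p>

```python
# Minimal illustration of comparing two topic distributions with Jensen-Shannon
# divergence; the two distributions below are invented toy values.
import numpy as np
from scipy.spatial.distance import jensenshannon

p = np.array([0.7, 0.2, 0.1])   # topic distribution in one time slice
q = np.array([0.1, 0.2, 0.7])   # topic distribution in another time slice

# SciPy returns the JS *distance* (the square root of the divergence),
# so the divergence itself is the squared value.
jsd = jensenshannon(p, q, base=2) ** 2
print(round(jsd, 4))
```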
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Vanilla LDA</title>
        <p>Typically, novelty and resonance are calculated from topic probability distributions extracted
by Latent Dirichlet Allocation (LDA) [9, 7]. Topic distributions in documents are a natural
choice for information dynamics, as they are immediately usable with entropy-based measures.
LDA is a generative bag-of-words model, which assumes that a document contains a mixture
of topics and all words in the document are drawn from this mixture distribution.</p>
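        <p>As a brief illustration of why such models suit entropy-based measures, a fitted LDA model yields one probability distribution over topics per document. The snippet below is a toy sketch using scikit-learn, not the models used in this study; the corpus is invented:</p>

```python
# Toy sketch: LDA produces per-document topic distributions (rows sum to 1),
# which is exactly the form required by entropy-based information dynamics.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "election vote party parliament",
    "beijing visit diplomacy china",
    "election party candidate vote vote",
]
counts = CountVectorizer().fit_transform(docs)   # bag-of-words counts
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)

theta = lda.transform(counts)   # document-topic distributions, one row per document
```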
        <p>
          However, LDA has a number of well-known shortcomings. Documents have to be heavily
pre-processed for optimal results; otherwise, the topic descriptions produced by the model are
often contaminated by noise and stop words [16]. In addition, since LDA makes the
bag-of-words assumption, it cannot utilize contextual and syntactic information, nor general
properties of natural language learned from outside sources. Finally, LDA is sensitive to
hyperparameter choices and Wallach, Mimno, and McCallum [30] demonstrate that using symmetric
Dirichlet priors, which is the case in canonical implementations [
          <xref ref-type="bibr" rid="ref23 ref5">5, 22</xref>
          ] and the majority of
academic studies, can lead to sub-optimal performance.
        </p>
        <p>
          There have also been challenges to the generalizability of LDA from the perspective of
Chinese NLP, as the primary structural and semantic unit of Chinese is the character rather than
the word [
          <xref ref-type="bibr" rid="ref25 ref27">32, 24</xref>
          ]. While these concerns might be overstated, working with Chinese language
data causes specific challenges in terms of tokenization and semantics which directly impact
the efficacy of traditional LDA approaches to topic modelling.
        </p>
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Alternatives to LDA</title>
        <p>A major shortcoming of LDA when trying to model change over time is that topics are
calculated over all documents, essentially flattening any temporal aspect of the data. This is
undesirable, since topics themselves naturally evolve over time, meaning that LDA may not reflect
the true dynamics of a system. This issue is partly rectified by dynamic topic models [8]
which account for temporal changes in topics with a state-space model. However, Dynamic
LDA models are even more parameter-rich than the vanilla implementation and thus amplify
its limitations.</p>
        <p>
          Recently, contemporary topic models have shown that it is possible to utilize embeddings
from sentence transformers [27] to infuse contextual information into topic models and
to allow for transfer learning [
          <xref ref-type="bibr" rid="ref1 ref5 ref6">5, 6, 14, 1, 16</xref>
          ]. This contextual information can lead to more
coherent and semantically interpretable topics. In addition, since these models draw on existing
pre-trained language models, they do not require training a generative model from scratch.
This means that it is possible to train topic models in data-scarce contexts where traditional
LDA might perform poorly.
        </p>
        <p>
          Among the most popular of these contemporary models is BERTopic [
          <xref ref-type="bibr" rid="ref2">14</xref>
          ], which also has
dynamic modelling capabilities. In this model, topic-term importances are estimated post-hoc on
pre-defined time slices based on one underlying topic model. However, as with LDA, BERTopic
is sensitive to pre-processing [16]. Additionally, because BERTopic is a clustering topic model,
documents are only assigned a single topic label. This renders the model impractical in settings
where documents are expected to contain multiple topics and means that BERTopic is not
suitable for calculating novelty and resonance, since the entropy calculations assume probability
distributions over documents.
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. KeyNMF</title>
      <p>
        We propose KeyNMF, a novel topic modelling approach that utilizes neural text embeddings.
KeyNMF builds on the reliability, stability [4], scalability [
        <xref ref-type="bibr" rid="ref18">17</xref>
        ], and interpretability of
Nonnegative Matrix Factorization (NMF) [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], while mitigating its sensitivity to pre-processing
and making use of contextual information in texts. This is achieved by: 1) computing keyword
importances from documents with contextual embeddings (similar to KeyBERT [
          <xref ref-type="bibr" rid="ref5">15</xref>
        ]); and 2)
decomposing those importances with NMF.
      </p>
      <p>We release an implementation of KeyNMF as part of the Turftopic Python package.1</p>
      <sec id="sec-3-1">
        <title>3.1. Model Description</title>
        <p>KeyNMF operationalizes topic extraction as the following steps:
1. For each document d:
a) Let e_d be the document’s embedding produced with an encoder model.
b) Let e_t be the word embedding of a word t produced with the same encoder model.
c) Let K_d be the set of N keywords in d with the highest cosine similarity to e_d:
K_d = arg max_{K*} Σ_{t ∈ K*} sim(e_d, e_t), where |K_d| = N and t ∈ d.
2. Arrange the keyword similarities into a non-negative keyword matrix M. Let M_dt be
the importance of keyword t in document d:
M_dt = sim(e_d, e_t) if t ∈ K_d and sim(e_d, e_t) &gt; 0; otherwise M_dt = 0.
3. Decompose M with non-negative matrix factorization: M ≈ WH, where W is the
document-topic matrix, and H is the topic-term matrix. This is achieved with coordinate
descent, minimizing the square loss L(W, H) = ||M − WH||².</p>
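        <p>The steps above can be sketched as follows. This is an illustrative toy implementation, not the Turftopic code: random vectors stand in for encoder embeddings, and the vocabulary is invented.</p>

```python
# Toy sketch of KeyNMF: keyword extraction by embedding similarity, then NMF.
# Random vectors stand in for sentence-transformer embeddings.
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
vocab = ["election", "vote", "party", "china", "beijing", "visit"]
word_emb = rng.random((len(vocab), 8))   # stand-in word embeddings e_t
doc_emb = rng.random((3, 8))             # stand-in document embeddings e_d
N = 2                                    # keywords kept per document

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Steps 1-2: build the non-negative keyword matrix M
M = np.zeros((len(doc_emb), len(vocab)))
for d, e_d in enumerate(doc_emb):
    sims = np.array([cosine(e_d, e_t) for e_t in word_emb])
    top = np.argsort(sims)[-N:]              # N keywords most similar to the document
    M[d, top] = np.clip(sims[top], 0, None)  # negative similarities are zeroed

# Step 3: decompose M ≈ WH (scikit-learn's NMF uses coordinate descent
# on the squared Frobenius loss by default)
model = NMF(n_components=2, init="nndsvda", random_state=0)
W = model.fit_transform(M)   # document-topic matrix
H = model.components_        # topic-term matrix
```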
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Dynamic KeyNMF</title>
        <p>KeyNMF can be used for modelling topics’ evolution in a corpus over time. This is done by
first computing a global model over the entire corpus, then calculating time-specific topic-term
importances in predefined time slices. Specifically:
1. Compute the keyword matrix M for the whole corpus.
2. Decompose M with non-negative matrix factorization: M ≈ WH.
3. For each time slice t:
a) Let M_t be the rows of M and W_t the rows of W corresponding to the documents in time slice t.
b) Obtain the topic-term matrix H_t for t with NMF while fixing W_t: H_t = arg min_{H*} ||M_t − W_t H*||².
c) The temporal importance of topic z is then s_tz = Σ_d (W_t)_dz, where all d are
documents in time slice t. We can obtain pseudo-topic distributions in the time slices
by L1-normalizing the temporal importances: ŝ_tz = s_tz / Σ_z s_tz.</p>
        <p>Since NMF is not a probabilistic model, we use temporal pseudo-probabilities as a proxy for
topic distributions.</p>
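        <p>The slice-wise refitting in step 3 can be sketched as follows. This is a toy illustration, not the released implementation: random matrices stand in for the keyword and document-topic matrices, and the fixed-W least-squares problem is solved column-wise with non-negative least squares.</p>

```python
# Toy sketch of dynamic KeyNMF: per time slice, refit the topic-term matrix H_t
# while holding the global document-topic rows W_t fixed, then derive
# L1-normalized temporal topic importances.
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(1)
M = rng.random((6, 5))   # keyword matrix for the whole corpus (docs x terms)
W = rng.random((6, 2))   # document-topic matrix from the corpus-level NMF
slices = [np.array([0, 1, 2]), np.array([3, 4, 5])]   # document indices per slice

for t, idx in enumerate(slices):
    M_t, W_t = M[idx], W[idx]
    # H_t = argmin_H ||M_t - W_t H||^2 with H >= 0, solved column by column
    H_t = np.column_stack([nnls(W_t, M_t[:, j])[0] for j in range(M_t.shape[1])])
    # temporal topic importances: sum W_t over the slice's documents, L1-normalize
    s_t = W_t.sum(axis=0)
    s_hat = s_t / s_t.sum()
```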
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Performance</title>
        <p>
          To demonstrate KeyNMF’s effectiveness as a topic model, we evaluate its performance using
the topic-benchmark Python package and the paraphrase-multilingual-MiniLM-L12-v2 embedding
model (https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2).
15 keywords are extracted for each document. Our evaluation procedure is based on that of
Kardos, Kostkan, Vermillet, Nielbo, Enevoldsen, and Rocca [
          <xref ref-type="bibr" rid="ref6">16</xref>
          ], but, since our intended use case is Chinese news data, we ran the benchmark using the
same corpora and pipeline as in our investigations (see Sections 4 and 5). Additionally, we
utilized paraphrase-multilingual-MiniLM for measuring external word embedding coherence,
instead of an English Word2Vec model. (This gives Top2Vec an unfair advantage on this metric,
as it selects descriptive words based on the same criteria as the metric; Top2Vec’s scores should
thus be interpreted with caution.)
        </p>
        <p>[Table 1: KeyNMF’s performance on Chinese news data against a number of baselines. Topic
descriptions were evaluated on diversity, internal, and external word embedding coherence.]</p>
        <p>KeyNMF was only outperformed by Top2Vec on most corpora, a model which explicitly selects
words based on their proximity in semantic space (see Table 1). KeyNMF represents a drastic
improvement over classical topic models, outperforming both NMF and LDA significantly,
indicating that the contextual information infused into the model enhances its performance in a
meaningful way.</p>
        <sec id="sec-3-3-1">
          <title>3.3.1. Sensitivity to Number of Keywords</title>
          <p>We additionally test whether the number of keywords extracted from a text influences the
model’s performance on different corpora, which allows us to determine KeyNMF’s robustness
to hyperparameter choices. We used the same news sources, pipeline, and quantitative metrics
for evaluating this property of the model as for previous evaluations and analyses. The number
of keywords was varied from 5 to 100 with a step size of 5 (see Figure 1).</p>
          <p>We observed that performance was relatively stable regardless of the number of keywords,
and converged rather quickly. Only minimal fluctuations are observable with N &gt; 25 on most
corpora. However, on Xinozhou and Yidali-Huarenjie, lower values of N (5-15) resulted in
higher coherence scores. We thus deem 15 keywords a balanced choice of N for further
investigations.</p>
          <p>[Figure 2: Total and unique articles collected by site.]</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Data</title>
      <p>[Figure 3: Number of new articles at each time point for each site (Xinouzhou, Oushinet,
Yidali Huarenjie, Chinanews, Ihuawen).]</p>
      <p>[…] the amount of boilerplate text (e.g. bylines and publication dates) included in the extracted
texts; although it is impossible to remove all such text from our dataset, a hand analysis of ten
random articles from each news site indicates that the amount of ‘junk’ text included in the
final dataset is minimal.</p>
      <p>The total and unique number of articles collected from each site are reported in Figure 2.
It is clear that different sites follow different publication patterns. To further validate this,
we examine the number of ‘new’ articles at each time point for each source, or the number
of articles that were not included in the last scrape (Figure 3). We see that some sites, like
Xinouzhou and Yidali Huarenjie, frequently refresh the articles displayed on their main pages,
leading to a larger number of unique articles. In contrast, sites like Ihuawen appear to keep
several articles on the main pages for a long time, meaning that they display a very small
number of unique articles overall. These differences likely affect the patterns we see in the
information systems for each source.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Experimental Design</title>
      <p>
        Extracted article texts are embedded with a multilingual transformer-based model [
        <xref ref-type="bibr" rid="ref6">26</xref>
        ] (paraphrase-multilingual-MiniLM-L12-v2) using the Sentence Transformers library
(https://sbert.net). The embedding is done entirely on a 64-core CPU with 384GB RAM. Each
document is embedded once for each time it appears in the dataset. In total, embedding all
the documents takes ∼2 hours. The maximum sequence length of this embedding model is 128
tokens. Thus, any article longer than 128 tokens is truncated and information from later in the
piece is not included in the embedding. Although this is a limitation, we do not consider it
prohibitive, as previous research has shown that the bulk of the content in a news article is
presented at the very beginning, a widely-practiced professional standard for journalistic
writing known as the inverted pyramid [
        <xref ref-type="bibr" rid="ref24">23</xref>
        ].
      </p>
      <p>Since our primary interest is understanding the evolution of information dynamics in each
news site over time, we use Dynamic KeyNMF to find topic proportions for each timeslice.
For keyword extraction, we utilize the jieba tokenizer and remove stop-words present in an
authoritative list (https://github.com/stopwords-iso/stopwords-zh/blob/master/stopwords-zh.txt),
with the retained tokens then encoded using the same multilingual model as
was used on the documents [26]. We fit multiple models with 10, 25, and 50 topics respectively
in order to investigate topical dynamics at multiple levels of granularity. Separate models are fit
for each news site. The plotted topics over time, top keywords for each topic at each timeslice,
and topic distributions at each timeslice are extracted from each model and saved for further
analysis.</p>
      <p>
        We then use the topic pseudo-distributions to measure the novelty and resonance signals for
each news site and, following [
        <xref ref-type="bibr" rid="ref21">20</xref>
        ] and [2], use windowed relative entropy with Jensen-Shannon
divergence to calculate both metrics. For a window of size w, the novelty at time point t is the
mean entropy of the topic pseudo-distribution at t (ŝ_t) and the w previous pseudo-distributions.
The transience at time point t is the mean entropy of the topic pseudo-distribution at t and the
w subsequent pseudo-distributions. Then, the resonance of a time point is the novelty at that
point minus the transience. We use a window of size 12 when calculating both signals, which
is equivalent to three days of data.
      </p>
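      <p>Under the definitions above, the windowed novelty, transience, and resonance signals can be sketched as follows. This is an illustrative reimplementation, not the released analysis code: the pseudo-distributions are random toy values and the window is small for demonstration.</p>

```python
# Sketch of windowed novelty / transience / resonance over topic
# pseudo-distributions, using squared Jensen-Shannon distance as the divergence.
import numpy as np
from scipy.spatial.distance import jensenshannon

rng = np.random.default_rng(0)
dists = rng.dirichlet(np.ones(10), size=30)  # one pseudo-distribution per time point
w = 3                                        # window size (the paper uses 12)

def novelty_resonance(dists, w):
    T = len(dists)
    novelty = np.full(T, np.nan)
    transience = np.full(T, np.nan)
    for t in range(w, T - w):
        # mean divergence from the w preceding distributions
        novelty[t] = np.mean(
            [jensenshannon(dists[t], dists[t - k]) ** 2 for k in range(1, w + 1)])
        # mean divergence from the w subsequent distributions
        transience[t] = np.mean(
            [jensenshannon(dists[t], dists[t + k]) ** 2 for k in range(1, w + 1)])
    resonance = novelty - transience   # novelty minus transience at each point
    return novelty, transience, resonance

novelty, transience, resonance = novelty_resonance(dists, w)
```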
      <p>
        We apply nonlinear adaptive filtering to smooth the extracted novelty and resonance, again
following [
        <xref ref-type="bibr" rid="ref21">20</xref>
        ] and [2]. This removes noise from the signals by calculating the value at a given
time point relative to the surrounding time points. We use a span of 56, the same as [2], for
smoothing. The code we use for calculating novelty and resonance is adapted from that
released alongside [2] and [
        <xref ref-type="bibr" rid="ref21">20</xref>
        ].
      </p>
    </sec>
    <sec id="sec-6">
      <title>6. Results and Discussion</title>
      <p>We find clear trends in the novelty and resonance signals that correlate to significant events
in the EU during the period studied: Xi Jinping’s European Tour (May 5-10), Putin’s state visit
to China (May 16-17), and the EU parliamentary elections (June 6-9). Our analysis focuses on
the novelty and resonance trends extracted from the KeyNMF models with ten topics as these
provide the clearest signals. The results for 25 and 50 topics are included in Appendix C.1.
We additionally focus our in-depth discussion of the results on the two largest news sources,
Xinouzhou and Oushinet, for this preliminary validation of the pipeline.</p>
      <p>We see spikes in novelty of varying strengths for both Xinouzhou and Oushinet during Xi
Jinping’s European tour (Figure 4). There are also corresponding dips in resonance before his
tour for both sites, followed by increases in resonance during the tour. This indicates that novel
information is introduced to the site ecosystems during the tour which replaces previous topics
of interest, and which persists in the system for some time.</p>
      <p>One of the most productive aspects of Dynamic KeyNMF is that it allows us to study topic
fluctuations over time. Thus, we explore which topical shifts contribute to changes in the
novelty and resonance signals. For example, on Oushinet, the time period during Xi Jinping’s</p>
      <p>European tour is associated with high pseudo-probabilities for a topic defined by the keywords
Paris, France and state visit and a topic defined by President, China, and Xi Jinping (Appendix
C.2, Figure 9). Towards the end of the tour, a topic on diplomacy and bilateral relations between
China and France also gains prominence. For Xinouzhou, this time period contains a peak in
the pseudo-probabilities for two topics on Hungary and Chinese relations with Hungary, one
of the locations on the tour.</p>
      <p>Similarly, there is a noticeable spike in the novelty and resonance for Oushinet directly before
Putin’s state visit to China. This period is marked by relatively high pseudo-probabilities for a
topic characterized by the terms China, Beijing, Chinese, and Chinese News Service and a topic
with the keywords Russia, Ukraine, Putin, and Moscow (Appendix C.2, Figure 7).</p>
      <p>Most significantly for this study, there are fluctuations in novelty and resonance for both
sites around the EU parliamentary elections. Specifically, there are peaks in the novelty and
resonance signals for Xinouzhou and Oushinet before and after the elections, with troughs
throughout much of the election period. We hypothesize that these trends reflect a focus on
election-related news which begins in early June and continues through the elections and then
an introduction of new topics after their end. Again examining the topic distributions, we
see that for Oushinet the period before and during the election is marked by high
pseudo-probabilities for two topics directly related to the parliamentary elections, one topic
surrounding the Spanish prime minister, and two on Russia and Ukraine and the Israel-Palestine war
(Appendix C.2, Figure 8). Interestingly, pseudo-probabilities for the topic most directly
focused on the elections continued to grow even after the election, suggesting that Oushinet was
still discussing the election results during this time. Similarly, for Xinouzhou, three topics
focused on the UK elections, Europe broadly, and the Spanish prime minister were comparatively
prominent towards the end of May and beginning of June.</p>
      <p>Overall, we find that this pipeline allows us to effectively locate changes in news ecosystems,
correlate these changes to political and cultural events of interest, and further explore possible
reasons for these changes via topic models. It reveals differences in media responses both
between events and between sites, while also demonstrating the similarities in sites’ news
ecosystems, such as the increased discussion of the Spanish prime minister on both Xinouzhou
and Oushinet before the EU parliamentary elections. We believe that the combination of the
novelty and resonance metrics with the novel KeyNMF topic model will permit further in-depth
analysis of these media sites and facilitate research on other Chinese-language domains.</p>
    </sec>
    <sec id="sec-7">
      <title>7. Conclusion</title>
      <p>In this paper, we present a pipeline designed to facilitate research on the underlying
information dynamics of Chinese diaspora media published in Europe. This pipeline combines
existing information-theoretic methods that model how new information enters and persists in
systems with a novel topic model, KeyNMF. KeyNMF overcomes some of the weaknesses of
previous traditional and contextual topic models, demonstrating high performance on standard
benchmarks. We validate this pipeline through preliminary experimentation on our dataset of
Chinese diaspora media, finding that it reveals informational trends that correlate with major,
newsworthy events in European politics and allows for further analysis of the topical changes
that cause those trends. While further qualitative research is required to fully understand these
dynamics, we believe that we have presented a major step forward in terms of context-sensitive
and interpretable topic modelling and information dynamics which can generalize to
multilingual and data-scarce environments.</p>
    </sec>
    <sec id="sec-8">
      <title>Acknowledgments</title>
      <p>Part of the computation done for this project was performed on the UCloud interactive HPC
system, which is managed by the eScience Center at the University of Southern Denmark.</p>
    </sec>
    <sec id="sec-9">
      <title>A. News Site Subpages</title>
      <p>The subpages scraped for each news site are listed below:
• Xinouzhou: France, Italy, Spain, UK, Germany, Hungary, International
• Ihuawen: News, Comments &amp; Opinions
• Oushinet: Europe, Europe: Germany, Europe: Central and Eastern Europe, Europe:
Italy, Europe: Spain, Europe: Other, France, Europe and China, Overseas Chinese
community, China, International, Opinion on public affairs
• Chinanews: Nordic headlines, China news, Mutual learning among civilizations,
Overseas Chinese community, Nordic Commercial Bridge, Overseas thoughts
• Yidali Huarenjie: ∅</p>
    </sec>
    <sec id="sec-10">
      <title>B. NPMI Coherence</title>
      <p>Since NPMI Coherence has historical significance in topic modeling literature, we also
evaluated topic descriptions with this metric. Due to theoretical and practical limitations [16],
however, we do not consider NPMI Coherence a good metric for evaluating topic models. For
the sake of completeness, we report NPMI scores in Table 2.</p>
      <p>[Table 2: NPMI coherence of different topic models (including BERTopic and CombinedTM)
on the studied corpora (including oushinet and xinozhou).]</p>
    </sec>
    <sec id="sec-11">
      <title>C. Additional Experimental Results</title>
      <sec id="sec-11-1">
        <title>C.1. Novelty and Resonance Ablations</title>
        <p>[Figure panels omitted: novelty and resonance time series for each news site.]</p>
        <p>Figure 6: The novelty and resonance plots for each news site from KeyNMF with 50 topics. The three
shaded areas represent Xi Jinping’s European tour (May 5-10, 2024), Putin’s state visit to China (May
16-17, 2024), and the EU parliamentary elections (June 6-9, 2024). Note that the y-axis ranges differ for
each chart.</p>
      </sec>
      <sec id="sec-11-2">
        <title>C.2. Topic Distributions Over Time</title>
        <p>Oushinet: China, Beijing, Chinese, China News Service; Oushinet: Russia, Ukraine,
Putin, Moscow; Oushinet: elections, political parties, voting, European Parliament;
Oushinet: Spain, prime minister, Madrid, resign; Oushinet: Israel, Gaza, Palestine,
Palestinians; Xinouzhou: elections, UK, political parties, politics; Xinouzhou: Europe,
Germany, France, Paris; Xinouzhou: Spain, Prime Minister, Madrid, Spanish government.</p>
        <p>Figure 8: The distributions over time for eight topics with high pseudo-probabilities
around the EU parliamentary elections. These topics are generated by the 10-topic KeyNMF
models for Oushinet and Xinouzhou. Note that the y-axis scale differs for each subplot.</p>
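        <p>Topic-over-time curves like these can be produced by grouping per-document topic weights by publication date and averaging. The sketch below is one plausible aggregation (the paper's exact binning and smoothing may differ, and the function name is ours):</p>

```python
from collections import defaultdict

import numpy as np

def topic_over_time(dates, doc_topic, topic_idx):
    """Daily mean pseudo-probability of one topic.

    `dates` are ISO date strings aligned with the rows of the
    document-by-topic matrix `doc_topic`; returns the sorted dates
    and the mean weight of topic `topic_idx` on each date.
    """
    weights = np.asarray(doc_topic, dtype=float)[:, topic_idx]
    by_day = defaultdict(list)
    for day, w in zip(dates, weights):
        by_day[day].append(w)
    days = sorted(by_day)
    return days, [float(np.mean(by_day[d])) for d in days]
```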
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] D. Angelov. Top2Vec: Distributed Representations of Topics. 2020. arXiv: 2008.09470 [cs.CL].</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[2] R. B. Baglini, S. M. Østergaard, S. N. Larsen, and K. L. Nielbo. “Emodynamics: Detecting and Characterizing Pandemic Sentiment Change Points on Danish Twitter”. In: Proceedings of the Fourth Conference on Computational Humanities Research, CHR 2022. Antwerp, Belgium, 2022, pp. 162-176.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[3] A. T. J. Barron, J. Huang, R. L. Spang, and S. DeDeo. “Individuals, Institutions, and Innovation in the Debates of the French Revolution”. In: Proceedings of the National Academy of Sciences 115.18 (2018), pp. 4607-4612. doi: 10.1073/pnas.1717729115.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[4] M. Belford, B. Mac Namee, and D. Greene. “Stability of Topic Modeling via Matrix Factorization”. In: Expert Systems With Applications 91.1 (2018), pp. 159-169. doi: 10.1016/j.eswa.2017.08.047.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[5] F. Bianchi, S. Terragni, and D. Hovy. “Pre-Training is a Hot Topic: Contextualized Document Embeddings Improve Topic Coherence”. In: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers). Online, 2021, pp. 759-766. doi: 10.18653/v1/2021.acl-short.96.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[6] F. Bianchi, S. Terragni, D. Hovy, D. Nozza, and E. Fersini. “Cross-lingual Contextualized Topic Models with Zero-shot Learning”. In: Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume. Online, 2021, pp. 1676-1683. doi: 10.18653/v1/2021.eacl-main.143.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>[7] D. M. Blei. “Probabilistic Topic Models”. In: Communications of the ACM 55.4 (2012), pp. 77-84. doi: 10.1145/2133806.2133826.</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>[8] D. M. Blei and J. D. Lafferty. “Dynamic Topic Models”. In: Proceedings of the 23rd International Conference on Machine Learning. Pittsburgh, Pennsylvania, USA, 2006, pp. 113-120. doi: 10.1145/1143844.1143859.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>[9] D. M. Blei, A. Y. Ng, and M. I. Jordan. “Latent Dirichlet Allocation”. In: Journal of Machine Learning Research 3.1 (2003), pp. 993-1022. doi: 10.5555/944919.944937.</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>[10] V. Brussee. “Authoritarian Design: How the Digital Architecture on China’s Sina Weibo Facilitates Information Control”. In: Asiascape: Digital Asia 9.3 (2022), pp. 207-241. doi: 10.1163/22142312-bja10033.</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>[11] K. Chan and C. Alden. “&lt;Redirecting&gt; the Diaspora: China’s United Front Work and the Hyperlink Networks of Diasporic Chinese Websites in Cyberspace”. In: Political Research Exchange 5.1 (2023), pp. 1-21. doi: 10.1080/2474736x.2023.2179409.</mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>[12] A. Cichocki and A.-H. Phan. “Fast Local Algorithms for Large Scale Nonnegative Matrix and Tensor Factorizations”. In: IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E92.A.3 (2009), pp. 708-721. doi: 10.1587/transfun.E92.A.708.</mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>[13] K. Gatterman, T. M. Meyer, and K. Wurzer. “Who Won the Election? Explaining News Coverage of Election Results in Multi-Party Systems”. In: European Journal of Political Research 61.4 (2022), pp. 857-877. doi: 10.1111/1475-6765.12498.</mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>[14] M. Grootendorst. BERTopic: Neural Topic Modeling with a Class-Based TF-IDF Procedure. 2022. doi: 10.48550/arXiv.2203.05794. arXiv: 2203.05794 [cs.CL].</mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <given-names>M.</given-names>
            <surname>Grootendorst</surname>
          </string-name>
          .
          <article-title>KeyBERT: Minimal Keyword Extraction with BERT</article-title>
          .
          <source>Version v0.3.0</source>
          .
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <source>doi: 10</source>
          .5281/zenodo.4461265.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>[16] M. Kardos, J. Kostkan, A.-Q. Vermillet, K. Nielbo, K. Enevoldsen, and R. Rocca. S³ - Semantic Signal Separation. 2024. doi: 10.48550/arXiv.2406.09556. arXiv: 2406.09556 [cs.LG].</mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>[17] A. Lefèvre, F. Bach, and C. Févotte. “Online Algorithms for Nonnegative Matrix Factorization with the Itakura-Saito Divergence”. In: 2011 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA). New Paltz, NY, USA, 2011, pp. 313-316. doi: 10.1109/aspaa.2011.6082314.</mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>[18] K. L. Nielbo, R. B. Baglini, P. B. Vahlstrup, K. C. Enevoldsen, A. Bechmann, and A. Roepstorff. “News Information Decoupling: An Information Signature of Catastrophes in Legacy News Media”. In: Proceedings of the 2020 European Association for Digital Humanities Conference. Krasnoyarsk, Russia, 2021, pp. 1-8. doi: 10.48550/arXiv.2101.02956.</mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>[19] K. L. Nielbo, K. Enevoldsen, R. Baglini, E. Fano, A. Roepstorff, and J. Gao. “Pandemic News Information Uncertainty - News Dynamics Mirror Differential Response Strategies to COVID-19”. In: Plos One 18.1 (2023), e0278098. doi: 10.1371/journal.pone.0278098.</mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>[20] K. L. Nielbo, F. Haestrup, K. C. Enevoldsen, P. B. Vahlstrup, R. B. Baglini, and A. Roepstorff. When No News is Bad News - Detection of Negative Events from News Media Content. 2021. doi: 10.48550/arXiv.2102.06505. arXiv: 2102.06505 [cs.CY].</mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>[21] K. L. Nielbo, P. B. Vahlstrup, A. Bechmann, and J. Gao. “Trend Reservoir Detection: Minimal Persistence and Resonant Behavior of Trends in Social Media”. In: Proceedings of the Workshop on Computational Humanities Research (CHR 2020). Amsterdam, the Netherlands, 2020, pp. 290-297. doi: 10.48550/arXiv.2109.08589.</mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>[22] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. “Scikit-Learn: Machine Learning in Python”. In: Journal of Machine Learning Research 12.1 (2011), pp. 2825-2830.</mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>[23] H. Pöttker. “News and Its Communicative Quality: the Inverted Pyramid - When and Why Did It Appear?” In: Journalism Studies 4.4 (2003), pp. 501-511. doi: 10.1080/1461670032000136596.</mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>[24] Z. Qin, Y. Cong, and T. Wan. “Topic modeling of Chinese language beyond a bag-of-words”. In: Computer Speech &amp; Language 40 (2016), pp. 60-78. doi: 10.1016/j.csl.2016.03.004.</mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>[25] R. Řehůřek and P. Sojka. “Software Framework for Topic Modelling with Large Corpora”. In: Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks. Valletta, Malta, 2010, pp. 45-50.</mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>[26] N. Reimers and I. Gurevych. “Making Monolingual Sentence Embeddings Multilingual using Knowledge Distillation”. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing. Online, 2020, pp. 4512-4525. doi: 10.48550/arXiv.2004.09813.</mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>[27] N. Reimers and I. Gurevych. “Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks”. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing. Hong Kong, China, 2019, pp. 3982-3992. doi: 10.18653/v1/D19-1410.</mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>[28] M. Schliebs, H. Bailey, J. Bright, and P. N. Howard. China’s Public Diplomacy Operations: Understanding Engagement and Inauthentic Amplification of PRC Diplomats on Facebook and Twitter. Tech. rep. Oxford, UK: Programme on Democracy &amp; Technology, 2021.</mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>[29] M. Thunø and K. L. Nielbo. “The Initial Digitalization of Chinese Diplomacy (2019-2021): Establishing Global Communication Networks on Twitter”. In: Journal of Contemporary China 33.146 (2024), pp. 244-266. doi: 10.1080/10670564.2023.2195811.</mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>[30] H. M. Wallach, D. Mimno, and A. McCallum. “Rethinking LDA: Why Priors Matter”. In: Advances in Neural Information Processing Systems. Vancouver, Canada, 2009, pp. 1-9.</mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>[31] M. Wevers, J. Kostkan, and K. L. Nielbo. “Event Flow - How Events Shaped the Flow of the News, 1950-1995”. In: Proceedings of the Third Conference on Computational Humanities Research, CHR 2021. Amsterdam, the Netherlands, 2021, pp. 62-76. doi: 10.48550/arXiv.2109.08589.</mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>[32] Q. Zhao, Z. Qin, and T. Wan. “Topic Modeling of Chinese Language Using Character-Word Relations”. In: Neural Information Processing. Berlin, Heidelberg, 2011, pp. 139-147.</mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>Figure 9: The distributions over time for five topics with high pseudo-probabilities during Xi Jinping’s European tour. These topics are generated by the 10-topic KeyNMF models for Oushinet and Xinouzhou.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>