=Paper= {{Paper |id=Vol-3322/short5 |storemode=property |title=Detecting Traces of Narrative Evolution on Telegram: Inductive Methods from Corpus-Based Discourse Analysis |pdfUrl=https://ceur-ws.org/Vol-3322/short5.pdf |volume=Vol-3322 |authors=Tom Willaert |dblpUrl=https://dblp.org/rec/conf/ijcai/Willaert22 }} ==Detecting Traces of Narrative Evolution on Telegram: Inductive Methods from Corpus-Based Discourse Analysis== https://ceur-ws.org/Vol-3322/short5.pdf
Detecting Traces of Narrative Evolution on Telegram: Inductive Methods from Corpus-Based Discourse Analysis

Tom Willaert
Brussels School of Governance, IMEC-SMIT-VUB, Vrije Universiteit Brussel, Brussels, Belgium

Abstract
In the face of world-changing events, narratives on the messaging platform Telegram, including instances of disinformation, tend to arise and evolve at high speeds. However, key signals of this process, including newly emerging or idiosyncratic concepts, often elude traditional, top-down analyses. Addressing the need for inductive approaches to narrative evolution on Telegram, this paper operationalizes quantitative methods from the field of corpus-based discourse analysis. On a technical and methodological level, the paper discusses how data from Telegram’s messages and images can be collected and preprocessed for the purposes of a ‘keyness’ (Log Ratio) analysis that surfaces salient nouns and verbs for further investigation. On an empirical level, this method is then applied to a case study of 225 predominantly Dutch-speaking Telegram channels (spanning the period March 2017 - March 2022), revealing some of the dynamics that govern their recent shift from propagating narratives about the coronavirus pandemic to narratives concerning the war in Ukraine. This case study is accompanied by an interactive demonstrator that enables readers to further explore the processed dataset. The paper concludes with a reflection on the status of and future avenues for this ‘distant reading’ approach in relation to established interpretative practices.


IJCAI 2022: Workshop on semantic techniques for narrative-based understanding, July 24, 2022, Vienna, Austria
tom.willaert@vub.be (T. Willaert)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073

1. Introduction

In political science, the concept of ‘narrative’ has broadly been defined as a form of discourse in which humans “construct disparate facts in [their] own worlds and weave them together cognitively in order to make sense of [their] reality” [1, p.135]. At a time when world-changing events such as pandemics and wars happen in rapid succession, this process of narrative sense-making is intensified on social media. There, spanning countless posts and channels, eclectic facts are continuously (re)combined into new stories, including instances of disinformation and conspiracy theory. Prototypical examples are the narratives that circulate on Telegram, a messaging platform that through a lack of centralized content moderation tends to harbor conspiracy theories and other misleading or antagonistic discourse usually not tolerated on social media such as Twitter [2, 3]. In this prolific environment, newly-coined and often idiosyncratic concepts (such as the provocative ‘denazification’ used by the Russian government to legitimize the war in Ukraine) can emerge and propagate freely, which renders keyword-based query designs and other top-down methods for identifying narratives on the platform rather ineffective.

Confronting these challenges of (dis)information overload on Telegram and beyond, the development of inductive, machine-guided methods for mining narratives from (social media) texts at ‘big data’ scale has become an active area of research. First examples of such computational analyses of narratives can be traced back to work on scripts, story grammars, and planning formalisms from the field of artificial intelligence [4]. More recent continuations of this line of research have mapped the underlying structures and dynamics of narratives by representing them as (evolving) networks of relations between ‘actants’ figuring in texts, the latter concerning people, places, or organizations that are detected through techniques such as Named Entity Recognition (NER) [5]. Following a similar logic, some studies have explored the possibilities afforded by co-occurrence networks of inductively-sourced hashtags to trace dynamics of converging narratives [6]. These empirically-informed approaches have thus yielded first insights into the structural ties that allow online conspiracy theories and other narratives to form from seemingly disparate concepts and information.

As this paper aims to elaborate, the study of online narratives, including the aforementioned network-based approaches, can benefit from bottom-up methods for identifying the idiosyncratic and evolving concepts that constitute those narratives. Previous literature in media studies has for instance bridged gaps between cultural-theoretical and computational-linguistic approaches by using
word embeddings to demonstrate how platforms such as 4chan form incubators for “robust vernacular innovations” [7]. These conceptual innovations include neologisms such as ‘redpill’ (referring to an awakening from ignorance), whose emerging and shifting meanings can be interpreted as traces of the evolving narratives that define antagonistic subcultural communities. Comparable assumptions underpin quantitative, corpus-based analyses that trace the propagation of specific ‘vernacular’ concepts between platforms as a means of identifying the ‘mainstreaming’ of fringe narratives [8, 9, 10]. Here, innovative or marked words that appear outside of the fringe environments in which they were first observed can be considered traces of a wider adoption of certain narratives.

Addressing the need for inductive approaches to narrative evolution on Telegram, the present paper operationalizes quantitative methods from the field of corpus-based discourse analysis. On a technical and methodological level, the paper offers a discussion of how data from messages and images from the platform can be collected and preprocessed for the purposes of a ‘keyness’ (Log Ratio) analysis that surfaces salient nouns and verbs for further investigation. On an empirical level, this method is then applied to a case study of 225 predominantly Dutch-speaking Telegram channels (spanning the period March 2017 - March 2022). This case study tests the double hypothesis that 1) around the time of the outbreak of the war in Ukraine, Telegram channels that previously spread disinformation narratives on the coronavirus pandemic embraced narratives about the war, and that 2) this shift might reveal aspects of the underlying mechanisms governing the evolution of disinformation on the platform. To foster further exploration, this case study is accompanied by an interactive demonstrator that allows users to search and plot words from the dataset by their keyness scores. The paper concludes with a wider reflection on the status of and future avenues for this ‘distant reading’ approach to narratives in relation to established interpretative practices.

2. Data Collection

The focus of this paper is on the analysis of narrative evolution in message texts and images from public Telegram channels pertaining to Dutch-speaking far-right and conspiratorial communities. Following the taxonomy of platform affordances proposed in Van Raemdonck and Pierson [11], Telegram channels can be considered to afford “directed and isolated n-to-many interactions”, meaning that the content travels from one channel to its many followers, who can receive and forward the content to other channels, but not respond to it. As such, Telegram channels can effectively be considered “depositories” and “amplifiers” of narratives [11, p.3].

Identifying relevant Telegram channels from which to mine these narratives is a non-trivial matter, as channels can be scattered and difficult to identify by channel name alone. Therefore, channels were retrieved by means of an established ‘snowballing’ method for Telegram research described in Peeters and Willaert [12]. This method repurposes Telegram’s affordance of message-forwarding between channels as a means of identifying related channels. It assumes that if one channel forwards a message from another channel, a meaningful connection or shared interest exists between both. Starting from a seed list of channels defined based on expert knowledge, the researcher can thus retrace these links to other channels, bringing into view a network of interconnected channels in a bottom-up way.

For the purposes of this paper, a tailor-made scraper based on Python’s Selenium library was used to automate and scale up this process.1 The network of channels under investigation in this article was first mapped in the summer of 2021. At that time, these channels were mainly preoccupied with the coronavirus pandemic and associated narratives, making this a suitable sample for exploring further narrative evolutions. The contents of these channels (both texts and images) were subsequently scraped again in March 2022. This resulted in a dataset of 821,020 messages from 225 public Telegram channels pertaining to Dutch-speaking far-right and conspiracy-theory communities, spanning a period between 18 March 2017 and 11 March 2022.

1 https://selenium-python.readthedocs.io/

An initial inspection of this dataset revealed that narratives were constructed in both messages and images, with some images containing relevant patches of text. Working towards a ‘multimodal’ analysis that considers these aspects of the data, the channel contents were processed further along two tracks. Firstly, the texts embedded in the images were programmatically extracted using Google’s Tesseract-OCR engine, by means of the Python-tesseract wrapper.2 Secondly, the languages of the retrieved texts (from both posts and images) were detected using the Python ‘langdetect’ library.3 This created opportunities for working with linguistically-homogeneous subcorpora in the subsequent analysis.

2 https://pypi.org/project/pytesseract/
3 https://pypi.org/project/langdetect/
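The two preprocessing tracks described above can be illustrated with a minimal sketch. It is not the scraper or pipeline used for this paper: the function names are hypothetical, and the language detector is passed in as a parameter so that the example runs without third-party packages. In practice, the raw strings would come from an OCR call such as `pytesseract.image_to_string`, and `langdetect.detect` could be supplied as the detector; the toy detector below is only a stand-in.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List


def clean_ocr_text(raw: str) -> str:
    """Collapse whitespace; OCR 'texts' made only of newlines become empty."""
    return " ".join(raw.split())


def build_subcorpora(
    texts: Iterable[str],
    detect_language: Callable[[str], str],
) -> Dict[str, List[str]]:
    """Group non-empty texts by detected language code (e.g. 'nl', 'en')."""
    subcorpora: Dict[str, List[str]] = defaultdict(list)
    for raw in texts:
        text = clean_ocr_text(raw)
        if not text:
            continue  # nothing usable was extracted from this message or image
        subcorpora[detect_language(text)].append(text)
    return dict(subcorpora)


def toy_detector(text: str) -> str:
    """Hypothetical stand-in for a real detector such as langdetect.detect."""
    dutch_markers = {"de", "het", "een", "en", "van"}
    words = set(text.lower().split())
    return "nl" if words & dutch_markers else "en"


if __name__ == "__main__":
    sample = ["\n\n", "De oorlog in Oekraïne", "war  in\nUkraine"]
    print(build_subcorpora(sample, toy_detector))
```

Keeping the detector injectable makes it straightforward to swap in a more refined language-identification method later without touching the grouping logic.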
After preprocessing, it was found that of the retrieved messages, ca. 85% (697,364 messages) contained a non-empty message text field, and ca. 33% (267,956 messages) contained an image file.4 After cleaning the outputs of the OCR for images, such as removing ‘texts’ that only contained newline characters, texts could be extracted from ca. 67% (179,904) of the images. Automated language detection revealed that the corpus was multilingual (in part due to the forwarding of messages from international channels), with English and Dutch being the most prominent languages. Of the message texts, ca. 21% (143,120) were classified as written in English, and ca. 57% (399,842) in Dutch. For the images from which texts were extracted, we found ca. 53% (94,803) contained text in English, and ca. 28% (51,138) contained text in Dutch. The prominence of English texts in the images again points towards an international dynamic of message and content forwarding between channels.

Figure 1: Schematic overview of the approach. Data are grouped by timestamps. For data at each timestamp (the target corpus), keyness scores (Log Ratio) for nouns and verbs are calculated in relation to data for all remaining timestamps (the reference corpus). Offering a ‘distant’ perspective of the period as a whole, this approach foregrounds key items for each individual week in relation to the full corpus minus that week.
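The scheme summarised in Figure 1 (each week as target corpus, all remaining weeks as reference corpus) can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the implementation used for this paper: the function names are hypothetical, the input is assumed to be already filtered to nouns and verbs, and replacing a zero count with 0.5 is an assumption made here so that the ratio stays defined.

```python
import math
from collections import Counter
from typing import Dict, List


def log_ratio(freq_target: int, size_target: int,
              freq_ref: int, size_ref: int,
              zero_adjust: float = 0.5) -> float:
    """Binary log of the ratio of per-million normalised frequencies."""
    # Assumption: a zero count is replaced by a small constant (0.5 here).
    norm_target = (freq_target or zero_adjust) / size_target * 1_000_000
    norm_ref = (freq_ref or zero_adjust) / size_ref * 1_000_000
    return math.log2(norm_target / norm_ref)


def weekly_keyness(tokens_by_week: Dict[str, List[str]]) -> Dict[str, Dict[str, float]]:
    """Score each week's items against all remaining weeks combined."""
    counts = {week: Counter(tokens) for week, tokens in tokens_by_week.items()}
    total = sum(counts.values(), Counter())
    total_size = sum(total.values())
    scores: Dict[str, Dict[str, float]] = {}
    for week, target in counts.items():
        target_size = sum(target.values())
        ref_size = total_size - target_size
        if target_size == 0 or ref_size == 0:
            continue  # no target or no reference data to compare against
        scores[week] = {
            word: log_ratio(freq, target_size, total[word] - freq, ref_size)
            for word, freq in target.items()
        }
    return scores


if __name__ == "__main__":
    demo = {
        "w1": ["virus", "virus", "mask", "vaccine"],
        "w2": ["virus", "war", "war", "invasion"],
    }
    for week, items in weekly_keyness(demo).items():
        ranked = sorted(items.items(), key=lambda kv: kv[1], reverse=True)
        print(week, ranked[:3])
```

Because both frequencies are scaled by the same factor of 1,000,000, the scaling does not affect the resulting score; it only makes the intermediate normalised frequencies easier to read. Sorting each week's items by score, as in the demo, yields the top N key items for that week.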
3. Methodology

In order to inductively detect signals of narrative evolution in the collected data, this paper applies the method of ‘keyness’ analysis. This approach from the fields of corpus linguistics and corpus-based discourse analysis is directed at identifying ‘key’ items (e.g. words) in a target corpus in relation to a reference corpus based on the frequencies of items in both corpora. As such, a keyness analysis can support an exploratory approach to texts that gives an indication of their “aboutness” [13, p.227]. Arguably, this makes the method well suited for our purposes of identifying emerging narrative signals in texts. The keyness metric chosen for this paper is that of Log Ratio, which is defined as the “binary log of the ratio of relative frequencies” [14]. This gives a measure of the actual observed difference between two corpora for a key item (rather than a measure of statistical significance). The advantage of this is that it allows for the sorting of items by the size of the actual frequency difference between the corpora, enabling us to find the top N most key items. In order to calculate the Log Ratio for an item in a target corpus C1 and a reference corpus C2, we take the binary logarithm of the ratio of the normalised frequencies of the term in C1 and C2 (which are each multiplied by a factor of 1,000,000 for readability purposes).5

4 It should be acknowledged here that for messages with multiple images, the scraper only stored the first image attached to the message.
5 For a Python implementation, see https://kristopherkyle.github.io/corpus-analysis-python/Python_Tutorial_7.html

As illustrated in Figure 1, our overall approach to narrative detection on Telegram, then, is to detect these key items from a target corpus of texts grouped by week in relation to all remaining data (the reference corpus). We then consider the items with the highest keyness scores for each timestamp, thus opening them up for further interpretation. Concretely, this technical pipeline comprises the following steps:

   1. We filter the data by content type (message texts, image texts, or combinations of both) and language (Dutch or English).
   2. We group the texts by timestamps (viz. per week of data).
   3. We clean the texts at each timestamp by removing hyperlinks and emojis.
   4. We perform part of speech tagging and retain only nouns and verbs (as we consider these to express core concepts).
   5. We calculate the frequencies for these items per timestamp (week).
   6. We calculate the Log Ratio of the target corpus (normalized frequencies) in relation to all other weeks.
   7. Finally, we rank words by keyness score.

On a conceptual level, this approach returns keyness scores for items in relation to the combined
data that precede and follow it, offering a distant perspective on distinctive (key) narrative signals for each week’s worth of data in relation to the full period minus that week. The keyness scores for the final timestamp have a special status in this regard, as they reveal key items in relation to all of the preceding data, illustrating what is key at the last moment of observation. It should be acknowledged upfront that this keyness analysis does not yet integrate semantics, apart from the significance attributed to nouns and verbs as key indicators of narratives. As will be expanded upon in the conclusion, this approach thus requires further interpretation and contextualization of the detected key items.

In order to illustrate this method and make an empirical contribution to the study of narrative dynamics on Telegram, the following section zooms in on a case study that investigates the relation between narratives about the coronavirus pandemic and the war in Ukraine as expressed in our corpus.

4. Case Study and Findings

Recent and on-going events such as the coronavirus pandemic and the war in Ukraine have kindled an interest in the evolutionary dynamics of (disinformation) narratives among researchers, civil society actors, and journalists. One comparative analysis of international fact-checks has for instance revealed some striking, high-level parallels between disinformation surrounding both events in terms of style and contents [15]. Examples from this study include references to Nazism (e.g. the coronapass as a Nazi ‘health passport’ or Ukraine as a region that should be ‘denazified’), and recurring conspiracies about secret laboratories (e.g. false claims that the coronavirus was created in a lab and references to the alleged presence of U.S. bioweapon laboratories in Ukraine as a pretext for the war). This then raises the question of whether similar trends are reflected on a more localized level. Or more concretely: have the same communities that previously pushed false narratives about the coronavirus also embraced disinformation about the war in Ukraine?

A recent study by the Institute for Strategic Dialogue confirms that this can indeed be the case [16]. Based on the analysis of a dataset of 229 German-language Telegram channels (spanning the period between 1 November 2021 and 27 February 2022) pertaining to far-right and conspiracy theory communities, this study has shown that terms from a preconstructed list of 80 keywords related to “Russia, Ukraine, the breakaway regions in Eastern Ukraine and the Russia-Ukraine crisis in general” indeed became more frequent in the discourse of these communities. The actual (pro-Russian) narratives themselves were then analysed on the basis of close-readings of articles from the most frequently shared domains in the dataset [16]. As the dataset investigated in the aforementioned study closely resembles the one introduced in the present article, we can hypothesize that a similar transition from coronavirus-related narratives to narratives about the war in Ukraine should be observable in our corpus. Moreover, we can also hypothesize that our inductive approach can reveal more detailed traces of the actual narratives that emerge, thus opening up perspectives on the more fundamental dynamics underlying this narrative evolution.

In order to interpret the results of the keyness analysis in light of these hypotheses, they have been integrated into an interactive demonstrator or ‘observatory’ [17] that allows for interactive exploration and plotting of terms based on their keyness scores. This ‘observatory’ covers the full dataset (only snapshots of which are discussed in the present paper) and is openly available online.6

6 https://jvansoest.github.io/

A first observation that can be made on the basis of our keyness analysis is that we can indeed see emerging traces of narratives concerning the war in Ukraine. The table in Figure 2 shows the top 20 nouns and verbs (by keyness score) retrieved for the last four weeks of English message texts in the dataset. From this overview, it follows that discourse in these messages distinguishes itself from previous weeks through references to the war in Ukraine. Possible first traces are already observed in the week of February 20 in the form of a reference to “mobilisation”. Further, more explicit references can be found in ensuing weeks, which feature high-keyness words such as “demilitarize” and “bombards” (week of 27/02/2022), “defections” (week of 06/03/2022), as well as “vladimir” and “corridors” (week of 13/03/2022).

A second observation is that our empirical analysis reflects some of the trends observed in the aforementioned study of narrative similarities in fact-checks. Among the high-keyness terms that are detected in the latter weeks of the dataset, terms legitimizing the war such as “denazify” (27/02/2022) clearly evoke Nazism. The analysis likewise foregrounds references to the biolaboratories conspiracy mentioned earlier (e.g. “biolaboratories”, “bioscientist” (13/03/2022)). Results of a wider search for terms referring to biology laboratories shown in Figure 3 reveal that an earlier segment of the
                                                         channels continuously adapt narratives to match
                                                         ongoing events.


                                                         5. Discussion
                                                           In light of our hypotheses, the analysis conducted
                                                           above indeed reveals traces of narratives related
                                                           to the war in Ukraine in communities that were
                                                           previously mainly concerned with the pandemic.
                                                           Furthermore, our inductive approach brings into
                                                           view three more general dynamics governing this
                                                           transition. Firstly, it was possible to observe both
                                                           emerging narratives as well as more ‘stable’ under-
                                                           currents. Secondly, our case study suggests that
                                                           certain narratives recur over time. Thirdly, expand-
                                                           ing the scope of the investigation indicates that the
                                                           recent shifts between narratives are part of a longer
Figure 2: Top 20 nouns and verbs with highest key- process of narrative evolution.
ness scores for message texts in English for the last four   Given the specific nature of the corpus under con-
weeks of the dataset (week of 20/02/2022 - week of sideration, these observations might provide some
13/03/2022). Various traces of emerging narratives about deeper insights into the nature of disinformation
the war in Ukraine can be observed (e.g. “mobilisation”, narratives. It notably seems to be the case that in
“demilitarize”, “denazify”, “vladimir”). This indicates
                                                           order to persist, disinformation needs to contain a
that the same Telegram channels known for propagating
                                                           foundation of recognisable, recurring elements, yet
narratives about the coronavirus pandemic have recently
also embraced narratives about the war in Ukraine          at the same time it needs to be flexible enough to
                                                           adapt to world-changing events. It can be argued
                                                           that on Telegram, this continuous process of recur-
                                                           rence and adaptation is facilitated by the permissive
data where this term had a higher score was dur- affordances of the platform.
ing the coronavirus pandemic, before the Ukraine
war. This illustrates that some narrative traces are
actually recurrent in the dataset. Moreover, this 6. Conclusions and Future Work
plot demonstrates that the dataset contains a rel-
atively stable narrative ‘undercurrent’ marked by This paper set out to make a double contribution
e.g. words referring to the coronavirus pandemic. to the detection of evolving, often idiosyncratic nar-
These have a keyness score that remains close to 0 ratives on social media. For one thing, the paper
in each week of the dataset.                               proposed a technical pipeline for detecting traces of
   Finally, the repeated occurrence of ‘biolaborato- narrative innovations and narrative continuity in a
ries’ as a high-keyness item suggests that within the bottom-up way by operationalizing keyness analysis
Dutch-speaking disinformation communities under (Log Ratio). For another, the paper applied this
investigation, narratives contextualizing the war in method to the case study of narrative evolution
Ukraine are but a salient pivot point in an ongo- on Dutch-speaking Telegram channels (pertaining
ing process of narrative evolution. Offering a more to far-right and conspiracy theory communities).
‘zoomed out’ perspective, Figure 4 shows the re- It has thus been shown how keyness analysis can
sults of the keyness analysis for texts in English be applied to Telegram data to inductively iden-
from both messages and images combined, covering tify traces of emerging or persistent narratives that
a more extended period of time. A closer inspec- might warrant further investigation.
tion of key items predating the Russian invasion of          It should be acknowledged that the exploratory
Ukraine hint at a range of other events that have scope of the present paper has its limitations. Build-
been appropriated to match the then-predominant ing on these initial results, at least two pathways
agenda of the communities. Our method for in- for future research can be envisaged. On a method-
stance picks up traces of references to the freedom ological and technical level, more work is needed
convoys in Canada (e.g. “Winnipeg”, “blockading” to reduce noise and introduce additional granular-
(06/02/2022)), which demonstrates how Telegram ity in the analysis. As has been illustrated in this




                                                                                                                   32
Figure 3: Plot of keyness scores over time for terms referring to biology laboratories and the coronavirus in the
dataset’s message texts in English. The graph suggests a recurrence of emerging narratives involving biology
laboratories during the pandemic and at the start of the war in Ukraine. The keyness score of the term “coronavirus”
remains close to 0 in each week of the dataset (except for some higher scores around the time of the outbreak of the
pandemic), suggesting a relatively stable ‘undercurrent’ of coronavirus-related narratives




Figure 4: Results of the keyness analysis for texts in English from both messages and images combined, covering a
more extended period of time. A closer inspection of key items predating the Russian invasion of Ukraine hints at a
range of other events that have been appropriated to match the then-predominant agenda of the retrieved channels,
including the ‘freedom convoys’ in Canada (“Winnipeg”, “blockading”). This points towards a continuous process of
narrative evolution on Telegram.
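
The keyness scores plotted in Figures 3 and 4 can be illustrated with a minimal sketch of a Log Ratio computation, i.e. the binary logarithm of the ratio of a term's relative frequencies in a target corpus and a reference corpus. The sketch is not the paper's implementation: the function names, the toy token lists, the choice of "one week versus the remaining weeks" as target and reference, and the substitution of 0.5 for zero counts are all assumptions made for illustration.

```python
import math
from collections import Counter

def log_ratio(target_count, target_size, ref_count, ref_size, zero_sub=0.5):
    """Binary log of the ratio of relative frequencies (target vs. reference)."""
    # Assumption: a zero count is replaced by a small constant so the ratio stays defined.
    t = target_count if target_count > 0 else zero_sub
    r = ref_count if ref_count > 0 else zero_sub
    return math.log2((t / target_size) / (r / ref_size))

def keyness_table(target_tokens, ref_tokens):
    """Score every term of the target corpus and sort by descending Log Ratio."""
    t_counts, r_counts = Counter(target_tokens), Counter(ref_tokens)
    t_size, r_size = len(target_tokens), len(ref_tokens)
    scores = {
        term: log_ratio(count, t_size, r_counts[term], r_size)
        for term, count in t_counts.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical toy data: tokens from one week (target) and from all other weeks (reference).
week = ["laboratory", "laboratory", "coronavirus", "war"]
rest = ["coronavirus", "vaccine", "coronavirus", "lockdown",
        "laboratory", "vaccine", "mask", "mask"]

for term, score in keyness_table(week, rest):
    print(f"{term}\t{score:.2f}")
```

In this toy example "coronavirus" has the same relative frequency in both corpora and therefore scores 0, which mirrors the reading of a score near 0 as a stable 'undercurrent', whereas each additional point of Log Ratio corresponds to a doubling of relative frequency in the target corpus.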



paper, transferring methods from corpus-based discourse analysis to Telegram requires intensive data preprocessing. Future investigations might for instance explore more refined methods for language detection and optical character recognition (text extraction from images) suitable for Telegram’s idiosyncratic (visual) discourse. Along the same lines, future research might complement the aggregated




perspective on offer and explore the distributions of key items over channels, thus bringing into view more intricate relations between channel dynamics and discourse. Finally, additional methodological work is needed to situate the retrieved items in their wider semantic networks, for instance through statistically informed co-occurrence analyses. Introducing further granularity, one promising avenue here would be to contextualize key items through graph-like representations of narratives inferred from the sentences’ argument structure [18].
   On a more conceptual level, our analysis raises bigger questions of meaning and interpretation. As indicated, the keyness analysis itself does not capture the semantics of the messages and image texts under investigation. Meaning has to be assigned to key items by the human interpreter, for instance by considering and comparing combinations of key items, by looking up the retrieved key words in the corpus and reading the messages or image texts in which they figure, or through broader cultural or media-theoretical contextualization. This foregrounds the question of how critical frameworks might be developed that streamline and formalize the integration of inductive methods from data science and interpretative approaches from the humanities. Proposals for such frameworks have been made under the denominator of ‘data hermeneutics’ [19, 20], opening up the field for future work on actionable implementations.

7. Data availability statement

The dataset of weekly keyness scores for nouns and verbs in the Telegram dataset can be queried through the online demonstrator accompanying the paper.

Acknowledgments

This project has received funding from the European Union under Grant Agreement number INEA/CEF/ICT/A2020/2394296. The paper was written during a research visit at SciencesPo Médialab made possible by a travel grant from the Research Foundation Flanders (FWO). The author wishes to thank Jeroen Van Soest (Vrije Universiteit Brussel) for his work on the demonstrator accompanying this paper.

References

[1] M. Patterson, K. R. Monroe, Narrative in political science, Annual Review of Political Science 1 (1998) 315–331. doi:10.1146/annurev.polisci.1.1.315.
[2] R. Rogers, Deplatforming: Following extreme Internet celebrities to Telegram and alternative social media, European Journal of Communication 35 (2020) 213–229. doi:10.1177/0267323120922066.
[3] A. Urman, S. Katz, What they do in the shadows: examining the far-right networks on Telegram, Information, Communication & Society (2020) 1–20. doi:10.1080/1369118X.2020.1803946.
[4] I. Mani, Computational narratology, in: P. Hühn, J. Pier, W. Schmid, J. Schönert (Eds.), The Living Handbook of Narratology, Hamburg University, 2013. URL: http://www.lhn.uni-hamburg.de/article/computational-narratology.
[5] T. R. Tangherlini, S. Shahsavari, B. Shahbazi, E. Ebrahimzadeh, V. Roychowdhury, An automated pipeline for the discovery of conspiracy and conspiracy theory narrative frameworks: Bridgegate, Pizzagate and storytelling on the web, PLOS ONE 15 (2020) e0233879. doi:10.1371/journal.pone.0233879.
[6] M. Tuters, T. Willaert, Deep state phobia: Narrative convergence in coronavirus conspiracism on Instagram, Convergence: The International Journal of Research into New Media Technologies (in print).
[7] S. Peeters, M. Tuters, T. Willaert, D. de Zeeuw, On the vernacular language games of an antagonistic online subculture, Frontiers in Big Data 4 (2021) 1–15. doi:10.3389/fdata.2021.718368.
[8] S. Peeters, T. Willaert, M. Tuters, Travelling tokens: Following extreme terms from 4chan/pol/ to Breitbart. OILab blog, https://oilab.eu/travelling-tokens-following-extreme-terms-from-4chan-pol-to-breitbart/, 2020.
[9] T. Willaert, P. V. Eecke, J. V. Soest, K. Beuls, A tool for tracking the propagation of words on Reddit, Computational Communication Research 3 (2021) 117–132.
[10] S. Peeters, T. Willaert, M. Tuters, K. Beuls, P. Van Eecke, J. Van Soest, A fringe mainstreamed, or tracing antagonistic slang between 4chan and Breitbart before and after Trump, in: R. Rogers (Ed.), How Misinformation Propagates on Social Media. Mainstreaming the Fringe, Amsterdam University Press, in print.
[11] N. Van Raemdonck, J. Pierson, Taxonomy of social network platform affordances for group interactions, in: 2021 14th CMI International Conference - Critical ICT Infrastructures and Platforms (CMI), 2021, p. 1–8. doi:10.1109/CMI53512.2021.9663773.
[12] S. Peeters, T. Willaert, Telegram and digital methods: Mapping networked conspiracy theories through platform affordances, M/C Journal 25 (2022). URL: https://journal.media-culture.org.au/index.php/mcjournal/article/view/2878. doi:10.5204/mcj.2878.
[13] C. Gabrielatos, Keyness analysis, in: C. Taylor, A. Marchi (Eds.), Corpus Approaches To Discourse, Routledge, 2018, p. 225–258. URL: https://www.taylorfrancis.com/books/9781351716079/chapters/10.4324/9781315179346-11. doi:10.4324/9781315179346-11.
[14] A. Hardie, Log Ratio: an informal introduction. ESRC Centre for Corpus Approaches to Social Science (CASS), 2014. URL: http://cass.lancs.ac.uk/log-ratio-an-informal-introduction/.
[15] T. Willaert, M. G. Sessa, From infodemic to information war: A contextualization of current narrative trends and evolutions in Dutch-language disinformation communities, https://researchportal.vub.be/en/publications/from-infodemic-to-information-war-edmo-belux-investigative-report, 2022.
[16] J. Smirnova, P. Matlach, F. Arcostanzo, Support from the conspiracy corner: German-language disinformation about the Russian invasion of Ukraine on Telegram, https://www.isdglobal.org/digital_dispatches/support-from-the-conspiracy-corner-german-language-disinformation-about-the-russian-invasion-of-ukraine-on-telegram/, 2022.
[17] T. Willaert, P. Van Eecke, K. Beuls, L. Steels, Building social media observatories for monitoring online opinion dynamics, Social Media + Society 6 (2020) 1–12. doi:10.1177/2056305119898778.
[18] E. Ash, G. Gauthier, P. Widmer, RELATIO: Text semantics capture political and economic narratives, arXiv (2021). URL: https://arxiv.org/abs/2108.01720. doi:10.48550/ARXIV.2108.01720.
[19] A. Romele, M. Severo, P. Furia, Digital hermeneutics: From interpreting with machines to interpretational machines, AI & SOCIETY 35 (2020) 73–86. doi:10.1007/s00146-018-0856-2.
[20] P. Gerbaudo, From data analytics to data hermeneutics: Online political discussions, digital methods and the continuing relevance of interpretive approaches, Digital Culture & Society 2 (2016) 95–112. doi:10.14361/dcs-2016-0207.



