<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>E. B. Marino);</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Linguistic Markers of Population Replacement Conspiracy Theories in YouTube Immigration Discourse</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Erik Bran Marino</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Davide Bassi</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Renata Vieira</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Universidade de Évora, CIDEHUS</institution>
          ,
          <addr-line>Évora</addr-line>
          ,
          <country country="PT">Portugal</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Universidade de Santiago de Compostela</institution>
          ,
          <addr-line>Santiago de Compostela</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <volume>000</volume>
      <fpage>0</fpage>
      <lpage>0003</lpage>
      <abstract>
        <p>This paper presents a linguistic analysis of YouTube comments related to immigration discourse, analyzing the contrasts between standard anti-immigration comments and those linked to Population Replacement Conspiracy Theories (PRCT). Using a dataset of 71,137 YouTube comments classified into three stance categories (PRO, NEUTRAL, CONTRA) and PRCT annotation, we analyze the linguistic features of each group through LIWC (Linguistic Inquiry and Word Count). Our findings reveal significant differences in the language patterns of PRCT comments, both in comparison to standard anti-immigration discourse (CONTRA) and to all other groups. These differences appear particularly in religious references, power dynamics, conflict framing, and emotional tone. The high linguistic overlap (89.7%) between conspiracy and non-conspiracy anti-immigration discourse reveals the subtle nature of these differences. These distinctive linguistic patterns provide valuable insights both for the understanding and the automatic detection of conspiracy theories in online discourse, contributing to the growing body of research on computational approaches to identifying harmful content online.</p>
      </abstract>
      <kwd-group>
        <kwd>Population Replacement Conspiracy Theory</kwd>
        <kwd>Immigration discourse</kwd>
        <kwd>YouTube comments</kwd>
        <kwd>LIWC analysis</kwd>
        <kwd>LLMs</kwd>
        <kwd>Deepseek</kwd>
        <kwd>Hybrid approach</kwd>
        <kwd>Computational Social Sciences</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <sec id="sec-1-1">
        <p>Immigration has become one of the central and most controversial topics in cultural and political debates across Western societies. The debate is increasingly influenced by Population-Replacement Conspiracy Theories (PRCTs), narratives that portray demographic change as an élite plot to replace native populations [1, 2]. Online, the mantra at the core of these narratives, the Great Replacement, has migrated from fringe blogs to mainstream platforms, reshaping how migration is framed and politicised [3].</p>
        <p>The impact of PRCTs goes beyond mere rhetoric. Analyses of terrorist manifestos show that the Christchurch (2019) and Utøya (2011) attackers adopted the Great Replacement frame as moral legitimation for violence [4, 5, 6]. Experimental work further demonstrates that exposure to PRCT claims heightens Islamophobia and support for extremist action [4]. These findings underscore the societal risks tied to PRCT diffusion [5].</p>
        <p>Automatic moderation faces two intertwined challenges. First, PRCT cues are lexically sparse, domain-flexible, and embedded in high-volume comment streams, limiting rule-based filters. Second, existing supervised classifiers require large, domain-specific corpora that are rarely available for niche conspiracies [7, 8]. Even state-of-the-art large language models (LLMs) may struggle when prompted zero-shot on conspiracy detection tasks [9, 10].</p>
        <p>This study offers a dual contribution:</p>
      </sec>
      <sec id="sec-1-2">
        <p>1. Methodological: We provide, to our knowledge, the first systematic evaluation of an open-weight LLM (DeepSeek-v3) for PRCT detection in a few-shot setting. Performance is validated against a gold subset independently annotated by two experts (see §3).</p>
        <p>2. Psycholinguistic: Using LIWC, we deliver the first fine-grained comparison of PRCT language with other stances in the immigration debate (PRO, CONTRA, NEUTRAL), illuminating differences in temporal focus, power rhetoric and conflict framing [9].<sup>1</sup></p>
      </sec>
      <sec id="sec-1-3">
        <p>These aims translate into two research questions:</p>
      </sec>
      <sec id="sec-1-4">
        <p>RQ1: Can DeepSeek-v3, with minimal in-context examples, reliably distinguish PRCT comments from non-PRCT content?</p>
        <p>RQ2: Do PRCT comments exhibit psycho-linguistic patterns that differ systematically from other immigration stances?</p>
      </sec>
      <sec id="sec-1-5">
        <p><sup>1</sup> Throughout this paper we use psycholinguistic in the computational social-science sense: the study of how everyday language reflects basic social and personality processes [11].</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <sec id="sec-2-1">
        <p>PRCTs comprise a family of narratives such as the Great Replacement, the Kalergi Plan, White Genocide and Eurabia. Recent scholarship tracks their strategic mainstreaming, whereby far-right actors blend demographic alarmism with cultural-defence rhetoric to broaden appeal [1, 2].</p>
        <p>In terms of computational approaches to conspiracy detection, early systems combined rule-based extraction with bag-of-words classifiers [8]. More recent work presents an automated pipeline that uses BERT embeddings to discover narrative frameworks in conspiracy theories and conspiracies. Evaluated against expert data, it shows relation extraction recall of 83.7-82.9% for Pizzagate and Bridgegate [7].</p>
        <p>Large Language Models offer new possibilities for this domain, promising zero-shot classification without costly annotation. Previous work shows that GPT-3.5 and LLaMA-2 outperform RoBERTa on generic conspiracy tasks but inflate false-positive rates [12, 13]. However, no prior study evaluates DeepSeek on PRCT specifically, leaving a clear research gap that we address.</p>
        <p>From a linguistic perspective, corpus studies reveal that conspiracy texts favour future-oriented temporal frames, certainty language and out-group pronouns [8, 7].</p>
        <p>Our work isolates PRCT language to test whether it is merely an intensification of generic anti-immigration talk or a qualitatively distinct register. In this context, LIWC remains a widely validated tool for psycholinguistic profiling. In extremist contexts it is able to capture cues pertinent to radical rhetoric [14]. Yet discriminating between sub-types of anti-immigration discourse lies beyond its original goals. By integrating LIWC with stance labels, we extend its interpretive utility.</p>
        <p>Overall, the literature lacks (i) validated LLM approaches for PRCT detection and (ii) systematic linguistic characterisation that separates PRCT from non-conspiratorial rhetoric. Our study addresses both gaps, laying empirical foundations for future detection pipelines and theory-driven analyses of demographic conspiracy talk. Furthermore, Hernaiz [15] theorizes that conspiracy theories operate within the same secular rational frame as mainstream explanations, suggesting that linguistic differences between conspiracy and non-conspiracy discourse may be more subtle than categorical, warranting empirical investigation of their shared and distinct features.</p>
      </sec>
      <sec id="sec-2-2">
        <title>3. Methodology</title>
        <sec id="sec-2-3">
          <title>3.1. Dataset</title>
          <p>Our analysis is based on a dataset comprising 71,137 unique YouTube comments related to immigration. Specifically, we expanded the dataset described in Bassi et al. [16] by crawling a total of 15 videos about immigration (see Table 7 in the appendix for the complete video list). Following the methodology established in the referenced study, which demonstrated that parent comment contextual information is crucial for accurate stance detection in YouTube comments, we employed the same hybrid pipeline to reconstruct conversation chains and preserve parent-child relationships between comments.</p>
          <p>For stance classification, we utilized GPT-4o with contextual information from reconstructed comment chains to detect the stance of the comments. The vast majority of comments mention migration. The classification scheme distinguished between three primary categories:</p>
          <list list-type="bullet">
            <list-item><p>CONTRA: expressing anti-immigration views</p></list-item>
            <list-item><p>NEUTRAL: expressing neutral, unclear or unrelated perspectives towards immigration</p></list-item>
            <list-item><p>PRO: expressing pro-immigration views</p></list-item>
          </list>
          <p>A detailed performance evaluation of GPT-4o for immigration-related stance labelling is provided in [16]. The model achieved an F1 score of 78.7% on a manually labelled subset, demonstrating sufficient accuracy to enable automated annotation across the entire dataset.</p>
          <p>Subsequently, the comments were further analyzed using DeepSeek v3 in a few-shot learning approach to identify those containing Population Replacement Conspiracy Theory elements, resulting in the PRCT annotation.</p>
          <p>The classification process employed carefully structured prompts that included reference examples extracted directly from the existing labeled dataset (5 PRCT examples and 5 Non-PRCT examples) to guide the model’s understanding. Representative PRCT and Non-PRCT examples for the few-shot prompt were drawn from the training pool via stratified random sampling across the 15 videos, balancing length, topic, and stance. The five PRCT instances include both explicit markers (e.g. explicit mention of "Great Replacement") and implicit cues (coded dog-whistles such as "demographic engineering"); likewise, the five Non-PRCT examples span policy-oriented, economic, and security-focused objections free of conspiratorial framing. The prompts featured explicit definitions of PRCT content, encompassing specific conspiracy narratives such as "Great Replacement Theory", "White Genocide Theory", "Eurabia", and "Kalergi Plan", as well as broader indicators like demographic warfare narratives, terms such as "invasion", "replacement", and "remigration", and claims of orchestrated population change.</p>
          <p>Non-PRCT examples were defined to include policy discussions, border security debates, integration challenges, and economic impact analysis without conspiracy elements. The model was configured with temperature=0 to ensure deterministic and reproducible classifications, and was explicitly instructed to respond strictly with either "PRCT" or "Non-PRCT", avoiding ambiguous classifications.</p>
          <p>To ensure the reliability of our PRCT classification, we validated DeepSeek v3’s performance using a manually annotated gold standard dataset of 500 YouTube comments, evenly split between PRCT and Non-PRCT classifications.<sup>2</sup> Each comment was independently reviewed by two expert annotators following detailed annotation guidelines that provided clear criteria for identifying PRCT content. The inter-annotator agreement demonstrated high reliability with Gwet’s AC1 = 0.891 and PABAK = 0.804, indicating substantial agreement particularly for PRCT identification (Positive Agreement Rate: 0.947). DeepSeek v3 achieved 94.5% accuracy on this gold standard, with balanced precision and recall, demonstrating robust detection capabilities across different PRCT manifestations.</p>
          <p><sup>2</sup> Detailed annotation criteria for the PRCT validation task are publicly available at https://zenodo.org/records/16605519.</p>
          <p>This methodology allowed us to create a comprehensive dataset that distinguishes between standard anti-immigration discourse and discourse specifically containing population replacement conspiracy theories. Given the nature of our study, we proceeded by removing duplicated comments and applying a word count filter to retain comments between 5 and 1000 words, ensuring sufficient content for meaningful analysis while excluding extremely short or excessively long comments. Table 1 describes the final distribution of stance and PRCT annotations in our dataset.</p>
          <table-wrap>
            <label>Table 1</label>
            <caption>
              <p>Distribution of stance categories and PRCT annotations in the dataset</p>
            </caption>
            <table>
              <thead>
                <tr><th>Category</th><th>Count (%)</th></tr>
              </thead>
              <tbody>
                <tr><td>Stance</td><td></td></tr>
                <tr><td>CONTRA</td><td>37,531 (52.76%)</td></tr>
                <tr><td>NEUTRAL</td><td>22,190 (31.19%)</td></tr>
                <tr><td>PRO</td><td>11,416 (16.05%)</td></tr>
                <tr><td>PRCT</td><td></td></tr>
                <tr><td>Non-PRCT</td><td>65,915 (92.66%)</td></tr>
                <tr><td>PRCT</td><td>5,221 (7.34%)</td></tr>
                <tr><td>Total Dataset</td><td>71,137 (100.00%)</td></tr>
              </tbody>
            </table>
          </table-wrap>
          <p>Within the CONTRA stance category, 4,905 comments (13.07%) contained PRCT elements, while 32,625 comments (86.93%) were standard anti-immigration discourse without conspiracy theories. This distinction forms the basis of our comparative linguistic analysis.</p>
        </sec>
        <sec id="sec-2-4">
          <title>3.2. LIWC Analysis</title>
          <p>To analyze the linguistic characteristics of each comment category, we utilized the Linguistic Inquiry and Word Count (LIWC) tool. LIWC is text analysis software that calculates the percentage of words in a text that belong to dictionaries representing specific psychological and linguistic categories [17].</p>
          <p>We processed all comments through LIWC, focusing on the following key dimensions:</p>
          <p>Temporal focus: refers to the extent to which individuals characteristically direct their attention to the past, present, and future [18]. LIWC derives temporal focus scores by counting the frequency of time-related words in text. For example, past focus includes words like "ago" or "did"; present focus captures "today," "is," and "now," while future focus is based on "may," "will," and "soon" [19].</p>
          <p>Pronoun usage: Pronoun use highlights whether attention is on others (third-person singular/plural: he/she, they), on ourselves as distinct entities (first-person singular: I), or on ourselves embedded within a social relationship (first-person plural, we, and second-person, you) [20].</p>
          <p>Cognitive processes: This dictionary comprises over 1,000 entries that identify active information-processing; it yields six sub-scores (insight, causation, discrepancy, tentativeness, certainty and differentiation) [<xref ref-type="bibr" rid="ref2">21</xref>]. These dimensions capture the depth and style of mental elaboration, indicating whether individuals are reasoning analytically (causation, insight), expressing uncertainty or confidence (tentativeness, certainty), or making distinctions and comparisons (differentiation, discrepancy).</p>
          <p>Emotional dimensions: LIWC distinguishes between broad sentiment and specific emotions [<xref ref-type="bibr" rid="ref3">22</xref>]. The affect category encompasses both positive tone (e.g., "good," "love," "happy") and negative tone (e.g., "bad," "hate," "hurt") words, which reflect general sentiment. The emotion categories are more targeted, focusing on specific emotion labels such as positive emotion (e.g., "joy," "excited"), negative emotion (e.g., "sad," "angry"), and discrete emotional states including anxiety (e.g., "worry," "fear"), anger (e.g., "mad," "frustrated"), and sadness (e.g., "disappointed," "cry") [19]. These dimensions capture both the valence and intensity of emotional expression in text.</p>
          <p>Social dynamics: this dictionary captures references to interpersonal relationships and social behaviors, including social referents (e.g., "you," "we"), prosocial behavior (e.g., "help," "care"), conflict (e.g., "fight," "argue"), and communication acts (e.g., "said," "tell"). The framework also measures power-related language reflecting awareness of social hierarchies and clout, which captures confidence or leadership displayed through language [19, 20].</p>
          <p>Linguistic style: captures stylistic markers (such as usage of exclamation and question marks, or periods) which can reflect formality or communicative intent [19].</p>
          <p>For each category, we averaged LIWC scores and conducted comparative analyses to identify significant differences, particularly between CONTRA-PRCT (the 4,905 merged class) comments and other categories. We adopted an exploratory approach, running the complete LIWC dictionary and retaining all variables for analysis. Figure 3 displays the subset that reached |d| &gt; 0.2 after multiple-comparison correction; these include both single-word scores (e.g. religion) and composite categories (e.g. analytic).</p>
        </sec>
        <sec id="sec-2-3-1">
          <title>3.3. Statistical Analysis</title>
          <p>Statistical Test Selection: given the large sample sizes and non-normal distributions typical of linguistic data, for each dimension we assessed normality through the Shapiro-Wilk test and homogeneity of variance using Levene’s test. The normality assumption was violated in all 39 cases, hence we resorted to the Kruskal-Wallis test.</p>
          <p>Multiple Comparison Correction: Given the exploratory nature of our research (comparison of multiple LIWC dimensions across 4 different groups), we applied multiple comparison corrections, specifically False Discovery Rate (FDR) and Bonferroni correction, to identify the most robust effects.</p>
          <p>Effect Size: for each significant difference, we calculated Cohen’s d. Effect sizes of 0.2 ≤ |d| &lt; 0.5 are usually considered small; however, we treated effect sizes of |d| &gt; 0.2 as substantial, in line with field-specific benchmarks for linguistic research [<xref ref-type="bibr" rid="ref4 ref5">23, 24</xref>].</p>
          <p>Two-Phase Analysis: Our analytical approach comprised two phases: (1) a comprehensive four-group comparison (CONTRA-PRCT, CONTRA, NEUTRAL, PRO) to establish general immigration discourse patterns, and (2) a focused binary analysis (CONTRA-PRCT vs CONTRA) to identify features specifically distinguishing conspiracy content from general anti-immigration rhetoric. The binary comparison directly addresses whether conspiracy theories represent fundamentally different discourse or an intensification of existing patterns.</p>
          <p>PRCT-Specific Feature Classification: We categorized the LIWC dimensions as either PRCT-specific (statistically significant after FDR correction with |d| ≥ 0.2) or shared features (|d| &lt; 0.2). The overlap percentage was calculated as the proportion of shared features relative to total features analyzed.</p>
        </sec>
      </sec>
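      <sec>
        <p>The effect-size step and the split into PRCT-specific and shared features can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the pooled-standard-deviation form of Cohen's d is an assumption (the text does not say which variant was used), the function names and the toy inputs are hypothetical, and FDR significance is passed in as a precomputed flag. The 0.2 threshold and the definition of overlap as shared features over total features follow the text.</p>
        <preformat>
```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d using a pooled standard deviation (assumed variant)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def split_features(effects, threshold=0.2):
    """Split LIWC dimensions into PRCT-specific and shared sets.

    `effects` maps a dimension name to (d, significant_after_fdr).
    A dimension is PRCT-specific when it is significant and |d| reaches
    the threshold; it is shared when |d| stays below the threshold.
    Large but non-significant effects fall into neither set.
    """
    specific = sorted(
        name for name, (d, sig) in effects.items()
        if sig and abs(d) >= threshold
    )
    shared = sorted(
        name for name, (d, _sig) in effects.items()
        if threshold > abs(d)
    )
    return specific, shared


def overlap_percentage(n_shared, n_total):
    """Share of dimensions with negligible differences, in percent."""
    return 100.0 * n_shared / n_total
```
        </preformat>
        <p>If the 39 tested dimensions are taken as the denominator (an inference from the text, which reports 39 normality tests and four PRCT-specific dimensions), overlap_percentage(35, 39) evaluates to roughly 89.7, in line with the reported overlap.</p>
      </sec>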
    </sec>
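    <sec>
      <p>For readers less familiar with the two corrections named in the statistical analysis, the sketch below shows how adjusted p-values can be computed in plain Python. It assumes that "FDR" refers to the Benjamini-Hochberg procedure, which the text does not state explicitly; the function names and the example p-values are hypothetical and do not come from the study.</p>
      <preformat>
```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: each p times the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-values get smaller.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values for four LIWC dimensions.
raw = [0.01, 0.04, 0.03, 0.005]
fdr_adjusted = benjamini_hochberg(raw)
bonferroni_adjusted = bonferroni(raw)
```
      </preformat>
      <p>Bonferroni controls the family-wise error rate and is therefore the stricter of the two, which is why effects that survive both corrections can be read as the most robust.</p>
    </sec>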
    <sec id="sec-3">
      <title>4. Results</title>
      <sec id="sec-3-1">
        <p>Our analysis revealed distinct linguistic patterns in immigration-related discourse, with significant differences between stance groups and substantial overlap between conspiracy and non-conspiracy anti-immigration rhetoric.</p>
        <sec id="sec-3-1-1">
          <title>4.1. General Immigration Discourse Patterns</title>
        </sec>
      </sec>
      <sec id="sec-3-2">
        <p>The comprehensive four-group comparison (CONTRA-PRCT, CONTRA, NEUTRAL, PRO) revealed systematic linguistic differences across immigration stances. After applying FDR correction for multiple comparisons, the majority of LIWC dimensions showed significant differences (FDR &lt; 0.05).</p>
        <p>Anti-Immigration vs Pro-Immigration Discourse. As shown in Figure 1, both anti-immigration groups (CONTRA-PRCT and CONTRA) demonstrated a similar depersonalised rhetoric, signalled by a higher usage of third-person plural pronouns (they), reflecting out-group focus, and first-person plural pronouns (we), signalling in-group consolidation, compared to PRO and NEUTRAL comments. Specifically, "They" pronouns: CONTRA-PRCT (3.36), CONTRA (3.17) vs PRO (2.51) vs NEUTRAL (1.68); and "We" pronouns: CONTRA-PRCT (1.43), CONTRA (1.30) vs PRO (0.97) vs NEUTRAL (0.77).</p>
        <p>Additionally, PRCT discourse exhibited distinct cognitive processing patterns. PRCT comments showed the highest analytic thinking scores (43.2) compared to all other groups (PRO: 38.2, NEUTRAL: 39.1, CONTRA: 39.2), suggesting a more structured, logical reasoning style. Conversely, PRCT comments demonstrated lower insight language usage (1.7) compared to PRO (2.3) and NEUTRAL (2.6) groups, indicating less expression of sudden understanding or realization. This pattern can indicate that while PRCT discourse employs analytical framing, it may rely more on predetermined interpretive frameworks rather than exploratory or discovery-oriented thinking.</p>
        <sec id="sec-3-2-1">
          <title>4.2. PRCT-Specific Linguistic Signature</title>
          <p>To isolate features unique to conspiracy discourse from general comments against immigration, we conducted a focused binary comparison between CONTRA-PRCT (n=4,905) and CONTRA non-PRCT (n=32,625) comments. This analysis revealed a striking finding: 89.7% of linguistic features showed negligible differences (Cohen’s d &lt; 0.2) between conspiracy and non-conspiracy anti-immigration discourse, suggesting that anti-immigration discourse, regardless of conspiracy content, shares fundamental characteristics of outgroup construction and authoritative positioning. As shown in Figure 2, only four dimensions exceeded the meaningful effect size threshold.</p>
          <p>As shown in Figure 3, four dimensions demonstrated meaningful effect sizes (|d| ≥ 0.2) with statistical significance after FDR correction:</p>
          <p>Religion (d = 0.251, FDR &lt; 0.001; CONTRA-PRCT: 1.274 vs CONTRA: 0.591): PRCT discourse shows 115.6% higher usage of religious language, reflecting the framing of demographic change as a spiritual or civilizational threat.</p>
          <p>Power Language (d = 0.233, FDR &lt; 0.001; CONTRA-PRCT: 3.621 vs CONTRA: 2.560): PRCT discourse shows 41.4% higher usage of power-related language, reflecting emphasis on elite control and orchestrated manipulation.</p>
          <p>Conflict Framing (d = 0.219, FDR &lt; 0.001; CONTRA-PRCT: 0.853 vs CONTRA: 0.437): Conspiracy discourse frames immigration as active conflict/warfare with 95.2% higher conflict language usage.</p>
          <p>Tone (d = −0.214, FDR &lt; 0.001; CONTRA-PRCT: 30.674 vs CONTRA: 39.347): PRCT comments exhibit significantly more negative tone, with 22.0% lower positive sentiment scores than standard anti-immigration discourse.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>5. Discussion</title>
      <p>The linguistic patterns identified in our analysis offer significant insights into the nature of PRCT discourse and its distinction from standard anti-immigration rhetoric. Our findings reveal that while conspiracy and non-conspiracy anti-immigration discourse share 89.7% of their linguistic features, they differ significantly in four key dimensions: religious references, power dynamics, conflict framing, and emotional tone.</p>
      <sec>
        <title>5.1. High Linguistic Overlap</title>
        <p>A potential limitation is that the Non-PRCT comparison set, although explicitly anti-immigration, aggregates heterogeneous sub-registers (security, economic, assimilationist). This breadth may inflate the observed linguistic overlap. Nevertheless, the residual differences we detect (religious framing, power attribution, conflict, and tone) remain interpretable within Hernaiz [15]’s framework of shared rational frames, suggesting that PRCT discourse intensifies, rather than qualitatively departs from, mainstream anti-immigration rhetoric. The substantial overlap (89.7%) between PRCT and standard anti-immigration discourse, in fact, aligns with Hernaiz [15]’s theoretical framework of shared secular rational frames. Rather than representing fundamentally different discourses, conspiracy theories may intensify existing rhetorical patterns while operating within the same rational framework as mainstream explanations. Our finding of high ANALYTIC thinking combined with low INSIGHT language suggests that PRCT commenters employ analytical reasoning to validate existing beliefs rather than explore new understandings, potentially reflecting the confirmatory versus exploratory cognitive distinction [15]. This high overlap could pose challenges for automated detection systems but provides valuable insights for understanding how conspiracy narratives emerge from and relate to mainstream discourse.</p>
      </sec>
      <sec>
        <title>5.2. PRCT-Specific Features</title>
        <p>The four distinctive features of PRCT discourse, as visualized in Figure 3, provide interesting insights into its conceptual structure:</p>
        <p>Religious language (d = 0.251): The significantly higher use of religious terminology in PRCT comments reflects the framing of immigration as not merely a political or economic issue, but as a threat to cultural and spiritual identity. This finding aligns with Hernaiz [15]’s observation that conspiracy theories operate within a hybrid framework, employing rational secular arguments while simultaneously appealing to notions of "faith" and "belief" that pair them with religious explanations. Like religious narratives, PRCT discourse ascribes demographic change to volitional agents with malevolent intent, transforming a social phenomenon into a spiritual or civilizational crisis. This supports previous findings that replacement conspiracy theories present demographic change as an existential threat to a civilization’s core values [1]. This pattern manifests empirically in comments such as "Have you heard of Islamic Jihad? that’s most likely why........Islamization!!" (relig=27.27), where immigration becomes reframed as deliberate religious warfare rather than demographic movement, directly invoking the Eurabia conspiracy framework that portrays Muslim immigration as orchestrated civilizational replacement.</p>
        <p>Power dynamics (d = 0.233): The emphasis on power-related language could reflect the classic conspiratorial view that demographic changes are orchestrated by powerful elites rather than resulting from natural social processes. This finding corroborates studies showing that attribution of agency and intentionality to shadowy power centers is a defining characteristic of conspiracy thinking [<xref ref-type="bibr" rid="ref6">10, 25</xref>]. The linguistic manifestation of this attribution appears in constructions such as "Import third world → become third world" (power=66.67), where the verb "import" transforms organic migration processes into deliberate elite manipulation. This deterministic arrow formulation removes agency from migrants themselves while implying the existence of powerful orchestrators capable of engineering demographic transformation, exemplifying how power-related language shifts explanatory frameworks from socio-economic to conspiratorial causation.</p>
        <p>Conflict framing (d = 0.219): PRCT discourse shows nearly double the rate of conflict terminology compared to standard anti-immigration comments (0.85% vs 0.44%), representing a 95.2% relative increase. This signals how conspiracy theories transform social issues into existential struggles between groups [<xref ref-type="bibr" rid="ref7">26</xref>]. This Manichean framing can serve to legitimize more extreme responses, as demonstrated by Bracke and Aguilar [6]. The militarization of discourse materializes in statements like "aggressively defending our borders from invaders" (conflict=25.00), where immigration policy becomes reconceptualized as warfare requiring defensive military action. The lexical choice of "invaders" transforms migrants from policy subjects into military threats, while "aggressively defending" positions exclusionary responses as legitimate self-defense, illustrating how conflict framing escalates immigration discourse from policy debate to existential combat.</p>
        <p>Negative tone (d = -0.214): The markedly more negative emotional tone of PRCT discourse, with 22.0% lower positive sentiment scores, shows the affective dimension of conspiracy theories. This emotional negativity may function as a mobilizing mechanism, generating moral outrage and urgency [4]. This heightened negativity appears in apocalyptic formulations such as "Most of Europe has been destroyed because of illegal immigrants" (tone_neg=30.00), where the verb "destroyed" escalates beyond policy criticism to civilizational annihilation. The continental scope ("Most of Europe") and direct causal attribution ("because of") exemplify how PRCT discourse employs catastrophic language to transform demographic statistics into existential crisis narratives, intensifying emotional engagement through linguistic extremity.</p>
        <p>These four linguistic markers offer insights for both socio-psychological understanding of conspiracy discourse and the development of computational detection systems, providing empirically grounded features that could enhance automated identification of PRCT content online. While this study isolates Population-Replacement Conspiracy Theories, the four linguistic dimensions we identify (religious sacralization, elite power attribution, conflict framing and negative affect) map closely onto defining features documented in other conspiracy families (e.g., QAnon, anti-vaccination, or Great Reset narratives). Future work can test whether these markers generalize across domains, turning the present fine-grained analysis into a broader framework for detecting conspiratorial escalation in online discourse.</p>
      </sec>
      <sec>
        <title>5.3. Socio-Linguistic Mechanisms in PRCT Discourse: Theoretical Perspectives</title>
      </sec>
      <p>The linguistic patterns identified in our analysis invite broader theoretical reflections on the socio-linguistic mechanisms underlying PRCT discourse. While acknowledging the limitations of drawing definitive conclusions from a single study with an English-language YouTube dataset, the distinctive features we observed suggest several promising avenues for theoretical exploration.</p>
      <p>The high linguistic overlap (89.7%) between PRCT and standard anti-immigration discourse suggests what might be conceptualized as a rhetorical continuum rather than a categorical distinction. This finding resonates with the concept of the Overton window [<xref ref-type="bibr" rid="ref8">27</xref>], the range of politically acceptable discourse at a given time. Rather than emerging as entirely separate discourses, conspiracy narratives may represent incremental shifts along this continuum, potentially facilitating the mainstreaming of fringe ideas through gradual rhetorical transformations.</p>
      <p>Within this continuum, we observe that the significantly higher use of religious terminology in PRCT comments (+115.6%) might reflect the so-called sacralization of collective identity, a process through which political issues are transformed into matters of existential and moral value [<xref ref-type="bibr" rid="ref9">28</xref>]. While our data cannot establish causality, this linguistic pattern aligns with Girard’s (2020) theory of sacred differentiation, where boundaries between in-group and out-group acquire quasi-religious significance. The emphasis on power-related language (+41.4%) in PRCT discourse further connects to what Hofstadter [<xref ref-type="bibr" rid="ref11">30</xref>] termed the paranoid style in political rhetoric, the perception of systematic, malevolent orchestration behind social phenomena. This linguistic pattern may reflect the construction of alternative relevance structures through which events are reframed as evidence of hidden designs [<xref ref-type="bibr" rid="ref12">31</xref>].</p>
      <p>
        Equally notable is the substantial increase in conflict terminology (+95.2%), suggesting a potential militarization of the interpretive frame that transforms political debate into existential struggle. This might create what Bauman [<xref ref-type="bibr" rid="ref13">32</xref>] characterizes as a discursive state of emergency in which exceptional responses become justified by the perception of imminent threat. Such framing represents not merely a rhetorical choice but a fundamental shift in how immigration discourse is conceptualized and processed. These theoretical perspectives collectively suggest several promising directions for
future research. Longitudinal studies could track the evo- generalizability to other platforms and languages where
lution of these linguistic markers over time to understand conspiracy discourse could manifest diferently. The
auhow discursive shifts occur. Comparative analyses across tomatic classification process, though efective with high
diferent languages and cultural contexts would test the agreement scores, inevitably introduces some risk of
misgeneralizability of these patterns, while experimental classification that future work might address through
studies might investigate how exposure to these specific additional validation approaches or multi-platform
comlinguistic features afects audience perceptions and be- parisons.
liefs. It is important to emphasize that these theoretical Regarding data handling, our research relies on
userinterpretations remain speculative based on our limited generated content from public YouTube videos, raising
dataset. The patterns we observed ofer intriguing cor- important privacy considerations. We conducted this
relations, but establishing causal relationships between research in accordance with GDPR Article 9(2)(j) and
these linguistic features and the social mechanisms de- Article 89, which permit processing of potentially
senscribed would require more extensive mixed-methods sitive data for research purposes with appropriate
saferesearch combining computational and qualitative ap- guards. Throughout our analysis, we removed personal
proaches. Nevertheless, these preliminary findings sug- identifiers from collected comments, focused on
aggregest that the subtle linguistic distinctions between con- gate linguistic patterns rather than individual profiles,
spiracy and non-conspiracy discourse may reveal deeper and maintained secure data storage with restricted access.
social and cognitive processes worthy of further inves- Although the YouTube videos themselves remain
pubtigation. Future research might investigate whether the licly accessible, we do not publish the raw comment data
transition from mainstream to conspiratorial discourse openly to protect user privacy. Researchers interested in
follows predictable linguistic trajectories, and how im- accessing the dataset for scientific purposes may contact
migration discourse becomes embedded within broader the authors with appropriate research ethics
documencivilizational or existential frames. tation, with any data sharing conducted in compliance
with GDPR and relevant national regulations.
      </p>
      <p>This research also raises broader ethical questions
6. Conclusion about the study and identification of conspiracy
theories online. While identifying linguistic markers of
potentially harmful content could facilitate better content
moderation, we recognize the complex balance between
reducing harmful misinformation and protecting
legitimate discourse. The high linguistic overlap (89.7%)
between conspiracy and non-conspiracy anti-immigration
discourse underscores the subtlety of these distinctions
and the risks of over-moderation based solely on
automated detection. Our findings should be interpreted as
identifying patterns across large samples, not as
definitive classifiers for individual comments. This complexity
highlights the importance of human oversight in content
moderation systems that might leverage these linguistic
insights.</p>
      <sec id="sec-4-1">
        <title>This study advances both methodological and theoretical</title>
        <p>fronts. RQ1 asked whether DeepSeek-v3 can reliably
detect PRCT content with minimal examples; our
validation on a 500-comment gold set (§3) confirms 94.5 %
accuracy (balanced precision/recall), demonstrating that
a LLM in a few-shot regime is adequate for this task.</p>
        <p>RQ2 examined whether PRCT comments exhibit distinct
psycho-linguistic patterns; the comparison revealed four
robust markers—religious references, power dynamics,
conflict framing and negative tone—that systematically
diferentiate PRCT from standard anti-immigration
discourse.</p>
        <p>While 89.7 % of linguistic features are shared between
conspiracy and non-conspiracy anti-immigration
comments, the four PRCT-specific dimensions remain stable
and interpretable. These findings underscore that con- Acknowledgments
spiracy narratives often intensify, rather than abandon,
mainstream rhetorical frames, and they provide empiri- This research was conducted as part of a larger project
cally grounded cues for automated moderation systems. focused on detecting disinformation such as conspiracy
theories in online discourse. The authors would like to
thank their supervisors and colleagues for their guidance
7. Limitations and Ethical and support throughout this research. We are particularly
Considerations grateful to Katarina Laken for her valuable contributions
and insightful advice. This work was supported by the
While our study reveals significant linguistic patterns HYBRIDS project, which has received funding from the
in PRCT discourse, several limitations and ethical con- European Union’s Horizon Europe research and
innosiderations warrant discussion. Our analysis focuses on vation programme under the Marie Skłodowska-Curie
English-language YouTube comments, which may limit Grant Agreement No. 101073351 and from the UK
Research and Innovation (UKRI) Horizon Europe funding [9] M. Hunter, T. Grant, Is linguistic inquiry and
guarantee (Grant Number: EP/X036758/1). The work is word count (liwc) reliable, eficient, and efective
partially supported by the Portuguese Science Founda- for the analysis of large online datasets in forensic
tion as part of the projects CEECIND/ 01997/2017 and and security contexts?, Applied Corpus
LinguisUIDP/00057/2025. The content of this work reflects only tics 5 (2025) 100118. doi:10.1016/j.acorp.2025.
the authors’ view and the funding agencies are not re- 100118.
sponsible for any use that may be made of the information [10] A. Platt, J. Brown, A. Venske, Toward detecting
it contains. conspiracy language in misinformation documents,
in: Proceedings of the 2022 Computers and
People Research Conference (SIGMIS–CPR ’22), 2022.</p>
        <p>References doi:10.1145/3510606.3551895.
[11] J. W. Pennebaker, The secret life of pronouns, New
[1] M. Ekman, The great replacement: Strategic main- Scientist 211 (2011) 42–45.</p>
        <p>streaming of far-right conspiracy claims, Conver- [12] T. Vergho, J.-F. Godbout, R. Rabbany, K. Pelrine,
gence 28 (2022) 1127–1143. Comparing gpt-4 and open-source language
mod[2] M. Sedgwick, The great replacement narrative: Fear, els in misinformation mitigation, arXiv preprint
anxiety and loathing across the west, Politics, Reli- arXiv:2401.06920 (2024).
gion &amp; Ideology 25 (2024) 548–562. doi:10.1080/ [13] A. Kumar, R. Sharma, P. Bedi, Towards optimal nlp
21567689.2024.2424790. solutions: analyzing gpt and llama-2 models across
[3] E. B. Marino, J. M. Benitez-Baleato, A. S. Ribeiro, model scale, dataset size, and task diversity,
EngiThe polarization loop: How emotions drive propa- neering, Technology &amp; Applied Science Research
gation of disinformation in online media—the case 14 (2024) 14219–14224.
of conspiracy theories and extreme right move- [14] A. Etaywe, K. Macfarlane, M. Alazab, A
cybertments in southern europe, Social Sciences 13 (2024) errorist behind the keyboard: An automated text
603. analysis for psycholinguistic profiling and threat
[4] M. Obaidi, J. R. Kunst, S. Ozer, S. Y. Kimel, The assessment, Journal of Language Aggression and
“great replacement” conspiracy: How the per- Conflict (2024).
ceived ousting of whites can evoke violent extrem- [15] H. A. P. Hernaiz, Competing explanations of global
ism and islamophobia, Group Processes &amp; Inter- evils: Theodicy, social sciences, and conspiracy
thegroup Relations 25 (2021) 1675–1695. doi:10.1177/ ories, AGLOS: journal of area-based global studies
13684302211028293. 2 (2011) 27.
[5] M. Davis, Violence as method: The “white replace- [16] D. Bassi, M. J. Maggini, R. Vieira, M. Pereira-Fariña,
ment”, “white genocide”, and “eurabia” conspiracy A pipeline for the analysis of user interactions in
theories and the biopolitics of networked violence, youtube comments: A hybridization of llms and
Ethnic and Racial Studies (2024). doi:10.1080/ rule-based methods, in: 2024 11th International
01419870.2024.2304640, advance online pub- Conference on Social Networks Analysis,
Managelication. ment and Security (SNAMS), 2024, pp. 146–153.
[6] S. Bracke, L. M. H. Aguilar, The politics of
replacement: from “race suicide” to the “great replace- [17] dYo.iR:
1.0T.a1u1sc0z9ik/,SJN.AWM.SP6e4n3n1e6b.a2k0e2r,4.T1h0e8p8s3y7c8h1o.logment”, in: The politics of replacement, Routledge, ical meaning of words: Liwc and computerized
2023, pp. 1–19. text analysis methods, Journal of Language and
[7] S. Shahsavari, T. R. Tangherlini, B. Shahbazi, Social Psychology 29 (2010) 24–54. doi:10.1177/
E. Ebrahimzadeh, V. Roychowdhury, An
automated pipeline for the discovery of conspiracy and [18] S0.26J.19B2a7rnXe0s9,351S6t7u6ck. in the past or living in
conspiracy-theory narrative frameworks, PLOS the present? temporal focus and the spread
ONE 15 (2020) e0233879. doi:10.1371/journal. of covid-19, Social Science &amp; Medicine 280
[8] pMo.nSea.m0o2r3y3, 8T7.9M.itra, Conspiracies online: User (2021) 114057. doi:https://doi.org/10.1016/
discussions in a conspiracy community follow- [19] jR..sLo.cBsocyidm,eAd..A20sh2o1k.k1u1m40a5r,7S.. Seraj, J. W.
Pening dramatic events, in: Proceedings of the nebaker, The development and psychometric
propTwelfth International Conference on Web and So- erties of liwc-22, Austin, TX: University of Texas
cial Media, ICWSM 2018, Stanford, California, at Austin 10 (2022) 1–47. URL: https://www.liwc.
USA, June 25-28, 2018, AAAI Press, 2018, pp. 340– app/static/documents/LIWC-22%20Manual%20-%
349. URL: https://aaai.org/ocs/index.php/ICWSM/ 20Development%20and%20Psychometrics.pdf.
ICWSM18/paper/view/17907. [20] E. Kacewicz, J. W. Pennebaker, M. Davis, M. Jeon,</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Appendix</title>
      <sec id="sec-5-1">
        <title>YouTube Videos Used in Dataset Collection</title>
        <p>• Chinese migrants are fastest growing group crossing into U.S. from Mexico
youtube.com/watch?v=M7TNP2OTY2g
• Native American Shuts Down Immigration Protest
youtube.com/watch?v=2utsjsWOWUA
• Migrants evade Texas floating barrier
youtube.com/watch?v=2i8n6jCH1S4</p>
      </sec>
      <sec id="sec-5-2">
        <title>Declaration on Generative AI</title>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <source>Psychology</source>
          <volume>33</volume>
          (
          <year>2014</year>
          )
          <fpage>125</fpage>
          -
          <lpage>143</lpage>
          . doi:https://doi.org/10.1177/0261927X13502654.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>R. L.</given-names>
            <surname>Moore</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.-J.</given-names>
            <surname>Yen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. E.</given-names>
            <surname>Powers</surname>
          </string-name>
          ,
          <article-title>Exploring the relationship between clout and cognitive processing in mooc discussion forums</article-title>
          ,
          <source>British Journal of Educational Technology</source>
          <volume>52</volume>
          (
          <year>2021</year>
          )
          <fpage>482</fpage>
          -
          <lpage>497</lpage>
          . doi:https://doi.org/10.1111/bjet.13033.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>K. K.</given-names>
            <surname>Aldous</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>An</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. J.</given-names>
            <surname>Jansen</surname>
          </string-name>
          ,
          <article-title>Measuring 9 emotions of news posts from 8 news organizations across 4 social media platforms for 8 months</article-title>
          ,
          <source>Trans. Soc. Comput.</source>
          <volume>4</volume>
          (
          <year>2022</year>
          ). URL: https://doi.org/10.1145/3516491. doi:10.1145/3516491.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>L.</given-names>
            <surname>Plonsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. L.</given-names>
            <surname>Oswald</surname>
          </string-name>
          ,
          <article-title>How big is “big”? interpreting effect sizes in l2 research</article-title>
          ,
          <source>Language learning</source>
          <volume>64</volume>
          (
          <year>2014</year>
          )
          <fpage>878</fpage>
          -
          <lpage>912</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>R.</given-names>
            <surname>Wei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Xiong</surname>
          </string-name>
          ,
          <article-title>Effect size reporting practices in applied linguistics research: A study of one major journal</article-title>
          ,
          <source>Sage Open</source>
          <volume>9</volume>
          (
          <year>2019</year>
          )
          <fpage>2158244019850035</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>R.</given-names>
            <surname>Brotherton</surname>
          </string-name>
          ,
          <article-title>Suspicious minds: Why we believe conspiracy theories</article-title>
          ,
          <source>Bloomsbury Publishing</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>M.</given-names>
            <surname>Barkun</surname>
          </string-name>
          ,
          <article-title>A culture of conspiracy: Apocalyptic visions in contemporary America</article-title>
          , volume
          <volume>15</volume>
          , Univ of California Press,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>N. J.</given-names>
            <surname>Russell</surname>
          </string-name>
          ,
          <article-title>An introduction to the overton window of political possibilities</article-title>
          ,
          <source>Mackinac Center for Public Policy</source>
          <volume>4</volume>
          (
          <year>2006</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>E.</given-names>
            <surname>Durkheim</surname>
          </string-name>
          ,
          <article-title>Suicide: A study in sociology</article-title>
          , Routledge,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>R.</given-names>
            <surname>Girard</surname>
          </string-name>
          , Il capro espiatorio,
          <source>Adelphi Edizioni spa</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>R.</given-names>
            <surname>Hofstadter</surname>
          </string-name>
          ,
          <article-title>The paranoid style in American politics</article-title>
          , Vintage,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>E.</given-names>
            <surname>Goffman</surname>
          </string-name>
          ,
          <article-title>Frame analysis: An essay on the organization of experience</article-title>
          , Harvard University Press,
          <year>1974</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Bauman</surname>
          </string-name>
          , Retrotopia, Revista Española de Investigaciones Sociológicas (REIS)
          <volume>163</volume>
          (
          <year>2018</year>
          )
          <fpage>155</fpage>
          -
          <lpage>158</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <article-title>Denmark Is Leading Europe's Anti-Immigration Policies</article-title>
          , youtube.com/watch?v=zpkBKEPxze4
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <article-title>Venezuelan Immigrant: 'I Regret Having Come to the United States'</article-title>
          , youtube.com/watch?v=3FPbZcVLTBI
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <article-title>Migrant group attempts mass entry into US at Mexico border</article-title>
          , youtube.com/watch?v=h_TqO9EqMhY
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <article-title>Norway's Muslim immigrants attend classes on western attitudes to women</article-title>
          , youtube.com/watch?v=oKY600o3CXw
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <article-title>Why does Sweden no longer wants immigrants?</article-title>
          , youtube.com/watch?v=5CSUimZjiI0
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          <article-title>How Sweden is Destroyed by the Immigration Crisis</article-title>
          , youtube.com/watch?v=rUw4cs2MHwc
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          <article-title>Migrant crisis reaches boiling point on Staten Island</article-title>
          , youtube.com/watch?v=-LDra78ksTo
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          <article-title>"Deportation, not relocation!" Poland votes on illegal migration</article-title>
          , youtube.com/watch?v=x4afwGepMkM
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          <article-title>Students Say Obama Immigration Quote Is Racist... When They Think It's From Trump</article-title>
          , youtube.com/watch?v=Vj9IxVlLRl0
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          <article-title>US' illegal immigrants crisis: Elon Musk visits Texas</article-title>
          , youtube.com/watch?v=2_iYuiHyzKQ
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          <article-title>Migrant beats resident, steals flag from NY home</article-title>
          , youtube.com/watch?v=FTXZmor6KBY
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>