<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Year Article count “virtual”-embedding count Cleaned tokens count</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Tracing the Development of the Virtual Particle Concept Using Semantic Change Detection</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Michael Zichert</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Adrian Wüthrich</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>History and Philosophy of Modern Science, Technische Universität Berlin</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>1924</year>
      </pub-date>
      <volume>3</volume>
      <fpage>848</fpage>
      <lpage>868</lpage>
      <abstract>
        <p>Virtual particles are peculiar objects. They figure prominently in much of theoretical and experimental research in elementary particle physics. But exactly what they are is far from obvious. In particular, to what extent they should be considered “real” remains a matter of controversy in philosophy of science. Their origin and development have also only recently come into the focus of scholarship in the history of science. In this study, we propose using the intriguing case of virtual particles to discuss the efficacy of Semantic Change Detection (SCD) based on contextualized word embeddings from a domain-adapted BERT model in studying specific scientific concepts. We find that the SCD metrics align well with qualitative research insights in the history and philosophy of science, as well as with the results obtained from Dependency Parsing to determine the frequency and connotations of the term “virtual”. Still, the metrics of SCD provide additional insights over and above the qualitative research and the Dependency Parsing. Among other things, the metrics suggest that the concept of the virtual particle became more stable after 1950 but at the same time also more polysemous.</p>
        <p>Keywords: semantic change detection, digital conceptual history, history and philosophy of science, virtual particle</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Virtual particles have long been important elements of particle physics. But despite their
widespread use, the term “virtual particle” holds different meanings and connotations within
today’s particle physics community, and its historical origins and development have remained
unclear. Virtual particles are peculiar objects which may be considered responsible for the
fundamental interactions of matter and radiation. In this sense, they have detectable and real
effects. However, they do not share the properties of real particles; for instance, the mass and
energy of a virtual particle do not stand in the same relation as they would for a
particle that is observed in the appropriate detectors. Virtual particles only ever occur in
intermediate, unobservable phases of decays or other processes involving elementary particles. The
precise meaning and interpretation of the term “virtual particle” has, therefore, been a topic of
philosophical debate [33]. Recent works by Ehberger [14] and Martinez [
        <xref ref-type="bibr" rid="ref29">27</xref>
        ] have shed
considerable light on the associated historical issues concerning the origin and development of the
virtual particle concept. Additional studies on the conceptual shift due to Feynman diagrams
and the associated calculation schemes have highlighted the relevance of virtual particles in
the evolution of theoretical and experimental particle physics [
        <xref ref-type="bibr" rid="ref7">20, 35, 7</xref>
        ]. While valuable, these
studies are limited by their focus on carefully selected texts. Here, we aim to go beyond case
studies and gain a more comprehensive view of the development of the concept of the virtual
particle by analyzing a large dataset over an extended period of time.
      </p>
      <p>
        To achieve this, we combine conceptual history with computational methods, an approach
also referred to as digital Begriffsgeschichte [34]. First, we adapt our BERT model to the
domain-specific language of our large corpus of physics texts and extract contextualized word
embeddings for all occurrences of the term “virtual”. These word embeddings can then be used for
Semantic Change Detection (SCD), which aims to identify, interpret and assess shifts
in lexical meaning over time using computational techniques. SCD has emerged as a distinct
research field in recent years, supported by multiple survey studies [e.g., 30, 32]. While most
studies focus on the technical implementation of SCD, there have also been calls for further
evaluation of the methods through in-depth case studies backed by qualitative analysis [
        <xref ref-type="bibr" rid="ref21 ref9">29, 21</xref>
        ].
We hope to provide such a case study with this paper. To this end, we employ various SCD
metrics to trace the origin, usage, and evolution of the concept of the virtual particle from a
historical perspective, with special focus on the change in dominant meaning of the term “virtual”
as well as its degree of polysemy, i.e., the coexistence of multiple meanings for a single word
form. For instance, the meaning of “virtual” in the context of “reality” differs from its meaning
in the context of “particle.” In order to enable a thorough evaluation of our results, we also use
Dependency Parsing, thereby gaining a deeper understanding of the observed semantic shifts.<sup>1</sup>
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Dataset</title>
      <sec id="sec-2-1">
        <title>2.1. Physical Review corpus</title>
        <p>
          Our dataset consists of a large number of scientific articles from eight journals of the
Physical Review-family. The corpus spans from the introduction of the concept of virtuality in
quantum physics in 1924 up to 2022, the latest complete year available for analysis, making it
well-suited for studying the history of the virtual particle. The PR journals are highly
influential in the field of physics [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] and qualitative investigations [
          <xref ref-type="bibr" rid="ref29">14, 27</xref>
          ] confirm their pivotal role
in the emergence and establishment of the virtual particle concept, with several key articles
on the topic published in these journals [e.g., 6, 15, 12]. Through an agreement between our
research project and the American Physical Society (APS), we have access to all normally
restricted full texts, metadata, and citation data from this period [1]. We include eight relevant
journals in our analysis: PR - Series II (all of physics until 1969), Reviews of Modern Physics
(long review articles with broad disciplinary scope, since 1929), PR - Letters (short articles with
high impact and broad disciplinary scope, since 1958), PR - A (covering atomic, molecular and
optical physics, since 1970), PR - B (condensed matter and materials physics, since 1970), PR - C
(nuclear physics, since 1970), PR - D (particle physics, field theory, gravitation, and cosmology,
since 1970), and PR - E (statistical, nonlinear, biological and soft matter physics, since 1993). To
focus on long-term trends, newer journals are excluded from the analysis.
          <fn><label>1</label><p>The code used in this study is available at https://github.com/mZichert/scd_vp. Due to copyright restrictions, the
dataset and the domain-adapted BERT model used in the study are not available for public release.</p></fn>
        </p>
        <p>The dataset’s substantial size, comprising nearly 700,000 articles, makes it well-suited for
extensive analysis using computational methods. However, it also presents notable limitations,
particularly concerning the early development of the concept. As a primarily US-based source
written exclusively in English, significant developments from other regions are not captured.
For instance, the center of the old quantum theory in the 1920s and early 1930s was in Central
Europe, particularly in German-speaking countries, the Netherlands, and Denmark. Since physicists
there published mostly in German journals, most of their works are not a direct part of this study.
Another issue is the relatively small number of articles in the corpus published before 1950
(approximately 12,000 articles or just under 2 percent). For a more comprehensive analysis
of the early phase of the concept, it would be necessary to incorporate additional text
sources.</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Data preprocessing</title>
        <p>Analyzing articles in the entire corpus using word embeddings is impractical due to scalability
issues. Instead, we first identify articles potentially relevant to the concept of the virtual
particle through a keyword search for “virtual” in the full texts, abstracts, and titles. Approximately
half of the full texts are available as digitized and OCR-processed PDF files (331,210 entries
before 2004), while the other half are in native digital XML format (329,880 entries from 2004
onwards). For processing the PDF files we use GROBID<sup>2</sup>, which allows parsing and
restructuring of scientific publications in PDF format into uniformly TEI-formatted<sup>3</sup> XML files. To catch
common OCR errors prevalent in the PDF-extracted text data, we apply some basic cleaning steps,
such as removing special characters. Subsequently, citations and mathematical formulas are
also removed from the text. While the formulas used likely reflect significant developments
in the conceptualization of the virtual particle, there are currently no established tools for the
content analysis of mathematical formulas in the context of conceptual history and the history
of science.<sup>4</sup> Therefore, this work focuses on the analysis of linguistic text data.</p>
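        <p>The keyword filter and the basic cleaning steps described above can be illustrated with a minimal sketch. The function names, the regular expressions, and the set of characters treated as “special” are assumptions made for illustration; the exact cleaning rules of the study are not specified here.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import re

# Hypothetical patterns; the study's actual cleaning rules may differ.
KEYWORD = re.compile(r"\bvirtual\b", re.IGNORECASE)
CITATION = re.compile(r"\[\s*\d+(?:\s*[,–-]\s*\d+)*\s*\]")  # e.g. [12] or [12, 13]
NON_TEXT = re.compile(r"[^\w\s.,;:!?()'-]")                  # unusual characters
WHITESPACE = re.compile(r"\s+")


def mentions_virtual(*fields):
    """Return True if any field (title, abstract, full text) contains 'virtual' as a word."""
    return any(KEYWORD.search(field or "") for field in fields)


def clean_text(text):
    """Drop bracketed numeric citations and unusual characters, then collapse whitespace."""
    text = CITATION.sub(" ", text)
    text = NON_TEXT.sub(" ", text)
    return WHITESPACE.sub(" ", text).strip()
```
]]></preformat>
        <p>In this sketch the word-boundary pattern keeps “virtual” but skips “virtually”; whether derived forms should count is a design choice that the text above leaves open.</p>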
        <p>
          To ensure the efficient use of the BERT model, the texts are segmented into sentences. For
this task, we utilize the large language model of the Python natural language processing library
SciSpaCy<sup>5</sup>, which has been trained on a large corpus of scientific texts (albeit in biomedicine),
making it suited for this purpose. We also use the model for dependency parsing, where a
sentence’s syntactic structure is derived by identifying how words are grammatically related
through directed links. This is particularly helpful for analyzing adjectives like “virtual”, as it
allows for accurate identification of the associated nouns. We use these dependencies to evaluate
and gain a deeper understanding of the observed semantic shifts. Following Laicher, Kurtyigit,
Schlechtweg, Kuhn, and Schulte im Walde [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ], we do not employ further preprocessing steps,
such as lemmatization, as they do not seem to improve SCD in English texts. After data
preparation, our corpus consists of 126,540 occurrences of “virtual”, spread across 41,786 articles.
          <fn><label>2</label><p>GROBID stands for GeneRation Of BIbliographic Data (https://grobid.readthedocs.io/en/latest/).</p></fn>
          <fn><label>3</label><p>https://tei-c.org/</p></fn>
          <fn><label>4</label><p>We consider this an important open problem in semantic change detection in scientific texts. Also, it is hard
to estimate the impact of the omission of mathematical formulas. On the one hand, the symbols used in the
formulas are usually explained in the surrounding text (which we do take into account). On the other hand,
different mathematical formulas may describe different virtual entities (particles, states, processes etc.) without
clear indications of this in the surrounding text. Moreover, as one of the anonymous reviewers pointed out, the
frequency of formulas might have changed over time, which makes the omission potentially more or less impactful.
We did not control for this.</p></fn>
          <fn><label>5</label><p>https://allenai.github.io/scispacy/</p></fn>
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Methods</title>
      <sec id="sec-3-1">
        <title>3.1. BERT and domain adaptation</title>
        <p>
          For Semantic Change Detection using BERT [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ], fine-tuning for downstream tasks is
unnecessary, as the focus is on the learned word representations, i.e., the contextualized word
embeddings themselves. Instead, BERT is adapted to the domain-specific language through
retraining, a process known as domain adaptation. This involves reapplying Masked Language
Modeling, enabling the model to learn the linguistic nuances and specialized terminology of
the target domain. Domain adaptation is particularly crucial for this study, as the dataset
comprises highly specialized scientific texts in physics. At the time of conducting our analysis,
no suitable large language models specifically trained on general physics text data were
available. However, there are two models trained on specific sub-domains of physics: astroBERT<sup>6</sup>
for astrophysics [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ] and Astro-HEP-BERT<sup>7</sup> for astrophysics and (recent) high energy physics
[
          <xref ref-type="bibr" rid="ref33">31</xref>
          ].<sup>8</sup> For a comprehensive overview of scientific large language models, including those in
the domain of physics, see Zhang, Chen, Jin, Wang, Ji, Wang, and Han [
          <xref ref-type="bibr" rid="ref7">37</xref>
          ].
        </p>
        <p>
          We therefore employ the BERT-base-uncased model<sup>9</sup>, which features 12 attention layers
and a hidden layer size of 768, and apply domain adaptation on our “virtual” corpus. We also
retrained and tested SciBERT [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]<sup>10</sup>, which is primarily trained on scientific texts from biomedicine,
but found that the re-trained BERT-base performs slightly better in terms of training and
validation loss. Regarding time-specific fine-tuning, we follow the findings from Martinc, Novak,
and Pollak [26], indicating that BERT’s word embeddings are already well-suited to their
temporal context due to their context-dependent nature. For inference, the segmented sentences
are fed into the model with a maximum sequence length of 512 tokens, and the sum of the last
four layers is extracted for each token. For words comprising multiple subword tokens, the
average embedding is stored. Given the contextual embeddings, each token occurrence results
in one embedding vector. To reduce disk storage requirements, embeddings are saved only for
meaningful words, excluding stop words, numbers, and special characters.
        </p>
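        <p>The pooling described above (summing the last four layers per token, then averaging the subword tokens of a word) can be sketched in plain Python. The data layout is an assumption: <monospace>hidden_states</monospace> is taken to be a list of layers, each a list of token vectors, and <monospace>word_ids</monospace> a hypothetical list that maps each token to the index of its word (or None for special tokens). This is an illustration, not the code used in the study.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def sum_last_four_layers(hidden_states):
    """Return one vector per token: the element-wise sum over the last four layers.

    hidden_states: list of layers; each layer is a list of token vectors (assumed layout).
    """
    last_four = hidden_states[-4:]
    n_tokens = len(last_four[0])
    return [
        [sum(values) for values in zip(*(layer[i] for layer in last_four))]
        for i in range(n_tokens)
    ]


def average_subwords(token_vectors, word_ids):
    """Average the vectors of all subword tokens that belong to the same word.

    word_ids[i] is the index of the word that token i belongs to, or None
    for special tokens, which are skipped.
    """
    grouped = {}
    for vector, word_id in zip(token_vectors, word_ids):
        if word_id is not None:
            grouped.setdefault(word_id, []).append(vector)
    return {
        word_id: [sum(column) / len(vectors) for column in zip(*vectors)]
        for word_id, vectors in grouped.items()
    }
```
]]></preformat>
        <p>A word that is split into two subword tokens thus receives a single embedding, the mean of its two summed-layer vectors.</p>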
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Semantic Change Detection</title>
        <sec id="sec-3-2-1">
          <title>3.2.1. General workflow</title>
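          <p>The basic procedure of Semantic Change Detection (SCD) can be outlined as follows: We are given a
diachronic corpus of documents <inline-formula><tex-math>C = \bigcup_{t=1}^{T} C_t</tex-math></inline-formula>, where <inline-formula><tex-math>C_t</tex-math></inline-formula> represents the subcorpus of documents
at time <inline-formula><tex-math>t</tex-math></inline-formula> within the overall investigation period <inline-formula><tex-math>[1, \dots, T]</tex-math></inline-formula> that contain the target word <inline-formula><tex-math>w</tex-math></inline-formula>. The
goal of SCD is to quantify the semantic shift <inline-formula><tex-math>s_w</tex-math></inline-formula> for <inline-formula><tex-math>w</tex-math></inline-formula> between two time-specific subcorpora <inline-formula><tex-math>C_t</tex-math></inline-formula>
and <inline-formula><tex-math>C_{t'}</tex-math></inline-formula> or across the entire corpus. There are two ways a semantic shift can manifest: firstly,
as a change in the dominant meaning of a term, or secondly, as a change in the degree of its
polysemy. Both aspects will be analyzed in this study. Specifically, for our purposes the target
word is “virtual”, the documents comprise all the full texts plus abstracts of the PR corpus that
contain “virtual”, and the time interval is one year.</p>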
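          <p>
            The generalized workflow required for performing contextualized SCD can be split into
three steps. In the first step (Embedding), contextualized word embeddings are generated for
each occurrence of the target word in the corpus using a large language model like BERT. The
set of all these embeddings in the time-specific subcorpus <inline-formula><tex-math>C_t</tex-math></inline-formula> is expressed as
            <disp-formula><tex-math>\Phi_w^t = \{ e_{w,1}^t, \dots, e_{w,N}^t \},</tex-math></disp-formula>
where <inline-formula><tex-math>e_{w,i}^t</tex-math></inline-formula> represents a contextualized word embedding in the subcorpus and <inline-formula><tex-math>N</tex-math></inline-formula> denotes the
number of all occurrences of <inline-formula><tex-math>w</tex-math></inline-formula> in it. In the second step (Aggregation), the embeddings of
a time period <inline-formula><tex-math>\Phi_w^t</tex-math></inline-formula> are aggregated to represent the time-specific meanings of <inline-formula><tex-math>w</tex-math></inline-formula>. Two types of
representations are defined. Form-based approaches examine the high-level properties of the
target word per time period by looking directly at the dominant sense of a word or the degree
of polysemy. When considering the dominant meaning, word prototypes <inline-formula><tex-math>\mu_w^t</tex-math></inline-formula> can be generated
for each time interval, representing the average of all embeddings in <inline-formula><tex-math>\Phi_w^t</tex-math></inline-formula> and thus providing an
aggregated representation of the semantic properties of the target word in <inline-formula><tex-math>C_t</tex-math></inline-formula>. When looking at
polysemy at the high level, the aggregation step is usually skipped and the semantic shift of <inline-formula><tex-math>w</tex-math></inline-formula> is
measured by directly comparing the degree of polysemy in the time-specific sets of embeddings
<inline-formula><tex-math>\Phi_w^t</tex-math></inline-formula> and <inline-formula><tex-math>\Phi_w^{t'}</tex-math></inline-formula>. Sense-based approaches, in contrast, attempt to first capture the different
time-specific senses or meanings of the target word in <inline-formula><tex-math>C_t</tex-math></inline-formula> using clustering methods. Each
time-specific meaning corresponds to a cluster of embeddings <inline-formula><tex-math>\phi_{w,k}^t</tex-math></inline-formula> in the set of embeddings <inline-formula><tex-math>\Phi_w^t</tex-math></inline-formula>.
            <fn><label>8</label><p>Recently, PhysBERT [
            <xref ref-type="bibr" rid="ref19">19</xref>
            ] was released, having been pre-trained on a large corpus of 1.2 million arXiv papers across
various sub-fields of physics. While the model appears promising for our use case, it was released too late to be
included in our study.</p></fn>
            <fn><label>9</label><p>https://huggingface.co/google-bert/bert-base-uncased</p></fn>
            <fn><label>10</label><p>https://huggingface.co/allenai/scibert_scivocab_uncased</p></fn>
          </p>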
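          <p>The form-based aggregation into word prototypes, i.e., the average of all embeddings of the target word in one time period, can be sketched as follows. The function names and the dictionary layout (year mapped to a list of vectors) are assumptions for illustration only.</p>
          <preformat preformat-type="code"><![CDATA[
```python
def word_prototype(embeddings):
    """Return the element-wise mean of a non-empty list of equal-length vectors."""
    if not embeddings:
        raise ValueError("need at least one embedding")
    n = len(embeddings)
    return [sum(column) / n for column in zip(*embeddings)]


def prototypes_by_year(embeddings_by_year):
    """Map each year to the prototype of its embeddings (hypothetical layout)."""
    return {
        year: word_prototype(vectors)
        for year, vectors in sorted(embeddings_by_year.items())
    }
```
]]></preformat>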
          <p>We apply two clustering methods to identify meaning clusters. In K-Means Clustering
(KM), embeddings are organized into a predefined number of clusters by iteratively
updating cluster centers until they are stable. Determining the optimal number of clusters is challenging;
automated methods like the silhouette coefficient often fail to identify the actual number of
meaning clusters [25]. Therefore, we set the number of clusters to <inline-formula><tex-math>K = 10</tex-math></inline-formula> based on qualitative
assessment. Affinity Propagation (AP) identifies exemplars among data points and forms
clusters without the need to pre-specify their number by iteratively exchanging “messages”
between data points to determine the clusters. However, the number of clusters often correlates
with the number of input embeddings rather than actual meanings, potentially resulting in a
large number of clusters [
            <xref ref-type="bibr" rid="ref30">28</xref>
            ]. Another drawback of AP is its high computational complexity
of <inline-formula><tex-math>O(N^2)</tex-math></inline-formula>. In our study, both clustering methods are applied to the entire corpus; however, it
would also be feasible to employ time-specific clustering. In order to make the clusters usable
for SCD, we then calculate the probability distribution of the clusters, i.e., the cluster
distribution <inline-formula><tex-math>P_w^t</tex-math></inline-formula>. The cluster distribution consists of the individual probabilities <inline-formula><tex-math>p_{w,k}^t</tex-math></inline-formula>, which indicate
the frequency with which a specific embedding <inline-formula><tex-math>e_{w,i}^t</tex-math></inline-formula> from the total set of embeddings <inline-formula><tex-math>\Phi_w^t</tex-math></inline-formula> can
be assigned to a particular cluster <inline-formula><tex-math>\phi_{w,k}^t</tex-math></inline-formula>. It is defined as follows:
            <disp-formula><tex-math>P_w^t = [\, p_{w,1}^t, p_{w,2}^t, \dots, p_{w,K}^t \,], \quad \text{where } p_{w,k}^t = \frac{|\phi_{w,k}^t|}{|\Phi_w^t|}.</tex-math></disp-formula>
          </p>
          <p>These time-specific representations are then compared in the
final step (Assessment) to determine the extent of the semantic shift <inline-formula><tex-math>s_w</tex-math></inline-formula>. The methods used
to quantify this shift, split into those measuring the semantic shift for polysemy and those for
dominant meaning, will be introduced in the following sections. Table 1 provides an overview of
the notations used in this study.
          </p>
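          <p>To make the sense-based aggregation concrete, the following sketch pairs a very small K-Means implementation with the cluster distribution defined above. It is illustrative only: the deterministic initialization (the first K points serve as initial centers) is an assumption chosen for reproducibility, and a production analysis would more likely rely on an established clustering library.</p>
          <preformat preformat-type="code"><![CDATA[
```python
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; the first k points are the initial centers (an assumption)."""
    if k > len(points):
        raise ValueError("k must not exceed the number of points")
    centers = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments are stable: converged
            break
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


def cluster_distribution(labels, k):
    """Share of embeddings per cluster: p_k = |cluster k| / |all embeddings|."""
    counts = Counter(labels)
    return [counts.get(c, 0) / len(labels) for c in range(k)]
```
]]></preformat>
          <p>When clustering is run once on the entire corpus, as in the setup described above, the distribution for a given year would be obtained by passing only the labels of that year's embeddings to <monospace>cluster_distribution</monospace>.</p>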
        </sec>
        <sec id="sec-3-2-2">
          <title>3.2.2. Polysemy</title>
          <p>We apply two methods to quantify the temporal development of a term’s polysemy. The
first method is Shannon entropy <inline-formula><tex-math>H(P_w^t)</tex-math></inline-formula>, which utilizes the cluster distribution to describe
the degree of uncertainty in the distribution of embeddings across meaning clusters within a
given time period. Specifically, Shannon entropy quantifies the average amount of information
needed to assign a particular embedding, i.e., an occurrence of the term “virtual”, to a specific
cluster, i.e., a specific meaning of the term “virtual”. A higher value of <inline-formula><tex-math>H(P_w^t)</tex-math></inline-formula> indicates a higher
degree of polysemy, as there is greater uncertainty or variability in the cluster membership of
the embeddings [
            <xref ref-type="bibr" rid="ref3">3, 17</xref>
            ]. To ensure comparability of entropy values across different time
periods, we use the normalized Shannon entropy <inline-formula><tex-math>\eta(P_w^t)</tex-math></inline-formula>, which ranges from 0 to 1 and is defined as
follows:
            <disp-formula><tex-math>\eta(P_w^t) = \frac{H(P_w^t)}{\log(K)}, \quad \text{where } H(P_w^t) = -\sum_{p_{w,k}^t \in P_w^t} p_{w,k}^t \log(p_{w,k}^t).</tex-math></disp-formula>
          </p>
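          <p>A minimal sketch of the normalized Shannon entropy over a cluster distribution follows. The handling of empty clusters (zero probability contributes nothing) and of the degenerate single-cluster case (defined as 0 here) are assumptions of this sketch.</p>
          <preformat preformat-type="code"><![CDATA[
```python
import math


def shannon_entropy(distribution):
    """H(P) = -sum(p * log(p)); clusters with zero probability contribute nothing."""
    return sum(-p * math.log(p) for p in distribution if p > 0)


def normalized_entropy(distribution):
    """eta(P) = H(P) / log(K), with K the number of clusters.

    For K = 1 the ratio is undefined; this sketch returns 0.0 in that case.
    """
    k = len(distribution)
    if k < 2:
        return 0.0
    return shannon_entropy(distribution) / math.log(k)
```
]]></preformat>
          <p>A uniform distribution over all clusters yields the maximum value of 1, while a distribution concentrated on a single cluster yields 0, matching the reading that higher values indicate greater polysemy.</p>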
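          <p>
            The second method, Average Inner Distance (AID), utilizes the variance of the
contextualized word embeddings <inline-formula><tex-math>\Phi_w^t</tex-math></inline-formula>, reflecting the degree of polysemy of <inline-formula><tex-math>w</tex-math></inline-formula> in <inline-formula><tex-math>C_t</tex-math></inline-formula>. In this approach,
embeddings are not aggregated into meaning clusters or word prototypes. Instead, the average
distances between all possible pairs of embeddings within a single time period are calculated
[
            <xref ref-type="bibr" rid="ref32">30</xref>
            ]. This method is sometimes also referred to as self-similarity [16]. A higher AID value
indicates greater polysemy of <inline-formula><tex-math>w</tex-math></inline-formula> in <inline-formula><tex-math>C_t</tex-math></inline-formula>. We employ Euclidean distance, denoted in the formula as
<inline-formula><tex-math>d(e_{w,i}^t, e_{w,j}^t)</tex-math></inline-formula>. AID is defined as follows:
            <disp-formula><tex-math>\mathrm{AID}(\Phi_w^t) = \frac{1}{\binom{N}{2}} \sum_{i \lt j} d(e_{w,i}^t, e_{w,j}^t).</tex-math></disp-formula>
          </p>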
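          <p>The Average Inner Distance can be sketched as the mean Euclidean distance over all unordered pairs of embeddings from one time period. Normalizing by the number of unordered pairs is an assumption of this sketch; other conventions (for example, dividing by the squared number of embeddings) differ only by a factor that depends on the sample size.</p>
          <preformat preformat-type="code"><![CDATA[
```python
import math
from itertools import combinations


def average_inner_distance(embeddings):
    """Mean Euclidean distance over all unordered pairs of embeddings.

    Returns 0.0 when fewer than two embeddings are available.
    The cost grows quadratically with the number of embeddings.
    """
    n = len(embeddings)
    if n < 2:
        return 0.0
    total = sum(math.dist(a, b) for a, b in combinations(embeddings, 2))
    return total / (n * (n - 1) / 2)
```
]]></preformat>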
        </sec>
        <sec id="sec-3-2-3">
          <title>3.2.3. Dominant meaning</title>
          <p>
            To assess the shift in dominant meaning in a form-based manner, Cosine Similarity (CS) can
be used. CS measures the alignment between the vectors of two word prototypes <inline-formula><tex-math>\mu_w^t</tex-math></inline-formula> and <inline-formula><tex-math>\mu_w^{t'}</tex-math></inline-formula>
by calculating the dot product of the vectors divided by the product of their norms (lengths).
CS values range between -1 and 1, where a high value indicates vector alignment and a low
value indicates opposition. We employ the variant Inverted Cosine Similarity over Word
Prototypes (PRT), which, according to Kutuzov, Velldal, and Øvrelid [
            <xref ref-type="bibr" rid="ref1">21</xref>
            ], is better suited
for quantifying the extent of the semantic shift. For prototypes with positive cosine similarity, PRT values are greater than or equal to 1, where
higher values signify a more pronounced shift. PRT is defined as follows:
            <disp-formula><tex-math>\mathrm{PRT}(\mu_w^t, \mu_w^{t'}) = \frac{1}{\mathrm{CS}(\mu_w^t, \mu_w^{t'})}, \quad \text{where } \mathrm{CS}(\mu_w^t, \mu_w^{t'}) = \frac{\mu_w^t \cdot \mu_w^{t'}}{\lVert \mu_w^t \rVert \, \lVert \mu_w^{t'} \rVert}.</tex-math></disp-formula>
          </p>
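          <p>The following sketch computes the inverted cosine similarity between two word prototypes and, as one possible usage, a year-to-year series. The helper <monospace>yearly_prt</monospace> and its dictionary input (year mapped to prototype vector) are hypothetical names introduced for illustration.</p>
          <preformat preformat-type="code"><![CDATA[
```python
import math


def cosine_similarity(u, v):
    """Dot product divided by the product of the vector norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def prt(u, v):
    """Inverted cosine similarity, 1 / CS(u, v).

    Undefined (raises ZeroDivisionError) for zero or orthogonal vectors.
    """
    return 1.0 / cosine_similarity(u, v)


def yearly_prt(prototypes):
    """PRT of each year's prototype relative to the preceding available year."""
    years = sorted(prototypes)
    return {
        current: prt(prototypes[previous], prototypes[current])
        for previous, current in zip(years, years[1:])
    }
```
]]></preformat>
          <p>Identical prototypes give a value of exactly 1; the further the two vectors diverge in direction, the larger the value becomes.</p>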
          <p>The shift in dominant meaning can also be assessed using meaning clusters (sense-based)
through the Jensen-Shannon Divergence (JSD). JSD, based on Shannon entropy,
measures the dissimilarity between cluster distributions across different time periods. This method
considers not only the variation in the size of the clusters but also how the size of specific
clusters changes across the different time periods [17]. A high JSD value indicates significantly
different cluster distributions, suggesting pronounced semantic shifts. Conversely, a low JSD
value indicates relatively similar distributions, implying stability in the dominant meaning. JSD
is defined as follows:
            <disp-formula><tex-math>\mathrm{JSD}(P_w^t, P_w^{t'}) = H\left(\tfrac{1}{2}\left(P_w^t + P_w^{t'}\right)\right) - \tfrac{1}{2}\left(H(P_w^t) + H(P_w^{t'})\right).</tex-math></disp-formula>
          </p>
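          <p>A minimal sketch of the Jensen-Shannon Divergence between two cluster distributions is given below. It uses the standard form (entropy of the mixture minus the mean of the individual entropies) with base-2 logarithms, so that values fall between 0 and 1; the choice of logarithm base is an assumption of this sketch.</p>
          <preformat preformat-type="code"><![CDATA[
```python
import math


def entropy(distribution):
    """Shannon entropy in bits; zero-probability entries contribute nothing."""
    return sum(-p * math.log2(p) for p in distribution if p > 0)


def jensen_shannon_divergence(p, q):
    """JSD(P, Q) = H((P + Q) / 2) - (H(P) + H(Q)) / 2 for equal-length distributions."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of clusters")
    mixture = [(a + b) / 2 for a, b in zip(p, q)]
    return entropy(mixture) - (entropy(p) + entropy(q)) / 2
```
]]></preformat>
          <p>Two identical distributions yield 0, and two distributions with no shared clusters yield the maximum of 1 under this base-2 convention.</p>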
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Results</title>
      <sec id="sec-4-1">
        <title>4.1. Temporal development of “virtual”</title>
        <p>[Figure 1. Left panel: articles containing “virtual” per year (~42k total; y-axis: article count). Right panel: share of articles containing “virtual”.]</p>
        <p>The first result of our study is the descriptive analysis of the “virtual” corpus with regard to the
temporal development of the term. Figure 1 shows the number of published articles per year
containing “virtual” for the entire corpus (left) and their proportion per journal (right). The
dashed lines in the left figure indicate two key disciplinary differentiations in the PR journals:
the transition from Series II to PR A - D in 1970, and the introduction of new journals like
PR - X (2011) and PRX - Quantum (2021). To focus on long-term trends, these newer journals
are excluded from the analysis. The decline in articles after 2010 is thus an artefact of the
dataset and does not reflect overall trends in PR publications or physics. Notably, there is a
low number of articles in the early phase of the study period, with only 384 publications in our
corpus containing “virtual” before 1950; they are especially sparse before 1930 and during the war years
(1942–1945). The exact number of articles, “virtual” embeddings and cleaned tokens per year
for the early phase can be found in the appendix (Table 4). From 1950 onwards, the number of
articles containing “virtual” increases steadily, with short periods of relative stagnation during
the 1970s and 2010s, mirroring the broader increase in PR journal publications. Additional
details on the total publication count per journal are available in the appendix (figure 4).</p>
        <p>
          The average share of articles containing “virtual” across all journals, as depicted in the right
figure, is 6.04 percent over the entire period. In the pre-Feynman era (before 1950), this
percentage generally remains lower, except for two notable peaks. In 1937, there is a temporary
increase above 5 percent, driven by significant contributions from Bethe, Bacher, and Livingston
in RMP [
          <xref ref-type="bibr" rid="ref24 ref5">6, 5, 24</xref>
          ]. The second peak in 1949 is best explained by Richard Feynman’s
groundbreaking articles and their reception. For instance, with Space-Time Approach to Quantum
Electrodynamics [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ] (published in PR - Series II), Feynman introduced his eponymous diagrams
for representing and analyzing quantum electrodynamic processes, which contributed
significantly to the establishment of the concept of the virtual particle. In the same year, Freeman
J. Dyson’s contributions, also published in Series II [
          <xref ref-type="bibr" rid="ref12 ref13">12, 13</xref>
          ], further validated and established
Feynman diagrams as a fundamental tool in quantum field theory (QFT) [14, 35]. Following
the publications by Feynman and Dyson, the prevalence of “virtual” steadily increased,
culminating in a peak during the 1960s and 1970s. This relatively high ratio of articles containing
“virtual” may, at least in part, be due to the rise of an alternative to QFT: the so-called S-matrix
theory [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. In this new theory, intermediate states were always on-shell such that it seems, at
first sight, that “all talk of virtual particles was gone” [20, p. 285]. However, in other work by
S-matrix theorists like Chew, Low, or Barut the virtual particle concept seems to take center
stage, and even explicitly occurs in the title of one of their articles [
          <xref ref-type="bibr" rid="ref2">9, 2</xref>
          ]. Subsequently, from
the 1970s onward, QFT emerged as the dominant theory, supported by its successful
predictions and discoveries of fundamental particles such as quarks, W bosons, and Z bosons. Finally,
by the early 1980s, the proportion of articles containing “virtual” starts to decline to
approximately 5 percent, gradually rising again from the 1990s onward, albeit not returning to the
levels observed during the earlier peak period.
        </p>
        <p>Zooming in on the individual journals or disciplines, articles containing “virtual”
are notably prevalent in PR - D (particle physics, field theory, gravitation, and cosmology) and
PR - C (nuclear physics). Examination of arXiv classifications within PR - D reveals that nearly
90 percent of these articles fall under high-energy physics. The frequency of “virtual” in PR - D
peaks in the 1970s, 1990s and 2000s, with drops in usage in between. Overall, PR - D contributes
approximately 27 percent of all articles containing the term “virtual” in the corpus, making it
the largest source. Nuclear physics (PR - C) also features a significant percentage of articles
containing “virtual”, comprising about 9 percent of the corpus. This aligns with recent research
by Martinez on the origin of the notion of virtuality in modern physics [27]. The proportion
of relevant articles in PR - C increases steadily until the mid-1990s, plateaus until around 2010,
and shows a recent decline. The term is less prevalent in the remaining journals, which will not
be discussed in detail here for the sake of brevity. A table showing the top 5 journal-specific
dependencies of “virtual” can be found in the appendix (Table 3).</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Dominant meaning becomes more stable</title>
        <p>One key finding of our study is that the dominant meaning of “virtual” becomes more stable
over time. Figure 2 presents the results of the SCD calculations regarding the shifts in the
dominant meaning throughout the entire investigation period. The left graph displays the
PRT values for “virtual”, i.e., the inverted cosine similarity of the word prototype of each year relative to
that of the preceding year. The right graph shows the JSD values for both the K-Means Clustering and
the AP Clustering. Due to the computational expense of AP Clustering, we randomly sampled
approximately 25 percent of all embeddings, ensuring a minimum of 400 embeddings per year,
where available.</p>
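        <p>One way to read the sampling step is sketched below: draw roughly a quarter of each year's embeddings, but never fewer than 400, and keep all embeddings of a year that has fewer than 400. Applying the fraction per year, the rounding, and the fixed random seed are assumptions of this sketch rather than details confirmed by the text.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import random


def sample_per_year(embeddings_by_year, fraction=0.25, minimum=400, seed=0):
    """Sample about `fraction` of each year's embeddings with a per-year floor.

    A year with fewer than `minimum` embeddings is kept in full.
    The fixed seed only serves reproducibility of this illustration.
    """
    rng = random.Random(seed)
    sampled = {}
    for year, vectors in sorted(embeddings_by_year.items()):
        target = max(minimum, round(fraction * len(vectors)))
        size = min(len(vectors), target)
        sampled[year] = rng.sample(vectors, size)
    return sampled
```
]]></preformat>
        <p>The floor matters mainly for the sparse early decades, where a plain 25 percent sample would leave too few embeddings per year for meaningful cluster distributions.</p>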
        <p>The resulting conceptual development of “virtual” can be divided into two distinct phases.</p>
        <p>The first period, up until the 1950s, is characterized by pronounced fluctuations, indicating
repeated conceptual reorientation during the early development of the concept, with no firmly
established or dominant meaning. This trend can be seen in all three metrics, although the
values for JSD on the basis of AP Clustering stabilize at around 0.4. Notably, peaks are
observed in the late 1920s and early 1940s. Given the limited number of data points available
for this period, it is important to emphasize that our results for this early period reflect
general trends rather than individual peaks. To ensure the robustness of our results, we conduct
permutation-based statistical tests, which are described in detail at the end of this section.
From approximately 1950 onward, marking the beginning of the second phase, the dominant
meaning begins to stabilize progressively, although a minor peak is observed in the early 1980s.
This suggests the growing establishment of the concept of the virtual particle, following the
outlined contributions of Feynman and Dyson. Additional details on the shifts in dominant
meaning in the discipline-specific journals can be found in the appendix (Figure 5), indicating
that the peak in the 1980s is mainly caused by a change in dominant meaning in PR - C (nuclear
physics). We plan to conduct further research into the cause of this and other peaks.</p>
        <p>
          Our findings regarding the stabilization of the dominant meaning of “virtual” are also
supported by the time-specific dependencies, as shown in Table 2. From the 1920s to the 1940s,
“virtual” is most often associated with terms as diverse as “cathode”, “height”, “orbit”, “level”,
and “oscillator”. In the 1940s, “virtual quanta” came into use, prominently featured in
Feynman’s first diagrams [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]. With the onset of the post-Feynman era in the 1950s, “virtual
photons” and “virtual states” become increasingly established as the dominant contexts. Notably,
though, the concept of “virtual transition”, which Ehberger describes as essential for the
concept’s early development [14], only appears among the most frequent dependencies from the 1960s
on. From around 1990 onward, the dependency “correction” gains importance. These “virtual
corrections” refer to parts of Feynman diagrams (or the corresponding mathematical
expressions) that involve the representation of a virtual particle. The increasing frequency of this use
of “virtual” might be attributed to a growing interest in (and feasibility of) “higher order”
calculations and precision measurements in various contexts, the most prominent being the
search for the Higgs boson at the Large Electron–Positron Collider (LEP), which was in use at
CERN from 1989 to 2000, at the Tevatron (Fermilab, 1983–2011), at the planned Superconducting
Super Collider (SSC, planned ca. 1983, cancelled in 1993), and at the Large Hadron Collider
(LHC), which has been in use at CERN since 2009.<sup>11</sup> Nonetheless, “virtual photons” and
“virtual states” remain the dominant contexts of use until the present, though less pronounced
than in the 1960s and 1970s.
        </p>
        <p>11 For a non-technical overview of higher order calculations, see [36].</p>
        <p>
          The consistency of results across all three calculation methods, despite their different
approaches, is also notable: the values of PRT strongly correlate with those of JSD (Pearson
coefficient for PRT and JSD - KM: 0.96, PRT and JSD - AP: 0.8), as do those of the two
JSD metrics with each other (0.77). These high correlation values suggest that both clustering methods reliably
identify the various meanings of “virtual”, indicating stable and meaningful results. To further
ensure the robustness of our findings despite the relatively low frequency of “virtual” in the
early years, we employ permutation-based statistical tests for the PRT-metric, following the
approach outlined in Liu, Medlar, and Glowacka [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ]. Permutation tests can be used to assess
whether the observed test statistic (i.e., the SCD-metric) differs significantly from zero,
thereby indicating a semantic shift between two time periods. These tests are particularly suitable
for low-frequency data because they do not rely on large sample sizes or specific distributional
assumptions; instead, they generate the sampling distribution from the available data itself.
This is achieved by randomly and repeatedly rearranging the “virtual”-embeddings
across the two time periods (sampling without replacement) and then recalculating the
SCD-metric for each permutation.<sup>12</sup> Following Liu, Medlar, and Glowacka [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ], we employ the
Benjamini-Hochberg procedure to adjust the p-values for multiple comparisons, thereby
limiting the false discovery rate. Applying this method to our data, we find that the semantic
shifts for the dominant meaning of “virtual” based on PRT are significant for almost all time
intervals. These findings support our conclusion regarding the general trend of the conceptual
development while acknowledging variability in specific time periods. A detailed exemplary
figure illustrating the results of the permutation tests for PRT can be found in the appendix
(figure 6).
        </p>
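        <p>As a rough illustration of the permutation test and the Benjamini-Hochberg adjustment described above, consider the following Python sketch. It is not the implementation used in this study: the test statistic is assumed to be the cosine distance between the mean embeddings of two periods (a simple stand-in for a prototype-based metric such as PRT), the function names prototype_distance, permutation_p_value and benjamini_hochberg are hypothetical, the embeddings are synthetic, and the number of permutations is kept far below the limit mentioned in the text.</p>
        <preformat>
```python
import numpy as np


def prototype_distance(a, b):
    """Cosine distance between the mean vectors of two embedding sets.

    Assumption: a simple stand-in for a prototype-based metric such as PRT.
    """
    pa, pb = a.mean(axis=0), b.mean(axis=0)
    cos = pa @ pb / (np.linalg.norm(pa) * np.linalg.norm(pb))
    return 1.0 - cos


def permutation_p_value(a, b, n_perm=2000, seed=0):
    """Share of random relabellings whose distance is at least as large
    as the observed one."""
    rng = np.random.default_rng(seed)
    observed = prototype_distance(a, b)
    pooled = np.vstack([a, b])
    n_a = len(a)
    hits = 0
    for _ in range(n_perm):
        # Shuffling the pooled rows is sampling without replacement.
        shuffled = rng.permutation(pooled)
        if prototype_distance(shuffled[:n_a], shuffled[n_a:]) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    p = np.asarray(p_values, dtype=float)
    m = len(p)
    order = np.argsort(p)
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, starting from the largest p-value.
    adjusted = np.minimum.accumulate(scaled[::-1])[::-1]
    result = np.empty(m)
    result[order] = np.minimum(adjusted, 1.0)
    return result


rng = np.random.default_rng(42)
year_1 = rng.normal(0.0, 1.0, size=(40, 16)) + 1.0  # synthetic embeddings
year_2 = rng.normal(0.0, 1.0, size=(40, 16)) + 1.0  # same distribution
year_3 = rng.normal(0.0, 1.0, size=(40, 16)) - 1.0  # shifted distribution

raw = [permutation_p_value(year_1, year_2), permutation_p_value(year_2, year_3)]
print("raw p-values:", raw)
print("adjusted p-values:", benjamini_hochberg(raw))
```
        </preformat>
        <p>In this toy setting, only the second interval (year_2 to year_3) contains a constructed shift, so its p-value should be small, whereas the first interval compares two samples from the same distribution.</p>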
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Polysemy increases</title>
        <p>[Figure 3: Degree of polysemy of 'virtual' over time. Left panel: AID; right panel: normalized Shannon-Entropy (AP and K-Means); x-axis: year.]</p>
        <p>The second key finding of our study is that the degree of polysemy of “virtual” increases.
That is, while the dominant use remains the one associated with the aforementioned
concepts, the term is also increasingly used in other meanings. Figure 3 presents the development
of the degree of polysemy for “virtual” in the entire PR-corpus over the entire investigation
period. The left graph shows the AID-values, i.e., the average inner distances of all “virtual”
embeddings in a given year. The values for the normalized Shannon-Entropy are displayed in
the right graph, again for both the K-Means-Clustering and the AP-Clustering (with the same
random sampling as described in Section 4.2).</p>
        <p>12 We limit the number of permutations to a maximum of 100,000 per time interval, i.e., two subsequent years, to
save computational resources.</p>
        <p>Similar to the results regarding the dominant meaning, the degree of polysemy fluctuates
strongly in the early phase of the concept. Notably, the values are particularly low in the
mid to late 1920s and early 1940s. These results are expected given the limited number of
articles during these periods, as a small number of embeddings implies a correspondingly low
number of different meanings. From 1938 to 1940, however, the values for all calculation
methods are particularly high. A clear explanation for this spike is not immediately apparent, as
neither the examination of the dependencies nor the shift in dominant meaning during these
years provides insight; the described peaks in PRT and JSD occur several years later. One
possible explanation is that a few but very different embeddings cause the peak. The
correlation coefficients between the metrics are again high (0.64 for AID and Entropy (KM),
0.66 for AID and Entropy (AP), and 0.94 for Entropy (KM) and Entropy (AP)), suggesting
stable results. However, we were unable to identify a suitable method for statistical testing of
polysemy. Further research and qualitative assessment of the relevant papers are required and
planned. Consequently, our present analysis focuses, once again, on general trends rather than
individual peaks.</p>
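        <p>To make the two kinds of polysemy measures more concrete, here is a minimal Python sketch. It assumes that AID is the mean pairwise cosine distance between all embeddings of a year, and that the normalized Shannon-Entropy is the entropy of the cluster-size distribution divided by the logarithm of the number of clusters. Both definitions, the function names and the toy data are illustrative assumptions and may differ from the exact implementation used in this study.</p>
        <preformat>
```python
import numpy as np


def average_inner_distance(embeddings):
    """Mean cosine distance over all distinct pairs of embeddings."""
    x = np.asarray(embeddings, dtype=float)
    unit = x / np.linalg.norm(x, axis=1, keepdims=True)
    sims = unit @ unit.T
    upper = np.triu_indices(len(x), k=1)  # each unordered pair once
    return float(np.mean(1.0 - sims[upper]))


def normalized_entropy(labels):
    """Shannon entropy of the cluster-size distribution, scaled by log(K)."""
    _, counts = np.unique(labels, return_counts=True)
    if len(counts) == 1:
        return 0.0  # a single cluster carries no uncertainty
    p = counts / counts.sum()
    return float(-(p * np.log(p)).sum() / np.log(len(counts)))


# Toy data: three nearly parallel vectors versus three divergent ones.
tight = np.array([[1.0, 0.0], [1.0, 0.1], [1.0, -0.1]])
spread = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
print(average_inner_distance(tight), average_inner_distance(spread))

# Cluster labels: one dominant cluster versus two equally sized clusters.
print(normalized_entropy([0, 0, 0, 1]), normalized_entropy([0, 0, 1, 1]))
```
        </preformat>
        <p>Under these assumptions, both quantities grow when usages spread over more directions or more evenly over clusters, which is the sense in which they are read as indicators of polysemy here.</p>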
        <p>From around 1950 or 1960, depending on the metric, the fluctuations become smaller and
the degree of polysemy increases steadily. Notably, there is a brief spike in the
early 1980s in the AID-values and another sharp increase in the 1990s, followed by a relative
stabilization in recent years. The increase since the 1990s is also reflected in the dependencies
of “virtual” (Table 2), with the most frequent usage contexts becoming more evenly distributed
from the 1990s compared to earlier decades. This trend is supported by the introduction of the
journal PR - E in 1993, which is characterized by distinct usage contexts differing from those of
other journals (see Table 3). The Shannon-Entropy based on both clustering methods remains
consistently high, exceeding or peaking at 0.8 from about the 1950s onward and reaching
nearly maximum values around the 2000s in the case of K-Means. From 2010 onward, there is
a small decrease in polysemy, possibly due to the second disciplinary differentiation leading to
a slightly less varied usage of the term across the remaining journals. The trends observed in
discipline-specific journals generally align with the overall findings. The details can be found
in the appendix (figure 7).</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Discussion</title>
      <p>We have applied various Semantic Change Detection metrics to a large number of contextualized
word embeddings in order to trace the diachronic development of the concept of the
virtual particle. Our findings show that the dominant meaning of “virtual” becomes more stable
over time while, at the same time, its degree of polysemy increases. This development can be
split into two periods: an initial phase characterized by repeated conceptual reorientation with
no firmly established meaning yet, and a second phase marked by the growing consolidation
of the dominant meaning in the sense of the virtual particle, following the seminal works of
Richard Feynman and their reception around 1950. Simultaneously, the degree of polysemy
steadily increases throughout almost the entire investigation period and only recently seems
to stabilize at a high level.</p>
      <p>While these two findings might seem contradictory at first, they can easily be reconciled.
Simply put, the metrics for polysemy measure how spread out the word embeddings are in the
vector space, while the metrics for dominant meaning measure where the relative majority of
the embeddings lie and how this position changes from year to year. Our findings suggest that
from the 1950s onward, the relative majority of the embeddings consistently centers around a
usage in the sense of the virtual particle (especially virtual photons), while the overall usage
of the term “virtual” diversifies, possibly due to its uses in different disciplines like those in PR
- E.</p>
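      <p>The distinction can be illustrated with a small synthetic example. In the Python sketch below, two hypothetical years share the same dominant group of embeddings, but the second year adds a minority of usages scattered in other directions. The prototype distance (cosine distance between mean embeddings, used here as an assumed stand-in for a dominant-meaning metric) stays small, while the mean pairwise distance (an assumed stand-in for a polysemy metric) grows. All names, dimensions and numbers are illustrative and are not taken from our data.</p>
      <preformat>
```python
import numpy as np

rng = np.random.default_rng(0)
dim = 32
core = np.zeros(dim)
core[0] = 5.0  # direction of the hypothetical dominant usage


def mean_pairwise_cosine_distance(x):
    """Spread of a set of embeddings (stand-in for a polysemy metric)."""
    unit = x / np.linalg.norm(x, axis=1, keepdims=True)
    sims = unit @ unit.T
    pairs = np.triu_indices(len(x), k=1)
    return float(np.mean(1.0 - sims[pairs]))


def prototype_cosine_distance(a, b):
    """Distance between mean embeddings (stand-in for a dominant-meaning metric)."""
    pa, pb = a.mean(axis=0), b.mean(axis=0)
    return float(1.0 - pa @ pb / (np.linalg.norm(pa) * np.linalg.norm(pb)))


# Year A: 200 usages, all close to the dominant meaning.
year_a = core + rng.normal(0.0, 0.5, size=(200, dim))
# Year B: the same kind of majority plus 80 usages scattered elsewhere.
majority = core + rng.normal(0.0, 0.5, size=(200, dim))
minority = rng.normal(0.0, 3.0, size=(80, dim))
year_b = np.vstack([majority, minority])

shift = prototype_cosine_distance(year_a, year_b)
spread_a = mean_pairwise_cosine_distance(year_a)
spread_b = mean_pairwise_cosine_distance(year_b)
print(f"prototype shift: {shift:.3f}")
print(f"spread year A: {spread_a:.3f}, spread year B: {spread_b:.3f}")
```
      </preformat>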
      <p>We have combined our SCD-based approach with evaluation via Dependency Parsing as
well as qualitative assessment of the results. We find that the observed semantic shifts are
largely supported by recent work on the history of the virtual particle. This is particularly
true for the first period of the conceptual development, whereas SCD can be employed in a
more heuristic manner for the still relatively under-researched second phase. For instance, we
identified a notable and unexpected shift in dominant meaning in the 1980s, primarily driven
by articles in nuclear physics (PR - C). We plan to conduct further research into this peak and
to discuss the relevance of our findings for the history and philosophy
of physics in more depth.<sup>13</sup> The complementary method of Dependency Parsing revealed that most of the
semantic shifts coincide with significant changes in the most prominent dependencies at that
time. While Dependency Parsing may have been particularly effective in our case because
“virtual”, the focus of our study, is an adjective, it could prove to be a valuable and
resource-efficient evaluation method for broader use in SCD research.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>
        This work was supported by the DFG Research Unit “The Epistemology of the Large Hadron
Collider” (Grant FOR 2063). The members of the Unit provided valuable feedback at several
stages of this work. Special thanks go to Robert Harlander, Jean-Philippe Martinez, Rebecka
Mähring, Arno Simons and Friedrich Steinle as well as three anonymous reviewers for their
comments and helpful suggestions. The work is based on M.Z.’s MSc thesis, which was
defended at the University of Leipzig (Computational Humanities Research Group) and was
supervised by A.W. and Andreas Niekler. We are also grateful to the American Physical Society
for granting us access to the relevant full texts and metadata.
      </p>
      <p>
        13 Many further preliminary results are contained in M.Z.’s master’s thesis on the topic [38]. Here, we focused on
advocating a new method (Semantic Change Detection) for studying concepts in science.
      </p>
      <p>M. Zichert. “Eine digitale Begriffsgeschichte des virtuellen Teilchens”. M.Sc. thesis.
University of Leipzig, 2023.</p>
      <p>[Figure: Published articles for PR-corpus per journal. Legend: Series II, PR - A, PR - B, PR - C, PR - D, PR - E, Letters, RMP; x-axis: Year.]</p>
      <p>[Figure 5: Shifts in dominant meaning of 'virtual' for discipline-specific journals. Panels: PR - A, PR - B, PR - C, PR - D, PR - E.]</p>
      <p>[Figure 6: Results of the permutation tests for PRT. Legend: p-values (unadjusted), p-values (Benjamini-Hochberg).]</p>
      <p>B. Tables</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <collab>American Physical Society</collab>
          .
          <source>APS Data Sets for Research</source>
          .
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A. O.</given-names>
            <surname>Barut</surname>
          </string-name>
          . “<article-title>Virtual Particles</article-title>”.
          In: <source>Physical Review</source> <volume>126</volume>.<issue>5</issue> (<year>1962</year>), pp. <fpage>1873</fpage>-<lpage>1875</lpage>.
          doi: <pub-id pub-id-type="doi">10.1103/PhysRev.126.1873</pub-id>.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Baumann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Stephan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B.</given-names>
            <surname>Roth</surname>
          </string-name>
          . “<article-title>Seeing Through the Mess: Evolutionary Dynamics of Lexical Polysemy</article-title>”.
          In: <source>Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing</source>. Ed. by
          <string-name><given-names>H.</given-names> <surname>Bouamor</surname></string-name>,
          <string-name><given-names>J.</given-names> <surname>Pino</surname></string-name>, and
          <string-name><given-names>K.</given-names> <surname>Bali</surname></string-name>.
          Singapore: Association for Computational Linguistics, <year>2023</year>, pp. <fpage>8745</fpage>-<lpage>8762</lpage>.
          doi: <pub-id pub-id-type="doi">10.18653/v1/2023.emnlp-main.541</pub-id>.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>I.</given-names>
            <surname>Beltagy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lo</surname>
          </string-name>
          , and
          <string-name><given-names>A.</given-names> <surname>Cohan</surname></string-name>.
          <article-title>SciBERT: A Pretrained Language Model for Scientific Text</article-title>.
          <year>2019</year>. doi: <pub-id pub-id-type="doi">10.48550/arXiv.1903.10676</pub-id>.
          arXiv: <pub-id pub-id-type="arxiv">1903.10676</pub-id> [cs].
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name><given-names>H. A.</given-names> <surname>Bethe</surname></string-name>.
          “<article-title>Nuclear Physics B. Nuclear Dynamics, Theoretical</article-title>”.
          In: <source>Reviews of Modern Physics</source> <volume>9</volume>.<issue>2</issue> (<year>1937</year>), pp. <fpage>69</fpage>-<lpage>244</lpage>.
          doi: <pub-id pub-id-type="doi">10.1103/RevModPhys.9.69</pub-id>.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name><given-names>H. A.</given-names> <surname>Bethe</surname></string-name> and
          <string-name><given-names>R. F.</given-names> <surname>Bacher</surname></string-name>.
          “<article-title>Nuclear Physics A. Stationary States of Nuclei</article-title>”.
          In: <source>Reviews of Modern Physics</source> <volume>8</volume>.<issue>2</issue> (<year>1936</year>), pp. <fpage>82</fpage>-<lpage>229</lpage>.
          doi: <pub-id pub-id-type="doi">10.1103/RevModPhys.8.82</pub-id>.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>A. S.</given-names>
            <surname>Blum</surname>
          </string-name>
          . “<article-title>The State Is Not Abolished, It Withers Away: How Quantum Field Theory Became a Theory of Scattering</article-title>”.
          In: <source>Studies in History and Philosophy of Science Part B: Studies in History and Philosophy of Modern Physics</source>. On the History of the Quantum, HQ4 <volume>60</volume> (<year>2017</year>), pp. <fpage>46</fpage>-<lpage>80</lpage>.
          doi: <pub-id pub-id-type="doi">10.1016/j.shpsb.2017.01.004</pub-id>.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bollen</surname>
          </string-name>
          ,
          <string-name><given-names>M. A.</given-names> <surname>Rodriguez</surname></string-name>, and
          <string-name><given-names>H.</given-names> <surname>Van de Sompel</surname></string-name>.
          “<article-title>Journal Status</article-title>”.
          In: <source>Scientometrics</source> <volume>69</volume>.<issue>3</issue> (<year>2006</year>), pp. <fpage>669</fpage>-<lpage>687</lpage>.
          doi: <pub-id pub-id-type="doi">10.1007/s11192-006-0176-z</pub-id>.
          arXiv: <pub-id pub-id-type="arxiv">cs/0601030</pub-id>.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>G. F.</given-names>
            <surname>Chew</surname>
          </string-name>
          and
          <string-name><given-names>F. E.</given-names> <surname>Low</surname></string-name>.
          “<article-title>Unstable Particles as Targets in Scattering Experiments</article-title>”.
          In: <source>Physical Review</source> <volume>113</volume>.<issue>6</issue> (<year>1959</year>), pp. <fpage>1640</fpage>-<lpage>1648</lpage>.
          doi: <pub-id pub-id-type="doi">10.1103/PhysRev.113.1640</pub-id>.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>J. T.</given-names>
            <surname>Cushing</surname>
          </string-name>
          .
          <source>Theory Construction and Selection in Modern Physics: The S Matrix</source>
          . Cambridge University Press,
          <year>1990</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name><given-names>M.-W.</given-names> <surname>Chang</surname></string-name>,
          <string-name><given-names>K.</given-names> <surname>Lee</surname></string-name>, and
          <string-name><given-names>K.</given-names> <surname>Toutanova</surname></string-name>.
          <article-title>BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding</article-title>.
          <year>2019</year>. doi: <pub-id pub-id-type="doi">10.48550/arXiv.1810.04805</pub-id>.
          arXiv: <pub-id pub-id-type="arxiv">1810.04805</pub-id> [cs].
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>F. J.</given-names>
            <surname>Dyson</surname>
          </string-name>
          . “<article-title>The Radiation Theories of Tomonaga, Schwinger, and Feynman</article-title>”.
          In: <source>Physical Review</source> <volume>75</volume>.<issue>3</issue> (<year>1949</year>), pp. <fpage>486</fpage>-<lpage>502</lpage>.
          doi: <pub-id pub-id-type="doi">10.1103/PhysRev.75.486</pub-id>.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>F. J.</given-names>
            <surname>Dyson</surname>
          </string-name>
          . “<article-title>The S Matrix in Quantum Electrodynamics</article-title>”.
          In: <source>Physical Review</source> <volume>75</volume>.<issue>11</issue> (<year>1949</year>), pp. <fpage>1736</fpage>-<lpage>1755</lpage>.
          doi: <pub-id pub-id-type="doi">10.1103/PhysRev.75.1736</pub-id>.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <institution>Technische Universität Berlin</institution>
          ,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>R. P.</given-names>
            <surname>Feynman</surname>
          </string-name>
          . “<article-title>Space-Time Approach to Quantum Electrodynamics</article-title>”.
          In: <source>Physical Review</source> <volume>76</volume>.<issue>6</issue> (<year>1949</year>), pp. <fpage>769</fpage>-<lpage>789</lpage>.
          doi: <pub-id pub-id-type="doi">10.1103/PhysRev.76.769</pub-id>.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name><given-names>A.</given-names> <surname>Garí Soler</surname></string-name> and
          <string-name><given-names>M.</given-names> <surname>Apidianaki</surname></string-name>.
          “<article-title>Let's Play Mono-Poly: BERT Can Reveal Words' Polysemy Level and Partitionability into Senses</article-title>”.
          In: <source>Transactions of the Association for Computational Linguistics</source> <volume>9</volume> (<year>2021</year>), pp. <fpage>825</fpage>-<lpage>844</lpage>.
          doi: <pub-id pub-id-type="doi">10.1162/tacl_a_00400</pub-id>.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name><given-names>M.</given-names> <surname>Giulianelli</surname></string-name>,
          <string-name><given-names>M.</given-names> <surname>Del Tredici</surname></string-name>, and
          <string-name><given-names>R.</given-names> <surname>Fernández</surname></string-name>.
          “<article-title>Analysing Lexical Semantic Change with Contextualised Word Representations</article-title>”.
          In: <source>Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics</source>. Online: Association for Computational Linguistics, <year>2020</year>, pp. <fpage>3960</fpage>-<lpage>3973</lpage>.
          doi: <pub-id pub-id-type="doi">10.18653/v1/2020.acl-main.365</pub-id>.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>F.</given-names>
            <surname>Grezes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Blanco-Cuaresma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Accomazzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Kurtz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Shapurian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Henneken</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. S.</given-names>
            <surname>Grant</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. M.</given-names>
            <surname>Thompson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Chyla</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>McDonald</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. W.</given-names>
            <surname>Hostetler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. R.</given-names>
            <surname>Templeton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. E.</given-names>
            <surname>Lockhart</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Martinovic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Tanner</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Protopapas</surname>
          </string-name>
          .
          <article-title>Building AstroBERT, a Language Model for Astronomy &amp; Astrophysics</article-title>
          .
          <year>2021</year>
          . arXiv: <pub-id pub-id-type="arxiv">2112.00590</pub-id> [astro-ph].
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>T.</given-names>
            <surname>Hellert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Montenegro</surname>
          </string-name>
          , and
          <string-name><given-names>A.</given-names> <surname>Pollastro</surname></string-name>.
          <article-title>PhysBERT: A Text Embedding Model for Physics Scientific Literature</article-title>.
          <year>2024</year>. doi: <pub-id pub-id-type="doi">10.48550/arXiv.2408.09574</pub-id>.
          arXiv: <pub-id pub-id-type="arxiv">2408.09574</pub-id> [physics].
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>D.</given-names>
            <surname>Kaiser</surname>
          </string-name>
          . “<article-title>Physics and Feynman's Diagrams</article-title>”.
          In: <source>American Scientist</source> <volume>93</volume>.<issue>2</issue> (<year>2005</year>).
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>A.</given-names>
            <surname>Kutuzov</surname>
          </string-name>
          ,
          <string-name><given-names>E.</given-names> <surname>Velldal</surname></string-name>, and
          <string-name><given-names>L.</given-names> <surname>Øvrelid</surname></string-name>.
          “<article-title>Contextualized Language Models for Semantic Change Detection: Lessons Learned</article-title>”.
          In: <source>Northern European Journal of Language Technology</source> <volume>8</volume>.<issue>1</issue> (<year>2022</year>).
          doi: <pub-id pub-id-type="doi">10.3384/nejlt.2000-1533.2022.3478</pub-id>.
          arXiv: <pub-id pub-id-type="arxiv">2209.00154</pub-id> [cs].
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>S.</given-names>
            <surname>Laicher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kurtyigit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Schlechtweg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Kuhn</surname>
          </string-name>
          , and
          <string-name><given-names>S.</given-names> <surname>Schulte im Walde</surname></string-name>.
          “<article-title>Explaining and Improving BERT Performance on Lexical Semantic Change Detection</article-title>”.
          In: <source>Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop</source>. Online: Association for Computational Linguistics, <year>2021</year>, pp. <fpage>192</fpage>-<lpage>202</lpage>.
          doi: <pub-id pub-id-type="doi">10.18653/v1/2021.eacl-srw.25</pub-id>.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Medlar</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Glowacka</surname>
          </string-name>
          . “
          <article-title>Statistically Significant Detection of Semantic Shifts Using Contextual Word Embeddings”</article-title>
          .
          <source>In:Proceedings of the 2nd Workshop on Evaluation and Comparison of NLP Systems</source>
          .
          <year>2021</year>
          , pp.
          <fpage>104</fpage>
          -
          <lpage>113</lpage>
          . doi: 10.18653/v1/2021.eval4nlp-1.11. arXiv: 2104.03776 [cs].
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>M. S.</given-names>
            <surname>Livingston</surname>
          </string-name>
          and
          <string-name>
            <given-names>H. A.</given-names>
            <surname>Bethe</surname>
          </string-name>
          . “
          <article-title>Nuclear Physics C. Nuclear Dynamics, Experimental”</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          <source>In: Reviews of Modern Physics 9.3</source>
          (
          <year>1937</year>
          ), pp.
          <fpage>245</fpage>
          -
          <lpage>390</lpage>
          . doi: 10.1103/RevModPhys.9.245.
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>M.</given-names>
            <surname>Martinc</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Montariol</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Zosa</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L.</given-names>
            <surname>Pivovarova</surname>
          </string-name>
          . “
          <article-title>Capturing Evolution in Word Usage: Just Add More Clusters?”</article-title>
          <source>In: Companion Proceedings of the Web Conference 2020</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          <year>2020</year>
          , pp.
          <fpage>343</fpage>
          -
          <lpage>349</lpage>
          . doi: 10.1145/3366424.3382186. arXiv: 2001.06629 [cs].
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>M.</given-names>
            <surname>Martinc</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Novak</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Pollak</surname>
          </string-name>
          .
          <article-title>Leveraging Contextual Embeddings for Detecting Diachronic Semantic Shift</article-title>
          .
          <year>2020</year>
          . doi: 10.48550/arXiv.1912.01072. arXiv: 1912.01072 [cs].
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>J.-P.</given-names>
            <surname>Martinez</surname>
          </string-name>
          . “
          <article-title>Virtuality in Modern Physics in the 1920s and 1930s: Meaning(s) of an Emerging Notion”</article-title>
          .
          <source>In: Perspectives on Science</source>
          (
          <year>2023</year>
          ), pp.
          <fpage>1</fpage>
          -
          <lpage>40</lpage>
          . doi: 10.1162/posc_a_00610.
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>S.</given-names>
            <surname>Montariol</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Martinc</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L.</given-names>
            <surname>Pivovarova</surname>
          </string-name>
          . “
          <article-title>Scalable and Interpretable Semantic Change Detection”</article-title>
          .
          <source>In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>
          . Online: Association for Computational Linguistics,
          <year>2021</year>
          , pp.
          <fpage>4642</fpage>
          -
          <lpage>4652</lpage>
          . doi: 10.18653/v1/2021.naacl-main.369.
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>F.</given-names>
            <surname>Periti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Dubossarsky</surname>
          </string-name>
          , and
          <string-name>
            <given-names>N.</given-names>
            <surname>Tahmasebi</surname>
          </string-name>
          .
          <article-title>(Chat)GPT v BERT: Dawn of Justice for Semantic Change Detection</article-title>
          .
          <year>2024</year>
          . doi: 10.48550/arXiv.2401.14040. arXiv: 2401.14040 [cs].
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>F.</given-names>
            <surname>Periti</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Montanelli</surname>
          </string-name>
          . “
          <article-title>Lexical Semantic Change through Large Language Models: A Survey”</article-title>
          .
          <source>In: ACM Comput. Surv.</source>
          <volume>56</volume>
          .
          <issue>11</issue>
          (
          <year>2024</year>
          ), 282:1-282:38. doi: 10.1145/3672393.
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>A.</given-names>
            <surname>Simons</surname>
          </string-name>
          .
          <article-title>Meaning at the Planck Scale? Contextualized Word Embeddings for Doing History, Philosophy, and Sociology of Science</article-title>
          .
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>N.</given-names>
            <surname>Tahmasebi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Borin</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Jatowt</surname>
          </string-name>
          .
          <article-title>Survey of Computational Approaches to Lexical Semantic Change Detection</article-title>
          .
          <year>2021</year>
          . doi: 10.5281/zenodo.5040302.
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>M. B.</given-names>
            <surname>Valente</surname>
          </string-name>
          . “
          <article-title>Are Virtual Quanta Nothing but Formal Tools?”</article-title>
          <source>In: International Studies in the Philosophy of Science 25.1</source>
          (
          <year>2011</year>
          ), pp.
          <fpage>39</fpage>
          -
          <lpage>53</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>M.</given-names>
            <surname>Wevers</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Koolen</surname>
          </string-name>
          . “
          <article-title>Digital Begriffsgeschichte: Tracing Semantic Change Using Word Embeddings”</article-title>
          .
          <source>In: Historical Methods: A Journal of Quantitative and Interdisciplinary History 53.4</source>
          (
          <year>2020</year>
          ), pp.
          <fpage>226</fpage>
          -
          <lpage>243</lpage>
          . doi: 10.1080/01615440.2020.1760157.
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          Dordrecht: Springer,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>G.</given-names>
            <surname>Zanderighi</surname>
          </string-name>
          . “
          <article-title>The Two-Loop Explosion”</article-title>
          .
          <source>In: CERN Courier 57.3</source>
          (
          <year>2017</year>
          ), pp.
          <fpage>19</fpage>
          -
          <lpage>22</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          [37]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Jin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ji</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Wang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Han</surname>
          </string-name>
          .
          <article-title>A Comprehensive Survey of Scientific Large Language Models and Their Applications in Scientific Discovery</article-title>
          .
          <year>2024</year>
          . doi: 10.48550/arXiv.2406.10833. arXiv: 2406.10833 [cs].
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>