<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Is Dynamicity All You Need?</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Richard</forename><forename type="middle">Delwin</forename><surname>Myloth</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Information Sciences Institute</orgName>
								<address>
									<addrLine>Marina del Rey</addrLine>
									<region>CA</region>
									<country key="US">USA</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="institution">University of Southern California</orgName>
								<address>
									<settlement>Los Angeles</settlement>
									<region>CA</region>
									<country key="US">USA</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Kian</forename><surname>Ahrabian</surname></persName>
							<email>ahrabian@usc.edu</email>
							<affiliation key="aff0">
								<orgName type="department">Information Sciences Institute</orgName>
								<address>
									<addrLine>Marina del Rey</addrLine>
									<region>CA</region>
									<country key="US">USA</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="institution">University of Southern California</orgName>
								<address>
									<settlement>Los Angeles</settlement>
									<region>CA</region>
									<country key="US">USA</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Arun</forename><forename type="middle">Baalaaji</forename><surname>Sankar Ananthan</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Information Sciences Institute</orgName>
								<address>
									<addrLine>Marina del Rey</addrLine>
									<region>CA</region>
									<country key="US">USA</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="institution">University of Southern California</orgName>
								<address>
									<settlement>Los Angeles</settlement>
									<region>CA</region>
									<country key="US">USA</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Xinwei</forename><surname>Du</surname></persName>
							<email>xinweidu@usc.edu</email>
							<affiliation key="aff0">
								<orgName type="department">Information Sciences Institute</orgName>
								<address>
									<addrLine>Marina del Rey</addrLine>
									<region>CA</region>
									<country key="US">USA</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="institution">University of Southern California</orgName>
								<address>
									<settlement>Los Angeles</settlement>
									<region>CA</region>
									<country key="US">USA</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Jay</forename><surname>Pujara</surname></persName>
							<email>jpujara@usc.edu</email>
							<affiliation key="aff0">
								<orgName type="department">Information Sciences Institute</orgName>
								<address>
									<addrLine>Marina del Rey</addrLine>
									<region>CA</region>
									<country key="US">USA</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="institution">University of Southern California</orgName>
								<address>
									<settlement>Los Angeles</settlement>
									<region>CA</region>
									<country key="US">USA</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Is Dynamicity All You Need?</title>
					</analytic>
					<monogr>
						<title level="m">The Third AAAI Workshop on Scientific Document Understanding 2023</title>
						<idno type="ISSN">1613-0073</idno>
						<meeting>
							<address>
								<settlement>Washington</settlement>
								<region>DC</region>
								<country key="US">USA</country>
							</address>
						</meeting>
						<imprint>
							<date type="published" when="2023-02-14">February 14th, 2023</date>
						</imprint>
					</monogr>
					<idno type="MD5">0694722830E971094D085E4524C80D99</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T16:26+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Author Dynamicity</term>
					<term>Causal Analysis</term>
					<term>Scientific Research Analysis</term>
					<term>Community Detection</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Scientific domains are fluid entities that change and turn as time passes. Take machine learning as an example. Up until the '90s, most of the methods were expert-knowledge-driven. However, as time passed, more data-driven approaches appeared, finally leading to the advent of deep learning methods. As a result, in a span of 30 years, the field has gone through many changes and breakthroughs and is at a point where many novelties have a life span of shorter than five years. In parallel, a regular researcher's career span is around the same length. Consequently, being a researcher requires shifts in the field of study throughout one's career. Besides, researchers' scientific interests are inherently dynamic and change over time. Hence, there exists a dynamicity to authors' interests and fields of work over time. In this work, we study this phenomenon through systematic approaches for representing and tracking dynamicity in different epochs. Our representation approaches are based on the idea that each author could be represented as a distribution of other authors. Concurrently, our tracking approaches rely on established mathematical concepts for measuring the change between two distributions. We focus on the publications in the 2001-2020 range and present a set of analyses built on top of the introduced approaches to understanding the potential connection between dynamicity and success.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>The past few decades have been an unprecedented era of scientific discoveries, with the sheer number of publications rising steadily <ref type="bibr" target="#b0">[1]</ref>. This constant growth of research collaborations has led to the emergence of new interdisciplinary domains, prompting researchers to expand their research horizons. This expansion, combined with the continuous development of scientific domains and the inherent nature of research to explore new areas, results in a potentially volatile set of research directions. This work introduces approaches for systematically studying this fluidity and uncovering interesting behaviors among authors.</p><p>Scientific publications are the information vessels scientists use to communicate their findings, methodologies, and critiques. At the same time, publications are reflections of their authors' interests and fields of study. These publications are bound together through citations that specify the foundations of each work. As a result, citations create tightly connected groups of publications with similar research directions. Consequently, authors with a high number of interactions in these groups, either through collaborations or citations, are more likely to have similar interests.</p><p>Community detection algorithms are graph partitioning approaches that identify sets of tightly connected nodes that are loosely connected to nodes outside their respective sets <ref type="bibr" target="#b1">[2,</ref><ref type="bibr" target="#b2">3]</ref>. When employed on citation networks, these algorithms yield a set of communities where each community contains highly related publications. These extracted communities could then be exploited for indirectly analyzing authors' interests through publications and citations as proxies.</p><p>In this work, we study the authors' dynamicity phenomenon from a relational standpoint. 
More specifically, we focus on the following research questions:</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>1. How can we characterize and quantify the interests and dynamicity of an author?</p><p>2. Is there any connection between dynamicity and success due to reasons such as adaptability or diversity?</p><p>To this end, we first create two knowledge graphs (KGs) from publications in the 2001-2020 period, each encompassing ten years' worth of scholarly information, i.e., publications and authors. Then, we introduce three vectorizing approaches focused on representing authors' interests in one epoch, and two tracking approaches focused on quantifying the change in interests between two distinct epochs. Our vectorizing approaches are built on top of relational information in the KGs and represent each author as a distribution over other authors. Meanwhile, our tracking approaches are based on two well-known measures: cosine similarity and relative entropy (Kullback-Leibler divergence). By mixing and matching, these approaches yield six different dynamicity scores for each author. We then use these scores to investigate the connection between authors' dynamicity and success. Our analyses showcase the connection between success, diversity, and adaptability in research.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related Work</head><p>Bird et al. <ref type="bibr" target="#b3">[4]</ref> analyzed community structures in the DBLP bibliographic database to investigate collaborative connections in computer science and interdisciplinary research at the individual, within-area, and network-wide levels. They developed quantifiable metrics such as longitudinal assortativity over the number of publications, collaborators, and career length to study author overlap and migration patterns. Prior to Bird et al. <ref type="bibr" target="#b3">[4]</ref>, Newman <ref type="bibr" target="#b4">[5]</ref> used data from publications in physics, biomedical research, and computer science to build co-authorship collaboration networks. They looked at the number of publications produced by authors, the number of authors per article, the number of collaborators that scientists have, the existence and size of a significant component of connected scientists, and the degree of clustering in the networks. They examined collaboration patterns among participants and discovered that these variables follow a power law distribution and that collaboration relationships are transitive. Paul et al. <ref type="bibr" target="#b5">[6]</ref> also used the DBLP database in their study to develop a citation-collaboration network to rank authors based on their contributions in terms of co-authorship and citations while verifying them against the h-index. They also carried out a comparative examination of the change in author ranking for different parts of the author spectrum over time.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Dataset</head><p>OpenAlex <ref type="bibr" target="#b6">[7]</ref> is a free and open catalog of scholarly entities that provides metadata for publications, authors, venues, institutions, and scientific concepts, along with the relationships among them. It gathers data from sources such as Crossref, Microsoft Academic Graph (MAG), ROR, ORCID, DOAJ, PubMed, PubMed Central, and Unpaywall. We use the OpenAlex dump obtained on 2022-12-07 to construct our dataset for this work. Given this dump, we first extract a KG containing all the publications and their connections, i.e., citation links. Then, we extract two induced KGs by filtering the publications with publication dates within two ranges of 2001-2010 and 2011-2020, naming them CG-2010 and CG-2020, respectively. Following this, we add the authorship information for each KG for all the publications. Finally, we drop all the nodes with a zero degree (in and out) in both KGs. After this procedure, we end up with two temporally-scoped KGs containing authorship and citation information for all the publications in the 2001-2010 and 2011-2020 periods. Table <ref type="table" target="#tab_0">1</ref> illustrates the statistics of the extracted KGs. To handle the large size of the raw dump, we resorted to using the KGTK toolkit for all our KG processing procedures <ref type="bibr" target="#b7">[8]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Methodology</head><p>We break down the problem of characterizing authors' dynamicity into two sets of approaches: Vectorizers and Trackers. Vectorizers, described in Section 4.1, focus on representing authors' interests in a single epoch. Trackers, described in Section 4.2, focus on quantifying the change in interests between two distinct epochs. When combined, these approaches provide a systematic way of characterizing authors' dynamicity.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Vectorizers</head><p>We introduce three approaches for vectorizing authors' interests in a given epoch. The main idea of all these approaches is that each author's interests could be modeled through a distribution over the set of other authors.</p><p>Our first two approaches rely only on the information that could be directly extracted from citation links. In contrast, the third approach uses external information by building upon the output of a community detection algorithm. As a result, the third approach is prone to erroneous information propagated from the underlying community detection algorithm; in return, it gains access to more complex information compared to the first two approaches.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.1.">Co-authors</head><p>In this approach, we represent an author's interests through their co-authors. To this end, given two arbitrary authors $p$ and $q$ and epoch $t$, we define the co-author weight value $\psi_p^t(q)$ as</p><formula xml:id="formula_0">\psi_p^t(q) = |\mathcal{V}_p^t \cap \mathcal{V}_q^t|<label>(1)</label></formula><p>where $\mathcal{V}_x^t$ is the set of publications by author $x$ in epoch $t$. Building on top of these co-author weight values, for any arbitrary author $p$, we form the representative vector $z_p^t$ as</p><formula xml:id="formula_1">z_p^t = [\psi_p^t(a_0), \psi_p^t(a_1), \ldots, \psi_p^t(a_{|\mathcal{A}|})]<label>(2)</label></formula><p>where $\mathcal{A}$ is the set of all authors in the KG. It is important to note that these representative vectors are extremely sparse due to the large cardinality of $\mathcal{A}$.</p></div>
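<div xmlns="http://www.tei-c.org/ns/1.0"><p>As an illustration of Equations 1 and 2, the following minimal sketch computes a sparse co-author vector for one epoch. The input layout (a dictionary from author id to a set of publication ids), the function name, and the toy data are assumptions made for this example and are not taken from the original pipeline.</p>
```python
def coauthor_vector(p, pubs_by_author):
    """Return the sparse vector z_p as {q: psi_p(q)}, omitting zero entries."""
    own = pubs_by_author[p]
    vector = {}
    for q, pubs in pubs_by_author.items():
        shared = len(own.intersection(pubs))  # |V_p ∩ V_q| from Equation 1
        if shared:
            vector[q] = shared
    return vector


# Toy epoch: author id -> set of publication ids (illustrative data only).
pubs_by_author = {
    "alice": {"w1", "w2", "w3"},
    "bob": {"w2", "w3"},
    "carol": {"w4"},
}
z_alice = coauthor_vector("alice", pubs_by_author)
print(z_alice)
```
<p>Equation 1 as written does not exclude the case where both authors are the same, so in this sketch an author's own entry equals their publication count. Storing only non-zero entries reflects the sparsity noted above.</p></div>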
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.2.">Citations</head><p>In this approach, we represent an author's interests through their citing and cited authors. To this end, given two arbitrary authors $p$ and $q$ and epoch $t$, we define the citation weight value $\varphi_p^t(q)$ as</p><formula xml:id="formula_2">\varphi_p^t(q) = \sum_{v \in \mathcal{V}_p^t} |\mathcal{N}_v^t \cap \mathcal{V}_q^t| + \sum_{u \in \mathcal{V}_q^t} |\mathcal{V}_p^t \cap \mathcal{N}_u^t|<label>(3)</label></formula><p>where $\mathcal{V}_x^t$ is the set of publications by author $x$ in epoch $t$ and $\mathcal{N}_y^t$ is the set of all publications cited by publication $y$ in epoch $t$. Building on these citation weight values, for any arbitrary author $p$, we form the representative vector $z_p^t$ following Equation <ref type="formula" target="#formula_1">2</ref>, replacing $\psi_p^t$ with $\varphi_p^t$.</p></div>
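<div xmlns="http://www.tei-c.org/ns/1.0"><p>The citation weight of Equation 3 can be sketched as follows. The two dictionaries (author to publications, publication to the publications it cites) and all names are hypothetical inputs chosen for illustration.</p>
```python
def citation_weight(p, q, pubs_by_author, refs_of_pub):
    """phi_p(q): citations from p's papers to q's papers plus the reverse."""
    v_p = pubs_by_author[p]
    v_q = pubs_by_author[q]
    none = frozenset()
    # First sum of Equation 3: papers of q cited by each paper of p.
    p_cites_q = sum(len(refs_of_pub.get(v, none).intersection(v_q)) for v in v_p)
    # Second sum of Equation 3: papers of p cited by each paper of q.
    q_cites_p = sum(len(refs_of_pub.get(u, none).intersection(v_p)) for u in v_q)
    return p_cites_q + q_cites_p


# Toy data: a2 cites b1 (and an outside paper x9); b1 cites a1.
pubs_by_author = {"alice": {"a1", "a2"}, "bob": {"b1"}}
refs_of_pub = {"a2": {"b1", "x9"}, "b1": {"a1"}}
phi_alice_bob = citation_weight("alice", "bob", pubs_by_author, refs_of_pub)
phi_bob_alice = citation_weight("bob", "alice", pubs_by_author, refs_of_pub)
```
<p>Because the two sums swap roles when the authors are exchanged, the weight is symmetric in this sketch, which matches the form of Equation 3.</p></div>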
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.3.">Communities</head><p>In this approach, we represent an author's interests through authors with whom they publish in the same research communities. To this end, given a KG encompassing epoch $t$, we first extract the citation graph by removing all non-publication nodes, i.e., authors. Then, we run the Leiden <ref type="bibr" target="#b2">[3]</ref> community detection algorithm to extract a set of communities $\mathcal{C}$. We rely on the hypothesis that each community represents a somewhat unique field of study. We use a modified version of the Leiden algorithm that limits the maximum number of generated communities and the number of publications in a community. Doing so avoids the creation of large, unfocused communities or small, insignificant ones. Given the set of extracted communities $\mathcal{C}$, for any two arbitrary authors $p$ and $q$, we define the co-occurrence weight value $\eta_p^{\mathcal{C}}(q)$ as</p><formula xml:id="formula_3">\eta_p^{\mathcal{C}}(q) = \begin{cases} \sum_{c \in \mathcal{C}} \frac{|c_p|}{|\mathcal{V}_p^t|} \log_2(|c_q| + \alpha) &amp; p \neq q \\ 0 &amp; p = q \end{cases}<label>(4)</label></formula><p>where $c_x$ is the set of publications by author $x$ in community $c$, $\mathcal{V}_x^t$ is the set of publications by author $x$ in epoch $t$, and $\alpha = 0.001$. In this formalization, the effect of each community is weighted by the share of the author's publications in that community, i.e., $\frac{|c_p|}{|\mathcal{V}_p^t|}$. Moreover, each co-occurring author's influence is smoothed by taking the log of their number of publications in the community, i.e., $\log_2(|c_q| + \alpha)$. The resulting equation highlights the connection between any two authors that have many papers in the same communities and simultaneously waives the need for tracking the communities themselves. Building on top of these co-occurrence weight values, for any arbitrary author $p$, we can form a representative vector $z_p^t$ following Equation <ref type="formula" target="#formula_1">2</ref>, replacing $\psi_p^t$ with $\eta_p^{\mathcal{C}}$.</p></div>
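<div xmlns="http://www.tei-c.org/ns/1.0"><p>A minimal sketch of the co-occurrence weight in Equation 4 is given below. It assumes that community assignments are already available as per-author publication counts and that every publication belongs to exactly one community, so that an author's counts sum to their number of publications in the epoch. The data layout and names are illustrative and do not come from the original implementation.</p>
```python
import math

ALPHA = 0.001  # smoothing constant stated in the text


def cooccurrence_weight(p, q, counts, alpha=ALPHA):
    """eta_p(q) computed from counts[x][c], the number of x's papers in community c."""
    if p == q:
        return 0.0
    total_p = sum(counts[p].values())  # |V_p| under the one-community-per-paper assumption
    weight = 0.0
    # Communities where p has no paper contribute zero, so they are skipped.
    for community, n_p in counts[p].items():
        n_q = counts.get(q, {}).get(community, 0)
        weight += (n_p / total_p) * math.log2(n_q + alpha)
    return weight


# Toy data: alice works mostly in "ml"; bob only in "ml"; carol only in "db".
counts = {
    "alice": {"ml": 3, "db": 1},
    "bob": {"ml": 4},
    "carol": {"db": 1},
}
eta_bob = cooccurrence_weight("alice", "bob", counts)
eta_carol = cooccurrence_weight("alice", "carol", counts)
```
<p>In this toy case the weight toward bob exceeds the weight toward carol, since bob publishes in the community that holds most of alice's papers. Note that the term is negative whenever the other author has no publication in a community, because the logarithm of the smoothing constant alone is negative.</p></div>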
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Trackers</head><p>We introduce two tracking approaches for quantifying the dynamicity between two distinct epochs. These two approaches are built on well-known mathematical concepts of cosine similarity and relative entropy.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.1.">Cosine Similarity (𝒮-score)</head><p>Given the representative vectors of an arbitrary author $p$ from two time periods, $z_p^t$ and $z_p^{t'}$, we calculate the cosine similarity score $\mathcal{S}_p^{t,t'}$ defined as</p><formula xml:id="formula_5">\mathcal{S}_p^{t,t'} = \frac{z_p^t \cdot z_p^{t'}}{\|z_p^t\| \, \|z_p^{t'}\|}<label>(5)</label></formula><p>The calculated cosine similarity scores represent the stability of authors' interests in two epochs, i.e., the higher the value, the more consistent the authors' interests.</p></div>
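<div xmlns="http://www.tei-c.org/ns/1.0"><p>Since the representative vectors are very sparse, Equation 5 can be evaluated on dictionaries that store only non-zero entries. The sketch below is one possible way to do so; the handling of an all-zero vector is an assumption of this example, as the text does not specify it.</p>
```python
import math


def cosine_similarity(z_a, z_b):
    """Cosine similarity of two sparse vectors stored as {author: weight}."""
    dot = sum(w * z_b.get(q, 0.0) for q, w in z_a.items())
    norm_a = math.sqrt(sum(w * w for w in z_a.values()))
    norm_b = math.sqrt(sum(w * w for w in z_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: an empty vector counts as fully dissimilar
    return dot / (norm_a * norm_b)


# Toy vectors of one author in two epochs (illustrative data only).
z_first = {"bob": 2, "carol": 1}
z_second_same = {"bob": 2, "carol": 1}
z_second_new = {"dave": 5}
s_stable = cosine_similarity(z_first, z_second_same)
s_shifted = cosine_similarity(z_first, z_second_new)
```
<p>An author who keeps the same neighbors obtains a score close to one, while an author whose neighbors are entirely replaced obtains zero.</p></div>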
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.2.">Relative Entropy (ℰ-score)</head><p>Building on top of the representative vectors, for each arbitrary author $p$ in period $t$, we define a probability distribution as</p><formula xml:id="formula_7">\mathcal{F}_p^t(q) = \frac{z_p^t[q] + \epsilon}{\sum_{q' \in \mathcal{A}} z_p^t[q'] + \epsilon |\mathcal{A}|} \quad \forall q \in \mathcal{A}<label>(6)</label></formula><p>where $\epsilon = \frac{1}{|\mathcal{A}|}$ is the prior probability and $\mathcal{A}$ is the set of all authors in the KG. Then, given the probability distributions of an arbitrary author $p$ from two time periods, $\mathcal{F}_p^t$ and $\mathcal{F}_p^{t'}$, we calculate the relative entropy $\mathcal{E}_p^{t,t'}$ as</p><formula xml:id="formula_8">\mathcal{E}_p^{t,t'} = D_{\mathrm{KL}}(\mathcal{F}_p^{t'} \,\|\, \mathcal{F}_p^t) = \sum_{q \in \mathcal{A}} \mathcal{F}_p^{t'}(q) \log\left(\frac{\mathcal{F}_p^{t'}(q)}{\mathcal{F}_p^t(q)}\right)<label>(7)</label></formula><p>In contrast to the cosine similarity score, the calculated relative entropy scores represent the volatility of authors' interests in two epochs, i.e., the higher the value, the less consistent the authors' interests are.</p></div>
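<div xmlns="http://www.tei-c.org/ns/1.0"><p>Equations 6 and 7 can be sketched on a small, explicit author set as follows. The natural logarithm is assumed here because the text does not state the base, and the function names and toy data are hypothetical.</p>
```python
import math


def smoothed_distribution(z, authors):
    """F_p(q) for every q in authors, with prior eps = 1 / |A| (Equation 6)."""
    eps = 1.0 / len(authors)
    denom = sum(z.get(q, 0.0) for q in authors) + eps * len(authors)
    return {q: (z.get(q, 0.0) + eps) / denom for q in authors}


def relative_entropy(z_old, z_new, authors):
    """D_KL(F_new || F_old) as in Equation 7 (natural log assumed)."""
    f_old = smoothed_distribution(z_old, authors)
    f_new = smoothed_distribution(z_new, authors)
    return sum(f_new[q] * math.log(f_new[q] / f_old[q]) for q in authors)


# Toy example: the author's weight moves from bob to carol between epochs.
authors = ["alice", "bob", "carol", "dave"]
z_old = {"bob": 3}
z_new = {"carol": 3}
e_shifted = relative_entropy(z_old, z_new, authors)
e_stable = relative_entropy(z_old, z_old, authors)
```
<p>The additive prior keeps every probability strictly positive, so the logarithm in Equation 7 is always defined, even for authors who appear in only one of the two epochs.</p></div>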
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Analyses</head><p>Throughout this section, we run all our analyses on a set of 10,000 randomly sampled authors. More specifically, we do a weighted sampling without replacement, using citation counts as the weights. This procedure allows us to manage the computational costs of running these analyses.</p></div>
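<div xmlns="http://www.tei-c.org/ns/1.0"><p>The text does not say which algorithm was used for the weighted sampling. One common option, shown here purely as an assumption, is the Efraimidis-Spirakis scheme: each item receives a random key raised to the inverse of its weight, and the items with the largest keys are kept. The function name, seed, and toy weights are illustrative.</p>
```python
import heapq
import random


def weighted_sample_without_replacement(weights, k, seed=0):
    """Draw up to k distinct items, favoring items with larger weights."""
    rng = random.Random(seed)
    keyed = []
    for item, weight in weights.items():
        if weight > 0:  # zero-weight items can never be drawn
            keyed.append((rng.random() ** (1.0 / weight), item))
    # The k largest keys form the sample.
    return [item for _, item in heapq.nlargest(k, keyed)]


# Toy citation counts per author (illustrative data only).
citation_counts = {"alice": 10, "bob": 1, "carol": 5, "dave": 0}
sample = weighted_sample_without_replacement(citation_counts, k=2)
```
<p>Each author appears at most once in the result, and authors without citations are never selected under this scheme.</p></div>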
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1.">Statistical Dependence Analysis</head><p>This analysis studies the connection between the introduced stability scores and success across two epochs. We use the relative change in average citation count as the proxy metric for success. The main intuitions behind this metric are 1) citation count is an accepted metric correlated with success in the community, 2) using the average mitigates the effect of a high number of publications from an author, and 3) using relative change locally normalizes the metric values. Moreover, to reduce the potential noise in the data, we remove outliers by filtering out samples that lie more than two standard deviations from the mean relative change in average citation count.</p><p>To quantify the strength of this connection, we use the established bivariate correlation and univariate linear regression measurements. We also include a random noise vectorizer as a sanity check for our methodology. Table <ref type="table" target="#tab_1">2</ref> presents the results of our analysis, with one of the introduced scores as the independent variable $\mathcal{X}$ and the relative change in average citation count as the dependent variable $\mathcal{Y}$ in each row. As evident from Table <ref type="table" target="#tab_1">2</ref>, every introduced score has a statistically significant connection with success, some positive and some negative, whereas the random baseline does not. Moreover, the "Citations" vectorizer shows the strongest correlation with the success measurement under both trackers, which signifies the effect of author interactions.</p></div>
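<div xmlns="http://www.tei-c.org/ns/1.0"><p>The outlier filter, the Pearson correlation coefficient, and the slope of a univariate least-squares fit can be sketched with the standard library as follows. The use of the population standard deviation and the synthetic data are assumptions of this example; the text does not specify either.</p>
```python
import math
import statistics


def drop_outliers(xs, ys, k=2.0):
    """Keep (x, y) pairs whose y is within k standard deviations of the mean y."""
    mean_y = statistics.fmean(ys)
    std_y = statistics.pstdev(ys)  # population std; the source does not say which
    return [(x, y) for x, y in zip(xs, ys) if k * std_y >= abs(y - mean_y)]


def pearson_and_slope(pairs):
    """Return (Pearson correlation, least-squares slope of y on x)."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y), cov / var_x


scores = list(range(10))             # stand-in dynamicity scores
changes = [2.0 * s for s in scores]  # stand-in success values
changes[-1] = 1000.0                 # one extreme value to be filtered out
kept = drop_outliers(scores, changes)
pcc, slope = pearson_and_slope(kept)
```
<p>On this synthetic input the extreme value is removed and the remaining points lie on a line, so the correlation is close to one and the slope is close to two. Standard errors and significance tests, as reported in Table 2, would require a statistics package and are omitted here.</p></div>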
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2.">Entropy Analysis</head><p>In this analysis, we study the connection between diversity and success. We use the authors' entropy across the extracted communities as a proxy for diversity. As for success, with similar intuitions to the previous section, we use the average citation count as the proxy metric. Formally, given the set of extracted communities $\mathcal{C}$, for any arbitrary author $p$, we calculate the entropy across communities $\mathcal{H}_p^{\mathcal{C}}$ as</p><formula xml:id="formula_9">w_p^c = \frac{|c_p|}{|\mathcal{V}_p^t|}<label>(8)</label></formula><formula xml:id="formula_10">\mathcal{H}_p^{\mathcal{C}} = -\sum_{c \in \mathcal{C}} w_p^c \log_2(w_p^c)<label>(9)</label></formula><p>where $c_x$ is the set of publications by author $x$ in community $c$ and $\mathcal{V}_x^t$ is the set of publications by author $x$ in epoch $t$. Figure <ref type="figure" target="#fig_0">1</ref> illustrates the results of our analysis. We can observe in Figure <ref type="figure" target="#fig_0">1</ref> that, in both epochs, the average citation count increases with entropy up to a point and then drops again. This observation indicates the benefit of having a diverse portfolio, but also that too much diversity could negatively impact success.</p></div>
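<div xmlns="http://www.tei-c.org/ns/1.0"><p>Equations 8 and 9 amount to the Shannon entropy of an author's publication shares across communities. A minimal sketch, assuming the per-community publication counts of one author are already known (the input layout and toy data are illustrative):</p>
```python
import math


def community_entropy(counts_p):
    """Entropy in bits over {community: number of the author's publications}."""
    total = sum(counts_p.values())
    entropy = 0.0
    for n in counts_p.values():
        if n > 0:  # a community without papers contributes nothing
            share = n / total  # w_p^c from Equation 8
            entropy -= share * math.log2(share)
    return entropy


# Toy authors: one focused on a single community, one split evenly over two.
h_focused = community_entropy({"ml": 4})
h_split = community_entropy({"ml": 2, "db": 2})
```
<p>A fully focused author has zero entropy, and an author split evenly over two communities has an entropy of one bit, so larger values correspond to a more diverse portfolio.</p></div>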
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.3.">Propensity Score Matching Analysis</head><p>This analysis focuses on the potential causal relationship between adaptability and success across two epochs by utilizing the propensity score matching (PSM) technique. We use the increase in entropy and the citation count in the second epoch as proxy metrics for adaptability and success, respectively. Following this, we designate the increase in entropy as the treatment variable and the citation count in the second epoch as the outcome variable. As for the confounding variables, we use the publication counts from both epochs and the citation count in the first epoch.</p><p>To check the matching quality, we plot one of the confounding variables, i.e., publication count in the second epoch, against the outcome variable for both control and treatment groups in Figure <ref type="figure" target="#fig_1">2</ref>. Moreover, Table <ref type="table" target="#tab_2">3</ref> presents the treatment effect evaluation results. From Table <ref type="table" target="#tab_2">3</ref>, we can observe that the average treatment effect (ATE) has a larger magnitude than the average treatment effect on the treated (ATT), while both are negative. This observation indicates that while, in general, the authors have experienced a decline in the number of citations, the increase in entropy slows down this phenomenon. Hence, adaptability, i.e., an increase in entropy, could be seen as a remedy for a decline in success.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Conclusion and Future Works</head><p>Motivated by our observation of scientific domains' fluidity and empowered by the emergence of public repositories of scholarly data, we presented a thorough, systematic study of the author dynamicity phenomenon in this work. With the idea of representing authors' interests and fields of work by a distribution over other authors, we introduced three systematic approaches for vectorizing each author in a single epoch. Then, to track an author's behavioral changes between two epochs, we introduced two approaches built on top of the extracted vectors and well-known mathematical measures for quantifying change. Based on these approaches, we presented in-depth analyses to better understand the connection between success, as measured by citation counts, and specific dynamic behaviors, as measured through the introduced approaches. Some straightforward extensions of our work for future studies are 1) including more authors, 2) using a more extended period, and 3) changing the temporal granularity for tracking changes. Moreover, we used a relatively simple metric as our success proxy; future work could use other metrics, such as the h-index or i10-index.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: The effect of entropy on average citation count.</figDesc><graphic coords="4,94.38,84.19,193.19,185.87" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Matched groups for the confounding variable, i.e., publication count in the second epoch, for both control and treatment groups against the outcome variable.</figDesc><graphic coords="5,94.38,84.19,193.19,171.81" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>Statistics of the extracted KGs.</figDesc><table><row><cell>Dataset</cell><cell>CG-2010</cell><cell>CG-2020</cell></row><row><cell># Publications</cell><cell>19,707,369</cell><cell>33,743,276</cell></row><row><cell># Authors</cell><cell>20,333,216</cell><cell>36,077,559</cell></row><row><cell># Citation Links</cell><cell>167,133,583</cell><cell>323,927,950</cell></row><row><cell># Authorship Links</cell><cell>67,531,472</cell><cell>137,160,724</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2</head><label>2</label><figDesc>Univariate linear regression and bivariate correlation metrics between introduced scores and relative change in average citation count. Legend: PCC: Pearson correlation coefficient.</figDesc><table><row><cell>Tracker</cell><cell>Vectorizer</cell><cell>PCC</cell><cell>Coef.</cell><cell>SE</cell><cell>𝑡</cell><cell>𝑃 &gt; |𝑡|</cell></row><row><cell></cell><cell>Random</cell><cell>-0.001</cell><cell>-967.70</cell><cell>5156.52</cell><cell>-0.188</cell><cell>0.851</cell></row><row><cell>𝒮-score</cell><cell>Co-authors</cell><cell>-0.121</cell><cell>-26.03</cell><cell>2.15</cell><cell>-12.11</cell><cell>0.000</cell></row><row><cell></cell><cell>Citations</cell><cell>-0.138</cell><cell>-27.95</cell><cell>2.02</cell><cell>-13.81</cell><cell>0.000</cell></row><row><cell></cell><cell>Communities</cell><cell>-0.082</cell><cell>-25.72</cell><cell>3.17</cell><cell>-8.12</cell><cell>0.000</cell></row><row><cell></cell><cell>Random</cell><cell>0.015</cell><cell>47.03</cell><cell>31.15</cell><cell>1.51</cell><cell>0.131</cell></row><row><cell>ℰ-score</cell><cell>Co-authors</cell><cell>-0.057</cell><cell>-0.64</cell><cell>0.11</cell><cell>-5.65</cell><cell>0.000</cell></row><row><cell></cell><cell>Citations</cell><cell>0.198</cell><cell>3.019</cell><cell>0.15</cell><cell>20.00</cell><cell>0.000</cell></row><row><cell></cell><cell>Communities</cell><cell>0.048</cell><cell>0.66</cell><cell>0.14</cell><cell>4.73</cell><cell>0.000</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 3</head><label>3</label><figDesc>Treatment effect evaluations. Legend: ATE: Average treatment effect, ATT: Average treatment effect on the treated, ATU: Average treatment effect on the untreated.</figDesc><table><row><cell>Metric</cell><cell>Est.</cell><cell>SE</cell><cell>𝑧</cell><cell>𝑃 &gt; |𝑧|</cell></row><row><cell>ATE</cell><cell>-189.157</cell><cell>36.274</cell><cell>-5.215</cell><cell>0.000</cell></row><row><cell>ATT</cell><cell>-176.136</cell><cell>29.762</cell><cell>-5.918</cell><cell>0.000</cell></row><row><cell>ATU</cell><cell>-202.178</cell><cell>43.471</cell><cell>-4.651</cell><cell>0.000</cell></row></table></figure>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>This work was funded by the Defense Advanced Research Projects Agency with award W911NF-19-20271 and with support from a Keston Exploratory Research Award.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Growth rates of modern science: A bibliometric analysis based on the number of publications and cited references</title>
		<author>
			<persName><forename type="first">L</forename><surname>Bornmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Mutz</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of the Association for Information Science and Technology</title>
		<imprint>
			<biblScope unit="volume">66</biblScope>
			<biblScope unit="page" from="2215" to="2222" />
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Fast unfolding of communities in large networks</title>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">D</forename><surname>Blondel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J.-L</forename><surname>Guillaume</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Lambiotte</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Lefebvre</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of statistical mechanics: theory and experiment</title>
		<imprint>
			<biblScope unit="page">P10008</biblScope>
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">From Louvain to Leiden: guaranteeing well-connected communities</title>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">A</forename><surname>Traag</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Waltman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">J</forename><surname>Van Eck</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Scientific Reports</title>
		<imprint>
			<biblScope unit="volume">9</biblScope>
			<biblScope unit="page" from="1" to="12" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<author>
			<persName><forename type="first">C</forename><surname>Bird</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">T</forename><surname>Barr</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Nash</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">T</forename><surname>Devanbu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Filkov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Su</surname></persName>
		</author>
		<title level="m">Structure and dynamics of research collaboration in computer science</title>
				<imprint>
			<publisher>SDM</publisher>
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Scientific collaboration networks. I. Network construction and fundamental results</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">E</forename><surname>Newman</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Phys Rev E Stat Nonlin Soft Matter Phys</title>
		<imprint>
			<biblScope unit="volume">64</biblScope>
			<biblScope unit="page">16131</biblScope>
			<date type="published" when="2001">2001</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Temporal analysis of author ranking using citation-collaboration network</title>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">S</forename><surname>Paul</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Kumar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Choudhury</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Nandi</surname></persName>
		</author>
		<idno type="DOI">10.1109/COMSNETS.2015.7098737</idno>
	</analytic>
	<monogr>
		<title level="m">2015 7th International Conference on Communication Systems and Networks (COMSNETS)</title>
				<imprint>
			<date type="published" when="2015">2015</date>
			<biblScope unit="page" from="1" to="6" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<author>
			<persName><forename type="first">J</forename><surname>Priem</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Piwowar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Orr</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2205.01833</idno>
		<title level="m">OpenAlex: A fully-open index of scholarly works, authors, venues, institutions, and concepts</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">KGTK: a toolkit for large knowledge graph manipulation and analysis</title>
		<author>
			<persName><forename type="first">F</forename><surname>Ilievski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Garijo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Chalupsky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">T</forename><surname>Divvala</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Yao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Rogers</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Singh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Schwabe</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Semantic Web Conference</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="278" to="293" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
