=Paper= {{Paper |id=Vol-3745/paper25 |storemode=property |title=Open-mentorship Team is Beneficial to Disruptive Ideas |pdfUrl=https://ceur-ws.org/Vol-3745/paper25.pdf |volume=Vol-3745 |authors=Bili Zheng,Wenjing Li,Jianhua Hou |dblpUrl=https://dblp.org/rec/conf/eeke/ZhengLH24 }} ==Open-mentorship Team is Beneficial to Disruptive Ideas== https://ceur-ws.org/Vol-3745/paper25.pdf

Open-mentorship team is beneficial to disruptive ideas⋆
Bili Zheng¹, Wenjing Li¹ and Jianhua Hou¹,∗

¹ Sun Yat-sen University, Guangzhou 510006, Guangdong, China

Abstract
How collaboration benefits disruption is widely discussed in academia, but less attention is paid
to mentorship in the collaboration of an article. This study focuses on the association between
close/open-mentorship measured by whether coauthors in publications belong to the same
academic genealogy and the disruption of publications measured by the Disruption Index (DI).
We selected 361,189 publications in Neuroscience from the SciSciNet database and then
constructed regression models and estimated the relationship between the variables. Moreover,
we used Propensity Score Matching and causal forest to estimate the causal relationship between
them. The findings show that articles with open-mentorship collaboration are more disruptive
than those with close-mentorship collaboration. The findings provide implications for team
formation and team management in practice.

Keywords
academic genealogy, disruption index, mentorship type, team science

1. Introduction

In the past decades, scientific papers have become less disruptive [1]. Some studies attribute
this drastic change to the scientific enterprise, team size, and collaboration distance [2, 3, 4].
Inspired by a series of studies on collaboration and disruption [3, 4], we are interested in
whether a close-mentorship or an open-mentorship team will fuse more disruptive ideas. The
research question is based on the following assumption: a close-mentorship team means all the
members in a team belong to the same genealogy, while an open-mentorship team means the members
belong to more than one genealogy.

To address the question, we first define the term mentorship. Mentorship can occur formally
through doctoral and postdoctoral advisor-advisee relationships or informally through
collaborations. Some genealogy databases like The Academic Family Tree encompass both
advisor-advisee relationships and broad range relationships, which means the mentee may be the
“learner” in mentoring relationships regardless of age or position [5]. For simplicity, we refer
to mentorship here as the advisor-advisee relationship, like most genealogical studies [6, 7].

2. Data and method

2.1. Data collection

We derive mentorship from the dataset released by Ke et al. (2022) [8], which enriches the
Academic Family Tree by adding publication records from Microsoft Academic Graph (MAG). Then, we
obtain the DI of each paper from SciSciNet, which provides over 134 million scientific
publications and frequently used indexes (such as DI, Z-score, and sleeping beauty coefficient)
[9]. We obtained 505,926 papers with DI, 82,814 authors, and 5,855 academic genealogies.
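
The close/open distinction defined in the introduction can be illustrated with a minimal sketch.
It assumes that every author maps to exactly one genealogy identifier. The function name
`mentorship_type`, the `genealogy_of` mapping, and the toy identifiers are hypothetical and are
not taken from the dataset in [8].

```python
from typing import Dict, Iterable


def mentorship_type(authors: Iterable[str], genealogy_of: Dict[str, str]) -> str:
    """Return 'close' if all coauthors share one genealogy, otherwise 'open'."""
    genealogies = {genealogy_of[a] for a in authors}
    return "close" if len(genealogies) == 1 else "open"


# Hypothetical author -> genealogy mapping (assumption: one genealogy per author).
genealogy_of = {"a1": "g1", "a2": "g1", "a3": "g2"}

# Hypothetical papers with their author lists.
papers = {"p1": ["a1", "a2"], "p2": ["a1", "a3"]}

labels = {pid: mentorship_type(auths, genealogy_of) for pid, auths in papers.items()}

# Binary treatment coding: 1 = close-mentorship team, 0 = open-mentorship team.
treatment = {pid: int(label == "close") for pid, label in labels.items()}
```

In this toy example `p1` is labelled close because both authors fall in `g1`, while `p2` is
labelled open because its authors span two genealogies.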

Joint Workshop of the 5th Extraction and Evaluation of
Knowledge Entities from Scientific Documents and the 4th AI +
Informetrics (EEKE-AII2024), April 23~24, 2024, Changchun,
China and Online
EMAIL: zhengbli@mail2.sysu.edu.cn (Bili Zheng);
liwj336@mail2.sysu.edu.cn (Wenjing Li);
houjh5@mail.sysu.edu.cn (Jianhua Hou)
∗ Corresponding author.

© 2023 Copyright for this paper by its authors. Use permitted under
Creative Commons License Attribution 4.0 International (CC BY 4.0).

CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073

After excluding missing values, there were 361,189 papers.

2.2. Causal inference

Table 1
Variable description

| No | Variable | Variable type | Annotation |
|----|----------|---------------|------------|
| 1 | treatment | binary | 1 if it is a close-mentorship team; 0 if it is an open-mentorship team |
| 2 | outcome | continuous | Disruption index (DI) |
| 3 | PY | discrete | Publication year |
| 4 | CI | discrete | Total citation counts |
| 5 | A10 | continuous | 10th percentile Z-score of the paper defined in Uzzi et al. (2013) |
| 6 | TS | discrete | Team size of an article |
| 7 | RC | binary | 1 if it is remote collaboration; 0 if it is not |
| 8 | AA | continuous | The average age of authors in a team |
| 9 | AP | continuous | The average productivity of authors in a team |
| 10 | AC | continuous | Average citation counts of authors in a team |

We adopt Propensity Score Matching (PSM) to validate causality between mentorship type and DI.
The main effect of this study is the effect of the close-mentorship team on DI (treatment
effect). The effect can be influenced by some confounding factors. Based on the literature
review, team-related, personal-related, and article-related factors can be considered as
confounding factors of DI. Table 1 shows the variables we included in this study. The variable
“outcome” (DI) is the explained variable. Admittedly, there is a large number of factors that may
influence DI, but it is hard to include all of them. As implemented in previous studies, we
selected those factors for which: (1) prior work has investigated the factors possibly
influencing DI; (2) existing studies had verified the relationship with DI; (3) the data for
calculating the factors were available in records from SciSciNet [10].

To check the robustness of the PSM, we use causal forest (CF), a state-of-the-art causal
inference method [11]. Compared with PSM, it mitigates the curse of dimensionality and provides
a more accurate estimate of the treatment effect.

In the causal forest, considering the analysis of heterogeneous causal effects, our estimation
objective is the Conditional Average Treatment Effect (CATE). The CATE for a given observation
$i$ is defined as:

$$\tau(x) = \mathbb{E}\left[ Y_i^{W=1} - Y_i^{W=0} \mid X_i = x \right] \qquad \text{(Eq. 1)}$$

where $i = 1, 2, \ldots, n$ indexes the papers in our sample and $W_i \in \{0, 1\}$ indicates
whether the team of paper $i$ is close-mentorship. We observe the outcome of interest
$Y_i^{W=1}$ if the paper is assigned to the treatment condition (i.e., if the team of the paper
is close-mentorship); otherwise we observe $Y_i^{W=0}$. $X_i$ denotes a vector of the paper's
other characteristics.

3. Results

3.1. OLS estimates

From the data we observed, the number of papers with open-mentorship teams dramatically
increased until 2011. However, the number of papers with close-mentorship teams is 0 after 1980.
The number of papers with open-mentorship teams far exceeds that with close-mentorship teams
(Figure 1). We tested the difference between the two groups with a Mann-Whitney test in Python.
The result shows that there is a significant difference between the two groups (p<0.001).

We first answer the question by OLS regression. When only the independent variable (the
mentorship relationship) was included in the model, the regression coefficient of the variable
is negative and significant at the 0.01 level, providing initial evidence that close mentorship
has a negative effect on disruption. The magnitude of this coefficient changed significantly
when we added the control variables one by one. In the final model, we controlled all the
confounding variables and fixed effects, and the model specification had the largest adjusted
R², suggesting that the explanatory power of the model was enhanced by the controls. The results
in the final model show that a close-mentorship team has a significantly negative effect on
disruption.
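
The OLS step in Section 3.1, where the treatment coefficient shifts once control variables are
added, can be reproduced on toy data. This is a sketch only: the pure-Python `ols` helper, the
single binary control `x`, and the data-generating rule y = 2x - w are illustrative assumptions,
not the authors' model or data.

```python
def ols(X, y):
    """Least-squares coefficients for y ~ X, where X already has an intercept column."""
    k = len(X[0])
    # Build the augmented normal equations [X'X | X'y].
    A = []
    for i in range(k):
        row = [sum(r[i] * r[j] for r in X) for j in range(k)]
        row.append(sum(r[i] * yi for r, yi in zip(X, y)))
        A.append(row)
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                factor = A[r][c]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]


# Toy rows (x, w): x is a hypothetical binary control, w is the treatment
# (1 = close-mentorship). Treated papers are concentrated where x = 1.
rows = [(0, 1)] * 2 + [(0, 0)] * 6 + [(1, 1)] * 6 + [(1, 0)] * 2
y = [2.0 * x - 1.0 * w for x, w in rows]  # assumed true treatment effect: -1

X_simple = [[1.0, float(w)] for x, w in rows]          # intercept + treatment
X_full = [[1.0, float(w), float(x)] for x, w in rows]  # ... plus the control

b_simple = ols(X_simple, y)  # [intercept, treatment]
b_full = ols(X_full, y)      # [intercept, treatment, control]
```

In this construction the treatment coefficient is about 0 without the control and about -1 once
`x` is included, which shows why the coefficient can move substantially as confounders enter the
model.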

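
A minimal sketch of PSM as outlined in Section 2.2 is given here. The source does not specify
the propensity model or the matching rule, so the logistic regression fitted by gradient
descent, the one-nearest-neighbour matching with replacement, and the toy data are all
assumptions. The names `fit_logistic`, `propensity`, and `psm_att` are hypothetical.

```python
import math


def fit_logistic(X, w, lr=0.5, steps=2000):
    """Fit P(w = 1 | x) by batch gradient descent; returns [intercept, *slopes]."""
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(steps):
        grad = [0.0] * (k + 1)
        for xi, wi in zip(X, w):
            err = propensity(beta, xi) - wi
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        beta = [b - lr * g / n for b, g in zip(beta, grad)]
    return beta


def propensity(beta, xi):
    """Estimated probability of treatment for covariate vector xi."""
    z = beta[0] + sum(b * v for b, v in zip(beta[1:], xi))
    return 1.0 / (1.0 + math.exp(-z))


def psm_att(X, w, y):
    """Average treatment effect on the treated via 1-NN matching on the propensity score."""
    beta = fit_logistic(X, w)
    scores = [propensity(beta, xi) for xi in X]
    treated = [i for i, wi in enumerate(w) if wi == 1]
    controls = [i for i, wi in enumerate(w) if wi == 0]
    diffs = []
    for i in treated:
        # Nearest control by propensity score (matching with replacement).
        j = min(controls, key=lambda c: abs(scores[c] - scores[i]))
        diffs.append(y[i] - y[j])
    return sum(diffs) / len(diffs)


# Toy rows (x, w): x is a hypothetical binary confounder, w the treatment.
rows = [(0, 1)] * 2 + [(0, 0)] * 6 + [(1, 1)] * 6 + [(1, 0)] * 2
X = [[float(x)] for x, _ in rows]
w = [wi for _, wi in rows]
y = [2.0 * x - 1.0 * wi for x, wi in rows]  # assumed true treatment effect: -1

# Unadjusted difference in group means, for comparison.
naive = (sum(yi for yi, wi in zip(y, w) if wi == 1) / w.count(1)
         - sum(yi for yi, wi in zip(y, w) if wi == 0) / w.count(0))
att = psm_att(X, w, y)
```

On this toy data the unadjusted difference in means is 0, while the matched estimate recovers
the assumed effect of -1, because each treated unit is compared with a control that has the same
propensity score.
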
Figure 1. The distribution of mentorship type. (a) The annual distribution of papers with
different mentorship types. (b) The distribution of the paper's DI with different mentorship
types.

3.2. Mentorship type and DI

We test the relationship between mentorship and DI through PSM. We categorized papers with a
close-mentorship team as the treatment group (29,556 samples) and papers with an open-mentorship
team as the control group (331,632 samples). Figure 2(a) shows that the propensity score
distributions of the two groups are significantly different before matching, while the
propensity scores of the two groups converge after matching. After matching the two groups of
samples, the distributions of PY, CI, A10, TS, RC, AA, AP, and AC are the same, which indicates
that the matching is effective. Through the hypothesis test commonly used in A/B experiments, we
found that there is a significant difference in DI between the two groups (p<0.05), with the
average DI of the close-mentorship team being 0.002917 lower than that of the open-mentorship
team, which means that the DI of the close-mentorship team is 36.34% lower than the DI of the
open-mentorship team (Figure 2(b)).

To check the robustness of the results, we used causal forest (CF), a state-of-the-art method.
For each paper, we obtain an individualized treatment effect with its 95% confidence interval.
The CATEs of the close-mentorship team have a mean of -0.0004. In other words, the
close-mentorship team decreases DI by 0.0004 on average. However, when we take citation counts
as the dependent variable, we found that the CATEs of the close-mentorship team have a mean of
8.503, which means that papers with close-mentorship may have more citation counts.

Figure 2. The propensity score distribution. (a) The propensity score before matching. (b) The
distribution of DI between the close-mentorship group and the open-mentorship group.

4. Discussion and conclusion

This study investigated whether the close-mentorship team fuses more disruptive ideas than the
open-mentorship team. We used academic genealogy to quantify whether an article was
close-mentorship or open-mentorship and used the Disruption Index to quantify the
disruptiveness of its ideas. We investigated the
relationship between the variables by analyzing papers in Neuroscience and constructing
regression models. Moreover, we used PSM and causal forest to test whether there is a causal
relationship between mentorship type and DI. The results indicate that the articles with the
close-mentorship team are less disruptive than those with the open-mentorship team. However, the
articles with the close-mentorship team are more cited than those with the open-mentorship team.

References
[1] Park, M., Leahey, E., & Funk, R. J. (2023). Papers and patents are becoming less disruptive over time. Nature, 613(7942), 138-144. doi:10.1038/s41586-022-05543-x
[2] Kozlov, M. (2023). 'Disruptive' science has declined - and no one knows why. Nature, 613(7943), 225. doi:10.1038/d41586-022-04577-5
[3] Lin, Y., Frey, C. B., & Wu, L. (2023). Remote collaboration fuses fewer breakthrough ideas. Nature, 623(7989), 987-991. doi:10.1038/s41586-023-06767-1
[4] Wu, L., Wang, D., & Evans, J. A. (2019). Large teams develop and small teams disrupt science and technology. Nature, 566(7744), 378-382. doi:10.1038/s41586-019-0941-9
[5] Ke, Q., Liang, L., Ding, Y., David, S. V., & Acuna, D. E. (2022). A dataset of mentorship in bioscience with semantic and demographic estimations. Scientific Data, 9(1), 467. doi:10.1038/s41597-022-01578-x
[6] Corsini, A., Pezzoni, M., & Visentin, F. (2022). What makes a productive Ph.D. student? Research Policy, 51(10). doi:10.1016/j.respol.2022.104561
[7] Rosenfeld, A., & Maksimov, O. (2022). Should Young Computer Scientists Stop Collaborating with Their Doctoral Advisors? Communications of the ACM, 65(10), 66-72. doi:10.1145/3529089
[8] Ke, Q., Liang, L., Ding, Y., David, S. V., & Acuna, D. E. (2022). A dataset of mentorship in bioscience with semantic and demographic estimations. Scientific Data, 9(1), 467. doi:10.1038/s41597-022-01578-x
[9] Lin, Z., Yin, Y., Liu, L., & Wang, D. (2023). SciSciNet: A large-scale open data lake for the science of science research. Scientific Data, 10(1). doi:10.1038/s41597-023-02198-9
[10] Bornmann, L., Haunschild, R., & Mutz, R. (2020). Should citations be field-normalized in evaluative bibliometrics? An empirical analysis based on propensity score matching. Journal of Informetrics, 14(4), 20. doi:10.1016/j.joi.2020.101098
[11] Wager, S., & Athey, S. (2018). Estimation and Inference of Heterogeneous Treatment Effects using Random Forests. Journal of the American Statistical Association, 113(523), 1228-1242. doi:10.1080/01621459.2017.1319839
