
Open-mentorship team is beneficial to disruptive ideas⋆

Bili Zheng

zhengbli@mail2.sysu.edu.cn

Wenjing Li

Jianhua Hou

houjh5@mail.sysu.edu.cn

Sun Yat-sen University, Guangzhou 510006, Guangdong, China

How collaboration benefits disruption is widely discussed in academia, but less attention has been paid to the mentorship relations among the coauthors of an article. This study focuses on the association between mentorship type (close or open), measured by whether the coauthors of a publication belong to the same academic genealogy, and the disruption of the publication, measured by the Disruption Index (DI). We selected 361,189 publications in Neuroscience from the SciSciNet database, constructed regression models, and estimated the relationship between the variables. Moreover, we used Propensity Score Matching and causal forest to estimate the causal relationship between them. The findings show that articles with open-mentorship collaboration are more disruptive than those with close-mentorship collaboration. The findings provide implications for team formation and team management in practice.

Keywords: academic genealogy, disruption index, mentorship type, team science

1. Introduction

In the past decades, scientific papers have become less disruptive [1]. Some studies attribute this drastic change to the scientific enterprise, team size, and collaboration distance [2, 3, 4]. Inspired by a series of studies on collaboration and disruption [3, 4], we are interested in whether a close-mentorship or an open-mentorship team fuses more disruptive ideas. The research question rests on the following definition: in a close-mentorship team, all members belong to the same academic genealogy, while in an open-mentorship team the members belong to more than one genealogy.

To address the question, we first define the term mentorship. Mentorship can occur formally through doctoral and postdoctoral advisor-advisee relationships or informally through collaborations. Some genealogy databases, such as The Academic Family Tree, encompass both advisor-advisee relationships and broader relationships in which the mentee may be the “learner” in a mentoring relationship regardless of age or position [5]. For simplicity, we use mentorship here to refer to the advisor-advisee relationship, as in most genealogical studies [6, 7].

© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

2. Data and method

2.1. Data collection

We derive mentorship from the dataset released by Ke et al. (2022) [8], which enriches the Academic Family Tree by adding publication records from Microsoft Academic Graph (MAG). Then, we obtain the DI of each paper from SciSciNet, which provides over 134 million scientific publications and frequently used indexes (such as DI, Z-score, and sleeping beauty coefficient) [9]. We obtained 505,926 papers with DI, 82,814 authors, and 5,855 academic genealogies. After excluding papers with missing values, 361,189 papers remained.
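The close/open distinction defined in the introduction can be sketched as a small classification step. This is only an illustrative sketch: the function name `mentorship_type`, the `genealogy_of` mapping, and the rule that a paper with an unmapped author is treated as missing are assumptions, not details taken from the dataset or from our pipeline.

```python
from typing import Dict, Iterable, Optional

def mentorship_type(author_ids: Iterable[str],
                    genealogy_of: Dict[str, str]) -> Optional[str]:
    """Return 'close' if all coauthors share one genealogy, 'open' if they
    span more than one, and None if any author has no genealogy record."""
    genealogies = set()
    for author in author_ids:
        if author not in genealogy_of:
            return None  # assumed handling: treat the paper as a missing value
        genealogies.add(genealogy_of[author])
    if not genealogies:
        return None  # no authors given
    return "close" if len(genealogies) == 1 else "open"

# Hypothetical author-to-genealogy mapping, for illustration only.
genealogy_of = {"a1": "g1", "a2": "g1", "a3": "g2"}
print(mentorship_type(["a1", "a2"], genealogy_of))  # both authors in g1
print(mentorship_type(["a1", "a3"], genealogy_of))  # authors in g1 and g2
```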

2.2. Causal inference

We adopt Propensity Score Matching (PSM) to validate causality between mentorship type and DI. The main effect of this study is the effect of the close-mentorship team on DI (the treatment effect). This effect can be influenced by confounding factors. According to the literature, team-related, personal-related, and article-related factors can be considered confounding factors of DI. Table 1 shows the variables we included in this study. The variable “outcome” (DI) is the explained variable. Admittedly, a large number of factors may influence DI, and it is hard to include them all. As in previous studies, we selected those factors for which: (1) prior work has investigated the factors as possibly influencing DI; (2) existing studies have verified the relationship with DI; (3) the data for calculating the factors were available in records from SciSciNet [10].
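The general PSM procedure can be sketched as follows. This is a minimal sketch on synthetic data, not our actual pipeline: the logistic propensity model fitted by gradient descent, the one-nearest-neighbour matching with replacement, the caliper of 0.05, and all function names are assumptions chosen for illustration.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_propensity(X, w, lr=0.5, epochs=300):
    """Fit a logistic model P(W=1 | X) by batch gradient descent."""
    n, d = len(X), len(X[0])
    beta = [0.0] * (d + 1)  # intercept plus one weight per covariate
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, wi in zip(X, w):
            err = sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], xi))) - wi
            grad[0] += err
            for j, x in enumerate(xi):
                grad[j + 1] += err * x
        beta = [b - lr * g / n for b, g in zip(beta, grad)]
    return beta

def att_by_matching(X, w, y, caliper=0.05):
    """Average treatment effect on the treated from 1-nearest-neighbour
    matching on the propensity score (with replacement)."""
    beta = fit_propensity(X, w)
    scores = [sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], xi))) for xi in X]
    controls = [i for i, wi in enumerate(w) if wi == 0]
    diffs = []
    for i, wi in enumerate(w):
        if wi != 1:
            continue
        j = min(controls, key=lambda c: abs(scores[c] - scores[i]))
        if abs(scores[j] - scores[i]) <= caliper:  # drop poor matches
            diffs.append(y[i] - y[j])
    if not diffs:
        raise ValueError("no treated unit found a match within the caliper")
    return sum(diffs) / len(diffs), len(diffs)

# Synthetic data (illustrative only): one confounder x drives both
# treatment assignment and the outcome; the true treatment effect is -1.0.
random.seed(0)
X, w, y = [], [], []
for _ in range(1000):
    x = random.uniform(-1.0, 1.0)
    treated = 1 if random.random() < sigmoid(x) else 0
    X.append([x])
    w.append(treated)
    y.append(2.0 * x - 1.0 * treated + random.gauss(0.0, 0.5))

naive = (sum(yi for yi, wi in zip(y, w) if wi) / sum(w)
         - sum(yi for yi, wi in zip(y, w) if not wi) / (len(w) - sum(w)))
att, n_matched = att_by_matching(X, w, y)
print(f"naive difference: {naive:.3f}, matched ATT: {att:.3f} ({n_matched} pairs)")
```

In this toy setting the naive difference in means is biased by the confounder, while the matched estimate should lie close to the simulated effect; a real analysis would use all covariates in Table 1 and an established matching library.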

To check the robustness of the PSM, we use causal forest (CF), a state-of-the-art causal inference method [11]. Compared with PSM, it mitigates the curse of dimensionality and can provide a more accurate estimate of the treatment effect.

In the causal forest, because we analyze heterogeneous causal effects, our estimation target is the Conditional Average Treatment Effect (CATE). The CATE for a given observation is defined as:

$$\tau(x) = \mathbb{E}\left[ Y_i^{W=1} - Y_i^{W=0} \mid X_i = x \right] \tag{1}$$

where $i = 1, 2, \ldots, n$ indexes the papers in our sample and $W_i \in \{0, 1\}$ indicates whether the team of paper $i$ is close-mentorship. We observe the outcome of interest $Y_i^{W=1}$ if paper $i$ is assigned to the treatment condition (i.e., if its team is close-mentorship); otherwise we observe $Y_i^{W=0}$. $X_i$ denotes a vector of the paper's other characteristics.
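To make the CATE definition concrete, the sketch below estimates τ(x) for a single discrete covariate by taking, within each stratum, the difference between the mean outcomes of treated and control papers. This is deliberately not a causal forest: it is a naive stratified estimator that is only meaningful if treatment is unconfounded within each stratum, and the function name and toy values are assumptions for illustration.

```python
from collections import defaultdict

def stratified_cate(x, w, y):
    """Estimate tau(x) = E[Y | W=1, X=x] - E[Y | W=0, X=x] per stratum x."""
    stats = defaultdict(lambda: [0.0, 0, 0.0, 0])  # [sum_y1, n1, sum_y0, n0]
    for xi, wi, yi in zip(x, w, y):
        s = stats[xi]
        if wi == 1:
            s[0] += yi
            s[1] += 1
        else:
            s[2] += yi
            s[3] += 1
    # Keep only strata that contain both treated and control observations.
    return {k: s[0] / s[1] - s[2] / s[3] for k, s in stats.items() if s[1] and s[3]}

# Toy data (hypothetical): stratum = team-size class, y = DI of the paper.
x = ["small"] * 4 + ["large"] * 4
w = [1, 1, 0, 0, 1, 1, 0, 0]
y = [0.01, 0.03, 0.04, 0.06, 0.02, 0.02, 0.02, 0.04]
cate = stratified_cate(x, w, y)
print(cate)
```

A causal forest replaces the fixed strata with data-driven neighbourhoods learned by an ensemble of trees, which is what allows it to handle many covariates at once [11].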

3. Results

3.1. OLS estimates

In our data, the number of papers with open-mentorship teams increased dramatically until 2011. However, the number of papers with close-mentorship teams is 0 after 1980. The number of papers with open-mentorship teams far exceeds that with close-mentorship teams (Figure 1). We tested the difference between the two groups with the Mann-Whitney U test in Python. The result shows that there is a significant difference between the two groups (p<0.001).
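For reference, the Mann-Whitney U test can be sketched in plain Python as below. This is an illustrative implementation using the normal approximation with a tie correction and no continuity correction; it is an assumption about the general form of the test, not the exact routine used in our analysis, and the DI values are made up.

```python
import math

def mann_whitney_u(a, b):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).
    Returns the U statistic of sample `a` and the approximate p-value."""
    n1, n2 = len(a), len(b)
    pooled = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    rank_sum_a, tie_term, i = 0.0, 0.0, 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        t = j - i                     # size of this group of tied values
        tie_term += t ** 3 - t
        rank_sum_a += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u_a = rank_sum_a - n1 * (n1 + 1) / 2.0
    n = n1 + n2
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u_a - n1 * n2 / 2.0) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value
    return u_a, p

# Hypothetical DI values for a handful of papers in each group.
close_di = [-0.004, -0.001, 0.000]
open_di = [0.002, 0.005, 0.009]
u, p = mann_whitney_u(close_di, open_di)
print(u, p)
```

With samples as large as ours the normal approximation is appropriate; for very small samples an exact test would be preferable.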

We first answer the question with OLS regression. When only the independent variable (mentorship type) was included in the model, its regression coefficient was negative and significant at the 0.01 level, providing initial evidence that close mentorship has a negative effect on disruption. The magnitude of this coefficient changed significantly when we added the control variables one by one. In the final model, we controlled for all the confounding variables and fixed effects; this specification had the largest adjusted R², suggesting that the controls enhanced the explanatory power of the model. The results of the final model show that a close-mentorship team has a significantly negative effect on disruption.
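Why the coefficient moves when controls are added can be illustrated with a small OLS sketch. The data below are synthetic and noise-free, the single control (team size) and the coefficient values are hypothetical, and the solver is a plain normal-equations implementation written for illustration rather than the software used for our estimates.

```python
def ols(X, y):
    """Least-squares coefficients for y ~ 1 + X via the normal equations."""
    rows = [[1.0] + list(r) for r in X]  # prepend the intercept column
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]

# Synthetic papers: close-mentorship indicator and team size (hypothetical).
close = [1, 1, 1, 0, 0, 0]
team_size = [2, 2, 3, 3, 4, 5]
di = [0.01 - 0.003 * c - 0.001 * t for c, t in zip(close, team_size)]

simple = ols([[c] for c in close], di)                       # DI ~ close
full = ols([[c, t] for c, t in zip(close, team_size)], di)   # DI ~ close + size
print("close coefficient without control:", simple[1])
print("close coefficient with control:   ", full[1])
```

Because close-mentorship teams are smaller in this toy sample and larger teams have lower DI, omitting team size understates the negative coefficient on the close-mentorship indicator; adding the control recovers the simulated value.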

3.2. Mentorship type and DI

We test the relationship between mentorship type and DI through PSM. We categorized papers with a close-mentorship team as the treatment group (29,556 samples) and papers with an open-mentorship team as the control group (331,632 samples). Figure 2(a) shows that the propensity score distributions of the two groups are significantly different before matching and converge after matching. Moreover, after matching, the distributions of PY, CI, A10, TS, RC, AA, AP, and AC are the same in the two groups, which indicates that the matching is effective. Using the hypothesis test commonly applied in A/B experiments, we found a significant difference in DI between the two groups (p<0.05): the DI of the close-mentorship team is on average 0.002917 lower than that of the open-mentorship team, that is, 36.34% lower than the DI of the open-mentorship team (Figure 2(b)).
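One common way to quantify covariate balance before and after matching is the standardized mean difference (SMD). The sketch below is offered as an illustration of that diagnostic under the assumption that it suits data like ours; it is not necessarily the check behind Figure 2, and the team-size values are made up.

```python
from statistics import mean, variance

def standardized_mean_difference(treated, control):
    """SMD = (mean_t - mean_c) / sqrt((var_t + var_c) / 2)."""
    pooled_sd = ((variance(treated) + variance(control)) / 2.0) ** 0.5
    return (mean(treated) - mean(control)) / pooled_sd

# Hypothetical team sizes (TS) for treated papers and two control samples.
ts_treated = [2, 3, 3, 4]
ts_control_before = [5, 6, 7, 8]   # all controls, before matching
ts_control_after = [2, 3, 3, 4]    # matched controls only

smd_before = standardized_mean_difference(ts_treated, ts_control_before)
smd_after = standardized_mean_difference(ts_treated, ts_control_after)
print(smd_before, smd_after)
```

By convention, an absolute SMD below roughly 0.1 after matching is often read as adequate balance on that covariate.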

To check the robustness of the results, we used causal forest (CF), a state-of-the-art method. For each paper, we obtain an individualized treatment effect together with its estimated 95% confidence interval. The CATEs of the close-mentorship team have a mean of 0.0004. In other words, the close-mentorship team decreases DI by 0.0004 on average. However, when we take citation counts as the dependent variable, the CATEs of the close-mentorship team have a mean of 8.503, which means that papers with close-mentorship teams may receive more citations.

4. Discussion and conclusion

This study investigated whether the close-mentorship team fuses more disruptive ideas than the open-mentorship team. We used academic genealogy to determine whether an article was close-mentorship or open-mentorship and used the Disruption Index to quantify the disruptiveness of its ideas. We investigated the relationship between the variables by analyzing papers in Neuroscience and constructing regression models. Moreover, we used PSM and causal forest to test whether there is a causal relationship between mentorship type and DI. The results indicate that articles with a close-mentorship team are less disruptive than those with an open-mentorship team. However, articles with a close-mentorship team are more cited than those with an open-mentorship team.

[1] Park, M., Leahey, E., & Funk, R. J. (2023). Papers and patents are becoming less disruptive over time. Nature, 613(7942), 138-144. doi:10.1038/s41586-022-05543-x

[2] Kozlov, M. (2023). 'Disruptive' science has declined - and no one knows why. Nature, 613(7943), 225. doi:10.1038/d41586-022-04577-5

[3] Lin, Y., Frey, C. B., & Wu, L. (2023). Remote collaboration fuses fewer breakthrough ideas. Nature, 623(7989), 987-991. doi:10.1038/s41586-023-06767-1

[4] Wu, L., Wang, D., & Evans, J. A. (2019). Large teams develop and small teams disrupt science and technology. Nature, 566(7744), 378-382. doi:10.1038/s41586-019-0941-9

[5] Ke, Q., Liang, L., Ding, Y., David, S. V., & Acuna, D. E. (2022). A dataset of mentorship in bioscience with semantic and demographic estimations. Scientific Data, 9(1), 467. doi:10.1038/s41597-022-01578-x

[6] Corsini, A., Pezzoni, M., & Visentin, F. (2022). What makes a productive Ph.D. student? Research Policy, 51(10). doi:10.1016/j.respol.2022.104561

[7] Rosenfeld, A., & Maksimov, O. (2022). Should Young Computer Scientists Stop Collaborating with Their Doctoral Advisors? Communications of the ACM, 65(10), 66-72. doi:10.1145/3529089

[8] Ke, Q., Liang, L., Ding, Y., David, S. V., & Acuna, D. E. (2022). A dataset of mentorship in bioscience with semantic and demographic estimations. Scientific Data, 9(1), 467. doi:10.1038/s41597-022-01578-x

[9] Lin, Z., Yin, Y., Liu, L., & Wang, D. (2023). SciSciNet: A large-scale open data lake for the science of science research. Scientific Data, 10(1). doi:10.1038/s41597-023-02198-9

[10] Bornmann, L., Haunschild, R., & Mutz, R. (2020). Should citations be field-normalized in evaluative bibliometrics? An empirical analysis based on propensity score matching. Journal of Informetrics, 14(4), 20. doi:10.1016/j.joi.2020.101098

[11] Wager, S., & Athey, S. (2018). Estimation and Inference of Heterogeneous Treatment Effects using Random Forests. Journal of the American Statistical Association, 113(523), 1228-1242. doi:10.1080/01621459.2017.1319839