<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Open-mentorship team is beneficial to disruptive ideas</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Bili Zheng</string-name>
          <email>zhengbli@mail2.sysu.edu.cn</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Wenjing Li</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jianhua Hou</string-name>
          <email>houjh5@mail.sysu.edu.cn</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Sun Yat-sen University</institution>
          ,
          <addr-line>Guangzhou 510006, Guangdong</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>How collaboration benefits disruption is widely discussed in academia, but less attention has been paid to mentorship among the coauthors of an article. This study focuses on the association between mentorship type (close or open), measured by whether the coauthors of a publication belong to the same academic genealogy, and the disruption of publications, measured by the Disruption Index (DI). We selected 361,189 publications in Neuroscience from the SciSciNet database, constructed regression models, and estimated the relationship between the variables. Moreover, we used Propensity Score Matching and causal forest to estimate the causal relationship between them. The findings show that articles with open-mentorship collaboration are more disruptive than those with close-mentorship collaboration. The findings have implications for team formation and team management in practice.</p>
      </abstract>
      <kwd-group>
        <kwd>academic genealogy</kwd>
        <kwd>disruption index</kwd>
        <kwd>mentorship type</kwd>
        <kwd>team science</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        In the past decades, scientific papers have become
less disruptive [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Some studies attribute this drastic
change to the scientific enterprise, team size, and
collaboration distance [
        <xref ref-type="bibr" rid="ref2 ref3 ref4">2, 3, 4</xref>
        ]. Inspired by a series of
studies on collaboration and disruption [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ], we are
interested in whether a close-mentorship or an
open-mentorship team fuses more disruptive ideas. The
research question rests on the following
working definition: in a close-mentorship team, all
members belong to the same academic genealogy,
whereas in an open-mentorship team, the members
belong to more than one genealogy.
      </p>
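      <p>As an illustration of the team definition above, the following minimal Python sketch labels a paper as close- or open-mentorship from its authors' genealogies. The function name, author identifiers, and genealogy labels are hypothetical and do not come from the dataset. Leaving a paper unlabeled when any author's genealogy is unknown is also an assumption made for this sketch.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from typing import Dict, Iterable, Optional


def mentorship_type(authors: Iterable[str],
                    genealogy_of: Dict[str, str]) -> Optional[str]:
    """Label a team as 'close' (one genealogy) or 'open' (several genealogies).

    Returns None when the team is empty or when any author's genealogy is
    unknown (an assumption of this sketch, not a rule stated in the paper).
    """
    genealogies = set()
    for author in authors:
        tree = genealogy_of.get(author)
        if tree is None:
            return None  # genealogy unknown: leave the paper unlabeled
        genealogies.add(tree)
    if not genealogies:
        return None
    return "close" if len(genealogies) == 1 else "open"


# Hypothetical toy mapping from author id to genealogy id.
genealogy_of = {"a1": "tree_A", "a2": "tree_A", "a3": "tree_B"}

labels = {
    "paper_1": mentorship_type(["a1", "a2"], genealogy_of),  # same genealogy
    "paper_2": mentorship_type(["a1", "a3"], genealogy_of),  # two genealogies
    "paper_3": mentorship_type(["a1", "a9"], genealogy_of),  # a9 is unknown
}
```
]]></preformat>
      <p>Under this sketch, a single-author paper with a known genealogy would be labeled close-mentorship; whether such papers should be kept is a design choice the definition leaves open.</p>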
      <p>To address the question, we first define the term
mentorship. Mentorship can occur formally through
doctoral and postdoctoral advisor-advisee
relationships or informally through collaborations.
Some genealogy databases, such as the Academic Family
Tree, encompass both advisor-advisee relationships
and a broader range of relationships, in which the
mentee may be the “learner” in a mentoring
relationship regardless of age or position [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. In line with most genealogical studies [
        <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
        ], we use mentorship here to refer to the
advisor-advisee relationship.
      </p>
      <p>© 2023 Copyright for this paper by its authors. Use permitted under
Creative Commons License Attribution 4.0 International (CC BY 4.0).</p>
    </sec>
    <sec id="sec-2">
      <title>2. Data and method</title>
      <sec id="sec-2-1">
        <title>2.1. Data collection</title>
        <p>
          We derive mentorship from the dataset released by
Ke et al. (2022) [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ], which enriches the Academic
Family Tree by adding publication records from
Microsoft Academic Graph (MAG) [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. Then, we obtain
the DI of each paper from SciSciNet, which provides
over 134 million scientific publications and frequently
used indexes (such as DI, Z-score, and sleeping beauty
coefficient) [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]. This yielded 505,926 papers with a DI value,
82,814 authors, and 5,855 academic genealogies.
After excluding records with missing values, 361,189
papers remained.
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Causal inference</title>
        <p>
          We adopt Propensity Score Matching (PSM) to
assess the causal relationship between mentorship type and DI.
The main effect of interest is the effect of the
close-mentorship team on DI (the treatment effect). This effect
can be influenced by confounding factors. Based on
the literature, team-related, person-related, and
article-related factors can be considered
confounders of DI. Table 1 lists the variables
included in this study. The variable “outcome” (DI)
is the explained variable. Admittedly, many
factors may influence DI, and it is hard to
include them all. Following previous
studies, we selected factors for which (1) prior
work has investigated their possible influence on
DI; (2) existing studies have verified the relationship
with DI; and (3) the data needed to calculate them were
available in SciSciNet records [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ].
        </p>
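        <p>The matching step of PSM can be sketched as follows. This is a minimal illustration, assuming the propensity scores have already been estimated (for example, with a logistic regression of treatment on the confounders). It uses greedy 1:1 nearest-neighbour matching without replacement and a caliper, which may differ from the matching procedure actually used in this study. The function names, the caliper value, and all numbers are made up for the example.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from statistics import mean
from typing import List, Tuple

Unit = Tuple[float, float]  # (propensity score, outcome such as DI)


def match_nearest(treated: List[Unit], control: List[Unit],
                  caliper: float = 0.05) -> List[Tuple[float, float]]:
    """Greedy 1:1 nearest-neighbour matching on the propensity score.

    Each control unit is used at most once. A treated unit stays unmatched
    when its closest control is farther away than the caliper.
    Returns (treated_outcome, control_outcome) pairs.
    """
    available = list(control)
    pairs = []
    for score, outcome in treated:
        if not available:
            break
        best = min(range(len(available)),
                   key=lambda j: abs(available[j][0] - score))
        if abs(available[best][0] - score) <= caliper:
            _, matched_outcome = available.pop(best)
            pairs.append((outcome, matched_outcome))
    return pairs


def att(pairs: List[Tuple[float, float]]) -> float:
    """Average treatment effect on the treated over the matched pairs."""
    return mean(t - c for t, c in pairs)


# Made-up toy data: (propensity score, DI).
treated = [(0.30, 0.001), (0.55, -0.002), (0.90, 0.004)]
control = [(0.28, 0.004), (0.50, 0.002), (0.52, 0.000), (0.10, 0.010)]

pairs = match_nearest(treated, control)
effect = att(pairs)
```
]]></preformat>
        <p>In this toy example, the treated unit with score 0.90 has no control within the caliper and is dropped, so the estimate is based on two matched pairs only.</p>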
        <p>
          To check the robustness of the PSM results, we use causal
forest (CF), a state-of-the-art causal inference method
[
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. Compared with PSM, it is less affected by the curse of
dimensionality and can provide a more accurate estimate
of the treatment effect.
        </p>
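        <p>In the causal forest, because we analyze
heterogeneous causal effects, our estimation target
is the Conditional Average Treatment Effect (CATE). The
CATE at a given covariate value <inline-formula><tex-math>x</tex-math></inline-formula> is defined as:
<disp-formula><label>(1)</label><tex-math>\tau(x) = \mathbb{E}\left[\, Y_i^{W=1} - Y_i^{W=0} \mid X_i = x \,\right]</tex-math></disp-formula>
where <inline-formula><tex-math>i = 1, 2, \ldots, n</tex-math></inline-formula> indexes the papers in our
sample and <inline-formula><tex-math>W_i \in \{0, 1\}</tex-math></inline-formula> indicates whether the team of
paper <inline-formula><tex-math>i</tex-math></inline-formula> is close-mentorship. We observe the outcome
of interest <inline-formula><tex-math>Y_i^{W=1}</tex-math></inline-formula> if the paper is assigned to the
treatment condition (i.e., if the team of the paper is
close-mentorship); otherwise, we observe <inline-formula><tex-math>Y_i^{W=0}</tex-math></inline-formula>. <inline-formula><tex-math>X_i</tex-math></inline-formula> denotes a
vector of the paper's other characteristics.</p>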
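        <p>To make the CATE definition concrete, the sketch below estimates it in the simplest possible setting: a single discrete covariate, where the CATE in each stratum is the difference in mean outcomes between treated and control units. This is not a causal forest, which handles many continuous covariates by adaptive partitioning. The difference in means is only a valid estimate if treatment is unconfounded within each stratum. The function name, stratum labels, and data are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Hashable, Iterable, Tuple


def stratified_cate(rows: Iterable[Tuple[Hashable, int, float]]) -> Dict[Hashable, float]:
    """Estimate tau(x) = E[Y | W=1, X=x] - E[Y | W=0, X=x] per stratum x.

    rows holds (x, w, y) with w in {0, 1}. Strata that lack either treated
    or control units are skipped, because no contrast can be formed there.
    """
    groups = defaultdict(lambda: {0: [], 1: []})
    for x, w, y in rows:
        groups[x][w].append(y)
    return {
        x: mean(by_w[1]) - mean(by_w[0])
        for x, by_w in groups.items()
        if by_w[0] and by_w[1]
    }


# Made-up toy data: (stratum, treated?, outcome).
rows = [
    ("small_team", 1, 0.5), ("small_team", 1, 1.5), ("small_team", 0, 2.0),
    ("large_team", 1, 1.0), ("large_team", 0, 1.0), ("large_team", 0, 2.0),
    ("solo", 1, 0.0),  # no control unit in this stratum, so it is skipped
]

cate = stratified_cate(rows)
```
]]></preformat>
        <p>Averaging the per-stratum estimates over the sample would give an average treatment effect; a causal forest plays the role of choosing the strata from the data.</p>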
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Results</title>
      <sec id="sec-3-1">
        <title>3.1. OLS estimates</title>
        <p>In our data, the number of papers
with open-mentorship teams increased sharply
until 2011, whereas the number of papers with
close-mentorship teams is 0 after 1980. The number of
papers with open-mentorship teams far exceeds that
with close-mentorship teams (Figure 1). We tested
the difference between the two
groups with a Mann-Whitney test implemented in Python. The
result shows a significant difference
between the two groups (p&lt;0.001).</p>
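        <p>The text does not say which Python routine was used. For reference, a Mann-Whitney test can be sketched with the standard library alone, as below. This version uses the normal approximation without tie or continuity correction and compares every pair of observations, so it is meant for illustration on small samples. For data of the size used here, a library routine such as <monospace>scipy.stats.mannwhitneyu</monospace> would be the more practical choice. The function name and the sample values are made up.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import math
from typing import Sequence, Tuple


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Two-sided Mann-Whitney U test using the normal approximation.

    U counts the pairs (a, b) with a from x and b from y where a > b;
    ties contribute 0.5. No tie or continuity correction is applied.
    Returns (U statistic for x, two-sided p-value).
    """
    n1, n2 = len(x), len(y)
    u1 = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    mu = n1 * n2 / 2                                  # mean of U under H0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)   # std. dev. of U under H0
    z = (u1 - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))              # equals 2 * (1 - Phi(|z|))
    return u1, p


# Made-up toy samples standing in for the DI values of the two groups.
close_di = [1.0, 2.0, 3.0, 4.0, 5.0]
open_di = [6.0, 7.0, 8.0, 9.0, 10.0]

u_stat, p_value = mann_whitney_u(close_di, open_di)
u_same, p_same = mann_whitney_u(close_di, close_di)
```
]]></preformat>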
        <p>We first address the question with OLS regression.
When only the independent variable “relationship” was
included in the model, its regression coefficient
was negative and significant at the 0.01
level, providing initial evidence that close mentorship
is negatively associated with disruption. The magnitude of
this coefficient changed markedly as we added
the control variables one by one. In the final model, we
controlled for all the confounding variables and fixed
effects, and this specification had the largest
adjusted R<sup>2</sup>, suggesting that the controls
improved the explanatory power of the model. The results of
the final model show that a close-mentorship team
has a significantly negative effect on disruption.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Mentorship type and DI</title>
        <p>We test the relationship between mentorship type and
DI through PSM. We categorized papers with a
close-mentorship team as the treatment group (29,556
samples) and papers with an open-mentorship team
as the control group (331,632 samples). Figure 2(a)
shows that the propensity score distributions of the
two groups differ significantly before matching
and converge after matching. Moreover, after matching,
the distributions of PY, CI, A10,
TS, RC, AA, AP, and AC are the same in the two groups, which indicates
that the matching is effective. Using the hypothesis
test commonly applied in A/B experiments, we found
a significant difference in DI between the two
groups (p&lt;0.05): the average DI of the close-mentorship team
is 0.002917 lower than that of the
open-mentorship team, which means that the DI of the
close-mentorship team is 36.34% lower than the DI of
the open-mentorship team (Figure 2(b)).</p>
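        <p>The absolute and relative differences reported above can be related by simple arithmetic. Assuming that the percentage is the mean difference divided by the mean DI of the matched open-mentorship group (the text does not state this explicitly), the implied mean DI of that group is:</p>
        <disp-formula><tex-math>\overline{\mathrm{DI}}_{\mathrm{open}} \approx \frac{0.002917}{0.3634} \approx 0.0080</tex-math></disp-formula>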
        <p>To check the robustness of the results, we used
causal forest (CF), a state-of-the-art method. For each
paper, we obtained an individualized treatment effect
together with its estimated 95% confidence interval. The
CATEs of the close-mentorship team have a mean of
0.0004. In other words, the close-mentorship team
decreases DI by 0.0004 on average. However, when we took
citation counts as the dependent variable, we found
that the CATEs of the close-mentorship team have a
mean of 8.503, which means that papers with
close-mentorship teams may receive more citations.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Discussion and conclusion</title>
      <p>This study investigated whether a
close-mentorship team fuses more disruptive ideas than an
open-mentorship team. We used academic genealogy
to determine whether an article was close-mentorship
or open-mentorship and used the Disruption Index to
quantify how disruptive it is. We investigated the
relationship between the variables by analyzing
papers in Neuroscience and constructing regression
models. Moreover, we used PSM and causal forest to
test whether there is a causal relationship between
mentorship type and DI. The results indicate that
articles by close-mentorship teams are less
disruptive than those by open-mentorship teams.
However, articles by close-mentorship teams
are more cited than those by open-mentorship
teams.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Park</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Leahey</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Funk</surname>
            ,
            <given-names>R. J.</given-names>
          </string-name>
          (
          <year>2023</year>
          ).
          <article-title>Papers and patents are becoming less disruptive over time</article-title>
          .
          <source>Nature</source>
          ,
          <volume>613</volume>
          (
          <issue>7942</issue>
          ),
          <fpage>138</fpage>
          -
          <lpage>144</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1038/s41586-022-05543-x</pub-id>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Kozlov</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2023</year>
          ).
          <article-title>'Disruptive' science has declined - and no one knows why</article-title>
          .
          <source>Nature</source>
          ,
          <volume>613</volume>
          (
          <issue>7943</issue>
          ),
          <fpage>225</fpage>
          . doi:
          <pub-id pub-id-type="doi">10.1038/d41586-022-04577-5</pub-id>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Frey</surname>
            ,
            <given-names>C. B.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          (
          <year>2023</year>
          ).
          <article-title>Remote collaboration fuses fewer breakthrough ideas</article-title>
          .
          <source>Nature</source>
          ,
          <volume>623</volume>
          (
          <issue>7989</issue>
          ),
          <fpage>987</fpage>
          -
          <lpage>991</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1038/s41586-023-06767-1</pub-id>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Evans</surname>
            ,
            <given-names>J. A.</given-names>
          </string-name>
          (
          <year>2019</year>
          ).
          <article-title>Large teams develop and small teams disrupt science and technology</article-title>
          .
          <source>Nature</source>
          ,
          <volume>566</volume>
          (
          <issue>7744</issue>
          ),
          <fpage>378</fpage>
          -
          <lpage>382</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1038/s41586-019-0941-9</pub-id>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Ke</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liang</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ding</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>David</surname>
            ,
            <given-names>S. V.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Acuna</surname>
            ,
            <given-names>D. E.</given-names>
          </string-name>
          (
          <year>2022</year>
          ).
          <article-title>A dataset of mentorship in bioscience with semantic and demographic estimations</article-title>
          .
          <source>Scientific Data</source>
          ,
          <volume>9</volume>
          (
          <issue>1</issue>
          ), 467. doi:
          <pub-id pub-id-type="doi">10.1038/s41597-022-01578-x</pub-id>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Corsini</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pezzoni</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Visentin</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          (
          <year>2022</year>
          ).
          <article-title>What makes a productive Ph.D. student?</article-title>
          <source>Research Policy</source>
          ,
          <volume>51</volume>
          (
          <issue>10</issue>
          ). doi:
          <pub-id pub-id-type="doi">10.1016/j.respol.2022.104561</pub-id>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Rosenfeld</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Maksimov</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          (
          <year>2022</year>
          ).
          <article-title>Should Young Computer Scientists Stop Collaborating with Their Doctoral Advisors?</article-title>
          <source>Communications of the ACM</source>
          ,
          <volume>65</volume>
          (
          <issue>10</issue>
          ),
          <fpage>66</fpage>
          -
          <lpage>72</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1145/3529089</pub-id>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Ke</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liang</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ding</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>David</surname>
            ,
            <given-names>S. V.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Acuna</surname>
            ,
            <given-names>D. E.</given-names>
          </string-name>
          (
          <year>2022</year>
          ).
          <article-title>A dataset of mentorship in bioscience with semantic and demographic estimations</article-title>
          .
          <source>Scientific Data</source>
          ,
          <volume>9</volume>
          (
          <issue>1</issue>
          ), 467. doi:
          <pub-id pub-id-type="doi">10.1038/s41597-022-01578-x</pub-id>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yin</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          (
          <year>2023</year>
          ).
          <article-title>SciSciNet: A large-scale open data lake for the science of science research</article-title>
          .
          <source>Scientific Data</source>
          ,
          <volume>10</volume>
          (
          <issue>1</issue>
          ).
          doi:
          <pub-id pub-id-type="doi">10.1038/s41597-023-02198-9</pub-id>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Bornmann</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Haunschild</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Mutz</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          (
          <year>2020</year>
          ).
          <article-title>Should citations be field-normalized in evaluative bibliometrics? An empirical analysis based on propensity score matching</article-title>
          .
          <source>Journal of Informetrics</source>
          ,
          <volume>14</volume>
          (
          <issue>4</issue>
          ), 20. doi:
          <pub-id pub-id-type="doi">10.1016/j.joi.2020.101098</pub-id>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Wager</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Athey</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          (
          <year>2018</year>
          ).
          <article-title>Estimation and Inference of Heterogeneous Treatment Effects using Random Forests</article-title>
          .
          <source>Journal of the American Statistical Association</source>
          ,
          <volume>113</volume>
          (
          <issue>523</issue>
          ),
          <fpage>1228</fpage>
          -
          <lpage>1242</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1080/01621459.2017.1319839</pub-id>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>