<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>On the impact of pull request decisions on future contributions</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Damien Legay</string-name>
          <email>damien.legay@umons.ac.be</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alexandre Decan</string-name>
          <email>alexandre.decan@umons.ac.be</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tom Mens</string-name>
          <email>tom.mens@umons.ac.be</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Software Engineering Lab, University of Mons</institution>
          ,
          <addr-line>Mons</addr-line>
          ,
          <country country="BE">Belgium</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The pull-based development process has become prevalent on platforms such as GitHub as a form of distributed software development. Potential contributors can create and submit a set of changes to a software project through pull requests. These changes can be accepted, discussed or rejected by the maintainers of the software project, and can influence further contribution proposals. As such, it is important to examine the practices that encourage contributors to a project to submit pull requests. Specifically, we consider the impact of prior pull requests on the acceptance or rejection of subsequent pull requests. We also consider the potential effect of rejecting or ignoring pull requests on further contributions. In this preliminary research, we study three large projects on GitHub, using pull request data obtained through the GitHub API, and we perform empirical analyses to investigate the above questions. Our results show that continued contribution to a project is correlated with higher pull request acceptance rates and that pull request rejections lead to fewer future contributions.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>I. INTRODUCTION</title>
      <p>The turn of the century saw the rise of version control
systems (VCS) to support large-scale software engineering
projects. Centralised VCS (e.g., CVS and Subversion) allow
developers to share a common repository. Decentralised ones
(e.g., Mercurial and git) allow each developer to own a local
copy of the repository containing the full change history.
This enables collaborative (often geographically distributed)
software development on a hitherto unmatched scale. It has
given birth to extremely popular online hosting platforms such
as GitHub, BitBucket and Mozdev, allowing thousands of
people to remotely work together on the same projects. These
platforms provide additional features on top of their underlying
VCS to further support distributed collaborative development.
Examples of such features are issue tracking, code review,
integrated discussions, team management, documentation &amp;
wiki and integration with external tools.</p>
      <p>Today, git has become the most popular distributed VCS by
a large margin (as anecdotal evidence, in a 2016 survey with 881 votes, 87%
of respondents identified git as their VCS of choice:
https://rhodecode.com/insights/version-control-systems-2016).
It will therefore be the focus of our current
research. git supports two types of development processes: the
shared repository approach, where all contributors are given
write access to the central repository and can therefore
contribute to the project directly; and the pull request (PR)
approach, where only project integrators are allowed to do so.</p>
      <p>
        With the PR approach, external contributions are managed
indirectly: would-be contributors create a fork of the repository
and, once they have addressed an issue or shortcoming in the project,
they request that their modifications be “pulled” into the
repository by submitting a pull request. The project integrators
can decide to approve these PRs, which are then merged
into the main project’s codebase. PRs are extremely valuable,
as they represent a major part of the project’s continued
evolution and expansion. It is therefore important to incentivise
people to create pull requests, thereby contributing to the
project. Previous studies have attempted to identify the factors
influencing whether and when a PR will be merged [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]–[
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>We expand upon this work, by focusing on determining
those patterns of PR-acceptance behaviour that are indicative
of continued contribution. Our working hypothesis is that
people contributing to a project repository through PRs may get
demotivated (and hence stop contributing) if their submitted
PRs get rejected too often, or if too many of them are left open
without any decision to merge them. Evolutionary insights into
such phenomena may help us to understand which factors
tend to dissuade people from continuing to contribute to a given
project. To this end, we quantitatively study the following
research questions using techniques based on survival analysis:</p>
      <p>RQ1: How are PR acceptance and rejection rates influenced
by previous PRs? As a contributor accrues familiarity with a
project, he becomes more able to contribute effectively, which
we expect to result in a lower PR rejection rate. Similarly,
as integrators become more acquainted with a contributor,
they may develop a favourable bias towards his PRs, further
decreasing rejection rates.</p>
      <p>RQ2: To which extent does PR acceptance or rejection
influence further contributions? When his PRs are rejected,
a developer could become discouraged and stop, temporarily
or permanently, contributing to the project as a result.</p>
      <p>RQ3: To which extent do PRs left open influence further
contributions? A PR is sometimes left open for a long period,
neither rejected nor merged into the core project. We posit
this may constitute a form of “soft” rejection, wherein the
integrators want to avoid alienating the contributor but do not
want to merge the PR. Seeing a large number of untreated
PRs may send an implicit message to potential contributors
that the project integrators are unwilling or unable to process
the volume of contributions they receive, and, therefore, that
their participation in the project would not be valued or useful.</p>
      <p>To provide preliminary evidence for these RQs, we carry
out an empirical analysis on a large number of PRs in three
large, popular and long-lived projects on GitHub. We focus
on GitHub because it is undoubtedly one of the largest and
most active online hosting services for git projects.</p>
    </sec>
    <sec id="sec-2">
      <title>II. RELATED WORK</title>
      <p>
        Several researchers have studied aspects related to the
PR-based software development process, either qualitatively or
quantitatively. Gousios and Zaidman proposed a PR dataset [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]
including 900 projects and 350,000 PRs extracted using
GHTorrent. Through a mixed-method analysis of 291 GitHub
projects, Gousios et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] established that the PR-based
development approach is used as frequently as the shared
repository approach on GitHub. They observed that most PRs
are short, receive few comments and are processed quickly.
They also found that most PR rejections are due to the
distributed nature of the pull-based process (e.g., PRs that are
already obsolete upon creation).
      </p>
      <p>
        In a follow-up work [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], they interviewed 645 contributors
to examine their work practices and identify the challenges
they face. They found that while contributors tend to check
if their intended contribution is already covered, they do
not communicate their intended contributions. Interviewed
contributors outlined that poor responsiveness on the part of
integrators could be a barrier to attracting or retaining
contributors. Contributors also stated that it is hard to accept rejection
of their PRs, as rejected PRs could harm their reputation as
developers. Conversely, it is hard for integrators to explain the
reasons for rejecting PRs. Rejecting a PR without alienating
its contributor was already identified as a challenge of the
PR-based model [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. In that paper, they evaluated PRs from
an integrator’s point of view by interviewing 749 project
integrators in order to understand which criteria are used to
determine the quality of a PR and how they prioritise the
evaluation of contributions. They found that most integrators
decide to merge PRs based on the project’s objectives, the PRs’ quality
as measured by compliance with the project guidelines, test
coverage and passing continuous integration checks.
      </p>
      <p>
        Yu et al. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] studied the factors that contribute to latency in
PR reviews, defining this latency as the “time interval between
pull request creation and closing date”. They found that PR
latency is mainly affected by process-related factors such as
whether a PR was assigned to a specific reviewer or not. They
also found that continuous integration is a dominant factor in
PR latency.
      </p>
      <p>
        Rahman and Roy [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] categorised the technical issues
discussed in PR comments and analysed information about
projects and developers to obtain insights into PR acceptance
or rejection. They discovered that the rate of PR rejection
is highly correlated to the programming language used (e.g.,
Java PRs are more frequently rejected than PRs for the C
programming language), the application domain of the project
(e.g., the database application domain sees fewer merged PRs
than the IDE domain), the maturity of a project (older projects
accept fewer PRs) and the number of developers on the project.
      </p>
      <p>
        Tsay et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] explored both technical and social factors
that contribute to acceptance of PRs. They found that, although
technical factors like the presence of tests in the PR and a small
number of lines changed contribute to a higher probability
of acceptance, social factors, such as whether the contributor
follows the user that closes the PR, had stronger associations
to PR acceptance than technical ones.
      </p>
      <p>
        Terrell et al. [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] established that PR acceptance is subject
to a bias against women, when their gender is identifiable.
Rastogi et al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] built upon the factors identified in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and
[
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], adding information about the geographical location of
contributors and integrators. They conclude that PR acceptance
rate is higher when both contributor and integrator are from the
same country, with the exception of India, and that contributors
from some countries (e.g., Switzerland and Japan) see their
contributions more frequently accepted than contributors from
other countries (e.g., China and Germany).
      </p>
    </sec>
    <sec id="sec-3">
      <title>III. METHODOLOGY</title>
      <p>The main goal of our research is to study the longevity of
PR-based contributions to large open source software projects.
We focus on software development through GitHub, the
largest and most active online hosting service for git projects.
As of 2018-09-30, GitHub hosted 96M+ repositories,
31M+ developers and 200M+ PRs, and about one third of
these repositories and PRs had been created in the preceding 12 months
(source: https://octoverse.github.com).</p>
      <p>For this exploratory research, we selected three case studies
of large open source git projects on GitHub. These projects
were obtained by convenience sampling. This method is
acceptable for gaining preliminary research insights, and will
be replaced in a later phase to obtain a bigger corpus that
covers a larger set of relevant projects.</p>
      <p>The main criterion for our selected sample was that the
projects should be representative of a typical PR-based
software development process. To this end, the projects needed to be
mature (i.e., have a time span of several years), have an active
development history with a large number of commits and
contributors, and of course contain a very large number of PRs,
in order to be able to derive statistically significant results from
their analysis. In addition, we selected projects written
in three different languages to ensure sufficient diversity. The
three selected projects are ansible, rails and kubernetes.
Some of their characteristics are shown in Table I.</p>
      <p>Because we observed problems of missing or
inconsistent data when using GHTorrent, we decided to extract the PR
data of the selected projects from GitHub repositories through
the GitHub API directly. For each PR, the data contains
information about the PR creation date, its status (accepted,
rejected or open), its closing date (for accepted and rejected
PRs), the GitHub ID of its author and its PR number. This
PR number corresponds to a chronological ordering of issues
opened in the repository, of which PRs are a subset.</p>
      <p>RQ1: How are PR acceptance and rejection rates influenced
by previous PRs?</p>
      <p>To answer RQ1, we examined whether repeat contributions
impact a contributor’s PR acceptance rate. To that end, for
each repository we analysed the PR acceptance rate as a function
of the number of PRs submitted by each contributor.</p>
      <p>Figure 1 displays, for each positive integer threshold x
between 1 and 250, the PR acceptance rate (blue curve) and
rejection rate (orange curve) considering the first x PRs of each
contributor only, thereby discarding contributors with fewer
than x PRs. The green curve shows the number of contributors
having submitted at least x PRs. Thresholds above 250 are
excluded due to the low fraction of contributors having that
many submissions: 0.33% for ansible, 0.15% for rails and
1.23% for kubernetes.</p>
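      <p>As an illustration, the thresholded rates described above could be computed along the following lines. This is a minimal sketch under assumptions, not the scripts used for this study: the record layout (author, PR number, status), the function name and the choice to keep open PRs in the denominator are all hypothetical, and the rates are pooled over the first x PRs of every retained contributor, which is only one possible reading of the definition.</p>
      <preformat><![CDATA[
```python
from collections import defaultdict


def rates_at_threshold(prs, x):
    """Acceptance and rejection rates over the first x PRs of each contributor.

    `prs` is an iterable of (author, pr_number, status) tuples, with status in
    {"accepted", "rejected", "open"}; this layout is an assumption.
    Contributors with fewer than x PRs are discarded.
    Returns (acceptance_rate, rejection_rate, n_contributors).
    """
    by_author = defaultdict(list)
    for author, number, status in prs:
        by_author[author].append((number, status))

    accepted = rejected = total = contributors = 0
    for history in by_author.values():
        if len(history) < x:
            continue
        contributors += 1
        # PR numbers are chronological, so sorting recovers submission order.
        for _, status in sorted(history)[:x]:
            total += 1
            accepted += status == "accepted"
            rejected += status == "rejected"
    if total == 0:
        return 0.0, 0.0, 0
    # Assumption: open PRs count in the denominator of both rates.
    return accepted / total, rejected / total, contributors
```
]]></preformat>
      <p>Evaluating such a function for every x between 1 and 250 would yield the three curves of one panel of Figure 1.</p>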
      <p>One can observe in all three examined project repositories
that, as contributors submit more PRs, their acceptance rates
increase significantly. Over the first 50 PRs, we observe a rise
from 54.2% to 80.0% for ansible, from 61.3% to 81.4% for
rails, and from 49.1% to 74.3% for kubernetes. Beyond the
first 50 PRs, all three projects saw a continued increase in PR
acceptance rates as contributors submitted more PRs to them.</p>
      <p>
        These results agree with prior findings by Tsay et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]
and Gousios et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], but should be interpreted with caution, given the
rapid decrease in the number of contributors as the threshold x
increases. We therefore looked, in Figure 2, at the PR
acceptance rate of all contributors, excluding the few that made
over 250 contributions. We observe that, while contributors
with a high number of PRs tend to have a consistently high
PR acceptance rate, the behaviour for contributors with few
PRs is quite unpredictable: they can have either low or high
acceptance rates. Therefore, although the number of previous
PRs influences acceptance rate, this can only be verified
starting from a certain threshold of PRs, below which no
conclusion can be reached as to whether such an influence
exists.
      </p>
      <p>RQ2: To which extent does PR acceptance or rejection
influence further contributions?</p>
      <p>
        While related work (e.g., [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]) has studied the impact of
PR acceptance rate on future PR decision time, RQ2 focuses
on the impact of PR acceptance rate on the likelihood of
making further PRs. To do so, we compared the probability
to contribute again after either a rejected or an accepted PR.
The results are presented in Table II. In all three considered
projects, contributors are more likely to make subsequent PRs
if their prior PRs were accepted.
      </p>
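      <p>The comparison reported in Table II can be approximated as follows. The sketch below is only illustrative: the tuple layout and the function name are hypothetical, and it simply checks whether a later PR by the same author exists, without verifying that the decision on the earlier PR had already been taken when the later one was submitted.</p>
      <preformat><![CDATA[
```python
from collections import defaultdict


def prob_contribute_again(prs):
    """Share of accepted / rejected PRs followed by another PR by the same author.

    `prs`: iterable of (author, pr_number, status) tuples (assumed layout).
    Returns {"accepted": p, "rejected": p}; a value is None when no PR
    has the corresponding status.
    """
    by_author = defaultdict(list)
    for author, number, status in prs:
        by_author[author].append((number, status))

    followed = {"accepted": 0, "rejected": 0}
    seen = {"accepted": 0, "rejected": 0}
    for history in by_author.values():
        history.sort()  # chronological order via PR number
        for i, (_, status) in enumerate(history):
            if status not in seen:  # skip PRs that are still open
                continue
            seen[status] += 1
            followed[status] += i + 1 < len(history)
    return {s: (followed[s] / seen[s] if seen[s] else None) for s in seen}
```
]]></preformat>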
      <p>
        We then used the statistical technique of survival analysis
(a.k.a. event history analysis) [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. Given a specific “event
of interest” (in our case: acceptance or rejection of a PR),
survival analysis models the “time to event” data during a
given observation period. Survival functions model the survival
rate, i.e., the probability that the event of interest has not yet
occurred after a given time. The models take into account the “censoring” of some
observed subjects, either because they enter or leave the study
during the observation period, or because the event of interest
was not observed for them during the observation period.
A common non-parametric statistic used to estimate survival
functions is the Kaplan-Meier estimator [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
      </p>
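      <p>For readers unfamiliar with the Kaplan-Meier estimator, a minimal pure-Python sketch is given below. It is meant to illustrate the technique only; the function and parameter names are hypothetical, and in practice a dedicated statistics library would normally be used. At each distinct event time, the running survival estimate is multiplied by one minus the fraction of subjects still at risk that experienced the event at that time.</p>
      <preformat><![CDATA[
```python
def kaplan_meier(durations, observed):
    """Kaplan-Meier estimate of the survival function.

    durations[i]: time until the event, or until censoring, for subject i.
    observed[i]:  True if the event was seen, False if the subject was censored.
    Returns a list of (time, survival_probability), one per distinct event time.
    """
    event_times = sorted({t for t, e in zip(durations, observed) if e})
    curve, survival = [], 1.0
    for t in event_times:
        # Subjects censored exactly at t are still considered at risk at t.
        at_risk = sum(1 for d in durations if d >= t)
        events = sum(1 for d, e in zip(durations, observed) if e and d == t)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve
```
]]></preformat>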
      <p>
        We performed an analysis of the survival probability to
submit a new PR as a function of the time elapsed since the latest
submission (at that time) of a PR by the same contributor. In
order to assess whether the PR acceptance rate influences the delay
for a contributor to submit new PRs, we considered three
acceptance rate classes: [0%, 33%), [33%, 67%) and [67%, 100%].
The survival curves are shown in Figure 3. We observe
that the survival probability is higher for classes of higher
acceptance rate, regardless of the project considered. For
instance, after ten days, the probability to submit a new PR is
72.2% in ansible if the acceptance rate is over 67%, while this
probability drops to 52.9% if the acceptance rate is between
33% and 67%, and even to 31.5% if the acceptance rate is
below 33%. Similar patterns can be observed for the two other
projects. We carried out pairwise log-rank tests to check
whether statistically significant differences could be found
between the survival curves. The differences were statistically
confirmed at α = 0.01 (after a Bonferroni correction [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]),
i.e., the null hypotheses, assuming that the survival curves for
different acceptance rate classes are the same, were rejected.
      </p>
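      <p>The grouping and the multiple-comparison correction can be sketched as follows. The class labels and function names are hypothetical, and the sketch assumes that the Bonferroni correction divides the significance level by the number of pairwise comparisons between classes (three comparisons for three classes); the exact procedure used in the study may differ.</p>
      <preformat><![CDATA[
```python
import math


def acceptance_class(rate):
    """Map an acceptance rate in [0, 1] to one of the three classes."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must lie in [0, 1]")
    if rate < 0.33:
        return "low"      # [0%, 33%)
    if rate < 0.67:
        return "medium"   # [33%, 67%)
    return "high"         # [67%, 100%]


def bonferroni_threshold(alpha, n_groups):
    """Per-comparison significance level for all pairwise tests among n_groups."""
    n_comparisons = math.comb(n_groups, 2)
    return alpha / n_comparisons
```
]]></preformat>
      <p>Under these assumptions, each pairwise log-rank test between two of the three classes would be compared against bonferroni_threshold(0.01, 3) rather than against 0.01 itself.</p>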
      <p>
        We performed a proportional hazards regression based on
Cox regression to determine to which extent the acceptance
rate impacts the probability of further contribution [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. The
Cox regression is a method for investigating the effect of
several variables upon a specified event’s hazard rate. For this
analysis, we included the following factors: the acceptance rate
of all prior PRs by the same contributor; the number of prior
PRs made by this contributor, and the time elapsed since the
contributor’s first PR (the contributor’s age).
      </p>
      <p>Table III summarizes the results we obtained. The
concordance (fourth column) provides the goodness of fit of the
model. It ranges from 0 (perfect anti-concordance)
to 1 (perfect concordance). The table also reports the
regression coefficients for the three considered factors. These
coefficients measure the magnitude of the impact of the
aforementioned factors on the probability to submit a new PR. All
these coefficients are statistically significant (p &lt; 0.01 after
Bonferroni correction). Their values signify that an increase
of one increment (10% in acceptance rate, 1 prior PR or 1
day since the contributor’s first PR) multiplies the probability
to submit a PR by a factor e<sup>coefficient</sup>. For instance, in the case of
kubernetes: an increase of 10% in PR acceptance rate multiplies
the probability to submit a new PR by a factor 1.0774, each
prior PR by 1.0032 and each day since the contributor’s first
PR by 0.9984.</p>
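      <p>To make the multiplicative reading of the coefficients concrete, the short sketch below converts the three factors quoted above for kubernetes back into coefficients and combines several increments. The dictionary keys and function names are hypothetical labels introduced for this illustration; only the three numerical factors come from the text.</p>
      <preformat><![CDATA[
```python
import math

# Multiplicative factors quoted in the text for kubernetes.
# The key names are hypothetical labels, not identifiers from the study.
FACTORS = {
    "acceptance_rate_10pct": 1.0774,
    "prior_pr": 1.0032,
    "day_of_age": 0.9984,
}


def coefficient(factor):
    """Regression coefficient corresponding to a factor e**coefficient."""
    return math.log(factor)


def combined_factor(changes, factors=FACTORS):
    """Factor for several simultaneous increments, e.g. {"prior_pr": 10}.

    Coefficients add up on the log scale, so factors multiply.
    """
    total = sum(n * coefficient(factors[name]) for name, n in changes.items())
    return math.exp(total)
```
]]></preformat>
      <p>For example, combined_factor({"acceptance_rate_10pct": 3}) equals 1.0774 raised to the power 3, i.e., about 1.25: under this model, a 30% higher acceptance rate corresponds to a factor of roughly 1.25.</p>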
      <p>RQ3: To which extent do PRs left open influence further
contributions?</p>
      <p>To provide insight into RQ3, we looked at the proportion of
PRs that were ultimately accepted or rejected given the time
it took to decide (the PR’s age). We excluded PRs that were
left open, since no decision has been reached for those. This
is plotted in Figure 4. We notice that the longer a PR remains
open, the higher the probability that it will be rejected. After a
threshold x, PRs have a higher probability to be rejected than
accepted. The threshold is 28 days for ansible, 5 days for
rails and 25 days for kubernetes. Presuming that contributors
are aware of this phenomenon, we expect that they implicitly
consider PRs left open for too long as being tacitly
rejected, producing effects similar to those identified in the
previous RQ. This preliminary result needs to be confirmed
with further analyses.</p>
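      <p>One way to locate such a threshold is sketched below. This is an assumption-laden illustration rather than the procedure used for Figure 4: the function name and input layout are hypothetical, and the threshold is taken to be the smallest age x such that, among PRs decided after at least x days, rejections outnumber acceptances.</p>
      <preformat><![CDATA[
```python
def rejection_crossover(decisions):
    """Smallest age x (in days) from which rejections outnumber acceptances.

    `decisions`: iterable of (age_in_days, status) pairs for decided PRs,
    with status either "accepted" or "rejected" (assumed layout).
    Returns None if rejections never outnumber acceptances.
    """
    decisions = sorted(decisions)
    ages = sorted({age for age, _ in decisions})
    for x in ages:
        # Statuses of all PRs whose decision took at least x days.
        tail = [status for age, status in decisions if age >= x]
        rejected = tail.count("rejected")
        if rejected > len(tail) - rejected:
            return x
    return None
```
]]></preformat>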
    </sec>
    <sec id="sec-4">
      <title>V. THREATS TO VALIDITY</title>
      <p>
        A threat to the validity of this paper is the fact that we
only selected three projects in this exploratory phase, so the
preliminary findings might not generalise to bigger sets of
projects. Choosing only large, popular and mature projects is
also a source of bias, as Rahman and Roy [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] found that such
factors affect PR acceptance rates.
      </p>
      <p>Another threat is that the PR status returned by the GitHub
API does not necessarily correspond to the actual fate of the
PR in some repositories. One such case is homebrew-core,
where the policy of the repository is to close most PRs without
merging them, but to integrate those they deem appropriate
through another mechanism, such as the integrators
committing the changes themselves. If analyses were to be applied
to this repository, the rate of acceptance would be artificially
low due to that specific PR handling policy. Another example
is that of angular, wherein PRs marked with specific tags
(“PR action: merge” and “PR target: *” where * represents
one or more branches to merge the PR into) will have
their relevant code automatically integrated into the repository
through commits. Those PRs will appear to be rejected on
GitHub, even though they aren’t. It would be possible to
recover the actual PR status based on those tags, which is
not the case for homebrew-core.</p>
      <p>
        Yet another threat is tied to the way we have identified
contributors. It has been reported that a single individual may
use multiple identities in different capacities or at different
times on software repositories [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]–[
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. More specifically, it
may be the case that the same author owns multiple GitHub
accounts, or that multiple authors contribute under the same
GitHub account. In that case, we may have erroneously
attributed the PRs to an incorrect identity. This could have
affected our findings. Therefore, as future work, we aim to
empirically study the impact of such incorrect identifications.
      </p>
    </sec>
    <sec id="sec-5">
      <title>VI. CONCLUSION</title>
      <p>The collaborative development of open-source software
through a pull-based contribution process involves subtle
social interactions that can influence the frequency and likelihood
of contribution to a repository, or even its ability to retain
contributors. Recent qualitative results have highlighted that
contributors do not appreciate the rejection of their PRs, and
that they find poor responsiveness from integrators frustrating.
Integrators, on the other hand, are wary of alienating
contributors in their handling of PRs.</p>
      <p>In this paper, we provide preliminary quantitative results
showing that a contributor’s PRs are more likely to be
accepted when he has submitted more PRs previously. We
also reveal the impact of PR decisions on the willingness
of contributors to contribute anew. Indeed, fewer contributors
submit a new PR after the previous one was rejected than when
the previous one was accepted. This highlights the importance
for project integrators to avoid alienating contributors, lest they
lose their contributions.</p>
    </sec>
    <sec id="sec-6">
      <title>ACKNOWLEDGEMENTS</title>
      <p>This research was supported by the FRQ-FNRS
collaborative research project R.60.04.18.F SECOHealth, the
Excellence of Science project 30446992 SECO-ASSIST financed by
FWO-Vlaanderen and F.R.S.-FNRS, and F.R.S.-FNRS Grant
T.0017.18.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Georgios</given-names>
            <surname>Gousios</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Martin</given-names>
            <surname>Pinzger</surname>
          </string-name>
          , and Arie van Deursen.
          <article-title>An exploratory study of the pull-based software development model</article-title>
          .
          <source>In International Conference on Software Engineering</source>
          , pages
          <fpage>345</fpage>
          -
          <lpage>355</lpage>
          . ACM,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Georgios</given-names>
            <surname>Gousios</surname>
          </string-name>
          , Andy Zaidman,
          <string-name>
            <given-names>Margaret-Anne</given-names>
            <surname>Storey</surname>
          </string-name>
          , and Arie van Deursen.
          <article-title>Work practices and challenges in pull-based development: The integrator's perspective</article-title>
          .
          <source>In International Conference on Software Engineering</source>
          , pages
          <fpage>358</fpage>
          -
          <lpage>368</lpage>
          . IEEE Press,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Georgios</given-names>
            <surname>Gousios</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Margaret-Anne</given-names>
            <surname>Storey</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Alberto</given-names>
            <surname>Bacchelli</surname>
          </string-name>
          .
          <article-title>Work practices and challenges in pull-based development: The contributor's perspective</article-title>
          .
          <source>In International Conference on Software Engineering</source>
          , pages
          <fpage>285</fpage>
          -
          <lpage>296</lpage>
          . ACM,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Georgios</given-names>
            <surname>Gousios</surname>
          </string-name>
          and
          <string-name>
            <given-names>Andy</given-names>
            <surname>Zaidman</surname>
          </string-name>
          .
          <article-title>A dataset for pull-based development research</article-title>
          .
          <source>In Working Conference on Mining Software Repositories</source>
          , pages
          <fpage>368</fpage>
          -
          <lpage>371</lpage>
          . ACM,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Filkov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Devanbu</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B.</given-names>
            <surname>Vasilescu</surname>
          </string-name>
          .
          <article-title>Wait for it: Determinants of pull request evaluation latency on GitHub</article-title>
          .
          <source>In Working Conference on Mining Software Repositories</source>
          , pages
          <fpage>367</fpage>
          -
          <lpage>371</lpage>
          , May
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Mohammad Masudur</given-names>
            <surname>Rahman</surname>
          </string-name>
          and
          <string-name>
            <given-names>Chanchal K.</given-names>
            <surname>Roy</surname>
          </string-name>
          .
          <article-title>An insight into the pull requests of GitHub</article-title>
          .
          <source>In Working Conference on Mining Software Repositories</source>
          , pages
          <fpage>364</fpage>
          -
          <lpage>367</lpage>
          . ACM,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Jason</given-names>
            <surname>Tsay</surname>
          </string-name>
          , Laura Dabbish, and
          <string-name>
            <given-names>James</given-names>
            <surname>Herbsleb</surname>
          </string-name>
          .
          <article-title>Influence of social and technical factors for evaluating contribution in GitHub</article-title>
          .
          <source>In International Conference on Software Engineering</source>
          , pages
          <fpage>356</fpage>
          -
          <lpage>366</lpage>
          . ACM,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Josh</given-names>
            <surname>Terrell</surname>
          </string-name>
          , Andrew Kofink, Justin Middleton, Clarissa Rainear, Emerson Murphy-Hill,
          <string-name>
            <given-names>Chris</given-names>
            <surname>Parnin</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Jon</given-names>
            <surname>Stallings</surname>
          </string-name>
          .
          <article-title>Gender differences and bias in open source: pull request acceptance of women versus men</article-title>
          .
          <source>PeerJ Computer Science</source>
          ,
          <volume>3</volume>
          :e111, May
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Ayushi</given-names>
            <surname>Rastogi</surname>
          </string-name>
          , Nachiappan Nagappan, Georgios Gousios, and André van der Hoek.
          <article-title>Relationship between geographical location and evaluation of developer contributions in GitHub</article-title>
          .
          <source>In International Symposium on Empirical Software Engineering and Measurement</source>
          . ACM,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>O.</given-names>
            <surname>Aalen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Borgan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>H.</given-names>
            <surname>Gjessing</surname>
          </string-name>
          .
          <source>Survival and Event History Analysis: A Process Point of View</source>
          . Springer,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>E. L.</given-names>
            <surname>Kaplan</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Meier</surname>
          </string-name>
          .
          <article-title>Nonparametric estimation from incomplete observations</article-title>
          .
          <source>J. American Statistical Association</source>
          ,
          <volume>53</volume>
          (
          <issue>282</issue>
          ):
          <fpage>457</fpage>
          -
          <lpage>481</lpage>
          ,
          <year>1958</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Winston</given-names>
            <surname>Haynes</surname>
          </string-name>
          .
          <source>Bonferroni Correction</source>
          , pages
          <fpage>154</fpage>
          -
          <lpage>154</lpage>
          . Springer, New York,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>D. R.</given-names>
            <surname>Cox</surname>
          </string-name>
          .
          <article-title>Regression models and life-tables</article-title>
          .
          <source>Journal of the Royal Statistical Society. Series B (Methodological)</source>
          ,
          <volume>34</volume>
          (
          <issue>2</issue>
          ):
          <fpage>187</fpage>
          -
          <lpage>220</lpage>
          ,
          <year>1972</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Mathieu</given-names>
            <surname>Goeminne</surname>
          </string-name>
          and
          <string-name>
            <given-names>Tom</given-names>
            <surname>Mens</surname>
          </string-name>
          .
          <article-title>A comparison of identity merge algorithms for software repositories</article-title>
          .
          <source>Science of Computer Programming</source>
          ,
          <volume>78</volume>
          (
          <issue>8</issue>
          ):
          <fpage>971</fpage>
          -
          <lpage>986</lpage>
          ,
          August
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Erik</given-names>
            <surname>Kouters</surname>
          </string-name>
          , Bogdan Vasilescu, Alexander Serebrenik, and Mark G. J. van den Brand.
          <article-title>Who's who in GNOME: using LSA to merge software repository identities</article-title>
          .
          <source>In International Conference on Software Maintenance</source>
          , pages
          <fpage>592</fpage>
          -
          <lpage>595</lpage>
          . IEEE,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>I. S.</given-names>
            <surname>Wiese</surname>
          </string-name>
          , J. T. da Silva, I. Steinmacher,
          <string-name>
            <given-names>C.</given-names>
            <surname>Treude</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Gerosa</surname>
          </string-name>
          .
          <article-title>Who is who in the mailing list? Comparing six disambiguation heuristics to identify multiple addresses of a participant</article-title>
          .
          <source>In International Conference on Software Maintenance and Evolution</source>
          , pages
          <fpage>345</fpage>
          -
          <lpage>355</lpage>
          ,
          October
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>